When Anthropic announced the Model Context Protocol in November 2024, I immediately saw the potential. I had been building Audioscrape - a semantic search engine for podcasts with over 1M hours indexed - and MCP was the perfect way to make that data accessible to AI assistants.

Within a week, Audioscrape became one of the first MCP integrations listed in the official awesome-mcp-servers repository.

Here’s how I built it, and what I learned about designing tools for LLMs.

Why MCP Matters

Before MCP, connecting external data to Claude meant awkward workarounds: copying text into prompts, building custom API wrappers, or hoping the context window was large enough.

MCP standardizes this. It defines how AI assistants discover, authenticate with, and call external tools. Think of it as USB for AI - a universal protocol that lets any tool plug into any compatible assistant.
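Concretely, an MCP server advertises each tool with a name, a description, and a JSON Schema for its inputs, which the assistant discovers via tools/list. Here is a rough sketch of what a search tool definition could look like, built with serde_json purely for illustration - the field names follow the MCP spec, but the schema contents are my own example, not Audioscrape's published manifest:

use serde_json::json;

// Illustrative tool definition as an MCP server might return it from
// `tools/list`. Field names follow the MCP spec; the schema contents are
// an assumption, not Audioscrape's actual manifest.
fn search_tool_definition() -> serde_json::Value {
    json!({
        "name": "search_podcasts",
        "description": "Semantic search over podcast transcripts.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "query": { "type": "string", "description": "Natural-language query" },
                "limit": { "type": "integer", "description": "Maximum results to return" }
            },
            "required": ["query"]
        }
    })
}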

For Audioscrape, this means Claude can now answer questions like:

  • “What has Tim Ferriss said about morning routines?”
  • “Find podcast discussions about failed startups”
  • “What do VCs say about AI valuations?”

And get real answers from real podcast episodes, with timestamps and context.

The Architecture

The MCP server is written in Rust, matching Audioscrape’s backend. Here’s the core structure:

use axum::{routing::{get, post}, Json, Router};
use serde::{Deserialize, Serialize};

#[derive(Deserialize)]
struct SearchRequest {
    query: String,
    limit: Option<usize>,
}

#[derive(Serialize)]
struct SearchResult {
    episode: String,
    podcast: String,
    timestamp: String,
    transcript: String,
    relevance: f32,
}

async fn search(Json(req): Json<SearchRequest>) -> Json<Vec<SearchResult>> {
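    // Default to a handful of results; Claude can ask for more if it needs them.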
    let results = audioscrape::search(&req.query, req.limit.unwrap_or(5)).await;
    Json(results.into_iter().map(|r| SearchResult {
        episode: r.episode_title,
        podcast: r.podcast_name,
        timestamp: format_timestamp(r.start_time),
        transcript: r.text,
        relevance: r.score,
    }).collect())
}

pub fn create_router() -> Router {
    Router::new()
        .route("/search", post(search))
        .route("/health", get(|| async { "ok" }))
}

The implementation is intentionally simple. MCP tools should do one thing well and return structured data. Claude handles the complexity of understanding user intent - your job is to provide clean, relevant results.
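For completeness, here is a minimal sketch of serving that router, assuming axum 0.7+ and tokio; the bind address and port are placeholders rather than the actual deployment settings:

// Minimal sketch: serve the router with tokio. Address and port are
// placeholders, not the real deployment configuration.
#[tokio::main]
async fn main() {
    let app = create_router();
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}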

Three Things That Made It Work

1. Speed

MCP tools need to respond fast. Users expect Claude to feel conversational, not laggy. Audioscrape’s Rust backend returns results in under 100ms, even when searching across millions of transcript segments.

The secret is pre-computed embeddings. When a podcast is ingested, I chunk the transcript into ~30 second segments, embed each one, and store them in a vector database. Search is just a nearest-neighbor lookup.
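Here is a toy sketch of that ingest-time/query-time split. The embed function stands in for whatever embedding model is used, and the in-memory Vec stands in for a real vector database, so treat it as an illustration rather than Audioscrape's actual pipeline:

// Toy sketch of the ingest-time work: embed each ~30 second segment once
// and store the vector next to its metadata. `embed` is a placeholder for
// a real embedding model; `Index` is a placeholder for a vector database.
struct Segment {
    episode_id: u64,
    start_secs: f32,
    text: String,
}

fn embed(_text: &str) -> Vec<f32> {
    unimplemented!("call your embedding model here")
}

struct Index {
    entries: Vec<(Vec<f32>, Segment)>,
}

impl Index {
    fn ingest(&mut self, segments: Vec<Segment>) {
        for seg in segments {
            let vector = embed(&seg.text); // computed once, at ingest time
            self.entries.push((vector, seg));
        }
    }
}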

2. Semantic Understanding

Keyword search fails for conversational queries. When someone asks about “startup failures,” they also want results mentioning “companies that went bankrupt,” “founders who lost everything,” or “why my business died.”

Audioscrape uses embedding-based semantic search. The query and all transcript segments live in the same vector space. Similar meanings cluster together, regardless of exact wording.
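Continuing the toy index from the sketch above, the query side embeds the query into the same space and ranks segments by cosine similarity; a real vector database would use an approximate nearest-neighbor index instead of this full scan:

// Query side of the toy index: embed the query and rank stored segments
// by cosine similarity. A production system would use ANN search rather
// than scanning every entry.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

impl Index {
    fn search(&self, query: &str, limit: usize) -> Vec<(&Segment, f32)> {
        let query_vec = embed(query);
        let mut scored: Vec<_> = self
            .entries
            .iter()
            .map(|(vector, seg)| (seg, cosine(&query_vec, vector)))
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(limit);
        scored
    }
}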

3. Structured Output

Claude works best with clean, predictable data. Each search result includes:

  • Episode and podcast name (for attribution)
  • Timestamp (so users can jump to that moment)
  • Relevant transcript snippet (context)
  • Relevance score (so Claude can prioritize)

No HTML, no markdown, no formatting tricks. Just data.
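For example, a single result serialized from the SearchResult struct above would look like this (the values are made up for illustration):

{
  "episode": "Example Episode Title",
  "podcast": "Example Podcast",
  "timestamp": "00:42:17",
  "transcript": "...a short snippet of what was said around this moment...",
  "relevance": 0.87
}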

What I Learned

Building for LLMs is different from building for humans. A few insights:

Be opinionated about defaults. Claude doesn’t want 50 results. It wants 3-5 highly relevant ones. I default to 5 and let the model ask for more if needed.

Fail gracefully. If search returns nothing, say so clearly. Don’t return empty arrays without explanation. Claude will pass that confusion to the user.

Think about context windows. Each transcript snippet consumes tokens. I limit snippets to ~200 characters - enough context without bloating the response.
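Here is a sketch of how those conventions might look around the handler, reusing the SearchResult type from earlier. The message wording, the response shape, and the 200-character cap are simply the choices described above, not a prescribed format:

// Sketch of the conventions above: cap snippet length and return an
// explicit message instead of a bare empty array. Struct shape and
// wording are illustrative.
const SNIPPET_CHARS: usize = 200;

#[derive(serde::Serialize)]
struct SearchResponse {
    results: Vec<SearchResult>,
    message: Option<String>, // set when nothing matched, so Claude can explain
}

fn build_response(mut results: Vec<SearchResult>) -> SearchResponse {
    for r in &mut results {
        // Keep each snippet short so a handful of results stays cheap in tokens.
        if r.transcript.chars().count() > SNIPPET_CHARS {
            r.transcript = r.transcript.chars().take(SNIPPET_CHARS).collect();
        }
    }
    let message = if results.is_empty() {
        Some("No podcast segments matched this query.".to_string())
    } else {
        None
    };
    SearchResponse { results, message }
}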

Try It

The MCP server is open source and listed in awesome-mcp-servers.

Install it:

npm install -g @anthropic/mcp-audioscrape

Then ask Claude about any topic. It will search more than a million hours of podcasts to find relevant discussions.

What’s Next

I’m working on:

  • Speaker identification - Know who said what
  • Episode summaries - Get the gist before diving into timestamps
  • Deep linking - Jump directly to the relevant moment in the original audio

If you’re building MCP tools, my advice: start simple, respond fast, and let the LLM do the heavy lifting on understanding intent. Your job is to provide the best possible data.


Audioscrape: audioscrape.com