Building an MCP Server — Parallel Search with Grok, Gemini & Brave

🔍 FastMCP · Grok (xAI) · Gemini (Google) · Brave Search · ThreadPoolExecutor · Parallel Search


🤔 Why One Search Engine Isn’t Enough

When you’re building AI agents, web search becomes essential sooner or later. Whether you’re looking up the latest news, researching trends, or reviewing technical docs — without search, hallucinations just pile up. But relying on a single search engine introduces bias. Google results are heavily SEO-optimized, Brave offers an independent index with different sources, and Grok can pull real-time data from X (Twitter).

So we built an MCP server that calls 3 search sources in parallel, cross-validates the results, and produces a synthesized report. It’s called terry-research. If you’re interested in how to build an MCP server, today we’re sharing everything honestly — from architecture design to fixing an asyncio bug and real-world test results.

🏗️ Architecture — The Orchestrator Pattern

At the heart of terry-research is the Orchestrator pattern. A single conductor dispatches work to multiple search agents simultaneously, then collects and merges the results.

MCP (Model Context Protocol)

A standard protocol for AI models to call external tools. AI clients like Claude and Cursor can automatically discover and execute tools from MCP servers. Check the official site for the full spec.

| Layer | Module | Role | Lines |
|---|---|---|---|
| Entry | server.py | FastMCP server + 4 tool definitions | 226 |
| Orchestrator | orchestrator.py | Parallel dispatch + result collection | 176 |
| Agents | grok / gemini / brave | Individual search API calls | 345 |
| Core | models + merger + config | Data structures + merging + config | 238 |

1,066 lines total. A small codebase, but the clean agent separation and parallel processing make it easy to extend.

🔧 Four MCP Tools

terry-research provides 4 tools. research() is the main one, and the other 3 let you call individual sources directly.

@mcp.tool()
def research(
    query: str,
    topic_type: str = "general",
    depth: str = "standard",
    sources: list[str] | None = None,
    include_images: bool = False,
    language: str = "ko",
    max_results_per_source: int = 10,
) -> str:
    """Multi-source research
    with parallel dispatch."""

| Tool | Source | Feature |
|---|---|---|
| research() | Auto-select | Topic routing + parallel + cross-validation |
| grok_search() | xAI | Web + X (Twitter) real-time search |
| gemini_search() | Google | Google Search grounding + thinking |
| brave_search() | Brave | Independent index, country/language filters |

🎯 Topic Routing — Auto-Selecting the Right Sources

When you call research() with a topic_type, the routing table in config.yaml automatically picks the best source combination.

# config.yaml
topic_routing:
  general:
    - grok
    - gemini
  fashion_trend:
    - grok
    - gemini
  sns_trend:
    - grok          # X (Twitter) data
  tech:
    - gemini
    - brave         # Independent index
  news:
    - grok
    - gemini

For example, SNS trends use only Grok because it’s the only one that can search X posts directly. For tech topics, the Gemini + Brave combo works great — you get Google’s vast index alongside Brave’s independent perspective.
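
The lookup described above can be sketched in a few lines. This is a minimal illustration, assuming the routing table has already been loaded into a dict: `TOPIC_ROUTING` mirrors the config.yaml excerpt, while the `select_sources` helper, its fallback to `general` for unknown topics, and the `available` argument are hypothetical names chosen for this example, not taken from the terry-research source.

```python
# Routing table copied from the config.yaml excerpt above.
TOPIC_ROUTING: dict[str, list[str]] = {
    "general": ["grok", "gemini"],
    "fashion_trend": ["grok", "gemini"],
    "sns_trend": ["grok"],
    "tech": ["gemini", "brave"],
    "news": ["grok", "gemini"],
}


def select_sources(topic_type, requested=None,
                   available=("grok", "gemini", "brave")):
    """Pick sources for a topic (hypothetical helper).

    An explicit `requested` list wins over the routing table.
    Unknown topics fall back to "general" (an assumption).
    Sources that are not in `available` are dropped.
    """
    candidates = requested or TOPIC_ROUTING.get(
        topic_type, TOPIC_ROUTING["general"]
    )
    return [s for s in candidates if s in available]


tech_sources = select_sources("tech")
sns_sources = select_sources("sns_trend")
```

Keeping the table in config rather than in code means a new topic only needs a new YAML entry.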

The depth parameter controls search intensity.

| Depth | Sources | Model | Use Case |
|---|---|---|---|
| quick | 1 | Default | Quick fact-check |
| standard | 2-3 | Default | General research |
| deep | All | Reasoning | Deep research |

In deep mode, Grok upgrades to grok-4-1-fast-reasoning, Gemini switches to gemini-3.1-pro-preview, and search turns increase from 3 to 10.
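
One way to encode these depth settings is a small lookup of profiles. The sketch below is an assumption about shape, not the project's actual code: `DepthProfile`, `DEPTH_PROFILES`, and `apply_depth` are hypothetical names, the 3-turn value for quick and standard is inferred from the "3 to 10" statement above, and "All" is read here as "every configured source regardless of routing".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DepthProfile:
    max_sources: Optional[int]   # None means "no cap"
    use_reasoning_models: bool   # deep mode switches models
    search_turns: int


# Values taken from the depth table; turn counts for
# quick/standard are an inference from "3 to 10".
DEPTH_PROFILES = {
    "quick": DepthProfile(1, False, 3),
    "standard": DepthProfile(3, False, 3),
    "deep": DepthProfile(None, True, 10),
}


def apply_depth(routed, depth,
                all_sources=("grok", "gemini", "brave")):
    """Trim or widen the routed source list for a depth."""
    profile = DEPTH_PROFILES[depth]
    if profile.max_sources is None:
        # Assumed meaning of "All": ignore routing in deep mode.
        return list(all_sources)
    return list(routed)[: profile.max_sources]


quick_sources = apply_depth(["grok", "gemini"], "quick")
deep_sources = apply_depth(["grok", "gemini"], "deep")
```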

⚡ Parallel Dispatch — ThreadPoolExecutor

This was the trickiest part. We needed to call 3 APIs at the same time, so some form of concurrency was essential. Naturally, we tried asyncio first.

# First attempt (failed!)
async def _dispatch(self, sources):
    tasks = [
        self._search(s) for s in sources
    ]
    return await asyncio.gather(*tasks)
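
asyncio.run() Conflict: The FastMCP Trap

FastMCP already runs inside an async event loop. Calling asyncio.run() from a synchronous tool function while that loop is running raises a RuntimeError ("asyncio.run() cannot be called from a running event loop"), the classic nested event loop problem. asyncio.gather() is not the culprit on its own: it works when awaited from a coroutine, but a plain def tool like research() has no way to await it without starting a second loop.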

The solution was ThreadPoolExecutor. Thread-based parallelism doesn’t conflict with the event loop.
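
The conflict is easy to reproduce with the standard library alone. In this sketch `host()` stands in for a server that is already running an event loop and calls synchronous tool functions from inside it; all function names here are made up for the demonstration and the searches are simulated with short sleeps.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


async def fake_async_search(source: str) -> str:
    await asyncio.sleep(0.01)
    return f"{source}: ok"


def fake_blocking_search(source: str) -> str:
    time.sleep(0.01)
    return f"{source}: ok"


def tool_with_asyncio_run() -> str:
    """Sync tool that tries to start a second event loop."""
    coro = fake_async_search("grok")
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "never awaited" warning
        return f"RuntimeError: {exc}"


def tool_with_threads() -> list[str]:
    """Sync tool that fans out with threads instead."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(
            pool.map(fake_blocking_search, ["grok", "gemini"])
        )


async def host() -> tuple[str, list[str]]:
    # Both tools are called while this loop is running.
    return tool_with_asyncio_run(), tool_with_threads()


nested_result, threaded_result = asyncio.run(host())
```

The first tool fails because a loop is already running in that thread; the second succeeds because worker threads never touch the loop. The synchronous tool still blocks its caller until the pool finishes.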

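from concurrent.futures import (
    ThreadPoolExecutor, TimeoutError, as_completed
)

def _dispatch_parallel(
    self, query, sources, deep, lang
) -> list[SourceResponse]:
    # One worker per source, so no search queues behind another.
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = {
        pool.submit(
            self._search_single,
            src, query, deep, lang
        ): src
        for src in sources
    }
    pending = set(futures)
    results = []
    try:
        for fut in as_completed(
            futures, timeout=self.timeout
        ):
            pending.discard(fut)
            try:
                results.append(fut.result())
            except Exception as e:
                # Isolate the failure to this one source.
                results.append(
                    SourceResponse(
                        source=futures[fut],
                        error=str(e)
                    )
                )
    except TimeoutError:
        # as_completed() raises once the deadline passes;
        # record the stragglers instead of crashing.
        for fut in pending:
            results.append(
                SourceResponse(
                    source=futures[fut],
                    error="timed out"
                )
            )
    finally:
        # Don't wait for slow threads (a with-block would).
        pool.shutdown(wait=False, cancel_futures=True)
    return results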

This way, each search agent runs in its own separate thread. If one source is slow, it doesn’t block the others, and after the timeout (30 seconds by default), any unfinished sources are simply skipped.

Error isolation is key

Even if one source throws an error, the entire pipeline doesn’t crash. Errors are recorded in SourceResponse.error, and the report is built from whatever sources succeeded. This matters a lot in practice — API keys expire, services go down temporarily. It’s just part of daily life.
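
To see error isolation and the timeout working together, here is a self-contained sketch with three fake agents: one succeeds, one raises, one is too slow. The `SourceResponse` fields, the `dispatch` function, and the agent functions are assumptions made for this example; the real terry-research classes may look different.

```python
import time
from concurrent.futures import (
    ThreadPoolExecutor, TimeoutError, as_completed
)
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SourceResponse:  # hypothetical shape
    source: str
    answer: str = ""
    error: Optional[str] = None


def dispatch(agents: dict[str, Callable[[str], str]],
             query: str, timeout: float) -> list[SourceResponse]:
    pool = ThreadPoolExecutor(max_workers=len(agents))
    futures = {pool.submit(fn, query): name
               for name, fn in agents.items()}
    pending = set(futures)
    results = []
    try:
        for fut in as_completed(futures, timeout=timeout):
            pending.discard(fut)
            name = futures[fut]
            try:
                results.append(
                    SourceResponse(name, answer=fut.result()))
            except Exception as exc:
                results.append(
                    SourceResponse(name, error=str(exc)))
    except TimeoutError:
        # Deadline passed: mark whatever is still running.
        for fut in pending:
            results.append(
                SourceResponse(futures[fut], error="timed out"))
    finally:
        pool.shutdown(wait=False, cancel_futures=True)
    return results


def ok_agent(query: str) -> str:
    return f"results for {query}"


def failing_agent(query: str) -> str:
    raise RuntimeError("API key expired")


def slow_agent(query: str) -> str:
    time.sleep(0.5)
    return "too late"


responses = dispatch(
    {"brave": ok_agent, "grok": failing_agent, "gemini": slow_agent},
    "mcp", timeout=0.1,
)
by_source = {r.source: r for r in responses}
```

Every source ends up with exactly one entry, so the report builder can tell a failed source from one that was never asked.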

🔀 Result Merging — The Power of Cross-Validation

When results come back from 3 sources, the Merger deduplicates and cross-validates them.

def merge(self, responses):
    # 1. Normalize URLs (strip www, trailing /)
    # 2. Deduplicate citations
    # 3. Track multi_source_urls
    #    (URLs found in 2+ sources)
    # 4. Preserve per_source answers
    ...  # outline only; the steps are listed above

The key is multi_source_urls. When the same URL appears in both Grok and Gemini results, it’s a strong signal of reliability. The report marks these as “cross-validated”, giving the AI client higher confidence.
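
The normalize, deduplicate, and cross-validate steps might look roughly like this. It is a sketch under assumptions: `normalize_url` and `merge_citations` are illustrative names, the input is simplified to a dict of URL lists per source, and the normalization rules (lowercase host, drop `www.`, drop trailing slash and fragment) are one reasonable reading of "strip www, trailing /".

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Canonical form used only for comparing citations."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Keep the query string, drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), host, path, parts.query, "")
    )


def merge_citations(by_source: dict[str, list[str]]):
    """Return (unique_urls, multi_source_urls)."""
    seen_in: dict[str, set[str]] = {}
    for source, urls in by_source.items():
        for url in urls:
            key = normalize_url(url)
            seen_in.setdefault(key, set()).add(source)
    unique = list(seen_in)  # first-seen order
    multi = [u for u, srcs in seen_in.items() if len(srcs) >= 2]
    return unique, multi


unique_urls, multi_source_urls = merge_citations({
    "grok": ["https://www.example.com/a/", "https://x.com/post/1"],
    "gemini": ["https://example.com/a", "https://news.example.org/b"],
})
```

Counting distinct sources per URL, rather than raw occurrences, keeps a source that cites the same page twice from looking like cross-validation.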

🧩 Agent-Specific Features

Each agent is designed to maximize the strengths of its underlying API.

| Agent | SDK/API | Unique Feature |
|---|---|---|
| GrokAgent | xAI Responses API | web_search + x_search combined, date filtering |
| GeminiAgent | google-genai SDK | Google Search grounding, thinking levels (0-8192 tokens) |
| BraveAgent | REST API (httpx) | Independent index, country/language/freshness filters |

Grok’s biggest advantage is searching X posts. You can see real-time reactions during SNS trend research. Gemini returns precise citations via grounding_metadata with URI/title pairs. Brave offers diversity with an index independent from Google.

📊 Real-World Test Results

Here are the results from searching “2026 Korean fashion trends” with topic_type: fashion_trend.

| Source | Response Size | Citations | Notes |
|---|---|---|---|
| Grok | 5,300 chars | 12 | Includes 3 X posts |
| Gemini | 4,281 chars | 8 | Google grounding metadata |
| Brave | 10 results | 10 | Independent sources not in Google |
| Combined | 10,724 chars | 24 | 33 seconds, 5 cross-validated URLs |

A 10,724-character report with 24 citations in just 33 seconds. Calling each source sequentially would have taken over 60 seconds, but parallel dispatch cut that nearly in half.

4 Demo Scenarios

1) Fashion trends (Grok+Gemini) · 2) Tech research (Gemini+Brave) · 3) SNS trends (Grok only) · 4) News fact-check (all sources, deep). You can see how source combinations change automatically based on topic routing.

🛠️ Try It Yourself — Setup Guide

If you want to build an MCP server yourself, terry-research is a great starting point. It will be available on GitHub soon. Register it with Claude Code or Cursor and you’re ready to go.

# 1. Clone (after public release)
git clone https://github.com/goandon/terry-research.git
cd terry-research

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set API keys (.env)
XAI_API_KEY=your-xai-key
GEMINI_API_KEY=your-gemini-key
BRAVE_API_KEY=your-brave-key

# 4. Run the MCP server
python server.py

Sources without API keys are automatically skipped. Even with just a Brave key, the brave_search() tool works perfectly fine.
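
A key check like this can be done by reading the environment once at startup. The sketch below assumes the three variable names from the .env example above; the `KEY_FOR_SOURCE` mapping and the `available_sources` helper are hypothetical, and treating a blank value as "missing" is a design choice made here.

```python
import os
from typing import Mapping

# Env var names come from the .env example; the mapping
# itself is an assumption for this sketch.
KEY_FOR_SOURCE = {
    "grok": "XAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "brave": "BRAVE_API_KEY",
}


def available_sources(
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Sources whose API key is set and non-blank."""
    return [
        source
        for source, var in KEY_FOR_SOURCE.items()
        if env.get(var, "").strip()
    ]


brave_only = available_sources({"BRAVE_API_KEY": "abc"})
```

Passing the environment in as an argument keeps the check easy to test without touching real keys.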

Registering with Claude Code

Add this entry under the top-level mcpServers key in your .mcp.json:

{
  "mcpServers": {
    "terry-research": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/terry-research"
    }
  }
}

💡 Lessons Learned Along the Way

  • FastMCP + asyncio.run() is a trap: FastMCP already runs its own event loop, so a synchronous tool cannot start a second one. ThreadPoolExecutor is the safe choice for parallelism
  • Error isolation isn’t optional: When you’re calling 3 external APIs, at least one will fail eventually. Without per-source error handling, the whole thing falls apart
  • Cross-validation is the real value: When the same URL shows up across multiple sources, confidence goes up. That is a step beyond simply concatenating results
  • Topic routing saves costs: Calling every source every time wastes both money and time. Selecting only the relevant sources for each topic is far more practical
  • 1,066 lines is enough: Keep it simple with one file per agent, avoid over-abstraction, and you’ll find it easy to read and maintain

✅ Wrapping Up

The three core ideas behind terry-research MCP are: topic routing for optimal source selection, ThreadPoolExecutor for parallel dispatch, and URL cross-validation for reliability. These patterns apply not just to search, but to any MCP server that needs to call multiple APIs simultaneously.

Once you try to build an MCP server yourself, you’ll feel the possibilities of AI agents expand dramatically. Even adding a single search tool makes your agent so much smarter. Give it a try — build your own MCP server!


© 2026 AI-Girls Lab | Privacy Policy | About
