
🤔 Why One Search Engine Isn’t Enough
When you’re building AI agents, web search becomes essential sooner or later. Whether you’re looking up the latest news, researching trends, or reviewing technical docs — without search, hallucinations just pile up. But relying on a single search engine introduces bias. Google results are heavily SEO-optimized, Brave offers an independent index with different sources, and Grok can pull real-time data from X (Twitter).
So we built terry-research, an MCP server that calls three search sources in parallel, cross-validates the results, and produces a synthesized report. If you’re interested in how to build an MCP server, this post walks through all of it: the architecture design, an asyncio bug and its fix, and real-world test results.
🏗️ Architecture — The Orchestrator Pattern
At the heart of terry-research is the Orchestrator pattern. A single conductor dispatches work to multiple search agents simultaneously, then collects and merges the results.
MCP (Model Context Protocol) is a standard protocol that lets AI models call external tools. AI clients like Claude and Cursor can automatically discover and execute the tools an MCP server exposes. Check the official site for the full spec.

| Layer | Module | Role | Lines |
|---|---|---|---|
| Entry | server.py | FastMCP server + 4 tool definitions | 226 |
| Orchestrator | orchestrator.py | Parallel dispatch + result collection | 176 |
| Agents | grok / gemini / brave | Individual search API calls | 345 |
| Core | models + merger + config | Data structures + merging + config | 238 |
1,066 lines total. A small codebase, but the clean agent separation and parallel processing make it easy to extend.
🔧 Four MCP Tools
terry-research provides 4 tools. research() is the main one, and the other 3 let you call individual sources directly.
@mcp.tool()
def research(
    query: str,
    topic_type: str = "general",
    depth: str = "standard",
    sources: list[str] | None = None,
    include_images: bool = False,
    language: str = "ko",
    max_results_per_source: int = 10,
) -> str:
    """Multi-source research with parallel dispatch."""
| Tool | Source | Feature |
|---|---|---|
| research() | Auto-select | Topic routing + parallel + cross-validation |
| grok_search() | xAI | Web + X (Twitter) real-time search |
| gemini_search() | Google | Search grounding + thinking |
| brave_search() | Brave | Independent index, country/language filters |
🎯 Topic Routing — Auto-Selecting the Right Sources
When you call research() with a topic_type, the routing table in config.yaml automatically picks the best source combination.
# config.yaml
topic_routing:
  general:
    - grok
    - gemini
  fashion_trend:
    - grok
    - gemini
  sns_trend:
    - grok  # X (Twitter) data
  tech:
    - gemini
    - brave  # Independent index
  news:
    - grok
    - gemini
For example, SNS trends use only Grok because it’s the only one that can search X posts directly. For tech topics, the Gemini + Brave combo works great — you get Google’s vast index alongside Brave’s independent perspective.
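The routing lookup itself can be sketched in a few lines. This is an illustration rather than the actual terry-research code: `TOPIC_ROUTING` and `resolve_sources` are hypothetical names, and both the "explicit sources win" rule and the fallback to `general` for unknown topics are assumptions.

```python
from __future__ import annotations

# Hypothetical in-memory copy of the topic_routing table from config.yaml
TOPIC_ROUTING = {
    "general": ["grok", "gemini"],
    "fashion_trend": ["grok", "gemini"],
    "sns_trend": ["grok"],
    "tech": ["gemini", "brave"],
    "news": ["grok", "gemini"],
}


def resolve_sources(topic_type: str, sources: list[str] | None = None) -> list[str]:
    """Pick sources: an explicit list wins, then the routing table, then 'general'."""
    if sources:
        return list(sources)
    # Assumption: unknown topic types fall back to the 'general' route
    return list(TOPIC_ROUTING.get(topic_type, TOPIC_ROUTING["general"]))


print(resolve_sources("tech"))       # ['gemini', 'brave']
print(resolve_sources("sns_trend"))  # ['grok']
```

Returning a copy of the list keeps callers from accidentally mutating the routing table.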
The depth parameter controls search intensity.
| Depth | Sources | Model | Use Case |
|---|---|---|---|
| quick | 1 | Default | Quick fact-check |
| standard | 2-3 | Default | General research |
| deep | All | Reasoning | Deep research |
In deep mode, Grok upgrades to grok-4-1-fast-reasoning, Gemini switches to gemini-3.1-pro-preview, and search turns increase from 3 to 10.
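One way to picture how depth changes a request is a small planning function. This is a sketch under assumptions, not the project's implementation: `DepthPlan` and `plan_for_depth` are hypothetical names, and treating "standard" as "use the routed sources as-is" is a guess at how the 2-3 source range is reached.

```python
from __future__ import annotations

from dataclasses import dataclass

ALL_SOURCES = ["grok", "gemini", "brave"]


@dataclass(frozen=True)
class DepthPlan:
    sources: list[str]          # which agents to call
    use_reasoning_model: bool   # deep mode switches to the reasoning models
    search_turns: int           # 3 by default, 10 in deep mode


def plan_for_depth(depth: str, routed: list[str]) -> DepthPlan:
    """Translate a depth level plus the routed sources into a concrete plan."""
    if depth == "quick":
        # Quick fact-check: only the first routed source
        return DepthPlan(routed[:1], False, 3)
    if depth == "standard":
        # Assumption: standard uses whatever the routing table selected
        return DepthPlan(list(routed), False, 3)
    if depth == "deep":
        # Deep research: every source, reasoning models, more search turns
        return DepthPlan(list(ALL_SOURCES), True, 10)
    raise ValueError(f"unknown depth: {depth!r}")


print(plan_for_depth("deep", ["grok", "gemini"]))
```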
⚡ Parallel Dispatch — ThreadPoolExecutor
This was the trickiest part. We needed to call 3 APIs simultaneously, so async processing was essential. Naturally, we tried asyncio first.
# First attempt (failed!)
async def _dispatch(self, sources):
    tasks = [
        self._search(s) for s in sources
    ]
    return await asyncio.gather(*tasks)
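FastMCP itself already runs inside an async event loop, and our tool functions are synchronous. Driving this coroutine from a synchronous tool means calling asyncio.run() (or loop.run_until_complete()) while that loop is running, which raises a RuntimeError along the lines of “event loop is already running”. asyncio.gather() on its own is fine inside a coroutine; the failure comes from trying to start a second loop on the same thread. A classic nested event loop problem.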
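The failure is easy to reproduce with the standard library alone. In this minimal sketch, `host()` merely stands in for a framework that already owns the event loop; it is an assumption for illustration, not FastMCP's code, and `fake_search` and `sync_tool` are hypothetical names.

```python
import asyncio


async def fake_search(name: str) -> str:
    await asyncio.sleep(0.01)
    return f"{name}: ok"


def sync_tool() -> str:
    """A synchronous tool body that tries to start its own event loop."""
    coro = fake_search("grok")
    try:
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never ran; close it to avoid a warning
        return f"RuntimeError: {exc}"


async def host() -> str:
    # Stands in for the framework: a loop is already running when the tool is called
    return sync_tool()


outcome = asyncio.run(host())
print(outcome)
```

The same `sync_tool()` works when called from plain synchronous code with no loop running, which is why the bug only shows up once the function is registered as a tool.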
The solution was ThreadPoolExecutor. Thread-based parallelism doesn’t conflict with the event loop.
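from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout

def _dispatch_parallel(
    self, query, sources, deep, lang
) -> list[SourceResponse]:
    # One worker thread per source. No `with` block here: leaving a
    # `with` would wait for every thread, including a hung one.
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = {
        pool.submit(self._search_single, src, query, deep, lang): src
        for src in sources
    }
    pending = set(futures)
    results = []
    try:
        for fut in as_completed(futures, timeout=self.timeout):
            pending.discard(fut)
            try:
                results.append(fut.result())
            except Exception as e:
                # Keep the failure local to this source
                results.append(
                    SourceResponse(source=futures[fut], error=str(e))
                )
    except FuturesTimeout:
        # as_completed() raises once the deadline passes; record the
        # unfinished sources as timed out so the report can skip them
        for fut in pending:
            results.append(
                SourceResponse(source=futures[fut], error="timeout")
            )
    finally:
        # Return immediately instead of joining slow threads
        pool.shutdown(wait=False, cancel_futures=True)
    return results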
This way, each search agent runs in its own separate thread. If one source is slow, it doesn’t block the others, and after the timeout (30 seconds by default), any unfinished sources are simply skipped.
Even if one source throws an error, the entire pipeline doesn’t crash. Errors are recorded in SourceResponse.error, and the report is built from whatever sources succeeded. This matters a lot in practice — API keys expire, services go down temporarily. It’s just part of daily life.
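To see the error isolation and the timeout handling end to end, here is a self-contained sketch with stub agents. Everything in it is illustrative: `fake_search`, `dispatch`, the `answer` field, and the shortened timeout are assumptions, and the real agents call external APIs instead of sleeping.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from dataclasses import dataclass


@dataclass
class SourceResponse:
    source: str
    answer: str = ""   # hypothetical field for the source's text answer
    error: str = ""    # non-empty when the source failed or timed out


def fake_search(source: str, query: str) -> SourceResponse:
    """Stub agent: gemini fails, brave is slow, grok succeeds."""
    if source == "gemini":
        raise RuntimeError("API key expired")
    if source == "brave":
        time.sleep(1.5)  # deliberately slower than the timeout below
    return SourceResponse(source=source, answer=f"{source} results for {query!r}")


def dispatch(query: str, sources: list, timeout: float = 0.5) -> list:
    pool = ThreadPoolExecutor(max_workers=len(sources))
    futures = {pool.submit(fake_search, s, query): s for s in sources}
    pending = set(futures)
    results = []
    try:
        for fut in as_completed(futures, timeout=timeout):
            pending.discard(fut)
            try:
                results.append(fut.result())
            except Exception as exc:
                results.append(SourceResponse(source=futures[fut], error=str(exc)))
    except FuturesTimeout:
        # Deadline passed: mark whatever has not finished as timed out
        for fut in pending:
            results.append(SourceResponse(source=futures[fut], error="timeout"))
    finally:
        pool.shutdown(wait=False)  # do not block on the slow thread
    return results


responses = {r.source: r for r in dispatch("mcp server", ["grok", "gemini", "brave"])}
usable = sorted(name for name, r in responses.items() if not r.error)
print(usable)  # ['grok']
```

One failed source and one timed-out source still leave a usable result, and both failures carry a reason the report can surface.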

🔀 Result Merging — The Power of Cross-Validation
When results come back from 3 sources, the Merger deduplicates and cross-validates them.
def merge(self, responses):
    # 1. Normalize URLs (strip www, trailing /)
    # 2. Deduplicate citations
    # 3. Track multi_source_urls (URLs found in 2+ sources)
    # 4. Preserve per_source answers
    ...
The key is multi_source_urls. When the same URL appears in both Grok and Gemini results, it’s a strong signal of reliability. The report marks these as “cross-validated”, giving the AI client higher confidence.
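The normalize, deduplicate, and cross-validate steps can be sketched with the standard library. This is not the actual Merger: `normalize_url` and `merge_citations` are hypothetical names, and ignoring the URL scheme (so http and https versions match) is an extra assumption beyond the www and trailing-slash rules described above.

```python
from __future__ import annotations

from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Lowercase the host, strip a leading 'www.' and any trailing '/'."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    # Assumption: the scheme is dropped so http/https variants compare equal
    return f"{host}{path}{query}"


def merge_citations(per_source: dict[str, list[str]]) -> tuple[list[str], list[str]]:
    """Return (deduplicated URLs, URLs cited by two or more sources)."""
    seen_by: dict[str, set[str]] = defaultdict(set)
    first_form: dict[str, str] = {}
    for source, urls in per_source.items():
        for url in urls:
            key = normalize_url(url)
            seen_by[key].add(source)
            first_form.setdefault(key, url)  # keep the first spelling we saw
    unique = list(first_form.values())
    multi = [first_form[key] for key, srcs in seen_by.items() if len(srcs) >= 2]
    return unique, multi


unique_urls, multi_source_urls = merge_citations({
    "grok": ["https://www.example.com/trends/", "https://x.com/post/1"],
    "gemini": ["https://example.com/trends", "https://vogue.example/2026"],
})
print(multi_source_urls)  # ['https://www.example.com/trends/']
```

Counting distinct sources per URL, rather than raw occurrences, keeps a source that cites the same page twice from cross-validating itself.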
🧩 Agent-Specific Features
Each agent is designed to maximize the strengths of its underlying API.
| Agent | SDK/API | Unique Feature |
|---|---|---|
| GrokAgent | xAI Responses API | web_search + x_search combined, date filtering |
| GeminiAgent | google-genai SDK | Google Search grounding, thinking levels (0-8192 tokens) |
| BraveAgent | REST API (httpx) | Independent index, country/language/freshness filters |
Grok’s biggest advantage is searching X posts. You can see real-time reactions during SNS trend research. Gemini returns precise citations via grounding_metadata with URI/title pairs. Brave offers diversity with an index independent from Google.
📊 Real-World Test Results
Here are the results from searching “2026 Korean fashion trends” with topic_type: fashion_trend.
| Source | Response Size | Citations | Notes |
|---|---|---|---|
| Grok | 5,300 chars | 12 | Includes 3 X posts |
| Gemini | 4,281 chars | 8 | Google grounding metadata |
| Brave | 10 results | 10 | Independent sources not in Google |
| Combined | 10,724 chars | 24 | 33 seconds, 5 cross-validated URLs |
A 10,724-character report with 24 citations in just 33 seconds. Calling each source sequentially would have taken over 60 seconds, but parallel dispatch cut that nearly in half.
Test scenarios: 1) Fashion trends (Grok+Gemini) · 2) Tech research (Gemini+Brave) · 3) SNS trends (Grok only) · 4) News fact-check (all sources, deep). You can see how source combinations change automatically based on topic routing.
🛠️ Try It Yourself — Setup Guide
If you want to build an MCP server yourself, terry-research is a great starting point. It will be available on GitHub soon. Register it with Claude Code or Cursor and you’re ready to go.
# 1. Clone (after public release)
git clone https://github.com/goandon/terry-research.git
cd terry-research
# 2. Install dependencies
pip install -r requirements.txt
# 3. Set API keys (.env)
XAI_API_KEY=your-xai-key
GEMINI_API_KEY=your-gemini-key
BRAVE_API_KEY=your-brave-key
# 4. Run the MCP server
python server.py
Sources without API keys are automatically skipped. Even with just a Brave key, the brave_search() tool works perfectly fine.
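The key check can be as simple as filtering the requested sources against the environment. A minimal sketch, assuming the three variable names from the .env step above; `KEY_ENV_VARS` and `available_sources` are hypothetical names, not the project's API.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Environment variable that each source needs (names from the .env example)
KEY_ENV_VARS = {
    "grok": "XAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "brave": "BRAVE_API_KEY",
}


def available_sources(
    requested: list[str], env: Mapping[str, str] | None = None
) -> list[str]:
    """Keep only the sources whose API key is set and non-empty."""
    env = os.environ if env is None else env
    return [
        src for src in requested
        if env.get(KEY_ENV_VARS.get(src, ""), "").strip()
    ]


# With only a Brave key configured, only brave survives the filter
print(available_sources(["grok", "gemini", "brave"], {"BRAVE_API_KEY": "demo"}))
# ['brave']
```

Passing the environment in as a parameter makes the filter easy to test without touching real keys.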
Add this to your .mcp.json, under the top-level mcpServers key:
{
  "mcpServers": {
    "terry-research": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/terry-research"
    }
  }
}
💡 Lessons Learned Along the Way
- FastMCP + asyncio is a trap — FastMCP already runs its own event loop internally, so ThreadPoolExecutor is the safe choice for parallelism
- Error isolation isn’t optional — When you’re calling 3 external APIs, at least one will fail eventually. Without per-source error handling, the whole thing falls apart
- Cross-validation is the real value — When the same URL shows up across multiple sources, confidence goes up. It’s a completely different level from simply concatenating results
- Topic routing saves costs — Calling every source every time wastes both money and time. Selecting only the relevant sources for each topic is far more practical
- 1,066 lines is enough — Keep it simple with one file per agent, avoid over-abstraction, and you’ll find it easy to read and maintain

📚 References
- terry-research GitHub — Full source code and setup guide (coming soon)
- Model Context Protocol Official Site — MCP spec and SDK docs
- FastMCP GitHub — Python MCP server framework
- xAI API Docs — Grok Responses API + search tools
- Gemini API Docs — Google Search grounding
- Brave Search API — Independent index search API
- BookStack Wiki Self-Hosting Guide — Same infrastructure series (related post)
- Synology NAS SSL Setup Guide — Useful for MCP server hosting (related post)
✅ Wrapping Up
The three core ideas behind terry-research MCP are: topic routing for optimal source selection, ThreadPoolExecutor for parallel dispatch, and URL cross-validation for reliability. These patterns apply not just to search, but to any MCP server that needs to call multiple APIs simultaneously.
Once you try to build an MCP server yourself, you’ll feel the possibilities of AI agents expand dramatically. Even adding a single search tool makes your agent so much smarter. Give it a try — build your own MCP server!
