Introduction to MCP and Tiny Agents
The Model Context Protocol (MCP) is revolutionizing how we build AI applications by standardizing tool integration for LLMs. In this guide, I’ll show you how to create a fully functional Python agent that leverages MCP to dynamically discover and use tools – all in under 100 lines of code.
Key benefits of this approach:
- No custom integration code needed for new tools
- Real-time tool discovery from MCP servers
- ⚡ Streaming responses for a smooth user experience
- Pure Python implementation
Getting Started: Installation
First, ensure you have Python 3.10 or 3.11 installed (the MCP SDK requires 3.10+, and 3.12+ has known issues). Then install the required packages:
```bash
pip install "huggingface_hub[mcp]>=0.32.4" playwright
playwright install
```
Building the Core Agent
Here’s our complete Tiny Agent implementation:
```python
import asyncio
from typing import AsyncGenerator, Optional

from huggingface_hub import ChatCompletionStreamOutput, MCPClient


class TinyAgent(MCPClient):
    def __init__(
        self,
        model: str,
        provider: Optional[str] = None,
        api_key: Optional[str] = None,
        system_prompt: str = "You are a helpful AI assistant.",
    ):
        super().__init__(model=model, provider=provider, api_key=api_key)
        # Running conversation history; process_single_turn_with_tools
        # appends the assistant and tool-result messages to this list.
        self.messages = [{"role": "system", "content": system_prompt}]

    async def chat(
        self, user_input: str, max_turns: int = 10
    ) -> AsyncGenerator[str, None]:
        """Handle one conversation turn, streaming text and running tools."""
        self.messages.append({"role": "user", "content": user_input})

        for _ in range(max_turns):
            # process_single_turn_with_tools streams the LLM response and
            # executes any tool calls against the connected MCP servers,
            # yielding both text chunks and tool-result messages.
            async for item in self.process_single_turn_with_tools(self.messages):
                if isinstance(item, ChatCompletionStreamOutput):
                    delta = item.choices[0].delta
                    if delta.content:
                        yield delta.content

            # If the last message is not a tool result, the model produced
            # its final answer and we can stop looping.
            if self.messages[-1].get("role") != "tool":
                return
```
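Even before connecting any MCP servers, you can stream a plain chat completion through the agent. A minimal sketch, assuming your Hugging Face token is available via the HF_TOKEN environment variable (the model and provider here are just examples):

```python
async def smoke_test():
    agent = TinyAgent(model="Qwen/Qwen2.5-72B-Instruct", provider="nebius")
    # No servers attached yet, so this is a plain streaming chat completion
    async for text in agent.chat("Say hello in one sentence."):
        print(text, end="", flush=True)

asyncio.run(smoke_test())
```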
Connecting to MCP Servers
Let’s add a Playwright MCP server for web browsing capabilities:
```python
async def main():
    # Initialize the agent with a Qwen model served by Nebius
    agent = TinyAgent(
        model="Qwen/Qwen2.5-72B-Instruct",
        provider="nebius",
        system_prompt="You are a web research assistant.",
    )

    # Add the Playwright MCP server (a local subprocess spoken to over stdio)
    await agent.add_mcp_server(
        type="stdio",
        command="npx",
        args=["@playwright/mcp@latest"],
    )

    # Interactive chat loop
    while True:
        user_input = input("\nYou: ")
        if user_input.lower() in ["exit", "quit"]:
            break

        print("\nAssistant: ", end="", flush=True)
        async for response in agent.chat(user_input):
            print(response, end="", flush=True)


if __name__ == "__main__":
    asyncio.run(main())
```
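Once `add_mcp_server` returns, the tools discovered from the server live on the client. If you want to confirm discovery, a quick sketch (this relies on MCPClient's `available_tools` attribute holding the tool schemas fetched from connected servers):

```python
# Inside main(), right after add_mcp_server:
# each entry is a tool schema fetched from a connected MCP server
for tool in agent.available_tools:
    print("discovered tool:", tool.function.name)
```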
Example Agent Configurations
1. Web Research Agent (agent_web.json)
```json
{
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "provider": "nebius",
  "servers": [
    {
      "type": "stdio",
      "config": {
        "command": "npx",
        "args": ["@playwright/mcp@latest"]
      }
    }
  ]
}
```
2. Image Generation Agent (agent_image.json)
```json
{
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "provider": "nebius",
  "servers": [
    {
      "type": "http",
      "config": {
        "url": "https://flux-1-schnell.hf.space/mcp"
      }
    }
  ]
}
```
Running Your Agent
Execute the agent with:
```bash
python tiny_agent.py --config agent_web.json
```
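The script listed above hard-codes its Playwright server, so the --config flag needs a little glue. A minimal sketch of that loader, replacing the hard-coded main() entry point from earlier (the run_from_config function and argparse wiring are additions for illustration, not part of the core agent):

```python
import argparse
import asyncio
import json


async def run_from_config(path: str) -> None:
    with open(path) as f:
        config = json.load(f)

    agent = TinyAgent(model=config["model"], provider=config.get("provider"))

    # Map each server entry's nested "config" dict onto add_mcp_server's parameters
    for server in config["servers"]:
        await agent.add_mcp_server(type=server["type"], **server["config"])

    while True:
        user_input = input("\nYou: ")
        if user_input.lower() in ["exit", "quit"]:
            break
        print("\nAssistant: ", end="", flush=True)
        async for text in agent.chat(user_input):
            print(text, end="", flush=True)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True, help="Path to an agent JSON config")
    args = parser.parse_args()
    asyncio.run(run_from_config(args.config))
```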
Example session: