
How to Build a Custom MCP Server with Python and FastMCP

Connecting Large Language Models (LLMs) to your internal data—whether it's a local SQLite database, a legacy CRM, or a private microservice—is the next frontier in AI engineering. The Model Context Protocol (MCP) by Anthropic has emerged as the standard for this connectivity.

However, many developers hit a wall immediately after reading the spec. While the concept is elegant, the implementation detail—managing a stateless JSON-RPC 2.0 connection over standard input/output (stdio)—is tedious. It requires handling message correlation, error serialization, and strict buffer management.

If you are writing raw JSON-RPC handlers to connect an AI agent to your database, you are wasting time on plumbing.

This guide details how to bypass the protocol complexity using FastMCP, a Pythonic framework that treats MCP servers like FastAPI applications. We will build a production-ready server that grants an AI agent safe, structured access to a local order database.

The Root Cause: Why Raw MCP Is Difficult

To understand why we need FastMCP, we must understand the friction of the raw protocol. MCP operates over a transport layer (usually stdio for local agents).

When an LLM wants to use a tool you've provided, the flow looks like this:

  1. Handshake: The client and server negotiate protocol versions and capabilities.
  2. Tool Discovery: The client requests a list of tools. The server must return JSON schemas describing every available function and its arguments.
  3. Execution: The client sends a call_tool request with a specific JSON-RPC id.
  4. Response: The server must execute the logic and return the result matching that exact id.

The Engineering Problem: In a raw implementation, you are responsible for the event loop. If your Python script uses print() for debugging, you corrupt the stdout stream, causing the JSON-RPC parse to fail and the connection to drop. Furthermore, manually converting Python function signatures into JSON Schema definitions is error-prone and hard to maintain as your code evolves.
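To make that pain concrete, here is a minimal sketch of the plumbing a raw implementation forces on you. The handler and tool names here are hypothetical, and a real server would also need the handshake, schema advertising, and error serialization:

```python
import json
import sys

# A hypothetical registry mapping tool names to plain Python functions.
TOOLS = {"get_order_details": lambda order_id: f"Order {order_id}: SHIPPED"}

def handle_line(line: str) -> str:
    """Parse one JSON-RPC request, dispatch it, and echo back the same id."""
    request = json.loads(line)
    tool = TOOLS[request["params"]["name"]]
    result = tool(**request["params"]["arguments"])
    # The response MUST carry the exact id of the request it answers.
    response = {"jsonrpc": "2.0", "id": request["id"], "result": result}
    return json.dumps(response)

def serve() -> None:
    """Blocking stdio loop: each stdin line is a request, each stdout line a reply."""
    for line in sys.stdin:
        sys.stdout.write(handle_line(line) + "\n")
        sys.stdout.flush()  # unflushed or interleaved output corrupts the stream
```

Every piece of this—id correlation, dispatch, flushing—is boilerplate that FastMCP absorbs for you.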

FastMCP abstracts the event loop and uses Python type hints to auto-generate the required JSON schemas, exactly as FastAPI does for REST endpoints.

The Fix: Building an Order Management MCP Server

We will build an MCP server that allows an AI agent to look up order status and refund eligibility from a local database.

Prerequisites

  • Python 3.10 or higher
  • uv or pip for package management

Step 1: Installation

Create a new virtual environment and install fastmcp.

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install FastMCP
pip install fastmcp

Step 2: Seed a Mock Database

For this example, we need data to query. Run this snippet once to create a local SQLite database named orders.db.

import sqlite3

def init_db():
    conn = sqlite3.connect("orders.db")
    cursor = conn.cursor()
    
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS orders (
            order_id TEXT PRIMARY KEY,
            customer_email TEXT,
            status TEXT,
            total_amount REAL,
            is_refundable BOOLEAN
        )
    """)
    
    data = [
        ("ORD-101", "alice@example.com", "SHIPPED", 120.50, 1),
        ("ORD-102", "bob@example.com", "PROCESSING", 450.00, 1),
        ("ORD-103", "charlie@example.com", "DELIVERED", 25.00, 0),
        ("ORD-104", "alice@example.com", "CANCELLED", 0.00, 0),
    ]
    
    cursor.executemany("INSERT OR IGNORE INTO orders VALUES (?, ?, ?, ?, ?)", data)
    conn.commit()
    conn.close()
    print("Database initialized.")

if __name__ == "__main__":
    init_db()

Step 3: The FastMCP Server Implementation

Here is the core logic. Create a file named server.py. Notice how we use type hints (str, float) and docstrings. FastMCP uses these to teach the LLM how to use the tool.

from fastmcp import FastMCP
import sqlite3
from typing import List, Dict, Any

# Initialize the FastMCP server
mcp = FastMCP("OrderManager")

DB_PATH = "orders.db"

def get_db_connection():
    """Establishes a connection to the SQLite database."""
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row  # Access columns by name
    return conn

@mcp.tool()
def get_order_details(order_id: str) -> str:
    """
    Retrieves full details for a specific order by ID.
    Use this when the user provides a specific order reference (e.g., ORD-101).
    """
    try:
        conn = get_db_connection()
        cursor = conn.cursor()
        
        cursor.execute("SELECT * FROM orders WHERE order_id = ?", (order_id,))
        row = cursor.fetchone()
        conn.close()

        if row:
            # Format the output for the LLM
            return (
                f"Order ID: {row['order_id']}\n"
                f"Customer: {row['customer_email']}\n"
                f"Status: {row['status']}\n"
                f"Total: ${row['total_amount']:.2f}\n"
                f"Refundable: {'Yes' if row['is_refundable'] else 'No'}"
            )
        else:
            return f"Error: Order {order_id} not found."
            
    except Exception as e:
        return f"Database Error: {str(e)}"

@mcp.tool()
def list_orders_by_status(status: str) -> List[Dict[str, Any]]:
    """
    Lists all orders matching a specific status (e.g., SHIPPED, PROCESSING, DELIVERED).
    Returns a list of JSON objects containing order ID and amount.
    """
    conn = get_db_connection()
    cursor = conn.cursor()
    
    # normalize status to uppercase for consistency
    query_status = status.upper()
    
    cursor.execute(
        "SELECT order_id, total_amount, customer_email FROM orders WHERE status = ?", 
        (query_status,)
    )
    rows = cursor.fetchall()
    conn.close()

    results = []
    for row in rows:
        results.append({
            "id": row["order_id"],
            "amount": row["total_amount"],
            "email": row["customer_email"]
        })
        
    return results

if __name__ == "__main__":
    # This runs the MCP event loop over stdio
    mcp.run()

Step 4: Connecting to Claude Desktop

To test this, you need to configure the Claude Desktop app (or any MCP client) to run this script.

  1. Open your Claude Desktop configuration file:
    • Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
  2. Add the following configuration:
{
  "mcpServers": {
    "order-manager": {
      "command": "uv",
      "args": [
        "run",
        "python",
        "/ABSOLUTE/PATH/TO/YOUR/server.py"
      ]
    }
  }
}

Note: Replace /ABSOLUTE/PATH/TO/YOUR/ with the actual path to your project folder.

  3. Restart Claude Desktop. Look for the plug icon indicating the "OrderManager" tool is connected.

Deep Dive: How FastMCP Bridges the Gap

When you annotate a function with @mcp.tool(), FastMCP performs three critical actions behind the scenes:

1. Introspection and Schema Generation

The LLM cannot read Python code; it reads JSON Schema. FastMCP inspects the get_order_details function signature. It sees order_id: str. It converts this into the following structure automatically:

{
  "name": "get_order_details",
  "description": "Retrieves full details for a specific order by ID...",
  "inputSchema": {
    "type": "object",
    "properties": {
      "order_id": { "type": "string" }
    },
    "required": ["order_id"]
  }
}

Without this introspection, you would have to manually write this JSON object for every single tool you create.
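You can approximate this introspection with the standard library alone. This simplified sketch is not FastMCP's actual implementation (which relies on Pydantic and covers far more types), but it shows the core idea of mapping type hints to JSON Schema:

```python
import inspect

# A deliberately minimal mapping; FastMCP (via Pydantic) handles many more types.
TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def build_input_schema(func) -> dict:
    """Derive a JSON Schema-style inputSchema from a function's type hints."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": TYPE_MAP[param.annotation]}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}

def get_order_details(order_id: str) -> str:
    ...  # stub standing in for the real tool

# build_input_schema(get_order_details) yields:
# {"type": "object", "properties": {"order_id": {"type": "string"}},
#  "required": ["order_id"]}
```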

2. Transport Abstraction

The mcp.run() command creates a blocking listener on sys.stdin. It parses incoming lines as JSON-RPC messages. When a request matches a registered tool, it delegates execution to your Python function, captures the return value, serializes it back to JSON, and writes it to sys.stdout.

3. Context Management

Notice we didn't have to handle the connection lifecycle. FastMCP handles the initialize handshake automatically, advertising the "OrderManager" name and the tools we defined during the setup phase.
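The handshake itself is ordinary JSON-RPC. An abridged server response to the client's initialize request looks roughly like this (the protocolVersion and version values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "protocolVersion": "2024-11-05",
    "serverInfo": { "name": "OrderManager", "version": "1.0.0" },
    "capabilities": { "tools": {} }
  }
}
```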

Common Pitfalls and Edge Cases

When taking an MCP server to production, generic "hello world" advice fails. Watch out for these specific issues:

1. The Print Statement Trap

Issue: If you leave a print("Debug: query executed") statement in your code, that text goes to stdout.

Result: The MCP client (Claude) expects valid JSON. It receives "Debug: query executed", fails to parse it, and disconnects the server immediately.

Solution: Always print to sys.stderr for debugging. FastMCP logs usually route to stderr automatically, which MCP clients capture as logs rather than protocol messages.

import sys

# Do this
print("DEBUG: Checking database connection...", file=sys.stderr)

# NEVER do this
# print("Checking database connection...") 

2. Security and SQL Injection

Giving an AI agent access to a database is risky if done poorly.

Anti-Pattern: Creating a tool that accepts raw SQL (def run_query(sql: str)). This allows the LLM (or a prompt injection attack) to DROP TABLE orders.

Best Practice: Only expose semantic tools (get_order, cancel_order) with parameterized queries, as shown in the code above (WHERE order_id = ?). This confines the agent to specific, safe actions.
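As an extra layer of defense, you can validate tool arguments against the expected format before they ever reach the database. A small sketch—the ORD-&lt;digits&gt; pattern is just this example's convention:

```python
import re

# Order IDs in this example follow the ORD-<digits> convention.
ORDER_ID_PATTERN = re.compile(r"^ORD-\d+$")

def is_valid_order_id(order_id: str) -> bool:
    """Reject anything that is not a well-formed order ID."""
    return ORDER_ID_PATTERN.fullmatch(order_id) is not None
```

A tool can then return an early error message for malformed IDs instead of touching the database at all.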

3. Concurrency Blocking

The default implementation interacts with SQLite, which is synchronous. If you have a long-running query, it blocks the main thread. For heavy operations (like complex vector search or external API calls), define your tools as async:

import asyncio

@mcp.tool()
async def fetch_external_data(query: str) -> str:
    # This yields the event loop, allowing other requests to process
    await asyncio.sleep(1) 
    return "Data fetched"
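For the synchronous SQLite calls themselves, one option is to offload them to a worker thread with asyncio.to_thread, keeping the event loop responsive. A sketch, using an in-memory database purely for illustration:

```python
import asyncio
import sqlite3

def _count_orders(db_path: str) -> int:
    """Blocking SQLite work, safe to run in a worker thread."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id TEXT PRIMARY KEY)")
    conn.execute("INSERT OR IGNORE INTO orders VALUES ('ORD-101')")
    count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    conn.close()
    return count

async def count_orders(db_path: str) -> int:
    # to_thread yields the event loop while the query runs in a thread
    return await asyncio.to_thread(_count_orders, db_path)

print(asyncio.run(count_orders(":memory:")))  # prints 1
```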

Conclusion

The ability to connect LLMs to local context is what separates a generic chatbot from a powerful engineering assistant. By using FastMCP, you eliminate the overhead of the JSON-RPC protocol and focus entirely on the business logic of your tools.

You now have a working framework. The next step is to expand your server: add tools for creating new orders, updating statuses, or even integrating a vector database for semantic search over order notes. The interface remains the same—just write a Python function, add the decorator, and let the AI handle the rest.