By 2026, the Model Context Protocol (MCP) has quietly become the default way to plug Claude into the rest of your stack. If you are still wiring custom tool definitions into every agent, hand-rolling JSON schemas, and pasting documentation into system prompts, you are doing roughly twice the work for half the result. This guide walks through what MCP actually is, how to build a real server, how to deploy it, and how it interacts with a gateway like Claudexia.
What MCP is, in one paragraph
MCP is an open protocol from Anthropic that standardises how an AI application (the client, e.g. Claude Desktop, Claude Code, OpenCode, your own agent) talks to an external capability provider (the server, e.g. a filesystem, a Postgres database, a GitHub account, your internal API). The server exposes three primitive types — tools (callable functions), resources (readable data), and prompts (parameterised templates) — over a transport. The client discovers them at runtime and hands them to the model. That's it. No bespoke glue per integration, no custom schema per project.
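To make the three primitives concrete, here is a rough sketch of the descriptors a client gets back from discovery. The field names follow the MCP specification as I understand it, trimmed to the common ones; the catalogue itself (search_tickets, tickets://open, summarise_ticket) is invented for illustration.

```typescript
// Descriptor shapes for the three MCP primitives, as a client sees them
// after discovery. Only the commonly used fields are shown.
interface ToolDescriptor {
  name: string;
  description?: string;
  inputSchema: {
    type: "object";
    properties?: Record<string, unknown>;
    required?: string[];
  };
}

interface ResourceDescriptor {
  uri: string;
  name: string;
  description?: string;
  mimeType?: string;
}

interface PromptDescriptor {
  name: string;
  description?: string;
  arguments?: { name: string; description?: string; required?: boolean }[];
}

// Hypothetical catalogue for an imaginary ticketing server.
const catalogue: {
  tools: ToolDescriptor[];
  resources: ResourceDescriptor[];
  prompts: PromptDescriptor[];
} = {
  tools: [
    {
      name: "search_tickets",
      description: "Search tickets by free-text query.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" } },
        required: ["query"],
      },
    },
  ],
  resources: [
    { uri: "tickets://open", name: "Open tickets", mimeType: "application/json" },
  ],
  prompts: [
    { name: "summarise_ticket", arguments: [{ name: "ticketId", required: true }] },
  ],
};
```

Tools carry a JSON Schema for their arguments, resources are addressed by URI, and prompts declare named arguments. Everything else in the protocol is plumbing around these three shapes.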
Why it beats raw tool definitions
You can absolutely call Claude's tools parameter directly with hand-written JSON schemas. People did this for two years. It works. But it does not scale across an organisation:
- Every app re-implements the same five tools. Filesystem, search, database, ticketing, calendar — each team writes their own version, with subtly different parameters and bug surface.
- Tool schemas live inside application code. Updating a parameter means a redeploy of every consumer.
- There is no discovery. A new agent does not know what tools the organisation already has. With MCP, it queries the server.
- Auth and rate limits are per-app. With MCP, they live on the server.
MCP turns a tool integration into a service. Build it once, run it once, let any Claude-powered client connect.
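For comparison, this is roughly what the hand-rolled approach looks like: a tool definition in the shape the Messages API tools parameter expects (name, description, input_schema), embedded in a request body. The search_tickets tool, its parameters, and the model placeholder are hypothetical; the point is that every consuming application carries its own copy of this object.

```typescript
// A hand-written tool definition in the Messages API shape.
// "search_tickets" and its parameters are invented for illustration.
interface AnthropicTool {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}

const searchTickets: AnthropicTool = {
  name: "search_tickets",
  description: "Search tickets by free-text query.",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Free-text search query." },
      limit: { type: "number", description: "Maximum results to return." },
    },
    required: ["query"],
  },
};

// Each consuming app embeds a copy like this in its own request body.
const requestBody = {
  model: "<model-id>", // placeholder, not a real model name
  max_tokens: 1024,
  tools: [searchTickets],
  messages: [
    { role: "user", content: "Find open tickets about login failures." },
  ],
};
```

Rename limit to max_results and every application that pasted this schema needs a redeploy. That is the maintenance cost MCP moves onto the server.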
Architecture
An MCP server is a process — local or remote — that speaks the MCP wire protocol over a transport. The two transports that matter in 2026:
- stdio — the server is a child process of the client. Used for local-only servers (filesystem, local git, a desktop app's own data). Zero network, zero auth, fastest to ship.
- Streamable HTTP — the server is a long-running HTTP service. Used for anything multi-user or hosted (your SaaS, a shared internal API, third-party integrations). Supports auth headers, SSE streaming, and horizontal scaling like any other web service.
The client maintains a session, lists capabilities (tools/list,
resources/list, prompts/list), and invokes them on demand. The model
itself never speaks MCP — the harness does. From the model's point of
view it is still calling normal tools.
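The discovery step can be sketched as follows. MCP messages are JSON-RPC 2.0, so tools/list is an ordinary request/response pair, and the harness reshapes the result into the tool format the model API expects. The response below is written by hand to match the add tool from this guide, not captured from a live server, and toModelTools is a hypothetical helper name.

```typescript
// MCP messages are JSON-RPC 2.0. The harness sends tools/list, then
// reshapes the result into the tool format the model API expects.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

interface ModelTool {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Request the harness sends to the server.
const listRequest = { jsonrpc: "2.0", id: 1, method: "tools/list" };

// A response the server might return (hand-written here, not captured).
const listResponse = {
  jsonrpc: "2.0",
  id: listRequest.id,
  result: {
    tools: [
      {
        name: "add",
        description: "Add two numbers and return the sum.",
        inputSchema: {
          type: "object",
          properties: { a: { type: "number" }, b: { type: "number" } },
          required: ["a", "b"],
        },
      },
    ] as McpTool[],
  },
};

// The only structural change is inputSchema -> input_schema.
function toModelTools(tools: McpTool[]): ModelTool[] {
  return tools.map((t) => ({
    name: t.name,
    description: t.description ?? "",
    input_schema: t.inputSchema,
  }));
}

const modelTools = toModelTools(listResponse.result.tools);
```

After this mapping the model sees a plain tool list and has no way to tell whether a tool came from MCP or from a hand-written schema.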
Building a minimal TypeScript MCP server
Install the official SDK:
    npm install @modelcontextprotocol/sdk zod
Here is a server that exposes one tool (add) and one resource
(config://app):
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { z } from "zod";

    const server = new McpServer({
      name: "demo-server",
      version: "1.0.0",
    });

    // Tool: name, description, Zod shape for the arguments, handler.
    server.tool(
      "add",
      "Add two numbers and return the sum.",
      { a: z.number(), b: z.number() },
      async ({ a, b }) => ({
        content: [{ type: "text", text: String(a + b) }],
      }),
    );

    // Resource: a fixed URI that returns static JSON when read.
    server.resource(
      "config",
      "config://app",
      async (uri) => ({
        contents: [{
          uri: uri.href,
          text: JSON.stringify({ env: "prod", region: "eu" }),
        }],
      }),
    );

    // Top-level await needs an ES module ("type": "module" in package.json).
    const transport = new StdioServerTransport();
    await server.connect(transport);
That's a complete, spec-compliant MCP server. Save it as server.ts,
compile, and you can register it with any MCP-capable client.
Testing it
Three clients, three configs:
Claude Desktop — edit claude_desktop_config.json:
    {
      "mcpServers": {
        "demo": {
          "command": "node",
          "args": ["/abs/path/to/server.js"]
        }
      }
    }
Restart Claude Desktop. The tool appears in the hammer menu.
Claude Code — claude mcp add demo node /abs/path/to/server.js.
Inside a session, /mcp lists connected servers.
OpenCode — add to opencode.json:
    {
      "mcp": {
        "demo": {
          "type": "local",
          "command": ["node", "/abs/path/to/server.js"]
        }
      }
    }
In all three, ask the model "what is 2 + 3?" — it will call your add
tool. Welcome to MCP.
Deploying as a remote HTTP MCP server
For anything multi-user, switch to Streamable HTTP. Replace the stdio transport:
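    // Requires: npm install express
    import express from "express";
    import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

    // buildServer() is assumed to wrap the McpServer setup from the stdio
    // example (new McpServer plus tool/resource registration) and return it.

    const app = express();
    app.use(express.json());

    app.post("/mcp", async (req, res) => {
      // Stateless mode: a fresh server and transport per request, no session
      // IDs. Generating a session ID on a per-request transport would break
      // every follow-up request, because the session would not exist anymore.
      const server = buildServer();
      const transport = new StreamableHTTPServerTransport({
        sessionIdGenerator: undefined,
      });
      // Release both when the HTTP response ends.
      res.on("close", () => {
        transport.close();
        server.close();
      });
      await server.connect(transport);
      await transport.handleRequest(req, res, req.body);
    });

    app.listen(3000);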
Put it behind your normal load balancer, terminate TLS, add an auth middleware that validates a bearer token per tenant, and you have a production MCP server. Clients connect with:
    {
      "mcpServers": {
        "demo": {
          "url": "https://mcp.example.com/mcp",
          "headers": { "Authorization": "Bearer ..." }
        }
      }
    }
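The bearer-token check mentioned above might look something like this. It is a framework-agnostic sketch: authenticate, the Tenant type, and the in-memory token table are all hypothetical names, and a real deployment would look up a hash of the token in a proper store rather than keeping raw tokens in a Map.

```typescript
// Resolve a tenant from an Authorization header. Framework-agnostic, so it
// can be called from Express middleware or any other HTTP layer.
// The token table is a hypothetical in-memory stand-in for a real store;
// in production you would look up a hash of the token instead.
type Tenant = { id: string; scopes: string[] };

const tokenTable = new Map<string, Tenant>([
  ["token-for-acme", { id: "acme", scopes: ["tickets:read"] }],
  ["token-for-globex", { id: "globex", scopes: ["tickets:read", "tickets:write"] }],
]);

function authenticate(authorization: string | undefined): Tenant | null {
  if (!authorization) return null;
  // Accept only the "Bearer <token>" form; anything else is rejected.
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  if (!match) return null;
  return tokenTable.get(match[1]) ?? null;
}

// The caller decides the HTTP response: 401 whenever this returns null.
const acme = authenticate("Bearer token-for-acme");
const rejected = authenticate("Bearer wrong-token");
```

In an Express app you would call authenticate(req.headers.authorization) at the top of the /mcp handler, respond with 401 on null, and pass the resolved tenant into whatever builds the server so each tool can scope its queries.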
Security: the parts everyone forgets
MCP servers run with your credentials. Treat them like any other API:
- Auth per client. Issue scoped tokens. Do not share one master key across every consumer. Rotate.
- Scope tools to the caller. A tool that lists all tickets is dangerous; a tool that lists tickets for the authenticated user is not. Carry identity through every call.
- Rate limit at the transport. A misbehaving agent can call a tool in a tight loop. Cap requests-per-minute per token.
- Validate every argument. Zod schemas are not optional. Treat tool arguments as untrusted user input, because they effectively are — the model can be jailbroken into emitting arbitrary JSON.
- Log every call. MCP is the audit boundary between the model and your data. Log tool name, arguments, caller, and outcome.
- Never put secrets in resources. Resources are read by the model. Anything you expose as a resource may end up echoed in a response.
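As one example from the list above, a per-token request cap can be sketched in a few lines. This is a minimal fixed-window limiter kept in process memory, which assumes a single instance; behind a load balancer you would back it with a shared store. The class name and the limit are illustrative choices, not something from an MCP SDK.

```typescript
// Fixed-window rate limiter keyed by token. In-memory only, so it assumes a
// single process; a shared store would be needed behind a load balancer.
class RateLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the call is allowed, false if the caller is over the cap.
  // "now" is injectable so the behaviour can be checked without waiting.
  allow(token: string, now: number = Date.now()): boolean {
    const w = this.windows.get(token);
    if (!w || now - w.start >= this.windowMs) {
      // First call for this token, or the previous window has expired.
      this.windows.set(token, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false;
    w.count += 1;
    return true;
  }
}

// Example: at most 60 tool calls per minute per token.
const limiter = new RateLimiter(60);
```

Call limiter.allow(token) before dispatching each request and return HTTP 429 when it comes back false. A fixed window lets short bursts through at window boundaries; a sliding window or token bucket is the usual next step if that matters.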
Real-world examples worth borrowing from
The official Anthropic and community servers are the fastest way to learn idioms:
- filesystem — sandboxed read/write to a directory. The reference implementation for path validation.
- postgres — read-only SQL with schema introspection as resources.
- github — issues, PRs, code search. Good example of pagination inside tool responses.
- stripe — payments and customers. Excellent example of scoped auth and idempotent operations.
- sentry — issues and events. Shows how to expose time-windowed data as tools rather than resources.
Read their source. The patterns are reusable.
How this works through Claudexia
A common question: do I need to do anything special on the gateway side to use MCP servers? No. MCP is a client-side concern. The flow is:
- Your client (Claude Desktop, Claude Code, OpenCode, your own agent) connects to one or more MCP servers.
- The client lists their tools and resources.
- When the user sends a message, the client sends a normal /v1/messages request to Claudexia with tools: [...] populated from the MCP servers.
- Claudexia routes that request to Claude with our standard pay-per-token pricing.
- Claude returns a tool_use block. The client, not the gateway, executes the call against the MCP server and returns the result.
In other words, Claudexia is on the model side of the protocol, and MCP is on the harness side. They compose cleanly. Point your MCP-aware client at Claudexia's Anthropic-compatible base URL, populate your tools from any MCP server, and pricing stays predictable per-token.
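The client's half of that flow can be sketched as below: take the content blocks from a Messages API response, run each tool_use block against MCP, and build the follow-up user message carrying tool_result blocks. The block shapes follow the Messages API; callMcpTool is a hypothetical stand-in for an MCP tools/call round trip, and the sample content (including the toolu_01 id) is made up.

```typescript
// One turn of the harness loop. The gateway never appears here: it only
// relays the request that carries these blocks.
type ToolUseBlock = {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
};
type ContentBlock = { type: "text"; text: string } | ToolUseBlock;
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string };

// Stand-in for an MCP tools/call round trip (hypothetical signature).
type CallMcpTool = (name: string, args: Record<string, unknown>) => Promise<string>;

// Pick out the tool calls the model asked for.
function pendingToolCalls(content: ContentBlock[]): ToolUseBlock[] {
  return content.filter((b): b is ToolUseBlock => b.type === "tool_use");
}

// Execute each call and build the next user message, or null if the model
// did not request any tools.
async function runToolCalls(
  content: ContentBlock[],
  callMcpTool: CallMcpTool,
): Promise<{ role: "user"; content: ToolResultBlock[] } | null> {
  const results: ToolResultBlock[] = [];
  for (const call of pendingToolCalls(content)) {
    const output = await callMcpTool(call.name, call.input);
    results.push({ type: "tool_result", tool_use_id: call.id, content: output });
  }
  return results.length > 0 ? { role: "user", content: results } : null;
}

// Made-up response content and a fake executor mimicking the add tool.
const sampleContent: ContentBlock[] = [
  { type: "text", text: "Let me add those." },
  { type: "tool_use", id: "toolu_01", name: "add", input: { a: 2, b: 3 } },
];
const fakeAdd: CallMcpTool = async (_name, args) =>
  String(Number(args.a) + Number(args.b));
```

The message returned by runToolCalls(sampleContent, fakeAdd) is appended to the conversation and sent back through the gateway as the next /v1/messages request. Only the model traffic crosses Claudexia; the tool execution stays between the client and the MCP server.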
Bottom line
MCP is the right abstraction for tool integration in 2026. Build servers once, mount them anywhere, and let the harness do the wiring. Start with a stdio server for a local workflow, graduate to HTTP when you need multi-user, and keep auth, scoping, and rate limits on the server where they belong. Then stop reinventing tool schemas and ship.