Your First MCP Server — and When You Don’t Need One

Mic
Jun 5
14 min read

Updated: 2 days ago

In the last post we looked at the difference between "Tools for LLMs" and "LLMs as Tools". At the end we mentioned that MCP servers are a way of making tools for LLMs be reusable and provided to different applications. They are also easy to build, if you have the tools already, and surprisingly easy to over-build.

The protocol itself is small, much smaller than the discourse around it would suggest. But because it’s easy, it’s tempting to wrap every function you own in an MCP server “just in case,” the same way it’s tempting to add a logging decorator to every function “just in case it’s useful later.”

The problem here is, that it usually isn’t. And the cost of an unused MCP server isn’t zero, you are running a process, you have a transport to configure, and a surface to keep working when your function signature changes.

This post is not a complete MCP reference, and it’s not a survey of every transport option or SDK. It’s the fundamental idea of what MCP actually is, the smallest server that does something real, how to wire that server into several different setups you might actually have on your machine, and some thoughts on “do I even need this?”

If you haven’t read the previous post, the short version of what it covers: a function becomes callable by a model when its docstring, parameter names, and types form a clear, unambiguous contract. Everything in this post assumes that work is already done — MCP is the delivery mechanism, not the design step.

Pretty good visualization, it simply serves tools to whoever asks for it..

What Problem Is MCP Actually Solving?

Before we explain anything, let's describe the problem.

Say you’ve written the five GitHub-issue tools from the previous post, those were search_issues, get_issue, find_similar_issues, apply_labels, post_comment, and they work nicely in your Python agent loop. Then:

A colleague wants the same five tools available inside their IDE’s AI assistant.
You want them available in a separate chat-based tool you use for ad-hoc triage.
Three months later, someone on another team wants to build a different agent that also needs to read and label issues.

If all of these are using the same Python version and packages, and also the same models, you could package everything. But what happens if they run on completely different systems and even use different packages to call models. Without a shared protocol, each of these would be is a separate integration: re-implementing the same five functions, or even worse, slightly different versions of them, inside each consuming application, in whatever plugin format that application happens to support. Four consumers, four copies, four chances for the apply_labels “additive, not replacing” behaviour to quietly diverge and come up with different solutions.

MCP, or Model Context Protocol, is a standard way for a client (by which we mean an application embedding an LLM) to discover and call capabilities exposed by a separate server, over a small set of well-defined transports. Write the five tools once, expose them via an MCP server, and any MCP-compatible client, for example Claude Desktop, an IDE plugin, your own custom agent, a colleague’s agent, can discover and use them without you writing per-client integration code.

That’s the whole pitch. What is really important here, is what it’s not: it’s not a new way to design tools, and it’s not a replacement for writing good Python. It’s simply a packaging format.

Resources, Tools, Prompts

An MCP server can expose three kinds of things to a client. Understanding which one fits your use case is the first real decision.

Tools: These are functions the model can call. This is the direct continuation of the previous post: a tool is a name, a description, a parameter schema, and code that runs when the model decides to call it. If you’ve already written functions with good docstrings and clear parameters, you’re most of the way to an MCP tool, since the contract you wrote is the MCP tool definition, almost verbatim.

Resources: This is data the client can read, more like files than functions. A resource doesn’t do anything when accessed. It just returns content, identified by a URI (=Uniform Resource Identifier). Think “the current contents of config/pricing.yaml” or “the latest 50 rows of the orders table, as CSV.” The model doesn’t call a resource the way it calls a tool; the client fetches it and includes it as context, often without the model “deciding” to fetch it at all, it might be the host application that attaches it automatically. Here we see another important advantage of MCP in addition to making tools available. You can have a server with certain resources can be made available via the MCP.

Prompts: These are reusable prompt templates the server provides, often parameterised. This option is less commonly used than the other two. A server might expose a prompt template like “review this pull request for security issues,” parameterised by a PR URL (=Pull Request URL), so that any client connected to the server can offer that as a one-click action without the client author having had to write that prompt themselves.

To make this concrete: a prompt definition doesn’t call a model itself, rather it returns a string template, which the client then sends to a model.

Of all the options, this might look like overkill the most. Why not just type that prompt yourself? After all there is a function you use that needs it. The answer shows up at scale and coherence: if five people on a team each write their own slightly-different “summarise for standup” prompt, you get five slightly different ideas of what “standup-friendly” means. Exposing it as a prompt template means the server, which is also the place that defines what the tools do, also defines the recommended way of asking for things, and that definition can be improved in one place for everyone using it.

A simple way to decide: does the model need to take an action with side effects or computation (tool), does it need to read something that exists independently of the call (resource), or does the server want to suggest a way of asking for something (prompt)?

In practice, the overwhelming majority of what people build is tools. Resources are valuable when you have genuinely large or frequently-changing reference data, whether this is documentation, schemas, logs and similar data, that you don’t want stuffed into every system prompt. Finally, prompts are valuable mostly for servers meant to be used by many different teams who’d otherwise each write their own slightly-different version of “the standard prompt for using this server.”

For the rest of this post, we’ll focus on tools, since they’re the most common case, and the one that connects most directly to everything from the previous post. Plus, without tools, the whole concept falls apart.

The Baby Example

Let’s take the check_username_available function from the previous post and serve it over MCP. This is deliberately the smallest possible useful example, it includes one tool, one file, using Python’s official MCP SDK.

That’s it, that is eveything we have to do to have a working MCP server.

Notice what didn’t change from the previous post: the docstring, word for word. The contract you write for an LLM-callable function is exactly the contract MCP exposes. The main thing that the @mcp.tool() decorator does, is to make all this discoverable over a standard protocol instead of being baked into one specific application’s tool list.

This is the through-line across both posts: the work of designing a good tool happens before you ever think about MCP. MCP doesn’t change the function, it just makes a clear function reachable from more places. If check_username_available had a one-line, ambiguous docstring, wrapping it in @mcp.tool() wouldn’t fix that — it would just make the ambiguity reachable from more places too.

A Slightly Bigger Example

One tool is the minimum useful example, but most servers expose a handful of related tools. Let’s extend the username server with a second, related tool, and we also add a resource, to show what that looks like in practice:

Two tools that compose naturally, int his case suggest_usernames even calls check_username_available internally, same as you’d compose any two Python functions, plus one resource exposing the reserved-prefix list as readable text. A client could fetch that resource to show users why admin-bob was rejected, without that explanation needing to be a tool call at all.

Docstrings Are Guidance, Code Is Enforcement

The username server is safe largely because it can’t do much damage, in the worst case, it tells someone "mic2" is available when it isn’t. Most real servers expose at least one tool that does something, and the moment that’s true, the docstring stops being the only thing standing between a model and a bad outcome.

Take a tool that writes to a local notes folder:

Notice the docstring says “use this only when the user explicitly asks”. But this sentence is a suggestion, not a constraint. In reality, nothing stops a model from calling create_note anyway if it misreads the conversation or simply hallucinates. The actual constraints are in safe_note_path and the exists() check: a request for filename="../../etc/passwd" gets rejected by the path check regardless of what the model intended, and a request to recreate project-alpha.md gets rejected by the overwrite check regardless of how the model phrased its reasoning if the file alrady exists.

This is the same read-only-vs-side-effect tiering from the previous post, now showing up as code rather than as a design heuristic: the docstring is where you tell the model what’s expected of it; the implementation is where you guarantee what actually happens regardless. Both layers matter, but only one of them is load-bearing. If you find yourself writing an elaborate docstring explaining all the ways a tool shouldn’t be misused, that’s often a sign the validation belongs in the function body instead. The rule here should be a docstring caveat is a hope, an if statement is a guarantee.

Trying It Out Before Wiring Anything Up

Before connecting a server to any client at all, it’s worth being able to poke at it directly — call each tool, check what comes back, confirm the safety checks actually trigger. The MCP Python SDK ships with an inspector for exactly this, run via:

mcp dev username_server.py

This opens an interactive interface listing the server’s tools, resources, and prompts. In our case, this would be check_username_available, suggest_usernames, and the config://reserved-prefixes resource. In addition, it lets you call each one with arbitrary arguments and see the raw result, without any model in the loop at all.

This step is easy to skip, but skipping it tends to produce a particular kind of confusing bug: something goes wrong inside a conversation with a model, and now you’re trying to figure out simultaneously whether the model made a bad call, whether the tool definition is ambiguous, or whether the underlying function has a bug. Calling create_note directly through the inspector with a deliberately bad filename, and confirming you get ValueError: Only files inside the notes directory are allowed rather than a stack trace or, even worse, a file created somewhere unexpected, collapses that down to “the function itself works,” leaving only the model-and-schema half to debug if something still goes wrong once a client is involved.

Wiring It Up: Several Setups

Until now this all just looks like the same functions as before with some decorators and a some mcp.run() command. Now we need to actually connect this server to the places you might want to use it. The server code above doesn’t change at all between these setups. What changes is purely how a client finds and starts it.

Setup 1: Claude Desktop

Claude Desktop reads a configuration file that lists MCP servers it should launch on startup. On macOS this is typically at

~/Library/Application Support/Claude/claude_desktop_config.json

on Windows

%APPDATA%\Claude\claude_desktop_config.json.

We add the following to the JSON file

{
  "mcpServers": {
    "username-checker": {
      "command": "python",
      "args": ["/Users/mic/projects/username_server.py"]
    }
  }
}

After restarting, both check_username_available and suggest_usernames become available as tools the model can call during a conversation. In addition the config://reserved-prefixes resource becomes something Claude can read if relevant. You’d see this reflected in the app’s MCP/tools indicator, and Claude might say something like “let me check if that username is available” mid-conversation, then actually call the tool.

If your server has dependencies beyond the mcp package itself, point command at the right interpreter, for example a virtualenv’s Python, for instance:

{
  "mcpServers": {
    "username-checker": {
      "command": "/Users/mic/projects/.venv/bin/python",
      "args": ["/Users/mic/projects/username_server.py"]
    }
  }
}

Setup 2: A Python Script Using the MCP Client SDK Directly

If you’re building your own agent loop in Python, what we called System B from the previous post, you don’t need Claude Desktop at all. The mcp package includes a client side too, which can launch the server as a subprocess and talk to it directly.

This is the same server file from before, completely unchanged, the difference is that stdio_client launches it as a subprocess, exactly the way Claude Desktop does under the hood, just from your own script instead of from an app. The tools.list_tools() call is the discovery step: it returns the tool names, descriptions, and parameter schemas — the exact contract from the docstrings, which is what you’d feed into a model’s tools parameter in an agent loop.

A minimal agent loop tying this together with an LLM call might look like:

The run_agent function does not do much, it simply looks long. It sends the request to gemini, then looks whether the LLM wants to call a tool, goes through the tools to be called, converts the response to one gemini can understand, collects all the function responses and adds them to the content. This loop continues until it either hits a limit or comes up with a response that does not include any function calls.

The exact content of this example is not that important, it should just show how you tell gemini which tools are available, in the mcp_tools_to_gemini_tool function. Note that the function_declarations is built dynamically from whatever the server exposes. If you later add a third tool to username_server.py, this agent picks it up automatically on next run. There is no need to change agent.py at all.

Note that newer versions of the Google SDK have MCP compatibility build in. In which case you can pass the session as a whole into the tools. In this case you also do not need mcp_result_to_function_response() or extract_function_calls() anymore. This more roundabout example was there to show what you might have to do if the package does not come with everything wired in already.

Of course this use is secondary, a good package structure could have provided the same tools to the agent.py script and be similarly extendable by providing different tools whenever the model is called.

Setup 3: A Different Host Application (Generic HTTP-Based Client)

Not every client launches servers as local subprocesses. For servers that need to be reachable over a network, hence shared infrastructure rather than something running on the same machine as the client, MCP also supports HTTP-based transports. This is often referred to as “Streamable HTTP” in current MCP tooling.

The server side changes only slightly:

Now the server is a long-running process listening on a port, rather than something launched per-client. A client configuration that previously specified a command to launch now specifies a URL instead:

{
  "mcpServers": {
    "username-checker": {
      "url": "http://localhost:8765/mcp"
    }
  }
}

The exact configuration key names vary slightly between client applications, some use "url", others nest it under a "transport" block, but the underlying idea is the same: the client connects to an already-running server instead of starting one itself. This is the setup that makes sense for the “colleague’s agent, three months later” scenario from the start of this post, the server runs once, centrally, and any number of clients connect to the same instance.

A Python client connecting to this over HTTP looks structurally similar to the stdio version, swapping stdio_client for an HTTP-based client transport from the same SDK, the ClientSession usage, including the commands initialize(), list_tools(), call_tool() is identical regardless of transport. That consistency, same client-side calls, different transport underneath, is arguably the most practically useful part of the whole protocol.

Setup 4: Other Editors and Tools

The pattern repeats across other MCP-compatible applications, with the same two pieces every time: a server (your code, unchanged) and a client configuration (application-specific syntax) that tells the client how to reach it.

For an editor or IDE assistant with MCP support, this is typically a workspace or user-level settings file accepting the same shape of configuration as Claude Desktop’s, a named server entry with either a command/args pair (for subprocess/stdio servers) or a url (for HTTP servers). If you’ve configured one MCP-compatible client, configuring a second is usually a matter of finding where that application stores its server list and pasting in an equivalent entry, the server itself needs zero changes.

The practical workflow, once you’ve built a server: get it working with the stdio transport and a simple Python client script first, what we had as Setup 2 above, this gives you the fastest feedback loop, since you can print debug output directly and don’t need to restart a GUI application to see changes. Once the tools themselves work correctly, wire the same server into whichever GUI client(s) you actually want to use day-to-day, whether this is Setup 1 or its equivalent for other editors. Only reach for the HTTP transport, i.e. Setup 3, once more than one machine, or more than one long-lived client, needs to share a single running instance.

When a Script Is Enough

Here’s the question that matters more than any transport detail: how many different clients will ever call this?

If the answer is “one, my own agent loop, in my own application”, you don’t need MCP. You need the function from the previous post, in your tools list, called directly:

No server, protocol, or subprocess is needed here. Compare the work involved: the plain version above is two lines once the functions exist. The MCP version requires a server file, a transport decision, a client configuration, and if something breaks, debugging across a process boundary, where your usual debugger and print statements need an extra hop to be useful.

MCP earns its keep when the same capability needs to be reachable from multiple, independent clients, including your IDE assistant, a colleague’s agent, Claude Desktop, a separate internal tool, without each of those maintaining its own copy of check_username_available and its own opinions about whether it reserves the username or not.

That’s the real value proposition: one definition of the contract, many callers, instead of the contract getting copy-pasted and silently drifting across every project that needs it. The moment that second caller shows up, is the moment MCP starts paying for itself. That still does not mean it is the right idea, but then one should consider it. Before that moment, it’s the username-server equivalent of adding a @retry decorator to a function that’s never failed: technically harmless, but solving a problem you don’t have yet, at the cost of a layer you’ll need to maintain.

If you’re building a tool for your own agent and nothing else currently exists or is planned that would also call it, a script is the reasonable choice, not a lesser one. The MCP version of that same function isn’t more correct. It’s the same function, with overhead that pays for itself only when something else shows up to use it. Plus nothing is keeping you from adding it to an MCP server later on if it is needed. After all adding it, takes up very little time.

A Short Checklist, for Both Decisions

Two decisions have come up throughout this post, whether a tool is safe to expose as-is, and whether MCP is the right packaging for it. A short list, asked before writing the server, tends to be more honest than the same questions asked after.

For the tool itself:

Can it be described correctly in two or three sentences, without an enumerated list of special cases?
Does the implementation enforce its own constraints (paths, sizes, valid values), independent of what the docstring asks for?
If it has side effects, is that obvious from the name and the first line of the docstring and not buried in paragraph three?

For the MCP decision:

Is there a second caller, whether this is a different client, a colleague’s project, a different host application, that exists now or is concretely planned?
Would a plain function in a tools list, called directly, actually be harder to maintain than a server-plus-transport-plus-config?
If this becomes a long-running HTTP server rather than a per-client subprocess, are you prepared for the ordinary web-service questions, including who can connect, what’s logged, what happens if it’s down, that the protocol itself doesn’t answer?

If most of your answers land on “no” or “not yet” then MCP is probably mostly useless for now.

Putting It Together

Both posts come back to the same point: start with the tool contract.

A model can only call a function well if the signature and docstring clearly explain what it does, what it needs, and what it returns. That matters whether the tool lives in a simple script, an agent framework, or an MCP server.

MCP adds reach, not magic. It lets the same well-described tool be used by more clients and models. But if the function is unclear, MCP only gives you a more complicated place to debug the same problem.

So the order is simple: define the tool clearly first, then decide whether it belongs in a plain toolbox or behind an MCP server.

Frameworks like LangChain do not change that, they simply handle the agent side; MCP can be one source of tools underneath, but it also does not need to be the only source.