Local MCP Server
Using cloud-based AI services brings its own set of challenges, along with some familiar ones: How much does using these services actually cost me? Who receives, processes, and stores my data? And how can I test my MCP server locally? Data security in particular is a major topic in Germany. In this blog post, I'll show you how to set up a local MCP server that can be accessed by a locally hosted AI model.
1. Setting up a local AI (Ollama)
First, we need a local AI model. For this, we use https://ollama.com/. Ollama is open-source software that lets you run LLMs on your own machine without cloud services and without registration.
The setup is straightforward: download, install, and launch Ollama.
Next, open a terminal and run the model qwen3:30b. To verify the installation, you can run any model with ollama run <model>. A list of available models can be found at https://ollama.com/library.
I am using the qwen3:30b model because it supports tool calls, which we need for our MCP server.
Note: qwen3:30b requires a powerful GPU or plenty of RAM. On weaker hardware, qwen3:0.6b is recommended:
ollama run qwen3:0.6b
If Ollama is working correctly, you should see a chatbot prompt in the terminal:
ollama run qwen3:30b
>>> Send a message (/? for help)
To test it, we send a simple request:
"""Translate the following sayings and words literally into English:
- Das ist nicht das Gelbe vom Ei
- Fuchsteufelswild
- Ich glaube, mein Schwein pfeift.
- Holla, die Waldfee!"""
Result:
- That is not the yellow from the egg
- Fox devil wild
- I believe, my pig whistles
- Holla, the forest fairy
""" indicates that we are providing a multiline prompt. After confirming that the model works, we can exit the chat with /bye.
2. Setting up the chat interface (Tome)
A console-based chatbot works, but it is inconvenient. Multiline input requires """, and we also have to manage the MCP client ourselves, which is the component that connects to MCP servers and communicates using the MCP protocol.
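To make this concrete, here is a minimal sketch of such an MCP client, built with the official TypeScript SDK (the client name and the server path are assumptions for illustration):
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Identify this client to the server.
const client = new Client({ name: "example-client", version: "1.0.0" });

// Launch the MCP server as a child process and talk to it over stdin/stdout.
const transport = new StdioClientTransport({
  command: "node",
  args: ["dist/index.js"], // assumed path to a built MCP server
});
await client.connect(transport);

// Ask the server which tools it offers; a tool is then invoked
// with client.callTool({ name, arguments }).
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));
Writing and maintaining such a client by hand is exactly the kind of plumbing we would rather avoid.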
Tome solves this problem. It is open-source, provides a graphical chat interface for local AI models, and supports MCP servers with a simple configuration.
After installation, Tome automatically detects installed Ollama models.
3. MCP server
We now have a local AI model and a chat interface. The next step is to extend the model with additional functionality through an MCP server.
AI models are limited by their training data. They cannot access up-to-date information unless they have an MCP server that provides it, and they sometimes struggle with simple tasks.
For example:
How many times does the letter t appear in the word 'Ratttattta'?
The model often answers incorrectly. In my case:
The letter "t" appears 5 times.
To fix this, we create an MCP server that counts letters. This is unnecessary for such a small task, but it illustrates how MCP servers add new capabilities such as counting letters, creating records, or accessing and processing data.
3.1 Project structure
letter-counting-mcp-server/
├── src/
│   └── index.ts
├── package.json
└── tsconfig.json
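If you are starting from scratch, the dependencies can be installed as follows (a sketch assuming npm; the exact version ranges are listed in the package.json below):
npm install @modelcontextprotocol/sdk zod
npm install -D typescript tsx @types/node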
3.2 Code
index.ts
#!/usr/bin/env node
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
const server = new McpServer({
  name: "letter-counting-mcp-server",
  version: "1.0.0",
  description: "MCP server for counting letters in words",
});

// Register a tool that the model can call. The input schema is defined
// with zod, so the SDK validates the arguments before the handler runs.
server.registerTool(
  "count_specific_letter",
  {
    title: "Count Specific Letter",
    description:
      "Count how many times a specific letter appears in a word (case-insensitive)",
    inputSchema: {
      word: z.string().min(1).describe("The word to search in"),
      letter: z.string().length(1).describe("The letter to count"),
    },
  },
  async ({ word, letter }) => {
    // Compare case-insensitively, character by character.
    let count = 0;
    const target = letter.toLowerCase();
    for (const char of word.toLowerCase()) {
      if (char === target) count++;
    }
    return {
      content: [
        {
          type: "text",
          text: `The letter "${letter}" appears ${count} time(s) in the word "${word}".`,
        },
      ],
    };
  }
);

// Communicate with the MCP client via standard input/output.
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
}

main().catch((error) => {
  console.error("Failed to start Letter Counting MCP Server:", error);
  process.exit(1);
});
The server uses StdioServerTransport, which communicates via standard input and output. This is ideal for local use, CLI tools, and quick testing.
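Under the hood, these messages are JSON-RPC 2.0. A tool invocation sent by the client over stdin looks roughly like this (the id is arbitrary):
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "count_specific_letter",
    "arguments": { "word": "Ratttattta", "letter": "t" }
  }
}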
package.json
{
  "name": "letter-counting-mcp-server",
  "version": "1.0.0",
  "description": "A TypeScript MCP server that provides letter counting functionality for words",
  "main": "dist/index.js",
  "type": "module",
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "watch": "tsx watch src/index.ts"
  },
  "devDependencies": {
    "@types/node": "^24.9.2",
    "tsx": "^4.0.0",
    "typescript": "^5.0.0"
  },
  "dependencies": {
    "@modelcontextprotocol/sdk": "^1.0.0",
    "zod": "^3.22.0"
  }
}
tsconfig.json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "outDir": "./dist",
    "rootDir": "./src"
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}
To initialize and build:
cd <path/to/mcp/server>
npm install
npm run build
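Before wiring the server into Tome, it can be tested in isolation, for example with the MCP Inspector, the official debugging UI for MCP servers (invocation as a sketch, assuming npx is available):
npx @modelcontextprotocol/inspector node dist/index.js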
4. Integrating the MCP server into Tome
Open Tome, navigate to the MCP server settings, and add a new entry with:
node <path/to/mcp/server>/dist/index.js
Tome detects the server and exposes its tools inside the chat.
5. Test
To test the server:
How many times does the letter t appear in the word 'Ratttattta'? Use the letter counting MCP server.
Since the model tends to answer on its own, it must be instructed to use the tool.
6. Conclusion
We set up a local MCP server, connected it to a locally hosted AI model, and integrated it into a chat interface. The example is simple, but the concept applies to more complex tasks. This setup allows controlled extensions of language models without sending data to external cloud services.
In environments where data protection and local control matter, this approach provides a practical and secure foundation for AI systems. It also makes it possible to control well-defined processes through natural language, which helps streamline internal workflows.