Model Context Protocol (MCP) is quickly becoming the standard for enabling LLMs to call external tools. It’s built around clean, declarative tool definitions—but most current implementations fall short of being production-ready. Every official MCP server in the Anthropic repo, for instance, runs locally and communicates over stdio. Even the few that support HTTP rely on Server-Sent Events (SSE) for streaming. This introduces stateful behavior, requiring persistent TCP connections, complicating retries, and ultimately making it incompatible with stateless environments like AWS Lambda. We’ve written more about these limitations, and how we’ve addressed them with MCPEngine.
AWS Lambda offers instant scalability, no server management, and efficient, event-driven execution. We built native support for it in MCPEngine, so that MCP tools can run cleanly and reliably in serverless environments.
MCPEngine is an open-source implementation of MCP that supports streamable HTTP alongside SSE, making it compatible with Lambda. It also includes first-class support for authentication, packaging, and other capabilities to build and deploy production-grade MCP servers.
This post walks through building three progressively more realistic examples: a minimal stateless weather tool, a stateful message board backed by a Postgres database, and an authenticated version of the message board that verifies users with Google OIDC.
All three run entirely in Lambda, don't require a custom agent, and are MCP-spec compliant.
You can follow along on GitHub here for the full project.
We'll start with a single tool called get_weather. It takes a city name and returns a canned string response. There's no state or external API call — just enough to validate end-to-end behavior with a live LLM.
Install the Python SDK:
pip install mcpengine[cli,lambda]
Create a file called app.py:
from mcpengine import MCPEngine

engine = MCPEngine()

@engine.tool()
def get_weather(city: str) -> str:
    """Return the current weather for a given city."""
    return f"The weather in {city} is sunny and 72°F."

handler = engine.get_lambda_handler()
What this does: @engine.tool() registers the function with the MCP manifest, and the function name (`get_weather`) becomes the tool name exposed to the LLM. get_lambda_handler() returns the handler that Lambda will invoke.

You can deploy this manually or use Terraform to automate setup.
If you want to skip most of the boilerplate, we provide Terraform scripts that set up everything covered in the manual steps below: the ECR repository, the Lambda function and its execution role, and the Function URL.
You can run them from their directory by calling:
terraform apply
Grab the ECR repository URL and Lambda function name from the Terraform output:
export REPOSITORY_URL=$(terraform output -raw repository_url)
export FUNCTION_NAME=$(terraform output -raw lambda_name)
And then build, tag, and push the image:
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda:latest .
docker tag mcp-lambda ${REPOSITORY_URL}:latest
docker push ${REPOSITORY_URL}:latest
Finally, we’ll update the Lambda with this new image:
aws lambda update-function-code \
--function-name ${FUNCTION_NAME} \
--image-uri ${REPOSITORY_URL}:latest
And the application will be running. When you're done, you can tear everything down with:
terraform destroy
If you prefer to deploy manually:
Step 1: Dockerize the Server
FROM public.ecr.aws/lambda/python:3.12
# Set working directory in the container
WORKDIR /var/task
# Copy application code
COPY . .
# Install dependencies
RUN pip install --no-cache-dir .
# Expose port for the server
EXPOSE 8000
# Module path to the Lambda handler (app.handler for the app.py example above)
CMD ["app.handler"]
Then:
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda .
Step 2: Push to ECR
docker tag mcp-lambda:latest <your-ecr-url>/mcp-lambda:latest
docker push <your-ecr-url>/mcp-lambda:latest
Step 3: Deploy to Lambda
aws lambda create-function \
--function-name mcp-lambda \
--package-type Image \
--code ImageUri=<your-ecr-url>/mcp-lambda:latest \
--role arn:aws:iam::<account-id>:role/<lambda-role>
Step 4: Create an execution role for the Lambda (this is the role referenced in Step 3; create it first if it doesn't already exist)
aws iam create-role \
--role-name lambda-container-execution \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}'
aws iam attach-role-policy \
--role-name lambda-container-execution \
--policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
Step 5: Enable a Function URL, and add an allow-all permission to call it:
aws lambda create-function-url-config \
--function-name mcp-lambda \
--auth-type NONE
aws lambda add-permission \
--function-name mcp-lambda \
--function-url-auth-type NONE \
--action lambda:InvokeFunctionUrl \
--statement-id PublicInvoke \
--principal '*'
Once deployed, you can connect to the server using any compatible LLM. For example, to connect from Claude:
mcpengine proxy <service-name> <your-lambda-function-url> --mode http --claude
Open Claude, and your tool should appear at the bottom of the chat bubble. When you ask something like "What's the weather in Tokyo?", Claude will recognize that get_weather is relevant, call it over MCP with city set to "Tokyo", and weave the returned string into its reply.
That's it. You now have a fully deployed, Lambda-hosted MCP server, responding to real LLM calls over HTTP.
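If you want to sanity-check the endpoint outside of Claude, the traffic is plain JSON-RPC over HTTP. Below is a rough sketch of a tools/call request; note that a real MCP client first performs an initialize handshake, and the /mcp path, headers, and use of the requests library are assumptions for illustration rather than documented MCPEngine behavior.

# Illustrative sketch of the MCP tools/call request an LLM client sends.
# Assumptions: the "/mcp" path, the Accept header, and skipping the initialize
# handshake that a real client would perform first.
import requests

FUNCTION_URL = "https://<your-lambda-function-url>"

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",             # the registered function name
        "arguments": {"city": "Tokyo"},    # arguments matching the tool's signature
    },
}

resp = requests.post(
    f"{FUNCTION_URL}/mcp",
    json=payload,
    headers={"Accept": "application/json, text/event-stream"},
)
print(resp.status_code, resp.text)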
You can follow along on GitHub here for the full project.
Stateless tools are useful for demos, but most real applications need to persist data. In this section, we'll extend the minimal MCP server to include state. Specifically, we'll build a basic Slack-like message board that stores and retrieves messages from a relational database.
This version uses a Postgres database on RDS, accessed with psycopg2, alongside the same Lambda deployment as before.
The goal is not to build a full chat system, just to show how you can add state to an MCP server without giving up stateless infrastructure like Lambda.
We'll store each message as a row in a single table. For simplicity, all messages go into the same global timeline.
The schema will look like:
CREATE TABLE messages (
id SERIAL PRIMARY KEY,
username TEXT NOT NULL,
text TEXT NOT NULL,
timestamp TIMESTAMP DEFAULT now()
);
post_message() will insert into this table, and get_messages() will return the most recent entries.
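The walkthrough doesn't show where this table gets created. If you need to bootstrap it yourself, a minimal one-off script (a sketch, assuming the same DB_* environment variables the lifespan handler below uses) could look like this:

# One-off bootstrap sketch: create the messages table if it doesn't already exist.
import os
import psycopg2

conn = psycopg2.connect(
    host=os.environ["DB_HOST"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASS"],
    dbname=os.environ["DB_NAME"],
)
# "with conn" commits the transaction on success.
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id SERIAL PRIMARY KEY,
            username TEXT NOT NULL,
            text TEXT NOT NULL,
            timestamp TIMESTAMP DEFAULT now()
        )
    """)
conn.close()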
You shouldn't open database connections inside your tool functions. Instead, MCPEngine provides a context system: you define a setup function that runs before the server boots up, and MCPEngine makes the result available as ctx.
In this case, the context will open a connection to Postgres and expose it to every tool call (as ctx.db). This keeps your tools focused on business logic, not lifecycle management.
Assuming ctx.db is a valid psycopg2 connection, the tools look like this:
@engine.tool()
def post_message(ctx: Context, username: str, text: str) -> str:
    """Post a message to the global timeline."""
    with ctx.db.cursor() as cur:
        cur.execute(
            "INSERT INTO messages (username, text) VALUES (%s, %s)",
            (username, text),
        )
    ctx.db.commit()
    return "Message posted."

@engine.tool()
def get_messages(ctx: Context) -> list[str]:
    """Get the most recent messages."""
    with ctx.db.cursor() as cur:
        cur.execute("SELECT username, text FROM messages ORDER BY timestamp DESC LIMIT 10")
        return [f"{row[0]}: {row[1]}" for row in cur.fetchall()]
Add the context handler:
import os
from contextlib import asynccontextmanager

import psycopg2

@asynccontextmanager
async def app_lifespan():
    # Runs once as the server boots: open the connection, hand it to MCPEngine,
    # and close it again on shutdown.
    conn = psycopg2.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASS"],
        dbname=os.environ["DB_NAME"],
    )
    try:
        yield {"db": conn}
    finally:
        conn.close()
You then pass this lifespan context builder to the MCPEngine constructor:

engine = MCPEngine(
    lifespan=app_lifespan,
)
MCPEngine will run the lifespan to open the connection as the server boots up, and will attach it as context to every incoming request. When the server shuts down, it will run the cleanup (everything after the yield statement).
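One practical caveat, not covered in the original setup: a warm Lambda container can hold the same connection across many invocations, and RDS may drop idle connections in the meantime. A common guard is to check the connection before using it and reopen it if it has gone away. The helper below is a sketch of that pattern, not an MCPEngine API:

# Sketch of a lazy-reconnect guard for long-lived Lambda containers.
import os
import psycopg2

def connect_db():
    return psycopg2.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASS"],
        dbname=os.environ["DB_NAME"],
    )

def ensure_open(conn):
    """Return a usable connection, reopening it if the old one has died."""
    if conn is None or conn.closed:
        return connect_db()
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")  # cheap liveness check
        return conn
    except psycopg2.OperationalError:
        return connect_db()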
We recommend using Terraform here, since this version involves provisioning an RDS instance, IAM roles, and security groups. If you prefer to deploy manually, you can use the Terraform script as a reference.
terraform apply
This will provision the ECR repository and Lambda function as before, plus an RDS Postgres instance and the IAM roles and security groups the Lambda needs to reach it.
Grab the ECR repository URL and Lambda function name from the Terraform output:
export REPOSITORY_URL=$(terraform output -raw repository_url)
export FUNCTION_NAME=$(terraform output -raw lambda_name)
And then build, tag, and push the image:
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda:latest .
docker tag mcp-lambda ${REPOSITORY_URL}:latest
docker push ${REPOSITORY_URL}:latest
Finally, we’ll update the Lambda with this new image:
aws lambda update-function-code \
--function-name ${FUNCTION_NAME} \
--image-uri ${REPOSITORY_URL}:latest
When you're done, you can tear down the resources with:
terraform destroy
Once deployed, connect Claude again using:
mcpengine proxy <service-name> <your-lambda-function-url> --claude --mode http
Open Claude and you should now see two tools: post_message and get_messages.
You can prompt Claude to send or retrieve messages. You can also connect from another Claude window, use the same tools, and confirm the messages are shared — even across users and cold starts.
You can follow along on GitHub here for the full project.
The tools we've built so far work, but they're open. Anyone can call them, impersonate any username, and there's no mechanism for verifying identity. That might be fine for testing, but it's not acceptable in anything that resembles a production system.
MCPEngine supports token-based authentication using standard OpenID Connect (OIDC). That means you can integrate with any identity provider that issues JWTs, including Google, AWS Cognito, Auth0, or your internal auth stack.
In this section, we'll secure our existing tools using Google as the identity provider. We'll configure MCPEngine to validate Google-issued ID tokens, protect the message board tools with @engine.auth(), and connect Claude with the OAuth client credentials so it can obtain tokens on the user's behalf.
First, set up an OAuth client in Google Cloud and take note of the Client ID (and client secret) it gives you.
That's all you need for server-side token validation — the Client ID and the standard Google issuer (https://accounts.google.com).
To enable auth, you'll need to make two changes. First, pass an idp_config when constructing the engine:

from mcpengine import MCPEngine, GoogleIdpConfig

engine = MCPEngine(
    lifespan=app_lifespan,
    idp_config=GoogleIdpConfig(),
)
This tells MCPEngine to use Google's public JWKS endpoint to verify incoming tokens.
Second, decorate any tool you want to protect with @engine.auth():

@engine.auth()
@engine.tool()
def post_message(text: str, ctx: Context) -> str:
    """Post a message to the global timeline."""
    # Only runs if the token is valid
    ...
If the request doesn't include a valid token, it will be rejected automatically. If it does, user info will be available through the context.
When calling a protected tool from a client, you need to pass a valid Google-issued ID token. Claude handles this automatically once it sees that the tool requires authentication.
When you install the tool in Claude, add the client ID and client secret:
mcpengine proxy <service-name> <your-endpoint> --claude --client-id <google-client-id> --client-secret <google-client-secret> --mode http
This tells Claude to request a token from Google using your registered client ID. When the user grants permission, Claude includes that token in every call to your MCP server.
You don't need to verify anything manually; MCPEngine handles token validation and decoding internally.
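For intuition about what that validation involves, the sketch below does roughly the same checks by hand with the PyJWT library: fetch Google's public signing keys (JWKS), then verify the token's signature, expiry, audience, and issuer. MCPEngine's actual implementation may differ; this is only to show what gets verified.

# Roughly what validating a Google-issued ID token involves (MCPEngine does this for you).
import jwt  # pip install "pyjwt[crypto]"

GOOGLE_JWKS_URL = "https://www.googleapis.com/oauth2/v3/certs"
CLIENT_ID = "<google-client-id>"

def validate_google_token(token: str) -> dict:
    # Fetch Google's current public keys and select the one that signed this token.
    signing_key = jwt.PyJWKClient(GOOGLE_JWKS_URL).get_signing_key_from_jwt(token)
    # Verify signature, expiry, audience (your client ID), and issuer in one call.
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=CLIENT_ID,
        issuer="https://accounts.google.com",
    )

# The decoded claims identify the user, e.g. claims["sub"] and claims["email"].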
You don't need to change the Dockerfile or tool definitions. Just rebuild the image, push it to ECR, and point the Lambda at the new tag:
docker build --platform=linux/amd64 --provenance=false -t mcp-lambda:latest .
docker tag mcp-lambda ${REPOSITORY_URL}:latest
docker push ${REPOSITORY_URL}:latest
aws lambda update-function-code \
  --function-name ${FUNCTION_NAME} \
  --image-uri ${REPOSITORY_URL}:latest
Once deployed, requests without a valid token are rejected automatically, and authenticated requests carry the caller's identity, which MCPEngine makes available to your tools through ctx.
With two changes — adding idp_config to the engine and decorating tools with @engine.auth() — we've added working authentication to our MCP server. Google handles the user login. Claude handles the token flow. MCPEngine handles the verification and exposes identity to your tool code.
At this point, we've deployed three working MCP servers on AWS Lambda: a stateless weather tool, a stateful message board backed by Postgres on RDS, and an authenticated version of the message board secured with Google OIDC.
The authenticated example is closer to a real-world use case. It's minimal, but demonstrates that you can build something stateful, Lambda-native, and MCP-compliant, without ever running a server or maintaining sticky connections.
We used Claude as the client here, but the interface is fully standard MCP. You can just as easily connect using the MCPEngine client from another LLM or orchestrator. This opens the door to agentic systems. For example, several agents could post to and read from the message board as a shared workspace, with the same authentication guarding every call.
None of this requires any special integrations. Just tools, schemas, and tokens.
As of today, MCPEngine is the only Python implementation of MCP that supports built-in authentication. In the next post, we'll walk through more complex authentication patterns including scoped access, restricting tools to specific users, and surfacing identity inside the tool logic.