# Going to Production
Your Expert works in development. Now you want to deploy it for real users.
The concern: AI agents have larger attack surfaces than typical applications. They make decisions, use tools, and interact with external systems. How do you deploy them safely?
Perstack’s approach: The runtime is designed to run in isolated containers with full observability. Instead of restricting what the Expert can do, you contain the impact at the infrastructure level.
## The deployment model

```
┌─────────────────────────────────────────────────┐
│ Container (your sandbox)                        │
│  ┌─────────────────────────────────────────┐    │
│  │ Perstack Runtime                        │    │
│  │  ├── Expert execution                   │    │
│  │  ├── MCP skill servers                  │    │
│  │  └── Workspace (mounted volume)         │    │
│  └─────────────────────────────────────────┘    │
│                     │                           │
│                     ▼                           │
│           JSON events → stdout                  │
└─────────────────────────────────────────────────┘
                      │
                      ▼
         Your application / orchestrator
```

One container = one Expert execution. The container:
- Has no persistent state (stateless)
- Writes execution events to stdout (observable)
- Has a mounted workspace for file I/O (controlled)
- Terminates when the Expert completes (isolated)
## Optimizing startup with lockfiles

By default, Perstack initializes MCP skills at runtime to discover their tools. This adds latency (500ms to 6s per skill). For production, pre-collect the tool definitions:

```shell
# Generate lockfile with all tool definitions
perstack install --config perstack.toml
```

This creates `perstack.lock`, containing all expert and tool definitions. When the lockfile exists:
- The runtime loads tool schemas instantly (no MCP initialization delay)
- LLM inference starts immediately
- MCP connections are only opened when tools are actually called
Workflow:
```shell
# Development: iterate without lockfile
perstack start my-expert "test query"

# Before deployment: generate lockfile
perstack install

# Production: instant startup
perstack run my-expert "query"
```

Run `perstack install` again whenever you add or modify skills.
## Basic Docker setup

```dockerfile
FROM node:22-slim
RUN npm install -g perstack
COPY perstack.toml /app/perstack.toml
COPY perstack.lock /app/perstack.lock
WORKDIR /workspace
ENTRYPOINT ["perstack", "run", "--config", "/app/perstack.toml"]
```

```shell
# Generate lockfile before building
perstack install

docker build -t my-expert .
docker run --rm \
  -e ANTHROPIC_API_KEY \
  -v $(pwd)/workspace:/workspace \
  my-expert "my-expert" "User query here"
```

The ENTRYPOINT fixes the Expert; callers just pass the query.
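When invoking the container from application code, it helps to build the `docker run` argument list programmatically rather than concatenating a shell string. A minimal sketch in Python — the `build_run_cmd` helper and its defaults are illustrative, not part of Perstack:

```python
import subprocess

def build_run_cmd(image, expert, query, workspace, env_vars=("ANTHROPIC_API_KEY",)):
    """Build the `docker run` argument list for one Expert execution."""
    cmd = ["docker", "run", "--rm"]
    for var in env_vars:
        cmd += ["-e", var]                        # forward only the named variables
    cmd += ["-v", f"{workspace}:/workspace"]      # mount the workspace for file I/O
    cmd += [image, expert, query]
    return cmd

# Same invocation as the shell command above
cmd = build_run_cmd("my-expert", "my-expert", "User query here", "./workspace")
# subprocess.run(cmd, check=True)  # uncomment to actually launch the container
```

Building an argument list (not a shell string) also sidesteps quoting issues when the query contains spaces or special characters.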
## Per-user workspaces

For multi-user applications, mount per-user workspaces:

```shell
# User Alice
docker run --rm \
  -e ANTHROPIC_API_KEY \
  -v /data/users/alice:/workspace \
  my-expert "assistant" "What's on my schedule today?"

# User Bob
docker run --rm \
  -e ANTHROPIC_API_KEY \
  -v /data/users/bob:/workspace \
  my-expert "assistant" "What's on my schedule today?"
```

Each user gets isolated state. The Expert reads from and writes to their workspace only.
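Because the user ID becomes part of a mount path, it is worth validating it before building the command: an ID like `../etc` must never escape the workspace root. A sketch — the validation rule and helper name are assumptions, not something Perstack enforces:

```python
import re

WORKSPACE_ROOT = "/data/users"  # layout assumed from the examples above

def user_workspace(user_id):
    """Map a user ID to an isolated workspace path, rejecting unsafe IDs."""
    if not re.fullmatch(r"[a-z0-9_-]+", user_id):
        raise ValueError(f"unsafe user id: {user_id!r}")
    return f"{WORKSPACE_ROOT}/{user_id}"

print(user_workspace("alice"))  # /data/users/alice
```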
## Observability in production

`perstack run` outputs JSON events to stdout. Each line is a structured event:

```json
{"type":"startRun","timestamp":1705312200000,"runId":"abc123",...}
{"type":"startGeneration","timestamp":1705312201000,...}
{"type":"callTools","timestamp":1705312202000,...}
{"type":"completeRun","timestamp":1705312203000,...}
```

Pipe these to your logging system:

```shell
docker run --rm ... my-expert "assistant" "query" | your-log-collector
```

You get full execution traces without any instrumentation code.
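A collector on the other end of that pipe just reads stdout line by line and decodes each line as JSON. A minimal sketch — the event `type` names match the examples above; the helper itself is illustrative:

```python
import json

def parse_events(lines):
    """Decode newline-delimited JSON events, skipping non-event output."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON lines
    return events

sample = [
    '{"type":"startRun","timestamp":1705312200000,"runId":"abc123"}',
    '{"type":"completeRun","timestamp":1705312203000}',
]
assert [e["type"] for e in parse_events(sample)] == ["startRun", "completeRun"]
```

In production the `lines` iterable would be the container process's stdout stream rather than a fixed list.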
Events are also written to `workspace/perstack/` as checkpoints, so you can replay any execution for debugging or auditing.
## Isolation model

Traditional security approach: restrict what the agent can do by limiting tools, filtering outputs, and adding guardrails inside the agent.
Problem: This creates an arms race. The agent tries to be helpful; the restrictions try to prevent misuse. Complex, brittle, never complete.
Perstack’s approach: Let the Expert do its job. Contain the impact at the infrastructure level.
| Aspect | Traditional | Container isolation |
|---|---|---|
| Tool access | Restricted, filtered | Full access within container |
| Output handling | Content filtering | Events to stdout, you decide |
| Failure mode | Agent fights guardrails | Container terminates |
| Audit | Logs + hope | Complete event stream |
The Expert operates freely within its container. Your infrastructure controls what the container can affect.
## Production checklist

Container isolation:
- Each execution runs in a fresh container
- No shared state between containers
- Network access controlled (if the agent shouldn't reach the internet, don't allow egress)
Secrets management:
- API keys passed via environment variables
- Only required variables passed to container
- No secrets in Expert definitions or workspace
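"Only required variables" can be made explicit in the launcher: build the `-e` flags from an allowlist instead of forwarding the whole environment. A sketch — the helper name and allowlist are assumptions:

```python
def env_flags(allowed, environ):
    """Return docker `-e` flags for allowlisted variables that are actually set."""
    flags = []
    for name in allowed:
        if name in environ:
            flags += ["-e", name]  # pass the name only; docker reads the value itself
    return flags

flags = env_flags(["ANTHROPIC_API_KEY", "OPENAI_API_KEY"],
                  {"ANTHROPIC_API_KEY": "sk-...", "HOME": "/root"})
assert flags == ["-e", "ANTHROPIC_API_KEY"]  # HOME is never forwarded
```

Passing `-e NAME` without a value tells docker to copy the value from the host environment, so the secret never appears in the command line or process listing.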
Observability:
- JSON events collected and indexed
- Checkpoints retained for replay
- Alerts on execution failures
Resource limits:
- Container memory limits set
- Execution time limits (via container timeout)
- Workspace size limits
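The memory and time limits can be enforced at launch: docker's `--memory` flag caps container memory, and a subprocess timeout bounds wall-clock execution time. A sketch — the specific limits and helper name are illustrative:

```python
import subprocess

def limited_run_cmd(image, expert, query, memory="512m"):
    """docker run command with a hard memory cap."""
    return ["docker", "run", "--rm", f"--memory={memory}", image, expert, query]

cmd = limited_run_cmd("my-expert", "assistant", "query")
assert "--memory=512m" in cmd
# Enforce the time limit from the caller side:
# subprocess.run(cmd, timeout=300)  # raises TimeoutExpired after 5 minutes
```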
## Scaling patterns

Job queue: push queries to a queue; workers pull and execute them in containers.

```
Queue → Worker → Container → Expert → Events → Your system
```

Serverless: run containers on demand (AWS Lambda, Cloud Run, etc.).
Kubernetes: Use Jobs for batch, Deployments for persistent workers.
The stateless container model fits all of these patterns.
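The job-queue pattern can be sketched with a thread-safe queue: workers pull `(expert, query)` jobs and hand each one to a container runner. Here the runner is stubbed out; in production it would invoke `docker run` (all names are illustrative):

```python
import queue
import threading

def worker(jobs, run_container, results):
    """Pull jobs until the queue is drained, one container per job."""
    while True:
        try:
            expert, q = jobs.get_nowait()
        except queue.Empty:
            return
        results.append(run_container(expert, q))
        jobs.task_done()

jobs = queue.Queue()
for q in ["query one", "query two"]:
    jobs.put(("assistant", q))

results = []
stub = lambda expert, q: f"{expert}:{q}"  # stand-in for a docker invocation
threads = [threading.Thread(target=worker, args=(jobs, stub, results)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sorted(results) == ["assistant:query one", "assistant:query two"]
```

Because containers are stateless, workers need no coordination beyond the queue itself.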
## What's next

- Sandbox Integration — Deep dive on the security model
- Operating Experts — Monitoring, maintenance, and operations
- Isolation by Design — How isolation is enforced