
# Providers and Models

Perstack supports multiple LLM providers. You can configure a provider via CLI options, environment variables, or `perstack.toml`.

When no model is specified, Perstack resolves it automatically using a tier-based system:

  1. **Expert's `defaultModelTier`** — if the expert definition sets a `defaultModelTier` (e.g. `high`, `middle`, `low`), the runtime picks the corresponding model from the provider.
  2. **Provider's middle tier** — if no tier is set, the runtime falls back to the provider's `middle` tier (e.g. `claude-sonnet-4-5` for Anthropic).

This ensures cost-efficient defaults while letting experts request more capable models when needed. You can always override the resolved model with `--model` on the CLI or `model =` in `perstack.toml`.
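For example, an expert that benefits from a more capable model can request the `high` tier. This is a minimal sketch: the `[experts.reviewer]` table name and file layout are illustrative assumptions, and only the `defaultModelTier` field comes from the resolution rules above — adapt the surrounding structure to your actual expert definition.

```toml
# Hypothetical expert definition; the table name is illustrative.
[experts.reviewer]
# Ask the runtime to resolve the provider's "high" tier model
# instead of the default "middle" tier.
defaultModelTier = "high"
```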

To override the default, specify it in `perstack.toml`:

```toml
model = "gemini-2.5-pro"

[provider]
providerName = "google"
```

Or via the CLI:

```shell
npx perstack run my-expert "query" --provider google --model gemini-2.5-pro
```
Supported providers:

| Provider | Key | Description |
| --- | --- | --- |
| Anthropic | `anthropic` | Claude models |
| Google | `google` | Gemini models |
| OpenAI | `openai` | GPT and reasoning models |
| Fireworks | `fireworks` | Open-weight models (Kimi, DeepSeek) |
| DeepSeek | `deepseek` | DeepSeek models |
| Ollama | `ollama` | Local model hosting |
| Azure OpenAI | `azure-openai` | Azure-hosted OpenAI models |
| Amazon Bedrock | `amazon-bedrock` | AWS Bedrock-hosted models |
| Google Vertex AI | `google-vertex` | Google Cloud Vertex AI models |

## Anthropic

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | Yes | API key |
| `ANTHROPIC_BASE_URL` | No | Custom endpoint |

perstack.toml settings:

```toml
[provider]
providerName = "anthropic"

[provider.setting]
baseUrl = "https://custom-endpoint.example.com" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `baseUrl` | string | Custom API endpoint |
| `headers` | object | Custom HTTP headers |

Native reasoning: Supported via extended thinking.

Models:

| Model | Context | Max Output |
| --- | --- | --- |
| `claude-opus-4-5` | 200K | 32K |
| `claude-opus-4-1` | 200K | 32K |
| `claude-opus-4-20250514` | 200K | 32K |
| `claude-sonnet-4-5` | 200K | 64K |
| `claude-sonnet-4-20250514` | 200K | 64K |
| `claude-3-7-sonnet-20250219` | 200K | 64K |
| `claude-haiku-4-5` | 200K | 8K |
| `claude-3-5-haiku-latest` | 200K | 8K |
Example:

```shell
export ANTHROPIC_API_KEY=sk-ant-...
npx perstack run my-expert "query" --provider anthropic --model claude-sonnet-4-5
```

## Google

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `GOOGLE_GENERATIVE_AI_API_KEY` | Yes | API key |
| `GOOGLE_GENERATIVE_AI_BASE_URL` | No | Custom endpoint |

perstack.toml settings:

```toml
[provider]
providerName = "google"

[provider.setting]
baseUrl = "https://custom-endpoint.example.com" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `baseUrl` | string | Custom API endpoint |
| `headers` | object | Custom HTTP headers |

Models:

| Model | Context | Max Output |
| --- | --- | --- |
| `gemini-3-pro-preview` | 1M | 64K |
| `gemini-2.5-pro` | 1M | 64K |
| `gemini-2.5-flash` | 1M | 64K |
| `gemini-2.5-flash-lite` | 1M | 64K |
Example:

```shell
export GOOGLE_GENERATIVE_AI_API_KEY=AIza...
npx perstack run my-expert "query" --provider google --model gemini-2.5-pro
```

## OpenAI

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `OPENAI_API_KEY` | Yes | API key |
| `OPENAI_BASE_URL` | No | Custom endpoint (OpenAI-compatible) |
| `OPENAI_ORGANIZATION` | No | Organization ID |
| `OPENAI_PROJECT` | No | Project ID |

perstack.toml settings:

```toml
[provider]
providerName = "openai"

[provider.setting]
baseUrl = "https://custom-endpoint.example.com" # Optional
organization = "org-xxx" # Optional
project = "proj-xxx" # Optional
name = "custom-openai" # Optional: custom provider name
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `baseUrl` | string | Custom API endpoint |
| `organization` | string | OpenAI organization ID |
| `project` | string | OpenAI project ID |
| `name` | string | Custom provider name |
| `headers` | object | Custom HTTP headers |

Native reasoning: Supported via `reasoningEffort`. Works with o-series models (o1, o3, o4-mini).

Models:

| Model | Context | Max Output |
| --- | --- | --- |
| `gpt-5` | 400K | 128K |
| `gpt-5-mini` | 400K | 128K |
| `gpt-5-nano` | 400K | 128K |
| `gpt-5-chat-latest` | 128K | 16K |
| `o4-mini` | 200K | 100K |
| `o3` | 200K | 10K |
| `o3-mini` | 200K | 10K |
| `gpt-4.1` | 1M | 32K |
Example:

```shell
export OPENAI_API_KEY=sk-proj-...
npx perstack run my-expert "query" --provider openai --model gpt-5
```

## DeepSeek

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `DEEPSEEK_API_KEY` | Yes | API key |
| `DEEPSEEK_BASE_URL` | No | Custom endpoint |

perstack.toml settings:

```toml
[provider]
providerName = "deepseek"

[provider.setting]
baseUrl = "https://custom-endpoint.example.com" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `baseUrl` | string | Custom API endpoint |
| `headers` | object | Custom HTTP headers |

Models:

| Model | Context | Max Output |
| --- | --- | --- |
| `deepseek-chat` | 128K | 8K |
| `deepseek-reasoner` | 128K | 64K |
Example:

```shell
export DEEPSEEK_API_KEY=sk-...
npx perstack run my-expert "query" --provider deepseek --model deepseek-chat
```

## Fireworks

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `FIREWORKS_API_KEY` | Yes | API key |
| `FIREWORKS_BASE_URL` | No | Custom endpoint |

perstack.toml settings:

```toml
[provider]
providerName = "fireworks"

[provider.setting]
baseUrl = "https://custom-endpoint.example.com" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `baseUrl` | string | Custom API endpoint |
| `headers` | object | Custom HTTP headers |

Models:

| Model | Context | Max Output |
| --- | --- | --- |
| `accounts/fireworks/models/kimi-k2p5` | 262K | 262K |
| `accounts/fireworks/models/deepseek-v3p2` | 164K | 164K |
| `accounts/fireworks/models/glm-5` | 203K | 203K |
Example:

```shell
export FIREWORKS_API_KEY=fw_...
npx perstack run my-expert "query" --provider fireworks --model accounts/fireworks/models/kimi-k2p5
```

## Ollama

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `OLLAMA_BASE_URL` | No | Server URL (default: `http://localhost:11434`) |

perstack.toml settings:

```toml
[provider]
providerName = "ollama"

[provider.setting]
baseUrl = "http://localhost:11434" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `baseUrl` | string | Ollama server URL |
| `headers` | object | Custom HTTP headers |

Models:

| Model | Context | Max Output |
| --- | --- | --- |
| `gpt-oss:20b` | 128K | 128K |
| `gpt-oss:120b` | 128K | 128K |
| `gemma3:1b` | 32K | 32K |
| `gemma3:4b` | 128K | 128K |
| `gemma3:12b` | 128K | 128K |
| `gemma3:27b` | 128K | 128K |
Example:

```shell
export OLLAMA_BASE_URL=http://localhost:11434
npx perstack run my-expert "query" --provider ollama --model gpt-oss:20b
```

## Azure OpenAI

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `AZURE_API_KEY` | Yes | API key |
| `AZURE_RESOURCE_NAME` | Yes | Resource name |
| `AZURE_API_VERSION` | No | API version |
| `AZURE_BASE_URL` | No | Custom endpoint |

perstack.toml settings:

```toml
[provider]
providerName = "azure-openai"

[provider.setting]
resourceName = "your-resource-name" # Optional (env fallback)
apiVersion = "2024-02-15-preview" # Optional
baseUrl = "https://custom-endpoint.example.com" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
useDeploymentBasedUrls = true # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `resourceName` | string | Azure resource name |
| `apiVersion` | string | Azure API version |
| `baseUrl` | string | Custom API endpoint |
| `headers` | object | Custom HTTP headers |
| `useDeploymentBasedUrls` | boolean | Use deployment-based URLs |
Example:

```shell
export AZURE_API_KEY=your_azure_key
export AZURE_RESOURCE_NAME=your_resource_name
npx perstack run my-expert "query" --provider azure-openai --model your-deployment-name
```

## Amazon Bedrock

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `AWS_ACCESS_KEY_ID` | Yes | Access key ID |
| `AWS_SECRET_ACCESS_KEY` | Yes | Secret access key |
| `AWS_REGION` | Yes | Region (e.g., `us-east-1`) |
| `AWS_SESSION_TOKEN` | No | Session token (temporary credentials) |

perstack.toml settings:

```toml
[provider]
providerName = "amazon-bedrock"

[provider.setting]
region = "us-east-1" # Optional (env fallback)
```

| Setting | Type | Description |
| --- | --- | --- |
| `region` | string | AWS region |
Example:

```shell
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1
npx perstack run my-expert "query" --provider amazon-bedrock --model anthropic.claude-v2
```

## Google Vertex AI

Environment variables:

| Variable | Required | Description |
| --- | --- | --- |
| `GOOGLE_VERTEX_PROJECT` | No | GCP project ID |
| `GOOGLE_VERTEX_LOCATION` | No | GCP location (e.g., `us-central1`) |
| `GOOGLE_VERTEX_BASE_URL` | No | Custom endpoint |

perstack.toml settings:

```toml
[provider]
providerName = "google-vertex"

[provider.setting]
project = "my-gcp-project" # Optional
location = "us-central1" # Optional
baseUrl = "https://custom-endpoint.example.com" # Optional
headers = { "X-Custom-Header" = "value" } # Optional
```

| Setting | Type | Description |
| --- | --- | --- |
| `project` | string | GCP project ID |
| `location` | string | GCP location |
| `baseUrl` | string | Custom API endpoint |
| `headers` | object | Custom HTTP headers |
Example:

```shell
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
npx perstack run my-expert "query" --provider google-vertex --model gemini-1.5-pro
```