Ollama: Run PocketPaw with Free Local Models
Run PocketPaw entirely on your machine with Ollama — no API keys, no cloud, no costs. The Claude Agent SDK and OpenAI Agents backends work with Ollama out of the box.
How It Works
Since Ollama v0.14.0, Ollama exposes an Anthropic Messages API compatible endpoint. PocketPaw points the same AsyncAnthropic client (or Claude SDK subprocess) at your local Ollama server instead of Anthropic’s cloud. Same tool format, same streaming, zero format conversion.
Prerequisites
Install Ollama
curl -fsSL https://ollama.com/install.sh | shPull a model
ollama pull qwen2.5:7b # Good balance of speed and quality# orollama pull llama3.2 # Default modelStart Ollama
ollama serveQuick Start
export POCKETPAW_LLM_PROVIDER=ollamaexport POCKETPAW_OLLAMA_MODEL=qwen2.5:7bpocketpawEdit ~/.pocketpaw/config.json:
{ "llm_provider": "ollama", "ollama_host": "http://localhost:11434", "ollama_model": "qwen2.5:7b"}Open the web dashboard, go to Settings → General:
- Set LLM Provider to Ollama
- Set Ollama Host (defaults to
http://localhost:11434) - Set Ollama Model to the model you pulled (e.g.,
qwen2.5:7b,deepseek-r1:8b)
The Ollama Host and Ollama Model fields only appear when LLM Provider is set to Ollama. Make sure to set the model name to match what you have installed — run ollama list to check.
Verify Setup
Run the built-in connectivity check:
pocketpaw --check-ollamaThis performs 4 checks:
| Check | What it tests |
|---|---|
| Server reachable | Pings {ollama_host}/api/tags |
| Model available | Verifies configured model is pulled locally |
| Messages API | Tests Anthropic-compatible completion endpoint |
| Tool calling | Sends a dummy tool and checks the model uses it |
Configuration
| Setting | Env Var | Default | Description |
|---|---|---|---|
llm_provider | POCKETPAW_LLM_PROVIDER | "auto" | Set to "ollama" for explicit Ollama usage |
ollama_host | POCKETPAW_OLLAMA_HOST | "http://localhost:11434" | Ollama server URL |
ollama_model | POCKETPAW_OLLAMA_MODEL | "llama3.2" | Model to use |
Auto-Detection
When llm_provider is "auto" (the default):
- If
anthropic_api_keyis set → uses Anthropic - If no API key is set → falls back to Ollama automatically
This means if you install PocketPaw and Ollama without any API keys, it just works.
Compatible Backends
| Backend | Ollama Support | How |
|---|---|---|
| Claude Agent SDK | Yes | Sets ANTHROPIC_BASE_URL env var for the SDK subprocess |
| OpenAI Agents | Yes | Wraps model in OpenAIChatCompletionsModel via Ollama’s OpenAI-compatible API |
Claude Agent SDK + Ollama
The default backend. The SDK subprocess receives these environment variables:
ANTHROPIC_BASE_URL→ your Ollama hostANTHROPIC_API_KEY→"ollama"(accepted but not validated)
All SDK built-in tools (Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch) work as usual.
OpenAI Agents + Ollama
Set the provider to ollama and the model will be wrapped in OpenAIChatCompletionsModel from the OpenAI Agents SDK, communicating with Ollama’s OpenAI-compatible endpoint:
export POCKETPAW_AGENT_BACKEND="openai_agents"export POCKETPAW_OPENAI_AGENTS_PROVIDER="ollama"export POCKETPAW_OLLAMA_MODEL="qwen2.5:7b"Recommended Models
| Model | Size | Tool Calling | Notes |
|---|---|---|---|
qwen2.5:7b | 4.7 GB | Good | Best balance for most users |
qwen2.5:14b | 9 GB | Better | More reliable tool use |
llama3.2 | 2 GB | Fair | Fast, lightweight |
mistral:7b | 4.1 GB | Good | Strong reasoning |
deepseek-r1:8b | 4.9 GB | Good | Strong at coding tasks |
Limitations
- Smart Model Router is skipped — When using Ollama, the Model Router cannot switch between models. Smart routing is automatically disabled.
- Tool calling quality varies — Smaller models may not use tools reliably. If tools aren’t being called, try a larger model.
- Ollama v0.14.0+ required — Older versions don’t expose the Anthropic Messages API endpoint.
Error Messages
PocketPaw provides Ollama-specific error messages instead of generic API errors:
| Error | Meaning | Fix |
|---|---|---|
| Model ‘X’ not found in Ollama | The configured model isn’t pulled locally | Run ollama pull <model> or change the model in Settings → General → Ollama Model |
| Ollama error: connection refused | Ollama server isn’t running | Run ollama serve |
| Cannot connect to Ollama | Wrong host or Ollama is down | Check Ollama Host in Settings matches where Ollama is running |
Troubleshooting
”Model not found in Ollama”
This means PocketPaw is trying to use a model you haven’t pulled. The default model is llama3.2 — if you use a different model (e.g., deepseek), make sure to update the setting:
# Check what models you haveollama list
# Set the correct modelexport POCKETPAW_OLLAMA_MODEL="deepseek-r1:8b"Or update it in the dashboard: Settings → General → Ollama Model.
”Cannot reach Ollama server"
# Check Ollama is runningollama serve
# Verify it's listeningcurl http://localhost:11434/api/tags"Messages API failed”
Your Ollama version may be too old. Update:
# macOSbrew upgrade ollama
# Linuxcurl -fsSL https://ollama.com/install.sh | sh“Model responded but did not use the tool”
Try a more capable model:
ollama pull qwen2.5:14bThen set ollama_model to qwen2.5:14b in Settings or config.
Implementation
| File | Description |
|---|---|
agents/claude_sdk.py | Ollama env vars passed to SDK subprocess |
agents/openai_agents.py | Ollama via OpenAIChatCompletionsModel wrapper |
agents/router.py | Ollama detection logging in _initialize_agent() |
__main__.py | --check-ollama CLI command |
tests/test_ollama_agent.py | Tests covering Ollama backends |
Related
Claude Agent SDK
The recommended backend — works with Ollama out of the box.
OpenAI Agents SDK
Alternative backend that also supports Ollama for local inference.
Local LLM Agent Guide
Step-by-step guide to running PocketPaw with local models.
Self-Host PocketPaw
Deploy PocketPaw on your own server with no cloud dependencies.