OpenClaw Agent Deployment Guide
Self-hosted multi-agent system with GPU acceleration, Skill extension, and Feishu / Telegram / Discord integration
🚀 Quick Install
One command to spin up OpenClaw via Docker or native Python
5 min setup⚙️ Config Reference
Every field in config.yaml explained with usage examples
Full Reference❌ Troubleshooting
GPU detection, API key errors, port conflicts, agent silence — diagnosed and fixed
Debug Guide💡 Tips & Tricks
Multi-model routing, Memory management, Skill authoring, performance tuning
Pro TipsI. Installation
Requirements: Ubuntu 20.04+ / macOS 12+, Python 3.10+. NVIDIA GPU recommended for best performance but not required — CPU inference works too.
Method A: Docker (Recommended)
Requires docker and nvidia-docker (for GPU). One command to bring up everything.
docker pull openclaw/openclaw:latest docker run -d \ --gpus all \ -p 8080:8080 \ -v ~/openclaw/config.yaml:/app/config.yaml \ -v ~/openclaw/data:/app/data \ --name openclaw \ openclaw/openclaw:latest
Tip: First launch auto-downloads models (a few minutes). Subsequent starts take 10–15s. Check logs with docker logs -f openclaw.
Method B: Native Python
pip install openclaw-agent openclaw init ~/openclaw cd ~/openclaw openclaw start
Note: Native install requires manual model download and dependency handling. Try Docker first.
Verify Installation
After startup, visit http://<your-server-ip>:8080 to see the Web UI.
openclaw status openclaw logs -f curl http://localhost:8080/health
II. config.yaml Reference
| Field | Type | Description |
|---|---|---|
| agent.name | string | Agent identifier for multi-agent collaboration |
| agent.model | string | Default model: gpt-4o, claude-4-sonnet, deepseek-v3, etc. |
| agent.model_map | object | Model aliases per task type: { coding: "deepseek-v3" } |
| agent.proxy_url | string | Proxy/relay URL (e.g. FlowerWolf node). Leave blank for direct. |
| agent.api_key | string | Required. Get from flowerwolf.net/token_en.html |
| gpu.enabled | bool | Enable GPU acceleration. Requires NVIDIA GPU + CUDA. |
| gpu.device | string | GPU device: "0" or "cuda:0" |
| memory.type | string | Storage: sqlite, postgres, or memory |
| memory.session_limit | int | Max messages per session before auto-summarization |
| skills.dir | string | Skill directory, default ./skills |
| skills.autoload | bool | Auto-load all Skills on startup |
| log.level | string | debug / info / warn / error |
Minimal Config
agent: name: my-agent model: gpt-4o api_key: your-flowerwolf-token-here proxy_url: https://api.flowerwolf.net/v1 gpu: enabled: true device: "0" memory: type: sqlite session_limit: 50 log: level: info
III. Troubleshooting
GPU Not Detected / CUDA Error
Run nvidia-smi to confirm GPU is visible. For Docker: make sure the daemon has NVIDIA runtime enabled (/etc/docker/daemon.json → "default-runtime": "nvidia", then sudo systemctl restart docker).
API Key Error / 401 Unauthorized
Verify key spelling (no extra spaces). Check balance at flowerwolf.net/token_en.html. Confirm proxy_url is https://api.flowerwolf.net/v1 (no trailing slash).
Port 8080 Already in Use
Find the blocking process: ss -tlnp | grep 8080, then kill <PID>. Prefer using openclaw stop before restarting.
Agent Silent / Messages Not Received
For Feishu: confirm "Use long connection for events" is enabled (not Webhook URL). For Telegram: verify webhook URL is https://your-domain/telegram/webhook. Check logs: openclaw logs | grep "received".
Extremely Slow / Timeout
Usually GPU OOM → falls back to CPU. Try a smaller model (gpt-4o-mini) or reduce max_tokens. Also check network latency to proxy: ping api.flowerwolf.net.
Skill Not Loading
Files must be in skills.dir, extension .yaml or .py. No Chinese characters or spaces in filenames. Required fields: name, description, action.
IV. Tips & Tricks
Smart Multi-Model Routing
Use model_map to route tasks to the best model: deepseek-v3 for code, gpt-4o for creative writing, claude-4-sonnet for long-form analysis. Saves cost, improves quality.
Memory Management
Set session_limit to auto-summarize long conversations. Manually purge: openclaw memory purge --session <id>. View stats: openclaw memory stats.
Writing Skills
Skills let the Agent call external tools. Drop a .yaml in skills/ with a description and script. The Agent decides when to invoke based on the description — the more specific, the better.
Quantized Models for Small VRAM
On a 8GB GPU, enable INT8 quantization: gpu.quantization: "int8". Reduces model size by 50–75% with typically <3% accuracy loss.
Cron Jobs for Automation
Schedule tasks: cron: { "0 9 * * *": "daily_summary" }. Great for daily reports, hourly health checks, periodic data syncs.
Debug Mode
Set log.level: debug to see every skill trigger decision, full HTTP responses, and memory injection context. Switch back to info when done — it fills up logs fast.
V. FAQ
What's the difference between OpenClaw and calling the API directly?
Direct API calls are single-turn only. OpenClaw adds: multi-turn memory management, Skill invocation, platform integration (Feishu/Telegram/Discord), Cron automation, and debugging tools. Think of it as an "OS layer" around the API.
How many instances on one machine?
Each instance loads one model (3–10GB VRAM). A 4090 24GB can handle 2 instances. CPU mode is limited only by RAM — 16GB can run 3–5 instances.
Which models are supported?
Any OpenAI API-compatible model works: GPT-4o, GPT-4o mini, Claude 4 Sonnet, Claude 3.5 Sonnet, Gemini 2.0 Flash, DeepSeek V3, Qwen Turbo, Doubao, and more via FlowerWolf Token Market.
How to backup?
Backup data/ (SQLite + model cache) and config.yaml regularly. Docker: docker cp openclaw:/app/data ./backup.