Architecture

A production-grade agent with clear separation between channels, inference, tools, and storage. Designed to run on constrained hardware without compromising capability.

Overview

The system has four layers:

Channels — Telegram, Discord, Slack, Matrix, IRC, WhatsApp, XMPP, SMS, email, web chat, ScuttleBot (how users reach the agent)
Core loop — message routing, tool dispatch, memory management
Inference — local model + cloud LLM cascade
Tools + Storage — external integrations and persistent state

All channels run concurrently via the ChannelRunner, sharing the same AgentLoop instance. Each channel implements a simple protocol: start(loop), stop(), send_message(user_id, text).

                    ┌─────────────┐
                    │  Telegram   │
                    │    Bot      │
                    └──────┬──────┘
                           │
┌──────────┐        ┌──────┴──────┐        ┌──────────────┐
│   SMS    ├────────┤  AgentLoop  ├────────┤  Tool        │
│ (Termux) │        │  (core)     │        │  Registry    │
└──────────┘        └──────┬──────┘        └──────┬───────┘
                           │                      │
┌──────────┐        ┌──────┴──────┐        ┌──────┴───────┐
│ Web Chat ├────────┤  Inference  │        │ Calendar     │
│ (public) │        │  Cascade    │        │ Email        │
└──────────┘        └─────────────┘        │ Jira         │
                    local → light → heavy  │ Search       │
                                           │ Deploy       │
                    Security boundary:     │ MCP Gateway  │
                    Web visitors get a     └──────────────┘
                    sandboxed WebAgent
                    with NO tool access

Inference cascade

Messages flow through a three-tier cascade. Each tier is optional — if one isn't configured, it's skipped:

Tier	Default	When it fires	Cost
Local	Phi-3.5 Mini (Q4)	Simple messages, routing decisions	Free (on-device)
Light	Gemini Flash	Medium complexity, longer context	Very cheap
Heavy	Claude Sonnet	Complex reasoning, tool orchestration	Standard API pricing

The local model acts as a router: it classifies message complexity and decides which tier to forward to. Simple greetings stay local. Complex planning tasks go to the heavy tier. This keeps API costs low while maintaining high quality for tasks that need it.

Security model

The security architecture has one critical boundary:

Web visitors interact with a sandboxed WebAgent that has ZERO access to internal systems. It uses the cloud LLM for conversation only. No tools, no memory stores, no file access, no email, no calendar.

The internal AgentLoop — accessed via Telegram and SMS — has full tool access. These channels are authenticated (Telegram user IDs, phone number allowlists).

Defense in depth

Input sanitization — all web inputs are cleaned and length-limited before reaching the LLM
Rate limiting — per-IP chat and form rate limits prevent abuse
Session caps — web sessions expire and message counts are limited
Suspicious input logging — prompt injection attempts are flagged
HTTPS enforcement — Cloudflare Tunnel terminates TLS; HTTP redirects to HTTPS
CSP headers — script-src 'self' blocks all inline scripts
Blessing gate — destructive operations (deploy, Cursor tasks) require Telegram approval

Memory system

Four independent SQLite databases handle different types of persistence:

Store	File	What it holds
Conversation	`conversations.db`	Full chat history per channel. Used for context in subsequent messages.
Structured	`memories.db`	Extracted facts, preferences, relationships. The agent remembers what you told it.
Plans	`plans.db`	Multi-step plans with status tracking. "Plan a trip to Tokyo" creates actionable steps.
Knowledge	`knowledge.db`	Persistent key-value store for reference information. Survives conversation resets.

All stores use async SQLite (via aiosqlite) and initialize lazily inside the event loop. Web visitors have their own ephemeral in-memory sessions — they never touch these databases.

Tool dispatch

Tools register with a central ToolRegistry. Each tool declares its name, description, and parameter schema. The LLM generates [TOOL:name] blocks in its responses, which the agent loop parses and dispatches.

The dispatch flow:

LLM generates a response containing [TOOL:calendar] create meeting tomorrow at 2pm
Agent loop extracts the tool call
Registry looks up the tool by name
Tool executes (API call, DB query, etc.)
Result feeds back into the next LLM turn

Tools can be backed by MCP servers. The MCPGatewayTool wraps any MCP server as a standard tool, with automatic fallback to REST implementations if the MCP connection fails.

Web presence

The built-in web server (Starlette ASGI) provides:

Landing page — configurable via static/index.html with persona placeholders
Chat widget — SSE-based chat with the sandboxed WebAgent
Intake form — collects leads, notifies you on Telegram
Blog engine — zero-dependency markdown-to-HTML with frontmatter parsing
Lead outreach — auto-qualifies leads via LLM and sends branded follow-up emails

Everything runs behind Cloudflare Tunnel for TLS termination, DDoS protection, and edge caching.

Agent loop

The AgentLoop is the central conversation engine. For each incoming message:

Load conversation history from memory
Build the message context (system prompt + history + current message)
Check goal alignment (12-Week-Year integration)
Route to appropriate inference tier
Parse and dispatch any tool calls
Extract structured memories from the conversation
Save conversation and return the response

The loop handles tool call chains — if a tool result triggers another tool call, it continues dispatching until the LLM produces a final response.

Sovereign engine

The autonomous engine handles complex, multi-step tasks that go beyond a single conversation turn. Triggered via /engine in Telegram or engine: <task> in any channel.

It runs in a separate context with its own LLM session, using the heavy cloud tier. The engine can:

Break tasks into steps and execute them sequentially
Use all registered tools
Access structured memory and plans for context
Report progress back through the channel

Destructive operations go through the BlessingGate — a human-in-the-loop approval system that sends a Telegram message asking for confirmation before proceeding.

MCP gateway

The agent acts as an MCP client. MCP servers are launched on-demand (lazy initialization) and communicate over stdio. The MCPGatewayTool wraps each server as a standard tool:

Server starts on first tool use, not at boot
Automatic reconnection on failure
Optional REST fallback tools (e.g., Jira REST if Atlassian MCP fails)
Environment variables auto-mapped from config sections

Project structure

src/palmtop/
├── __main__.py          # Entry point
├── persona.py           # Persona config → system prompts
├── brand.py             # HTML email template (persona-driven)
├── config/settings.py   # Config loader (TOML → dataclasses)
├── core/
│   ├── loop.py          # AgentLoop — main conversation engine
│   ├── engine.py        # Sovereign engine (autonomous tasks)
│   ├── blessing.py      # Human-in-the-loop approval gate
│   ├── goal_aligner.py  # 12-Week-Year goal alignment
│   ├── monitor.py       # Proactive monitoring
│   └── tracing.py       # Observability (SQLite / Langfuse)
├── inference/
│   ├── local.py         # llama.cpp backend
│   └── cloud.py         # Anthropic / Google / OpenAI backends
├── channels/
│   ├── telegram.py      # Telegram bot
│   ├── sms.py           # Termux SMS
│   └── sms_listener.py  # Dual-channel SMS listener
├── tools/               # Calendar, email, Jira, search, deploy...
├── memory/              # Conversation, structured, plans
├── knowledge/           # SQLite knowledge base
├── mcp/                 # MCP client, server, gateway
├── voice/               # STT + TTS
├── cursor/              # Cursor Cloud Agents bridge
└── web/
    ├── app.py           # Starlette ASGI server
    ├── agent.py         # Sandboxed WebAgent
    ├── blog.py          # Blog engine
    ├── outreach.py      # Lead qualification + auto-email
    └── static/          # Landing page, CSS, JS, blog posts