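Session Storage

Hermes Agent uses a SQLite database (~/.hermes/state.db) to persist session metadata, full message history, and model configuration across CLI and gateway sessions. This replaces the earlier approach of one JSONL file per session.

Source file: hermes_state.py

Architecture Overview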

~/.hermes/state.db (SQLite, WAL mode)
├── sessions — Session metadata, token counts, billing
├── messages — Full message history per session
├── messages_fts — FTS5 virtual table for full-text search
└── schema_version — Single-row table tracking migration state

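Key design decisions:

  • WAL mode for concurrent readers plus a single writer (the gateway serves multiple platforms)
  • FTS5 virtual table for fast text search across all session messages
  • Session lineage via parent_session_id chains (splits triggered by compression)
  • Source tagging (cli, telegram, discord, etc.) for platform filtering
  • Batch runner and RL trajectories are NOT stored here (separate systems)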
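
The WAL decision can be illustrated with Python's standard sqlite3 module. This is a minimal sketch, not the code in hermes_state.py: the scratch path, the table t, and the helper count_rows are made up for the example. It shows a second connection reading while the first one holds an uncommitted write.

```python
import os
import sqlite3
import tempfile

# Hypothetical scratch database; the real file lives at ~/.hermes/state.db.
db_path = os.path.join(tempfile.mkdtemp(), "state.db")

writer = sqlite3.connect(db_path)
# The PRAGMA returns the journal mode that is actually in effect.
journal_mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE t (x INTEGER)")
writer.commit()


def count_rows(conn):
    """Return the number of rows in t as seen by this connection."""
    return conn.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]


# The INSERT implicitly opens a write transaction that stays uncommitted.
writer.execute("INSERT INTO t VALUES (1)")

# A reader is not blocked; it sees the last committed snapshot.
reader = sqlite3.connect(db_path)
rows_during_write = count_rows(reader)

writer.commit()
rows_after_commit = count_rows(reader)

reader.close()
writer.close()
```

In rollback-journal mode the reader could instead hit a lock while the writer is committing, which is the situation WAL is meant to avoid.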

SQLite Schema

Sessions Table

CREATE TABLE IF NOT EXISTS sessions (
    id TEXT PRIMARY KEY,
    source TEXT NOT NULL,
    user_id TEXT,
    model TEXT,
    model_config TEXT,
    system_prompt TEXT,
    parent_session_id TEXT,
    started_at REAL NOT NULL,
    ended_at REAL,
    end_reason TEXT,
    message_count INTEGER DEFAULT 0,
    tool_call_count INTEGER DEFAULT 0,
    input_tokens INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0,
    cache_read_tokens INTEGER DEFAULT 0,
    cache_write_tokens INTEGER DEFAULT 0,
    reasoning_tokens INTEGER DEFAULT 0,
    billing_provider TEXT,
    billing_base_url TEXT,
    billing_mode TEXT,
    estimated_cost_usd REAL,
    actual_cost_usd REAL,
    cost_status TEXT,
    cost_source TEXT,
    pricing_version TEXT,
    title TEXT,
    FOREIGN KEY (parent_session_id) REFERENCES sessions(id)
);

CREATE INDEX IF NOT EXISTS idx_sessions_source ON sessions(source);
CREATE INDEX IF NOT EXISTS idx_sessions_parent ON sessions(parent_session_id);
CREATE INDEX IF NOT EXISTS idx_sessions_started ON sessions(started_at DESC);
CREATE UNIQUE INDEX IF NOT EXISTS idx_sessions_title_unique
    ON sessions(title) WHERE title IS NOT NULL;
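
The effect of the partial unique index on title can be checked in isolation. The sketch below uses a cut-down sessions table (only id and title) in an in-memory database; the session IDs are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE UNIQUE INDEX idx_sessions_title_unique "
    "ON sessions(title) WHERE title IS NOT NULL"
)

# Untitled sessions are left out of the index, so any number may exist.
conn.execute("INSERT INTO sessions VALUES ('s1', NULL)")
conn.execute("INSERT INTO sessions VALUES ('s2', NULL)")
conn.execute("INSERT INTO sessions VALUES ('s3', 'Fix Docker Build')")

# A second session with the same non-NULL title violates the index.
try:
    conn.execute("INSERT INTO sessions VALUES ('s4', 'Fix Docker Build')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

session_count = conn.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
```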

Messages Table

CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(id),
    role TEXT NOT NULL,
    content TEXT,
    tool_call_id TEXT,
    tool_calls TEXT,
    tool_name TEXT,
    timestamp REAL NOT NULL,
    token_count INTEGER,
    finish_reason TEXT,
    reasoning TEXT,
    reasoning_details TEXT,
    codex_reasoning_items TEXT
);

CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, timestamp);

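Notes:

  • tool_calls is stored as a JSON string (a serialized list of tool call objects)
  • reasoning_details and codex_reasoning_items are stored as JSON strings
  • reasoning stores the raw reasoning text for providers that expose it
  • Timestamps are Unix epoch floats (time.time())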
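
To make the storage conventions in these notes concrete, here is a small round trip with the standard json, time, and sqlite3 modules. It is a sketch against a reduced messages table, not the SessionDB implementation.

```python
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "role TEXT NOT NULL, tool_calls TEXT, timestamp REAL NOT NULL)"
)

tool_calls = [{"id": "call_1", "function": {"name": "terminal", "arguments": "{}"}}]

# The list is serialized to a JSON string; the timestamp is an epoch float.
conn.execute(
    "INSERT INTO messages (role, tool_calls, timestamp) VALUES (?, ?, ?)",
    ("assistant", json.dumps(tool_calls), time.time()),
)

raw_calls, ts = conn.execute(
    "SELECT tool_calls, timestamp FROM messages"
).fetchone()
decoded_calls = json.loads(raw_calls)  # back to a list of dicts
```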

FTS5 Full-Text Search

CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts USING fts5(
    content,
    content=messages,
    content_rowid=id
);

The FTS5 table is kept in sync with the messages table by triggers that fire on INSERT, UPDATE, and DELETE:

CREATE TRIGGER IF NOT EXISTS messages_fts_insert AFTER INSERT ON messages BEGIN
    INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;

CREATE TRIGGER IF NOT EXISTS messages_fts_delete AFTER DELETE ON messages BEGIN
    INSERT INTO messages_fts(messages_fts, rowid, content)
        VALUES('delete', old.id, old.content);
END;

CREATE TRIGGER IF NOT EXISTS messages_fts_update AFTER UPDATE ON messages BEGIN
    INSERT INTO messages_fts(messages_fts, rowid, content)
        VALUES('delete', old.id, old.content);
    INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;
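
The trigger-based sync can be exercised end to end in an in-memory database. The sketch below assumes the local SQLite build includes FTS5 (most CPython builds do) and uses a reduced messages table; match_ids is a helper invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, content TEXT);
CREATE VIRTUAL TABLE messages_fts USING fts5(
    content, content=messages, content_rowid=id
);
CREATE TRIGGER messages_fts_insert AFTER INSERT ON messages BEGIN
    INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;
CREATE TRIGGER messages_fts_delete AFTER DELETE ON messages BEGIN
    INSERT INTO messages_fts(messages_fts, rowid, content)
        VALUES('delete', old.id, old.content);
END;
""")

conn.execute("INSERT INTO messages (content) VALUES ('docker build failed')")
conn.execute("INSERT INTO messages (content) VALUES ('tests are green')")


def match_ids(query):
    """Return the message ids whose content matches an FTS5 query."""
    rows = conn.execute(
        "SELECT rowid FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]


hits_before = match_ids("docker")
conn.execute("DELETE FROM messages WHERE id = 1")  # fires the delete trigger
hits_after = match_ids("docker")
```

Because messages_fts is an external-content table, the index only reflects what the triggers feed it. Rows changed while the triggers are missing would leave the index stale.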

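Schema Version and Migrations

Current schema version: 6

The schema_version table stores a single integer. On initialization, _init_schema() checks the current version and applies migrations in order:

| Version | Change |
|---------|--------|
| 1 | Initial schema (sessions, messages, FTS5) |
| 2 | Added finish_reason column to messages |
| 3 | Added title column to sessions |
| 4 | Added unique index on title (NULLs allowed, non-NULL values must be unique) |
| 5 | Added billing columns: cache_read_tokens, cache_write_tokens, reasoning_tokens, billing_provider, billing_base_url, billing_mode, estimated_cost_usd, actual_cost_usd, cost_status, cost_source, pricing_version |
| 6 | Added reasoning columns to messages: reasoning, reasoning_details, codex_reasoning_items |

Each migration wraps ALTER TABLE ADD COLUMN in try/except to handle the case where the column already exists, which makes it idempotent. The version number is bumped after each successful migration block.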
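
The migration pattern described here might look roughly like the following. This is a sketch under assumptions, not the body of _init_schema(): the helper names add_column_if_missing and migrate, and the column name version inside schema_version, are hypothetical. Only two migration steps are shown.

```python
import sqlite3


def add_column_if_missing(conn, table, column_def):
    """Run ALTER TABLE ADD COLUMN, ignoring only the duplicate-column error."""
    try:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column_def}")
    except sqlite3.OperationalError as exc:
        if "duplicate column name" not in str(exc):
            raise


def migrate(conn):
    """Apply pending migrations in order and return the resulting version."""
    version = conn.execute("SELECT version FROM schema_version").fetchone()[0]
    if version < 2:
        add_column_if_missing(conn, "messages", "finish_reason TEXT")
        conn.execute("UPDATE schema_version SET version = 2")
    if version < 3:
        add_column_if_missing(conn, "sessions", "title TEXT")
        conn.execute("UPDATE schema_version SET version = 3")
    conn.commit()
    return conn.execute("SELECT version FROM schema_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (id TEXT PRIMARY KEY);
-- finish_reason already exists here to exercise the idempotent path.
CREATE TABLE messages (id INTEGER PRIMARY KEY, finish_reason TEXT);
CREATE TABLE schema_version (version INTEGER NOT NULL);
INSERT INTO schema_version VALUES (1);
""")
final_version = migrate(conn)
```

Running migrate(conn) a second time is a no-op, since the stored version already equals the latest step.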

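Write Contention Handling

Multiple hermes processes (gateway + CLI sessions + worktree agents) share one state.db. The SessionDB class handles write contention with:

  • A short SQLite timeout (1 second) instead of the default 30 seconds
  • Application-level retries with random jitter (20-150ms, up to 15 retries)
  • BEGIN IMMEDIATE transactions, which surface lock contention at the start of the transaction
  • Periodic WAL checkpoints every 50 successful writes (PASSIVE mode)

This avoids the "convoy effect", where SQLite's deterministic internal backoff causes all competing writers to retry at the same intervals.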

_WRITE_MAX_RETRIES = 15
_WRITE_RETRY_MIN_S = 0.020 # 20ms
_WRITE_RETRY_MAX_S = 0.150 # 150ms
_CHECKPOINT_EVERY_N_WRITES = 50
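
A retry loop built on these constants could be sketched as follows. The helper execute_write is hypothetical and is not taken from SessionDB. The connection is assumed to be opened with isolation_level=None so the code controls BEGIN and COMMIT itself. Checkpointing is left out for brevity.

```python
import random
import sqlite3
import time

_WRITE_MAX_RETRIES = 15
_WRITE_RETRY_MIN_S = 0.020  # 20ms
_WRITE_RETRY_MAX_S = 0.150  # 150ms


def execute_write(conn, fn):
    """Run fn(conn) inside BEGIN IMMEDIATE, retrying on lock contention."""
    for attempt in range(_WRITE_MAX_RETRIES + 1):
        try:
            # Take the write lock up front so contention fails here,
            # not halfway through the transaction.
            conn.execute("BEGIN IMMEDIATE")
            try:
                result = fn(conn)
                conn.commit()
                return result
            except BaseException:
                conn.rollback()
                raise
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            contended = "locked" in message or "busy" in message
            if not contended or attempt == _WRITE_MAX_RETRIES:
                raise
            # Random jitter keeps competing writers from retrying in lockstep.
            time.sleep(random.uniform(_WRITE_RETRY_MIN_S, _WRITE_RETRY_MAX_S))


conn = sqlite3.connect(":memory:", timeout=1.0, isolation_level=None)
conn.execute("CREATE TABLE counters (n INTEGER)")
conn.execute("INSERT INTO counters VALUES (0)")
execute_write(conn, lambda c: c.execute("UPDATE counters SET n = n + 1"))
value = conn.execute("SELECT n FROM counters").fetchone()[0]
```

Errors unrelated to locking, such as a missing table, are re-raised immediately instead of being retried.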

Common Operations

Initialization

from pathlib import Path

from hermes_state import SessionDB

db = SessionDB()  # Default: ~/.hermes/state.db
db = SessionDB(db_path=Path("/tmp/test.db"))  # Custom path

Creating and Managing Sessions

# Create a new session
db.create_session(
    session_id="sess_abc123",
    source="cli",
    model="anthropic/claude-sonnet-4.6",
    user_id="user_1",
    parent_session_id=None,  # or previous session ID for lineage
)

# End a session
db.end_session("sess_abc123", end_reason="user_exit")

# Reopen a session (clear ended_at/end_reason)
db.reopen_session("sess_abc123")

Storing Messages

msg_id = db.append_message(
    session_id="sess_abc123",
    role="assistant",
    content="Here's the answer...",
    tool_calls=[{"id": "call_1", "function": {"name": "terminal", "arguments": "{}"}}],
    token_count=150,
    finish_reason="stop",
    reasoning="Let me think about this...",
)

Retrieving Messages

# Raw messages with all metadata
messages = db.get_messages("sess_abc123")

# OpenAI conversation format (for API replay)
conversation = db.get_messages_as_conversation("sess_abc123")
# Returns: [{"role": "user", "content": "..."}, {"role": "assistant", ...}]

Session Titles

# Set a title (must be unique among non-NULL titles)
db.set_session_title("sess_abc123", "Fix Docker Build")

# Resolve by title (returns most recent in lineage)
session_id = db.resolve_session_by_title("Fix Docker Build")

# Auto-generate next title in lineage
next_title = db.get_next_title_in_lineage("Fix Docker Build")
# Returns: "Fix Docker Build #2"

Full-Text Search

The search_messages() method supports FTS5 query syntax and automatically sanitizes user input.

Basic Search

results = db.search_messages("docker deployment")

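FTS5 Query Syntax

| Syntax | Example | Meaning |
|--------|---------|---------|
| Keywords | docker deployment | Both terms (implicit AND) |
| Quoted phrase | "exact phrase" | Exact phrase match |
| Boolean OR | docker OR kubernetes | Either term |
| Boolean NOT | python NOT java | Exclude term |
| Prefix | deploy* | Prefix match |

Filtered Search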

# Search only CLI sessions
results = db.search_messages("error", source_filter=["cli"])

# Exclude gateway sessions
results = db.search_messages("bug", exclude_sources=["telegram", "discord"])

# Search only user messages
results = db.search_messages("help", role_filter=["user"])

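Search Result Format

Each result includes:

  • id, session_id, role, timestamp
  • snippet: an FTS5-generated snippet with >>>match<<< markers
  • context: 1 message before and 1 message after the match (content truncated to 200 characters)
  • source, model, session_started: taken from the parent session

The _sanitize_fts5_query() method handles edge cases:

  • Strips unmatched quotes and special characters
  • Wraps hyphenated terms in quotes (chat-send → "chat-send")
  • Removes dangling boolean operators (hello AND → hello)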
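
As a rough illustration of the three rules above, a simplified sanitizer could look like this. It is a hypothetical sketch, not the actual _sanitize_fts5_query() implementation, and it does not claim to cover every edge case the real method handles.

```python
import re

_OPERATORS = {"AND", "OR", "NOT"}


def sanitize_query(query):
    """Simplified, hypothetical version of the sanitization rules."""
    # An odd number of quotes means one is unmatched; drop them all.
    if query.count('"') % 2 == 1:
        query = query.replace('"', "")

    # Split into quoted phrases and bare whitespace-separated tokens.
    tokens = re.findall(r'"[^"]*"|\S+', query)
    cleaned = []
    for tok in tokens:
        if tok.startswith('"'):
            cleaned.append(tok)  # keep quoted phrases untouched
            continue
        # Keep word characters, hyphens, and the prefix wildcard only.
        tok = re.sub(r"[^\w\-*]", "", tok)
        if not tok:
            continue
        if "-" in tok:
            tok = f'"{tok}"'  # FTS5 would otherwise treat "-" as syntax
        cleaned.append(tok)

    # Remove boolean operators left dangling at either end.
    while cleaned and cleaned[-1] in _OPERATORS:
        cleaned.pop()
    while cleaned and cleaned[0] in _OPERATORS:
        cleaned.pop(0)
    return " ".join(cleaned)
```

For instance, sanitize_query("hello AND") yields hello, and sanitize_query("chat-send") yields the quoted term "chat-send".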

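Session Lineage

Sessions can form chains via parent_session_id. This happens when context compression triggers a session split in the gateway.

Query: Find Session Lineage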

-- Find all ancestors of a session
WITH RECURSIVE lineage AS (
    SELECT * FROM sessions WHERE id = ?
    UNION ALL
    SELECT s.* FROM sessions s
    JOIN lineage l ON s.id = l.parent_session_id
)
SELECT id, title, started_at, parent_session_id FROM lineage;

-- Find all descendants of a session
WITH RECURSIVE descendants AS (
    SELECT * FROM sessions WHERE id = ?
    UNION ALL
    SELECT s.* FROM sessions s
    JOIN descendants d ON s.parent_session_id = d.id
)
SELECT id, title, started_at FROM descendants;
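
The ancestor query can be run from Python against a small in-memory chain to see what it returns. The three sessions and the reduced table below are invented for the example. Note that the result includes the starting session itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (id TEXT PRIMARY KEY, "
    "parent_session_id TEXT REFERENCES sessions(id), "
    "started_at REAL NOT NULL)"
)
# root -> mid -> leaf, as produced by two successive splits.
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [("root", None, 1.0), ("mid", "root", 2.0), ("leaf", "mid", 3.0)],
)

ANCESTORS_SQL = """
WITH RECURSIVE lineage AS (
    SELECT * FROM sessions WHERE id = ?
    UNION ALL
    SELECT s.* FROM sessions s
    JOIN lineage l ON s.id = l.parent_session_id
)
SELECT id FROM lineage ORDER BY started_at
"""

ancestor_ids = [row[0] for row in conn.execute(ANCESTORS_SQL, ("leaf",))]
```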

Query: Recent Sessions with Preview

SELECT s.*,
    COALESCE(
        (SELECT SUBSTR(m.content, 1, 63)
         FROM messages m
         WHERE m.session_id = s.id AND m.role = 'user' AND m.content IS NOT NULL
         ORDER BY m.timestamp, m.id LIMIT 1),
        ''
    ) AS preview,
    COALESCE(
        (SELECT MAX(m2.timestamp) FROM messages m2 WHERE m2.session_id = s.id),
        s.started_at
    ) AS last_active
FROM sessions s
ORDER BY s.started_at DESC
LIMIT 20;

Query: Token Usage Statistics

-- Total tokens by model
SELECT model,
    COUNT(*) AS session_count,
    SUM(input_tokens) AS total_input,
    SUM(output_tokens) AS total_output,
    SUM(estimated_cost_usd) AS total_cost
FROM sessions
WHERE model IS NOT NULL
GROUP BY model
ORDER BY total_cost DESC;

-- Sessions with highest token usage
SELECT id, title, model, input_tokens + output_tokens AS total_tokens,
    estimated_cost_usd
FROM sessions
ORDER BY total_tokens DESC
LIMIT 10;

Export and Cleanup

# Export a single session with messages
data = db.export_session("sess_abc123")

# Export all sessions (with messages) as list of dicts
all_data = db.export_all(source="cli")

# Delete old sessions (only ended sessions)
deleted_count = db.prune_sessions(older_than_days=90)
deleted_count = db.prune_sessions(older_than_days=30, source="telegram")

# Clear messages but keep the session record
db.clear_messages("sess_abc123")

# Delete session and all messages
db.delete_session("sess_abc123")

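Database Location

Default path: ~/.hermes/state.db

The path is derived from hermes_constants.get_hermes_home(), which resolves to ~/.hermes/ by default, or to the value of the HERMES_HOME environment variable if it is set.

The database file, the WAL file (state.db-wal), and the shared-memory file (state.db-shm) are all created in the same directory.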
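
The resolution rule described above can be approximated as follows. The function resolve_state_db_path is a hypothetical stand-in written for this example. It is not the code of hermes_constants.get_hermes_home(), which may handle additional cases.

```python
import os
from pathlib import Path


def resolve_state_db_path():
    """Hypothetical sketch: HERMES_HOME if set, else ~/.hermes, plus state.db."""
    home = os.environ.get("HERMES_HOME")
    base = Path(home) if home else Path.home() / ".hermes"
    return base / "state.db"
```

Pointing HERMES_HOME at a temporary directory is one way to keep test runs away from the real database.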