Memory Layer¶
The Problem¶
Stateless LLMs treat each session as a clean slate. Without memory, agents repeat mistakes humans have already corrected, pay the token cost of re-injecting the same context, and cannot maintain continuity across multi-step workflows. The existing knowledge capture tools record institutional knowledge, but they lack per-analyst personalization, temporal reasoning, and automatic surfacing of relevant context.
How It Works¶
The memory layer stores everything agents accumulate across sessions in a single memory_records table backed by PostgreSQL, with pgvector for semantic search. Memories are scoped along two axes: user (who created the memory) and persona (who can see it).
```mermaid
flowchart TB
    subgraph "During AI Session"
        A[Agent discovers knowledge] --> B{memory_manage<br/>command: remember}
        C[User shares context] --> D[capture_insight]
    end
    subgraph "PostgreSQL + pgvector"
        B --> E[(memory_records)]
        D --> E
    end
    subgraph "Automatic"
        E --> F[Cross-injection middleware<br/>attaches memories to<br/>toolkit responses]
        E --> G[Staleness watcher<br/>flags stale memories]
    end
    subgraph "Explicit Recall"
        H[memory_recall] --> E
    end
    subgraph "Admin Curation"
        E --> I[apply_knowledge<br/>promotes to DataHub]
    end
```
Memory Types¶
Memories are classified by LOCOMO dimension for structured retrieval:
| Dimension | Purpose | Examples |
|---|---|---|
| knowledge | Factual/institutional | "We have two distinct selling seasons", "Test stores 9001-9099 are training environments" |
| event | Temporal/episodic | "On March 15 the analyst ran a Q1 sales rollup filtering out test stores" |
| entity | Entity attributes | "The customer_id column contains PII", "This table was migrated from Oracle in 2024" |
| relationship | Links between entities | "acme_legacy_sales is deprecated in favor of elasticsearch.sales" |
| preference | User preferences | "This analyst prefers SQL over natural language queries" |
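The classification above can be modeled as a small enum plus a record type. This is a minimal sketch, not the actual memory_records schema: only created_by, persona, and metadata are named in this document, and every other field name (id, content, dimension, entity_urns) is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dimension(str, Enum):
    """The five memory dimensions listed in the table above."""
    KNOWLEDGE = "knowledge"
    EVENT = "event"
    ENTITY = "entity"
    RELATIONSHIP = "relationship"
    PREFERENCE = "preference"


@dataclass
class MemoryRecord:
    # Hypothetical field names, except created_by, persona, and metadata.
    id: str
    content: str
    dimension: Dimension
    created_by: str  # owner's email (user axis)
    persona: str     # visibility scope (persona axis)
    entity_urns: list[str] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def by_dimension(records, dimension):
    """Return only the records classified under the given dimension."""
    return [r for r in records if r.dimension == dimension]
```

Because Dimension subclasses str, a value such as "event" read from a database row converts directly with Dimension("event").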
Scoping¶
| Axis | Field | Purpose |
|---|---|---|
| User | created_by (email) | Ownership. Users can only update/forget their own memories unless admin. |
| Persona | persona | Visibility. Memories created under a persona are visible to that persona. Admin sees all. |
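The two scoping rules reduce to a pair of predicates. The sketch below assumes a hypothetical Caller type with an is_admin flag; how the platform actually identifies admins is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    # Hypothetical caller identity; is_admin is an assumed flag.
    email: str
    persona: str
    is_admin: bool = False


@dataclass(frozen=True)
class Memory:
    created_by: str  # user axis: who owns the memory
    persona: str     # persona axis: who can see it


def can_see(caller: Caller, memory: Memory) -> bool:
    """Visibility: same persona, or admin sees all."""
    return caller.is_admin or caller.persona == memory.persona


def can_modify(caller: Caller, memory: Memory) -> bool:
    """Ownership: only the creator (or an admin) may update or forget."""
    return caller.is_admin or caller.email == memory.created_by
```

Keeping the two checks separate mirrors the table: a colleague under the same persona can read a memory but cannot update or forget it.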
Tools¶
memory_manage¶
CRUD operations for memory records. Opt-in per persona (requires memory_* in tools.allow).
| Command | Purpose |
|---|---|
| remember | Create a new memory with optional embedding |
| update | Revise content, category, tags on an existing record |
| forget | Soft-delete (archive) a memory |
| list | Query memories with filters, persona-scoped by default |
| review_stale | List memories flagged as stale by the lineage watcher |
memory_recall¶
Multi-strategy retrieval for when cross-injection is not enough.
| Strategy | Method | LOCOMO Dimension |
|---|---|---|
| entity | Direct URN lookup | Single-hop recall |
| semantic | Vector similarity via pgvector | Open-domain recall |
| graph | DataHub lineage traversal + entity lookup | Multi-hop reasoning |
| auto (default) | Runs entity + semantic + graph in parallel, deduplicates | All dimensions |
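The auto strategy's fan-out and merge can be sketched as follows. The three strategy functions are stubs standing in for the real URN lookup, pgvector search, and lineage traversal; the function name recall_auto and the dict-shaped results with an id key are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def recall_auto(query, strategies):
    """Run every strategy in parallel, then merge results, dropping duplicate ids."""
    with ThreadPoolExecutor(max_workers=max(1, len(strategies))) as pool:
        futures = [pool.submit(fn, query) for fn in strategies.values()]
        batches = [f.result() for f in futures]  # kept in strategy order

    seen = set()
    merged = []
    for batch in batches:
        for memory in batch:
            if memory["id"] not in seen:
                seen.add(memory["id"])
                merged.append(memory)
    return merged


# Stub strategies (hypothetical): each returns a list of memory dicts.
def entity_lookup(query):
    return [{"id": "m1", "content": "customer_id contains PII"}]


def semantic_search(query):
    return [{"id": "m1", "content": "customer_id contains PII"},
            {"id": "m2", "content": "Test stores 9001-9099 are training environments"}]


def graph_traversal(query):
    return [{"id": "m2", "content": "Test stores 9001-9099 are training environments"},
            {"id": "m3", "content": "acme_legacy_sales is deprecated"}]


results = recall_auto(
    "sales tables",
    {"entity": entity_lookup, "semantic": semantic_search, "graph": graph_traversal},
)
```

Results are merged in strategy order, so a memory found by both entity and semantic recall keeps its entity-lookup position. A real implementation might instead rank the merged list by score.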
capture_insight (existing, refactored)¶
Now writes to memory_records instead of the legacy knowledge_insights table. Creates memory records with insight-specific metadata (suggested_actions, related_columns). Generates embeddings via Ollama when available.
apply_knowledge (existing, refactored)¶
Reads from memory_records via an adapter. Promotes curated memories into durable DataHub knowledge (context documents, glossary terms, tags, structured properties).
Cross-Injection¶
The existing bidirectional enrichment middleware automatically attaches relevant memories to toolkit responses. When a Trino query, DataHub lookup, or S3 operation returns results containing DataHub URNs, the middleware recalls memories linked to those entities and appends them as a memory_context content block.
No explicit memory_recall call is needed for this: it happens transparently on every enriched tool response.
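A rough sketch of the middleware step, under several assumptions: the response is a dict with a content list of typed blocks, URNs appear in text blocks, and a lookup callable returns the memories linked to a URN. The regular expression is a simplified approximation of the DataHub URN format, not an official grammar.

```python
import re

# Simplified URN matcher (assumption): either a parenthesized key such as
# urn:li:dataset:(...), or a flat key such as urn:li:tag:pii.
URN_PATTERN = re.compile(r"urn:li:\w+:\([^)]*\)|urn:li:\w+:[^\s,()\"']+")


def attach_memories(response, lookup):
    """Append a memory_context block when the response mentions known URNs."""
    text = " ".join(block.get("text", "") for block in response["content"])
    urns = list(dict.fromkeys(URN_PATTERN.findall(text)))  # unique, order kept
    memories = [m for urn in urns for m in lookup(urn)]
    if not memories:
        return response  # nothing to inject; pass the response through
    block = {"type": "memory_context", "memories": memories}
    return {**response, "content": [*response["content"], block]}


# Hypothetical in-memory store keyed by URN.
store = {
    "urn:li:dataset:(urn:li:dataPlatform:trino,sales.orders,PROD)": [
        {"id": "m1", "content": "Filter out test stores 9001-9099"},
    ],
}
response = {"content": [{
    "type": "text",
    "text": "Rows from urn:li:dataset:(urn:li:dataPlatform:trino,sales.orders,PROD) returned.",
}]}
enriched = attach_memories(response, lambda urn: store.get(urn, []))
```

The function returns a new response rather than mutating its input, so a failure in memory lookup can fall back to the original tool output.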
Staleness Detection¶
A background watcher periodically checks active memories against DataHub entity state. When a referenced entity is deprecated or its schema changes, the memory is flagged as stale with a reason. Stale memories are excluded from default recall and surfaced via memory_manage(command='review_stale') for admin curation.
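One pass of such a watcher might look like the sketch below. The record shape (status, entity_urns, schema_hashes captured at write time) and the entity_state mapping are hypothetical; the document does not say how schema changes are detected, so comparing stored hashes is only one possible approach.

```python
def find_stale(memories, entity_state):
    """Return (memory_id, reason) pairs for active memories whose entities changed."""
    flagged = []
    for memory in memories:
        if memory["status"] != "active":
            continue  # archived or already-stale memories are skipped
        for urn in memory["entity_urns"]:
            state = entity_state.get(urn)
            if state is None:
                continue  # entity state unknown; leave the memory alone
            if state["deprecated"]:
                flagged.append((memory["id"], f"{urn} is deprecated"))
                break
            if state["schema_hash"] != memory["schema_hashes"].get(urn):
                flagged.append((memory["id"], f"{urn} schema changed"))
                break
    return flagged


# Hypothetical data: one memory per case.
memories = [
    {"id": "m1", "status": "active", "entity_urns": ["urn:a"], "schema_hashes": {"urn:a": "h1"}},
    {"id": "m2", "status": "active", "entity_urns": ["urn:b"], "schema_hashes": {"urn:b": "h1"}},
    {"id": "m3", "status": "active", "entity_urns": ["urn:c"], "schema_hashes": {"urn:c": "h1"}},
]
entity_state = {
    "urn:a": {"deprecated": True, "schema_hash": "h1"},
    "urn:b": {"deprecated": False, "schema_hash": "h2"},
    "urn:c": {"deprecated": False, "schema_hash": "h1"},
}
stale = find_stale(memories, entity_state)
```

Each memory is flagged at most once, with the first reason found, which keeps the review_stale listing to one line per record.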
Correction Chains¶
When a memory is updated or superseded, the correction chain is tracked in metadata.superseded_by. This supports temporal reasoning: a statement such as "X was said, then corrected to Y" can be reconstructed by following superseded_by links from the original record to its latest correction.
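Following a correction chain amounts to walking superseded_by links until a record has none. The sketch below assumes records are held in a dict keyed by id and that superseded_by stores the id of the replacing record; both are illustrative assumptions.

```python
def correction_chain(records, memory_id):
    """Return the ids from memory_id to its latest correction, oldest first."""
    chain = [memory_id]
    seen = {memory_id}
    next_id = records[memory_id]["metadata"].get("superseded_by")
    while next_id is not None:
        if next_id in seen:
            raise ValueError(f"cycle in correction chain at {next_id}")
        chain.append(next_id)
        seen.add(next_id)
        next_id = records[next_id]["metadata"].get("superseded_by")
    return chain


# Hypothetical records: m1 was corrected by m2, which was corrected by m3.
records = {
    "m1": {"content": "Fiscal year starts in January", "metadata": {"superseded_by": "m2"}},
    "m2": {"content": "Fiscal year starts in February", "metadata": {"superseded_by": "m3"}},
    "m3": {"content": "Fiscal year starts on Feb 1", "metadata": {}},
}
chain = correction_chain(records, "m1")
```

The last id in the chain is the current version, and the earlier ids give the "X was said, then corrected to Y" history.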
Relationship to Knowledge Capture¶
Memory is the universal store. An insight (captured via capture_insight) is a subtype of memory: one that may carry proposed catalog changes. But knowledge is broader than catalog mutations. Domain context like "we have two selling seasons" is institutional knowledge that does not map to a DataHub tag or description update. The apply_knowledge tool is where the differentiation happens: it reviews memories and promotes the appropriate ones into durable DataHub entities.