
Memory Layer

The Problem

Stateless LLMs treat each session as a clean slate. Without memory, agents repeat mistakes that humans have already corrected, spend tokens re-injecting the same context, and cannot maintain continuity across multi-step workflows. The existing knowledge capture tools record institutional knowledge but lack per-analyst personalization, temporal reasoning, and automatic surfacing of relevant context.

How It Works

The memory layer stores everything agents accumulate across sessions in a single memory_records table, backed by PostgreSQL with pgvector for semantic search. Each memory is scoped along two axes: user (who created it) and persona (who can see it).

flowchart TB
    subgraph "During AI Session"
        A[Agent discovers knowledge] --> B{memory_manage<br/>command: remember}
        C[User shares context] --> D[capture_insight]
    end

    subgraph "PostgreSQL + pgvector"
        B --> E[(memory_records)]
        D --> E
    end

    subgraph "Automatic"
        E --> F[Cross-injection middleware<br/>attaches memories to<br/>toolkit responses]
        E --> G[Staleness watcher<br/>flags stale memories]
    end

    subgraph "Explicit Recall"
        H[memory_recall] --> E
    end

    subgraph "Admin Curation"
        E --> I[apply_knowledge<br/>promotes to DataHub]
    end

Memory Types

Memories are classified by LOCOMO dimension for structured retrieval:

| Dimension | Purpose | Examples |
| --- | --- | --- |
| knowledge | Factual/institutional | "We have two distinct selling seasons", "Test stores 9001-9099 are training environments" |
| event | Temporal/episodic | "On March 15 the analyst ran a Q1 sales rollup filtering out test stores" |
| entity | Entity attributes | "The customer_id column contains PII", "This table was migrated from Oracle in 2024" |
| relationship | Links between entities | "acme_legacy_sales is deprecated in favor of elasticsearch.sales" |
| preference | User preferences | "This analyst prefers SQL over natural language queries" |
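
As an illustration of how the five dimensions could drive structured retrieval, the following is a minimal sketch, not the actual schema. The `MemoryRecord` class, its `tags` field, and the `by_dimension` helper are hypothetical names introduced for this example; only the dimension values come from the table above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dimension(str, Enum):
    """The five dimensions listed in the table above."""
    KNOWLEDGE = "knowledge"
    EVENT = "event"
    ENTITY = "entity"
    RELATIONSHIP = "relationship"
    PREFERENCE = "preference"


@dataclass
class MemoryRecord:
    # Hypothetical record shape; the real memory_records columns may differ.
    content: str
    dimension: Dimension
    tags: list[str] = field(default_factory=list)


def by_dimension(records: list[MemoryRecord], dimension: Dimension) -> list[MemoryRecord]:
    """Return only the records classified under the given dimension."""
    return [r for r in records if r.dimension is dimension]


records = [
    MemoryRecord("Test stores 9001-9099 are training environments", Dimension.KNOWLEDGE),
    MemoryRecord("The customer_id column contains PII", Dimension.ENTITY),
    MemoryRecord("This analyst prefers SQL over natural language queries", Dimension.PREFERENCE),
]
entity_facts = by_dimension(records, Dimension.ENTITY)
```

Because the enum subclasses `str`, a value read back from storage as plain text (for example `"event"`) converts directly with `Dimension("event")`.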

Scoping

| Axis | Field | Purpose |
| --- | --- | --- |
| User | created_by (email) | Ownership. Users can update or forget only their own memories; admins can modify any memory. |
| Persona | persona | Visibility. Memories created under a persona are visible to that persona. Admins see all memories. |
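
The two scoping rules can be read as a pair of predicates. This is a sketch under assumptions: the `Caller` and `Memory` types and the `is_admin` flag are hypothetical, and the real access checks may consider more than these fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    # Hypothetical view of the requesting user.
    email: str
    persona: str
    is_admin: bool = False


@dataclass(frozen=True)
class Memory:
    created_by: str  # owner email (the User axis)
    persona: str     # persona the memory was created under (the Persona axis)


def can_see(caller: Caller, memory: Memory) -> bool:
    """Visibility: same persona, or admin."""
    return caller.is_admin or caller.persona == memory.persona


def can_modify(caller: Caller, memory: Memory) -> bool:
    """Ownership: only the creator may update or forget, unless admin."""
    return caller.is_admin or caller.email == memory.created_by


memory = Memory(created_by="alice@example.com", persona="analyst")
alice = Caller("alice@example.com", "analyst")
bob = Caller("bob@example.com", "analyst")
carol = Caller("carol@example.com", "finance")
admin = Caller("root@example.com", "ops", is_admin=True)
```

In this sketch, bob shares alice's persona and so can see her memory but cannot change it, carol can do neither, and the admin can do both.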

Tools

memory_manage

CRUD operations for memory records. Opt-in per persona (requires memory_* in tools.allow).

| Command | Purpose |
| --- | --- |
| remember | Create a new memory with optional embedding |
| update | Revise content, category, tags on an existing record |
| forget | Soft-delete (archive) a memory |
| list | Query memories with filters, persona-scoped by default |
| review_stale | List memories flagged as stale by the staleness watcher |
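
To make the soft-delete and persona-scoped listing semantics concrete, here is a small in-memory sketch. It is not the real tool: `MemoryStore`, its method signatures, and the `"active"`/`"archived"` status values are assumptions made for illustration, and whether `list` hides archived records by default is also assumed.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Memory:
    id: int
    content: str
    created_by: str
    persona: str
    status: str = "active"  # assumed values: "active" or "archived"


class MemoryStore:
    """Toy stand-in for memory_records, keyed by integer id."""

    def __init__(self):
        self._records = {}
        self._ids = count(1)

    def remember(self, content, created_by, persona):
        mem = Memory(next(self._ids), content, created_by, persona)
        self._records[mem.id] = mem
        return mem

    def update(self, memory_id, content):
        self._records[memory_id].content = content

    def forget(self, memory_id):
        # Soft delete: the row stays, only its status changes.
        self._records[memory_id].status = "archived"

    def list_memories(self, persona, include_archived=False):
        # Mirrors the `list` command: scoped to the caller's persona.
        return [
            m for m in self._records.values()
            if m.persona == persona and (include_archived or m.status == "active")
        ]


store = MemoryStore()
first = store.remember("We have two distinct selling seasons", "alice@example.com", "analyst")
second = store.remember("Test stores 9001-9099 are training environments", "alice@example.com", "analyst")
store.forget(first.id)
visible = store.list_memories("analyst")
```

The forgotten record is still present when `include_archived=True` is passed, which is the point of archiving rather than deleting.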

memory_recall

Multi-strategy retrieval for when cross-injection is not enough.

| Strategy | Method | LOCOMO Dimension |
| --- | --- | --- |
| entity | Direct URN lookup | Single-hop recall |
| semantic | Vector similarity via pgvector | Open-domain recall |
| graph | DataHub lineage traversal + entity lookup | Multi-hop reasoning |
| auto (default) | Runs entity + semantic + graph in parallel, deduplicates | All dimensions |
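
The auto strategy's "run in parallel, then deduplicate" behavior could look roughly like the sketch below. The three strategy functions are stubs returning canned hits, and the `id` key used for deduplication is an assumption; the real implementation may rank or merge results differently.

```python
import asyncio


async def entity_lookup(query):
    # Stub: direct URN lookup.
    return [{"id": "m1", "content": "The customer_id column contains PII"}]


async def semantic_search(query):
    # Stub: vector similarity; overlaps with entity_lookup on m1.
    return [{"id": "m1", "content": "The customer_id column contains PII"},
            {"id": "m2", "content": "We have two distinct selling seasons"}]


async def graph_traversal(query):
    # Stub: lineage traversal plus entity lookup.
    return [{"id": "m3", "content": "acme_legacy_sales is deprecated"}]


async def recall_auto(query, strategies):
    """Run every strategy concurrently and keep the first hit per memory id."""
    results = await asyncio.gather(*(strategy(query) for strategy in strategies))
    seen = set()
    merged = []
    for hits in results:  # gather preserves the order of `strategies`
        for hit in hits:
            if hit["id"] not in seen:
                seen.add(hit["id"])
                merged.append(hit)
    return merged


merged = asyncio.run(
    recall_auto("customer_id", [entity_lookup, semantic_search, graph_traversal])
)
```

Deduplicating in strategy order means a hit found by direct entity lookup wins over the same record found semantically, which is one reasonable tie-break among several.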

capture_insight (existing, refactored)

Now writes to memory_records instead of the legacy knowledge_insights table. Creates memory records with insight-specific metadata (suggested_actions, related_columns). Generates embeddings via Ollama when available.
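
The "when available" behavior suggests that a missing embedding backend should not block capture. A minimal sketch of that fallback follows, assuming the embedder is an injected callable; the function names, the record shape, and the exception types caught are hypothetical, and no real Ollama client call is shown.

```python
from typing import Callable, Optional, Sequence

Embedder = Callable[[str], Sequence[float]]


def embed_if_available(text: str, embedder: Optional[Embedder]) -> Optional[list[float]]:
    """Return an embedding, or None when no backend is configured or reachable."""
    if embedder is None:
        return None
    try:
        return list(embedder(text))
    except (ConnectionError, TimeoutError):
        # Assumed failure modes; the record is still stored without a vector.
        return None


def build_insight_record(text, suggested_actions, related_columns, embedder=None):
    # Hypothetical shape: insight-specific fields live under metadata.
    return {
        "content": text,
        "metadata": {
            "suggested_actions": suggested_actions,
            "related_columns": related_columns,
        },
        "embedding": embed_if_available(text, embedder),
    }


def offline_embedder(text):
    raise ConnectionError("embedding service unreachable")


record = build_insight_record(
    "The customer_id column contains PII",
    suggested_actions=["add_tag"],
    related_columns=["customer_id"],
    embedder=offline_embedder,
)
```

A record stored without an embedding would simply be invisible to semantic recall until an embedding is generated, while entity lookup would still find it.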

apply_knowledge (existing, refactored)

Reads from memory_records via an adapter. Promotes curated memories into durable DataHub knowledge (context documents, glossary terms, tags, structured properties).

Cross-Injection

The existing bidirectional enrichment middleware automatically attaches relevant memories to toolkit responses. When a Trino query, DataHub lookup, or S3 operation returns results containing DataHub URNs, the middleware recalls memories linked to those entities and appends them as a memory_context content block.

No explicit memory_recall call is needed for this: it happens transparently on every enriched tool response.
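
The cross-injection step described above can be sketched as a function that scans a tool response for dataset URNs and appends a memory_context block. The response shape, the `memories_by_urn` lookup, and the fields inside the appended block are assumptions; the regular expression covers only DataHub dataset URNs, not every entity type.

```python
import re

# Matches dataset URNs such as
# urn:li:dataset:(urn:li:dataPlatform:trino,sales.public.orders,PROD)
DATASET_URN = re.compile(
    r"urn:li:dataset:\(urn:li:dataPlatform:[^,()\s]+,[^,()\s]+,[A-Z_]+\)"
)


def attach_memories(response, memories_by_urn):
    """Return a copy of the response with a memory_context block appended,
    or the response unchanged when no linked memories are found."""
    text = " ".join(
        block["text"] for block in response["content"] if block.get("type") == "text"
    )
    urns = list(dict.fromkeys(DATASET_URN.findall(text)))  # unique, in order
    recalled = [m for urn in urns for m in memories_by_urn.get(urn, [])]
    if not recalled:
        return response
    context = {"type": "memory_context", "memories": recalled}  # hypothetical shape
    return {**response, "content": [*response["content"], context]}


orders = "urn:li:dataset:(urn:li:dataPlatform:trino,sales.public.orders,PROD)"
memories_by_urn = {orders: ["Test stores 9001-9099 are training environments"]}
response = {"content": [{"type": "text", "text": f"42 rows from {orders}"}]}
enriched = attach_memories(response, memories_by_urn)
```

The original response object is not mutated, so a caller that does not understand memory_context blocks can still be handed the unenriched version.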

Staleness Detection

A background watcher periodically checks active memories against DataHub entity state. When a referenced entity is deprecated or its schema changes, the memory is flagged as stale with a reason. Stale memories are excluded from default recall and surfaced via memory_manage(command='review_stale') for admin curation.
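
One pass of such a watcher might look like the following sketch. The record and entity-state shapes are hypothetical: here each memory stores the schema hash it saw for each referenced URN, which is only one possible way to detect a schema change.

```python
def find_stale(memories, entities):
    """Return {memory_id: reason} for active memories whose referenced
    entities are deprecated or have a changed schema."""
    flagged = {}
    for mem in memories:
        if mem["status"] != "active":
            continue  # already stale or archived
        for urn, seen_hash in mem["entities"].items():
            state = entities.get(urn)
            if state is None:
                continue  # entity state unknown; leave the memory alone
            if state["deprecated"]:
                flagged[mem["id"]] = f"{urn} is deprecated"
                break
            if state["schema_hash"] != seen_hash:
                flagged[mem["id"]] = f"schema of {urn} changed"
                break
    return flagged


entities = {
    "urn:legacy": {"deprecated": True, "schema_hash": "a1"},
    "urn:orders": {"deprecated": False, "schema_hash": "b2"},
    "urn:stores": {"deprecated": False, "schema_hash": "c3"},
}
memories = [
    {"id": 1, "status": "active", "entities": {"urn:legacy": "a1"}},
    {"id": 2, "status": "active", "entities": {"urn:orders": "b1"}},
    {"id": 3, "status": "active", "entities": {"urn:stores": "c3"}},
    {"id": 4, "status": "archived", "entities": {"urn:legacy": "a1"}},
]
stale = find_stale(memories, entities)
```

The returned reasons are what a review_stale listing would need to show an admin why each memory was flagged.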

Correction Chains

When a memory is updated or superseded, the correction chain is tracked in metadata.superseded_by. This supports temporal reasoning: "X was said, then corrected to Y" has a clean signal path through the memory graph.
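
Following a correction chain amounts to walking superseded_by links until none remains. The sketch below assumes records are dictionaries keyed by id with superseded_by holding the id of the correcting record; the ids, the dictionary layout, and the cycle guard are illustrative choices.

```python
def correction_chain(records, start_id):
    """Return the ids from start_id through each successive correction."""
    chain = [start_id]
    seen = {start_id}
    current = records[start_id]
    while True:
        next_id = current.get("metadata", {}).get("superseded_by")
        # Stop at the end of the chain, on a dangling link, or on a cycle.
        if next_id is None or next_id in seen or next_id not in records:
            return chain
        chain.append(next_id)
        seen.add(next_id)
        current = records[next_id]


records = {
    "m1": {"content": "Fiscal year starts in January", "metadata": {"superseded_by": "m2"}},
    "m2": {"content": "Fiscal year starts in February", "metadata": {"superseded_by": "m3"}},
    "m3": {"content": "Fiscal year starts on February 1", "metadata": {}},
}
chain = correction_chain(records, "m1")
latest = records[chain[-1]]["content"]
```

The full chain answers "what was said, and what replaced it", while the last element alone gives the currently accepted statement.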

Relationship to Knowledge Capture

Memory is the universal store. An insight (captured via capture_insight) is a subtype of memory: one that may carry proposed catalog changes. Knowledge, however, is broader than catalog mutations. Domain context like "we have two selling seasons" is institutional knowledge that does not map to a DataHub tag or description update. The apply_knowledge tool is where the two are differentiated: it reviews memories and promotes the appropriate ones into durable DataHub entities.