TemporalStore for Context Extraction and Ingestion

The extraction loop

The important shift is that ingestion happens on both sides of the model call. Before the model responds, MatrixArk can ingest user intent, active workspace, selected entity, retrieved sources, tool traces, and customer events. After the model responds, MatrixArk can ingest the final answer, accepted corrections, rejected suggestions, tool outcomes, and new commitments. That is how context becomes living memory instead of one-time retrieval.

This is where MatrixArk differs from a simple memory API. It does not only "save a memory." It extracts typed facts, chooses a temporal namespace, writes indexable serving records, marks summaries dirty, updates retrieval metadata, and stores enough audit state to replay the model input later.

Before LLM hook: query, hints, active entity, tool traces
  -> MatrixArk extraction: entities, event type, time, filters, source refs
  -> TemporalStore ingest: nodes, events, indexes, dirty summaries
  -> Retrieve context pack: TemporalStore embeddings plus bounded temporal reads
  -> Final model answer: prompt-ready context and citations
  -> Feedback hook: accepted answer, corrections, commitments
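
The loop can be sketched as a thin wrapper around the model call. This is a minimal illustration, not the MatrixArk API: `MemoryLog` and `call_with_hooks` are hypothetical names, and the model is any callable that maps a prompt to an answer.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MemoryLog:
    """Hypothetical in-memory stand-in for the ingest side of MatrixArk."""
    records: List[dict] = field(default_factory=list)

    def ingest(self, source: str, text: str) -> None:
        self.records.append({"source": source, "text": text})


def call_with_hooks(query: str, model: Callable[[str], str], memory: MemoryLog) -> str:
    # Before the model call: capture the user's intent as context.
    memory.ingest("before_llm_hook", query)
    answer = model(query)
    # After the model call: write the final answer back as feedback.
    memory.ingest("feedback", answer)
    return answer


memory = MemoryLog()
reply = call_with_hooks(
    "Can we buy another GPU batch for Project 1?",
    lambda prompt: "Buy only if finance confirms remaining GPU budget.",
    memory,
)
```

The point of the wrapper is ordering: both writes happen in the same call path as the model, so the next request can already read what this one produced.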

Example: one GPU purchase conversation becomes serving state

The Python end-to-end example is intentionally small, but it mirrors a real product loop: approval arrives before the model call, tool output adds spend evidence, retrieval builds a compact context pack, and feedback writes the final answer back into memory.

Before LLM:
  text = "Alice approved the GPU purchase request for Project 1. Budget is 42000 USD."
  source = before_llm_hook
  hints = { team: infra_team, project: project_1, actor: alice }

Tool result:
  text = "The team spent 39000 USD on cloud GPU capacity last month for Project 1."
  source = tool_result_hook

Retrieve:
  query = "Can we buy another GPU batch for Project 1?"
  max_prompt_tokens = 120

Feedback:
  final_answer = "Buy only if finance confirms remaining GPU budget..."
  source = feedback

What the Python runtime proves

The new matrixark/matrixark package is small, but it captures the production boundary. Callers send an IngestRequest with tenant, raw text, source, optional time, hints, and raw object reference. MatrixArk extracts the record and writes TemporalStore-facing data contracts.

Python contract	Role	TemporalStore meaning
`IngestRequest`	Raw text plus hints from query, document, tool, or final answer.	Customer-facing ingest API stays simple.
`ExtractedEvent`	Node path, event type, event time, filters, actor, confidence, importance.	Extraction output becomes bounded serving metadata.
`ContextNode`	Stable context node with hash, parent, path, kind, summary, compact attrs.	Hierarchical context is compiled into hash-addressed serving state.
`ContextEvent`	Timestamped event with text, type, source, status, validity, attrs, source ref.	Append-only temporal memory row for prompt-time reads.
`IndexRef`	Secondary reference for event_type, status, project, team, actor, cost flags.	Declared filters become indexable lookup paths.
`SummaryDirtyMarker`	Signals that a node summary needs refresh after ingest.	Write path stays lightweight while summaries refresh async.
`ResourceDocument` / `ResourceChunk`	Parsed Markdown, PDF, or text resources with chunk offsets, pages, headings, and metadata.	Documents become prompt-ready chunks instead of opaque blobs.
`EmbeddingRecord`	Node summaries, resource chunks, VLM summaries, and query embeddings with refs and metadata.	TemporalStore can serve most context recall without requiring a separate vector database.
`ContextPackAudit`	Stores selected event ids, query plan, returned tokens, and decision notes.	Every context pack becomes replayable.
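
As an illustration of the ingest contract, the request could be modeled as a plain dataclass. The field list follows the description above (tenant, raw text, source, optional time, hints, raw object reference), but the exact field names, types, and the tenant value `acme` are assumptions, not the package's actual definition.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional


@dataclass
class IngestRequest:
    """Sketch of the customer-facing ingest contract; field names are assumed."""
    tenant: str
    text: str
    source: str
    event_time: Optional[datetime] = None           # optional time
    hints: Dict[str, str] = field(default_factory=dict)
    raw_object_ref: Optional[str] = None            # pointer to the raw artifact, if any


# The approval message from the GPU purchase example, expressed as a request.
approval = IngestRequest(
    tenant="acme",  # hypothetical tenant id
    text="Alice approved the GPU purchase request for Project 1. Budget is 42000 USD.",
    source="before_llm_hook",
    hints={"team": "infra_team", "project": "project_1", "actor": "alice"},
)
```

Note that the caller supplies only raw text and hints; nothing in the request refers to storage keys or indexes.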

Concrete ingest path

In the current Python flow, MatrixArkContextService.ingest calls a local extractor, computes stable hashes, upserts a ContextNode, writes a ContextEvent, writes secondary index refs, marks the summary dirty, and stores a summary embedding. The same sequence is the product contract for a production TemporalStore adapter.

IngestRequest
  -> RuleBasedExtractor.extract_event()
  -> stable_hash64(tenant + node_path)
  -> upsert_node(ContextNode)
  -> write_event(ContextEvent)
  -> write_index_ref(IndexRef)
  -> mark_summary_dirty(SummaryDirtyMarker)
  -> upsert_embedding(EmbeddingRecord for node summary)
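
A reduced version of that sequence might look like the sketch below. It is not the package's code: `InMemoryStore` is a hypothetical stand-in for a TemporalStore adapter, the extractor and embedding steps are omitted, the node path format is invented, and BLAKE2b is only one possible choice for `stable_hash64`.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


def stable_hash64(tenant: str, node_path: str) -> int:
    """Deterministic 64-bit hash of tenant + node path (BLAKE2b is an assumption)."""
    payload = f"{tenant}\x00{node_path}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(payload, digest_size=8).digest(), "big")


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a TemporalStore adapter."""
    nodes: Dict[int, str] = field(default_factory=dict)
    events: List[Tuple[int, str, str]] = field(default_factory=list)
    index_refs: Set[Tuple[str, str, int]] = field(default_factory=set)
    dirty_summaries: Set[int] = field(default_factory=set)


def ingest(store: InMemoryStore, tenant: str, node_path: str,
           event_type: str, text: str, filters: Dict[str, str]) -> int:
    node_hash = stable_hash64(tenant, node_path)
    store.nodes[node_hash] = node_path                    # upsert_node
    store.events.append((node_hash, event_type, text))    # write_event
    for key, value in filters.items():                    # write_index_ref
        store.index_refs.add((key, value, node_hash))
    store.dirty_summaries.add(node_hash)                  # mark_summary_dirty
    return node_hash


store = InMemoryStore()
first = ingest(store, "acme", "infra_team/project_1", "approval",
               "Alice approved the GPU purchase request.", {"actor": "alice"})
second = ingest(store, "acme", "infra_team/project_1", "spend",
                "The team spent 39000 USD on cloud GPU capacity.", {"project": "project_1"})
```

The sketch uses `hashlib` rather than Python's built-in `hash()` because string hashes from `hash()` are randomized per process by default, so they cannot serve as stable storage keys.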

The example flow is practical: a before-LLM hook writes an approval, a tool-result hook writes GPU spend, retrieval asks whether another GPU batch is allowed, and the feedback hook writes the final accepted answer back as memory.

Resources are ingested the same way

The latest Python path also ingests Markdown, PDF, and text resources. It parses blocks, computes a content hash, writes a resource node and a resource-ingested event, indexes team, project, mime type, source path, and content hash, then stores document-summary and chunk embeddings in TemporalStore. If a VLM is configured, it also stores VLM-flavored summary and chunk embeddings.

ResourceIngestRequest
  -> parse markdown/pdf/text into blocks
  -> write_resource_document(ResourceDocument)
  -> write_event(resource_ingested)
  -> write_resource_chunk(ResourceChunk)
  -> upsert_embedding(document_summary)
  -> upsert_embedding(resource_chunk)
  -> optional upsert_embedding(vlm_resource_chunk)
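
The parsing and hashing steps can be illustrated with a small text-only sketch. It assumes blank-line-separated blocks and SHA-256 for the content hash; `Chunk`, `content_hash`, and `chunk_blocks` are hypothetical names, and PDF parsing and embeddings are out of scope here.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    index: int
    start: int   # character offset into the original text
    end: int     # exclusive end offset
    text: str


def content_hash(text: str) -> str:
    """Hex digest used to detect unchanged resources (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunk_blocks(text: str) -> List[Chunk]:
    """Split on blank lines and keep each block's offsets into the source text."""
    chunks: List[Chunk] = []
    pos = 0
    for block in text.split("\n\n"):
        stripped = block.strip()
        if stripped:
            start = pos + (len(block) - len(block.lstrip()))
            chunks.append(Chunk(len(chunks), start, start + len(stripped), stripped))
        pos += len(block) + 2  # skip past the "\n\n" separator
    return chunks


doc = "# GPU policy\n\nPurchases above 40000 USD need finance approval.\n\nRenew quotas monthly."
chunks = chunk_blocks(doc)
doc_hash = content_hash(doc)
```

Keeping offsets alongside the chunk text is what lets a citation point back to the exact span in the source document.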

That matters for context engineering because Cursor-like or vertical-agent products can choose between two simple modes. Option 1 uses TemporalStore only, for events, summaries, resource chunks, local embeddings, freshness, replay, and token budgets. Option 2 adds a VectorDB only when broad ANN recall is needed across very large embedding collections.

How this maps to TemporalStore C++ concepts

The current TemporalStore C++ direction already has the right pieces. The feature module gives timestamped sequence storage backed by ordered timestamp keys. The temporal aggregate module shows prefix/range encoding for metric, dimension, bucket width, and bucket id. The IPS model points toward table schemas, slots, action types, time ranges, top-k, TTL, compaction, quotas, and custom config. MatrixArk context ingestion combines those ideas into a stricter context model.

Feature sequence base

Use timestamped rows for ordered context events, tool traces, memory deltas, and source changes.

Temporal aggregate indexes

Borrow encoded prefix/range keys so declared filters avoid scan-and-decode hot paths.

IPS-style schema

Use custom config for scopes, collections, TTL, top-k, quotas, compaction, and query limits.

Context pack audit

Persist selected records, blocked records, token budget, plan, and replay id for debugging and evals.
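
To make the ordered-key idea concrete, here is a sketch of how timestamped rows can be laid out so that a time window becomes a contiguous key range. The key layout (node hash followed by a millisecond timestamp) is an assumption for illustration, not TemporalStore's actual encoding, and a sorted Python list stands in for the ordered key-value store.

```python
import bisect
import struct
from typing import List


def event_key(node_hash: int, timestamp_ms: int) -> bytes:
    # Big-endian unsigned integers compare bytewise in numeric order,
    # so sorting the keys sorts events by node, then by time.
    return struct.pack(">QQ", node_hash, timestamp_ms)


def range_scan(sorted_keys: List[bytes], node_hash: int,
               start_ms: int, end_ms: int) -> List[bytes]:
    """Return keys for one node in the half-open window [start_ms, end_ms)."""
    lo = bisect.bisect_left(sorted_keys, event_key(node_hash, start_ms))
    hi = bisect.bisect_left(sorted_keys, event_key(node_hash, end_ms))
    return sorted_keys[lo:hi]


keys = sorted(event_key(n, t) for n, t in [(7, 100), (7, 250), (7, 900), (8, 150)])
window = range_scan(keys, node_hash=7, start_ms=200, end_ms=1000)
decoded = [struct.unpack(">QQ", key) for key in window]
```

Because the filter is encoded in the key prefix, the read touches only the matching range and never has to decode rows that fall outside the window.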

Why extraction must compile into strict storage

LLM extraction is flexible; serving storage cannot be. MatrixArk can use deterministic parsers, open-source extraction models, or customer-provided schema hints, but the output must compile into stable keys and declared indexes before it enters the low-latency path. Otherwise every prompt request turns into arbitrary JSON filtering.
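
One way to picture the compile step is a validator that accepts only declared filter keys and normalizes their values before anything is written. The set of declared keys below is taken from the `IndexRef` row in the contract table; the function name and normalization rules are assumptions.

```python
from typing import Dict, Mapping

# Filter keys that have a declared index; anything else is rejected up front.
DECLARED_FILTERS = frozenset({"event_type", "status", "project", "team", "actor"})


def compile_filters(extracted: Mapping[str, object]) -> Dict[str, str]:
    """Turn loosely typed extraction output into strict, indexable filter values."""
    unknown = set(extracted) - DECLARED_FILTERS
    if unknown:
        raise ValueError(f"undeclared filter keys: {sorted(unknown)}")
    # Normalize so that equal meanings map to equal index entries.
    return {key: str(value).strip().lower() for key, value in extracted.items()}


compiled = compile_filters({"project": " Project_1 ", "actor": "Alice"})
```

Rejecting undeclared keys at write time is what keeps the read path from degrading into free-form JSON filtering.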

Extraction without TemporalStore

Scattered
  -> LLM extracts fields into app code
  -> JSON blobs land in cache or database
  -> Prompt builder scans and filters later
  -> Replay requires joining logs

MatrixArk
  -> Extract once into strict records
  -> Write nodes, events, indexes, summaries
  -> Serve bounded context packs online

Read path: extraction again, but for query planning

Retrieval uses extraction too. The Python runtime's retrieve path converts a raw query into intent, time window, filters, and scope path. It embeds the query, records query embeddings, searches TemporalStore-owned summary and chunk embeddings, adds an exact scoped node hash, queries TemporalStore by node, time, and compact filters, pulls resource chunks when relevant, then builds a token-budgeted ContextPack.

RetrieveRequest
  -> RuleBasedExtractor.plan_query()
  -> embed(query)
  -> temporalstore.search_embeddings(summary and chunk vectors)
  -> candidate_node_hashes + exact scoped hash
  -> query_events(node_hashes, time window, filters, limit)
  -> query_resource_chunks(chunk ids, filters, limit)
  -> events_to_pack_items + chunks_to_pack_items(max_prompt_tokens)
  -> write_pack_audit(ContextPackAudit)

This gives MatrixArk two complementary mechanisms: embeddings find relevant nodes and chunks, while TemporalStore enforces time, scope, filters, limits, replay, and source freshness. The model sees compact current context, not a dump of every possible source.
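
The final packing step can be sketched as a greedy, newest-first selection under a token budget. This is an illustration only: `PackItem` and `build_pack` are hypothetical names, the whitespace word count is a crude stand-in for a real tokenizer, and the runtime's actual selection policy may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PackItem:
    event_time: int   # e.g. epoch seconds
    text: str


def estimate_tokens(text: str) -> int:
    # Rough approximation; a real system would use the model's tokenizer.
    return len(text.split())


def build_pack(items: List[PackItem], max_prompt_tokens: int) -> List[PackItem]:
    """Greedily keep the newest items that still fit in the token budget."""
    selected: List[PackItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.event_time, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost > max_prompt_tokens:
            continue  # skip items that would overflow the budget
        selected.append(item)
        used += cost
    return selected


candidates = [
    PackItem(1, "Alice approved the GPU purchase request for Project 1."),
    PackItem(2, "The team spent 39000 USD on cloud GPU capacity last month."),
]
pack = build_pack(candidates, max_prompt_tokens=120)
```

A greedy pass keeps the cost predictable per request; smarter policies (importance weighting, deduplication) would slot in at the sort key without changing the budget check.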

Production requirements

Customers should send raw text, hints, source refs, and optional structured plans, not TemporalStore keys.
MatrixArk should own extraction, schema validation, hash computation, index selection, and token budgeting.
TemporalStore should own append, secondary-index writes, bounded time-window reads, summary dirty markers, replay, and freshness guardrails.
Large raw artifacts should live in object storage; TemporalStore keeps source refs and serving metadata.
VectorDB should be optional. TemporalStore can own local summary and chunk embeddings; VectorDB is added when broad ANN recall is the right tool.
Feedback ingestion should write final answers, corrections, commitments, and rejected suggestions back into the timeline.

Compared with other ingestion patterns

Pattern	What usually happens	MatrixArk + TemporalStore
Vector-only ingestion	Chunk, embed, retrieve by similarity.	Also stores time, validity, event type, source refs, filters, summary dirtiness, and replay metadata.
App database logging	Raw messages and tool outputs land in product tables.	Extracted events become serving records with bounded time-window reads.
Memory library	Save and search memories with simple scopes.	Compiles memory into nodes, events, indexes, summaries, and context-pack audits.
Observability traces	Useful after failure, but not usually served before the next prompt.	Tool traces and feedback become prompt-time context and stale-action blockers.

The product message

TemporalStore makes context extraction useful in production because extraction output becomes queryable serving state immediately. MatrixArk can accept messy product events, tool traces, documents, and final answers; extract the useful temporal facts; compile them into nodes, events, indexes, summaries, and audits; and retrieve prompt-ready context with bounded latency.

Short version: extraction decides what the event means; TemporalStore decides how that meaning becomes fresh, indexed, replayable, and safe to serve in the next prompt.