TemporalStore Vision
Why TemporalStore is game-changing for LLM context engineering.
TemporalStore can be used on its own as an open-source starting point for LLM context engineering; the Rust version is planned for open-source release in July 2026. It covers session timelines, tool history, memory deltas, prompt replay, open commitments, freshness counters, long sequences, cache eligibility, and invalidation signals. That is game-changing because LLM context is not static text; it is time-aware operational state. Temporal state is still underused in context engineering, which makes it a strong place for MatrixArk to create differentiated infrastructure.
The core shift
Most production teams still split prompt context across too many systems. A vector database recalls similar chunks. Logs hold agent traces. Redis-style caches hold summaries, session state, and fast lookup keys. A transactional database holds permissions. Application code stitches together freshness, filters, retries, and fallbacks. That works until the product needs reliable context packs at high QPS.
TemporalStore turns those patterns into native serving behavior. Applications address state by namespace_name, table_name, and key, then use typed commands for session events, sequence row appends, filtered time-window reads, freshness counters, prompt replay, and context-pack assembly. The result is less context glue and more request-time intelligence. MatrixDB keeps a Redis-compatible bridge available when teams need familiar hot-state APIs beside the temporal engine.
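To make the addressing model concrete, here is a minimal in-memory sketch of state addressed by namespace, table, and key, with an append command and a filtered time-window read. The `TimelineStore` class, its method names, and the `Event` fields are assumptions made for illustration; they are not the TemporalStore SDK.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    ts: float                      # event time in seconds
    kind: str                      # e.g. "tool_call" or "user_turn"
    payload: dict = field(default_factory=dict)


class TimelineStore:
    """Toy in-memory stand-in (hypothetical, not the TemporalStore SDK)."""

    def __init__(self) -> None:
        # (namespace_name, table_name, key) -> events kept sorted by ts
        self._rows = defaultdict(list)

    def append(self, namespace, table, key, event):
        rows = self._rows[(namespace, table, key)]
        # Insert after any events with the same timestamp to keep arrival order.
        idx = bisect_right([e.ts for e in rows], event.ts)
        rows.insert(idx, event)

    def read_window(self, namespace, table, key, start, end, where=None, limit=None):
        """Return events with start <= ts <= end that pass `where`.

        When `limit` is set, only the newest `limit` matches are returned.
        """
        rows = self._rows.get((namespace, table, key), [])
        stamps = [e.ts for e in rows]
        lo, hi = bisect_left(stamps, start), bisect_right(stamps, end)
        hits = [e for e in rows[lo:hi] if where is None or where(e)]
        return hits if limit is None else hits[max(len(hits) - limit, 0):]


store = TimelineStore()
addr = ("support", "tool_history", "customer_acme")
store.append(*addr, Event(100.0, "tool_call", {"tool": "reset_router", "ok": False}))
store.append(*addr, Event(160.0, "tool_call", {"tool": "reset_router", "ok": False}))
store.append(*addr, Event(220.0, "tool_call", {"tool": "issue_refund", "ok": True}))

# Filtered time-window read: failed tool calls between t=90 and t=200.
failed = store.read_window(*addr, start=90.0, end=200.0,
                           where=lambda e: e.payload.get("ok") is False, limit=5)
```

The point of the sketch is that the window, filter, and count are arguments to the read rather than logic hidden in cache key names.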
Why now
Modern AI products are becoming more stateful. Agents need structured memory, tool timelines, retrieval feedback, policy counters, user preferences, open commitments, source freshness, and permission-aware context that stays coherent across sessions.
That workload does not look like a simple cache read. It also does not fit cleanly into a vector database, raw logs, or offline materialization because the right context can change at request time. TemporalStore is built for that middle ground: persistent online context state with high write QPS, low-latency ingestion and query targets, long sequences, flexible filters, replay, and compute/storage disaggregation.
How teams use time-aware context
Fresh prompt assembly
Fetch recent events, latest entity state, open commitments, and source freshness before building the prompt.
Stale-memory blocking
Mark old summaries, superseded facts, expired policies, and repeated failed actions so they do not enter the model context.
Agent time travel
Replay exactly what the agent saw at a previous request, including tool outputs, memories, permissions, and prompt sections.
Runtime reuse control
Separate stable prompt sections from volatile timeline state so LMCache-style systems can reuse safely and refresh only what changed.
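The stale-memory blocking pattern can be sketched as a small filter that runs before prompt assembly. The `Memory` fields (`ttl_seconds`, `superseded_by`) and the two blocking reasons are assumptions chosen for illustration, not a documented TemporalStore schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    memory_id: str
    text: str
    written_at: float                     # seconds
    ttl_seconds: Optional[float] = None   # None means no expiry
    superseded_by: Optional[str] = None   # id of the newer fact, if any


def split_fresh_and_stale(memories, now):
    """Return (fresh, blocked); blocked entries carry the reason they were held back."""
    fresh, blocked = [], []
    for m in memories:
        if m.superseded_by is not None:
            blocked.append((m.memory_id, "superseded"))
        elif m.ttl_seconds is not None and now - m.written_at > m.ttl_seconds:
            blocked.append((m.memory_id, "expired"))
        else:
            fresh.append(m)
    return fresh, blocked


memories = [
    Memory("m1", "Customer is on the Pro plan", written_at=900.0, ttl_seconds=300.0),
    Memory("m2", "Ticket summary from last month", written_at=100.0, ttl_seconds=300.0),
    Memory("m3", "Customer is on the Free plan", written_at=50.0, superseded_by="m1"),
    Memory("m4", "Prefers email contact", written_at=10.0),
]
fresh, blocked = split_fresh_and_stale(memories, now=1000.0)
```

Keeping the reason alongside each blocked id makes it possible to surface stale-memory warnings in the prompt instead of silently dropping facts.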
Before and now diagrams
The fastest way to understand TemporalStore is to compare the old spread of logs, summaries, caches, vector retrieval, and prompt code with one online system for context serving.
Diagram panels: Session and tool timelines (ContextTimelineRow); Window + count query (ContextFilter online); Context freshness counters (frequency caps).
Context engineering before and after TemporalStore
The game-changing part is that prompt engineering becomes part of a broader context layer, not a static template plus a few vector hits. TemporalStore lets the prompt builder request a typed, time-aware context pack: what happened, what changed, what is still open, what is stale, and which prompt sections can safely be reused.
| Use case | Before | With TemporalStore context | Prompt change |
|---|---|---|---|
| Support reply | A generic instruction plus top-k help docs and the latest ticket summary. | Account timeline, last failed troubleshooting steps, open refund promise, escalation status, policy version, and stale-memory warnings. | The prompt tells the model what not to repeat, which promise must be honored, and which policy is current before drafting the reply. |
| Legal or compliance answer | Retrieved contract chunks, sometimes mixing old clauses, drafts, and approvals. | Document versions as-of the question time, approval events, matter timeline, permission scope, and conflicting newer drafts. | The prompt says: answer only from clauses valid at that time, cite the approved version, and flag later changes separately. |
| Security investigation | Similar incident summaries and raw alert logs pasted into the prompt. | Ordered alert timeline, identity changes, asset state, analyst actions, tool errors, containment status, and repeated failure counters. | The model can propose the next action because it sees sequence, attempted actions, and current containment state, not only similar text. |
| Sales or success copilot | CRM notes plus a generic account summary that may miss recent support pain. | Usage deltas, renewal commitments, unresolved tickets, sentiment changes, executive promises, and last-touch timeline. | The prompt can avoid a tone-deaf upsell and generate an outreach grounded in current account risk. |
| LMCache / KV-cache reuse | Cache the whole prompt prefix blindly, or skip cache because context may be stale. | Stable policy sections, volatile memory sections, source version hashes, cache eligibility, and invalidation signals. | The runtime reuses stable prompt parts while refreshing customer timeline, permissions, open commitments, and changed source context. |
get_context_pack(
    vertical = "support",
    task = "draft_refund_reply",
    entity_id = "customer_acme",
    as_of_time = "now",
    token_budget = 6000,
    include = [
        "open_commitments",
        "failed_tool_attempts",
        "policy_at_time",
        "stale_memory_warnings",
        "cache_eligibility"
    ]
)
prompt_sections:
    system: stable support policy v12
    context: customer timeline + current entitlement + refund promise
    do_not_repeat: troubleshooting steps already tried and failed
    guardrails: stale memories blocked; permissions checked
    cache_policy: reuse policy prefix, refresh customer-specific context
Before, prompt engineering meant writing better wording and gluing together retrieval results. With TemporalStore, prompt engineering becomes context engineering: query the right temporal facts, decide what to trust or ignore, compress them into sections, protect freshness, coordinate runtime reuse, and replay the exact inputs later.
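One step in that list, compressing facts into sections under a token budget, can be sketched as a greedy assembler. This is an illustration only: the `Section` type, the priority scheme, and the whitespace-based token estimate are assumptions, and a real system would use the model's tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    text: str
    priority: int  # lower number = more important


def estimate_tokens(text):
    # Crude stand-in for a tokenizer: count whitespace-separated words.
    return len(text.split())


def assemble_pack(sections, token_budget):
    """Greedily include sections by priority until the budget is exhausted."""
    included, dropped, used = {}, [], 0
    for s in sorted(sections, key=lambda s: s.priority):
        cost = estimate_tokens(s.text)
        if used + cost <= token_budget:
            included[s.name] = s.text
            used += cost
        else:
            dropped.append(s.name)
    return {"prompt_sections": included, "dropped": dropped, "tokens_used": used}


pack = assemble_pack(
    [
        Section("system", "stable support policy v12", 0),
        Section("open_commitments", "refund promised on the last call", 1),
        Section("failed_tool_attempts", "router reset tried twice and failed", 2),
        Section("old_summary", "long generic account summary that adds little value here", 3),
    ],
    token_budget=16,
)
```

Recording which sections were dropped keeps the pack replayable: a later eval can see what the model did not get as well as what it did.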
What changes for builders
| Old pattern | TemporalStore pattern | Why it matters |
|---|---|---|
| One pipeline per context family | Typed online models for windows, counters, sequences, and context | Teams add new prompt and memory logic without rebuilding the data path each time. |
| Cache keys encode business logic | SDK commands expose namespace, table, key, filters, and time windows | The product surface is explicit instead of hidden inside naming conventions. |
| Precompute every useful window | Query filtered windows and sequence rows online | Applications can ask fresh questions when the request arrives. |
| Recent state is fast but fragile | Persistent state plus multi-layer cache | Hot reads stay fast while retained data remains recoverable and queryable. |
| One primary absorbs most reads | Replicas can serve reads when freshness policy allows | Read QPS can scale with the workload instead of bottlenecking on one owner. |
| Many systems create many oncall paths | One serving system owns temporal data models, cache, persistence, and recovery | Teams reduce operational surface area and maintenance load. |
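The replica row in the table above implies a routing decision at read time. A minimal sketch, assuming each replica reports the timestamp of the last update it applied and each read carries a maximum tolerated staleness; `Replica`, `choose_reader`, and `max_staleness_s` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    applied_ts: float  # timestamp of the latest update this replica has applied


def choose_reader(primary_ts, replicas, max_staleness_s):
    """Pick the freshest replica within the staleness bound, else the primary."""
    eligible = [r for r in replicas if primary_ts - r.applied_ts <= max_staleness_s]
    if not eligible:
        return "primary"
    return max(eligible, key=lambda r: r.applied_ts).name


replicas = [Replica("r1", 995.0), Replica("r2", 999.5)]
relaxed = choose_reader(primary_ts=1000.0, replicas=replicas, max_staleness_s=1.0)
strict = choose_reader(primary_ts=1000.0, replicas=replicas, max_staleness_s=0.1)
```

With a one-second bound the read goes to `r2`; with a tighter bound no replica qualifies and the read falls back to the primary.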
Strong LLM context use cases
Agent time travel
Replay exactly what the model saw: user turns, retrieved sources, tool outputs, memory deltas, permissions, prompt sections, and committed actions.
Freshness-aware prompts
Decide whether a memory, source, summary, profile, or retrieved chunk is current enough to spend tokens on right now.
Open commitments
Track unresolved promises, pending follow-ups, failed tool attempts, escalations, approvals, and workflow state across sessions.
Prompt replay and evals
Run new prompts and models against historical context packs instead of relying only on synthetic examples or raw logs.
Cache eligibility
Mark stable prompt sections for LMCache-style reuse while refreshing volatile memories, changed sources, and permission-sensitive context.
Memory governance
Block stale, conflicting, low-confidence, unauthorized, or superseded memories before they enter the prompt.
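The cache-eligibility use case can be sketched by hashing each prompt section and comparing against the hashes seen when the cache was filled. The sketch assumes prefix-style reuse, where a changed or volatile section forces everything after it to be recomputed; the function names and the `volatile` set are illustrative, and this is not an LMCache or TemporalStore API.

```python
import hashlib


def section_hash(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reuse(sections, cached_hashes, volatile):
    """Split ordered (name, text) sections into a reusable prefix and a refresh list."""
    reuse, refresh = [], []
    prefix_intact = True
    for name, text in sections:
        unchanged = cached_hashes.get(name) == section_hash(text)
        if prefix_intact and name not in volatile and unchanged:
            reuse.append(name)
        else:
            # Once one section must be refreshed, later sections lose prefix reuse.
            prefix_intact = False
            refresh.append(name)
    return reuse, refresh


cached = {
    "system": section_hash("You are a support assistant."),
    "policy": section_hash("Refund policy v12"),
    "customer_timeline": section_hash("No open tickets"),
}
sections = [
    ("system", "You are a support assistant."),
    ("policy", "Refund policy v12"),
    ("customer_timeline", "Refund promised yesterday"),
]
reuse, refresh = plan_reuse(sections, cached, volatile={"customer_timeline"})
```

Ordering stable sections first is the design choice that makes this work: the policy prefix is reused while the customer timeline is always refreshed.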
End-to-end path: from ingestion APIs to cache and storage
TemporalStore is valuable because ingestion, online state updates, cache, durable storage, recovery, and serving reads are designed together. Applications can write one event, a small batch, or a large batch of typed rows without creating a separate pipeline for every context type.
Diagram: the end-to-end path, grouped by stage.
Ingestion: agents, copilots, AI workspaces; Ingestion APIs (single call or batch option); SDK / proxy (namespace, table, key, model)
Online update: latest, aggregate, sequence, counter, context
Cache: Hot cache (request-path state and recent windows); Warm cache / replicas (freshness-aware read scaling)
Storage: ordered replay and recovery; Retained temporal storage (history, windows, long sequences); Shared store (rebuild, backfill, cold recovery)
Serving: key, time window, count, filter; Prompt-ready context (fresh reads for model calls); Observation console (latency, cache, lag, recovery, node health)
| Stage | What happens | Why it matters |
|---|---|---|
| Ingestion | Apps call typed APIs for single writes or batched context rows. | Teams can send events directly without building a custom stream and cache path per context type. |
| Online update | The model engine updates timelines, sequences, counters, freshness, and context state. | Context semantics live in the serving system instead of scattered application code. |
| Cache | Hot cache, warm cache, and replicas keep request-path reads fast while respecting freshness policy. | Low-latency serving does not have to give up persistence or recovery. |
| Storage | Durable streams, retained temporal records, and shared store keep history replayable. | Failures, backfills, and cold recovery are part of the product, not separate repair jobs. |
| Serving | Reads use key, window, count, and filters to return prompt-ready timelines and context. | Agent systems can ask fresh context questions at request time. |
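The ingestion and online-update stages can be sketched together: one write path accepts a single row or a batch and updates both the latest state and a sliding-window counter. `OnlineState` and its methods are hypothetical, and the sketch assumes rows arrive in time order per key and that `now` never moves backwards between reads.

```python
from collections import defaultdict, deque


class OnlineState:
    """Toy online-update engine (illustrative only)."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.latest = {}                  # key -> most recent value
        self.recent = defaultdict(deque)  # key -> timestamps still inside the window

    def ingest(self, rows):
        """rows: iterable of (key, ts, value); a single write is a batch of one."""
        for key, ts, value in rows:
            self.latest[key] = value
            self.recent[key].append(ts)

    def count_in_window(self, key, now):
        """Count events with now - window_s < ts, evicting older timestamps."""
        q = self.recent[key]
        while q and q[0] <= now - self.window_s:
            q.popleft()
        return len(q)


state = OnlineState(window_s=60.0)
state.ingest([("customer_acme", 0.0, "opened ticket")])           # single write
state.ingest([("customer_acme", 30.0, "tool failed"),             # batch write
              ("customer_acme", 50.0, "tool failed again")])
count = state.count_in_window("customer_acme", now=70.0)
```

A counter like this is the kind of signal a frequency cap or repeated-failure check would read at request time.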
Architecture innovation: storage built for temporal state
TemporalStore is not only a service layer in front of RocksDB. RocksDB is a strong embedded LSM engine for generic ordered KV, but TemporalStore's core idea is different: make the storage path understand online temporal data models, retained records, durable update streams, multi-layer cache, and replica-readable recovery.
RocksDB alone cannot satisfy the product need because it stores keys and values; it does not own the context model. TemporalStore needs to understand long sequences, filtered windows, counters, context timelines, freshness policy, hot/warm/cold cache behavior, and recovery as one serving system.
This is especially painful for hot update data. When counters, windows, and long context sequences change many times per entity, a RocksDB-backed generic KV design can have much larger write amplification: service code rewrites encoded blobs, caches mutate separately, RocksDB's LSM path adds compaction amplification, replay or repair logs duplicate updates, and offline materialization jobs often rewrite the same context state again. TemporalStore reduces that system-level amplification by writing typed deltas and retained records into a storage path built for update streams, cache refill, recovery, and model-aware reads.
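A back-of-the-envelope calculation shows why blob rewrites hurt for hot sequences. The sketch counts only application-level bytes written and ignores LSM compaction, caches, and replay logs; the update count and delta size are illustrative assumptions, not measurements of RocksDB or TemporalStore.

```python
def blob_rewrite_bytes(updates, delta_bytes, initial_bytes=0):
    """Bytes written when each update re-serializes the whole growing blob."""
    total, size = 0, initial_bytes
    for _ in range(updates):
        size += delta_bytes   # the blob grows by one encoded row
        total += size         # and the full blob is written again
    return total


def delta_append_bytes(updates, delta_bytes):
    """Bytes written when each update appends only its own typed delta."""
    return updates * delta_bytes


# Assumed workload: 1,000 appends of 64-byte rows to one entity's sequence.
rewrite = blob_rewrite_bytes(updates=1000, delta_bytes=64)   # 32,032,000 bytes
append = delta_append_bytes(updates=1000, delta_bytes=64)    # 64,000 bytes
ratio = rewrite / append                                      # 500.5
```

Under these assumptions the rewrite path writes about 500 times more bytes before any storage-engine amplification is added, because its cost grows quadratically with sequence length while the delta path grows linearly.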
Diagram: two write paths compared.
RocksDB-style KV serving: window and filter logic outside storage; Serialize value (opaque blob or latest KV); Write LSM engine (compaction write amplification); Add cache, replay, jobs, repair (more write paths and oncall surfaces).
TemporalStore purpose-built temporal storage: timeline, counter, sequence, memory; Model engine updates online state (sub-ms hot-path targets); Durable temporal storage (retained records and update streams); Shared store and replicas (recovery, cache, scalable reads).
| Need | Plain RocksDB gives | TemporalStore adds |
|---|---|---|
| Context semantics | Generic ordered KV and local persistence | Typed models for timelines, counters, sequences, freshness, and context. |
| Request-time decisions | Point/range reads over encoded keys | Online windows, filters, counts, and prompt-ready context reads. |
| Hot-update write amplification | Frequent counters, windows, and sequence updates can trigger blob rewrites, cache mutations, LSM compaction, replay logs, and materialization jobs. | Typed deltas, update streams, and retained temporal records reduce duplicate write paths across cache, storage, recovery, and serving. |
| Operational simplicity | Each service builds its own cache, repair, and replay logic | One serving system owns cache, persistence, recovery, and observability. |
| Scale and freshness | Local embedded storage inside one service path | Replicas, freshness-aware reads, shared-store recovery, and compute/storage separation. |
| Product surface | Storage library APIs | SDK concepts developers can use: namespace, table, key, typed rows, filters, and windows. |
That architectural choice is why TemporalStore can target context timelines, long context sequences, freshness counters, and LLM context as first-class online data models instead of treating every update as a generic KV rewrite.
Why it is different
Redis-style systems are excellent for fast general-purpose data structures. Vector databases are strong at semantic retrieval. Logs are strong at append-only trace capture. TemporalStore is different because it focuses on persistent online temporal state: the context that must be fresh, filtered, high-cardinality, and served in the request path.
The architecture is also designed for serious performance work: predictable latency, efficient resource use, durable recovery, and high-QPS online serving without forcing every new context question into another pipeline.
The bigger idea
TemporalStore is not just a faster cache and not just another log store. It is a serving engine for product memory: the fresh, durable, high-QPS state that agents need before they answer, act, remember, or forget. That matters for AI products and any application where the recent past changes the next action.