Context is temporal
Prompts depend on recent actions, open commitments, superseded memories, tool failures, and state transitions. TemporalStore keeps that timeline queryable so the prompt can say what to use, avoid, refresh, or replay.
TemporalStore
TemporalStore should cover most LLM context needs by itself: time-aware memory, temporal KV, latest KV, prompt replay, low-latency fetch and processing, multi-layer cache, and persistent storage. It stores what happened, when it happened, what changed, what the agent already tried, the latest values that matter now, and which context should be trusted, ignored, replayed, filtered, reused, or refreshed. LMCache-style systems can handle prefix and model-runtime KV-cache reuse while TemporalStore manages durable application memory and prompt context behind the MatrixArk context surface.
TemporalStore is strongest when it gives vertical AI products a simple customer-facing context model and a strict internal serving model. Customers describe their business world: company, team, project, ticket, matter, claim, incident, approval, cost, policy, action, source object. MatrixArk compiles that into scope hashes, declared collections, timestamp sort keys, secondary indexes, freshness windows, and request-time query budgets.
Readable paths, object types, normal JSON records, common questions, and UI hints from the vertical AI harness.
Scope hashes, collection keys, equality-prefix indexes, time ranges, retention, limits, and context-pack sections.
Bounded online reads for latest facts, recent sequences, open commitments, stale-memory checks, replay, and prompt-time context.
MatrixDB supports hot state and Redis-compatible metadata; MatrixKV protects committed truth; OLAP/HSAP handles broad scans and ad hoc analysis.
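One way to picture the compile step from readable hierarchy to serving keys is the sketch below. It is illustrative only: the `CompiledKey` layout, the `compile_key` and `storage_key` helpers, the 16-character hash prefix, and the zero-padded millisecond sort key are assumptions made for this example, not the MatrixArk key format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CompiledKey:
    scope_hash: str   # fixed-width hash of the readable hierarchy path
    collection: str   # declared collection name, e.g. "approvals"
    sort_key: str     # zero-padded timestamp so string order equals time order


def compile_key(path: list[str], collection: str, ts_ms: int) -> CompiledKey:
    """Flatten a readable hierarchy path into a flat serving key (hypothetical layout)."""
    # Normalize the path so "Acme/Legal" and "acme/legal" land in the same scope.
    canonical = "/".join(part.strip().lower() for part in path)
    scope_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return CompiledKey(scope_hash, collection, f"{ts_ms:020d}")


def storage_key(key: CompiledKey) -> str:
    """Join the parts into one string key that sorts by scope, collection, then time."""
    return f"{key.scope_hash}:{key.collection}:{key.sort_key}"


# Two approvals under the same company/team/matter, written 500 seconds apart.
a = compile_key(["Acme", "Legal", "matter-42"], "approvals", 1_700_000_000_000)
b = compile_key(["acme", "legal", "matter-42"], "approvals", 1_700_000_500_000)
```

Because the hierarchy collapses to a single hash, a read for one matter becomes a prefix scan over `scope_hash:collection:` instead of a walk through company, team, and project nodes.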
MatrixArk routes time-aware LLM context to TemporalStore: session timelines, tool-call history, prompt replay, memory deltas, open commitments, stale-memory detection, freshness counters, temporal windows, latest per-entity KV, and long behavior sequences. That is the opportunity: most LLM stacks still treat time as logs or TTLs, not as a request-time context primitive.
Product Advantages
Store tool events, retrieved context, memory diffs, prompt packs, and committed actions so teams can debug and evaluate agent behavior later.
Serving workers, typed models, durable update streams, and shared-store recovery target low-latency ingestion and request-time context queries.
Windowed counters, recency features, rejection signals, and safety limits help decide which memories should enter the prompt now.
Serve compact context packs with latest valid facts, recent deltas, blocked stale memories, and citations so models spend fewer tokens on noise.
MatrixArk can route time-aware memory to TemporalStore, hot and nearline KV to MatrixDB, and canonical facts to MatrixKV behind one product surface.
TemporalStore works beside LMCache-style runtime reuse by separating stable prompt sections from volatile timeline, permission, source-version, and memory sections that must refresh.
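The idea of a compact context pack with latest valid facts and blocked stale memories can be sketched in a few lines. Everything here is hypothetical: the `Memory` record, the `supersedes` link, the `max_age_s` freshness window, and the `use` / `blocked` / `refresh` section names are assumptions for illustration, not the TemporalStore schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    memory_id: str
    key: str                          # what the memory is about
    value: str
    written_at: int                   # epoch seconds
    supersedes: Optional[str] = None  # id of the memory this one replaces


def build_context_pack(memories: list[Memory], now: int, max_age_s: int) -> dict:
    """Split memories into sections the prompt builder can act on."""
    superseded = {m.supersedes for m in memories if m.supersedes}
    pack = {"use": [], "blocked": [], "refresh": []}
    for m in sorted(memories, key=lambda m: m.written_at):
        if m.memory_id in superseded:
            pack["blocked"].append(m.memory_id)   # a newer memory replaced it
        elif now - m.written_at > max_age_s:
            pack["refresh"].append(m.memory_id)   # too old to trust without a re-check
        else:
            pack["use"].append(m.memory_id)       # latest valid fact
    return pack


memories = [
    Memory("m1", "shipping_address", "12 Old Road", written_at=100),
    Memory("m2", "shipping_address", "7 New Street", written_at=900, supersedes="m1"),
    Memory("m3", "plan_tier", "trial", written_at=200),
]
pack = build_context_pack(memories, now=1000, max_age_s=500)
```

In this example the old address is blocked, the new address is used, and the plan tier is flagged for refresh because it falls outside the assumed 500-second freshness window.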
Why TemporalStore Is Different
TemporalStore is different because one system can record what happened, serve fresh context in the request path, replay what the model saw, and decide what memory should be trusted, ignored, reused, or refreshed. Most teams can build a demo with a vector database, a prompt template, a Redis cache, and logs. Customers feel the pain when the agent must answer with the right memory, at the right time, under the right permission, and then explain exactly why that context entered the prompt.
Logs are for inspection after failure. TemporalStore serves ordered timelines, windows, counters, and replay records during the request.
Cache keys are fast but fragile. TemporalStore gives durable, typed, queryable memory that can recover, replay, enforce freshness rules, and explain why context was used.
Semantic similarity is only one signal. TemporalStore adds recency, sequence, open commitments, repeated failures, source freshness, time-valid behavior, and cache eligibility.
Vertical AI builders can start with the Rust TemporalStore open-source release planned for July 2026, while MatrixArk keeps the broader state-engine choices behind the platform surface.
Feed LMCache-style systems stable prompt sections, reusable source packs, Redis-compatible metadata keys, invalidation signals, and freshness decisions while keeping volatile memories outside the runtime cache.
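To make "semantic similarity is only one signal" concrete, the sketch below combines a similarity score with recency decay and a penalty for repeated failures. The `Candidate` fields, the one-hour half-life, and the multiplicative formula are assumptions chosen for illustration; the source does not specify a scoring function.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    memory_id: str
    similarity: float      # semantic similarity in [0, 1], e.g. from a vector search
    age_s: float           # seconds since the memory was written
    recent_failures: int   # times this memory preceded a failed tool call


def score(c: Candidate, half_life_s: float = 3600.0, failure_penalty: float = 0.5) -> float:
    """Similarity, halved for every half-life of age and for every recent failure."""
    recency = 0.5 ** (c.age_s / half_life_s)
    return c.similarity * recency * (failure_penalty ** c.recent_failures)


def rank(candidates: list[Candidate], top_k: int) -> list[Candidate]:
    """Return the top_k candidates by combined score, best first."""
    return sorted(candidates, key=score, reverse=True)[:top_k]


old = Candidate("old", similarity=0.95, age_s=7200, recent_failures=0)
fresh = Candidate("fresh", similarity=0.80, age_s=0, recent_failures=0)
failing = Candidate("failing", similarity=0.90, age_s=0, recent_failures=2)
top = rank([old, fresh, failing], top_k=2)
```

With these inputs the freshest memory ranks first even though the two-hour-old memory has the highest raw similarity, and the memory tied to two recent failures drops out of the top two.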
The direction for TemporalStore is to extend sequence feature serving into customizable temporal context serving. Vertical AI products should be able to define their own hierarchy, typed records, indexes, filters, freshness rules, and serving guardrails, then query fresh context online without forcing normal LLM requests through offline aggregation.
Company, team, project, matter, ticket, claim, or incident layers with collections such as approvals, costs, policies, tool history, and memory deltas.
Deep customer hierarchy is compiled into scope hashes, declared indexes, sort keys, and time shards so serving avoids expensive tree walking.
Only declared indexed filters, scoped time windows, limits, query budgets, and collection caps run in the request path.
Large scans, many filters, fuzzy matches, joins, group-bys, and multi-year analytics become summaries written back into TemporalStore.
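The serving guardrails above amount to an admission check that runs before any read. A minimal sketch follows, assuming a hypothetical `CollectionSpec` declaration and an `admit` function; the field names and the choice to clamp the limit rather than reject it are illustrative decisions, not documented behavior.

```python
from dataclasses import dataclass


class QueryRejected(Exception):
    """Raised when a query does not fit the declared serving model."""


@dataclass(frozen=True)
class CollectionSpec:
    name: str
    indexed_fields: frozenset   # only these fields may appear in filters
    max_window_s: int           # widest time window allowed online
    max_limit: int              # cap on rows returned per request


@dataclass
class Query:
    collection: str
    filters: dict
    window_s: int
    limit: int


def admit(query: Query, specs: dict) -> Query:
    """Reject undeclared work; clamp the row limit to the collection cap."""
    spec = specs.get(query.collection)
    if spec is None:
        raise QueryRejected(f"undeclared collection: {query.collection}")
    undeclared = set(query.filters) - spec.indexed_fields
    if undeclared:
        raise QueryRejected(f"filters not indexed: {sorted(undeclared)}")
    if query.window_s > spec.max_window_s:
        raise QueryRejected("time window exceeds the declared maximum")
    return Query(query.collection, dict(query.filters), query.window_s,
                 min(query.limit, spec.max_limit))


specs = {"approvals": CollectionSpec("approvals", frozenset({"status", "approver"}), 86_400, 100)}
admitted = admit(Query("approvals", {"status": "open"}, window_s=3_600, limit=5_000), specs)
```

A query that filters on an undeclared field, or asks for a multi-year window, fails fast here and can be redirected to an offline job whose summary is written back for later online reads.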
TemporalStore keeps namespace, table, key, and model-aware temporal state close to the prompt-serving path. SDK writes append context rows, tool events, memory deltas, counters, and long sequences, while queries read bounded windows with filters for prompt-time assembly.
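The append-then-read-a-bounded-window pattern can be shown with an in-memory stand-in. The `Timeline` class and its `append` and `read_window` methods are hypothetical names for this sketch and are not the TemporalStore SDK; the point is only that rows stay sorted by timestamp so a window read is a binary search plus a capped scan.

```python
import bisect
from collections import defaultdict


class Timeline:
    """In-memory stand-in for an append-only, time-ordered context store."""

    def __init__(self):
        self._ts = defaultdict(list)     # (namespace, table, key) -> sorted timestamps
        self._rows = defaultdict(list)   # same key -> rows aligned with _ts

    def append(self, namespace, table, key, ts, row):
        """Insert a row at its timestamp position, tolerating late arrivals."""
        k = (namespace, table, key)
        i = bisect.bisect_right(self._ts[k], ts)
        self._ts[k].insert(i, ts)
        self._rows[k].insert(i, row)

    def read_window(self, namespace, table, key, start, end, limit, where=None):
        """Return up to `limit` rows with start <= ts <= end, newest first."""
        if limit <= 0:
            return []
        k = (namespace, table, key)
        ts, rows = self._ts.get(k, []), self._rows.get(k, [])
        lo, hi = bisect.bisect_left(ts, start), bisect.bisect_right(ts, end)
        out = []
        for i in range(hi - 1, lo - 1, -1):
            if where is None or where(rows[i]):
                out.append((ts[i], rows[i]))
                if len(out) == limit:
                    break
        return out


tl = Timeline()
tl.append("acme", "tool_events", "session-1", 10, {"tool": "search", "ok": True})
tl.append("acme", "tool_events", "session-1", 30, {"tool": "crm", "ok": False})
tl.append("acme", "tool_events", "session-1", 20, {"tool": "crm", "ok": False})  # late arrival
tl.append("acme", "tool_events", "session-1", 40, {"tool": "search", "ok": True})
failures = tl.read_window("acme", "tool_events", "session-1", 15, 40, limit=10,
                          where=lambda r: not r["ok"])
```

Here `failures` holds the two failed CRM calls in newest-first order, which is the shape a prompt builder would want when telling the model what it already tried.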
Scenario Architecture
Application events, stream consumers, SDK writes, and repair jobs write to the same entity timeline or aggregate object at production serving scale.
Serving workers update typed records, append durable update streams, and keep shared-store state ready for replay.
LLM systems request timelines, replay records, freshness counters, long behavior sequences, ad hoc filters, and prompt-ready context through one low-latency serving API.
Before and Now
Agents can read recent behavior, tool history, and memory deltas without rebuilding a new context pipeline for every prompt question.
Agents and policy systems get fresh high-cardinality context signals without locking every prompt question into a precomputed view.
Frequency cap logic becomes a shared online data model instead of scattered TTL math across services and jobs.
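A frequency cap expressed as a shared data model, rather than per-service TTL math, might look like the sketch below. The `FrequencyCap` class, its bucket size, and its bucket-granular expiry are assumptions for illustration; the window is approximate to within one bucket.

```python
from collections import defaultdict


class FrequencyCap:
    """Counts events per key in fixed time buckets and enforces a windowed cap."""

    def __init__(self, bucket_s: int, window_s: int, cap: int):
        self.bucket_s = bucket_s
        self.window_s = window_s
        self.cap = cap
        self._buckets = defaultdict(dict)   # key -> {bucket_start: count}

    def _count(self, key, now):
        """Drop buckets that ended at or before the window start, then sum the rest."""
        oldest = now - self.window_s
        buckets = self._buckets[key]
        for start in [s for s in buckets if s + self.bucket_s <= oldest]:
            del buckets[start]
        return sum(buckets.values())

    def try_record(self, key, now) -> bool:
        """Record one event if the key is under its cap; return whether it was allowed."""
        if self._count(key, now) >= self.cap:
            return False
        start = now - (now % self.bucket_s)
        buckets = self._buckets[key]
        buckets[start] = buckets.get(start, 0) + 1
        return True


limiter = FrequencyCap(bucket_s=60, window_s=300, cap=3)
results = [limiter.try_record("user-1", now) for now in (0, 10, 20, 30)]
allowed_later = limiter.try_record("user-1", 360)
```

The first three events are allowed, the fourth is refused, and by second 360 the first bucket has aged out of the five-minute window so the key is allowed again. Every service that reads the same buckets gets the same answer.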
Architecture Innovation
RocksDB is excellent embedded storage, but TemporalStore is designed around online temporal data models: hot typed state, durable update streams, retained records, multi-layer cache, replica reads, and compute/storage disaggregation. For hot, update-heavy temporal data, generic RocksDB-backed serving can create much larger write amplification across encoded blob rewrites, cache mutation, LSM compaction, replay logs, repair jobs, and downstream materializations. TemporalStore's storage path is built to keep typed deltas, retained records, cache refill, and recovery in one model-aware flow.
Implementation Direction
The performance-critical serving and storage implementation is designed around C++ for low-latency data structures, cache control, memory management, and high-QPS execution. The open-source TemporalStore direction is Rust, with the Rust version planned for open-source release in July 2026, so the community-facing implementation can emphasize safety, maintainability, and a modern systems-programming developer experience.
Use C++ where hot-path latency, memory layout, cache locality, and storage-engine integration matter most.
Open source the Rust TemporalStore track in July 2026, with safer concurrency and a clean systems API surface.
Keep the public concepts consistent: namespaces, tables, typed updates, retained records, windows, and context reads.
LLM Context
TemporalStore is not a transformer KV-cache runtime. It is the persistent online state layer around the model: the place to keep structured context, temporal memory, counters, retrieval metadata, and context signals that decide what should enter the next prompt, tool call, ranking step, or safety policy. It can integrate beside LMCache-style systems and remote cache layers that reuse model prefixes or attention KV state.
Persist recent conversation turns, user actions, tool calls, and state transitions with time-aware retention and replay.
Serve freshness, frequency, recency, and interaction signals that help choose which memories or documents should enter the model context.
Work beside LMCache or remote cache services for prefix reuse, model-runtime KV-cache reuse, repeated prompt segments, cache eligibility, and invalidation hints.
Keep ordered tool results, errors, retries, and agent decisions as long context sequences for debugging and next-step planning.
Track rate limits, abuse signals, topic frequency, user risk, and policy state with online windows and distinct counts.
Store document impressions, clicks, feedback, source freshness, and retrieval history beside vector search instead of inside the vector index.
Serve user preference deltas, recent task history, account context, and behavior sequences for personalized agents and copilots.
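The division of labor with a prefix or KV-cache runtime depends on keeping stable prompt sections ahead of volatile ones. The sketch below shows one way to do that; the `Section` type, the `assemble` function, and the use of a SHA-256 digest as a cache-eligibility key are assumptions for this example and do not describe an LMCache or MatrixArk interface.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    text: str
    volatile: bool   # True if the section must be refreshed on every request


def render(sections: list[Section]) -> str:
    return "\n\n".join(f"[{s.name}]\n{s.text}" for s in sections)


def assemble(sections: list[Section]) -> tuple[str, str]:
    """Return (prompt, prefix_hash) with all stable sections placed first."""
    prefix = render([s for s in sections if not s.volatile])
    suffix = render([s for s in sections if s.volatile])
    # The hash covers only the stable prefix, so it stays constant while
    # timeline, permission, and memory sections change between requests.
    prefix_hash = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
    prompt = "\n\n".join(part for part in (prefix, suffix) if part)
    return prompt, prefix_hash


policy = Section("policy", "Refunds over $500 need manager approval.", volatile=False)
prompt_1, hash_1 = assemble([Section("timeline", "Ticket opened 09:00.", True), policy])
prompt_2, hash_2 = assemble([Section("timeline", "Ticket escalated 09:30.", True), policy])
```

Both prompts share the same prefix hash even though their timelines differ, which is the signal a runtime cache would need to reuse the stable portion and recompute only the volatile tail.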
Cloud-Native Operations
MatrixArk can provide TemporalStore as a managed public-cloud service on AWS, GCP, or Azure, with private deployment available for customer-controlled environments.
Cluster health, deployment status, and diagnostics live behind customer access control.
Managed service delivery on AWS, GCP, and Azure, plus private cloud or on-prem when needed.
Metrics and logs stay on private networking or controlled observability integrations.
Data Models
Counts and sums over keyed time buckets for velocity, caps, and online policies.
High-cardinality filtered sum, min, max, count, and model-specific rollups over recent event state and bucketed dimensions.
Unique merchant, device, campaign, IP, or session counts inside time windows.
Long user, item, session, and agent action sequences with filters, timestamps, and high-performance online reads.
Latest user, account, tenant, document, or session attributes beside temporal context.
LLM and agent context, tool timelines, session memory, retrieval metadata, preference deltas, and safety/rate counters.
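As one example of these data models, distinct counts inside time windows can be sketched with per-bucket sets. The `WindowedDistinct` class is hypothetical, and it uses exact Python sets for clarity; a production system would more likely use a probabilistic sketch, which the source does not specify.

```python
from collections import defaultdict


class WindowedDistinct:
    """Exact distinct counts per key over fixed time buckets."""

    def __init__(self, bucket_s: int):
        self.bucket_s = bucket_s
        # key -> bucket_start -> set of distinct values seen in that bucket
        self._buckets = defaultdict(lambda: defaultdict(set))

    def add(self, key, ts, value):
        self._buckets[key][ts - ts % self.bucket_s].add(value)

    def count(self, key, now, window_s):
        """Count distinct values in buckets overlapping (now - window_s, now]."""
        oldest = now - window_s
        seen = set()
        for start, values in self._buckets.get(key, {}).items():
            if start + self.bucket_s > oldest and start <= now:
                seen |= values
        return len(seen)


merchants = WindowedDistinct(bucket_s=60)
merchants.add("card-9", 0, "m1")
merchants.add("card-9", 30, "m2")
merchants.add("card-9", 70, "m1")    # repeat merchant, different bucket
merchants.add("card-9", 400, "m3")
early = merchants.count("card-9", now=100, window_s=300)
late = merchants.count("card-9", now=420, window_s=120)
```

At second 100 the card has touched two distinct merchants; by second 420 with a two-minute window only the latest merchant remains in range.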
| Alternative | Good at | TemporalStore difference |
|---|---|---|
| Redis / Redis Enterprise | Fast cache, strings, hashes, modules, and ephemeral serving patterns | TemporalStore adds typed timelines, filters, temporal windows, replayable prompt context, and durable shared-store recovery; MatrixDB keeps the Redis-compatible hot-state bridge. |
| Vector database | Semantic retrieval over chunks and embeddings | TemporalStore decides which memories, events, freshness signals, and timelines should enter the prompt now. |
| Logs plus cache | Trace capture and fast temporary state | Replayable context state, memory deltas, freshness counters, and prompt-ready timelines in one serving path. |
| Feature store | Feature definitions, lineage, training sets, and online lookups | TemporalStore focuses on LLM context timelines and prompt-time temporal state, with feature-serving patterns available as a secondary workload. |
| Prompt management | Prompt templates, versions, evals, and test cases | TemporalStore provides the live context substrate: memories, tool history, freshness, permissions, and replayable prompt inputs. |
| LLM observability | Trace collection, cost tracking, latency, and debugging views | TemporalStore governs context before the model call and stores enough state to replay why that context was selected. |
TemporalStore is the primary serving engine for time-aware LLM context engineering: timelines, memory deltas, tool events, temporal KV, latest KV, prompt replay, freshness counters, long sequences, persistence, and multi-layer caching. MatrixDB supplies the database layer only when teams need Redis-compatible online/offline/nearline KV, large profile or summary records, scans, exports, cheaper persisted storage, multi-tenancy, tens of millions of QPS, and familiar application APIs. MatrixKV supplies low-volume transactional KV when a permission, version, lease, approval, ownership record, or committed action must be correct.