Graphiti – LLM-Powered Temporal Knowledge Graphs

Hey LLMDevs! We're Paul, Preston, and Daniel from Zep AI. We've just open-sourced Graphiti, a Python library for building temporal Knowledge Graphs using LLMs.

Graphiti helps you create and query graphs that evolve over time. Think of a knowledge graph as a network of interconnected facts, such as “Kendra loves Adidas shoes.” Each fact is a “triplet”, represented by two entities, or nodes (“Kendra”, “Adidas shoes”), and their relationship, or edge (“loves”). Knowledge Graphs have been explored extensively for information retrieval.
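
As a rough mental model (illustrative only, not Graphiti's internal types), a fact triplet can be sketched in a few lines of Python:

    from typing import NamedTuple

    class Triplet(NamedTuple):
        subject: str    # node, e.g. "Kendra"
        predicate: str  # edge, e.g. "LOVES"
        object: str     # node, e.g. "Adidas shoes"

    fact = Triplet("Kendra", "LOVES", "Adidas shoes")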

https://i.redd.it/ewl7t9lussmd1.gif

What makes Graphiti unique is its ability to automatically build a knowledge graph while handling changing relationships and maintaining historical context.

At Zep, we build a memory layer for LLM applications. Developers use Zep to recall relevant user information from past conversations without including the entire chat history in a prompt. Accurate context is crucial for LLM applications. If an AI agent doesn't remember that you've changed jobs or confuses the chronology of events, its responses can be jarring or irrelevant, or worse, inaccurate.

Zep's Broken Fact Pipeline

Before Graphiti, our approach to storing and retrieving user “memory” was, in effect, a specialized RAG pipeline. An LLM extracted “facts” from a user’s chat history. Semantic search, reranking, and other techniques then surfaced facts relevant to the current conversation back to a developer for inclusion in their prompt.
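
In rough pseudocode, that pipeline looked something like this (a simplified sketch, not Zep's production code; the prompts and function names are invented for illustration):

    # Sketch of the legacy fact pipeline (illustrative; names are hypothetical).
    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def extract_facts(chat_history: str) -> list[str]:
        # A single LLM call pulls discrete facts out of the conversation.
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": "Extract discrete facts about the user, one per line."},
                {"role": "user", "content": chat_history},
            ],
        )
        return resp.choices[0].message.content.splitlines()

    def embed(text: str) -> np.ndarray:
        # Each fact is embedded so it can be recalled via semantic search.
        resp = client.embeddings.create(model="text-embedding-3-small", input=text)
        return np.asarray(resp.data[0].embedding)

    def recall(query: str, facts: list[str], k: int = 5) -> list[str]:
        # Rank stored facts by cosine similarity to the query; reranking and
        # other tricks followed in the real pipeline. (Re-embedding on every
        # call is wasteful; a real system would cache fact embeddings.)
        q = embed(query)
        def score(f: str) -> float:
            v = embed(f)
            return float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v))
        return sorted(facts, key=score, reverse=True)[:k]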

We attempted to reconcile how new information may change our understanding of existing facts:

Fact: “Kendra loves Adidas shoes”

User message: “I’m so angry! My favorite Adidas shoes fell apart! Pumas are my new favorite shoes!”

Facts:

  • “Kendra used to love Adidas shoes but now prefers Puma.”
  • “Kendra’s Adidas shoes fell apart.”
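
The reconciliation step was itself an LLM call, roughly along these lines (again a hedged sketch reusing the client from the snippet above; the real prompts were more involved):

    def reconcile(existing_facts: list[str], new_message: str) -> list[str]:
        # Ask the model to merge new information into the fact list, rewriting
        # facts the new message contradicts. This is the step that degraded as
        # conversations grew longer and more complex.
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": ("Given existing facts about the user and a new "
                             "message, return the updated list of facts, one "
                             "per line. Rewrite any fact the message contradicts.")},
                {"role": "user",
                 "content": ("Facts:\n" + "\n".join(existing_facts)
                             + "\n\nNew message:\n" + new_message)},
            ],
        )
        return resp.choices[0].message.content.splitlines()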

Unfortunately, this approach became problematic. Reconciling facts from increasingly complex conversations challenged even frontier LLMs such as gpt-4o. We saw incomplete facts, poor recall, and hallucinations. Our RAG search also failed at times to capture the nuanced relationships between facts, leading to irrelevant or contradictory information being retrieved.

Enter Graphiti

We tried fixing these issues with prompt optimization but saw diminishing returns on effort. We realized that a graph would help model a user’s complex world, potentially addressing these challenges.

We were intrigued by Microsoft’s GraphRAG, which expanded on RAG text chunking with a graph to better model a document corpus. However, it didn't solve our core problem: GraphRAG is designed for static documents and doesn't natively handle temporality.

So, we built Graphiti, designed from the ground up to handle constantly changing information, support hybrid semantic and graph search, and scale to large datasets (a usage sketch follows the list below):

  • Temporal Awareness: Tracks changes in facts and relationships over time. Graph edges include temporal metadata to record relationship lifecycles.
  • Episodic Processing: Ingests data as discrete episodes, maintaining data provenance and enabling incremental processing.
  • Hybrid Search: Semantic and BM25 full-text search, with the ability to rerank results by distance from a central node.
  • Scalable: Designed for large datasets, parallelizing LLM calls for batch processing while preserving event chronology.
  • Varied Sources: Ingests both unstructured text and structured data.
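
To make these bullets concrete, here's a minimal usage sketch adapted from the project README at the time of writing (treat the repo as the source of truth for the current API; the Neo4j connection details are placeholders):

    import asyncio
    from datetime import datetime, timezone

    from graphiti_core import Graphiti

    async def main():
        # Graphiti persists the graph to Neo4j; credentials are placeholders.
        graphiti = Graphiti("bolt://localhost:7687", "neo4j", "password")

        # Episodic processing: each message is ingested as a discrete episode,
        # and reference_time drives the temporal metadata on extracted edges.
        await graphiti.add_episode(
            name="chat_turn_42",
            episode_body="My Adidas shoes fell apart! Pumas are my new favorites.",
            source_description="chat message",
            reference_time=datetime.now(timezone.utc),
        )

        # Hybrid search combines semantic similarity and BM25; results are
        # edges whose facts carry their validity intervals. Passing a
        # center_node_uuid (e.g., the user's node) reranks by graph distance.
        results = await graphiti.search("What shoes does Kendra like?")
        for edge in results:
            print(edge.fact, edge.valid_at, edge.invalid_at)

    asyncio.run(main())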

See Graphiti in action: https://youtu.be/sygRBjILDn8

Graphiti has significantly improved our ability to maintain accurate user context. It does a far better job of fact reconciliation over long, complex conversations. Node distance reranking, which places the user at the center of the graph, has also been a valuable tool. We may share quantitative evaluation results in a future post.

A Key Learning: Task Decomposition

Getting Graphiti to be both accurate and performant was challenging. We were surprised by how often LLMs created spurious relationships or self-referential facts. Maintaining a consistent graph schema while allowing for flexible fact extraction was a difficult balancing act.

One of our key learnings was to avoid using LLMs for large, complex tasks. That approach works for prototypes but is often too slow and unpredictable for production. For example, we initially used a single prompt to extract all fact triplets from a new episode, providing the entire current graph as context.

Instead, we broke down big tasks into smaller subtasks. We use the LLM as a precise tool with a focused subset of inputs and outputs. For the example above, we created separate prompts to handle extracting entities, reconciling them with existing nodes, and determining fact timestamps. This improved both accuracy and latency. Small tasks are easier to run in parallel, and inference is faster with shorter LLM prompts and outputs.
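
As an illustration of the decomposition (hypothetical prompts and helper names, not Graphiti's actual internals), independent subtasks can run concurrently:

    import asyncio
    from openai import AsyncOpenAI

    client = AsyncOpenAI()

    async def call_llm(system: str, user: str) -> str:
        # Each subtask gets its own short, focused prompt.
        resp = await client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "system", "content": system},
                      {"role": "user", "content": user}],
        )
        return resp.choices[0].message.content

    async def process_episode(episode: str, nearby_nodes: list[str]) -> tuple[str, str]:
        # Entity extraction and timestamp extraction are independent, so they
        # run in parallel. Entity resolution then sees only candidate nodes
        # near the new episode, not the entire graph.
        entities, timestamps = await asyncio.gather(
            call_llm("List the entities mentioned in this message, one per line.",
                     episode),
            call_llm("List any dates or times this message refers to, one per line.",
                     episode),
        )
        resolved = await call_llm(
            "Match each entity below to one of these existing nodes, or answer NEW:\n"
            + "\n".join(nearby_nodes),
            entities,
        )
        return resolved, timestamps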

Coming Next

Work is ongoing, including:

  1. Improving support for faster and cheaper small language models.
  2. Exploring fine-tuning to improve accuracy and reduce latency.
  3. Adding new querying capabilities, including search over neighborhood (sub-graph) summaries.

Getting Started

Graphiti is open source and available on GitHub: https://git.new/graphiti.

We'd love to hear your thoughts. Please also consider contributing!