Intermediate

Knowledge Graphs

Knowledge graphs represent information as interconnected entities and relationships, enabling AI systems to reason about connections that keyword search alone tends to miss.

What Is a Knowledge Graph?

A knowledge graph is a structured representation of real-world entities (people, documents, concepts, projects) and the relationships between them. Unlike flat databases or document stores, knowledge graphs capture:

  • Entities: Named things with properties (e.g., "Project Alpha", type: project, status: active)
  • Relationships: Typed connections between entities (e.g., "Alice" —MANAGES→ "Project Alpha")
  • Context: Metadata about when, how, and why connections exist

Think of it this way: A document search finds pages that mention "Alice." A knowledge graph tells you that Alice manages Project Alpha, reports to Bob, is an expert in machine learning, and authored the Q3 architecture proposal.
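
One way to picture these three ingredients is as plain data structures. The sketch below is illustrative only: the class names `Entity` and `Relationship`, the `outgoing` helper, and the `since` value are assumptions made for this example, not part of any particular graph library.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    source: str   # name of the entity the edge starts from
    type: str     # typed connection, e.g. "MANAGES"
    target: str   # name of the entity the edge points to
    context: dict = field(default_factory=dict)  # when/how/why metadata


def outgoing(relationships, name):
    """Return every relationship that starts at the named entity."""
    return [r for r in relationships if r.source == name]


alpha = Entity("Project Alpha", "project", {"status": "active"})
alice = Entity("Alice", "person")

relationships = [
    # The "since" value is made up purely for illustration.
    Relationship("Alice", "MANAGES", "Project Alpha", {"since": "2024-01"}),
    Relationship("Alice", "REPORTS_TO", "Bob"),
]

for rel in outgoing(relationships, "Alice"):
    print(f"{rel.source} -{rel.type}-> {rel.target}")
```

A graph database stores the same shape natively and adds indexing and traversal on top of it.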

AI-Powered Entity Extraction

LLMs excel at extracting structured entities and relationships from unstructured text. Here is a practical approach:

Python
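import json

import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

def extract_entities(text):
    """Ask the model for entities and relationships and parse its JSON reply."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,
        messages=[{"role": "user", "content": f"""
Extract entities and relationships from this text.
Return only JSON with 'entities' and 'relationships' arrays.

Text: {text}
"""}]
    )
    # json.loads raises json.JSONDecodeError if the reply is not valid JSON
    # (for example, if the model wraps it in prose), so callers should handle that.
    return json.loads(response.content[0].text)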
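
A model reply is not guaranteed to be bare JSON; it may arrive wrapped in a Markdown code fence or be missing one of the two arrays. The following is a minimal sketch of a more defensive parsing step. `parse_extraction` is a hypothetical helper name, and the checks shown are one reasonable choice rather than a complete validation scheme.

```python
import json

FENCE = "`" * 3  # a Markdown code fence marker


def parse_extraction(reply_text):
    """Parse a model reply into {'entities': [...], 'relationships': [...]}."""
    text = reply_text.strip()

    # Drop a surrounding Markdown code fence if the model added one.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # skip the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]         # skip the closing fence line
        text = "\n".join(lines)

    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    # Treat a missing array as empty, but reject values of the wrong type.
    entities = data.get("entities", [])
    relationships = data.get("relationships", [])
    if not isinstance(entities, list) or not isinstance(relationships, list):
        raise ValueError("'entities' and 'relationships' must be arrays")

    return {"entities": entities, "relationships": relationships}
```

Passing `response.content[0].text` through a function like this, instead of calling `json.loads` on it directly, keeps format problems in one place where they can be logged or retried.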

Graph Database Options

Database       | Type           | Best For                       | Query Language
Neo4j          | Property Graph | Complex traversals, enterprise | Cypher
Amazon Neptune | Property + RDF | AWS-native, managed            | Gremlin / SPARQL
ArangoDB       | Multi-model    | Graph + document combined      | AQL
Memgraph       | Property Graph | Real-time, streaming data      | Cypher

Building a Knowledge Graph Pipeline

  1. Ingest Documents

    Collect documents from all sources: wikis, Confluence, Notion, Slack exports, email archives, meeting transcripts.

  2. Extract Entities

    Use LLMs to identify people, projects, technologies, decisions, and other relevant entities from each document.

  3. Identify Relationships

    Extract connections between entities: who works on what, which decisions affect which projects, what depends on what.

  4. Resolve & Deduplicate

    Merge references to the same entity (e.g., "ML team", "machine learning group", "the ML folks" are the same entity).

  5. Store & Index

    Load the graph into a graph database with proper indexing for fast traversal queries.

  6. Query & Explore

    Use graph queries and LLM-powered natural language interfaces to explore the knowledge graph.
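
Steps 4 and 5 can be sketched in a few lines of plain Python. This is a simplified illustration, not a production approach: the `ALIASES` table is hand-written here and stands in for whatever resolution method is actually used (string similarity, embeddings, or an LLM pass), the in-memory dictionary stands in for a real graph database, and the example triples are invented.

```python
from collections import defaultdict

# Hypothetical alias table mapping normalized mentions to a canonical name.
ALIASES = {
    "ml team": "ML Team",
    "machine learning group": "ML Team",
    "the ml folks": "ML Team",
}


def canonical(name):
    """Map a mention to its canonical entity name, if an alias is known."""
    key = " ".join(name.lower().split())  # lowercase and collapse whitespace
    return ALIASES.get(key, name.strip())


def resolve(triples):
    """Rewrite (source, relation, target) triples to canonical names, dropping duplicates."""
    seen = set()
    resolved = []
    for source, relation, target in triples:
        triple = (canonical(source), relation, canonical(target))
        if triple not in seen:
            seen.add(triple)
            resolved.append(triple)
    return resolved


def build_index(triples):
    """Build an adjacency index: entity -> list of (relation, neighbor) outgoing edges."""
    index = defaultdict(list)
    for source, relation, target in triples:
        index[source].append((relation, target))
    return dict(index)


# Invented example data: three mentions that refer to the same team.
raw = [
    ("ML team", "OWNS", "Recommendation Service"),
    ("machine learning group", "OWNS", "Recommendation Service"),
    ("the ML folks", "USES", "PyTorch"),
]

graph = build_index(resolve(raw))
print(graph)
```

After resolution the three mentions collapse into a single "ML Team" node with two outgoing edges, which is the property that makes later traversal queries meaningful.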

Querying with Natural Language

Combine knowledge graphs with LLMs to enable natural language queries:

Cypher (Neo4j)
// User asks: "Who are the experts on microservices?"
// LLM generates Cypher:
MATCH (p:Person)-[:EXPERT_IN]->(t:Topic {name: "microservices"})
RETURN p.name, p.team, p.email
ORDER BY p.expertise_score DESC
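
Around a query like the one above, the application usually needs two small pieces: a prompt that tells the LLM what the graph looks like, and a check on the generated Cypher before it is executed. The sketch below shows one possible shape for both. `SCHEMA`, `build_cypher_prompt`, and `looks_read_only` are hypothetical names, the schema string is inferred from the example query, and the keyword check is deliberately conservative; it is not a substitute for running queries with read-only database credentials.

```python
import re

# Assumed schema description, inferred from the example query.
SCHEMA = "(:Person {name, team, email, expertise_score})-[:EXPERT_IN]->(:Topic {name})"


def build_cypher_prompt(question, schema=SCHEMA):
    """Build a prompt asking the model to translate a question into Cypher."""
    return (
        "Translate the question into a single read-only Cypher query.\n"
        f"Graph schema: {schema}\n"
        f"Question: {question}\n"
        "Return only the Cypher query."
    )


# Clauses that can modify data or call arbitrary procedures.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV|CALL)\b",
    re.IGNORECASE,
)


def looks_read_only(cypher):
    """Keyword check; may also reject safe queries that mention these words in strings."""
    return WRITE_CLAUSES.search(cypher) is None


generated = """
MATCH (p:Person)-[:EXPERT_IN]->(t:Topic {name: "microservices"})
RETURN p.name, p.team, p.email
ORDER BY p.expertise_score DESC
"""

print(build_cypher_prompt("Who are the experts on microservices?"))
print(looks_read_only(generated))
```

Rejecting anything that is not clearly a read query is a cheap first line of defense when the query text comes from a model rather than from a developer.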

Knowledge Graph + RAG

Knowledge graphs enhance RAG by providing structured context alongside document retrieval. Instead of just finding relevant text chunks, you can traverse the graph to find related entities, decisions, and context that enrich the LLM's answer.

GraphRAG pattern: First retrieve relevant graph nodes, then pass the graph context together with the retrieved documents to the LLM. This can produce more connected, contextual answers than document-only RAG.
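
The pattern can be sketched with an in-memory graph. Everything here is a simplifying assumption: entity names are matched by substring (a real system would more likely use entity linking or vector search), traversal stops after one extra hop, `graph_context` and `build_prompt` are hypothetical helper names, and the "USES Kafka" edge and document text are invented sample data.

```python
def graph_context(question, graph, max_facts=10):
    """Collect facts about entities named in the question, plus one extra hop."""
    q = question.lower()
    seeds = [name for name in graph if name.lower() in q]

    facts = []
    for entity in seeds:
        for relation, neighbor in graph[entity]:
            facts.append(f"{entity} {relation} {neighbor}")
            # One extra hop: facts about the neighbor, if it has outgoing edges.
            for next_relation, next_neighbor in graph.get(neighbor, []):
                facts.append(f"{neighbor} {next_relation} {next_neighbor}")

    # Remove duplicates while preserving order, then cap the list.
    return list(dict.fromkeys(facts))[:max_facts]


def build_prompt(question, graph, documents):
    """Combine graph facts and retrieved documents into a single LLM prompt."""
    facts = graph_context(question, graph)
    parts = ["Graph facts:"] + [f"- {fact}" for fact in facts]
    parts += ["", "Documents:"] + [f"- {doc}" for doc in documents]
    parts += ["", f"Question: {question}"]
    return "\n".join(parts)


# Adjacency index: entity -> list of (relation, neighbor). Sample data only.
graph = {
    "Alice": [("MANAGES", "Project Alpha"), ("REPORTS_TO", "Bob")],
    "Project Alpha": [("USES", "Kafka")],
}
documents = ["Q3 architecture proposal: Project Alpha moves to event streaming."]

prompt = build_prompt("What does Alice work on?", graph, documents)
print(prompt)
```

The resulting prompt carries facts such as the link between Alice and Project Alpha even if no retrieved document states it directly, which is the added value over document-only retrieval.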