Intermediate

Querying & Search

Perform similarity searches, filter by metadata, tune top-k parameters, and optimize query performance in Pinecone.

Basic Similarity Search

Query Pinecone by providing a vector and the number of results you want (top_k):

Python
# Query with a vector
query_embedding = get_embedding("What is deep learning?")

results = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True
)

# Process results
for match in results["matches"]:
    print(f"Score: {match['score']:.4f}")
    print(f"ID: {match['id']}")
    print(f"Text: {match['metadata']['text']}")
    print("---")

# Output:
# Score: 0.9234  ID: doc-12  Text: Deep learning uses neural networks...
# Score: 0.8876  ID: doc-45  Text: Neural networks are the foundation...
# Score: 0.8543  ID: doc-23  Text: Machine learning includes...
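Once you have matches, a common next step is to assemble the retrieved texts into a single context string for an LLM prompt. A minimal sketch, assuming each match carries its text under metadata["text"] as in the example above (build_context and the max_chars budget are hypothetical, not part of the Pinecone client):

```python
def build_context(matches, max_chars=4000):
    """Concatenate retrieved texts, highest score first, up to a size budget."""
    parts = []
    total = 0
    for match in matches:  # Pinecone returns matches sorted by score
        text = match["metadata"]["text"]
        if total + len(text) > max_chars:
            break  # stop before exceeding the prompt budget
        parts.append(text)
        total += len(text)
    return "\n---\n".join(parts)

# Example with query-response-shaped dicts
matches = [
    {"id": "doc-12", "score": 0.92, "metadata": {"text": "Deep learning uses neural networks."}},
    {"id": "doc-45", "score": 0.88, "metadata": {"text": "Neural networks are the foundation."}},
]
context = build_context(matches)
```

The character budget is a crude stand-in for token counting; in production you would measure tokens with your LLM's tokenizer instead.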

Metadata Filtering

Combine vector similarity with metadata filters to narrow results. Pinecone applies the filter during the search itself (single-stage filtering), so you get an accurate top_k without a separate pre- or post-filtering pass:

Python
# Filter by exact match
results = index.query(
    vector=query_embedding,
    top_k=5,
    filter={"category": "machine-learning"},
    include_metadata=True
)

# Filter with operators
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={
        "$and": [
            {"category": {"$in": ["ml", "ai", "deep-learning"]}},
            {"date": {"$gte": "2025-01-01"}},
            {"is_published": True}
        ]
    },
    include_metadata=True
)

Filter Operators

Operator  Description            Example
$eq       Equal to               {"status": {"$eq": "published"}}
$ne       Not equal to           {"status": {"$ne": "draft"}}
$gt       Greater than           {"rating": {"$gt": 4}}
$gte      Greater than or equal  {"price": {"$gte": 10}}
$lt       Less than              {"age": {"$lt": 30}}
$in       In list                {"tag": {"$in": ["ml", "ai"]}}
$nin      Not in list            {"tag": {"$nin": ["spam"]}}
$and      All conditions         {"$and": [{...}, {...}]}
$or       Any condition          {"$or": [{...}, {...}]}
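Because filters are plain Python dicts, you can compose them programmatically instead of writing nested literals by hand. A minimal sketch (make_and_filter is a hypothetical helper, not part of the Pinecone client):

```python
def make_and_filter(*conditions):
    """Combine filter conditions; a single condition needs no $and wrapper."""
    conditions = [c for c in conditions if c]  # drop empty/None conditions
    if len(conditions) == 1:
        return conditions[0]
    return {"$and": conditions}

f = make_and_filter(
    {"category": {"$in": ["ml", "ai"]}},
    {"date": {"$gte": "2025-01-01"}},
)
# f == {"$and": [{"category": {"$in": ["ml", "ai"]}},
#                {"date": {"$gte": "2025-01-01"}}]}
```

The resulting dict can be passed directly as the filter argument to index.query.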

Querying by Namespace

Python
# Query within a specific namespace
results = index.query(
    vector=query_embedding,
    top_k=5,
    namespace="user-alice",
    include_metadata=True
)

# Each namespace is searched independently
# You cannot search across namespaces in a single query
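Since each query is scoped to one namespace, searching several namespaces means issuing one query per namespace and merging the results client-side. A minimal sketch over the match dicts that query returns (merge_matches is a hypothetical helper; it assumes all namespaces use the same index and metric, so scores are comparable):

```python
def merge_matches(result_lists, top_k):
    """Merge per-namespace match lists into a single top-k, highest score first."""
    merged = [m for matches in result_lists for m in matches]
    merged.sort(key=lambda m: m["score"], reverse=True)
    return merged[:top_k]

# Matches from two separate queries, one per namespace
alice = [{"id": "a-1", "score": 0.91}, {"id": "a-2", "score": 0.75}]
bob = [{"id": "b-1", "score": 0.88}]
top = merge_matches([alice, bob], top_k=2)
# top == [{"id": "a-1", "score": 0.91}, {"id": "b-1", "score": 0.88}]
```

Note that each underlying query should request the full top_k, since any single namespace might contribute all of the best matches.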

Fetching Vectors by ID

Python
# Fetch specific vectors by their IDs
result = index.fetch(ids=["doc-1", "doc-2", "doc-3"])

for id, vector in result["vectors"].items():
    print(f"{id}: {vector['metadata']['title']}")
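fetch simply omits IDs that do not exist in the index, so it is worth checking which of the requested IDs actually came back. A minimal sketch against a fetch-response-shaped dict (find_missing is a hypothetical helper; the real response would come from index.fetch as above):

```python
def find_missing(requested_ids, fetch_result):
    """Return the requested IDs absent from a fetch response's vectors map."""
    found = set(fetch_result["vectors"].keys())
    return [i for i in requested_ids if i not in found]

# A response shaped like a fetch result, with "doc-2" missing from the index
fetch_result = {"vectors": {"doc-1": {"values": [0.1, 0.2]},
                            "doc-3": {"values": [0.3, 0.4]}}}
missing = find_missing(["doc-1", "doc-2", "doc-3"], fetch_result)
# missing == ["doc-2"]
```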

Choosing top_k: For RAG applications, start with top_k=5 and adjust based on your context window. More results provide more context but increase LLM token usage and cost. For search UIs, top_k=10-20 is common.

Score interpretation: For cosine similarity, scores range from -1 to 1 (higher is more similar); with the normalized embeddings that most text models produce, you will typically see values between 0 and 1. A score of 0.9+ usually indicates a strong match, while scores below 0.7 may be only loosely related. Always test with your specific data and embedding model.
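One practical use of these score ranges is dropping weak matches before they reach your LLM. A minimal sketch with a hypothetical cutoff of 0.7 (tune the threshold for your own data and embedding model):

```python
def filter_by_score(matches, min_score=0.7):
    """Keep only matches at or above the similarity cutoff."""
    return [m for m in matches if m["score"] >= min_score]

# Matches shaped like a query response
matches = [
    {"id": "doc-12", "score": 0.92},
    {"id": "doc-23", "score": 0.61},
]
strong = filter_by_score(matches)
# strong == [{"id": "doc-12", "score": 0.92}]
```

Filtering this way trades recall for precision: irrelevant context is less likely to reach the LLM, but a threshold set too high can discard useful passages.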

What's Next?

In the next lesson, we will build a complete RAG pipeline using Pinecone, LangChain, and OpenAI — from document chunking to retrieval-augmented generation.