
Updated 4 months ago

Introducing Contextual Retrieval

At a glance

The post asks whether LlamaIndex offers anything similar to the "Contextual Embeddings" feature described in Anthropic's post. Community members respond that LlamaIndex has a similar concept: you can add metadata to nodes and exclude it from the text sent to the language model, so that it is used only for embeddings. They also note that LlamaIndex supports hybrid retrieval through a QueryFusionRetriever, which can combine multiple retrievers such as a BM25 retriever and a vector retriever. The community members share a code example and links to the LlamaIndex documentation to demonstrate these capabilities.

Does LlamaIndex offer anything similar to the "Contextual Embeddings" found in this Anthropic post? https://www.anthropic.com/news/contextual-retrieval
17 comments
Yeah, definitely not a new idea

It's basically the same as the summary metadata extractor
https://docs.llamaindex.ai/en/stable/module_guides/loading/documents_and_nodes/usage_metadata_extractor/#metadata-extraction-usage-pattern

Basically, in simple from-scratch terms: just add some text to the metadata, exclude that metadata from the LLM so that it's only used for embeddings, and off you go

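from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode

node = TextNode(text="...")
# Extra context lives in metadata, so it is part of the embedded text
node.metadata["context"] = "..."
# Hide the "context" key from the text that is passed to the LLM
node.excluded_llm_metadata_keys.append("context")

index = VectorStoreIndex(nodes=[node])  # add further nodes and options as needed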
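To make the "only used for embeddings" part concrete, here is a minimal pure-Python sketch of the idea. `SketchNode` and its `render` method are hypothetical stand-ins written for this illustration, not LlamaIndex classes, and the sample strings are invented. It only mirrors the behaviour described above: a metadata key listed in `excluded_llm_metadata_keys` shows up in the text that gets embedded but not in the text the LLM sees.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical stand-in for a node; NOT the LlamaIndex TextNode class."""

    text: str
    metadata: dict = field(default_factory=dict)
    excluded_llm_metadata_keys: list = field(default_factory=list)
    excluded_embed_metadata_keys: list = field(default_factory=list)

    def render(self, audience: str) -> str:
        """Return visible metadata lines followed by the text.

        `audience` is "embed" (text sent to the embedding model) or
        "llm" (text sent to the language model).
        """
        if audience == "llm":
            hidden = self.excluded_llm_metadata_keys
        elif audience == "embed":
            hidden = self.excluded_embed_metadata_keys
        else:
            raise ValueError(f"unknown audience: {audience!r}")
        lines = [f"{k}: {v}" for k, v in self.metadata.items() if k not in hidden]
        return "\n".join(lines + [self.text])


# Invented sample chunk and context, for illustration only.
node = SketchNode(text="Revenue grew 3% over the previous quarter.")
node.metadata["context"] = "From ACME Corp's Q2 2023 filing."
node.excluded_llm_metadata_keys.append("context")

embed_text = node.render("embed")  # context line + chunk text
llm_text = node.render("llm")      # chunk text only
```

The embedding therefore carries the situating context, while the prompt sent to the LLM stays as short as the original chunk.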
The cool thing, I think, is leveraging their prompt caching system for this to reduce costs
Yeah, I love that new feature of theirs. Saved me a lot of money already!
@Logan M But what about the BM25 hybrid retrieval aspect?
That's also supported in a similar way -- we have a BM25 retriever integration (a wrapper around the bm25s library)

Combine multiple retrievers into a QueryFusionRetriever, and off you go
https://docs.llamaindex.ai/en/stable/examples/retrievers/bm25_retriever/#hybrid-retriever-with-bm25-chroma
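For a rough picture of what "fusing" retrievers means, here is a minimal pure-Python sketch of reciprocal rank fusion, one common way to merge ranked lists from a keyword retriever and a vector retriever. This is an illustration under stated assumptions, not the QueryFusionRetriever implementation: the function name, the `k=60` constant, and the document ids are all made up for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id to keep the output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from two retrievers for the same query.
bm25_ranking = ["doc_b", "doc_a", "doc_d"]
vector_ranking = ["doc_a", "doc_c", "doc_b"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

Documents that both retrievers rank highly (`doc_a` and `doc_b` here) end up ahead of documents that only one retriever returned.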
Ah, I was searching for "hybrid retriever" when I should have been searching for "fusion retriever," eh?
eh, it should have come up under hybrid (it's in the title)
I may have glanced over it...
Perhaps I assumed it only applied to Chroma.
@Logan M Can I use the in-memory VectorStoreIndex to perform this hybrid search, or do I need to use a third-party vector store that supports hybrid search?
Depends on how you want to build it -- you can use any vector db, but if it doesn't natively support hybrid search, you need to build your own (usually combining a retriever with a bm25 retriever, or something similar)
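If you do build your own, the keyword half is a BM25 ranking over the same chunks that the vector store holds. The sketch below is a small from-scratch Okapi BM25 scorer in pure Python, just to show what that side computes. It is not the bm25s library or the LlamaIndex BM25 retriever; the whitespace tokenizer, the `k1` and `b` values, and the sample documents are assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Tokenization is a naive lowercase whitespace split (an assumption for
    this sketch); real retrievers usually stem and drop stop words.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Invented sample chunks, for illustration only.
docs = [
    "error code TS-999 in the billing service",
    "how to reset a password",
    "billing overview and invoices",
]
scores = bm25_scores("TS-999 billing", docs)
# Document indices ordered from best to worst keyword match.
ranking = sorted(range(len(docs)), key=lambda i: -scores[i])
```

An exact token such as an error code is where keyword scoring tends to help most, which is why it is usually paired with a vector retriever rather than used as a replacement.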
Ah, so I can indeed use the QueryFusionRetriever with the in-memory VectorStoreIndex?
thanks, homie!