llama_index/docs/examples/metadata

I'm reading this doc on MetadataExtraction, and when using QuestionsAnsweredExtractor(questions=3, llm=llm), I see that it generates:

 'questions_this_excerpt_can_answer': '1. How many countries does Uber operate in?\n2. What is the total gross bookings of Uber in 2019?\n3. How many trips did Uber facilitate in 2019?'}

My understanding is that a vector search runs over a Document/Node's text. So how does it know to search questions_this_excerpt_can_answer without my specifying it as a metadata filter (using Qdrant, for example)?

https://github.com/run-llama/llama_index/blob/main/docs/examples/metadata_extraction/MetadataExtractionSEC.ipynb

5 comments

Logan M

When nodes are embedded by llama-index, the metadata is included in the text that gets embedded.

It's something like embed_model.get_text_embedding(node.get_content(metadata_mode=MetadataMode.EMBED))
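
To make the idea concrete, here is a simplified, library-free sketch of how a node's metadata can be folded into the string that gets embedded. The helper name `build_embed_text`, the sample text, and the `key: value` layout are assumptions for illustration only, not llama-index's actual implementation.

```python
def build_embed_text(text: str, metadata: dict) -> str:
    """Prepend each metadata entry as a 'key: value' line to the node text."""
    metadata_str = "\n".join(f"{key}: {value}" for key, value in metadata.items())
    # With no metadata, the string to embed is just the text itself.
    return f"{metadata_str}\n\n{text}" if metadata_str else text


# Hypothetical node content and extracted metadata.
node_text = "Uber operates in many countries."
node_metadata = {
    "questions_this_excerpt_can_answer": "1. How many countries does Uber operate in?"
}

embed_text = build_embed_text(node_text, node_metadata)
print(embed_text)
```

Because the extracted questions are part of the embedded string, a query that resembles one of them can match the node by vector similarity alone, with no metadata filter involved.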

Logan M

You can actually configure which metadata keys to include or exclude, too.
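
A rough sketch of what per-mode key exclusion could look like, written without the library so it runs on its own. The `Mode` enum, the `visible_metadata` helper, and the sample file name are hypothetical stand-ins; llama-index nodes expose attributes along the lines of `excluded_embed_metadata_keys` and `excluded_llm_metadata_keys`, but check the documentation for your version.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for the library's MetadataMode."""
    EMBED = "embed"
    LLM = "llm"


def visible_metadata(metadata, mode, excluded_embed_keys=(), excluded_llm_keys=()):
    """Return only the metadata entries that are visible in the given mode."""
    excluded = excluded_embed_keys if mode is Mode.EMBED else excluded_llm_keys
    return {key: value for key, value in metadata.items() if key not in excluded}


# Hypothetical metadata: hide the file name from the embedding, keep it for the LLM.
metadata = {
    "file_name": "uber_10k.pdf",
    "questions_this_excerpt_can_answer": "1. How many countries does Uber operate in?",
}

embed_view = visible_metadata(metadata, Mode.EMBED, excluded_embed_keys=["file_name"])
llm_view = visible_metadata(metadata, Mode.LLM, excluded_embed_keys=["file_name"])
print(sorted(embed_view))
print(sorted(llm_view))
```

Here the embedding sees only the extracted questions, while the text shown to the LLM still carries the file name.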

Logan M

You can actually try it and see an example now:

from llama_index.schema import MetadataMode

print(node.get_content(metadata_mode=MetadataMode.EMBED))

Logan M

The node/document object is HIGHLY customizable in this respect.

chantlong

Ahh, I see. I thought metadata was not part of the embedded search. Good to know, thanks.
