Updated 5 months ago

Metadata

team, i've built a basic RAG using (SimpleDirectoryReader, VectorStoreIndex, index.as_query_engine()) and querying using query_engine.query().
the LLM prompt gets appended with the context returned by retriever. i found that the context text contains file_path which is reducing the output for my LLM.
how can i specify that i don't want to add file_path to the context of the LLM, but only the retrieved text?
When ingesting data, you can specify metadata keys to exclude from the LLM, from the embedding model, or both
for the popular examples in llamaindex repo, we usually do the following step:
Plain Text
# a single file goes in input_files; the first positional argument is a directory
documents = SimpleDirectoryReader(input_files=["data/pg_essage.txt"]).load_data()

which makes documents a list of Document objects.
to exclude the file_path metadata from llm/embedding, i'll have to update the exclusion list on each and every Document object in the documents list using something like:
Plain Text
for doc in documents:
    doc.excluded_llm_metadata_keys.append("file_path")

i'm wondering if there's a better approach than this for loop.
Nah that's the best approach. Pretty straightforward
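If the loop gets repeated in several places, it can be wrapped in a small helper. The sketch below is only an illustration: `exclude_metadata_keys` is a hypothetical name, not a LlamaIndex function, and the `SimpleNamespace` objects are stand-ins so the snippet runs without llama_index installed. It assumes each document exposes `excluded_llm_metadata_keys` and `excluded_embed_metadata_keys` as lists, which is how the loop above uses them.

```python
from types import SimpleNamespace


def exclude_metadata_keys(documents, keys, from_llm=True, from_embed=False):
    """Hide the given metadata keys on every document, skipping duplicates.

    Hypothetical helper: it only relies on the two list attributes below,
    so it is duck-typed and not tied to a specific LlamaIndex version.
    """
    for doc in documents:
        targets = []
        if from_llm:
            targets.append(doc.excluded_llm_metadata_keys)
        if from_embed:
            targets.append(doc.excluded_embed_metadata_keys)
        for excluded in targets:
            for key in keys:
                if key not in excluded:
                    excluded.append(key)


# Stand-in documents (assumption: real Document objects carry the same attributes).
docs = [
    SimpleNamespace(
        metadata={"file_path": "data/a.txt"},
        excluded_llm_metadata_keys=[],
        excluded_embed_metadata_keys=[],
    ),
    SimpleNamespace(
        metadata={"file_path": "data/b.txt"},
        excluded_llm_metadata_keys=["file_name"],
        excluded_embed_metadata_keys=[],
    ),
]

# Hide file_path from both the LLM prompt and the embedding text.
exclude_metadata_keys(docs, ["file_path"], from_embed=True)
# Calling it again does not add a duplicate entry.
exclude_metadata_keys(docs, ["file_path"])
```

With real documents, the call would be `exclude_metadata_keys(documents, ["file_path"])` right after `load_data()` and before building the index. To check what the LLM will actually see, printing `doc.get_content(metadata_mode=MetadataMode.LLM)` for one document should show the text without the file_path line.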