
Updated 2 months ago

Building a simple RAG for document management with metadata and file name search

Hi everyone, I have a few questions and hope to get some help. I am building a simple RAG for my document management and want to find target documents quickly.
#1 I noticed the metadata feature, where the file name could serve as information for a query. Is this metadata used during the retrieval process? Or should the file name also be included in the embedding? How can I achieve this? I've read the documentation, but it has no sample code for reference.

#2 Evaluation issue: I tried the ragchecker package to evaluate my RAG and followed all the steps in the RAGChecker example, but the "RAGResults.from_dict({"results": [rag_result]})" step returns nothing. Has anyone used this package before and can share their experience? Thanks!
2 comments
Hi,
1: Yes. To include metadata like file_name in the embedding, remove it from the exclusion list, which holds some default keys that are left out of the embedding process.

You can do that once your docs are created.
from llama_index.core import SimpleDirectoryReader

docs = SimpleDirectoryReader("./data").load_data()  # "./data" is a placeholder path
for doc in docs:
  # Empty the exclusion list so every metadata key, including file_name, is embedded.
  doc.excluded_embed_metadata_keys = []

Once the embeddings are created, you can verify this through your retriever as well.
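To make the effect of the exclusion list concrete, here is a small library-free sketch of the idea: the text sent to the embedding model is the metadata that is not excluded, followed by the document body. The function `build_embed_text` and the sample values are hypothetical, and the exact formatting LlamaIndex uses may differ.

```python
def build_embed_text(text, metadata, excluded_keys):
    """Prefix the body with every metadata entry that is not excluded."""
    kept = [
        f"{key}: {value}"
        for key, value in metadata.items()
        if key not in excluded_keys
    ]
    header = "\n".join(kept)
    return f"{header}\n\n{text}" if header else text


# Hypothetical sample document.
metadata = {"file_name": "invoice_2023.pdf", "file_size": 1024}
body = "Total amount due: 500 USD."

# With both keys excluded, the embedded text is just the body.
without_name = build_embed_text(body, metadata, excluded_keys=["file_name", "file_size"])

# With an empty exclusion list, all metadata becomes part of the embedded text.
with_name = build_embed_text(body, metadata, excluded_keys=[])
```

In LlamaIndex itself, printing doc.get_content(metadata_mode=MetadataMode.EMBED), with MetadataMode imported from llama_index.core.schema, should show the actual string that gets embedded for a given doc.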

Also, if you want to search on a particular keyword, take a look at metadata filtering: https://docs.llamaindex.ai/en/stable/examples/vector_stores/Qdrant_metadata_filter/#qdrant-vector-store-metadata-filter
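As a rough picture of what metadata filtering does, the sketch below keeps only the chunks whose metadata matches a value exactly, which is different from embedding the file name and hoping similarity search picks it up. The `Chunk` class and `filter_by_metadata` are hypothetical names for illustration and are not part of LlamaIndex or Qdrant; a real vector store applies the filter on its side, together with the similarity search.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def filter_by_metadata(chunks, key, value):
    """Keep only chunks whose metadata[key] equals value (exact match)."""
    return [chunk for chunk in chunks if chunk.metadata.get(key) == value]


# Hypothetical sample chunks.
chunks = [
    Chunk("Total amount due: 500 USD.", {"file_name": "invoice_2023.pdf"}),
    Chunk("Meeting notes from Monday.", {"file_name": "notes.txt"}),
]

matches = filter_by_metadata(chunks, "file_name", "invoice_2023.pdf")
```

An exact-match filter like this suits the "find this specific document" case, while embedding the file name suits fuzzier queries that only mention part of it.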

2: Would you mind sharing the doc link, please?
Thanks for replying. #1 I am going to try it later.
#2 Do you mean the documentation link?
Here: https://docs.llamaindex.ai/en/stable/examples/evaluation/RAGChecker/
Thanks again!