Updated 3 months ago

❓ Do you guys know how to get the reference source file when I run a query?

For example, I have multiple document files under the ./data folder.

To query them and get an answer, I wrote this short code.

----------------------------------------
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader(input_dir="./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
------------------------------------------

I want to know which source file the query engine used to produce the answer.

Please share your knowledge. Thanks!!!!!! 👍
13 comments
response.get_formatted_sources()
Thanks @OceanLi. Would you please explain more about "source"? Can I find the file name using the doc ID? I need to know which file was used for the query result. Thanks again.
Attachment
image.png
Oh, in this case I had already preprocessed the data when building this index, so each node is a text node from my prepared chunks. That's one way you can do it. I'm not too sure what the built-in way of doing this is in LlamaIndex,
but I think you can manipulate the metadata schema.
As for the source: it refers to specific nodes.
So the metadata I was referring to are attributes of nodes.
If you parse the docs with a reader like SimpleDirectoryReader, it adds the file name to the metadata of each document by itself for PDF and doc files.

Otherwise, you can add the file name and any other info you want to have in the retrieved nodes.
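To make that concrete, here is a minimal sketch of pulling the file names back out of a query response. It assumes the response exposes `source_nodes`, that each entry wraps a node under `.node`, and that the node metadata has a `file_name` key. The helper name `source_file_names` and the stand-in response are made up for illustration, so the snippet runs without an index or an API key.

```python
from types import SimpleNamespace


def source_file_names(response):
    """Return the distinct file names found in the source nodes' metadata."""
    names = []
    for node_with_score in response.source_nodes:
        # Metadata is a plain dict on the node; the key may be missing.
        name = node_with_score.node.metadata.get("file_name")
        if name is not None and name not in names:
            names.append(name)
    return names


# Hypothetical stand-in for a real response object (assumption, demo only).
fake_response = SimpleNamespace(source_nodes=[
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "essay.txt"}), score=0.83),
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "notes.pdf"}), score=0.79),
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "essay.txt"}), score=0.75),
    SimpleNamespace(node=SimpleNamespace(metadata={}), score=0.70),
])

print(source_file_names(fake_response))  # ['essay.txt', 'notes.pdf']
```

With a real query engine you would pass the object returned by `query_engine.query(...)` instead of `fake_response`.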
Thanks so much. This documentation page is a big help. Thanks @WhiteFang_Jr and @OceanLi
I saved the file name in the metadata. It works for me.

from llama_index import SimpleDirectoryReader

filename_fn = lambda filename: {"file_name": filename}

# automatically sets the metadata of each document according to filename_fn
documents = SimpleDirectoryReader("./data", 
                                  recursive=True, 
                                  file_metadata=filename_fn,
                                  filename_as_id=True,
).load_data()
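
One thing to watch: the function passed as `file_metadata` seems to receive the file path, so the lambda above would store the whole path under `file_name`. Below is a rough sketch of a variant that keeps the bare name and the path separately. The function name `file_metadata_fn` and the extra keys `file_path` and `file_type` are my own choices, not something the reader requires.

```python
import os


def file_metadata_fn(path):
    """Build per-file metadata from the path handed over by the reader."""
    return {
        "file_name": os.path.basename(path),                # bare name, e.g. "q1.PDF"
        "file_path": path,                                  # path as received
        "file_type": os.path.splitext(path)[1].lower(),     # extension, lowercased
    }


print(file_metadata_fn("./data/reports/q1.PDF"))
```

It should drop in where `filename_fn` is used, i.e. `file_metadata=file_metadata_fn`.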