❓ Do you guys know how to get a

❓ Does anyone know how to get the reference source file when I run a query?

For example, I have multiple document files under the ./data folder.

To query them and get an answer, I wrote this short code.

----------------------------------------
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader(input_dir="./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
------------------------------------------

I want to know which source file the query engine used to produce the answer.

Please share your knowledge. Thanks!!!!!! 👍

13 comments

OceanLi

response.source_nodes? https://docs.llamaindex.ai/en/stable/understanding/querying/querying.html

OceanLi

https://docs.llamaindex.ai/en/stable/examples/query_engine/CustomRetrievers.html

OceanLi

Attachment

OceanLi

response.get_formatted_sources()

ZIRU

Thanks @OceanLi. Would you please explain more about "source"? Can I find the file name using the Doc ID? I need to know which file was used for the query result. Thanks again.

Attachment

OceanLi

Oh, in this case I had already preprocessed the data when building the index, so each node is a text node from my prepared chunks. That is one way you can do it. I'm not too sure what the built-in way of doing it is in LlamaIndex.

OceanLi

But I think you can manipulate the metadata schema.

OceanLi

As for the source, it refers to specific nodes.

OceanLi

So the metadata I was referring to are attributes of nodes

OceanLi

@Logan M

WhiteFang_Jr

If you are parsing the docs with a reader like SimpleDirectoryReader, it adds the file name to the metadata of each document by itself for PDF and doc files.

Otherwise, you can add the file name and any other info you want so that it appears in the retrieved nodes.

WhiteFang_Jr

More on this: https://docs.llamaindex.ai/en/stable/module_guides/loading/documents_and_nodes/usage_documents.html#customizing-documents

ZIRU

Thanks so much. This documentation page is a big help. Thanks @WhiteFang_Jr and @OceanLi.
I saved the file name in the metadata, and it works for me.

----------------------------------------
from llama_index import SimpleDirectoryReader

filename_fn = lambda filename: {"file_name": filename}

# automatically sets the metadata of each document according to filename_fn
documents = SimpleDirectoryReader(
    "./data",
    recursive=True,
    file_metadata=filename_fn,
    filename_as_id=True,
).load_data()
----------------------------------------

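For anyone landing here later: once the file name is in the metadata, it can be read back from the response through response.source_nodes, as suggested earlier in the thread. Below is a minimal sketch of that lookup. It assumes each entry in source_nodes exposes the node's metadata dict as .node.metadata. The helper name source_file_names, the stand-in response object, and the file names are made up for illustration so the snippet runs without an index or an API key.

```python
from types import SimpleNamespace


def source_file_names(response):
    """Return the distinct file names behind a response, in retrieval order."""
    names = []
    for source in response.source_nodes:
        # Each entry wraps a node plus a score; the metadata dict lives on the node.
        name = source.node.metadata.get("file_name", "<unknown>")
        if name not in names:
            names.append(name)
    return names


# Hypothetical stand-in for a real query response (assumption: same attribute layout).
# With LlamaIndex, pass the object returned by query_engine.query(...) instead.
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "data/essay.txt"}), score=0.83),
        SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "data/notes.pdf"}), score=0.79),
        SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "data/essay.txt"}), score=0.75),
    ]
)

print(source_file_names(fake_response))  # ['data/essay.txt', 'data/notes.pdf']
```

In the original snippet this would be called as source_file_names(response) right after query_engine.query(...). Several chunks can come from the same file, which is why the helper removes duplicates.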