
Hey, can we know which nodes are used by the model to create the answer, or do we need a custom prompt for that?
They are on the response object!

```python
response = query_engine.query(...)
response.source_nodes
```

Those are the nodes that the model read to create the answer. If you need to know EXACTLY which node/piece of text was used, you can try the citation query engine I built, which tries to get the LLM to write in-text citations (the prompt for this is customizable if you run into issues):

https://github.com/jerryjliu/llama_index/blob/main/docs/examples/query_engine/citation_query_engine.ipynb
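
For reference, a minimal sketch of pulling the retrieved text and scores off the response object, assuming a VectorStoreIndex built with the older llama_index import paths (the ./data directory and query string are placeholders):

```python
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Build an index over local documents (path is a placeholder)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What does the author say about X?")

# Each entry is a NodeWithScore: the chunk the LLM read,
# plus its retrieval similarity score
for node_with_score in response.source_nodes:
    print(node_with_score.score)
    print(node_with_score.node.get_text())
```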
Yes, I was looking for the exact text. I'll try this out, thanks!
Why do we need chunk_size and chunk_overlap for CitationQueryEngine, but not for RetrieverQueryEngine?
The citation query engine is basically taking the existing nodes in your index and breaking them into smaller numbered pieces, so that the LLM can cite sources.

You can leave both at their defaults, or optionally lower the chunk_size to get more granular sources (although with really low chunk sizes, like less than 256, it might negatively affect response quality)

I wouldn't worry about the overlap
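
If it helps, a minimal sketch of wiring that up, following the linked notebook; it assumes the same index as above, and citation_chunk_size=512 / citation_chunk_overlap=20 mirror the defaults shown there:

```python
from llama_index.query_engine import CitationQueryEngine

# Re-splits the index's existing nodes into smaller numbered citation
# chunks; a lower citation_chunk_size gives more granular sources
# (quality may drop below ~256, per the note above)
query_engine = CitationQueryEngine.from_args(
    index,
    citation_chunk_size=512,
    citation_chunk_overlap=20,
)

response = query_engine.query("What does the author say about X?")
print(response)  # answer text with in-text markers like [1], [2]

# The numbered chunks backing those citations
for source in response.source_nodes:
    print(source.node.get_text())
```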
Is there room for a citation highlighter, which uses post-processing to highlight/extract the most relevant parts of the source?

Citations: [{content: ..., highlights: ...}]
Definitely possible, especially with the new OpenAI function calling API (you could define a citation/response object using pydantic that has content and highlights)

Just need to implement the query engine for it πŸ™‚
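
A hypothetical sketch of what that pydantic schema could look like; Citation and CitedAnswer are invented names here, not an existing llama_index API:

```python
from typing import List
from pydantic import BaseModel

class Citation(BaseModel):
    content: str           # full text of the cited source chunk
    highlights: List[str]  # verbatim spans that directly support the answer

class CitedAnswer(BaseModel):
    answer: str
    citations: List[Citation]
```

The JSON schema from CitedAnswer.schema() could then be supplied as the function definition in an OpenAI function-calling request, so the model returns structured citations with highlighted spans.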