Updated last year

prefer document source

Hello! I have a question, hopefully you can help me out with it. I have multiple data sources, such as public documentation, community pages, and articles written by architects, and sometimes documents contain conflicting information. For example, an article written by an architect might differ from what the public documentation says about the same thing (I know we should not have this, but oh well 😄). I am curious whether there is a way to set a "preferred" source when documents from different sources are retrieved for a question. For example, if I retrieve one doc from my architect-published docs and another from the public documentation, I'd like the LLM to prefer the architect's document. How do you recommend doing that? Should I index everything in the same vector store or use separate vector stores? If I put them in the same vector store, the retrieved documents might all be semantically relevant to the question while still conflicting with each other, so I am not sure reranking would help here. Any suggestion is welcome 🙏
3 comments
as always, I'll ask the super expert @Logan M 😄
My gut says just to set some source metadata in the input documents, and then write some kind of custom node postprocessor to remove retrieved sources that are known to conflict 🤔
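
A rough sketch of that idea, in plain Python so it runs without any framework installed. Everything here is an assumption made for illustration: the `Retrieved` class stands in for a retrieved node, the `source` and `topic` metadata keys and the source labels are hypothetical, and "known to conflict" is approximated as "same `topic` value". In LlamaIndex the same filtering logic would live inside a custom node postprocessor rather than a free function.

```python
from dataclasses import dataclass, field

# Lower number = more trusted. These source labels are hypothetical.
SOURCE_PRIORITY = {"architect": 0, "public_docs": 1, "community": 2}


@dataclass
class Retrieved:
    """Stand-in for a retrieved node: text, similarity score, metadata."""
    text: str
    score: float
    metadata: dict = field(default_factory=dict)


def prefer_sources(nodes, priority=SOURCE_PRIORITY):
    """Within each topic, keep only nodes from the most trusted source present."""
    unknown = len(priority)  # sources missing from the map rank last

    def rank(node):
        return priority.get(node.metadata.get("source"), unknown)

    # First pass: find the best (lowest) source rank seen for each topic.
    best = {}
    for node in nodes:
        topic = node.metadata.get("topic")
        if topic is not None:
            best[topic] = min(best.get(topic, unknown), rank(node))

    # Second pass: drop nodes outranked by another source on the same topic.
    # Nodes without a topic cannot be matched against anything, so keep them.
    kept = []
    for node in nodes:
        topic = node.metadata.get("topic")
        if topic is None or rank(node) == best[topic]:
            kept.append(node)
    return kept


nodes = [
    Retrieved("Public docs: use API v1.", 0.91,
              {"source": "public_docs", "topic": "api-version"}),
    Retrieved("Architect: use API v2.", 0.88,
              {"source": "architect", "topic": "api-version"}),
    Retrieved("Community tip on logging.", 0.80,
              {"source": "community", "topic": "logging"}),
]
for node in prefer_sources(nodes):
    print(node.metadata["source"], "-", node.text)
```

With this approach everything can stay in one vector store: the public-docs node is dropped even though it scored higher, because the architect's doc covers the same topic, while the community node survives since nothing more trusted competes with it. The hard part in practice is deciding which documents actually cover the same thing; the `topic` tag is just one possible way to record that at ingestion time.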
will give it a try