Updated 2 years ago

Doc ids

Hey, I think the doc_ids parameter isn't working correctly in the VectorIndexRetriever. When I pass doc_ids=['doc_id1'], it returns chunks with different doc_ids. Has anyone used it?
14 comments
What exactly are you doing here?

As a heads up, if you set the doc id on input documents, those documents still get parsed into nodes

Each node gets its own new doc_id, but you can trace a node back to its original document using node.ref_doc_id
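To illustrate the point about nodes and ref_doc_id, here is a minimal plain-Python sketch. It does not use llama_index itself: the `Node` dataclass, `split_into_nodes`, and the fixed-size chunking rule are hypothetical stand-ins, and only the `doc_id` / `ref_doc_id` field names come from the comment above.

```python
from dataclasses import dataclass, field
from typing import List
import uuid


@dataclass
class Node:
    """Hypothetical stand-in for a parsed chunk; not the llama_index class."""
    text: str
    ref_doc_id: str  # id of the source document this chunk came from
    # Every node gets a fresh id of its own, unrelated to the document id.
    doc_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def split_into_nodes(doc_id: str, text: str, chunk_size: int = 20) -> List[Node]:
    """Cut a document into fixed-size chunks that remember their source doc."""
    return [
        Node(text=text[i:i + chunk_size], ref_doc_id=doc_id)
        for i in range(0, len(text), chunk_size)
    ]


nodes = split_into_nodes("doc_id1", "a" * 50)
for node in nodes:
    # node.doc_id differs per chunk; node.ref_doc_id is always the source doc.
    print(node.doc_id, "->", node.ref_doc_id)
```

So if the ids on returned chunks look unfamiliar, comparing `ref_doc_id` rather than `doc_id` against the ids you set is probably the more meaningful check.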
I'm using doc_ids to constrain the search, i.e. to search only those docs. It's mentioned here: https://gpt-index.readthedocs.io/en/stable/reference/query/retrievers/vector_store.html
Am I misunderstanding the parameter?
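For readers unsure what "constraining the search" means here, this is a toy sketch of the intended behaviour, written in plain Python rather than against llama_index. The `CHUNKS` store, `cosine`, and `retrieve` are all hypothetical names, and the assumption that the filter matches on the source document id is mine, not something the linked docs are quoted as saying.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical toy store: chunk id -> (source document id, embedding).
CHUNKS: Dict[str, Tuple[str, List[float]]] = {
    "chunk-a": ("doc_id1", [1.0, 0.0]),
    "chunk-b": ("doc_id2", [0.9, 0.1]),
    "chunk-c": ("doc_id1", [0.0, 1.0]),
}


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(
    query: Sequence[float],
    top_k: int = 2,
    doc_ids: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return the top_k most similar chunk ids, optionally limited to doc_ids."""
    allowed = set(doc_ids) if doc_ids is not None else None
    scored = [
        (cosine(query, embedding), chunk_id)
        for chunk_id, (source_id, embedding) in CHUNKS.items()
        if allowed is None or source_id in allowed  # the doc_ids constraint
    ]
    scored.sort(reverse=True)
    return [chunk_id for _, chunk_id in scored[:top_k]]


unfiltered = retrieve([1.0, 0.0])
filtered = retrieve([1.0, 0.0], doc_ids=["doc_id1"])
```

Without the filter, the closest chunk from doc_id2 is returned; with `doc_ids=["doc_id1"]`, only chunks from that document are candidates, even if they score lower.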
Oh neat, I learned something new about our code lol

Yea, you are using weaviate if I remember right? Looking at the code, it looks like that parameter isn't used in the weaviate query function 🥲 It does look like it works with the default vector index though

Happy to implement that at some point for you though. It's hard to keep all the vector stores at feature parity when we have so many of them 🙃
Thank you! Yeah, you're right, there are many 😅
I'll help on that when I have some time
yes I'm using weaviate
Should be a simple enough change I hope! Just need to add that list of ids to the weaviate filter (I think) lol

The doc ids are available on query.doc_ids

https://github.com/jerryjliu/llama_index/blob/c0bf85f4e6cd15b9a6bd2cc4be1d328194c0312d/llama_index/vector_stores/weaviate.py#L151
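As a rough idea of what "add that list of ids to the weaviate filter" could look like, here is a sketch that turns a list of doc ids into a Weaviate-style `where` filter dictionary. This is not the code from the eventual PR: `doc_ids_to_where_filter` is a hypothetical helper, the `ref_doc_id` property name is an assumption about how the ids are stored, and the `valueString` key reflects the Weaviate filter syntax of the time.

```python
from typing import Any, Dict, List, Optional


def doc_ids_to_where_filter(
    doc_ids: Optional[List[str]],
    property_name: str = "ref_doc_id",  # assumed property name, adjust to the schema
) -> Optional[Dict[str, Any]]:
    """Build a Weaviate-style `where` filter matching any of the given doc ids.

    Returns None when there is nothing to filter on, so the caller can skip
    adding a filter to the query altogether.
    """
    if not doc_ids:
        return None
    operands = [
        {"path": [property_name], "operator": "Equal", "valueString": doc_id}
        for doc_id in doc_ids
    ]
    if len(operands) == 1:
        # A single condition does not need an enclosing "Or".
        return operands[0]
    return {"operator": "Or", "operands": operands}


print(doc_ids_to_where_filter(["doc_id1", "doc_id2"]))
```

In the query function, the resulting dictionary would then be attached to the Weaviate query only when `query.doc_ids` is set, leaving unfiltered queries unchanged.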
Hey, I created a PR for this issue: https://github.com/jerryjliu/llama_index/pull/6467
I didn't run pre-commit though; it kept failing for me with "expected environment for python to be healthy() immediately after install".
I tested it locally on an example and it works. Please check it out, and if it looks good we can merge it
Thanks so much! I will try to merge this shortly
glad to contribute, thanks!
Hey, do you plan to release a new version today with this feature?
I think so at some point today!
Since it's merged, you can also install directly from git with pip if you need it 🙂
okay, thanks!