Updated 4 months ago

Below is what llama index provides for

Below is what llama-index provides for. So if I wrapped Langchain's equivalent in a class that looks like that one...
Attachment
image.png
15 comments
I'm not 100% sure how the langchain retriever works

Does it return text chunks that match a given query? Then a custom retriever makes sense
We could maybe add wrappers for langchain stuff, but I'm curious why you aren't using llama-index for retrieval?
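For anyone following along, here is a rough sketch of what such a wrapper could look like. It assumes the langchain retriever exposes `get_relevant_documents(query)` and returns objects with `page_content` and `metadata`. `LCDocument`, `RetrievedNode`, `LangchainRetrieverAdapter`, and `FakeRetriever` are hypothetical stand-ins so the snippet runs without either library installed. In real code the adapter would presumably subclass llama-index's retriever base class and return its node-with-score type instead.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class LCDocument:
    """Hypothetical stand-in for a langchain document (text plus metadata)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class RetrievedNode:
    """Hypothetical stand-in for a llama-index node-with-score result."""
    text: str
    metadata: Dict[str, Any]
    score: float


class LangchainRetrieverAdapter:
    """Wraps any object with get_relevant_documents(query) -> list of docs."""

    def __init__(self, lc_retriever: Any, default_score: float = 1.0) -> None:
        self._lc_retriever = lc_retriever
        # The wrapped call returns no similarity scores here, so every
        # result gets the same placeholder score (an assumption).
        self._default_score = default_score

    def retrieve(self, query: str) -> List[RetrievedNode]:
        docs = self._lc_retriever.get_relevant_documents(query)
        return [
            RetrievedNode(
                text=doc.page_content,
                metadata=dict(doc.metadata),
                score=self._default_score,
            )
            for doc in docs
        ]


class FakeRetriever:
    """Toy retriever used only to exercise the adapter."""

    def get_relevant_documents(self, query: str) -> List[LCDocument]:
        return [LCDocument("OpenSearch supports k-NN search.", {"source": "docs"})]


adapter = LangchainRetrieverAdapter(FakeRetriever())
nodes = adapter.retrieve("knn")
```

The adapter only translates result shapes. Any filtering still happens inside the wrapped retriever.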
Here's langchain's output from an OpenSearchVectorSearch object
Attachment
image.png
It looks like llama-index almost supports at least one kind of filter though. Here's llama-index's query
Attachment
image.png
Here is langchain's
Attachment
image.png
I'm just going to stuff something in there and remove the validation about filtering being unimplemented and see what happens
Definitely worth noting though that llama-index is behind langchain in OpenSearch support.
Another reason to use it is that AWS has an OpenSearch service, and afaik it's the only vector store db I can use while keeping my company's legal and security depts happy. (edit: and satisfies my other requirements)
@Logan M I did get a basic boolean filter to work by small edits to llama-index/vector_stores/opensearch.py, but comparing functionality in more depth I think a wrapper is a better option until llama-index implements something more sophisticated.
Yea the vector store integrations are mostly community driven. Feel free to make a PR. Sadly opensearch is barely used (at least judging from discord/github issues), so it's a little barebones at the moment
As it turns out, OpenSearch's approximate k-NN filtering is applied after the k results are retrieved anyway, so filtering the response yourself is just about as easy as asking for a filtered response. Their Lucene engine has pre-filtering, but it only supports vectors up to dimension 1024
Alternatively, there is a brute-force exact k-NN that lets you pre-filter, but it doesn't scale well
More details https://opensearch.org/docs/latest/search-plugins/knn/filter-search-knn/
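To make the difference concrete, here is a toy pure-Python sketch (no OpenSearch involved) of filtering after versus before a top-k search. The documents, the `team` field, and all function names are made up for illustration. With post-filtering, the k nearest hits can all fail the filter, so you get fewer than k results back, possibly none.

```python
import math
from typing import Any, Callable, Dict, List, Tuple

Doc = Tuple[str, List[float], Dict[str, Any]]  # (id, vector, metadata)


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(query: List[float], docs: List[Doc], k: int) -> List[Doc]:
    """Brute-force nearest neighbours by cosine similarity."""
    return sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)[:k]


def post_filter(query: List[float], docs: List[Doc], k: int,
                keep: Callable[[Dict[str, Any]], bool]) -> List[str]:
    # Retrieve k first, then drop hits that fail the filter.
    return [d[0] for d in top_k(query, docs, k) if keep(d[2])]


def pre_filter(query: List[float], docs: List[Doc], k: int,
               keep: Callable[[Dict[str, Any]], bool]) -> List[str]:
    # Restrict the candidate set first, then retrieve k from what is left.
    return [d[0] for d in top_k(query, [d for d in docs if keep(d[2])], k)]


docs: List[Doc] = [
    ("a", [1.0, 0.0], {"team": "x"}),
    ("b", [0.9, 0.1], {"team": "x"}),
    ("c", [0.8, 0.2], {"team": "y"}),
    ("d", [0.0, 1.0], {"team": "y"}),
]
query = [1.0, 0.0]


def is_team_y(meta: Dict[str, Any]) -> bool:
    return meta["team"] == "y"


post_hits = post_filter(query, docs, 2, is_team_y)
pre_hits = pre_filter(query, docs, 2, is_team_y)
```

Here the two nearest documents both belong to team `x`, so `post_hits` is empty while `pre_hits` still contains two matches. Asking for a larger k before post-filtering reduces this effect but does not remove it.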

tl;dr: exact-match filtering might not behave as expected unless you are using "Script Scoring" or "Painless Scripting", but those are not as scalable / flexible as approximate k-NN, which only accepts a "boolean" filter.
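As a sketch of what the two approximate k-NN request bodies look like, based on my reading of the linked docs page: the helpers below only build Python dicts and do not talk to a cluster. The function names, the `embedding` field, and the `team` term filter are hypothetical, and the exact query shape should be checked against the docs for your OpenSearch version.

```python
from typing import Any, Dict, List


def bool_filtered_knn(field: str, vector: List[float], k: int,
                      filter_clause: Dict[str, Any], size: int) -> Dict[str, Any]:
    """Boolean filter alongside approximate k-NN (filter applied to the k hits)."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": filter_clause,
                "must": [{"knn": {field: {"vector": vector, "k": k}}}],
            }
        },
    }


def lucene_filtered_knn(field: str, vector: List[float], k: int,
                        filter_clause: Dict[str, Any], size: int) -> Dict[str, Any]:
    """Filter nested inside the knn clause (the Lucene-engine filtering variant)."""
    return {
        "size": size,
        "query": {
            "knn": {field: {"vector": vector, "k": k, "filter": filter_clause}}
        },
    }


# Hypothetical field name and filter, for illustration only.
team_filter = {"term": {"team": "y"}}
bool_body = bool_filtered_knn("embedding", [0.1, 0.2], 10, team_filter, 3)
lucene_body = lucene_filtered_knn("embedding", [0.1, 0.2], 10, team_filter, 3)
```

Either dict could then be passed as the search body to whichever client you use. In the boolean form, asking for a `k` larger than `size` leaves some headroom for hits the filter removes.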
I submitted a PR to support filtering, see #🙌contributing