
Updated 3 months ago

Logan M Hi there

Hi there,

Just need some quick help, and any suggestions you can offer. I'm planning to shift our app's architecture to MongoDB's Vector Search instead of Pinecone. It's much easier to manage vectors in one location.

The attached script accesses a database called Indexes. It indexes all the data successfully, but querying returns an empty response. Is this normal, or am I missing something?
Attachment
Screenshot_2023-06-27_at_6.46.07_AM.png
36 comments
heh I actually just ran into this this morning! Just a case of bad documentation

After you create the index, MongoDB requires you to set up the "search index" manually in their UI
Attachment
image.png
I got the setup process from the langchain docs actually. But even those instructions were bad https://python.langchain.com/docs/modules/data_connection/vectorstores/integrations/mongodb_atlas_vector_search
It takes a few minutes for the index to "create" after doing this, but then the query should work
Wait. Is this process manual? Do I need to do this for every index that's created?
There might be a way to do it using the python API, but I have not found it yet lol
I think I need to dive into their docs to see how to set it up. However, even after doing this setup, querying still returns an empty response.
Gotta give it a few minutes 🙏
(it's kinda slow)
there should be some kind of progress indicator for when it's ready on that search tab
Interestingly, search is working in their UI, but not in my Jupyter notebook when querying.
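Since the search index takes a few minutes to build, a small polling helper can stop a notebook from querying too early. This is only a sketch: `wait_until_ready` and the `is_ready` callable are hypothetical names, and how you actually read the index status from Atlas is not covered in this thread, so that part is left as a function you supply.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses.

    is_ready is a hypothetical callable that you provide; it should
    return True once the search index reports that it is built.
    Returns True if the index became ready, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait before checking again so we don't hammer the server.
        sleep(interval_s)
```

For example, `wait_until_ready(my_status_check)` would block for up to ten minutes, checking every ten seconds, where `my_status_check` is whatever status lookup you end up writing.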
Attachment
Screenshot_2023-06-27_at_7.08.05_AM.png
I did add some metadata when indexing from my application, but that shouldn't affect querying, I suppose.
Attachment
Screenshot_2023-06-27_at_7.10.14_AM.png
Does it work if you only pass in the vector store when loading, without kwargs?

Plain Text
index = VectorStoreIndex.from_vector_store(store)
Same situation.
Attachment
Screenshot_2023-06-27_at_7.12.30_AM.png
I restarted the kernel as well.
dang, wonder why. I just had this working this morning too
Plain Text
index = VectorStoreIndex.from_vector_store(vector_store=store)
I changed it to this, still doesn't work.
I wonder if the search index thingy in their UI didn't apply to this specific collection, since you customized the collection and db name 🤔 When I had it this morning, I just had all defaults there...
I can set this up again just to prove I'm not crazy lol
Let me check tho to make sure. Actually I didn't change much. I also just have one collection with a bunch of documents in there.
All looks right. The setup looks very simple and should work lol
Attachments
Screenshot_2023-06-27_at_7.17.51_AM.png
Screenshot_2023-06-27_at_7.16.54_AM.png
if it helps, this flow worked for me just now
Attachment
image.png
I changed it up a bit. Restarted the kernel too. No luck. I don't have a file locally. I can try that, but it should work with an empty list too. That's how I'm planning to do it in production.
Attachment
Screenshot_2023-06-27_at_7.27.23_AM.png
I blame the search index thing in the mongo UI

Let me see if I customize the db name and collection on my side, if it still works
It successfully generates the embedding vector for the query, so the service context is good.

However, changing db_name and collection_name from the defaults shouldn't affect anything. I'm running the process with the defaults now. It's taking time to index. Let's see how it goes.
Yea I added a collection_name and db_name to my example and it still works 🤔
This is my dashboard for reference (I have two search indexes now)
Attachment
image.png
I can see this. It works now. I checked through the versions and had to upgrade pymongo and langchain. More importantly, I had missed adding this to the JSON schema:

Plain Text
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "dimensions": 1536,
        "similarity": "cosine",
        "type": "knnVector"
      }
    }
  }
}


I spotted this in the LangChain docs. It would be good to make it explicit in your docs as well.

I also think that when you create the index, llama_index itself should create the search index in MongoDB, instead of us having to do it separately. It's better to keep this logic in one place if possible. I'm just going to do this on the side for now.
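For anyone landing here later, the definition above can also be built in code and pasted into the JSON editor of the Atlas search index page. This is a rough sketch, not something from the llama_index or LangChain docs: the helper name `knn_index_definition` is made up, the default field name `embedding` and the 1536 dimensions are taken from the JSON above, and the list of similarity values reflects my understanding of what `knnVector` accepts.

```python
import json

# Similarity functions assumed to be accepted for knnVector fields.
ALLOWED_SIMILARITIES = {"euclidean", "cosine", "dotProduct"}


def knn_index_definition(
    dimensions: int = 1536,
    similarity: str = "cosine",
    field: str = "embedding",
) -> dict:
    """Build a search index definition shaped like the JSON in this thread.

    dimensions must match the length of the vectors your embedding
    model produces, otherwise queries may come back empty.
    """
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity!r}")
    return {
        "mappings": {
            "dynamic": True,
            "fields": {
                field: {
                    "dimensions": dimensions,
                    "similarity": similarity,
                    "type": "knnVector",
                }
            },
        }
    }


if __name__ == "__main__":
    # Print the definition as JSON, ready to paste into the Atlas UI.
    print(json.dumps(knn_index_definition(), indent=2))
```

If your documents store vectors under a different key, pass that key as `field`; the name has to match whatever the vector store writes.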
Attachments
Screenshot_2023-06-28_at_8.45.26_AM.png
Screenshot_2023-06-28_at_8.31.55_AM.png
I agree @HK-RantTest-HarisRashid but from what I can tell, mongodb doesn't offer a way to create the search index without using the UI 🫠
I guess it does through an API
It would be super helpful if search index creation could be integrated into llama_index's workflow.
I wonder if that's included in the python API
They've included this in the Node.js driver, not Python. I've looked.
Attachment
Screenshot_2023-06-28_at_11.54.13_AM.png