LlamaIndex

Updated 6 months ago

Logan M Hi there

HHK-RantTest-HarisRashid

Hi there,

I need some quick help, and any suggestions you can offer. I'm planning to shift our app's architecture to MongoDB's Vector Search instead of Pinecone, since it's much easier to manage vectors in one location.

The attached screenshot shows a simple script that accesses one database called Indexes. It indexes all the data successfully, but querying returns an empty response. Is this normal, or am I missing something?

Attachment

Screenshot_2023-06-27_at_6.46.07_AM.png

36 comments

heh I actually just ran into this this morning! Just a case of bad documentation

After you create the index, MongoDB requires you to set up the "search index" manually in their UI

Attachment

I got the setup process from the langchain docs actually. But even those instructions were bad https://python.langchain.com/docs/modules/data_connection/vectorstores/integrations/mongodb_atlas_vector_search

It takes a few minutes for the index to "create" after doing this, but then the query should work

HHK-RantTest-HarisRashid

Wait. Is this process manual? Do I need to do this for every index that's created?

There might be a way to do it using the python API, but I have not found it yet lol

HHK-RantTest-HarisRashid

I think I need to dive into their docs to see how to set it up. However, even after doing this setup, querying still returns an empty response.

Gotta give it a few minutes 🙏

(it's kinda slow)

there should be some kind of progress indicator for when it's ready on that search tab
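
For anyone who would rather script the wait than watch the search tab, here is a minimal polling sketch. It assumes you can supply a callable that lists the search indexes as dictionaries with `name` and `queryable` keys; `wait_until_queryable` and its parameters are hypothetical names, not part of llama_index or PyMongo. Newer PyMongo releases (4.5 and later) expose `Collection.list_search_indexes()`, which should return documents of that shape, but that was not available in the driver version discussed in this thread.

```python
import time
from typing import Callable, Iterable, Mapping


def wait_until_queryable(
    list_indexes: Callable[[], Iterable[Mapping]],
    index_name: str,
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the named search index reports queryable, or time out.

    `list_indexes` is an assumed callable returning dicts that carry
    "name" and "queryable" keys (hypothetical interface for this sketch).
    """
    waited = 0.0
    while True:
        for info in list_indexes():
            if info.get("name") == index_name and info.get("queryable"):
                return True
        if waited >= timeout_s:
            # Give up: the index never became queryable within the budget.
            return False
        sleep(interval_s)
        waited += interval_s
```

With a recent PyMongo this could be called as `wait_until_queryable(lambda: collection.list_search_indexes(), "default")`, where `"default"` stands for whatever name you gave the search index in the Atlas UI.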

HHK-RantTest-HarisRashid

Interestingly, search works in their UI, but not when querying from my Jupyter notebook.

Attachment

Screenshot_2023-06-27_at_7.08.05_AM.png

HHK-RantTest-HarisRashid

Attachment

Screenshot_2023-06-27_at_7.09.04_AM.png

HHK-RantTest-HarisRashid

I did add some metadata when indexing from my application, but I suppose this shouldn't affect querying.

Attachment

Screenshot_2023-06-27_at_7.10.14_AM.png

Does it work if you only pass in the vector store when loading, without kwargs?

Plain Text

index = VectorStoreIndex.from_vector_store(store)

HHK-RantTest-HarisRashid

Same situation.

Attachment

Screenshot_2023-06-27_at_7.12.30_AM.png

HHK-RantTest-HarisRashid

I restarted the kernel as well.

dang, wonder why. I just had this working this morning too

HHK-RantTest-HarisRashid

Plain Text

index = VectorStoreIndex.from_vector_store(vector_store=store)

HHK-RantTest-HarisRashid

I changed it to this, still doesn't work.

I wonder if the search index thingy in their UI didn't apply to this specific collection, since you customized the collection and db name 🤔 When I had it this morning, I just had all defaults there...

I can set this up again just to prove I'm not crazy lol

HHK-RantTest-HarisRashid

Let me check tho to make sure. Actually I didn't change much. I also just have one collection with a bunch of documents in there.

HHK-RantTest-HarisRashid

All looks right. The setup looks very simple and should work lol

Attachments

Screenshot_2023-06-27_at_7.17.51_AM.png

Screenshot_2023-06-27_at_7.16.54_AM.png

if it helps, this flow worked for me just now

Attachment

HHK-RantTest-HarisRashid

I changed it up a bit. Restarted the kernel too. No luck. I don't have a file locally. I can try that, but it should also work with an empty list, which is how I'm planning to do it in production.

Attachment

Screenshot_2023-06-27_at_7.27.23_AM.png

I blame the search index thing in the mongo UI

Let me see if it still works when I customize the db name and collection on my side

HHK-RantTest-HarisRashid

It successfully generates the embedding vector for the query, so the service context is good.

However, changing db_name and collection_name from their defaults shouldn't affect anything. I'm now running the process with the defaults. It's taking time to index, so let's see how it goes.

Yea I added a collection_name and db_name to my example and it still works 🤔

This is my dashboard for reference (I have two search indexes now)

Attachment

HHK-RantTest-HarisRashid

I can see this. It works now. I checked through the versions and had to upgrade pymongo and langchain. More importantly, I had missed adding this to the search index's JSON schema:

Plain Text

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "dimensions": 1536,
        "similarity": "cosine",
        "type": "knnVector"
      }
    }
  }
}

I spotted this in the LangChain docs. It would be better if you made it explicit in your docs as well.

I also think that when you create the index, llama_index should create the search index in MongoDB itself, instead of us having to do it separately. It's better to keep this logic in one place if possible. I'm just gonna do this on the side for now.
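
As a small illustration of the definition above, the sketch below builds the same JSON from Python so the field name, dimension count and similarity function can be changed in one place. `knn_index_definition` is a hypothetical helper, and the defaults simply mirror the values posted here; 1536 has to match the length of the vectors your embedding model produces.

```python
import json

# Similarity functions that a knnVector field accepts.
ALLOWED_SIMILARITIES = {"euclidean", "cosine", "dotProduct"}


def knn_index_definition(
    field: str = "embedding",
    dimensions: int = 1536,
    similarity: str = "cosine",
) -> dict:
    """Build the search index definition (hypothetical helper)."""
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    if similarity not in ALLOWED_SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity!r}")
    return {
        "mappings": {
            "dynamic": True,
            "fields": {
                field: {
                    "dimensions": dimensions,
                    "similarity": similarity,
                    "type": "knnVector",
                }
            },
        }
    }


# Print JSON that can be pasted into the JSON editor in the Atlas UI.
print(json.dumps(knn_index_definition(), indent=2))
```

The `field` name must be the document key under which the vectors are stored, which is `embedding` in the definition from this thread.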

Attachments

Screenshot_2023-06-28_at_8.45.26_AM.png

Screenshot_2023-06-28_at_8.31.55_AM.png

I agree @HK-RantTest-HarisRashid but from what I can tell, mongodb doesn't offer a way to create the search index without using the UI 🫠

HHK-RantTest-HarisRashid

https://www.mongodb.com/docs/atlas/reference/api-resources-spec/v2/#tag/Atlas-Search/operation/createAtlasSearchIndex

HHK-RantTest-HarisRashid

I guess it does, through an API.

HHK-RantTest-HarisRashid

It would be super helpful if we could integrate creation of the search index into llama_index's workflow.

I wonder if that's included in the python API

HHK-RantTest-HarisRashid

They've included this in the Node.js driver but not in the Python one. I've looked.

Attachment

Screenshot_2023-06-28_at_11.54.13_AM.png

HHK-RantTest-HarisRashid

https://www.mongodb.com/docs/atlas/atlas-search/create-index/#example-1
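
For later readers: PyMongo added search index helpers in version 4.5, after this thread, so the manual UI step can probably be scripted from Python now. The sketch below rests on several assumptions and is not something llama_index does for you: `create_knn_search_index` is a hypothetical helper, `collection` is expected to be a `pymongo.collection.Collection` on an Atlas cluster, and the mapping is the one posted earlier in this thread.

```python
from typing import Any


def create_knn_search_index(
    collection: Any,
    index_name: str = "default",
    field: str = "embedding",
    dimensions: int = 1536,
    similarity: str = "cosine",
) -> str:
    """Ask Atlas to build a knnVector search index; returns the index name.

    Hypothetical helper: `collection` is assumed to be a PyMongo (4.5+)
    Collection that provides `create_search_index`.
    """
    definition = {
        "mappings": {
            "dynamic": True,
            "fields": {
                field: {
                    "dimensions": dimensions,
                    "similarity": similarity,
                    "type": "knnVector",
                }
            },
        }
    }
    # PyMongo accepts either a SearchIndexModel or a plain dict
    # with "definition" and an optional "name".
    return collection.create_search_index(
        {"name": index_name, "definition": definition}
    )
```

Usage would look like `create_knn_search_index(client["Indexes"]["<your collection>"])`, with `client` being a `pymongo.MongoClient` connected to Atlas. The call returns right away while Atlas builds the index in the background, so queries can still come back empty for a few minutes.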
