
Updated 2 months ago

I have set up a basic test to see how

I have set up a basic test to see how the vector, doc, and index data are separated and stored. When I create an index from a basic doc, docstore.json contains useful information such as doc_hash and ref_doc_id under "docstore/metadata", and node information such as extra_info and node_info under "docstore/data". But when I switch to a Milvus vector store, none of the node data under "docstore/data" is present anymore, and it doesn't show up in Milvus either. Is this how combining those two storage methods is intended to work?
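For anyone repeating this comparison, a small script can make the difference between the two runs easier to see. This is a minimal sketch that assumes only that docstore.json is a JSON object whose top-level keys (such as "docstore/metadata" and "docstore/data") map entry ids to objects; the helper name summarize_docstore is made up for illustration and is not part of LlamaIndex.

```python
import json
from pathlib import Path


def summarize_docstore(path):
    """Summarize each top-level section of a persisted docstore JSON file.

    Returns {section: {"entries": <count>, "fields": [<sorted field names>]}}.
    Hypothetical helper for inspection only; not a LlamaIndex API.
    """
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    summary = {}
    for section, entries in payload.items():
        fields = set()
        count = 0
        if isinstance(entries, dict):
            count = len(entries)
            for entry in entries.values():
                # Collect the field names stored for each entry in this section.
                if isinstance(entry, dict):
                    fields.update(entry)
        summary[section] = {"entries": count, "fields": sorted(fields)}
    return summary
```

Running summarize_docstore on the docstore.json from the default setup and again on the one written with the Milvus vector store should show whether "docstore/data" is empty or missing in the second case.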
3 comments
When using most vector store integrations, the docstore and index_store are not used, and everything goes into the vector store. Some recent docstore changes still need to be ported to each vector store individually (for example, index.ref_doc_info won't work yet when using a vector store integration).

Milvus is a slightly special case in that it can't store nodes directly (only text). This is what inserting data into Milvus looks like:


Plain Text
# Process the data we are going to insert
for result in embedding_results:
    ids.append(result.id)
    doc_ids.append(result.ref_doc_id)
    # Only the node's text is kept; other node fields are not passed on
    texts.append(result.node.get_text())
    embeddings.append(result.embedding)

try:
    # Insert the data into milvus, one list per column
    self.collection.insert([ids, doc_ids, texts, embeddings])
    logger.debug(
        f"Successfully inserted embeddings into: {self.collection_name} "
        f"Num Inserted: {len(ids)}"
    )
# (excerpt ends here; the except clause is not shown)


https://github.com/jerryjliu/llama_index/blob/main/llama_index/vector_stores/milvus.py#L322
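To make the consequence of that loop concrete, here is a self-contained sketch of the same column-building step with no Milvus connection. SketchNode and SketchResult are hypothetical stand-ins for the real node and embedding-result classes, and to_columns is an illustrative name; only the four appended fields are taken from the excerpt above.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    # Hypothetical stand-in for a LlamaIndex node.
    text: str
    node_info: dict = field(default_factory=dict)

    def get_text(self):
        return self.text


@dataclass
class SketchResult:
    # Hypothetical stand-in for an embedding result.
    id: str
    ref_doc_id: str
    node: SketchNode
    embedding: list


def to_columns(embedding_results):
    """Build the four column lists that the excerpt passes to insert()."""
    ids, doc_ids, texts, embeddings = [], [], [], []
    for result in embedding_results:
        ids.append(result.id)
        doc_ids.append(result.ref_doc_id)
        # node_info is never read here, so it cannot reach the collection.
        texts.append(result.node.get_text())
        embeddings.append(result.embedding)
    return [ids, doc_ids, texts, embeddings]
```

Because the payload has exactly four columns (id, doc id, text, embedding), anything else attached to the node has no column to land in, which matches the missing "docstore/data" fields described in the question.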
Ah, I see you found the override, nice
It would be nice to be able to set extra fields in the Milvus schema when the collection is initially created, so that the extra_data can be mapped to and from them. At the moment it looks like any extra data specified just gets prepended to the text field.
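As a rough illustration of the difference between the two approaches, the sketch below contrasts flattening the extra data into the text with keeping it in its own columns. Both function names are hypothetical, the "key: value" header format is an assumption about how the text is rendered, and nothing here is an existing LlamaIndex or Milvus API.

```python
def render_text(text, extra_info):
    """Flatten extra data into the text, as the current behavior appears to do.

    The "key: value" header layout is an assumption for illustration.
    """
    if not extra_info:
        return text
    header = "\n".join(f"{key}: {value}" for key, value in extra_info.items())
    return f"{header}\n\n{text}"


def split_extra_columns(records, extra_fields):
    """Keep the chosen extra fields as separate columns next to the text.

    records is a list of (text, extra_info) pairs; extra_fields names the
    hypothetical schema fields to fill. Missing values become None.
    """
    columns = {"text": []}
    for name in extra_fields:
        columns[name] = []
    for text, extra_info in records:
        columns["text"].append(text)
        for name in extra_fields:
            columns[name].append(extra_info.get(name))
    return columns
```

With the first form the extra data is only recoverable by parsing the stored text; with the second, each field would have its own column that a schema could declare up front.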

It would also be cool if the Milvus vectorstore supported some of Milvus' hybrid search functionality in the case extra schema fields are also supported 😄