
I have embeddings and text stored on my local machine and want to create a VectorStoreIndex from them, but it's not working.

Hello folks! Happy New Year!

Just one query!

I have embeddings and text stored on my local machine, and I want to create a VectorStoreIndex out of them, but it's not working. Here is the code. Can anyone please look into it?

Plain Text
import faiss
import numpy as np
from llama_index.core import Document, StorageContext, VectorStoreIndex
from llama_index.vector_stores.faiss import FaissVectorStore

dim = 1536
doc1_index = faiss.IndexFlatL2(dim)
doc1_documents = []
for i, doc in enumerate(response):
    source = doc["_source"]
    doc1_index.add(np.asarray([source["content_vector"]]))
    doc1_documents.append(Document(text=source["content"]))

doc1_vector_store = FaissVectorStore(faiss_index=doc1_index)
storage_context = StorageContext.from_defaults(vector_store=doc1_vector_store)
doc1_llama_index = VectorStoreIndex.from_vector_store(doc1_documents, storage_context=storage_context)


Output:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[28], line 3
      1 doc1_vector_store = FaissVectorStore(faiss_index=doc1_index)
      2 storage_context = StorageContext.from_defaults(vector_store=doc1_vector_store)
----> 3 doc1_llama_index = VectorStoreIndex.from_vector_store(doc1_documents, storage_context=storage_context)

File c:\Users\61097809\loganalytics\graph_venv\Lib\site-packages\llama_index\core\indices\vector_store\base.py:94, in VectorStoreIndex.from_vector_store(cls, vector_store, embed_model, **kwargs)
     87 @classmethod
     88 def from_vector_store(
     89     cls,
   (...)
     92     **kwargs: Any,
     93 ) -> "VectorStoreIndex":
---> 94     if not vector_store.stores_text:
     95         raise ValueError(
     96             "Cannot initialize from a vector store that does not store text."
     97         )
     99     kwargs.pop("storage_context", None)

AttributeError: 'list' object has no attribute 'stores_text'


Thanks!
This doesn't seem like correct syntax? from_vector_store() takes the vector store as the first argument.

Faiss probably isn't the best choice for this pattern, since it doesn't store the document contents.

You probably want something like this instead (assuming content_vector is a list of floats):

Plain Text
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.core.schema import TextNode
from llama_index.vector_stores.faiss import FaissVectorStore

# Build TextNodes that carry both the text and the pre-computed embedding,
# so nothing has to be re-embedded at index time.
nodes = []
for doc in response:
    source = doc["_source"]
    nodes.append(TextNode(text=source["content"], embedding=source["content_vector"]))

doc1_vector_store = FaissVectorStore(faiss_index=doc1_index)
storage_context = StorageContext.from_defaults(vector_store=doc1_vector_store)
doc1_llama_index = VectorStoreIndex(nodes=nodes, storage_context=storage_context)
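For reference, a minimal usage sketch (assuming an embed model is configured, e.g. via Settings.embed_model, since the query string itself still has to be embedded at query time):

Plain Text
# Minimal retrieval sketch: the stored nodes keep their pre-computed
# embeddings; only the query string is embedded here.
retriever = doc1_llama_index.as_retriever(similarity_top_k=2)
for result in retriever.retrieve("example query"):
    print(result.score, result.node.get_content()[:80])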
@Logan M Thanks for reaching out!

Happy New Year to you!

With the above code, I am receiving the following error, attached as an image.
[Attachment: image.png]
If I provide the embedding model details, it starts calculating embeddings from scratch for all the nodes, which I do not want, since I already have them pre-computed.
How do you know it's embedding from scratch? There's specific code to skip embeddings if they are already there...
https://github.com/run-llama/llama_index/blob/ad3be7fec4fa0f032661f9783c462a45cf3f6a3f/llama-index-core/llama_index/core/indices/utils.py#L154
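Roughly, that helper does something like this (a paraphrased sketch, not the exact library source):

Plain Text
# Paraphrased sketch of the skip logic: nodes that already carry an
# embedding are left alone; only the rest go to the embed model.
def embed_nodes(nodes, embed_model):
    to_embed = [n for n in nodes if n.embedding is None]
    texts = [n.get_content() for n in to_embed]
    for node, emb in zip(to_embed, embed_model.get_text_embedding_batch(texts)):
        node.embedding = emb
    return nodes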

You'll need to pass in the embed model anyway in order to do retrieval, though, which is also why it's still asking for an embed model.
You could confirm this by passing in OpenAI embeddings with a fake API key. If all the nodes have embeddings, it won't get called πŸ‘€
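For instance, a sketch of that check (the key below is deliberately fake; if every node already has an embedding, the build never hits the OpenAI API):

Plain Text
from llama_index.embeddings.openai import OpenAIEmbedding

# Deliberately fake key: an actual embedding call would fail loudly,
# so a clean build confirms the pre-computed embeddings are reused.
embed_model = OpenAIEmbedding(api_key="sk-fake-key")
doc1_llama_index = VectorStoreIndex(
    nodes=nodes, storage_context=storage_context, embed_model=embed_model
)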
Thank you so much @Logan M
It indeed helped!