
Updated 4 months ago

Duckdb vector store limitations

At a glance

The post discusses the DuckDB vector store, where the poster could only call the .delete() method and not .add() or .upsert(). Community members explain how to add data to the vector store, noting that .add() requires a list of node objects rather than document IDs. They provide examples of how to create and add nodes, and point out that the nodes must be embedded before they are added. One community member also asks how to delete data after loading a persisted index file, since in that case they don't have a document ID to pass to .delete().

I created a DuckDB vector store (https://docs.llamaindex.ai/en/stable/examples/vector_stores/DuckDBDemo/), but it only lets me use .delete and not other methods like .add or .upsert.
[Attachment: image.png]
28 comments
yea, upsert() is not a method on any vector store
I am only able to delete; add is not supported either: vector_store.delete("doc_id")
Then why is it shown in the DuckDB LlamaIndex doc?
[Attachment: image.png]
Can you share how you are adding?
It takes a list of nodes while adding
The code for adding is mentioned here
vector_store.add("doc_id")
It takes a list of nodes
Can you give example?
It would be something like this:

vector_store.add([node1,node2,node3])
Okay, and the nodes will be doc_IDs?
No, a node is the object that you create from your docs.
Okay. I am only using Document objects to build the index: index = VectorStoreIndex.from_documents(documents). Should I create nodes just to use .add, given that doc_ID already works for me with .delete?
Also, just to confirm: is node_id to nodes what doc_ID is to documents?
I am providing node1, but it's throwing the following error:
[Attachment: image.png]
You are providing the ID; a node is an object.
Try doing this:
from llama_index.core.node_parser import SentenceSplitter

# Once you have Document objects from SimpleDirectoryReader, parse them into nodes
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)

# Then pass these nodes to the `add` method
vector_store.add(nodes)
If you want to test it with a single node, try this:
from llama_index.core.schema import TextNode

node1 = TextNode(text="<text_chunk>", id_="<node_id>")
node2 = TextNode(text="<text_chunk>", id_="<node_id>")

vector_store.add([node1, node2])
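On the earlier question of how node IDs relate to document IDs, here is a minimal stand-in in plain Python. It is not LlamaIndex code: `ToyNode`, `split_into_nodes`, and `ToyStore` are hypothetical names made up for illustration. It assumes, as in LlamaIndex, that every node carries its own ID plus a reference to the document it was cut from, that `add` takes node objects, and that `delete` takes a document ID and removes every node that came from that document.

```python
from dataclasses import dataclass
from typing import Dict, List
import uuid


@dataclass
class ToyNode:
    text: str
    node_id: str
    ref_doc_id: str  # ID of the document this chunk was cut from


def split_into_nodes(doc_id: str, text: str, chunk_size: int = 20) -> List[ToyNode]:
    """Cut one document into fixed-size chunks, one node per chunk."""
    return [
        ToyNode(text=text[i:i + chunk_size], node_id=str(uuid.uuid4()), ref_doc_id=doc_id)
        for i in range(0, len(text), chunk_size)
    ]


class ToyStore:
    """In-memory stand-in for a vector store, keyed by node ID."""

    def __init__(self) -> None:
        self.nodes: Dict[str, ToyNode] = {}

    def add(self, nodes: List[ToyNode]) -> List[str]:
        # add() takes node objects, not ID strings
        for node in nodes:
            self.nodes[node.node_id] = node
        return [node.node_id for node in nodes]

    def delete(self, ref_doc_id: str) -> None:
        # delete() takes a document ID and drops all of that document's nodes
        self.nodes = {k: n for k, n in self.nodes.items() if n.ref_doc_id != ref_doc_id}


store = ToyStore()
store.add(split_into_nodes("doc_a", "a" * 50))  # 3 nodes
store.add(split_into_nodes("doc_b", "b" * 30))  # 2 nodes
store.delete("doc_a")
remaining = sorted({n.ref_doc_id for n in store.nodes.values()})
```

So one document usually maps to several nodes, which is why passing a single ID string to `add` fails while passing it to `delete` works.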
Fixed that error, but now I am getting an "embedding not set" error.
[Attachment: image.png]
Same error with your example.
[Attachment: image.png]
You need to embed the nodes before adding them
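from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding()

# Embed all node texts in one batch, then attach each vector to its node
node_texts = [node.text for node in nodes]
embeddings = embed_model.get_text_embedding_batch(node_texts)
for node, embedding in zip(nodes, embeddings):
    node.embedding = embedding

vector_store.add(nodes)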
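The "embedding not set" error can be pictured with a small plain-Python stand-in. This is only a sketch of the idea, not the DuckDB store's actual implementation: `ToyNode` and `ToyStore` are hypothetical, and `fake_embed_batch` is a made-up deterministic placeholder for a real embedding model such as `OpenAIEmbedding`. It shows a store that refuses nodes without vectors, and the same nodes being accepted once each one has an embedding attached.

```python
from dataclasses import dataclass
from typing import List, Optional
import hashlib


@dataclass
class ToyNode:
    text: str
    embedding: Optional[List[float]] = None


def fake_embed_batch(texts: List[str], dim: int = 4) -> List[List[float]]:
    """Deterministic placeholder: turn hash bytes into floats in [0, 1]."""
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([b / 255 for b in digest[:dim]])
    return vectors


class ToyStore:
    def __init__(self) -> None:
        self.rows: List[ToyNode] = []

    def add(self, nodes: List[ToyNode]) -> None:
        # Validate all nodes first so a failed add() stores nothing
        for node in nodes:
            if node.embedding is None:
                raise ValueError("embedding not set")
        self.rows.extend(nodes)


nodes = [ToyNode("first chunk"), ToyNode("second chunk")]
store = ToyStore()

# First attempt: nodes have no vectors yet, so the store rejects them
try:
    store.add(nodes)
    error = None
except ValueError as exc:
    error = str(exc)

# Second attempt: embed in one batch, attach vectors, then add
for node, vector in zip(nodes, fake_embed_batch([n.text for n in nodes])):
    node.embedding = vector
store.add(nodes)
```

When an index is built with VectorStoreIndex.from_documents, the embedding step is handled by the index, which would explain why the error only appears when calling add() on the store directly.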
Thanks, it's working now! One last thing before I go: if I pass a single file to SimpleDirectoryReader, it throws the error below, which goes away if I use the commented-out directory approach.
[Attachment: image.png]
You need to add .load_data() at the end of SimpleDirectoryReader:
documents = SimpleDirectoryReader(..).load_data()
Thanks y'all, cheers.
One last thing: how do I delete data (Document/Node objects) when I am loading a persisted index file, as in the screenshot below? In that case I don't have a document ID to pass to vector_store.delete("doc_ID"), which is what I currently use after SimpleDirectoryReader().
[Attachment: image.png]
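The thread does not answer this last question. One possible workaround, offered only as an assumption and not as something the LlamaIndex docs prescribe, is to record the document IDs yourself at ingest time so they are still available after the index is reloaded. The sketch below uses only the standard library; the manifest file name, the file names, and the IDs are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def save_manifest(path: Path, file_to_doc_id: Dict[str, str]) -> None:
    """Write the source-file -> document-ID mapping next to the persisted index."""
    path.write_text(json.dumps(file_to_doc_id, indent=2), encoding="utf-8")


def load_manifest(path: Path) -> Dict[str, str]:
    """Read the mapping back in a later session."""
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    manifest_path = Path(tmp) / "doc_ids.json"

    # At ingest time: record which ID each source file was given
    save_manifest(manifest_path, {"report.pdf": "doc-001", "notes.txt": "doc-002"})

    # In a later session, after reloading the persisted index
    doc_id = load_manifest(manifest_path)["notes.txt"]
    # doc_id is the value you would then pass to vector_store.delete(...)
```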