
Updated 2 years ago

Pinecone

AttributeError Traceback (most recent call last)
<ipython-input-10-c60941fc391f> in <cell line: 4>()
2 vectore_store = PineconeVectorStore(pinecone_index=pinecone_index)
3 storage_context = StorageContext.from_defaults(vector_store=vectore_store)
----> 4 index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

7 frames
/usr/local/lib/python3.10/dist-packages/llama_index/vector_stores/pinecone.py in add(self, embedding_results)
207 )[0]
208 entry.update({"sparse_values": sparse_vector})
--> 209 self._pinecone_index.upsert(
210 [entry], namespace=self._namespace, **self._insert_kwargs
211 )

AttributeError: 'str' object has no attribute 'upsert'
16 comments
OK, two things to check:
  1. How did you create pinecone_index for the vector store?
  2. To index a single document, you still need a list, so you can do [documents[0]]
Just helped someone else today with setting up pinecone as well, you can see the final working code here
https://discord.com/channels/1059199217496772688/1108394457725947915/1113202045323845835
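(Editor's note: the `AttributeError: 'str' object has no attribute 'upsert'` above typically means `PineconeVectorStore` received the index *name* (a string) rather than a `pinecone.Index` object, so `upsert` ends up being called on a `str`. Here is a framework-free sketch of the failure mode; `Index` and `add_entries` are hypothetical stand-ins for `pinecone.Index` and the internals of `PineconeVectorStore.add`, not real library code.)

```python
class Index:
    """Hypothetical stand-in for pinecone.Index."""
    def __init__(self, name):
        self.name = name

    def upsert(self, vectors, namespace=None):
        return {"upserted_count": len(vectors)}

def add_entries(pinecone_index, entries):
    # Mirrors the shape of the call in the traceback above:
    # self._pinecone_index.upsert([entry], ...)
    return pinecone_index.upsert(entries)

# Correct: pass an Index object.
good = add_entries(Index("quickstart"), [{"id": "1", "values": [0.1, 0.2]}])

# Incorrect: pass the bare index name.
try:
    add_entries("quickstart", [{"id": "1", "values": [0.1, 0.2]}])
except AttributeError as e:
    print(e)  # 'str' object has no attribute 'upsert'
```

The fix in real code is to construct the actual `pinecone.Index` object first and pass that object, not its name, into the vector store.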
Thanks Logan, but actually I had already set up the index before. Does it mean that for every document I want to add to an existing index, I always have to create a new index?
Also, another question: I'd like to query my documents without necessarily having to store them in Pinecone. For example, after loading the documents, I'd like to use the index
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

without necessarily having a vector store. Is that possible?
The Pinecone code worked when I created another Pinecone index. Here is the code where I'm trying to build the index directly from the loaded documents, but I still get an error:
documents = [Document(t) for t in [docs[0]]]
index = GPTVectorStoreIndex.from_documents(documents=documents)
where docs[0] is a text (from a webpage) that has page_content and metadata.
When trying to build the index directly, I get this error:
AttributeError Traceback (most recent call last)
<ipython-input-44-a22d5eeb465a> in <cell line: 1>()
----> 1 index = GPTVectorStoreIndex.from_documents(documents=documents)

4 frames
/usr/local/lib/python3.10/dist-packages/llama_index/langchain_helpers/text_splitter.py in split_text_with_overlaps(self, text, extra_info_str)
164
165 # First we naively split the large input into a bunch of smaller ones.
--> 166 splits = text.split(self._separator)
167 splits = self._preprocess_splits(splits, effective_chunk_size)
168 # We now want to combine these smaller pieces into medium size

AttributeError: 'Document' object has no attribute 'split'
When I wrap the documents in another list like this,
index = GPTVectorStoreIndex.from_documents([documents])

I get the following error.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-42-4c8d64dc07f8> in <cell line: 2>()
1 from llama_index import GPTVectorStoreIndex
----> 2 index = GPTVectorStoreIndex.from_documents([[documents]])

/usr/local/lib/python3.10/dist-packages/llama_index/indices/base.py in from_documents(cls, documents, storage_context, service_context, **kwargs)
90 with service_context.callback_manager.as_trace("index_construction"):
91 for doc in documents:
---> 92 docstore.set_document_hash(doc.get_doc_id(), doc.get_doc_hash())
93
94 nodes = service_context.node_parser.get_nodes_from_documents(documents)

AttributeError: 'list' object has no attribute 'get_doc_id'
@Logan M I'm not really sure what I'm doing wrong with the index, but I'd appreciate your help setting this up. For context: I have different articles scraped from the web, and I want to set up a separate index for each article so that I can query each article individually. I also wanted to avoid storing the articles in a vector store.
I think things are getting a little wonky here haha

The input to from_documents needs to be a list with a single dimension. And it should be a list of document objects. Each document object should have document.text, which is a string of text

Somewhere along the line, your documents are breaking these patterns πŸ˜…
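(Editor's note: the expected shape described above can be checked up front rather than discovered via a deep `AttributeError`. A minimal, framework-free sketch; `Doc` is a hypothetical stand-in for llama_index's `Document`, and `validate_documents` is an illustrative helper, not part of the library.)

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Hypothetical stand-in for llama_index's Document."""
    text: str

def validate_documents(documents):
    """Fail early with a clear message instead of a deep AttributeError."""
    if not isinstance(documents, list):
        raise TypeError("documents must be a list")
    for i, doc in enumerate(documents):
        if isinstance(doc, list):
            raise TypeError(f"documents[{i}] is a nested list; flatten it first")
        if not isinstance(getattr(doc, "text", None), str):
            raise TypeError(f"documents[{i}].text must be a string")

docs = [Doc(text="first article"), Doc(text="second article")]
validate_documents(docs)  # OK: a flat list of document objects with string .text
```

Both failure modes from the thread trip this check: `validate_documents([docs])` (a nested list, like `[[documents]]` above) and `validate_documents([Doc(text=docs[0])])` (a Document wrapped inside a Document, so `.text` is not a string, matching the `'Document' object has no attribute 'split'` error).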
@Logan M Thanks for your response. I worked around it by creating a separate index for each loaded document.
A quick one though: I have two documents that I would like to compare using LlamaIndex, to check whether they are semantically similar and get a summary of the similarities between the two. Could you point me to the right documentation for this?
Hmmm, I don't think llama-index has a built in way to do this actually.

You would need to generate an embedding for each document, and then compare them using cosine similarity or a similar metric
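(Editor's note: a minimal sketch of that comparison step in plain Python. The vectors below are toy stand-ins; real document embeddings would come from an embedding model and have hundreds of dimensions.)

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" for two documents.
doc_a = [0.1, 0.3, 0.5]
doc_b = [0.1, 0.29, 0.52]
score = cosine_similarity(doc_a, doc_b)  # close to 1.0 -> semantically similar
```

A score near 1.0 suggests the documents are semantically close; summarizing *what* is similar would still require a separate LLM call over both texts.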
Alright, I'll check on this. Thanks.