
Updated 2 years ago

Pinecone

AttributeError Traceback (most recent call last)
<ipython-input-10-c60941fc391f> in <cell line: 4>()
2 vectore_store = PineconeVectorStore(pinecone_index=pinecone_index)
3 storage_context = StorageContext.from_defaults(vector_store=vectore_store)
----> 4 index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

7 frames
/usr/local/lib/python3.10/dist-packages/llama_index/vector_stores/pinecone.py in add(self, embedding_results)
207 )[0]
208 entry.update({"sparse_values": sparse_vector})
--> 209 self._pinecone_index.upsert(
210 [entry], namespace=self._namespace, **self._insert_kwargs
211 )

AttributeError: 'str' object has no attribute 'upsert'
16 comments
OK, two things to check:
  1. How did you create pinecone_index for the vector store?
  2. To index a single document, you still need a list, so you can do [documents[0]]
Just helped someone else today with setting up pinecone as well, you can see the final working code here
https://discord.com/channels/1059199217496772688/1108394457725947915/1113202045323845835
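(Editor's note: the `AttributeError: 'str' object has no attribute 'upsert'` above typically means `PineconeVectorStore` received the index *name* (a string) rather than a `pinecone.Index` object, so `upsert` ends up being called on a `str`. Here is a framework-free sketch of the failure mode; `Index` and `add_entries` are hypothetical stand-ins for `pinecone.Index` and the internals of `PineconeVectorStore.add`, not real library code.)

```python
class Index:
    """Hypothetical stand-in for pinecone.Index."""
    def __init__(self, name):
        self.name = name

    def upsert(self, vectors, namespace=None):
        return {"upserted_count": len(vectors)}

def add_entries(pinecone_index, entries):
    # Mirrors the shape of the call in the traceback above:
    # self._pinecone_index.upsert([entry], ...)
    return pinecone_index.upsert(entries)

# Correct: pass an Index object.
good = add_entries(Index("quickstart"), [{"id": "1", "values": [0.1, 0.2]}])

# Incorrect: pass the bare index name.
try:
    add_entries("quickstart", [{"id": "1", "values": [0.1, 0.2]}])
except AttributeError as e:
    print(e)  # 'str' object has no attribute 'upsert'
```

The fix in real code is to construct the actual `pinecone.Index` object first and pass that object, not its name, into the vector store.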
Thanks Logan, but actually I had already set up the index before. Does it mean that for every document I want to add to an existing index, I always have to create a new index?
Also, another question: I'd like to query my documents without necessarily having to store them in Pinecone. For example, after loading the documents, I'd like to use the index
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

without necessarily having a vector store. Is that possible?
The Pinecone code worked when I created another Pinecone index. Here is the code where I'm trying to build the index directly from the loaded documents, but I still get an error:
documents = [Document(t) for t in [docs[0]]]
index = GPTVectorStoreIndex.from_documents(documents=documents)
where docs[0] is a text (from a webpage) that has page_content and metadata.
When trying to build the index directly, I get this error:
AttributeError Traceback (most recent call last)
<ipython-input-44-a22d5eeb465a> in <cell line: 1>()
----> 1 index = GPTVectorStoreIndex.from_documents(documents=documents)

4 frames
/usr/local/lib/python3.10/dist-packages/llama_index/langchain_helpers/text_splitter.py in split_text_with_overlaps(self, text, extra_info_str)
164
165 # First we naively split the large input into a bunch of smaller ones.
--> 166 splits = text.split(self._separator)
167 splits = self._preprocess_splits(splits, effective_chunk_size)
168 # We now want to combine these smaller pieces into medium size

AttributeError: 'Document' object has no attribute 'split'
When I wrap the documents in another list like this,
index = GPTVectorStoreIndex.from_documents([documents])

I get the following error.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-42-4c8d64dc07f8> in <cell line: 2>()
1 from llama_index import GPTVectorStoreIndex
----> 2 index = GPTVectorStoreIndex.from_documents([[documents]])

/usr/local/lib/python3.10/dist-packages/llama_index/indices/base.py in from_documents(cls, documents, storage_context, service_context, **kwargs)
90 with service_context.callback_manager.as_trace("index_construction"):
91 for doc in documents:
---> 92 docstore.set_document_hash(doc.get_doc_id(), doc.get_doc_hash())
93
94 nodes = service_context.node_parser.get_nodes_from_documents(documents)

AttributeError: 'list' object has no attribute 'get_doc_id'
@Logan M I'm not really sure what I'm doing wrong with the index, but I'd appreciate your help setting this up. For context: I have different articles scraped from the web, and I want to set up a separate index for each article so that I can query each article individually. I also wanted to avoid storing the articles in a vector store.
I think things are getting a little wonky here haha

The input to from_documents needs to be a list with a single dimension. And it should be a list of document objects. Each document object should have document.text, which is a string of text

Somewhere along the line, your documents are breaking these patterns πŸ˜…
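(Editor's note: the expected shape described above can be checked up front rather than discovered via a deep `AttributeError`. A minimal, framework-free sketch; `Doc` is a hypothetical stand-in for llama_index's `Document`, and `validate_documents` is an illustrative helper, not part of the library.)

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Hypothetical stand-in for llama_index's Document."""
    text: str

def validate_documents(documents):
    """Fail early with a clear message instead of a deep AttributeError."""
    if not isinstance(documents, list):
        raise TypeError("documents must be a list")
    for i, doc in enumerate(documents):
        if isinstance(doc, list):
            raise TypeError(f"documents[{i}] is a nested list; flatten it first")
        if not isinstance(getattr(doc, "text", None), str):
            raise TypeError(f"documents[{i}].text must be a string")

docs = [Doc(text="first article"), Doc(text="second article")]
validate_documents(docs)  # OK: a flat list of document objects with string .text
```

Both failure modes from the thread trip this check: `validate_documents([docs])` (a nested list, like `[[documents]]` above) and `validate_documents([Doc(text=docs[0])])` (a Document wrapped inside a Document, so `.text` is not a string, matching the `'Document' object has no attribute 'split'` error).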
@Logan M Thanks for your response. I worked around it by creating a separate index for each loaded document.
A quick one though: I have two documents that I would like to compare using LlamaIndex, to check whether they are semantically similar and get a summary of the similarities between the two. Could you point me to the right documentation for this?
Hmmm, I don't think llama-index has a built in way to do this actually.

You would need to generate an embedding for each document, and then compare them using cosine similarity or a similar metric
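(Editor's note: a minimal sketch of that comparison step in plain Python. The vectors below are toy stand-ins; real document embeddings would come from an embedding model and have hundreds of dimensions.)

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" for two documents.
doc_a = [0.1, 0.3, 0.5]
doc_b = [0.1, 0.29, 0.52]
score = cosine_similarity(doc_a, doc_b)  # close to 1.0 -> semantically similar
```

A score near 1.0 suggests the documents are semantically close; summarizing *what* is similar would still require a separate LLM call over both texts.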
Alright, I'll check on this. Thanks.