
Updated 2 years ago

I am having problems with using

I am having problems with using GPTPineconeIndex. I have tried 3 times to get it to index all my data, but at some point it just throws connection errors. Pretty annoying, as it takes about 5 hours to get through all of my data and the longest run I have managed before it failed is about 2 hours.

Code:
pinecone.init(api_key=pinecone_api_key, environment="us-east1-gcp")
pinecone.create_index("wed-match", dimension=1536, metric="euclidean", pod_type="p1")
index = pinecone.Index("wed-match")

documents = SimpleDirectoryReader('api/data').load_data()
INDEX = GPTPineconeIndex(documents, pinecone_index=index, chunk_size_limit=512)

Errors:
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

raise PineconeProtocolError(f'Failed to connect; did you specify the correct index name?') from e
pinecone.core.exceptions.PineconeProtocolError: Failed to connect; did you specify the correct index name?
7 comments
hey @Erik, thanks for raising this. Out of curiosity, how large is your dataset?
one thing you could try is using our insert call to insert Documents one at a time
that way, if it fails, you can always retry a single insert instead of rebuilding the whole index
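A rough sketch of what retrying a single insert could look like, in case it helps. `insert_with_retry` is a made-up helper name, not part of the library; it takes whatever insert callable you have (assumed here to be something like `index.insert`) and retries it with exponential backoff. The retry count and delays are arbitrary assumptions.

```python
import time


def insert_with_retry(insert_fn, document, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call insert_fn(document), retrying with exponential backoff on failure.

    Hypothetical helper: insert_fn is any callable that inserts one document.
    Returns the number of attempts it took; re-raises after max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            insert_fn(document)
            return attempt
        except Exception:
            # In real use, narrow this to the connection errors you actually see.
            if attempt == max_attempts:
                raise
            # Wait 1s, 2s, 4s, ... (with the default base_delay) before retrying.
            sleep(base_delay * 2 ** (attempt - 1))


# Assumed usage, given an index object that exposes an insert method:
# for doc in documents:
#     insert_with_retry(index.insert, doc)
```

The point is that a dropped connection then costs one retried insert rather than the whole multi-hour build.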
@jerryjliu0 I have around 6900 articles. When I used the simple vector index and saved it as JSON, the file was 2GB.
Could I have better luck with other vector stores maybe? I am not set on Pinecone. But I would like to move the data to a vector store, as I understand this is way more memory efficient than loading a 2GB file in my application.
Does that mean if I have 6900 articles, I would have to manually insert them one by one?
@Erik yeah, at the moment. tbh that's how build_index_from_documents works unless you're using our async functionality (we don't have async for insert yet but can add!)
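For 6900 articles, "one by one" can just be a loop. Here is a minimal sketch that also records progress in a small JSON file, so a rerun after a crash skips what was already inserted. `insert_all` and the checkpoint file name are made up for illustration, `insert_fn` is assumed to be the index's insert call, and the skip logic assumes the documents load in the same order on every run.

```python
import json
import os


def insert_all(documents, insert_fn, checkpoint_path="inserted.json"):
    """Insert documents one at a time, checkpointing after each success.

    Hypothetical helper: assumes `documents` comes back in the same order
    on every run, so a saved count is enough to know where to resume.
    Returns how many documents were inserted during this call.
    """
    done = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)["done"]

    inserted = 0
    for i, doc in enumerate(documents):
        if i < done:
            continue  # already inserted in an earlier run
        insert_fn(doc)
        inserted += 1
        # Persist progress only after the insert succeeded.
        with open(checkpoint_path, "w") as f:
            json.dump({"done": i + 1}, f)
    return inserted


# Assumed usage:
# insert_all(documents, index.insert)
```

If the process dies two hours in, running the same script again picks up from the first document that was not yet inserted.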