Is the recommended approach for using Pinecone (or another vector store) to load documents into the store with gpt_index as the interface? (i.e., start from a fresh Pinecone index, create the documents, and insert them into a GPTPineconeIndex)
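Roughly what I have in mind for that fresh-index flow is the sketch below. The index name, API key, environment, and data path are placeholders, and I may be misremembering the exact gpt_index constructor kwargs, so treat this as an assumption rather than the canonical usage:

```python
import pinecone
from gpt_index import GPTPineconeIndex, SimpleDirectoryReader

# Connect to Pinecone (key/environment/index name are placeholders).
pinecone.init(api_key="PINECONE_API_KEY", environment="us-west1-gcp")
pinecone_index = pinecone.Index("my-gpt-index")  # assumes the Pinecone index already exists

# Load documents and build the index; gpt_index handles chunking,
# embedding, and upserting the vectors into Pinecone.
documents = SimpleDirectoryReader("./data").load_data()
index = GPTPineconeIndex(documents, pinecone_index=pinecone_index)

response = index.query("What is this document about?")
print(response)
```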
And how would one interface with gpt_index when there is a pre-existing vector index in Pinecone?
I'm trying to decide whether the flexibility of interfacing with Pinecone directly is better than using the gpt_index abstraction for non-full-document Q/A (e.g., storing previous queries as a Q/A cache so LLM calls can be limited).
Assuming the best interface is GPTPineconeIndex, is the best way to serialize the index to save it to disk and load it from disk at boot (e.g., in an API server)?
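For the serialization part, this is the pattern I mean (a sketch only; the file path is a placeholder, and I'm assuming load_from_disk needs the Pinecone client passed back in since the client itself isn't serialized, but I haven't verified that kwarg):

```python
import pinecone
from gpt_index import GPTPineconeIndex

# At build time: persist the gpt_index metadata to disk.
# The vectors themselves stay in Pinecone; only the index bookkeeping is saved.
# ("index" here is the GPTPineconeIndex built in the earlier step.)
index.save_to_disk("pinecone_index.json")

# At boot (e.g., when the API process starts): reconnect to Pinecone and reload.
pinecone.init(api_key="PINECONE_API_KEY", environment="us-west1-gcp")
pinecone_index = pinecone.Index("my-gpt-index")
index = GPTPineconeIndex.load_from_disk(
    "pinecone_index.json",
    pinecone_index=pinecone_index,  # assumption: the client must be re-supplied on load
)
```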
I kind of like that gpt_index handles mapping documents/nodes to vectors in the store, so I don't have to keep track of that mapping myself, but I'm not sure of the trade-off space right now.
I tried it last night with a sample index and was able to query against it. Not sure what you meant by "preserves the original documents" - you should be able to store the text chunks inside the metadata (be aware of the chunk size limit).
What I am getting at is: right now I load pre-made indices from disk via the index.load_from_disk() method, and I'm wondering whether I have to preserve that workflow or whether attaching to the pre-existing Pinecone index (the one that was populated with gpt_index) is enough. A sketch of what I mean by "attaching" follows.
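The alternative I'm weighing is skipping load_from_disk entirely and just pointing gpt_index at the already-populated Pinecone index. I believe passing an empty document list attaches to the existing vectors without inserting anything new, since the text chunks live in the Pinecone metadata, but that's an assumption I haven't confirmed:

```python
import pinecone
from gpt_index import GPTPineconeIndex

pinecone.init(api_key="PINECONE_API_KEY", environment="us-west1-gcp")
pinecone_index = pinecone.Index("my-gpt-index")  # already populated via gpt_index

# Assumption: an empty document list means "attach to what's already there"
# rather than building or inserting anything.
index = GPTPineconeIndex([], pinecone_index=pinecone_index)

response = index.query("Does querying work without load_from_disk?")
print(response)
```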