Updated last year

Hey @Logan M,
how can I get the nodes, together with their embeddings, back from GPTVectorStoreIndex()?
Here is the sample code:
    def store_index(self, documents, payload, service_context):
        with self.lock:
            parser = service_context.node_parser
            nodes = parser.get_nodes_from_documents(documents)

            storage_context = self.get_pinecone_storage_context(payload, toquery=False)

            storage_context.docstore.add_documents(nodes)

            pc_index = GPTVectorStoreIndex(
                nodes,
                storage_context=storage_context,
                service_context=service_context,
            )

            if "oldDocumentId" in payload:
                self.delete_old_vector(payload)
                
        return pc_index

Can we extract the nodes, along with their embeddings, from pc_index?
6 comments
Yes, you can extract these details:

# Works when the index uses the default in-memory vector store;
# other stores (e.g. Pinecone) do not expose `_data`.
nodes = index.docstore.docs
embedding_dict = index.vector_store._data.embedding_dict

for node_id, node in nodes.items():
    # Print the node object
    print(node)
    # Print the embedding associated with that node
    print(embedding_dict[node_id])
Yeah, that's the only way right now. We should probably make this easier at some point.
@WhiteFang_Jr
I'm getting this error: 'PineconeVectorStore' object has no attribute '_data'
on this line: embedding_dict = index.vector_store._data.embedding_dict
Ah okay, I guess different vector stores keep their embeddings in different places.
My example only applies when GPTVectorStoreIndex is used directly, without an external vector store.

You could use a debugger to check where Pinecone stores the embeddings: put a breakpoint right at the index creation step and inspect the resulting object.
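A minimal sketch of that kind of inspection without a full debugger session: list the instance attributes of the vector store object and look for anything that might hold embeddings. `describe_attributes` and `FakeStore` are hypothetical names used only for illustration; on a real index you would pass `pc_index.vector_store` instead. Note that `vars()` only works for objects that have an instance `__dict__`.

```python
def describe_attributes(obj):
    """Return {attribute name: type name} for an object's instance attributes."""
    # vars() exposes the instance __dict__, including "private" names like _data.
    return {name: type(value).__name__ for name, value in vars(obj).items()}


class FakeStore:
    """Hypothetical stand-in for a vector store, for demonstration only."""

    def __init__(self):
        self._client = object()
        self._namespace = "demo"


if __name__ == "__main__":
    for name, type_name in describe_attributes(FakeStore()).items():
        print(f"{name}: {type_name}")
```

Calling `breakpoint()` right after the index is created and evaluating `vars(pc_index.vector_store)` at the prompt gives the same view interactively.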
Pinecone is definitely different. We would have to add a PR to attach the embeddings to the result nodes.
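Until then, a defensive version of the extraction snippet can at least fail with a clear message on stores that do not expose `_data`. This is only a sketch: `collect_node_embeddings` is a hypothetical helper, and the only attributes it relies on are the ones already used in this thread (`docstore.docs` and `vector_store._data.embedding_dict`). The `SimpleNamespace` objects are stand-ins, not real LlamaIndex classes, so the example runs on its own.

```python
from types import SimpleNamespace


def collect_node_embeddings(index):
    """Map node_id -> (node, embedding or None) for an index-like object."""
    nodes = index.docstore.docs
    # Stores such as PineconeVectorStore have no `_data`, so look it up defensively.
    data = getattr(index.vector_store, "_data", None)
    embedding_dict = getattr(data, "embedding_dict", None)
    if embedding_dict is None:
        raise AttributeError(
            f"{type(index.vector_store).__name__} does not keep embeddings in "
            "`_data.embedding_dict`; they must be read from the store itself."
        )
    return {
        node_id: (node, embedding_dict.get(node_id))
        for node_id, node in nodes.items()
    }


# Stand-in objects (assumptions for illustration) mimicking the two cases above.
in_memory_index = SimpleNamespace(
    docstore=SimpleNamespace(docs={"n1": "node one", "n2": "node two"}),
    vector_store=SimpleNamespace(
        _data=SimpleNamespace(embedding_dict={"n1": [0.1, 0.2], "n2": [0.3, 0.4]})
    ),
)
remote_index = SimpleNamespace(
    docstore=SimpleNamespace(docs={"n1": "node one"}),
    vector_store=SimpleNamespace(),  # no `_data`, like the Pinecone case
)

pairs = collect_node_embeddings(in_memory_index)
```

With the in-memory stand-in, `pairs` maps each node ID to its node and embedding; with `remote_index`, the helper raises `AttributeError` instead of failing deep inside the attribute chain.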