At a glance
Can anyone explain what the id_to_text_map is? Also, I suppose the query vector is the embedding of the prompt given to the index. I'm also thinking this will present a challenge when the query is complex and needs to be broken down further before it reaches the index, which will often be the case.

Plain Text
from llama_index import download_loader
import os

PineconeReader = download_loader('PineconeReader')

# the id_to_text_map specifies a mapping from the ID specified in Pinecone to your text.
id_to_text_map = {
    "id1": "text blob 1",
    "id2": "text blob 2",
}
# ...
# the query vector is the embedding of your query text;
# replace the placeholders with real floats matching the index's dimension
query_vector = [n1, n2, n3, ...]

reader = PineconeReader(api_key=api_key, environment="us-west1-gcp")
documents = reader.load_data(
    index_name='quickstart',
    id_to_text_map=id_to_text_map,
    top_k=3,
    vector=query_vector,
    separate_documents=True
)
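
On the query vector: it would typically be produced by the same embedding model that generated the vectors stored in Pinecone. A minimal sketch using llama_index's OpenAI embedding wrapper (the query string here is made up):

Plain Text
from llama_index.embeddings.openai import OpenAIEmbedding

# embed the query with the same model used at indexing time, so the
# vector lives in the same space as the vectors stored in Pinecone
# (requires OPENAI_API_KEY to be set in the environment)
embed_model = OpenAIEmbedding()
query_vector = embed_model.get_query_embedding("What does the report say about revenue?")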


Can anyone shed some light here? It would be super useful.
10 comments
Yea, the vector DB readers aren't the most useful.

Normally, you'd skip the reader and let LlamaIndex create the index in Pinecone + send queries to it to retrieve the top-k results
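
For example, a sketch of that flow with the legacy llama_index APIs (the data directory and query string are made up):

Plain Text
import pinecone
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores import PineconeVectorStore

pinecone.init(api_key=api_key, environment="us-west1-gcp")

# point LlamaIndex at Pinecone so embeddings are written there directly
vector_store = PineconeVectorStore(pinecone.Index("quickstart"))
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# hypothetical local data directory
documents = SimpleDirectoryReader("./data").load_data()
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

# querying retrieves the top-k matches from Pinecone under the hood
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What is the document about?")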

If questions are complex, we have a few abstractions for this, the main one being the sub question query engine (which generates sub-questions from an initial complex question)

Video + notebook are here: https://gpt-index.readthedocs.io/en/latest/guides/tutorials/discover_llamaindex.html#subquestionqueryengine-10k-analysis
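
A rough sketch of wiring that up (the tool name and description are placeholders, and `index` is assumed to be an existing vector index):

Plain Text
from llama_index.query_engine import SubQuestionQueryEngine
from llama_index.tools import QueryEngineTool, ToolMetadata

# expose an existing index as a tool the engine can route sub-questions to
query_engine_tools = [
    QueryEngineTool(
        query_engine=index.as_query_engine(),
        metadata=ToolMetadata(
            name="docs",  # placeholder tool name
            description="Answers questions about the ingested documents",
        ),
    ),
]

sub_question_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
)
# the engine generates sub-questions, answers each against the tools,
# then synthesizes a final response
response = sub_question_engine.query("Compare topic A and topic B across the docs")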
I was just exploring the reader, thinking it could solve my problem. I'm still stuck on this.

How do I ingest data without creating the documents to put into the index? I already have vectors stored in Pinecone, but now I want to connect to them. Passing an empty list in doesn't seem to give me anything.

GPTVectorStoreIndex([], storage_context=storage_context)
Were the vectors on Pinecone inserted using llama-index? If not, they won't be in the correct format, and you'll still need to create document objects
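
In that case, re-ingesting through LlamaIndex would look roughly like this (raw_texts is a placeholder for wherever your original text lives):

Plain Text
import pinecone
from llama_index import Document, GPTVectorStoreIndex, StorageContext
from llama_index.vector_stores import PineconeVectorStore

# raw_texts is hypothetical: the source text behind your existing vectors
documents = [Document(text=t) for t in raw_texts]

# inserting via LlamaIndex stores the node metadata it needs at query time
vector_store = PineconeVectorStore(pinecone.Index("quickstart"))
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)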
Yes. They were inserted through Llama Index
Oh wait. Actually, I confused LangChain and Pinecone here. I used LangChain to make that update.
ooo yea that might be an issue then
If anyone comes across this problem, it's clearly written in the docs. I missed it.
https://gpt-index.readthedocs.io/en/latest/how_to/index/vector_store_guide.html#connect-to-external-vector-stores-with-existing-embeddings



Plain Text
# Connect to the existing Pinecone index and build an index for querying
import pinecone
from llama_index.vector_stores import PineconeVectorStore
from llama_index.indices.vector_store import VectorStoreIndex

PINECONE_API_KEY = API_KEY  # your Pinecone API key
PINECONE_API_ENV = API_ENV  # e.g. "us-west1-gcp"

pinecone.init(api_key=PINECONE_API_KEY, environment=PINECONE_API_ENV)

vector_store = PineconeVectorStore(pinecone.Index("ds-websources"), namespace=appID)

# ... Create the service context
# ...

ds_websources_index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, service_context=service_context
)
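
From there, it can be queried like any other index (the question is just an example):

Plain Text
# retrieval now runs against the pre-existing Pinecone vectors
query_engine = ds_websources_index.as_query_engine()
response = query_engine.query("What do the web sources say about pricing?")
print(response)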