
Updated last year

i have a csv and pdf file which has some

At a glance

The community member has a CSV and PDF file with text data, which they have read using LlamaIndex and stored as vectors in Pinecone. They now want to be able to query these vectors to find the most similar ones to a given sentence. However, they are facing an issue when trying to directly query Pinecone using the OpenAI embeddings of the input sentence. The community member is looking for a way to pass a string and get the top k most similar vectors from the Pinecone database, along with their metadata, without generating a full response.

In the comments, another community member suggests using the index.as_retriever() method to retrieve the nodes and their metadata without doing a language model call.

I have a CSV and a PDF file containing text data. I read them with LlamaIndex's SimpleDirectoryReader and stored them in Pinecone as vectors. I didn't have to specify which embedding model to use, but the LLM for the service context was OpenAI. Now I want to be able to query the vectors I upserted: basically, I pass a sentence and need to get back the vectors most similar to it.
Is there any way I can do that?
I thought of querying Pinecone directly, but when I convert the text to embeddings using OpenAI, it throws an error when I pass the result as the vector for the query filter.
Is there any way I can pass a string and get just the top-k results from the Pinecone DB (not the generated answer, just the vectors along with their metadata)?
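One way to do this without going through LlamaIndex at all is to embed the sentence yourself and call Pinecone's `query()` directly. A minimal sketch, assuming the current `openai` and `pinecone` client packages, API keys in the environment, and a hypothetical index name `"my-index"`:

```python
# Hedged sketch of querying Pinecone directly. The index name "my-index",
# the embedding model, and the API keys in the environment are assumptions.
def to_query_vector(embedding_response) -> list[float]:
    # Pinecone's query() expects a flat list of floats; passing the whole
    # OpenAI response object (instead of the embedding inside it) is a
    # common cause of the error described above.
    return list(embedding_response.data[0].embedding)

def top_k_matches(text: str, k: int = 5):
    from openai import OpenAI      # imported lazily; needs OPENAI_API_KEY
    from pinecone import Pinecone  # needs PINECONE_API_KEY

    resp = OpenAI().embeddings.create(model="text-embedding-ada-002", input=text)
    index = Pinecone().Index("my-index")  # hypothetical index name
    # include_metadata=True returns each match's stored metadata with its score
    return index.query(vector=to_query_vector(resp), top_k=k, include_metadata=True)
```

Note that the embedding model used for querying has to match the one used at upsert time, or the similarity scores will be meaningless.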
2 comments
You can do something like
Python
retriever = index.as_retriever()
nodes = retriever.retrieve("What does the document say?")
print(nodes)


This will retrieve just the nodes + metadata without doing an LLM call
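If more than the default number of nodes is needed, `as_retriever()` accepts `similarity_top_k`. A small sketch, assuming `index` is the VectorStoreIndex already built over the CSV/PDF documents; the wrapper `top_k_nodes` is a hypothetical helper, not part of LlamaIndex:

```python
# Sketch: retrieve the k most similar nodes with their scores and metadata.
# `index` is assumed to be an existing LlamaIndex VectorStoreIndex.
def top_k_nodes(index, query: str, k: int = 5):
    # similarity_top_k controls how many nodes come back (the default is 2);
    # retrieve() does vector search only, with no LLM call.
    retriever = index.as_retriever(similarity_top_k=k)
    nodes = retriever.retrieve(query)
    # each result is a NodeWithScore: similarity score plus the stored node
    return [(n.score, n.node.metadata) for n in nodes]
```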
thank you very much