It sounds slow, but it's surprisingly fast -- even with 10,000 vectors, the search should finish in a second or so
Thanks Logan!
query_task = index_to_query.as_retriever(similarity_top_k=2).aretrieve(QueryBundle(query_str=document.text))
any way to get the query embedding [floats] from this kind of aretrieve?
context in the photo - I want to create an OpenInference record from aretrieve calls (i.e. without using .query)
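For reference, a rough sketch of that aretrieve pattern awaited over several documents (index_to_query and documents are the names from the surrounding code; the QueryBundle import path may vary by llama-index version):

```python
import asyncio
from llama_index import QueryBundle

async def retrieve_all(index_to_query, documents):
    # one retriever, one aretrieve coroutine per document, run concurrently
    retriever = index_to_query.as_retriever(similarity_top_k=2)
    tasks = [retriever.aretrieve(QueryBundle(query_str=doc.text)) for doc in documents]
    return await asyncio.gather(*tasks)

# results[i] is the list of NodeWithScore objects retrieved for documents[i]
results = asyncio.run(retrieve_all(index_to_query, documents))
```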
hmm, the only way to get the embeddings will be using a callback handler.
Actually, someone just recently contributed an OpenInference handler
One sec, I can link the notebook
thing is, open_inf_callback only works with .query (not .retrieve)
so I'm forming my own OpenInference record
ah right, because the callback isn't inside the retrieve method
yeah. so I'm just trying to grab that query_embedding from the execution path of .retrieve
quick hack: use a query engine, but set response_mode="no_text"
i.e. index.as_query_engine(response_mode="no_text", similarity_top_k=2)
This will skip calling the LLM, and only do the retrieve step. It will also hit the callback handler
Interesting! I'll try this - I'm not sure it'll work with OpenInferenceCallbackHandler out of the box with that though, eh?
It should I think? But let me know how it goes haha
This added records to the OpenInference callback handler buffers
which is neat because a) I can use the buffers to create my OpenInference record, and b) it didn't make an LLM call
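Putting the hack together, the flow looks roughly like this (a sketch: it assumes the handler's flush_query_data_buffer()/flush_node_data_buffer() methods from the OpenInference notebook, and "./data" is a placeholder directory):

```python
from llama_index import SimpleDirectoryReader, ServiceContext, VectorStoreIndex
from llama_index.callbacks import CallbackManager, OpenInferenceCallbackHandler

# register the OpenInference handler so query runs land in its buffers
handler = OpenInferenceCallbackHandler()
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([handler])
)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# no_text skips LLM synthesis: only the embedding + retrieve steps run,
# and they still pass through the callback manager
query_engine = index.as_query_engine(response_mode="no_text", similarity_top_k=2)
response = query_engine.query("example query text")

# the buffers now hold query records (query text + embedding) and retrieved-node records
query_records = handler.flush_query_data_buffer()
node_records = handler.flush_node_data_buffer()
```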
@Logan M one more thing
here's my index docstore docs:
there doesn't seem to be a way to query/retrieve against a subset of a VectorStoreIndex / SimpleDocumentStore based on metadata keys
do you have a hack for this?
I've done my own filtering funcs e.g. pictured - but it's not efficient
e.g. this doesn't seem optimal
Yea we haven't implemented metadata filtering for the base vector store sadly
So custom approach is probably best for now, until we figure it out
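One shape that custom approach can take (just a sketch: over-fetch from the whole index, then post-filter on a metadata key; the key/value names are illustrative, and node.metadata assumes a recent llama-index version):

```python
def retrieve_with_metadata(index, query_text, key, value, top_k=2, fetch_k=20):
    # over-fetch from the full index, then keep only nodes whose metadata matches
    retriever = index.as_retriever(similarity_top_k=fetch_k)
    nodes = retriever.retrieve(query_text)
    matching = [n for n in nodes if n.node.metadata.get(key) == value]
    return matching[:top_k]

# hypothetical usage: only keep nodes tagged with {"source": "faq"}
results = retrieve_with_metadata(index, "how do refunds work?", "source", "faq")
```

It still scores every vector (so it's no more efficient than filtering up front), but it keeps the retriever API untouched.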
@Logan M Thank you. Also gotta ask: best way to send in several query texts at once?
I'm doing asyncio.gather(query_engine.aquery(text))
here on 4 docs
but as you can see in the terminal, it does 4 separate OpenAI embeddings calls
Hmm I think the asyncio approach is still best
Thankfully, at least in newer versions of llama-index, the embedding calls are also async, so it should be fairly efficient
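For reference, the batching pattern being described looks roughly like this (query_engine and the query texts are placeholders):

```python
import asyncio

async def run_queries(query_engine, texts):
    # one aquery coroutine per text; gather runs them concurrently,
    # though each one still makes its own embeddings call
    tasks = [query_engine.aquery(t) for t in texts]
    return await asyncio.gather(*tasks)

responses = asyncio.run(
    run_queries(query_engine, ["query 1", "query 2", "query 3", "query 4"])
)
```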