Can someone explain the embeddings

Can someone explain the embeddings module for GPTIndex? How is it different from doing cosine similarity across a set of vectors and grabbing top k results and injecting that into the prompt?

7 comments

jerryjliu0

Hi @lucasneg, that's basically the gist of our embeddings support across our indices: https://gpt-index.readthedocs.io/en/latest/how_to/embeddings.html

However, 1) you don't have to worry about token limitations (you can feed in more examples than fit within the max prompt size and still get an answer), and 2) with GPT Index tools you can try out different indices and combine indices to better synthesize an answer over your data.
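For reference, the baseline described in the question (cosine similarity across a set of vectors, grab the top k, inject into the prompt) could be sketched roughly like this. This is only an illustration in plain Python, not GPT Index code: `cosine_similarity`, `top_k_chunks`, and `build_prompt` are hypothetical helper names, and the embeddings are assumed to be precomputed lists of floats.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunk_embeddings, chunks, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, emb), chunk)
        for emb, chunk in zip(chunk_embeddings, chunks)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(question, context_chunks):
    """Inject the retrieved chunks into a single prompt (hypothetical template)."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

Note that this sketch does nothing about the prompt size: if the top-k chunks together exceed the model's limit, the prompt simply gets too long.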

lucasneg

How does the accuracy compare to the tree index? It looks interesting, but it seems way too expensive for production usage.

lucasneg

is the tokenizer being used? in theory, couldn't I hit a token limit if I inject too much context?

jerryjliu0

Re 1): empirically, an embeddings-based approach is better than the tree index for retrieving top-k documents over a large corpus (you can also specify your own embeddings per document instead of having us call OpenAI for you, e.g. here: https://twitter.com/gpt_index/status/1608975108496068609?s=20&t=V4a7LUq7Bi7O_PvxN_eR8g).

Re 2): yeah, one of the main points of GPT Index is that we build a data structure over your data, so even if the context is larger than the token limit, we handle that under the hood for you by doing iterative LLM calls per text chunk (you can see how each index works here: https://gpt-index.readthedocs.io/en/latest/guides/index_guide.html).
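To make the "iterative LLM calls per text chunk" idea concrete, here is a rough sketch of one way it could work. This is not GPT Index's actual implementation: `llm` is a hypothetical callable that takes a prompt string and returns a string, the prompt wording is made up, and chunks are split by character count as a simple stand-in for real token counting.

```python
def split_into_chunks(text, max_chars):
    """Split text into pieces of at most max_chars characters.

    Assumption: character count is a crude proxy for token count.
    """
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def answer_iteratively(question, context, llm, max_chars=2000):
    """Answer a question over a context that may not fit in one prompt.

    The first chunk produces an initial answer; each later chunk is shown
    to the LLM together with the current answer so it can be refined.
    Returns None if the context is empty.
    """
    answer = None
    for chunk in split_into_chunks(context, max_chars):
        if answer is None:
            prompt = f"Context:\n{chunk}\n\nQuestion: {question}\nAnswer:"
        else:
            prompt = (
                f"Question: {question}\n"
                f"Existing answer: {answer}\n\n"
                f"Additional context:\n{chunk}\n\n"
                "Refine the existing answer if the additional context helps; "
                "otherwise repeat it unchanged.\nAnswer:"
            )
        answer = llm(prompt)  # one LLM call per chunk
    return answer
```

The trade-off is cost: every chunk adds an LLM call, so a context spanning N chunks takes N calls instead of one.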

lucasneg

By accuracy I mean in terms of finding the correct context. I find that embeddings sometimes find the most similar document, but not necessarily the most relevant one.

lucasneg

but nice, will check out the iterative logic

jerryjliu0

👍 I haven't done extensive experiments, but would love your feedback on where it's not working, to guide future improvements.

Re: iterative logic, check out the response synthesis section too: https://gpt-index.readthedocs.io/en/latest/guides/index_guide.html. You can specify different response modes for each query.
