How does embedding work in the context of an LLM? I understand that embedding 'digests' text into vectors that can be used for matching/search, but what role does the LLM play in response generation? Say I have a Python project 'digested' through embeddings (using OpenAI's embedding API, via llamaindex); when I ask questions about the code, my project obviously doesn't have all the info about Python itself.
Embeddings are used to retrieve relevant content, which is then passed to the LLM as context alongside your question.
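A minimal sketch of that retrieval step, assuming your code chunks have already been embedded and stored; `embed()` here is a placeholder for whatever embedding API you call (llamaindex wraps OpenAI's for you), and `retrieve`/`chunk_vectors` are just illustrative names:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your embedding API here and return its vector."""
    raise NotImplementedError

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(question: str, chunks: list[str], chunk_vectors: list[np.ndarray], top_k: int = 3) -> list[str]:
    """Rank pre-embedded code/text chunks by similarity to the question."""
    q_vec = embed(question)
    scores = [cosine_similarity(q_vec, v) for v in chunk_vectors]
    best = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in best]
```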
An LLM is a model trained on enough data that it has broad general knowledge; given a question and some context, it can answer in a helpful way.
For example, an LLM has probably already been trained on tons of Python code and documentation. By asking it a question and giving it some context from your specific Python codebase, it can generate a helpful answer.
In the end, everything is prompt engineering: you combine the similarity search results and your question into a single prompt and send it to the LLM API for an answer.
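A rough sketch of that last step using the OpenAI chat API directly (llamaindex does the equivalent for you under the hood); the model name, prompt wording, and `answer` helper are just placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer(question: str, retrieved_chunks: list[str]) -> str:
    # Combine the similarity-search results with the question in one prompt.
    context = "\n\n".join(retrieved_chunks)
    prompt = (
        "Answer the question using the code context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```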