How does embedding work in the context of an LLM? I understand that embedding 'digests' text into vectors that can be used for matching/search, but what role does the LLM play in response generation? Say I have a Python project 'digested' through embeddings (using OpenAI's embedding API, via llamaindex); when I ask questions about the code, my project obviously doesn't have all the info about Python itself.
Embeddings are used to retrieve relevant content, which is then passed to the LLM as context alongside your question.
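A minimal sketch of that retrieval step, assuming your code chunks have already been embedded and stored; `embed()` here is a placeholder for whatever embedding API you call (llamaindex wraps OpenAI's for you), and `retrieve`/`chunk_vectors` are just illustrative names:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your embedding API here and return its vector."""
    raise NotImplementedError

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(question: str, chunks: list[str], chunk_vectors: list[np.ndarray], top_k: int = 3) -> list[str]:
    """Rank pre-embedded code/text chunks by similarity to the question."""
    q_vec = embed(question)
    scores = [cosine_similarity(q_vec, v) for v in chunk_vectors]
    best = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in best]
```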
An LLM is a model trained on enough data that it has broad general knowledge; given a question and some context, it can answer in a helpful way.
For example, an LLM has probably already been trained on tons of Python code and documentation. By asking it a question and giving it some context from your specific Python codebase, it can generate a helpful answer.
In the end, everything is prompt engineering: you combine the similarity search results and your question into a single prompt and send it to the LLM API for an answer.
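A rough sketch of that last step using the OpenAI chat API directly (llamaindex does the equivalent for you under the hood); the model name, prompt wording, and `answer` helper are just placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer(question: str, retrieved_chunks: list[str]) -> str:
    # Combine the similarity-search results with the question in one prompt.
    context = "\n\n".join(retrieved_chunks)
    prompt = (
        "Answer the question using the code context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```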