Hi everyone, I'm new here but have been scouring every document I can find, and I still have a few things I'm confused about. I'd appreciate any help! I'm confused by the distinction between the LLM predictor and the embed model. The docs state the LLM predictor is used to create the index, which I assumed meant generating embeddings, but that doesn't seem right. One page says the default LLM predictor is davinci, but elsewhere I see ada-002 is the default for embeddings. I understand how davinci would be used when querying, but I'm confused about how it's used during index construction. If anyone can clarify, I'd be grateful!
text-davinci-003 is mostly used at query time, but it's also used to construct certain index types (e.g. the knowledge graph and tree indexes, where the LLM extracts triplets or writes summaries while the index is being built)
ada-002 is usually used for both queries and index construction. For example, in GPTSimpleVectorIndex, ada-002 creates an embedding vector for each "text chunk" of your input documents
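Roughly, you can see where each model plugs in if you set both explicitly. This is just a sketch assuming the llama_index API of that era (LLMPredictor, ServiceContext, GPTSimpleVectorIndex); exact imports and signatures may differ in your version:

```python
from langchain.llms import OpenAI
from llama_index import GPTSimpleVectorIndex, LLMPredictor, ServiceContext, SimpleDirectoryReader
from llama_index.embeddings.openai import OpenAIEmbedding

# LLM used to synthesize answers at query time
# (and to build tree / knowledge graph indexes)
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))

# Embedding model used to embed text chunks at build time
# and the query string at query time (defaults to text-embedding-ada-002)
embed_model = OpenAIEmbedding()

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embed_model,
)

documents = SimpleDirectoryReader("data").load_data()
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)

response = index.query("What does the doc say about X?")
```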
Then, at query time, your query text is embedded and cosine similarity finds the closest matching text chunk(s), which are sent to the LLM (davinci) to write the final answer
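Conceptually, the retrieval step works like this (not the actual llama_index code, just the idea, using made-up helper names):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_top_k(query_embedding, chunk_embeddings, chunks, k=2):
    # Score every stored chunk embedding against the query embedding,
    # then return the k highest-scoring chunks
    scores = [cosine_similarity(query_embedding, e) for e in chunk_embeddings]
    top = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
    return [chunks[i] for i in top]

# The retrieved chunks are then stuffed into a prompt and passed to the
# LLM (davinci) to synthesize the final response.
```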