I'd also like some consistency, and since the main models (e.g. GPT-4) are constantly being updated, I'd rather avoid those.
Or which vector store method should I use if I want answers in under 10 seconds, while still letting the LLM generate the answer from multiple sources?
Probably a better question is: what types of data are you indexing, what types of questions are you asking, and how much data do you have?
All of those questions influence the approach to take
Thanks! I'm scraping manuals for a financial application. My chatbot should be able to answer questions for which the answer can be found in these manuals. I have 150 text files with an average size of 3 kB.
Also, these manuals contain an FAQ section. How can I make sure my chatbot answers those questions more reliably? The question-answer pairs are usually quite short and might get lost in the noise due to the chunk size.
@WhiteFang_Jr Could you maybe help me out further here?
I think for starters you can try different chunk sizes like 512, 256, or 1024 and see which one gives you better results without the important bits getting missed.
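A minimal sketch of how you could compare chunk sizes with LlamaIndex (assuming the current `llama_index.core` package layout, a local `./manuals` folder, and a sample question; adjust those to your setup):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
from llama_index.core.node_parser import SentenceSplitter

# Assumed location of the scraped manual text files
documents = SimpleDirectoryReader("./manuals").load_data()

# Rebuild the index with a few candidate chunk sizes and compare the answers
for chunk_size in (256, 512, 1024):
    Settings.node_parser = SentenceSplitter(chunk_size=chunk_size)
    index = VectorStoreIndex.from_documents(documents)
    response = index.as_query_engine().query("How do I reset my password?")  # sample question
    print(chunk_size, str(response)[:200])
```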
Since you're open to using OpenAI, I'd suggest parsing your files with LlamaParse using GPT-4o. That can also help extract the details more accurately.
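A rough sketch of what that could look like (this assumes a LlamaCloud API key in `LLAMA_CLOUD_API_KEY` and an OpenAI key in `OPENAI_API_KEY`; the `gpt4o_mode` option is how some llama-parse versions route parsing through GPT-4o, so check your installed version):

```python
from llama_parse import LlamaParse

# Assumes LLAMA_CLOUD_API_KEY and OPENAI_API_KEY are set in the environment.
# gpt4o_mode is available in some llama-parse releases; verify against your version.
parser = LlamaParse(
    result_type="markdown",  # or "text"
    gpt4o_mode=True,
)

# Hypothetical file path; point this at one of the scraped manuals
documents = parser.load_data("./manuals/user_guide.txt")
```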
Thanks! And then which embedding model would you suggest?
Oh nvm, that way you use GPT-4o as your "embedding" model as well. Can I also use LlamaParse without a LlamaCloud API key, i.e. with only the OpenAI API key? I'm implementing this for a company, and there's a lot of regulation to get through before I'd be allowed to use the LlamaCloud API.
No, it will not use GPT-4o as the embedding model.
LlamaParse only handles the extraction part for your text files.
For the embedding model, I would suggest you use text-embedding-3-large;
it is much better.
After parsing, the embedding-generation step takes place, and that's where you'll need the embedding model.
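A minimal sketch of plugging that embedding model in (assuming the llama-index OpenAI embeddings integration is installed and `OPENAI_API_KEY` is set):

```python
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Use OpenAI's text-embedding-3-large for the embedding-generation step
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")
```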
Ah okay thanks, and then I just use the standard VectorStoreIndex class for that?
Is the API key required for the parsing, though?
Yes, it is required, since parsing only happens once the request is authenticated on the cloud.
Ah, that's unfortunately not an option yet. In that case, would you just suggest playing around with the chunk size used by VectorStoreIndex?
And then use GPT-4o and text-embedding-3-large?
Yeah, that should give better answers. Also, if you find the responses don't meet your requirements, try reducing the chunk size; the default is 1024.
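Putting the pieces from this thread together, a sketch of the whole setup might look like this (the `./manuals` folder, the `similarity_top_k` value, and the sample question are assumptions; only the OpenAI key is needed since LlamaParse is skipped):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# GPT-4o for answering, text-embedding-3-large for retrieval
Settings.llm = OpenAI(model="gpt-4o")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")
Settings.chunk_size = 512  # reduced from the default of 1024

# Load the ~150 scraped manual files and build the index
documents = SimpleDirectoryReader("./manuals").load_data()
index = VectorStoreIndex.from_documents(documents)

# Let the answer draw on several chunks (helps when FAQ entries are short)
query_engine = index.as_query_engine(similarity_top_k=4)
print(query_engine.query("How do I reset my password?"))
```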
Alright, thank you very much for the help! Have a good one!
If I have more than 40,000 files, of which about 80% are 1 kB, 10% are 2-3 kB, and the rest are < 10 kB, what's the best chunk size?