At a glance

The community members are experiencing an issue with the LlamaIndex library when trying to load a PDF file into a Postgres database using the Mistral embedding model. They are encountering an error message about going over the token limit. The community members have tried splitting the document into pages and using the TokenTextSplitter, but the only "solution" they found was to lower the insert_batch_size parameter, which they believe should only impact the database and not the embedding model.

In the comments, a community member asks what embedding model class is being used, and the original poster responds that they are using MistralAIEmbedding from llama_index.embeddings.mistralai. Another community member suggests that the issue may be related to the chunk size being too large, and recommends setting the embed_batch_size to 20 or a similar value on the embedding model.

Hello, we have a little problem with LlamaIndex: when we try to load a PDF file into a database (Postgres on Neon) with Mistral's embedding model, we get an error message about going over the token limit. We tried splitting the document per page and using the TokenTextSplitter, with no good result. The only "solution" was to lower the insert_batch_size parameter (to 21 at most), but that should only impact the database, not the embedding model, right? 😅
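
For context, here is a minimal sketch of the kind of ingestion pipeline the question describes. The file name, connection details, table name, and chunk sizes are placeholder assumptions, not taken from the post:

```python
from pathlib import Path

from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.core.node_parser import TokenTextSplitter
from llama_index.embeddings.mistralai import MistralAIEmbedding
from llama_index.readers.file import PDFReader
from llama_index.vector_stores.postgres import PGVectorStore

# Load the PDF; PDFReader returns one Document per page by default.
documents = PDFReader().load_data(Path("contract.pdf"))

embed_model = MistralAIEmbedding(model_name="mistral-embed", api_key="...")

# Postgres (Neon) vector store; connection parameters are placeholders.
vector_store = PGVectorStore.from_params(
    host="...", port="5432", database="...", user="...", password="...",
    table_name="documents",
    embed_dim=1024,  # mistral-embed produces 1024-dimensional vectors
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
    embed_model=embed_model,
    transformations=[TokenTextSplitter(chunk_size=512, chunk_overlap=50)],
    insert_batch_size=21,  # the workaround the poster landed on
)
```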
3 comments
What embedding model class are you using?

The insert batch size will become the upper bound on the embed_batch_size
We used MistralAIEmbedding from llama_index.embeddings.mistralai; that was for our project at the LLM x Law hackathon in Paris yesterday
You didn't share the exact error, but my first thought was that your chunk size was too big

If it truly is the batch size, you can set embed_batch_size=20 or similar on the embed model
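
A sketch of that suggestion, with a placeholder API key. embed_batch_size is the constructor argument that caps how many chunks are sent to the embeddings API per request, which keeps the combined token count of each batch under the service limit:

```python
from llama_index.embeddings.mistralai import MistralAIEmbedding

# Cap the number of chunks embedded per API call.
embed_model = MistralAIEmbedding(
    model_name="mistral-embed",
    api_key="...",
    embed_batch_size=20,
)
```

This targets the embedding request directly, rather than capping it indirectly through insert_batch_size, which is meant to control how many nodes are written to the database per batch.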