Hi guys, I'm running into an issue in production. I can't tell if it's because my chunks are too big when requesting embeddings, or because my batch size in the Qdrant client is too big (right now the batch size is 16). If the chunks are too big, shouldn't that be handled by the NodeParser? I'm currently using SemanticSplitterNodeParser.

"2024-07-17 14:03:22,630 - ERROR - 13 - ThreadPoolExecutor-0_0 - root - index_asset - index_asset.py:39 - index_asset() >>> Error indexing asset into Qdrant: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens, however you requested 10125 tokens (10125 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}"
4 comments
@Logan M @WhiteFang_Jr hey guys, any thoughts on this?
Semantic splitter does not account for chunk size πŸ‘€
You might need to run the outputs through a secondary splitter to enforce a max size
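For example, in LlamaIndex you could chain the semantic splitter with a SentenceSplitter in an IngestionPipeline so every chunk is capped at a fixed token count. A minimal sketch, assuming `documents` is your already-loaded document list; the embed model choice and the chunk_size of 1024 are illustrative, not from the thread:

```python
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SemanticSplitterNodeParser, SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# First pass: split on semantic breakpoints (no hard size limit)
semantic_splitter = SemanticSplitterNodeParser(
    buffer_size=1,
    breakpoint_percentile_threshold=95,
    embed_model=embed_model,
)

# Second pass: hard token cap, so no chunk can exceed the
# embedding model's 8192-token context window
safety_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=64)

# Transformations run in order, so the semantic splitter's output
# is re-split wherever it exceeds the cap
pipeline = IngestionPipeline(transformations=[semantic_splitter, safety_splitter])
nodes = pipeline.run(documents=documents)
```

SentenceSplitter measures chunk_size in tokens, so 1024 leaves plenty of headroom under the 8192-token limit, and chunks already under the cap pass through essentially unchanged.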
@Logan M thank you, as always you're the GOAT.