A community member is having trouble indexing a large number of documents with VectorStoreIndex, hitting an error that the model's context window is too small. Another community member suggests splitting the documents into smaller chunks that fit within the model's context window before indexing them. However, the original poster still faces issues even after reducing the chunk size, and is looking for a good tutorial on this topic.
Looking to index a large number of Documents with VectorStoreIndex, but I always get an error that the context window of the model is too small. Any example of how to mitigate this? I would highly appreciate it.
Split your documents into smaller chunks that fit within the model's context window before indexing them. This will ensure that each chunk of text is small enough to be processed by the model.
Remember to handle the splitting carefully: cutting in the middle of a sentence, or across closely related context, can degrade the quality of the embeddings the model generates for each chunk.
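For illustration, here is a minimal sketch of that idea in LlamaIndexTS. The `splitIntoChunks` helper, its sentence regex, and the 2000-character budget are assumptions made up for this example, not part of the library; `Document` and `VectorStoreIndex.fromDocuments` are standard LlamaIndexTS entry points.

```typescript
import { Document, VectorStoreIndex } from "llamaindex";

// Naive sentence-aware splitter (illustrative assumption): packs whole
// sentences into chunks of at most `maxChars` characters, so no chunk is
// cut mid-sentence. A single sentence longer than `maxChars` still becomes
// its own oversized chunk.
function splitIntoChunks(text: string, maxChars: number): string[] {
  const sentences = text.match(/[^.!?]+[.!?]+\s*/g) ?? [text];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}

async function main() {
  const longText = "..."; // placeholder: the full text of one large document
  // One Document per chunk, each comfortably below the model's context window.
  const documents = splitIntoChunks(longText, 2000).map(
    (chunk) => new Document({ text: chunk })
  );
  const index = await VectorStoreIndex.fromDocuments(documents);
  console.log(`Indexed ${documents.length} chunks.`);
}

main().catch(console.error);
```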
Hi, thank you so much for the quick answer. Could you give me an example based on my LlamaIndexTS code? I'm parsing a few documents:

```typescript
const serviceContext = serviceContextFromDefaults({
  chunkSize: CHUNK_SIZE,
  chunkOverlap: CHUNK_OVERLAP,
});
```
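For reference, a minimal sketch of how that snippet could be wired into a full indexing run, assuming a LlamaIndexTS version that exposes `serviceContextFromDefaults` (as the code above does). The concrete `CHUNK_SIZE`/`CHUNK_OVERLAP` values and the sample document texts are placeholder assumptions:

```typescript
import {
  Document,
  VectorStoreIndex,
  serviceContextFromDefaults,
} from "llamaindex";

// Assumed values for the example: chunkSize (in tokens) should sit well
// below the model's context window, leaving room for the query and any
// prompt scaffolding.
const CHUNK_SIZE = 512;
const CHUNK_OVERLAP = 20;

const serviceContext = serviceContextFromDefaults({
  chunkSize: CHUNK_SIZE,
  chunkOverlap: CHUNK_OVERLAP,
});

async function main() {
  // Placeholder documents; in practice these come from your parser/reader.
  const documents = [
    new Document({ text: "First document text..." }),
    new Document({ text: "Second document text..." }),
  ];
  // The serviceContext's node parser splits each Document into
  // chunkSize-token nodes before they are embedded and indexed, so no
  // single piece of text exceeds the model's limit.
  const index = await VectorStoreIndex.fromDocuments(documents, {
    serviceContext,
  });
  console.log("Index built; ready for querying.");
}

main().catch(console.error);
```

The key design point is that `chunkSize` bounds how much text is embedded at once, which is what avoids the context-window error, while `chunkOverlap` preserves some continuity across chunk boundaries.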