
Updated 3 months ago

Hello everyone I am trying to read in

Hello everyone, I am trying to read in and index a set of large PDFs, but I get the error below. Do you know how I could fix it?

InvalidRequestError: This model's maximum context length is 8191 tokens, however you requested 38416 tokens (38416 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

The above exception was the direct cause of the following exception:

RetryError Traceback (most recent call last)
/tmp/ipykernel_5663/105381258.py in <module>
----> 1 index = GPTVectorStoreIndex(documents)
7 comments
How are you chunking the text?
@rse Which version are you on? Should just work like this:

from llama_index import VectorStoreIndex, SimpleDirectoryReader

# Load documents
documents = SimpleDirectoryReader('data').load_data()

# Create an index
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
I installed it just last week, so it must be the latest version. But VectorStoreIndex gives the same error. Do I need to manually chunk the data then?
The chunk size should be 1024 by default. What does your code look like?
I re-ran my notebook and it worked now. Thanks Teemu! My code looks like this:
from llama_index import VectorStoreIndex, SimpleDirectoryReader, download_loader

data_directory = '../data/raw/'

documents = SimpleDirectoryReader(data_directory).load_data()

index = VectorStoreIndex.from_documents(documents)

index.storage_context.persist(persist_dir='index')
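
For anyone landing here with the same error: the 8191-token limit applies to each individual embedding request, so a 38416-token document only fits once it has been split into smaller chunks before embedding. The snippet below is a minimal sketch of that idea, not LlamaIndex's actual splitter. `chunk_tokens` is a hypothetical helper, the tokens are placeholder strings rather than real tokenizer output, and the overlap of 20 is an arbitrary illustrative value. Only the chunk size of 1024 and the token counts come from the thread.

```python
def chunk_tokens(tokens, chunk_size=1024, overlap=20):
    """Split a token list into windows of at most chunk_size tokens.

    Hypothetical helper for illustration only. Consecutive windows
    share `overlap` tokens so text cut at a boundary keeps some context.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once this window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


MAX_CONTEXT = 8191  # limit quoted in the error message

# Placeholder "document" with the token count from the error message.
tokens = [f"tok{i}" for i in range(38416)]
chunks = chunk_tokens(tokens)

# Every chunk is now small enough to be embedded on its own.
print(len(chunks), all(len(c) <= MAX_CONTEXT for c in chunks))
```

If every chunk stays under the limit and the error still appears, the splitting step is probably not being applied at all, so restarting the notebook kernel or checking the installed version is a reasonable first thing to try.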