
nshern
Hey everyone, I have a question I hope someone can help me with!

I am creating my index with the following line:
Python
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4-0125-preview", temperature=0.1)
Settings.embed_model = OpenAIEmbedding()
index = VectorStoreIndex(nodes=nodes)


So why am I getting the following error:
Plain Text
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens, however you requested 11486 tokens (11486 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}

The OpenAI documentation specifies that gpt-4-0125-preview has a context length of 128,000 tokens, not 8192 (which vanilla gpt-4 has).
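One plausible explanation (an assumption, not confirmed in this thread): `VectorStoreIndex` embeds the nodes at build time, and `OpenAIEmbedding()` defaults to an embedding model whose context window is roughly 8192 tokens, which matches the limit in the error rather than the LLM's 128,000. The usual remedy is to split oversized text into smaller pieces before embedding. A minimal sketch, where `chunk_text` is a hypothetical helper and the whitespace split is only a rough stand-in for a real tokenizer such as tiktoken:

```python
# Hedged sketch: split text into chunks that stay under an embedding
# model's token limit. A naive whitespace split stands in for a real
# tokenizer, so actual token counts will differ.

def chunk_text(text: str, max_tokens: int = 8000, overlap: int = 200) -> list[str]:
    words = text.split()
    chunks = []
    step = max_tokens - overlap  # stride forward, keeping `overlap` words of context
    for start in range(0, len(words), step):
        chunk = words[start:start + max_tokens]
        if chunk:
            chunks.append(" ".join(chunk))
    return chunks

# A document far larger than the limit becomes several smaller chunks.
long_doc = "word " * 20000
chunks = chunk_text(long_doc, max_tokens=8000, overlap=0)
```

In LlamaIndex itself the equivalent job is normally done by a node parser / text splitter configured with a chunk size below the embedding model's limit, rather than a hand-rolled function like this.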
40 comments
Am I misunderstanding something, or is the API reference not updated to v0.10? https://docs.llamaindex.ai/en/stable/api_reference/index.html
2 comments
nshern · Rag

Hello! I have a general question I hope someone can help me with:

Say I want to do RAG on documents that are meeting notes that are structured so that it is a bunch of verbatim statements from people, structured like

person1: bla bla bla, 11:46
person2: bla bla bla, 11:47
person1: bla bla bla bla bla, 11:50

etc.

Is it a good idea to define each node as a single statement, with the person's name and timestamp as metadata and no chunk overlap? Or should I just let the node parser do it automatically?

Where can I find some information regarding this and what makes up a "good node"?
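One way to sketch the per-statement approach (a hypothetical parser, not a LlamaIndex API): split each transcript line into its text plus speaker/time metadata, records that could then be wrapped one-per-statement in `TextNode` objects. The regex assumes the exact `name: text, HH:MM` layout shown above:

```python
import re

# Hedged sketch: parse transcript lines of the form
# "person1: some statement, 11:46" into text plus metadata records.
# Each record could back one node, with the speaker and timestamp
# attached as metadata for filtering or display.
LINE_RE = re.compile(
    r"^(?P<speaker>[^:]+):\s*(?P<text>.*?)\s*,\s*(?P<time>\d{1,2}:\d{2})$"
)

def parse_statement(line: str):
    m = LINE_RE.match(line.strip())
    if not m:
        return None  # line does not follow the "name: text, HH:MM" layout
    return {
        "text": m.group("text"),
        "metadata": {"speaker": m.group("speaker").strip(), "time": m.group("time")},
    }

notes = [
    "person1: bla bla bla, 11:46",
    "person2: bla bla bla, 11:47",
]
records = [r for r in (parse_statement(line) for line in notes) if r]
```

With records like these, each statement keeps its speaker and time even with no chunk overlap, at the cost of very short nodes; whether that beats the automatic parser likely depends on how self-contained individual statements are.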
2 comments