```python
MODEL = "gpt-4-1106-preview"
EMBED_MODEL = "text-embedding-3-large"

llm = OpenAI(model=MODEL, temperature=0.1)
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
embed_model = OpenAIEmbedding()
client = qdrant_client.QdrantClient(QDRANT_URL, api_key=QDRANT_API_KEY)

pdf_reader = SimpleDirectoryReader(input_files=pdf_files)
documents = pdf_reader.load_data()

vector_store = QdrantVectorStore(client=client, collection_name=collection_name, batch_size=20)
service_context = ServiceContext.from_defaults(llm=llm, node_parser=node_parser, embed_model=embed_model)
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
refreshed_docs = index.refresh_ref_docs(documents)
```
```text
WARNING - Retrying llama_index.embeddings.openai.get_embeddings in 1.6310027891256675 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens, however you requested 8212 tokens (8212 in your prompt; 0 for the completion). Please reduce your prompt or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}.
```
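One note on that 400: as far as I understand, the OpenAI embeddings endpoint applies the token limit to each input string individually, so changing the batch size won't help if a single node's text is too long — the oversized node itself has to be split. For context, here is a minimal pure-Python sketch of what sentence-window parsing produces (illustrative only, not the actual LlamaIndex implementation; `sentence_window_nodes` is a made-up helper): one node per sentence, with neighboring sentences duplicated into each node's `window` metadata.

```python
# Illustrative sketch of sentence-window parsing, NOT the LlamaIndex code:
# one node per sentence; each node also keeps a "window" of neighboring
# sentences as metadata, like SentenceWindowNodeParser does.

def sentence_window_nodes(sentences, window_size=3):
    nodes = []
    for i, sent in enumerate(sentences):
        lo = max(0, i - window_size)
        hi = min(len(sentences), i + window_size + 1)
        nodes.append({
            "text": sent,
            "metadata": {
                "window": " ".join(sentences[lo:hi]),
                "original_text": sent,
            },
        })
    return nodes

sentences = ["Sentence one.", "Sentence two.", "Sentence three.",
             "Sentence four.", "Sentence five."]
nodes = sentence_window_nodes(sentences, window_size=1)
print(len(nodes))                      # one node (one embedding call) per sentence
print(nodes[2]["metadata"]["window"])  # center sentence plus one neighbor each side
```

Because PDFs routinely yield a pathological "sentence" (a table row, a TOC line), a single such node can exceed the embedding model's context on its own, which would match the error above surviving every `embed_batch_size` value.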
I'm using `text-embedding-3-large` and have experimented with different values for `embed_batch_size` (10, 50, 100), but nothing has worked.

```python
sentence_window = SentenceWindowNodeParser(...)
token_splitter = TokenTextSplitter(chunk_size=7000)

nodes = sentence_window(documents)
nodes = token_splitter(nodes)

index = VectorStoreIndex(nodes=nodes, ...)
```
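The effect of that second pass can be sketched without LlamaIndex: any node text longer than a token budget is re-split so every embedding input stays under the model limit. This is a rough stand-in, with whitespace tokens as a proxy for real tokenizer counts (in practice `TokenTextSplitter` counts tokens with an actual tokenizer) and `cap_tokens` a hypothetical helper:

```python
# Hypothetical second-stage splitter: re-split any node text longer than
# max_tokens. Whitespace "tokens" stand in for real tokenizer counts.

def cap_tokens(node_texts, max_tokens):
    """Return node texts, re-splitting any that exceed max_tokens words."""
    capped = []
    for text in node_texts:
        words = text.split()
        if len(words) <= max_tokens:
            capped.append(text)
        else:
            for start in range(0, len(words), max_tokens):
                capped.append(" ".join(words[start:start + max_tokens]))
    return capped

node_texts = ["a short node", " ".join(["tok"] * 20)]
capped = cap_tokens(node_texts, max_tokens=8)
print([len(n.split()) for n in capped])  # short node kept; long one split 8/8/4
```

Short nodes pass through untouched, so only the pathological ones that would trip the 8192-token limit get cut.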
My ingestion setup is

```python
service_context = ServiceContext.from_defaults(llm=llm, node_parser=node_parser, embed_model=embed_model)
```

where `node_parser` is currently `SentenceWindowNodeParser()`, followed by

```python
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
refreshed_docs = index.refresh_ref_docs(documents)
```

I don't construct the `VectorStoreIndex` with nodes, so I won't be able to use your example directly, @Logan M. Is there a way I can use your recommendation within my setup? Possibly by chaining the node parsers somehow, if that's possible?

`SentenceSplitter()` results in fewer calls to the OpenAI embedding API compared to `SentenceWindowNodeParser()`, and the ingestion process completes in a reasonable amount of time for the document in question. With `SentenceSplitter()`, most of the nodes in Qdrant that I manually checked contained actual sentences and paragraphs. In contrast, with `SentenceWindowNodeParser()`, many nodes contained items like `---` or `...`.

For reference, the Qdrant client now connects over gRPC:

```python
client = qdrant_client.QdrantClient(host=QDRANT_HOST, grpc_port=6334, prefer_grpc=True, api_key=QDRANT_API_KEY)
```
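The drop in API calls follows directly from the node counts: sentence-window parsing embeds one node per sentence (including junk "sentences" like `---`), while a `SentenceSplitter`-style chunker packs many sentences into each chunk. A back-of-the-envelope sketch, with hypothetical numbers:

```python
# Rough comparison of embedding-call counts for the two parsers
# (hypothetical helpers and figures, for illustration only).

def count_window_nodes(sentences):
    # Sentence-window parsing: one node, hence one embedding input, per sentence.
    return len(sentences)

def count_chunk_nodes(sentences, sentences_per_chunk):
    # SentenceSplitter-style chunking: sentences packed into fixed-size chunks.
    return -(-len(sentences) // sentences_per_chunk)  # ceiling division

doc = [f"Sentence {i}." for i in range(1000)]
print(count_window_nodes(doc))      # 1000 embedding inputs
print(count_chunk_nodes(doc, 40))   # 25 embedding inputs
```

For a 1000-sentence document that is a 40x difference in embedding calls, before even counting the junk nodes the window parser emits for separator lines.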