llm = Anthropic(model="claude-3-opus-20240229")
Unexpected err=ValueError('Unknown model: claude-3-opus-20240229. Please provide a valid Anthropic model name.Known models are: claude-instant-1, claude-instant-1.2, claude-2, claude-2.0, claude-2.1'), type(err)=<class 'ValueError'>
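This ValueError typically means the installed Anthropic integration predates Claude 3 and validates against a hard-coded model list. A hedged sketch of the usual fix, assuming an upgrade to the split llama-index-llms-anthropic package (llama-index >= 0.10); the import path below reflects that assumption:

# pip install -U llama-index-llms-anthropic anthropic
from llama_index.llms.anthropic import Anthropic

llm = Anthropic(model="claude-3-opus-20240229")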
client = qdrant_client.QdrantClient(
    host=QDRANT_HOST,
    grpc_port=QDRANT_GRPC_PORT,
    prefer_grpc=True,
    api_key=QDRANT_API_KEY,
)
vector_store = QdrantVectorStore(client=client, collection_name=collection_name, batch_size=20)
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
for document in documents:
    try:
        log.info(f"SETUP: Source {source_id} Updating document")
        index.update_ref_doc(
            document,
            update_kwargs={"delete_kwargs": {"delete_from_docstore": True}},
        )
    except Exception as err:
        log.info(f"SETUP: Source {source_id} Error: {err}")
        log.info(f"SETUP: Source {source_id} Update failed, trying insert")
        index.insert(document)
"UNKNOWN:Error received from peer {grpc_message:"Wrong input: Collection
166850 already exists!", grpc_status:3
. Consequently, both index.update_ref_doc
in the try
block and index.insert(document)
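One hedged workaround, assuming the store is racing to create a collection that already exists: create the collection up front so QdrantVectorStore never attempts creation, and consider prefer_grpc=False while debugging. collection_exists requires a recent qdrant-client release, and the vector size below is an assumption that must match your embedding model:

from qdrant_client import models

# Create the collection once, before QdrantVectorStore touches it.
if not client.collection_exists(collection_name):
    client.create_collection(
        collection_name=collection_name,
        vectors_config=models.VectorParams(
            size=1536,  # assumption: your embedding dimension
            distance=models.Distance.COSINE,
        ),
    )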
node_postprocessors=[
    MetadataReplacementPostProcessor(target_metadata_key="window"),
    cohere_rerank,
]
or

node_postprocessors=[
    cohere_rerank,
    MetadataReplacementPostProcessor(target_metadata_key="window"),
]
These are passed to index.as_chat_engine(), alongside Qdrant as a vector store. as_query_engine() / as_chat_engine() would still invoke the LLM without any context, from my understanding. How can I change this behaviour to return an error when no matches are found in the vector store?
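Two hedged notes. On ordering: putting MetadataReplacementPostProcessor before the reranker lets the reranker score the full window text instead of a single sentence, which is usually preferable, though it is a recommendation rather than an API requirement. On empty matches: neither engine raises on its own, so one sketch is to retrieve first and fail before the LLM is ever called (the helper name and top-k are assumptions):

retriever = index.as_retriever(similarity_top_k=5)

def chat_or_raise(message: str):
    # Fail fast when the vector store has nothing relevant.
    if not retriever.retrieve(message):
        raise ValueError("No matches found in the vector store.")
    chat_engine = index.as_chat_engine(
        similarity_top_k=5,
        node_postprocessors=[
            MetadataReplacementPostProcessor(target_metadata_key="window"),
            cohere_rerank,  # rerank after the window text is swapped in
        ],
    )
    return chat_engine.chat(message)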
pip install llama-index-llms-gemini is throwing this error:

ERROR: Could not find a version that satisfies the requirement llama-index-llms-gemini (from versions: none)
ERROR: No matching distribution found for llama-index-llms-gemini
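"from versions: none" usually means pip found the index but no release compatible with the current environment, not that the package is missing. A hedged first check, assuming the Gemini integration only ships for llama-index >= 0.10 and a reasonably recent Python:

# Confirm which interpreter pip installs into, then retry with it:
python --version
python -m pip install --upgrade pip
python -m pip install llama-index-llms-gemini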
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
service_context = ServiceContext.from_defaults(llm=llm, node_parser=node_parser, embed_model=embed_model)
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
chat_engine = index.as_chat_engine(
    similarity_top_k=2,
    # the target key defaults to `window` to match the node_parser's default
    node_postprocessors=[MetadataReplacementPostProcessor(target_metadata_key="window")],
    vector_store_kwargs={"qdrant_filters": filters},
)
What is the relationship between the node_parser parameter in ServiceContext.from_defaults and the node_postprocessors configuration in index.as_chat_engine()?

node_postprocessors=[
    MetadataReplacementPostProcessor(target_metadata_key="window")
],
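The two cooperate across ingestion and query time: the parser stores each sentence's surrounding window under window_metadata_key when nodes are built, and the postprocessor swaps a retrieved node's text for the contents of target_metadata_key before the LLM sees it. The only contract between them is that the keys match; a minimal sketch (these values are the defaults):

# Ingestion time: each node's metadata["window"] holds the surrounding sentences.
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

# Query time: replace the retrieved sentence with its stored window.
postprocessor = MetadataReplacementPostProcessor(target_metadata_key="window")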
embed_model = OpenAIEmbedding(embed_batch_size=50)
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
MODEL = "gpt-4-1106-preview" EMBED_MODEL = "text-embedding-3-large" llm = OpenAI(model=MODEL, temperature=0.1) node_parser = SentenceWindowNodeParser.from_defaults( window_size=3, window_metadata_key="window", original_text_metadata_key="original_text", ) embed_model = OpenAIEmbedding() client = qdrant_client.QdrantClient(QDRANT_URL, api_key=QDRANT_API_KEY) pdf_reader = SimpleDirectoryReader(input_files=pdf_files) documents = pdf_reader.load_data() vector_store = QdrantVectorStore(client=client, collection_name=collection_name, batch_size=20) service_context = ServiceContext.from_defaults(llm=llm, node_parser=node_parser, embed_model=embed_model) index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context) refreshed_docs = index.refresh_ref_docs(documents)
WARNING - Retrying llama_index.embeddings.openai.get_embeddings in 1.6310027891256675 seconds as it raised BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 8192 tokens, however you requested 8212 tokens (8212 in your prompt; 0 for the completion). Please reduce your prompt or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}.
I am using text-embedding-3-large and have experimented with different values for embed_batch_size (10, 50, 100), but nothing has worked.
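The 8192-token cap in this error applies to a single embedding input, not to the whole batch, so no embed_batch_size will help once one node is itself too long; with sentence windows, a PDF region without sentence boundaries (a table, a code dump) can become one enormous "sentence". Note also that EMBED_MODEL is defined above but never passed to OpenAIEmbedding(), so the default model may actually be in use. One hedged workaround is to pre-split the raw documents before the window parser; the splitter choice and sizes are assumptions:

from llama_index import Document
from llama_index.text_splitter import TokenTextSplitter

# Cap raw chunks well under the embedding model's per-input limit,
# then let SentenceWindowNodeParser work on the smaller pieces.
pre_splitter = TokenTextSplitter(chunk_size=2048, chunk_overlap=0)
documents = [
    Document(text=chunk, metadata=doc.metadata)
    for doc in documents
    for chunk in pre_splitter.split_text(doc.text)
]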
index = VectorStoreIndex.from_vector_store(vector_store=vector_store, service_context=service_context)
refreshed_docs = index.refresh_ref_docs(
    documents,
    update_kwargs={"delete_kwargs": {"delete_from_docstore": True}},
)
ERROR - Unexpected err=TypeError("delete_ref_doc() got multiple values for keyword argument 'delete_from_docstore'"), type(err)=<class 'TypeError'>
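The TypeError suggests refresh_ref_docs already passes delete_from_docstore internally when it forwards delete_kwargs, so supplying it again collides. A hedged workaround is to drop the kwarg and do the delete-then-insert yourself (a sketch; assumes each document carries a stable ref doc id):

for document in documents:
    # Same effect as the refresh, without the duplicated keyword argument.
    index.delete_ref_doc(document.get_doc_id(), delete_from_docstore=True)
    index.insert(document)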
document.metadata = { "source_id": source_id, "document_name": document_name }
retriever = index.as_retriever(...)
retrieved_nodes = retriever.retrieve(query)
retrieved_nodes[0].metadata
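Metadata set on a Document is copied onto the nodes parsed from it, so it comes back with retrieval results. A quick sketch using the keys set above:

for node_with_score in retrieved_nodes:
    # NodeWithScore proxies .metadata through to the underlying node.
    meta = node_with_score.metadata
    print(meta["source_id"], meta["document_name"], node_with_score.score)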
I want to use AutoMergingRetriever, but I need it to function as a chat engine and to stream chat responses. My existing chat engine code is as follows:

chat_engine = index.as_chat_engine(
    similarity_top_k=similarity_top_k,
    node_postprocessors=node_postprocessors,
    vector_store_kwargs={"qdrant_filters": filters},
)

I don't see how to combine AutoMergingRetriever with the chat functionality. The documentation (https://docs.llamaindex.ai/en/latest/examples/retrievers/auto_merging_retriever.html) suggests using RetrieverQueryEngine, but that would only provide me with a query engine. How can I get a chat engine?
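One hedged route, with import paths assuming the same pre-0.10 llama_index layout as the ServiceContext code elsewhere in this thread: build the AutoMergingRetriever as in the docs, then wrap it in ContextChatEngine, which accepts any retriever and streams via stream_chat. The memory settings are assumptions:

from llama_index.chat_engine import ContextChatEngine
from llama_index.memory import ChatMemoryBuffer
from llama_index.retrievers import AutoMergingRetriever

base_retriever = index.as_retriever(similarity_top_k=6)
retriever = AutoMergingRetriever(base_retriever, storage_context, verbose=True)

chat_engine = ContextChatEngine.from_defaults(
    retriever=retriever,
    memory=ChatMemoryBuffer.from_defaults(token_limit=3000),
    service_context=service_context,
)

# Streaming works as with as_chat_engine():
streaming_response = chat_engine.stream_chat("your question")
for token in streaming_response.response_gen:
    print(token, end="")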
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)
embed_model = OpenAIEmbedding(embed_batch_size=100)
client = qdrant_client.QdrantClient(QDRANT_URL, api_key=QDRANT_API_KEY)
loader = JsonDataReader()
documents = loader.load_data(json_string)
vector_store = QdrantVectorStore(client=client, collection_name=collection_name)
service_context = ServiceContext.from_defaults(llm=llm, node_parser=node_parser, embed_model=embed_model)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, service_context=service_context)
ERROR - Unexpected err=ResponseHandlingException(WriteTimeout('The write operation timed out')), type(err)=<class 'qdrant_client.http.exceptions.ResponseHandlingException'>
embed_model = OpenAIEmbedding(embed_batch_size=100)
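The WriteTimeout is raised client-side while uploading points, so the usual levers are a longer client timeout and a smaller upload batch (embed_batch_size only affects the OpenAI calls, not the Qdrant writes). Both values below are assumptions to tune:

client = qdrant_client.QdrantClient(
    QDRANT_URL,
    api_key=QDRANT_API_KEY,
    timeout=60,  # seconds; the default REST timeout is much lower
)
vector_store = QdrantVectorStore(
    client=client,
    collection_name=collection_name,
    batch_size=20,  # fewer points per upsert request
)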