Qdrant

At a glance

The community member is using SimpleDirectoryReader and wants to know the best way to create embeddings or how to get the embeddings to store on Qdrant. Another community member suggests checking the Qdrant guide and mentions that when the index is set up to use Qdrant, the from_documents method will embed and upload the documents.

The community member then encounters an AuthenticationError when trying to create the index, and another community member suggests setting the OpenAI API key directly on the module, especially when using Colab or Notebooks.

After resolving the authentication issue, the community member encounters another bug, an UnexpectedResponse with a "Wrong input: Not existing vector name error" from Qdrant. The community members discuss the setup of the Qdrant client and the creation of the collection, and eventually, the issue is resolved by deleting and recreating the collection.

Hi guys, I'm using SimpleDirectoryReader. What's the best way to create embeddings? (Or, after creating the index, how do I get the embeddings to store in Qdrant, for example?)
Check out the Qdrant guide
https://gpt-index.readthedocs.io/en/stable/examples/vector_stores/QdrantIndexDemo.html

When the index is set up to use Qdrant, from_documents will embed and upload the documents.
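For reference, a minimal sketch of that flow (assuming the legacy llama_index API used elsewhere in this thread, a local Qdrant instance at http://localhost:6333, and a placeholder ./data folder of documents):

Plain Text
import qdrant_client
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import QdrantVectorStore

# Load documents from a local folder (placeholder path)
documents = SimpleDirectoryReader("./data").load_data()

# Point the vector store at a Qdrant collection
client = qdrant_client.QdrantClient("http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="pdfs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# from_documents embeds the documents (OpenAI embeddings by default)
# and uploads the resulting vectors to the "pdfs" collection
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)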
Hi @Logan M

Plain Text
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)


This script is returning the following error:
Plain Text
AuthenticationError                       Traceback (most recent call last)
File /usr/local/lib/python3.11/site-packages/tenacity/__init__.py:382, in Retrying.__call__(self, fn, *args, **kwargs)
    381 try:
--> 382     result = fn(*args, **kwargs)
    383 except BaseException:  # noqa: B902

File /usr/local/lib/python3.11/site-packages/llama_index/embeddings/openai.py:166, in get_embeddings(list_of_text, engine, **kwargs)
    164 list_of_text = [text.replace("\n", " ") for text in list_of_text]
--> 166 data = openai.Embedding.create(input=list_of_text, model=engine, **kwargs).data
    167 return [d["embedding"] for d in data]

File /usr/local/lib/python3.11/site-packages/openai/api_resources/embedding.py:33, in Embedding.create(cls, *args, **kwargs)
     32 try:
---> 33     response = super().create(*args, **kwargs)
     35     # If a user specifies base64, we'll just return the encoded string.
     36     # This is only for the default case.

File /usr/local/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py:149, in EngineAPIResource.create(cls, api_key, api_base, api_type, request_id, api_version, organization, **params)
    127 @classmethod
    128 def create(
    129     cls,
   (...)
    136     **params,
    137 ):
...
--> 326     raise retry_exc from fut.exception()
    328 if self.wait:
    329     sleep = self.wait(retry_state)

RetryError: RetryError[]


Do you have any idea? My OpenAI key is set via os.environ["OPENAI_API_KEY"]
Sometimes you need to additionally set the key directly on the module, especially in Colab/notebooks:

Plain Text
import openai

openai.api_key = "sk-..."
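
A variant of the same fix that avoids hardcoding the key (assuming the OPENAI_API_KEY environment variable is already set, as above):

Plain Text
import os
import openai

# Read the key from the environment instead of hardcoding it in the script
openai.api_key = os.environ["OPENAI_API_KEY"]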
Sorry again @Logan M

Your answer above was great, thanks!

But then I get another bug:
Plain Text
UnexpectedResponse                        Traceback (most recent call last)
/Users/redhmam/dev/almir/tests/app/bigdata-test.py in line 70
     68 vector_store = QdrantVectorStore(client=client, collection_name="pdfs")
     69 storage_context = StorageContext.from_defaults(vector_store=vector_store)
---> 70 index = VectorStoreIndex.from_documents(
     71     documents, storage_context=storage_context, service_context=service_context
     72 )
     74 # query_engine = index.as_query_engine()
     75 # response = query_engine.query("What the documents talk about?")
     76 
     77 # display(Markdown(f"{response}"))

File /usr/local/lib/python3.11/site-packages/llama_index/indices/base.py:102, in BaseIndex.from_documents(cls, documents, storage_context, service_context, show_progress, **kwargs)
     97     docstore.set_document_hash(doc.get_doc_id(), doc.hash)
     98 nodes = service_context.node_parser.get_nodes_from_documents(
     99     documents, show_progress=show_progress
    100 )
--> 102 return cls(
    103     nodes=nodes,
    104     storage_context=storage_context,
    105     service_context=service_context,
    106     show_progress=show_progress,
    107     **kwargs,
    108 )
...
---> 97 raise UnexpectedResponse.for_response(response)

UnexpectedResponse: Unexpected Response: 400 (Bad Request)
Raw response content:
b'{"status":{"error":"Wrong input: Not existing vector name error: "},"time":0.0450627}'


Do you have any idea?
Is there any way to expand the traceback? Looks like it got truncated in the middle 🤔
(But also, never seen this before haha)
Look, the PUT request is made, but it gets a 400 as the HTTP response
[Attachment: image.png]
I'm just trying to do the basics...
Hmmm, and how did you set up the Qdrant client? Super weird error lol
So far nothing is really standing out 😅 "Wrong input: Not existing vector name error" is super sus, but it sounds like that's coming from Qdrant?
Plain Text
client = qdrant_client.QdrantClient("http://localhost:6333")
I'm using Qdrant on Docker
Then I created the collection with client.create_collection("pdfs")
Sorry, I think I'm creating the collection in the wrong way, right?
I just deleted my collection, ran the script again, and everything works nicely!
ha well that works!
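
For reference, a minimal sketch of the fix discussed above (assuming the same local Qdrant instance and collection name; the manually created collection presumably had a vector configuration that did not match what QdrantVectorStore expected):

Plain Text
import qdrant_client

client = qdrant_client.QdrantClient("http://localhost:6333")

# Drop the manually created collection; the next
# VectorStoreIndex.from_documents(...) call recreates "pdfs" with the
# vector configuration QdrantVectorStore expects.
client.delete_collection(collection_name="pdfs")

# Alternative: create the collection yourself, making sure the vector params
# match the embedding model (1536 dimensions for OpenAI text-embedding-ada-002):
#
#   from qdrant_client.http import models as rest
#   client.create_collection(
#       collection_name="pdfs",
#       vectors_config=rest.VectorParams(size=1536, distance=rest.Distance.COSINE),
#   )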