Qdrant

At a glance

The community member is using SimpleDirectoryReader and wants to know the best way to create embeddings or how to get the embeddings to store on Qdrant. Another community member suggests checking the Qdrant guide and mentions that when the index is set up to use Qdrant, the from_documents method will embed and upload the documents.

The community member then encounters an AuthenticationError when trying to create the index, and another community member suggests setting the OpenAI API key directly on the module, especially when using Colab or Notebooks.

After resolving the authentication issue, the community member encounters another bug, an UnexpectedResponse with a "Wrong input: Not existing vector name error" from Qdrant. The community members discuss the setup of the Qdrant client and the creation of the collection, and eventually, the issue is resolved by deleting and recreating the collection.

Hi guys, I'm using SimpleDirectoryReader. What's the best way to create embeddings? (Or, after creating the index, how do I get the embeddings to store in Qdrant, for example?)
Check out the Qdrant guide
https://gpt-index.readthedocs.io/en/stable/examples/vector_stores/QdrantIndexDemo.html

When the index is set up to use Qdrant, from_documents will embed and upload the documents.
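For reference, a minimal sketch of that flow (assuming the legacy llama_index API used elsewhere in this thread, a local Qdrant instance at http://localhost:6333, and a placeholder ./data folder of documents):

Plain Text
import qdrant_client
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import QdrantVectorStore

# Load documents from a local folder (placeholder path)
documents = SimpleDirectoryReader("./data").load_data()

# Point the vector store at a Qdrant collection
client = qdrant_client.QdrantClient("http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="pdfs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# from_documents embeds the documents (OpenAI embeddings by default)
# and uploads the resulting vectors to the "pdfs" collection
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)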
Hi @Logan M

Plain Text
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)


This script is returning the following error:
Plain Text
AuthenticationError                       Traceback (most recent call last)
File /usr/local/lib/python3.11/site-packages/tenacity/__init__.py:382, in Retrying.__call__(self, fn, *args, **kwargs)
    381 try:
--> 382     result = fn(*args, **kwargs)
    383 except BaseException:  # noqa: B902

File /usr/local/lib/python3.11/site-packages/llama_index/embeddings/openai.py:166, in get_embeddings(list_of_text, engine, **kwargs)
    164 list_of_text = [text.replace("\n", " ") for text in list_of_text]
--> 166 data = openai.Embedding.create(input=list_of_text, model=engine, **kwargs).data
    167 return [d["embedding"] for d in data]

File /usr/local/lib/python3.11/site-packages/openai/api_resources/embedding.py:33, in Embedding.create(cls, *args, **kwargs)
     32 try:
---> 33     response = super().create(*args, **kwargs)
     35     # If a user specifies base64, we'll just return the encoded string.
     36     # This is only for the default case.

File /usr/local/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py:149, in EngineAPIResource.create(cls, api_key, api_base, api_type, request_id, api_version, organization, **params)
    127 @classmethod
    128 def create(
    129     cls,
   (...)
    136     **params,
    137 ):
...
--> 326     raise retry_exc from fut.exception()
    328 if self.wait:
    329     sleep = self.wait(retry_state)

RetryError: RetryError[]


Do you have any idea? My OpenAI key is set via os.environ["OPENAI_API_KEY"]
Sometimes you need to additionally set the key directly on the module, especially in Colab/notebooks:

Plain Text
import openai

openai.api_key = "sk-..."
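
A variant of the same fix that avoids hardcoding the key (assuming the OPENAI_API_KEY environment variable is already set, as above):

Plain Text
import os
import openai

# Read the key from the environment instead of hardcoding it in the script
openai.api_key = os.environ["OPENAI_API_KEY"]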
Sorry again @Logan M

Your answer above was great, thanks!

But then I get another bug:
Plain Text
UnexpectedResponse                        Traceback (most recent call last)
/Users/redhmam/dev/almir/tests/app/bigdata-test.py in line 70
     68 vector_store = QdrantVectorStore(client=client, collection_name="pdfs")
     69 storage_context = StorageContext.from_defaults(vector_store=vector_store)
---> 70 index = VectorStoreIndex.from_documents(
     71     documents, storage_context=storage_context, service_context=service_context
     72 )
     74 # query_engine = index.as_query_engine()
     75 # response = query_engine.query("What the documents talk about?")
     76 
     77 # display(Markdown(f"{response}"))

File /usr/local/lib/python3.11/site-packages/llama_index/indices/base.py:102, in BaseIndex.from_documents(cls, documents, storage_context, service_context, show_progress, **kwargs)
     97     docstore.set_document_hash(doc.get_doc_id(), doc.hash)
     98 nodes = service_context.node_parser.get_nodes_from_documents(
     99     documents, show_progress=show_progress
    100 )
--> 102 return cls(
    103     nodes=nodes,
    104     storage_context=storage_context,
    105     service_context=service_context,
    106     show_progress=show_progress,
    107     **kwargs,
    108 )
...
---> 97 raise UnexpectedResponse.for_response(response)

UnexpectedResponse: Unexpected Response: 400 (Bad Request)
Raw response content:
b'{"status":{"error":"Wrong input: Not existing vector name error: "},"time":0.0450627}'


Do you have any idea?
Is there any way to expand the traceback? Looks like it got truncated in the middle 🤔
(But also, never seen this before haha)
Look, the PUT request is made, but it gets a 400 as the HTTP response
[Attachment: image.png]
I'm just trying to do the basics...
Hmmm, and how did you set up the Qdrant client? Super weird error lol
So far nothing is really standing out 😅 "Wrong input: Not existing vector name error" is super sus, but it sounds like that's coming from Qdrant?
Plain Text
client = qdrant_client.QdrantClient("http://localhost:6333")
I'm using Qdrant on Docker
Then I created the collection with client.create_collection("pdfs")
Sorry, I think I'm creating the collection in the wrong way, right?
I just deleted my collection, ran the script again, and everything works nicely!
ha well that works!
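
For reference, a minimal sketch of the fix discussed above (assuming the same local Qdrant instance and collection name; the manually created collection presumably had a vector configuration that did not match what QdrantVectorStore expected):

Plain Text
import qdrant_client

client = qdrant_client.QdrantClient("http://localhost:6333")

# Drop the manually created collection; the next
# VectorStoreIndex.from_documents(...) call recreates "pdfs" with the
# vector configuration QdrantVectorStore expects.
client.delete_collection(collection_name="pdfs")

# Alternative: create the collection yourself, making sure the vector params
# match the embedding model (1536 dimensions for OpenAI text-embedding-ada-002):
#
#   from qdrant_client.http import models as rest
#   client.create_collection(
#       collection_name="pdfs",
#       vectors_config=rest.VectorParams(size=1536, distance=rest.Distance.COSINE),
#   )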