
Updated 4 months ago

Hey. I'm getting this weird key error

Hey. I'm getting this weird key error after creating an index from documents extracted via the unstructured module (HTML here). I rebuilt the index, and the key changed from -1 to 0, and now it's stuck on 1. Any idea?
Attachment: image.png
18 comments
How did you build your index?
https://github.com/run-llama/llama_index/issues/1769

I referred to this, but I made the index in a fresh notebook.
It worked for me with a PDF. I was thinking it was an issue with HTML files. I tried another file, but I got the same error. Currently trying another HTML file in a new notebook; it takes time to load llama.
I'll try with a few other files.
Hmm, I tried with a random HTML file as well, and it still worked.

!wget 'https://www.engadget.com/hyperloop-one-is-shutting-down-030049106.html'
If I could reproduce it, I could look into it further.
Alright. I'll try a file or two more, and if I get it again, I'll share the notebook in your DMs, if you don't mind?

If this doesn't work, I'm planning to get the text from the HTML directly (BeautifulSoup), and then find a way to create embeddings and store them in my index.
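For anyone who wants to try that fallback, here is a minimal sketch of the text-extraction step. Note the swap: it uses the standard library's `html.parser` rather than BeautifulSoup, purely to stay dependency-free. The names `TextExtractor`, `html_to_text`, and `chunk_text`, and the `max_chars` value, are hypothetical and not from this thread.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one text run per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def chunk_text(text: str, max_chars: int = 1000) -> list[str]:
    """Greedy line-based chunking; max_chars is an arbitrary assumption."""
    chunks, current = [], ""
    for line in text.split("\n"):
        if current and len(current) + len(line) + 1 > max_chars:
            chunks.append(current)
            current = line
        else:
            current = f"{current}\n{line}" if current else line
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be handed to whatever embedding and indexing step you already use; that part is left out here because it depends on the library version.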
Yea feel free to share any notebooks that reproduce the issue 🙏 Would love to know why that happens.

The error, to me, indicates that the vector store and docstore somehow do not contain the same data.
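One way to test that hypothesis is to compare the node IDs each store knows about. The sketch below is plain Python and assumes you can list the IDs from both sides; how you obtain them depends on your llama_index version, so the attribute mentioned in the comment is an assumption, and `compare_ids` is a hypothetical helper.

```python
def compare_ids(docstore_ids, vector_store_ids):
    """Report IDs that exist in one store but not the other."""
    doc_ids = set(docstore_ids)
    vec_ids = set(vector_store_ids)
    return {
        # IDs the vector store can return but the docstore cannot resolve;
        # a lookup on one of these is a plausible source of a KeyError.
        "missing_from_docstore": sorted(vec_ids - doc_ids, key=str),
        # Documents that were stored but never embedded.
        "missing_from_vector_store": sorted(doc_ids - vec_ids, key=str),
    }


# Assumption: docstore IDs might come from something like
# index.docstore.docs.keys(); check your installed version.
report = compare_ids(["a", "b"], ["b", "c"])
print(report)
```

If both lists in the report are empty, the two stores agree on their IDs and the cause is likely elsewhere.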
Hey. Tried all of it in a fresh notebook, and it turns out it gives the error only with that specific file. Wasn't expecting this though.
A different HTML file worked.
To confirm, I built the index with both files back and forth in the same session.
Hey. So will just defining my OpenAI API key in the environment make the query engine use GPT-3.5? (Because I don't see any LLM defined here.)
Yea, GPT-3.5 is the default LLM, and text-embedding-ada-002 is the default embedding model.
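Since those defaults are picked up implicitly, it can help to confirm the key is actually visible to the process before building the index. A small sketch: `OPENAI_API_KEY` is the variable the OpenAI client reads, while the helper name `openai_key_configured` is hypothetical.

```python
import os


def openai_key_configured(env=None) -> bool:
    """True if OPENAI_API_KEY is set to a non-blank value."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not openai_key_configured():
    print("OPENAI_API_KEY is not set; the default LLM and embedding calls would fail.")
```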
In case anyone stumbles here looking for help: I used the same document with Chroma DB instead and didn't get any errors.