Driving me nuts! I can't figure out why the embeddings and sources/documents returned by LlamaIndex are all of a sudden different. The results are bizarre!
I have several articles ingested. The two I will use as examples are an article about cloud services within my company and an article on how to install MATLAB. I will call my company (abbreviated) XYZ.
- "How do I install matlab?" - Incorrect sources returned
- "How do I install matlab? XYZ" - Correct sources returned
- "How do I install matlab? G" - Correct Sources returned
- "How do I install matlab? Flux Capacitor" - Correct Sources returned
- "How do I install matlab? How do I install matlab?" - Correct Sources returned
Similarly...
- "Cloud Services" - Incorrect Sources returned
- "Cloud Services Cloud Services" - Incorrect Sources returned
- "Cloud Services. Cloud Services" - Incorrect Sources returned
- "Cloud Services Cloud Services." - Incorrect Sources returned
- "Cloud Services. Cloud Services." - Correct Sources returned
Driving. Me. Nuts.
Hope someone has a magical solution 😄
I'm using Ollama with nomic-embed-text for embeddings, and LlamaIndex via https://github.com/zylon-ai/private-gpt
And I should note I was using the tool happily before. Something has changed. I even tried running a known-working copy of the code, with the same result. I can't figure it out.
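In case it helps anyone reproduce this, below is a minimal sketch of how the raw similarity scores could be compared outside of private-gpt, to see whether the embeddings themselves shift between query variants. It is not how private-gpt or LlamaIndex retrieves internally: it skips chunking and the vector store entirely. The `ollama_embed` helper assumes a local Ollama server on the default port and its `/api/embeddings` endpoint, and the names `rank_docs` and `cosine` are made up for illustration.

```python
import json
import math
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def ollama_embed(text, model="nomic-embed-text"):
    """Fetch one embedding vector from a local Ollama server (assumed setup)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_docs(query, docs, embed=ollama_embed):
    """Return (title, score) pairs sorted by similarity to the query, best first.

    `docs` maps a title to its text; `embed` is any text -> vector function.
    """
    q = embed(query)
    scored = [(title, cosine(q, embed(text))) for title, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_docs("Cloud Services", docs)` and `rank_docs("Cloud Services. Cloud Services.", docs)` with the same two articles in `docs` would show whether the ranking already flips at the embedding level or only somewhere later in the pipeline.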