
Updated 3 months ago

Querying

Thrilled by the potential of this tool, and my gratitude extends to you for supporting the community. I crafted a straightforward script to leverage Ollama's local LLM, incorporating a dataset of 100 entries, each with 5 fields, in a CSV. However, I've encountered a puzzling scenario: my questions about the dataset get responses that are wildly inaccurate. Could there be a piece of the puzzle I'm overlooking?
17 comments
Can you tell me a bit more about the csv you are querying? What type of data is it? What kinds of queries are you trying?

If I'm not mistaken, the default behavior is to embed/retrieve per row (which doesn't suit all query/data types).
just fake text data - attached
And what kinds of questions are you asking?
how many countries do you see, how many records are in the csv file - it says 4, how many products are there - it says 2
how many times does korea appear
Right so that makes total sense if you understand how it's working.

It has embedded each row.

And by default (unless you changed it) it is fetching the top 2 most semantically similar rows compared to your query and sending those along to the LLM
So you can see how those types of questions don't really work with that kind of setup
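To make the point above concrete, here is a minimal sketch of per-row top-k retrieval. It is a toy stand-in, not the library's actual implementation: the rows are hypothetical (the attached CSV is not shown in the thread), and `embed` is a simple bag-of-words vector used in place of a real embedding model. The names `rows`, `embed`, `cosine`, and `retrieve` are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical rows standing in for the CSV; the real file is not shown in the thread.
rows = [
    "country: Korea, product: Widget",
    "country: Japan, product: Widget",
    "country: Korea, product: Gadget",
    "country: France, product: Gadget",
    "country: Korea, product: Widget",
]

def embed(text):
    """Toy 'embedding': word counts. A real setup would call an embedding model."""
    cleaned = text.lower().replace(",", " ").replace(":", " ")
    return Counter(cleaned.split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=2):
    """Return the k rows most similar to the query (one vector per row)."""
    query_vec = embed(query)
    ranked = sorted(rows, key=lambda r: cosine(query_vec, embed(r)), reverse=True)
    return ranked[:k]

# Only these k rows would be passed to the LLM as context.
context = retrieve("how many times does korea appear", k=2)
print(context)
```

With `k=2` the LLM only ever sees two rows, so it cannot count three Korea rows, let alone all the records in the file. Raising `k` helps a little, but counting and aggregation questions still depend on seeing every row.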
You probably would be more interested in the pandas query engine, or even some text2sql stuff
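As a rough illustration of why a structured-query approach fits these questions better, the sketch below loads some hypothetical rows into an in-memory SQLite table and runs the kind of SQL a text-to-SQL setup might produce. The table name `records`, the columns `country` and `product`, and the data are all assumptions, and the SQL is hand-written here; in a real text-to-SQL pipeline the LLM would generate the query from the question.

```python
import sqlite3

# Hypothetical data; the actual CSV columns are not shown in the thread.
data = [
    ("Korea", "Widget"),
    ("Japan", "Widget"),
    ("Korea", "Gadget"),
    ("France", "Gadget"),
    ("Korea", "Widget"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (country TEXT, product TEXT)")
conn.executemany("INSERT INTO records VALUES (?, ?)", data)

def scalar(sql, params=()):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql, params).fetchone()[0]

# Each question from the thread maps to an aggregate over ALL rows,
# rather than to a handful of retrieved rows.
total_records = scalar("SELECT COUNT(*) FROM records")
num_countries = scalar("SELECT COUNT(DISTINCT country) FROM records")
num_products = scalar("SELECT COUNT(DISTINCT product) FROM records")
korea_count = scalar("SELECT COUNT(*) FROM records WHERE country = ?", ("Korea",))

print(total_records, num_countries, num_products, korea_count)
```

The database computes the answer over the whole table, so the LLM only has to translate the question into a query and phrase the result. A pandas-based engine follows the same idea, with generated dataframe operations instead of SQL.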
so like a keyword search we would likely have to use txt2sql
Sorry I didn't understand how it works by default
No worries! Just wanted to make sure we were on the same page 😁
Semantic top k search has its place, but for most CSV data probably not the best fit 👍
is there a resource that can help with what tools to use for what?
Thank you for the help, love the product and would love to have a deeper conversation about scaling, enterprise support, and phone a friend
phone a friend is probably always going to be discord/github 🙂 Happy to give candid advice about deployments. Enterprise support is still in the works 💪
I would say that the "use cases" section is a decent start https://docs.llamaindex.ai/en/stable/use_cases/q_and_a/root.html
This entire understanding section is also a good read for people just starting out https://docs.llamaindex.ai/en/stable/understanding/understanding.html