Querying

Markb123

Thrilled by the potential of this tool, and thank you for supporting the community. I wrote a simple script that uses a local LLM through Ollama over a CSV dataset of 100 entries with 5 fields each. However, my questions about the dataset get responses that are wildly inaccurate. Is there something I'm overlooking?

17 comments

Logan M

Can you tell me a bit more about the csv you are querying? What type of data is it? What kinds of queries are you trying?

If I'm not mistaken, the default behavior is to embed/retrieve per row (which doesn't suit all query/data types)

Markb123

just fake text data - attached

Logan M

And what kinds of questions are you asking?

Markb123

how many countries do you see, how many records are in the csv file - it says 4, how many products are there - it says 2

Markb123

how many times does korea appear

Logan M

Right so that makes total sense if you understand how it's working.

It has embedded each row.

And by default (unless you changed it), it fetches the top 2 rows most semantically similar to your query and sends only those along to the LLM

Logan M

So you can see how those types of questions don't really work with that kind of setup
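
To make the explanation above concrete, here is a toy sketch of per-row top-k retrieval. It is not LlamaIndex's actual code: the rows, the column names, and the bag-of-words "embedding" are all assumptions made for illustration (a real setup would use a learned embedding model), but the shape of the problem is the same. Only `top_k` rows ever reach the LLM, so it cannot count across the whole file.

```python
import math
import re
from collections import Counter

# Hypothetical rows standing in for the CSV; the real file's columns
# are not shown in the thread.
ROWS = [
    "name: Ana, country: Korea, product: Widget",
    "name: Ben, country: Japan, product: Gadget",
    "name: Cho, country: Korea, product: Gadget",
    "name: Dev, country: India, product: Widget",
    "name: Eli, country: Korea, product: Widget",
]


def bag_of_words(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, rows, top_k=2):
    """Return the top_k rows most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(rows, key=lambda r: cosine(q, bag_of_words(r)), reverse=True)
    return ranked[:top_k]


# Only these rows would be sent to the LLM as context.
context = retrieve("how many times does korea appear", ROWS)
for row in context:
    print(row)
```

In this made-up data Korea appears in three rows, but the LLM only ever sees two rows, so any count it gives is bounded by what was retrieved rather than by what is in the file.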

Logan M

You probably would be more interested in the pandas query engine, or even some text2sql stuff
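
As a rough sketch of why a SQL-style approach suits these questions, the snippet below loads a small CSV into an in-memory SQLite table and runs the kind of aggregate queries a text2sql setup would need to generate. The table name, column names, and sample rows are assumptions for illustration, since the actual file is not shown here, and the SQL is written by hand rather than produced by an LLM.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the real CSV has 100 rows and 5 fields.
CSV_TEXT = """name,country,product
Ana,Korea,Widget
Ben,Japan,Gadget
Cho,Korea,Gadget
Dev,India,Widget
Eli,Korea,Widget
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (name TEXT, country TEXT, product TEXT)")

# Load every CSV row into the table using named placeholders.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
conn.executemany("INSERT INTO records VALUES (:name, :country, :product)", rows)


def scalar(sql, params=()):
    """Run a query that returns a single value."""
    return conn.execute(sql, params).fetchone()[0]


# Each question from the thread maps to an aggregate over ALL rows.
total_records = scalar("SELECT COUNT(*) FROM records")
country_count = scalar("SELECT COUNT(DISTINCT country) FROM records")
product_count = scalar("SELECT COUNT(DISTINCT product) FROM records")
korea_count = scalar("SELECT COUNT(*) FROM records WHERE country = ?", ("Korea",))

print(total_records, country_count, product_count, korea_count)
```

The difference from semantic retrieval is that the database scans every row, so counts and distinct values are exact. The same questions could be answered with pandas operations on a DataFrame, which is what the pandas query engine route relies on.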

Markb123

so for something like a keyword search we would likely have to use text2sql

Markb123

Sorry I didn't understand how it works by default

Logan M

No worries! Just wanted to make sure we were on the same page 😁

Logan M

Semantic top-k search has its place, but for most CSV data it's probably not the best fit 👍

Markb123

is there a resource that can help with what tools to use for what?

Markb123

Thank you for the help, love the product and would love to have a deeper conversation about scaling, enterprise support, and phone a friend

Logan M

phone a friend is probably always going to be discord/github 🙂 Happy to give candid advice about deployments. Enterprise support is still in the works 💪

Logan M

I would say that the "use cases" section is a decent start https://docs.llamaindex.ai/en/stable/use_cases/q_and_a/root.html

Logan M

This entire understanding section is also a good read for people just starting out https://docs.llamaindex.ai/en/stable/understanding/understanding.html
