Structured

At a glance

Community members discuss problems using the Google Gemma model with LlamaIndex. The SQLAutoVectorQueryEngine and SQLJoinQueryEngine examples raise a "json" error when run with Gemma. The replies note that features requiring structured outputs generally don't work well with open-source models and need a fairly capable LLM. Fine-tuning could help if the model is tuned to output JSON that follows the schema in the prompt, so that LlamaIndex can parse it. To inspect model inputs and outputs, one reply suggests enabling debug logs with llama_index.core.set_global_handler("simple"). The thread ends with an open question about whether fine-tuning would also help the Mistral model, which gives incorrect or empty responses for some SQL table queries.

Alwin

Have you ever tried any implementation of LlamaIndex with the Google Gemma model?
The following examples give me the "json" error when using Gemma:

https://github.com/run-llama/llama_index/blob/main/docs/examples/query_engine/SQLAutoVectorQueryEngine.ipynb
https://github.com/run-llama/llama_index/blob/main/docs/examples/query_engine/SQLJoinQueryEngine.ipynb

Generally speaking, how can we tailor our custom LLM to work well with all the features of LlamaIndex, the way GPT and Claude models do?

Thanks in advance for your response.

10 comments

Logan M

Features that require structured outputs generally don't work great with open source models

Logan M

It generally requires a fairly capable LLM

Alwin

Do you think fine-tuning the LLM would be helpful?

Logan M

If you fine tuned it for outputting json and following the schema in the prompt, yes

Alwin

Can you please elaborate on "outputting json"?
I do not get it!
Right now, what is the default output of Gemma?

Logan M

It means we are prompting the LLM with a JSON schema, and it must output a JSON object that LlamaIndex can parse
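
To make the point above concrete, here is a minimal sketch of what "output a JSON object that can be parsed" means in practice. It uses only the Python standard library and is not LlamaIndex's actual parser; the field names in `REQUIRED_FIELDS` and the function `parse_structured_output` are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical fields for illustration; not LlamaIndex's real prompt schema.
REQUIRED_FIELDS = {"choice": int, "reason": str}


def parse_structured_output(raw_text):
    """Return the parsed dict, or raise ValueError if the text is unusable."""
    # Models often wrap the JSON in extra chatter, so take the outermost braces.
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    try:
        data = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    # Check that every required field is present with the expected type.
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected_type):
            raise ValueError(
                f"field {name!r} missing or not {expected_type.__name__}"
            )
    return data


# A reply that follows the schema parses; a free-text reply does not.
good = 'Sure! {"choice": 1, "reason": "The question is about a SQL table."}'
bad = "I think the first option is best because it is about SQL."
```

A model that answers in the style of `bad` triggers the kind of parsing failure described in this thread, which is why logging the raw model output is a useful first step.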

Logan M

Maybe enabling some debug logs on model inputs/outputs would help

Alwin

Oh, ok.
Can you kindly share some resources in this regard?
it would be really appreciated.

Logan M

import llama_index.core

# Print each LLM input and output to stdout for debugging
llama_index.core.set_global_handler("simple")

Alwin

And another question.
Please accept my apologies for too many questions.

I have also used Mistral; it looks like a better option for structured data.
However, when I query SQL tables, it can't provide correct responses in some cases.
The responses are either incorrect, or it is unable to provide any.

Do you think fine-tuning would be helpful in this case?
