Hello guys,
I am having trouble running the following example, which uses LlamaIndex to retrieve data from both an SQL table and Wikipedia:

https://github.com/run-llama/llama_index/blob/main/docs/examples/query_engine/SQLAutoVectorQueryEngine.ipynb
This code works seamlessly with GPT-3.5 and ChromaDB. However, I attempted to substitute the GPT model with Gemma as follows:
```python
from transformers import AutoTokenizer
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.huggingface import HuggingFaceLLM

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it")

# Swap the default LLM for Gemma; use device_map="cuda" instead of "auto"
# if you have a GPU with enough memory.
Settings.llm = HuggingFaceLLM(
    model_name="google/gemma-7b-it",
    tokenizer_name="google/gemma-7b-it",
    device_map="auto",
)
Settings.tokenizer = tokenizer
Settings.embed_model = "local:BAAI/bge-small-en-v1.5"
```
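To rule out the model setup itself, the swapped-in LLM can be exercised directly before involving the query engine; here is a minimal sanity-check sketch (the prompt is just a placeholder):

```python
# Sanity check: call the Gemma-backed LLM directly, outside the query engine.
# If this fails, the problem is the model setup rather than the SQL/vector logic.
print(Settings.llm.complete("Name three large cities in Europe."))
```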
and ran the query as:
```python
response = query_engine.query(
    "Tell me about the arts and culture of the city with the highest"
    " population"
)
```
but I got the following error:
```
JSONDecodeError: Extra data: line 7 column 1 (char 210)

During handling of the above exception, another exception occurred:

ScannerError                              Traceback (most recent call last)
File c:\Users\.conda\envs\llamaindex_py3.10\lib\site-packages\llama_index\core\output_parsers\selection.py:84, in SelectionOutputParser.parse(self, output)
...
{
    "choice": 2,
    "reason": "The question is about the arts and culture of a city, so the most relevant choice is (2) Useful for answering semantic questions about different cities."
}
]
```
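From the traceback, my understanding is that `SelectionOutputParser` expects the raw LLM output to be a single JSON array, and `Extra data` is what `json.loads` raises when anything follows that array. Here is a minimal reproduction of that failure mode, with a made-up `raw_output` string (I don't know exactly what Gemma returned):

```python
import json

# Hypothetical raw LLM output: a valid JSON array followed by trailing text,
# which is my guess at what Gemma is producing here.
raw_output = """[
    {"choice": 2, "reason": "Useful for semantic questions about cities."}
]
Some extra explanation from the model."""

json.loads(raw_output)  # raises json.JSONDecodeError: Extra data
```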
Can someone assist me with this? Is there anything wrong with using Gemma with the SQL data created in the code?