hmm, I think
- you probably need to change the `messages_to_prompt` and `completion_to_prompt` functions so that they format prompts properly for these models (the built-in ones in llama-index only handle the llama2 format -- not sure about those other two models); see the sketch at the end of this message
- Maybe also change the global tokenizer to match the new model -- blank responses are usually a sign that the inputs got too big (the tokenizer is what gets used to count tokens)
e.g. for llama2 I might do something like:
```python
from llama_index import set_global_tokenizer
from transformers import AutoTokenizer

# count tokens with the same tokenizer the model actually uses
set_global_tokenizer(
    AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf").encode
)
```
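For the first point, here's a minimal sketch of what the custom prompt functions might look like, assuming a ChatML-style template (the `<|im_start|>` tags are just an example -- check each model's card for its actual format):

```python
def messages_to_prompt(messages):
    # messages are llama-index ChatMessage objects with .role and .content
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message.role.value}\n{message.content}<|im_end|>\n"
    # leave the prompt open for the assistant's reply
    prompt += "<|im_start|>assistant\n"
    return prompt

def completion_to_prompt(completion):
    # wrap a bare completion string as a single user turn
    return f"<|im_start|>user\n{completion}<|im_end|>\n<|im_start|>assistant\n"
```

Then pass both functions into the LLM constructor (e.g. `LlamaCPP(..., messages_to_prompt=messages_to_prompt, completion_to_prompt=completion_to_prompt)`) so they get used instead of the llama2 defaults.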