I think the following behavior is generally undesirable and confusing. Currently llama_index defaults to OpenAI if no LLM is set explicitly, which is fine. However, if a ValueError is thrown while resolving the OpenAI model, it silently falls back to LlamaCPP, which by default is hard-coded to download a quantized Llama 2 chat model from TheBloke via Hugging Face. I would not expect an OpenAI config issue to lead to an entire unrelated LLM being downloaded to my system. Is this really the intended behavior? Here is the code from llms -> utils.py:
def resolve_llm(llm: Optional[LLMType] = None) -> LLM:
    """Resolve LLM from string or LLM instance."""
    if llm == "default":
        # return default OpenAI model. If it fails, return LlamaCPP
        try:
            llm = OpenAI()
        except ValueError as e:
            llm = "local"
            print(
                "******\n"
                "Could not load OpenAI model. Using default LlamaCPP=llama2-13b-chat. "
                "If you intended to use OpenAI, please check your OPENAI_API_KEY.\n"
                "Original error:\n"
                f"{e!s}"
                "\n******"
            )
@Logan M Just to clarify, I don't really have an opinion on what the defaults should be. The issue is that an exception triggers an LLM download. This leads to scenarios where you only want to use the OpenAI embedding model, so that's all you set up, and suddenly a 7 GB chat model is being downloaded to your system. I think that is a bug. (Really enjoying llama_index btw. Thanks again.)
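In the meantime, the workaround I'm using is to construct the LLM explicitly instead of relying on the "default" string, so any OpenAI configuration error is raised up front rather than triggering the fallback. A minimal sketch, assuming the usual llama_index imports (the model name is just an example):

from llama_index import ServiceContext
from llama_index.llms import OpenAI

# Constructing OpenAI() eagerly raises the ValueError about a missing or
# invalid OPENAI_API_KEY right here, instead of later inside resolve_llm
# where it would be caught and turned into a LlamaCPP download.
llm = OpenAI(model="gpt-3.5-turbo")
service_context = ServiceContext.from_defaults(llm=llm)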