
Updated last year


At a glance

The community members are trying to use AutoModelForCausalLM from the ctransformers library with the llamaindex library, but are encountering issues with the model file not being found. They suggest skipping the llamaindex loading and passing the model in directly, but this leads to other issues with the config and metadata of the ctransformers model. They explore options like monkey-patching and extending the HuggingFaceLLM class, but problems remain. The community members also consider using llama.cpp instead of ctransformers, but encounter issues with the GGUF model format. Eventually, they find that using the specific LLM class for llama.cpp seems to work.

Useful resources
How can I use AutoModelForCausalLM.from_pretrained('TheBloke/leo-hessianai-7B-chat-GGUF', model_file="leo-hessianai-7b-chat.Q4_K_M.gguf", model_type="llama") from ctransformers with llamaindex? This currently fails with "TheBloke/leo-hessianai-7B-chat-GGUF does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack", because the AutoModelForCausalLM that llamaindex uses comes from the regular transformers library.
25 comments
You can skip letting llamaindex load the model and pass it in directly yourself

Plain Text
model = AutoModelForCausalLM.from_pretrained("TheBloke/leo-hessianai-7B-chat-GGUF", model_file="leo-hessianai-7b-chat.Q4_K_M.gguf", model_type="llama")  # load with ctransformers

llm = HuggingFaceLLM(model=model, ...)
how does this work?
It will just let you load the model the way you want, and skip our default loading code
123 config_dict = self._model.config.to_dict()
124 model_context_window = int(
125     config_dict.get("max_position_embeddings", context_window)
126 )
127 if model_context_window and model_context_window < context_window:

AttributeError: 'Config' object has no attribute 'to_dict'

I am running into this error when calling HuggingFaceLLM(model=llm_model) with the aforementioned model.
How can this be fixed? llm_model.config.__dict__ is defined, but to_dict is not defined for the ctransformers models.
No way to get around that
Probably need to extend the HuggingFaceLLM class and rewrite the init so it works
what would I need to change to make this work? check for the type and then retrieve the config in a different way?
However, even if we fix the to_dict somehow, the next line would fail as max_position_embeddings is not part of the ctransformers configuration.
it looks like monkey-patching works
Yea monkey patching works!
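For reference, a rough sketch of the kind of monkey patch being discussed (assuming the ctransformers config is a plain Python object you can attach a method to; the context_length mapping is an assumption, not something confirmed in this thread):

Plain Text
import types

def _to_dict(self):
    d = dict(vars(self))  # the config's __dict__ exists even though to_dict does not
    # HuggingFaceLLM looks up max_position_embeddings; map it from context_length if set
    if d.get("context_length", 0) and d["context_length"] > 0:
        d["max_position_embeddings"] = d["context_length"]
    return d

llm_model.config.to_dict = types.MethodType(_to_dict, llm_model.config)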

Not sure what the ideal fix is. If this is the only issue, we can add some value checking
More patching is needed though when trying to run the LLM

AttributeError: 'LLM' object has no attribute 'device'
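In the same spirit, the missing device attribute can be stubbed out (illustrative only; "cpu" is an assumption, and whether generation then works end to end with ctransformers isn't confirmed in this thread):

Plain Text
# ctransformers models have no .device, so give the wrapper something to read
llm_model.device = "cpu"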
Just curious, but why use ctransformers and not something like llama.cpp? Seems like llama.cpp is the same thing but with way more support
The getting-started suggestions on HF were showing this approach. How could I load the same model via llamacpp?
When trying to run the aforementioned model (GGUF) with llamacpp, it fails for me with: invalid magic number 20200a7b
Would deconstructing the pipeline and using a custom query engine be an option? If yes, do you have some example to build upon?
Note: I am already using the latest llamacpp version, which should only work with GGUF models
@Logan M but even when using llm_model = Llama(model_path, model_type="llama", ...) from llama_cpp directly, I get AttributeError: 'Llama' object has no attribute 'metadata', and then:

--> 123 config_dict = self._model.config.to_dict()
    124 model_context_window = int(
    125     config_dict.get("max_position_embeddings", context_window)
    126 )
    127 if model_context_window and model_context_window < context_window:

AttributeError: 'Llama' object has no attribute 'config'

It looks like llama_cpp python is also not fully supported?
Oh there's a specific LLM class for llama cpp
this seems to work. Thanks
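For anyone landing here later, a minimal sketch of that route (import path as in the llama_index releases of that era; the local model_path is an assumption about where the GGUF file was downloaded to):

Plain Text
from llama_index.llms import LlamaCPP

# load the GGUF file directly with the dedicated llama.cpp LLM class
llm = LlamaCPP(model_path="./leo-hessianai-7b-chat.Q4_K_M.gguf")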