
Updated last year


At a glance

The community members are trying to use AutoModelForCausalLM from the ctransformers library with the llamaindex library, but are encountering issues with the model file not being found. They suggest skipping the llamaindex loading and passing the model in directly, but this leads to other issues with the config and metadata of the ctransformers model. They explore options like monkey-patching and extending the HuggingFaceLLM class, but problems remain. The community members also consider using llama.cpp instead of ctransformers, but encounter issues with the GGUF model format. Eventually, they find that using the specific LLM class for llama.cpp seems to work.

Useful resources
How can I use AutoModelForCausalLM.from_pretrained('TheBloke/leo-hessianai-7B-chat-GGUF', model_file="leo-hessianai-7b-chat.Q4_K_M.gguf", model_type="llama") from ctransformers with llamaindex? This currently fails with "TheBloke/leo-hessianai-7B-chat-GGUF does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack", because the AutoModelForCausalLM that llamaindex uses comes from the regular transformers library.
25 comments
You can skip letting llamaindex load the model and pass it in directly yourself

Plain Text
model = AutoModelForCausalLM.from_pretrained("TheBloke/leo-hessianai-7B-chat-GGUF", model_file="leo-hessianai-7b-chat.Q4_K_M.gguf", model_type="llama")  # load with ctransformers

llm = HuggingFaceLLM(model=model, ...)
how does this work?
It will just let you load the model the way you want, and skip our default loading code
123 config_dict = self._model.config.to_dict()
124 model_context_window = int(
125     config_dict.get("max_position_embeddings", context_window)
126 )
127 if model_context_window and model_context_window < context_window:

AttributeError: 'Config' object has no attribute 'to_dict'

I am running into this error when calling HuggingFaceLLM(model=llm_model) with the aforementioned model.
How can this be fixed? llm_model.config.__dict__ is defined, but to_dict is not defined for the ctransformers models.
No way to get around that
Probably need to extend the HuggingFaceLLM class and rewrite the init so it works
what would I need to change to make this work? check for the type and then retrieve the config in a different way?
However, even if we fix the to_dict somehow, the next line would fail as max_position_embeddings is not part of the ctransformers configuration.
it looks like monkey-patching works
Yea monkey patching works!
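For reference, a rough sketch of the kind of monkey patch being discussed (assuming the ctransformers config is a plain Python object you can attach a method to; the context_length mapping is an assumption, not something confirmed in this thread):

Plain Text
import types

def _to_dict(self):
    d = dict(vars(self))  # the config's __dict__ exists even though to_dict does not
    # HuggingFaceLLM looks up max_position_embeddings; map it from context_length if set
    if d.get("context_length", 0) and d["context_length"] > 0:
        d["max_position_embeddings"] = d["context_length"]
    return d

llm_model.config.to_dict = types.MethodType(_to_dict, llm_model.config)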

Not sure what the ideal fix is. If this is the only issue, we can add some value checking
More patching is needed though when trying to run the LLM

AttributeError: 'LLM' object has no attribute 'device'
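In the same spirit, the missing device attribute can be stubbed out (illustrative only; "cpu" is an assumption, and whether generation then works end to end with ctransformers isn't confirmed in this thread):

Plain Text
# ctransformers models have no .device, so give the wrapper something to read
llm_model.device = "cpu"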
Just curious, but why use ctransformers and not something like llama.cpp? Seems like llama.cpp is the same thing but with way more support
The getting-started suggestions on HF were showing this approach. How could I load the same model via llamacpp?
When trying to run the aforementioned model (GGUF) with llamacpp, it fails for me with: invalid magic number 20200a7b
Would deconstructing the pipeline and using a custom query engine be an option? If yes, do you have some example to build upon?
Note: I am already using the latest llamacpp version, which should only work with GGUF models
@Logan M but even when using llm_model = Llama(model_path, model_type="llama", ...) from llama_cpp directly, I get AttributeError: 'Llama' object has no attribute 'metadata', and then:

--> 123 config_dict = self._model.config.to_dict()
    124 model_context_window = int(
    125     config_dict.get("max_position_embeddings", context_window)
    126 )
    127 if model_context_window and model_context_window < context_window:

AttributeError: 'Llama' object has no attribute 'config'

It looks like llama_cpp python is also not fully supported?
Oh there's a specific LLM class for llama cpp
this seems to work. Thanks
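For anyone landing here later, a minimal sketch of that route (import path as in the llama_index releases of that era; the local model_path is an assumption about where the GGUF file was downloaded to):

Plain Text
from llama_index.llms import LlamaCPP

# load the GGUF file directly with the dedicated llama.cpp LLM class
llm = LlamaCPP(model_path="./leo-hessianai-7b-chat.Q4_K_M.gguf")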