
Quantized Llama 2

Hello Experts,
How do I use llama_index with quantized Llama 2 models?
Almost the same thing, but you would use a Hugging Face quantized model

https://huggingface.co/TheBloke/Llama-2-7B-GGML
maybe @Logan M can validate this
Yea, you'll want to use llama.cpp for GGML or GGUF files

https://gpt-index.readthedocs.io/en/stable/examples/llm/llama_2_llama_cpp.html
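A minimal sketch of what the linked LlamaCPP setup looks like. The model path and parameter values here are placeholder assumptions; the class and argument names follow the linked llama_index docs for the API of that era:

```python
# Sketch: parameters for llama_index's LlamaCPP wrapper.
# The model path below is a placeholder assumption -- point it at a
# quantized GGML/GGUF file you have downloaded (e.g. from TheBloke's repos).
llama_cpp_kwargs = {
    "model_path": "./models/llama-2-7b.Q4_K_M.gguf",  # local quantized model
    "temperature": 0.1,
    "max_new_tokens": 256,
    "context_window": 3900,               # Llama 2 supports up to 4096 tokens
    "model_kwargs": {"n_gpu_layers": 0},  # set > 0 to offload layers to GPU
}

# Actual usage (requires `pip install llama-index llama-cpp-python`
# and a real model file; not executed here):
# from llama_index.llms import LlamaCPP
# from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader
# llm = LlamaCPP(**llama_cpp_kwargs)
# service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")
# index = VectorStoreIndex.from_documents(
#     SimpleDirectoryReader("./data").load_data(),
#     service_context=service_context,
# )
```

With `n_gpu_layers` at 0 everything runs on CPU, which is the typical reason for picking a 4-bit quantized file in the first place.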

Hugging Face also supports standard quantization using bitsandbytes or GPTQ
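For the bitsandbytes route, a hedged sketch of loading a 4-bit-quantized Hugging Face model and handing it to llama_index. The model name and config values are assumptions, not from the thread:

```python
# Sketch: 4-bit quantization settings for transformers' BitsAndBytesConfig.
# Values are illustrative assumptions (NF4 with fp16 compute is a common choice).
quant_config_kwargs = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",
}

# Actual usage (requires `pip install transformers accelerate bitsandbytes
# llama-index` plus a CUDA GPU; not executed here). The model name is an
# example, not specified in the thread:
# from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
# from llama_index.llms import HuggingFaceLLM
# quant_config = BitsAndBytesConfig(**quant_config_kwargs)
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Llama-2-7b-chat-hf",
#     quantization_config=quant_config,
#     device_map="auto",
# )
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
# llm = HuggingFaceLLM(model=model, tokenizer=tokenizer,
#                      context_window=3900, max_new_tokens=256)
```

Unlike the GGML/GGUF path, bitsandbytes quantizes the full-precision Hugging Face weights at load time, so you download the original model rather than a pre-quantized file.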