I'm using the default LlamaCPP=llama2-13b-chat when following the tutorial. What if I want to use TheBloke/Platypus2-70B-Instruct-GPTQ instead? I'm having a hard time finding any info on llama-index + GPTQ.

When I point llama.cpp at the GPTQ weights, loading fails:

llama.cpp: loading model from /opt/gptq/models/TheBloke_OpenOrca-Platypus2-13B-GPTQ/gptq_model-4bit-128g.safetensors
error loading model: unknown (magic, version) combination: 000288b0, 00000000; is this really a GGML file?
llama_init_from_file: failed to load model