Hopefully not too stupid a question. I have fine-tuned zephyr7b-alpha-GPTQ with PEFT and pushed the adapter to the Hugging Face Hub. I would like to use this model together with the adapter to play around with RAG. The conventional way to use an HF-hosted model is described here:
https://github.com/run-llama/llama_index/blob/main/docs/examples/llm/huggingface.ipynb
That approach would work if I just gave it the zephyr7b-alpha-GPTQ path, but it throws an error if I point it at my HF-hosted adapter (obvious, I guess, since that repo contains only the adapter).
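
A minimal sketch of one possible approach, untested against this exact model: load the base model first, attach the adapter with PEFT, and hand the already-loaded objects to LlamaIndex instead of a repo name. The helper names (`looks_like_adapter_only`, `load_base_with_adapter`, `build_llm`) are hypothetical, the `llama_index.llms` import path is an assumption that depends on the installed LlamaIndex version, and whether `HuggingFaceLLM` accepts a PEFT-wrapped model cleanly is also an assumption.

```python
def looks_like_adapter_only(filenames):
    """True if a repo file listing has PEFT adapter files but no base model config."""
    names = set(filenames)
    return "adapter_config.json" in names and "config.json" not in names


def load_base_with_adapter(base_id, adapter_id):
    """Load the base model, then attach the PEFT adapter on top of it."""
    # Imported lazily so this sketch can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Loading a GPTQ checkpoint additionally needs a GPTQ backend installed.
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    # Assumption: the tokenizer is unchanged by fine-tuning, so take it from the base repo.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer


def build_llm(model, tokenizer):
    """Wrap already-loaded objects for LlamaIndex."""
    # Assumption: older package layout; newer releases may expose this class elsewhere.
    from llama_index.llms import HuggingFaceLLM

    return HuggingFaceLLM(model=model, tokenizer=tokenizer)
```

Usage would look roughly like `model, tokenizer = load_base_with_adapter(base_repo, adapter_repo)` followed by `llm = build_llm(model, tokenizer)`, where `base_repo` and `adapter_repo` are placeholders for the two Hub repo ids. The first helper only illustrates why pointing a loader straight at the adapter repo fails: a PEFT adapter repo carries `adapter_config.json` but no `config.json` for the base model.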