----------------------
The error message you're seeing is due to the query_wrapper_prompt being passed as a plain string, while the HuggingFaceLLM class expects an instance of a Prompt class. In your case, you should wrap your string in the SimpleInputPrompt class. Here's how you can do it:
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts.prompts import SimpleInputPrompt
query_wrapper_prompt = SimpleInputPrompt("<|USER|>{query_str}<|ASSISTANT|>")
llm = HuggingFaceLLM(
    context_window=1028,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.7, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="StabilityAI/stablelm-tuned-alpha-3b",
    model_name="StabilityAI/stablelm-tuned-alpha-3b",
    device_map="cpu",
    stopping_ids=[50278, 50279, 50277, 1, 0],
    tokenizer_kwargs={"max_length": 1028},
    # uncomment this if using CUDA to reduce memory usage
    # model_kwargs={"torch_dtype": torch.float16}
)
This code is based on the examples provided in the LlamaIndex documentation.
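If it helps, here is a minimal sketch of how the configured llm could then be plugged into an index and queried. It assumes the same LlamaIndex version as above and a hypothetical ./data folder containing your documents; adjust it to your setup:
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

# `llm` is the HuggingFaceLLM instance created above
documents = SimpleDirectoryReader("./data").load_data()  # hypothetical data folder
service_context = ServiceContext.from_defaults(llm=llm)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
response = query_engine.query("Summarize the documents.")
print(response)
Note that the embedding model is configured separately from the LLM, so the default embedding settings still apply unless you also pass an embed_model to ServiceContext.from_defaults.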
----------------------