(let me know if I should take this discussion elsewhere)
I'm wondering where completion comes from in here:
f"{completion.strip()} {E_INST}"
Working through what you gave me and applying it to my setup, I have:
BOS, EOS = "<s>", "</s>"
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
query_wrapper_prompt = (
    f"{BOS}{B_INST} "
    "{query_str} "
    f"{E_INST}"
)
wrapper = SimpleInputPrompt(query_wrapper_prompt, prompt_type=PromptType.SIMPLE_INPUT)
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=2048,
    # system_prompt=system_prompt,
    generate_kwargs={"temperature": 0.0, "do_sample": False},
    query_wrapper_prompt=wrapper,
    tokenizer_name=selected_model,
    model_name=selected_model,
    device_map="cpu",
    # change these settings below depending on your GPU
    # model_kwargs={"torch_dtype": torch.float16, "load_in_8bit": True},
)
I took
"{query_str} "
and effectively the whole line
wrapper = SimpleInputPrompt(query_wrapper_prompt, prompt_type=PromptType.SIMPLE_INPUT)
from the docs.
I'm currently running the query against the SQL database (on CPU, so it's slow), but I wanted to ask whether you see anything wrong in what I'm doing right now.
I wasn't sure where {completion.strip()} came from in your example, so I just swapped it out for "{query_str} " (per my understanding of the docs).
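For reference, here is a minimal standalone check of what the template string evaluates to, in plain Python with no llama_index import. It assumes the wrapper later fills {query_str} with str.format-style substitution, which I have not confirmed in the library source, and the example question is made up for illustration.

```python
BOS, EOS = "<s>", "</s>"
B_INST, E_INST = "[INST]", "[/INST]"

# Adjacent string literals are concatenated. Only the f-string pieces are
# interpolated here, so {query_str} survives as a literal placeholder.
query_wrapper_prompt = (
    f"{BOS}{B_INST} "
    "{query_str} "
    f"{E_INST}"
)
print(repr(query_wrapper_prompt))

# Assumption: the placeholder is filled later, str.format style.
# The question text below is hypothetical.
filled = query_wrapper_prompt.format(query_str="How many rows are in the table?")
print(repr(filled))
```

If that is how the substitution works, the model would see the question wrapped as a single [INST] ... [/INST] turn with no system block.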