Where do you find the Camel ones?
why this one over say, mname = "databricks/dolly-v2-12b"
?
Meh, I've tried dolly with llama index. This one is smaller and seems to give better responses
You can definitely try dolly though
in what ways was it better for you?
ooo I like that it is instruction following trained
that is what I am looking for ....
For the refine prompt in llama index, camel worked much better. This is a super difficult prompt for most models
Plus it uses less resources, and is faster because of that
I watched that video by Andrew Ng and OpenAI and I got jaded
about how easy it should be to string together instructions
Even gpt-3.5 kinda stinks at it
I think the generality is not something I actually need - I have a very specific task that I bet I could just train on
I just don't feel like collecting all the data and formatting it correctly
how do I know whether to use GPTNeoXTokenizerFast?
Usually you can just use AutoTokenizer and pass the model name
Otherwise, the model card on huggingface should have an explicit demo
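Something like this usually does the trick (a quick sketch, using camel as the example):

from transformers import AutoTokenizer

# AutoTokenizer reads the model's config/tokenizer files from the Hub and picks
# the right concrete class (GPTNeoXTokenizerFast, GPT2TokenizerFast, etc.) for you
tokenizer = AutoTokenizer.from_pretrained("Writer/camel-5b-hf")
print(type(tokenizer).__name__)  # prints whichever class it resolved to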
is there a way on Hugging Face to see the default number of supported input tokens?
Eh it's kind of convoluted. You usually have to look at the config json or read the model card
I wish there was an easier way though
Using the config, there will be something like max_position_embeddings, or something like that
I can look at the model in particular if you can't find it
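For example, something like this (a sketch; the exact attribute name depends on the architecture, which is why it's annoying):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("Writer/camel-5b-hf")
# different architectures call the context-length field different things
for attr in ("max_position_embeddings", "n_positions", "max_sequence_length"):
    if hasattr(config, attr):
        print(attr, "=", getattr(config, attr))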
I thought it was a token limit thing but now
I'm seeing something different
raise ValueError("If `eos_token_id` is defined, make sure that `pad_token_id` is defined.")
That's.. interesting lol
What model is this?
Can i see how you set it up? I never got that error when I tested a few days ago
mname = "Writer/camel-5b-hf"
tokenizer = AutoTokenizer.from_pretrained(mname, cache_dir="../camel/tokenizer")
model = AutoModelForCausalLM.from_pretrained(mname, device_map="auto", cache_dir="../camel/model")
FULL_PROMPT = QuestionAnswerPrompt(JTC_QA_PROMPT2)
class CamelLLM(LLM):
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
print(prompt)
generation_config = GenerationConfig(
max_new_tokens=5000,
temperature=0.1,
repetition_penalty=1.0,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
tokens = model.generate(**inputs, generation_config=generation_config, output_scores=True, max_new_tokens=num_output)
response = tokenizer.decode(tokens[0], skip_special_tokens=True).strip()
return response[len(prompt):]
@property
def _identifying_params(self) -> Mapping[str, Any]:
return {"name_of_model": model}
@property
def _llm_type(self) -> str:
return "custom"
I've been using the same template for a class and overwriting/copy-pasting
so it could be a mismatch of parameters or something obscure
Very sus haha
One note: max_new_tokens is too big. Try like 256 or 512
Do you know which line of code raises that error?
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Just that the model will talk until it predicts a special token (or it runs out of room)
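If you want the warning gone and a saner token budget, something like this should work in your _call (a sketch, reusing the tokenizer/model/inputs from your snippet above):

import torch
from transformers import GenerationConfig

generation_config = GenerationConfig(
    max_new_tokens=256,  # way smaller than 5000, plenty for one answer
    temperature=0.1,
    repetition_penalty=1.0,
    pad_token_id=tokenizer.eos_token_id,  # reuse EOS as the pad token so generate() stops complaining
)
with torch.inference_mode():
    tokens = model.generate(**inputs, generation_config=generation_config)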
OK, I think I was just mixing up two different custom classes
response = tokenizer.decode(generation_output[0], skip_special_tokens=True).strip()
return response[len(prompt):]
here, what does the return response[len(prompt):]
mean?
because for Camel on Hugging Face, it looks different
model_inputs = tokenizer(text, return_tensors="pt").to("cuda")
output_ids = model.generate(
    **model_inputs,
    max_length=256,
)
output_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
clean_output = output_text.split("### Response:")[1].strip()
print(clean_output)
So the input is the prompt, and the output is the prompt PLUS newly generated words
We only want to return the new words
And correct, that's just another way of returning only the new words
oh, because the prompt has a length
Yea! And for camel, we know everything after that response string is new, so it splits on that to get the new stuff
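A toy example of the two ways of keeping only the new text (pure string handling, no model needed; the instruction/response format is just illustrative):

prompt = "### Instruction:\nSay hi\n\n### Response:"
decoded = prompt + "\nHello there!"  # decode() gives back prompt + newly generated text

# option 1: slice off the prompt by its character length
print(decoded[len(prompt):].strip())               # -> Hello there!

# option 2 (camel style): split on the response marker it was trained with
print(decoded.split("### Response:")[1].strip())   # -> Hello there!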
but that was easy enough to see
Yea. So for camel, it's trained for that specific prompt, so using it helps it work better 💪
Every model will be a little different, but yea lol
that makes sense but I would have no idea how to know that
Yea... it takes some experience haha
I've been working with Hugging Face stuff for a few years, definitely not intuitive