@Logan M for Groq, what would the optimal embed_model and tokenizer be?
you can use any embed model; the LLM and embed model are independent
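(For reference, one way to wire this up in LlamaIndex; a minimal sketch assuming the llama-index-llms-groq and llama-index-embeddings-huggingface integration packages are installed, with illustrative model names and key:)

    from llama_index.core import Settings
    from llama_index.llms.groq import Groq
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding

    # LLM served by Groq; the model name and API key are illustrative
    Settings.llm = Groq(model="llama2-70b-4096", api_key="YOUR_GROQ_API_KEY")

    # the embed model is independent of the LLM; any embed model works
    Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")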
hmm ... but just curious about the tokenizer, it doesn't seem to work as expected
Not sure what you mean?
for instance, with OpenAI it was something like tokenizer=tiktoken.encoding_for_model(model), which let me compute the token count with the same tokenizer the model actually uses ... but with Groq that doesn't seem to quite work
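(For context, the OpenAI-style setup being described looks roughly like this; TokenCountingHandler is LlamaIndex's token counter, and the model name is illustrative:)

    import tiktoken
    from llama_index.core import Settings
    from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

    # count tokens with the same tokenizer the OpenAI model uses
    token_counter = TokenCountingHandler(
        tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode
    )
    Settings.callback_manager = CallbackManager([token_counter])

    # after running some queries: token_counter.total_llm_token_count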
Right, I have no idea what tokenizer Groq uses lol (or if they even expose one). Tiktoken will get you an approximate count, I suppose
basically I'm referring to the Llama2 model that's hosted on Groq
oh, it's Llama2. You can just set the tokenizer to something like tokenizer=AutoTokenizer.from_pretrained("<some llama2 model>").encode
just using any Llama2 tokenizer from Hugging Face
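(Putting that together, a minimal sketch assuming the transformers package is installed; "meta-llama/Llama-2-7b-hf" is one illustrative, gated checkpoint, and any Llama2 tokenizer gives the same counts:)

    from transformers import AutoTokenizer
    from llama_index.core import Settings
    from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

    # illustrative Llama2 checkpoint; requires access approval on Hugging Face
    llama2_tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

    # use it globally for chunking and context-window accounting
    Settings.tokenizer = llama2_tokenizer.encode

    # and for token counting, so the counts match what Groq's Llama2 sees
    token_counter = TokenCountingHandler(tokenizer=llama2_tokenizer.encode)
    Settings.callback_manager = CallbackManager([token_counter])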