
Updated last year

I am having an issue with the compact and refine prompt

At a glance

A community member is hitting an issue with the default "compact and refine" prompt for the Faithfulness evaluator: it keeps raising an error saying the input breaches the context window. This is likely a small token-counting bug, since the default tokenizer is a gpt-3.5 tokenizer. The suggested workaround is to configure the tokenizer using AutoTokenizer and set_global_tokenizer; a pull request is in progress to address the issue more comprehensively.

I am having an issue with the compact and refine prompt (the default) for the Faithfulness evaluator. It should be squishing my context into the context window and refining, but it keeps throwing an error saying that it is breaching the context window. Is this a bug?

Plain Text
huggingface_hub.inference._text_generation.ValidationError: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4096. Given: 3896 `inputs` tokens and 256 `max_new_tokens`
make: *** [evals] Error 1
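For reference, the arithmetic behind that error is simple: the prompt plus the generation budget overshoot the model's (assumed) 4096-token window by 56 tokens, which fits the theory that the default tokenizer is slightly undercounting:

```python
# Numbers taken from the error message above; the 4096-token
# context window is assumed from the validation error.
context_window = 4096
input_tokens = 3896    # `inputs` tokens reported by the server
max_new_tokens = 256   # default generation budget
total = input_tokens + max_new_tokens  # 4152
overshoot = total - context_window     # 56 tokens over the limit
```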
12 comments
ah, likely a small bug with token counting (the default tokenizer is using a gpt-3.5 tokenizer)
@Logan M

Ahhh ok. Yeah I am using this model on the Inference Endpoints:

TheBloke/prometheus-13B-v1.0-AWQ

Is the tokenizer configurable?
it is! And I'm actually just fixing the example lol give me one sec
This is kinda janky, but should work (I need to make a PR to do the partial under the hood I think lol)

Plain Text
from functools import partial

from transformers import AutoTokenizer
from llama_index import set_global_tokenizer

# Count tokens with the model's own tokenizer instead of the default
# gpt-3.5 one; add_special_tokens=False keeps counts aligned with the
# raw text the inference server sees.
tokenizer = AutoTokenizer.from_pretrained("TheBloke/prometheus-13B-v1.0-AWQ")
set_global_tokenizer(partial(tokenizer.encode, add_special_tokens=False))
ahhh sweet. I'll give it a go now.
I would love to check out the PR when it's ready btw so I can see what the fix was
basically an isinstance check for the huggingface tokenizer base class lol
then doing the partial for you under the hood
(it's part of a much larger PR to deprecate the entire service context and replace it with a better Settings object lol -- v0.10.0 coming soon!)
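The fix described above might look roughly like this. This is a hypothetical sketch, not the actual llama_index code; `normalize_tokenizer` is an invented name, though `PreTrainedTokenizerBase` is the real huggingface base class:

```python
from functools import partial


def normalize_tokenizer(tokenizer):
    """Wrap a huggingface tokenizer so it can be used as a plain
    text -> tokens callable (hypothetical sketch of the fix)."""
    try:
        from transformers import PreTrainedTokenizerBase
    except ImportError:
        # transformers not installed; nothing to normalize.
        return tokenizer

    if isinstance(tokenizer, PreTrainedTokenizerBase):
        # Bind add_special_tokens=False so token counts match the raw
        # completion input, mirroring the manual workaround above.
        return partial(tokenizer.encode, add_special_tokens=False)
    return tokenizer
```

With this in place, a user could pass the AutoTokenizer object directly and the library would apply the partial under the hood.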
sounds like a lot of work!
Thanks for the quick fix man!