I am having an issue with the compact and refine prompt (the default) for the Faithfulness evaluator. It should be squishing my context into the context window and refining, but it keeps throwing an error saying that it is breaching the context window. Is this a bug?

Plain Text
huggingface_hub.inference._text_generation.ValidationError: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4096. Given: 3896 `inputs` tokens and 256 `max_new_tokens`
make: *** [evals] Error 1
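For reference, the numbers in the error add up like this (simple context-window arithmetic, assuming the endpoint's 4096-token limit from the message above):

```python
# Values taken directly from the validation error
context_window = 4096
input_tokens = 3896
max_new_tokens = 256

# The endpoint requires input_tokens + max_new_tokens <= context_window
overflow = input_tokens + max_new_tokens - context_window
print(overflow)  # -> 56, i.e. 56 tokens over the limit
```

So the request is only slightly over budget, which is consistent with a token-counting mismatch rather than a wildly oversized prompt.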
ah, likely a small bug with token counting (the default tokenizer is using a gpt-3.5 tokenizer)
@Logan M

Ahhh ok. Yeah I am using this model on the Inference Endpoints:

TheBloke/prometheus-13B-v1.0-AWQ

Is the tokenizer configurable?
it is! And I'm actually just fixing the example lol give me one sec
This is kinda janky, but should work (I need to make a PR to do the partial under the hood I think lol)

Plain Text
from functools import partial
from transformers import AutoTokenizer
from llama_index import set_global_tokenizer

set_global_tokenizer(
    partial(
        AutoTokenizer.from_pretrained("TheBloke/prometheus-13B-v1.0-AWQ").encode,
        add_special_tokens=False,
    )
)
ahhh sweet. I'll give it a go now.
I would love to check out the PR when it's ready too, btw, so I can see what the fix was
basically an isinstance check for the huggingface tokenizer base class lol
then doing the partial for you under the hood
(it's part of a much larger PR to deprecate the entire service context and replace it with a better Settings object lol -- v0.10.0 coming soon!)
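A minimal sketch of what that isinstance-plus-partial handling could look like (the `PreTrainedTokenizerBase` here is a toy stand-in for the real transformers base class, and this `set_global_tokenizer` is illustrative, not the actual PR code):

```python
from functools import partial

# Toy stand-in for transformers.PreTrainedTokenizerBase, just to
# illustrate the pattern; the real check targets the actual HF class.
class PreTrainedTokenizerBase:
    def encode(self, text, add_special_tokens=True):
        tokens = text.split()
        if add_special_tokens:
            tokens = ["<s>"] + tokens + ["</s>"]
        return tokens

def set_global_tokenizer(tokenizer):
    # If a raw HuggingFace tokenizer is passed, wrap its encode() in a
    # partial with add_special_tokens=False so token counts line up;
    # otherwise assume it is already a plain str -> list callable.
    if isinstance(tokenizer, PreTrainedTokenizerBase):
        return partial(tokenizer.encode, add_special_tokens=False)
    return tokenizer

tok = set_global_tokenizer(PreTrainedTokenizerBase())
print(tok("hello world"))  # -> ['hello', 'world'], no special tokens
```

This keeps the user-facing API a simple callable while hiding the `partial` wrapping shown in the earlier snippet.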
sounds like a lot of work!
Thanks for the quick fix man!