Hi everyone,
I would like to use a model hosted behind Hugging Face TGI together with LlamaIndex and its chat engine. Additionally, I would like to make use of TGI's constrained generation features.
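Concretely, this is the kind of constrained call I mean when talking to TGI directly. A minimal sketch, assuming a local TGI endpoint at `http://localhost:8080`; the `grammar` argument to `InferenceClient.text_generation()` is the TGI constrained-generation feature I'm referring to, and the JSON schema here is just a made-up example:

```python
from huggingface_hub import InferenceClient

# Point the client at the TGI endpoint (adjust the URL for your deployment).
client = InferenceClient(model="http://localhost:8080")

# TGI-style grammar spec: constrain the output to match a JSON schema.
# huggingface_hub also ships a typed TextGenerationInputGrammarType for this,
# but a plain dict in the {"type": ..., "value": ...} shape works as a sketch.
grammar = {
    "type": "json",
    "value": {
        "properties": {
            "answer": {"type": "string"},
            "confidence": {"type": "number"},
        },
        "required": ["answer", "confidence"],
    },
}

# Direct text_generation() call with constrained output.
out = client.text_generation(
    "Answer as JSON: what is the capital of France?",
    max_new_tokens=200,
    grammar=grammar,
)
print(out)
```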
I assume this is the way to use TGI as the LLM: https://github.com/run-llama/llama_index/issues/9532

In that case, where would I pass the grammar parameter to the model? Is there a way to pass kwargs through the chat engine's chat functions so that they reach TGI's text_generation method? A sketch of what I mean is below.
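This is roughly the LlamaIndex side of my setup, and the spot where I get stuck. A minimal sketch, assuming `HuggingFaceInferenceAPI` accepts the TGI endpoint URL as `model_name` (it forwards it to huggingface_hub's `InferenceClient`); I don't see an obvious place to hand `grammar` through:

```python
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.huggingface import HuggingFaceInferenceAPI

# Point the LlamaIndex LLM wrapper at the TGI endpoint
# (assumption: a URL works here, since model_name is passed
# through to huggingface_hub's InferenceClient).
llm = HuggingFaceInferenceAPI(model_name="http://localhost:8080")

chat_engine = SimpleChatEngine.from_defaults(llm=llm)

# This is the part I'm unsure about: chat() takes the message and
# optional chat history, so there seems to be no obvious place to
# pass `grammar` down to the underlying text_generation() call.
response = chat_engine.chat("Answer as JSON: what is the capital of France?")
print(response)
```

Is the intended route to bake such kwargs into the LLM object at construction time, or is there a supported way to pass them per-call through the chat engine?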