Hi everyone,
I would like to use a model hosted behind Hugging Face TGI together with LlamaIndex and its chat engine. Additionally, I would like to make use of TGI's constrained generation features.
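Concretely, this is the kind of constrained call I mean when talking to TGI directly. A minimal sketch, assuming a local TGI endpoint at `http://localhost:8080`; the `grammar` argument to `InferenceClient.text_generation()` is the TGI constrained-generation feature I'm referring to, and the JSON schema here is just a made-up example:

```python
from huggingface_hub import InferenceClient

# Point the client at the TGI endpoint (adjust the URL for your deployment).
client = InferenceClient(model="http://localhost:8080")

# TGI-style grammar spec: constrain the output to match a JSON schema.
# huggingface_hub also ships a typed TextGenerationInputGrammarType for this,
# but a plain dict in the {"type": ..., "value": ...} shape works as a sketch.
grammar = {
    "type": "json",
    "value": {
        "properties": {
            "answer": {"type": "string"},
            "confidence": {"type": "number"},
        },
        "required": ["answer", "confidence"],
    },
}

# Direct text_generation() call with constrained output.
out = client.text_generation(
    "Answer as JSON: what is the capital of France?",
    max_new_tokens=200,
    grammar=grammar,
)
print(out)
```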
I assume this is the way to use TGI as the LLM: https://github.com/run-llama/llama_index/issues/9532

In that case, where would I pass the grammar parameter to the model? Is there a way to pass kwargs through the chat engine's chat functions so that they reach TGI's text_generation method? A sketch of what I mean is below.
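This is roughly the LlamaIndex side of my setup, and the spot where I get stuck. A minimal sketch, assuming `HuggingFaceInferenceAPI` accepts the TGI endpoint URL as `model_name` (it forwards it to huggingface_hub's `InferenceClient`); I don't see an obvious place to hand `grammar` through:

```python
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.llms.huggingface import HuggingFaceInferenceAPI

# Point the LlamaIndex LLM wrapper at the TGI endpoint
# (assumption: a URL works here, since model_name is passed
# through to huggingface_hub's InferenceClient).
llm = HuggingFaceInferenceAPI(model_name="http://localhost:8080")

chat_engine = SimpleChatEngine.from_defaults(llm=llm)

# This is the part I'm unsure about: chat() takes the message and
# optional chat history, so there seems to be no obvious place to
# pass `grammar` down to the underlying text_generation() call.
response = chat_engine.chat("Answer as JSON: what is the capital of France?")
print(response)
```

Is the intended route to bake such kwargs into the LLM object at construction time, or is there a supported way to pass them per-call through the chat engine?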