Hi, I am using an HF model and created a pipeline, but I am unable to write the service context code, since it accepts an LLM model as input but not a pipeline. Can someone tell me how to set up a service context when using HF LLM pipelines?
I created a text-generation pipeline and tried passing it in like ServiceContext(llm=pipeline, ...). It failed with a few params missing: text_inputs and system_prompt.
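For reference, ServiceContext(llm=...) expects a LlamaIndex LLM object rather than a raw transformers pipeline, which is presumably where the stray text_inputs / system_prompt params come from. One workaround is to wrap the pipeline in a CustomLLM subclass and hand that to the service context instead. A minimal sketch, assuming llama-index 0.9-style imports (newer releases moved these under llama_index.core and replaced ServiceContext with Settings) and gpt2 as a stand-in model:
```python
from typing import Any

from transformers import pipeline
from llama_index import ServiceContext
from llama_index.llms import (
    CompletionResponse,
    CompletionResponseGen,
    CustomLLM,
    LLMMetadata,
)
from llama_index.llms.base import llm_completion_callback

# Module-level pipeline, so the pydantic-based CustomLLM doesn't need it as a field
pipe = pipeline("text-generation", model="gpt2")


class PipelineLLM(CustomLLM):
    """Wraps a transformers text-generation pipeline as a LlamaIndex LLM."""

    context_window: int = 1024  # GPT-2's context size
    num_output: int = 256
    model_name: str = "gpt2"

    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # return_full_text=False strips the prompt from the pipeline output
        out = pipe(prompt, max_new_tokens=self.num_output, return_full_text=False)
        return CompletionResponse(text=out[0]["generated_text"])

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # Naive streaming: yield the whole completion as a single chunk
        yield self.complete(prompt, **kwargs)


# embed_model="local" avoids falling back to the default OpenAI embedding
service_context = ServiceContext.from_defaults(llm=PipelineLLM(), embed_model="local")
```
Keeping the pipeline at module level sidesteps CustomLLM's pydantic field validation; the wrapper only needs complete, stream_complete, and metadata to plug into the rest of LlamaIndex.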
The reason I am using a pipeline is that it gives flexibility to change parameters that are fixed as defaults in the LLM model, e.g. eos_token, padding_side, etc.
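Those overrides can also be baked into the pipeline itself at construction time, so the wrapper above stays untouched. A sketch of that kind of setup, again with gpt2 as a placeholder:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Decoder-only models want left padding for batched generation
tokenizer = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

model = AutoModelForCausalLM.from_pretrained("gpt2")

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,  # silences the pad_token_id warning
)
```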
Error I got with the model: No chat template is defined for this tokenizer - using the default template for the GPT2TokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template.
Setting pad_token_id to eos_token_id:50256 for open-end generation. A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer.
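Both warnings can be silenced on the tokenizer before building the pipeline. The chat-template one only matters if you call the pipeline with chat-style messages; since GPT-2 ships without a template, you would have to set one yourself. A sketch, with a deliberately minimal (hypothetical) Jinja template - match it to the prompt format your model actually expects:
```python
# Decoder-only models should pad on the left; this plus a pad token
# silences the two generation warnings above
tokenizer.padding_side = "left"
tokenizer.pad_token_id = tokenizer.eos_token_id

# Minimal role/content Jinja template; purely illustrative
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ message['role'] }}: {{ message['content'] }}\n"
    "{% endfor %}"
    "assistant:"
)
```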
@Logan M is there a design reason why HF pipelines aren't supported? Are there plans for that? I am hoping to contribute Optimum Intel support for better CPU performance, so it would be good to know.