Pipelines

Hi,
I am using a HF model and created a pipeline, but I am unable to write the code for the service context, since it accepts an LLM model as input but not a pipeline.
Can someone tell me how to set up a service context when using HF LLM pipelines?
6 comments
We don't technically support pipelines; we use the model directly
But I created a text-generation pipeline and tried passing it in like

ServiceContext(llm=pipeline, ...)

It complained about a few missing params - text_inputs and system_prompt.


The reason I am using a pipeline is that it gives flexibility to change parameters that are defaults in the LLM model, e.g. eos_token, padding_side, etc.
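Roughly, the attempt looks like the sketch below ("gpt2" is only assumed here from the GPT2TokenizerFast warning further down; the final ServiceContext call is the part that fails):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from llama_index import ServiceContext

# Tokenizer with the non-default settings mentioned above.
tokenizer = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained("gpt2")
text_gen = pipeline("text-generation", model=model, tokenizer=tokenizer)

# ServiceContext expects a LlamaIndex LLM object, not a transformers
# pipeline, which is presumably why it complains about missing params
# like text_inputs and system_prompt.
service_context = ServiceContext.from_defaults(llm=text_gen)
```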

Errors I got with the model:
No chat template is defined for this tokenizer - using the default template for the GPT2TokenizerFast class. If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template.

Setting pad_token_id to eos_token_id:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer.
right, yea like I mentioned, huggingface pipelines aren't supported
Ok, then how can I change the architecture or padding_side while using the model directly?

If you could share a snippet for the same, that would be really helpful.
Thanks
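A minimal sketch of what that could look like, assuming a ServiceContext-era llama_index where HuggingFaceLLM accepts tokenizer_kwargs and generate_kwargs (the left padding and pad token below are exactly the settings flagged by the warnings above; "gpt2" is again just an assumption):

```python
from llama_index import ServiceContext
from llama_index.llms import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="gpt2",
    tokenizer_name="gpt2",
    context_window=1024,
    max_new_tokens=256,
    # Left padding for a decoder-only model, plus an explicit pad token
    # (50256 is GPT-2's eos token id, as in the warning above).
    tokenizer_kwargs={"padding_side": "left"},
    generate_kwargs={"pad_token_id": 50256},
    device_map="auto",
)

service_context = ServiceContext.from_defaults(llm=llm)
```

HuggingFaceLLM should also accept pre-built model= and tokenizer= objects if you prefer to construct them yourself with AutoModelForCausalLM / AutoTokenizer.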
@Logan M is there a design reason why HF pipelines aren't supported? Are there plans for that? I am hoping to contribute Optimum Intel support for better CPU performance, and it would be good to know.
I find huggingface pipelines a little obnoxious to use lol but if you want to contribute an LLM class that wraps one, I'm all for it!
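For anyone who wants to try that, here is a rough sketch of a CustomLLM wrapper around a text-generation pipeline, assuming a pre-0.10 llama_index layout where CustomLLM and llm_completion_callback are importable from llama_index.llms / llama_index.llms.base (the class name PipelineLLM and the module-level hf_pipe are made up for illustration):

```python
from typing import Any

from transformers import pipeline
from llama_index.llms import (
    CompletionResponse,
    CompletionResponseGen,
    CustomLLM,
    LLMMetadata,
)
from llama_index.llms.base import llm_completion_callback

# Keep the pipeline at module level so the pydantic-based CustomLLM class
# does not have to hold the transformers object as a field.
hf_pipe = pipeline("text-generation", model="gpt2")


class PipelineLLM(CustomLLM):
    """Hypothetical LLM class wrapping a HF text-generation pipeline."""

    context_window: int = 1024
    num_output: int = 256
    model_name: str = "gpt2"

    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        out = hf_pipe(prompt, max_new_tokens=self.num_output)
        # The pipeline returns the prompt plus the completion; strip the prompt.
        text = out[0]["generated_text"][len(prompt):]
        return CompletionResponse(text=text)

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # The pipeline call is not streaming, so yield the full completion once.
        yield self.complete(prompt, **kwargs)
```

An instance could then be passed as llm=PipelineLLM() to ServiceContext.from_defaults.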