I implemented this example, except I am using the index as a chat engine:
https://docs.llamaindex.ai/en/stable/module_guides/models/llms/usage_custom.html#example-using-a-custom-llm-model-advanced
chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=system_prompt,
    service_context=service_context,
)
response = chat_engine.chat("Tell me a joke.")
print(f"Agent: {response}")
But when I put in an input, it returns no output and gives this error:
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.
Does anyone know why this might be happening?
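For what it's worth, the right-padding warning comes up because decoder-only models generate from the last position of each sequence, so padding has to go on the left or the model ends up continuing from pad tokens. The usual fix is to load the tokenizer with `padding_side="left"` (or set `tokenizer.padding_side = "left"` after loading). Here is a minimal pure-Python sketch of why the side matters; the token ids are illustrative, not from any real tokenizer:

```python
# Illustration of left vs right padding for decoder-only generation.
# With left padding, the LAST position of every row is a real token,
# so the model continues from actual content rather than from pads.
PAD = 0

def pad_batch(seqs, side="left"):
    width = max(len(s) for s in seqs)
    out = []
    for s in seqs:
        pads = [PAD] * (width - len(s))
        out.append(pads + s if side == "left" else s + pads)
    return out

batch = [[5, 6, 7], [8, 9]]
print(pad_batch(batch, side="left"))   # [[5, 6, 7], [0, 8, 9]]
print(pad_batch(batch, side="right"))  # [[5, 6, 7], [8, 9, 0]] <- pad at the end triggers the warning
```

The `pad_token_id` message is just the tokenizer falling back to `eos_token_id` because no pad token was set; it's a warning, not the cause of the empty output.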
Edit: now it's giving this error:
ValueError: shapes (384,) and (1536,) not aligned: 384 (dim 0) != 1536 (dim 0)
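This shape mismatch typically means the query is being embedded with a different model than the one used to build the index: 384 dimensions is common for small sentence-transformers models (e.g. all-MiniLM-L6-v2), while 1536 is the dimension of OpenAI's text-embedding-ada-002 (both model names are my guess, not confirmed by the post). Making sure the `embed_model` in your `service_context` is the same at index-build time and query time, or rebuilding the index, should resolve it. The error itself is just NumPy refusing to take a dot product of mismatched vectors:

```python
import numpy as np

# Sketch reproducing the error: a 384-dim query vector scored against
# a 1536-dim stored vector (dimensions taken from the traceback).
query_vec = np.random.rand(384)    # e.g. a local sentence-transformers embedding
stored_vec = np.random.rand(1536)  # e.g. an OpenAI ada-002 embedding in the index

try:
    np.dot(query_vec, stored_vec)
except ValueError as e:
    print(e)  # shapes (384,) and (1536,) not aligned: 384 (dim 0) != 1536 (dim 0)
```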