The problem I have is that the prompt sent to the LLM talks about sources, and the LLM also answers with "sources", even when there is only one text (I think it means sources as in chunks)
ah, the citation query engine also mentions sources in its prompt inputs. It treats each retrieved text chunk as a source and prompts the LLM to write in-text citations
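If you want to reword that, something like this might work. This is a rough sketch assuming the pre-0.10 llama-index import paths and that `CitationQueryEngine.from_args` accepts a `citation_qa_template` override; double-check against your installed version:

```python
# Sketch: override the default citation prompt so it doesn't insist on "sources".
# Assumes legacy (pre-0.10) llama-index imports; adjust to your version.
from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.prompts import PromptTemplate
from llama_index.query_engine import CitationQueryEngine

# Hypothetical prompt wording that avoids "sources" when you only expect one chunk.
SINGLE_SOURCE_QA_TEMPLATE = PromptTemplate(
    "Use the numbered excerpt(s) below to answer the question, and cite the "
    "excerpt number in brackets, e.g. [1].\n"
    "------\n"
    "{context_str}\n"
    "------\n"
    "Question: {query_str}\n"
    "Answer: "
)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = CitationQueryEngine.from_args(
    index,
    citation_qa_template=SINGLE_SOURCE_QA_TEMPLATE,  # reword the default "sources" phrasing
    citation_chunk_size=512,  # each chunk becomes a numbered source in the prompt
)
response = query_engine.query("What does the document say about X?")
print(response)
```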
It's truncated because by default llama-index only leaves room for 256 output tokens. Additionally, lots of models also default to stopping at 256 output tokens.
So you can change the model config (e.g. the LLM's max_tokens), as well as set num_output=300 or similar in the service context
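Something along these lines. This is a rough sketch assuming the legacy ServiceContext API (pre llama-index 0.10) and the OpenAI LLM wrapper; parameter names may differ in your version:

```python
# Sketch: raise both the model's output cap and the room llama-index reserves for the answer.
from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", max_tokens=300)  # raise the model's own output limit

service_context = ServiceContext.from_defaults(
    llm=llm,
    num_output=300,  # tell llama-index to reserve ~300 tokens of prompt room for the answer
)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
```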