I am tracing events using llama-index callbacks, and I see a `chunking` callback being fired after `synthesize`. I don't understand why chunking is happening at this stage, given that the documents have already been chunked and stored in the vector store. I thought this might be caused by a node postprocessor, but it turns out that's not the source of the chunking here. Any idea what might be happening?
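For reference, this is roughly the tracing setup I'm using (a minimal sketch; it assumes v0.10+ import paths and a local `./data` directory):

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler, CBEventType

# Attach a debug handler so every callback event is recorded.
debug_handler = LlamaDebugHandler(print_trace_on_end=True)
Settings.callback_manager = CallbackManager([debug_handler])

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
response = query_engine.query("What does the document say about X?")

# This is where I see chunking events even after retrieval, during synthesis.
chunk_events = debug_handler.get_event_pairs(CBEventType.CHUNKING)
print(f"{len(chunk_events)} chunking event pair(s) recorded")
```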
During synthesis, the default response mode is `compact`. In that mode there is a step where all the retrieved node text is concatenated and then split again, so that each input sent to the LLM is as large as the context window allows. That re-splitting is what triggers the `chunking` callback you're seeing after retrieval.
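A quick way to confirm this: switch the response mode on the query engine and compare the traces. This is a sketch, assuming the same `./data` directory as above:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./data").load_data()
)

# "compact" (the default) packs all retrieved node text into as few
# prompts as possible, re-splitting it to fit the context window.
# That re-split is what fires the post-retrieval chunking event.
compact_engine = index.as_query_engine(response_mode="compact")

# "refine" sends each retrieved node to the LLM one at a time, with no
# repacking step, so the extra chunking event should disappear.
refine_engine = index.as_query_engine(response_mode="refine")
```

If the `chunking` event vanishes under `refine`, you've confirmed it comes from the compact synthesizer's repacking step rather than from ingestion or a postprocessor.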