When I print the trace while using the query engine, I always see:
```
SYNTHESIZE
CHUNKING
CHUNKING
LLM
```
The CHUNKING events carry this info:
```json
{
  "__computed__": {
    "latency_ms": 1.436,
    "error_count": 0,
    "cumulative_token_count": {
      "total": 0,
      "prompt": 0,
      "completion": 0
    },
    "cumulative_error_count": 0
  }
}
```
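For what it's worth, the `__computed__` payload above can be inspected directly. A minimal sketch (plain Python, just parsing the JSON shown above) confirms that every token bucket for this chunking event is zero:

```python
import json

# The __computed__ payload printed for the CHUNKING event, copied from the trace above.
trace_event = json.loads("""
{
  "__computed__": {
    "latency_ms": 1.436,
    "error_count": 0,
    "cumulative_token_count": {
      "total": 0,
      "prompt": 0,
      "completion": 0
    },
    "cumulative_error_count": 0
  }
}
""")

tokens = trace_event["__computed__"]["cumulative_token_count"]
print(tokens["prompt"], tokens["completion"], tokens["total"])
# prints: 0 0 0 -- this CHUNKING event recorded no LLM token usage.
```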
What is this chunking step actually doing? Does it consume prompt tokens?