When printing the trace while using the query engine, I always see:
SYNTHESIZE
CHUNKING
CHUNKING
LLM

The CHUNKING event carries this payload:
{
  "__computed__": {
    "latency_ms": 1.436,
    "error_count": 0,
    "cumulative_token_count": {
      "total": 0,
      "prompt": 0,
      "completion": 0
    },
    "cumulative_error_count": 0
  }
}


What is this chunking step actually doing? Does it use prompt tokens?
2 comments
It's just compacting all the nodes to minimize LLM calls; it's not using tokens. That matches the payload above: `cumulative_token_count` is all zeros, and the latency is about a millisecond, so no LLM call happens inside the CHUNKING step itself.
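To illustrate what "compacting the nodes" means, here is a minimal sketch of the idea: retrieved node texts are greedily repacked into as few prompt-sized chunks as possible, so the synthesizer needs fewer LLM calls. This is not LlamaIndex's actual implementation; the function name, the character-based size limit, and the separator are all illustrative assumptions (the real compaction works against the model's token window).

```python
def compact_nodes(node_texts, max_chunk_chars=1000, separator="\n\n"):
    """Greedily pack node texts into the fewest chunks that fit the limit.

    Illustrative sketch only: the real compaction step sizes chunks by
    tokens against the LLM's context window, not by characters.
    """
    chunks, current = [], ""
    for text in node_texts:
        candidate = text if not current else current + separator + text
        if len(candidate) <= max_chunk_chars:
            current = candidate          # still fits: keep packing
        else:
            if current:
                chunks.append(current)   # flush the filled chunk
            current = text               # start a new chunk
    if current:
        chunks.append(current)
    return chunks


nodes = ["node one text", "node two text", "node three text"]
print(len(compact_nodes(nodes, max_chunk_chars=30)))  # prints 2
```

Three retrieved nodes become two chunks here, so synthesis would make two LLM calls instead of three. The repacking itself is pure string manipulation, which is why the trace shows zero prompt and completion tokens for the CHUNKING events.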