When printing the trace when using query engine

At a glance

The community member asks what the "chunking" step shown in the query engine trace actually does, and whether it consumes prompt tokens. The first comment explains that chunking just compacts all the retrieved nodes to minimize LLM calls, and that it does not use tokens. The second comment simply says "Great", indicating the explanation was helpful.

When printing the trace when using query engine I always see,
SYNTHESIZE
CHUNKING
CHUNKING
LLM

Chunking has this info:
{
  "__computed__": {
    "latency_ms": 1.436,
    "error_count": 0,
    "cumulative_token_count": {
      "total": 0,
      "prompt": 0,
      "completion": 0
    },
    "cumulative_error_count": 0
  }
}


What is this chunking actually doing? Does it use prompt tokens?
2 comments
It's just compacting all the nodes to minimize LLM calls; it's not using tokens.
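A minimal sketch of what this compacting step amounts to (this is an illustration, not LlamaIndex's actual implementation): the retrieved node texts are greedily repacked into as few context-window-sized chunks as possible, so the synthesizer needs fewer LLM calls. It is plain string manipulation with no model involved, which is why the trace shows a token count of 0. The `compact_nodes` helper and the character budget are hypothetical stand-ins for the library's token-aware packing.

```python
# Hypothetical sketch of the "compacting" behind the CHUNKING trace events:
# repack node texts into as few budget-sized chunks as possible. No LLM is
# called here, so prompt/completion token counts stay at 0.


def compact_nodes(node_texts, max_chars=1000):
    """Greedily pack node texts into chunks no longer than max_chars."""
    chunks, current = [], ""
    for text in node_texts:
        candidate = (current + "\n\n" + text) if current else text
        if len(candidate) <= max_chars:
            current = candidate  # still fits: keep packing into this chunk
        else:
            if current:
                chunks.append(current)  # chunk is full; start a new one
            current = text
    if current:
        chunks.append(current)
    return chunks


nodes = ["node one " * 30, "node two " * 30, "node three " * 30]
packed = compact_nodes(nodes, max_chars=700)
# Three ~300-char nodes pack into two chunks under the 700-char budget,
# i.e. one LLM call per chunk instead of one per node.
print(len(packed))
```

The real synthesizer measures the budget in tokens against the model's context window rather than characters, but the effect is the same: fewer, fuller prompts sent to the LLM.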