Hi all, I would like to calculate the number of tokens so I can estimate pricing when using GPT-3.5-turbo. I found tiktoken, which counts the tokens in a piece of text, but I am wondering: when we use LlamaIndex, will the tokens in our knowledge base also cost us?
So, if you use a vector index, it will use OpenAI tokens to embed all the data. This uses text-embedding-ada-002 (very cheap).
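As a rough sketch of the arithmetic (the $0.0001 per 1K tokens rate for text-embedding-ada-002 below is an assumption based on OpenAI's published pricing at the time; check the current price list before relying on it):

```python
def embedding_cost(num_tokens: int, price_per_1k: float = 0.0001) -> float:
    # price_per_1k is an assumed text-embedding-ada-002 rate (USD per 1K tokens);
    # substitute the current value from OpenAI's pricing page.
    return num_tokens / 1000 * price_per_1k

# e.g. embedding a 1M-token knowledge base at the assumed rate:
print(f"${embedding_cost(1_000_000):.2f}")  # → $0.10
```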
Then at query time, it embeds the query text (token usage again) and retrieves the top-k pieces of text. Then it sends all of this to the LLM, which also costs tokens (with default settings, probably ~3k LLM tokens or so).
So, first of all, it embeds all the tokens to create the index, only once (first cost). Then when I pass a query, it embeds the query (second cost) and retrieves the top-k nodes. After that it passes the nodes and the query to the LLM, which is the third cost. Am I getting this right?
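That mental model can be sketched as a simple cost function: one one-time embedding pass over the knowledge base, then a per-query embedding plus per-query LLM tokens. All prices and the average query length below are assumptions for illustration; the ~3k LLM tokens per query comes from the default-settings estimate mentioned earlier.

```python
def total_cost(
    kb_tokens: int,                      # tokens in the knowledge base (embedded once)
    queries: int,                        # number of queries you expect to run
    llm_tokens_per_query: int = 3000,    # ~3k with default settings
    query_tokens: int = 20,              # assumed average query length
    embed_price_per_1k: float = 0.0001,  # assumed ada-002 rate (USD)
    llm_price_per_1k: float = 0.002,     # assumed gpt-3.5-turbo rate (USD)
) -> float:
    index_cost = kb_tokens / 1000 * embed_price_per_1k                      # first cost, paid once
    query_embed_cost = queries * query_tokens / 1000 * embed_price_per_1k   # second cost, per query
    llm_cost = queries * llm_tokens_per_query / 1000 * llm_price_per_1k     # third cost, per query
    return index_cost + query_embed_cost + llm_cost

# e.g. a 1M-token knowledge base and 100 queries at the assumed rates:
print(f"${total_cost(1_000_000, 100):.4f}")
```

Note how the LLM term dominates: the one-time embedding cost is tiny compared to the per-query LLM tokens.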