Updated 2 years ago

Token usage


Hi guys, I'm new here. First of all congrats for such a great project.
I'm currently trying to create an index using GPTSimpleVectorIndex and I have two issues at the moment:
  1. When passing the model_name parameter with text-davinci-003, my OpenAI usage page always shows that text-embedding-ada-002-v2 was used. I'm currently on the free trial, so I'm not sure if that's related, or if davinci is used for the queries only.
  2. I wanted to analyse token usage at no cost before creating the whole index, so I was using the MockLLMPredictor with GPTSimpleVectorIndex, but it's still using tokens from OpenAI even though llm_predictor.last_token_usage reports 0. I'm not sure whether I'll be charged for creating an index, or which model will be used.
Thanks πŸ™‚
Hey @emmett !

There are two main models in llama index: the llm_predictor (used to generate natural language responses to queries) and the embed_model (used to create embeddings for documents and queries, in order to retrieve relevant text for a query)

You'll want to use the mock embedding model too. Llama index reports token usage for embeddings and llm calls separately (embeddings are very cheap though)

https://gpt-index.readthedocs.io/en/latest/how_to/cost_analysis.html#using-mockembedding
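To make the idea concrete, here is a minimal conceptual sketch (plain Python, not the real llama_index API; class and attribute names are modeled on MockLLMPredictor / MockEmbedding, and token counting is simplified to whitespace splitting instead of a real tokenizer). The point is that a mock model records how many tokens the real model *would* have consumed, while never calling OpenAI:

```python
# Conceptual sketch only -- NOT the actual llama_index implementation.
# Real mocks use a proper tokenizer; here we split on whitespace.

class MockLLMPredictor:
    """Stands in for the LLM: counts prompt tokens, makes no API call."""
    def __init__(self, max_tokens=256):
        self.max_tokens = max_tokens      # assumed worst-case completion size
        self.last_token_usage = 0

    def predict(self, prompt: str) -> str:
        prompt_tokens = len(prompt.split())
        self.last_token_usage = prompt_tokens + self.max_tokens
        return ""                         # no completion is generated

class MockEmbedding:
    """Stands in for text-embedding-ada-002: counts tokens, returns a dummy vector."""
    def __init__(self, embed_dim=1536):
        self.embed_dim = embed_dim
        self.last_token_usage = 0

    def get_text_embedding(self, text: str):
        self.last_token_usage = len(text.split())
        return [0.0] * self.embed_dim     # dummy embedding, no API call

embed_model = MockEmbedding()
embed_model.get_text_embedding("some document chunk to index")
print(embed_model.last_token_usage)       # prints 5 (tokens the real call would bill)
```

Because indexing with GPTSimpleVectorIndex calls the embedding model (not the LLM), you need to pass a mock for *both* models to get a zero-cost dry run; mocking only the llm_predictor leaves the real embedding calls in place, which matches the behaviour described above.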
awesome! thanks @Logan M