Run

Oddly it ran once ok, I changed which llm I was using, ran it again and started getting this.
Are you sure you have the latest version? Are you running in a notebook? If so, try restarting it.
Not running in a notebook. I am running in the command line, and I believe I have the latest version. I did a direct pip install of llama index and posted the version in my request. Is there a best-practices way to uninstall and reinstall llama index so it doesn't, for example, just use a cached version if I want to try to clear this out?
If you want a clean slate, I'd go with a fresh venv. Normally pip install -U ... is enough though
I did a full reinstall, down to the bones: uninstalled all of miniconda, reinstalled it (miniconda with Python 3.12), created a new venv, and installed these packages from a requirements.txt, and I'm still getting the same error in token_counting.py that Usage has no 'get' attribute:

Plain Text
typing
openai 
Constants
llama_index.llms.litellm
llama_index.llms.openai
llama_index.multi_modal_llms.openai
llama_index.llms.anthropic
pandas
tiktoken
tqdm
asyncio
requests
pyshorteners
nest_asyncio
pathlib
llama_index.postprocessor.longllmlingua
memory_profiler
llama_index.multi_modal_llms.anthropic
datetime
pathlib
llama_index.embeddings.openai
llama_index.postprocessor.cohere_rerank
llama_index
google-auth
ipython
matplotlib
The error is coming from this file. Looks like perhaps the usage object is not a dictionary for some reason?
Maybe an issue with your version of pydantic?
(The short term solution here is just removing your token counter or writing your own)
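(A rough sketch of the "write your own" route, assuming tiktoken is available; cl100k_base approximates OpenAI models, so treat the counts as estimates for anything else:)

Plain Text
import tiktoken

# Stopgap: count tokens on the raw prompt/response strings yourself
# instead of relying on TokenCountingHandler.
enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(enc.encode(text))

print(count_tokens("How many tokens is this sentence?"))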
If you inspect the actual source code for the installed package, does it have that if statement? Do the traceback line numbers match the file I linked?
@Logan M Here's the file from the stack trace folder. It seems to match what's in GitHub. Is it possible that line 59 (raw_dict = response.raw.model_dump()) could return something other than a dictionary? That's what it does when response.raw is not a dict, before it's assigned to raw_dict.

Plain Text
if isinstance(response.raw, dict):
    raw_dict = response.raw
else:
    raw_dict = response.raw.model_dump()
Here's the top part of my stack trace, which comes from doing a query_engine.query():

Plain Text
  File "D:\Users\brian\Onedrive\Llamaindex\ArcanumBotDev.py", line 572, in <module>
    load_game_system(game_system)
  File "D:\Users\brian\Onedrive\Llamaindex\ArcanumBotDev.py", line 555, in load_game_system
    Systems_Prompt = "<GAME_SYSTEMS>" + str(query_engine.query(f"**Task:**  Review the indexed rulebook for the {game_system} RPG game system and summarize the most important gameplay mechanics and skill checks (do not repeat text verbatim). Pay close attention to rules sections in core books, rule books or players handbooks in particular. Retrieve the most relevant chunks of text from the rulebook related to:* **Core Resolution Mechanic:**  How are actions generally resolved (dice rolls, target numbers, etc.)?* **Skill Checks:**  When is a check required in this game?* **Common Skill Checks:**  Which skills are frequently used? Provide a brief description of each.* **Special Mechanics:** Are there unique systems like combat, social interaction, magic, etc.? Summarize how they function.**Generation:** Provide a concise summary in the following format:* **Mechanic:*** **Description:*** **When Used:**")) + "</GAME_SYSTEMS>"
Here's my actual code:

Plain Text
query_engine = index.as_query_engine(
    chat_mode="context",
    retriever_mode="embedding",
    similarity_top_k=40,
    node_postprocessors=[reranker],
    verbose=True,
    system_prompt="You are in AI that reviews tabletop RPG sourcebook and retrieves a summary of the core rules and systems in the game."
)

Systems_Prompt = "<GAME_SYSTEMS>" + str(query_engine.query(f"**Task:**  Review the indexed rulebook for the {game_system} RPG game system and summarize the most important gameplay mechanics and skill checks (do not repeat text verbatim). Pay close attention to rules sections in core books, rule books or players handbooks in particular. Retrieve the most relevant chunks of text from the rulebook related to:* **Core Resolution Mechanic:**  How are actions generally resolved (dice rolls, target numbers, etc.)?* **Skill Checks:**  When is a check required in this game?* **Common Skill Checks:**  Which skills are frequently used? Provide a brief description of each.* **Special Mechanics:** Are there unique systems like combat, social interaction, magic, etc.? Summarize how they function.**Generation:** Provide a concise summary in the following format:* **Mechanic:*** **Description:*** **When Used:**")) + "</GAME_SYSTEMS>"
model_dump always returns a dict πŸ€”
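(For reference, a minimal pydantic v2 sketch of that claim; this Usage class is just a stand-in, not the actual library type:)

Plain Text
from pydantic import BaseModel

class Usage(BaseModel):
    prompt_tokens: int = 0
    completion_tokens: int = 0

usage = Usage(prompt_tokens=3, completion_tokens=9)
# model_dump() serializes the pydantic model to a plain dict
print(type(usage.model_dump()))  # <class 'dict'>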
Does this code reproduce the issue for you?

Plain Text
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI

handler = TokenCountingHandler()
llm = OpenAI(callback_manager=CallbackManager([handler]))

resp = llm.complete("Test")
print(handler.completion_llm_token_count)
I get this, but this is just running the code straight in IPython, not sure if I need to set up OpenAI credentials or something:
Right, I assumed you had your API key set up lol
Either in your env vars, or llm = OpenAI(api_key="...", callback_manager=CallbackManager([handler]))
(I'm assuming you are already using OpenAI?)
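(For example, with a placeholder key and the same handler as the snippet above:)

Plain Text
import os

from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI

# Either export OPENAI_API_KEY in your shell, or set it before
# constructing the LLM (placeholder value shown here).
os.environ["OPENAI_API_KEY"] = "sk-..."

handler = TokenCountingHandler()
llm = OpenAI(callback_manager=CallbackManager([handler]))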
Correct, thanks, made the change. I don't usually initiate the LLM that way, so that was helpful (trying to keep my test code as lean as possible to avoid introducing other issues). This seems to have worked (it returned "9"), so I'm assuming it must be something specific in the way my code is making the call that is causing the issue? I posted my specific call, though, and it seems to be a pretty standard structure (it's of course using the query engine on top of just a simple LLM call, so not sure if that's part of the issue, i.e. the message I'm sending or the context is causing something to break internally / return null from the response). What do you think next steps would be? Try to boil it down to the smallest amount of code to build an index and make this call, and see if it still errors?
Yea basically just need to boil this down to the smallest case that reproduces the error
I'm glad the basic sanity test I wrote worked haha
but then curious where the actual issue is now πŸ€”
I guess it sounds like maybe something in the context chat engine is causing it?
Expanding the test slightly, this still works

Plain Text
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI

handler = TokenCountingHandler()
callback_manager = CallbackManager([handler])
llm = OpenAI(callback_manager=callback_manager)

resp = llm.complete("Test")
print(handler.completion_llm_token_count)

from llama_index.core import Document, VectorStoreIndex

index = VectorStoreIndex.from_documents([Document.example()], callback_manager=callback_manager)
chat_engine = index.as_chat_engine(chat_mode="context", llm=llm)
response = chat_engine.chat("Test")
print(handler.completion_llm_token_count)
Maybe you can take it from there to figure out what the difference is that is causing it to break on your end
Sounds good, I'll slowly add in more until I can reproduce the error and report back if I get it reproduced.
Ok, so I've tried to trim down my code to the base essentials and have replicated the error. It turns out I'm only getting it with Claude Haiku, and not with OpenAI gpt-4o-mini. I'm wondering if the token counting framework has changed such that it can't handle a Claude Haiku response?
Aha, so the issue is on Anthropic then, let me see if I can replicate
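(A minimal repro sketch along the lines of the earlier OpenAI test, swapping in the Anthropic LLM; the Haiku model name is assumed here and an ANTHROPIC_API_KEY would need to be set:)

Plain Text
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.anthropic import Anthropic

handler = TokenCountingHandler()
# Same sanity test as before, but against Claude Haiku
llm = Anthropic(model="claude-3-haiku-20240307", callback_manager=CallbackManager([handler]))

resp = llm.complete("Test")
print(handler.completion_llm_token_count)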
Ok, found the bug. But even if the types were correct, it wasn't going to count tokens properly, because the token counts are under a field that's not checked
The token counter was really only written for OpenAI tbh lol
Yeah, I think I noticed before it was often coming through with 0s etc. Do you recommend I just don't use it (can I not use it) with Haiku in the meantime?
Yea I think I would just not use the token counter with haiku until the next release in a day or two πŸ™‚
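(One rough way to do that in the meantime, assuming the counter only needs to be attached on OpenAI runs; the model names and flag are illustrative:)

Plain Text
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.anthropic import Anthropic
from llama_index.llms.openai import OpenAI

use_haiku = True  # flip depending on which model this run uses

if use_haiku:
    # Skip the token counter for Haiku until the fixed release lands
    callback_manager = CallbackManager([])
    llm = Anthropic(model="claude-3-haiku-20240307", callback_manager=callback_manager)
else:
    handler = TokenCountingHandler()
    callback_manager = CallbackManager([handler])
    llm = OpenAI(callback_manager=callback_manager)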