@everyone I wish to demonstrate to the community that Zephyr-7b-beta has a running and inference cost that is 13 times lower than GPT-3.5 and 6 times lower than Llama 70b. Can I make this cost comparison in a Colab notebook?
How would I do that, so that it shows a cost per query for each model?
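A minimal sketch of what such a Colab cell could look like. All of the per-1K-token figures below are hypothetical placeholders, not measured results — you would substitute the published API price for GPT-3.5 and your own hardware-derived figures for the open models before drawing any conclusions:

```python
# Sketch: cost-per-query comparison across models.
# Every price below is a PLACEHOLDER (hypothetical), not a real benchmark.
cost_per_1k_tokens = {          # USD per 1K tokens -- replace with your own numbers
    "gpt-3.5-turbo": 0.0015,
    "llama-70b": 0.0007,
    "zephyr-7b-beta": 0.0001,
}

def cost_per_query(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one query, given the token counts for that query."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens / 1000 * cost_per_1k_tokens[model]

# Example: a query with a 500-token prompt and a 250-token completion.
for model in cost_per_1k_tokens:
    print(f"{model:>16}: ${cost_per_query(model, 500, 250):.6f} per query")
```

The token counts fed into `cost_per_query` would come from whatever counting mechanism you wire up for each model.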
I think you can use the TokenCountingHandler for all these models.

https://docs.llamaindex.ai/en/stable/examples/callbacks/TokenCountingHandler.html#token-counting-handler

But I don't think you will be able to run a 70B model on Colab.
@WhiteFang_Jr How do I count the tokens for Zephyr-7B?
I think this should work for Zephyr as well, since the current example just shows the token count for an Anthropic model.
@WhiteFang_Jr This needs an API key, right? There's no API key for Zephyr-7B.
Right -- you just need to change the LLM to Zephyr; don't worry about an API key.
I'm still learning and not familiar with it.
I don't really know how to estimate the cost for Zephyr. Since the model is free, the only cost is the hardware to run it.
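Since the model itself is free, one hedged way to get a comparable per-token figure is to divide a GPU's hourly rental price by its measured generation throughput. The rate and throughput below are assumed example values, not measurements:

```python
# Sketch: derive a per-token cost for a self-hosted model from hardware time.
# Both inputs are ASSUMED example values -- measure/replace them yourself.
gpu_cost_per_hour = 0.60        # USD/hour for a rented GPU (hypothetical)
tokens_per_second = 30.0        # measured generation throughput (hypothetical)

tokens_per_hour = tokens_per_second * 3600
cost_per_token = gpu_cost_per_hour / tokens_per_hour   # USD per generated token

def query_cost(total_tokens: int) -> float:
    """Hardware cost attributed to one query of `total_tokens` tokens."""
    return total_tokens * cost_per_token

print(f"cost per 1K tokens: ${cost_per_token * 1000:.6f}")
print(f"cost for a 750-token query: ${query_cost(750):.6f}")
```

Multiplying `cost_per_token` by the token counts reported for each query gives a number you can set next to per-token API pricing.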
Thanks for the help πŸ™‚