thoraxe · last year
stupid question - looking at https://docs.llamaindex.ai/en/stable/module_guides/observability/callbacks/token_counting_migration.html#token-counting-migration-guide and thinking about token counting.
the callback manager explicitly uses tiktoken, which counts tokens for OpenAI. but what if i'm not using OpenAI? is it "close enough"?
also, how does the embedding model (e.g. BAAI/bge-base-en-v1.5) relate? or does it maybe not relate?
Logan M · last year
it's typically close enough. But you can pass in any function for counting tokens, e.g. AutoTokenizer.from_pretrained("...").encode works too
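
A minimal sketch of what that looks like. Import paths follow the v0.9-era migration guide linked above (newer releases moved these under llama_index.core), and the Mistral model name is just a placeholder for whatever non-OpenAI model you actually use:

```python
# TokenCountingHandler accepts any callable that maps a string to a list of
# tokens, so a HuggingFace tokenizer's .encode can stand in for tiktoken.
from transformers import AutoTokenizer
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, TokenCountingHandler

# Hypothetical non-OpenAI model; substitute your own.
hf_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

token_counter = TokenCountingHandler(tokenizer=hf_tokenizer.encode)
service_context = ServiceContext.from_defaults(
    callback_manager=CallbackManager([token_counter])
)
```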
Logan M · last year
I guess right now it uses the same tokenizer for both embeddings and LLMs
Logan M · last year
(embedding tokens usually matter much less, so it hasn't been a priority)
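
Continuing the sketch above: the handler does track LLM and embedding tokens separately (even though the same tokenizer callable is applied to both), so you can check how small the embedding totals are relative to LLM usage after running queries:

```python
# Inspect the counters after running ingestion/queries with the
# service_context from the earlier sketch.
print("embedding tokens:", token_counter.total_embedding_token_count)
print("LLM prompt tokens:", token_counter.prompt_llm_token_count)
print("LLM completion tokens:", token_counter.completion_llm_token_count)
print("total LLM tokens:", token_counter.total_llm_token_count)
```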
thoraxe · last year
tyty