stupid question - looking at https://docs.llamaindex.ai/en/stable/module_guides/observability/callbacks/token_counting_migration.html
thoraxe · 12 months ago

stupid question - looking at https://docs.llamaindex.ai/en/stable/module_guides/observability/callbacks/token_counting_migration.html#token-counting-migration-guide and thinking about token counting.

The callback manager is explicitly using tiktoken, which counts tokens for OpenAI models. But what if I'm not using OpenAI? Is it "close enough"?

Also, how does the embedding model (e.g. BAAI/bge-base-en-v1.5) relate? Or does it maybe not relate?
Logan M · 12 months ago

It's typically close enough. But you can pass in any function for counting tokens, i.e. AutoTokenizer.from_pretrained("...").encode works too.
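For context, a minimal sketch of what passing a custom tokenizer looks like with LlamaIndex's TokenCountingHandler. The Mistral model name is purely illustrative (use the tokenizer that matches your actual LLM), and import paths assume a recent llama-index (0.10+, llama_index.core); older versions differ.

```python
# Sketch: token counting with a non-OpenAI tokenizer in LlamaIndex.
# Assumes llama-index >= 0.10; model name below is an illustrative stand-in.
from transformers import AutoTokenizer

from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# Load the tokenizer that matches the LLM you actually use.
hf_tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# TokenCountingHandler accepts any callable that maps a string to a list
# of tokens, so the tokenizer's .encode method can be passed in directly.
token_counter = TokenCountingHandler(tokenizer=hf_tokenizer.encode)
Settings.callback_manager = CallbackManager([token_counter])

# ... build an index and run queries as usual ...
print("LLM tokens:", token_counter.total_llm_token_count)
```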
Logan M · 12 months ago

I guess right now it uses the same tokenizer for both embeddings and LLMs.
Logan M · 12 months ago

(Embedding tokens usually matter much less, so it hasn't been a priority.)
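Even so, the handler does track embedding and LLM tokens separately, so both counts can be inspected after a run. Continuing the sketch above (same token_counter instance):

```python
# Both counters live on the same TokenCountingHandler instance.
print("Embedding tokens:", token_counter.total_embedding_token_count)
print("LLM prompt tokens:", token_counter.prompt_llm_token_count)
print("LLM completion tokens:", token_counter.completion_llm_token_count)
```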
thoraxe · 12 months ago

tyty