What is "Total embedding token usage"?

Hi! What exactly is "Total embedding token usage"? For example, I used 18,550 tokens but don't see any difference on my usage page (the model is davinci-2). Do they count? How much does that cost? Thanks!
The usage page is a little laggy I think

Embeddings use text-embedding-ada-002 by default, and it costs $0.0004/1k tokens (so very cheap)
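
Quick back-of-the-envelope on the numbers from the question, assuming that ada pricing:
Plain Text
# ~18,550 tokens at $0.0004 per 1k tokens (text-embedding-ada-002 pricing)
tokens_used = 18_550
cost = tokens_used / 1000 * 0.0004
print(f"${cost:.4f}")  # -> $0.0074, well under a cent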

They turn text into a numerical representation, which lets LlamaIndex retrieve text that is similar to the query text
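
As a minimal sketch of what that looks like (assuming the pre-1.0 openai Python client and its openai.Embedding.create endpoint; the example strings are made up):
Plain Text
import numpy as np
import openai

def embed(text: str) -> np.ndarray:
    # The response's usage.total_tokens is what gets counted
    # as "total embedding token usage"
    resp = openai.Embedding.create(input=text, model="text-embedding-ada-002")
    return np.array(resp["data"][0]["embedding"])

a = embed("How do I reset my password?")
b = embed("password reset steps")
similarity = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(similarity)  # cosine similarity: closer to 1.0 = more similar
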
Thanks! Yeah, I understand what an embedding is, but I wasn't clear on how the tokens are actually used. You said "ada" is used by default, even if I specify another model like in this code?:
Plain Text
from langchain.llms import OpenAI
from llama_index import GPTQdrantIndex, LLMPredictor, ServiceContext

# openai_api_key, document, get_qrant_client, and project_id come from the surrounding app
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003",
                                        openai_api_key=openai_api_key))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
index = GPTQdrantIndex.from_documents([document],
                                      client=get_qrant_client(),
                                      collection_name=project_id.hex,
                                      service_context=service_context)
Yea, you've only specified the llm_predictor, which is different from the embed model

It will still use text-embedding-ada-002 for embeddings

The llm predictor is only used for generating text
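
If you want to control it explicitly, the embed model is a separate argument on the service context. A rough sketch, assuming a 0.6.x-era LlamaIndex where OpenAIEmbedding lives at this import path (check your version):
Plain Text
from langchain.llms import OpenAI
from llama_index import LLMPredictor, ServiceContext
from llama_index.embeddings.openai import OpenAIEmbedding  # path may vary by version

service_context = ServiceContext.from_defaults(
    llm_predictor=LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003")),
    embed_model=OpenAIEmbedding(),  # this is the part that defaults to ada if omitted
)
Leaving embed_model out gives you the same default behavior described above
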
Gotcha, thanks!
I wonder, though, what the difference is between all these models; maybe there's some comparison you could point me at? I'd appreciate it 🙂
Tbh people haven't been sharing good benchmarks or comparisons lately, haha, so it's hard to say

Generally, text-embedding-ada-002 is definitely better than anything from Hugging Face for embeddings

For LLMs, gpt-4 is the best, then text-davinci-003, then gpt-3.5

Open-source LLMs are catching up to gpt-3.5, but they aren't quite there yet.