TrNsform

Can someone explain which part of this stack is used to learn during training and to calculate embeddings during retrieval, with embedding models like jinaai?
(Attachment: image.png)
This entire stack is used πŸ‘€

The output of the last layer is called the hidden state/layer, and this is what's used for embeddings.

There's an embedding vector for each token, and these are typically pooled (e.g. by averaging, or with other techniques)
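
Roughly, with the Hugging Face transformers API it looks like this (a minimal sketch; the checkpoint is just an illustrative encoder-style embedding model, jinaai's models follow the same pattern):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative checkpoint; any BERT-style embedding model works the same way.
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

inputs = tokenizer(["an example sentence"], padding=True, return_tensors="pt")
with torch.no_grad():
    # last_hidden_state: (batch, seq_len, hidden_dim) -- one vector per token
    hidden = model(**inputs).last_hidden_state

# Mean pooling: average the token vectors, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1)            # (batch, seq_len, 1)
embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embedding.shape)  # torch.Size([1, 384]) for this checkpoint
```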
Is that the default with embedding models?
It is, at least to my knowledge πŸ‘
Does that mean I could simply take the encoder part of an LLM to calculate embeddings?
Especially if it's fine-tuned on a specific task close to the retrieval task?
Most LLMs are decoder-only, actually.

But there are ways to get embeddings from decoder models; I know llama.cpp does it (I can't remember exactly how it works, though)
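
One common trick (a sketch, not necessarily llama.cpp's exact method, with gpt2 standing in for any decoder-only model) is last-token pooling: under causal attention only the final token has attended to the whole sequence, so its hidden state is taken as the embedding:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# gpt2 is just a stand-in for any decoder-only model here.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

inputs = tokenizer("an example sentence", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_dim)

# With causal attention, only the LAST token has seen the whole sequence,
# so its hidden state is a common choice for a decoder-only embedding.
embedding = hidden[:, -1, :]
print(embedding.shape)  # torch.Size([1, 768]) for gpt2
```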
I know that's the lingo some people use. However, I was merely talking about the part without the lm_head (the linear layer and softmax) used to calculate the embeddings, which is always just an encoder 🤓 Because if you take away the head from a decoder, it's basically an encoder.
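
That distinction is visible directly in transformers: loading the same decoder-only checkpoint via AutoModel gives the stack without the lm_head (hidden states only), while AutoModelForCausalLM adds the head back. A quick sketch:

```python
from transformers import AutoModel, AutoModelForCausalLM

# Same decoder-only checkpoint, loaded two ways:
base = AutoModel.from_pretrained("gpt2")           # transformer stack, no lm_head
lm = AutoModelForCausalLM.from_pretrained("gpt2")  # same stack + lm_head to vocab logits

# The two share the same transformer body; only the head differs.
print(type(base).__name__)  # GPT2Model
print(type(lm).__name__)    # GPT2LMHeadModel
```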