Using a Local TEI Server for Reranking is Possible

Is it possible to use a local TEI server for reranking?
That's just the API reference:

Plain Text
pip install llama-index-postprocessor-tei-rerank


Plain Text
from llama_index.postprocessor.tei_rerank import TextEmbeddingInference as TEIR

reranker = TEIR(...)
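For anyone landing here, a minimal sketch of wiring it up. The parameter names are assumed from the integration's base.py, and the URL and model are placeholders for your own local TEI server:

Plain Text
from llama_index.postprocessor.tei_rerank import TextEmbeddingInference as TEIR

# Point at your local TEI reranker endpoint (placeholder URL and model).
reranker = TEIR(
    base_url='http://localhost:8080',
    model_name='BAAI/bge-reranker-base',
    top_n=5,
)

# Apply it as a node postprocessor on a standard query engine.
query_engine = index.as_query_engine(node_postprocessors=[reranker])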
You might want to update this issue, which makes people think it's still not possible. https://github.com/run-llama/llama_index/issues/9572
2023 lol nice
leaving a comment
hmm, does LlamaIndex still not support TEI + gRPC?
I really haven't kept up with TEI
TEI Reranker is working great! However, it fails when I try to combine it with the QueryFusionRetriever:

'Input validation error: inputs must have less than 512 tokens. Given: 518'
Plain Text
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.retrievers.bm25 import BM25Retriever

top_k = 25

# Dense retriever backed by the vector index.
vector_retriever = index.as_retriever(similarity_top_k=top_k)

# Sparse BM25 retriever over the same docstore.
bm25_retriever = BM25Retriever.from_defaults(
    docstore=index.docstore, similarity_top_k=top_k
)

# Fuse both result lists with relative-score weighting.
retriever = QueryFusionRetriever(
    [vector_retriever, bm25_retriever],
    retriever_weights=[0.6, 0.4],
    similarity_top_k=top_k,
    num_queries=1,  # disable LLM query generation
    mode='relative_score',
    verbose=True,
)
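(The message doesn't show how the reranker was attached to the fusion retriever; presumably something like this, using the standard RetrieverQueryEngine:)

Plain Text
from llama_index.core.query_engine import RetrieverQueryEngine

# Attach the TEI reranker to the fusion retriever as a postprocessor.
query_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[reranker]
)
response = query_engine.query('your question here')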
set truncate_text=True in the TEI reranker
seems like some text is just a touch too large
I tried that. It's already set to True by default. No dice.
Everything works fine if I forgo the QueryFusionRetriever.
hmm, maybe TEI changed how that option works?
it might not be sending it properly to TEI
TEI rerank works great—if I replace the QueryFusionRetriever with a regular ol' VectorStoreIndex...as_query_engine().
I don't think the retriever should make any difference here lol, it just comes down to how TEI is handling the truncation?

Unless the full traceback shows the error is happening somewhere else?

Also, you might be more familiar with this than me; you can double-check the API usage here:
https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/postprocessor/llama-index-postprocessor-tei-rerank/llama_index/postprocessor/tei_rerank/base.py
llama_index.postprocessor.tei_rerank.TextEmbeddingInference sets truncate_text on its parent class, BaseNodePostprocessor. I don't think TEI is responsible for truncation here.
...but I'm wrong a lot
I'll try subclassing it.
Ah ha! The solution was to add --auto-truncate to my Docker command for the TEI reranker.
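For anyone else hitting this, the launch command looked roughly like this (image tag and model are placeholders for whatever you're running):

Plain Text
docker run --gpus all -p 8080:80 \
    ghcr.io/huggingface/text-embeddings-inference:1.5 \
    --model-id BAAI/bge-reranker-base \
    --auto-truncate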
that's so weird haha