How to Deploy a Reranker on a GPU and Call That as a Service?

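One common way to do this (a sketch, assuming the Hugging Face text-embeddings-inference container and an NVIDIA GPU; the port, volume path, and image tag here are examples, not from the thread):

```shell
# Serve BAAI/bge-reranker-large with text-embeddings-inference (TEI) on a GPU.
# --auto-truncate clips inputs that exceed the model's 512-token limit.
docker run --gpus all -p 8081:80 \
  -v "$PWD/tei-data:/data" \
  ghcr.io/huggingface/text-embeddings-inference:latest \
  --model-id BAAI/bge-reranker-large --auto-truncate
```

The container then serves HTTP endpoints (including a rerank route) on port 8081, which is what the LlamaIndex snippets below point at.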
How do I call this from LlamaIndex?
2025-02-06 15:03:34.834 | ERROR | core.inference_retriver:query_index:338 - Error in query_index UUID: None token: None - 1 validation error for TextEmbeddingInference
top_n
Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='BAAI/bge-reranker-large', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/int_parsing
>>> from llama_index.postprocessor.tei_rerank import TextEmbeddingInference as TEIR      
>>> from llama_index.core.schema import TextNode, NodeWithScore
>>> nodes = [NodeWithScore(score=1.0, node=TextNode(text="dog")), NodeWithScore(score=1.0, node=TextNode(text="cat")), NodeWithScore(score=1.0, node=TextNode(text="cow"))]
>>> reranker = TEIR(top_n=2, base_url="http://127.0.0.1:8081")
>>> reranker.postprocess_nodes(nodes, query_str="dog dog")[0].text
'dog'
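For reference, the same service can also be called over plain HTTP without LlamaIndex. A minimal sketch, assuming TEI's rerank endpoint shape and the base URL from the session above (the helper names are mine, not from the library):

```python
import json
from urllib import request

def build_rerank_payload(query: str, texts: list[str]) -> bytes:
    """Build the JSON body TEI's /rerank endpoint expects."""
    return json.dumps({"query": query, "texts": texts}).encode()

def rerank(query: str, texts: list[str],
           base_url: str = "http://127.0.0.1:8081") -> list[dict]:
    """POST to /rerank; TEI returns a list of {"index": ..., "score": ...}."""
    req = request.Request(
        f"{base_url}/rerank",
        data=build_rerank_payload(query, texts),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```

This is what the LlamaIndex postprocessor does under the hood: it sends the query plus each node's text to the service and reorders nodes by the returned scores.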
try:
    logger.debug("Setting up reranker")
    logger.debug(config.RERANKER_BASE_URL)
    reranker = TEIR(top_n=10, base_url=config.RERANKER_BASE_URL)
except Exception as e:
    logger.error(f"Reranker error: {e}")

query_engine = CitationQueryEngine.from_args(
    index,
    node_postprocessors=[reranker],
    similarity_top_k=similarity_top_k,
    llm=llm,
)
Something is wrong. I tried the above; I am using version 0.2.1 of the TEI rerank integration (TEIR).
2025-02-07T05:52:55.236885Z ERROR rerank:predict{truncate=false truncation_direction=Right raw_scores=false}: text_embeddings_core::infer: core/src/infer.rs:398: Input validation error: inputs must have less than 512 tokens. Given: 866
2025-02-07T07:04:30.293402Z ERROR rerank:predict{truncate=false truncation_direction=Right raw_scores=false}: text_embeddings_core::infer: core/src/infer.rs:398: Input validation error: inputs must have less than 512 tokens. Given: 694
2025-02-07T07:04:30.303219Z ERROR rerank:predict{truncate=false truncation_direction=Right raw_scores=false}: text_embeddings_core::infer: core/src/infer.rs:398: Input validation error: inputs must have less than 512 tokens. Given: 866
2025-02-07T07:04:30.306802Z ERROR rerank:predict{truncate=false truncation_direction=Right raw_scores=false}: text_embeddings_core::infer: core/src/infer.rs:398: Input validation error: inputs must have less than 512 tokens. Given: 866
2025-02-07T07:08:11.878829Z ERROR rerank:predict{truncate=false truncation_direction=Right raw_scores=false}: text_embeddings_core::infer: core/src/infer.rs:398: Input validation error: inputs must have less than 512 tokens. Given: 866
2025-02-07T07:08:11.891838Z ERROR rerank:predict{truncate=false truncation_direction=Right raw_scores=false}: text_embeddings_core::infer: core/src/infer.rs:398: Input validation error: inputs must have less than 512 tokens. Given: 694
2025-02-07T07:08:11.892099Z ERROR rerank:predict{truncate=false truncation_direction=Right raw_scores=false}: text_embeddings_core::infer: core/src/infer.rs:398: Input validation error: inputs must have less than 512 tokens. Given: 866
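These errors mean the concatenated query + document sent to the reranker exceeds the model's 512-token limit. Besides launching TEI with `--auto-truncate`, you can pre-trim node text on the client side. A crude sketch (character-count heuristic, not a real tokenizer; the 4-chars-per-token ratio and the query reserve are assumptions):

```python
MAX_TOKENS = 512       # reranker's sequence limit (query + document together)
CHARS_PER_TOKEN = 4    # rough heuristic; a real tokenizer would be exact
QUERY_RESERVE = 64     # tokens reserved for the query and special tokens

def truncate_for_rerank(text: str) -> str:
    """Trim text so it is likely to fit the reranker's token budget."""
    budget = (MAX_TOKENS - QUERY_RESERVE) * CHARS_PER_TOKEN
    return text if len(text) <= budget else text[:budget]
```

This loses tail content from long nodes, so server-side `--auto-truncate` (or chunking documents smaller at index time) is usually the cleaner fix.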

I guess this is the error.
Fixed, thanks though.
I think when you launch TEI there's an option to set auto-truncate
--auto-truncate