LangChainLLM() is fundamentally different from other LLMs like LlamaCPP() or SagemakerLLM(), specifically in that you cannot set messages_to_prompt or completion_to_prompt on LangChainLLM, but you can on the others. LangChainLLM is the only one that extends LLM instead of CustomLLM.

```python
hf = HuggingFaceTextGenInference(
    inference_server_url="https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta",
    max_new_tokens=512,
    top_k=10,
    top_p=0.95,
    typical_p=0.95,
    temperature=0.01,
    repetition_penalty=1.03,
)
prompt_style = get_prompt_style(settings.huggingface.prompt_style)
self.llm = LangChainLLM(llm=hf)
```
I want to pass in a messages_to_prompt function. Is there a way to do this?

LangChainLLM is just a wrapper around LLMs from LangChain, so that they have the expected interface of our LLM class.

The pyproject pins llama-index = { extras = ["local_models"], version = "0.9.3" }.

pip install llama-index==0.9.25 and it should install fine (pip will complain though with a warning).

I can pass messages_to_prompt to LangChainLLM with no issues, but it doesn't appear that it's used anywhere.
LangChainLLM.complete is straightforward as it calls _llm.predict(), which eventually calls the passed-in completion_to_prompt function. LangChainLLM.chat is more confusing, though, as it calls predict_messages(), which seems to be defined on BaseLLM and never seems to call messages_to_prompt anywhere in the chain.
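To make the asymmetry concrete, this is roughly the shape of what I'm seeing (an illustrative sketch only, not the actual llama_index code; FakeLangChainLLMWrapper is a made-up stand-in):

```python
# Illustrative only: mirrors the behaviour described above, NOT the
# llama_index source. FakeLangChainLLMWrapper is a made-up stand-in.
class FakeLangChainLLMWrapper:
    def __init__(self, lc_llm, completion_to_prompt, messages_to_prompt):
        self._llm = lc_llm
        self.completion_to_prompt = completion_to_prompt
        self.messages_to_prompt = messages_to_prompt  # accepted, but unused below

    def complete(self, prompt: str) -> str:
        # The completion path does apply the hook before hitting LangChain.
        formatted = self.completion_to_prompt(prompt)
        return self._llm.predict(formatted)

    def chat(self, messages) -> str:
        # The chat path hands the messages straight to predict_messages();
        # messages_to_prompt is never consulted.
        return self._llm.predict_messages(messages)
```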
Do you know how messages_to_prompt ever gets called? In practice it doesn't appear to be getting called either. I can see where completion_to_prompt() is called, but I'm not sure how messages_to_prompt will ever be used.

chat() will only be called for chat models that support messages as input in the first place 🤔
You could define stream_chat(messages) and have that function do something like the following:

```python
def stream_chat(
    self, messages: Sequence[ChatMessage], **kwargs: Any
) -> ChatResponseGen:
    prompt = self.messages_to_prompt(messages)
    completion_response = self.stream_complete(prompt, formatted=True, **kwargs)
    return stream_completion_response_to_chat_response(completion_response)
```
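Spelled out as a subclass, that suggestion might look something like this (a sketch only: the class name is made up, the import paths are what I'd expect for llama-index 0.9.x and may need adjusting for your version, and callback decorators are omitted for brevity):

```python
from typing import Any, Sequence

from llama_index.llms import ChatMessage, LangChainLLM
from llama_index.llms.generic_utils import (
    stream_completion_response_to_chat_response,
)


class MessagesToPromptLangChainLLM(LangChainLLM):
    """LangChainLLM whose streamed chat path goes through messages_to_prompt."""

    def stream_chat(self, messages: Sequence[ChatMessage], **kwargs: Any):
        # Format the chat history ourselves, then reuse the completion path.
        prompt = self.messages_to_prompt(messages)
        completion_response = self.stream_complete(prompt, formatted=True, **kwargs)
        return stream_completion_response_to_chat_response(completion_response)
```

It would then be constructed the same way as the snippet at the top, e.g. MessagesToPromptLangChainLLM(llm=hf, messages_to_prompt=prompt_style.messages_to_prompt), assuming the prompt_style object exposes a messages_to_prompt callable.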
Is there a way to make it so that when chat() is called, but it's not a chat model, it gets routed to complete()?
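For example, something along these lines (a rough sketch, not the code in the branch below; NonChatAwareLangChainLLM is a made-up name, and it assumes the wrapped model's metadata.is_chat_model flag is what you want to branch on):

```python
from typing import Any, Sequence

from llama_index.llms import ChatMessage, LangChainLLM
from llama_index.llms.generic_utils import completion_response_to_chat_response


class NonChatAwareLangChainLLM(LangChainLLM):
    def chat(self, messages: Sequence[ChatMessage], **kwargs: Any):
        if not self.metadata.is_chat_model:
            # Not a chat model: build a single prompt from the messages and
            # fall back to the completion path instead.
            prompt = self.messages_to_prompt(messages)
            completion_response = self.complete(prompt, formatted=True, **kwargs)
            return completion_response_to_chat_response(completion_response)
        return super().chat(messages, **kwargs)
```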
pip install git+https://github.com/run-llama/llama_index.git@logan/use_messages_to_prompt_langchain
pip install git+https://github.com/run-llama/llama_index.git to try it now.

Use poetry shell and poetry install to set up an initial env, and then use pip install for on-the-fly changes.
from llama_index.llms.base import LLM vs BaseLLM, I assume those are equivalent?

BaseLLM is the raw interface, same as the old LLM. LLM adds some extra sugar on top.
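So code that only needs the raw interface can type-hint against BaseLLM and accept either (a sketch; the import path is an assumption for llama-index 0.9.x and warm_up is just a made-up example function):

```python
# Sketch only: hint against the raw interface so any BaseLLM (and therefore
# any LLM) instance is accepted. Import path assumes llama-index 0.9.x.
from llama_index.llms.base import BaseLLM


def warm_up(llm: BaseLLM) -> None:
    print(llm.complete("Hello"))
```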