When loading llama2 locally, do I need to set it as the service context?

When loading llama2 locally, do I need to set it as the service context? The local llama example doesn't change the service context. When I ran a query, the model said it was OpenAI, not llama2.
Yea, it needs to be set in the service context.

What does your setup currently look like? I can suggest the edits needed
whoops, that example is missing a step :PepeHands:
Plain Text
from llama_index import ServiceContext, set_global_service_context

ctx = ServiceContext.from_defaults(llm=llm)
set_global_service_context(ctx)
Then the query engine will use the huggingface LLM
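For reference, creating that local llm might look something like this (just a sketch using HuggingFaceLLM; the llama2 model name and generation settings below are only examples):

Plain Text
from llama_index.llms import HuggingFaceLLM

# load llama2 locally via transformers (model name and settings are examples)
llm = HuggingFaceLLM(
    model_name="meta-llama/Llama-2-7b-chat-hf",
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    context_window=4096,
    max_new_tokens=256,
    device_map="auto",
)
That llm is what gets passed into ServiceContext.from_defaults(llm=llm) above.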
got it thanks!
@Logan M is there a way to use vllm with llama index?
hmm langchain has an integration for it right? Should be compatible with us

Plain Text
from llama_index.llms import LangChainLLM

llm = LangChainLLM(<create vllm from langchain>)
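For example, filling in that placeholder with LangChain's VLLM class might look something like this (just a sketch; the model name is a placeholder):

Plain Text
from langchain.llms import VLLM
from llama_index.llms import LangChainLLM

# LangChain's vLLM wrapper, wrapped again so LlamaIndex can use it
vllm_llm = VLLM(model="meta-llama/Llama-2-7b-chat-hf", max_new_tokens=256)
llm = LangChainLLM(llm=vllm_llm)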
https://python.langchain.com/docs/integrations/llms/vllm
I thought about that, just having trouble establishing a prompt template when using that method
Just because the LLM expects a specific input format? Or some other concern?
The LLM expects an instruction format. It was fine-tuned on Alpaca datasets.
yeaaa that will be tougher. Langchain doesn't really offer an easy way to set a static prompt template.

Your easiest bet would be to maybe wrap vllm into a CustomLLM object class

Then, you can format the prompt yourself before actually calling the llm?

Here's an example that wraps huggingface, but you could just replace that with vllm and go from there?
https://gpt-index.readthedocs.io/en/latest/core_modules/model_modules/llms/usage_custom.html#example-using-a-custom-llm-model-advanced
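A rough sketch of that idea, assuming vllm's offline LLM/SamplingParams API and the CustomLLM interface from the linked docs (the Alpaca template, model name, and context sizes are placeholders you'd adjust):

Plain Text
from vllm import LLM as VLLMEngine, SamplingParams
from llama_index.llms import CustomLLM, CompletionResponse, CompletionResponseGen, LLMMetadata
from llama_index.llms.base import llm_completion_callback

# module-level engine, same pattern as the huggingface pipeline in the linked example
engine = VLLMEngine(model="meta-llama/Llama-2-7b-chat-hf")  # placeholder model name
sampling = SamplingParams(max_tokens=256, temperature=0.1)

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

class VLLMAlpaca(CustomLLM):
    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(context_window=4096, num_output=256)

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs) -> CompletionResponse:
        # apply the instruction template before handing the prompt to vllm
        formatted = ALPACA_TEMPLATE.format(instruction=prompt)
        output = engine.generate([formatted], sampling)[0]
        return CompletionResponse(text=output.outputs[0].text)

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs) -> CompletionResponseGen:
        raise NotImplementedError()
Then you'd set it as the default the same way as before: ServiceContext.from_defaults(llm=VLLMAlpaca()).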
I wish more LLM providers (like vllm) just offered a way to do this for you :PSadge: