I want to ask the LLM questions directly, instead of going through the whole RAG pipeline that requires specifying a retriever, a text_qa_template, and a response_synthesizer.
When I loaded the LLM using AutoModelForCausalLM, I could do the following:
from langchain.prompts import PromptTemplate as LCPromptTemplate

template = """some text here"""
prompt_template = LCPromptTemplate(
    input_variables=["text"],
    template=template,
)
# Fill in the prompt and call the model directly
response = llm(prompt_template.format(text=question))
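For reference, here is roughly how the model was loaded in the working case. This is a minimal sketch: the model name is a placeholder, and wrapping the transformers pipeline in LangChain's HuggingFacePipeline is my assumption about how the callable llm object was built.

# Minimal sketch of the working setup (model name is a placeholder;
# the HuggingFacePipeline wrapper is an assumption).
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_community.llms import HuggingFacePipeline

model_name = "some/causal-lm"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

hf_pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=256)
llm = HuggingFacePipeline(pipeline=hf_pipe)  # callable, so llm(prompt) works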
But when I loaded the LLM using Vllm and tried the same call, I got the following error:
'Vllm' object is not callable
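This is roughly the code that triggers it (again a minimal sketch: the import assumes the Vllm wrapper from llama_index.llms.vllm, since the retriever / text_qa_template / response_synthesizer terms above come from LlamaIndex, and the model name is a placeholder):

# Sketch of the failing call (model name is a placeholder; the import
# assumes LlamaIndex's Vllm wrapper).
from llama_index.llms.vllm import Vllm

llm = Vllm(model="some/causal-lm")  # placeholder
response = llm(prompt_template.format(text=question))  # raises: 'Vllm' object is not callable

Is there an equivalent single call on the Vllm object that just takes a prompt string and returns the completion?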