You'll have to check whether increasing or reducing the chunk size helps in your case.
Check the response object and verify which nodes it picked, and whether they were picked correctly.
@WhiteFang_Jr how can I verify which nodes are picked?
You can check the response object. It contains the nodes which have been used to generate the response.
print(response.source_nodes)
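For context, here is a minimal sketch of where source_nodes comes from (assuming an index built over your documents with LlamaIndex; the "data" folder and the question are placeholders):

    from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents)

    query_engine = index.as_query_engine()
    response = query_engine.query("your question here")

    # each entry is a NodeWithScore: the retrieved chunk plus its similarity score
    for node_with_score in response.source_nodes:
        print(node_with_score.score, node_with_score.node.get_content()[:200])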
I have this error: 'CompletionResponse' object has no attribute 'sources_nodes' @WhiteFang_Jr
Sounds like you are using the LLM directly? What did your code look like? @WhiteFang_Jr was referring to the output of a query engine
@Logan M @WhiteFang_Jr
        if custom_prompt_template:
            query_with_context = custom_prompt_template.generate_prompt(query=query_text, documents=response_docs)
        else:
            query_with_context = query_text

        response = Settings.llm.complete(query_with_context)
        response_text = str(response)  # Convert the response object to string

        source_documents = [{"filename": doc['filename'], "text": doc['text']} for doc in response_docs]

        response_data = {
            'response': response_text,
        }
        return jsonify(response_data)
    except Exception as e:
        logging.error(f"Error processing query: {e}", exc_info=True)
        return jsonify({'error': str(e)}), 500
Right, you are using the LLM directly
So you already have the source documents, i.e. response_docs
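A minimal sketch of that idea, reusing the names from the snippet above (response_docs is assumed to be a list of dicts with 'filename' and 'text' keys, as in that code):

    response = Settings.llm.complete(query_with_context)
    response_data = {
        'response': str(response),
        # the retrieved sources are already in hand; no response.source_nodes needed here
        'sources': [{"filename": doc['filename'], "text": doc['text']} for doc in response_docs],
    }
    return jsonify(response_data)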
@Logan M I have a timeout problem, is there a way to make the code execute faster?
nope. It's up to your LLM and what you are running it on
Increase the timeout 🤷♂️
I put the timeout at 400 s:
Settings.llm = Ollama(
    model="llama3",
    max_length=4096,
    temperature=0.7,
    top_p=0.9,
    device_map="auto",
    server_url="http://localhost:11434",
    request_timeout=400.0,
)
and it still gives the timeout error @Logan M, is there a solution for that?
put it higher? I usually put something like 3600 lol
I suspect it might not actually be seconds
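For example, a sketch with only the essentials changed (keeping the defaults for everything else; the exact value is just the one suggested above):

    from llama_index.llms.ollama import Ollama
    from llama_index.core import Settings

    # give the request a much larger ceiling, per the suggestion above
    Settings.llm = Ollama(model="llama3", request_timeout=3600.0)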
I will try to set it as you said, but I want to know if there is a way for the code to answer more quickly.
the speed is limited by
- the hardware you are running on
- the size of the input
- the size of the output (i.e. how much the LLM decides to write)
Local models are slow af my guy, especially when running on CPU
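If the output side is the bottleneck, one hedged option is to cap how much the model writes. This sketch assumes additional_kwargs is passed through to Ollama as generation options and that your model honors num_predict:

    from llama_index.llms.ollama import Ollama
    from llama_index.core import Settings

    Settings.llm = Ollama(
        model="llama3",
        request_timeout=3600.0,
        # num_predict caps the number of generated tokens,
        # which shortens the output-bound part of each call
        additional_kwargs={"num_predict": 256},
    )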
Thank you for the answer, can you please tell me what I can replace local models with?
use OpenAI, or some other API-based service
I wanted to use something different from OpenAI
then you can use Anthropic, Mistral's API, Vertex, Bedrock, Azure
the options are limitless
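As one hedged sketch of the swap (this assumes the llama-index-llms-anthropic package is installed and an ANTHROPIC_API_KEY environment variable is set; the model name is just an example):

    from llama_index.llms.anthropic import Anthropic
    from llama_index.core import Settings

    # drop-in replacement for the local Ollama settings above
    Settings.llm = Anthropic(model="claude-3-haiku-20240307", temperature=0.7)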
Ok, thanks a lot. One more question please: when I upload a legal document with terms (a French document) and I ask about a certain term, for example "give me term 3", the response says there is no such term, but there is one. What is the problem here? @Logan M