Developing a RAG based llama-index with a local model (llama2) and default embedding

Developing a RAG based llama-index with a local model (llama2) and default embedding. The answers vary quite a bit when asking the same question repeatedly. How would you make the answers a bit more deterministic / consistent for the same query?
2 comments
The smaller models struggle more with consistent outputs, but lowering the temperature should help.
To make the answers more deterministic and consistent for the same query when using a RAG based llama-index with a local model (llama2) and default embedding, you can adjust the model's temperature parameter.

The temperature parameter controls the randomness of the model's predictions. A higher temperature will make the output more diverse and random, while a lower temperature will make the output more deterministic and focused on the most likely output sequence.
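
To illustrate what this means mathematically (a standalone sketch, not llama-index code): logits are divided by the temperature before the softmax, so a low temperature sharpens the distribution toward the single most likely token:

Plain Text
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.array(logits) / temperature
    exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 1.0))  # ~[0.63, 0.23, 0.14] — spread out
print(softmax_with_temperature(logits, 0.1))  # ~[1.00, 0.00, 0.00] — nearly one-hot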

Here is how you can set the temperature when creating an instance of the model. The import path below is for recent llama-index versions (older versions use "from llama_index.llms import LlamaCPP"), and the model path is a placeholder for your local llama2 file:

Plain Text
from llama_index.llms.llama_cpp import LlamaCPP

# A lower temperature makes the output more deterministic
llama2_model = LlamaCPP(model_path="path/to/llama-2.gguf", temperature=0.1)


Please note that the exact temperature value that gives the best results can vary depending on the specific use case and may require some experimentation.
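
For context, here is a minimal sketch of wiring the low-temperature model into a RAG query engine. It assumes a recent llama-index version (the Settings API; older versions pass the LLM via ServiceContext.from_defaults(llm=...) instead), and the model path, the data directory, and the sample query are all placeholders:

Plain Text
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.llama_cpp import LlamaCPP

# Use the low-temperature local model for answer synthesis
Settings.llm = LlamaCPP(model_path="path/to/llama-2.gguf", temperature=0.0)

# Build an index over local documents with the default embedding model
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
# Repeating the same query should now produce much more consistent answers
print(query_engine.query("What does the document say about deployment?"))

With the temperature at or near 0, sampling is effectively greedy, so repeated runs of the same query over the same index should give near-identical answers; any remaining variation usually comes from changes in retrieval rather than from the LLM.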