You could just add to the prompts "write your answer in <insert language here>". Heck, maybe even adding that to the query string will work 🤔
(man, you're a machine.. thanks for being so supportive..)
Well, I could do that, but the queries that get sent to OpenAI will still contain English words anyway… What I'm testing out is using a SimpleVectorIndex with very short documents, so I'm setting similarity_top_k=100. When doing so, it looks like llamaindex sends multiple queries to OpenAI, one per retrieved document, each with a 'refine previous answer' statement. (newbie here, just figuring out how llamaindex works)
If I set similarity_top_k=50, the refine step isn't needed and I get a good answer, but when I set it to 100 and llamaindex enters the refine process, the answer gets bad.
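The behaviour described above can be sketched as a plain-Python loop, just to show why a large similarity_top_k means many sequential LLM calls (this is an illustration of the refine pattern, not llamaindex's actual code; the `llm` function is a hypothetical stand-in):

```python
def synthesize_refine(query, chunks, llm):
    """Sketch of the 'refine' synthesis pattern: one LLM call per
    retrieved chunk, each call revising the previous answer."""
    answer = llm(f"Context: {chunks[0]}\nAnswer the query: {query}")
    for chunk in chunks[1:]:
        # Each extra chunk triggers another "refine the previous answer"
        # request, so similarity_top_k=100 means ~100 sequential calls.
        answer = llm(
            f"Existing answer: {answer}\nNew context: {chunk}\n"
            f"Refine the existing answer to the query: {query}"
        )
    return answer
```

Every one of those refine calls is a chance for the model to mangle a previously good answer, which is why the quality drops as k grows.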
Yup that's how it works!
You can try sending fewer requests by setting response_mode="compact"
in the query, which will stuff as much text as possible into each LLM call
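The difference between compact mode and the default can be sketched like this (again a hypothetical stand-in, not llamaindex internals; `max_chars` and `fake_llm` are illustrative names):

```python
def synthesize_compact(query, chunks, llm, max_chars=1000):
    """Sketch of response_mode='compact': pack as many retrieved
    chunks as fit into each prompt instead of one call per chunk."""
    answer, batch, size = None, [], 0
    for chunk in chunks:
        if batch and size + len(chunk) > max_chars:
            # Flush a full batch: one LLM call covers many chunks
            answer = llm(query, "\n".join(batch), answer)
            batch, size = [], 0
        batch.append(chunk)
        size += len(chunk)
    if batch:  # final partial batch
        answer = llm(query, "\n".join(batch), answer)
    return answer
```

With short documents, compact mode can collapse dozens of refine calls into a handful, which also means fewer chances for the refine step to degrade the answer.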
Out of curiosity, are you using gpt3.5? Something changed with the model recently and it does much worse with the refine process now 😔 (working on a fix, but it's difficult to work with haha)
And yes, I'm using response_mode=compact and mode=embedding too.
I know what you mean. I could make llamaindex generate a final response in Spanish, but it's very bad. I'm probably hitting the issue you're describing. But anyway, I think that doesn't eliminate the need to customize the chat-refine template for other languages. I think in my case it would actually help.
Translating the prompt templates should be fairly straightforward; which part exactly were you having trouble with?
I know the chat refine prompt is made up of 3 messages, but I think you only need to translate the last message because it has all the instructions
I see those 3-part messages flowing to OpenAI, which I'd like to change.
I noticed though, that .query() has a refine_template property. But what that does (it seems to me at least) is bypass the 3-part messages and send only 1.
Like it's ignoring the roles mechanism of chat OpenAI. (?)
I mean, instead of sending 3 messages (roles user, assistant, user), it sticks everything into 1 message with role=user.
Nah, the messages are not ignored; if you use a debugger and go deep into the code, the actual prompts sent to OpenAI still use the message/role format.
So if you want to create your own refine template, copy lines 12-29 from that file, and use the result of line 29 as your refine template
ooh, I can put the whole thing in as a refine template.
Yea! So you define the list of messages, pass that into ChatPromptTemplate.from_messages, and then pass the output of that into RefinePrompt.from_langchain_prompt
Then, you can pass that into your query as the refine_template
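Putting those steps together, a sketch might look like the following. This assumes the older llama_index + langchain APIs named above (`ChatPromptTemplate.from_messages`, `RefinePrompt.from_langchain_prompt`); the message strings are placeholders paraphrasing the default 3-message refine prompt, which you'd translate into your target language:

```python
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    AIMessagePromptTemplate,
)
from llama_index.prompts.prompts import RefinePrompt

# Roughly mirrors the default chat refine prompt's 3 messages
# (user, assistant, user) -- translate these strings as needed.
messages = [
    HumanMessagePromptTemplate.from_template("{query_str}"),
    AIMessagePromptTemplate.from_template("{existing_answer}"),
    HumanMessagePromptTemplate.from_template(
        "We have the opportunity to refine the answer above using the "
        "context below.\n{context_msg}\nGiven the new context, refine "
        "the original answer. If the context isn't useful, repeat the "
        "original answer."
    ),
]
chat_prompt = ChatPromptTemplate.from_messages(messages)
refine_template = RefinePrompt.from_langchain_prompt(chat_prompt)

# Then pass it into the query:
# response = index.query(query_str, refine_template=refine_template)
```

Only the last message carries the instructions, so that's the one worth translating carefully.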
ok.. so when a prior response is good and the refine process kicks in and tries to generate a better one, it says things like the previous answer doesn't relate to the question. So it ends up messing up the answer.
Is that what you were referring to with there being something weird with gpt-3.5 that needs a fix?
Yea pretty much. Gpt3.5 is supposed to repeat the existing answer if the new context does not help. And this worked fine in the past.
But as you can see, now it refuses to cooperate lol
ok, got it. Yup, I see exactly that.
Is there a reported ticket for this somewhere, so I can subscribe to it?
Hmmm I actually don't see an issue yet on github for this!