Max tokens

nope, gpt-3.5-turbo
That might be why the output changes so much. Gpt-3.5 can be... difficult to work with, compared to text-davinci-003
I honestly think openai has dumbed it down in the last month lol
With davinci-003 + max_tokens=1024 + chunk_size_limit=1024 + top_n=10 (Cohere) + k=10 (Weaviate),
I receive good responses, but they are very, very slow. LOL
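
A rough sketch of how that stack might have been wired together in the LlamaIndex versions of that era. The module paths, class names, and the Weaviate URL are assumptions; signatures shifted quickly between releases, and text-davinci-003 has since been deprecated:

```python
import weaviate
from langchain.llms import OpenAI
from llama_index import GPTVectorStoreIndex, LLMPredictor, ServiceContext
from llama_index.indices.postprocessor import CohereRerank
from llama_index.vector_stores import WeaviateVectorStore

# text-davinci-003 with a 1024-token output cap
llm_predictor = LLMPredictor(llm=OpenAI(model_name="text-davinci-003", max_tokens=1024))

# chunk_size_limit=1024 controls how documents are split at index time
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, chunk_size_limit=1024
)

# Weaviate as the vector store; k=10 maps to similarity_top_k at query time
client = weaviate.Client("http://localhost:8080")  # placeholder URL
vector_store = WeaviateVectorStore(weaviate_client=client)
index = GPTVectorStoreIndex.from_vector_store(
    vector_store, service_context=service_context
)

# Cohere reranking of the retrieved candidates, keeping the best top_n=10
reranker = CohereRerank(api_key="YOUR_COHERE_KEY", top_n=10)  # placeholder key
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[reranker]
)
response = query_engine.query("your question here")
```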
Lol you could maybe enable streaming, to help the responses feel faster
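
Assuming the pipeline sketched above and a llama_index version that supported the streaming flag on as_query_engine, enabling streaming was roughly a flag plus a streaming print (names are assumptions):

```python
# Same query engine as above, but with streaming enabled so tokens
# print as they arrive instead of after the full completion finishes.
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[reranker], streaming=True
)

streaming_response = query_engine.query("your question here")
streaming_response.print_response_stream()
```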

When you say top_n with Cohere, you mean LLM reranking or?
reranking, right!
Nice! Yeah, that's probably going to be the main bottleneck; LLM calls are costly in terms of time
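
For anyone reading later, the rerank step itself looks roughly like this with the Cohere Python SDK directly, outside LlamaIndex. The model name, API key, and documents are placeholders; response-object details may differ by SDK version:

```python
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")  # placeholder key

# The k=10 candidate passages Weaviate returned for the query
docs = [
    "candidate passage 1 ...",
    "candidate passage 2 ...",
]

# Re-score every candidate against the query, keep the best top_n
response = co.rerank(
    model="rerank-english-v2.0",  # assumed model name for that era
    query="the user's question",
    documents=docs,
    top_n=10,
)

for result in response.results:
    print(result.index, result.relevance_score)
```

The rerank call itself is one extra network round trip per query; as noted above, the final LLM completion over the reranked context is usually the slower part.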