Max tokens

nope, gpt-3.5-turbo
That might be why the output changes so much. Gpt-3.5 can be... difficult to work with, compared to text-davinci-003
I honestly think openai has dumbed it down in the last month lol
With davinci-003 + max_tokens=1024 + chunk_size_limit=1024 + top_n=10 (Cohere) + k=10 (Weaviate),
I receive good responses, but they are very, very slow. LOL
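
A rough sketch of how that stack might have been wired together in the LlamaIndex versions of that era. The module paths, class names, and the Weaviate URL are assumptions; signatures shifted quickly between releases, and text-davinci-003 has since been deprecated:

```python
import weaviate
from langchain.llms import OpenAI
from llama_index import GPTVectorStoreIndex, LLMPredictor, ServiceContext
from llama_index.indices.postprocessor import CohereRerank
from llama_index.vector_stores import WeaviateVectorStore

# text-davinci-003 with a 1024-token output cap
llm_predictor = LLMPredictor(llm=OpenAI(model_name="text-davinci-003", max_tokens=1024))

# chunk_size_limit=1024 controls how documents are split at index time
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, chunk_size_limit=1024
)

# Weaviate as the vector store; k=10 maps to similarity_top_k at query time
client = weaviate.Client("http://localhost:8080")  # placeholder URL
vector_store = WeaviateVectorStore(weaviate_client=client)
index = GPTVectorStoreIndex.from_vector_store(
    vector_store, service_context=service_context
)

# Cohere reranking of the retrieved candidates, keeping the best top_n=10
reranker = CohereRerank(api_key="YOUR_COHERE_KEY", top_n=10)  # placeholder key
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[reranker]
)
response = query_engine.query("your question here")
```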
Lol you could maybe enable streaming, to help the responses feel faster
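
Assuming the pipeline sketched above and a llama_index version that supported the streaming flag on as_query_engine, enabling streaming was roughly a flag plus a streaming print (names are assumptions):

```python
# Same query engine as above, but with streaming enabled so tokens
# print as they arrive instead of after the full completion finishes.
query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[reranker], streaming=True
)

streaming_response = query_engine.query("your question here")
streaming_response.print_response_stream()
```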

When you say top_n with Cohere, you mean LLM reranking or?
reranking, right!
Nice! Yeah, that's probably going to be the main bottleneck; LLM calls are costly in terms of time
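
For anyone reading later, the rerank step itself looks roughly like this with the Cohere Python SDK directly, outside LlamaIndex. The model name, API key, and documents are placeholders; response-object details may differ by SDK version:

```python
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")  # placeholder key

# The k=10 candidate passages Weaviate returned for the query
docs = [
    "candidate passage 1 ...",
    "candidate passage 2 ...",
]

# Re-score every candidate against the query, keep the best top_n
response = co.rerank(
    model="rerank-english-v2.0",  # assumed model name for that era
    query="the user's question",
    documents=docs,
    top_n=10,
)

for result in response.results:
    print(result.index, result.relevance_score)
```

The rerank call itself is one extra network round trip per query; as noted above, the final LLM completion over the reranked context is usually the slower part.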