Do you reckon there's some hacky way I can get at the generated prompt for now? I have a big demo for our CTO and I'm super keen to use this, but it looks starkly different from the rest of the app, which now streams GPT-3.5 Turbo and flies. If I could retrieve the prompt, I could do the rest.
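One hacky option, assuming the legacy openai Python SDK (pre-1.0, consistent with the openai.error.InvalidRequestError mentioned below), is to monkey-patch openai.Completion.create and log each prompt before forwarding the call. A minimal sketch, not an official hook:

```python
import openai

# Keep a reference to the real call, then wrap it so every
# outgoing prompt is captured before the request goes out.
_original_create = openai.Completion.create

def _logging_create(*args, **kwargs):
    # The prompt the library assembled arrives here as a kwarg.
    print("PROMPT >>>", kwargs.get("prompt"))
    return _original_create(*args, **kwargs)

openai.Completion.create = _logging_create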
Yes! This works really well! The only weird thing I'm seeing is that I seem to lose newlines (\n etc.) from the returned prompt. I noticed this with my normal responses too. Is that something on my end, or is that just how it comes out?
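If the newlines only look missing when you're eyeballing the string (e.g. in a debugger or a logged repr), it may just be escaping rather than lost data. A quick, purely illustrative check:

```python
prompt = "First line\nSecond line"

print(repr(prompt))  # shows the escape sequence: 'First line\nSecond line'
print(prompt)        # renders the actual line break
```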
I also have a few examples where I asked an English question about a non-English PDF, and it works! Sometimes, though, I get an error like: openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4100 tokens (3844 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
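For reference, the arithmetic in that error is prompt tokens plus the reserved completion budget exceeding the context window: 3844 + 256 = 4100 > 4097. A sketch of checking the budget up front with tiktoken, assuming text-davinci-003 (whose 4097-token window matches the error):

```python
import tiktoken

MAX_CONTEXT = 4097        # context window from the error message above
COMPLETION_BUDGET = 256   # tokens reserved for the completion

enc = tiktoken.encoding_for_model("text-davinci-003")
prompt = "..."  # the assembled prompt you want to send

n_prompt_tokens = len(enc.encode(prompt))
if n_prompt_tokens + COMPLETION_BUDGET > MAX_CONTEXT:
    over = n_prompt_tokens + COMPLETION_BUDGET - MAX_CONTEXT
    print(f"Over budget by {over} tokens; trim the prompt or shrink chunks")
```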
Re: context length, sorry about that bug; I'm still trying to debug where it happens. In the meantime, you can try manually setting the chunk size smaller with chunk_size_limit.
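A minimal sketch of that workaround, assuming an early gpt_index/llama_index version where chunk_size_limit is accepted at index construction (the exact entry point may differ on your version):

```python
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()

# Smaller chunks mean fewer prompt tokens per retrieved chunk,
# leaving headroom for the 256-token completion budget.
index = GPTSimpleVectorIndex(documents, chunk_size_limit=512)

response = index.query("What does the document say about X?")
print(response)
```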