Hi everyone! My current solution suffers from latency issues that negatively affect the user experience. We are using OpenAI with RAG, and since I'm new to this space and the project was handed over to me directly, I would appreciate suggestions or advice on which areas to look at to reduce the latency.
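One area that often pays off quickly is streaming the model's output, since it cuts perceived latency even when total generation time stays the same. Below is a minimal sketch assuming the official `openai` Python SDK (v1+); the model name and prompt are placeholders, not details from the actual project:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# stream=True makes tokens arrive as they are generated, so the UI can
# start rendering text immediately instead of waiting for the full reply.
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name; use whatever you run
    messages=[{"role": "user", "content": "Answer using the retrieved context..."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

Beyond that, it's worth profiling each stage separately: retrieval time (vector store round trips), prompt size (very long contexts slow down time-to-first-token), and the model choice itself (smaller models respond faster).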
Hi everyone, I have seen the condense question + context mode for the chat engine in the LlamaIndex docs, but it seems to be shown only for OpenAI. Can somebody suggest whether it's achievable with an Anthropic LLM and LlamaIndex?
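For what it's worth, condense-plus-context in LlamaIndex isn't tied to OpenAI; the chat engine accepts any LLM object. Here's a minimal sketch assuming the llama-index >= 0.10 package layout (requires the `llama-index-llms-anthropic` and `llama-index-embeddings-huggingface` packages); the model names and the `./data` path are placeholder assumptions, and since Anthropic has no embedding API you pair it with a separate embedding model:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.anthropic import Anthropic
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Anthropic LLM for generation; model id is a placeholder.
Settings.llm = Anthropic(model="claude-3-5-sonnet-20241022")
# Anthropic has no embedding endpoint, so use a local embedding model.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

documents = SimpleDirectoryReader("./data").load_data()  # placeholder path
index = VectorStoreIndex.from_documents(documents)

chat_engine = index.as_chat_engine(chat_mode="condense_plus_context")
print(chat_engine.chat("What does the document say about latency?"))
```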
I'm not using embeddings or any vector stores as of now, since I'm new to LLMs and have built mostly basic things without adding complexity, but I would love some new suggestions and learnings.