OmriNach
Joined September 25, 2024
Has anyone figured out how to trigger RAG retrieval inside a reasoning model's <think> steps? Similar to the Search-o1 paper here: https://search-o1.github.io

In my opinion, this would be the best way to use DeepSeek to optimize RAG, since newly retrieved knowledge can push the reasoning chain in a different direction.

I tried using the system prompt to make the DeepSeek Llama 8B and Qwen 32B distills output <search> tokens, but they do not like to follow instructions!
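A minimal sketch of the control loop this would need, assuming an LLM wrapper whose `complete()` accepts a stop list and any retriever object with a `retrieve()` method returning nodes with `.text` (both are placeholders here, not a specific LlamaIndex API): stop decoding whenever the model closes a `<search>` tag, run retrieval on the query inside it, splice the results back into the partial reasoning, and resume.

```python
import re

# Matches the query text the model has emitted after an opening <search> tag.
SEARCH_RE = re.compile(r"<search>(.*)", re.DOTALL)

def reason_with_retrieval(llm, retriever, question, max_rounds=5):
    # Ask the model to think in <think> tags and request evidence via <search> tags.
    transcript = (
        "Think step by step inside <think>...</think>. Whenever you need external "
        "knowledge, write <search>your query</search> and stop.\n"
        f"Question: {question}\n<think>"
    )
    for _ in range(max_rounds):
        # Halt decoding at </search> so retrieval can run mid-reasoning.
        chunk = llm.complete(transcript, stop=["</search>"]).text
        transcript += chunk
        match = SEARCH_RE.search(chunk)
        if match is None:
            break  # no retrieval requested; the reasoning chain finished on its own
        query = match.group(1).strip()
        nodes = retriever.retrieve(query)
        evidence = "\n".join(n.text for n in nodes[:3])
        # Splice retrieved knowledge back in so it can redirect the next steps.
        transcript += f"</search>\n<result>\n{evidence}\n</result>\n"
    return transcript
```

Whether the distills actually honor the stop-and-search convention is exactly the open question; the loop above only shows where the retrieval call would sit.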
5 comments
I transitioned from the OpenAI LLM to an AzureOpenAI GPT-4o deployment, but I can't get the model to produce more than 1000 tokens. I have not set max_tokens and confirmed it's None in Settings.llm. Not sure what setting I'm missing here. Has anyone experienced the same?
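One thing worth trying (a sketch, not a confirmed fix): pass max_tokens explicitly when constructing the Azure LLM, since the cap may be coming from a service-side default completion limit rather than anything in Settings. The deployment name, endpoint, and api_version below are placeholders.

```python
from llama_index.core import Settings
from llama_index.llms.azure_openai import AzureOpenAI

Settings.llm = AzureOpenAI(
    engine="my-gpt4o-deployment",      # your Azure deployment name (placeholder)
    model="gpt-4o",
    azure_endpoint="https://<resource>.openai.azure.com/",
    api_key="...",                     # or rely on AZURE_OPENAI_API_KEY
    api_version="2024-02-15-preview",  # placeholder; use your deployment's version
    max_tokens=4096,                   # request a longer completion explicitly
)
```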
5 comments
I have a create-llama backend running on EC2 with a load balancer, auto scaling group, etc. It works perfectly fine serving my llama front-end chat website on most Wi-Fi networks, but on certain corporate Wi-Fi networks I get ERR_CERT_AUTHORITY_INVALID when trying to reach the backend. Does anyone know a solution to this? Our SSL certificate is provided by AWS as well and gets an A grade from SSL Labs.
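Not an answer, but a quick way to check one common culprit: locked-down corporate networks often run a TLS-inspecting proxy that re-signs traffic with an internal CA, which produces exactly this error even though the AWS-issued certificate is fine. A small diagnostic sketch (the hostname is a placeholder) to compare the certificate the client actually receives on and off the corporate network:

```python
import hashlib
import socket
import ssl

def served_cert_fingerprint(host: str, port: int = 443) -> str:
    # Fetch whatever certificate the server (or an intercepting proxy) presents,
    # even if it would fail verification, and fingerprint it for comparison.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()

print(served_cert_fingerprint("api.example.com"))  # placeholder backend hostname
```

If the fingerprint differs on the corporate Wi-Fi, the proxy is rewriting the certificate and the fix is on the corporate side (trusting their root CA on the client), not in the AWS setup.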
3 comments
With the arrival of the new Workflows architecture, it becomes possible to make the powerful CoA (chain-of-abstraction) approach even better by executing it in steps rather than in one shot: it can then revise the execution plan as data comes in and even progressively refine an answer until a validator triggers a stop event. @Logan M, have you thought about making a notebook on how to implement a CoA RAG pipeline with the new Workflows?
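A minimal sketch of how such a stepwise loop could be wired with Workflows (the event names, the ctx.get/ctx.set state calls, and the plan/validate prompts are assumptions against a recent llama-index-core, not an official CoA recipe): plan an abstract reasoning chain, retrieve to fill it in, draft an answer, and loop through a validator that either asks for another pass or stops.

```python
from llama_index.core.workflow import (
    Context, Event, StartEvent, StopEvent, Workflow, step,
)

class PlanEvent(Event):
    plan: str

class DraftEvent(Event):
    draft: str

class CoARagWorkflow(Workflow):
    def __init__(self, llm, retriever, max_passes: int = 3, **kwargs):
        super().__init__(**kwargs)
        self.llm = llm
        self.retriever = retriever
        self.max_passes = max_passes

    @step
    async def plan(self, ctx: Context, ev: StartEvent) -> PlanEvent:
        # Chain-of-abstraction: sketch the reasoning with retrieval placeholders.
        await ctx.set("question", ev.query)
        await ctx.set("passes", 0)
        plan = (await self.llm.acomplete(
            f"Write an abstract reasoning plan with [SEARCH: ...] slots for: {ev.query}"
        )).text
        return PlanEvent(plan=plan)

    @step
    async def retrieve_and_draft(self, ctx: Context, ev: PlanEvent) -> DraftEvent:
        question = await ctx.get("question")
        nodes = self.retriever.retrieve(ev.plan)
        evidence = "\n".join(n.text for n in nodes[:5])
        draft = (await self.llm.acomplete(
            f"Plan:\n{ev.plan}\n\nEvidence:\n{evidence}\n\nAnswer the question: {question}"
        )).text
        return DraftEvent(draft=draft)

    @step
    async def validate(self, ctx: Context, ev: DraftEvent) -> PlanEvent | StopEvent:
        passes = await ctx.get("passes") + 1
        await ctx.set("passes", passes)
        verdict = (await self.llm.acomplete(
            f"Does this answer fully address the question? Reply OK or REVISE.\n{ev.draft}"
        )).text
        if "OK" in verdict or passes >= self.max_passes:
            return StopEvent(result=ev.draft)  # validator triggers the stop event
        # Otherwise revise the plan with what we learned and go around again.
        return PlanEvent(plan=f"Revise the plan given this draft:\n{ev.draft}")
```

Running it would look roughly like `await CoARagWorkflow(llm=Settings.llm, retriever=index.as_retriever(), timeout=120).run(query="...")`, with the validator step deciding when to emit the StopEvent.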
18 comments