Not sure if this is the right place to post. Let me know if there is a better channel.
I would like advice on my RAG architecture: the response quality is low and the LLM (GPT-4) barely follows the instructions.
We have a SaaS app for restaurants where users can place orders, among other things, and we are developing an intelligent cross-selling feature. The idea is that the LLM recommends products that go well with the customer's cart (before they submit their order).
So far I have tried 2 approaches, with little success:
Approach 1: Chain of Thought
Step 1: Give the product CATEGORIES to the LLM as context and, based on the instructions and the user cart (query), ask it to choose 2-4 categories to recommend products from. Pydantic response.
Step 2: Give the products of the selected categories to the LLM as context and ask it to choose the best product based on the instructions and the user cart (query). Pydantic response.
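To make the two steps of Approach 1 concrete, here is a simplified sketch, not the production code. Everything in it is an assumption for illustration: the `CATALOG` contents, the function names, the prompt wording, and `fake_llm`, which returns canned JSON in place of a real GPT-4 call. Plain dataclasses stand in for the Pydantic response models so the sketch needs only the standard library.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical menu: category -> product names.
CATALOG = {
    "Drinks": ["Cola", "Lemonade"],
    "Desserts": ["Tiramisu", "Cheesecake"],
    "Sides": ["Fries", "Garlic Bread"],
}


@dataclass
class CategoryChoice:  # stand-in for the step 1 Pydantic response
    categories: list


@dataclass
class ProductChoice:  # stand-in for the step 2 Pydantic response
    product: str


def pick_categories(cart, call_llm: Callable[[str], str], lo=2, hi=4):
    """Step 1: ask for lo-hi categories and validate them against the catalog."""
    prompt = "\n".join([
        "Cart: " + ", ".join(cart),
        "Categories: " + ", ".join(CATALOG),
        f'Reply with JSON only: {{"categories": [...]}} with {lo}-{hi} entries.',
    ])
    raw = json.loads(call_llm(prompt))
    # Drop categories the model invented instead of trusting the output.
    valid = [c for c in raw["categories"] if c in CATALOG]
    if not lo <= len(valid) <= hi:
        raise ValueError(f"expected {lo}-{hi} known categories, got {valid}")
    return CategoryChoice(valid)


def pick_product(cart, choice, call_llm: Callable[[str], str]):
    """Step 2: ask for one product from the chosen categories and validate it."""
    candidates = [p for c in choice.categories for p in CATALOG[c] if p not in cart]
    prompt = "\n".join([
        "Cart: " + ", ".join(cart),
        "Candidates: " + ", ".join(candidates),
        'Reply with JSON only: {"product": "..."}',
    ])
    raw = json.loads(call_llm(prompt))
    if raw["product"] not in candidates:
        raise ValueError(f"model picked an unknown product: {raw['product']!r}")
    return ProductChoice(raw["product"])


def fake_llm(prompt):
    # Canned replies so the sketch runs offline; "Starters" is not in CATALOG.
    if prompt.splitlines()[1].startswith("Categories:"):
        return '{"categories": ["Drinks", "Desserts", "Starters"]}'
    return '{"product": "Tiramisu"}'


choice = pick_categories(["Margherita Pizza"], fake_llm)
recommendation = pick_product(["Margherita Pizza"], choice, fake_llm)
```

Checking each reply against the catalog in code means an instruction the model ignores shows up as an explicit error rather than as a silently bad recommendation.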
Approach 2: RAG
Step 1: Create 1 doc per product with the name, description and category as text, and also add the category as metadata.
Step 2: Calculate embeddings and set up an in-memory (RAM) vector store.
Step 3: Ask the model to recommend the best product that complements the user cart, based on the instructions and the user cart.
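The retrieval part of Approach 2 can be sketched roughly as follows. This is a simplified illustration, not the production code: the product data, `Doc`, `InMemoryStore` and the `exclude_categories` filter are all hypothetical, and a toy word-count vector stands in for a real embedding model so the example runs with the standard library only.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    text: str       # name, description and category as text
    metadata: dict  # category repeated here as metadata


def embed(text):
    # Toy stand-in for a real embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Minimal in-memory vector store with a metadata filter."""

    def __init__(self):
        self._rows = []

    def add(self, doc):
        self._rows.append((embed(doc.text), doc))

    def search(self, query, k=3, exclude_categories=()):
        q = embed(query)
        scored = [
            (cosine(q, vec), doc)
            for vec, doc in self._rows
            if doc.metadata["category"] not in exclude_categories
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


# Hypothetical products: (name, description, category).
PRODUCTS = [
    ("Margherita Pizza", "Classic tomato and mozzarella pizza", "Pizza"),
    ("Pepperoni Pizza", "Spicy salami pizza", "Pizza"),
    ("Cola", "Cold soft drink that goes well with pizza", "Drinks"),
    ("Tiramisu", "Italian coffee dessert", "Desserts"),
]

store = InMemoryStore()
for name, description, category in PRODUCTS:
    store.add(Doc(f"{name}. {description}. Category: {category}", {"category": category}))

plain = store.search("Margherita Pizza", k=1)
filtered = store.search("Margherita Pizza", k=1, exclude_categories=("Pizza",))
```

With the cart used directly as the query, the closest document in this toy data is the pizza that is already in the cart, since similarity search returns similar items rather than complementary ones. Filtering on the category metadata is one possible way to steer retrieval toward other categories before Step 3.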