I have a question that I hope some of you might be able to help with. We're currently building a chatbot for a client (a governmental entity) that helps entrepreneurs with setting up their businesses, financing, etc. So far, everything has been going great, and our Retrieval Augmented Generation (RAG) system, built with LlamaIndex, allows our LLM to provide extremely relevant solutions. However, we've run into an issue with how links are retrieved and displayed.
The bot consistently proposes the right solution, but the link it attaches is often a hallucinated URL. When we use only about 50 to 60 links, the results are perfect. But as soon as we expand to 500 links, we keep running into this issue.
Does anyone have any ideas on how to optimize URL retrieval to exclusively use the URLs provided in our database?
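To make the requirement concrete, here is a minimal sketch of the kind of check we have in mind: scan the generated answer for URLs and flag any that are not in our database. It only detects hallucinated links after generation; it does not prevent them. The names `ALLOWED_URLS`, `normalize`, and `find_unknown_urls` and the `example.gov` addresses are hypothetical placeholders, not part of LlamaIndex or our actual setup.

```python
import re

# Hypothetical allowlist; in practice this would be loaded from the link database.
ALLOWED_URLS = {
    "https://example.gov/start-a-business",
    "https://example.gov/financing",
}

# Rough URL matcher: stops at whitespace, quotes, angle brackets and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def normalize(url: str) -> str:
    """Drop trailing sentence punctuation and a trailing slash before comparing."""
    return url.rstrip(".,;:!?").rstrip("/")


def find_unknown_urls(answer: str, allowed: set[str]) -> list[str]:
    """Return the URLs found in `answer` that are not in the allowlist."""
    allowed_normalized = {normalize(u) for u in allowed}
    found = [normalize(match) for match in URL_PATTERN.findall(answer)]
    return [url for url in found if url not in allowed_normalized]


if __name__ == "__main__":
    answer = (
        "You can apply here: https://example.gov/financing. "
        "More details at https://example.gov/made-up-page"
    )
    # Any URL reported here did not come from the database.
    print(find_unknown_urls(answer, ALLOWED_URLS))
```

A check like this could be used to strip or replace unknown links before the answer is shown, but we would prefer a way to stop the model from producing them in the first place.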
Thank you in advance for any insights or suggestions you might have! Have a good day.