Hey everyone, I have around 20k PDF files of 2-4 pages each. I am using MetadataExtractor to extract metadata with the Llama 2 LLM, but my kernel crashed. How can I resolve this?
"What are the key distinctions between using 'as_chat_engine' and 'as_query_engine' in terms of the responses they generate when the same question is asked?"
How can I pass the key and value to ExactMatchFilter dynamically? Meaning, if a user is using my RAG chatbot, how can I define a key and value that are relevant to their query?
I have implemented RAG using llama-cpp-python with the Mistral 7B OpenOrca model, but the response time is too high, even though the API is hosted on a server with 2 NVIDIA RTX A4000 GPUs. Can someone help me out?
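A common cause of this symptom is that llama-cpp-python runs on CPU by default unless layers are explicitly offloaded. This sketch shows the relevant `llama_cpp.Llama` constructor knobs (parameter names are from llama-cpp-python; the values and the `load_model` wrapper are illustrative):

```python
def load_model(model_path: str):
    """Sketch: load a GGUF model with GPU offload enabled.
    Without n_gpu_layers, inference stays on the CPU even on a GPU server."""
    from llama_cpp import Llama  # local import; requires a CUDA-enabled build
    return Llama(
        model_path=model_path,
        n_gpu_layers=-1,  # offload all layers to the GPU (0 = CPU-only, the default)
        n_ctx=4096,       # context window; larger windows cost more per token
        n_batch=512,      # prompt-processing batch size
    )
```

Note that `n_gpu_layers` only has an effect if the wheel was built with CUDA support; a plain `pip install llama-cpp-python` gives a CPU-only build.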
I have developed a RAG system and now I want to integrate it into my website as a chatbot. What are the ways I can do that? For example, I am thinking of creating an API in Django, but the problem is: how can I return the response in streaming mode?
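As a sketch of what streaming from Django could look like: `StreamingHttpResponse` is Django's API for this, and the token generator below is a placeholder for wherever the RAG engine's streamed tokens come from (`answer_tokens` and `sse_events` are illustrative names, not Django or llama-index APIs):

```python
def chat_view(request):
    """Sketch of a Django view that streams the answer as Server-Sent Events."""
    from django.http import StreamingHttpResponse  # local import for the sketch
    question = request.GET.get("q", "")
    return StreamingHttpResponse(
        sse_events(answer_tokens(question)),
        content_type="text/event-stream",
    )

def answer_tokens(question):
    """Placeholder generator: swap in the RAG engine's streaming response,
    e.g. iterating over a streamed query result token by token."""
    for token in ["This ", "is ", "a ", "streamed ", "answer."]:
        yield token

def sse_events(tokens):
    """Wrap each token as a Server-Sent Events 'data:' frame so a browser
    EventSource (or fetch reader) can consume it incrementally."""
    for t in tokens:
        yield f"data: {t}\n\n"
```

Because `StreamingHttpResponse` takes any iterator, the view starts sending frames as soon as the first token is yielded instead of waiting for the full answer.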
Hello folks, I am new to RAG, so can you help me out? I have built a RAG system using only 2 text files and a Weaviate vector database. I am getting pretty decent responses, but it usually takes approx. 2 minutes to get the whole response. What should I do to decrease the response time? I need to scale this system to 1000k text files.