Hi, I am new to LlamaIndex. I am trying to build a chatbot with Python and Streamlit, based on this tutorial:
https://blog.streamlit.io/build-a-chatbot-with-custom-data-sources-powered-by-llamaindex/
That worked fine, but I wanted to make an app where I could chat with several PDFs:
https://esmo2022-abstracts-gastro.streamlit.app/
Whenever I ask it general questions about the titles of abstracts in my PDFs, like "Tell me some titles of abstracts", I get the response "Abstract titles are not provided in the given context information." The phrase "not provided in the given context" comes up repeatedly whenever I don't ask very specific questions about the abstracts in my PDF documents.
Am I not providing it enough data? Or is the PDF parsing not indexing the data correctly?
Here is my code:
import streamlit as st
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import OpenAI

def load_data():
    # Read every file under ./ESMO_abstracts (including subfolders) and build a vector index.
    with st.spinner(text="Loading and indexing abstracts! This should take 1-2 minutes."):
        reader = SimpleDirectoryReader(input_dir="./ESMO_abstracts", recursive=True)
        docs = reader.load_data()
        service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo", temperature=0.5, system_prompt="You are an expert on ESMO 24th World Congress on Gastrointestinal Cancer 2022 abstracts and your job is to answer technical questions. Assume that all questions are related to ESMO 24th World Congress on Gastrointestinal Cancer 2022 abstracts. Keep your answers technical and based on facts - do not hallucinate features."))
        index = VectorStoreIndex.from_documents(docs, service_context=service_context)
        return index

index = load_data()
chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)
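
To make my suspicion concrete: as far as I understand, a vector index does not hand the whole corpus to the LLM. It only passes the few chunks most similar to the question (I believe the LlamaIndex default is 2, but treat that as my assumption). Below is a toy sketch of that idea in plain Python. It is not LlamaIndex's implementation; the bag-of-words `embed` function and the placeholder chunks are made up purely for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real index uses a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, top_k=2):
    """Return only the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

# Hypothetical placeholder chunks standing in for pieces of the parsed PDFs.
chunks = [
    "placeholder abstract one about colorectal screening outcomes",
    "placeholder abstract two about gastric surgery complications",
    "placeholder abstract three about pancreatic imaging methods",
    "placeholder abstract four about hepatic chemotherapy dosing",
]

# A specific question overlaps strongly with one chunk, so that chunk ranks first.
specific = retrieve("gastric surgery complications", chunks)

# A broad question matches nothing in particular, yet still only top_k chunks
# would be passed on as context, never the full set of abstracts.
broad = retrieve("Tell me some titles of abstracts", chunks)
```

If this is what is going on, a broad question like mine would only ever see a couple of chunks, which might explain the "not provided in the given context" answers. I don't know yet whether that is the actual cause in my app.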