Reading docs

I feel like when I load my files via a SimpleDirectoryReader, the bot I'm making isn't taking hold of the information well. I'm currently moving over from a purely OpenAI-based system to a llama-index/LangChain OpenAI system, because I need to access a Read the Docs wiki (which I'm currently doing via local files). The wiki explains how to create various formats of JSON for a data-driven system called powers, but when I ask this new system to create a specific power, it creates a bunch of fields and names that don't exist on the wiki. I didn't have this issue on my old system, despite it having the same settings AFAIK. Could anyone help me figure out why it's hallucinating these made-up fields?
16 comments
So you've indexed your documentation, and want the model to read the documentation and then generate some example using info in the docs. Is that a good summary?

What does your current setup look like? (LLM settings, index type, any other info that might be helpful)
Yeah, that summary is good. I'm pretty much brand new to this whole system.

Essentially, the goal is that a user specifies a "power" they want to make, and the chatbot creates the power, along with a description and a name, all in JSON format like so:

Plain Text
{
  "text": "A power that does...",
  "json": {
    "__comment": "power json here"
  },
  "power_name": "a name for the power, in snake_case"
}

I would then parse the JSON to create a nicer-looking output for the user.
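That parsing step can be sketched in a few lines, assuming the reply follows the schema above (the `format_power` helper and the dummy reply are made up for illustration):

```python
import json

def format_power(raw_response: str) -> str:
    """Parse the model's JSON reply and build a nicer display string.

    Assumes the reply follows the schema above:
    {"text": ..., "json": {...}, "power_name": ...}
    """
    data = json.loads(raw_response)
    name = data["power_name"]
    description = data["text"]
    power_json = json.dumps(data["json"], indent=2)
    return f"**{name}**\n{description}\n```json\n{power_json}\n```"

# Example with a dummy reply:
reply = (
    '{"text": "A power that does X.", '
    '"json": {"type": "example"}, '
    '"power_name": "example_power"}'
)
print(format_power(reply))
```

Note that `json.loads` will raise a `JSONDecodeError` if the model wraps its reply in extra prose, which is another reason to enforce the output structure via the prompt.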

My current mockup looks like this, based on some examples I saw, plus a way to save the index data so it won't have to load it every time.
[Attachment: image.png]
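The "save it so it won't have to load every time" idea is a build-once, load-from-disk cache. A generic sketch (the `build_index` body and the cache path are placeholders; in the real setup that step would be the SimpleDirectoryReader/index construction):

```python
import json
import os

INDEX_PATH = "index_cache.json"  # made-up cache location

def build_index() -> dict:
    # Placeholder for the expensive step, e.g. reading the docs
    # with SimpleDirectoryReader and building the index.
    return {"docs": ["power documentation chunks go here"]}

def load_or_build_index() -> dict:
    """Load the saved index data if present; otherwise build and save it."""
    if os.path.exists(INDEX_PATH):
        with open(INDEX_PATH) as f:
            return json.load(f)
    index = build_index()
    with open(INDEX_PATH, "w") as f:
        json.dump(index, f)
    return index
```

The first call pays the build cost and writes the file; later runs just read it back.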
Here is an example of what it outputs. I've added some notes on what's wrong with the output: two fields are there that don't exist in the docs, and the type field is incorrect.
[Attachment: image.png]
Hmm I see I see!

Three things that stick out to me:

  1. Maybe use temperature=0 so the output is a bit more predictable, which may limit the hallucinations.
  2. You might have more luck using a system message to enforce the output structure (see the attached image for an example).
  3. Maybe try increasing the chunk size, or setting similarity_top_k=3 or similar in the query call. By default the top_k is 1, so it might not be seeing the documentation it needs.
[Attachment: image.png]
Ah! I was wondering how to do system prompts! That should help a lot; my old system was indeed using system prompts.
I've added that system prompt and now it seems to be generating it correctly! I will still try the other things you advised, along with some other prompts, to verify, but thank you so much for the help <3
[Attachment: image.png]
Amazing! πŸ‘ happy to help! πŸ’ͺ
Just one last thing: is it possible to use gpt-4 with this system? I've attempted to use it, but it just said model not found. Wondering if that's on me or not :P
It's possible, assuming your account has access (I can't remember if they are using a waitlist still)

You can do something like this:

ChatGPTLLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))
That's what I had done... weird. I certainly have access; I used gpt-4 in my old system. I'll give it another go to double-check.
Yeah, I'm still getting the error O_o
[Attachment: image.png]
And I'm certain I'm using the correct API key too (I have a personal key I use for 3.5-turbo, and an organisation key I use for 4). It's strange.
Maybe you need to update openai? `pip install --upgrade openai`
Just upgraded from 0.27.2 to 0.27.3, with no change in the output.
I'm not sure what the issue is then :( That's exactly how the examples in the repo do it too (i.e. https://github.com/jerryjliu/llama_index/blob/main/examples/test_wiki/TestNYC-Tree-GPT4.ipynb)
That's why I thought it was a me thing, because I was SURE I was following everything I saw correctly. I guess something's just being weird for me. Thank you for the help again, I really appreciate it!