Hi! I'm following this tutorial: https://github.com/jerryjliu/llama_index/blob/main/docs/guides/tutorials/building_a_chatbot.md

and I have 2 questions:

  1. Where do I embed a prompt specifying that I want the final output in HTML + Markdown?
  2. How do I get the source documents that were used as input for the LLM?
  1. You can try just appending to the initial query text -- something like "What did the author do growing up? Format your response using Markdown". If that doesn't work, you can modify the internal prompts: https://gpt-index.readthedocs.io/en/latest/how_to/customization/custom_prompts.html
  2. response = index.query(...) and then check response.source_nodes
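Putting both together, roughly (a minimal sketch, assuming the index.query(...) API mentioned above):

Plain Text
# minimal sketch -- assumes `index` was built as in the tutorial and exposes .query()
response = index.query(
    "What did the author do growing up? Format your response using Markdown."
)
print(str(response))           # the answer text
print(response.source_nodes)   # the source chunks that were given to the LLM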
@Logan M Hi! The first approach did not work. Sorry, the examples you've provided use the query method on an index, but there is no such thing in the tutorial. Which part of the tutorial do I need to expand/modify in order to use index.query instead of what is shown there?
ohhh right right it's using langchain on the frontend πŸ€”

Getting the sources is a little trickier in that case, since that tutorial uses a lot of wrapper classes.

You could skip the wrapper functions that llama_index provides and directly create the Tool for langchain + llama index, like this example: https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

Then, in the func for the tool, you can pass your own function that grabs/logs the source nodes instead of a lambda
@Logan M hm.. The chatbot tutorial uses a graph and configs (e.g. use Index-1 if you need A, use Index-2 if you need B, use the Graph if you need to analyze A and B).

Considering I have:

  1. Index A - use when querying about A
  2. Index B - use when querying about B
  3. Graph - use when querying about both A and B
Implementation-wise, do I need to add three Tools here:

Plain Text
tools = [
    Tool(
        name = "GPT Index",
        func=lambda q: str(index.query(q)),
        description="useful for when you want to answer questions about the author. The input to this tool should be a complete english sentence.",
        return_direct=True
    ),
]


And use list_index.query(q) when I want to query the graph, correct?
yea exactly! So since you have three indexes, you can make a tool for each, and the func will call query on the appropriate index/graph. And of course, the descriptions should match the index πŸ’ͺ
Sorry, I think I need to use graph.query, not list_index.query, right?
Yea that's right.

Here's a super condensed example

Plain Text
tools = [
    Tool(
        name = "Index A",
        func=lambda q: str(index_a.query(q)),
        description="useful for when you want to answer questions about A.",
        return_direct=True
    ),
    Tool(
        name = "Index B",
        func=lambda q: str(index_b.query(q)),
        description="useful for when you want to answer questions about B.",
        return_direct=True
    ),
    Tool(
        name = "Graph AB",
        func=lambda q: str(graph.query(q)),
        description="useful for when you want to answer questions that need info on both A and B",
        return_direct=True
    )
]
Then, instead of using a lambda, you can pass in a function that calls query and logs the source nodes somewhere

Plain Text
def log_and_query_a(prompt):
  response = index_a.query(prompt)
  print(response.source_nodes)
  return str(response)

...
  func=log_and_query_a,
...
Really lets you define how things work and what goes on πŸ™‚
@Logan M Thank you! One more thing: who decides when to use which tool in the example you've sent? https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

Does the agent_chain = initialize_agent part and everything after it stay the same, or do I need to change anything there too?
If, say, "router" won't work as expected
I think the agent decides, based on the description argument, doesn't it?
Exactly haha beat me to it
So you'll want to make sure you write some good descriptions. Sometimes you have to get a little creative haha
@Logan M I can't get the same logs as in the tutorial: https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

even though I specified:

Plain Text
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
[Attachment: image.png]
Ok, I needed to specify verbose=True in the initialize_agent function
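For reference, a minimal sketch of what that call might look like (assuming the tools, llm, and memory objects defined earlier; the agent string is the conversational one used in the chatbot tutorial):

Plain Text
# minimal sketch -- assumes `tools`, `llm`, and `memory` are already defined as above
agent_chain = initialize_agent(
    tools,
    llm,
    agent="conversational-react-description",
    memory=memory,
    verbose=True,  # prints the agent's reasoning and tool selection to stdout
)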
@Logan M Ok, so a quick demo showed that the chat tutorial works much better than https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

At least in my particular case (the same questions were answered well with the chat example)
[Attachment: image.png]
Huh, that's super weird πŸ€”

When you create the agent with initialize_agent, try setting a slightly different react mode, maybe "zero-shot-react-description" instead of "conversational-react-description"
That's the example while using the chat demo
[Attachment: image.png]
Alright, will try. Also, in the chat example they have some extra stuff, e.g. query_configs, graph_config, toolkit. I think those configs play a huge role in the pipeline. I'm wondering where I need to plug those in

https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

πŸ€”
For querying the graph, you'll still want to use the query configs https://gpt-index.readthedocs.io/en/latest/how_to/index_structs/composability.html#querying-the-graph

For the two other indexes, you can pass in whichever options you want in the query() call itself (i.e. similarity_top_k, response_mode, etc.).

What did your original config look like for the original demo? I can help copy those settings over if you run into trouble
(Those wrappers in the original tutorial really hide a lot of these details lol)
@Logan M Sure, here they are:

Plain Text
    query_configs = [
        {
            "index_struct_type": "simple_dict",
            "query_mode": "default",
            "query_kwargs": {
                "similarity_top_k": 1,
                # "include_summary": True
            },
            "query_transform": decompose_transform
        },
        {
            "index_struct_type": "list",
            "query_mode": "default",
            "query_kwargs": {
                "response_mode": "tree_summarize",
                "verbose": True
            }
        },
    ]


Plain Text
    # graph config
    graph_config = GraphToolConfig(
        graph=graph,
        name=f"Graph Index",
        description="use this tool when you want to answer queries that require analyzing multiple documents/companies/reports. If they mention comparing more than one source, use this tool.",
        query_configs=query_configs,
        tool_kwargs={"return_direct": True}
    )


Plain Text
    index_configs = []
    for k, v in index_set.items():
        name = k.split("$$$")
        company_name, document_type, document_year = name[0], name[1], name[2]
        tool_config = IndexToolConfig(
            index=v,
            name=f"Vector Index {k}",
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            index_query_kwargs={"similarity_top_k": 3},
            tool_kwargs={"return_direct": True}
        )
        index_configs.append(tool_config)
    toolkit = LlamaToolkit(
        index_configs=index_configs,
        graph_configs=[graph_config]
    )

    agent_chain = create_llama_chat_agent(
        toolkit,
        llm,
        memory=memory,
        verbose=True
    )
Ok, so then we can move those configs over pretty easily!
I didn't change much from the original tutorial (cuz I don't really understand 100% what each of those does lol)
Plain Text
graph.query(..., query_configs=query_configs)  # use the same query configs object you pasted above

vector_index.query(..., similarity_top_k=3)
that should give the same settings as the original code, I think lol
ohhh I see you have a query transform too
I think it will still work actually, nvm (the query transform is for the graph, and we are still passing that config in)
That's awesome! Let me try! πŸ™
@Logan M Well...
[Attachment: image.png]
That seems sus hahaha
Maybe go back to the conversational react option lol
But now the index configs are the same at least
@Logan M Yeah, it is using the conversational react option now; I tried changing to zero shot, but the result is still wrong
@Logan M Maybe I missed something here. Could you take a look, pls?
@Logan M Can I DM you?
How do I pass a prompt for the final answer?
Plain Text
        prompt_template = """
        Having the text below, beautify it with markdown and wrap with html tags (If necessary): 
        {response}
        """
        response = index_set[index_name].query(text, similarity_top_k=3)


I want to plug the prompt into the query
I think you might just want to send this to openAI directly, after llama index gives an answer πŸ€”
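Something like this, roughly (a minimal sketch of that idea using the pre-1.0 openai client; the model name is just a placeholder, and index_set/index_name/text are the names from the snippet above):

Plain Text
import openai

# 1) get the raw answer from llama_index
response = index_set[index_name].query(text, similarity_top_k=3)

# 2) one extra OpenAI call to reformat it (this is the added cost mentioned below)
beautify_prompt = (
    "Having the text below, beautify it with markdown and wrap with html tags (if necessary):\n"
    f"{str(response)}"
)
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": beautify_prompt}],
)
formatted_answer = completion["choices"][0]["message"]["content"]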
Yes, currently I'm doing so. But it will increase the cost a bit πŸ™‚

I was wondering if there is a way to wrap the whole thing that's being passed to OpenAI with this template, so that everything is sent only once
Are you using ChatGPT? One option: it's possible to add a system message to every prompt, so that you can tell it to output in Markdown every time
[Attachment: image.png]
Emmm.... no?? But I thought I was:

Plain Text
llm = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY, max_tokens=512)
llm_predictor = LLMPredictor(llm=llm)
Am I using ChatGPT in this case?
I think so!

In my screenshot, there's another LLM class you can import that will allow you to prepend messages, like a system message telling it to always respond in valid Markdown syntax
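Roughly like this (a minimal sketch -- assuming the class in the screenshot is langchain's OpenAIChat, whose prefix_messages list gets prepended to every request):

Plain Text
from langchain.llms import OpenAIChat
from llama_index import LLMPredictor

# sketch only: OpenAIChat accepts prefix_messages, a list of {"role", "content"}
# dicts that are sent before every user prompt
llm = OpenAIChat(
    temperature=0,
    openai_api_key=OPENAI_API_KEY,
    max_tokens=512,
    prefix_messages=[
        {"role": "system", "content": "Always format your response using valid Markdown."}
    ],
)
llm_predictor = LLMPredictor(llm=llm)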
Wow! Let me try πŸ‘
actually, when I use it, the final output is wrong
Now it is spitting out a correct answer. Btw, the pirate talk worked only on the first question:
[Attachment: image.png]
Oh I see. This is because I'm using the predictor in my queryFunc
It is only called when a tool is used
By the way, response = agent_chain.run(input=text) returns a string. Is there a way to also get information about which tool was used (or whether none was used at all)?
I'm not sure about that πŸ€” would have to read some more of the langchain docs
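(One thing that may be worth looking for in those docs is the agent's return_intermediate_steps option -- a rough sketch, not something confirmed in this thread:)

Plain Text
# rough sketch, not confirmed here: create the agent with return_intermediate_steps=True
# and call it with a dict instead of .run() to get the tool steps back
agent_chain = initialize_agent(
    tools, llm, agent="conversational-react-description",
    memory=memory, verbose=True, return_intermediate_steps=True,
)
result = agent_chain({"input": text})
print(result["output"])              # the final answer string
print(result["intermediate_steps"])  # (action, observation) pairs show which tool ran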