Hi! I'm following this tutorial: https://github.com/jerryjliu/llama_index/blob/main/docs/guides/tutorials/building_a_chatbot.md

and I have 2 questions:

  1. Where do I embed a prompt specifying that I want the final output in HTML + Markdown?
  2. How do I get the source documents that were used as input for the LLM?
  1. You can try just appending to the initial query text -- something like "What did the author do growing up? Format your response using Markdown". If that doesn't work, you can modify the internal prompts: https://gpt-index.readthedocs.io/en/latest/how_to/customization/custom_prompts.html
  2. response = index.query(...) and then check response.source_nodes
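Putting both together, roughly (a minimal sketch, assuming the index.query(...) API mentioned above):

Plain Text
# minimal sketch -- assumes `index` was built as in the tutorial and exposes .query()
response = index.query(
    "What did the author do growing up? Format your response using Markdown."
)
print(str(response))           # the answer text
print(response.source_nodes)   # the source chunks that were given to the LLM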
@Logan M Hi! The first approach did not work. Sorry, the examples you've provided use the query method on an index, but there is no such thing in the tutorial. Which part of the tutorial do I need to expand/modify in order to use index.query instead of what is shown there?
ohhh right right it's using langchain on the frontend πŸ€”

Getting the sources is a little trickier in that case, since that tutorial uses a lot of wrapper classes.

You could skip the wrapper functions that llama_index provides and directly create the Tool for langchain + llama index, like this example: https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

Then, in the func for the tool, you can pass your own function that grabs/logs the source nodes instead of a lambda
@Logan M hm.. The chatbot tutorial uses a graph and configs (e.g. use Index-1 if you need A, use Index-2 if you need B, use the Graph if you need to analyze A and B).

Considering I have:

  1. Index A - use when querying about A
  2. Index B - use when querying about B
  3. Graph - use when querying about both A and B
Implementation-wise, do I need to add three Tools here:

Plain Text
tools = [
    Tool(
        name = "GPT Index",
        func=lambda q: str(index.query(q)),
        description="useful for when you want to answer questions about the author. The input to this tool should be a complete english sentence.",
        return_direct=True
    ),
]


And use list_index.query(q) when I want to query the graph, correct?
yea exactly! So since you have three indexes, you can make a tool for each, and the func will call query on the appropriate index/graph. And of course, the descriptions should match the index πŸ’ͺ
Sorry, I think I need to use graph.query, not list_index.query, right?
Yea that's right.

Here's a super condensed example

Plain Text
tools = [
    Tool(
        name = "Index A",
        func=lambda q: str(index_a.query(q)),
        description="useful for when you want to answer questions about A.",
        return_direct=True
    ),
    Tool(
        name = "Index B",
        func=lambda q: str(index_b.query(q)),
        description="useful for when you want to answer questions about B.",
        return_direct=True
    ),
    Tool(
        name = "Graph AB",
        func=lambda q: str(graph.query(q)),
        description="useful for when you want to answer questions that need info on both A and B",
        return_direct=True
    )
]
Then, instead of using a lambda, you can pass in a function that calls query and logs the source nodes somewhere

Plain Text
def log_and_query_a(prompt):
  response = index_a.query(prompt)
  print(response.source_nodes)
  return str(response)

...
  func=log_and_query_a,
...
Really lets you define how things work and what goes on πŸ™‚
@Logan M Thank you! One more thing: who decides when to use which tool in the example you've sent? https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

Does the agent_chain = initialize_agent part and everything after it stay the same, or do I need to change anything there too?
If, say, "router" won't work as expected
I think the agent decides, based on the description argument, doesn't it?
Exactly haha beat me to it
So you'll want to make sure you write some good descriptions. Sometimes you have to get a little creative haha
@Logan M I can't get the same logs as in the tutorial: https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

even though I specified:

Plain Text
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
[Attachment: image.png]
Ok, I needed to specify verbose=True in the initialize_agent function
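For reference, a minimal sketch of what that call might look like (assuming the tools, llm, and memory objects defined earlier; the agent string is the conversational one used in the chatbot tutorial):

Plain Text
# minimal sketch -- assumes `tools`, `llm`, and `memory` are already defined as above
agent_chain = initialize_agent(
    tools,
    llm,
    agent="conversational-react-description",
    memory=memory,
    verbose=True,  # prints the agent's reasoning and tool selection to stdout
)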
@Logan M Ok, so a quick demo showed that the chat tutorial works much better than https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

At least in my particular case (the same questions were answered well with the chat example)
[Attachment: image.png]
Huh, that's super weird πŸ€”

When you create the agent with initialize_agent, try setting a slightly different react mode, maybe "zero-shot-react-description" instead of "conversational-react-description"
That's the example while using the chat demo
[Attachment: image.png]
Alright, will try. Also, in the chat example they have some extra stuff, e.g. query_configs, graph_config, toolkit. I think those configs play a huge role in the pipeline. I'm wondering where I need to plug those in

https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb

πŸ€”
For querying the graph, you'll still want to use the query configs https://gpt-index.readthedocs.io/en/latest/how_to/index_structs/composability.html#querying-the-graph

For the two other indexes, you can pass in whichever options you want in the query() call itself (i.e. similarity_top_k, response_mode, etc.).

What did your original config look like for the original demo? I can help copy those settings over if you run into trouble
(Those wrappers in the original tutorial really hide a lot of these details lol)
@Logan M Sure, here they are:

Plain Text
    query_configs = [
        {
            "index_struct_type": "simple_dict",
            "query_mode": "default",
            "query_kwargs": {
                "similarity_top_k": 1,
                # "include_summary": True
            },
            "query_transform": decompose_transform
        },
        {
            "index_struct_type": "list",
            "query_mode": "default",
            "query_kwargs": {
                "response_mode": "tree_summarize",
                "verbose": True
            }
        },
    ]


Plain Text
    # graph config
    graph_config = GraphToolConfig(
        graph=graph,
        name=f"Graph Index",
        description="use this tool when you want to answer queries that require analyzing multiple documents/companies/reports. If they mention comparing more than one source, use this tool.",
        query_configs=query_configs,
        tool_kwargs={"return_direct": True}
    )


Plain Text
    index_configs = []
    for k, v in index_set.items():
        name = k.split("$$$")
        company_name, document_type, document_year = name[0], name[1], name[2]
        tool_config = IndexToolConfig(
            index=v,
            name=f"Vector Index {k}",
            description=f"use this tool to answer questions solely about {company_name} {document_type}, year {document_year}. Do not use this tool for comparison with other documents/companies/reports.",
            index_query_kwargs={"similarity_top_k": 3},
            tool_kwargs={"return_direct": True}
        )
        index_configs.append(tool_config)
    toolkit = LlamaToolkit(
        index_configs=index_configs,
        graph_configs=[graph_config]
    )

    agent_chain = create_llama_chat_agent(
        toolkit,
        llm,
        memory=memory,
        verbose=True
    )
Ok, so then we can move those configs over pretty easily!
I didn't change much from the original tutorial (cuz I don't really understand 100% what each of those does lol)
Plain Text
graph.query(..., query_configs=query_configs)  # use the same query configs object you pasted above

vector_index.query(..., similarity_top_k=3)
that should give the same settings as the original code, I think lol
ohhh I see you have a query transform too
I think it will still work actually, nvm (the query transform is for the graph, and we are still passing that config in)
That's awesome! Let me try! πŸ™
@Logan M Well...
[Attachment: image.png]
That seems sus hahaha
Maybe go back to the conversational react option lol
But now the index configs are the same at least
@Logan M Yeah, it is using the conversational react option now; I tried changing to zero shot, but the result is still wrong
@Logan M Maybe I missed something here. Could you take a look, pls?
@Logan M Can I DM you?
How do I pass a prompt for the final answer?
Plain Text
        prompt_template = """
        Having the text below, beautify it with markdown and wrap with html tags (If necessary): 
        {response}
        """
        response = index_set[index_name].query(text, similarity_top_k=3)


I want to plug the prompt into the query
I think you might just want to send this to openAI directly, after llama index gives an answer πŸ€”
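Something like this, roughly (a minimal sketch of that idea using the pre-1.0 openai client; the model name is just a placeholder, and index_set/index_name/text are the names from the snippet above):

Plain Text
import openai

# 1) get the raw answer from llama_index
response = index_set[index_name].query(text, similarity_top_k=3)

# 2) one extra OpenAI call to reformat it (this is the added cost mentioned below)
beautify_prompt = (
    "Having the text below, beautify it with markdown and wrap with html tags (if necessary):\n"
    f"{str(response)}"
)
completion = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": beautify_prompt}],
)
formatted_answer = completion["choices"][0]["message"]["content"]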
Yes, currently I'm doing so. But it will increase the cost a bit πŸ™‚

I was wondering if there is a way to wrap the whole thing that's being passed to OpenAI with this template, so that everything is sent only once
Are you using ChatGPT? One option: it's possible to add a system message to every prompt, so that you can tell it to output in Markdown every time
[Attachment: image.png]
Emmm.... no?? But I thought I was:

Plain Text
llm = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY, max_tokens=512)
llm_predictor = LLMPredictor(llm=llm)
Am I using ChatGPT in this case?
I think so!

In my screenshot, there's another LLM class you can import that will allow you to prepend messages, like a system message telling it to always respond in valid Markdown syntax
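Roughly like this (a minimal sketch -- assuming the class in the screenshot is langchain's OpenAIChat, whose prefix_messages list gets prepended to every request):

Plain Text
from langchain.llms import OpenAIChat
from llama_index import LLMPredictor

# sketch only: OpenAIChat accepts prefix_messages, a list of {"role", "content"}
# dicts that are sent before every user prompt
llm = OpenAIChat(
    temperature=0,
    openai_api_key=OPENAI_API_KEY,
    max_tokens=512,
    prefix_messages=[
        {"role": "system", "content": "Always format your response using valid Markdown."}
    ],
)
llm_predictor = LLMPredictor(llm=llm)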
Wow! Let me try πŸ‘
actually, when I use it, the final output is wrong
Now it is spitting out a correct answer. Btw, the pirate talk worked only on the first question:
[Attachment: image.png]
Oh I see. This is because I'm using the predictor in my queryFunc
It is only called when a tool is used
By the way, response = agent_chain.run(input=text) returns a string. Is there a way to also get information about which tool was used (or whether none was used at all)?
I'm not sure about that πŸ€” would have to read some more of the langchain docs
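(One thing that may be worth looking for in those docs is the agent's return_intermediate_steps option -- a rough sketch, not something confirmed in this thread:)

Plain Text
# rough sketch, not confirmed here: create the agent with return_intermediate_steps=True
# and call it with a dict instead of .run() to get the tool steps back
agent_chain = initialize_agent(
    tools, llm, agent="conversational-react-description",
    memory=memory, verbose=True, return_intermediate_steps=True,
)
result = agent_chain({"input": text})
print(result["output"])              # the final answer string
print(result["intermediate_steps"])  # (action, observation) pairs show which tool ran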