Yea you'll either want to customize the prompt templates, or you can try using this (slightly beta) feature that can prepend a system message for you (it has chatgpt in the name, but you can pass in the same gpt-4 llm as a kwarg)
aha, let me try it immediately
it didn't work - here is what I have got so far
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from llama_index import ServiceContext
from llama_index.langchain_helpers.agents import create_llama_chat_agent

memory = ConversationBufferMemory(memory_key="chat_history", ai_prefix=system_message)
llm = ChatOpenAI(temperature=0, model_name="gpt-4")
agent_chain = create_llama_chat_agent(
    toolkit,
    llm,
    memory=memory,
    verbose=True,
    agent_kwargs={"prefix": system_message},
)
# LLM Predictor (gpt-3.5-turbo / gpt-4) + service context
from llama_index.llm_predictor.chatgpt import ChatGPTLLMPredictor # use ChatGPT [beta] (unsure)
from langchain.prompts.chat import SystemMessagePromptTemplate
# add a system message
prepend_messages = [SystemMessagePromptTemplate.from_template(system_message)]
#llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-4"))
llm_predictor = ChatGPTLLMPredictor(prepend_messages=prepend_messages)
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
# define a decompose transform
from llama_index.indices.query.query_transform.base import DecomposeQueryTransform
decompose_transform = DecomposeQueryTransform(
    llm_predictor=llm_predictor, verbose=True
)

# define query configs for graph
query_configs = [
    {
        "index_struct_type": "simple_dict",
        "query_mode": "default",
        "query_kwargs": {
            "similarity_top_k": 1,
            # "include_summary": True
        },
        "query_transform": decompose_transform,
    },
    {
        "index_struct_type": "list",
        "query_mode": "default",
        "query_kwargs": {
            "response_mode": "tree_summarize",
            "verbose": True,
        },
    },
]
when no tool is used it works (since I am also sending the system message in the question), but when it goes to the Product Index, it doesn't
How do you have the tools set up? The initial configs look fine
from llama_index.langchain_helpers.agents import (
    GraphToolConfig,
    IndexToolConfig,
    LlamaToolkit,
)

# product index config
tool_config = IndexToolConfig(
    index=product_index,
    name="Product Index",
    description="useful every time you want to answer queries about products",
    index_query_kwargs={"similarity_top_k": 2},
    tool_kwargs={"return_direct": True, "return_sources": True},
)
index_configs = [tool_config]

# graph config
graph_config = GraphToolConfig(
    graph=graph,
    name="Graph Index",
    description="useful for when you want to answer queries related to glasses and sunglasses as well as frequently asked questions on these products.",
    query_configs=query_configs,
    tool_kwargs={"return_direct": True, "return_sources": True},
    return_sources=True,
)

toolkit = LlamaToolkit(
    index_configs=index_configs,
    graph_configs=[graph_config],
)
I need the system prompt because I want to control the brand tone of voice but also because I need to force the model to provide the sources (links to the product pages)
You couuuld turn off return_direct
Alternatively, you can also try to pass the service context with that new llm predictor into all the query kwargs (maybe it's not passing through properly with all these layers)
nope, return_direct doesn't help
how can I pass the service context to all query kwargs?
{"similarity_top_k": 2, "service_context": service_context} -> something like this, but for all the query kwargs
I'm surprised setting that to false didn't help. Then the agent is responsible for returning the final answer, and it has that system message
service_context doesn't seem to help either. I will try to start fresh, but right now I find it difficult to use llama-index with langchain, get the proper sources, and keep the system prompt
I can see I am getting close, I love how llama index handles data (I start from a Knowledge Graph and GraphQL)
I also managed to go into production with langchain-serve (by Jina)
I find the toolkit wrappers are useful for getting started quickly. But once you need to customize things it gets rough.
Maybe you do need a custom qa and refine template?
But also, you can use llama index as a custom tool in langchain, without all the extra wrappers, like this:
The lambda can even be a wrapper function, so you have more ability to grab sources the way you want etc.
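Roughly like this (just a sketch reusing your two indexes; the wrapper function and tool names here are illustrative):
from langchain.agents import Tool

def query_products(q: str) -> str:
    # wrapper so you can grab/format the sources however you want
    response = product_index.query(q, similarity_top_k=2)
    return f"{response}\n\nSources:\n{response.get_formatted_sources()}"

tools = [
    Tool(
        name="Product Index",
        func=query_products,
        description="useful every time you want to answer queries about products",
    ),
    Tool(
        name="Graph Index",
        # assuming the graph query accepts the same query_configs you defined above
        func=lambda q: str(graph.query(q, query_configs=query_configs)),
        description="useful for queries about glasses and sunglasses, and FAQs about these products",
    ),
]
Then you'd pass tools into a regular langchain agent (e.g. initialize_agent) instead of going through the LlamaToolkit.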
https://github.com/jerryjliu/llama_index/blob/main/examples/langchain_demo/LangchainDemo.ipynb
right, this can be an alternative route, but can I still use two indexes as separate tools (the graph and the product_index)?
I like the fact that with llama index I have this additional level of abstraction
Yea you definitely can do the same with the above approach, just need to make a tool for every index/graph
I'm glad you appreciate the wrappers though!
I'll see if I can link some ways to change the prompt templates..
So, here's an example of customizing the refine_template. You'd want to do something similar for the text_qa_template. Then these all get passed into the query kwargs
from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)
from llama_index.prompts.prompts import RefinePrompt

# Refine Prompt
CHAT_REFINE_PROMPT_TMPL_MSGS = [
    HumanMessagePromptTemplate.from_template("{query_str}"),
    AIMessagePromptTemplate.from_template("{existing_answer}"),
    HumanMessagePromptTemplate.from_template(
        "I have more context below which can be used "
        "(only if needed) to update your previous answer.\n"
        "------------\n"
        "{context_msg}\n"
        "------------\n"
        "Given the new context, update the previous answer to better "
        "answer my previous query. "
        "If the previous answer remains the same, repeat it verbatim. "
        "Never reference the new context or my previous query directly.",
    ),
]
CHAT_REFINE_PROMPT_LC = ChatPromptTemplate.from_messages(CHAT_REFINE_PROMPT_TMPL_MSGS)
CHAT_REFINE_PROMPT = RefinePrompt.from_langchain_prompt(CHAT_REFINE_PROMPT_LC)
...
index.query("my query", similarity_top_k=3, refine_template=CHAT_REFINE_PROMPT)
Here's a link to what the text_qa_template currently looks like. It gets transformed into a single human message, but you can create one in a similar way to the refine template above to add the system message
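For example, something along these lines (a sketch; the human message here just mirrors the default QA prompt wording):
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from llama_index.prompts.prompts import QuestionAnswerPrompt

CHAT_QA_PROMPT_TMPL_MSGS = [
    # prepend your system message
    SystemMessagePromptTemplate.from_template(system_message),
    HumanMessagePromptTemplate.from_template(
        "Context information is below.\n"
        "---------------------\n"
        "{context_str}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, "
        "answer the question: {query_str}\n"
    ),
]
CHAT_QA_PROMPT_LC = ChatPromptTemplate.from_messages(CHAT_QA_PROMPT_TMPL_MSGS)
CHAT_QA_PROMPT = QuestionAnswerPrompt.from_langchain_prompt(CHAT_QA_PROMPT_LC)
Then both templates can go into the query kwargs, e.g. index_query_kwargs={"similarity_top_k": 2, "text_qa_template": CHAT_QA_PROMPT, "refine_template": CHAT_REFINE_PROMPT} in your tool config.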
https://github.com/jerryjliu/llama_index/blob/main/gpt_index/prompts/default_prompts.py#L105
ok let me do some tests, thank you very much
it is working now!
memory = ConversationBufferMemory(memory_key="chat_history", ai_prefix=system_message)
llm = ChatOpenAI(temperature=0, model_name="gpt-4")
agent_chain = create_llama_chat_agent(
    toolkit,
    llm,
    memory=memory,
    verbose=True,
    agent_kwargs={"prefix": system_message},
    return_sources=True,
)
this part can be improved in general; right now the sources are coming out only because I forced them in the system prompt. Thanks a lot for your kind help
Agent: {"response": "The Prada PR 13YS sunglasses are a great choice for a woman. These acetate frames feature a havana color and come with a price tag of $99.00. The lenses are designed to provide maximum protection from the sun's harmful UV rays. The sleek design and classic look make these sunglasses a perfect choice for any woman.", "sources": [{"ref_doc_id": "210872cb-2e6d-4e09-9ad3-a6d9ba831348"}, {"ref_doc_id": "b34fca92-fb15-4631-a311-03edf76c40bc"}]}
Yes!
Yea there are some constraints on how the response is returned
You can use json.loads() to parse/split the response and sources
yes - what is the simplest way to retrieve the ref_doc_id?
import json

response = agent.run(...)
response_dict = json.loads(response)
response_str = response_dict["response"]
ref_doc_ids = [x["ref_doc_id"] for x in response_dict["sources"]]
right but once I have the ref_doc_ids with the list how do I get back to the actual document?
graph.docstore.get_document(doc_id)
Something like that I think?
Hi @Logan M for some reason I cannot locate the document from the ref_ids; I can get the response and the sources (also from the agent now) but I cannot get the document behind it using the id.
response = graph.query("How do I know if Vogue sunglasses are authentic?")
using
response.get_formatted_sources()
I would get the source. I can see the docstore is a list with documents, but I tried with .get_document(doc_id) or .get_node(doc_id) and I always get "not found". Example:
Document(text='What is the difference between Ray-Ban polarized and non-polarized? Polarized sunglasses filter out the ambient light and eliminate glare through the reflective surface. Non-polarized sunglasses lessen the overall intensity of the light reflecting on the lenses.\n', doc_id='fa9538d8-ae66-4393-b0eb-9e6d72a56433', embedding=None, doc_hash='05d7797846c5c9af285ae3e4b0a1c2af42fb7c05b340d1fd98b8a1c802c3e701', extra_info={'url': 'https://www.glasses.com/gl-us/ray-ban'}),
I am only missing this part to get the prototype ready
I can write a function to retrieve it from the docstore (which is a list), but I believe there is a simpler way. I used product_index.docstore.get_node() in the past, but it doesn't seem to work right now. I have 2 indexes and 1 graph that combines them both; the agent will use either the graph or the product_index.
Yea after chatting the other day with some other people, it seems like there is no easy way to retrieve the entire original document. The docstore actually only holds the nodes, so ref_doc_id will not fetch the documents.
One option for this is to set the extra_info dict of each document to include the filename. Then this will show up in the node.node_info dict, so you can find the file that way
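e.g. something like this when you create your documents (just a sketch; the filename/url values are illustrative):
from llama_index import Document

doc = Document(
    text="What is the difference between Ray-Ban polarized and non-polarized? ...",
    extra_info={"filename": "rayban_faq.txt", "url": "https://www.glasses.com/gl-us/ray-ban"},
)
# after indexing, the nodes built from this document should carry that info,
# so you can look up the file/url even when you only have a node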
Using the agent makes us lose track of the response object that had that formatted sources function, since now we have to do things langchain's way
how do I get back to the node?
oh right, you only have the ref_doc_id
Seems like you have to iterate over the doc store and pick up nodes that have the same ref_doc_id
This really should be easier
Hi @Logan M I finally updated the code to 0.6.9 and I am getting the sources parsed back - I now have a GPTVectorStoreIndex (only one to start with) and I need to extract the URLs stored in extra_info from the ref_doc_id behind the answer. What is the best approach?
Nice!
Both the extra_info that you set on your input documents, and the ref_doc_id, can be accessed from the response object
response.source_nodes
is a list of source nodes
From there, you can do response.source_nodes[0].node.node_info
or response.source_nodes[0].ref_doc_id
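So pulling the URLs out could look something like this (a sketch; depending on the version, the url you set in extra_info may sit on node.extra_info or in node.node_info):
response = index.as_query_engine().query("my query")
for source_node in response.source_nodes:
    node = source_node.node
    # ref_doc_id plus whatever extra_info was set on the original document
    print(node.ref_doc_id, (node.extra_info or {}).get("url"))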
ah cool, let me test it right away
I can get the sources from the response, which is good, but from the langchain agent I only get the ref_doc_id and I need to trace back the information behind it. Something like index.get_document(ref_doc_id) would help (or a way to iterate through the index)
here is what I have from the agent
{'answer': 'Generative AI refers to the ability of large language models to generate coherent text, as well as other abilities such as writing code, drawing, creating photo-realistic marketing materials, synthesizing and comprehending spoken language, and creating 3D objects. In the context of SEO, generative AI can be used for tasks such as text generation, internal linking, FAQ generation, product descriptions, summarization, introductory text, structured data automation, chatbots, and more. It is important to have a strong data fabric and ethical values when using generative AI. The feedback-loop is also crucial in improving the quality of the output.', 'sources': [{'start': 15405, 'end': 19450, 'ref_doc_id': '8f7cf82f-7c5a-4892-8a48-a36b119eb7f4', 'score': 0.8553929771241638}, {'start': 27089, 'end': 31077, 'ref_doc_id': '8f7cf82f-7c5a-4892-8a48-a36b119eb7f4', 'score': 0.8521241183413296}, {'start': 0, 'end': 3832, 'ref_doc_id': '8f7cf82f-7c5a-4892-8a48-a36b119eb7f4', 'score': 0.8484620447588704}]}
Yea the ref_doc_ids are not really kept track of properly right now. Hoping to make this easier in the future.
I think you should be able to iterate over the nodes in index.docstore.docs to find all the nodes with the same ref_doc_id (not the most efficient, but should work)
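Something like this should do it (a sketch, assuming index.docstore.docs maps node ids to nodes and that your url was set in extra_info; the ref_doc_id is the one from your agent output above):
ref_doc_id = "8f7cf82f-7c5a-4892-8a48-a36b119eb7f4"
matching_nodes = [
    node
    for node in index.docstore.docs.values()
    if node.ref_doc_id == ref_doc_id
]
urls = {(node.extra_info or {}).get("url") for node in matching_nodes}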