LlamaIndex


Updated 8 months ago

hello guys i want to ask about two

ᴷᵉⁿˢʰⁱHOUSSNI

Hello guys, I want to ask about two things. First, how can I find out which PDF page an answer comes from? Second, how does the chat memory system work with LlamaIndex?


28 comments

For PDFs, the page label and filename are present in the source nodes of the response object.

The chat memory system maintains the conversation history. This preserves the context, so users can ask follow-up questions.

LlamaIndex offers different chat memory approaches (a chat memory buffer and a chat summary memory), which you can choose between depending on your use case.
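To make the buffer idea concrete, here is a minimal, library-independent sketch. It is not LlamaIndex's implementation: the `ToyChatMemory` class, its `max_messages` limit, and its method names are hypothetical, chosen only to show how keeping the most recent turns lets a follow-up question be read in context.

```python
from collections import deque


class ToyChatMemory:
    """Hypothetical stand-in for a chat memory buffer (not a LlamaIndex class)."""

    def __init__(self, max_messages=4):
        # A bounded deque drops the oldest message once the limit is reached.
        self._messages = deque(maxlen=max_messages)

    def put(self, role, content):
        self._messages.append({"role": role, "content": content})

    def get(self):
        return list(self._messages)

    def as_prompt(self, new_question):
        # Prepend the stored turns so the model sees the earlier context.
        lines = [f"{m['role']}: {m['content']}" for m in self._messages]
        lines.append(f"user: {new_question}")
        return "\n".join(lines)


memory = ToyChatMemory(max_messages=4)
memory.put("user", "What does chapter 2 cover?")
memory.put("assistant", "Chapter 2 covers indexing.")
prompt = memory.as_prompt("Which page is that on?")
print(prompt)
```

In LlamaIndex itself, the corresponding building block is `ChatMemoryBuffer` in `llama_index.core.memory`, which is usually handed to a chat engine; check the documentation for your installed version before relying on exact parameters.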

ᴷᵉⁿˢʰⁱHOUSSNI

I tried multiple times to access the source nodes but didn't succeed. I'm using this approach for retrieval:

response = Settings.llm.complete(query_with_context)
response_text = str(response)  # Convert the response object to string
source_documents = [{"filename": doc['filename'], "text": doc['text']} for doc in response_docs]

response_data = {
    'response': response_text,
    'sources': source_documents
}
return jsonify(response_data)

ᴷᵉⁿˢʰⁱHOUSSNI

@WhiteFang_Jr

You are interacting with the LLM directly here. That is why there are no source nodes.

Source nodes are returned when you build a RAG application.

With this approach, even the page label won't come through.

ᴷᵉⁿˢʰⁱHOUSSNI

I didn't understand you. I thought I was building a RAG application. What modifications do I need to make to access the nodes?

ᴷᵉⁿˢʰⁱHOUSSNI

@WhiteFang_Jr

RAG consists of several stages:

First, you create your vectors.
Second, based on your query, certain nodes are pulled from your vector data set.
Third, you use those nodes to create the final answer.

What I'm seeing here is that you are interacting with the LLM directly: Settings.llm.complete(query_with_context)
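The three stages can be sketched without any library. This is a toy illustration, not LlamaIndex code: word overlap stands in for real embeddings, and `build_index`, `retrieve`, and `answer` are hypothetical names. The point is that source information is attached to the answer only because retrieval ran first; calling the LLM directly skips the second stage, so there is nothing to report as a source.

```python
def tokenize(text):
    return set(text.lower().split())


def build_index(chunks):
    # Stage 1: turn each chunk into a searchable "vector" (here, a set of words).
    return [{"tokens": tokenize(c["text"]), "chunk": c} for c in chunks]


def retrieve(index, query, top_k=2):
    # Stage 2: pull the chunks that overlap most with the query.
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & e["tokens"]), e["chunk"]) for e in index]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def answer(query, source_nodes):
    # Stage 3: a real system would send the query plus retrieved text to an LLM.
    context = " ".join(node["text"] for node in source_nodes)
    return {"response": f"Based on: {context}", "source_nodes": source_nodes}


chunks = [
    {"text": "Invoices are due within 30 days", "filename": "terms.pdf", "page_label": "3"},
    {"text": "Refunds are processed in 5 days", "filename": "terms.pdf", "page_label": "7"},
]
index = build_index(chunks)
nodes = retrieve(index, "when are invoices due", top_k=1)
result = answer("when are invoices due", nodes)
print(result["source_nodes"][0]["filename"], result["source_nodes"][0]["page_label"])
```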

ᴷᵉⁿˢʰⁱHOUSSNI

I think I misunderstood the documentation and I'm confused now. Can you tell me what to do? This is the vector part of my code:

# Create the index from documents
documents = list(chunks_collection.find({}))
llama_documents = [Document(text=doc['text'], extra_info={"filename": doc["filename"]}) for doc in documents]
index = VectorStoreIndex.from_documents(llama_documents, storage_context=storage_context, embed_model=Settings.embed_model, service_context=service_context)
index.set_index_id("my_index")
index.storage_context.persist(persist_dir="./storage")

# Load or create indices
index = load_index_from_storage(storage_context, index_id="my_index")
retriever = index.as_retriever(similarity_top_k=10)

# Create an instance of your custom engine
custom_prompt_template = MongoDBContextPrompt(template="{context}\n\n{query}")
query_engine = CustomRetrieverQueryEngine(retriever=retriever, llm=Settings.llm, prompt_template=custom_prompt_template)

What do I need to change in the query function? @WhiteFang_Jr

You already have the query_engine, so use it to run the query.
It returns nodes whose metadata contains the page_label.

response = query_engine.query("Your query here")

ᴷᵉⁿˢʰⁱHOUSSNI

Thank you. I have two more questions: how can I use these nodes in my code, and what can I do if the code is not answering correctly, for example when it gives only part of the answer?

Nodes carry metadata, such as the page label of the page the text was extracted from.

You can also show the source text that was used to generate the final answer.

For your second question: if your code is not giving the correct answer, I would suggest identifying the root cause first.

For example, the root cause could be a poor prompt, in which case you should try different prompts.

ᴷᵉⁿˢʰⁱHOUSSNI

@WhiteFang_Jr can you tell me how to modify this code to get the sources?

response_text = query_engine.query(query_with_context)

source_documents = [{"filename": doc['filename'], "text": doc['text']} for doc in response_docs]

response_data = {
    'response': response_text,
    'sources': source_documents
}
return jsonify(response_data)

ᴷᵉⁿˢʰⁱHOUSSNI

I did this:

response_data = {
    'response': response_text,
    'sources': source_documents,
    'response source': response_text.source_nodes
}

and got this error:

ERROR:root:Error processing query: 'dict' object has no attribute 'source_nodes'
Traceback (most recent call last):
  File "c:\Users\asus\Documents\PFEPROJECT\app.py", line 227, in query
    'response source': response_text.source_nodes
AttributeError: 'dict' object has no attribute 'source_nodes'

ᴷᵉⁿˢʰⁱHOUSSNI

@WhiteFang_Jr

To get source nodes you need to check the response object.

Plain Text

response = query_engine.query("your query here")

# Now iterate over the nodes
for node in response.source_nodes:
    print(node)  # This will print the entire node

Now you can extract the items you need from each node, such as the metadata and source text, and add them to the final response object that you return.

ᴷᵉⁿˢʰⁱHOUSSNI

Like this?

response_text = query_engine.query(query_with_context)

source_documents = [{"filename": doc['filename'], "text": doc['text']} for doc in response_docs]
response_data = {
    'response': response_text,
    'sources': source_documents,
}
for node in response_text.source_nodes:
    print(node)
return jsonify(response_data)

@WhiteFang_Jr

ᴷᵉⁿˢʰⁱHOUSSNI

?

No. What exact items do you want to return in the response object?

ᴷᵉⁿˢʰⁱHOUSSNI

I want to return the chunk, the document name, and the page number. @WhiteFang_Jr

ᴷᵉⁿˢʰⁱHOUSSNI

Is there any documentation about it? @WhiteFang_Jr

Have you checked what's inside the node?

All of these details are available inside the node.

Plain Text

final_response = {}
response = query_engine.query("your query here")

final_response['response'] = response.response
for count, source in enumerate(response.source_nodes, start=1):
    source_dict = {}
    # metadata (older name: extra_info) holds the page label and filename
    source_dict['extra_info'] = source.node.metadata
    source_dict['text'] = source.node.text
    final_response['source_' + str(count)] = source_dict

return final_response
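As a variation on the loop above, here is a small sketch that returns the sources as a list and tolerates missing metadata keys, so the result can be handed to `jsonify`. It assumes only that the response exposes `.response` and `.source_nodes`, and that each source's `.node` has a `metadata` dict and a `text` attribute. The helper name `response_to_dict` is made up for this example, and the objects at the bottom are fakes that mimic that attribute layout so the helper can be exercised without an index.

```python
import json
from types import SimpleNamespace


def response_to_dict(response):
    """Flatten a query response into plain, JSON-serializable types."""
    sources = []
    for source in response.source_nodes:
        metadata = dict(source.node.metadata or {})
        sources.append({
            "filename": metadata.get("filename"),
            # page_label stays None when no page was ever recorded for the chunk.
            "page_label": metadata.get("page_label"),
            "text": source.node.text,
        })
    return {"response": str(response.response), "sources": sources}


# Fake objects for demonstration only; a real response comes from query_engine.query(...).
fake_response = SimpleNamespace(
    response="Invoices are due within 30 days.",
    source_nodes=[
        SimpleNamespace(node=SimpleNamespace(
            metadata={"filename": "terms.pdf", "page_label": "3"},
            text="Invoices are due within 30 days",
        )),
        SimpleNamespace(node=SimpleNamespace(metadata={}, text="chunk without metadata")),
    ],
)

payload = response_to_dict(fake_response)
print(json.dumps(payload["sources"][0]))
```

A list keeps the number of sources open-ended on the client side, and returning `None` for a missing page label makes an empty metadata dict visible in the output instead of raising a `KeyError`.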

@ᴷᵉⁿˢʰⁱHOUSSNI I would say the info is there in the metadata of the node by default. You can also explore creating nodes manually; that way you control what goes into the metadata, in case you want to add a few more things in the future.
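When chunks are loaded from a database rather than through a PDF reader, nothing fills in a page label automatically, so it has to be stored with each chunk and copied into the metadata yourself. The sketch below builds that metadata from raw records. The `page` field is hypothetical (the code in this thread only shows `text` and `filename`), as is the helper name `build_metadata`; the resulting dict is what you would pass as the metadata (`extra_info` in the code earlier in the thread) when constructing each `Document`.

```python
def build_metadata(record):
    """Collect the fields that should travel with a chunk into the index."""
    metadata = {"filename": record["filename"]}
    # "page" is an assumed field name; it must be saved when the PDF is chunked.
    if record.get("page") is not None:
        metadata["page_label"] = str(record["page"])
    return metadata


records = [
    {"text": "Invoices are due within 30 days", "filename": "terms.pdf", "page": 3},
    {"text": "Legacy chunk stored without a page", "filename": "old.pdf"},
]

prepared = [{"text": r["text"], "metadata": build_metadata(r)} for r in records]
for item in prepared:
    print(item["metadata"])
```

If the stored chunks never recorded a page, the page cannot be recovered at query time; the PDFs would need to be re-chunked with the page saved alongside each chunk.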

ᴷᵉⁿˢʰⁱHOUSSNI

I think it's not working because I'm getting the documents from MongoDB, where the chunk texts are stored without any mention of a page label.

ᴷᵉⁿˢʰⁱHOUSSNI

@WhiteFang_Jr

ᴷᵉⁿˢʰⁱHOUSSNI

extra_info is empty.
