The above is just part of the code. Users upload their documents to our system, and we create a DocumentSummaryIndex and persist it. I found that LlamaIndex stores the chunk text and summary results in a local docstore.json file.
Q1: Each chunk's text is sent to the LLM to get a summary, using the default prompt below. Is the LLM API called in parallel to summarize the chunks?
Plain Text
Context information from multiple sources is below.
---------------------
{text}
---------------------
Given the information from multiple sources and not prior knowledge, answer the query.
Query: Describe what the provided text is about. Also describe some of the questions that this text can answer. 
Answer: 
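
If the goal is to have those per-chunk summary calls issued concurrently, here is a minimal sketch assuming a legacy ServiceContext-based setup (`documents` and `service_context` stand in for your own objects): build the index with a response synthesizer created with use_async=True.
Plain Text
from llama_index import DocumentSummaryIndex, get_response_synthesizer

# Sketch: use_async=True lets the synthesizer issue the per-chunk
# summarization calls concurrently rather than one at a time.
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize", use_async=True
)
doc_summary_index = DocumentSummaryIndex.from_documents(
    documents,  # your parsed documents
    service_context=service_context,
    response_synthesizer=response_synthesizer,
    show_progress=True,
)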


Q2: Why does the LLM need to "describe some of the questions that this text can answer"? What role do these questions play?
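
My understanding is that the generated questions become part of the stored summary text, and it is that summary text the retriever embeds and matches against the user query, so questions phrased like future queries help retrieval. A minimal sketch under that assumption, using the legacy import path:
Plain Text
from llama_index.indices.document_summary import (
    DocumentSummaryIndexEmbeddingRetriever,
)

# Sketch: the summary (including its anticipated questions) is what
# gets embedded; retrieval compares the query embedding against it.
retriever = DocumentSummaryIndexEmbeddingRetriever(
    doc_summary_index,
    similarity_top_k=1,
)
nodes = retriever.retrieve("what is this document about?")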

Q3: When I use the DocumentSummaryIndex to create a query engine to answer a user query, I see that the query engine does not use the summary results in the local docstore.json file; it just calls the LLM API to summarize each chunk and combines those summaries into a final one.
What role, then, do the summary results generated at the index-creation stage play?

Plain Text
from llama_index import StorageContext, load_index_from_storage

# Load the persisted DocumentSummaryIndex from disk.
doc_summary_index = load_index_from_storage(
    StorageContext.from_defaults(persist_dir=store_path),
    service_context=service_context,
)
summary_query_engine = doc_summary_index.as_query_engine(
    response_mode="tree_summarize",
    streaming=True,
    use_async=True,
)
query = 'write a summary about this document'
summary_query_engine.query(query)
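
If you only want the summary that was persisted at index-build time, without a fresh LLM pass, a minimal sketch (the doc id is a placeholder for one of your own):
Plain Text
# Sketch: read the summary stored when the index was built; no new
# LLM calls are made. "<your-doc-id>" is a hypothetical placeholder.
stored_summary = doc_summary_index.get_document_summary("<your-doc-id>")
print(stored_summary)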
1 comment
Hi, I see that LLMSingleSelector supports other LLMs, but when I use a custom LLM via service_context, I hit this issue:
Plain Text
from llama_index import ServiceContext
from llama_index.query_engine import RouterQueryEngine
from llama_index.selectors.llm_selectors import LLMSingleSelector

llm = OurLLM()  # custom LLM implementation
service_context = ServiceContext.from_defaults(
    llm=llm, text_splitter=text_splitter, embed_model=embed_model
)
# initialize router query engine (single selection)
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(service_context=service_context),
    query_engine_tools=query_engine_tools,
)
ValueError: 
******
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

To disable the LLM entirely, set llm=None.
******
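
One thing worth trying, as a sketch: set the service context globally so any component that would otherwise fall back to the OpenAI default picks up the custom LLM (set_global_service_context is available in legacy llama_index versions):
Plain Text
from llama_index import set_global_service_context

# Sketch: make the custom-LLM service context the global default so
# components that would otherwise fall back to OpenAI use OurLLM.
set_global_service_context(service_context)

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(service_context=service_context),
    query_engine_tools=query_engine_tools,
)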
2 comments
@kapa.ai When RetrieverQueryEngine's response_mode is set to "tree_summarize" and the top 10 nodes are returned, there is an issue: the prompt is more than 4096 tokens.
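
A minimal sketch of one workaround, assuming the engine is built from an index via as_query_engine (`index` is a placeholder): retrieve fewer nodes so the combined tree_summarize prompt stays inside the model's 4096-token window.
Plain Text
# Sketch: fewer retrieved nodes -> smaller tree_summarize prompts.
# "index" is a placeholder for your existing index object.
query_engine = index.as_query_engine(
    response_mode="tree_summarize",
    similarity_top_k=3,  # down from 10 to fit the 4096-token window
)
response = query_engine.query("your question")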
3 comments