Find answers from the community

Updated last year

At a glance

The community member is using PyMuPDFReader to load a PDF document and create a DocumentSummaryIndex. They want to ensure the summary generated by the query engine is at least 20 sentences long or uses the full 4096 token capacity. Other community members explain that the input and output of a language model are connected, and the larger the input prompt, the less room there is for the output. They suggest setting num_outputs=2048 in the service context to ensure there is always room for 2048 output tokens, but note that this doesn't guarantee the model will use all the available space.

Hey there πŸ™‚ got another question for y'all:

Plain Text
    from pathlib import Path
    from llama_index import DocumentSummaryIndex, StorageContext, download_loader

    PyMuPDFReader = download_loader("PyMuPDFReader")
    loader = PyMuPDFReader()

    documents = loader.load_data(file=Path("./test-doc2.pdf"))
    # Create and store summary index
    storage_context = StorageContext.from_defaults()

    index = DocumentSummaryIndex.from_documents(
        documents,
        service_context=service_context,  # service_context defined earlier
        storage_context=storage_context,
        show_progress=True,
    )
    query_engine = index.as_query_engine()
    result = query_engine.query("Write an extensive summary of this context for me?")
    print(result)


How can I make sure that the summary it writes is longer than 20 sentences? Or how do I make sure it uses the full 4096 tokens for the response?
4 comments
So, the input and output of an LLM are connected.

The bigger the input prompt, the less room there is to generate tokens

i.e. if I input 2048 tokens, and the context window is 4096, I only have room for 2048 more tokens

This can be (slightly) controlled by setting service_context = ServiceContext.from_defaults(..., num_outputs=2048)

This will make sure there is always room for 2048 output tokens.
That doesn't mean the LLM will use all that room, though.
That's mostly down to prompt engineering.
@Logan M Gotcha, thanks for explaining.
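The budgeting described above can be sketched in plain Python (no llama_index needed); the constants and helper names here are illustrative, but the arithmetic is exactly what reserving `num_outputs=2048` in `ServiceContext.from_defaults(...)` does:

```python
# The context window is shared between the input prompt and the generated output.
CONTEXT_WINDOW = 4096  # e.g. a 4096-token model

def room_for_output(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the LLM to generate once the prompt is counted."""
    return max(context_window - prompt_tokens, 0)

def max_prompt_tokens(num_outputs: int, context_window: int = CONTEXT_WINDOW) -> int:
    """With num_outputs tokens reserved for the response (as ServiceContext's
    num_outputs does), this is the largest prompt that still fits."""
    return context_window - num_outputs

print(room_for_output(2048))    # a 2048-token prompt leaves 2048 tokens of room
print(max_prompt_tokens(2048))  # reserving 2048 output tokens caps the prompt at 2048
```

Note that reserving the room only guarantees space for 2048 output tokens; whether the model actually fills it is still down to prompt engineering.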