Hey there πŸ™‚ got another question for y'all:

Plain Text
    from pathlib import Path

    from llama_index import (
        DocumentSummaryIndex,
        ServiceContext,
        StorageContext,
        download_loader,
    )

    # Download and instantiate the PDF loader
    PDFReader = download_loader("PDFReader")
    loader = PDFReader()

    documents = loader.load_data(file=Path("./test-doc2.pdf"))

    # Create and store the summary index
    service_context = ServiceContext.from_defaults()
    storage_context = StorageContext.from_defaults()

    index = DocumentSummaryIndex.from_documents(
        documents,
        service_context=service_context,
        storage_context=storage_context,
        show_progress=True,
    )
    query_engine = index.as_query_engine()
    result = query_engine.query("Write an extensive summary of this context for me.")
    print(result)


How can I make sure that the summary it writes is longer than 20 sentences? Or how do I make sure it uses the full 4096 tokens for the response?
So, the input and output of an LLM are connected.

The bigger the input prompt, the less room there is to generate tokens.

i.e. if I input 2048 tokens, and the context window is 4096, I only have room for 2048 more tokens

This can be (slightly) controlled by setting service_context = ServiceContext.from_defaults(..., num_output=2048)

This will make sure there is always room for 2048 output tokens.
That doesn't mean the LLM will use all that room, though.
That's mostly down to prompt engineering.
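
For reference, here is a minimal sketch of that setup. It assumes the legacy (pre-0.10) llama_index ServiceContext API; the model choice and token numbers are just illustrative, matching the 2048/4096 split above:

Plain Text
    from llama_index import ServiceContext
    from llama_index.llms import OpenAI

    # Reserve 2048 of the 4096-token context window for the response.
    # num_output tells LlamaIndex how much room to leave when packing prompts;
    # max_tokens caps what the model itself is allowed to generate.
    service_context = ServiceContext.from_defaults(
        llm=OpenAI(model="gpt-3.5-turbo", max_tokens=2048),
        num_output=2048,
    )

    # Pass this service_context into DocumentSummaryIndex.from_documents(...)
    # as in the snippet above.

Keeping num_output and the LLM's max_tokens in sync means the prompt packing and the generation cap agree; as noted above, the model may still stop well short of 2048 tokens unless the prompt itself asks for a long answer.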
@Logan M Gotcha, thanks for explaining.