

Is there a way to control the message length when using metadata extractors?

Is there a way to control the message length when using one of the metadata extractors? I'm trying to use the summary extractor on a set of nodes I've created, and every time I try to process them I get the following error:
InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4513 tokens. Please reduce the length of the messages.
I'm using a sentence window node parser with a window of 3, so the nodes are already small. Nevertheless, I keep wasting a lot of tokens because I'll be about 300 nodes in and the process will terminate. How do I control the length of the message for metadata extractors? Cheers!
Interesting, I'm surprised this isn't being handled already.

Most likely, some metadata you only want to use for embeddings/retrieval

To prevent metadata from being sent to the LLM, you can set some settings like

Plain Text
node.excluded_llm_metadata_keys = ['key1', 'key2']


This will prevent metadata under those key values from being sent to the LLM

From the metadata extractor, there are likely a few keys being inserted that you can exclude
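As a quick sanity check on what the LLM actually sees, you can render a node under different metadata modes. A minimal sketch, using a hand-built TextNode with made-up placeholder metadata:

Plain Text
from llama_index.schema import TextNode, MetadataMode

node = TextNode(
    text="A sentence from the document.",
    metadata={"window": "surrounding sentences...", "source": "page.html"},
)
# Keys listed here are hidden from the LLM view but stay on the node
node.excluded_llm_metadata_keys = ["window", "source"]

print(node.get_content(metadata_mode=MetadataMode.LLM))  # text only
print(node.get_content(metadata_mode=MetadataMode.ALL))  # text + all metadata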
Thanks Logan for checking in! I'm not sure I follow, though. Are you saying the summary extractor also summarizes the existing metadata within the node? Because I do have a handful of metadata already in each node. I didn't think the summary extractor parsed the metadata; I thought it summarized the "text" attribute of the node. Now, I do specify a number of keys in excluded_llm_metadata_keys, and I'm happy to include more if that affects the total message length used by the summary extractor.
oh I guess I should have clarified: is the error occurring during the metadata-extraction process or during actual queries?
the error is occurring during the "Extracting summaries" phase of sentence parsing, because I passed a metadata extractor to the sentence window node parser constructor:
Plain Text
from llama_index.node_parser import SentenceWindowNodeParser
from llama_index.node_parser.extractors import (
    MetadataExtractor,
    SummaryExtractor,
    KeywordExtractor,
)

metadata_extractor = MetadataExtractor(
    extractors=[
        SummaryExtractor(summaries=["self"], llm=llm),
        KeywordExtractor(keywords=3, llm=llm),
    ],
)

sentence_window_parser = SentenceWindowNodeParser(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
    include_metadata=True,
    include_prev_next_rel=True,
    metadata_extractor=metadata_extractor,
)
ah I see I see

By default, during the metadata extraction process, it is actually reading the text AND the metadata to generate the summary

Try this
Plain Text
from llama_index.schema import MetadataMode

metadata_extractor = MetadataExtractor(
    extractors=[
        SummaryExtractor(summaries=["self"], llm=llm, metadata_mode=MetadataMode.NONE),
        KeywordExtractor(keywords=3, llm=llm, metadata_mode=MetadataMode.NONE),
    ],
)
Thank you kindly @Logan M, testing now...
While it's processing, I am still getting timeouts from api.openai.com. Is there a recommended approach to throttling the rate at which nodes are processed to avoid the 600-second timeouts? Do people wait to run the metadata extractors until after the nodes are extracted? I'm trying to process only 1000 nodes and it's taking forever, made worse by the timeouts.
oof yea, the metadata extractors are fairly new, so this hasn't been worked out yet 😅 You could process the nodes afterwards in batches to avoid timeouts. Would be awesome to have a PR to better handle this too lol
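A rough sketch of that batching idea, assuming the MetadataExtractor's process_nodes method (which the node parser calls internally) and a nodes list produced without the extractor attached; the batch size and sleep are arbitrary:

Plain Text
import time

batch_size = 50
processed_nodes = []
for i in range(0, len(nodes), batch_size):
    batch = nodes[i : i + batch_size]
    processed_nodes.extend(metadata_extractor.process_nodes(batch))
    time.sleep(10)  # crude pause between batches to dodge rate limits/timeouts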
@Logan M it got much further, but still generated an InvalidRequestError due to a message length of 6534 tokens. Is it possible to wrap the line in a try/except clause and print out the node that's causing the problem? What I don't understand is that all the nodes should only be one sentence in length.
perhaps there is a node that wasn't able to split properly? Tbh that node parser needs a backup splitter to handle this πŸ˜…

There are a few ways to handle this. The fastest is probably splitting everything into nodes (without the metadata extractors) and then doing

Plain Text
for i, node in enumerate(nodes):
  print(f"{i}: {len(node.text.split(' '))}")


And finding the index of the biggest nodes lol
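On the try/except idea, you could also run the extractor one node at a time and log whichever node fails. Another sketch, assuming process_nodes accepts a one-element list:

Plain Text
good_nodes = []
for i, node in enumerate(nodes):
    try:
        good_nodes.extend(metadata_extractor.process_nodes([node]))
    except Exception as e:  # e.g. the context-length InvalidRequestError
        print(f"Node {i} failed ({len(node.text.split())} words): {e}")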
I found the node that has the issue: it's the content from a web page that has several tables of values with no punctuation to indicate sentences. Can we force the sentence node parser to chunk at a certain size if no sentences can be found? Or do I need a second round of processing to catch these anomalous nodes?
Yea I've been meaning to add this feature, but have not gotten to it yet πŸ˜… A second round of post-processing might have to do the trick here
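One way to do that second pass is to re-split any oversized node with a token-based splitter before running the extractors. A sketch, assuming llama_index's TokenTextSplitter; the 512-word cutoff is arbitrary:

Plain Text
from llama_index.schema import TextNode
from llama_index.text_splitter import TokenTextSplitter

splitter = TokenTextSplitter(chunk_size=512, chunk_overlap=20)

resplit_nodes = []
for node in nodes:
    if len(node.text.split()) > 512:  # table-like text with no sentence breaks
        for chunk in splitter.split_text(node.text):
            resplit_nodes.append(TextNode(text=chunk, metadata=dict(node.metadata)))
    else:
        resplit_nodes.append(node)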
okkies, thanks for all your smarts and time working with llamaindex!