
Updated 4 months ago

GPTPineconeIndex error

Getting this error when using the GPTPineconeIndex (on a rather large set of documents):

ValueError: Effective chunk size is non positive after considering extra_info


Any idea what's going on?
Trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/vector_store/vector_indices.py", line 310, in __init__
    super().__init__(
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/vector_store/base.py", line 63, in __init__
    super().__init__(
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/base.py", line 114, in __init__
    self._index_struct = self.build_index_from_documents(documents)
  File "/usr/local/lib/python3.9/dist-packages/llama_index/token_counter/token_counter.py", line 86, in wrapped_llm_predict
    f_return_val = f(_self, *args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/base.py", line 286, in build_index_from_documents
    return self._build_index_from_documents(documents)
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/vector_store/base.py", line 206, in _build_index_from_documents
    self._add_document_to_index(index_struct, d)
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/vector_store/base.py", line 181, in _add_document_to_index
    nodes = self._get_nodes_from_document(document)
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/base.py", line 268, in _get_nodes_from_document
    return get_nodes_from_document(
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/node_utils.py", line 50, in get_nodes_from_document
    text_splits = get_text_splits_from_document(
  File "/usr/local/lib/python3.9/dist-packages/llama_index/indices/node_utils.py", line 30, in get_text_splits_from_document
    text_splits = text_splitter.split_text_with_overlaps(
  File "/usr/local/lib/python3.9/dist-packages/llama_index/langchain_helpers/text_splitter.py", line 136, in split_text_with_overlaps
    raise ValueError(
ValueError: Effective chunk size is non positive after considering extra_info
How much metadata are you inserting / how big is your chunk size?
☝️ actually I don't know - this is happening on an AgentHQ user's index so it's not really my data. Would the issue be if the metadata was too large?
Hmm... looking at the code, it looks like the default chunk size for the GPTPineconeIndex is 2048. If the extra_info_str takes up that much or more, the effective chunk size drops to zero or below and the splitter raises this error. That looks like what's happening in this user's case.
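For anyone following along, here is a rough sketch of the arithmetic described above. It is not the library's actual code: `effective_chunk_size` is a hypothetical helper, and a whitespace split stands in for whatever tokenizer the real splitter uses.

```python
def effective_chunk_size(chunk_size: int, extra_info_str: str) -> int:
    """Return the room left for document text once metadata is accounted for.

    Hypothetical helper for illustration only; the whitespace split is a
    crude stand-in for a real tokenizer.
    """
    num_extra_tokens = len(extra_info_str.split())
    effective = chunk_size - num_extra_tokens
    if effective <= 0:
        # Same message as in the trace: the metadata alone fills the chunk.
        raise ValueError(
            "Effective chunk size is non positive after considering extra_info"
        )
    return effective


small_meta = "source: report.pdf page: 3"
print(effective_chunk_size(2048, small_meta))  # 2044

huge_meta = "tag " * 5000  # metadata far larger than the chunk size
try:
    effective_chunk_size(2048, huge_meta)
except ValueError as err:
    print(err)
```

With small metadata nearly the whole 2048 budget stays available; once the metadata alone reaches the chunk size there is nothing left for the text, hence the error.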
yeah exactly - it's something we can try to explicitly caution about
the metadata isn't meant to be super big atm
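Until there is an explicit warning, one way to catch this ahead of time is to check metadata size before building the index. A minimal sketch, assuming documents are available as `(text, extra_info)` pairs; the `Doc` shape, the `partition_by_metadata_size` helper, the `key: value` formatting, and the 512-token budget are all made up for illustration, and the token count is a whitespace approximation.

```python
from typing import Dict, List, Tuple

# Hypothetical document shape: (text, extra_info metadata dict).
Doc = Tuple[str, Dict[str, str]]


def count_tokens(text: str) -> int:
    # Whitespace split as a rough proxy for a real tokenizer.
    return len(text.split())


def partition_by_metadata_size(
    docs: List[Doc], max_extra_tokens: int = 512
) -> Tuple[List[Doc], List[Doc]]:
    """Split docs into those with acceptable metadata and those with too much."""
    ok: List[Doc] = []
    oversized: List[Doc] = []
    for text, extra_info in docs:
        # Assumed formatting of the metadata string, one "key: value" per line.
        extra_info_str = "\n".join(f"{k}: {v}" for k, v in extra_info.items())
        if count_tokens(extra_info_str) <= max_extra_tokens:
            ok.append((text, extra_info))
        else:
            oversized.append((text, extra_info))
    return ok, oversized


docs = [
    ("short body", {"source": "a.txt"}),
    ("another body", {"notes": "word " * 3000}),  # metadata is too large
]
ok, oversized = partition_by_metadata_size(docs)
print(len(ok), len(oversized))  # 1 1
```

Documents in `oversized` can then be trimmed, or their bulky fields moved into the text body, before they are passed to the index.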