I'm having a very strange issue with the SentenceSplitter node parser. When I use a node_chunk_overlap of 0 I have no issues, but with any positive value I always get an error saying the chunk overlap is greater than the node_chunk_size, when it definitely is not. For example, a node_chunk_overlap of 8 is considered larger than a node_chunk_size of 160, as shown here:
2024-07-04 10:11:54 Traceback (most recent call last):
2024-07-04 10:11:54   File "/app/main.py", line 27, in <module>
2024-07-04 10:11:54     init_settings()
2024-07-04 10:11:54   File "/app/app/settings.py", line 41, in init_settings
2024-07-04 10:11:54     Settings.node_parser = SentenceSplitter(chunk_size=Settings.chunk_size, chunk_overlap=Settings.chunk_overlap)
2024-07-04 10:11:54                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-07-04 10:11:54   File "/usr/local/lib/python3.11/site-packages/llama_index/core/node_parser/text/sentence.py", line 81, in __init__
2024-07-04 10:11:54     raise ValueError(
2024-07-04 10:11:54 ValueError: Got a larger chunk overlap (8) than chunk size (160), should be smaller.
I don't understand how it thinks 8 is larger than 160.
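I think the values were coming in as strings, since they are read with os.getenv(), which returns a string whenever the variable is set. That would explain the error: compared as strings, "8" is greater than "160", because the comparison goes character by character and "8" sorts after "1". The same thing happened with the HierarchicalNodeParser, so I removed the default values from os.getenv() and wrapped all the chunk_size and chunk_overlap variables in int(), and it seems to be working now.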
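For anyone hitting the same thing, here is a minimal sketch of the fix described above. The helper env_int and the variable names CHUNK_SIZE and CHUNK_OVERLAP are hypothetical, picked only for illustration. The values are set inside the script just so the example runs on its own; in a real app they would come from the environment.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an environment variable as an int (hypothetical helper).

    os.getenv returns a str when the variable is set, so the value
    has to be converted before any numeric comparison.
    """
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)


# Strings are compared character by character, so "8" beats "160"
# because "8" sorts after "1". Integers compare numerically.
assert ("8" > "160") is True
assert (8 > 160) is False

# Assumed variable names, set here only to make the sketch self-contained.
os.environ["CHUNK_SIZE"] = "160"
os.environ["CHUNK_OVERLAP"] = "8"

chunk_size = env_int("CHUNK_SIZE", 1024)
chunk_overlap = env_int("CHUNK_OVERLAP", 20)

# With real ints, the overlap check behaves as expected, and the values
# can be passed on, e.g. to
# SentenceSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap).
assert chunk_overlap < chunk_size
```

Converting at the point where the environment is read keeps every later use of chunk_size and chunk_overlap numeric, so each parser gets ints without needing its own int() call.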