What's interesting about the high node count is that document chunking only produces about a thousand nodes, but once the DocumentSummaryIndex gets created, ref_doc_info balloons to 1M+ refs. So something strange is going on there, and I'll have to look into it. If that number of refs is unavoidable, we may have to namespace by document and delete the namespace in the KV store directly.
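A rough sketch of the "namespace by document, then delete the namespace" idea, in case it helps the discussion. This is a toy in-memory store, not the LlamaIndex KV store API: `NamespacedKV`, `delete_namespace`, and `doc_namespace` are all hypothetical names made up for illustration.

```python
from collections import defaultdict


class NamespacedKV:
    """Toy in-memory KV store (hypothetical, not a LlamaIndex class)."""

    def __init__(self):
        # namespace -> {key: value}
        self._data = defaultdict(dict)

    def put(self, namespace, key, value):
        self._data[namespace][key] = value

    def get(self, namespace, key, default=None):
        # Use .get so a lookup never creates an empty namespace.
        return self._data.get(namespace, {}).get(key, default)

    def delete_namespace(self, namespace):
        """Drop every key under the namespace; return how many were removed."""
        return len(self._data.pop(namespace, {}))


def doc_namespace(doc_id):
    # Assumed naming scheme: one namespace per source document.
    return f"ref_doc_info/{doc_id}"


store = NamespacedKV()
for i in range(3):
    store.put(doc_namespace("doc-a"), f"node-{i}", {"text": f"chunk {i}"})
store.put(doc_namespace("doc-b"), "node-0", {"text": "other"})

# Removing a document is one namespace drop, however many refs it holds.
removed = store.delete_namespace(doc_namespace("doc-a"))
```

The point of the design is that deletion cost no longer depends on walking every ref individually; whether the real KV backend supports dropping a whole namespace or collection like this would need checking.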
@Logan M I tracked down the cause of the 1M refs. It looks like if the doc store is used to store multiple indexes, each index causes an exponential increase in refs. Here is a basic notebook to see the issue (I didn't test it this late on a Friday, but it should work)
@Logan M question: I'm trying to use this updated version in my application, and I reference the git repo in requirements.txt, but it always tries to fetch the wheel for 10.51 instead of using the updated core (because that's how it's defined in the Poetry config). How can I make it use the latest?