Hi guys, has anyone tried using the TreeIndex class? I used TreeIndex.from_documents to build the index (with transformations too). It parses and summarizes but does not embed. I tried many times with ChatGPT's help but couldn't get it to work. I used SimpleDirectoryReader to read a few documents and set up both Settings and ServiceContext. For the transformations, I set transformations = transformations_from_settings_or_context(Settings, service_context). Thanks!
I persisted the vector store to disk. The default_vectorstore is empty, but the docstore has content. Also, the progress bar showed parsing done and summarizing done, but no embedding. Plus it's way too fast; embedding usually takes a while. I used to just use DocumentSummaryIndex but want to use the standard TreeIndex this time. I found the code on GitHub and thought I had passed the right arguments: tree_index = TreeIndex.from_documents(documents, service_context=service_context, show_progress=True, transformations=transformations)
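One quick way to double-check the "vector store is empty, docstore has content" observation is to count how many actual values each persisted JSON file holds. This is only a minimal sketch using the standard library: the helper names, the file names, and the JSON keys in the demo are made-up stand-ins, not the real layout that the library writes.

```python
import json
import tempfile
from pathlib import Path


def count_leaves(value):
    """Count non-container values nested anywhere inside a parsed JSON value."""
    if isinstance(value, dict):
        return sum(count_leaves(v) for v in value.values())
    if isinstance(value, list):
        return sum(count_leaves(v) for v in value)
    return 1


def summarize_persist_dir(persist_dir):
    """Map each *.json file name in persist_dir to its leaf-value count."""
    report = {}
    for path in sorted(Path(persist_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        report[path.name] = count_leaves(data)
    return report


def demo():
    """Build a throwaway directory with hypothetical store files and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical file names and keys, chosen only for illustration.
        (root / "vector_store.json").write_text(
            json.dumps({"embedding_dict": {}}), encoding="utf-8"
        )
        (root / "docstore.json").write_text(
            json.dumps({"nodes": {"n1": {"text": "summary"}}}), encoding="utf-8"
        )
        return summarize_persist_dir(root)


if __name__ == "__main__":
    for name, leaves in demo().items():
        print(f"{name}: {leaves} stored values")
```

A file that reports zero stored values contains only empty containers, which matches the symptom of an index that never wrote any embeddings.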
Thanks Logan. Sorry about the late response. Yes, I see what you are saying. The indexing part of TreeIndex mainly does the summarization. I believe it doesn't necessarily need embeddings for retrieval either. I looked at the code on GitHub further. TreeIndex mainly provides a structural template. It has a lot of options, so you can do a lot with it, but it requires (and can handle) customization. If you want a simple "just do it" tool, then it's probably not the best choice. Since I was building a search bot that asks follow-up questions, which requires "memory", I ended up just using as_chat_engine (condense + context, OpenAI) instead, which worked out great, even without summarization. The built-in "memory" saves a lot of extra coding and works out really well. Anyway, thanks again for your prompt response.
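To illustrate the point that building the tree is mostly summarization and that retrieval can work without embeddings, here is a toy sketch in plain Python. It is not the library's implementation: `Node`, `toy_summarize`, `build_tree`, `overlap`, and `retrieve` are hypothetical names, the summarizer is a stand-in for an LLM call, and the word-overlap chooser stands in for an LLM picking which child to descend into.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    children: list = field(default_factory=list)


def toy_summarize(texts):
    """Stand-in for an LLM summary: keep the first three words of each text."""
    return " / ".join(" ".join(t.split()[:3]) for t in texts)


def build_tree(chunks, num_children=2, summarize=toy_summarize):
    """Group nodes bottom-up, summarizing each group into a parent node."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if num_children < 2:
        raise ValueError("num_children must be at least 2")
    level = [Node(chunk) for chunk in chunks]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), num_children):
            group = level[i:i + num_children]
            parents.append(Node(summarize([n.text for n in group]), group))
        level = parents
    return level[0]


def overlap(query, text):
    """Number of distinct lowercase words shared by query and text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(root, query):
    """Walk from the root to a leaf, following the best-matching child."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: overlap(query, child.text))
    return node.text


if __name__ == "__main__":
    chunks = [
        "cats purr when happy",
        "dogs bark at strangers",
        "python lists are mutable",
        "rust enforces ownership rules",
    ]
    tree = build_tree(chunks)
    print(retrieve(tree, "are python lists mutable"))
```

Nothing in this sketch computes a vector: the only expensive step at build time is the summarizer, and the query is answered by comparing it with summaries on the way down. That is consistent with seeing "parsing" and "summarizing" progress but no embedding step.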