Hi, I have a question about whether I'm on the right path: I have a large JSON array that I want to query with natural-language questions. I built a prototype with VectorStoreIndex, which worked well until I loaded in all the JSON data. Then it got really slow, and each query takes more than 8 minutes. Am I on the right path, or should I look at another solution for this problem?
Did you log which part was taking that long? The setup I've been using for JSON is to parse each object into its own node and turn each JSON field into a metadata field (still using VectorStoreIndex). That approach has been working really well and is quick.
```python
from llama_index.core import VectorStoreIndex
from llama_index.readers.json import JSONReader

documents = JSONReader(is_jsonl=True).load_data("data.jsonl")
index = VectorStoreIndex.from_documents(documents)
```
So I see in the log that each line in my JSONL file is now treated as its own node. How do I add the metadata from each JSON field to it? Is there an example somewhere?
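A minimal sketch of the field-to-metadata split described above, using only the stdlib `json` module (the sample lines and field names are made up for illustration; with LlamaIndex installed, each dict would instead be wrapped as `Document(text=..., metadata=...)` and passed to `VectorStoreIndex.from_documents`):

```python
import json

# Hypothetical sample lines standing in for data.jsonl (assumes flat objects).
lines = [
    '{"title": "Widget A", "price": 9.99, "desc": "A small widget"}',
    '{"title": "Widget B", "price": 19.99, "desc": "A large widget"}',
]

docs = []
for line in lines:
    obj = json.loads(line)
    # Keep one field as the searchable text; promote the rest to metadata.
    text = obj.pop("desc")
    docs.append({"text": text, "metadata": obj})
    # With LlamaIndex this would be:
    #   docs.append(Document(text=text, metadata=obj))

print(docs[0]["metadata"])  # {'title': 'Widget A', 'price': 9.99}
```

The metadata fields can then be used for filtered retrieval instead of relying on embeddings alone, which is part of why that setup stays quick.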
A second thing that made my implementation slow: I was reloading the VectorStoreIndex for every query. Now I keep my console running and load the VectorStoreIndex only once. I think this was the main performance improvement.