Any experience with using LlamaIndex in production? I'd like people to upload their own data and then query it.
My concern is that loading all that data into memory is infeasible. Is there an on-disk-only setting that prevents the process from loading all indexes into memory and using too much of it?
I have little AI/ML experience, though I've read about how LlamaIndex works under the hood. But I don't know whether it's possible to use a SQL database (Postgres) or Elasticsearch as a backend for storing and querying indexes.
I'm building a production-scale web server that will parse multiple files, each belonging to a different customer, and query them. Having all of them held in memory during querying is scary.