sorry, a completely diff question... I'm migrating some postgres data over to a vector DB, and I basically have 2MB of data when exported to CSV.
- Each row in Postgres is basically 1-5 sentences, chunked at 512.
I embed each chunk as a 768-dimensional vector in Qdrant, and the disk usage on that is around 38 MB.
Is that typical? Do vector DBs just naturally take up way more space per record?
If a chunk is gonna be 768 dimensions regardless, then technically stuffing way bigger chunks that are pages long would be better from a disk usage perspective, right?
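Here's my rough back-of-envelope math on that, a minimal sketch assuming float32 vectors (4 bytes per dimension) and ignoring Qdrant's HNSW index / payload overhead; the chunk counts are just hypothetical:

```python
# Back-of-envelope: raw embedding storage, assuming float32 (4 bytes/dim)
# and ignoring Qdrant's index, payload, and WAL overhead.
DIMS = 768
BYTES_PER_FLOAT = 4

def raw_vector_bytes(num_chunks: int) -> int:
    """Raw storage for the vectors alone, before any index overhead."""
    return num_chunks * DIMS * BYTES_PER_FLOAT

# Hypothetical chunk counts: many small sentence-sized chunks vs. fewer page-sized ones.
for num_chunks in (10_000, 1_000):
    mb = raw_vector_bytes(num_chunks) / 1_000_000
    print(f"{num_chunks:>6} chunks -> ~{mb:.1f} MB of raw vectors")
```

So in that sketch the per-vector cost is fixed, and the number of chunks is what drives most of the footprint.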
But then I see LlamaIndex promoting advanced RAG techniques like Sentence Window retrieval, which makes me think it's better to chunk by sentence, but wouldn't that result in even more disk usage?
Is that just the nature of vector dbs?