hey @unbittable, just curious about the use case you have for saving to string
I'm basically serializing indices into a database
it's turned out to involve jumping through a lot of hoops
llama does some cool things, but it feels a bit like it's designed for either scripting or heavyweight data management rather than web applications -- designed to recalculate everything each time through
ah gotcha, our plan is to directly support saving to the database (you just specify the connection)
I've got an ORM it's got to play nice with
saving/loading directly with the DB connection would create issues with transactions and other data munging that should be happening in the same operation
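(To make the transaction concern concrete, here's a minimal sketch -- schema and names are hypothetical, and it uses sqlite3 directly rather than an ORM. The point is that a save-to-string API lets the serialized index ride in the same transaction as the rest of the writes:)

```python
import json
import sqlite3

# Hypothetical schema: documents plus their serialized index, written together.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE indices (doc_id INTEGER, payload TEXT)")

def save_document(conn, doc_id, title, index_payload):
    """Write the document row and its serialized index atomically.

    `index_payload` stands in for whatever serialize-to-string produces;
    if either insert fails, both roll back together. A library that writes
    straight to its own connection couldn't join this transaction.
    """
    with conn:  # one transaction for both writes
        conn.execute("INSERT INTO documents VALUES (?, ?)", (doc_id, title))
        conn.execute(
            "INSERT INTO indices VALUES (?, ?)",
            (doc_id, json.dumps(index_payload)),
        )

save_document(conn, 1, "report.pdf", {"nodes": ["..."]})
row = conn.execute("SELECT payload FROM indices WHERE doc_id = 1").fetchone()
print(json.loads(row[0]))
```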
gotcha, are you mostly building an index over a few documents at a time? And then saving the serialized string, like you were doing previously?
yeah, I'm going to have a large and expanding library of documents and the user will select a few to perform an operation on. I'll hydrate the index for each, compose a graph, and then run some queries / prompts against that.
Llama is awesome because it's making this possible
but it's also really difficult to set some things up (ex: the defaults for service context are a little painful, especially to override for automated testing without hitting the OpenAI APIs)
gotcha, this is valuable feedback
ya totally, testing is a pain right now. We are setting up a better way to mock it
It's looking like I might have to drop down a level of abstraction and only store the embeddings, then build the index from those each time. It'll mean some extra processing, but it'll let me skip saving the indices to strings.
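(A minimal sketch of that fallback, with pure-Python cosine similarity standing in for the real vector index -- row shapes and names are hypothetical:)

```python
import math

# Hypothetical stored rows: (doc_id, embedding) persisted in the DB
# instead of a serialized index; the index is rebuilt from these on load.
stored = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("c", [0.7, 0.7]),
]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def build_index(rows):
    """The 'index' here is just the rows in memory; rebuilding is a cheap
    pass over stored embeddings rather than deserializing index state."""
    return list(rows)

def query(index, vector, top_k=1):
    ranked = sorted(index, key=lambda r: cosine(r[1], vector), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

index = build_index(stored)
print(query(index, [1.0, 0.1]))  # closest stored embedding wins
```

The trade-off is exactly the one described above: a bit of recomputation on every load in exchange for never depending on the library's serialization format.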
A little frustrating when I already wrote working code against 0.5.27, so I have to decide what to do about that.
but hey, a moving target is better than an unmaintained one
yea I totally get that, thanks for the patience haha
thanks for all your work supporting us crotchety, demanding developers!
It's not hard to add serialize-to-string back, let me take a look
the way I'm doing it right now is a bit hacky because I can't figure out exactly how the (now-proliferating) classes get assembled into the VectorStoreIndex, so I'm just punching through them to get directly to the KVstore
(and also all that dependency injection gets a little tiresome when there are so many layers)
ya as you said, there are different use cases that we want to support: 1) data management/persistent data/etc with hundreds or thousands of documents, 2) adhoc/app workflows dealing with a few documents at a time
so we get stretched a bit in both directions.
Do you anticipate continued API volatility at this rate? I know it's a hazard of pre-1.0 software, but having some idea of what's coming up would help
def not as drastic as the previous change haha
but ya trying to balance moving forward and not being too volatile
Yes, saving to string to store in database is definitely needed.