Hey. I'm also using AWS Lambda, but apparently save_to_string takes a very long time to run, even on small text files. Any idea how this can be sped up?
Yeah, if you have a vector index, it's generating 1536-dimensional vectors for each chunk, so saving might take some time when you have a lot of chunks. Not sure how that can be sped up 🤔
If saving is going to be a common operation, it might be worthwhile to set up a third-party vector store (Pinecone, etc.). Then the vectors are never saved in the index itself.
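Something like this, roughly (a sketch assuming the pinecone-client v2 API; the API key, environment, and index name are placeholders):

```python
# Sketch: create a Pinecone index sized for OpenAI embeddings.
# Assumes pinecone-client v2; "quickstart" and the credentials are placeholders.
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")

# Dimension must match the embedding model (1536 for text-embedding-ada-002)
pinecone.create_index("quickstart", dimension=1536, metric="cosine")
pinecone_index = pinecone.Index("quickstart")
```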
I guess I'm getting a huge JSON when getting the embeddings from OpenAI. If using Pinecone, don't I need to somehow convert this JSON to a string before saving it in the DB? Where would the vectors be saved?
I'm pretty sure that when using a store like Pinecone, it can send the raw numbers rather than converting them to a string (type conversion is probably what's slowing things down, if I had to guess), and this is all handled under the hood.
@yoelk with Pinecone, you don't need to call save_to_disk or save_to_string; when you add documents to the Pinecone index, they'll automatically be stored in the Pinecone backend.
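Roughly like this (a sketch assuming the older llama_index API that shipped GPTPineconeIndex; the data path and index name are placeholders):

```python
# Sketch: build a llama_index index backed by Pinecone, so vectors live in
# Pinecone and there's nothing to serialize with save_to_string afterwards.
# Assumes the older llama_index API with GPTPineconeIndex and pinecone-client v2.
import pinecone
from llama_index import GPTPineconeIndex, SimpleDirectoryReader

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
pinecone_index = pinecone.Index("quickstart")  # index created as in the earlier sketch

documents = SimpleDirectoryReader("./data").load_data()  # "./data" is a placeholder

# Adding documents upserts their embeddings straight into the Pinecone backend
index = GPTPineconeIndex(documents, pinecone_index=pinecone_index)

response = index.query("What does the document say about X?")
print(response)
```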