Looking back at my records it looks like it was about $50 for one full run through of the directory. 😬
@arminta7 How large is your document database? By default we use OpenAI to create embeddings for all your document chunks, which can get expensive if your document set is large.
Now that we allow you to provide embeddings per document, I wonder if that would help you - if you're able to specify embeddings per document beforehand (perhaps using something from Hugging Face), we will use those embeddings instead in GPTSimpleVectorIndex.
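A rough sketch of what that per-document embedding flow might look like (untested; the `embedding=` argument on Document and the specific sentence-transformers model here are assumptions based on this exchange, not confirmed API - check the example notebooks for the exact field names):

```python
# Rough sketch (untested): precompute one embedding per document with a local
# Hugging Face model instead of calling OpenAI for every chunk. The
# `embedding=` argument on Document is an assumption based on the update
# described above.
from pathlib import Path

from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
from gpt_index import Document, GPTSimpleVectorIndex

model = SentenceTransformer("all-MiniLM-L6-v2")  # small model that runs locally for free

documents = []
for path in Path("my_directory").rglob("*.txt"):
    text = path.read_text()
    documents.append(
        Document(text, doc_id=str(path), embedding=model.encode(text).tolist())
    )

# With per-document embeddings supplied, the index can use them directly
# instead of requesting new embeddings from OpenAI at build time.
index = GPTSimpleVectorIndex(documents)
index.save_to_disk("index.json")
```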
~20k files. Largest file ~700k words.
Woah ok. Yeah that's a lot more than I've tested 😅.
Ideally I would index the entire directory, then just insert edited files daily.
You may be able to do that now, but I haven't fully added the documentation for it (the update is primarily in this tweet https://twitter.com/gpt_index/status/1610314250123358212?s=20&t=xd9noZLp2a2j-Ks5SvnJ1Q)
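A rough sketch of the build-once, insert-daily flow being discussed (untested; based on the insert update in the linked tweet, so check the docs for the exact API - `edited_files` is just a placeholder for however you track daily changes):

```python
# Rough sketch (untested): build the index once over the whole directory,
# then insert only the files that changed each day.
from gpt_index import Document, GPTSimpleVectorIndex, SimpleDirectoryReader

# One-time (expensive) build over the whole directory.
documents = SimpleDirectoryReader("my_directory").load_data()
index = GPTSimpleVectorIndex(documents)
index.save_to_disk("index.json")

# Daily update: reload the saved index and insert just the edited files.
index = GPTSimpleVectorIndex.load_from_disk("index.json")
edited_files = ["my_directory/changed_today.txt"]  # placeholder list
for path in edited_files:
    with open(path) as f:
        index.insert(Document(f.read(), doc_id=path))
index.save_to_disk("index.json")
```

One caveat: inserting a document whose doc_id is already in the index may just add new chunks rather than replacing the old ones, so re-edited files may need extra handling (or a periodic rebuild).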
Yes, I saw that right away! I just have to try to figure out how to use it lol. All of this is a bit over my knowledge level 😝
I'll try to come up with a few example notebook snippets so it's more easily usable
I really can't thank you enough for your thorough work and how helpful you've been. And adding functionality so quickly! You've made this experience a joy.