
Updated 2 years ago

Cost analysis

At a glance
The post asks how costs (based on tokens) scale during indexing and querying. The community members explain that costs depend on the type of index used: vector store indices (such as GPTSimpleVectorIndex) are recommended for larger sets of documents because they are the most scalable and cheapest. A "Cost Analysis" page is mentioned as an overview, and the suggestion is to use a vector store index for larger document sets, or to reach out for help with any data schema needs a vector store index wouldn't support.
Hi, I'm wondering how the costs (based on tokens) scale.
After the initial indexing, what will be the main drivers of costs:
  • The total size of the initial document(s)
  • The "generality" of the prompt (if the prompt must check a lot of different nodes)
  • How many factors of child branches
  • Etc.
For example, if I have a really long text and set a child factor of 2, are my costs going to explode as the index tries to find relevance across a lot of different nodes?
Hey yeah that's a good question. This depends on the type of index used, which affects costs both during initial index construction as well as querying.

I have a "Cost Analysis" page here that hopefully should help give you an initial overview, but I just realized this doesn't contain any info about vector store indices (I can take a TODO): https://gpt-index.readthedocs.io/en/latest/how_to/cost_analysis.html. Vector store indices (including GPTSimpleVectorIndex) embed document chunks during index construction ($$ comes from the OpenAI embedding API). During query time we fetch the top-k neighbors (cost is 0), and then use an LLM call to synthesize an answer (cost is the number of tokens).
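To make the cost model above concrete, here's a minimal back-of-the-envelope sketch (plain Python, not the LlamaIndex API): construction pays the embedding price once per document token, and each query pays the LLM price for the retrieved chunks plus the question and generated answer. The per-1K-token prices are placeholder assumptions, not current OpenAI pricing.

```python
# Assumed placeholder prices (USD per 1K tokens) -- check OpenAI's pricing page.
EMBED_PRICE_PER_1K = 0.0001
LLM_PRICE_PER_1K = 0.002

def build_cost(doc_tokens: int) -> float:
    """Index construction: every document chunk is embedded exactly once."""
    return doc_tokens / 1000 * EMBED_PRICE_PER_1K

def query_cost(top_k: int, chunk_tokens: int,
               question_tokens: int, answer_tokens: int) -> float:
    """Query: the top-k vector lookup itself is free; the single LLM call
    pays for the retrieved chunks plus the question and the answer."""
    prompt_tokens = top_k * chunk_tokens + question_tokens
    return (prompt_tokens + answer_tokens) / 1000 * LLM_PRICE_PER_1K

# Example: a 100K-token corpus, queried with top-2 retrieval of 512-token chunks.
print(f"build: ${build_cost(100_000):.4f}")          # one-time embedding cost
print(f"query: ${query_cost(2, 512, 50, 256):.4f}")  # per-query LLM cost
```

The key takeaway matches the comment above: build cost grows with total corpus size, while per-query cost is bounded by top-k times the chunk size, not by how large the corpus is.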

My recommendation: if you have larger sets of documents, use a vector store index https://gpt-index.readthedocs.io/en/latest/how_to/cost_analysis.html. It is the most easily scalable and cheapest. If you have additional data schema needs that a vector store index wouldn't support, let me know and I can help with that.
I think you meant this for the second link:
https://gpt-index.readthedocs.io/en/latest/how_to/vector_stores.html

But thanks a lot, I didn't know about the cost analysis tool, that will help. And I'll let you know about the vector store index, thanks Jerry!
Oops yeah you're right πŸ™‚
yes let me know!