Hey yeah that's a good question. This depends on the type of index used, which affects costs both during initial index construction and at query time.
I have a "Cost Analysis" page here that hopefully should help give you an initial overview, but I just realized this doesn't contain any info about vector store indices (I can take a TODO):
https://gpt-index.readthedocs.io/en/latest/how_to/cost_analysis.html. Vector store indices (including GPTSimpleVectorIndex), embed document chunks during index construction ($$ comes from OpenAI embedding API). During query time we fetch the top-k neighbors (cost is 0), and then uses an LLM call to synthesize an answer (cost is number of tokens)
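To make that concrete, here's a rough sketch of where each cost shows up, assuming the GPTSimpleVectorIndex API as above (the "data" directory, the query string, and the top-k value are just placeholders):

```python
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load documents from a local folder ("data" is a placeholder path)
documents = SimpleDirectoryReader("data").load_data()

# Index construction: every chunk gets embedded via the OpenAI
# embedding API -- this is where the build-time $$ comes from
index = GPTSimpleVectorIndex(documents)

# Query time: the top-k neighbor lookup itself is free; the cost is
# the single LLM call that synthesizes the answer (billed per token)
response = index.query("What does the author say about X?", similarity_top_k=3)
print(response)
```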
My recommendation: if you have larger sets of documents, use a vector store index. It is the most easily scalable and the cheapest option. If you have additional data schema needs that a vector store index wouldn't support, let me know and I can help with that