
Updated 2 years ago

I am trying to load an index from a JSON file

At a glance
I am trying to load an index from a JSON file with GPTSimpleVectorIndex.load_from_disk('/tmp/index.json'), and I noticed several embedding requests to OpenAI that I did not expect to be sent. Could someone help explain? Thanks very much.
Or maybe it's because of this line: index.query("What did the author do growing up?")? I thought that queried the local vector index.
6 comments
The query also needs to generate embeddings for the query text, so you will see some (very small) embedding usage when querying the index.

But all the embeddings for your document text are saved and re-used πŸ‘
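The split described above can be sketched with a toy stand-in for the real index. Everything here is hypothetical, fake_embed, the file layout, and the index structure are illustrative only, not the actual llama_index internals, but it shows which steps need embedding calls and which reuse saved data:

```python
import hashlib
import json
import os
import tempfile

def fake_embed(text):
    # Stand-in for an OpenAI embedding call: a deterministic hash-based vector
    h = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in h[:4]]

# Build and persist a toy "index": document embeddings are stored with the text,
# so embedding calls happen once, at build time
docs = ["Paul Graham grew up writing and programming."]
index = {"docs": [{"text": t, "embedding": fake_embed(t)} for t in docs]}
path = os.path.join(tempfile.gettempdir(), "toy_index.json")
with open(path, "w") as f:
    json.dump(index, f)

# Loading the index reuses the saved document embeddings: no embedding calls here
with open(path) as f:
    loaded = json.load(f)

# Querying still embeds the query text once, then compares it to the stored vectors,
# which is the small embedding request you see at query time
query_vec = fake_embed("What did the author do growing up?")
```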
@Logan M thank you. Is there any way to log the requests sent to OpenAI, if any?
@nullne You can set the logger to debug level to see what is sent to OpenAI:

Plain Text
import logging
import sys

# Route DEBUG-level logs, which include the requests sent to OpenAI, to stdout
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
There are 2 requests to text-davinci and one to text-embedding-ada-002-v2. The latter should be the small embedding for the question, but the former two requests are quite big, 4,064 tokens each, and only one line of code was executed: index.query("What did the author do growing up?")
@nullne text-embedding-ada-002 is used to embed the query text

The two requests to text-davinci-003 are to generate the natural language response.

You can try setting the chunk size limit to decrease token usage (the default chunk size is 4,000 tokens), but this can also affect the quality of the query results. You can set chunk_size_limit in the service context object.
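A minimal sketch of what that might look like, assuming the legacy llama_index API from this thread's era. The chunk_size_limit value of 512 is just an example, and ServiceContext.from_defaults plus the service_context keyword on load_from_disk may differ in other versions:

```python
# Sketch only: legacy llama_index API, names may differ in newer releases
from llama_index import GPTSimpleVectorIndex, ServiceContext

# Smaller chunks mean fewer tokens per LLM call, at some cost to answer quality
service_context = ServiceContext.from_defaults(chunk_size_limit=512)

index = GPTSimpleVectorIndex.load_from_disk(
    "/tmp/index.json", service_context=service_context
)
```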


Another option is switching to gpt-3.5-turbo, which is roughly 10x cheaper than text-davinci-003.
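A sketch of that switch, assuming the legacy llama_index plus langchain APIs of the same era (LLMPredictor and ChatOpenAI may be named or imported differently in newer releases):

```python
# Sketch only: era-specific llama_index + langchain wiring, not current APIs
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext

# Use gpt-3.5-turbo for response synthesis instead of text-davinci-003
llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
# Pass service_context when loading or building the index so queries use it
```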
Thank you! @Logan M