
Hello,
@kapa.ai I am using gpt-4 as the model in the below setup.
from pathlib import Path

from langchain.chat_models import ChatOpenAI
from llama_index import (
    GPTVectorStoreIndex,
    LLMPredictor,
    ServiceContext,
    download_loader,
)

llm_predictor = LLMPredictor(
    llm=ChatOpenAI(model_name="gpt-4", max_tokens=512, temperature=0.1))
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, chunk_size_limit=512)
UnstructuredReader = download_loader("UnstructuredReader", refresh_cache=True)
loader = UnstructuredReader()
document = loader.load_data(file=Path(path), split_documents=False)
index = GPTVectorStoreIndex.from_documents(document, service_context=service_context)
index.storage_context.persist()

It works fine.
However, when I check my OpenAI usage dashboard, it shows calls to the text-davinci API and not gpt-4. Any idea why?
@Logan M hi 😊 any tips on this?
Ah right. Are you loading the index from storage? You'll have to pass the service_context back in when loading from storage too

GPT-4 will only be hit during queries though, not index construction πŸ‘
@Logan M thanks for the insights
Okay - here is what the issue was in my code:
It was what you said + I was calling the wrong function.
Fixed now - thank you
@Logan M I have a follow up question:
When I have
query_engine = index.as_query_engine()
query = 'some query'
results = query_engine.query(query)
Can I specify which embedding model is used for the query? In general, what kinds of params can I pass to the as_query_engine() and query() methods?
Specify which embedding? Do you mean specifying a separate string to use for embeddings, and another string to use for the LLM response?

As for which options go into as_query_engine, it's a bit of a catchall bucket at the moment πŸ˜… you can specify almost anything in it (service_context, similarity_top_k, node postprocessors, templates, etc.)
@Logan M So when I check my OpenAI usage dashboard, I see calls to text-embedding-ada-002, as shown in the attached screenshot. I presume this is called by the .query() method?
Attachment
Screenshot_2023-05-22_at_12.10.06_PM.png
@Rouzbeh yea that's correct, this is the embedding model. That many requests/tokens looks like it also includes index construction

When query() is called, then it's only used to create an embedding for the query text
Thanks @Logan M