
Hello,
@kapa.ai I am using gpt-4 as the model in the below setup.
from pathlib import Path

from langchain.chat_models import ChatOpenAI
from llama_index import (
    GPTVectorStoreIndex,
    LLMPredictor,
    ServiceContext,
    download_loader,
)

llm_predictor = LLMPredictor(
    llm=ChatOpenAI(model_name="gpt-4", max_tokens=512, temperature=0.1))
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, chunk_size_limit=512)
UnstructuredReader = download_loader("UnstructuredReader", refresh_cache=True)
loader = UnstructuredReader()
document = loader.load_data(file=Path(path), split_documents=False)
index = GPTVectorStoreIndex.from_documents(document, service_context=service_context)
index.storage_context.persist()

It works fine.
However, when I check my OpenAI usage dashboard, it shows calls to the text-davinci API and not gpt-4. Any idea why?
@Logan M hi 😊 any tips on this?
Ah right. Are you loading the index from storage? You'll have to pass the service_context back in when loading from storage too

GPT-4 will only be hit during queries though, not index construction πŸ‘
@Logan M thanks for the insights
Okay - here is what the issue was in my code:
It was what you said + I was calling the wrong function.
Fixed now - thank you
@Logan M I have a follow up question:
When I have
query_engine = index.as_query_engine()
query = 'some query'
results = query_engine.query(query)
Can I specify which embedding model is used for the query? In general, what kinds of params can I pass to the as_query_engine() and query() methods?
Specify which embedding? Do you mean specifying a separate string to use for embeddings, and another string to use for the LLM response?

As for which options go into as_query_engine, it's a bit of a catchall bucket at the moment πŸ˜… you can specify almost anything in it (service_context, similarity_top_k, node postprocessors, templates, etc.)
@Logan M So when I check my OpenAI usage dashboard, I see calls to text-embedding-ada-002, as shown in the attached screenshot. I presume this is called by the .query() method?
Attachment
Screenshot_2023-05-22_at_12.10.06_PM.png
@Rouzbeh yea that's correct, this is the embedding model. That many requests/tokens looks like it also includes index construction

When query() is called, then it's only used to create an embedding for the query text
Thanks @Logan M