How could I go about speeding up the query then, besides a better CPU?
I did a little timing of the OpenAIAgent `_chat` function and got the following results:
1.93 sec for the initial
agent_chat_response = self._get_agent_response(mode=mode, **llm_chat_kwargs)
Afterwards we have
=== Calling Function ===
Calling function: query_engine_tool with args: {
"input": "some question..."
}
Got output: ....
which took 7.155 secs
And finally 6.54 sec for the agent_response with
current_func='auto'
which I assume is the GPT response itself; am I correct?
I can't speed that up, but the other half of the time is the index query, which can be sped up.
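For reference, the timings above were collected with a simple stopwatch wrapper along these lines; `time_call` is just a helper name I made up, not part of the LlamaIndex API, and in the real measurement it wrapped `self._get_agent_response(...)` and the query-engine call:

```python
import time

def time_call(label, fn, *args, **kwargs):
    """Run fn, print how long it took, and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f} sec")
    return result, elapsed

# Example with a placeholder step standing in for the agent call:
def placeholder_step():
    time.sleep(0.05)

_, elapsed = time_call("initial agent response", placeholder_step)
```

Summing the per-step numbers this way is what showed that roughly half the wall time is the LLM call and the other half is the index query.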