Hello everyone, I need some help. My LlamaIndex pipeline takes about 60 seconds to search and about 130 seconds to generate. In the profiling results, torch~~~ takes a long time. How can I solve this?
Thank you. The model is meta-llama/Meta-Llama-3-8B-Instruct. The retriever is based on ChromaVectorStore. The query engine is part of the LlamaIndex framework and includes a response synthesizer configured with ResponseMode.TREE_SUMMARIZE. The query engine is created from a VectorStoreIndex. The index is not very big: about 17 PDF files and 10 TXT files.
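In case it helps narrow things down, here is a minimal sketch of how the search and generation stages could be timed separately. It uses only the Python standard library. The functions `retrieve` and `synthesize` are hypothetical stand-ins, not LlamaIndex APIs; in the real pipeline they would be replaced by the ChromaVectorStore-backed retriever call and the TREE_SUMMARIZE response synthesizer call.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
timings = {}

@contextmanager
def timed(stage):
    """Add the elapsed wall-clock time of the with-block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + (time.perf_counter() - start)

# Hypothetical stand-in for the retrieval step (assumption, not a real API).
def retrieve(query):
    time.sleep(0.01)  # simulate vector search latency
    return ["chunk about " + query]

# Hypothetical stand-in for the generation step (assumption, not a real API).
def synthesize(query, chunks):
    time.sleep(0.02)  # simulate LLM generation latency
    return f"answer to {query!r} from {len(chunks)} chunk(s)"

def run_query(query):
    # Time each stage on its own so the slow one is easy to spot.
    with timed("retrieve"):
        chunks = retrieve(query)
    with timed("synthesize"):
        answer = synthesize(query, chunks)
    return answer

if __name__ == "__main__":
    print(run_query("example question"))
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds:.3f}s")
```

Timing the stages separately like this should show whether the 60 seconds of search is spent in the vector store lookup or somewhere else (for example computing the query embedding), and how much of the 130 seconds of generation comes from repeated model calls.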