```py
# Imports added for completeness; these paths match the pre-0.10 llama_index / langchain APIs used here
from llama_index import (
    LangchainEmbedding,
    LLMPredictor,
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
)
from langchain.llms import LlamaCpp
from langchain.embeddings import HuggingFaceEmbeddings

# Local GGML model served through llama.cpp
llm = LlamaCpp(
    model_path=r'C:\Users\UserAdmin\Desktop\vicuna\Wizard-Vicuna-30B-Uncensored.ggmlv3.q2_K.bin',
    verbose=False,
    n_ctx=2048,
    n_gpu_layers=55,
    n_batch=512,
    n_threads=11,
    temperature=0.65,
)

# Local sentence-transformers model for embeddings
embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(
        model_name=r".\all-mpnet-base-v2",
        model_kwargs={'device': 'cuda'},
    )
)

llm_predictor = LLMPredictor(llm=llm)

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embed_model,
    chunk_size=200,
)

# Load the PDFs, embed them, and build a vector index
documents = SimpleDirectoryReader(r'.\data\pdfs').load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# QA_TEMPLATE is a custom prompt template defined elsewhere
query_engine = index.as_query_engine(text_qa_template=QA_TEMPLATE)
```
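
For reference, the engine then gets queried like this (just a sketch; the question string is a placeholder):

```py
# Hypothetical query -- replace with a real question about the PDFs
response = query_engine.query("What is the main topic of these documents?")
print(response)
```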
9 comments
It uses CUDA by default anyway, so specifying the device won't do much 😅

Assuming you have the room, you can try increasing the embedding batch size (the default is 10):

```py
LangchainEmbedding(HuggingFaceEmbeddings(...), embed_batch_size=20)
```


I don't actually know for sure if this works with huggingface embeddings, but worth a shot!
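
In the setup above, that would look roughly like this (just a sketch, and 20 is an arbitrary value to experiment with):

```py
# Sketch: embed_batch_size is passed to the LangchainEmbedding wrapper,
# not to HuggingFaceEmbeddings itself
embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(
        model_name=r".\all-mpnet-base-v2",
        model_kwargs={'device': 'cuda'},
    ),
    embed_batch_size=20,  # default is 10
)
```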
Yeah, I'm not sure there are any noticeable positive changes from that.
Somehow loading the documents took 10% longer hahaha
Thank you so much though
I think there's just no way to make it more efficient, I guess?
I noticed an appreciable decrease in loading time for documents AND indexing if I increase the chunk size. Are there any negative repercussions for that?
I'm planning to do semantic search over a set of documents.
The only issue I can think of is that if I increase the chunk size, it might go beyond the context size?
llama-index will try to ensure that doesn't happen, but I wouldn't push the chunk size any higher than ~3000

Larger chunk sizes will also mean longer response times though πŸ€”
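
If you do want to try a bigger chunk size, it's just the chunk_size argument on the service context, something like this (the value is only an example):

```py
# Sketch: bigger chunks mean fewer embedding calls, but each retrieved chunk
# takes up more of the 2048-token llama.cpp context window at query time
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embed_model,
    chunk_size=1024,  # example value; per the advice above, stay under ~3000
)
```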