

I'm trying to use `SummaryIndex` via a TGIS server (rather than running the LLM locally), but LlamaIndex seems to be ignoring the TGIS predictor. Maybe I'm using this wrong?

Plain Text
service_context = ServiceContext.from_defaults(chunk_size=512,
                                               llm=tgis_predictor, 
                                               context_window=2048,
                                               prompt_helper=prompt_helper,
                                               embed_model=embed_model)

# Load data
documents = SimpleDirectoryReader('private-data').load_data()

index = SummaryIndex.from_documents(documents)
summary = index.as_query_engine(response_mode="tree_summarize").query("Summarize the text, describing what it might be most useful for")


but then it tries to download an HF model:
Plain Text
Downloading url https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_0.bin to path /tmp/llama_index/models/llama-2-13b-chat.ggmlv3.q4_0.bin
total size (MB): 7323.31


And it ultimately blows up my machine trying to run this model on CPU.
8 comments
Plain Text
tgis_predictor = LangChainLLM(
    llm=HuggingFaceTextGenInference(
        inference_server_url=inference_server_url,
        max_new_tokens=256,
        temperature=0.7,
        repetition_penalty=1.03,
        server_kwargs={},
    ),
)

FWIW
@Logan M you mentioned PromptHelper overrides other parameters -- does it override the LLM, too?
I see it takes a tokenizer argument
Actually, I removed PromptHelper and it still started to download the LLM
index = SummaryIndex.from_documents(documents, service_context=service_context)
πŸ€¦β€β™‚οΈ
There you go πŸ‘πŸ‘
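For anyone hitting this later: the root cause is that `from_documents` falls back to the global default service context when no `service_context` is passed, and that default resolves to a local LlamaCPP model (hence the multi-GB download). A toy sketch of that fallback pattern — this is not LlamaIndex's actual code, and the names are illustrative only:

```python
# Toy illustration of the optional-argument fallback that caused the download.
# NOT LlamaIndex's implementation; names are made up for illustration.

GLOBAL_DEFAULT_LLM = "local-llama-cpp"  # stands in for the auto-downloaded model


def resolve_llm(service_context=None):
    """An explicitly passed service context wins; otherwise fall back to the global default."""
    if service_context is not None:
        return service_context["llm"]
    return GLOBAL_DEFAULT_LLM


# Omitting the context silently selects the local default:
print(resolve_llm())                           # local-llama-cpp
# Passing it routes to the remote TGIS predictor:
print(resolve_llm({"llm": "tgis_predictor"}))  # tgis_predictor
```

So the fix above — passing `service_context=service_context` to `SummaryIndex.from_documents` — works because it makes the context explicit instead of relying on the global default.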