I'm trying to use `SummaryIndex` via a TGIS server (rather than running the LLM locally), but llama_index seems to be ignoring the TGIS predictor. Maybe I'm using this wrong?
```python
from llama_index import ServiceContext, SimpleDirectoryReader, SummaryIndex

service_context = ServiceContext.from_defaults(
    chunk_size=512,
    llm=tgis_predictor,
    context_window=2048,
    prompt_helper=prompt_helper,
    embed_model=embed_model,
)

# Load data
documents = SimpleDirectoryReader('private-data').load_data()
index = SummaryIndex.from_documents(documents)
summary = index.as_query_engine(response_mode="tree_summarize").query(
    "Summarize the text, describing what it might be most useful for"
)
```
But when I run the query, it tries to download a Hugging Face model instead:
```
Downloading url https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_0.bin to path /tmp/llama_index/models/llama-2-13b-chat.ggmlv3.q4_0.bin
total size (MB): 7323.31
```
and it ultimately blows up my machine trying to run this model on CPU.
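Do I need to pass the `service_context` into the index explicitly? I'm guessing at the keyword here, something like:

```python
# Guess: hand the service_context to the index at build time so the
# TGIS predictor is actually used, instead of llama_index falling
# back to its default local LLM.
index = SummaryIndex.from_documents(documents, service_context=service_context)
```

Or is the service context supposed to be picked up globally somewhere?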