is it possible to not download model

Is it possible to not download the model llama-2-13b-chat.Q4_0.gguf if I already have one?

Plain Text
service_context = ServiceContext.from_defaults(llm=llm,
                                               embed_model="local",
                                               chunk_size=chunk_size,
                                               context_window=context_window - 200,
                                               llm_predictor=llm
                                               )
documents = SimpleDirectoryReader(input_dir="C:/temp_my/text_embeddings").load_data()
# %%

response_synthesizer = get_response_synthesizer(response_mode='tree_summarize', use_async=True, )


If I set llm_predictor to the same LLM, I get this error:
Plain Text
        if llm != "default":
            if llm_predictor is not None:
                raise ValueError("Cannot specify both llm and llm_predictor")
You don't need to set llm_predictor; you can just set llm to point to your LLM with your downloaded model.
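For example, a minimal sketch assuming the pre-0.10 llama_index API used in this thread (the model path, context window, and chunk size below are placeholders for your own values):

Plain Text
# Point LlamaCPP at the GGUF file you already have on disk,
# so nothing new gets downloaded.
from llama_index import ServiceContext
from llama_index.llms import LlamaCPP

llm = LlamaCPP(
    model_path="C:/temp_my/models/my-llama2-13b-clone.gguf",  # placeholder path
    context_window=3900,
    max_new_tokens=256,
)

service_context = ServiceContext.from_defaults(
    llm=llm,               # set only llm; leave llm_predictor out
    embed_model="local",
    chunk_size=1024,
)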
It's already pointed there; llm=llm works okay for me.
response_synthesizer = get_response_synthesizer(response_mode='tree_summarize', use_async=True, )
but at this step it starts to download llama-2-13b-chat.Q4_0.gguf
I found it downloads because of this:

Plain Text
def resolve_llm(llm: Optional[LLMType] = None) -> LLM:
    """Resolve LLM from string or LLM instance."""
    if llm == "default":
        # return default OpenAI model. If it fails, return LlamaCPP
        try:
            llm = OpenAI()
        except ValueError as e:
            llm = "local"
            print(
                "******\n"
                "Could not load OpenAI model. Using default LlamaCPP=llama2-13b-chat. "
                "If you intended to use OpenAI, please check your OPENAI_API_KEY.\n"
                "Original error:\n"
                f"{e!s}"
                "\n******"
            )
but my LLM is a clone of llama2-13b-chat, and I don't want two models
Can get_response_synthesizer use my llama2-13b clone instead of downloading a new one?
I'm using orca_platypus
Set a global service context; otherwise you have to pass the service context in a lot of places:

Plain Text
from llama_index import set_global_service_context

set_global_service_context(service_context)
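With the global service context set, later calls that don't receive an explicit service_context fall back to it instead of building a default one (which is what triggers resolve_llm and the download). Roughly, assuming the same legacy API as above:

Plain Text
from llama_index import get_response_synthesizer, set_global_service_context

set_global_service_context(service_context)

# No service_context argument needed here; your local LLM is used,
# and nothing is downloaded.
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize", use_async=True
)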
Or I can specify it like this, right?

Plain Text
response_synthesizer = get_response_synthesizer(response_mode='tree_summarize', use_async=True, service_context=service_context)
So it will not download other models; it will use the llm specified in service_context.
Yeah, that works too.
Is this the right way to make a summary index when doing it for the first time?
Plain Text
storage_context = StorageContext.from_defaults(persist_dir='doc_summary_index')
doc_summary_index = load_index_from_storage(storage_context)
if not doc_summary_index:
    doc_summary_index = DocumentSummaryIndex.from_documents(
        documents,
        service_context=service_context,
        response_synthesizer=response_synthesizer,
        show_progress=True,
    )
    doc_summary_index.get_document_summary('Ranenie Zvezdy 4_ru')
    doc_summary_index.storage_context.persist('doc_summary_index')
Plain Text
if not doc_summary_index:
    ...  # build a new index here
Hmm, almost. I think one of the first two lines might fail if the index doesn't exist, but I'm not sure.
The tutorial shows how to make and save the index, but not how to choose between 'load' and 'make new'. Can you explain, please?
Just a try/except? Or check if the directory exists first?
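Either approach works; here's a rough sketch of the directory-check variant, reusing documents, service_context, and response_synthesizer from the snippets above:

Plain Text
import os

from llama_index import StorageContext, load_index_from_storage
from llama_index.indices.document_summary import DocumentSummaryIndex

persist_dir = "doc_summary_index"

if os.path.exists(persist_dir):
    # Already persisted on a previous run -- just reload it.
    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    doc_summary_index = load_index_from_storage(
        storage_context, service_context=service_context
    )
else:
    # First run: build the index and persist it for next time.
    doc_summary_index = DocumentSummaryIndex.from_documents(
        documents,
        service_context=service_context,
        response_synthesizer=response_synthesizer,
        show_progress=True,
    )
    doc_summary_index.storage_context.persist(persist_dir)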