is it possible to not download model

Is it possible to not download the model llama-2-13b-chat.Q4_0.gguf if I already have one?

Plain Text
service_context = ServiceContext.from_defaults(llm=llm,
                                               embed_model="local",
                                               chunk_size=chunk_size,
                                               context_window=context_window - 200,
                                               llm_predictor=llm
                                               )
documents = SimpleDirectoryReader(input_dir="C:/temp_my/text_embeddings").load_data()
# %%

response_synthesizer = get_response_synthesizer(response_mode='tree_summarize', use_async=True, )


If I set llm_predictor to the same LLM, I get this error:
Plain Text
        if llm != "default":
            if llm_predictor is not None:
                raise ValueError("Cannot specify both llm and llm_predictor")
You don't need to set llm_predictor; you can just set llm to point to your LLM with your downloaded model.
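For example, a minimal sketch assuming the pre-0.10 llama_index API used in this thread (the model path, context window, and chunk size below are placeholders for your own values):

Plain Text
# Point LlamaCPP at the GGUF file you already have on disk,
# so nothing new gets downloaded.
from llama_index import ServiceContext
from llama_index.llms import LlamaCPP

llm = LlamaCPP(
    model_path="C:/temp_my/models/my-llama2-13b-clone.gguf",  # placeholder path
    context_window=3900,
    max_new_tokens=256,
)

service_context = ServiceContext.from_defaults(
    llm=llm,               # set only llm; leave llm_predictor out
    embed_model="local",
    chunk_size=1024,
)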
It's already pointed there; llm=llm works okay for me.
response_synthesizer = get_response_synthesizer(response_mode='tree_summarize', use_async=True, )
but at this step it starts to download llama-2-13b-chat.Q4_0.gguf
I found it downloads because of this:

Plain Text
def resolve_llm(llm: Optional[LLMType] = None) -> LLM:
    """Resolve LLM from string or LLM instance."""
    if llm == "default":
        # return default OpenAI model. If it fails, return LlamaCPP
        try:
            llm = OpenAI()
        except ValueError as e:
            llm = "local"
            print(
                "******\n"
                "Could not load OpenAI model. Using default LlamaCPP=llama2-13b-chat. "
                "If you intended to use OpenAI, please check your OPENAI_API_KEY.\n"
                "Original error:\n"
                f"{e!s}"
                "\n******"
            )
but my LLM is a clone of llama2-13b-chat, and I don't want two models
Can get_response_synthesizer use my llama2-13b clone instead of downloading a new one?
I'm using orca_platypus
Set a global service context; otherwise you have to pass the service context in a lot of places:

Plain Text
from llama_index import set_global_service_context

set_global_service_context(service_context)
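With the global service context set, later calls that don't receive an explicit service_context fall back to it instead of building a default one (which is what triggers resolve_llm and the download). Roughly, assuming the same legacy API as above:

Plain Text
from llama_index import get_response_synthesizer, set_global_service_context

set_global_service_context(service_context)

# No service_context argument needed here; your local LLM is used,
# and nothing is downloaded.
response_synthesizer = get_response_synthesizer(
    response_mode="tree_summarize", use_async=True
)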
Or I can specify it like this, right?

Plain Text
response_synthesizer = get_response_synthesizer(response_mode='tree_summarize', use_async=True, service_context=service_context)
So it will not download other models; it will use the llm specified in service_context.
Yeah, that works too.
Is this the right way to make a summary index when doing it for the first time?
Plain Text
storage_context = StorageContext.from_defaults(persist_dir='doc_summary_index')
doc_summary_index = load_index_from_storage(storage_context)
if not doc_summary_index:
    doc_summary_index = DocumentSummaryIndex.from_documents(
        documents,
        service_context=service_context,
        response_synthesizer=response_synthesizer,
        show_progress=True,
    )
    doc_summary_index.get_document_summary('Ranenie Zvezdy 4_ru')
    doc_summary_index.storage_context.persist('doc_summary_index')
Plain Text
if not doc_summary_index:
    ...  # build a new index here
Hmm, almost. I think one of the first two lines might fail if the index doesn't exist, but I'm not sure.
The tutorial shows how to make and save the index, but not how to choose between 'load' and 'make new'. Can you explain, please?
Just a try/except? Or check if the directory exists first?
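Either approach works; here's a rough sketch of the directory-check variant, reusing documents, service_context, and response_synthesizer from the snippets above:

Plain Text
import os

from llama_index import StorageContext, load_index_from_storage
from llama_index.indices.document_summary import DocumentSummaryIndex

persist_dir = "doc_summary_index"

if os.path.exists(persist_dir):
    # Already persisted on a previous run -- just reload it.
    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    doc_summary_index = load_index_from_storage(
        storage_context, service_context=service_context
    )
else:
    # First run: build the index and persist it for next time.
    doc_summary_index = DocumentSummaryIndex.from_documents(
        documents,
        service_context=service_context,
        response_synthesizer=response_synthesizer,
        show_progress=True,
    )
    doc_summary_index.storage_context.persist(persist_dir)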