Антон
Joined September 25, 2024
Is it possible to skip downloading the model llama-2-13b-chat.Q4_0.gguf if I already have it locally?

Plain Text
service_context = ServiceContext.from_defaults(llm=llm,
                                               embed_model="local",
                                               chunk_size=chunk_size,
                                               context_window=context_window - 200,
                                               llm_predictor=llm
                                               )
documents = SimpleDirectoryReader(input_dir="C:/temp_my/text_embeddings").load_data()
# %%

response_synthesizer = get_response_synthesizer(response_mode='tree_summarize', use_async=True)


If I set llm_predictor to the same LLM, I get this error:
Plain Text
        if llm != "default":
            if llm_predictor is not None:
                raise ValueError("Cannot specify both llm and llm_predictor")
20 comments
Error: 'list' object has no attribute 'encode'
on this line:
response = chat_engine.chat("What is this book about?")

My code is from the tutorial https://gpt-index.readthedocs.io/en/latest/examples/chat_engine/chat_engine_react.html#get-started-in-5-lines-of-code
16 comments
Can I specify additional metadata on messages when calling chat_engine.chat("mymessage")? For example, user name, chat room name, date and time.

If I'm right, chat_engine automatically updates the index with new messages and stores it if persist is specified.
Or maybe I should somehow call chat_engine.index.update() and refresh()?
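chat() takes a plain string, so one workaround (my own, not a llama_index feature) is to fold the metadata into the message text before sending it:

```python
def with_metadata(message: str, **meta: str) -> str:
    """Prefix a chat message with metadata the engine can see.

    Hypothetical workaround: chat_engine.chat() accepts only a string,
    so user name, room, timestamp, etc. are folded into the text itself.
    """
    header = " | ".join(f"{k}={v}" for k, v in meta.items())
    return f"[{header}] {message}" if header else message
```

Usage: chat_engine.chat(with_metadata("mymessage", user="Anton", room="general")).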
1 comment
If I load the index from storage for faster startup, how do I update that index when there are new documents in the folder? Do I need to load the documents every time, check whether there are new ones, and refresh the index?
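The indexes keep a content hash per ref doc, which is what lets a refresh skip unchanged files: load the documents with filename_as_id=True and hand them to the index's refresh method (refresh_ref_docs in the versions I've seen; verify against yours), then persist again. The comparison itself boils down to something like this standalone sketch (a simplified re-implementation, not llama_index's code):

```python
import hashlib

def docs_to_refresh(stored_hashes: dict, incoming_docs: dict) -> list:
    """Return ids of documents that are new or whose text changed.

    stored_hashes: doc_id -> sha256 hex digest already in the index.
    incoming_docs: doc_id -> current text loaded from the folder.
    Only the returned ids need re-embedding before persisting again.
    """
    changed = []
    for doc_id, text in incoming_docs.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if stored_hashes.get(doc_id) != digest:
            changed.append(doc_id)
    return changed
```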
6 comments
The OpenOrca-Platypus2-13B model supports this chat template:
Plain Text
<|user|> <|user-message|><|end_of_turn|>\n<|bot|> <|bot-message|>\n


But the llama_index ReAct loop agent doesn't use this format. There are many \n characters and examples like '''Please use valid JSON.......
Observation: tool response....
...

Can it really be this difficult for a custom LLM? How can I fix it?
11 comments
How do I use this with Jupyter Notebook? I'm getting an error trying to load LlamaCPP:

Plain Text
UnsupportedOperation Traceback (most recent call last)
c:\LLMs\SharikAI\My_llama_index.py in line 5
33 callback_manager = CallbackManager([llama_debug])
34 context_window=3900
----> 35 llm = LlamaCPP(
36 model_path='C:/LLMs/oobabooga_windows/text-generation-webui/models/openorca-platypus2-13b.Q4_K_M.gguf',
37 temperature=0.3,
38 max_new_tokens=512,
39 context_window=context_window,
40 generate_kwargs={"top_p": 1,},
41 model_kwargs={"n_gpu_layers": 32, "n_batch": 512, "n_threads": 10},
42 # callback_manager = callback_manager,
43 verbose=False,
44 messages_to_prompt=messages_to_prompt, # The function to convert messages to a prompt
45 completion_to_prompt=completion_to_prompt, # The function to convert a completion to a prompt.
46 )

File c:\LLMs\SharikAI.venv\Lib\site-packages\llama_index\llms\llama_cpp.py:110, in LlamaCPP.init(self, model_url, model_path, temperature, max_new_tokens, context_window, messages_to_prompt, completion_to_prompt, callback_manager, generate_kwargs, model_kwargs, verbose)
105 raise ValueError(
106 "Provided model path does not exist. "
107 "Please check the path or provide a model_url to download."
108 )
109 else:
--> 110 self._model = Llama(model_path=model_path, **model_kwargs)
...
370 else:
371 msg = "fileno"
--> 372 raise io.UnsupportedOperation(msg)

UnsupportedOperation: fileno
2 comments
Why, when using index.as_chat_engine(..., chat_mode=ChatMode.BEST), does it not go into a ReAct loop but give only one short answer?

How do I merge the book index and chat together, so I can ask the LLM about the book in a chat manner?
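As I understand ChatMode.BEST (an assumption from reading the llama_index source; verify against your version), it prefers an OpenAI function-calling agent and otherwise falls back to a ReAct agent — and either agent may legitimately answer in a single turn without ever calling a tool, which looks like "no loop". Roughly:

```python
def pick_chat_mode(is_function_calling_llm: bool) -> str:
    """Roughly what ChatMode.BEST resolves to (an assumption, not the
    actual llama_index code): an OpenAI function-calling agent when the
    LLM supports it, otherwise a ReAct agent."""
    return "openai" if is_function_calling_llm else "react"
```

For a local model, forcing chat_mode="react" explicitly, or chat_mode="context" (which always retrieves from the index before answering), tends to be more predictable for "chat about the book" use.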
2 comments
What if the LLM doesn't want to use the query_engine_tool and gives me answers about some imaginary book, not the one I embedded into it?

Plain Text
Response: Observation: query_engine_tool response
Title: The Da Vinci Code (WRONG!!!)
Summary: This book is a thriller that follows Robert Langdon, a Harvard symbologist, as he unravels ancient secrets and solves codes to save the life of a British Royal Family member.

Main characters include:
  1. Robert Langdon - A Harvard professor of symbology.
....blabla


My code:
Plain Text
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")
data = SimpleDirectoryReader(input_dir="C:/temp_my/text_embeddings").load_data()
index = VectorStoreIndex.from_documents(data, service_context=service_context)
chat_engine = index.as_chat_engine(service_context=service_context, chat_mode="react", verbose=True)
response = chat_engine.chat("What this book is about? List the names of main characters. And tell the story short.")
print(response)


The directory contains one big .docx file converted from fb2 and translated to English with Google Translate.
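When a ReAct agent skips the tool, the answer comes from the model's prior knowledge, so the title and summary can be pure hallucination. One cheap guard (my own heuristic, not a llama_index API) is to check whether the answer shares distinctive vocabulary with the retrieved text — the retrieved chunks are exposed on response.source_nodes, if I recall the API correctly:

```python
def looks_grounded(answer: str, source_text: str, min_hits: int = 3) -> bool:
    """Crude hallucination check: require a few distinctive (long) words
    of the answer to also occur in the retrieved source text."""
    distinctive = {w.strip(".,!?\"'").lower() for w in answer.split() if len(w) > 6}
    hits = sum(1 for w in distinctive if w and w in source_text.lower())
    return hits >= min_hits
```

If this returns False (or source_nodes is empty), re-ask via index.as_query_engine(), which always retrieves before answering.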
4 comments