=== Calling Function ===
Calling function: Wiki_Tool with args: {"input":"smoother braking"}
Got output: Error: (sqlalchemy.dialects.postgresql.asyncpg.InterfaceError) <class 'asyncpg.exceptions._base.InterfaceError'>: cannot perform operation: another operation is in progress
[SQL: SELECT public.data_wiki_docs.id, public.data_wiki_docs.node_id, public.data_wiki_docs.text, public.data_wiki_docs.metadata_, public.data_wiki_docs.embedding <=> $1 AS distance FROM public.data_wiki_docs ORDER BY distance asc LIMIT $2::INTEGER]
[parameters: ('[-0.04321499168872833,-0.008070970885455608,0.038734838366508484,0.034068379551172256,-0.03698354959487915,0.08398095518350601,-0.007821937091648579, ... (7816 characters truncated) ... -0.027220504358410835,-0.04467884078621864,0.007395491935312748,-0.04819626361131668,0.009278454817831516,0.012993157841265202,-0.007883192971348763]', 5)]
(Background on this error at: https://sqlalche.me/e/20/rvf5)
========================
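That "another operation is in progress" error from asyncpg usually means two coroutines are using the same connection at the same time (for example, the agent firing the tool while another query is still awaiting). Not a confirmed diagnosis for this setup, but a minimal sketch of serializing the calls; query_wiki_docs is a made-up wrapper name:

import asyncio

# Hypothetical wrapper: allow only one vector-store query at a time on the shared
# async connection (assumes the concurrent use is what triggers the InterfaceError).
_db_lock = asyncio.Lock()

async def query_wiki_docs(query_fn, *args, **kwargs):
    async with _db_lock:
        return await query_fn(*args, **kwargs)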
Is there a way to speed up the "Generating embeddings" sections? Right now it can take up to 15 min before the next set of embeddings is generated, which is making the whole process take up to 48 hours. This is using a hybrid Qdrant vector store setup. I'm on an SSD, btw.

device = "cuda" if torch.cuda.is_available() else "cpu"
print("GPU available:", torch.cuda.is_available())

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5", device=device)
#Settings.chunk_size = 512

qdrantclient = qdrant_client.QdrantClient(path="./qdrant_db")

'''DISCORD DATA'''
print("Loading local files...")
dir_path = 'DiscordDocs'
reader = SimpleDirectoryReader(input_dir=dir_path, required_exts=[".txt"])
discord_docs = reader.load_data()
print("Local files loaded successfully. Setting up vector store for Discord data...")
discord_vector_store = QdrantVectorStore(client=qdrantclient, enable_hybrid=True, batch_size=20, collection_name="discord-data")
discord_storage_context = StorageContext.from_defaults(vector_store=discord_vector_store)
discord_index = VectorStoreIndex.from_documents(discord_docs, storage_context=discord_storage_context, show_progress=True)
print("Discord data setup complete.")
GPU available: True
Loading local files...
Local files loaded successfully. Setting up vector store for Discord data...
Fetching 5 files: 100%|██████████| 5/5 [00:00<?, ?it/s]
Fetching 5 files: 100%|██████████| 5/5 [00:00<?, ?it/s]
Parsing nodes: 100%|██████████| 111/111 [03:59<00:00, 2.16s/it]
Generating embeddings: 100%|██████████| 2048/2048 [00:15<00:00, 131.86it/s]
Generating embeddings: 100%|██████████| 2048/2048 [00:13<00:00, 151.84it/s]
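Not a definitive fix, but if the dead time sits between those "Generating embeddings" bars, two things worth checking: the dense embedding batch size, and the fact that enable_hybrid=True also computes sparse embeddings as a separate pass on top of the dense ones. A sketch, assuming the rest of the setup above stays the same (256 is just an illustrative value):

# Larger dense-embedding batches keep the GPU busier between progress bars.
# embed_batch_size is a standard argument on LlamaIndex embedding models.
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    device=device,
    embed_batch_size=256,
)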
An error occurred: <MessageRole.MODEL: 'model'>
async def fetch_reply_chain(message, max_tokens=4096):
    context = []
    tokens_used = 0
    current_prompt_tokens = len(message.content) // 4
    max_tokens -= current_prompt_tokens
    while message.reference is not None and tokens_used < max_tokens:
        try:
            message = await message.channel.fetch_message(message.reference.message_id)
            role = Role.MODEL if message.author.bot else Role.USER
            message_content = f"{message.content}\n"
            message_tokens = len(message_content) // 4
            if tokens_used + message_tokens <= max_tokens:
                context.append(HistoryChatMessage(message_content, role))
                tokens_used += message_tokens
            else:
                break
        except Exception as e:
            print(f"Error fetching reply chain message: {e}")
            break
    return context[::-1]
memory = ChatMemoryBuffer.from_defaults(token_limit=8192)
context = await fetch_reply_chain(message)
memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    similarity_top_k=2,
    sparse_top_k=12,
    vector_store_query_mode="hybrid",
    memory=memory,
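The <MessageRole.MODEL: 'model'> in that error suggests the history is being sent with a 'model' role that the downstream chat API may not accept; that's a guess from the error string, not confirmed. If that's it, a sketch of mapping bot messages to the assistant role inside fetch_reply_chain (assumes HistoryChatMessage ultimately wraps a llama_index ChatMessage):

from llama_index.core.llms import MessageRole

# Sketch: OpenAI-style chat APIs know 'assistant', not 'model'.
role = MessageRole.ASSISTANT if message.author.bot else MessageRole.USER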
I'm using WholeSiteReader, and I was wondering, since I can't find anything in the code, if anyone knows a way to filter out parts of a site. The docs don't seem to show anything for filters, but I'm hoping someone knows a way to add one through other means.
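I don't know of a built-in filter argument on WholeSiteReader either. One workaround (a sketch, not a documented feature; the "URL" metadata key, the prefix, and the exclusion rule are all illustrative assumptions) is to load everything and drop unwanted documents before indexing:

from llama_index.core import VectorStoreIndex
from llama_index.readers.web import WholeSiteReader

reader = WholeSiteReader(prefix="https://example.com/docs", max_depth=3)
docs = reader.load_data(base_url="https://example.com/docs")

# Post-filter: keep only the pages you actually want before building the index.
wanted = [d for d in docs if "/changelog/" not in d.metadata.get("URL", "")]
index = VectorStoreIndex.from_documents(wanted)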
I've tried CondensePlusContextChatEngine and CondenseQuestionChatEngine, and neither one works for retrieving info. I made sure to try setting the retriever and the query_engine for both. I know it's getting the prompt and the memory, but it's not searching the info.

client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data")
Settings.llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8192)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = CondensePlusContextChatEngine.from_defaults(
                    retriever=index.as_retriever(),
                    memory=memory,
                    context_prompt=(
                        "prompt"
                    )
                )
                chat_response = await asyncio.to_thread(chat_engine.chat, content)
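One way to narrow down whether it's the retrieval or the chat-engine wiring (just a debugging sketch, not a fix) is to hit the retriever directly, outside the chat engine:

# If this comes back empty, the problem is the index/collection, not CondensePlusContextChatEngine.
retriever = index.as_retriever(similarity_top_k=2)
nodes = retriever.retrieve("a question the docs should definitely answer")
for n in nodes:
    print(n.score, n.node.get_content()[:200])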
I'm getting an error with GithubRepositoryReader:

GithubClient.get_branch() got an unexpected keyword argument 'timeout'

I was looking at GithubRepositoryReader and saw FilterType was added back, so I updated, and now I'm getting this error.
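Not a confirmed fix, but the unexpected 'timeout' keyword smells like a version skew between the reader and the GithubClient it calls. A quick sketch to compare what's actually installed (upgrading core and the GitHub reader integration together, e.g. pip install -U llama-index-core llama-index-readers-github, is the usual next step):

from importlib.metadata import version

# Check for a mismatch between the core package and the GitHub reader integration.
print("llama-index-core:", version("llama-index-core"))
print("llama-index-readers-github:", version("llama-index-readers-github"))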
I'm trying to use gpt-4-turbo-preview, but it's not listed as an option, and there's also no option for gpt-4-0125-preview. Is there a way around this? Or are we stuck with gpt-4-0613-preview?

ValueError: ******
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI. Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization. API keys can be found or created at https://platform.openai.com/account/api-keys
To disable the LLM entirely, set llm=None.
******
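That ValueError just means LlamaIndex fell back to the default OpenAI LLM and couldn't find a key. A minimal sketch of the two options the message itself suggests (the key value is a placeholder; Settings is the 0.10+ spelling, the legacy equivalent is ServiceContext.from_defaults(llm=None)):

import os
from llama_index.core import Settings

# Option 1: provide the key before any LlamaIndex objects are built.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder

# Option 2: if you don't want OpenAI at all, disable the default LLM.
# Settings.llm = None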
I'm trying to set up a Multi-Step Query:

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

# Index setup
PERSIST_DIR = "storage-data"
if not os.path.exists(PERSIST_DIR):
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context, service_context=service_context)

query_engine = index.as_query_engine(response_mode="compact_accumulate")

# Multi-step query engine setup
step_decompose_transform = StepDecomposeQueryTransform(llm=llm, verbose=True)
multi_step_query_engine = MultiStepQueryEngine(
    query_engine=query_engine,
    query_transform=step_decompose_transform,
    index_summary="Index summary for context"
)

@app.get("/", response_class=HTMLResponse)
async def get_form(request: Request):
    return templates.TemplateResponse("index.html", {"request": request})

@app.post("/query")
async def query(user_input: str = Form(...)):
    response = multi_step_query_engine.query(user_input)
    response_text = str(response)
    return {"response": response_text}
I tried step_decompose_transform = StepDecomposeQueryTransform(service_context=service_context), but that gave me an error about it not expecting that argument.

Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\utils.py", line 29, in resolve_llm
    validate_openai_api_key(llm.api_key)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\openai_utils.py", line 379, in validate_openai_api_key
    raise ValueError(MISSING_API_KEY_ERROR_MESSAGE)
ValueError: No API key found for OpenAI. Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization. API keys can be found or created at https://platform.openai.com/account/api-keys

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "s:\local-indexer\flask_server.py", line 53, in <module>
    index = load_index_from_storage(storage_context)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\loading.py", line 33, in load_index_from_storage
    indices = load_indices_from_storage(storage_context, index_ids=index_ids, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\loading.py", line 78, in load_indices_from_storage
    index = index_cls(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\vector_store\base.py", line 52, in __init__
    super().__init__(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\base.py", line 62, in __init__
    self._service_context = service_context or ServiceContext.from_defaults()
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\service_context.py", line 178, in from_defaults
    llm_predictor = llm_predictor or LLMPredictor(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llm_predictor\base.py", line 109, in __init__
    self._llm = resolve_llm(llm)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\utils.py", line 31, in resolve_llm
    raise ValueError(
ValueError: ******
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI. Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization. API keys can be found or created at https://platform.openai.com/account/api-keys
To disable the LLM entirely, set llm=None.
******
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
set_global_tokenizer(
    AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf").encode
)
model_url = "{url}"
llm = LlamaCPP(
    model_url=model_url,
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    generate_kwargs={},
    model_kwargs={"n_gpu_layers": 41},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
)
PERSIST_DIR = "storage-data"
if not os.path.exists(PERSIST_DIR):
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
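In that traceback the failure happens inside load_index_from_storage(storage_context): with nothing else passed, it builds a default ServiceContext and therefore tries to resolve an OpenAI LLM. Under the legacy ServiceContext API shown in the snippet, the likely fix (a sketch, untested here) is to hand it the local-LLM service context when reloading:

# Pass the local-LLM service_context so the loader doesn't fall back to
# ServiceContext.from_defaults() and go looking for an OpenAI key.
index = load_index_from_storage(storage_context, service_context=service_context)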
D:\Documents\GitHub\DockerTest\Scripts\python.exe D:\Documents\GitHub\DockerTest\core.py
D:\Documents\GitHub\DockerTest\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "D:\Documents\GitHub\DockerTest\core.py", line 4, in <module>
    from modules.utils.GPT import process_message_with_llm
  File "D:\Documents\GitHub\DockerTest\modules\utils\GPT.py", line 26, in <module>
    Settings.embed_model = HuggingFaceEmbedding(model_name="avsolatorio/NoInstruct-small-Embedding-v0")
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\llama_index\embeddings\huggingface\base.py", line 86, in __init__
    self._model = SentenceTransformer(
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\sentence_transformers\SentenceTransformer.py", line 197, in __init__
    modules = self._load_sbert_model(
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\sentence_transformers\SentenceTransformer.py", line 1309, in _load_sbert_model
    module = module_class.load(module_path)
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\sentence_transformers\models\Pooling.py", line 230, in load
    return Pooling(**config)
TypeError: Pooling.__init__() got an unexpected keyword argument 'output_key'

Process finished with exit code 1
Which version of SentenceTransformer should I install?
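The Pooling 'output_key' TypeError usually points at a sentence-transformers install that's older than what the model's pooling config expects; that reading is an assumption, not confirmed. Checking the installed versions first (and then pip install -U sentence-transformers) is the cheap next step:

import sentence_transformers
import transformers

# Compare these against what the model card / repo says it was exported with.
print("sentence-transformers:", sentence_transformers.__version__)
print("transformers:", transformers.__version__)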
I'm on 0.10.1, uninstalled and reinstalled llama-index and llama-index-core, but when I do from llama_index.core import VectorStoreIndex it won't import VectorStoreIndex.

**********
Trace: query
    |_query -> 6.062075 seconds
      |_templating -> 0.0 seconds
      |_llm -> 6.062075 seconds
**********
Traceback (most recent call last):
  File "S:\Gemini-Coder\local-indexer\cmd_local_index_chat.py", line 83, in <module>
    respnose = query_engine.query(
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\core\base_query_engine.py", line 40, in query
    return self._query(str_or_query_bundle)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\query_engine\sub_question_query_engine.py", line 129, in _query
    sub_questions = self._question_gen.generate(self._metadatas, query_bundle)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\question_gen\llm_generators.py", line 78, in generate
    parse = self._prompt.output_parser.parse(prediction)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\question_gen\output_parser.py", line 13, in parse
    raise ValueError(f"No valid JSON found in output: {output}")
ValueError: No valid JSON found in output: Understood! I'll do my best to help you with your questions and provide relevant sub-questions based on the tools provided. Please go ahead and ask your user question, and I'll generate the list of sub-questions accordingly.
Is the chat_engine weaker than the query_engine?
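Not an authoritative answer, but both engines can be built over the same index, so the retrieval step itself is the same machinery; condense_plus_context just rewrites the question from chat history before retrieving, and that rewriting (plus memory) is usually where differences come from. A sketch of the two side by side:

# Same index feeding both; any difference in answers usually comes from the
# question-condensing / memory layer, not from a weaker retriever.
query_engine = index.as_query_engine(similarity_top_k=2)
chat_engine = index.as_chat_engine(chat_mode="condense_plus_context", similarity_top_k=2)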