Find answers from the community

idontneedonetho
Offline, last seen 3 months ago
Joined September 25, 2024

Sql

Anyone have any idea why my tools would be failing? I'm using a locally hosted Postgres database running via Docker, connecting to it via the 0.0.0.0 IP address, and I'm getting SQL errors like:
Plain Text
=== Calling Function ===
Calling function: Wiki_Tool with args: {"input":"smoother braking"}
Got output: Error: (sqlalchemy.dialects.postgresql.asyncpg.InterfaceError) <class 'asyncpg.exceptions._base.InterfaceError'>: cannot perform operation: another operation is in progress
[SQL: SELECT public.data_wiki_docs.id, public.data_wiki_docs.node_id, public.data_wiki_docs.text, public.data_wiki_docs.metadata_, public.data_wiki_docs.embedding <=> $1 AS distance
FROM public.data_wiki_docs ORDER BY distance asc
LIMIT $2::INTEGER]
[parameters: ('[-0.04321499168872833,-0.008070970885455608,0.038734838366508484,0.034068379551172256,-0.03698354959487915,0.08398095518350601,-0.007821937091648579, ... (7816 characters truncated) ... -0.027220504358410835,-0.04467884078621864,0.007395491935312748,-0.04819626361131668,0.009278454817831516,0.012993157841265202,-0.007883192971348763]', 5)]
(Background on this error at: https://sqlalche.me/e/20/rvf5)
========================
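
This error usually means a single asyncpg connection is being asked to run two operations at once, which can happen when several async tool calls end up sharing one connection. Below is a minimal standalone sketch of the pattern that avoids it (plain SQLAlchemy, placeholder DSN and queries); it illustrates the likely cause rather than a drop-in fix for the vector store's own connection handling.
Plain Text
# Sketch: give each concurrent task its own pooled session so no two coroutines
# share one asyncpg connection. The DSN below is a placeholder.
import asyncio
from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker

engine = create_async_engine(
    "postgresql+asyncpg://user:password@0.0.0.0:5432/vectordb",
    pool_size=5,
)
SessionMaker = async_sessionmaker(engine, expire_on_commit=False)

async def run_query(query: str):
    # A fresh session per task checks out its own connection from the pool.
    async with SessionMaker() as session:
        result = await session.execute(text(query))
        return result.fetchall()

async def main():
    # Concurrent queries are safe because they never share a connection.
    await asyncio.gather(run_query("SELECT 1"), run_query("SELECT 2"))

asyncio.run(main())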
29 comments
Anyone else getting lots of 500 server errors when using an OpenAIAgent?
7 comments
How would I speed up the part between the "Generating embeddings" sections? Right now it can take up to 15 min before the next set of embeddings is generated, which is making the whole process take up to 48 hours. This is using a hybrid Qdrant vector store setup. I'm on an SSD, btw.
Plain Text
# Imports assumed for the llama-index 0.10+ package layout this snippet appears to use
import torch
import qdrant_client
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore

device = "cuda" if torch.cuda.is_available() else "cpu"
print("GPU available:", torch.cuda.is_available())
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5", device=device)
#Settings.chunk_size = 512
qdrantclient = qdrant_client.QdrantClient(path="./qdrant_db")

'''DISCORD DATA'''
print("Loading local files...")
dir_path = 'DiscordDocs'
reader = SimpleDirectoryReader(input_dir=dir_path, required_exts=[".txt"])
discord_docs = reader.load_data()

print("Local files loaded successfully. Setting up vector store for Discord data...")
discord_vector_store = QdrantVectorStore(client=qdrantclient, enable_hybrid=True, batch_size=20, collection_name="discord-data")
discord_storage_context = StorageContext.from_defaults(vector_store=discord_vector_store)

discord_index = VectorStoreIndex.from_documents(discord_docs, storage_context=discord_storage_context, show_progress=True)
print("Discord data setup complete.")

Plain Text
GPU available: True
Loading local files...
Local files loaded successfully. Setting up vector store for Discord data...
Fetching 5 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<?, ?it/s]
Fetching 5 files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<?, ?it/s]
Parsing nodes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 111/111 [03:59<00:00,  2.16s/it]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2048/2048 [00:15<00:00, 131.86it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2048/2048 [00:13<00:00, 151.84it/s]
(I'm still generating embeddings right now)
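
A rough sketch of two knobs that may help, reusing the names and imports from the snippet above. The dense passes themselves look fast (about 15 s per 2048 nodes in the log), so the long gaps are more likely the per-batch upserts plus the CPU-side sparse encoding that enable_hybrid adds; the batch sizes below are guesses to experiment with, not recommendations.
Plain Text
# embed_batch_size controls how many nodes go to the GPU per dense-embedding call;
# the default is small, so raising it usually shortens the embedding passes.
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    device=device,
    embed_batch_size=64,
)

# Larger upsert batches mean fewer round trips into the on-disk Qdrant store.
discord_vector_store = QdrantVectorStore(
    client=qdrantclient,
    enable_hybrid=True,
    batch_size=64,
    collection_name="discord-data",
)
discord_storage_context = StorageContext.from_defaults(vector_store=discord_vector_store)

# insert_batch_size (default 2048, which matches the progress bars above) sets how
# many nodes are embedded and inserted per pass.
discord_index = VectorStoreIndex.from_documents(
    discord_docs,
    storage_context=discord_storage_context,
    insert_batch_size=8192,
    show_progress=True,
)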
41 comments
Trying to use Gemini with my reply chain function, which works with GPT, but Gemini keeps spitting out "An error occurred: <MessageRole.MODEL: 'model'>".
Plain Text
async def fetch_reply_chain(message, max_tokens=4096):
    context = []
    tokens_used = 0
    current_prompt_tokens = len(message.content) // 4
    max_tokens -= current_prompt_tokens
    while message.reference is not None and tokens_used < max_tokens:
        try:
            message = await message.channel.fetch_message(message.reference.message_id)
            role = Role.MODEL if message.author.bot else Role.USER
            message_content = f"{message.content}\n"
            message_tokens = len(message_content) // 4
            if tokens_used + message_tokens <= max_tokens:
                context.append(HistoryChatMessage(message_content, role))
                tokens_used += message_tokens
            else:
                break
        except Exception as e:
            print(f"Error fetching reply chain message: {e}")
            break
    return context[::-1]

I am trying to set custom chat history via:
Plain Text
memory = ChatMemoryBuffer.from_defaults(token_limit=8192)
                context = await fetch_reply_chain(message)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = index.as_chat_engine(
                    chat_mode="condense_plus_context",
                    similarity_top_k=2,
                    sparse_top_k=12,
                    vector_store_query_mode="hybrid",
                    memory=memory,
                    -
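
Not a confirmed fix, but one guess: the Gemini chat path may only know how to translate USER and ASSISTANT messages into Gemini's "user"/"model" roles, so a history entry tagged MessageRole.MODEL trips it up. The minimal sketch below tags bot replies as ASSISTANT instead; the import path assumes the 0.10+ layout, and to_history_message is a hypothetical helper standing in for however HistoryChatMessage/Role are aliased in the original code.
Plain Text
from llama_index.core.llms import ChatMessage, MessageRole

def to_history_message(content: str, is_bot: bool) -> ChatMessage:
    # Assumption: the Gemini integration maps ASSISTANT -> "model" itself, so the
    # history should use ASSISTANT rather than MessageRole.MODEL for bot replies.
    role = MessageRole.ASSISTANT if is_bot else MessageRole.USER
    return ChatMessage(content=content, role=role)

# e.g. inside fetch_reply_chain:
#   context.append(to_history_message(message_content, message.author.bot))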
20 comments
I'm playing around with the WholeSiteReader and I was wondering, since I can't find anything in the code, if anyone knows a way to filter out parts of a site. The code doesn't seem to show anything for filters, but I'm hoping someone knows a way to add a filter through other means.
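
One workaround is to filter after loading, since the reader just returns Document objects. A rough sketch, with several assumptions flagged inline: the example.com URLs and exclude pattern are placeholders, and whether each Document exposes the crawled URL via metadata or its id should be checked against what the reader actually returns.
Plain Text
import re
from llama_index.readers.web import WholeSiteReader

reader = WholeSiteReader(prefix="https://example.com/docs", max_depth=2)
documents = reader.load_data(base_url="https://example.com/docs")

EXCLUDE = re.compile(r"/(changelog|blog)/")  # placeholder patterns to drop

filtered = []
for doc in documents:
    # Assumption: the page URL is available on the document; adjust the lookup
    # to whatever your documents actually carry.
    url = str(doc.metadata.get("URL") or doc.id_)
    if EXCLUDE.search(url):
        continue  # skip whole pages you don't want indexed
    # Optionally strip repeated boilerplate text before indexing.
    doc.text = re.sub(r"Skip to main content", "", doc.text)
    filtered.append(doc)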
27 comments
For some reason, I cannot get low-level chat engines to work anymore. I tried CondensePlusContextChatEngine and CondenseQuestionChatEngine; neither one retrieves info. I made sure to try setting the retriever and query_engine for both. I know it's getting the prompt and memory, but it's not searching the info.
Plain Text
client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data")
Settings.llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8192)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = CondensePlusContextChatEngine.from_defaults(
                    retriever=index.as_retriever(),
                    memory=memory,
                    context_prompt=(
                        "prompt"
                    )
                )
                chat_response = await asyncio.to_thread(chat_engine.chat, content)
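
A small debug sketch to narrow this down, reusing the index defined above: call the retriever directly, outside any chat engine. If this comes back empty, the problem is on the index/embedding side (for example the collection name, or the query embed model not matching what the data was indexed with) rather than the chat engine itself; the query string here is just an example.
Plain Text
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("example query expected to hit the data")
for n in nodes:
    # NodeWithScore wraps the retrieved node together with its similarity score
    print(round(n.score or 0.0, 3), n.node.get_content()[:120])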
1 comment
Getting this error when trying to use the GithubRepositoryReader:
Plain Text
GithubClient.get_branch() got an unexpected keyword argument 'timeout'

I just checked the bug reports for GithubRepositoryReader and saw FilterType was added back, so I updated and now I'm getting this error.
7 comments

Weird one, more a help request than an issue. I've noticed that while running LlamaIndex, RAM usage is pretty noticeable; I understand there are reasons for this. But I'm wondering if there's a way to use LlamaIndex to just query data without holding it all in RAM, maybe via SQL? I looked at the SQL docs and it seems possible, but I wanted to ask opinions on whether that's the best route or if there's something better. I'm trying to host this on a server with 1 GB of free RAM, so I have a hard limit.
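
A rough sketch of the "keep the data in Postgres" route. Connection details and the table name are placeholders, the import paths assume the 0.10+ packages, and note that whatever embedding model and LLM you configure still have to run somewhere, which matters on a 1 GB box.
Plain Text
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.postgres import PGVectorStore

vector_store = PGVectorStore.from_params(
    database="vectordb",
    host="localhost",
    port="5432",
    user="postgres",
    password="password",
    table_name="docs",
    embed_dim=384,  # must match the embedding model used to build the table
)

# from_vector_store builds the index on top of the existing table, so the vectors
# stay in Postgres instead of being loaded into the process's memory.
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine(similarity_top_k=3)
print(query_engine.query("example question"))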
24 comments
Hello, I'm trying to use the latest gpt-4-turbo-preview, but it's not showing as an option; there is also no option for gpt-4-0125-preview. Is there a way around this, or are we stuck with gpt-4-0613-preview?
11 comments
Back again, still going down the local model only path using llama-cpp-python. Getting this same error:
Plain Text
ValueError:
******
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

To disable the LLM entirely, set llm=None.
******

This time, though, I'm trying to introduce a multi-step query:
Plain Text
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

# Index setup
PERSIST_DIR = "storage-data"
if not os.path.exists(PERSIST_DIR):
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context, service_context=service_context)

query_engine = index.as_query_engine(response_mode="compact_accumulate")

# Multi-step query engine setup
step_decompose_transform = StepDecomposeQueryTransform(llm=llm, verbose=True)
multi_step_query_engine = MultiStepQueryEngine(
    query_engine=query_engine,
    query_transform=step_decompose_transform,
    index_summary="Index summary for context"
)

@app.get("/", response_class=HTMLResponse)
async def get_form(request: Request):
    return templates.TemplateResponse("index.html", {"request": request})

@app.post("/query")
async def query(user_input: str = Form(...)):
    response = multi_step_query_engine.query(user_input)
    response_text = str(response)
    return {"response": response_text}

I tried doing step_decompose_transform = StepDecomposeQueryTransform(service_context=service_context), but that gave me an error about not expecting that argument.
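
One guess at the cause: a sub-component (for example the multi-step engine's internal response synthesizer) is built without your service_context and falls back to the default OpenAI LLM, which then fails on the missing API key. A minimal sketch of setting the service context globally with the legacy 0.9.x API, reusing the llm and embed_model already defined above, so stray components pick up the local models:
Plain Text
from llama_index import ServiceContext, set_global_service_context

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
set_global_service_context(service_context)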
3 comments
I tried asking KapaGPT (https://discord.com/channels/1059199217496772688/1194708617564270704) for help, and it told me to reach out to the maintainers. So: I'm getting this error when trying to use local LLMs to load an index. The index's persist directory has been deleted and then remade using the local models, but when I try to run the exact same code again to access the index, I get this:
Plain Text
Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\utils.py", line 29, in resolve_llm
    validate_openai_api_key(llm.api_key)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\openai_utils.py", line 379, in validate_openai_api_key
    raise ValueError(MISSING_API_KEY_ERROR_MESSAGE)
ValueError: No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "s:\local-indexer\flask_server.py", line 53, in <module>
    index = load_index_from_storage(storage_context)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\loading.py", line 33, in load_index_from_storage
    indices = load_indices_from_storage(storage_context, index_ids=index_ids, **kwargs)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\loading.py", line 78, in load_indices_from_storage
    index = index_cls(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\vector_store\base.py", line 52, in __init__
    super().__init__(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\indices\base.py", line 62, in __init__
    self._service_context = service_context or ServiceContext.from_defaults()
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\service_context.py", line 178, in from_defaults
    llm_predictor = llm_predictor or LLMPredictor(
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llm_predictor\base.py", line 109, in __init__
    self._llm = resolve_llm(llm)
  File "C:\Users\user\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\utils.py", line 31, in resolve_llm
    raise ValueError(
ValueError:
******
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

To disable the LLM entirely, set llm=None.
******

I followed the tutorials on the docs for all of this:
Plain Text
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
set_global_tokenizer(
    AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf").encode
)
model_url = "{url}"
llm = LlamaCPP(
    model_url=model_url,
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    generate_kwargs={},
    model_kwargs={"n_gpu_layers": 41},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)
service_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
)
PERSIST_DIR = "storage-data"
if not os.path.exists(PERSIST_DIR):
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
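
Based on the traceback above, load_index_from_storage() builds a default ServiceContext (and therefore the default OpenAI LLM) when none is passed, which is exactly where the missing-key error is raised. Passing the local service_context through in the else branch should avoid that fallback:
Plain Text
index = load_index_from_storage(storage_context, service_context=service_context)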
7 comments
Hello, I've been trying to find an answer in the docs, but I'm not very well versed in this stuff yet. Would I be able to skip OpenAI altogether and use Google's gemini-pro and embedding models for everything?
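
A sketch of what an all-Google setup could look like, using legacy 0.9.x-style imports to match the surrounding code; the environment variable and default model choices are assumptions to check against the Gemini docs.
Plain Text
import os
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import Gemini
from llama_index.embeddings import GeminiEmbedding

os.environ["GOOGLE_API_KEY"] = "your-key-here"  # placeholder

llm = Gemini()                   # defaults to a gemini-pro model
embed_model = GeminiEmbedding()  # defaults to Google's embedding model

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
print(index.as_query_engine().query("example question"))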
2 comments
Been searching for fast DB methods and found HyperDB. I searched around on Discord and the web to see if it was compatible with llama-index; the only things I found were Twitter posts from a while ago. Any update on compatibility being added, or should we be able to build it with the current tool sets we're given?
2 comments
I'm trying to figure out Docker, but when I'm setting up a new copy of my stuff (through a venv, not Docker yet), it throws this error:
Plain Text
D:\Documents\GitHub\DockerTest\Scripts\python.exe D:\Documents\GitHub\DockerTest\core.py 
D:\Documents\GitHub\DockerTest\Lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "D:\Documents\GitHub\DockerTest\core.py", line 4, in <module>
    from modules.utils.GPT import process_message_with_llm
  File "D:\Documents\GitHub\DockerTest\modules\utils\GPT.py", line 26, in <module>
    Settings.embed_model = HuggingFaceEmbedding(model_name="avsolatorio/NoInstruct-small-Embedding-v0")
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\llama_index\embeddings\huggingface\base.py", line 86, in __init__
    self._model = SentenceTransformer(
                  ^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\sentence_transformers\SentenceTransformer.py", line 197, in __init__
    modules = self._load_sbert_model(
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\sentence_transformers\SentenceTransformer.py", line 1309, in _load_sbert_model
    module = module_class.load(module_path)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Documents\GitHub\DockerTest\Lib\site-packages\sentence_transformers\models\Pooling.py", line 230, in load
    return Pooling(**config)
           ^^^^^^^^^^^^^^^^^
TypeError: Pooling.__init__() got an unexpected keyword argument 'output_key'

Process finished with exit code 1

I'm assuming I may need to specify which version of SentenceTransformer I install?
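
That's a reasonable guess. The traceback suggests the installed sentence-transformers doesn't recognize a key ("output_key") in the model's saved pooling config, which usually points to a mismatch between the library version and the model files; pinning or upgrading sentence-transformers in the new venv is the first thing to try. A quick check of what the venv actually has:
Plain Text
import sentence_transformers
print(sentence_transformers.__version__)
# If it differs from the working environment, pin the same version there, e.g.:
#   pip install "sentence-transformers==<version from the working env>"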
6 comments
Updated to 0.10.1, uninstalled and reinstalled llama-index and llama-index-core, but when I do from llama_index.core import VectorStoreIndex it won't import VectorStoreIndex.
43 comments
Is there a way to turn off Gemini's safety filter when using it as the LLM for a chat engine?
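
One possible knob, sketched with several assumptions: the google-generativeai SDK accepts a safety_settings mapping, and the llama-index Gemini wrapper appears to accept one and pass it through, but treat the parameter name (and the 0.10-style import path) as something to verify against your installed version.
Plain Text
from google.generativeai.types import HarmBlockThreshold, HarmCategory
from llama_index.llms.gemini import Gemini

safety_settings = {
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
}

llm = Gemini(safety_settings=safety_settings)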
63 comments
Was following the https://docs.llamaindex.ai/en/stable/examples/query_engine/sub_question_query_engine.html tutorial and received this error output:
Plain Text
**********
Trace: query
    |_query ->  6.062075 seconds
      |_templating ->  0.0 seconds
      |_llm ->  6.062075 seconds
**********
Traceback (most recent call last):
  File "S:\Gemini-Coder\local-indexer\cmd_local_index_chat.py", line 83, in <module>
    respnose = query_engine.query(
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\core\base_query_engine.py", line 40, in query
    return self._query(str_or_query_bundle)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\query_engine\sub_question_query_engine.py", line 129, in _query
    sub_questions = self._question_gen.generate(self._metadatas, query_bundle)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\question_gen\llm_generators.py", line 78, in generate
    parse = self._prompt.output_parser.parse(prediction)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\question_gen\output_parser.py", line 13, in parse
    raise ValueError(f"No valid JSON found in output: {output}")
ValueError: No valid JSON found in output:   Understood! I'll do my best to help you with your questions and provide relevant sub-questions based on the tools provided. Please go ahead and ask your user question, and I'll generate the list of sub-questions accordingly.

I am using a local embedding model and a local language model, but I kept everything else the same. I didn't read anything about linking a JSON file in that doc.
7 comments
Is it just me, or is the chat_engine weaker than the query_engine?
8 comments