no_dice
Offline, last seen 3 months ago
Joined September 25, 2024
Hi - I am currently trying to wrap an agent around the free Polygon.io (financial markets data) API, and so far the agent is having a hard time parsing the response it gets back. I have fiddled a lot with prompting it to better parse the data, but we're running into an error that I believe is in the RequestsToolSpec object:

My code:
Plain Text
# Wrap the Polygon API spec with LoadAndSearchToolSpec
wrapped_tools = LoadAndSearchToolSpec.from_defaults(
    api_spec.to_tool_list()[0],
).to_tool_list()

agent = ReActAgent.from_tools(
    [*wrapped_tools, requests_spec.to_tool_list()[0]]
    , verbose=True
    , llm=llm
    , context=CONTEXT
    , max_iterations=20
)

agent.chat("What are all the exchanges you have access to?")


Error:
Plain Text
File /workspaces/pye/pye/.venv/lib/python3.10/site-packages/llama_hub/tools/requests/base.py:75, in RequestsToolSpec._get_headers_for_url(self, url)
     74 def _get_headers_for_url(self, url: str) -> dict:
---> 75     return self.domain_headers[self._get_domain(url)]

KeyError: 'api.polygon.io'


It looks like it's trying to look up the key api.polygon.io in a headers mapping even though I've never prompted it to do so. It also looks like this lookup is hardcoded in the RequestsToolSpec object?

Any ideas on how to resolve this? Modifying the prompt doesn't seem to do anything.
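One direction worth trying (not confirmed from the thread): the traceback shows RequestsToolSpec indexing a domain_headers dict with the request's domain, so constructing the spec with an entry for api.polygon.io should at least avoid the KeyError. A minimal sketch, where the Polygon auth header format and the POLYGON_API_KEY variable are assumptions:

Plain Text
requests_spec = RequestsToolSpec(
    domain_headers={
        "api.polygon.io": {
            "Authorization": f"Bearer {POLYGON_API_KEY}",  # assumed auth scheme; adjust to what Polygon expects
            "User-Agent": "llama-index-agent",
        }
    }
)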
1 comment
LlamaIndex CLI error

Whenever I try to create a new llama-pack I run into this weird error:

Plain Text
llamaindex-cli new-package --kind "packs" --name "pack-test"

Plain Text
Traceback (most recent call last):
  File "/workspaces/llama_index/.venv/bin/llamaindex-cli", line 6, in <module>
    sys.exit(main())
  File "/workspaces/llama_index/.venv/lib/python3.10/site-packages/llama_index/cli/command_line.py", line 269, in main
    args.func(args)
  File "/workspaces/llama_index/.venv/lib/python3.10/site-packages/llama_index/cli/command_line.py", line 263, in <lambda>
    new_package_parser.set_defaults(func=lambda args: handle_init_package(**vars(args)))
  File "/workspaces/llama_index/.venv/lib/python3.10/site-packages/llama_index/cli/command_line.py", line 26, in handle_init_package
    init_new_package(integration_name=name, integration_type=kind, prefix=prefix)
  File "/workspaces/llama_index/.venv/lib/python3.10/site-packages/llama_index/cli/new_package/base.py", line 120, in init_new_package
    shutil.copyfile(common_path + "/BUILD", pkg_path + "/BUILD")
  File "/usr/local/python/3.10.13/lib/python3.10/shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/workspaces/llama_index/.venv/lib/python3.10/site-packages/llama_index/cli/new_package/common/BUILD'

I believe this is also the cause of an error my PR is currently hitting. I'm not sure how to interpret these errors.
7 comments
LlamaPack dependency question..

I am writing a llama pack, and poetry/.toml files are a bit new to me. I think I've got the gist of it, but I'm not sure where I should add a dependency that's only needed for my tests. In my test fixtures I use the wikipedia reader to load some simple documents into a vector store. That reader is its own llama pack, so I figure I should add it as a dependency. Where might I do that? I see multiple potential places when I look at other example packs.
3 comments
Simple local vector store index that supports hybrid search?

Alright, I've got a weird problem trying to wrap up a llama-pack:

I NEED a vector store index object that holds both text and vector representations of its data. How can I build a simple vector store with a small corpus of local data (it really doesn't have to be much, just enough to answer 1-2 questions) that supports HYBRID SEARCH? Most of the guides I've seen online build from TextNodes or Documents directly; none of those work because the index in those examples does NOT support hybrid queries. I can't use free or small instances of services like Pinecone either, because these are just test fixtures and I can't expect the llama-index repo to have my credentials (nor is that best practice).

ValueError: Invalid query mode: hybrid
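One possible direction (a sketch under assumptions, not a confirmed recipe): the Qdrant integration can run fully in memory with no credentials and exposes an enable_hybrid flag, which should satisfy vector_store_query_mode="hybrid". Assumes llama-index>=0.10 with the llama-index-vector-stores-qdrant and qdrant-client packages installed; hybrid mode also pulls in a sparse embedding model on first use.

Plain Text
import qdrant_client
from llama_index.core import Document, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Purely local, in-memory Qdrant instance - no external service or credentials.
client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(
    client=client,
    collection_name="test_fixture",
    enable_hybrid=True,  # keeps both dense vectors and sparse text representations
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents([Document.example()], storage_context=storage_context)

retriever = index.as_retriever(vector_store_query_mode="hybrid", similarity_top_k=2)
nodes = retriever.retrieve("What are LLMs good at?")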
13 comments
Is anyone else having issues with the CLI tool for upgrading your imports to v0.10+?

It runs fine with no errors, and I get the following output:
Plain Text
(.venv) @no-dice-io ➜ /workspaces/koda-retriever (main) $ llamaindex-cli upgrade .
Module not found: VectorStoreQueryMode
Switching to core
Module not found: BaseNodePostprocessor
Switching to core
Module not found: QueryType
Switching to core
Module not found: Settings
Switching to core
New installs:
pip install llama-index-llms-openai
pip install llama-index-vector-stores-pinecone
pip install llama-index-embeddings-openai


But none of the imports have actually changed. Am I not using this tool correctly? I did reinstall my .venv with the new packages above.
6 comments
Retriever tests

I am building a retriever tool for Llama Hub and was looking to see if there are any standard tests an object built from BaseRetriever should be able to pass. I took a look at the /tests/retrievers/ folder in the Llama Index repo, but I only see one test in there. I can certainly follow that test, but I worry that's not enough.

If anyone has ideas for tests I could run against my Retriever, please let me know! I'm looking to get my PR ready by this weekend 😄
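For what it's worth, a minimal sketch of the kind of contract check such a retriever could pass, assuming a pytest fixture (here called setup) that returns the retriever; names are illustrative, not a standard LlamaIndex test suite, and the import path assumes llama-index>=0.10:

Plain Text
from llama_index.core.schema import NodeWithScore

def test_retrieve_returns_scored_nodes(setup):
    """Sanity-check the BaseRetriever contract: retrieve() returns a list of NodeWithScore."""
    retriever = setup.get("retriever")
    results = retriever.retrieve("What are LLMs good at?")

    assert isinstance(results, list)
    assert all(isinstance(r, NodeWithScore) for r in results)
    assert all(r.score is None or isinstance(r.score, float) for r in results)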
4 comments
Potential bug? Postprocessor error when vector search yields 0 results

Exception: IndexError: list index out of range

I think there's a decent chance this is a bug. I'm fairly certain this happens because I just moved to a new Vector Database that has no data in it, but the index itself has been created.

My code was working fine and my tests were passing, but when I cleared my DB this error occurred. I believe it's happening because the vector search yields no results, so the postprocessor (SentenceTransformerRerank) has nothing to rerank. I would expect that when there are zero results, the reranker should either not run or simply return nothing.

I've attached my code and full stack-trace in the thread.
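Until the underlying behavior is confirmed or patched, a hedged workaround is simply to skip the postprocessor when retrieval comes back empty, so SentenceTransformerRerank never sees an empty list:

Plain Text
def rerank_safe(reranker, nodes, query_bundle):
    """Return reranked nodes, or an empty list if there is nothing to rerank."""
    if not nodes:
        return []
    return reranker.postprocess_nodes(nodes, query_bundle=query_bundle)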
8 comments
Having trouble actually running any queries on my query engine..

I get various errors like a 404 or an APIConnectionError depending on how I query the query engine (or when it's wrapped by a Context Augmented Agent). I've attached my code here in a text file because I don't think it'll fit. (The traceback is within the code as well.)

PLEASE TAKE NOTE OF MY COMMENTS IF YOU READ MY CODE
YOU WILL ALSO NEED TO DOWNLOAD THE FILE, COMPANY SECURITY IS DOING WEIRD STUFF TO THE PREVIEW

Agent Error: Exception: APIConnectionError: Connection error.

I went ahead and tested my connection info against my AzureOpenAI class/wrapper via LangChain, and it works fine on its own when I create the object and prompt it in a simple notebook. But when it's wrapped in an index/engine it starts to have connection issues, as shown in my code/traceback.
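One frequent cause of this pattern (plain LLM calls work, the engine doesn't) is that only the LLM is pointed at Azure while the embedding model silently falls back to the default OpenAI endpoint. A hedged sketch, mirroring the kwargs used elsewhere in these posts (the settings object and deployment names are placeholders from that code, and the imports assume llama-index 0.9.x):

Plain Text
from llama_index import ServiceContext
from llama_index.llms import AzureOpenAI
from llama_index.embeddings import AzureOpenAIEmbedding

llm = AzureOpenAI(
    model="gpt-4",
    azure_deployment="gpt-4",
    azure_endpoint=str(settings.azure_openai_api_base),
    api_version=str(settings.azure_openai_api_version),
    api_key=str(settings.azure_openai_api_key),
)
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    azure_deployment="text-embedding-ada-002",
    azure_endpoint=str(settings.azure_openai_api_base),
    api_version=str(settings.azure_openai_api_version),
    api_key=str(settings.azure_openai_api_key),
)

# Pass BOTH models in, then build the engine from this service context so nothing
# falls back to api.openai.com defaults. `index` is whatever index you built.
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
query_engine = index.as_query_engine(service_context=service_context)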
20 comments
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column data_vector_embeddings.text_search_tsv does not exist

I set up the vector index with LlamaIndex, so whatever columns are supposed to be there should be there in my PGVector Store.

Any idea how I'd fix this? It originated from this code:

Plain Text
from cerebro.cbcore.utils.vector_store import CerebroVectorStore
#from asyncio import run
from os import name

query = "Who does Paul Graham think of with the word schtick"
vector_store = CerebroVectorStore()
#if name == 'nt':
#    from asyncio import set_event_loop_policy, WindowsSelectorEventLoopPolicy
#    set_event_loop_policy(WindowsSelectorEventLoopPolicy()) # this fixed it

vector_store.hybrid_search(query)
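For context, in LlamaIndex's Postgres integration the text_search_tsv column is only created when the store is built with hybrid search enabled, so if the table was created without it, hybrid queries have nothing to hit. A hedged sketch of how the store is typically constructed for hybrid use (connection parameters are placeholders; the import path shown is the 0.9-style one, it moves under llama_index.vector_stores.postgres in 0.10). The existing table may need to be dropped/recreated so the column gets added:

Plain Text
from llama_index.vector_stores import PGVectorStore

vector_store = PGVectorStore.from_params(
    database="vectordb",
    host="localhost",
    port="5432",
    user="postgres",
    password="password",
    table_name="vector_embeddings",   # LlamaIndex prefixes the actual table with "data_"
    embed_dim=1536,                   # must match the embedding model's dimension
    hybrid_search=True,               # creates/uses the text_search_tsv column
    text_search_config="english",
)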
4 comments
AttributeError: 'NoneType' object has no attribute 'send'

Plain Text
AttributeError: 'NoneType' object has no attribute 'send'
Exception ignored in: <function _SSLProtocolTransport.__del__ at 0x000001951368DB40>
Traceback (most recent call last):
...character limit...
Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\asyncio\base_events.py", line 515, in _check_closed
RuntimeError: Event loop is closed


Relevant code: (defined in a custom class I wrote to link the vector store and ingestion pipeline into a single object where I can access both)
Plain Text
    async def ingest(self, chunks: list[dict]):
        '''Ingests a list of chunks into the vector store asynchronously'''
        
        if not hasattr(self, 'ingestion_pipeline'):
            self.init_ingestion_pipeline()
        print('################ vector store and ingestion pipeline initialized')
        
        processed_chunks = await process_chunks(chunks)
        print('################ chunks processed')

        return await self.ingestion_pipeline.arun(documents=processed_chunks)


Entrypoint:
Plain Text
vector_store = PFVectorStore() #custom class I was referring to
run(vector_store.ingest(test_data)) #asyncio


Pretty certain this is an async error, or something related to it?
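The commented-out WindowsSelectorEventLoopPolicy lines in the earlier snippet point in the same direction: on Windows the default proactor loop often raises "Event loop is closed" while async HTTP clients are being torn down after asyncio.run() finishes. A hedged sketch of the usual workaround:

Plain Text
import asyncio
import sys

# On Windows, switch to the selector event loop before running any async code;
# this commonly avoids the "Event loop is closed" teardown error.
if sys.platform.startswith("win"):
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

vector_store = PFVectorStore()              # custom class from this post
asyncio.run(vector_store.ingest(test_data))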
38 comments
Async ingestion pipeline loads no data after completion

It's not returning any errors...

My code:
Plain Text
    def init_ingestion_pipeline(self) -> IngestionPipeline:
        '''Initializes the ingestion pipeline for the vector store'''

        pipeline = IngestionPipeline(
            transformations=[
                AzureOpenAIEmbedding(
                    model="text-embedding-ada-002"
                    , azure_deployment="text-embedding-ada-002"
                    , azure_endpoint=str(settings.azure_openai_api_base)
                    , api_version=str(settings.azure_openai_api_version)
                    , api_key=str(settings.azure_openai_api_key)
                )
            ]
            , vector_store=self.vector_store
        )

        self.ingestion_pipeline = pipeline

        return pipeline

    async def ingest(self, chunks: list[dict]):
        '''Ingests a list of chunks into the vector store asynchronously'''
        
        if not hasattr(self, 'ingestion_pipeline'):
            self.init_ingestion_pipeline()
        print('################ vector store and ingestion pipeline initialized')
        
        processed_chunks = await process_chunks(chunks)
        print('################ chunks processed')

        #await self.ingestion_pipeline.arun(documents=processed_chunks)
        nodes = self.ingestion_pipeline.run(documents=processed_chunks, show_progress=True)
        print('################ ingestion pipeline completed')


The code above is in a class that I'm calling:

Plain Text
vector_store = PFVectorStore() #Custom class, not a vector store from LlamaIndex
run(vector_store.ingest(test_data)) #Referencing the ingest function from an ingestion pipeline, ingest is a custom wrapper over the ingestion pipeline
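A hedged debugging step (not a diagnosis): check what the pipeline actually returns and whether the attached vector store can serve a query afterwards; if run() returns zero nodes, the problem is upstream of the store. Assumes 0.9-style imports and the attribute names from the class above, with processed_chunks as produced inside ingest():

Plain Text
from llama_index import VectorStoreIndex

nodes = vector_store.ingestion_pipeline.run(documents=processed_chunks, show_progress=True)
print(f"pipeline returned {len(nodes)} nodes")   # 0 here means nothing was embedded/inserted

# Query straight off the attached store to confirm whether anything landed.
# Pass your own service_context here if you are not on the default OpenAI embeddings.
index = VectorStoreIndex.from_vector_store(vector_store=vector_store.vector_store)
print(index.as_retriever().retrieve("test query"))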
99 comments
Plain Text
pydantic.error_wrappers.ValidationError: 1 validation error for IngestionPipeline
transformations -> 0
  value is not a valid dict (type=type_error.dict)


Plain Text
ingestion_pipeline = IngestionPipeline(
    transformations=[AzureOpenAIEmbedding]
    , vector_store=vector_store
)

ingestion_pipeline.run(documents=new_docs, show_progress=True)

The code above throws the error at the top. Not sure why, and I haven't been able to find much online. I am not super familiar with pydantic, so that's probably why I'm struggling. Any help?
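The usual cause of this particular pydantic error is that the class itself is in the transformations list rather than an instance of it, so pydantic tries (and fails) to coerce the class object. A hedged sketch with placeholder Azure values:

Plain Text
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    azure_deployment="text-embedding-ada-002",
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_version="2023-07-01-preview",                           # placeholder
    api_key="<your-key>",                                       # placeholder
)

ingestion_pipeline = IngestionPipeline(
    transformations=[embed_model],   # an instance, not the AzureOpenAIEmbedding class
    vector_store=vector_store,
)
ingestion_pipeline.run(documents=new_docs, show_progress=True)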

17 comments
I've been experimenting with fine-tuning lately and want some people's thoughts.

At what point does fine-tuning NOT help in exposing an LLM to additional data for a specific task? I recognize that fine-tuning is great for structured outputs, edge cases, or formatting responses from an LLM.

But I've also seen people refer to fine-tuning as a way to extend the underlying data an LLM has to work with.

To what degree is that true, and to what degree is it not? In my research it seems to be true only insofar as the new data shows the model how it should respond in a specific use case. Any thoughts?
7 comments
Quick q:

gpt-3.5-turbo-0125 and gpt-4-turbo-preview are both listed by OpenAI as models trained for function calling. Have any additional models been trained for function calling out of the box since then?

https://platform.openai.com/docs/guides/function-calling
3 comments
Is it possible to allow a ReAct Agent to be primed w/ context via a query engine to leverage that context to select a tool?
4 comments
Hi, does LlamaIndex currently have any implementation/support for OpenAI's API specs? Specifically, this would be useful as an input to a tool used within an agent - these API specs provide rich context on how to use the API. LangChain's implementation seems to demonstrate how useful this can be.

https://platform.openai.com/docs/plugins/getting-started

https://github.com/finnelliott/langchain-spotify-assistant/blob/main/spotify_assistant.py
7 comments
How does the Raptor Retriever in LlamaIndex handle updates to a corpus of data? More specifically in this scenario:
  • I have persisted and clustered data in a vector index that was created previously via RAPTOR
  • I want to ADD more data to this persisted and clustered vector store, but I want to be sure this new data is included in the existing clusters
Does RAPTOR accommodate the second scenario at all? Or do I need to truncate that table and just re-cluster all of the data to ensure everything is clustered together?
6 comments
Has anyone built any Agents with DBRX in LlamaIndex? How does it perform? Are there any considerations or issues with the LI abstractions? (I understand that for a while most of LangChain and LlamaIndex were tilted towards OpenAI.)
4 comments
Does anyone have any easily accessible datasets that are good for RAG evaluation? Preferably ones that might be integrated into Llama Index? I looked on LlamaHub, but with the recent 1.0 update I'm not able to find any on the website itself, and I figure the imports have changed recently anyway.
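One option, to the best of my knowledge (0.10-era import path): LlamaHub's downloadable "llama datasets" can still be pulled programmatically even if the website listing is hard to browse, e.g. the Paul Graham essay RAG dataset:

Plain Text
from llama_index.core.llama_dataset import download_llama_dataset

# Downloads the labelled RAG dataset plus its source documents into ./data.
rag_dataset, documents = download_llama_dataset("PaulGrahamEssayDataset", "./data")
print(rag_dataset.to_pandas().head())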
2 comments
Llm

Does LI have any dummy LLMs that fit the llama-index interface? I'm looking to potentially use one if so.
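For reference, llama-index does ship MockLLM and MockEmbedding objects that satisfy the standard interfaces (import paths below are the 0.10 ones; on 0.9 they live under the top-level llama_index packages):

Plain Text
from llama_index.core.llms import MockLLM
from llama_index.core import MockEmbedding

llm = MockLLM(max_tokens=64)                 # returns placeholder text, no API calls
embed_model = MockEmbedding(embed_dim=1536)  # returns fixed-size dummy vectors

print(llm.complete("hello"))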
4 comments
ValueError: Invalid query mode: hybrid

I am running into an error where I cannot seem to query/retrieve a Vector Store Index with the query mode "hybrid". I believe it may be because I'm just using a simple vector store built from the Document.example() function.

My code:
Plain Text
## setup
@pytest.fixture
def setup() -> dict:

    os.environ["OPENAI_API_KEY"] = str(settings.openai_api_key)

    service_context = ServiceContext.from_defaults(
        embed_model=OpenAIEmbedding(
            model="text-embedding-ada-002"
        ),
        llm=OpenAI(
            model="gpt-3.5-turbo"
        )
    )

    shots = AlphaMatrix(data=DEFAULT_CATEGORIES)

    vector_index = VectorStoreIndex.from_documents(
        [Document.example()]
        , service_context=service_context
    )

    reranker = LLMRerank(service_context=service_context)

    retriever = CustomRetriever(
        index=vector_index,
        llm=service_context.llm,
        reranker=reranker,
        matrix=shots,
        verbose=True,
    )

    return {
        "retriever": retriever,
        "service_context": service_context,
        "vector_index": vector_index,
        "matrix": shots,
    }

#Where the error occurs:
retriever = VectorIndexRetriever(
            index=index,
            vector_store_query_mode="hybrid",
            alpha=default_alpha,
            **kwargs,  # filters, etc, added here
        )

def test_retrieve(setup):

    retriever = setup.get("retriever")
    query = "What are LLMs good at?"
    results = retriever.retrieve(query)


Error:
Plain Text
ValueError: Invalid query mode: hybrid
18 comments
Are there any dummy/mock objects I can use for testing? For example, a dummy vector store that I can test retrievers on?
15 comments
Is there a way to dynamically adjust the alpha parameter of a Hybrid Retriever that has already been created? Or can this only be done at instantiation? (I'm currently digging into the docs to find this answer myself but figured I'd ask in case someone else had encountered this)
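In case it helps while digging: retrievers are thin wrappers around the index, so one workable pattern (a sketch, not necessarily the official answer) is simply to rebuild the retriever with a different alpha rather than mutating an existing one:

Plain Text
def make_hybrid_retriever(index, alpha: float, top_k: int = 5):
    """Build a fresh hybrid retriever with the given dense/sparse weighting."""
    return index.as_retriever(
        vector_store_query_mode="hybrid",
        alpha=alpha,
        similarity_top_k=top_k,
    )

dense_leaning = make_hybrid_retriever(vector_index, alpha=0.8)
sparse_leaning = make_hybrid_retriever(vector_index, alpha=0.2)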
7 comments

Llm

AttributeError: 'LLMPredictor' object has no attribute '_llm'. Did you mean: 'llm'

I am trying to reference the LLM provided to a ServiceContext object, but when I reference it from the service context to perform a simple complete call, I get the following error:

manager.service_context.llm.complete('Hi what is 5+5?') (Manager is just a class I wrote that wraps some LI objects, including a ServiceContext object.)

My service context object: (created in the Manager class where I use type-hints and set a default)
Plain Text
    service_context: ServiceContext = ServiceContext.from_defaults(
        embed_model= AzureOpenAIEmbedding(
            model="text-embedding-ada-002"
            , azure_deployment="text-embedding-ada-002"
            , azure_endpoint=str(settings.azure_openai_api_base)
            , api_version=str(settings.azure_openai_api_version)
            , api_key=str(settings.azure_openai_api_key)
        )
        , llm = AzureOpenAI(
            model="gpt-4"
            , azure_deployment="gpt-4"
            , azure_endpoint=str(settings.azure_openai_api_base)
            , api_version=str(settings.azure_openai_api_version)
            , api_key=str(settings.azure_openai_api_key)
        )
    )


Error:
AttributeError: 'LLMPredictor' object has no attribute '_llm'. Did you mean: 'llm'?

I have been able to reference the LLM from a service context in the past... not sure why all of a sudden it's not working now. I am running llama-index 0.9.39.
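A hedged workaround (not an explanation of the underlying 0.9.39 behavior): keep a direct handle on the AzureOpenAI instance inside the Manager, so simple completions don't have to go through the ServiceContext's LLMPredictor wrapper at all. embed_model and settings below refer to the objects from the snippet above:

Plain Text
llm = AzureOpenAI(
    model="gpt-4",
    azure_deployment="gpt-4",
    azure_endpoint=str(settings.azure_openai_api_base),
    api_version=str(settings.azure_openai_api_version),
    api_key=str(settings.azure_openai_api_key),
)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

# Call the LLM directly instead of via service_context.llm
print(llm.complete('Hi what is 5+5?'))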
12 comments
I'm passing a system_prompt kwarg into my ReAct Agent when creating it from tools, but it doesn't seem to have any effect on the agent. When I dig into the source code, I'm having a hard time discerning whether ReAct agents actually accept this kwarg and do anything with it.

Are we able to update the prompt or system prompt of a ReAct agent?
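I can't confirm whether the system_prompt kwarg is wired through for ReAct agents, but on recent versions the prompts are exposed via get_prompts()/update_prompts(), and the ReAct header usually sits under the 'agent_worker:system_prompt' key. A sketch (the key name may differ between versions, so print the keys first):

Plain Text
from llama_index.core import PromptTemplate

prompts = agent.get_prompts()
print(prompts.keys())  # look for something like 'agent_worker:system_prompt'

# Prepend extra instructions to the existing ReAct header instead of replacing it,
# so the required {tool_desc}/{tool_names} template variables stay intact.
current = prompts["agent_worker:system_prompt"].get_template()
agent.update_prompts(
    {"agent_worker:system_prompt": PromptTemplate("Always answer concisely.\n\n" + current)}
)
agent.reset()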
22 comments