kev

Is the Documentation for Llama Extract Up-to-Date?

https://github.com/run-llama/llama_extract - is the documentation up to date?

I cannot call the create_agent method on the extractor. Can I pass an S3 URL?

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[44], line 1
----> 1 extractor.create_agents()

File ~/myenv/lib/python3.12/site-packages/pydantic/main.py:891, in BaseModel.__getattr__(self, item)
    888     return super().__getattribute__(item)  # Raises AttributeError if appropriate
    889 else:
    890     # this is the current error
--> 891     raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')

AttributeError: 'LlamaExtract' object has no attribute 'create_agents'

Running version:

Name: llama-extract
Version: 0.0.4
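
When a method named in a README is missing from the installed package, one quick check is to list what the installed object actually exposes and look for near matches. The sketch below is a generic, standard-library-only helper. `FakeExtractor` and its method names are hypothetical stand-ins for illustration, not the real LlamaExtract API.

```python
import difflib


def public_callables(obj):
    """Return the sorted names of public callable attributes on obj."""
    names = []
    for name in dir(obj):
        if name.startswith("_"):
            continue
        try:
            attr = getattr(obj, name)
        except Exception:
            # Some attributes raise on access; skip them.
            continue
        if callable(attr):
            names.append(name)
    return sorted(names)


def suggest_method(obj, wanted, n=3, cutoff=0.5):
    """Suggest existing method names that resemble the requested one."""
    return difflib.get_close_matches(wanted, public_callables(obj), n=n, cutoff=cutoff)


class FakeExtractor:
    """Hypothetical stand-in; replace with the real extractor instance."""

    def create_agent(self):
        pass

    def list_agents(self):
        pass

    def get_agent(self):
        pass


if __name__ == "__main__":
    extractor = FakeExtractor()
    print(public_callables(extractor))
    print(suggest_method(extractor, "create_agents"))
```

Running `public_callables(extractor)` against the real object shows which methods version 0.0.4 provides, which can then be compared with the README.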

18 comments

kev

Workflow

Is there an example of how to build a CUA with LlamaIndex workflows?

3 comments

kev

Automated retrieval with llama-index

Automated retrieval with llama-index is being blocked by the target website. Is there a workaround?

from llama_index.core import SummaryIndex
from llama_index.readers.web import SimpleWebPageReader
from IPython.display import Markdown, display
import os

documents = SimpleWebPageReader(html_to_text=True).load_data(["https://www.xyz.com"])
documents

[MyHomepage] Main\nContent Main Navigation\n\n## Page not available\n\nYour access to website has been blocked because you are using an\nautomated process to retrieve content\n\nReason: Automated retrieval by user agent "python-requests/2.31.0".\n\nURL:
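
The block message above names the default `python-requests/2.31.0` user agent as the reason. One possible workaround, sketched here with the standard library only, is to fetch the page yourself with an explicit `User-Agent` header and then hand the text to llama-index. The helper names and the `my-crawler/0.1` string are placeholders, and whether a given site accepts a custom user agent is not guaranteed.

```python
import urllib.request


def build_request(url, user_agent):
    """Build a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent, timeout=10.0):
    """Download a page as text using the given User-Agent."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Placeholder values; no network call is made here.
    req = build_request("https://www.xyz.com", "my-crawler/0.1")
    print(req.get_header("User-agent"))
```

The returned HTML can then be wrapped in a `Document(text=...)` from `llama_index.core` and indexed as usual, assuming the site's terms allow automated access.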

1 comment

kev

Error adding to a collection in ChromaDB:

Error adding to a collection in ChromaDB:

collection_name = "name"
vector_store = ChromaVectorStore(chroma_collection=collection_name)

storage_context = StorageContext.from_defaults(vector_store=vector_store)

raw_index = VectorStoreIndex.from_documents(
    parsed_docs,
    storage_context=storage_context,
    embed_model=Settings.embed_model,
)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-41-79eec0778777> in <cell line: 7>()
      5 storage_context = StorageContext.from_defaults(vector_store=vector_store)
      6 
----> 7 raw_index = VectorStoreIndex.from_documents(
      8                                             parsed_docs,
      9                                             storage_context=storage_context,

6 frames
/usr/local/lib/python3.10/dist-packages/llama_index/vector_stores/chroma/base.py in add(self, nodes, **add_kwargs)
    263                 documents.append(node.get_content(metadata_mode=MetadataMode.NONE))
    264 
--> 265             self._collection.add(
    266                 embeddings=embeddings,
    267                 ids=ids,

AttributeError: 'str' object has no attribute 'add'
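
The traceback ends in `self._collection.add(...)` failing on a `str`, which suggests that `chroma_collection` received the collection name rather than a collection object. A minimal sketch of one likely fix follows, assuming `chromadb` and `llama-index-vector-stores-chroma` are installed. `require_collection` and `build_vector_store` are hypothetical helper names introduced here for illustration.

```python
def require_collection(candidate):
    """Return candidate if it looks like a Chroma collection, else raise TypeError."""
    if isinstance(candidate, str) or not callable(getattr(candidate, "add", None)):
        raise TypeError(
            "expected a Chroma collection object with an .add() method, "
            f"got {type(candidate).__name__}"
        )
    return candidate


def build_vector_store(collection_name):
    """Create (or reuse) an in-memory collection and wrap it for LlamaIndex."""
    # Imported lazily so the guard above works without these packages.
    import chromadb
    from llama_index.vector_stores.chroma import ChromaVectorStore

    client = chromadb.EphemeralClient()
    collection = client.get_or_create_collection(collection_name)
    return ChromaVectorStore(chroma_collection=require_collection(collection))
```

The returned store can then be passed to `StorageContext.from_defaults(vector_store=...)` as in the snippet above. `chromadb.PersistentClient(path=...)` is the on-disk alternative to the in-memory client.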

19 comments

kev

Encode

When we load an image using SimpleDirectoryReader, does it encode the image? If so, what encoding does it use?

img = SimpleDirectoryReader("/content/drive/images").load_data()
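
What the reader does internally may differ between versions, so inspecting the returned document objects is the reliable check. Independently of that, if a base64 string is what is needed downstream, it can be produced directly from the file. The sketch below uses only the standard library, and `image_to_base64` is a hypothetical helper name.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_base64(path):
    """Read an image file and return (mime_type, base64_string)."""
    path = Path(path)
    mime_type, _ = mimetypes.guess_type(path.name)
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return mime_type or "application/octet-stream", encoded
```

The MIME type is guessed from the file extension only, so it is a hint rather than a verified format.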

1 comment

kev

Is there an example notebook showcasing

Is there an example notebook showcasing the use of approximate metadata filtering? For example, I am using workflows for RAG and I'd like to include approximate metadata filtering for better retrieval accuracy.

from llama_index.core import VectorStoreIndex
from llama_index.core.schema import NodeWithScore
from llama_index.core.workflow import (
    Context,
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)

# `documents` and `storage_context` are created earlier (not shown).
custom_index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)


class RetrieverEvent(Event):
    """Carries retrieved nodes to the next step (assumed definition, not shown in the post)."""

    nodes: list[NodeWithScore]


class RAGWorkflow(Workflow):
    @step
    async def ingest(self, ctx: Context, ev: StartEvent) -> StopEvent | None:
        """Entry point - ingest documents"""
        index = custom_index
        return StopEvent(result=index)

    @step
    async def retrieve(self, ctx: Context, ev: StartEvent) -> RetrieverEvent | None:
        """Entry point for RAG, triggered by a StartEvent with `query`."""
        query = ev.get("query")
        index = ev.get("index")

        if not query:
            return None

        # store the query and the index in the global context
        await ctx.set("query", query)
        await ctx.set("index", index)

        # the index must be supplied on the StartEvent
        if index is None:
            print("Index is empty, load some documents before querying!")
            return None

        retriever = index.as_retriever(similarity_top_k=10)
        nodes = await retriever.aretrieve(query)

        return RetrieverEvent(nodes=nodes)
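
One way to approximate metadata filtering, independent of what a given vector store supports natively, is to over-retrieve and then post-filter the results by fuzzy string similarity on a metadata field. The sketch below is standard-library only and works on plain dicts with a `metadata` key. That shape, the function names, and the 0.8 threshold are all assumptions for illustration. With LlamaIndex results, the same comparison could be applied to `node.node.metadata` inside a step that follows `retrieve`.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def approx_metadata_filter(items, key, target, threshold=0.8):
    """Keep items whose metadata[key] is approximately equal to target."""
    kept = []
    for item in items:
        value = item.get("metadata", {}).get(key)
        if isinstance(value, str) and similarity(value, target) >= threshold:
            kept.append(item)
    return kept


if __name__ == "__main__":
    # Hypothetical retrieved items.
    retrieved = [
        {"text": "rent table", "metadata": {"property": "Property A"}},
        {"text": "rent summary", "metadata": {"property": "property a "}},
        {"text": "other", "metadata": {"property": "Warehouse 7"}},
    ]
    for item in approx_metadata_filter(retrieved, "property", "Property A"):
        print(item["text"])
```

Because the filter runs after retrieval, `similarity_top_k` may need to be larger than the number of nodes finally wanted.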

7 comments

kev

What's the best way to use llama-index

What's the best way to use llama-index to retrieve row(s) and cell value from a pandas dataframe based on a natural language user query?

41 comments

kev

Is there a good example / cookbook for

Is there a good example / cookbook for multi-vector / recursive retriever + multi-modal RAG using llama-index?

Here's a LangChain example: https://github.com/langchain-ai/langchain/blob/master/cookbook/Multi_modal_RAG.ipynb

21 comments

kev

Hi there, I used `llama-parse` and

Hi there, I used llama-parse to implement RAG on a set of financial documents. Similar to the example in this notebook [1], we built a raw index and a recursive index. To my surprise, the results from raw_index.as_query_engine are more accurate than those from the recursive one. I am trying to get an intuition for why this might be. For context, we have tables with financial data, and a sample query might look like: what was the total rent for Property A in 2023? What is the key difference between the two indices, and how do they work under the hood?

[1] https://github.com/run-llama/llama_parse/blob/main/examples/demo_advanced.ipynb

2 comments
