Find answers from the community

Can I know, when chunking with the default settings, which part of the nodes is transformed into an embedding?

Is it node.get_content(metadata_mode=MetadataMode.EMBED)?
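If you want to see exactly what gets embedded for a given node, a minimal sketch (assuming node is a node produced by the default pipeline):

Python
from llama_index.core.schema import MetadataMode

# The string handed to the embed model for this node:
text_for_embedding = node.get_content(metadata_mode=MetadataMode.EMBED)
print(text_for_embedding)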
4 comments
Hi everyone,

Is it possible to instantiate a SummaryIndex without setting the LLM in the Settings global class? I’m looking for something similar to how it’s done in VectorStoreIndex, where I can pass the embed_model as an argument.

When I use the Settings class, everything works as expected, but this approach isn’t an option for me due to concurrency issues. In my app, I use dependency injection to handle LLM instances.
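A minimal sketch, assuming a recent llama_index version where as_query_engine() forwards an llm argument; my_llm stands for the injected instance:

Python
from llama_index.core import SummaryIndex

index = SummaryIndex.from_documents(documents)
# Pass the per-request LLM at query time instead of via Settings:
query_engine = index.as_query_engine(llm=my_llm)
response = query_engine.query("Summarize the documents.")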
2 comments
Hello. I am trying to develop on Azure AI with OpenAI. I have the endpoints and generic setup working. When I try to use Azure AI with llama_index.core, it still tries to go directly to an OpenAI service, not the service we have in Azure. I am following along with "Workflows for Advanced Text-2-SQL" and have changed this to llm=AzureOpenAI(...), yet it still wants to go to OpenAI. What am I missing?
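One common culprit is the embedding model, which also defaults to OpenAI unless overridden. A hedged sketch; the model names, deployment names, and api_version here are placeholders:

Python
from llama_index.core import Settings
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

Settings.llm = AzureOpenAI(
    model="gpt-4o",                        # underlying model (placeholder)
    deployment_name="my-gpt4-deployment",  # Azure deployment name (placeholder)
    azure_endpoint="https://<resource>.openai.azure.com/",
    api_key="...",
    api_version="2024-02-01",
)
# Without this, embedding calls still go straight to api.openai.com:
Settings.embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="my-embedding-deployment",
    azure_endpoint="https://<resource>.openai.azure.com/",
    api_key="...",
    api_version="2024-02-01",
)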
3 comments
Hi everyone,
Is there any implementation of Cache-Augmented Generation (CAG) with Gemini or other LLMs integrated with LlamaIndex?
2 comments
cmosguy · Storage

I'm seeing a significant slowdown when I load a storage context from disk. Is there something I can do to figure out what is going on here?
3 comments
Is there a way to write a flexible, complex, and type-safe multi-agent RAG using LlamaIndex?

Is it using this one?
https://docs.llamaindex.ai/en/stable/module_guides/workflow/
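Workflows are the closest fit I know of: events passed between steps are typed, Pydantic-validated classes. A minimal sketch of the pattern (all names illustrative):

Python
from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)

class RetrieveEvent(Event):
    query: str  # typed payload, validated like a Pydantic field

class RagFlow(Workflow):
    @step
    async def route(self, ev: StartEvent) -> RetrieveEvent:
        return RetrieveEvent(query=ev.query)

    @step
    async def answer(self, ev: RetrieveEvent) -> StopEvent:
        # retrieval / agent hand-off would go here
        return StopEvent(result=f"answered: {ev.query}")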
@kapa.ai I have Microsoft Word documents; how do I extract the document content into Markdown and include the images as well?
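For the Markdown half, a sketch using LlamaParse (a LlamaCloud API key is required; getting the embedded images out typically goes through its JSON result mode, which I leave aside here):

Python
from llama_parse import LlamaParse

parser = LlamaParse(result_type="markdown")
documents = parser.load_data("my_document.docx")  # path is a placeholder
print(documents[0].text)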
14 comments
Hello team, how are you all doing?

I'm getting this error:
Plain Text
ERROR:root:2 validation errors for LLMChatStartEvent
messages.0
  Input should be a valid dictionary or instance of ChatMessage [type=model_type, input_value=ChatMessage(role=<Message... additional_kwargs=None), input_type=ChatMessage]


when trying to use llm.chat:

Plain Text
messages = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content=PromptTemplate(
            "You are responsible for rephrasing and creating a new question from an existing one.\n"
            "The new question must keep the same quality and relevance as the original question.\n"
            "The content must be rephrased to introduce variations in style, complexity, and context.\n"
            "The original question is the following:\n"
            "{question}\n"
            "The answer choices are:\n"
            "{alternatives}\n"
            # "The teacher's comments are:\n"
            # "{comments}"
        ).format(
            question=parsed_question_text,
            alternatives=alternative_text,
            comments=question_comments
        )
    ),
    ChatMessage(
        role=MessageRole.USER,
        # content must be a plain string, not a PromptTemplate instance
        content=(
            "Rephrase the original question and create a new question.\n"
            "Return the new question and the answer choices."
        )
    )
]
3 comments
Hey all, I am building an app where I want to give users the ability to upload their own "knowledge bases". This means each user has their own index saved to Redis and backed up on S3. The goal is to attempt to load an index from Redis and, if it doesn't exist, build from S3 and deploy to Redis: a regular caching strategy. My current method to check that the index is loading from cache is this:

Plain Text
redis_client: TtlRedis = TtlRedis(
    host=os.getenv("REDIS_CLUSTER_ADDRESS", "localhost"),
    password=os.getenv("REDIS_CLUSTER_PASSWORD", None),
    port=int(os.getenv("REDIS_CLUSTER_PORT", 6379)),  # env vars are strings; port must be an int
    ttl=86400,
)
# Initialize Redis vector store
vector_store = RedisVectorStore(
    redis_client=redis_client,
    schema=await create_schema(user_id, application_id),
)

index = VectorStoreIndex.from_vector_store(vector_store)

# Check if the index exists; create if it doesn't
if not vector_store.index_exists():
    # try loading from s3 since it is not in redis
    ...

This makes sense in my head, but the RedisVectorStore constructor creates an index at the end, so there is never a time when vector_store.index_exists() returns False. Maybe I am using this wrong?
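One way around the auto-create behavior is to ask RediSearch directly, before constructing the RedisVectorStore. A sketch with plain redis-py (assuming TtlRedis behaves like a standard client, and index_name matches your schema):

Python
from redis.exceptions import ResponseError

def redis_index_exists(client, index_name: str) -> bool:
    # FT.INFO raises ResponseError if the index does not exist yet
    try:
        client.ft(index_name).info()
        return True
    except ResponseError:
        return False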
Hello! I took a stab at sprucing up LlamaIndex's integration with llmlingua (to include integration with llmlingua2), but the linting action failed at the Makefile xD

Not sure how to clear it - anyone got an idea?

https://github.com/run-llama/llama_index/pull/17531
12 comments
Is there a way to display eval results from LlamaIndex eval?
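One common pattern is to flatten the EvaluationResult objects into a pandas DataFrame; a sketch, assuming results is a list of llama_index EvaluationResult objects:

Python
import pandas as pd

df = pd.DataFrame(
    {
        "query": [r.query for r in results],
        "passing": [r.passing for r in results],
        "score": [r.score for r in results],
        "feedback": [r.feedback for r in results],
    }
)
print(df)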
1 comment
Vi · Eval

I want to evaluate a retriever together with a node postprocessor using the LlamaIndex evaluator, but I don't know where to put the reranker, since the evaluator only accepts a retriever, not a node postprocessor. Is there a retriever bundled with a node postprocessor?
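Not out of the box as far as I know, but a small custom retriever can bundle the two; a sketch (the class name is mine):

Python
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import QueryBundle

class RerankingRetriever(BaseRetriever):
    """Runs a base retriever, then a node postprocessor (e.g. a reranker)."""

    def __init__(self, retriever, node_postprocessor):
        self._retriever = retriever
        self._postprocessor = node_postprocessor
        super().__init__()

    def _retrieve(self, query_bundle: QueryBundle):
        nodes = self._retriever.retrieve(query_bundle)
        return self._postprocessor.postprocess_nodes(
            nodes, query_bundle=query_bundle
        )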
4 comments
Yep, I was about to ask the same... I guess it's a temporary error, but it would be nice to read something official.
1 comment
I'm trying to sign up, but the page has an issue.
2 comments
Hey all, currently trying to debug the following error:

{"error":"Wrong input: Vector inserting error: expected dim: 1536, got 1024"}

I am using an embedding model with a dimension of 1024 via BedrockEmbedding, and have set the embed_model in both Settings and the VectorStoreIndex.from_vector_store() method to this embedding model, yet for some reason it is still expecting the OpenAI embedding model. Am I missing something? Any advice on how to debug?
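For reference, the double-check I'd start with (the model_name is a placeholder). Note also that if the Qdrant collection was originally created with 1536-dim vectors, the stored dimension wins; the collection has to be dropped and recreated at 1024:

Python
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.bedrock import BedrockEmbedding

embed_model = BedrockEmbedding(model_name="cohere.embed-english-v3")  # 1024-dim (placeholder)
Settings.embed_model = embed_model
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)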
10 comments
How do I update (add or edit docs in) an existing index? I am not able to reuse the index.

Here is my code for saving data:
Plain Text
import qdrant_client
from llama_index.core import Settings, StorageContext, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.storage.docstore.mongodb import MongoDocumentStore
from llama_index.storage.index_store.mongodb import MongoIndexStore
from llama_index.vector_stores.qdrant import QdrantVectorStore

email_docs = process_emails_sync(filtered_unprocessed_emails, user)
docstore = MongoDocumentStore.from_uri(uri=LLAMAINDEX_MONGODB_STORAGE_SRV)
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(email_docs)  # was `my_docs`, which is undefined
docstore.add_documents(nodes)
Settings.llm = OpenAI(model=ModelType.OPENAI_GPT_4_o_MINI.value)
Settings.embed_model = OpenAIEmbedding(api_key=OPENAI_API_KEY)
client = qdrant_client.QdrantClient(url=QDRANT_API_URL, api_key=QDRANT_API_TOKEN)

vector_store = QdrantVectorStore(client=client, collection_name=LLAMAINDEX_QDRANT_COLLECTION_NAME)

index_store = MongoIndexStore.from_uri(uri=LLAMAINDEX_MONGODB_STORAGE_SRV)
storage_context = StorageContext.from_defaults(vector_store=vector_store, index_store=index_store, docstore=docstore)

index = VectorStoreIndex(nodes, storage_context=storage_context, show_progress=True)
index.storage_context.persist()

When I try to load the index using the same storage context as above, I get an exception saying that I need to specify an index_id, because a new index is created every time I run the code above. How do I pass the index_id to the store so that it updates the existing index? Please note that I am already using doc_id correctly to ensure upserting of documents.

load_index_from_storage(storage_context=storage_context, index_id="8cebc4c8-9625-4a79-8544-4943b4182116")

I have tried using VectorStoreIndex(nodes, storage_context=storage_context, show_progress=True, index_id="<index_id>") but that approach didn't work.
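One pattern that may fit: pin a stable index_id with set_index_id() before persisting, then load by that id and insert into the existing index (a sketch; "email_index" is an arbitrary id I chose):

Python
from llama_index.core import VectorStoreIndex, load_index_from_storage

index = VectorStoreIndex(nodes, storage_context=storage_context)
index.set_index_id("email_index")  # stable id instead of a random UUID
index.storage_context.persist()

# on a later run: load the same index and update it in place
index = load_index_from_storage(storage_context, index_id="email_index")
index.insert_nodes(new_nodes)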
33 comments
1) How should I think about the distinction between an agent and a workflow?

2) I built a system that queries a simple JSON file for one single CSV (short content). I want to expand on it by adding another JSON file that contains content from a blog (longer). In this case, is QueryEngineTool necessary?

The main reason for my confusion is that when I look at the workflow documentation, they never use a QueryEngineTool: https://docs.llamaindex.ai/en/stable/examples/workflow/rag/
but in the agent cookbooks a bunch of imports are used: https://github.com/run-llama/python-agents-tutorial/blob/main/5_memory.py

Right now I handle it like this without using that import:
Plain Text
try:
    # Load appropriate data based on query type
    json_data = load_marketing_data() if is_logo_query else load_json_data()

    result = await w.run(
        query=user_message.content,
        list_of_dict=json_data,
        llm=llm,
        table_name=DEFAULT_TABLE_NAME,
        timeout=10
    )
except Exception:
    raise  # error handling elided in the original snippet
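For reference, wrapping the blog index as a tool is small if you do go the agent route; a sketch (names illustrative):

Python
from llama_index.core.tools import QueryEngineTool

blog_tool = QueryEngineTool.from_defaults(
    query_engine=blog_index.as_query_engine(),
    name="blog_content",
    description="Answers questions about the blog posts.",
)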
4 comments
Hello folks, I hope you are doing great.

I'm designing a Workflow with some HITL steps. I have successfully managed to break out of the event streaming and capture the human input event.

At this point, most examples connect via input() or a websocket and then resume operations. But my use case needs to be asynchronous.

I need to be able to checkpoint the workflow, save it somewhere (e.g. a database), and pause execution.

Later, when I receive the response from the human, I want to load the checkpoint, set the new event, and continue.

So far, I'm able to stop at the desired event, save the checkpoint, and load it again.

THE PROBLEM:

Breaking the streaming loop to capture the event prevents the Workflow from marking that step as completed, which causes the Checkpointer not to checkpoint the progress of the step that emitted the InputRequiredEvent.

Is it possible to accomplish this? Can I somehow force the checkpointer or mark the step as done?
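The docs' async HITL pattern sidesteps step-completion checkpoints by serializing the whole Context; a sketch, assuming recent versions' Context.to_dict/from_dict API, with save_to_db/load_from_db as hypothetical persistence helpers:

Python
from llama_index.core.workflow import Context, HumanResponseEvent, JsonSerializer

# on InputRequiredEvent: freeze the paused run and store it
ctx_dict = handler.ctx.to_dict(serializer=JsonSerializer())
save_to_db(run_id, ctx_dict)  # hypothetical persistence helper

# later, when the human answers: restore and resume
restored = Context.from_dict(workflow, load_from_db(run_id), serializer=JsonSerializer())
handler = workflow.run(ctx=restored)
handler.ctx.send_event(HumanResponseEvent(response=human_answer))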
3 comments
Hi, please help me choose a vector DB given my requirements:

I am currently pre-production, running on Pinecone. Pinecone is fine, but going into production I will have ~20k documents, and I think Pinecone will be too expensive.

I want a solution I can run on Azure, as I have some credits. I am trying to choose between hosting pgvector and Azure AI Search. What should I look for before making the switch?
1 comment
Hello, I'm using DocumentSummaryIndex to create summaries of all the documents that I have. I have code that uses an extended BaseExtractor class to create extra metadata, and code that loops through and appends the metadata.

However, I ran into some code suggesting this can be done by passing an extractor into the DocumentSummaryIndex builder:
Plain Text
summary_index = DocumentSummaryIndex.from_documents(
    documents=documents,
    transformations=[splitter],
    response_synthesizer=response_synthesizer,
    extractors=[sentiment_extractor],
)

The above code doesn't work; the extractor is never called. Is there a way of doing this? It's cleaner than the approach I currently have.
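A variant that may do what you want: extractors are transform components, so they can ride in transformations directly (assuming sentiment_extractor subclasses BaseExtractor):

Python
summary_index = DocumentSummaryIndex.from_documents(
    documents=documents,
    transformations=[splitter, sentiment_extractor],
    response_synthesizer=response_synthesizer,
)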
1 comment
Any example of agentic RAG where the agent worker should always use the query tool?
1 comment
Hey @Logan M, how are you? I am trying to use the workflow example to generate sub-questions, then go through and use ReAct to answer the sub-questions, from here: https://docs.llamaindex.ai/en/stable/examples/workflow/sub_question_query_engine/

The issue is when I get to this point in the sub-question routine:

Plain Text
agent = ReActAgent.from_tools(
    await ctx.get("tools"), llm=llm_4o_2, verbose=False, max_iterations=5
)
response = agent.chat(ev.question)

There are some sub-question queries where it fails with:

Error code: 400 - {'error': {'message': "This model's maximum context length is 128000 tokens. However, your messages resulted in 129643 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

I really do not understand how to control this. BTW, the tools are a list of retriever tools that were supposed to have a node_postprocessor reranker to titrate down the nodes, but I keep hitting this error regardless.
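One lever that may help is capping the agent's chat memory, so accumulated tool output cannot exceed the context window; a sketch (the token limit is arbitrary):

Python
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=60000)
agent = ReActAgent.from_tools(
    await ctx.get("tools"),
    llm=llm_4o_2,
    memory=memory,
    verbose=False,
    max_iterations=5,
)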
6 comments
Help: I'm trying to use an extractor like so:

Plain Text
class SentimentExtractor(BaseExtractor):
    def __init__(self, me, str):
        print("here")
        self.me = me
        self.str = str

    async def aextract(self, nodes: Sequence) -> List[Dict]:
        metadata_list = []
        for node in nodes:
            generated_sentiment = {"sentiment": "Positive"}  # Replace with actual LLM call
            metadata_list.append(generated_sentiment)
        return metadata_list

When I instantiate the class:
sentiment_extractor = SentimentExtractor(me="zsdfsadf", str="hello")

I get errors:
ValueError: "SentimentExtractor" object has no field "me"

If I define my own BaseExtractor just as a replacement, I don't get the error

I'm trying to actually pass in an LLM and a custom prompt to extract the sentiment out of nodes and add it to the metadata, but I seem to be running into something odd. Is it something with dependencies?
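BaseExtractor is a Pydantic model, so new attributes cannot be assigned in __init__; declaring them as class fields avoids the error. A sketch (I renamed str to prompt, since str shadows the builtin):

Python
from typing import Dict, List, Sequence

from llama_index.core.extractors import BaseExtractor
from llama_index.core.schema import BaseNode

class SentimentExtractor(BaseExtractor):
    # Pydantic fields declared on the class, not assigned in __init__
    me: str
    prompt: str

    async def aextract(self, nodes: Sequence[BaseNode]) -> List[Dict]:
        # Replace with an actual LLM call per node
        return [{"sentiment": "Positive"} for _ in nodes]

extractor = SentimentExtractor(me="zsdfsadf", prompt="hello")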
5 comments
@kapa.ai I am trying to use an Azure OpenAI LLM; the chat works, but .complete keeps asking for an API key.
19 comments
As a simple example: if I have invoices as documents, a desired question would be "Are there any invoices about graphics cards?"
3 comments