Find answers from the community

paulo
Joined September 25, 2024
paulo

Speed

Despite saving my index locally (small file size, <1 MB), it takes 30-45 seconds to receive a response for each of my queries. Does anyone know how to speed this up? I'm wondering how some apps achieve answer retrieval in under 5-10 seconds.
8 comments
How would I set the doc_id if I'm loading in multiple files at once?
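For what it's worth, a sketch of two options, assuming a llama_index version where SimpleDirectoryReader accepts filename_as_id (the directory path and naming scheme are placeholders):

Plain Text
from llama_index import SimpleDirectoryReader

# Option 1: derive each doc_id from its source filename
documents = SimpleDirectoryReader("./data", filename_as_id=True).load_data()

# Option 2: set doc_id manually after loading
documents = SimpleDirectoryReader("./data").load_data()
for i, doc in enumerate(documents):
    doc.doc_id = f"my-doc-{i}"  # hypothetical naming scheme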
15 comments
Does anyone know why I'm getting an error on the `index = GPTSimpleVectorIndex.from_documents(file_doc)` line?
6 comments
paulo

Streaming

Yes, I've tried that and it doesn't work. Here's the code:

Plain Text
@app.route("/query", methods=["GET"])
def query_index():
    global index
    query_text = request.args.get("text", None)
    if query_text is None:
        return "No text found, please include a ?text=blah parameter in the URL", 400
    query_engine = index.as_query_engine(service_context=service_context, streaming=True)

    def response_stream(response):
        def generate():
            for text in response:
                yield text
        return generate

    return stream_with_context(response_stream(query_engine.query(query_text).response_gen))
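For reference, a sketch of a likely fix, assuming the same app/index/service_context globals as above: response_stream hands Flask the inner generate function itself rather than a generator, so nothing is ever iterated. Calling generate() and wrapping it in a Response should stream:

Plain Text
from flask import Response, stream_with_context

@app.route("/query", methods=["GET"])
def query_index():
    global index
    query_text = request.args.get("text", None)
    if query_text is None:
        return "No text found, please include a ?text=blah parameter in the URL", 400
    query_engine = index.as_query_engine(service_context=service_context, streaming=True)
    streaming_response = query_engine.query(query_text)

    def generate():
        # response_gen yields text chunks as the model produces them
        for text in streaming_response.response_gen:
            yield text

    # Pass a generator object (generate()), not the function itself
    return Response(stream_with_context(generate()), mimetype="text/plain")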
99 comments
paulo

Function API

Does LlamaIndex support OpenAI's function calling at the moment?
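Later releases did add support via OpenAIAgent; a sketch, assuming a version that ships OpenAIAgent and FunctionTool (the multiply tool is just a placeholder):

Plain Text
from llama_index.agent import OpenAIAgent
from llama_index.tools import FunctionTool

# Placeholder tool for illustration
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

tool = FunctionTool.from_defaults(fn=multiply)
# OpenAIAgent uses OpenAI's function-calling API to pick and invoke tools
agent = OpenAIAgent.from_tools([tool], verbose=True)
print(agent.chat("What is 7 times 6?"))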
7 comments
Does LlamaIndex support pgvector w/ a Postgres database (not Supabase)?
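For anyone searching later: recent releases ship a PGVectorStore that works against plain Postgres with the pgvector extension; a sketch assuming that version (connection details are placeholders):

Plain Text
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import PGVectorStore

# Placeholder connection details for a plain Postgres instance with pgvector
vector_store = PGVectorStore.from_params(
    database="mydb",
    host="localhost",
    password="secret",
    port="5432",
    user="postgres",
    table_name="llama_vectors",
    embed_dim=1536,  # matches OpenAI ada-002 embeddings
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)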
1 comment
I'm using Guardrails but the response keeps getting cut off (the JSON object doesn't fully close). Does anyone know how to solve this?
43 comments
paulo

Graph query

I'm trying to query a composable graph like this:

Plain Text
response = graph.query(
    query_str=query_str, 
    query_configs=query_configs, 
    service_context=service_context_chatgpt
)


But I keep getting this error:
Plain Text
   response = graph.query(
               ^^^^^^^^^^^^
TypeError: BaseQueryEngine.query() got an unexpected keyword argument 'query_str'

Does anyone know how to solve this?
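A sketch of the likely fix, assuming a llama_index release where ComposableGraph exposes as_query_engine (the query_configs and query_str keywords were dropped around the same time):

Plain Text
# query() takes the query text positionally, not as query_str
query_engine = graph.as_query_engine()
response = query_engine.query(query_str)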
1 comment
I was originally loading an index into a string and then saving that to S3. It looks like save_to_string has been replaced by StorageContext. How would I save it to S3?

Plain Text
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir="./storage")
# load index
index = load_index_from_storage(storage_context)

I'm assuming I can't pass my S3 bucket to persist_dir, since that requires the AWS SDK.
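A sketch, assuming a llama_index release with fsspec support (the s3fs filesystem and bucket path are placeholders):

Plain Text
import s3fs
from llama_index import StorageContext, load_index_from_storage

s3 = s3fs.S3FileSystem()  # picks up AWS credentials from the environment

# persist an existing index straight to the bucket
index.storage_context.persist(persist_dir="my-bucket/index", fs=s3)

# rebuild the storage context and reload the index from S3
storage_context = StorageContext.from_defaults(persist_dir="my-bucket/index", fs=s3)
index = load_index_from_storage(storage_context)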
3 comments
Found out that it often produces the correct JSON, but it keeps printing this as part of the response: "The new context does not provide any additional information, so the original answer remains the same." That breaks things when I try to load the response as JSON. Do you know how to remove this additional commentary it keeps returning?
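A stopgap sketch (pure post-processing; the sentence comes from the refine step, so a custom refine prompt is the deeper fix): strip the known commentary before parsing.

Plain Text
import json
import re

# Hypothetical helper: remove the refine-step commentary before parsing
COMMENTARY = re.compile(
    r"The new context does not provide any additional information, "
    r"so the original answer remains the same\.?"
)

def parse_response(raw: str) -> dict:
    cleaned = COMMENTARY.sub("", raw).strip()
    return json.loads(cleaned)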
17 comments
paulo

Vector index

That makes sense. If I use a vector db, would I still use LlamaIndex? I'm a bit confused about how that would work, since LlamaIndex generates a JSON file for the index.
1 comment
I have several files with a prefix in my S3 bucket and would like to use build_graph_from_documents on them. What's the best way of doing this? I tried using the S3Reader, but unfortunately it kept throwing an error.
1 comment
Does anyone know how to solve this error?

Plain Text
Error querying graph: Invalid template: Context information is below.
---------------------
{context_str}
------------
...insert prompt here...

variables do not match the required input_variables: ['context_str']

I am simply passing query_str in as a keyword argument when calling the graph object. For example: response = graph.query(query_str=query_str, query_configs=query_configs)
paulo

Max tokens

Even after increasing my max_tokens limit, the output still keeps getting cut off. Does anyone know what might be causing this?
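A sketch of the usual culprit, assuming the langchain + llama_index stack used elsewhere in these threads: max_tokens has to be raised on the LLM itself, and num_output on the service context so the prompt helper reserves room for the answer.

Plain Text
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext

llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-3.5-turbo", max_tokens=1024))
# num_output tells llama_index to leave 1024 tokens free for the response
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, num_output=1024)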
17 comments
paulo

Token sizes

Thanks! Where would I tweak the chunk size?

Also, I can tell that when I run the query, the response gets cut off (I'm assuming this has something to do with the max number of tokens it's allowed to use?). How would I go about solving this?
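For the chunk size part, a sketch (older releases call the argument chunk_size_limit rather than chunk_size):

Plain Text
from llama_index import ServiceContext

# Smaller chunks produce more, finer-grained nodes at index time
service_context = ServiceContext.from_defaults(chunk_size=512)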
2 comments
Does anyone know how to solve this error when installing `llama-index` through pip?

I'm using an AWS Lambda Python 3.9 Docker Image and included the following in my requirements.txt:
Plain Text
openai==0.26.5
tiktoken==0.2.0
wheel==0.38.4
langchain==0.0.94
llama-index==0.4.13


I saw someone had a similar issue before so I tried adding --platform=linux/x86_64 to my Dockerfile but it didn't change anything.
6 comments