Find answers from the community

paulo
Joined September 25, 2024
paulo

Speed

Despite saving my index locally (small file size, <1 MB), it takes 30-45 seconds to receive a response for each of my queries. Does anyone know how to speed this up? I'm wondering how some apps achieve answer retrieval in under 5-10 seconds.
8 comments
How would I set the doc_id if I'm loading in multiple files at once?
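For what it's worth, a sketch of two options, assuming a llama_index version where SimpleDirectoryReader accepts filename_as_id (the directory path and naming scheme are placeholders):

Plain Text
from llama_index import SimpleDirectoryReader

# Option 1: derive each doc_id from its source filename
documents = SimpleDirectoryReader("./data", filename_as_id=True).load_data()

# Option 2: set doc_id manually after loading
documents = SimpleDirectoryReader("./data").load_data()
for i, doc in enumerate(documents):
    doc.doc_id = f"my-doc-{i}"  # hypothetical naming scheme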
15 comments
Does anyone know why I'm getting an error on the `index = GPTSimpleVectorIndex.from_documents(file_doc)` line?
6 comments
paulo

Streaming

Yes, I've tried that and it doesn't work. Here's the code:

Plain Text
@app.route("/query", methods=["GET"])
def query_index():
    global index
    query_text = request.args.get("text", None)
    if query_text is None:
        return "No text found, please include a ?text=blah parameter in the URL", 400
    query_engine = index.as_query_engine(service_context=service_context, streaming=True)

    def response_stream(response):
        def generate():
            for text in response:
                yield text
        return generate

    return stream_with_context(response_stream(query_engine.query(query_text).response_gen))
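For reference, a sketch of a likely fix, assuming the same app/index/service_context globals as above: response_stream hands Flask the inner generate function itself rather than a generator, so nothing is ever iterated. Calling generate() and wrapping it in a Response should stream:

Plain Text
from flask import Response, stream_with_context

@app.route("/query", methods=["GET"])
def query_index():
    global index
    query_text = request.args.get("text", None)
    if query_text is None:
        return "No text found, please include a ?text=blah parameter in the URL", 400
    query_engine = index.as_query_engine(service_context=service_context, streaming=True)
    streaming_response = query_engine.query(query_text)

    def generate():
        # response_gen yields text chunks as the model produces them
        for text in streaming_response.response_gen:
            yield text

    # Pass a generator object (generate()), not the function itself
    return Response(stream_with_context(generate()), mimetype="text/plain")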
99 comments
paulo

Function API

Does LlamaIndex support OpenAI's function calling at the moment?
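Later releases did add support via OpenAIAgent; a sketch, assuming a version that ships OpenAIAgent and FunctionTool (the multiply tool is just a placeholder):

Plain Text
from llama_index.agent import OpenAIAgent
from llama_index.tools import FunctionTool

# Placeholder tool for illustration
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

tool = FunctionTool.from_defaults(fn=multiply)
# OpenAIAgent uses OpenAI's function-calling API to pick and invoke tools
agent = OpenAIAgent.from_tools([tool], verbose=True)
print(agent.chat("What is 7 times 6?"))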
7 comments
Does LlamaIndex support pgvector w/ a Postgres database (not Supabase)?
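For anyone searching later: recent releases ship a PGVectorStore that works against plain Postgres with the pgvector extension; a sketch assuming that version (connection details are placeholders):

Plain Text
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import PGVectorStore

# Placeholder connection details for a plain Postgres instance with pgvector
vector_store = PGVectorStore.from_params(
    database="mydb",
    host="localhost",
    password="secret",
    port="5432",
    user="postgres",
    table_name="llama_vectors",
    embed_dim=1536,  # matches OpenAI ada-002 embeddings
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)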
1 comment
I'm using Guardrails but the response keeps getting cut off (the JSON object doesn't fully close). Does anyone know how to solve this?
43 comments
paulo

Graph query

I'm trying to query a composable graph like this:

Plain Text
response = graph.query(
    query_str=query_str, 
    query_configs=query_configs, 
    service_context=service_context_chatgpt
)


But I keep getting this error:
Plain Text
   response = graph.query(
               ^^^^^^^^^^^^
TypeError: BaseQueryEngine.query() got an unexpected keyword argument 'query_str'

Does anyone know how to solve this?
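A sketch of the likely fix, assuming a llama_index release where ComposableGraph exposes as_query_engine (the query_configs and query_str keywords were dropped around the same time):

Plain Text
# query() takes the query text positionally, not as query_str
query_engine = graph.as_query_engine()
response = query_engine.query(query_str)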
1 comment
I was originally loading an index into a string and then saving that to S3. It looks like save_to_string has been replaced by StorageContext. How would I save it to S3?

Plain Text
from llama_index import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir="./storage")
# load index
index = load_index_from_storage(storage_context)

I'm assuming I can't pass my S3 bucket to persist_dir, since that requires the AWS SDK.
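A sketch, assuming a llama_index release with fsspec support (the s3fs filesystem and bucket path are placeholders):

Plain Text
import s3fs
from llama_index import StorageContext, load_index_from_storage

s3 = s3fs.S3FileSystem()  # picks up AWS credentials from the environment

# persist an existing index straight to the bucket
index.storage_context.persist(persist_dir="my-bucket/index", fs=s3)

# rebuild the storage context and reload the index from S3
storage_context = StorageContext.from_defaults(persist_dir="my-bucket/index", fs=s3)
index = load_index_from_storage(storage_context)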
3 comments
Found out that it often produces the correct JSON, but it keeps printing this as part of the response: "The new context does not provide any additional information, so the original answer remains the same." That breaks things when I try to load the response as JSON. Do you know how to remove this additional commentary it keeps returning?
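A stopgap sketch (pure post-processing; the sentence comes from the refine step, so a custom refine prompt is the deeper fix): strip the known commentary before parsing.

Plain Text
import json
import re

# Hypothetical helper: remove the refine-step commentary before parsing
COMMENTARY = re.compile(
    r"The new context does not provide any additional information, "
    r"so the original answer remains the same\.?"
)

def parse_response(raw: str) -> dict:
    cleaned = COMMENTARY.sub("", raw).strip()
    return json.loads(cleaned)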
17 comments
paulo

Vector index

That makes sense. If I use a vector db, would I still use LlamaIndex? I'm a bit confused about how that would work, since LlamaIndex generates a JSON file for the index.
1 comment
I have several files with a prefix in my S3 bucket and would like to use build_graph_from_documents on them. What's the best way of doing this? I tried using the S3Reader, but unfortunately it kept throwing an error.
1 comment
Does anyone know how to solve this error?

Plain Text
Error querying graph: Invalid template: Context information is below.
---------------------
{context_str}
------------
...insert prompt here...

variables do not match the required input_variables: ['context_str']

I am simply passing query_str in as a keyword argument when calling the graph object. For example: response = graph.query(query_str=query_str, query_configs=query_configs)
paulo

Max tokens

Even after increasing my max_tokens limit, the output still keeps getting cut off. Does anyone know what might be causing this?
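A sketch of the usual culprit, assuming the langchain + llama_index stack used elsewhere in these threads: max_tokens has to be raised on the LLM itself, and num_output on the service context so the prompt helper reserves room for the answer.

Plain Text
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext

llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name="gpt-3.5-turbo", max_tokens=1024))
# num_output tells llama_index to leave 1024 tokens free for the response
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, num_output=1024)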
17 comments
paulo

Token sizes

Thanks! Where would I tweak the chunk size?

Also, I can tell that when I run the query, the response gets cut off (I'm assuming this has something to do with the max number of tokens it's allowed to use?). How would I go about solving this?
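For the chunk size part, a sketch (older releases call the argument chunk_size_limit rather than chunk_size):

Plain Text
from llama_index import ServiceContext

# Smaller chunks produce more, finer-grained nodes at index time
service_context = ServiceContext.from_defaults(chunk_size=512)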
2 comments
Does anyone know how to solve this error when installing `llama-index` through pip?

I'm using an AWS Lambda Python 3.9 Docker Image and included the following in my requirements.txt:
Plain Text
openai==0.26.5
tiktoken==0.2.0
wheel==0.38.4
langchain==0.0.94
llama-index==0.4.13


I saw someone had a similar issue before so I tried adding --platform=linux/x86_64 to my Dockerfile but it didn't change anything.
6 comments