Find answers from the community

T. Ventura
Joined September 25, 2024
Hey guys, any experience with LlamaIndex in AWS Lambda? I wonder if the package is too big; would shipping it as a layer be a possible solution?
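For context: a Lambda layer is capped at 250 MB unzipped (shared with the function code), and LlamaIndex plus its dependencies often blows past that, so a container image is the usual fallback. If it does fit, layers extract to /opt, and a handler could look like this sketch (the /opt/storage path and the event["question"] payload shape are assumptions for illustration):

Python
# A sketch, assuming the dependencies and a pre-persisted index ship in a layer
# (layers extract to /opt) and stay under Lambda's 250 MB unzipped limit.
from llama_index.core import StorageContext, load_index_from_storage

# Load once at module level so warm invocations reuse the index;
# only cold starts pay this cost.
storage_context = StorageContext.from_defaults(persist_dir="/opt/storage")
index = load_index_from_storage(storage_context)
query_engine = index.as_query_engine()

def handler(event, context):
    # event["question"] is an assumed payload shape for this example
    response = query_engine.query(event["question"])
    return {"answer": str(response)}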
21 comments
T. Ventura

Index

Hey guys,

I'm using Google Cloud Run to deploy my RAG app. I'm finding the app quite slow; sometimes it takes 10 seconds to execute the code.
It seems to be related to index storage. Locally, I'm saving the index to a storage folder, along with my .txt files.

What are people doing out there in real production apps?

Here is my Dockerfile. I define my VOLUMEs, but I don't think this is the best approach (see the sketch after the Dockerfile).

Dockerfile
# Use the official Python 3.11 image as the base image
FROM --platform=linux/amd64 python:3.11

# Set the working directory in the container
WORKDIR /code
# NOTE: VOLUME only declares mount points; on Cloud Run the writable filesystem
# is in-memory, so nothing written here survives an instance restart
VOLUME /code/data
VOLUME /code/storage

# Copy the requirements.txt file into the container at /code
COPY requirements.txt .

# Install any needed dependencies specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code into the container at /code
COPY . .

# Specify the command to run your application
# CMD [ "python", "app.py" ]
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "9090"]
# CMD ["uvicorn", "app.main:app", "--reload"]
25 comments
T. Ventura

Chat engine

Hey Guys,

How do I mix indexed docs with chat history?

I'm trying something like this:

Python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.openai import OpenAI

chat_history = [
    ChatMessage(role=MessageRole.SYSTEM, content="You are a helpful QA chatbot that can answer questions about llama-index."),
    ChatMessage(role=MessageRole.USER, content="How do I create an index?"),
    ChatMessage(role=MessageRole.ASSISTANT, content="LlamaIndex is a data framework for LLM-based applications which benefit from context augmentation."),
]

llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
llm.chat(chat_history)  # here I try to attach the chat history, but the reply is discarded and never reaches the query engine below

documents = SimpleDirectoryReader("data").load_data()  # this is just my CV in text
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()  # built from the index alone; it knows nothing about chat_history
response = query_engine.query("What is LlamaIndex?")
print(response)


Response: "LlamaIndex is not mentioned in the provided context information."
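The usual way to combine an index with prior messages is a chat engine rather than a bare llm.chat() call, whose reply is simply discarded above. A sketch, assuming the llama_index.core package layout and the built-in condense_question chat mode (the history contents are illustrative):

Python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.llms.openai import OpenAI

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

chat_history = [
    ChatMessage(role=MessageRole.USER, content="How do I create an index?"),
    ChatMessage(role=MessageRole.ASSISTANT, content="Load documents and call VectorStoreIndex.from_documents(...)."),
]

# condense_question rewrites the incoming message using the history, then runs
# the rewritten question against the index, so both end up in one pipeline
chat_engine = index.as_chat_engine(
    chat_mode="condense_question",
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0),
)
response = chat_engine.chat("What is LlamaIndex?", chat_history=chat_history)
print(response)

This puts the history and the retrieved context into the same flow; in the snippet above, llm.chat() and query_engine.query() were two unrelated calls, which is why the response ignored the history.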
20 comments