I'm trying to run LlamaIndex with llama-cpp-python by following the installation docs, but inside a Docker container.
Dockerfile:
# Use the official Python image for Python 3.11
FROM python:3.11
# Set the working directory in the container
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# ARG FORCE_CMAKE=1
# ARG CMAKE_ARGS="-DLLAMA_CUBLAS=on"
# Install project dependencies
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" python -m pip install -r requirements.txt
# Command to run the server
CMD ["python", "./server.py"]
Problem:
The environment variables from the llama-cpp-python installation docs do not take effect as expected when set inside a Docker build.
Current behaviour: BLAS = 0 (the LLM runs on the CPU)
Expected behaviour: BLAS = 1 (the LLM runs on the GPU)
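For reference, a sketch of a variant I'm considering, under the assumption that the CPU fallback happens because the python:3.11 base image ships no CUDA toolkit, so CMake cannot find cuBLAS at build time (the image tag below is only an example; it would need to match the host driver):

# Build on a CUDA *devel* image so nvcc and the cuBLAS headers exist at build time
FROM nvidia/cuda:12.1.1-devel-ubuntu22.04
RUN apt-get update && apt-get install -y python3 python3-pip
WORKDIR /app
COPY . /app
# FORCE_CMAKE=1 forces a source build so CMAKE_ARGS is actually honoured
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python3 -m pip install -r requirements.txt
CMD ["python3", "./server.py"]

The container would presumably also need GPU access at runtime, e.g. docker run --gpus all ... with the NVIDIA Container Toolkit installed on the host, for BLAS = 1 to show up.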