Weird one, more of a help request than an issue. I’ve noticed that while running LlamaIndex, RAM usage is pretty noticeable, and I understand there are reasons for that. But I’m wondering if there’s a way to use LlamaIndex to just query data without holding everything in RAM, maybe via SQL? I looked at the SQL section of the docs and it seems possible, but I wanted to ask for opinions on whether that’s the best route or if there’s something better. I’m trying to host this on a server with 1 GB of free RAM, so I have a hard limit.
I have an LLM set up so that I just pass "context" into the bot as a new conversation every time, so I'm thinking I can use LlamaIndex to query a SQL database and send the results as plain text into my already-established context window.
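For what it's worth, a minimal sketch of that flow, assuming a legacy (pre-0.10) llama-index install; the SQLite file `example.db` and the `drives` table are hypothetical placeholders:
Python
from sqlalchemy import create_engine
from llama_index import SQLDatabase
from llama_index.indices.struct_store.sql_query import NLSQLTableQueryEngine

# Hypothetical local SQLite database and table -- swap in your own engine URL/tables.
engine = create_engine("sqlite:///example.db")
sql_database = SQLDatabase(engine, include_tables=["drives"])

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["drives"],
)

# The query engine writes/executes the SQL and synthesizes a plain-text answer,
# which can be dropped straight into an existing context window.
response = query_engine.query("How many drives are logged per week?")
context_text = str(response)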
what vector db are you using?

Limiting to 1GB of RAM will be tricky. Basically need to offload anything that might be holding memory to a remote server/cloud
the default vector db
I was looking at others, too, but I wanna keep it 'local', as in on the server we're hosting with.
so this means all the vectors basically have to be stored in memory -- using a local vector DB like Qdrant or Chroma may help, but tbh I'm not sure you'll be able to stay under the 1GB limit here
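As a rough sketch of what that looks like, here is a disk-backed Chroma collection wired into LlamaIndex instead of the default in-memory store (the `./chroma_db` path and `my_collection` name are made up, and some working memory is still used, so this is not a guarantee of staying under 1GB):
Python
import chromadb
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

# Persist vectors to disk instead of the default in-memory SimpleVectorStore.
db = chromadb.PersistentClient(path="./chroma_db")
collection = db.get_or_create_collection("my_collection")

vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)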
I'll look into those two, and whether I figure it out or not, I'll be back.
Plain Text
Traceback (most recent call last):
  File "z:\Documents\GitHub\qdrant test.py", line 16, in <module>
    service_context = ServiceContext.from_defaults(chunk_size=512)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\service_context.py", line 178, in from_defaults
    llm_predictor = llm_predictor or LLMPredictor(
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llm_predictor\base.py", line 109, in __init__
    self._llm = resolve_llm(llm)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\utils.py", line 19, in resolve_llm
    from langchain.base_language import BaseLanguageModel
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\__init__.py", line 6, in <module>
    from langchain.agents import MRKLChain, ReActChain, SelfAskWithSearchChain
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\agents\__init__.py", line 2, in <module>
    from langchain.agents.agent import (
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\agents\agent.py", line 15, in <module>
    from langchain.agents.tools import InvalidTool
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\agents\tools.py", line 7, in <module>
    from langchain.tools.base import BaseTool
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\tools\__init__.py", line 3, in <module>
    from langchain.tools.base import BaseTool
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\tools\base.py", line 9, in <module>
    from langchain.callbacks import get_callback_manager
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\callbacks\__init__.py", line 6, in <module>
    from langchain.callbacks.aim_callback import AimCallbackHandler
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\callbacks\aim_callback.py", line 4, in <module>
    from langchain.callbacks.base import BaseCallbackHandler
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\callbacks\base.py", line 7, in <module>
    from langchain.schema import AgentAction, AgentFinish, LLMResult
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\schema.py", line 143, in <module>
    class ChatGeneration(Generation):
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\schema.py", line 150, in ChatGeneration
    def set_text(cls, values: Dict[str, Any]) -> Dict[str, Any]:
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\pydantic\deprecated\class_validators.py", line 231, in root_validator 
    return root_validator()(*__args)  # type: ignore
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\pydantic\deprecated\class_validators.py", line 237, in root_validator 
    raise PydanticUserError(
pydantic.errors.PydanticUserError: If you use `@root_validator` with pre=False (the default) you MUST specify `skip_on_failure=True`. Note that `@root_validator` is deprecated and should be replaced with `@model_validator`.

For further information visit https://errors.pydantic.dev/2.6/u/root-validator-pre-skip


Getting this error when trying to use the Qdrant DB, both via the API and locally. Moving on to trying ChromaDB...
I don't get this error when I run a non-LlamaIndex Python script against the same DB via the API.
I'm restarting my adventure with Qdrant, will see if I can't pinpoint where I went wrong.
lol that's a fun error -- you could try playing around with the pydantic version (I'm guessing installing qdrant bumped your pydantic version up)
ohhhh the error is coming from langchain too lol
do you need to use langchain? Chances are llama-index supports what you need without even installing langchain
(uninstalling langchain will probably help remove this error)
I'm not touching LangChain, I'll remove it
that seems to have removed that error!
Just an update: got the Qdrant API working with their free tier, and it's working great with llama-index too! Under one gig of RAM!
[Attachment: image.png]
Last question, because I can't figure it out: using this method with the API, does it recreate the index every time I launch it?
I ask because I'm doing vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data"), passing that to storage_context = StorageContext.from_defaults(vector_store=vector_store), which is then used by index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, service_context=service_context)...

I'll just post the whole code:
Plain Text
import os

from qdrant_client import QdrantClient
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.llms import OpenAI
from llama_index.vector_stores.qdrant import QdrantVectorStore

index_loaded = False
chat_engine = None
try:
    print("Initializing Qdrant client and loading documents...")
    client = QdrantClient(
        os.getenv('QDRANT_URL'),
        api_key=os.getenv('QDRANT_API'),
    )
    documents = SimpleDirectoryReader("data").load_data()
    print("Setting up vector store and initializing OpenAI model...")
    vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data")
    llm = OpenAI(model="gpt-4-turbo-preview")
    print("Setting up storage and service context...")
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    service_context = ServiceContext.from_defaults(llm=llm)
    print("Creating vector store index and chat engine...")
    # Pass these as keyword arguments so they land on the right parameters.
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
        service_context=service_context,
    )
    index_loaded = True
    chat_engine = index.as_chat_engine(chat_mode="best")
except Exception as e:
    print("Index not loaded, falling back to Vertex AI API LLM.", e)
If it does, how would I 'store' it so it doesn't have to rebuild unless there's a DB update?
I only ask because I'm using the API, so I don't fully understand it.
There's nothing to store 👀 Every time you call from_documents(), it's adding those documents and creating the index if it's not already created.

Use VectorStoreIndex.from_vector_store(vector_store, service_context=service_context) if you need to connect to an existing vector store
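For example, the startup path could be gated roughly like this, reusing the client, vector_store, storage_context, and service_context from the snippet above (a sketch only; checking get_collections() is just one way to decide whether to re-ingest):
Python
# Reuse the collection that already exists on the Qdrant server instead of re-ingesting.
existing = {c.name for c in client.get_collections().collections}

if "openpilot-data" in existing:
    # Connect to the existing vectors without loading or re-embedding the documents.
    index = VectorStoreIndex.from_vector_store(
        vector_store, service_context=service_context
    )
else:
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
        service_context=service_context,
    )

chat_engine = index.as_chat_engine(chat_mode="best")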