Weird one, more of a help request than an issue. I’ve noticed that while running LlamaIndex, RAM usage is pretty noticeable, and I understand there are reasons for that. But I’m wondering if there’s a way to use LlamaIndex to just query data without holding everything in RAM, maybe via SQL? I looked at the SQL section of the docs and it seems possible, but I wanted to ask for opinions on whether that’s the best route or if there’s something better. I’m trying to host this on a server with 1 GB of free RAM, so I have a hard limit.
I have an LLM set up so that I just pass "context" into the bot as a new conversation every time, so I'm thinking I can use LlamaIndex to query a SQL database and send the results as plain text into my already-established context window.
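For what it's worth, a minimal sketch of that flow, assuming a legacy (pre-0.10) llama-index install; the SQLite file `example.db` and the `drives` table are hypothetical placeholders:
Python
from sqlalchemy import create_engine
from llama_index import SQLDatabase
from llama_index.indices.struct_store.sql_query import NLSQLTableQueryEngine

# Hypothetical local SQLite database and table -- swap in your own engine URL/tables.
engine = create_engine("sqlite:///example.db")
sql_database = SQLDatabase(engine, include_tables=["drives"])

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["drives"],
)

# The query engine writes/executes the SQL and synthesizes a plain-text answer,
# which can be dropped straight into an existing context window.
response = query_engine.query("How many drives are logged per week?")
context_text = str(response)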
what vector db are you using?

Limiting to 1GB of RAM will be tricky. Basically need to offload anything that might be holding memory to a remote server/cloud
the default vector db
I was looking at others, too, but I wanna keep it 'local', as in on the server we're hosting with.
so this means all the vectors basically have to be stored in memory -- using a local vector DB like Qdrant or Chroma may help, but tbh I'm not sure you'll be able to stay under the 1GB limit here
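As a rough sketch of what that looks like, here is a disk-backed Chroma collection wired into LlamaIndex instead of the default in-memory store (the `./chroma_db` path and `my_collection` name are made up, and some working memory is still used, so this is not a guarantee of staying under 1GB):
Python
import chromadb
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import ChromaVectorStore

# Persist vectors to disk instead of the default in-memory SimpleVectorStore.
db = chromadb.PersistentClient(path="./chroma_db")
collection = db.get_or_create_collection("my_collection")

vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)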
I'll look into those two, and whether I figure it out or not, I'll be back.
Plain Text
Traceback (most recent call last):
  File "z:\Documents\GitHub\qdrant test.py", line 16, in <module>
    service_context = ServiceContext.from_defaults(chunk_size=512)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\service_context.py", line 178, in from_defaults
    llm_predictor = llm_predictor or LLMPredictor(
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llm_predictor\base.py", line 109, in __init__
    self._llm = resolve_llm(llm)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\llms\utils.py", line 19, in resolve_llm
    from langchain.base_language import BaseLanguageModel
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\__init__.py", line 6, in <module>
    from langchain.agents import MRKLChain, ReActChain, SelfAskWithSearchChain
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\agents\__init__.py", line 2, in <module>
    from langchain.agents.agent import (
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\agents\agent.py", line 15, in <module>
    from langchain.agents.tools import InvalidTool
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\agents\tools.py", line 7, in <module>
    from langchain.tools.base import BaseTool
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\tools\__init__.py", line 3, in <module>
    from langchain.tools.base import BaseTool
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\tools\base.py", line 9, in <module>
    from langchain.callbacks import get_callback_manager
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\callbacks\__init__.py", line 6, in <module>
    from langchain.callbacks.aim_callback import AimCallbackHandler
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\callbacks\aim_callback.py", line 4, in <module>
    from langchain.callbacks.base import BaseCallbackHandler
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\callbacks\base.py", line 7, in <module>
    from langchain.schema import AgentAction, AgentFinish, LLMResult
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\schema.py", line 143, in <module>
    class ChatGeneration(Generation):
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\langchain\schema.py", line 150, in ChatGeneration
    def set_text(cls, values: Dict[str, Any]) -> Dict[str, Any]:
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\pydantic\deprecated\class_validators.py", line 231, in root_validator 
    return root_validator()(*__args)  # type: ignore
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\pydantic\deprecated\class_validators.py", line 237, in root_validator 
    raise PydanticUserError(
pydantic.errors.PydanticUserError: If you use `@root_validator` with pre=False (the default) you MUST specify `skip_on_failure=True`. Note that `@root_validator` is deprecated and should be replaced with `@model_validator`.

For further information visit https://errors.pydantic.dev/2.6/u/root-validator-pre-skip


Getting this error when trying to use the Qdrant DB, both via the API and locally. Moving on to trying ChromaDB...
I don't get this error when I run a non-LlamaIndex Python script against the same DB via the API.
I'm restarting my adventure with Qdrant, will see if I can't pinpoint where I went wrong.
lol that's a fun error -- you could try playing around with the pydantic version (I'm guessing installing qdrant bumped your pydantic version up)
ohhhh the error is coming from langchain too lol
do you need to use langchain? Chances are llama-index supports what you need without even installing langchain
(uninstalling langchain will probably help remove this error)
I'm not touching LangChain, I'll remove it
that seems to have removed that error!
Just an update: got the Qdrant API working with their free tier, and it's working great with llama-index too! Under one gig of RAM!
[Attachment: image.png]
Last question, because I can't figure it out: using this method with the API, does it recreate the index every time I launch it?
I ask because I'm doing vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data"), passing that to storage_context = StorageContext.from_defaults(vector_store=vector_store), which is then used by index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, service_context=service_context)...

I'll just post the whole code:
Plain Text
import os

from qdrant_client import QdrantClient
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.llms import OpenAI
from llama_index.vector_stores.qdrant import QdrantVectorStore

index_loaded = False
chat_engine = None
try:
    print("Initializing Qdrant client and loading documents...")
    client = QdrantClient(
        os.getenv('QDRANT_URL'),
        api_key=os.getenv('QDRANT_API'),
    )
    documents = SimpleDirectoryReader("data").load_data()
    print("Setting up vector store and initializing OpenAI model...")
    vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data")
    llm = OpenAI(model="gpt-4-turbo-preview")
    print("Setting up storage and service context...")
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    service_context = ServiceContext.from_defaults(llm=llm)
    print("Creating vector store index and chat engine...")
    # Pass these as keyword arguments so they land on the right parameters.
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
        service_context=service_context,
    )
    index_loaded = True
    chat_engine = index.as_chat_engine(chat_mode="best")
except Exception as e:
    print("Index not loaded, falling back to Vertex AI API LLM.", e)
If it does, how would I 'store' it so it doesn't have to rebuild unless there's a DB update?
I only ask because I'm using the API, so I don't fully understand it.
There's nothing to store 👀 Every time you call from_documents(), it's adding those documents and creating the index if it's not already created.

Use VectorStoreIndex.from_vector_store(vector_store, service_context=service_context) if you need to connect to an existing vector store
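For example, the startup path could be gated roughly like this, reusing the client, vector_store, storage_context, and service_context from the snippet above (a sketch only; checking get_collections() is just one way to decide whether to re-ingest):
Python
# Reuse the collection that already exists on the Qdrant server instead of re-ingesting.
existing = {c.name for c in client.get_collections().collections}

if "openpilot-data" in existing:
    # Connect to the existing vectors without loading or re-embedding the documents.
    index = VectorStoreIndex.from_vector_store(
        vector_store, service_context=service_context
    )
else:
    documents = SimpleDirectoryReader("data").load_data()
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
        service_context=service_context,
    )

chat_engine = index.as_chat_engine(chat_mode="best")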