Updated to 0.10.1, uninstalled and reinstalled llama-index and llama-index-core, but when I do from llama_index.core import VectorStoreIndex it won't import VectorStoreIndex.
Same thing with StorageContext. I haven't gotten a chance to try it yet, but I have that importing like this:
Plain Text
from llama_index.core.storage.storage_context import StorageContext
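The top-level re-export should also work on a clean 0.10 install (assuming llama-index-core installed without leftovers from the old version), e.g.:
Plain Text
from llama_index.core import StorageContext, VectorStoreIndex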
Are you running in a notebook or just py scripts?
in fact the automatic upgrade fails too:
Plain Text
C:\Users\thecr>llamaindex-cli upgrade Z:\Documents\GitHub\FrogBot\modules\utils
Traceback (most recent call last):
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\Scripts\llamaindex-cli.exe\__main__.py", line 4, in <module>
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\core\command_line\command_line.py", line 4, in <module>
    from llama_index.core.command_line.rag import RagCLI, default_ragcli_persist_dir
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\core\command_line\rag.py", line 9, in <module>
    from llama_index.core import (
ImportError: cannot import name 'Response' from 'llama_index.core' (unknown location)
I see you aren't using a venv
Plain Text
python -m venv venv
source venv/bin/activate
pip install -U llama-index
I've confirmed this works both locally (in a fresh venv) and in Google Colab
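One caveat since your paths look like Windows: the activation command is different there, something like:
Plain Text
python -m venv venv
venv\Scripts\activate
pip install -U llama-index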
10-4, I'll give it a shot when I can
I actually took this chance to upgrade from Python 3.10 to 3.12, which fixed the import issues. Got a new error with the auto-upgrade for LlamaIndex though:
Plain Text
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Scripts\llamaindex-cli.exe\__main__.py", line 7, in <module>
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\command_line.py", line 269, in main
    args.func(args)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\command_line.py", line 227, in <lambda>
    upgrade_parser.set_defaults(func=lambda args: upgrade_dir(args.directory))
                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\upgrade.py", line 283, in upgrade_dir
    upgrade_file(str(file_ref))
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\upgrade.py", line 267, in upgrade_file
    upgrade_py_md_file(file_path)
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\upgrade.py", line 249, in upgrade_py_md_file
    lines = f.readlines()
            ^^^^^^^^^^^^^
  File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1798: character maps to <undefined>
yes, I'm still not using a venv, I know...
Hmmm, seems like a bug reading a file/directory actually
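The traceback above looks like the classic Windows default-encoding trap rather than anything specific to your files; a minimal sketch of the likely fix (not the actual llama-index patch) is to force UTF-8 when reading:
Plain Text
def read_lines(file_path: str) -> list[str]:
    # open() with no encoding uses the locale codec (cp1252 here), which
    # can't decode byte 0x8d; forcing UTF-8 and skipping bad bytes avoids that.
    with open(file_path, encoding="utf-8", errors="ignore") as f:
        return f.readlines()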
I did each file manually, and it was 'successful'. I'm setting up a venv for VS Code right now
jk on Python 3.12, some of the packages y'all use have to be above 3.7 and below 3.12
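One way to double-check what a package actually declares (assuming it's already installed) is:
Plain Text
from importlib.metadata import metadata
# Prints the declared range, e.g. something like ">=3.8.1,<3.12"
print(metadata("llama-index-core")["Requires-Python"])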
yea was going to say, I've never even tried 3.12 lol
yah, i'm moving to 3.11
nice, everything is working for imports, now to fix everything that broke
hopefully not too much work! Feel free to ask any questions!
the blog says that service context is no more, yet I'm looking at the docs and still seeing it mentioned, and I'm still using it to pass the llm for my chat engine. Am I supposed to do it another way? Because when I pass the llm directly to the chat engine, it uses 3.5 and not 4-turbo like I wanted.
Service context is deprecated, but is supposed to still work 😅 What does your code look like?
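For reference, the usual 0.10 replacement for ServiceContext is the global Settings object, roughly along these lines:
Plain Text
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

# Anything that isn't handed an explicit llm/embed_model falls back to these defaults
Settings.llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")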
this one works:
Plain Text
client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data-nochunk")
llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = index.as_chat_engine(
                    chat_mode="condense_plus_context",
                    memory=memory,
                    similarity_top_k=5,
                    context_prompt=(
                        f"You are {client.user.name}, a Discord bot, format responses as such."
                        "\nTopic: OpenPilot and its various forks."
                        "\n\nRelevant documents for the context:\n"
                        "{context_str}"
                        "\n\nInstruction: Use the previous chat history or the context above to interact and assist the user."
                    )
                )
                chat_response = chat_engine.chat(content)
                if not chat_response or not chat_response.response:
                    await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
                    return
                response_text = chat_response.response
                response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
                await send_long_message(message, response_text)
        except Exception as e:
            await message.channel.send(f"An error occurred: {str(e)}")
This one doesn't:
Plain Text
client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data-nochunk")
llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = index.as_chat_engine(
                    chat_mode="condense_plus_context",
                    llm=llm,
                    memory=memory,
                    similarity_top_k=5,
                    context_prompt=(
                        f"You are {client.user.name}, a Discord bot, format responses as such."
                        "\nTopic: OpenPilot and its various forks."
                        "\n\nRelevant documents for the context:\n"
                        "{context_str}"
                        "\n\nInstruction: Use the previous chat history or the context above to interact and assist the user."
                    )
                )
                chat_response = chat_engine.chat(content)
                if not chat_response or not chat_response.response:
                    await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
                    return
                response_text = chat_response.response
                response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
                await send_long_message(message, response_text)
        except Exception as e:
            await message.channel.send(f"An error occurred: {str(e)}")
aha ok, it's a bug lol
will make a PR -- thanks for explaining!
Just an update: it only breaks when you do index.as_chat_engine. If you use something like CondensePlusContextChatEngine directly, it works as intended
Gotcha 🫡 Yea, as_chat_engine isn't passing in the llm as needed
uhh, I might have lied, CondensePlusContextChatEngine might not be working with llm=llm either...
Actually, I don't think the chat engine is working with any passed llm in any way right now...
I threw service_context back in to test, and it's not passing the llm either from the looks of it
Plain Text
client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data-nochunk")
llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model, llm=llm)
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model, llm=llm, service_context=service_context)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = CondensePlusContextChatEngine.from_defaults(
                    retriever=index.as_retriever(similarity_top_k=5, llm=llm),
                    llm=llm,
                    memory=memory,
                    context_prompt=(
                        f"You are {client.user.name}, a Discord bot, format responses as such."
                        "\nTopic: OpenPilot and its various forks."
                        "\n\nRelevant documents for the context:\n"
                        "{context_str}"
                        "\n\nInstruction: Use the previous chat history or the context above to interact and assist the user."
                    )
                )
                chat_response = chat_engine.chat(content)
                if not chat_response or not chat_response.response:
                    await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
                    return
                response_text = chat_response.response
                response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
                await send_long_message(message, response_text)
        except Exception as e:
            await message.channel.send(f"An error occurred: {str(e)}")


I added llm=llm everywhere I could, and none of them are passing to the chat engine for use (the 4097-token limit in the error is gpt-3.5-turbo's context window, so it's clearly still on the default model instead of gpt-4-turbo-preview):
Plain Text
An error occurred: Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens. However, your messages resulted in 4572 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
Hmmm let me check
ugh, we got a case of the "million merge conflicts" here I think
We had a branch for v0.10 and a branch for service context, and merged them.... seems like some changes got washed away :PSadge: Slightly scary to think about
Let me fix that chat engines first
I'm only on my test env, my 'production' is on a server and isn't being touched for a while
good news is, this is only for CondensePlusContext

Basically, it had:
Plain Text
llm = llm_from_settings_or_context(Settings, service_context)

What it needed was:
Plain Text
llm = llm or llm_from_settings_or_context(Settings, service_context)
It's less bad than I thought, I was looking at an old checkout when I got scared lol
actual v0.10.0 is good (besides this)