Same thing with StorageContext, I haven't gotten a chance to try, but I have that importing like this:
from llama_index.core.storage.storage_context import StorageContext
Are you running in a notebook or just py scripts?
in fact the automatic upgrade fails too:
C:\Users\thecr>llamaindex-cli upgrade Z:\Documents\GitHub\FrogBot\modules\utils
Traceback (most recent call last):
File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\Scripts\llamaindex-cli.exe\__main__.py", line 4, in <module>
File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\core\command_line\command_line.py", line 4, in <module>
from llama_index.core.command_line.rag import RagCLI, default_ragcli_persist_dir
File "C:\Users\thecr\AppData\Local\Programs\Python\Python310\lib\site-packages\llama_index\core\command_line\rag.py", line 9, in <module>
from llama_index.core import (
ImportError: cannot import name 'Response' from 'llama_index.core' (unknown location)
I see you aren't using a venv
python -m venv venv
source venv/bin/activate
pip install -U llama-index
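(heads up: since your paths are Windows, the activation step will be venv\Scripts\activate rather than source venv/bin/activate)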
I've confirmed this works both locally (in a fresh venv) and in google colab
10-4 I'll give it a shot when I can
I actually took this chance to upgrade from Python 3.10 to 3.12, which fixed the import issues. Got a new error with the auto-upgrade for llama-index:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Scripts\llamaindex-cli.exe\__main__.py", line 7, in <module>
File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\command_line.py", line 269, in main
args.func(args)
File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\command_line.py", line 227, in <lambda>
upgrade_parser.set_defaults(func=lambda args: upgrade_dir(args.directory))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\upgrade.py", line 283, in upgrade_dir
upgrade_file(str(file_ref))
File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\upgrade.py", line 267, in upgrade_file
upgrade_py_md_file(file_path)
File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\site-packages\llama_index\core\command_line\upgrade.py", line 249, in upgrade_py_md_file
lines = f.readlines()
^^^^^^^^^^^^^
File "C:\Users\thecr\AppData\Local\Programs\Python\Python312\Lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1798: character maps to <undefined>
yes, I'm still not using a venv, I know...
Hmmm, seems like a bug reading a file/directory actually
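From the traceback it's reading the file with the default Windows cp1252 codec; the kind of fix I'd expect (just a sketch, not the actual upgrade.py code) is opening it as UTF-8 explicitly:
# hypothetical sketch: force UTF-8 instead of the platform default (cp1252 on Windows)
with open(file_path, "r", encoding="utf-8") as f:
    lines = f.readlines()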
I did each file manually, and it was 'successful'. I'm setting up a venv for vs code right now
jk to Python 3.12, some of the packages y'all use have to be above 3.7 and below 3.12
yea was going to say, I've never even tried 3.12 lol
nice, everything is working for imports, now to fix everything that broke
hopefully not too much work! Feel free to ask any questions!
the blog says that service context is no more, yet I'm looking at the docs and seeing it mentioned, and I'm still using it to pass the llm for my chat engine. Am I supposed to do it another way? Cause when I pass the llm directly to the chat engine, it uses 3.5 and not 4-turbo like I wanted.
Service context is deprecated, but is supposed to still work
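For reference, the v0.10 replacement is the global Settings object; roughly something like this (just a sketch with your models dropped in):
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# set once at startup; anything that doesn't get an explicit llm/embed_model falls back to these
Settings.llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")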
What does your code look like?
this one works:
client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data-nochunk")
llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_vector_store(vector_store, service_context=service_context)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = index.as_chat_engine(
                    chat_mode="condense_plus_context",
                    memory=memory,
                    similarity_top_k=5,
                    context_prompt=(
                        f"You are {client.user.name}, a Discord bot, format responses as such."
                        "\nTopic: OpenPilot and its various forks."
                        "\n\nRelevant documents for the context:\n"
                        "{context_str}"
                        "\n\nInstruction: Use the previous chat history or the context above to interact and assist the user."
                    )
                )
                chat_response = chat_engine.chat(content)
                if not chat_response or not chat_response.response:
                    await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
                    return
                response_text = chat_response.response
                response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
                await send_long_message(message, response_text)
        except Exception as e:
            await message.channel.send(f"An error occurred: {str(e)}")
This one doesn't:
client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data-nochunk")
llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = index.as_chat_engine(
                    chat_mode="condense_plus_context",
                    llm=llm,
                    memory=memory,
                    similarity_top_k=5,
                    context_prompt=(
                        f"You are {client.user.name}, a Discord bot, format responses as such."
                        "\nTopic: OpenPilot and its various forks."
                        "\n\nRelevant documents for the context:\n"
                        "{context_str}"
                        "\n\nInstruction: Use the previous chat history or the context above to interact and assist the user."
                    )
                )
                chat_response = chat_engine.chat(content)
                if not chat_response or not chat_response.response:
                    await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
                    return
                response_text = chat_response.response
                response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
                await send_long_message(message, response_text)
        except Exception as e:
            await message.channel.send(f"An error occurred: {str(e)}")
will make a PR -- thanks for explaining!
Just an update, it only breaks when you do index.as_chat_engine; if you do something like CondensePlusContextChatEngine, it works as intended
Gotcha 🫡 Yea, as_chat_engine isn't passing in the llm as needed
uhh, I might have lied, CondensePlusContextChatEngine might not be working with llm=llm either...
Actually, I don't think the chat engine is working with any passed llm in any way right now...
I threw service_context back in to test, and it's not passing the llm either from the looks of it
client = QdrantClient(os.getenv('QDRANT_URL'), api_key=os.getenv('QDRANT_API'))
vector_store = QdrantVectorStore(client=client, collection_name="openpilot-data-nochunk")
llm = OpenAI(model="gpt-4-turbo-preview", max_tokens=1000)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model, llm=llm)
index = VectorStoreIndex.from_vector_store(vector_store, embed_model=embed_model, llm=llm, service_context=service_context)

async def process_message_with_llm(message, client):
    content = message.content.replace(client.user.mention, '').strip()
    if content:
        try:
            async with message.channel.typing():
                memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
                context = await fetch_context_and_content(message, client, content)
                memory.set(context + [HistoryChatMessage(f"{content}", Role.USER)])
                chat_engine = CondensePlusContextChatEngine.from_defaults(
                    retriever=index.as_retriever(similarity_top_k=5, llm=llm),
                    llm=llm,
                    memory=memory,
                    context_prompt=(
                        f"You are {client.user.name}, a Discord bot, format responses as such."
                        "\nTopic: OpenPilot and its various forks."
                        "\n\nRelevant documents for the context:\n"
                        "{context_str}"
                        "\n\nInstruction: Use the previous chat history or the context above to interact and assist the user."
                    )
                )
                chat_response = chat_engine.chat(content)
                if not chat_response or not chat_response.response:
                    await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
                    return
                response_text = chat_response.response
                response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
                await send_long_message(message, response_text)
        except Exception as e:
            await message.channel.send(f"An error occurred: {str(e)}")
I added llm=llm everywhere I could, and none of them are passing to the chat engine for use:
An error occurred: Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens. However, your messages resulted in 4572 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
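(that 4097-token limit is the old gpt-3.5-turbo context window, so it's clearly not using gpt-4-turbo-preview)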
ugh, we got a case of the "million merge conflicts" here I think
We had a branch for v0.10 and a branch for service context, and merged them.... seems like some changes got washed away :PSadge: Slightly scary to think about
Let me fix the chat engines first
I'm only on my test env, my 'production' is on a server and isn't being touched for a while
good news is, this is only for CondensePlusContext
Basically, it had llm = llm_from_settings_or_context(Settings, service_context)
What it needed was llm = llm or llm_from_settings_or_context(Settings, service_context)
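Until that fix is released, setting the llm globally should sidestep it; just a sketch:
from llama_index.core import Settings

# stopgap: make gpt-4-turbo-preview the global default so nothing falls back to gpt-3.5
Settings.llm = llm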
It's less bad than I thought, I was looking at an old checkout when I got scared lol
actual v0.10.0 is good (besides this)