
Updated 2 months ago

Extra text

Uhm... v0.7.4 seems to be including extra text before the actual response from the LLM 🤔 -- does this have anything to do with your num_output changes, Logan? 🤔 (Or is this just... OpenAI leaking other people's data? XD)
84 comments
Can you give an example? Are you using azure?
Never noticed something like this before 😅
Stuff is wonky
    response = query_engine.query(prompt)
    print("User Input:\n\t", prompt)
    print("Bot Response:\n\t", response)
output:
User Input:
         what is 2+2?
Bot Response:
         , and that the question does not fit the scope of your expertise.
oh sorry, the repeat at the bottom is not included (fixed above)
this time, seems to have cut it short
Man you're having a rough day 😅

Any settings you've changed?
here's one with some extra up front:
User Input:
         Where is the Euphrates river?
Bot Response:
         , and direct the questioner to an appropriate source of information.

The Euphrates river is located in the Middle East, in the countries of Iraq, Syria, and Turkey. It is one of the two major rivers in the region, the other being the Tigris. The Euphrates is the longest river in the Middle East, and is a major source of water for the region. It is also an important part of the region's history and culture.
Also, FWIW: "that the question does not fit the scope of your expertise." and "and direct the questioner to an appropriate source of information." are not in any prompt I've given it.
What is going on 🥴

You didn't change any service context settings? Query settings?

Are your documents in ancient Latin perchance? 😅

Another thing is you could check
print([n.node.text for n in response.source_nodes])


to see if the source nodes at least have sane text
I mean, I've got debug mode on, everything looks perfectly sane till:
[Attachment: image.png]
llama_index.llm_predictor.base and then it has the nonsense @Logan M
I... really can't explain this 💀

Maybe davinci is wacked out? Can you switch to gpt-3.5?

from llama_index import ServiceContext, set_global_service_context
from llama_index.llms import OpenAI

service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo", temperature=0.0))

set_global_service_context(service_context)
I do have it set to gpt-3.5-turbo already
How did you do it? Your logs show davinci being called no?
    qa_template = Prompt(template)

    # Access the environment variable
    api_key = os.getenv('OPENAI_API_KEY')
    openai.api_key = api_key

    mongo_uri = os.getenv('MONGO_URI')
    mongodb_client = pymongo.MongoClient(mongo_uri)
    db_name = os.getenv('MONGO_DB_NAME')
    collection_name = os.getenv('MONGO_COLLECTION_NAME')
    index_name = os.getenv('MONGO_INDEX_NAME')

    threshold_cutoff=float(os.getenv('THRESHOLD_CUTOFF') or 0.4)
    percentile_cutoff=float(os.getenv('PERCENTILE_CUTOFF') or 0.6)

    similarity_top_k=int(os.getenv('SIMILARITY_TOP_K') or 3)
    temperature = float(os.getenv('TEMPERATURE') or 0.2)
    model_name=os.getenv('MODEL_NAME') or "gpt-3.5-turbo"
    num_output=int(os.getenv('NUM_OUTPUT') or 700)
    
    store = MongoDBAtlasVectorSearch(mongodb_client, db_name=db_name,collection_name=collection_name, index_name=index_name)
    index = VectorStoreIndex.from_vector_store(vector_store=store)
    service_context = ServiceContext.from_defaults(llm=OpenAI(temperature=temperature, model_name=model_name), num_output=num_output)
    query_engine = index.as_query_engine(
        optimizer=SentenceEmbeddingOptimizer(threshold_cutoff=threshold_cutoff,percentile_cutoff=percentile_cutoff),
        retriever_mode="embedding",
        service_context = service_context,
        similarity_top_k=similarity_top_k,
        # response_mode="compact",
        # verbose=True,
        text_qa_template=qa_template
    )
"Do you have any settings customized"

"Nah none"

Bro 😅😆
Lemme take a peek
"Do you have any settings customized"
this is not what you asked me
Are you importing OpenAI from langchain?
you asked if I "changed" any settings, presumably since before when it was working, to which I responded "No"
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
import pymongo
import os
import openai
from llama_index import ServiceContext
from llama_index.llms.openai import OpenAI
from llama_index.indices.postprocessor.optimizer import SentenceEmbeddingOptimizer
from llama_index import Prompt
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch
from llama_index.indices.vector_store.base import VectorStoreIndex
although, I do notice there is import openai in there as well as from llama_index.llms.openai import OpenAI... is that potentially problematic?
That should be fine
Can you try removing the optimizer?
Actually nvm, it's not even being used, it's passed in wrong 🤔
also, commenting it out had no change 🤷‍♂️
Yea, it should be passed in like

node_postprocessors=[...] instead of using optimizer kwarg, but no worries
Was a change in 0.7.X
k well, postprocessors only recently added xD
Lol yea no worries, gonna scan the code a bit more
so like this?
node_postprocessors=[SentenceEmbeddingOptimizer(threshold_cutoff=threshold_cutoff,percentile_cutoff=percentile_cutoff)]
?
90% sure yea
You have model_name in OpenAI, should be just model
Like my example a second ago
well, trying that made a different error xD
Ah, that fixed that problem
The other problem now is with the optimizer:
 response = query_engine.query(prompt)
  File "/root/pytest/venv/lib/python3.10/site-packages/llama_index/indices/query/base.py", line 23, in query
    response = self._query(str_or_query_bundle)
  File "/root/pytest/venv/lib/python3.10/site-packages/llama_index/query_engine/retriever_query_engine.py", line 147, in _query
    nodes = self.retrieve(query_bundle)
  File "/root/pytest/venv/lib/python3.10/site-packages/llama_index/query_engine/retriever_query_engine.py", line 110, in retrieve
    nodes = node_postprocessor.postprocess_nodes(
  File "/root/pytest/venv/lib/python3.10/site-packages/llama_index/indices/postprocessor/optimizer.py", line 117, in postprocess_nodes
    nodes[i].node.set_content(" ".join(top_sentences))
IndexError: list index out of range
Hmm that seems like an actual bug maybe lol

Although looking at the code I have no idea how the index error would happen 💀
seems like more stuff is getting eliminated than the code is ready for XD
tbh I wasn't able to reproduce the error you had there 🤔
But I can't remember which version you had, I was running 0.7.4
maaaaybe your optimizer settings removed all the sentences?
actually that looks like it's handled in the code, nvm
like, it just does

for i in range(len(nodes)):
   ...
   nodes[i].set_content(...)
no idea how that could create an index error
@Logan M is i on line 114 possibly modifying i from line 70?
I think that is likely since that seems to be the only way for this error to occur
Seems like it could occur when a node has more sentences than you have number of nodes.
Okay yeah, just did this:
for i in range(5):
    if True:
        for i in range(10):
            pass
    print(i)
and it outputs all 9s ;p
Will fix asap
ty ❤️
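For anyone reading later, here is a minimal sketch of that failure mode. It is not the actual llama_index optimizer code: the function names and the toy sentence split are made up for illustration. It only shows how reusing i in a nested loop can push the outer index past the end of the list, and how distinct loop names avoid that.

```python
def buggy(nodes):
    """Hypothetical stand-in for the bug: the inner loop reuses `i`."""
    out = list(nodes)
    for i in range(len(out)):
        sentences = out[i].split(". ")
        for i in range(len(sentences)):  # overwrites the outer `i`
            pass
        # `i` is now len(sentences) - 1, not the node index
        out[i] = " ".join(sentences)
    return out


def fixed(nodes):
    """Same logic, but each loop has its own variable name."""
    out = list(nodes)
    for node_idx in range(len(out)):
        sentences = out[node_idx].split(". ")
        for sent_idx in range(len(sentences)):
            pass
        out[node_idx] = " ".join(sentences)
    return out


if __name__ == "__main__":
    one_node = ["a. b. c"]  # one node, three sentences
    try:
        buggy(one_node)
    except IndexError as err:
        print("buggy:", err)
    print("fixed:", fixed(one_node))
```

This matches the guess above: the error only shows up when a node has more sentences than there are nodes.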
Hmmm so.. trying to figure out how to get the change without having to involve poetry :x
Not sure on your setup, but you can install from pip in one line

pip install git+https://github.com/jerryjliu/llama_index
Odd... I thought that was gonna work, but got same error... do I need to uninstall the original first? 🤔
Ah okay yup, that worked
ty so much! you dah best
glad you got it! :dotsCATJAM:
oh btw @Logan M -- my coworker just installed everything from scratch, and it was still missing nltk -- so I feel like there's a requirement missing somewhere
lemme try and do something from scratch..
ok discovered
it's just an optional dependency
specific to a few features
like the sentence embedding optimizer
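If you want to catch a missing optional dependency before a query fails, one option is a quick check at startup. This is just a sketch using the standard library; has_module is a made-up helper name, not something llama_index provides.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if not has_module("nltk"):
    # Assumption: nltk is only needed for some features, e.g. the
    # sentence embedding optimizer mentioned in this thread.
    print("nltk is not installed; try `pip install nltk`.")
```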