
Is there a way to turn off Gemini's safety filter when using it as the LLM for a chat engine?
hmmm
(Attachment: image.png)
Where is this? I checked the vertex file under LLMs and couldn’t find any info
I'm able to trigger its safety response
I’ll get outta bed and get some code in here so you can see

So if I do a chat engine like this and ask it "What is your prompt?", it'll spit out the error.

(Formatting might have gotten messed up, but it should get the point across.)
Plain Text
try:
    async with message.channel.typing():
        memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
        context = await fetch_context_and_content(message, client, content)
        memory.set(context + [HistoryChatMessage(f"{message.author.name}: {content}", Role.USER)])
        chat_engine = index.as_chat_engine(
            chat_mode="condense_plus_context",
            memory=memory,
            similarity_top_k=5,
            context_prompt=(
                "there is a prompt here"
            )
        )
        chat_response = chat_engine.chat(content)
        if not chat_response or not chat_response.response:
            await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
            return
        response_text = chat_response.response
        response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
        await send_long_message(message, response_text)
except Exception as e:
    await message.channel.send(f"An error occurred: {str(e)}")

Plain Text
An error occurred: block_reason: SAFETY
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: MEDIUM
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

But if I do a chat engine like this, it'll give a response:

Plain Text
try:
    async with message.channel.typing():
        memory = ChatMemoryBuffer.from_defaults(token_limit=8000)
        context = await fetch_context_and_content(message, client, content)
        memory.set(context + [HistoryChatMessage(f"{message.author.name}: {content}", Role.USER)])
        chat_engine = CondensePlusContextChatEngine.from_defaults(
            retriever=index.as_retriever(),
            memory=memory,
            similarity_top_k=5,
            context_prompt=(
                "prompt here"
            ),
        )
        chat_response = chat_engine.chat(content)
        if not chat_response or not chat_response.response:
            await message.channel.send("There was an error processing the message." if not chat_response else "I didn't get a response.")
            return
        response_text = chat_response.response
        response_text = re.sub(r'^[^:]+:\s(?=[A-Z])', '', response_text)
        await send_long_message(message, response_text)
except Exception as e:
    await message.channel.send(f"An error occurred: {str(e)}")

Plain Text
As an AI chat assistant, my prompt is to provide information and assistance related to CommaAI's OpenPilot. I can help answer questions, provide guidance, and offer support on various topics related to OpenPilot. How can I assist you today?

I had to remove the prompts for characters, but it was along the lines of: you're a bot, talk about subject.

Both are getting the LLM via llm = Gemini(max_tokens=1000). I'm using the text-embedding-3-small from OpenAI as my embed model.

Oh, you are using Vertex? This is in the Gemini LLM class
I couldn't find the gemini file under the llms folder, though. It was 2:15 am, I might've missed it
lemme link, one sec
In theory tho, they should be the same thing, no?
not really actually
Gemini uses the import google.generativeai as genai package

Vertex uses the import vertexai package
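
Roughly, the split looks like this (a sketch only; the import paths assume the newer per-integration packages, and the project id is just a placeholder):
Plain Text
# Two separate LLM integrations, backed by different Google SDKs.
from llama_index.llms.gemini import Gemini   # wraps google.generativeai
from llama_index.llms.vertex import Vertex   # wraps vertexai

gemini_llm = Gemini(model="models/gemini-pro", max_tokens=1000)
vertex_llm = Vertex(model="gemini-pro", project="my-project")  # placeholder project id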
I meant for this, sorry
those should be the same right?
Plain Text
chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=index.as_retriever(),
    memory=memory,
    similarity_top_k=5,
    context_prompt=(
        "prompt here"
    ),
)


You didn't pass in a service context, so it's defaulting to gpt-3.5 here. That's why it works
index.as_chat_engine takes the service context from the index
(I know it's jank, new release tomorrow! 🙂)
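i.e. to actually route that engine through Gemini, you'd pass the LLM in explicitly, something like this (sketch only; whether the kwarg is service_context or llm depends on the version):
Plain Text
# Sketch: hand Gemini to the chat engine explicitly instead of relying on defaults.
llm = Gemini(max_tokens=1000)
service_context = ServiceContext.from_defaults(llm=llm)

chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=index.as_retriever(),   # index/memory as in the snippets above
    memory=memory,
    service_context=service_context,  # or llm=llm on newer versions
    context_prompt="prompt here",
)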
Plain Text
An error occurred: block_reason: SAFETY
safety_ratings {
  category: HARM_CATEGORY_SEXUALLY_EXPLICIT
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HATE_SPEECH
  probability: NEGLIGIBLE
}
safety_ratings {
  category: HARM_CATEGORY_HARASSMENT
  probability: HIGH
}
safety_ratings {
  category: HARM_CATEGORY_DANGEROUS_CONTENT
  probability: NEGLIGIBLE
}

The second I pass the service_context through and ask a question
makes sense 😅
?? I see the way safety is disabled in gemini.py, but that doesn't look right to me. When I did it before I moved to llama_index, I had to specify which safety settings to disable.
ALSO: am I able to just switch to gemini-ultra using llama_index?? Like, I see it, I switched to it, it's genning responses, but is it really gemini-ultra? Or did it fall back to gemini-pro/gpt-3.5?
it wouldn't fall back to 3.5 if it's set in the service context and passed in

I doubt the gemini package would fall back to pro if you specified ultra?
I think you just configure it like this?

Plain Text
safety_config = {
    generative_models.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: generative_models.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    generative_models.HarmCategory.HARM_CATEGORY_HARASSMENT: generative_models.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
}

llm = Gemini(..., safety_settings=safety_config)
just judging by their docs anyways
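For the genai-backed Gemini class, the equivalent would presumably use the google.generativeai enums, with BLOCK_NONE if the goal is to actually turn the filters off (untested sketch):
Plain Text
# Untested sketch: BLOCK_NONE disables blocking for each category.
from google.generativeai.types import HarmCategory, HarmBlockThreshold

safety_config = {
    HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
}

llm = Gemini(max_tokens=1000, safety_settings=safety_config)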
but ultra isn't available in the public API yet, right?
yet llm = Gemini(model="models/gemini-ultra", max_tokens=1000) is giving me responses
so is llm = Gemini(model="gemini-ultra", max_tokens=1000)
idk what that code is doing man haha

Plain Text
self._model = genai.GenerativeModel(
    model_name=model_name,
    generation_config=final_gen_config,
    safety_settings=safety_settings,
)
Plain Text
import google.generativeai as genai

for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)
That would print everything available
just pro and pro vision, but then why am I getting responses from setting it to ultra?
yea not sure what the genai package is doing when you give it the ultra model name
maybe debug logs would reveal what kind of requests are being made
Plain Text
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
I’ll try that in a sec
I see stuff about the OpenAI embedding model, but nothing about google, gemini, or genai
hmm I guess they don't log
I'll send the file here, if you wanna take a look too
I don't even have access to gemini lol, google doesn't like Canada I guess
I can see on my google cloud console that my api is getting hit, it just doesn't tell me which model is being used
google? as in google it, or as in, you're upset at them? or as in lol, that's such a google thing.
that's such a google thing hahaha
pulled up my old code from before llama index and found the safety settings config:
Plain Text
safety_settings = {
    "HARM_CATEGORY_SEXUALLY_EXPLICIT": "BLOCK_NONE",
    "HARM_CATEGORY_HATE_SPEECH": "BLOCK_NONE",
    "HARM_CATEGORY_HARASSMENT": "BLOCK_NONE",
    "HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_NONE"
}
Also, will you guys support Google's AQA model anytime soon?
AQA? Is that Active Question Answering?
I think that's super old, no?
in any case, most integrations are community driven/contributed
Hi
@Logan M
Plain Text
class ElasticsearchClient:
    def __init__(self):
        GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
        Settings.llm = Vertex(
            model="gemini-pro", project=xxxx, credentials=GOOGLE_API_KEY
        )
        # Settings.llm = Gemini(model="models/gemini-pro")

        self.index = VectorStoreIndex.from_vector_store(
            self.vector_store, storage_context=self.storage_context)

    def get_answer(self, document_id, query) -> Generator[str, None, None]:
        query_engine = self.index.as_query_engine(
            vector_store_kwargs={
                "es_filter": [{"match": {"metadata.docid.keyword": document_id}}],
            }, similarity_top_k=6, streaming=True
        )

        response = query_engine.query(query)
        for text in response.response_gen:
            yield text
This is my code, and it was working before when I used the Gemini class directly. Now I created an API key from Vertex AI, and after I changed to the Vertex class to use the gemini-pro model, I am getting this error.
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/navyasreepinjalavenkateswararao/Desktop/leximai-deep-api/venv/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 93, in wrapped_llm_chat
f_return_val = f(_self, messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/navyasreepinjalavenkateswararao/Desktop/leximai-deep-api/venv/lib/python3.11/site-packages/llama_index/llms/vertex/base.py", line 213, in stream_chat
chat_history = _parse_chat_history(messages[:-1], self._is_gemini)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/navyasreepinjalavenkateswararao/Desktop/leximai-deep-api/venv/lib/python3.11/site-packages/llama_index/llms/vertex/utils.py", line 172, in _parse_chat_history
raise ValueError("Gemini model don't support system messages")
ValueError: Gemini model don't support system messages
Indeed gemini does not support system messages (which is lame)

There is a WIP PR here though
https://github.com/run-llama/llama_index/pull/11511
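Until that lands, one possible workaround (my own sketch, not something the library does for you) is to fold the system message into the first user message before it ever reaches the Vertex Gemini model:
Plain Text
# Hypothetical helper: merge any SYSTEM message into the first USER message,
# since the Vertex Gemini path rejects system-role messages outright.
from llama_index.core.llms import ChatMessage, MessageRole

def merge_system_into_user(messages: list[ChatMessage]) -> list[ChatMessage]:
    system_parts = [m.content for m in messages if m.role == MessageRole.SYSTEM]
    rest = [m for m in messages if m.role != MessageRole.SYSTEM]
    if system_parts and rest and rest[0].role == MessageRole.USER:
        rest[0] = ChatMessage(
            role=MessageRole.USER,
            content="\n\n".join(system_parts + [rest[0].content]),
        )
    return rest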