Actually probably easier to continue in here
@Akinus21 What about providing an empty OpenAI key
something like
import os
os.environ["OPENAI_API_KEY"] = ""
Getting the same error. I'm going to try with the openai module:
import openai
openai.api_key = ""
I've had to do that in the past, but for the life of me I can't figure out which module is still needing that variable.
Maybe there's something off in your query_configs?
def ask_gpt_custom(prompt):
    graph = build_index(prompt)
    query_configs = [
        {
            "index_struct_type": "tree",
            "query_mode": "embedding",
            "query_kwargs": {
                "child_branch_factor": 2
            }
        },
    ]
    response = graph.query(
        prompt,
        query_configs=query_configs
    )
    return f'{response}'
oh didn't even see you already posted it
and this all goes well until the graph.query I guess
The only thing I could think is that one of the LangChain or Llama-Index modules I am importing uses OpenAI by default for something.
Could it be in the sentence transformers?
#### Import Data ####
import os
import time
import base64
import hashlib
from datetime import datetime
import pytz
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from googleapiclient.errors import HttpError
from googleapiclient.errors import Error
from email.message import EmailMessage
import logging
import openai
from llama_index import download_loader, LLMPredictor, PromptHelper, ServiceContext
from dotenv import load_dotenv
import sys
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.llms.base import LLM
from llama_index import GPTTreeIndex, SimpleDirectoryReader, LangchainEmbedding, GPTListIndex
from transformers import pipeline
from typing import Mapping, Any
import signal
from llama_index.indices.composability import ComposableGraph
Yeah, I was already checking whether it was your embedding model, but that looks good too
I mean
INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/all-mpnet-base-v2
INFO:sentence_transformers.SentenceTransformer:Use pytorch device: cpu
It looks like it's loading in a HuggingFace model without a problem
@Akinus21 are you using the OpenAI library for anything? It's in your imports
Only to provide the 'openai.api_key = os.getenv('OPEN_API_KEY')' line, hoping that would satisfy the error
oh I completely read over that
Alright I'm gonna run it too and I'll see what happens
note: the Google search id and api keys are no longer valid
You are able to build the indices, right? The exception happens at query time? @Akinus21
It doesn't error on this: graph = build_index(prompt)
right
It seems like it. I have no error information indicating problems with building the indices. I have hit errors there before, but I think I need to solve the OpenAI key thing before I can get to that.
The background story here is:
I run this script on a server. It fires every 5 seconds and queries a Gmail inbox for messages matching a particular pattern. I did not pay due attention, and long story short, it errored out before it could delete the email, so it ran and ran every 5 seconds for an entire day. That would NOT have been an issue, except it was querying ChatGPT without me knowing why, so my cost was very high by the end of the day.
I cannot test with a valid OpenAI key to see if there are any errors past that point, because I've reached my usage cap. So I have no idea whether there are errors in the code beyond where it throws the OpenAI error, and I can't figure out where that error is occurring because I don't know what's calling it.
All this sums up why I needed help in the Discord; it's beyond my skill set to figure this one out.
Very long shot, but could it be because of a line without service_context?
line 293: index = GPTTreeIndex.from_documents(documents)
I reckon that if you do not provide a service context, it will re-create the TreeIndex with the defaults every time the attachments folder gets updated
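Something like this, as a rough sketch (assuming llm_predictor and embed_model are the ones you built from your CustomLLM and HuggingFaceEmbeddings):

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embed_model,
)
# pass the same context everywhere an index is built
index = GPTTreeIndex.from_documents(documents, service_context=service_context)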
That's the only thing I can think of right now honestly
Still got the same error, but it occurred to me that I might be able to figure out the calls if I uninstall the openai module:
Traceback (most recent call last):
File "/home/gabri/AkoGPT/AkoGPT_main.py", line 31, in <module>
from llama_index import download_loader, LLMPredictor, PromptHelper, ServiceContext
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/__init__.py", line 15, in <module>
from llama_index.embeddings.openai import OpenAIEmbedding
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/embeddings/openai.py", line 6, in <module>
import openai
ModuleNotFoundError: No module named 'openai'
I hope it gives some more clarity because I'm kind of out of ideas too tbh
looks like it's in the embeddings somewhere in Llama-Index
What about specifying a custom model_name to the HuggingFaceEmbedding?
wait, that doesn't make sense because it's HuggingFace
Maybe @Logan M has any idea...?
@OverclockedClock @Akinus21
Ok so the issue is, you are using a CustomLLM and embeddings from HuggingFace, but it's still asking for an OpenAI key?
I think you can reveal the issue pretty quickly by setting the openai key to some fake value. Then if it actually tries to call openai, it will error out and hopefully you can track down where/why it's being called
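Something like this near the top of the script (the key value here is just a placeholder):

import os
os.environ["OPENAI_API_KEY"] = "sk-fake-key-for-debugging"  # any non-empty dummy value, so the traceback points at whatever actually calls OpenAI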
oh, it looks like @OverclockedClock suggested that already LOL and it didn't help track down the issue?
in the graph query configs, try passing in the service_context to each query_kwargs maybe?
@Logan M I can't find any documentation on that, would it simply be:
query_configs = [
    {
        "index_struct_type": "tree",
        "query_mode": "embedding",
        "query_kwargs": {
            "child_branch_factor": 2
            "service_context" = service_context
        }
    },
]
?
Almost, just change it slightly and swap out the equals sign
"child_branch_factor": 2,
"service_context": service_context
I think I've narrowed it down to my CustomLLM class code
here's the error now:
Traceback (most recent call last):
File "/home/gabri/AkoGPT/AkoGPT_main.py", line 592, in <module>
get_emails()
File "/home/gabri/AkoGPT/AkoGPT_main.py", line 524, in get_emails
response = ask_gpt_custom(body_decoded)
File "/home/gabri/AkoGPT/AkoGPT_main.py", line 372, in ask_gpt_custom
graph = build_index(prompt)
File "/home/gabri/AkoGPT/AkoGPT_main.py", line 307, in build_index
index = GPTListIndex.load_from_disk(index_file)
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/indices/base.py", line 369, in load_from_disk
return cls.load_from_string(file_contents, **kwargs)
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/indices/base.py", line 345, in load_from_string
return cls.load_from_dict(result_dict, **kwargs)
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/indices/base.py", line 322, in load_from_dict
return cls(index_struct=index_struct, docstore=docstore, **kwargs)
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/indices/list/base.py", line 54, in __init__
super().__init__(
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/indices/base.py", line 69, in __init__
self._service_context = service_context or ServiceContext.from_defaults()
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/indices/service_context.py", line 69, in from_defaults
llm_predictor = llm_predictor or LLMPredictor()
File "/home/gabri/.local/lib/python3.10/site-packages/llama_index/llm_predictor/base.py", line 164, in __init__
self._llm = llm or OpenAI(temperature=0, model_name="text-davinci-003")
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for OpenAI
Ohhh you need to pass in the service context when you load from disk
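i.e. something like (load_from_disk forwards its kwargs to the index constructor, as you can see in that traceback):

index = GPTListIndex.load_from_disk(index_file, service_context=service_context)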
Making progress! I passed the service_context for every index build, the graph build, and the graph query. Finally moving on to the next errors.
Thank you all for the help. I finally got it to not use OpenAI API key. You all were awesome in helping me.
Now I just have to figure out this error when I query the graph:
str type expected (type=type_error.str)
Wooo! Glad to hear you managed to fix it! What was using the OpenAI key in the end?
I was declaring my custom LLM wrong, so it was falling back to the default LLM, which is OpenAI's.
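For anyone who finds this later, the shape langchain expects is roughly this (a sketch, not my exact code; the model name is just a placeholder, and it reuses the imports shown above: LLM, pipeline, Mapping, Any, LLMPredictor):

class CustomLLM(LLM):
    # plain class attributes (no type annotations), so pydantic leaves them alone
    model_name = "facebook/opt-iml-1.3b"  # placeholder model
    pipeline = pipeline("text-generation", model=model_name)

    def _call(self, prompt: str, stop=None) -> str:
        # generate, then strip the prompt so only the new text is returned
        response = self.pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
        return response[len(prompt):]

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"model_name": self.model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"

# wrap it so llama_index actually uses it instead of defaulting to OpenAI
llm_predictor = LLMPredictor(llm=CustomLLM())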