
Hello everyone, I started to get acquainted with llama-index and I have a small question for the experts. I wrote a service for processing HTML documents and I want to index these documents, but that's not the problem: when I run indexing I get three JSON files, and after that, when I send a prompt to one of the indexes, I wait more than 10 minutes for a response. How can I speed up getting the response results?
Here are some code snippets:
This method is responsible for answering a query against the corresponding document index:
Plain Text
    def load_index(self, index_name, prompt):
        try:
            index_path = f"{self.directory_path}/indexes/index_{index_name}"
            documents = SimpleDirectoryReader(index_path).load_data()
            index = GPTVectorStoreIndex.from_documents(documents)
            query_engine = index.as_query_engine()
            return query_engine.query(prompt)
        except Exception:
            # Raise (not return) so FastAPI actually sends the 404
            raise HTTPException(status_code=404, detail='Not Found')
9 comments
You are loading the documents every time you make a query.

You should create the index only once at the beginning and keep only this part when querying:

Plain Text
try:
    query_engine = index.as_query_engine()
    return query_engine.query(prompt)
except Exception:
    raise HTTPException(status_code=404, detail='Not Found')


Or, if you want to keep the index-creation part here as well, then do this:

Plain Text
from llama_index import (
    GPTVectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)

try:
    # Rebuild the storage context
    storage_context = StorageContext.from_defaults(persist_dir="./storage")
    # Load the index
    index = load_index_from_storage(storage_context)
except Exception:
    # Storage not found; create a new one and persist it
    documents = SimpleDirectoryReader("./data").load_data()
    index = GPTVectorStoreIndex.from_documents(documents)
    index.storage_context.persist()


# Now add the query part here (inside your method, after the index is loaded)
try:
    query_engine = index.as_query_engine()
    return query_engine.query(prompt)
except Exception:
    raise HTTPException(status_code=404, detail='Not Found')


This way the index is not rebuilt every time you query, which will cut your response time.
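For reference, here is a minimal sketch of how the whole service method could be restructured along these lines (the DocumentService class, the _indexes cache, and the get_index helper are illustrative names, not part of llama-index; it assumes each index was persisted to the index_{name} folders from your question):
Plain Text
from fastapi import HTTPException
from llama_index import StorageContext, load_index_from_storage

class DocumentService:
    def __init__(self, directory_path):
        self.directory_path = directory_path
        self._indexes = {}  # cache: index_name -> loaded index

    def get_index(self, index_name):
        # Load each persisted index from disk at most once, then reuse it
        if index_name not in self._indexes:
            storage_context = StorageContext.from_defaults(
                persist_dir=f"{self.directory_path}/indexes/index_{index_name}"
            )
            self._indexes[index_name] = load_index_from_storage(storage_context)
        return self._indexes[index_name]

    def load_index(self, index_name, prompt):
        try:
            query_engine = self.get_index(index_name).as_query_engine()
            return query_engine.query(prompt)
        except Exception:
            raise HTTPException(status_code=404, detail='Not Found')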
@WhiteFang_Jr Thank you very much for the answer. If I understood everything correctly, when creating the index I get a folder with three JSON documents (example in the photo).
I didn't quite understand how I need to refer to a specific document; I have a lot of such indexes (example in photo 2).
Earlier, when I was using llama-index version <= 0.5.23, I could refer to a particular index by its name, for example:
Plain Text
from llama_index import GPTSimpleVectorIndex

index_set = {}
for file in files:
    cur_index = GPTSimpleVectorIndex.load_from_disk(
        f'{directory_path}/indexes/index_{file}.json',
        service_context=service_context,
    )
    index_set[file] = cur_index

response = index_set['036283'].query("How does the sum of $1,500,000,000 relate to Apple in 2020?", similarity_top_k=3)
print(response)
Attachments: image.png, image.png
Okay, so let's say you have a folder of documents and you want to create a chatbot on top of it.

You provide the document path, and llama-index under the hood creates a storage folder and saves three files under that folder:
Document store: where ingested documents (i.e., Node objects) are stored
Index store: where index metadata is stored
Vector store: where embedding vectors are stored
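In the 0.6-era versions, that storage folder typically contains:
Plain Text
storage/
├── docstore.json       # document store: ingested Node objects
├── index_store.json    # index store: index metadata
└── vector_store.json   # vector store: embedding vectors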


If you want to create separate embeddings for different data, you can persist them in different folders using:
Plain Text
    # Data folder 1
    documents = SimpleDirectoryReader("./data").load_data()
    index = GPTVectorStoreIndex.from_documents(documents)
    index.storage_context.persist()  # persists to the "./storage" folder by default

    # Data folder 2
    documents = SimpleDirectoryReader("path to second folder").load_data()
    index = GPTVectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir="provide_folder_name_of_your_choice")

Now, when loading:

Plain Text
    # Rebuild the storage context; pass the persist_dir you used when storing
    # and it will load the index from that folder
    storage_context = StorageContext.from_defaults(persist_dir="./storage")
    # Load the index
    index = load_index_from_storage(storage_context)
" I could refer to a particular index by its name, as an example:"

Yeah, the storage format of indexes changed 😅
But you can refer to the indexes the same way here too: just provide the folder name where the embeddings were stored!
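For example, your old index_set pattern could be reproduced roughly like this (a sketch, assuming each index was persisted to its own index_{file} folder as above):
Plain Text
index_set = {}
for file in files:
    storage_context = StorageContext.from_defaults(
        persist_dir=f'{directory_path}/indexes/index_{file}'
    )
    index_set[file] = load_index_from_storage(storage_context)

query_engine = index_set['036283'].as_query_engine(similarity_top_k=3)
response = query_engine.query("How does the sum of $1,500,000,000 relate to Apple in 2020?")
print(response)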
@WhiteFang_Jr Thank you very much, you are my hero!! Everything worked out for me. If it's not too difficult, could you also tell me where I can set the maximum token sequence length? Or even pass my own model for processing? The warning I see is this:
Plain Text
Token indices sequence length is longer than the specified maximum sequence length for this model (3301 > 1024). Running this sequence through the model will result in indexing errors

I found this example, but the current import no longer exists:
Plain Text
from llama_index.llms import OpenAI

llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
service_context_chatgpt = ServiceContext.from_defaults(llm=llm, chunk_size=1024)

How can I increase the limit?
"Token indices part is just a warning". You can ignore that
Also, if you are planning to use gpt-3.5, I would suggest using the ChatOpenAI class.
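On the 0.6-era API you would pass the model in through an LLMPredictor and a ServiceContext; a minimal sketch (note: depending on your exact llama-index version, the chunk-size argument is chunk_size or chunk_size_limit):
Plain Text
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext, GPTVectorStoreIndex, SimpleDirectoryReader

# Wrap the chat model in an LLMPredictor and hand it to the ServiceContext
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, chunk_size=1024)

documents = SimpleDirectoryReader("./data").load_data()
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)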
What are the primary advantages of using llama_index over langchain for ingesting documents into a vector store?