
Hello everyone, I started getting acquainted with llama-index and I have a small question for the experts. I wrote a service for processing HTML documents and I want to index these documents, which is not a problem. But when I do the indexing I get three JSON files, and after that, when I try to send a prompt to one of the indexes, I wait more than 10 minutes for a response. How can I speed up getting the response results?
Here are some code snippets:
This method is responsible for answering queries against the corresponding document:
Plain Text
    def load_index(self, index_name, prompt):
        try:
            index_path = f"{self.directory_path}/indexes/index_{index_name}"
            documents = SimpleDirectoryReader(index_path).load_data()
            index = GPTVectorStoreIndex.from_documents(documents)
            query_engine = index.as_query_engine()
            return query_engine.query(prompt)
        except Exception as e:
            return HTTPException(status_code=404, detail='Not Found')
9 comments
You are loading the documents every time you make a query.

You should create the index only once at the beginning and keep only this part when querying:

Plain Text
        try:
            query_engine = index.as_query_engine()
            return query_engine.query(prompt)
        except Exception as e:
            return HTTPException(status_code=404, detail='Not Found')


Or, if you want to keep the index-creation part here as well, then do this:

Plain Text
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage

try:
    # Rebuild the storage context
    storage_context = StorageContext.from_defaults(persist_dir="./storage")
    # Load the index
    index = load_index_from_storage(storage_context)
except:
    # Storage not found; create a new one
    documents = SimpleDirectoryReader("./data").load_data()
    index = GPTVectorStoreIndex.from_documents(documents)
    index.storage_context.persist()

# Now add the query part here
try:
    query_engine = index.as_query_engine()
    return query_engine.query(prompt)
except Exception as e:
    return HTTPException(status_code=404, detail='Not Found')


This way the indexes are not created every time you query, which will reduce your response time.
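For example, the original load_index method could be restructured like this (a sketch, assuming each index was already persisted to the same index_path folder beforehand with index.storage_context.persist(persist_dir=index_path)):
Plain Text
from llama_index import StorageContext, load_index_from_storage

def load_index(self, index_name, prompt):
    try:
        index_path = f"{self.directory_path}/indexes/index_{index_name}"
        # Load the already-persisted index instead of re-reading and re-embedding the documents
        storage_context = StorageContext.from_defaults(persist_dir=index_path)
        index = load_index_from_storage(storage_context)
        query_engine = index.as_query_engine()
        return query_engine.query(prompt)
    except Exception as e:
        return HTTPException(status_code=404, detail='Not Found')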
@WhiteFang_Jr Thank you very much for the answer. If I understood everything correctly, when creating the index I get a folder with three JSON documents (example in the photo).
I didn't quite understand how I need to refer to a specific document; I have a lot of such indexes (example in photo 2).
Earlier, when I was using llama-index version <= 0.5.23,
I could refer to a particular index by its name, for example:
Plain Text
index_set = {}
for file in files:
    cur_index = GPTSimpleVectorIndex.load_from_disk(f'{directory_path}/indexes/index_{file}.json', service_context=service_context)
    index_set[file] = cur_index
response = index_set['036283'].query("How does the sum of $1,500,000,000 relate to Apple in 2020?", similarity_top_k=3)
print(response)
Attachments: image.png, image.png
Okay so, let's say you have a folder of documents and you want to create a chatbot on top of it.

You provide the document path, and Llama-Index under the hood creates a storage folder and saves three files under that folder:
Document stores: where ingested documents (i.e., Node objects) are stored,

Index stores: where index metadata are stored,

Vector stores: where embedding vectors are stored.
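After persisting, the storage folder typically looks something like this (the exact file names are an assumption here and may differ between llama-index versions):
Plain Text
storage/
├── docstore.json       # document store: the ingested Node objects
├── index_store.json    # index store: index metadata
└── vector_store.json   # vector store: the embedding vectors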


If you want to create separate embeddings for different data, you can persist them to different folders using:
Plain Text
# Data folder 1
documents = SimpleDirectoryReader("./data").load_data()
index = GPTVectorStoreIndex.from_documents(documents)
index.storage_context.persist()  # persists to the ./storage folder by default

# Data folder 2
documents = SimpleDirectoryReader("path to second folder").load_data()
index = GPTVectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir="provide_folder_name_of_your_choice")

Now, while loading:

Plain Text
# Rebuild the storage context; pass the persist_dir and it will load the index from that folder
storage_context = StorageContext.from_defaults(persist_dir="./storage")
# Load the index
index = load_index_from_storage(storage_context)
" I could refer to a particular index by its name, as an example:"

Yeah the storing format of indexes changed 😅
But you can refer to the indexes the same way here also, just provide the folder name when the embeddings are stored!!
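For example, a rough equivalent of your earlier index_set loop under the new storage API could look like this (a sketch, assuming each file's index was persisted to its own indexes/index_{file} folder):
Plain Text
from llama_index import StorageContext, load_index_from_storage

index_set = {}
for file in files:
    # Each index lives in its own persist folder, keyed by file name
    storage_context = StorageContext.from_defaults(persist_dir=f'{directory_path}/indexes/index_{file}')
    index_set[file] = load_index_from_storage(storage_context)

query_engine = index_set['036283'].as_query_engine(similarity_top_k=3)
response = query_engine.query("How does the sum of $1,500,000,000 relate to Apple in 2020?")
print(response)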
@WhiteFang_Jr Thank you very much, you are my hero!! Everything worked out for me. If it's not too difficult for you, can you tell me where I can set the maximum sequence length? And can I even pass my own model for processing? The warning I see is this:
Plain Text
Token indices sequence length is longer than the specified maximum sequence length for this model (3301 > 1024). Running this sequence through the model will result in indexing errors

I found this example, but this import no longer exists in the current version:
Plain Text
from llama_index.llms import OpenAI

llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
service_context_chatgpt = ServiceContext.from_defaults(llm=llm, chunk_size=1024)

How can I increase this limit?
"Token indices part is just a warning". You can ignore that
Also if you are planning to use gpt-3.5
I would suggest use chatOpenAI class
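For example, in llama-index versions of that era the model and chunk size were usually set through a ServiceContext, roughly like this (a sketch using LangChain's ChatOpenAI; the chunk_size_limit parameter name varied between versions):
Plain Text
from langchain.chat_models import ChatOpenAI
from llama_index import GPTVectorStoreIndex, LLMPredictor, ServiceContext, SimpleDirectoryReader

# Wrap the chat model in an LLMPredictor and set the chunk size
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, chunk_size_limit=1024)

# Pass the service context when building the index
documents = SimpleDirectoryReader("./data").load_data()
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)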
What are the primary advantages of using llama_index over langchain for ingesting documents into a vector store?