Hey!
Can you share your error and the operation you were trying when you got the issue, please?
@WhiteFang_Jr I'm trying to create a RAG application for PDF question answering for my graduation project. I store the chunks and the document info in MongoDB, I'm using FAISS as a vector store, and I'm using Llama-2-7b. I use this function to get my response:

@app.route('/query', methods=['GET'])
def query():
    query_text = request.args.get('query')
    if not query_text:
        return jsonify({'error': 'Query parameter is missing'}), 400
    try:
        logging.info(f"Processing query: {query_text}")
        response = query_engine.query(query_text)
        logging.info(f"Query response: {response}")
        return jsonify({'response': response})
    except Exception as e:
        logging.error(f"Error processing query: {e}", exc_info=True)
        return jsonify({'error': str(e)}), 500

and I have this error
This is because Flask does not allow returning a Pydantic response object.
What data do you want to return?
You can create a dict containing all the required items in it.
For example:

{
    'response': response.response,
    'node_info': [...],  # add all the node info here
}

and then return this dict as the final response.
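A sketch of one way to fill that in, assuming a standard LlamaIndex Response object (its source_nodes attribute carries the retrieved nodes):

response = query_engine.query(query_text)
response_data = {
    'response': response.response,
    # each source node carries the retrieval score and the chunk metadata
    'node_info': [
        {'score': n.score, 'metadata': n.node.metadata}
        for n in response.source_nodes
    ],
}
return jsonify(response_data)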
@WhiteFang_Jr I have another question: why, when I run my code, do I wait a long time with this message?

transformers\models\llama\modeling_llama.py:670: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
I think this is a warning related to PyTorch.
I guess this is not stopping your code, right?
I'm trying this code to resolve my problem:
@app.route('/query', methods=['GET'])
def query():
    query_text = request.args.get('query')
    if not query_text:
        return jsonify({'error': 'Query parameter is missing'}), 400
    try:
        logging.info(f"Processing query: {query_text}")
        response = query_engine.query(query_text)
        logging.info(f"Query response: {response}")
        response_data = {
            'response': response.response,
        }
        return jsonify(response_data)
    except Exception as e:
        logging.error(f"Error processing query: {e}", exc_info=True)
        return jsonify({'error': str(e)}), 500
The code is running, but it takes too much time.
Are you running on GPU or CPU?
CPU, I think; I put device_map="auto".
@WhiteFang_Jr 1 hour has passed and the code hasn't finished executing yet.
CPU will take more time, as LLMs are not efficient on CPU and are not built for it.
If you want to test a local LLM, try it with Ollama. I think it's optimized, so it might be faster than the current one you have.
Can I learn more about how to replace my current LLM with Ollama, please?
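For reference, a minimal sketch of that swap, assuming the llama-index-llms-ollama integration is installed and an Ollama server is running locally with the llama2 model pulled:

from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

# point LlamaIndex at the local Ollama server instead of the
# in-process HuggingFace Llama-2 model
Settings.llm = Ollama(model="llama2", request_timeout=120.0)

# any query engine built from the index now uses Ollama for generation
query_engine = index.as_query_engine()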
Okay, thanks a lot for your time 😄
@WhiteFang_Jr it finally gave me the answer after 2 hours 😅 but the answer was in English, different from the document language.
Finally!!
One more thing: open-source LLMs may not be good at providing responses in languages other than English.
Also, do use an embedding model that works for your language if it is different from English, as in the sketch below.
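A minimal sketch, using sentence-transformers/paraphrase-multilingual-mpnet-base-v2 as one multilingual alternative (an assumption, not a requirement) to the all-mpnet-base-v2 model set earlier:

from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# a multilingual model, so non-English queries and chunks embed into
# a shared space; all-mpnet-base-v2 is trained on English text
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/paraphrase-multilingual-mpnet-base-v2"
)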
I will try Ollama first to see if I get a faster response.
@WhiteFang_Jr when I tried to verify the response, the info wasn't from the documents in MongoDB; it looks like it was just generated by the LLM. How can I identify the problem?
You'll have to check what your embedding model is returning as nodes for your query.
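One way to check (a sketch, assuming the index built earlier; retrieve returns the top-scoring nodes without calling the LLM at all):

# inspect what the retriever pulls back for the query,
# independently of the LLM's answer
retriever = index.as_retriever(similarity_top_k=3)
for node_with_score in retriever.retrieve(query_text):
    print(node_with_score.score, node_with_score.node.get_content()[:200])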
@WhiteFang_Jr can you verify this code logic please?
You are creating the embedding model on your own. Is there any specific reason not to use LlamaIndex directly to create the embedding model?
No, I thought this was the way to use the LlamaIndex embedding model. I have already declared this: Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-mpnet-base-v2")
Okay, I got confused with this piece of code:
def embed_text(text):
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=512)
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state.mean(1).detach().numpy()
    faiss.normalize_L2(embeddings)
    return embeddings
Do I remove it, @WhiteFang_Jr?
Yeah, you can. Also, can you tell me what you want to achieve?
Then maybe I'll be able to suggest some changes based on that. For instance, here in the following code:
try:
    with pdfplumber.open(file_path) as pdf:
        for page_number, page in enumerate(pdf.pages, start=1):
            text = page.extract_text()
            if text:
                embeddings = embed_text(text)
                if embeddings is not None:
                    embeddings = embeddings.reshape(1, -1)  # reshape for FAISS
                    vector_id = str(faiss_index.ntotal)
                    faiss_index.add(embeddings)
                    pdf_collection.insert_one({
                        'filename': filename,
                        'text': text,
                        'page_number': page_number,
                        'vector_id': vector_id
                    })
                else:
                    logging.error(f"No embeddings generated for page {page_number} of {filename}")
            else:
                logging.error(f"No text found on page {page_number} of {filename}")
return jsonify({'message': 'PDF uploaded and processed', 'filename': filename})
You can simply pass this text to the index, e.g. as sketched below.
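A minimal sketch of that, assuming llama_index.core imports: the page text goes into Document objects with the filename and page number as metadata, and LlamaIndex handles chunking and embedding itself:

from llama_index.core import Document, VectorStoreIndex

documents = []
with pdfplumber.open(file_path) as pdf:
    for page_number, page in enumerate(pdf.pages, start=1):
        text = page.extract_text()
        if text:
            # keep the provenance info as metadata on each document
            documents.append(Document(
                text=text,
                metadata={'filename': filename, 'page_number': page_number},
            ))

index = VectorStoreIndex.from_documents(documents)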
@WhiteFang_Jr I want to save the chunks and the document information (like the document name and the number of the page where the chunk is) and do the similarity search with FAISS, so that when I query, it answers from the documents.
The documents are in MongoDB.
@WhiteFang_Jr it still just gives an answer that doesn't exist in the documents. This is my query code:

@app.route('/query', methods=['GET'])
def query():
    query_text = request.args.get('query')
    if not query_text:
        return jsonify({'error': 'Query parameter is missing'}), 400
    try:
        logging.info(f"Processing query: {query_text}")
        response = query_engine.query(query_text)
        logging.info(f"Query response: {response}")
        response_data = {
            'response': response.response,
        }
        return jsonify(response_data)
    except Exception as e:
        logging.error(f"Error processing query: {e}", exc_info=True)
        return jsonify({'error': str(e)}), 500
@WhiteFang_Jr What do you think?
You'll have to verify whether it is able to find the correct data points for your query. Multiple items can be at fault here:
- Your embedding model may not be suited to your document language (I think it's not in English)
- The LLM is not able to answer correctly (could be because it is not familiar with your choice of document language, or not capable enough)
If this is your college project or something, I would recommend using Qdrant + FastAPI.
They are much better, and lots of examples exist for both; see the sketch below.
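A minimal sketch of wiring LlamaIndex to Qdrant, assuming the llama-index-vector-stores-qdrant integration and a local, file-backed Qdrant instance (no server needed):

import qdrant_client
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

# embedded, on-disk Qdrant; this replaces FAISS as the vector store
client = qdrant_client.QdrantClient(path="./qdrant_data")
vector_store = QdrantVectorStore(client=client, collection_name="pdf_chunks")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)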
@WhiteFang_Jr If I understood correctly, I should replace FAISS with Qdrant and keep MongoDB for storing chunks and document information.
Yes. Also, one more clarification if you could provide it:
- How did you store documents in Mongo?
- How are you accessing them?
The objective is to store the document information, the chunks, and their embeddings in MongoDB and do the similarity search with FAISS.

# MongoDB setup
mongo_conn_url = os.getenv("MONGO_CONN_URL", "mongodb://localhost:27017/")
client = MongoClient(mongo_conn_url)
db = client['pdf_query_db']
pdf_collection = db['pdfs']

# Setting up the document store and index store
docstore = MongoDocumentStore.from_uri(mongo_conn_url)
index_store = MongoIndexStore.from_uri(mongo_conn_url)
storage_context = StorageContext.from_defaults(docstore=docstore, index_store=index_store, vector_store=vector_store)

# Initialize MongoDB reader
reader = SimpleMongoReader(uri="mongodb://localhost:27017")
documents = reader.load_data(db.name, pdf_collection.name, field_names=["text"])
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, embed_model=Settings.embed_model)
print(pdf_collection.name)
index.storage_context.persist(persist_dir="./storage")

# Load or create indices
index.set_index_id("my_index")
index = load_index_from_storage(storage_context, index_id="my_index")

@WhiteFang_Jr this is the logic.
I don't think you need to persist; you can always read from Mongo and create the index.
Or, if you want to persist, add an if/else condition: if the local persist directory exists, there's no need to load anything from Mongo; fetch directly from the local persist, as in the sketch below.
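A minimal sketch of that if/else, reusing the reader and index id from your code above:

import os
from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage

PERSIST_DIR = "./storage"
if os.path.exists(PERSIST_DIR):
    # local persist exists: skip Mongo and re-embedding entirely
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context, index_id="my_index")
else:
    # first run: fetch from Mongo, embed, and persist for next time
    documents = reader.load_data(db.name, pdf_collection.name, field_names=["text"])
    index = VectorStoreIndex.from_documents(documents)
    index.set_index_id("my_index")
    index.storage_context.persist(persist_dir=PERSIST_DIR)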
I'm confused about the similarity search, because I was convinced that persisting is for saving the vectors and using them afterwards to do the similarity search. I want to know if my code is correct and how I can save the embeddings along with the PDF information in MongoDB @WhiteFang_Jr
Yep, persisting will save the embeddings of the documents locally. But as per your code, every time you run it, it will:
- Fetch documents with the Mongo reader
- Create embeddings
- Create the index
- Persist it to local storage
This is redoing the same work again and again!
If you want to store your embeddings in Mongo, use this:
https://docs.llamaindex.ai/en/stable/examples/vector_stores/MongoDBAtlasVectorSearch/?h=mongo
Once you store them, just create the vector store instance, pass it to VectorStoreIndex, and you are good to query your docs.
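Roughly, following that example (a sketch only: the database, collection, and index names here are placeholders, parameter names may differ across llama-index versions, and it requires an Atlas cluster with a vector search index configured):

from pymongo import MongoClient
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

client = MongoClient(mongo_conn_url)
vector_store = MongoDBAtlasVectorSearch(
    client,
    db_name="pdf_query_db",        # placeholder names, match your setup
    collection_name="pdf_vectors",
    index_name="vector_index",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)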
@WhiteFang_Jr yes, I understood you, but I don't want to use the cloud, and I heard that MongoDB Atlas uses the cloud. That's why I use FAISS for the vector stuff and MongoDB to save the information.