Hello everyone,
I recently started working with llama-index and I've encountered a very weird issue that I cannot find a solution to.
I have tried three large embedding models: Salesforce/SFR-Embedding-Mistral, GritLM/GritLM-7B, and intfloat/e5-mistral-7b-instruct. Confusingly, the retrieved nodes (top_k=10) all have scores higher than 0.9999999999. However, when I switched to a smaller embedding model like UAE-Large-V1, the highest score is around 0.65, which seems reasonable.
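To make the comparison concrete, this is the kind of self-contained check I mean (an untested sketch; I'm assuming WhereIsAI/UAE-Large-V1 is the full HF id for the smaller model): embed two unrelated sentences with each model and compare their cosine similarity directly.

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
import numpy as np

def pairwise_cos(model_name: str) -> float:
    # embed two unrelated sentences and return their cosine similarity
    em = HuggingFaceEmbedding(model_name=model_name, device="cuda", embed_batch_size=1)
    a = np.array(em.get_text_embedding("hello"))
    b = np.array(em.get_text_embedding("The quarterly revenue grew by 12 percent."))
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# based on what I'm seeing, the large model would print something near 1.0
# and the small one a much lower value
for name in ["Salesforce/SFR-Embedding-Mistral", "WhereIsAI/UAE-Large-V1"]:
    print(name, pairwise_cos(name))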
Plus, I tried modifying my query. The results remain the same (the retrieved nodes may differ, but their scores are still extremely close to 1), even when my query is 'hello', which has nothing to do with the retrieved node text or the contents I feed into the index.
I'm confused about where the problem lies. Below is my code snippet:
import tiktoken
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import MarkdownNodeParser
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.azure_openai import AzureOpenAI

llm = AzureOpenAI(
    model="gpt-35-turbo",
    deployment_name="xxxx",
    api_key="xxxxx",
)

# local HuggingFace embedding model (this is the one that yields ~1.0 scores)
embed_model = HuggingFaceEmbedding(
    model_name="Salesforce/SFR-Embedding-Mistral",
    cache_folder="model_cache",
    device="cuda",
    embed_batch_size=1,
    max_length=3072,
)

Settings.llm = llm
Settings.tokenizer = tiktoken.encoding_for_model("gpt-3.5-turbo").encode
Settings.embed_model = embed_model

documents = SimpleDirectoryReader("../2024_report").load_data()
pipeline = IngestionPipeline(
    transformations=[MarkdownNodeParser(include_metadata=True, include_prev_next_rel=True)]
)
nodes = pipeline.run(documents=documents)
index = VectorStoreIndex(nodes, show_progress=True)

# retrieve the top 10 nodes for a throwaway query and print their scores
query = "hello"
retriever = VectorIndexRetriever(index=index, similarity_top_k=10)
ret_nodes = retriever.retrieve(query)
for ret_node in ret_nodes:
    print(ret_node.score)
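In case it helps with debugging, here is a minimal sanity check (again an untested sketch) that bypasses the retriever entirely and computes the cosine similarity between the query embedding and one node's text embedding by hand, using the same embed_model from above; get_query_embedding and get_text_embedding are the base embedding methods in llama-index, as far as I can tell.

import numpy as np

# embed the query and the first node's text with the same model,
# then compute their cosine similarity directly
q_emb = np.array(embed_model.get_query_embedding("hello"))
t_emb = np.array(embed_model.get_text_embedding(nodes[0].get_content()))

cos = float(np.dot(q_emb, t_emb) / (np.linalg.norm(q_emb) * np.linalg.norm(t_emb)))
print("raw cosine similarity:", cos)

If this raw cosine also comes out around 1.0 for 'hello', that would suggest the problem is in the embeddings the model produces rather than in the index or retriever.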
I'm reaching out to see if anyone has experienced similar issues. Any insights or suggestions on how to solve this problem would be greatly appreciated. Thank you.