Preprocessing

Use preprocessing techniques such as stemming and lemmatization on the input query.

But if I do that on the query, how will llama_index match the processed query against the nodes?

Will I also have to preprocess the nodes when building the index?

I feel like this is pretty common for embeddings. With the query being so short, any change at all can drastically change the embeddings.

Maybe try increasing the top k by one or two? You could also look into writing a custom retriever that uses both a keyword index and a vector index.

https://gpt-index.readthedocs.io/en/latest/examples/query_engine/CustomRetrievers.html

A mix of keyword and vector index (hybrid) is definitely a good idea.
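To make the hybrid idea concrete, here is a minimal sketch of how the results of a keyword retriever and a vector retriever could be combined. It is plain Python, not the LlamaIndex API: `hybrid_merge`, the `"AND"`/`"OR"` modes, and the node ids are all assumptions made up for illustration.

```python
from typing import List


def hybrid_merge(vector_ids: List[str], keyword_ids: List[str], mode: str = "OR") -> List[str]:
    """Combine node ids from a vector retriever and a keyword retriever.

    Hypothetical helper: both inputs are assumed to be lists of unique node ids,
    ordered best-first by their own retriever.
    """
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")

    keyword_set = set(keyword_ids)
    if mode == "AND":
        # Keep only nodes that both retrievers agree on, in vector order.
        return [node_id for node_id in vector_ids if node_id in keyword_set]

    # OR: keep all vector hits, then append keyword-only hits.
    seen = set(vector_ids)
    merged = list(vector_ids)
    for node_id in keyword_ids:
        if node_id not in seen:
            seen.add(node_id)
            merged.append(node_id)
    return merged


print(hybrid_merge(["a", "b", "c"], ["c", "d"], mode="AND"))
print(hybrid_merge(["a", "b", "c"], ["c", "d"], mode="OR"))
```

`"AND"` is stricter and tends to cut noise, while `"OR"` helps when a short query misses relevant nodes on one side.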

@Logan M Thank you!

@theOldPhilosopher the preprocessing can be done before sending the query to the index. These techniques will work well for a keyword index and may not work as well for a vector index. Hence, a hybrid approach works much better here.

https://exchange.scale.com/public/blogs/preprocessing-techniques-in-nlp-a-guide
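On the earlier question about the nodes: for keyword matching, the same preprocessing has to be applied to both the query and the node text, otherwise the stems won't line up. A rough sketch, assuming a toy suffix stripper (`toy_stem`) as a stand-in for a real stemmer such as NLTK's `PorterStemmer`; all names here are hypothetical and none of this is LlamaIndex code.

```python
import re
from typing import Dict, List, Set

# Toy suffix list standing in for a real stemmer; far cruder than Porter stemming.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> Set[str]:
    """Lowercase, tokenize, and stem a piece of text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {toy_stem(token) for token in tokens}


def keyword_search(query: str, nodes: Dict[str, str]) -> List[str]:
    """Return ids of nodes sharing at least one stemmed term with the query."""
    query_terms = preprocess(query)
    scored = []
    for node_id, text in nodes.items():
        # Nodes go through the same preprocessing as the query.
        overlap = len(query_terms & preprocess(text))
        if overlap:
            scored.append((overlap, node_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [node_id for _, node_id in scored]


nodes = {
    "n1": "Indexing documents with embeddings",
    "n2": "Cooking pasta at home",
}
print(keyword_search("indexed document", nodes))
```

Without stemming on both sides, "indexed" would never match "Indexing". In a real setup the node side would be preprocessed once, when the keyword index is built.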

I want to confirm: when llama_index is using the query to fetch nodes, can it use semantic search?

@Logan M @ravi-decover Hi guys, can I use qdrant for semantic search?

Yes, when using a vector index it does (semantic search is basically just embeddings).
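For anyone wondering what "just embeddings" means in practice, here is a small sketch of top-k retrieval by cosine similarity. The vectors are hand-written placeholders (a real setup would get them from an embedding model), and `cosine` and `top_k` are hypothetical helpers, not LlamaIndex or Qdrant functions.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], node_vecs: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the ids of the k nodes most similar to the query vector."""
    ranked = sorted(
        node_vecs,
        key=lambda node_id: cosine(query_vec, node_vecs[node_id]),
        reverse=True,
    )
    return ranked[:k]


# Placeholder embeddings, made up for illustration.
node_vecs = {
    "n1": [0.9, 0.1, 0.0],
    "n2": [0.0, 1.0, 0.0],
    "n3": [0.7, 0.3, 0.1],
}
print(top_k([1.0, 0.0, 0.0], node_vecs, k=2))
```

Raising `k` simply returns more of the ranked list, which is why bumping the top k by one or two can rescue a relevant node that a short query ranked just below the cutoff.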

For sure! LlamaIndex works well with qdrant

Hi @Logan M, I want to understand properly how llama_index works. What's the difference between langchain and llama_index? I'm not that clear on why I should use llama_index instead of langchain. Semantic search is available in qdrant too. So, if possible, can you help me clear up this doubt? It would be a great help.
Thanks

Qdrant (and also langchain) don't allow for more complex query structures. With llama index, you can use query engines on top of your index like sub question query engine and router query engine.

Furthermore, these query engines can all be used in agents (which we have some more news on later today 👏).

Compared to langchain, I'd say llamaindex is more customizable. Retrievers, node postprocessors, response synthesizers all come together to form a query engine, and you can customize each piece.
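As a rough mental model of how those pieces fit together, here is a toy pipeline in plain Python. `ToyQueryEngine` and the lambdas are hypothetical stand-ins, not the actual LlamaIndex classes or signatures; the point is only that each stage can be swapped independently.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Node = str  # Simplification for this sketch: a node is just its text.


@dataclass
class ToyQueryEngine:
    """Hypothetical stand-in: retriever -> node postprocessors -> response synthesizer."""

    retriever: Callable[[str], List[Node]]
    synthesizer: Callable[[str, List[Node]], str]
    postprocessors: List[Callable[[List[Node]], List[Node]]] = field(default_factory=list)

    def query(self, question: str) -> str:
        nodes = self.retriever(question)
        for postprocess in self.postprocessors:
            # Each postprocessor can filter or reorder the retrieved nodes.
            nodes = postprocess(nodes)
        return self.synthesizer(question, nodes)


engine = ToyQueryEngine(
    retriever=lambda q: ["short", "a much longer node about " + q],
    postprocessors=[lambda nodes: [n for n in nodes if len(n) > 10]],
    synthesizer=lambda q, nodes: f"{q} -> {len(nodes)} node(s)",
)
print(engine.query("qdrant"))
```

Swapping the retriever for a hybrid one, or adding another postprocessor, leaves the rest of the pipeline untouched.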