Hi all, does anybody know any good resources on strategies for dealing with case sensitivity during retrieval from knowledge graphs? For example, I'm testing with a Gartner report on Insight Engines. The knowledge graph index has picked up "Insight engine" (note the upper-case "I") even though other casing variations are present. The query "tell me more about insight engines" fails, while "tell me more about Insight engines" succeeds. I want both to succeed (I think).
Lower-casing is what I thought of also. But then I wasn't sure if I'd just have the opposite problem. Wanted to see if I could understand some fundamentals through reading before hacking 🙂
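To make the lower-casing idea concrete, here's a rough sketch in plain Python (not LlamaIndex's actual API or matching logic). The assumption is that the "opposite problem" goes away if the same normalization is applied on both sides: to the entity keys when they're indexed and to the query when it's matched, while the original surface forms are kept for lookup. The names `normalize`, `build_lookup` and `match_entities` are made up for illustration.

```python
from collections import defaultdict


def normalize(term: str) -> str:
    # casefold() is a more aggressive lower(); also collapse runs of whitespace.
    return " ".join(term.casefold().split())


def build_lookup(entities):
    # Map each normalized key to every original surface form seen in the graph.
    lookup = defaultdict(set)
    for entity in entities:
        lookup[normalize(entity)].add(entity)
    return lookup


def match_entities(query: str, lookup) -> set:
    # Normalize the query the same way as the keys, then look for each key
    # starting at a word boundary. Only the leading boundary is checked, so
    # suffixes such as a plural "s" still match (and so would "engineering").
    padded = f" {normalize(query)} "
    hits = set()
    for key, originals in lookup.items():
        if f" {key}" in padded:
            hits |= originals
    return hits


# Hypothetical entity keys, mimicking mixed casing in the extracted graph.
entities = ["Insight engine", "insight engine", "Gartner"]
lookup = build_lookup(entities)

print(sorted(match_entities("tell me more about insight engines", lookup)))
print(sorted(match_entities("tell me more about Insight engines", lookup)))
```

Both queries end up with the same set of original entity strings, which could then be used for the actual graph lookup. The trade-off is that terms where casing carries meaning (acronyms vs. ordinary words, say) get merged.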
Default models that come with the LlamaIndex KeywordExtractor and EntityExtractor, and BAAI/bge-small-en-v1.5 for embedding. I haven't tried text-embedding-ada-002 with this experiment yet.
I've not tried enabling embeddings for the KG, and I've not looked into how that changes querying functionally or what the pros and cons are.
Re: keyword search: in theory I should be able to compose the LlamaIndex indexing methods and get the same result as Elastic etc., right? Granted, they've done the hard work already.