Hi all does anybody know any good


Hi all, does anybody know any good resources on strategies for dealing with case sensitivity during retrieval from knowledge graphs? For example, I'm testing with a Gartner report on Insight Engines. The knowledge graph index has picked up "Insight engine" (note the upper-case "I") even though other casing variations are present. The query "tell me more about insight engines" fails, while "tell me more about Insight engines" succeeds. I want both to succeed (I think).

5 comments

Seldo

That's surprising, I haven't heard of case sensitivity as a big problem before. What model are you using?

Seldo

The immediate solution that occurs to me would be to lowercase everything before ingesting it, but that seems like a dirty hack.
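A rough sketch of the lowercasing idea, in case it helps: normalize the key on both the ingest side and the query side, but keep the original casing in the stored triplet so nothing is lost. This is plain Python, not LlamaIndex's API; `CaseInsensitiveEntityIndex` and `normalize` are hypothetical names made up for illustration, and the triplet is invented sample data.

```python
from collections import defaultdict


def normalize(term: str) -> str:
    """Case-fold and collapse whitespace so casing variants share one key."""
    return " ".join(term.casefold().split())


class CaseInsensitiveEntityIndex:
    """Toy stand-in for a knowledge-graph keyword table (hypothetical)."""

    def __init__(self) -> None:
        # normalized subject -> list of (subject, relation, object) triplets
        self._triplets = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        # The key is normalized; the triplet keeps the original casing.
        self._triplets[normalize(subject)].append((subject, relation, obj))

    def lookup(self, keyword: str) -> list:
        # The query keyword goes through the same normalization as the key.
        return list(self._triplets.get(normalize(keyword), []))


index = CaseInsensitiveEntityIndex()
index.add("Insight engine", "is covered by", "Gartner report")

lower_hits = index.lookup("insight engine")
upper_hits = index.lookup("Insight engine")
```

Because the same function is applied on both sides, lowercasing alone should not create the opposite problem. Note that this only handles casing: singular vs. plural ("engine" vs. "engines") would still need stemming or similar.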

Seldo

The far more complicated solution would be to use something like Vespa or Elastic, which also do keyword search before feeding the results to the model.
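To make the "keyword search first" idea concrete, here is a minimal sketch of a case-insensitive keyword pre-filter that runs before anything is handed to the model. It is a toy stand-in, not how Vespa or Elastic actually rank; `keyword_prefilter`, the stopword list, and the sample chunks are all assumptions for illustration.

```python
import re

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"tell", "me", "more", "about", "the", "a", "an", "of"}


def tokenize(text: str) -> set:
    """Case-fold the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.casefold()))


def keyword_prefilter(query: str, chunks: list) -> list:
    """Return chunks sharing at least one non-stopword term with the query,
    ordered by how many terms they share (most overlap first)."""
    terms = tokenize(query) - STOPWORDS
    scored = []
    for chunk in chunks:
        overlap = len(terms & tokenize(chunk))
        if overlap:
            scored.append((overlap, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored]


chunks = [
    "Insight Engines apply relevancy methods to find information.",
    "Unrelated text about databases.",
]
candidates = keyword_prefilter("tell me more about insight engines", chunks)
```

Only the chunks in `candidates` would then be passed on to the embedding or LLM step, so casing in the query no longer decides whether anything is retrieved.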

Janaka - Docq.AI

Lower-casing is what I thought of too. But then I wasn't sure if I'd just have the opposite problem. I wanted to see if I could understand some of the fundamentals through reading before hacking 🙂

I'm using the default models that come with the LlamaIndex KeywordExtractor and EntityExtractor, and BAAI/bge-small-en-v1.5 for embedding. I've not tried text-embedding-ada-002 in this experiment yet.

I've not tried enabling embeddings for the KG, and I've not looked into how that changes querying functionally or what the pros and cons are.

Janaka - Docq.AI

Re: keyword search: in theory I should be able to compose the LlamaIndex indexing methods and get the same result as Elastic etc., right? Granted, they've done the hard work already.
