Instead of trying to encapsulate all of this with prompt engineering alone, I would use RAG. Here's what I would do:
- upload and index your documents to llamaindex along with a "heuristics file":
import fs from "node:fs/promises";
import { Document, VectorStoreIndex } from "llamaindex";

// read as utf-8 so we get strings back, not Buffers
const products = await fs.readFile("products.json", "utf-8");
const heuristics = await fs.readFile("heuristics.txt", "utf-8");

// products is already JSON text, so no need to re-stringify it
const productsDoc = new Document({ text: products });
const heuristicsDoc = new Document({ text: heuristics });
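one caveat: if products.json is large, stuffing it into a single Document can hurt retrieval. a sketch of splitting it into one Document per product instead (this assumes the file is a JSON array; you'd then pass [...productDocs, heuristicsDoc] to fromDocuments below):

// assumption: products.json is a JSON array of product objects
const productDocs = JSON.parse(products).map(
  (p) => new Document({ text: JSON.stringify(p) })
);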
- the heuristics file covers nuances that an LLM would probably miss. for example:
500ml means medium
cherries and gummy cherries are the same thing
croissants and mini croissants are not the same thing
7days is a brand name
coke, coca cola, COCA COLA, etc. are all the same thing
...
you don't have to list every product line by line, just the easily confused ones, like the ones you listed here
- now your prompt can be much simpler, focusing on the format of output you want
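something like this is usually enough (the wording here is mine, adapt it to your schema):

answer using only the indexed product data and the heuristics file.
reply with JSON only, in the shape {"product": string, "quantity": number}

all the product disambiguation lives in the heuristics file now, so the prompt doesn't have to enumerate any of it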
you should be able to query it like this:
// index both docs together so the heuristics get retrieved alongside the product data
const index = await VectorStoreIndex.fromDocuments([productsDoc, heuristicsDoc]);
const queryEngine = index.asQueryEngine();
const response = await queryEngine.query({
  query: "how many haribo gummy cherries are there?",
});
^ if your inventory numbers are correct and the LLM knows which products are the same, it should answer this correctly. your prompt then just has to handle turning the result into JSON, e.g.:
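rough sketch of that last step — I just append the format instruction to the query text and parse what comes back (response.toString() returning the answer text is an assumption on my part, check it against your llamaindex version):

const jsonResponse = await queryEngine.query({
  query:
    "how many haribo gummy cherries are there? " +
    'reply with JSON only, like {"product": "...", "quantity": 0}',
});

// parse the model's reply into an object you can use downstream
const answer = JSON.parse(jsonResponse.toString());
console.log(answer.quantity);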