The community members are discussing the best way to construct an index from emails in order to create an email bot that can respond like a person. One community member suggests using a list or vector index to summarize past emails, and then feeding the summary and recent emails to a language model to generate responses. Another community member mentions using phone call transcriptions to create a vector index, but wants to avoid retrieving personal details from the knowledge base. A third community member recommends using a PII (Personally Identifiable Information) masking tool to address the issue of personal details. The community members acknowledge that this is a tricky problem to solve.
Hmm, not sure if an index is the right thing here?
Maybe a list or vector index to create a summary of past emails, and then you feed the summary and X most recent emails + instructions to the LLM to generate a response?
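A rough sketch of what that prompt assembly could look like, assuming the summary has already been produced by whatever index you pick. The `Email` class, `build_prompt`, and the prompt wording are all hypothetical names for illustration, not part of any library:

```python
from dataclasses import dataclass


@dataclass
class Email:
    # Hypothetical minimal email record; real emails carry more fields.
    sender: str
    body: str


def build_prompt(summary: str, emails: list[Email], instructions: str, x: int = 3) -> str:
    """Combine instructions, a summary of older emails, and the x most recent emails."""
    # Assumption: `emails` is ordered oldest to newest.
    recent = emails[-x:] if x > 0 else []
    thread = "\n\n".join(f"From: {e.sender}\n{e.body}" for e in recent)
    return (
        f"{instructions}\n\n"
        f"Summary of earlier emails:\n{summary}\n\n"
        f"Most recent emails:\n{thread}\n\n"
        "Reply:"
    )
```

The returned string is what you would send to the LLM. Keeping `x` small keeps the prompt short, while the summary carries the older context.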
In the past, I have used phone call transcriptions and created a vector index directly from them, and the responses were great. I was thinking of doing the same thing with emails, but I want to avoid any personal details being retrieved from the knowledge base.
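One way to approach the personal-details concern is to mask PII in each email before it goes into the index, as the summary above mentions. Below is a very minimal sketch using only regular expressions. The `mask_pii` name and both patterns are assumptions for illustration; they catch only simple email addresses and phone numbers, and a dedicated PII masking tool would cover far more (names, addresses, account numbers):

```python
import re

# Simplified patterns, assumed for illustration only; they will miss many real-world formats.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def prepare_for_index(bodies: list[str]) -> list[str]:
    """Mask every email body before it is handed to the indexing step."""
    return [mask_pii(body) for body in bodies]
```

The idea is to run `prepare_for_index` over the raw email bodies and build the vector index from its output, so the unmasked text never reaches the knowledge base.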