When moving data from sources like Slack or Notion into a vector database, how should I transform the data before embedding it to make it most useful for my LLM? Any guidance on chunking, adding metadata, or other transforms? Are you using any tools or frameworks for that, or are you writing most of the code yourself?
Some vector DBs like Weaviate offer embedding as part of the product, but doesn't that limit the transformations I can do beforehand? Is it a bad idea to lock into a vendor's pre-packaged embedding?