
Updated 3 months ago

I have an idea


I want to build a tool or piece of software for my company, a consultancy with hundreds of projects. Each project has its own fair share of documentation spread across DocMan databases, Jira tickets, GitLab issues, wikis, Teams chats, etc.

I want the tool to collect all of a project's text-based data into a vector embeddings database. After automatically collecting this data, the tool should attach that database to a local LLM (local for privacy reasons; probably GPT4All-J, Vicuna, etc.). This local model could then be prompted much like ChatGPT, but would cite sources from the aforementioned data. This would make project work much easier, because any question could simply be asked of this assistant. (There are many open-source ChatGPT UI clones available.)
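To make the idea concrete, here is a minimal sketch of the retrieval side: embed text chunks, store them alongside their source, and return the closest chunks (with sources) for a question. This is a toy bag-of-words embedding in pure Python; a real build would use a proper embedding model and vector database, and the class/field names here are illustrative, not from any particular library.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """Tiny in-memory stand-in for a vector embeddings database."""

    def __init__(self):
        self.entries = []  # list of (embedding, text, source)

    def add(self, text: str, source: str):
        self.entries.append((embed(text), text, source))

    def query(self, question: str, top_k: int = 2):
        """Return the top_k most similar chunks as (text, source) pairs."""
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [(text, source) for _, text, source in ranked[:top_k]]

# Hypothetical documents from two of the sources mentioned above.
store = VectorStore()
store.add("We deploy every Friday via the GitLab pipeline", "gitlab:issue-42")
store.add("The client meeting notes are stored in DocMan", "docman:project-a")
hits = store.query("when do we deploy")
```

The returned `(text, source)` pairs are what the assistant would quote as citations; the LLM only comes in afterwards, to phrase an answer grounded in those retrieved chunks.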

I have a few questions:

  1. Am I right in assuming this is possible?
  2. How easy is it? From my understanding, all of the tools required are open source and just need to be integrated.
  3. How much would it cost? (I assume free, given the open-source nature.)
  4. Could you point me to how exactly I should go about creating this?
Thanks a lot!
  1. Definitely possible!
  2. Fairly easy, assuming you have easy access to the documents you want to index. The hardest part will be figuring out the structure of the index (i.e. a single vector index; a few vector indexes, one per source/topic, in a composable graph index; or a few indexes used as custom tools in LangChain).
  3. If you use all local models, it should be free!
  4. The FAQ has some links to using custom LLMs and embeddings, as well as other things: https://docs.google.com/document/d/1bLP7301n4w9_GsukIYvEhZXVAvOMWnrxMy089TYisXU/edit?usp=sharing
This thread also has a few implementations, give it a read: https://discord.com/channels/1059199217496772688/1090945925129707570/1098463407209979954

I would first get the local LLMs and embeddings working with a small subset of documents, and then figure out your index structure from there
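The "one index per source" structure mentioned in point 2 can be sketched very simply: query each source's index separately, then merge the best hits. The sketch below assumes each per-source query has already produced scored `(score, text, source)` hits (e.g. from a similarity search); the function and variable names are illustrative, not from any library.

```python
def merge_results(results_per_source, top_k=3):
    """Flatten scored hits from several per-source indexes, best score first."""
    merged = [hit for hits in results_per_source for hit in hits]
    return sorted(merged, key=lambda h: h[0], reverse=True)[:top_k]

# Hypothetical scored hits, as returned by two separate per-source indexes.
jira_hits = [(0.9, "Ticket PROJ-12: login bug", "jira"), (0.2, "Ticket PROJ-7", "jira")]
wiki_hits = [(0.6, "Login flow is documented here", "wiki")]
top = merge_results([jira_hits, wiki_hits], top_k=2)
```

Keeping one index per source makes it easy to re-index a single source (say, only the Jira tickets) and to report which system each citation came from.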
Thanks so much! Really helps.