Hey all! I have a lot of chat archives and want to train an AI to behave like a specific person. Do you think it would be best to use indices, or maybe fine-tuning?
For generating a persona, I think fine-tuning would work best (assuming you have access to enough resources)
Otherwise, prepending some short examples of the person talking to each prompt could work, but that uses valuable input space. For that, I would use langchain
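To make the prompt-prepending idea concrete, here's a rough sketch of assembling a few-shot persona prompt by hand. The example messages and the `build_persona_prompt` helper are made up for illustration; any framework (or none) could build this string.

```python
# Hypothetical few-shot persona prompt: prepend real chat excerpts so the
# model imitates the person's style. Examples below are invented.
EXAMPLES = [
    ("How was the game?", "haha we got destroyed, as usual lol"),
    ("Are you coming tonight?", "yeah gimme 20 min, just finishing smth"),
]

def build_persona_prompt(examples, user_message, persona="X"):
    """Assemble a prompt that shows the model how the persona talks."""
    lines = [f"Here are examples of how {persona} chats:"]
    for question, reply in examples:
        lines.append(f"Friend: {question}")
        lines.append(f"{persona}: {reply}")
    lines.append("Reply to the next message in the same style.")
    lines.append(f"Friend: {user_message}")
    lines.append(f"{persona}:")
    return "\n".join(lines)

prompt = build_persona_prompt(EXAMPLES, "What are you up to this weekend?")
print(prompt)
```

The downside mentioned above is visible here: every example pair eats tokens from the context window on every single request.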
llama_index is more for information retrieval, but you could maaaybe do some hacky stuff to get it to respond more like your persona without fine-tuning (e.g. changing the text_qa prompt to be something like "Given the following examples of a chat, generate a response to the user query while pretending to be user X")
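A sketch of what that persona-flavored text_qa template could look like. llama_index's QA templates use `{context_str}` and `{query_str}` placeholders; it's written here as a plain format string so it stays version-agnostic, and the chat snippets are invented.

```python
# Hypothetical persona-style QA template. In llama_index, the retrieved
# text fills {context_str} and the user's message fills {query_str}.
PERSONA_QA_TMPL = (
    "Below are examples of a chat involving user X.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given those examples, respond to the user query while pretending "
    "to be user X.\n"
    "Query: {query_str}\n"
    "Answer: "
)

filled = PERSONA_QA_TMPL.format(
    context_str="Friend: you up?\nX: yeah lol what's good",
    query_str="What should we do tomorrow?",
)
print(filled)
```

The hacky part is that retrieval will surface chat snippets *similar to the query*, not necessarily the most style-representative ones, so results can be hit or miss.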
Another question I have is about multiple sources. I've created a basic PoC app to chat about a long report I have. I built a graph index that combines a vector index and an empty index, and created a langchain tool with that graph index
I'm using the graph index for now, but it's hard to write a good description for it. With separate tools I imagine it's easier to write the descriptions, but harder to get combined answers, right?
Yea, writing the summary for each index in the graph can be hard. You can also use a temporary list index and response_mode="tree_summarize" to let the LLM write the summary for you
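For intuition, here's a toy illustration of what a tree-summarize response mode does: summarize leaf chunks, then merge the summaries level by level until one remains. `fake_llm_summarize` is a stand-in for a real LLM call, and the report sentences are invented.

```python
def fake_llm_summarize(texts):
    """Stand-in for an LLM summarization call: joins and truncates."""
    return " / ".join(t[:40] for t in texts)

def tree_summarize(chunks, fan_in=2):
    """Hierarchically reduce chunks to a single summary string."""
    level = list(chunks)
    while len(level) > 1:
        # Summarize groups of `fan_in` items, building the next level up.
        level = [
            fake_llm_summarize(level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]

summary = tree_summarize([
    "Q1 revenue grew 12% year over year.",
    "Churn dropped after the pricing change.",
    "Headcount stayed flat across all teams.",
])
print(summary)
```

The resulting top-level summary is exactly the kind of text you can paste in as the index description for the graph.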
Using multiple tools, yea I think the combined answers thing can be a problem (I think langchain only uses one tool per response? I need to look into that a bit more)
I think that if langchain can break the prompt into 2 or 3 steps it can use different tools, but then it increases prompting complexity. I think graph will work better for me. Ty so much Logan!
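A toy sketch of that multi-step idea: break the request into sub-questions and route each one to the tool whose description fits, then combine the results. The tool names, descriptions, and the keyword router are all made up; in a real langchain agent the LLM itself picks the tool at each step.

```python
# Hypothetical tools keyed by name; each has a description (what the agent
# would read) and a run function (the actual index query, stubbed here).
TOOLS = {
    "report_qa": {
        "description": "answer questions about the long report",
        "run": lambda q: f"[report answer to: {q}]",
    },
    "chat_persona": {
        "description": "respond in the style of user X's chat archive",
        "run": lambda q: f"[persona reply to: {q}]",
    },
}

def route(step):
    """Crude keyword router standing in for the LLM's tool choice."""
    name = "report_qa" if "report" in step.lower() else "chat_persona"
    return TOOLS[name]["run"](step)

def answer(steps):
    """Run each sub-question through a tool, then combine the outputs."""
    return "\n".join(route(s) for s in steps)

print(answer([
    "What does the report say about Q1 revenue?",
    "Reply to my friend about weekend plans",
]))
```

This shows the trade-off from the messages above: per-tool descriptions are easy to write, but combining answers now depends on the decomposition step being done well.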