
Updated 8 months ago

Hello. I'm raising this question/thought so maybe someone could answer. I know it's no eas

@kapa.ai Hello. I'm raising this question in case someone can answer it. I know it's not easy to answer.
From what I know about models, it seems that the bigger the model, the better. But I don't understand why there isn't a simplified model that only knows how to talk and can then be taught a specific field, so that a simple, light model could answer better than GPT-4 (the best out there right now).
What would be the best training method to transform a light model so that it answers better when given some RAG files and context? Is this possible? How?
On my learning path, I've found the MEOW app plugin for WordPress for fine-tuning a model. The main problem is the cost of training plus the cost of inbound and outbound usage afterwards (inbound and outbound on fine-tuned models are way more expensive than on the main model out of the box; check it, it's soooo expensive).
I'm also thinking of running Ollama + Llama3 + a fine-tuned model locally, but I'd like to know how to train the model so that the questions and answers are coherent.
Any experience with this would be much appreciated. Thank you.
5 comments
Please stand by @memorIA as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
Hello! Your question is quite insightful and touches on several aspects of machine learning and language models. Let's break it down:

  1. Simplifying Models: While it's true that larger models like GPT-4 tend to perform better due to their ability to capture more complex patterns, it's also possible to achieve good results with smaller, more specialized models. This is where fine-tuning comes in. Fine-tuning is a process where a pre-trained model (like GPT-3.5-Turbo) is further trained on a specific dataset to specialize it for a particular task. Fine-tuning does not shrink a model, but a small model that has been fine-tuned can perform better on that specific task than it would out of the box.
  2. Training with RAG: The Retrieval-Augmented Generation (RAG) approach brings external knowledge into a language model as context, without retraining it. A retrieval step finds relevant information in a large knowledge base, and that information is added to the prompt given to the generator. This can help the generator produce more coherent and informative responses. You can see an example of this in the LlamaIndex documentation.
  3. Fine-tuning Costs: Fine-tuning a model can indeed be expensive, both in terms of computational resources and the cost of the data used for training. However, fine-tuning can lead to significant improvements in model performance. For example, fine-tuning embeddings can give you a 5-10% increase in retrieval evaluation metrics, as mentioned in the LlamaIndex documentation.
  4. Local Training: Training a model locally can help reduce costs, but it requires sufficient computational resources. You can use tools like LlamaIndex to facilitate this process. For example, you can use LlamaIndex to generate a synthetic question/answer dataset, fine-tune the model, and then evaluate it.
  5. Coherent Questions and Answers: Ensuring that the questions and answers generated by a model are coherent is a complex task that involves various aspects of model training and evaluation. One approach is to use evaluation metrics like answer_relevancy and faithfulness, which measure the relevance of the generated answer to the prompt and the factual consistency of the generated answer against the given context, respectively.
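To make the RAG idea above concrete, here is a minimal sketch in plain Python. It is not how LlamaIndex does it: real pipelines normally use embedding models and a vector store, while this sketch swaps in simple word-count cosine similarity so it runs with no dependencies. The names `retrieve`, `build_prompt` and `DOCS` are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two word-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    # Rank the chunks by similarity to the question and keep the best ones.
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=2):
    # Put the retrieved chunks in front of the question as context.
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, top_k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical knowledge base; in practice these would be chunks of your files.
DOCS = [
    "Our store ships orders within two business days.",
    "Refunds are accepted within 30 days of purchase.",
    "The support team is available on weekdays from 9 to 17.",
]

if __name__ == "__main__":
    print(build_prompt("Are refunds accepted after purchase?", DOCS, top_k=1))
```

The resulting prompt string is what you would send to the model (a small local one or a hosted one). The model itself is not retrained; it only sees the retrieved text as context.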
I hope this helps! If you have any more questions, feel free to ask.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
To make the answers coherent, you can fine-tune LLMs. In your case you want a small LLM, so try Microsoft Phi-3.
I would also try different prompt settings to see how the model behaves under those conditions.
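One way to try different prompt settings systematically is sketched below, assuming you compare a couple of system prompts at a couple of temperatures. Everything here is hypothetical: the system prompts, the temperature values, and the `generate` callable, which stands in for whatever actually calls your model (for example a local Phi-3). A fake generator is used so the sketch runs on its own.

```python
from itertools import product

# Hypothetical settings to compare; replace with your own.
SYSTEM_PROMPTS = {
    "plain": "You are a helpful assistant.",
    "grounded": "Answer only from the provided context. Say 'I don't know' if unsure.",
}
TEMPERATURES = [0.0, 0.7]


def build_variants(question):
    # One variant per (system prompt, temperature) combination.
    return [
        {
            "name": f"{name}@{temp}",
            "system": system,
            "temperature": temp,
            "prompt": question,
        }
        for (name, system), temp in product(SYSTEM_PROMPTS.items(), TEMPERATURES)
    ]


def run_variants(question, generate):
    # `generate` is any callable that takes a variant dict and returns text.
    return {v["name"]: generate(v) for v in build_variants(question)}


def fake_generate(variant):
    # Stand-in for a real model call, so the sketch runs without a model.
    return f"[{variant['name']}] reply to: {variant['prompt']}"


if __name__ == "__main__":
    for name, answer in run_variants("What is RAG?", fake_generate).items():
        print(name, "->", answer)
```

Reading the answers side by side for the same question makes it easier to see which settings give coherent replies before spending money on fine-tuning.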
Thank you. Phi-3 and prompts are really what I'm working on these days, so I'm happy you pointed that out; it means we're on the same page.
I'm also investigating knowledge graphs and JSON conversions of unstructured data.
Thank you @WhiteFang_Jr