Fine-tune your own Llama 2 to replace GPT-3.5/4
There has been a lot of interest on HN in fine-tuning open-source LLMs recently (e.g., Anyscale's post at https://news.ycombinator.com/item?id=37090632 ). I've been playing around with fine-tuning models for a couple of years, and wanted to share some insights and practical code.

I’ve condensed what I’ve learned into a small set of notebooks at https://github.com/OpenPipe/OpenPipe/tree/main/examples/clas... , covering labeling data, fine-tuning, running efficient inference, and evaluating costs/performance. The 7B model we train here matches GPT-4’s labels 95% of the time on the test set, and in the 5% of cases where they disagree, it’s often because the correct answer is genuinely ambiguous.

What is fine-tuning? You can think of it as a more powerful form of prompting, where instead of writing your instructions in text you actually encode them in the weights of the model itself. You do this by training an existing mod
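To make the "matches GPT-4’s labels 95% of the time" figure concrete, here is a minimal sketch of how such an agreement rate could be computed on a held-out test set. The function names and the toy labels are hypothetical and are not taken from the notebooks; the only assumption is that both models emit one label per test example.

```python
from typing import List, Sequence


def agreement_rate(model_labels: Sequence[str], reference_labels: Sequence[str]) -> float:
    """Fraction of examples where the fine-tuned model's label equals the reference label."""
    if len(model_labels) != len(reference_labels):
        raise ValueError("both label lists must cover the same test examples")
    if not model_labels:
        raise ValueError("cannot compute agreement on an empty test set")
    matches = sum(m == r for m, r in zip(model_labels, reference_labels))
    return matches / len(model_labels)


def disagreements(model_labels: Sequence[str], reference_labels: Sequence[str]) -> List[int]:
    """Indices of test examples where the two label sets differ, for manual review."""
    return [i for i, (m, r) in enumerate(zip(model_labels, reference_labels)) if m != r]


# Toy data (hypothetical): labels from the fine-tuned model vs. labels from GPT-4.
fine_tuned = ["dessert", "main", "main", "side"]
gpt4 = ["dessert", "main", "main", "main"]

rate = agreement_rate(fine_tuned, gpt4)
to_review = disagreements(fine_tuned, gpt4)
print(f"agreement: {rate:.0%}, review examples: {to_review}")
```

Listing the disagreeing indices is useful because, as noted above, many disagreements turn out to be genuinely ambiguous cases rather than model errors, and that can only be judged by reading them.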