Round 2 - I use CodeLlama 70B vs Mixtral MoE to write code to finetune a model on 16 GPUs 🤯🤯
Skills: LLM Engineering
I test how well three different LLMs work for writing a Python script to finetune a model on 16 GPUs (multi-node).
This video is not edited in any way. It shows a realistic workflow for coding without gimmicks or hype.
I ask CodeLlama 70B, Mixtral 8x7B (MoE), and Mistral 7B to write a Python program to finetune a computer vision model on the CIFAR10 dataset. You can validate all of this yourself by running the three studios for free.
This is an unedited video... so here are some corrections:
- To clarify what "based on Llama 2" means: Mistral 7B tweaks the way Llama 2 does attention but is then pretrained from scratch.
- I think I forgot about Llama 2 7B... Mixtral was just working super well.
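For reference, the kind of code-generation prompt I build in the video (see the "Full prompt for ML code" chapter) can be sketched as a simple template. The wording below is my approximation, not the video's verbatim prompt, and the default model/layout values are illustrative:

```python
def build_finetune_prompt(model_name="resnet18", dataset="CIFAR10",
                          total_gpus=16, num_nodes=2):
    """Build a code-generation prompt like the one sent to CodeLlama 70B,
    Mixtral 8x7B, and Mistral 7B. The template text is an approximation
    of the video's prompt, not a transcript."""
    return (
        f"Write a complete Python script using PyTorch Lightning that "
        f"finetunes a pretrained {model_name} on the {dataset} dataset. "
        f"The script must run on {num_nodes} nodes with {total_gpus} GPUs "
        f"total (use the DDP strategy), log metrics to TensorBoard, "
        f"and save checkpoints."
    )

print(build_finetune_prompt())
```

The same template is sent to each model, so the only variable in the comparison is the model itself.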
Chapters:
00:00 Introduction
00:40 Run CodeLlama 70B
01:13 Run Mixtral 8x7B (MoE)
01:34 Run Mistral 7B
01:47 How to get a GPU
02:08 What is a Lightning Studio
03:47 Basic CodeLlama 70B test
04:20 Basics of model monitoring
04:39 Connect a local VSCode
06:20 Basic Mixtral MoE coding test
08:46 Create the prompt to generate the ML code
09:04 Connect an S3 bucket
10:10 Full prompt for ML code
13:16 Prompt Mistral 7B
13:50 Debug the finetuning script
14:16 About the Lightning Trainer
14:56 Sanity check the finetuning script
15:30 Monitor with Tensorboard
16:20 About model RAM and model size
16:44 A quick TL;DR about profiling a model
17:40 Scale to multi-node (16 GPUs)
19:10 CodeLlama 70B results
20:00 About finetuning
22:10 Monitoring the 16 GPUs
22:54 CodeLlama 70B code results
25:35 Look at multi-node logs, weights