Selecting the best model in scikit-learn using cross-validation
In this video, we'll learn about K-fold cross-validation and how it can be used for selecting optimal tuning parameters, choosing between models, and selecting features. We'll compare cross-validation with the train/test split procedure, and we'll also discuss some variations of cross-validation that can result in more accurate estimates of model performance.
Download the notebook: https://github.com/justmarkham/scikit-learn-videos
Documentation on cross-validation: http://scikit-learn.org/stable/modules/cross_validation.html
Documentation on model evaluation: http://scikit-learn.org/stable/modules/model_evaluation.html
GitHub issue on negative mean squared error: https://github.com/scikit-learn/scikit-learn/issues/2439
An Introduction to Statistical Learning: http://www-bcf.usc.edu/~gareth/ISL/
K-fold and leave-one-out cross-validation: https://www.youtube.com/watch?v=nZAM5OXrktY
Cross-validation the right and wrong ways: https://www.youtube.com/watch?v=S06JpVoNaA0
Accurately Measuring Model Prediction Error: http://scott.fortmann-roe.com/docs/MeasuringError.html
An Introduction to Feature Selection: http://machinelearningmastery.com/an-introduction-to-feature-selection/
Harvard CS109: https://github.com/cs109/content/blob/master/lec_10_cross_val.ipynb
Cross-validation pitfalls: http://www.jcheminf.com/content/pdf/1758-2946-6-10.pdf
WANT TO GET BETTER AT MACHINE LEARNING? HERE ARE YOUR NEXT STEPS:
1) WATCH my scikit-learn video series:
https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A
2) SUBSCRIBE for more videos:
https://www.youtube.com/dataschool?sub_confirmation=1
3) JOIN "Data School Insiders" to access bonus content:
https://www.patreon.com/dataschool
4) ENROLL in my Machine Learning course:
https://www.dataschool.io/learn/
5) LET'S CONNECT!
- Newsletter: https://www.dataschool.io/subscribe/
- Twitter: https://twitter.com/justmarkham
- Facebook: https://www.facebook.com/DataScienceSchool/
- LinkedIn: https://www.linkedin.com/in/ju
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from Data School · Data School · 19 of 60
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
▶
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Setting up Git and GitHub
Data School
Navigating a GitHub Repository - Part 1
Data School
Forking a GitHub Repository
Data School
Creating a New GitHub Repository
Data School
Copying a GitHub Repository to Your Local Computer
Data School
Committing Changes in Git and Pushing to a GitHub Repository
Data School
Syncing Your GitHub Fork
Data School
Allstate Purchase Prediction Challenge on Kaggle
Data School
Troubleshooting: Updates Rejected When Pushing to GitHub
Data School
Hands-on dplyr tutorial for faster data manipulation in R
Data School
ROC Curves and Area Under the Curve (AUC) Explained
Data School
Going deeper with dplyr: New features in 0.3 and 0.4 (tutorial)
Data School
What is machine learning, and how does it work?
Data School
Setting up Python for machine learning: scikit-learn and Jupyter Notebook
Data School
Getting started in scikit-learn with the famous iris dataset
Data School
Training a machine learning model with scikit-learn
Data School
Comparing machine learning models in scikit-learn
Data School
Data science in Python: pandas, seaborn, scikit-learn
Data School
Selecting the best model in scikit-learn using cross-validation
Data School
How to find the best model parameters in scikit-learn
Data School
How to evaluate a classifier in scikit-learn
Data School
What is pandas? (Introduction to the Q&A series)
Data School
How do I read a tabular data file into pandas?
Data School
How do I select a pandas Series from a DataFrame?
Data School
Why do some pandas commands end with parentheses (and others don't)?
Data School
How do I rename columns in a pandas DataFrame?
Data School
How do I remove columns from a pandas DataFrame?
Data School
How do I sort a pandas DataFrame or a Series?
Data School
How do I filter rows of a pandas DataFrame by column value?
Data School
How do I apply multiple filter criteria to a pandas DataFrame?
Data School
Your pandas questions answered!
Data School
How do I use the "axis" parameter in pandas?
Data School
How do I use string methods in pandas?
Data School
How do I change the data type of a pandas Series?
Data School
When should I use a "groupby" in pandas?
Data School
How do I explore a pandas Series?
Data School
How do I handle missing values in pandas?
Data School
What do I need to know about the pandas index? (Part 1)
Data School
What do I need to know about the pandas index? (Part 2)
Data School
How do I select multiple rows and columns from a pandas DataFrame?
Data School
Machine Learning with Text in scikit-learn (PyCon 2016)
Data School
When should I use the "inplace" parameter in pandas?
Data School
How do I make my pandas DataFrame smaller and faster?
Data School
How do I use pandas with scikit-learn to create Kaggle submissions?
Data School
More of your pandas questions answered!
Data School
How do I create dummy variables in pandas?
Data School
How do I work with dates and times in pandas?
Data School
How do I find and remove duplicate rows in pandas?
Data School
How do I avoid a SettingWithCopyWarning in pandas?
Data School
How do I change display options in pandas?
Data School
How do I create a pandas DataFrame from another object?
Data School
How do I apply a function to a pandas Series or DataFrame?
Data School
Getting started with machine learning in Python (webcast)
Data School
Q&A about Machine Learning with Text (online course)
Data School
Your pandas questions answered! (webcast)
Data School
Machine Learning with Text in scikit-learn (PyData DC 2016)
Data School
Write Pythonic Code for Better Data Science (webcast)
Data School
Web scraping in Python (Part 1): Getting started
Data School
Web scraping in Python (Part 2): Parsing HTML with Beautiful Soup
Data School
Web scraping in Python (Part 3): Building a dataset
Data School
More on: Supervised Learning
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
Understanding Machine Learning and the ML Pipeline — Step by Step
Medium · Machine Learning
Understanding Machine Learning and the ML Pipeline — Step by Step
Medium · Data Science
“How a self-taught builder with no coding background made it to the Meta PyTorch Grand Finale”
Medium · Machine Learning
Python Isn’t One Skill -It’s 10 Different Career Paths!
Medium · Machine Learning
🎓
Tutor Explanation
DeepCamp AI