Data science in Python: pandas, seaborn, scikit-learn
In this video, we'll cover the data science pipeline from data ingestion (with pandas) to data visualization (with seaborn) to machine learning (with scikit-learn). We'll learn how to train and interpret a linear regression model, and then compare three possible evaluation metrics for regression problems. Finally, we'll apply the train/test split procedure to decide which features to include in our model.
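The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the notebook from the video: the data is synthetic, the column names (`x1`, `x2`, `noise_feature`, `y`) and the `evaluate` helper are hypothetical, and the three metrics shown (MAE, MSE, RMSE) are an assumption about which regression metrics are being compared.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): y depends on x1 and x2 only,
# while noise_feature is unrelated to the response.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "x1": rng.uniform(0, 100, n),
    "x2": rng.uniform(0, 50, n),
    "noise_feature": rng.uniform(0, 100, n),
})
data["y"] = 3.0 + 0.05 * data["x1"] + 0.2 * data["x2"] + rng.normal(0, 1.0, n)

# Optional visualization with seaborn (not run here):
#   import seaborn as sns
#   sns.pairplot(data, x_vars=["x1", "x2", "noise_feature"], y_vars="y", kind="reg")


def evaluate(feature_cols):
    """Fit a linear regression on a train split and score it on the test split."""
    X = data[feature_cols]
    y = data["y"]
    # Same random_state for every call, so each feature set sees the same split.
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    metrics = {
        "MAE": mean_absolute_error(y_test, y_pred),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
    }
    return model, metrics


# Compare two candidate feature sets on held-out data.
model_all, metrics_all = evaluate(["x1", "x2", "noise_feature"])
model_sub, metrics_sub = evaluate(["x1", "x2"])

# Each coefficient estimates the change in y per one-unit change in that feature,
# holding the other features fixed.
print(dict(zip(["x1", "x2", "noise_feature"], model_all.coef_)))
print("all features:", metrics_all)
print("x1 and x2 only:", metrics_sub)
```

Comparing the test-set error of the two feature sets is one way to judge whether a feature is worth keeping: if dropping it leaves the error about the same or lower, the simpler model is usually the better choice.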
Download the notebook: https://github.com/justmarkham/scikit-learn-videos
pandas installation instructions: http://pandas.pydata.org/pandas-docs/stable/install.html
seaborn installation instructions: http://seaborn.pydata.org/installing.html
Longer linear regression notebook: https://github.com/justmarkham/DAT5/blob/master/notebooks/09_linear_regression.ipynb
Chapter 3 of Introduction to Statistical Learning: http://www-bcf.usc.edu/~gareth/ISL/
Videos related to Chapter 3: https://www.dataschool.io/15-hours-of-expert-machine-learning-videos/
Quick reference guide to linear regression: https://www.dataschool.io/applying-and-interpreting-linear-regression/
Introduction to linear regression: http://people.duke.edu/~rnau/regintro.htm
pandas Q&A video series: https://www.dataschool.io/easier-data-analysis-with-pandas/
pandas 3-part tutorial: http://www.gregreda.com/2013/10/26/intro-to-pandas-data-structures/
pandas read_csv documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
pandas read_table documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_table.html
seaborn tutorial: http://seaborn.pydata.org/tutorial.html
seaborn example gallery: http://seaborn.pydata.org/examples/index.html