Neural Network from scratch - Part 2 (Forward Propagation)

Aladdin Persson · Advanced ·📐 ML Fundamentals ·7y ago

Key Takeaways

This video walks through the math of forward propagation for neural networks in generalized (matrix) notation.

Full Transcript

So, continuing from the last video, we now want to compute all the steps of forward propagation in the neural network. We looked at what these calculations look like for a specific node and a specific training example, to understand how it works in the simplest case, but we now need to generalize this.

Let's look at a similar neural network to the one we had last time. We can write the input as a0 and the output from the hidden layer as a1. The reason we can use this notation, instead of referring to a specific training example and a specific node, is that each of these can be viewed as a matrix. So a0 is a matrix containing all of the examples as its rows. Let's say we have images of handwritten digits; then the examples would be all of the images. So we have the examples as our rows, and the columns are the features, specifically in layer 0, which is the input. For images, those would be the pixels.

Now look at the weights. In general, the weights for layer l have shape (features in layer l-1, features in layer l). So W1 has shape (features l0, features l1), and in this case we have three inputs and five nodes in the hidden layer, so that would be (3, 5).

Why this works: look at the first calculation, z1 = a0 W1 + b1. In this multiplication, a0 has shape (examples, features l0) and W1 has shape (features l0, features l1). This is a matrix multiplication, so the two inner dimensions cancel: each node in the input is multiplied by each of the weights associated with that node, and the products are added together. So it's the same calculation as in the example we looked at in the first video, but in general notation.

Now there's one tricky part that's not very obvious at first. If we do this matrix multiplication, we get shape (examples, features l1). Then we add the biases, but remember that the biases are local to a specific node, which means b1 has shape (1, features l1): each node in layer 1 has a specific bias associated with that node. Obviously these shapes don't match, so the reason this addition works for all of the examples is something called broadcasting in Python. When we add the two together, b1 is expanded into as many rows as there are examples, and every one of those rows is identical. Then we can do the addition.

Then we need to activate the output from layer 1: a1 = ReLU(z1). Remember that this is element-wise, for each node and for all training examples, and ReLU is just max(0, z1).

For the output layer we do very much the same thing: z2 = a1 W2 + b2, and remember that this addition also uses broadcasting. We need to remember that layer 2 is the output layer, and the output layer never has an activation function; rather, it has a softmax classifier to turn the scores z2 into probabilities.

So let's look at the softmax. The softmax of a particular node j is e raised to the score of that node, divided by the sum of e raised to the score over all of the nodes: f_j(z) = e^(z_j) / sum_k e^(z_k). The output is a value between zero and one because we've normalized it: we take the value of a particular node and divide by the sum over all of them, so each node has a value between zero and one. So we call the probabilities the softmax of z2. Remember, this outputs the probabilities for all of the nodes and also for all of the training examples, since this matrix contains both the examples and all of the nodes.

The only tricky part left now is to calculate the loss. We'll write the loss as L, and for a particular training example i it is L_i = -log(f_{y_i}(z)). In our case z would be z2, from layer 2, but let's just write z for the general case, so it can be whatever layer.

Let me explain this y_i. Remember, f is what's computed by our neural network. y is a vector containing the correct labels, so y_i is the correct label for that particular training example. If we have handwritten digits, let's say y is a vector where the true value for the first training example is 0, for the second it's 2, for the third it's 3, and so on. So we have an image where the correct label is 0, the correct label of the second image is 2, and of the third image 3. Let's say the correct label y_1 is 0. What is it that we do? We take the value computed by our neural network at the node that is the correct one. Written out: L_i = -log( e^(z_{y_i}) / sum_k e^(z_k) ). So the numerator uses the node that is the correct label for this particular training example, and we still divide by the sum over all the nodes.

OK, so now we're ready for backward propagation. See you in the next video.
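
The forward pass described in the transcript can be sketched in NumPy as follows. This is a minimal illustration, not the code from the video: the function names (`relu`, `softmax`, `forward`, `loss_per_example`), the four output nodes, the random inputs, and the max-subtraction inside `softmax` (a common numerical-stability trick) are all assumptions made for the example. Only the shapes (3 inputs, 5 hidden nodes), the formulas, and the labels 0, 2, 3 come from the transcript.

```python
import numpy as np


def relu(z):
    # Element-wise max(0, z), applied to every node and every example.
    return np.maximum(0, z)


def softmax(z):
    # Assumption: subtract the row-wise max before exponentiating.
    # This avoids overflow and leaves the softmax value unchanged.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    # Normalize each row (one row per example) so it sums to 1.
    return exp / exp.sum(axis=1, keepdims=True)


def forward(a0, W1, b1, W2, b2):
    # (examples, l0) @ (l0, l1) -> (examples, l1); b1 has shape (1, l1)
    # and is broadcast across all example rows.
    z1 = a0 @ W1 + b1
    a1 = relu(z1)
    # Output layer: same pattern, then softmax instead of ReLU.
    z2 = a1 @ W2 + b2
    return softmax(z2)


def loss_per_example(probs, y):
    # L_i = -log(f_{y_i}(z)): for each row i, pick the probability
    # assigned to the correct label y[i].
    rows = np.arange(probs.shape[0])
    return -np.log(probs[rows, y])


# Hypothetical sizes: 3 examples, 3 input features, 5 hidden nodes,
# 4 output nodes (enough to cover the labels 0, 2, 3).
rng = np.random.default_rng(0)
n_examples, n_in, n_hidden, n_out = 3, 3, 5, 4

a0 = rng.normal(size=(n_examples, n_in))
W1 = rng.normal(size=(n_in, n_hidden)) * 0.1
b1 = np.zeros((1, n_hidden))
W2 = rng.normal(size=(n_hidden, n_out)) * 0.1
b2 = np.zeros((1, n_out))
y = np.array([0, 2, 3])  # correct label for each example

probs = forward(a0, W1, b1, W2, b2)
losses = loss_per_example(probs, y)

print(probs.shape)        # (examples, output nodes)
print(probs.sum(axis=1))  # each row sums to 1
print(losses)             # one loss value per training example
```

The integer-array indexing `probs[rows, y]` is one way to select "the node which is the correct one" for every example at once; a loop over examples would give the same values.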

Original Description

In this video we do the math for forward propagation in generalized notation!
