204.5.14 Neural Network Appendix

Link to the previous post: https://statinfer.com/204-5-13-neural-networks-conclusion/

In this post we will discuss the math behind a few steps of Neural Network algorithms.

Math - How to update the weights?

We update the weights backwards, from the output layer towards the input layer, by iteratively calculating the error.
The weights are updated using the gradient descent method, or delta rule, which is also known as the Widrow-Hoff rule.
First we calculate the weight corrections for the output layer, then we take care of the hidden layers.
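\(W_{jk} = W_{jk} + \Delta W_{jk}\)
- where \(\Delta W_{jk} = \eta \cdot y_j \cdot \delta_k\), and \(y_j\) is the output of neuron j, which feeds neuron k
- \(\eta\) is the learning parameter (learning rate)
- \(\delta_k = y_k (1 - y_k) \cdot Err\) for an output neuron k; for a hidden neuron j, \(\delta_j = y_j (1 - y_j) \sum_k w_{jk} \delta_k\)
- \(Err\) = Expected output - Actual output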
The weight corrections are calculated from the error function.
The new weights are chosen in such a way that the final error of the network is minimized.
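
The update rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the post's original code: the network shape (two inputs, two hidden neurons, one sigmoid output), the starting weights, and the function name `delta_rule_step` are all assumptions made for the example. It also assumes the standard backpropagation form, in which the hidden-layer delta is driven by the output neuron's delta.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def delta_rule_step(w_in_hid, w_hid_out, x, target, eta=0.1):
    """One weight update for a network with one hidden layer and one sigmoid output.

    w_in_hid[i][j]: weight from input i to hidden neuron j (hypothetical layout)
    w_hid_out[j]:   weight from hidden neuron j to the output neuron
    """
    n_in, n_hid = len(x), len(w_hid_out)

    # Forward pass: compute hidden outputs y_j and the network output y_k.
    y_hid = [sigmoid(sum(x[i] * w_in_hid[i][j] for i in range(n_in)))
             for j in range(n_hid)]
    y_out = sigmoid(sum(y_hid[j] * w_hid_out[j] for j in range(n_hid)))

    # Output layer: delta_k = y_k (1 - y_k) * Err, with Err = expected - actual.
    err = target - y_out
    delta_out = y_out * (1.0 - y_out) * err

    # Hidden layer: delta_j = y_j (1 - y_j) * w_jk * delta_k.
    # There is a single output neuron here, so the sum over k has one term.
    delta_hid = [y_hid[j] * (1.0 - y_hid[j]) * w_hid_out[j] * delta_out
                 for j in range(n_hid)]

    # W = W + eta * (signal entering the weight) * (delta of the neuron it feeds).
    new_w_hid_out = [w_hid_out[j] + eta * y_hid[j] * delta_out
                     for j in range(n_hid)]
    new_w_in_hid = [[w_in_hid[i][j] + eta * x[i] * delta_hid[j]
                     for j in range(n_hid)]
                    for i in range(n_in)]
    return new_w_in_hid, new_w_hid_out, err

# Made-up starting weights and a single training example with expected output 0.
w1 = [[0.5, -0.4], [0.9, 1.0]]
w2 = [-1.2, 1.1]
w1, w2, err = delta_rule_step(w1, w2, x=[1.0, 1.0], target=0.0)
print("error before this update:", err)
```

Calling `delta_rule_step` repeatedly on the same example should shrink the absolute error step by step, which is the behaviour the delta rule is meant to produce.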

Math - How does the delta rule work?

Let's consider a simple example to understand weight updating with the delta rule.

Suppose we are building a simple logistic regression line and would like to find the weight using the weight update rule.
\(Y = \frac{1}{1+e^{-wx}}\) is the equation.
We are searching for the optimal w for our data.

Let w be 1
\(Y = \frac{1}{1+e^{-x}}\) is the initial equation
The error in our initial step is 3.59
To reduce the error we will add a delta to w and make it 1.5

Now w is 1.5 (blue line)
\(Y = \frac{1}{1+e^{-1.5x}}\) is the updated equation
With the updated weight, the error is 1.57
We can further reduce the error by increasing w by delta

If we repeat the same process of adding delta and updating weights, we can finally end up with minimum error.
The weight at that final step is the optimal weight.
In this example the weight is 8, and the error is 0.
\(Y = \frac{1}{1+e^{-8x}}\) is the final equation.
In this example, we changed the weights manually to reduce the error. This is just for intuition; manual updating is not feasible for complex optimization problems.
Gradient descent is a systematic optimization method: we update the weights by calculating the gradient of the error function.
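
The manual search described above can be mimicked with a short script. This is a sketch under stated assumptions: the post's data set and exact error measure are not given, so the toy data below and the choice of sum of squared errors are hypothetical, and the printed values will not match 3.59 or 1.57. Only the trend (the error shrinking as w moves from 1 to 1.5 to 8) is meant to carry over.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data (not the post's data): y is 0 for negative x and 1 for positive x.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

def total_error(w):
    """Sum of squared differences between actual y and Y = 1 / (1 + e^(-w x))."""
    return sum((y - sigmoid(w * x)) ** 2 for x, y in zip(xs, ys))

# Try the three weights used in the walkthrough: 1, 1.5 and 8.
errors = {w: total_error(w) for w in (1.0, 1.5, 8.0)}
for w, e in errors.items():
    print(f"w = {w}: error = {e:.6f}")
```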

How does gradient descent work?

Gradient descent is one of the best-known methods for finding a local minimum of a function.
By changing the weights, we move towards the minimum value of the error function. The weights are changed by taking steps in the direction of the negative gradient (derivative) of the error function.
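
As a rough illustration of stepping against the gradient, the sketch below fits the single weight w of \(Y = \frac{1}{1+e^{-wx}}\) by gradient descent. The toy data, the squared-error function, the learning rate `eta`, and the number of steps are all assumptions chosen for the example, not values from the post.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: y is 0 for negative x and 1 for positive x.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

def error(w):
    """Sum of squared errors for the weight w."""
    return sum((y - sigmoid(w * x)) ** 2 for x, y in zip(xs, ys))

def gradient(w):
    """Derivative of error(w) with respect to w."""
    g = 0.0
    for x, y in zip(xs, ys):
        s = sigmoid(w * x)
        # d/dw (y - s)^2 = -2 (y - s) * s (1 - s) * x
        g += -2.0 * (y - s) * s * (1.0 - s) * x
    return g

def gradient_descent(w=1.0, eta=0.5, steps=200):
    """Repeatedly step in the negative gradient direction, recording the error."""
    history = [error(w)]
    for _ in range(steps):
        w = w - eta * gradient(w)
        history.append(error(w))
    return w, history

w_final, history = gradient_descent()
print(f"final w = {w_final:.3f}")
print(f"error: {history[0]:.4f} -> {history[-1]:.6f}")
```

Because this toy data is perfectly separable, the error keeps shrinking as w grows and never reaches an exact minimum; on real data the steps would settle near a finite weight.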

Does this method really work?

We changed the weights, but did that reduce the overall error?
Let's calculate the error with the new weights and see the change.

Gradient Descent Method Validation

With our initial set of weights, the overall error was 0.7137: Y Actual is 0 and Y Predicted is 0.7137, so the error is 0.7137.
The new weights give us a predicted value of 0.70655.
In one iteration, we reduced the error from 0.7137 to 0.70655.
The error is reduced by about 1%. By repeating the same process over multiple epochs and training examples, we can reduce the error further.
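
The size of that reduction can be checked with a quick calculation. The two error values come from the text above; the variable names are just for illustration.

```python
error_before = 0.7137   # error with the initial weights
error_after = 0.70655   # error with the updated weights

absolute_drop = error_before - error_after
relative_drop = absolute_drop / error_before

print(f"absolute reduction: {absolute_drop:.5f}")
print(f"relative reduction: {relative_drop:.2%}")
```

The relative reduction comes out at roughly 1% of the starting error, matching the figure quoted above.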

21st June 2017