Mastering Weight Updates and Backpropagation in Multilayer Perceptrons

In this article we present the precise equations used for weight‑update calculations and explain how backpropagation enables a multilayer perceptron (MLP) to learn from data.

Welcome to AAC’s comprehensive machine‑learning series.

Catch up on the series so far here:

How to Perform Classification Using a Neural Network: What Is the Perceptron?
How to Use a Simple Perceptron Neural Network Example to Classify Data
How to Train a Basic Perceptron Neural Network
Understanding Simple Neural Network Training
An Introduction to Training Theory for Neural Networks
Understanding Learning Rate in Neural Networks
Advanced Machine Learning with the Multilayer Perceptron
The Sigmoid Activation Function: Activation in Multilayer Perceptron Neural Networks
How to Train a Multilayer Perceptron Neural Network
Understanding Training Formulas and Backpropagation for Multilayer Perceptrons
Neural Network Architecture for a Python Implementation
How to Create a Multilayer Perceptron Neural Network in Python
Signal Processing Using Neural Networks: Validation in Neural Network Design
Training Datasets for Neural Networks: How to Train and Validate a Python Neural Network

We’ve reached the core of neural‑network theory: the computational steps that fine‑tune an MLP’s weights so it can classify input samples accurately. This process is the foundation of the backpropagation algorithm, a cornerstone of modern deep learning.

Updating Weights

The mathematics of MLP training is dense, and terminology and notation vary from source to source. The equations below follow the derivations published by Dr. Dustin Stansbury, which are a clear reference for beginners and practitioners alike.

The diagram shows the architecture we’ll implement in code, and the equations that follow directly map to this structure.

Terminology

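Pre‑activation (S_preA): The weighted sum fed into a neuron’s activation function, calculated as the dot product of the incoming weights and the values arriving from the preceding layer (the raw inputs for a hidden node, the hidden post‑activation values for an output node).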
Post‑activation (S_postA): The neuron’s output after applying the activation function f_A(·). We use the logistic (sigmoid) function in our examples.
Weight matrices: In code we label them ItoH (input‑to‑hidden) and HtoO (hidden‑to‑output) to avoid ambiguity.
Target (T): The correct output for a training sample.
Learning rate (LR): The step size for weight updates.
Final error (FE): S_postA,O – T.
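Error signal (S_ERROR): FE multiplied by the derivative of the output neuron’s activation function, evaluated at the output pre‑activation value.
Gradient: The partial derivative of the error with respect to a specific weight, i.e., how strongly that weight contributes to the error; we subtract LR × gradient from the current weight.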

The accompanying diagram illustrates these concepts in context.
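
To make these terms concrete, here is a minimal sketch of a forward pass that computes each quantity for one training sample. It assumes one hidden layer and a single output node; the layer sizes, weight values, input values, and the helper names logistic, logistic_derivative, and forward are illustrative assumptions, not values taken from the article's network.

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation function f_A."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_derivative(x):
    """Derivative of the logistic function: f_A'(x) = f_A(x) * (1 - f_A(x))."""
    fx = logistic(x)
    return fx * (1.0 - fx)

def forward(inputs, ItoH, HtoO):
    """Forward pass for one hidden layer and one output node.

    ItoH[i][h] is the weight from input i to hidden node h.
    HtoO[h] is the weight from hidden node h to the output node.
    """
    n_hidden = len(HtoO)
    S_preA_H = [sum(inputs[i] * ItoH[i][h] for i in range(len(inputs)))
                for h in range(n_hidden)]
    S_postA_H = [logistic(s) for s in S_preA_H]
    S_preA_O = sum(S_postA_H[h] * HtoO[h] for h in range(n_hidden))
    S_postA_O = logistic(S_preA_O)
    return S_preA_H, S_postA_H, S_preA_O, S_postA_O

# Hypothetical sample, weights, and target (illustration only).
inputs = [1.0, 0.5]
ItoH = [[0.1, -0.2],
        [0.4, 0.3]]
HtoO = [0.5, -0.6]
T = 1.0

S_preA_H, S_postA_H, S_preA_O, S_postA_O = forward(inputs, ItoH, HtoO)
FE = S_postA_O - T                             # final error
S_ERROR = FE * logistic_derivative(S_preA_O)   # error signal
print("FE =", FE, "S_ERROR =", S_ERROR)
```

Because the logistic output always lies between 0 and 1, a target of 1 gives a negative final error here, which in turn gives a negative error signal.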

Weight‑update equations result from taking the partial derivative of the summed‑squared error with respect to each weight. For hidden‑to‑output weights:

S_ERROR = FE × f_A'(S_preA,O)

gradient_HtoO = S_ERROR × S_postA,H

weight_HtoO ← weight_HtoO – (LR × gradient_HtoO)
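
The three hidden-to-output equations above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the series' implementation: it assumes a single output node, and the function name update_HtoO and all numeric values are hypothetical.

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation function f_A."""
    return 1.0 / (1.0 + math.exp(-x))

def update_HtoO(HtoO, S_postA_H, T, LR):
    """Apply one gradient-descent step to the hidden-to-output weights.

    Assumes a single output node, so HtoO is a flat list with one
    weight per hidden node.
    """
    S_preA_O = sum(w * a for w, a in zip(HtoO, S_postA_H))
    S_postA_O = logistic(S_preA_O)
    FE = S_postA_O - T
    # For the logistic function, f_A'(S_preA_O) = S_postA_O * (1 - S_postA_O).
    S_ERROR = FE * S_postA_O * (1.0 - S_postA_O)
    gradients = [S_ERROR * a for a in S_postA_H]
    new_weights = [w - LR * g for w, g in zip(HtoO, gradients)]
    return new_weights, gradients

# Hypothetical hidden-layer outputs, weights, target, and learning rate.
HtoO = [0.5, -0.6]
S_postA_H = [0.6, 0.4]
new_HtoO, gradient_HtoO = update_HtoO(HtoO, S_postA_H, T=1.0, LR=0.1)
print(new_HtoO)
```

With a target of 1 the output is too low, so every gradient is negative and subtracting LR × gradient increases each weight, nudging the output toward the target.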

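For input‑to‑hidden weights, the error signal must be carried back through one more layer: it is scaled by the hidden‑to‑output weight leaving the hidden node, by the derivative of that hidden node’s activation function, and finally by the input value attached to the weight being updated: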

gradient_ItoH = S_ERROR × weight_HtoO × f_A'(S_preA,H) × input

weight_ItoH ← weight_ItoH – (LR × gradient_ItoH)
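
A minimal sketch of the input-to-hidden update follows. It assumes a single output node, which is what the gradient equation as written implies (with several output nodes, the contributions would have to be summed over them). The function name update_ItoH and all numeric values are hypothetical.

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation function f_A."""
    return 1.0 / (1.0 + math.exp(-x))

def update_ItoH(inputs, ItoH, HtoO, T, LR):
    """Apply one gradient-descent step to the input-to-hidden weights.

    ItoH[i][h] connects input i to hidden node h; HtoO[h] connects
    hidden node h to the single output node (an assumption of this sketch).
    """
    n_in, n_hid = len(inputs), len(HtoO)
    # Forward pass, needed to obtain the activations used in the gradient.
    S_preA_H = [sum(inputs[i] * ItoH[i][h] for i in range(n_in))
                for h in range(n_hid)]
    S_postA_H = [logistic(s) for s in S_preA_H]
    S_postA_O = logistic(sum(S_postA_H[h] * HtoO[h] for h in range(n_hid)))
    S_ERROR = (S_postA_O - T) * S_postA_O * (1.0 - S_postA_O)
    # gradient_ItoH = S_ERROR * weight_HtoO * f_A'(S_preA_H) * input
    gradients = [[S_ERROR * HtoO[h]
                  * S_postA_H[h] * (1.0 - S_postA_H[h])
                  * inputs[i]
                  for h in range(n_hid)]
                 for i in range(n_in)]
    new_weights = [[ItoH[i][h] - LR * gradients[i][h] for h in range(n_hid)]
                   for i in range(n_in)]
    return new_weights, gradients

# Hypothetical sample with one input set to zero.
inputs = [1.0, 0.0]
ItoH = [[0.1, -0.2],
        [0.4, 0.3]]
HtoO = [0.5, -0.6]
new_ItoH, gradient_ItoH = update_ItoH(inputs, ItoH, HtoO, T=1.0, LR=0.1)
print(gradient_ItoH)
```

Two details are worth noting. A weight attached to an input of zero receives a zero gradient, so it is left unchanged for that sample. And the gradient uses the hidden-to-output weights as they were during the forward pass, so in a full training step both sets of gradients are normally computed before either weight matrix is modified.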

Backpropagation

Backpropagation resolves the hidden‑node dilemma: although input‑to‑hidden weights influence the final output indirectly, we can compute their effect by propagating the error signal backward through the network and scaling it with the outgoing weights and activation derivatives. This technique is fundamental to training deep models.
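
One way to convince yourself that the backpropagated gradients are correct is to compare them against a numerical estimate. The sketch below does this for the input-to-hidden weights using central finite differences. It assumes a single output node and an error function of E = ½ × FE², where the factor of ½ is an assumption chosen so that the derivative matches the equations above; all function names and values are hypothetical.

```python
import copy
import math

def logistic(x):
    """Logistic (sigmoid) activation function f_A."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, ItoH, HtoO):
    """Return hidden post-activations and the single output's post-activation."""
    n_in, n_hid = len(inputs), len(HtoO)
    S_postA_H = [logistic(sum(inputs[i] * ItoH[i][h] for i in range(n_in)))
                 for h in range(n_hid)]
    S_postA_O = logistic(sum(S_postA_H[h] * HtoO[h] for h in range(n_hid)))
    return S_postA_H, S_postA_O

def error(inputs, ItoH, HtoO, T):
    """Squared error with an assumed factor of 1/2: E = 0.5 * FE^2."""
    _, S_postA_O = forward(inputs, ItoH, HtoO)
    return 0.5 * (S_postA_O - T) ** 2

def backprop_gradients(inputs, ItoH, HtoO, T):
    """Analytic gradients obtained by propagating the error signal backward."""
    S_postA_H, S_postA_O = forward(inputs, ItoH, HtoO)
    S_ERROR = (S_postA_O - T) * S_postA_O * (1.0 - S_postA_O)
    grad_HtoO = [S_ERROR * a for a in S_postA_H]
    grad_ItoH = [[S_ERROR * HtoO[h] * S_postA_H[h] * (1.0 - S_postA_H[h]) * x
                  for h in range(len(HtoO))]
                 for x in inputs]
    return grad_ItoH, grad_HtoO

def numeric_grad_ItoH(inputs, ItoH, HtoO, T, i, h, eps=1e-6):
    """Central-difference estimate of dE/dItoH[i][h]."""
    plus = copy.deepcopy(ItoH)
    minus = copy.deepcopy(ItoH)
    plus[i][h] += eps
    minus[i][h] -= eps
    return (error(inputs, plus, HtoO, T) - error(inputs, minus, HtoO, T)) / (2 * eps)

# Hypothetical sample, weights, and target.
inputs = [1.0, 0.5]
ItoH = [[0.1, -0.2],
        [0.4, 0.3]]
HtoO = [0.5, -0.6]
T = 1.0

grad_ItoH, grad_HtoO = backprop_gradients(inputs, ItoH, HtoO, T)
max_diff = max(abs(grad_ItoH[i][h] - numeric_grad_ItoH(inputs, ItoH, HtoO, T, i, h))
               for i in range(len(inputs)) for h in range(len(HtoO)))
print("largest analytic vs. numeric difference:", max_diff)
```

The numerical estimate needs two extra forward passes per weight, whereas backpropagation obtains every gradient from a single forward pass and a single backward pass. That efficiency is why the numerical approach is useful as a check but not as a training method.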

Conclusion

We’ve unpacked the key equations that drive weight updates and the mechanics of backpropagation in MLPs. These concepts underpin every modern neural‑network application, and mastering them opens the door to advanced modeling. Stay tuned for the next installments in our series, where we’ll dive deeper into architecture design, Python implementation, and validation techniques.
