Neural network experiments: universal approximator

Jerry Jee
Apr 20, 2021


Overview

A neural network is often said to be a universal function approximator. What does this actually mean? Let us run a few experiments to build an intuitive understanding.

To keep things intuitive and easy to visualize, we will use neural networks to fit univariate functions, i.e. y=f(x).

Experiments

  1. Function y=x

training samples:

As the picture shows:

  • The blue dots are the training samples, all drawn from the function y=x
  • The orange line is the function currently represented by the neural network, which is untrained and still deviates greatly from the samples

ideas

What network structure do we need to fit a straight line? To understand this thoroughly, let us start with a single neuron.

The form of a single neuron is: y = σ(wx+b)

  • w and b are the parameters to be determined
  • σ is the activation function

If you remove σ, the form becomes y = wx+b, which is exactly a straight line. In other words, a single neuron without an activation function is enough to fit this function.

experiment

As shown in the figure above, with a single neuron and 20 training steps, the neural network already fits the objective function very well. The learned parameters are shown in the figure below:

The corresponding function is y=1.0x+0.1, which is very close to the objective function, and a few more training steps would bring it even closer.
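For reference, here is a minimal sketch of this first experiment (assuming PyTorch; the article does not show its code, and the sample range and learning rate below are illustrative guesses):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Training samples drawn from the target function y = x
x = torch.linspace(-1, 1, 50).unsqueeze(1)
y = x.clone()

# A single neuron without an activation function: y = w*x + b
model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.MSELoss()

# 20 training steps, as in the experiment above
for step in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# The learned parameters should end up close to w = 1, b = 0
print(model.weight.item(), model.bias.item())
```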

2. Function y=|x|

training samples:

This is a piecewise linear function

ideas

Since the target is no longer a single straight line, we need a nonlinear activation function that can bend the line. And because the target contains no curved segments, ReLU, which is itself piecewise linear, is an appropriate choice:

experiment

Look at the shape of the ReLU function: one side is a horizontal line, the other a sloped line. If we take two ReLU units, mirror one of them, and add them together, can we obtain the target curve?

The final result is as follows:

The 2 hidden neurons are:

  • y1​ = ReLU(−x)
  • y2​ = ReLU(x)

The output neuron is y = y1 + y2, which gives exactly the target curve.

(The above results were obtained by manually setting the parameters, without any training.)
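To make the construction concrete, here is a minimal sketch of the hand-set two-neuron network (assuming PyTorch, which the article does not specify):

```python
import torch
import torch.nn as nn

# Two hidden ReLU neurons feeding one output neuron
model = nn.Sequential(
    nn.Linear(1, 2),              # hidden layer: y1, y2
    nn.ReLU(),
    nn.Linear(2, 1, bias=False),  # output neuron: y = y1 + y2
)

# Manually set the parameters instead of training them
with torch.no_grad():
    model[0].weight.copy_(torch.tensor([[-1.0], [1.0]]))  # y1 = ReLU(-x), y2 = ReLU(x)
    model[0].bias.zero_()
    model[2].weight.copy_(torch.tensor([[1.0, 1.0]]))      # y = y1 + y2

x = torch.linspace(-2, 2, 5).unsqueeze(1)
print(model(x).squeeze())  # tensor([2., 1., 0., 1., 2.]), which matches |x| exactly
```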

3. Function

The number of hidden neurons required rises to 4.

4. Function y=1.8sin(3x)/x

The network is more complex, and the fitted curve is no longer perfect.
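A rough sketch of such an experiment (assuming PyTorch; the layer sizes, sample range, and training length below are illustrative guesses rather than the article's exact setup):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Samples from y = 1.8*sin(3x)/x (this grid happens to avoid x = 0)
x = torch.linspace(-5, 5, 200).unsqueeze(1)
y = 1.8 * torch.sin(3 * x) / x

# A deeper and wider ReLU network than in the earlier experiments
model = nn.Sequential(
    nn.Linear(1, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# The loss ends up small but not zero: a good fit, though no longer perfect
print(loss.item())
```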

Summary

As the objective function becomes more complex:

  • The corresponding neural network becomes more complex
  • More training data is required
  • Training is getting more and more difficult
  • The result is less intuitive and more difficult to explain

Conversely, more complex neural networks and more data can be used to fit more complex functions.

In theory, any continuous function can be approximated to arbitrary accuracy. Of course, pushing the accuracy ever higher may require an ever larger network and an ever larger amount of data.

Reference software
