A Feed-Forward Neural Network From Scratch

Machine Learning

Project Name: A Neural Network From Scratch
Prepared For: National Research University Higher School of Economics, Introduction to Deep Learning (Coursera)

As an assignment for the National Research University Higher School of Economics Introduction to Deep Learning course on Coursera, I wrote a simple multi-layer perceptron classifier with ReLU activations. Weights are trained with stochastic gradient descent, using backpropagation through a softmax cross-entropy loss. This simple architecture was used to build and train a 3-layer network to recognize handwritten digits from the MNIST dataset. Out-of-sample accuracy was around 86%.
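To make the training step concrete, here is a minimal NumPy sketch of one SGD update for a single-hidden-layer version of such a network; the function and variable names are illustrative and not taken from the project code.

```python
import numpy as np

def relu(x):
    # Element-wise ReLU activation.
    return np.maximum(0.0, x)

def softmax_cross_entropy(logits, y):
    # Numerically stable softmax followed by cross-entropy.
    # logits: (batch, classes); y: integer class labels of shape (batch,).
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(len(y)), y]).mean()
    dlogits = probs.copy()
    dlogits[np.arange(len(y)), y] -= 1.0        # gradient w.r.t. the logits
    return loss, dlogits / len(y)

def sgd_step(x, y, W1, b1, W2, b2, lr=0.1):
    # Forward pass: dense layer -> ReLU -> dense layer -> softmax loss.
    h_pre = x @ W1 + b1
    h = relu(h_pre)
    logits = h @ W2 + b2
    loss, dlogits = softmax_cross_entropy(logits, y)

    # Backpropagation through both layers.
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dh_pre = (dlogits @ W2.T) * (h_pre > 0)     # ReLU gradient mask
    dW1 = x.T @ dh_pre
    db1 = dh_pre.sum(axis=0)

    # Plain stochastic gradient descent update, in place.
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= lr * g
    return loss
```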

Next, the architecture was upgraded by implementing the Adam optimization algorithm of Kingma and Ba (2014). To support deeper network architectures, Xavier-style weight initialization was also implemented, following Glorot and Bengio (2010) and the ReLU-specific variant of He et al. (2015). After these improvements, out-of-sample accuracy on the same dataset rose to 97%.
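The sketch below illustrates both additions under the same assumptions as above (NumPy arrays, names invented for illustration): fan-in/fan-out-scaled initializers, and an Adam update built from bias-corrected moving averages of the gradient and its square.

```python
import numpy as np

def xavier_init(fan_in, fan_out, rng=None):
    # Glorot/Xavier uniform initialization: variance scaled by fan-in and fan-out.
    rng = rng if rng is not None else np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_init(fan_in, fan_out, rng=None):
    # He et al. (2015) initialization, scaled for ReLU layers.
    rng = rng if rng is not None else np.random.default_rng()
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

class Adam:
    # Adam (Kingma and Ba, 2014): per-parameter adaptive steps built from
    # exponential moving averages of the gradient (m) and its square (v),
    # with bias correction for the early iterations.
    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m, self.v, self.t = {}, {}, 0

    def step(self, params, grads):
        # params and grads: dicts of matching NumPy arrays; updates in place.
        self.t += 1
        for k in params:
            self.m[k] = self.beta1 * self.m.get(k, 0.0) + (1 - self.beta1) * grads[k]
            self.v[k] = self.beta2 * self.v.get(k, 0.0) + (1 - self.beta2) * grads[k] ** 2
            m_hat = self.m[k] / (1 - self.beta1 ** self.t)
            v_hat = self.v[k] / (1 - self.beta2 ** self.t)
            params[k] -= self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```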

The whole network was placed in a class wrapper, which makes future upgrades straightforward. In particular, additional activation functions, such as logistic or tanh, could be included with little effort, and support for additional loss functions can be added in the same way.
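One way such a wrapper might expose pluggable activations is a small registry mapping names to (function, derivative) pairs; this is only a hypothetical sketch of the idea, not the project's actual interface.

```python
import numpy as np

def _logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

# Each entry pairs an activation with its derivative, so adding a new
# activation only requires adding a new (f, df) pair here.
ACTIVATIONS = {
    "relu":     (lambda x: np.maximum(0.0, x), lambda x: (x > 0).astype(x.dtype)),
    "tanh":     (np.tanh,                      lambda x: 1.0 - np.tanh(x) ** 2),
    "logistic": (_logistic,                    lambda x: _logistic(x) * (1.0 - _logistic(x))),
}

class MLP:
    # Illustrative wrapper: layer sizes and an activation name are all a
    # caller specifies; swapping activations is a one-string change.
    def __init__(self, layer_sizes, activation="relu", seed=0):
        rng = np.random.default_rng(seed)
        self.act, self.dact = ACTIVATIONS[activation]
        self.weights = [rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
                        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
        self.biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

    def forward(self, x):
        # Hidden layers use the chosen activation; the final layer returns
        # raw scores intended for a softmax cross-entropy loss.
        for i, (W, b) in enumerate(zip(self.weights, self.biases)):
            x = x @ W + b
            if i < len(self.weights) - 1:
                x = self.act(x)
        return x

# Example: a 3-layer classifier for flattened 28x28 MNIST images, 10 classes.
# net = MLP([784, 256, 10], activation="relu")
```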

You can check out the project here.