How to improve digit recognition prediction in Neural Networks in Matlab?

2018-06-11 04:20:59

I've built a digit recognizer (56x56 digits) using a neural network, but I'm getting 89.5% accuracy on the test set and 100% on the training set. I know it's possible to get >95% on the test set with this training set. Is there any way to improve my training so I get better predictions? Increasing the iterations from 300 to 1000 gave me only +0.12% accuracy. I'm also limited in file size, so increasing the number of nodes may not be possible, but if that's the case maybe I could cut some pixels/nodes from the input layer.

To train I'm using:

input layer: 3136 nodes

hidden layer: 220 nodes

labels: 36

regularized cost function with lambda=0.1

fmincg to calculate weights (1000 iterations)

As mentioned in the comments, the easiest and most promising way is to switch to a Convolutional Neural Network. But with your current model you can:

Add more layers with fewer neurons each, which increases learning capacity and should increase accuracy a bit. The problem is that you might start overfitting; use regularization to counter this.
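Since file size is a constraint, it is worth counting weights before changing the architecture. A rough Python sketch (the arithmetic is the same in Matlab), assuming one bias unit per layer; the 150 and 60 hidden-layer sizes are made up for illustration, not a recommendation:

```python
def count_weights(layer_sizes):
    """Number of weights in a fully connected net, one bias unit per layer (assumed)."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

current = count_weights([56 * 56, 220, 36])     # the network from the question
deeper = count_weights([56 * 56, 150, 60, 36])  # hypothetical two-hidden-layer variant

print(current, deeper)  # 698096 481806
```

Almost all of the weights sit between the input layer and the first hidden layer, so a narrower first hidden layer (or fewer input pixels) saves far more space than anything done further up.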

Use batch normalization (BN). While you are already using regularization, BN accelerates training and also has a regularizing effect, and as an NN-specific technique it might work better.
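For reference, the training-time forward pass of BN is small. A minimal Python sketch, assuming a mini-batch stored as a list of rows; the names gamma, beta and eps and the sample numbers are illustrative:

```python
import math

def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale by gamma and shift by beta."""
    n = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(n)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        for i in range(n):
            out[i][j] = gamma[j] * (column[i] - mean) / math.sqrt(var + eps) + beta[j]
    return out

# Three samples, two hidden-unit activations each (made-up numbers).
activations = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(activations, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

At test time the batch statistics are normally replaced by running averages collected during training, and gamma and beta are learned along with the other weights.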

Make an ensemble. Train several NNs on the same dataset, each with a different random initialization. This will produce slightly different classifiers, and you can combine their outputs to get a small increase in accuracy.
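One common way to combine the outputs is to average the class probabilities and take the largest. A small Python sketch of that idea, assuming each network outputs one probability per class; the numbers are made up:

```python
def ensemble_predict(prob_vectors):
    """Average the class probabilities of several models; return (winning class, averages)."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    avg = [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Made-up outputs of three networks for one image and three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.6, 0.1],
]
label, avg = ensemble_predict(outputs)
print(label)  # 1
```

Here the first network alone would pick class 0, but the averaged vote picks class 1. Keep in mind that every extra network adds its full set of weights to the stored model.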

Use cross-entropy loss. You don't mention which loss function you are using; if it's not cross-entropy, you should start using it. Most high-accuracy classifiers use cross-entropy loss.
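For a single sample, cross-entropy is just the negative log of the probability given to the correct class. A minimal Python sketch, assuming a softmax output layer (the question does not say which output activation is used) and made-up scores:

```python
import math

def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_class])

scores = [2.0, 0.5, -1.0]  # made-up output-layer scores for three classes
probs = softmax(scores)
loss_right = cross_entropy(probs, 0)  # true class is the top-scoring one
loss_wrong = cross_entropy(probs, 2)  # true class got the lowest score
```

The loss is small when the network is confident and right, and grows quickly when it is confident and wrong, which is what makes it a good training signal for classification.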

Switch to mini-batch Stochastic Gradient Descent (SGD). I do not know how much the optimization algorithm matters here, but SGD might outperform fmincg, which is a full-batch conjugate-gradient optimizer, and you could also try adaptive variants such as Adagrad or Adam. Note that backpropagation is only the way the gradients are computed, so it works with any of these optimizers.
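The SGD update loop itself is short. A toy Python sketch that fits a single weight, so the gradient can be written by hand; in a real network the grad line would come from backpropagation, and the learning rate, batch size and data here are arbitrary choices:

```python
import random

def sgd_fit(xs, ys, lr=0.5, epochs=100, batch_size=5, seed=0):
    """Fit y = w * x by mini-batch SGD on the squared error."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new sample order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of 0.5 * mean((w*x - y)^2) with respect to w.
            grad = sum((w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad
    return w

xs = [i / 10 for i in range(1, 11)]
ys = [2.0 * x for x in xs]  # noise-free toy data with true slope 2
w = sgd_fit(xs, ys)
```

Unlike a full-batch method, each update only looks at a few samples, so the weights are updated many times per pass over the data.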

Other small changes that might increase accuracy are changing the activation function (to ReLU, for example), shuffling the training samples after every epoch, and doing data augmentation.
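For digits, the simplest augmentation is to add shifted copies of each training image. A small Python sketch, using a 3x3 stand-in for a 56x56 digit; shift_image is a hypothetical helper, not a library function:

```python
def shift_image(image, dx, dy, fill=0):
    """Return a copy of a 2-D image shifted dx pixels right and dy pixels down."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            nr, nc = r + dy, c + dx
            if 0 <= nr < height and 0 <= nc < width:
                shifted[nr][nc] = image[r][c]
    return shifted

# Tiny 3x3 stand-in for a 56x56 digit.
digit = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
# One copy shifted left, the original, and one copy shifted right.
augmented = [shift_image(digit, dx, 0) for dx in (-1, 0, 1)]
```

Each shifted copy keeps the original label. Small rotations and scalings can be added in the same way, and none of this increases the size of the stored model.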
