Simple Feedforward Neural Network with TensorFlow won't learn

I am trying to build a simple neural network with TensorFlow. The goal is to find the center of a rectangle in a 32 x 32 pixel image. The rectangle is described by five vectors: the first is the position vector, and the other four are direction vectors that make up the rectangle's edges. Each vector has two values (x and y).

The corresponding input for this image would be (2,5)(0,4)(6,0)(0,-4)(-6,0). The center (and therefore the desired output) is located at (5,7).
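
For reference, a (features, label) pair like the one above can be generated along these lines. This is only a simplified, self-contained sketch, not my actual Rectangle_Records module:

    import numpy as np

    def make_sample(img_size=32):
        """ Generate one training pair: rectangle vectors -> center (hypothetical helper). """
        w = np.random.randint(1, img_size // 2)    # rectangle width
        h = np.random.randint(1, img_size // 2)    # rectangle height
        x0 = np.random.randint(0, img_size - w)    # position vector x
        y0 = np.random.randint(0, img_size - h)    # position vector y

        # Position vector followed by the four edge direction vectors
        features = np.array([x0, y0, 0, h, w, 0, 0, -h, -w, 0], dtype=np.float32)
        center = np.array([x0 + w / 2.0, y0 + h / 2.0], dtype=np.float32)
        return features, center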

The code I came up with looks like the following:

    import tensorflow as tf 
    import numpy as np
    import Rectangle_Records

    def init_weights(shape):
        """ Weight initialization """
        weights = tf.random_normal(shape, stddev=0.1)
        return tf.Variable(weights)

    def forwardprop(x, w_1, w_2):
        """ Forward-propagation """
        h = tf.nn.sigmoid(tf.matmul(x, w_1))
        y_predict = tf.matmul(h, w_2)
        return y_predict

    def main():
        x_size = 10
        y_size = 2
        h_1_size = 256

        # Prepare input data
        input_data = Rectangle_Records.DataSet()

        x = tf.placeholder(tf.float32, shape = [None, x_size])
        y_label = tf.placeholder(tf.float32, shape = [None, y_size])

        # Weight initializations
        w_1 = init_weights((x_size, h_1_size))
        w_2 = init_weights((h_1_size, y_size))

        # Forward propagation
        y_predict = forwardprop(x, w_1, w_2)

        # Backward propagation
        cost = tf.reduce_mean(tf.square(y_predict - y_label))

        updates = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

        # Run
        sess = tf.Session()
        init = tf.global_variables_initializer()
        sess.run(init)

        for i in range(200):
            batch = input_data.next_batch(10)
            sess.run(updates, feed_dict = {x: batch[0], y_label: batch[1]})

        sess.close()

    if __name__ == "__main__":
        main()

Sadly, the network won't learn properly. The result is too far off. For example, it outputs [[ 3.74561882, 3.70766664]] when it should be around [[ 3., 7.]]. What am I doing wrong?


The main problem is that your whole training is done for only one epoch, which is not enough training. Try the following changes:

    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)
    for j in range(30):
        input_data = Rectangle_Records.DataSet()
        for i in range(200):
            batch = input_data.next_batch(10)
            loss, _ = sess.run([cost, updates], feed_dict = {x: batch[0], y_label: batch[1]})

        pred = sess.run(y_predict, feed_dict={x: batch[0]})
        print('Cost:', loss)
        print('pred:', pred)
        print('actual:', batch[1])
    sess.close()

Also, change your optimizer to a momentum-based one for faster convergence, for example tf.train.AdamOptimizer(0.01).minimize(cost).
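
Either of the following would be a drop-in replacement for the GradientDescentOptimizer line (the learning rates here are just starting points to tune):

    # Adam: per-parameter adaptive learning rates with momentum-style moment estimates
    updates = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)

    # or classical momentum
    updates = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9).minimize(cost)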


There are lots of ways to improve the performance of a neural net. Try one or more of the following:

  • add more layers, or more nodes per layer
  • change your activation function (I've found relu to be quite effective)
  • use an ensemble of NNs where each NN gets a vote weighted by its R^2 score
  • bring in more training data
  • perform a grid search to optimize parameters

  • You have forgotten to add bias terms:

    def init_bias(shape):
        """ Bias initialization """
        biases = tf.random_normal(shape)
        return tf.Variable(biases)
    
    def forwardprop(x, w_1, w_2, b_1, b_2):
        """ Forward-propagation """
        h = tf.nn.sigmoid(tf.matmul(x, w_1) + b_1)
        y_predict = tf.matmul(h, w_2) + b_2
        return y_predict
    

    Inside main, change it to this:

    w_1 = init_weights((x_size, h_1_size))
    w_2 = init_weights((h_1_size, y_size))
    b_1 = init_bias((h_1_size,))
    b_2 = init_bias((y_size,))
    
    # Forward propagation
    y_predict = forwardprop(x, w_1, w_2, b_1, b_2)
    

    This will give you much better accuracy. You can then try adding more layers and different activation functions, as mentioned above, to improve it further; a sketch of that is shown below.
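
For instance, a version with two hidden layers and ReLU activations might look like this (just a sketch; h_2_size and the layer widths are arbitrary choices, not values from the question):

    def forwardprop(x, w_1, w_2, w_3, b_1, b_2, b_3):
        """ Forward-propagation with two hidden layers and ReLU activations """
        h_1 = tf.nn.relu(tf.matmul(x, w_1) + b_1)
        h_2 = tf.nn.relu(tf.matmul(h_1, w_2) + b_2)
        y_predict = tf.matmul(h_2, w_3) + b_3
        return y_predict

    # Matching shapes inside main; h_2_size is a new, arbitrary hyperparameter
    h_2_size = 128
    w_1 = init_weights((x_size, h_1_size))
    w_2 = init_weights((h_1_size, h_2_size))
    w_3 = init_weights((h_2_size, y_size))
    b_1 = init_bias((h_1_size,))
    b_2 = init_bias((h_2_size,))
    b_3 = init_bias((y_size,))

    y_predict = forwardprop(x, w_1, w_2, w_3, b_1, b_2, b_3)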
