TensorFlow: improve accuracy on training data

Submitted by 爷,独闯天下 on 2019-12-24 10:12:39

Question


I am experimenting with TensorFlow. One of my first tries consists of learning one of the features from the data. Let's say my data is composed of rows like the following:

35, 2, 3, 4, 19, 31, 7, 9, 34, 10, 33, 12, 59, 6, 14, 31, 13
...
35, 4, 7, 14, 9, 3, 17, 19, 42, 11, 3, 1, 53, 12, 17, 30, 15

I would like to predict the value of the last feature; in the example above that is 13 for the first row and 15 for the last row.
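
Since the network below ends in a 54-way softmax, I turn the last value of each row into a one-hot vector before feeding it as the label. A minimal sketch of that encoding (assuming the target values are integers in the range 0-53; `split_row` is just an illustrative helper, not part of my real pipeline):

    import numpy as np

    def split_row(row, nb_classes=54):
        """Split one data row into float features and a one-hot label."""
        features = np.asarray(row[:-1], dtype=np.float32)  # every value except the last
        label = np.zeros(nb_classes, dtype=np.float32)
        label[int(row[-1])] = 1.0                          # one-hot encode the last value
        return features, label

    # first example row from above: 16 feature values, the label is 13
    features, label = split_row([35, 2, 3, 4, 19, 31, 7, 9, 34, 10, 33, 12, 59, 6, 14, 31, 13])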

I have around 10000 rows of data. I've written the following model using TensorFlow (I'm following this tutorial):

    import tensorflow as tf

    # nb_attributes is the number of input features (16 in the example rows above)

    # weights and biases for five hidden layers and a 54-way output layer
    W0 = tf.Variable(tf.zeros([nb_attributes, 25]))  # first layer initialised to zeros
    B0 = tf.Variable(tf.zeros([25]))

    W1 = tf.Variable(tf.truncated_normal([25, 30], stddev=0.1))
    B1 = tf.Variable(tf.zeros([30]))

    W2 = tf.Variable(tf.truncated_normal([30, 70], stddev=0.1))
    B2 = tf.Variable(tf.zeros([70]))

    W3 = tf.Variable(tf.truncated_normal([70, 150], stddev=0.1))
    B3 = tf.Variable(tf.zeros([150]))

    W4 = tf.Variable(tf.truncated_normal([150, 75], stddev=0.1))
    B4 = tf.Variable(tf.zeros([75]))

    W5 = tf.Variable(tf.truncated_normal([75, 54], stddev=0.1))
    B5 = tf.Variable(tf.zeros([54]))

    # placeholders for the input features and the one-hot labels
    x = tf.placeholder("float", [None, nb_attributes])
    Y_ = tf.placeholder("float", [None, 54])

    # forward pass: five sigmoid hidden layers
    XX = tf.reshape(x, [-1, nb_attributes])
    Y1 = tf.nn.sigmoid(tf.matmul(XX, W0) + B0)
    Y2 = tf.nn.sigmoid(tf.matmul(Y1, W1) + B1)
    Y3 = tf.nn.sigmoid(tf.matmul(Y2, W2) + B2)
    Y4 = tf.nn.sigmoid(tf.matmul(Y3, W3) + B3)
    Y5 = tf.nn.sigmoid(tf.matmul(Y4, W4) + B4)

    # learned output: logits and softmax probabilities
    Ylogits = tf.matmul(Y5, W5) + B5
    Y = tf.nn.softmax(Ylogits)

    # cross-entropy loss, averaged over the batch and scaled by 100
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=Y_)
    cross_entropy = tf.reduce_mean(cross_entropy) * 100

    train_step = tf.train.ProximalGradientDescentOptimizer(0.01).minimize(cross_entropy)

The training loop is as follows:

    for i in range(100):
        # fetch a batch of training features and one-hot labels
        batch_xs, batch_ys = get_train_events()
        sess.run(train_step, feed_dict={x: batch_xs, Y_: batch_ys})

        # accuracy: fraction of rows whose predicted class matches the label
        correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

        # evaluate on the same batch that was just used for training
        test_data_evs, test_data_out = batch_xs, batch_ys
        current_accuracy = sess.run(accuracy, feed_dict={x: test_data_evs, Y_: test_data_out})

        print('Current Accuracy {}'.format(current_accuracy))

Please note that I am using the same data for training and for testing. I am aware that this is not the approach to follow, but I am doing it this way because the accuracy on the test data was so bad that I decided to check what the accuracy was on the training data. As far as I understand, the accuracy on the training data should be close to 100% after training, shouldn't it?
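
When I did evaluate on held-out data, the split looked roughly like this (a sketch only; `all_xs` and `all_ys` are illustrative names for the full feature and one-hot label arrays, not variables from the code above):

    import numpy as np

    # hold out 20% of the rows as a test set
    n = all_xs.shape[0]
    idx = np.random.permutation(n)
    split = int(0.8 * n)
    train_xs, train_ys = all_xs[idx[:split]], all_ys[idx[:split]]
    test_xs, test_ys = all_xs[idx[split:]], all_ys[idx[split:]]

    # train on train_xs / train_ys, then measure accuracy on the held-out rows
    test_accuracy = sess.run(accuracy, feed_dict={x: test_xs, Y_: test_ys})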

However, I cannot improve the accuracy beyond 60%. I have tried the following:

  • Feeding the data using different strategies
  • Using different training optimizers from here
  • Changing the net architecture
  • Using a dropout approach (a sketch of how I wired it in is below)
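
For the dropout attempt, I wired it in roughly like this (a sketch only; the keep probability of 0.75 is just an example value, not what I finally settled on):

    # dropout between the hidden layers; keep_prob is fed as e.g. 0.75 during
    # training and 1.0 during evaluation
    keep_prob = tf.placeholder("float")

    Y1 = tf.nn.sigmoid(tf.matmul(XX, W0) + B0)
    Y1d = tf.nn.dropout(Y1, keep_prob)
    Y2 = tf.nn.sigmoid(tf.matmul(Y1d, W1) + B1)
    Y2d = tf.nn.dropout(Y2, keep_prob)
    # ... same pattern for the remaining layers, so the logits become
    # Ylogits = tf.matmul(Y5d, W5) + B5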

The only step that showed some progress was providing the data in random batches of size N; in that case, I managed to move the accuracy from 60% to 64%. I am wondering whether I am applying the wrong approach or making some naive mistake. Any thoughts on the issue would be very much appreciated.

Thanks a lot in advance!

EDIT 1: For the sake of completeness, I managed to solve the problem quite well by using the k-nearest-neighbours algorithm. This code helped in my case.
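
A minimal sketch of the kind of k-nearest-neighbours setup that worked for me (using scikit-learn; `train_xs`, `train_labels` and `test_xs` are illustrative names, the labels here are plain integer class values rather than one-hot vectors, and `n_neighbors=5` is just the library default, not a tuned value):

    from sklearn.neighbors import KNeighborsClassifier

    # fit on the training rows: feature vectors plus the integer value of the last column
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(train_xs, train_labels)

    # predict the last-column value for unseen rows
    predicted = knn.predict(test_xs)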

Source: https://stackoverflow.com/questions/43922819/tensorflow-improve-accuracy-on-training-data
