python - TensorFlow for binary classification
I am trying to adapt this MNIST example to binary classification.
But when I change NLABELS from NLABELS=2 to NLABELS=1, the loss function always returns 0 (and the accuracy is 1).
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# Import data
mnist = input_data.read_data_sets('data', one_hot=True)
NLABELS = 2

sess = tf.InteractiveSession()

# Create the model
x = tf.placeholder(tf.float32, [None, 784], name='x-input')
W = tf.Variable(tf.zeros([784, NLABELS]), name='weights')
b = tf.Variable(tf.zeros([NLABELS], name='bias'))

y = tf.nn.softmax(tf.matmul(x, W) + b)

# Add summary ops to collect data
_ = tf.histogram_summary('weights', W)
_ = tf.histogram_summary('biases', b)
_ = tf.histogram_summary('y', y)

# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, NLABELS], name='y-input')

# More name scopes will clean up the graph representation
with tf.name_scope('cross_entropy'):
    cross_entropy = -tf.reduce_mean(y_ * tf.log(y))
    _ = tf.scalar_summary('cross entropy', cross_entropy)
with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(10.).minimize(cross_entropy)

with tf.name_scope('test'):
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    _ = tf.scalar_summary('accuracy', accuracy)

# Merge all the summaries and write them out to /tmp/mnist_logs
merged = tf.merge_all_summaries()
writer = tf.train.SummaryWriter('logs', sess.graph_def)
tf.initialize_all_variables().run()

# Train the model, and feed in test data and record summaries every 10 steps
for i in range(1000):
    if i % 10 == 0:  # Record summary data and the accuracy
        labels = mnist.test.labels[:, 0:NLABELS]
        feed = {x: mnist.test.images, y_: labels}

        result = sess.run([merged, accuracy, cross_entropy], feed_dict=feed)
        summary_str = result[0]
        acc = result[1]
        loss = result[2]
        writer.add_summary(summary_str, i)
        print('Accuracy at step %s: %s - loss: %f' % (i, acc, loss))
    else:
        batch_xs, batch_ys = mnist.train.next_batch(100)
        batch_ys = batch_ys[:, 0:NLABELS]
        feed = {x: batch_xs, y_: batch_ys}
        sess.run(train_step, feed_dict=feed)

I have checked the dimensions of both batch_ys (fed into y_) and y, and they are both N×1 matrices when NLABELS=1, so the problem seems to occur before that. Maybe it has something to do with the matrix multiplication?
I have actually got this same problem in a real project, so any help would be appreciated... Thanks!
The original MNIST example uses a one-hot encoding to represent the labels in the data: this means that if there are NLABELS = 10 classes (as in MNIST), the target output is [1 0 0 0 0 0 0 0 0 0] for class 0, [0 1 0 0 0 0 0 0 0 0] for class 1, etc. The tf.nn.softmax() operator converts the logits computed by tf.matmul(x, W) + b into a probability distribution across the different output classes, which is then compared to the fed-in value for y_.
If NLABELS = 1, this acts as if there were only a single class, and the tf.nn.softmax() op computes a probability of 1.0 for that class, leading to a cross-entropy of 0.0, since tf.log(1.0) is 0.0 for all of the examples.
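To make this concrete, here is a minimal pure-Python sketch (not TensorFlow code; the helper names `softmax` and `cross_entropy` are made up for illustration). It assumes the usual definition of softmax and a per-example cross-entropy of -sum(target * log(prob)), whereas the question's code averages with tf.reduce_mean. A softmax over a single logit comes out as 1.0 whatever the logit is, so the loss is zero. The argmax of a length-1 vector is always index 0, which is why the accuracy is stuck at 1.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Per-example cross-entropy: -sum(target * log(prob))."""
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# One output unit (NLABELS = 1): the logit value makes no difference.
for logit in (-5.0, 0.0, 3.7):
    probs = softmax([logit])
    predicted_class = probs.index(max(probs))  # always 0 for a length-1 vector
    print(logit, probs, predicted_class, cross_entropy([1.0], probs))

# Two output units (NLABELS = 2): the distribution depends on the logits.
print(softmax([0.3, -1.2]))
```

With one output unit there is nothing for gradient descent to improve, because the loss is already zero for every example.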
There are (at least) two approaches you could try for binary classification:

1. The simplest would be to set NLABELS = 2 for the two possible classes, and encode your training data as [1 0] for label 0 and [0 1] for label 1. This answer has a suggestion for how to do that.

2. You could keep the labels as integers 0 and 1 and use tf.nn.sparse_softmax_cross_entropy_with_logits(), as suggested in this answer.
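As a rough illustration of how the two approaches relate, the following pure-Python sketch compares a cross-entropy computed from one-hot targets with one computed from integer labels. None of this is TensorFlow's implementation: `log_softmax`, `one_hot`, `dense_cross_entropy` and `sparse_cross_entropy` are hypothetical helpers, and the logits are arbitrary example values. The sparse form is assumed to compute -log(softmax(logits)[label]), which is the standard definition.

```python
import math


def log_softmax(logits):
    """Log of the softmax, computed with the log-sum-exp trick."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_total for v in logits]


def one_hot(label, num_classes=2):
    """Encode integer label 0 as [1, 0] and label 1 as [0, 1]."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def dense_cross_entropy(target, logits):
    """Approach 1: cross-entropy against a one-hot target vector."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


def sparse_cross_entropy(label, logits):
    """Approach 2: cross-entropy indexed directly by an integer label."""
    return -log_softmax(logits)[label]


logits = [0.3, -1.2]  # arbitrary example logits for the two classes
for label in (0, 1):
    dense = dense_cross_entropy(one_hot(label), logits)
    sparse = sparse_cross_entropy(label, logits)
    print(label, one_hot(label), dense, sparse)
```

Both forms give the same loss for a given example, so the choice mostly comes down to whether the labels are stored as one-hot rows or as plain integers.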