Using SparseTensor as a trainable variable?

借酒劲吻你 2020-12-09 13:39

I'm trying to use SparseTensor to represent weight variables in a fully-connected layer.
However, it seems that TensorFlow 0.8 doesn't allow using a SparseTensor as a tf.Variable.

4 Answers
  •  [愿得一人]
    2020-12-09 14:16

    TensorFlow doesn't support training on sparse tensors yet. You can initialize a sparse tensor however you wish, then convert it into a dense tensor and create a variable from it, like this:

    import tensorflow as tf

    # Initialize the sparse tensor with indices, values and a shape, e.g.:
    indices = [[0, 0], [1, 2]]   # positions of the non-zero weights
    values = [1.0, 2.0]          # weight values at those positions
    shape = [3, 4]               # dense shape of the weight matrix

    b = tf.SparseTensor(indices, values, shape)
    b_dense = tf.sparse_tensor_to_dense(b)
    b_variable = tf.Variable(b_dense)


    You now have a variable initialized from a sparse tensor. Next you need to take care of the gradient updates: in other words, make sure the zero entries of the variable stay 0, since backpropagation will compute non-vanishing gradients for them if you use this naively.
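
    One way to build such a mask is to reuse the sparse indices with all values set to 1.0 (a sketch; mask_tensor is a name introduced here and used in the snippet below):

    # Hypothetical 0/1 mask with the same sparsity pattern as the weights:
    # 1.0 wherever a real weight exists, 0.0 everywhere else
    mask = tf.SparseTensor(indices, tf.ones_like(values), shape)
    mask_tensor = tf.sparse_tensor_to_dense(mask)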

    In order to do this, TensorFlow optimizers have a method called tf.train.Optimizer.compute_gradients(loss, [list_of_variables]). This calculates all the gradients in the graph necessary to minimize the loss function, but doesn't apply them yet. The method returns a list of tuples of the form (gradient, variable). You can modify these gradients freely, but in your case it makes sense to mask the unneeded gradients to 0, i.e. by multiplying them with the mask_tensor sketched above (default values 0.0, and values 1.0 wherever a weight is present in your network). After having modified them, you call the optimizer method tf.train.Optimizer.apply_gradients(grads_and_vars) to actually apply the gradients. An example would look like this:

    # Create an optimizer instance
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)

    # Get the gradients for your weights
    grads_and_vars = optimizer.compute_gradients(loss, [b_variable])

    # Modify the gradients at will; in your case, zero out the gradients
    # at positions where no weight exists by multiplying with the mask
    modified_grads_and_vars = [(tf.multiply(gv[0], mask_tensor), gv[1])
                               for gv in grads_and_vars]

    # Apply the modified gradients to your model
    train_op = optimizer.apply_gradients(modified_grads_and_vars)


    This makes sure the zero entries in your weight matrix stay 0, so no unwanted connections are created. Remember that you still need to take care of the gradients for all your other variables as well.
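
    Putting it together, a minimal training step might look like this (a sketch; it assumes loss has been defined somewhere in your graph, and train_op is the op returned by apply_gradients above):

    with tf.Session() as sess:
        # tf.initialize_all_variables() on very old versions like 0.8
        sess.run(tf.global_variables_initializer())
        for step in range(1000):
            # Each run applies the masked gradients, so the zero
            # entries of b_variable remain zero
            sess.run(train_op)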
