Using SparseTensor as a trainable variable?

借酒劲吻你 2020-12-09 13:39

I'm trying to use SparseTensor to represent weight variables in a fully-connected layer. However, it seems that TensorFlow 0.8 doesn't allow a SparseTensor to be used as a tf.Variable.

4 Answers
  • 2020-12-09 14:11

    As a workaround, you can provide a tf.Variable for the values of a sparse tensor (directly until TensorFlow v0.8; from v0.9 it must be wrapped in tf.identity, as shown below). The sparsity structure has to be pre-defined in that case; the weights, however, remain trainable.

    weights = tf.Variable(<initial-value>)
    sparse_var = tf.SparseTensor(<indices>, weights, <shape>)  # v0.8
    sparse_var = tf.SparseTensor(<indices>, tf.identity(weights), <shape>)  # v0.9
    
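    For illustration, here is a minimal sketch of how such a sparse weight variable could drive a fully-connected layer. The concrete indices, values, and the use of tf.sparse_tensor_dense_matmul are assumptions for this sketch, not part of the answer above.

    import tensorflow as tf

    # Hypothetical 4x3 weight matrix with three non-zero entries; the
    # sparsity pattern (indices) is fixed, only the values are trainable.
    indices = [[0, 0], [1, 2], [3, 1]]
    weights = tf.Variable([0.1, -0.2, 0.3])
    sparse_w = tf.SparseTensor(indices, tf.identity(weights), [4, 3])  # v0.9 form

    x = tf.placeholder(tf.float32, shape=[3, None])  # one column per sample
    y = tf.sparse_tensor_dense_matmul(sparse_w, x)   # sparse weights * dense input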
  • 2020-12-09 14:16

    TensorFlow doesn't support training on sparse tensors yet. You can initialize a sparse tensor however you wish, then convert it into a dense tensor and create a variable from it like this:

    # Initialize the sparse tensor with the correct indices, values and shape
    b = tf.SparseTensor(indices, values, shape)
    b_dense = tf.sparse_tensor_to_dense(b)
    b_variable = tf.Variable(b_dense)
    

    This initializes a variable from a sparse tensor. You still need to take care of the gradient update, i.e. make sure the entries of the variable that should be 0 stay 0: used naively, backpropagation computes a non-vanishing gradient for those entries as well.

    In order to do this, TensorFlow optimizers provide the method tf.train.Optimizer.compute_gradients(loss, [list_of_variables]). This calculates all the gradients in the graph necessary to minimize the loss function, but doesn't apply them yet. The method returns a list of (gradient, variable) tuples. You can modify these gradients freely; in your case it makes sense to mask the unneeded gradients to 0, e.g. with another sparse tensor that has default value 0.0 and value 1.0 wherever your network has a weight (a sketch of such a mask follows the code below). Afterwards you call tf.train.Optimizer.apply_gradients(grads_and_vars) to actually apply the gradients. Example code:

    # Create optimizer instance
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
    
    # Get the gradients for your weights
    grads_and_vars = optimizer.compute_gradients(loss, [b_variable])
    
    # Modify the gradients at will; here, zero out the masked entries
    modified_grads_and_vars = [(tf.multiply(gv[0], mask_tensor), gv[1])
                               for gv in grads_and_vars]

    # Apply the modified gradients to your model
    train_op = optimizer.apply_gradients(modified_grads_and_vars)
    
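    The mask_tensor used above is not defined in the answer. A minimal sketch of how it could be built, assuming the same indices, values and shape that were used to create b: a dense tensor that is 1.0 at the non-zero positions and 0.0 everywhere else.

    # Hypothetical mask: 1.0 where the sparse weights live, 0.0 elsewhere,
    # so the element-wise multiplication above zeroes the gradients of the
    # entries that must stay 0.
    mask_tensor = tf.sparse_tensor_to_dense(
        tf.SparseTensor(indices, tf.ones_like(values), shape))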

    This makes sure the masked entries of your weight matrix stay 0, so no unwanted connections are created. Since compute_gradients was called with an explicit variable list here, you still need to handle the gradients of all other variables separately.

  • 2020-12-09 14:19

    The code above works with a minor correction, like this (here mask_tensor is a dictionary mapping each trainable variable to its mask):

    def optimize(loss, mask_tensor):
        optimizer = tf.train.AdamOptimizer(0.001)
        grads_and_vars = optimizer.compute_gradients(loss)
        modified_grads_and_vars = [
            (tf.multiply(gv[0], mask_tensor[gv[1]]), gv[1]) for gv in grads_and_vars
        ]
        return optimizer.apply_gradients(modified_grads_and_vars)
    
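    For completeness, a hedged sketch of how such a mask dictionary might be assembled; the names b_variable, sparse_mask and other_variable are assumptions for illustration (sparse_mask would be built like mask_tensor in the previous answer).

    # Hypothetical usage: one mask per trainable variable. The sparse layer
    # gets its 0/1 mask; dense variables get an all-ones mask so their
    # gradients pass through unchanged.
    mask_tensor = {
        b_variable: sparse_mask,
        other_variable: tf.ones_like(other_variable),
    }
    train_op = optimize(loss, mask_tensor)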
  • 2020-12-09 14:26

    TensorFlow doesn't currently support sparse tensor variables. However, it does support sparse lookups (tf.embedding_lookup) and sparse gradient updates (tf.sparse_add) of dense variables. I suspect these two will suffice for your use case.
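
    A minimal sketch of that pattern, assuming TF 1.x names (tf.nn.embedding_lookup): looking up rows of a dense variable yields IndexedSlices gradients, so the optimizer only updates the rows that were actually touched.

    import tensorflow as tf

    # Dense weight matrix; only the looked-up rows receive updates.
    weights = tf.Variable(tf.random_normal([1000, 64]))
    row_ids = tf.placeholder(tf.int32, shape=[None])

    # Sparse lookup: gathers a few rows of the dense variable.
    rows = tf.nn.embedding_lookup(weights, row_ids)

    loss = tf.reduce_sum(tf.square(rows))
    # The gradient w.r.t. weights is an IndexedSlices, so this update
    # touches only the looked-up rows (a sparse gradient update).
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)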
