Using SparseTensor as a trainable variable?


I'm trying to use SparseTensor to represent weight variables in a fully-connected layer.
However, it seems that TensorFlow 0.8 doesn't allow to use Sp


4 Answers

情书的邮戳

2020-12-09 14:11
As a workaround, you can pass a tf.Variable as the values of a sparse tensor (directly up to TensorFlow v0.8; from v0.9 on, wrap it in tf.identity). The sparsity structure has to be predefined in that case; the weights, however, remain trainable.
```
weights = tf.Variable(<initial-value>)
sparse_var = tf.SparseTensor(<indices>, weights, <shape>)  # v0.8
sparse_var = tf.SparseTensor(<indices>, tf.identity(weights), <shape>)  # v0.9
```
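To make the idea concrete without depending on any TensorFlow version, here is a minimal plain-Python sketch (not TensorFlow code). It assumes a weight matrix stored in coordinate form, with a fixed list of indices and a list of values that are the only trainable parameters. The helper names `sparse_matvec` and `grad_wrt_values`, the toy data, and the squared-error loss are all illustrative assumptions.

```python
def sparse_matvec(indices, values, shape, x):
    """Compute y = W @ x, where W is given by fixed (row, col) indices and values."""
    rows, _ = shape
    y = [0.0] * rows
    for (r, c), v in zip(indices, values):
        y[r] += v * x[c]
    return y


def grad_wrt_values(indices, x, dy):
    """Gradient of the loss w.r.t. each stored value, given dL/dy."""
    return [dy[r] * x[c] for r, c in indices]


indices = [(0, 0), (1, 2)]  # fixed sparsity pattern (hypothetical)
values = [0.5, -1.0]        # the only trainable parameters
shape = (2, 3)
x = [1.0, 2.0, 3.0]
target = [1.0, 0.0]
learning_rate = 0.05

for _ in range(200):
    y = sparse_matvec(indices, values, shape, x)
    # Gradient of the loss 0.5 * ||y - target||^2 with respect to y
    dy = [yi - ti for yi, ti in zip(y, target)]
    grads = grad_wrt_values(indices, x, dy)
    values = [v - learning_rate * g for v, g in zip(values, grads)]

y_final = sparse_matvec(indices, values, shape, x)
```

Only the entries listed in `indices` ever receive a gradient, so the sparsity pattern cannot change during training, which matches the restriction described above.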
[愿得一人]

2020-12-09 14:16
TensorFlow doesn't support training on sparse tensors yet. You can initialize a sparse tensor as you wish, then convert it into a dense tensor and create a variable from it like this:
```
# You need to correctly initialize the sparse tensor with indices, values and a shape   

b = tf.SparseTensor(indices, values, shape)
b_dense = tf.sparse_tensor_to_dense(b)
b_variable = tf.Variable(b_dense)
```
Now you have a variable initialized from a sparse tensor. Next you need to take care of the gradient update: make sure the zero entries in the variable stay 0, because naive backpropagation computes a non-vanishing gradient for them.

To do this, use the optimizer method tf.train.Optimizer.compute_gradients(loss, [list_of_variables]). It computes the gradients needed to minimize the loss function but doesn't apply them yet, and it returns a list of (gradient, variable) tuples. You can modify these gradients freely; in your case it makes sense to mask the unwanted gradients to 0 (e.g. by multiplying with a mask built from another sparse tensor with default value 0.0 and value 1.0 wherever a weight is present in your network). After modifying them, call tf.train.Optimizer.apply_gradients(grads_and_vars) to actually apply the gradients. Example code would look like this:
```
# Create optimizer instance
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)

# Get the gradients for your weights
grads_and_vars = optimizer.compute_gradients(loss, [b_variable])

# Modify the gradients at will
# In your case it would look similar to this
modified_grads_and_vars = [(tf.multiply(gv[0], mask_tensor), gv[1]) for gv in grads_and_vars]

# Apply modified gradients to your model
optimizer.apply_gradients(modified_grads_and_vars)
```
This makes sure the zero entries in your weight matrix stay 0 and no unwanted connections are created. You still need to take care of the gradients for all other variables separately.
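The snippet above leaves `mask_tensor` undefined. As a rough illustration of what such a mask does, here is a plain-Python sketch (not TensorFlow code) that builds a 0/1 mask from the sparse indices and applies one masked gradient-descent step. The helper names `build_mask` and `masked_sgd_step` and the toy numbers are assumptions made for this example.

```python
def build_mask(indices, shape):
    """Dense 0/1 mask: 1.0 where a weight exists, 0.0 everywhere else."""
    rows, cols = shape
    mask = [[0.0] * cols for _ in range(rows)]
    for r, c in indices:
        mask[r][c] = 1.0
    return mask


def masked_sgd_step(weights, grads, mask, lr):
    """One gradient-descent step in which masked-out gradients are zeroed."""
    return [
        [w - lr * g * m for w, g, m in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grads, mask)
    ]


indices = [(0, 0), (1, 1)]            # positions of the real weights
weights = [[0.5, 0.0], [0.0, -0.25]]  # dense storage of the sparse matrix
grads = [[1.0, 2.0], [3.0, 4.0]]      # naive dense gradient, nonzero everywhere

mask = build_mask(indices, (2, 2))
new_weights = masked_sgd_step(weights, grads, mask, lr=0.1)
```

Entries outside `indices` are multiplied by 0.0 before the update, so they stay exactly 0 even though the naive gradient for them is nonzero.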

情话喂你

2020-12-09 14:19

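The code above works with a minor correction, like this. Note that mask_tensor is looked up per variable here (mask_tensor[gv[1]]), so it has to map each variable to its mask: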

```
def optimize(loss, mask_tensor):
    optimizer = tf.train.AdamOptimizer(0.001)
    grads_and_vars = optimizer.compute_gradients(loss)
    modified_grads_and_vars = [
        (tf.multiply(gv[0], mask_tensor[gv[1]]), gv[1]) for gv in grads_and_vars
    ]
    return optimizer.apply_gradients(modified_grads_and_vars)
```


走了就别回头了

2020-12-09 14:26

TensorFlow doesn't currently support sparse tensor variables. However, it does support sparse lookups (tf.embedding_lookup) and sparse gradient updates (tf.sparse_add) of dense variables. I suspect these two will suffice for your use case.
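For readers unfamiliar with the pattern, here is a plain-Python analog of a sparse lookup and a sparse update on a dense table. It only illustrates the idea and does not reproduce the TensorFlow ops themselves; the helper names `lookup_rows` and `sparse_update` and the toy data are assumptions.

```python
def lookup_rows(table, ids):
    """Sparse lookup: read only the rows of the dense table selected by ids."""
    return [list(table[i]) for i in ids]


def sparse_update(table, ids, row_grads, lr):
    """Sparse update: apply a gradient step only to the rows selected by ids."""
    for i, g_row in zip(ids, row_grads):
        table[i] = [w - lr * g for w, g in zip(table[i], g_row)]


table = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # dense variable
ids = [2, 0]                                  # rows actually used in this step

rows = lookup_rows(table, ids)
sparse_update(table, ids, row_grads=[[1.0, 1.0], [1.0, 1.0]], lr=0.5)
```

The variable itself stays dense, but each step reads and writes only the selected rows; row 1 is never touched in this example.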
