Indexing in TensorFlow slower than gather

Question

I am trying to index into 1-D tensors to get a slice or a single element. I see a significant performance difference (roughly 30-40%) between NumPy-style indexing ([:]) or tf.slice on one side and tf.gather on the other, with tf.gather being the faster one.
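To make the comparison concrete, here is a minimal sketch of the access patterns I mean. It assumes eager execution (TensorFlow 2.x), and the tensor `values` and its contents are made up for illustration. All variants select the same elements; only the op used differs.

```python
import tensorflow as tf

# Hypothetical 1-D tensor standing in for the real data.
values = tf.constant([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])

# A contiguous slice of three elements, expressed three ways.
by_index = values[2:5]                             # NumPy-style indexing
by_slice = tf.slice(values, begin=[2], size=[3])   # explicit tf.slice
by_gather = tf.gather(values, tf.range(2, 5))      # gather with an index tensor

# A single element, expressed two ways.
elem_index = values[3]
elem_gather = tf.gather(values, 3)
```

This only shows that the variants are interchangeable in terms of results; it says nothing about their relative speed, which is what I am asking about.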

I also observe that tf.gather has significant overhead when it is called on scalars (looping over an unstacked tensor) as opposed to a whole tensor of indices. Is this a known issue?

Example code (inefficient):

for node_idxs in graph.nodes():
    # Split the index tensor into one scalar tensor per node.
    node_indices_list = tf.unstack(node_idxs)
    result = []
    for nodeid in node_indices_list:
        # Two single-element gathers per node.
        x = tf.gather(..., nodeid)
        y = tf.gather(..., nodeid)
        result.append(tf.multiply(x, y))
return tf.stack(result)

As opposed to example code (efficient):

for node_idxs in graph.nodes():
    x = tf.gather(..., node_idxs)
    y = tf.gather(..., node_idxs)
return tf.multiply(x, y)

I understand that the first, inefficient implementation does more work: it unstacks, loops, stacks, and issues many more gather operations. Still, I was not expecting a 100x slowdown when I am operating on only a few hundred nodes. Are unstacking and the overhead of gather on a single scalar really that slow? In the first case I have many more gather operations, each operating on a single element rather than on a tensor of offsets. Is there a faster way of indexing? I tried NumPy-style indexing and tf.slice, and both turned out to be slower than gather.
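For anyone who wants to reproduce the two patterns, here is a self-contained sketch. It assumes eager execution (TensorFlow 2.x); `x_values`, `y_values`, `node_idxs` and the function names `per_element` and `vectorized` are hypothetical stand-ins for the parts elided with "..." in my code above, not the real data.

```python
import tensorflow as tf

# Hypothetical stand-ins for the elided tensors and for one batch of node ids.
x_values = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0])
y_values = tf.constant([10.0, 20.0, 30.0, 40.0, 50.0])
node_idxs = tf.constant([4, 0, 2])

def per_element(node_idxs):
    """Inefficient pattern: two scalar gathers and one multiply per node."""
    result = []
    for node_id in tf.unstack(node_idxs):
        x = tf.gather(x_values, node_id)
        y = tf.gather(y_values, node_id)
        result.append(tf.multiply(x, y))
    return tf.stack(result)

def vectorized(node_idxs):
    """Efficient pattern: two gathers and one multiply for the whole batch."""
    x = tf.gather(x_values, node_idxs)
    y = tf.gather(y_values, node_idxs)
    return tf.multiply(x, y)

slow = per_element(node_idxs)
fast = vectorized(node_idxs)
```

Both functions return the same values. The difference is that the per-element version creates a number of ops proportional to the number of nodes, while the vectorized version creates a fixed number regardless of batch size.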

Source: https://stackoverflow.com/questions/46048235/indexing-in-tensorflow-slower-than-gather

Tags

tensorflow

tensorflow-serving

tensorflow-gpu

tensor

tensorflow-xla