First of all, I am aware that a related question has been asked here.
However, this question is about the implementation and internals. I was reading the paper "A
Before explaining the distinction between tensors and variables, we should be precise about what the word "tensor" means in the context of TensorFlow:
In the Python API, a tf.Tensor object represents the symbolic result of a TensorFlow operation. For example, in the expression t = tf.matmul(x, y), t is a tf.Tensor object representing the result of multiplying x and y (which may themselves be symbolic results of other operations, concrete values such as NumPy arrays, or variables).

In this context, a "symbolic result" is more complicated than a pointer to the result of an operation. It is more analogous to a function object that, when called (i.e., passed to tf.Session.run()), will run the necessary computation to produce the result of that operation and return it to you as a concrete value (e.g., a NumPy array).
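As a minimal sketch of that behaviour (assuming the TF 1.x graph-mode API that this answer describes; in TF 2.x the same calls live under tf.compat.v1), building the graph performs no arithmetic, and only running the tensor in a session produces a concrete NumPy array:

```python
import numpy as np
import tensorflow as tf  # assumes the TF 1.x graph API (tf.compat.v1 in TF 2.x)

x = tf.constant(np.random.rand(2, 3), dtype=tf.float32)
y = tf.constant(np.random.rand(3, 4), dtype=tf.float32)

t = tf.matmul(x, y)   # t is a symbolic tf.Tensor; no multiplication has happened yet
print(t)              # e.g. Tensor("MatMul:0", shape=(2, 4), dtype=float32)

with tf.Session() as sess:
    result = sess.run(t)   # runs the necessary ops and returns a concrete value
    print(type(result))    # <class 'numpy.ndarray'>
```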
In the C++ API, a tensorflow::Tensor object represents the concrete value of a multi-dimensional array. For example, the MatMul kernel takes two two-dimensional tensorflow::Tensor objects as inputs, and produces a single two-dimensional tensorflow::Tensor object as its output.
This distinction is a little confusing, and we might choose different names if we started over (in other language APIs, we prefer the name Output for a symbolic result and Tensor for a concrete value).
A similar distinction exists for variables. In the Python API, a tf.Variable is the symbolic representation of a variable, which has methods for creating operations that read the current value of the variable and assign values to it. In the C++ implementation, a tensorflow::Var object is a wrapper around a shared, mutable tensorflow::Tensor object.
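For example (again assuming the TF 1.x graph API), the read and assign methods on a tf.Variable only add operations to the graph; the variable's underlying buffer is mutated when those operations are run in a session:

```python
import tensorflow as tf  # assumes the TF 1.x graph API (tf.compat.v1 in TF 2.x)

v = tf.Variable(0.0, name="counter")   # symbolic representation of the variable
increment = tf.assign_add(v, 1.0)      # op that mutates the variable's buffer
read = v.read_value()                  # op that reads the current value

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(read))   # 0.0
    sess.run(increment)
    print(sess.run(read))   # 1.0
```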
With that context out of the way, we can address your specific questions:
What is the meaning of "in-memory buffers"?
An in-memory buffer is simply a contiguous region of memory that has been allocated with a TensorFlow allocator. tensorflow::Tensor objects contain a pointer to an in-memory buffer, which holds the values of that tensor. The buffer could be in host memory (i.e., accessible from the CPU) or device memory (e.g., accessible only from a GPU), and TensorFlow has operations to move data between these memory spaces.
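As an illustrative sketch (TF 1.x graph API assumed; the "/gpu:0" placement only takes effect if a GPU is visible), you can pin operations to different devices with tf.device, and TensorFlow inserts the copies needed to move buffers between the memory spaces:

```python
import tensorflow as tf  # assumes the TF 1.x graph API (tf.compat.v1 in TF 2.x)

with tf.device("/cpu:0"):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # buffer allocated in host memory
with tf.device("/gpu:0"):
    b = tf.matmul(a, a)  # requires a host-to-device copy of a's buffer

# allow_soft_placement falls back to the CPU if no GPU is available
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(b))
```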
What is the meaning of a "handle"?
In the explanation in the paper, the word "handle" is used in a couple of different ways, which are slightly different from how TensorFlow uses the term. The paper uses "symbolic handle" to refer to a tf.Tensor object, and "persistent, mutable handle" to refer to a tf.Variable object. The TensorFlow codebase uses "handle" to refer to a name for a stateful object (like a tf.FIFOQueue or tf.TensorArray) that can be passed around without copying all of the values (i.e., call-by-reference).
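For instance (TF 1.x graph API assumed), the operations created from a tf.FIFOQueue act on the queue through such a handle, rather than by copying the queue's contents:

```python
import tensorflow as tf  # assumes the TF 1.x graph API (tf.compat.v1 in TF 2.x)

q = tf.FIFOQueue(capacity=3, dtypes=tf.float32)
enqueue = q.enqueue(1.0)   # both ops refer to the same queue via its handle,
dequeue = q.dequeue()      # not to a copy of its contents

with tf.Session() as sess:
    sess.run(enqueue)
    print(sess.run(dequeue))   # 1.0
```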
Is my initial assumption about the internal of a tensor correct?
Your assumption most closely matches the definition of a (C++) tensorflow::Tensor object. The (Python) tf.Tensor object is more complicated because it refers to a function for computing a value, rather than the value itself.
What is the essential internal implementation difference between a tensor and a variable?
In C++, a tensorflow::Tensor and a tensorflow::Var are very similar; the only difference is that tensorflow::Var also has a mutex that can be used to lock the variable while it is being updated.
In Python, the essential difference is that a tf.Tensor is implemented as a node in a dataflow graph and is read-only (i.e., it can only be evaluated, by calling tf.Session.run()). A tf.Variable can be both read (i.e., by evaluating its read operation) and written (e.g., by running an assign operation).
Why are they declared differently and why is that difference essential to TensorFlow?
Tensors and variables serve different purposes. Tensors (tf.Tensor objects) can represent complex compositions of mathematical expressions, like loss functions in a neural network, or symbolic gradients. Variables represent state that is updated over time, like weight matrices and convolutional filters during training. While in principle you could represent the evolving state of a model without variables, you would end up with a very large (and repetitive) mathematical expression, so variables provide a convenient way to materialize the state of the model and, for example, share it with other machines for parallel training.
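To make the division of labour concrete, here is a small sketch (TF 1.x graph API assumed; the model and data are made up for illustration) in which the loss is a symbolic tf.Tensor, while the weight is a tf.Variable whose materialized state is updated in place on every training step:

```python
import numpy as np
import tensorflow as tf  # assumes the TF 1.x graph API (tf.compat.v1 in TF 2.x)

# Toy linear model y = w * x, fit to data generated with w = 2.
x = tf.placeholder(tf.float32, shape=[None])
y = tf.placeholder(tf.float32, shape=[None])

w = tf.Variable(0.0)                          # mutable model state
loss = tf.reduce_mean(tf.square(w * x - y))   # symbolic expression (a tf.Tensor)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        # Each step updates w's buffer in place; the graph itself never grows.
        sess.run(train_op, feed_dict={x: np.array([1.0, 2.0, 3.0]),
                                      y: np.array([2.0, 4.0, 6.0])})
    print(sess.run(w))   # approaches 2.0
```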