How to use random numbers in a user-defined TensorFlow op?
I am writing an op in C++ which needs random numbers in its Compute function, but it seems I should not ...
The core of all random number generation in TensorFlow is PhiloxRandom, generally accessed through its wrapper GuardedPhiloxRandom. As explained in tf.set_random_seed, there are graph-level and op-level seeds, both of which may or may not be set. If you want your op to honor this too, you need to do a couple of things. First, your op should be declared with two optional attributes, seed and seed2; see the existing ops in random_ops.cc, or the sketch just below.
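The declaration could look something like this (a minimal sketch mirroring the stock ops in random_ops.cc; the op name MyDistribution, its input and output names, and the supported dtype set are placeholders for your own):
#include "tensorflow/core/framework/common_shape_fns.h"
#include "tensorflow/core/framework/op.h"
using namespace tensorflow;
REGISTER_OP("MyDistribution")
    .Input("shape: int32")
    .Output("output: dtype")
    .SetIsStateful()  // Consecutive executions must produce different values
    .Attr("seed: int = 0")
    .Attr("seed2: int = 0")
    .Attr("dtype: {float} = DT_FLOAT")
    .SetShapeFn(shape_inference::RandomShape);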
Then, in Python, you have some user-facing API wrapping your op that creates these two values using tensorflow.python.framework.random_seed, which you import with from tensorflow.python.framework import random_seed, and then call seed1, seed2 = random_seed.get_seed(seed); this correctly derives the two seed values from the graph's seed and an optional seed parameter to the function (see random_ops.py). These seed1 and seed2 values are then passed as the seed and seed2 attributes to your op, obviously. If you do all that, then GuardedPhiloxRandom will take care of properly initializing the random number generator using the right seeds. A wrapper could look like the sketch below.
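This is a minimal Python sketch, assuming the compiled op has been loaded with tf.load_op_library (the library file name is hypothetical); only the random_seed.get_seed call is prescribed by the pattern above:
import tensorflow as tf
from tensorflow.python.framework import random_seed
# Hypothetical: load the shared object containing the compiled op
_my_ops = tf.load_op_library('my_distribution_op.so')
def my_distribution(shape, seed=None, name=None):
    # Derive the two op-level seeds from the graph seed and the optional argument
    seed1, seed2 = random_seed.get_seed(seed)
    return _my_ops.my_distribution(shape, seed=seed1, seed2=seed2, name=name)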
Now, to the kernel implementation. In addition to the things mentioned above, you will need to combine two things: the struct template FillPhiloxRandom, declared in core/kernels/random_op.h, which will help you fill a tensor with random data; and a Distribution, which is just an object that can be called with a random number generator to produce a value (see the existing implementations in core/lib/random/random_distributions.h). From there it is mostly a matter of looking at how it is done in core/kernels/random_op.cc and copying the bits you need. Most kernels in there are based on PhiloxRandomOp (which is not publicly declared, but you can copy or adapt it). This essentially holds a random number generator, allocates space in the output tensor (it assumes the first input is the desired shape) and calls FillPhiloxRandom to do the work. If this is the kind of op you are trying to create (generate some data according to some distribution), then you are all set! Your code could look something like this:
// Required for thread pool device
#define EIGEN_USE_THREADS
#include <cstring>
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/framework/tensor_shape.h"
#include "tensorflow/core/kernels/random_op.h"
#include "tensorflow/core/util/guarded_philox_random.h"
using namespace tensorflow;
// Helper function to convert a 32-bit integer to a float in [0, 1).
// Copied from core/lib/random/random_distributions.h
PHILOX_DEVICE_INLINE float Uint32ToFloat(uint32 x) {
// IEEE754 floats are formatted as follows (MSB first):
// sign(1) exponent(8) mantissa(23)
// Conceptually construct the following:
// sign == 0
// exponent == 127 -- an excess 127 representation of a zero exponent
// mantissa == 23 random bits
const uint32 man = x & 0x7fffffu; // 23 bit mantissa
const uint32 exp = static_cast<uint32>(127);
const uint32 val = (exp << 23) | man;
// Assumes that endian-ness is same for float and uint32.
float result;
memcpy(&result, &val, sizeof(val));
return result - 1.0f;
}
// Template class for your custom distribution
template <class Generator, typename RealType>
class MyDistribution;
// Implementation for tf.float32
template <class Generator>
class MyDistribution<Generator, float> {
public:
// The number of elements that will be returned (see below).
static const int kResultElementCount = Generator::kResultElementCount;
// Cost of generation of a single element (in cycles) (see below).
static const int kElementCost = 3;
// Indicate that this distribution may take variable number of samples
// during the runtime (see below).
static const bool kVariableSamplesPerOutput = false;
typedef random::Array<float, kResultElementCount> ResultType;
typedef float ResultElementType;
PHILOX_DEVICE_INLINE
ResultType operator()(Generator* gen) {
typename Generator::ResultType sample = (*gen)();
ResultType result;
for (int i = 0; i < kResultElementCount; ++i) {
float r = Uint32ToFloat(sample[i]);
// Example distribution logic: produce 1 or 0 with 50% probability
result[i] = 1.0f * (r < 0.5f);
}
return result;
}
};
// Could add implementations for other data types...
// Base kernel
// Copied from core/kernels/random_op.cc
static Status AllocateOutputWithShape(OpKernelContext* ctx, const Tensor& shape,
int index, Tensor** output) {
TensorShape tensor_shape;
TF_RETURN_IF_ERROR(ctx->op_kernel().MakeShape(shape, &tensor_shape));
return ctx->allocate_output(index, tensor_shape, output);
}
template <typename Device, class Distribution>
class PhiloxRandomOp : public OpKernel {
public:
typedef typename Distribution::ResultElementType T;
explicit PhiloxRandomOp(OpKernelConstruction* ctx) : OpKernel(ctx) {
OP_REQUIRES_OK(ctx, generator_.Init(ctx));
}
void Compute(OpKernelContext* ctx) override {
const Tensor& shape = ctx->input(0);
Tensor* output;
OP_REQUIRES_OK(ctx, AllocateOutputWithShape(ctx, shape, 0, &output));
auto output_flat = output->flat<T>();
tensorflow::functor::FillPhiloxRandom<Device, Distribution>()(
ctx, ctx->eigen_device<Device>(),
// Multiplier 256 is the same as in FillPhiloxRandomTask; do not change
// it here without changing it there too.
generator_.ReserveRandomOutputs(output_flat.size(), 256),
output_flat.data(), output_flat.size(), Distribution());
}
private:
GuardedPhiloxRandom generator_;
};
// Register kernel
typedef Eigen::ThreadPoolDevice CPUDevice;
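// Explicit instantiation of the CPU functor for our distribution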
template struct functor::FillPhiloxRandom<
CPUDevice, MyDistribution<tensorflow::random::PhiloxRandom, float>>;
REGISTER_KERNEL_BUILDER(
Name("MyDistribution")
.Device(DEVICE_CPU)
.HostMemory("shape")
.TypeConstraint<float>("dtype"),
PhiloxRandomOp<CPUDevice, MyDistribution<tensorflow::random::PhiloxRandom, float>>);
// Register kernels for more types, can use macros as in core/kernels/random_op.cc...
There are a few extra bits and pieces here. First, you need to understand that PhiloxRandom generally produces four unsigned 32-bit integers on each step, and you have to make your random values from these. Uint32ToFloat is a helper to get a float between zero and one from one of these numbers. There are a few constants in there too. kResultElementCount is the number of values your distribution produces on each step. If you produce one value per random number from the generator, you can set it to Generator::kResultElementCount, as done here (which makes it 4). However, if you want to produce double values (that is, tf.float64), for example, you may want to use two 32-bit integers per value, so you would produce Generator::kResultElementCount / 2 values in that case; a sketch of such a specialization follows.
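Such a double specialization could look like this (a hypothetical sketch; it assumes you have also copied the Uint64ToDouble helper, which builds one double out of two 32-bit integers, from core/lib/random/random_distributions.h in the same way as Uint32ToFloat above):
// Hypothetical implementation for tf.float64; assumes a Uint64ToDouble helper
// copied from core/lib/random/random_distributions.h.
template <class Generator>
class MyDistribution<Generator, double> {
 public:
  // Two 32-bit generator outputs are consumed per double value.
  static const int kResultElementCount = Generator::kResultElementCount / 2;
  static const int kElementCost = 3;
  static const bool kVariableSamplesPerOutput = false;
  typedef random::Array<double, kResultElementCount> ResultType;
  typedef double ResultElementType;
  PHILOX_DEVICE_INLINE
  ResultType operator()(Generator* gen) {
    typename Generator::ResultType sample = (*gen)();
    ResultType result;
    for (int i = 0; i < kResultElementCount; ++i) {
      double r = Uint64ToDouble(sample[2 * i], sample[2 * i + 1]);
      // Same example logic as the float version above
      result[i] = 1.0 * (r < 0.5);
    }
    return result;
  }
};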
kElementCost is supposed to indicate how many cycles it takes your distribution to produce an element. I do not know how the TensorFlow team measures this, but it is just a hint to distribute the generation work among tasks (it is used by FillPhiloxRandom), so you can guess something or copy it from a similarly expensive distribution. kVariableSamplesPerOutput determines whether each call to your distribution may produce a different number of outputs; when it is false (which should be the common case), FillPhiloxRandom will make value generation more efficient. PHILOX_DEVICE_INLINE (defined in core/lib/random/philox_random.h) is a compiler hint to inline the function. You can then add additional implementations and kernel registrations for other data types and, if you support it, for DEVICE_GPU (with typedef Eigen::GpuDevice GPUDevice) or even DEVICE_SYCL (with typedef Eigen::SyclDevice SYCLDevice); see the sketch below. And about that, EIGEN_USE_THREADS is just there to enable the thread pool execution device in Eigen, so the CPU implementation is multi-threaded.
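A GPU registration could look something like this (a hypothetical sketch; it assumes a GPU specialization of FillPhiloxRandom for MyDistribution has been compiled along the lines of core/kernels/random_op_gpu.cu.cc):
// Hypothetical GPU registration; requires a CUDA build of FillPhiloxRandom
// for MyDistribution (compare core/kernels/random_op_gpu.cu.cc).
typedef Eigen::GpuDevice GPUDevice;
REGISTER_KERNEL_BUILDER(
    Name("MyDistribution")
        .Device(DEVICE_GPU)
        .HostMemory("shape")
        .TypeConstraint<float>("dtype"),
    PhiloxRandomOp<GPUDevice, MyDistribution<tensorflow::random::PhiloxRandom, float>>);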
If your use case is different, though (for example, you want to generate some random numbers and do some other computation in addition to that), FillPhiloxRandom may not be useful to you (or it may be, but then you also need to do something else). Having a look at core/kernels/random_op.cc and the headers of the different classes should help you figure out how to use them for your problem.