I am a beginner with TensorFlow. I'm currently working on a system with 2 GPUs, each with 12 GB of memory. I want to implement model parallelism across the two GPUs to train large models.
Here's an example. The model has some parts on GPU 0, some parts on GPU 1, and some parts on the CPU, so this is 3-way model parallelism.
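import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.train)

# First part of the model is placed on GPU 0.
with tf.device("/gpu:0"):
    a = tf.Variable(tf.ones(()))
    a = tf.square(a)
# Second part of the model is placed on GPU 1.
with tf.device("/gpu:1"):
    b = tf.Variable(tf.ones(()))
    b = tf.square(b)
# The loss combines both parts on the CPU.
with tf.device("/cpu:0"):
    loss = a + b
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train_op = opt.minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(10):
    loss0, _ = sess.run([loss, train_op])
    print("loss", loss0)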
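
To sanity-check the printed losses, the update can be worked out by hand. Each variable x contributes x² to the loss, so its gradient is 2x, and one gradient-descent step with learning rate 0.1 maps x to 0.8x. The sketch below reproduces that arithmetic in plain Python, without TensorFlow. The helper name `simulate_losses` is made up for illustration, and it assumes the loss in each `sess.run` call is read before the update is applied, an ordering TensorFlow does not guarantee when both are fetched in one call.

```python
def simulate_losses(steps=10, learning_rate=0.1):
    """Plain-Python model of loss = a**2 + b**2 under gradient descent.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    a, b = 1.0, 1.0  # both variables start at tf.ones(())
    losses = []
    for _ in range(steps):
        # Assumption: the loss is read before the variables are updated.
        losses.append(a * a + b * b)
        a -= learning_rate * 2.0 * a  # d(a**2)/da = 2a
        b -= learning_rate * 2.0 * b  # d(b**2)/db = 2b
    return losses


for value in simulate_losses():
    print("loss", value)
```

Under these assumptions the loss starts at 2.0 and shrinks by a factor of about 0.64 per step, so the TensorFlow loop should print a steadily decreasing sequence if the devices are wired up correctly.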