I have an autoencoder that takes an image as an input and produces a new image as an output.
The input image (1x1024x1024x3) is split into patches (1024x32x32x3) before being fed into the network.
This code works for your specific case, and more generally whenever the image is square, the kernel is square, and the image size is divisible by the kernel size.
I have not tested other cases.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

size = 1024
k_size = 32
# Number of patches along each spatial axis
axes_1_2_size = int(np.sqrt((size * size) / (k_size * k_size)))

# Define a placeholder for the image (or load it directly if you prefer)
img = tf.placeholder(tf.int32, shape=(1, size, size, 3))

# Extract non-overlapping patches
patches = tf.image.extract_image_patches(img,
                                         ksizes=[1, k_size, k_size, 1],
                                         strides=[1, k_size, k_size, 1],
                                         rates=[1, 1, 1, 1],
                                         padding='VALID')

# Reconstruct the image back from the patches.
# First separate out the patch and channel dimensions
reconstruct = tf.reshape(patches, (1, axes_1_2_size, axes_1_2_size, k_size, k_size, 3))
# Transpose the axes (I got this axes tuple for transpose via experimentation)
reconstruct = tf.transpose(reconstruct, (0, 1, 3, 2, 4, 5))
# Reshape back to the original image shape
reconstruct = tf.reshape(reconstruct, (size, size, 3))

im_arr = ...  # load an image with shape (size, size, 3) here

# Run the operations
with tf.Session() as sess:
    ps, r = sess.run([patches, reconstruct], feed_dict={img: [im_arr]})

# Plot the reconstructed image to verify
plt.imshow(r)
plt.show()
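If you want to sanity-check the axis order without starting a TensorFlow session, the same reshape/transpose round trip can be sketched in plain NumPy. This is only an illustration: the small sizes are arbitrary, and the first reshape/transpose pair emulates what `extract_image_patches` produces for non-overlapping patches.

```python
import numpy as np

size, k_size = 8, 4
n = size // k_size  # patches per side (requires size divisible by k_size)

img = np.arange(size * size * 3).reshape(size, size, 3)

# Emulate extract_image_patches for non-overlapping patches:
# result has shape (n, n, k_size*k_size*3), one flattened patch per grid cell
patches = (img.reshape(n, k_size, n, k_size, 3)
              .transpose(0, 2, 1, 3, 4)
              .reshape(n, n, k_size * k_size * 3))

# Reconstruction mirrors the TensorFlow code: separate the patch rows/cols,
# interleave them back, then flatten to the original spatial shape
rec = (patches.reshape(n, n, k_size, k_size, 3)
              .transpose(0, 2, 1, 3, 4)
              .reshape(size, size, 3))

print(np.array_equal(img, rec))  # True
```

Seeing the round trip succeed on a tiny array is a quick way to convince yourself the transpose tuple `(0, 1, 3, 2, 4, 5)` in the TensorFlow version (which carries an extra batch axis in front) is correct.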