Question
I have a list of numpy arrays which are actually input images to my CNN. However, the size of each image is not consistent, and my CNN only takes images of dimension 224x224. How do I reshape each image into the given dimension?
print(train_images[key].reshape(224, 224,3))
gives me an output
ValueError: total size of new array must be unchanged
I would be very grateful if anybody could help me with this.
Answer 1:
When reshaping, the new array must have the same number of values as the original. What you need is to crop the picture (if it is bigger than 224x224) or pad it (if it is smaller than 224x224), or to resize it in either case.
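To make the error concrete, here is a minimal sketch comparing element counts. The 300x400 image is a hypothetical stand-in for one entry of train_images, not data from the question.

```python
import numpy as np

# Hypothetical image: 300 rows x 400 columns x 3 channels.
img = np.zeros((300, 400, 3), dtype=np.uint8)
target_shape = (224, 224, 3)

print(img.size)                    # 300 * 400 * 3 = 360000 values
print(int(np.prod(target_shape)))  # 224 * 224 * 3 = 150528 values

try:
    img.reshape(target_shape)
except ValueError as err:
    # reshape only rearranges existing values, so the counts must match
    print("reshape failed:", err)
    reshape_failed = True
else:
    reshape_failed = False
```

Since reshape can neither drop nor invent pixels, one of the approaches below is needed instead.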
Cropping is simply slicing with correct indexes:
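def crop(np_img, size):
    # size is (height, width); keep the central region of the image
    v_start = (np_img.shape[0] - size[0]) // 2
    h_start = (np_img.shape[1] - size[1]) // 2
    # rows are sliced with the height, columns with the width
    return np_img[v_start:v_start + size[0], h_start:h_start + size[1], :]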
Padding is slightly more complex. This creates a zeros array in the desired shape and plugs the values of the image into it:
def pad_image(np_img, size):
    # size is (height, width); centre the image on a zero canvas
    v_start = (size[0] - np_img.shape[0]) // 2
    h_start = (size[1] - np_img.shape[1]) // 2
    result = np.zeros((size[0], size[1], np_img.shape[2]), dtype=np_img.dtype)
    result[v_start:v_start + np_img.shape[0], h_start:h_start + np_img.shape[1], :] = np_img
    return result
You can also use the np.pad function for it:
def pad_image(np_img, size):
    # size is (height, width); pad with zeros on the top and left only
    v_dif = size[0] - np_img.shape[0]
    h_dif = size[1] - np_img.shape[1]
    return np.pad(np_img, ((v_dif, 0), (h_dif, 0), (0, 0)), 'constant', constant_values=0)
You may notice that padding differs slightly between the two functions. I didn't want to overcomplicate the problem, so the second function pads only the top and left, while the first pads both sides since that was easier to calculate.
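If centred padding is wanted from np.pad as well, one possible sketch splits each difference between the two sides. The name pad_center and the 100x150 test image are made up for illustration.

```python
import numpy as np

def pad_center(np_img, size):
    """Zero-pad an H x W x C image to (size[0], size[1]), centred."""
    v_dif = size[0] - np_img.shape[0]
    h_dif = size[1] - np_img.shape[1]
    if v_dif < 0 or h_dif < 0:
        raise ValueError("image is larger than the target size; crop it first")
    # Put half of the difference before the image and the rest after it.
    top, left = v_dif // 2, h_dif // 2
    return np.pad(
        np_img,
        ((top, v_dif - top), (left, h_dif - left), (0, 0)),
        mode="constant",
        constant_values=0,
    )

small = np.ones((100, 150, 3), dtype=np.uint8)
padded = pad_center(small, (224, 224))
print(padded.shape)  # (224, 224, 3)
```

When a difference is odd, the extra row or column goes to the bottom or right.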
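Finally, for resizing you are better off using another library. You can use scipy.misc.imresize; it's pretty straightforward (note that it was deprecated in SciPy 1.0 and removed in SciPy 1.3, so on newer versions use something like Pillow or cv2.resize instead). This should do it: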
imresize(np_img, size)
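On SciPy versions that no longer ship imresize, a rough equivalent can be sketched with Pillow. This assumes Pillow is installed and that the image is an H x W x 3 uint8 array. The helper name resize_image is hypothetical.

```python
import numpy as np
from PIL import Image

def resize_image(np_img, size):
    """Resize an H x W x 3 uint8 array to size = (height, width)."""
    height, width = size
    pil_img = Image.fromarray(np_img)
    # Pillow expects (width, height), the reverse of NumPy's (rows, cols).
    resized = pil_img.resize((width, height))
    return np.asarray(resized)

img = np.zeros((300, 400, 3), dtype=np.uint8)
out = resize_image(img, (224, 224))
print(out.shape)  # (224, 224, 3)
```

Like imresize, this stretches the image to the target shape, so the aspect ratio is not preserved.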
Answer 2:
Here are a few ways I know to achieve this:
- Since you're using Python, you can use cv2.resize() to resize the image to 224x224. The problem here is going to be distortions.
- Scale the image to adjust to one of the required sizes (W=224 or H=224) and trim off whatever is extra. There is a loss of information here.
- If you have a larger image and a bounding box, expand the bounding box by some delta to maintain the aspect ratio, and then resize down to the required size.
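The bounding-box option can be read in more than one way. The sketch below shows one possible interpretation: grow the shorter side of the box until it is square, shift it to stay inside the image, and crop. The function name, the (top, left, bottom, right) box layout, and the sample numbers are all assumptions made for illustration.

```python
import numpy as np

def square_crop_from_bbox(np_img, bbox):
    """Crop a square region around bbox = (top, left, bottom, right)."""
    top, left, bottom, right = bbox
    side = max(bottom - top, right - left)
    img_h, img_w = np_img.shape[:2]
    if side > min(img_h, img_w):
        raise ValueError("bounding box cannot be squared inside this image")
    # Grow the shorter side by a delta split evenly around the box...
    new_top = top - (side - (bottom - top)) // 2
    new_left = left - (side - (right - left)) // 2
    # ...then shift the window so it stays within the image borders.
    new_top = min(max(new_top, 0), img_h - side)
    new_left = min(max(new_left, 0), img_w - side)
    return np_img[new_top:new_top + side, new_left:new_left + side, :]

img = np.zeros((480, 640, 3), dtype=np.uint8)
patch = square_crop_from_bbox(img, (100, 200, 300, 500))
print(patch.shape)  # (300, 300, 3)
```

The square patch can then be resized down to 224x224 without changing its aspect ratio.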
When you reshape a numpy array, the product of the dimensions must match. If it doesn't, numpy throws a ValueError like the one you got. There's no way to solve your problem using reshape, AFAIK.
Answer 3:
The standard way is to resize the image such that the smaller side is equal to 224 and then crop the image to 224x224. Resizing the image to 224x224 may distort the image and can lead to erroneous training. For example, a circle might become an ellipse if the image is not a square. It is important to maintain the original aspect ratio.
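As a rough sketch of this resize-then-crop recipe, the code below scales the shorter side to 224 and takes the central 224x224 window. To stay dependency-free beyond NumPy it uses nearest-neighbour index sampling as a stand-in for a proper resize; in practice cv2.resize or Pillow would give better quality. Both function names are hypothetical.

```python
import numpy as np

def resize_nearest(np_img, new_h, new_w):
    """Nearest-neighbour resize using plain NumPy index sampling."""
    rows = np.arange(new_h) * np_img.shape[0] // new_h
    cols = np.arange(new_w) * np_img.shape[1] // new_w
    return np_img[rows[:, None], cols[None, :]]

def resize_and_center_crop(np_img, target=224):
    """Scale the shorter side to `target`, then crop the centre square."""
    h, w = np_img.shape[:2]
    scale = target / min(h, w)  # same factor for both axes keeps the aspect ratio
    # max() guards against rounding leaving a side one pixel short.
    new_h = max(target, round(h * scale))
    new_w = max(target, round(w * scale))
    resized = resize_nearest(np_img, new_h, new_w)
    top = (new_h - target) // 2
    left = (new_w - target) // 2
    return resized[top:top + target, left:left + target]

wide = np.zeros((300, 400, 3), dtype=np.uint8)
tall = np.zeros((500, 250, 3), dtype=np.uint8)
print(resize_and_center_crop(wide).shape)  # (224, 224, 3)
print(resize_and_center_crop(tall).shape)  # (224, 224, 3)
```

The crop discards the margins along the longer side, which is the trade-off for avoiding distortion.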
Source: https://stackoverflow.com/questions/44451227/how-to-reshape-a-3d-numpy-array