How to reshape a 3D numpy array?

Posted by 早过忘川 on 2019-12-11 00:38:22

Question


I have a list of numpy arrays which are actually input images to my CNN. However, the size of my images is not consistent, and my CNN only takes images of dimension 224x224. How do I reshape each image into the given dimension?

print(train_images[key].reshape(224, 224, 3))

gives me the output

ValueError: total size of new array must be unchanged

I would be very grateful if anybody could help me with this.


Answer 1:


When you reshape, the new array must have the same number of values as the original. What you need is to crop the picture (if it is bigger than 224x224), pad it (if it is smaller than 224x224), or resize it in either case.

Cropping is simply slicing with correct indexes:

def crop(np_img, size):
    # size is (height, width); take the centered window of that size
    v_start = (np_img.shape[0] - size[0]) // 2
    h_start = (np_img.shape[1] - size[1]) // 2
    return np_img[v_start:v_start + size[0], h_start:h_start + size[1], :]

Padding is slightly more complex. This creates a zeros array of the desired shape and copies the image values into it:

import numpy as np

def pad_image(np_img, size):
    # size is (height, width); center the image on a zero canvas
    v_start = (size[0] - np_img.shape[0]) // 2
    h_start = (size[1] - np_img.shape[1]) // 2

    result = np.zeros((size[0], size[1], np_img.shape[2]), dtype=np_img.dtype)
    result[v_start:v_start + np_img.shape[0], h_start:h_start + np_img.shape[1], :] = np_img

    return result

You can also use the np.pad function for this:

def pad_image(np_img, size):
    # pad only the top and left edges with zeros
    v_dif = size[0] - np_img.shape[0]
    h_dif = size[1] - np_img.shape[1]
    return np.pad(np_img, ((v_dif, 0), (h_dif, 0), (0, 0)), 'constant', constant_values=0)

You may notice that the padding differs between the two functions. I didn't want to overcomplicate the problem, so the second function pads only the top and left. The first one pads both sides, since that was easier to calculate.
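If you want np.pad to split the padding across both sides as well, something along these lines should work. This is only a sketch: the name pad_center is made up for illustration, it assumes an H x W x C array and a (height, width) target, and it puts the odd leftover pixel on the bottom/right.

```python
import numpy as np

def pad_center(np_img, size):
    """Zero-pad an H x W x C image on all four sides up to size = (height, width)."""
    v_dif = size[0] - np_img.shape[0]
    h_dif = size[1] - np_img.shape[1]
    if v_dif < 0 or h_dif < 0:
        raise ValueError("image is larger than the target size; crop it first")
    # Split the padding between the two sides; when the difference is odd,
    # the extra pixel goes to the bottom/right.
    top, left = v_dif // 2, h_dif // 2
    pad_width = ((top, v_dif - top), (left, h_dif - left), (0, 0))
    return np.pad(np_img, pad_width, mode="constant", constant_values=0)

small = np.ones((200, 210, 3), dtype=np.uint8)
padded = pad_center(small, (224, 224))
print(padded.shape)  # (224, 224, 3)
```

The channel axis gets a (0, 0) entry so that only height and width are padded.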

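And finally, for resizing, you are better off using another library. You can use scipy.misc.imresize, which is pretty straightforward (note that it was deprecated in SciPy 1.0 and removed in SciPy 1.3, so it needs an older SciPy). This should do it: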

imresize(np_img, size)
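For environments where imresize is not available, here is a rough dependency-free stand-in using nearest-neighbour sampling in plain NumPy. It is not equivalent to imresize and gives cruder results than an interpolating resize from an imaging library; the name resize_nearest is hypothetical, and it assumes an H x W x C array and a (height, width) target.

```python
import numpy as np

def resize_nearest(np_img, size):
    """Nearest-neighbour resize of an H x W x C array to size = (height, width)."""
    in_h, in_w = np_img.shape[:2]
    out_h, out_w = size
    # For every output row/column, pick the source row/column it maps to.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    # Broadcasting rows (as a column vector) against cols selects the full grid.
    return np_img[rows[:, None], cols]

img = np.zeros((300, 400, 3), dtype=np.uint8)
resized = resize_nearest(img, (224, 224))
print(resized.shape)  # (224, 224, 3)
```

Like any direct resize to a square, this distorts images whose aspect ratio is not 1:1.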



Answer 2:


Here are a few ways I know to achieve this:

  1. Since you're using Python, you can use cv2.resize() to resize the image to 224x224. The problem here is that the image gets distorted if it is not square.
  2. Scale the image so that one side matches the required size (W=224 or H=224) and trim off whatever is extra. There is a loss of information here.
  3. If you have the larger image and a bounding box, expand the bounding box by some delta to maintain the aspect ratio and then resize down to the required size.

When you reshape a numpy array, the product of the dimensions must match. If it does not, numpy throws a ValueError, as you've seen. There's no way to solve your problem using reshape, AFAIK.
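To make the size constraint concrete, here is a small check. The 300x400 input is just an assumed example size, not something taken from the question.

```python
import numpy as np

img = np.zeros((300, 400, 3), dtype=np.uint8)
target_shape = (224, 224, 3)

print(img.size)                    # 360000 values in the source array
print(int(np.prod(target_shape)))  # 150528 values required by the target shape

try:
    img.reshape(target_shape)
    reshape_ok = True
except ValueError as err:
    reshape_ok = False
    print("reshape failed:", err)

# A reshape succeeds only when the element counts agree: 4 * 6 * 3 == 8 * 3 * 3.
ok = np.zeros((4, 6, 3)).reshape(8, 3, 3)
print(ok.shape)  # (8, 3, 3)
```

Even when the counts do match, reshape only reinterprets the same values in a new layout; it never rescales the picture.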




Answer 3:


The standard way is to resize the image such that the smaller side is equal to 224 and then crop the image to 224x224. Resizing the image to 224x224 may distort the image and can lead to erroneous training. For example, a circle might become an ellipse if the image is not a square. It is important to maintain the original aspect ratio.
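The arithmetic behind this can be sketched in plain Python. The helper name resize_then_crop_geometry is made up for illustration; it only computes the intermediate size and the crop offsets, and leaves the actual resizing to whatever image library you use.

```python
def resize_then_crop_geometry(height, width, target=224):
    """Return the aspect-preserving resized size and the centre-crop offsets."""
    scale = target / min(height, width)
    # The shorter side lands on target; the longer side keeps the aspect ratio.
    new_h = max(target, round(height * scale))
    new_w = max(target, round(width * scale))
    top = (new_h - target) // 2
    left = (new_w - target) // 2
    return (new_h, new_w), (top, left)

new_size, offsets = resize_then_crop_geometry(480, 640)
print(new_size)  # (224, 299)
print(offsets)   # (0, 37)
```

For a 480x640 image this means resizing to 224x299 and then keeping resized[0:224, 37:261], so a circle stays a circle and only the left and right margins are lost.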



Source: https://stackoverflow.com/questions/44451227/how-to-reshape-a-3d-numpy-array
