Problem

I'm trying to build a simple image classifier using scikit-learn. I'm hoping to avoid having to resize and convert each image before training.

Question

Given two different images that are different formats and sizes (1.jpg and 2.png), how can I avoid a ValueError while fitting the model?

I have one example where I train using only 1.jpg, which fits successfully.
I have another example where I train using both 1.jpg and 2.png and a ValueError is produced.

This example will fit successfully:

import numpy as np
from sklearn import svm 
import matplotlib.image as mpimg

target = [1, 2]
images = np.array([
    # target 1
    [mpimg.imread('./1.jpg'), mpimg.imread('./1.jpg')],
    # target 2
    [mpimg.imread('./1.jpg'), mpimg.imread('./1.jpg')],
])
n_samples = len(images)
data = images.reshape((n_samples, -1))
model = svm.SVC()
model.fit(data, target)

This example will raise a ValueError.

Observe the different 2.png image in target 2.

import numpy as np
from sklearn import svm 
import matplotlib.image as mpimg

target = [1, 2]
images = np.array([
    # target 1
    [mpimg.imread('./1.jpg'), mpimg.imread('./1.jpg')],
    # target 2
    [mpimg.imread('./2.png'), mpimg.imread('./1.jpg')],
])
n_samples = len(images)
data = images.reshape((n_samples, -1))
model = svm.SVC()
model.fit(data, target)
# ValueError: setting an array element with a sequence.

1.jpg

2.png

Answer 1:

For this, I would really recommend using the tools in Keras that are specifically designed to preprocess images in a highly scalable and efficient way.

from keras.preprocessing.image import ImageDataGenerator
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

1 Determine the target size of your new pictures

h, w = 150, 150  # desired height and width
batch_size = 32
N_images = 100  # total number of images

Keras works in batches, so batch_size just determines how many pictures at once will be processed (this does not impact your end result, just the speed).

2 Create your Image Generator

train_datagen = ImageDataGenerator(
    rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    'Pictures_dir',
    target_size=(h, w),
    batch_size=batch_size,
    class_mode = 'binary')

The object that does the image extraction is ImageDataGenerator. Its flow_from_directory method, which I believe is useful for you here, reads the contents of the folder Pictures_dir and expects your images to be sorted into one sub-folder per class (e.g. Pictures_dir/class0 and Pictures_dir/class1). When iterated, the generator yields batches of resized images from these folders together with their labels (in this example, 'class0' and 'class1').
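As a rough sketch of the folder layout described above, the following uses only the standard library to copy a flat set of files into one sub-folder per class. The helper name arrange_by_class is hypothetical; only Pictures_dir, class0, and class1 come from the example in this answer.

```python
import shutil
from pathlib import Path


def arrange_by_class(files_by_class, root='Pictures_dir'):
    """Copy each file into root/<class name>/ (hypothetical helper).

    files_by_class maps a class name to a list of image paths.
    Returns the list of destination paths.
    """
    copied = []
    for class_name, paths in files_by_class.items():
        # One sub-folder per class, created if it does not exist yet
        class_dir = Path(root) / class_name
        class_dir.mkdir(parents=True, exist_ok=True)
        for path in paths:
            destination = class_dir / Path(path).name
            shutil.copy(path, destination)
            copied.append(destination)
    return copied
```

Assuming the two files from the question exist in the working directory, arrange_by_class({'class0': ['1.jpg'], 'class1': ['2.png']}) would place them at Pictures_dir/class0/1.jpg and Pictures_dir/class1/2.png.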

There are plenty of other arguments to this generator; you can check them out in the Keras documentation (especially if you want to do data augmentation).

Note: this will take any image, be it PNG or JPG, as you requested.

If you want to get the mapping from class names to label indices, do:

train_generator.class_indices
# {'class0': 0, 'class1': 1}

You can check what is going on with

plt.imshow(train_generator[0][0][0])

3 Extract all resized images from the Generator

Now you are ready to extract the images from the generator:

def extract_images(generator, sample_count):
    images = np.zeros(shape=(sample_count, h, w, 3))
    labels = np.zeros(shape=(sample_count))
    i = 0  # number of samples copied so far
    for images_batch, labels_batch in generator:  # we are looping over batches
        # copy only as many samples as still fit, since a batch may not match the space left
        n = min(len(images_batch), sample_count - i)
        images[i : i + n] = images_batch[:n]
        labels[i : i + n] = labels_batch[:n]
        i += n
        if i >= sample_count:
            # we must break once enough images have been copied, because generators yield indefinitely in a loop
            break
    return images, labels

images, labels = extract_images(train_generator, N_images)

print(labels[0])
plt.imshow(images[0])

Now you have your images all at the same size in images, and their corresponding labels in labels, which you can then feed into any scikit-learn classifier of your choice.
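One detail to watch: scikit-learn estimators expect a 2-D array of shape (n_samples, n_features), so the 4-D image array has to be flattened first, much like the reshape call in the question. A minimal sketch follows, using small random arrays as stand-ins for the extracted images and labels; the stand-in data and the helper name flatten_images are assumptions for illustration.

```python
import numpy as np
from sklearn import svm


def flatten_images(images):
    """Reshape (n_samples, height, width, channels) into (n_samples, n_features)."""
    return images.reshape((len(images), -1))


# Stand-in data (assumption): 8 random 15x15 RGB "images" with two classes.
# With the generator above, the real array would have shape (N_images, h, w, 3).
rng = np.random.default_rng(0)
images = rng.random((8, 15, 15, 3))
labels = np.array([0.0, 1.0] * 4)

data = flatten_images(images)
model = svm.SVC()
# Cast the float labels to integers so they read as class ids
model.fit(data, labels.astype(int))
predictions = model.predict(data)
```

The same two calls, flatten_images followed by fit, should apply unchanged to the arrays returned by extract_images.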

Answer 2:

It's difficult because of the math operations behind the scenes (the details are out of scope). Even if you managed to do so, say by building your own algorithm, you still would not get the desired result. I had this issue once with faces of different sizes. Maybe this piece of code gives you a starting point.

import os

from PIL import Image
import face_recognition

def face_detected(file_address=None, prefix='detect_'):
    """Save a cropped thumbnail of the first detected face; return True if a face was found."""
    if file_address is None:
        raise FileNotFoundError('File address required')
    image = face_recognition.load_image_file(file_address)
    face_locations = face_recognition.face_locations(image)

    # add the prefix to the file name only, so paths that include a directory still work
    directory, file_name = os.path.split(file_address)
    output_path = os.path.join(directory, prefix + file_name)

    if face_locations:
        # each location is a (top, right, bottom, left) tuple; use the first face
        top, right, bottom, left = face_locations[0]
        # pad the box by half the face size on every side
        UP = int(top - (bottom - top) / 2)
        DOWN = int(bottom + (bottom - top) / 2)
        LEFT = int(left - (right - left) / 2)
        RIGHT = int(right + (right - left) / 2)

        # adjust the width so that the box is square
        delta = (DOWN - UP) - (RIGHT - LEFT)
        LEFT -= int(delta / 2)
        RIGHT += int(delta / 2)

        # clamp to the image bounds; negative indices would wrap around when slicing
        UP, LEFT = max(UP, 0), max(LEFT, 0)
        pil_image = Image.fromarray(image[UP:DOWN, LEFT:RIGHT, :])
        # Image.LANCZOS replaces Image.ANTIALIAS, which was removed in Pillow 10
        pil_image.thumbnail((50, 50), Image.LANCZOS)
        pil_image.save(output_path)

        return True

    # no face found: save a downscaled copy of the whole image
    pil_image = Image.fromarray(image)
    pil_image.thumbnail((200, 200), Image.LANCZOS)
    pil_image.save(output_path)
    return False

Note: I wrote this a long time ago, so it may not follow good practice.
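One caveat about the code above: Image.thumbnail preserves the aspect ratio and never enlarges an image, so the saved files are not guaranteed to share a single size. For a classifier that needs equally sized inputs, a minimal sketch using Pillow's convert and resize is shown below. The helper name load_fixed_size and the 50x50 default are assumptions for illustration, not part of the code above.

```python
import numpy as np
from PIL import Image


def load_fixed_size(path, size=(50, 50)):
    """Load any image Pillow can read as a (height, width, 3) uint8 array."""
    with Image.open(path) as img:
        # convert('RGB') drops alpha and palette modes, so JPG and PNG
        # files end up with the same number of channels
        rgb = img.convert('RGB')
        # resize, unlike thumbnail, always returns exactly the requested
        # (width, height), ignoring the original aspect ratio
        resized = rgb.resize(size, Image.LANCZOS)
        return np.asarray(resized)
```

Arrays loaded this way all have the same shape, so np.stack can combine them into one (n_samples, height, width, 3) array before flattening.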

Source: https://stackoverflow.com/questions/56718952/how-can-i-classify-different-images-with-various-sizes-and-formats-in-scikit-lea

Tags

image