Python/PIL affine transformation

Question

This is a basic transform question in PIL. I've tried at least a couple of times in the past few years to implement this correctly and it seems there is something I don't quite get about Image.transform in PIL. I want to implement a similarity transformation (or an affine transformation) where I can clearly state the limits of the image. To make sure my approach works I implemented it in Matlab.

The Matlab implementation is the following:

im = imread('test.jpg');
y = size(im,1);
x = size(im,2);
angle = 45*3.14/180.0;
xextremes = [rot_x(angle,0,0),rot_x(angle,0,y-1),rot_x(angle,x-1,0),rot_x(angle,x-1,y-1)];
yextremes = [rot_y(angle,0,0),rot_y(angle,0,y-1),rot_y(angle,x-1,0),rot_y(angle,x-1,y-1)];
m = [cos(angle) sin(angle) -min(xextremes); -sin(angle) cos(angle) -min(yextremes); 0 0 1];
tform = maketform('affine',m')
round( [max(xextremes)-min(xextremes), max(yextremes)-min(yextremes)])
im = imtransform(im,tform,'bilinear','Size',round([max(xextremes)-min(xextremes), max(yextremes)-min(yextremes)]));
imwrite(im,'output.jpg');

function y = rot_x(angle,ptx,pty),
    y = cos(angle)*ptx + sin(angle)*pty

function y = rot_y(angle,ptx,pty),
    y = -sin(angle)*ptx + cos(angle)*pty

This works as expected. This is the input:

and this is the output:

This is the Python/PIL code that implements the same transformation:

from PIL import Image
import math

def rot_x(angle,ptx,pty):
    return math.cos(angle)*ptx + math.sin(angle)*pty

def rot_y(angle,ptx,pty):
    return -math.sin(angle)*ptx + math.cos(angle)*pty

angle = math.radians(45)
im = Image.open('test.jpg')
(x,y) = im.size
xextremes = [rot_x(angle,0,0),rot_x(angle,0,y-1),rot_x(angle,x-1,0),rot_x(angle,x-1,y-1)]
yextremes = [rot_y(angle,0,0),rot_y(angle,0,y-1),rot_y(angle,x-1,0),rot_y(angle,x-1,y-1)]
mnx = min(xextremes)
mxx = max(xextremes)
mny = min(yextremes)
mxy = max(yextremes)
im = im.transform((int(round(mxx-mnx)),int(round((mxy-mny)))),Image.AFFINE,(math.cos(angle),math.sin(angle),-mnx,-math.sin(angle),math.cos(angle),-mny),resample=Image.BILINEAR)
im.save('outputpython.jpg')

and this is the output from Python:

I've tried this with several versions of Python and PIL on multiple OSs through the years, and the result is always essentially the same.

This is the simplest possible case that illustrates the problem. I understand that if a rotation were all I wanted, I could use the im.rotate call, but I want to shear and scale too; this is just an example to illustrate the problem. I would like to get the same output for all affine transformations, and to get this right.

EDIT:

If I change the transform line to this:

im = im.transform((int(round(mxx-mnx)),int(round((mxy-mny)))),Image.AFFINE,(math.cos(angle),math.sin(angle),0,-math.sin(angle),math.cos(angle),0),resample=Image.BILINEAR)

this is the output I get:

EDIT #2

I rotated by -45 degrees and changed the offset to -0.5*mnx and -0.5*mny and obtained this:

Answer 1:

OK! So I've been working on understanding this all weekend and I think I have an answer that satisfies me. Thank you all for your comments and suggestions!

I started by looking at this:

affine transform in PIL python?

While I see that the author can make arbitrary similarity transformations, it does not explain why my code was not working, nor does it explain the spatial layout of the image that we need to transform, nor does it provide a linear-algebraic solution to my problem.

But I do see from his code that he's dividing the rotation part of the matrix (a, b, d and e) by the scale, which struck me as odd. I went back to read the PIL documentation, which I quote:

"im.transform(size, AFFINE, data, filter) => image

Applies an affine transform to the image, and places the result in a new image with the given size.

Data is a 6-tuple (a, b, c, d, e, f) which contain the first two rows from an affine transform matrix. For each pixel (x, y) in the output image, the new value is taken from a position (a x + b y + c, d x + e y + f) in the input image, rounded to nearest pixel.

This function can be used to scale, translate, rotate, and shear the original image."

So the parameters (a, b, c, d, e, f) do form a transform matrix, but it is the one that maps (x, y) in the destination image to (a x + b y + c, d x + e y + f) in the source image. In other words, they are not the parameters of the transform you want to apply, but those of its inverse. That is:

weird
different than in Matlab
but now, fortunately, fully understood by me
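
To make the direction of the mapping concrete, here is a minimal pure-Python sketch of the lookup rule quoted above. The helper name `source_position` is made up for illustration and is not part of PIL; it only evaluates the formula from the documentation.

```python
def source_position(coeffs, x, y):
    """Return the input-image position sampled for output pixel (x, y).

    Mirrors the rule quoted from the PIL documentation:
    (a*x + b*y + c, d*x + e*y + f).
    """
    a, b, c, d, e, f = coeffs
    return (a * x + b * y + c, d * x + e * y + f)


# Goal: move the picture 10 pixels to the RIGHT in the output.
# The forward transform would be x_out = x_in + 10, but PIL wants the
# inverse, x_in = x_out - 10, so the translation coefficient c is -10.
shift_right_10 = (1, 0, -10, 0, 1, 0)

# Output pixel (10, 0) is filled from input pixel (0, 0).
sampled = source_position(shift_right_10, 10, 0)
print(sampled)  # (0, 0)
```

Passing the forward coefficients instead (c = +10) would shift the content to the left, which is the same kind of surprise as in the question.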

I'm attaching my code:

from PIL import Image
import math
import numpy as np

def rot_x(angle, ptx, pty):
    return math.cos(angle)*ptx + math.sin(angle)*pty

def rot_y(angle, ptx, pty):
    return -math.sin(angle)*ptx + math.cos(angle)*pty

angle = math.radians(45)
im = Image.open('test.jpg')
(x, y) = im.size
# Rotate the four corners to find the bounding box of the rotated image.
xextremes = [rot_x(angle,0,0),rot_x(angle,0,y-1),rot_x(angle,x-1,0),rot_x(angle,x-1,y-1)]
yextremes = [rot_y(angle,0,0),rot_y(angle,0,y-1),rot_y(angle,x-1,0),rot_y(angle,x-1,y-1)]
mnx = min(xextremes)
mxx = max(xextremes)
mny = min(yextremes)
mxy = max(yextremes)
print(mnx, mny)
# Forward transform: rotate, then shift so the bounding box starts at (0, 0).
T = np.array([[math.cos(angle),math.sin(angle),-mnx],[-math.sin(angle),math.cos(angle),-mny],[0,0,1]])
# PIL expects the inverse mapping (output pixel -> input position).
Tinv = np.linalg.inv(T)
print(Tinv)
Tinvtuple = (Tinv[0,0], Tinv[0,1], Tinv[0,2], Tinv[1,0], Tinv[1,1], Tinv[1,2])
print(Tinvtuple)
im = im.transform((int(round(mxx-mnx)),int(round((mxy-mny)))),Image.AFFINE,Tinvtuple,resample=Image.BILINEAR)
im.save('outputpython2.jpg')

and the output from python:

Let me state the answer to this question again in a final summary:

PIL requires the inverse of the affine transformation you want to apply.
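
If pulling in numpy just for a 3x3 inverse feels heavy, the inverse of an affine 6-tuple can also be written out by hand. The following is only a sketch: `invert_affine` and `apply_affine` are hypothetical helper names, not PIL functions, and the sample coefficients are arbitrary.

```python
import math


def invert_affine(coeffs):
    """Invert the affine map given by the first two rows (a, b, c, d, e, f)."""
    a, b, c, d, e, f = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    # Inverse of the 2x2 linear part.
    inv_a, inv_b = e / det, -b / det
    inv_d, inv_e = -d / det, a / det
    # Inverse translation: minus the inverse linear part applied to (c, f).
    inv_c = -(inv_a * c + inv_b * f)
    inv_f = -(inv_d * c + inv_e * f)
    return (inv_a, inv_b, inv_c, inv_d, inv_e, inv_f)


def apply_affine(coeffs, x, y):
    """Map the point (x, y) through the affine 6-tuple."""
    a, b, c, d, e, f = coeffs
    return (a * x + b * y + c, d * x + e * y + f)


# Arbitrary forward transform: a 45 degree rotation plus an offset.
angle = math.radians(45)
forward = (math.cos(angle), math.sin(angle), 5.0,
           -math.sin(angle), math.cos(angle), 70.0)
inverse = invert_affine(forward)

# Round trip: forward, then inverse, should return the starting point.
x_out, y_out = apply_affine(forward, 12.0, 34.0)
x_back, y_back = apply_affine(inverse, x_out, y_out)
print(x_back, y_back)
```

The tuple returned by `invert_affine` is what would go into the data argument of im.transform with Image.AFFINE.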

Answer 2:

I wanted to expand a bit on the answers by carlosdc and Ruediger Jungbeck, to present a more practical python code solution with a bit of explanation.

First, it is absolutely true that PIL uses inverse affine transformations, as stated in carlosdc's answer. However, there is no need to use linear algebra to compute the inverse transformation from the original transformation—instead, it can easily be expressed directly. I'll use scaling and rotating an image about its center for the example, as in the code linked to in Ruediger Jungbeck's answer, but it's fairly straightforward to extend this to do e.g. shearing as well.

Before approaching how to express the inverse affine transformation for scaling and rotating, consider how we'd find the original transformation. As hinted at in Ruediger Jungbeck's answer, the transformation for the combined operation of scaling and rotating is found as the composition of the fundamental operators for scaling an image about the origin and rotating an image about the origin.

However, since we want to scale and rotate the image about its own center, and the origin (0, 0) is defined by PIL to be the upper left corner of the image, we first need to translate the image such that its center coincides with the origin. After applying the scaling and rotation, we also need to translate the image back in such a way that the new center of the image (it might not be the same as the old center after scaling and rotating) ends up in the center of the image canvas.

So the original "standard" affine transformation we're after will be the composition of the following fundamental operators:

1. Find the current center $(c_x, c_y)$ of the image, and translate the image by $(-c_x, -c_y)$, so the center of the image is at the origin $(0, 0)$.
2. Scale the image about the origin by some scale factor $(s_x, s_y)$.
3. Rotate the image about the origin by some angle $\theta$.
4. Find the new center $(t_x, t_y)$ of the image, and translate the image by $(t_x, t_y)$ so the new center will end up in the center of the image canvas.

To find the transformation we're after, we first need to know the transformation matrices of the fundamental operators, which are as follows:

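Translation by $(x, y)$:

$$T(x, y) = \begin{bmatrix} 1 & 0 & x \\ 0 & 1 & y \\ 0 & 0 & 1 \end{bmatrix}$$

Scaling by $(s_x, s_y)$:

$$S(s_x, s_y) = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Rotation by $\theta$:

$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$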

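Then, our composite transformation can be expressed as:

$$M = T(t_x, t_y)\, R(\theta)\, S(s_x, s_y)\, T(-c_x, -c_y)$$

which is equal to

$$M = \begin{bmatrix} s_x\cos\theta & -s_y\sin\theta & u \\ s_x\sin\theta & s_y\cos\theta & v \\ 0 & 0 & 1 \end{bmatrix}$$

where

$$u = t_x - c_x s_x \cos\theta + c_y s_y \sin\theta, \qquad v = t_y - c_x s_x \sin\theta - c_y s_y \cos\theta$$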
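
As a sanity check on the composition, here is a small sketch that applies the four steps one at a time to a point and compares the result with a closed-form matrix. All names are chosen for illustration (`cx, cy` for the old center, `sx, sy` for the scale factors, `theta` for the angle, `tx, ty` for the new center), the numeric values are arbitrary, and the rotation is assumed to use the standard matrix [[cos, -sin], [sin, cos]].

```python
import math


def forward_steps(x, y, cx, cy, sx, sy, theta, tx, ty):
    """Apply the four fundamental operations one after another to (x, y)."""
    # 1. Move the image center to the origin.
    x, y = x - cx, y - cy
    # 2. Scale about the origin.
    x, y = x * sx, y * sy
    # 3. Rotate about the origin (standard rotation matrix; assumed convention).
    x, y = (math.cos(theta) * x - math.sin(theta) * y,
            math.sin(theta) * x + math.cos(theta) * y)
    # 4. Move the origin to the center of the new canvas.
    return x + tx, y + ty


def forward_matrix(cx, cy, sx, sy, theta, tx, ty):
    """Closed form of the same composite, as the first two matrix rows."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (sx * cos_t, -sy * sin_t, tx - cx * sx * cos_t + cy * sy * sin_t,
            sx * sin_t, sy * cos_t, ty - cx * sx * sin_t - cy * sy * cos_t)


# Arbitrary illustrative values, not taken from any real image.
params = dict(cx=50.0, cy=40.0, sx=0.8, sy=1.2,
              theta=math.radians(10), tx=60.0, ty=55.0)
px, py = 13.0, 27.0

stepwise = forward_steps(px, py, **params)
a, b, c, d, e, f = forward_matrix(**params)
closed_form = (a * px + b * py + c, d * px + e * py + f)

# The old center must land exactly on the new center.
center_image = forward_steps(params["cx"], params["cy"], **params)
print(stepwise, closed_form, center_image)
```

Both routes should agree up to floating-point rounding, and the old center maps to (tx, ty) by construction.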

Now, to find the inverse of this composite affine transformation, we just need to calculate the composition of the inverse of each fundamental operator in reverse order. That is, we want to

1. Translate the image by $(-t_x, -t_y)$.
2. Rotate the image about the origin by $-\theta$.
3. Scale the image about the origin by $(1/s_x, 1/s_y)$.
4. Translate the image by $(c_x, c_y)$.

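This results in a transformation matrix

$$M^{-1} = T(c_x, c_y)\, S(1/s_x, 1/s_y)\, R(-\theta)\, T(-t_x, -t_y) = \begin{bmatrix} \frac{\cos\theta}{s_x} & \frac{\sin\theta}{s_x} & u' \\ -\frac{\sin\theta}{s_y} & \frac{\cos\theta}{s_y} & v' \\ 0 & 0 & 1 \end{bmatrix}$$

where

$$u' = c_x - \frac{t_x\cos\theta + t_y\sin\theta}{s_x}, \qquad v' = c_y + \frac{t_x\sin\theta - t_y\cos\theta}{s_y}$$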
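
The inverse can be checked numerically as well. The sketch below builds the forward and inverse composites from the fundamental operators, confirms that they cancel, and compares the inverse with coefficients written in the same form as the a to f assignments in the code further down. The helper names (`compose`, `translation`, `scaling`, `rotation`) and the sample values are illustrative assumptions, and the rotation again uses the standard matrix convention.

```python
import math


def compose(m, n):
    """Return the 6-tuple for 'apply n first, then m'."""
    a, b, c, d, e, f = m
    p, q, r, s, t, u = n
    return (a * p + b * s, a * q + b * t, a * r + b * u + c,
            d * p + e * s, d * q + e * t, d * r + e * u + f)


def translation(x, y):
    return (1.0, 0.0, x, 0.0, 1.0, y)


def scaling(sx, sy):
    return (sx, 0.0, 0.0, 0.0, sy, 0.0)


def rotation(theta):
    # Standard rotation matrix; an assumed convention for this sketch.
    return (math.cos(theta), -math.sin(theta), 0.0,
            math.sin(theta), math.cos(theta), 0.0)


# Arbitrary illustrative values.
cx, cy, tx, ty = 50.0, 40.0, 60.0, 55.0
sx, sy, theta = 0.8, 1.2, math.radians(-10)

forward = compose(translation(tx, ty),
                  compose(rotation(theta),
                          compose(scaling(sx, sy), translation(-cx, -cy))))
inverse = compose(translation(cx, cy),
                  compose(scaling(1 / sx, 1 / sy),
                          compose(rotation(-theta), translation(-tx, -ty))))

# Closed-form coefficients, written like the a..f assignments in the code.
a = math.cos(theta) / sx
b = math.sin(theta) / sx
c = cx - tx * a - ty * b
d = -math.sin(theta) / sy
e = math.cos(theta) / sy
f = cy - tx * d - ty * e
closed_form = (a, b, c, d, e, f)

# Forward followed by inverse should be the identity map.
identity = compose(inverse, forward)
print(identity)
```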

This inverse matrix is exactly the same as the transformation used in the code linked to in Ruediger Jungbeck's answer. It can be made more convenient by reusing the same technique that carlosdc used in their post for calculating $(t_x, t_y)$: applying the rotation to all four corners of the image, and then calculating the distance between the minimum and maximum X and Y values. However, since the image is rotated about its own center, there's no need to rotate all four corners, since each pair of oppositely facing corners is rotated "symmetrically".
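
To illustrate the symmetry argument, this sketch compares the extent obtained by rotating all four corners of a centered rectangle with a shortcut that only uses absolute values of the sine and cosine terms. The function names and the sample dimensions are assumptions made for the example.

```python
import math


def extent_from_corners(w, h, theta):
    """Bounding-box size found by rotating all four corners about the center."""
    corners = [(-w / 2, -h / 2), (w / 2, -h / 2),
               (-w / 2, h / 2), (w / 2, h / 2)]
    xs = [math.cos(theta) * x - math.sin(theta) * y for x, y in corners]
    ys = [math.sin(theta) * x + math.cos(theta) * y for x, y in corners]
    return max(xs) - min(xs), max(ys) - min(ys)


def extent_shortcut(w, h, theta):
    """Same size without touching the corners, using the symmetry about the center."""
    return (abs(math.cos(theta) * w) + abs(math.sin(theta) * h),
            abs(math.sin(theta) * w) + abs(math.cos(theta) * h))


# Illustrative values: a 640x480 image scaled by (0.8, 1.2), rotated by -10 degrees.
w, h, theta = 640 * 0.8, 480 * 1.2, math.radians(-10)
by_corners = extent_from_corners(w, h, theta)
by_shortcut = extent_shortcut(w, h, theta)
print(by_corners, by_shortcut)
```

The two results should match up to rounding; taking the ceiling of each value gives an integer canvas size large enough to hold the transformed image.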

Here is a rewritten version of carlosdc's code that has been modified to use the inverse affine transformation directly, and which also adds scaling:

from PIL import Image
import math


def scale_and_rotate_image(im, sx, sy, deg_ccw):
    im_orig = im
    im = Image.new('RGBA', im_orig.size, (255, 255, 255, 255))
    im.paste(im_orig)

    w, h = im.size
    angle = math.radians(-deg_ccw)

    cos_theta = math.cos(angle)
    sin_theta = math.sin(angle)

    scaled_w, scaled_h = w * sx, h * sy

    new_w = int(math.ceil(math.fabs(cos_theta * scaled_w) + math.fabs(sin_theta * scaled_h)))
    new_h = int(math.ceil(math.fabs(sin_theta * scaled_w) + math.fabs(cos_theta * scaled_h)))

    cx = w / 2.
    cy = h / 2.
    tx = new_w / 2.
    ty = new_h / 2.

    a = cos_theta / sx
    b = sin_theta / sx
    c = cx - tx * a - ty * b
    d = -sin_theta / sy
    e = cos_theta / sy
    f = cy - tx * d - ty * e

    return im.transform(
        (new_w, new_h),
        Image.AFFINE,
        (a, b, c, d, e, f),
        resample=Image.BILINEAR
    )


im = Image.open('test.jpg')
im = scale_and_rotate_image(im, 0.8, 1.2, 10)
im.save('outputpython.png')

and this is what the result looks like (scaled with (sx, sy) = (0.8, 1.2), and rotated 10 degrees counter-clockwise):