What algorithm is behind the Gimp's “Color to Alpha” feature?

南楼画角, posted on 2019-12-03 04:13:31

You need a mechanism for comparing the similarity of colors. There are a variety of color spaces in which you can do this, and RGB is often not the best one. You could use HSV, YCbCr, or some other luma/chroma space; a distance in one of those spaces will often give you a better answer than a Euclidean distance in RGB. Once you have a distance, you can divide it by the maximum possible distance to get a fraction between 0 and 1. As one possibility, that fraction can serve as the alpha: a pixel identical to the chosen color gets alpha 0 (fully transparent), and the most distant color stays fully opaque.
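As a rough illustration of the distance idea, here is a minimal C# sketch. It assumes YCbCr with BT.601 coefficients as the comparison space and normalizes by the largest distance the target color can have to any corner of the RGB cube. The class and method names are made up for this example, and the mapping used is "alpha 0 for the target color itself, alpha 1 for the farthest color".

```csharp
using System;

static class ColorDistanceSketch
{
    // RGB -> YCbCr with BT.601 coefficients. The usual +128 chroma offsets
    // are omitted because they cancel out when two colors are subtracted.
    static double[] ToYCbCr(double r, double g, double b)
    {
        return new[]
        {
             0.299    * r + 0.587    * g + 0.114    * b,
            -0.168736 * r - 0.331264 * g + 0.5      * b,
             0.5      * r - 0.418688 * g - 0.081312 * b
        };
    }

    static double Distance(double[] x, double[] y)
    {
        double d0 = x[0] - y[0], d1 = x[1] - y[1], d2 = x[2] - y[2];
        return Math.Sqrt(d0 * d0 + d1 * d1 + d2 * d2);
    }

    // Returns an alpha in [0, 1]: 0 for the target color itself,
    // 1 for the color that is farthest from the target.
    public static double AlphaFromDistance(byte r, byte g, byte b,
                                           byte tr, byte tg, byte tb)
    {
        double[] target = ToYCbCr(tr, tg, tb);

        // The transform is linear, so the farthest color from the target
        // is always one of the eight corners of the RGB cube.
        double max = 0.0;
        for (int corner = 0; corner < 8; corner++)
        {
            double cr = (corner & 1) != 0 ? 255.0 : 0.0;
            double cg = (corner & 2) != 0 ? 255.0 : 0.0;
            double cb = (corner & 4) != 0 ? 255.0 : 0.0;
            max = Math.Max(max, Distance(ToYCbCr(cr, cg, cb), target));
        }

        return Distance(ToYCbCr(r, g, b), target) / max;
    }
}
```

With a white target, for instance, `AlphaFromDistance(255, 255, 255, 255, 255, 255)` is 0, and every other color gets a value above 0 and at most 1.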

If you want to know how the GIMP does it, you can look at the source. For example, here's one recent code change to that plug-in.

I took a look at the source code, and the meat of it is the colortoalpha function. The parameters *a1 to *a4 are the input/output red, green, blue and alpha, respectively, and c1 to c3 are the components of the color to make transparent.

When you're combining two colors c1 and c2 with a specific alpha a (0 ≤ a ≤ 1), the result is

y = a * c1 + (1-a) * c2

Here we're doing the reverse operation: we know the end result y and the background color c2, and want to figure out c1 and a. Since this is an underdetermined equation, there are infinitely many solutions. However, the ranges 0 ≤ c1 ≤ 255 and 0 ≤ a ≤ 1 put bounds on the solution.

The way the GIMP plug-in works is that for each pixel it minimizes the alpha value (i.e. maximizes transparency). As a consequence, for each resulting pixel that isn't completely transparent (i.e. was not exactly the background color), at least one of the RGB components is either 0 or 255.
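To see where the minimal alpha comes from, here is a short derivation for a single channel. It uses the symbols of the blend equation (y is the observed value, c2 the background, c1 the unknown foreground, a the unknown alpha) and assumes channel values run from 0 to 255.

```latex
\begin{align*}
y &= a\,c_1 + (1-a)\,c_2
  \quad\Longrightarrow\quad
  c_1 = c_2 + \frac{y - c_2}{a} \\[4pt]
0 \le c_1 \le 255
  &\quad\Longrightarrow\quad
  a \ge
  \begin{cases}
    \dfrac{y - c_2}{255 - c_2} & \text{if } y > c_2 \\[8pt]
    \dfrac{c_2 - y}{c_2}       & \text{if } y < c_2 \\[8pt]
    0                          & \text{if } y = c_2
  \end{cases}
\end{align*}
```

Each channel gives its own lower bound, and because all three channels share one alpha, the pixel's alpha is the largest of the three bounds. The channel that set the bound ends up at exactly 0 or 255.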

This produces an image that, when overlaid on top of the specified color, reproduces the original image (in the absence of rounding errors) and has maximum transparency for each pixel.
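The round trip can be checked with a small sketch. This is not the GIMP code, just a minimal C# version of the idea with channels as doubles in the 0 to 1 range; the class and method names are invented for the example.

```csharp
using System;

static class RoundTripSketch
{
    // Splits pixel p (channels in 0..1) into a foreground color and the
    // smallest alpha that reproduces p over background bg.
    public static (double[] Color, double Alpha) ColorToAlpha(double[] p, double[] bg)
    {
        double alpha = 0.0;
        for (int i = 0; i < 3; i++)
        {
            double a = 0.0;
            if (p[i] > bg[i]) a = (p[i] - bg[i]) / (1.0 - bg[i]);
            else if (p[i] < bg[i]) a = (bg[i] - p[i]) / bg[i];
            alpha = Math.Max(alpha, a);
        }

        // A pixel equal to the background stays at alpha 0 with a zeroed color.
        var color = new double[3];
        if (alpha > 0.0)
        {
            for (int i = 0; i < 3; i++)
                color[i] = (p[i] - bg[i]) / alpha + bg[i];
        }
        return (color, alpha);
    }

    // Standard "over" blend of a foreground onto an opaque background.
    public static double[] Composite(double[] fg, double alpha, double[] bg)
    {
        var y = new double[3];
        for (int i = 0; i < 3; i++)
            y[i] = alpha * fg[i] + (1.0 - alpha) * bg[i];
        return y;
    }
}
```

For example, the pixel (0.5, 0.75, 1.0) against a white background (1, 1, 1) comes out as color (0, 0.5, 1) with alpha 0.5, and compositing that over white gives back (0.5, 0.75, 1.0). Note that the red channel, which needed the largest alpha, lands exactly on 0.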

It's worth noting that the whole process is done in the RGB color space, but could be performed in others as well, as long as the combining operation is done in the same color space.

Harry

So I looked into the GIMP source code... ew! I made it generic and readable. It's still quite fast, though. For the math explanation see Sampo's answer. Here's a C# implementation (easily convertible to C / C++):

static class PixelShaders {

    /// <summary>
    /// Generic color space color to alpha.
    /// </summary>
    /// <param name="pA">Pixel alpha.</param>
    /// <param name="p1">Pixel 1st channel.</param>
    /// <param name="p2">Pixel 2nd channel.</param>
    /// <param name="p3">Pixel 3rd channel.</param>
    /// <param name="r1">Reference 1st channel.</param>
    /// <param name="r2">Reference 2nd channel.</param>
    /// <param name="r3">Reference 3rd channel.</param>
    /// <param name="mA">Maximum alpha value.</param>
    /// <param name="mX">Maximum channel value.</param>
    public static void GColorToAlpha(ref double pA, ref double p1, ref double p2, ref double p3, double r1, double r2, double r3, double mA = 1.0, double mX = 1.0) {
        double aA, a1, a2, a3;
        // a1 calculation: minimal alpha giving r1 from p1
        if (p1 > r1) a1 = mA * (p1 - r1) / (mX - r1);
        else if (p1 < r1) a1 = mA * (r1 - p1) / r1;
        else a1 = 0.0;
        // a2 calculation: minimal alpha giving r2 from p2
        if (p2 > r2) a2 = mA * (p2 - r2) / (mX - r2);
        else if (p2 < r2) a2 = mA * (r2 - p2) / r2;
        else a2 = 0.0;
        // a3 calculation: minimal alpha giving r3 from p3
        if (p3 > r3) a3 = mA * (p3 - r3) / (mX - r3);
        else if (p3 < r3) a3 = mA * (r3 - p3) / r3;
        else a3 = 0.0;
        // aA calculation: max(a1, a2, a3)
        aA = a1;
        if (a2 > aA) aA = a2;
        if (a3 > aA) aA = a3;
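        // apply aA to pixel (the earlier test `aA >= mA / mX` zeroed every
        // partially transparent pixel when mA == mX, e.g. with the defaults):
        if (aA > 0.0) {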
            pA = aA * pA / mA;
            p1 = mA * (p1 - r1) / aA + r1;
            p2 = mA * (p2 - r2) / aA + r2;
            p3 = mA * (p3 - r3) / aA + r3;
        } else {
            pA = 0;
            p1 = 0;
            p2 = 0;
            p3 = 0;
        }
    }

}

GIMP's implementation (here) works in the RGB color space, with alpha as a float in the 0 to 1 range and R, G, B as floats from 0 to 255.

The RGB implementation fails spectacularly when the image has JPEG artifacts, because they are barely perceptible as color deviations but are quite significant as absolute R, G, B deviations. Using the LAB color space should do the trick in that case.

If you just want to remove a solid background from an image, the color to alpha algorithm is not the best option. I got nice results by calculating the distance of each pixel to the background color in the LAB color space and applying that distance to the alpha channel of the original image. The main difference from color to alpha is that the hue of the pixels is not changed: background removal just sets alpha (opacity) to the color space difference. It works well if the background color does not occur in the foreground image. If it does, either the background cannot be removed, or a BFS algorithm must be used to walk the outer pixels only (something like using the magic wand selection in GIMP, then removing the selection).
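A minimal sketch of the LAB-distance approach might look like the following. The sRGB to L*a*b* conversion uses the standard D65 formulas; how the distance is scaled into an alpha is not spelled out above, so the `opaqueDistance` parameter (distances at or beyond it stay fully opaque) is an assumption of this sketch, as are the class and method names.

```csharp
using System;

static class LabBackgroundSketch
{
    // sRGB (0..255) -> CIE L*a*b*, D65 white point.
    static double[] ToLab(byte r, byte g, byte b)
    {
        double lr = Linear(r), lg = Linear(g), lb = Linear(b);
        double x = (0.4124564 * lr + 0.3575761 * lg + 0.1804375 * lb) / 0.95047;
        double y =  0.2126729 * lr + 0.7151522 * lg + 0.0721750 * lb;
        double z = (0.0193339 * lr + 0.1191920 * lg + 0.9503041 * lb) / 1.08883;
        double fx = F(x), fy = F(y), fz = F(z);
        return new[] { 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz) };
    }

    // Undo the sRGB transfer curve.
    static double Linear(byte c)
    {
        double v = c / 255.0;
        return v <= 0.04045 ? v / 12.92 : Math.Pow((v + 0.055) / 1.055, 2.4);
    }

    static double F(double t)
    {
        const double delta = 6.0 / 29.0;
        return t > delta * delta * delta
            ? Math.Pow(t, 1.0 / 3.0)
            : t / (3.0 * delta * delta) + 4.0 / 29.0;
    }

    // New alpha for a pixel: the Lab distance to the background color,
    // scaled so that distances >= opaqueDistance keep the original alpha.
    // opaqueDistance is a tuning knob assumed for this sketch.
    // The pixel's R, G, B are left untouched, so its hue does not change.
    public static byte AlphaFor(byte r, byte g, byte b, byte a,
                                byte bgR, byte bgG, byte bgB,
                                double opaqueDistance = 50.0)
    {
        double[] p = ToLab(r, g, b), q = ToLab(bgR, bgG, bgB);
        double dL = p[0] - q[0], dA = p[1] - q[1], dB = p[2] - q[2];
        double dist = Math.Sqrt(dL * dL + dA * dA + dB * dB);
        double scale = Math.Min(1.0, dist / opaqueDistance);
        return (byte)Math.Round(a * scale);
    }
}
```

A pixel equal to the background gets alpha 0, while a black pixel on a white background (Lab distance of about 100) keeps its original alpha.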

The background cannot be removed automatically if the foreground image has both holes and pixels in a color similar to the background color. Such images require some manual processing.
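The "walk the outer pixels only" idea can be sketched as a breadth-first flood fill seeded from the image border. This is only an illustration under assumptions: it works on a precomputed boolean mask (true where a pixel is close enough to the background color), uses 4-connectivity, and all names are invented for the example.

```csharp
using System.Collections.Generic;

static class BorderFillSketch
{
    // similar[y, x] is true where a pixel is close enough to the background
    // color. Returns a mask of the similar pixels connected to the image
    // border (4-connectivity); only those would be made transparent.
    public static bool[,] ReachableFromBorder(bool[,] similar)
    {
        int h = similar.GetLength(0), w = similar.GetLength(1);
        var reached = new bool[h, w];
        var queue = new Queue<(int Y, int X)>();

        // Marks and enqueues a pixel if it is in bounds, similar to the
        // background, and not visited yet.
        void Visit(int y, int x)
        {
            if (y < 0 || y >= h || x < 0 || x >= w) return;
            if (!similar[y, x] || reached[y, x]) return;
            reached[y, x] = true;
            queue.Enqueue((y, x));
        }

        // Seed the search with every border pixel.
        for (int x = 0; x < w; x++) { Visit(0, x); Visit(h - 1, x); }
        for (int y = 0; y < h; y++) { Visit(y, 0); Visit(y, w - 1); }

        while (queue.Count > 0)
        {
            var (y, x) = queue.Dequeue();
            Visit(y - 1, x);
            Visit(y + 1, x);
            Visit(y, x - 1);
            Visit(y, x + 1);
        }
        return reached;
    }
}
```

Background-colored pixels enclosed by the foreground are never reached, which protects foreground details in the background color but also leaves real holes untouched; that is the case that still needs manual work.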

I translated the colortoalpha method from GIMP to C# as best I could. The problem is that a library like ImageSharp stores the RGBA values as one byte per channel, so some precision is lost in the conversions, but I tried to retain as much as I could. This uses ImageSharp for image mutation. ImageSharp is fully managed, so it works across platforms. It's also fast: the entire method runs in under 10 ms.

Here is the code for the C# implementation:

// Needs: using System; using SixLabors.ImageSharp; using SixLabors.ImageSharp.PixelFormats;
// As an extension method, this must be declared inside a static class.
public static void ColorToAlpha(this Image<Rgba32> image, Rgba32 color)
{
    for (int j = 0; j < image.Height; j++)
    {
        // GetPixelRowSpan is the row accessor this answer was written against;
        // newer ImageSharp versions may expose row access differently.
        var span = image.GetPixelRowSpan(j);

        for (int i = 0; i < span.Length; i++)
        {
            ref Rgba32 pixel = ref span[i];

            // Minimal alpha (0..1) that each channel would need on its own.
            double aR = ChannelAlpha(pixel.R, color.R);
            double aG = ChannelAlpha(pixel.G, color.G);
            double aB = ChannelAlpha(pixel.B, color.B);

            // All channels share one alpha, so take the largest requirement.
            double alpha = Math.Max(aR, Math.Max(aG, aB));

            if (alpha * 255.0 < 1.0)
            {
                // Pixel is (almost) exactly the target color: fully transparent.
                // Use continue, not return, so the remaining pixels are processed.
                pixel.A = 0;
                continue;
            }

            // Un-blend using the pixel's channel values (not the per-channel alphas).
            pixel.R = Unblend(pixel.R, color.R, alpha);
            pixel.G = Unblend(pixel.G, color.G, alpha);
            pixel.B = Unblend(pixel.B, color.B, alpha);

            // The new alpha scales the pixel's original alpha.
            pixel.A = (byte)Math.Round(alpha * pixel.A);
        }
    }
}

private static double ChannelAlpha(byte p, byte c)
{
    // The double operands avoid integer division on byte values.
    if (p > c) return (p - c) / (255.0 - c);
    if (p < c) return (c - p) / (double)c;
    return 0.0;
}

private static byte Unblend(byte p, byte c, double alpha)
{
    double v = (p - c) / alpha + c;
    // Clamp to guard against floating-point drift just outside 0..255.
    return (byte)Math.Round(Math.Max(0.0, Math.Min(255.0, v)));
}