Any more optimisation I can do for this function?

Question

I have a simple box blur function in a graphics library (for JavaScript/canvas, using ImageData) I'm writing.

I've done a few optimisations to avoid piles of redundant code such as looping through [0..3] for the channels instead of copying the code, and having each surrounding pixel implemented with a single, uncopied line of code, averaging values at the end.

Those were optimisations to cut down on redundant lines of code. Are there any further optimisations I can do of that kind, or, better still, any things I can change that may improve performance of the function itself?

Running this function on a 200x150 image area, with a Core 2 Duo, takes about 450ms on Firefox 3.6, 45ms on Firefox 4 and about 55ms on Chromium 10.

Various notes

expressive.data.get returns an ImageData object
expressive.data.put writes the contents of an ImageData back to a canvas
an ImageData is an object with:
- unsigned long width
- unsigned long height
- Array data, a one-dimensional array in the format r, g, b, a, r, g, b, a ...
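
To make the layout concrete, here is a minimal sketch of how the offset of one channel of one pixel can be computed. `channelIndex` is a hypothetical helper used only for illustration; it is not part of the library.

```javascript
// Hypothetical helper: offset of channel k (0 = r, 1 = g, 2 = b, 3 = a)
// of the pixel at column i, row j, in an image that is w pixels wide.
function channelIndex(w, i, j, k) {
    return 4 * (w * j + i) + k;
}

// A 2x2 image: four pixels, four channels each.
var pixels = [
    10, 11, 12, 13,   20, 21, 22, 23,   // row 0
    30, 31, 32, 33,   40, 41, 42, 43    // row 1
];

// Green channel of the pixel at column 1, row 1.
var green = pixels[channelIndex(2, 1, 1, 1)];
```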

The code

expressive.boxBlur = function(canvas, x, y, w, h) {
    // averaging r, g, b, a for now
    var data = expressive.data.get(canvas, x, y, w, h);
    for (var i = 0; i < w; i++)
        for (var j = 0; j < h; j++)
            for (var k = 0; k < 4; k++) {
                var total = 0, values = 0, temp = 0;
                if (!(i == 0 && j == 0)) {
                    temp = data.data[4 * w * (j - 1) + 4 * (i - 1) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                if (!(i == w - 1 && j == 0)) {
                    temp = data.data[4 * w * (j - 1) + 4 * (i + 1) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                if (!(i == 0 && j == h - 1)) {
                    temp = data.data[4 * w * (j + 1) + 4 * (i - 1) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                if (!(i == w - 1 && j == h - 1)) {
                    temp = data.data[4 * w * (j + 1) + 4 * (i + 1) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                if (!(j == 0)) {
                    temp = data.data[4 * w * (j - 1) + 4 * (i + 0) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                if (!(j == h - 1)) {
                    temp = data.data[4 * w * (j + 1) + 4 * (i + 0) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                if (!(i == 0)) {
                    temp = data.data[4 * w * (j + 0) + 4 * (i - 1) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                if (!(i == w - 1)) {
                    temp = data.data[4 * w * (j + 0) + 4 * (i + 1) + k];
                    if (temp !== undefined) values++, total += temp;
                }
                values++, total += data.data[4 * w * j + 4 * i + k];
                total /= values;
                data.data[4 * w * j + 4 * i + k] = total;
            }
    expressive.data.put(canvas, data, x, y);
};

Answer 1:

Maybe (just maybe) moving the if checks out as far as possible would be an advantage. Let me present some pseudo-code:

I'll just call the code looping over k "inner loop" for simplicity

// do a specialized version of "inner loop" that assumes i == 0
for (var i = 1; i < (w - 1); i++) {
    // do a specialized version of "inner loop" that assumes j == 0 && i != 0 && i != (w - 1)
    for (var j = 1; j < (h - 1); j++) {
        // do a general version of "inner loop" that can assume i != 0 && j != 0 && i != (w - 1) && j != (h - 1)
    }
    // do a specialized version of "inner loop" that assumes j == (h - 1) && i != 0 && i != (w - 1)
}
// do a specialized version of "inner loop" that assumes i == (w - 1)

This would drastically reduce the number of if checks, since the majority of operations would need none of them.
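
The pseudo-code above could be fleshed out roughly as follows. This is only a sketch under some assumptions: the function name `boxBlurSplit` is made up, the pixel data is assumed to be a `Uint8ClampedArray`, and the result is written to a separate output buffer, whereas the code in the question blurs in place. For brevity the border pixels share one bounds-checked helper instead of four specialized loops.

```javascript
// Sketch: interior pixels take a check-free fast path, border pixels a slow path.
// Assumption: reads from `src`, writes to a new buffer (the question blurs in place).
function boxBlurSplit(src, w, h) {
    var dst = new Uint8ClampedArray(src.length);
    var row = 4 * w; // offset between vertically adjacent pixels
    var i, j, k;

    // Slow path: bounds-checked 3x3 average for one border pixel.
    function blurEdge(px, py) {
        for (var ch = 0; ch < 4; ch++) {
            var total = 0, count = 0;
            for (var dy = -1; dy <= 1; dy++) {
                for (var dx = -1; dx <= 1; dx++) {
                    var x = px + dx, y = py + dy;
                    if (x >= 0 && x < w && y >= 0 && y < h) {
                        total += src[row * y + 4 * x + ch];
                        count++;
                    }
                }
            }
            dst[row * py + 4 * px + ch] = total / count;
        }
    }

    // Fast path: every interior pixel has all nine samples, so no checks.
    for (j = 1; j < h - 1; j++) {
        for (i = 1; i < w - 1; i++) {
            for (k = 0; k < 4; k++) {
                var c = row * j + 4 * i + k;
                dst[c] = (src[c - row - 4] + src[c - row] + src[c - row + 4] +
                          src[c - 4]       + src[c]       + src[c + 4] +
                          src[c + row - 4] + src[c + row] + src[c + row + 4]) / 9;
            }
        }
    }

    // Top and bottom rows (corners included).
    for (i = 0; i < w; i++) {
        blurEdge(i, 0);
        if (h > 1) { blurEdge(i, h - 1); }
    }
    // Left and right columns (corners already done).
    for (j = 1; j < h - 1; j++) {
        blurEdge(0, j);
        if (w > 1) { blurEdge(w - 1, j); }
    }
    return dst;
}
```

For a 200x150 area, 198 x 148 of the 30000 pixels go through the fast path, so only about 2% of them pay for the bounds checks.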

Answer 2:

If the only way you use var data is as data.data then you can change:

var data = expressive.data.get(canvas, x, y, w, h);

to:

var data = expressive.data.get(canvas, x, y, w, h).data;

and change every line like:

temp = data.data[4 * w * (j - 1) + 4 * (i - 1) + k];

to:

temp = data[4 * w * (j - 1) + 4 * (i - 1) + k];

and you will save some name lookups.

There may be better ways to optimize it but this is just what I've noticed first.

Update:

Also, if (i != 0 || j != 0) can be faster than if (!(i == 0 && j == 0)) because it avoids the negation. (Both forms short-circuit, so any difference is likely to be small.)

(Make your own experiments with == vs. === and != vs. !== because my quick tests showed the results that seem counter-intuitive to me.)

Also, some of the tests are repeated many times, and some of the ifs are mutually exclusive but are all evaluated anyway because there is no else. You can try refactoring with more nested ifs and more else ifs.
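
For the "make your own experiments" suggestion, a very rough timing harness might look like the sketch below. `timeIt`, `negatedAnd` and `plainOr` are hypothetical names, and timings from a loop this small are noisy and engine-dependent, so treat the numbers as a hint at best.

```javascript
// Hypothetical micro-benchmark helper: calls fn `runs` times and
// returns the elapsed wall-clock time in milliseconds.
function timeIt(fn, runs) {
    var start = Date.now();
    for (var r = 0; r < runs; r++) {
        fn();
    }
    return Date.now() - start;
}

var w = 200, h = 150;

// The form used in the question.
function negatedAnd() {
    var hits = 0;
    for (var i = 0; i < w; i++) {
        for (var j = 0; j < h; j++) {
            if (!(i == 0 && j == 0)) { hits++; }
        }
    }
    return hits;
}

// The equivalent form without the negation (De Morgan), using strict operators.
function plainOr() {
    var hits = 0;
    for (var i = 0; i < w; i++) {
        for (var j = 0; j < h; j++) {
            if (i !== 0 || j !== 0) { hits++; }
        }
    }
    return hits;
}

var msNegated = timeIt(negatedAnd, 100);
var msOr = timeIt(plainOr, 100);
```

Both functions must return the same count; if they do not, the rewritten condition is not equivalent and the timing comparison is meaningless.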

Answer 3:

A minor optimization:

var imgData = expressive.data.get(canvas, x, y, w, h);
var data = imgData.data;

// in your if statements
temp = data[4 * w * (j - 1) + 4 * (i - 1) + k];

expressive.data.put(canvas, imgData, x, y);

You could also perform some minor optimizations in your indices, for example:

4 * w * (j - 1) + 4 * (i - 1) + k // is equal to
4 * (w * (j - 1) + (i - 1)) + k

var jmin1 = w * (j - 1);
var imin1 = i - 1;
// etc., and then use those indices in the right place

Also, put {} after every for-statement you have in your code. The 2 additional characters won't make a big difference. The potential bugs will.
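
As a quick check of the index rewrite, the sketch below evaluates the original expression, the factored one, and a hoisted variant for one sample pixel. The variable names and the sample values are assumptions made for illustration.

```javascript
// Sample values: image 5 pixels wide, pixel at column 2, row 3, green channel.
var w = 5, i = 2, j = 3, k = 1;

// Index of the top-left neighbour, as written in the question.
var original = 4 * w * (j - 1) + 4 * (i - 1) + k;

// The same index with the common factor 4 pulled out.
var factored = 4 * (w * (j - 1) + (i - 1)) + k;

// Hoisted variant: each part is computed once in the loop it depends on.
var rowAbove = 4 * w * (j - 1); // depends only on j: compute once per row
var colLeft = 4 * (i - 1);      // depends only on i: compute once per column
var hoisted = rowAbove + colLeft + k;
```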

Answer 4:

You could pull out some common expressions:

for (var i = 0; i < w; i++) {
    for (var j = 0; j < h; j++) {
        var t = 4*w*j+4*i;  // offset of pixel (i, j)
        var dt = 4*w;       // offset between vertically adjacent pixels (one row)
        for (var k = 0; k < 4; k++) {
            var total = 0, values = 0, temp = 0;
            if (!(i == 0 && j == 0)) {
                temp = data.data[t-dt-4+k];
                if (temp !== undefined) values++, total += temp;
            }
            if (!(i == w - 1 && j == 0)) {
                temp = data.data[t-dt+4+k];
                if (temp !== undefined) values++, total += temp;
            }
            if (!(i == 0 && j == h - 1)) {
                temp = data.data[t+dt-4+k];
                if (temp !== undefined) values++, total += temp;
            }
            if (!(i == w - 1 && j == h - 1)) {
                temp = data.data[t+dt+4+k];
                if (temp !== undefined) values++, total += temp;
            }
            if (!(j == 0)) {
                temp = data.data[t-dt+k];
                if (temp !== undefined) values++, total += temp;
            }
            if (!(j == h - 1)) {
                temp = data.data[t+dt+k];
                if (temp !== undefined) values++, total += temp;
            }
            if (!(i == 0)) {
                temp = data.data[t-4+k];
                if (temp !== undefined) values++, total += temp;
            }
            if (!(i == w - 1)) {
                temp = data.data[t+4+k];
                if (temp !== undefined) values++, total += temp;
            }
            values++, total += data.data[t+k];
            total /= values;
            data.data[t+k] = total;
        }
    }
}

You could try moving the loop over k so it's outermost, and then fold the +k into the definition of t, saving a bit more repeated calculation. (That might turn out to be bad for memory-locality reasons.)

You could try moving the loop over j to be outside the loop over i, which will give you better memory locality. This will matter more for large images; it may not matter at all for the size you're using.
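
To illustrate the loop-order point: the pixel data is stored row by row, so with j outermost the offsets visited are consecutive, while with i outermost each step jumps a whole row. The helper `visitOrder` below is hypothetical and only records the base offset of each pixel in the order it would be visited.

```javascript
// Hypothetical helper: returns the base offsets 4 * (w * j + i) in visiting order.
// rowsOuter = true  -> loop over j (rows) outside, i (columns) inside.
// rowsOuter = false -> loop over i outside, j inside, as in the question.
function visitOrder(w, h, rowsOuter) {
    var order = [];
    var i, j;
    if (rowsOuter) {
        for (j = 0; j < h; j++) {
            for (i = 0; i < w; i++) {
                order.push(4 * (w * j + i));
            }
        }
    } else {
        for (i = 0; i < w; i++) {
            for (j = 0; j < h; j++) {
                order.push(4 * (w * j + i));
            }
        }
    }
    return order;
}

var rowMajor = visitOrder(3, 2, true);     // [0, 4, 8, 12, 16, 20]
var columnMajor = visitOrder(3, 2, false); // [0, 12, 4, 16, 8, 20]
```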

Rather painful but possibly very effective: you could lose lots of conditional operations by splitting your loops up into {0,1..w-2,w-1} and {0,1..h-2,h-1}.

You could get rid of all those undefined tests. Do you really need them, given that you're doing all those range checks?
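
One caveat, offered as an observation about the question's code rather than something this answer states: the diagonal range checks there only exclude the corner pixels, so on the rest of the top and bottom rows it is the undefined test that rejects the out-of-range read. Reads past the left or right edge are not caught at all; they land on a pixel of the adjacent row. The sketch below assumes the data is a typed array and uses a hypothetical helper `topLeft` that mirrors the question's index expression.

```javascript
var w = 3, h = 3;
var data = new Uint8ClampedArray(4 * w * h); // 3x3 image, all zeros

// Index of the top-left neighbour, exactly as computed in the question.
function topLeft(i, j, k) {
    return 4 * w * (j - 1) + 4 * (i - 1) + k;
}

// Pixel (1, 0) passes the !(i == 0 && j == 0) check, but its top-left
// neighbour lies above the image: the index is negative and the read
// yields undefined, which the undefined test then skips.
var aboveTopRow = data[topLeft(1, 0, 0)];

// Pixel (0, 2) also passes the check, but its "top-left neighbour" index
// is 8, which is pixel (2, 0): the read wraps around to the far end of
// another row, and the undefined test does not notice.
var wrappedIndex = topLeft(0, 2, 0);
```

So the undefined tests can only be dropped safely once the range checks cover every out-of-range neighbour.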

Another way to avoid the range checks: you could pad your image (with zeros) by one pixel along each edge. Note that the obvious way to do this will give different results from your existing code at the edges; this may be a good or a bad thing. If it's a bad thing, you can work out the appropriate value to divide by.
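
A minimal sketch of the padding idea, including the corrected divisor so that edge pixels are averaged over the samples that really exist. The function name `boxBlurPadded` is made up, the input is assumed to be a `Uint8ClampedArray`, and the result goes to a separate buffer rather than being written in place as in the question.

```javascript
// Sketch: blur via a zero-padded copy, so every pixel can read all nine
// samples without range checks. Padding contributes 0 to the sum; the
// divisor counts only the real samples.
function boxBlurPadded(src, w, h) {
    var pw = w + 2;                 // padded width in pixels
    var prow = 4 * pw;              // padded row length in array elements
    var padded = new Uint8ClampedArray(prow * (h + 2)); // zero-filled
    var dst = new Uint8ClampedArray(src.length);
    var i, j, k;

    // Copy each source row into the padded buffer, one pixel in from the edge.
    for (j = 0; j < h; j++) {
        padded.set(src.subarray(4 * w * j, 4 * w * (j + 1)), prow * (j + 1) + 4);
    }

    for (j = 0; j < h; j++) {
        // Number of real rows in the 3x3 window (2 on the top and bottom rows).
        var rows = (j > 0 ? 1 : 0) + 1 + (j < h - 1 ? 1 : 0);
        for (i = 0; i < w; i++) {
            // Number of real columns in the window.
            var cols = (i > 0 ? 1 : 0) + 1 + (i < w - 1 ? 1 : 0);
            var count = rows * cols;
            var c = prow * (j + 1) + 4 * (i + 1); // this pixel in the padded buffer
            for (k = 0; k < 4; k++) {
                var p = c + k;
                var total =
                    padded[p - prow - 4] + padded[p - prow] + padded[p - prow + 4] +
                    padded[p - 4]        + padded[p]        + padded[p + 4] +
                    padded[p + prow - 4] + padded[p + prow] + padded[p + prow + 4];
                dst[4 * (w * j + i) + k] = total / count;
            }
        }
    }
    return dst;
}
```

Dividing by 9 everywhere instead of by `count` gives the "obvious" variant mentioned above, which darkens the edges because the zero padding is averaged in.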

Answer 5:

Initialising temp to 0 is not necessary; just write var total = 0, values = 0, temp;.

The next thing is to loop backwards.

var length = 100,
    i;

for (i = 0; i < length; i++) {}

is slower than

var length = 100;

for (; length != 0; length--) {}

The third tip is to use Duff's device for huge for loops.
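
For reference, a commonly used JavaScript adaptation of Duff's device looks like the sketch below, shown here summing an array. `sumDuff` is a hypothetical name; whether the unrolling pays off depends on the engine, so it is worth measuring before adopting it.

```javascript
// Sketch of a Duff's-device-style loop: the body is unrolled eight times,
// and the switch handles the leftover (length % 8) items on the first pass.
function sumDuff(values) {
    var n = values.length;
    if (n === 0) { return 0; }

    var total = 0;
    var i = 0;
    var iterations = Math.ceil(n / 8); // number of passes through the do-loop
    var startAt = n % 8;               // items handled on the first pass (0 means 8)

    do {
        switch (startAt) {
            // Each case deliberately falls through to the next one.
            case 0: total += values[i++];
            case 7: total += values[i++];
            case 6: total += values[i++];
            case 5: total += values[i++];
            case 4: total += values[i++];
            case 3: total += values[i++];
            case 2: total += values[i++];
            case 1: total += values[i++];
        }
        startAt = 0; // every later pass processes a full block of eight
    } while (--iterations > 0);

    return total;
}

var sum = sumDuff([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
```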

Source: https://stackoverflow.com/questions/5314037/any-more-optimisation-i-can-do-for-this-function

Tags: javascript, optimization, canvas