What do gcc's auto-vectorization messages mean?

Posted by 大城市里の小女人 on 2019-12-10 16:38:32

Question


I have some code that I would like to run fast, so I was hoping I could persuade gcc (g++) to vectorise some of my inner loops. My compiler flags include

-O3 -msse2 -ffast-math -ftree-vectorize -ftree-vectorizer-verbose=5

but gcc fails to vectorize the most important loops, giving me the following not-really-very-verbose-at-all messages:

Not vectorized: complicated access pattern.

and

Not vectorized: unsupported use in stmt.

My questions are (1) what exactly do these mean? (How complicated does it have to be before it's too complicated? Unsupported use of what exactly?), and (2) is there any way I can get the compiler to give me even just a tiny bit more information about what I'm doing wrong?

An example of a loop that gives the "complicated access pattern" is

for (int s=0;s<N;++s)
    a.grid[s][0][h-1] =  D[s] * (b.grid[s][0][h-2] + b.grid[s][1][h-1] - 2*b.grid[s][0][h-1]);

and one that gives "unsupported use in stmt" is the inner loop of

for (int s=0;s<N;++s)
    for (int i=1;i<w-1;++i) 
        for (int j=1;j<h-1;++j) 
            a.grid[s][i][j] = D[s] * (b.grid[s][i][j-1] + b.grid[s][i][j+1] + b.grid[s][i-1][j] + b.grid[s][i+1][j] - 4*b.grid[s][i][j]);

(This is the one that really needs to be optimised.) Here, a.grid and b.grid are three-dimensional arrays of floats, D is a 1D array of floats, and N, w and h are const ints.


Answer 1:


Not vectorized: complicated access pattern.

The "uncomplicated" access patterns are access to consecutive elements, or strided access with certain restrictions (a single element of the group accessed in the loop, the group element count being a power of 2, and the group size being a multiple of the vector type).

b.grid[s][0][h-2] + b.grid[s][1][h-1] - 2*b.grid[s][0][h-1]);

This is neither sequential nor strided access.
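To make the two "uncomplicated" patterns concrete, here is a minimal sketch. The function names and the flat float buffers are hypothetical, chosen only for illustration; they are not taken from the question's code, and whether gcc actually vectorizes either loop depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>

// Consecutive access: iteration i touches element i, so neighbouring
// iterations read and write adjacent memory.
void scale_consecutive(const float* in, float* out, std::size_t n) {
    assert(in && out);  // caller must pass valid buffers
    for (std::size_t i = 0; i < n; ++i)
        out[i] = 2.0f * in[i];
}

// Strided access with a constant power-of-two stride (here 2):
// iteration i reads element 2*i, so `in` must hold at least 2*n - 1 floats.
void scale_strided(const float* in, float* out, std::size_t n) {
    assert(in && out);  // caller must pass valid buffers
    for (std::size_t i = 0; i < n; ++i)
        out[i] = 2.0f * in[2 * i];
}
```

In both functions the address touched in iteration i is a simple linear function of i, which is the property the vectorizer's access-pattern analysis looks for.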

Not vectorized: unsupported use in stmt.

Here "use" is in the data-flow sense: getting the value of a variable (register, compiler temporary). In this case the "supported uses" are variables defined in the current iteration of the loop, constants, and loop invariants.
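As a rough illustration of that distinction, the sketch below contrasts a loop whose uses are all of the supported kinds with one that uses a value defined by the previous iteration. Both functions are hypothetical examples, not code from the question.

```cpp
#include <cassert>
#include <cstddef>

// Every value used in iteration i is either a loop invariant (k) or is
// loaded within iteration i itself (in[i]).
void add_invariant(const float* in, float* out, float k, std::size_t n) {
    assert(in && out);  // caller must pass valid buffers
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i] + k;
}

// Iteration i uses acc[i - 1], which iteration i - 1 just defined:
// a use of a value from a previous iteration (a loop-carried dependence).
void running_sum(float* acc, std::size_t n) {
    assert(acc);  // caller must pass a valid buffer
    for (std::size_t i = 1; i < n; ++i)
        acc[i] = acc[i] + acc[i - 1];
}
```

The first loop's iterations are independent and could run in any order; the second must run in order, because each iteration needs the result of the one before it.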

a.grid[s][i][j] = D[s] * (b.grid[s][i][j-1] + b.grid[s][i][j+1] + b.grid[s][i-1][j] + b.grid[s][i+1][j] - 4*b.grid[s][i][j]);

In this example, I think the "unsupported use" is because b.grid[s][i][j-1] and b.grid[s][i][j+1] are assigned ("defined") by a previous iteration of the loop.
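If a.grid and b.grid never overlap in memory, one thing to try is telling the compiler so. The sketch below is an assumption-laden illustration, not the asker's actual code: it assumes each slice grid[s] can be viewed as a flat w-by-h buffer, and it uses __restrict, a compiler extension accepted by GCC, Clang and MSVC. The name laplacian_slice and the flat indexing are hypothetical. Whether gcc then vectorizes the inner loop still depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical flat layout: cell (i, j) of one w-by-h slice lives at i*h + j.
// __restrict promises that `a` and `b` never overlap, so a store to `a`
// cannot change a value later read from `b`.
void laplacian_slice(float* __restrict a, const float* __restrict b,
                     float d, std::size_t w, std::size_t h) {
    assert(a && b);  // caller must pass valid, non-overlapping buffers
    for (std::size_t i = 1; i + 1 < w; ++i)
        for (std::size_t j = 1; j + 1 < h; ++j)
            a[i * h + j] = d * (b[i * h + j - 1] + b[i * h + j + 1]
                              + b[(i - 1) * h + j] + b[(i + 1) * h + j]
                              - 4.0f * b[i * h + j]);
}
```

It would be called once per s, passing D[s] as d. Passing overlapping buffers breaks the promise made by __restrict and is undefined behaviour.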



Source: https://stackoverflow.com/questions/13505524/what-do-gccs-auto-vectorization-messages-mean
