How to speed up my sparse matrix solver?

梦毁少年i 2020-12-14 03:22

I'm writing a sparse matrix solver using the Gauss-Seidel method. By profiling, I've determined that about half of my program's time is spent inside the solver. The perfo

7 Answers
  •  [愿得一人]
    2020-12-14 04:02

    Poni's answer looks like the right one to me.

    I just want to point out that this type of problem often benefits from memory locality. Right now, the b, w, e, s, n arrays all live at separate locations in memory. If the problem does not fit in L3 cache (and ideally mostly in L2), that scattered layout hurts, and a solution along these lines would help:

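    #include <cstddef>
    #include <cstdlib>   // atoi
    #include <cstring>   // memset
    #include <iostream>
    using namespace std;
    
    size_t d_nx = 128, d_ny = 128;
    float *d_x;
    
    // All five coefficients of a cell sit next to each other in memory.
    struct D { float b,w,e,s,n; };
    D *d;
    
    void step() {
        // Row-major grid with row stride d_nx; start at interior cell (1,1).
        // (The stride has to be d_nx, not d_ny, for non-square grids.)
        size_t ic = d_nx + 1, iw = d_nx, ie = d_nx + 2, is = 1, in = 2 * d_nx + 1;
        for (size_t y = 1; y < d_ny - 1; ++y) {
            for (size_t x = 1; x < d_nx - 1; ++x) {
                d_x[ic] = d[ic].b
                    - d[ic].w * d_x[iw] - d[ic].e * d_x[ie]
                    - d[ic].s * d_x[is] - d[ic].n * d_x[in];
                ++ic; ++iw; ++ie; ++is; ++in;
            }
            // Skip the last cell of this row and the first cell of the next.
            ic += 2; iw += 2; ie += 2; is += 2; in += 2;
        }
    }
    void solve(size_t iters) { for (size_t i = 0; i < iters; ++i) step(); }
    void clear(float *a) { memset(a, 0, d_nx * d_ny * sizeof(float)); }
    
    int main(int argc, char **argv) {
        if (argc < 2) { cerr << "usage: " << argv[0] << " <iterations>" << endl; return 1; }
        size_t n = d_nx * d_ny;
        d_x = new float[n]; clear(d_x);
        d = new D[n]; memset(d,0,n * sizeof(D));
        solve(atoi(argv[1]));
        cout << d_x[0] << endl; // prevent the thing from being optimized away
        delete[] d; delete[] d_x;
    }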
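
    The listing above only zero-fills d. If you already have the coefficients in separate b, w, e, s, n arrays, you would need to interleave them once before solving. Here is a minimal sketch of that packing step; pack is a hypothetical helper of mine, not something from your code, and it assumes all five arrays have the same length:

```cpp
#include <cstddef>
#include <vector>

// One record per grid cell, so the five coefficients are contiguous.
struct D { float b, w, e, s, n; };

// Hypothetical helper (not in the original post): copy five separate
// coefficient arrays of `count` elements into one array of structs.
std::vector<D> pack(const float *b, const float *w, const float *e,
                    const float *s, const float *n, std::size_t count) {
    std::vector<D> d(count);
    for (std::size_t i = 0; i < count; ++i)
        d[i] = D{b[i], w[i], e[i], s[i], n[i]};
    return d;
}
```

    The copy is a one-time cost, so it should only pay off if the coefficients are reused across many iterations, as they are in a solver loop like this one.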
    

    For example, this solution at 1280x1280 is a little less than 2x faster than Poni's solution (13s vs. 23s in my test; your original implementation takes 22s), while at 128x128 it's 30% slower (7s vs. 10s; your original takes 10s).

    (Iterations were scaled up to 80000 for the base case, and 800 for the 100x larger case of 1280x1280.)
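
    If you want to repeat this kind of comparison, something like the following sketch could be used. Both helper names (scaled_iters, time_seconds) are hypothetical, and this is not the exact harness I used. The idea is to keep the total number of cell updates constant across grid sizes and to time one call with a monotonic clock:

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: scale the iteration count so that
// iterations * cells stays the same as in the base case.
std::size_t scaled_iters(std::size_t base_iters, std::size_t base_cells,
                         std::size_t cells) {
    return base_iters * base_cells / cells;
}

// Hypothetical helper: call fn(iters) once and return the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double time_seconds(F &&fn, std::size_t iters) {
    auto t0 = std::chrono::steady_clock::now();
    fn(iters);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

    With the solve function above, time_seconds(solve, scaled_iters(80000, 128 * 128, 1280 * 1280)) would time 800 iterations, which matches the scaling used for the 1280x1280 case.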
