What is a concise way to create a 2D slice in Go?

一个人的身影 2020-11-28 19:18

I am learning Go by going through A Tour of Go. One of the exercises there asks me to create a 2D slice of dy rows and dx columns containing

2 answers
  •  挽巷 (original poster)
    2020-11-28 20:09

    There are two ways to use slices to create a matrix. Let's take a look at the differences between them.

    First method:

    // n rows, each allocated separately with m columns
    matrix := make([][]int, n)
    for i := range matrix {
        matrix[i] = make([]int, m)
    }
    

    Second method:

    // n rows, all carved out of one backing slice of n*m elements
    matrix := make([][]int, n)
    rows := make([]int, n*m)
    for i := range matrix {
        matrix[i] = rows[i*m : (i+1)*m]
    }
    

    With the first method, successive make calls do not guarantee a contiguous matrix, so its rows may end up scattered in memory. Consider an example with two goroutines that could cause this:

    1. Goroutine #0 runs make([][]int, n) to allocate memory for matrix and gets the block from 0x000 to 0x07F.
    2. It then starts the loop and allocates the first row with make([]int, m), getting 0x080 to 0x0FF.
    3. In the second iteration, it is preempted by the scheduler.
    4. The scheduler gives the processor to goroutine #1, which starts running. It also calls make (for its own purposes) and gets 0x100 to 0x17F, right next to the first row of goroutine #0.
    5. After a while, goroutine #1 is preempted and goroutine #0 resumes.
    6. Goroutine #0 runs the make([]int, m) of the second loop iteration and gets 0x180 to 0x1FF for the second row. At this point, the two rows are already separated in memory.

    With the second method, the goroutine calls make([]int, n*m) to allocate the whole matrix as a single slice, which guarantees contiguity. A loop is then needed to point each entry of matrix at the subslice for its row.
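
    As a minimal sketch of the second method wrapped in a function: newFlatMatrix is a hypothetical helper name, not something from the answer above. It also returns the backing slice so the mapping from element (i, j) to backing[i*m+j] is visible. The three-index slice expression is an optional addition of mine; it caps each row's capacity so that append on one row cannot spill into the next.

```go
package main

import "fmt"

// newFlatMatrix (hypothetical helper) returns an n x m matrix whose rows are
// views into one backing slice, together with that backing slice.
func newFlatMatrix(n, m int) (matrix [][]int, backing []int) {
	backing = make([]int, n*m)
	matrix = make([][]int, n)
	for i := range matrix {
		// The third index limits the row's capacity to m, so append on a row
		// reallocates instead of overwriting the start of the next row.
		matrix[i] = backing[i*m : (i+1)*m : (i+1)*m]
	}
	return matrix, backing
}

func main() {
	matrix, backing := newFlatMatrix(2, 3)
	matrix[1][2] = 7

	// Element (i, j) lives at backing[i*m+j]; here that is backing[1*3+2].
	fmt.Println(backing[1*3+2])                 // prints 7
	fmt.Println(len(matrix[0]), cap(matrix[0])) // prints 3 3
}
```

    Without the third index, the code behaves exactly like the second method above; the cap only matters if rows are later grown with append.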

    You can play with the code shown above in the Go Playground to see how the memory assigned differs between the two methods. Note that I used runtime.Gosched() only to yield the processor and force the scheduler to switch to another goroutine.
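
    The Playground program itself is not reproduced here, so the following is only a rough sketch of such an experiment under my own assumptions. The names perRow, singleBlock and contiguous are hypothetical, m is assumed to be greater than zero, and the competing goroutine is just one way to create interleaved allocations. It prints the address of each row's first element and checks whether consecutive rows are exactly one row length apart.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// perRow builds an n x m matrix with one make call per row (first method).
// runtime.Gosched() yields after each allocation so other goroutines can run.
func perRow(n, m int) [][]int {
	matrix := make([][]int, n)
	for i := range matrix {
		matrix[i] = make([]int, m)
		runtime.Gosched()
	}
	return matrix
}

// singleBlock builds an n x m matrix over one backing slice (second method).
func singleBlock(n, m int) [][]int {
	matrix := make([][]int, n)
	rows := make([]int, n*m)
	for i := range matrix {
		matrix[i] = rows[i*m : (i+1)*m]
	}
	return matrix
}

// contiguous reports whether every row starts exactly one row length
// after the previous row in memory. Rows must be non-empty.
func contiguous(matrix [][]int) bool {
	for i := 1; i < len(matrix); i++ {
		prev := uintptr(unsafe.Pointer(&matrix[i-1][0]))
		cur := uintptr(unsafe.Pointer(&matrix[i][0]))
		stride := uintptr(len(matrix[i-1])) * unsafe.Sizeof(matrix[i-1][0])
		if cur-prev != stride {
			return false
		}
	}
	return true
}

func main() {
	const n, m = 4, 16

	// A second goroutine that allocates slices of the same size, to compete
	// with the per-row allocations as in the scenario described above.
	done := make(chan struct{})
	go func() {
		defer close(done)
		var keep [][]int
		for i := 0; i < 1000; i++ {
			keep = append(keep, make([]int, m))
			runtime.Gosched()
		}
		_ = keep
	}()

	a := perRow(n, m)
	b := singleBlock(n, m)
	<-done

	for i := range a {
		fmt.Printf("row %d: per-row %p  single-block %p\n", i, &a[i][0], &b[i][0])
	}
	fmt.Println("per-row contiguous:     ", contiguous(a))
	fmt.Println("single-block contiguous:", contiguous(b))
}
```

    The single-block result is true by construction. The per-row result depends on the allocator, the Go version, and scheduling, so it can differ between runs and machines.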

    Which one to use? Imagine the worst case with the first method, i.e. no row is adjacent in memory to another row. Then, if your program iterates through the matrix elements (to read or write them), there will probably be more cache misses (hence higher latency) than with the second method because of worse data locality. On the other hand, with the second method it may not be possible to allocate a single block of memory for the whole matrix because of memory fragmentation (free chunks spread all over memory), even though in total there may be enough free memory for it.
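
    One rough way to look at the locality argument is to time a full traversal of both layouts. This is only a sketch: perRow, singleBlock, fill and sum are hypothetical names, the 2000 x 2000 size is an arbitrary assumption, and a single timed run is not a proper benchmark (the testing package's benchmarks would be more reliable). The difference may be small or absent when the allocator happens to place the separately allocated rows next to each other.

```go
package main

import (
	"fmt"
	"time"
)

// perRow allocates each row separately (first method).
func perRow(n, m int) [][]int {
	matrix := make([][]int, n)
	for i := range matrix {
		matrix[i] = make([]int, m)
	}
	return matrix
}

// singleBlock slices all rows out of one backing slice (second method).
func singleBlock(n, m int) [][]int {
	matrix := make([][]int, n)
	rows := make([]int, n*m)
	for i := range matrix {
		matrix[i] = rows[i*m : (i+1)*m]
	}
	return matrix
}

// fill writes i+j into element (i, j).
func fill(matrix [][]int) {
	for i := range matrix {
		for j := range matrix[i] {
			matrix[i][j] = i + j
		}
	}
}

// sum reads every element row by row.
func sum(matrix [][]int) int {
	total := 0
	for _, row := range matrix {
		for _, v := range row {
			total += v
		}
	}
	return total
}

func main() {
	const n, m = 2000, 2000 // arbitrary size chosen for illustration

	cases := []struct {
		name   string
		matrix [][]int
	}{
		{"per-row", perRow(n, m)},
		{"single-block", singleBlock(n, m)},
	}
	for _, c := range cases {
		fill(c.matrix)
		start := time.Now()
		total := sum(c.matrix)
		fmt.Printf("%-12s sum=%d elapsed=%v\n", c.name, total, time.Since(start))
	}
}
```

    Both layouts produce the same sum; only the elapsed times can differ, and those vary with hardware and runtime.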

    Therefore, unless memory is heavily fragmented and the matrix to be allocated is very large, the second method is the better choice because it takes advantage of data locality.
