Which is faster? ++, += or x + 1?


I am using C# (this question also applies to similar languages such as C++) and I am trying to figure out the fastest and most efficient way to increment. It isn't just one


5 Answers

误落风尘

2020-12-10 02:18

They are the same:

static void Main(string[] args)
{
    int a = 0;
    a++;
    a +=1;
    a = a+1;
}

Decompiling the above code in ILSpy gives:

private static void Main(string[] args)
{
    int a = 0;
    a++;
    a++;
    a++;
}

The IL for all three is also the same (in Release mode):

.method private hidebysig static void  Main(string[] args) cil managed
{
    .entrypoint
    // Code size       15 (0xf)
    .maxstack  2
    .locals init ([0] int32 a)
    IL_0000:  ldc.i4.0
    IL_0001:  stloc.0
    IL_0002:  ldloc.0
    IL_0003:  ldc.i4.1
    IL_0004:  add
    IL_0005:  stloc.0
    IL_0006:  ldloc.0
    IL_0007:  ldc.i4.1
    IL_0008:  add
    IL_0009:  stloc.0
    IL_000a:  ldloc.0
    IL_000b:  ldc.i4.1
    IL_000c:  add
    IL_000d:  stloc.0
    IL_000e:  ret
} // end of method Program::Main


暗喜

2020-12-10 02:26
The compiler should produce the same assembly for 1 and 2, and it may unroll the loop in option 3. When faced with questions like this, a useful way to test empirically what is going on is to look at the assembly the compiler produces. With g++ you can get it using the -S switch.

For example, both options 1 and 2 produce this assembly when compiled with the command g++ -S inc.cpp (using g++ 4.5.2):
```
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    addl    $5, -4(%rbp)
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
```
g++ produces significantly less efficient assembly for option 3:
```
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
    jmp .L2
.L3:
    addl    $1, -4(%rbp)
    addl    $1, -8(%rbp)
.L2:
    cmpl    $4, -8(%rbp)
    setle   %al
    testb   %al, %al
    jne .L3
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
```
But with optimisation on (even -O1) g++ produces this for all 3 options:
```
main:
.LFB0:
    .cfi_startproc
    leal    5(%rdi), %eax
    ret
    .cfi_endproc
```
g++ not only removes the loop in option 3 entirely, folding the five increments into a single addition of 5, but it also uses the lea instruction to do that addition in a single instruction instead of faffing about with mov.

So g++ will always produce the same assembly for options 1 and 2. g++ will produce the same assembly for all 3 options only if you explicitly turn optimisation on (which is the behaviour you'd probably expect).

(It looks like you should be able to inspect the assembly produced for C# too, although I've never tried that.)
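The question text above is cut off, so the three options themselves are not shown in this thread. Judging only from the assembly listings, options 1 and 2 appear to add 5 in a single statement and option 3 appears to add 1 in a loop of five iterations. A minimal sketch under that assumption (the function names are hypothetical and this is not the original inc.cpp):

```cpp
// Hypothetical reconstruction; the original inc.cpp is not shown in this thread.

// Assumed shape of option 1: compound assignment.
int add_five_compound(int x) {
    x += 5;
    return x;
}

// Assumed shape of option 2: explicit addition and assignment.
int add_five_explicit(int x) {
    x = x + 5;
    return x;
}

// Assumed shape of option 3: five single increments in a loop.
int add_five_loop(int x) {
    // The bound is consistent with the "cmpl $4" / "setle" test in the listing.
    for (int i = 0; i < 5; ++i) {
        x++;
    }
    return x;
}
```

Compiling a file like this with g++ -S, once without optimisation and once with -O1, lets you compare the three functions side by side in the way this answer describes.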
遇见更好的自我

2020-12-10 02:27

Options 1 and 2 will result in identical code after being compiled. Option 3 will be much slower, as it results in more code for the for loop involved.

执笔经年

2020-12-10 02:39

Options 1 and 2 will result in identical code being produced by the compiler. Option 3 will be much slower.

It's a fallacy that i++ is faster than i += 1 or even i = i + 1. All decent compilers will turn those three statements into the same code.

For such a trivial operation as addition, write the clearest code and let the compiler worry about making it fast.
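As a small illustration of the three spellings mentioned above, here is a sketch in C++ (the function names are made up for this example). All three return the same value, and the generated code can be compared by compiling with the -S switch mentioned in an earlier answer.

```cpp
// Three spellings of "add one"; the function names are hypothetical.

int with_increment(int i) {
    i++;          // postfix increment, result discarded
    return i;
}

int with_compound(int i) {
    i += 1;       // compound assignment
    return i;
}

int with_addition(int i) {
    i = i + 1;    // explicit addition and assignment
    return i;
}
```

Since the value of the increment expression is not used in any of them, the choice between the three comes down to which reads best.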

再見小時候

2020-12-10 02:41

(Answer specific to C# as C++ may vary significantly.)

1 and 2 are equivalent.

3 would definitely be slower.

Having said that, doing this a mere 300 times a second, you wouldn't notice any difference. Are you aware of just how much a computer can do in terms of raw CPU+memory in a second? In general, you should write code for clarity as the most important thing. By all means worry about performance, but only when you have a way to measure it, so that you can tell a) whether you need to worry, and b) whether any changes actually improve the performance.

In this case, I'd say that option 1 is the clearest, so that's what I'd use.
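The advice to measure first can be sketched with a small timing helper. This answer is about C#, but the sketch is in C++ to match the other listings in the thread. It is a rough illustration using std::chrono, not a rigorous benchmark; the names Timing, time_steps and the three step functions are hypothetical, and the volatile counter is there only to keep the optimiser from deleting the loop.

```cpp
#include <chrono>
#include <cstdint>

// Result of one timed run (hypothetical helper type).
struct Timing {
    std::int64_t final_value;  // counter value after the loop
    std::int64_t nanoseconds;  // elapsed wall-clock time for the loop
};

// Three ways to add one; each takes a value and returns value + 1.
std::int64_t step_increment(std::int64_t x) { x++; return x; }
std::int64_t step_compound(std::int64_t x) { x += 1; return x; }
std::int64_t step_addition(std::int64_t x) { x = x + 1; return x; }

// Applies `step` to a counter `iterations` times and reports the elapsed time.
template <typename Step>
Timing time_steps(std::int64_t iterations, Step step) {
    volatile std::int64_t counter = 0;  // volatile so the loop is not optimised away
    const auto start = std::chrono::steady_clock::now();
    for (std::int64_t i = 0; i < iterations; ++i) {
        counter = step(counter);
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    return Timing{counter, static_cast<std::int64_t>(elapsed.count())};
}
```

Calling time_steps(1000000, step_increment) and the same for the other two step functions gives numbers to compare. Expect run-to-run noise, so repeat each measurement several times before drawing any conclusion.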
