Which is faster? ++, += or x + 1?

轮回少年 2020-12-10 01:41

I am using C# (this question is also valid for similar languages like C++) and I am trying to figure out the fastest and most efficient way to increment. It isn't just one

5 answers
  • 2020-12-10 02:18

    They are the same:

    static void Main(string[] args)
    {
        int a = 0;
        a++;
        a += 1;
        a = a + 1;
    }
    

    Decompiled in ILSpy, the above code becomes:

    private static void Main(string[] args)
    {
        int a = 0;
        a++;
        a++;
        a++;
    }
    

    The IL for all three is also the same (in Release mode):

    .method private hidebysig static void  Main(string[] args) cil managed
    {
        .entrypoint
        // Code size       15 (0xf)
        .maxstack  2
        .locals init ([0] int32 a)
        IL_0000:  ldc.i4.0
        IL_0001:  stloc.0
        IL_0002:  ldloc.0
        IL_0003:  ldc.i4.1
        IL_0004:  add
        IL_0005:  stloc.0
        IL_0006:  ldloc.0
        IL_0007:  ldc.i4.1
        IL_0008:  add
        IL_0009:  stloc.0
        IL_000a:  ldloc.0
        IL_000b:  ldc.i4.1
        IL_000c:  add
        IL_000d:  stloc.0
        IL_000e:  ret
    } // end of method Program::Main
    
  • 2020-12-10 02:26

    The compiler should produce the same assembly for options 1 and 2, and it may unroll the loop in option 3. When faced with questions like this, a useful way to test empirically what's going on is to look at the assembly produced by the compiler. With g++ this can be done using the -S switch.

    For example, both options 1 and 2 produce this assembly when compiled with the command g++ -S inc.cpp (using g++ 4.5.2):

    
    main:
    .LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        movq    %rsp, %rbp
        .cfi_offset 6, -16
        .cfi_def_cfa_register 6
        addl    $5, -4(%rbp)
        movl    $0, %eax
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    

    Without optimisation, g++ produces significantly less efficient assembly for option 3:

    
    main:
    .LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        movq    %rsp, %rbp
        .cfi_offset 6, -16
        .cfi_def_cfa_register 6
        movl    $0, -8(%rbp)
        jmp .L2
    .L3:
        addl    $1, -4(%rbp)
        addl    $1, -8(%rbp)
    .L2:
        cmpl    $4, -8(%rbp)
        setle   %al
        testb   %al, %al
        jne .L3
        movl    $0, %eax
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    

    But with optimisation on (even -O1) g++ produces this for all 3 options:

    
    main:
    .LFB0:
        .cfi_startproc
        leal    5(%rdi), %eax
        ret
        .cfi_endproc
    

    g++ not only unrolls the loop in option 3, but it also uses the lea instruction to do the addition in a single instruction instead of faffing about with mov.

    So g++ will always produce the same assembly for options 1 and 2. g++ will produce the same assembly for all 3 options only if you explicitly turn optimisation on (which is the behaviour you'd probably expect).

    (It looks like you should be able to inspect the assembly produced by C# too, although I've never tried that.)
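
    The question's original listing is not reproduced above, so the following inc.cpp is only a guess at its shape, inferred from the assembly: options 1 and 2 are assumed to add 5 in a single statement, and option 3 is assumed to add 1 five times in a loop. The function names are made up for illustration.

```cpp
// inc.cpp: hypothetical reconstruction, not the asker's actual code.
#include <cassert>

// Options 1 and 2 (assumed): add 5 in a single statement.
int add_direct(int x) {
    x += 5;
    return x;
}

// Option 3 (assumed): add 1 five times in a loop.
int add_in_loop(int x) {
    for (int i = 0; i < 5; ++i) {
        x += 1;
    }
    return x;
}

// Both forms compute the same value for every input, which is what
// allows an optimiser to collapse the loop into a single addition.
void check_same_result() {
    for (int x = -10; x <= 10; ++x) {
        assert(add_direct(x) == add_in_loop(x));
    }
}
```

    Compiling this with g++ -S inc.cpp and again with g++ -O1 -S inc.cpp, then comparing the two inc.s files, is one way to see whether your own compiler version folds the loop the same way.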

  • 2020-12-10 02:27

    Options 1 and 2 will result in identical code after being compiled. Option 3 will be much slower, as it results in more code for the for loop involved.

  • 2020-12-10 02:39

    Options 1 and 2 will result in identical code being produced by the compiler. Option 3 will be much slower.

    It's a fallacy that i++ is faster than i += 1 or even i = i + 1. All decent compilers will turn those three statements into the same code.

    For such a trivial operation as addition, write the clearest code and let the compiler worry about making it fast.
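
    A minimal C++ sketch of the point above, using a plain int (the function names are made up for illustration). As standalone statements the three spellings have the same effect, so the compiler is free to emit the same code for each. They only differ when the value of the expression itself is used, because i++ yields the old value.

```cpp
#include <cassert>

// Three spellings of "add one", each used as a standalone statement.
int inc_postfix(int i)  { i++;       return i; }
int inc_compound(int i) { i += 1;    return i; }
int inc_explicit(int i) { i = i + 1; return i; }

// When the expression's value is consumed, the spellings are not interchangeable.
int value_of_postfix(int i)  { return i++; }     // yields the original i
int value_of_compound(int i) { return i += 1; }  // yields i + 1

void check_spellings() {
    for (int i = -3; i <= 3; ++i) {
        assert(inc_postfix(i) == i + 1);
        assert(inc_compound(i) == i + 1);
        assert(inc_explicit(i) == i + 1);
        assert(value_of_postfix(i) == i);
        assert(value_of_compound(i) == i + 1);
    }
}
```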

  • 2020-12-10 02:41

    (Answer specific to C# as C++ may vary significantly.)

    1 and 2 are equivalent.

    3 would definitely be slower.

    Having said that, doing this a mere 300 times a second, you wouldn't notice any difference. Are you aware of just how much a computer can do in terms of raw CPU+memory in a second? In general, you should write code for clarity as the most important thing. By all means worry about performance, but only when you have a way to measure it, so that you can tell a) whether you need to worry, and b) whether any changes actually improve the performance.

    In this case, I'd say that option 1 is the clearest, so that's what I'd use.
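
    The advice about measuring can be sketched as a small timing harness. This one is in C++ rather than C#, and it is only a rough illustration: the Timing struct and time_increments function are hypothetical names, and a loop like this is a crude micro-benchmark whose numbers depend heavily on compiler flags and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Result of one timed run: the final counter value and the elapsed time.
struct Timing {
    std::int64_t result;
    std::chrono::nanoseconds elapsed;
};

// Applies `step` to a counter `iterations` times and times the whole loop.
template <typename Step>
Timing time_increments(std::int64_t iterations, Step step) {
    assert(iterations >= 0);
    // volatile forces a real load and store on every pass, so the
    // optimiser cannot replace the loop with a single addition.
    volatile std::int64_t sink = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::int64_t n = 0; n < iterations; ++n) {
        std::int64_t v = sink;
        step(v);
        sink = v;
    }
    auto stop = std::chrono::steady_clock::now();
    return {sink,
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

    For example, time_increments(1000000, [](std::int64_t& v) { v++; }) can be compared against the same call with v += 1 or v = v + 1 in the lambda. Repeat each run several times before reading anything into the difference.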
