Are pointer variables just integers with some operators or are they “symbolic”?

后端 未结 4 1518
长情又很酷
长情又很酷 2020-11-28 15:40

EDIT: The original word choice was confusing. The term \"symbolic\" is much better than the original (\"mystical\").

In the discussion about my previous C++ questio

相关标签:
4条回答
  • 2020-11-28 16:09

    C was conceived as a language in which pointers and integers were very intimately related, with the exact relationship depending upon the target platform. The relationship between pointers and integers made the language very suitable for purposes of low-level or systems programming. For purposes of discussion below, I'll thus call this language "Low-Level C" [LLC].

    The C Standards Committee wrote up a description of a different language, where such a relationship is not expressly forbidden, but is not acknowledged in any useful fashion, even when an implementation generates code for a target and application field where such a relationship would be useful. I'll call this language "High Level Only C" [HLOC].

    In the days when the Standard was written, most things that called themselves C implementations processed a dialect of LLC. Most useful compilers process a dialect which defines useful semantics in more cases than HLOC, but not as many as LLC. Whether pointers behave more like integers or more like abstract mystical entities depends upon which exact dialect one is using. If one is doing systems programming, it is reasonable to view C as treating pointers and integers as intimately related, because LLC dialects suitable for that purpose do so, and HLOC dialects that don't do so aren't suitable for that purpose. When doing high-end number crunching, however, one would far more often being using dialects of HLOC which do not recognize such a relationship.

    The real problem, and source of so much contention, lies in the fact that LLC and HLOC are increasingly divergent, and yet are both referred to by the name C.

    0 讨论(0)
  • 2020-11-28 16:10

    The first thing to say is that a sample of one test on one compiler generating code on one architecture is not the basis on which to draw a conclusion on the behaviour of the language.

    c++ (and c) are general purpose languages created with the intention of being portable. i.e. a well formed program written in c++ on one system should run on any other (barring calls to system-specific services).

    Once upon a time, for various reasons including backward-compatibility and cost, memory maps were not contiguous on all processors.

    For example I used to write code on a 6809 system where half the memory was paged in via a PIA addressed in the non-paged part of the memory map. My c compiler was able to cope with this because pointers were, for that compiler, a 'mystical' type which knew how to write to the PIA.

    The 80386 family has an addressing mode where addresses are organised in groups of 16 bytes. Look up FAR pointers and you'll see different pointer arithmetic.

    This is the history of pointer development in c++. Not all chip manufacturers have been "well behaved" and the language accommodates them all (usually) without needing to rewrite source code.

    0 讨论(0)
  • 2020-11-28 16:10

    If you turn off the optimiser the code works as expected.

    By using pointer arithmetic that is undefined you are fooling the optimiser. The optimiser has figured out that there is no code writing to b, so it can safely store it in a register. As it turns out, you have acquired the address of b in a non-standard way and modify the value in a way the optimiser doesn't see.

    If you read the C standard, it says that pointers may be mystical. gcc pointers are not mystical. They are stored in ordinary memory and consist of the same type of bytes that make up all other data types. The behaviour you encountered is due to your code not respecting the limitations stated for the optimiser level you have chosen.

    Edit:

    The revised code is still UB. The standard doesn't allow referencing a[1] even if the pointer value happens to be identical to another pointer value. So the optimiser is allowed to store the value of b in a register.

    0 讨论(0)
  • 2020-11-28 16:13

    Stealing the quote from TartanLlama:

    [expr.add]/5 "[for pointer addition, ] if both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."

    So the compiler can assume that your pointer points to the a array, or one past the end. If it points one past the end, you cannot defererence it. But as you do, it surely can't be one past the end, so it can only be inside the array.

    So now you have your code (reduced)

    b = 1;
    *pa1 = 2;
    

    where pa points inside an array a and b is a separate variable. And when you print them, you get exactly 1 and 2, the values you have assigned them.

    An optimizing compiler can figure that out, without even storing a 1or a 2 to memory. It can just print the final result.

    0 讨论(0)
提交回复
热议问题