When you learn C++, or at least when I learned it through C++ Primer, pointers were termed the \"memory addresses\" of the elements they point to. I\'m wondering to
According to the C++14 Standard, [expr.unary.op]/3:
The result of the unary
&
operator is a pointer to its operand. The operand shall be an lvalue or a qualified-id. If the operand is a qualified-id naming a non-static memberm
of some classC
with typeT
, the result has type “pointer to member of classC
of typeT
” and is a prvalue designatingC::m
. Otherwise, if the type of the expression isT
, the result has type “pointer to T” and is a prvalue that is the address of the designated object or a pointer to the designated function. [Note: In particular, the address of an object of type “cvT
” is “pointer to cvT
”, with the same cv-qualification. —end note ]
So this says clearly and unambiguously that pointers to object type (i.e. a T *
, where T
is not a function type) hold addresses.
"address" is defined by [intro.memory]/1:
The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.
So an address may be anything which serves to uniquely identify a particular byte of memory.
Note: In the C++ standard terminology, memory only refers to space that is in use. It doesn't mean physical memory, or virtual memory, or anything like that. The memory is a disjoint set of allocations.
It is important to bear in mind that, although one possible way of uniquely identifying each byte in memory is to assign a unique integer to each byte of physical or virtual memory, that is not the only possible way.
To avoid writing non-portable code it is good to avoid assuming that an address is identical to an integer. The rules of arithmetic for pointers are different to the rules of arithmetic for integers anyway. Similarly, we would not say that 5.0f
is the same as 1084227584
even though they have identical bit representations in memory (under IEEE754).
You should think of pointers as being addresses of virtual memory: modern consumer operating systems and runtime environments place at least one layer of abstraction between physical memory and what you see as a pointer value.
As for your final statement, you cannot make that assumption, even in a virtual memory address space. Pointer arithmetic is only valid within blocks of contiguous memory such as arrays. And whilst it is permissible (in both C and C++) to assign a pointer to one point past an array (or scalar), the behaviour on deferencing such a pointer is undefined. Hypothesising about adjacency in physical memory in the context of C and C++ is pointless.
As many answers have already mentioned, they should not be thought of as memory addresses. Check out those answers and here to get an understanding of them. Addressing your last statement
*p1 and *p2 have the property p2 = p1 + 1 or p1 = p2 + 1 if and only if they are adjacent in physical memory
is only correct if p1
and p2
are of the same type, or pointing to types of the same size.
The operating system provides an abstraction of the physical machine to your program (i.e. your program runs in a virtual machine). Thus, your program does not have access to any physical resource of your computer, be it CPU time, memory, etc; it merely has to ask the OS for these resources.
In the case of memory, your program works in a virtual address space, defined by the operating system. This address space has multiple regions, such as stack, heap, code, etc. The value of your pointers represent addresses in this virtual address space. Indeed, 2 pointers to consecutive addresses will point to consecutive locations in this address space.
However, this address space is splitted by the operating system into pages and segments, which are swapped in and out from memory as required, so your pointers may or may not point to consecutive physical memory locations and is impossible to tell at runtime if that is true or not. This also depends on the policy used by the operating system for paging and segmentation.
Bottom line is that pointers are memory addresses. However, they are addresses in a virtual memory space and it is up to the operating system to decide how this is mapped to the physical memory space.
As far as your program is concerned, this is not an issue. One reason for this abstraction is to make programs believe they are the only users of the machine. Imagine the nightmare you'd have to go through if you would need to consider the memory allocated by other processes when you write your program - you don't even know which processes are going to run concurrently with yours. Also, this is a good technique to enforce security: your process cannot (well, at least shouldn't be able to) access maliciously the memory space of another process since they run in 2 different (virtual) memory spaces.
Absolutely right to think of pointers as memory addresses. That's what they are in ALL compilers that I have worked with - for a number of different processor architectures, manufactured by a number of different compiler producers.
However, the compiler does some interesting magic, to help you along with the fact that normal memory addresses [in all modern mainstream processors at least] are byte-addresses, and the object your pointer refers to may not be exactly one byte. So if we have T* ptr;
, ptr++
will do ((char*)ptr) + sizeof(T);
or ptr + n
is ((char*)ptr) + n*sizeof(T)
. This also means that your p1 == p2 + 1
requires p1
and p2
to be of the same type T
, since the +1
is actually +sizeof(T)*1
.
There is ONE exception to the above "pointers are memory addresses", and that is member function pointers. They are "special", and for now, please just ignore how they are actually implemented, sufficient to say that they are not "just memory addresses".
Unless pointers are optimized out by the compiler, they are integers that store memory addresses. Their lenght depends on the machine the code is being compiled for, but they can usually be treated as ints.
In fact, you can check that out by printing the actual number stored on them with printf()
.
Beware, however, that type *
pointer increment/decrement operations are done by the sizeof(type)
. See for yourself with this code (tested online on Repl.it):
#include <stdio.h>
int main() {
volatile int i1 = 1337;
volatile int i2 = 31337;
volatile double d1 = 1.337;
volatile double d2 = 31.337;
volatile int* pi = &i1;
volatile double* pd = &d1;
printf("ints: %d, %d\ndoubles: %f, %f\n", i1, i2, d1, d2);
printf("0x%X = %d\n", pi, *pi);
printf("0x%X = %d\n", pi-1, *(pi-1));
printf("Difference: %d\n",(long)(pi)-(long)(pi-1));
printf("0x%X = %f\n", pd, *pd);
printf("0x%X = %f\n", pd-1, *(pd-1));
printf("Difference: %d\n",(long)(pd)-(long)(pd-1));
}
All variables and pointers were declared volatile so as the compiler wouldn't optimize them out. Also notice that I used decrement, because the variables are placed in the function stack.
The output was:
ints: 1337, 31337
doubles: 1.337000, 31.337000
0xFAFF465C = 1337
0xFAFF4658 = 31337
Difference: 4
0xFAFF4650 = 1.337000
0xFAFF4648 = 31.337000
Difference: 8
Note that this code may not work on all compilers, specially if they do not store variables in the same order. However, what's important is that the pointer values can actually be read and printed and that decrements of one may/will decrement based on the size of the variable the pointer references.
Also note that the &
and *
are actual operators for reference ("get the memory address of this variable") and dereference ("get the contents of this memory address").
This may also be used for cool tricks like getting the IEEE 754 binary values for floats, by casting the float*
as an int*
:
#include <iostream>
int main() {
float f = -9.5;
int* p = (int*)&f;
std::cout << "Binary contents:\n";
int i = sizeof(f)*8;
while(i) {
i--;
std::cout << ((*p & (1 << i))?1:0);
}
}
Result is:
Binary contents:
11000001000110000000000000000000
Example taken from https://pt.wikipedia.org/wiki/IEEE_754. Check out on any converter.