I have had the recent pleasure to explain pointers to a C programming beginner and stumbled upon the following difficulty. It might not seem like an issue at all if you alre
int *bar = &foo;
Question 1: What is bar?
Ans: It is a pointer variable (to type int). A pointer should point to some valid memory location, and it is later dereferenced (*bar) with the unary operator * in order to read the value stored at that location.
Question 2: What is &foo?
Ans: foo is a variable of type int, which is stored at some valid memory location, and we get that location from the operator &. So what we have now is a valid memory location, &foo.
Putting both together: what the pointer needed was a valid memory location, and &foo provides one, so the initialization is good.
Now the pointer bar points to a valid memory location, and the value stored there can be obtained by dereferencing it, i.e. *bar.
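The two answers above can be checked with a small sketch. The function name check_pointer_basics and the values 42 and 7 are made up for illustration; only foo and bar come from the question.

```c
#include <assert.h>

/* Hypothetical helper: walks through the claims made in the two answers. */
int check_pointer_basics(void)
{
    int foo = 42;        /* an int stored at some valid memory location */
    int *bar = &foo;     /* bar is initialized with that location */

    assert(bar == &foo); /* bar holds the address of foo */
    assert(*bar == 42);  /* dereferencing reads the value stored there */

    *bar = 7;            /* writing through the pointer changes foo itself */
    return foo;
}
```

Called from main, the function returns 7, because *bar and foo name the same object.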
I saw this question a few days ago, and then happened to be reading the explanation of Go's type declaration on the Go Blog. It starts off by giving an account of C type declarations, which seems like a useful resource to add to this thread, even though I think that there are more complete answers already given.
C took an unusual and clever approach to declaration syntax. Instead of describing the types with special syntax, one writes an expression involving the item being declared, and states what type that expression will have. Thus
int x;
declares x to be an int: the expression 'x' will have type int. In general, to figure out how to write the type of a new variable, write an expression involving that variable that evaluates to a basic type, then put the basic type on the left and the expression on the right.
Thus, the declarations
int *p; int a[3];
state that p is a pointer to int because '*p' has type int, and that a is an array of ints because a[3] (ignoring the particular index value, which is punned to be the size of the array) has type int.
(It goes on to describe how to extend this understanding to function pointers etc)
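A rough sketch of the "expression on the right, basic type on the left" reading, extended to a function pointer. The names twice and declaration_mimics_use are hypothetical and not taken from the blog post.

```c
#include <assert.h>

/* Hypothetical function, used only so there is something to point at. */
static int twice(int n) { return 2 * n; }

int declaration_mimics_use(void)
{
    int x = 5;
    int *p = &x;             /* "*p" is an int, so p is a pointer to int */
    int a[3] = {1, 2, 3};    /* "a[i]" is an int, so a is an array of int */
    int (*fp)(int) = twice;  /* "(*fp)(n)" is an int, so fp points to a function */

    /* The expressions used in the declarations really have type int. */
    assert(sizeof(*p) == sizeof(int));
    assert(sizeof(a[0]) == sizeof(int));

    /* Each use below has the same shape as its declaration. */
    return *p + a[2] + (*fp)(4);   /* 5 + 3 + 8 */
}
```

Reading each declaration as "this expression is an int" gives the type of the variable without any separate type syntax.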
I had not thought about it this way before, but it seems like a pretty straightforward way of accounting for the overloading of the syntax.
It is nice to know the difference between declaration and initialization. We declare variables with types and initialize them with values. If we do both at the same time, we often call it a definition (loosely speaking; in standard C terms, int a; on its own inside a function already defines a).
1.
int a;
a = 42;
We declare an int named a. Then we initialize it by giving it the value 42.
2.
int a = 42;
We declare an int named a and give it the value 42. It is initialized with 42. A definition.
3.
a = 43;
When we use the variables we say we operate on them. a = 43 is an assignment operation. We assign the number 43 to the variable a.
By saying
int *bar;
we declare bar to be a pointer to an int. By saying
int *bar = &foo;
we declare bar and initialize it with the address of foo.
After we have initialized bar we can use the same operator, the asterisk, to access and operate on the value of foo. Without the operator we access and operate on the address the pointer is pointing to.
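A minimal sketch of the difference between operating on bar and operating on *bar. The variable other and the function name declare_then_assign are assumptions added for illustration.

```c
#include <assert.h>

int declare_then_assign(void)
{
    int foo = 42;
    int other = 1;   /* hypothetical second variable to re-point bar at */

    int *bar;        /* declaration only: bar does not point anywhere useful yet */
    bar = &foo;      /* no asterisk: we set the address stored in bar */
    assert(*bar == 42);

    *bar = 43;       /* with the asterisk we operate on the value of foo */
    assert(foo == 43);

    bar = &other;    /* no asterisk again: bar now points somewhere else */
    *bar = 2;        /* this changes other, and foo is left alone */
    assert(foo == 43 && other == 2);

    return foo + other;   /* 43 + 2 */
}
```

The asterisk in int *bar = &foo; belongs to the declaration, so that line sets bar; in a later statement such as *bar = 43; the asterisk is the dereference operator, so that line sets foo.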
Besides that I let the picture speak.
A simplified ASCIIMATION on what is going on. (And here a player version if you want to pause etc.)
I think the devil is in the space.
I would write (not only for the beginner, but for myself as well) int* bar = &foo; instead of int *bar = &foo;
This should make the relationship between syntax and semantics evident.
Looking at the answers and comments here, there seems to be a general agreement that the syntax in question can be confusing for a beginner. Most of them propose something along these lines:
You may write int* bar instead of int *bar to highlight the difference. This means you won't follow the K&R "declaration mimics use" approach, but the Stroustrup C++ approach:
We don't declare *bar to be an integer. We declare bar to be an int*. If we want to initialize a newly created variable in the same line, it is clear that we are dealing with bar, not *bar.
int* bar = &foo;
The drawbacks:
The multiple-declaration pitfall (int* foo, bar vs int *foo, *bar).
Edit: A different approach that has been suggested is to go the K&R "mimic" way, but without the "shorthand" syntax (see here). As soon as you avoid doing a declaration and an assignment in the same line, everything will look much more coherent.
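The difference between int* foo, bar and int *foo, *bar can be sketched like this. The extra names value, baz and qux, and the function name, are hypothetical.

```c
#include <assert.h>

int multiple_declaration_pitfall(void)
{
    int value = 10;

    int* foo, bar;     /* foo is an int*, but bar is a plain int */
    int *baz, *qux;    /* both baz and qux are int* */

    foo = &value;
    bar = value;       /* bar is an int, so it takes the value, not the address */
    baz = &value;
    qux = &value;

    assert(sizeof(bar) == sizeof(int));
    assert(sizeof(baz) == sizeof(int *) && sizeof(qux) == sizeof(int *));

    return *foo + bar + *baz + *qux;   /* 10 + 10 + 10 + 10 */
}
```

The asterisk binds to the declarator, not to the type name, which is why the int* spelling is misleading as soon as a second name appears on the same line.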
However, sooner or later the student will have to deal with pointers as function arguments. And pointers as return types. And pointers to functions. You will have to explain the difference between int *func(); and int (*func)();. I think sooner or later things will fall apart. And maybe sooner is better than later.
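A small sketch of the two shapes, assuming made-up names (stored, address_of_stored, value_of_stored, fn) and using (void) parameter lists rather than the empty parentheses from the text:

```c
#include <assert.h>

static int stored = 5;   /* hypothetical object to return and point at */

/* Same shape as "int *func();": a function that returns a pointer to int. */
int *address_of_stored(void) { return &stored; }

/* An ordinary function returning int, used as a target for the pointer below. */
int value_of_stored(void) { return stored; }

int use_both_shapes(void)
{
    int *p = address_of_stored();       /* calling it yields an int* */

    /* Same shape as "int (*func)();": fn is a pointer to a function returning int. */
    int (*fn)(void) = value_of_stored;

    assert(p == &stored);
    return *p + (*fn)();                /* 5 + 5 */
}
```

The parentheses around *fn are what make it a pointer to a function; without them the asterisk attaches to the return type instead.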
Perhaps stepping through it just a bit more makes it easier:
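#include <stdio.h>

int main()
{
    int foo = 1;
    int *bar = &foo;

    printf("%i\n", foo);           /* the value of foo */
    printf("%p\n", &foo);          /* the address of foo (the "naked" version) */
    printf("%p\n", (void *)&foo);  /* the same address, cast as %p expects */
    printf("%p\n", &bar);          /* the address of bar itself */
    printf("%p\n", bar);           /* the address stored in bar */
    printf("%i\n", *bar);          /* the value bar points to */
    return 0;
}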
Have them tell you what they expect the output to be on each line, then have them run the program and see what turns up. Answer their questions (the naked version in there will certainly prompt a few, but you can worry about style, strictness and portability later). Then, before their mind turns to mush from overthinking or they become an after-lunch zombie, write a function that takes a value, and the same one that takes a pointer.
In my experience it's getting over that "why does this print that way?" hump, and then immediately showing why this is useful in function parameters by hands-on toying (as a prelude to some basic K&R material like string parsing/array processing), that makes the lesson not just make sense but stick.
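One possible version of that pair of functions, as a sketch; the names increment_copy and increment_in_place are made up here.

```c
#include <assert.h>

/* Takes a value: only the local copy changes. */
void increment_copy(int n) { n = n + 1; (void)n; }

/* Takes a pointer: the caller's variable changes. */
void increment_in_place(int *n) { *n = *n + 1; }

int by_value_vs_by_pointer(void)
{
    int foo = 1;

    increment_copy(foo);
    assert(foo == 1);          /* the caller's foo is untouched */

    increment_in_place(&foo);
    assert(foo == 2);          /* the function wrote through the pointer */

    return foo;
}
```

Asking the student to predict foo after each call makes the point of passing &foo concrete.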
The next step is to get them to explain to you how i[0] relates to &i. If they can do that, they won't forget it and you can start talking about structs, even a little ahead of time, just so it sinks in.
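A sketch of that relationship, under the assumption that i is an array of int (the text does not say how i is declared); the function name and the values are hypothetical.

```c
#include <assert.h>

int array_and_address(void)
{
    int i[3] = {10, 20, 30};   /* assumption: i is an array */
    int *first = i;            /* the array name converts to a pointer to i[0] */

    assert(first == &i[0]);
    assert((void *)&i == (void *)&i[0]);  /* same address, different types */
    assert(*i == i[0]);                   /* i[0] is shorthand for *(i + 0) */
    assert(*(i + 2) == i[2]);

    return i[0] + *(first + 1);           /* 10 + 20 */
}
```

The address of the whole array and the address of its first element compare equal once cast to void *, but &i has type int (*)[3] while &i[0] has type int *.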
The recommendations above about boxes and arrows are good too, but they can also wind up digressing into a full-blown discussion about how memory works, which is a talk that must happen at some point but can distract from the point immediately at hand: how to interpret pointer notation in C.