Skip variable declaration using goto?

Question

I am reading C Programming: A Modern Approach by K. N. King to learn the C programming language, and it notes that goto statements must not skip variable-length array declarations.

That raises a question: why are goto jumps allowed to skip fixed-length array declarations and ordinary declarations? More precisely, what is the behavior of examples like the following according to the C99 standard? When I tested these cases, it seemed that the declarations were not actually jumped over. Is that correct? Are variables whose declarations might have been jumped over safe to use?

goto later;
int a = 4;
later:
printf("%d", a);

goto later;
int a;
later:
a = 4;
printf("%d", a);

goto later;
int a[4];
a[0] = 1;
later:
a[1] = 2;
for(int i = 0; i < sizeof(a) / sizeof(a[0]); i++)
  printf("%d\n", a[i]);

Answer 1:

I'm in the mood for explaining this without gory memory-layout details (believe me, they get very gory when VLAs are used; see @Ulfalizer's answer for details).

So, originally, in C89, it was mandatory to declare all variables at the start of a block, like this:

{
    int a = 1;
    a++;
    /* ... */
}

This directly implies a very important thing: one block == one unchanging set of variable declarations.

C99 changed this. You can now declare variables anywhere in a block, but declarations are still not ordinary statements.

In fact, to understand this, you can imagine that all variable declarations are implicitly moved to the start of the block in which they appear, and made unavailable to all statements that precede them.

That is simply because the one block == one set of declarations rule still holds.

That is why you cannot "jump over a declaration". The declared variable would still exist.

The problem is initialization. It doesn't get "moved" anywhere. So, technically, for your case, the following programs could be considered equivalent:

goto later;
int a = 100;
later:
printf("%d", a);

and

int a;
goto later;
a = 100;
later:
printf("%d", a);

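As you can see, the declaration is still there; what is skipped is the initialization. In both versions a is read without ever having been given a value, so what gets printed is indeterminate.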
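
A minimal sketch of the safe variant (the second snippet in the question), assuming a C99 compiler; the function name skip_initializer is hypothetical. The jump bypasses the initializer, but a is assigned after the label and before it is read, so no indeterminate value is ever used:

```c
/* Hypothetical helper: jumps over the declaration of `a`, then assigns
   to it before reading it. */
int skip_initializer(void)
{
    goto later;
    int a = 100;   /* storage for a exists, but this initializer never runs */
later:
    a = 4;         /* assign before the first read */
    return a;
}
```

Calling skip_initializer() returns 4. Some compilers can warn about jumps like this one; GCC, for example, has -Wjump-misses-init.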

This model does not work for VLAs, because their size is not known until the declaration is executed. In short, it's because this is valid:

int size = 7;
int test[size];

Unlike every other declaration, a VLA declaration depends on the point in the block at which it is executed: its size, and therefore its memory layout, is computed from values that exist only at that point. You just can't "move" it away from the place you jumped over.

You may ask, "all right, then why not make it so that the declaration would be unaffected by the goto"? Well, you'd still get cases like this:

goto later;
int size = 7;
int test[size];
later:

What would you actually expect this to do? The size of test depends on size, whose initializer was skipped as well.

So the ban on jumping over VLA declarations is there for a reason: prohibiting such cases altogether is the most logical way to deal with them.
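
One way to keep a goto and a VLA in the same function, sketched under the assumption of a compiler with VLA support (optional since C11): put the VLA in its own nested block, so the jump goes past the whole block and never lands inside the scope of the VLA. The name sum_or_skip and its parameters are hypothetical, and n is assumed to be positive.

```c
/* Hypothetical helper: sums 0..n-1 through a VLA, or returns -1 when
   skip is nonzero. Assumes n > 0. */
int sum_or_skip(int n, int skip)
{
    int result = -1;

    if (skip)
        goto done;      /* allowed: never enters the scope of the VLA */

    {
        int values[n];  /* VLA; its scope ends at the closing brace */
        result = 0;
        for (int i = 0; i < n; i++) {
            values[i] = i;
            result += values[i];
        }
    }

done:
    return result;
}
```

Here sum_or_skip(4, 0) returns 6 and sum_or_skip(4, 1) returns -1. Moving the done label inside the nested block, after the declaration of values, would be the kind of jump the standard forbids.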

Answer 2:

The reason you're not allowed to skip over the declaration of a variable-length array (VLA) is that it would get messy with the way VLAs are commonly implemented, and would complicate the semantics of the language.

The way VLAs are likely to be implemented in practice is by decrementing (or incrementing, on architectures where the stack grows upwards) the stack pointer by a dynamic amount, calculated at runtime, to make room for the VLA on the stack. This happens at the point where the VLA is declared (conceptually at least, ignoring optimizations). It is needed so that later stack operations (e.g., pushing arguments to the stack for a function call) do not step on the VLA's memory.

For VLAs nested in blocks, the stack pointer would commonly be restored at the end of the block containing the VLA. If a goto were allowed to jump into such a block and past the declaration of a VLA, the code that restores the stack pointer would run without the corresponding allocation code having run, which would likely cause problems. For example, the stack pointer might be incremented by the size of the VLA even though it was never decremented, which would, among other things, make the return address that was pushed when the function containing the VLA was called appear in the wrong place relative to the stack pointer.

It's also messy from a pure language semantics perspective. If you're allowed to skip over the declaration, then what would the size of the array be? What should sizeof return? What does it mean to access it?

For the non-VLA cases, you would simply skip the value initialization (if any), which doesn't necessarily cause problems in and of itself. If you jump over a non-VLA definition like int x;, storage is still reserved for the variable x. VLAs are different in that their size is calculated at runtime, which complicates things.
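
As an illustration of the non-VLA case, here is a sketch modeled on the third snippet in the question (the function name sum_after_jump is hypothetical). The declaration of the fixed-size array is jumped over, yet sizeof still yields a compile-time constant and the storage is usable, provided each element is written before it is read:

```c
#include <stddef.h>

/* Hypothetical helper: the declaration of `a` is jumped over, but `a`
   still has storage and a size fixed at compile time. */
int sum_after_jump(void)
{
    int sum = 0;

    goto later;
    int a[4];            /* no initializer, so the jump skips nothing */
later:
    for (size_t i = 0; i < sizeof a / sizeof a[0]; i++) {
        a[i] = (int)i + 1;   /* write every element before reading it */
        sum += a[i];
    }
    return sum;
}
```

sum_after_jump() returns 10. The original snippet differs in that it writes only a[1] and then prints all four elements, so the other three printed values are indeterminate.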

As a side note, one of the motivations for allowing variables to be declared anywhere within a block in C99 (C89 requires declarations to be at the beginning of the block, though at least GCC allows them within the block as an extension) was to support VLAs. Being able to perform calculations earlier in the block before declaring the size of the VLA is handy.

For somewhat related reasons, C++ does not allow a goto to skip over the declaration of an object with a non-trivial constructor or destructor, or over any declaration with an initializer, even for plain old data types such as int. This is because it would be unsafe to jump over the code that calls the constructor but still run the destructor at the end of the block.

Answer 3:

Using a goto to jump over the declaration of a variable is almost certainly a really bad idea, but it's perfectly legal.

C makes a distinction between the lifetime of a variable and its scope.

For a variable declared without the static keyword inside a function, its scope (the region of program text in which its name is visible) extends from the definition to the end of the nearest enclosing block. Its lifetime (storage duration) begins on entry to the block and ends on exit from the block. If it has an initializer, it's executed when (and if) the definition is reached.

For example:

{  /* the lifetime of x and y starts here; memory is allocated for both */
    int x = 10; /* the name x is visible from here to the "}" */
    int y = 20; /* the name y is visible from here to the "}" */
    int vla[y]; /* vla is visible, and its lifetime begins here */
    /* ... */
}

For variable length arrays (VLAs), the visibility of the identifier is the same, but the lifetime of the object begins at the definition. Why? Because the length of the array is not necessarily known prior to that point. In the example, it's not possible to allocate memory for vla at the beginning of the block, because we don't yet know the value of y.
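
To make the point about lifetime concrete, here is a small sketch (the name vla_length is hypothetical; it assumes base > 0 and a compiler with VLA support). The length of the array is computed inside the block, so the storage cannot be set up until the declaration itself is reached, and sizeof is evaluated at run time:

```c
#include <stddef.h>

/* Hypothetical helper: returns the element count of a VLA whose length
   is computed at run time. Assumes base > 0. */
size_t vla_length(int base)
{
    int y = base * 2;    /* value known only at run time */
    int vla[y];          /* lifetime of vla begins here, not at the "{" */
    return sizeof vla / sizeof vla[0];   /* evaluated at run time for a VLA */
}
```

vla_length(1) returns 2 and vla_length(5) returns 10: the same declaration produces arrays of different sizes depending on the values available when it executes.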

A goto that skips over an object definition bypasses any initializer for that object, but memory is still allocated for it. If the goto jumps into a block, memory is allocated as the block is entered. If it doesn't (if both the goto and the target label are at the same level in the same block), then the object will already have been allocated.

...
goto LABEL;
{
    int x = 10;
    LABEL: printf("x = %d\n", x);
}

When the printf statement is executed, x exists and its name is visible, but its initialization has been bypassed, so it has an indeterminate value.

The language forbids a goto that skips the definition of a variable-length array. If it were permitted, it would skip the allocation of memory for the object, and any attempt to reference it would cause undefined behavior.

goto statements do have their uses. Using them to skip over declarations, though it's permitted by the language, is not one of them.

Source: https://stackoverflow.com/questions/29880836/skip-variable-declaration-using-goto

Tags

arrays

goto

variable-declaration