Why do compilers give a warning about returning a reference to a local stack variable if it is undefined behaviour?

Posted by 谁都会走 on 2019-11-30 10:54:56

From a compiler's point of view, it is almost impossible to verify in general whether you are returning a reference to a local variable. If the standard required that to be diagnosed as an error, writing a conforming compiler would be almost impossible. Consider:

bool not_so_random() { return true; }
int& foo( int x ) {
   static int s = 10;
   int *p = &s;
   if ( !not_so_random() ) {
      p = &x;
   }
   return *p;
}

The above program is correct and safe to run: in the current implementation, foo is guaranteed to return a reference to a static variable. But from the compiler's perspective (and with separate compilation in place, where the implementation of not_so_random() is not accessible), the compiler cannot know that the program's behaviour is well defined.

This is a toy example, but you can imagine similar code, with different return paths, where p might refer to different long-lived objects in all paths that return *p.
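As a rough sketch of what such code might look like, here is a function with three return paths, each of which leaves p pointing at an object that outlives the call. The names g_fallback, pick_slot and s_overflow are hypothetical and are not taken from the answer above; the point is only that a compiler looking at `return *p;` would need whole-path reasoning to see that no path is dangling.

```cpp
#include <cstddef>

// Hypothetical names throughout; none of these come from the original answer.
int g_fallback = 0;   // namespace-scope object, lives for the whole program

int& pick_slot(int* table, std::size_t size, std::size_t index) {
    static int s_overflow = -1;   // function-local static, also long-lived

    int* p = &g_fallback;         // path 1: no table given, use the global
    if (table != nullptr && index < size) {
        p = &table[index];        // path 2: element owned by the caller
    } else if (table != nullptr) {
        p = &s_overflow;          // path 3: index out of range, use the static sentinel
    }
    return *p;                    // every path refers to an object that outlives this call
}
```

Calling pick_slot with a caller-owned array and an in-range index yields a reference to that array element; the other two paths yield references to the static sentinel or the global.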

Undefined behaviour is not a compilation error; it just means the standard places no requirements on what the program does. A program with undefined behaviour can still compile, but its behaviour is unpredictable. I'd wager a bet that it's not even possible in principle for a computer to decide whether a given program text is free of undefined behaviour.

You can always add -Werror to gcc to make warnings terminate compilation with an error!

To add another favourite SO topic: Would you like i = i++ + ++i to cause a compile error, too?

If you return a pointer or reference to a local variable from a function, the behaviour is well defined as long as you do not dereference the returned pointer or use the returned reference.

It is undefined behaviour only when the returned pointer is dereferenced.

Whether the behaviour is undefined depends on the code calling the function, not on the function itself.

So while compiling just the function, the compiler cannot determine whether the behaviour is undefined or well defined. The best it can do is warn you of a potential problem, and it does!

A code sample:

#include <iostream>

struct A
{ 
   int m_i;
   A():m_i(10)
   {

   } 
};  
A& foo() 
{     
    A a;
    a.m_i = 20;     
    return a; 
} 

int main()
{
   foo(); // Not undefined behaviour: the returned reference is never used.

   A& ref = foo(); // Still not undefined behaviour: the reference is bound but nothing is read through it yet.

   std::cout << ref.m_i; // Undefined behaviour: the destroyed object is accessed.

   return 0;
}

Reference to the C++ Standard:
section 3.8

Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that refers to the storage location where the object will be or was located may be used but only in limited ways. Such a pointer refers to allocated storage (3.7.3.2), and using the pointer as if the pointer were of type void*, is well-defined. Such a pointer may be dereferenced but the resulting lvalue may only be used in limited ways, as described below. If the object will be or was of a class type with a non-trivial destructor, and the pointer is used as the operand of a delete-expression, the program has undefined behavior. If the object will be or was of a non-POD class type, the program has undefined behavior if:

— .......

Because the standard does not restrict us.

If you want to shoot yourself in the foot, you can!

However, let's see an example where it can be useful:

int &foo()
{
    int y;
    return y; // deliberately returns a reference to a local; only its address is used below
}

bool stack_grows_forward()
{
    int &p=foo();
    int my_p;
    return &my_p < &p;
}

Compilers should not refuse to compile programs unless the standard says they are allowed to do so. Otherwise it would be much harder to port programs, since they might not compile with a different compiler, even though they comply with the standard.

Consider the following function:

int foobar() {
    int a=1,b=0;
    return a/b;
}

Any decent compiler will detect that I am dividing by zero, but it should not reject the code, since I might actually want to trigger a SIGFPE signal.

As David Rodríguez has pointed out, there are some cases which are undecidable but there are also some which are not. Some new version of the standard might describe some cases where the compiler must/is allowed to reject programs. That would require the standard to be very specific about the static analysis which is to be performed.

The Java standard actually specifies some rules for checking that non-void methods always return a value. Unfortunately I haven't read enough of the C++ standard to know what the compiler is allowed to do.

You could also return a reference to a static variable, which would be valid code so the code must be able to compile.
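A minimal sketch of that valid case, assuming a hypothetical function name call_counter: the function has the same shape as one returning a reference to a local, but the referenced object has static storage duration and so still exists after the function returns.

```cpp
// Hypothetical function name, used only for illustration.
int& call_counter() {
    static int count = 0;   // static storage duration: outlives every call
    ++count;
    return count;           // safe: the referenced object still exists after return
}
```

Every call returns a reference to the same object, so a caller holding the reference from the first call sees the value change after the second call.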

It is very bad practice to rely on this, but I believe that in many cases (never a safe wager) the memory behind the reference would still hold the old value if no functions are called between the time foo() returns and the time the calling function uses its return value. In that case, that area of the stack would not have had an opportunity to be overwritten. The behaviour is still undefined, though.

In C and C++ you can choose to access arbitrary sections of memory anyway (within the process's memory space, of course) via pointer arithmetic, so why not allow the possibility of constructing a reference to wherever one so chooses?
