Is the return type of a function part of the mangled name?

Posted by a 夏天 on 2019-12-22 09:18:04

Question


Suppose I have two functions with the same parameter types and name (not in the same program):

std::string foo(int x) {
  return "hello"; 
}

int foo(int x) {
  return x;
}

Will they have the same mangled name once compiled?

Is the return type part of the mangled name in C++?


Answer 1:


As mangling schemes aren't standardised, there's no single answer to this question; the closest thing to an actual answer would be to look at mangled names generated by the most common mangling schemes. To my knowledge, those are the GCC and MSVC schemes, in alphabetical order, so...


GCC:

To test this, we can use a simple program.

#include <string>
#include <cstdlib>

std::string foo(int x) { return "hello"; }
//int         foo(int x) { return x; }

int main() {
    // Assuming executable file named "a.out".
    system("nm a.out");
}

Compile and run with GCC or Clang, and it'll list the symbols it contains. Depending on which of the functions is uncommented, the results will be:

// GCC:
// ----

std::string foo(int x) { return "hello"; } // _Z3fooB5cxx11i
                                             // foo[abi:cxx11](int)
int         foo(int x) { return x; }       // _Z3fooi
                                             // foo(int)

// Clang:
// ------

std::string foo(int x) { return "hello"; } // _Z3fooi
                                             // foo(int)
int         foo(int x) { return x; }       // _Z3fooi
                                             // foo(int)

The GCC scheme (the Itanium C++ ABI) contains relatively little information and, for ordinary non-template functions like these, does not include the return type:

  • Prefix: _Z, which marks the symbol as a mangled C++ name.
  • Name: 3foo for ::foo.
  • Parameters: i for int.

Despite this, the two names differ when compiled with GCC (but not with Clang), because GCC marks the std::string version as using the cxx11 ABI.
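The two GCC names listed above can also be decoded programmatically. The following is a minimal sketch, assuming a GCC or Clang toolchain whose runtime provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is made up for this illustration and is not a library function.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang runtimes only)

#include <cassert>
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the readable form of an Itanium-ABI mangled
// name, or an empty string if the runtime cannot demangle the input.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller has to release it with free.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return {};
    }
    std::string result(text);
    std::free(text);
    return result;
}
```

Calling demangle("_Z3fooi") should give foo(int), and demangle("_Z3fooB5cxx11i") should give foo[abi:cxx11](int): neither result mentions a return type, and only the ABI tag separates the two.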

Note that the compiler does still keep track of the return type and makes sure signatures match; it just doesn't use the function's mangled name to do so.
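As a rough illustration of that point, the return type remains part of each function's type inside a translation unit, so it can be checked at compile time. In this sketch the int version is renamed to bar (a name chosen only for this example) so that both functions can live in one file, and check_values is likewise a made-up helper.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

std::string foo(int) { return "hello"; }
int bar(int x) { return x; }  // renamed from foo so both can coexist here

// The return type is part of each function's type, so the compiler can
// verify it even when the mangled name carries no trace of it.
static_assert(std::is_same<decltype(foo(0)), std::string>::value,
              "foo returns std::string");
static_assert(std::is_same<decltype(bar(0)), int>::value,
              "bar returns int");
static_assert(!std::is_same<decltype(&foo), decltype(&bar)>::value,
              "the two function types differ in their return type");

// Redeclaring foo with another return type in this translation unit is
// rejected by the compiler:
// int foo(int);  // error: differs only in return type

// Hypothetical helper for a quick runtime sanity check.
inline void check_values() {
    assert(foo(0) == "hello");
    assert(bar(3) == 3);
}
```

This check only works when both declarations are visible in the same translation unit, which is why declarations are normally shared through a header.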


MSVC:

To test this, we can use a simple program, as above.

#include <string>
#include <cstdlib>

std::string foo(int x) { return "hello"; }
//int         foo(int x) { return x; }

int main() {
    // Assuming object file named "a.obj".
    // Pipe to file, because there are a lot of symbols when <string> is included.
    system("dumpbin /symbols a.obj > a.txt");
}

Compile and run with Visual Studio, and a.txt will list the symbols it contains. Depending on which of the functions is uncommented, the results will be:

std::string foo(int x) { return "hello"; }
  // ?foo@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@H@Z
  // class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl foo(int)
int         foo(int x) { return x; }
  // ?foo@@YAHH@Z
  // int __cdecl foo(int)

The MSVC scheme contains the entire declaration, including things that weren't explicitly specified:

  • Name: foo@ for ::foo, followed by @ to terminate.
  • Symbol type: Everything after the name-terminating @.
  • Type and member status: Y for "non-member function".
  • Calling convention: A for __cdecl.
  • Return type:
    • H for int.
    • ?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@ (followed by @ to terminate) for std::basic_string<char, std::char_traits<char>, std::allocator<char>> (std::string for short).
  • Parameter list: H for int (followed by @ to terminate).
  • Exception specifier: Z for throw(...); this one is omitted from demangled names unless it's something else, probably because MSVC just ignores it anyway.

This allows the linker to whine at you if declarations aren't identical across every compilation unit.
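To make the breakdown above concrete, here is a toy decoder for only the simplest shape of MSVC name, ?name@@YA<ret><params>@Z, with a handful of built-in type codes. It is a sketch for illustration, not a real demangler: decode_simple_msvc and self_check are made-up names, and anything outside this narrow shape (classes, namespaces, other calling conventions, empty parameter lists) simply yields an empty string.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical toy decoder for names of the form ?name@@YA<ret><params>@Z,
// i.e. a global, non-member, __cdecl function using built-in types only.
std::string decode_simple_msvc(const std::string& mangled) {
    // Single-letter codes for a few built-in types.
    static const std::map<char, std::string> types = {
        {'D', "char"}, {'H', "int"}, {'M', "float"},
        {'N', "double"}, {'X', "void"}};

    // Leading '?' and the "@@" that ends the name.
    const std::size_t name_end = mangled.find("@@");
    if (mangled.empty() || mangled[0] != '?' || name_end == std::string::npos) {
        return {};
    }
    const std::string name = mangled.substr(1, name_end - 1);

    // 'Y' = non-member function, 'A' = __cdecl.
    std::size_t pos = name_end + 2;
    if (mangled.compare(pos, 2, "YA") != 0) return {};
    pos += 2;

    // Return type code comes first.
    if (pos >= mangled.size() || types.count(mangled[pos]) == 0) return {};
    std::string result = types.at(mangled[pos++]) + " __cdecl " + name + "(";

    // Parameter codes run up to the terminating '@'.
    bool first = true;
    while (pos < mangled.size() && mangled[pos] != '@') {
        if (types.count(mangled[pos]) == 0) return {};
        if (!first) result += ",";
        result += types.at(mangled[pos++]);
        first = false;
    }

    // The name must end with "@Z" (parameter terminator, then 'Z').
    if (mangled.compare(pos, 2, "@Z") != 0 || pos + 2 != mangled.size()) {
        return {};
    }
    return result + ")";
}

// Self-check against the name shown above for int foo(int).
inline void self_check() {
    assert(decode_simple_msvc("?foo@@YAHH@Z") == "int __cdecl foo(int)");
}
```

The std::string version of foo is deliberately out of scope here: its return type is a nested template name, which needs a proper demangler such as the one behind dumpbin.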


Generally, most compilers will use one of those schemes (or sometimes a variation thereof) when targeting *nix or Windows, respectively, but this isn't guaranteed. For example...

  • Clang, to my knowledge, will use the GCC scheme for *nix, or the MSVC scheme for Windows.
  • Intel C++ uses the GCC scheme for Linux and Mac, and the MSVC scheme (with a few minor variations) for Windows.
  • The Borland and Watcom compilers have their own schemes.
  • The Symantec and Digital Mars compilers generally use the MSVC scheme, with a few small changes.
  • Older versions of GCC, and a lot of UNIX tools, use a modified version of cfront's mangling scheme.
  • And so on...

Information on the schemes used by other compilers comes from Agner Fog's PDF.


Note:

Examining the generated symbols, it becomes apparent that GCC's mangling scheme doesn't provide the same level of protection against Machiavelli as MSVC's. Consider the following:

// foo.cpp
#include <string>

// Simple wrapper class, to avoid encoding `cxx11 ABI` into the GCC name.
class MyString {
    std::string data;

  public:
    MyString(const char* const d) : data(d) {}

    operator std::string() { return data; }
};

// Evil.
MyString foo(int i) { return "hello"; }

// -----

// main.cpp
#include <iostream>

// Evil.
int foo(int);

int main() {
    std::cout << foo(3) << '\n';
}

If we compile each source file separately, then attempt to link the object files together...

  • GCC: MyString, due to not being part of the cxx11 ABI, causes MyString foo(int) to be mangled as _Z3fooi, just like int foo(int). This allows the object files to be linked, and an executable is produced. Running it is undefined behavior; in this test, it segfaults.
  • MSVC: The linker will look for ?foo@@YAHH@Z; as we instead supplied ?foo@@YA?AVMyString@@H@Z, linking will fail.

Considering this, a mangling scheme that includes the return type is safer, even though functions can't be overloaded solely on differences in return type.




Answer 2:


No, and I expect that their mangled name will be the same with all modern compilers. More importantly, using them in the same program results in undefined behavior. Functions in C++ cannot differ only in their return type.



Source: https://stackoverflow.com/questions/40791413/is-the-return-type-of-a-function-part-of-the-mangled-name
