I have heard about code bloat in the context of C++ templates. I know that it is not supposed to be an issue with modern C++ compilers, but I want to construct an example and convince myself.
Here is a little utility script I have been using to get insight into just these issues. It shows you not only if a symbol is defined multiple times, but also how much code size each symbol is taking. I have found this extremely valuable for auditing code size issues.
For example, here is a sample invocation:
$ ~/nmsize src/upb_table.o
39.5% 488 upb::TableBase::DoInsert(upb::TableBase::Entry const&)
57.9% 228 upb::TableBase::InsertBase(upb::TableBase::Entry const&)
70.8% 159 upb::MurmurHash2(void const*, unsigned long, unsigned int)
78.0% 89 upb::TableBase::GetEmptyBucket() const
83.8% 72 vtable for upb::TableBase
89.1% 65 upb::TableBase::TableBase(unsigned int)
94.3% 65 upb::TableBase::TableBase(unsigned int)
95.7% 17 typeinfo name for upb::TableBase
97.0% 16 typeinfo for upb::TableBase
98.0% 12 upb::TableBase::~TableBase()
98.7% 9 upb::TableBase::Swap(upb::TableBase*)
99.4% 8 upb::TableBase::~TableBase()
100.0% 8 upb::TableBase::~TableBase()
100.0% 0
100.0% 0 __cxxabiv1::__class_type_info
100.0% 0
100.0% 1236 TOTAL
In this case I have run it on a single .o file, but you can also run it on a .a file or on an executable. Here I can see that the constructors and destructors were each emitted two or three times, which is a result of this bug.
Here is the script:
#!/usr/bin/env ruby
# Print each symbol's size and the cumulative share of the total, largest first.
syms = []
total = 0
IO.popen("nm --demangle -S #{ARGV.join(' ')}").each_line { |line|
  # With -S, nm prints: address, size (hex), symbol type, demangled name.
  addr, size, scope, name = line.split(' ', 4)
  next unless addr and size and scope and name
  name.chomp!
  addr = addr.to_i(16)
  size = size.to_i(16)
  total += size
  syms << [size, name]
}
# Sort by size, descending.
syms.sort! { |a,b| b[0] <=> a[0] }
cumulative = 0.0
syms.each { |sym|
  size = sym[0]
  cumulative += size
  printf "%5.1f%% %6s %s\n", cumulative / total * 100, size.to_s, sym[1]
}
printf "%5.1f%% %6s %s\n", 100, total, "TOTAL"
If you run this on your own .a files or executable files, you should be able to convince yourself that you know exactly what is happening with your code size. I believe that recent versions of gcc may remove redundant or useless template instantiations at link time, so I recommend analyzing your actual executables.
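If you mainly care about symbols that were emitted more than once, such as the repeated constructors and destructors in the sample output, you can group the same nm output by name. The following is only a rough sketch along those lines, not part of the script above: the helper name duplicate_symbols is made up for illustration, and it assumes the four-column layout that nm --demangle -S prints for symbols with a known size.

```ruby
#!/usr/bin/env ruby
# Hypothetical helper (not part of the nmsize script): group the text output
# of `nm --demangle -S` by demangled name and return the names that appear
# more than once, mapped to the list of their sizes in bytes.
def duplicate_symbols(nm_output)
  sizes_by_name = Hash.new { |hash, key| hash[key] = [] }
  nm_output.each_line do |line|
    addr, size, scope, name = line.split(' ', 4)
    next unless addr && size && scope && name
    # Require a hex address, a hex size, and a one-letter symbol type, so that
    # undefined symbols whose demangled names contain spaces are skipped.
    next unless addr.match?(/\A\h+\z/) && size.match?(/\A\h+\z/) && scope.length == 1
    sizes_by_name[name.chomp] << size.to_i(16)
  end
  sizes_by_name.select { |_name, sizes| sizes.length > 1 }
end

# Only run nm when invoked directly with at least one file argument.
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  output = IO.popen(['nm', '--demangle', '-S', *ARGV], &:read)
  duplicate_symbols(output).each do |name, sizes|
    printf "%3dx %6d bytes total  %s\n", sizes.length, sizes.sum, name
  end
end
```

Passing the arguments to IO.popen as an array avoids going through the shell, so file names containing spaces are handled correctly. Note that identical names in a .a file or across several .o files may simply be the same inline or template function emitted in multiple translation units, which the linker normally merges, so the final executable remains the thing to check.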