Why are fundamental types in C and C++ not strictly defined, like in Java where an int is always 4 bytes and a long is 8 bytes, etc.?
Efficiency is part of the answer--for example, if you use C or C++ on a machine that uses 36-bit registers, you don't want to force every operation to include overhead to mask the results so they look/act like 32 bits.
That's really only part of the answer though. The other part is that C and C++ were (and are) intended to be systems programming languages. You're intended to be able to write things like virtual machines and operating systems with them.
That means that if (for example) you're writing code that will interact with the MMU on this 36-bit machine, and you need to set bit 34 of some particular word, the basic intent with C and C++ is that you should be able to do that directly in the language. If the language starts by decreeing that no 36-bit type can exist in the first place, that generally makes it difficult to directly manipulate a 36-bit type in that language.
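To make that concrete, here is a minimal sketch of what that kind of direct bit twiddling might look like. Everything specific here is made up for illustration: it assumes a hypothetical implementation whose unsigned long is at least 36 bits wide, and MMU_CONTROL_ADDR is an invented register address, not anything from a real machine.

    /* Hypothetical: assumes unsigned long is at least 36 bits and that
     * 0x3F000 is where this machine maps its MMU control word. */
    #define MMU_CONTROL_ADDR ((volatile unsigned long *)0x3F000u)

    void set_mmu_bit_34(void)
    {
        /* Set bit 34 directly; nothing forces the result to be masked
         * down to 32 bits behind our back. */
        *MMU_CONTROL_ADDR |= (unsigned long)1 << 34;
    }

The point is simply that the language lets you express the operation on the full native word, rather than pretending the word is 32 bits.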
So, one of the basic premises of C and C++ is that if you need to do something, you should be able to do that something inside the language. The corresponding premise in Java was almost exactly the opposite: that what it allowed you to do should be restricted to those operations it could guarantee would be safe and portable to anything.
In particular, keep in mind that when Sun designed Java one of the big targets they had in mind was applets for web pages. They specifically intended to restrict Java to the point that an end-user could run any Java applet, and feel secure in the knowledge that it couldn't possible harm their machine.
Of course, the situation has changed--the security they aimed for has remained elusive, and applets are essentially dead and gone. Nonetheless, most of the restrictions that were intended to support that model remain in place.
I should probably add that this isn't entirely an either/or type of situation. There are a number of ways of providing some middle ground. Reasonably recent iterations of both the C and C++ standards include types such as int32_t, which guarantees a 32-bit, two's complement representation, much like a Java type does. So, if you're running on hardware that actually supports a 32-bit two's complement type, int32_t will be present and you can use it.
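As a rough illustration, assuming a hosted C99 (or later) implementation that provides the optional exact-width type, the <stdint.h> and <inttypes.h> headers give you both the exact-width type and the always-available "at least this wide" variants:

    #include <inttypes.h>  /* PRId32, intmax_t */
    #include <stdint.h>    /* int32_t, int_least32_t, int_fast32_t */
    #include <stdio.h>

    int main(void)
    {
        int32_t exact = 42;        /* exactly 32 bits, two's complement;
                                      only present if the hardware supports it */
        int_least32_t least = 42;  /* smallest type with at least 32 bits; always present */
        int_fast32_t fast = 42;    /* "fastest" type with at least 32 bits; always present */

        printf("%" PRId32 " %jd %jd\n", exact, (intmax_t)least, (intmax_t)fast);
        return 0;
    }

On a machine without a native 32-bit two's complement type, int32_t simply isn't defined, while int_least32_t and int_fast32_t still are.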
Of course, that's not the only possible way to accomplish roughly the same thing either. Ada, for example, takes a somewhat different route. Instead of the "native" types being oriented toward the machine, and then adding special types with guaranteed properties, it went the other direction, and had native types with guaranteed properties, but also an entire facility for defining a new type that corresponded directly to a target machine. For better or worse, however, Ada has never achieved nearly as wide of usage as C, C++, or Java, and its approach to this particular problem doesn't seem to have been adopted in a lot of other languages either.
The reason is largely that C is portable to a much wider variety of platforms. There are many reasons why the different data types have turned out to be the various sizes they are on various platforms, but at least historically, int has been adapted to be the platform's native word size. On the PDP-11 it was 16 bits (and long was originally invented for 32-bit numbers), while some embedded platform compilers even have 8-bit ints. When 32-bit platforms came around and started having 32-bit ints, short was invented to represent 16-bit numbers.
Nowadays, most 64-bit architectures use 32-bit ints simply to be compatible with the large base of C programs that were originally written for 32-bit platforms, but there have been 64-bit C compilers with 64-bit ints as well, not least on some early Cray platforms.
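If you want to see what your own platform chose, a trivial program like the one below prints the sizes; the output differs between, say, 64-bit Linux (LP64, where long is 8 bytes) and 64-bit Windows (LLP64, where long is 4 bytes):

    #include <stdio.h>

    int main(void)
    {
        /* The printed values depend entirely on the platform and ABI. */
        printf("short:     %zu bytes\n", sizeof(short));
        printf("int:       %zu bytes\n", sizeof(int));
        printf("long:      %zu bytes\n", sizeof(long));
        printf("long long: %zu bytes\n", sizeof(long long));
        return 0;
    }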
Also, in the earlier days of computing, floating-point formats and sizes were generally far less standardized (IEEE 754 didn't come around until 1985), which is why floats and doubles are even less well-defined than the integer data types. The standards generally don't even presume the presence of such peculiarities as infinities, NaNs, or signed zeroes.
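If you want to check what your implementation actually claims about floating point, C99 provides the __STDC_IEC_559__ macro, which a conforming implementation defines only when it adheres to IEEE 754 (IEC 60559). A small sketch, with no particular platform assumed:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
    #ifdef __STDC_IEC_559__
        printf("This implementation claims IEEE 754 conformance\n");
    #else
        printf("No IEEE 754 conformance claimed\n");
    #endif
        /* HUGE_VAL is infinity where infinities exist, otherwise just
           a very large finite double. */
        printf("isinf(HUGE_VAL) = %d\n", isinf(HUGE_VAL));
        return 0;
    }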
Furthermore, it should perhaps be said that a char is not defined to be 1 byte, but rather to be whatever sizeof returns 1 for, which is not necessarily 8 bits. (For completeness, it should perhaps be added here, also, that "byte" as a term is not universally defined to be 8 bits; there have been many historical definitions of it, and in the context of the ANSI C standard, a "byte" is actually defined to be the smallest unit of storage that can store a char, whatever the nature of char.)
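You can query this directly: CHAR_BIT from <limits.h> tells you how many bits a char (and hence a "byte" in the C sense) occupies on your implementation. A small sketch:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        /* sizeof(char) is 1 by definition, but that "1 byte" is
           CHAR_BIT bits wide, which is only required to be >= 8. */
        printf("CHAR_BIT        = %d\n", CHAR_BIT);
        printf("sizeof(char)    = %zu\n", sizeof(char));
        printf("bits in an int  = %zu\n", sizeof(int) * CHAR_BIT);
        return 0;
    }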
There are also such architectures as the 36-bit PDP-10s and 18-bit PDP-7s that have also run C programs. They may be quite rare these days, but do help explain why C data types are not defined in terms of 8-bit units.
Whether this, in the end, really makes the language "more portable" than languages like Java can perhaps be debated, but it would sure be suboptimal to run Java programs on 16-bit processors, and quite weird indeed on 36-bit processors. It is perhaps fair to say that it makes the language more portable, but programs written in it less portable.
EDIT: In reply to some of the comments, I just want to append, as an opinion piece, that C as a language is unlike languages like Java/Haskell/Ada that are more-or-less "owned" by a corporation or standards body. There is ANSI C, sure, but C is more than ANSI C; it's a living community, and there are many implementations that aren't ANSI-compatible but are "C" nevertheless. Arguing whether implementations that use 8-bit ints are C is similar to arguing whether Scots is English, in that it's mostly pointless. They use 8-bit ints for good reasons, no one who knows C well enough would be unable to reason about programs written for such compilers, and anyone who writes C programs for such architectures would want their ints to be 8 bits.
The integral data types are not strictly specified so that the compiler can choose whatever is most efficient for the target hardware. However, there are some guarantees on the minimum size of each type (e.g. int is at least 16 bits).
Check out this page: http://en.cppreference.com/w/cpp/language/types
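To illustrate those guarantees (this is just a sketch of the standard's minimum ranges, not anything specific to the linked page), you can assert them at compile time with C11 _Static_assert and the limits from <limits.h>; a conforming compiler must accept all of these regardless of the actual type sizes:

    #include <limits.h>

    /* Minimum ranges the standard guarantees for each type. */
    _Static_assert(CHAR_BIT >= 8,                       "char is at least 8 bits");
    _Static_assert(SHRT_MAX >= 32767,                   "short is at least 16 bits");
    _Static_assert(INT_MAX  >= 32767,                   "int is at least 16 bits");
    _Static_assert(LONG_MAX >= 2147483647L,             "long is at least 32 bits");
    _Static_assert(LLONG_MAX >= 9223372036854775807LL,  "long long is at least 64 bits");

    int main(void) { return 0; }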