Why does parseInt('dsff66',16) return 13?

问题

today I stumbled on a strange (in my opinion) case in JavaScript. I passed a non-hexadecimal string to the parseInt function with the base of 16 and...I got the result. I would expect the function to throw some kind of exception or at least return NaN, but it succeeded parsing it and returned an int.

My call was:

var parsed = parseInt('dsff66', 16); // note the 's' in the first argument
document.write(parsed);

and the result was: 13.

I noticed that it "stops" parsing with the first character that doesn't belong to the numeral system specified in the 2nd argument, so calling parseInt('fg',16) I would get 15 as a result.

In my opinion, it should return NaN. Can anyone explain to me why it doesn't? Why would anyone want this function to behave like this (return an integer even if it isn't the precise representation of the string passed) ?

回答1:

Why would anyone want this function to behave like this (return an integer even if it isn't the precise representation of the string passed)?

Because most of the time (by far) you're working with base 10 numbers, and in that case JS can just cast - not parse - the string to a number. (edit: Apparently not just base-10; see update below.)

Since JS is dynamically typed, some strings work just fine as numbers without any work on your part. For instance:

 "21" / 3;   // => 7
 "12.4" / 4; // => 3.1

No need for parseInt there, because "21" and "12.4" are essentially numbers already. If, however the string was "12.4xyz" then you would indeed get NaN when dividing, since that is decidedly not a number and can't be implicitly cast or coerced to one.

You can also explicitly "cast" a string to number with Number(someString). ~~While it too only supports base 10,~~ it will indeed return NaN for invalid strings.

So because JS already has implicit and explicit type casting/conversion/coercion, parseInt's role isn't to be a yet another type casting function.

parseInt's role is instead to be, well, a parsing function. A function that tries its best to make sense of its input, returning what it can. It's for when you have a string you can't just cast because it's not quite perfectly numeric. (And, like JS's basic syntax, it's reminiscent of C, as apsillers' answer explained nicely.)

And since it's a parser, not a casting function, it's got the additional feature of being able to handle other bases than 10.

Now, you might ask why there isn't a strict casting function that handles non-base-10 numbers, and would complain like you want, but... hey, there just isn't. JS's designers just decided that parseInt would suffice, because, again, 0x63 percent of the time, you're dealing with base 10.

~~Closest you can get to "casting" is probably something horribly hacky like:~~

var hexString = "dsff66";
var number = eval("0x" + hexString); // attempt to interpret as a hexadecimal literal

which'll throw a SyntaxError because 0xdsff66 isn't a valid hex literal.

Update: As Lekensteyn points out in the comments, JS appears to properly cast 0x-prefixed hexadecimal strings too. I didn't know this, but indeed this seems to work:

1 * "0xd0ff66"; // => 13696870
1 * "0xdsff66"; // => NaN

which makes it the simplest way to cast a hex string to a number - and get NaN if it can't be properly represented.

Same behavior applies to Number(), e.g Number("0xd0ff66") returns an integer, and Number("0xdsff66") returns NaN.

(/update)

Alternatively, you can check the string beforehand and return NaN if needed:

function hexToNumber(string) {
  if( !/^(0x)?[0-9a-f]+$/i.test(string) ) return Number.NaN;
  return parseInt(string, 16);
}

回答2:

parseInt reads input until it encounters an invalid character, and then uses whatever valid input it read prior to that invalid character. Consider:

parseInt("17days", 10);

This will use the input 17 and omit everything after the invalid d.

From the ECMAScript specification:

If [input string] S contains any character that is not a radix-R digit, then let Z [the string to be integer-ified] be the substring of S consisting of all characters before the first such character; otherwise, let Z be S.

In your example, s is an invalid base-16 character, so parseInt uses only the leading d.

As for why this behavior was included: there's no way to know for sure, but this is quite likely an attempt to reproduce the behavior of strtol (string to long) from C's standard library. From the strtol(3) man page:

...the string is converted to a long int value in the obvious manner, stopping at the first character which is not a valid digit in the given base.

This connection is further supported (to some degree) by the fact that both parseInt and strtol are specified to ignore leading whitespace, and they can both accept a leading 0x for hexadecimal values.

回答3:

In this particular case parseInt() interpret letter from "A" to "F" as hexadecimal and parse those to decimal numbers. That means d will return 13.

What parseInt() does

parseInt("string", radix) interpret numbers and letters in the string as hexadecimal (it depend on the radix) to number.
parseInt() only parse number or letter as hexadecimal from the beginning of the string until invalid character as hexadecimal.
If parseInt() can't find any number or letter as hexadecimal at the beginning of the string parseInt() will return NaN.
If the radix is not defined, the radix is 10.
If the string begin with "0x", the radix is 16.
If the radix defined 0, the radix is 10.
If the radix is 1, parseInt() return NaN.
If the radix is 2, parseInt() only parse "0" and "1".
If the radix is 3 , parseInt() only parse "0", "1", and "2". And so on.
parseInt() parse "0" to 0 if there is no number follows it as the result and remove 0 if there is number follows it. e.g. "0" return 0 and "01" return 1.
If the radix is 11, parseInt() only parse string that begins with number from "0" to "9" and/or letter "A".
If the radix is 12, parseInt only parse string that begins with number from "0" to "9" and/or letter "A" and "B", and so on.
the maximum radix is 36, it will parse string that begins with number from "0" to "9" and/or letter from "A" to "Z".
If the characters interpreted as hexadecimal more than one, every characters will has different value, though those characters are the same character. e.g. parseInt("AA", 11) the first "A" has different value with the second "A".
Different radix will return different number though the strings is the same string.

See it in action

document.body.innerHTML = "<b>What parseInt() does</b><br>" + 
                          "parseInt('9') = " + parseInt('9') + "<br>" +
                          "parseInt('0129ABZ', 0) = " + parseInt('0129ABZ', 0) + "<br>" +
                          "parseInt('0', 1) = " + parseInt('0', 1) + "<br>" +
                          "parseInt('0', 2) = " + parseInt('0', 2) + "<br>" +
                          "parseInt('10', 2) = " + parseInt('10', 2) + "<br>" +
                          "parseInt('01', 2) = " + parseInt('01', 2) + "<br>" +
                          "parseInt('1', 2) = " + parseInt('1', 2) + "<br>" +
                          "parseInt('A', 10) = " + parseInt('A', 10) + "<br>" +
                          "parseInt('A', 11) = " + parseInt('A', 11) + "<br>" +
                          "parseInt('Z', 36) = " + parseInt('Z', 36) + "<br><br>" +
                          "<b>The value:</b><br>" +
                          "parseInt('A', 11) = " + parseInt('A', 11) + "<br>" +
                          "parseInt('A', 12) = " + parseInt('A', 12) + "<br>" +
                          "parseInt('A', 13) = " + parseInt('A', 13) + "<br>" +
                          "parseInt('AA', 11) = " + parseInt('AA', 11) + " = 100 + 20" + "<br>" +
                          "parseInt('AA', 12) = " + parseInt('AA', 12) + " = 100 + 30" + "<br>" +
                          "parseInt('AA', 13) = " + parseInt('AA', 13) + " = 100 + 40" + "<br>" +
                          "parseInt('AAA', 11) = " + parseInt('AAA', 11) + " = 1000 + 300 + 30" + "<br>" +
                          "parseInt('AAA', 12) = " + parseInt('AAA', 12) + " = 1000 + 500 + 70" + "<br>" +
                          "parseInt('AAA', 13) = " + parseInt('AAA', 13) + " = 1000 + 700 + 130" + "<br>" +
                          "parseInt('AAA', 14) = " + parseInt('AAA', 14) + " = 1000 + 900 + 210" + "<br>" +
                          "parseInt('AAA', 15) = " + parseInt('AAA', 15) + " = 1000 + 1100 + 310";

回答4:

For radices above 10, the letters of the alphabet indicate numerals greater than 9. For example, for hexadecimal numbers (base 16), A through F are used.

In your string dsff66, d is a hexadecimal character(even though the string is non hex) which fits the radix type and is equivalent to number 13. It stops parsing after that since next character is not hexadecimal hence the result.

来源：https://stackoverflow.com/questions/26017371/why-does-parseintdsff66-16-return-13

标签

javascript

parseint