String to int in Java - likely bad data, need to avoid exceptions

橙三吉。 Posted on 2019-11-28 04:20:52

That's pretty much it, although returning MIN_VALUE is kind of questionable, unless you're sure it's the right thing to use for what you're essentially using as an error code. At the very least I'd document the error code behavior, though.

Might also be useful (depending on the application) to log the bad input so you can trace.
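
A minimal sketch of the logging idea, assuming java.util.logging is acceptable in your application. The class and method names (LoggingIntParser, parseOrDefault) are hypothetical and not from the question:

```java
import java.util.logging.Logger;

final class LoggingIntParser {
    private static final Logger LOG = Logger.getLogger(LoggingIntParser.class.getName());

    private LoggingIntParser() { }

    /**
     * Hypothetical helper: returns fallback when input is not a valid int
     * and logs the rejected input so it can be traced later.
     */
    static int parseOrDefault(String input, int fallback) {
        try {
            return Integer.parseInt(input);
        } catch (NumberFormatException e) {
            // Integer.parseInt also reports null input as NumberFormatException.
            LOG.warning("Unparseable integer input '" + input + "'; using fallback " + fallback);
            return fallback;
        }
    }
}
```

The fallback is passed in by the caller, so each call site has to decide (and can document) what its error value means.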

Stephen Ostermiller

I asked if there were open source utility libraries that had methods to do this parsing for you and the answer is yes!

From Apache Commons Lang you can use NumberUtils.toInt:

// returns defaultValue if the string cannot be parsed.
int i = org.apache.commons.lang.math.NumberUtils.toInt(s, defaultValue);

From Google Guava you can use Ints.tryParse:

// returns null if the string cannot be parsed
// Will throw a NullPointerException if the string is null
Integer i = com.google.common.primitives.Ints.tryParse(s);

There is no need to write your own methods to parse numbers without throwing exceptions.

For user-supplied data, Integer.parseInt is usually the wrong method because it doesn't support internationalisation. The java.text package is your (verbose) friend.

NumberFormat format = NumberFormat.getIntegerInstance(locale);
format.setParseIntegerOnly(true);
format.setMaximumIntegerDigits(9);
ParsePosition pos = new ParsePosition(0);
// parse(String, ParsePosition) does not throw; it returns null on failure
Number parsed = format.parse(str, pos);
if (parsed == null) {
    // ... handle the case where no number could be parsed at all ...
} else if (pos.getIndex() != str.length()) {
    // ... handle case of extraneous characters after digits ...
} else {
    int val = parsed.intValue();
    // ... use val ...
}
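
As a rough, self-contained sketch of the same java.text approach, the snippet below wraps the parse in a method that returns an OptionalInt. The class name LocalizedInts and the explicit range check are my own additions, not part of the answer above:

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;
import java.util.OptionalInt;

final class LocalizedInts {
    private LocalizedInts() { }

    /** Hypothetical helper: parses text as a locale-formatted integer, or returns empty. */
    static OptionalInt parse(String text, Locale locale) {
        if (text == null || text.isEmpty()) {
            return OptionalInt.empty();
        }
        NumberFormat format = NumberFormat.getIntegerInstance(locale);
        format.setParseIntegerOnly(true);
        ParsePosition pos = new ParsePosition(0);
        Number parsed = format.parse(text, pos);
        // A null result means nothing was parsed; a short index means trailing junk.
        if (parsed == null || pos.getIndex() != text.length()) {
            return OptionalInt.empty();
        }
        // Reject values that do not fit in an int instead of silently truncating.
        double value = parsed.doubleValue();
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(parsed.intValue());
    }
}
```

With Locale.US, an input such as "1,234" is accepted because the comma is that locale's grouping separator, whereas Integer.parseInt would reject it.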

What's the problem with your approach? I don't think doing it that way will hurt your application's performance at all. That's the correct way to do it. Don't optimize prematurely.

I'm sure it is bad form, but I have a set of static methods on a Utilities class that do things like Utilities.tryParseInt(String value) which returns 0 if the String is unparseable and Utilities.tryParseInt(String value, int defaultValue) which allows you to specify a value to use if parseInt() throws an exception.

I believe there are times when returning a known value on bad input is perfectly acceptable. A very contrived example: you ask the user for a date in the format YYYYMMDD and they give you bad input. It may be perfectly acceptable to do something like Utilities.tryParseInt(date, 19000101) or Utilities.tryParseInt(date, 29991231); depending on the program requirements.
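
The two methods described above might look roughly like this. It is only a sketch of what such a Utilities class could contain, not the actual code being described:

```java
final class Utilities {
    private Utilities() { }

    /** Returns 0 when value cannot be parsed as an int. */
    static int tryParseInt(String value) {
        return tryParseInt(value, 0);
    }

    /** Returns defaultValue when value cannot be parsed as an int. */
    static int tryParseInt(String value, int defaultValue) {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            // Covers null, empty, non-numeric and out-of-range input.
            return defaultValue;
        }
    }
}
```

With this version, Utilities.tryParseInt("2019-11-28", 19000101) falls back to 19000101 because the dashes make the string unparseable.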

I'm going to restate the point that stinkyminky was making towards the bottom of the post:

A generally well-accepted approach to validating user input (or input from config files, etc.) is to validate before actually processing the data. In most cases, this is a good design move, even though it can result in multiple calls to parsing algorithms.

Once you know that you have properly validated the user input, then it is safe to parse it and ignore, log or convert to RuntimeException the NumberFormatException.

Note that this approach requires you to consider your model in two pieces: the business model (Where we actually care about having values in int or float format) and the user interface model (where we really want to allow the user to put in whatever they want).

In order for the data to migrate from the user interface model to the business model, it must pass through a validation step (this can occur on a field by field basis, but most scenarios call for validation on the entire object that is being configured).

If validation fails, then the user is presented with feedback informing them of what they've done wrong and given a chance to fix it.

Binding libraries like JGoodies Binding and JSR 295 make this sort of thing a lot easier to implement than it might sound - and many web frameworks provide constructs that separate user input from the actual business model, only populating business objects after validation is complete.

In terms of validation of configuration files (the other use case presented in some of the comments), it's one thing to specify a default if a particular value isn't specified at all - but if the data is formatted wrong (someone types an 'oh' instead of a 'zero' - or they copied from MS Word and all the back-ticks got a funky unicode character), then some sort of system feedback is needed (even if it's just failing the app by throwing a runtime exception).
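
To illustrate the configuration-file case, here is a small sketch that treats a missing value and a malformed value differently. It assumes the configuration is held in a java.util.Properties object; the names ConfigInts and intProperty are made up for the example:

```java
import java.util.Properties;

final class ConfigInts {
    private ConfigInts() { }

    /**
     * Hypothetical helper: a missing key yields defaultValue, but a value that
     * is present and malformed fails loudly instead of being silently replaced.
     */
    static int intProperty(Properties props, String key, int defaultValue) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalStateException(
                    "Property '" + key + "' must be an integer, but was '" + raw + "'", e);
        }
    }
}
```

A value such as "3O" (letter 'oh' instead of zero) then stops the application with a message naming the offending property, rather than quietly running with the default.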

Here's how I do it:

public Integer parseInt(String data) {
  Integer val = null;
  try {
    val = Integer.parseInt(data);
  } catch (NumberFormatException nfe) {
    // leave val as null to signal invalid data
  }
  return val;
}

Then the null signals invalid data. If you want a default value, you could change it to:

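public Integer parseInt(String data, int defaultValue) {
  // "default" is a reserved word in Java, so the parameter needs another name
  Integer val = defaultValue;
  try {
    val = Integer.parseInt(data);
  } catch (NumberFormatException nfe) {
    // keep the default value
  }
  return val;
}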

I think the best practice is the code you show.

I wouldn't go for the regex alternative because of the overhead.

Zlosny

Try org.apache.commons.lang.math.NumberUtils.createInteger(String s). That helped me a lot. There are similar methods there for doubles, longs etc.

You could use an Integer, which can be set to null if you have a bad value. If you are using Java 5 or later, it will provide autoboxing/unboxing for you.

Cleaner semantics (Java 8 OptionalInt)

For Java 8+, I would probably use RegEx to pre-filter (to avoid the exception as you noted) and then wrap the result in a primitive optional (to deal with the "default" problem):

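public static OptionalInt toInt(final String input) {
    // Caveat: a digit string outside the int range (e.g. "99999999999") still
    // matches the regex and makes parseInt throw; a null input throws as well.
    return input.matches("[+-]?\\d+")
            ? OptionalInt.of(Integer.parseInt(input))
            : OptionalInt.empty();
}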

If you have many String inputs, you might consider returning an IntStream instead of OptionalInt so that you can flatMap().
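
A sketch of the IntStream variant follows. Note two deviations from the text above: on a Stream<String> the relevant method is flatMapToInt rather than flatMap, and this version catches NumberFormatException instead of pre-filtering with a regex, so out-of-range digit strings are dropped too. The names IntStreams, toInts and parseAll are hypothetical:

```java
import java.util.List;
import java.util.stream.IntStream;

final class IntStreams {
    private IntStreams() { }

    /** Returns a one-element stream for a valid int, or an empty stream otherwise. */
    static IntStream toInts(String input) {
        try {
            return IntStream.of(Integer.parseInt(input));
        } catch (NumberFormatException e) {
            return IntStream.empty();
        }
    }

    /** Keeps only the inputs that parse as ints, preserving their order. */
    static int[] parseAll(List<String> inputs) {
        return inputs.stream()
                .flatMapToInt(IntStreams::toInts)
                .toArray();
    }
}
```

Unparseable entries simply vanish from the result, which is convenient for bulk input but hides which entries were rejected.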


The above code is bad because it is equivalent to the following.

// this is bad
int val = Integer.MIN_VALUE;
try
{
   val = Integer.parseInt(userdata);
}
catch (NumberFormatException ignoreException) { }

The exception is ignored completely. Also, the magic token is bad because a user can legitimately pass in -2147483648 (Integer.MIN_VALUE).

Asking in general whether a string is parseable is not very useful. The check should be relevant to the context: your application has a specific requirement. You can define your method as

private boolean isUserValueAcceptable(String userData)
{
   return (    isNumber(userData)    
          &&   isInteger(userData)   
          &&   isBetween(userData, Integer.MIN_VALUE, Integer.MAX_VALUE ) 
          );
}

That way you can document the requirement and create well-defined, testable rules.
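
One possible way to fill in such rules without relying on exceptions is sketched below. The helper names follow the snippet above, but their bodies, the min/max parameters and the 10-digit limit are my own assumptions:

```java
import java.util.regex.Pattern;

final class UserValueRules {
    // Optional sign followed by 1 to 10 ASCII digits: always small enough for a long.
    private static final Pattern INTEGER = Pattern.compile("[+-]?\\d{1,10}");

    private UserValueRules() { }

    static boolean isInteger(String userData) {
        return userData != null && INTEGER.matcher(userData).matches();
    }

    /** Only call this after isInteger(userData) has returned true. */
    static boolean isBetween(String userData, int min, int max) {
        long value = Long.parseLong(userData);
        return value >= min && value <= max;
    }

    static boolean isUserValueAcceptable(String userData, int min, int max) {
        return isInteger(userData) && isBetween(userData, min, max);
    }
}
```

Parsing into a long first means that a value just outside the int range is rejected by the range rule rather than by an exception. The trade-off is that inputs with many leading zeros (more than 10 digits in total) are rejected as well.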

If you can avoid exceptions by testing beforehand like you said (isParsable()) it might be better--but not all libraries were designed with that in mind.

I used your trick and it sucks because stack traces on my embedded system are printed regardless of if you catch them or not :(

The exception mechanism is valuable, as it is the only way to get a status indicator in combination with a response value. Furthermore, the status indicator is standardized: if there is an error, you get an exception. That way you don't have to think of an error indicator yourself. The controversy is not so much with exceptions as with checked exceptions (i.e. the ones you have to catch or declare).

Personally I feel you picked one of the examples where exceptions are really valuable. It is a common problem that the user enters a wrong value, and typically you will need to go back to the user for the correct one. You normally don't silently fall back to a default value when you ask the user for input; asking again gives the user the impression that their input matters.

If you do not want to deal with the exception, just wrap it in a RuntimeException (or derived class) and it will allow you to ignore the exception in your code (and kill your application when it occurs; that's fine too sometimes).

Some examples on how I would handle NumberFormat exceptions: In web app configuration data:

int loadCertainProperty(String propVal) {
  try
  {
    return Integer.parseInt(propVal);
  }
  catch (NumberFormatException nfe)
  { // RuntimeException need not be declared
    throw new RuntimeException("Property certainProperty in your configuration is expected to be " +
                               "an integer, but was '" + propVal + "'. Please correct your " +
                               "configuration and start again", nfe);
    // After starting an enterprise application the sysadmin should always check availability
    // and can now correct the property value
  }
}

In a GUI:

public int askValue() {
  // TODO add opt-out button; see Swing docs for standard dialog handling
  while (true) {
    try {
      // dialog(...) stands for whatever prompt helper your GUI provides
      String input = dialog("Please enter integer value for FOO");
      return Integer.parseInt(input);
    } catch (NumberFormatException nfe) {
      // Ignoring this and asking again; I don't care how many typos the customer makes
    }
  }
}

In a web form: return the form to the user with a useful error message and a chance to correct it. Most frameworks offer a standardized way of doing validation.

Using Integer.MIN_VALUE as a stand-in for NumberFormatException is a bad idea.

You can submit a proposal to Project Coin to add this method to Integer:

@Nullable public static Integer parseInteger (String src)... it will return null for bad input

Then put a link to your proposal here and we will all vote for it!

PS: Look at http://msdn.microsoft.com/en-us/library/bb397679.aspx to see how ugly and bloated it could be.

Put some if statements in front of it. if (null != userdata )
