Are hard-coded STRINGS ever acceptable?

Question

Similar to "Is hard-coding literals ever acceptable?", but I'm specifically thinking of "magic strings" here.

On a large project, we have a table of configuration options like these:

Name         Value
----         -----
FOO_ENABLED  Y
BAR_ENABLED  N
...

(Hundreds of them).

The common practice is to call a generic function to test an option like this:

if (config_options.value('FOO_ENABLED') == 'Y') ...

(Of course, this same option may need to be checked in many places in the system code.)

When adding a new option, I was considering adding a function to hide the "magic string" like this:

if (config_options.foo_enabled()) ...

However, colleagues thought I'd gone overboard and objected to doing this, preferring the hard-coding because:

- That's what we normally do
- It makes it easier to see what's going on when debugging the code

The trouble is, I can see their point! Realistically, we are never going to rename the options for any reason, so about the only advantage I can think of for my function is that the compiler would catch any typo like fo_enabled(), but not 'FO_ENABLED'.

What do you think? Have I missed any other advantages/disadvantages?

Answer 1:

if (config_options.isTrue('FOO_ENABLED')) {...
}

Restrict your hard-coded 'Y' check to one place, even if it means writing a wrapper class for your Map.

if (config_options.isFooEnabled()) {...
}

This might seem okay until you have 100 configuration options and 100 methods, so judge the application's likely growth and needs before deciding on an implementation. Otherwise, it is better to have a class of static strings for the parameter names.

if (config_options.isTrue(ConfigKeys.FOO_ENABLED)) {...
}
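
As a rough sketch of how the two suggestions above could fit together, the wrapper below keeps the 'Y' comparison in a single method and puts the option names in a constants class. The class names ConfigOptions and ConfigKeys, the Map-backed storage, and the choice to treat a missing option as disabled are assumptions made for illustration, not details from the original project.

```java
import java.util.Map;

// Hypothetical holder for option names: one constant per row of the table.
final class ConfigKeys {
    static final String FOO_ENABLED = "FOO_ENABLED";
    static final String BAR_ENABLED = "BAR_ENABLED";

    private ConfigKeys() {}
}

// Hypothetical wrapper around the raw name/value table.
final class ConfigOptions {
    // The only place in the code base where the "Y" literal appears.
    private static final String TRUE_VALUE = "Y";

    private final Map<String, String> values;

    ConfigOptions(Map<String, String> values) {
        this.values = Map.copyOf(values);
    }

    // Assumption of this sketch: a missing option counts as disabled.
    boolean isTrue(String key) {
        return TRUE_VALUE.equals(values.get(key));
    }
}

final class ConfigOptionsDemo {
    public static void main(String[] args) {
        ConfigOptions config = new ConfigOptions(
                Map.of(ConfigKeys.FOO_ENABLED, "Y", ConfigKeys.BAR_ENABLED, "N"));

        System.out.println(config.isTrue(ConfigKeys.FOO_ENABLED));
        System.out.println(config.isTrue(ConfigKeys.BAR_ENABLED));
    }
}
```

With this shape, a misspelled ConfigKeys.FO_ENABLED fails at compile time, while the call site still shows which option is being tested.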

Answer 2:

If I use a string once in the code, I don't generally worry about making it a constant somewhere.

If I use a string twice in the code, I'll consider making it a constant.

If I use a string three times in the code, I'll almost certainly make it a constant.

Answer 3:

In my experience, this kind of issue is masking a deeper problem: failure to do actual OOP and to follow the DRY principle.

In a nutshell, capture the decision at startup time by an appropriate definition for each action inside the if statements, and then throw away both the config_options and the run-time tests.

Details below.

The sample usage was:

if (config_options.value('FOO_ENABLED') == 'Y') ...

which raises the obvious question, "What's going on in the ellipsis?", especially given the following statement:

(Of course, this same option may need to be checked in many places in the system code.)

Let's assume that each of these config_option values really does correspond to a single problem domain (or implementation strategy) concept.

Instead of doing this (repeatedly, in various places throughout the code):

Take a string (tag),
Find its corresponding other string (value),
Test that value as a boolean-equivalent,
Based on that test, decide whether to perform some action.

I suggest encapsulating the concept of a "configurable action".

Let's take as an example (obviously just as hypothetical as FOO_ENABLED ... ;-) that your code has to work in either English units or metric units. If METRIC_ENABLED is "true", convert user-entered data from metric to English for internal computation, and convert back prior to displaying results.

Define an interface:

public interface MetricConverter {
    double toInches(double length);
    double toCentimeters(double length);
    double toPounds(double weight);
    double toKilograms(double weight);
}

which identifies in one place all the behavior associated with the concept of METRIC_ENABLED.

Then write concrete implementations of all the ways those behaviors are to be carried out:

// No-op converter used when METRIC_ENABLED is off: values pass through unchanged.
public class NullConv implements MetricConverter {
    public double toInches(double length) {return length;}
    public double toCentimeters(double length) {return length;}
    public double toPounds(double weight)  {return weight;}
    public double toKilograms(double weight)  {return weight;}
}

and

// lame implementation, just for illustration!!!!
public class MetricConv implements MetricConverter {
    public static final double LBS_PER_KG = 2.2D;
    public static final double CM_PER_IN = 2.54D;
    // centimeters in, inches out
    public double toInches(double length) {return length / CM_PER_IN;}
    // inches in, centimeters out
    public double toCentimeters(double length) {return length * CM_PER_IN;}
    // kilograms in, pounds out
    public double toPounds(double weight)  {return weight * LBS_PER_KG;}
    // pounds in, kilograms out
    public double toKilograms(double weight)  {return weight / LBS_PER_KG;}
}

At startup time, instead of loading a bunch of config_options values, initialize a set of configurable actions, as in:

MetricConverter converter = (metricOption()) ? new MetricConv() : new NullConv();

(where the expression metricOption() above is a stand-in for whatever one-time-only check you need to make, including looking at the value of METRIC_ENABLED ;-)

Then, wherever the code would have said:

double length = getLengthFromGui();
if (config_options.value('METRIC_ENABLED') == 'Y') {
    length = length / 2.54D;
}
// do some computation to produce result
// ...
if (config_options.value('METRIC_ENABLED') == 'Y') {
    result = result * 2.54D;
}
displayResultingLengthOnGui(result);

rewrite it as:

double length = converter.toInches(getLengthFromGui());
// do some computation to produce result
// ...
displayResultingLengthOnGui(converter.toCentimeters(result));

Because all of the implementation details related to that one concept are now packaged cleanly, all future maintenance related to METRIC_ENABLED can be done in one place. In addition, the run-time trade-off is a win; the "overhead" of invoking a method is trivial compared with the overhead of fetching a String value from a Map and performing String#equals.

Answer 4:

I realise the question is old, but it came up on my margin.

AFAIC, the issue here has not been identified accurately, either in the question or in the answers. Forget about "hard-coding strings" or not, for a moment.

The database has a Reference table containing config_options. The PK is a string.

There are two types of PKs:
- Meaningful identifiers, which the users (and developers) see and use. These PKs are supposed to be stable; they can be relied upon.
- Meaningless Id columns, which the users should never see and which the developers have to be aware of and code around. These cannot be relied upon.

It is ordinary, normal, to write code using the absolute value of a meaningful PK, e.g. IF CustomerCode = "IBM" ... or IF CountryCode = "AUS".
- Referencing the absolute value of a meaningless PK is not acceptable (due to auto-increment, gaps being changed, values being replaced wholesale).

Your reference table uses meaningful PKs. Referencing those literal strings in code is unavoidable. Hiding the value will make maintenance more difficult; the code is no longer literal; your colleagues are right. Plus there is the additional redundant function that chews cycles. If there is a typo in the literal, you will soon find that out during Dev testing, long before UAT.
- Hundreds of functions for hundreds of literals is absurd. If you do implement a function, then normalise your code and provide a single function that can be used for any of the hundreds of literals. In which case we are back to a naked literal, and the function can be dispensed with.
- The point is, the attempt to hide the literal has no value.

It cannot be construed as "hardcoding"; that is something quite different. I think that is where your issue is: identifying these constructs as "hardcoded". It is just referencing a meaningful PK literally.

Now, from the perspective of any code segment only, if you use the same value a few times, you can improve the code by capturing the literal string in a variable and then using the variable in the rest of the code block. Certainly not a function. But that is an efficiency and good-practice issue. Even that does not change the effect: IF CountryCode = @cc_aus.

Answer 5:

I believe that the two reasons you have mentioned (a possible misspelling in the string that cannot be detected until run time, and the possibility, although slim, of a name change) would justify your idea.

On top of that, you can get typed functions. Right now it seems you only store booleans, but what if you need to store an int, a string, etc.? I would rather use get_foo() with a type than get_string("FOO") or get_int("FOO").
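
A minimal sketch of what such typed accessors might look like in Java follows. Only FOO_ENABLED comes from the question; the class name TypedConfigOptions and the MAX_RETRIES and REPORT_TITLE options are invented here purely to show an int and a String alongside the boolean.

```java
import java.util.Map;

// Hypothetical typed facade over the raw option table.
final class TypedConfigOptions {
    private final Map<String, String> values;

    TypedConfigOptions(Map<String, String> values) {
        this.values = Map.copyOf(values);
    }

    // Boolean option taken from the question.
    boolean fooEnabled() {
        return "Y".equals(values.get("FOO_ENABLED"));
    }

    // Assumed integer option, for illustration only.
    int maxRetries() {
        return Integer.parseInt(require("MAX_RETRIES"));
    }

    // Assumed free-text option, for illustration only.
    String reportTitle() {
        return require("REPORT_TITLE");
    }

    // Fails loudly when a non-boolean option is absent from the table.
    private String require(String key) {
        String value = values.get(key);
        if (value == null) {
            throw new IllegalStateException("Missing configuration option: " + key);
        }
        return value;
    }
}
```

Callers then write config.maxRetries() and get an int back, so neither the option name nor the string-to-int conversion is repeated at each call site.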

Answer 6:

You really should use constants, not hard-coded literals.

You can say they won't be changed, but you never know. And it is best to make it a habit to use symbolic constants.

Answer 7:

I think there are two different issues here:

In the current project, the convention of using hard-coded strings is already well established, so all the developers working on the project are familiar with it. It might be a sub-optimal convention for all the reasons that have been listed, but everybody familiar with the code can look at it and instinctively know what the code is supposed to do. Changing the code so that certain parts use the "new" functionality will make the code slightly harder to read (because people will have to think and remember what the new convention does) and thus a little harder to maintain. Changing the whole project over to the new convention, though, would probably be prohibitively expensive unless you can quickly script the conversion.
On a new project, symbolic constants are the way to go, IMO, for all the reasons listed. Anything that makes the compiler catch errors at compile time that would otherwise be caught by a human at run time is an especially useful convention to establish.

Answer 8:

I too prefer a strongly-typed configuration class if it is used throughout the code. With properly named methods you don't lose any readability. If you need to do conversions from strings to another data type (decimal/float/int), you don't need to repeat the code that does the conversion in multiple places, and you can cache the result so the conversion only takes place once. You've already got the basis of this in place, so I don't think it would take much to get used to the new way of doing things.
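
One way to read the caching point is to convert every raw string once, when the configuration object is built, and hand out the stored values afterwards. The sketch below assumes that reading; the class name CachedConfig, the TAX_RATE option, and its default of 0 are hypothetical.

```java
import java.util.Map;

// Hypothetical config class that converts each raw string exactly once.
final class CachedConfig {
    private final boolean fooEnabled;
    private final double taxRate; // assumed decimal option, for illustration only

    CachedConfig(Map<String, String> raw) {
        // All string comparisons and parsing happen here, at load time.
        this.fooEnabled = "Y".equals(raw.get("FOO_ENABLED"));
        this.taxRate = Double.parseDouble(raw.getOrDefault("TAX_RATE", "0"));
    }

    // The accessors return the stored results; nothing is parsed again.
    boolean isFooEnabled() {
        return fooEnabled;
    }

    double taxRate() {
        return taxRate;
    }
}
```

A side effect of this design is that a malformed value such as a non-numeric TAX_RATE surfaces at startup rather than at the first call site that happens to read it.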

Answer 9:

Another thing to consider is intent. If you are on a project that requires localization, hard-coded strings can be ambiguous. Consider the following:

const string HELLO_WORLD = "Hello world!";
print(HELLO_WORLD);

The programmer's intent is clear. Using a constant implies that this string does not need to be localized. Now look at this example:

print("Hello world!");

Here we aren't so sure. Did the programmer really not want this string to be localized or did the programmer forget about localization while he was writing this code?

Source: https://stackoverflow.com/questions/487485/are-hard-coded-strings-ever-acceptable

Tags

language-agnostic

literals

string-literals

hard-coding