Is it undefined behavior to exceed translation limits and are there checker tools to find it?

不羁的心 submitted on 2020-01-01 04:35:06

Question


ORIGINAL QUESTION:

I'm searching the C90 standard for things to be aware of when writing highly portable code, while having low trust in the goodwill of the compiler vendor and assuming that my software might kill somebody someday if I do things wrong. Let's say I'm a little paranoid.

At the moment I am thinking about the "Translation limits" (5.2.4.1 ANSI/ISO 9899:1990). As pointed out in the standard and in "Does ansi C place a limit on the number of external variables in a program?", those are minimum requirements for a standard-conforming implementation. On the other hand, this means that an implementation does not have to do more, and if I want to be sure that my code works with any conforming implementation, these limits represent absolute limits for me.

So far so annoying.

So the compiler vendor chooses limits that equal or exceed the minimum required translation limits.

What happens if one exceeds these implementation-defined translation limits of a specific implementation? In my copy of ANSI/ISO 9899:1990 (C90) I haven't found anything, so I think it is undefined behavior "of the 3rd kind" (by omission). On the other hand, this would not be the first time that I misunderstood the standard or didn't find the right passage.

So here are my questions:

  • Is exceeding the translation limits of a specific implementation undefined behavior in C90?

  • Does the C90 behavior hold for the corrected versions up to C95/C96 and for the newer iterations C99 and C11?

  • Has anyone seen a checker tool out there that checks for the minimal or (tool) user-defined limits?

ASPECTS BEYOND THE ORIGINAL QUESTION:

Interesting aspects in answers and comments:

1) As Michael Burr pointed out in a direct comment on the question, according to the C standard (I have only checked C90 without corrigenda, and the C99 draft that Michael referenced) a conforming C implementation only needs to accept ONE program that contains all limits at the same time, which in the strictest interpretation nullifies any minimum-limit guarantees.

2) As rubenvb and Keith Thompson pointed out, implementations of some quality should provide diagnostics when their implementation-defined limits are exceeded, especially if those limits do not meet the minimum requirements (rubenvb linked an example for MSVC in a comment).

3) Since exceeding the compiler limits might be undefined behavior, but will surely lead to some error, the values of the "variables" to which the translation limits apply for a given piece of my code represent preconditions for reuse.

My personal strategies to deal with them

1) For maximal paranoia, I will make a fool of myself and annoy the compiler vendors' support with a request to guarantee that the limits chosen by the implementation apply to any program. :-(

2) I will investigate the compiler documentation, and test the capacity for suffering of the compiler support staff, to get confirmation that:
  - for every translation limit, a diagnostic will be raised if it is exceeded, and
  - because it is undefined behavior, every instance of exceeding a translation limit will raise a diagnostic, or else another error has already prevented compilation.

3) I will try to get my hands on a tool (or develop one myself if I really must) that measures those values and provides them as preconditions for code reuse for my program. As Keith Thompson pointed out in his answer, some of the values might need deeper knowledge of how the implementation is... implemented. I am not yet sure what can help in such cases beyond the actions in 2). As far as I can see, I have to test, but I only need to test if there is UB (without a diagnostic), and if that is the case, a successful test cannot guarantee correctness in the general case.

ANSWERED:

Yes, it is undefined behavior by omission.

Keith Thompson has shown in his (accepted) answer, with terminology from and references to the C standard documents, that it is undefined behavior.

A tool that checks translation limits in the code has not (yet) been discovered by the commenters. If anyone knows of a tool that has this functionality (even partly), please leave an answer or comment.


Answer 1:


I believe the behavior is undefined.

The standard requires a diagnostic for any translation unit that violates a constraint or syntax rule (N1570 5.1.1.3), and an implementation may not successfully translate a translation unit that contains a #error directive that survives the preprocessing phase (N1570 4, paragraph 4). (N1570 is a draft of the C11 standard, but this is the same across C90, C99, and C11, except that the requirement about #error was added by C99; the directive itself already existed in C90.)

All constraints and syntax rules are specified explicitly in the standard. Exceeding an implementation-defined limit violates neither a constraint nor a syntax rule. It's sufficiently obvious, I think, that an implementation is not required to successfully process an otherwise correct program that exceeds a translation limit, but the standard says nothing about how it should respond to such a violation. Therefore, the behavior is undefined by omission.

(An implementation of decent quality would issue a diagnostic saying that a limit has been exceeded, but this is not required by the standard.)

To answer the third part of your question, no, I haven't heard of a static checker tool that checks programs for violations of the minimum translation limits. Such a tool could be quite useful, and probably wouldn't be too difficult to write once you have a C parser. For the limit on the size of an object (32767 bytes in C90, 65535 bytes in C99 and C11), it would have to know how the compiler determines object sizes; int arr[30000]; may or may not exceed 65535 bytes, depending on sizeof (int). I wouldn't be too surprised if someone has already implemented such a tool and I just haven't heard of it.
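For the object-size limit specifically, one way to avoid relying on a separate tool is to turn the check into a constraint violation, which the compiler must diagnose. The following is a minimal sketch, not an established idiom from the standard: the macro `STATIC_CHECK`, the constants, the helper `fits_object_limit`, and the example array `arr` are all hypothetical names chosen for illustration. It relies on the rule that an array size in a declaration must be a constant expression greater than zero, so a failed check produces an array of size -1.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum object-size limits quoted in the answer above. */
#define C90_MAX_OBJECT_BYTES 32767UL
#define C99_MAX_OBJECT_BYTES 65535UL

/* Hypothetical C90-compatible compile-time check: if cond is false,
   the typedef declares an array of size -1, which violates a
   constraint and therefore requires a diagnostic. */
#define STATIC_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

/* Example object; 3000 ints stay below 32767 bytes for the common
   sizes of int (2, 4, or 8 bytes). */
int arr[3000];

STATIC_CHECK(arr_fits_c90_object_limit,
             sizeof arr <= C90_MAX_OBJECT_BYTES);

/* Runtime variant of the same arithmetic, useful when reasoning about
   a target whose sizeof (int) differs from the host's.
   Returns nonzero if count elements of elem_size bytes fit in limit. */
int fits_object_limit(unsigned long elem_size, unsigned long count,
                      unsigned long limit)
{
    assert(elem_size > 0);
    /* Divide instead of multiplying to avoid unsigned wrap-around. */
    return count <= limit / elem_size;
}
```

With these definitions, `fits_object_limit(2UL, 30000UL, C99_MAX_OBJECT_BYTES)` is true while `fits_object_limit(4UL, 30000UL, C99_MAX_OBJECT_BYTES)` is false, which is the `int arr[30000];` case described above. This covers only one of the translation limits; most of the others cannot be expressed as a constant expression inside the program.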

Note that most implementations do not impose the fixed limits that the standard permits; rather, any limits are imposed by the memory resources available at compile time.

The standard does present the translation limits in a rather odd way. I'm thinking in particular of the clause that says:

The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:

(that's section 5.2.4.1 in C90, C99, and C11). So a perverse implementation could accept exactly one program and reject all others.

The point, I think, is that specifying reasonable limits that all implementations must meet would be impractical. The standard could say that all implementations must always accept objects of at least 32767 bytes -- but what about a program that defines a million such objects? The limits interact with each other in extremely complex ways, and the nature of the interaction depends on the internal structure of each compiler. (If you think you can define the requirements for translation limits better than the C standard does, I encourage you to try it.)

Instead, the standard states the requirements in such a way that the easiest way to implement a useful compiler that obeys the letter of the standard is to implement a useful compiler that obeys the spirit of the standard, by not imposing any unreasonable limits. A useless compiler that meets the letter of the standard is possible but irrelevant; I don't know that anybody has ever implemented such a thing, and I'm sure nobody would attempt to use it.




Answer 2:


It's not undefined behavior, it is implementation defined behavior. This means it all depends on the compiler.

Yes, the minimal implementation guidelines remain the same or are extended in newer versions of the standard.

You could probably use Clang for this, but you'll need to write the tool yourself using the Clang API; I don't know of a pre-existing implementation.
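Some limits can be measured without a full parser. As a rough sketch of what a home-grown checker might look like, the code below measures the longest logical source line and compares it with the C90 minimum of 509 characters. It does not use the Clang API; the function names and the constant are hypothetical, and the sketch makes simplifying assumptions: it ignores trigraphs (including `??/` as a backslash), does not count the terminating newline, and only joins physical lines that end in a backslash immediately followed by a newline.

```c
#include <assert.h>
#include <stddef.h>

/* C90 minimum limit on characters in a logical source line. */
#define C90_MAX_LOGICAL_LINE 509

/* Returns the length of the longest logical line in src. A backslash
   directly followed by a newline splices two physical lines into one
   logical line, so neither character is counted. */
size_t longest_logical_line(const char *src)
{
    size_t cur = 0;
    size_t best = 0;

    assert(src != NULL);
    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;               /* line splice: same logical line */
        } else if (*src == '\n') {
            if (cur > best) {
                best = cur;
            }
            cur = 0;                /* start of the next logical line */
            src++;
        } else {
            cur++;
            src++;
        }
    }
    /* The last line may lack a trailing newline. */
    return cur > best ? cur : best;
}

/* Nonzero if some logical line is longer than the C90 minimum limit. */
int exceeds_c90_line_limit(const char *src)
{
    return longest_logical_line(src) > C90_MAX_LOGICAL_LINE;
}
```

A caller would read a source file into a NUL-terminated buffer and pass it to `exceeds_c90_line_limit`. Limits such as nesting depth or the number of struct members need real parsing, and macro expansion can change the picture further, so this kind of text scan only covers the simplest cases.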


In any case: the limits aren't set by the standard, "They're more [like] guidelines anyways" (actually nothing more than guidelines). You'll need to check the compilers you use for building the code to see if you're hitting any limits; there is no way around that by only waving the standards document under someone's nose. And as MSVC's implementation is particularly sucky, I would even dare to say that if it compiles your code (assuming no illegal constructs are in the code itself), you're pretty safe.




Answer 3:


In some environments, an application may receive stack space equal to the total memory available, minus the combined size of the code and static data. If the amount of available memory will not be known until an effort is made to run a program, it may be impossible for the compiler, linker, or any other such tool to know if it will be adequate. Nothing in the standard imposes any requirements upon what must happen if an attempt is made to run a program when insufficient memory is available to handle its stack requirements.

It would be helpful if the Standard provided a means by which a program could ensure some measure of predictable behavior when run with any amount of memory available, but at present it does not do so. On many platforms, there will be some amount of available memory which will be large enough that the OS loader won't reject an executable, but will nonetheless be small enough that the application suffers a stack overflow almost immediately upon start-up. The authors of the C standard didn't want to declare that C cannot be used with such platforms, but they also can't really say anything about what platforms will do when trying to run code with that critical amount of memory.



Source: https://stackoverflow.com/questions/23730590/is-it-undefined-behavior-to-exceed-translation-limits-and-are-there-checker-tool
