Is it possible to guarantee code doing memory writes is not optimized away in C++?

Question

C++ compilers are allowed to optimize away writes into memory:

 {
     //all this code can be eliminated
     char buffer[size];
     std::fill_n( buffer, size, 0);
 }

When dealing with sensitive data, the typical approach is to write through volatile* pointers to ensure that the compiler emits the memory writes. Here's how the SecureZeroMemory() function is implemented in the Windows SDK header WinNT.h (as RtlSecureZeroMemory):

FORCEINLINE PVOID RtlSecureZeroMemory(
     __in_bcount(cnt) PVOID ptr, __in SIZE_T cnt )
{
    volatile char *vptr = (volatile char *)ptr;
#if defined(_M_AMD64)
    __stosb((PBYTE )((DWORD64)vptr), 0, cnt);
#else
    while (cnt) {
        *vptr = 0;
        vptr++;
        cnt--;
    }
#endif
    return ptr;
}

The function casts the passed pointer to a volatile* pointer and then writes through the latter. However, if I use it on a local variable:

char buffer[size];
SecureZeroMemory( buffer, size );

the variable itself is not volatile. So, according to the C++ Standard's definition of observable behavior, writes into buffer don't count as observable behavior, and it looks like they can be optimized away.

Now there are a lot of comments below about page files, caches, etc., which are all valid, but let's just ignore them in this question. The only thing this question is about is whether the code for memory writes is optimized away or not.

Is it possible to ensure that code doing writes into memory is not optimized away in C++? Is the solution in SecureZeroMemory() compliant with the C++ Standard?

Answer 1:

There is no portable solution. The compiler is free to keep copies of the data in multiple places in memory while you are using it, and any zeroing function can only clear the one copy it is handed at that time. Any solution will therefore be non-portable.

Answer 2:

With library functions like SecureZeroMemory, the library writers will typically have taken pains to ensure that such functions will not be inlined by the compiler. This means that in the snippet

char buffer[size];
SecureZeroMemory( buffer, size );

the compiler does not know what SecureZeroMemory does with buffer, so the optimizer can't prove that taking the snippet out does not affect the observable behaviour of the program. In other words, the library writers will already have done all that is possible to ensure such code is not optimized away.
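One trick sometimes seen in practice for keeping such a call opaque to the optimizer is to route it through a volatile function pointer. The sketch below is only an illustration: the names call_memset, memset_ptr and wipe are hypothetical, and the Standard does not promise that the stores survive. It merely forces the compiler to reload the pointer at each call, so it cannot assume which function ends up being called.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper so that we take the address of our own function
// rather than of a standard library function.
static void* call_memset(void* p, int c, std::size_t n) {
    return std::memset(p, c, n);
}

// The pointer object itself is volatile: every call must first load it,
// so the compiler cannot assume it still refers to call_memset.
static void* (*const volatile memset_ptr)(void*, int, std::size_t) = call_memset;

// Hypothetical helper: zero len bytes at ptr through the opaque pointer.
void wipe(void* ptr, std::size_t len) {
    memset_ptr(ptr, 0, len);
}
```

A caller would write wipe(buffer, size) where the question calls SecureZeroMemory(buffer, size). This is a practical measure, not a guarantee.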

Answer 3:

The volatile keyword can be applied to a pointer (or reference, in C++) without requiring a cast, meaning that accesses through this pointer are not to be optimized out. The declaration of the variable does not matter.

The behaviour is analogous to const:

char buffer[16];
char const *p = buffer;

buffer[0] = 'a';          // okay
p[0] = 'b';               // error

The existence of a pointer-to-const to the buffer does not alter the behaviour of the variable in any way; it only restricts accesses made through that pointer. If the variable itself is declared const, then forming a non-const pointer to it is forbidden without a cast:

char const buffer[16];
char *p = buffer;         // error

Similarly,

char buffer[16];
char volatile *p = buffer;

buffer[0] = 'a';          // may be optimized out
p[0] = 'b';               // will be emitted

and

char volatile buffer[16];
char *p = buffer;         // error

The compiler is free to remove accesses through non-volatile lvalues as well as function calls where it can prove that no accesses to volatile lvalues happen.

The RtlSecureZeroMemory function is safe to use because the compiler can either see the definition (including the volatile access inside the loop or, depending on the platform, the assembler statement, which is opaque to the compiler and thus assumed to be unoptimizable), or it has to assume that the function will perform a volatile access.

If you wish to avoid the dependency on the <winnt.h> header file, then a similar function will work fine with any conforming compiler.
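A minimal sketch of such a function, mirroring the loop in RtlSecureZeroMemory, might look like the following. The name secure_zero is hypothetical. Note that whether the Standard strictly requires these stores to be emitted when the underlying object is not itself volatile is exactly the point the question raises, so treat this as the commonly used idiom rather than a proven guarantee.

```cpp
#include <cstddef>

// Hypothetical helper (not part of any library): zero len bytes at ptr,
// writing through a pointer-to-volatile so that each store is a volatile access.
void* secure_zero(void* ptr, std::size_t len) {
    volatile unsigned char* p = static_cast<volatile unsigned char*>(ptr);
    while (len--) {
        *p++ = 0;   // one volatile byte store per iteration
    }
    return ptr;
}
```

It is used the same way as in the question: secure_zero(buffer, size) just before buffer goes out of scope.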

Answer 4:

There is always a race condition between when there is sensitive information in memory and the time you wipe it out. In that window of time your application could crash and dump core or a malicious user could get a memory dump of the process' address space with sensitive information in plain text.

Maybe you should not store sensitive information in memory in plain text. This way you achieve better security and bypass this issue completely.

Answer 5:

Neither the C nor C++ Standard imposes any requirements on how implementations store things in physical memory. Implementations are free to specify such things, however, and quality implementations which are suitable for applications requiring certain physical-memory behaviors will specify that they will consistently behave in suitable fashion.

Many implementations process at least two distinct dialects. When processing their "optimizations disabled" dialect, they often document in great detail how many actions will interact with physical memory. Unfortunately, enabling optimizations will usually switch in a semantically weaker dialect which guarantees almost nothing about how any actions will interact with physical memory. While it should be possible to perform many simple and straightforward optimizations while still processing things in a fashion that is consistent with the "optimizations disabled" dialect in certain easily identifiable cases where it would be likely to matter, compiler writers aren't interested in providing modes that focus on the safe low-hanging fruit.

The only reliable way to ensure that physical memory is treated in a certain fashion is to use a dialect that promises to treat physical memory in that fashion. If one does that, getting the required treatment will generally be easy. If one doesn't, nothing will guarantee that a "creative" implementation won't do something unexpected.
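As one illustration of leaning on an implementation-specific promise, the sketch below uses the GCC/Clang extended-asm extension as a compiler barrier after an ordinary memset. The empty asm statement takes the pointer as an input and declares a "memory" clobber, which tells those compilers that the asm may read the buffer, so the preceding stores cannot be treated as dead. The function name wipe_with_barrier is hypothetical, and the fallback branch for other compilers is just the volatile loop, with no stronger promise than that idiom carries.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: zero len bytes at ptr and keep the stores alive
// by relying on a documented compiler extension where one is available.
void wipe_with_barrier(void* ptr, std::size_t len) {
#if defined(__GNUC__) || defined(__clang__)
    std::memset(ptr, 0, len);
    // Empty asm that takes ptr as input and clobbers "memory": the compiler
    // must assume the zeroed bytes are read here, so memset is not dead.
    __asm__ __volatile__("" : : "r"(ptr) : "memory");
#else
    // Fallback for other compilers: write through a pointer-to-volatile.
    volatile unsigned char* p = static_cast<volatile unsigned char*>(ptr);
    while (len--) {
        *p++ = 0;
    }
#endif
}
```

The design choice here is to make the dependence on the dialect explicit in the source: the guarantee comes from the compiler's documentation of its asm extension, not from the C++ Standard.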

Source: https://stackoverflow.com/questions/13268657/is-it-possible-to-guarantee-code-doing-memory-writes-is-not-optimized-away-in-c

Tags

c++

optimization

compiler-construction

compiler-optimization