How do I safely and sensibly determine whether a pointer points somewhere into a specified buffer?

后端 未结 2 504
广开言路
广开言路 2020-12-09 17:30

I\'m looking to implement a function that determines whether a given pointer points into a given buffer. The specification:


template          


        
相关标签:
2条回答
  • 2020-12-09 18:15

    As far as I can tell, this is a portable implementation of the function I'm after for all possible implementations:

    #ifdef UINTPTR_MAX
    
    bool points_into_buffer(std::uintptr_t p, std::uintptr_t buf, std::size_t len)
    {
      const auto diff = p + 0u - buf;
      if (diff < len)
        // #1
        if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + diff)
          return true;
      for (std::size_t n = 0; n != len; n++)
        if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n)
          // #2
          if (reinterpret_cast<char *>(p) - reinterpret_cast<char *>(buf) != diff)
            return true;
      return false;
    }
    
    template <typename T>
    bool points_into_buffer(T *p, T *buf, std::size_t len)
    {
      return points_into_buffer(reinterpret_cast<std::uintptr_t>(p),
                                reinterpret_cast<std::uintptr_t>(buf),
                                len * sizeof(T));
    }
    
    #else
    
    template <typename T>
    bool points_into_buffer(T *p, T *buf, std::size_t len)
    {
      for (std::size_t n = 0; n != len; n++)
        if (p == buf + n)
          return true;
      return false;
    }
    
    #endif
    

    In general, diff is not guaranteed to have a meaningful value. But that's okay: the function returns true if and only if it finds some n such that reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n. It only uses diff as a hint to find the value of n faster.

    It relies on the compiler optimising conditions that are not necessarily known at compile time in general, but are known at compile time for a particular platform. The conditions for the if statements marked as #1 and #2 are determined by GCC at compile time to always be true and false respectively, because of how diff is defined, allowing GCC to see that no useful action is performed inside the loop, and allowing the entire loop to be dropped.

    The generated code for points_into_buffer<char> and points_into_buffer<int> looks like:

    bool points_into_buffer(char*, char*, unsigned int):
            movl    4(%esp), %edx
            movl    $1, %eax
            movl    12(%esp), %ecx
            subl    8(%esp), %edx
            cmpl    %edx, %ecx
            ja      L11
            xorl    %eax, %eax
    L11:    rep ret
    
    bool points_into_buffer(int*, int*, unsigned int):
            movl    4(%esp), %edx
            movl    12(%esp), %eax
            subl    8(%esp), %edx
            leal    0(,%eax,4), %ecx
            movl    $1, %eax
            cmpl    %edx, %ecx
            ja      L19
            xorl    %eax, %eax
    L19:    rep ret
    

    On systems where std::uintptr_t is not available, or where addresses are more complicated than simple integers, the loop is used instead.

    0 讨论(0)
  • 2020-12-09 18:34

    If you cast the pointers to large enough unsigned integers and add number of bytes instead of number of objects, the undefined behavior goes away.

    template <typename T>
    bool points_into_buffer (T *p, T *buf, std::size_t len) {
        uintptr_t ip = (uintptr_t)p;
        uintptr_t ibuf = (uintptr_t)buf;
        return ip >= ibuf && ip < (ibuf + sizeof(T) * len);
    }
    

    This code doesn't detect if p is not correctly aligned, but you could easily add a test with a %.

    0 讨论(0)
提交回复
热议问题