Vector from long hex value

生来就可爱ヽ(ⅴ<●) submitted on 2020-08-10 18:13:03

Question


In C++ I can initialize a vector using

std::vector<uint8_t> data = {0x01, 0x02, 0x03};

For convenience (I have python byte strings that naturally output in a dump of hex), I would like to initialize for a non-delimited hex value of the form:

std::vector<uint8_t> data = 0x229597354972973aabbe7;

Is there a variant of this that is valid c++?


Answer 1:


Combining comments from Evg, JHbonarius and 1201ProgramAlarm:

The answer is that there is no direct way to turn a long hex value into a vector; however, user-defined literals provide a clever notational improvement.

First, using the right-hand side 0x229597354972973aabbe7 anywhere in the code will fail, because an unsuffixed literal is given the first integer type that can hold it, starting with int, and this value (21 hex digits, i.e. 84 bits) does not fit in any of them. In MSVC, this results in E0023 "integer constant is too large". Limiting the input to shorter hex sequences, or exploring larger data types with suffixed notation, may be possible, but this would defeat the goal of simplicity.
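To make the size problem concrete, here is a minimal sketch that counts the significant hex digits of such a string and checks them against a 64-bit limit. The helper names `significantHexDigits` and `fitsIn64Bits` are hypothetical, and the 64-bit width of the widest standard integer type is an assumption that holds on common platforms but is not guaranteed everywhere.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of significant hex digits in a string such as
// "0x229597354972973aabbe7" (the 0x/0X prefix and leading zeros are not counted).
std::size_t significantHexDigits(const std::string& s)
{
    std::size_t pos = 0;
    if (s.size() >= 2 && s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        pos = 2;
    while (pos < s.size() && s[pos] == '0')
        ++pos;
    return s.size() - pos;
}

// Each hex digit carries 4 bits, so at most 16 significant digits fit in
// 64 bits (assumed here to be the width of the widest integer type).
bool fitsIn64Bits(const std::string& s)
{
    return significantHexDigits(s) * 4 <= 64;
}
```

For the value in the question, `significantHexDigits` reports 21 digits, so `fitsIn64Bits` returns false, which is why the bare literal cannot compile as an ordinary integer.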

Manual conversion is necessary, but user-defined literals may provide a slightly more elegant notation. For example, we can enable conversion of a hex sequence to a vector with

std::vector<uint8_t> val1 = 0x229597354972973aabbe7_hexvec;
std::vector<uint8_t> val2 = "229597354972973aabbe7"_hexvec;

using the following code:

#include <vector>
#include <iostream>
#include <string>
#include <algorithm>
#include <cstdint>   // uint8_t
#include <cstring>   // std::strlen


// Quick utility function to view results:
std::ostream & operator << (std::ostream & os, const std::vector<uint8_t> & v)
{
    for (const auto & t : v)
        os << std::hex << (int)t << " ";

    return os;
}

std::vector<uint8_t> convertHexToVec(const char * str, size_t len)
{
    // Converts strings of the form "FFAA54", "0x11234" or "0X000" to a vector of bytes.

    // Skip the first two characters if the string starts with 0x or 0X:
    size_t offset = (len >= 2 && str[0] == '0' && (str[1] == 'x' || str[1] == 'X')) ? 2 : 0;

    // Round up the number of byte groups so that both ff_hexvec and fff_hexvec work,
    // and subtract the offset so that 0xfff_hexvec is counted properly:
    std::vector<uint8_t> result((len + 1 - offset) / 2);

    size_t ind = result.size();
    size_t end = len; // one past the last digit not yet converted

    // Walk from right to left in pairs of digits; a lone digit may remain on the left
    // because 0xfff_hexvec is valid:
    while (end > offset) {
        size_t begin = (end - offset >= 2) ? end - 2 : end - 1;
        std::string s(str + begin, end - begin);
        result[--ind] = (uint8_t)stol(s, nullptr, 16);
        end = begin;
    }

    return result;
}

std::vector<uint8_t> operator"" _hexvec(const char*str, std::size_t len)
{
    // Handles the string-literal forms "0xFFAABB"_hexvec and "12441AA"_hexvec
    return convertHexToVec(str, len);
}

std::vector<uint8_t> operator"" _hexvec(const char*str)
{
    // Handles the raw numeric forms 0xFFaaBB_hexvec and 0Xf_hexvec
    size_t len = std::strlen(str);
    return convertHexToVec(str, len);   
}

int main()
{
    std::vector<uint8_t> v;

    std::vector<uint8_t> val1 = 0x229597354972973aabbe7_hexvec;
    std::vector<uint8_t> val2 = "229597354972973aabbe7"_hexvec;

    std::cout << val1 << "\n";
    std::cout << val2 << "\n";

    return 0;
}

The coder must decide whether this is superior to implementing and using a more traditional convertHexToVec("0x41243124FF") function.
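For comparison, a plain-function version might look like the sketch below. It is not the answer's code: the names `hexDigitValue` and `hexToBytes` are hypothetical, and throwing `std::invalid_argument` on a non-hex character is an assumed design choice. It reads left to right and, like the literal version, lets a lone leading digit form a byte of its own when the digit count is odd.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit; throws on anything else.
static std::uint8_t hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical plain-function alternative to the _hexvec literal.
// Accepts an optional 0x/0X prefix.
std::vector<std::uint8_t> hexToBytes(const std::string& hex)
{
    std::size_t pos = (hex.size() >= 2 && hex[0] == '0' &&
                       (hex[1] == 'x' || hex[1] == 'X')) ? 2 : 0;
    std::vector<std::uint8_t> out;
    out.reserve((hex.size() - pos + 1) / 2);

    // With an odd number of digits, the first digit forms a byte on its own.
    if ((hex.size() - pos) % 2 != 0)
        out.push_back(hexDigitValue(hex[pos++]));

    // The remaining digit count is even, so pos + 1 is always in range.
    for (; pos < hex.size(); pos += 2)
        out.push_back(static_cast<std::uint8_t>(
            (hexDigitValue(hex[pos]) << 4) | hexDigitValue(hex[pos + 1])));

    return out;
}
```

A call such as `hexToBytes("0x41243124FF")` yields the five bytes 41 24 31 24 ff. The trade-off against the literal is mostly notational: the function call is more verbose but needs no operator overloads.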




Answer 2:


Is there a variant of this that is valid c++?

I think not.


The following code is valid C++, and uses a more "traditional hex conversion" process.

  • Confirm and remove the leading '0x', and confirm that all remaining characters are hex digits.

  • modifyFor_SDFE() - 'Space Delimited Formatted Extraction'

This function inserts spaces around each two-character byte descriptor.

Note that this function also adds a space character at the front and back of the modified string. This new string is used to create and initialize a std::stringstream (ss1).

  • By inserting the spaces, the normal stream "formatted extraction" works cleanly

The code extracts each hex value, one by one, pushes each into the vector, and stops once the last byte has been pushed (stream.eof()). Note that the vector grows automatically as needed (no overflow will occur).

Note that the '0x' prefix is not needed, because the stream mode is set to hex.

Note that the overflow concern (expressed above as "0x22...be7 is likely to overflow.") has simply been side-stepped by reading only one byte at a time. This may be convenient in future efforts that use much bigger hex strings.


#include <iostream>
using std::cout, std::cerr, std::endl, std::hex,
      std::dec, std::cin, std::flush; // c++17

#include <iomanip>
using std::setw, std::setfill;

#include <string>
using std::string;

#include <sstream>
using std::stringstream;

#include <cstdint>
#include <cassert>
#include <cctype>   // toupper

#include <vector>
using std::vector;
typedef vector<uint8_t>  UI8Vec_t;


class F889_t // Functor ctor and dtor use compiler provided defaults
{
  bool    verbose;

public:
  int operator()(int argc, char* argv[])     // functor entry
    {
      verbose = ( (argc > 1) ? ('V' == toupper(argv[1][0])) : false );
      return exec(argc, argv);
    }
  // 2 lines

private:

  int exec(int , char** )
    {
      UI8Vec_t   resultVec;                            // output

      // example1 input
      // string data1 = "0x229597354972973aabbe7";     // 23 chars, hex string
      // to_ui8_vec(resultVec, data1);
      // cout << (verbose ? "" : "\n") << "  vector result       "
      //      << show(resultVec);  // show results

      // example2 input   46 chars (no size limit)
      string data = "0x330508465083084bBCcf87eBBaa379279543795922fF";

      to_ui8_vec (resultVec, data);

      cout << (verbose ? "  vector elements      " : "\n  ")
           << show(resultVec) << endl; // show results

      if(verbose) { cout << "\n  F889_t::exec()  (verbose)  ("
                         <<  __cplusplus  << ")" << endl; }

      return 0;
    } // int exec(int, char**)
  // 7 lines

  void to_ui8_vec(UI8Vec_t& retVal,         // output (pass by reference)
                  string    sData)          //  input (pass by value)
    {
      if(verbose) { cout << "\n  input data        '" << sData
         << "'                       (" << sData.size() << " chars)" << endl;}
      { // misc format checks:
        size_t szOrig = sData.size();
        {
          // confirm leading hex indicator exists
          assert(sData.substr(0,2) == string("0x"));
          sData.erase(0,2);                 // discard leading "0x"
        }
        size_t sz = sData.size();
        assert(sz == (szOrig - 2)); // paranoia
        // to test that this will detect any typos in data:
        //    temporarily append or insert an invalid char, i.e. sData += 'q';
        assert(sData.find_first_not_of("0123456789abcdefABCDEF") == std::string::npos);
      }

      modifyFor_SDFE (sData); // SDFE - 'Space Delimited Formatted Extraction'

      stringstream ss1(sData); // create / initialize stream with SDFE

      if(verbose) { cout << "  SDFE  data         '" << ss1.str() // echo init
                         << "' (" << sData.size() << " chars)" << endl; }

      extract_values_from_SDFE_push_back_into_vector(retVal, ss1);

    } // void to_ui8_vec (vector<uint8_t>&, string)
  // 13 lines

  // modify s (of any size) for 'Space Delimited Formatted Extraction'
  void modifyFor_SDFE (string& s)
    {
      size_t indx = s.size();
      while (indx > 2)
      {
        indx -= 2;
        s.insert (indx, 1, ' ');  // indx, count, delimiter
      }
      s.insert(0, 1, ' '); // delimiter at front of s
      s += ' ';            // delimiter at tail of s
    } // void modifyFor_SDFE (string&)
  // 6 lines

  void extract_values_from_SDFE_push_back_into_vector(UI8Vec_t&      retVal,
                                                      stringstream&  ss1)
    {
      do {
        unsigned int  n = 0;  // 'uint' is a non-standard typedef

        ss1 >> hex >> n;  // use SDFE, hex mode - extract one field at a time

        if(!ss1.good())   // check ss1 state
        {
          if(ss1.eof()) break; // quietly exit, this is a normal stream exit
          // else make some noise before exit loop
          cerr << "\n  err: data input line invalid [" << ss1.str() << ']' << endl; break;
        }

        retVal.push_back(static_cast<uint8_t>(n & 0xff)); // append to vector

      } while(true);
    } // void extract_values_from_SDFE_push_back_into_vector(UI8Vec_t&, stringstream&)
  // 6 lines

  string show(const UI8Vec_t& ui8Vec)
    {
      stringstream ss;  // an initial string would be overwritten by the first insertion
      for (size_t i = 0; i < ui8Vec.size(); ++i) {
        ss << setfill('0') << setw(2) << hex 
           << static_cast<int>(ui8Vec[i]) << ' '; }
      if(verbose) { ss << "  (" << dec << ui8Vec.size() << " elements)"; }
      return ss.str();
    }
  // 5 lines

}; // class F889_t

int main(int argc, char* argv[]) { return F889_t()(argc, argv); }

Typical output when invoked with 'verbose' as the first command-line argument

$ ./dumy889 verbose

  input data        '0x330508465083084bBCcf87eBBaa379279543795922fF'                       (46 chars)
  SDFE  data         ' 33 05 08 46 50 83 08 4b BC cf 87 eB Ba a3 79 27 95 43 79 59 22 fF ' (67 chars)
  vector elements      33 05 08 46 50 83 08 4b bc cf 87 eb ba a3 79 27 95 43 79 59 22 ff   (22 elements)

When invoked with no parameters

$ ./dumy889 

  33 05 08 46 50 83 08 4b bc cf 87 eb ba a3 79 27 95 43 79 59 22 ff 

The line counts do not include empty lines, nor lines that are only a comment or only a brace. You may count the lines as you wish.
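
For readers who only want the core idea, the space-delimited extraction can be condensed into one function. This is a sketch rather than the answer's code: the name `spaceDelimitedToBytes` is hypothetical, the input is assumed to have had its "0x" prefix removed already, and validation is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical condensed version of the approach above: takes the hex digits
// without the "0x" prefix, inserts a space before every pair of digits
// (working from the right), then lets the stream read one field at a time.
std::vector<std::uint8_t> spaceDelimitedToBytes(std::string digits)
{
    for (std::size_t i = digits.size(); i > 2; ) {
        i -= 2;
        digits.insert(i, 1, ' ');
    }

    std::istringstream in(digits);
    std::vector<std::uint8_t> out;
    unsigned int n = 0;
    while (in >> std::hex >> n)   // stops at end of input or on a bad field
        out.push_back(static_cast<std::uint8_t>(n & 0xff));
    return out;
}
```

Because the spaces are inserted from the right, an odd number of digits leaves a single digit as the first field, so "fff" becomes the two bytes 0f ff.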



Source: https://stackoverflow.com/questions/63197844/vector-from-long-hex-value
