Copy 6 byte array to long long integer variable

Question

I have read a 6-byte unsigned char array from memory. The byte order (endianness) is big-endian. Now I want to assign the value stored in the array to an integer variable. I assume this has to be a long long, since it must hold up to 6 bytes.

At the moment I am assigning it this way:

unsigned char aFoo[6];
long long nBar;
// read values to aFoo[]...
// aFoo[0]: 0x00
// aFoo[1]: 0x00
// aFoo[2]: 0x00
// aFoo[3]: 0x00
// aFoo[4]: 0x26
// aFoo[5]: 0x8e
nBar = ((long long)aFoo[0] << 40) + ((long long)aFoo[1] << 32) + ((long long)aFoo[2] << 24) + ((long long)aFoo[3] << 16) + ((long long)aFoo[4] << 8) + (aFoo[5]);

A memcpy approach would be neat, but when I do this

memcpy(&nBar, &aFoo, 6);

the 6 bytes are copied into the long long starting at its first byte, so the padding zeros end up at the end. Is there a better way than my assignment with the shifting?

Answer 1:

What you want to accomplish is called de-serialisation or de-marshalling.

For values that wide, using a loop is a good idea, unless you really need the max. speed and your compiler does not vectorise loops:

uint8_t array[6];
...
uint64_t value = 0;

uint8_t *p = array;
for ( int i = (sizeof(array) - 1) * 8 ; i >= 0 ; i -= 8 )
    value |= (uint64_t)*p++ << i;

// optional: left-align the value in the 64-bit word
// value <<= 64 - (sizeof(array) * 8);

Note the use of stdint.h types; sizeof(uint8_t) cannot differ from 1. Only these types are guaranteed to have the expected bit widths. Also use unsigned integers when shifting values: right-shifting a negative signed value is implementation-defined, while left-shifting one invokes undefined behaviour.
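The same rule applies when the shifts are written out by hand: an unsigned char operand is only promoted to int, so it has to be widened to a 64-bit unsigned type before it is shifted by 32 bits or more. A minimal sketch, where decode_be48_unrolled is a hypothetical name chosen for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: combine 6 big-endian bytes with explicit shifts.
   Each byte is cast to uint64_t before shifting, so no shift exceeds
   the width of the operand. */
static uint64_t decode_be48_unrolled(const uint8_t b[6])
{
    return ((uint64_t)b[0] << 40) |
           ((uint64_t)b[1] << 32) |
           ((uint64_t)b[2] << 24) |
           ((uint64_t)b[3] << 16) |
           ((uint64_t)b[4] <<  8) |
            (uint64_t)b[5];
}
```

With the bytes from the question (0x00 0x00 0x00 0x00 0x26 0x8e) this should yield 0x268e.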

If you need a signed value, just use

int64_t final_value = (int64_t)value;

after the shifting. For values that do not fit into int64_t this conversion is still implementation-defined, but all modern implementations (and likely older ones too) just copy the bit pattern without modification. A modern compiler will likely optimize this, so there is no penalty.

The declarations can be moved, of course. I just put them before where they are used for completeness.
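For reuse, the loop can be wrapped in a small function. The following is only a sketch: decode_be is a hypothetical name, and it assumes the caller passes at most 8 bytes. It shifts the accumulator instead of the incoming byte, which gives the same result as the loop above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode n big-endian bytes (n <= 8 is assumed)
   into an unsigned 64-bit value. */
static uint64_t decode_be(const uint8_t *bytes, size_t n)
{
    uint64_t value = 0;

    for (size_t i = 0; i < n; i++)
        value = (value << 8) | bytes[i];   /* make room, then append the next byte */

    return value;
}
```

A call such as decode_be(array, sizeof(array)) then replaces the open-coded loop, and the result can be converted to int64_t as described above if a signed value is needed.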

Answer 2:

You might try

nBar = 0;
memcpy((unsigned char*)&nBar + 2, aFoo, 6);

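No & is needed before the array name, because it already decays to a pointer to its first element. Note that the offset of 2 assumes an 8-byte long long and a big-endian host; on a little-endian host the bytes end up with the wrong significance.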
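If the memcpy approach also has to work on a little-endian host, the bytes need to be reversed before copying. A minimal sketch under a few assumptions: host_is_little_endian and load_be48_memcpy are hypothetical names, uint64_t is used instead of long long to fix the width at 8 bytes, and mixed-endian hosts are not handled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Hypothetical helper: build the value from 6 big-endian bytes via memcpy. */
static uint64_t load_be48_memcpy(const unsigned char src[6])
{
    unsigned char tmp[sizeof(uint64_t)] = {0};
    uint64_t value;

    if (host_is_little_endian()) {
        /* Least significant byte comes first in memory: reverse the input. */
        for (int i = 0; i < 6; i++)
            tmp[i] = src[5 - i];
    } else {
        /* Big-endian host: skip the two most significant bytes. */
        memcpy(tmp + 2, src, 6);
    }

    memcpy(&value, tmp, sizeof value);
    return value;
}
```

The shift-based versions avoid this case distinction entirely, which is one reason they are often preferred for portable code.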

Answer 3:

One way to do what you need is to use a union:

#include <stdio.h>

typedef union {
    struct {
      unsigned char padding[2];
      unsigned char aFoo[6];
    } chars;
    long long nBar;
} Combined;

int main ()
{
  Combined x;

  // reset the content of "x"
  x.nBar = 0;           // or memset(&x, 0, sizeof(x));

  // put values directly in x.chars.aFoo[]...
  x.chars.aFoo[0] = 0x00;
  x.chars.aFoo[1] = 0x00;
  x.chars.aFoo[2] = 0x00;
  x.chars.aFoo[3] = 0x00;
  x.chars.aFoo[4] = 0x26;
  x.chars.aFoo[5] = 0x8e;

  // prints 268e only on a big-endian host with an 8-byte long long
  printf("nBar: %llx\n", (unsigned long long)x.nBar);

  return 0;
}

The advantage: the code is clearer, and there is no need to juggle bits, shifts, masks etc.

However, you have to be aware that, for speed and hardware reasons, the compiler might insert padding bytes into the struct, so that aFoo no longer overlays the desired bytes of nBar. This can be addressed by telling the compiler to align the members of the struct at byte boundaries (as opposed to the default, where each member is aligned according to the requirements of its type on the target architecture).

This used to be achieved using a #pragma directive (for example #pragma pack on many compilers), and its exact syntax depends on the compiler you use.

Since C11/C++11, the alignas specifier (_Alignas in C, with alignas provided by <stdalign.h>) is the standard way to specify the alignment of struct/union members, provided your compiler supports it. Note that it can only make an alignment stricter; it cannot remove padding.
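Rather than relying on the layout silently, the assumptions can be checked at compile time. The following sketch assumes a C11 compiler (for _Static_assert); struct Bytes and union_load are hypothetical names, and unsigned types are used so the result can be compared without sign issues.

```c
#include <stddef.h>

struct Bytes {
    unsigned char padding[2];
    unsigned char aFoo[6];
};

typedef union {
    struct Bytes chars;
    unsigned long long nBar;
} Combined;

/* Compilation fails if the layout assumptions do not hold. */
_Static_assert(sizeof(unsigned long long) == 8, "expected a 64-bit long long");
_Static_assert(offsetof(struct Bytes, aFoo) == 2, "unexpected padding before aFoo");
_Static_assert(sizeof(Combined) == sizeof(unsigned long long), "unexpected padding in the union");

/* Hypothetical helper: store 6 bytes through the union and read nBar back.
   The result equals the big-endian value only on a big-endian host. */
static unsigned long long union_load(const unsigned char src[6])
{
    Combined x;

    x.nBar = 0;
    for (int i = 0; i < 6; i++)
        x.chars.aFoo[i] = src[i];

    return x.nBar;
}
```

With the bytes from the question, union_load should return 0x268e on a big-endian host and 0x8e26000000000000 on a little-endian one, so the union approach shares the byte-order dependence of the plain memcpy.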

Source: https://stackoverflow.com/questions/35597378/copy-6-byte-array-to-long-long-integer-variable

Tags

long-integer

memcpy