How to make two otherwise identical pointer types incompatible

余生颓废 提交于 2019-12-04 03:31:42

One trick is to wrap the pointers in a struct. Pointers to struct have better type safety than pointers to the primitive data types.

typedef struct
{
  uint8_t ptr;
} a_t;

typedef struct
{
  uint8_t ptr;
} b_t;

const volatile a_t* a = (const volatile a_t*)0x1234;
const volatile b_t* b = (const volatile b_t*)0x5678;

a = b; // compiler error
b = a; // compiler error

You could encapsulate the pointer in different struct for RAM and ROM, making the type incompatible, but containing the same type of values.

struct romPtr {
    void *addr;
};

struct ramPtr {
    void *addr;
};

int main(int argc, char **argv) {
    struct romPtr data1 = {NULL};
    struct romPtr data3 = data1;
    struct ramPtr data2 = data1; // <-- gcc would throw a compilation error here
}

During compilation :

$ cc struct_test.c
struct_test.c: In function ‘main’:
struct_test.c:12:24: error: invalid initializer
  struct ramPtr data2 = data1;
                    ^~~~~

You could of course typedefs the struct for brevity

Since I received several answers which offer different compromises on providing a solution, I decided to merge them in one, outlining the benefits and drawbacks of each. So you can choose the most appropriate for your particular situation

Named Address Spaces

For the particular problem of solving this, and only this case of ROM and RAM pointers on an AVR-8 micro, the most appropriate solution is this.

This was a proposal for C11 which didn't make it into the final standard, however there are C compilers which support it, including avr-gcc used for 8 bit AVRs.

The related documentation can be accessed here (part of the online GCC manual, also including other architectures using this extension). It is recommendable over other solutions (such as function-like macros in pgmspace.h for the AVR-8) as with this, the compiler can make the appropriate checks, while otherwise accessing the data pointed by remains clear and simple.

In particular, if you have a similar problem of porting something from a compiler which offered some sort of named address spaces, like MPLAB C18, this is likely the fastest and cleanest way to do it.

The ported pointers from above would look like as follows:

uint8_t const* data1;
uint8_t const __flash* data2;
char const __flash** strdptr;

(If possible, one could simplify the process using appropriate preprocessor definitions)

(Original answer by Olaf)

Struct encapsulation, pointer inside

This method aims to strenghten typing of pointers by wrapping them in structures. The intended usage is that you pass the structures themselves across interfaces, by which the compiler can perform type checks on them.

A "pointer" type to byte data could look like this:

typedef struct{
    uint8_t* ptr;
}bytebuffer_ptr;

The pointed data can be accessed as follows:

bytebuffer_ptr bbuf;
(...)
bbuf.ptr = allocate_bbuf();
(...)
bbuf.ptr[index] = value;

A function prototype accepting such a type and returning one could look like as follows:

bytebuffer_ptr encode_buffer(bytebuffer_ptr inbuf, size_t len);

(Original answer by dvhh)

Struct encapsulation, pointer outside

Similar to the method above, it aims to strenghten typing of pointers by wrapping them in structures, but in a different manner, providing a more robust constraint. The data type to be pointed to is which is encapsulated.

A "pointer" type to byte data could look like this:

typedef struct{
    uint8_t val;
}byte_data;

The pointed data can be accessed as follows:

byte_data* bbuf;
(...)
bbuf = allocate_bbuf();
(...)
bbuf[index].val = value;

A function prototype accepting such a type and returning one could look like as follows:

byte_data* encode_buffer(byte_data* inbuf, size_t len);

(Original answer by Lundin)

Which should I use?

Named Address Spaces in this regard don't need much discussion: They are the most appropriate solution if you only want to deal with a pecularity of your target handling address spaces. The compiler will provide you the compile-time checks you need, and you don't have to try to invent anything further.

If, however for other reasons you are interested in structure wrapping, these are matters which you may want to consider:

  • Both methods can be optimized just fine: at least GCC will generate identical code from either to using plain pointers. So you don't really have to consider performance: they should work.

  • Pointer inside is useful if you have either third-party interfaces to serve which demand pointers, or maybe if you are refactoring something so large which you can't do in one pass.

  • Pointer outside provides more robust type safety as you reinforce the pointed type itself with it: you have a true distinct type which you can't easily (accidentally) convert (implicit cast).

  • Pointer outside allows you to use modifiers on the pointer, such as adding const, which is important for creating robust interfaces (you can make data intended to be read only by a function const).

  • Keep in mind that some people might not like either of these, so if you are working in a group, or are creating code which might be reused by known parties, discuss the matter with them first.

  • Should be obvious, but keep in mind that encapsulating doesn't solve the problem of requiring special access code (such as by the pgmspace.h macros on an AVR-8), assuming no Named Address Spaces are used alongside with the method. It only provides a method to produce a compile error if you try to use a pointer by functions operating on a different address space than what it intends to point into.

Thank you for all the answers!

True harvard architectures use different instructions to access different types of memory like code (Flash on AVR), data (RAM), hardware peripheral registers (IO) and possibly others. The values of addresses in the ranges typically overlap, i.e. the same value accesses different internal devices, depending on the instruction.

Comming to C, if you want to use a unified pointer, this means you not only have to encode the address (value), but also the access type ("address space" in the following) in the pointer value. This can either be done using additional bits in a pointer's value, but also select the appropriate instruction at run-time for every access. This constitutes a significant overhead to the generated code. Additionally, often there are no spare bits in the "natural" value for at least some address spaces (e.g. all 16 bits of the pointer are used already for the address). So additional bits are required, at least a byte worth. This blows up memory usage (mostly RAM), too.

Both are typically unacceptable on typical MCUs using this architecture, because they are already quite limited. Fortunately, for most applications, it is absolutely unnecessary (or easily avoidable at least) to determine the address space at run-time.

To solve this problem all compilers for such a platform support some way to tell the compiler in which address space and object resides. Standard draft N1275 for the then-upcoming C11 proposed a standard way using "named address spaces". Unfortunately it did not make it into the final version, so we are left with compiler-extensions.

For gcc (see the documentation for other compilers), the developers implemented the original standard proposal. As the address spaces are target-specific, the code is not portable between different archittectures, but that is normally true for bare-metal embedded code anyway, nothing really lost.

Reading the documentation for AVR, an address space is simply used similar to a standard qualifier. The compiler will automatically emit the correct instructions to access the correct space. Also there is a unified address space which determines the area at run-time as explained above.

Address spaces work similar to qualifiers, there are stronger constraints to determine compatibility, i.e. when assigning pointers of different address spaces to each other. For a detailed description, see the proposal, chapter 5.

Conclusion:

named address spaces is what you want. They solve two problems:

  • Ensure pointers to incompatible address spaces can't be assigned to each other unnoticed.
  • Tell the compiler how to access the object, i.e. which instructions to use.

With regard to the other answers proposing structs, you have to specify the address space (and the type for void *) anyway once you acces the data. Usign the address space in the declaration keeps the rest of the code clean and even allows to change it lateron at a single location in the source code.

If you are after portability betwen tool-chains, rread their documentation and use macros. It is most likely you just will have to adopt the actual names of the address spaces.

Sidenote: The PIC18 example you cite actually uses the syntax for named address spaces. Just the names are deprecated, because an implementation should leave all non-standard names free for the application code. Hence the underscore-qualified names in gcc.

Disclaimer: I did not test the features, but relied on the documentation. Helpful feedback in comments appreciated.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!