Why is 'pure polymorphism' preferable over using RTTI?

清歌不尽 2020-12-04 08:59

Almost every C++ resource I've seen that discusses this kind of thing tells me that I should prefer polymorphic approaches to using RTTI (run-time type identification). In

7 answers

半阙折子戏 (original poster)

2020-12-04 09:12

Some compilers don't use it / RTTI is not always enabled

I believe you have misunderstood such arguments.

There are a number of C++ coding environments where RTTI is not to be used, and where compiler switches are used to forcibly disable it. If you are coding within such a paradigm... then you almost certainly have already been informed of this restriction.

The problem therefore is with libraries. That is, if you're writing a library that depends on RTTI, then your library cannot be used by users who turn off RTTI. If you want your library to be used by those people, then it cannot use RTTI, even if your library also gets used by people who can use RTTI. Equally importantly, if you can't use RTTI, you have to shop around a little harder for libraries, since RTTI use is a deal-breaker for you.
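A minimal sketch of what that means for a library author, assuming the usual feature macros (`__cpp_rtti` from the standard feature-test set, `__GXX_RTTI` on GCC/Clang, `_CPPRTTI` on MSVC). The names `MYLIB_HAS_RTTI`, `shape`, `circle`, `describe` and `same_dynamic_type` are made up for illustration. The core API goes through virtual functions only, and the RTTI-based extra is compiled in only when RTTI is available:

```cpp
#include <string>

// Hypothetical library macro: 1 when the compiler reports RTTI as enabled.
#if defined(__cpp_rtti) || defined(__GXX_RTTI) || defined(_CPPRTTI)
#  define MYLIB_HAS_RTTI 1
#  include <typeinfo>
#else
#  define MYLIB_HAS_RTTI 0
#endif

struct shape {
    virtual ~shape() = default;
    // Plain virtual dispatch: works with or without RTTI.
    virtual std::string name() const = 0;
};

struct circle : shape {
    std::string name() const override { return "circle"; }
};

// Core API: usable by clients who build with RTTI disabled.
inline std::string describe(const shape& s) { return "shape:" + s.name(); }

#if MYLIB_HAS_RTTI
// Optional extra, only offered when typeid is available.
inline bool same_dynamic_type(const shape& a, const shape& b) {
    return typeid(a) == typeid(b);
}
#endif
```

Users who disable RTTI still get `describe`; they simply never see `same_dynamic_type`.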

It costs extra memory / Can be slow

There are many things you don't do in hot loops. You don't allocate memory. You don't go iterating through linked lists. And so forth. RTTI certainly can be another one of those "don't do this here" things.

However, consider all of your RTTI examples. In all cases, you have one or more objects of an indeterminate type, and you want to perform some operation on them which may not be possible for some of them.

That's something you have to work around at a design level. You can write containers that don't allocate memory which fit into the "STL" paradigm. You can avoid linked list data structures, or limit their use. You can reorganize arrays of structs into structs of arrays or whatever. It changes some things, but you can keep it compartmentalized.

Changing a complex RTTI operation into a regular virtual function call? That's a design issue. If you have to change that, then it's something that requires changes to every derived class. It changes how lots of code interacts with various classes. The scope of such a change extends far beyond the performance-critical sections of code.

So... why did you write it the wrong way to begin with?

I don't have to define attributes or methods where I don't need them, the base node class can stay lean and mean.

To what end?

You say that the base class is "lean and mean". But really... it's nonexistent. It doesn't actually do anything.

Just look at your example: node_base. What is it? It seems to be a thing which has adjacent other things. This is a Java interface (pre-generics Java at that): a class that exists solely to be something that users can cast to the real type. Maybe you add some basic feature like adjacency (Java adds toString), but that's it.

There's a difference between "lean and mean" and "transparent".

As Yakk said, such programming styles limit themselves in interoperability, because if all of the functionality is in a derived class, then users outside of that system, with no access to that derived class, cannot interoperate with the system. They can't override virtual functions and add new behaviors. They can't even call those functions.

But what they also do is make it a major pain to actually do new stuff, even within the system. Consider your poke_adjacent_oranges function. What happens if someone wants a lime_node type which can be poked just like orange_nodes? Well, we can't derive lime_node from orange_node; that makes no sense.

Instead, we have to add a new lime_node derived from node_base. Then change the name of poke_adjacent_oranges to poke_adjacent_pokables. And then, try casting to orange_node and lime_node; whichever cast works is the one we poke.

However, lime_node needs its own poke_adjacent_pokables. And this function needs to do the same casting checks.

And if we add a third type, we have to not only add its own function, but we must change the functions in the other two classes.

Obviously, now you make poke_adjacent_pokables a free function, so that it works for all of them. But what do you suppose happens if someone adds a fourth type and forgets to add it to that function?
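Here is a rough sketch of that situation. The `grape_node` type, the `pokes` counters and the return value are my own additions for illustration; only `node_base`, `orange_node`, `lime_node` and `poke_adjacent_pokables` come from the discussion above.

```cpp
#include <vector>

struct node_base {
    virtual ~node_base() = default;    // a virtual member is required for dynamic_cast
    std::vector<node_base*> adjacent;  // non-owning links to neighbours
};

struct orange_node : node_base { int pokes = 0; void poke() { ++pokes; } };
struct lime_node   : node_base { int pokes = 0; void poke() { ++pokes; } };

// Hypothetical fourth type, added later. Nobody updated the function below.
struct grape_node  : node_base { int pokes = 0; void poke() { ++pokes; } };

// Free function with one cast per known pokable type.
// Returns how many neighbours were actually poked.
inline int poke_adjacent_pokables(node_base& n) {
    int poked = 0;
    for (node_base* a : n.adjacent) {
        if (auto* o = dynamic_cast<orange_node*>(a)) {
            o->poke();
            ++poked;
        } else if (auto* l = dynamic_cast<lime_node*>(a)) {
            l->poke();
            ++poked;
        }
        // grape_node falls through here: no error, no warning, no poke.
    }
    return poked;
}
```

With one orange, one lime and one grape adjacent to a node, this pokes two of the three and reports nothing wrong.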

Hello, silent breakage. The program appears to work more or less OK, but it isn't. Had poke been an actual virtual function, the compiler would have failed when you didn't override the pure virtual function from node_base.

With your way, you have no such compiler checks. Oh sure, the compiler won't check for non-pure virtuals, but at least you have protection in cases where protection is possible (i.e., there is no default operation).
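A sketch of what the virtual-function version could look like, assuming every node in this hierarchy is meant to be pokable. `grape_node`, the `pokes` counters and `poke_adjacent` are hypothetical names used only for the example.

```cpp
#include <vector>

struct node_base {
    virtual ~node_base() = default;
    virtual void poke() = 0;           // pure virtual: no default operation
    std::vector<node_base*> adjacent;  // non-owning links to neighbours
};

struct orange_node : node_base { int pokes = 0; void poke() override { ++pokes; } };
struct lime_node   : node_base { int pokes = 0; void poke() override { ++pokes; } };
struct grape_node  : node_base { int pokes = 0; void poke() override { ++pokes; } };

// A derived class that forgets poke() stays abstract, so something like
//   struct plum_node : node_base {};  plum_node p;
// is rejected at compile time instead of being skipped at run time.

// No casts and no list of known types; returns the number of neighbours poked.
inline int poke_adjacent(node_base& n) {
    int poked = 0;
    for (node_base* a : n.adjacent) {
        a->poke();
        ++poked;
    }
    return poked;
}
```

Adding a new node type touches only that type; `poke_adjacent` never changes.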

The use of transparent base classes with RTTI leads to a maintenance nightmare. Indeed, most uses of RTTI lead to maintenance headaches. That doesn't mean that RTTI isn't useful (it's vital for making boost::any work, for example). But it is a very specialized tool for very specialized needs.

In that way, it is "harmful" in the same way as goto. It's a useful tool that shouldn't be done away with. But its use should be rare within your code.

So, if you can't use transparent base classes and dynamic casting, how do you avoid fat interfaces? How do you keep every function you might want to call on a type from bubbling up to the base class?

The answer depends on what the base class is for.

Transparent base classes like node_base are just using the wrong tool for the problem. Linked lists are best handled by templates. The node type and adjacency would be provided by a template type. If you want to put a polymorphic type in the list, you can. Just use BaseClass* as T in the template argument. Or your preferred smart pointer.
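As a small sketch of that idea, here `std::list` supplies the nodes and the adjacency, and the element type is a smart pointer to a polymorphic base. The `fruit`, `orange`, `lime`, `fruit_list` and `join_names` names are invented for the example.

```cpp
#include <list>
#include <memory>
#include <string>

struct fruit {
    virtual ~fruit() = default;
    virtual std::string name() const = 0;
};

struct orange : fruit { std::string name() const override { return "orange"; } };
struct lime   : fruit { std::string name() const override { return "lime"; } };

// The container template owns the linking; T is just a pointer to the base class.
using fruit_list = std::list<std::unique_ptr<fruit>>;

// Walks the list through the base interface only; no casts needed.
inline std::string join_names(const fruit_list& fruits) {
    std::string out;
    for (const auto& f : fruits) {
        if (!out.empty()) out += ",";
        out += f->name();
    }
    return out;
}
```

The element types know nothing about being in a list, and the list knows nothing about fruit.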

But there are other scenarios. One is a type that does a lot of things, but has some optional parts. A particular instance might implement certain functions, while another wouldn't. However, the design of such types usually offers a proper answer.

The "entity" class is a perfect example of this. This class has long plagued game developers. Conceptually, it has a gigantic interface, living at the intersection of nearly a dozen, entirely disparate systems. And different entities have different properties. Some entities don't have any visual representation, so their rendering functions do nothing. And this is all determined at runtime.

The modern solution for this is a component-style system. Entity is merely a container of a set of components, with some glue between them. Some components are optional; an entity that has no visual representation does not have the "graphics" component. An entity with no AI has no "controller" component. And so forth.

Entities in such a system are just pointers to components, with most of their interface being provided by accessing the components directly.
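A very reduced sketch of such an entity, assuming just two optional components. All of the names (`graphics_component`, `controller_component`, `entity`, `render`) are hypothetical, and a real system would likely store and look up components quite differently.

```cpp
#include <memory>
#include <string>

struct graphics_component {
    std::string sprite;
    std::string draw() const { return "draw " + sprite; }
};

struct controller_component {
    int aggression = 0;
};

// The entity is only a holder of components; a null pointer means "not present".
struct entity {
    std::unique_ptr<graphics_component> graphics;      // null: nothing to render
    std::unique_ptr<controller_component> controller;  // null: no AI
};

// The rendering system talks to the component directly, when there is one.
inline std::string render(const entity& e) {
    return e.graphics ? e.graphics->draw() : std::string{};
}
```

An invisible trigger volume is then just an `entity` whose `graphics` pointer is empty; no do-nothing rendering override is needed.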

Developing such a component system requires recognizing, at the design stage, that certain functions are conceptually grouped together, such that all types that implement one will implement them all. This allows you to extract the class from the prospective base class and make it a separate component.

This also helps follow the Single Responsibility Principle. Such a componentized class only has the responsibility of being a holder of components.

From Matthew Walton:

I note lots of answers don't note the idea that your example suggests node_base is part of a library and users will make their own node types. Then they can't modify node_base to allow another solution, so maybe RTTI becomes their best option then.

OK, let's explore that.

For this to make sense, what you would have to have is a situation where some library L provides a container or other structured holder of data. The user gets to add data to this container, iterate over its contents, etc. However, the library doesn't really do anything with this data; it simply manages its existence.

But it doesn't even manage its existence so much as its destruction. The reason being that, if you're expected to use RTTI for such purposes, then you are creating classes that L is ignorant of. This means that your code allocates the object and hands it off to L for management.

Now, there are cases where something like this is a legitimate design. Event signaling/message passing, thread-safe work queues, etc. The general pattern here is this: someone is performing a service between two pieces of code that is appropriate for any type, but the service need not be aware of the specific types involved.

In C, this pattern is spelled void*, and its use requires a great deal of care to avoid being broken. In C++, this pattern is spelled std::any (std::experimental::any before C++17).

The way this ought to work is that L provides a node_data class that takes an any that represents your actual data. When you receive the message, thread queue work item, or whatever you're doing, you then cast that any to its appropriate type, which both the sender and the receiver know.

So instead of deriving orange_node from node_data, you simply stick an orange inside of node_data's any member field. The end-user extracts it and uses any_cast to convert it to orange. If the cast fails, then it wasn't orange.
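A minimal sketch of that arrangement using C++17's `std::any`. The `payload` member, the `segments` fields and the `orange_segments` helper are assumptions made for the example; the pointer form of `std::any_cast` returns null rather than throwing when the stored type does not match.

```cpp
#include <any>

struct orange { int segments = 10; };
struct lime   { int segments = 8; };

// What the library would provide: it stores the value but never inspects it.
struct node_data {
    std::any payload;
};

// User code: returns the segment count if the node holds an orange, otherwise -1.
inline int orange_segments(const node_data& n) {
    if (const orange* o = std::any_cast<orange>(&n.payload)) {
        return o->segments;
    }
    return -1;  // the cast failed, so it was not an orange
}
```

Neither `orange` nor `lime` derives from anything, and the library never needs to know they exist.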

Now, if you're at all familiar with the implementation of any, you'll likely say, "hey wait a minute: any internally uses RTTI to make any_cast work." To which I answer, "... yes".

That's the point of an abstraction. Deep down in the details, someone is using RTTI. But at the level you ought to be operating at, direct RTTI is not something you should be doing.

You should be using types that provide you the functionality you want. After all, you don't really want RTTI. What you want is a data structure that can store a value of a given type, hide it from everyone except the desired destination, then be converted back into that type, with verification that the stored value actually is of that type.

That's called any. It uses RTTI, but using any is far superior to using RTTI directly, since it fits the desired semantics more correctly.
