Data isolation between a pure abstract class and its implementation, and the true meaning of a pure abstract class?

Question

In Bjarne Stroustrup's book "The C++ Programming Language", it is stated that a class derived from a superclass in a class hierarchy, in many cases, gets access to the data of the superclass. The book suggests this is a problem, because sharing between "two related, but different, sets of data is asking for trouble. Sooner or later someone will get them out of sync. Also, experience shows that novice programmers tend to mess with protected data in ways that are unnecessary and that cause maintenance problems". The question here is: how is having access to private data from the ancestor class asking for trouble? It sounds as if inheriting assets from your parent were a bad deal.

As a consequence, the pure abstract class mechanism allows the abstract class to be separated from its implementation. Note that the implementation includes data members: if data members are defined in an abstract class, the abstract class contains implementation details (the defined members), so the abstraction and the implementation are not well decoupled. Why is such a separation needed? One of the reasons, as stated here, is that it is "a way of forcing a contract between the class designer and the users of that class. The users of this class must declare a matching member function for the class to compile." Another important reason, as stated in the book, is to protect against changes in the implementation. With an abstract class hierarchy, you are protected against changes that would otherwise require recompiling an entire class hierarchy.

For example, suppose you have a normal class that contains 10 data members and 20 member functions, and is inherited by a dozen of classes with several layers below the root class. If you make a few changes to the implementation of the root class's member functions (the function signatures remain unchanged), you have to recompile the entire class hierarchy to apply the changes to its terminal classes, otherwise the program breaks. With abstract class hierarchy, unless the function signature is changed, if an implementation of a terminal class changes, only that terminal class needs to be recompiled. Therefore, most of the code is protected as much as possible. Is my understanding correct?

I think his example (the Ival_box example), in combination with the Bridge design pattern, is powerful, since the Bridge pattern leaves the abstraction tree alone entirely and has another class hierarchy for the implementation. A pure abstract class can be considered equivalent to an interface in Java, except that in Java the usual explanation is that a class can inherit from only one abstract class, while interfaces allow multiple inheritance, so shared behaviors can be encapsulated in an interface shared between multiple classes. That answer, I think, is not valid in this context, since C++ allows multiple inheritance.

The last question is: how is a derived class compiled, and how does it exist in the binary image? Does the compiler fill the inherited information into the derived class, treat it as an isolated class afterward, and then compile it? (Similar to preprocessing.)

tl;dr:

How is having access to private data from the ancestor class asking for trouble?

Is a pure abstract class (aka an interface in Java) a way to protect source code against changes, with the separation of the abstraction tree and the implementation tree?

How is an abstract class/superclass compiled in C++? Does the compiler turn the derived class into a single class by filling in the information from the superclass, and then compile it?

Answer 1:

Meta-note: Your questions could have been asked as multiple, separate questions.

How is having access to private data from the ancestor class asking for trouble?

Non-orthogonality. When the protected part of the ancestor class is changed, all derived classes have to be changed, too. This is error-prone, because the programmer changing the ancestor might not have the derived classes in his mental scope.

With abstract class hierarchy, unless the function signature is changed, if an implementation of a terminal class changes, only that terminal class needs to be recompiled.

Technically correct, but you are missing the main point. Recompilation is not nice, but also not terribly costly, because computers do it. The real cost is human work induced by non-orthogonality. When a class has a public or protected data member, people are going to use it, and when it changes, things break and have to be fixed.

BTW, "is inherited by a dozen of classes with several layers below the root class" is usually a sign of bad design. Keep inheritance hierarchies flat and concise. Prefer members over base classes: When you decide to use functionality of a class Foo in a class Bar, and there is no compelling reason to let Bar inherit Foo, rather use a member of type Foo.
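The "prefer members over base classes" advice can be sketched as follows. This is only a minimal illustration: Foo and Bar are the placeholder names from the paragraph above, and the member functions (record, count, doWork, recordedSteps) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class whose functionality we want to reuse.
class Foo {
  public:
    void record(const std::string& msg) { log_.push_back(msg); }
    std::size_t count() const { return log_.size(); }

  private:
    std::vector<std::string> log_;
};

// Bar uses Foo through a member instead of inheriting from it.
// Bar's public interface exposes only what Bar chooses to expose,
// so Foo's interface does not leak into Bar's users.
class Bar {
  public:
    void doWork() {
        foo_.record("work started");
        foo_.record("work finished");
    }
    std::size_t recordedSteps() const { return foo_.count(); }

  private:
    Foo foo_;  // a member, not a base class
};
```

Because Bar only talks to Foo's public interface, Foo's private data can change without Bar's source having to change.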

That answer, I think, is not valid in this context, since C++ allows multiple inheritance.

In C++, the distinction between interfaces and classes is not reified (made into a thing), but it is still used. It is very useful in many cases to build interfaces, i.e. classes consisting only of pure virtual functions, because they avoid many of the troubles of the much-dreaded multiple inheritance.
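A minimal sketch of such interface classes, assuming hypothetical names (Printable, Resettable, Counter, describe) that are not part of the question:

```cpp
#include <cassert>
#include <string>

// Interface: only pure virtual functions and a virtual destructor, no data.
class Printable {
  public:
    virtual ~Printable() = default;
    virtual std::string print() const = 0;
};

class Resettable {
  public:
    virtual ~Resettable() = default;
    virtual void reset() = 0;
};

// Implementing several interfaces is multiple inheritance, but since the
// bases hold no data there is no duplicated or ambiguous state.
class Counter : public Printable, public Resettable {
  public:
    void increment() { ++value_; }
    std::string print() const override { return std::to_string(value_); }
    void reset() override { value_ = 0; }

  private:
    int value_ = 0;
};

// Client code depends only on the interface it needs.
std::string describe(const Printable& p) { return "value=" + p.print(); }
```

Here describe() compiles against Printable alone, so it is unaffected by how Counter stores its value.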

How is a derived class compiled, and how does it exist in the binary image?

This is compiler-specific. Don't worry about it just yet: first understand the language, then its implementation.

Answer 2:

Your understanding regarding the need to recompile everything if an exposed member of the root class changes is correct.

Also, your understanding of how a derived class is compiled is roughly correct. As a matter of fact, the first C++ compiler, Cfront, translated C++ source into C code.

Having access to private data from the ancestor class is asking for trouble because the descendants are then built to depend on implementation details of the ancestor. If the ancestor needs to change the way it works (even if its interface and functionality are to remain the same), all descendants break.

Answer 3:

How is having access to private data from the ancestor class asking for trouble?

Imagine that an ancestor class has certain features, state, and methods that it maintains. Assume that certain combinations of member data are invalid (that is, they violate the class invariants). Then imagine that a child class could change the ancestor's data without maintaining the invariants. You wind up with invalid state, and the parent doesn't even know what happened, so it has no chance to throw an exception or recover in any way. Even if it ends up in a valid state, the ancestor class may still be "confused" about the state it's in if that state has been mutated outside the class's (public/protected) interface.
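The scenario above can be sketched like this. The classes Buffer and SloppyBuffer and the invariant itself are hypothetical, chosen only to make the problem visible:

```cpp
#include <cassert>

// Hypothetical base class with the invariant: 0 <= size_ <= capacity_.
class Buffer {
  public:
    explicit Buffer(int capacity) : capacity_(capacity), size_(0) {}
    virtual ~Buffer() = default;

    // The public interface keeps the invariant intact.
    bool push() {
        if (size_ == capacity_) return false;
        ++size_;
        return true;
    }
    bool invariantHolds() const { return size_ >= 0 && size_ <= capacity_; }

  protected:
    int capacity_;
    int size_;  // protected: every derived class can write to it
};

class SloppyBuffer : public Buffer {
  public:
    using Buffer::Buffer;  // inherit the constructor
    // Bypasses push(): the base class never gets a chance to check.
    void forceSize(int n) { size_ = n; }
};
```

After a call such as forceSize(10) on a buffer of capacity 2, invariantHolds() returns false, and nothing in Buffer could have prevented it. Making size_ private would turn that line into a compile error.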

Is a pure abstract class (aka an interface in Java) a way to protect source code against changes, with the separation of the abstraction tree and the implementation tree?

A pure abstract class is used to represent an interface concept. It doesn't inherently protect source code against changes. Often, as long as the interface is appropriate and the implementation can be written to the interface easily, client code won't have to change, although it may have to be rebuilt. The separation of interface and implementation is definitely desirable and can be improved further with the use of a non-virtual interface that delegates to a virtual pimpl implementation (or by using the non-virtual interface pattern).
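As a rough sketch of the non-virtual interface idea mentioned above (Shape, Rectangle, area, and doArea are hypothetical names, and the clamping rule is only an example of a shared check):

```cpp
#include <cassert>

class Shape {
  public:
    virtual ~Shape() = default;

    // Non-virtual public interface: one place for checks shared by all
    // implementations; derived classes cannot bypass it.
    double area() const {
        double a = doArea();
        return a < 0.0 ? 0.0 : a;  // example post-condition: never negative
    }

  private:
    // Customization point: private and pure virtual.
    virtual double doArea() const = 0;
};

class Rectangle : public Shape {
  public:
    Rectangle(double w, double h) : w_(w), h_(h) {}

  private:
    double doArea() const override { return w_ * h_; }
    double w_;
    double h_;
};
```

Clients only ever call area(), so the base class stays in control of what they observe, while each implementation overrides just doArea().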

How is an abstract class/superclass compiled in C++? Does the compiler turn the derived class into a single class by filling in the information from the superclass, and then compile it?

The superclass is treated as a portion of the derived class. The precise details of this vary slightly by implementation.
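A small sketch of what "a portion of the derived class" means in practice. Base, Derived, and readThroughBase are hypothetical names; the exact layout (offsets, padding, vtable pointers) remains implementation-specific:

```cpp
#include <cassert>

struct Base {
    int a;
};

struct Derived : Base {
    int b;
};

// A Derived object contains a complete Base subobject, so it is at
// least as large as a Base.
static_assert(sizeof(Derived) >= sizeof(Base),
              "a derived object is at least as large as its base");

int readThroughBase(const Derived& d) {
    const Base& basePart = d;  // refers to the Base portion inside d
    return basePart.a;
}
```

Binding a Base reference to a Derived object does not copy anything; it simply refers to the embedded Base part of the same object.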

Answer 4:

Having worked with several O.O. languages, I have learned a few things that help when programming in C++ and that apply to your question.

It's a long, boring answer, but it may be worth the time.

Concrete Classes, first

I usually start with concrete classes, and when several classes share common features, such as methods or fields, I design a common superclass. Many developers (in any language) seem to jump straight to writing superclasses first and concrete child classes later.

In the design of O.O. programs it may seem that superclasses, whether abstract or not, were designed first. But in the real world most classes start as concrete, and the superclasses come later, even if superclasses appear first in the design, the U.M.L., or the code.

So this example, including data & methods:

//  ....

#include <iostream>

class Employee {
  public:
    Employee();                         // constructor;

    void SayHello();
    void Work();
};

class Student {
  public:
    Student();                          // constructor;

    void SayHello();
    void Study();
};

//  ....

Becomes this:

//  ....

#include <iostream>

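class Person {
  public:
    Person();                         // constructor;
    virtual ~Person();                // virtual destructor, for deletion via Person*

    virtual void SayHello() = 0;
};

class Employee : public Person {
  public:
    Employee();                         // constructor;

    virtual void SayHello();            // overrides Person::SayHello
    void Work();
};

class Student : public Person {
  public:
    Student();                         // constructor;

    virtual void SayHello();            // overrides Person::SayHello
    void Study();
};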

//  ....

The previous examples show only methods, not data, but the same principle applies.

Avoid private fields or methods

Something that I usually do is avoid "private" sections. If I need to hide something, I use "protected" instead.

So, this:

//  ....

#include <iostream>
#include <cstring>

class Person {
  private:
    char Name[512];

  public:
    Person();                         // constructor;

    virtual void SayHello() = 0;    
};

//  ....

Becomes this:

//  ....

#include <iostream>
#include <cstring>

class Person {
  protected:
    char Name[512];

  public:
    Person();                         // constructor;

    virtual void SayHello() = 0;    
};

//  ....

If you are working with a class, abstract or not, consider that you or other developers may write a subclass that needs to access the parent class's fields or methods.

Use properties (with accessor methods) instead of plain data fields

Something that really bothers me is the absence of "real properties" in C++ and Java.

One way to access, modify, control, or update an object's data is through properties, that is, fields accessed using methods called accessors.

Properties can be implemented in C++ using templates, or by just coding the methods by hand.
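For the template route, a minimal sketch could look like the following. The Property wrapper and its get/set members are hypothetical, not a standard library facility, and a real version would probably add validation hooks:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical Property wrapper: reads and writes go through get()/set(),
// and the assignment and conversion operators give field-like syntax.
template <typename T>
class Property {
  public:
    Property() = default;
    explicit Property(T initial) : value_(std::move(initial)) {}

    const T& get() const { return value_; }
    void set(const T& v) { value_ = v; }

    // Field-like syntax: "obj.Name = x" and "T y = obj.Name".
    Property& operator=(const T& v) { set(v); return *this; }
    operator const T&() const { return get(); }

  private:
    T value_{};
};

struct Person {
    Property<std::string> Name;
};
```

With this, an assignment to a Person's Name goes through set(), and reading it as a std::string goes through get(), so checks can later be added in one place.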

So this direct, non-isolated field example:

//  ....

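#include <iostream>
#include <cstring>

class Person {
  public:
    Person() { Name[0] = '\0'; }      // constructor; starts with an empty name
    virtual ~Person() {}

    char Name[512];

    virtual void SayHello() = 0;
};

// Person is abstract, so a concrete subclass is needed to create an object.
class Employee : public Person {
  public:
    void SayHello() override { std::cout << "Hello\n"; }
};

int main() {
  Person* thisPerson = new Employee();

  // direct, unchecked write to the public field
  std::strcpy(thisPerson->Name, "John Doe");

  std::cout << thisPerson->Name << "\n";

  delete thisPerson;
}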

//  ....

Becomes this isolated property example:

//  ....

#include <iostream>
#include <cstring>

class Person {
  protected:
    // isolated data field
    char FName[512];

  public:
    Person() { FName[0] = '\0'; }     // constructor; starts with an empty name
    virtual ~Person() {}

    virtual void SayHello() = 0;

    // public data field accessor reader method or "getter"
    const char* getName() const;

    // public data field accessor writer method or "setter"
    void setName(const char* AName);
};

const char* Person::getName() const {
  return FName;
}

void Person::setName(const char* AName) {
  // do some validation before actually modifying the field;
  // example check (an assumption): reject null and names too long for the buffer
  bool IsValid = (AName != nullptr) && (std::strlen(AName) < sizeof(FName));

  if (IsValid)
  {
    std::strcpy(FName, AName);
  }
}

// Person is abstract, so a concrete subclass is needed to create an object.
class Employee : public Person {
  public:
    void SayHello() override { std::cout << "Hello, I am " << getName() << "\n"; }
};

int main() {
  Person* thisPerson = new Employee();

      // should be read as "thisPerson.Name = "John Doe";"
  thisPerson->setName("John Doe");

      // should be read as "cout << thisPerson.Name;"
  std::cout << thisPerson->getName() << "\n";

  delete thisPerson;
}

//  ....

Many C++ developers don't like "properties" because of the lack of control. I think you should use them to access data, as in your question, but like any feature they should be "properly used".

Property accessor methods should be virtual

In some languages the concept of properties can be implemented by accessing a data field of an object directly, by reading and writing through methods, or by mixing both approaches. To make things more complicated, the methods can be "virtual" or "non-virtual".

When I use properties, I always use virtual methods to access them. There is a penalty in speed & memory, but it is worth it. Properties with virtual, overridden methods make it possible to control access to the data of objects.
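A minimal sketch of virtual accessors, reusing the Person/Employee names from the examples above. The use of std::string and the "reject empty names" rule are assumptions made only for illustration:

```cpp
#include <cassert>
#include <string>

class Person {
  public:
    virtual ~Person() = default;

    // Virtual accessors: derived classes may refine how the data is
    // read or written without touching the field itself.
    virtual std::string getName() const { return name_; }
    virtual void setName(const std::string& name) { name_ = name; }

  private:
    std::string name_;
};

// Hypothetical subclass that adds a validation rule to the setter.
class Employee : public Person {
  public:
    void setName(const std::string& name) override {
        if (!name.empty()) {  // reject empty names
            Person::setName(name);
        }
    }
};
```

A call to setName through a Person reference that refers to an Employee still runs the Employee's check, which is the control the virtual dispatch buys at the cost of an indirect call.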

Summary

Your question relates to how to implement data, how to isolate data, and how to access data in a class, and how to implement such features in child classes.

I have read similar questions on Stack Overflow, and I have come to the conclusion that several of them relate to the absence of "real properties" in C++.

You may want to manage your data access & isolation through the use of "properties".

You may also want to try learning other O.O. languages in which the concept of "properties" is implemented, such as VB (.NET), C#, Delphi, D, or Vala.

Later, try to implement this concept in your C++ programs when required. I have classes in C++ that have plain data fields, fields treated as "properties" with accessor methods as Java does, or a mix of both, as required.

That doesn't mean you should rewrite the application you are working on, or try to prove which programming language is better; it just means learning something from other programming languages that may help your work.

Source: https://stackoverflow.com/questions/8668994/data-isolation-between-pure-abstract-class-against-its-implmentation-and-the-tru

Tags

c++

abstract-class