Question
I've just started learning about OOP in C++. Why is the virtual keyword needed to instruct the compiler to do late binding? Why can't the compiler know at compile time that the pointer is pointing to a derived class?
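#include <iostream>
class A {
public:
    int f() { return 'A'; }
};
class B : public A {
public:
    int f() { return 'B'; }
};
int main() {
    A* pa;
    B b;
    pa = &b;
    // f() is not virtual, so this calls A::f(); it prints the int value of 'A' (65 in ASCII)
    std::cout << pa->f() << std::endl;
}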
Answer 1:
As for not knowing at compile time: often the behavior is only known at runtime. Consider this example:
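#include <iostream>
struct A {};
struct B : A {};
struct C : A {};
int main()
{
    int x;
    std::cin >> x;
    // B* and C* are unrelated pointer types, so one operand is cast to A*
    // to give the conditional expression a common type.
    A* a = x == 1 ? static_cast<A*>(new B) : new C;
}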
In this example, how could the compiler know whether a will point to a B or a C? It cannot, because that depends on a value that is only available at runtime.
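To make the runtime dependence observable, here is a minimal sketch that adds a virtual function to the same kind of hierarchy. The name() member and the make() helper are hypothetical additions for illustration and are not part of the original answer; the sketch assumes C++14 for std::make_unique.

```cpp
#include <memory>
#include <string>

// Hypothetical hierarchy: name() is added only to make the dynamic type visible.
struct A {
    virtual ~A() = default;
    virtual std::string name() const { return "A"; }
};
struct B : A {
    std::string name() const override { return "B"; }
};
struct C : A {
    std::string name() const override { return "C"; }
};

// Hypothetical helper: the concrete type depends on x, which in the original
// example comes from std::cin and is therefore unknown at compile time.
std::unique_ptr<A> make(int x) {
    if (x == 1) {
        return std::make_unique<B>();
    }
    return std::make_unique<C>();
}
```

With this sketch, make(1)->name() yields "B" and any other argument yields "C", even though the static type of the pointer is always A.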
Answer 2:
How could it (in full generality)? For example:
#include <cstdlib>
struct Parent {};
struct Child : Parent {};
int main()
{
    Parent* p = std::rand() % 2 ? new Parent() : new Child();
}
Answer 3:
Let's say you have a simple class hierarchy like
class Animal
{
    // Generic animal attributes and properties
};
class Mammal : public Animal
{
    // Attributes and properties specific to mammals
};
class Fish : public Animal
{
    // Attributes and properties specific to fishes
};
class Cat : public Mammal
{
    // Attributes and properties specific to cats
};
class Shark : public Fish
{
    // Attributes and properties specific to sharks
};
class Hammerhead : public Shark
{
    // Attributes and properties specific to hammerhead sharks
};
[A little long-winded, but I want to have the "concrete" classes to be far away from each other]
Now let's say we have a function like
void do_something_with_animals(Animal* animal);
And finally let's call this function:
Fish *my_fish = new Hammerhead;
Mammal* my_cat = new Cat;
do_something_with_animals(my_fish);
do_something_with_animals(my_cat);
Now if we think a little, in the do_something_with_animals function there is really no way of knowing exactly what the argument animal might point to. Is it a Mammal? A Fish? A specific Fish sub-type?
This is even harder for the compiler if the do_something_with_animals function is defined in a different translation unit, where the definitions of the Mammal and Fish classes (or any of their subclasses) might not even be available.
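As a rough sketch of how a virtual function handles this situation, the hierarchy can be given one virtual member. The sound() member and the animal_sound() function are hypothetical stand-ins (the original do_something_with_animals returns void and has no body); they are only there so the result can be inspected.

```cpp
#include <string>

class Animal {
public:
    virtual ~Animal() = default;
    // Hypothetical virtual member, overridden further down the hierarchy.
    virtual std::string sound() const { return "..."; }
};
class Mammal : public Animal {};
class Fish : public Animal {};
class Cat : public Mammal {
public:
    std::string sound() const override { return "meow"; }
};
class Shark : public Fish {
public:
    std::string sound() const override { return "(silence)"; }
};
class Hammerhead : public Shark {};  // inherits Shark::sound()

// Hypothetical stand-in for do_something_with_animals. It only needs the
// definition of Animal; Cat, Shark and Hammerhead could live in another
// translation unit and the call would still reach the right override.
std::string animal_sound(const Animal* animal) {
    return animal->sound();
}
```

Passing a Fish* that points to a Hammerhead gives "(silence)", and passing a Mammal* that points to a Cat gives "meow", although animal_sound never mentions either class.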
Answer 4:
The virtual keyword marks individual functions as late-bound. This isn't about what the compiler can or cannot know about any pointers to the object. It's about communicating programmer intent ("this function is meant to be overridden") and efficiency ("this function needs the late-binding mechanism enabled").
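A small sketch of the per-function nature of virtual: the same class can have one early-bound and one late-bound member. The names Base, Derived, early(), late(), call_early() and call_late() are hypothetical and chosen only for this illustration.

```cpp
struct Base {
    virtual ~Base() = default;
    // Non-virtual: the call is bound using the static type of the expression.
    char early() const { return 'A'; }
    // Virtual: the call is bound using the dynamic type of the object.
    virtual char late() const { return 'A'; }
};

struct Derived : Base {
    // Hides Base::early(); it does not override it.
    char early() const { return 'B'; }
    // Overrides Base::late(); 'override' documents and checks that intent.
    char late() const override { return 'B'; }
};

// Both helpers see only a Base reference.
char call_early(const Base& b) { return b.early(); }
char call_late(const Base& b) { return b.late(); }
```

Given a Derived object d, call_early(d) returns 'A' and call_late(d) returns 'B'. Only the function marked virtual pays for, and benefits from, the late-binding mechanism.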
Answer 5:
(I started out with some comments on an answer, but decided I should just write up my own answer.)
I've rearranged your code slightly here to make it easier to compile and view the output:
#include <iostream>
#ifdef V
#define VIRTUAL virtual
#else
#define VIRTUAL /*nothing*/
#endif
class A {
public:
    VIRTUAL char f() { return 'A'; }
};
class B : public A {
public:
    char f() { return 'B'; }
};
int main() {
    A* pa;
    B b;
    pa = &b;
    std::cout << pa->f() << std::endl;
}
Compiling and running it shows:
$ c++ t.cc && ./a.out
A
$ c++ -DV t.cc && ./a.out
B
which shows that the virtual keyword changes the behavior of the program. This is in fact required by the language standard. Your question could, I think, be best rephrased as "Why is the standard written this way?" (which has a more useful general answer) rather than "Can the compiler optimize my code?" (which has a specific but useless answer: yes, it can, but it's still required to print A, not B).
The language definition doesn't forbid the compiler from doing special optimization tricks. Instead, and especially so for C++, the language specification specifically tries to make it easier for compiler writers to optimize. This winds up putting more of a burden on C++ programmers.
If C++ were a different language ...
The feature you're talking about, which is the virtual keyword, specifically exists because of this. The language could have been defined differently (and some other languages are): they could have said that compiler writers must not ever assume that, given some valid A* pa, pa points to some actual instance of type A. Then:
std::cout << pa->f() << std::endl;
would always have to figure out: What is the real underlying type of *pa and hence what function f shall I call here?
In this hypothetical (not-C++) language [1], a compiler that optimizes could take your code and build it to call B::f() directly, because pa points to an instance of type B. But in this same language, a compiler that tries to optimize heavily could not make assumptions about functions where the underlying type of pa is determined by something not predictable at compile time:
void f(A* pa) {
    std::cout << pa->f() << std::endl;
}
int main(int argc, char **argv) {
    A a;
    B b;
    f(argc > 1 ? &b : &a);
}
This program needs to print A when called with no extra arguments, and B when called with extra arguments. So if our not-C++ language lacks a virtual keyword, or defines it as a no-op, function f—which calls either A::f() or B::f() at run-time—must always figure out which underlying function to call.
[1] It's not C either. The name D is taken. Perhaps P, from the BCPL progression?
Conclusion
Because C++ does have the virtual keyword, the variant we build that has a non-virtual f() in base class A can optimize pa->f() calls by assuming that pa->f() calls A::f(). Hence, instead of actually calling A::f(), an optimizing compiler can just write "A\n" to std::cout. Whether or not the C++ compiler optimizes, the call must produce A rather than B.
The variant with the virtual keyword inserted must not assume that pa->f() calls A::f(). If it can optimize enough to see that pa->f() calls B::f(), and therefore, at compile time, eliminate the call entirely and have the function write "B\n", that's OK! If it can't optimize that much, that's OK too—at least, as far as the language specification goes.
You, as a programmer, are required to know this about the virtual keyword, and to use it whenever you want the compiler to be forced to pick the right function based on the actual runtime class, whether or not the compiler is smart enough to do that at compile-time. If you want to allow and force the compiler to just use the base-class function every time, you can omit the virtual keyword.
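As a side note to the choice described above, C++ also lets a single call opt out of late binding even when the function is virtual, by naming the class explicitly. The following is a minimal sketch; dynamic_call() and static_call() are hypothetical helper names, and A and B mirror the classes used earlier with const added to f().

```cpp
struct A {
    virtual ~A() = default;
    virtual char f() const { return 'A'; }
};

struct B : A {
    char f() const override { return 'B'; }
};

// Ordinary call through a base pointer: uses the virtual call mechanism,
// so the override for the object's actual class is selected.
char dynamic_call(const A* pa) { return pa->f(); }

// Qualified call: naming A::f explicitly suppresses virtual dispatch,
// so A's version runs regardless of the object's actual class.
char static_call(const A* pa) { return pa->A::f(); }
```

For a B object b, dynamic_call(&b) returns 'B' while static_call(&b) returns 'A'. In both cases the result is fixed by the language rules, not by how clever the optimizer happens to be.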
Source: https://stackoverflow.com/questions/61954860/why-cant-c-compiler-know-a-pointer-is-pointing-to-a-derived-class