3 real overheads@vptr

Suppose your class Trade has virtual functions and a comparable class Order has no virtual functions. What are the specific runtime overheads of the vptr/vtable usage?

  1. cpu cache efficiency — memory footprint of the vptr in each object. Java affected! If you have a lot of Trade objects with only one char data field, then the vptr greatly expands footprint and you overuse cache lines.
    • [[ARM]] singles out this factor as a justification for -fno-rtti… see RTTI compiler-option enabled by default
    • [[moreEffC++]] P116 singles out vptr footprint as the biggest performance penalty of vptr
  2. runtime indirection — “a few memory references more efficient” [1] in the Order usage
  3. inlining inhibition is the most significant overhead. P209 [[ARM]] says inline virtual functions make perfect sense so it is best to bypass vptr and directly call the virtual function, if possible.

[1] P209 [[ARM]] wording

Note a virtual function unconditionally introduces the first overhead, but the #2/#3 overheads can sometimes be avoided by a smart compiler.

 


covariant return type: c++98→java

java “override” rule permits covariant return — an overriding function may return a type D that’s a subtype of B, which is the original return type of the overridden method.

— ditto c++

https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Covariant_Return_Types. Most common usage is a “clone” method that

  • returns ptr to Derived, in the Derived class
  • returns ptr to Base in the Base class

Covariant return types work with multiple inheritance and with protected and private inheritance — these simply affect the access levels of the relevant functions.

I was wrong to say virtual mechanism requires exact match on return type.

CRT was added in c++98. ARM P211 (c 1990) explains why CRT was considered problematic in the Multiple Inheritance context.

down-cast a reference #idiom

ARM P69 says down-cast a reference is fairly common. I have never seen it.

Q: Why not use ptr?
%%A: I guess pointer can be null so the receiver function must face the risk of a null ptr.
%%A: 99% of references I have seen in my projects are function parameters, so references are extremely popular and proven in this use case. If you receive a ref-to-base, you can down cast it.

See post on new-and-dynamic_cast-exceptions
see also boost polymorphic_cast

pure virtual dtor

pure virtual dtor is a low-value obscure topic in 1) interview 2) zbs. So I won’t spend too much time on it.

https://stackoverflow.com/questions/1219607/why-do-we-need-a-pure-virtual-destructor-in-c addresses the rationale and justification for pure virtual dtor

https://www.geeksforgeeks.org/pure-virtual-destructor-c/ explains this dtor must be defined somewhere or compiler/linker would complain.

c++base class method accessing subclass field #RTS IV

Surprisingly, the non-virtual base class method can Never know that it’s actually operating within a subclass instance. It always behaves as a strictly-baseclass method and can’t access subclass data. The non-virtual method getId() is compiled into base class binary, so it only knows the fields of the base class. When a subclass adds a field of the same name, it is not accessible to the getId(). A few workarounds:

  1. curiously recurring template pattern
  2. virtual function in subclass
  3. down cast the pointer and directly access the field
#include <iostream>
using namespace std;
struct B {
	char id = 'B';
	virtual ~B() {}
	char getId() {
		return id;
	}
} b;
struct C: public B{
	char id = 'C';
} c;
int main() {
	B* ptr = new C();
	C* ptr2 = dynamic_cast<C*>(ptr);

	cout << "base class nonref: " << b.getId() << endl;
	cout << "sub class nonref: " << c.getId() << endl; //still base class method. 
	// The fact that host object is subclass doesn't matter.

	cout << "base class ptr to subclass instance: " << ptr->getId() << endl; 
	cout << "downcast ptr non-virtual method: " << ptr2->getId() << endl; //B
	cout << "downcast ptr direct access: " << ptr2->id << endl; //C
	delete ptr; // B has a virtual dtor, so deleting via the base ptr is fine
	return 0;
}

 

java override: strict on declared parameter types

Best practice – use @Override on the overriding method to request “approval” by the compiler. You will realize that

https://briangordon.github.io/2014/09/covariance-and-contravariance.html is concise –

Rule 1: “return type of the overriding method can (in c++ too since c++98, but not in the 1990 ARM [1]) be a subclass of the return type of the overridden method, but the argument types must match exactly”

So almost all discrepancies between parent/child parameter types (like int vs long) will be compiled as overloads. The only exception I know is — overriding method can remove the type argument (such as “<String>”) from List as the parameter type, i.e. use the raw type.

There could be other subtle rules when we consider generics, but in the world without parameterized method signatures, the above Rule 1 is clean and simple.

[1] [[ARM]] P212 explains that multiple inheritance would be problematic if this were allowed.

"new virtual" modifiers in c#

Unlike java and c++, c# offers new-virtual, which marks a virtual method hiding a base-class virtual method.

* When used in an interface, you must omit the “virtual” since it’s implicit
* When used in a class C, you must spell out “new virtual”. There must be such a Virtual method in the base class B, and there should be a grand-child class D that Overrides this method. P75 [[c# precisely]] illustrates this BCD scenario.

In c#, “override” cannot be combined with “new” as a method modifier; otherwise “new” plays happily with all other method modifiers —
– new virtual
– new abstract
– new static
– new (nothing) — simple, non-virtual method hiding

y dynamic_cast returns NULL on pointers

https://bintanvictor.wordpress.com/2010/01/14/new-and-dynamic_cast-exceptions-quick-guide/ shows that dynamic_cast of a pointer returns NULL upon failure. Why not throw exception?

A: for efficient test-down-cast. dynamic_cast is the only solution. If it throws exception, the test would have to incur the cost of try/catch.

See P59 [[beyond c++ standard lib]]

template specialization ^ virtual function

To achieve polymorphism, template specialization is an alternative to virtual functions. The way a “behavior” is chosen for a given type is by compile time template instantiation.

In contrast, virtual functions (including essentially all java non-static methods) use run time binding and require additional run time cost and per-instance memory cost –vptr.

In efficiency-sensitive and latency-sensitive apps, the choice is obvious.

RTTI compiler-option enabled by default

All modern compilers have RTTI enabled by default. If you disable it via a compiler option, then typeid, type_info and dynamic_cast may fail, but virtual functions continue to work. Here’s the g++ option

-fno-rtti: Disable generation of information about every class with virtual functions for use by the C++ runtime type identification features ('dynamic_cast' and 'typeid'). If you don’t use those parts of the language, you can save some space by using this flag. Note that exception handling uses the same information, but it will generate it as needed. The 'dynamic_cast' operator can still be used for casts that do not require runtime type information, i.e. casts to void * or to unambiguous base classes.

See http://en.wikibooks.org/wiki/C++_Programming/RTTI

undefined behavior C++: #1 famous, unknown secrets

(See below for smart ptr, template, non-RTTI)

Deleting [1] a derived[3] object via a base[4] pointer is undefined behavior if base[6] class has non-virtual dtor, with or without vtable.

This is well-known but it applies to a very specific situation. Many similar situations aren’t described by this rule —
[1a] This rule requires pointer delete. In contrast, automatic destruction of a non-ref “auto” variable (on stack) is unrelated.

[1b] This rule requires a heap object. Deleting a pointee on stack is a bug but it’s outside this rule.

[1c] This rule is about delete-expression, not delete[]

[3] if the object’s run-time type is base, then this rule is Inapplicable

[4] if the pointer is declared as pointer-to-derived, then Inapplicable, as there is no ambiguity which dtor to run

[3,4] if the object run time type is base, AND pointer is declared pointer-to-derived? Inapplicable — compiler or runtime would have failed much earlier before reaching this point.

[6] what if derived class has non-virtual dtor? Well, that implies base non-virtual too. So Yes applicable.

*) P62 [[effC++]] points out that even in the absence of virtual functions (i.e. in a world of non-RTTI objects), you can still hit this UB by deleting a subclass instance via a base pointer.

**) The same example also shows a derived class-template is considered just like a derived class. Let me spell out the entire rule — deleting an instance of a derived-class-template via a pointer to base-class-template is UB if the base class-template has a non-virtual dtor.

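What if the pointee is deleted by a smart pointer’s destructor? I think you can hit this UB, for example with a std::unique_ptr<Base> holding a Derived. One exception: a std::shared_ptr<Base> constructed directly from a Derived* remembers how to delete the original pointer type.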

virtual function adding complexity #letter to friend

Hi LA

I guess the reason many C++ programming teams avoid the virtual keyword is that this innocent-looking keyword can cause a complexity explosion when mixed with other elements —

– dynamic cast for pointer or reference variables
– pure virtual functions
– slicing — without virtual, slicing is more consistent and easier to debug
– passing a RTTI object by reference — note an object is an RTTI object IFF it has a virtual pointer.
– throwing an RTTI exception object by reference, by pointer or by value — again, things are simpler without virtual
– member function hiding (by accident) — without virtual, the hiding rule is simpler
– non-trivial destructors
– pointer delete — for example, deleting a derived object via a base pointer is undefined behavior if the destructor is non-virtual. If we avoid virtual completely, we are less likely to write this kind of buggy code.
– double pointers — pointer to pointer to an RTTI object
– double dispatch — usually involves virtual functions. Double-dispatch may not be very meaningful to non-RTTI objects.
– container of RTTI objects, such as vector of pointer-to-base, where each pointee can be a Derived1 object or Derived2 object… — again, non-RTTI cases are simpler
– templates — remember STL internally uses no virtual function, and perhaps very little inheritance
– smart pointers holding RTTI objects
– private inheritance
– multiple and virtual inheritance

Some experts advocate all base classes should have virtual destructor. In that case, avoiding virtual means avoiding inheritance. That would definitely reduce complexity.

make dtor pure-virtual 2make class abstract #Google style guide

Many c++ authors say that if you want to make a non-abstract base class abstract, but don’t have any method to “purify” (i.e. convert to pure-virtual), then a standard practice is to purify dtor by adding “=0” at end of the already-virtual dtor.

Note: all base classes should have their dtors virtualized, according to many authorities, but Google style guide says “only if base class has virtual methods”

I wonder why not add a dummy pure-virtual method. (I think in practice this is fine.)

I believe the answer is efficiency, both space-efficiency and possibly run-time efficiency. Adding a dummy pure-virtual adds to every class size and vtbl size, but not instance size. Also every non-abstract sub-class must implement or inherit this dummy method.

const T =shadow-type of T;const T&=distinct type from T&

(background: the nitty gritty of overload/overriding rules are too complicated…)

I feel overloading and overriding have consistent rules. Take 2 same-name functions (single-param functions for simplicity) and ignore their host classes. If the 2 can overload each other, then their parameters are considered distinct. That means they can’t override each other (if they were inserted into an inheritance tree).

Conversely, if the 2 can override each other, then their parameter types are considered “identical” so they can’t overload each other (if set “free”).

Q: We know function overloading forbids 2 overloads with identical parameter sets. How about const, i.e. If 2 functions’ parameters differ only in const, can they have identical names?
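A: No. ARM P308 explains that compiler’s (static) “resolution” (based on argument type) will fail to pick a winner among the 2. It’s ambiguous. (For a by-value parameter, top-level const is not part of the function signature, so the 2 declarations denote the same function.)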
A: therefore, in the overloading context, const SomeType is a “shadow type” and does NOT count as a distinct type.

However, if 2 functions’ parameters have this difference — const T& vs T&, then distinct types, so the 2 functions can have identical names. Explained in ARM.

Similarly, 2 overloads can overload based on const T* vs T* — distinct types.

Q: We know method overriding requires exact parameter match. How about const? Can an override add or remove const?
A: whatever the answer, this is sloppy. No justification.
A: yes according to my test. Adding the const makes no difference — runtime binding unobstructed.
A: therefore, in the overriding context, const SomeType is a “shadow type” and does NOT count as a distinct type.

However, my test shows const T& vs T& do make a difference — runtime binding disabled. These are considered 2 distinct types.

overload^override – c++ (java too)

I consider overloading and overriding 2 magic tricks by the compiler. Here are a few contrasts as far as I can see.
Example of overloading —
function1(B* arg);
function1(D* arg);

– Overloading is compile-time magic; overriding is run-time magic. If you have a D object’s address in a variable myVar and you pass it to function1, which function1 is chosen by compiler? Depends on the declared type of myVar. Discussed repeatedly in my blog.

– consider renaming to function1_for_Base()/function1_for_Derived() etc. It may seem cool to have a bunch of overload functions, but in one Barcap email sender utility class, there are 4 send(…) utilities each with more than 7 parameters — code smell. It’s hard for users to tell them apart. P403 [[c++TimesavingTechniques]] points out that overloading complicates debugging and maintenance.
** Readability is fine in overRiding

– Overriding is generally a best practice to be adopted whenever possible. I can’t say the same about overloading.
– c++ has hiding rule about overLoading. No such thing about overRiding.
– c++ does implicit (!) type conversion about overLoading.
– In conclusion, overriding is cleaner than overloading, less complicated, and more readable.
– In overriding There’s more magic by compiler — vptr etc. More powerful but Not more complicated
– overLoad can be applied to operators such as op=, op+. overRide can too, since an operator member function can be declared virtual, though that is rare.

pure virtual with/out implementation #IV

Item 44 in [[EffC++]] offers a summary that basically says

1) a pure virtual means only interface is inherited (since parent provides no implementation)
2) a “simple/traditional virtual” means interface plus a default implementation is inherited
3) a non-virtual means interface plus a mandatory implementation is inherited and subclasses are advised to keep the implementation. C++ actually allows subclass to /deviate/ by redefining and hiding the parent implementation. Java has a cleaner solution in the “final” keyword.

I’d like to add

1b) a pure-virtual-with-an-implementation tells the subclass author

“Inherit this interface. By the way I also offer an implementation upon request, but uninheritable.”

This differs from (2). Both are illustrated in Item 36.

Author of a subclass of an abstract base class (featuring pure2()) can choose one of three options:

  • 1. don’t declare pure2() at all.  As the default and most popular usage, the subclass is also abstract (pun intended) by virtual of the inherited pure2().
  • 2. define pure2(), and becoming a non-abstract class
  • … so far, exactly same as java syntax
  • 3. redeclare the same pure2() without implementation — an error. See P215 [[ARM]]

c++ method hiding, redefining, overriding – fundamentals

Background — When reading a particular function call in the context of a c++ class hierarchy, we need to identify exactly which function is selected at compile/runtime. In the case of “No match”, we get a compile-time error (never run time?).

Non-trivial. It’s easy to lose focus. Focus on the fundamental principles — only a few.

– Fundamental — override is strict [1]. If overriding, then vtbl dynamic binding. Simple and clear. Otherwise, it’s always, always static binding.
– Fundamental — if static binding, then remember the hiding rule. Per-name basis — See [[Eff C++]] last item. As a result, some base class methods become unavailable — compiler errors. [3]
– Fundamental — compiler attempts implicit type conversion on every argument.

Redefining is an important special case of hiding, but fundamentally, it’s plain vanilla function hiding.

It was said that Overriding resolution is done “after” hiding? Does it mean that the hiding rules kick in first before system goes through overriding resolution? But I don’t think hiding would kick in at all.

[1] see http://bigblog.tanbin.com/2011/02/runtime-binding-is-highly-restrictive.html
[3] Fixable with a local using-declaration (such as “using Base::f;”) — Using Defeats Hiding

java/c++overriding: 8 requirements #CRT

Here’s Another angle to look at runtime binding i.e. dynamic dispatch i.e. virtual function. Note [[effModernC++]] P80 has a list of requirements, but I wrote mine before reading it.

For runtime binding to work its magic (via vptr/vtbl), you must, must, must meet all of these conditions.

  • method must be –> non-static.
  • member must be –> non-field. vtbl offers no runtime resolution of FIELD access. See [[java precisely]]. A frequent IV topic.
  • host object must be accessed via a –> ref/ptr, not a non-ref variable. P163 [[essential c++]]. P209 [[ARM]] explains that with a nonref, the runtime object type is known in advance at compile time so runtime dispatch is not required and inefficient.
  • method’s parameter types must be —> an exact copy-paste from parent to subclass. No subsumption allowed in Java [2]. C++ ARM P210 briefly explains why.
  • method is invoked not during ctor/dtor (c++). In contrast, Java/c# ctor can safely call virtual methods, while the base object is under construction and incomplete, and subclass object is uninitialized!
  • method must be –> virtual, so as to hit the vtbl. In Java, all non-static non-final methods are virtual.
  • in c++ the call or the function must NOT be scope-qualified like ptr2->B::virtF() — subversion. See P210 ARM
  • the 2 methods (to choose from) must be defined INSIDE 2 classes in a hierarchy. In contrast, a call to 2 overload methods accepting a B param vs a D param respectively will never be resolved at runtime — no such thing as “argument-based runtime binding”. Even if the argument is a D instance, its declared type (B) is always used to statically resolve the method call. This is the **least-understood** restriction among the restrictions. See http://bigblog.tanbin.com/2010/08/superclass-param-subclass-argument.html

If you miss any one condition, then without run/compile-time warnings compiler will __silently__ forgo runtime binding and assume you want compile-time binding. The c++11 “override” and java @Override help break the silence by generating compiler errors.

However, return type of the 2 functions can be slightly different (see post on covariant return type). P208 ARM says as of 1990 it was an error for the two to differ in return type only, but [[c++common knowledge]] P100 gives a clear illustration of clone() method i.e. virtual ctor. See also [[more eff C++]] P126. CRT was added in 1998.

[2] equals(SubclassOfObject) is overloading, not overriding. @Override disallowed — remember Kevin Hein’s quiz.

Here’s a somewhat unrelated subtle point. Suppose you have a B extended by C, and a B pointer/ref variable “o” seated at a C object, you won’t get runtime binding in these cases:

– if you have a non-static field f defined in both B/C, then o.f is compile-time binding, based on declared type. P40 [[java precisely]]
– if you have a static method m() defined in both B/C, then o.m() is compile-time binding, based on declared type. [1]
– if you have a nonref B variable receiving a C object, then slicing — u can’t access any C part.

[1] That’s well-known in java. In C++, You can also “call a static member function using the this pointer of a non-static member function.”

slicing/vptr/AOB — pbref between base^derived

(For vptr during slicing, see other posts)

Q: Any Slicing when func1(const B& b) receives a D argument?
A: no, since there’s just _one_ object (see posts on implicit cloning). But not sure afterwards.

Background — On the “real estate” of the Single-Inheritance D object, there is a B object, like a basement. B’s real estate is part of D’s real estate — these 2 objects share[3] the same “postal address”, making pointer casting possible.

[3] with multiple inheritance, Derived object and the embedded Base sub-object don’t necessarily share the same postal address of “this”

(Warning: AOB is inapplicable for multiple-inheritance. See other posts for details.) In the opening question, when you get a B ref to a D object, you get the address of the basement (AOB). Using this B reference, you can access B’s fields only. To access D’s fields, you must downcast a B ptr into a D ptr. AOB is the basis of most if not all slicing problems such as

* copy ctor qq( B b=d ) — In the implicit copier call, the implicit param “const B&” gets the AOB so can only access the B fields of the D object. After the copier call, the LHS becomes a sliced copy of the D.

* assignment qq( B b; b=d ) — The param “const B&” gets the AOB so can only access the B fields of the D object. LHS becomes a sliced copy of the D.

Remember the onion — slice off the outer layer.

is my dtor virtual if I don’t declare it virtual@@ #my take

Q: is my dtor virtual if I don’t declare it virtual?

A lot of implicit rules, but there’s a simple rule — see below

Here’s my long answer
– If you are a top-level class (i.e. not inheriting), and don’t declare a dtor —> non-virtual
+ if you declare it “virtual” —–> virtual
+ if you inherit and declare it “virtual” —–> virtual
= if you inherit but don’t declare a dtor, then synthesized dtor is ——> as virtual as your parent’s
= if you inherit and declare without “virtual”, then your dtor is still —–> as virtual as your parent’s. http://www.parashift.com/c++-faq-lite/virtual-functions.html#faq-20.7

Rule — once ancestor’s dtor is declared virtual, descendants have no way of losing the virtuality.

It’s illuminating to visualize the memory layout. Physically, a subclass real estate always encloses a base instance. Since the base real estate has one pointer-sized slot for a vptr (32 bits on a 32-bit platform), every single-inheritance descendant instance has exactly one such slot too, no more, no less. Java simply puts this footprint into Object.java, so every java object has 1 and only 1 vptr.

If you go through dtor virtuality down a family tree, you may see NV -> NV -> NV …-> V -> V -> V. Once you get a Virtual, you can only get more Virtuals.

As an analogy, a 3-generation warrior is made a king. All his descendants become royal.

As an analogy, a 3-generation farmer becomes a landlord in 1949. All children and grand children are considered by communists as landlords.

Reason? As described in http://www.parashift.com/c++-faq-lite/virtual-functions.html#faq-20.4, once a base class gets a (class-level) vtbl, its subclass always gets its own vtbl. I believe every descendant dtor is always on a vtbl.

into a java variable{name,value,address,type}.

In any scripting or compiled language, OO or otherwise, a variable is a trio of { name, value, address }. In java, we have to remember the type and actual object behind a variable. “address” and “object” are slightly different views of the same entity.

In java, it’s instructive to see a variable in terms of a pointer and an onion. Multiple remote controls can point to the same chunk of memory.

a pointer is a reference and a remote control, with
* a unique name
* a type defined in a type hierarchy. A type can be an interface type.
* supported services of the type. We mean instance methods.

an object is a /pointee/referent/ and an onion in memory with
* a unique address. There’s no address for the base object nested in an onion. Not possible to have a variable pointing to the base objects inside an onion.
* no name
* fields
* methods, possibly overridden or hidden.

Pointer Casting (up or down) affects the type, the fields and methods. When up-casting from subtype C to a basetype B,
– address and name remain
– instance/static fields may disappear, since they may be undefined in parent class B
– instance methods remain, even if they are overridden in subclass C. Polymorphic runtime binding via vptr
– static methods? Yes affected in a subtle way. see blog on [[ static binding ]]

field hiding by subclass, learning notes

(c# has the trickiest hiding rules; c++ has  more than java.)

First, remember the onion model described in other posts. To rephrase one more time, each instance field occupies memory “cells” in a layer of the onion. If class B has an instance field vf, and subclass C has a same-name instance field vf, then a C “onion” has storage for both fields.

myRef.vf resolves to one of the 2 fields at compile time depending on declared type of myRef. P40 [[java precisely]]

On P25 [[java precisely]], B has a static field sf, and so does C. The C onion in memory has NO storage for either field, but it has a pointer to the class object, which has storage for the field.

Q: does the C class object have storage for both static fields?
%%A: probably no. C class object just needs a pointer to B class object.

Question: “myRef.sf” refers to …? P40 [[java precisely]]
A: At compile time, if the C onion is declared as type B, then the field is B.sf
A: At compile time, if the C onion is declared as type C, then the field is C.sf

I feel the static and non-static cases are similar — both compile-time binding. This is the simple rule for field binding. For method call binding, static and non-static behave differently. Remember that runtime binding is very strict and requires non-static and non-field, among other conditions. Failing any condition means compile-time binding.

virtual inheritance diamond malformed

Both C1 and C2 should virtually derive from Base. You see the D class instantiation only calling Base ctor once. See http://www.parashift.com/c++-faq-lite/virtual-inheritance-where.html

Now, what if we omit one of the “virtual” keywords? Base ctor called twice — tested in GCC

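#include <iostream>
using namespace std;
struct B {
    B() {
        cout << "B\n";
    }
};
struct C1: virtual public B {
    C1() {
        cout << "entering C1()\n";
    }
};
struct C2: virtual public B { // drop "virtual" here to see B() run twice
    C2() {
        cout << "entering C2()\n";
    }
};
class D: public C1, C2 { // C2 is a private base here (class default)
};

int main() {
    D d; // with both "virtual" keywords, B() runs only once
}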

static^dynamic binding

why call it “static”?

First we must understand “binding” — system binding an overridden name (of a MEMBER) to one of several incarnations within the class hierarchy.

Field names are resolved at compile-time — “static” — when system is not alive and running.

In contrast, method call is resolved at runtime — dynamic binding.

— learnt in 2010 —
Only virtual methods need dynamic binding, ie overridable methods. Overloaded methods are chosen at compile time based on arg list. See posts on static binding.

vptr is a const non-static field – (slicing, casting)

When you “cast” a c++ or java POINTER[1] up and down family tree, the vptr field of the pointee persistently points to the “most derived” vtbl. To fully appreciate the constness of the vptr, consider slicing —

Q: during slicing, how does vptr end up pointing to the base vtbl?
A: copy constructor of the base object will be used, which (blindly) initializes vptr with the base object’s vtbl.

Background — vtbl is an array of func pointers. One vtbl per CLASS, but —

Q: Is there any compiler where superclass’s and subclass’s vtbl share the same address?
A: Won’t work. When slicing, the trim-down object’s vptr must NOT point to original vtbl. Trim-down object can never access derived class methods.
A: but what if a subclass overrides no method? Probably will share the vtbl with super class?

vptr is like a non-static const field. In the non-slicing scenario, The magic of polymorphism depends on the “vptr still pointing to subclass vtbl even if object is accessed by a base ptr”.

A twist on the constness — After this->vptr is initialized in the base ctor, it gets reseated during subclass ctor. During destruction, this->vptr gets reseated during Base dtor. This knowledge is relevant only if base ctor/dtor calls virtual methods.

[1] nonref won’t work.

basic question on reference counting + virtual^redefining

Given

class C : public B

Q: i have a variable holding a C object , how do i get a variable of static type B, connected to the same object?
A: simple assignment will do, but sliced!

Here’s the call to C constructor

C obj;

Using pointers — B* ptr = &obj; //upcasting by ptr
Using references — B& ref = obj; // upcasting by reference

Q: Can 2 non-ref variables refer to the same object in memory?
A: I don’t think so. You need the address. nonrefs don’t know how to use addresses.
A: I guess it’s similar to java primitives. No way to create 2 int variables connected to the same int object
A: I feel the slicing problem occurs when you copy objects, but what about when you have just one object but multiple pointers?

Now we are ready to differentiate virtual vs redefining. Building on our example, say B has a public method m(), redefined in C. What is ptr->m() or ref.m()?

* difference — virtual is C::m() ^ redefinition is B::m(),
* difference — virtual is runtime binding ^ redefinition is compile time binding

Now a note on java. Java has only overloading (compile time binding) vs overriding ie virtual. C++ offers virtual ^ overloading ^ redefining. Last 2 are compile time binding.

ptr + virtual -} C // B ptr to a C object, to call a virtual method m()
ref + virtual -} C
nonref + virtual -} B
ptr + redefine -} B
ref + redefine -} B

In summary, only ptr/ref + virtual is really virtual. Not virtual if you use a nonref OR if you drop “virtual” keyword.

Destructors behave just like methods —
ptr + virtual -} C
ptr + non-virtual -> B. See P104 [[NittyGritty]]

pure-pure virtual invoked during base ctor/dtor #bbg

Update: P273 [[eff c#]] compares c# vs c++ when a pure virtual is invoked by ctor (book skips dtor). It confirms c++ would crash, but my bloodshed c++ compiler detected pure virtual called in base ctor and failed compilation. See source code at bottom.

This is rather obscure , not typical. Not even a valid question.

struct B{
 virtual void cleanup() = 0;
 ~B(){
   cleanup();
  }// not virtual
};
struct D: public B{
 void cleanup(){}
};
int main(){ D derivedObject; } 
// derivedObject destructed. 
// If (see below why impossible) code were to compile, what would happen here?

%%A: the invisible ~D() runs, then the defined ~B() runs even though it’s non-virtual.
%%A: I think it’s undefined behavior ONLY with “delete”.
%%A: virtual method called during base object destruction/construction – Warning by Scott Meyers. At time of base class destruction only the pure virtual is available, so system crashes saying “pure virtual function called”.

(In java, superclass ctor calling a virtual function results in the subclass’s version invoked, before the subclass ctor runs! Compiles but dangerous. If you do this, make sure the subclass ctor is empty.)

Q: how can a compiler intercept that error condition, with 0 runtime cost?
A: see post on pure pure virtual

Note, if Derived had not implemented the pure virtual, then Derived would have become an abstract class and non-instantiable.

Actually compiler detects the call to cleanup() is a call to B::cleanup() which is abstract. Here’s a demo.

#include <iostream>
using namespace std;

struct B{
  virtual void cleanup() = 0;
  B();
  ~B();  // not virtual
};
//void B::cleanup(){       cout<<"B::cleanup\n";}
B::B(){
  cout<<"B()\n"; //this->cleanup(); //breaks compiler if enabled
}
B::~B(){
  cout<<"~B()\n"; //this->cleanup(); //breaks compiler if enabled
}
struct D: public B{
  void cleanup(){       cout<<"D::cleanup\n"; }
  ~D(){cout<<"~D()\n"; }
};
int main(){
    if (true){
       D derivedObject; // derivedObject destructed. Suppose this can compile, what will happen?
    }
    cin.get();
}

pure^concrete virtual functions #bbg

3 types of virtual methods
* concrete virtual — virtual methods without “=0”. Regular virtual methods.
* concrete pure virtual – with “=0”. pure but with a body. See http://www.gotw.ca/gotw/031.htm. Least common.
* pure-pure virtual  — with “=0”. pure and without a body.  “Pure virtual” usually means this.

You can think of the “=0” as a NULL address assigned to the func ptr. Remember compiler maps each function NAME to an address of the function’s body. At first, a pure/abstract function has no body YET.
[2] concrete pure virtual????

Diff: if a class has at least one pure virtual, then you can’t [2] instantiate it. Subclass must implement it to become non-abstract – This is the purpose for the “PURE”.

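As a mental model, treat every virtual method as a func-ptr slot in the class’s vtable, and require every slot to be non-null before the class can be instantiated. A class with a NULL slot is non-instantiable i.e. “abstract”. (Real compilers need not store a literal NULL. The Itanium C++ ABI used by gcc fills a pure slot with a stub, __cxa_pure_virtual, that aborts.)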

similar: Both PURE and concrete-virtual methods can have a “body” i.e. a method definition.
** Diff: But PURE declaration ends in “=0” so body must be somewhere else. Concrete virtual can have a body attached to declaration.

similar: the body is callable by subclasses
** Diff: but must be explicitly called in the PURE case. See http://www.gotw.ca/gotw/031.htm and also [[eff c++ 167]]
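A short sketch of the "concrete pure virtual" case, with hypothetical names Base, Derived and describe(). The declaration ends in "=0", the body sits outside the class, and the subclass reaches it only through an explicit qualified call.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Base {
    virtual ~Base() = default;
    virtual std::string describe() const = 0;  // pure, yet defined below
};
// The body has to live outside the class because the declaration ends in "=0".
std::string Base::describe() const { return "Base default"; }

struct Derived : Base {
    std::string describe() const override {
        return Base::describe() + " + Derived";  // explicit, qualified call
    }
};

void demoPureWithBody() {
    static_assert(std::is_abstract<Base>::value, "Base stays abstract");
    static_assert(!std::is_abstract<Derived>::value, "Derived is concrete");
    Derived d;
    const Base& b = d;
    assert(b.describe() == "Base default + Derived");
}
```

Having a body does not make Base instantiable; the "=0" alone decides that.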

Q (I don’t remember the exact question at Bloomberg IV. An obscure question): pretend to be a compiler writer. How can your compiler intercept a call to an unimplemented PURE virtual method (i.e. a pure-pure virtual), with 0 runtime cost? P273 [[eff c#]] compares c# vs c++ when a pure virtual is invoked by ctor. It confirms c++ would crash.
%%A: Synthesize a body for each undefined PURE virtual. In the body, we can clean up and print diagnostic then call abort().
%%A: check the vtable. If the function pointer is null, then the host class can’t be instantiated in the first place.

 

overloaded method call resolution — static binding

class Visitor{ // visitor pattern, overloading
void visit(Object o){…} //1
void visit(String s){…} //2
}

If you call
Object o = makeAnObjectOrString(); aVisitor.visit(o), which method runs? I felt it’s resolved at run time. Wrong.

Interviewer pointed out equals(Object o). A novice creates an equals(MyClass o){…}, overloading the inherited equals(Object o). When MyClass is used in an ArrayList, the new method will never be called, because ArrayList.java (see source code) always casts any MyClass instance to Object and calls equals(Object o).

[[Java Precisely]] P44 suggests (1) will be chosen, and always at compile time. At compile time, (2) is ruled out.
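C++ resolves overloads the same way, so the point can be sketched in C++ too. Visitor, Object and String mirror the Java snippet; the classes here are empty stand-ins, not the Java library types.

```cpp
#include <cassert>
#include <string>

struct Object { virtual ~Object() = default; };
struct String : Object {};

struct Visitor {
    std::string visit(const Object&) { return "visit(Object)"; }  // 1
    std::string visit(const String&) { return "visit(String)"; }  // 2
};

void demoOverloadBinding() {
    Visitor aVisitor;
    String s;
    const Object& o = s;  // declared type Object, actual object String

    assert(aVisitor.visit(o) == "visit(Object)");  // chosen by declared type at compile time
    assert(aVisitor.visit(s) == "visit(String)");  // (2) only when the declared type is String
}
```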


import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JavaPreciselyP44 {
  static JavaPreciselyP44 instance = new JavaPreciselyP44();
  public static void main(String[] args) {
    // method call won't compile since both methods are equally applicable
    // (neither signature is more specific than the other)
    instance.print(new ArrayList(), new HashSet());
  }
  void print(List l, HashSet s) {
    System.out.println('o');
  }
  void print(ArrayList l, Set s) {
    System.out.println('s');
  }
}

vtable and vptr in pseudo code

http://www.parashift.com/c++-faq-lite/virtual-functions.html#faq-20.4 is a one-pager with pseudo code. Here are some of my comments
tanbin – one v-table per class in the hierarchy, shared by all instances. Parent’s v-table, child’s v-table…
tanbin – one v-ptr per instance. Since a child instance wraps a parent, the entire “onion” has a single v-ptr
tanbin – the v-ptr is reseated once when each “onion” layer is added during construction. Each child constructor in the hierarchy can reseat the v-ptr to point to the child’s own v-table

————
Let’s work an example. Suppose class Base has 5 virtual functions: virt0() through virt4().

// Your original C++ source code

class Base {
public:
  virtual arbitrary_return_type virt0(…arbitrary params…);
  virtual arbitrary_return_type virt1(…arbitrary params…);

};

Step #1: the compiler builds a static table containing 5 function-pointers, burying that table into static memory somewhere. Many (not all) compilers define this table while compiling the .cpp that defines Base‘s first non-inline virtual function. We call that table the v-table; let’s pretend its technical name is Base::__vtable. If a function pointer fits into one machine word on the target hardware platform, Base::__vtable will end up consuming 5 hidden words of memory. Not 5 per instance, not 5 per function; just 5 for the class. It might look something like the following pseudo-code:

// Pseudo-code (not C++, not C) for a static table defined within file Base.cpp

// Pretend FunctionPtr is a generic pointer to a generic member function
// (Remember: this is pseudo-code, not C++ code)
FunctionPtr Base::__vtable[5] = {
&Base::virt0, &Base::virt1, &Base::virt2, &Base::virt3, &Base::virt4
};

Step #2: the compiler adds a hidden pointer (typically also a machine-word) to each object of class Base. This is called the v-pointer. Think of this hidden pointer as a hidden data member, as if the compiler rewrites your class to something like this:

// Your original C++ source code

class Base {
public:

  FunctionPtr* __vptr;  // supplied by the compiler, hidden from the programmer

};

Step #3: the compiler initializes this->__vptr within each constructor. The idea is to cause each object’s v-pointer to point at its class’s static v-table, as if it adds the following instruction in each constructor’s init-list:

Base::Base(…arbitrary params…)
  : __vptr(&Base::__vtable[0])  // supplied by the compiler, hidden from the programmer
{

}

Now let’s work out a derived class. Suppose your C++ code defines class Der that inherits from class Base. The compiler repeats steps #1 and #3 (but not #2). In step #1, the compiler creates a new hidden v-table for class Der, keeping the same function-pointers as in Base::__vtable but replacing those slots that correspond to overrides. For instance, if Der overrides virt0() through virt2() and inherits the others as-is, Der‘s v-table might look something like this (pretend Der doesn’t add any new virtuals):

// Pseudo-code (not C++, not C) for a static table defined within file Der.cpp

// Pretend FunctionPtr is a generic pointer to a generic member function
// (Remember: this is pseudo-code, not C++ code)
FunctionPtr Der::__vtable[5] = {
&Der::virt0, &Der::virt1, &Der::virt2, &Base::virt3, &Base::virt4
// &Base::virt3 and &Base::virt4 are inherited as-is
};

In step #3, the compiler adds a similar pointer-assignment at the beginning of each of Der‘s constructors. The idea is to reseat each Der object’s v-pointer so it points at Der class’s v-table. (This is not a second v-pointer; it’s the same v-pointer that was defined in the base class, Base; remember, the compiler does not repeat step #2 in class Der.)
Finally, let’s see how the compiler implements a call to a virtual function. Your code might look like this:

// Your original C++ code

void mycode(Base* p)
{
p->virt3();
}

The compiler has no idea whether this is going to call Base::virt3() or Der::virt3() or perhaps the virt3() method of another derived class that doesn’t even exist yet. It only knows for sure that you are calling virt3() which happens to be the function in slot #3 of the v-table. It rewrites that call into something like this:

// Pseudo-code that the compiler generates from your C++

void mycode(Base* p)
{
p->__vptr[3](p);
}

On typical hardware, the machine-code is two ‘load’s plus a call:

  1. The first load gets the v-pointer, storing it into a register, say r1.
  2. The second load gets the word at r1 + 3*4 (pretend function-pointers are 4-bytes long, so r1+12 is the pointer to the right class’s virt3() function). Pretend it puts that word into register r2 (or r1 for that matter).
  3. The third instruction calls the code at location r2.

Conclusions:

  • Objects of classes with virtual functions have only a small space-overhead compared to those that don’t have virtual functions.
  • Calling a virtual function is fast — almost as fast as calling a non-virtual function.
  • You don’t get any additional per-call overhead no matter how deep the inheritance gets. You could have 10 levels of inheritance, but there is no “chaining” — it’s always the same — fetch, fetch, call.
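The three steps can be imitated by hand in ordinary C++, which makes the "fetch, fetch, call" sequence visible. This is only a sketch of the idea with two slots; real compilers lay things out according to their ABI, and every name below (Obj, callSlot, the tables) is made up for illustration.

```cpp
#include <cassert>

struct Obj;                         // forward declaration
using FunctionPtr = int (*)(Obj*);  // each hand-rolled "virtual" takes the object pointer

struct Obj {
    const FunctionPtr* vptr;        // step #2: one hidden pointer per object
    int data;
};

// Implementations for a pretend Base and a pretend Der.
int baseVirt0(Obj*)      { return 0; }
int baseVirt1(Obj* self) { return self->data; }
int derVirt0(Obj*)       { return 100; }  // Der overrides slot 0 only

// Step #1: one static table per class, shared by all instances.
const FunctionPtr baseVtable[2] = { baseVirt0, baseVirt1 };
const FunctionPtr derVtable[2]  = { derVirt0,  baseVirt1 };  // slot 1 inherited as-is

// Step #3: each "constructor" seats the v-pointer; Der reseats it after Base ran.
void constructBase(Obj* o, int d) { o->vptr = baseVtable; o->data = d; }
void constructDer(Obj* o, int d)  { constructBase(o, d); o->vptr = derVtable; }

// What p->virtN() turns into: load vptr, load slot, call.
int callSlot(Obj* p, int slot) { return p->vptr[slot](p); }

void demoHandRolledVtable() {
    Obj b, d;
    constructBase(&b, 7);
    constructDer(&d, 9);
    assert(callSlot(&b, 0) == 0);
    assert(callSlot(&d, 0) == 100);  // override found through the table
    assert(callSlot(&d, 1) == 9);    // inherited slot still points at Base's code
}
```

Note that the caller never tests the object's type; the table lookup does all the work, which is why inheritance depth adds no per-call cost.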

up and down casting nonref/ref/ptr

Primary usage of dynamic_cast is down-cast
* base/derived class must have vptr, or you get compile-time error
* original and target types must be ptr/ref, or you get compile-time error [1]
* there’s just one object involved, not 2
** That one object must be a complete and full[2] Derived object, otherwise you get runtime (not compile time) failure, indicated by 1) null ptr or 2) exception (during reference down-cast)
*** boost polymorphic_cast adds uniformity by throwing exception for both

[1] static_cast can cast nonref.
[2] static_cast doesn’t incur the overhead to check that

Q: up-casting a nonref??
A: no casting operator required, but you get sliced. qq/ myB = myD; /
A: by the way, no such thing in java, since java doesn’t use “nonref”

Q: down-casting a nonref??
A: impossible in dynamic_cast. “dynamic_cast can be used only with pointers and references to objects”

Q: down-casting a ptr (to a polymorphic object only)?
A: dynamic_cast. May return NULL. java has a similar feature.
A: see also boost polymorphic_cast

Q: down-casting a ref (to a polymorphic object only)?
A: dynamic_cast. Never returns NULL; a failed reference down-cast throws std::bad_cast instead. See post on down-cast a reference

Q: down-cast a base ptr (or ref) to a derived object but no vtbl/vptr no virt func?
A: impossible. dynamic_cast won’t compile.

Q: up-cast a ptr?
A: common in c++ and java. everyday virtual function scenario. no operator required.

Q: up-cast a ref?
A: legit but less common than ptr. See post on virtual^redefining
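A small sketch of the two down-cast failure modes described above, with a polymorphic B and D (the virtual destructor is what makes dynamic_cast compile at all).

```cpp
#include <cassert>
#include <typeinfo>  // std::bad_cast

struct B { virtual ~B() = default; };  // polymorphic, so dynamic_cast compiles
struct D : B {};

void demoDownCast() {
    B plainB;
    D realD;
    B* pGood = &realD;
    B* pBad  = &plainB;

    assert(dynamic_cast<D*>(pGood) == &realD);  // complete D object: succeeds
    assert(dynamic_cast<D*>(pBad) == nullptr);  // ptr form: null on failure

    bool threw = false;
    try {
        D& r = dynamic_cast<D&>(*pBad);         // ref form has no null to return
        (void)r;
    } catch (const std::bad_cast&) {
        threw = true;
    }
    assert(threw);
}
```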

virtual op( ) overloading to replace setter method

template <typename Key, typename Value> // Key is a placeholder name; see Property<string, double> below
class Property{
Value val;

virtual void operator() (const Value& val);
};

* Note this is not a const method; it actually changes host object state
* it’s like a setter method, setting the this->val field
* but it uses op overloading instead of a setter method.

Q: If you have a template instance Volatility as Property<string, double>, and then create an object myVol, how do you set its value to 1.2 ?
A: using regular setter setValue(), you would call myVol.setValue(1.2), but using the ( ), it’s

myVol(1.2)

There’s also a “getter” method.
* note the const
* note the double parenthesis.

{…
virtual Value operator ( ) ( ) const; // note the double parenthesis.

}

The technique to overload q[ ( ) ] is a throwback to the constructor initializer syntax

ConstructorName() :
field1(val5),
field2(val0)
{}

Big difference now is what identifier you put in front of the paren – put the variable name rather than the field name.

Note overloading q[ ( ) ] is the basis of many advanced techniques including STL functors, custom function objects and probably Boost bind. This overloading is needed for function-like syntax.
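Here is one way the Property idea could look when filled in. It is a guess at the shape, not the original class: the Key parameter, the constructor and the key() accessor are assumptions added so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <string>

template <typename Key, typename Value>
class Property {
    Key   key_;     // assumed purpose of the first template argument
    Value val_{};
public:
    explicit Property(const Key& key) : key_(key) {}
    virtual ~Property() = default;
    const Key& key() const { return key_; }
    virtual void  operator()(const Value& v) { val_ = v; }  // "setter", non-const
    virtual Value operator()() const { return val_; }       // "getter", note the double parenthesis
};

using Volatility = Property<std::string, double>;

void demoProperty() {
    Volatility myVol("vol");   // hypothetical key
    myVol(1.2);                // set
    assert(myVol() == 1.2);    // get
    assert(myVol.key() == "vol");
}
```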

virtual — only 1 of the big 3 please

In practice, people seldom virtualize these by mistake, but i feel it’s good to know why.

ctor (incl copier) is never virtual – the object is incomplete during construction.

assignment can be (uselessly) marked virtual but won’t work as virtual. Virtual is strict on params. All overrides must specify identical params as the “root” version [1].

There’s always a (default or custom) op= with a param “const HostClass&”, so this func signature is as unique as a fingerprint — Will never match the root version.

Among the big 3, only dtor should be virtual.

[1] both c++ and java. [[annotated ref manual]] explains why.
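A sketch of why a virtual op= does not behave polymorphically. The log string is an illustration device. D's own operator= takes "const D&", so it hides B's version instead of overriding it.

```cpp
#include <cassert>
#include <string>

std::string lastAssign;  // illustration only

struct B {
    virtual ~B() = default;
    virtual B& operator=(const B&) {
        lastAssign = "B::operator=(const B&)";
        return *this;
    }
};
struct D : B {
    // Different parameter type, so this hides B's operator= rather than overriding it.
    D& operator=(const D&) {
        lastAssign = "D::operator=(const D&)";
        return *this;
    }
};

void demoVirtualAssign() {
    D d1, d2;
    B& left = d1;
    const B& right = d2;

    left = right;  // virtual call, but D supplies no operator=(const B&)
    assert(lastAssign == "B::operator=(const B&)");

    d1 = d2;       // declared type D picks D's own operator=
    assert(lastAssign == "D::operator=(const D&)");
}
```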

dynamic_cast, typeid(*ptr) and vtbl

We won’t be too far off if we imagine that all RTTI operations work their magic by following the vptr. Remember each polymorphic class has a distinct vtbl. If there are 521 polymorphic classes in memory, then there are exactly that many v-tables in memory. Each polymorphic object holds a pointer-sized vptr (32-bit on a 32-bit build, 64-bit on a 64-bit build) seated at the vtbl of the “real” class. Recall that during subclass instantiation, the vptr first gets seated at the parent vtbl, then reseated at the subclass vtbl.

An interviewer pointed out — if a pair of derived/base classes have no virtual function, then you can’t dynamic_cast between them even using pointers and references.

RTTI is inapplicable to types without vtbl, because such types are always fixed and known at compile time. For example D extends B, without any virtual method. We create a D object but give its address to a B pointer. What’s the typeid of the _unwrapped_ pointer? It’s a B. See post on AddressOfBasement.

See https://github.com/tiger40490/repo1/blob/cpp1/cpp/88miscLang/typeid_WithoutVptr.cpp
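The same experiment in a few lines (this is a fresh sketch, not the code in the linked file; all four class names are made up). Without a virtual function, typeid reports the declared type; with one, it reports the real type.

```cpp
#include <cassert>
#include <typeinfo>

struct PlainB {};                 // no virtual function, hence no vptr
struct PlainD : PlainB {};

struct PolyB { virtual ~PolyB() = default; };
struct PolyD : PolyB {};

void demoTypeid() {
    PlainD plain;
    PlainB* pPlain = &plain;
    assert(typeid(*pPlain) == typeid(PlainB));  // static type, fixed at compile time

    PolyD poly;
    PolyB* pPoly = &poly;
    assert(typeid(*pPoly) == typeid(PolyD));    // dynamic type, found at run time
}
```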

most specific signature wins#Onion,covariant,subsume

(My most advanced java OO post to date)

Q: LoginAction.java execute(HttpRequest req, HttpResponse res, ActionMapping m, supertypeOfActionForm f) allowed?
A: allowed but after lots of research still not sure if this is overriding or overloading.

http://ocw.mit.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-170Fall-2005/915D0C58-87BE-46B4-893A-42CFCA08A717/0/lec14.pdf

    2. Each method in subtype that corresponds to one in the supertype:
    requires less (has a weaker precondition)
    • there are no more “requires” clauses, and each one is no more strict than the one in the supertype method.
    • 2.1) the argument types may be supertypes of the ones in the supertype. This is called contravariance, and it feels somewhat backward, because the arguments to the subtype method are supertypes of the arguments to the supertype method. However, it makes sense, because any arguments passed to the supertype method are sure to be legal arguments to the subtype method.

    guarantees more (has a stronger postcondition)
    • there are no more exceptions %% a kind of guarantee
    • the result type may be a subtype of that of the supertype. This is called covariance: the return type of the subtype method is a subtype of the return type of the supertype method.

    Example: If A is a subtype of B, then the following would be a legal overriding: %% overloading
    Bicycle B.f(Bicycle arg); %% abstract -> compile error
    RacingBicycle A.f(Vehicle arg);
    Method B.f takes a Bicycle as its argument, but A.f can accept any Vehicle (which includes all Bicycles). Method B.f returns a Bicycle as its result, but A.f returns a RacingBicycle (which is itself a Bicycle).

In this example, subtype A satisfies IS-A, even though we question overloading vs overriding. A.f() supports the expected service. I verified this set-up in NextGen. Compiled OK. However, P 44 and P6 [[ java precisely ]] introduced the concept of subsumption and “most specific signature”. Based on my understanding, the baseclass execute() is more specific => takes precedence at compile-time method binding

The Onion rules of runtime binding apply when the overriding and overridden signatures are equally specific ie the declared argument types are 100% identical.

http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Java has a concise statement: “Parameter types have to be exactly the same (invariant) for method overriding, otherwise the method is overloaded with a parallel definition instead.”

Notice the title says “most specific signature” instead of “signatureS”? There’s always a single “most specific signature”, possibly implemented by multiple ancestors/descendants on the inheritance tree. Look at hashCode(), equals() and toString().

(java) abstract AND …. method #unfillable dental cavity

I think java “abstract” keyword evolved from c++ pure virtual…

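abstract AND synchronized @@ No. The JLS lists synchronized (like private, static and final) among the modifiers that cannot be combined with abstract. I think [[ java precisely ]] covers this.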

abstract AND private @@ No. http://java.sun.com/docs/books/jls/second_edition/html/classes.doc.html#11244

abstract AND final @@ no. can’t be implemented

— abstract AND static @@ —
No. “It is a compile-time error for a static method to be declared abstract.” http://java.sun.com/docs/books/jls/second_edition/html/classes.doc.html#11246 It all boils down to runtime^compile-time binding. First, you need to really understand that static calls are resolved at compile time — based on the declared type.

In general, given any static method s() in any baseclass B redefined in any subclass C, then C.s() can never step in to help B.s(). This holds whether B.s() is abstract or concrete.

Therefore, an abstract static method is an unfillable dental /cavity/, an /unrecoverable liability/. Such a construct makes no sense.

(subClassObj(superClassObj))-> onion

Best 1-word glue to tie up constructor chain-reaction, object inheritance, overriding, dynamic binding…. is onion

To drive home the concept, think of any java object as an onion. If C extends B extending A extending Object, then a C obj wraps a B obj, enclosing an A obj, containing an Object obj. There’s always an Object.java instance at the core, inside the innermost onion layer.

In the constructor call stack, C() calls B(), which calls A(), which calls Object(). The callstack looks exactly like an onion. First statement in ANY constructor is always an (implicit or explicit) call to super(). No exception.

Dynamic binding explained by Mr. Onion: if an ancestor method m() is overridden (must be non-static) by every descendant, then a call to m() binds (only at runtime, never compile-time) to the outermost onion layer, ie the most specific implementation of m(). This is regardless of what you put before “. m()”: type_a_var.m() and type_b_var.m() both bind to the m() implementation in C.java. See p 45 [[ Java precisely ]]

lopsided subsumption in java overriding and polymorphism

Say class C.java extends class B.java and wants to override the inherited method m(Number).
* Overrider’s return type must be a subtype of parent’s. Any “client” of B must not be surprised to get C’s return values.
* How about args? You might think child’s parameter could be a super-type of Number, such as Object, but no no no. Overrider’s parameters must be exactly identical to parent’s parameters.
==> Bottom line: subsumption applies to return type, not parameters.
Remember Kevin’s quiz question in Feb 2010? To get polymorphism , meet all its strict criteria. Always use @Override to verify.
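C++ is lopsided in the same way, so here is a C++ sketch of the rule (the post itself is about java). B, C and m() follow the text; clone() and the int/long parameter types are picked here only for illustration. Marking C::m(long) with "override" would be rejected by the compiler, which plays the role of @Override.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct B {
    virtual ~B() = default;
    virtual B* clone() const { return new B(*this); }
    virtual std::string m(int) const { return "B::m(int)"; }
};
struct C : B {
    C* clone() const override { return new C(*this); }  // covariant return type: a real override
    std::string m(long) const { return "C::m(long)"; }  // different parameter: hides, no override
};

void demoLopsided() {
    C c;
    const B& ref = c;

    std::unique_ptr<B> copy(ref.clone());
    assert(dynamic_cast<C*>(copy.get()) != nullptr);  // clone() dispatched to C

    assert(ref.m(1) == "B::m(int)");                  // m() was never overridden
}
```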

runtime-binding ] fttp parser

Note: Strategy pattern is not yet adopted here but more flexible than polymorphism.

* case: a default setAid() in super to be overridden by subclass to implement a /specialized/ formula
* case: getVpi() returns empty string in the base class (OLTUplinkPort) where there’s no “vpi” slot. In the ATM* subclasses, getVpi() returns the vpi slot.
* case: abstract isValid() declared in interface Validity.java.
* case: isSSCFIAccessible() is calculated differently for each building block class

“dynamic binding” is also known as dynamic dispatch. Compiler doesn’t know which version of the method to “bind the name to”. Only at runtime can the jvm put its finger on the most specialized subclass implementation.

In a circuit we have a list of CircuitElements objects, sitting on various levels of the inheritance hierarchy. We invoke isValid() on each. Some will run the isValid() inherited from a parent class, some will run the grand-parent’s implementation, but jvm ensures the most *specific* implementation in each case is chosen to run.
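The circuit walk can be sketched in C++ as well (the parser itself is java). CircuitElement comes from the text; Port, AtmPort, Cable and the validator() method are hypothetical stand-ins that simply report whose implementation ran.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct CircuitElement {
    virtual ~CircuitElement() = default;
    virtual std::string validator() const { return "CircuitElement"; }
};
struct Port : CircuitElement {
    std::string validator() const override { return "Port"; }
};
struct AtmPort : Port {};          // inherits Port's version
struct Cable : CircuitElement {};  // inherits the root version

void demoCircuit() {
    std::vector<std::unique_ptr<CircuitElement>> circuit;
    circuit.emplace_back(new Port);
    circuit.emplace_back(new AtmPort);
    circuit.emplace_back(new Cable);

    std::string log;
    for (const auto& e : circuit) log += e->validator() + " ";

    // Each element runs the most specific implementation available to it.
    assert(log == "Port Port CircuitElement ");
}
```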

onion&&remote-control — static binding

Runtime binding is covered in a few posts with onion and remote-control. Now let’s apply our analogies to static method binding. P 45 [[ java precisely ]] has a good example 59.

C2 c2 = new C2(); // C2 extends C1 and hides a static method C1.m1() with C2.m1()
C1 c1 = c2;

Now c1 and c2 are 2 remote controls programmed for a single onion, the “biggest onion” [1]. However,
* variable c1 is a remote control of type C1 and supports C1.java’s static m1() only
* variable c2 supports C2.java’s static m1() only.

Now the question:
Q: how are the various calls to m1() resolved?
A: C1.m1() and C2.m1() are obvious.
A: c1.m1() binds to C1.java static m1(). The onion, which happens to be a C2 onion, doesn’t affect the resolution.
A: c2.m1() binds to C2.m1()

[1] Therefore instance methods c1.m1(int) and c2.m1(int) {c1 c2 both lower-case} are both bound to C2.java m1(int). Runtime binding of instance method m1(int).
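C++ lets you call a static member through an object too, so the two remote controls can be sketched in C++ (the book example is java; the return strings are just labels).

```cpp
#include <cassert>
#include <string>

struct C1 {
    virtual ~C1() = default;
    static std::string m1() { return "C1.m1()"; }
    virtual std::string m1(int) const { return "C1.m1(int)"; }
};
struct C2 : C1 {
    static std::string m1() { return "C2.m1()"; }  // hides C1::m1()
    std::string m1(int) const override { return "C2.m1(int)"; }
};

void demoStaticBinding() {
    C2 c2;
    C1& c1 = c2;  // two remote controls, one C2 onion

    assert(c1.m1() == "C1.m1()");      // static: declared type of c1 decides
    assert(c2.m1() == "C2.m1()");
    assert(c1.m1(0) == "C2.m1(int)");  // instance method: runtime binding
    assert(c2.m1(0) == "C2.m1(int)");
}
```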

Strategy pattern keywords

#1 keyword:
family — of algo

interchangeable

Beating polymorphism — Changing/choosing algo among a family of “sister methods” normally uses polymorphism. Strategy is an alternative to polymorphism.

sister methods — interchangeable

same signature — interchangeable

setter — algo-setter

compose — the system with algo objects

HAS-A — strategy pattern

IS-A — polymorphism

——-
It’s often a good idea to add fields (besides methods) into algo objects to make them more like objects. Any one of these fields will help.
– description of the algo
– name of the algo, a bit more readable than the classname.
– category
– a list of client object types ie “who can use this algo”
– platform supported
– version, date, author
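A minimal C++ sketch of these keywords, assuming a hypothetical host class Pricer. The algo object carries a name field next to the callable, the host HAS-A algo, and the algo-setter swaps family members at runtime without subclassing the host.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// An algo object: the callable plus a descriptive field.
struct Algo {
    std::string name;
    std::function<double(double)> apply;  // same signature across the family
};

class Pricer {  // hypothetical host class; HAS-A algo
    Algo algo_;
public:
    explicit Pricer(Algo a) : algo_(std::move(a)) {}
    void setAlgo(Algo a) { algo_ = std::move(a); }  // algo-setter
    double run(double x) const { return algo_.apply(x); }
    const std::string& algoName() const { return algo_.name; }
};

void demoStrategy() {
    Algo doubler{"doubler", [](double x) { return 2 * x; }};
    Algo squarer{"squarer", [](double x) { return x * x; }};

    Pricer p(doubler);
    assert(p.run(3.0) == 6.0);

    p.setAlgo(squarer);  // interchangeable sister algo
    assert(p.run(3.0) == 9.0);
    assert(p.algoName() == "squarer");
}
```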