3 modes: writing @(hash/ordered) map

  • insert_Or_Update — (most common map “Writing”): java uses put(); c++ uses operator[].
  • update_iFF_Existing — c++ needs if-block with count(); java uses replace() which returns null if nothing replaced
  • put_IFF_Absent — java uses putIfAbsent(); c++ uses insert()
  • .. Note java computeIfAbsent() can be more efficient than putIfAbsent(). The caller of putIfAbsent() must compute the new “value (often an allocated object)” up front, even when the key already has an incumbent and the value turns out to be unneeded. computeIfAbsent() invokes its mapping function only when the key is absent.
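The c++ side of the three modes can be sketched as below. This is a minimal sketch on std::map; the alias Book and the three function names are made up for illustration, and find() is used in place of count() to avoid a second lookup.

```cpp
#include <cassert>
#include <map>
#include <string>

using Book = std::map<std::string, int>;

// Mode 1: insert-or-update, like java put(). operator[] default-constructs
// a missing value first, then the assignment overwrites it.
void insertOrUpdate(Book& m, const std::string& k, int v) { m[k] = v; }

// Mode 2: update only if the key exists, like java replace().
// Returns true if an incumbent was found and overwritten.
bool updateIfExisting(Book& m, const std::string& k, int v) {
    auto it = m.find(k);  // one lookup, instead of count() followed by operator[]
    if (it == m.end()) return false;
    it->second = v;
    return true;
}

// Mode 3: put only if absent, like java putIfAbsent().
// insert() leaves an incumbent untouched; .second reports whether we inserted.
bool putIfAbsent(Book& m, const std::string& k, int v) {
    return m.insert(Book::value_type(k, v)).second;
}
```

The same three functions should work unchanged if Book is switched to std::unordered_map.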

private-Virtual functions: java^c++

Q: in c++ and java, is private virtual function useful?
A: both languages allow this kind of code to compile. C++ experts use it for some advanced purposes, but in java any private method is effectively non-virtual, so any subclass method is unrelated to the baseclass private method.

— C++ is more nuanced
The trigger of this blogpost is P68 [[c++ coding standard]] by two top experts Sutter and Alexandrescu, but I find this “coding standard” unconvincing.

Private virtual functions seem to be valuable in some philosophical sense but I don’t see any practical use.
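The use most often cited for private virtuals is the Non-Virtual Interface idiom, which is probably what the book has in mind. A minimal sketch, where the class names Report and TradeReport are hypothetical:

```cpp
#include <cassert>
#include <string>

class Report {
public:
    virtual ~Report() = default;
    // Public and non-virtual: the base class fixes the overall sequence.
    std::string render() const { return "[" + body() + "]"; }
private:
    // Private and virtual: subclasses may override it but cannot call it.
    virtual std::string body() const { return "base"; }
};

class TradeReport : public Report {
private:
    std::string body() const override { return "trade"; }
};

// Callers only ever see the public non-virtual entry point.
std::string renderViaBase(const Report& r) { return r.render(); }
```

Here renderViaBase() on a TradeReport returns "[trade]". The override is reached through the vtable even though neither the subclass nor outside code can call body() directly. Whether this buys anything in practice is a separate question.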

— java
See also hiding^overriding in java

Q: beside hiding (static methods), overriding and overloading, does java support another mechanism where subclass can redefine a (non-static) method?

  • in GTD this is very low value.
  • in terms of zbs and expertise this actually reveals something fundamental, esp. between java and c++
  • .. also highlights some important but lesser-known details of java inheritance
  • in terms of IV, this can be a small halo whenever we talk about overriding^overloading

A: Code below is neither overriding nor overloading, but it does compile and run, so yes this is another mechanism. I will call it “xxxx” or “redefinition”. The xxxx method is unrelated to the baseclass private method so the compiler has no confusion. (In contrast, with overriding and overloading, the compiler or the runtime follows documented rules to pick an implementation.)

Note if super and subclasses both use “public”, we get overriding. Therefore, “xxxx” requires the superclass method to be private. The subclass method can have any access level; it is public in the code below.

Code below is based on https://stackoverflow.com/questions/19461867/private-method-in-inheritance-in-java

// A.java (each public class needs its own file)
public class A {
    private void say(int number){
        System.out.print("A:"+number);
    }
}
// D.java
public class D extends A{
    // a public method xxxx/redefining a baseclass private method
    public void say(int number){
        System.out.print("Over:"+number);
    }
}
// Tester.java
public class Tester {
    public static void main(String[] args) {
        A a=new D();
        //a.say(12); // compilation error: say(int) has private access in A
        ((D)a).say(12); //works, prints Over:12
    }
}

op=(): java cleaner than c++ #TowerResearch

A Tower Research interviewer asked me to elaborate why I claimed java is a very clean language compared to c++ (and c#). I said “clean” means consistency and fewer special rules, such that regular programmers can reason about the language standard.

I also said python is another clean language, but it’s not a compiled language so I won’t compare it to java.

See c++complexity≅30% mt java

— I gave interviewer the example of q[=]. In java, this is either content update at a given address (for primitive data types) or pointer reseat (for reference types). No ifs or buts.

In c++ q[=] can invoke the copy ctor, move ctor, copy assignment, move assignment, cvctor (conversion ctor) or OOC (conversion operator).

  • for a reference variable, its meaning at the site of initialization (binding) differs from its meaning at the site of update (assigning through to the referent).
  • LHS can be an unwrapped (i.e. dereferenced) pointer… there are additional subtleties.
  • You can even put a function call on the LHS
  • cvctor vs OOC when LHS and RHS types differ
  • member-wise assignment and copying, with implications on STL containers
  • whenever a composite object has a pointer field, the q[=] implementations could be complicated. STL containers are examples.
  • exception safety in the non-trivial operations
  • implicit synthesis of move functions .. many rules
  • when RHS is a rvalue object, then LHS can only be a ref to const, a rvalue ref or a nonref
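To make the first few items concrete, here is a small sketch that records which special member function a given q[=] ends up calling. The Tracer class and the helper functions are hypothetical, written only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Tracer {
    std::string last;  // records which special member ran most recently
    Tracer() : last("default ctor") {}
    Tracer(int) : last("conversion ctor") {}
    Tracer(const Tracer&) : last("copy ctor") {}
    Tracer(Tracer&&) noexcept : last("move ctor") {}
    Tracer& operator=(const Tracer&) { last = "copy assignment"; return *this; }
    Tracer& operator=(Tracer&&) noexcept { last = "move assignment"; return *this; }
};

// The same token "=" means initialization in the first three helpers...
std::string initFromLvalue() { Tracer a; Tracer b = a; return b.last; }
std::string initFromMoved()  { Tracer a; Tracer b = std::move(a); return b.last; }
// Temporary is elided here (guaranteed since c++17), so only the cvctor runs.
std::string initFromInt()    { Tracer b = 5; return b.last; }
// ...and assignment in the last two.
std::string assignLvalue()   { Tracer a, b; b = a; return b.last; }
std::string assignMoved()    { Tracer a, b; b = std::move(a); return b.last; }
```

Five visually similar statements resolve to five different functions, which is the kind of special-casing java avoids.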

std::thread key points

For a thread to actually become eligible, a Java thread needs start(), but c++ std::thread becomes eligible immediately after initialization i.e. after it is initialized with its target function.

For this reason, [[effModernC++]] dictates that between an int field and a std::thread field in a given class Runner, the std::thread field should be the last initialized in constructor. The int field needs to be already initialized if it is needed in the new thread.

Q1: Can you initialize the std::thread field in the constructor body?
A: yes, unless the std::thread field is declared as a const field

Now let’s say there’s no const field.

Q2: can the Runner copy ctor initialize the std::thread field in the ctor body, via move()?
A: yes provided the ctor parameter is a non-const reference to Runner.
A: no if the parameter is a const reference to Runner. move() applied to the std::thread field of theConstRunner would evaluate to a const rvalue reference (const std::thread&&), which cannot bind to the std::thread&& parameter of the move ctor or move op=. The copy ctor and copy op= are deleted, because std::thread is move-only.

See https://github.com/tiger40490/repo1/tree/cpp1/cpp/sys_thr for my experiments.
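The points above can be sketched in a few lines. This is not the code in the repo linked above; the Runner below is a hypothetical minimal version with one int field and one std::thread field, and the static_asserts restate the Q2 answers as compile-time checks.

```cpp
#include <cassert>
#include <thread>
#include <type_traits>
#include <utility>

// std::thread is move-only.
static_assert(!std::is_copy_constructible<std::thread>::value, "no copy ctor");
static_assert(std::is_move_constructible<std::thread>::value, "move ctor exists");

// std::move on a const object yields a const rvalue reference...
static_assert(std::is_same<decltype(std::move(std::declval<const std::thread&>())),
                           const std::thread&&>::value, "const rvalue reference");
// ...which the move assignment cannot accept.
static_assert(!std::is_assignable<std::thread&, const std::thread&&>::value,
              "cannot move-assign from a const thread");

class Runner {
public:
    explicit Runner(int start) : counter(start), worker([this] { ++counter; }) {}
    ~Runner() { if (worker.joinable()) worker.join(); }
    // Joins first, so reading counter does not race with the worker.
    int result() { if (worker.joinable()) worker.join(); return counter; }
private:
    int counter;         // declared first, so initialized first
    std::thread worker;  // declared last: eligible to run as soon as it is constructed
};
```

With this layout, Runner r(41); r.result() should give 42, because counter is already initialized when the thread starts. Members are initialized in declaration order regardless of the order in the initializer list, so the field order in the class is what matters.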

OO-modeling: c++too many choices

  • container of polymorphic Animals (having vtbl);
  • Nested containers; singletons;
  • class inheriting from multiple supertypes ..

In these and other OO-modeling decisions, there are many variations of “common practices” in c++ but in java/c# the best practice usually boils down to one or two choices.

No-choice is a Very Good Thing, as proven in practice. Fewer mistakes…

These managed languages rely on a single big hammer and make everything look like a nail….

This is another example of “too many variations” in c++.

debugger stepping into library

I often need my debugger to step into library source code.

Easy in java:

c++ is harder. I need to find more details.

  • in EclipseCDT, STL source code is available to IDE ( probably because class templates are usually in the form of header files), and debugger is able to step through it, but not so well.

Overall, I feel debugger support is significantly better in VM-based languages than c++, even though debugger was invented before these new languages.

I guess the VM or the “interpreter” can serve as an “interceptor” between debugger and target application. The interceptor can receive debugger commands and suspend execution of the target application.

q[throw]: always(?)explicit in c++

in java, a Throwable object can come from some invisible code, perhaps in the jvm implementation code.

In c++ a catch clause supposedly “always” catches something explicitly thrown by a c++ code module, “never” something from a C library. Now I think the “always” and “never” are simply wrong, because there are real life examples to prove otherwise. P 115 [[c++coding standards]] written by top experts also said operating systems can wrap low level errors in exceptions.

However, https://stackoverflow.com/questions/28925878/do-c-standard-library-functions-which-are-included-in-c-throw-exception shows some corner case of undefined behavior in a strcpy(). In such a case, the c++ compiler (for this C function) can do anything such as throwing exception.

Note C functions like strcpy() don’t throw because C doesn’t support exceptions, but a c++ compiler is permitted to generate throwing code, and does so in some cases.

 

jGC heap: 2 unrelated advantages over malloc

Advantage 1: faster allocation, as explained in other blogposts

Advantage 2: programmer can "carelessly" create a "local" Object in any method1, pass (by reference) the object into other methods and happily forget about freeing the memory.

In this extremely common set-up, the reference itself is a stack variable in method1, but the heapy thingy is "owned" by the GC.

In contrast, c/c++ requires some "owner" to free the heap memory, otherwise memory would leak. There’s also the risk of double-free. Therefore, we absolutely need clearly documented ownership.

[11]concurrent^serial allocation in JVM^c++

–adapted from online article (Covalent)

Problem: Multithreaded apps create new objects at the same time. During object creation, memory is locked. On a multi CPU machine (threads run concurrently) there can be contention

Solution: Allow each thread to have a private piece of the EDEN space. Thread Local Allocation Buffer
-XX:+UseTLAB
-XX:TLABSize=
-XX:+ResizeTLAB

You can also analyse TLAB usage with -XX:+PrintTLAB

Low-latency c++ apps use a similar technique. http://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919 gives insight into lock contention in malloc()

indirection due to jGC: runtime cost

Stroustrup is not the first to tell me that java objects are always accessed via a pointer as the Garbage collector may relocate the actual object.

At runtime, this indirection has a non-zero cost. In contrast, C/C++ app (without GC) would access the pointee directly.

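I guess a GC language would need some lookup table (a handle table). HotSpot, however, stores direct pointers in references and has the collector rewrite them whenever it moves an object, so the extra hop is not inherent to garbage collection.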

heap allocation: java Can beat c++

  • case 1 (standard java): you allocate heap memory. After you finish with it you wait for the java GC to clean it up.
  • case 2 (low latency java): you allocate heap memory but disable java GC. Either you hold on to all your objects, or you leave unreachable garbage orbiting the earth forever.
  • case 3 (c++): you allocate heap memory with the expectation of releasing it, so the allocator sets up housekeeping in advance for the anticipated delete(). This housekeeping overhead is somewhat similar to try/catch before c++11 ‘noexcept’.

Stroustrup suggested that #2 will be faster than #3, but #3 is faster than #1. I said “But c++ can emulate the allocation as jvm does?” Stroustrup said C++ is not designed for that. I think he meant impractical/invalid. I have seen online posts about this “emulation” but I would trust Stroustrup more.

  • case 4 (C): C/c++ can sometimes use local variables to beat heap allocation. C programmers use rather few heap allocations, in my experience.

Note jvm or malloc are all userland allocators, not part of kernel and usually not using system calls. You can substitute your own malloc.

https://stackoverflow.com/questions/18268151/java-collections-faster-than-c-containers top answer by Kanze is consistent with what Stroustrup told me.

  • zero dynamic allocation (Similar to Case 4) is always faster than even the fastest dynamic allocation.
  • jvm allocation (without the GC clean-up) can be 10 times faster than c++ allocation. Similar to Case 2^3
    • Q: Is there a free list in JVM allocator? Yes

https://softwareengineering.stackexchange.com/questions/208656/java-heap-allocation-faster-than-c claims

  • c++ Custom allocators managing a pool of fixed-sized objects can beat jvm
  • jvm allocation often requires little more than one pointer addition, which is certainly faster than typical C heap allocation algorithms in malloc
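The “one pointer addition” claim is easier to judge with a toy example. Below is a sketch of a bump-pointer arena in c++. It is not how any particular JVM or malloc is implemented; the class name BumpArena and its interface are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical arena: hands out memory by advancing an offset, never frees
// individual blocks, and releases everything at once in reset().
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buf(capacity), top(0) {}

    // align must be a power of two and is applied relative to the buffer start.
    // Returns nullptr when the arena is full.
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (top + align - 1) & ~(align - 1);  // round up
        if (start > buf.size() || size > buf.size() - start) return nullptr;
        top = start + size;  // the "one pointer addition"
        return buf.data() + start;
    }
    void reset() { top = 0; }  // wholesale release, loosely like emptying Eden
    std::size_t used() const { return top; }
private:
    std::vector<unsigned char> buf;
    std::size_t top;
};
```

There is no free list and no per-block header, which is why allocation is so cheap and also why individual blocks cannot be returned. A general-purpose malloc has to support free() on any block, so it cannot be this simple.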

vtable also contains.. #class file

C++ is more complex than java. A typical vtable in c++ contains

  • offset of base type subobject. In multiple inheritance, this offset is often non-zero. This offset is needed not only for field access but also up-casting
  • typeid for RTTI

These details are part of the compiler ABI, since object files from older and newer compilers (of the same brand) could link together iFF they agree on these details.

Best-known part of ABI is name-mangling algorithm. This vtable detail would be the 2nd best-known ABI feature.

I believe the class file in java is one file per class. Therefore, vtable is something like the equivalent of a java class file.

 

shared vars across files: prefer static field

When I have state to maintain and share across compilation units, there are basically three main types of variables I can create. (Non-static Local variables are simple and don’t maintain state.)

  1. nonstatic field of a singleton class — Note any user of this variable needs a reference to the single object 😦
  2. file scope var in a *.cpp  — relatively simple usage, but don’t put in a shared header, as explained in global^file-scope variables]c++
  3. public static fields — most versatile design. Any user can access it after they #include the class header file.

— non-contenders

  • local static variables — (niche usage) You can create a local static var in myfunc(). To share the variable across compilation units, myfunc() can return a reference to this object, so from anywhere you can use the return value of myfunc(). This is a simple form of singleton.
  • global variables — monster. Probably involves “extern”. See my other blogposts

The advantages of a static field are often applicable to static methods too.

In fact, java leaves you with nothing but this choice, because this choice is versatile. Java has no “local static”, no file-scope, no global variables.

duplicate-prevention by set^map: java and c++

The contrast of set vs map is important as most of us need to use both constructs.

( The cross-reference of java vs c++ (or c#) is only a curiosity. )

— java map.put() Replaces (and returns previous mapped value)
— java set.add() Rejects (and returns false. True would mean added-without-conflict )

Not sure how important this is, but if you ask “which incumbent object blocked my insert?” or “which incumbent is equivalent to me?” then I guess the java Set collection is not the best choice.

I guess we can use a map where each key maps to itself as the value.  Since both sides are pointers, footprint isn’t doubled.

— c++ set.insert() also rejects as java does. Obviously Multiset won’t.
— c++ map.insert() is consistent with set. The word “insert” implies size increase.
— c++ map operator[] Replaces, similar to java, but more complicated under the surface.
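On the c++ side, the “which incumbent blocked my insert?” question has a direct answer, because insert() returns an iterator to the incumbent along with the bool. A small sketch, where Acct, ById and the two helper functions are hypothetical:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

struct Acct {
    int id;
    std::string owner;
};
// Ordering (and therefore equivalence) looks at id only.
struct ById {
    bool operator()(const Acct& a, const Acct& b) const { return a.id < b.id; }
};

// Returns the owner of whichever Acct ends up in the set for newcomer's id:
// the newcomer itself if it was inserted, otherwise the incumbent that blocked it.
std::string ownerAfterInsert(std::set<Acct, ById>& s, const Acct& newcomer) {
    auto res = s.insert(newcomer);  // pair<iterator, bool>
    return res.first->owner;        // iterator points at the incumbent on rejection
}

// map::insert rejects like set::insert; only operator[] plus assignment replaces.
int insertThenRead(std::map<int, int>& m, int k, int v) {
    m.insert(std::make_pair(k, v));
    return m[k];
}
```

Inserting a second Acct with the same id leaves the first one in place and hands back its owner.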

Q:are java primitive+reference on heap or stack #escape

An old question but my answers are not really old 🙂

In Java, a so-called “referent” is a non-primitive thingy with a unique address on heap, accessed via heap pointers.

In java, a referent is always an Object, and an Object is always on heap therefore always a referent.

(Java language defines “reference types” in terms of primitive types, so we need a clear understanding of primitive types first.)

In java, a primitive thingy is either part of a (heapy) Object or a local thingy on stack

(In C++ lingo, object can be a new int(88)…)

A reference is, at run-time really a heap pointer. Assuming 32-bit machine, the pointer itself occupies 4 bytes and must be allocated somewhere. If the reference is local to a method, like a parameter or a local variable, then the 4 bytes are on stack like a 32-bit primitive local variable. If it’s part of an object then it’s on heap just like a 32-bit primitive field.

— advanced topic: escape

Escape analysis is enabled by default. EA can avoid constructing an Object on heap, by using the individual fields as local variables.

— advanced topic: arrays

Arrays are special and rarely quizzed. My unverified hypothesis:

  • at run-time an array of 3 ints is allocated like an Object with 3 int-fields
  • at run-time an array of 3 Dogs is allocated like an Object with 3 Dog-reference fields, each pointing to a Dog on heap (or null). This resembles std::vector<shared_ptr<Dog>>
  • Q: how about std::vector<Dog>?
  • %%A: I don’t think java supports it.
  • The array itself is an Object

checked STL^checked java Collections

jdk checkedList, checkedMap etc are designed to check type errors — checking any newly added item has the correct type for the collection. See P 246 [[java generics]]

STL checked container checks very different coding errors. See http://www.informit.com/articles/article.aspx?p=373341, which is extracted from my book [[c++codingStd]]

Q: which linux c++thread is stuck #CSY

This is a typical “c++ecosystem question”. It’s not about c++ or C; it’s about linux instrumentation tools.

Q1: Given a multi-threaded server, you see some telltale signs that process is stuck and you suspect only one of the threads is stuck while the other threads are fine. How do you verify?

Q2: What if it’s a production environment?
A: I guess all my solutions should be usable on production, since the entire machine is non-functioning. We can’t make it any worse. If the machine is still doing useful work, then we should probably wait till end of day to investigate.

–Method: thread dump? Not popular for c++ processes. I have reason to believe it’s a JVM feature, since java threads are always jvm constructs, usually based on operating system threads [1]. JVM has full visibility into all threads and provides comprehensive instrumentation interface.

https://www.thoughtspot.com/codex/threadstacks-library-inspect-stacktraces-live-c-processes shows a custom c++ thread dumper but you need custom hooks in your c++ source code.

[1] Note “kernel-thread” has an unrelated meaning in the linux context

–Method: gdb

thread apply all bt – prints a stack trace of every thread, allowing you to somewhat easily find the stuck one

I think in gdb you can release each thread one by one and suspend only one suspect thread, allowing the good threads to continue

–Method: /proc — the dynamic pseudo file system

For each process, a lot of information is available in /proc/12345 . Information on each thread is available in /proc/12345/task/67890 where 67890 is the kernel thread ID. This is where ps, top and other tools get thread information.

 

contains(): Set^SortedSet

–in java # See P247/32 [[java generics]]

  • Set<Acct> contains() uses Acct.equals()
  • SortedSet<Acct> contains() uses Comparable<Acct> or a custom comparator class, and ignores Acct.equals()

–In STL

The tree-based containers use “equivalence” to determine containment, basically same as the java  comparator.

The hash-based containers use a hash function + an equality predicate. The implementation details are slightly complicated since both the hash function and the predicate function are template params. Upon template instantiation, the two concrete types become “member types” of the host class. If host class is unordered_set<string>, then we get two concrete member types:

unordered_set<string>::hasher and unordered_set<string>::key_equal

These member types can be implemented as typedefs or nested classes. See ARM.
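Both flavours of containment can be seen side by side in a case-insensitive string set. This is only a sketch; the functor names (CaseInsensitiveLess and so on) and the lower() helper are invented for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <type_traits>
#include <unordered_set>

std::string lower(std::string s) {
    for (char& c : s) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Comparator for the tree-based set: a and b are "equivalent" when neither
// orders before the other. operator== on std::string is never consulted.
struct CaseInsensitiveLess {
    bool operator()(const std::string& a, const std::string& b) const { return lower(a) < lower(b); }
};

// Hash function and equality predicate for the hash-based set. They must agree:
// keys that compare equal must hash to the same value.
struct CaseInsensitiveHash {
    std::size_t operator()(const std::string& s) const { return std::hash<std::string>()(lower(s)); }
};
struct CaseInsensitiveEq {
    bool operator()(const std::string& a, const std::string& b) const { return lower(a) == lower(b); }
};

using TreeSet = std::set<std::string, CaseInsensitiveLess>;
using HashSet = std::unordered_set<std::string, CaseInsensitiveHash, CaseInsensitiveEq>;

// The template arguments resurface as member types of the instantiated class.
static_assert(std::is_same<HashSet::hasher, CaseInsensitiveHash>::value, "hasher");
static_assert(std::is_same<HashSet::key_equal, CaseInsensitiveEq>::value, "key_equal");
```

After inserting "Alice" into the TreeSet, count("ALICE") finds it, purely on the strength of the comparator. The HashSet behaves the same way through its hasher and key_equal.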

try{}must be completed by ..#j^c++^c#

— Java before 7 and c#
try{} should be completed by at least a catch or finally. Lone wolf try{} block won’t compile. See https://www.c-sharpcorner.com/UploadFile/skumaar_mca/exception-handling-in-C-Sharp/

In particular, try/finally without catch is a standard idiom.

— java 7:
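try{} should be completed by a catch, a finally, both or none .. four configurations 🙂 The “none” configuration compiles only for try-with-resources; a plain try{} with neither catch nor finally is still rejected.

The try/finally configuration now has an important special case i.e. try-with-resources, where the finally is implicit so you won’t see it anywhere.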

— c++ as of 2019
C++ has no finally.

try{} must be followed by catch.
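Since there is no finally, the c++ way to get “always run this on the way out” is a destructor. A minimal sketch, where LogOnExit, riskyWork() and run() are hypothetical names and the log vector exists only to make the ordering visible:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical guard: its destructor plays the role of a finally block.
class LogOnExit {
public:
    explicit LogOnExit(std::vector<std::string>& log) : log_(log) {}
    ~LogOnExit() { log_.push_back("cleanup"); }
    LogOnExit(const LogOnExit&) = delete;
    LogOnExit& operator=(const LogOnExit&) = delete;
private:
    std::vector<std::string>& log_;
};

void riskyWork(std::vector<std::string>& log, bool fail) {
    LogOnExit guard(log);  // runs on normal return and during stack unwinding
    log.push_back("work");
    if (fail) throw std::runtime_error("boom");
}

std::vector<std::string> run(bool fail) {
    std::vector<std::string> log;
    try {
        riskyWork(log, fail);
    } catch (const std::runtime_error& e) {  // a try block needs at least one handler
        log.push_back(std::string("caught ") + e.what());
    }
    return log;
}
```

In the failing case the cleanup entry is logged before the handler's entry, because unwinding destroys the guard before control reaches the catch clause. Real cleanup code in a destructor should not throw; the push_back here is only for demonstration.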

error stack trace: j^c++

Without stack trace, silent crashes are practically intractable.

In my java and python /career/ (I think c# as well), exceptions always generate a console stack trace. The only time it didn’t happen was a Barclays JNI library written in c++.

In c++, getting the stack trace is harder.

  • when I used the ETSFlowAsert() construct I get a barely usable stack trace, with function names, without line numbers
  • [[safeC++]] described a technique to generate a simple stack trace with some line numbers but I have never tried it
  • the standard assert() macro doesn’t generate stack trace
  • In RTS and mvea, memory access errors lead to seg-fault and core dump. In these contexts we are lucky because the runtime environment (host OS, standard library, seg-fault signal handler ..) cooperate to dump some raw data into the core file, but it’s not as reliable as the JVM runtime … Here are some of the obstacles:
    • core files may be suppressed
    • To actually make out this information, we need gdb + a debug build of the binary with debug symbols.
    • it can take half an hour to load the debug symbols
    • My blogpost %%first core dump gdb session describes how much trouble and time is involved to see the function names in the stack trace.

 

c++dtor^python finalizer^java finalizer

This blog has enough posts on c# finalizers. See also AutoCloseable^Closeable #java

python finalizer is a special instance method object.__del__(self), invoked when the reference count drops to zero and the object is about to be garbage-collected. Since the timing is hard to predict (e.g. with reference cycles), it’s not useful for resource management, which is better done with a context manager, a popular python idiom and the best-known “protocol” in python.

— Java finalizer is an important QQ topic

https://stackoverflow.com/questions/2506488/when-is-the-finalize-method-called-in-java has some highly voted summaries.

The finalize() method can be called at any time after the object has become eligible for garbage collection, possibly never.

The finalize() method should only be written for cleanup of (usually non-Java) resources like closing files. [[effJava]] says avoid it.

https://www.infoq.com/articles/Finalize-Exiting-Java compares

  1. c++RAII
  2. java finalize()
  3. java 7 try-with-resources

how could jvm surpass c++latency

A Shanghai Morgan Stanley interviewer asked in a 2017 java interview — “How could jvm surpass c++ latency?”

— One reason — JIT compiler could aggressively compile bytecode into machine code with speedy shortcuts for the “normal” code path + special code path to handle the abnormal conditions.

JIT to avoid vtable latency #Martin is a prime example.

Priming is tricky in practice — https://www.theserverside.com/tip/Avoid-JVM-de-optimization-Get-your-Java-apps-runnings-fast-right-fromt-the-start highlights pitfalls of priming in the trading context. Some take-aways:
1. Optimizing two paths rather than just one path
2. Reusing successful optimization patterns from one day to the next, using historical data

— One hypothesis — no free() or delete() in java, so the memory manager doesn’t need to handle reclaiming and reusing the memory. [[optimizedC++]] P333 confirmed the c++ mem mgr does that. See [1]

The author of https://stackoverflow.com/questions/1984856/java-runtime-performance-vs-native-c-c-code is not a published expert but he says:
On average, a garbage collector is far faster than manual memory management, for many reasons:
• on a managed heap, dynamic allocations can be done much faster than the classic heap
• shared ownership can be handled with negligible amortized cost, where in a native language you’d have to use reference counting which is awfully expensive
• in some (possibly rare and contrived) cases, object destruction is vastly simplified as well (Most Java objects can be reclaimed just by GC’ing the memory block. In C++ destructors must always be executed)

— One hypothesis — new() is faster in jvm than c++. See [1]

Someone said “Object instantiation is indeed extremely fast. Because of the way that new objects are allocated sequentially in memory, it often requires little more than one pointer addition, which is certainly faster than typical C++ heap allocation algorithms.”

[1] my blogpost java allocation Can beat c++


Julia, Go and Lua often beat C in benchmark tests .. https://julialang.org/benchmarks/

http://www.javaworld.com/article/2076593/performance-tests-show-java-as-fast-as-c--.html is a 1998 study.

In my GS-PWM days, a colleague circulated a publication claiming java could match C in performance, but didn’t say “surpass”.

LDLibPath ^ classpath

See also compile-time ^ run-time linking

Q: when I run the a.out, where does the runtime linker search for those *.so files?

Now I know the a.out file remembers/stores the dependency *.so file locations. Suppose a.out remembers pspc.so was loaded from ~/a/b/c. At runtime, the linker will try ~/a/b/c/pspc.so  but likely fail… completely normal.

https://docs.oracle.com/cd/E19683-01/816-1386/chapter3-10898/index.html clearly explains that *.so file locations can be saved in a.out as a ‘runpath’. The 2nd-best solution is the LD_LIBRARY_PATH env var.

https://superuser.com/questions/192573/how-do-you-specify-the-location-of-libraries-to-a-binary-linux led me to the -R and -rpath linker-options.

https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_3.html describes q[ -rpath ] linker option

A c++ executable like a.out using dynamic libs is comparable to a java main class using a bunch of jar files.

container{string}: j^c++

In java, any container (of string or int or anything) holds pointers only.

I think c# collections (i.e. containers) contain pointers if T is a reference type.

In cpp,

  • container of int always contains nonref, unlike java
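  • container of container holds nonref inner containers, but each inner container is only a small handle pointing to its heap payload, so the effect is similar to java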
  • but container of string is widely used, and invariably contains nonref std::string !

Q: is there any justification for container<(smart) ptr to string>? I found rather few online discussions.
A: See boost::ptr_container

Q: what if the strings are very large?
A: some older std::string implementations used COW to speed up copying + assignment. However, the c++11 rules effectively rule out COW, and the string copy ctor is then O(lengthOfString). So in a standard-compliant implementation copy and assignment would be expensive, and I believe we should consider container<(smart) ptr to string>

 

[15] template type constraint : j^c++

java and c# generics (their version of templates) can have constraints. If the generic code uses T->length() then the constraint says T must subtype a certain interface containing a length() method. C++ handles it differently.

(http://stackoverflow.com/questions/874298/c-templates-that-accept-only-certain-types presents other solutions like boost static_assert…)

http://stackoverflow.com/questions/122316/template-constraints-c points out

You can call any functions you want upon a template-typed value, and the only instantiations that will be accepted are those for which that method is defined. For example:

#include <string>

template <typename T>
int compute_length(T *value)
{
    return static_cast<int>(value->length());
}
We can call this function on a pointer to any type which declares a length() method whose result converts to an int. Thus:
std::string s = "test";
std::wstring ws = L"test"; // the quoted answer used a vector here, but std::vector has size(), not length()
int i = 0;

compute_length(&s);
compute_length(&ws);

//...but not on a pointer to a type which does not declare length():
compute_length(&i); //This third example will not compile.

This works because C++ compiles a new version of the template function (or class) for each instantiation. As it performs that compilation, it makes a direct, almost macro-like substitution of the template instantiation into the code prior to type-checking. If everything still works with that template, then compilation proceeds and we eventually arrive at a result. If anything fails (like int* not declaring length()), then we get the dreaded six page template compile-time error.

covariant return type: c++98→java

java “override” rule permits covariant return: an overriding method may return a type D that’s a subtype of B, where B is the original return type of the overridden method.

— ditto c++

https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Covariant_Return_Types. Most common usage is a “clone” method that

  • returns ptr to Derived, in the Derived class
  • returns ptr to Base in the Base class

Covariant return types work with multiple inheritance and with protected and private inheritance — these simply affect the access levels of the relevant functions.

I was wrong to say virtual mechanism requires exact match on return type.

CRT was added in c++98. ARM P211 (c 1990) explains why CRT was considered problematic in the Multiple Inheritance context.
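The clone() usage mentioned above looks roughly like this in c++. A minimal sketch with made-up class names and helper functions:

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual Base* clone() const { return new Base(*this); }
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    // Return type differs from Base::clone() yet this still overrides it,
    // because Derived* is covariant with Base*.
    Derived* clone() const override { return new Derived(*this); }
    std::string name() const override { return "Derived"; }
    int extra = 7;
};

// Through a Base reference the virtual call still lands in Derived::clone().
std::unique_ptr<Base> cloneViaBase(const Base& b) { return std::unique_ptr<Base>(b.clone()); }

// With a static type of Derived the caller gets a Derived* and needs no cast.
int extraOfClone(const Derived& d) {
    std::unique_ptr<Derived> copy(d.clone());
    return copy->extra;
}
```

Covariance applies to raw pointers and references only. A clone() returning std::unique_ptr<Derived> would not override one returning std::unique_ptr<Base>, which is why the raw pointer is wrapped at the call site here.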

subclass ctor/dtor using virtual func

See https://stackoverflow.com/questions/13440375/invoking-virtual-method-in-constructor-difference-between-java-and-c

Suppose both superclass and subclass define a virtual cleanup() method.

— c++

… you lose the virtual i.e. “dynamic dispatch” feature. During the base ctor the subclass part is not yet constructed, and during the base dtor it is already destroyed, so only the base class implementation of cleanup() could run.

–java: let’s focus on ctor

… the subclass implementation of cleanup() would run, even though the subclass instance is not initialized — dangerous! See P70 [[elements of java style]]
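The c++ half of this contrast is easy to demonstrate. In the sketch below (Super and Sub are hypothetical names), the ctor records which cleanup() it reached:

```cpp
#include <cassert>
#include <string>

struct Super {
    std::string seenInCtor;
    Super() { seenInCtor = cleanup(); }  // resolves to Super::cleanup()
    virtual ~Super() = default;
    virtual std::string cleanup() { return "Super::cleanup"; }
    std::string callLater() { return cleanup(); }  // ordinary virtual dispatch
};

struct Sub : Super {
    std::string cleanup() override { return "Sub::cleanup"; }
};
```

For a Sub object, seenInCtor holds "Super::cleanup" while callLater() returns "Sub::cleanup". The equivalent java code would reach the subclass method in both places.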

i-cache caution: manual inlining

Suppose hot function f2() calls f1(). Should f1() be inlined?

Martin Thompson suggested we developers can effectively “take over” inlining by copying a short function f1()’s body into a caller function f2(), if both functions are hot.

This way, we don’t rely on c++ compiler or JIT compiler to decide what to inline. Note JIT compiler can make that decision dynamically based on runtime heuristics.

However, if the expanded f2()’s footprint becomes too big to fit into i-cache, then this practice can backfire.

i-cache: avoid handling corner cases]hot functions #JIT inlining

“Functions often contain a significant amount of code that deals with corner cases, that is, cases which are rarely executed. This means that a large number of instructions read into cache are rarely executed.”https://www.eetimes.com/document.asp?doc_id=1275470

Martin Thompson agreed. He hinted that JIT compilers could (or have been designed to) notice the “cold” corner cases and dynamically refactor the code path so the corner case is handled at end of a hot loop.

Martin also said inlining has to be done judiciously, to avoid adding corner cases (cold stuff) into hot functions. In c++, inlining decisions are made at compile time but JIT can make the same decisions at run-time.

Personally, I guess this JIT technique is academic or experimental, probably hit-n-miss so the performance gain/penalty is hard to predict.

JIT has opportunities to avoid vtable latency #Martin

P76 [[javaPerf]] described a nifty JIT technique to avoid runtime cost of the dynamic binding of virtual function equals(). Suppose in some class, we call obj1.equals(obj2).

After a priming (i.e. warm-up) period, JIT collects enough statistics to see that every dynamic dispatch at this site is calling String.equals(), so JIT decides to turn it into faster “static binding” so the String.equals() function address is hardwired into the assembly code (not JVM bytecode). JIT also needs to handle the possibility of Character.equals(). I guess the assembly code can detect that obj1/obj2 is not a String.java instance and retry the virtual function lookup. JIT can generate assembly code to
1. verify obj1 is a String and call an inlined String.equals()
2. if obj1 is not a String, then use obj1 vtable to look up the virtual function obj1.equals()

It may turn out that 99.9% of the time we can skip the time-consuming Step 2 :)

Martin gave a (hypothetical?) example. Suppose JIT notices that obj1 is always either a String or Character. JIT could inline both equals() functions and completely bypass vtable. (Branching can be done via if/else or …) This inline compilation is done after a fairly long period of instrumentation. I asked Martin why c++ can’t do it. He said c++ only uses static compilation. I used to feel c++ compilers don’t bother with this technique, but they can do a build-time version of it: GCC, for one, has -fdevirtualize-speculatively, which guards a direct call with a type check. What a static compiler cannot do is re-profile and re-compile while the program runs.
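The two-step guard can be written out by hand in c++ to show the shape of the code a JIT would generate. This is only an illustration of the idea, not what any JIT actually emits; Obj, Str, guardedEquals() and the slowPathHits counter are all invented for the sketch.

```cpp
#include <cassert>
#include <typeinfo>

struct Obj {
    virtual ~Obj() = default;
    virtual bool equals(const Obj& other) const { return this == &other; }
};

struct Str final : Obj {
    int value;
    explicit Str(int v) : value(v) {}
    bool equals(const Obj& other) const override {
        const Str* s = dynamic_cast<const Str*>(&other);
        return s != nullptr && s->value == value;
    }
};

int slowPathHits = 0;  // counts how often the guard fails

bool guardedEquals(const Obj& a, const Obj& b) {
    if (typeid(a) == typeid(Str)) {
        // Step 1: guard passed, so call Str::equals directly. The qualified
        // call bypasses the vtable and is a candidate for inlining.
        return static_cast<const Str&>(a).Str::equals(b);
    }
    // Step 2: unexpected receiver type, fall back to normal virtual dispatch.
    ++slowPathHits;
    return a.equals(b);
}
```

In this hand-written form the typeid comparison is itself not free, so there may be no net gain. A JIT can use a cheaper check on the object header and, more importantly, only installs the guard after its statistics show it will almost always pass.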

this->myVector as instance field

Sound bite — if you want your class to have a vector field, then a nonref field is the simplest design, unlike the java convention (below).

I have occasionally seen this->myVector as a nonstatic data member. I think this is normal and should not raise any eyebrows. [[effC++]] P62 has a simple example.

I also used std::map and other containers as fields in my classes, like PSPCDemux.

Java programmers would have a pointer field to a vector constructed on heap, but memory management is simpler with the nonref field. In terms of memory layout, PSPCDemux::myvector has some small footprint [1] embedded in the PSPCDemux object, and the actual container payload has to be allocated on heap, to support container expansion.

[1] Java is different as that “small footprint” shrinks to a single pointer.

These fields don’t need special handling in PSPCDemux ctors. By default an empty container would be allocated “onsite”. PSPCDemux dtor would automatically call the container dtor, which would free the heap memory.

If you adopt the java convention, then your dtor needs to explicitly delete the heap pointer. This is tricky. What if the dtor throws an exception before deleting? What if the ctor throws an exception after calling new?
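A minimal sketch of the nonref design. Demux is a hypothetical stand-in for PSPCDemux, and only the field layout matters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for PSPCDemux.
class Demux {
    // Nonref field: the small footprint lives inside each Demux object,
    // while the payload is allocated on heap by the vector itself.
    std::vector<std::string> myVector;
public:
    // No ctor or dtor written. The compiler-generated ones construct an
    // empty vector and later run ~vector(), which frees the heap payload.
    void add(const std::string& s) { myVector.push_back(s); }
    std::size_t size() const { return myVector.size(); }
};
```

Copying a Demux also deep-copies the vector for free, something the pointer-field design would need a hand-written copy ctor for.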

given arbitrary value X,get nearest2nodes in std::map

Basically, find the left/right neighbor nodes.

! Don’t use upper_bound since lower_bound is enough.

  • If perfect match, then lower_bound return value is all you need. No need for 2 nodes:)
  • If no perfect match, then lower_bound() gives the right neighbor and prev(lower_bound) gives the left neighbor
  • if X too low, then lower_bound() returns begin(), and begin() alone is all we can get
  • if X too high, then lower_bound() returns end(), and prev(end()) alone is all we can get
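The four cases can be sketched as below. Curve and nearest2 are hypothetical names, and the function assumes a non-empty map. When only one node is available, it is returned as both left and right.

```cpp
#include <iterator>
#include <map>
#include <utility>

typedef std::map<double, double> Curve;
typedef Curve::const_iterator It;

// Returns {left, right} neighbors of x. Precondition: m is not empty.
std::pair<It, It> nearest2(const Curve& m, double x) {
    It lb = m.lower_bound(x);                  // first node with key >= x
    if (lb == m.end()) {                       // x too high
        It last = std::prev(m.end());
        return std::make_pair(last, last);
    }
    if (lb->first == x || lb == m.begin())     // perfect match, or x too low
        return std::make_pair(lb, lb);
    return std::make_pair(std::prev(lb), lb);  // x strictly between two keys
}
```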

See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/curveInterpolation_CVA.cpp

NavigableMap.java interface offers four methods .. see closestMatch in sorted-collection: j^python^c++

sizeof(‘a’) in c^c++^java #CSY

  • ‘a’ is taken as an int object in C, so sizeof(‘a’) equals sizeof(int), typically 4 !!
  • c++ takes it as a char object, so sizeof(‘a’) is 1.

See https://stackoverflow.com/questions/2172943/size-of-character-a-in-c-c

Java char is 2-byte to support 16-bit unicode! In Java, only the “byte” type is 1-byte, by definition.

c++ traditionally doesn’t have a “byte” type, because “char” is the nearest equivalent of the java “byte” type. (c++17 added std::byte, but it is not an arithmetic type.)
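A tiny compile-time check of the c++ side. literalSize is a hypothetical helper name. The same expression compiled as C would give sizeof(int) instead.

```cpp
#include <cstddef>

// In c++ a character literal has type char, and sizeof(char) is 1 by definition.
static_assert(sizeof('a') == 1, "character literal is a char in c++");
static_assert(sizeof('a') == sizeof(char), "same type, same size");

std::size_t literalSize() { return sizeof('a'); }
```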

c++condVar 2 usages #timedWait

poll()as timer]real time C : industrial-strength #RTS is somewhat similar.

http://www.stroustrup.com/C++11FAQ.html#std-condition singles out two distinct usages:

1) notification
2) timed wait — often forgotten

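https://en.cppreference.com/w/cpp/thread/condition_variable/wait_for shows std::condition_variable::wait_for() takes a std::chrono::duration parameter, which can express nanosec precision (std::chrono::nanoseconds). The actual wake-up precision depends on the OS scheduler.

Note java wait() also has an overload with nanosec precision, wait(long timeout, int nanos).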

std::condition_variable::wait_until() can be useful too, featured in my proposal RTS pbflow msg+time files #wait_until
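Both usages can be sketched in one small class. Gate, waitFor and open are hypothetical names, and the predicate overload of wait_for() is used so that spurious wake-ups are handled.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

class Gate {
    std::mutex mtx_;
    std::condition_variable cv_;
    bool ready_ = false;   // protected by mtx_
public:
    // Usage 2 (timed wait): returns true if the gate was opened before the
    // timeout, false if the timeout expired first.
    bool waitFor(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mtx_);
        return cv_.wait_for(lock, timeout, [this] { return ready_; });
    }

    // Usage 1 (notification): typically called from another thread.
    void open() {
        {
            std::lock_guard<std::mutex> guard(mtx_);
            ready_ = true;
        }
        cv_.notify_all();
    }
};
```

If nobody ever calls open(), waitFor() degenerates into a plain timer, which is the often-forgotten usage.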

5 c++ IV topics seldom asked on java

Blogging again.. Comments welcome.

1) big-6 components of any class, namely constructor, destructor, copy-constructor, move-constructor, assignment operator, move-assignment operator — Java/c# has only one namely constructor
** Destructor should never throw … interviewers often ask why
** All of the others can throw, so how to manage?
** non-virtual destructor

2) references (r-value or l-value) vs pointers (and double pointers) — no such low level constructs in java/c#
** pointer can point to heap or stack. In java all "pointers" are about heap

3) memory management including operator-new, malloc, double free, placement-new,
memory leak prevention, dangling pointer

4) socket API — is a C not c++ api

5) template meta programming

Some minor topics:
) smart pointers — lots of tricky questions.

) virtual functions, vptr, pure virtual,
) linux system library functions like fork(), malloc(), free(), write()
) const-correctness

) multiple inheritance, virtual inheritance

std::sort (!!quicksort) requires random-access

list::sort() works on list, a non-random-access container. It is stable and typically implemented as a merge sort, not quicksort. See https://stackoverflow.com/questions/1717773/which-sorting-algorithm-is-used-by-stls-listsort.

There are many online resources about quicksort without random access.

In contrast, java Collections.sort() does NOT require RandomAccess.. RandomAccess #ArrayList sorting
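A short c++ sketch of the contrast. sortedVector and sortedList are hypothetical helper names.

```cpp
#include <algorithm>
#include <list>
#include <vector>

std::vector<int> sortedVector(std::vector<int> v) {
    std::sort(v.begin(), v.end());   // requires random-access iterators
    return v;
}

std::list<int> sortedList(std::list<int> l) {
    // std::sort(l.begin(), l.end()) would fail to compile here, because
    // list iterators are only bidirectional.
    l.sort();                        // member function, stable
    return l;
}
```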

c++template^java generics #%%take

A 2017 Wells Fargo interviewer asked me this question. There are many many differences. Here I list my top picks. I feel c# is more like java.

  1. (1st word indicates the category winner)
  2. C++ TMP is quite an advanced art and very powerful. Java generics is useful mostly on collections and doesn’t offer equivalents to most of the TMP techniques.
  3. java List<Student> and List<Trade> share a single classfile, with uniform implementation of all the methods. In c++ each instantiation is compiled separately into distinct object code. Most of the code is duplicated leading to code bloat, but it also supports specialization and other features.
  4. java generics supports extends/super. C# is even “richer”. I think c++ can achieve the same with some of the TMP tricks
  5. c++ supports template specialization
  6. C++ wins — java doesn’t allow primitive type arguments and requires inefficient boxing. C# improved on it. This is more serious than it looks because most c++ templates use primitive type arguments.
  7. c++ supports non-dummy-type template param, so you can put in a literal argument like “3”. A floating-point literal like “1.3” is accepted only since c++20.
  8. c++ actual type argument is available at runtime. Java erases it, but I can’t give a concrete example illustrating the effect.
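A few of the c++-only items above (primitive type argument, non-type param, specialization) in one sketch. FixedBuf and Describe are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Non-type template parameter: N is a compile-time integer, not a type.
template <typename T, std::size_t N>
struct FixedBuf {
    T data[N];
    static constexpr std::size_t capacity() { return N; }
};

// Primitive type argument with no boxing: just 16 raw chars.
static_assert(sizeof(FixedBuf<char, 16>) == 16, "no per-element overhead");

// Primary template plus a full specialization for int.
template <typename T>
struct Describe {
    static std::string name() { return "generic"; }
};

template <>
struct Describe<int> {
    static std::string name() { return "int"; }
};
```

FixedBuf<int, 8> and FixedBuf<int, 9> are two distinct types, each with its own generated code, which is the flip side of the code bloat mentioned above.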

 

declare variables ] loop header: c^j #IF/for/while

Small trick to show off in your coding test…

Background — In short code snippet, I want to minimize variable declarations. The loop control variable declaration is something I always want to avoid.

https://stackoverflow.com/questions/38766891/is-it-possible-to-declare-a-variable-within-a-java-while-conditional shows java WHILE-loop header allows assignment:

List<Object> processables;
while ((processables = retrieveProcessableItems(..)).size() > 0) { /* ... */ }

But only (I’m 99% sure) c++ WHILE-loop header allows variable declaration.

The solution — both java/c++ FOR-loop headers allow variable declaration. Note the condition is checked Before first iteration, in both for/while loops.

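update: c++ allows variable declaration in the IF-block header (the condition), designed to limit the variable scope. This predates c++0x; c++17 further adds the if-with-initializer form if (init; condition).

if (int a=string().size()+3) cout<<a << " = a \n"; // shows 3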
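A sketch of the FOR-header and WHILE-header declarations in c++. sumTokens and sumDown are hypothetical names, and sumDown assumes start is not negative.

```cpp
#include <sstream>
#include <string>

// Sums whitespace-separated integers found in text.
int sumTokens(const std::string& text) {
    std::istringstream in(text);
    int total = 0;
    for (int n; in >> n; )        // FOR header declares n
        total += n;
    return total;
}

// Adds start + (start-1) + ... + 1. Precondition: start >= 0.
int sumDown(int start) {
    int total = 0;
    while (int left = start--)    // WHILE header declares left; loop ends when left is 0
        total += left;
    return total;
}
```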

j4 factory: java^c++ #Wells

A Wells Fargo interviewer asked

Q6: motivation of factory pattern?
Q6b: why prevent others calling your ctor?

  • %%A: some objects are expensive to construct (DbConnection) and I need tight control.
  • %%A: similarly, after construction, I often have some initialization logic in my factory, but I may not be allowed to modify the ctor, or our design doesn’t favor doing such things in the ctor. I make my factory users’ life easier if they don’t call new() directly.
  • AA: more importantly, the post-construction function could be virtual! This is a sanctioned justification for c++ factory on the authoritative [[c++codingStd]] P88
  • %%A: I want to (save the new instance and) throw exception in a post-construction routine, but I don’t want to throw from ctor
  • %%A: I want to centralize the non-trivial business rule of selecting target types to return. Code reuse rather than duplication
  • %%A: a caching factory
  • %%A: there are too many input parameters to my ctor and I want to provide users a simplified façade
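The virtual post-construction point can be sketched as below. Connection, DbConnection, create and postConstruct are hypothetical names, not taken from [[c++codingStd]].

```cpp
#include <memory>
#include <string>

class Connection {
public:
    virtual ~Connection() {}
    const std::string& state() const { return state_; }

    // The factory: the only way for outsiders to obtain an instance.
    template <typename Derived>
    static std::shared_ptr<Derived> create() {
        std::shared_ptr<Derived> p(new Derived);
        Connection* base = p.get();
        base->postConstruct();   // virtual call, safe now that construction is complete
        return p;
    }

protected:
    Connection() : state_("raw") {}
    virtual void postConstruct() { state_ = "base-ready"; }
    std::string state_;
};

class DbConnection : public Connection {
    friend class Connection;     // lets the factory reach the private ctor
    DbConnection() {}
    void postConstruct() override { state_ = "db-ready"; }
};
```

Had the ctor of Connection called postConstruct() directly, the base version would run, since the DbConnection part does not exist yet at that point.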

collection-of-abstract-shape: j^c++

In java, this usage pattern is simple and extremely common — Shape interface.

In c++, we need a container of shared_ptr to pure abstract base class 😦

  • pure abstract interface can support MI
  • shared_ptr supports reference counting, in place of garbage collection
  • pointer instead of nonref payload type, to avoid slicing.

This little “case study” illustrates some fundamental differences between java and c++, and showcases some of the key advantages of java.
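A minimal c++ sketch of the pattern. Square, Rect and totalArea are hypothetical names.

```cpp
#include <memory>
#include <vector>

struct Shape {                         // pure abstract interface
    virtual ~Shape() {}
    virtual double area() const = 0;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

double totalArea(const std::vector<std::shared_ptr<Shape>>& shapes) {
    double sum = 0;
    for (const auto& s : shapes)
        sum += s->area();              // dynamic dispatch via pointer, no slicing
    return sum;
}
```

A std::vector<Shape> would not even compile, because Shape is abstract. With a concrete base class it would compile and slice.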

c++nested class accessing host private member

Recurring in stupid on-line MCQ interviews!

P185 ARM says a nested class NC can access static members of the enclosing class EC, and also local types of EC, all without fully qualified name.

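However, NC can’t access EC’s non-static members without an EC object (or pointer/reference), because an NC instance holds no implicit pointer to any EC instance. Java took a bold departure: a non-static inner class instance always carries a hidden reference to its enclosing instance.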

In c++11, a nested class is an implicit friend of the enclosing class. See https://stackoverflow.com/questions/5013717/are-inner-classes-in-c-automatically-friends
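A small sketch using the EC/NC names from above. The members staticCount and secret are hypothetical.

```cpp
class EC {
    static int staticCount;     // private static
    int secret;                 // private non-static
public:
    explicit EC(int s) : secret(s) {}

    struct NC {
        // Static member of EC: usable without qualification.
        static int readStatic() { return staticCount; }

        // Non-static member: needs an EC object, since NC carries no hidden
        // pointer to an enclosing instance. Private access is allowed.
        static int readSecret(const EC& host) { return host.secret; }
    };
};

int EC::staticCount = 7;
```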

Exception.getSuppressed()

In c++, dtor should never throw. During stack unwinding due to an escalating exception, every object constructed on the stack would go through their destructors (in the exact reverse order of construction). If one of the destructors throws a new exception that escapes the destructor, there’s a dilemma with the two exceptions both yet to be caught. The C++ standard resolves it by calling std::terminate(). Since c++11, destructors are implicitly noexcept, so an escaping exception terminates the program even when no unwinding is in progress.
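Since c++11 the compiler treats a dtor as noexcept unless told otherwise, and an exception escaping a noexcept function calls std::terminate(). A compile-time sketch with hypothetical types Quiet and Risky:

```cpp
#include <type_traits>

struct Quiet { ~Quiet() {} };                    // implicitly noexcept since c++11
struct Risky { ~Risky() noexcept(false) {} };    // explicit opt-out, rarely a good idea

static_assert(std::is_nothrow_destructible<Quiet>::value, "default is noexcept");
static_assert(!std::is_nothrow_destructible<Risky>::value, "opt-out is visible to traits");

bool quietDtorIsNoexcept() { return std::is_nothrow_destructible<Quiet>::value; }
bool riskyDtorIsNoexcept() { return std::is_nothrow_destructible<Risky>::value; }
```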

https://docs.oracle.com/javase/7/docs/api/java/lang/Throwable.html#addSuppressed(java.lang.Throwable) contrasts two situations

  • AA) one exception causing another, so Throwable.getCause() reveals the original (set via initCause() or a constructor argument). This has been supported for a long time
  • BB) two unrelated exceptions
    • BB1 try-with
    • BB2 bulk operations. Very rare.

https://stackoverflow.com/questions/8946661/jdk-1-7-throwable-addsuppressed-method has sample code to illustrate BB2 and BB1

According to [[java SE 8 for the really impatient]] P181, an exception thrown in a finally block would simply wipe out the original exception. [[java precisely]] agrees.

With the try-with-resources, the BB situation becomes more serious. Therefore, Suppressed exception is a general solution — The secondary exception is attached to the original, primary exception, as a suppressed exception.

See also https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html

Another, similar situation is a throw in a catch block like

try{…
}catch( Exception ex){
throw new IllegalArgumentException(); //ex is ignored and lost here. To keep it in the stack trace as the “Caused by”, pass it in, as in new IllegalArgumentException(ex)
}

c++atomic^volatile #java too

See other posts about volatile, in both c++ and java.

For a “volatile” variable, c++ (unlike java) compiler is not required to introduce memory fence i.e. updates by one thread are not guaranteed to be visible globally. See Eric Lippert on http://stackoverflow.com/questions/26307071/does-the-c-volatile-keyword-introduce-a-memory-fence

 

E=optm Enabled; D=optm Disabled
based on [[effModernC++]]

optimization | atomic | c++ volatile | java volatile
reorder@instruction | D: prevented | E: permitted 😦 [3] | D: prevented
remove memory fencing around this variable | D: fencing enforced | E: no guarantee #Eric | D: fencing enforced. See 3rdEffect
redundant access skip | E: permitted | D: prevented | E: probably permitted [2]
use any caches instead of high-latency main memory # cross-thr visibility of update requires D | D: prevented probably | E: permitted probably [4] | D: prevented
remove atomicity of RMW operations | D: atomicity enforced | E: not enforced | E: not enforced

[1] https://en.wikipedia.org/wiki/Memory_barrier
[2] special hardware access is too low-level, probably not a target of the JVM designers. Java volatile keyword is about virtual machine, not hardware machine. Using java here is likely a wrong choice of language. This topic might be an expertise/zbs topic but I have no time for it and seldom quizzed.
[3] a c++ volatile read (or write ) can be reordered with regard to non-volatile reads (or writes) according to [1] and [[effModernC++]]
[4] it is not guaranteed that c++ volatile reads and writes will be seen in the same order by other processors due to caching, cache coherence protocol and relaxed memory ordering

Unrelated to optimization — c++volatile can be used on regular variables, fields, methods.
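The last table row (RMW atomicity) in a sketch. The names are hypothetical, and the difference only becomes observable when several threads call the functions concurrently, which this single-threaded sketch does not attempt.

```cpp
#include <atomic>

std::atomic<int> atomicHits(0);
volatile int volatileHits = 0;

// One indivisible read-modify-write; safe under contention.
int bumpAtomic() { return atomicHits.fetch_add(1) + 1; }

// Separate load, add and store. volatile forces each memory access to be
// emitted, but another thread can interleave between them and an update
// can be lost.
int bumpVolatile() {
    volatileHits = volatileHits + 1;
    return volatileHits;
}
```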

Q: Is c++11 morphing into java n losing efficiency

An interviewer once asked Q: with some of the c++11/14 features, there’s a concern that c++ is slowly becoming more and more like java and losing its advantage over java. What do you think?

A: I lack the deep insight and expertise and therefore not confident to answer this question, but personally I disagree 90%. Java is a very “clean” language compared to C# and c++ in many ways such as

  • Every object is on heap. All custom types are constructed on heap.
  • Every object is passed by reference; every primitive is passed by value.
  • Non-static methods are virtual by default.
  • no pointer; no direct access to addresses
  • much simpler templates, with type erasure
  • higher-level, abstract concurrency constructs
  • simpler multiple-inheritance
  • strings are immutable
  • ..

Java /restricts/ us to use a smaller set of higher-level tools, with a reduced level of control and restricted access to the lower level resources such as memory, sockets, kernel … Java is an automatic car, while c++ is a manual race car.

C++ offers a richer toolbox to the programmer. Messier, dirtier and more complex. Many of the power tools are fairly lowLevel. C++11 removes almost nothing from the toolbox; even auto_ptr was only deprecated (and removed later, in c++17). Therefore, you can use c++11 same way as c++03, OR you can use c++11 more like java.

Q: Is there a trend to use c++ more like java?  
A: I notice growing adoption of references (instead of pointers), vector (instead of raw array), std::string (instead of c_str), smart pointer (instead of raw heap pointer) but they have nothing to do with c++11.

We can also examine the key features added to c++11. I think none of them is like java:

  • move semantics — not like java at all
  • smart pointers
  • concurrency — completely different from java
  • lambda
  • unordered containers

PURE-interface types ] c++^java

http://stackoverflow.com/questions/318064/how-do-you-declare-an-interface-in-c
Someone pointed out

“The whole reason you have a special Interface type-category in addition to abstract base classes in C#/Java is because C#/Java do not support multiple inheritance. C++ supports multiple inheritance, and so a special type isn’t needed. An abstract base class with only pure virtual methods is functionally equivalent to a C#/Java interface.”…. actually with minor differences.

The sample code shows no special ctor, though the dtor is public virtual but without “=0” (not pure virtual), so I assume an “interface” type in c++ should have a virtual dtor, since it is designed to be subclassed.

Google style guide is more strict — A class is a pure interface if it meets the following requirements:

  • It has only public pure virtual methods, non-pure virtual destructor, and static methods
    • I feel the dtor should be empty since there’s no instance field.
  • It may not have non-static data members — same as java8 interfaces
  • It need not have any constructors defined. If a constructor is provided, it must take no arguments and it must be protected.

Simple, clean, pure Multiple Inheritance..really@@

Update — Google style guide is strict on MI, but has a special exception on Windows.

MI can be safe and clean —

#1) avoid the diamond. Diamond is such a mess. I’d say don’t assume virtual base class is a vaccine

#2) make base classes imitate java interface … This is one proven way to use MI. Remember Barclays FI team. All pure virtual methods, No field, No big4 except empty virtual dtor.

#2a) Deviation: java8 added default methods to interfaces

#2b) Deviation: c++ private inheritance from one concrete base class , suggested in [[effC++]]

#3) simple, minimal, low-interference base classes. Say the 2 base classes are completely unrelated, and each has only 1 virtual method. Any real use case? I can’t think of any but when this situation arises i feel we should use MI with confidence and caution. Similarly “goto” could be put to good use once in a blue moon.
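Approach #2 can be sketched as below. Printable, Comparable and Price are hypothetical names. Each base has only pure virtual methods, no field and an empty virtual dtor.

```cpp
#include <string>

struct Printable {
    virtual ~Printable() {}
    virtual std::string print() const = 0;
};

struct Comparable {
    virtual ~Comparable() {}
    virtual int compareTo(int other) const = 0;
};

// Multiple inheritance from two java-style interfaces. No diamond, no
// shared state, so nothing to disambiguate.
class Price : public Printable, public Comparable {
    int cents;
public:
    explicit Price(int c) : cents(c) {}
    std::string print() const override { return std::to_string(cents); }
    int compareTo(int other) const override { return cents - other; }
};
```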

volatile in c++ ^ java, again

Note there is unlikely to be a final, complete blogpost on this topic. This topic may not be very deep by your standard, but it has so many loose ends… No point trying to tie up all the loose ends.

The “volatile” keyword is a restriction on compiler:

  • C++ compiler must disable certain optimizations
  • java compiler must disable some other optimizations (like re-order) and enforce memory fencing

Q9: What does volatile mean in C++, precisely? No easily accessible online answer, so let’s not dwell too long.

A9.0 (one of the common short answers): this keyword tells compiler to remove all “optimizations on the variable”, but what are they exactly?

A9.1 (one of the common answers): it tells compiler to always hit hardware memory, rather than some Software cache. You can verify the assembly generated by compiler. Assembly should show CPU instructions of memory access every time rather than selectively. However, such an experiment is likely to take high tcost. Ask Dong Qihao.

Note — Upon executing the instruction, if CPU uses some sort of Hardware-based caching, compiler has no control no knowledge — Compiler did the right thing and volatile is honored.

A9.2: it means the memory location can be updated by hardware so compiler can’t “assume” this process’s threads are the only writers to that memory location. Without this assumption, compiler must always generate cpu instruction to access that memory location, instead of hitting some software cache.

http://www.ibm.com/developerworks/java/library/j-jtp06197.html java Example 3 is a memory location whose content is subject to change by external systems outside the JVM. Even if your system never modifies state of the object, it can change any time.

Q10: what are the optimizations? I guess all of them are about memory/caching
A: http://en.wikipedia.org/wiki/Volatile_variable#In_C_and_C.2B.2B shows an example.

Now, A9.1 also applies to java volatile, but java volatile has additional meanings, with concurrency implications. C++ volatile has no such additional meaning. Many authors state that volatile is worthless as a c++ threading construct.

Q: So what is c++ volatile for? Arcane question. In reality, most programmers should not use it, but interviews like c++ volatile, so here we go
%%A: tighter control over access on the memory location behind the target variable.

The primary usage of volatile keyword in java vs c++ are completely different and unrelated.

Q: other differences between java and c++?
A: java volatile is about multiple threads updating a memory location; c++ is about writers outside the current process.
A: in java, volatile keyword creates memory fence/barrier. I think this is similar to c++atomic data types, but not c++volatile
A: java volatile can only be (static/non-static) fields!
A: java volatile confers atomicity on long/double fields
A: c++ member function can be marked volatile. Effect and meaning is described in https://stackoverflow.com/questions/15283223/volatile-function
A: c++ local variable can be marked volatile. See [[effModernC++]] examples

See also [[c++succintly]] and [[effModernC++]]

[12]call`same method@unrelated types: template outshines java

Suppose pure abstract class Animal has a die() method, and so does Project, Product, Plant and Planet, but they don’t share a base class. How would you write a reusable function that invokes this method on a generic input object, whose type could be any of them?

Java can’t do this. In C++ you create

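template<typename T> void f1(T input){ input.die(); }

If you pass an int into f1(), then you get a compile-time error when the template is instantiated, not a linker error. Is this SFINAE? No. The failure is inside the function body, so it is a hard error rather than a substitution failure that silently removes a candidate.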

STL algorithms routinely take an iterator argument and then call operator>() on the iterator. Now, this operator is undefined for a lot of iterator INSTANCES. I think only RandomAccessIterator category supports it.

Q: So how does STL make sure you don’t pass an object of ForwardIterator category into such a function?
A: use the template parameter type name (dummy type name) as a hint. Instead of the customary “T”, they put a suggestive dummy type name like “BidirectionalIterator” or “InputIterator”. If you ignore the hint you get compile-time error.

Now we understand that STL iterators have no inheritance hierarchy, but “stratification” among them.
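A runnable sketch of the die() scenario. Animal and Project here are simplified, hypothetical concrete types with no common base class, and die() returns a string only to make the effect observable.

```cpp
#include <string>

// Two unrelated types: no shared base class, just a method of the same name.
struct Animal  { std::string die() { return "animal dies"; } };
struct Project { std::string die() { return "project dies"; } };

// Compiles for any T that has a die() method returning something
// convertible to std::string.
template <typename T>
std::string f1(T& input) { return input.die(); }

// int n = 42; f1(n);  // would be a compile-time error at instantiation: int has no die()
```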

divide-by-0: c++no excp;java throws..why

https://stackoverflow.com/questions/8208546/in-java-5-0-statement-doesnt-fire-sigfpe-signal-on-my-linux-machine-why explains best.

http://stackoverflow.com/questions/6121623/catching-exception-divide-by-zero — c++ standard says division-by-zero results in undefined behavior (just like deleting Derived via a Base pointer without virtual dtor). Therefore programmer must assume the responsibility to prevent it.

A compliant c++ compiler could generate object code to throw an exception (nice:) or do something else (uh :-() like core dump.

If you are like me you wonder why no exception. Short answer — c++ is a low-level language. Stroustrup said, in “The Design and Evolution of C++” (Addison Wesley, 1994), “low-level events, such as arithmetic overflows and divide by zero, are assumed to be handled by a dedicated lower-level mechanism rather than by exceptions. This enables C++ to match the behavior of other languages when it comes to arithmetic. It also avoids the problems that occur on heavily pipelined architectures where events such as divide by zero are asynchronous.”.

C doesn’t have exceptions and handles division-by-zero with some kind of run time error (http://en.wikibooks.org/wiki/C_Programming/Error_handling). C++ probably inherited that in spirit. However, [[c++primer]] shows you can create your own divideByZero subclass of a base Exception class.
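A sketch of the do-it-yourself approach. DivideByZero and safeDiv are hypothetical names, and std::domain_error is merely one reasonable base class.

```cpp
#include <stdexcept>

struct DivideByZero : std::domain_error {
    DivideByZero() : std::domain_error("division by zero") {}
};

int safeDiv(int num, int den) {
    // Check first. Evaluating num / 0 is undefined behavior, so there is
    // nothing reliable to catch afterwards.
    if (den == 0) throw DivideByZero();
    return num / den;
}
```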

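java has no “undefined behavior”. Integer division by zero throws ArithmeticException, while floating-point division by zero returns Infinity or NaN without any exception.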

c++ reference variable is like …. in java@@

Q: a c++ reference is like a ….. in java?
A: depends (but i'd say it's not like anything in java.)

A1: For a monolithic type like int or char, a c++ reference variable is like an Integer.java variable. Assignment to the reference is like calling a setValue(), though Integer.java doesn't have setValue().

A2: For a class type like Trade, a c++ reference is like nothing in java. When you do refVar2.member3, the reference variable is just like a java variable, but what if you do

Trade & refVar2 = someTrade; //initialize
refVar2 = …//?

The java programmer falls off her chair — this would call the implicit op=

refVar2.operator=(….)
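A runnable sketch of the surprise, with a hypothetical one-field Trade.

```cpp
struct Trade { int qty; };

int demoRefAssign() {
    Trade t1 = {100};
    Trade t2 = {200};
    Trade& ref = t1;   // initialize: ref is an alias of t1, permanently
    ref = t2;          // NOT a reseat. Calls t1.operator=(t2), copying the state
    t2.qty = 999;      // t1 is unaffected, because it received a copy
    return t1.qty;
}
```

In java the equivalent assignment would repoint the variable and leave the original object untouched.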

java/c++overriding: 8 requirements #CRT

Here’s Another angle to look at runtime binding i.e. dynamic dispatch i.e. virtual function. Note [[effModernC++]] P80 has a list of requirements, but I wrote mine before reading it.

For runtime binding to work its magic (via vptr/vtbl), you must, must, must meet all of these conditions.

  • method must be –> non-static.
  • member must be –> non-field. vtbl offers no runtime resolution of FIELD access. See [[java precisely]]. A frequent IV topic.
  • host object must be accessed via a –> ref/ptr, not a non-ref variable. P163 [[essential c++]]. P209 [[ARM]] explains that with a nonref, the runtime object type is known in advance at compile time so runtime dispatch is not required and inefficient.
  • method’s parameter types must be —> an exact copy-paste from parent to subclass. No subsumption allowed in Java [2]. C++ ARM P210 briefly explains why.
  • method is invoked not during ctor/dtor (c++). In contrast, Java/c# ctor can safely call virtual methods, while the base object is under construction and incomplete, and subclass object is uninitialized!
  • method must be –> virtual, so as to hit the vtbl. In Java, all non-static non-final methods are virtual.
  • in c++ the call or the function must NOT be scope-qualified like ptr2->B::virtF() — subversion. See P210 ARM
  • the 2 methods (to choose from) must be defined INSIDE 2 classes in a hierarchy. In contrast, a call to 2 overload methods accepting a B param vs a D param respectively will never be resolved at runtime — no such thing as “argument-based runtime binding”. Even if the argument is a D instance, its declared type (B) is always used to statically resolve the method call. This is the **least-understood** restriction among the restrictions. See http://bigblog.tanbin.com/2010/08/superclass-param-subclass-argument.html

If you miss any one condition, then without run/compile-time warnings compiler will __silently__ forgo runtime binding and assume you want compile-time binding. The c++11 “override” keyword and java @Override help break the silence by generating compiler errors.
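The least-understood restriction (no argument-based runtime binding) in a sketch. B, D, handle, viaVirtual and viaOverload are hypothetical names.

```cpp
#include <string>

struct B {
    virtual ~B() {}
    virtual std::string who() const { return "B"; }
};
struct D : B {
    std::string who() const override { return "D"; }  // override makes a signature mismatch a compile error
};

std::string handle(const B&) { return "handle(B)"; }
std::string handle(const D&) { return "handle(D)"; }

// Runtime binding, based on the real type of the host object.
std::string viaVirtual(const B& b)  { return b.who(); }

// Compile-time binding, based on the declared type of the argument.
std::string viaOverload(const B& b) { return handle(b); }
```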

However, return type of the 2 functions can be slightly different (see post on covariant return type). P208 ARM says as of 1990 it was an error for the two to differ in return type only, but [[c++common knowledge]] P100 gives a clear illustration of clone() method i.e. virtual ctor. See also [[more eff C++]] P126. CRT was added in 1998.

[2] equals(SubclassOfObject) is overloading, not overriding. @Override disallowed — remember Kevin Hein’s quiz.

Here’s a somewhat unrelated subtle point. Suppose you have a B extended by C, and a B pointer/ref variable “o” seated at a C object, you won’t get runtime binding in these cases:

– if you have a non-static field f defined in both B/C, then o.f is compile-time binding, based on declared type. P40 [[java precisely]]
– if you have a static method m() defined in both B/C, then o.m() is compile-time binding, based on declared type. [1]
– if you have a nonref B variable receiving a C object, then slicing — u can’t access any C part.

[1] That’s well-known in java. In C++, You can also “call a static member function using the this pointer of a non-static member function.”

JNI stack memory — my new answer

Someone asked me about memory organization in a JNI runtime. I said the java stack and C stack sections are distinct. Then it occurred to me that in most cases the java thread extends Into the c stack. See https://pangin.pro/posts/stack-overflow-handling

However, I still feel my answer was correct because C can spawn new threads, perhaps in response to a call from java.

On a 2nd, unrelated, note — there can be (daemon or timer) C threads unrelated to any java thread.

Java stack can hold pointers to java heap objects, and C stack C heap objects. I don’t think they cross-reference. Consequently, the 2 stack sections are logically distinct.

removal→iterator invalidation:STL, fail fast, ConcurrentMap

This is a blog post tying up a few discussions on this subject. It’s instructive to compare the different iterators in different contexts in the face of a tricky removal operation.

http://tech.puredanger.com/2009/02/02/java-concurrency-bugs-concurrentmodificationexception/ points out that ConcurrentModEx can occur even in single-threaded myList.remove(..). Note this is not using myIterator.remove(void).

[[Java generics]] also says single-threaded program can hit CMEx. The official javadoc https://docs.oracle.com/javase/7/docs/api/java/util/ConcurrentModificationException.html agrees.

ConcurrentHashMap never throws this CMEx. See http://bigblog.tanbin.com/2011/09/concurrent-hash-map-iterator.html. Details? not available yet.

Many jdk5 concurrent collections have thread safe iterators. [[java generics]] covers them in some detail.

As seen in http://bigblog.tanbin.com/2011/09/removeinsert-while-iterating-stl.html, all STL graph containers (include slist) can cope with removals, but contiguous containers can get iterators invalidated. Java ArrayList improves on it by allowing the iterator to perform a safe remove() (safe from CMEx in a single thread, not thread-safe). I guess this is possible because the iterator could simply skip the dead node. Any other iterator is invalidated and will throw CMEx. I guess the previous nodes can shift up.

–brief history

  1. STL iterator invalidation results in undefined behavior. My test shows silent erroneous result. Your code continues to run but result can be subtly wrong.
  2. In java, before fail-fast, the outcome is also undefined behavior.
  3. Fail-fast iterator is the java solution to the iterator invalidation issue. Fail-fast iterators all throw CMEx, quickly and cleanly. I think CMEx is caused by structural changes — mostly removal and insertions.
  4. CHM came after fail-fast, and never throws CMEx
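On the STL side, the usual way to stay clear of invalidation is to continue from the iterator returned by erase() (c++11 for std::map). A sketch with hypothetical helper names:

```cpp
#include <map>
#include <vector>

// Node-based container: erase() invalidates only the erased node.
void eraseOddKeys(std::map<int, int>& m) {
    for (auto it = m.begin(); it != m.end(); ) {
        if (it->first % 2 != 0) it = m.erase(it);  // continue from the returned iterator
        else ++it;
    }
}

// Contiguous container: erase() invalidates every iterator at or after the
// erased position, so the returned iterator is the only safe one.
void eraseOdd(std::vector<int>& v) {
    for (auto it = v.begin(); it != v.end(); ) {
        if (*it % 2 != 0) it = v.erase(it);
        else ++it;
    }
}
```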

references are seldom reseated – java^c++

A reference or pointer takes just 32 bits (64 bits on a 64-bit platform). Cheap to copy.

For java refs and c++ pointers, The standard, default operation is a bitwise copy. That's what you get at function boundaries. Whenever you see a reference variable “reseated”, raise a red flag — something unusual is going on.

A java best practice is to declare all local reference variables as final (will raise a real red flag:), unless justified otherwise.

smart ptr ^ weak reference ^ atomic ref

There’s some interesting resemblance between sptr and weakref. (Will address atomic ref later ..) Background: dominant construct in both c++ and java is the 32-bit pointer, assuming 32-bit address bus.

sptr and weakref
– both “wrap” the raw ptr
– both occupy more than 32 bits
– both introduce an extra level of indirection when accessing pointee
– both were designed for clever heap memory de-allocation
– when you need a raw ptr, you can’t pass in such a super ptr.
– in each case, 2 addresses are involved –
** pointee address and sptr’s own address
** pointee address and the weakref’s own address. When you pass the weakref, you don’t clone the weakref but you clone the 32-bit address of the weakref object.

hasa^holdsa — in c++ ^ java

C++ offers an alternative to HASA

– hasa means Umbrella class holds a nonref field of the Component class
– holdsa means Umbrella class holds a ptr/ref field, though reference field is less common.

As a contrast, in java,
+ hasa is the only option for primitive fields
+ holdsa is the only option for non-primitive fields

Back to c++,

– holdsa adds sizeof(pointer) to sizeof(Umbrella), i.e. 4 bytes on a 32-bit platform or 8 bytes on 64-bit, ignoring any padding
– hasa adds sizeof(Component) bytes to sizeof(Umbrella), again ignoring any padding

^ hasa adds compile-time dependency. Since compiler needs to size up umbrella class, umbrella class definition must include full composition (field listing) of the component class. This leads to compile time dependency.
^ holdsa is pimpl. reduces compile time dependency. FCD required.
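A sketch of the two layouts. Component, Engine and the two Umbrella structs are hypothetical. Note that Engine is never defined, which is the whole point of the FCD.

```cpp
#include <cstddef>

struct Component { double a; double b; };

class Engine;   // forward class declaration (FCD): enough for a pointer field

struct UmbrellaHasa   { Component part; };  // compiler needs the full definition of Component
struct UmbrellaHoldsa { Engine* part; };    // Engine stays an incomplete type here

constexpr std::size_t hasaSize()   { return sizeof(UmbrellaHasa); }
constexpr std::size_t holdsaSize() { return sizeof(UmbrellaHoldsa); }
```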

low-level language features — c++ ^ java

Low-level features include
– ptr
– func ptr — usually quite complicated
– func ptr rather than java interface
– func ptr rather than method inheritance and overriding
– arrays rather than collections
– ptr rather than iterators
– char* rather than std::string
We operate “lower” when
– At runtime rather than compile time
– doing Memory management
– doing I/O
– interfacing with OS. I think sys calls are usually c-API
– Overloading operator new
– Max efficiency in terms of memory and time
– Handling exceptions
– Serializing
– Composing your own customized data structures instead of standard containers

These usually call for c/c++.

[10]STL complicated: iwt java collections

  • Collections.java and Arrays.java offer a smaller, simpler set of operations than the free functions in STL, which are important and complex.
  • functor and function ptr as inputs. Java uses interfaces, much cleaner.
  • pointers as container elements
  • unary, binary and generator functors. java offers some clean and simple interfaces
  • algorithms usually take iterator arguments but not just any iterator. You need to know what iterators.
  • function adapters

Basic semantics are identical between STL and java collections. Nested container is fine.

q[private] access modifier ] java # surprises

Given 2 instances of C — c1 and c2. c1 can access private members in c2. See [[hardcore java]].

What about an instance b1 of a base class B? I believe b1 can’t access them. Compiler will tell you b1 knows nothing about subclasses’ fields. Similar to the c++ slicing problem.

A sound bite: “private” is checked at compile time.

Such an understanding of “private” is critical to nested classes. See posts on nested classes.

Moving on to “protected”. B has protected m1(), so a C method can call super.m1() or equivalently m1(), but an arbitrary method of an arbitrary class can’t call m1(). Again compile-time check. This is the story of the protected clone() and finalize() in Object.java.

I feel finalize() is designed to be called as super.finalize() in an overridden finalize().

what C/C++ can do but Java can’t@@

See also post on c++ vs real time jvm.
* java generics has non-trivial limitations. c++ generics is more complete IMHO. Some say c++ generics leads to code bloat but it’s not intuitive to me.
* I believe some financial packages were written for C/C++. On the other hand, some packages might be written for java.
* c/c++ api are more easily usable by java, dotnet and Excel
* Function pointers. Java has Method objects (and interfaces), but not widely used.
* operator overloading
* preprocessor is mighty. I feel java has some code generators but not popular.
* JVM is written in c/c++, on most OS versions. (The standard java compiler javac is itself written in java.)
* A large part of windows was written in C++ and C.
* Most OS system API is in C. C++ can easily call them.
* most shell utilities in Unix and Windows are written in C/C++. Java is too slow and bulky. To run a simple grep, (I guess) you need to start a JVM.
* Most database engines are probably written in C/C++. If for any reason you want to build your own database engine then c++ is probably better than java.
* fastest web servers are in c/c++.
* Here’s one last thing c++ can provide, but this is my personal view. C++ knowledge, if deep enough, provides some kind of window (from a lower angle) into the performance limitations/strengths of java and c#. This insight is valuable but hard to achieve if I only know java.
I don’t know the details, but a large part of c++ power comes from pointers. 
Java/c# are high-level languages. c/c++ are mid-level languages; while assembly is truly low-level. Some requirements (DB, OS, system utilities, compiler…) are best implemented in mid-level languages. Do financial apps fall into that category? I don’t think so, that’s why C++ is severely challenged by java and C#.

explicit cloning in c++, java, collections…

Say you have a collection holding Errors, Exceptions .. which are subclasses. Let’s clone them.

In a C++ container, you need Throwable pointers to Error objects. To clone each object, you would need a virtual clone() method.

In a java collection, if Error doesn’t implement clone(), then the Throwable clone() won’t help as it can’t access subclass’s additional fields (molecules on a single onion layer). I gave the UBS interviewer a workaround — collection’s iterating loop to use reflection to get the class of each element (Errors), then call constructor reflectively, finally set each field reflectively.

Serialization is another solution, as described in http://en.wikipedia.org/wiki/Clone_(Java_method)
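Here is a rough sketch of the serialization route, assuming every element is Serializable (Throwable is). The class name SerialCloneDemo and the helper deepCopy() are my own names, not from any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class SerialCloneDemo {
    // Hypothetical helper: copies an object graph by writing it to bytes and reading it back.
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    public static void main(String[] args) {
        List<Throwable> originals = new ArrayList<>();
        originals.add(new Error("disk full"));
        originals.add(new IllegalStateException("bad state"));

        List<Throwable> copies = new ArrayList<>();
        for (Throwable t : originals) {
            copies.add(deepCopy(t)); // the runtime class survives without a virtual clone()
        }
        for (Throwable t : copies) {
            System.out.println(t.getClass().getSimpleName() + ": " + t.getMessage());
        }
    }
}
```

The copy is a distinct object of the same runtime class, so the collection loop never needs to know the concrete subclass.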

Can you create methods using reflection? I said no you need bytecode instrumentation or perhaps dynamic proxy.

Unlike assignment, clone() should create a brand new object. (Try Object.java javadoc.) Stack-var would get out of scope. So cloned objects should use new() and pointer and heap.

Can clone() return the new pointer unwrapped (i.e. dereferenced)? Put another way, do you return the object by pbref or pbclone? We already made a clone, so pbref is more efficient.

double-checked locking and 3-step constructor re-ordering

I feel one reason why the DCL is broken relates to the 3-step process in “k = new Child()”. In c++, new (delete is similarly 2-stepper) is a 3-stepper — allocation + initialization + pointer assignment. JVM is written in c++, so probably same process —

1) allocate — new address for the new born Child
A) address saves to the 4-byte pointer k. Note the actual “store” operation (ie flush to main memory) may be delayed, but ironically not in this case — When you don’t want it delayed, it may; when you want it delayed it may not … Sigh.
B) initialize the memory cells of Child

Now the surprise — Steps A and B could be reordered by compiler. What if address assignment happens before initialization? Creator thread is in the critical section, between Steps A and B, but 2nd thread outside the critical section already sees A’s effect, believing the entire 3-stepper completed!

You may think Steps A and B are very very adjacent, but no no no. In a 128-processor machine with 100 threads on each, a lot of (mostly bad) things can happen when creator thread is pushed to end of the run queue.
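For reference, a sketch of the usual java fix, assuming the java 5+ memory model: declare the shared reference volatile so that the reference cannot be published before the constructor’s writes. LazyChild is a made-up class name standing in for Child.

```java
// Hypothetical lazily-created singleton illustrating double-checked locking.
class LazyChild {
    // volatile: a thread that sees a non-null reference also sees the initialized fields.
    private static volatile LazyChild instance;
    private final int payload;

    private LazyChild() { payload = 42; }

    static LazyChild getInstance() {
        LazyChild local = instance;          // first check, without the lock
        if (local == null) {
            synchronized (LazyChild.class) {
                local = instance;            // second check, inside the critical section
                if (local == null) {
                    local = new LazyChild();
                    instance = local;
                }
            }
        }
        return local;
    }

    int payload() { return payload; }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // prints true
    }
}
```

Without the volatile keyword, the reordering of Steps A and B described above is permitted, and a 2nd thread could see a non-null but half-initialized object.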

##java’s advantages over c++

#1) portable — no compiler discrepancies
#2) memory management — c++ stack variables are fine, but heap and pointer are messy. Look at all the smart pointers.
) slicing problem

These are the c++ PAINS addressed by java. The other c++ shortcomings are less painful

#3) thread
#4) simpler, (and thread-safe) collections than STL
) reflection, proxy — one of java’s most powerful features
) simpler generics — but i feel this is less battle tested, as most developers don’t write generic classes.
) no multiple, virtual or private/protected inheritance
) simpler remoting

Runnable.java object differ from a func ptr in a C thread creation API@@

(C# Delegates???)
Q: how is a Runnable object different from a function pointer in a C thread creation API?

Let’s start from the simple case of a fieldless Command object. This command object could modify a global object but the command object itself has no field. Such a Runnable object is a replicable wrapper over a func ptr. In C, you pass the same func ptr into one (or more) thread ctor; in java you can feed the same Runnable object (or a clone of it) to each new Thread. In both cases, the actual address of the run() method is a single fixed address.

Now let’s add a bit of complexity. I feel a Runnable can access a lot of objects in “creator-scope” — Since a Runnable object is created at run time often inside some methods, that creator method has a small number of objects in its scope.

Assuming a 32bit platform,

Q: are these variables (more precisely objects AND their 32bit references) living on the stack of the run() method or as hidden fields of the Runnable object?
%%A: no difference. Since java compiler guarantees that these variables are “stable”, compiler can therefore create implicit fields in the Runnable object
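A small sketch to poke at the “implicit fields” idea. CaptureDemo, makeTask() and syntheticFieldCount() are my own names. How the compiler stores captured variables is an implementation detail, so I only print the count rather than promise a number.

```java
import java.lang.reflect.Field;

class CaptureDemo {
    // Creator method: log and id live in its scope and are captured by the Runnable.
    static Runnable makeTask(final StringBuilder log, final int id) {
        return new Runnable() {
            @Override public void run() { log.append("task-").append(id); }
        };
    }

    // Counts compiler-generated (synthetic) fields on the object's class.
    static int syntheticFieldCount(Object o) {
        int n = 0;
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.isSynthetic()) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Runnable task = makeTask(log, 7);   // makeTask() has already returned here
        task.run();                         // yet run() still reaches log and id
        System.out.println(log);
        System.out.println("synthetic fields: " + syntheticFieldCount(task));
    }
}
```

The captured variables must be final (or effectively final), which is the “stable” guarantee mentioned above.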

Does a func ptr in C also have access to objects in creator-scope? I think so. Here’s an example taken from my blog on void ptr —

  thread(void (*FVVP)(void *), void* vptr) // creates and starts a new thread. FVVP is a pointer to a Function returning Void and accepting a Void Pointer

Above is a thread api with 2 arguments: a function pointer and a void ptr. The function pointer points to a function accepting a void ptr, so the 2nd argument (creator scope) feeds into that function.

callback objects and func ptr: java^C++

Java has no func ptr. Method objects are sometimes used instead, but anonymous classes are more common.

in C/C++, func pointers are widely used to meet “callback” requirements. Java has a few callback solutions but under many names
* lambda
* observer/listener
* event handler
* command pattern
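A quick sketch of two of these callback flavours side by side. PriceFeed is a made-up class; Consumer is the standard java.util.function interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical event source that keeps a list of callbacks (the listener pattern).
class PriceFeed {
    private final List<Consumer<Double>> listeners = new ArrayList<>();

    void subscribe(Consumer<Double> listener) { listeners.add(listener); }

    void publish(double price) {
        for (Consumer<Double> listener : listeners) {
            listener.accept(price);   // the "call back"
        }
    }

    public static void main(String[] args) {
        PriceFeed feed = new PriceFeed();
        feed.subscribe(p -> System.out.println("lambda saw " + p));
        feed.subscribe(new Consumer<Double>() {   // pre-java-8 style
            @Override public void accept(Double p) {
                System.out.println("anonymous class saw " + p);
            }
        });
        feed.publish(101.5);
    }
}
```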

Func ptr is closer to the metal, but less object oriented.

Boost has a Boost.Function module for callback.

TYPE — different meanings in java vs c++

PRECISE meaning of Type differs between java and C++.

Java :
– any (even marker) interface,
– any class,
– any primitive….
is a type. We say “the declared type of the variable”.

c++ :
* any class, parametrized or not
* primitive data type … is a type.
* typedef can declare new type names like aliases, but can’t create new types. Consider const_reverse_iterator.

nested class has special capabilities ] C++^java

Q: Can nested class (NN) methods access a private member of the enclosing class EE? See c++nested class accessing host private member

Can EE methods access NN’s private members? Yes. Tested. See below.

Below is based on P186 ARM, perhaps outdated.
Q: in terms of syntax, how does Inner class methods refer to Outer class’s non-static[1] field?
A: no special access. Unlike java, there’s no hidden “this” field pointing to the Outer _i n s t a n c e_
A: yes if the Inner method has a pointer, reference or a nonref variable of an Outer class _i n s t a n c e_,

Java’s nested class N1 is in the same “private club” as its enclosing class E2.
– N1 methods can access private members of E2
– E2 methods can access private members of N1
–> Not in C++
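The java half of this can be sketched as below. Host plays the role of E2 and the inner names are made up.

```java
// Hypothetical enclosing class (E2 in the text) with a static nested class N1.
class Host {
    private int outerSecret = 1;

    static class N1 {
        private int nestedSecret = 2;

        // nested -> enclosing: reads a private field of Host
        int readOuter(Host h) { return h.outerSecret; }
    }

    // enclosing -> nested: reads a private field of N1
    int readNested(N1 n) { return n.nestedSecret; }

    public static void main(String[] args) {
        System.out.println(new N1().readOuter(new Host()));  // prints 1
        System.out.println(new Host().readNested(new N1())); // prints 2
    }
}
```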

Q: So what special capabilities does a nested class have? Why make a class nested at all?
A: to mention the “Inner” type name, you need to prefix it as “Outer::Inner”
A: another specialness — inner class methods can access outer private members. As pointed out on
http://stackoverflow.com/questions/1604853/nested-class-access-to-enclosing-class-private-data-members
In the original C++98 the inner class has no special privileges accessing the outer class. With a C++98 compiler you’d either have to give the inner class the necessary privileges (friendship) or expose the outer members x and y as public. However, this situation was classified as a defect in C++98, and it was decided that inner classes should have full access to outer class members (even private ones).

[1] static is fine — Outer::f

uninitialized local var and fields – C++/java

C++ variables with static “storage” or global “scope” have default initial value zero, or null for pointers.

c++ fields of primitive types are truly _uninitialized_. C++ needs special compilers/debuggers to detect these.

C++ local int objects are truly _uninitialized_ — random initial state. Fixed in java —

java Local variables must be definitely assigned to before they are accessed, or it is a _compile_error_.

java Fields are 0-initialized by _default_. Not in c/c++ .
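A tiny sketch of both java rules. DefaultInitDemo and describe() are names I made up.

```java
class DefaultInitDemo {
    int count;       // defaults to 0
    double ratio;    // defaults to 0.0
    boolean flag;    // defaults to false
    String name;     // defaults to null

    // Reports the field values of a freshly constructed object.
    static String describe() {
        DefaultInitDemo d = new DefaultInitDemo();
        return d.count + " " + d.ratio + " " + d.flag + " " + d.name;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints: 0 0.0 false null

        int local;
        // System.out.println(local);   // compile error: local might not have been initialized
        local = 5;                      // definite assignment
        System.out.println(local);      // prints 5
    }
}
```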

Note c++ class-type objects are always initialized via one of the (default) constructors. No such thing as an uninitialized MyClass instance.

main thread early exit

in java, main thread can “feel free” to exit if another non-daemon thread keeps the entire JVM alive. Not c++. [[c++cookbook]] says

when the operating system destroys a process, all of its child threads go with it, whether they’re done or not. Without the call to join(), main() doesn’t wait for its child thread: it exits, and the operating system thread is destroyed.

This assumes a 1:1 native thread model, so the operating system thread is actually a kernel thread. When the main thread ends, the entire process ends.
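The java side can be sketched like this. EarlyExitDemo and startWorker() are made-up names, and the 200ms sleep is arbitrary.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class EarlyExitDemo {
    static final AtomicBoolean workerFinished = new AtomicBoolean(false);

    // Starts a non-daemon worker and returns at once, without join().
    static Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            workerFinished.set(true);
            System.out.println("worker done");
        });
        worker.setDaemon(false); // explicit for emphasis: non-daemon threads keep the JVM alive
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        startWorker();
        System.out.println("main returns now");
        // No join() here. The JVM exits only after the worker finishes.
    }
}
```

If the worker were marked as a daemon instead, the JVM could exit as soon as main returns, which is closer to the c++ behaviour quoted above.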

c++ pure-abstract-class == java interface

http://www.objectmentor.com/resources/articles/dip.pdf

In C++ we generally separate a class into two modules: a .h module and a .cc module. The .h module contains the definition of the class, and the .cc module contains the definition of that class’s member functions. The definition of a class, in the .h module, contains declarations of all the member functions and member variables of the class.

This information goes beyond simple interface.

All the utility functions and private variables needed by the class are also declared in the .h module. These utilities and private variables are part of the implementation of the class, yet they appear in the module that all users of the class must depend upon (via #include). Thus, in C++, implementation is not automatically separated from pure interface.

This lack of separation between pure interface and implementation in C++ can be dealt with by using purely abstract classes. A purely abstract class is a class that contains nothing (no fields) but pure virtual functions (ending in “=0”). Note Forward class declaration in effC++ can have fields, so those classes are usually not PAC.

c++ purely-abstract-class (PAC) == Java interface. SUN saw the value of PAC and made it a first class citizen in java.
c++ class definition + method definition == java class definition
c++ class with at least one pure virtual method == java abstract class

c++ forward class declaration == java no such thing
c++ regular class header file (without method definition) == java no such thing

In terms of compile-time dependency, java favors dependency-on-interface. In c++, you usually depend on *.h files, which is “thicker” than java interfaces. I feel there are 2 ways to reduce this tight coupling

* Forward class declaration
* PAC

In java, as a “client” class, we can be completely oblivious to the actual type responding to OUR message. That actual type could be unknown at compile time. No recompile required when the actual type changes, since we depend on the interface only. I feel this is the dream of c++ developers.

equals() and Comparable.java improves on STL

The equality-equivalence inconsistency in STL (P84 [[Effective STL]]) is fixed in java collections due to the “consistency” between equals() and Comparable interface —

The natural ordering for a class C is said to be consistent with equals if and only if (e1.compareTo((Object)e2) == 0) has the same boolean value as e1.equals((Object)e2) for every e1 and e2 of class C. In STL, such a contract is not enforced, with multiple implications.

However, java condones/permits an implementation to deviate from this consistency. Consistency is “strongly recommended (though not required)”
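BigDecimal is the classic example of a JDK class that deviates. A small sketch (ConsistencyDemo and sizes() are my own names):

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class ConsistencyDemo {
    // Returns { size of HashSet, size of TreeSet } after adding 1.0 and 1.00 to both.
    static int[] sizes() {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");

        Set<BigDecimal> hashed = new HashSet<>(); // uses equals()/hashCode()
        Set<BigDecimal> sorted = new TreeSet<>(); // uses compareTo()
        hashed.add(a);
        hashed.add(b);
        sorted.add(a);
        sorted.add(b);
        return new int[] { hashed.size(), sorted.size() };
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        System.out.println(a.equals(b));     // prints false (scale differs)
        System.out.println(a.compareTo(b));  // prints 0
        int[] s = sizes();
        System.out.println(s[0] + " vs " + s[1]); // prints 2 vs 1
    }
}
```

So the same two values are “different” to a hash-based set and “the same” to a sorted set, the java cousin of the STL equality-equivalence gap.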

It’s instructive to compare this consistency against the equals/hashcode contract in java.

intrusive_ptr and java interface

For a beginner to intrusive_ptr, we won’t be too far off to remember that a pointee class could (not required) implement a java-style interface with 2 public methods addRef and delRef(). The pointee class typically uses these methods to change a private field refCount.

In Boost, the intrusive_ptr actually uses 2 stipulated free functions, but they could (and often do) call methods of the pointee class, like our 2 methods.

java type bounds, c++ concepts, templates

If you have a c++ parametrized class C whose T must have “operator>”, then java has a better solution, presumably a type bound such as

class C<T extends Comparable<T>>

Now, for a parametrized function f(T input) whose T must have “operator>”, I would guess java would ditch the generics and simply declare f(Comparable input).
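A sketch of how a java type bound can express the “must be comparable” constraint, for a class and for a method. MaxHolder and larger() are names I invented, and this is one possible shape rather than the only one.

```java
// Hypothetical container that only accepts element types with a natural ordering.
class MaxHolder<T extends Comparable<T>> {
    private T best;

    // Keeps the largest value offered so far. compareTo() plays the role of operator>.
    void offer(T candidate) {
        if (best == null || candidate.compareTo(best) > 0) {
            best = candidate;
        }
    }

    T best() { return best; }

    // Generic-method version of the same bound.
    static <U extends Comparable<U>> U larger(U a, U b) {
        return a.compareTo(b) > 0 ? a : b;
    }

    public static void main(String[] args) {
        MaxHolder<Integer> holder = new MaxHolder<>();
        holder.offer(3);
        holder.offer(9);
        holder.offer(5);
        System.out.println(holder.best());            // prints 9
        System.out.println(larger("apple", "pear"));  // prints pear
        // new MaxHolder<Object>();  // would not compile: Object is not Comparable
    }
}
```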

In the general scenario, a template param T has constraints such as “having run() method”, “numeric”, “copyable”, “assignable”, “dereferenceable like a smart ptr”. In a /degenerate/ case, there’s no constraint — vector can take any type[1]. In another degenerate case, a template param can have so many constraints that the template class can take nothing but USPresident class — Better drop the template.

[1] actually vector requires T to be copyable.

Back to the constraints, c++ compilers actually check some of these constraints and won’t let you instantiate a template with an incompatible type. Examples:

* if you instantiate natural_log() with a non-numeric type, compilation breaks.
* if you instantiate sort() with a non-random-access iterator, …. compile time or runtime error? With std::sort it is a compile-time error, because the algorithm uses iterator arithmetic that (for example) std::list iterators don’t support.
* ARM p343: linker detects a non-comparable type specializing a sorting template.

After reading http://www.devx.com/SpecialReports/Article/38864, I feel c++concept (cconcept) resembles java type bounds of the form

Java interface is better. The constraints above usually translate to methods. In that case they can be implemented using interfaces, without generics. Occasionally, you can use something like

default assignment is bitwise copy – c/c++ vs java

in C language, A = b would copy value of b into A’s memory location, resulting in 2 detached clones. Consider a C struct — the copying is bitwise.

java primitives behave the same but Java reference variables don’t. A = b would make A a “hard link” of b, so A and b point to the same object[2]. Exactly like c++ ptr copy. Note ptr A and ptr b must be compatible types. If A is a base-type pointer, the pointee is not sliced; slicing occurs only when a derived object is copied by value into a base-type object.
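A minimal sketch contrasting the two java behaviours. AliasDemo and Point are made-up names.

```java
class AliasDemo {
    static class Point {
        int x;
        Point(int x) { this.x = x; }
    }

    // Returns { p, b.x } after mutating the "copies" q and a.
    static int[] run() {
        int p = 1;
        int q = p;     // primitive: the value is copied, giving two detached ints
        q = 99;        // p is unaffected

        Point b = new Point(1);
        Point a = b;   // reference: a and b now point to the same object
        a.x = 99;      // visible through b as well

        return new int[] { p, b.x };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints 1 99
    }
}
```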

Note java cloning, c/c++ assignment, c++ copier are all field by field. [2] is not field by field.

C++ treats all data types (float, char, classes…) as objects, but default assignment always works as in C (and Java primitives). Even if A and b are class types, A = b would copy value of b into A’s memory location, resulting in 2 detached clones, without creating a new object. Actually qq(A = b) may call the copier without the “explicit” modifier on the copier.

Bitwise copy means something special for a ptr field — the 4-byte content (ie address of pointee) is copied.

c++ nested class has NO ptr to enclosing-class object

See also the post in the recrec blog that’s dedicated to the same topic.

Q: Does a non-static nested class have an implicit pointer to the enclosing-class object [1]?

Specifically, say Class O defines a non-static nested class N, and has a non-static field n1 of type N. Given an object o2 of type O, in java, n1 has an implicit pointer to the o2 object. Not in C++. See P790 [[c++ primer]] and P187 ARM.

[1] and can access the non-static fields and methods of the enclosing object?
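The java side of this question can be sketched as follows, using OuterO for class O (the longer name is only to avoid a one-letter top-level class).

```java
// Hypothetical enclosing class O with a non-static nested (inner) class N.
class OuterO {
    private final String label;
    final N n1 = new N();   // non-static field of the inner type, bound to this OuterO

    OuterO(String label) { this.label = label; }

    class N {
        // OuterO.this is the implicit reference to the enclosing instance.
        String outerLabel() { return OuterO.this.label; }
    }

    public static void main(String[] args) {
        OuterO o2 = new OuterO("o2");
        System.out.println(o2.n1.outerLabel()); // prints o2
    }
}
```

A c++ nested class has no counterpart to OuterO.this. It would need an explicit pointer or reference to an enclosing object passed in.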

Q: what’s the difference between static vs non-static nested class in c++?

overriding vs field hiding in java and c++

[[Java precisely]] P40 summarizes

· In a non-static field access o.f, the field referred to is determined by the compile-time type of the object expression o
· In a non-static call to a non-private method o.m(), the method called is determined by the run-time class of the target object: the object to which o evaluates

Say C.java extends B.java,

Q: if you only have an object o=new C(), how can you access f and m() declared in B.java?
A: With the exception below, you can’t access B::m(). The method call o.m() resolves at run time and always binds to C::m(). However, f is a different story. You can cast o to B and o.f would refer to B::f

A special context — super.f and super.m() do give you access to parent’s members, but you can only use this syntax in C’s constructors or non-static methods, essentially in C.java source code.
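A sketch of the B/C scenario above. HideB and HideC stand in for B and C, and HidingDemo is my own driver class.

```java
class HideB {
    String f = "B.f";
    String m() { return "B.m"; }
}

class HideC extends HideB {
    String f = "C.f";                        // hides B.f
    @Override String m() { return "C.m"; }   // overrides B.m

    // super.f and super.m() are only usable inside the subclass source.
    String viaSuper() { return super.f + " " + super.m(); }
}

class HidingDemo {
    static String probe() {
        HideC o = new HideC();
        // Field access follows the compile-time type. Method calls follow the run-time class.
        return o.f + " " + ((HideB) o).f + " " + ((HideB) o).m();
    }

    public static void main(String[] args) {
        System.out.println(probe());                // prints C.f B.f C.m
        System.out.println(new HideC().viaSuper()); // prints B.f B.m
    }
}
```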

I wonder what C++ does.. Here are some tidbits.

o.B::m() actually lets you call parent’s m() via a subclass object. Also see Item 50 in [[eff c++]]

2D-array-variable vs double-ptr-variable # c++

[1] Consider a strict 2D-array-variable declared as
int myArray[9][9]; // size? exactly 81 ints, with no hidden row pointers. Are the 81 locations contiguous? Yes for a 2D array.

For a 1D array of pointers (probably same as jagged array), Not contiguous. The 1st (contiguous 9 ints) can live far from the 2nd (contiguous 9 ints). http://stackoverflow.com/questions/7784758/c-c-multidimensional-array-internals has the syntax

Any array is fix-sized (in theory). If you want to append physically, the physical memory block past the end of your array may not be available :-(.

So What solutions can you imagine?
P101 [[c pointer & mem mgmt]] shows diagrammatically a very common solution using a double pointer. Note a 2D-array variable won’t work here.

Once you allocate every nested array in the strict 2D structure, its memory usage is fixed, as it will look like [1] above?? I don’t think so. It’s more like the jagged 2D of C#. The declaration of myArray in [1] restricts the physical capacity of all the sub-arrays. In contrast, a variable declared as a double-pointer can be reseated: now I can point it at one array of row pointers, and later I can reseat it to another with different row lengths. (Strictly, an int** can’t point directly at myArray, because myArray decays to int(*)[9] rather than int**.)

I can also reseat my double-ptr at a partially initialized 1D array (size 5) of pointer objects, where the last pointer object is null. I can then point that last pointer to an int[99].

Now we realize a strict 2D-array-variable and a double-pointer-variable are quite different.

synthesized no-arg ctor

(Backgrounder — in both c++ and java, online tests and interviewers are crazy about nitty gritty of default ctor)

C++ standard — default constructor is a constructor that can be called with no arguments. –> Unlike java, this includes a constructor whose parameters all have default arguments [1]. By this definition, “default ctor” means something different from the synthesized no-arg (in multiple ways, but let’s not digress).

It’s best to avoid the ambiguous “default ctor” terminology in favor of “synthesized no-arg ctor”

In both java and c++, IF a default ctor is needed in your program, you can create compilation errors by accidentally suppressing the no-arg synthesis. As soon as you DECLARE/define any ctor in your class, that synthesis is immediately suppressed.

* [1] Item 45 [[eff c++]] describes the default ctor as a no-arg.
* java also refers to it as the no-arg.
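In java the suppression normally shows up as a compile error, which is hard to put in a runnable snippet. So this sketch uses reflection to check whether a no-arg ctor exists. NoArgDemo, Plain, WithArg and hasNoArgCtor() are made-up names.

```java
class NoArgDemo {
    static class Plain { }                        // compiler synthesizes Plain()

    static class WithArg {
        WithArg(int x) { }                        // declaring any ctor suppresses the synthesis
    }

    // True if the class has a constructor taking no arguments.
    static boolean hasNoArgCtor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgCtor(Plain.class));   // prints true
        System.out.println(hasNoArgCtor(WithArg.class)); // prints false
        // new WithArg();  // would not compile: no matching constructor
    }
}
```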

if constructor throws ..

myInstance = new MyClass() ;

Will myInstance become null or …?

I feel the assignment should leave myInstance unchanged. The constructor (which strictly speaking is not a “method”), like a method that throws, won’t return anything to the caller. The constructor, the caller, and upstream callers may each be aborted.
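A quick java experiment to check the hunch. ThrowingCtorDemo and the two-argument MyClass constructor are invented for the test.

```java
class ThrowingCtorDemo {
    static class MyClass {
        final String tag;

        MyClass(String tag, boolean fail) {
            if (fail) {
                throw new IllegalStateException("constructor failed");
            }
            this.tag = tag;
        }
    }

    // Returns the tag of whatever myInstance refers to after a failed re-assignment.
    static String run() {
        MyClass myInstance = new MyClass("first", false);
        try {
            myInstance = new MyClass("second", true); // throws before the assignment happens
        } catch (IllegalStateException e) {
            // myInstance still refers to the "first" object, not null
        }
        return myInstance.tag;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints first
    }
}
```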

See blog on exceptions in call-stack

 


For c++, a throwing ctor is common. If the object is being created on the heap with new, the memory allocated for it is released automatically.