A Shanghai Morgan Stanley interviewer asked in a 2017 java interview — “How could jvm surpass c++ latency?”
- One reason — JIT compiler could aggressively compile bytecode into machine code with speedy shortcuts for the “normal” code path + special code path to handle the abnormal conditions.
P76 [[javaPerf]] described a nifty JIT technique to avoid runtime cost of the dynamic binding of virtual function equals(). Supppose in some class, we call obj1.equals(obj2).
After a priming (i.e. warm-up) period, JIT collects enough statistics to see that every dynamic dispatch at this site is calling String.equals(), so JIT decides to turn it into faster “static binding” so the String.equals() function address is hardwired into the assembly code (not JVM bytecode). JIT also needs to handle the possibility of Character.equals(). I guess the assembly code can detect that obj1/obj2 is not a String.java instance and retry the virtual function lookup. JIT can generate assembly code to
1. call String.equals() and go ahead to compare some field of obj1 and obj2.
2. if no such field, then obj1 is not String, then backtrack and use obj1 vtable to look up the virtual function obj1.equals()
It may turn out that 99.9% of the time we can skip the time-consuming Step 2: )
Priming note — priming is tricky — https://www.theserverside.com/tip/Avoid-JVM-de-optimization-Get-your-Java-apps-runnings-fast-right-fromt-the-start highlights pitfalls of priming in the trading context. Some take-aways:
1. Optimizing two paths rather than just one path
2. Reusing successful optimization patterns from one day to the next, using historical data
- One hypothesis — no free() or delete() in java, so the memory manager doesn’t need to handle reclaiming and reusing the memory. [[optimizedC++]] P333 confirmed the c++ mem mgr does that. Instead the GC uses a very different algorithm
https://stackoverflow.com/questions/1984856/java-runtime-performance-vs-native-c-c-code is Not a published expert but he says —
On average, a garbage collector is far faster than manual memory management, for many reasons:
• on a managed heap, dynamic allocations can be done much faster than the classic heap
• shared ownership can be handled with negligible amortized cost, where in a native language you’d have to use reference counting which is awfully expensive
• in some (possibly rare and contrived) cases, object destruction is vastly simplified as well (Most Java objects can be reclaimed just by GC’ing the memory block. In C++ destructors must always be executed)
- One hypothesis — new() is faster in jvm than c++.
Someone said “Object instantiation is indeed extremely fast. Because of the way that new objects are allocated sequentially in memory, it often requires little more than one pointer addition, which is certainly faster than typical C++ heap allocation algorithms.”
Julia, Go and Lua often beat C in benchmark tests .. https://julialang.org/benchmarks/
http://www.javaworld.com/article/2076593/performance-tests-show-java-as-fast-as-c–.html is a 1998 research.
In my GS-PWM days, a colleague circulated a publication claiming java could match C in performance, but didn’t say “surpass”.