Is java/c# interpreted@@No; CompiledTwice!

category? same as JIT blogposts

Q: are java and c# interpreted? QQ topic — academic but quite popular in interviews. shows one explanation among many:

The term “interpreter”, applied to a runtime, generally means that existing native code interprets some non-native code. There are two large paradigms. Parsing: the runtime reads the raw source code and takes logical actions. Bytecode execution: the code is first compiled to a non-native binary representation, which requires far fewer CPU cycles to interpret.

Java originally compiled to bytecode, then went through an interpreter; now, the JVM reads the bytecode and just-in-time compiles it to native code. CIL does the same: The CLR uses just-in-time compilation to native code.

C# compiles to CIL, while JIT compiles to native; by contrast, Perl immediately compiles a script to a bytecode, and then runs this bytecode through an interpreter.

power-of-2 sized hash tables: implications  and explain that java HashMap uses bucket count equal to power-of-two. A few implications

  • the modulo operation is a fast bit-masking i.e. select the lowest N bits: hash & (length-1)
  • an unlucky hashCode() may not offer dispersion among the lower bits. Assuming 16 buckets so only the lowest 4 bits matter, but hashCode() returns predominantly multiples of 8. Then two out of 16 buckets would be white hot. To guard against it, HashMap performs a rehash on the hashCode() output.
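Here is a minimal sketch of the two bullets above. The spread() step follows the same idea as the one in OpenJDK 8's HashMap (XOR the high 16 bits into the low 16). The class name, method names and the "unlucky" hash codes are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

final class HashSpreadDemo {
    // Fold the high 16 bits into the low 16 so that they influence the bucket index.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // With a power-of-two table length, "modulo" is just a bit mask on the lowest bits.
    static int bucketIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    // Count how many distinct buckets a set of hash codes lands in.
    static int distinctBuckets(int[] hashCodes, int tableLength, boolean useSpread) {
        Set<Integer> buckets = new HashSet<>();
        for (int h : hashCodes) {
            buckets.add(bucketIndex(useSpread ? spread(h) : h, tableLength));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        // Hypothetical unlucky hash codes: they differ only in the high 16 bits.
        int[] unlucky = new int[16];
        for (int i = 0; i < unlucky.length; i++) {
            unlucky[i] = i << 16;
        }
        System.out.println("buckets used without spread: " + distinctBuckets(unlucky, 16, false));
        System.out.println("buckets used with spread:    " + distinctBuckets(unlucky, 16, true));
    }
}
```

Without the spread step all sixteen keys share one bucket, because the mask only looks at the lowest 4 bits.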

I implemented the same feature in

Note rehash is ineffective if hashCode() returns very few distinct values. In java8, there’s one more protection against it — RBTree as alternative to linked list… See RBTree used inside java8 hashmap

RBTree used inside java8 hashmap is a language-neutral discussion. shows an early implementation

they are implemented as a hashed array of binary trees. 
My experience with them is that they are very efficient.

JEP 180 is the proposal that introduced a self-balanced tree as an alternative to the linked list. shows that with one million entries in a HashMap a single lookup takes 20 CPU cycles, or less than 10 nanoseconds. Another benchmark test demonstrates O(logN) get(key) in java8, but O(N) in java7, as traditionally known. N being the number of colliding elements in a single bucket.

If two hashCode() values are different but ended up in the same bucket (due to rehashing and bucketing), one is considered bigger and goes to the right. If hashCodes are identical (as in a contrived hashCode() implementation), HashMap hopes that the keys are Comparable, so that it can establish some order. This is not a requirement of HashMap keys.

If hashCodes are mostly identical (rare) and keys are not comparable, don’t expect any performance improvements in case of heavy hash collisions. Here’s my analysis of this last case:

  • if your Key.equals() is based on address and hashCode() is mostly the same, then the RBTree ordering can and should use address. You won’t be able to look up using a “clone” key.
  • if you have customized Key.hashCode() then you ought to customize equals(), but suppose you don’t implement Comparable, then you are allowed to lookup using a clone key. Since there’s no real ordering among the tree nodes, the only way to look up is running equals() on every node. says

A given bucket contains both Node (linked list) and TreeNode (red-black tree). Oracle decided to use both data structures with the following rules:

  • If for a given index (bucket) in the inner table there are more than 8 nodes, the linked list is transformed into a red black tree
  • If for a given index (bucket) in the inner table there are less than 6 nodes, the tree is transformed into a linked list

With the self-balanced tree replacing the linked list, worst-case lookup, insert and delete are no longer O(N) but O(logN) guaranteed.
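To see the heavy-collision case for yourself, here is a small sketch. BadKey is a hypothetical key type whose hashCode() is deliberately constant, so every entry lands in one bucket. It implements Comparable, which is the case where java8 can order the tree nodes. The timing printed by main() is only indicative, not a proper benchmark.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    // Hypothetical key: identical hashCode for all instances, but Comparable.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // contrived: forces every key into the same bucket
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Build a map where all n keys collide.
    static Map<BadKey, Integer> build(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        int n = 20_000;
        Map<BadKey, Integer> map = build(n);
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            map.get(new BadKey(i)); // lookup with a "clone" key
        }
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(n + " lookups in one bucket took about " + millis + " ms");
    }
}
```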

This technique, albeit new, is one of the best simple ideas I have ever seen. Why has nobody thought of it earlier?

CPU run-queue #java perspective

— Mostly based on Charlie Hunt’s [[JavaPerf]] P28

Runtime.availableProcessors() returns the count of virtual processors, or count of hardware threads. This is an important number for CPU tuning, bottleneck analysis.

When run-queue depth exceeds 4 times the processor count, the host system will become visibly slow (presumably due to excessive context switching). For a host dedicated to jvm, this is a 2nd reason for CPU saturation. The first reason is high CPU usage, which can occur even with a single CPU-hog.

Note run-queue depth is the first column in vmstat output
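A rough sketch of the rule of thumb above, from inside the JVM. Note the assumption: I use the 1-minute load average from OperatingSystemMXBean as a stand-in for run-queue depth. It is related to but not the same as the vmstat column, and it is not available on every platform. The class and method names are mine.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class CpuSaturationCheck {
    // Rule of thumb from the text: trouble when the queue is deeper than 4x the processor count.
    static boolean looksSaturated(double runQueueDepth, int processors) {
        return runQueueDepth > 4.0 * processors;
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Negative when the platform cannot supply a load average.
        double load = os.getSystemLoadAverage();

        System.out.println("virtual processors: " + processors);
        if (load < 0) {
            System.out.println("load average not available on this platform");
        } else {
            System.out.println("1-minute load average: " + load
                    + ", saturated by the 4x rule: " + looksSaturated(load, processors));
        }
    }
}
```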

Optional.empty()=immutable !!singleton– points out that Optional.empty() probably would return a global singleton instance, but no-guarantee ! No one should assume it is a singleton. Use of q[ == ] assumes it, and is wrong.

Unlike enums, JVM doesn’t guarantee “singleton”. I think this no-guarantee decision was chosen so as to give compiler/JVM maximum freedom.

[[effJava]] suggests that immutable instances need no copying. They probably can be singletons, but don’t need to be singletons.
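A tiny sketch of the safe check versus the fragile one. The helper names are made up. The fragile version may well return true on today's JDKs, but nothing promises that.

```java
import java.util.Optional;

final class OptionalEmptyCheck {
    // Safe: ask the Optional itself. Works whether or not empty() is a shared instance.
    static boolean isEmptySafe(Optional<?> opt) {
        return !opt.isPresent();
    }

    // Fragile: assumes empty() always hands back the same instance.
    static boolean isEmptyFragile(Optional<?> opt) {
        return opt == Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> nothing = Optional.ofNullable(null);
        System.out.println("safe check:    " + isEmptySafe(nothing));
        System.out.println("equals check:  " + nothing.equals(Optional.empty()));
        System.out.println("fragile check: " + isEmptyFragile(nothing) + " (do not rely on this)");
    }
}
```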


java singleton^immutable classes #enum

In c++ (and perhaps java) most singletons are designed as “casual singletons”, i.e. they are used as singletons but can be instantiated twice if you really want to, although there’s seldom a good justification.

[[effJava]] explains that an immutable class needs no copying. I think this is a performance view. (Does it apply to )

However, we don’t need to work hard trying to make it strict singletons.  Strict singleton is a functional requirement i.e. 2nd instance would be a functional defect.

If an immutable class happens to be used as one-instance data type, then lucky. But tomorrow it may get instantiated twice.. no worries 🙂

If boss asks you to make this class singleton, you should point out the legwork required and the numerous loopholes to fix before achieving that goal. Worthwhile?

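Java enum types are special. JVM guarantees each enum constant to be a singleton, even in the face of serialization. Immutability is not automatic, since an enum can still declare mutable fields, but a field-free enum is trivially immutable. See P 311.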
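A quick sketch of the serialization guarantee. Registry is a hypothetical single-element enum, and roundTrip() is a helper I made up that serializes and deserializes in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical strict singleton expressed as a single-element enum.
enum Registry {
    INSTANCE;

    // Serialize an object to bytes and read it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(INSTANCE);
        // Deserializing an enum constant resolves to the existing constant.
        System.out.println("same instance after round trip: " + (copy == INSTANCE));
    }
}
```

An ordinary Serializable class would produce a second instance here unless you add readResolve(), which is part of the legwork mentioned above.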

jvm footprint: classes can dominate objects

P56 of The official [[java platform performance]], written by SUN java dev team, has pie charts showing that

  • a typical Large server app can have about 20% of heap usage taken up by classes, rather than objects.
  • a typical small or medium client app usually has more RAM used by classes than data, with up to 66% of heap usage taken up by classes.

The same page also says it’s possible to reduce class footprint.

cpu sharing among Docker container for jvm

Note cgroup is also usable beyond jvm and Docker, but i will just focus on jvm running in a Docker container..

Based on

CPU shares are the default CPU isolation (or sharing??) and basically provide a priority weighting across all cpu time slots across all cores.

The default weight value of any process is 1024, so if you start a container as follows q[ docker run -it --rm -c 512 stress ] it will receive fewer CPU cycles than a default process/container.

But how many cycles exactly? That depends on the overall set of processes running at that node. Let us consider two cgroups A and B.

sudo cgcreate -g cpu:A
sudo cgcreate -g cpu:B
sudo cgset -r cpu.shares=768 A   # cgroup A: 75%
sudo cgset -r cpu.shares=256 B   # cgroup B: 25%

Cgroup A has 768 CPU shares and cgroup B has 256. If both are busy and nothing else is running on the system, A receives 75% of the CPU time and B receives the remaining 25%.

If we remove cgroup A, then cgroup B would end up receiving 100% of CPU shares. has more precise details. compares q(nice), cpulimit and cgroups. It provides more precise info on cpu.shares.
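The arithmetic behind those percentages, as a small sketch. It assumes all listed cgroups are busy and competing for CPU at the same time. The class and method names are mine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CpuSharesMath {
    // Fraction for each cgroup = its shares / sum of shares of all competing cgroups.
    static Map<String, Double> fractions(Map<String, Integer> shares) {
        int total = 0;
        for (int s : shares.values()) {
            total += s;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : shares.entrySet()) {
            result.put(e.getKey(), (double) e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> shares = new LinkedHashMap<>();
        shares.put("A", 768);
        shares.put("B", 256);
        System.out.println("A and B competing: " + fractions(shares));

        shares.remove("A");
        System.out.println("B alone:           " + fractions(shares));
    }
}
```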

cpulimit can be used on an existing PID 1234:

cpulimit -l 50 -p 1234 # limit process 1234 to 50% of cpu timeslots. The remaining cpu timeslots can go to other processes or go to waste.

##Java9 features #fewer than java8

  1. #1 Most important – modular jars featuring declarative module-descriptors i.e. requires and exports
  2. #2 linux cgroup support.. For one example, see Docker/java9 cpu isolation/affinity
  3. #3 G1 becoming default JGC.. CMS JGC: deprecated in java9
  4. REPL JShell
  5. private interface methods, either static or non-static
  6. Minor: C++11 style collection factory methods like

List<String> strings = List.of("first", "second");
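A short sketch of items 5 and 6 (needs java9 or later). Greeter and its methods are hypothetical names for illustration.

```java
import java.util.List;

final class Java9Samples {
    interface Greeter {
        default String greet(String name) {
            return prefix() + name;
        }

        // java9: private non-static interface method, a helper for default methods
        private String prefix() {
            return "Hello, ";
        }

        static String shout(String s) {
            return normalize(s) + "!";
        }

        // java9: private static interface method, a helper for static methods
        private static String normalize(String s) {
            return s.trim().toUpperCase();
        }
    }

    public static void main(String[] args) {
        List<String> strings = List.of("first", "second");
        System.out.println(strings);

        try {
            strings.add("third"); // factory-made lists are unmodifiable
        } catch (UnsupportedOperationException e) {
            System.out.println("List.of() result cannot be modified");
        }

        Greeter g = new Greeter() { };
        System.out.println(g.greet("Ann"));
        System.out.println(Greeter.shout("  hi "));
    }
}
```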

It’s unbelievable but not uncommon in Java history —

  • Java9 release introduced significantly fewer and less impactful features than java8.
  • Similarly, java5 overshadows java6 and java7 combined


Docker(+java9)cpu affinity #since2006 is a 2018 article with some concrete examples demonstrating cpu isolation.

a Docker cgroup can specify a cpu-set (like core0 + core3 + core14) and limit itself to this cpu-set. Performance Motivation — preventing a process hopping between cores.

The “cpu-set” scheme provides conceptually simpler cpu isolation, but less popular than the “cpu-share” scheme.

Java9 offers support for cpu isolation if you adopt the cpu-set scheme but not the cpu-share scheme, as explained succinctly in the article.

A historical note — In 2006 (Mansion/Strategem) I spoke to a Sun Microsystems consultant. An individual Solaris “zone” can specify which cpu core to use. This is my first encounter with CPU isolation/affinity.

## JVM interceptions #mem,exception..%%hypothesis

Java provides a virtual machine to resemble a physical machine. To the upper-level[1] applications running therein[2], JVM provides interfaces similar to those of a physical machine. Through these interfaces, JVM intercepts runtime requests made by the hosted application, and provides powerful value-added services to the hosted application. Below is a subset of the interfaces I’m interested in.

Note when I say “intercept”, I mean the request is ultimately serviced by the host kernel of the physical machine. However, in many cases the JVM completes the requests in itself without calling into the host kernel.

  • memory allocation. Note de-allocation is not a legitimate request from a java app. The JVM is likely to handle allocations without calling into kernel. When allocation fails, the outcome is much better than in an unmanaged runtime.
  • memory access beyond an array or linked data structure. JVM probably forwards the request on, but the physical address is managed by the GC.
  • thread — management (including creation). Mostly native thread is used, so JVM forwards the requests to kernel, but I think JVM is very much in-the-loop and can provide instrumentation support
  • exception handling. In an un-managed environment, exception outcome can be inconsistent, unpredictable. Linux uses interrupts and signals (interrupts^signal ] kernel). JVM handles exceptions at a higher level, more effectively:)

[1] a layered view
[2] a “hosting” view. In this view, the JVM is a “container”. A docker container is cgroup that includes the JVM process

lazy singleton based@JVM dynamic classloading #Lea

In [[DougLea]] P86, this JVM expert pointed out that a simple “eager” singleton is eager in other language, but lazy in Java due to runtime on-demand class loading.

Specifically we mean a public[1] static final field. This initialization is thread-safe by default. Assuming the field is immediately accessed after class loading, this simple design is comparable to the familiar synchronized lazy singleton. What are the pros and cons?

Synchronized singleton requires more legwork (by developer) but
* It lets you pass computed ctor parameters at runtime. Note the singleton ctor is private but you can call getInstance(userInput).
* As hinted earlier, if you load the class but do not immediately use the instance, then the simple design incurs the expensive initialization cost too early.

[[DougLea]] was written before java5. With java5, [[EffJava]] advocates enum.

[1] DougLea actually prefers private field with public getter, for encapsulation.
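The two designs side by side, as a sketch. Eager and Lazy are hypothetical class names. The Eager form uses a plain static final field for brevity rather than the private-field-plus-getter form.

```java
final class Singletons {
    // Simple form: the instance is created when the JVM initializes this class,
    // which happens on first use, and class initialization is thread-safe.
    static final class Eager {
        static final Eager INSTANCE = new Eager();

        private Eager() {
        }
    }

    // Synchronized lazy form: more legwork, but the first caller can supply a runtime argument.
    static final class Lazy {
        private static Lazy instance;
        final String config;

        private Lazy(String config) {
            this.config = config;
        }

        static synchronized Lazy getInstance(String userInput) {
            if (instance == null) {
                instance = new Lazy(userInput); // only the first call's argument is used
            }
            return instance;
        }
    }

    public static void main(String[] args) {
        System.out.println("eager is one instance: " + (Eager.INSTANCE == Eager.INSTANCE));
        Lazy first = Lazy.getInstance("from-user-input");
        Lazy second = Lazy.getInstance("ignored");
        System.out.println("lazy is one instance:  " + (first == second) + ", config=" + second.config);
    }
}
```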

## mkt data: avoid byte-copying #NIO

I would say “avoid” or “eliminate” rather than “minimize” byte copying. Market data volume is gigabytes so we want and can design solutions to completely eliminate byte copying.

  • RTS uses reinterpret_cast but still there’s copying from kernel socket buffer to userland buffer.
  • Java NIO buffers can remove the copying between JVM heap and the socket buffer in C library. See P226 [[javaPerf]]
  • java autoboxing is highly unpopular for market data systems. Use byte arrays instead
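A minimal sketch of the NIO idea: decode fields in place from a direct ByteBuffer, with no intermediate byte[] copy and no boxing. The 20-byte message layout (id, price, size) and all names are assumptions for illustration, not any real feed format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class TickDecoder {
    // Hypothetical wire layout: 8-byte instrument id, 8-byte price in ticks, 4-byte size.
    static final int MESSAGE_BYTES = 20;

    // A direct buffer lives outside the java heap, so native I/O can fill it
    // without first copying through a heap array.
    static ByteBuffer newReceiveBuffer(int messages) {
        return ByteBuffer.allocateDirect(messages * MESSAGE_BYTES).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Stand-in for the network: append one message at the buffer's current position.
    static void encode(ByteBuffer buf, long id, long price, int size) {
        buf.putLong(id).putLong(price).putInt(size);
    }

    // Absolute reads: primitives come straight out of the buffer.
    static long priceAt(ByteBuffer buf, int messageIndex) {
        return buf.getLong(messageIndex * MESSAGE_BYTES + 8);
    }

    static int sizeAt(ByteBuffer buf, int messageIndex) {
        return buf.getInt(messageIndex * MESSAGE_BYTES + 16);
    }

    public static void main(String[] args) {
        ByteBuffer buf = newReceiveBuffer(2);
        encode(buf, 1L, 10050L, 300);
        encode(buf, 2L, 20075L, 500);
        System.out.println("second price=" + priceAt(buf, 1) + " size=" + sizeAt(buf, 1));
    }
}
```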

Arrays.sort(primitiveArray) beats List.sort() #defaultMethod

In terms of sorting performance, Arrays.sort(primitiveArray) is a few times faster than Collections.sort() even though both are O(N logN). My learning notes:

  • Arrays.sort(int []) is a dual-pivot quicksort, probably using random access
  • Arrays.sort(Object []) is a mergesort (TimSort since java7)
  • Collections.sort(List) defers to List.sort()
    • List.sort() is a Java8 default method in the interface. It copies data to an array then runs a mergesort
    • ArrayList overrides the default method, so no copying for ArrayList from java8 onwards

RandomAccess marker interface (ArrayList implements) is completely irrelevant. That’s because any subtype that provides RandomAccess can simply override (at source code level) the default method as demonstrated in This is cleaner than checking RandomAccess at runtime. One or Both designs could potentially be JIT-compiled to remove the runtime check.
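If you want to measure the gap yourself, here is a crude sketch. It is not a proper benchmark (no warm-up, single run), so treat the printed timings as indicative only. The class and helper names are mine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

final class SortComparison {
    static int[] randomInts(int n, long seed) {
        return new Random(seed).ints(n).toArray();
    }

    // Primitive path: Arrays.sort(int[])
    static int[] sortPrimitive(int[] data) {
        int[] copy = data.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Boxed path: ArrayList.sort() on Integer objects
    static List<Integer> sortBoxed(int[] data) {
        List<Integer> list = new ArrayList<>(data.length);
        for (int v : data) {
            list.add(v);
        }
        list.sort(null); // null comparator means natural ordering
        return list;
    }

    public static void main(String[] args) {
        int[] data = randomInts(1_000_000, 7L);

        long t0 = System.nanoTime();
        sortPrimitive(data);
        long t1 = System.nanoTime();
        sortBoxed(data);
        long t2 = System.nanoTime();

        System.out.println("Arrays.sort(int[]): " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("List.sort():        " + (t2 - t1) / 1_000_000 + " ms (includes boxing)");
    }
}
```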


RandomAccess #ArrayList sorting

Very few JDK containers implement the RandomAccess marker interface. I only know, and subclass Raw array isn’t.

Only subtypes can implement RandomAccess. Javadoc says

“The primary purpose of this interface is to allow generic algorithms to alter their behavior when applied to either random or sequential access lists.”

Q: which “generic algos” actually check RandomAccess?
AA: Collections.binarySearch() in
AA: to my surprise, Collections.sort() does NOT care about RandomAccess, so ArrayList sorting is no different from LinkedList sorting! See my blogpost Arrays.sort(primitiveArray) beat List.sort() has more details
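This is what “altering behavior” based on the marker interface looks like in user code. sum() is a made-up example, not something from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

final class RandomAccessCheck {
    // Pick the traversal style based on the marker interface.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i); // cheap indexed access
            }
        } else {
            for (int v : list) {
                total += v; // sequential access via iterator
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(arrayList);
        System.out.println("ArrayList  RandomAccess? " + (arrayList instanceof RandomAccess));
        System.out.println("LinkedList RandomAccess? " + (linkedList instanceof RandomAccess));
        System.out.println("sums: " + sum(arrayList) + ", " + sum(linkedList));
    }
}
```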


java generics wild cards – too many warnings/errors

Intro: If your project requires generic wild cards that’s too hard for your team’s knowledge level, then sooner or later you need to make a choice.

The complexity may grow out of hand. The compiler errors are non-trivial. Worse still, some Generics errors are runtime errors.

Sugg: see if you can remove generics completely from some classes. Use cast instead.

I feel in most cases, you only need to use "extends" and not "super". I think it can still be too hard.

Here’s one of my projects — the EventQueue project in the 2017 HSBC coding interview. I had to use generic wildcards like

Subscriber<T extends BaseMessage>

SubscriberFilter<T extends BaseMessage>

CallbackTask<T extends BaseMessage>

When we pass these objects into methods, we face annoying compiler errors or warnings. Most warnings are unnecessary warnings (I think compiler is not smart enough).

Some methods are designed for BaseMessage like …

Other methods are often designed for “T extends BaseMessage”

Yet other methods are designed for a specific subtype PriceMessage.

I feel it’s often easier to use the BaseMessage as argument type. If un-compilable, I often remove the type parameter.

Small tip: if “instanceof ArrayList” gives generics warning, then use ArrayList.class.isInstance().

small tip: use Subscriber<?> can sometimes suppress a warning

small tip: some casts can suppress a warning
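A cut-down sketch of the kinds of signatures involved. Only the names BaseMessage, PriceMessage and Subscriber come from my project. The method bodies and helper methods here are invented for illustration and are not the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WildcardSketch {
    static class BaseMessage {
        String text() {
            return "base";
        }
    }

    static class PriceMessage extends BaseMessage {
        @Override
        String text() {
            return "price";
        }
    }

    interface Subscriber<T extends BaseMessage> {
        void onMessage(T msg);
    }

    // Read-only use of the list: "? extends" accepts List<PriceMessage> too.
    static String firstText(List<? extends BaseMessage> messages) {
        return messages.get(0).text();
    }

    // Consumer of PriceMessage: "? super" accepts a Subscriber<BaseMessage> too.
    static void deliver(Subscriber<? super PriceMessage> subscriber, PriceMessage msg) {
        subscriber.onMessage(msg);
    }

    // The tip from the text: a reflective check instead of instanceof.
    static boolean isArrayList(Object o) {
        return ArrayList.class.isInstance(o);
    }

    public static void main(String[] args) {
        List<PriceMessage> prices = new ArrayList<>(Arrays.asList(new PriceMessage()));
        System.out.println(firstText(prices));

        Subscriber<BaseMessage> logAll = msg -> System.out.println("got " + msg.text());
        deliver(logAll, new PriceMessage());

        System.out.println(isArrayList(prices));
    }
}
```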

RMI class bytecode sync

Consider an object serialized and sent from hostA to hostB.

If the object is not a standard type like String or a collection, how can the receiving hostB reconstruct it? The class bytecode needs to be sent!

[[java the good part]], written by an RMI authority, gave explicit examples. The serialized stream includes metadata on the data object, which describes where to locate the corresponding class bytecode (of course on a bytecode server). On the receiving end, hostB would attempt to load the class bytecode locally. Failing that, hostB would download the bytecode.

Thanks to dynamic class loading, hostB can reconstruct an exact replica.

java interfaces have only abstract method@@ outdated]java8

Compared to c#, java language design was cleaner and simpler, at the cost of lower power and flexibility. C++ is the most flexible, powerful and complex among the trio.

There was never a strong reason to disallow static methods in an interface, but presumably they were disallowed (till java 7) for the sake of simplicity: “Methods in an interface are always abstract.” No ifs and buts about it.

With “default methods”, java 8 finally broke from tradition. Java 8 has to deal with Multiple Inheritance issue. See other blog posts.

In total, there are now 4 “deviations” from that simple rule, though some of them are widely considered irrelevant to the rule.

  1. a concrete nested class in an interface can have a concrete method, but it is not really a method _on_ that interface.
  2. Suppose an interface MyInterFace re-declares the toString() method of Object. That method isn’t really abstract.
    • There are very few reasons to do this.
  3. static methods
  4. default methods — the only real significant deviation from the rule
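All four deviations in one hypothetical interface. Shape, Unit and the method names are invented for illustration.

```java
final class InterfaceMethodKinds {
    interface Shape {
        double area(); // the classic abstract method

        // Deviation 2: re-declares Object.toString(); every implementing class already has a body for it.
        String toString();

        // Deviation 3: static method, called as Shape.unitSquare()
        static Shape unitSquare() {
            return () -> 1.0;
        }

        // Deviation 4: default method with a body
        default String describe() {
            return "shape with area " + area();
        }

        // Deviation 1: concrete nested class with concrete methods
        final class Unit implements Shape {
            @Override
            public double area() {
                return 1.0;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(Shape.unitSquare().describe());
        System.out.println(new Shape.Unit().describe());
    }
}
```

Note Shape still counts as a functional interface (so the lambda compiles), because a method matching a public method of Object is not counted as abstract.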


JVM = a bytecode interpreter + JIT compiler

I used to think the JVM is a layer on top of hardware and executes platform-independent bytecode against the hardware. The hardware components include

  • filesystems
  • network ports
  • CPU and memory
  • kernel threads
  • user input devices + screen

Consider assembly code. I guess assembly code deals directly with the same hardware components, with possible exception of threads.

(Not sure where the operating system kernel comes into play. See

Now I think JVM includes a JIT compiler that converts bytecode into assembly. See


java8 static methods ] interface

static methods in interface (SIM for short) is a minor feature of java8, fairly low-level and only interesting to java language students like me.

Two noteworthy points are raised in [[mastering lambdas]] P172 footnote (yes the footnote)

  1. A “traditional” static method (i.e. defined in a class) is inherited, but SIM is not inherited, Consequently …
  2. A “traditional” static method can be invoked using myObj.staticMeth1() but SIM can only be invoked using myInterface.staticMeth()

These two restrictions remove “loose” syntax around traditional static methods.

java8 default method break backward compatibility #HSBC

Among the java8 features, I think default methods are a more drastic and fundamental change than lambdas or streams, in terms of the language.

In my HSBC interview a London interviewer (Wais) challenged me and said that he thought default methods are designed for backward compatibility. I now think he was wrong.

—- Based on P171 [[Mastering Lambdas]]
Note the rare cases of incompatibility are an obscure (almost academic) concern. More important are the rules of method resolution when default methods are among the candidates. This topic is similar in spirit to popular interview questions around overriding vs overloading.

Backward compatibility (BWC) means that when an existing interface like includes a brand new default method, the existing “customer” source code should work as before. Default methods have a few known violations of BWC.

  • simplest case: all (incl. default) methods in an interface must be public. No ifs or buts.  Suppose Java7 MyConcreteClass has private m2() and implements MyIntf. What if MyIntf is now updated with a default method m2()? Compilation error!
  • a more serious case: java overriding rule (similar to c++) is very strict so m(int) vs m(long) is always, automatically overloading not overriding.  Consider a method call myObj.m(33). Originally, this binds to the m(long) declared in the class. Suppose the new default method is m(int) … an Overload! At compile time, this is seen as a better match so selected by compiler (not JVM runtime)… Silent, unexpected change in business logic and a violation of BWC!
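The “more serious case” can be reproduced in a few lines. Since one file cannot hold two versions of the same interface, I model “before” and “after” as two hypothetical interfaces MyIntfV1 and MyIntfV2. The legacy class body is identical in both.

```java
final class OverloadShiftDemo {
    // "Before": the interface has no m() at all.
    interface MyIntfV1 {
    }

    // "After": the interface gains a default method m(int).
    interface MyIntfV2 {
        default String m(int x) {
            return "interface m(int)";
        }
    }

    static class LegacyV1 implements MyIntfV1 {
        public String m(long x) {
            return "class m(long)";
        }
    }

    // Same legacy source, merely recompiled against the updated interface.
    static class LegacyV2 implements MyIntfV2 {
        public String m(long x) {
            return "class m(long)";
        }
    }

    public static void main(String[] args) {
        // The compiler picks the most specific applicable overload for the int literal 33.
        System.out.println("before: " + new LegacyV1().m(33));
        System.out.println("after:  " + new LegacyV2().m(33));
    }
}
```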

This refreshingly thin book gives 2 more examples. Its last example is a serious backward incompatibility issue but I am not convinced it is technically possible. Here’s my rationale —

Any legacy code relying on putIfAbsent() must have an implementation of putIfAbsent() somewhere in some legacy java7 class. Due to “class-wins-over-interface” rule, a new default method putIfAbsent() will never be chosen when compiling the legacy code using java8 tool chain.

linker error in java – example

[[mastering lambda]] points out one important scenario of java linker error. Can happen in java 1.4 or earlier. Here’s my recollection.

Say someone adds a method m1() to interface This new compiled code can coexist with lots of existing compiled code but there’s a hidden defect. Say someone else writes a consumer class using, and calls m1() on it. This would compile in a project having the new but no Again, this looks fine on the surface. At run time, there must be a concrete class when m1() runs. Suppose it’s a HashSet compiled long ago. This would hit a linker error, since HashSet doesn’t implement m1().


64-bit java — my own mini Q&A

Q: will 32-bit java apps run in 64-bit JVM?
A: Write once, run anywhere. Sun was extremely careful during the creation of the first 64-bit Java port to ensure Java binary and API compatibility, so all existing 100% pure Java programs would continue running just as they do under a 32-bit VM. However, non-pure java, like JNI, will break.
Q: 32bit apps need recompilation?
A: Unlike pure java apps, all native binary code that was written for a 32-bit VM must be recompiled for use in a 64-bit VM. All currently supported operating systems do not allow the mixing of 32 and 64-bit binaries or libraries within a single process.

Q: The primary advantage of running Java in a 64-bit environment?
A: larger address space. This allows for a much larger Java heap size and an increased maximum number of Java Threads.

Q: complications?
A: Any JNI native code in the 32-bit SDK implementation that relied on the old sizes of these data types is likely to require updating.
%%A: if java calls another program, maybe that program will need to be 64-bit compatible. This answer is slightly relevant.

Q: how is 32/64 bit JDK’s installed?
A: Solaris has both a 32 and 64-bit J2SE implementation contained within the same installation of Java, you can specify either version. If neither -d32 nor -d64 is specified, the default is to run in a 32-bit environment. All other platforms (Windows and Linux) contain separate 32 and 64-bit installation packages.

##what bad things can crash JVM

(Why bother? These are arcane details seldom discussed under the spotlight, but practically important in most java/c++ integrations.)

Most JVM exits happen with some uncaught exception or explicit System.exit(). These are soft-landings — you always know what actually killed it.

In contrast, the hard-landing exits result in a hs_err_pid.log file, which gives cryptic clues to the cause of death. For example, this message in the hs_err file is a null pointer in JNI —

siginfo: ExceptionCode=0xc0000005, reading address 0x00000000

Note this hs_err file is produced by a fatal error handler. However, if you pull the power plug, the FEH may not have a chance to run, and you get what I call an “unmanaged exit“. Unmanaged exit is rare. I have yet to see one.

People often ask what bad things could cause a hard landing? P79 [[javaPerformance]] mentions that FEH can fire due to

* fault in application JNI code
* fault in OS native code
* fault in JRE native code
* fault in the VM itself

caching thousands of java string literals(hardcoded)

Q: Suppose you have lots of java strings (typically up to 100 characters) in your JVM. Some are string literals, some are dynamic inputs from web, database/file or by messaging. You know many of the strings are recurring, such as column headers or individual English words from a file. You could use constant variables to represent column header names, but now we have too many (thousands of) such constant variables — impractical.
A: My basic solution is a cache in the form of a hashset which is internally a hashtable

    static String lookup(String input);

If input is found in the hashtable then we reuse it and avoid creating duplicate objects. This method is best with string literal inputs. Java automatically interns these literals so no redundant copies of literal string object even if you have lookup(“Column1”) in 200 classes.

Issue: indiscriminate usage — a colleague pointed out if lookup() is public, then other developers can abuse it and pass in strings that never re-occur. They just take up permanent memory for no benefits. One simple measure is another argument to remind developers —

    lookup(String input, boolean isRecurring);

Issue: large string — If we get a 800MB string we need to make a decision. If it’s reused often, then we should cache it somewhere. If it’s used only twice, then maybe recreate it each time. A simplistic solution is to add a length check in lookup(), and rename it to lookup1KB(). The places we know we may get 800MB strings, we use an alternative lookupSpecial() method.

Issue: large memory footprint — even if we check the string lengths in lookup1KB(), we can still get 9,000,000 entries. Most of these are due to the above-mentioned indiscriminate usage. We could add a hashtable size control, but I feel this tends to add latency, so not ideal for real time. My colleague pointed out supports LRU.
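A sketch of lookup1KB() with a size cap. The LRU eviction here uses LinkedHashMap in access order, which is my own choice for illustration and may or may not be what my colleague had in mind. The 1024-character cut-off and the capacity parameter are also assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StringCache {
    private static final int MAX_LENGTH = 1024; // assumed cut-off behind the name lookup1KB()
    private final Map<String, String> cache;

    StringCache(final int capacity) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity; // evict the least recently used entry
            }
        };
    }

    // Returns the canonical copy of input, so equal strings share one object.
    synchronized String lookup1KB(String input) {
        if (input.length() > MAX_LENGTH) {
            return input; // too large: do not cache
        }
        String cached = cache.get(input);
        if (cached != null) {
            return cached;
        }
        cache.put(input, input);
        return input;
    }

    synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        StringCache cache = new StringCache(1000);
        String a = new String("Column1"); // simulate a dynamic input
        String b = new String("Column1");
        System.out.println("same object reused: " + (cache.lookup1KB(a) == cache.lookup1KB(b)));
    }
}
```

The synchronized methods are the latency cost I worried about above.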

(How does the jvm string pool help???)

Q: why not use a bunch of string constants?
A: Even if we only have 200 of these literals, using these many constants can be inconvenient.
* lookup() shows you the exact spelling with spaces and cases. To convert these many literals to constants, you need to hand-craft a lot of variable names.
* what if the literals change? You would need to rename those variables.
* you may want to decouple the constant’s name vs the content. That can hurt readability, assuming I prefer to see the literals in source code.
* If in Class1 I already defined a constant SOME_LONG_STRING, and in Class2 I see “some long string” I would need to look to see if it’s already a constant.

java iterators – weakly consistent vs fail-fast

Roughly speaking, java iterators are either weakly consistent (WC) or fail-fast (FF). The “3rd” category is snapshot iterator for copy-on-write. See [[java generics]]

WC is in 1.5
FF is in 1.4.

WC don’t throw ConcurrentModificationException. This particular Exception is the “FAIL” part of “FAIL-fast”.
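The three behaviours in one sketch, using ArrayList (fail-fast), ConcurrentHashMap (weakly consistent) and CopyOnWriteArrayList (snapshot). The method names are mine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

final class IteratorKinds {
    // Fail-fast: a structural change during iteration triggers the exception.
    static boolean arrayListFailsFast() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            for (Integer i : list) {
                list.add(99);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Weakly consistent: modification during iteration is tolerated, no exception.
    static int concurrentMapSizeAfterPutDuringIteration() {
        Map<Integer, String> map = new ConcurrentHashMap<>();
        map.put(1, "a");
        map.put(2, "b");
        for (Integer key : map.keySet()) {
            map.put(999, "added during iteration");
        }
        return map.size();
    }

    // Snapshot: the iterator walks the array as it was when the iterator was created.
    static int copyOnWriteElementsSeen() {
        List<Integer> list = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        int seen = 0;
        for (Integer i : list) {
            list.add(i + 10);
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("ArrayList throws CME: " + arrayListFailsFast());
        System.out.println("ConcurrentHashMap size: " + concurrentMapSizeAfterPutDuringIteration());
        System.out.println("CopyOnWriteArrayList elements seen: " + copyOnWriteElementsSeen());
    }
}
```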

I feel STL iterators provide no thread-safety and fall into none of these 3 categories. Since there’s no consistent threading support in c++ standard library, such features must be implemented by yourself under a particular thread library.

How about c#?

abstract overriding concrete method

Real example P 95 [[ head first design patterns ]]

Q: abstract method overrid` concrete method@@
A: Yes (removed) says “An instance method that is not abstract can be overridden by an abstract method.” and gave examples. I think it can be used to remove “toxic” services and prevent a “consumer” from calling it by mistake.
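A sketch of the “remove a toxic service” idea. Service, SafeService and Concrete are hypothetical names.

```java
final class AbstractOverrideDemo {
    static class Service {
        String risky() {
            return "toxic default behaviour";
        }
    }

    // Re-declares the concrete method as abstract: subclasses no longer inherit
    // the toxic body silently and must supply their own.
    static abstract class SafeService extends Service {
        @Override
        abstract String risky();
    }

    static class Concrete extends SafeService {
        @Override
        String risky() {
            return "reviewed behaviour";
        }
    }

    public static void main(String[] args) {
        Service s = new Concrete();
        System.out.println(s.risky());
    }
}
```

A subclass of SafeService that forgets to implement risky() fails to compile unless it is itself abstract.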

a simple (tricky) sabotage on java debugger

If you rely heavily on a java debugger, beware of this insidious sabotage.

You could use finally blocks. You could surround everything with try/catch(Throwable). If all of these get skipped (rendering your debugger useless) and system silently terminates at inconsistent moments, as if by a divine intervention, then perhaps ….

perhaps you have a silent System.exit() in an obscure thread.

Let me make these pointers clear —
– System.exit() will override/ignore any finally block. What you put in finally blocks will not run in the face of System.exit()
– System.exit() will not trigger catch(Throwable) since no exception is thrown.
– System.exit() in any thread kills the entire JVM.

Q: Is JNI crash similar to System.exit()?
%%A: i think so.

Actually, in any context a silent System.exit() can be hard to track down when you look at the log.

(architect IV) what I wish java to have

— big wishes —
* NullPointerException — too many of these are thrown in production systems and can take hours of wild goose chase. Developers must be very very thorough, and adopt a lot of defensive coding habits. Java won’t help you.
* easier tools for byte code engineering
* easier reflection — Look at dynamic scripting languages
* programmatic class creation; runtime class creation.
* memory leak — hard to detect
* easier immutable objects — String is great but we need more

–small wishes
* simpler getter/setter — look at C# properties
* Bags as collections.
* serialization — is a bit murky. I feel this is an important area neglected by many developers, perhaps because it’s murky. Perhaps java can support a special debug-serialization so we can see what it does to a complicated object graph
* checked exception — is a mistake in many developers’ opinion. Narrow the scope of this construct.

class loaders — [[ weblogic definitive]]

P382 [[ weblogic definitive ]] is a good intro to java class loading. Generally concise and detailed. Still there are a few unclear points to bear in mind when studying it:

– pass up and pass down — when a class-loading request (for class J) comes to a child classloader C, C checks “its own memory” to see if J is loaded. Failing that, it sends the class-finding job UPSTAIRS to parent class loader P (and further up). If root class loader R can’t find it then R tries to load the class J. Failing that, R returns the class-loading job downstairs to P and to C.
– – > 1st classloader to attempt loading is always root classloader.

– The classpath classloader can ONLY load from the classpath. The extensions classloader is limited by JVM to only load from /jre/lib/ext/. In general most [2] classloaders are restricted to read a specific group of class files. Fundamental to delegation.
– – > corollary: Every parent is limited in its capability. When a child delegates to her parent some job that’s out of parent’s limits, child will do it herself.

– When a class-loading “job” comes in, it comes to a particular classloader. It doesn’t “come to the system”.

– When a class-loading request comes in, it comes with only a full classname. Each classloader [3] must *search* for the class file in jars and directories. Some beginners may assume the request comes labelled with physical address of the class file.

– It’s common to put a class file in 2 places, each visible to a classloader. Usually (if not always) only one classloader reads the class file and loads it into memory. If these 2 loaders are parent and child, then the parent loads it.

[1] by jvm
[2] if not every
[3] the immediate parent of the receiving classloader will try first, followed by the grandparent…

A few summary points:
– tree. 1:m mapping. one-parent-many-children
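You can print the parent chain from any class. chainOf() is a helper I made up. The concrete loader class names in the output depend on the JVM version and on how the program is launched.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {
    // Walk from a class's loader up through its parents.
    // A null loader stands for the bootstrap (root) loader.
    static List<String> chainOf(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("this class: " + chainOf(LoaderChain.class));
        System.out.println("String:     " + chainOf(String.class));
    }
}
```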

What’s so special about jvm portability cf python/perl #YJL

You have a very strong technical mind and I find it hard to convince you. Let’s try this story…

At a party, one guy mentions (quietly) “I flew over here in my helicopter …” 5 boys overhear and start talking “I too have a helicopter”. Well the truth is, either they are renting a helicopter, or their uncle used to have a helicopter, or their girlfriend is rich enough to own a helicopter, or they have an old 2nd hand helicopter, or they have a working helicopter for a university research project, or a toy helicopter.

It’s extremely hard to build a cross-platform bytecode interpreter that rivals native executable performance. Early JVM was about the same speed as perl. Current JVM easily exceeds perl and can sometimes surpass C.

In contrast, it’s much easier to build a cross-platform source code interpreter. Javascript, python, perl, php, BASIC, even C can claim that. But why do these languages pale against java in terms of portability? One of the key reasons is efficiency.

To convince yourself of the value of JVM portability, ultimately you need to see the limitations of dynamic scripting languages. I used them for years. Scripting languages are convenient and quick-turnaround, but why are they still a minor tool for most large systems? Why are they not taking the software world by storm?

Why is C still relevant? Because it’s low-level. Low-level means (the possibility of) maximum efficiency.  Why is MSOffice written in C/C++ and not VBA? Efficiency is a key reason. Why are most web servers written in C and not perl, not even java? Efficiency is a key reason.

Back to jvm portability. When I compile 2000 classes into a jar and download 200 other jars from vendors and free packages, I zip them up and get a complete zip of executables. If I have fully tested it on windows, then in many cases I don’t need to test it on unix. Compile once, run anywhere. We rely on this fact every day. Look at spring jars, hibernate jars, JDBC driver jars, xml parser jars, jms jars. Each jar in question has a single download for all platforms. I have not seen many perl downloads that are one-size-fits-all.

I doubt Python, php or other scripting languages offer that either.

(See comments below)

Sent: Sunday, June 26, 2011 8:14 PM
Subject: RE: What’s so special about jvm’s portability compared to python’s or perl’s?

If you treat JVM == the interpreter of php/python/perl/etc., then Java’s so called “binary code portability” is almost the same as those scripting languages’ “source code portability”.
[Bin ] I have to disagree. AMD engineered their instruction set to be identical to Intel’s. Any machine code produced for Intel runs on AMD too — hardware level portability.
That’s one extreme level of portability. Here’s another level — Almost any language, once proven on one platform, can be ported to other platforms, but only at the SCP (source-code-portable) level. Portability at different levels has very different values. High-level portability is cheap but less useful.

Java Bytecode is supposed to be much faster as a lot of type checking, method binding, access checking, address resolution.. were already completed at compile-time. Java bytecode looks like MOV, JMP, LOAD … and gives you some of the efficiency of machine code.

Another proof is: Java binary code (compiled using regular method) can be de-compiled into source code, which indicates that its “binary code” has almost 1-to-1 mapping to “source code”, which means its binary code is equal to source code.
[Bin ] I would probably disagree. The fastest form of java code is the JIT-compiled native code, which is probably not decompilable, I guess. For a sequence of instructions, the more “machine-like”, the faster it runs.

Well, you may want to argue JVM is better than the interpreter of those scripting languages, and I tend to agree. Java must have something that earned the heart of the enterprise application developers. Only that I haven’t found what it is yet 🙂

What’s so special about jvm portability cf python/perl, briefly@@

When I compile a class in windows, the binary class file is directly usable in unix. (Of course I must avoid calling OS commands.) I don’t think python or perl reached this level.

I feel dynamic, scripting languages are easier to make portable because they offer source-code portability (SCP), not binary code portability (BCP). In other words, BCP is tougher than SCP. I believe BCP is more powerful and valuable. BCP was the holy grail of compiler design and Java conquered it.

Due to the low entry barrier, some level of SCP is present in many scripting languages, but few (if any) other compiled languages offer BCP, because it’s tough. JVM is far ahead of the pack.

Even C is source-code portable, but C is known as a poorly portable language due to lack of binary portability.

Quiz: who can "intercept" returns from a java method

Jolt: When you put in a “return”, you think “method would exit right here and bypass everything below”, but think again! If you use a return in a try{} or catch{}, it is at the /mercy/ of finally{}.

P477 [[ thinking in java ]] Only one guy can stop a return statement from exiting a method. This guy is finally, also mentioned in one of my posts on try-catch-finally execution order.

Same deal for break, continue.
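
A minimal sketch of the jolt above (class and method names are invented for illustration): the first method shows a return inside try{} being overridden by a return in finally{}, and the second shows finally{} running before a break takes effect.

```java
class FinallyIntercept {
    @SuppressWarnings("finally")
    static int overriddenReturn() {
        try {
            return 1;          // looks like the method exits here...
        } finally {
            return 2;          // ...but finally runs first and its return wins
        }
    }

    static int interceptedBreak() {
        int finallyRuns = 0;
        for (int i = 0; i < 3; i++) {
            try {
                break;         // leaves the loop on the first pass
            } finally {
                finallyRuns++; // still runs before the break takes effect
            }
        }
        return finallyRuns;
    }

    public static void main(String[] args) {
        System.out.println(overriddenReturn()); // prints 2
        System.out.println(interceptedBreak()); // prints 1
    }
}
```

A return inside finally{} is legal but usually flagged by compilers and linters, which is why the sketch suppresses that warning.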

RMI skeleton/stub instantiation

In an RMI scenario, there are 2 + 1 processes: 2 on the server, 1 on the client.
Process ps1) the application jvm
Process ps2) registry process. On unix, you often start it by hand. Note PS1 and PS2 must be on the same host.
Process pc) client JVM

There are at least 3 java objects involved. Both skeleton and stub implement the same business interface as OB.
Object OB) the real business object
Object SK) the skeleton object
Object ST) the stub object

Let's see how these are created and linked.
1) skeleton is probably instantiated by exportObject() based on the OB object, inside PS1 JVM. This is a static method.
2) After export, Skeleton's address is then registered with the registry, using rebind() or bind(), both static methods.
3) UnicastRemoteObject probably has a static collection to hold OB and SK, to fend off garbage collection
4) Stub is created on demand in PC, by deserializing the skeleton object

low-level differences between HASA^ISA, in terms of function pointers

Some interviewer (MS or CS) asked me about the differences.

1) Subclass instance has the _same_address_ as base object, that’s why you can cast the ptr (or reference) up and down the inheritance hierarchy. See post on AOB ie address of basement.

2) Also, all inherited methods are available in the “collection” of function pointers of the derived object. In other words, derived object advertises all those inherited “features” or “services”, if you think of a family of interchangeable components. Derived object can _stand_in_ for the base TYPE.

In OO languages, a pure interface is basically a collection of function pointers. Has-a doesn’t expose/advertise the internal object’s function pointers, so the wrapper can’t “stand in” for that type.
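
Java hides the function pointers, but the "stand in" difference is still visible. A minimal sketch, with all type names invented for illustration: Derived (is-a) advertises the inherited Service method, while Wrapper (has-a) keeps its Base private and cannot be passed where a Service is expected.

```java
class IsaHasa {
    interface Service { String serve(); }

    static class Base implements Service {
        public String serve() { return "base"; }
    }

    // is-a: inherits serve(), so it advertises the Service "feature"
    static class Derived extends Base { }

    // has-a: uses a Base internally but does not expose its methods as its own
    static class Wrapper {
        private final Base inner = new Base();
        String use() { return inner.serve(); }
    }

    // true only when the object can stand in for the Service type
    static boolean canStandIn(Object o) { return o instanceof Service; }

    public static void main(String[] args) {
        System.out.println(canStandIn(new Derived())); // prints true
        System.out.println(canStandIn(new Wrapper())); // prints false
        System.out.println(new Wrapper().use());       // prints base
    }
}
```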

dark side of left shift operator – java

1) First, remember left-shift only operates on integer types.
2) Second, remember most if not all integer types use 2’s complement, where leading bit represents sign.

Now the dark side – Multiplication by 2 can move the original number between positive and negative universes.

In this context, Left shift is simpler to understand than multiplication. Left shift by 1 is equivalent to (xxx *2). Confirmed on Sun.

Q: how about multiplication by 4 or 3?
%%A: I feel multiplication by any number, even or odd, can give the same flipping problem whenever the product overflows.

Sun says –
If an integer multiplication overflows, then the result is the low-order bits (possibly with a leading ‘1’) of the mathematical product as represented in some sufficiently large two’s-complement format. As a result, if overflow occurs, then the sign of the result may not be the same as the sign of the mathematical product of the two operand values.
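
A small sketch of the flip, using the 32-bit int 2^30 as the starting value (the class and method names are made up): shifting left by 1 and multiplying by 2 give the same negative result, and multiplying by 3 overflows too.

```java
class ShiftOverflow {
    static int shiftLeftByOne(int x) { return x << 1; }
    static int timesTwo(int x) { return x * 2; }
    static int timesThree(int x) { return x * 3; }

    public static void main(String[] args) {
        int x = 0x40000000; // 2^30, a positive int
        System.out.println(shiftLeftByOne(x));                // prints -2147483648
        System.out.println(shiftLeftByOne(x) == timesTwo(x)); // prints true
        System.out.println(timesThree(x));                    // prints -1073741824
    }
}
```

When overflow must be detected rather than tolerated, Math.multiplyExact(int, int) throws ArithmeticException instead of wrapping.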

parameterized wrapper class with a superclass

(This looks like a low-level and widely usable implementation pattern. If it is, then we better learn to read it. However, I don’t see many java guys create generic classes.)


A wrapper class’s behavior is affected by 3 types of things

  • The “kernel” fields wrapped in the wrapper. In addition, there are often helper fields, too.
  • The parent class
  • The type param. Remember this class is parameterized.


This looks simple until you see it used in a complex system. You need to remember how these “influences” are set.


public class EntityCacheListenerAdapter<K, V extends Entity, T extends EntityListener>
            extends AbstractCacheListenerAdapter {

      public EntityCacheListenerAdapter(T target) {
            super(target); // populating
      }

      public void afterUpdate(EntryEvent event) {
            super.getTarget() // inherited from parent
             .entityUpdated( // this method is from the kernel field
                    event.getOldValue(), event.getNewValue());
      }
}


subvert: field initializer + constructors #java

I had an apparently watertight base bean class with a field initializer

public final Map properties = Collections.EMPTY_MAP;

Apparently every instance should have an immutable empty map? You will see how this “watertight guarantee” is compromised.

Subclass adds no field. However, at runtime, I found an object of the subclass with this field == a Hashtable, even populated with data. How could JVM allow it?

More (ineffective) chokepoint — There is only one line of code that “puts” into this Map. It’s in a private method and I added a throw new RuntimeException() //to ensure it never adds any data.

More (ineffective) chokepoint — There’s only one constructor for the base class. I put some println() which, surprisingly, didn’t run.

Short Answer – casting an object received by de-serialization.

Long answer — These bean classes are DTO classes for RMI. Server side is still using the old version, so it conjures up objects with == a populated Hashtable and serializes it to client JVM. Client de-serializes only with respect to the declared type, bypassing field initializer or constructors. So long as the incoming stream can convert to a Map, it’s successfully de-serialized.
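
The bypass can be reproduced inside one JVM. This is only a sketch with invented names (DeserializeBypass, Bean, marker), not the original DTO classes: a transient field with an initializer comes back as 0, and the constructor counter does not move during de-serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class DeserializeBypass {
    static int ctorCalls = 0;

    static class Bean implements Serializable {
        private static final long serialVersionUID = 1L;
        transient int marker = 42; // field initializer; transient, so not carried in the stream
        final String name;

        Bean(String name) {
            this.name = name;
            ctorCalls++;
        }
    }

    // Serializes to a byte array and reads the object back, all in memory.
    static Bean roundTrip(Bean original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Bean) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Bean copy = roundTrip(new Bean("dto"));
        System.out.println(ctorCalls);   // prints 1: only the explicit "new" ran the constructor
        System.out.println(copy.marker); // prints 0: the field initializer did not run for the copy
        System.out.println(copy.name);   // prints dto: state came from the stream
    }
}
```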

stepping through class/object loading, Take 2

– – – a story/hypothesis to be verified. See P240,113 [[Practical Java]] and P28,30 [[Java Precisely]] – – –

base static initializer and static initializer BLOCK run, in the order of appearance
child static initializer and static initializer block run, in the order of appearance
(see P30 [[java precisely]])

^^ milestone: classes fully loaded.

child and base instance field half-initialized to defaults — null, 0.0, false,..

^^ milestone: dummy C object allocated, which contains a dummy B object

child constructor C() *entered*
base constructor B() *entered*, as first statement in C(…)
base constructor may call an overridden method m1(), child’s m1(), so child’s m1() runs, with child’s instance fields half initialized! Note in C++, B::m1() runs. See

^^ milestone: base constructor B() returns.

child’s instance field initializers run. All fields fully initialized as programmed.
remaining statements in C() run
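
The story above can be checked with a small logging sketch. B and C follow the naming in the steps; everything else (InitOrderDemo, LOG, the field x) is invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class B {
        static { LOG.add("B static"); }
        B() {
            LOG.add("B() entered");
            m1(); // dispatches to the child's override
        }
        void m1() { LOG.add("B.m1"); }
    }

    static class C extends B {
        static { LOG.add("C static"); }
        int x = 7; // instance field initializer, runs after B() returns
        C() { LOG.add("C() body, x=" + x); }
        @Override void m1() { LOG.add("C.m1 sees x=" + x); }
    }

    // Creates one C and returns a snapshot of everything logged so far.
    static List<String> run() {
        new C();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

On the first run the log reads: B static, C static, B() entered, C.m1 sees x=0, C() body, x=7. The static lines appear only once per JVM because each class is initialized only once.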

static nested classes unnecessary@@

As stated in another post, I always start my nested class as “private static” and relax gradually when justified.

Now, some say static nested classes can always be pulled out as first-class citizens i.e. top level classes. No. A major feature (perhaps 2) I rely on everyday is the private access modifier in the context of nested classes.

Nested class can refer to private members (fields/methods) of the enclosing class; enclosing class can refer to private members of the nested class.
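
A minimal sketch of that two-way private access, with invented names (Vault, Keeper): the nested class reads a private field of the enclosing class, and the enclosing class calls the nested class's private constructor and reads its private field.

```java
class Vault {
    private int secret = 1;

    private static class Keeper {
        private int hidden = 2;

        private Keeper() { }

        // nested class reaching into the enclosing class's private field
        int peek(Vault v) { return v.secret; }
    }

    static int demo() {
        Keeper k = new Keeper();              // private constructor, callable from the enclosing class
        return k.hidden + k.peek(new Vault()); // enclosing class reads the nested class's private field
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3
    }
}
```

Pulling Keeper out as a top-level class would force both private members to be relaxed to package-private or wider.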

java nested class — static^non-static

See post on [[philosophy of nested classes]].

Let’s put aside anon classes.

By default, all my nested classes start with the “static” keyword.

Practically, whenever the nested class needs access to the enclosing class’s instance fields (outerInt1 for eg) I remove the “static” keyword and make the class non-static, effectively adding an implicit “Outer.this” reference into the nested class. Nested class can then access outerInt1 via Outer.this.outerInt1, where Outer is the enclosing class’s name.
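
A sketch of the two flavours side by side. Only outerInt1 comes from the text; the class names are made up. The inner class reaches the field through the implicit reference, while the static nested class needs an instance handed to it.

```java
class Enclosing {
    private int outerInt1 = 5;

    // non-static: every Inner instance carries a hidden reference to an Enclosing instance
    class Inner {
        int read() { return Enclosing.this.outerInt1; }
    }

    // static: no hidden reference, so the enclosing instance must be passed in
    static class StaticNested {
        int read(Enclosing e) { return e.outerInt1; }
    }

    public static void main(String[] args) {
        Enclosing outer = new Enclosing();
        Enclosing.Inner inner = outer.new Inner(); // an Inner cannot exist without an outer instance
        System.out.println(inner.read());                   // prints 5
        System.out.println(new StaticNested().read(outer)); // prints 5
    }
}
```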


java private nested classes: my habits

A major justification and usage of nested class is constructor access control. To appreciate, first appreciate immutables, final, singletons, protected, …. Note you can finish most projects without using these access control devices, but I tend to spend extra time adding such access controls. In complex concurrent systems, they reduce risk and add much-needed reassurance. Constructor access control is one of the controls.

Majority of my nested classes have only private constructors. I usually call them from the enclosing class.

Furthermore, all my nested classes start off as private static classes, and become protected only when necessary.

public static nested class


(another blog)

Just as constants can be defined in classes but better defined in interfaces, public nested static classes are better defined in interfaces. I feel this might be a best practice. It's more reusable and accessible.

More importantly, this is more readable than if defined in a class. When we define a public static nested class in an enclosing class, it appears to be tied to that class, but I feel that's an illusion. By putting the class in an interface, it's clearly presented as part of an open, shared interface and not tied to any object in the design.

However, I don't really know why we need public static nested classes at all. They look completely unnecessary.

In general, I feel anything that can go into classes or interfaces had better go into interfaces.

Coming back to nested classes, experts say static is better than non-static, if we have a choice. I agree, on the basis of readability, semantics, flexibility and loose coupling.

philosophy of java nested classes

“inner classes” = non-static nested classes. For beginners, let’s focus on typical scenarios:
1) encloses N as a *private* inner class.
2) has a field n1 of type N

Q: how would you use an instance of N? What does this object in memory represent?
A: a component of an object c (of type C).

Internalize — this.n1 is very similar to … a regular field, whereas N.class is more like a regular class than a C field. Allow me to repeat — Whenever you look at an inner class N, it’s very similar to a regular class.

Q: what does this.n1 resemble most? A field in an instance (c) of C?
A: No. Suppose a field j (of type J) in C has method j.m1(). m1() can’t access C’s private members, but N’s methods can.
A: I think this.n1 most resembles a sophisticated and “trusted field”, a regular field like j + the additional trust. The trust means this.n1 can have methods to access C’s private members.

Q: is such a construct unnecessary, i.e. can the same thing always be achieved using regular OO constructs?
A: No. The “trust” is hard to achieve otherwise.

Q: where do you instantiate N? How do you pass the instance around? How do you call N’s instance methods?
A: all inside C. Outside C, no other objects can see N

Q: how about a public (instead of private) inner class N2? What’s the use case or justification?
A: I would make N2’s constructor private, so outer class is the only access point. So an instance (n2) of N2 becomes a slave object dedicated to the outer object. Note trust still applies.

Now for static nested class S declared in
Q: how would you use an instance of S? What does this object in memory represent?
A: not part of a C instance.
Q: what does this instance resemble most? A static field in an instance (c) of C?
A: No. I think in some usages this instance most resembles a better static method wrapper. You can group static methods in C and move them into S and make them non-static [1] inside S. An alternative design is the System.out pattern: put (converting to non-static) C’s static methods [2] into a regular class A, and create an A instance as C’s static field. However, static nested class S (not A) can access C’s private static members.
A: in the case of, static nested class resembles….?

[2] they lose access to C’s private static members.

[1] I think most methods in S should be non-static. Static methods in a static nested class are a waste of time.

Q: where do you instantiate S? How do you pass the instance around? How do you call S’s instance methods?
A: instantiate in a static method in C. You can also do so in a static initializer or a non-static method.

swig vs javah, briefly

In the JNI world,

A “native method” (NM) is basically the skeleton one-liner method prototype in a *.java source file. Related to it …
a “native Function” (NF) is a special C function implemented using the JNI types (included from jni.h + jni_md.h) and conforming to JNI standard.

This native function is often a wrapper over an Existing Function (EF) written in unconstrained C and has the real business logic.

SWIG will wrap C++ EF, whereas javah command takes NM and creates C/C++ NF but in the form of a forward declaration (a.k.a. prototype). Based on that NF you then implement business logic.

Swig starts from C++ EF; javah starts from NM.

Swig generates a lot of layers (the “entire stack”) — including pure C++ wrapper classes all the way to java proxy classes and anything (if any) in between. In contrast, javah generates just one of the layers (see above.)

4 ways to set jni search path

1) On all platforms, -Djava.library.path is all you need. But if you can’t modify the java command line, then there are a few alternatives

On *nix,
2) $LD_LIBRARY_PATH can include the directory of your

On windows,
3a) if YourFile.dll is in the current directory, then it will be automatically  picked up

3b) otherwise, include its folder in %PATH%, which typically includes ….\bin folders and not …\lib, so be prepared
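
To check what the JVM actually ended up with, a small sketch (the class name is made up) can print the java.library.path system property. My understanding, stated as an assumption rather than verified here, is that when -Djava.library.path is absent the JVM seeds this property from the platform's environment, which would explain why options 2 and 3b work.

```java
import java.io.File;

class LibraryPathPeek {
    // Splits the effective native-library search path into its directories.
    static String[] searchDirs() {
        String raw = System.getProperty("java.library.path", "");
        return raw.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        for (String dir : searchDirs()) {
            System.out.println(dir);
        }
    }
}
```

Running it with and without -Djava.library.path on the command line shows which of the settings above took effect.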

See P 17[[ JNI ]], which is on

subverting java private constructor #some c++

Private constructor is not watertight. Among the consequences, singletons must be carefully examined.

  • reflection
  • ObjectInputStream.readObject() — see other posts
  • de-serialization can instantiate the class without constructors (or field initializers).
    • RMI, EJB, JMS, Web service all use serialization
    • any time we copy an object across 2 jvm processes
  • if you see a private constructor, don’t feel safe — the class may be a nested class. In that case the enclosing class can call the private constructor. Worse still, another nested class (sister class) can also call the private constructor. And enclosed classes too. In summary all of these classes can call my private ctor —
    • ** my enclosing class
    • ** my “sister” classes ie other classes enclosed by my enclosing class
    • ** my “children” ie enclosed class including anonymous classes
    • ** my “grand-children” classes
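
The reflection route can be sketched in a few lines. OnlyOne and ReflectiveBreak are hypothetical names. This works in a plain classpath program; a security manager or module encapsulation could block setAccessible(), so treat it as an illustration under that assumption.

```java
import java.lang.reflect.Constructor;

// A would-be singleton guarded only by a private constructor.
class OnlyOne {
    static final OnlyOne INSTANCE = new OnlyOne();
    private OnlyOne() { }
}

class ReflectiveBreak {
    // Creates a second instance by switching off the access check on the private constructor.
    static OnlyOne second() {
        try {
            Constructor<OnlyOne> ctor = OnlyOne.class.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(second() == OnlyOne.INSTANCE); // prints false: there are now two instances
    }
}
```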

— in c++

  • Friend function and friend class can call my private ctor.
    • Factory as friend
  • static method is frequently used by singletons

writeObject() invoked although never declared in any supertype

Usually, a common behavior must be “declared” in a supertype. If base type declares method m1(), then anyone having a B pointer can invoke m1() on our object, which could be a B subtype.

However, writeObject(ObjectOutputStream) [and readObject] is different. You can create a direct subtype of and put a private writeObject() in it. Say you have an object myOb and you serialize it. In the classic Hollywood tradition, Hollywood calls myOb.writeObject(), even though this method is private and never declared in any supertype. Trick is reflection — Hollywood looks up the method named writeObject —

writeObjectMethod = getPrivateMethod(cl, “writeObject”, …
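
A sketch to see Hollywood make the call (HollywoodDemo, MyOb and the counter are invented for illustration): the private writeObject() below is never invoked by our own code, yet the counter goes up when the object is serialized.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class HollywoodDemo {
    static int calls = 0;

    static class MyOb implements Serializable {
        private static final long serialVersionUID = 1L;
        int v = 3;

        // private and not declared in any supertype, found by the serialization machinery
        private void writeObject(ObjectOutputStream out) throws IOException {
            calls++;
            out.defaultWriteObject(); // write the ordinary fields as usual
        }
    }

    // Serializes one MyOb and reports how many times the private hook was invoked.
    static int serializeOnce() {
        int before = calls;
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new MyOb());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls - before;
    }

    public static void main(String[] args) {
        System.out.println(serializeOnce()); // prints 1
    }
}
```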

##java magic across domains

“Any sufficiently advanced technology is indistinguishable from magic.” If you don’t know these technologies, you can’t imagine how some things are achievable.

* runtime code generation — cglib and proxies (since jdk 1.3)
* bytecode engineering
* reflection

Also powerful:
* threads in private inner classes
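
As a taste of the proxy item, here is a minimal sketch using java.lang.reflect.Proxy from the JDK. The Greeter interface and the handler logic are made up. No class implementing Greeter exists in the source; one is generated at runtime.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxyDemo {
    interface Greeter { String greet(String name); }

    // Builds a Greeter whose every method call is routed to a single handler.
    static Greeter make() {
        InvocationHandler handler =
            (proxy, method, args) -> "hello " + args[0] + " via " + method.getName();
        return (Greeter) Proxy.newProxyInstance(
            Greeter.class.getClassLoader(),
            new Class<?>[] { Greeter.class },
            handler);
    }

    public static void main(String[] args) {
        System.out.println(make().greet("x")); // prints hello x via greet
    }
}
```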

stepping through class/object loading

Based on P28, 30 [[ Java Precisely ]], and P110 [[practical java]]. There are dozens of important details [1]. Here we cover a few interesting observations. Assuming class C extends B, extending A.

Step: static initializer blocks and field initializers run, in order of appearance. Once static fields are initialized, they are available for use by all including static methods.

Step: static methods loaded and available to be called from the call-stack

— By this /milestone/, the class is “loaded” with all static stuff ready —

Note: Before any C-specific initializations, B() always *completes its steps* and returns a complete B object, to be wrapped in the onion.

Let’s skip ahead and look at…
Step K1: A’s instance field initializers and instance initializer blocks run, in order of appearance. These always appear outside the constructor.

Step K2: A() statements.

Note: By this time, no B state-initialization[2]. However, A() statements could call a B method — see [[baseclass dependent on subclass]]

Repeat K1 and K2 for B, and then C

[1] see Example 60 [[ Java Precisely ]].
[2] i think this is obvious. loading B’s method definition doesn’t count.
[3] Obviously, Object() and A() must complete beforehand.

java: static method inherited even if hidden

–I guess not inherited:
a static method m1 is tied to the super class. I think a subclass can only define another m1 to hide it.

Refer to the other post on “superclass instance inside subclass instance”. Static members are not INSIDE the onion and not inherited.

— Now I believe yes inherited:
P 22 [[ java precisely ]] says “inherits all methods … but not the constructors”
p 45 has a real example to prove that the superclass static method is still accessible even if shadowed (hidden) by a subclass static method (of the same signature, of course)
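
A small sketch in the same spirit (not the book's example; all names are invented): m2 is reachable through the subclass name, so it is inherited, while m1 is hidden by the subclass but still reachable through the superclass name.

```java
class StaticHide {
    static class Base {
        static String m1() { return "Base.m1"; }
        static String m2() { return "Base.m2"; }
    }

    static class Sub extends Base {
        static String m1() { return "Sub.m1"; } // hides Base.m1, does not override it
    }

    public static void main(String[] args) {
        System.out.println(Sub.m2());  // prints Base.m2: inherited static method
        System.out.println(Sub.m1());  // prints Sub.m1: the hiding method
        System.out.println(Base.m1()); // prints Base.m1: the hidden method is still accessible
    }
}
```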

finally execution is so-called guaranteed UNLESS

[24 May 2007] guaranteed by JVM.

q: unless system powers off before it can?
A: agree. since the guarantee is given by jvm.

q7: when does my finally run if my catch says exit()?
a7: assuming exit() means System.exit(), finally does not run. System.exit() never returns normally, so the JVM shuts down before finally{} gets a chance to do a return and /disarm/ the bomb.

q: unless the fatal error leads to an immediate jvm crash?
A: I don’t think so. If JVM can manage the fatal error, it can run finally{}.

q: what if there’s an endless loop in the try or catch?
A: then finally{} is expected to be delayed. JVM will honor the class owner’s expectations.

q690: when does my finally run in the presence of my catchers and also catchers up on the call stack?
a690: right before the search for a /matching catch/ in the stack? see other blog posts on try/catch/finally execution order

if constructor throws ..

myInstance = new MyClass() ;

Will myInstance become null or …?

I feel the assignment should leave myInstance unchanged. A constructor (strictly not a “method”) that throws, like a method that throws, won’t return anything to the caller, so the assignment never happens. The constructor, the caller, and upstream callers may each be aborted.
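
A quick sketch to confirm (the boolean flag on the constructor is invented so the same class can succeed or throw): after the failed "new", the variable still points at the object it held before.

```java
class CtorThrows {
    static class MyClass {
        MyClass(boolean fail) {
            if (fail) {
                throw new IllegalStateException("boom");
            }
        }
    }

    // Returns true if the variable still refers to the original object after a throwing constructor.
    static boolean unchangedAfterThrow() {
        MyClass original = new MyClass(false);
        MyClass myInstance = original;
        try {
            myInstance = new MyClass(true); // constructor throws, so the assignment is skipped
        } catch (IllegalStateException e) {
            // swallowed on purpose; we only care about myInstance
        }
        return myInstance == original;
    }

    public static void main(String[] args) {
        System.out.println(unchangedAfterThrow()); // prints true
    }
}
```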

See blog on exceptions in call-stack


For c++, a throwing ctor is common. If the object is being created on the heap, the memory allocated by the new-expression is released automatically.

portableremote.narrow() unable to cast between objects loaded by 2 class loaders


mscope.jar should not include com/titan/**/*.class, so AdaptiveClassLoader [1] won’t load them.

Which classloader will load these classes? By default, classes mentioned on the classpath are loaded by the default classpath classloader.

[1] This is a custom class loader to load from mscope.jar. It’s a descendant of the classpath classloader.

javac^JIT – 2 compilers #priming

“compile->compile” rather than “compile-interpret”. More specifically,

1) Compile time — java source code is first compiled (by javac) to platform-independent byte code. You can move these *.class files to any machine of any arch. However, by definition, platform-independent code is not optimized for any platform.
2) Run time – byte code is compiled to platform-specific machine code “just in time”. The machine code is similar to that produced by gcc.

This is rather essential knowledge. I think it is sometimes quizzed in interviews.

O’Reilly [[java performance]] P75 has a concise introduction of Hot-Spot JIT. The JIT only compiles the most critical “hotspots”, not those parts of the byte code executed once only. Therefore, the JIT needs some priming (i.e. warm-up) phase, during which it collects statistics about each code chunk, while executing the half-cooked byte code from Phase One. Based on the statistics it makes heuristic optimizations, as illustrated on P76.

Note it’s possible to execute the byte code from Phase One without second-phase compilation. This is actually optimal when the methods are executed only once — interpreting is faster for them!
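
A rough way to watch priming at work is to time the same method before and after calling it many times. This is only a sketch with invented names and arbitrary loop counts. The timings depend on the JVM, its flags and the machine, so it prints them without promising any particular ratio.

```java
class WarmUpSketch {
    // Some arbitrary CPU-bound work: sums i % 7 for i in [0, n).
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    // Wall-clock nanoseconds for a single call of work(n).
    static long timeNanos(int n) {
        long start = System.nanoTime();
        work(n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long cold = timeNanos(10_000);       // first call, most likely interpreted
        for (int i = 0; i < 2_000; i++) {
            work(10_000);                    // "throw some data at the JVM"
        }
        long warm = timeNanos(10_000);       // by now the JIT has probably compiled work()
        System.out.println("cold ns=" + cold + ", warm ns=" + warm);
    }
}
```

Running it with the -Xint flag (interpreter only) should make the two numbers much closer, which is one way to see the second compiler's contribution.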

This description is consistent with Ab-Initio architect who said in 2012 (not in his original words) that java system needed to consume some typical input data in priming, before throughput became comparable to c++. He said you need to “throw some data at the JVM”.

## how might jvm surpass c++]latency #MS #priming discussed priming esp. in trading.