java singleton^immutable classes #enum

[[effJava]] explains that an immutable class needs no copying.

However, we don’t need to work hard trying to make such classes singletons.

If an immutable class happens to be used as one-instance data type, then lucky. But tomorrow it may get instantiated twice.. no worries 🙂

If boss asks you to make this class singleton, you should point out the legwork required and the numerous loopholes to fix before achieving that goal. Worthwhile?

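Java enum types are special. The JVM guarantees that each enum constant is a singleton, even in the face of serialization. Immutability is not automatic though: an enum is immutable only if all its fields are final and themselves immutable. See P 311.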
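A minimal sketch of the serialization guarantee, assuming a hypothetical one-element enum named Registry. The round trip through ObjectOutputStream/ObjectInputStream hands back the very same instance rather than a copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class EnumSingletonDemo {
    // Hypothetical single-element enum used as a singleton.
    enum Registry { INSTANCE }

    // Serializes an object into a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Registry.INSTANCE);
        System.out.println(copy == Registry.INSTANCE); // prints true
    }
}
```

An ordinary Serializable class would need a readResolve() method to achieve the same effect.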


##java heap allocation+!explicit q[new]

Most of these are java compiler tricks.

  • (pre-runtime) enum instance instantiation — probably at class-loading time
    • P62 [[java precisely]]
  • String myStr = "string1"; // see string pool blogpost
    • P11 [[java precisely]]
    • NOT anonymous temp object like in c++
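  • "string1" + "str2" is similar to myStr.concat(..) when an operand is not a compile-time constant, so a method can often new up an object like this. (If both operands are compile-time constants, javac folds them into one pooled literal.)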
    • P10/11 [[java precisely]]
  • boxing
  • (most tricky) array initialization
    • int[] days ={31,28,31/* instantiates the array on heap */};
    • P17 [[java precisely]] has examples
    • P16 [[java precisely]] also shows an alternative syntax “new int[]”
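The cases above can be sketched as follows. The class and method names are my own for illustration. Each method exercises one way of getting a heap object without writing q[new].

```java
class ImplicitAllocationDemo {
    // Two identical literals resolve to one pooled String object.
    static boolean literalsArePooled() {
        String a = "string1";
        String b = "string1";
        return a == b;
    }

    // Concatenation with a non-constant operand creates a new String at run time.
    static boolean runtimeConcatIsNewObject() {
        String prefix = "string";      // not final, so not a compile-time constant
        String joined = prefix + 1;
        return joined.equals("string1") && joined != "string1";
    }

    // Autoboxing compiles to Integer.valueOf(), which caches small values.
    static boolean smallBoxedValuesAreCached() {
        Integer x = 42;
        Integer y = 42;
        return x == y;
    }

    // An array initializer allocates the array on the heap without the "new" keyword.
    static int arrayInitializerLength() {
        int[] days = {31, 28, 31};
        return days.length;
    }

    public static void main(String[] args) {
        System.out.println(literalsArePooled());          // true
        System.out.println(runtimeConcatIsNewObject());   // true
        System.out.println(smallBoxedValuesAreCached());  // true
        System.out.println(arrayInitializerLength());     // 3
    }
}
```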

jvm footprint: classes can dominate objects

P56 of The official [[java platform performance]], written by SUN java dev team, has pie charts showing that

  • a typical Large server app can have about 20% of heap usage taken up by classes, rather than objects.
  • a typical small or medium client app usually has more RAM used by classes than data, with up to 66% of heap usage taken up by classes.

The same page also says it’s possible to reduce class footprint.

cpu sharing among Docker container for jvm

Note cgroup is also usable beyond jvm and Docker, but i will just focus on jvm running in a Docker container..

Based on https://jaxenter.com/nobody-puts-java-container-139373.html

CPU shares are the default CPU isolation (or sharing??) and basically provide a priority weighting across all cpu time slots across all cores.

The default weight value of any process is 1024, so if you start a container as follows q[ docker run -it --rm -c 512 stress ] it will receive fewer CPU cycles than a default process/container.

But how many cycles exactly? That depends on the overall set of processes running at that node. Let us consider two cgroups A and B.

sudo cgcreate -g cpu:A
sudo cgcreate -g cpu:B
sudo cgset -r cpu.shares=768 A   # cgroup A: 75%
sudo cgset -r cpu.shares=256 B   # cgroup B: 25%

Cgroup A has 768 CPU shares and cgroup B has 256. If nothing else is running on the system, A receives 75% of the CPU time and B the remaining 25%.

If we remove cgroup A, then cgroup B would end up receiving 100% of CPU shares.
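The arithmetic is simply "my shares divided by the total shares of all busy cgroups". A tiny sketch, where the helper name fraction() is my own:

```java
class CpuShareDemo {
    // Fraction of CPU time a cgroup gets when it and all competitors are busy.
    static double fraction(int myShares, int... competingShares) {
        long total = myShares;
        for (int s : competingShares) {
            total += s;
        }
        return (double) myShares / total;
    }

    public static void main(String[] args) {
        System.out.println(fraction(768, 256)); // A competing with B: 0.75
        System.out.println(fraction(256, 768)); // B competing with A: 0.25
        System.out.println(fraction(256));      // B alone: 1.0
    }
}
```

Note this is a weighting under contention only. An idle cgroup does not reserve its share.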

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/resource_management_guide/sec-cpu has more precise details.

https://scoutapp.com/blog/restricting-process-cpu-usage-using-nice-cpulimit-and-cgroups compares q(nice), cpulimit and cgroups. It provides more precise info on cpu.shares.

cpulimit can be used on an existing PID 1234:

cpulimit -l 50 -p 1234 # limit process 1234 to 50% of cpu timeslots. The remaining cpu timeslots can go to other processes or go to waste.

##Java9 features #fewer than java8

  1. #1 Most important – modular jars featuring declarative module-descriptors i.e. requires and exports
  2. #2 linux cgroup support.. For one example, see Docker/java9 cpu isolation/affinity
  3. #3 G1 becoming default JGC.. CMS JGC: deprecated in java9
  4. REPL JShell
  5. private interface methods, either static or non-static
  6. Minor: C++11 style collection factory methods like

List<String> strings = List.of("first", "second");
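A short sketch of two of the smaller features, assuming a Java 9+ compiler. The Greeter interface is hypothetical. It shows a private interface method shared by a default method, and the immutability of a List.of() result.

```java
import java.util.List;

class Java9FeaturesDemo {
    interface Greeter {
        default String greet(String name) {
            return decorate("Hello " + name);
        }

        // Java 9: private helper, usable by default methods but invisible to implementors.
        private String decorate(String s) {
            return "[" + s + "]";
        }
    }

    // Collections produced by the factory methods reject modification.
    static boolean factoryListIsImmutable() {
        List<String> strings = List.of("first", "second");
        try {
            strings.add("third");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Greeter() { }.greet("java9")); // [Hello java9]
        System.out.println(factoryListIsImmutable());         // true
    }
}
```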


It’s unbelievable but not uncommon in Java history —

  • Java9 release introduced significantly fewer and less impactful features than java8.
  • Similarly, java5 overshadows java6 and java7 combined

 

identityHashCode,minimum object size,relocation by JGC

https://srvaroa.github.io/jvm/java/openjdk/biased-locking/2017/01/30/hashCode.html offers a few “halo” knowledge pearls

  • every single java Object must always give an idHashcode on-demand, even if its host class has hashCode() method overridden to return a hard-coded 55.
    • hashcode() doesn’t overshadow idHashcode
  • The “contract” says an object’s idHashcode number must never [2] change, in the face of object relocations. So it’s not really computed based on address. Once someone requests the idHashCode number (like 4049040490), this number must be retained somewhere in object, as per the “contract”. It is retained in the 12-byte object header. (8-byte for a 32-bit JVM)
    • Therefore, the idHashcode contributes to the minimum size of java objects.
  • contrary to common belief, the idHashcode can clash between two objects, so idHashcode is a misnomer, more “hashcode” and not “identity”. https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6321873 explains there are insufficient integer values given the maximum object count.
  • Note anyone can call the hashcode() method on this same object and it could be overridden to bypass the idHashcode.
  • [2] in contrast a custom hashcode() can change its value when object state changes.
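To illustrate the first two bullets, here is a small sketch with a hypothetical class Fixed whose hashCode() is hard-coded to 55. System.identityHashCode() still returns a per-object number and keeps returning the same number later.

```java
class IdentityHashDemo {
    // hashCode() is overridden, but that does not replace the identity hash.
    static class Fixed {
        @Override
        public int hashCode() {
            return 55;
        }
    }

    static boolean identityHashIsStable() {
        Fixed obj = new Fixed();
        int first = System.identityHashCode(obj);
        System.gc(); // only a hint; the object may or may not be relocated
        return first == System.identityHashCode(obj);
    }

    public static void main(String[] args) {
        Fixed obj = new Fixed();
        System.out.println(obj.hashCode());                // 55
        System.out.println(System.identityHashCode(obj));  // some JVM-chosen number
        System.out.println(identityHashIsStable());        // true
    }
}
```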

anon classes^lambda: java perf

Based on [[JavaPerf]] P381

  • An anon class requires an actual *.class file created by javac compiler and loaded from serialized form (usually disk).
  • No such class file for a lambda.

More than half the interview questions are about fancy theoretical knowledge, so this is knowledge valuable to interviews.

This difference has very limited performance impact, as of java 8. This performance perspective is unimportant to interviews, IMHO.

 

Docker+java9 cpu isolation/affinity #2006

https://jaxenter.com/nobody-puts-java-container-139373.html is a 2018 article with some concrete examples demonstrating cpu isolation.

a Docker cgroup can specify a cpu-set (like core0 + core3 + core14) and limit itself to this cpu-set. Performance Motivation — preventing a process hopping between cores.

The “cpu-set” scheme provides conceptually simpler cpu isolation, but less popular than the “cpu-share” scheme.

Java9 offers support for cpu isolation if you adopt the cpu-set scheme but not the cpu-share scheme, as explained succinctly in the article.

A historical note — In 2006 (Mansion/Strategem) I spoke to a Sun Microsystems consultant. An individual Solaris “zone” can specify which cpu core to use. This is my first encounter with CPU isolation/affinity.

## JVM interceptions #mem,exception..%%hypothesis

Java provides a virtual machine to resemble a physical machine. To the upper-level[1] applications running therein[2], JVM provides similar interfaces as a physical machine does. Through these interfaces, JVM intercepts runtime requests made by the hosted application, and provides powerful value-added services to the hosted application. Below is a subset of the interfaces I’m interested in.

Note when I say “intercept”, I mean the request is ultimately serviced by the host kernel of the physical machine. However, in many cases the JVM completes the requests in itself without calling into the host kernel.

  • memory allocation. Note de-allocation is not a legitimate request from a java app. The JVM is likely to handle allocations without calling into kernel. When allocation fails, the outcome is much better than in an unmanaged runtime.
  • memory access beyond an array or linked data structure. JVM probably forwards the request on, but the physical address is managed by the GC.
  • thread — management (including creation). Mostly native thread is used, so JVM forwards the requests to kernel, but I think JVM is very much in-the-loop and can provide instrumentation support
  • exception handling. In an un-managed environment, exception outcome can be inconsistent, unpredictable. Linux uses interrupts and signals (interrupts^signal ] kernel). JVM handles exceptions at a higher level, more effectively:)

[1] a layered view
[2] a “hosting” view. In this view, the JVM is a “container”. A docker container is a cgroup that includes the JVM process

lazy singleton based@JVM dynamic classloading #Lea

In [[DougLea]] P86, this JVM expert pointed out that a simple “eager” singleton is eager in other languages, but lazy in Java due to runtime on-demand class loading.

Specifically we mean a public[1] static final field. This initialization is thread-safe by default. Assuming the field is immediately accessed after class loading, this simple design is comparable to the familiar synchronized lazy singleton. What are the pros and cons?

Synchronized singleton requires more legwork (by developer) but
* It lets you pass computed ctor parameters at runtime. Note the singleton ctor is private but you can call getInstance(userInput).
* As hinted earlier, if you load the class but do not immediately use the instance, then the simple design incurs the expensive initialization cost too early.

[[DougLea]] was written before java5. With java5, [[EffJava]] advocates enum.

[1] DougLea actually prefers private field with public getter, for encapsulation.
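A sketch contrasting the two designs. All class names are hypothetical. The EVENTS list records when the simple singleton is actually constructed, showing that it happens at first use rather than at program start.

```java
import java.util.ArrayList;
import java.util.List;

class LazySingletonDemo {
    static final List<String> EVENTS = new ArrayList<>();

    // The simple design: a static final field, initialized when the class is initialized.
    static class Simple {
        static final Simple INSTANCE = new Simple();

        private Simple() {
            EVENTS.add("Simple constructed");
        }
    }

    // The synchronized design: the caller can pass a runtime parameter to the private ctor.
    static class Configurable {
        private static Configurable instance;
        final String setting;

        private Configurable(String setting) {
            this.setting = setting;
        }

        static synchronized Configurable getInstance(String userInput) {
            if (instance == null) {
                instance = new Configurable(userInput);
            }
            return instance;
        }
    }

    static List<String> run() {
        EVENTS.add("before first use");
        Simple first = Simple.INSTANCE; // first access triggers class initialization
        EVENTS.add("after first use");
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(run());
        System.out.println(Configurable.getInstance("userInput").setting);
    }
}
```

The trade-off is as described above: Simple needs no locking code, but it cannot take a constructor argument and it is built as soon as anything touches the class.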

## mkt data: avoid byte-copying #NIO

I would say “avoid” or “eliminate” rather than “minimize” byte copying. Market data volume is gigabytes so we want and can design solutions to completely eliminate byte copying.

  • RTS uses reinterpret_cast but still there’s copying from kernel socket buffer to userland buffer.
  • Java NIO buffers can remove the copying between JVM heap and the socket buffer in C library. See P226 [[javaPerf]]
  • java autoboxing is highly unpopular for market data systems. Use byte arrays instead
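A minimal sketch of the NIO idea, assuming a made-up fixed-width tick layout (8-byte instrument id, 8-byte price, 4-byte quantity). Fields are read in place from a direct (off-heap) ByteBuffer as primitives, with no boxing and no intermediate byte[] copy. In a real feed handler the buffer would be filled by a channel read rather than by encode().

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class DirectBufferDemo {
    // Hypothetical layout: id (8 bytes) + price (8 bytes) + quantity (4 bytes).
    static final int TICK_SIZE = 8 + 8 + 4;

    // Stands in for the network read: writes one tick into a direct buffer.
    static ByteBuffer encode(long instrumentId, double price, int qty) {
        ByteBuffer buf = ByteBuffer.allocateDirect(TICK_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(instrumentId).putDouble(price).putInt(qty);
        buf.flip();
        return buf;
    }

    // Absolute reads pull primitives straight out of the buffer.
    static double readPrice(ByteBuffer tick) {
        return tick.getDouble(8);
    }

    static int readQuantity(ByteBuffer tick) {
        return tick.getInt(16);
    }

    public static void main(String[] args) {
        ByteBuffer tick = encode(7L, 101.25, 300);
        System.out.println(tick.isDirect());    // true
        System.out.println(readPrice(tick));    // 101.25
        System.out.println(readQuantity(tick)); // 300
    }
}
```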

Arrays.sort(primitiveArray) beats List.sort() #defaultMethod

In terms of sorting performance, Arrays.sort(primitiveArray) is a few times faster than Collections.sort() even though both are O(N logN). My learning notes:

  • Arrays.sort(int[]) is a 2-pivot quicksort, probably using random access
  • Arrays.sort(Object[]) is a mergesort
  • Collections.sort(List) defers to List.sort()
    • List.sort() is a Java8 default method in the List.java interface. It copies data to an array then runs a mergesort
    • ArrayList.java overrides the default method, so no copying for ArrayList in java8

RandomAccess marker interface (ArrayList implements) is completely irrelevant. That’s because any List.java subtype that provides RandomAccess can simply override the default method as demonstrated in ArrayList.java. This is cleaner than checking RandomAccess at runtime. One or Both designs could potentially be JIT-compiled to remove the runtime check.
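The three code paths above, sketched side by side. Method names are my own. This shows which API each path goes through, not a benchmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class SortPathsDemo {
    // Primitive overload: sorts the int[] in place, no boxing.
    static int[] sortPrimitives(int[] data) {
        int[] copy = data.clone();
        Arrays.sort(copy);
        return copy;
    }

    // ArrayList overrides List.sort() and sorts its internal array directly.
    static List<Integer> sortArrayList(List<Integer> data) {
        List<Integer> copy = new ArrayList<>(data);
        copy.sort(null); // null comparator means natural ordering
        return copy;
    }

    // LinkedList inherits the default List.sort(), which copies to an array first.
    static List<Integer> sortLinkedList(List<Integer> data) {
        List<Integer> copy = new LinkedList<>(data);
        copy.sort(null);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortPrimitives(new int[] {3, 1, 2}))); // [1, 2, 3]
        System.out.println(sortArrayList(Arrays.asList(3, 1, 2)));                // [1, 2, 3]
        System.out.println(sortLinkedList(Arrays.asList(3, 1, 2)));               // [1, 2, 3]
    }
}
```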

 

RandomAccess #ArrayList sorting

Very few JDK containers implement the RandomAccess marker interface. I only know ArrayList.java, Vector.java and its subclass Stack.java. Raw array doesn’t.

Only List.java subtypes can implement RandomAccess. Javadoc says

“The primary purpose of this interface is to allow generic algorithms to alter their behavior when applied to either random or sequential access lists.”

Q: which “generic algos” actually check RandomAccess?
AA: Collections.binarySearch() in https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html
AA: to my surprise, Collections.sort() does NOT care about RandomAccess, so ArrayList sorting is no different from LinkedList sorting! See separate post Arrays.sort(primitiveArray) beat List.sort()

http://etutorials.org/Programming/Java+performance+tuning/Chapter+11.+Appropriate+Data+Structures+and+Algorithms/11.6+The+RandomAccess+Interface/ has more details
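A sketch of what "alter their behavior" means in practice. The sum() method is a made-up generic algorithm that uses indexed access only when the list advertises RandomAccess.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessDemo {
    static boolean isRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Hypothetical generic algorithm that branches on the marker interface.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);   // cheap on ArrayList
            }
        } else {
            for (int value : list) {    // iterator avoids O(N) get(i) on LinkedList
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3);
        System.out.println(isRandomAccess(new ArrayList<>(data)));  // true
        System.out.println(isRandomAccess(new LinkedList<>(data))); // false
        System.out.println(sum(new LinkedList<>(data)));            // 6
    }
}
```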

 

java generics wild cards – too many warnings/errors

Intro: If your project requires generic wild cards that’s too hard for your team’s knowledge level, then sooner or later you need to make a choice.

The complexity may grow out of hand. The compiler errors are non-trivial. Worse still, some Generics errors are runtime errors.

Sugg: see if you can remove generics completely from some classes. Use cast instead.

I feel in most cases, you only need to use "extends" and not "super". I think it can still be too hard.

Here’s one of my projects — the EventQueue project in the 2017 HSBC coding interview. I had to use generic wildcards like

Subscriber<T extends BaseMessage>

SubscriberFilter<T extends BaseMessage>

CallbackTask<T extends BaseMessage>

When we pass these objects into methods, we face annoying compiler errors or warnings. Most warnings are unnecessary warnings (I think compiler is not smart enough).

Some methods are designed for BaseMessage like …

Other methods are often designed for “T extends BaseMessage”

Yet other methods are designed for a specific subtype PriceMessage.

I feel it’s often easier to use the BaseMessage as argument type. If un-compilable, I often remove the type parameter.

Small tip: if “instanceof ArrayList” gives generics warning, then use ArrayList.class.isInstance().

small tip: use Subscriber<?> can sometimes suppress a warning

small tip: some casts can suppress a warning
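A reconstruction sketch, not the original interview code. BaseMessage, PriceMessage and Subscriber below are simplified stand-ins. It shows one method designed for BaseMessage, one designed for the specific subtype, and the isInstance() tip.

```java
import java.util.ArrayList;
import java.util.List;

class WildcardDemo {
    static class BaseMessage {
        final String text;

        BaseMessage(String text) {
            this.text = text;
        }
    }

    static class PriceMessage extends BaseMessage {
        PriceMessage(String text) {
            super(text);
        }
    }

    interface Subscriber<T extends BaseMessage> {
        void onMessage(T msg);
    }

    // Designed for BaseMessage: accepts a list of BaseMessage or of any subtype.
    static int countAll(List<? extends BaseMessage> messages) {
        return messages.size();
    }

    // Designed for PriceMessage: accepts a subscriber of PriceMessage or of any supertype.
    static void deliver(PriceMessage msg, Subscriber<? super PriceMessage> subscriber) {
        subscriber.onMessage(msg);
    }

    // The isInstance() tip: a runtime type check without instanceof syntax.
    static boolean isArrayList(Object candidate) {
        return ArrayList.class.isInstance(candidate);
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Subscriber<BaseMessage> general = msg -> log.add(msg.text);
        deliver(new PriceMessage("px=101"), general); // compiles thanks to "? super"
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());                          // [px=101]
        System.out.println(countAll(new ArrayList<PriceMessage>())); // 0
        System.out.println(isArrayList(new ArrayList<String>()));    // true
    }
}
```

So "super" does earn its keep on the consumer side, even if "extends" covers most read-only cases.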

RMI class bytecode sync

Consider an object serialized and sent from hostA to hostB.

If the object is not a standard type like String or a collection, how can the receiving hostB reconstruct it? The class bytecode needs to be sent!

[[java the good part]], written by an RMI authority, gave explicit examples. The serialized stream includes metadata on the data object, which describes where to locate the corresponding class bytecode (of course on a bytecode server). On the receiving end, hostB would attempt to load the class bytecode locally. Failing that, hostB would download the bytecode.

Thanks to dynamic class loading, hostB can reconstruct an exact replica.

java interfaces have only abstract method@@ outdated]java8

Compared to c#, java language design was cleaner and simpler, at the cost of lower power and flexibility. C++ is the most flexible, powerful and complex among the trio.

There was never a strong reason to disallow static methods in an interface, but they were presumably disallowed (till java 7) for the sake of simplicity: “Methods in an interface are always abstract.” No ifs and buts about it.

With “default methods”, java 8 finally broke from tradition. Java 8 has to deal with Multiple Inheritance issue. See other blog posts.

In total, there are now 4 “deviations” from that simple rule, though some of them are widely considered irrelevant to the rule.

  1. a concrete nested class in an interface can have a concrete method, but it is not really a method _on_ that interface.
  2. Suppose an interface MyInterFace re-declares toString() method of Object.java. That method isn’t really abstract.
    • There’s very few reasons to do this.
  3. static methods
  4. default methods — the only real significant deviation from the rule

See https://stackoverflow.com/questions/27833168/difference-between-static-and-default-methods-in-interface
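All four deviations fit into one small interface. A sketch with a hypothetical Shape interface:

```java
class InterfaceDeviationsDemo {
    interface Shape {
        double area();                 // the classic abstract method

        String toString();             // deviation 2: re-declares Object.toString()

        static Shape unitSquare() {    // deviation 3: static method, called as Shape.unitSquare()
            return () -> 1.0;
        }

        default String describe() {    // deviation 4: default method with a body
            return "area=" + area();
        }

        class Helper {                 // deviation 1: concrete nested class
            static double twice(double x) {
                return 2 * x;
            }
        }
    }

    public static void main(String[] args) {
        Shape square = Shape.unitSquare();
        System.out.println(square.describe());        // area=1.0
        System.out.println(Shape.Helper.twice(1.5));  // 3.0
    }
}
```

Shape still counts as a functional interface, because a re-declared Object method is not counted as abstract for that purpose.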

JVM == a bytecode interpreter + JIT compiler

I used to think the JVM is a layer on top hardware and executes platform-independent bytecode against the hardware. The hardware components include

  • filesystems
  • network ports
  • CPU and memory
  • kernel threads
  • user input devices + screen

Consider assembly code. I guess assembly code deals directly with the same hardware components, with possible exception of threads.

(Not sure where the operating system kernel comes into play. See https://bintanvictor.wordpress.com/2011/09/08/what-is-kernel-space-vs-userland/)

Now I think JVM includes a JIT compiler that converts bytecode into assembly. See https://bintanvictor.wordpress.com/2016/02/09/javac-jit-2-compilers/

 

java local var( !! fields)need explicit initialization

http://stackoverflow.com/questions/268814/uninitialized-variables-and-members-in-java briefly mentions the reason.

Instance variables (i.e. fields) of object type are initialized to null by default. Local variables of object type are not initialized by default, and it’s a compile-time error to read a local variable that has not been definitely assigned.

For primitives, story is similar
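A small sketch of the difference. Holder is a hypothetical class. Its fields get defaults, while the local variable must be assigned on every path before it is read.

```java
class DefaultInitDemo {
    static class Holder {
        String name;   // defaults to null
        int count;     // defaults to 0
        boolean flag;  // defaults to false
    }

    static String localMustBeAssigned(boolean condition) {
        String label; // no default value for a local
        if (condition) {
            label = "yes";
        } else {
            label = "no";
        }
        // Without the else branch, "return label" would be a compile-time error.
        return label;
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        System.out.println(h.name);                     // null
        System.out.println(h.count);                    // 0
        System.out.println(h.flag);                     // false
        System.out.println(localMustBeAssigned(true));  // yes
    }
}
```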

java8 static methods ] interface

static methods in interface (SIM for short) is a minor feature of java8, fairly low-level and only interesting to java language students like me.

Two noteworthy points are raised in [[mastering lambdas]] P172 footnote (yes the footnote)

  1. A “traditional” static method (i.e. defined in a class) is inherited, but SIM is not inherited. Consequently …
  2. A “traditional” static method can be invoked using myObj.staticMeth1() but SIM can only be invoked using myInterface.staticMeth()

These two restrictions remove “loose” syntax around traditional static methods.

java8 default method break backward compatibility #HSBC

Among the java8 features, I think default  method is a more drastic and fundamental change than lambda or stream,  in terms of language

In my HSBC interview a London interviewer (Wais) challenged me and said that he thought default methods are designed for backward compatibility. I now think he was wrong.

—- Based on P171 [[Mastering Lambdas]]
Note: the rare cases of incompatibility are an obscure (almost academic) concern. More important are the rules of method resolution when default methods are among the candidates. This topic is similar in spirit to popular interview questions around overriding vs overloading.

Backward compatibility (BWC) means that when an existing interface like Collection.java includes a brand new default method, the existing “customer” source code should work as before. Default methods have a few known violations of BWC.

  • simplest case: all (incl. default) methods in an interface must be public. No ifs or buts.  Suppose Java7 MyConcreteClass has private m2() and implements MyIntf. What if MyIntf is now updated with a default method m2()? Compilation error!
  • a more serious case: java overriding rule (similar to c++) is very strict so m(int) vs m(long) is always, automatically overloading not overriding.  Consider a method call myObj.m(33). Originally, this binds to the m(long) declared in the class. Suppose the new default method is m(int) … an Overload! At compile time, this is seen as a better match so selected by compiler (not JVM runtime)… Silent, unexpected change in business logic and a violation of BWC!
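The "more serious case" can be reproduced in a few lines. The V1/V2 names are hypothetical and stand for the interface before and after the default method is added. The class source is identical in both versions, yet the same call binds to a different method.

```java
class DefaultOverloadDemo {
    // Before: the interface as originally published.
    interface MyIntfV1 { }

    static class LegacyV1 implements MyIntfV1 {
        public String m(long x) {
            return "class m(long)";
        }
    }

    // After: the interface gains a default method that overloads the class method.
    interface MyIntfV2 {
        default String m(int x) {
            return "default m(int)";
        }
    }

    static class LegacyV2 implements MyIntfV2 {
        public String m(long x) {   // unchanged source
            return "class m(long)";
        }
    }

    static String before() {
        return new LegacyV1().m(33); // only m(long) exists, 33 is widened
    }

    static String after() {
        return new LegacyV2().m(33); // m(int) is the more specific overload
    }

    public static void main(String[] args) {
        System.out.println(before()); // class m(long)
        System.out.println(after());  // default m(int)
    }
}
```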

This refreshingly thin book gives 2 more examples. Its last example is a serious backward incompatibility issue but I am not convinced it is technically possible. Here’s my rationale —

Any legacy code relying on putIfAbsent() must have an implementation of putIfAbsent() somewhere in some legacy java7 class. Due to “class-wins-over-interface” rule, a new default method putIfAbsent() will never be chosen when compiling the legacy code using java8 tool chain.

linker error in java – example

[[mastering lambda]] points out one important scenario of java linker error. Can happen in java 1.4 or earlier. Here’s my recollection.

Say someone adds a method m1() to interface Collection.java. This new compiled code can coexist with lots of existing compiled code but there’s a hidden defect. Say someone else writes a consumer class using Collection.java, and calls m1() on it. This would compile in a project having the new Collection.java but no HashSet.java. Again, this looks fine on the surface. At run time, there must be a concrete class when m1() runs. Suppose it’s a HashSet compiled long ago. This would hit a linker error (an AbstractMethodError at run time), since HashSet doesn’t implement m1().

 

64-bit java — my own mini Q&A

Q: will 32-bit java apps run in 64-bit JVM?
A: Write once, run anywhere. Sun was extremely careful during the creation of the first 64-bit Java port to ensure Java binary and API compatibility, so all existing 100% pure Java programs would continue running just as they do under a 32-bit VM. However, non-pure java, like JNI, will break.

Q: 32bit apps need recompilation?
A: Unlike pure java apps, all native binary code that was written for a 32-bit VM must be recompiled for use in a 64-bit VM. All currently supported operating systems do not allow the mixing of 32 and 64-bit binaries or libraries within a single process.

Q: The primary advantage of running Java in a 64-bit environment?
A: larger address space. This allows for a much larger Java heap size and an increased maximum number of Java Threads.

Q: complications?
A: Any JNI native code in the 32-bit SDK implementation that relied on the old sizes of these data types is likely to require updating.
%%A: if java calls another program, maybe that program will need to be 64-bit compatible. This answer is slightly relevant.

Q: how is 32/64 bit JDK’s installed?
A: Solaris has both a 32 and 64-bit J2SE implementation contained within the same installation of Java, you can specify either version. If neither -d32 nor -d64 is specified, the default is to run in a 32-bit environment. All other platforms (Windows and Linux) contain separate 32 and 64-bit installation packages.

java 8 default method – phrasebook

[[mastering lambdas]] is concise about method call resolution — corner cases, backward incompatibilities… I’m glad to be able to understand all of the practical issues.

j4, mtv — and the problem addressed : adding methods to a published interface. Impossible before java 8. Need to really understand it.

MI … isn’t a problem in java 8 – no “state” allowed in default methods.

vs abstract classes – instance state…

diamond – not a problem in itself

why delayed for so long – aversion to MI

##what bad things can crash JVM

(Why bother? These are arcane details seldom discussed under the spotlight, but practically important in most java/c++ integrations.)

Most JVM exits happen with some uncaught exception or explicit System.exit(). These are soft-landings — you always know what actually killed it.

In contrast, the hard-landing exits result in a hs_err_pid.log file, which gives cryptic clues to the cause of death. For example, this message in the hs_err file is a null pointer in JNI —

siginfo: ExceptionCode=0xc0000005, reading address 0x00000000

Note this hs_err file is produced by a fatal error handler. However, if you pull the power plug, the FEH may not have a chance to run, and you get what I call an “unmanaged exit“. Unmanaged exit is rare. I have yet to see one.

People often ask what bad things could cause a hard landing? P79 [[javaPerformance]] mentions that FEH can fire due to

* fault in application JNI code
* fault in OS native code
* fault in JRE native code
* fault in the VM itself

caching thousands of java string literals(hardcoded)

Q: Suppose you have lots of java strings (typically up to 100 characters) in your JVM. Some are string literals, some are dynamic inputs from web, database/file or by messaging. You know many of the strings are recurring, such as column headers or individual English words from a file. You could use constant variables to represent column header names, but now we have too many (thousands of) such constant variables — impractical.
A: My basic solution is a cache in the form of a hashset which is internally a hashtable

    static String lookup(String input);

If input is found in the hashtable then we reuse it and avoid creating duplicate objects. This method is best with string literal inputs. Java automatically interns these literals so no redundant copies of literal string object even if you have lookup(“Column1”) in 200 classes.

Issue: indiscriminate usage — a colleague pointed out if lookup() is public, then other developers can abuse it and pass in strings that never re-occur. They just take up permanent memory for no benefits. One simple measure is another argument to remind developers —

    lookup(String input, boolean isRecurring);

Issue: large string — If we get a 800MB string we need to make a decision. If it’s reused often, then we should cache it somewhere. If it’s used only twice, then maybe recreate it each time. A simplistic solution is to add a length check in lookup(), and rename it to lookup1KB(). The places we know we may get 800MB strings, we use an alternative lookupSpecial() method.

Issue: large memory footprint — even if we check the string lengths in lookup1KB(), we can still get 9,000,000 entries. Most of these are due to the above-mentioned indiscriminate usage. We could add a hashtable size control, but I feel this tends to add latency, so not ideal for real time. My colleague pointed out LinkedHashMap.java supports LRU.
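A minimal sketch of lookup() combined with the LinkedHashMap LRU idea. The class name StringCache and the maxEntries parameter are my own assumptions. It returns the first-seen instance for equal strings and evicts the least recently used entry once the size limit is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StringCache {
    private final Map<String, String> cache;

    StringCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the canonical instance for the given content.
    synchronized String lookup(String input) {
        String cached = cache.get(input);
        if (cached != null) {
            return cached;
        }
        cache.put(input, input);
        return input;
    }

    synchronized boolean contains(String s) {
        return cache.containsKey(s);
    }

    synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        StringCache cache = new StringCache(2);
        String first = new String("Column1");
        String duplicate = new String("Column1");
        System.out.println(cache.lookup(first) == first);     // true
        System.out.println(cache.lookup(duplicate) == first); // true, duplicate can be garbage collected
    }
}
```

The synchronized methods are the simplest way to make this thread-safe. They also illustrate the latency concern mentioned above.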

(How does the jvm string pool help???)

Q: why not use a bunch of string constants?
A: Even if we only have 200 of these literals, using these many constants can be inconvenient.
* lookup() shows you the exact spelling with spaces and cases. To convert these many literals to constants, you need to hand-craft a lot of variable names.
* what if the literals change? You would need to rename those variables.
* you may want to decouple the constant’s name vs the content. That can hurt readability, assuming I prefer to see the literals in source code.
* If in Class1 I already defined a constant SOME_LONG_STRING, and in Class2 I see “some long string” I would need to look to see if it’s already a constant.

java method taking a single argument but 2 alternative types

Requirement: method read() in an interface IReader needs to accept a single argument but of various types

read(Book b) to be implemented by one concrete class BookReader
read(Score s) to be implemented by another concrete class ScoreReader

Suppose we want to unify the 2 read() methods into one unified method, so a “client” can get an instance of Reader and simply pass in either a book or a score. C# probably has language support for this, but in Java …

Solution 1: use Object argument

Solution 2: declare
<T> void read(T content, Class<T> argumentClass);

BookReader.java implements
<T> void read(T content, Class<T> argumentClass){
if (content instanceof Book){….
}

Paradoxically the IDE may warn you that argumentClass is an unused variable inside the method. However, the compiler uses it to enforce type safety —

myReader.read(someBook, Integer.class);// won’t compile. Book.class required
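A runnable sketch of Solution 2. Book, Score and UnifiedReader are hypothetical stand-ins, and I assume a String return type purely so there is something to print. The Class<T> token ties the argument type to the declared type at compile time.

```java
class TypeTokenReaderDemo {
    static class Book {
        final String title;

        Book(String title) {
            this.title = title;
        }
    }

    static class Score {
        final String composer;

        Score(String composer) {
            this.composer = composer;
        }
    }

    interface IReader {
        <T> String read(T content, Class<T> argumentClass);
    }

    // One reader handling both types, dispatching on the type token.
    static class UnifiedReader implements IReader {
        @Override
        public <T> String read(T content, Class<T> argumentClass) {
            if (argumentClass == Book.class) {
                return "book: " + ((Book) content).title;
            }
            if (argumentClass == Score.class) {
                return "score: " + ((Score) content).composer;
            }
            throw new IllegalArgumentException("unsupported type " + argumentClass.getName());
        }
    }

    public static void main(String[] args) {
        IReader reader = new UnifiedReader();
        System.out.println(reader.read(new Book("Dune"), Book.class));   // book: Dune
        System.out.println(reader.read(new Score("Bach"), Score.class)); // score: Bach
        // reader.read(new Book("Dune"), Integer.class); // would not compile
    }
}
```

One caveat: a superclass token such as Object.class also compiles with a Book argument, so the check is "compatible type", not "exact type".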

java iterators – weakly consistent vs fail-fast

Roughly speaking, java iterators are either weakly consistent (WC) or fail-fast (FF). The “3rd” category is snapshot iterator for copy-on-write. See [[java generics]]

WC is in 1.5
FF is in 1.4.

WC don’t throw ConcurrentModificationException. This particular Exception is the “FAIL” part of “FAIL-fast”.
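The three categories side by side, as a sketch using ArrayList (fail-fast), ConcurrentHashMap (weakly consistent) and CopyOnWriteArrayList (snapshot). Each method modifies the collection while iterating over it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class IteratorKindsDemo {
    // Fail-fast: structural modification during iteration is detected on the next step.
    static boolean failFastThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        try {
            for (Integer value : list) {
                list.add(99);
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Weakly consistent: no exception; the iterator may or may not see the new entry.
    static int weaklyConsistentFinalSize() {
        Map<Integer, String> map = new ConcurrentHashMap<>();
        map.put(1, "a");
        map.put(2, "b");
        for (Integer key : map.keySet()) {
            map.put(99, "added during iteration");
        }
        return map.size();
    }

    // Snapshot: the iterator walks the array as it was when iteration started.
    static int snapshotVisits() {
        List<Integer> list = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        int visited = 0;
        for (Integer value : list) {
            list.add(value + 10);
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(failFastThrows());            // true
        System.out.println(weaklyConsistentFinalSize()); // 3
        System.out.println(snapshotVisits());            // 3
    }
}
```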

I feel STL iterators provide no thread-safety and fall into none of these 3 categories. Since there’s no consistent threading support in c++ standard library, such features must be implemented by yourself under a particular thread library.

How about c#?

abstract overriding concrete method

Real example P 95 [[ head first design patterns ]]

Q: abstract method overrid` concrete method@@
A: Yes http://docs.oracle.com/javase/specs/jls/se5.0/html/classes.html (removed) says “An instance method that is not abstract can be overridden by an abstract method.” and gave examples. I think it can be used to remove “toxic” services and prevent a “consumer” from calling it by mistake.
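A sketch of the "toxic service" idea with hypothetical class names. Restricted re-declares the inherited concrete method as abstract, so every concrete subclass is forced to supply its own version instead of silently inheriting Base's.

```java
class AbstractOverrideDemo {
    static class Base {
        String service() {
            return "base implementation";
        }
    }

    // Overrides the concrete method with an abstract one.
    abstract static class Restricted extends Base {
        @Override
        abstract String service();
    }

    static class Safe extends Restricted {
        @Override
        String service() {
            return "safe implementation";
        }
    }

    public static void main(String[] args) {
        Base viaBaseType = new Safe();
        System.out.println(viaBaseType.service()); // safe implementation
    }
}
```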

a simple (tricky) sabotage on java debugger

If you rely heavily on a java debugger, beware of this insidious sabotage.

You could use finally blocks. You could surround everything with try/catch(Throwable). If all of these get skipped (rendering your debugger useless) and system silently terminates at inconsistent moments, as if by a divine intervention, then perhaps ….

perhaps you have a silent System.exit() in an obscure thread.

Let me make these pointers clear —
– System.exit() will override/ignore any finally block. What you put in finally blocks will not run in the face of System.exit()
– System.exit() will not trigger catch(Throwable) since no exception is thrown.
– System.exit() in any thread kills the entire JVM.

Q: Is JNI crash similar to System.exit()?
%%A: i think so.

Actually, in any context a silent System.exit() can be hard to track down when you look at the log.

(architect IV) what I wish java to have

— big wishes —
* NullPointerException — too many of these are thrown in production systems and can take hours of wild goose chase. Developers must be very very thorough, and adopt a lot of defensive coding habits. Java won’t help you.
* easier tools for byte code engineering
* easier reflection — Look at dynamic scripting languages
* programmatic class creation; runtime class creation.
* memory leak — hard to detect
* easier immutable objects — String is great but we need more

–small wishes
* simpler getter/setter — look at C# properties
* Bags as collections.
* serialization — is a bit murky. I feel this is an important area neglected by many developers, perhaps because it’s murky. Perhaps java can support a special debug-serialization so we can see what it does to a complicated object graph
* checked exception — is a mistake in many developers’ opinion. Narrow the scope of this construct.

class loaders — [[ weblogic definitive]]

P382 [[ weblogic definitive ]] is a good intro to java class loading. Generally concise and detailed. Still there are a few unclear points to bear in mind when studying it:

– pass up and pass down — when a class-loading request (for class J) comes to a child classloader C, C checks “its own memory” to see if J is loaded. Failing that, it sends the class-finding job UPSTAIRS to parent class loader P (and further up). If root class loader R can’t find it then R tries to load the class J. Failing that, R returns the class-loading job downstairs to P and to C.
– – > 1st classloader to attempt loading is always root classloader.

– The classpath classloader can ONLY load from the classpath. The extensions classloader is limited by JVM to only load from /jre/lib/ext/. In general most [2] classloaders are restricted to read a specific group of class files. Fundamental to delegation.
– – > corollary: Every parent is limited in its capability. When a child delegates to her parent some job that’s out of parent’s limits, child will do it herself.

– When a class-loading “job” comes in, it comes to a particular classloader. It doesn’t “come to the system”.

– When a class-loading request comes in, it comes with only a full classname. Each classloader [3] must *search* for the class file in jars and directories. Some beginners may assume the request comes labelled with physical address of the class file.

– It’s common to put a class file in 2 places, each visible to a classloader. Usually (if not always) only one classloader reads the class file and loads it into memory. If these 2 loaders are parent and child, then the parent loads it.

[1] by jvm
[2] if not every
[3] the immediate parent of the receiving classloader will try first, followed by the grandparent…

A few summary points:
– tree. 1:m mapping. one-parent-many-children
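The parent chain can be inspected at runtime. A small sketch, where the class name and the "bootstrap (null)" label are my own. It walks from the loader of the demo class up to the root. On common JVMs the bootstrap loader is represented as null, which is why core classes like String typically report a null loader.

```java
import java.util.ArrayList;
import java.util.List;

class ClassLoaderChainDemo {
    // Walks from the given loader up through its parents.
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap (null)"); // label for the end of the chain
        return names;
    }

    public static void main(String[] args) {
        // Loader names vary by JVM version, so no fixed output is shown here.
        for (String name : chainOf(ClassLoaderChainDemo.class.getClassLoader())) {
            System.out.println(name);
        }
        System.out.println(String.class.getClassLoader());
    }
}
```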

What’s so special about jvm portability compared2python/perl#letter

You have a very strong technical mind and I find it hard to convince you. Let’s try this story…
At a party, one guy mentions (quietly) “I flew over here in my helicopter …” 5 boys overheard and start talking “I too have a helicopter”. Well the truth is, either they are renting a helicopter, or their uncle used to have a helicopter, or their girlfriend is rich enough to own a helicopter, or they have an old 2nd hand helicopter, or they have a working helicopter for a university research project, or a toy helicopter.
It’s extremely hard to build a cross-platform bytecode interpreter that rivals native executable performance. Early JVM was about the same speed as perl. Current JVM easily exceeds perl and can sometimes surpass C.
In contrast, it’s much easier to build a cross-platform source code interpreter. Javascript, python, perl, php, BASIC, even C can claim that. But why do these languages pale against java in terms of portability? One of the key reasons is efficiency.
To convince yourself of the value of JVM portability, ultimately you need to see the limitations of dynamic scripting languages. I used them for years. Scripting languages are convenient and quick-turnaround, but why are they still a minor tool for most large systems? Why are they not taking the software world by storm?
Why is C still relevant? Because it’s low-level. Low-level means (the possibility of) maximum efficiency.  Why is MSOffice written in C/C++ and not VBA? Efficiency is a key reason. Why are most web servers written in C and not perl, not even java? Efficiency is a key reason.
Back to jvm portability. When I compile 2000 classes into a jar, and download 200 other jars from vendors and free packages, I zip them up and I get a complete zip of executables. If I have fully tested it on windows, then in many cases I don’t need to test it on unix. Compile once, run anywhere. We rely on this fact every day. Look at spring jars, hibernate jars, JDBC driver jars, xml parser jars, jms jars. Each jar in question has a single download for all platforms. I have not seen many perl downloads that are one-size-fits-all.
I doubt Python, php or other scripting languages offer that either.
(See comments below)
Sent: Sunday, June 26, 2011 8:14 PM
Subject: RE: What’s so special about jvm’s portability compared to python’s or perl’s?
If you treat JVM == the interpreter of php/python/perl/etc., then Java’s so called “binary code portability” is almost the same as those scripting languages’ “source code portability”.
[Bin ] I have to disagree. AMD engineered their instruction set to be identical to Intel’s. Any machine code produced for Intel runs on AMD too — hardware level portability.

That’s one extreme level of portability. Here’s another level — Almost any language, once proven on one platform, can be ported to other platforms, but only at the SCP (source-code-portable) level. Portability at different levels has very different values. High-level portability is cheap but less useful.
Java Bytecode is supposed to be much faster as a lot of type checking, method binding, access checking, address resolution.. were already completed at compile-time. Java bytecode looks like MOV, JMP, LOAD … and gives you some of the efficiency of machine code.
Another proof is: Java binary code (compiled using regular method) can be de-compiled into source code, which indicates that its “binary code” has almost 1-to-1 mapping to “source code”, which means its binary code is equal to source code.
[Bin ] I would probably disagree. The fastest java bytecode is JIT and probably not decompilable I guess. For a sequence of instructions, the more “machine-like”, the faster it runs.
Well, you may want to argue JVM is better than the interpreter of those scripting languages, and I tend to agree. Java must have something that earned the heart of the enterprise application developers. Only that I haven’t found what it is yet 🙂

What’s so special about jvm’s portability compared to python/perl, briefly@@

When I compile a class in windows, the binary class file is directly usable in unix. (Of course I must avoid calling OS commands.) I don’t think python or perl reached this level.

I feel dynamic, scripting languages are easier to make portable because they offer source-code portability (SCP), not binary code portability (BCP). In other words, BCP is tougher than SCP. I believe BCP is more powerful and valuable. BCP was the holy grail of compiler design and Java conquered it.

Due to the low entry barrier, some level of SCP is present in many scripting languages, but few (if any) other compiled languages offer BCP, because it’s tough. JVM is far ahead of the pack.

Even C is source-code portable, but C is known as a poorly portable language due to lack of binary portability.

Quiz: who to "intercept" returns from a java method

Jolt: When you put in a “return”, you think “method would exit right here and bypass everything below”, but think again! If you use a return in a try{} or catch{}, it is at the /mercy/ of finally{}.


P477 [[ thinking in java ]] Only one guy can stop a return statement from exiting a method. This guy is finally, also mentioned in one of my posts on try-catch-finally execution order.

Same deal for break, continue.
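Here is a minimal sketch of the jolt, assuming nothing beyond the JDK. FinallyInterceptDemo and its method names are made up for illustration. The first method shows finally replacing a pending return value; the second shows finally still running when a break leaves the loop.

```java
// Hypothetical demo class, not from any library.
class FinallyInterceptDemo {
    @SuppressWarnings("finally")
    static int returnOverridden() {
        try {
            return 1;   // 1 is evaluated, but the method has not exited yet
        } finally {
            return 2;   // finally runs first and replaces the pending return value
        }
    }

    static String breakIntercepted() {
        StringBuilder log = new StringBuilder();
        for (int i = 0; i < 3; i++) {
            try {
                if (i == 1) break;           // break is also at the mercy of finally
                log.append("body").append(i).append(' ');
            } finally {
                log.append("finally").append(i).append(' ');
            }
        }
        return log.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(returnOverridden());   // prints 2
        System.out.println(breakIntercepted());   // prints body0 finally0 finally1
    }
}
```

A return inside finally also swallows any exception in flight, which is one reason compilers warn about it.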

RMI skeleton/stub instantiation

In an RMI scenario, there are 2 + 1 processes: 2 on server, 1 on client
Process ps1) the application jvm
Process ps2) registry process. On unix, you often start it by hand. Note PS1 and PS2 must be on the same localhost.
Process pc) client JVM

There are at least 3 java objects involved. Both skeleton and stub implement the same business interface as OB.
Object OB) the real business object
Object SK) the skeleton object
Object ST) the stub object

Let's see how these are created and linked.
1) skeleton is probably instantiated by exportObject() based on the OB object, inside PS1 JVM. This is a static method.
2) After export, the remote reference to OB (in its serializable stub form, I believe) is registered with the registry, using rebind() or bind(), both static methods.
3) UnicastRemoteObject probably has a static collection to hold OB and SK, to fend off garbage collection
4) Stub arrives in PC on demand: lookup() hands it over by deserializing the stub held in the registry. The skeleton itself never leaves PS1.

address of a java object (and virtual/physical memory)

XR,

(another blog post) We once discussed how to find the address of a java object. The address has to be hidden from application programs since the garbage collector often needs to relocate the object through the generational heap. Therefore any reference variable we use in java will let us read/write the “pointee” object but won't reveal its address.

However, the address is visible to the garbage collector and some of the C code integrating with java via JNI or other means. It has to be visible because C uses pointers. A pointer holds a memory address. If a C function uses a pointer, then the C function can print out the address.

By the way, all along we are talking about virtual memory addresses, which could be anything from 0 to 0xFFFFFFFF ie 32-bit integer, even on a 128MB RAM laptop.

The virtual memory module in the kernel translates between virtual memory address and physical RAM address.

Q: Is it ever possible for a C program to see the physical RAM address of an object? Here are my tentative answers so please correct me —

A: yes for the C program implementing the virtual memory module itself. This module runs in probably the lowest layer in the kernel. Virtual memory module probably gets loaded first so that a 32MB RAM laptop can load a 50MB operating system. Virtual memory continues to be extremely relevant since no machine has enough RAM to fill up a 64 bit address space.

A: no for any other C program running on top of virtual memory module.

low-level differences between HASA^ISA, in terms of function pointers

Some interviewer (MS or CS) asked me about the differences.

1) Subclass instance has the _same_address_ as base object, that’s why you can cast the ptr (or reference) up and down the inheritance hierarchy. See post on AOB ie address of basement.

2) Also, all inherited methods are available in the “collection” of function pointers of the derived object. In other words, derived object advertises all those inherited “features” or “services”, if you think of a family of interchangeable components. Derived object can _stand_in_ for the base TYPE.

In OO languages, a pure interface is basically a collection of function pointers. Has-a doesn’t expose/advertise the internal object’s function pointers, so the wrapper can’t “stand in” for that type.

dark side of left shift operator – java

1) First, remember left-shift only operates on integer types.
2) Second, remember most if not all integer types use 2’s complement, where leading bit represents sign.

Now the dark side – Multiplication by 2 can move the original number between positive and negative universes.

In this context, Left shift is simpler to understand than multiplication. Left shift by 1 is equivalent to (xxx *2). Confirmed on Sun.

Q: how about multiplication by 4 or 3?
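%%A: I feel multiplication by even numbers tends to give the same flipping problem. Odd multipliers like 3 are not safe either: any product that overflows can come out with the wrong sign, as the quote below says.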

Sun says –
If an integer multiplication overflows, then the result is the low-order bits (possibly with a leading ‘1’) of the mathematical product as represented in some sufficiently large two’s-complement format. As a result, if overflow occurs, then the sign of the result may not be the same as the sign of the mathematical product of the two operand values.
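A small sketch of the sign flip on 32-bit int, using only the JDK. ShiftOverflowDemo is a made-up name. Math.multiplyExact is the standard way to get an exception instead of a silent wrap-around.

```java
// Hypothetical demo class, not from any library.
class ShiftOverflowDemo {
    static int shiftOnce(int x) { return x << 1; }
    static int times(int x, int factor) { return x * factor; }

    // true if the mathematical product does not fit in an int
    static boolean overflows(int a, int b) {
        try {
            Math.multiplyExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int big = 1 << 30;                                    // 1073741824, positive
        System.out.println(shiftOnce(big));                   // prints -2147483648
        System.out.println(times(big, 2));                    // prints -2147483648, same bits
        System.out.println(times(big, 3));                    // prints -1073741824, odd factor flips too
        System.out.println(shiftOnce(Integer.MIN_VALUE + 1)); // prints 2, negative became positive
        System.out.println(overflows(big, 2));                // prints true
    }
}
```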

parameterized wrapper class with a superclass

(This looks like a low-level and widely usable implementation pattern. If it is, then we better learn to read it. However, I don’t see many java guys create generic classes.)

 

A wrapper class’s behavior is affected by 3 types of things

  • The “kernel” fields wrapped in the wrapper. In addition, there are often helper fields, too.
  • The parent class
  • The type param. Remember this class is parameterized.

 

This looks simple until you see it used in a complex system. You need to remember how these “influences” are set.

 

public class EntityCacheListenerAdapter<K, V extends Entity, T extends EntityListener>
            extends AbstractCacheListenerAdapter {

      public EntityCacheListenerAdapter(T target) {
            super(target); // populating this.target
      }

      @Override
      public void afterUpdate(EntryEvent event) {
            super.getTarget() // inherited from parent
             .entityUpdated( // this method is from the kernel field
                    event.getOldValue(), event.getNewValue());
      }
}

 

subvert: field initializer + constructors #java

I had an apparently watertight base bean class with a field initializer

public final Map properties = Collections.EMPTY_MAP;

Apparently every instance should have an immutable empty map? You will see how this “watertight guarantee” is compromised.

Subclass adds no field. However, at runtime, I found an object of the subclass with this.properties == a Hashtable, even populated with data. How could JVM allow it?

More (ineffective) chokepoint — There is only one line of code that “puts” into this Map. It’s in a private method and I added a throw new RuntimeException() //to ensure it never adds any data.

More (ineffective) chokepoint — There’s only one constructor for the base class. I put some println() which, surprisingly, didn’t run.

Short Answer – casting an object received by de-serialization.

Long answer — These bean classes are DTO classes for RMI. Server side is still using the old version, so it conjures up objects with this.properties == a populated Hashtable and serializes it to client JVM. Client de-serializes this.properties only with respect to the declared type, bypassing field initializer or constructors. So long as the incoming stream can convert to a Map, it’s successfully de-serialized.
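I can't reproduce the two-version RMI setup in one snippet, so here is a smaller sketch of the underlying mechanism: de-serialization builds a Serializable object without running its constructor or its field initializers. DeserializeBypassDemo and Bean are made-up names. The transient field is my own device to make the skipped initializer visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not from any library.
class DeserializeBypassDemo {
    static class Bean implements Serializable {
        private static final long serialVersionUID = 1L;
        static int constructorCalls = 0;
        // transient: not in the stream, so only the initializer could set it
        transient String note = "set by field initializer";
        String name;

        Bean(String name) {
            constructorCalls++;
            this.name = name;
        }
    }

    // Serializes to a byte array and reads the object back in the same JVM.
    static Bean roundTrip(Bean in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Bean) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Bean copy = roundTrip(new Bean("x"));
        System.out.println(Bean.constructorCalls); // prints 1: only the original was constructed
        System.out.println(copy.note);             // prints null: initializer never ran for the copy
        System.out.println(copy.name);             // prints x: taken straight from the stream
    }
}
```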

stepping through class/object loading, Take 2

– – – a story/hypothesis to be verified. See P240,113 [[Practical Java]] and P28,30 [[Java Precisely]] – – –

base static initializer and static initializer BLOCK run, in the order of appearance
child static initializer and static initializer block run, in the order of appearance
(see P30 [[java precisely]])

^^ milestone: classes fully loaded.

child and base instance field half-initialized to defaults — null, 0.0, false,..

^^ milestone: dummy C object allocated, which contains a dummy B object

child constructor C() *entered*
base constructor B() *entered*, as first statement in C(…)
base constructor may call an overridden method m1(), child’s m1(), so child’s m1() runs, with child’s instance fields half initialized! Note in C++, B::m1() runs. See http://www.artima.com/cppsource/nevercall.html

^^ milestone: base constructor B() returns.

child’s instance field initializers run. All fields fully initialized as programmed.
remaining statements in C() run
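The story above can be traced with a sketch like this one. B and C follow the naming in the story, InitOrderDemo and the log field are my own additions. The log lives in the base class so that it is already initialized when B() starts calling methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not from any library.
class InitOrderDemo {
    static class B {
        final List<String> log = new ArrayList<>();  // base initializers run before B() body

        B() {
            log.add("B() entered");
            m1();                                    // virtual call: dispatches to C.m1()
            log.add("B() returns");
        }

        void m1() { log.add("B.m1"); }
    }

    static class C extends B {
        int x = 42;                                  // runs only after B() returns

        C() {
            // implicit super() call happens first
            log.add("C() body sees x=" + x);
        }

        @Override
        void m1() { log.add("C.m1 sees x=" + x); }   // x is still the default 0 here
    }

    public static void main(String[] args) {
        new C().log.forEach(System.out::println);
        // prints:
        // B() entered
        // C.m1 sees x=0
        // B() returns
        // C() body sees x=42
    }
}
```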

static nested classes unnecessary@@

As stated in another post, I always start my nested class as “private static” and relax gradually when justified.

Now, some say static nested classes can always be pulled out as first-class citizens i.e. top level classes. No. A major feature (perhaps 2) I rely on everyday is the private access modifier in the context of nested classes.

Nested class can refer to private members (fields/methods) of the enclosing class; enclosing class can refer to private members of the nested class.
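A minimal sketch of the two-way private access. PrivateAccessDemo and Helper are made-up names. Pull Helper out into a top-level class and neither private access compiles.

```java
// Hypothetical demo class, not from any library.
class PrivateAccessDemo {
    private int secret = 7;

    private static class Helper {
        private int hidden = 35;

        private Helper() { }

        // nested class reads a private field of the enclosing class
        int peek(PrivateAccessDemo outer) { return outer.secret; }
    }

    int combined() {
        Helper h = new Helper();          // enclosing class calls the private constructor
        return h.hidden + h.peek(this);   // and reads the nested class's private field
    }

    public static void main(String[] args) {
        System.out.println(new PrivateAccessDemo().combined()); // prints 42
    }
}
```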

java nested class — static^non-static

See post on [[philosophy of nested classes]].

Let’s put aside anon classes.

By default, all my nested classes start with the “static” keyword.

Practically, whenever the nested class needs access to enclosing class’s instance fields (outerInt1 for eg) i remove the “static” keyword and make the class non-static, effectively adding an implicit reference to the enclosing instance into the nested class. Nested class can then access outerInt1 via Outer.this.outerInt1, where Outer stands for the enclosing class’s name.
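To see the difference side by side, here is a sketch with both flavours in one enclosing class. EnclosingDemo, StaticNested and Inner are made-up names. outerInt1 is the field from the paragraph above.

```java
// Hypothetical demo class, not from any library.
class EnclosingDemo {
    private int outerInt1 = 5;

    // static nested: no hidden link to an enclosing instance, so one must be passed in
    static class StaticNested {
        int read(EnclosingDemo e) { return e.outerInt1; }
    }

    // inner (non-static): carries an implicit reference to its enclosing instance
    class Inner {
        int read() { return EnclosingDemo.this.outerInt1; } // plain outerInt1 also works
    }

    static int viaStaticNested() {
        return new StaticNested().read(new EnclosingDemo());
    }

    static int viaInner() {
        EnclosingDemo e = new EnclosingDemo();
        EnclosingDemo.Inner in = e.new Inner();   // an inner object needs an enclosing instance
        return in.read();
    }

    public static void main(String[] args) {
        System.out.println(viaStaticNested()); // prints 5
        System.out.println(viaInner());        // prints 5
    }
}
```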

 

java private nested classes: my habits

A major justification and usage of nested class is constructor access control. To appreciate, first appreciate immutables, final, singletons, protected, …. Note you can finish most projects without using these access control devices, but I tend to spend extra time adding such access controls. In complex concurrent systems, they reduce risk and add much-needed reassurance. Constructor access control is one of the controls.

Majority of my nested classes have only private constructors. I usually call them from the enclosing class.

Furthermore, all my nested classes start off as private static classes, and become protected only when necessary.

public static nested class

XR

(another blog)

Just as constants can be defined in classes but better defined in interfaces, public nested static classes are better defined in interfaces. I feel this might be a best practice. It's more reusable and accessible.

More importantly, this is more readable than if defined in a class. When we define a public static nested class in an enclosing class, it appears to be tied to that class, but i feel that's an illusion. By putting the class in an interface, it's clearly presented as part of an open, shared interface and not tied to any object in the design.

However, i don't really know why we need public static nested classes at all. They look completely unnecessary.

In general, i feel anything that can go into classes or interfaces had better go into interfaces.

Coming back to nested classes, experts say static is better than non-static, if we have a choice. I agree, on the basis of readability, semantics, flexibility and loose coupling.

philosophy of java nested classes

“inner classes” = non-static nested classes. For beginners, let’s focus on typical scenarios:
1) C.java encloses N as a *private* inner class.
2) C.java has a field n1 of type N

Q: how would you use an instance of N? What does this object in memory represent?
A: a component of an object c (of type C).

Internalize — this.n1 is very similar to … a regular field, whereas N.class is more like a regular class than a C field. Allow me to repeat — Whenever you look at an inner class N, it’s very similar to a regular class.

Q: what does this.n1 resemble most? A field in an instance (c) of C?
A: No. Suppose a field j (of type J) in C has method j.m1(). m1() can’t (N can) access C’s private members
A: i think this.n1 most resembles a sophisticated and “trusted field“, a regular field like j + the additional trust. The trust means this.n1 can have methods to access C’s private members.

Q: is such a construct never necessary and can be achieved using regular OO constructs?
A: No. The “trust” is hard to achieve otherwise.


Q: where do you instantiate N? How do you pass the instance around? How do you call N’s instance methods?
A: all inside C. Outside C, no other objects can see N

Q: how about a public (instead of private) inner class N2? What’s the use case or justification?
A: I would make N2’s constructor private, so outer class C.java is the only access point. So an instance (n2) of N2 becomes a slave object dedicated to the outer object. Note trust still applies.

——-
Now for static nested class S declared in C.java
Q: how would you use an instance of S? What does this object in memory represent?
A: not part of a C instance.
Q: what does this instance resemble most? A static field in an instance (c) of C?
A: No. I think in some usages this instance most resembles a better static method wrapper. You can group static methods in C and move them into S and make them non-static [1] inside S. An alternative design is the System.out pattern: put C’s static methods [2] (converting them to non-static) into a regular class A, and create an A instance as C’s static field. However, static nested class S (not A) can access C’s private static members.
A: in the case of AbstractMap.java, static nested class resembles….?

[2] they lose access to C’s private static members.

[1] I think most methods in S should be non-static. Static methods in a static nested class are a waste of time.

Q: where do you instantiate S? How do you pass the instance around? How do you call S’s instance methods?
A: instantiate in a static method in C. You can also do so in a static initializer or a non-static method.

swig vs javah, briefly

In the JNI world,

A “native method” (NM) is basically the skeleton one-liner method prototype in a *.java source file. Related to it …
a “native Function” (NF) is a special C function implemented using the JNI types (included from jni.h + jni_md.h) and conforming to JNI standard.

This native function is often a wrapper over an Existing Function (EF) written in unconstrained C and has the real business logic.

SWIG will wrap C++ EF, whereas javah command takes NM and creates C/C++ NF but in the form of a forward declaration (a.k.a. prototype). Based on that NF you then implement business logic. (javah was removed in JDK 10; javac -h now generates the same headers.)

Swig starts from C++ EF; javah starts from NM.

Swig generates a lot of layers (the “entire stack”) — including pure C++ wrapper classes all the way to java proxy classes and anything (if any) in between. In contrast, javah generates just one of the layers (see above.)

4 ways to set jni search path

1) On all platforms, -Djava.library.path is all you need. But if you can’t modify the java command line, then there are a few alternatives

On *nix,
2) $LD_LIBRARY_PATH can include the directory of your libYourFile.so

On windows,
3a) if YourFile.dll is in the current directory, then it will be automatically  picked up

3b) otherwise, include its folder in %PATH%, which typically includes ….\bin folders and not …\lib, so be prepared

See P 17[[ JNI ]], which is on http://java.sun.com/docs/books/jni/html/start.html
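When a load fails, it helps to print what the JVM is actually searching. A small sketch using two standard System calls. LibraryPathDemo is a made-up name and YourFile is the placeholder library name used above. Output depends on platform and command line, so none is shown.

```java
// Hypothetical demo class, not from any library.
class LibraryPathDemo {
    // The directories System.loadLibrary() searches, as one path string.
    static String searchPath() {
        return System.getProperty("java.library.path");
    }

    // The platform-specific file name the JVM looks for,
    // eg libYourFile.so on linux or YourFile.dll on windows.
    static String fileNameFor(String libName) {
        return System.mapLibraryName(libName);
    }

    public static void main(String[] args) {
        System.out.println("search path: " + searchPath());
        System.out.println("file name:   " + fileNameFor("YourFile"));
    }
}
```

Run it once plainly and once with -Djava.library.path=/some/dir to see the first line change.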

subverting java private constructor #some c++

Private constructor is not watertight. Among the consequences, singletons must be carefully examined.

  • reflection
  • ObjectInputStream.readObject() — see other posts
  • de-serialization can instantiate the class without constructors (or field initializers).
    • RMI, EJB, JMS, Web service all use serialization
    • any time we copy an object across 2 jvm processes
  • if you see a private constructor, don’t feel safe — the class may be a nested class. In that case the enclosing class can call the private constructor. Worse still, another nested class (sister class) can also call the private constructor. And enclosed classes too. In summary all of these classes can call my private ctor —
    • ** my enclosing class
    • ** my “sister” classes ie other classes enclosed by my enclosing class
    • ** my “children” ie enclosed class including anonymous classes
    • ** my “grand-children” classes
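Two of the loopholes above in one sketch: the enclosing class calling the private constructor directly, and reflection doing the same. PrivateCtorDemo and Single are made-up names. The de-serialization loophole is not covered here.

```java
import java.lang.reflect.Constructor;

// Hypothetical demo class, not from any library.
class PrivateCtorDemo {
    static final class Single {
        static final Single INSTANCE = new Single();

        private Single() { }   // looks watertight
    }

    // Loophole 1: the enclosing class can simply call the private constructor.
    static Single viaEnclosingClass() {
        return new Single();
    }

    // Loophole 2: reflection, after switching off the access check.
    static Single viaReflection() {
        try {
            Constructor<Single> ctor = Single.class.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaEnclosingClass() != Single.INSTANCE); // prints true
        System.out.println(viaReflection() != Single.INSTANCE);     // prints true
    }
}
```

A common defence is to make the constructor throw when INSTANCE is already set, or to use an enum.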

— in c++

  • Friend function and friend class can call my private ctor.
    • Factory as friend
  • static method is frequently used by singletons

writeObject() invoked although never declared in any supertype

Usually, a common behavior must be “declared” in a supertype. If base type B.java declares method m1(), then anyone having a B pointer can invoke m1() on our object, which could be a B subtype.

However, writeObject(ObjectOutputStream) [and readObject] is different. You can create a direct subtype of Object.java and put a private writeObject() in it. Say you have an object myOb and you serialize it. In the classic Hollywood tradition, Hollywood calls myOb.writeObject(), even though this method is private and never declared in any supertype. Trick is reflection — Hollywood looks up the method named writeObject —

writeObjectMethod = getPrivateMethod(cl, “writeObject”, …
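A sketch to prove that the private hook really gets called. WriteObjectHookDemo and Payload are made-up names, and the counter is only there to make the call observable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not from any library.
class WriteObjectHookDemo {
    static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        static int hookCalls = 0;
        int value = 3;

        // private, declared in no supertype, never called by our own code
        private void writeObject(ObjectOutputStream out) throws IOException {
            hookCalls++;
            out.defaultWriteObject();   // let the default mechanism write the fields
        }
    }

    // Serializes one Payload and reports how many times the hook ran.
    static int serializeOnce() {
        int before = Payload.hookCalls;
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new Payload());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return Payload.hookCalls - before;
    }

    public static void main(String[] args) {
        System.out.println(serializeOnce()); // prints 1
    }
}
```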

##java magic across domains

“Any sufficiently advanced technology is indistinguishable from magic.” If you don’t know these technologies, you can’t imagine how some things are achievable.

* runtime code generation — cglib and proxies (since jdk 1.3)
* bytecode engineering
* reflection
* AOP

Also powerful:
* threads in private inner classes

variable can’t live on the heap; only objects can

See also post [[a heap-stack dual variable]]

An object can live on the heap or the stack; but a variable can’t live on the heap. It’s either a stackVar or a field (or occasionally a global). That raises the questions

Q: what if a stack ref/ptr seated to a heap obj gets out of scope?
A: leak. unreachable object — need garbage collector.

Q: what if a field ptr/ref seated to a heap obj gets destructed with its host object?
A: the host dtor simply frees/reclaims the 4-byte memory of the pointer itself, without calling delete on the ptr. Item 7 [[eff c++]] says the “dtor” of a ptr is a no-op.
A: custom virtual dtor needed in the host object.

In java, a variable is either a field or a stackVar. An object is always on the heap

stepping through class/object loading

Based on P28, 30 [[ Java Precisely ]], and P110 [[practical java]]. There are dozens of important details [1]. Here we cover a few interesting observations. Assuming class C extends B, extending A.

Step: static initializer blocks and field initializers run, in order of appearance. Once static fields are initialized, they are available for use by all including static methods.

Step: static methods loaded and available to be called from the call-stack

— By this /milestone/, the class is “loaded” with all static stuff ready —

Note: Before any C-specific initializations, B() always *completes its steps* and returns a complete B object, to be wrapped in the onion.

Let’s skip ahead and look at…
Step K1: A’s instance field initializers and instance initializer blocks run, in order of appearance. These always appear outside the constructor.

Step K2: A() statements.

Note: By this time, no B state-initialization[2]. However, A() statements could call a B method — see [[baseclass dependent on subclass]]

Repeat K1 and K2 for B, and then C

[1] see Example 60 [[ Java Precisely ]].
[2] i think this is obvious. loading B’s method definition doesn’t count.
[3] Obviously, Object() and A() must complete beforehand.

java: static method inherited even if hidden

–I guess not inherited:
a static method m1 is tied to the super class. I think a subclass can only define another m1 to hide it.

Refer to the other post on “superclass instance inside subclass instance”. Static members are not INSIDE the onion and not inherited.

— Now I believe yes inherited:
P 22 [[ java precisely ]] says “inherits all methods … but not the constructors”
p 45 has a real example to prove that the superclass static method is still accessible even if shadowed (hidden) by a subclass static method (of the same signature, of course)
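My own sketch of the same point (not the example from the book). StaticHidingDemo, A, B and C are made-up names. C defines nothing yet can call m1 through its own name, and B's m1 hides A's without overriding it.

```java
// Hypothetical demo class, not from any library.
class StaticHidingDemo {
    static class A {
        static String m1() { return "A.m1"; }
    }

    static class B extends A {
        static String m1() { return "B.m1"; }   // hides A.m1, no override involved
    }

    static class C extends A { }                // inherits m1 untouched

    // Static calls bind to the declared type of the reference, not the runtime type.
    static String viaBaseReference() {
        A ref = new B();
        return ref.m1();                        // compiles with a warning, calls A.m1
    }

    public static void main(String[] args) {
        System.out.println(C.m1());             // prints A.m1, inherited
        System.out.println(B.m1());             // prints B.m1, the hiding version
        System.out.println(A.m1());             // prints A.m1, still accessible
        System.out.println(viaBaseReference()); // prints A.m1
    }
}
```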

finally execution is so-called guaranteed UNLESS

[24 May 2007] guaranteed by JVM.

q: unless system powers off before it can?
A: agree. since the guarantee is given by jvm.

q: unless the fatal error leads to an immediate jvm crash?
A: If JVM can manage the fatal error, it can run finally{}. If the JVM process itself dies (a real crash, or kill -9), nothing is left to run finally{}.

q: what if there’s an endless loop in the try or catch?
A: then finally{} is expected to be delayed. JVM will honor the class owner’s expectations.

q7: when does my finally run if my catch says exit()?
a7: if exit() means System.exit(), finally does not run, because the JVM starts shutting down and the call never returns. If the catch merely returns or rethrows, finally still runs. Perhaps finally can do a return and /disarm/ the bomb?

q690: when does my finally run in the presence of my catchers and also catchers up on the call stack?
a690: right before the search for a /matching catch/ in the stack? see other blog posts on try/catch/finally execution order

if constructor throws ..

myInstance = new MyClass() ;

Will myInstance become null or …?

I feel the assignment should leave myInstance unchanged. The constructor (which strictly speaking is not a “method”), like a method that throws, won’t return anything to the caller. The constructor, the caller, and upstream callers may each be aborted.
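A quick sketch to check that hunch. CtorThrowsDemo is a made-up name, and MyClass here is a stand-in whose constructor throws on demand.

```java
// Hypothetical demo class, not from any library.
class CtorThrowsDemo {
    static class MyClass {
        MyClass(boolean fail) {
            if (fail) throw new IllegalStateException("ctor failed");
        }
    }

    static boolean keepsOldValue() {
        MyClass original = new MyClass(false);
        MyClass myInstance = original;
        try {
            myInstance = new MyClass(true);   // throws, so the assignment never happens
        } catch (IllegalStateException e) {
            // swallowed for the demo; real code should handle or rethrow
        }
        return myInstance == original;        // not null, not a half-built object
    }

    public static void main(String[] args) {
        System.out.println(keepsOldValue()); // prints true
    }
}
```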

See blog on exceptions in call-stack

 


For c++, a throwing ctor is common. If on the heap, the compiler will release the memory.

java always pass-by-value

A ref-type argument is passed by value — a copy of the remote-control, pointing to the same object

Once you point “critique_arg” to a new object, this method loses contact with the original critique object in the caller method.

private boolean recordFault(Critique critique_arg, String brokenSlot){
    String message = "Required slot " + brokenSlot.toUpperCase() + " missing.";
    // re-seating the parameter changes only this method's copy of the reference
    critique_arg = new Critique(Critique.Severity.Critical, 0, slot, Critique.Type.ORDER);
    // ... rest of the method not shown
}

By the way, the original object becomes eligible for garbage collection if the variable in the caller method also gets assigned a new object and nothing else references it.
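A self-contained sketch of the remote-control idea. PassByValueDemo is a made-up name, and this Critique is a one-field stand-in, not the real class from the snippet above.

```java
// Hypothetical demo class, not from any library.
class PassByValueDemo {
    static class Critique {
        String text;
        Critique(String text) { this.text = text; }
    }

    // Re-seats the local copy of the reference. The caller never notices.
    static void reassign(Critique arg) {
        arg = new Critique("replaced");
    }

    // Uses the copied reference to reach the shared object. The caller sees this.
    static void mutate(Critique arg) {
        arg.text = "mutated";
    }

    static String afterReassign() {
        Critique c = new Critique("original");
        reassign(c);
        return c.text;
    }

    static String afterMutate() {
        Critique c = new Critique("original");
        mutate(c);
        return c.text;
    }

    public static void main(String[] args) {
        System.out.println(afterReassign()); // prints original
        System.out.println(afterMutate());   // prints mutated
    }
}
```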

portableremote.narrow() unable to cast between objects loaded by 2 class loaders

Solution:

mscope.jar should not include com/titan/**/*.class, so
AdaptiveClassLoader [1] won’t load them.

Which classloader will load these classes? By default, classes mentioned on
the classpath are loaded by the default classpath classloader.

[1] This is a custom class loader to load from mscope.jar. It’s a descendant of the classpath classloader.

javac^JIT – 2 compilers #priming

“compile->compile” rather than “compile-interpret”. More specifically,

1) Compile time – java source code is first compiled (by javac) to platform-independent byte code. You can move these *.class files to any machine of any arch. However, by definition, platform-independent code is not optimized for any platform.
2) Run time – byte code is compiled to platform-specific machine code “just in time”. The machine code is similar to that produced by gcc.

This is rather essential knowledge. I think sometimes quizzed on interviews.

O’Reilly [[java performance]] P75 has a concise introduction of Hot-Spot JIT. The JIT only compiles the most critical “hotspots”, not those parts of the byte code executed once only. Therefore, the JIT needs some priming (i.e. warm-up) phase, during which it collects statistics about each code chunk, while executing the half-cooked byte code from Phase One. Based on the statistics it makes heuristic optimizations, as illustrated on P76.

Note it’s possible to execute the byte code from Phase One without second-phase compilation. This is actually optimal when the methods are executed only once — interpreting is faster for them!

This description is consistent with Ab-Initio architect who said in 2012 (not in his original words) that java system needed to consume some typical input data in priming, before throughput became comparable to c++. He said you need to “throw some data at the JVM”.

## how might jvm surpass c++]latency #MS #priming discussed priming esp. in trading.