3rd effect@volatile ] java5

A Wells Fargo java interviewer said there are 3 effects. I named

  1. load/store to main memory on the target variable
  2. disable statement reordering

I think interviewer mentioned a 3rd effect about memory barrier.

This quote from Java Concurrency in Practice, chap. 3.1.4 may be relevant:

The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable. So from a memory visibility perspective, writing a volatile variable is like exiting a synchronized block and reading a volatile variable is like entering a synchronized block.

https://stackoverflow.com/questions/9169232/java-volatile-and-side-effects address my doubt about “other writes“. Nialscorva’s answer echoes the interviewer:

Before java 1.5, the compiler can reorder the two steps

  1. construction of the new object
  2. assigning the new address to the variable

In such a scenario, other threads (unsynchronized) can see the address in the variable and use the incomplete object, while the construction thread is preempted indefinitely like for 3 hours!

So in java 1.5, the construction is among the “other writes” by the volatile-writing thread! Therefore, the construction is flushed to memory before the address assignment. Below is my own solution, using a non-static volatile field:

public class DoubleCheckSingleton {
	private static DoubleCheckSingleton inst = null;
	private volatile boolean isConstructed = false;
	private DoubleCheckSingleton() {
		/* other construction steps */
		this.isConstructed = true; //last step
	DoubleCheckSingleton getInstance() {
		if (inst != null && inst.isConstructed) return inst;
		synchronized(DoubleCheckSingleton.class) {
			if (inst != null && inst.isConstructed) return inst;
/**This design makes uses of volatile feature that's reputed to be java5
* Without the isConstructed volatile field, an incomplete object's 
* address can be assigned to inst, so another thread entering getInstance()
* will see a non-null inst and use the half-cooked object 😦
* The isConstructed check ensures the construction has completed
			return inst = new DoubleCheckSingleton();

Wells IV #socket/c++/threading

This is one of the longest tech interviews and one of the most enriching 🙂 even though I don’t “like” all their questions — very academic/theoretical, text-book driven, too low-level to be practical, not testing practical zbs.

Q: Any consistently reliable technique to detect stale order in your book?

Q: what’s your frame size if your tcp/ucp buffers sizes are so large?

Q: experience storing ticks?

—— C++ questions:
Q1: how can you avoid the cost of virtual function?
%%A: enum/switch or CRTP

Q1b: what’s CRTP

Q: what if your copy ctor signature is missing the “const”?
%%A: you can’t pass in temporaries or literal values. Such arguments must pass into “const &” parameter or “const” parameter. Correct.

Q: double delete?

Q: policy class? traits class?

Q: STL binders.. use case?

Q: how many types of smart pointers do you use?

Q: difference between java generics vs c++ templates?
%%A: type erasure. Java Compiler enforces many rules, but bytecode saves no info about the type argument, so we can get strange runtime errors.
%%A: template specialization. Template meta-programming
%%A (accepted): code bloat since each template instantiation is a separate chunk of code in the object file and in memory.
A: I have a dedicated blog post on this.

Q: what’s returned by a std::queue’s dequeue operation when it’s empty?
AA: undefined behavior, so we must check empty() before attempting dequeue. I believe ditto for a std::stack

Q: why template specialization?
%%A: customize behavior for this particular vector since the standard implementation is not suitable.

Q: how do you implement a thread-safe c++singleton
A: not obvious. See concurrent lazy singleton using static-local var

Q12: in a simple function I have
vector v1 = {“a”, “b”}; vector v2 = v1; cout<<….
What happens to the ctor, dtor etc?
A: 2 std::strings constructed on heap, vector constructed on stack; 2nd vector copy-constructed on stack; 2 new strings constructed on heap; vector destructors deletes all four strings
A: the actual char array is allocated only once for each actual value, due to reference counting in the current std::string implementation.

Q12b: you mean it’s interned?

Coding question: implement void remove_spaces(char * s) //modify the s char array in place. See %%efficient removeSpaces(char*) #Wells

—— threading, mostly in java
Q: What are the problems of CAS solutions?
A: too many retries. CAS is optimistic, but if there are too many concurrent writes, then the assumption is invalid.
%%A: missed update? Not a common issue so far.

%%Q: Synchronized keyword usage inside a static method?
AA: you need be explicit about the target object, like synchronized(MyClass.class)

Q21: Name 3 effects of java volatile keyword — Advanced. See 3effects@volatile ] java5.

Q21b: analyze the double-checking singleton implementation.
staticInst = new Student(); // inside synchronized block, this can assign an incomplete object’s address to staticInst variable, which will be visible to an unsynchronized reader. Solution – declare staticInst as volatile static field

—— System programming (ANSI-C) questions
Q: have you used kernel bypass?

Q5: how do you construct a Student object in a memory-mapped-file?
%%A: placement new

Q5b: what if we close the file and map it again in the same process?
%%A: the ptr I got from malloc is now a dangling ptr. Content will be intact, as in Quest

Q5c: what if Student has virtual functions and has a vptr.
%%A: I see no problem.

Q: you mentioned non-blocking TCP send(), so when your send fails, what can you do?
%%A: retry after completing some other tasks.

Q4: why is UDP faster than TCP
%%A: no buffering; smaller envelopes; no initial handshake; no ack;
%%A: perhaps most important for market data — multiple recipient with TCP is very heavy on sender and on network

Q4b: why buffering in tcp?
%%A: resend

Q: can tcp client use bind()?
%%A: bind to a specific client port, rather than a random port

Q6: is there socket buffer overflow in TCP?
A: probably not

Q6b: what if I try to send a big file by TCP, and the receiver’s buffer is full? Will sender care about it? Is that reliable transmission?
A: Aquis sends 100MB file. See no overflow]TCP slow receiver #non-blocking sender

Q: in your ticking risk system, how does the system get notified with new market data?
%%A: we use polling instead of notification. The use case doesn’t require notification.

Q: any network engineering techniques?

Q: what kernel parameters did you tune?


—— Stupid best-practice questions:
Q: what’s the benefit of IOC?

Q: Fitnesse, mock objects

Q6: motivation of factory pattern?
Q6b: why prevent others calling your new()?
%%A: there are too many input parameters to my ctor and I want to provide users a simplified façade
%%A: some objects are expensive to construct (DbConnection) and I need tight control.
%%A: similarly, after construction, I have some initialization in my factory, but I may not be allowed to modify the ctor, or our design doesn’t favor doing such things in the ctor. I make my factory users’ life easier if they don’t call new() directly.
%%A: I want to centralize the logic of how to decide what to return. Code reuse rather than duplication
%%A: I can have some control over the construction. I could implement some loose singleton

java lock: atomic^visibility

You need to use “synchronized” on a simple getter, for the #2 effect.!


A typical lock() operation in a java method has two effects:

  1. serialized access — unless I release the lock, all other threads will be blocked grabbing this lock
  2. memory barrier — after I acquire the lock, all the changes (on shared variables) by other threads are now visible. The exact details are .. to be detailed.

I now feel the 2nd effect is often more important (and more tricky) than the 1st effect. See P94 [[Doug Lea]] . I like this simple summary, even if not 100% complete —

“In essence, releasing a lock forces a flush of all writes from working memory….and acquiring a lock forces reload of accessible fields.”

Q: what are the subset of “accessible fields” in a class with 9 fields?
A: I believe the compiler knows “in advance” what subset of fields will be accessed after lock acquisition.

Q: what if I acquire a lock, do nothing and release the lock? Is there the #2 effect?
A: I doubt it. You need to enclose all the “tricky” operations between a lock grab/release. If you leave some update (to a shared mutable) outside in the “cold”, then #1 will fail and #2 may also fail.



y concurrentHM.size() must lock entire map#my take

Why not lock one segment, get the subcount, unlock, then move to next segment?

Here’s my take. Suppose 2 threads concurrently inserts an item in each of two segments. Before that, there are 33 items. Afterwards, there are 35 items. So 33 and 35 are both "correct" answers. 34 is incorrect.

If you lock one segment at a time, you could count an old value in one segment then a new value in another segment.


java ReentrantLock^synchronized keyword

I told Morgan Stanley interviewers that reentrantLock is basically same thing as Synchronized keyword. Basically same thing but with additional features:

  • Feature: lockInterruptibly() is very useful if you need to cancel a “grabbing” thread.
  • Feature: tryLock() is very useful. It can an optional timeout argument.

Above features all help us deal with deadlock:)

  • Feature: Multiple condition variables on the same lock.
  • Feature: lock fairness is configurable. A fair lock favors longest-waiting thread. Synchronized keyword is always unfair.
  • — query operations —
  • Feature: bool hasQueuedThread(targetThread) gives a best-effort answer whether targetThread is waiting for this lock
  • Feature: Collection getQueuedThreads() gives a best-effort list of “grabbing” threads on this lock
  • Feature: Collection getWaitingThreads (aConditionVar) gives a best-effort view of the given “waiting room”.
  • Feature: int getHoldCount() basically gives the “net” re-entrancy count
  • Feature: bool isHeldByCurrentThread()

[17]java MS java threading IV#phone

See also https://bintanvictor.wordpress.com/2009/03/21/realtime-inter-vm-communication-in-front-desk-trading-sys/

These are in the QQ category i.e. skills required for QnA IV only.

Q1: 3 threads to print the numbers 1,2,3,4,5… in deterministic, serial order. Just like single-threaded.

Q1b: what if JVM A has T1, T2, and JVM B has T3? How do they coordinate?
%%A: in C++ shared memory is the fastest IPC solution for large data volume. For signaling, perhaps a semaphore or named pipe
%%A: I feel the mutex is probably an kernel object, accessible by multiple processes.

On Windows, mutex, wait handle, … are all accessible cross-process, but java (on Windows or unix) is designed differently and doen’t have these cross-process synchronization devices.

%%A: use a database table with one row one column. Database can notify a JVM.
AA: The java.nio.file package provides a file change notification API, called the Watch Service API. The registered JVM has a thread dedicated to watching.AA: in java, the JDK semaphore is NOT a wrapper of the operation system semaphore so not usable for IPC
A: java Semaphore? Probably not an IPC construct in java.

Q2: have you used any optimized Map implementations outside the JDK?

Q3: to benchmark your various threading solutions how do you remove the random effects of GC and JIT compilation?
%%A: allocate enough memory to avoid GC. Turn off JIT to compile every code path. Perhaps give the JVM some warm-up period to complete the JIT compilation before we start the benchmark.


create java thread pool – constructor^factory

The factory is recommended, but a true blue ThreadPoolExecutor offers additional methods such as getTaskCount(). See


So in reality I may use the constructor.


java Callable returning a char

My favorite return type is char, largely due to the non-null guarantee.

Callable<ReturnType> must use a reference type. Similar to List<E>.

My suggestion: Callable<Character>. The actual call() method would return a primitive character, so there’s a good-enough guarantee that return value is not null.


java exception passing between threads

Many people ask how to make a child thread’s exception “bubble up” to the parent thread.

Background — A Runnable task is unsure how to handle its own exception. It wants to escalate to parent thread. Note parent has to block for the entire duration of the child thread (right after child’s start()), blocked either in wait() or some derivative of wait().

This question is not that trivial. Here are my solutions:

1) Callable task and Futures results — but is the original exception escalated? Yes. P197 [[java threads]]

2) Runnable’s run() method can temporarily catch the exception, save the object in a global variable such as a blocking queue, and notifyAll(). Parent thread could check the global variable after getting notified. Any thread can monitor the gloabal.

If you don’t have to escalate to parent thread, then

3) setUncaughtExceptionHandler() – I think the handler method is called in the same exception thread — single-threaded. In the handler, you can give the task to a thread pool, so the exception thread can exit, but I don’t know how useful.

4) adopt invokeAndWait() design — invokeAndWait() javadoc says “Note that if the Runnable.run() method throws an uncaught exception (on EDT) it’s caught and rethrown, as an InvocationTargetException, on the callers thread”

In c#, there are various constructs similar to Futures.get() — seems to be the standard solutions for capturing child thread exception.
* Task.Wait()
* Task.Result property
* EndInvoke()


##thread cancellation techniques: java

Cancellation is required when you decide a target thread should be told to stop.

Cancellation is a practical technique, too advanced for most IV.

Note in both java and c#, cancellation is cooperative. The requester (on it’s own thread) can’t force the target thread to stop.

C# has comprehensive support for thread cancellation (CancellationToken etc). Java uses a numbers of simpler constructs, described concisely in [[thinking in java]].

* loop polling – the preferred method if your design permits.
* interrupt
* thread pool shutdown, which calls thread1.interrupt(), thread2.interrupt() …
* Future — myFuture.cancel(true) can call underlyingThread.interrupt()

Some blocking conditions are clearly interruptible — indicated by the compulsory try block surrounding the wait() and sleep(). Other blocking conditions are immune to interrupt.

NIO is interruptible but the traditional I/O isn’t.

The new Lock objects supports lockInterruptibly(), but the traditional synchronized() lock grab is immune to interrupt.