AtomicBoolean is irreplaceable

People question the justification for seemingly trivial classes like AtomicSomething. Let’s take the simplest and show AtomicBoolean is irreplaceable.

compareAndSet(true, false) means “if value in main memory before this clock cycle is positive, then toggle it, otherwise do nothing. No locking please.

Read it over and over, and you would realize it’s impossible without Atomics. Volatiles can’t do it, because this involves more than a single load or store. AtomicBoolean does the load and the store in one clock cycle, using special hardware features — “cmpxchg” in intel instruction.

java’s getAndSet() is written using compareAndSet()

Q: Now, since a volatile done/stop flag is frequently used to stop a thread, shall we replace it with an AtomicBoolean?
A: no harm, but I don’t see the need.

AtomicReference + immutable = lockfree #code

import static java.lang.System.out;
import java.util.concurrent.atomic.AtomicReference;

public class AtomicAccount {
    public static void main(String[] a) {
        final AtomicAccount inst = new AtomicAccount(100);
        Runnable r = new Runnable() {
            public void run() {
        new Thread(r).start();
        new Thread(r).start();

    public final AtomicReference AR = new AtomicReference();

    public AtomicAccount(float bal) {
        Account tmp = new Account();
        tmp.bal = bal;
        this.AR.set(new ImmutableAccount(tmp));

     * add 1% to balance. load and store must be atomic.
    void addInterest() {
        while (true) {
            ImmutableAccount b4 = this.AR.get();
            Account tmp = new Account();
            tmp.bal = b4.getBal() * 1.01f;
            ImmutableAccount update = new ImmutableAccount(tmp);
            if (this.AR.compareAndSet(b4, update)) {

    public void printBal() {
        out.println(“current balance = ” + this.AR.get().getBal());

    static private class ImmutableAccount {
        final private Account mutable = new Account();

         * if bal is a non-volatile double, we might read the 2 parts
         * un-atomically.
         * @param a
        public ImmutableAccount(Account a) {
            this.mutable.bal = a.bal;
            // copy other fields

        public float getBal() {
            return this.mutable.bal;

class Account {
    float bal;

AtomicReference + immutable = lockfree

Q: what does atomic mean?
A: “within this method between first and last access to some shared object, no other thread can modify it without me knowing.”

For a mutable bean like Account, we’d love atomic operations like addInterest(). Note there’s no AtomicFloat class.

Note AtomicReference is a wrapper/bridge of the raw pointer. Somewhat similar to how a smart pointer wraps a raw pointer. Our could be a bigger wrapper around the AtomicRefernce wrapper.

Note the cornerstone method AtomicReference.compareAndSet() works solely with addresses, in the comparison and the reseating. Much simpler and faster than equals(), esp. in a lockfree atomic context.

Q: Can we use AtomicReference to add thread safety without locking?

Suppose someone supplies a mutable Account class. We will construct an AtomicAccount class meeting these requirements all without locking —

– Req: List getStocks() to atomically return all the stock symbols held in the account. Without locking, it’s possible to see an invalid list undergoing simultaneous changes in 2 threads.
– Req: addInterest() and addFund(double). Without locking or atomic classes, you can lose updates.
– Req: getBalance()

% assumption: we have strict access control over the Account instance. No thread should use a naked Account reference to access its state.

A key design principle — “If RAW-POINTER wrapped in the AtomicReference is not _reseated_, then the pointee object state is intact.” In other words, account is mutable only by creating a new object. —

Achieved by wrapping Account instance in an ImmutableAccount instance. Change to Account results in a new ImmutableAccount wrapping a new Account. New ImmutableAccount then replaces pointer in the AtomicReference.

Now you realize an AtomicReference is a thread-safe lookup table with just one key-value pair. Key is the unnamed default key; value is an address. Salient features:
* compareAndSet in a loop, lockfree.
* immutables
* optimistic.
* layers of wrapper[1] to give the lockfree advantage

[1] In our example, 3 wrappers on Account ) ImmutableAccount ) AtomicReference ) AtomicAccount

boost thread to run a method of my object (full source

class Worker{
    Worker()    {
        // the thread is not-a-thread ie a dummy thread object, until we assign a real thread object
    void start(int N)    {
        // pass “this” as first arg to processQueue(), which runs not as a method but as a free func in its own thread
        m_Thread = boost::thread(&Worker::processQueue, this, N);
    void join(){m_Thread.join();}
    void processQueue(unsigned N)    {
        float ms = N * 1e3;
        boost::posix_time::milliseconds workTime(ms);
        std::cout << "Worker: started, will work for "
                  << ms << "ms"
                  << std::endl;
        std::cout << "Worker: completed" << std::endl;
    boost::thread m_Thread;
int main(int argc, char* argv[]){
    std::cout << "main: startup" << std::endl;
    Worker worker;
    std::cout << "main: waiting for thread" << std::endl;
    std::cout << "main: done" << std::endl;
    return 0;

Compile a boost regex/thread-based project – minGW

You configure the -L under project -> properties –> CCG -> LibraryPath

You configure the -l under project -> properties -> CCB -> settings -> toolSettings tab -> MingwC++Linker -> library
— regex
cd C:xbakideWorkspaceec_boost1Debug
g++ -IC:boost_1_44_0 -O0 -g3 -Wall -c -fmessage-length=0 -osrcec_boost1.o ..srcec_boost1.cpp
g++ -LC:boost_1_44_0stagelib -oec_boost1.exe srcec_boost1.o -lboost_regex-mgw44-1_44

— boost::thread is more complicated cos the lib file(s) you built with bjam seem to be a hybrid of static and dynamic lib. Perhaps a static import library used to pull in the dll file (yes there is a dll produced). In that case, at runtime the dll must be available on PATH.

cd C:xbakideWorkspaceec_boost1Debug
g++ -IC:boost_1_44_0 -O0 -g3 -Wall -c -fmessage-length=0 -osrcec_boost1.o ..srcec_boost1.cpp
g++ -LC:boost_1_44_0stagelib -Bstatic -oec_boost1.exe srcec_boost1.o -lboost_thread-mgw44-mt-1_44

If you see error about missing dll, then copy the named dll to the location of your exe, or a %PATH% dir. Why the error? Dependency Walker shows LIBBOOST_THREAD-MGW44-MT-1_44.DLL is a dependency of your new born exe. Apparently, there’s no real static linking of the boost thread lib.

##decouple to refactor a degenerating codebase

An OO design often starts cleanly decoupled, then gets /entangled/. Over time, tight coupling invariably creeps in (among other code smells) –>

* a class having (too) many fields pointing to other classes we wrote. The laziest way to put some existing logic into another class is to add a field , so-called “dependency” or “service”.
** too many setters
** complex constructors
* too many “new SomeClass” all over the place
* too many objects passed around as method arguments
* too many “public” members

Drawback — rigid, fragile — see post on …

— Some of my favorite refactoring fixes —
– a lot of assertions to document my refactoring assumptions.
– c# style static classes — all-static stateless classes
– black-box Facade
– More shared interfaces
– Factory
– Tuple classes for many-to-many association.
– POJO/DTO classes without methods — you could slowly add one or 2 methods but not recommended when cleaning up a degenerate codebase.
– Enums

– (code smell) use exceptions to break out of loops and call stacks.

[10] return value optimization, 1st encounter

I got a surprise when I run the program below with g++ (mingw port).

* returnLocalVar() should return by value, by calling the copy constructor, but copy constructor didn’t get called.
* localCat object should be destructed when it goes out of scope at end of the function, but no it was kept alive till end of main()
* ret is a local variable in main(), not a reference, so i expect ret to be a copy of localCat, with a separate address, but no they 2 variables have the same address.

#define S(x) std::cout<<#x” address =\t”<<&x<<“\n”;
using namespace std;
struct CAT {
CAT() {
cout < no-arg constructor with addr = ” << this << “\n”;
} // default constructor
~CAT() {
cout << this << ” Destructor called!\n”;
CAT(const CAT& rhs) {
cout < copy constructor ” << this << ” <- ” << &rhs << “\n”;
CAT returnLocalVar() {
CAT localCat;
cout << “leaving returnCat\n”;
return localCat;
// localCat is not destructed if result is assigned
int main() {
CAT ret = returnLocalVar();
cout << “done!\n”;

Answer from a friend:

return value optimization (RVO). It is an optimization specifically blessed by the C++ specification. While the compiler is creating the returnLocalVar function, it is allowed for it to create the CAT temporary in main’s stack space instead of returnLocalVar’s stack space. Then, instead of returning it by making a copy, it just leaves it there on the stack where main can find it. The compiler can’t always figure out whether that would be safe, so it only does it in predictable circumstances.

Usage of ScheduledFuture and Callable in our pricing engine

The dependency manager may have submitted 1000 requests for repricing and the engine has gotten through 300 of them. If the dependency manager knows that for example position 106 no longer needs a revaluation, it can use the future object to cancel it.

Future also gives us access to the result of the task, but we mostly use Future to cancel tasks for multithreading efficiency

release test failures and code quality

(blogging again) XR, You mentioned some colleagues often release (to QA) code that can’t complete an end-to-end test .. (Did I got you wrong?)

That brings to mind the common observation “95% of your code and your time is handling the unexpected”. That’s nice advertisement and not reality — a lot of my GS colleagues aggressively cut corners on that effort (95% -> 50%), because they “know” what unexpected cases won’t happen. They frequently assume (correctly) a lot of things about the input. When some unexpected things happen, our manager can always blame the upstream or the user so we aren’t at fault. Such a culture rewards cutting-corners.

I used to handle every weird input combination, including nulls in all the weird places. I took the inspiration from library writers (spring, or tomcat…) since they can’t assume anything about their input.

I used to pride myself on my code “quality” before the rude awakening – such quality is unwanted luxury in my GS team. I admit that we are paid to serve the business. Business don’t want to pay for extra quality. Extra quality is like the loads of expensive features sold to a customer who just needs the basic product.

As we discussed, re-factoring is quality improvement at a price and at a risk. Business (and manager) may not want to pay that price, and don’t want to take unnecessary risk. I often do it anyway, in my own time.

pricing control in a bond dealer desk

Pricing (along with pnl) is one of the most important data to monitor and control. There’re multiple levels of price controls.
* Offer/bid price limits, to block out-of-range offer/bid advertisements
* the price in a response to a IFB is typically sent out via a system and is probably subject to price control, to prevent bidding too high.
* After trade execution, Middle Office would check the price against some reference prices. If a trader executed an unusually price, she may be responsible. I was told MO only bothers with unfinished (i.e. unclosed) positions.
* Pricing exception report and attestation. I think this is internal compliance.
* There could be regulations on unusual execution prices in some regulated securities. It’s conceivable that government wants to know every trade’s price in a particular derivative so as to prevent another bank collapse.

trade booking, trade capture, position management, sub ledger

The standard OTC trade booking system (TBS) — After u finalize i.e. execute a trade with your counterparty, typically over phone, you enter the completed trade in the TBS. Some call it trade capture. I think this used to be the trade blotter. Before TBS, people used spreadsheet. This is one of the earliest and most essential IT systems for traders.

The other absolutely essential trading system is the position management system (PMS), aka sub ledger. TBS records all the trade activities, and independently computes current positions by accumulation, and synchs up with the PMS every day.

How about pricing engine? In OTC, trader can decide the price with a pencil or a sophisticated pricing engine. I think it’s firm’s money but trader’s decision, so it’s up to her.

##java GTD knowledge2deepen on daily basis {SocGen

(blogging again)

Even though we can pass most java interviews (in the east coast:)), I find myself learning bits of practical java how-to on a *daily* basis.

* ant and JVM launch script
* reflection
* thread pool
* concurrency constructs
* serialization
* callback listeners
* generics
* swing

They present everyday challenges to a intermediate developer like me, so better knowledge does improve problem solving, code t r a c i n g and design. These challenges illustrate one reason why I feel less proficient in java than in Perl. I can solve most *practical* Perl problems. Java is bigger and present more “problems”. As I spend more years in java, I hope to pick up more know-how. Therefore java remains my primary study subject, along with c++.

In the list above, I have deliberately omitted vendor products as they aren’t core java IMO. But obviously there’s a lot to learn in them.

* spring
* hibernate
* gemfire
* fitnesse

options and futures – related

* options is about future prices. Closely related to futures contracts, futures pricing, futures trading.

* When CBOT was inventing (commodity) options, their options contracts were constructed around futures contracts

* In commodities biz, futures contracts have a right to exist and an economic value, for producers and manufacturers (buyers). options? questionable.

– A futures contract is a contract. One side must delivery; the other must take the delivery.
– An option is an obligation, a one-sided contract. The writer of the option has an obligation to provide (C option) or pay for (P option) the security upon request. Some say an option is not an obligation, but i disagree.

RAII lock auto-release with boost SCOPED_lock

using namespace std;
boost::posix_time::milliseconds sleepTime(1);

class MyQueue {
    void enqueue(const T& x) {
        cout < enqueuing … ” << x << "n";
        boost::mutex::scoped_lock myScopedLock(mutex_);
        cout <> just got lock … ” << x << "n";
        // A scoped_lock is destroyed (and thus unlocked) when it goes out of scope
    T dequeue() {
        boost::mutex::scoped_lock lock(mutex_);
        if (list_.empty()) {
            throw “empty”; // unlock
        T tmp = list_.front();
        cout << "< dequeu " << tmp << "n";
        return (tmp);
    std::list list_;
    boost::mutex mutex_;
MyQueue queueOfStrings;
int reps = 5;
void sendSomething() {
    std::string s;
    for (int i = 0; i < reps; ++i) {
        stringstream st;
        st << i;
        s = “item_” + st.str();
void recvSomething() {
    std::string s;
    for (int i = 0; i < reps*3; ++i) {
        try {
            s = queueOfStrings.dequeue();
        } catch (…) {
            cout << "<- – (    ) after releasing lock n";
int main() {
    boost::thread thr1(sendSomething);
    boost::thread thr2(recvSomething);

initializer in copier — bypass no-arg ctor

Say your copy ctor is invoked (often implicitly;-), and you need to clone a field of type T. If you don’t use the field initializer syntax, then the copier first calls T’s no-arg ctor to “make room” for the field, then calls T’s operator= to modify its state.

If you choose the initializer syntax, then T’s no-arg ctor is not needed. T’s copier is used instead. Chain reaction — Host copier invoking field copier

using namespace std;
template class Safe{
    Safe(const T& obj){ // T aObj will instantiate
        cout<<"Safe copier called"<<endl;
        cout<<"before assignment, "<<&aObj<<endl;
        aObj = obj;
        cout<<"after assignment, "<<&aObj<<endl;
//    Safe(const T& obj):aObj(obj){} // T aObj won’t instantiate
    ~Safe(){    }
    T aObj;

struct Date{
        cout<<"Date no-arg called with this = "<<this<<endl;
    Date(int y, int m, int d){}
    Date(const Date& rhs){
        cout<<"Date copier called with this = "<<this<<endl;
int main() {
    Date d(1,2,3);
    cout<<"& d = "<<&d<<endl;
    Safe dSafe (d);
    cout<<"dSafe.aObj address = "<<&(dSafe.aObj)<<endl;
    return 0;

FIFO read/write lock – another home-made design

1) A writer-friendly rwlock says each writer is assigned an auto-increment queue-num (qnum), and all readers — early-birds or late-comers — all get qnum = -1, which is behind all writers. [[pthreads]] covers this kind of rwlock.

2) A strict FIFO rwlock says each writer is assigned a queue-number (qnum), and between 2 writers, any readers all get the same qnum.

Note all rwlock implementations are based entirely on mutex and condition objects. Nothing else needed.

Naive design 1: everyone waits on the same condition. Last release will notifyAll. The head the queue is self-aware [1] and proceeds. Everyone else goes back to wait. This is inefficient IF too many waiters.
[1] How? The rwlock knows (via a private field) the qnum of the head node. Alternatively, we can use a linked list.

Design 2 (notify only the head node): have the head thread(s) wait on a head-condition; everyone else in the everyone-condition. Whoever sends notify would notifyALL on the head condition, yield, and notifyAll on the everyone-condition. Everyone wakes up but only the next thread(s) in the linked list proceeds to wait in the head condition. However, before they enter, better check if the rwlock is free — it’s possible by the time they  wake up the lock is free.

simple boost thread program


void workerFunc() {
    boost::posix_time::seconds workTime(3);
    std::cout << "Worker: starting up then going to sleep" << std::endl;
    boost::this_thread::sleep(workTime); // Thread.sleep
    std::cout << "Worker: finished" << std::endl;
int main(int argc, char* argv[]) {
    std::cout << "main: startup" << std::endl;
    boost::thread workerThread(workerFunc); // pass a func ptr to thread ctor
    boost::posix_time::seconds workTime(1);
    std::cout << "main: sleeping" << std::endl;
    boost::this_thread::sleep(workTime); // Thread.sleep
    std::cout << "main: waking up and waiting for thread" << std::endl;
    std::cout << "main: done" << std::endl;
    return 0;

Build simple boost project]EclipseCDT^codeBlocks has a simple program using Boost Lamda.

1) immediately clean the project. Eclipse will rebuild and fail due to include path

2) Project -> Properties -> CCG -> pathsAndSymbols -> Includes -> GNU_C++ -> Add -> file system …
…I took the laziest route and simply put in “C:boost_1_44_0” to ignore all checkboxes, and it worked. I used a forward slash and it worked too. However, I removed it and put in “C:boost_1_44_0” and checked all 3 checkboxes -> compiler can’t resolve “#include “

(msvc procedure is well-documented.)

codeblocks is easier with include path — Project -> BuildOptions -> SearchDirectories -> Compiler -> Add

countingSemaphore and read/write lock

Q: Logically, reentrant read-write locks (available in java, boost,objectSpace …) can be achieved using counting semaphores — in theory. How about practically?
%%A: I feel you can implement it using condVar without countingSema.

Q2: I guess counting semaphore uses inter-thread signals. This is either the conditionVar OR the low-level signaling construct described in the post on [[mutex^condVar^countingSemaphore — big 3 synchronization devices]]

%%A2: I feel in some cases it could be the condVar. CountingSema are sometimes written by library-USERS, who must use library utilities.Those low-level signals by definition aren’t exposed to library-USERS.

read/write lock concept simplified, using countingSemaphore

Invariant to be enforced (Ignore the how-to-enforce please) — if a (exclusive) writer is in the critical section, then no other (reader/writer) thread should be seen in the same critical section.

Now a possible enforcement policy — using semaphore with a large number of permits. Any reader must get a permit before entering critical section; a writer must get all permits before entering.

We can learn a lot from DB transaction isolation levels.

FIFO read/write lock #my take

A simple counting-semaphore-based reader/writer lock is unfair.

I proposed long ago that we initialize a large number (2000+, up to max long integer) of permits for each lock instance, and require the writer to block to acquire all the permits in one go. As stated on P89 [[Pthreads programming]], such a design is unfair to writer, as a troop of readers can circulate a token among themselves, A passing to B, passing to C, passing to D … and the writer will wait forever even if it arrives on the scene just after A, and before B, C, D… Note the token is not a permit or a lock, just some object represent a “turn”. When Reader Thread A leaves the scene, A releases the Permit, but only After B takes up another Permit. At any moment in time, there’s some permit owned by the readers. The writer is starved unfairly.

One idea is incremental acquisition by writer, who acquires all the free permits ASAP, and waits to grab A’s permit. I feel this is susceptible to deadlock. It also (fairly or unfairly) disables the “shared read” functionality.

A more fair design puts the candidates into some kind of FIFO queue, perhaps a simple linked list. This tentative design uses 2 condition variables (and 2 associated locks). Whenever the RW lock becomes completely free (all permits freed), a signal is sent by the last returning thread, on the rw_free condition. In contrast to the simple design, there’s only one (special) waiting thread on that condition — a messenger thread, who wakes up and then notifies the first thread in the queue, before going back to wait(). In fact, all the queue threads (to be capped at 200) are waiting on the same queue condition, but only the FIFO queue-head thread is allowed to proceed, while others re-enter wait().

A new reader candidate joins the queue if any, otherwise grabs a permit and proceeds.

A new writer candidate joins the queue if any, otherwise determines if lock is completely free. If not free, then it starts the queue. Queue can be started by writer only.

Once a writer leaves the queue, does its job and frees all permits, all the readers before the next writer proceeds to leave the queue.

The queue holds Nodes, each having a pointer to a thread handle — a java Thread instance or a thread id in C. Any thread returning from wait() would check the queue to see if its own thread handle equals the queue-head’s thread handle.

(A tentative design.)

##java value-add` GTD/KPI skill when u join a trading team

What skills make you a more resourceful and *productive* developer to a fast-paced trading system? These usually have little to do with arch design. Almost never asked in interviews, ironically.

See also my post on 2 types of tools.

1) IDE remote debugging
** breakpoint on exception
** conditional breakpoint
** compile with local vars
2) IDE refactor — rename, search

) reflection, dynamic proxy
) profiling
) jconsole, JMX
) requesting Runtime.gc()
) generics for algorithm abstraction but this tends to be too complex
) code gen? cache triggers

Threading? nope it seldom adds too much biz value

homemade serializer/deserializer: object⇔file

public class Serializer {
	private static final String FILE = "c:\temp\"+Serializer.class.getSimpleName();

		public static void main(String[] args) throws IOException, ClassNotFoundException{
		public static void writeToFile(Object input) {
		try {
			ObjectOutputStream s = new ObjectOutputStream(new FileOutputStream(FILE));
		catch (FileNotFoundException e) {
			throw new IllegalStateException(e);
		catch (IOException e) {
			throw new IllegalStateException(e);

	/** @param Target Type. Will cast the de-serialized object into this type.
	* @return
	* @throws ClassNotFoundException
	public static T readFromFile() throws ClassNotFoundException {
		Object o = null;
		try {
			ObjectInputStream s = new ObjectInputStream(new FileInputStream(FILE));
			o = s.readObject();
		catch (FileNotFoundException e) {
			throw new IllegalStateException(e);
		catch (IOException e) {
			throw new IllegalStateException(e);
		if (o == null) return null;
		T uncompressed = (T)o;
		out.println(uncompressed + " ");
		return uncompressed;

partition – multiple meanings

in GS, PWM desktop has a partition selector. Authorized users (mostly IT) can select one of about a dozen regions (like NorthEast), so system will only query and display that subset of the total data.

DB — Oracle, sybase, mssql have partitioned table

STL algorithm — has a partition() template function

large distributed java/c++ system — P105 [[enterprise java perf]] has a 3-page example on “partitioning”. I feel the idea is to encourage local calls in favor of cross-process data transfer. Serialization is required even between 2 local JVM’s. I believe you can avoid serialization only between threads.

many-to-1 relation — at heart of wpf commanding

To a beginner (like me) wpf commanding has a many-to-1 entity relationship at its core. This m:1 is highlighted in numerous wpf tutorials.

m:1 means 1 logical “task” mapping to multiple physical “controls”. (Note it’s Never many-to-many.) Best example is a simple print (or Save or Close) task mapped to a shortcut  key combination, a button, a menu bar item, a context menu item etc.

It may be slightly imprecise to call them “controls”, but this terminology is nice and simple. Let’s not play language lawyers. Strictly, the shortcut key is actually a KeyGesture instance, and a kind of “command source”. Indeed, Command-source is the technical term — rather abstract.

When you first expand your design from 1:1 to 1:2, adding a 2nd control, you may do a bit of harmless copy-paste. When you expand to 1:m, you need more serious OO modelling and abstraction. The OO abstraction chosen by WPF creators is the commanding infrastructure, an object-oriented solution. (As mentioned in, GUI systems often feature powerful, elegant design patterns.) The commanding pattern addresses the configuration, and the binding — the glue-up, the hook-up, the plumbing between source (and target) “controls” on one hand and task execution logic on the other — the m:1 mapping.

As explained in THE msdn article (, a command binding instance connects 3 kinds of things — command / source / target. The 1:m is “command 1: many Sources” [1]. The target is slightly less important to a beginner.

[1] a KeyGesture instance is also a command source.

subvert: field initializer + constructors #java

I had an apparently watertight base bean class with a field initializer

public final Map properties = Collections.EMPTY_MAP;

Apparently every instance should have an immutable empty map? You will see how this “watertight guarantee” is compromised.

Subclass adds no field. However, at runtime, I found an object of the subclass with == a Hashtable, even populated with data. How could JVM allow it?

More (ineffective) chokepoint — There is only one line of code that “puts” into this Map. It’s in a private method and I added a throw new RuntimeException() //to ensure it never adds any data.

More (ineffective) chokepoint — There’s only one constructor for the base class. I put some println() which, surprisingly, didn’t run.

Short Answer – casting an object received by de-serialization.

Long answer — These bean classes are DTO classes for RMI. Server side is still using the old version, so it conjures up objects with == a populated Hashtable and serializes it to client JVM. Client de-serializes only with respect to the declared type, bypassing field initializer or constructors. So long as the incoming stream can convert to a Map, it’s successfully de-serialized.

c++class templates: container^other categories

I feel most introductory texts and most interview questions follow the “container” theme. There’s at least one field of type T. This prompted me to rethink the Usage, industry adoption and justification of c++ class templates.

Most well-known libraries define class templates. They don’t all follow that “theme”.

In enterprise application teams, however, out of MS I don’t see a lot of brand-new templates created from scratch. (Let’s not hypothesize why.) A lot of template class concretized; perhaps (presumably) some new templates created based on some proven designs, with low risk and low complexity. I do see some templates created in a local library, such as a MatrixDouble or a SmartPtr, but few.

Bottom line — you seldom create your own new class template. If you do, then I would venture to say the first template you /popularize/ is likely a container.

The class templates I created so far —

* matrix of any type like string or number, with user-defined size

* circular array

Template meta programming is deep and complex. However, if a template has no specialization, no non-type param, no nested template as a type param, no inheritance, then it’s manageable. I feel these can be meaningful, practical and useful class templates. My phone is useful without data plan, camera, browser, organizer and other bells and whistles.

when to denormalize, briefly

%%A: when select performance is kind of more important than DML performance
%%A: read-mostly DB

Q: any specific type of application you know
A: reporting, query, analysis, research, search

Q: How about data warehousing? Materialized views can be created from a join (?), therefore denormalized?
A: Not even close. Both can help performance. A Materilized view consists entirely of derived, redundant data. When the original data changes, this copy of redundant data is marked obsolete. If you add a bunch of triggers to always update the Materialized View to keep it in sync with original, then maybe it’s equivalent to a denormalized set of tables.

* joins — big joins are the problem
* summary tables like in GS are examples of denormalized tables.
* star schema on wikipedia

FIFO read/write lock from a java book (multiprocessor)

I feel this is similar to the simple algorithm in wikipedia.

A design adapted from P186 [[art of multiprocessor programming]]. I found a few bugs in the original. The site actually confirmed my suspicions.

However, I don’t know how to test it.

My Adaptations
– await() before readAcquires++.
– perhaps multiple condition objects for different conditions.

Basic ideas —
* Use a private lock (plock) to guard the counters this.readAcquires and this.readReleases, which are updated by readers.
* (first) writer thread if unsuccessful would set flag this.writerIsPending
** subsequent readers block in while(writerIsPending){cond.await(); }before readAcquires++.
** subsequent writers block in while(wlockAcquired){cond.await(); }
** both block in await() rather than busy waiting
* invariant : readReleases <= readAcquires
* how many conditions? 1

/* Created on January 9, 2006, 7:39 PM
* From "Multiprocessor Synchronization and Concurrent Data Structures",
* by Maurice Herlihy and Nir Shavit.
* Modified ...
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantLock;
public class FifoReadWriteLock implements ReadWriteLock {
    private int readAcquires = 0; // read acquires since start
    private int readReleases = 0; // read releses since start
    private boolean writerPendingOrRunning = false; // writer present?
    private final Lock lock = new ReentrantLock(); // short-term synchronization
                                                   // one condition for multiple purposes?
    private final Condition cond = lock.newCondition();
    private final Lock rLock = new ReadLock(); // readers apply here
    private final Lock wLock = new WriteLock(); // writers apply here
    public Lock readLock() {
        return rLock;
    public Lock writeLock() {
        return wLock;
    private class ReadLock extends AbstractLock {
        public void lock() {
            try {
                while (writerPendingOrRunning) {
                    try {
                    }catch (InterruptedException ex) {}
            finally {
        public void unlock() {
            try {
                if (readAcquires == readReleases)
            finally {
    private class WriteLock extends AbstractLock {
        public void lock() {
            try {
                while (writerPendingOrRunning) {
                    try {
                    }catch (InterruptedException ex) {}
                writerPendingOrRunning = true;

                while (readAcquires != readReleases) {
                    try {
                    }catch (InterruptedException ex) {}
            finally {
        public void unlock() {
            writerPendingOrRunning = false;
abstract class AbstractLock implements Lock {
    public void lockInterruptibly() throws InterruptedException {
        throw new UnsupportedOperationException();
    public boolean tryLock() {
        throw new UnsupportedOperationException();
    public boolean tryLock(long time, TimeUnit unit)
        throws InterruptedException {
        throw new UnsupportedOperationException();
    public Condition newCondition() {
        throw new UnsupportedOperationException();

Runtime.gc() calls do make a difference

Hi YH,

You said calling Runtime.gc() is counter-productive. My experiment shows otherwise. Below is a heavy java job that uses about 200MB memory. “JJ_1001001=233564416” means after processing 1001001 rows, 233MB memory was in use. As you can see, usage grows with data volume.

[2010-09-07 14:03:07,132 INFO .Main:39] – {before_JJ=17944080, JJ_1=149162416, JJ_100101=161278496, JJ_200201=177440888, JJ_300301=179908768, JJ_400401=182329728, JJ_500501=198315560, JJ_600601=200645064, JJ_700701=216799944, JJ_800801=218142816, JJ_900901=231948296, JJ_1001001=233564416, before_destroySingletons=143855904}

——– Now I added calls to Runtime.gc() after every 100,000 rows. Memory usage is now kept constant at around 120MB.

[2010-09-07 13:40:12,347 INFO .Main:39] – {before_JJ=17944080, JJ_1=120664072, JJ_100101=120664248, JJ_200201=120664072, JJ_300301=120664504, JJ_400401=120664176, JJ_500501=120664760, JJ_600601=120664584, JJ_700701=120664864, JJ_800801=120664840, JJ_900901=120665336, JJ_1001001=120665256, before_destroySingletons=131827256}

I run the job again. Results are similar, so GC behavior is not 100% predictable —

[2010-09-07 13:57:02,060 INFO .Main:39] – {before_JJ=17944080, JJ_1=120664224, JJ_100101=120663944, JJ_200201=120664528, JJ_300301=120664048, JJ_400401=120664632, JJ_500501=120664608, JJ_600601=120664888, JJ_700701=120664712, JJ_800801=120664992, JJ_900901=120665280, JJ_1001001=120665248, before_destroySingletons=131828080}

Here's how to request gc():

public static void runGC() {
// for whatever reason it helps to call Runtime.gc()
// using several method calls:
for (int r = 0; r < 4; ++r)

private static void _runGC() {
long usedMem1 = usedMemory(), usedMem2 = Long.MAX_VALUE;
for (int i = 0; (usedMem1 < usedMem2) && (i < 1000); ++i) {
usedMem2 = usedMem1;
usedMem1 = usedMemory();

auto properties ^ public fields

Automatically implemented properties are very similar to public fields, but

#) You can’t data bind to a field
#) Auto properties (like all properties) can go into interfaces. Fields can’t.
#) Auto properties can have ONE of get/set be private. If both private, then a field is better.
#) debug into getter or setter with a simple temporary code change. I guess MSVS can’t break-point on field-write alone. For debugger to work, you may need to introduce a private backing field.

See also

5 STL components #letter to friend

Hi YH,

(A blog post, so i won’t spell out your name or email.)

You named 5 STL components. I now feel STL was designed first to provide containers, then algorithms, then iterators, as the 3 main things. In addition, functors and are an efficiency feature. Allocators are an advanced customization tool for those users with special requirement and skill.

To a developer new to c++, iterators aren’t meaningful. In contrast, container has the most visible value and usage — to hold a bunch of objects (or pointers), including nested containers. Most developers would want to use common data structures like arrays, linked lists, sets, lookup tables etc. Unfortunately core c++ provides only fixed arrays. Therefore STL provides parametrized containers to the rescue.

Immediately people would need common operations on these containers, like find, remove, copy, sort, count_if… So STL provides parametrized algorithms on containers. A key design goal is maximum flexibility and re-usability when you mix and match any container with any algorithm. This is hard, even without the stringent requirement of efficiency. It’s so hard that a complete breed of classes were invented — iterators. Numerous authors including STL original developers confirmed that iterator was the glue between containers and STL algorithms.

Among the 3, some are more necessary than others. Ranked in terms of necessity,

1) A container is usable by itself, even without algorithms/iterators. I mean the methods of the container classes, like push_back, get, operator[]
2) However, serious users of containers usually need iterators just to get all the content. This minimal usage need nothing beside the operator++ and operator* .
3) To meet more application requirements, some users write their own loops to do everything, using more features of iterators. So those iterator features are less essential than operator++ and operator* .
4) least necessary are the STL algorithms. They use the full features of iterators.

As an interesting comparison, java’s containers are likewise the most necessary part of the collections framework. Iterators are heavily used but not always needed (except in loops). The non-static methods in and static methods in are similar to STL algorithms, but are somewhat less used. I for one write lots of loops by myself.

use sortKeys AND custom comparator – JTable

After you create a TableRowSorter, you need to install your custom comparator first, before setting sort key list. Reversing the sequence breaks the correct behaviour, but I don’t know why.

If you use sort key list (like sort first by quote price, Then timestamp), you may want to keep the live table sorted that way forever. You need to disable mouse-click-sort on Any columns. Here’s one way to do it —

public final void toggleSortOrder(int column) {}//disabled

Mouse events are still generated, but when the event handler calls into toggleSortOrder() nothing happens.

silent bug if u forget to implement INotifyPropertyChanged

public int PortNumber{
    return _portNumber;
    if (_portNumber != value){
      _portNumber = value;

If host class Agent implements INotifyPropertyChanged, then the wiring works as follows
#1) users click a “++” button
#4) event handler increments PortNumber and saves it via the setter
#6) setter fires prop changed event
#9) xaml binding system queries PortNumber property via getter — you can see this in a debugger

Now things break if you don’t implement INPC. Xaml binding doesn’t call the getter (#9) and won’t display new value, even though
– event firing still happens (#6),
– xaml won’t complain either

I think xaml will still bind to this Agent object, but won’t treat Agent as a “Notifying source”, so the binding system will ignore the event — the invitation to re-query the VM for updates.

sort a large joined table by any field

Hi ZR,

Let’s start with a different challenge — say I have a big table of 10GB data, and a lot of users. If I peek at all the queries by these users, I see they search by every single column! One day I see a column that’s not the first column of any index, I know we have a real problem — a search by this column is a full table scan. Even if this happens once a month, this is unacceptable, as it means one of the valid queries simply won’t work.

My solution — create indices for every column, but make this table readonly for 23 hours, and update it during the 1-hour window. Having so many indices will slow down insert/update/delete. Writers may lock some data and slow down queries.

Assumption — if there are enough indices, then search will be fast enough. This is not true if an index has a very fat key and each index page contains just one key value, and the index tree is very very deep. Also, some searches can’t run fast — search by gender, search where age > 0, search for non-null names… Optimizer will be smart enough to ignore indices since FTS is best query plan. I think these are rare and obvious, so assumption is probably safe.

Now let’s modify the challenge — a big “worktable” after a join, not a big physical table. Say table A has columns a1 a2 a3 .., table B has b1, b2, b3 .. and they are joined.

My solution — create indices on all search columns (in our case all columns). Experiment and keep the physical tables writable all day. Since table A is now much narrower than the big table earlier, writers may be faster. Writes to tables A and B can execute independently. This is similar to the java ConcurrentHashMap design.

From my experience, if table A has 10 columns, it’s ok to create that many indices. If a big table has 200 columns, i am not sure.

Now, your question of sorting a large joined table by any field. I believe the last solution may work. Every sort column is covered by an index, so the driver table will be the host table of the sort column, using the index on the sort column. All rows will come out in the right sequence without further sorting.

reference field are writable even when host object is const

In the example below, the intVar field is a nonref and behaves as expected — can't be modified via a const method, and can't be modified via a variable marked const.

However the reference field this->intRef is not on the real estate of the host object and isn't part of the host object state, so in both cases above intRef is modifiable!

using namespace std;
template struct J{
    J(T & rhs):intRef(rhs){}
    void doIt(T a) const{
        this->intRef = a; // legal even when method is const and is called !
        //this->intVar = a; // illegal when method is const and is called
    T & intRef;
    T intVar;
int main()
    int a=22;
    const J j1(a);
    j1.intRef = 19; // legal even when object is const
    //j1.intVar = 19; // illegal when object is const

Command Object pattern brief notes

Doug Lea (Page 64) said there are many variants of CO. Simplest is a passable object wrapping a single method. Any time you see an (anonymous class) “new Runnable(){” it’s a Command Object. Sooooooo much simpler than the strict CO definition below.

Command object “knows” the receiver — ie HAS-A reference to the receiver
Command object “knows” exactly how to work with the receiver —
Command’s method would call the receiver’s methods.

Q: List the key objects (excluding interfaces) of the pattern? Command, receiver, invoker… Client? could be a non-object, like the main() method.

Q: List the key methods of the pattern? createCommand, setCommand, invoke,

Q: how are the key objects related via HASA or arg-passing or … ?

Q: List exactly at which junctures each object is created? too complicated

— keywords
* decoupled — so much knowledge is in the Command object that the
invoker or receiver need not know each other
* encapsulation — so much knowledge is private in the Command object that the invoker knows none.
* least knowledge — invoker only knows (a group of) command objects, and knows no receiver objects.
* thread pool

This pattern (like many others) treat a “verb” as a noun. It models an action/behaviour as an object.

Is an ICommand instance stateful@@ Usually no

A typical ICommand instance is typically stateless.I think many implementations would keep it stateless.

 I feel good de-coupling dictates that ICommand should not reference any view, or anything in the xaml (i.e. any visual objects on the screen). So how does the Execute() method work? Usually by raising events. Alternatively, if you choose to actually implement Execute() with real logic, then Execute would need to access some objects, which are often passed in as argument. Still stateless.

Q: What state does an instance of this Class represent? (This generic question often gives insight into any Class.)
A: the enabled/disabled Boolean status of a “logical” operation. The operation can come from multiple user input controls, such as keyboard shortcut (KeyGesture), a button, a menu item, a context menu item. When the status is disabled, all these controls become dimmed.

The ICommand interface defines only a few members. The bool CanExecute() method feels like a read-only Boolean property. I feel this Boolean state is about the ONLY state there is in a command Instance.

By the way, a command Source object is also a stateful instance. See content of ICommandSource.

onMsg() invocation in any MOM or mkt-data client-runtime

Q: what happens between socket receiving the msg and the invocation of onMsg()?

Q: is the listener thread blocked in read() or wait()
%%A: wait() I guess.

Q: if wait(), then which threads notify it?
%%A: perhaps a single socket reader thread is responsible to notify multiple listener threads that are interested in the same data. For max performance, listener threads might run on separate processors. They may be waiting in different monitors. I think the socket thread may either notifyAll or sequentially notify each listener thread.