cancel^thread.stop() ^interrupt^unixSignal

Cancel, interrupt and thread.stop() are three less-quizzed topics that show up occasionally in java/c++ interviews. They are fundamental features of the concurrency API. You can consider this knowledge as zbs.

As described in ##thread cancellation techniques: java #pthread,c#, thread cancellation is natively supported in C# and pthreads, whereas java supports it only indirectly.

— cancel and java thread.stop() are semantically identical, but java thread.stop() is deprecated as unsafe.

PTHREAD_CANCEL_ASYNCHRONOUS is usable in very restricted contexts. I think this is similar to thread.stop().

— cancel and interrupt both define stop points. In both cases, the target thread can choose to ignore the cancellation/interrupt request, or check for it at the stop points.

Main difference between cancel and interrupt? Perhaps just the wording. In pthreads there’s only cancel, no interrupt; in java there’s only interrupt, no cancel.

Note an interrupted java thread probably can’t resume.

Not sure if Unix signal handlers also support stop points.

Java GC on-request is also cooperative. You can’t force the GC to start right away.

Across the board, the only immediate (non-cooperative) mechanism is power loss. Even a keyboard “kill” is subject to software programmed behavior, typically the OS scheduler. 

VWAP=a bmark^executionAlgo

In the context of broker algos (i.e. an execution algo offered by a broker), vwap is

  • A benchmark for a bulk order
  • An execution algo aimed at the benchmark. The optimization goal is to minimize slippage against this benchmark. See other blogposts about slippage.

The vwap benchmark is simple, but the vwap algo implementation is non-trivial, often a trade secret.

SDI: 3 ways to expire cached items

server-push update ^ TTL ^ conditional-GET # write-through is not cache expiration

Few online articles list these solutions explicitly. Some of these are simple concepts but fundamental to DB tuning and app tuning. One source compares write-through ^ write-behind ^ refresh-ahead. I think refresh-ahead is similar to TTL.

B) cache-invalidation — some “events” would trigger an invalidation. Without invalidation, a cache item would live forever with an infinite TTL, like the list of China provinces.

After cache proxies get the invalidation message in a small payload (bandwidth-friendly), the proxies discard the outdated item, and can decide when to request an update. The request may be skipped completely if the item is no longer needed.

B2) cache-update by server push — IFF bandwidth is available, server can send not only a tiny invalidation message, but also the new cache content.

IFF combined with TTL, or with reliability added, then multicast can be used to deliver cache updates, as explained in my other blogposts.

T) TTL — more common. Each “cache item” embeds a time-to-live data field a.k.a expiry timestamp. Http cookie is the prime example.

In Coherence, it’s possible for the cache proxy to pre-emptively request an update on an expired item. This would reduce latency but requires a multi-threaded cache proxy.

G) conditional-GET in HTTP is a proven industrial strength solution described in my 2005 book [[computer networking]]. The cache proxy always sends a GET to the database but with an If-Modified-Since header. This reduces unnecessary database load and network load.

W) write-behind (asynchronous) or write-through — in some contexts, the cache proxy is not only handling Reads but also Writes. So the Read requests will read or add to cache, and Write requests will update both cache proxy and the master data store. Drawback — In distributed topology, updates from other sources are not visible to “me” the cache proxy, so I still rely on one of the other 3 means.

TTL ^ eager server-push ^ conditional-GET, compared:

  • frequent query, infrequent update — TTL: efficient; server-push: efficient; conditional-GET: frequent but tiny requests between DB and cache proxy
  • latency important — TTL: OK; server-push: lowest latency; conditional-GET: slower lazy fetch, though efficient
  • infrequent query — TTL: good; server-push: wastes DB/proxy/NW resources as the “push” is unnecessary; conditional-GET: efficient on DB/proxy/NW
  • frequent update — TTL: unsuitable; server-push: high load on DB/proxy/NW; conditional-GET: efficient conflation
  • frequent update+query — TTL: unsuitable; server-push: can be wasteful; conditional-GET: perhaps most efficient


single-writer lockfree data structure@@

Most of the lockfree data structures I have seen have 2 writer threads or more.

Q: what if in your design, there can be at most one writer thread but some reader threads. Any need for lockfree?
A: I don’t think so. I think both scenarios below are very common.

  • Scenario: If notification required, then mutex + condVar
  • Scenario: In many scenarios, notification is not needed — Readers simply drop by and read the latest record. In java, a volatile field suffices.


key^payload: realistic treeNode #hashset

I think only SEARCH trees have keys. “Search” means key search. In a search tree (and hashset), each node carries a (usually unique) key + 0 or more data fields as payload.

Insight — Conceptually the key is not part of the payload. Payload implies “additional information”. In contrast, key is part of the search tree (and hashset) infrastructure, similar to auto-increment record IDs in database. In many systems, the key also has natural meaning, unlike auto-incremented IDs.

Insight — If a tree has no payload field, then useful info can only exist in the key. This is fairly common.

For a treemap or hashmap, the key/value pair can be seen as “key+payload”. My implementation saves a key/value pair in each bucket.

A useful graph (including a tree or linked list, but not a hashset) can have payloads but no search keys. The graph nodes can hold pointers to payload objects on heap [1]. Or the graph nodes can hold Boolean values. Graph construction can be quite flexible and arbitrary. So how do you select a graph node? Just follow some iteration/navigation rules.

Aha — similar situation: a linked list has a well-defined iterator but no search key. Ditto vector.

[1] With pre-allocation, a static array substitutes for the heap.

Insight — BST, priorityQ, and hash tables need search key in each item.

I think non-search-TREES are rarely useful. They mostly show up in contrived interview questions. You can run BFT or pre|post-order DFT, but not BFS/DFS, since there’s nothing to “search”.

Insight — You may say search by payload, but some trees have Boolean payloads.

std::move(): robbed object still usable!

Conventional wisdom says that after std::move(obj2), the object is robbed and invalid for any operation…. Well, not quite! The standard specifies exactly which operations are invalid. To my surprise, a few common operations are still valid on the robbed object, such as clear() and operator=().

The way I read it — if (but not iFF) an operation wipes out the object content regardless of current content, then this operation is valid on a robbed/hollowed object like our obj2.

Crucially, any robbed object should never hold a pointer to any “resource”, since that resource is now used by the robber object. Most known movable data types hold such pointers. The classic implementation (and the only implementation I know) is by pointer reseat.

multicast routing: efficient data duplication

The generic  routing algorithm in multicast has optimization features that offer efficient data duplication.

— optimized message duplication at the routing layer (Layer 3?). One source explains that

Routers duplicate data packets and forward multiple copies wherever the path to recipients diverges. Group membership information is used to calculate optimal branch points i.e. the best routers at which to duplicate the packets to optimize the use of the network.

The multicast distribution tree of receiving hosts holds the route to every recipient that has joined the multicast group, and is optimized so that

  • multicast traffic does not reach networks that do not have any such recipients (unless the network is a transit network on the way to other recipients)
  • duplicate copies of packets are kept to a minimum.

5 constructs: c++implicit singletons

#1 most implicit singleton in c++ is the ubiquitous “file-scope variable”. Extremely common in my projects.

  • — The constructs below are less implicit as they all use some explicit keyword to highlight the programmer’s intent
  • keyword “extern” — file-scope variable with extern
    • I seldom need it and don’t feel the need to remember the details.. see other blogposts
  • keyword “static” — file-scope static variables
  • keyword “static” within function body — local static variables — have the nice feature of predictable timing of initialization
  • keyword “static” within a class declaration —  static field

~~~~~~  The above are the 5 implicit singleton constructs ~~~~~~

Aha — it’s useful to recognize that when a data type is instantiated many many times i.e. non-singleton usage, it is usually part of a collection, or a local (stack) variable.

Sometimes we have the ambiguous situation where we use one of the constructs above, but we instantiate multiple instances of the class. It’s best to document the purpose like “instance1 for …; instance2 for …”

old-timers trapped ] pressure-cooker #Davis

See also G3 survival capabilities #health;burn rate

update: old timers are safe in some teams like MLP, but not in MS.

Q: Why are so many old timers unable to get out of a shitty job in a pressure-cooker company?

  • A small number of old timers change jobs, but some of them regret it as the new job turns out worse. Other old timers hear those war-stories and prefer the certainty of known pain in the current job. They don’t feel confident they can find a better job.
  • bench time — Many old timers can’t afford bench time due to low reserve and high burn rate.
    • Old timers tend to be very unused to unstable income.
    • Damien (Macq) relied on his bonus payout..
    • Some had no savings at all.
  • Most old timers won’t even consider a 6M contract.
  • Most old timers will never consider a voluntary pay cut to get a more comfortable job
  • golden handshake — Some old timers (UBS?) can’t let go of the promised compensation package. They would not resign and give up on a 100k promised windfall
  • some old timers worry about job loss so much that they work their ass off to keep the job, even if the workload is unreasonable/unfair. I think in GS I was like them. I have a friend in a similar situation. I think the boss wants to let him go but may feel merciful, so they keep him but give him shitty tasks and demand fast turnaround at good quality…. unreasonable/unfair workload.
  • Some old timers may have a wife who completely opposes job change due to potential impact on family.

Q: why XR and I can afford to quit a job, stay home, and find a lower job?
A: Supportive wife, burn rate, passive income and savings.

[17] string(+vector) reserve() to avoid re-allocation #resize()wrong

See also std::vector capacity reduction and vector internal array: always on heap; cleaned up via RAII

string/vector use expandable arrays, which have to be allocated on heap, not stack.

For vector, it’s very simple —

  • resize() creates dummy elements iFF upsizing
  • reserve() only allocates spare “raw” capacity without creating elements to fill it. See P279 [[c++stdLib]]

[[Optimized C++]] P70 confirms that std::string can get re-allocated when it grows beyond current capacity, and compares std::string.resize() vs reserve().

  • Backgrounder: Capacity — allocated capacity is often uninitialized memory reserved for this string.
  • Backgrounder: size — length of the string that’s fully initialized and accessible. Some of the characters (nulls) could be invisible.
  • string.resize() can increase the string’s size by filling in space (at the end) with dummy characters (or a user-supplied char).
    • After resize(55) the string size is 55. All 55 characters are part of the string proper.
    • changes string’s size
  • string.reserve() doesn’t affect string size. This is a request to increase or shrink the “capacity” of the object. Capacity (say 77) is always at least the size (say 55). The 77 – 55 = 22 extra slots allocated are uninitialized! They only get populated after you push_back or insert to the string.

vector internal array: always on heap; cleaned up via RAII

Across languages, vector is the #1 most common and useful container. Hashtable might be #2.

I used to feel that a vector (and hashtable) can grow its “backing array” without heap. There’s global memory and stack memory and perhaps more. Now I think that’s naive.

  • Global area is allocated at compile time. The array would be fixed in size.
  • For an array to be allocated on stack, the runtime must know its size. Once it’s allocated on that stack frame, it can’t grow beyond that stack frame.
  • In fact, no array can grow at all.

The vector grows its backing array by allocating a bigger array and copying the payload. That bigger array can’t be in the global area because this type of on-demand reallocation is not known at compile time.

Anything on heap is accessed by pointer. Vector owns this hidden pointer. It has to delete the array on heap as in RAII. No choice.

Now the painful part — heap allocation is much slower than stack allocation. Static allocation has no runtime cost at all after program starts.

BST: finding next greater node is O(1) #Kyle

One source shows that in a std::map, the time complexity of in-order iterating every node, from lowest to highest, is O(N) in total.

Therefore, each step is O(1) amortized. In other words, finding the next higher node in such a BST is a constant-time operation. One source confirms that std::next() is O(1) when advancing by a single step. More generally, across STL containers, iterator movement by N steps is O(N), except for random-access iterators, which are even faster, at O(1).

API^ABI: java++

See also posts on jvm portability including What’s so special about jvm portability cf python/perl, briefly@@, which introduced SCP/BCP

  • — Breaking API change is a decision by a library supplier/maintainer.
  • clients are required to make source code change just to compile against the library.
  • clients are free to use previous compiler
  • — Breaking ABI change is a decision by a compiler supplier.
  • clients are free to keep source code unchanged.
  • clients are often required to recompile all source code including library source code

API is source-level compatibility; ABI is binary-level compatibility. I feel

  1. jar file’s compile-once-run-anywhere is kind of ABI portability
  2. python, perl… offer API portability at source level

Beneath these simplified rules of thumb, there are many exceptions and subtleties.

— some important scenarios in java

library upgrade —

jdk upgrade —

c++ABI #gr8 eg@zbs

Mostly based on

Imagine a client uses libraries from Vendor AA and Vendor BB, among others. All vendors support the same c++ compiler brand, but new compiler versions keep coming up. In this context, unstable ABI means

  • Recompile-all – client needs libAA and libBB (+application) all compiled using the same compiler version, otherwise the binary files don’t agree on some key details.
  • Linker error – LibAA compiled by version 21 and LibBB compiled under version 21.3 may fail to link
  • Runtime error – if they link, they may not run correctly, if ABI has changed.

Vendor’s solutions:

  1. binary releases — for libAA. Vendor AA needs to keep many old binary versions of libAA, even if compiler version 1.2 has been retired for a long time. Some clients may still need that libAA version. Many c++ libraries are distributed this way on vendor websites.
  2. Source distribution – Vendor AA may choose to distribute libAA in source form. Maintenance issues exist in other forms.

In a better world,

  • ABI compatible — between compiler version 5.1 and 5.2. Kevin of Macq told me this does happen to some extent.
  • interchangeable parts — the main application (or libBB) can be upgraded to newer compiler version, without upgrading everything else. Main application could be upgraded to use version 5.2 and still link against legacy libAA compiled by older compiler.

(overloaded function) name mangling algorithm — is the best-known part of c++ABI. Two incompatible compiler versions would use different algorithms so LibAA and LibBB will not link correctly. However, I don’t know how c++filt demangles those names across compilers.

No ABI between Windows and Linux binaries, but how about gcc vs llvm binaries on Intel linux machine? Possible according to

Java ABI?


STL map[k]=.. Always uses assignment@runtime

The focus is on value2, of type Trade.

— someMap[key1] = value2; //uses default-ctor if needed. Then Unconditionally invokes Trade::operator=() for assignment! See P344 [[Josuttis]]

Without a default-ctor or operator=, the Trade class probably won’t compile with this code.

— someMap.emplace(key1, value2); //uses some ctor, probably the copy-ctor, or the cv-ctor if value2 type is not Trade

void list.reverse() ^ reversed()→iterator #sorted(),items()

These three constructs do three different things, without overlap

  • mylist[::-1] returns a reversed clone
    • myStr[::-1] works too
  • mylist.reverse() is a void in-place mutation, same as mylist.sort()
  • reversed(mylist) returns an iterator, not printable but can be used as argument to some other functions
    • I feel this can be convenient in some contexts.
    • reversed(myStr) works too but you can’t print it directly


  • Unlike items(), dict.iteritems() returns an iterator

Different case:

  • sorted(any_iterable) returns a new vector (i.e. “list”). Returning an iterator would be technically stupid.

function returning (unnamed)rvalue object

The rules are hard to internalize but many interviewers like to zero in on this topic. [[effModernC++]] explains

  • case: if return type is nonref, and in the function the thingy-to-be-returned is a …. rvr parameter [1], then by default, it goes through copy-ctor on return.
    • You should apply explicit return move(thingy)
    • [1] See P172, extremely rare except in interviews
  • case: if return type is nonref and in the function the thingy-to-be-returned is a …. stack (non-static) object behind a local variable (NOT nameless object) .. a very common scenario, then RVO optimization usually happens.
    • Such a function call is always (P175 bottom) seen by the caller as evaluating to a naturally-occurring nameless rvalue object.
    • the book says move() should never be used in this case
  • case: if return type is rvr? avoid it if possible. I don’t see any use case.
  • case: if return type is lvr (not rvr) and the returned thingy is a local object, then you get a dangling reference — compilers typically warn, as explained in the blogpost above

sharding:=horizontal partition #too many rows

Assuming an 88-column, 6-million-row table, “horizontal” means pushing a knife horizontally across all 88 columns, splitting the 6 million rows into, say, two shards of 3 million each.

Sharding can work on noSQL too.

GS PWM positions table had 9 horizontal partitions for 9 regions.

“Partitioning” is a more generic term and can mean 1) horizontal (sharding) or 2) vertical cutting like normalization.

[19] assignment^rebind in python^c++

For a non-primitive, java assignment is always rebinding. Java behavior is well-understood and simple, compared to python.

Compared to python, c++ assignment is actually well-documented .. comparable to a mutator method.

Afaik, python assignment is always rebinding, even for an integer. Integer objects are immutable, reference counted.
In python, if you want two functions to share a single mutable integer variable, you can declare a global myInt.
It would be in the global idic/namespace. q[=] on myInt has special meaning similar to

idic[‘myInt’] =..

Alternatively, you can wrap the int in a singular list and call list mutator methods, without q[=].

See my experiment in github py/88lang and my blogpost on immutable arg-passing

meaningful mv-ctor requires heap

I think any data type with a meaningful mv operation (including move-assignment) must use heap storage, accessed via a heap ptr.

SSO (small-string-optimization) bypasses heap, so the mv-ctor isn’t meaningful. See Item 29 of [[effModernC++]]

Built-in primitives like bool, int, float do support rvr, but have no efficient move-operation. In fact, they are not class types, so there’s no mv-ctor or mv-assignment. The compiler probably treats their rvr variables as dummies.

move()doesn’t move..who does@@

(Note in this blogpost, when I say mv-ctor i mean move-ctor and move-assignment.)

As explained in other posts in this blog, move() is a compile-time cast, not performing a physical “steal” at runtime, so

Q: what c++11 construct performs a physical steal?
%%A: mv-ctor

  • move() ⇏ steal — std::move() may not trigger a physical steal — If you call myContainer.insert(std::move(myString)) but your custom String class has no mv-ctor, then copy-ctor is picked by compiler .. see P21 [[Josuttis]].
  • steal ⇏ move() — a physical steal may not require std::move() — if you have a naturally occurring rvalue object.
  • steal => mv-ctor

average_case ^ amortized

>> bigO (and big-Theta) is about input SIZE, therefore, usable on average or worst case data quality

There’s no special notation like bigX specifically for average case.

>> Worst vs average case … is about data quality.

With hash table, we (novices) are permitted to say “quality of hash function” but this is technically incorrect.  My book [[algorithms in a nutshell]] P93 states that IFF given a uniform hash function, then even in worst case bucket sort runs in linear time O(N).

>> Amortized … is about one expensive step in a large number (≅ input size) steps like inserts. Amortized analysis makes no assumption about quality of data, and therefore is a worst-case analysis.

Note Sorting has no amortized analysis AFAIK.

eg: quicksort+quickSelect can degrade to O(N²). Amortization doesn’t apply.

— worst-case amortized analysis — is a standard analysis
Eg: vector expansion. Worst-case doesn’t bother us.

— statistical-average amortized analysis — is a standard analysis
eg: Hashtable insert is O(1) based on both statistical average AND amortized analysis :

  • Some inserts will hit long chain, while most insertions hit short chain, so O(1) on average, statistically
  • Amortization is relevant iFF an insert triggers a hash table expansion.

What if the input is so bad that all hash codes are identical? Then amortization won’t help. For hash tables, people don’t worry about worst-case data, because a decent, usable hash function is always, always possible.

eg: iterating a RBTree. Data quality doesn’t bother us as RBTree has guaranteed time complexity :). Traversing the entire tree is O(N), therefore the statistical average per-iteration is O(1), while the worst per-iteration is O(log N). One source says “amortized analysis is a worst-case analysis”, where “worst” refers to data quality.

average case per-operation — can be quite pessimistic
eg: vector insert

eg: RBTree iteration (see above)

worst case per-operation — not common, relevant to realtime systems only.

I feel this analysis easily hits the extreme far-outliers, so the per-operation cost is far worse than typical.

POSIX semaphores do grow beyond initial value #CSY

My friend Shanyou pointed out a key difference between a c++ binary semaphore and a simple mutex — namely ownership. A Posix Semaphore (binary or otherwise) has no ownership, because non-owner threads can create permits from thin air, then use them to enter the protected critical section. Analogy — NFL fans creating free tickets to enter the stadium.

  • ! to my surprise, if you initialize a posix semaphore to 1 permit, it can be incremented to 2. So it is not a strict binary semaphore. My experiment used a linux POSIX semaphore; not sure about the System V semaphore. I now think a posix semaphore is always a multi-value counting semaphore, with the current value indicating the permits currently available.

  • ! Initial value is NOT “max permits” but rather “initial permits”

I feel the restriction by semaphore is just advisory, like advisory file locking. If a program is written to subvert the restriction, it’s easy. Therefore, a “binary semaphore” is binary IFF all threads follow protocol. One discussion claims “Mutex can be released only by the thread that had acquired it, while you can signal a semaphore from any non-owner thread (or process)”. This does NOT mean a non-owner thread can release a toilet key owned by another thread/process. It means the non-owner thread can mistakenly create a 2nd key, in breach of exclusive, serialized access to the toilet. Another tutorial on semaphores explicitly says that a thread can release the single toilet key without ever acquiring it.


[[DougLea]] P220 confirms that in some thread libraries such as pthreads, release() can increment a 0/1 binary semaphore value to beyond 1, destroying the mutual exclusion control.

However, java binary semaphore is a mutex because releasing a semaphore before acquiring has no effect (no “spare key”) but doesn’t throw error.

temp object bind`preferences: rvr,lvr,, #SCB

(Note I used “temp object” as a loose shorthand for “rval-object”.)

— based on P545 [[c++primer]]

Usually, a library class would define two overloads of an API such as vector::push_back():

  • push_back(X const &)
  • push_back(X &&)

No const rvr; no mutable lvr. The author gave brief reasons to each.  However, outside the library, I often see functions taking a mutable lvr param, in order to modify the object passed in.

— Based on

  • a const (not mutable [1]) L-value reference … … can bind to a naturally-occurring temp object (or a to-be-robbed object from std::move), or any object according to P545 [[c++primer]]
  • a non-const r-value reference can bind to a naturally-occurring rvalue object, treating the object as a target for robbing
  • a const r-value reference (crvr, rarely encountered) can bind to a naturally-occurring rvalue object but canNOT bind to an lvalue object
  • [1] Basically, temp objects and robbed objects have no “location”, so they can’t be the target of a mutable L-value reference.

Q: so in the presence of all of these overloads, which reference can naturally-occurring temp objects bind to?
A: a const rvr. Such an object prefers to bind to a const rvalue reference rather than a const lvalue reference. There are some obscure use cases for this binding preference.

More important is the fact that const lvr (param type of copy-ctor and assignment operator) can bind to temp object. [[c++primer]] P540 has a section describing that if you pass a temp into Foo ctor you may hit the copy-ctor:

Foo z(std::move(anotherFoo)) // compiles and runs fine even if move-ctor is unavailable. This is THE common scenario before c++11 and after.

Compiler doesn’t bother to synthesize the move-ctor, when copy-ctor is defined!

returning^throwing local object

Tag line – always catch by reference. [[moreEffC++]] has a chapter with this title.

  • If you return a local object by reference, it leads to undefined behavior at runtime, as the object is wiped off the stack. Compilers typically give a warning.
  • If you throw a local object and catch by reference up the call stack, it is actually considered best practice, because compiler always clones the local object and throws the clone.

xtap check`if UDP channel=healthy #CSY

xtap library needs reliable ways to check if a “connectivity” is down

UDP (including multicast) is a connectionless protocol; TCP is connection-oriented. Therefore the xtap “connection” class cannot be used for multicast channels. Multicast Channel is like a TV-channel (NYSE terminology).

–UDP is connectionless, has no session, no virtual circuit, so no “established” state. So how do we know whether the exchange server is dead or just quiet?

After discussing with CSY, I feel UDP unicast or UDP multicast can’t tell me.

heartbeat — I think we must rely on the heartbeat. I actually helped create an inactivity timeout alert precisely because in a multicast channel, we don’t know if exchange is down or just quiet.

–TCP should be easier.

Many online resources such as

According to CSY, as a receiver, we don’t need to send a probe message. if exchange has closed the TCP socket, the four-way handshake would have informed my socket that the connection is closed. So my select() would say my TCP socket is ready for reading. When I read it I would get 0 bytes.

I believe there’s a one-to-one mapping between a socket and a “connection” for TCP only


Virtual_Machine^guest_OS

One source explains that

Guest OS runs in a virtual machine or “vm”. A “vm”

  • usually refers to a container process if it’s “live”


  • More often, a vm means a vm-config, i.e. a collection of parameters defining a physical container process to-be-started. It’s important to realize (taking windows host OS as example) that a vm is strictly an application with a window, like a browser or shell. As such, this application has its own config data saved on disk.


IV Q: implement op=()using copier #swap+RAII

A 2010 interviewer asked:

Q: do you know any technique to implement the assignment operator using an existing copy ctor?
A: Yes. See P100 [[c++codingStd]] and P347 [[c++cookbook]]. There are multiple learning points:

  • RAII works with a class owning some heap resource.
  • RAII works with a local stack object owning a heapy thingy
  • RAII provides a much-needed exception safety. For the exception guarantee to hold, std::swap and operator delete should never throw, as echoed in my books.
  • I used to think only half the swap (i.e. transfer-in) is needed. Now I know the transfer-out on the old resource is more important. It guarantees the old resource is released even-if hitting exceptions, thanks to RAII.
  • The std::swap() trick needed here is powerful because there’s a heap pointer field. Without this field, I don’t think std::swap will be relevant.
  • self-assignment check — not required as it is rare and tolerable
  • Efficiency — considered highly efficient. The same swap-based op= is used extensively in the standard library.
  • idiomatic — this implementation of operator= for a resource-owning class is considered idiomatic, due to simplicity, safety and efficiency. One reference shows a similar technique; not sure if it’s best practice. Below is my own:

C & operator=(C const & rhs){
  C localCopy(rhs); //This step is not needed for move-assignment
  std::swap(localCopy.heapResource, this->heapResource);
  return *this;
} // at end of this function, localCopy is destructed, and the original this->heapResource is deleted

C & operator=(C && rhs){ //move-assignment
  std::swap(rhs.heapResource, this->heapResource);
  return *this;
} // rhs now holds the old resource, deleted whenever rhs is destructed

enable_if{bool,T=void} #sfinae,static_assert

  • enable_if is a TMP technique that hides a function overload (or template specialization) — convenient way to leverage SFINAE to conditionally remove functions from overload resolution based on type traits and to provide separate function overloads for different type traits. [3]
    • I think static_assert doesn’t do the job.
  • typically used with std::is_* compile-time type checks provided in type_traits.h
  • enable_if_t evaluates to either a valid type or nothing. You can use enable_if_t as a function return type. See [4]. This is the simplest usage of enable_if to make the target function eligible for overload resolution. You can also use enable_if_t as a function parameter type, but that’s too complicated for me. [3]
  • There are hundreds of references to it in the C++11 standard template library [1]
  • type traits — often combined with enable_if [1]
  • sfinae — is closely related to enable_if [1]
  • by default (common usage), the enable_if_t evaluates to void if “enabled”. I have a github experiment specifically on enable_if_t. If you use enable_if_t as return type you had better put a q(*) after it!
    • Deepak’s demo in [4] uses bool as enable_if_t
  • static_assert — is not necessary for enable_if but can be used for compile-time type validation. Note static_assert is unrelated to sfinae. One source explains that
    • sfinae checks declarations of overloads. sfinae = Substitution failure is not a (compiler) error. An error would break the compilation.
    • static assert is always in definitions, not declarations. static asserts generate compiler errors.
    • Note the difference between failure vs error
    • I think template instantiation (including overload resolution) happens first and static_assert happens later. If static_assert fails, it’s too late. SFINAE game is over. Compilation has failed irrecoverably.
    • Aha moment on SFINAE !
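A minimal sketch of the return-type usage described above (my own example, not from [1]/[4]; here the second enable_if_t argument is char rather than the default void, so the function has a usable return value):

```cpp
#include <type_traits>

// Two overloads of kind(); for any given T, SFINAE removes exactly one of them
// from overload resolution, because the failing enable_if_t has no ::type.
template <typename T>
std::enable_if_t<std::is_integral<T>::value, char> kind(T) { return 'i'; }

template <typename T>
std::enable_if_t<std::is_floating_point<T>::value, char> kind(T) { return 'f'; }
```

Calling kind(42) silently discards the floating-point overload (a substitution failure, not a compiler error) and picks the integral one.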





enable_shared_from_this #CRTP

  • std::enable_shared_from_this is a base class template
  • shared_from_this() is the member function provided
  • CRTP is needed when you derive from this base class.
  • underlying problem (target of this solution) — buggy design where two separate “clubs” centered around the same raw ptr. Each club thinks it owns the raw ptr and would destroy it independently.
  • usage example —
    • your host class is already managed via a shared_ptr
    • your instance method needs to create new shared_ptr objects from “this”
    • Note if you only need to access “this” you won’t need the complexity here.

Pimco asked it in 2017!
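The usage above can be sketched as follows (Trade is my hypothetical host class):

```cpp
#include <memory>

// CRTP: Trade derives from enable_shared_from_this<Trade>.
struct Trade : std::enable_shared_from_this<Trade> {
    std::shared_ptr<Trade> self() {
        // Joins the EXISTING control block ("club") instead of creating a
        // second, independent owner of the same raw ptr (the buggy design).
        return shared_from_this();
    }
};
```

Usage: with `auto t = std::make_shared<Trade>();` a call to `t->self()` yields a second shared_ptr in the same club (use_count becomes 2), whereas `std::shared_ptr<Trade> bad(t.get());` would create the second “club” and lead to double delete.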

unique_ptr implicit copy : only for rvr #auto_ptr

P 470-471 [[c++primer]] made it clear that

  • copying a regular (named) unique_ptr variable is a compilation error. Different from auto_ptr here.
  • However returning an unnamed temp unique_ptr (rvalue object) from a function is a standard idiom.
    • Factory returning a unique_ptr by value is the most standard idiom.
    • This is actually the scenario in my SCB-FM interview by the team architect

Underlying reason is what I have known for a long time — move-only. What I didn’t know (well enough to impress the interviewer) — the implication for implicit copy: the “copy” of a returned temporary is actually an implicit move, and this is the most common usage of unique_ptr.
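The factory idiom above, in a minimal sketch (Trade and makeTrade are my hypothetical names):

```cpp
#include <memory>
#include <utility>

struct Trade { int qty; explicit Trade(int q) : qty(q) {} };

// Standard idiom: factory returns unique_ptr by value.
// The returned value is an rvalue, so it is moved, not copied -- no compile error.
std::unique_ptr<Trade> makeTrade(int q) { return std::make_unique<Trade>(q); }
```

Usage: `auto t = makeTrade(100);` compiles fine (implicit move from the rvalue); `auto t2 = t;` is the compilation error (copying a named unique_ptr); `auto t3 = std::move(t);` is the explicit escape hatch.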

risk-neutral means..illustrated by CIP

Background — all of my valuation procedures are subjective, like valuing a property, an oil field, a commodity …

Risk-Neutral has always been confusing, vague and abstract to me. CIP ^ UIP, based on Mark Hendricks’ notes, offers an illustration —

  • RN means .. regardless of individuals’ risk profiles … therefore objective
  • RN means .. partially [1] backed by arbitrage arguments, but often theoretical
    • [1] partially can mean 30% or 80%
    • If it’s well supported by arbitrage argument, then replication becomes theoretical foundation of RN pricing



I think there are many application-layer multicast protocols. One source says:

Unlike native multicast (i.e. IP-multicast) where data packets are replicated at routers inside the network, in application-layer multicast data packets are replicated at end hosts.

When we hear “multicast” we should not assume it means IP-multicast using UDP ! “Multicast” is not a specific protocol like UDP, but a generic, loosely defined term.

— RGDD says “many current systems support RGDD at application layer using unicast TCP”

extern^static on var^functions

[1] confirmed my understanding that static local function is designed to allow other files to define different functions under this name.

                      | extern var                          | file-scope STATIC var    | static func           | extern-C func
purpose 1             | single object shared across files   |                          | [1] applies. 99% sure | name mangling
purpose 2             |                                     | private variable         | private function      |
alternative 1         | static field                        | anon namespace           |                       | no alternative
alternative 2         | singleton                           |                          |                       |
advantage 1           |                                     | won’t get used by mistake from other file        |
disadv 1              | “extern” is disabled if you provide an initializer | no risk. very simple              |
put in shared header? | be careful                          | should be fine           |                       | not sure

order slice^fill: jargon

An execution is also known as a fill, often a partial fill.

  • A slice is part of a request; An execution is part of a response

A slice can have many fills, but a fill is always for a single request.

  • An execution always comes from some exchange, back to buy-side clients, whereas
  • A request (including a slice) always comes from upstream (like clients) to downstream (like exchanges)
  • Slicing is controlled by OMS/algo systems (including HFT systems); Executions are controlled by exchanges.
  • Clients can send a cancel for a slice before it’s filled; Executions can be busted only by exchanges.

reference(instead of ptr) to smart ptr instance

I usually pass smart pointers by value (copy-constructor or move-constructor), just like copying a raw ptr.  Therefore the code below looks unnatural:

unique_ptr<Trade> & ref2smartPtr

Well, my “pbclone” intuition was incorrect.  Actually pbref is rather common because

  • As Herb Sutter suggested, when we need to put pointer into containers, we should avoid raw ptr. Unique ptr is the default choice, and the first choice, followed by shared_ptr
  • I often use unique_ptr as a map value. The operator[] return type is a reference to the value type, i.e. a reference to unique_ptr
  • I may need to put unique_ptr into a vector…. ditto for vector operator[]
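The map case above, sketched minimally (Trade, book and fill are my hypothetical names):

```cpp
#include <map>
#include <memory>
#include <string>

struct Trade { int qty = 0; };

std::map<std::string, std::unique_ptr<Trade>> book;

// operator[] returns unique_ptr<Trade>& -- a reference to the smart pointer,
// since the value type is move-only and could not be returned by copy.
void fill(std::string const& id, int qty) {
    std::unique_ptr<Trade>& ref2smartPtr = book[id];  // default-constructs an empty unique_ptr if absent
    if (!ref2smartPtr) ref2smartPtr = std::make_unique<Trade>();
    ref2smartPtr->qty += qty;
}
```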

One article on factory patterns, explained graphically, has nice block diagrams and examples.

  • CC) “Simple factory” is the most widely used one but not a widely known pattern, not even a named pattern. “SimpleFactory” is just an informal name

I mostly used simple factory and not the other factory patterns.

Any time I have a method that creates and returns polymorphic objects from a family (like different Trades), I would name the entire class SomeFactory. I often add some states and additional features but as a class or struct, it is still a simple manufacturer.

  • BB) factory method — a family of manufacturer classes. Each supports a virtual create() method. Each is responsible for instantiating just one specific type of object. NokiaFactory.create() only creates NokiaPhone objects.
  • AA) AbstractFactory — too complex and needed in very special situations only.
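A minimal sketch of the “simple factory” described above — one class whose create() method returns polymorphic objects from a family (the Trade hierarchy and names are my own illustration):

```cpp
#include <memory>
#include <string>

struct Trade { virtual ~Trade() = default; virtual std::string type() const = 0; };
struct BondTrade : Trade { std::string type() const override { return "bond"; } };
struct EqTrade   : Trade { std::string type() const override { return "eq"; } };

// "Simple factory": not a named GoF pattern, just one method creating
// and returning polymorphic objects from a family.
struct TradeFactory {
    std::unique_ptr<Trade> create(std::string const& t) {
        if (t == "bond") return std::make_unique<BondTrade>();
        return std::make_unique<EqTrade>();
    }
};
```

As the text says, the class can accumulate states and extra features and still remain a simple manufacturer.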

swap on eq futures/options: client motive

Q1: why would anyone want to enter a swap contract on an option/futures (such a complex structure) rather than trading the option/futures directly?

Q2: why would anyone want to use swap on an offshore stock rather than trading it directly?

More fundamentally,

Q3: why would anyone want to use swap on domestic stock?

A1: I believe one important motivation is restrictions/regulation. A trading shop needs a lot of approvals, licenses, capital, disclosures … to trade on a given futures/options exchange. I guess there might be disclosure and statutory reporting requirements. If the shop can’t or doesn’t want to bother with the regulations, it can achieve the same exposure via a swap contract.

This is esp. relevant in cross-border trading. Many regulators restrict access by offshore traders, as a way to protect the local market and local investors.

A3: One possible reason is transparency, disclosure and reporting. I guess many shops don’t want to disclose their positions in, say, AAPL. The swap contract can help them conceal their position.

in c++11 overload resolution, TYPE means..

In the presence of rvr, the compiler must choose overloads more carefully, based on the type of the argument and the type of the parameter.

  • I think type-of-parameter is the declared type, as programmer wrote it.

No change in c++11. No surprise.

  • “argument” means the object. Type-of-argument means…?
  1. Traditionally it means int vs float
  2. Traditionally it means const vs non-const
  3. (Traditionally, it means Acct vs TradingAcct but this is runtime type info, not available at compile time.)
  4. In c++11, it also means rval-obj vs regular object. You can convert a regular object to a rval-object via … move(). But what if the argument is a rval-variable, declared as “int&& param”?

This is one of the trickiest confusions about rvr. The compiler “reasons” differently than a human reasons. [[effModernC++]] tried to explain it on P2 and P162 but I don’t get it.

Yes the ultimate runtime int object is an rval-obj (either naturally-occurring or moved), but when the compiler sees the argument is an “int&& param”, it treats this argument as an lvalue, since it has a name and a Location!

My blogpost calling std::move() inside mv-ctor  includes my code experiments.
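The “named rvr is an lvalue” point can be demonstrated with a small overload probe (my own sketch; function names are hypothetical):

```cpp
#include <string>
#include <utility>

char overloadPicked(std::string&)  { return 'L'; }
char overloadPicked(std::string&&) { return 'R'; }

char probe(std::string&& param) {
    // param is DECLARED "string&&", but as an expression it has a name and a
    // Location, so the compiler treats it as an lvalue here:
    return overloadPicked(param);  // picks 'L'; std::move(param) would pick 'R'
}
```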

FRA^ED-fut: actual loan-rate fixed when@@

Suppose I’m IBM, and need to borrow in 3 months’ time. As explained in a typical FRA scenario, inspired by CFA Reading 71, I could buy a FRA and agree to pay a pre-agreed rate of 550 bps. What’s the actual loan rate? As explained in that post,

  • If I borrow on open market, then actual loan rate is the open-market rate on 1 Apr
  • If I borrow from the FRA dealer GS, then loan rate is the pre-agreed 550 bps
  • Either way, I’m indifferent, since in the open-market case, what ever rate I pay is offset by the p/l of the FRA

Instead of the FRA, I could go short the eurodollar futures. This contract is always cash-settled, so the actual loan rate is probably the open-market rate, but whatever market rate I pay is offset by the p/l of the futures contract.

Q: bond price change when yield goes to zero

Can bond yield become negative? Yes — in 2015-2017 many bonds traded at negative yield. A realistic example is a vanilla bond trading at $104 with negative yield — you pay $104 now and will get $100 in total repayments, so you are guaranteed to lose money.

Mathematically, when yield approaches negative 100 percent, the discount factor 1/(1+y) blows up and price goes to infinity.

When yield approaches zero, bond price would go to the arithmetic sum of all coupons + repayment.
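Both limits can be checked with a one-function pricer (my own sketch, annual coupons assumed):

```cpp
#include <cmath>

// Price of a vanilla annual-coupon bond as a function of yield y (e.g. y = -0.04 for -4%).
double bondPrice(double coupon, double face, int years, double y) {
    double p = 0;
    for (int t = 1; t <= years; ++t)
        p += coupon / std::pow(1 + y, t);      // discounted coupons
    return p + face / std::pow(1 + y, years);  // discounted repayment
}
```

At y = 0 every discount factor is 1, so price is exactly the arithmetic sum of coupons + repayment; as y falls toward -100%, pow(1+y, t) → 0 and the price diverges. A zero-coupon 1Y bond at y = -4% prices at 100/0.96 ≈ 104, matching the $104 example.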

Cloneable, Object.clone(), Pen.Clone() #java

A few learning points.

The Object.clone() implementation is not important (at least to interviews), because I should always override it in my class like Pen, but here are some observations about this Object.clone():

  • shallow copy, not deep copy…. therefore not very useful.
  • this protected method is mostly meant to be invoked by a subclass:
    • If your variable points to some Pen object that’s not Cloneable, and you call clone() from either the same package or from a subclass, then you hit CloneNotSupportedException.
    • if your Pen class implements Cloneable , then it should [2] override the clone() method

Cloneable is a special marker interface. It could trigger the CloneNotSupportedException, but if you override clone() then this exception may not hit. It’s an obscure detail.

  • I think you can override clone() without implementing Cloneable, but this is tricky and non-standard.
  • [2] You could also implement Cloneable without overriding clone() .. The only way the default Object.clone() gets picked by the compiler is when a Cat class accidentally implements Cloneable but doesn’t override clone(), and you call it from either the same package or from a subclass

java: protected^package-private i.e. default

One java8 tutorial shows two nice tables:

  • There’s no more “private protected”
  • default access level is better known as “package-private” — strictly more restrictive than protected. (Protected is more like public). The 2nd table shows that
    • a “package-private” member of Alpha is accessible by Beta (same package) only, whereas
    • a “protected” member of Alpha is accessible by Beta and Alphasub
    • Therefore, “neighbors are more trusted than children”

I find it hard to remember, so here are some “sound bites”

  1. “protected” keyword only increases visibility never decreases it.
  2. So a protected field is more accessible than package-private (default) field
    • As an example, without “protected” label on my field1, my subclasses outside the package cannot see field1.
  3. same-package neighbors are local and trusted more than (overseas) children outside the package, possibly scattered over external jars

For a “protected” field1, a non-subclass in the same package can see it just as it can see a default-accessible field2

Not mentioned in the article, but when we say “class Beta can access a member x of Alpha”, it means that the compiler allows you to write, inside Beta methods, code that mentions x. It could be myAlpha.x or it could be Alpha.x for a static member.

PendingNew^New: OrdStatus[39]

“8” has special meaning

  • in tag 35, it means execution report
  • in tag 39 and tag 150, it means status=rejected.

PendingNew and New are two possible statuses for a given order.

PendingNew (39=A, 150=A) is relatively simple. The downstream system sends a PendingNew to upstream as soon as it receives the “envelope”, before opening, validating or saving it. I would say even a bad order can go into PendingNew.

New (39=0, 150=0) is significant. It’s an official acknowledgement (or confirmation) of acceptance. It’s synonymous with “Accept” and “Ack”. I think it means fully validated and saved for execution. For an intermediate system, usually it waits for an Ack i.e. 39=0 from exchange before sending an Ack to the upstream. Tag 39 is usually not modified.

I see both A and 0 frequently in my systems, in execution report messages.

For a market Buy order, I think it will be followed by (partial) fills, but not guaranteed, because there may be no offers, or execution could fail for any reason. For a dealer system, execution can fail due to inventory shortage. I implemented such an execution engine in 95G.

I’m no expert on order statuses.

perl regex top3 tips #modifier /m /s

The part of the (haystack) string that actually matched the (needle) pattern is automatically stored in q[  $& ]

Whatever came before the matched section is in $` and whatever was after it is in $'. Another way to say that is that $` holds whatever the regular expression engine had to skip over before it found the match, and $' has the remainder of the string that the pattern never got to. If you glue these three strings together in order, you’ll always get back the original string.

— /m /s clarification:

  1. By default, q($) + q(^) won’t match newline. /m targets q($) and q(^)
  2. By default, the dot q(.) won’t match newline. /s targets the dot.
  3. The /m and /s both help get newlines matched, in different contexts.

Official doc says:

  1. /m  Treat the string being matched against as multiple lines. That is, change "^" and "$" from matching the start of the string’s first line and the end of its last line to matching embedded start and end of each line within the string.
  2. /s  Treat the string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.
  3. Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.

at run time, function must be predefined ] python #c++

You can freely shuffle your function bodies (i.e. definitions) as long as … at run time, each function call resolves to a function already seen by the interpreter, not a function to be defined later in the script.

Easiest way is to call main() after all function bodies. By the time main() runs and uses any of those functions, the interpreter runtime has already parsed the function body.

If you move func8() body to after the call to main() (I didn’t say “the body” of main()), then python will complain that func8 is undefined. It’s as bad as:

print var3; var3=3

(Note there’s no declaration vs definition for python functions.)

C is more strict — at Compile time (not run time), when the compiler encounters a name like func5(), it insists that a func5() declaration (or definition) has already been seen.

To satisfy the compiler, people put function declarations in include files. Function definition obeys ODR.
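The C/C++ rule above in a minimal sketch (func5 and caller are my hypothetical names):

```cpp
// The compiler must see at least a declaration before the call site.
int func5(int);            // declaration (normally lives in a header file)

int caller(int x) {
    return func5(x) + 1;   // OK: declaration already seen
}

int func5(int x) {         // the one definition (ODR)
    return x * 2;
}
```

Remove the declaration on the first line and move the definition below caller(), and the compiler rejects the call — unlike python, which only cares what the interpreter has seen by the time the call executes.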

template template param

Andrei Alexandrescu briefly introduced this technique .. I think he said it is one of the most powerful TMP techniques.

Vague recollection — one of the template parameters is itself a template.

One common feature of templ-templ params — the param P, declared as “template<typename> class P”, is usually used as “P<S>”, where S is a sister parameter to P! One tutorial shows an operator<<() example. I have it in my github.

The variadic version is broken because the template tries to accept any class template having 1 or more parameters in the definition. This includes std::basic_string, which already has an operator<<() that conflicts with the variadic version.

This also helps me understand that the 2-param version (using templ-templ) is really a template that accepts any class-template having two type-params by definition. Note the two type-params can use default values.

— See my github demo

First and possibly biggest challenge is — understand the limitation of traditional solutions. I would say this is 90% of the struggle for a novice.
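A minimal sketch of a templ-templ param (my own, simpler than the operator<<() demo): P is itself a template and is used as P<S,A>, where S and A are sister parameters. This is the “2-param version” idea — it matches any class template taking two type-params, such as std::vector or std::list (whose second, allocator param has a default value).

```cpp
#include <list>
#include <vector>

// P is a template template parameter taking two type parameters.
template <template <typename, typename> class P, typename S, typename A>
S firstElement(P<S, A> const& container) {
    return *container.begin();   // S, A deduced from the argument, e.g. vector<int, allocator<int>>
}
```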


[19] j^python arg passing #(im)mutable

Python is ALWAYS pbref.

— list/arrayList, dict/HM etc: same behavior between java and py —
pbref. Edits by the function hit the original object.

— string: probably identical behavior between java and py —
immutable in both java and py.

Passed by reference in both.

If function “modifies” the string content, it gets a new string object (like copy-on-write), unrelated to the original string object in the “upstairs” stackframe.

— ints, floats etc: functionally similar between java and py —
If function modifies the integer content, the effect is “similar”. In python a new int object is created (like string). In java this int is a mutable clone of the original, created on the stackframe.

Java int is primitive, without a usable address, and is always pbclone, never pbref.

Python numbers are like python strings – immutable objects passed by reference. Probably copy-on-write.

##eg@syntax/idiom learning via coding drill

Syntax (+ some best practice in syntax) is essential in 70% of my real-coding interviews.

There are many memorable exceptions like char-array/int-array, tree, linked list, regex ….. , which require ECT, BP and pure algo, but trivial syntax. Some of them are relatively “hard” on pure algo and my relative strength shines through among Wall St candidates.

Below are some examples of syntax learning via coding drill

  • pointer manipulation in linked graph
  • matrix manipulation
  • pitfall — iterator invalidation
  • pitfall — output iterator hitting end()
  • c++ search within string
  • eg: multi-threading problems —- I had at least 8 experiences in java and c++ (UBS-Yang, Barcap, Gelber, SCB, HSBC, Wells, Bbg thrPool, Pimco + some minor ones in BGC)
  • eg: template, move semantic and placement-new … asked in white-board coding, nothing to do with ECT!
  • eg: lower_bound() means touchUpperBound is all about syntax not algorithm. Revealed via coding interview!
  • See also categories pyECT_nlg, t_c++syntaxTip, t_c++tecniq, T3_stdStr_tagT2_cStr/arr/vector
  • see tags t_c++ECT_nlg
  • c++ syntax requires 3-6 times more learning than python

Linux(!!Unix)kernel thread can! run user program

–“Linux kernel thread cannot run user programs”, as explained in [[UnderstandingLinuxKernel]] P3.

Removing ‘kernel‘… Linux userland threads (based on LWP, explained below) do run user programs.

Removing ‘thread‘… Linux kernel processes or interrupt routines could possibly run under a user process pid.

Removing ‘Linux‘ … Other Unix variants do use kernel threads to run both 1) user programs and 2) kernel routines. This is my understanding from reading about JVM…

  • unix  — multi-threaded user apps =use=> LWP =use=> kernel threads
  • linux — multi-threaded user apps =use=> LWP. LWP is the basic unit of concurrency in user apps. Linux LWP is unrelated to Unix LWP.

y FIX needs seqNo over TCP seqNo

My friend Alan Shi said … Suppose your FIX process crashed or lost power, reloaded (from disk) the last sequence received and reconnected (resetting the tcp seq#). It would then receive a live seq # higher than expected. This could mean some execution reports were missed. If the exchange notices a sequence gap, it could mean some order cancellation was missed. Both scenarios are serious and require a resend request. CME documentation states:

… a given system, upon detecting a higher than expected message sequence number from its counterparty, requests a range of ordered messages resent from the counterparty.

Major difference from TCP sequence number — FIX specifies no Ack though some exchange do. See Ack in FIX^TCP

Q: how could FIX miss messages given TCP reliability? I guess a tcp session disconnect is the main reason. Some details:

  • two streams of seq numbers, each controlled by exch vs trader
  • how to react to unexpected high/low values received. Note “my” outgoing seq is controlled by me hence never “unexpected”
  • Sequence number reset policy — After a logout, sequence numbers are supposed to reset to 1, but if the connection is terminated ‘non-gracefully’, sequence numbers will continue when the session is restored. In fact a lot of service providers (eg: Trax) never reset sequence numbers during the day. There are also some who reset sequence numbers once per week, regardless of logout.


std::lower_bound() means touchUpperBound # 4 scenarios

  • std::upper_bound() should be named strictUpperBound()
  • std::lower_bound() should be named touchUpperBound() since it can return an element touching the target value

If no perfect hit, then both return the same node — the lowest node above the target

  1. if target is too high, both return end(), which is a fake element
  2. if target is too low, both return begin(), which is a real element
  3. if target is matching one of the nodes, then perfect hit
  4. if target is between 2 nodes, then both return the higher node … this is the most common scenario.
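All four scenarios can be verified with a small probe (my own sketch; bounds() is a hypothetical helper returning the two indices):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Indices of lower_bound ("touchUpperBound") and upper_bound ("strictUpperBound")
// of `target` in the sorted vector v.
std::pair<long, long> bounds(std::vector<int> const& v, int target) {
    auto lo = std::lower_bound(v.begin(), v.end(), target);  // may "touch" the target value
    auto hi = std::upper_bound(v.begin(), v.end(), target);  // strictly above the target value
    return { lo - v.begin(), hi - v.begin() };
}
```

With v = {10, 20, 20, 30}: target 20 is a perfect hit (lower_bound touches index 1, upper_bound returns index 3); target 25 falls between nodes (both return index 3); target 99 is too high (both return end(), index 4); target 1 is too low (both return begin(), index 0).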

jira Resolved^Closed

JIRA has two “fields” (one of them isn’t a field, but for this, it might as well be).

  1. The Status tells you where the issue is in the workflow. It does not tell you anything about the issue’s resolution, even though it might say “resolved”, “closed”, “done”, “fed to penguin”, it’s just an indicator of where it is.
  2. Resolution is either empty or populated (I consider it a boolean). if it is empty, JIRA sees the issue as “open” (and displays “unresolved” on screen), whatever the status is. It considers it “done” if it has any data value at all.

Resolved often means “developer put in the fix and feel confident but other people may not feel confident.”

Closed is a Status. It is the end status of the workflow and means “Reporter or QA confirmed that no more action needed. No need to look at it again, until reopened.”

In one workflow, when I Resolve a jira, status transitions to ReadyForQA.

##types of rvr/rvalueObjects out there #SCB IV

An rvr variable is a door plate on a memory location, wherein the data content is regarded as Disposable. It is either 1) a naturally occurring unnamed temporary object or 2) a named object earmarked (via move()) as no-longer-needed.

Examples of first case:

  • function returning a nonref — Item 25 of [[effModernC++]] and P532 [[c++primer]]. I think this is extremely common
    • function returning a pair<int, float>
    • function returning a vector<int>
    • function returning a string
    • function returning an int
  • string1+string2
  • 33+55

In the 2nd case the same object could also have a regular lvr door plate (or a pointer pointing to it). This lvr variable should NOT be used any more.

Q: That’s a rvr variable… how about the rvr object?
A: no such thing. A rvr is always a variable. There exists a memory location at the door plate, but that object is neither rvr nor lvr.
%%A: I explained to my 2018 SCB interviewer — rvr and lvr (and pointer variables) are thingies known to the compiler. Objects are runtime thingies, including 32-bit pointer objects. However, an unnamed temp object is (due to compiler) soon-to-be-destroyed, so it is accessed via a rvr.
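The two cases above can be shown in a few lines (my own sketch; makeName and grab are hypothetical names):

```cpp
#include <string>
#include <utility>

std::string makeName() { return "tmp"; }   // case 1: returns an unnamed temporary (rvalue)

std::string grab(std::string&& victim) {   // an rvr parameter binds only to Disposables
    return std::move(victim);
}
```

Usage: `grab(makeName())` binds to a naturally occurring temp; `std::string s = "keep"; grab(std::move(s));` earmarks the named object s as disposable — afterwards the lvr door plate s should NOT be used any more. `grab(s)` without move() is a compile error, since an rvr cannot bind to an lvalue.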

UDP socket is identified by two-tuple; TCP socket is by four-tuple

Based on [[computer networking]] P192. see also de-multiplex by-destPortNumber UDP ok but !! enough for TCP

  • Note the term in subject is “socket” not “connection”. UDP is connection-less.

A TCP segment, together with its IP header, carries four fields: source IP:port and destination IP:port. (The port numbers live in the TCP header; the IP addresses live in the IP header.)

A TCP socket has internal data structure for a four-tuple — Remote IP:port and local IP:port.

A regular TCP “Worker socket” has all four items populated, to represent a real “session/connection”, but a Listening socket could have wild cards in all but the local-port field.

fragmentation: IP^TCP #retrans

I consider this a “halo” knowledge pearl because it is part of an essential everyday service. We can easily find an opportunity to inject it into an Interview. Interviews are unlikely to go this deep, but it’s good to over-prepare here.

This comparison ties together many loose ends like Ack, reassembly, retrans, seq resets..

This comparison clarifies what reliability IP offers + what reliability features it lacks:

  • it offers complete datagrams
  • it can lose an entire datagram — TCP (not UDP) will detect it and use retrans
  • it can deliver two datagrams out of order — TCP will detect and fix it

[1] IP fragmentation can cause excessive retransmissions when fragments encounter packet loss and reliable protocols such as TCP must retransmit ALL of the fragments in order to recover from the loss of a SINGLE fragment
[2] TCP seq# never looks like 1,2,3
[3] IP (de)fragmentation #MTU,offset

                        | IP4 fragmentation                      | TCP fragmentation
minimum guarantees      | all-or-nothing; never a partial packet | in-sequence stream without gap
reliability             | unreliable                             | fully reliable
name of a “whole” piece | IP datagram (“packet/msg” are vague)   | a TCP session with thousands of sequence numbers
name for a “part”       | IP fragment                            | TCP segment
sequencing              | each fragment has an offset            | each segment has a seq#
.. continuous?          | yes                                    | discontinuous! [2]
.. seq# reset?          | yes, for each packet                   | loops back to 0 right before overflow
Ack                     | no Ack!                                | positive Ack needed
gap detection           | using offset                           | using seq# [2]
id for the “msg”        | identification number                  | no such thing
end-of-msg              | flag in last fragment                  | no such thing; continuous stream
out-of-sequence?        | likely                                 | likely
.. reassembly           | based on id/offset/flag [3]            | based on seq#
retrans                 | not by IP4 [1][3]                      | commonplace
missing a piece?        | entire IP datagram discarded [3]       | triggers retrans

de-multiplex by-destPort: UDP ok but insufficient for TCP

When people ask me what is the purpose of the port number in networking, I used to say that it helps demultiplex. Now I know that’s true for UDP but TCP uses more than the destination port number.

Background — Two processes X and Y on a single-IP machine  need to maintain two private, independent ssh sessions. The incoming packets need to be directed to the correct process, based on the port numbers of X and Y… or is it?

If X is sshd with a listening socket on port 22, and Y is a forked child process from accept(), then Y’s “worker socket” also has local port 22. That’s why in our linux server, I see many ssh sockets where the local ip:port pairs are indistinguishable.

TCP demultiplex uses not only the local ip:port, but also remote (i.e. source) ip:port. Demultiplex also considers wild cards.

                                       | TCP                          | UDP
socket has local IP:port               | yes                          | yes
socket has remote IP:port              | yes                          | no such thing
2 sockets with same local port 22 …    |                              |
.. living in two processes             | allowed                      | not allowed
.. living in one process               | allowed                      | not allowed
2 msgs with same dest ip:port but      | addressed to 2 sockets;      | addressed to the
different source ports                 | 2 ssh sessions               | same socket

make_shared() cache efficiency, forward()

This low-level topic is apparently important to multiple interviewers. I guess there are similarly low-level topics like lockfree, wait/notify, hashmap, const correctness.. These topics are purely for theoretical QQ interviews. I don’t think app developers ever need to write forward() in their code. Here are a few low-level optimizations. Suppose you follow Herb Sutter’s advice and write a factory accepting a Trade ctor arg and returning a shared_ptr<Trade>:

  • your factory’s parameter should be a universal reference. You should then std::forward() it to make_shared(). See the make_shared() source code in gcc.
  • make_shared() makes a single allocation for the Trade and an adjacent control block, with cache efficiency — any read access through the Trade pointer will likely bring the control block into cache too
  • I remember reading online that one allocation vs two is a huge performance win….
  • if the arg object is a temp object, then the rvr would be forwarded to the Trade ctor. Scott Meyers says the parameter (itself an lvalue) would be cast back to an rvr. The Trade ctor would need to move() it.
  • if the runtime object is carried by an lvr (arg object not a temp object), then the lvr would be forwarded as is to Trade ctor?

Q: What if I omit std::forward()?
AA: the Trade ctor would always receive an lvr. See Scott Meyers P162; my github code has my experiment.

— in summary, advantages of make_shared
* code size is smaller and more i-cache friendly
* one allocation fewer. I think one allocation is thousands of times slower than a simple calc
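The factory described above, sketched minimally (Trade and makeTrade are my hypothetical names):

```cpp
#include <memory>
#include <string>
#include <utility>

struct Trade {
    std::string sym;
    explicit Trade(std::string s) : sym(std::move(s)) {}
};

// Universal-reference parameter, forwarded to make_shared():
// one allocation holds the Trade AND the adjacent control block.
template <typename T>
std::shared_ptr<Trade> makeTrade(T&& arg) {
    return std::make_shared<Trade>(std::forward<T>(arg));
}
```

With an rvalue arg, forward() casts the (lvalue) parameter back to an rvr, so the Trade ctor can move from it; with an lvalue arg, it is forwarded as an lvalue and copied. Omit the forward() and the ctor always receives an lvr.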


4buffers in 1 TCP connection #full-duplex

Many socket programmers may not realize there are four transmission buffers in a single TCP connection or TCP session. These buffers are set aside by the kernel when the socket is created.

Two buffers on socket AA (say, in U.S.) + two buffers on socket BB (say, in Singapore)

Two receive buffers + two send buffers

At any time, there could be data in all four buffers. AA can initiate a send in the middle of receiving data because TCP is a full-duplex protocol.

Whenever one of the two sockets initiates a send, it has a duty to ensure it never overflows the receive buffer on the other end. This is the essence of flow-control. The flow-control relies on the 2 buffers involved (not the other two buffers). P 247 [[computer networking]] has details. Basically, the sender keeps an estimate of the remaining free space in the receive buffer, so it never sends too many bytes; it keeps unsent data in its send buffer.

Is UDP full-duplex? My book doesn’t mention it. I think it is.

session^msg^tag level concepts

Remember every session-level operation is implemented using messages.

  • Gap management? Session level
  • Seq reset? Session level, but seq reset is a msg
  • Retrans? per-Msg
  • Checksum? per-Msg
  • MsgType tag? there is one per-Msg
  • Header? per-Msg
  • order/trade life-cycle? session level
  • OMS state management? None of them …. more like application level than Session level

linux tcp buffer^AWS tuning params

—receive buffer configuration
In general, there are two ways to control how large a TCP socket receive buffer can grow on Linux:

  1. You can set setsockopt(SO_RCVBUF) explicitly as the max receive buffer size on individual TCP/UDP sockets
  2. Or you can leave it to the operating system and allow it to auto-tune the buffer dynamically, using the global tcp_rmem values as a hint.
  3. … either way, the value is capped by the global limits below:

/proc/sys/net/core/rmem_max — is a global hard limit on all sockets (TCP/UDP). I see 256M in my system. Can you set it to 1GB? I’m not sure but it’s probably unaffected by the boolean flag below.

/proc/sys/net/ipv4/tcp_rmem — doesn’t override SO_RCVBUF. The max value on RTS system is again 256M. The receive buffer for each socket is adjusted by kernel dynamically, at runtime.

The linux “tcp” manpage explains the relationship.

Note large TCP receive buffer size is usually required for high latency, high bandwidth, high volume connections. Low latency systems should use smaller TCP buffers.

For high-volume multicast channel, you need large receive buffers to guard against data loss — UDP sender doesn’t obey flow control to prevent receiver overflow.
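Option 1 above (explicit SO_RCVBUF) can be probed with a few lines of socket code — a sketch, where effectiveRcvBuf is my own helper name. Note the kernel typically doubles the requested value for bookkeeping overhead, and silently caps it at net.core.rmem_max (so the returned value may be far below the request):

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Requests `req` bytes of receive buffer on a fresh UDP socket and
// returns the effective size the kernel actually granted.
int effectiveRcvBuf(int req) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &req, sizeof(req));

    int got = 0;
    socklen_t len = sizeof(got);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);  // reports the doubled/capped value
    close(fd);
    return got;
}
```

Raising the request above rmem_max requires changing the global limit (or privileged SO_RCVBUFFORCE).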


/proc/sys/net/ipv4/tcp_window_scaling is a boolean configuration (turned on by default). With scaling on, 1GB is the new limit on the AWS (advertised window size). If turned off, then the AWS is constrained to a 16-bit integer field in the TCP header — 65536

I think this flag affects AWS and not receive buffer size.

  • if turned on, and if buffer is configured to grow beyond 64KB, then Ack can set AWS to above 65536.
  • if turned off, then we don’t (?) need a large buffer since AWS can only be 65536 or lower.


FIX.5 + FIXT.1 breaking changes

I think many systems are not yet using FIX.5 …

  1. The FIXT.1 protocol [1] is, roughly, a subset of the FIX.4 protocol. It specifies (the traditional) FIX session maintenance over naked TCP
  2. The FIX.5 protocol is, roughly, the other subset of the FIX.4 protocol. It specifies only the application messages, not the session messages.

Therefore,

  • FIX4 over naked TCP = FIX.5 + FIXT
  • FIX4 over non-TCP = FIX.5 over non-TCP

FIX.5 can use a transport messaging queue or web service, instead of FIXT

In FIX.5, Header now has 5 mandatory fields:

  • existing — BeginString(8), BodyLength(9), MsgType(35)
  • new — SenderCompId(49), TargetCompId(56)

Some applications also require MsgSeqNo(34) and SendingTime(52), but these are unrelated to FIX.5

Note BeginString actually looks like “8=FIXT.1.1”

[1] FIX Session layer will utilize a new version moniker of “FIXTx.y”

HFT mktData redistribution via MOM #real world

Several low-latency practitioners say MOM is unwelcome due to added latency:

  1. The HSBC hiring manager Brian R was the first to point out to me that MOM adds latency. Their goal is to get the raw (market) data from producer to consumer as quickly as possible, with minimum hops in between.
  2. 29West documentation echoes “Instead of implementing special messaging servers and daemons to receive and re-transmit messages, Ultra Messaging routes messages primarily with the network infrastructure at wire speed. Placing little or nothing in between the sender and receiver is an important and unique design principle of Ultra Messaging.” However, the UM system itself is an additional hop, right? Contrast a design where sender directly sends to receiver via multicast.
  3. Then I found that the RTS systems (not ultra-low-latency) have no middleware between the feed parser and the order book engine (named Rebus).

However, HFT doesn’t always avoid MOM.

  • P143 [[all about HFT]] published 2010 says an HFT such as Citadel often subscribes to both individual stock exchanges and CTS/CQS [1], and multicasts the market data for internal components of the HFT platform. This design has additional buffers inherently. The first layer receives raw external data via a socket buffer. The 2nd layer components would receive the multicast data via their socket buffers.
  • SCB eFX system is very competitive in latency, with Solarflare NIC + kernel bypass etc. They still use Solace MOM!
  • Lehman’s market data is re-distributed over tibco RV, in FIX format.
  • A major hedge fund was targeting sub-10-microsec latency and decided Solace was unacceptable, so Aeron was chosen

[1] one key justification to subscribe redundant feeds — CTS/CQS may deliver a tick message faster than direct feed!

[20]hedge fund AUM ranking: confusing #BW,Ren,AQR

Note AUM fluctuates year by year, as hot money flows between hedge funds..

— Based on a published ranking (another site has a similar ranking.)

Many hedge fund managers also manage public funds and offer non-hedge fund strategies. To compare two hedge funds’ AUM, one can use

  • AUM in private funds — world top-10 are typically 30-70b in size. Bridgewater (an outlier, along with Renaissance) is #1 managing $132b
    • Man group ($60b+) is biggest in Europe in this respect
    • Citadel ($30b+) is biggest Chicago manager in this respect
  • AUM in public mutual funds following hedge fund strategies — AQR is #1 managing $172b including $60b in private funds.
  • AUM across all types of funds — Blackrock is #1, managing $6400b. Some rankings use different criteria and include ibanks’ asset-management arms.

IP4 fragmentation+reassembly #MTU,offset

I consider this a “halo” knowledge pearl because it is part of an essential everyday service. We can easily find an opportunity to inject it into an IV.

A Trex interviewer said something questionable. I said fragmentation is done at IP layer and he said yes but not reassembly. He was wrong. See P329 [[Computer Networking]]

I was talking about the IP layer breaking up, say, a 4KB packet (TCP or UDP packet) into three IP-fragments no bigger than 1500B [1]. The reassembly task is to put all 3 fragments back together in sequence (and detect missing fragments) and hand the result over to TCP or UDP.

This reassembly is done in IP layer. IP4 uses an “offset” number in each fragment to identify the sequencing and to detect missing fragments. The fragment with the highest offset also has a flag indicating it’s the last fragment of a given /logical/ packet.

Therefore, IP4 detects missing/incomplete IP-datagrams and will never deliver a partial packet to UDP/TCP (P328 [[computer networking]]), even though IP is considered an unreliable service — it simply refuses to “release” an incomplete datagram to the upper layer (i.e. UDP/TCP).

  • If TCP doesn’t get one “segment” (a.k.a. IP-datagram), it will request retransmission
  • UDP does no retransmission
  • IP4 does no retransmission

[1] MTU for some hardware is lower than 1500 Bytes …

##functions(outside big4)using either rvr param or move()

Q: Any function(outside big6) featuring an rvr param?
%%A: Such functions are rare. I don’t know any.
AA: [[effModernC++]] has a few functions taking rvr param, but fairly contrived as I remember. See P170, 173
AA: P544 [[c++primer]] says class methods could use rvr param
* eg: push_back()

Q: any function (outside big6) using std::move?

  • [[effModernC++]] has a few functions such as P170, 174
  • P544 [[c++primer]] says rarely needed
  • [[josuttis]] p20


tcp: detect wire unplugged

In general for either producer or consumer, the only way to detect peer-crash is by probing (eg: keepalive, 1-byte probe, RTO…).

  • Receivers generally don’t probe and will remain oblivious.
  • Senders will always notice a missing Ack (timeout OR 3 duplicate Acks). After retrying, the TCP module gives up and generates SIGPIPE.
Four scenarios, compared:

  • send/recv buffers full — visible symptom: a 1-byte probe from the sender triggers an Ack containing AWS=0; retrans by sender: yes; SIGPIPE: no
  • buffer full, then receiver crashes — visible symptom: the same Ack suddenly stops coming; retrans by sender: yes; SIGPIPE: probably
  • receiver crashes, then sender has data to send — visible symptom: the very first expected Ack doesn’t come; retrans by sender: yes; SIGPIPE: yes
  • receiver crashes amid active transmission — visible symptom: Acks were coming in, then suddenly stop; retrans by sender: yes; SIGPIPE: probably

Q20: if a TCP producer process dies After transmission, what would the consumer get?
AA: nothing. The receiver is ready to receive data, and has no idea that the sender has crashed.
AA: other sources give the same answer.

Q21: if a TCP producer process dies During transmission, what would the consumer get?
%A: ditto. Receiver has to assume sender stopped.

Q30: if a TCP consumer process dies during a quiet period After transmission, what would the producer get?
AA: P49 [[tcp/ip sockets in C]] Sender doesn’t know right away. At the next send(), sender will get -1 as return value. In addition, SIGPIPE will also be delivered, unless configured otherwise.

Q30b: Is SIGPIPE generated immediately or after some retries?
AA: after some retransmission. The sender will notice the missing Ack, and RTO (retransmission timeout) will kick in.
%A: I feel TCP module will Not give up prematurely. Sometimes a wire is quickly restored during ftp, without error. If wire remains unplugged it would look exactly like a peer crash.

Q31: if a TCP consumer dies During transmission, what would the producer get?
%A: see Q30.

Q32: if a TCP consumer process dies some time after buffer full, what would the producer get?
%A: probably similar to above, since the sender would send a 1-byte probe to trigger an Ack. Not getting the Ack tells the sender something. This probe is built-in and mandatory, but functionally similar to the (optional) TCP keepalive feature

I never studied these topics but they are quite common.

Q: same local IP@2 worker-sockets: delivery to who

Suppose two TCP server worker-sockets both have the same local address and port 80. Both connections are active.

When a packet arrives addressed to that local address:port, which socket will the kernel deliver it to? Not both.

Not sure about UDP, but a TCP connection is a virtual circuit or “private conversation”, so socket W1 knows its client’s IP:port. If the incoming packet’s source IP:port doesn’t match that, then this packet doesn’t belong to the conversation.

simultaneous send to 2 tcp clients #multicast emulation

Consider a multi-core machine hosting either a forking or multi-threaded tcp server. The accept() call would return twice, with file descriptors 5 and 6 for two new-born worker sockets. Both could have the same server-side address, but their local ports are definitely identical to the listening port, like 80.

There will be 2 dedicated threads (or processes) serving file descriptor 5 and 6, so they can both send the same data simultaneously. The two data streams will not be exactly in-sync because the two threads are uncoordinated.

My friend Alan confirmed this is possible. Advantages:

  • can handle slow and fast receivers at different data rates. Each tcp connection maintains its own state
  • guaranteed delivery

For multicast, a single thread would send just a single copy  to the network. I believe the routers in between would create copies for distribution.  Advantages:

  • Efficiency — Router is better at this task than a host
  • throughput — tcp flow control is a speed bump


new socket from accept()inherits port# from listen`socket

[[tcp/ip sockets in C]] P96 has detailed diagrams; this write-up is based on other sources.

Background — a socket is an object in memory, with dedicated buffers.

We will look at an sshd server listening on port 22. When accept() returns without error, the return value is a positive integer file descriptor pointing to a new-born socket object with its own buffers. I call it the “worker” socket. It inherits almost every property from the listening socket (but see tcp listening socket doesn’t use a lot of buffers for differences.)

  • local (i.e. server-side) port is 22
  • local address is the single address that the client connected to. Note an sshd server could listen on the ANY address, or on a single address.
  • … so if you look at the local address:port, the worker socket and the listening socket may look identical, but they are two objects!
  • remote port is some random high port client host assigned.
  • remote address is … obvious
  • … so if you compare the worker socket vs the listening socket, they have
    • identical local port
    • possibly different local IP (the listener often binds to ANY)
    • different remote IP and remote port (the listening socket has no remote peer)

tcp listening socket doesn’t use a lot of buffers, probably

The server-side “worker” socket relies on the buffers. So does the client-side socket. But the listening socket probably doesn’t need the buffers, since it exchanges very little data with clients.

That’s one difference between listening socket vs worker socket

2nd difference is the server (i.e. local) address field. Worker socket has a single server address filled in. The listening socket often has “ANY” as its server address.

non-forking STM tcp server

This a simple design. In contrast, I proposed a multiplexing design for a non-forking single-threaded tcp server in accept()+select() : multiple persistent worker-sockets

ObjectSpace manual P364 shows a non-forking single-threaded tcp server. It exchanges some data with each incoming client, immediately disconnects the client, and goes back to accept()

Still, when accept() returns, it returns with a new “worker” socket that’s already connected to the client.

This new “worker” socket has the same port as the listening socket.

Since the worker socket object is a local variable in the while() loop, it goes out of scope and gets destructed right away.

y tcp server listens on ANY address

Background — a server host owns multiple addresses, as shown by ifconfig. There’s usually an address for each network “interface”. (Routing is specified per interface.) Suppose we have one address on IF1 and another on IF2.

By default, an Apache server would listen on both addresses. This allows clients from different network segments to reach us.

If the Apache server listens on only one of the addresses, some clients may be unable to reach it — perhaps due to routing!

Ack in FIX #TCP

Does TCP require an Ack for every segment? No — see cumulative acknowledgement in [[computerNetworking]]

FIX protocol standard doesn’t require Ack for every message.

However, many exchanges’ specs require an Ack message for {order submission OR execution report}. At the TrexQuant design interview, I dealt with this issue —

  • –trader sends an order and expecting an Ack
  • exchange sends an Ack and expecting another Ack … endless cycle
  • –exchange sends an exec report and expecting an Ack
  • trader sends an Ack waiting for another Ack … endless cycle
  • –what if at any point trader crashes? Exchange would assume the exec report is not acknowledged?

My design minimizes duty of the exchange —

  • –trader sends an order and expecting an Ack
  • exchange sends an Ack and assumes it’s delivered.
  • If trader misses the Ack she would decide either to resend the order (with PossResend flag) or query the exchange
  • –exchange sends an exec report and expecting an Ack. Proactive resend (with PossResend) when appropriate.
  • trader sends an Ack and assumes it’s delivered.
  • If the exchange doesn’t get any Ack, it would force a heartbeat with TestRequest. If that fails too, the exchange assumes the trader is offline.

forward^move on various reference types

Suppose there are 3 overloads of a sink() function, each with a lvr, rvr or nonref parameter.

(In reality these three sink() functions will not compile together due to ambiguity. So I need to remove one of the 3 just to compile.)

We pass string objects into sink() after going through various “routes”. In the table’s body, if you see “passing on as lvr” it means compiler picks the lvr overload.

thingy passed through ↓ / route → | as-is | move() | fwd without move() | fwd + move()
output from move() (no named variable) | pbclone or pbRvr | ? should be fine | pbref: passing on an lvr | pbRvr
natural temp (no named variable) | pbclone or pbRvr | ? should be fine | pbref: passing on an lvr | pbRvr
named rvr variable from move() ! [2] | pbref or pbclone [1] | pbRvr | pbref? | —
lvr variable | pbref or pbclone | passing on an rvr | pbref? | should never occur
nonref variable | pbref or pbclone | passing on an rvr | pbref? | should never occur

These are my experiments. Take-aways:

— when a non-template function has an rvr param like f(MyType && s), we know the object behind s is a naturally-occurring temporary object (OR someone explicitly marked a non-temp as no-longer-in-use), but in order to call the move-ctor of MyType, we have to write MyType(move(s)). Otherwise, to my surprise, s is treated as an lvr and we end up calling the MyType copy-ctor. [1]

  • with a real rvr (not a universal ref), you either directly steal from it, or pass it to another Func2 via Func2(move(rvrVar)). Func2 is typically mv-ctor (including assignment)

[2] Surprise Insight: Any time you save the output of move() in a Named rvr variable like q(string && namedRvrVar), this variable itself has a Location and is an lvalue expression, even though the string object is now marked as abandoned. Passing this variable would be passing an lvr!

  • If you call move(), you must not save move() output to a named variable! This rule is actually consistent with…
  • with a naturally-occurring temp like string(“a”)+”1″, you must not save it to a named variable, or it will become a non-temp object.

— std::forward requires a universal reference i.e. type deduction.

If I call factory<MyType>(move(mytype)), there’s no type deduction. This factory<MyType>(..) is a concretized function, equivalent to factory(MyType && s). It is fundamentally different from the function template, and can Only be called with a true rvr. If you call it like factory<MyType>(nonrefVar), the compiler will complain no-matching-function, because the nonref argument can only match an lvr parameter or a by-value parameter. My experiments show that every time any function receives a variable var1 as a universal ref or a true rvr, we always apply either move(var1) or forward(var1) before passing it to another function. I think as-is passing would unconditionally pass var1 as an lvr.

c++matrix using deque@deque #python easier

My own experiment shows

  • The matrix had better be default-populated with zeros. Afterwards, you can easily overwrite individual cells without bound checks.
  • it’s easy to insert a new row anywhere. Vector would be inefficient.
  • To insert a new column, we need a simple loop

For python, # zero-initialize a 5-row, 8-column matrix:
width, height = 8, 5
Matrix = [[0 for x in range(width)] for y in range(height)]

In any programming language, the underlying data structure is a uniform pile-of-horizontal-arrays, therefore it’s crucial (and tricky) to understand indexing. It’s very similar to matrix indexing — Mat[0,1] refers to first row, 2nd element.

Warning: 2d array is hard to pass in c++, based on personal experience 😦 You often need to specify the size in the receiving function declaration. Even if this is feasible, it’s unwanted legwork.

[1] Warning — The concept of “column” is mathematical (matrix) and non-existent in our implementation, therefore misleading! I will avoid any mention of it in my source code. No object no data structure for the “column”!

[2] Warning — Another confusion due to mathematics training. Better avoid Cartesian coordinates. Point(4,1) is on the 2nd row, 5th item, therefore arr[1][4] — so you need to swap the subscripts.

 | 1st subscript | 2nd subscript
max subscript | 44 | 77
height #rowCnt | 45 #not an index value <== | 
width #rowSz [1] | | ==> 78 #not an index value
example value | 1 picks 2nd row | 4 picks 5th item in the row
variable name | rowId, whichRow | subId [1]
Cartesian coordinate [2] | y=1 (Left index) | x=4

ValueType default ctor needed in std::map[]

Gotcha — Suppose you define a custom Value type for your std::map and you call mymap[key] = Value(…). There’s an implicit call to the no-arg ctor of the Value class!

If you don’t have a no-arg ctor, then avoid operator[](). Use insert(make_pair()) and find(key) methods.

Paradox — for both unordered_map and map, even a lookup HIT via operator[] requires a no-arg ctor to be available. The reason is,

  • since your code uses map operator[], the compiler will “see” the template member function operator[]. This function explicitly calls the ctor within an if-block.
    • If you have any lookup miss, this ctor is called at run time
    • If you have no lookup miss, this ctor is never called at run time
    • compiler doesn’t know if lookup miss/hit may happen in the future, so both branches are compiled into a.out
  • If your source code doesn’t use operator[], then at compile time, the template function operator[] is not included in a.out, so you don’t need the default ctor.


pass in func name^func ptr^functor object

Let’s set the stage — you can pass in either a function name (like myFunc), a ptr-to-function (&myFunc), or a functor object. The functor is recommended even though it involves more typing. Justification — inlining. I remember a top c++ expert saying so.

I believe “myFunc” is implicitly converted to &myFunc by the compiler, so these two forms are equivalent.


Y allocate static field in .c file %%take

Why do we have to define the static field myStaticInt in a cpp file?

For a non-static field myInt, the allocation happens when the class instance is allocated on stack, on heap (with new()) or in global area.

However, myStaticInt isn’t taken care of by any of those — it’s not part of the real estate of the new instance. That’s why we declare it in the class header, and then define it exactly once (ODR) in a cpp file. It is allocated in static storage when the program loads — static allocation.

RVO^move : on return value

Let’s set the stage. A function returns a local Trade object “myTrade” by value. Will RVO kick in, or move semantics? Not both!

I had lots of confusion about these 2 features. [[effModernC++]] P176 has a long discussion and an advice — do not write std::move() hoping to “help” the compiler on a local object being returned from a function:

  • If the local object is eligible for RVO then all compilers would elide the copy. Your std::move() would hinder the compiler and backfire
  • if the local object is ineligible for RVO then compilers are required to treat the returned object as an rvalue, implicitly as if by std::move(), so your help is unneeded.
    • Note a local object returned by value becomes a naturally-occurring temp object.

P23 [[c++stdLib]] gave 2-line answer:

  1. if Trade class has a suitable copy or move ctor, then the compiler may choose to “elide the copy”. This was long implemented as the RVO optimization in most compilers before c++11. My experiments confirm it.
  2. otherwise, if Trade class has a move ctor, the myTrade object is robbed

So if condition for RVO is present, then most likely your move-ctor will NOT run.

c++compiler select`move-ctor

This is a __key__ part of understanding move-semantics, seldom quizzed. Let’s set the stage:

  • you overload a traditional insert(Amount const &)  with a move version insert(Amount &&)
    • by the way, Derrick of TrexQuant introduced a 3rd alternative emplace()
  • without explicit std::move, you pass in an argument into insert()
  • (For this example, I want to keep things simple by avoiding constructors, but the rules are the same.)

Q1: When would the compiler select the rvr version?

P22 [[c++stdLib]] has a limited outline. Here’s my illustration

  • if I pass in a temporary like insert(originalAmount + 15), then this argument is an rvalue object, so the rvr version is selected
  • if I pass in a regular variable like insert(originalAmount), then this argument is an lvalue object, so the traditional version is selected
  • … See also my dedicated blogpost in c++11overload resolution TYPE means..

After we are clear on Q1, we can look at Q2

Q2: how would std::move help?
A: insert(std::move(originalAmount)); // if we know the object behind originalAmount is no longer needed. My experiments show when we need std::move() and when we don’t.

nearest-rank percentile definition

A few concise pointers:

  • a percentile value is always an original object, never interpolated
  • we can think of it as a mapping function percentile(int100) -> an original object, where int100 data type can be a value from 1 to 100 (zero disallowed). Two distinct inputs can map to the same output object.
  • 100th percentile is always the largest object in the list. No such thing as 0th. 1st percentile is not always the smallest object.


In our discussions on ODR, global variables, file-scope static variables, global functions … the concept of “shared header” is often misunderstood.

  • If a header is only included in one *.cpp, then its content is effectively part of a *.cpp.

Therefore, you may experiment by putting “wrong” things in such a private header and the set-up may work or fail, but it’s an invalid test. Your test is basically putting those “wrong” things in an implementation file!


STL iterator invalidation rules, succinctly

The rules are concise. Specifically,

  • list insertions don’t invalidate any iterator. I feel iterator is a pointer to a node.
  • tree insertions don’t invalidate any iterator. Same reason as list.
  • for erasure from lists or trees, only the iterator to the erased node is invalidated.

Now for vector:

  • vector insertion invalidates any iterator positioned at or after the insertion point. If reallocation happens, due to exceeding vector.capacity(), then all iterators are invalidated

Who cares?

  1. in projects, we always check the documentation
  2. in coding interviews, we can save precious time if we remember these rules
  3. in QQ interviews, I have never received such a knowledge question. I think interviewers don’t consider this very “deep”, but we could possibly impress some interviewers.

tcp: one of 3 client-receivers is too slow

See also no overflow]TCP slow receiver #non-blocking sender

This is a real BGC interview question

Q: server is sending data fast. One of the clients (AA) is too slow.

Background — there will be 3 worker-sockets. The local address:port will look identical among them if the 3 clients connect to the same network interface, from the same network segment.

The set-up is described in simultaneous send to 2 tcp clients #multicast emulation

Note every worker socket for every client has identical local port.


I believe the AA connection/session/thread will stagnate. At a certain point [1] the server will have to drop the (mounting) data queue and release memory — data loss for the AA client.

[1] can happen within seconds for a fast data feed.

I also feel this set-up overloads the server. A TCP server has to maintain state for each laggard client, assuming single-threaded multiplexing(?). If each client takes a dedicated thread then server gets even higher load.

Are 5 client Connections using 5 sockets on server? I think so. Can a single thread multiplex them? I think so.

## Y avoid blocking design

There are many contexts. I only know a few.

1st, let’s look at a socket context. Suppose there are many (like 50 or 500) sockets to process. We don’t want 50 threads. We prefer fewer — perhaps 1 thread that checks each “ready” socket, transfers whatever data can be transferred, then goes back to waiting. In this context, we need either

  • /readiness notification/, or
  • polling
  • … Both are compared on P51 [[TCP/IP sockets in C]]

2nd scenario — GUI. Blocking a UI-related thread (like the EDT) would freeze the screen.

3rd, let’s look at some DB request client. The request thread sends a request and it would take a long time to get a response. Blocking the request thread would waste some memory resource but not really CPU resource. It’s often better to deploy this thread to other tasks, if any.

Q: So what other tasks?
A: ANY task, in the thread pool design. The requester thread completes the sending task, and returns to the thread pool. It can pick up unrelated tasks. When the DB server responds, any thread in the pool can pick it up.

This can be seen as a “server bound” system, rather than IO bound or CPU bound. Both the CPU task queue and the IO task queue get drained quickly.


no overflow]TCP slow receiver #non-blocking sender

Q: Does TCP receiver ever overflow due to a fast sender?


A: It should not. When the receive buffer is full, the receiver advertises a window size of 0 to inform the sender. If the sender app ignores it and continues to send, the data will remain in the send buffer, not sent over the wire. Soon the send buffer fills up and send() blocks. On a non-blocking TCP socket, send() returns with an error only when it can’t send a single byte. (UDP is different.)

Non-blocking send/receive operations either complete the job or return an error.

Q: Do they ever return with part of the data processed?
A: Yes they return the number of bytes transferred. Partial transfer is considered “completed”.


3rd effect@volatile introduced@java5

A Wells Fargo java interviewer said there are 3 effects. I named two:

  1. load/store to main memory on the target variable
  2. disable statement reordering

I think interviewer mentioned a 3rd effect about memory barrier.

This quote from Java Concurrency in Practice, chap. 3.1.4 may be relevant:

The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable. So from a memory visibility perspective, writing a volatile variable is like exiting a synchronized block and reading a volatile variable is like entering a synchronized block.

— “volatile” on a MyClass field provides special concurrency protection when the MyClass field is constructed:

java reference-type volatile #partially constructed has good details.

In an online discussion, Nialscorva’s answer echoes the interviewer:

Before java 1.5, the compiler can reorder the two steps

  1. construction of the new object
  2. assigning the new address to the variable

In such a scenario, other threads (unsynchronized) can see the address in the variable and use the incomplete object, while the construction thread is preempted indefinitely like for 3 hours!

So in java 1.5, the construction is among the “other writes” by the volatile-writing thread! Therefore, the construction is flushed to memory before the address assignment. Below is my own solution, using a non-static volatile field:

public class DoubleCheckSingleton {
	private static DoubleCheckSingleton inst = null;
	private volatile boolean isConstructed = false;

	private DoubleCheckSingleton() {
		/* other construction steps */
		this.isConstructed = true; // last step
	}

	static DoubleCheckSingleton getInstance() {
		if (inst != null && inst.isConstructed) return inst;
		synchronized (DoubleCheckSingleton.class) {
			if (inst != null && inst.isConstructed) return inst;
			/* This design makes use of the volatile feature that's reputed to be java5.
			 * Without the isConstructed volatile field, an incomplete object's
			 * address could be assigned to inst, so another thread entering getInstance()
			 * would see a non-null inst and use the half-cooked object 😦
			 * The isConstructed check ensures the construction has completed. */
			return inst = new DoubleCheckSingleton();
		}
	}
}

volume alone doesn’t make something big-data

The Oracle nosql book has these four “V”s to qualify any system as big data system. I added my annotations:

  1. Volume
  2. Velocity
  3. Variety of data format — If any two data formats account for more than 99% of your data in your system, then it doesn’t meet this definition. For example, FIX is one format.
  4. Variability in value — Does the system treat each datum equally?

Most of the so-called big-data systems I have seen don’t have these four V’s. All of them have some volume but none has the Variety or the Variability.

I would venture to say that

  • 1% of the big-data systems today have all four V’s
  • 50%+ of the big-data systems have no Variety no Variability
    • 90% of financial big-data systems are probably in this category
  • 10% of the big-data systems have 3 of the 4 V’s

The reason that these systems are considered “big data” is the big-data technologies applied. You may call it “big data technologies applied on traditional data”

See #top 5 big-data technologies

Does my exchange data qualify? Definitely high volume and velocity, but no Variety or Variability.

data science^big data Tech

The value-add of big-data (as an industry or skillset) == tools + models + data

  1. If we look at 100 big-data projects in practice, each one has all 3 elements, but 90-99% of them would have limited value-add, mostly due to the model — exploratory research
    1. data mining probably uses similar models IMHO, but we know its value-add is not so impressive
  2. tools — are mostly software, but also include cloud.
  3. models — are the essence of the tools. Tools are invented and designed mostly for models. Models are often theoretical. Some statistical tools are tightly coupled with the models…

Fundamentally, the relationship between tools and models is similar to Quant library technology vs quant research.

  • Big data technologies (acquisition, parsing, cleansing, indexing, tagging, classifying..) is not exploratory. It’s more similar to database technology than scientific research.
  • Data science is an experimental/exploratory discovery task, like other scientific research. I feel it’s somewhat academic and theoretical. As a result, salary is not comparable to commercial sectors. My friend Jingsong worked with data scientists in Nokia/Microsoft.

The biggest improvement in recent years are in … tools

The biggest “growth” over the last 20 years is in data. I feel user-generated data is dwarfed by machine generated data

data mining^big-data

Data mining has been around for 20 years (before 1995). The most visible and /compelling/ value-add in big-data always involves some form of data mining, often using AI including machine-learning.

Data mining is the valuable thing that customers pay for, whereas big-data technologies enhance the infrastructure supporting the mining. One online comment makes this point in a /critical/ and concise way. I modified it slightly for emphasis:

Data mining involves finding patterns from datasets. Big data involves large scale storage and processing of datasets. So combining both, data mining done on big data(e.g, finding buying patterns from large purchase logs) is getting lot of attention currently.

NOT all big-data tasks are data-mining tasks (e.g. large-scale indexing).

NOT all data-mining tasks are on big data (e.g. data mining on a small file, which can be performed on a single node). However, note that Wikipedia (as of 10 Sept 2012) defines data mining as “the process that attempts to discover patterns in large data sets”.

(latency) DataGrid^^noSQL (throughput)

  • Coherence/Gemfire/gigaspace are traditional data grids, probably distributed hashmaps.
  • One of the four categories of noSQL systems is also a distributed key/value hashmaps, such as redis
  • …. so what’s the diff? One insightful online answer — DataGrids were designed for latency; noSQL systems were designed for throughput.

I can see the same trade-off —

  • FaceBook’s main challenge/priority is fanout (throughput)
  • RTS’s main challenge is throughput — TPS, measured in messages per second
  • HFT main challenge is nanosec latency.

For a busy exchange, latency and throughput are both important, but if they must pick one? .. throughput. I think most system designers would lean towards throughput if data volume becomes unbearable.

I guess a system optimized for latency may be unable to cope with such volume. The request queue would probably overflow … loss of data

In contrast, a throughput-oriented design would still stay alive under unusual data volume. I feel such designs are likely more HA and more fault-tolerant.

c++debug^release build can modify app behavior #IV

This was actually asked in an interview, but it’s also good GTD knowledge. One online answer points out —

  • fewer uninitialized variables — Debug mode is more forgiving because it is often configured to initialize variables that have not been explicitly initialized.
    • For example, perhaps you’re deleting an uninitialized pointer. In debug mode it works because the pointer was nulled, and delete ptr is a no-op on NULL. In release it’s some rubbish address, so delete ptr will actually cause a problem. Another answer points out —

  • guard bytes on the stack frame — the debug build puts more on the stack, so you’re less likely to overwrite something important.

I have frequently experienced reading/writing beyond an array limit. Another answer points out —

  • relative timing between operations is changed by debug build, leading to race conditions

Echoed on P260 of [[art of concurrency]], which says (in theory) it’s possible to hit a threading error with optimization and no such error without optimization, which would represent a bug in the compiler.

P75 [[moving from c to c++]] hints that compiler optimization may lead to “critical bugs” but I don’t think so.

  • poor use of assert can have side effects in the debug build only. Release builds typically define NDEBUG, turning off all assertions, as assertion failure messages are unwelcome in production.

array^pointer variables types: indistinguishable

  • int i; // a single int object
  • int arr[]; //a nickname of the starting address of an array, very similar to a pure-address const pointer
  • int * const constPtr;
  • <— above two data types are similar; below two data types are similar —->
  • int * pi; //a regular pointer variable,
  • int * heapArr = new int[9]; //data type is same as pi

q[std::less] functor ^ operator<() again, briefly

[[effSTL]] P177 has more details than I need. Here are a few key points:

std::map and std::set — by default uses less<Trade>, which often uses a “method” operator<() in the Trade class

  • If you omit this operator, you get verbose STL build error messages about the missing operator<()
  • this operator<() must be a const method, otherwise you get lengthy STL build errors.
  • The other two (friendly) alternatives are
    • function pointer — easiest choice for a quick coding test
    • binary functor class with an operator()(Trade, Trade) — more complex but most efficient; best practice.


j4 factory: java^c++ #Wells

A Wells Fargo interviewer asked

Q6: motivation of factory pattern?
Q6b: why prevent others calling your ctor?

  • %%A: some objects are expensive to construct (DbConnection) and I need tight control.
  • %%A: similarly, after construction, I often have some initialization logic in my factory, but I may not be allowed to modify the ctor, or our design doesn’t favor doing such things in the ctor. I make my factory users’ life easier if they don’t call new() directly.
  • AA: more importantly, the post-construction function could be virtual! This is a sanctioned justification for c++ factory on the authoritative [[c++codingStd]] P88
  • %%A: I want to (save the new instance and) throw exception in a post-construction routine, but I don’t want to throw from ctor
  • %%A: I want to centralize the non-trivial business rule of selecting target types to return. Code reuse rather than duplication
  • %%A: a caching factory
  • %%A: there are too many input parameters to my ctor and I want to provide users a simplified façade

cross-currency equity swap: %%intuition

Trade 1: At Time 1, CK (a hedge fund based in Japan) buys one share of GE priced at USD 10, paying JPY 1000. Eleven months later, GE is still at USD 10 which is now JPY 990. CK faces a paper loss due to FX. I will treat USD as asset currency. CK bought 10 greenbacks at 100 yen each and now each greenback is worth 99 yen only.

Trade 2: a comparable single-currency eq-swap trade

Trade 3: a comparable x-ccy swap. At Time 1, the dealer (say GS) buys and holds GE on client’s behalf.

(It is instructive to compare this to Trade 2. The only difference is the FX.)

In Trade 3, how did GS pay to acquire the share? GS received JPY 1000 from CK and used it to get [1] 10 greenbacks to pay for the stock.

Q: What (standard) solutions do GS have to eliminate its own FX risk and remain transparent to client? I think GS must pass on the FX risk to client.

I think in any x-ccy deal with a dealer bank, this FX risk is unavoidable for CK. The bank always avoids the FX risk and transfers it to the client.

[1] GS probably bought USDJPY on the street. Who GS bought from doesn’t matter, even if that’s another GS trader. For an illiquid currency, GS may not have sufficient inventory internally. Even if GS has inventory under trader Tom, Tom may not want to Sell the inventory at the market rate at this time. Client ought to get the market rate always.

After the FX trade, GS house account is long USDJPY at price 100 and GS want USD to strengthen. If GS effectively passes on the FX risk, then CK would be long USDJPY.

I believe GS needs to Sell USDJPY to CK at price 100, to effectively and completely transfer the FX risk to the client. In a nutshell, GS sells 10 greenbacks to CK, and CK uses the 10 greenbacks to enter an eq-swap deal with GS.

GS trade system probably executes two FX trades

  1. buy USDJPY on street
  2. sell USDJPY to CK

After that,

  • GS is square USDJPY.
  • CK is Long USDJPY at price 100. In other words, CK wants USD to strengthen.

I believe the FX rate used in this trade must be communicated to CK.

Eleven months later, GS hedge account has $0 PnL since GE hasn’t moved. GS FX account is square. In contrast, CK suffers a paper loss due to FX, since USD has weakened.

As a validation (as I explained to my son), notice that this outcome is identical to the traditional trade, where CK buys USDJPY at 100 to pay for the stock. Therefore, this is a fair deal.

Q: Does GS make any money on the FX?
A: I don’t think so. If they do, it’s Not by design. By design, GS ought to Sell USDJPY to client at fair market price. “Fair” implies that GS could only earn bid/ask spread.

once algo cracked,syntax/idiom=雕虫小技@@ NO

Background — let’s focus on short, compiler-based coding tests.

Q: If it takes 2Y to improve from 3 to 6, then what percentage of the effort is syntax?
A: less than 50%; the syntax part can take as little as a few days. ECT is harder.

In java or python, my coding test track record is better. The syntax know-how required is relatively small, because the questions use only array, hash table, linked nodes, StringBuilder/stringstream… Look at EPI300.

C++ feels similar. Once the algo (and data structures) is cracked, the syntax know-how required consist of about 500 common operations. (I have some of it listed in my blog…)

However, once I have a decent idea, it often takes me 20 – 60 minutes to pass the basic tests, but I am given only 30 minutes! Why am I so slow? Syntax is sometimes non-trivial, and so is ECT, even without debugging the corner cases.

Some non-trivial syntax + idioms needed in speed coding — ##eg@syntax learning via coding drill

##shared_ptr thr-safety: 3 cases

This topic is worthwhile as it is about two high-value topics … threading + smart ptr

At least 3 interviewers pointed out thread safety issues … the first answer I saw shows many correct MT usages and incorrect MT usages. Looking at that answer, I see at least 3 distinct “objects” that could be “shared-mutable”:

  1. control block — shared by different club members (all the shared_ptr copies). Any one club member, like global_instance, could be a shared mutable object. Concurrent access to the control block is internally managed by the shared_ptr implementation and is thread-safe.
  2. pointee on heap — is shared  mutable. If 2 threads call mutator methods on this object, you can hit race condition.
  3. global_instance variable —  a shared mutable instance of shared_ptr. race condition 😦




BFT^DFT, pre^in^post-order #trees++

Here are my own “key” observations, possibly incomplete, on 5 standard tree walk algos, based on my book [[discrete math]]

  • in-order is defined for binary trees only. Why? in-order means “left-subtree, parent, right-subtree”, implying only 2 subtrees.
  • In contrast, BFS/DFS are more versatile. Their input can be any graph. If not a tree, then BFS/DFS could produce a tree — you first choose a “root” node arbitrarily.
  • The 3 *-order walks are recursively defined. I believe they are three specializations of DFS.
  • The differences between the 3 *-order walks are visually illustrated via the position of the black dot in diagrams found online.

—-That’s a tiny intro. Below are more advanced —

BFT uses an underlying queue. In contrast DFS is characterized by backtracking, which requires a stack. However, you can implement the queue/stack without recursion.

BFT visits each node twice; DFT visits each node at least twice — opening (to see its edges) and leaving (when all edges are explored)

DFT backtrack in my own language —

  1. start at root.
  2. At each node, descend into the first [1] subtree
  3. descend until a leaf node A1
  4. backtrack exactly one level up, to A1’s parent B
  5. descend into A1’s immediate next sibling A2’s subtree (if any) until a leaf node. If unable to move down (i.e. no A2), then move up again, to B’s parent C.

Visually — if we assign a $val to each node such that these nodes form a BST, then we get the “shadow rule”. An in-order DFS would visit the nodes in ascending order of $val.

[1] the child nodes (one or more) at each node must be remembered in some order.

ODR@functions # and classes

Warning – ODR is never quizzed in IV. A weekend coding test might touch on it but we can usually circumvent it.

OneDefinitionRule is more strict on global variables (which have static duration). You can’t have 2 global variables sharing the same name. Devil is in the details:

(As explained in various posts, you declare the same global variable in a header file that’s included in various compilation units, but you allocate storage in exactly one compilation unit. Under a temporary suspension of disbelief, let’s say there are 2 allocated storage for the same global var, how would you update this variable?)

With a free function f1(), ODR is more relaxed. A Dr Dobbs article (possibly buggy) explains the Lesser ODR vs the Greater ODR. The Lesser ODR is simpler and more familiar, forbidding multiple (same or different) definitions of f1() within one compilation unit.

My real focus today is the Greater ODR. Obeying the Lesser ODR, the same static or inline function is often included via a header file and compiled into multiple binary files. If you want to put a non-template free function definition in a shared header file but avoid the Greater ODR, then it must be static or inline, implicitly or explicitly. I find the Dr Dobbs article unclear on this point — in my test, when a free function was defined in a shared header without the “static” or “inline” keywords, the linker screamed “ODR!”

The most common practice is to move function definitions out of shared headers, so the linker (or compiler) sees only one definition globally.

With inline, the linker actually sees multiple (hopefully identical) physical copies of f1(). The copies are usually identical definitions. If they actually differ, the compiler/linker can’t easily notice and is not required to verify, so there is no build error (!) but you could hit strange runtime errors.

The Java linker is simpler and never causes me any problem, so I never looked into it.

//* if exactly one copy is "inline", then the non-inlined version is used.
//  The linker doesn't detect the discrepancy between the two implementations.
//* if both copies are "inline", then main.cpp won't link since both
//  definitions are invisible to main.cpp
//* if I remove both "inline", then we hit an ODR linker error
//* objdump on the object file would show the function name
//  IFF it is exposed i.e. non-inline
// (file names below are illustrative)

// ---------- file1.cpp ----------
#include <iostream>
using namespace std;
void sameFunc(){ cout<<"file1 version"<<endl; }

// ---------- file2.cpp ----------
#include <iostream>
using namespace std;
void sameFunc(){ cout<<"file2 version"<<endl; }

// ---------- main.cpp ----------
void sameFunc(); //needed
int main(){ sameFunc(); }

Sliding-^Advertised- window size

There are real-life illustrations of these concepts using wireshark.

  • AWS = amount of free space on receive buffer
    • This number, along with the ack seq #, are both sent from receiver to sender
  • lastSeqSent and lastSeqAcked are two sender control variable in the sender process.

SWS is derived from positions within a stream; AWS is a single integer (a size).

Q: how are the two variables updated during transmission?
A: When an Ack for packet #9183 is received, the sender updates its control variable “lastSeqAcked”. It then computes how many more packets to send, ensuring that “lastSeqSent – lastSeqAcked < AWS”

  • SWS (sliding window size) = lastSeqSent – lastSeqAcked = amount of transmitted but unacknowledged bytes, is a derived control variable in the sender process, like a fluctuating “inventory level”.
  • SWS is a concept; AWS is a TCP header field
  • receive buffer size — is sometimes incorrectly referred as window size
  • “window size” is vague but usually refers to SWS
  • AWS is probably named after the sender’s sliding window.
  • receiver/sender — only these players control the SWS, not the intermediate routers etc.
  • too large — a large SWS is only safe with a large receive buffer, as reflected in the AWS
  • too small — underutilized bandwidth. As explained in linux tcp buffer^AWS tuning, high bandwidth connections should use larger AWS.

Q: how are AWS and SWS related?
A: The sender adjusts lastSeqSent (and SWS) based on the feedback of AWS and lastSeqAcked.

c++variables: !! always objects

Every variable that holds data is an object. Objects are created either with static duration (sometimes by defining rather than merely declaring the variable), with automatic duration (declaration alone) or with dynamic duration via new/malloc().

That’s the short version. Here’s the long version:

  • heap objects — have no name, no host variable, no door plate. They only have addresses. The address could be saved in a “pointer-object”, which is a can of worm.
    • In many cases, this heap address is passed around without any pointer object.
  • stack variables (including function parameters) — each stack object has a name (multiple possible?) i.e. the host variable name, like a door plate on the memory location. This memory is allocated when the stack frame is created. When you clone a stack variable you get a cloned object.
    • Advanced — You could create a reference to the stack object, when you pass the host variable by-reference into another function. However, you should never return a stack variable by reference.
  • static Locals — the name myStaticLocal is a door plate on the memory location. This memory is allocated the first time this function is run. You can return a reference to myStaticLocal.
  • file-scope static objects — memory is allocated at program initialization, but if you have many of them scattered in many files, their order of initialization is unpredictable. The name myStaticVar is a door plate on that memory location, but this name is visible only within this /compilation-unit/. If you declare and define it (in one step!) in a shared header file (bad practice) then you get multiple instances of it:(
  • extern static objects — Again, you declare and define it in one step, in one file — ODR. All other compilation units  would need an extern declaration. An extern declaration doesn’t define storage for this object:)
  • static fields — are tricky. The variable name is there after you declare it, but it is a door plate without a door. It only becomes a door plate on a storage location when you allocate storage i.e. create the object by “defining” the host variable. There’s also a one-definition-rule (ODR) for static fields, so you first declare the field without defining it, then you define it elsewhere.

Note: thread_local is a fourth storage duration, after 1) dynamic, 2) automatic and 3) static

C++build error: declared but undefined variable

I sometimes declare a static field in a header, but fail to define it (i.e. give it storage). It compiles fine and may even link successfully. When you run the executable, you may hit

error loading library /home/ undefined symbol: _ZN14arcabookparser6Parser19m_srcDescriptionTknE

Note this is a shared library.
Note the field name is mangled. You can un-mangle it using c++filt:

c++filt _ZN14arcabookparser6Parser19m_srcDescriptionTknE -> arcabookparser::Parser::m_srcDescriptionTkn

According to Deepak Gulati, the binary files only contain mangled names. The linker and all subsequent programs deal exclusively with mangled names.

If you don’t use this field, the undefined variable actually will not bother you! With no reference to it, there is nothing for the linker or loader to resolve.

zbs/GTD/KPI/effi^productivity #succinctly

See zbs^GTD^QQ

  • zbs — real, core strength (内功) of the tech foundation for GTD and IV.
  • GTD —
  • KPI — boss’s assessment often uses productivity as the most important underlying KPI, though they won’t say it.
  • productivity — GTD level as measured by __manager__, at a higher level than “effi”
  • effi — a lower level measurement than “Productivity”

c++class field defined]header,but global vars obey ODR

Let’s put function declaration/definition aside — simpler.

Let’s put aside local static/non-static variables — different story.

Let’s put aside function parameters. They are like local variables.

The word “static” is heavily overloaded and confusing. I will try to avoid it as far as possible.

The tricky/confusing categories are

  • category: static field. Most complex; better discussed in a dedicated post.
  • category: file-scope var — i.e. those non-local vars with “static” modifier
  • category: global var declaration — using “extern”
    • definition of the same var — without “extern” or “static”
  • category: non-static class field, same as the classic C struct field <– the main topic in the post. This one is not about declaration/definition of a variable with storage. Instead, this is defining a type!

I assume you can tell a variable declaration vs a variable definition. Our intuition is usually right.

The Aha — [2] pointed out — a struct field listing merely describes what constitutes a struct type, without actually declaring the existence of any variable, anything to be constructed in memory, anything addressable. Therefore, this listing is more like an integer variable declaration than a definition!

Q: So when is the memory allocated for this field?
A: when you allocate memory for an instance of this struct. The instance then becomes an object in memory. The field also becomes a sub-object.

The main purpose of keeping a struct definition in a header — the compiler needs to calculate the size of the struct. A completely different purpose from function or object declarations in headers. Scott Meyers discussed this in depth, along with class fwd declaration and pimpl.


global^file-scope variables]c++ #extern

(Needed in coding tests and GTD, not in QnA interviews.)

Any object declared outside a block has “static duration” which means (see MSDN) “allocated at compile time not run time”

“Global” means extern linkage i.e. visible from other files. You globalize a variable by removing the “static” modifier if any. One reference explains the 2+1 variations of non-local objects. I have added my own variations:

  • A single “static” variable in a *.cpp file. “File-scope” means internal linkage i.e. visible only within the file. You make a variable file-scope by adding “static”.
  • an extern (i.e. global mutable) variable — I have used this many times but still don’t understand all the rules. Basically, in one *.cpp it’s defined, without “static” or “extern”. In all other *.cpp files, it’s declared (via header) extern, without an initial value.
  • A constant can be declared and defined in one shot as a static (i.e. file-scope) const. No need for extern and separate definition.
  • confusing — if you (use a shared header to) declare the same variable “static” in 2 *.cpp files, then each file gets a distinct file-scope mutable variable of the same name. Nonsense. One page says “Note that not specifying static is the same as specifying extern: the default is external linkage” but I doubt it.

I guess there are 3 forms:

  • static double — file-scope, probably not related to “extern”
  • extern double — global declaration of a var already Defined in a single file somewhere else
  • double (without any modifier) — the single definition of a global var. Will break ODR if in a shared header

Note there’s no special rule about “const int”. The special rule is about const int static FIELD.

//--------- shared.h ---------
void modify();

extern std::string global; //declaration without storage allocation
static const int fileScopeConst1=3; //each implementation file gets a distinct copy of this const object
static double fileScopeMutable=9.8; //each implementation file gets a distinct copy of this mutable object
//double var3=1.23; //storage allocation. breaks compiler due to ODR!

// ------ define.C --------
#include "shared.h"
using namespace std;
string global("defaultValue"); //storage allocation + initialization
int main(){
  cout<<"addr of file scope const is "<<&fileScopeConst1<<std::endl;
  cout<<"addr of global var is "<<&global<<std::endl;
  cout<<"before modify(), global = "<<global<<"; double = "<<fileScopeMutable<<endl;
  modify();
  cout<<"after modify(), global = "<<global<<"; double (my private copy) = "<<fileScopeMutable<<endl;
}
// ------ modify.C --------
#include "shared.h"
void modify(){
  global = "2ndValue";
  fileScopeMutable = 700;
  std::cout<<"in modify(), double (my private copy) = "<<fileScopeMutable<<std::endl;
  std::cout<<"in modify(), addr of file scope const is "<<&fileScopeConst1<<std::endl;
  std::cout<<"in modify(), addr of global var is "<<&global<<std::endl;
}

Single-Threaded^STM #jargon

SingleThreadedMode means each thread operates without regard to other threads, as if there’s no shared mutable data with other threads.

Single-Threaded can mean

  • no other thread is doing this task. We ensure there’s strict control in place. Our thread still needs to synchronize with other threads doing other tasks.
  • there’s only one thread in the process.

condition.signalAll Usually requires locking

Across languages,

  • notify() doesn’t strictly need the lock;
  • wait() always requires the lock.
    • c++ wait() takes the lock as argument
    • old jdk uses the same object as lock and conditionVar
    • jdk5 makes the lock beget the conditionVar

[[DougLea]] P233 points out that pthreads signal() doesn’t require the lock being held

—- original jdk
100% — notify() must be called within a synchronized block, otherwise you get IllegalMonitorStateException.

—- jdk 5 onwards
99% — the javadoc says that “An implementation may (and typically does) require the lock acquired by the notifying thread.”

Not 100% strict.

— compare await():
100% — On the same page, the await() javadoc is 100% strict on this requirement!

—- c++ std::condition_variable
0% — the notify_all() method does NOT require holding the lock.

— compare wait():
100% — wait(myLock) requires the lock as an argument!

100% — Similar to old JDK, the notifying thread must hold the lock before notifying.


Y more threads !! help`throughput if I/O bound

To keep things more concrete. You can think of the output interface in the I/O.

The paradox — given an I/O bound busy server, the conventional wisdom says more thread could increase CPU utilization [1]. However, the work queue for CPU gets quickly /drained/, whereas the I/O queue is constantly full, as the I/O subsystem is working at full capacity.

Can we find threads doing CPU-only-zero-I/O tasks?

[1] In a CPU-bound server, adding 20 threads will likely create 20 starved new threads!

The holy grail is simultaneous saturation of CPU and I/O. Suggestion: “steal” a cpu core from this engine and use it for unrelated tasks. Additional threads basically achieve that purpose. In other words, the cpu cores aren’t strictly dedicated to this engine.

If the CPU cores are dedicated, then there’s no way to improve throughput without adding more I/O capacity. At a high level, I see too much CPU /overcapacity/.

Assumption — adding more I/O hardware is not possible. (Instead, scaling out to more nodes could help.)

relative funding advantage paradox by Jeff

(Adapted from Jeff’s lecture notes. [[hull]] P156 example is similar.)
Primary-market financing available to borrowers AA and BB:

                     AA                            BB
  fixed rate         7%  <= AA’s real advantage    7.5%
  floating rate      Libor + 1%                    Libor + 1.24%
  needs to borrow    floating                      fixed
Note BB has lower credit rating and therefore higher fixed/floating interest costs. AA’s real, bigger advantage is “fixed”, BUT prefers floating. This mismatch is the key and presents a golden opportunity.
Paradoxically, regardless of L being 5% or 10% or whatever, AA and BB can both save cost by entering an IRS.
To make things concrete, suppose each needs to borrow $100K for 12M. AA prefers to break it into 4 x 3M floating-rate loans. We forecast L in the near future around 6 ~ 6.5%.
— The strategy –
BB to pay 6.15% to, and receive Libor from, AA. So in this IRS, AA is floating payer.
Meanwhile, AA to borrow from Market fixed 7% (i.e. $7k interest) <= AA's advantage
Meanwhile, BB to borrow from the market at L + 1.24% (i.e. $L+1.24K interest) <= BB's comparative advantage
To see the rationale, it’s more natural to add up the net INflow —
AA: -L+6.15  -7 = -L-0.85. This is a saving of 15bps
BB:   L -6.15  -L-1.24 = -7.39. This is a saving of 11bps
Net net, AA pays floating (L+0.85%) and BB pays fixed (7.39%) as desired.
Notice that in both markets AA enjoys preferential treatment, but the 2 “gaps” differ: 50 bps (fixed) vs 24 bps (floating). AA’s and BB’s combined savings = 26 bps, exactly the difference between the gaps. This 26 bps combined saving is shared between AA and BB.
Fake [1] modified example:

                     AA                                  BB
  fixed rate         7%                                  7.5%
  floating rate      Libor + 1%  <= AA’s real advantage  Libor + 1.74%
  needs to borrow    fixed                               floating
— The strategy –
AA to pay 5.85% to and receive Libor from BB.
Meanwhile, BB  to borrow fixed 7.5% 
Meanwhile, AA to borrow L + 1% <= AA's advantage
Net inflow:
AA:  L -5.85 -L-1 = -6.85, saving 15 bps
BB: -L+5.85-7.5 = -L-1.65, saving 9 bps

[1] [[Hull]] P156 points out that the credit spread (AA – BB) reflects more in the fixed rate than the floating rate, so usually, AA’s advantage is in fixed. Therefore this modified example is fake.

The pattern? Re-frame the funding challenge — “2 companies with different funding needs gang up to borrow $100K fixed and $100K floating in total, but only one half of it uses AA’s preferential rates. The other half must use BB’s inferior rates.”
In the 2nd example, since AA’s advantage lies _more_ in floating market, AA’s floating rate is utilized. BB’s smaller disadvantage in fixed is accepted.
It matters less who prefers fixed, since it’s “internal” between AA and BB, like 2 sisters. In this case, since AA prefers something (fixed) other than its real advantage (float), AA swaps them “in the family”. If AA were to prefer floating i.e. matching her real advantage, then no swap would be needed.
Q: Why does AA need BB?
A: only if AA needs something other than its real advantage. Without BB, AA must borrow at its lower advantage (in “fixed” rate market), wasting its real advantage in floating market.

python ‘global myVar’ needed where@@

Suppose you have a global variable var1 and you need to “use” it in a function f1()

Note Global basically means module-level. There’s nothing more global than that in python.

Rule 1a: to read any global variable in any function, you don’t need “global”. I think the LEGB scope-lookup rule applies.

Rule 1b: to call a mutator method on a global object, you don’t need “global”. Such an object can be a dict or a list or your custom object. Strings and integers have no mutators!

Rule 2: to rebind the variable to another object in memory (i.e. pointer reseat), you need “global” declaration to avoid compilation error.

Note you don’t need to declare the variable outside all functions. Just declare it global inside the rebinding functions

import package5 #(python)when package5 = directory

Python import statement has many trivial or non-trivial scenarios. Here are some common ones:

Common scenario 1: we “import modu2” to execute it; it also exposes all the “public” objects and functions in modu2 as modu2.something

(Simplest scenario but still not known to some programmers.)

Common scenario 2: if pack3 is a directory containing some python modules, we can
“import pack3.modu5” or “from pack3 import modu5”. Easy to understand.

Here’s a relative unusual scenario:

./pip install --verbose --no-deps --index-url=http://ficlonapd05.macbank:9090/api/ pymodels-linux== # assuming is the pymodels version to test

>>> import pymodels
>>> pymodels.__file__
>>> pymodels.BlackImplyVol(.2, 3,3, 1, 0, 1,1)

In this case, there’s a pymodels directory, but BlackImplyVol is a function in the compiled library ./pymodels/ How is this function accessible?

A: when “import pymodels” is run, the ./pymodels/ script executes, which does “from pymodels_imp import *”

struct packing^memory alignment %%experiment

Sound byte — packing is for structs, but alignment and padding are Not only for structs.

Longer version — alignment and padding apply to ANY object placement, whether that object is a field of a struct or an independent object on the stack. In contrast, the packing technique is specific to a struct having multiple fields. My test code reveals the rules:

  1. For a (8-byte) long long object on stack (or in a struct), the address is 8-byte aligned. So padding is added by compiler, unless you say “packed” on a struct.
  2. For a (4-byte) int object on stack (or in a struct), the address is 4-byte aligned.
  3. For a (2-byte) short object on stack (or in a struct), the address is 2-byte aligned.
  4. For a char object on stack (or in a struct), the address is 1-byte aligned. All memory addresses are 1-byte aligned, so the compiler adds no padding. Wug’s answer echoes Ashish’s comment that tiny fields like char should be grouped together, due to Rule #4. This applies to stack layout as well [1]. However, compiler optimization can be subversive:

Not only can the compiler reorder the stack layout of the local variables, it can assign them to registers, assign them to live sometimes in registers and sometimes on the stack, it can assign two locals to the same slot in memory (if their live ranges do not overlap) and it can even completely eliminate variables.

[1] See

c++atomic^volatile #fencing

See other posts about volatile, in both c++ and java.

For a “volatile” variable, the c++ compiler (unlike java’s) is not required to introduce a memory fence, i.e. updates by one thread are not guaranteed to be visible globally. See Eric Lippert’s writings on volatile.

based on [[effModernC++]]:

                                      atomic             volatile
optimization: reorder                 prevented          permitted 😦
atomicity of RMW operations           enforced           not enforced
optimization: redundant-access skip   permitted          prevented
cross-thr visibility of update        probably enforced  no guarantee [1]
memory fencing                        enforced           no guarantee


marginal probability density: clarified #with equations

(Equations were created in Outlook then sent to WordPress by HTML email. )

My starting point is a joint density function f(x,y) plotted as a surface. Look at the cross section at X=7.02. This is a 2D area, so its volume (i.e. probability mass) is zero, not just close to zero. Hard to work with. In order to work with a proper probability mass, I prefer a very thin but 3D “sheet”, formed by cutting again at X=7.02001 i.e. 7.02 + deltaX. The prob mass in this sheet divided by deltaX is a number. I think it’s the marginal density value at X=7.02.

The standard formula for the marginal density function:   f_X(x) = ∫ f(x,y) dy   …… [1]

How is this formula reconciled with our “sheet”? I prefer to start from our sheet, since I don’t like to deal with zero probability mass. Sheet mass divided by the thickness i.e. deltaX:

    [ ∫ from x=7.02 to 7.02+deltaX of ( ∫ f(x,y) dy ) dx ] / deltaX

Since f(x,y) is assumed not to change with x across this thin sheet, this expression simplifies to

    ∫ f(7.02, y) dy

Now it is the same as formula [1]. The advantage of my “sheet” way is that the numerator is always a sensible probability mass. The integral in the standard formula [1] doesn’t look like a probability mass to me, since the sheet has zero width.

The simplest and most visual bivariate illustration of marginal density — throwing a dart on a map of Singapore drawn on an x:y grid. Joint density is a constant (you can easily work out its value). You could immediately tell that marginal density at X=7.02 is proportional to the island’s width at X=7.02. Formula [1] tells us the same thing: marginal density = (the constant joint density) × (the island’s width at X=7.02).
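The sheet idea can be checked numerically. A sketch using a simple triangle in place of the Singapore map, since there the marginal density is known in closed form, f_X(x) = 2x:

```python
import random

# Uniform joint density on the triangle 0 <= y <= x <= 1.
# Width of the shape at x is x, so marginal density f_X(x) = 2x.
random.seed(1)
n, dx, x0 = 200000, 0.02, 0.5
inside = hits = 0
for _ in range(n):
    x, y = random.random(), random.random()
    if y <= x:                  # dart landed inside the triangle
        inside += 1
        if x0 <= x < x0 + dx:   # ... and inside the thin "sheet" at x0
            hits += 1
est = (hits / inside) / dx      # sheet mass divided by thickness
print(est)                      # near f_X(0.5) = 1.0
```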

probability density = always prob mass per unit space

It’s worthwhile to get an intuitive feel for the choice of words in this jargon.

* With discrete probabilities, there’s the concept of “probability mass function”
* With continuous probability space, the corresponding concept is “density function”.

Density is defined as mass per unit space.

For a 1D probability space, the unit space is length. Example – width of a nose is a RV with a continuous distro. Mean = 2.51cm, so the probability density at this width is probably highest…

For a 2D probability space, the unit space is an area. Example – width of a nose and temperature inside are two RV, forming a bivariate distro. You can plot the density function as a dome. Total volume = 1.0 by definition. Density at (x=2.51cm, y=36.01 C) is the height of the dome at that point.

The concentration of “mass” at this location is twice the concentration at another location like (x=2.4cm, y=36 C).

2 reasons: BM is poor model for bond price

Reason 1 — terminal value is known. It’s more than deterministic. It’s exactly $100 at maturity. Brownian Motion doesn’t match that.

Reason 2 — the drift estimate is too hard and too sensitive. A BM process has a drift parameter. You can be very careful and thorough estimating it, but any minor change in the drift estimate results in very large differences in the price evolution, if the bond’s lifespan is longer than 10Y.


central limit theorem – clarified

background – I was taught CLT multiple times but still unsure about important details..

discrete — The original RV can have any distro, but many illustrations pick a discrete RV, like a Poisson RV or binomial RV. I think to some students a continuous RV can be less confusing.

average — of N iid realizations/observations of this RV is the estimate [1]. I will avoid the word “mean” as it’s paradoxically ambiguous. Now this average is the average of N numbers, like 5 or 50 or whatever.

large group — N needs to be sufficiently large, esp. if the original RV’s distro is highly asymmetrical. This bit is abstract, but lies at the heart of CLT. For the original distro, you may want to avoid some extremely asymmetrical ones but start with something like a uniform distro or a pyramid distro. We will realize that regardless of the original distro, as N increases our “estimate” becomes a Gaussian RV.

[1] estimate is a sample mean, and an estimate of the population mean. Also, the estimate is a RV too.

finite population (for the original distro) — is a common confusion. In the “better” illustrations, the population is unlimited, like NY’s temperature. In a confusing context, the total population is finite and perhaps small, such as dice or the birth hour of all my classmates. I think in such a context, the population mean is actually some definite but yet-unknown number to be estimated using a subset of the population.

log-normal — an extremely useful extension of CLT says “the product of N iid positive random variables is another RV, with an approximately LN distro”. Just look at the log of the product. This log, denoted L, is the sum of N iid random variables (the individual logs), so by CLT L is approximately Gaussian, and the product is log-normal.
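The core CLT claim can be sketched in a few lines: the average of N iid U(0,1) draws is itself a RV, approximately Gaussian with mean 0.5 and stdev sqrt(1/12)/sqrt(N):

```python
import random
import statistics

random.seed(42)

def sample_average(n):
    """One realization of the 'estimate': average of n iid observations."""
    return sum(random.random() for _ in range(n)) / n

N = 50
averages = [sample_average(N) for _ in range(10000)]
print(statistics.mean(averages))   # near the population mean 0.5
print(statistics.stdev(averages))  # near (1/12)**0.5 / N**0.5, about 0.041
```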

python RW global var hosted in a module

Context: a module defines a top-level global var VAR1, to be modified by my script. Reading it is relatively easy:

from mod3 import *
print VAR1

Writing is a bit tricky. I’m still looking for best practices.

Solution 1: mod3 to expose a setter setVAR1(value)

Solution 2:
import mod3
mod3.VAR1 = 'new_value'

Note “from mod3 import * ” doesn’t propagate the new value back to the module. See example below.

#!/usr/bin/python -u
# my script:
from mod3 import *

def main():
  ''' Line below is required to propagate new value back to mod3
      Also note the semicolon -- to put two statements on one line '''
  import mod3; mod3.VAR1 = 'new value'

# mod3.py:
VAR1='initial value'
def mod3func():
  print 'VAR1 =', VAR1
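A Python 3 sketch of the same pitfall, simulating mod3 in memory so it runs standalone: “from mod3 import …” copies the name, while “import mod3” shares the module object:

```python
import sys
import types

# Build a throwaway module "mod3" holding a global VAR1.
mod3 = types.ModuleType('mod3')
mod3.VAR1 = 'initial value'
sys.modules['mod3'] = mod3

from mod3 import VAR1   # copies the current value into a local name
import mod3 as m        # binds the shared module object itself

m.VAR1 = 'new value'    # Solution 2: rebind via the module object
print(VAR1)             # still 'initial value' -- the copied name is stale
print(m.VAR1)           # 'new value'
```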

delegate: Yamaha makes motorbikes + pianos

label: lambda/delegate
when we say “delegate” we should, but usually don’t, make an explicit distinction between a delegate typename and a delegate instance.
Also, we seldom make an explicit distinction between the multicast and unicast delegate usages, but these 2 have too little overlap to be discussed as the same concept.
Like Yamaha — one brand name covering motorbikes and pianos — “delegate” is one word covering two largely unrelated things.

statistical independence ≠> no causal influence

When I was young, I ate twice as much rice as noodles; Now I still do. So the ratio of rice and noodle intake is independent of my age. This independence doesn’t imply that my age has no influence on the ratio. It only appears to have no influence. More generally, 2 RV can each be controlled by a family of “driver random variables”, yet the 2 RV can still be independent!

Note the mathematical test of independence is usually based on covariance over a stream of paired data points (strictly, zero covariance is necessary but not sufficient for independence). I would say the mathematical meaning of independence is fundamentally different from everyday English, so intuition often gets in the way.

Special context — time series. We record 2 processes X(t) and Y(t). Both could be influenced by several (perhaps shared) factors. In this context, the layman’s independence is somewhat close to the mathematical definition.
* historical data – we could analyze the paired data points and compute a covariance. We could conclude they are independent, based on the data. We aren’t sure if there’s some common factor that could later give rise to a strong covariance
* future – we are actually more interested in the future. We often rely on historical data


abc pairwise independence ≠> Pa*Pb*Pc

Q: If we know events A, B, C are pairwise independent, does Pa*Pb*Pc mean anything?

Q: does Pa*Pb*Pc equal P(A & B & C)?
A: no. The multivariate normal model would imply exactly that, but this is just one possibility, not guaranteed.

Jon Frye gave an excellent example to show that P(A & B & C) can have any value ranging from 0% to minimum(Pa, Pb, Pc, Pab, Pac, Pbc). Suppose Pa = Pb = Pc = 10%. Pairwise independence means Pab = 1%. However, P(abc) can be anywhere from 0% to 1%; at the 1% extreme, whenever A and B both happen, C also happens.

School Prom illustration – each student decides whether to go, regardless of any other individual student, but if everyone else goes, then your decision is swayed.
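The classic two-coin counterexample makes the same point by exhaustive enumeration: A = first coin heads, B = second coin heads, C = both coins show the same face. A, B, C are pairwise independent, yet P(A&B&C) ≠ Pa*Pb*Pc:

```python
from itertools import product

outcomes = list(product('HT', repeat=2))        # 4 equally likely outcomes
A = {o for o in outcomes if o[0] == 'H'}
B = {o for o in outcomes if o[1] == 'H'}
C = {o for o in outcomes if o[0] == o[1]}

def p(event):
    return len(event) / len(outcomes)

assert p(A) == p(B) == p(C) == 0.5
# pairwise independent:
assert p(A & B) == p(A) * p(B) == 0.25
assert p(A & C) == p(A) * p(C) == 0.25
assert p(B & C) == p(B) * p(C) == 0.25
# but NOT mutually independent: P(A&B&C) = 0.25, while Pa*Pb*Pc = 0.125
assert p(A & B & C) == 0.25
```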

real J4 embracing U.S.: ez2get jobs till60

ez2get jobs; abundance of jobs — At the risk of oversimplifying things, I would single out this one as the #1 fundamental justification of my bold and controversial move to US.

There are many ways to /dice-n-slice/ this justification:
* It gives me confidence that I can support my kids for many years.
* I don’t worry so much about aging as a techie
* I don’t worry so much about outsourcing and a shrinking job pool
* when I don’t feel extremely happy on a job, I don’t need to feel trapped like I do in a Singapore job.
* I feel like an “attractive girl”, not someone desperately seeking on “a men’s market”.

Look at Genn. What if I really plan to stay in one company for 10 years? I guess after 10 years I may still face problems changing jobs in a market like Singapore.

fwd contract often has negative value, briefly

An option “paper” is a right but not an obligation, so this paper is always worth a non-negative value.

Even if the option holder forgets about it, she would simply get auto-exercised or receive the cash-settlement income. No one would come after her.

In contrast, an obligation requires you to fulfill your duty.

A fwd contract to buy some asset (say oil) is an obligation, so the pre-maturity value can be negative or positive. Example – a contract to “buy oil at $3333” but now the price is below $50. Who wants this obligation? This paper is a liability not an asset, so its value is negative.

0 probability ^ 0 density, 2nd look #cut rope

See the other post on 0 probability vs 0 density.

Eg: Suppose I ask you to cut a 5-meter-long string by throwing a knife. What’s the distribution of the longer piece’s length? There is a density function f(x). Bell-shaped, since most people will aim at the center.

f(x) = 0 for x > 5 i.e. zero density.

For the same reason, Pr(X > 5) = 0 i.e. no uncertainty, 100% guaranteed.

Here’s my idea of probability density at x=4.98. If a computer simulates 100 trillion trials, will there be some hits within the neighborhood around x=4.98 ? Very small but positive density. In contrast, the chance of hitting x=5.1 is zero no matter how many times we try.

By the way, due to the definition of density function, f(4.98) > 0 but Pr(X=4.98) = 0, because the range around 4.98 has zero width.

is RVR Only used as function parameter@@

RVR – almost exclusively used as function parameter

There could be tutorials showing other usages, like a regular variable, but I don’t see any reason for them. I only understand the motivation for
– move-ctor/move-assignment and
– insertions like push_back()

When there’s a function (usually a big4) taking a RVR param, there’s usually a “traditional” overload without RVR, so compiler can choose based on the argument:
* if arg is an rval expression then choose the RVR version [1]
* otherwise, default to the traditional

[1] std::move(myVar) would cast myVar into an rval expression. This is a kind of adapter between the 2 overloads.

In some cases, there exists only the RVR version, perhaps because copy is prohibited… like in a class holding a FILE pointer?

mv-semantics: the players

Q: resource means?
A: Don’t talk about resources. Talk about heapy thingy.
A: Some expensive data item to acquire. It requires allocating from some resource manager such as a memory allocator. Each class instance holds its own resource, not sharable.

Q: What’s the thing that’s moved? not the class instance, but part of the class instance. A “resource” used by the instance, via a pointer field.

Q: Who moves it? The compiler selects the mv-ctor or mv-assignment, only if all conditions are satisfied; the actual pointer transfer then happens at run time inside that function. See post on use-case.

Q: What’s the mv-ctor (or mv-assignment) for? The enabler. It alone isn’t enough to make the move happen.

Q: What does the RVR point to? It points to an object that the programmer has marked for imminent destruction.

returning RVR #Josuttis

My rule of thumb is to avoid a RVR return type, even though Josuttis stops short of forbidding it; he never says anything like “the return type should never be RVR”.

Instead of an rvr return type, I feel in most practical cases, we can achieve the same result using a nonref return type. I think such a function call usually evaluates to a nameless temp object i.e a naturally-occurring rvalue object.

[[Josuttis]] (i.e. the c++Standard library) P22 explains the rules about returning rval ref.

In particular, it’s a bad idea to return a newly-created stack object by rvr. This object is a nonstatic local and will be wiped out after the function returns.

(It’s equally bad to return this object by l-value reference.)


equivalent FX(+option) trades, succinctly

The equivalence among FX trades can be confusing to some. I feel there are only 2 common scenarios:

1) Buying usdjpy is equivalent to selling jpyusd.
2) Buying usdjpy call is equivalent to Buying jpyusd put.

However, Buying a fx option is never equivalent to Selling an fx option. The seller wants (implied) vol to drop, whereas the buyer wants it to increase.

factors affecting bond sensitivity to IR

In this context, we are concerned with the current market value (eg a $9bn bond) and how this holding may devalue due to Rising interest rate for that particular maturity.

* lower (zero) coupon bonds are more sensitive. More of the cash flow occurs in the distant future, therefore subject to more discounting.

* longer bonds are more sensitive. More of the cashflow is “pushed” to the distant future.

* lower yield bonds are more sensitive. On the Price/yield curve, at the left side, the curve is steeper.

(I read the above on a slide by Investment Analytics.)
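The first two bullet points can be illustrated with a toy PV calculation. A sketch with assumed numbers (flat 5% yield, 1% bump, annual coupons, face 100):

```python
def price(coupon_rate, years, y, face=100.0):
    """PV of annual coupons plus redemption, discounted at flat yield y."""
    cpn = face * coupon_rate
    pv_coupons = sum(cpn / (1 + y) ** t for t in range(1, years + 1))
    return pv_coupons + face / (1 + y) ** years

def pct_drop(coupon_rate, years, y=0.05, bump=0.01):
    """Percentage MV loss when the yield rises by `bump`."""
    p0 = price(coupon_rate, years, y)
    p1 = price(coupon_rate, years, y + bump)
    return (p0 - p1) / p0

print(pct_drop(0.00, 10), pct_drop(0.08, 10))  # zero coupon drops more
print(pct_drop(0.00, 30), pct_drop(0.00, 10))  # longer bond drops more
```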

Note if we hold the bond to maturity, then the dollar value received on maturity is completely deterministic i.e. known in advance, so why worry about “sensitivity”? There are 3 issues with this strategy:

1) if in the interim my bond’s MV drops badly, then this asset offers poor liquidity. I won’t have the flexibility to get contingency cash out of this asset.

1b) Let’s ignore credit risk in the bond itself. If this is a huge position (like $9bn) in the portfolio of a big organization (even for a sovereign fund), a MV drop could threaten the organization’s balance sheet, credit rating and borrowing cost. Put yourself in the shoes of a creditor. Fundamentally, the market and the creditors need to be assured that this borrower could safely liquidity part of this bond asset to meet contingent obligations.

Imagine Citi is a creditor to MTA, and MTA holds a bond. Fundamental risk to the creditor (Citi) — the borrower (MTA)  i.e. the bond holder could become insolvent before bond maturity, when the bond price recovers.

2) over a long horizon like 30Y, that fixed dollar amount may suffer unexpected inflation (devaluation). I feel this issue tends to affect any long-horizon investment.

3) if in the near future interest rises sharply (hurting my MV), that means there are better ways to invest my $9bn.

beta definition in CAPM – confusion cleared

In CAPM, beta (of a stock like ibm) is defined as
* cov(ibm excess return, mkt excess return), divided by
* variance of mkt excess return

I was under the impression that variance is the measured “dispersion” among the recent 60 monthly returns over 5 years (or another period). Such a calculation would yield a beta value heavily influenced, or “skewed”, by the last 5Y’s performance. Another observation window would likely give a very different beta value. This beta is based on unstable input data, yet we treat it as a constant and use it to predict the ratio of ibm’s return over the index return! Suppose we are lucky and the last 12M gives beta=1.3, the last 5Y yields the same, and year 2000 also yields the same. We could still be unlucky in the next 12M, and this beta could fail completely to predict that ratio… Wrong thought!

One of the roots of the confusion is the 2 views of variance, esp. with time-series data.

A) the “statistical variance”, or sample variance. Basically 60 consecutive observations over 5 years. If these 60 numbers come from drastically different periods, then the sample variance won’t represent the population.

B) the “probability variance”, “theoretical variance”, or population variance, assuming the population has a fixed variance. This is abstract. Suppose ibm stock price is influenced mostly by temperature (or another factor not influenced by human behavior), so the inherent variance in the “system” is time-invariant. Note the distribution of daily return can be completely non-normal — could be binomial, uniform etc — but variance should be fixed, or at least stable. I feel population variance can change with time, but should be fairly stable during the observation window — slow-changing.

My interpretation of the beta definition is based on an unstable, fast-changing variance. In contrast, CAPM theory is based on a fixed or slow-moving population variance — the probability context. Basically the IID assumption. CAPM assumes we could estimate the population variance from history and this variance value will be valid in the foreseeable future.

In practice, practitioners (usually?) use a historical sample to estimate the population variance/cov. This is basically the statistical context A)

Imagine the inherent population variance changes as frequently as stock price itself. It would be futile to even estimate the population variance. In most time-series contexts, most models assume some stability in the population variance.

physical measure is impractical

Update: Now I think physical probability is neither observable nor quantifiable, and utterly unusable in the math, including the numerical methods. In contrast, RN probabilities can be derived from observed prices.

Therefore, now I feel physical measure is completely irrelevant to option math.

RN measure is the “first” practical measure for derivative pricing. Most theories/models are formulated in RN measure. T-Forward measure and stock numeraire are convenient when using these models…

Physical measure is an impractical measure for pricing. Physical measure is personal feeling, not related to any market prices. Physical measure is mentioned only for teaching purpose. There’s no “market data” on physical measure.

Market prices reflect RN (not physical) probabilities.

Consider cash-or-nothing bet that pays $100 iff team A wins a playoff. The bet is selling for $30, so the RN Pr(win) = 30%. I am an insider and I rig the game so physical Pr() = 80% and Meimei (my daughter) may feel it’s 50-50 but these personal opinions are irrelevant for pricing any derivative.

Instead, we use the option price $30 to back out the RN probabilities. Namely, Calibrate the pricing curves using liquid options, then use the RN probabilities to price less liquid derivatives.

Professor Yuri is the first to point out (during my oral exam!) that option prices are the input, not the output to such pricing systems.
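The back-out step on this toy bet takes only a few lines. A sketch, ignoring discounting over the short horizon:

```python
payoff, price = 100.0, 30.0        # cash-or-nothing bet from the example
rn_prob_win = price / payoff       # RN Pr(win) implied by the market quote
assert rn_prob_win == 0.30

# Physical opinions (my rigged 80%, Meimei's 50%) never enter the calculation.
# To price another derivative on the same event, reuse the 30%:
second_payoff = 50.0               # hypothetical bet paying $50 iff team A wins
print(rn_prob_win * second_payoff) # fair price, about $15
```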

drift ^ growth rate – are imprecise

The drift rate “j” is defined for BM, not GBM:
                dA_t = j dt + (dW term)
Now, for GBM:
                dX_t = r X_t dt + (dW term)
So the drift rate, by definition, is r X_t. Therefore, it’s confusing to say “same drift as the riskfree rate”. Safer to say “same growth rate” or “same expected return”.

so-called tradable asset – disillusioned

The touted feature of a “tradable” doesn’t impress me. Now I feel this feature is useful IMHO only for option pricing theory. All traded assets are supposed to follow a GBM (under RN measure) with the same growth rate as the MMA, but I’m unsure about most of the “traded assets” such as —
– IR futures contracts
– weather contracts
– range notes
– a deep OTM option contract? I can’t imagine any “growth” in this asset
– a premium bond close to maturity? Price must drop to par, right? How can it grow?
– a swap I traded at the wrong time, so its value is decreasing to deeply negative territories? How can this asset grow?

My MSFM classmates confirmed that any dividend-paying stock is disqualified as “traded asset”. There must be no cash coming in or out of the security! It’s such a contrived, artificial and theoretical concept! Other non-qualifiers:

eg: spot rate
eg: price of a dividend-paying stock – violates the self-financing criterion.
eg: interest rates
eg: swap rate
eg: future contract’s price?
eg: coupon-paying bond’s price

Black’s model isn’t interest rate model #briefly

My professors emphasized repeatedly
* the first-generation IR models are the one-factor models, not the Black model.
* Black model initially covered commodity futures
* However, IR traders adopted Black’s __formula__ to price the 3 most common IR options
** bond options (bond price @ expiry is LN)
** caps (libor rate @ expiry is LN)
** swaptions (swap rate @ expiry is LN)
** However, it’s illogical to assume the bond price, libor rate, and swap rate on the contract expiry date (three N@FT) ALL follow LogNormal distributions.

* Black model is unable to model the term structure. I think it doesn’t eliminate arbitrage. I would say that a proper IR model (like HJM) must describe the evolution of the entire yield curve with N points on the curve. N can be 20 or infinite…