3 ways to expire cached items

server-push update ^ TTL ^ conditional-GET # write-through is not cache expiration

Few online articles list these solutions explicitly. Some are simple concepts but fundamental to DB tuning and app tuning. https://docs.oracle.com/cd/E15357_01/coh.360/e15723/cache_rtwtwbra.htm#COHDG198 compares write-through ^ write-behind ^ refresh-ahead. I think refresh-ahead is similar to TTL.

B) cache-invalidation — some “events” would trigger an invalidation. Without invalidation, a cache item would live forever with an infinite TTL, like the list of China provinces.

After cache proxies get the invalidation message in a small payload (bandwidth-friendly), the proxies discard the outdated item, and can decide when to request an update. The request may be skipped completely if the item is no longer needed.

B2) cache-update by server push — IFF bandwidth is available, server can send not only a tiny invalidation message, but also the new cache content.

IFF combined with TTL, or with reliability added, then multicast can be used to deliver cache updates, as explained in my other blogposts.

T) TTL — more common. Each “cache item” embeds a time-to-live data field a.k.a. expiry timestamp. The HTTP cookie is the prime example.

In Coherence, it’s possible for the cache proxy to pre-emptively request an update on an expired item. This would reduce latency but requires a multi-threaded cache proxy.
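
Here is a minimal sketch of the TTL idea (my own illustration; the names are made up, not from Coherence or any library):

#include <chrono>
#include <string>

struct CacheItem {
  std::string payload;
  std::chrono::steady_clock::time_point expiry; // embedded TTL, stored as an expiry timestamp

  bool isExpired() const {
    return std::chrono::steady_clock::now() >= expiry;
  }
};
// On every read the cache proxy checks the embedded expiry; if isExpired(),
// it discards the item and (possibly pre-emptively) requests an update.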

G) conditional-GET in HTTP is a proven industrial-strength solution described in my 2005 book [[computer networking]]. The cache proxy always sends a GET to the database but with an If-Modified-Since header. This reduces unnecessary database load and network load.

W) write-behind (asynchronous) or write-through — in some contexts, the cache proxy handles not only Reads but also Writes. So the Read requests will read or add to cache, and Write requests will update both the cache proxy and the master data store. Drawback — in a distributed topology, updates from other sources are not visible to “me” the cache proxy, so I still rely on one of the other 3 means.

| scenario | TTL | eager server-push | conditional-GET |
|---|---|---|---|
| if frequent query, infrequent updates | efficient | efficient | frequent but tiny requests between DB and cache proxy |
| if latency important | OK | lowest latency | slower lazy fetch, though efficient |
| if infrequent query | good | wastes DB/proxy/NW resources as “push” is unnecessary | efficient on DB/proxy/NW |
| if frequent update | unsuitable | high load on DB/proxy/NW | efficient conflation |
| if frequent update+query | unsuitable | can be wasteful | perhaps most efficient |

 


key^payload: realistic treeNode #hashset

I think only SEARCH trees have keys. “Search” means key search. In a search tree (and hashset), each node carries a (usually unique) key + 0 or more data fields as payload.

Insight — Conceptually the key is not part of the payload. Payload implies “additional information”. In contrast, key is part of the search tree (and hashset) infrastructure, similar to auto-increment record IDs in database. In many systems, the key also has natural meaning, unlike auto-incremented IDs.

Insight — If a tree has no payload field, then useful info can only exist in the key. This is fairly common.

For a treemap or hashmap, the key/value pair can be seen as “key+payload”. My re_hash_table_LinearProbing.py implementation saves a key/value pair in each bucket.
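
A bare-bones illustration (my own sketch, not from any library):

struct Payload;                  // some user data type, defined elsewhere

struct SearchTreeNode {          // e.g. a BST node
  int key;                       // part of the search infrastructure, like an auto-increment record ID
  Payload * payload;             // 0 or more data fields
  SearchTreeNode *left, *right;
};

struct GraphNode {               // a non-search graph node: no key
  Payload * payload;             // or simply a bool
  GraphNode * next[4];           // selected by iteration/navigation rules, not key search
};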

A useful graph (including tree, linked list, but not hashset) can have payloads but no search keys. The graph nodes can hold pointers to payload objects on heap [1]. Or the graph nodes can hold Boolean values. Graph construction can be quite flexible and arbitrary. So how do you select a graph node? Just follow some iteration/navigation rules.

Aha — similar situation: linked list has a well-defined iterator but no search key. Ditto vector.

[1] With pre-allocation, a static array substitutes for the heap.

Insight — BST, priorityQ, and hash tables need search key in each item.

I think non-search-TREES are rarely useful. They mostly show up in contrived interview questions. You can run BFT, pre|post-order DFT, but not BFS/DFS, since there’s nothing to “search”.

Insight — You may say search by payload, but some trees have Boolean payloads.

std::move(): robbed object still usable !

Conventional wisdom says after std::move(obj2), this object is robbed and invalid for any operation…. Well, not quite!

https://en.cppreference.com/w/cpp/utility/move specifies exactly what operations are invalid. To my surprise, a few common operations are still valid, such as clear() and operator=().

The way I read it — if (but not iFF) an operation wipes out the object content regardless of current content, then this operation is valid on a robbed/hollowed object like our obj2.

Crucially, any robbed object should never hold a pointer to any “resource”, since that resource is now used by the robber object. Most movable data types hold such pointers. The classic implementation (and the only implementation I know) is by pointer reseat.
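
A quick illustration with std::string (a movable type holding a heap pointer):

#include <cassert>
#include <string>
#include <utility>

int main(){
  std::string obj2 = "hello world, long enough to live on the heap";
  std::string robber = std::move(obj2); // pointer reseat: robber now owns the resource

  // obj2 is hollowed but not dead. Operations that wipe out the content
  // regardless of current content are still valid on the robbed object:
  obj2.clear();
  obj2 = "reused";            // operator=() is valid too
  assert(obj2 == "reused");
}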

multicast routing: efficient data duplication

The generic  routing algorithm in multicast has optimization features that offer efficient data duplication.

— optimized message duplication at routing layer (Layer 3?)

https://www.metaswitch.com/knowledge-center/reference/what-is-multicast-ip-routing explains that

Routers duplicate data packets and forward multiple copies wherever the path to recipients diverges. Group membership information is used to calculate optimal branch points i.e. the best routers at which to duplicate the packets to optimize the use of the network.

The multicast distribution tree of receiving hosts holds the route to every recipient that has joined the multicast group, and is optimized so that

  • multicast traffic does not reach networks that do not have any such recipients (unless the network is a transit network on the way to other recipients)
  • duplicate copies of packets are kept to a minimum.

5 constructs: c++implicit singletons

#1 most implicit singleton in c++ is the ubiquitous “file-scope variable”. Extremely common in my projects.

  • — The constructs below are less implicit as they all use some explicit keyword to highlight the programmer’s intent
  • keyword “extern” — file-scope variable with extern
    • I seldom need it and don’t feel the need to remember the details.. see other blogposts
  • keyword “static” — file-scope static variables
  • keyword “static” within function body — local static variables — have the nice feature of predictable timing of initialization
  • keyword “static” within a class declaration —  static field

~~~~~~  The above are the 5 implicit singleton constructs ~~~~~~
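
In code (a condensed, one-file sketch; the extern variable would be defined in some other file):

int fileScopeVar = 0;           // 1. the most implicit singleton: plain file-scope variable
extern int sharedAcrossFiles;   // 2. extern: declared here, defined in exactly one other file
static int fileScopePrivate = 0;// 3. file-scope static: private to this file
int & counter(){
  static int localStatic = 0;   // 4. local static: predictable timing of initialization
  return localStatic;
}
struct App {
  static int staticField;       // 5. static field (defined once, outside the class)
};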

Aha — it’s useful to recognize that when a data type is instantiated many many times i.e. non-singleton usage, it is usually part of a collection, or a local (stack) variable.

Sometimes we have the ambiguous situation where we use one of the constructs above, but we instantiate multiple instances of the class. It’s best to document the purpose like “instance1 for …; instance2 for …”

average_case ^ amortized

–> bigO (and big-Theta) is about input SIZE, and is therefore usable on average-case or worst-case data quality

There’s no special notation like bigX for average case.

–> Worst vs average case … is about quality of input data.

With hash tables, we novices tend to blame the “quality of the hash function”, but this is imprecise.  My book [[algorithms in a nutshell]] P93 states that IFF given a uniform hash function, even in the worst case bucket sort runs in linear time O(N).

–> Amortized … is about one expensive step in a long series of steps like inserts. Note Sorting has no amortized analysis AFAIK.

worst case amortized — is a standard analysis
Eg: vector expansion. Worst-case doesn’t bother us.
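
A quick experiment to visualize the amortization (the growth factor is implementation-specific):

#include <iostream>
#include <vector>

int main(){
  std::vector<int> v;
  auto lastCap = v.capacity();
  for (int i = 0; i < 1000; ++i){
    v.push_back(i);                  // occasionally triggers an expensive expansion...
    if (v.capacity() != lastCap){    // ...but geometric growth keeps the
      lastCap = v.capacity();        // average cost per push_back at O(1)
      std::cout << "size " << v.size() << " -> new capacity " << lastCap << '\n';
    }
  }
}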

eg: quicksort+quickSelect can degrade to O(N²). Amortization doesn’t apply.

average case amortized — is a standard analysis
eg: hash table collision. A few operations happen to hit long chain, while most operations hit short chain, so O(1) on average.

Amortization is relevant iFF an insert triggers a hash table expansion.

What if the input is so bad that all hash codes are identical? Then amortization won’t help. For hash tables, people don’t worry about worst case, because a decent, usable hash function is always, always possible.

average case per-operation — can be quite pessimistic
eg: vector insert

worst case per-operation —

oldtimers trapped ] pressure-cooker #Davis

See also G3 survival capabilities #health;burn rate

Q: Why are so many old timers unable to get out of a shitty job in a pressure-cooker company?

  • A small number of old timers change jobs, but some of them regret it as the new job turns out worse. Other old timers hear those war-stories and prefer the certainty of known pain in the current job. They don’t feel confident they can find a better job.
  • bench time — Many old timers can’t afford bench time due to low reserve and high burn rate.
    • Old timers tend to be very unused to unstable income.
    • Damien (Macq) relied on his bonus payout..
    • Some had no savings at all.
  • Most old timers won’t even consider a 6M contract.
  • Most old timers will never consider a voluntary pay cut to get a more comfortable job
  • golden handshake — Some old timers (UBS?) can’t let go of the promised compensation package. They would not resign and give up on a 100k promised windfall
  • some old timers worry about job loss so much that they work their ass off to keep the job, even if the workload is unreasonable/unfair. I think in GS I was like them. I have a friend in a similar situation. I think the boss wants to let him go but may feel merciful, so they keep him but give him shitty tasks and demand fast turnaround at good quality…. unreasonable/unfair workload.
  • Some old timers may have a wife who completely opposes job change due to potential impact on family.

Q: why can XR and I afford to quit a job, stay home, and then accept a lower job?
A: Supportive wife, burn rate, passive income and savings.

[17] string(+vector) reserve() to avoid re-allocation #resize()wrong

See also std::vector capacity reduction and vector internal array: always on heap; cleaned up via RAII

string/vector use expandable arrays, which have to be allocated on heap, not stack.

For vector, it’s very simple —

  • resize() creates dummy elements iFF upsizing
  • reserve() only allocates spare “raw” capacity without creating elements to fill it. See P279 [[c++stdLib]]

[[Optimized C++]] P70 confirms that std::string can get re-allocated when it grows beyond current capacity.

http://stackoverflow.com/questions/9521629/stdstringss-capacity-reserve-resize-functions compares std::string.resize() vs reserve().

  • Backgrounder: Capacity — allocated capacity is often uninitialized memory reserved for this string.
  • Backgrounder: size — length of the string that’s fully initialized and accessible. Some of the characters (nulls) could be invisible.
  • string.resize() can increase the string’s size by filling in space (at the end) with dummy characters (or a user-supplied char). See http://www.cplusplus.com/reference/string/string/resize/
    • After resize(55) the string size is 55. All 55 characters are part of the string proper.
    • changes string’s size
  • string.reserve() doesn’t affect string size. This is a request to increase or shrink the “capacity” of the object. I believe capacity (say 77) is never below the size (say 55). The 77 – 55 = 22 extra slots allocated are uninitialized! They only get populated after you push_back or insert to the string.
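
A minimal demo of the contrast:

#include <cassert>
#include <string>

int main(){
  std::string s;
  s.reserve(77);            // capacity request only; size unchanged
  assert(s.size() == 0 && s.capacity() >= 77);

  s.resize(55);             // 55 dummy chars ('\0' by default) become part of the string proper
  assert(s.size() == 55);

  s.resize(55, 'x');        // no-op here; the fill char applies only when upsizing
}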

vector internal array: always on heap; cleaned up via RAII

Across languages, vector is the #1 most common and useful container. Hashtable might be #2.

I used to feel that a vector (and hashtable) can grow its “backing array” without heap. There’s global memory and stack memory and perhaps more. Now I think that’s naive.

  • Global area is allocated at compile time. The array would be fixed in size.
  • For an array to be allocated on stack, the runtime must know its size. Once it’s allocated on that stack frame, it can’t grow beyond that stack frame.
  • In fact, no array can grow at all, wherever it lives.

The vector grows its backing array by allocating a bigger array and copying the payload. That bigger array can’t be in global area because this type on-demand reallocation is not compile-time.
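
A small experiment to observe the reallocation (data() exposes the hidden heap array):

#include <iostream>
#include <vector>

int main(){
  std::vector<int> v{1, 2, 3};
  const int * before = v.data();      // address of the hidden backing array
  for (int i = 0; i < 100; ++i) v.push_back(i);
  const int * after = v.data();
  std::cout << "array moved? " << std::boolalpha << (before != after) << '\n';
  // almost certainly true: a bigger array was allocated and the payload copied over
}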

Anything on heap is accessed by pointer. Vector owns this hidden pointer. It has to delete the array on heap as in RAII. No choice.

Now the painful part — heap allocation is much slower than stack allocation. Static allocation has no runtime cost at all after program starts.

BST: finding next greater node is O(1) #Kyle

https://stackoverflow.com/questions/11779859/whats-the-time-complexity-of-iterating-through-a-stdset-stdmap shows that in a std::map, in-order iteration over every node, from lowest to highest, is O(N) in total.

Therefore, each step is O(1) amortized. In other words, finding next higher node in such a BST is a constant-time operation.

https://en.cppreference.com/w/cpp/iterator/next confirms that std::next() is O(1) when advancing by a single step. Even better, across all STL containers, iterator movement by N steps is O(N), except for random-access iterators, which are even faster, at O(1).
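
A minimal demo:

#include <cassert>
#include <iterator>
#include <map>

int main(){
  std::map<int, char> bst{{10,'a'}, {20,'b'}, {30,'c'}};
  auto it = bst.find(20);
  auto nextGreater = std::next(it);   // a single step: O(1) amortized
  assert(nextGreater->first == 30);
}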

API^ABI

See also posts on jvm portability

  • — Breaking API change is a decision by a library supplier/maintainer.
  • clients are required to make source code change just to compile against the library.
  • clients are free to use previous compiler
  • — Breaking ABI change is a decision by a compiler supplier.
  • clients are free to keep source code unchanged.
  • clients are often required to recompile all source code including library source code

API is source-level compatibility; ABI is binary-level compatibility. I feel

  1. jar file’s compile-once-run-anywhere is kind of ABI portability
  2. python, perl… offer API portability at source level

c++ABI #best eg@zbs

Mostly based on https://www.oracle.com/technetwork/articles/servers-storage-dev/stablecplusplusabi-333927.html

Imagine a client uses libraries from Vendor AA and Vendor BB, among others. All vendors support the same c++ compiler brand, but new compiler versions keep coming up. In this context, unstable ABI means

  • Recompile-all – client needs libAA and libBB (+application) all compiled using the same compiler version, otherwise the binary files don’t agree on some key details.
  • Linker error – LibAA compiled by version 21 and LibBB compiled under version 21.3 may fail to link
  • Runtime error – if they link, they may not run correctly, if ABI has changed.

Vendor’s solutions:

  1. binary releases — for libAA. Vendor AA needs to keep many old binary versions of libAA, even if compiler version 1.2 has been retired for a long time. Some clients may need that libAA version. Many c++ libraries are distributed this way on vendor websites.
  2. Source distribution – Vendor AA may choose to distribute libAA in source form. Maintenance issues exist in other forms.

In a better world,

  • ABI compatible — between compiler version 5.1 and 5.2. Kevin of Macq told me this does happen to some extent.
  • interchangeable parts — the main application (or libBB) can be upgraded to newer compiler version, without upgrading everything else. Main application could be upgraded to use version 5.2 and still link against legacy libAA compiled by older compiler.

(overloaded function) name mangling algorithm — is the best-known part of c++ABI. Two incompatible compiler versions would use different algorithms so LibAA and LibBB will not link correctly. However, I don’t know how c++filt demangles those names across compilers.
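
For illustration (the mangled names below are what g++/Itanium ABI produces; another compiler or ABI may mangle differently, which is exactly the linking hazard):

// g++ (Itanium ABI) mangles these overloads as _Z4sendi and _Z4sendd
void send(int);
void send(double);

extern "C" void send_c(int);   // "C" linkage: no C++ name mangling,
                               // so the symbol is stable across compilers (and callable from C)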

No ABI between Windows and Linux binaries, but how about gcc vs llvm binaries on Intel linux machine? Possible according to https://en.wikipedia.org/wiki/Application_binary_interface#Complete_ABIs

Java ABI?

 

STL map[k]=.. Always uses assignment@runtime

The focus is on value2, of type Trade.

— someMap[key1] = value2; //uses default-ctor if needed. Then unconditionally invokes Trade::operator=() for assignment! See P344 [[Josuttis]]

Without a default-ctor or operator=, the Trade class probably won’t compile with this code.

— someMap.emplace(key1, value2); //uses some ctor, probably the copy-ctor, or the cv-ctor if value2’s type is not Trade
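
A sketch with a logging Trade class to see which special functions fire (my own illustration):

#include <iostream>
#include <map>

struct Trade {
  Trade(){ std::cout << "default-ctor\n"; }
  Trade(int){ std::cout << "cv-ctor\n"; }
  Trade(Trade const &){ std::cout << "copy-ctor\n"; }
  Trade & operator=(Trade const &){ std::cout << "operator=\n"; return *this; }
};

int main(){
  std::map<int, Trade> m;
  Trade value2;           // prints default-ctor
  m[1] = value2;          // prints default-ctor (for the new node) then operator=
  m.emplace(2, value2);   // prints copy-ctor only; no assignment
  m.emplace(3, 42);       // prints cv-ctor, since 42 is not a Trade
}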

void list.reverse() ^ reversed()→iterator

These three constructs do three different things, without overlap

  • mylist[::-1] returns a reversed clone
    • myStr[::-1] works too
  • mylist.reverse() is a void in-place mutation, same as mylist.sort()
  • reversed(mylist) returns an iterator, not printable but can be used as argument to some other functions
    • I feel this can be convenient in some contexts.
    • reversed(myStr) works too but you can’t print it directly

Similarly,

  • Unlike items(), dict.iteritems() returns an iterator

Different case:

  • sorted(any_iterable) creates a new vector (i.e. “list”). Returning an iterator here would be technically pointless, since sorting must materialize the whole sequence anyway.

function returning (unnamed)rvalue object

The rules are hard to internalize but many interviewers like to zero in on this topic. [[effModernC++]] explains

  • case: if return type is nonref, and in the function the thingy-to-be-returned is a …. rvr parameter [1], then by default, it goes through copy-ctor on return.
    • You should apply explicit return move(thingy)
    • [1] See P172, extremely rare except in interviews
  • case: if return type is nonref and in the function the thingy-to-be-returned is a …. stack (non-static) object behind a local variable (NOT nameless object) .. a very common scenario, then RVO optimization usually happens.
    • Such a function call is always (P175 bottom) seen by the caller as evaluating to a naturally-occurring nameless rvalue object.
    • the book says move() should never be used in this case
  • case: if return type is rvr? avoid it if possible. I don’t see any use case.
  • case: if return type is lvr (not rvr) and returned thingy is a local object, then compilation fails as explained in the blogpost above
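
A condensed sketch of the two common nonref cases (my own, based on the rules above):

#include <string>
#include <utility>

std::string makeGreeting(){             // return type is nonref
  std::string local("hello");           // local stack object behind a variable
  return local;                         // RVO usually elides the copy; never write move() here
}

std::string forward2(std::string && s){ // rvr parameter: the rare case
  return std::move(s);                  // without explicit move(), copy-ctor runs on return
}

int main(){
  std::string a = makeGreeting();       // caller sees a nameless rvalue object
  std::string b = forward2(std::string("tmp"));
}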

sharding:=horizontal partition #too many rows

Assuming an 88-column, 6-mil-row table, “horizontal” means horizontally pushing a knife across all 88 columns, splitting the 6 million rows into two partitions of 3 million each.

Sharding can work on noSQL too.

GS PWM positions table had 9 horizontal partitions for 9 regions.

“Partitioning” is a more generic term and can mean 1) horizontal (sharding) or 2) vertical cutting like normalization.

[19] assignment^rebind in python^c++^java

For a non-primitive, java assignment is always rebinding. Java behavior is well-understood and simple, compared to python.

Compared to python, c++ assignment is actually well-documented .. comparable to a mutator method.

Afaik, python assignment is always rebinding, even for an integer. Integer objects are immutable, reference counted.
In python, if you want two functions to share a single mutable integer variable, you can declare a global myInt.
It would be in the global idic/namespace. q[=] has special meaning like

idic[‘myInt’] =..

Alternatively, you can wrap the int in a singular list and call list mutator methods, without q[=].

See my experiment in github py/88lang and my blogpost on immutable arg-passing

move()doesn’t move..who does@@

(Note in this blogpost, when I say mv-ctor i mean move-ctor and move-assignment.)

As explained in other posts in this blog, move() is a compile-time cast, not performing a physical “steal” at runtime, so

Q: which c++11 construct performs a physical steal?
%%A: mv-ctor

  • move() ⇏ steal — std::move() may not trigger a physical steal — If you call myContainer.insert(std::move(myString)) but your custom String class has no mv-ctor, then copy-ctor is picked by compiler .. see P21 [[Josuttis]].
  • steal ⇏ move() — a physical steal may not require std::move() — if you have a naturally occurring rvalue object.
  • steal => mv-ctor
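
A minimal illustration of “move() ⇏ steal”, using a made-up String class without mv-ctor:

#include <string>
#include <utility>
#include <vector>

struct String {                             // custom String with NO mv-ctor
  std::string s;
  String(char const * p) : s(p) {}
  String(String const & rhs) : s(rhs.s) {}  // copy-ctor is all we have
};

int main(){
  std::vector<String> myContainer;
  String myString("abc");
  myContainer.push_back(std::move(myString)); // move() casts to rvalue, but the
                                              // compiler falls back on copy-ctor: no steal
}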

POSIX semaphores do grow beyond initial value #CSY

My friend Shanyou pointed out a key difference between c++ binary semaphore and a simple mutex — namely ownership. Posix Semaphore (binary or otherwise) has no ownership, because non-owner threads can create permits from thin air, then use them to enter the protected critical section. Analogy — NFL fans creating free tickets to enter the stadium.

https://stackoverflow.com/questions/7478684/how-to-initialise-a-binary-semaphore-in-c is one of the better online discussions.

  • ! to my surprise, if you initialize a posix semaphore to 1 permit, it can be incremented to 2. So it is not a strict binary semaphore.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/thr/binarySem.cpp is my experiment using linux POSIX semaphore. Not sure about system-V semaphore. I now think a posix semaphore is always a multi-value counting semaphore with the current value indicating current permits available.

  • ! Initial value is NOT “max permits” but rather “initial permits”

I feel the restriction by semaphore is just advisory, like advisory file locking. If a program is written to subvert the restriction, it’s easy. Therefore, a “binary semaphore” is binary IFF all threads follow protocol.
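
A minimal sketch of the same surprise (the assert holds with Linux POSIX semaphores, in my understanding):

#include <cassert>
#include <semaphore.h>

int main(){
  sem_t sem;
  sem_init(&sem, 0, 1);    // "binary" semaphore: 1 initial permit
  sem_post(&sem);          // a non-owner creates a permit from thin air
  int val = 0;
  sem_getvalue(&sem, &val);
  assert(val == 2);        // now 2 permits: not a strict binary semaphore
  sem_destroy(&sem);
}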

https://stackoverflow.com/questions/62814/difference-between-binary-semaphore-and-mutex claims “Mutex can be released only by thread that had acquired it, while you can signal semaphore from any non-owner thread (or process).” This does NOT mean a non-owner thread can release a toilet key owned by another thread/process. It means the non-owner thread can mistakenly create a 2nd key, in breach of exclusive, serialized access to the toilet. https://blog.feabhas.com/2009/09/mutex-vs-semaphores-–-part-1-semaphores/ explicitly says that a thread can release the single toilet key without ever acquiring it.

–java

[[DougLea]] P220 confirms that in some thread libraries such as pthreads, release() can increment a 0/1 binary semaphore value to beyond 1, destroying the mutual exclusion control.

However, java binary semaphore is a mutex because releasing a semaphore before acquiring has no effect (no “spare key”) but doesn’t throw error.

temp object binding preferences: rvr,lvr.. #SCB

(Note I used “temp object” as a loose shorthand for “rval-object”.)

Based on https://www.codesynthesis.com/~boris/blog/2012/07/24/const-rvalue-references/

  • a const L-value reference … … can bind to a naturally-occurring rvalue object (or a robbed object after std::move)
  • a non-const r-value reference can bind to a naturally-occurring rvalue object
  • a const r-value reference (crvr) can bind to a naturally-occurring rvalue object but canNOT bind to an lvalue object

Q: so in the presence of all overloads, what kind of reference can naturally occurring temp objects bind to?
A: a const rvr. Such an object prefers to bind to a const rvalue reference rather than a const lvalue reference. There are some obscure use cases for this binding preference.

More important is the fact that const lvr (param type of copy-ctor) can bind to temp object. [[c++primer]] P540 has a section describing that if you pass a temp into Foo ctor you may hit the copy-ctor:

Foo z(std::move(anotherFoo)); // compiles and runs fine even if move-ctor is not declared. This is THE common scenario before c++11. No change.

Compiler doesn’t bother to synthesize the move-ctor, when copy-ctor is defined!
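
A demo of the binding preferences (my own sketch):

#include <iostream>
#include <string>
#include <utility>

void f(std::string const &){ std::cout << "const lvr\n"; }
void f(std::string &&){ std::cout << "non-const rvr\n"; }
void f(std::string const &&){ std::cout << "const rvr\n"; }

std::string const makeConstTemp(){ return "tmp"; }

int main(){
  std::string s("named");
  f(s);                    // lvalue: only the const lvr overload can bind
  f(std::string("temp"));  // plain temp: prefers the non-const rvr
  f(makeConstTemp());      // const temp: picks the obscure const rvr over const lvr
}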

 

returning^throwing local object

Tag line – always catch by reference. [[moreEffC++]] has a chapter with this title.

  • If you return a local object by reference, it will lead to a runtime error, as the object would be wiped out from the stack. The compiler is likely to give a warning.
  • If you throw a local object and catch by reference up the call stack, it is actually considered best practice, because compiler always clones the local object and throws the clone.
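
In code:

#include <iostream>
#include <stdexcept>

void worker(){
  std::runtime_error local("boom");   // local object on the stack
  throw local;                        // compiler clones it; the clone outlives this frame
}

int main(){
  try { worker(); }
  catch (std::exception const & e){   // best practice: catch by reference
    std::cout << e.what() << '\n';
  }
}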

xtap check`if UDP channel=healthy #CSY

xtap library needs reliable ways to check if a “connectivity” is down

UDP (including multicast) is a connectionless protocol; TCP is connection-oriented. Therefore the xtap “connection” class cannot be used for multicast channels. Multicast Channel is like a TV-channel (NYSE terminology).

–UDP is connectionless, has no session, no virtual circuit, so no “established” state. So how do we know whether the exchange server is dead or just quiet?

After discussing with CSY, I feel UDP unicast or UDP multicast can’t tell me.

heartbeat — I think we must rely on the heartbeat. I actually helped create an inactivity timeout alert precisely because in a multicast channel, we don’t know if exchange is down or just quiet.

–TCP should be easier.

Many online resources discuss this, such as https://stackoverflow.com/questions/4142012/how-to-find-the-socket-connection-state-in-c.

According to CSY, as a receiver, we don’t need to send a probe message. if exchange has closed the TCP socket, the four-way handshake would have informed my socket that the connection is closed. So my select() would say my TCP socket is ready for reading. When I read it I would get 0 bytes.
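
A sketch of that receiver-side check (POSIX sockets; assumes select() has already marked sockfd readable):

#include <sys/socket.h>
#include <sys/types.h>

bool isPeerClosed(int sockfd){
  char buf[4096];
  ssize_t n = recv(sockfd, buf, sizeof buf, 0);
  return n == 0;  // 0 bytes from a readable TCP socket means the peer closed the connection
}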

I believe there’s a one-to-one mapping between a socket and a “connection” for TCP only

 

Virtual_Machine^guest_OS

https://www.virtualbox.org/manual/ch01.html#virtintro explains that

Guest OS runs in a virtual machine or “vm”. A “vm”

  • usually refers to a live container process
  • more often, a vm means a vm-config, i.e. a collection of parameters defining a physical container process to-be-started

It’s important to realize (taking the Windows host OS as an example) that a vm is strictly an application with a window, like a browser or shell. As such, this application has its own config data saved on disk.

 

IV Q: implement op=()using copier #swap+RAII

A 2010 interviewer asked:

Q: do you know any technique to implement the assignment operator using an existing copy ctor?
A: Yes. See P100 [[c++codingStd]] and P347 [[c++cookbook]]. There are multiple learning points:

  • RAII works with a class owning some heap resource.
  • RAII works with a local stack object owning a heapy thingy
  • RAII provides a much-needed exception safety. For the exception guarantee to hold, std::swap and operator delete should never throw, as echoed in my books.
  • I used to think only half the swap (i.e. transfer-in) is needed. Now I know the transfer-out of the old resource is more important. It guarantees the old resource is released even if we hit exceptions, thanks to RAII.
  • The std::swap() trick needed here is powerful because there’s a heap pointer field. Without this field, I don’t think std::swap will be relevant.
  • self-assignment check — not required as it is rare and tolerable
  • Efficiency — considered highly efficient. The same swap-based op= is used extensively in the standard library.
  • idiomatic — this implementation of operator= for a resource-owning class is considered idiomatic, due to simplicity, safety and efficiency

http://www.geeksforgeeks.org/copy-swap-idiom-c/ shows one technique. Not sure if it’s best practice. Below is my own

#include <utility> // std::swap

class C {
  int * heapResource; // owning pointer to some heap resource
public:
  explicit C(int v = 0) : heapResource(new int(v)) {}
  C(C const & rhs) : heapResource(new int(*rhs.heapResource)) {} // the existing copy-ctor
  ~C(){ delete heapResource; }
  C & operator=(C const & rhs){
    C localCopy(rhs); // this step is not needed for move-assignment
    std::swap(localCopy.heapResource, this->heapResource);
    return *this;
  } // at end of this function, localCopy is destructed, and the original this->heapResource is deleted

  C & operator=(C && rhs){ // move-assignment
    std::swap(rhs.heapResource, this->heapResource);
    return *this;
  }
};

 

enable_if{bool,T=void} #sfinae,static_assert

  • enable_if is a TMP technique that hides a function overload (or template specialization) — convenient way to leverage SFINAE to conditionally remove functions from overload resolution based on type traits and to provide separate function overloads for different type traits. [3]
    • I think static_assert doesn’t do the job.
  • typically used with std::is_* compile-time type checks provided in type_traits.h
  • enable_if_t evaluates to either a valid type or nothing. You can use enable_if_t as a function return type. See [4] and the sketch after this list. This is the simplest usage of enable_if to make the target function eligible for overload resolution. You can also use enable_if_t as a function parameter type, but that’s too complicated for me. [3]
  • There are hundreds of references to it in the C++11 standard template library [1]
  • type traits — often combined with enable_if [1]
  • sfinae — is closely related to enable_if [1]
  • by default (common usage), the enable_if_t evaluates to void if “enabled”. I have a github experiment specifically on enable_if_t. If you use enable_if_t as return type you had better put a q(*) after it!
    • Deepak’s demo in [4] uses bool as enable_if_t
  • static_assert — is not necessary for enable_if but can be used for compile-time type validation. Note static_assert is unrelated to sfinae. https://stackoverflow.com/questions/16302977/static-assertions-and-sfinae explains that
    • sfinae checks declarations of overloads. sfinae = Substitution failure is not a (compiler) error. An error would break the compilation.
    • static assert is always in definitions, not declarations. static asserts generate compiler errors.
    • Note the difference between failure vs error
    • I think template instantiation (including overload resolution) happens first and static_assert happens later. If static_assert fails, it’s too late. SFINAE game is over. Compilation has failed irrecoverably.
    • Aha moment on SFINAE !
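
A minimal C++14 sketch (my own, in the spirit of [3] and [4]):

#include <iostream>
#include <type_traits>

template<typename T>
std::enable_if_t<std::is_integral<T>::value, void>       // valid type only for integral T
show(T x){ std::cout << x << " is integral\n"; }

template<typename T>
std::enable_if_t<std::is_floating_point<T>::value, void> // the other overload is SFINAEd out
show(T x){ std::cout << x << " is floating point\n"; }

int main(){
  show(42);   // picks the integral overload
  show(3.14); // picks the floating-point overload
}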

[1] https://eli.thegreenplace.net/2014/sfinae-and-enable_if/

[2] https://stackoverflow.com/questions/30556176/template-detects-if-t-is-pointer-or-class

[3] https://en.cppreference.com/w/cpp/types/enable_if

[4] https://github.com/tiger40490/repo1/blob/cpp1/cpp/template/SFINAE_Deepak.cpp

enable_shared_from_this #CRTP

  • std::enable_shared_from_this is a base class template
  • shared_from_this() is the member function provided
  • CRTP is needed when you derive from this base class.
  • underlying problem (target of this solution) — buggy design where two separate “clubs” centered around the same raw ptr. Each club thinks it owns the raw ptr and would destroy it independently.
  • usage example —
    • your host class is already managed via a shared_ptr
    • your instance method need to create new shared_ptr objects from “this”
    • Note if you only need to access “this” you won’t need the complexity here.
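
A minimal usage sketch (Trade is a made-up host class):

#include <memory>

class Trade : public std::enable_shared_from_this<Trade> { // CRTP
public:
  std::shared_ptr<Trade> getRef(){
    return shared_from_this(); // joins the existing control block, no second "club"
  }
};

int main(){
  auto p = std::make_shared<Trade>(); // host object already managed via shared_ptr
  auto q = p->getRef();               // safe: use_count becomes 2
  // std::shared_ptr<Trade> bad(p.get()); // the buggy design: two clubs, double delete
}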

Pimco asked it in 2017!

rvr^rvalueObject #rvr=compiler concept

  • ALL objects by definition exist in runtime memory but references may not.
  • std::move() is about rvr variables not rvalue objects!
    • You can even use move() on a naturally-occurring temp, though std::move is supposed to be used on regular lval objects
  • I now believe rvr is a compiler concept. Underlying is just a temporary’s address.
    • Further, the traditional lvr is probably a compiler concept too. Underlying is a regular pointer variable.

rvalue object is an object that can ONLY appear on the RHS of assignment.

rvalue object can be naturally occurring (usually anonymous), or “converted from a named object” by std::move(), but if some variable still references the object, then the object is actually not a proper rvalue object.

The SCB architect pointed out that “some variable” can be a const ref bound to a naturally occurring rvalue! To my surprise the c++ syntax rule says the object is still a temp object i.e. rvalue object, so you can’t bind it to an lvr!
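
In code:

#include <string>

std::string makeTemp(){ return "temp"; }

int main(){
  std::string const & cr = makeTemp(); // const lvr binds the temp; lifetime extended
  // std::string & lr = makeTemp();    // error: still an rvalue object, can't bind to an lvr
  (void)cr;
}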

unique_ptr implicit copy : only for rvr #auto_ptr

P 470-471 [[c++primer]] made it clear that

  • on a regular unique_ptr variable, explicit copy is a compilation error. Different from auto_ptr here.
  • However returning an unnamed temp unique_ptr (rvalue object) from a function is a standard idiom.
    • Factory returning a unique_ptr by value is the most standard idiom.
    • This is actually the scenario in my SCB-FM interview by the team architect

Underlying reason is what I have known for a long time — move-only. What I didn’t know (well enough to impress interviewer) — the implication for implicit copy. Implicit copy is the most common usage of unique_ptr.
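
A sketch of the factory idiom (Trade is made up):

#include <memory>
#include <utility>

struct Trade { int qty = 0; };

std::unique_ptr<Trade> factory(){     // the most standard idiom
  return std::make_unique<Trade>();   // returns an unnamed temp: moved, never copied
}

int main(){
  auto p = factory();                 // OK: implicit move from the rvalue object
  // auto q = p;                      // error: explicit copy of a named unique_ptr
  auto q = std::move(p);              // explicit move is fine
}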

risk-neutral means..illustrated by CIP

Background — all of my valuation procedures are subjective, like valuing a property, an oil field, a commodity …

Risk-Neutral has always been confusing, vague, abstract to me. CIP ^ UIP, based on Mark Hendricks’ notes, has an illustration —

  • RN means .. regardless of individuals’ risk profiles … therefore objective
  • RN means .. partially [1] backed by arbitrage arguments, but often theoretical
    • [1] partially can mean 30% or 80%
    • If it’s well supported by arbitrage argument, then replication becomes theoretical foundation of RN pricing

 

extern^static on var^functions

[1] https://stackoverflow.com/questions/14742664/c-static-local-function-vs-global-function confirmed my understanding that a file-scope static function is designed to allow other files to define different functions under the same name.

| | extern var | file-scope STATIC var | static func | extern-C func |
|---|---|---|---|---|
| purpose 1 | single object shared across files | | [1] applies. 99% sure | name mangling |
| purpose 2 | | private variable | private function | |
| alternative 1 | static field | anon namespace | | no alternative |
| alternative 2 | singleton | | | |
| advantage 1 | | won’t get used by mistake from other file | | |
| disadv 1 | “extern” is disabled if you provide an initializer | no risk. very simple | | |
| put in shared header? | be careful | should be fine | | not sure |
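
A two-file sketch of the main columns (my own illustration):

// file1.cpp
int sharedVar = 0;            // defined once; other files share this single object
static int privateVar = 0;    // file-scope static: invisible outside file1.cpp
static void helper(){}        // private function; file2.cpp may define its own helper()
extern "C" int add(int a, int b){ return a + b; }  // no C++ name mangling

// file2.cpp
extern int sharedVar;         // declaration only; adding an initializer would turn it into a definition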

order slice^fill: jargon

An execution is also known as a fill, often a partial fill.

  • A slice is part of a request; An execution is part of a response

A slice can have many fills, but a fill is always for a single request.

  • An execution always comes from some exchange, back to buy-side clients, whereas
  • A request (including a slice) always comes from upstream (like clients) to downstream (like exchanges)
  • Slicing is controlled by OMS systems like HFT; Executions are controlled by exchanges.
  • Clients can send a cancel for a slice before it’s filled; Executions can be busted only by exchanges.

reference(instead of ptr) to smart ptr instance

I usually pass smart pointers by value (copy-constructor or move-constructor), just like copying a raw ptr.  Therefore the code below looks unnatural:

unique_ptr<Trade> & ref2smartPtr

Actually it is rather common because

  • As Herb Sutter suggested, when we need to put pointer into containers, we should avoid raw ptr. Unique ptr is the default choice, and the first choice, followed by shared_ptr
  • I often use unique_ptr as a map value. The operator[] return type is a reference to the value type i.e. a reference to unique_ptr
  • I may also put unique_ptr into a vector…. ditto for vector operator[]
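
A minimal example of the map scenario (Trade is made up):

#include <map>
#include <memory>
#include <string>

struct Trade { int qty = 0; };

int main(){
  std::map<std::string, std::unique_ptr<Trade>> book;
  book["IBM"] = std::make_unique<Trade>();              // operator[] returns unique_ptr<Trade>&
  std::unique_ptr<Trade> & ref2smartPtr = book["IBM"];  // a reference, not a copy
  ref2smartPtr->qty = 100;
}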

2 reasons: BM is poor model for bond price

Reason 1 — terminal value is known. It’s more than deterministic. It’s exactly $100 at maturity. Brownian Motion doesn’t match that.

Reason 2 — drift estimate is too hard and too sensitive. A BM process has a drift value. You can be very careful and very thorough estimating it, but any minor change in the drift estimate would result in very large differences in the price evolution, if the bond’s lifespan is longer than 10Y.

 

factory patterns explained graphically

https://vivekcek.wordpress.com/2013/03/17/simple-factory-vs-factory-method-vs-abstract-factory-by-example/ has nice block diagrams and examples.

  • CC) “Simple factory” is the most widely used one but not a widely known pattern, not even a named pattern. “SimpleFactory” is just an informal name

I mostly used simple factory and not the other factory patterns.

Any time I have a method that creates and returns polymorphic objects from a family (like different Trades), I would name the entire class SomeFactory. I often add some states and additional features but as a class or struct, it is still a simple manufacturer.

  • BB) factory method — a family of manufacturer classes. Each supports a virtual create() method. Each is responsible for instantiating just one specific type of object. NokiaFactory.create() only creates NokiaPhone objects.
  • AA) AbstractFactory — too complex and needed in very special situations only.
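
A condensed simple-factory sketch (made-up Trade family):

#include <memory>
#include <stdexcept>
#include <string>

struct Trade { virtual ~Trade() = default; };
struct BondTrade : Trade {};
struct EqTrade   : Trade {};

// "simple factory": one class whose create() returns polymorphic Trades
class TradeFactory {
public:
  std::unique_ptr<Trade> create(std::string const & type){
    if (type == "bond") return std::make_unique<BondTrade>();
    if (type == "eq")   return std::make_unique<EqTrade>();
    throw std::invalid_argument(type);
  }
};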

swap on eq futures/options: client motive

Q1: why would anyone want to enter a swap contract on an option/futures (such a complex structure) rather than trading the option/futures directly?

Q2: why would anyone want to use swap on an offshore stock rather than trading it directly?

More fundamentally,

Q3: why would anyone want to use swap on domestic stock?

A1: I believe one important motivation is restrictions/regulation.  A trading shop needs a lot of approvals, licenses, capital, disclosures … to trade on a given futures/options exchange. I guess there might be disclosure and statutory reporting requirements.  If the shop can’t or doesn’t want to bother with the regulations, they can achieve the same exposure via a swap contract.

This is esp. relevant in cross-border trading. Many regulators restrict access by offshore traders, as a way to protect the local market and local investors.

A3: One possible reason is transparency, disclosure and reporting. I guess many shops don’t want to disclose their positions in, say, AAPL. The swap contract can help them conceal their position.

in c++11 overload resolution, TYPE means..

In the presence of rvr, compiler must more carefully choose overloads based on type of argument and type of parameter.

  • I think type-of-parameter is the declared type, as programmer wrote it.

No change in c++11. No surprise.

  • “argument” means the object. Type-of-argument means…?
  1. Traditionally it means int vs float
  2. Traditionally it means const vs non-const
  3. (Traditionally, it means Acct vs TradingAcct but this is runtime type info, not available at compile time.)
  4. In c++11, it also means rval-obj vs regular object. You can convert regular object to a rval-object via … move(). But What if argument is a rval-variable, declared as “int&& param” ?

This is one of the trickiest confusions about rvr. The compiler “reasons” differently than a human reasons. [[effModernC++]] tried to explain it on P2 and P162 but I don’t get it.

Yes the ultimate runtime int object is an rval-obj (either naturally-occurring or moved), but when compiler sees the argument is an “int&& param”, compiler treats this argument as lvalue as it has a Location !

My blogpost calling std::move() inside mv-ctor  includes my code experiments.
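
A minimal demo that a named rvr parameter is treated as an lvalue:

#include <iostream>
#include <utility>

void f(int &&){ std::cout << "rvr overload\n"; }
void f(int const &){ std::cout << "lvr overload\n"; }

void g(int && param){  // param is declared rvr...
  f(param);            // ...but as a named variable it is an lvalue: picks the lvr overload
  f(std::move(param)); // cast back to rvalue: picks the rvr overload
}

int main(){ g(33); }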

FRA^ED-fut: actual loan-rate fixed when@@

Suppose I’m IBM, and need to borrow in 3 months’ time. As explained in typical FRA scenario, inspired by CFA Reading 71, I could buy an FRA and agree to pay a pre-agreed rate of 550 bps.  What’s the actual loan rate? As explained in that post,

  • If I borrow on open market, then actual loan rate is the open-market rate on 1 Apr
  • If I borrow from the FRA dealer GS, then loan rate is the pre-agreed 550 bps
  • Either way, I’m indifferent, since in the open-market case, what ever rate I pay is offset by the p/l of the FRA

Instead of the FRA, I could go short the eurodollar futures. This contract is always cash-settled, so the actual loan rate is probably the open-market rate, but whatever market rate I pay is offset by the p/l of the futures contract.

Q: bond price change when yield goes to zero

Can bond yield become negative? Yes: in 2015-2017 many bonds traded at negative yield. https://www.ft.com/content/312f0a8c-0094-11e6-ac98-3c15a1aa2e62 shows a realistic example of a vanilla bond trading at $104. Yield is negative – you pay $104 now and will get $100 repayment, so you are guaranteed to lose money.

Mathematically, when yield approaches negative 100 percent, price goes to infinity.

When yield approaches zero, bond price would go to the arithmetic sum of all coupons + repayment.
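
In formulas (standard pricing of a bullet bond with annual coupon c and T years to maturity; my own illustration):

P(y) = \sum_{t=1}^{T} \frac{c}{(1+y)^t} + \frac{100}{(1+y)^T},
\qquad P(0) = T\,c + 100,
\qquad \lim_{y \to -1^{+}} P(y) = +\infty

At y = 0 every discount factor equals 1, giving the arithmetic sum of coupons plus repayment; as y approaches negative 100 percent the discount factors blow up.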

Cloneable, Object.clone(), Pen.Clone() #java

A few learning points.

The Object.clone() implementation is not that important, because I should always override it in my class like Pen, but here are some observations about this Object.clone():

  • shallow copy, not deep copy, not very useful.
  • this protected method is mostly meant to be invoked by a subclass:
    • If your variable points to some Pen object that’s not Cloneable, and you call clone() from either the same package or a subclass, then you hit CloneNotSupportedException.
    • if your Pen class implements Cloneable, then it should override the clone() method
    • [1] The only way the default Object.clone() gets picked is when a Cat class implements Cloneable but doesn’t override clone(), and you call it from either the same package or a subclass

Cloneable is a special marker interface. It could trigger the CloneNotSupportedException, but if you override clone() then this exception may not hit. It’s an obscure detail.

  • I think you can override clone() without implementing Cloneable, but this is tricky and non-standard.
  • You could also implement Cloneable without overriding clone() .. see [1]

java: protected^package-private

https://docs.oracle.com/javase/tutorial/java/javaOO/accesscontrol.html (java8) shows two nice tables:

  • There’s no more “private protected”
  • the default access level is better known as “package-private” — strictly more private/restrictive than protected (protected is closer to public). The 2nd table shows that
    • a “package-private” member of Alpha is accessible by Beta (same package) only, whereas
    • a “protected” member of Alpha is accessible by Beta and Alphasub
    • Therefore, “neighbors are more trusted than children”

I find it hard to remember, so here are some sound bites

  1. “protected” keyword only increases visibility never decreases it.
  2. So a protected field is more accessible than default (package-private)
    • As an example, without “protected” label on my field1, my subclasses outside the package cannot see field1.
  3. same-package neighbors are local and trusted more than children outside the package, possibly scattered over external jars

For a “protected” field1, a non-subclass in the same package can see it just as it can see a default-accessible field2

Not mentioned in the article, but when we say “class Beta can access a member x of Alpha”, it means that the compiler allows you to write, inside Beta methods, code that mentions x. It could be myAlpha.x or it could be Alpha.x for a static member.

PendingNew^New: OrdStatus[39]

“8” has special meaning

  • in tag 35, it means execution report
  • in tag 39 and tag 150, it means status=rejected.

PendingNew and New are two possible statuses for a given order.

PendingNew (39=A, 150=A) is relatively simple. The downstream system sends a PendingNew to upstream as soon as it receives the “envelope”, before opening, validating or saving it. I would say even a bad order can go into PendingNew.

New (39=0, 150=0) is significant. It’s an official acknowledgement (or confirmation) of acceptance. It’s synonymous with “Accept” and “Ack”. I think it means fully validated and saved for execution. For an intermediate system, usually it waits for an Ack i.e. 39=0 from exchange before sending an Ack to the upstream. Tag 39 is usually not modified.

I see both A and 0 frequently in my systems, in execution report messages.

For a market Buy order, I think it will be followed by (partial) fills, but not guaranteed, because there may be no offers, or execution could fail for any reason. For a dealer system, execution can fail due to inventory shortage. I implemented such an execution engine in 95G.

I’m no expert on order statuses.

perl regex top3 tips #modifier /m /s

https://docstore.mik.ua/orelly/perl3/lperl/ch09_05.htm shows

The part of the (haystack) string that actually matched the (needle) pattern is automatically stored in q[  $& ]

Whatever came before the matched section is in $` and whatever was after it is in $'. Another way to say that is that $` holds whatever the regular expression engine had to skip over before it found the match, and $' has the remainder of the string that the pattern never got to. If you glue these three strings together in order, you’ll always get back the original string.

— /m /s clarification:

  1. By default, q($) + q(^) won’t match newline. /m targets q($) and q(^)
  2. By default, the dot q(.) won’t match newline. /s targets the dot.
  3. The /m and /s both help get newlines matched, in different contexts.

Official doc says:

  1. /m  Treat the string being matched against as multiple lines. That is, change "^" and "$" from matching the start of the string’s first line and the end of its last line to matching embedded start and end of each line within the string.
  2. /s  Treat the string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.
  3. Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.

at run time, function must be predefined ] python #c++

As shown in https://github.com/tiger40490/repo1/blob/py1/py/str/testAbbr_ez.py, you can randomly shuffle your functions bodies (i.e. definitions) as long as … at run time, each function call can resolve to a function already seen by the interpreter, not a function to be defined later in the script.

Easiest way is to call main() after all function bodies. By the time main() runs and uses any of those functions, the interpreter runtime has already parsed the function body.

If you move func8() body to after the call to main() (I didn’t say “the body” of main()), then python will complain that func8 is undefined. It’s as bad as:

print var3; var3=3

(Note there’s no declaration vs definition for python functions.)

C is more strict — at compile time (not run time), when the compiler encounters a name like func5(), it insists that a func5() declaration (or definition) has already been seen.

To satisfy the compiler, people put function declarations in include files. Function definition obeys ODR.

[19] j^python arg passing #(im)mutable

Python is ALWAYS pbref.

— list/arrayList, dict/HM etc: same behavior between java and py —
pbref. Edits by the function hits the original object.

— string: probably identical behavior between java and py —
immutable in both java and py.

Passed by reference in both.

If function “modifies” the string content, it gets a new string object (like copy-on-write), unrelated to the original string object in the “upstairs” stackframe.

— ints, floats etc: functionally similar between java and py —
If function modifies the integer content, the effect is “similar”. In python a new int object is created (like string). In java this int is a mutable clone of the original, created on the stackframe.

Java int is primitive, without a usable address; it is passed by clone (pbclone), never pbref.

Python numbers are like python strings – immutable objects passed by reference. Probably copy-on-write.

##eg@syntax/idiom learning via coding drill

Syntax (+ some best practice in syntax) is essential in 70% of my real-coding interviews.

There are many memorable exceptions like char-array/int-array, tree, linked list, regex ….. , which require ECT, BP and pure algo, but trivial syntax. Some of them are relatively “hard” on pure algo and my relative strength shines through among Wall St candidates.

Below are some examples of syntax learning via coding drill

  • pointer manipulation in linked graph
  • matrix manipulation
  • pitfall — iterator invalidation
  • pitfall — output iterator hitting end()
  • c++ search within string
  • eg: multi-threading problems —- I had at least 8 experiences in java and c++ (UBS-Yang, Barcap, Gelber, SCB, HSBC, Wells, Bbg thrPool, Pimco + some minor ones in BGC)
  • eg: template, move semantic and placement-new … asked in white-board coding, nothing to do with ECT!
  • eg: lower_bound() means touchUpperBound is all about syntax not algorithm. Revealed via coding interview!
  • See also categories pyECT_nlg, t_c++syntaxTip, t_c++tecniq, T3_stdStr_tagT2_cStr/arr/vector
  • see tags t_c++ECT_nlg
  • c++ syntax requires 3-6 times more learning than python

Linux(!!Unix) kernel thread can! run user program

–“Linux kernel thread cannot run user programs”, as explained in [[UnderstandingLinuxKernel]] P3.

Removing ‘kernel‘… Linux userland threads (based on LWP, explained below) do run user programs.

Removing ‘thread‘… Linux kernel processes or interrupt routines could possibly run under a user process pid.

Removing ‘Linux‘ … Other Unix variants do use kernel threads to run both 1) user programs and 2) kernel routines. This is my understanding from reading about JVM…

  • unix  — multi-threaded user apps =use=> LWP =use=> kernel threads
  • linux — multi-threaded user apps =use=> LWP. LWP is the basic unit of concurrency in user apps. Linux LWP is unrelated to Unix LWP.

y FIX needs seqNo over TCP seqNo

My friend Alan Shi said … Suppose your FIX process crashed or lost power, reloaded (from disk) the last sequence received, and reconnected (resetting the tcp seq#). It would then receive a live seq# higher than expected. This could mean some execution reports were missed. If the exchange notices a sequence gap, then it could mean some order cancellation was missed.  Both scenarios are serious and require a resend request. CME documentation states:

… a given system, upon detecting a higher than expected message sequence number from its counterparty, requests a range of ordered messages resent from the counterparty.

Major difference from TCP sequence number — FIX specifies no Ack, though some exchanges do. See Ack in FIX^TCP

Q: how could FIX miss messages given TCP reliability? I guess tcp session disconnect is the main reason.

https://kb.b2bits.com/display/B2BITS/Sequence+number+handling has details:

  • two streams of seq numbers, each controlled by exch vs trader
  • how to react to unexpected high/low values received. Note “my” outgoing seq is controlled by me hence never “unexpected”
  • Sequence number reset policy — after a logout, sequence numbers are supposed to reset to 1, but if the connection is terminated ‘non-gracefully’ sequence numbers will continue when the session is restored. In fact a lot of service providers (eg: Trax) never reset sequence numbers during the day. There are also some who reset sequence numbers once per week, regardless of logout.

 

lower_bound() means touchUpperBound #4 scenarios

  • std::upper_bound() should be named strictUpperBound()
  • std::lower_bound() should be named touchUpperBound() since it can return an element touching the target value

If there’s no perfect hit, then both return the same node — the lowest node above the target

  1. if target is too high, both return end(), which is a fake element
  2. if target is too low, both return begin(), which is a real element
  3. if target is matching one of the nodes, then perfect hit
  4. if target is between 2 nodes, then … this is the most common.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/66miscIVQ/curveInterpolation_CVA.cpp caters to all four scenarios.
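
A minimal demo of the four scenarios:

#include <cassert>
#include <map>

int main(){
  std::map<int, char> m{{10,'a'}, {20,'b'}, {30,'c'}};
  assert(m.lower_bound(20)->first == 20); // perfect hit: touches the equal node
  assert(m.upper_bound(20)->first == 30); // strictly above the target
  assert(m.lower_bound(25)->first == 30); // target between 2 nodes: lowest node above
  assert(m.lower_bound(99) == m.end());   // target too high: the fake element
  assert(m.lower_bound(-5)->first == 10); // target too low: begin(), a real element
}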

##types of rvr/rvalueObjects out there #SCB IV

An rvr variable is a door plate on a memory location, wherein the data content is regarded as Disposable: either 1) a naturally occurring unnamed temporary object, or 2) a named object earmarked (via move()) as no-longer-needed.

Examples of first case:

  • function returning a nonref — Item 25 of [[effModernC++]] and P532 [[c++primer]]. I think this is extremely common
    • function returning a pair<int, float>
    • function returning a vector<int>
    • function returning a string
    • function returning an int
  • string1+string2
  • 33+55

In the 2nd case the same object could also have a regular lvr door plate (or a pointer pointing to it). This lvr variable should NOT be used any more.

Q: That’s a rvr variable… how about the rvr object?
A: no such thing. A rvr is always a variable. There exists a memory location at the door plate, but that object is neither rvr nor lvr.
%%A: I explained to my 2018 SCB interviewer — rvr and lvr (and pointer variables) are thingies known to the compiler. Objects are runtime thingies, including 32-bit pointer objects. However, an unnamed temp object is (due to compiler) soon-to-be-destroyed, so it is accessed via a rvr.

UDP socket is identified by two-tuple; TCP socket is by four-tuple

Based on [[computer networking]] P192. see also de-multiplex by-destPortNumber UDP ok but !! enough for TCP

  • Note the term in subject is “socket” not “connection”. UDP is connection-less.

A TCP/IP packet carries four relevant header fields — source IP:port and destination IP:port (the addresses live in the IP header, the ports in the TCP header).

A TCP socket has internal data structure for a four-tuple — Remote IP:port and local IP:port.

A regular TCP “Worker socket” has all four items populated, to represent a real “session/connection”, but a Listening socket could have wild cards in all but the local-port field.

susFocus^engage^absorbency^visPgress^traction – Clarified

I hope to differentiate these related terms, and reduce proliferation of tags

  • traction — is kind of vague and high-level, so no “t_traction”
  • traction — often describes learning curve breakthrough. Defined in [11]traction(def)^learning curve gradient #diminishing ROTI
  • visPgress — is more concrete and well-defined than “traction”
  • (dis)engaged — describes a mental state, like “absorbed”, “enchanted”
  • (dis)engaged — can be felt only from inside
  • sustainedFocus — is often required to overcome tough learning obstacles
  • sustainedFocus — can be sustained by self-discipline (absorbency) whereas engaged is often due to luck and sustained interest.
  • sustainedFocus — is more a specific form of being “engaged”
  • sustainedFocus — is a longer phase, compared to the honeymoon of engagement
  • absorbency — is most specific
  • absorbency — describes the difficulty of staying engaged despite dry, repetitive, focused learning

Grandpa is a model of

  • engaged
  • sustainedFocus — but without too much self-discipline needed
  • absorbency

In comparison to grandpa,

  • I want to have more sustained focus
  • I want longer engagement, because I tend to lose interest too soon
  • I know my engagement lasts only hours, so I frequently feel a bold justification to capture the moment, perhaps at a financial cost
  • my absorbency capacity is not as good as I wish
  • i feel a real need to capture commute time
  • I need a job that gives me more personal time, even during work hours.
  • … I think these factors have profound consequences, such as my career decisions

fragmentation: IP^TCP #retrans

I consider this a “halo” knowledge pearl because it is part of an essential everyday service. We can easily find an opportunity to inject it into an Interview. Interviews are unlikely to go this deep, but it’s good to over-prepare here.

This comparison ties together many loose ends like Ack, reassembly, retrans, seq resets..

This comparison clarifies what reliability IP offers + what reliability features it lacks:

  • it offers complete datagrams
  • it can lose an entire datagram — TCP (not UDP) will detect it and use retrans
  • it can deliver two datagrams out of order — TCP will detect and fix it

[1] IP fragmentation can cause excessive retransmissions when fragments encounter packet loss and reliable protocols such as TCP must retransmit ALL of the fragments in order to recover from the loss of a SINGLE fragment
[2] TCP seq# never looks like 1,2,3
[3] IP (de)fragmentation #MTU,offset

| | IP4 fragmentation | TCP fragmentation |
|---|---|---|
| minimum guarantees | all-or-nothing; never a partial datagram | in-sequence stream without gap |
| reliability | unreliable | fully reliable |
| name of a “whole” piece | IP datagram (“packet/msg” are vague) | a TCP session with thousands of sequence numbers |
| name for a “part” | IP fragment | TCP segment |
| sequencing | each fragment has an offset | each segment has a seq# |
| .. continuous? | yes | discontinuous! [2] |
| .. seq# reset? | yes, for each packet | loops back to 0 right before overflow |
| Ack | no Ack! | positive Ack needed |
| gap detection | using offset | using seq# [2] |
| id for the “msg” | identification number | no such thing |
| end-of-msg flag | in last fragment | no such thing: continuous stream |
| out-of-sequence? | likely | likely |
| .. reassembly | based on id/offset/flag [3] | based on seq# |
| retrans | not by IP4 [1][3] | commonplace |
| missing a piece? | entire IP datagram discarded [3] | triggers retrans |

de-multiplex by-destPort: UDP ok but insufficient for TCP

When people ask me the purpose of the port number in networking, I used to say it helps demultiplex. Now I know that’s true for UDP, but TCP uses more than the destination port number.

Background — two processes X and Y on a single-IP machine need to maintain two private, independent ssh sessions. The incoming packets need to be directed to the correct process, based on the port numbers of X and Y… or is that the whole story?

If X is sshd with a listening socket on port 22, and Y is a forked child process from accept(), then Y’s “worker socket” also has local port 22. That’s why in our linux server, I see many ssh sockets where the local ip:port pairs are indistinguishable.

TCP demultiplex uses not only the local ip:port, but also remote (i.e. source) ip:port. Demultiplex also considers wild cards.

|                                                           | TCP                                          | UDP |
| socket has local IP:port                                  | yes                                          | yes |
| socket has remote IP:port                                 | yes                                          | no such thing |
| 2 sockets with same local port 22 …                       |                                              |     |
| .. can live in two processes                              | yes                                          | not allowed |
| .. can live in one process                                | yes                                          | not allowed |
| 2 msgs with same dest ip:port but different source ports  | addressed to 2 sockets, i.e. 2 ssh sessions  | addressed to the same socket |

make_shared() cache efficiency, forward()

This low-level topic is apparently important to multiple interviewers. I guess there are similarly low-level topics like lockfree, wait/notify, hashmap, const correctness.. These topics are purely for theoretical QQ interviews. I don’t think app developers ever need to write forward() in their code.

https://stackoverflow.com/questions/18543717/c-perfect-forwarding/18543824 touches on a few low-level optimizations. Suppose you follow Herb Sutter’s advice and write a factory accepting Trade ctor arg and returning a shared_ptr<Trade>,

  • your factory’s parameter should be a universal reference. You should then std::forward() it to make_shared() — see the sketch after this list. See make_shared() source in https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-api-4.6/a01033_source.html
  • make_shared() makes a single allocation for a Trade and an adjacent control block, with cache efficiency — any read access on the Trade pointer will cache the control block too
  • I remember reading online that one allocation vs two is a huge performance win….
  • if the arg object is a temp object, then the rvr would be forwarded to the Trade ctor. Scott Meyers says the lvr would be cast to an rvr. The Trade ctor would need to move() it.
  • if the runtime object is carried by an lvr (arg object not a temp object), then the lvr would be forwarded as is to Trade ctor?

Q: What if I omit std::forward()?
AA: Trade ctor would always receive an lvr. See ScottMeyers P162 and my github code

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp is my experiment.

 

4buffers in 1 TCP connection #full-duplex

Many socket programmers may not realize there are four transmission buffers in a single TCP connection or TCP session. These buffers are set aside by the kernel when the socket is created.

Two buffers in socket AA (say, in U.S.) + two buffers in socket BB (say, in Singapore)

Two receive buffers + two send buffers

At any time, there could be data in all four buffers. AA can initiate a send in the middle of receiving data because TCP is a full-duplex protocol.

Whenever one of the two sockets initiates a send, it has a duty to ensure it never overflows the receive buffer on the other end. This is the essence of flow-control. The flow-control relies on the 2 buffers involved (not the other two buffers). P247 [[computer networking]] has details. Basically, the sender has an estimate of the remaining free space in the receive buffer, so the sender never sends too many bytes. It keeps unsent data in its send buffer.

Is UDP full-duplex? My book doesn’t mention it. I think it is.

session^msg^tag level concepts

Remember every session-level operation is implemented using messages.

  • Gap management? Session level
  • Seq reset? Session level, but seq reset is a msg
  • Retrans? per-Msg
  • Checksum? per-Msg
  • MsgType tag? there is one per-Msg
  • Header? per-Msg
  • order/trade life-cycle? session level
  • OMS state management? None of them …. more like application level than Session level

linux tcp buffer^AWS tuning params

—receive buffer configuration
In general, there are two ways to control how large a TCP socket receive buffer can grow on Linux:

  1. You can call setsockopt(SO_RCVBUF) to explicitly set the max receive buffer size on individual TCP/UDP sockets
  2. Or you can leave it to the operating system and allow it to auto-tune the buffer dynamically, using the global tcp_rmem values as a hint.
  3. … both values are capped by rmem_max, described below.

/proc/sys/net/core/rmem_max — is a global hard limit on all sockets (TCP/UDP). I see 256M in my system. Can you set it to 1GB? I’m not sure but it’s probably unaffected by the boolean flag below.

/proc/sys/net/ipv4/tcp_rmem — doesn’t override SO_RCVBUF. The max value on RTS system is again 256M. The receive buffer for each socket is adjusted by kernel dynamically, at runtime.

The linux “tcp” manpage explains the relationship.

Note large TCP receive buffer size is usually required for high latency, high bandwidth, high volume connections. Low latency systems should use smaller TCP buffers.

For high-volume multicast channel, you need large receive buffers to guard against data loss — UDP sender doesn’t obey flow control to prevent receiver overflow.

—AWS

/proc/sys/net/ipv4/tcp_window_scaling is a boolean configuration, turned on by default. With window scaling turned on, 1GB is the new AWS limit. If turned off, AWS is constrained to the 16-bit integer field in the TCP header — 65536.

I think this flag affects AWS and not receive buffer size.

  • if turned on, and if buffer is configured to grow beyond 64KB, then Ack can set AWS to above 65536.
  • if turned off, then we don’t (?) need a large buffer since AWS can only be 65536 or lower.

 

FIX.5 + FIXT.1 breaking changes

I think many systems are not yet using FIX.5 …

  1. FIXT.1 protocol [1] is a kinda subset of the FIX.4 protocol. It specifies (the traditional) FIX session maintenance over naked TCP
  2. FIX.5 protocol is a kinda subset of the FIX.4 protocol. It specifies only the application messages and not the session messages.

See https://www.fixtrading.org/standards/unsupported/fix-5-0/. Therefore,

  • FIX4 over naked TCP = FIX.5 + FIXT
  • FIX4 over non-TCP = FIX.5 over non-TCP

FIX.5 can use a transport messaging queue or web service, instead of FIXT

In FIX.5, Header now has 5 mandatory fields:

  • existing — BeginString(8), BodyLength(9), MsgType(35)
  • new — SenderCompId(49), TargetCompId(56)

Some applications also require MsgSeqNo(34) and SendingTime(52), but these are unrelated to FIX.5

Note BeginString actually looks like “8=FIXT.1.1”

[1] FIX Session layer will utilize a new version moniker of “FIXTx.y”

HFT mktData redistribution via MOM #real world

Several low-latency practitioners say MOM is unwelcome due to added latency:

  1. The HSBC hiring manager Brian R was the first to point out to me that MOM adds latency. Their goal is to get the raw (market) data from producer to consumer as quickly as possible, with minimum hops in between.
  2. 29West documentation echoes “Instead of implementing special messaging servers and daemons to receive and re-transmit messages, Ultra Messaging routes messages primarily with the network infrastructure at wire speed. Placing little or nothing in between the sender and receiver is an important and unique design principle of Ultra Messaging.” However, the UM system itself is an additional hop, right? Contrast a design where sender directly sends to receiver via multicast.
  3. Then I found that the RTS systems (not ultra-low-latency) have no middleware between feed parser and order book engine (named Rebus).

However, HFT doesn’t always avoid MOM.

  • P143 [[all about HFT]] published 2010 says an HFT such as Citadel often subscribes to both individual stock exchanges and CTS/CQS [1], and multicasts the market data for internal components of the HFT platform. This design has additional buffers inherently. The first layer receives raw external data via a socket buffer. The 2nd layer components would receive the multicast data via their socket buffers.
  • SCB eFX system is very competitive in latency, with Solarflare NIC + kernel bypass etc. They still use Solace MOM!
  • Lehman’s market data is re-distributed over tibco RV, in FIX format.

[1] one key justification to subscribe redundant feeds — CTS/CQS may deliver a tick message faster than direct feed!

IP4 fragmentation+reassembly #MTU,offset

I consider this a “halo” knowledge pearl because it is part of an essential everyday service. We can easily find an opportunity to inject it into an IV.

A Trex interviewer said something questionable. I said fragmentation is done at IP layer and he said yes but not reassembly. He was wrong. See P329 [[Computer Networking]]

I was talking about the IP layer breaking up, say, a 4KB packet (TCP or UDP packet) into three IP-fragments no bigger than 1500B [1]. The reassembly task is to put all 3 fragments back together in sequence (and detect missing fragments) and hand it over to TCP or UDP.

This reassembly is done in IP layer. IP4 uses an “offset” number in each fragment to identify the sequencing and to detect missing fragments. The fragment with the highest offset also has a flag indicating it’s the last fragment of a given /logical/ packet.

Therefore, IP4 detects and will never deliver partial packets to UDP/TCP (P328 [[computer networking]]), even though IP is considered an unreliable service. IP4 can detect missing/incomplete IP-datagram and will refuse to “release” it to upper layer (i.e. UDP/TCP).

  • If TCP doesn’t get one “segment” (a.k.a. IP-datagram), it will request retransmission
  • UDP does no retransmission
  • IP4 does no retransmission

[1] MTU for some hardware is lower than 1500 Bytes …

##functions(outside big4)using either rvr param or move()

Q: Any function(outside big4) with rvr param?
%%A: Such functions are rare. I don’t know any.
AA: [[effModernC++]] has a few functions taking rvr param, but fairly contrived as I remember.
AA: P544 [[c++primer]] says class methods could use rvr param
* eg: push_back()

 

Q: any function (outside big4) using std::move?

  • [[effModernC++]] has a few functions.
  • P544 [[c++primer]] says rarely needed
  • [[josuttis]] p20

 

tcp: detect wire unplugged

In general for either producer or consumer, the only way to detect peer-crash is by probing (eg: keepalive, 1-byte probe, RTO…).

  • Receivers generally don’t probe and will remain oblivious.
  • Sender will always notice a missing Ack (timeout OR 3 duplicate Acks). After retrying, TCP module will give up and generate SIGPIPE.
| scenario                                      | visible symptom                                        | retrans by sender | SIGPIPE |
| send/recv buffers full                        | 1-byte probe from sender triggers Ack containing AWS=0 | yes               | no |
| buffer full, then receiver-crash              | the same Ack suddenly stops coming                     | yes               | probably |
| receiver-crash, then sender has data to send  | very first expected Ack doesn’t come                   | yes               | yes |
| receiver-crash amid active transmission       | Ack was coming in, then suddenly stops coming          | yes               | probably |

Q20: if a TCP producer process dies After transmission, what would the consumer get?
AA: nothing. See http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html — Receiver is ready to receive data, and has no idea that sender has crashed.
AA: Same answer on https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html

Q21: if a TCP producer process dies During transmission, what would the consumer get?
%A: ditto. Receiver has to assume sender stopped.


Q30: if a TCP consumer process dies during a quiet period After transmission, what would the producer get?
AA: P49 [[tcp/ip sockets in C]] Sender doesn’t know right away. At the next send(), sender will get -1 as return value. In addition, SIGPIPE will also be delivered, unless configured otherwise.

Q30b: Is SIGPIPE generated immediately or after some retries?
AA: https://superuser.com/questions/911808/what-happens-to-tcp-connections-when-i-remove-the-ethernet-cable/911829 describes Ack and re-transmission. Sender will notice a missing Ack and RTO will kick in.
%A: I feel TCP module will Not give up prematurely. Sometimes a wire is quickly restored during ftp, without error. If wire remains unplugged it would look exactly like a peer crash.

Q31: if a TCP consumer dies During transmission, what would the producer get?
%A: see Q30.

Q32: if a TCP consumer process dies some time after buffer full, what would the producer get?
%A: probably similar to above, since sender would send a 1-byte probe to trigger an Ack. Not getting the Ack tells sender something. This probe is builtin and mandatory, but functionally similar to (the optional) TCP Keepalive feature

I never studied these topics but they are quite common.

Q: same local IP@2 worker-sockets: delivery to who

Suppose two TCP server worker-sockets both have local address 10.0.0.1 port 80. Both connections active.

When a packet comes addressed to 10.0.0.1:80, which socket will the kernel deliver it to? Not both.

Not sure about UDP, but a TCP connection is a virtual circuit or “private conversation”, so socket W1 knows the client is 11.1.1.2:8701. If the incoming packet doesn’t have a source IP:port matching that, then this packet doesn’t belong to the conversation.

simultaneous send to 2 tcp clients #multicast emulation

Consider a multi-core machine hosting either a forking or multi-threaded tcp server. The accept() call would return twice with file descriptors 5 and 6 for two new-born worker sockets. Both could have the same server-side address, and their local ports are definitely identical to the listening port, like 80.

There will be 2 dedicated threads (or processes) serving file descriptors 5 and 6, so they can both send the same data simultaneously. The two data streams will not be exactly in-sync because the two threads are uncoordinated.

My friend Alan confirmed this is possible. Advantages:

  • can handle slow and fast receivers at different data rates. Each tcp connection maintains its own state
  • guaranteed delivery

For multicast, a single thread would send just a single copy to the network. I believe the routers in between would create copies for distribution. Advantages:

  • Efficiency — Router is better at this task than a host
  • throughput — tcp flow control is a speed bump

 

new socket from accept() inherits port# from listen`socket

[[tcp/ip sockets in C]] P96 has detailed diagrams, but this write-up is based on https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.bpxbd00/accept.htm

Background — a socket is an object in memory, with dedicated buffers.

We will look at an sshd server listening on port 22. When accept() returns without error, the return value is a positive integer file descriptor pointing to a new-born socket object with its own buffers. I call it the “worker” socket. It inherits almost every property from the listening socket (but see tcp listening socket doesn’t use a lot of buffers for differences.)

  • local (i.e. server-side) port is 22
  • local address is the single address that the client specified as its destination. Note the sshd server could listen on ANY address, or listen on a single address.
  • … so if you look at the local address:port, the worker socket and the listening socket may look identical, but they are two objects!
  • remote port is some random high port client host assigned.
  • remote address is … obvious
  • … so if you compare the worker-socket vs the listening socket, they have
    • identical local port
    • possibly different local ip (when the listening socket binds to ANY)
    • different remote ip
    • different remote port

tcp listening socket doesn’t use a lot of buffers, probably

The server-side “worker” socket relies on the buffers. So does the client-side socket. But the listening socket probably doesn’t need the buffers, since it exchanges very little data with clients.

That’s one difference between listening socket vs worker socket

2nd difference is the server (i.e. local) address field. Worker socket has a single server address filled in. The listening socket often has “ANY” as its server address.

non-forking STM tcp server

This is a simple design. In contrast, I proposed a multiplexing design for a non-forking single-threaded tcp server in accept()+select() : multiple persistent worker-sockets

ObjectSpace manual P364 shows a non-forking single-threaded tcp server. It exchanges some data with each incoming client, immediately disconnects the client and goes back to accept()

Still, when accept() returns, it returns with a new “worker” socket that’s already connected to the client.

This new “worker” socket has the same port as the listening socket.

Since the worker socket object is a local variable in the while() loop, it goes out of scope and gets destructed right away.

y tcp server listens on ANY address

Background — a server host owns multiple addresses, as shown by ifconfig. There’s usually an address for each network “interface”. (Routing is specified per interface.) Suppose we have

10.0.0.1 on IF1

192.168.0.2 on IF2

By default, an Apache server would listen on 10.0.0.1:80 and 192.168.0.2:80. This allows clients from different network segments to reach us.

If the Apache server only listens on 192.168.0.2:80, then clients who can only reach 10.0.0.1:80 (perhaps due to routing) would be cut off!

Ack in FIX #TCP

Does TCP require an Ack for every segment? No — see cumulative acknowledgement in [[computerNetworking]]

FIX protocol standard doesn’t require Ack for every message.

However, many exchanges’ specs require an Ack message for {order submission OR execution report}. At the TrexQuant design interview, I dealt with this issue —

  • –trader sends an order and expecting an Ack
  • exchange sends an Ack and expecting another Ack … endless cycle
  • –exchange sends an exec report and expecting an Ack
  • trader sends an Ack waiting for another Ack … endless cycle
  • –what if at any point trader crashes? Exchange would assume the exec report is not acknowledged?

My design minimizes duty of the exchange —

  • –trader sends an order and expecting an Ack
  • exchange sends an Ack and assumes it’s delivered.
  • If trader misses the Ack she would decide either to resend the order (with PossResend flag) or query the exchange
  • –exchange sends an exec report and expecting an Ack. Proactive resend (with PossResend) when appropriate.
  • trader sends an Ack and assumes it’s delivered.
  • If the exchange doesn’t get any Ack, it would force a heartbeat with TestRequest. If that too goes unanswered, the exchange assumes the trader is offline.

move^forward on various reference types

Suppose there are 3 overloads of a sink() function, each with a lvr, rvr or nonref parameter.

(In reality these three sink() functions will not compile together due to ambiguity. So I need to remove one of the 3 just to compile.)

We pass string objects into sink() after going through various “routes”. In the table’s body, if you see “passing on as lvr” it means compiler picks the lvr overload.

| thingy passed through move/fwd ↴                 | route: as-is          | route: move()      | route: fwd without move() | route: fwd+move |
| output from move() (without any named variable)  | pbclone or pbRvr?     | should be fine     | pbref: passing on an lvr  | pbRvr |
| natural temp (no named variable)                 | pbclone or pbRvr?     | should be fine     | pbref: passing on an lvr  | pbRvr |
| named rvr variable from move() ! [2]             | pbref or pbclone [1]  | pbRvr              | pbref?                    | |
| lvr variable                                     | pbref or pbclone      | passing on an rvr  | pbref?                    | should never occur |
| nonref variable                                  | pbref or pbclone      | passing on an rvr  | pbref?                    | should never occur |

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp
https://github.com/tiger40490/repo1/blob/cpp1/cpp1/mv_fwd.cpp are my experiments. Take-aways:

— when a non-template function has an rvr param like f(MyType && s), we know the object behind s is a naturally-occurring temporary object (OR someone explicitly marked a non-temp as no-longer-in-use), but in order to call the move-ctor of MyType, we have to call MyType(move(s)). Otherwise, to my surprise, s is treated as an lvr and we end up calling the MyType copy-ctor. [1]

  • with a real rvr (not a universal ref), you either directly steal from it, or pass it to another Func2 via Func2(move(rvrVar)). Func2 is typically mv-ctor (including assignment)

[2] Surprise Insight: Any time you save the output of move() in a Named rvr variable like q(string && namedRvrVar), this variable itself has a Location and is a l-value expression, even though the string object is now marked as abandoned. Passing this variable would be passing a lvr!

  • If you call move(), you must not save move() output to a named variable! This rule is actually consistent with…
  • with a naturally-occurring temp like string(“a”)+”1″, you must not save it to named variable, or it will become non-temp object.

— std::forward requires a universal reference i.e. type deduction.

If I call factory<MyType>(move(mytype)), there’s no type deduction. This factory<MyType>(..) is a concretized function and is equivalent to factory(MyType && s). It is fundamentally different from the function template. It can Only be called with a true rvr. If you call it like factory<MyType>(nonrefVar), the compiler will complain no-matching-function, because the nonref argument can only match an lvr parameter or a by-value parameter.

https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-api-4.6/a01033_source.html shows that every time any function gets a variable var1 as a universal ref or a true rvr, we always apply either move(var1) or forward(var1) before passing to another function. I think as-is passing would unconditionally pass var1 as a lvr.

c++matrix using deque@deque #python easier

My own experiment https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/spiral_FB.cpp shows

  • had better default-populate with zeros. Afterwards, you can easily overwrite individual cells without bound check.
  • it’s easy to insert a new row anywhere. Vector would be inefficient.
  • To insert a new column, we need a simple loop

For python, # zero-initialize a 5-row, 8-column matrix:
width, height = 8, 5
Matrix = [[0 for x in range(width)] for y in range(height)]

In any programming language, the underlying data structure is a uniform pile-of-horizontal-arrays, therefore it’s crucial (and tricky) to understand indexing. It’s very similar to matrix indexing — Mat[0,1] refers to first row, 2nd element.

Warning: 2d array is hard to pass in c++, based on personal experience 😦 You often need to specify the size in the receiving function declaration. Even if this is feasible, it’s unwanted legwork.

[1] Warning — The concept of “column” is mathematical (matrix) and non-existent in our implementation, therefore misleading! I will avoid any mention of it in my source code. No object no data structure for the “column”!

[2] Warning — Another confusion due to mathematics training. Better avoid Cartesian coordinates. Point(4,1) is on the 2nd row, 5th item, therefore arr[1][4] — so you need to swap the subscripts.

|                           | 1st subscript               | 2nd subscript |
| max subscript             | 44                          | 77 |
| height #rowCnt            | 45 #not an index value <==  | |
| width #rowSz [1]          |                             | ==> 78 #not an index value |
| example value             | 1 picks 2nd row             | 4 picks 5th item in the row |
| variable name             | rowId, whichRow             | subId [1] |
| Cartesian coordinate [2]  | y=1 (Left index)            | x=4 |

ValueType default ctor needed in std::map[]

Gotcha — Suppose you define a custom Value type for your std::map and you call mymap[key] = Value(…). There’s an implicit call to the no-arg ctor of Value class!

If you don’t have a no-arg ctor, then avoid operator[](). Use insert(make_pair()) and find(key) methods.

Paradox — for both unordered_map and map, a lookup HIT also requires a no-arg ctor to be available. Reason is,

  • since your code uses map operator[], compiler will “see” the template member function operator[]. This function explicitly calls the ctor within an if-block.
    • If you have any lookup miss, this ctor is called at run time
    • If you have no lookup miss, this ctor is never called at run time
    • compiler doesn’t know if lookup miss/hit may happen in the future, so both branches are compiled into a.out
  • If your source code doesn’t use operator[], then at compile time, the template function operator[] is not included in a.out, so you don’t need the default ctor.

 

Y allocate static field in .c file %%take

Why do we have to define the static field myStaticInt in a cpp file?

For a non-static field myInt, the allocation happens when the class instance is allocated on stack, on heap (with new()) or in global area.

However, myStaticInt isn’t taken care of. It’s not on the real estate of the new instance. That’s why we need to declare it in the class header, and then define it exactly once (ODR) in a cpp file. It is allocated at compile time — static allocation.

RVO^move : on return value

Let’s set the stage. A function returns a local Trade object “myTrade” by value. Will RVO kick in, or move semantics? Not both!

I had lots of confusions about these 2 features. [[effModernC++]] P176 has a long discussion and an advice — do not write std::move() hoping to “help” the compiler on a local object being returned from a function

  • If the local object is eligible for RVO then all compilers would elide the copy. Your std::move() would hinder the compiler and backfire
  • if the local object is ineligible for RVO then compilers are required to treat the returned object as an rvalue, implicitly as if by std::move(), so your help is unneeded.
    • Note a local object returned by value is a naturally-occurring temp object.

P23 [[c++stdLib]] gave 2-line answer:

  1. if Trade class has a suitable copy or move ctor, then compiler may choose to “elide the copy”. This was long implemented as RVO optimization in most compilers before c++11. https://github.com/tiger40490/repo1/blob/cpp1/cpp/rvr/moveOnlyType_pbvalue.cpp is my experiment.
  2. otherwise, if Trade class has a move ctor, the myTrade object is robbed

So if condition for RVO is present, then most likely your move-ctor will NOT run.

c++compiler select`move-ctor

This is a __key__ part of understanding move-semantics, seldom quizzed. Let’s set the stage:

  • you overload a traditional insert(Amount const &)  with a move version insert(Amount &&)
    • by the way, Derrick of TrexQuant introduced a 3rd alternative emplace()
  • without explicit std::move, you pass in an argument into insert()
  • (For this example, I want to keep things simple by avoiding constructors, but the rules are the same.)

Q1: When would the compiler select the rvr version?

P22 [[c++stdLib]] has a limited outline. Here’s my illustration

  • if I pass in a temporary like insert(originalAmount + 15), then this argument is an rvalue obj, so the rvr version is selected
  • if I pass in a regular variable like insert(originalAmount), then this argument is an lvalue obj so the traditional version is selected
  • … See also my dedicated blogpost in c++11overload resolution TYPE means..

After we are clear on Q1, we can look at Q2

Q2: how would std::move help?
A: insert(std::move(originalAmount)); // if we know the object behind originalAmount is no longer needed.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp shows when we need to use std::move() and when we don’t need.

nearest-rank percentile definition

https://en.wikipedia.org/wiki/Percentile#The_nearest-rank_method has a few concise pointers

  • a percentile value is always an original object, never interpolated
  • we can think of it as a mapping function percentile(int100) -> an original object, where int100 data type can be a value from 1 to 100 (zero disallowed). Two distinct inputs can map to the same output object.
  • 100th percentile is always the largest object in the list. No such thing as 0th. 1st percentile is not always the smallest object.

private-header^shared-header

In our discussions on ODR, global variables, file-scope static variables, global functions … the concept of “shared header” is often misunderstood.

  • If a header is only included in one *.cpp, then its content is effectively part of a *.cpp.

Therefore, you may experiment by putting “wrong” things in such a private header and the set-up may work or fail, but it’s an invalid test. Your test is basically putting those “wrong” things in an implementation file!

 

STL iterator invalidation rules, succinctly

http://www.martinbroadhurst.com/iterator-invalidation-rules-for-c-containers.html has concise explanations. Specifically,

  • list insertions don’t invalidate any iterator. I feel iterator is a pointer to a node.
  • tree insertions don’t invalidate any iterator. Same reason as list.
  • for erasure from lists or trees, only the iterator to the erased node is invalidated.

Now for vector:

  • vector insertion invalidates any iterator positioned somewhere after the insertion point. If reallocation happens due to exceeding vector.capacity(), then all iterators are invalidated

tcp: one of 3 client-receivers is too slow

See also no overflow]TCP slow receiver #non-blocking sender

This is a real BGC interview question https://bintanvictor.wordpress.com/2017/04/08/bgc-iv-c-and-java/

Q: server is sending data fast. One of the clients (AA) is too slow.

Background — there will be 3 worker-sockets. The local address:port will look identical among them if the 3 clients connect to the same network interface, from the same network segment.

The set-up is described in simultaneous send to 2 tcp clients #mimic multicast

Note every worker socket for every client has identical local port.

See https://stackoverflow.com/questions/1997691/what-happens-when-tcp-udp-server-is-publishing-faster-than-client-is-consuming

I believe the AA connection/session/thread will be stagnant. At a certain point [1], the server will have to discard the (mounting) data queue and release memory — data loss for the AA client.

[1] can happen within seconds for a fast data feed.

I also feel this set-up overloads the server. A TCP server has to maintain state for each laggard client, assuming single-threaded multiplexing(?). If each client takes a dedicated thread then server gets even higher load.

Are 5 client Connections using 5 sockets on server? I think so. Can a single thread multiplex them? I think so.

## Y avoid blocking design

There are many contexts. I only know a few.

1st, let’s look at an socket context. Suppose there are many (like 500 or 50) sockets to process. We don’t want 50 threads. We prefer fewer, perhaps 1 thread to check each “ready” socket, transfer whatever data can be transferred then go back to waiting. In this context, we need either

  • /readiness notification/, or
  • polling
  • … Both are compared on P51 [[TCP/IP sockets in C]]

2nd scenario — GUI. Blocking a UI-related thread (like the EDT) would freeze the screen.

3rd, let’s look at some DB request client. The request thread sends a request and it would take a long time to get a response. Blocking the request thread would waste some memory resource but not really CPU resource. It’s often better to deploy this thread to other tasks, if any.

Q: So what other tasks?
A: ANY task, in the thread pool design. The requester thread completes the sending task, and returns to the thread pool. It can pick up unrelated tasks. When the DB server responds, any thread in the pool can pick it up.

This can be seen as a “server bound” system, rather than IO bound or CPU bound. Both the CPU task queue and the IO task queue get drained quickly.

 

no overflow]TCP slow receiver #non-blocking sender

Q: Does TCP receiver ever overflow due to a fast sender?

A: See http://www.mathcs.emory.edu/~cheung/Courses/455/Syllabus/7-transport/flow-control.html

A: It should not. When the receive buffer is full, the receiver advertises a zero window (AWS) to inform the sender. If the sender app keeps calling send(), the data will remain in the send buffer and not go over the wire. Soon the send buffer will fill up and send() will block. On a non-blocking TCP socket, send() returns with an error only when it can’t send a single byte. (UDP is different.)

Non-blocking send/receive operations either complete the job or return an error.

Q: Do they ever return with part of the data processed?
A: Yes they return the number of bytes transferred. Partial transfer is considered “completed”.

 

3rd effect@volatile in java5

A Wells Fargo java interviewer said there are 3 effects. I named

  1. load/store to main memory on the target variable
  2. disable statement reordering

I think the interviewer mentioned a 3rd effect, about memory barriers.

This quote from Java Concurrency in Practice, chap. 3.1.4 may be relevant:

The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable. So from a memory visibility perspective, writing a volatile variable is like exiting a synchronized block and reading a volatile variable is like entering a synchronized block.

— “volatile” on a MyClass field provides special concurrency protection when the MyClass field is constructed:

java reference-type volatile #partially constructed has good details.

https://stackoverflow.com/questions/9169232/java-volatile-and-side-effects addresses my doubt about “other writes“. Nialscorva’s answer echoes the interviewer:

Before java 1.5, the compiler can reorder the two steps

  1. construction of the new object
  2. assigning the new address to the variable

In such a scenario, other threads (unsynchronized) can see the address in the variable and use the incomplete object, while the construction thread is preempted indefinitely like for 3 hours!

So in java 1.5, the construction is among the “other writes” by the volatile-writing thread! Therefore, the construction is flushed to memory before the address assignment. Below is my own solution, using a non-static volatile field:

public class DoubleCheckSingleton {
	private static DoubleCheckSingleton inst = null;
	private volatile boolean isConstructed = false;
	private DoubleCheckSingleton() {
		/* other construction steps */
		this.isConstructed = true; //last step
	}
	static DoubleCheckSingleton getInstance() { //static, so callers need no instance
		if (inst != null && inst.isConstructed) return inst;
		synchronized(DoubleCheckSingleton.class) {
			if (inst != null && inst.isConstructed) return inst;

/**This design makes use of the volatile feature that's reputed to be java5
*
* Without the isConstructed volatile field, an incomplete object's
* address can be assigned to inst, so another thread entering getInstance()
* will see a non-null inst and use the half-cooked object :(
*
* The isConstructed check ensures the construction has completed
*/
			return inst = new DoubleCheckSingleton();
		}
	}
}

volume alone doesn’t make something big-data

The Oracle nosql book has these four “V”s to qualify any system as big data system. I added my annotations:

  1. Volume
  2. Velocity
  3. Variety of data format — If any two data formats account for more than 99% of your data in your system, then it doesn’t meet this definition. For example, FIX is one format.
  4. Variability in value — Does the system treat each datum equally?

Most of the so-called big-data systems I have seen don’t have these four V’s. All of them have some volume but none has the Variety or the Variability.

I would venture to say that

  • 1% of the big-data systems today have all four V’s
  • 50%+ of the big-data systems have no Variety no Variability
    • 90% of financial big-data systems are probably in this category
  • 10% of the big-data systems have 3 of the 4 V’s

The reason that these systems are considered “big data” is the big-data technologies applied. You may call it “big data technologies applied on traditional data”

See #top 5 big-data technologies

Does my exchange data qualify? Definitely high volume and velocity, but no Variety or Variability.

data science^big data Tech

The value-add of big-data (as an industry or skillset) == tools + models + data

  1. If we look at 100 big-data projects in practice, each one has all 3 elements, but 90-99% of them would have limited value-add, mostly due to .. model — exploratory research
    1. data mining probably uses similar models IMHO but we know its value-add is not so impressive
  2. tools — are mostly software but also include cloud.
  3. models — are the essence of the tools. Tools are invented, designed mostly for models. Models are often theoretical. Some statistical tools are tightly coupled with the models…

Fundamentally, the relationship between tools and models is similar to Quant library technology vs quant research.

  • Big data technologies (acquisition, parsing, cleansing, indexing, tagging, classifying..) is not exploratory. It’s more similar to database technology than scientific research.
  • Data science is an experimental/exploratory discovery task, like other scientific research. I feel it’s somewhat academic and theoretical. As a result, salary is not comparable to commercial sectors. My friend Jingsong worked with data scientists in Nokia/Microsoft.

The biggest improvements in recent years are in … tools

The biggest “growth” over the last 20 years is in data. I feel user-generated data is dwarfed by machine generated data

data mining^big-data

Data mining has been around for over 20 years (since before 1995). The most visible and /compelling/ value-add in big-data always involves some form of data mining, often using AI including machine-learning.

Data mining is The valuable thing that customers pay for, whereas Big-data technologies enhance the infrastructure supporting the mining

https://www.quora.com/What-is-the-difference-between-the-concepts-of-Data-Mining-and-Big-Data has a /critical/ and concise comment. I modified it slightly for emphasis.

Data mining involves finding patterns from datasets. Big data involves large scale storage and processing of datasets. So combining both, data mining done on big data(e.g, finding buying patterns from large purchase logs) is getting lot of attention currently.

NOT all big data tasks are data mining ones (e.g., large scale indexing).

NOT all data mining tasks are on big data (e.g., data mining on a small file which can be performed on a single node). However, note that wikipedia (as on 10 Sept. 2012) defines data mining as “the process that attempts to discover patterns in large data sets”.

(latency) DataGrid^^noSQL (throughput)

  • Coherence/Gemfire/gigaspace are traditional data grids, probably distributed hashmaps.
  • One of the four categories of noSQL systems is also a distributed key/value hashmaps, such as redis
  • …. so what’s the diff?

https://blog.octo.com/en/data-grid-or-nosql-same-same-but-different/ has an insightful answer — DataGrids were designed for latency; noSQL were designed for throughput.

I can see the same trade-off —

  • FaceBook’s main challenge/priority is fanout (throughput)
  • RTS’s main challenge is throughput — TPS, measured in messages per second
  • HFT main challenge is nanosec latency.

For a busy exchange, latency and throughput are both important but if they must pick one? .. throughput. I think most systems designers would lean towards throughput if data volume becomes unbearable.

I guess a system optimized for latency may be unable to cope with such volume. The request queue would probably overflow … loss of data

In contrast, a throughput-oriented design would still stay alive under unusual data volume. I feel such designs are likely more HA and more fault-tolerant.

c++debug^release build can modify app behavior #IV

This was actually asked in an interview, but it’s also good GTD knowledge.

https://stackoverflow.com/questions/4012498/what-to-do-if-debug-runs-fine-but-release-crashes points out —

  • fewer uninitialized variables — Debug mode is more forgiving because it is often configured to initialize variables that have not been explicitly initialized.
    • For example, Perhaps you’re deleting an uninitialized pointer. In debug mode it works because pointer was nulled and delete ptr will be ok on NULL. On release it’s some rubbish, then delete ptr will actually cause a problem.

https://stackoverflow.com/questions/186237/program-only-crashes-as-release-build-how-to-debug points out —

  • guard bytes on the stack frame — The debugger puts more on the stack, so you’re less likely to overwrite something important.

I had frequent experience reading/writing beyond an array limit.

https://stackoverflow.com/questions/312312/what-are-some-reasons-a-release-build-would-run-differently-than-a-debug-build?rq=1 points out —

  • relative timing between operations is changed by debug build, leading to race conditions

Echoed on P260 [[art of concurrency]] which says (in theory) it’s possible to hit threading error with optimization and no such error without optimization, which represents a bug in the compiler.

P75 [[moving from c to c++]] hints that compiler optimization may lead to “critical bugs” but I don’t think so.

  • poor use of assert can have side effects that exist only in the debug build — release builds define NDEBUG, which strips assertions out entirely. See the sketch below.

array^pointer variables types: indistinguishable

  • int i; // a single int object
  • int arr[]; //a nickname of the starting address of an array, very similar to a pure-address const pointer
  • int * const constPtr;
  • <— above two data types are similar; below two data types are similar —->
  • int * pi; //a regular pointer variable,
  • int * heapArr = new int[9]; //data type is same as pi

q[less] functor ^ operator<() again, briefly

[[effSTL]] P177 has more details than I need. Here are a few key points:

std::map and std::set — by default uses less<Trade>, which often uses a “method” operator<() in the Trade class

  • If you omit this operator, you get verbose STL build error messages about a missing operator<()
  • this operator<() must be a const method, otherwise you get lengthy STL build errors.
  • See https://stackoverflow.com/questions/1102392/stdmaps-with-user-defined-types-as-key
  • The other two (friendly) alternatives are
    • function pointer — easiest choice for a quick coding test
    • binary functor class with an operator()(Trade, Trade) — more complex but most efficient; best practice.

 

j4 factory: java^c++ #Wells

A Wells Fargo interviewer asked

Q6: motivation of factory pattern?
Q6b: why prevent others calling your ctor?

  • %%A: some objects are expensive to construct (DbConnection) and I need tight control.
  • %%A: similarly, after construction, I often have some initialization logic in my factory, but I may not be allowed to modify the ctor, or our design doesn’t favor doing such things in the ctor. I make my factory users’ life easier if they don’t call new() directly.
  • AA: more importantly, the post-construction function could be virtual! This is a sanctioned justification for c++ factory on the authoritative [[c++codingStd]] P88
  • %%A: I want to (save the new instance and) throw exception in a post-construction routine, but I don’t want to throw from ctor
  • %%A: I want to centralize the non-trivial business rule of selecting target types to return. Code reuse rather than duplication
  • %%A: a caching factory
  • %%A: there are too many input parameters to my ctor and I want to provide users a simplified façade

cross-currency equity swap: %%intuition

Trade 1: At Time 1, CK (a hedge fund based in Japan) buys one share of GE priced at USD 10, paying JPY 1000. Eleven months later, GE is still at USD 10, which is now worth JPY 990. CK faces a paper loss due to FX. I will treat USD as the asset currency. CK bought 10 greenbacks at 100 yen each and now each greenback is worth 99 yen only.

Trade 2: a comparable single-currency eq-swap trade

Trade 3: a comparable x-ccy swap. At Time 1, the dealer (say GS) buys and holds GE on client’s behalf.

(It is instructive to compare this to Trade 2. The only difference is the FX.)

In Trade 3, how did GS pay to acquire the share? GS received JPY 1000 from CK and used it to get [1] 10 greenbacks to pay for the stock.

Q: What (standard) solutions do GS have to eliminate its own FX risk and remain transparent to client? I think GS must pass on the FX risk to client.

I think in any x-ccy deal with a dealer bank, this FX risk is unavoidable for CK. The bank always avoids the FX risk and transfers it to the client.

[1] GS probably bought USDJPY on the street. Who GS bought from doesn’t matter, even if that’s another GS trader. For an illiquid currency, GS may not have sufficient inventory internally. Even if GS has inventory under trader Tom, Tom may not want to Sell the inventory at the market rate at this time. Client ought to get the market rate always.

After the FX trade, GS house account is long USDJPY at price 100 and GS want USD to strengthen. If GS effectively passes on the FX risk, then CK would be long USDJPY.

I believe GS need to Sell USDJPY to CK at price 100, to effectively and completely transfer the FX risk to client. In a nutshell, GS sells 10 greenbacks to CK and CK uses the 10 greenbacks to enter an eq-swap deal with GS.

GS trade system probably executes two FX trades

  1. buy USDJPY on street
  2. sell USDJPY to CK

After that,

  • GS is square USDJPY.
  • CK is Long USDJPY at price 100. In other words, CK wants USD to strengthen.

I believe the FX rate used in this trade must be communicated to CK.

Eleven months later, GS hedge account has $0 PnL since GE hasn’t moved. GS FX account is square. In contrast, CK suffers a paper loss due to FX, since USD has weakened.

As a validation (as I instructed my son), notice that this outcome is identical to the traditional trade, where CK buys USDJPY at 100 to pay for the stock. Therefore, this deal is fair deal.

Q: Does GS make any money on the FX?
A: I don’t think so. If they do, it’s Not by design. By design, GS ought to Sell USDJPY to client at fair market price. “Fair” implies that GS could only earn bid/ask spread.

once algo cracked, syntax/idiom = trivial skills@@ NO

Background — Let’s focus on short coding tests , compiler-based.

Q: If it takes 2Y to improve from 3 to 6, then what percentage of the effort is syntax?
A: less than 50%, but it can be as short as a few days. ECT is harder.

In java or python, my coding test track record is better. The syntax know-how required is relatively small, because the questions use only array, hash table, linked nodes, StringBuilder/stringstream… Look at EPI300.

C++ feels similar. Once the algo (and data structures) is cracked, the syntax know-how required consist of about 500 common operations. (I have some of it listed in my blog…)

However, once I have a decent idea, it often takes me 20 – 60 minutes to pass the basic tests, but I am given only 30 minutes! Why am I so slow? Syntax is sometimes non-trivial, and also ECT, even without debugging the corner cases.

Some non-trivial syntax + idioms needed in speed coding — ##eg@syntax learning via coding drill

##shared_ptr thr-safety: 3 cases

This topic is worthwhile as it is about two high-value topics … threading + smart ptr

At least 3 interviewers pointed out thread safety issues …

http://stackoverflow.com/questions/14482830/stdshared-ptr-thread-safety first answer shows many correct MT usages and incorrect MT usages. Looking at that answer, I see at least 3 distinct “objects” that could be “shared-mutable”:

  1. control block — shared by different club members. Any one club member, like global_instance could be a shared mutable object. Concurrent access to the control block is internally managed by the shared_ptr implementation and probably thread-safe.
  2. pointee on heap — is shared mutable. If 2 threads call mutator methods on this object, you can hit race condition.
  3. global_instance variable —  a shared mutable instance of shared_ptr. race condition 😦


BFT^DFT, pre^in^post-order #trees++

Here are my own “key” observations, possibly incomplete, on 5 standard tree walk algos, based on my book [[discrete math]]

  • in-order is defined for binary trees only. Why? in-order means “left-subtree, parent, right-subtree”, implying only 2 subtrees.
  • In contrast, BFS/DFS are more versatile. Their input can be any graph. If not a tree, then BFS/DFS could produce a tree — you first choose a “root” node arbitrarily.
  • The 3 *-order walks are recursively defined. I believe they are three specializations of DFS. See https://www.cs.bu.edu/teaching/c/tree/breadth-first/ and https://en.wikipedia.org/wiki/Tree_traversal
  • The difference of the 3 *-order walks are visually illustrated via the black dot on  https://en.wikipedia.org/wiki/Tree_traversal#Pre-order_(NLR)

—-That’s a tiny intro. Below are more advanced —

BFT uses an underlying queue. In contrast, DFS is characterized by backtracking, which requires a stack. However, the stack can be an explicit data structure rather than the call stack — no recursion needed.

BFT visits each node twice; DFT visits each node at least twice — opening (to see its edges) and leaving (when all edges are explored)

DFT backtrack in my own language —

  1. # start at root.
  2. At each node, descend into first [1] subtree
  3. # descend until a leaf node A1
  4. # backtrack exactly one level up to B
  5. # descend into A1’s immediate next sibling A2’s family tree (if any) until a leaf node. If unable to move down (i.e. no A2), then move up to C.

Visually — if we assign a $val to each node such that these nodes form a BST, then we get the “shadow rule”: in-order DFS visits the nodes in ascending order by $val.

[1] the child nodes, of size 1 or higher, at each node must be remembered in some order.

ODR@functions # and classes

Warning – ODR is never quizzed in IV. A weekend coding test might touch on it but we can usually circumvent it.

OneDefinitionRule is more strict on global variables (which have static duration). You can’t have 2 global variables sharing the same name. Devil is in the details:

(As explained in various posts, you declare the same global variable in a header file that’s included in various compilation units, but you allocate storage in exactly one compilation unit. Under a temporary suspension of disbelief, let’s say there are 2 allocated storage for the same global var, how would you update this variable?)

With a free function f1(), ODR is more relaxed. http://www.drdobbs.com/cpp/blundering-into-the-one-definition-rule/240166489 (possibly buggy) explains the Lesser ODR vs the Greater ODR. Lesser ODR is simpler and more familiar, forbidding multiple (same or different) definitions of f1() within one compilation unit.

My real focus today is the Greater ODR. Obeying the Lesser ODR, the same static or inline function is often included via a header file and compiled into multiple binary files. If you want to put a non-template free function definition in a shared header file but avoid Greater ODR violations, then it must be static or inline, implicitly or explicitly. I find the Dr Dobbs article unclear on this point — in my test, when a free function was defined in a shared header without “static” or “inline” keywords, the linker screamed “ODR!”

The most common practice is to move function definitions out of shared headers, so the linker (or compiler) sees only one definition globally.

With inline, the linker actually sees multiple physical copies of func1(), hopefully identical definitions. If they actually have different definitions, the compiler/linker can’t easily notice and is not required to verify, so there is no build error (!) but you could hit strange run time errors.

The java linker is simpler and never causes any problems, so I never looked into it.

//* if I have exactly one inline, then the non-inlined version is used. 
// Linker doesn't detect the discrepancy between two implementations.
//* if I have both inline, then main.cpp won't compile since both definitions 
// are invisible to main.cpp
//* if I remove both inline, then we hit ODR 
//* objdump on the object file would show the function name 
// IFF it is exposed i.e. non-inline
::::::::::::::
lib1.cpp
::::::::::::::
#include <iostream>
using namespace std;

//inline
void sameFunc(){
    cout<<"hi"<<endl;
}
::::::::::::::
lib2.cpp
::::::::::::::
#include <iostream>
using namespace std;

inline
void sameFunc(){
    cout<<"hey"<<endl;
}
::::::::::::::
main.cpp
::::::::::::::
void sameFunc(); //needed
int main(){
  sameFunc();
}

 

Sliding-^Advertised- window size

https://networklessons.com/cisco/ccnp-route/tcp-window-size-scaling/ has real life illustration using wireshark.

https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.0/com.ibm.itsm.perf.doc/c_network_sliding_window.html

https://web.cs.wpi.edu/~rek/Adv_Nets/Spring2002/TCP_SlidingWindows.pdf

  • AWS = amount of free space on receive buffer
    • This number, along with the ack seq #, is sent from receiver to sender
  • lastSeqSent and lastSeqAcked are two control variables in the sender process.

lastSeqSent/lastSeqAcked indicate positions within a stream; AWS is a single integer (a size).

Q: how are the two variables updated during transmission?
A: When an Ack for packet #9183 is received, sender updates its control variable “lastSeqAcked”. It then computes how many more packets to send, ensuring that “lastSeqSent – lastSeqAcked < AWS”

  • SWS (sliding window size) = lastSeqSent – lastSeqAcked = amount of transmitted but unacknowledged bytes, is a derived control variable in the sender process, like a fluctuating “inventory level”.
  • SWS is a concept; AWS is a TCP header field
  • receive buffer size — is sometimes incorrectly referred as window size
  • “window size” is vague but usually refers to SWS
  • AWS is probably named after the sender’s sliding window.
  • receiver/sender — only these players control the SWS, not the intermediate routers etc.
  • too large — a large SWS presupposes a large receive buffer, hence a large AWS
  • too small — underutilized bandwidth. As explained in linux tcp buffer^AWS tuning, high bandwidth connections should use larger AWS.

Q: how are AWS and SWS related?
A: The sender adjusts lastSeqSent (and SWS) based on the feedback of AWS and lastSeqAcked.

c++variables: !! always objects

Every variable that holds data is an object. Objects are created either with static duration (sometimes by defining rather than declaring the variable), with automatic duration (declaration alone) or with dynamic duration via malloc().

That’s the short version. Here’s the long version:

  • heap objects — have no name, no host variable, no door plate. They only have addresses. The address could be saved in a “pointer-object”, which is a can of worm.
    • In many cases, this heap address is passed around without any pointer object.
  • stack variables (including function parameters) — each stack object has a name (multiple possible?) i.e. the host variable name, like a door plate on the memory location. This memory is allocated when the stack frame is created. When you clone a stack variable you get a cloned object.
    • Advanced — You could create a reference to the stack object, when you pass the host variable by-reference into another function. However, you should never return a stack variable by reference.
  • static Locals — the name myStaticLocal is a door plate on the memory location. This memory is allocated the first time this function is run. You can return a reference to myStaticLocal.
  • file-scope static objects — memory is allocated at program initialization, but if you have many of them scattered in many files, their order of initialization is unpredictable. The name myStaticVar is a door plate on that memory location, but this name is visible only within this /compilation-unit/. If you declare and define it (in one step!) in a shared header file (bad practice) then you get multiple instances of it:(
  • extern static objects — Again, you declare and define it in one step, in one file — ODR. All other compilation units  would need an extern declaration. An extern declaration doesn’t define storage for this object:)
  • static fields — are tricky. The variable name is there after you declare it, but it is a door plate without a door. It only becomes a door plate on a storage location when you allocate storage i.e. create the object by “defining” the host variable. There’s also one-definition-rule (ODR) for static fields, so you first declare the field without defining it, then you define it elsewhere. See https://bintanvictor.wordpress.com/2017/05/30/declared-but-undefined-variable-in-c/

Note: thread_local is a fourth storage duration, after 1) dynamic, 2) automatic and 3) static

C++build error: declared but undefined variable

I sometimes declare a static field in a header, but fail to define it (i.e. give it storage). It compiles fine and may even link successfully. When you run the executable, you may hit

error loading library /home/nysemkt_integrated_parser.so: undefined symbol: _ZN14arcabookparser6Parser19m_srcDescriptionTknE

Note this is a shared library.
Note the field name is mangled. You can un-mangle it using c++filt:

c++filt _ZN14arcabookparser6Parser19m_srcDescriptionTknE -> arcabookparser::Parser::m_srcDescriptionTkn

According to Deepak Gulati, the binary files only contain mangled names. The linker and all subsequent programs deal exclusively with mangled names.

If you don’t use this field, the undefined variable actually will not bother you! I think the linker never needs to resolve a symbol that is never referenced.
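
A minimal reproduction sketch (hypothetical names, in the style of the error above):

//--------- parser.h ---------
#include <string>
struct Parser {
  static std::string m_srcDescriptionTkn; // declaration only: a door plate without a door
};
// Missing in every *.cpp:
//   std::string Parser::m_srcDescriptionTkn; // the definition i.e. storage
// Compiling (and building the shared library) can succeed; the first use at
// load/run time triggers the "undefined symbol" error.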

GTD/KPI/zbs/effi^productivity defined#succinctly

  • zbs — real, core strength (内功) of the tech foundation for GTD and IV. Zbs (not GTD) is required of an architect or lead developer, and for west coast interviews.
  • GTD — make the damn thing work. LG of quality, code smell, maintainability etc
  • KPI — boss’s assessment often uses productivity as the most important underlying KPI, though they won’t say it.
  • productivity — GTD level as measured by __manager__, at a higher level than “effi”
  • effi — a lower level measurement than “Productivity”

c++class field defined]header,but global vars obey ODR

Let’s put function declaration/definition aside — simpler.

Let’s put aside local static/non-static variables — different story.

Let’s put aside function parameters. They are like local variables.

The word “static” is heavily overloaded and confusing. I will try to avoid it as far as possible.

The tricky/confusing categories are

  • category: static field. Most complex and better discussed in a dedicated post — See https://bintanvictor.wordpress.com/2017/02/07/c-static-field-init-basic-rules/
  • category: file-scope var — i.e. those non-local vars with “static” modifier
  • category: global var declaration — using “extern”
    • definition of the same var — without “extern” or “static”
  • category: non-static class field, same as the classic C struct field <– the main topic in the post. This one is not about declaration/definition of a variable with storage. Instead, this is defining a type!

I assume you can tell a variable declaration vs a variable definition. Our intuition is usually right.

The Aha — [2] pointed out — a struct field listing is merely describing what constitutes a struct type, without actually declaring the existence of any variables, anything to be constructed in memory, anything addressable. Therefore, this listing is more like an integer variable declaration than a definition!

Q: So when is the memory allocated for this field?
A: when you allocate memory for an instance of this struct. The instance then becomes an object in memory. The field also becomes a sub-object.

The main purpose of keeping a struct definition in a header is that the compiler needs to calculate the size of the struct — a completely different purpose from function or object declarations in headers. Scott Meyers discussed this in-depth along with class fwd declaration and pimpl.
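
A two-line illustration (my own hypothetical Point type):

struct Point { int x; int y; }; // type description only -- no storage, nothing addressable
// Point p{1, 2};               // only now is memory allocated; p.x becomes a sub-object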


global^file-scope variables]c++ #extern

(Needed in coding tests and GTD, not in QnA interviews.)

Any object declared outside a block has “static duration” which means (see MSDN) “allocated at compile time not run time”

“Global” means extern linkage i.e. visible from other files. You globalize a variable by removing “static” modifier if any.

http://stackoverflow.com/questions/14349877/static-global-variables-in-c explains the 2+1 variations of non-local object. I have added my own variations:

  • A single “static” variable in a *.cpp file. “File-scope” means internal linkage i.e. visible only within the file. You make a variable file-scope by adding “static”.
  • an extern (i.e. global mutable) variable — I used this many times but still don’t understand all the rules. Basically in one *.cpp it’s defined, without “static” or “extern”. In all other *.cpp files, it’s declared (via header) extern, without an initial value.
  • A constant can be declared and defined in one shot as a static (i.e. file-scope) const. No need for extern and separate definition.
  • confusing — if you (use a shared header to) declare the same variable “static” in 2 *.cpp files, then each file gets a distinct file-scope mutable variable of the same name. Nonsense.

https://msdn.microsoft.com/en-us/library/s1sb61xd.aspx

http://en.wikipedia.org/wiki/Global_variable#C_and_C.2B.2B says “Note that not specifying static is the same as specifying extern: the default is external linkage” but I doubt it. (It is true for non-const namespace-scope variables; const variables default to internal linkage in c++.)

I guess there are 3 forms:

  • static double — file-scope, probably not related to “extern”
  • extern double — global declaration of a var already Defined in a single file somewhere else
  • double (without any modifier) — the single definition of a global var. Will break ODR if in a shared header

Note there’s no special rule about “const int”. The special rule is about const int static FIELD.

//--------- shared.h ---------
#include <iostream>
#include <string>
void modify();

extern std::string global; //declaration without storage allocation
static const int fileScopeConst1=3; //each implementation file gets a distinct copy of this const object
static double fileScopeMutable=9.8; //each implementation file gets a distinct copy of this mutable object
//double var3=1.23; //storage allocation. breaks compiler due to ODR!

// ------ define.C --------
#include "shared.h"
using namespace std;
string global("defaultValue"); //storage allocation + initialization
int main(){
  cout<<"addr of file scope const is "<<&fileScopeConst1<<std::endl;
  cout<<"addr of global var is "<<&global<<std::endl;
  cout<<"before modify(), global = "<<global<< "; double = "<<fileScopeMutable<<endl;
  modify();
  cout<<"after modify(), global = "<<global<< "; double (my private copy) = "<<fileScopeMutable<<endl;
}
// ------ modify.C --------
#include "shared.h"
void modify(){
  global = "2ndValue";
  fileScopeMutable = 700;
  std::cout<<"in modify(), double (my private copy) = "<<fileScopeMutable<<std::endl;
  std::cout<<"in modify(), addr of file scope const is "<<&fileScopeConst1<<std::endl;
  std::cout<<"in modify(), addr of global var is "<<&global<<std::endl;
}

Single-Threaded^STM #jargon

SingleThreadedMode (STM) means each thread operates without regard to other threads, as if there were no shared mutable data with other threads.

Single-Threaded can mean

  • no other thread is doing this task. We ensure there’s strict control in place. Our thread still needs to synchronize with other threads doing other tasks.
  • there’s only one thread in the process.

condition.signalAll Usually requires locking

Across languages,

  • notify() doesn’t strictly need the lock;
  • wait() always requires the lock.
    • c++ wait() takes the lock as argument
    • old jdk uses the same object as lock and conditionVar
    • jdk5 makes the lock beget the conditionVar

[[DougLea]] P233 points out that pthreads signal() doesn’t require the lock to be held

—- original jdk
100% — notify() must be called within a synchronized block, otherwise IllegalMonitorStateException. See https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#notify()

—- jdk 5 onwards
99% — https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html#signal() says that an implementation may (and typically does) require that the notifying thread hold the lock.

Not 100% strict.

— compare await():
100% — On the same page, the await() javadoc is 100% strict on this requirement!

====c++
0% — notify_all() method does NOT require holding the lock.  See http://en.cppreference.com/w/cpp/thread/condition_variable/notify_all

–compare wait()
100% — wait(myLock) requires a lock as argument!

====c#
100% — Similar to old JDK, the notifying thread must hold the lock. See https://msdn.microsoft.com/en-us/library/system.threading.monitor.pulseall(v=vs.110).aspx
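
A minimal c++ sketch of the asymmetry above (my own example): wait() takes the lock as an argument, while notify_all() legally runs after the lock is released.

#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex mtx;
std::condition_variable cv;
bool ready = false;

void consumer() {
  std::unique_lock<std::mutex> lk(mtx); // wait() requires this lock as argument
  cv.wait(lk, []{ return ready; });     // atomically releases the lock and sleeps
}

void producer() {
  { std::lock_guard<std::mutex> lk(mtx); ready = true; }
  cv.notify_all(); // legal without holding the lock -- the 0% case above
}

int main() {
  std::thread t(consumer);
  producer();
  t.join();
}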

 

Y more threads !! help`throughput if I/O bound

To keep things concrete, you can think of the network output interface as the I/O device in question.

The paradox — given an I/O bound busy server, the conventional wisdom says more threads could increase CPU utilization [1]. However, the work queue for the CPU gets quickly /drained/, whereas the I/O queue is constantly full, as the I/O subsystem is working at full capacity.

[1] In contrast, in a CPU-bound server, adding 20 threads will likely create 20 starved, mostly-idle new threads!

The holy grail is simultaneous saturation of both CPU and I/O. Suggestion: “steal” a cpu core from this engine and use it for unrelated tasks. Additional threads or processes basically achieve that purpose. In other words, the cpu cores aren’t dedicated to this purpose.

Assumption — adding more I/O hardware is not possible. (Instead, scaling out to more nodes could help.)

If the CPU cores are dedicated, then there’s no way to improve throughput without adding more I/O capacity. At a high level, I clearly see too much CPU /overcapacity/.

relative funding advantage paradox by Jeff

(Adapted from Jeff’s lecture notes. [[hull]] P156 example is similar.)
Primary-market financing available to borrowers AA and BB:

                   AA              BB
fixed rate         7%              7.5%            <= AA’s real advantage
floating rate      Libor + 1%      Libor + 1.24%
needs to borrow    floating        fixed
Note BB has lower credit rating and therefore higher fixed/floating interest costs. AA’s real, bigger advantage is “fixed”, BUT prefers floating. This mismatch is the key and presents a golden opportunity.
Paradoxically, regardless of L being 5% or 10% or whatever, AA and BB can both save cost by entering an IRS.
To make things concrete, suppose each needs to borrow $100K for 12M. AA prefers to break it into 4 x 3M loans. We forecast L in the near future around 6 ~ 6.5%.
— The strategy –
BB to pay 6.15% to, and receive Libor from, AA. So in this IRS, AA is floating payer.
Meanwhile, AA to borrow from Market fixed 7% (i.e. $7k interest) <= AA's advantage
Meanwhile, BB to borrow from the market L + 1.24% (i.e. $(L+1.24)K interest) <= BB's advantage
——-
To see the rationale, it’s more natural to add up the net INflow —
AA: -L + 6.15 - 7 = -L - 0.85. This is a saving of 15 bps (vs borrowing floating directly at L + 1%).
BB: L - 6.15 - L - 1.24 = -7.39. This is a saving of 11 bps (vs borrowing fixed directly at 7.5%).
Net net, AA pays floating (L+0.85%) and BB pays fixed (7.39%) as desired.
Notice in both markets AA enjoys preferential treatment, but the 2 “gaps” differ by 26 bps i.e. 50 bps (fixed) vs 24 bps (floating). AA and BB’s combined savings of 26 bps is exactly the difference between the gaps, and this combined saving is now shared between AA and BB.
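In symbols (my summary of the numbers above):

combined saving = (fixed-rate gap) − (floating-rate gap) = 50bps − 24bps = 26bps = 15bps (AA) + 11bps (BB)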
———————————————————————————————-
Fake [1] modified example:

                   AA              BB
fixed rate         7%              7.5%
floating rate      Libor + 1%      Libor + 1.74%   <= AA’s real advantage
needs to borrow    fixed           floating
— The strategy –
AA to pay 5.85% to and receive Libor from BB.
Meanwhile, BB to borrow from the market fixed 7.5%
Meanwhile, AA to borrow from the market L + 1% <= AA's advantage
Net inflow:
AA: L - 5.85 - L - 1 = -6.85, saving 15 bps (vs borrowing fixed directly at 7%)
BB: -L + 5.85 - 7.5 = -L - 1.65, saving 9 bps (vs borrowing floating directly at L + 1.74%)

[1] [[Hull]] P156 points out that the credit spread (AA – BB) reflects more in the fixed rate than the floating rate, so usually, AA’s advantage is in fixed. Therefore this modified example is fake.

———————————————————————————————-
The pattern? Re-frame the funding challenge — “2 companies have different funding needs and gang up to borrow $100K fixed and $100K floating in total, but only one half of it using AA’s preferential rates. The other half must use BB’s inferior rates.”
In the 2nd example, since AA’s advantage lies _more_ in floating market, AA’s floating rate is utilized. BB’s smaller disadvantage in fixed is accepted.
It matters less who prefers fixed, since it’s “internal” between AA and BB, like 2 sisters. In this case, since AA prefers something (fixed) other than its real advantage (floating), AA swaps them “in the family”. If AA were to prefer floating, i.e. matching her real advantage, then no swap would be needed.
Q: Why does AA need BB?
A: only when AA prefers something other than its real advantage. Without BB, AA must borrow in the market of its lesser advantage (the “fixed” rate market in this example), wasting its real advantage in the floating market.

python ‘global myVar’ needed where@@

Suppose you have a global variable var1 and you need to “use” it in a function f1()

Note Global basically means module-level. There’s nothing more global than that in python.

Rule 1a: to read any global variable in any function you don’t need “global”. I think the LGB rule applies.

Rule 1b: to call a mutator method on a global object, you don’t need “global”. Such an object can be a dict or a list or your custom object. Strings and integers have no mutators!

Rule 2: to rebind the variable to another object in memory (i.e. pointer reseat), you need the “global” declaration. Without it, the assignment silently creates a new local variable (and reading the name before that assignment raises UnboundLocalError).

Note you don’t need to declare the variable outside all functions. Just declare it global inside the rebinding functions

import package5 #(python)when package5 = directory

Python import statement has many trivial or non-trivial scenarios. Here are some common ones:

Common scenario 1: we “import modu2” to execute modu2.py and also expose all the “public” objects and functions in modu2 as modu2.something

(Simplest scenario but still not known to some programmers.)

Common scenario 2: if pack3 is a directory containing some python modules, we can
“import pack3.modu5” or “from pack3 import modu5”. Easy to understand.

Here’s a relatively unusual scenario:

./pip install --verbose --no-deps --index-url=http://ficlonapd05.macbank:9090/api/ pymodels-linux==0.0.1.480 # assuming 0.0.1.480 is the pymodels version to test

./python
>>> import pymodels
>>> pymodels.__file__
'/home/tcrmg/vtan7/myenv/lib/python2.6/site-packages/pymodels/__init__.pyc'
>>> pymodels.BlackImplyVol(.2, 3,3, 1, 0, 1,1)

In this case, there’s a pymodels directory but BlackImplyVol is a function in the compiled library ./pymodels/pymodels_imp.so. How is this function accessible?

A: when “import pymodels” is run, the ./pymodels/__init__.py script executes, which does  “from pymodels_imp import * “

struct packing^memory alignment %%experiment

Sound bite — packing is for structs, but alignment and padding are Not only for structs

Longer version — alignment and padding apply to Any object placement, whether that object is a field of a struct or an independent object on stack. In contrast, the packing technique is specific to a struct having multiple fields.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/memAlign.cpp has my test code to reveal the rules:

  1. For a (8-byte) long long object on stack (or in a struct), the address is 8-byte aligned. So padding is added by the compiler, unless you say “packed” on a struct.
  2. For a (4-byte) int object on stack (or in a struct), the address is 4-byte aligned.
  3. For a (2-byte) short object on stack (or in a struct), the address is 2-byte aligned.
  4. For a char object on stack (or in a struct), the address is 1-byte aligned. All memory address are 1-byte aligned, so compiler adds no padding.

http://stackoverflow.com/questions/11108328/double-alignment Wug’s answer echoes Ashish’s comment that tiny fields like char should be grouped together, due to Rule #4. This applies to stack layout as well [1]. However, compiler optimization can be subversive:

Not only can the compiler reorder the stack layout of the local variables, it can assign them to registers, assign them to live sometimes in registers and sometimes on the stack, it can assign two locals to the same slot in memory (if their live ranges do not overlap) and it can even completely eliminate variables.

[1] See https://stackoverflow.com/questions/1061818/stack-allocation-padding-and-alignment
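
Here’s an even smaller sketch than my test code above (note the packed attribute is gcc/clang-specific):

#include <iostream>

struct Padded { char c; long long x; };  // compiler pads c to 8-byte-align x (Rule #1)
struct __attribute__((packed)) PackedS { char c; long long x; }; // no padding

int main() {
  std::cout << sizeof(Padded)  << std::endl; // typically 16 on x86-64 (7 bytes padding)
  std::cout << sizeof(PackedS) << std::endl; // 9
}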

c++atomic^volatile #fencing

See other posts about volatile, in both c++ and java.

For a “volatile” variable, the c++ (unlike java) compiler is not required to introduce a memory fence, i.e. updates by one thread are not guaranteed to be visible globally. See Eric Lippert on http://stackoverflow.com/questions/26307071/does-the-c-volatile-keyword-introduce-a-memory-fence

based on [[effModernC++]]:

                                      atomic              volatile
optimization: reordering              prevented           permitted :(
atomicity of RMW operations           enforced            not enforced
optimization: redundant-access skip   permitted           prevented
cross-thread visibility of update     probably enforced   no guarantee [1]
memory fencing                        enforced            no guarantee

[1] https://en.wikipedia.org/wiki/Memory_barrier
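
A tiny sketch of the first two rows (my own example, not from the book):

#include <atomic>

std::atomic<int> atomicCounter{0};
volatile int volatileCounter = 0;

void worker() {
  atomicCounter.fetch_add(1); // atomic RMW, with full fencing (seq_cst by default)
  volatileCounter = volatileCounter + 1; // the load and store can't be optimized away,
                                         // but the RMW is not atomic and there's no fence
}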

marginal probability density: clarified #with equations

(Equations were created in Outlook then sent to WordPress by HTML email. )

My starting point is https://bintanvictor.wordpress.com/2016/06/29/probability-density-clarified-intuitively/. Look at the cross section at X=7.02. This is a 2D area, so its volume (i.e. probability mass) is zero, not just close to zero. Hard to work with. In order to work with a proper probability mass, I prefer a very thin but 3D “sheet”, by cutting again at X=7.02001 i.e. 7.02 + deltaX. The prob mass in this sheet divided by deltaX is a number. I think it’s the marginal density value at X=7.02.

The standard formula for the marginal density function is on https://www.statlect.com/glossary/marginal-probability-density-function:

f_X(x) = ∫ f(x,y) dy        …… [1]

How is this formula reconciled with our “sheet”? I prefer to start from our sheet, since I don’t like to deal with zero probability mass. Sheet mass divided by the thickness i.e. deltaX:

( ∫ from x=7.02 to x=7.02+deltaX of [ ∫ f(x,y) dy ] dx ) / deltaX

Since f(x,y) is assumed not to change with x across this thin interval, this expression simplifies to

∫ f(7.02, y) dy

Now it is the same as formula [1]. The advantage of my “sheet” way is that the numerator is always a sensible probability mass. The integral in the standard formula [1] doesn’t look like a probability mass to me, since the sheet has zero width.

The simplest and most visual bivariate illustration of marginal density — throwing a dart on a map of Singapore drawn on an x:y grid. Joint density is a constant (you can easily work out its value). You could immediately tell that marginal density at X=7.02 is proportional to the island’s width at X=7.02. Formula [1] would tell us that the marginal density is

f_X(7.02) = c × (island’s width at X=7.02), where c is the constant joint density.

probability density = always prob mass per unit space

It’s worthwhile to get an intuitive feel for the choice of words in this jargon.

* With discrete probabilities, there’s the concept of the “probability mass function”.
* With a continuous probability space, the corresponding concept is the “density function”.

Density is defined as mass per unit space.

For a 1D probability space, the unit space is a length. Example – the width of a nose is a RV with a continuous distro. Mean = 2.51cm, so the probability density at this width is probably the highest…

For a 2D probability space, the unit space is an area. Example – the width of a nose and the temperature inside it are two RVs, forming a bivariate distro. You can plot the density function as a dome. Total volume = 1.0 by definition. Density at (x=2.51cm, y=36.01 C) is the height of the dome at that point.

The concentration of “mass” at this location is twice the concentration at another location like (x=2.4cm, y=36 C).

central limit theorem – clarified

background – I was taught CLT multiple times but still unsure about important details..

discrete — The original RV can have any distro, but many illustrations pick a discrete RV, like a Poisson RV or binomial RV. I think a continuous RV can be less confusing to some students.

average — of N iid realizations/observations of this RV is the estimate [1]. I will avoid the word “mean” as it’s paradoxically ambiguous. Now this average is the average of N numbers, where N could be 5 or 50 or whatever.

large group — N needs to be sufficiently large, esp. if the original RV’s distro is highly asymmetrical. This bit is abstract, but lies at the heart of CLT. For the original distro, you may want to avoid some extremely asymmetrical ones but start with something like a uniform distro or a pyramid distro. We will realize that regardless of the original distro, as N increases our “estimate” becomes a Gaussian RV.

[1] estimate is a sample mean, and an estimate of the population mean. Also, the estimate is a RV too.

finite population (for the original distro) — is a common confusion. In the “better” illustrations, the population is unlimited, like NY’s temperature. In a confusing context, the total population is finite and perhaps small, such as dice or the birth hour of all my classmates. I think in such a context, the population mean is actually some definite but yet-unknown number, to be estimated using a subset of the population.

log-normal — an extremely useful extension of CLT says “the product of N iid positive random variables is another RV, with a LN distro”. Just look at the log of the product. This RV, denoted L, is the sum of N iid random variables (the logs), so L is approximately Gaussian.
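
In symbols (my own recap, not from the original illustrations):

  (average_N − μ) / (σ/√N)  →  N(0,1) in distribution, as N grows

  log(X_1 × X_2 × … × X_N) = Σ log(X_i) ≈ Gaussian for large N, so the product is approximately log-normal.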

python RW global var hosted in a module

Context: a module defines a top-level global var VAR1, to be modified by my script. Reading it is relatively easy:

from mod3 import *
print VAR1

Writing is a bit tricky. I’m still looking for best practices.

Solution 1: mod3 to expose a setter setVAR1(value)

Solution 2:
import mod3
mod3.VAR1 = 'new_value'

Note “from mod3 import * ” doesn’t propagate the new value back to the module. See example below.

::::::::::::::
globalrw.py
::::::::::::::
#!/usr/bin/python -u
from mod3 import *

def main():
  mod3func()
  ''' Line below is required to propagate new value back to mod3
      Also note the semicolon -- to put two statements on one line '''
  import mod3; mod3.VAR1 = 'new value'
  mod3func()
main()
::::::::::::::
mod3.py
::::::::::::::
VAR1='initial value'
def mod3func():
  print 'VAR1 =', VAR1

delegate: Yamaha makes motorbikes + pianos

label: lambda/delegate
when we say “delegate” we should, but don’t, make an explicit distinction between a delegate typename and a delegate instance.
Also, we seldom make an explicit distinction between the multicast-delegate and unicast-delegate usages, but these 2 have too little overlap to be discussed as the same concept.
Yamaha makes motorbikes + pianos — same brand, two barely-related product lines.

statistical independent ≠> no causal influence

When I was young, I ate twice as much rice as noodles; now I still do. So the ratio of rice to noodle intake is independent of my age. This independence doesn’t imply that my age has no influence on the ratio. It only appears to have no influence.

https://bintanvictor.wordpress.com/2015/02/02/2-multivariat-normal-variables-can-be-indie/ shows that 2 RV are each controlled by a family of “driver random variables”, but the 2 RV can be independent!

Note the mathematical definition of independence is based on the joint distribution factorizing into the marginals (zero covariance is a weaker, related condition). There must be a stream of paired data points. I would say the mathematical meaning of independence is fundamentally different from everyday English, so intuition often gets in the way.

Special context — time series. We record 2 processes X(t) and Y(t). Both could be influenced by several (perhaps shared) factors. In this context, the layman’s independence is somewhat close to the mathematical definition.
* historical data – we could analyze the paired data points and compute a covariance. We could conclude they are independent, based on the data. We aren’t sure if there’s some common factor that could later give rise to a strong covariance
* future – we are actually more interested in the future. We often rely on historical data

 

abc pairwise independence ≠> Pa*Pb*Pc

Q: If we know events A, B, C are pairwise independent, does Pa*Pb*Pc mean anything?

Q: does Pa*Pb*Pc equal P(A & B & C)?
A: no. The multivariate normal model would imply exactly that, but this is just one possibility, not guaranteed.

Jon Frye gave an excellent example to show that P(A & B & C) can have any value ranging from 0% to min(Pa, Pb, Pc, Pab, Pac, Pbc). Suppose Pa = Pb = Pc = 10%. Pairwise independence means Pab = 1%. However, P(abc) can be anywhere from 0% to 1% (the 1% extreme: whenever A and B both happen, C also happens).

School Prom illustration – each student decides whether to go, regardless of any other individual student, but if everyone else goes, then your decision is swayed.

real J4 embracing U.S.: ez2get jobs till60

ez2get jobs; abundance of jobs — At the risk of oversimplifying things, I would single out this one as the #1 fundamental justification of my bold and controversial move to the US.

There are many ways to /dice-n-slice/ this justification:
* It gives me confidence that I can support my kids for many years.
* I don’t worry so much about aging as a techie.
* I don’t worry so much about outsourcing and a shrinking job pool.
* When I don’t feel extremely happy on a job, I don’t need to feel trapped like I do in a Singapore job.
* I feel like an “attractive girl”, not someone desperately seeking on “a men’s market”.

Look at Genn. What if I really plan to stay in one company for 10 years? I guess after 10 years I may still face problems changing jobs in a market like Singapore.

fwd contract often has negative value, briefly

An option “paper” is a right but not an obligation, so its holder has no obligation, so this paper is always worth a non-negative value.

If the option holder forgets it, her option could get automatically exercised or she could receive the cash-settlement income. No one would go after her.

In contrast, an obligation requires you to fulfill your duty.

A fwd contract to buy some asset (say oil) is an obligation, so the pre-maturity value can be negative or positive. Example – a contract to “buy oil at $3333” but now the price is below $50. Who wants this obligation? This paper is a liability not an asset, so its value is negative.

0 probability ^ 0 density, 2nd look #cut rope

See the other post on 0 probability vs 0 density.

Eg: Suppose I ask you to cut a 5-meter-long string by throwing a knife. What’s the distribution of the longer piece’s length? There is a density function f(x). Bell-shaped, since most people will aim at the center.

f(x) = 0 for x > 5 i.e. zero density.

For the same reason, Pr(X > 5) = 0 i.e. no uncertainty — it is guaranteed not to happen.

Here’s my idea of probability density at x=4.98. If a computer simulates 100 trillion trials, will there be some hits within the neighborhood around x=4.98 ? Very small but positive density. In contrast, the chance of hitting x=5.1 is zero no matter how many times we try.

By the way, due to the definition of the density function, f(4.98) > 0 but Pr(X=4.98) = 0, because the single point 4.98 has zero width.
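
In symbols (my restatement):

  Pr(X ∈ [4.98, 4.98+δ]) ≈ f(4.98)·δ > 0 for a small δ > 0, whereas Pr(X = 4.98) = lim as δ→0 of f(4.98)·δ = 0.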

is RVR Only used as function parameter@@

RVR – almost exclusively used as function parameter

There could be tutorials showing other usages, like a regular variable, but I don’t see any reason for them. I only understand the motivation for
– move-ctor/move-assignment and
– insertions like push_back()

When there’s a function (usually a big4) taking a RVR param, there’s usually a “traditional” overload without RVR, so compiler can choose based on the argument:
* if arg is an rval expression then choose the RVR version [1]
* otherwise, default to the traditional

[1] std::move(myVar) would cast myVar into an rval expression. This is kind of adapter between the 2 overloads.

In some cases, there exists only the RVR version, perhaps because copy is prohibited… like in a class holding a FILE pointer?
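
A minimal sketch of the dual-overload pattern described above (Holder is a hypothetical class):

#include <string>
#include <utility>

class Holder {
  std::string s_;
public:
  void set(const std::string& src) { s_ = src; }            // traditional overload: copies
  void set(std::string&& src)      { s_ = std::move(src); } // RVR overload: steals the buffer
};

int main() {
  Holder h;
  std::string name = "hello";
  h.set(name);            // lvalue argument -> traditional overload
  h.set(std::move(name)); // cast to rval expression -> RVR overload
}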

mv-semantics: the players

Q: resource means?
A: Don’t talk about resources. Talk about heapy thingy.
A: Some expensive data item to acquire. Requires allocating from some resource manager such as a memory allocator. Each class instance holds a resource, not sharable.

Q: What’s the thing that’s moved?
A: Not the class instance, but part of it — a “resource” used by the instance, via a pointer field.

Q: Who moves it?
A: The compiler selects the mv-ctor or mv-assignment (only if all conditions are satisfied — see post on use-case); the actual pointer-stealing happens at run time, inside the selected function.

Q: What’s the mv-ctor (or mv-assignment) for?
A: It’s the enabler. It alone isn’t enough to make the move happen.

Q: What does the RVR point to?
A: An object that the programmer has marked for imminent destruction.
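
A minimal sketch showing all the players (my own hypothetical Buf):

#include <cstddef>
#include <utility>

struct Buf {
  char* p; // pointer field to the "resource" i.e. heapy thingy
  explicit Buf(std::size_t n) : p(new char[n]) {}
  Buf(Buf&& other) noexcept : p(other.p) { other.p = nullptr; } // steal the pointer; donor marked empty
  ~Buf() { delete[] p; } // deleting nullptr is a no-op, so the donor dies quietly
  Buf(const Buf&) = delete;            // copy prohibited in this sketch
  Buf& operator=(const Buf&) = delete;
};

int main() {
  Buf a(1024);
  Buf b(std::move(a)); // a is marked for imminent destruction; mv-ctor selected
}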

returning RVR #Josuttis

My rule of thumb is to avoid a RVR return type, even though Josuttis did NOT explicitly forbid it (he never said the return type should never be rvr).

Instead of an rvr return type, I feel in most practical cases we can achieve the same result using a nonref return type. Such a function call usually evaluates to a nameless temp object, i.e. a naturally-occurring rvalue object.

[[Josuttis]] (i.e. the c++Standard library) P22 explains the rules about returning rval ref.

In particular, it’s a bad idea to return a newly-created stack object by rvr. This object is a nonstatic local and will be wiped out after the function returns.

(It’s equally bad to return this object by l-value reference.)

 

equivalent FX(+option) trades, succinctly

The equivalence among FX trades can be confusing to some. I feel there are only 2 common scenarios:

1) Buying usdjpy is equivalent to selling jpyusd.
2) Buying a usdjpy call is equivalent to buying a jpyusd put.

However, Buying a fx option is never equivalent to Selling an fx option. The seller wants (implied) vol to drop, whereas the buyer wants it to increase.

factors affecting bond sensitivity to IR

In this context, we are concerned with the current market value (e.g. a $9bn bond position) and how this holding may devalue due to a rising interest rate at that particular maturity.

* lower (zero) coupon bonds are more sensitive. More of the cash flow occurs in the distant future, therefore subject to more discounting.

* longer bonds are more sensitive. More of the cashflow is “pushed” to the distant future.

* lower yield bonds are more sensitive. On the Price/yield curve, at the left side, the curve is steeper.

(I read the above on a slide by Investment Analytics.)
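
All three factors are visible in the standard duration formula (textbook material, not from the slide):

  D = (1/P) Σ_t [ t · CF_t / (1+y)^t ]      (Macaulay duration)
  ΔP/P ≈ −D · Δy / (1+y)

Low coupons and long maturities shift weight toward large t, raising D; a lower yield y discounts distant cash flows less harshly, again raising D.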

Note if we hold the bond to maturity, then the dollar value received on maturity is completely deterministic i.e. known in advance, so why worry about “sensitivity”? There are 3 issues with this strategy:

1) if in the interim my bond’s MV drops badly, then this asset offers poor liquidity. I won’t have the flexibility to get contingency cash out of this asset.

1b) Let’s ignore credit risk in the bond itself. If this is a huge position (like $9bn) in the portfolio of a big organization (even a sovereign fund), a MV drop could threaten the organization’s balance sheet, credit rating and borrowing cost. Put yourself in the shoes of a creditor. Fundamentally, the market and the creditors need to be assured that this borrower could safely liquidate part of this bond asset to meet contingent obligations.

Imagine Citi is a creditor to MTA, and MTA holds a bond. Fundamental risk to the creditor (Citi) — the borrower (MTA)  i.e. the bond holder could become insolvent before bond maturity, when the bond price recovers.

2) over a long horizon like 30Y, that fixed dollar amount may suffer unexpected inflation (devaluation). I feel this issue tends to affect any long-horizon investment.

3) if in the near future interest rates rise sharply (hurting my MV), that means there are better ways to invest my $9bn.

beta definition in CAPM – confusion cleared

In CAPM, beta (of a stock like ibm) is defined in terms of
* cov(ibm excess return, mkt excess return), and
* variance of the mkt excess return
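
In symbols (the standard CAPM definition):

  β_ibm = cov(r_ibm − r_f, r_mkt − r_f) / var(r_mkt − r_f)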

I was under the impression that variance is the measured “dispersion” among the recent 60 monthly returns over 5 years (or another period). Such a calculation would yield a beta value that’s heavily influenced or “skewed” by the last 5Y’s performance. Another observation window is likely to give a very different beta value. This beta is based on such unstable input data, yet we treat it as a constant and use it to predict the ratio of ibm’s return over the index return! Suppose we are lucky, so the last 12M gives beta=1.3, the last 5Y yields the same, and year 2000 also yields the same. We could still be unlucky in the next 12M, and this beta fails completely to predict that ratio… Wrong thought!

One of the roots of the confusion is the 2 views of variance, esp. with time-series data.

A) the “statistical variance”, or sample variance. Basically 60 consecutive observations over 5 years. If these 60 numbers come from drastically different periods, then the sample variance won’t represent the population.

B) the “probability variance” or “theoretical variance”, or population variance, assuming the population has a fixed variance. This is abstract. Suppose ibm stock price is influenced mostly by temperature (or another factor not influenced by human behavior), so the inherent variance in the “system” is time-invariant. Note the distribution of daily return can be completely non-normal — could be binomial, uniform etc — but the variance should be fixed, or at least stable. I feel population variance can change with time, but should be fairly stable during the observation window — slow-changing.

My interpretation of the beta definition was based on an unstable, fast-changing variance. In contrast, CAPM theory is based on a fixed or slow-moving population variance — the probability context, basically the IID assumption. CAPM assumes we could estimate the population variance from history, and that this variance value will remain valid in the foreseeable future.

In practice, practitioners (usually?) use a historical sample to estimate the population variance/cov. This is basically the statistical context A).

Imagine the inherent population variance changes as frequently as stock price itself. It would be futile to even estimate the population variance. In most time-series contexts, most models assume some stability in the population variance.

physical measure is impractical

Update: Now I think physical probability is neither observable nor quantifiable, and utterly unusable in the math, including the numerical methods. In contrast, RN probabilities can be derived from observed prices.

Therefore, now I feel physical measure is completely irrelevant to option math.

RN measure is the “first” practical measure for derivative pricing. Most theories/models are formulated in RN measure. T-Forward measure and stock numeraire are convenient when using these models…

Physical measure is an impractical measure for pricing. Physical measure is a personal feeling, not related to any market prices, and is mentioned only for teaching purposes. There’s no “market data” on the physical measure.

Market prices reflect RN (not physical) probabilities.

Consider a cash-or-nothing bet that pays $100 iff team A wins a playoff. The bet is selling for $30, so the RN Pr(win) = 30% (ignoring discounting). I am an insider and I rig the game so physical Pr() = 80%, and Meimei (my daughter) may feel it’s 50-50, but these personal opinions are irrelevant for pricing any derivative.

Instead, we use the option price $30 to back out the RN probabilities. Namely, Calibrate the pricing curves using liquid options, then use the RN probabilities to price less liquid derivatives.

Professor Yuri was the first to point out (during my oral exam!) that option prices are the input, not the output, to such pricing systems.

drift ^ growth rate – are imprecise

The drift rate “j” is defined for BM, not GBM:
                dA_t = j dt + (dW term)
Now, for GBM:
                dX_t = r X_t dt + (dW term)
So the drift rate, by definition, is r X_t, which depends on the current level X_t. Therefore, it’s confusing to say “same drift as the riskfree rate”. Safer to say “same growth rate” or “same expected return”.

so-called tradable asset – disillusioned

The touted feature of a “tradable” doesn’t impress me. Now I feel this feature is useful IMHO only for option pricing theory. All traded assets are supposed to follow a GBM (under RN measure) with the same growth rate as the MMA, but I’m unsure about most of the “traded assets” such as —
– IR futures contracts
– weather contracts
– range notes
– a deep OTM option contract? I can’t imagine any “growth” in this asset
– a premium bond close to maturity? Price must drop to par, right? How can it grow?
– a swap I traded at the wrong time, so its value is decreasing to deeply negative territories? How can this asset grow?

My MSFM classmates confirmed that any dividend-paying stock is disqualified as “traded asset”. There must be no cash coming in or out of the security! It’s such a contrived, artificial and theoretical concept! Other non-qualifiers:

eg: spot rate
eg: price of a dividend-paying stock – violates the self-financing criterion.
eg: interest rates
eg: swap rate
eg: futures contract’s price?
eg: coupon-paying bond’s price

Black’s model isn’t interest rate model #briefly

My professors emphasized repeatedly
* the first generation of IR models is the one-factor short-rate models, not the Black model.
* Black model initially covered commodity futures
* However, IR traders adopted Black’s __formula__ to price the 3 most common IR options
** bond options (bond price @ expiry is LN)
** caps (libor rate @ expiry is LN)
** swaptions (swap rate @ expiry is LN)
** However, it’s illogical to assume that the bond price, libor rate, and swap rate on the contract expiry date (three N@FT) ALL follow LogNormal distributions.

* Black model is unable to model the term structure. I think it doesn’t eliminate arbitrage. I would say that a proper IR model (like HJM) must describe the evolution of the entire yield curve with N points on the curve. N can be 20 or infinite…

mean reversion in Hull-White model

The (well-known) mean reversion is in the drift, i.e. the instantaneous drift, under the physical measure.

(I think historical data shows mean reversion of IR, which is somehow related to the “mean reversion of drift”….)

When changing to the RN measure, the physical drift is replaced, so it is not relevant to pricing.
However, on a “family snapshot”, the implied vol of the fwd Libor rate is lower the further out the accrual startDate goes. This is observed on the market [1], and this vol won’t be affected when changing measure. Hull-White model does model this feature — the fwd rate’s vol takes the form
                σ_t · e^(−a(T−t))
[1] I think this means the observed ED future price vol is lower for a 10Y expiry than a 1M expiry.

vol, unlike stdev, always implies a (stoch) Process

Volatility, in the context of pure math (not necessarily finance), refers to the coefficient of the dW term. Therefore,
* it implies a measure,
* it implies a process, a stoch process

Therefore, if a vol number is 5%, it is, conceptually and physically, different from a stdev of 0.05.

* Stdev measures the dispersion of a static population, or a snapshot as I like to say. Again, think of the histogram.
* variance parameter (vol^2) of BM shows diffusion speed.
* if we quadruple the variance param (doubling the vol) value, then the terminal snapshot’s stdev will double.

At any time, there’s an instantaneous vol value, like 5%. This could last a brief interval before it increases or decreases. The vol value changes, as specified in most financial models, but it changes slowly — quasi-constant… (see other blog posts)

There is also a Black-Scholes vol. See other posts.