c++screening Questions for GregR

Similar to https://bintanvictor.wordpress.com/2017/09/15/more-mcq-questions-on-core-java-skills/, hopefully these questions can be given over phone.

Other topics easy to quiz over phone: smart pointers, std::thread, rvr/move, big4

  • — Q3: Suppose you have a simple class “Account” with only simple data fields like integers and strings, we can get an Account object constructed on 1) heap or in 2) data segment. Where else can it be constructed? For clarity, Data segment holds things like global variables.  [on stack]
  • Q3b: Suppose Account class has a default constructor Only. In which cases above is this constructor used to instantiate the Account object? [all three cases]
  • Q3c: in which one of the three cases must we use pointer to access the constructed object? [the Heap case ]
  • Q3d: in which one of the three cases above do we have a risk of memory leak? [the Heap case]
  • Q3e: in which of the three cases can the Account object construction happen before main() function starts? Hint: dynamic vs static allocation [ data segment case ]
  • Q3e2 (advanced): for a static Account object allocated in data segment, is the construction always before main()? [Not always. Consider local static variables]
  • Q3f (advanced): RAII relies on which of the 3 types of allocation? Hint: RAII invokes the destructor automatically and reliably. [stack allocation]
  • Q3g: can you have an array of Account objects constructed on heap? [yes]
  • Q3h: YES we can get such an array of Account objects, using array-new, but do we get a pointer or an array or do we get the first array element in an Account variable? [pointer]
  • Q3i: what happens when one of these Account objects is destructed? [destructor would run]
  • Q3j: OK let’s assume Account destructor runs. In which cases above is the destructor Guaranteed to run? Note we have four destruction scenarios so far — on stack, in data-segment, on heap and after array-new. [all cases, including static objects]
  • Q3k: what if we don’t define any destructor in Account class, in which cases above is destructor skipped? [Never skipped in any  case]
  • Q3L: In the array-new case, suppose 100 Account objects are constructed on heap, when we destruct the array via array-delete, how many times would the Account destructor run? [ 100 times]
  • — Q4: I have a variable var1 as a integer pointer, and I pass var1 directly into a function, what types of functions below can accept this argument?
    • A: funcA takes a parameter of non-constant reference to integer
    • B: funcB takes a parameter of constant reference to integer
    • * C: funcC takes a parameter of pointer to integer
    • D: funcD takes a parameter of bare integer (not pointer or reference)
  • Q4b: I want to call funcX(100+200+300), what types of function can accept this argument? [B/D]
  • Q4c: I have a variable var2 as a integer variable, and I want to pass it in like funcX(var2), what types of functions below can accept this argument? [A/B/D]
  • — Q5: which casts below are platform-specific? [correct answer is marked with *]
    • A: static_cast
    • B: down_cast
    • C: const_cast
    • D: dynamic_cast
    • * E: reinterpret_cast
  • Q5b: which casts are Not part of the c++ standard?
  • Q5c: which casts are closest to the old-school cast in C language? [A]
  • Q5d: which casts are likely to require the vtable? [D]
  • Q5e: which casts usually work with pointers and references Only? [D/E]
  • Q5f (advanced): when applied on pointers, which casts could produce an address different from the input pointer? [D]
  • — Q6: which standard data types below usually require heap allocation [see asterisks]
    • A: integer
    • * B: the string in standard library
    • C: plain vanilla array of 20 timestamps
    • * D: the list in standard library
    • * E: the vector in standard library
    • * F: the map in standard library
    • * G: unordered_map
    • H: C-style string
  • Q6b: which data types are available in C? [A C H]
  • Q6c: which data types by design won’t grow to accommodate more elements? [C H] A is a trivial case.
  • Q6d (advanced): which implementation classes for these data types must include code to allocate an array on heap? [B E G]
  • Q6d2 (more advanced, if last question was correctly answered): which one among these three classes may skip array allocation in a common optimization? [B in the form of small-string optimization]
  • Q6e: which data types offer random-access iterators capable of jumping by an arbitrary offset? [B C E H]
  • Q6f: which data types offer amortized constant-time lookup? [BCEH and G]
  • Q6g (advanced): which data type(s) offer unconditional guaranteed performance for insertion/deletion at any position and in all scenarios ? [D F]
  • Q6h: (advanced): which data structures are allocated on heap but doesn’t require reallocation of existing elements? [list and map]
Advertisements

heap usage: C ilt C++/Java #Part 2

I now recall that when I programmed in C, my code never used malloc() directly.

The library functions probably used malloc to some extent, but malloc was advanced feature. Alexandrescu confirmed my experience and said that c++ programmers usually make rather few malloc() calls, each time requesting a large chunk. Instead of malloc, I used mostly local variables and static variables. In contrast, C++ uses heap much more:

  • STL containers are 99% heap-based
  • virtual functions require pointer, and the target objects are usually on heap, as Alexandrescu said on P78
  • pimpl idiom i.e. private implementation requires heap object, as Alexandrescu said on P78
  • the c++ reference is used mostly for pass-by-reference. Pass-by-reference usually works with heap objects.

In contrast, C++ uses small chunks of heap memory.

Across languages, heap usage is is slow because

  • In general OO programming uses more pointers more indirection and more heap objects
  • heap allocation is much slower than stack allocation, as Stroustrup explained to me
  • using a heap object, always always requires a runtime indirection. The heap object has no name, only an address !
  • In Garbabe-Collected languages, there’s one more indirection.

us`lowest N natural nums,count BST#70%#Rahul

how do you make use of the previous results for N=2 to tackle N=3?
f(z) denotes count of unique BSTs consisting of the fist z natural number
f(0)=f(1)=1

For N=21, we can have 1 node on the left side, 19 nodes on the right side

  • for odd z, express it as z:=2y+1 where y > 0
    f(2y+1)=[ f(a=0)*f(2y) + f(a=1)f(2y-1) + f(2)f(2y-2)… +f(y-1)f(y+1) ]*2 + f(y)f(y)

Note a should increment from 0 up to y-1.

  • for even z, express it as z:=2y where y > 0
    f(2y)=[ f(0)*f(2y-1) + f(1)f(2y-2) + f(2)f(2y-3)… +f(y-1)f(y) ]*2

Let’s try this formula on f(2)…= 2 good
Let’s try this formula on f(3)…=5 y=1

–N=3:
if ‘1’ is root, then there are f(2) BSTs
if ‘3’ is root, then there are f(2) BSTs
if ‘2’ is root, then there are f(1) * f(1) BSTs
–N=9:
if ‘1’ is root, there are f(8) BSTs
if ‘9’ is root, there are f(8) BSTs
if ‘4’ is root, there are f(3) left-subtrees and f(5) right-subtrees, giving f(3)*f(5) BSTs

This is not coding challenge but math challenge.

Q2: output all BSTs. We need a way to represent (serialize) a BST?

compile-time ^ run-time linking

https://en.wikipedia.org/wiki/Dynamic_linker describes the “magic” of linking *.so files with some a.out at runtime, This is more unknown and “magical” than compile-time linking.

“Linking is often referred to as a process that is performed when the executable is compiled, while a dynamic linker is a special part of an operating system that loads external shared libraries into a running process”

I now think when the technical literature mentions linking or linker I need to ask “early linker or late linker?”

versioned-queue problem

I think this problem is mostly about data structure, not algorithm.

Q: Design and implement a Version-Queue. A Version-Queue maintains a version number along with normal Queue functionality. Every version is a snapshot of the entire queue. Every operation[Enqueue/Dequeue] on the Queue increments its version.

Implement the following functions:

1. Enqueue – appends an element at the end of the queue.
2. Dequeue – returns the top element of the queue.
3. Print – it takes a version number as input and prints the elements of the queue of the given version. The version number input can also be an old/historical version number.

E.g. if the current version number of the queue is 7 and the input to this function is 5, then it should print the elements of the queue when its version number was 5.

For simplicity, assume the elements are integers.

We expect you to write a helper program to test the above data structure which will read input from stdin and prints output to stdout.

Input format:
First line should have an integer n (number of operations). This should be followed by n number of lines, each denoting one operation.
e.g.
6
e 1
e 4
d
e 5
p 2
p 4

‘e’ stands for enqueue

— My design —
In addition to the regular queue data structure, we need a few helper data structures.

All current + historical queue elements are saved as individual elements in a “Snapshot” vector, in arrival order. This vector never decreases in size even during dequeue. Two pointers represent the trailing edge (dequeue) and leading edge (enqueue).

(minor implementation detail — Since it’s a vector, the pointers can be implemented as 2 integer index variables. Pointers and iterators get invalidated by automatic resizing.)

Every enqueue operation increments the right pointer to the right;
Every dequeue operation Increments the left pointer to the Right;
(No decrement on the pointers.)

With this vector, I think we can reconstruct each snapshot in history.

Every pointer increment is recorded in a Version table, which is a vector of Version objects {Left pointer, Right pointer}. For example, if Version 782 has {L=22, R=55} then the snapshot #782 is the sub-array from Index 22 to 55.

Additional space costs:
O(K) where K = number of operations

Additional time costs:
Enqueue operation — O(1). Insert at end of the Snapshot vector and Version vector
Dequeue operation — O(1). Move a pointer + insert into Version vector Print operation — O(M) where M = maximum queue capacity

Minor point — To avoid vector automatic resize, we need to estimate in advance the limit on K i.e. number of operations. If you tell me you get millions of operations a microsecond, then over a day there
could be trillions of “versions” so the Versions vector and the Snapshot vector need sufficient initial capacity.

Note we could get 9000 enqueue operations in a row.

minimum hardware requirement to run linux #historical observation

  1. Bell labs Unix, in its early days, were able to run on hardware considered “underpowered” even by the standard of that day — P33 [[art of unix programming]]. I believe contemporary kernels were unable to run on those low-end machines.
  2. Linux (P77) has a similar characteristic. In the 1990’s the big commercial Unixes targeted enterprise-class hardware but Linux emphasized doing more with less. Today, Linux powers 99% of the world’s most powerful supercomputers but Linux also runs on low-end or obsolete hardware.

In both cases, I feel that an OS designed with very low minimum hardware requirement turned out to be actually more efficient, more adaptable, more versatile, more powerful than their conventional competitors.

## 5 expertise I could teach #thread

The most sought-after Expertise I could develop.

#1 personal investment – FX/option, HY and unit trust investment

# tech IV by top employers, including brain teasers
# financial dnlg, appealing to pure techies
# low latency
# Wall St techie work culture
However, some of these are hard to make a teaching career. So which domain can i teach for a living, perhaps with a PhD
  1. programming in c++/java/python
  2. evergreen comp science, esp concurrency theory .. data structure .. algorithm
  3. data science, combining finance with…
  4. fin math

extern^static on var^functions

[1] https://stackoverflow.com/questions/14742664/c-static-local-function-vs-global-function confirmed my understanding that static local function is designed to allow other files to define different functions under this name.

extern var file-scope STATIC var static func extern-C func
purpose 1 single object shared across files [1] applies. 99% sure [1] name mangling
purpose 2 private variable private function
alternative 1 static field anon namespace no alternative
alternative 2 singleton
advantage 1 won’t get used by mistake from other file
disadv 1 “extern” is disabled if you provide an initializer no risk. very simple
disadv 2
put in shared header? be careful  should be fine  not sure

swap usage when RAM already adequate

Adapted from blog by Hayden James

Even when our average memory usage is smaller than RAM capacity, system still benefits from swap!

Most server processes are daemons. Any daemon can create lots of memory pages rarely accessed till shutdown. Kernel often decides to relocate rarely used memory pages to swap for performance reasons, mostly to free up RAM. The reclaimed RAM space can remain vacant for some time, so you may think the relocation is unnecessary but ..

  • the relocation is usually harmless — the kswap pid uses very little cpu unless such relocation workload becomes frequent and bidirectional, a sign of insufficient RAM.
  • the relocation is preemptive — The vacant RAM is available to any process that can use it more productively. In general, faster cache ought to hold “hotter” data. In other words, hotter data should be cheaper to access.

But what if there’s no other process or hotter data? What if all the data can fit into RAM? This is rare but yes you can disable swap. Rarely needed tweaks are sometimes under-documented, as kernel is a very efficient “government” like Singapore.

Note, as explained in my [[linux kernel]], kswapd is both a process and a kernel thread that wakes up from time to time.

2 pitfalls on my accu path #portable,MktDepth..

Say you stay in one team for 5 years, hoping to accumulate expertise and insight

  • Trap 1: local system knowledge, 100% nonportable
    • eg: Qz
    • Tibrv wrappers in 95G is not so bad
  • Trap 1b: limited standardization across companies.
    • eg: cryptocurrency
    • eg: OMS framework? M b2bTradigEngine ^ SIG ^ ETSFlow
    • eg: software mkt data dissemination cf raw mkt data parsing
  • Trap 2 : poor market depth
    • eg: vol fitter

killer app@RBTree #priorityQ,sortedVector

A common usage context is query on some data set after pre-processing . In these contexts, BST competes with sorted vector and priority queue.

  • Killer app against vector: incremental insert or live updates
  • Killer app against vector: if there’s even occasional delete(), then sorted vector would suffer
  • Killer app against vector: update on one item can be implemented as delete/reinsert. Vector requires binary search -> mid-stream insert
  • minor advantage over sorted vector: vector sorting requires swapping, so value-semantics is problematic
  • Killer app against priority queue: range query, approximate query,
  • Killer app against priority queue: both max() and min()
  • Application: if each BST node needs additional data. Binary heap doesn’t expose those nodes(?)

It’s important to remember the advantages of vector

  1. cache efficiency
  2. runtime malloc cost

Java, c++ and c# all provide self-balancing BST in their standard libraries, from the very beginning. In my projects, I use these containers on an everyday basis. However, after talking to a few friends I now agree that most coding problems don’t need self-balancing BST because

  1. These problems have no incremental insertion/deletion, so we can simply sort an array of items at O(N logN) as pre-processing
  2. In some cases, priorityQ is a viable alternative
  3. Python doesn’t have this container at all and all coding problems must support python.

[12] VaR: stress^historical based

I think my OC colleague Rajesh hinted once — there are two common methodologies to compute "minimum loss in the worst 5% scenarios"

1. stress test
2. historical

In Historical, you record the actual (unrealized + realized) losses for the past x (like 252) trading days and plot a histogram, and directly pin point "minimum loss in the worst 5% of those trading days".

I think the way you record the losses (or gains) is like "for IBM, daily price relative is 95%, +10%, 102%, 92% …."

In stress test, I think you simulate a drop in vol and/or a drop in underlier etc

CAS cpu-instruction takes 3 inputs

A CVA interviewer asked me to explain the cmpxch cpu-instruction. I now believe it COMPARES two values (expected^current) and IIF matched, updates the memory location to a “newValue”.

Out of these 3 “inputs”, only the expected and newValue are inputs to the function. The 3rd item “current” is NOT an input parameter to the function, but discovered in the hardware.

See P1018 [[the c++StdLib]] by Josuttis

mlock() : low-level syscall to prevent paging ] real-time apps

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/reference_guide/using_mlock_to_avoid_page_io has sample code

See also https://eklitzke.org/mlock-and-mlockall

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.3/html/Realtime_Reference_Guide/sect-Realtime_Reference_Guide-Memory_allocation-Using_mlock_to_avoid_memory_faults.html says

If the application is entering a time sensitive region of code, an mlockall call prior to entering, followed by munlockall can reduce paging while in the critical section. Similarly, mlock can be used on a data region that is relatively static or that will grow slowly but needs to be accessed without page faulting.

#1 server-side language@WallSt: evolution

Don’t spend too much time.. Based on my limited observations,

  • As of 2007, the top dog was java.
  • The dominance is even stronger in 2018.
  • Q: how about in 10 years?
  • A: I feel java will remain #1

Look at the innovation leaders — West coast. For their (web) server side, they seem to have shifted slightly towards python, javascript, RoR

Q: Why do I consider buy-side, sell-side and other financial tech shops as a whole and why don’t I include google finance?
A: … because there’s mobility between sub-domains within, and entry barrier from outside.

Buy-side tend to use more c++; Banks usually favor java; Exchanges tend to use … both c++ and java. The latency advantage of c++ isn’t that significant to a major exchange like Nsdq.

 

semi-retirement jobs:Plan B #if !! U.S.

I had blogged about this before, such as the blogger-pripri post on “many long term career options”

Hongzhi asked what if you can’t go to the US.

* Some Singapore companies simply keep their older staff since they still can do the job, and the cost of layoff is too high.
* Support and maintenance tech roles
* Some kind of managerial role ? I guess I am mediocre but could possibly do it, but i feel hands-on work is easier and safer.
* Teach in a Poly or private school ? possibly not my strength
* Run some small business such as Kindergarten with wife

crossed orderbook mgmt: real story4IV

–challenges:
disappears after a while; Some days better some days worse

–impact
data quality visible to paying customers

–individual contribution?
not really, to be honest.

–what would you do differently?
Perhaps detect and set a warning flag when generating top-book token? See other post on “crossed”

–details:

  • query the exchange if query is supported – many exchanges do.
  • compare across ticker plants? Not really done in our case.
  • replay captured data for investigation
  • retrans as the main solution
  • periodic snapshot feed is supported by many exchanges, designed for late-starting subscribers. We could (though we don’t) use it to clean up our crossed orderbook
  • manual cleaning via the cleaner script, as a “2nd-last” resort
  • hot failover.. as last resort

–the cleaner script:

This “depth-cleaner” tool is essentially a script to delete crossed/locked (c/l) entries from our replicated order books. It is run by a user in response to an alert.

… The script then compares the Ask & Bid entries in the order book. If it finds a crossed or locked order, where the bid price is greater than (crossed) or equal to (locked) the ask price, it writes information about that order to a file. It then continues checking entries on both sides of the book until it finds a valid combination. Note that the entries are not checked for “staleness”. As soon as it finds non-crossed/locked Ask and Bid entries, regardless of their timestamps, it is done checking that symbol.

The script takes the entries from the crossed/locked file, and creates a CIM file containing a delete message for every order given. This CIM data is then sent to (the admin port of) order book engine to execute the deletes.

Before the cleaner is invoked manually, we have a scheduled scanner for crossed books. This scanner checks every symbol once a minute. I think it uses a low-priority read-only thread.

IP4 fragmentation+reassembly #MTU,offset

I consider this a “halo” knowledge pearl because it is part of an essential everyday service. We can easily find an opportunity to inject it into an IV.

A Trex interviewer said something questionable. I said fragmentation is done at IP layer and he said yes but not reassembly. He was wrong. See P329 [[Computer Networking]]

I was talking about IP layer breaking up , say, a 4KB packet (TCP or UDP packet) into three IP-fragments no bigger than 1500B [1]. The reassembly task is to put all 3 fragments back together in sequence (and detect missing fragments) and hand it over to TCP or UDP.

This reassembly is done in IP layer. IP4 uses an “offset” number in each fragment to identify the sequencing and to detect missing fragments. The fragment with the highest offset also has a flag indicating it’s the last fragment of a given /logical/ packet.

Therefore, IP4 detects and will never deliver partial packets to UDP/TCP (P328 [[computer networking]]), even though IP is considered an unreliable service. IP4 can detect missing/incomplete IP-datagram and will refuse to “release” it to upper layer (i.e. UDP/TCP).

  • If TCP doesn’t get one “segment” (a.k.a. IP-datagram), it will request retransmission
  • UDP does no retransmission
  • IP4 does no retransmission

[1] MTU for some hardware is lower than 1500 Bytes …

new socket from accept()inherits port# from listen`socket

[[tcp/ip sockets in C]] P96 has detailed diagrams, but this write-up is based on https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.bpxbd00/accept.htm

Background — a socket is an object in memory, with dedicated buffers.

We will look at an sshd server listening on port 22. When accept() returns without error, the return value is a positive integer file descriptor pointing to a new-born socket object with its own buffers. I call it the “worker” socket. It inherits almost every property from the listening socket (but see tcp listening socket doesn’t use a lot of buffers for differences.)

  • local (i.e. server-side) port is 22
  • local address is a single address that client source code specified. Note sshd server could listen on ANY address, or listen on a single address.
  • … so if you look at the local address:port, the worker socket and the listening socket may look identical, but they are two objects!
  • remote port is some random high port client host assigned.
  • remote address is … obvious
  • … so if you compare the worker-socket vs listening socket, they have
    • identical local port
    • different local ip
    • different remote ip
    • different remote port

c++get current command line as str

#include <fstream>

std::string const & get_command_line() {
  static std::string ret;
  if (ret.empty())
  try{
        std::string path="/proc/" + std::to_string((long long)getpid()) + "/cmdline";
        std::cerr<<"initializing rtsd parsername from "<<path<<" ..\n";
        std::ifstream myfile(path);
        if (!myfile) return ret;
        getline(myfile, ret);
  }catch(...){
  }
  return ret;
}

bucketSort: O(N+k) could be better or worse than O(N)

Based on the 53-vote top answer in https://stackoverflow.com/questions/7311415/how-is-the-complexity-of-bucket-sort-is-onk-if-we-implement-buckets-using-lin

It’s obvious (by definition) that O(2N) is equivalent to O(N).

O(N+k) is worse than O(2N) when k is larger, like 4000 times N. For a concrete illustration, say, we have N=5 million strings to sort, using k = 20 billion buckets.

  • It takes constant time to bucket each string, so the first step takes O(N) i.e. grows proportional to N.
  • 2nd step is now dominated by k, since all 20 billion buckets have to be examined. Time complexity here is O(N+k). This is explained in the stackoverflow answer.
  • Therefore, total complexity is O(N+k) i.e. cN + bk with two constant factors c and b. Even if b is small (tcost of looking at an empty bucket), as k grows, at sometime the k term would dominate the N term.

In practice, k is often chosen to be comparable to N, so I don’t think this is a practical issue.

##short-n-hard algo@simple data structures

90% of the hardest coding questions use simple data structures. Some of these algos are

  • very short but baffling
  • doesn’t use fancy data structures
  • requires prior study; no expectation to invent them during an interview

–Linked list

Brent — check for loop in linked list

–binary tree

Morris — Morris in-order walk]O(N) #O(1)space

–string

  • regex
  • KMP string search
  • longest palindrome substr
  • edit distance

–array

 

y I use lots of if..{continue;}

Inside a loop, many people prefer if/elif/else. To them, it looks neat and structured.

However, I prefer the messier if…continue; if…continue; if..continue. Justification?

I don’t have to look past pageful of if/elif/elif/…/else to see what else happens to my current item. I can ignore the rest of the loop body.

Beside the current item, I also can safely let go (rather than keeping track) of all the loop-local variable values longer. All of them will be wiped out and reset to new values.

python queues #deque

In a coding test, I will use a vanilla list for both.

https://docs.python.org/2/tutorial/datastructures.html confirms that list can be an efficient stack and an inefficient queue.

— Stack can use

  • list.append()
  • list.pop() with no arg.
  • Top of stack is list[-1]

— Queue:  https://github.com/tiger40490/repo1/blob/py1/py/tree/bTreeDftBftSerialize_bbg.py shows an primitive (inefficient) Queue class I wrote:

  • dequeue — list.pop()
    • A non-queue operation — pop(2) would remove and return the 3rd vector item but this is inefficient on a vector!
  • enqueue — list.insert(0, newItem) — similarly inefficient

A deque or circular array (fixed capacity) are more efficient for a queue. Linked list is slower due to frequent allocations.

[[python cookbook]] P659 compared these and other alternatives. The fastest is collections.deque.

elaborate python project/challenge #PWM

Key is “technical challenge”. Without it, I may look like just another python coder. Some of the interviewers may not be easy to impress. I will only target the majority.

Different companies might have different needs for python skill. I can only guess

  1. Usage: data analysis, data science
  2. Usage: automation, glue language
  3. …. py as primary language to implement business logic in an enterprise application? Only in Macquarie, baml and jpm
    1. I could remake my PWM Commissions system in python

For 2), I used

  • devops
  • integration with git, maven, g++, code generation, test result parsing, …. for devops
  • automated testing
  • simple network servers
  • subprocess
  • integration with shell script
  • automatic upload
  • generate c++ source code to be compiled into a native python extension
  • import hack
  • logging decorators — https://github.com/tiger40490/repo1/blob/py1/py/loggingDecorator.py
  • python code attached to trade object. Code to be evaluated at reval time.

tcp: one of 3 client-receivers is too slow

See also no overflow]TCP slow receiver #non-blocking sender

This is a real BGC interview question https://bintanvictor.wordpress.com/2017/04/08/bgc-iv-c-and-java/

Q: server is sending data fast. One of the clients (AA) is too slow.

Background — there will be 3 worker-sockets. The local address:port will look identical among them if the 3 clients connect to the same network interface, from the same network segment.

The set-up is described in simultaneous send to 2 tcp clients #mimic multicast

Note every worker socket for every client has identical local port.

See https://stackoverflow.com/questions/1997691/what-happens-when-tcp-udp-server-is-publishing-faster-than-client-is-consuming

I believe the AA connection/session/thread will be stagnant. At a certain point [1] server will have to remove the (mounting) data queue and release memory — data loss for the AA client.

[1] can happen within seconds for a fast data feed.

I also feel this set-up overloads the server. A TCP server has to maintain state for each laggard client, assuming single-threaded multiplexing(?). If each client takes a dedicated thread then server gets even higher load.

Are 5 client Connections using 5 sockets on server? I think so. Can a single thread multiplex them? I think so.

## Y avoid blocking design

There are many contexts. I only know a few.

1st, let’s look at an socket context. Suppose there are many (like 500 or 50) sockets to process. We don’t want 50 threads. We prefer fewer, perhaps 1 thread to check each “ready” socket, transfer whatever data can be transferred then go back to waiting. In this context, we need either

  • /readiness notification/, or
  • polling
  • … Both are compared on P51 [[TCP/IP sockets in C]]

2nd scenario — GUI. Blocking a UI-related thread (like the EDT) would freeze the screen.

3rd, let’s look at some DB request client. The request thread sends a request and it would take a long time to get a response. Blocking the request thread would waste some memory resource but not really CPU resource. It’s often better to deploy this thread to other tasks, if any.

Q: So what other tasks?
A: ANY task, in the thread pool design. The requester thread completes the sending task, and returns to the thread pool. It can pick up unrelated tasks. When the DB server responds, any thread in the pool can pick it up.

This can be seen as a “server bound” system, rather than IO bound or CPU bound. Both the CPU task queue and the IO task queue gets drained quickly.

 

java lock: atomic^visibility

You need to use “synchronized” on a simple getter, for the #2 effect.!

 

A typical lock() operation in a java method has two effects:

  1. serialized access — unless I release the lock, all other threads will be blocked grabbing this lock
  2. memory barrier — after I acquire the lock, all the changes (on shared variables) by other threads are now visible. The exact details are .. to be detailed.

I now feel the 2nd effect is often more important (and more tricky) than the 1st effect. See P94 [[Doug Lea]] . I like this simple summary, even if not 100% complete —

“In essence, releasing a lock forces a flush of all writes from working memory….and acquiring a lock forces reload of accessible fields.”

Q: what are the subset of “accessible fields” in a class with 9 fields?
A: I believe the compiler knows “in advance” what subset of fields will be accessed after lock acquisition.

Q: what if I acquire a lock, do nothing and release the lock? Is there the #2 effect?
A: I doubt it. You need to enclose all the “tricky” operations between a lock grab/release. If you leave some update (to a shared mutable) outside in the “cold”, then #1 will fail and #2 may also fail.

 

jmp_buf/setjmp() basics for IV #ANSI-C

Q: how is longjmp different from goto? See http://ecomputernotes.com/what-is-c/function-a-pointer/what-is-the-difference-between-goto-and-longjmp-and-setjmp

A: longjmp can 1) jump across functions, and 2) restore program state from a jmp_buf, which was saved earlier by setjmp.

defining+using your swap() #gotcha

Gotcha! If you define your own swap() then it may not get picked up depending on some subtle factors. In the demo below, when the args are declared as “int” variables, then the hidden std::swap() gets picked up instead of your custom swap(size_t, size_t)!

Note there’s no specific #include required.

  • Solution : if practical, avoid “using namespace std” even in cpp files
  • solution : except the outermost main(), enclose everything  in an anonymous namespace, to force the unqualified swap() to resolve to your custom version.
#include <iostream>
#include <assert.h>

//namespace{

using namespace std;

int value1 = 5;
int calls=0;
template <typename T> void swap(size_t a, size_t b){
        ++calls;
        std::cout<<"in my swap()"<<std::endl;
}

int mymain(){
  int a=0, b=1;
  int oldCount = calls;
  swap<int>(a,b); //int arguments won't invoke my swap()
  assert (calls == oldCount);

  std::cout<<"after 1st call to swap()"<<std::endl;
  swap<int>(0,1); //calls my swap()
  assert (calls == 1+oldCount);

  std::swap<int>(a,b);  //can compile even without any #include
  // no return required
}

//}

int main(){
  mymain();
}

symlink/hardlink: Win7 or later

https://www.howtogeek.com/howto/16226/complete-guide-to-symbolic-links-symlinks-on-windows-or-linux/ is a 2017 article.

The mklink command can create both hard links (known as “hard links” in Windows) and soft links (known as “symbolic links” in Windows).

On Windows XP, I have used “Junction.exe” for years, because mklink is not available.

coin problem #all large-enough amounts are decomposable

This is not really a algo IV question, but more like brain teaser problem.

Based on https://en.wikipedia.org/wiki/Coin_problem — For example, the largest amount that cannot be obtained using only coins of 3 and 5 units is 7 units. The solution to this problem for a given set of coin denominations is called the Frobenius number of the set. The Frobenius number exists as long as the set of coin denominations has no common divisor.

Note if a common divisor exists as in {2,4} then all the odd amounts will be non-decomposable.

Q: why a very large amount is always decomposable ? Give an intuitive explanation for 2 coin values like 3 and 5.

Here’s an incomplete answer — 15 (=3*5), 16, 17 are all decomposable. Any larger number can be solved by adding 3’s .

In fact, it was proven that any amount greater than (not equal to) [xy-x-y] are always decomposable. So if we are given 2 coin values (like 4,5, where x is the smaller value) we can easily figure out a range

xy-x-y+1  to xy-y

are each decomposable. Note this range has x distinct values. So any higher amount are easily solved by adding x’s

Also note xy-y is obviously decomposable as (x-1)y.

 

##G5 std::map tasks 4cod`IV

custom hash func? See short example on P 364 [[c++standard library]]. [[optimized c++]] has many examples too.

initialize? There’s a simple constructor taking a long initializer, but the insert() methods support the same and are more versatile.

insert? single pair; range (anotherMap.being(), end());

** insert single value — won’t overwrite pre-existing
** map1.emplace(…)
** map1[key2] = val3 // overwrites pre-existing
** insert list of values — http://en.cppreference.com/w/cpp/container/unordered_map/insert

(returning the value) lookup? at() better than operator[]

a pointer type as key? useful technique.

erase? by a specific key. No need to call another function to really erase the node.

Q: create only if absent; no update please
A: insert()

Q2: create or uppate
Q2b: look up or create
A: operator []

Q1: update only; no create please
Q1b: look up only. No create please
A: find() method

Q: check for existance
A: c.find() is slightly better than c.count() esp. for multi_* containers

 

complexities: replicate exchange order book #Level1+2

–For a generic orderbook, the Level 1 complexity is mostly about trade cancel/correct.
All trades must be databased (like our TickCache). In the database, each trade has a trade Id but also an arrival time.

When a trade is canceled by TradeId, we need to regenerate LastPrice/OpenPrice, so we need the ArrivalTime attribute.

VWAP/High/Low are all generated from TickCache.

The other complexity is BBO generation from Level 2.

–For a Level-based Level 2 order book, the complexity is higher than Level 1.

OrderId is usually required so as to cancel/modify.

asymmetry lower_bound^upper_bound #IFF lookup miss

For a “perfect” hit in both set::lower_bound() and std::lower_bound(), the return value is equivalent to the target; whereas upper_bound is strictly higher than target. See

To achieve symmetry, we need to decrement (if legal) the iterator returned from upper_bound.
———-
If no perfect hit, then lower_bound() and upper_bound() both give the next higher node, i.e. where you would insert the target value.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

vector<int> v{1,3,5};
int main(){
  vector<int>::iterator it;
  it = lower_bound(v.begin(), v.end(), 2); cout<<*it<<endl;
  it = upper_bound(v.begin(), v.end(), 2); cout<<*it<<endl;
}

array^pointer variables types: indistinguishable

  • int i; // a single int object
  • int arr[]; //a nickname of the starting address of an array, very similar to a pure-address const pointer
  • int * const constPtr;
  • <— above two data types are similar; below two data types are similar —->
  • int * pi; //a regular pointer variable,
  • int * heapArr = new int[9]; //data type is same as pi

memset: a practical usage #Gregory

  • memset is a low-level C function.
  • memset takes a void pointer.
  • Fast and simple way to zero out an array of struct, having primitive data members. No std::string please. No ptr please. Use sizeof to get the byte count.
  • Useful in low level wire coding
// illustrates packed and memset
#include <iostream>
using namespace std;

struct A{
  unsigned int i1; //4 bytes
  bool b; //1 byte
  char cstr[2];
  int* ptr; //8 bytes
} __attribute__((packed));
size_t const cnt = 3;
A arr[cnt];
int main(){
  cout<<sizeof(A)<<endl;
  size_t sz = sizeof(arr);
  cout<<sz<<endl;
  memset(arr, 0, sz);
  for(size_t i=0; i<cnt; ++i){
    A* tmp = &arr[i];
    cout<<"i1 = "<<tmp->i1<<"; b = "<<tmp->b<<" ; cstr[1] = "<<(int)tmp->cstr[1]<<" ptr = "<<tmp->ptr<<endl;
  }
}

memory leak demo: memset() std::string

valgrind proves the leak.

using namespace std;
size_t const len=15;
struct A{
        string s4;
        A(string s="default std::string"): s4(s){ cout<<"ctor"<<endl; }
};
size_t const  cnt=2;
size_t siz=cnt * sizeof(A);
A arr[cnt], ar2[cnt];

char fname[] = "/tmp/,.dat";
void leakDemo1() {
        {
                arr[0]=A("grin");
                arr[1]=A("frown"); //somehow skipping this causes core dump
        }
        cout<<"before write()"<<endl;

        int fd = open(fname, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
        write(fd, arr, siz);
        close(fd);

        if(1){
                int fd2 = open(fname, O_RDONLY);
                read(fd2, ar2, siz);
                close(fd2);
        }
        for (int idx = 0; idx < cnt; ++idx){
                A * tmp = ar2 + idx;
                cout<<tmp->s4<<endl;
        }
}
void leakDemo2(){
        int * intp = new int(11); //not deleted. valgrind detected the leak
        memset(&intp, 0, 8); //overwrite the 8-byte pointer object
        delete intp;  //deleting on the fake address from memset
}
void leakDemo3(){
        string s="hello";
        cout<<"sie of string == size of pointer = " << sizeof(string)<<endl; //inside the string object, there's nothing but a poitner!
        memset(&s, 0, 1); // overwite pointer object itself, so it now points to somewhere else ... leak

        //somehow, memset(....,2) causes seg fault?
}
int main() {
        leakDemo1();
}

c++q[new] variations

  1. variation: new MyClass(arg1) // most common. Can throw bad_alloc
  2. variation: new MyClass() // better form, calling the no-arg ctor
  3. variation: new MyClass //bare word, same as above. See op-new: allocate^construct #placement #IV
  4. variation: new (nothrow) MyClass(…) // returns NULL upon failure
  5. variation: placement new
  6. variation: array-new // no argument allowed!

## IV favorites ] sockets

There are dozens of sub-topics but in my small sample of interviews, the following sub-topics have received disproportionate attention:

  1. blocking vs non-blocking
  2. tuning
  3. multicast
  4. add basic reliability over UDP (many blog posts); how is TCP transmission control implemented
  5. accept() + threading
  6. select (or epoll) on multiple sockets

Background — The QQ/ZZ framework was first introduced in this post on c++ learning topics

Only c/c++ positions need socket knowledge. However, my perl/py/java experience with socket API is still relevant.

Socket is a low-level subject. Socket tough topics feel not as complex as concurrency, algorithms, probabilities, OO design, … Yet some QQ questions are so in-depth they remind me of java threading.

Interview is mostly knowledge test; but to do well in real projects, you probably need experience.

Coding practice? no need. Just read and blog.

Socket knowledge is seldom the #1 selection criteria for a given position, but could be #3. (In contrast, concurrency or algorithm skill could be #1.)

  • [ZZ] tweaking
  • [ZZ] exception handling in practice
  • —-Above topics are still worth studying to some extent—–
  • [QQ] tuning
  • [QQ] buffer management

 

thread-unsafe shared_ptr: tiny examples

##shared_ptr thr-safety: 3 cases is a “framework”.

http://www.boost.org/doc/libs/1_35_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety gave simple examples of unsafe access.

//— Example 3, where the shared mutable is a club member p3 ——————
// thread A
p = p3; // reads p3, writes p

// thread B
p3.reset(); // writes p3: Undefined, simultaneous read/write

I gave this example correctly to a Sep 2018 SCB interview 🙂

===> In the same clock cycle, p3 Object is accessed on 2 threads. This is nothing to do with the ref count, or the pointee object.

//— Example 4,  where the shared mutable is a club member p2 ——————
// thread A
p3 = refToP2; // reads p2, writes p3

// thread B
// p2 goes out of scope: Undefined, simultaneous read/write since the destructor is considered a “write access”

===> In the same clock cycle, p2 Object gets write-accessed.

op-new: allocate^construct #placement #IV

Popular IV topic. P41 [[more effective c++]] has an excellent summary:

  1. to BOTH allocate (on heap) and call constructor, use regular q(new)
  2. to allocate Without construction, use q(operator new)
    1. You can also use malloc. See https://stackoverflow.com/questions/8959635/malloc-placement-new-vs-new
  3. to call constructor on heap storage already allocated, use placement-new, which invokes ctor

The book has examples of Case 2 and Case 3.

Note it’s common to directly call constructor on stack and in global area, but on the heap, placement-new is the only way.

Placement-new is a popular interview topic (Jump, DRW and more …), rarely used in common projects.

##y MultiCast favored over TCP

Reason: data rate constraints inherent in TCP protocol. Congestion Control?
Reason: TCP to a large group would be one-by-one unicast, highly inefficient and too much load on the sender. Reason: TCP has more data-overhead in the form of non-payload data. * TCP header is typically 20 bytes vs 8 bytes for UDP
* Receiver need to acknowledge

in-depth article: epoll illustrated #SELECT

(source code is available for download in the article)

Compared to select(), the newer linux system call epoll() is designed to be more performant.

Ticker Plant uses epoll. No select() at all.

https://banu.com/blog/2/how-to-use-epoll-a-complete-example-in-c/ is a nice article with sample code of a TCP server.

  • bind(), listen(), accept()
  • main() function with an event loop. In the loop
  • epoll_wait() to detect
    • new client
    • new data on existing clients
    • (Using the timeout parameter, it could also react to a timer events.)

I think this toy program is more readable than a real-world epoll server with thousands of lines.

static_cast^reinterpret_cast

I like the top answer in https://stackoverflow.com/questions/6855686/what-is-the-difference-between-static-cast-and-reinterpret-cast.

Both are unsafe casts and could hit runtime errors, but RC is more unsafe. RC turns off compiler error checking, so you are on your own in the dark forest.

I feel RC is the last resort, when you have control on the bit representation of raw data, with documented assurance this bit representation won’t change.

In a real example in RTS — a raw byte array comes in as a char array, and you RC it into (pointer to) a packed struct. The compiler has no reason to believe the char array can be interpreted as that struct, so it skips all safety checks. If the raw data somehow changes you get a undefined behavior at run time. In this case SC is illegal — will not pass compiler.

 

q[grep]-F -o -P -r -C –color

-P

Use perl regex. Absolutely necessary if you want non-greedy match like .*?

Without -P, the question mark would be treated as literal.

-F, –fixed-strings

Interpret PATTERN as a list of fixed strings, separated by new-lines, any of which is to be matched. See http://stackoverflow.com/questions/3242873/grep-for-literal-strings for examples. Useful if your needle contains meta-characters you want to treat as literals

-x, –line-regexp

Select only those matches that exactly match the whole line.

-o … output only the matched substring.

-C … context

-r … recursive

Inlining is THE optimization

MSVS and g++ debug build both disable inline (presumably to ease debugging). The performance difference vis-a-vis release build is mostly due to this single factor, according to [[optimized c++]]

The same author asserts that inlining is probably the most powerful code optimization.

Stroustrup singled out inline as a key feature of c++.

In his low-latency seminar, Martin Thompson also quoted a computer scientist saying “Inlining is THE optimization”. This is now a memorable quote.

I think one of my java performance books singled out inlining as the single most effective optimization compiler can do.

Pimpl effectively disables inlining

vptr-based dispatch also disables inlining

C++faq gives many reasons for and against inline.

big bright world behind heavy door

(Was so hard to get over U.S. barriers in the form of h1b and later GC …)

Was so hard to overcome the English barrier[1] … I struggled so very hard. Slowly I was able to express myself, with a barely enough vocab…. Once the door cracked open, I got my first glimpse of a big, bright world[1] Behind the door. Immediately I could see myself thriving in that world.

With a safe retreat always available, there’s no real risk [2] to me so I boldly jumped in as soon as possible.

So which items in the 5advantages post make the “big bright world”? Biggest item is Job pool i.e. the age-friendly and huge job pool with premium salary + lower stress.

[1] Some people don’t need English and can do well using Chinese as primary language … I’m different.
[2] there’s indeed some risk to wife and to kids…

simple implementation of memory allocator#no src

P9 [[c++game development primer]] has a short implementation without using heap. The memory pool comes from a large array of chars. The allocator keeps track of allocated chunks but doesn’t reuse reclaimed chunks.

It showcases the header associated with each allocated chunk. This feature is also part of a real heap allocator.

reinterpret_cast is used repeatedly.

pointer/itr as field: a G3 implementation pattern

common interview question.

This mega-pattern is present in 90% of java and c# classes, and also very common in c++. Important data structure classes  relying on pointer fields include vectors, strings, hashtables and most STL or boost containers.

Three common choices for a pointer member variable:

  • Raw pointer — default choice
  • shared_ptr — often a superior choice
  • char pointer, usually a c-str

In each case, we worry about

  • construction of this field
  • freeing heap memory
  • RAII
  • what if the pointee is not on heap?
  • copy control of this field
  • ownership of the pointer
  • when to return raw ptr and when to return a smart ptr

shared_ptr field offers exception safety, automatic delete, default synthesized dtor, copier, op=

 

multicast address 1110xxxx #briefly

By definition, multicast addresses all start with 1110 in the first half byte. Routers seeing such a destnation (never a source) address knows the msg is a multicast msg.

However, routers don’t forward any msg with destnation address 224.0.0.0 through 224.0.0.255 because these are local multicast addresses. I guess these local multicast addresses are like 192.168.* addresses.

SO_REUSEPORT TCP server socket option – hungry chicks

With SO_REUSEPORT option, multiple TCP server processes could bind() to the same server endpoint. Designed for the busiest multithreaded servers.

http://i.dailymail.co.uk/i/pix/2011/03/03/article-1362552-0D7319F3000005DC-882_634x357.jpg – a bunch of hungry chicks competing to get the next worm the mother delivers. The mother can only give the worm to one chick at a time. SO_REUSEPORT option sets up a chick family. When an incoming connection hits the accept(), kernel picks one of the accepting threads/processes and delivers the data to it alone.

See https://lwn.net/Articles/542629/  + my socket book P102.

(server)promiscuous socket^connected socket

[[tcp/ip sockets in C]] P100 has a diagram showing that an incoming packet will be matched against multiple candidate listening sockets:

  • format: {local address:local port / remote address:remote port}
  • Socket 0: { *:99/*:*}
  • Socket 1: {10.1.2.3:99/*:*}
  • Socket 2: {192.168.3.2:99/ 172.1.2.3:30001} — this one has the remote address:port populated because it’s an Established connection)

An incoming packet need to match all fields otherwise it’s rejected.

However it could find multiple candidate sockets. Socket 0 is very “promiscuous”. The rule (described in the book) is — the more wild cards, the less likely selected.

(Each packet must be delivered to at most 1 socket as far as I know.)

atomic≠lockFree≈CAS

See also blog on c++ atomic<int> — primary usage is load/store, not CAS

lock free isn’t equal to CAS — lock-free construct also include Read-Copy-Update, widely used in linux and other kernels.

atomic isn’t equal to lock-free — For example, c++ atomic classes (actually templates) can use locks if the underlying processor lacks CAS instruction.

 

java override: strict on declared parameter types

Best practice – use @Override on the overriding method to request “approval” by the compiler. You will realize that

https://briangordon.github.io/2014/09/covariance-and-contravariance.html is concise –

Rule 1: “return type of the overriding method can (but not c++ [1]) be a subclass of the return type of the overridden method, but the argument types must match exactly”

So almost all discrepancies between parent/child parameter types (like int vs long) will be compiled as overloads. The only exception I know is — overriding method can remove “” from List as the parameter type.

There could be other subtle rules when we consider generics, but in the world without parameterized method signatures, the above Rule 1 is clean and simple.

[[ARM]] P212 explains the Multiple-inheritance would be problematic if this were allowed.

cloud4java developers – brief notes

Am an enterprise java developer, not a web developer. I feel PaaS is designed more for the web developer.

I agree with the general observation that IaaS doesn’t impact us significantly.

I feel SaaS doesn’t either. SaaS could offer devops (build/delivery) services for java developer teams.

PaaS has the biggest impact. We have to use the API /SDK provided by the PaaS vendor. Often no SQL DB. Can’t access a particular host’s file system. MOM is rarely provided.

java ReentrantLock^synchronized keyword

I told Morgan Stanley interviewers that reentrantLock is basically same thing as Synchronized keyword. Basically same thing but with additional features:

  • Feature: lockInterruptibly() is very useful if you need to cancel a “grabbing” thread.
  • Feature: tryLock() is very useful. It can an optional timeout argument.

Above features all help us deal with deadlock:)

  • Feature: Multiple condition variables on the same lock.
  • Feature: lock fairness is configurable. A fair lock favors longest-waiting thread. Synchronized keyword is always unfair.
  • — query operations —
  • Feature: bool hasQueuedThread(targetThread) gives a best-effort answer whether targetThread is waiting for this lock
  • Feature: Collection getQueuedThreads() gives a best-effort list of “grabbing” threads on this lock
  • Feature: Collection getWaitingThreads (aConditionVar) gives a best-effort view of the given “waiting room”.
  • Feature: int getHoldCount() basically gives the “net” re-entrancy count
  • Feature: bool isHeldByCurrentThread()

substring search – Boyer-Moore

I think single-string coding questions are evergreen. The insights and techniques might be …reusable?

Brute force search — shift by one, then attempt to match the entire substring. My code below is more efficient than brute-force. Mostly it shifts by one each time, but checks just one char.

Q: write a python function match(string haystack, string nonRegexNeedle), following the simplified Boyer-Moore algo:

Try matching at the left end. If mismatch, then identify the LEFT-most offending char (J) in haystack. Now locate the RIGHT-most position of J in needle (if not found then simply shift all the way past the J.) and right-shift the needle to align the two J.

Q: So when would we right-shift  by one? Not sure. So I don’t know if my code below is 100% correct in all cases.

def visualize(haystack, substring, offset):
  print '\t--' + '0123456789 123456789 12345-----'
  print '\t--' + haystack
  print '\t--' + ' '*offset + substring
  print '\t--' + '-------------------------------'

def match(haystack, substring):
  offset = 0 #offset from start of haystack, where substring is now
  while(True):
    #now locate first offender in haytack
    offenderOffset, offender = -9999, ''
    for i, char in enumerate(substring):
       offenderOffset = offset+i
       if substring[i] != haystack[offenderOffset]:
         offender = haystack[offenderOffset]
         print offender + ' <-- spotted at offenderOffset = ', offenderOffset
         visualize(haystack, substring, offset)       
         break;
         
    if offender == '': return True
    
    # now back-scan substring to locate offender, and then shift to align
    # not really forward-scan haystack
    while True:
      offset += 1
      if offset + len(substring) > len(haystack):
           print 'Gave up complately:'
           visualize(haystack, substring, offset)       
           return False
      if offenderOffset - offset == -1: 
        print 'gave up on aligning: '
        visualize(haystack, substring, offset)
        break
      if substring[offenderOffset - offset] == offender:
        print 'aligned: '
        visualize(haystack, substring, offset)
        break # go back to start of outer while loop

print match('aborb bdb', 'abors') 

 

java exception passing between threads

Many people ask how to make a child thread’s exception “bubble up” to the parent thread.

Background — A Runnable task is unsure how to handle its own exception. It wants to escalate to parent thread. Note parent has to block for the entire duration of the child thread (right after child’s start()), blocked either in wait() or some derivative of wait().

This question is not that trivial. Here are my solutions:

1) Callable task and Futures results — but is the original exception escalated? Yes. P197 [[java threads]]

2) Runnable’s run() method can temporarily catch the exception, save the object in a global variable such as a blocking queue, and notifyAll(). Parent thread could check the global variable after getting notified. Any thread can monitor the gloabal.

If you don’t have to escalate to parent thread, then

3) setUncaughtExceptionHandler() – I think the handler method is called in the same exception thread — single-threaded. In the handler, you can give the task to a thread pool, so the exception thread can exit, but I don’t know how useful.

4) adopt invokeAndWait() design — invokeAndWait() javadoc says “Note that if the Runnable.run() method throws an uncaught exception (on EDT) it’s caught and rethrown, as an InvocationTargetException, on the callers thread”

In c#, there are various constructs similar to Futures.get() — seems to be the standard solutions for capturing child thread exception.
* Task.Wait()
* Task.Result property
* EndInvoke()

java interfaces have only abstract method@@ outdated]java8

Compared to c#, java language design was cleaner and simpler, at the cost of lower power and flexibility. C++ is the most flexible, powerful and complex among the trio.

There was never strong reason to disallow static methods in an interface, but presumably disallowed (till java 7) for the sake of simplicity — “Methods in interface is always abstract.” No ifs and buts about it.

With “default methods”, java 8 finally broke from tradition. Java 8 has to deal with Multiple Inheritance issue. See other blog posts.

In total, there are now 4 “deviations” from that simple rule, though some of them are widely considered irrelevant to the rule.

  1. a concrete nested class in an interface can have a concrete method, but it is not really a method _on_ that interface.
  2. Suppose an interface MyInterFace re-declares toString() method of Object.java. That method isn’t really abstract.
    • There’s very few reasons to do this.
  3. static methods
  4. default methods — the only real significant deviation from the rule

See https://stackoverflow.com/questions/27833168/difference-between-static-and-default-methods-in-interface

goto: justifications

Best-known use case: break out of deeply nested for/while blocks.

  • Alternative — extract into a function and use early return instead of goto. If you need “goto: CleanUp”, then you can extract the cleanup block into a function returning the same data type and replace the goto with “return cleanup()”
  • Alternative — extract the cleanup block into a void function and replace the goto with a call to cleanup()
  • Alternative (easiest) — exit or throw exception

Sometimes none of the alternatives are easy. To refactor the code to avoid goto requires too much testing, approval and release. The code is in a critical module in production. Laser surgery is preferred — introduce goto.

 

my_unsigned_int = -1

http://www.cplusplus.com/reference/string/string/npos/:

static const size_t npos = -1;

npos is a static std::string member constant value with the greatest possible value for an element of type size_t.This value, when used as the value for a len (or sublen) parameter in string‘s member functions, means “until the end of the string”. As a return value, it is usually used to indicate no matches.This constant is defined with a value of -1, which because size_t is an unsigned integral type, it is the largest possible representable value for this type.

##boost modules USED ]finance n asked]IV

#1) shared_ptr (+ intrusive_ptr) — I feel more than half  (70%?) of the “finance” boost usage is here. I feel every architect who chooses boost will use shared_ptr
#2) boost thread
#3) hash tables

Morgan uses Function, Tuple, Any, TypeTraits, mpl

Gregory (RTS) said bbg used shared_ptr, hash tables and function binders from boost

Overall, I feel these topics are mostly QQ types, frequently asked in IV (esp. those in bold) but not central to the language. I would put only smart pointers in my Tier 1 or Tier 2.

—— other modules used in my systems
* tuple
* regex
* noncopyable — for singletons
** private derivation
** prevents synthesized {op= and copier}, since base class is noncopyable.

* polymorphic_cast
* numeric_cast — mostly int types, also float types
* bind?
* scoped_ptr — as non-copyable [1] stackVar,
[1] different from auto_ptr

tiny team of elite developers

Upshot — value-creation per head and salary level would rival the high-flying manager roles.

Imagine a highly successful trading shop. Even though the trading profit (tens of millions) is comparable to a bank trading desk with hundreds of IT head count, this trading shop’s core dev team *probably* have a few (up to a few dozen) core developers + some support teams [1] such as QA team, data team, operations team. In contrast, the big bank probably have hundreds of “core” developers.

[1] Sometimes the core teams decide to take on such “peripheral” tasks as market data processing/storage, back testing, network/server set-up if these are deemed central to their value-add.

In the extreme case, I guess a trading shop with tens of millions of profit can make do with a handful of developers. They want just a few top geeks. The resultant efficiency is staggering. I can only imagine what personal qualities they want:

* code reading — my weakness
* tools – https://bintanvictor.wordpress.com/2012/11/08/2-kinds-of-essential-developer-tools-on-wall-st-elsewhere/ * manuals — reading tons of tech info (official or community) very quickly, when a “new” system invariably behave strangely
* local system knowledge
* trouble-shooting — and systematic problem-solving. I feel this largely depends on system knowledge.
* design — it right, and able to adjust it as requirements change * architecture?
* tuning?
* algorithms?

(soft skills:)
* clearly communicate design trade-offs in a difficult discussion * drive to get things done under pressure without cutting corners * teamwork — teamwork to back down when needed to implement a team decision

pyramid of wage levels, CN^sg^US

See also the similar post on NBA salary

Look at my brother-in-law. (I’m not too sure about his situation so I will use tentative statements.) He’s smart, dedicated. Rather long experience as a team lead. He has a masters from a top uni in Shanghai.

However, there are many people with similarly strong track record in China, so he can’t move up the pyramid to, say CNY 1000k. I guess 500k is also tough.

In Singapore I’m facing a similar challenge. A S$150k (after tax) tech job is rare and considered elite so you need to be rather strong to get it. In other words, the pyramid has a sharper tip than in the US pyramid, based on a sample of 5,000 IT jobs.

(I think the Shanghai salary distro is better than most China cities…)

The NBA post brings together other important factors — lifelong income; managerial skill; …

linux command elapse time

–q(time) command

output can be a bit confusing.

Q: does it work with redirection?

–$SECONDS variable

architect’s helicopter view of a system

This (pearl) post is not only describing some role models. This is a career tip, possibly survival tip.

Bottom line — my capabilities (code reading…) is not very high, so as architect/owner of any large codebase I would tend to struggle. There are some ways to survive, but not easy. I would think a high volume, high latency batch+web/quant system in java/SQL/python/javascript would be easier for me and harder for my peers.

Prime eg: Yang. Nikhil calls it “his system knowledge” – Without knowing the source code, Yang would ask sharp questions that baffles the actual coders. If the answer smells bad, then either answer is not exactly truthful, or existing system design is imperfect. Yang is quick to notice “inconsistency with reality”, a very sharp observation. This pattern is critical in troubleshooting and production support as well. I told XR that production support can be challenging and require logical, critical thinking about the systems as a whole.

An __effective__ geek needs more than high-level system knowledge. Somehow, every one of them soon gets down to the low level, unless her role is a purely high-level role … like presales?

Not everyone has the same breadth and width of helicopter view. Often I don’t need it in the first few years. You would think after many years, each would acquire enough low-level and high-level knowledge, but I imagine some individuals would refuse to put in the extra effort, and keep her job by taking care of her own “territory” and never step out.

— my own experience —

At Citi muni, I largely avoided the complexity. I did try and trace through the autoreo codebase but stopped when I was given other projects. In hind sight, I feel it’s too slow and probably not going to work. Mark deMunk pointed out the same.

At 95 Green and Barcap, I was a contractor and didn’t need to step out of my territory. This might be one reason I had an easier job and exceed expectations. See post on “slow decline” and http://tigertanbinpripri.blogspot.com/2015/09/fwdsome-reasons-why-oc-boss-complains.html, both in the pripri blog.

At OC, the Guardian codebase I did take up fully. Quest was given to me after I /fell out/ with boss, so I did enough research to implement any required feature. No extra effort to get a helicopter view.

Stirt was tough. I spent a few months trying to learn some but not all of the nuts and bolts in Quartz. This knowledge is fundamental and crucial to my job and I didn’t have too much time learning the Stirt system, which I did learn but didn’t master. In fact, I asked many good system-level questions about Sprite and Stirt-Risk, but failed to gain Yang’s insight.

 

python RW global var hosted in a module

Context: a module defines a top-level global var VAR1, to be modified by my script. Reading it is relatively easy:

from mod3 import *
print VAR1

Writing is a bit tricky. I’m still looking for best practices.

Solution 1: mod3 to expose a setter setVAR1(value)

Solution 2:
import mod3
mod3.VAR1 = ‘new_value’

Note “from mod3 import * ” doesn’t propagate the new value back to the module. See example below.

::::::::::::::
globalrw.py
::::::::::::::
#!/usr/bin/python -u
from mod3 import *

def main():
  mod3func()
  ''' Line below is required to propagate new value back to mod3
      Also note the semicolon -- to put two statements on one line '''
  import mod3; mod3.VAR1 = 'new value'
  mod3func()
main()
::::::::::::::
mod3.py
::::::::::::::
VAR1='initial value'
def mod3func():
  print 'VAR1 =', VAR1

## innovative features of python

Here’s my answer to a friend’s question “what innovative features do you see in python”

  • * decorators. Very powerful. Perhaps somewhat similar to AOP. Python probably borrowed it from Haskell?
  • * dynamic method/attribute lookup. Somewhat similar to C# “dynamic” keyword. Dangerous technique similar to java reflection.
  • * richer introspection than c# (which is richer than java)
  • * richer metaprogramming support (including decorator and introspection) … Vague answer!
  • * enhanced for-loop for a file, a string,
  • * listcomp and genexpr
  • * Mixin?
  • I wrote a code gen to enrich existing modules before importing them. I relied on hooks in the importation machinery.

RESTful – phrase book

agile – …

coupling – less coupling between client and server, so server side changes are easier. I think SOAP requires client rebuild.

b2b – still dominated by SOAP

resource-oriented services – …

object – each URL logically represents an object, which you can Get (query), POST (create), PUT (update its content) or DELETE

Q: is REST simpler than traditional SOA or SOAP web services?

Q: the concept of “resource” — does it sometimes mean a database record?

reach next level ] tech IV performance

See also the similar pripri post about “beat us”.

I really don’t care that much about zbs needed in a project:
* the so-called architecture expertise is overrated and overhyped. See my post on
** GTD
** software architect vs enterprise architect
* the optimization expertise is … similar

Next level requires .. more knowledge. See post on “Yaakov”
Next level requires .. more practice with Facebook algo questions.
Next level requires .. IDE coding against simple problems like codility.
Next level requires .. more mileage with hands-on dev.
Next level requires .. more “best practices” like google style guide

fwd contract often has negative value, briefly

An option “paper” is a right but not an obligation, so its holder has no obligation, so this paper is always worth a non-negative value.

if the option holder forgets it, she could get automatically exercised or receive the cash-settlement income. No one would go after her.

In contrast, an obligation requires you to fulfill your duty.

A fwd contract to buy some asset (say oil) is an obligation, so the pre-maturity value can be negative or positive. Example – a contract to “buy oil at $3333” but now the price is below $50. Who wants this obligation? This paper is a liability not an asset, so its value is negative.

openssh^putty key files

openssh key format is the one used in id_rsa / id_rsa.pub.

  • Proven tip: I actually took the pair of id_rsa files and used them in any windows or linux account. It’s like a single fingerprint used everywhere. The key is not tied to any machine.
    • The authorized_keys file in ALL destination machines have the SAME line containing that public key.
  • Proven tip: To support ssh support@rtppeslo2 with ssh key, I copy a standard authorized_keys to rtppeslo2:~support/.ssh. See https://superuser.com/questions/283216/ssh-key-authentication-with-another-user/

transformation — Puttygen can read the same “fingerprint” to generate ppk files, in putty’s format. Then a putty session can use the ppk file to auto-login, in place of a password

Note when my remote_host home dir permission was too open, then the ssh key was ignored by sshd. I had to enter password.

Hull: estimate default probability from bond prices

label: credit

The arithmetic on P524-525 could be expanded into a 5-pager if we were to explain to people with high-school math background…

 

There are 2 parts to the math. Part A computes the “expected” (probabilistic) loss from default to be $8.75 for a notional/face value of $100. Part B computes the same (via another route) to be $288.48Q. Equating the 2 parts gives Q =3.03%.

 

Q3: How is the 7% yield used? Where in which part?

 

Q4: why assume defaults happen right before coupon date?

%%A: borrower would not declare “in 2 days I will fail to pay the coupon” because it may receive help in the 11th hour.

 

–The continuous discounting in Table 23.3 is confusing

Q: Hull explained how the 3.5Y row in Table 23.3 is computed. Why discount to  the T=3.5Y and not discounting to T=0Y ?

 

The “risk-free value” (Column 4) has a confusing meaning. Hull mentioned earlier a “similar risk-free bond” (a TBond). At 3.5Y mark, we know this risk-free bond is scheduled to pay all cash flows at future times T=3.5Y, 4Y, 4.5Y, 5Y. We use risk-free rate 5% to discount all cash flows to T=3.5Y. We get $104.34 as the “value of the TBond cash flows discounted to T=3.5Y”

 

Column 5 builds on it giving the “loss due to a 3.5Y default, but discounted to T=3.5Y”. This value is further discounted from 3.5Y to T=0Y – Column 6.

Part B computes a PV relative to the TBond’s value. Actually Part A is also relative to the TBond’s value.

 

In the model of Part B, there are 5 coin flips occurring at T=0.5Y   1.5  2.5  3.5  4.5 with Pr(default_0.5) = Pr(default_1.5) = … = Pr(default_4.5) = Q. Concretely, imagine that Pr(flip = Tail) is 25%. Now Law of total prob states

 

100% = Pr(05) + Pr(15) + Pr(25) + Pr(35) + Pr(45) + Pr(no default). If we factor in the amount of loss at each flip we get

 

Pr(05) * $65.08 + Pr(15) * $61.20 + Pr(25) * $57.52 + Pr(35) * $54.01 + Pr(45) * $50.67 + Pr(no default, no loss) + $0 == $288.48Q

##c++(GTD+)learning-aids to upgrade 6→7

Which yardstick? First must be IV. 2nd would be common (not the obscure) GTD skills
For IV, need to analyze how I was beaten in past interviews. For GTD zbs, a few major home projects has significantly increased my proficiency.
  • see recoll -> c++Q&A.txt
  • [ZZ] try out debug tools in the gdb book? LG
  • [ZZ] experiment with sample code in [[fin instrument pricing using c++]]
  • [ZZ] proj: valgrind (linux) – get hands-on experience. Might take too much time to get it working
  • problem – test if a linked list has cycles
  • problem: all permutations of 0 or more of abcde
  • problem: write skeleton c++ code for ref counted string or auto_ptr
  • problem: test if a given container is in ascending order
  • [ZZ means … see https://bintanvictor.wordpress.com/2017/02/14/low-leverage-ctopics/]

mv-semantic: keywords

I feel all the tutorials seem to miss some important details and selling a propaganda. Maybe [[c++ recipes]] is better?

[s = I believe std::string is a good illustration of this keyword]

  • [s] allocation – mv-semantic efficiently avoids memory allocation on heap or on stack
  • [s] resource — is usually allocated on heap and accessed via a pointer field
  • [s] pointer field – every tutorial shows a class with a pointer field. Note a reference field is much less common.
  • [s] deep-copy – is traditional. Mv-semantics uses some special form of shallow-copy. Has to be carefully managed.
  • [s] temp – the RHS of mv-semantic must strictly be a temp object. I believe by using the move() function and the r-val reference (RVR) we promise to the compiler not to access the temp object afterwards. If we access it, i guess bad things could happen. Similar to UndefBehv? See [[c++standard library]]
  • promise – see above
  • containers – All standard STL container classes (including std::string) provide mv-semantics. Here, the entire container instance is the payload! Inserting a float into a container won’t need mv-semantics.
  • [s] expensive — allocation and copying assumed expensive. If not expensive, then the move is not worthwhile.
  • [s] robbed — the source object of the move is crippled, robbed, abandoned and should not be used afterwards. Its “resource” is already stolen, so the pointer field to that resource should be set to NULL.

——–
http://www.boost.org/doc/libs/1_59_0/doc/html/move/implementing_movable_classes.html says “Many aspects of move semantics can be emulated for compilers not supporting rvalue references and Boost.Move offers tools for that purpose.” I think this sheds light…

c++ parametrized functor – more learning notes

Parametrized Functor (class template) is a standard, recommended construct in c++, with no counterpart in java. C# delegae is conceptually simpler but internally more complex IMO, and represents a big upgrade from c++ functor. Better get some clarity with functor before comparing with delegates.

The most common functor is the simple stateless functor (like a simple lambda). The 2nd common category is the (stateful but) immutable functor. In all cases, the functor is designed for pass-by-value (not by ref or by pointer), cheap to copy, cheap to construct. I see many textbook code samples creating throwaway functor instances.

Example of the 2nd category – P127[[essentialC++]].

A) One mental block is using a simple functor Class as a template type param. This is different from java/c#

B) One mental block is a simple parametrized functor i.e. class template. C++ parametrized functor can take various type params, more “promiscuous” than c#/java.

C) A bigger mental block, combining A+B, is a functor parametrized with another functor Class as a template param. See P186[[EssentialC++]].

This is presented as a simple construct, with about 20 lines of code, but the more I look at it, the more strange it feels. I think this is somwehwat powerful but unpopular and unfamiliar to mainstream developers.

Functional programming in c++?

In java, we write a custom comparitor class rather than a comparitor class Template. We also have the simpler alternative of a Comparable interface, but that’s not very relevant in this discussion. Java 8 lambda — http://www.dreamsyssoft.com/java-8-lambda-tutorial/comparator-tutorial.php

std::vector memory allocation/free: always heap@@

https://stackoverflow.com/questions/8036474/when-vectors-are-allocated-do-they-use-memory-on-the-heap-or-the-stack is concise.

  • The vector “shell” can be on stack or heap or static memory, as you wish.
  • The continuous array (payload objects) are always allocated on the heap, so that the vector can grow or deallocate them.

Deallocation is explained in https://stackoverflow.com/questions/13944886/is-stdvector-memory-freed-upon-a-clear

coblood – each year you worked

master -> jobjob

 

Each year that you spent on some job,

 

you earn, most important of all, some cash to keep the family financially safe.

** you also earn something to plow back. Look at my UChicago experience…

you earn some experience, insight, and hopefully some zbs, but most of it will not be relevant in future jobs

you earn/build some track record that helps maintain marketability.

you either speed up or slow down brain aging

you incur stress

you incur sacrifice of family time and exercise time

 

c++IV: importance: knowledge imt dev xp

1) Many hard-core tech interviewers (Yaakov, Jump, 3Arrows, Bbg, nQuants …) often asked me to explain a language feature, then drill in to see if I really do understand the key ideas, including the rationale, motivation and history. This knowledge is believe to … /separate the wheat from the chaff/

This knowledge can’t be acquired simply by coding. In fact, a productive coder often lacks such knowledge since it’s usually unnecessary theoretical knowledge.

2) West Coast always drills in on algo (+ data structure). No way to pick up this skill in projects…

1+2 —> many interviewers truly believe a deep thinker will always learn faster and design better.

typedef as prefix: syntax demystified

I find the typedef syntax not as simple as it appeared. http://stackoverflow.com/questions/3783016/fundamental-typedef-operand-syntax is the best explanation of the very confusing typedef syntax.

        int x; // declares a variable named ‘x’ of type ‘int’
typedef int x; // declares a type     named ‘x’ that is ‘int’
typedef char Decimal[20];
        int(*F)(size_t); // declares a variable named F of type ‘int(*)(size_t)’
typedef int(*F)(size_t); // declares a type     named F that is ‘int(*)(size_t)’

If you add the “typedef” prefix, instead of declaring a VARIABLE, you’ve declared a new TYPE-ALIAS instead.

rope-cliff puzzle #use stone

You are on the top of a 200m cliff. You have a 150m long rope and a knife. You can only tie your rope where you stand, or 100m below on a tree. Can you reach the bottom of the cliff alive, and how? Any Free fall is considered deadly.

—-Well-known solution, using a loop
Suppose the cliff is point C and the tree is T and the midpoint is point M.

Cut (1st and last cut) 100M of rope and put into pocket. Tie the other 50M at C then descend to M. Convert the end of this rope into a loop at the point M. Lubricate the loop then pass the 100M rope through, to make a 50+50 double rope. Descend to T then retrieve the entire 100M.

—– my solution that can go
For a loop at top of cliff. Tie a stone (at least fist-sized) at end A of the rope and run end B of the rope through the loop until the stone blocks the passage. Now hold end B and descend 100m to the tree. Now let go of end B so the stone pulls the entire rope down through the loop and falls into your hand.

## relatively innovative features of python

I’m relatively familiar with perl, java, c++, c# and php, though some of them I didn’t use for a long time.

IMO, these python features are kind of unique, though other unknown languages may offer them.

* decorators
* list comp
* more hooks to hook into object creation. More flexible and richer controls
* methods and fields are both class attributes. I think fundamentally they are treated similarly.

ccy swap^ IRS^ outright FX fwd

currency swap vs IRS
– Similar – exchange interest payments
– diff — Ccy swap requires a final exchange of principal. FX rate is set up front on deal date

currency swap vs outright fx fwd?
– diff — outright involves no interest payments
– similar — the far-date principal exchange has an FX rate. Rate is set on deal date
– diff — the negotiated rate is the spot rate on deal date for ccy swap, but in the outright deal, it is the fwd rate.

currency swap vs FX swap? Less comparable. Quite a confusing comparison.
– FX swap is more common more popular

physical, intuitive feel for matrix shape – fingers on keyboards

Most matrices I have seen so far in real world (not that many actually) are either
– square matrices or
– column/row vectors

However it is good to develop a really quick and intuitive feel for matrix shape. When you are told there’s some mystical 3×2 matrix i.e. 3-row 2-column.
– imagine a rectangle box
– on its left imagine a vertical keyboard
– on it put Left fingers, curling. 3 fingers only.
– Next, imagine a horizontal keyboard below (or above, if comfortable) the rectangle.
– put Right fingers there. 2 fingers only

For me, this gives a physical feel for the matrix size. Now let’s try it on a column matrix of 11. The LHS vertical keyboard is long – 11 fingers. Bottom keyboard is very short — 1 finger only. So it’s 11×1

The goal is to connect the 3×2 (abstract) notation to the visual layout. To achieve that,
– I connect the notation to — the hand gesture, then to — the visual. Conversely,
– I connect the visual to — the hand gesture, then to — the notation
————————
Now consider matrix multiplication. Consider a 11×2. Note a 11×1 columnar matrix is more common, but it’s harmless to get a more general feel.

An 11x2 * 2x9 gives a 11x9.

Finger-wise, the left 11 fingers on the LHS matrix stay glued; and the right 9 fingers on the RHS matrix stay glued. In other words,

The left hand fingers on the LHS matrix remain.
The right hand fingers on the RHS matrix remain.

Consider a 11×1 columnar matrix X, which is more common.

X * X’ is like what we just showed — 11×11 matrix matrix
X’ * X is 1×11 multiplying 11×1 — 1×1 tiny matrix

IRS motivations – a few tips

See also – Trac Consultancy course handout includes many practical applications of IRS.
see also — There’s a better summary and scenarios in the blog on IRS dealers

I feel IR swap is flexible and “joker card” in a suite — with transformation power.

Company B (Borrower aka Issuer) wants to borrow. Traditional solution is a bond issue or unfortunately …. a bank loan (most expensive of all), either fixed or floating rate. A relatively new Alternative is an IRS.

Note bank loan is the most expensive alternative (in terms of capital charge, balance sheet impact …), so if possible you avoid it. Mostly small companies with no choice take bank loans.

Motivation 1  relative funding advantage
Motivation 2 for company B – reduce cost of borrowing fixed
Motivation 3 for Company B – betting on Libor.
* If B bets on Libor to _rise, B would “buy” the Libor income stream of {12 semi-annual payments}, at a fixed (par) swap rate (like 3.5%) agreed now, which is seen as a dirt cheap price. Next month, the par swap rate may rise (to 3.52%) for the same income stream, so B is lucky to have bought it at 3.5%.
* If B bets on Libor to _drop, B would “sell” (paying) the Libor income stream

Motivation 4 to cater to different borrowing preferences. Say Company C is paying a fixed 5% interest, but believes Libor will fall. C wants to pay floating. C can swap with company A so as to pay libor. C will end up paying floating interest to A and receive 5.2% from A to offset the original 5% cost.

Why would A want to do this? I guess A could be a bank.

##basic steps in vanilla IRS valuation, again

* First build a yield curve using all available live rates. This “family photo” alone should be enough to evaluate any IRS
* Then write down all (eg 20) reset dates aka fixing date.
* Take first reset date and use the yield curve to infer the forward 3M Libor rate for that date.
* Find the difference between that fwd Libor rate and the contractual fixed rate (negotiated on this IRS contract). Could be +/-
* Compute the net cashflow to occur on that fixing/reset date.
* Discount that cashflow to PV. The discounting curve could be OIS or Libor based.
* Write down that amount.

Repeat for the next reset date, until we have an amount for each reset date. Notice all 20 numbers are inferred from the same “family photo”. Tomorrow under a new family photo, we will recalc/reval all 20 numbers.

Add up the 20 amounts to get the net PV in that position. Since the initial value of the position is $0, this net value is also the PnL.

c# DYNAMIC – TryCall_method1 on an Unidentified object#src

With dynamic type, we can try-call an arbitrary method m1(), which need not exist in any public interface. Behaviour — If the underlying type exposes m1, it is invoked. If unavailable you get a run time error, not a compile time error.

One usage idea — Instead of calling the standard ToString() on the unknown object, we want to try-call ToStringXml(), and pray the underlying type supports it. Traditionally, If we have a few types to add ToStringXml(), we would prefer compile-time type check by introducing an interface IXmlDumper. However, if you have 55 types that could pass into TryCall() and 33 of them exposes ToStringXml(), you may need to cast to IXmlDumper then call ToStringXml(). Worse, you would need to edit source code for the 33 types!

With dynamnic type, basically the compile-time type check is postponed to run time. Compile time type check would need m1() declared in the type C, like —

C c = new C(); c.m1();

But if we use dynamic, and type-check at run time, then we don’t need m1() exposed on any public interface —

invoking a delegate instance – explict Invoke()

Most c# folks take the shortcut by writing “this.myDelegateInstance();”. But this sugar-coated syntax masks off the distinction between a method and a delegate. C# syntax sugar coating trivializes this distinction because c# “wants” you to treat delegate instances as if it’s a method defined nearby. However, sometimes I want the distinction highlighted.

When the morphing hurts (in high-complexity codebase), I prefer the long-winded syntax myDelegateInstance.Invoke()

This kinda small tweaks can add up to a more readable codebase.

c# DYNAMIC expando object (ExOb) – phrasebook

dynamic – better assign the exob instance to a dynamic variable. If you use a ExpandoObject variable, you lose a lot of features but i won’t elaborate …

javascript/python/perl – in these languages, a class instance is conceptually(?) and physically(?) a dictionary of name-value pairs. Exob is somewhat similar.

name-value pairs — In an exob instance, a field is (obviously) name-value pair. Ditto for a runtime-created method. The “value” is basically a (possibly void) lambda expression.

Emit — is the traditional technique to create a brand new class at run time and “emit” IL code into it. Exob is newer and simpler. I guess it’s more practical.

one event delegating to another event – WPF

http://msdn.microsoft.com/en-us/magazine/dd419663.aspx#id0090030 shows an interesting mini “pattern” of one event (“event-pseudo-field”) delegating all its add/remove logic to another event (pseudo-field).

public event EventHandler CanExecuteChanged{
  add { CommandManager.RequerySuggested += value; }
  remove { CommandManager.RequerySuggested -= value; }
}

This reveals the true meaning of an event member. If an object studentA (of class Student) has an event pseudo field evt2, this field is basically a chain of callbacks (or notification targets, or event handlers). The evt2 field is nothing but a handle to add/remove on this chain.

In many cases, the callbacks run sequentially on the same thread as the event-firing code.

See also http://bigblog.tanbin.com/2010/06/field-like-event-syntax-restrictions.html

managed ^ unmanaged (dotnet)- what to Manage@@

A popular IV question — exactly what runtime Services are provided by the dotnet Hosting runtime/VM to the Hosted/managed code? In other words, how is the “managed code” managed?

Re http://bigblog.tanbin.com/2011/09/what-is-kernel-space-vs-userland.html, I believe an executing thread often has it’s lowest level stack frames in kernel mode, middle frames in the VM, and top frames running end-user application code. The managed code is like a lawyer working out of a hotel room. The hotel provides her many business services. Now, host environments have always provided essential runtime services to hosted applications, since the very early days of computers. The ubiquitous runtime host environment is the OS. In fact, the standard name of an OS instance is a “hostname”. If you have 3 operating systems running on the same box sharing the CPU/RAM/disk/network then there are 3 hosts — i.e. 3 distinct Hosting-Environments. In the same tradition, the dotnet/java VM also provides some runtime services to hosted applications. Hotel needs “metadata” about the data types, so a dotnet assembly always include type metadata in additional to IL code.

Below is a dotnet-centric answer to the IV question. (JVM? probably similar.) For each, we could drill down if needed (usually unneeded in enterprise apps).

– (uncaught) exception handling. See [[Illustrated c#]] (but how different from unmanaged c++?)
– class loading. See [[Illustrated c#]]
– security
– thread management? But I believe unmanaged code can also get userland threads manufactured by the  (unmanaged) thread library.
– reflection.See [[Illustrated c#]]
– instrumentation? Remember jconsole
– easier debugging – no PhD required. Unmanaged code offers “limited” debugging??
– cross-language integration?
– memory management?
** garbage collection. This service is so prominent/important it’s often listed on its own, but I consider this part of mem  mgmt.
** memory request/allocation? A ansi-C app uses memory management library to grab memory wholesale from kernel (see other posts) and a VM probably takes over that task.
** translate a C# heapy object reference to a “real virtual” heap address. Both JVM and .Net collectors have to move (non-pinned[1]) objects from one real-virtual address to another real-virthal address. Note the OS (the so-called Paging supervisor) translates this real-virtual address into physical RAM address.** appDomain. VM isolates each appdomain from other appdomains within a single Process and prevents memory cross-access. See http://msdn.microsoft.com/en-us/library/ms173138(v=vs.80).aspx

[1] pinned objects are not relocatable.

multicast – highly efficient@@ # my take

(Note virtually all MC apps use UDP.)
To understand MC efficency, we must compare with UC (unicast) and BC (broadcast). First we need some “codified” metrics —
  • TT = imposing extra Traffic on network, which happens when the same packet is sent multiple times through the same network.
  • RR = imposing extra processing workload on the Receiver host, because the packet is addressed TO “me” (pretending to be a receiver). If “my” address were not mentioned in the packet, then I would have ignored it without processing.
  • SS = imposing extra processing workload by the Sender — a relatively low priority.
Now we can contrast MC, UC and BC. Suppose there are 3 receiver hosts to be notified, and 97 other hosts to leave alone, and suppose you send the message via —
  1. UC – TT not RR — sender dispatches 3 copies each addressed to a single host.
  2. BC – RR not TT — every host on the network sees a packet addressed to it though most would process then ignore it, wasting receiver’s time. When CEO sends an announcement email, everyone is in the recipient list.
  3. MC – not RR not TT. However, MC can still flood the network.

premium adjusted delta – basic illustration

http://www.columbia.edu/~mh2078/FX_Quanto.pdf says “When computing your delta it is important to know what currency was used to pay the premium. Returning to the stock analogy, suppose you paid for an IBM call option in IBM stock that you borrowed in the stock-lending market. Then I would inherit a long delta position from the option and a short delta position position from the premium payment in stocks. My overall net delta position will still be long (why?), but less long than it would have been if I had paid for it in dollars.”

Suppose we bought an ATM call, so the option position itself gives us +50 delta and let us “control” 100 shares. Suppose premium costs 8 IBM shares (leverage of 12.5). Net delta would be 50-8=42. Our effective exposure is 42%

The long call gives us positive delta (or “positive exposure”) of 50 shares as underlier moves. However, the short stock position reduces that positive delta by 8 shares, so our portfolio is now slightly “less exposed” to IBM fluctuations.

2nd scenario. Say VOD ATM call costs 44 VOD shares. Net delta = 50 – 44 = 6. As underlier moves, we are pretty much insulated — only 6% exposure. Premium-adjusted delta is significantly reduced after the adjustment.

You may wonder why 2nd scenario’s ATM premium is so high. I guess
* either TTL(i.e. expiration) is too far,
* or implied vol is too high,
* or bid ask spread is too big, perhaps due to market domination/manipulation

quiet confidence on go-live day

I used to feel “Let’s pray no bug is found in my code on go-live day. I didn’t check all the null pointers…”

I feel it’s all about …blame, even if manager make it a point to to avoid blame.

Case: I once had a timebomb bug in my code. All tests passed but production system failed on the “scheduled” date. UAT guys are not to blame.

Case: Suppose you used hardcoding to pass UAT. If things break on go-live, you bear the brunt of the blame.

Case: if a legitimate use case is mishandled on go-live day, then
* UAT guys are at fault, including the business users who signed off. Often the business come up with the test cases. The blame question is “why this use case isn’t specified”?
* Perhaps a more robust exception framework would have caught such a failure gracefully, but often the developer doesn’t bear the brunt of the blame.
**** I now feel business reality discounts code quality in terms of airtight error-proof
**** I now feel business reality discounts automated testing for Implementation Imperfections (II). See http://bigblog.tanbin.com/2011/02/financial-app-testing-biz.html

Now I feel if you did a thorough and realistic UAT, then you should have quiet confidence on go-live day. Live data should be no “surprise” to you.

java visitor pattern – overloading vs overriding, again

In standard visitor, you have a hierarchy of visitable data classes (eg Product and 30 subclasses) + a bunch of visitor classes. In
this case, i have just one visitor class, which defines overloaded methods
Visitor.visit(Option)
Visitor.visit(Bond)
Visitor.visit(Futures) etc

Now, I am given a random Product object myProduct. It might be an Option, a Bond, a Futures… but given the multitude of Product
subtypes, it’s too tedious to type-test.

I already have a myVisitor object, so I try calling myVisitor.visit(myProduct). I want it bound to the correct visit() among the
overloads, but guess what — this won’t compile because … (hold your breath)…. there’s no Visitor.visit(Product) defined.

Reason — overloaded method call is resolved at compile time based on declared type of myProduct

Solution — define
abstract Product.accept(Visitor)
Bond.accept(Visitor){v.visit(this);}
Option.accept(Visitor v) {v.visit(this);}

Now, myProduct.accept(myVisitor) would bind the method call to the correct visit(). Why? Well, accept() is resolved at runtime and
binds to Bond.accept() assuming myProduct is a Bond. Inside Bond.accept(), visit(this) is always visit(Bond) — resolved at compile
time like all Overloaded methods.

In conclusion, Visitable.accept(Visitor) is resolved dynamically and Visitor.visit(Visitable) is resolved statically.

What if I create a FriendlyVisitor type? Is accept() dynamic-and-static?

double meanings (c#): new, using, yield

In C#, a number of keywords are given 2 unrelated meanings.

using — typedef
using — name space import
——– unrelated to ———–
using — resource management (IDisposable)
=====================================
Yield() — static method Thread.Yield()
——– unrelated to ———–
yield — iterator
=====================================
new MyClass() — on class types
——-
new MyStruct() — on struct types
** does NOT necessarily hit heap; See P55 [[c#preciely]]
** doesn’t return pointer;
** entirely optional. Only needed if you want to call the ctor of the struct.
**** without new, a value type object is created uninitialized when you declare the variable.
See http://stackoverflow.com/questions/7767669/why-is-it-possible-to-instantiate-a-struct-without-the-new-keyword
——– unrelated to ———–
new — redefine/hide a field, method, property etc
=====================================
Not a keyword, but 2D-array in c# can be either jagged or a matrix

reliably convert Any c++ source to C : IV

More than one person asked me “Can you compile a c++ app if you only have a c compiler?”

Some of the foremost experts on c/c++ compiler said on http://www.edg.com/faq/convert —

If you mean “can you convert C++ source to C source, and run the C through a C compiler to get object code“, as a way to run C++ code on a system that has only a C compiler, yes it is possible to implement all of the features of ISO standard C++ by translation to C source code, and except for exception handling this produces object code with efficiency comparable to that of the code generated by a conventional compiler.

For exception handling, it is possible to do an implementation using setjmp/longjmp that is completely conformant, but the code generated will be 5-20% slower than code generated by a true c++ compiler.

boost intrusive smart ptr phrase book #no IV

* MI — P35 [[beyond c++ standard lib]] shows your pointee CLASS can be written to multiple-inherit from a general-purpose ref_count holder class. This is a good use case of multiple inheritance, perhaps in Barclays QA codebase.

* real-estate — The 32-bit ref counter lives physically on the real estate of the pointee. Pointee type can’t be a builtin like “float”. In contrast, a club of shared_ptr instances share a single “club-count” that’s allocated outside any shared_ptr instance.

* legacy — many legacy smart pointer classes were written with a ref count in the pointee CLASS like QA YieldCurve. As a replacement for the legacy smart pointer, intrusive_ptr is more natural than shared_ptr.

* builtin — pointee class should not be a builtin type like float or int. They don’t embed ref count on their real estate; They can’t inherit; …

* TR1 — not in TR1 (http://en.cppreference.com/w/cpp/memory), but popular

* ref-count — provided by the raw pointee Instance, not the smart pointer Instance. Note the Raw pointER Instance is always 32 bit (assuming 32-bit bus) and can never host the reference count.

* same-size — as a raw ptr

* expose — The pointee class must expose mutator methods on the ref count field

default-initialize^value-initialize — new-expressions

Q: difference between new Dog and new Dog() with the parentheses?

http://www.cplusplus.com/forum/general/37962/ answers clearly
The first constructor – naked – provides what is called default initialization (unrelated to default ctor). Default initialization leaves the values of fundamental types (int, double, etc) uninitialized, i.e. arbitrary as in the graveyard. Therefore, with new Dog you get uninitialized chunk of memory with arbitrary values in the fields.

The second constructor – with parenthesis – provides what is called value initialization. Value initialization zeros fundamental types. Therefore, with new-Demo() you get zero-initialized chunk of memory – all fields in this case will be zeros.

–C++Primer P407 made it even clearer that default-initialize on the heap basically leaves the bits (of the real estate carved out of a graveyard) as they were left behind by the dead/freed/bulldozed corpses.

http://stackoverflow.com/questions/620137/do-the-parentheses-after-the-type-name-make-a-difference-with-new/620402#620402 has good illustrations of 3 categories of classes and shows the behavior of new-Dog vs new-Dog(). Another contributor pointed out that the difference is usually negligible. “If there is such a constructor it will be used. For 99.99% of sensibly designed classes there will be such a constructor, and so the issues can be ignored.” However in practice, how many classes out of 100 are sensibly designed?

— to initialize the field (of type T) of a class template, 
P687 [[absoluteC++]] says we can skip the init since there’s no good default value for the unknown type T.

Stanley Lippman on P246 [[essentialC++]] says we could use T() even if T happens to be int.

##[12] bottlenecks in a high performance data "flow" #abinitio

Bottlenecks:

#1 probably most common — database, both read and write operations. Therefore, ETL solutions achieve superior throughput by taking data processing out of database. ETL uses DB mostly as dumb storage.

  • write – if a database data-sink capacity is too slow, then entire pipe is limited by its throughput, just like sewage.
    • relevant in mkt data and high frequency trading, where every execution must be recorded
  • read – if you must query a DB to enrich or lookup something, this read can be much slower than other parts of the pipe.

#2 (similarly) flat files. Write tends to be faster than database write. (Read is a completely different story.)
* used in high frequency trading
* used in high volume market data storage — Sigma2 for example. So flat file writing is important in industry.
* IDS uses in-memory database + some kind of flat file write-behind for persistence.

#? Web service

#? The above are IO-bound. In contrast, CPU-bound compute-intensive transform can (and do) also become bottlenecks.

vega roll-up makes no sense #my take

We know dv01, duration, delta (and probably gamma) … can roll up across positions as weighted average. I think theta too, but how about vega?

Specifically, suppose you have option positions on SPX at different strikes and maturities. Can we compute weighted average of vega? If we simulate a 100bps change in sigma_i (implied vol), from 20% pa to 21% pa, can we estimate net change to portfolio MV?

I doubt it. I feel a 100 bps change in the ATM 1-month option will not happen in tandem with a 100 bps change across the vol surface.

– Along the time-dimension, the long-tenor options will have much __lower__ vol changes.
– Along the strikes, the snapshot vol smile curve already exhibit a significant skew. It’s unrealistic to imagine a uniform 100 bps shift of the entire smile (though many computer system still simulates such a parallel shift.)

Therefore, we can’t simulate a 100 bps bump to sigma_i across a portfolio of options and compute a portfolio MV change. Therefore vega roll-up can’t be computed this way.

What CAN we do then? I guess we might bucket our positions by tenor and aggregate vega. Imperfect but slightly better.

stop-the-world inevitable: either minor or major JGC

Across All of Sun’s GC engines so far (2012), young generation (eden + survivors) algorithm has _always_ been STW. The algo changed from single-threaded to parallel, but remains STW. Therefore, during a minor GC ALL application threads are suspended. Usually a short pause.

Across All of Sun’s GC engines so far (2012), oldgen is at best low-pause but _never_ no-pause. For the oldgen, there is _always_ some pause, due to an inevitable STW phase.

Note one of the oldgen algorithms (i.e. CMS) is mostly-concurrent, meaning the (non-concurrent) STW phase is a brief phase — low-pause. However, young gen algorithm has always been STW throughout, without any concurrent phase.

PnL roll-up inside Coherence@@

Hi friends,

In my previous project, the cache had to support PnL aggregation by account or by product type (US aviation stock or high-yield bonds etc).  I didn’t find out how it was implemented. How is it done in a typical Coherence system?

In fact, the accounts exist in a hierarchy. For example, individual accounts belong to an office. Each office belongs to a region. Each region belongs to a legal entity…. Similarly, the products are organized in a hierarchy too. PnL is first computed at position level. How are these individual position PnL aggregated in Coherence?

Answer 1 — http://coherence.oracle.com/display/COH35UG/Processor+Example is the recommended solution by some practitioners. Basically, api-users write 2 classes — a filter and a processor. Filter controls the subset of data entries or you may say the “universe”. Processor does the aggregation. Internally, I was told an aggregation loop is unavoidable.

I feel coherence data entries are organized into maps along with auxiliary lookup tables to support predicate-assisted select queries like “where account.country = UK”.

Answer 2 — http://docs.oracle.com/cd/E18686_01/coh.37/e18677/toc.htm also mentions an EntryAggregator.

##hidden cost,stolen CPU cycles: latency engineering

Latency /engineering/ and optimization is all about the implicit operations, hidden costs and “stolen” cpu cycles. Incur the minimum CPU costs for a task.

  • eg (practical!): function A calling B, calling C is one more stack frame than A-calling-C
  • eg: boxing/unboxing — extra work for cpu. Also creates garbage for the GC.
  • eg: one thread switching between multiple sockets — more cpu workload than one thread dedicated to one (exchange) socket
  • eg: un-contended lock acquisition — more work than no-lock, partly due to memory fence
  • eg: garbage collection – competes for CPU at a bad time. Usually, if there’s no OOM, then the GC thread is very low priority and won’t slow down a critical task.
  • eg: page swap as part of virtual memory systems — competes for CPU
  • eg: vtbl lookup — adds a few clock cycles per function call. To be avoided inside the most critical apps in an exchange. Therefore c developers favor templates than virtuals
  • eg: RTTI — latency sensitive apps generally disable RTTI early on — during compilation
  • eg: null terminator at end of c-strings — adds network traffic by x% unnecessarily
  • eg: inline – trims cpu cycles.
  • eg: one kernel thread mapping to multiple user threads — fastest system should create no more user threads than the maximum cpu (or kernel) threads, so the thread scheduler doesn’t need to run at all. I feel this is possible only in a dedicated machine, but such a machine must then communicate with peripheral machines, involving serialization and network latency.

For a dedicated kernel thread to service a busy stream of tasks, we need to consider what if the tasks come in bursts so the thread becomes temporarily idle. One idea is to suspend the thread in wait() but i believe kernel thread can’t be suspended. In the kernel, a common approach is to simply keep the thread in spinlock. Assumption is, one cpu is exclusively dedicated to this thread, so this cpu can’t do anything else even if we suspend this thread.

##common c++ run time errors

In java, the undisputed #1 common run time error is NPE, so much so that half of all error checks are null-pointer-checks. In c++, divide-by-zero is similarly a must. But there are more…

– divide by zero
– pointer-move beyond bounds — remember all array sizes are compile-time constants, even with realloc()
– read/write a heap object (heapy thingy) via a reference variable after de-allocation
– return by reference an object allocated on the stack — bulldozed
– dereference (unwrap) a dangling pointer
– c-style cast failure
– double-free
– dereference (unwrap) a null pointer — undefined behavior, unlike java NPE. We should always check null before dereferencing.

—-less common
– delete a pointer to non-heap
– free a pointer to non-heap
– hold pointer/reference to a field of an object, not knowing it’s soon to be reclaimed/bulldozed. Object could be on heap or stack. More likely in multi-threaded programs.
– misusing delete/delete[] on a pointer created with new/new[]. The variable always looks the same — plain pointers.

————-
For any of the above, if it happens in a dtor, you are in double trouble, because the dtor could be executing due to another run time error.

The authoritative [[essential c++]] says that for every exception, there must be a “throw” you can find. Divide-by-zero is something directly done on “hardware” so no chance to throw. I feel many error conditions in C are treated same as in C, without “throw”. In contrast,

  • I feel operator-new is a typical “managed” low level operation that can (therefore does) use exception.
  • dynamic_cast() is anther “managed” low level operation added over C, so it does throw in some cases

Now, JVM “wraps” up all these runtime error Conditions into exceptions. Java creators generally prefer to push these error conditions to the compilation phase, to reduce the variety of Runtime errors. What remain are wrapped up as various RuntimeExceptions. It’s rather remarkable that JVM let’s you catch  all(?) of these exceptions. No undefined behavior.

##some c++REAL productivity skills seldom covered by IV

pre-processor tricks
debugger — graphical or command line
remote debugging
truss
(quick) navigating manpages
gcc (troubleshooting) command line
make (troubleshooting)
linker (troubleshooting)
core dump analysis — not too tough. Just know the basics
memory leak detection
scripting to automate dev-related processes to minimize mistakes and avoid rerunning something 30 times mannually.
text processing to analyze input/output text files and logs
profilers

——
With debugging c++ or others, I think many times you don’t need in-depth knowledge. Many of these tools have the intelligence to probe the runtime/binary and extract insightful data, often presented in a mountain of ascii text (at least human readable:). You just need to search and search in it.

Often one tool is unable to extract some critical data on one platform but another tool works better. You don’t really need to know in-depth though it often pays to be a bit curious. Just keep trying different tricks until you find a break through.

python features (broad) actually used inside 2 Wall St ibanks

If a python project takes 1 man-year, then the number of python features used would take 2 man-weeks of learning, because python is an easy-learning language. In such a project you probably won’t need fancy features like threading, decorators… Most of the work would be mundane.

After the initial weeks of exciting learning curve, you may feel bored. (c# took me months/years). You may feel your peers are deepening their expertise in c++ or wpf …

The tech quiz would be mostly on syntax.

========= the features =========
? lambda, closure, apply()
Everyday operations on list/dict/tuple, str, file object
** for-loop, conversion, filter(), map(), aggregation, argument passing,
Filesystem, input/output
Basics of module import (no advanced features needed)
command line processing + env vars
create functions –composite argument passing
config files and environment config like PYTHON_PATH

———a bit advanced and less used———
creating classes? fewer than functions, and usually simple classes only
list comprehension, generator expressions
basic platform-specific issues. Mostly unix
inheritance, overriding
threading
spawn new processes
decorators
* Python calling C, not the other way round. In financial analytics, heavy lifting is probably done in C/C++ [1], so what’s doable in python? I guess the domain models (market models, instrument models, contract models …) are probably expressed in python as graph-aware python objects.

[1] I have a book on option pricing using things like Finite Element Method. It seems to use many c++ language features (template etc) but I’m not sure if java or python can meet the requirements.

offload non-essential processing to helper threads

update: https://wiki.sei.cmu.edu/confluence/display/java/LCK09-J.+Do+not+perform+operations+that+can+block+while+holding+a+lock mentions a queue for logging requests.

In many real time trading systems, a novice threading developer may decide to dispatch some non-essential logging[1] logic to a task queue backed by a thread pool, without measuring the resulting latency. I don’t always have time to profile such trivial code, so I am not sure if latency might be better if we simply execute the logging code in the current “request processing” thread.

However — here’s the key point — even if separate threading doesn’t help in this case, it’s still important to know how to use threading in similar contexts.

I’d rather have a developer with threading know-how even though we don’t need it now, than a developer uninitiated to the intricacies of threading.

If you are wondering why the fuss over non-essentials, here’s an example — In some of the busiest, most time-critical trading engines in the exchanges/ECNs, there’s just a single thread sorting/matching quotes for a given symbol. This thread has to be lean and mean. All non-essentials must be delegated. We just mentioned a naive Design #1.

Design #2) Another common technique is messaging. Send a message with all the details, and leave the rest to the receivers. I feel this is slower than threading. Serialization, data copy, network bottleneck, socket handshake. Fastest thread should be network-free and disk-free.

Design #3) Another producer/consumer design uses a pre-allocated (circulalr) array of Requests. Main thread just set a flag to mark a request as executed and move on. Some consumer threads would clean it up and remove it from the array. Typically, we have a single producer and a configurable bunch of consumers, just like a thread pool. If there are enough consumer threads, then the buffer [2] will be almost empty. I feel it’s helpful if the buffer fits into CPU cache.

Note all the designs are producer/consumer.

[1] or other non-essential work such as serialization, persistence, replication, comparison, verification, DB query, or web download. See http://bigblog.tanbin.com/2010/10/gemfire-write-behind-and-gateway-queue.html

[2] You can call it a task queue, but physically it’s really a simple memory buffer, the simpler the faster.

glibc, an archetypical standard library (vs system calls)

Background — I was looking for a concrete example of a standard library.

Best example of a standard library is glibc, produced by GNU team. If you strip the G, it’s “libc” — THE C standard library.

“Standard” means this library is a “carpet” hiding all the platform-specific differences and presents a uniform interface to high-level application programmers — so-called App-Programmer-Interface or “API”.

There are many industry standards for this same purpose, such as POSIX, ANSI-C (which standardizes the C programming language + standard lib). Glibc supports all of these standards.

To clarify a common confusion, it’s worthwhile to understand this simple example — glibc functions (like printf) are implemented in platform-specific underlying syscalls.

Q: Exactly What are the platform differences? Short answer — system calls. Remember System calls call into the “hotel service desk”. These syscalls are tied to the processor and the operating system so they are by definition platform-specific. See other posts on syscall vs standard library.

b4 and af select() syscall

Note select() is usually used on the server-side. It allows a /single/ server thread to handle hundreds of concurrent clients.
— B4 —
open the sockets. Each socket is represented by an integer file descriptor. It can be saved in an int array. (A vector would be better, but in C the array also looks like an int pointer).

FD_SET(socketDes1, &readfds); /* add socketDes1 to the readfds */
——–

select() function argument includes readfds — the list of existing sockets[1]. Select will test each socket.

— After —
check the set of incoming sockets and see which socket is “ready”.

FD_ISSET(socketDes1, &readfds)

If ready, Then you can either read() or recvfrom()

http://www.cs.cmu.edu/afs/cs/academic/class/15441-f01/www/lectures/lecture03.ppt has sample code.

[1] Actually Three independent sets of file descriptors are watched, but for now let’s focus on the first — the incoming sockets

most fundamental data type in c#/c++/java

Notice the inversion —
* C# — Simple types like “int” are based on struct — by aliaing
* C — struct is based on primitives like “int” — by composition
* C++ — classes are based on the C struct

See http://msdn.microsoft.com/en-us/library/s1ax56ch(v=vs.80).aspx. Ignoring enum types for a moment, all c# value types are struct.

Now we are ready to answer the big Question
Q: beside pointers, what’s the most fundamental data type in this language?
A: C# — most fundamental type is the struct. Simple ints, classes and all other types (except enums) are built on top of struct.
A: Java/C++ — most fundamental types are primitives. In C++ all other types are derived using pointer and composition.
A: Once again, java emerges as the cleanest

Now we know why c# doesn’t call int a PRIMITIVE type because it’s not monolithic nor a truly fundamental type.

polymorphic equals() in java/c#@@ NO

A java interviewer quizzed me why my equals() implementation insists on both objects having exactly the same runtime type, and rejecting a derived object. He questioned what if we happen to have

B var1 = new Derived(1,2); B var2 = new B(1);

and the B portion of both objects are identical?

Biggest problem with this type of polymorphic equals() — X === Y === Z doesn’t mean X === Z. Suppose the base sub-object of X/Y/Z are identical. Y is a base object. Therefore Y.equals(X) == true == Y.equals(Z), but X and Z are subtype instances, and different.

I’m a purist with high standard. If my equals() determines 2 objects to be equals, then I “swear” they are indistinguishable and interchangeable carbon copies.

It’s generally safe to be strict writing equals() — insist both objects’ actual types at runtime are identical.

IRS trading in a big IR Swap dealer

A3: each deal is tailor made for a particular client. A deal (or a “trade”) often has more than 100 attributes and a lifecycle of its own.

I spoke with a big US investment bank (Citi?) and a big European bank. IRS is a dealer market — No exchanges; each dealer takes positions. (There’s London Clearing House though). Each dealer bank maintains bid/offer but don’t publish them. Why? See A3. Each new client is signed on by an elaborate on-boarding/account-opening process, through the bank’s dedicated sales team. I guess that’s the signing of ISDA Master Agreement.

Once signed up, a client can send an RFQ with a deal size but without a price and without a Buy/Sell indicator. Bank typically responds with a fully disclosed bid/offer price pair. Client can either hit the bid or lift the offer.

Since IRS trade size is much larger than equities, the daily volume of trades is much smaller.

A dealer bank maintains IRS bid/offer pairs in multiple currencies. But here we are talking about single-currency swap.

There are dealers in cross-currency IRS, but that’s a different market.

java getResourceAsStream() – easist(@@) way to read any file

You can easily read a file in the same dir (esp. in a jar) as your class or anywhere(?) on file system. Methods below are usually conveniently called on the Class object, like

1) simplest usage:
InputStream is = this.getClass().getResourceAsStream(“any file such as a.txt”);
InputStream is = AnyClass.class.getResourceAsStream(“any file such as a.txt”);

InputStream is = GetResourceTest.class.getResourceAsStream(“../../test.txt”);
if (is==null) System.err.println(“file not found”);

Behind the scene, these methods delegate to the classloader’s getResourceAsStream()
http://mindprod.com/jgloss/getresourceasstream.html has good examples.

Yang KPI:learn’huge codebase-goto person, connect user-expectations+codebase

This is an annotated conversation with a team mate.

> “Y (our team lead) has good system knowledge” like how things *should* work.

If Y asks a developer a specific question about how something actually works in the current implementation, he often points out “discrepancies” thanks to his system knowledge. In that case, either the developer is wrong or there’s a bug — Y’s system knowledge is quite sound. I (Bin) feel Y is sharp, quick and can filter and absorb the gist into his big picture.

I feel current implementation is probably fairly close to the “expecation” because it has been used (therefore verified) for years.

I feel Y approaches a complex system from top down, focusing on the how-things-should-work, or so-called “system knowledge” which is non-trivial. That’s one reason Y is more interested in the SQL queries. SQL is useful because DB serves as “checkpoint”.

Part of his learning is weekly user meeting on (high-level or detailed) requirements, bugs … — boils down to user EXPECTATIONS. Team members don’t get that insight. I feel that is more than biz knowledge — such insight connects user expectation with implementation. Much of the insight is summarized in jira and you can tell how much insight there is.

Another part of his learning is knowledge transfer session from his predecessor.

I feel Y asks better questions than I did. “system knowledge” questions. Roland told me he has difficulty getting answers but Y is *persistent* and asked lots of sharp questions until he got all his high/low level doubts cleared. My focus tended to be at a lower level.

Y became a go-to person quickly.

< By system knowledge i think you also mean the PATTERNS and SUMMARIES of logic in codebase. After you study codebase bottom up you inevitably see such patterns.
> “yes”

< how did you pick up in the beginning? There's so much code to get through.
> “only 2 ways to learn — support or development. I worked on development of pgrid and GCA. i followed emails between users and support guys and went into code to verify.”
I (Bin) think there’s also a lot of help from the Infosystem colleagues.

I didn’t follow all the email chains and take those as learning opportunities. Email overload.

Y gave me a plausible learning opportunity — document the haircut rules. But there’s too much logic. I studied it a few times but didn’t connect to the log messages or user issues. I didn’t learn enough to make me a go-to person. Now i know support is the way to go — i get to know what’s important among the hundreds of complex rules and also how-things-should-work.