## significant algo IV P/F xp

See also ##failed c++cod`IV: home^IDE^whiteboard:4 beat-fronts

  • paper — is a shorthand for paper, dumb editor, white-board etc
  • webex vs onsite are somewhat similar, usually pair-programming. Take-Home setting is vastly different.
  • pts (points) — awarded to myself, indicating the performance, value, difficulty ..
  • p/f is based only on the interviewer’s subjective judgement
  • pp means passed convincingly, with flying colors
  • [1] I focused on working solution! See %% priorities in a take-home coding IV
  • [2] highlight: where practice can make a difference in a make-or-break test
lang location IDE? p/f firm/person pts 1-5 year Notes category type@practice needed [2]
c phone phone p SCB-FM 2 2018 pure algo 1 easy + 2 hard ones. I over-performed algo array/slist problems
c weekend IDE 😦 quantlab 4 2018 Houston HFT weekend coding drill ROTI dStruct
c/thr home IDE p Pimco quant 3 2018 2H parallel programming thr study pthreads
c home hackerrank P! Flex 3 2018 link cleanup#github std algo array+list problems
java home hackerrank 😦 Pimco #190m 2 2018 Q9: path through matrix. Ineffi [1] std algo too hard, but some 2D practice can help
java home hackerrank 😦 Pimco #190m 2018 Q8: buy 1 sell any algo array problems
c onsite paper p LiquidNet 2 2018 30 minutes algo array problems
c home codility 😦 Mako 4 2018 monkey. Ineffi [1] algo too hard
c home codility 😦 Mako 3 2018 quasiconstant [1] algo array practice
c onsite paper p SIG 2 2018 15-30min … RAII smart ptr std lang
c onsite paper p SIG 3 2018 push_back std lang
py onsite IDE p SIG 3 2018 movie query dStruct algo practice
c weekend hackerrank p Promethean 5 2018 2 tough graph problems algo too hard but some QQ can help
c/py home hackerrank pp Kraken 3 2018 3 (small?) problems algo algo
any onsite paper 😦 Nsdq 2018 oracle day-trading algo array problems
any onsite paper 😦 Nsdq 2018 check array 0-N in-situ algo array@int problems
java home IDE pp Nsda 4 2018 drone: medium-level implementation design hard to prepare
c onsite paper pp CVA 3 2018 O(N) in-situ filtering algo array problems
c onsite paper 😦 FB 2018 2 short problems std algo algo
c onsite 😦 Trex 2018 exchange messaging SDI no practice can help
py onsite IDE pp Trex 1 2018 3 short tricky problems. only 1 implemented algo algo practice can help
c onsite paper pp Quantum 1 2018 virt func not declared in base trivial lang
c/thr webex paper pp bbg 4 2018 single-thr pool QQ lang
c webex paper pp bbg 3 2018 shared ptr ctor/dtor QQ lang
py onsite paper pp bbg 5 2018 continuousSentence algo backtracking, string problems
c/thr onsite paper ? Wells 2 2017 concurrent queue std
c onsite paper p Wells 1 2017 remove_spaces. Ineffi trivial string problems
c onsite paper p bbg 2 teams 2 2017 array reshuffle lang
c onsite IDE p bbg 2 teams 1 2017 isBST() std tree problems
c onsite paper P! bbg 2 teams 1 2017 string string problems
c onsite paper 😦 bbg 2 teams 2017 filters on screen dStruct heavy dStruct
C home hackerrank 😦 Thesys 2017 3 problems out of 5 algo algo
C home IDE pp Thesys 5 2017 first N prime Fibonacci hard to prepare
c webex IDE pp bbg 2 2017 top N active stocks dStruct heavy dStruct
c onsite paper pp BAML 1 2017 various lang
c weekend IDE 😦 GS 3 2017 TickEngine dStruct heavy dStruct problems can help
C webex IDE pp bbg 2 2017 free slots dStruct heavy algo
C webex paper p bbg 1st round 2 2017 tree serialization std algo tree algo
Java onsite paper pp BGC 2 2017 multiple short programs lang
Java weekend IDE pp BGC 3 2017 connected or not hard to prepare
Java/thr weekend IDE p HSBC 4 2017 3 big questions barely passed hard to prepare
Java/thr home IDE pp pimco 4 2017 iterator again lang
Java onsite paper pp pimco-Zoltan 2 2017 a few short problems lang?
c webex IDE 😦 Citadel 2017 array shrinking array problems
py webex paper ? Broadway 2017 hashtable. Code (on github) works std algo implement std containers
cpp weekend IDE ? iRage 3 2015 order book again. No response dStruct heavy hard to prepare
Java home codility 😦 baml 2015 Qz FX option team. too lazy algo
c home codility 😦 Jump 3rd 2015 algo
c weekend IDE pp jump 1st 3 2012 order book dStruct heavy hard to prepare
c weekend IDE 😦 DRW 3 2015 Tetris. code quality hard to prepare
c webex paper 😦 bbg -London 2015 unable to understand Q1 of 4 std algo algo
c# webex paper pp eikon 2 2013 short, easy questions QQ lang won’t practice
java home IDE 😦 MS 4 2011 FIX. Big take-home test hard to prepare
swing webex IDE pp Barx 2 2012 swing QQ lang won’t practice
C home IDE 😦 Mac 2012 2 short problems lang
java home IDE 😦 MS-comm 2012 too lazy hard to prepare
c onsite paper pp Cantor 1 2011 auto-ptr QQ lang won’t practice
java onsite paper 😦 Barc SOR 2010 recursion recursion
java onsite IDE pp RBC 1 2010 30-45 min, bond tick conversion lang
java onsite IDE 😦 UBS 2011 Suntec lang
java/thr onsite whiteboard 😦 UBS 2011 Jack Yang lockfree stack QQ
java/thr onsite whiteboard 😦 Barc 2011 wait/notify + lockfree QQ
java/thr onsite IDE 😦 Lab49 2010 executor lang
java/thr home IDE pp Gelber 3 2011 multithreaded hard to prepare
C home IDE 😦 Amazon 2011 didn’t take it seriously algo
C onsite whiteboard 😦 FB 2012 regex QQ std algo too hard, but some QQ can make a diff
any onsite whiteboard 😦 Goog 2007 algo

clone a std::vector by memcpy@@ #Deepak

In general, we can’t use memcpy to clone the internal array of a vector.

If we have a vector of trivially copyable value types like int or float, memcpy works and is probably simpler.

If there’s any pointer member data, then we are in trouble. For example, if there’s any std::string in the internal array, memcpy will bypass the string copy operations. The internal pointer in the string will be copied bit by bit, so the clone and the original share one heap buffer. When the original array content gets deleted, our cloned strings are left with dangling pointers, and destroying them later would delete the same buffer twice.

There are many other element types that involve pointer:

  • the element may have a smart ptr data member. All smart ptr classes provide special copy control, not to be bypassed.
  • the element may be another container. Every container I know uses pointers. They all have special copy control.
  • the element may have some shared resource like a mutex or a connection. These resources are typically allocated on heap. When the original vector gets destroyed, the resource will need release or clean-up

Q: For vector of int, is memcpy faster than copying element by element?
A: Probably not. Both O(N). A for-loop to copy an array of integers may compile to the same binary.

proof-of-work ] blockchain #key questions

Based on my discussions with some experts in the field..

Say Cody just paid Bob some amount in bitcoins. At this time, there could be a large number of pending (unaccepted) transactions like this one waiting to be accepted into the linked list (i.e. the block chain).

One or more miners, without coordination, can pick up this transaction + up to 999 other transactions and attempt to add them to the block chain.

They have to overcome a computationally intensive challenge by brute force. They need to find a number (called a nonce) such that the integer hash code from the hash function

H(nonce, // the number the miner varies on each attempt
localhost timestamp, //not always in sync with other hosts
[T1, T2, … T1000], // 1 or more pending transactions
the miner’s Id, //typically the IP
previous block’s hash // <– forming a linked list or block-chain
) < barrier

The barrier is a binary number with 3 (or more) leading zeros. In other words, the challenge is mining for a nonce that gives a low-enough hash code, i.e. one with enough leading zero bits. Once a miner finds a “solution” and creates a valid block, she broadcasts the block’s hash code (with enough leading zeros) to the network. If one node verifies it, then other nodes would soon verify it too. There’s no central authority to decide when to accept. A node accepts the block and uses its hash as “previous block hash” in a new block, as described in TOWP.
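The brute-force search can be sketched in a few lines of Python. This is a toy under stated assumptions, not the real block format: the field layout, the `|` separator, the function names `block_hash`, `mine` and `verify`, and the sample values are all made up for illustration. Only SHA-256 from the standard `hashlib` module is real. The sketch also hints at why verification is so much cheaper than mining: it takes one hash and one comparison.

```python
import hashlib

def block_hash(prev_hash, transactions, miner_id, timestamp, nonce):
    """Hash all block fields plus the nonce into one 256-bit integer."""
    payload = "|".join([prev_hash, ",".join(transactions), miner_id,
                        str(timestamp), str(nonce)])
    return int.from_bytes(hashlib.sha256(payload.encode()).digest(), "big")

def mine(prev_hash, transactions, miner_id, timestamp, zero_bits):
    """Brute force: try nonces 0, 1, 2, ... until the hash falls below the barrier."""
    barrier = 1 << (256 - zero_bits)   # hashes below this have zero_bits leading zero bits
    nonce = 0
    while block_hash(prev_hash, transactions, miner_id, timestamp, nonce) >= barrier:
        nonce += 1
    return nonce

def verify(prev_hash, transactions, miner_id, timestamp, nonce, zero_bits):
    """Verification recomputes a single hash and compares it to the barrier."""
    h = block_hash(prev_hash, transactions, miner_id, timestamp, nonce)
    return h < (1 << (256 - zero_bits))

# Hypothetical sample data; 12 leading zero bits keeps the search to a few thousand tries.
txns = ["Cody->Bob:5", "Ann->Dan:2"]
nonce = mine("00ab", txns, "miner-7", 1500000000, zero_bits=12)
print(nonce, verify("00ab", txns, "miner-7", 1500000000, nonce, zero_bits=12))
```

Each extra leading zero bit roughly doubles the expected number of tries in `mine`, while `verify` stays a single hash.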

The head block of a linked list is special — no previous block! (A Git repo has only one such “block”, but not in blockchain.) 2nd block includes the first block’s hash. 3rd block includes 2nd block’s hash, and indirectly includes first block’s hash. Therefore, for any given block KK, its entire ancestry lineage is hashed into KK.

  • TOWP — TheOriginalWhitePaper
  • POW — proofOfWork
  • lucky — Some miner could be lucky and find a “solution” in the first try, but the challenge is so tough that there’s only brute force.
  • LevelOfDifficulty: the number of leading zeros, like 3. As hardware speeds improve and blocks get found faster, the protocol raises the LevelOfDifficulty automatically (no single party such as the merchant decides), as explained in TOWP
  • immutable — Just like Git, every block is immutable once accepted into the chain.
  • The most common hash functions are SHA-256 (used by Bitcoin) and scrypt.

Q: how fast is the verification vs POW
A: POW is supposed to take 10 min on average. Verification should take nanosec, according to the experts I spoke to. Beside the “attached” transactions, I don’t know what inputs there are to the verification, but it recomputes a hashcode and checks that it is below the barrier.

Q: Fundamentally, how would verification fail when an owner double-spends a coin?

Q: what if two miners both pick Cody/Bob’s transaction and broadcast their “solution” hash codes? The first miner would win the race (and get the reward after some delay)

Q: How are two independent linked lists handled in one host?

Q: what’s the need for POW?

Q: what’s the motivation for a miner to take up a transaction?
A: there’s a 25-coin reward, or possibly higher if Cody or Bob increases the reward

Q: does each coin have an id?
AA: no, but there is ID for each transaction, each block and each account.

Q5: what if verification fails?
%%A: I feel it’s rare but possible. If happens, the node would ignore it.

Q5b: how would the miner know? Not sure.


movie query #SIG

Like the other coding question from the same employer, this is mostly about nested data structure.

Q1: Given sample csv file below (unsorted), implement an efficient query function get_movies(genre, Start_year, End_year). Optimize for query, not initial loading. Bonus points if you make it work within 75 minutes.

  1,Toy Story 3,2010,Adventure|Animation|Children|Comedy|Fantasy
  2,Toy Story,1995,Adventure|Animation|Children|Comedy|Fantasy
  ...... 20,000 more lines

https://github.com/tiger40490/repo1/blob/py1/py/movieQuery_SIG.py is my working code, good enough for a small data set like movies, but inefficient for bigger data set like newspaper articles.

Q2: what if the number of genres could be thousands, like a tag cloud.

— Q1 analysis
My initial thought was on binary search, with O(logN) + O(E-S) which is often equivalent to O(N). Locate first and last item in a sorted list, then iterate over enclosed items.

After some brief discussion with interviewer, I came up with a design of O(1) + O(E-S), arguably the fastest, featuring a jagged 2D array:

  • Assumption: genres are shared and show up repeatedly in the file, rather than random strings that may show up only once.
  • use contiguous array to hold all movies, array indexed by year.
  • Within each “year bucket”, store multiple movies in a contiguous array indexed by genreId.
    • Note there is no gap in the genreId /numbers/. I initially said hardcoded genre enum but the interviewer (Ken) challenged me with frequently changing genres. I now feel we can generate genreId as we load the csv file. This is comparable to some perfect-hashing described in Q2 analysis below.
  • each “cell” holds a list-of-shared_ptr-to-movie
  • each movie object is on heap, and owned by shared_ptr’s
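The nested structure can be sketched as follows. This is a minimal illustration of the design described above, not the code in my github repo. The class name `MovieIndex`, the last two sample rows and all field handling are assumptions. Python object references stand in for the shared_ptr's, and movie titles stand in for movie objects.

```python
import csv
import io

class MovieIndex:
    def __init__(self, csv_text):
        self.genre_ids = {}   # genre name -> dense genreId, generated while loading
        rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
        years = [int(r[2]) for r in rows]
        self.min_year = min(years)
        # Outer array indexed by (year - min_year); each year bucket is a jagged row.
        self.buckets = [[] for _ in range(max(years) - self.min_year + 1)]
        for _movie_id, title, year, genres in rows:
            bucket = self.buckets[int(year) - self.min_year]
            for genre in genres.split("|"):
                gid = self.genre_ids.setdefault(genre, len(self.genre_ids))
                while len(bucket) <= gid:      # grow this year's row only as far as needed
                    bucket.append([])
                bucket[gid].append(title)      # each cell holds a list of movies

    def get_movies(self, genre, start_year, end_year):
        gid = self.genre_ids.get(genre)
        if gid is None:
            return []
        lo = max(start_year, self.min_year) - self.min_year
        hi = min(end_year, self.min_year + len(self.buckets) - 1) - self.min_year
        if lo > hi:                            # query range misses every loaded year
            return []
        result = []
        for bucket in self.buckets[lo:hi + 1]:  # O(E-S) year buckets, O(1) lookup in each
            if gid < len(bucket):
                result.extend(bucket[gid])
        return result

# First two rows are from the question; the last two are made-up sample rows.
SAMPLE = """1,Toy Story 3,2010,Adventure|Animation|Children|Comedy|Fantasy
2,Toy Story,1995,Adventure|Animation|Children|Comedy|Fantasy
3,Heat,1995,Action|Crime|Thriller
4,Inception,2010,Action|Sci-Fi|Thriller
"""
index = MovieIndex(SAMPLE)
print(index.get_movies("Action", 1990, 2000))
```

All the sorting and id assignment happens at load time, which matches the requirement to optimize for query rather than initial loading.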

— Q2 analysis:
If tags are user-generated, random and uncoordinated, then the total population of tags could be unlimited. Our genreId would go up to millions. Solution? Perhaps use a separate hash table within each “year bucket” instead.

However, since the query is “by tag” it implies that the most important tags are shared.

At this point, I feel design should depend on the real, not imaginary, context, so it’s kind of pointless to speculate what type of tags we will get.

OMS skill !! standardized !! portable

I now feel the glorified OMS know-how is vague and poorly-standardized.

  • My B2bTradingEngine at 95G does a lot of OMS.
  • Smart order router module often does some OMS.
  • FIX connectivity module often does some OMS.
  • VWAP and other execution algos are usually considered a category of OMS.
  • … Many of the above are very different skills and have almost nothing in common.

I am afraid that even after 3Y or 5Y in some OMS system, when I apply to another OMS job they may realize I’m not a veteran.

Q: In comparison, which domain is more standardized in terms of skillset?
A: raw mkt data. Note 2nd-hand market data processing is much less valuable
A: basic bond math. Note more advanced bond math is rarely needed
A: FIX connectivity

  • How about VaR?
  • quote distribution systems in Citi and OC? poorly standardized

specialty jobs outpay commodity financial IT jobs@@

In 2007 I worked on the most common type of financial IT job — 1) java servlet applications with 2) a big database and stored-procedures  3) small amount of client-side javascript (possibly with ajax). 4) In my case, there was also some scheduled batch job.
It was in Private-Wealth-Management but no financial domain knowledge required.
(I have since moved on to “fancier” systems, not only pricing or market data.) I started to think the most common dev job like the PWM job doesn’t pay as well, but what’s the reality that you see throughout your career?
Q: does the most common type of financial IT job pay lower because it only needs commodity skills?
[Mithun] Not always. It depends on how critical the demands are and how much $ impact it has.
Trading as such they have money to spend, sometimes demanding & definitely impacts $ 
there is a short-term trend —  depends on demand & supply. 
I have lately seen lots of money thrown at angular even though it might not be as challenging as concurrency, but then you need many angular developers for a few server side developers. But then this skill retires fast.
Perhaps it pays just as well, or 5% lower than the specialty dev jobs in trading systems?
Commodity java dev will get around 145k while concurrency guy will get 175k…. 
Things have changed in that “most common type of financial IT job”, which I can only define in vague terms.
* once there was lots of spring/hibernate usage
* once there were lots of xml-related server-side libraries
* I guess there are more RESTful interfaces?
* more javascript code
* fewer stored procedures
* more noSQL systems, less SQL
* I guess there’s now more Hadoop??
yes Hadoop , spark, but I see a lot of demand for python, and they are coming with rich libraries.

buggy RAII smart_ptr #SIG

Q1: given a well-behaved class X, someone wrote a RAII wrapper below. What problems do you see?

Q1b: how would you fix them?

%%A1: memory leak of x1, due to the synthesized op=()
%%A1: double-delete on x2

Q1c: OK Good. Now what if X has subclass Y,  all with virtual dtor?
%%A1c: X and Y each create virtual clone() to return a pointer to the host type

Q2: what’s the outcome of double-delete?
AA: undefined behavior. You are lucky if it crashes.

struct A{
  A(X* x): myX_(x){}
  ~A(){delete myX_;}
  X* myX_;
};
void demo(){
  A a1(new X()); //unnamed x1 object on heap
  A a2(new X()); //unnamed x2 object on heap
  a1 = a2; //synthesized op=() copies the raw pointer: x1 leaks, x2 gets deleted twice
}
/////////// above is original code; below is my Q1b answer -- Add op=() as a member of struct A
A & operator=(A const & rhs){
  if (this != &rhs){
    auto tmp = this->myX_; //delete only after successful "new"
    this->myX_ = new X(*rhs.myX_); //clone the X object of rhs
    // if new or ctor throws, myX_ is not yet overwritten, so nothing leaks 🙂
    delete tmp; //let's assume no exception here.
  }
  return *this;
}
// The synthesized copy ctor has the same double-delete problem, so it needs the same deep copy.

design IV: telltale signs of disapproval

(Note a subtype of design interview is the SDI.) When interviewer asks, as described in https://www.susanjfowler.com/blog/2016/10/7/the-architecture-interview:

  • why you choose this approach (this design, this direction…)
  • what are the pros and cons of this approach
  • what alternatives there might be

It’s 80% a sign of disapproval.

If they are only curious, they would phrase it differently.

Remember interviewer has a rather fixed idea how this design should go. If you propose something unfamiliar to her, she can’t “lead” the interview with confidence. She risks losing control. Therefore, she has no choice but steer you back to her familiar territory.

Most interviewers won’t want to admit that your idea is plausible, just different from theirs.

per-thread^per-DBConn writeBuffer

  1. Broadly speaking, if you write to a table in your DB connection but don’t commit, it will be invisible to other connections.
  2. Similarly, if a thread writes to a shared object without a memory fence, the update may sit in a per-thread (really per-CPU) write buffer and remain invisible to other threads.
  3. File write is an even more common scenario. One process writes a single line to the file, but doesn’t flush. It will not be visible to other processes.
  4. CVS
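Items 1 and 3 are easy to reproduce with the Python standard library. The sketch below is only an illustration: the table name `t`, the file names and the variable names are made up, and SQLite stands in for whatever DB you use. Item 2 (the memory fence) can't be shown reliably in Python, so it is left out.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Item 1: an uncommitted insert is invisible to a second DB connection.
    db_path = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(db_path)
    writer.execute("CREATE TABLE t (x INTEGER)")
    writer.commit()
    reader = sqlite3.connect(db_path)
    writer.execute("INSERT INTO t VALUES (1)")   # opens a transaction, not yet committed
    before_commit = reader.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]
    writer.commit()
    after_commit = reader.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]
    writer.close()
    reader.close()

    # Item 3: a written-but-unflushed line is invisible to another reader of the file.
    file_path = os.path.join(tmp, "demo.txt")
    out = open(file_path, "w")
    out.write("one line\n")                      # still sitting in the writer's buffer
    with open(file_path) as f:
        before_flush = f.read()
    out.flush()
    with open(file_path) as f:
        after_flush = f.read()
    out.close()

print(before_commit, after_commit)
print(repr(before_flush), repr(after_flush))
```

In both cases the data exists only in the writer's private buffer until an explicit commit or flush publishes it.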

##simplicity@design pushed to the limit

Note in these designs, the complexity can never disappear or reduce. Complexity shifts to somewhere else more manageable.

  • [c] stateless — http
  • microservices
    • complexity moves out of individual services
  • [c] pure functions — without side effects
  • use the database concept in solving algo problems such as the skyline #Gelber
  • stateless static functions in java — my favorite
  • EDT — swing EDT
  • singleton implemented as a static local object, #Scott Meyers
  • [c] garbage collection — as a concept.
    • Complexity shifts from application into the GC module
  • STM
  • REST
  • in c# and c++, all nested classes are static, unlike in java
  • python for-loop iteration over a dir, a file, a string … See my blog post
  • [c] immutable — objects in concurrent systems
  • [c] pipe — the pipe concept in unix is a classic
  • [c=classic, time-honored]

stateless (micro)services #%%1st take

In 2018, I have heard of more and more sites that push the limits of stateless designs. I think this “stateless” trend is innovative and /bold/. Like any architecture, these architectures have inherent “problems” and limitations, so you need to keep a lookout, deal with them and adjust your solution.

Stateless means simplicity, sometimes “extreme simplicity” (Trexquant)

stateless means easy to stop, restart, backup or recover

Stateless means lightweight. Easy to “provision”, easy to relocate.

Stateless means easy scale-out? Elastic…

Stateless means easy cluster. Http is an example. If a cluster of identical instances is stateless then no “conversation” needs to be maintained.

[18]fastest threadsafe queue,minimal synchronization #CSY

I got this question in a 2017 Wells white-board coding interview, and discussed with my friend Shanyou. We hoped to avoid locks and also avoid other synchronization devices such as volatile, atomic variables..

Q1: only a single producer thread and a single consumer thread and no other threads.

I put together a java implementation that can enqueue without synchronization, most of the time, until See https://github.com/tiger40490/repo1/blob/jProj/java/com/wells/UnboundedQFor1Producer1Consumer.java

Q1b: Is it possible to avoid synchronization completely, i.e. single-threaded mode?
A: No. Consumer thread would have absolutely NO idea whatsoever how close it is to the producer end. No. We need a memory barrier at the very least.

Q2: what if there are multiple producer/consumer threads?

I believe we can use 2 separate locks for the two ends, rather than a global lock. This is more efficient but invites the tricky question “how to detect when the two ends meet”. I am not sure. I just hope the locks enforce a memory barrier.
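One known answer to “how to detect when the two ends meet” is the two-lock queue with a dummy node: the head always points at a dummy, so producers only ever touch the tail and consumers only ever touch the head. Below is a minimal Python sketch of that technique, not my Wells code. The class and method names are made up, and it relies on CPython making a single reference assignment atomic. In java or c++ the `next` field would need to be volatile or atomic.

```python
import threading

class _Node:
    __slots__ = ("value", "next")
    def __init__(self, value=None):
        self.value = value
        self.next = None

class TwoLockQueue:
    def __init__(self):
        dummy = _Node()
        self._head = dummy    # always a dummy node; real data starts at _head.next
        self._tail = dummy
        self._head_lock = threading.Lock()   # taken by consumers only
        self._tail_lock = threading.Lock()   # taken by producers only

    def enqueue(self, value):
        node = _Node(value)
        with self._tail_lock:
            self._tail.next = node   # publish the new node
            self._tail = node

    def dequeue(self):
        with self._head_lock:
            first = self._head.next
            if first is None:        # the two ends have met: queue is empty
                raise IndexError("dequeue from empty queue")
            value = first.value
            first.value = None       # first becomes the new dummy node
            self._head = first
            return value

q = TwoLockQueue()
for item in ("a", "b", "c"):
    q.enqueue(item)
print(q.dequeue(), q.dequeue(), q.dequeue())
```

A producer and a consumer never contend for the same lock, so they can run in parallel even when the queue holds a single item.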

Alternatively, we could use CAS on both ends.


thread^process: lesser-known differences #IV

Popular IV question. Largely a QQ question.  Some may consider it zbs.

To the kernel, there are many similarities between the “thread” construct and the “process” construct. In fact, a (non-kernel) thread is often referred to as a LightWeightProcess in many kernels such as Solaris and Linux.

  • context switching — is faster between threads than between processes. In linux, context switching between kernel-threads is even faster.
  • creation — some thread libraries can create threads without the kernel knowing. No such thing for a process.
  • socket — 2 threads in a process can access the same socket; two processes usually can’t access the same socket, unless … parent-child. See post on fork()
  • memory — thread AA can access all heap objects, and even Thread BB’s stack objects via pointers. Two processes can’t share these, except via shared memory.
  • a non-kernel thread can never exist without an owner process. In contrast, a process can outlive its parent: every process is created by a parent process, which could be long gone.


SDI: URL shortening

Q: Design TinyURL or bitly (a URL shortening service)

Given a (typically) long URL, how would you design a service that would generate a shorter and unique alias for it?

Discuss things like:

  • How to generate a unique ID for each URL?
  • How would you generate unique IDs at scale (thousands of URL shortening requests coming every second)?
  • How would your service handle redirects?
  • How would you support custom short URLs?
  • How to delete expired URLs etc?
  • How to track click stats?
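For the first bullet, one widely used approach is to give each URL a sequential integer id and publish the id in base 62. The sketch below is my own illustration of that idea, not the answer from any interviewer. The names `encode`, `decode` and `Shortener` are made up, and the in-memory dicts stand in for a real data store. Generating ids at scale (the second bullet) is not addressed here.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols

def encode(n):
    """Convert a non-negative integer id to a base-62 alias."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

def decode(alias):
    """Convert a base-62 alias back to the integer id; raises ValueError on bad characters."""
    n = 0
    for ch in alias:
        n = n * 62 + ALPHABET.index(ch)
    return n

class Shortener:
    def __init__(self):
        self._next_id = 1
        self._url_by_id = {}
        self._id_by_url = {}

    def shorten(self, long_url):
        if long_url not in self._id_by_url:      # the same URL always gets the same alias
            self._id_by_url[long_url] = self._next_id
            self._url_by_id[self._next_id] = long_url
            self._next_id += 1
        return encode(self._id_by_url[long_url])

    def resolve(self, alias):
        try:
            return self._url_by_id.get(decode(alias))
        except ValueError:                       # alias contains a character outside ALPHABET
            return None

service = Shortener()
alias = service.shorten("https://example.com/a/very/long/path?with=query")
print(alias, service.resolve(alias))
```

Six base-62 characters already cover about 56 billion ids, which is why the aliases stay short.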

https://www.interviewbit.com/problems/design-url-shortener/ is a long discussion.

SDI: DoS-guard #Nsdq

Q: Design an API Rate Limiter (e.g. for Firebase or Github)

You are expected to develop a Rate Limiter service that can:

  • Limit the number of requests an entity can send to an API within a time window e.g., 15 requests per second.
  • The rate limiting should work for a distributed setup, as the APIs are accessible through a cluster of servers.
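The first requirement can be met with a sliding-window log per entity. The sketch below is a single-process illustration only, offered as one possible approach. The class name `SlidingWindowLimiter` and its parameters are made up, and the clock is injectable so the logic can be tested without sleeping. For the distributed requirement, the per-entity log would have to live in a store shared by all servers, which is not shown.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)   # entity -> timestamps of accepted requests

    def allow(self, entity):
        now = self._clock()
        hits = self._hits[entity]
        while hits and now - hits[0] >= self._window:
            hits.popleft()                # forget requests that fell out of the window
        if len(hits) >= self._max:
            return False                  # over the limit: reject without recording
        hits.append(now)
        return True

# 15 requests per second, as in the example above; the fake clock is for the demo only.
fake_now = [0.0]
limiter = SlidingWindowLimiter(15, 1.0, clock=lambda: fake_now[0])
results = [limiter.allow("10.0.0.1") for _ in range(16)]
print(results.count(True), results.count(False))
```

Memory per entity is bounded by `max_requests` timestamps, since rejected requests are never recorded.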

(A similar question was asked at Nsdq… )

Q2: how does your cluster of cache servers detect that a given IP on the Internet is sending requests too frequently, causing Denial of Service? How do you protect yourself?

Q2b: After you blacklist a client IP, it goes quiet, then it sends a single request again. How do you decide whether to ignore the request?

Q2c: what algorithm to decide if a client IP has legitimate need to send lots of requests vs another client IP engaging in Denial of Service attack?

Q2d: what if distributed DoS attack?

https://en.wikipedia.org/wiki/Denial-of-service_attack#Defense_techniques has practical solutions.

2nd in-depth job]c++ #critical mass


  • After Citi-muni, even though I had enough experience to pass similar job interviews, I didn’t feel confident in GTD, so I took a 2nd real time trading system job in Baml, and reached critical mass
    • I did learn more in the ensuing 3 months than I would have over another 3 months in Citi
  • RTS is similar. I could already pass real time c++ interviews, but I didn’t feel confident in GTD.
    • Note the Mac job couldn’t count as a substantial c++ job.

Q: How about c#? I actually feel confident about GTD in a future c# team, so I didn’t need a 2nd c# job?

most(new/old)specializations turn out non-strategic

Look at my past vindicative specializations vs 

The Singapore government makes technology bets. Venture capitalists make bets. Many organizations also make technology bets. Every technology bet can lose or win.

In this post, I’m not advocating trySomethingNew. I am advocating specialization, which often requires accumulation and sticking to something not so new, like FIX, threading, SQL.

If you play safe and stay within the comfort zone of java/SQL/Perl, then don’t under-estimate the negative consequences such as

  • reactive
  • doldrums — see post on “y re-enter c++”
  • no deepening your understanding — a zbs
  • remain afraid and /uninitiated/ with the lower-level details below JVM

engaging #marketableDomainXp.xls

Let’s keep this in blog , not spreadsheets.

Engagement is a real contributor to job satisfaction. Because of it I walked away from CVA, Barx and other higher, easier java jobs.

This factor is notoriously and inherently elusive, unstable (strategic). Engagement is impermanent, then-and-there, like joy and sadness. I once found wafer fab spacesuit fashionable! Big data, wpf, quant, swing, FIX, kdb, … were once attractive to me. What’s engaging used to be strategic trySomethingNew, but look at other blog posts… It’s hard to identify some activity as long-term engaging.

Q: On marketable-tech-xp.xlsx, which factors on row 2 help keep my mind engaged?
– Some form of complexity is always helpful. There are some on Row 2.
– Poor “market value” always decimates my “engagement”. There are some protective factors on Row 2.

The job_satisfaction_predictor spreadsheet compares past jobs in terms of engagement.


blockingMutex implementation ] kernel

Background — Linux kernel provides two types of locks — spinlock and blocking mutex, as in https://www.kernel.org/doc/htmldocs/kernel-locking/locks.html . Here I focus on the mutex. I think this is far more useful to userland applications.

https://lwn.net/Articles/575460/ has good pointers:

  • I believe context switch is expensive since CPU cache has to be replaced. Therefore, optimistic spin is beneficial.

https://github.com/torvalds/linux/blob/master/kernel/locking/mutex.c shows

  • a blocking mutex used in kernel, perhaps not directly used by userland apps
  • implemented using spin lock + some wait_lock
  • maintains a wait_list. Not visible to any userland app.


c++^java..how relevant ] 20Y@@

See [17] j^c++^c# churn/stability…

C++ has survived more than one wave of technology churn. It has lost market share time and time again, but hasn’t /bowed out/. I feel SQL, Unix and shell-scripting are similar survivors.

C++ is by far the most difficult language to use and learn. (You can learn it in 6 months but likely very superficially.) Yet many companies still pick it instead of java, python, ruby, which is a sign of strength.

C is low-level. C++ usage can be equally low-level, but c++ is more complicated than C.

given value X,get neighboring nodes in treeMap

Basically, find the left/right neighbor nodes.

! Don’t use upper_bound since lower_bound is enough.

  • If perfect match, then lower_bound return value is all you need. No need for 2 nodes:)
  • If no perfect match, then lower_bound() and prev(lower_bound)
  • if X too low, then begin() alone is all we can get
  • if X too high then prev(end()) alone is all we can get
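The four cases above can be sketched in Python, where `bisect_left` on a sorted list has the same contract as `std::lower_bound` (first position whose key is >= X). The function name `neighbors` and the use of `None` for a missing neighbor are my own choices for illustration. A sorted list stands in for the treeMap.

```python
from bisect import bisect_left

def neighbors(sorted_keys, x):
    """Return (left, right) neighbors of x; both equal x on a perfect match."""
    i = bisect_left(sorted_keys, x)            # same role as lower_bound()
    if i < len(sorted_keys) and sorted_keys[i] == x:
        return (x, x)                          # perfect match: one node is enough
    left = sorted_keys[i - 1] if i > 0 else None               # prev(lower_bound)
    right = sorted_keys[i] if i < len(sorted_keys) else None   # lower_bound itself
    return (left, right)

keys = [10, 20, 30]
print(neighbors(keys, 20))   # perfect match
print(neighbors(keys, 25))   # between two nodes
print(neighbors(keys, 5))    # too low: only begin() is available
print(neighbors(keys, 35))   # too high: only prev(end()) is available
```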

See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/curveInterpolation_CVA.cpp

lower_bound() means touchUpperBound #4 scenarios

  • std::upper_bound() should be named strictUpperBound()
  • std::lower_bound() should be named touchUpperBound() since it can return an element touching the target value

If no perfect hit, then both return the same node, namely the lowest node above the target

  1. if target is too high, both return end(), which is a fake element
  2. if target is too low, both return begin(), which is a real element
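  3. if target is matching one of the nodes, then perfect hit: lower_bound returns the matching node while upper_bound returns the next node
  4. if target is between 2 nodes, then both return the higher of the 2 nodes. This is the most common.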
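The four scenarios are easy to check with Python's `bisect_left` and `bisect_right`, which follow the same contracts as `std::lower_bound` and `std::upper_bound` but return indices rather than iterators. In this sketch the helper name `bounds` is made up, and index `len(keys)` plays the role of `end()`.

```python
from bisect import bisect_left, bisect_right

def bounds(keys, target):
    """Return (lower_bound index, upper_bound index) for target in sorted keys."""
    return bisect_left(keys, target), bisect_right(keys, target)

keys = [10, 20, 30]
print(bounds(keys, 35))   # (3, 3): too high, both are end()
print(bounds(keys, 5))    # (0, 0): too low, both are begin()
print(bounds(keys, 20))   # (1, 2): perfect hit, lower touches the target, upper is strictly above
print(bounds(keys, 25))   # (2, 2): between 2 nodes, both point at 30
```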

https://github.com/tiger40490/repo1/blob/cpp1/cpp/66miscIVQ/curveInterpolation_CVA.cpp caters to all four scenarios.

Can this array be preorder-BST#hackerrank

This was asked in a 2018 hacker rank interview, not as important as an on-site coding question. However, I see this question as a classic.

https://www.geeksforgeeks.org/check-if-a-given-array-can-represent-preorder-traversal-of-binary-search-tree/ has a tested solution but it’s too cryptic. I added instrumentation to help myself understand it. See my github code.

My analysis — After we have seen N (say 7) nodes, there’s exactly one possible BST we could construct, so let’s construct it. The next node would have only one place to go, and we can check if it obeys pre-order. https://github.com/tiger40490/repo1/blob/py1/py/tree/canBePreOrderBST.py is my tested code, probably less elegant than the above, but I’m still proud of it.
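For comparison, here is a compact sketch of the stack-based technique, as I understand the G4G approach. It is not my github code, and the function name is made up. The stack holds the path of ancestors whose right subtree has not started yet. Popping an ancestor means we have moved into its right subtree, so every later value must exceed it.

```python
def can_be_preorder_bst(values):
    """Return True if values could be the pre-order traversal of some BST."""
    lower_bound = float("-inf")   # largest ancestor whose right subtree we are in
    stack = []                    # ancestors still waiting for a right subtree
    for v in values:
        if v < lower_bound:
            return False          # v would have to sit left of an ancestor we already passed
        while stack and stack[-1] < v:
            lower_bound = stack.pop()   # v starts the right subtree of this ancestor
        stack.append(v)
    return True

print(can_be_preorder_bst([40, 30, 35, 80, 100]))
print(can_be_preorder_bst([40, 30, 35, 20, 80, 100]))
```

Each value is pushed once and popped at most once, so the inner while-loop does not add up to more than N pops over the whole run.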

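I initially doubted that any solution (including the G4G) is really O(N), but the G4G stack solution is amortized O(N), since each element is pushed and popped at most once. My solution is not necessarily inferior. It has a readability advantage. It’s longer and may be slower on a heavily skewed tree.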


## execution algo + other domains / skills

OMS — is probably a core module in execution system. Equity and FX systems tend to need OMS but many trading desks presumably need no OMS, very simple OMS or off-the-shelf OMS. A buy-side trading desk may simply use the OMS of the broker. The same could be said of “execution systems” like VWAP algos.

Therefore, I feel the importance of OMS/EMS is over-stated.

SOR — is more niche, less “generic” a skill, but as a module is more visible to the business.

FIX connectivity — is a more generic tech skill and shows resilience to churn.

mkt data — is more “separate” from the other skills.

Nsdq onsite IV

Q: Java NonBlocking IO?
%%A: one thread monitoring multiple sockets

Q: many simple Objects with complex relationships/interactions vs one (or very few) huge complex object?

Q: thread Futures? Any comments?

–some of my findings

  • microservices and containers are important
  • According to the architect, the new framework uses one dedicated instance of matching engine serving a single symbol such as AAPL.
  • the new java framework only supports a few smaller exchanges initially, but is designed to roll out to more “exchanges” esp. those to be installed on client sites (like CSP). It is (theoretically) designed to be capable enough to implement the main stock exchange.
  • is Mostly java + javascript (Amber, react) with small amount of c++
  • the new java framework was deliberately designed to operate in single-threaded mode, despite opposition. Therefore, multiple “clients” calling the same library concurrently would be unsafe.
  • java8
  • noSQL is adopted in addition to PostgreSQL, but the principal architect is less concerned about data store, which is only used in individual components

check array@0-N in-situ #Nsdq#contrived

— Q1: Given a size-5 array of integers. For every element x: 0<= x <=4. Array would be considered Good if all elements are unique, i.e. all the numbers 0 to 4 show up exactly once each. Please write a program to check if the array is good. If Bad, then print every number that’s showing more than once. O(N) time and O(1) space please.

We will denote array size as sz, so N == sz-1.  N=4 and sz=5 in the opening example.

— comments:

I feel this question (even without the bonus) is too hard to complete on the spot, unless you have done it before.

I made a mistake that would be unforgivable in west coast and Bloomberg interviews — failed to clarify requirements and assumed input array is immutable.

— Q1b (bonus): Also print every missing number. (Requires iterating entire input.)

Example A {1,2,0,2,0} has repeating 0 and 2, and missing 3 and 4. Example B {3,1,4,2,0} is Good.

— comments:

If you feel the O() complexity requirements are daunting, then work out a less efficient solution first, and improve on it.

I feel in real projects, we always have enough memory to build a separate array, so the O(1) space requirement is contrived. An O(N) time and O(N) space solution is easier. Therefore, I rate this coding question as contrived and relatively non-standard. Yet the required techniques are “universal” in many high-end interviews.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/array/checkArr0-N_Nsdq.cpp has my 99% solution. The unfinished part is trivial.
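One way to meet both O(N) time and O(1) space, assuming the input array is mutable, is to use the array itself as the counter: add sz to `arr[x]` once for every occurrence of x. The original value survives as `arr[i] % sz` and the count of i is `arr[i] // sz`. The Python sketch below illustrates that trick and is not necessarily what my github solution does. The function name is made up, and the two result lists exist only to make the output testable (an interview answer would simply print). In c++ the stored values can grow to about sz*sz, so beware of integer overflow for large arrays.

```python
def check_array(arr):
    """Return (repeated, missing) for an array whose elements all lie in 0..len(arr)-1."""
    sz = len(arr)
    for i in range(sz):
        arr[arr[i] % sz] += sz      # one more occurrence of the value arr[i] % sz
    repeated = [v for v in range(sz) if arr[v] // sz > 1]
    missing = [v for v in range(sz) if arr[v] // sz == 0]
    for i in range(sz):
        arr[i] %= sz                # restore the caller's array
    return repeated, missing

example_a = [1, 2, 0, 2, 0]
example_b = [3, 1, 4, 2, 0]
print(check_array(example_a))   # Example A: repeating 0 and 2, missing 3 and 4
print(check_array(example_b))   # Example B is Good: nothing repeated, nothing missing
```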



max-profit #Nsdq short-sell

Q1: given a time series of price points within a past day, there are many ways to trade the day — one buy-sell, five buy-sells, or do-nothing … Short-sell allowed, but you must start and end the day holding no shares. For example, if you sell 9 times then you must buy 9 times, each time 1 lot (100 shares) exactly. Write a function to trade the day and produce the highest profit.  Note you can analyze the price points with perfect hindsight.

Interviewer asked for real code. Very little time given, so I believe the right solution is short, and much simpler than the question on “array of 0-N” (where interviewer asked for pure algorithm).

https://github.com/tiger40490/repo1/blob/cpp1/cpp/array/oracleDayTrader_Nsdq.cpp is my buggy “greedy” solution.

https://github.com/tiger40490/repo1/blob/py1/py/array/maxProfit_Nsdq.py is my new solution.
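A short sketch of one way to read Q1, not necessarily what either linked solution does. I assume at most one 1-lot trade per price point. Since short-selling is allowed and there is no position limit, the order of trades doesn't matter, only that buys and sells are equal in number. So we sell at the highest half of the prices and buy at the lowest half. The function name is hypothetical and the profit is in price units per share.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// ASSUMPTION: at most one trade (exactly 1 lot) per price point.
// Takes the prices by value because it sorts its own copy.
long long maxProfitOneTradePerPoint(std::vector<long long> prices) {
    std::sort(prices.begin(), prices.end());
    const std::size_t k = prices.size() / 2; // number of buy/sell pairs
    // Buy at the k lowest prices, sell at the k highest prices.
    const long long buys = std::accumulate(prices.begin(), prices.begin() + k, 0LL);
    const long long sells = std::accumulate(prices.end() - k, prices.end(), 0LL);
    return sells - buys; // with an odd count, the median price is left untraded
}
```

For prices 1,2,3,4 this gives (4+3)-(1+2) = 4. Adding one more buy/sell pair never reduces the profit because the i-th highest price is never below the i-th lowest.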

Q2: now you are allowed to buy UP-TO 100 shares each time. All other rules remain. When if ever is it optimal to buy/sell fewer than 100 shares (a.k.a. an odd lot)?
%%A: never

Q3: same as Q1 but now your holding must always be long 1 lot, short 1 lot or zero holding.
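Q3 looks like a small dynamic-programming problem over the three allowed holdings. The sketch below is my own reading, assuming at most one 1-lot trade per price point, so flipping from long to short takes two price points. The function name is hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <limits>
#include <vector>

// cash[h+1] = best cash so far while holding h lots, h in {-1, 0, +1}.
// ASSUMPTION: at most one 1-lot trade per price point.
long long maxProfitBoundedHolding(const std::vector<long long>& prices) {
    const long long NEG = std::numeric_limits<long long>::min() / 2; // unreachable state
    std::array<long long, 3> cash{{NEG, 0, NEG}}; // start the day flat with zero cash
    for (long long p : prices) {
        std::array<long long, 3> next = cash; // option 1: do nothing at this price
        for (int h = 0; h < 3; ++h) {
            if (cash[h] == NEG) continue;
            if (h + 1 < 3) next[h + 1] = std::max(next[h + 1], cash[h] - p); // buy 1 lot
            if (h - 1 >= 0) next[h - 1] = std::max(next[h - 1], cash[h] + p); // sell 1 lot
        }
        cash = next;
    }
    return cash[1]; // must end the day flat
}
```

Each price point costs constant work, so the whole day is O(N) time and O(1) space.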

–(broken?) brute force solution I gave on the spot:

Start with just four price points 1,2,3,4. Name every pair A=1>2; B=1>3; C=1>4; D=2>3; E=2>4; F=3>4. Each pair represents a buy/sell round trip, with a profit (ignore it if unprofitable).

How many “plays” i.e. how many ways to trade the day? 2^6 plays.

Just evaluate each play and find the best. Beware that ABD is same as B x 2 and should be scaled down by 2. Similarly, AEBFC == C x 3

##[15] top22 low-level IV topics : c++imt Java

This is collected after i stopped active interviewing in Aug 2012.

Low-level domains are my strength, proven again in 2017-2018 interviews. Besides the obvious domains like threading, data structures, c++/java/c#..

  • exception in c++/java
  • concurrent singleton
  • GC
  • big-O analysis in west coast interviews
  • linux/compiler/C++ interviews are even more low-level than java. Despite serious “weak-joints”[1], I generally excel at the lower-level
    • shared mem — pointer therein!
    • sockets .. like kernel bypass
    • unix signals
    • cpu cache – instruction and data
    • inline
    • rvr, the most important c++0x feature, is very low-level
    • big-4
    • pbclone^pbref^pbptr, slicing
    • undefined behavior – usually low-level
    • internals of deque, vector/hashtable reallocation…
    • smart pointer internals
    • reinterpret_cast vs move()
    • pass a functor to a thread
    • [1] See items of the same color
  • —-On the other hand, interviewers don’t bother to go low level on …
  • thread pool? somehow never low-level
  • Unix commands, python? never low level
  • (noSQL and) SQL? no
  • FIX? not so low-level
  • SOA, MOM
  • asynch designs
  • design patterns

CVA c++ IV 2 #oq

void func(ResourceMgr & rm){
  int connId = rm.getConn();
  double d=externalFunc(connId);
  rm.reclaim(connId); //release the resource to the mgr
}
  • Q5: In the code above, external func can throw, so write a RAII wrapper to prevent resource leak.
  • Q5b: what if you aren’t allowed to use RAII? Ok you said catch-all. Is an empty catch block enough?
    AA: need to re-throw the original exception, to mimic the RAII behavior.
  • Q(paper coding): Iterate over two vectors in lock steps. Which is faster — iterator vs vector index?
  • Q (bond math): Is there any uncertainty in the present value calc on a pre-existing vanilla IRS?
  • q: what design patterns do you use?
  • q: write a singleton skeleton
  • Q: how do we make a class “final” in pre-c++11
    %%A: either make dtor private or make all constructors private
  • Q: is shared_ptr thread safe?
  • Q: any difference — const shared_ptr<T> vs shared_ptr<const T>?
  • %%Q: does it make sense to pass a shared_ptr by const reference? I feel it’s cleaner to pass by raw ptr
    AA!: pass by const-reference is faster and recommended
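A minimal sketch for Q5 and Q5b. ResourceMgr and externalFunc below are hypothetical stand-ins (only getConn() and reclaim() appear in the question), and ConnGuard is a name I made up.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the interviewer's ResourceMgr. The counter is
// added only so the leak (or absence of it) can be observed.
struct ResourceMgr {
    int outstanding = 0;
    int getConn() { ++outstanding; return 42; }
    void reclaim(int /*connId*/) { --outstanding; }
};

// Hypothetical external function; this stub always throws for connId 42.
double externalFunc(int connId) {
    if (connId == 42) throw std::runtime_error("external failure");
    return 1.0;
}

// Q5: RAII wrapper. Acquire in the ctor, reclaim in the dtor, which also
// runs during stack unwinding.
class ConnGuard {
public:
    explicit ConnGuard(ResourceMgr& rm) : rm_(rm), connId_(rm.getConn()) {}
    ~ConnGuard() { rm_.reclaim(connId_); }
    ConnGuard(const ConnGuard&) = delete;            // one owner per connection
    ConnGuard& operator=(const ConnGuard&) = delete;
    int id() const { return connId_; }
private:
    ResourceMgr& rm_;
    int connId_;
};

void func(ResourceMgr& rm) {
    ConnGuard guard(rm);
    double d = externalFunc(guard.id()); // may throw; guard still reclaims
    (void)d;
}

// Q5b: without RAII, the catch-all block must reclaim and then re-throw.
void funcNoRaii(ResourceMgr& rm) {
    int connId = rm.getConn();
    try {
        double d = externalFunc(connId);
        (void)d;
    } catch (...) {
        rm.reclaim(connId);
        throw; // re-throw the original exception, mimicking the RAII behavior
    }
    rm.reclaim(connId);
}
```

In both versions the caller still sees the original exception, and the connection goes back to the mgr.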


  • Q: why use weak_ptr instead of raw ptr. Valid question.
    A: See [[std c++lib]] P84.
    A: See last paragraph in https://bintanvictor.wordpress.com/2010/01/20/weak_ptr-can-access-inner-ptr-only-through-a-shared_ptr/
  • Q: you said atomic<int> operator++ probably wraps a CAS in a while loop internally, so is it spinlock?
    %%A: I think so.   http://www.cs.cornell.edu/courses/cs4410/2015su/lectures/lec06-spin.html is a detailed explanation
  • Q34: various synchronization techniques?
  • Q34b: what’s the c++ function to force a memory barrier?
    A: std::atomic_thread_fence(),
    but I feel if an application (not a library) uses this function then something is seriously wrong.
  • Q: can we implement a mutex using CAS?
    AA: blockingMutex implementation ] kernel
    AA: spinlock, not blockingMutex, can be implemented as in   https://www.andrew.cmu.edu/course/15-440-sp09/applications/ln/lecture4.html
  • ? Q (paper coding): write a piecewise linear interpolation Curve class with a ctor taking a vector<pair<double, double>>.  See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/curveInterpolation_CVA.cpp
  • Q: What are r-values and r-value references
  • Q: Explain your use of SFINAE. See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/template/SFINAE_demo1.cpp
  • Q: what’s the c++ library function for cmpxchg?
    A: std::atomic<T>::compare_exchange_weak() or compare_exchange_strong()
  • Q: is your barcap vol surface used for EOD risk only?
    %%A: no. A trader might suspect a lesser known product is currently mis-priced. She could confirm the live bid/ask are obvious outliers on the fitted surface.
    %%A: a dealer desk prices new deals based on the fitted surface, whether or not the strike/expiry requires interpolation.
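To make the CAS questions above concrete, here is a minimal spinlock sketch built on std::atomic<bool>. It is a busy-waiting lock only, not a blocking mutex (which needs kernel support). The class name is mine and the memory orders are one reasonable choice, not the only one.

```cpp
#include <atomic>
#include <cassert>

class SpinLock {
public:
    void lock() {
        bool expected = false;
        // compare_exchange_weak may fail spuriously, hence the retry loop.
        while (!locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
            expected = false; // a failed CAS overwrote it with the observed value
        }
    }
    bool try_lock() {
        bool expected = false;
        // The strong version never fails spuriously, so a single attempt is enough.
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
private:
    std::atomic<bool> locked_{false};
};
```

Since it offers lock(), try_lock() and unlock(), it can be used with std::lock_guard<SpinLock>.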

–Simon Ma:

  • Q (paper coding): implement Fib() recursively and iteratively
  • Q: what’s inline?
  • Q: semaphore vs mutex
  • Q: how did you compute greeks like gamma?
  • Q (bond math): given 6M spot IR = 5% and 12M = 10%, what’s the 6M rate 6M forward?
  • Q: Assign first half of a vector to another vector
    %%A: vector::assign() takes an iterator pair, so I can pass v.begin(), v.begin()+v.size()/2
  • Q (obscure): What data structure to represent a directed graph of N nodes? See NxN matrix for graph of N nodes
    %%A: create a Node class with a single ptr field…
  • Q: use parallel algorithm to compute sum of a vector<int>
    %%A: only the last stage global aggregation has a shared mutable that needs protection. The sub-aggregation can be single-threaded.
  • Q (probability problem): Both Alan and Bob agree to show up at Cafe9 sometime between 12 and 2pm. Whoever arrives first would wait for 15 minutes only. What’s the probability of them actually meeting?
  • Q: what’s thread_local?
    %%A (correct): a new storage class that’s similar to static
  • Q (paper coding): A natural number is Good if it is a product of only 3, 5 and 7. The smallest Good numbers are 1,3,5,7,9,15,21,25,27,35… Write a print(int N) to print the Nth Good number. Hint: write a isGood(int k). See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/isGood357_CVA.cpp
  • Q (unclear): implement subtract(int a, int b) using only operator+ and comparison operators. I feel this question is unclear. Can we use bitwise? Can we multiply?
    %%A: just try all the integers (i) one by one until a == b+i
  • Q (obscure): what operators can’t be overloaded?
    %%A: q(?:) correct
    AA: address-of CAN be overloaded!
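For the Good-number question in the list above, a brute-force sketch following the interviewer's hint. It returns the Nth Good number instead of printing it, and nthGood is a name I made up. It is not necessarily what the linked solution does.

```cpp
#include <cassert>
#include <initializer_list>

// A natural number is Good if its only prime factors are 3, 5 and 7.
bool isGood(int k) {
    if (k < 1) return false;
    for (int p : {3, 5, 7}) {
        while (k % p == 0) k /= p; // strip every factor of p
    }
    return k == 1; // anything left over is a foreign prime factor
}

// Returns the Nth (1-based) Good number by scanning the natural numbers.
int nthGood(int n) {
    assert(n >= 1);
    int k = 0;
    for (int found = 0; found < n;) {
        ++k;
        if (isGood(k)) ++found;
    }
    return k;
}
```

The scan is simple but slow for large N. A faster alternative merges the three streams 3x, 5x and 7x of earlier Good numbers.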

–Mikhail the mgr

  • Q: inserting 1,000,000 items into a list vs a vector without reserve()
    A: vector wins
  • Q: do you define your exception classes?
  • Q: in what context is a deque completely unusable whereas vector is perfectly fine?
    A: a C function taking an array (i.e. a ptr + a count). vector::data() is designed for this. Pre-c++11, we can pass &v[0] and v.size()
    A: if the code takes one element and uses pointer increment to access other elements. Deque would fail at segment boundaries.
  • Q89 (coding question): given a vector of T [passed in by const ref], write a template function to return [by const reference] the minimum element.
  • Q89b: how do you handle an empty vector?
  • ? Q89c (obscure): given q(vector const & vec), what can you say about q{auto it = vec.begin()} versus q{vector::const_iterator it=vec.begin()}
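A possible answer sketch for Q89, Q89b and Q89c. Throwing on an empty vector is just one policy I picked; returning by const reference leaves no natural "nothing" value. The function name is mine.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>
#include <utility>
#include <vector>

// Q89: returns a const reference to the minimum element, no copy of T made.
// Q89b: an empty vector has no element to refer to, so this sketch throws.
template <typename T>
const T& minElement(const std::vector<T>& vec) {
    if (vec.empty()) throw std::invalid_argument("minElement: empty vector");
    const T* best = &vec.front();
    for (const T& item : vec) {
        if (item < *best) best = &item; // requires only operator< on T
    }
    return *best;
}

// Q89c: on a const vector, begin() already returns const_iterator, so
// "auto it = vec.begin()" deduces the same type as the explicit spelling.
static_assert(std::is_same<decltype(std::declval<const std::vector<int>&>().begin()),
                           std::vector<int>::const_iterator>::value,
              "auto deduces const_iterator when vec is const");
```

On ties, the first minimum is returned, because only a strictly smaller item replaces the current best.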

[18]top 4 IV(!! GTD)domains 2 provide 20Y job security


Let’s ignore zbs or GTD or biz domains like mktData/risk here …

  • –roughly ranked by value-to-me
  • [c s] java? resilient in the face of c# and dynamic languages. At least 10Y relevance.
  • [c s] c++? resilient in the face of java. Time-honored like SQL
  • [c] abstract algorithm and data structures, comp science problem solving
  • [c n] tcp/udp optimization + other hardware/kernel/compiler optimizations
  • ……….No more [c]
  • py + shell scripting? no [c] rating since depth unappreciated
  • Linux and windows? at least 10Y growth, but no [c]
  • [s] SQL? resilient in the face of noSQL, but no [c]
  • bond math?
  • [n s] FIX? At least 10Y relevance
  • [c=high complexity in IV; shelf-life; depth appreciated …]
  • [n=niche, but resilient]
  • [s=survived serious challenges]

convert a recursive algo to iterative #inOrderWalk

Suppose you have just one function being called recursively. (2-function scenario is similar.) Say it has 5 parameters. Create a struct named FRAME (having 5 fields + possibly a field for lineNo/instructionPointer.)

Maintain a stack holding the Frame instances. Each time the recursive algorithm adds to the call stack, we add to our stack too.

Wiki page on inorder tree walk  has very concise recursive/iterative algos. https://github.com/tiger40490/repo1/blob/py1/py/tree/iterative_InOrderWalk.py is my own attempt that’s not so simple. Some lessons:

  • Differentiate between popping vs peeking the top.
  • For a given node, popping and printing generally happen at different times without any clear pattern.
    • the sequence of pop() is probably a pre-order tree walk
    • the sequence of print is an in-order tree walk

retrans biz logic #if I were architect

Many mkt data interviewers ask me high-level biz logic question about retrans, without implementation details.

Q: after you detect a gap, what does your parser do?
A (Deepak): parser saves the gap and moves on. After a configured timeout, parser sends out the retrans request. Parser monitors messages on both Line A and B.

Q: if you go on without halting the parser, then how would the orderbook engine cope?
A: if we are missing the addOrder, then rebus could warehouse all subsequent messages about unknown order IDs. Ditto for a Level 1 trade msg. Deepak felt this warehouse could build up quickly since the permanent + active gaps could contain tens of thousands of missing sequence numbers. I feel orderId values are increasing and never reused within a day, so we can check if an “unknown” orderId is very low and immediately discard it, assuming the addOrder is permanently lost.

A: if we are missing an order cancel (or trade cancel), i.e. the last event in the life cycle, then we don’t need to do anything special. When the Out-of-sequence message shows up, we just apply it to our internal state and send it to downstream with the OOS flag. If an order cancel is lost permanently, we could get a crossed order book. After a few refreshes (15min interval), system would discard stale orders sitting atop a crossed book.

In general, crossed book can be fixed via the snapshot feed.

A: If we are missing some intermediate msg like a partial fill, then we won’t notice it. I think we just proceed. The impact is smaller than in FIX.

OOS messages are often processed at the next refresh time.

SIG phone round

See retrans questions on retrans biz logic if I were CTO

Q: how do you extract the seq number from packet header?
A: reinterpret_cast to a pktHeaderStruct ptr, then read the field

Q: is reinterpret_cast safe?
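On the safety question: reading through a reinterpret_cast'ed struct pointer relies on the buffer being suitably aligned and on the struct layout matching the wire format, and formally it can run afoul of the aliasing rules. A memcpy into a local struct avoids the alignment and aliasing concerns. The header layout below is purely hypothetical, and the sketch assumes wire byte order equals host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical packet header layout, for illustration only.
struct PktHeader {
    std::uint16_t msgLen;
    std::uint16_t msgType;
    std::uint32_t seqNum;
};
static_assert(sizeof(PktHeader) == 8, "sketch assumes no padding in the header");

// Copies the header bytes into a properly aligned local object, then reads
// the field. ASSUMPTION: buf holds at least sizeof(PktHeader) bytes.
std::uint32_t seqNumOf(const unsigned char* buf) {
    PktHeader h;
    std::memcpy(&h, buf, sizeof h);
    return h.seqNum;
}
```

Compilers typically turn a small fixed-size memcpy like this into plain loads, so the safer form need not cost more.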

Q: why is RBTree sometimes faster than hashtable for a small collection? See unordered_map^map performance: unpredictable

Q4: speed of std::sort() on a vector vs list::sort
%%A: probably vector due to cache efficiency
A: also the stability of list::sort comes at a cost

Q4b: how are the two sorting implemented?
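%%A: possibly two variations of quicksort
AA: in hindsight, std::sort is typically introsort (quicksort with a heapsort fallback, plus insertion sort on small ranges), while list::sort is typically a merge sort that relinks nodes.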

Q: std::sort() vs qsort() in C. Same algorithm, but which is faster?
%%A: std::sort, due to functor inlining
AA: Item 46 in [[effectiveSTL]]
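The inlining argument can be seen in the two call shapes. qsort() receives a function pointer and calls through it for every comparison, whereas std::sort() is instantiated on the functor type, so the compiler can inline operator(). The helper names below are made up. This sketch only shows the two interfaces and measures nothing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// qsort comparator: invoked through a function pointer on each comparison.
int cmpInt(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y); // avoids the overflow risk of x - y
}

// Functor for std::sort: its type is a template argument, so the call to
// operator() is visible to the optimizer and can be inlined.
struct Less {
    bool operator()(int x, int y) const { return x < y; }
};

std::vector<int> viaQsort(std::vector<int> v) {
    if (!v.empty()) std::qsort(v.data(), v.size(), sizeof(int), cmpInt);
    return v;
}

std::vector<int> viaStdSort(std::vector<int> v) {
    std::sort(v.begin(), v.end(), Less());
    return v;
}
```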

Q: given a csv file with a million rows, you use c/java/c#/py to parse the 2nd column into numbers, sum them up and produce a single output number. Which language is fastest?
%%A: main cost is file read, but let’s suppose entire file is loaded into memory.
%%A: I feel python bytecode (even optimized) will be slightly slower. Java (due to JIT) and C are likely faster.
A: now in hindsight I feel integer parsing will cost much more than arithmetic.

move() ^ pointer-casts: mostly compile-time

See https://stackoverflow.com/questions/27309604/do-constant-and-reinterpret-cast-happen-at-compile-time/27309763

  • dynamic_cast incurs runtime cost.
  • static_cast, const_cast, reinterpret_cast are compile-time
  • std::move() is compile time. An unconditional cast, according to Scott Meyers.

That’s for pointers or references.

Nonref variable cast is uncommon, unfamiliar and pretty much unnecessary, except for numeric types.  static_cast<std::string>("Hello") ends up calling std::string constructor.
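A small sketch to back up the points above. The static_assert shows std::move() is only a cast to an rvalue reference, and the last function shows the non-reference cast that really constructs an object at run time. The function names are made up.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move(lvalue) yields T&&: a change of value category decided at
// compile time, with no data movement by itself.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move only changes the value category");

// Binding the result of std::move to a reference moves nothing; the source
// string is untouched until a move ctor or move assignment consumes it.
std::string castOnly(std::string& s) {
    std::string&& r = std::move(s);
    (void)r;
    return s; // copies the still-intact source
}

// Non-reference cast: this one does run-time work by calling the
// std::string constructor on the string literal.
std::string nonRefCast() {
    return static_cast<std::string>("Hello");
}
```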

unexpected longevity@FOSS

Conclusion: my tech bets and investment in many FOSS technologies proved to be correct. In contrast, only a few of my tech bets on commercial software proved correct: MSVS, Oracle, Sybase, Excel+VBA.

I didn’t want to spend too much effort analyzing the forces around FOSS, but to my surprise, those forces keep growing and evolving.

  • Eg: Microsoft has to contend with Linux, Java, Apache
  • Eg: Oracle has to keep developing OpenSolaris, and MySQL
  • Eg: IBM, Oracle … have to support Linux
  • Eg: SUN, HP-UX all lost the battle against Linux. SUN has no choice but OpenSolaris
  • Most of them have to face the stiff challenge by a single FOSS — GNU/Linux

Because a FOSS needs no revenue and no payroll to stay alive, there’s no survival risk or financial uncertainty in a FOSS project. Therefore, a FOSS often has better longevity.

Some of the most influential, dominant, enduring and low-churn software packages are FOSS and are unlikely to change:

  1. linux, BSD-unix
  2. java and GCC
  3. python, perl, and most scripting languages
  4. most development tools in *nix
  5. many javascript frameworks
  6. many browsers

Q: what forces power the FOSS and provide the energy, momentum?
A: alpha-geeks who want to create an impact and legacy?

Apparently, you need just one (or a few) alpha-geek to create a formidable competitor to a software vendor’s army of developers.

binary data ] FIX

It’s possible to “embed” arbitrary binary data in a FIX tag-value pair. However, the parser has to know where it ends. Therefore, the “value” consists of a length and a binary payload.
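As one concrete illustration, standard FIX pairs RawDataLength (tag 95) with RawData (tag 96). The sketch below encodes and decodes just that two-field fragment, not a full FIX message. The helper names are mine. Note the payload may itself contain the SOH delimiter, which is exactly why the length must come first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

const char SOH = '\x01'; // FIX field delimiter

// Emits "95=<len><SOH>96=<payload><SOH>".
std::string encodeRawData(const std::string& payload) {
    return "95=" + std::to_string(payload.size()) + SOH + "96=" + payload + SOH;
}

// Decodes a fragment shaped like the output of encodeRawData().
// Returns false on any malformed input.
bool decodeRawData(const std::string& msg, std::string& payloadOut) {
    if (msg.compare(0, 3, "95=") != 0) return false;
    const std::size_t soh = msg.find(SOH, 3);
    if (soh == std::string::npos || soh == 3) return false; // no length digits
    std::size_t len = 0;
    for (std::size_t i = 3; i < soh; ++i) {
        if (msg[i] < '0' || msg[i] > '9') return false;
        len = len * 10 + static_cast<std::size_t>(msg[i] - '0');
        if (len > msg.size()) return false; // cannot fit; also guards overflow
    }
    if (msg.compare(soh + 1, 3, "96=") != 0) return false;
    const std::size_t dataStart = soh + 1 + 3; // skip SOH and "96="
    if (dataStart + len >= msg.size()) return false;
    if (msg[dataStart + len] != SOH) return false; // trailing delimiter expected
    payloadOut = msg.substr(dataStart, len);       // length-driven, not delimiter-driven
    return true;
}
```

The decoder never searches for a delimiter inside the payload; it trusts the length and then checks that a SOH follows.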

Q: Any example?

Q: can we send an image?

Q: can we send a document?

Q: Encryption? Need to read more

c++11 throwing dtor: a few technicalities for IV

  • STL containers and std::swap() lose all exception guarantees when a payload class dtor throws
  • In a normal context (without double exception), if a dtor throws, it’s same as new() throwing exception. Stack unwinds as expected, and all subsequent dtors run as expected.
  • in c++11, dtors are implicitly noexcept unless declared noexcept(false) (or a base/member dtor is noexcept(false)). If such a noexcept dtor throws, std::terminate() is called

Q: what if your dtor encounters a critical error but is not allowed to throw an exception?
A: https://www.kolpackov.net/projects/c++/eh/dtor-1.xhtml mentions a pre_destroy() member function that can throw. Dtor would call this function but catch and ignore the exception. Client code can also call this same function but handle the exception intelligently.
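A minimal sketch of that pre_destroy() idea, under my own assumptions. Journal is a made-up class, the failure is simulated with a flag, and the done_ guard (so the dtor doesn't repeat work the client already did) is my addition.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

class Journal { // hypothetical resource-owning class
public:
    explicit Journal(bool failOnFlush) : fail_(failOnFlush) {}

    // May throw. Client code that cares about the error calls this
    // explicitly before the object goes out of scope.
    void pre_destroy() {
        if (done_) return;
        done_ = true; // never retry, even if the flush below fails
        if (fail_) throw std::runtime_error("flush failed");
    }

    // Implicitly noexcept in c++11, so nothing may escape from here.
    ~Journal() {
        try {
            pre_destroy();
        } catch (...) {
            // swallow: letting it escape would trigger std::terminate()
        }
    }
private:
    bool fail_;
    bool done_ = false;
};

static_assert(std::is_nothrow_destructible<Journal>::value,
              "the dtor is implicitly noexcept");
```

A client that ignores pre_destroy() gets best-effort cleanup; a client that calls it can handle the failure intelligently.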

inline perf can Backfire ! #Google

As illustrated below, without inline, instruction cache system could hit ChangeOfFlow twice as it enters/exits your function aa(). If aa() is actually inlined and embedded in a hostFunc, then the instruction cache system can often load entire hostFunc, eliminating COF. This helps instruction cache, but excessive inlining can increase executable footprint (code bloat).

google c++ guide points out that

  • inline can either increase or decrease (for tiny functions) executable footprint. In general, smaller footprint improves running time due to instruction cache efficiency
  • virtual functions are inlined (i.e. defined in class body) primarily for convenience/maintainability, not performance

See also https://www.eetimes.com/document.asp?doc_id=1275470 and https://www.eetimes.com/document.asp?doc_id=1275474

As an example, consider the function call tree flow in Figure 1. Suppose function F2 is linked near function F1, but function F3 is not linked near F1. When function F1 calls F2, it is possible that F2 is already in the cache and that there will be no cache miss. (The likelihood that F2 is in the cache depends on the sizes of F1 and F2, as well as the location of the call inside F1.) In contrast, there will probably be a cache miss when F1 calls F3. Because F3 is located far away from F1 in memory, it is not likely to be in the cache when F1 calls it.