avoid if()assert


if (…) assert(…)

Instead, put the if-condition into the assertion, which is

  1. more readable since the if-condition is clearly part of an assertion so other people tracing code don’t have one more IF block to check!
  2. more efficient compilation since preprocessor would strip the if-condition in release build

ibank: trade book`against exch and against client

In an ibank equity trading system, every (partial or full) fill seems to be booked twice

  1. as a trade against a client
  2. as a trade against an exchange

I think this is because the ibank, not the client, is a member of the exchange, even though the client is typically a big buy-side like a hedge fund or asset manager.

The booking system must be reconciled with the exchange. The exchange’s booking only shows the ibank as the counterparty (the opposite counterparty is the exchange itself.) Therefore the ibank must record one trade as “ibank vs exchange”

That means the “ibank vs client” trade has to be booked separately.

Q: how about bonds?
A: I believe the ibank is a dealer rather than a broker. Using internal inventory, the ibank can execute a trade against client, without a corresponding trade on ECN.

Q: how about forex?
A: I think there’s less standardization here. No forex ECN “takes the opposite side of every single trade” as exchanges do. Forex is typically a dealer’s market, similar to bonds. However for spot FX, dealers usually maintain low inventory and executes an ECN trade for every client trade. Biggest FX dealers are banks with huge inventory, but still relative small compared to the daily trade volume.

Most c++guys have no insight in STL

When I recall my STL interviews on Wall St, it’s clear that majority of c++ app developers use STL almost like a black box, every year for 5-10 years. Only 1% has the level of insight as Henry Wu.

Reason? such insight is never needed on the job. This factor also explains my advantage in QQ interviews.

  • Analogy — Most of us live in a house without civil engineering insight.

STL uses many TMP and memory techniques. Some may say STL is simple but I doubt they have even scratched surface on any of the advanced topics.

  • Analogy — some may say VWAP execution algo is simple.

Many interview questions drill in on one or two essential STL functionality (which everyone would have used), just below the surface. These questions filter out majority[1] of candidates. Therein I see an opportunity — You can pick one or two topics you like, and grow an edge and a halo, just as Henry did. Intellectual curiosity vs intellectual laziness. 不求甚解,不加咀嚼,囫囵吞枣 — I see it in many self-taught developers. That’s exactly what these Wall St interview questions are designed to identify and screen out.

How about west coast and other high-end tech interviews? I’m not sure.

[1] sometimes 90%, sometimes 60%.


mgr|stress| ownership #heavy

I feel some people are good at (or trained for) juggling big responsibilities that affect other people as “They depend on my system as a service provider”. I’m not so good at it so I feel these responsibilities are very heavy, and a type of cancer-stress.

Someone like Josh, Larry … need to be on the ball and keep track of the large number of (big or small [1]) changes affecting “his baby”.

In contrast, as a foot soldier I only have obligation (professional responsibility) to keep an eye on my project or my module. Some foot soldiers are eager to learn, but usually her scope of responsibility is much smaller.

Stay on top of “everything” — as a parent I have to be on top of everything about my kids and my house.

LocalSys — If I don’t understand one of many things in and outside my team’s applications, I don’t need to struggle to find out and clarify my understanding.

[1] it’s not always straightforward to determine if a change is big or small. The app owner need to quickly estimate the impacts of each change. Sometimes the estimate is inaccurate. There’s a bit of risk.

## identify your superior-absorbency[def#3]domains #QQ,fasting

  1. When in sky-high absorbency, you should try and take on the _really_tough_.
  2. When in “good” absorbency, you should take on the medium tough jobs.

Before goggles, swimmers compete by increasing the amount of practice, which was limited by the capacity of their eyes to endure the “abuse” — I call it “capacity”, and sometimes “absorbency”, as defined in two blogposts ##absorbency[def#2]worn_out by endeavors #coding and My absorbency[def#1]

abstinence^hard-driving is another blogpost.

Myself as an illustration — Compared to my peers, I have superior Long-term absorbency/capacity for these *specific* challenges:

  • best eg: jogging, but not stretching. I have much higher capacity to cope with jogging then stretching. I don’t really need to capture the motivation for jogging.
  • I have higher capacity for push-up practice compared to chin-up practice.
  • eating raw veg (and fruits) but not celery or raw carrot
  • delaying meals but not reducing dinner

These long-term absorbency advantages are truly life-enhancing, to put it mildly…

At times, in these same domains I find my absorbency capacities falling very low. Completely normal. It’s a human condition.

— eg: xx theoretical QQ study : I have an absorbency advantage, more than in coding drill.

These are big, real competitions.

For many peers, when they find the fleeting motivation to study QQ, they should capture it. I don’t have to.

Ken Li said something like … If you enjoy tech and esp. dev work, then you have job security in tech sector till old age — I would say well into 60’s. I think he said that because there are so many tech jobs in U.S. not so demanding, where my absorbency advantage alone is more than sufficient to sustain a long career.

However, if the learning is unrelated to IV then my absorbency falls towards to absolute zero.

— eg: xx java QQ more than c++ QQ

I have much easier learning journey on java. More satisfying, more Aha. I feel more in control, and less “lost”. I see more logical connections. The fundamental designs make more sense to me.

##lockfree queue implementations c++J

I think a linked queue with 2 pointers (head / tail) could be relatively easy to implement by myself.

%%approaches to classic generator problemS

* p2u in next_perm

* insert at each position of each array, in a growing collection to create new arrays for the collection. Optionally, all arrays are immutable.

Actually, only two approaches so far, though each of them can be adapted in many ways.

Note many (simple) classic generator algos are required in coding tests or (practical) algo problems

[19] sorting J integers, each in [1,N]

Q: What’s the time complexity of sorting J integer scores which are all [1, N], possibly non-unique?

This is classic counting sort. My book [[algo in a nutshell]] incorrectly says O(J). Incorrect when N is large.

My analysis below uses a total of three dimensions N, J and K, where N and J can be vastly different, and K <=min(N, J), but K could be much smaller.


Textbook counting sort is listed at the bottom. If N is smaller than J, then O(N+J) is dominated by O(J). Now suppose N is huge (like the revenue figure of a company).

Suppose there are K distinct scores among the J scores. I would want a constant-time translation from each distinct score to a distinct counter. I wish there’s a perfect hashing from the K scores to K buckets but I think it’s impossible since the K distinct values are not known in advance. Even with imperfect hashing, I can get O(1) translation from each score value to a distinct counter. I will iterate over the J scores. For each,

  • look up (on hash table) the corresponding counter.
  • If the score is new i.e. missing from the hash table,
    • then create a counter
    • add the “pair” {score -> counter} into the hash table.
    • Also insert the score into a min-heap
  • increment the counter

O(K logK) : Now pop the min-heap K times, each time to get a distinct score in ascending order. Lookup the score in hash table to get the counter. If counter for score 55 the count is 3, then output 55 three times. This output would be a sorted sequence of the original J scores.

— comparing with alternatives: Mine is O(J + K logK). If K*logK exceeds N, then I would fall back to standard counting sort.

The comparison-based sort would be O(J logJ), inferior to mine if K is much smaller than J.

The textbook counting sort would be O(J + N) , using N independent counters.

##coreJava QQ topics: stand-out from crowd

Q: what core java QQ topics (not necessarily skills) to make you stand out from the crowd?

  1. (Best example) low-level core threading features — higher market value than ecosystem features like concurrency utilities, which are so numerous that most candidates are not expected to know in-depth
  2. jGC
  3. latency tuning — including JIT
  4. collections internals

— second tier QQ topics

  • reflection, AOP, bytecode engineering? never popular
  • advanced generics? out of fashion
  • class loader internals? out of fashion
  • JNI? never popular
  • jvm internals? never popular and never needed on any project

topK,K-th ranked within a bag: quickselect

For any top-K problem, and for any problems about K-th ranked item, a Fairly efficient solution is the quickselect algo, in O(N). O(N) is the theoretical lower bound since you must examine each of N element.

For top-K, we can use quickselect to partition the array and the upper section would be top-K. This is O(N). In addition, I can think of two O(N logK) solutions.

Keep the container size at up-to-K. For each new item encountered, compare to the minimum in the container. IFF bigger, then remove the minimum then insert new-comer.


insert is O(1) on average but remove() is O(logK)

“If you are inserting a single node for every one you remove, then you lose the advantage of the asymptotic O(1) average insert that heaps provide as the delete would dominate, and you might as well use a BST.” according to https://stackoverflow.com/questions/6147242/heap-vs-binary-search-tree-bst


Each remove() and insert() is logK

##order states open2cancel/replace

Each OMS (order management system) in an OMS chain [2] remembers the state of each order [1]. Let’s focus on OMS4. When OMS4 receives an order-cancel/replace request message from upstream, will OMS4 accept the request and send the message downstream? Depends on the order state (of the order8) in question.

I believe if the order has been already rejected or fully filled or cancelled then not possible. OMS4 would reject the request.

What if order8 is in New state? That means acknowledged by liquidity venue but no fill received yet. Yes request is acceptable.

What if order8 is in PendingNew? That means OMS4 has not received the acknowledgement from downstream. meaning the liquidity venue has yet to enqueue the order. Well, in this case cancel (or replace) request will be forwarded to liquidity venue. OMS4 can optionally send back a PendingCancel or PendingReplace to indicate the current order status. In time, liquidity venue will either accept the request, or reject it because “already filled”, and send this response to OMS4.

[1] in non-volatile memory, and can respond to status queries and can often broadcast the order status to subscribers.

[2] A large trading platform often has a chain consisting of multiple OMS nodes, such as clientConnectivity, smartOrderRouter, AlgoEngine. I feel should not be more than 5 nodes, unless some aggregation is required.

## nothing to be proud of ] %% life@@

See also [2] ##4 qualities I admire in%%peers #!!status and
[1] ##envy Them constantly; !!celebrat`family,achievements..

Once a while in a young person’s life she(he) find herself (rather than her life) despicable in every way —

  • studies
  • dating
  • body image
  • fitness
  • team sports, not part of any “gang”
  • ….When we grow older we have some of these same “whiplashes” + some new whiplashes, which we use to beat ourselves up:
  • profession
  • home location, home size, school district
  • personal investments including properties
  • monthly saving rate
  • plenty of life-chances available to me throughout my life

I think in such a situation we desperately need concrete, tangible evidence of personal achievement (not necessarily visible to other people). Here are some evidence to congratulate myself and defend myself against the whipping:

  • wife and kids — see [1]
  • SGP citizenship — see [1]
  • education credentials — see [1]
  • personal finance — see [1]
  • profession — see the first list in [1]
  • knowledge + experience about diet and wellness — see [2]
  • introspective or expressive writing and blogging — as a versatile coping, self-improvement, self-discovery device
    • English proficiency — I struggled for years and now I’m able to use English for expressive writing — something unimaginable in my teenage years.

## time-honored textual data file formats

[[art of Unix programming]] has insights on it… Note Windows tend to use binary file formats, whereas the Unix tradition favors textual.

Standard cvs format is not well designed. Its use of double-quote is criticized.

The /etc/passwd format supports metachar embedding better than standard csv.

The “record-jar” format combines cookie-jar with RFC-822 format and is more human-friendly than xml, ini, and cvs. It has field names. Its values can be unstructured text. It also supports “stanza” records.

xml can be checked for well-formed even without a DTD or schema.

2 nodes] binTree-with-cycle: locate common ancestor

Q (Leetcode #236): given 2 valid nodes (and root) of a binary tree, find the lowest common ancestor. A node can be a (direct/indirect) descendant of itself. All values distinct. No uplink.

classic problem:)

Q2 (my own requirement): what if cycle is possible?

My idea — Just run a lazy-dft to find the two paths-from-root. On each path, if we detect a cycle we terminate that path. Before terminating any path, need to check if we hit both nodes, so after finding one node we must go all the way to the leaf node or the one of the 2 given node.

As soon as we find the 2 paths we terminate DFT.

IIF two CPUs are given, my dft will use two threads — one left to right; the other right to left. This will more quickly locate the 2 target nodes if they appear near extremities.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/binTree/commonAncestor_Cycle.cpp is my self-tested code, not tested on Leetcode

Q: what I want{cod`drill beside muscle-build`#Rahul

Hi XR,

I had a brief chat with my young, talented Indian colleague (not the talented intern). He pointed out that he is making progress with his Leetcode real-code drill better than 2 years ago in his Master’s program, because at that time he was in a (job-hunting) hurry and under real pressure. I can identify with him — at present, both he and I feel low or no pressure to improve coding skill so our brains work more efficiently. When I made this point he immediately agreed.

Over-stress inhibits brain capacity; Insufficient stress reduces energy level in the brain.

This colleague always focus on passing Leetcode tests.

You asked me this question —

Q: so exactly what do you want from your coding drill? I won’t answer for my colleague, but my answer is related to him nevertheless.

Indeed if my goal was to pass coding interviews, then my real-code practice is not efficient and overkill. So why am I spending 20% to 33% of my coding drill effort on real-coding rather than reading key ideas… what’s the real reason? I described a fundamental reason in
https://bintanvictor.wordpress.com/2018/08/04/why-our-coding-drills-are-different-fundamental-reason/ but today I will shift focus.

  • A1: I want enjoyment, positive feedback, self-esteem, feeling good about myself …
  • A2: I want some sense of achievement — when you read posted solutions you probably feel you learned something and achieved something. I feel more achievement when I come up with my own solutions, even if not optimal nor elegant.
  • A3: I want visible progress— When you read posted solutions to three common questions, clearly you feel progress. That’s why you aim to study 800 questions. I’m different. I don’t feel significant progress reading posted solutions.
  • A4: I want self-mastery — overcoming many obstacles, similar to yoga. I want to regain control of my spare time and follow some self-directed course.
  • A9: I want to build my mileage — as in driving. Mileage means how many productive hours I have spent on coding problems since age 20. A1/A2/A3 above all help keep me focused on doing more problems. Even with all the joy, achievement and progress, it’s still very easy to give up or lose steam.

rangeAND: bitwise AND of all ints in range

Q: Given a continuous range [m, n] where 0 <= m <= n <= 2147483647, return the bitwise AND of all numbers in this range, inclusive.

  • the lowest bit will most likely see 0 and 1 so … becomes zero
  • (turns out to be a minor tip) if the range has size > 8 then lowest 3 bits all zeroed out
  • imagine the bit array object incrementing from m to n. We want to find out if there’s a stable higher portion
  • Key technique — look at some examples. We can convert m and n to two bit images. we can look at some examples below.
  • we can compare the two bit images left to right to find the “higher portion”. All lower bits are probably zeroed out in the result


–my solution not tested on Leetcode: https://github.com/tiger40490/repo1/blob/cpp1/cpp/rangeAnd.cpp
* compare m and n left to right. If m is shorter, then return 0.
* if same length, then compare left to right until a difference is found. Until that bit, all left-end bits are “retained”.


[13] U.S. vs Singapore economies

I used to feel Singapore economy is less stable than North America, Japan and Western Europe. Even a small European economy like Belgium or Finland is in a safer position than Singapore because it’s a stable, integral part of a large, strong ecosystem. These western economies have experienced relatively low or negative growth, persistent unemployment, high national debt, aging population, diminishing manufacturing competitiveness under competition from emerging economies, …. but "strong" means they can withstand bigger adversities longer. Strong means economic competitive advantage over the developing countries.

Now I see signs of U.S. economic status in decline. However, positive signs are still visible — in high tech, new/old media, life science, aerospace, innovation, advanced R&D, energy, material science…. I feel the next 10 years may see further decline in USD and U.S. GDP relative to rest of the world.

In contrast, over 30 years, I feel S’pore is still too small to be strong. S’pore government has to make all the right decisions to avoid the many potential disasters (i won’t list them). Being small, I feel S’pore would suffer quite badly from even a mild misstep. (Sorry no specific examples.)

For big or small economies, every country needs to stay competitive and stay relevant, and identify new growth sectors every 20 years. Just witness the current decline of Nokia and BlackBerry. Nortel is another high-tech example — once the biggest company in Canada. Creative Technology, DEC too. I guess AOL is another cautionary tale of staying relevant. The software industry is even more unforgiving and littered with even more dinosaurs compared to hardware or telecom industries.

Perhaps Singapore would fall on a similar path, but there’s opportunity of renewal…

For a big country like the U.S., renewal seems inevitable. Is there a major sector in U.S. economy that may decline and never recover again? Maybe the railway, steel and garment industries of the past?

LDLibPath ^ classpath

See also compile-time ^ run-time linking

Q: when I run the a.out, where does the runtime linker search for those *.so files?

Now I know the a.out file remembers/stores the dependency *.so file locations. Suppose a.out remembers pspc.so was loaded from ~/a/b/c. At runtime, the linker will try ~/a/b/c/pspc.so  but likely fail… completely normal.

https://docs.oracle.com/cd/E19683-01/816-1386/chapter3-10898/index.html clearly explains that *.so file locations can be saved in a.out as a ‘runpath’. The 2nd-best solution is the ldLIbPath env var.

https://superuser.com/questions/192573/how-do-you-specify-the-location-of-libraries-to-a-binary-linux lead me to the -R and -rpath linker-options.

https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_3.html describes q[ -rpath ] linker option

A c++ executable like a.out using dynamic lib is comparable a java main class using a bunch of jar files.

[17]%%agile advantages: b aggressive

  • healthy job market
  • advantage: short commute. Job not too demanding. Plenty of time available for self-study
  • advantage: I can work day and night to get things done
  • advantage: I’m not aiming for a promotion, so I can try many interviews
  • advantage: my quant credentials and zbs is probably top tier among techies
  • advantage: domain nlg
  • advantage: health and stamina
  • advantage? data analysis aptitude
  • advantage: I have Singapore + java as safety net –> risk capacity.
  • advantage: am open to relocation
  • advantage: am open to short term contracts
  • advantage: am open to pay cuts
  • advantage: no family.
  • advantage: c++/c# in addition to java
  • advantage: above-average SQL, Unix, python/perl/bash

If I hit $220k as a contractor, my self-image and self-esteem would improve. I would feel confident, no longer inferior. In fact, I would feel better than the managers since I don’t rely on that one employer.

low-latency jobs; learning python#CSY

I believe my current trading system is latency-sensitive (as a large-scale [1] order management engine), but on this job I don’t do anything about latency — this is just another equity trading system job. I feel you don’t need to aim for low latency. Just aim for any c++ job that’s not off-putting to you.

[1] I consider it large scale because it has probably 10-20 developers working on it, for at least 5 years.

Low-latency jobs are harder to get, and they usually prefer developers in their 30’s or younger.

I don’t exactly avoid such jobs, but I don’t hold up any serious expectation of an offer by those teams. The main barrier of entry is technical capabilities, either coding test or obscure c++/linux questions. Even if I’m lucky to hit familiar, well-prepared tech questions, they may still find reasons to take a pass, as I experienced in my technical wins at SIG, Bloomberg etc.

Java and python are never show-stoppers for these jobs, so
* if you only aim for low-latency jobs, then don’t worry about python or java
* if you widen your search scope, then I suggest you pick up python, not java

Taking a step back to get a perspective, I think job seekers are like smartphone makers — we need to adjust our “product” for the changing customer taste. The “product” in this case is our testable tech skills. Domain knowledge is losing importance; Python is now in-demand; coding tests are growing harder; Linux/compiler system knowledge has always been important to low-latency interviews .. so we need to decide how to adjust our “product” to attract our customers.

—- Earlier email —-

How is your python practice?

This is a personal journey. Therefore some people don’t like to discuss in details how they are learning something new. So feel free to disclose any amount of information you feel comfortable with.

I advocated solving coding problems in python. I now realize both you and me don’t want to solve every problem. Out of 10 problems, we might solve 1 or 2 in real code. So in the past 4 weeks, perhaps you didn’t look at 10 problems so the number of problems you could solve in python might be very low. Therefore, my suggestion may not work for you.

In that case, I wonder what python coding experiments you prefer.

I once said python is easier to learn, but like learning any new language, it still demands a huge commitment, sustained focus, and personal sacrifice. Therefore, it helps greatly if there’s a python project on a day job. Without it, we need self-discipline, determination, and clear targets to sustain the focus and energy. None of these is really “easy”.

Talking about clear targets, one example is “solve one coding problem of medium complexity each week, in python”.

## linux kernel concurrency primitives #to elaborate

I feel the concurrency primitives in kernel are conceptually similar to those in thread libraries but fundamentally different in implementation. This topic is deep and time consuming so I will only memorize some jargons for show-off

I should read the OReilly Linux book + Josh’s book

  • spinlock — addressed in both books
  • read-write spinlock
  • binary and counting semaphores
  • read-write semaphores
  • RCU?


Forced2branch out@ java by age70@@

LotusNotes, FoxPro, Php, Perl-CGI programmers … had to branch out and take up new aggressive, colonizing technologies. Their massive T-investment (mostly GTD) in the old technologies has lost value.

Q: will I need to branch out of java till age 65@@

%%A: I don’t know.

In 2012 I thought I had to embrace the new technology of c# but now in 2019 I feel java dominance has grown, rather than eclipsed by c#

If I have to branch out to take up another technology, that colonizer will NOT be c++ or c#. I won’t speculate on other challengers.

anon classes^lambda: java perf #class file loading

Based on [[JavaPerm]] P381

  • An anon class requires an actual *.class file created by javac compiler and loaded from serialized form (usually disk).
  • No such class file for a lambda.

This difference has very limited performance impact, as of java 8. However, More than half the interview questions are about fancy theoretical knowledge, so this is knowledge valuable to interviews.


windows registry: criticized

[[art of unix programming]] points out that Windows registry are sometimes corrupted. ( Remember the standard advice to always back up the registry before editing it? ) When corrupted, the registry “frequently” breaks the entire system and requires a complete reinstall, as the book claims.

  • weakness — the registry is a centralized hierarchical config data store shared in RW mode by all (dozens of) applications.
  • weakness — single point of failure

In contrast, Unix config data is decentralized, but I won’t elaborate.

Centralized hierarchical configuration sounds plausible but isn’t reliable in practice.

Hadoop apps: Is java preferred@@

I didn’t hear about any negative experience with any other languages, I would assume yes java is preferred, and the most proven choice. If you go with the most popular “combination” then you can find the thriving ecosystem — most resources online and the widest support tools.

–According to one website:

Hadoop itself is written in Java, with some C-written components. The Big Data solutions are scalable and can be created in any language that you prefer. Depending on your preferences, advantages, and disadvantages presented above, you can use any language you want.

bigData: java’s strengths #^python

Although a lot of specialists argue in favor of Python, Java is also required for data analytics.

I-banks actually prefer java for building enterprise systems. Many Big Data systems are developed in Java or created to run on JVM. The stack may include the following tools:

  1. Spark is used to stream data and distribute batch.
  2. Kafka – to queue huge volumes of information.
  3. Spring Boot – to provide system’s options to the customers via REST API.

data vs information — I feel for high volume, high reliability, low-level “data” handling, java (and C++) are more suitable. For high-level “information” analysis, python and R are more suitable. However, in reality m /feel/ might be wrong. Python might have big frameworks comparable to java’s.


[18]took calculated career-risks over last5Y

I took calculated risks not only in my younger years (like before marriage), but over the last 5 years too. I’m not conservative. I am more aggressive and out-of-the-box thinker than vast majority of my peers (Though there is always someone more daring, I don’t need to compare with them.)

It’s interesting to note that none of those once-peers-now-high-flyers is an entrepreneur or a big risk taker trying out new things. I feel most (if not all) of them move up the corporate ladder not by extraordinary performance but by relationship, boss management…

I would say Shuo and my sis took big risks with uncertainties but not at a pay cut.

I have always believed (taught to believe) that big achievements require risk taking among other things. Those risk takers in my circle haven’t made it big yet (except Kenny Zhu?)

Conclusion – probability of risk-taking leading to financial success is much lower than I estimated.

top-of-book imbalance: predictive power

At any time, for a given order book there’s an observable ticking ratio I name as “top-of-book-imbalance” or TOBI := b/(b+a) where

b is total volume at best bid level and
a is total volume at best ask level

For a given stock, whenever TOBI was high, statistically we had seen more market-buy orders in the preceding time period; and conversely when TOBI was low, we had seen more market-sell orders.

Therefore, TOBI has proven predictive power.

370,000 MPS isn’t tough for multicast #CSY

370,000 msg/sec isn’t too high. Typical exchange message size is 100 bits, so we are talking about 37 Mbps, , less than 0.1% of a 100 Gbps network capacity.

My networking mentor CSY and I both believe it’s entirely possible to host 8 independent threads in a single process to handle the 8 independent message channels. Capacity would be 296 Mbps on a single NIC and single PID.

See also mktData parser: multi-threaded]ST mode #Balaji

I feel a more bandwidth-demanding multicast application is video-on-demand where a single server may need to simultaneously stream different videos.

Q: how about world cup final real-time multicast video streaming to millions of subscribers?
%%A: now I think this is not so demanding, because the number of simultaneous video streams is one

77 c++IV paperTigers

Avoid spending too much time on this list…. These c++ topics appeared non-trivial (perhaps daunting, intimidating) for years, until I started cracking the QnA interviews. Then I realized in-depth expertise isn’t required.

  1. TMP
  2. make_shared
  3. std::forward()
  4. fwd declaration
  5. … these are some new items to be sorted…
  6. — real tigers i.e. non-trivial nlg is quizzed
  7. [A] CRPP, SFINAE — real tigers. I got asked about these around 5 times, sometimes in-depth
  8. socket: non-blocking
  9. — Now the paper tigers
  10. open source or commercial instrumentation for memory, dependency instrumentation. See blogpost to Ashish
  11. [s] what debuggers and memory leak detectors are you familiar with?
  12. [a] singleton, factory, pimpl and other design patterns
  13. [s] c++ regex
  14. [s] stream I/O and manipulators
  15. —sockets # many more paper tigers to be listed
  16. udp, multicast, select()
  17. [s] socket buffers
  18. [a] byte alignment
  19. endianness
  20. TCP flow control
  21. TCP handshake and disconnect
  22. ack numbers
  23. partial send
  24. close() vs shutdown()
  25. — STL # many more paper tigers to be listed
  26. [s] STL binders
  27. [s] STL allocators
  28. adapters for containers, iterators and functors
  29. [a] iterator invalidation rules
  30. [s] how is deque different from a vector
  31. RBtree
  32. —concurrency # many more paper tigers to be listed
  33. [A] atomic types
  34. pthread functions
  35. [A] IPC mutex
  36. mutex in shared memory
  37. recursive lock
  38. read-write lock
  39. what if a thread dies while holding a lock?
  40. [s] RAII scoped lock
  41. — multi-file build
  42. forward class declarations (as required in pimpl) and their limitations
  43. [s] C/C++ integration, extern-C — heavily quizzed, but no in-depth
  44. [s] what C features are not supported by c++ compiler
  45. circular dependency between libraries — confusing. real tiger but seldom quizzed
  46. [As] shared lib vs static lib
  47. —integration and data exchange
  48. [A] shared memory
  49. [A] CORBA, RPC
  50. [a] serialization in compact wire format — only primitive data types!
  51. [s] OS system calls vs std library (glibc) — sounds daunting to most developers
  52. —exception
  53. catch by ref or by value or by pointer?
  54. [s] exception guarantees
  55. [s] stack unwinding due to exception
  56. throwing destructors — various questions
  57. —memory
  58. which part of memory do static data members go? How about file scope static variables? How about global variables
  59. [s] preventing heap/stack allocation of my class
  60. [s] custom new/delete,  set_new_handler()
  61. [s] intrusive smart ptr, weak_ptr
  62. [s] ref counting
  63. custom deleter in shared_ptr
  64. [s] reinterpret_cast  # always on pointers
  65. [A] custom allocators
  66. [A] free list in the free store
  67. what if you call delete on a pointer that’s created by array-new?
  68. placement new
  69. out-of-memory in operator-new
  70. —inheritance
  71. dynamic_cast
  72. [A] multiple inheritance
  73. [s] virtual inheritance… which base class ctor gets called first? See https://isocpp.org/wiki/faq/multiple-inheritance#mi-vi-ctor-order
  74. [a] slicing problem
  75. [a] private inheritance
  76. [s] pure virtual
  77. —other low level topics
  78. [s] setjmp, jmp_buf… See the dedicated blog post jmp_buf/setjmp() basics for IV #ANSI-C
  79. [s] cpu cache levels
  80. [s] translation lookaside buffer
  81. [s] what data structures are cache-friendly?
  82. [a] memcpy, memset
  83. [s] ++myItr vs myItr++ how are they implemented differently?
  84. —other language features
  85. [s] RAII
  86. [s] operator-overload
  87. [A] template specialization — part of the STL fabric but largely transparent
  88. [s] ptr to member (function) — seldom used outside library code. I tried the syntax in my binary tree serializer
  89. [A] std::forward() std::move(), rvalue-ref
  90. const and constexp
  91. [a] lambda with capture
  92. [a] double-pointer .. why do you need it?
  93. —-
  94. [s == shallow book knowledge is enough]
  95. [a == actually not that deep, IMHO]
  96. [A == actually deep topic]

7 clusters@HFT-c++IV questions

Every single HFT interview question is about low-latency. Furthermore, the ibank algo-trading interviews revolve around the same clusters.

Even though I’m not aiming for HFT jobs, these topics are still very relevant to ibank and the “3rd type” of c++ shops.

  1. socket — lots of details as Shanyou and I agreed
  2. template meta-programming — deep topic but never quizzed in-depth beyond “tricks”
  3. move-semantics

— Themes are less visible in these clusters:

  1. pthreads and c++ threading but I seldom get c++11 question here
  2. STL container internals, mostly shared_ptr, raw array, vector, RBTree, and hashmap
  3. (back of tricks) memory optimization techniques using allocators, cache-optimization, malloc(), placement-new, object-pool, memset,
  4. miscellaneous core OO features like big4, virtual, MI, pbref/pbval

— other HFT topics are dispersed/scattered, not showing any strong central theme

  1. shared memory
  2. linux system calls
  3. compiler details
  4. selected boost libraries