vector used in hash table bucket

Summary — Same worst case time/space complexity as traditional chaining, but d-cache efficient during lookup, insert and removal.

Background — hash table worst case complexity is due to a bad input (and a weak hash function) leading to heavy collision and a single hot bucket. Load factor will not help as overall hash table size is still small. We end up with linear search in a long linked list. This affects not only lookup and removal. Insert also requires a lookup to prevent duplicate.

— Based on

  • This design makes more efficient use of CPU caching and the translation lookaside buffer(TLB), because slot entries are stored in sequential memory positions.
  • It also dispenses with the next pointers that are required by linked lists, which saves space.
  • Despite frequent array resizing, space overheads incurred by the operating system such as memory fragmentation were found to be small

speed^soft qualities: CIV scoring

If you are very strong, you can achieve both

  • complete the required number of problems in time with no serious bugs
  • demonstrate requirement clarification, edge cases, dry-run, thorough error handling, detailed complexity analysis, coding style.

In reality, some speed coding tests at FB, Indeed etc really favor speed over the other things.

When we take such a test, there is always internal tension between two motivations — speedy completion vs demonstrating other qualities.

Thinking out loud is a common recommendation but it can create unnecessary conversation and slow you down. However, some of these conversations may lead you to the right path.

multicast routing: efficient data duplication

The generic  routing algorithm in multicast has optimization features that offer efficient data duplication.

— optimized message duplication at routing layer (Layer 3?) explains that

Routers duplicate data packets and forward multiple copies wherever the path to recipients diverges. Group membership information is used to calculate optimal branch points i.e. the best routers at which to duplicate the packets to optimize the use of the network.

The multicast distribution tree of receiving hosts holds the route to every recipient that has joined the multicast group, and is optimized so that

  • multicast traffic does not reach networks that do not have any such recipients (unless the network is a transit network on the way to other recipients)
  • duplicate copies of packets are kept to a minimum.

##very few theoretical compSci constructs 4 CIV

Most comp science constructs are too advanced too complicated for a 45-minute coding interview. So reading any comp science book is like my son reading science books not written for his exams. What a candidate need is drills targeted at interviews.” I told friends, based on Leetcode.

A few exceptional interviews (eg: Google) go beyond Leetcode, but still use only simple features of advanced comp science constructs. Here are A few notable comp science constructs to study, mostly advanced data structures

  • [s] trie, suffix array, suffix tree,
  • geometry (is dStruct-heavy domain) —
    • nearest neighbor query;
    • query: which (x,y) points fall within this rectangle?
    • line sweep
  • segment tree
  • topological sort – 2 algos
  • [s] disjoint set
  • relationships among in|pre|post-order binTree walks — these insights are valuable for some Leetcode problems.
  • self-balancing algo in RBTree
  • [s] B-tree
  • [s] shortest path algos like BFT
  • [s=only the simple ones are needed in CIV]
  • —- techniques
  • technique: augmented tree
  • technique: back tracking
  • technique: DP and greedy
  • technique bottom-up DP with memoization

Q: How about coding interview books? Can help. They are not as deep or wide ranging as algorithm books.

Q: how about bigO on common data structures? Yes relevant because of the interviewer’s insistence.


minimize average distance to N cells]matrix #Kyle 60%

Q: given N red cells in a white matrix, find the best cell anywhere in the matrix, whose average distance to the N red cells is minimized. You can imagine N homes deciding to build a waterpoint.

Distance is like how many horizontal or vertical steps.


Insight — vertical vs horizontal dimensions are independent, so this is really a one-dimension problem.

center of gravity?

Insight — rate of change should be zero at the trough…

5 constructs: c++implicit singletons

#1 most implicit singleton in c++ is the ubiquitous “file-scope variable”. Extremely common in my projects.

  • — The constructs below are less implicit as they all use some explicit keyword to highlight the programmer’s intent
  • keyword “extern” — file-scope variable with extern
    • I seldom need it and don’t feel the need to remember the the details.. see other blogposts
  • keyword “static” — file-scope static variables
  • keyword “static” within function body — local static variables — have nice feature of predictable timing of initializaiton
  • keyword “static” within a class declaration —  static field

~~~~~~  The above are the 5 implicit singleton constructs ~~~~~~

Aha — it’s useful to recognize that when a data type is instantiated many many times i.e. non-singleton usage, it is usually part of a collection, or a local (stack) variable.

Sometimes we have the ambiguous situation where we use one of the constructs above, but we instantiate multiple instances of the class. It’s best to document the purpose like “instance1 for …; instance2 for …”

multicast over public internet

Many people say that in general, multicast over public internet is unsupported. I think many retail websites (including the dominant west coast firms) are technically unable to use multicast.

As an end-user, You can multicast across the public Internet to another site by using a tunnel that supports multicast.

As a larger organization, like a video provider or an ISP, it is certainly possible to forward multicast packets across their domain boundary (i.e. across an Internet).

To forward those multicast packets to another ISP, you would need a peering agreement with them and use the Multicast Source Discovery Protocol (MSDP), configured on both ends.

While you won’t propagate your multicast across the global Internet, crossing network boundaries with multicast packets is not impossible.

27 Jul coding drill

— presumably solved:

  1. power of 3
  2. Fizz Buzz
  3. first unique char in a string
  4. longest common prefix
  5. grouping anagram
  6. in-place-remove all dupes from sorted array
  7. getByRank() in sorted matrix #51%
  8. longest descending path through matrix #60%
  9. count lower ints to my right #50%
  10. Accounts merge
— already-solved problems reviewed
  1. Roman to int
  2. generate all loop-free paths between two nodes
  3. append+maxInRangeK #Rahul
  4. O(1)getRandom+add+del on Bag#Rahul
— pending
  1. intersection of two arrays

homemade clustered sparse array

O(1) lookup and Random Access is hard to achieve among sparse array implementation. Here my sparse array is a special subset of sparse arrays — the populated sections come in clusters. For with special subset, I hope Random access is feasible. My design is Fragment table:

  • My fragment table is a std::deque to hold raw pointers to “fragment arrays” not vectors. I choose the word “fragment” since std::deque has its own internal “segment” that is completely irrelevant.
  • elements 0 to 99 —  live in a 100-array, whose address is saved in fragmentTable[0]
  • elements 100 to 199: live in a 100-array, whose address is saved in fragmentTable[1]
  • elements 200 to 299: live in a 100-array, whose address is saved in fragmentTable[2]
  • elements 500 to 599: all missing, so a special ptr is is saved in fragmentTable[5]
  • elements 603 to 605 are present. A special ptr is is saved in fragmentTable[6]. This fragment consists of a small-footprint 3-array. This fragment also has a field this->offset=3
  • Note the fragmentTable is a deque of void pointers. Each void pointer can be a regular plain old array or a special pointer.

— As a bonus, this design supports unlimited append. Here’s a limited prepend() procedure

  1. to prepend a value X in front of the current sparseArr[0], we first allocate a new 100-array as a new fragment whose offset=99. This new fragment supports 99 future prepend() operations, but each time offset needs decrementing.
  2. save X as the last element in the new fragment.
  3. record the new fragment’s address in a new fragmentTable[0]. This is possible because the fragmentTable is a std::deque.

Note I can also allocate a small-footprint new fragment, iFF no further prepend() is allowed.

— Cornerstone of my design — all fragments have Identical range=100 inspired by the std::deque and the ragged 2D array. This enables simple and fast random access. The underlying arrays are the same size with few exceptions — empty fragments, and tiny-n-frozen fragments.

If one of the “tiny fragments” is expected to get populated in the future i.e. not frozen, then it should be allocated as a full array with bigger footprint. The wasted space should not be wasted for long.

Even if this design is not flexible and general-purpose, it meets my design goals so I’m proud of it.

Many implementations use linked fragments, where each fragment has a “next_fragment” pointer. I don’t see any benefit.

I don’t want to look at pure-java implementation as java array has many restrictions.

count lower ints to my right is labelled “hard”.

Q: given an integer array nums , return a new counts array, wherein counts[i] is the number of smaller elements to the right of nums[i]


Order statistics tree  (i.e. an augmented RBTree) should make O(N logN). However, the actual algorithm is not clear to me.

One scan from right. Insert each node into this tree. Before inserting a node of value like 22, we will query the tree getRank(22).

Implementation wise, it’s hard to create a self-balancing BST from scratch. So I think an unbalanced BST might do.

Also, there might be some alternative solutions, like mergesort??

longest descending path through matrix #60%

Q: given an int-matrix that allows 4-way moves,  find longest path with strictly descending nodes. Lateral move disallowed.


I have code to generate all paths… generate loopfree paths: graph node A→B

— Solution 1: O(N logN)
O(N) First pass  to construct “waterflow” graph — visit N nodes. For each, check the four neighbors. Add an outgoing edge (in a Node.edgeList) to a neighbor node if it is strictly lower.

Now put all nodes into a min-heap, according to node height. Can be part of first pass.

Second pass we visit each item in the min-heap. For each node, compute longest descending path starting therefrom. Record this path length as Node.score. Note if a node is surrounded by higher or equal nodes, then it has score 0, as in the lowest node.

A higher node AA is always upstream to some “computed” nodes. (We don’t care about any incoming edges into AA). So pick the max of (up to four) lower neighbors’ scores. Add 1 to get AA’s score. This computation is O(1) but the heap pop() makes second pass O(N logN)

Note the lowest node may not be part of the longest path, if it is surrounded by the highest nodes.

## defaultdict tricks #matrix

  • defaultdict(list) — is one of the most used. I think “list” is a function pointer. demos a few time-saving tricks with defaultdict

  • mydict[key] # without printing or using the return value … would trigger the dictionary to construct a default value for this key IFF missing
  • from key/value pairs, build dict {key -> valueList} shows

  • defaultdict(int) as a frequency counter.
    • mydict[key] += 1 #works even when key was previously missing in mydict
  • defaultdict(lambda: ‘_’).
    • mydict[(row,col)] can implement a matrix. negative (or other invalid) indices results in default value of string ‘_’. I believe dict key can be tuple but not list.
    • In contrast, the 2D array interprets negative indices as wrap-around !
  • defaultdict(lambda: [[],[]]) creates a list of 2 small lists for any “invalid” key. In contrast defaultdict(list) will create a monolithic list which hits runtime error when you access element of the nested list i.e. dict[someKey][0][anyIndex]

RBTree O(1)insert quite common ] coding IV

— for single-insert is precise saying that hinted insert is O(1).
— for mass insert
Special case — If we know the input sequence is pre-sorted and going to be “appended” to existing tree nodes, then each insert will always hit the end(). We can rely on hinted insert to achieve O(N) .  is even more encouraging — ” If N elements are inserted, Nlog(size+N) in general, but linear in size+N if the elements are already sorted” so as long as pre-sorted, then we get O(N)

For 64-bit integer or float inputs, we can radix sort in O(N) then mass-insert in O(N).

— for construction from a pre-sorted sequence confirms it’s O(N) if the input range is already sorted.

getByRank() in sorted matrix: priorityQ^RBTree


recombinant binTree pyramid, where “water” flows east or south.

  • first level has one node .. lowest value. Add it to pq (i.e. priorityQ)
  • pop the pq and insert the two downstream nodes
  • total K pops, each pop is followed by up to 2 inserts

Heap will grow to up to K items, so each pop will be up to logK

Total O(K logK). To achieve this time complexity, we can also use a RBTree. The tree nodes can come from a pre-allocated array.

pre-allocated array as backing store for graph nodes #java No

I think any graph node can use the same technique, but here I present a simple yet interesting use case — a linked list with each node allocated from an array. shows three home-made implementations:

  1. backing array of dummy link nodes, pre-allocated at compile time
  2. backing array of dummy link nodes, pre-allocated from free store aka DMA
  3. backing array is a byte array on heap or data section. Each link node is constructed via placement-new.

Here are a few Advantages that I consider minor because linked list is seldom needed in low-latency

  1. d-cache efficiency
  2. eliminates runtime load on heap allocator, since memory is pre-allocated. See malloc=long considered costly

Advantage #3: For c++ algo questions, this set-up has an interesting advantage — The node address is now an index into the backing array. This index is a natural auto-increment ID , based on creation order.

Now, the biggest advantage of linked list over vector is mid-stream insert/delete. One of the biggest disadvantages is lack of random-access. If nothing happens mid-stream (as in coding questions), then we can achieve random-access-by-id using array as backing store.

If nothing happens mid-stream, then this linked list is physically similar to an array with extra capacity.

This technique won’t work in java because java array of Node is array-of-pointers.

power-of-2 sized hash tables: implications  and explain that java HashMap uses bucket count equal to power-of-two. A few implications

  • the modulo operation is a fast bit-masking i.e. select the lowest N bits: hash & (length-1)
  • an unlucky hashCode() may not offer dispersion among the lower bits. Assuming 16 buckets so only the lowest 4 bits matter, but hashCode() returns predominantly multiples of 8. Then two out of 16 buckets would be white hot. To guard against it, HashMap performs a rehash on the hashCode() output.

I implemented the same feature in

Note rehash is ineffective if hashCode() returns very few distinct values. In java8, there’s one more protection against it — RBTree as alternative to linked list… See RBTree used inside java8 hashmap

RBTree used inside java8 hashmap is a language-neutral discussion. shows an early implementation

they are implemented as a hashed array of binary trees. 
My experience with them is that they are very efficient.

— JEP 180 is the white paper to introduce a self-balanced tree as an alternative to linked list. shows with one million entries in a HashMap a single lookup taken 20 CPU cycles, or less than 10 nanoseconds. Another benchmark test demonstrates O(logN) get(key) in java8, but O(N) in java7, as traditionally known. N being the colliding elements in a single bucket.

If two hashCode() values are different but ended up in the same bucket (due to rehashing and bucketing), one is considered bigger and goes to the right. If hashCodes are identical (as in a contrived hashCode() implementation), HashMap hopes that the keys are Comparable, so that it can establish some order. This is not a requirement of HashMap keys.

If hashCodes are mostly identical (rare) and keys are not comparable, don’t expect any performance improvements in case of heavy hash collisions. Here’s my analysis of this last case:

  • if your Key.equals() is based on address and hashCode() is mostly the same, then the RBTree ordering can and should use address. You won’t be able to look up using a “clone” key.
  • if you have customized Key.hashCode() then you ought to customize equals(), but suppose you don’t implement Comparable, then you are allowed to lookup using a clone key. Since there’s no real ordering among the tree nodes, the only way to look up is running equals() on every node. says

A given bucket contains both Node (linked list) and TreeNode (red-black tree). Oracle decided to use both data structures with the following rules:

  • If for a given index (bucket) in the inner table there are more than 8 nodes, the linked list is transformed into a red black tree
  • If for a given index (bucket) in the inner table there are less than 6 nodes, the tree is transformed into a linked list

With the self-balanced tree replacing the linked list, worst-case lookup, insert and delete are no longer O(N) but O(logN) guaranteed.

This technique, albeit new, is one of the best simple ideas I have ever seen. Why has nobody thought of it earlier?

RBTree: technical notes #Realtime

Red–black trees offer … worst-case guarantees, valuable in real-time applications. The Completely Fair Scheduler used in current Linux kernels and epoll system call implementation[19] uses red–black trees.

This valuable guarantee is valid only on always-balanced trees, but no need to be strictly balanced. In fact, AVL is more rigidly balanced but lacks this guarantee.

In contrast, hash table doesn’t offer worst-case guarantee in the face of hash collision. In fact, Java 8 HashMap uses RBTree in additional to linked list… see my blogpost RBTree used in java hashmap.

After an insert or delete, restoring the red-black properties requires a small number (O(log n) or amortized O(1)) of color changes (which are very quick in practice) and no more than three tree rotations (two for insertion). Although insert and delete operations are complicated logically, their times remain O(log n).

The AVL tree is another structure supporting O(log n) search, insertion, and removal. AVL trees can be colored red-black, thus are a subset of RB trees. Worst-case height is better than the worst-case height of RB trees, so AVL trees are more rigidly balanced. However, Mehlhorn & Sanders (2008) point out: “AVL trees do not support constant amortized deletion costs”, but red-black trees do.[25]

doubly-linked list as AuxDS for BST

I find it a very natural auxDS. Every time the BST gets an insert/delete, this list can be updated easily.

Q: How about the self-adjustment after an insert or delete?
%%A: I think this list is unaffected

With this list, in-order walk becomes really easy. is the first place I saw this simple technique.

CPU run-queue #java perspective

— Mostly based on Charlie Hunt’s [[JavaPerf]] P28

Runtime.availableProcessors() returns the count of virtual processors, or count of hardware threads. This is an important number for CPU tuning, bottleneck analysis.

When a run-queue depth exceeds 4 times the processor count, then host system will become visibly slow (presumably due to excessive context switching).  For a host dedicated to jvm, this is a 2nd reason for CPU saturation. First reason is high CPU usage, which can become high even with a single CPU-hog.

Note run-queue depth is the first column in vmstat output

[19] q[extern] idiosyncracies

Despite my numerous summaries, I could never reach the bottom of this keyword. has good details like

int x;          // is the same as
extern int x{}; // which will both likely cause linker errors if placed in header.

extern int x;   // while this only declares the integer, which is ok.

lowest missing+ve int#Codility #80%

Q: Write a function int solution(int[] A);  that, given an array A of N integers, returns the smallest natural number that does not occur in A. For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.
Given A = [1, 2, 3], the function should return 4.
Given A = [−1, −3], the function should return 1.

• each element of array A is an 32-bit signed int
• expected worst-case time complexity is O(N);
• expected worst-case space complexity is O(N).
* Elements of input arrays can be modified. is similar but O(1) space and average O(N) time!

—— my analysis —–

The mutable and O(1) space hints at — saving array indices as payload !

—- Idea A:

first scan to swap non-positives to the end and remember the new boundary (say #111).

In the same scan also determine min and max. Suppose min=55.

Another scan to reduce all payloads by 55 so they fall in the range of 0 to max-55.

— idea A1

Now use CSY technique .. check array@0-N in-situ #Nsdq#contrived

— idea A2:

Use quicksort partition to anchor a random node. If it is a[8] > 7, then discard higher section, to focus on lower section. During that scan also keep track of frequency of each int. If the lower section has no dupe numbers, then we can discard it and focus on the higher section.

However, if there are some dupes then this algo can’t discard any section.

—-solutionB: make_heap O(1) space but O(N logN) time. Build a min-heap based on the array in O(1) space, then keep calling min().

  • make_heap shows random access container (vector used in demo), and “rearrange”, implying O(1) space
  • make_heap shows O(N) to construct the heap

min() is O(1) but delete_min() is O(log N), so overall we probably get O(N logN)

—-solutionC: radix sort. shows an in-place binary radix sort.

First pass to transform all negative numbers to 0. Then iterate the sorted array and check for the earliest “gap”. Worst case — you get 1,2,3… without gap, so answer is the 1+ largest array element.

O(W*N) where W is width of the largest integer. If we assume 64-bit then W is a constant:) is another in-place radix sort.

java array footprint: hypothetical calc

In low-latency or high-volume java apps (like exchange-facing), I think array of primitives is quite popular due to its efficiency advantage. Here I review a few potential footprint issues. Suppose we have an array AA of 12 ints, each 32-bit.

  • AA carries housekeeping data such as the array length (i.e. 12). AA also has the “standard” object header, of twelve (or eight) bytes… See size of, and how does it add up@@
  • The 12 ints take up 48 bytes, allocated in contiguous memory… lucky. For 12 Trade objects, we would get 12 scattered allocations.. expensive for runtime allocation and cache-unfriendly.
  • So we have 12 + 48 = 60 bytes so far. There might be padding for alignment. 64-bit CPU probably uses 8-byte alignment as shown on
  • Finally, footprint is 64 bytes, but I have not verified it.

## Y revisit old algo Qn #Kyle

I find it hard to be interested in previously solved problems. I think many fellow developers (not only those in a hurry) simply skip such problems. However, we are not so “good at” these problems actually.

  1. Reason – there are often insights and recurring patterns in an old problem, that require slow and repeated digestion. Solving it quickly is usually not enough… 囫囵吞枣.
  2. reason — our solutions are often sub-optimal.
  3. reason — even if our solutions are optional, there are still very smart alternative solutions out there we don’t know.
  4. reason — we forget.
  5. reason — with a simple tweak, the problem can become a new problem. How well and how fast we can solve the modified problem depends on our insight into the original problem.

Therefore, I feel folks tend to trivialize some well-known problems and fail to learn enough from them.

  • Well-known doesn’t mean well-understood.
  • Well-known doesn’t mean simple.
  • Well-known doesn’t inflexible and restricted to its known form.

Revisiting old problems is boring to many programmers, but can often feels easier than tackling new problems. I don’t get tired easily when revisiting old problems. If you are like me, then it could be more beneficial than tackling new problems.

However, my visible progress is much slower in terms of #problems solved per week. I think Kyle also pointed out something related to this.

old-timers trapped ] pressure-cooker #Davis

See also G3 survival capabilities #health;burn rate

update: old timers are safe in some teams like MLP, but not in MS.

Q: Why are so many old timers unable to get out of a shitty job in a pressure-cooker company?

  • A small number of old timers change job but some of them regret as new job turns out worse. Other old timers hear those war-stories and prefer the certainty of known pain in the current job. They don’t feel confident they can find a better job.
  • bench time — Many old timers can’t afford bench time due to low reserve and high burn rate.
    • Old timers tend to be very unused to unstable income.
    • Damien (Macq) relied on his bonus payout..
    • Some had no savings at all.
  • Most old timers won’t even consider a 6M contract.
  • Most old timers will never consider a voluntary pay cut to get a more comfortable job
  • golden handshake — Some old timers (UBS?) can’t let go of the promised compensation package. They would not resign and give up on a 100k promised windfall
  • some old timers worry about job loss so much that they work their ass off to keep the job, even if the workload is unreasonable/unfair. I think in GS I was like them. I have a friend in a similar situation. I think the boss wants to let him go but may feel merciful, so they keep him but give him shitty tasks and demand fast turnaround at good quality…. unreasonable/unfair workload.
  • Some old timers may have a wife who completely opposes job change due to potential impact on family.

Q: why XR and I can afford to quit a job, stay home, and find a lower job?
A: Supportive wife, burn rate, passive income and savings.

##[15]types@Work2slow brain aging #ChrisMa!

This is an important topic for the researchers. As laymen, I will just reflect on my personal experience — looki.

What kind of activity is considered “work”? I feel it’s the responsibility or commitment, obligation, consequences on users -> fear …

Most but not all the “work” is tiring. I suppose creative work can help keep the brain young.

  • IT — Learning a new tech in body-building mode doesn’t feel tiring to me, but how about to XR?
  • IT — A big chunk of everyday IT work (including troubleshooting) is partly creative and investigative. Can be brain-boosting.
  • Kids doing homework – can be tiring but at their age it won’t speed brain aging. How about adults?

— programming:

Chris Ma felt any programming job is intense, never lethargic. However, I tend to classify some maintenance type of work as coding.

Chris felt a relatively easy programming job might be good for my brain. EPA? Now I feel this view echoes Jiang Ling’s view.

— Avoid stressful, demanding jobs that require burning my candle on both ends.

Need to listen to the inside signals when working the long hours. When I work day and night on coding assignments, I was like my Dad and perhaps CSDoctor. I was driven by joy and burning pleasure, not pressure like in Stirt and GS. I feel my body knows if the pressure is positive or negative, pleasure or pain, so better listen to the signals and avoid aggravating the aging process.

Sudhir told me about encouraging environment vs fear… See blogpost on 5 external inputs

##deeply felt Priorities b4 U.S.→SG@45 #big-picture

  1. — priorities over the next 2-10Y horizon
  2. Career[a] longevity till 70, probably on wall st, not in Singapore or West Coast. A related keyword is “relevance” to the geek economy.
    1. On Wall St, I continue to keep a keen focus on robust technologies like core Java, cpp, SQL, sockets, core threading, common data structures. Outside Wall st, jxee (and possibly web stacks) offers the best market depth and demand.
    2. Compared to Wall St, West coast is possibly low priority for now as I don’t see long term visibility.
  3. my wellness — Need to guard against PIP-hell, trapped… A “stability factor” arguably more impactful than GC, housing, schooling… One of the key signs of wellness is weight and calorie count.
  4. boy’s education — I don’t know if U.S. system is better for him
  5. GC — a G5 priority in my current plan, primarily on the back of the longevity factor.
  6. preparing for war at new job — short-term, immediate but actionable.
  7. — 2nd tier
  8. increase precious time with grand parents in my 3rd U.S. era — fly business class to NY.
  9. more passive income to reduce the cash flow stress
  10. Saving up for U.S. housing — not much I can do now.
  11. I now have a deep desire but limited hope to keep up my 细水长流 motivation for coding drill and QQ learning. Burning pleasure; self-esteem; satisfaction; absorbency.
  12. wife’s and daughter’s life-chances in U.S. — important but not much I can do now
  13. leaving a good impression with MS manager

[a] I didn’t say “income”. I think more important to me is my marketability (+ relevance, in-demand ..). Career longevity is the basis of entire family’s well-being for 15Y until kids start working.

Salary — is kinda secondary because the room for improvement is negligible in my biased mental picture.

GS-sg coreJava IV with rare feedback

A Tech win by my own assessment.

–rare email feedback, with highlighting by me. It is clear o me the Hongkong OMS interviewer gave the most direct (negative) feedback. I feel grateful for that.

I feel this was a technical win. I think only the Hongkong OMS interviewer found some weakness in me.

I feel perm role selection/screening was more stringent than contractors because of leadership, communication, personality match, ownership… The new permanent hire is often seen as a potential manager to join the “race”.

In contrast, contractors are mostly assessed for technical + basic communication/attitude.

In SG, more than half the time I was assessed as a dev lead (or architect) but I don’t want to. U.S. contract market is better.

As he interviewed till the end, I’d like to share a bit of feedback.

The interviewers believed he had good industry experience (low latency, low-level coding, memory usage, CPU usage, MSA) and can implement/maintain systems, and appreciated his honesty e.g. not hiding the fact that he didn’t perform performance testing.

However, the interviewers wished to see more leadership and communication skills cognizant of 15+ years experience, and to conduct his own tests rather than easily take others’ assumptions at face value.

— Tokyo interviewer
Given a BST, how to do you insert, look-up by key?

Lastly, how do you remove an arbitrary branch node? Need a deterministic algorithm to reshuffle affected subtree and keep the BST property. Considering the python coding rounds, this is about the hardest pure-algo question throughout the interview process.

  • This level of difficulty is very different from tech shops
  • This level of difficulty is much lower than the coreJava QQ questions

— QQ interviewer Ming

Q: suppose you find your prod log file suddenly stops growing, what do you do?
%%A: check disk space; check DB query hung due to a pending transaction; check socket data input/output blocking
%%A: take thread dump, which is a nice JVM feature not available in c++ or other applications

Q: how did you tune your jvm?

Q2: how did you size the heap enough to ensure no jGC for entire day?
%%A: We estimated the high watermark (like 2GB) and configure my jvm with a heap bigger than that.

This technique is simple and crude but it usually works. I think interviewer may not be impressed but I was confident because … it works!

Q2b: but will that prevent GC? Do you know every GC must stop all application thread first? (typical QQ)
%%A: at least the app thread can run after the snapshot, during the actual collection, right? I assume the snapshot is quick.

Q: in java, even without any protection, reading a 32-bit int will give you either the before-value or the after-value, right?
%%A: Not in c++. See c++11 atomic{int}^ #Bool can be half-written

Q: beside speed, what are other benefits/advantages of lock-free compared to lock-based designs?
%%A: the unsuccessful thread can do something else before retrying, rather than block indefinitely
%%A: a blocked thread (due to lock contention) is likely context-switched off the CPU. All the CPU data caches would soon be flushed.

Q: single-producer, single-consumer … can you design an efficient data structure to share data between them
Now I think we only have one shared mutable variable — the producer’s moving pointer. Interviewer hinted that all we need is for the producer to notify consumer where the new position of the moving pointer, in the data structure. Perhaps we don’t need lockfree. Volatile is probably sufficient.

Aha — If the consumer thread needs to wait (not busy-wait), then notification is the key, so condVar is more suitable than lockfree. I need to be more familiar, more fluent with the fundamental constructs (thread class, mutex+condVar)

I tend to think of that moving pointer as an integer.

I asked interviewer: are these concurrency knowledge and skills needed in your projects?
Interviewer: “Your CV mentioned these skills“. I believe interviewer implicitly answered NO. Standard practice of verifying/scrutinizing claims on CV !

–This OMS interviewer was trying to break the candidate…

Q: describe CAS. I recalled the details in
Q: how many clock cycles for a CAS instruction?

See further details in toy^surgeon

Q: what’s your actual CAS usage in your project, not a textbook use case?
%%A: At citi-muni, I had two threads putting quotes on a lockfree queue or stack. Queue is natural. In hindsight, the stack can be useful — when readers want to process the latest quote first, LIFO.

Q: how would you design an order book, if you ignore your current system?
Q: why did you say array-based is faster than AVL?
%%A: most likely cache efficiency

Q: what’s the performance level of your order book? “I (the interviewer) would use array-based order book .. can easily achieve 1M msg/sec”
Q: what’s the disadvantage of an order book based on AVL tree?

— coding^QQ^non-tech questions

I think I did well on non-tech (A-) and coding (A), OK on QQ (B). With the non-technical, again, I feel interviewers can’t reach conclusions.

I believe the tough questions listed above are classic QQ i.e. theoretical knowledge not needed in real projects. Interviewers may consider it zbs but I would call it QQ.

Consistency and stability in high-end coreJava QQ topics — No real change since 2007

Do I look like Deepak? I think with algos I seldom feel fake and broken. As to the QQ topics if I am armed with a tidbit of c++ low-level knowledge then I can often defend myself, because on those questions, c++ is invariably more low-level than java. There are still some “weakness topics” where I may come across as … superficial+academic, like Deepak.

I generally concede to the interviewer, but not today during the “CAS justification” discussion.

j8 MethodRef #cheatsheet

Need to developer more low-level insights as QQ…

  • Out of the four kinds of method refs, only AA) static-method and BB) specific-instance kinds are popular. The other two types are obscure.
  • q[ :: ] is used in all four kinds
  • I think the static-method kind is most readable and most intuitive. The javadoc tutorial features a BB example that should be implemented as static method IMHO.
  • a lambda expression has parameter types. A method ref has none and must be converted to a lambda expression. Where does compiler infer the parameter types? I think it is inferred from the calling context.

QuickSelect: select nth key#O(1)space

Quickselect has average O(N) runtime but worst-case O(NN), if the partitioning goes bad repeatedly. Randomized pivot can reduce the chance but I don’t think we can eliminate it

Space complexity is O(1) despite the apparent recursive set-up. Tail recursion optimization is usually available to reuse the same stack frame. Otherwise, recursion can be replaced by a loop.

std::nth_element() is linear on average but quadratic in worst case — explains QuickSelect algo

Quickselect is by the same inventor of quicksort.

get_majority_elem]unsorted array,O(1)space #90%

Q: Given an array of size n, find the majority element. The majority element is the element that appears more than ⌊ n/2 ⌋ times. You may assume that the array is non-empty and the majority element always exist in the array.

worst input: odd length, only X and Y. X occurs once more than Y.

hash table solution needs O(N) space since there can be N/2 distinct values. To improve space complexity, how about quick select? Discard the smaller side and pick another random pivot.

Median-finder algorithm can solve this problem, using std::nth_element() which uses QuickSelect… O(1) space despite recursive set-up.

— idea 3 O(1) space: random pick then verify
Random pick and count the occurrence of this pick. Is it more than N/2? Within a few trials we should find a good pick.

K-palindrome #careercup 20%

Q (Facebook India, careercup): A k-palindrome is a string which transforms into a palindrome on removing at most k characters. Implement check(str, k). str has at most 20,000 characters. 0<=k<=30

Sample Test Case: Input : abdxa 1 -> false
It’s crucial to zoom into the worst input early on — binary string with the correct “kills” well-dispersed.

Insight — once we group the chars by ascii code, I can see a successful palindrome is really an A-palindrome interleaved with a B-palindrome and C-palindrome…

This problem is related to the (tougher) problem of “max palindrome sub-sequence”

Eg k=33.
–idea 3: per-club distribution table
each char gets a dedicated club/distro, where we record all positions occupied by this char.
Since K is small, then every distro should be almost “balanced” around the midpoint.
— idea 7
There can be only up to 33 possible midpoints, including those in-between. (If all kills hit the upper end, then midpoint drops by 33/2.) Let’s try each possible midpoint.

Insight — once we pick a midpoint, we know how many we can kill above/below. For example, if we pick the highest midpoint, then all kills must hit lower half. The extreme midpoints are easier due to stringent constraints.

If we pick a central midpoint, then we can kill about 33/2 in each half, but the kills must match — nice constraint.

For each midpoint picked, we run Algo B:
At init-time, we compute the defects in each club/distro using Algo B1:
given the midpoint and the positions of all J’s, the defect count is the number of kills (killing a J or non-J) to balance J’s distro…

We will compile all 26 defect counts.  Now we start growing our seed at the chose midpoint. We expand both ways. If next 2 positions belong to different clubs J vs L, we evaluate the two kill options using Algo B2
One of the two choices will reduce overall defects. We first look at J’s distro. The J kill would improve or worsen J club’s defect.

If there’s an efficient way to compare evaluate the two kills or update the default counts, then we have a decent solution.

–idea 9: frq table. If there are 35 odd clubs, then 33 of them can become even but still 2 odd clubs –> immediately failure. However, This idea fails with the binary string.

## coding drill 19 Jul

  2. — merge sort in-place

— presumably solved

  3. — pacific/atlantic
  4. — max-product subarray
  7. find min in rotated array
  8. — T.isSubtree(S)
  9. — longest run@same char
  10. 3-sum

— previously solved problem forgotten and refreshed

  1. Leetcode #139 word-break. .. not really “medium”
  2. — max sub difference
  3. — min range to include all clubs
  4. — find lone-wolf among AAABBBCCC
  5. — flip binTree
  6. — serialize linked list: each node points to a random node

— unsolved:

  2. — K-palindrome


T.isLikeSubtree(S) #60%

Q (Leetcode 572): Given two non-empty binary trees s and t, check whether tree t has exactly the same structure and node values as a subtree of s. A subtree of s is a tree consists of a node in s and all of this node’s descendants. The tree s could also be considered as a subtree of itself.

====analysis relies (as “primary check”) on the payloads of  each tree node, but in some trees, all payloads are empty or all payloads are either True or False. In these cases, the comparison of payload is only usable as a secondary check. The primary check must be structural. See key in a tree node

The O(N+K) is plain wrong.

I guess the 2nd solution (and possibly 1st solution) would compare every node in S to the root of T. I think there are more efficient solutions using subtree size and subtree height as secondary checks – more reliable than payload check.

My solution below uses BFT + pre/post/in-order walk !

— Preliminary step: post-order walk to get subtree-size, subtree-height at each S node + T root. (I will skip other T nodes.). Suppose T is size 22 height 4. We will look for any node Y of size 22 and height 4 and a matching payload. This would eliminate lots of S nodes:

If T height is more than 2, then lots of low-level S nodes are eliminated.
If T height is 2 or 1, then T size would be at most 3. Most high-level S nodes are eliminated.

— Solution 1: For both T and S, We take in-order walk to assign incremental IDs, then take pre-order walk to produce an array of IDs that represent the tree structure.

Can We run a level-aware BST. Only one level need to be examined … wrong!

I think the in-order walk itself is what I need. Find any node Y in S that matches size+height+payload of T root. Suppose ID(Y)=44 but ID(T root) = 4, then simply shift down by 40 and do a linear scan. Must check payload, but not height/size.


min int_range2include all clubs#presorted 60% careercup

Q (careercup): You have k uneven lists of pre-sorted integers, N ints in total. Find the smallest range that includes at least one number from each of the k lists. For example,
List 1: [4, 10, 15, 24, 26]
List 2: [0, 9, 12, 20]
List 3: [5, 18, 22, 30]
The smallest range here would be [20, 24] as it contains 24 from list 1, 20 from list 2, and 22 from list 3
==== Analysis
Looks rather contrived but very popular.
— solution 1 O(N): moving window algo:
O(N) merge all K clubs’ items into a big array, where payload includes the original clubId of each item. We end up with
[0/2nd 4/1st 5/3rd 9/2nd 10/1st …]

Once I come up with this big array, I feel confident to conceive a O(N) moving-window algo.

A ‘basic window’ is any subarray having representatives from each club.
A ‘tight window’ is a basic window that can’t shrink further, i.e. the 1st and last items are the only reps of their clubs.

Now findInitiaLwindow(), similar to findNextWin().
Simple procedure —
As we build the window, we update a hashtable like {club1: position 1,3,6; club2: position 0,5; club3: position 2,4,..}
Basically an array of queues.
We update this table each time we increment the front pointer in search of the next basic window.
When we find a basic window, this table should have all K queues non-empty and at least one of them singular, probably the last updated queue.

Now shrink this window via shrink(). Here’s a fast algo but more tricky and no O() impact —
For each queue, find the last added i.e. right-most position.
Put these K positions in a container(actually no container needed).
Now out of these K positions, find the minimum. Say 22. I claim 22 is the left edge of a smaller window.
Keep shrinking if we can but I doubt we can.

Here’s a slow shrink algo —
For the current window, look at the clubId of the left edge (back ptr).
Use the clubId to locate a queue in the table.
Head item in the queue should match the position of back ptr.
If not empty, pop this queue and increment the back ptr by one; else exit shrink()

Now we have a tight window, we remember its size (and update current winner).

After shrink(), we call findNextWin() —
Suppose the basic window’s earliest item is a clubX.
Now we move back ptr (clubX queue is now empty).
Now we move front pointer i.e. right edge of the window, looking for the next clubX item.

* findNextWin() is O(1) per front ptr increment. Only touches the tail of queues
* shrink() is O(1) per back ptr increment. Only touches the head of queues
* therefore, at most N front ptr moves and N back ptr moves.. O(N)

PriorityQ can speed up some, but won’t improve bigO

## reasons4salary_diff ] ibank #Davis,Jay

Primarily supply-demand driven

* Reason (demand-side): age discrimination — Davis pointed out that many ibank and other employers would accept (reality) to pay higher salary to a fresh grad than an old-timer because these employers long for fresh blood

* Reason (demand-side): core dev teams — are more central than peripheral teams like devops (eg QAPM), QA, DBA, reporting,.. Consider Ling, the QAPM devops scripting guru.

* Reason (demand-side): Jay Hu pointed out that hard-core dev skillset — is perceived as more valuable, harder than “tools” including reporting tools, devops tools, automation tools, vendor products that encapsulate the underlying complexities.

Similarly, c++ skillset is often perceived as harder than coreJava
Similarly, coreJava skillset is often perceived as harder than jxee
Similarly, c++java skillset is often perceived as harder than scripting

On the supply side, there are fewer hardcore c++ developer than coreJava developers, and a lot more jxee developers. However, there’s also the Curious case of quant dev — tight supply, but dwindling demand.

If you have some brain power (like the 华中理工 compScience friend of Jay Hu), then don’t be put off by hard-core dev, even though it’s dry, boring, tough, unglamorous. See my blogpost on my geek profile.

Pacific/Atlantic #51%

Q: Given an m*n matrix of non-negative integers representing the height of each unit cell in a continent, the “Pacific ocean” touches the left and top edges of the matrix and the “Atlantic ocean” touches the right and bottom edges. Water can only flow in four directions (up, down, left, or right) from a cell to another one with height equal or lower.

Print all cells where water can flow to both the Pacific and Atlantic ocean.

peek? not yet
— solution 1:
Create m * n nodes in a 2D grid. Two more nodes:
All north|west boundary nodes have an edge From Pacific node
All south|east boundary nodes have an edge From Atlantic node

Between two adj cells, there’s 1 or 2 directed edges, from low to high. Intuition — tracing pollutant upstream. An arrow from P to Q means Q is suspect.

How do I represent the graph? Each node can hold a list of “suspect neighbors” (higher or equal).

Run BFT or DFT from Pacific node and turn on flag in each reachable node.

Run the same from Atlantic. Every reachable node with the flag set is a double-suspect to be printed.

clean-up: non-overlapping intervals #70%

Q (L435): Given a collection of intervals, find the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.


  1. You may assume the interval’s end point is always bigger than its start point.
  2. Intervals like [1,2] and [2,3] have borders “touching” but they don’t overlap each other.

==== analysis:

I think this is same as the greedy room scheduler. Schedule the earliest-ending task, so to to maximize accepted meetings.

A deselected meeting is an interval removed.

recent algo questions: mostly presented in intArray+binTree

binary tree + int_array represent 60% of my recent algo questions. With string and matrix included, we get 80%.

  • “int_array” also include array of int_tuples. Other arrays are rarely quizzed
  • In the real world, binTree is less common than arrays , but trees are more common than binTree. I think interview questions tend to get watered down to the binary species. K-ary trees are rarely quizzed
  • matrix — 95% of Matrix problems are 2D grid problems or integer array problems.
  • graph — Most graph problems are presented as matrix or binary trees, occasionally with cycle.
  • slist — shows up in 5-8% of problem descriptions. Some problems need slist as auxDS.

Leetcode is representative of a constellation of websites.

iterate K pre-sorted uneven immutable lists #FB

Interviewer (Richard) was kind enough to offer me a tip early enough. so we didn’t waste time (which could easily result in out-of-time)

Q: given K pre-sorted immutable lists, each up to N items, return an iterator that on demand yields each of the (up to K*N) items in sorted sequence.

Estimate time and space complexities.


— I first proposed pair-wise merge. Since there are logK passes, Time complexity == O(NK logK)

Space complexity is tricky. Very first pass i have to create a single list up to NK items. Then I can reuse this list in each merge. so space complexity == NK [1], but I said NK*logK. Interviewer wasn’t familiar with this solution and didn’t correct me.

[1] See Suppose 8 lists to merge. I will merge A+B into first quarter of the big array (size NK), then C+D into 2nd quarter… In next pass, I will merge AB + CD in-place using the first half of my big array.

The advantage of this solution — once I create a single merged array, each iteration is O(1). This can be valuable if run-time iteration speed is crucial but initial warm-up speed is unimportant.

bigO insight — merging N pre-sorted arrays is O(N logN), same as merge sort?

— Then interviewer suggested iterating over the K lists so I implemented the solution in

  • Space complexity = K
  • Time complexity:
    • next() O(logK) due to popping. I said Fib heap has O(1) insert
    • init() O(K)
    • hasNext() O(1)

How about a RBTree instead of heap? Space complexity is unchanged.  Time complexity:

  • next() O(logK) for both remove and insert
  • init()  O(K logK), worse than priorityQ
  • hasNext() O(1)

— FB interviewer asked why I prefer to keep state in global var rather than a per-record field

%%A: Runtime heap allocation is slower if the record is bigger. In contrast, the global dictionary is tiny and likely lives in L1-cache

[19] zbs cf to QQ+GTD #compiler+syntax expertise

Why bother — I spend a lot of time accumulating zbs, in addition to QQ halos and localSys GTD

I have t_zbs99 and other categories/tags on my blogposts showcasing zbs (真本事/real expertise) across languages. Important to recognize the relative insignificance of zbs

  • #1 QQ — goal is mobility. See the halo* tags. However, I often feel fake about these QQ halos.
  • #2 GTD — localSys or external tools … goal is PIP, stigma, helping colleagues. Basic skill to Make the damn thing work. LG2 : quality, code smell, maintainability etc
  • #3 zbs — goal is self-esteem, respect and “expert” status. By definition, zbs knowledge pearls are often not needed for GTD. In other words zbs is “Deeper expertise than basic GTD”. Scope is inherently vague but..
    • Sometimes I can convert zbs knowledge pearls to QQ halos, but the chance is lower than I wished, so I often find myself overspending on zbs. Therefore I consider the zbs topics a distant Number 3.
    • Zbs (beyond GTD) is required as architect, lead developer, decision makers.

I also have blog categories on (mostly c++) bulderQuirks + syntax tricks. These knowledge pearls fall under GTD or zbs.

post-order walk: create valuable subtree statistics is rather simple and short, showing stats like

  • max path sum from current node to any subtree node. Note a path must be non-empty, so this max can be negative
    • We can even save in each node this actual path
  • subtree height,
  • subtree size,
  • d2root — tested but doesn’t use subtree statistics
  • — can be easily implemented:
  • subtree payload sum
  • subtree max payload… These stats are powerful AuxDS

Applicable on k-ary tree

reverse post-order mirrors pre-order #diagram+Dot

For a given binTree the pre-order sequence is mirrored by the reverse-post-order sequence.

“Reverse” := “right_child then left_child”

I had an “Aha” moment when reading P 389 [[discrete mathematics]]. Insightful illustrated path-tracer diagram showing that pre-order = ABCDEFGHIJ with root as first visited, and J=right most leaf node. The reverse-post-order = JIHGFEDCBA with root as the last visited, due to post-order. Note the DOT is on the Left for both reverse-post and pre.

( has standard diagrams with a DOT on the left for pre-order; post-order has dot on the right of each node)

Note any post-order curve is less intuitive less visual (than pre-order) because the curve hits a root of subtree upon leaving ! The earlier encounters are touch-n-go. The dot can reinforce this key insight. The dot can help bridge the conceptual gap.

Similarly post-order sequence is mirrored by reverse-pre-order. For the same binTree above, reverse-pre-order = AFGHJIBDEC (not mentioned in the book). Post-order = CEDBIJHGFA, as printed in the middle of P 389. Note root is last visited due to post-order.

max proceeds selling 2 homes #Kyle

Q: you own a residential and an investment property. You have the historical price time series on both. You want to see the max aggregate amount you could have sold both, provided you must sell the investment property before the residential property. You can pick a time to sell the investment and a later time to sell the residential.

I created this questions for Kyle.

find offending section ] originally-sorted array

I rephrase Leetcode 581 Q: There was an ascending int array containing duplicates. Someone reshuffled some elements so no longer sorted. Write a function to determine the shortest subarray needs reshuffle. Before or after this subarray, no change needed.

O(N) time and O(1) space.

==== analysis
Not “easy” as labelled.

I want to find lowest integer that’s out of place. (Do the same on the other end and problem completed.)

first scan to find global min and max values, and ensure they are correctly placed.

Next scan from beginning on the rising slope. At first down-turn, treat the array as two partitions. In the right partition after the first peak, we determine the lowest value, say, -55. Then we will find the correct position for -55, within the Left partition! We will look for the last item equal-or-lower-than -55…. This is the key constraint of this entire problem.

Insight — We know -55 must move left and we don’t care about any other item in the right partition.

Insight — As a concrete illustration, if within left partition, -55 should stand after #7, then we know all Earlier items up to #7 are already correctly placed. Why? All Subsequent items in the left section are strictly higher than -55; and all other items in right section are all (equal or)higher than -55.

— my idea 1 in ON) space : radix sort a clone of the array, then compare to the input array.

count unique squares ] matrix #70%

Q1: Count unique squares within a R*C matrix. A unique square is identified by two opposite corners. For example, [0,0] and [1,1] identifies a 4-cell square; [2,3], [2,3] identifies a 1-cell square.

Q2: print out all of them.


Q1 Feels like a math problem, but printing is a algo question

— solution 1: Let’s focus on height-2 for now. I will denote the subset of squares having height-2 as “2-subset”. (The 1-subset is trivial.)

Insight — 2-subset is naturally non-overlapping with the 3-subset, 4-subset etc. These non-overlapping subsets constitute the whole solution. This fact is extremely valuable to the count algorithm.

For each height-2 square, I will focus on the north-west corner i.e. the corner among four, having lowest r and c IDs. We can divide the 2-subsets by the rowID. For a 55-row/44-column matrix, there are 54 height-2 squares with rowID=0. Same count with rowID=1 etc.

Aha — each square can be identified by the north-west corner {rowID, colID} + height. So at the higher level I divide the whole solution set by height. At the lower level, I sub-divide the set by the “low rowID”

bigO CIV: clarify_requirement means..worst_data

———- Forwarded message ———
Date: Tue, 9 Jul 2019 at 23:10

Big-O means “worst” case complexity. We need to be doubly sure what “worst” case data look like. (Note it doesn’t mean ridiculously rare data like integer overflow.)

On high-end coding interviews, these “worst data set” frequently shed light on the structure of the original problem, and often point to the correct direction for an efficient solution. (Note a nice-looking solution is not efficient if it exhibits poor time complexity in the face of a worst yet realistic data set. Such a system may not scale well in practice, given its Achilles’ heel.)

If interviewer wants a solution optimized for bigO time complexity, you can waste precious time tackling the wrong problem. The wrong problem is the problem defined by your idea of “worst data set”, but the real worst data set is very different.

Sometimes, the worst data is _subtly_ different but it points to a different algorithm direction. It may point to an iterative solution rather than recursion, for example.

The original interview question may not be that hard iFF we identify the worst data set early on, but sadly, we run out of time.

For some tough problems, the first (sometimes the only) challenge is quickly identifying worst data set. Interviewer always gives you the simple data set, NEVER the worst data set. It’s your job to identify the worst data set… often the key insight into the problem.

It’s no exaggeration to say — identifying the worst data set early or too late can make-or-break your chance at this employer. You may kiss good-bye to this job opportunity exactly because you are too slow to identify the worst data set. I know what this means. Other candidates probably got shot by this arrow on the heel, without knowing what happened.

Time is ticking as soon as the problem is presented to you. If interviewer says time is not a factor… take your time.. we want quality and thought process … I just ignore it. Our speed is always part of assessment. Therefore, a very common failure scenario is —

.. tractable problem but candidate runs out of time after wasting time on preliminaries like using incorrect worst data set to reason about the (wrong) problem.

print height@subtree at every node #post-order

For these problems, elegant implementation is expected.

Note in-order is for binary tree, but post-order is for any tree.

== Q: given Any tree without cycle, print the size of subtree at every node

I think post-order walk is usable.

== Q: given Any tree without cycle, print the height of subtree at every node. Also print the node address just for identification

Intuition — dft post-order walk. Need to become more familiar with the detailed steps. has details.

== Q (Leetcode): Given a binary tree, determine if it is height-balanced — a binary tree in which the depth of the two subtrees of every node never differ by more than 1.

I can try it on Leetcode .. no need to set up test harness myself but I might be frustrated

Intuition– post-order walk. A simple recursive algo — for any node, find the its subtree height including itself and return it. Before returning, also compare the left vs right subtree heights

Optional.empty()=immutable !!singleton– points out that Optional.empty() probably would return a global singleton instance, but no-guarantee ! No one should assume it is a singleton. Use of q[ == ] assumes it, and is wrong.

Unlike enums, JVM doesn’t guarantee “singleton”. I think this no-guarantee decision was chosen so as to give compiler/JVM maximum freedom.

[[effJava]] suggests that immutable instances need no copying. They probably can be singletons, but don’t need to be singletons.


array of sd::byte^Unsigned-char as “byte array”

Background: Suppose you need to store or transfer arbitrary binary data.

In RTS, we used char-array. Some say we should use unsigned char. It is the only data type that is guaranteed (by the ANSI C Standard) to have no padding bits. So all 8 bits in an unsigned char contribute to the value. None of them is a padding bit. I think std::uint8_t is similar but less common.

Contrary to some online posts, unsigned-char type is different from “char” —

In C++17, std::byte is probably preferred because only the bitwise operations are defined. I believe you can reinterpret_cast a pointer to std::ptr. In, I used none of the above — I used placement-new 🙂

In java we use the primitive “byte” type — an 8-bit signed integer.

realities@Canada universities: internship #DavisWei

Davis agreed the quality of education is good in Canada universities, but students are less aggressive (I would say hungry) than in the U.S. He studied in a lesser-known college in Canada. His (and classmates’) salary grew but remained much lower than in U.S. even in pre-tax.

I feel U.S. pays high salary to software engineers. Canada may not.

(Same can be said about U.K. and France.)

Davis said “no investment banking jobs” (Same can be said about Australia.)

He gave examples to show — job opportunities are much fewer there.

(Same can be said about Australia.)

U.S. college graduates can find internships + job in the U.S., but Canadian college grads can’t so they usually work in Canada, with much (per Davis) lower salaries. Like in other countries, only a minority in Canada can break into more lucrative job markets overseas (ie US). There’s clearly an invisible barrier between Canada and U.S. job market. I thought there was none. Now I think the barrier is visa —

Canadian graduates need visa to work in U.S. TN is usable but still employers need extra legwork. Overall, Davis perceives a real barrier of entry. In 2016 I experienced something similar — Beyond the heavy door there’s a land of lucrative job opportunities. I wrote a blogpost about it.

  • I overestimated the “level playing ground” in U.S. job market. In reality, local candidates have an advantage bigger than I assumed.
  • I tend to overestimate my (or my kids’) raw talent. With extraordinary talent we can kill any interview, but we are not so extraordinary.
  • I underestimated the entry barrier.
  • Best example — I can see the advantage of Wall St as a job market relative to London (per Ram) and Singapore. Even a mediocre java guy can earn 150k. However, so many competent techies in Asia experience many obstacles when they try to break into Wall St. Most of them don’t bother to try.

hash table of integer_ratios #slope

It’s not possible to store arbitrary ratios as floating point values in a hash table. We hit this problem when matching line slopes. provides a solution

  • the gcd(int1 int2) function finds the greatest common divisor of two integers. This is the key to our problem.
  • convert a string like ‘x/y’ to a simplified Fraction object
  • convert a decimal 
  • convert a float 

I think other languages also provide the same gcd functionality.

low_churn low_stress domains

–high churn high expectation is worst but I have no first hand experience

  • devops – churn due to integration with automation tools
  • xp: Barcap?

–low churn high expectation

  • xp: Macq quant-dev domain
  • xp: PWM: SQL + Perl + java
  • xp: Qz

–high churn low expectation (low caliber)

  • xp: dotnet at OC
  • mediocre web shops

–low churn low expectation is ideal

  • xp: RTS socket + mkt data
  • xp: citi muni: bond math + coreJava
  • xp: mvea


Minor QnA with Stroustrup

Q: memory efficiency is a traditional advantage of c++. Is that still true?
A: yes

Q: some of the top innovators in c++ including yourself, Andrei Alexandrescu, Herb Sutter, … are not contributing to java, c# or other languages. I wonder why.
A: Andrei is working D.
%%A: GC languages have their constraints. Many of the c++ innovations don’t easily apply. However, the cross pollination is more natural between c and c++. Stroustrup said he might be one of the biggest contributors to C feature improvements.

Q: strongholds of c++?
A: complex code, close-to-hardware

Q: do you agree that the new crop of language features are used mostly for system programmers, library developers … not application developers?
A: See the stronghold answer

Q: what competitors do you see over the next 20 years? Go? Rust?
A: none

QQ study=burning_joy

See also the pleasure^chore framework

Unbelievably, QQ study feels like “burning” pleasure. After I complete some chore like coding drill, tax, I long for some QQ study !

There are very few burning pleasures:

  • tech study including zbs, halo…
  • localSys hacking — camp_out
  • weekend coding assignments
  • jogging in stadium
  • learning Oracle or unix in my younger days — camp_out

signed-int: longest K-sum subArray #60%

Q: given a signed-int array, find the longest subarray having sum=K

realistic scenario: we know someone executed a buy and a sell and made $K profit. There are many possible buy-sell points when I review the price points. We know this trader held the position longest to gain exposure, so which buy-sell points did he use?


First scan to build cumulative sum — the sum-array. Looks like a price chart. Initial value is the first original element.

Now scan this sum-array. At each element s[j],

  • I save s[j] in hashtable {s[j] -> j] iFF no such key yet.
  • Then I look for the (by design, earliest) element s[i] such that s[i] + K == s[j]. If found in the hash table, then compute hold:=j-i and update the global longest_hold if necessary.

Note the hash table’s values (positions) are all earlier than j, since we are scanning forward.

Can consolidate to one scan.

Tip — hash table is usable to reduce O() thanks to exactly-K.

Q: what could derail work-till-70 plan

Q: what things can derail my work-till-75 plan. Let’s be open and include realistic and theoretical factors.

  • A: I think health is unlikely to be the derailer. In contrast, IV competition, age discrimination are more likely. The ruthless march of technology also means demand for my skillset will decline..
  • A: mental health? Look at GregM at RTS. See other blogposts on brain aging.
  • A: On the job, my energy, absorbency and sustained focus might go down (or up) with age. I wrote more in another blogpost — As I age, brown-field localSys will become harder to absorb. I may need to stay at one system longer.
    • On the other hand, if offered the chance to convert from contractor to FTE, I may need to resist and possibly move out.
  • A: On interviews, I think my QQ knowledge will remain competitive for many years.
  • A (pessimistic view): green field vs brown field — as I age, my capacity to handle green field may go down. My speed to learn brown field codebase may also go down but after I learn it, I may be able to retain the knowledge.
  • A1: #1 derailer is demand for my skills. In fact, beside doctors, Wall St tech might be one of the most enviable domains for work-till-70. Note “tech” also includes BAU, sysAdmin, projMgr, productMgr and other support functions.
  • A1b: Based on the rumor that west coast is more competitive and age-unfriendly, then the techies there in their 40’s may have more difficulty to remain hands-on like on Wall st. I have a natural bias towards WallSt contract market. If confirmed, then Wall st is better for older programmers.

4 common mem techniques]%%HRT assignment

Technique: reinterpret_cast

Technique: memcpy

Technique: pre-allocated buffer as local static objects … can be better than "stack" allocation only for a large buffer.

Unlike local variables, local Static objects can be returned by pointer — major flexibility in low-level memory management. discusses the runtime cost of static local vs local

std::vector capacity reduction

Q: how would vector’s backing array reduce in size? In other words, how would the capacity every reduce?
%%A: The only hope is to shrink_to_fit() which is a request to compiler. Compiler may ignore it.

If capacity reduction does happen at runtime, then reallocation would probably happen.

I believe resize() assign() clear() etc will never reduce vector capacity.

[17] string(+vector) reserve()to avoid re-allocation #resize()wrong

See also std::vector capacity reduction and vector internal array: always on heap; cleaned up via RAII

string/vector use expandable arrays. Has to be allocated on heap not stack.

For vector, it’s very simple —

  • resize() creates dummy elements iFF upsizing
  • reserve() only allocates spare “raw” capacity without creating elements to fill it. See P279 [[c++stdLib]]

[[Optimized C++]] P70 confirms that std::string can get re-allocated when it grows beyond current capacity. compares std::string.resize() vs reserve().

  • Backgrounder: Capacity — allocated capacity is often uninitialized memory reserved for this string.
  • Backgrounder: size — length of the string that’s fully initialized and accessible. Some of the characters (nulls) could be invisible.
  • string.resize() can increase the string’s size by filling in space (at the end) with dummy characters (or a user-supplied char). See
    • After resize(55) the string size is 55. All 55 characters are part of the string proper.
    • changes string’s size
  • string.reserve() doesn’t affect string size. This is a request to increase or shrink the “capacity” of the object. I believe capacity (say 77) is always bigger than the size of 55. The 77 – 55 = 22 extra slots allocated are uninitialized! They only get populated after you push_back or insert to the string.

python bisect #cod`IV

The bisect module is frequently needed in coding tests esp. codility. In this write-up, I will omit minor function parameters.

* bisect.bisect_right(x)  # returns a position “i” after the last hit, if any, such that
all values are <= x from a[lo] to a[i-1]) for the left side, and
all values are > x from a[i] to a[hi-1]) for the right side. So the “hit” items are on the left, unconditionally 🙂
* bisect.bisect_left(x) # returns an index i before or on the first hit:
all values are < x from a[lo] to a[i-1]) for the left side, and
all values are >= x from in a[i] to a[hi-1]) for the right side.

That’s too long-winded. My sound bytes:

  • bisect_left(needle) returns the first index above or matching needle.
  • bisect_right(needle) returns the first index above needle, unconditionally.

A few scenarios:

  1. If No perfect hit, then same value returned by both functions.
    • Common scenario: if needle is higher than all, then “i” would both be the last index + 1.
    • Common scenario: if the needle is lower than all, then “i” would both be 0
    • in all cases, You can always insert Before this position
  2. If you get perfect hits, bisect_left would return the first “perfect” index, so bisect_left() is more useful than bisect_right(). I feel this is similar to std::lower_bound
    • This is confusing, but bisect_right() would return surpass the last perfect index, so the returned “i” value is higher. Therefore, bisect_right() would never return the “perfect” index.
  3. If you have a lower-bound input value (like minimum sqf) that might hit, then use bisect_left(). If it returns i, then all list elements qualify from i to end of list
  4. If you have an upper-bound input value that might hit, then use bisect_left(). If it returns i, then all list values qualify from 0 to i. I never use bisect_right.
  5. Note the slicing syntax in python a[lo] to a[i-1] == a[lo:i] where the lower bound “lo” is inclusive but upper bound “i” is exclusive.

See demo code in

new languages lose limelight soon

New languages come into and go out of fashion, like hairstyle and cars.

  • Java has lost the lime light for 10 years but is still fairly popular due to job market and high-profile and “cool” companies like google.
  • C and C++ are among the top 5 longest-living languages. They are unpopular among the young, but they remain important.
  • c# is a curious case. I think the really important windows applications are still written in c++.
  • Many high-level languages go out of fashion, including C#, D, ..
  • Most scripting or dynamic languages go out of fashion


##G5 c++skills { Stroustrup chat

C++ has clear advantage in low-latency

Ranked from most specific to most vague. Another ranking criteria is market value.

  1. memory mgmt – for time and space efficiency
  2. generic techniques to use stack objects instead of heapy thingies
    • reduce reliance on dynamic containers
  3. virtual — reusable techniques to avoid virtual
  4. other compiler internals that impact latency
    1. knowledge of inline — to trade off latency against code size
  5. latency instrumentation tools
  6. safe c++ techniques, to reduce the chance of run-time errors. C++ is particular vulnerable as it is
    • more complex than C and others and
    • more close-to-hardware than Java and other languages.
  7. TMP — again to trade off latency against complexity. But this zbs domain is too big for me
  8. tool chain


vector internal array: always on heap; cleaned up via RAII

Across languages, vector is the #1 most common and useful container. Hashtable might be #2.

I used to feel that a vector (and hashtable) can grow its “backing array” without heap. There’s global memory and stack memory and perhaps more. Now I think that’s naive.

  • Global area is allocated at compile time. The array would be fixed in size.
  • For an array to be allocated on stack, the runtime must know its size. Once it’s allocated on that stack frame, it can’t grow beyond that stack frame.
  • In fact, any array can’t grow at all.

The vector grows its backing array by allocating a bigger array and copying the payload. That bigger array can’t be in global area because this type on-demand reallocation is not compile-time.

Anything on heap is accessed by pointer. Vector owns this hidden pointer. It has to delete the array on heap as in RAII. No choice.

Now the painful part — heap allocation is much slower than stack allocation. Static allocation has no runtime cost at all after program starts.

c++toolChain: archaic for new coders #%%advantage

The modern languages all feature dramatically simplified tool chain. In contrast, c++ tool chain feels daunting to me, including profilers, static analyzers, binary file dumpers, linkers .. Just like manual cars, too complicated compared to newer languages like java. The dynamic scripting languages are even simpler.

This is one of the real obstacles to new entrants, young or old. This is also my (slowly growing) competitive advantage. I feel some people (like Kevin of Macq) know more, but most developers have a cursory working knowledge in this field. Just as I prefer command line, I often feel more confident (relative to my peers) with c++ tool chain

This learning curve, entry barrier … is a direct consequence to the c++ “sweet spot” as Stroustrup described — inherently complex codebase close to hardware.

.. Due to the bigger ecosystem needed to support c++, new features are added at a slower pace than languages having a central organization.

— personal xp:

In my early days using c++, I tried to set up eclipse CDT and spent lots of time on the tool chain. My goal was to set up similar convenience … Futile. Not worthwhile. java tools are miles ahead. Most c++ programmers don’t bother with such convenience and rely on command line tools.

I wrote dozens of blogposts about c++ build issues. For example, on windows, my strawberryPerl + git_bash + notepad++ setup is unknown to many. These fellow developers struggle with MSVS or Eclipse !

I was frustrated for years by the complex and messy build tools in c++. Same for the other new entrants — Rahul spent a month setting up Eclipse CDT…

— mileage as an entry barrier:

This toolchain is an entry barrier. Young people can take it up if determined, but majority of them are put off.

Q: A non-compSci graduate on a mid-career boot camp can take up programming in a scripting language or a simpler compiled language, but how many percent would take up c++?

There’s a minimum mileage required. Many young people don’t stay on it long enough.

My vi mileage is also too short.

Similarly, yoga is hard for most people but some individuals are determined and therefore able to overcome the initial hump (6-24M) and find joy and reward.


kernels(+lang extensions)don’t use c++

Q: why kernels are usually written in C not c++? This is an underpinning of the value and longevity of the languages.

I asked Stroustrup. He clearly thinks c++ can do the job. As to why C still dominates, he cited a historical reason. Kernels were written long before C++ was invented.

Aha — I think there are conventions and standard interfaces (POSIX is one of them)… always in C.

I said “The common denominator among various languages is always a C API”. He said that’s also part of what he meant.

many modern languages rely@c++4heavy-lifting

Stroustrup said the jvm, tensorflow, javascript, … can be considered c++ applications. They make use of the c++ compiler.

The c++ compiler is more flexible, more complex, more powerful, more engineered. Those other languages’ compilers lack those advanced features so they leverage the c++ compiler.

I would not say “most modern languages” rely on c++ for heavy-lifting.

benchmark c++^newer languages

C++ can be 5x faster than java if both programs are well-tuned — A ball-park estimate given by Stroustrup.

The c++ code is often written like java code, using lots of pointers, virtual functions, no inline, perhaps with too many heap allocations (STL containers) rather than strictly-stack variables .

Many other benchmarks are similarly questionable. These new languages out there are usually OO and rely on GC + pointer indirection. If you translate their code to C++, the resulting c++ code would be horribly inefficient, not taking advantage of c++ compiler’s powers. An expert c++ developer would rewrite everything to avoid virtual functions and favor local variables and inline, and possibly use compile-time programming. The binary would usually become comparable in benchmark.  The c++ compiler is more sophisticated and have more optimization opportunities, so it usually produces faster code.

local_var=c++strength over GC languages

Stroustrup told me c++ code can use lots of local variables, whereas garbage collected languages put most objects on heap.

I hypothesized that whether I have 200 local variables in a function, or no local variable, the runtime cost of stack allocation is the same. He said it’s nanosec scale, basically free. In contrast, with heap objects, biggest cost is allocation. The deallocation is also costly.

Aha — At compile-time, compiler already knows how many bytes are needed for a given stack frame

insight — I think local variables don’t need pointers. GC languages rely heavily on “indirect” pointers. Since GC often relocates objects, the pointer content need to be translated to the current address of the target object. I believe this translation has to be done at run time. This is what I mean by “indirect” pointer.

insight — STL containers almost always use heap, so they are not strictly “local variables” in the memory sense

heap allocation: java Can beat c++

  • case 1 (standard java): you allocate heap memory. After you finish with it you wait for the java GC to clean it up.
  • case 2 (low latency java): you allocate heap memory but disable java GC. Either you hold on to all your objects, or you leave unreachable garbage orbiting the earth forever.
  • case 3 (c++): you allocate heap memory with the expectation of releasing it, so the compiler sets up housekeeping in advance for the anticipated delete(). This housekeeping overhead is somehow similar to try/catch before c++11 ‘noexcept’.

Stroustrup suggested that #2 will be faster than #3, but #3 is faster than #1. I said “But c++ can emulate the allocation as jvm does?” Stroustrup said C++ is not designed for that. I think he meant impractical/invalid. I have seen online posts about this “emulation” but I would trust Stroustrup more.

  • case 4 (C): C/c++ can sometimes use local variables to beat heap allocation. C programmers use rather few heap allocations, in my experience.

Note jvm or malloc are all userland allocators, not part of kernel and usually not using system calls. You can substitute your own malloc. top answer by Kanze is consistent with what Stroustrup told me.

  • zero dynamic allocation (Similar to Case 4) is always faster than even the fastest dynamic allocation.
  • jvm allocation (without the GC clean-up) can be 10 times faster than c++ allocation. Similar to Case 2^3
    • Q: Is there a free list in JVM allocator? Yes claims

  • c++ Custom allocators managing a pool of fixed-sized objects can beat jvm
  • jvm allocation often requires little more than one pointer addition, which is certainly faster than typical C heap allocation algorithms in malloc

append+maxInRangeK #Rahul

Q: given a sequence of integers, implement two operations

  1. Operation 1: append another integer. Size becomes N
  2. Operation 2: given two arbitrary subscripts into the sequence, (they define a sub-array of size K), find the max

Rahul saw discussions without any optimal solution. I think a simple vector-based algo can achieve amortized O(1) for Append and O(logK) for Max.

AuxDS required.

— idea 1: segment the sequence into fixed segments


new lang2challenge c++on efficiency@@

I asked Stroustrup — efficiency is the traditional strength of C and C++,  both memory efficiency and speed.. Is that still true? He immediately said yes.

I think it was clear in his mind that c/c++ were still the most efficient languages around. He did say Fortran is optimized for scientific computing.

I later asked him — any new language he watches out for. He said none, without real hesitation, no ifs or buts.

Recalling that conversation, I feel new languages are usually more high-level and easier to use. They are more likely to use heap to provide a “consistent interface” and avoid the complexities of a low-level language.

If I’m right, then these languages can’t and won’t optimize for efficiency as a top priority. Efficiency is possibly a 2nd priority.

A/Bob/C: cod`IV #vec for python list

Here are other variables names frequent needed in coding puzzles:

  • Once you declared a long, meaningful var name, you could use abbr of it afterwards.
  • if you have an array of composite objects like “frame” or “trade”, each represented by a inner array, then be careful with frame[1]. Better call it aFrame[1] or frames[0].
  • what about “sorted” which is a python function name:( …. I would use ascd by default.
  • B4 means “before”; Af means “after”
  • hm means hashtable; tm means treeMap or std::map
  • targetRepetition or tgtRep
  • “val” — payload data filed in a Node class. This name is short, easy to type. Harder to global-replace but LG in a short program
  • candd” — candidate
  • “incum” — incumbent i.e. the replaced element from a collection
  • “End” for the last element in a container
  • li means list … vec means vector … arr means array
    • In python, we need extra caution to use “vec” not “li” for the python list !
  • cnt means count or counter
  • — for a matrix
  • M, m or mat for the matrix
  • r as the 1st index; c as the 2nd index. Avoid x/y
  • height/rows as number of rows; width/cols as number of columns
  • I will refer to each element as a “cell”

—It’s more useful to have personal names than mere “Person A”. Shorter names are easier to write. Gender doesn’t matter.

  1. Alan, Alice, Aaron,
  2. Bob, Ben (Big Ben), Bill,
  3. Cody,Chris, Charlie
  4. Dan, David
  5. Ed, Emily, Elizabeth (queen)
  6. Fred
  7. Greg, Gail
  8. Henry, Harry,
  9. Ilya, Ivy, Ian
  10. Jen, Jim, Joe, Jack, John, Jason,
  11. Ken, Kate, Kevin,
  12. Lucy, Lee, Lilly, Larry,
  13. Mike, Mary, Matt,
  14. Nick, Nancy,
  15. Oliver, Oscar, …
  16. Pam, Peter, Paul, Peggy
  17. Quincy
  18. Raj, Rob, Ray,
  19. Sam, Sue,
  20. Tom, Ted, Tim,
  21. Uday ….
  22. Vic, Venkat
  23. Wendy, Will
  24. Xavier, Xander/Zander …
  25. Yixin, Yang ….
  26. Zofia, Zack,

when2introduce new tokens into O()

The extra tokens like A  B C in a O(A+BB+logC) are not always independent dimensions.

  • Case: DFT/BFT have O(V+E) but some people may say V <= E, so why not remove V and say O(E).

I think this estimate misses the point that E could greatly outnumber V. If another algo is O(V+logE) it would probably win.

  • case: radix sort is O(WN), but some may say W <=N, so why not remove W and say O(NN)

Well, W is typically log(unique count among N), though W can be half N.

Some bigO students don’t like too many tokens and may simplify it to O(KK), but I think it’s imprecise. If M or N is very small like logK, then O(MN) is much faster than O(KK).

Sunday night: if in the mood4localSys

Realistic scenario — I find myself in the mood for localSys learning on a Sunday night 11pm.

I think it’s better to sleep in office than to go home, but in Singapore, I had better go home and sleep, by taking taxi.

I think it’s better to work on the localSys till 2am (or later). Those 3 hours are precious engagement … Burning pleasure. I don’t get such 3-hour engagements in a week.

I used to feel “why not work on my QQ or coding practice now, and focus on work Monday morning?” It turns out that I’m usually less motivated on Monday morning, for whatever reasons.

Friday night is similar. I often find my highest appetite and absorbency on Friday nights (or eve of public holidays). So better go home late, sleep, then come back to office Saturday early morning to capture the mood.

##[19]Y WallStContract=%%best Arena #Grandpa

Competition arenas … we all CHOOSE the arena to compete in. It’s a choice, either implicit choice or explicit choice. I would say better to be conscious about this choice.

Not much new content in this blogpost. I feel very convinced to stick with WallSt contract market. Here is a ranking of the reasons why I consider it a rational decision, though our decisions are shaped by our deeply personal experiences and inherently irrational.

Beware of attachment !

  1. low stress, low expectation — my #1 reason as of 2019
  2. low-caliber competitors, mostly due to the “offputting” below
  3. age friendly
  4. I get to practice interviews and keep a precious burning-pleasure for a month each year on average.
    1. In contrast, If I were an ibank VP I would have multiple obstacles on that front.
  5. — other reasons to prefer Wall St contract
  6. higher probability of greenfield projects
  7. leverage on domain knowledge
  8. I can easily explain my job hopping profile
  9. ?? a number of firms to hop around

Now the downside, off-putting factors. ## Y most young dev shun contracts — Many bright or young competitors are put off by these factors, reducing the competition.

baseclass(template)use subclass field+!vptr

In my baseclass template is able to use a subclass field. This reusable technique is reusable in a very restrictive context —

  1. the base and sub classes represent market data messages, used in a reinterpret_cast context, with zero padding. Therefore, a proposed vptr would add a pointer and mess up reinterpret_cast.
  2. multiple (say 11) subclasses have a “qty” field, so I don’t want code duplication 11 times
  3. The order among the fields is fixed in struct definition. Without this restriction, I would define the “price” field in a base struct and also a getPrice() method. With a subclass instance, CRTP could probably work like static_cast<Subclass const*>(ptr)->getPrice() but the “price” field’s physical offset would be be dictated by the compiler not according to my struct definition

My technique uses CRTP but no SFINAE no enable_if.

My technique is easier to /internalize/, as it relies on simple overload resolution + simple type deduction. In contrast, the SFINAE technique used in my RTS codebase is off-putting and alien to me

wildcard matching #more tractable than regex_FB

Q (Leetcode44): Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for ‘?’ and ‘*’. The matching should cover the entire input string (not partial).

‘?’ Matches any single character.
‘*’ Matches any sequence of characters (including the empty sequence). The star is not a quantifier. I think a single star is like the “.*” in perl.

no space allowed. I think a wildcard can even sit at end of string. I think we could see two wildcards in a row. Two stars can be treated as one star.


My topdn-memoization solution (on github) was accepted at Leetcode.

I feel this is simpler than the problem, therefore easier to absorb. I think either DP or my original recursive solution should work.

Accounts merge #disjoint set

Q: .. too long

Union-find + BFT

— solution 1:
Data structure:

  • Acct object {name, vector of Email objects, isVisited flag }
  • Email object {addr, set of Accts }
  • hash table {address -> Email objects}

First scan to Create one acct object for each row. When populating the vector, check if the address exists in the hash table. If yes, then save the existing email object in the vector. Also the Email object’s set is updated with the new Acct object.

Second scan to print everything. Iterate over the accounts. Need to merge all the emails belong to John —

Scan the vector. If any Email  object has more than one Acct (named John), visit each Acct following BFT, until all John accounts are visited. Collect all of these accounts’ emails in one tree-set, then print them.


max-product subArray #51% untested

Q: Leetcode 152: given signed int array, find the subarray (minimum size 1) with highest product. Return the product.

Note if there’s 0 in your subarray you must return 0.

peek? Not yet

— solution 1: O(N)

One scan to identify all the zeros. They partition the array.  For each partition, we need algorithm-A:

Count neg elements. If even, then return product of entire array. Else

Accumulate from left until last neg element -> prodL.
Accumulate from right until last neg element -> prodR. Compare the two.

trivial edge case — if the zeros create only single-neg partitions, then zero is the best.

trivial edge case — If subarray has only one element, be it zero or negative, then that’s the best for that subarray.



U.S.startups: often less selective,lower caliber

U.S. startups may represent a sweet spot among my job choices. I think they are less selective than HFT or top tech shops. They attract fewer top geeks.

  • Workload – presumably comparable to ibanks like MS/Baml/Barc. I guess some startups are higher, but I guess fewer than 50%. Mithun described 3 startups but he has no firsthand experience.
  • Salary — Many startups pay on-par with ibanks, according to CYW
  • Leaves – possibly fewer than ibanks
  • Cowoker benchmark – possibly lower caliber
    • Respect – could be slightly better than in an ibank if you earn the respect

Some of the big tech shops are actually less selective than HFT – Amazon, Apple

big_gun + laser_gun on localSys

I like the imagery of laser gun and big gun

Backdrop — My brain power vs the brain power baked into the local system.

My laser focus is a blaster-gun that can break down the stonewall of opacity in the local system. This laser-gun doesn’t require “independence” but IFF I become independent early on, then I can make use of my weekend quiet hours and apply my laser. My weekend quiet hours are a real big gun in the local competition among the team members [1].

However, my quiet hours tend to get spent on research rather than localSys. See office-hour productivity#1 distraction==research 

I could also have an engagement honeymoon.

[1] My advantages and weaknesses are fairly obvious when compared to my colleagues.

Many of my peers seem to have that laser gun more often I had.

  • ==== my victorious war stories were mostly in brown field arenas. In each case I needed the same sustained focus for a few months
  • +ve xp: PWM error memos
  • xp: PWM AICE
  • +ve xp: RTS
  • -ve xp: at mvea I was not independent.
  • -ve xp of CSDoctor, but I won’t spend too much time on hearsay war stories.

in each treeNode, save d2root

My friend YH showed me a simple BFT algo where each parent node AA saves AA’s level+1 into the two children nodes AAL and AAR.

So every node remembers its own level from root 🙂

I think this technique can be used in pre-order DFT too. For post-order, it is possible but less natural, as shown in my github code subTreeStats.cpp.

Note pre|post-order can work B-trees not only binary trees.

If nodes are immutable, we can use a hashmap {node address -> level}

##IV Q:Name some big-impact projects U contributed to

Kyle pointed out that your change could be a 10-minute job with big impact, but only because there was already an infrastructure. You leverage the infrastructure to create the impact. For a story to be compelling, it has to show complexity and personal contributions

  1. NYSE integrated feed. Most of the financial data websites get nyse data from us
  2. Baml muni trade execution — rewrite from scratch
  3. billing/commission calc at PWM.
    1. eg: mortgage commissions from scratch
    2. eg: special investment (hedge funds) commission rewrite
  4. Citi muni re-offer engine. Biggest muni house in the U.S. Lots of retail investors buy muni bonds through their investment advisors. My changes are actually incremental.

bigO usable in Average | worst-case #CSY

bigO is about large input data SIZE or GROWTH, not about input data quality.

Therefore, both average-case and worst-case can use bigO notation. Multiple sources specify “average case O(N logN)” —

big-Oh means upper-bound. big-Theta means tight-bound. They refer to the growth of running time as input data grows, regardless of data quality.

always separate write^read traffic

Rahul pointed out my “simplistic” thinking. Now I feel there’s no good reason to create a web server to handle both read and write requests.

A Read server has a sizable data cache to service client requests. This cache gets updated ….

A Write server (“writer”) has no such data cache, but it might have an incoming request queue + a downstream queue.

The incoming queue introduces delay, but users who send updates often understand that writes take longer than read.

The downstream queue is relatively new to me, so here’s my hypothesis —

Say 100 writers all need to get their records persisted in a central data store. The infrastructure at the central data store is now a bottleneck, so the 100 writers send their records in a queue rather than wait indefinitely. The writers can then handle other incoming requests.

static horizontal sharding: drawbacks #Rahul

Eg of static horizontal sharding — by-symbol. NYSE, OPRA…

Rahul told me that outside finance sector, many companies (esp. west coast) are cautious about static sharding. Over time, One shard can become extremely hot while other shards’ resources stay underutilized.

Rahul said it’s a hassle to change static sharding config. Logistically challenging. Requires phased restarts.

As a result, many tech companies use static horizontal sharding very cautiously, only with valid reasons.

Dynamic sharding is a new concept to me. I think it’s like … based on load level, we could shift one “topic” between shards.

FizzBuzz in O(N)

Q (Leetcode ) Write a program that outputs the string representation of numbers from 1 to n. But for multiples of three it should output “Fizz” instead of the number and for the multiples of five output “Buzz”. For numbers which are multiples of both three and five output “FizzBuzz”.

What if the divisors in the requirement is not two but much more and each a prime number?


In the standard solution, there’s a hashtable O(NK) algo — for every int 1 to N, check the K divisors.  Here’s a O(N) solution:

  • first populate the return vector with the basic strings
  • Then make a strided iteration to append “Fizz” on every multiple of 3.
  • Then make a strided iteration to append “Buzz” on every multiple of 5.

The fact that prime numbers become very high very soon help our bigO:

N/3 + N/5 +… + N/31 + N/37… will be lower than 5N

16 Aug weekend drill

Target: up to 10 problems in total including 5 new ones.

— fully tested:

  2. — offending section ] sorted arr
  4. — Alien dictionary

— presumably solved

  4. — count squares in matrix
  5. — longest full path name
  6. — O(N) by quick-select

— pending

— reviewed

CIV: subOptimal $$valuable solutions#CSY

My friend Shanyou consistently holds a position like “I don’t need to find those better solutions (forget the “optimal”). If my solution is brute force or non-optimal, at least it works, so it will be good enough for many interviews and good enough for my coding drill. Many candidates can’t find even a brute force solution in the given time, so if I strengthen my capabilities to this level it’s valuable.

Admirable !

That’s inner strength — 定力
That’s much-needed focus
That’s undervalued realism
That’s celebration of every progress
That’s a pat on our own back

The all-or-nothing, single-minded pursuit of optimal solution can often wipe out my motivation, joy, satisfaction, self-esteem.

It’s a common and fatal de-motivation to tell ourselves (like XR … ) “There’s no value in pursuing non-optimal solution.” Well there ARE a few:

  1. pride and achievement from completion of a tough challenge
  2. ECT, speed
  3. syntax idioms
  4. self-mastery, internal locus of control, similar to my yoga practice
  5. absorbency — demonstrate to myself; self confidence in absorbency
  6. joy, self-satisfaction — see exactly what I want{cod`drill beside immediate IV #Rahul
  7. … Other blogs provide other perspectives

!! Many published solutions are non-optimal in O()
!! Most solutions out there are non-optimal in terms of concise implementation, even if matching the best solutions in big-O.

Some interviewers only care about big-O and don’t care about too many loops; Other interviews want an elegant, simple solution; The most extreme interviews (Indeed, FB?) score candidates on syntax and code quality too. Which one do you target? Realistically, I would take Shanyou’s target.

Q: your real goals, motivations for coding drill? Absorbency? Burning pleasure? Satisfaction (as Rahul puts it)? Maintaining a hobby?
So defocus can help. The focus on “optimal” creates negativity.

— Many non-optimal solutions are worthwhile.

  • Some of them offer insight into the structure, constraints… but may not offer enough insight to solve the problem
  • Some of them are non-optimal but optimal solutions to modified problems.
  • Some of them are non-optimal but break down the original problem into smaller, familiar problems. This process builds the xRef I need for thick->thin. The breakdown technique can be reusable.
  • Some of them offer reusable techniques
  • Some of them are non-optimal but very short and clever
  • Some of them (my homemade) are unique and innovative … worthwhile achievements.
  • Some of them are good enough for “2nd-tier” coding interviews


## halo knowledge to impress west coast interviews

My naive list of "halo" technical knowledge that might impress some west coast interviewers.

  • order stat tree
  • Morris tree walk – O(1) space
  • path compression in union-find
  • hash table worst case is no longer O(N) as in java8
  • shortest-path algorithms
  • Fibonacci and other advanced heaps — I discussed it at Facebook
  • tail recursion to reduce space complexity
  • worst case quick sort and randomized pivot selection
  • bucket sort
  • insertion sort as the fastest sort
  • radix sort for floats and strings

good@pushing myself

am good at pushing myself to the limit, sometimes beyond my limit, even if i may not be “good at” them in a competitive sense.

  • yoga
  • jogging
  • coding drill
  • camp out

coding drill is arguably the most interesting case study. Other people (including students) may complete more problems, but i probably spend more time

  • Ranked by number of problems, i’m just average
  • ranked by hours spent, I’m much higher
  • ranked by effort, I’m even higher, as I push myself harder than others

generate combinationSum compositions #backtrack up]trie #2review


Given a set of unique candidate numbers and a target number, find all unique combinations of candidates, where each combination sums to target. Each candidate may be used repeatedly.

My solution is , showing a reusable backtracking technique described below. It’s clever and brief. However, the efficiency is questionable. Memoization might be applicable here.

This backtracking relies on a key insight. Suppose we have target = 7 and X=2,Y=3,Z=4 as the candidates.

  • when we try a Y, we don’t need to try any more Xs. For example, If we are trying XY, then all XX* solutions are already handled by earlier recursive calls.
  • each combo sequence is naturally pre-sorted internally.
  • Also, “lower” formulas are generated before “higher” formulas
          x        y     z
      /   |  \
     x    y   z       
    / \   | \  \
 xxx  xxy | xyz \ 
         xyy     xzz
void //can return something if needed

recurs( solutionsFound &, //growing
        curPartialSolution &, 
// above collections could be global variables, to simplify things

        remainingCandidates, /*probably an immutable global array. 
If we need the remaining array to shrink, we can still rely on startIndex to skip used candidates.*/

        startIndex, //used to select from remaining candidates

Inside this function, we scan remaining candidates starting from startIndex. Typically in one iteration

  1. we add a new candidate into curPartialSolution
  2. we call recurs
  3. we remove the last added candidate from curPartialSolution to restore the original curPartialSolution — backtracking up the tree.
  4. move on to the next candidate

I feel this is more greedy than DP

what could keep up%%coding drill]SG

  • I think the west coast salary would continue to stay high for a few years
  • I may see more examples of older programmers (like me) getting nice offers on the west coast.
  • I am sure to see more friends (in their 30’s) getting nice offers on the west coast.
  • Even in Singapore, there are a few west coast shops including Google and facebook.
  • Rich startups are always attractive to me. I was told some of them use similar coding tests.

Above are the “evidence” that I would see over the next 24 months. But how would I react to them?

Fundamentally, I feel confident I can improve on the algo and coding tests, if I work hard and keep up the effort. Internal locus of control. Self-efficacy

Fundamentally, I always want to stay relevant to the tech economy and various job markets. I never want to be sidelined.

You mentioned the lack of personal time. Big challenge. Another challenge is new job where I must work extremely hard to survive. First X months I would focus on new job and kids. Once I have some free cycles, I will try to resume coding drill.

I don’t know how, but I hope to find a bit of time each month (not each week) working on some challenging questions, or at least review some old questions.

suffix array of a haystack #[[Pearls]]

[[ProgrammingPearls]] P172 is concise with a single-sentence definition — “Initialize an array of pointers to every character (or word) in your text, sort them, and you have a suffix array”

For an usage, see find All hiding places of needle]haystack #suffixArray

— LCP array + suffix array

The enhanced suffix array comes with additional tables that reproduce the full functionality of suffix trees. The basic suffix array is a construct simplified from a suffix tree.

  • Suffix table — saves the lexicographic rank of each suffix of a haystack.
  • LCP table —- Contains the maximum length of prefix match between two consecutive suffixes, after they are sorted (as in a dictionary) and stored in the suffix array.

Each array element describes one suffix.

Both are integer arrays of length N (i.e. haystack length). LCP saves some lengths; Suffix array saves head positions (these positions are defined within haystack string)

By definition, the LCP is always “helper” for the suffix array. In contrast, the suffix array can be useful by itself.

suffixArray: find All hid`places@ needle]haystack

Now I think suffix array is probably the best solution. Suffix array can be built in O(N) and can help any needle1, needle2, needle3 etc. This O(N) cost of pre-processing is small if searched repeatedly.

The Aha — this array is sorted by the string content. Therefore, all the “similar” suffixes club together.

Using the example in, if I want all matches for needle “AN” in haystack BANANA, using the suffix array, even a simple ( inefficient ) linear scan (over the array) would find AN, ANANA in a “cluster”. In general, O(logN) binary search is good.

This usage requires only suffix array, but many other usage scenarios require the helper LCP array.

Note the preprocessing is on the haystack not the needle.

Note building the suffix array is an one-time investment , for repeated usage. If used only once, then total cost is O(N) to build and O(logN) to use this array. I think it is comparable to, not worse than, brute-force.

concretize asterisk among brackets #Okao

Q (Leetcode 678): Given a string containing only three types of characters: ‘(‘, ‘)’ and ‘*’, write a function to check whether this string is valid.  An asterisk can concretize to empty string or a left or right bracket


Kyle reviewed an O(N) solution, but advanced. Not really a “Medium” question if you want optimal. However in an interview perhaps a less optimized solution is good enough?

Worst data set will point out the constraint and structure, as explained in CIV: we wish all corner cases+worst data sets r given and clarify_requirement means

Need to use spreadsheet to work out a good sample data.

I feel confident to crack it if I get a good sample including some worst data.

— published O(N) solution

One scan. At each position ask the question “How many extra openers can there be up to this char” When the answer is negative, we give up and return false, but usually the answer is s a range like 0-2.

Use two global variables to record the lowest answer, and highest possible answer.

  • When we hit a ‘(‘, lo++ and hi++
  • when we hit a ‘)’, lo– and hi–
  • when we hit a ‘*’, then range expands,
    • lo—. Lowest possible extra opener count is now one lower , since the asterisk can be a closer
    • high++. Highest possible extra opener count is now one higher , since the asterisk can be an opener

I think at end of the scan, lo should be zero, i.e. zero-extra-opener is at least a possibility.

(Possibly reusable) Insight — The Range of possible values as a simple data structure. Might be reusable whenever we are asked “Can this be ….?”

java off-heap memory #cf database,reinterpret_cast

  • justification — see “jGC” item
  • tradeoff — ?
  • java.nio.ByteBuffer — non-blocking IO api is very relevant. It has API to access memory-mapped files, among other things. More often, I think we probably use NIO to request a big chunk of memory from kernel, and bypass the java heap.
  • serialization — most (if not all) off-heap techniques require objects serialized and stored off-heap.
  • byte array — Unlikely object pool, off-heap memory solutions must serialize Trade objects into binary byte array [1] and stores it. It has to be transformed back to Trade objects before use. C would use reinterpet_cast() — no disk involved.
  • offloading jGC — With huge data moved off the heap, jGC is now much faster, and jitter-free. This is the main goal and justification of off-heap
  • bigger — off-heap storage can be bigger than on-heap storage.
  • slower — off-heap storage is slower (less immediate) than on-heap, but faster than disk or database[2].
  • performance boost — even though “slower”, it can boost performance of GC.
  • memory sharing — between two JVM’s is easier if using off-heap. I think memory-mapped file is one means.

[1] In this way, the NYSE XDP protocol could arguably be considered an off-heap serialization protocol.

[2] database is actually a competing alternative solution for some use cases like “shared-cache” or “large lukewarm cache”. I feel database solution is very proven, reliable but slower than off-heap.

Among the many disparate online articles about off-heap, here are my top 2 fav


4 competing domains to Support dev salary@WSt

In terms of supply/demand, there are systemic forces that support the relatively high developer salary in investment banks.

  • #1 pure tech shops — generate high profit, attract hot investment
  • #2 buy-side shops including mutual funds
  • #3 small startups — only in U.S. can they offer competitive salary
  • #4 exchanges and fintech shops including bbg, Reuters, S&P. Some (Miami, NYSE..) seem to generate high profits
  • #5 package software vendors

Few of these categories exist in Singapore.

hashtable resizing ≅ jGC jitter #unpredicatable

“Note because of the rehashing issue – a realtime applications and applications that need low latency- should not use a hash table as their data structure.” — said one guy on StackOverflow.

I would say if we have relative confidence about total data size (like below 1 million) then we can still use hash table. A benchmark test is not hard to set up, to compare against alternative designs such as RBTree. For large data sizes, I think hash table would win.

Note in latency sensitive systems, resizing (and stop-the-world GC) is likely to happen at busiest time.

Can we trigger a resize? Just send in more data to build up the table.

Can you trigger a pre-emptive GC? See can you force garbage collector to run@@


vtable also contains.. #class file

C++ is more complex than java. A typical vtable in c++ contains

  • offset of base type subobject. In multiple inheritance, this offset is often non-zero. This offset is needed not only for field access but also up-casting
  • typeid for RTTI

These details are part of the compiler ABI, since object files from older and newer compilers (of the same brand) could link together iFF they agree on these details.

Best-known part of ABI is name-mangling algorithm. This vtable detail would be the 2nd best-known ABI feature.

I believe the class file in java is one file per class. Therefore, vtable is something like the equivalent of a java class file.


demanding mental power: localSys imt QQ

I told grandpa that I have noticed signs that my mental power is declining, mostly in localSys and local codebase absorbency. However, localSys has always been a weakness in me. The decline, if any, is very gradual.

QQ learning is still in good progress. Experience, accumulation, deepening.. thick->thin.

Note A.Brooks’ article talks about creativity, innovation, analytical … My issue is mostly memory capacity and speed. Recall the Indeed interview.

PIP@Macq: tough judge@%%design

If I were the judge, then Kevin’s solution may get rejected or rated mediocre.

I think the judgement can be unreasonably tough when the judge herself is a practitioner — consider Yang and Sundip Jangi.

On the other hand,

  • Yang liked my OO design in EOS
  • Sundip liked my personalization design

The outcome (PIP etc) doesn’t mean my work (i.e. output) is sub-standard. The outcome has many reasons and causes.

I need to be fair and impartial to myself. [[learned optimism]] uses the three P’s. One of them is Personal.

represent trees with unbounded fanout

When I first read it in [[CLRS]], I didn’t know why trees exist with unbounded fanout, until I read about

Insight —- If we don’t know how many child nodes a parent can have, then it’s not easy to define TreeNode fields to hold children. If you use a vector or linked list within a TreeNode (perhaps a java developer), your space efficiency will be poor.

The only way I know is a left-child-right-sibling scheme — TreeNode A has a pointer to A’s first child. If A is also one of the children of TreeNode P, then A will also have a pointer to a nextSibling , possibly null.

insight —- reveals that this scheme uses a binary tree to represent a k-ary tree

This is one of the few usages of linked list to my knowledge.

touch not cross: path between 2 corners

Q1: from p 183 [[discrete math]]: given a n x n grid. Start from north west corner moving south or east each step, towards that corner. The diagonal connecting them can be touched from north, but not crossed. print all paths

easier to treat origin as [0,0] and end as [N,N]

DFT will require deep recursion.
BFT (with color) where each node remembers all paths-from-root? Kinda brute force

Insight — Actually this is not necessarily a graph problem though it can be solved that way.

Q2 (accepted@leetcode ): Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses. For example, given n = 3, a solution set is:

[ “((()))”,
“()()()” ]

Four related problems — These two problems are related to the abbreviation generator i.e. the combination generator.

However, the abbr/combo generators are more versatile and possibly overkill for this problem. These two problems can use a bit array to represent the output. I think my solution on github is probably considered inelegant but I don’t care. BigO insight —  number of paths (or valid strings) is O(N!) so any solution would not be any better than O(N!)

In general, our own ideas are often inefficient. If efficient, then often inelegant by some arbitrary interview standard. Still more valuable than learning standard solutions. One of the biggest values is xRef which helps build insight, intuition, thick->thin.

3reasons coding drill is insufficient for getting offers

Q: why do many students and professionals practice hundreds of coding questions and still fall below the bar?

— Reason: no similarity — many problems I got are not really similar to my past Leetcode questions. They are really new problem to me.

However, It’s possible that another candidate would say “Oh this is similar to problem #1012” because he has done 400% more problems.

Some employers avoid published questions all together. They may tweak their problems to make them original.

— Reason: hidden similarity — when interviewer poses a coding question, it may be actually similar to a practice problem we did last month, but can we spot the hidden similarity? The similarity is often well-hidden. To really see it, we often need training and insight into the structure and key constraints in a given problem.

Without the right key, we can get mislead into thinking it’s similar to the wrong problem. We all have war-stories where “somewhat similar” is very different from really similar.

To develop the insight into those structures, I find it necessary to compare past problems, and develop pattern-recognition. 读书从厚读到薄 or synthesis learning.

— Reason: forgetting — Without sufficient refresh, many candidates simply forget 50%+ of the past problems they practiced, esp. if they have done 200+. Unfortunately, knowing a few key points is often insufficient at high-end interviews like Facebook.

convert 512bit int to bit-vector #python=best

In coding questions, int-to-bit_vector conversion is common. I now think python offers a simple clean solution, applicable on integers of any size:)

'{0:032b} <-32|8-> {0:08b}'.format(myInt) #convert to a 32-element list, simultaneously to an 8-element
  • first zero refers to argument #0 (we have only one argument).
  • The zero after colon means zero-padding
  • The last ‘b’ means binary
  • Don’t forget your (dental) braces 🙂

Note this is the str.format() method, not the global bulitin function format()

Return value is a python string, but easily convertible to a vector

What if the number is too large? My {0:08b}  example shows that if the int is above 255, then python does the safe thing — extending the output beyond 8-element, rather than truncating 🙂

To convert bit-vector to an int, use int(bitArr , 2) # 2 is the base i.e. binary

data-heavy big webpage: 99%content is received incrementally

Rahul introduced me to this concept briefly, so here’s my own guess.

Context — soccer world cup page with live comments. Heavy updates. Jargons

  • record — a comment, with some additional data fields
  • pull — browser-initiated, can be periodically scheduled
  • push — server-initiated
  • server-side cache — for each userID (or IP), server remembers what records already sent
  • client-side cache

Initially when a user visits this “frontpage”, the web server has to deliver quite a lot of records. Heavy load on the server.

Thanks to pagination, browser can immediately show the 10 latest comments right away. As user scrolls down, browser would read the client-side cache and display 10 more comments, without a round trip.

If user scrolls fast, then browser may (?) need to make another request. In any case, server’s response continues to trickle in carrying the #20+, #30+, #40+ records in the background, asynchronously. In my jargon, this is not server push, but rather the slow response to a client-pull request.

When user hits Refresh button, client sends a new pull. A smart server remembers the most recent record id, so its response would only carry more recent records. Browser would receive the payload, merge it with the client-side cache and use AJAX to redraw the screen.

Even without user’s refresh, the server also can push (or a periodic client-pull) latest updates. I feel server-push is more efficient than a periodic client-pull, and preempts a “run” on the server.

Without auto updates, the page would become static and outdated, like web 1.0. When user hits Refresh, server would receive a brand-new pull request — heavy load.

best2subArrays show`max diff in sums #70%

Q (careercup): Given an array of N signed integers. Find two disjoint contiguous sub-arrays such that the absolute difference between the two sub-array sums is maximized.  The sub-arrays should not overlap. O(n^2) algorithm was not accepted.

eg: [2 -1 -2 1 -4 2 8] ans: (-1 -2 1 -4) (2 8), diff = 16

==== analysis
int-array problems are my relative weakness but I have done quite a few. I feel this is more about the algo rather the implementation

Without loss of generality, let’s assume the best result includes a min subarray on the left and a max subarray on the right.

–idea 1: a partition point at 5 splits the array to 0-4 and 5-end
For a given partition point, we can find the min subarray on the left and also max subarray on the right.

A O(N^2) solution can be designed based on this idea — at each partition point, O(N) search of min subarray sum on the left and max subarray sum on the right

—idea 2: 2 colliding pointers le and ri.
0-le contains a min subarray, and ri-end contains a max subarray. But how do I move the 2 pointers?

— idea 3: O(N) scan fwd to record at each position “max sum ending here” and “min sum ending here” .. two numbers. Ditto reverse scan. Total 4 numbers per tuple.

A subarray is never empty, as produced from my

End up with N-array of tuples like {a,b,c,d} . Then scan this array in O(N)

lone-wolf hidden ] AAABBBCCC unsorted #52%

Q[Lv] 60%: Given array of integers, every element appears three times except for one, which appears exactly once. Find that single one in O(1) time. Could you implement it without using extra memory?


peek? Not yet

well-defined problem:)
O(1) space probably means swapping
— Idea 1: with more space, I can use a hashtable of count to achieve O(N)
— Idea 2: with O(1) space I can also sort in O(N logN)
— Idea 9: My O(N) algo, applicable for any-size integers and also other than “three”.

Pick a random pivot and partition in O(N) time and O(1) space. Also keep track how many repetitions of the pivot value (probably 3). Exclude the pivot value, count size of both partitions and discard the one whose size=3X. Repeat.

N+N/2+N/4+N/8 …= 2N

3-sum #52%

Q[L] Given an array nums of K integers, are there elements a, b, c in nums such that a + b + c = 0? Find all unique triplets in the array which gives the sum of zero. Doable in O(KK)


— Solution 1: separate into (M) non-negatives and (N) negatives. M+N=K.

— solution 1a: given M targets and N negative items,

if M > N, then check O(NN) pairs among negatives …. and look up the sum in a pre-populated hash table of M items.
If M < N, then use the two-sum algo O(MN)

Total O(MN + MM) for small M
Total O(MN + NN) for small N

Both comes out to O(MN+max(M,N)), smaller than O( [N+M]^2 )

Note some bigO students don’t like too many tokens and may simplify it to O(KK), but I think it’s imprecise… see when to introduce new tokens in O()

— solution 1c: given M targets and N negative items,

I can sort the N-array. For each target, I can binary-search, but this is possibly O(MN logN)

max-sum path up+down binTree #FB

Q2 (Leetcode “hard” Q124): find max path sum, where a ” path ” has minimum one node A and can include A’s left child and/or A’s right child. No uplink available.


I see two types of paths

  1. down the tree, starting at some upstream node A ending at a node in A’s subtree
  2. up-down the tree, starting at some A’s left subtree, ending somewhere in A’s right subtree.

For 1) has my tested solution

For 2: post-order walk to update each node a (signed) max-path-sum-from-here. See The 2 values from left+right children can solve this sub-problem

Q: can we use the 2) solution for 1)? I think so