likability n unusual habits #Kyle

I felt simultaneously intrigued and enlightened by your comment (also echoed by previous colleagues) about my uncommon habits, behavior, mannerisms… When we discussed it, I decided not to ask for specific examples, but I assume these are relevant:

  • Wearing a helmet when walking in
  • One bare foot on the rolling cabinet when using the standing desk
  • Eating my food on a tray
  • Taking notes in every meeting with a post-it pad rather than a notebook
  • Unshaven face, or uncombed hair
  • Carrying a big camping backpack every day, as I described to you
  • Declining to attend any drinks party, except the last party before I resign

These habits often draw unwanted attention. Since my late 20’s, I have made it a point to try and minimize these behaviors, but only when I’m aware of them. The helmet effect was only revealed to me by your comment.

Q: are these “features” or “defects”? The conventional wisdom seems to say they are defects because they are disliked by the people around.

I actually like people with these “features”. Rough around the edges, these people are usually geeks, nerds or mad scientists, but occasionally a manager in some team. Due to these unusual habits, some of these individuals have a disarming demeanor, while others are uptight and stern. The common theme among them: unusual “features”.

In contrast, across my past workplaces and beyond my immediate teams, about half the coworkers are well-polished with no unusual mannerisms. A lot of the time I don’t feel very close to them, as if there’s a mask between us. Semiconsciously I feel they don’t let their hair down easily and are less spontaneous, not my type. A small subset of these coworkers can be described as cautious, even secretive or territorial. Subconsciously, I tend to assume they show a sense of insecurity and are too afraid of losing face.

Across my past workplaces and beyond my immediate teams, the coworkers I have liked most are almost always the less well-polished.

My father is the well-polished type (but very much a nerd); my mom and sister are mavericks. Each of them is liked by some people and disliked by others. I chose to follow my mother’s style, though I listen more to my father. Decades ago, I might have wanted to follow my father’s style, but I don’t have the skills and capabilities. I wrote a blog post on this many years ago.


invalid/unbalanced brackets: kernel #42%

Q: given a string of N left+right brackets, you can find out how many invalid chars to remove to make it balanced. Say it’s R. There are up to N-choose-R ways to make it balanced. Print all unique balanced strings in compact form.


subproblem: minimum how many invalid chars to remove

Useful preprocessing technique — on the left end, any closers must be removed. Same on the right end. No choice 🙂 We would end up with a valid char on each end. This is nice but optional in my Idea3 below.

— my idea 3 to count minimum cuts

Insight — There will surely be some “kernels” i.e. an opener-then-closer in a row. In the first scan I remove them, then remove the newly formed kernels, and so on. This is always optimal if we aim to minimize the cuts.

  • [] ] [ [] ] [ [] ] becomes ]
  • []][[] becomes ][
  • [] [ [] [] [] [] becomes [
  • [[[][]][[]] becomes [
  • [[][]]][[]]] becomes ]]

What remains is the positions of the bad chars. I need to remember these positions.

Case: closers only. Let’s look at one position like #55. We can cut #55 or another closer at an earlier position.

Case: openers only. Similar to the above.

Case: closers-openers. The original string is partitioned into exactly two sections, each similar to the above cases.
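The kernel-removal idea above can be sketched in Python. A single stack-style pass is equivalent to repeatedly deleting “[]” kernels: each closer cancels the nearest unmatched opener to its left. The function name and structure are my own illustration, not any tested solution referenced in this post.

```python
def min_removals(s):
    """Minimum chars to remove to balance a string of '[' and ']'.

    Equivalent to repeatedly removing "[]" kernels: one pass cancels
    each ']' against the nearest unmatched '[' on its left.
    """
    unmatched_openers = 0
    removals = 0
    for ch in s:
        if ch == '[':
            unmatched_openers += 1
        else:  # ']'
            if unmatched_openers:
                unmatched_openers -= 1  # a kernel is found and removed
            else:
                removals += 1           # a ']' with no opener: must be cut
    return removals + unmatched_openers  # leftover openers must be cut too

# Residues match the worked examples above:
assert min_removals("[]][[]") == 2        # residue "]["
assert min_removals("[[[][]][[]]") == 1   # residue "["
assert min_removals("[[][]]][[]]]") == 2  # residue "]]"
```

The count of removals equals the length of the residue string after all kernels are gone.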

job satisfaction^salary #CSDoctor

I feel CSDoctor has reasonable job satisfaction. If I were in his role my job satisfaction would be lower.

His financial domain is rather niche. Equity volatility data analysis, not a big infrastructure that needs a team to support. I guess it could be considered a glorified spreadsheet. I think the users are front office traders looking at his data to make strategic business decisions.

His technical domain is very niche. The team (4 people) made a conscious decision to use c# early on, and has not really changed to java. C# is shunned by West Coast firms and internet companies, including mobile, cloud and big-data companies. Even Wall St is increasing investment in javascript-based GUIs instead of c# GUIs.

I see very high career risk in such a role. What if I get kicked out? (Over 7 years the team did kick out at least 2 guys, last-in-first-out.) What if I don’t get promoted beyond VP and want to move out?

I won’t enjoy such a niche system. It limits career mobility. 路越走越窄 (the road gets narrower the further you walk).

This is one of the major job dissatisfactions in my view. Other dissatisfactions:

* long hours — I think he is not forced to work long hours. He decides to put in extra hours, perhaps to get promotion.

* stress — CSDoctor is motivated by the stress. I would not feel such high motivation, if put in his shoes.

* commute — I believe he has 1H+ commute. I feel he has very few choices in terms of jobs closer to home, because his domain is extremely niche.

y C++will live on #in infrastructure

I feel c++ will continue to dominate the “infrastructure” domains but application developer jobs will continue to shift towards modern languages.

Stroustrup was confident that the sheer volume of source code out there (he estimated billions of lines of c++) basically ensures the c++ compiler will still be needed 20 years out. I asked him “What competitors do you see in 20 years?”

I said C would surely survive and he dismissed it. Apparently, many of the hot new domains rely on c++. My examples below all fall under the “infrastructure” category.

  • mobile OS
  • new languages’ runtimes such as the dotnet CLR, JVM
  • google cloud
  • TensorFlow
  • AlphaGo
  • Jupyter for data science
  • Most deep learning base libraries are written in c++, probably for efficiency

c++toolchain=bigger than new languages #%%advantage

The modern languages all feature dramatically simplified tool chain. In contrast, c++ tool chain feels much bigger to me, including profilers, static analyzers, binary file dumpers, linkers ..

This is one of the real obstacles to new entrants, young or old. This is also my (growing) competitive advantage.

I was frustrated for years by the complex and messy build tools in c++. Same for the other new entrants.

This learning curve, this entry barrier … is a direct consequence of the c++ “sweet spot” as Stroustrup described it — inherently complex codebases close to hardware.

I wrote dozens of blogposts about c++ build issues. For example, on windows, my strawberryPerl + git_bash + notepad++ setup is unknown to many. These fellow developers struggle with MSVS or Eclipse!

Due to the bigger ecosystem needed to support c++, new features are added at a slower pace than in languages governed by a central organization.

many modern languages rely@c++4heavy-lifting

Stroustrup said the jvm, tensorflow, javascript, … can be considered c++ applications. They make use of the c++ compiler.

The c++ compiler is more flexible, more complex, more powerful, more engineered. Those other languages’ compilers lack those advanced features so they leverage the c++ compiler.

I would not say “most modern languages” rely on c++ for heavy-lifting.

local_var=c++strength over GC languages

Stroustrup told me c++ code can use lots of local variables, whereas garbage collected languages put most objects on heap.

I hypothesized that whether I have 200 local variables in a function, or no local variable, the runtime cost of stack allocation is the same. He said it’s nanosec scale, basically free. In contrast, with heap objects, biggest cost is allocation. The deallocation is also costly.

Aha — at compile-time, the compiler already knows how many bytes are needed for a given stack frame.

insight — I think local variables don’t need pointers. GC languages rely heavily on “indirect” pointers. Since GC often relocates objects, the pointer content needs to be translated to the current address of the target object. I believe this translation has to be done at run time. This is what I mean by an “indirect” pointer.


java allocation Can beat c++

  • case 1 (standard java): you allocate heap memory. After you finish with it you wait for the java GC to clean it up.
  • case 2 (low latency java): you allocate heap memory but disable java GC. Either you hold on to all your objects, or you leave unreachable garbage orbiting the earth forever.
  • case 3 (c++): you allocate heap memory with the expectation of releasing it, so the compiler sets up housekeeping in advance for the anticipated delete(). This housekeeping overhead is somehow similar to try/catch before c++11 ‘noexcept’.

Stroustrup suggested that #2 will be faster than #3, but #3 is faster than #1. I asked “But can c++ emulate allocation the way the jvm does?” Stroustrup said C++ is not designed for that. I have seen online posts about this “emulation” but I would trust Stroustrup more.

  • case 4 (C): C/c++ can sometimes use local variables to beat heap allocation. C programmers use rather few heap allocations, in my experience.

Note the jvm allocator and malloc are both userland allocators, not part of the kernel and usually not using system calls. You can substitute your own malloc. The top StackOverflow answer by Kanze is consistent with what Stroustrup told me:

  • no dynamic allocation is always faster than even the fastest dynamic allocation. Similar to Case 4
  • jvm allocation (without the GC clean-up) can be 10 times faster than c++ allocation. Similar to the Case 2 vs Case 3 claim
  • c++ custom allocators managing a pool of fixed-sized objects can beat the jvm
  • jvm allocation often requires little more than one pointer addition, which is certainly faster than typical C++ heap allocation algorithms
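To see why jvm-style allocation can cost “little more than one pointer addition”, here is a toy Python model of a bump-pointer allocator. The class and method names are my own illustration, not any real JVM API.

```python
class BumpAllocator:
    """Toy model of a JVM-style bump-pointer allocator: each allocation
    is one addition plus a bounds check. Purely illustrative."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.top = 0  # next free offset in the arena

    def alloc(self, nbytes):
        if self.top + nbytes > self.capacity:
            # a real JVM would trigger GC here instead of failing
            raise MemoryError("arena exhausted")
        addr = self.top
        self.top += nbytes  # the entire "algorithm": one pointer addition
        return addr

arena = BumpAllocator(1024)
a = arena.alloc(100)
b = arena.alloc(200)
assert (a, b) == (0, 100)   # allocations are contiguous
assert arena.top == 300
```

Deallocation of individual objects is impossible in this scheme, which is exactly why the GC (or Case 2’s “garbage orbiting forever”) is the other half of the story.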

new lang2challenge c++on efficiency@@

I asked Stroustrup — efficiency is the traditional strength of C and C++, both memory efficiency and speed. Is that still true? He immediately said yes.

I think it was clear in his mind that c/c++ were still the most efficient languages around. He did say Fortran is optimized for scientific computing.

I later asked him — any new language he watches out for. He said none, without real hesitation, no ifs or buts.

Recalling that conversation, I feel new languages are usually more high-level and easier to use. They are more likely to use heap to provide a “consistent interface” and avoid the complexities of a low-level language.

If I’m right, then these languages can’t and won’t optimize for efficiency as a top priority. Efficiency is possibly a 2nd priority.

func overloading: pro^con

Overloading is a common design tool in java and other languages, but more controversial in c++. As an interviewer, I once asked for “justification for using overloading as a design tool”. I kinda prefer printInt/printStr/printPtr… rather than overloading print(). I like explicit type differentiation, similar to the [[safe c++]] advice on asEnum()/asString() conversion functions.

Besides the key features listed below, the most common j4 (justification) is convenience/readability ….. a non-technical justification!

— ADL — is a key c++ feature to support smart overload resolution

— TMP — often relies heavily on function overloading

— optional parameters and default arguments — unsupported in java, so overloading is the alternative.

— visitor pattern — uses overloading.

— ctor and operator overloading — no choice. Unable to use differentiated function names

— C language — doesn’t support overloading. In a sense, overloading is non-essential.

Name mangling is a key ABI bridge from c++ to C

max-sum non-adjacent subSequence: signedIntArr#70%

Q: given a signed int array a[], find the best sub-sequence sum. Adjacent elements in a[] must not both show up. O(N) time and O(1) space.


— O(1) idea 2:

Two simple variables are needed: m_1 is m[cur-1] and m_2 is m[cur-2]. At each position, compare a[cur] + m_2 vs m_1. I have a self-tested DP solution.
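The two-variable DP described above can be sketched like this. This is a hypothetical re-implementation of the idea, not the self-tested solution the post refers to.

```python
def max_nonadjacent_sum(a):
    """O(N) time, O(1) space DP.

    m1 = best sum over a[:i]   (the post's m_1, i.e. m[cur-1])
    m2 = best sum over a[:i-1] (the post's m_2, i.e. m[cur-2])
    At each element, compare a[cur] + m2 vs m1.
    This version allows the empty subsequence, so an all-negative
    input returns 0.
    """
    m1 = m2 = 0
    for x in a:
        m1, m2 = max(x + m2, m1), m1
    return m1

assert max_nonadjacent_sum([3, 2, 5, 10, 7]) == 15  # 3 + 5 + 7
assert max_nonadjacent_sum([-1, -2]) == 0           # take nothing
```

The simultaneous tuple assignment keeps both variables one step behind without any array.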

[19] localSys proficiency]1st 4M #Macq

Arguably my #1 weakness is the initial learning curve on localSys. For that reason, my figure-things-out speed would be slower than the other guys’.

Due to my superior interview performance, I “sometimes” show an underwhelming learning curve, depending on the other new hires. In some teams like RTS, Stirt, … some new hires were slower, less motivated, or more distracted.

Camp-out — is my advantage, provided I have an engagement honeymoon.

self-independent learning — Ashish mentioned it many times. Not sure how realistic. If I can achieve it, I would have a big-gun, as described in big_gun + laser_gun on localSys

respect — Ashish pointed out independent discovery would win me respect from team members. I can recall many experiences to support it.

burning pleasure

virtuous circle — the respect would spur me on.

annual leaves — It’s OK to take some leaves, but if I can spend more time at work in first 3-6 months I have a higher chance of creating the virtuous circle.

Q: At Macq, was the initial 6M ramp-up effective for localSys, GTD?
A: It did help prevent the Stirt mistake. Key is capture of absorbency and burning pleasure, in the engagement honeymoon. In 2015 I didn’t have a lot to capture 😦
A: Consider set-up-to-fail . I would say even with a quick ramp-up, I would still fall below the bar.
A: Now I feel 6M is hard to maintain. A realistic honeymoon is 4M, but sometimes it lasts longer.

big_gun + laser_gun on localSys

I like the imagery

Backdrop — My brain power vs the brain power baked into the local system.

My laser focus is a blaster-gun that can break down the stonewall of opacity in the local system. This laser-gun doesn’t require “independence” but IFF I become independent early on, then I can make use of my weekend quiet hours and apply my laser. My weekend quiet hours are a real big gun in the local competition in the team [1].

I could also have an engagement honeymoon.

[1] My advantages and weaknesses are fairly obvious when compared to my colleagues.

Many of my peers seem to have that laser gun more often than I have.

  • ==== my victorious war stories were mostly in brownfield arenas. In each case I needed the same sustained focus for a few months
  • +ve xp: PWM error memos
  • xp: PWM AICE
  • +ve xp: RTS
  • -ve xp: at mvea I was not independent.
  • -ve xp of CSDoctor, but I won’t spend too much time on hearsay war stories.

##skillist to keep brain active+healthy

— generally I prefer low-churn (but not lethargic) domains to keep my brain reasonably loaded:
[as] classic data structures + algorithms — anti-aging
[s] C/C++
c++ build using the tool chain, and troubleshooting using instrumentation
c++ TMP
[s] memory-mgmt
[s] socket
[s] mkt data
[s] bond math
basic http?
[a = favors accumulation]
[s = favors slow-changing domains]
— Avoid churn
  • jxee,
  • c#
  • scripting,
  • GUI
— Avoid white-hot domains popular with young bright guys … too competitive. But if you can cope with the competition, it could keep your brain young.
  • quant
  • cloud? but ask Zhao Bin
  • big data
  • machine learning

bigO usable in average|worst-case #CSY

bigO is about large input data SIZE or GROWTH, not about input data quality.

Therefore, both average-case and worst-case analyses can use bigO notation. Multiple sources specify “average case O(N logN)”.

big-Oh means upper-bound. big-Theta means tight-bound. They refer to the growth of running time as input data grows, regardless of data quality.

static horizontal sharding: drawbacks #Rahul

Eg of static horizontal sharding — by-symbol. NYSE, OPRA…

Rahul told me that outside the finance sector, many companies (esp. west coast) are cautious about static sharding. Over time, one shard can become extremely hot while other shards’ resources stay underutilized.

Rahul said it’s a hassle to change static sharding config. Logistically challenging. Requires phased restarts.

As a result, many tech companies use static horizontal sharding very cautiously, only with valid reasons.

Dynamic sharding is a new concept to me. I think it works like this: based on load level, we could shift one “topic” between shards.
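A minimal sketch of what topic-based dynamic sharding might look like, purely my guess at the concept. All class, method and shard names here are made up for illustration.

```python
class DynamicSharder:
    """Toy sketch of load-based rebalancing: a mutable topic->shard map,
    unlike static (e.g. by-symbol) sharding. Illustrative names only."""

    def __init__(self, shards):
        self.assignment = {}               # topic -> shard
        self.load = {s: 0 for s in shards} # messages routed per shard

    def route(self, topic, msg_count=1):
        # new topics land on the currently least-loaded shard
        shard = self.assignment.setdefault(
            topic, min(self.load, key=self.load.get))
        self.load[shard] += msg_count
        return shard

    def rebalance(self, topic):
        """Move one hot topic to the currently least-loaded shard."""
        old = self.assignment[topic]
        new = min(self.load, key=self.load.get)
        self.assignment[topic] = new
        return old, new

sh = DynamicSharder(["shard-A", "shard-B"])
sh.route("AMZN", 1000)            # hot topic lands on one shard
sh.route("IBM", 5)
old, new = sh.rebalance("AMZN")   # shift the hot topic off its shard
assert old != new
assert sh.assignment["AMZN"] == new
```

The contrast with static sharding is that the assignment map is mutable at run time, so no phased restart is needed to move a topic.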

prepare FB onsite: reasonable pace

Hi XR,
It could be useful to have a target number of problems for each weekend, but only if the negative stress is manageable.
Q: difficult (medium or hard rating) leetcode problems?
A: some “easy” problems are very challenging and enriching.
A: I feel I should not take on too many problems that feel off-putting. They tend to slow me down, and deflate my motivation.
However, if a problem is a classic and offers valuable insights, then I should probably try it out, then read the solution.

There are too many problems on Leetcode and elsewhere. I have done 100+ but can’t do 200 more before my onsite interview. I think I need to pace myself and accept that I won’t always get familiar problems.

Q: should I read solutions on some of the medium/hard leetcode problems after trying for a while without breakthrough?
A: Now I think I should, but I am reluctant to.
Q: focus on coding or pure algo?
A: mostly pure algo… aiming to get the gist of each problem. Implement if fun and simple, but I often get drawn into the implementation and waste my time.
Q: should I consider taking some half-day leaves when I find myself in the mood for coding drill?
A: yes, after discussing with manager.
Q: I used to spend 1 full day on one problem. Of course the best solutions only use 100 lines and will take me 1 hour to understand. Should I avoid that?
A: I think I should, given my limited time.

Reminders —

  • peek11 and “fuxi” tags
  • oq tags in WP
Goal — improve on my Indeed onsite performance.

merge 2 binTrees by node position

Q (leetcode 617): merge two binary trees by node position; wherever two nodes overlap, sum their values into one node.

==== Analysis — I have a solution fully tested online but hard to run offline.

Insight — the challenge is implementation. The input and return types of the DFT function are tricky but serve as useful implementation techniques.

Labelled as easy, but pretty hard for me.

— Idea 1: BFT. When a node has any null child, put null into queue. Not so simple to pair up the two iterations

— Solution 2: DFT on both trees. Always move down in lock steps. When I notice a node in Tree A is missing child link that Tree B has, then I need to suspend the Tree A DFT?

My DFT function would have two parameters nodeInA and nodeInB. One of them could be null , but the null handling is messy.

Aha — I set the input parameters to dummy objects, to avoid the excessive null checks. In this problem, this technique is not absolutely necessary, but it is very useful in general.
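A minimal Python sketch of Solution 2, the lock-step DFT. For brevity this version handles a missing subtree with an early None check rather than the dummy-object technique discussed above; the Node class is my own scaffolding.

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def merge(a, b):
    """DFT on both trees in lock step. When one side's subtree is
    missing, the other side's subtree is grafted in unchanged."""
    if a is None:
        return b
    if b is None:
        return a
    a.val += b.val             # reuse tree A as the output tree
    a.left = merge(a.left, b.left)
    a.right = merge(a.right, b.right)
    return a

# The standard leetcode 617 example:
t1 = Node(1, Node(3, Node(5)), Node(2))
t2 = Node(2, Node(1, None, Node(4)), Node(3, None, Node(7)))
m = merge(t1, t2)
assert (m.val, m.left.val, m.right.val) == (3, 4, 5)
assert (m.left.left.val, m.left.right.val, m.right.right.val) == (5, 4, 7)
```

The dummy-object alternative would replace the two None checks with sentinel nodes, trading a small setup cost for simpler recursion bodies.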


16 Aug weekend drill

Target: up to 10 problems in total including 5 new ones.

— fully tested:

  • offending section ] sorted arr
  • Alien dictionary

— presumably solved

  • count squares in matrix
  • longest full path name
  • O(N) by quick-select

— pending

— reviewed

standard SQL to support pagination #Indeed

Context — In the Indeed SDI, each company page shows the latest “reviews” or “articles” submitted by members. When you scroll down (or hit Next10 button) you will see the next most recent articles.

Q: What standard SQL query can support pagination? Suppose each record is an “article” and page size is 10 articles.

I will assume each article has an auto-increment id (easier than a unique timestamp) maintained in the database table. This id enables the “seek” method. The first page (usually the latest 10 articles) is sent to the browser. Then the “fetch-next” command from the browser would contain the last id fetched. When this command hits the server, should we return the next 10 articles after that id (AA), or (BB) should we check the latest articles again and skip the first 10? I prefer AA. BB can wait until the user refreshes the web page.

The SQL-2008 industry standard supports both (XX) the top-N feature and (YY) the offset feature, but for several reasons [1], only XX is recommended:

select * from Articles where id < :lastFetchedId order by id desc fetch first 10 rows only

[1] The “seek” method is superior to the “offset” method. The BB scenario above is one confusing scenario affecting the offset method. Performance is also problematic when the offset value is high: fetching the 900,000th page is roughly 900,000 times slower than fetching the first page.
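A runnable illustration of the seek method using sqlite3, which spells top-N as LIMIT rather than the standard FETCH FIRST. The table and column names are illustrative, not from the original SDI.

```python
import sqlite3

# in-memory demo table with auto-increment ids, newest article = highest id
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Articles "
           "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)")
db.executemany("INSERT INTO Articles (body) VALUES (?)",
               [(f"article {i}",) for i in range(1, 26)])  # ids 1..25

def fetch_page(last_fetched_id, page_size=10):
    """Seek method (scenario AA): first call passes a sentinel larger
    than any id; later calls pass the smallest id of the previous page."""
    rows = db.execute(
        "SELECT id FROM Articles WHERE id < ? ORDER BY id DESC LIMIT ?",
        (last_fetched_id, page_size)).fetchall()
    return [r[0] for r in rows]

page1 = fetch_page(10**9)
page2 = fetch_page(page1[-1])
assert page1 == list(range(25, 15, -1))  # latest 10 articles
assert page2 == list(range(15, 5, -1))   # next 10, no overlap with page1
```

Because the WHERE clause seeks directly on the indexed id, page 900,000 costs the same index probe as page 1, unlike the offset method.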

3 ways to expire cached items

server-push update ^ TTL ^ conditional-GET # write-through is not cache expiration

Few online articles list these solutions explicitly. Some of these are simple concepts but fundamental to DB tuning and app tuning. One article compares write-through ^ write-behind ^ refresh-ahead. I think refresh-ahead is similar to TTL.

B) cache-invalidation — some “events” would trigger an invalidation. Without invalidation, a cache item would live forever with an infinite TTL, like the list of China provinces.

After cache proxies get the invalidation message in a small payload (bandwidth-friendly), the proxies discard the outdated item, and can decide when to request an update. The request may be skipped completely if the item is no longer needed.

B2) cache-update by server push — IFF bandwidth is available, server can send not only a tiny invalidation message, but also the new cache content.

IFF combined with TTL, or with reliability added, then multicast can be used to deliver cache updates, as explained in my other blogposts.

T) TTL — more common. Each “cache item” embeds a time-to-live data field a.k.a. an expiry timestamp. The http cookie is the prime example.

In Coherence, it’s possible for the cache proxy to pre-emptively request an update on an expired item. This would reduce latency but requires a multi-threaded cache proxy.

G) conditional-GET in HTTP is a proven industrial strength solution described in my 2005 book [[computer networking]]. The cache proxy always sends a GET to the database but with a If-modified-since header. This reduces unnecessary database load and network load.

W) write-behind (asynchronous) or write-through — in some contexts, the cache proxy is not only handling Reads but also Writes. So the Read requests will read or add to the cache, and Write requests will update both the cache proxy and the master data store. Drawback — in a distributed topology, updates from other sources are not visible to “me” the cache proxy, so I still rely on one of the other 3 means.

| scenario | TTL | eager server-push | conditional-GET |
|---|---|---|---|
| frequent query, infrequent updates | efficient | efficient | frequent but tiny requests between DB and cache proxy |
| latency important | OK | lowest latency | slower lazy fetch, though efficient |
| infrequent query | good | wastes DB/proxy/NW resources as “push” is unnecessary | efficient on DB/proxy/NW |
| frequent update | unsuitable | high load on DB/proxy/NW | efficient conflation |
| frequent update + query | unsuitable | can be wasteful | perhaps most efficient |
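A minimal sketch of approach T, the TTL cache. All names are illustrative, and the clock is injected so expiry can be simulated without sleeping.

```python
import time

class TtlCache:
    """Minimal TTL cache: each item carries an expiry timestamp,
    like an http cookie. Purely illustrative."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl, self.clock = ttl_seconds, clock
        self.items = {}  # key -> (value, expiry_timestamp)

    def put(self, key, value):
        self.items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        value, expiry = self.items.get(key, (None, 0))
        if self.clock() >= expiry:
            self.items.pop(key, None)  # expired: caller must re-fetch
            return None
        return value

now = [100.0]  # fake clock, advanced manually
cache = TtlCache(ttl_seconds=30, clock=lambda: now[0])
cache.put("provinces", ["Guangdong", "Fujian"])
assert cache.get("provinces") == ["Guangdong", "Fujian"]
now[0] += 31                        # simulated time passes the expiry
assert cache.get("provinces") is None
```

A pre-emptive proxy (as in the Coherence example above) would instead refresh the item slightly before the expiry timestamp, hiding the re-fetch latency from callers.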


clean language==abstracted away{hardware

Nowadays I use the word “clean” language more and more.

Classic example — java is cleaner than c# and c++. I told a TowerResearch interviewer that java is easier to reason with. Fewer surprises and inconsistencies.

The hardware is not as “consistent” as we would wish. Those languages closer to the hardware (and the programmers of those languages) must deal with the “dirty” realities. To achieve consistency, many modern languages make heavy use of the heap and smart pointers.

For consistency, Everything is treated as a heapy thingy, including functions, methods, metadata about types …

For consistency, member functions are sometimes treated as data members.

central data-store updater in write-heavy system

I don’t know how often we encounter this stringent requirement —

Soccer world cup final, or big news about Amazon … millions of users post comments on a web page, and all comments need to be persisted and shown on some screen.

Rahul and I discussed some simple design. At the center is a single central data store.

  • In this logical view, all the comments are available at one place to support queries by region, rating, keyword etc.
  • In the physical implementation, we could use multiple files, shared-memory, or a distributed cache.

Since the comments come in a burst, this data store becomes the bottleneck. Rahul said there are two unrelated responsibilities on the data store updaters. (A cluster of updaters might be possible.)

  1. immediately broadcast each comment to multiple front-end read-servers
  2. send an async request to some other machine that can store the data records. Alternatively, wait to collect enough records and write to the data store in a batch

Each read-server has a huge cache holding all the comments. The server receives the broadcast and updates its cache, and uses this cache to service client requests.
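The two responsibilities could be sketched like this. Everything here (class names, the batch size, plain lists standing in for read-server caches and the durable store) is hypothetical scaffolding around the design Rahul described.

```python
from collections import deque

class CentralUpdater:
    """Sketch of the two responsibilities: (1) broadcast each comment
    to read-server caches immediately, (2) batch writes to the durable
    store. All names are illustrative."""

    def __init__(self, read_servers, store, batch_size=3):
        self.read_servers, self.store = read_servers, store
        self.batch, self.batch_size = deque(), batch_size

    def on_comment(self, comment):
        for cache in self.read_servers:   # responsibility 1: fan-out
            cache.append(comment)
        self.batch.append(comment)        # responsibility 2: batch up
        if len(self.batch) >= self.batch_size:
            # stand-in for an async request to the storage machine
            self.store.extend(self.batch)
            self.batch.clear()

caches, store = [[], []], []
u = CentralUpdater(caches, store, batch_size=2)
for c in ["goal!", "VAR check", "red card"]:
    u.on_comment(c)
assert caches[0] == caches[1] == ["goal!", "VAR check", "red card"]
assert store == ["goal!", "VAR check"]   # third comment still batched
```

The key property is that reads are served from the eagerly updated caches, while the durable store absorbs the burst in batches.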

find any black-corner subMatrix #52%

Q: given a black/white matrix, find any rectangle whose all four corners are black.
Q2: list all of them
Q3 (google): find the largest

— idea 2: record all black cell locations and look out for 3-corner groups

a Rec class with {northwest corner, northeast corner, southwest corner}

First pass: for each pair on the same row, create a pair object and save it in a hashmap {northwest cell -> list of x-coordinates to its right}. We will iterate the list later on.

2nd pass: scan each column. For each pair on the same col, say cells A and C, use A to look up the list of northeast corners in the hashmap. Each northeast corner B would give us a 3-corner group. For every 3-corner group, check if the fourth corner exists.

— idea 1: row by row scan.
R rows and C columns. Assuming C < R i.e. slender matrix

For the first row, record each ascending pair [A.x, B.x] (no need to save the y coordinate) in a big hashset. If S black cells on this row, then O(SS).

In next row, For each new pair, probe the hashset. If hit, then we have a rectangle, otherwise add to the hashset. If T black cells on this row, then O(TT) probes and (if no lucky break) O(TT) inserts.

Note one single hashset is enough. Any pair matching an earlier pair identifies a rectangle. The matching would never occur on the same row 🙂 Optionally, We can associate a y-coordinate to each record, to enable easy calculation of area.

After all rows are processed, if no rectangle, we have O(SS+TT+…). Worst case the hashset can hold C(C-1)/2 pairs, so we are bound by O(CC). We also need to visit every cell, in O(CR)

If C > R, then we should process column-by-column, rather than row-by-row

Therefore, we are bound by O( min(CC,RR) + CR). Now min(CC,RR) < CR, so we are bound by O(CR) .. the theoretical limit.
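Idea 1 can be sketched in Python, detection only, returning as soon as any black-corner rectangle exists. The function name and the 1/0 encoding are my own choices.

```python
def has_black_rectangle(matrix):
    """Row-by-row scan: record each pair of black columns seen on one
    row; a repeat of the same column pair on a later row yields the
    four corners. 1 = black, 0 = white."""
    seen_pairs = set()
    for row in matrix:
        cols = [c for c, cell in enumerate(row) if cell == 1]
        for i in range(len(cols)):
            for j in range(i + 1, len(cols)):
                pair = (cols[i], cols[j])
                if pair in seen_pairs:
                    return True   # same column pair on an earlier row
                seen_pairs.add(pair)
    return False

assert has_black_rectangle([[1, 0, 1],
                            [0, 1, 0],
                            [1, 0, 1]])       # corners on rows 0 and 2
assert not has_black_rectangle([[1, 1, 0],
                                [0, 1, 1]])
```

As the analysis above notes, a matching pair can never come from the same row, so one shared hashset is enough; to report coordinates, store the row index alongside each pair.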

— idea 4: for each diagonal pair found, probe the other corners
If there are H black cells, we have up to HH pairs to check 😦 In each check, the success rate is too low.

— Idea 5: Brute force — typewriter-scan. For each black cell, treat it as a top-left, and look rightward for a top-right. If found, then scan down for a pair of bottom-left/bottom-right. If no then complete the given row and discard the row.

For a space/time trade-off,

  • once I find a black cell on current row, I put it in a “horizontal” vector.

WallSt=kind to older guys like me@@

I would say Wall St is open to older techies.

Q: I sometimes feel WallSt hiring managers are kind to older techies like me, but really?
A: I feel WallSt hiring managers are generally a greedy species but there are some undercurrents :

  • 🙂 traditionally, they have been open to older techies who are somewhat less ambitious, less driven to move up, less energetic, less willing to make personal sacrifices
  • 🙂 U.S. hiring managers may avoid bright young candidates #Alan
  • 🙂 some older guys do perform well above expectation
  • 😦 if an older guy needs to be cut, many hiring managers won’t hesitate.

Overall, I think Wall St hiring managers are open to older guys but not sympathetic or merciful. They are profit-driven, not compassionate. The fact that I am so welcome on Wall St is mostly due to my java/c++ QQ, not anyone’s kindness.

I thank God. I don’t need to thank Wall St.

advantage@old_techie: less to lose]race

Imagine on a new job you struggle to clear the bar

  • in figure-things-out speed,
  • in ramp-up speed,
  • in independent code-reading…
  • in delivery speed
  • in absorbency,
  • in IQ

You would feel terrified, broken, downcast, desperate .. as a 30-something, since you are supposed to be at your prime in terms of capacities. In contrast, an older techie like me starts the race at a lower level of expectation [1] and has nothing to prove, so I can compete with less emotional baggage. A related advantage — older techies have gone through a lot, so we have some wisdom. However, we could fall into the trap of 刻舟求剑 (carving the boat to mark where the sword fell, i.e. clinging to outdated reference points). [1] the WallSt contract market

brain aging: 4external inputs

Context — professional programmer career till my 70’s. The #1 derailer is not physical health but my eventual decline of “brain power” including …

— Josh felt that the harmful stress in his job was worse in his junior years when he didn’t know the “big picture”. Now he feels much better because he knows the full context. I said “You are confident you can hold your end of the log. Earlier you didn’t know if you were good enough.”

— Grandpa gave the Marx example — in between intense research and writing, Marx would solve math problems to relax the brain. I said “I switch between algo problem solving and QQ knowledge”

— Alex V of MS — Ask yourself
Q: compare to the young grads, what job function, what problems can you handle better? My mental picture of myself competing against the young guys is biased against my (valuable) battlefield experience. Such experience is discounted to almost $zero in that mental picture!

— Sudhir
Mental gymnastics is good, like board games, coding practice and Marx’s math practice, but all of these are secondary to (hold your breath) … physical workout! I personally enjoy outdoor exercise more.

Also important is sleep. I think CSDoctor and grandpa are affected.

Sudhir hinted that lack of time affects sleep, workout and personal learning.

  • Me: I see physical exercise and sleep as fundamental “protections” of my brain. You also pointed out that when we reach home we often feel exhausted. I wonder if a shorter commute would help create more time for sleep/workout and self-study. If yes, then is commute a brain-health factor?
  • Sudhir: Absolutely, shorter commutes are always better, even if that means we can only afford smaller accommodation. Or look for a position that allows working remotely some of the time.

Sudhir also felt (due to a current negative experience) that an encouraging team environment is crucial to brain health. He said mental stress is necessary, but fear is harmful. I responded “A startup might be better”.

freelist in pre-allocated object pool #Deepak

Deepak CM described a pre-allocated free-list used in his telecom system. I have a tested implementation.

He said the system initialization makes 400,000 new() calls to allocate 400,000 dummy objects and put them into a linked list. Personally, my design /emulates/ it with a single malloc() call. This is at startup time.

During the day, each new msg will overwrite [1] a linked node retrieved at the Head of the slist.

[1] using operator=(). Placement-new would be needed if we use a single malloc()

Every release() will link-in the node at the Tail of the slist. Can we link it in at the Head? I think so. Benefit — it would leave a large chunk of contiguous free space near the tail. Reduced fragmentation.

Illustration — Initially the physical addresses in the slist are likely consecutive like addr1 -> addr2 -> addr3…. After some release() calls, it would look like a random sequence.

Using return-to-Head, I would get

  • pop, pop, pop: 4->5->..
  • rel 2: 2->4->5..
  • pop: 4->5->….
  • rel 1: 1->4->5…

— The API usage:

  • void ffree(void *)
  • void * fmalloc(size_t), possibly used by placement new
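A Python emulation of this pool with the return-to-Head policy. The real thing is C/C++ with one big malloc() and placement-new, so this sketch only models the free-list bookkeeping; slot indices stand in for addresses.

```python
class FixedPool:
    """Emulation of the pre-allocated pool: all slots are created up
    front; fmalloc pops from the Head of the free list and ffree links
    back in at the Head (the return-to-Head policy discussed above)."""

    def __init__(self, n_slots):
        self.slots = [None] * n_slots            # stand-in for one big malloc()
        self.free_list = list(range(n_slots))    # slot indices; Head at the end

    def fmalloc(self):
        if not self.free_list:
            raise MemoryError("pool exhausted")
        return self.free_list.pop()              # O(1) pop from the Head

    def ffree(self, idx):
        self.slots[idx] = None                   # forget the payload
        self.free_list.append(idx)               # O(1) link-in at the Head

pool = FixedPool(4)
a, b = pool.fmalloc(), pool.fmalloc()
pool.ffree(a)
assert pool.fmalloc() == a   # most recently freed slot is reused first
```

Return-to-Head makes the allocator LIFO, which is also cache-friendly: the most recently released slot, still warm in cache, is handed out next.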

longest run@same char,allow`K replacements #70%

Q: Given a string s that consists of only uppercase English letters, you can perform at most k operations on that string. In one operation, you can choose any character of the string and change it to any other character. Find the length of the longest sub-string containing all repeating letters you can get after performing the above operations.

Here’s my description of the same problem:

Q: Suppose we stand by a highway and watch cars of each color. Only 26 possible colors. Cars pass fast, so sometimes we miscount.

My son says “I saw 11 red cars in a row in the fast lane”.
My daughter says “I saw 22 blue cars in a row in the middle lane”
We allow kids to miss up to 3 cars in their answer. In other words, my son may have seen only 8, 9 or 10 red cars in a row.

When we review the traffic video footage of N cars in a single lane, determine the max X cars in a row of the same color, allowing k mistakes. K < N.
Suppose k is 3

— solution 1: O(N) use 2 variables to maintain topFrq and w i.e. winSize

Within a sliding window of size w, maintain a frq table. Initialize w to a conservative value of 4 (i.e. k+1).

If we notice topFrq is 2, at least as big as (w-k), i.e. w-k <= topFrq, then we are lucky: we can be less conservative and expand the current window backward (possibly safer than fwd).

After expansion, immediately try further expansion. IFF impossible, i.e. w – topFrq > k, then slide the window.

If the correct answer is 11, i.e. there’s an 11-substring containing 8 reds, I feel my sliding window will not miss it.
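
For comparison, here is a standard sliding-window sketch in Python (function name is mine; it tracks topFrq like solution 1, though it grows the window forward rather than backward):

```python
from collections import defaultdict

def longest_repeating(s: str, k: int) -> int:
    """Longest substring of one repeated letter after at most k replacements."""
    freq = defaultdict(int)   # frq table within the current window
    left = 0
    top_frq = 0               # best single-letter frequency seen in any window
    best = 0
    for right, ch in enumerate(s):
        freq[ch] += 1
        top_frq = max(top_frq, freq[ch])
        # (window size) - topFrq = replacements needed; if > k, slide the window
        if (right - left + 1) - top_frq > k:
            freq[s[left]] -= 1
            left += 1
        best = max(best, right - left + 1)
    return best
```

In the car story, `longest_repeating(footage, 3)` would answer the question for a lane's footage string.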

key^payload: realistic treeNode #hashset

I think only SEARCH trees have keys. “Search” means key search. In a search tree (and hashset), each node carries a (usually unique) key + 0 or more data fields as payload.

Insight — Conceptually the key is not part of the payload. Payload implies “additional information”. In contrast, key is part of the search tree (and hashset) infrastructure, similar to auto-increment record IDs in database. In many systems, the key also has natural meaning, unlike auto-incremented IDs.

Insight — If a tree has no payload field, then useful info can only exist in the key. This is fairly common.

For a treemap or hashmap, the key/value pair can be seen as “key+payload”. My implementation saves a key/value pair in each bucket.

A useful graph (including a tree or linked list, but not a hashset) can have payloads but no search keys. The graph nodes can hold pointers to payload objects on the heap [1], or the graph nodes can hold Boolean values. Graph construction can be quite flexible and arbitrary. So how do you select a graph node? Just follow some iteration/navigation rules.

Aha — similar situation: linked list has well-defined iterator but no search key . Ditto vector.

[1] With pre-allocation, a static array substitutes for the heap.

Insight — BST, priorityQ, and hash tables need search key in each item.

I think non-search-TREES are rarely useful. They mostly show up in contrived interview questions. You can run BFT or pre|post-order DFT, but not BFS/DFS, since there’s nothing to “search”.

Insight — You may say search by payload, but some trees have Boolean payloads.

std::move(): robbed object still usable !

Conventional wisdom says that after std::move(obj2), this object is robbed and invalid for any operation…. Well, not quite! The standard specifies exactly which operations are invalid. To my surprise, a few common operations are still valid, such as clear() and operator=().

The way I read it — if (but not iFF) an operation wipes out the object content regardless of current content, then this operation is valid on a robbed/hollowed object like our obj2.

Crucially, any robbed object should never hold a pointer to any “resource”, since that resource is now used by the robber object. Most movable data types hold such pointers. The classic implementation (and the only implementation I know) is by pointer reseat.

how many ways to decode #60%

Q(Leetcode 91): A message containing letters from A-Z is being encoded to numbers using the following mapping:

‘A’ -> 1, ‘B’ -> 2, … ‘Z’ -> 26
Given a non-empty string containing only digits, determine the total number of ways to decode it.


I think this is similar to the punctuation problem.

–my botup solution

At each position in the string, keep a “score” number that represents “how many ways to decode the left-substring ending here”.

Useful — define my convenient jargon: we will say the encodings 10 to 26 are “high letters”, and the encodings 1 to 9 are “low letters”. If there are 95 ways to decode a string, I will call them 95 “formulas”.

At Position 33, I will look at score[31] (say, equal to 95) and score[32] (say, equal to 97). If the two-char substring str[32:34] is between 10 and 26, then score[33] should include the 95 ways to decode str[:32]. Those 95 “formulas” can each grow one high letter.

If str[33] is not ‘0’, then score[33] should also include the 97 ways to decode str[:33], because those 97 “formulas” can each grow one low letter.

Aha — The 95 and the 97 formulas are all distinct because of the ending letter

I think we only need two variables to hold the previous two scores, but it’s easier to code with a score[] array.
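
A minimal bottom-up sketch of this scoring idea, keeping just the previous two scores (function name is mine):

```python
def num_decodings(s: str) -> int:
    """score[i] = number of ways ('formulas') to decode the prefix ending at i."""
    if not s or s[0] == '0':
        return 0
    prev2, prev1 = 1, 1    # scores for the empty prefix and for s[:1]
    for i in range(1, len(s)):
        cur = 0
        if s[i] != '0':                     # each formula grows one "low letter"
            cur += prev1
        if 10 <= int(s[i-1:i+1]) <= 26:     # each formula grows one "high letter"
            cur += prev2
        prev2, prev1 = prev1, cur
    return prev1
```

For example, "226" decodes as BZ, VF or BBF — three formulas.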

hash table expansion: implementation note

(I don’t use the word “rehash” as it has another meaning in java hashmap. See separate blogpost.)

Note this blogpost applies to separate chaining as well as linear probing.

As illustrated in my implementation, the sequence of actions in an expansion is tricky. Here is what worked:

  1. compute bucket id
  2. insert
  3. check new size. If exceeding load factor, then
    1. create new bucket array
    2. insert all entries, including the last one
    3. reseat the pointer at the new bucket array

If you try to reduce the double-insertion of the last entry, you would try moving Step 2 to later. This is tricky and likely buggy.

Say the computed bucket id was 25, so you insert at (or around) Bucket 25 — but that 25 was based on the old “capacity”. After the expansion, a lookup of this key uses the new capacity and may compute a bucket id of 9, so we won’t find this key.
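
A small Python sketch of the working sequence (separate chaining; class name is mine). It deliberately inserts first and then rebuilds every entry, accepting the double insertion of the last entry:

```python
class HashSetSC:
    """Separate-chaining hash set, expanding in the order described above."""
    def __init__(self, capacity=4, load_factor=0.75):
        self.buckets = [[] for _ in range(capacity)]
        self.size = 0
        self.load_factor = load_factor

    def insert(self, key):
        b = self.buckets[hash(key) % len(self.buckets)]   # 1. compute bucket id
        if key in b:
            return
        b.append(key)                                     # 2. insert
        self.size += 1
        if self.size > self.load_factor * len(self.buckets):  # 3. check new size
            new_buckets = [[] for _ in range(2 * len(self.buckets))]
            for bucket in self.buckets:        # re-insert ALL entries,
                for k in bucket:               # including the last one
                    new_buckets[hash(k) % len(new_buckets)].append(k)
            self.buckets = new_buckets         # reseat the pointer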

zero out rows/columns +! auxDS

Q (Leetcode 73): Given an m x n matrix, if an element is 0, set its entire row and column to 0. Do it in-place in O(1) space. No time-complexity requirement.

I will assume all signed ints.

I find this problem very well-defined, but the O(1)-space constraint is highly contrived. I think it only needs a clever one-off trick, not a really reusable technique.

Reusable technique — for array of integers with stringent space complexity, we can save indices in the array

Aha — I only need to know the full set of rowIDs and columnIDs.

— My O(minimum(m,n)) space solution 1:
zeroRowCnt:=how many rows to be zeroed out
zeroColCnt  :=how many columns to be zeroed out

Compare the two. Suppose zeroRowCnt == 11 is smaller. I will save the 11 rowID’s in a collection. Then scan horizontally to zero out by column. Then use the rowIDs to zero out by row
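
A Python sketch of solution 1 (function name is mine; for brevity it always saves the rowIDs, skipping the zeroRowCnt-vs-zeroColCnt comparison):

```python
def zero_rows_cols(matrix):
    m, n = len(matrix), len(matrix[0])
    # save the IDs of rows containing a zero (the "11 rowIDs")
    zero_row_ids = [i for i in range(m) if 0 in matrix[i]]
    # scan to zero out by column
    for j in range(n):
        if any(matrix[i][j] == 0 for i in range(m)):
            for i in range(m):
                matrix[i][j] = 0
    # then use the saved rowIDs to zero out by row
    for i in zero_row_ids:
        matrix[i] = [0] * n
    return matrix
```

The rowIDs must be saved before the column pass, because zeroing columns creates new zeros in previously clean rows.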

–My O(1) space idea 2 — more elaborate than the published solution.

Aha — Can we save the 11 rowID’s in a column to be zeroed out?

Compare zeroRowCnt and zeroColCnt as earlier. Get first rowID among the 11. Suppose it’s Row #3.

Now we know Row#3 has some zeros, so find the first column having a zero. It might be the last column (farthest east). Wherever it is, we pick that column as our “bookkeeper column”.

Visual insight — Suppose bookkeeper is Column #33. Then a[3,33] would be the first zero if we scan entire matrix by-row-and-internally-by-column

We scan row by row again (since we don’t remember those 11 rowIDs), starting after that first rowID. For every zero-containing row found, we zero out one corresponding cell in the bookkeeper column.

Insight — We should end up with exactly 11 zeros in that column. Can’t exceed 11 (only 11 rows having zeros). Can’t fall below 11 (we save all 11 rowIDs)

From now on, freeze that column until further notice. Now zero out each Column to be zeroed out, but leave out our bookkeeper column.

Lastly, follow our bookkeeper column to zero out every “dirty row”.

##algo Q categories: %%strength/weakness

— relative strengths compared:

  • data-structure heavy
  • union find
  • generator — more experienced than average
  • geometry
  • sliding window
  • whiteboard relative to IDE

— weakness

  1. DP applied to string
  2. DP — as I usually can’t visualize and wrap my mind around it,
    • I have put in more effort than others, but not connecting the dots and developing intuition
  3. recursion in a loop — over my head
  4. 2D grid and matrix
    1. but I have put in more effort than others
  5. num-array. CSY is stronger
  6. priorityQ


“Reliable multicast for replicated-cache update” is a 1999 research paper. I hope by now multicast has grown more mature and more proven. Not sure where this is used — perhaps within certain network boundaries such as a private network of data servers.

This paper examines reliable multicast for invalidation and delivery of popular, frequently updated objects to web cache proxies.

BST: post-order as serialization

Q: if I backscan a post-order output (unsorted sequence), is there only a single BST we can build?

Intuitively, I think this is like fwd scanning a pre-order output so yes.

If I keep a webcam at each tree node
* During a fwd scan of pre-order output, every node’s left child is born before right child.
* During a backscan of post-order output, every node’s right child is born before left child. During the actual post-order walk, my left child is always printed before my right child.


For a given BST, the pre-order output is unique; the post-order output is also unique.

Can two BSTs produce the same pre-order output? I think it’s impossible.
Can two BSTs produce the same post-order output? I think it’s impossible.

Like the pre-order, the post-order sequence is also a serialization but only for a BST.

val@pattern_recognition imt reusable_technique

Reusable coding techniques include my classic generators, DP techniques, array-element swap, bigO tricks like hash tables and median-of-3 pivot partitioning.

  • One of the best examples of reusable technique — generate “paths/splits” without duplicates. Usually tough.

More useful than “reusable techniques” are pattern-recognition insight into the structure and constraints in a given coding problem. Without these insights, the problem is impenetrable.

Often we need a worst input to highlight the hidden constraint. The worst input can sometimes quickly eliminate many wrong paths through the forest and therefore help us see the right pathway.

However, a bigO trick can sometimes wipe out the competition, so much so that pattern recognition, however smart, doesn’t matter.


vector used in hash table bucket

Summary — Same worst case time/space complexity as traditional chaining, but d-cache efficient during lookup, insert and removal.

Background — hash table worst-case complexity is due to a bad input (and a weak hash function) leading to heavy collision and a single hot bucket. The load factor will not help, as the overall hash table size is still small. We end up with a linear search in a long linked list. This affects not only lookup and removal; insert also requires a lookup to prevent duplicates.

— Based on the published design:

  • This design makes more efficient use of CPU caching and the translation lookaside buffer(TLB), because slot entries are stored in sequential memory positions.
  • It also dispenses with the next pointers that are required by linked lists, which saves space.
  • Despite frequent array resizing, space overheads incurred by the operating system such as memory fragmentation were found to be small

speed^soft qualities: CIV scoring

If you are very strong, you can achieve both

  • complete the required number of problems in time with no serious bugs
  • demonstrate requirement clarification, edge cases, dry-run, thorough error handling, detailed complexity analysis, coding style.

In reality, some speed coding tests at FB, Indeed etc really favor speed over the other things.

When we take such a test, there is always internal tension between two motivations — speedy completion vs demonstrating other qualities.

Thinking out loud is a common recommendation but it can create unnecessary conversation and slow you down. However, some of these conversations may lead you to the right path.

multicast routing: efficient data duplication

The generic routing algorithm in multicast has optimization features that offer efficient data duplication.

— optimized message duplication at the routing layer (Layer 3?):

Routers duplicate data packets and forward multiple copies wherever the path to recipients diverges. Group membership information is used to calculate optimal branch points i.e. the best routers at which to duplicate the packets to optimize the use of the network.

The multicast distribution tree of receiving hosts holds the route to every recipient that has joined the multicast group, and is optimized so that

  • multicast traffic does not reach networks that do not have any such recipients (unless the network is a transit network on the way to other recipients)
  • duplicate copies of packets are kept to a minimum.

##very few theoretical compSci constructs 4 CIV

“Most comp-science constructs are too advanced and too complicated for a 45-minute coding interview. So reading any comp-science book is like my son reading science books not written for his exams. What a candidate needs is drills targeted at interviews,” I told friends, based on Leetcode.

A few exceptional interviews (eg: Google) go beyond Leetcode, but still use only simple features of advanced comp-science constructs. Here are a few notable comp-science constructs to study, mostly advanced data structures:

  • [s] trie, suffix array, suffix tree
  • geometry (is dStruct-heavy domain) —
    • nearest neighbor query;
    • query: which (x,y) points fall within this rectangle?
    • line sweep
  • segment tree
  • topological sort – 2 algos
  • [s] disjoint set
  • relationships among in|pre|post-order binTree walks — these insights are valuable for some Leetcode problems.
  • self-balancing algo in RBTree
  • [s] B-tree
  • [s] shortest path algos like BFT
  • [s=only the simple ones are needed in CIV]
  • —- techniques
  • technique: augmented tree
  • technique: back tracking
  • technique: DP and greedy
  • technique bottom-up DP with memoization

Q: How about coding interview books? They can help, but they are not as deep or wide-ranging as algorithm books.

Q: How about bigO on common data structures? Yes, relevant, because interviewers insist on it.


minimize average distance to N cells]matrix #Kyle 60%

Q: given N red cells in a white matrix, find the best cell anywhere in the matrix, whose average distance to the N red cells is minimized. You can imagine N homes deciding to build a waterpoint.

Distance is the number of horizontal plus vertical steps (Manhattan distance).


Insight — vertical vs horizontal dimensions are independent, so this is really a one-dimension problem.

center of gravity?

Insight — rate of change should be zero at the trough…
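
Since the dimensions are independent and the “zero rate of change” point of a sum of absolute distances is the median, a sketch (function name is mine; any median-tie cell works equally well):

```python
def best_cell(red_cells):
    """red_cells: list of (row, col). Returns a cell minimizing the
    average Manhattan distance to all red cells."""
    rows = sorted(r for r, _ in red_cells)
    cols = sorted(c for _, c in red_cells)
    # the median minimizes the sum (hence the average) of absolute
    # distances, independently per dimension
    return rows[len(rows) // 2], cols[len(cols) // 2]
```

Note the “center of gravity” (the mean) minimizes squared distance, not this step distance; the median is the right answer here.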

5 constructs: c++implicit singletons

#1 most implicit singleton in c++ is the ubiquitous “file-scope variable”. Extremely common in my projects.

  • — The constructs below are less implicit as they all use some explicit keyword to highlight the programmer’s intent
  • keyword “extern” — file-scope variable with extern
    • I seldom need it and don’t feel the need to remember the details.. see other blogposts
  • keyword “static” — file-scope static variables
  • keyword “static” within function body — local static variables — have the nice feature of predictable timing of initialization
  • keyword “static” within a class declaration —  static field

~~~~~~  The above are the 5 implicit singleton constructs ~~~~~~

Aha — it’s useful to recognize that when a data type is instantiated many many times i.e. non-singleton usage, it is usually part of a collection, or a local (stack) variable.

Sometimes we have the ambiguous situation where we use one of the constructs above, but we instantiate multiple instances of the class. It’s best to document the purpose like “instance1 for …; instance2 for …”

multicast over public internet

Many people say that in general, multicast over public internet is unsupported. I think many retail websites (including the dominant west coast firms) are technically unable to use multicast.

As an end-user, you can multicast across the public Internet to another site by using a tunnel that supports multicast.

As a larger organization, like a video provider or an ISP, it is certainly possible to forward multicast packets across their domain boundary (i.e. across an Internet).

To forward those multicast packets to another ISP, you would need a peering agreement with them and use the Multicast Source Discovery Protocol (MSDP), configured on both ends.

While you won’t propagate your multicast across the global Internet, crossing network boundaries with multicast packets is not impossible.

27 Jul coding drill

— presumably solved:

  1. power of 3
  2. Fizz Buzz
  3. first unique char in a string
  4. longest common prefix
  5. grouping anagram
  6. in-place-remove all dupes from sorted array
  7. getByRank() in sorted matrix #51%
  8. longest descending path through matrix #60%
  9. count lower ints to my right #50%
  10. Accounts merge
— already-solved problems reviewed
  1. Roman to int
  2. generate all loop-free paths between two nodes
  3. append+maxInRangeK #Rahul
  4. O(1)getRandom+add+del on Bag#Rahul
— pending
  1. intersection of two arrays

homemade clustered sparse array

O(1) lookup and random access are hard to achieve in sparse-array implementations. My sparse array here is a special subset of sparse arrays — the populated sections come in clusters. For this special subset, I hope random access is feasible. My design is a fragment table:

  • My fragment table is a std::deque to hold raw pointers to “fragment arrays” not vectors. I choose the word “fragment” since std::deque has its own internal “segment” that is completely irrelevant.
  • elements 0 to 99 —  live in a 100-array, whose address is saved in fragmentTable[0]
  • elements 100 to 199: live in a 100-array, whose address is saved in fragmentTable[1]
  • elements 200 to 299: live in a 100-array, whose address is saved in fragmentTable[2]
  • elements 500 to 599: all missing, so a special ptr is saved in fragmentTable[5]
  • elements 603 to 605 are present. A special ptr is saved in fragmentTable[6]. This fragment consists of a small-footprint 3-array. This fragment also has a field this->offset=3
  • Note the fragmentTable is a deque of void pointers. Each void pointer can be a regular plain old array or a special pointer.
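
A stripped-down Python sketch of the fragment table (class name is mine; it omits prepend/offset and the small-footprint fragments, keeping just the O(1) random access):

```python
from collections import deque

FRAG = 100       # identical range per fragment -- the design cornerstone
MISSING = None   # stand-in for the "special ptr" marking an absent fragment

class ClusteredSparseArray:
    def __init__(self):
        self.frag_table = deque()   # deque of fragment pointers

    def put(self, idx, value):
        fid, pos = divmod(idx, FRAG)
        while len(self.frag_table) <= fid:
            self.frag_table.append(MISSING)   # absent fragments cost one slot
        if self.frag_table[fid] is MISSING:
            self.frag_table[fid] = [MISSING] * FRAG   # allocate a full fragment
        self.frag_table[fid][pos] = value

    def get(self, idx):   # O(1) random access: one divmod + two indexings
        fid, pos = divmod(idx, FRAG)
        if fid >= len(self.frag_table) or self.frag_table[fid] is MISSING:
            return MISSING
        return self.frag_table[fid][pos]
```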

— As a bonus, this design supports unlimited append. Here’s a limited prepend() procedure

  1. to prepend a value X in front of the current sparseArr[0], we first allocate a new 100-array as a new fragment whose offset=99. This new fragment supports 99 future prepend() operations, but each time offset needs decrementing.
  2. save X as the last element in the new fragment.
  3. record the new fragment’s address in a new fragmentTable[0]. This is possible because the fragmentTable is a std::deque.

Note I can also allocate a small-footprint new fragment, iFF no further prepend() is allowed.

— Cornerstone of my design — all fragments have an identical range=100, inspired by std::deque and the ragged 2D array. This enables simple and fast random access. The underlying arrays are the same size, with a few exceptions — empty fragments, and tiny-n-frozen fragments.

If one of the “tiny fragments” is expected to get populated in the future i.e. not frozen, then it should be allocated as a full array with bigger footprint. The wasted space should not be wasted for long.

Even if this design is not flexible and general-purpose, it meets my design goals so I’m proud of it.

Many implementations use linked fragments, where each fragment has a “next_fragment” pointer. I don’t see any benefit.

I don’t want to look at pure-java implementation as java array has many restrictions.

count lower ints to my right #52%

This problem is labelled “hard”.

Q: given an integer array nums , return a new counts array, wherein counts[i] is the number of smaller elements to the right of nums[i]


An order-statistics tree, i.e. an augmented RBTree, should achieve O(N logN).

One scan from right. Insert each node into this tree. Before inserting a node of value like 22, we will query the tree getRank(22).

Implementation-wise, it’s hard to create a self-balancing BST from scratch, so I think an unbalanced BST might do.
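
A sketch of the right-to-left scan with an unbalanced BST that tracks left-subtree sizes (names are mine; worst case O(N²) on sorted input, O(N logN) on typical input):

```python
class Node:
    __slots__ = ('val', 'left', 'right', 'cnt_left', 'dup')
    def __init__(self, val):
        self.val = val
        self.left = self.right = None
        self.cnt_left = 0   # count of strictly smaller values in my subtree
        self.dup = 1        # duplicates of my own value

def count_smaller(nums):
    """counts[i] = number of elements smaller than nums[i] to its right."""
    res = [0] * len(nums)
    root = None
    for i in range(len(nums) - 1, -1, -1):   # one scan from the right
        root, res[i] = _insert(root, nums[i])
    return res

def _insert(node, val):
    """Insert val; return (subtree root, rank = #values < val so far)."""
    if node is None:
        return Node(val), 0
    if val < node.val:
        node.cnt_left += 1
        node.left, smaller = _insert(node.left, val)
        return node, smaller
    if val > node.val:
        node.right, smaller = _insert(node.right, val)
        return node, smaller + node.cnt_left + node.dup
    node.dup += 1
    return node, node.cnt_left
```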

Also, there might be some alternative solutions.

longest descending path through matrix #60%

Q: given an int-matrix that allows 4-way moves,  find longest path with strictly descending nodes. Lateral move disallowed.


I have code to generate all paths… generate loopfree paths: graph node A→B

— Solution 1: O(N logN)
O(N) first pass to construct the “waterflow” graph — visit N nodes. For each, check the four neighbors. Add an outgoing edge (in a Node.edgeList) to a neighbor node if it is strictly lower.

Now put all nodes into a min-heap, according to node height. Can be part of first pass.

Second pass: visit each item in the min-heap. For each node, compute the longest descending path starting therefrom. Record this path length as Node.score. Note if a node is surrounded by higher or equal nodes, then it has score 0, as does the lowest node.

A higher node AA is always upstream to some “computed” nodes. (We don’t care about any incoming edges into AA). So pick the max of (up to four) lower neighbors’ scores. Add 1 to get AA’s score. This computation is O(1) but the heap pop() makes second pass O(N logN)

Note the lowest node may not be part of the longest path, if it is surrounded by the highest nodes.
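
A Python sketch of Solution 1 (function name is mine). One difference in convention: I count path length in nodes, so a node with no lower neighbor scores 1 rather than 0:

```python
import heapq

def longest_descending(matrix):
    """Process nodes from lowest to highest via a min-heap; each node's
    score = 1 + best score among its strictly lower neighbors."""
    m, n = len(matrix), len(matrix[0])
    heap = [(matrix[i][j], i, j) for i in range(m) for j in range(n)]
    heapq.heapify(heap)                  # O(N) heap construction
    score = [[1] * n for _ in range(m)]  # a node alone is a path of length 1
    best = 1
    while heap:
        h, i, j = heapq.heappop(heap)    # lowest remaining node
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            # lower neighbors were popped earlier, so their scores are final
            if 0 <= ni < m and 0 <= nj < n and matrix[ni][nj] < h:
                score[i][j] = max(score[i][j], score[ni][nj] + 1)
        best = max(best, score[i][j])
    return best
```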

## defaultdict tricks #matrix

Here are a few time-saving tricks with defaultdict:

  • mydict[key] # without printing or using the return value … would trigger the dictionary to construct a default value for this key IFF missing
  • from key/value pairs, build a dict {key -> valueList}

  • defaultdict(int) as a frequency counter.
    • mydict[key] += 1 #works even when key was previously missing in mydict
  • defaultdict(lambda: ‘_’).
    • mydict[(row,col)] can implement a matrix. Negative (or other invalid) indices result in the default value of string ‘_’. I believe a dict key can be a tuple but not a list.
    • In contrast, the 2D array interprets negative indices as wrap-around !
  • defaultdict(lambda: [[],[]]) creates a list of 2 small lists for any “invalid” key. In contrast, defaultdict(list) will create a monolithic list, which hits a runtime error when you access an element of the nested list i.e. dict[someKey][0][anyIndex]
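
A quick runnable demo of the tricks above:

```python
from collections import defaultdict

# frequency counter: += 1 works even for a previously missing key
freq = defaultdict(int)
for ch in "abracadabra":
    freq[ch] += 1
assert freq['a'] == 5 and freq['z'] == 0   # reading freq['z'] inserts 0!

# sparse "matrix" keyed by (row, col) tuples; invalid cells read as '_'
matrix = defaultdict(lambda: '_')
matrix[(0, 0)] = 'X'
# negative indices just yield the default, unlike a 2D list's wrap-around
assert matrix[(-1, 5)] == '_'

# from key/value pairs, build dict {key -> valueList}
groups = defaultdict(list)
for word in ("cat", "cow", "dog"):
    groups[word[0]].append(word)
assert groups['c'] == ['cat', 'cow']
```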

average_case ^ amortized

–> bigO (and big-Theta) is about input SIZE; it is therefore usable for average-case or worst-case data quality.

There’s no special notation like bigX for average case.

–> Worst vs average case … is about quality of input data.

With hash tables, we novices tend to blame the “quality of the hash function”, but this is imprecise. My book [[algorithms in a nutshell]] P93 states that IFF given a uniform hash function, even in the worst case bucket sort runs in linear time O(N).

–> Amortized … is about one expensive step in a long series of steps like inserts. Note Sorting has no amortized analysis AFAIK.

worst case amortized — is a standard analysis
Eg: vector expansion. Worst-case doesn’t bother us.

eg: quicksort+quickSelect can degrade to O(N²). Amortization doesn’t apply.

average case amortized — is a standard analysis
eg: hash table collision. A few operations happen to hit long chain, while most operations hit short chain, so O(1) on average.

Amortization is relevant iFF an insert triggers a hash table expansion.

What if the input is so bad that all hash codes are identical? Then amortization won’t help. For hash tables, people don’t worry about worst case, because a decent, usable hash function is always, always possible.

average case per-operation — can be quite pessimistic
eg: vector insert

worst case per-operation —

RBTree O(1)insert quite common ] coding IV

— for a single insert, the documentation is precise: hinted insert is O(1).
— for mass insert
Special case — If we know the input sequence is pre-sorted and going to be “appended” to existing tree nodes, then each insert will always hit the end(). We can rely on hinted insert to achieve O(N). The documentation is even more encouraging — “If N elements are inserted, Nlog(size+N) in general, but linear in size+N if the elements are already sorted” — so as long as the input is pre-sorted, we get O(N).

For 64-bit integer or float inputs, we can radix sort in O(N) then mass-insert in O(N).

— for construction from a pre-sorted sequence, the documentation confirms it’s O(N) if the input range is already sorted.

getByRank() in sorted matrix: priorityQ^RBTree


Model the matrix as a recombinant binTree pyramid, where “water” flows east or south.

  • first level has one node .. lowest value. Add it to pq (i.e. priorityQ)
  • pop the pq and insert the two downstream nodes
  • total K pops, each pop is followed by up to 2 inserts

Heap will grow to up to K items, so each pop will be up to logK

Total O(K logK). To achieve this time complexity, we can also use an RBTree. The tree nodes can come from a pre-allocated array.
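
A heap-based sketch in Python for a row- and column-sorted matrix (function name is mine; rank is 1-based):

```python
import heapq

def get_by_rank(matrix, k):
    """k-th smallest element in a matrix sorted along rows and columns."""
    m, n = len(matrix), len(matrix[0])
    heap = [(matrix[0][0], 0, 0)]   # first level: the single lowest value
    seen = {(0, 0)}                 # recombinant paths -- avoid re-inserting
    while True:
        val, i, j = heapq.heappop(heap)
        k -= 1
        if k == 0:
            return val
        # each pop is followed by up to 2 inserts: the south and east nodes
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < m and nj < n and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (matrix[ni][nj], ni, nj))
```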

pre-allocated array as backing store for graph nodes #java No

I think any graph node can use the same technique, but here I present a simple yet interesting use case — a linked list with each node allocated from an array. I have three home-made implementations:

  1. backing array of dummy link nodes, pre-allocated at compile time
  2. backing array of dummy link nodes, pre-allocated from free store aka DMA
  3. backing array is a byte array on heap or data section. Each link node is constructed via placement-new.

Here are a few advantages that I consider minor, because a linked list is seldom needed in low-latency systems:

  1. d-cache efficiency
  2. eliminates runtime load on heap allocator, since memory is pre-allocated. See malloc=long considered costly

Advantage #3: For c++ algo questions, this set-up has an interesting advantage — the node address is now an index into the backing array. This index is a natural auto-increment ID, based on creation order.

Now, the biggest advantage of linked list over vector is mid-stream insert/delete. One of the biggest disadvantages is lack of random-access. If nothing happens mid-stream (as in coding questions), then we can achieve random-access-by-id using array as backing store.

If nothing happens mid-stream, then this linked list is physically similar to an array with extra capacity.

This technique won’t work in java because java array of Node is array-of-pointers.

power-of-2 sized hash tables: implications

Java HashMap uses a bucket count equal to a power of two. A few implications:

  • the modulo operation is a fast bit-masking i.e. select the lowest N bits: hash & (length-1)
  • an unlucky hashCode() may not offer dispersion among the lower bits. Assuming 16 buckets so only the lowest 4 bits matter, but hashCode() returns predominantly multiples of 8. Then two out of 16 buckets would be white hot. To guard against it, HashMap performs a rehash on the hashCode() output.
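
A small Python demo of the bit-masking and of a java8-style supplemental rehash (which XORs the high 16 bits into the low 16); the names spread and bucket_id are mine:

```python
CAPACITY = 16   # power of two

def spread(h):
    """java8-style rehash: fold high-bit entropy into the low bits."""
    h &= 0xFFFFFFFF          # emulate a 32-bit Java hash
    return h ^ (h >> 16)

def bucket_id(hash_code):
    return spread(hash_code) & (CAPACITY - 1)   # fast modulo via bit-mask

# unlucky hashCode(): always a multiple of 8, varying only in the high bits
hashes = [(k << 16) + 8 for k in range(64)]
raw = [h & (CAPACITY - 1) for h in hashes]
assert len(set(raw)) == 1        # every key lands in one white-hot bucket
mixed = [bucket_id(h) for h in hashes]
assert len(set(mixed)) == 16     # the rehash disperses them across all buckets
```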

I implemented the same feature in my own code.

Note rehash is ineffective if hashCode() returns very few distinct values. In java8, there’s one more protection against it — RBTree as alternative to linked list… See RBTree used inside java8 hashmap

RBTree used inside java8 hashmap

This is a language-neutral discussion. An early implementation shows:

they are implemented as a hashed array of binary trees. 
My experience with them is that they are very efficient.

— JEP 180 is the white paper that introduced a self-balanced tree as an alternative to the linked list. One benchmark shows that with one million entries in a HashMap, a single lookup takes 20 CPU cycles, or less than 10 nanoseconds. Another benchmark demonstrates O(logN) get(key) in java8, but O(N) in java7, as traditionally known.

If two hashCode() values are different but end up in the same bucket (due to rehashing and bucketing), one is considered bigger and goes to the right. If hashCodes are identical (as in a contrived hashCode() implementation), HashMap hopes that the keys are Comparable, so that it can establish some order. This is not a requirement on HashMap keys.

If hashCodes are mostly identical (rare) and keys are not comparable, don’t expect any performance improvements in case of heavy hash collisions. Here’s my analysis of this last case:

  • if your Key.equals() is based on address and hashCode() is mostly the same, then the RBTree ordering can and should use address. You won’t be able to look up using a “clone” key.
  • if you have customized Key.hashCode() then you ought to customize equals(); suppose you don’t implement Comparable — you are still allowed to look up using a clone key. Since there’s no real ordering among the tree nodes, the only way to look up is running equals() on every node. One source says:

A given bucket contains both Node (linked list) and TreeNode (red-black tree). Oracle decided to use both data structures with the following rules:

  • If for a given index (bucket) in the inner table there are more than 8 nodes, the linked list is transformed into a red black tree
  • If for a given index (bucket) in the inner table there are less than 6 nodes, the tree is transformed into a linked list

With the self-balanced tree replacing the linked list, worst-case lookup, insert and delete are no longer O(N) but O(logN) guaranteed.

This technique, albeit new, is one of the best simple ideas I have ever seen. Why has nobody thought of it earlier?

RBTree: technical notes #Realtime

Red–black trees offer … worst-case guarantees, valuable in real-time applications. The Completely Fair Scheduler used in current Linux kernels and the epoll system-call implementation[19] use red–black trees.

This valuable guarantee holds only for always-balanced trees, but the tree need not be strictly balanced. In fact, AVL is more rigidly balanced yet lacks the amortized O(1) modification guarantee.

In contrast, a hash table doesn’t offer worst-case guarantees in the face of hash collision. In fact, Java 8 HashMap uses an RBTree in addition to the linked list… see my blogpost RBTree used in java hashmap.

After an insert or delete, restoring the red-black properties requires a small number (O(log n) or amortized O(1)) of color changes (which are very quick in practice) and no more than three tree rotations (two for insertion). Although insert and delete operations are complicated logically, their times remain O(log n).

The AVL tree is another structure supporting O(log n) search, insertion, and removal. AVL trees can be colored red-black, thus are a subset of RB trees. Worst-case height is better than the worst-case height of RB trees, so AVL trees are more rigidly balanced. However, Mehlhorn & Sanders (2008) point out: “AVL trees do not support constant amortized deletion costs”, but red-black trees do.[25]

doubly-linked list as AuxDS for BST

I find it a very natural auxDS. Every time the BST gets an insert/delete, this list can be updated easily.

Q: How about the self-adjustment after an insert or delete?
%%A: I think this list is unaffected

With this list, in-order walk becomes really easy. This is a simple technique I first saw elsewhere.

lowest missing+ve int#Codility #70%

Q: Write a function int solution(int[] A);  that, given an array A of N integers, returns the smallest natural number that does not occur in A. For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.
Given A = [1, 2, 3], the function should return 4.
Given A = [−1, −3], the function should return 1.

• each element of array A is a 32-bit signed int
• expected worst-case time complexity is O(N)
• expected worst-case space complexity is O(N)
• elements of the input array can be modified — a similar problem asks for O(1) space and average O(N) time!

—— my analysis —–

The mutable array and the O(1)-space variant hint at — saving array indices as payload!

—- Idea A:

first scan to swap non-positives to the end and remember the new boundary (say #111).

In the same scan also determine min and max. Suppose min=55.

Another scan to reduce all payloads by 55 so they fall in the range of 0 to max-55.

Now use CSY technique .. check array@0-N in-situ #Nsdq#contrived
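The in-situ index technique referenced above can be sketched in Python (a minimal sketch; the function name first_missing_positive is my own). Each value v in range is swapped into slot v-1, so a second scan exposes the first gap:

```python
def first_missing_positive(A):
    """In-place, O(N) time, O(1) extra space.
    Uses array indices as payload: value v belongs at index v-1."""
    n = len(A)
    for i in range(n):
        # keep swapping until A[i] is out of range or already in its home slot
        while 1 <= A[i] <= n and A[A[i] - 1] != A[i]:
            j = A[i] - 1
            A[i], A[j] = A[j], A[i]
    for i in range(n):
        if A[i] != i + 1:
            return i + 1
    return n + 1
```

Note the inner while loop still gives O(N) total work, because every swap moves one value into its final home slot.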

—-solutionB: make_heap O(1) space but O(N logN) time. Build a min-heap based on the array in O(1) space, then keep calling delete_min().

  • make_heap shows random access container (vector used in demo), and “rearrange”, implying O(1) space
  • make_heap shows O(N) to construct the heap

min() is O(1) but delete_min() is O(log N), so overall we probably get O(N logN)

—-solutionC: radix sort. shows an in-place binary radix sort.

First pass to transform all negative numbers to 0. Then iterate the sorted array and check for the earliest “gap”. Worst case — you get 1,2,3… without gap, so answer is the 1+ largest array element.

O(W*N) where W is width of the largest integer. If we assume 64-bit then W is a constant:) is another in-place radix sort.

java array footprint: hypothetical calc

In low-latency or high-volume java apps (like exchange-facing), I think array of primitives is quite popular due to its efficiency advantage. Here I review a few potential footprint issues. Suppose we have an array AA of 12 ints, each 32-bit.

  • AA carries housekeeping data such as the array length (i.e. 12). AA also has the “standard” object header, of twelve (or eight) bytes… See size of, and how does it add up@@
  • The 12 ints take up 48 bytes, allocated in contiguous memory… lucky. For 12 Trade objects, we would get 12 scattered allocations.. expensive for runtime allocation and cache-unfriendly.
  • So we have 12 + 48 = 60 bytes so far. There might be padding for alignment. 64-bit CPU probably uses 8-byte alignment as shown on
  • Finally, footprint is 64 bytes, but I have not verified it.
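The hypothetical calc above can be sanity-checked with a tiny helper (a sketch only — the header size and 8-byte alignment are assumptions from the bullet points, not measured values):

```python
def array_footprint(n_elems, elem_size=4, header=16, alignment=8):
    """Hypothetical java array footprint: header bytes (object header +
    length field) plus payload, rounded up to the alignment boundary."""
    raw = header + n_elems * elem_size
    # round up to the next multiple of `alignment`
    return (raw + alignment - 1) // alignment * alignment
```

With the blogpost's numbers (12-byte header, 12 ints), this yields the 64 bytes estimated above.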

ask`lower base2reduce mgr expectation

This plan didn’t work out at Macq. Expectation was still too high.

The logic is, if my coworkers get total comp 200k and I ask only 160k, then I’m more likely to get some bonus. Even if I underperform them, I would still hit somewhere below 200k.

Now I think if I qualify to stay, then there will be some bonus even if my base is, say, 190k. Hiring managers would not agree to a 200k base and run the risk of paying a doughnut bonus to a qualified employee.

## Y revisit old algo Qn #Kyle

I find it hard to be interested in previously solved problems. I think many people (not only those in a hurry) simply skip such problems. However, we are actually not as good at previously solved problems as we assume.

  • reason — there are often insights and repeating patterns in an old problem that require slow, repeated digestion. Solving it quickly is usually not enough.
  • reason — our solutions may not be optimal.
  • reason — there are very smart solutions out there we don’t know.
  • reason — we forget.
  • reason — with a simple tweak the problem can become a new problem. How well and how fast we can solve the modified problem depends on our insight into the original problem.

Therefore, I feel folks tend to trivialize some well-known problems and fail to learn enough from them.

Well-known doesn’t mean well-understood.
Well-known doesn’t mean simple.

Revisiting old problems is boring to many programmers, but often feels easier than tackling new problems. I don’t get tired easily when revisiting old problems. If you are like me, then it could be more beneficial than tackling new problems.

However, my visible progress is much slower in terms of #problems solved per week. I think Kyle also pointed out something related to this.

oldtimers trapped ] pressure-cooker #Davis

See also G3 survival capabilities #health;burn rate

Q: Why are so many old timers unable to get out of a shitty job in a pressure-cooker company?

  • A small number of old timers change jobs, but some of them regret it as the new job turns out worse. Other old timers hear those war-stories and prefer the certainty of known pain in the current job. They don’t feel confident they can find a better job.
  • bench time — Many old timers can’t afford bench time due to low reserve and high burn rate.
    • Old timers tend to be very unused to unstable income.
    • Damien (Macq) relied on his bonus payout..
    • Some had no savings at all.
  • Most old timers won’t even consider a 6M contract.
  • Most old timers will never consider a voluntary pay cut to get a more comfortable job
  • golden handshake — Some old timers (UBS?) can’t let go of the promised compensation package. They would not resign and give up on a 100k promised windfall
  • some old timers worry about job loss so much that they work their ass off to keep the job, even if the workload is unreasonable/unfair. I think in GS I was like them. I have a friend in a similar situation. I think the boss wants to let him go but may feel merciful, so they keep him but give him shitty tasks and demand fast turnaround at good quality…. unreasonable/unfair workload.
  • Some old timers may have a wife who completely opposes job change due to potential impact on family.

Q: Why can XR and I afford to quit a job, stay home, and then accept a lower job?
A: Supportive wife, burn rate, passive income and savings.

##deeply felt Priorities b4 U.S.→SG@45 #big-picture

  1. — priorities over the next 2-10Y horizon
  2. Career[a] longevity till 70, probably on wall st, not in Singapore or West Coast. A related keyword is “relevance” to the tech economy.
    1. On Wall St, I continue to keep a keen focus on robust technologies like core Java, cpp, SQL, sockets, core threading, common data structures. Outside Wall st, jxee (and possibly web stacks) offers the best market depth and demand.
    2. Compared to Wall St, West coast is possibly low priority for now as I don’t see long term visibility.
  3. my wellness — Need to guard against PIP-hell, trapped… A “stability factor” arguably more impactful than GC, housing, schooling… One of the key signs of wellness is weight and calorie count.
  4. boy’s education — I don’t know if U.S. system is better for him
  5. GC — a G5 priority in my current plan, primarily on the back of the longevity factor.
  6. — 2nd tier
  7. increase precious time with grandparents — fly business class to NY.
  8. preparing for war at new job — short-term, immediate but actionable.
  9. wife’s and daughter’s life-chances in U.S. — important but not much I can do now
  10. more passive income to reduce the cash flow stress
  11. Saving up for U.S. housing — not much I can do now.
  12. I now have a deep desire but limited hope to keep up my 细水长流 motivation for coding drill and QQ learning. Burning pleasure; self-esteem; satisfaction; absorbency.
  13. leaving a good impression with MS manager

[a] I didn’t say “income”. I think more important to me is my marketability (+ relevance, in-demand ..). Career longevity is the basis of entire family’s well-being for 15Y until kids start working.

Salary — is kinda secondary because the room for improvement is negligible in my biased mental picture.

j8 MethodRef #cheatsheet

Need to develop more low-level insights as QQ…

  • Out of the four kinds of method refs, only AA) static-method and BB) specific-instance kinds are popular. The other two types are obscure.
  • q[ :: ] is used in all four kinds
  • I think the static-method kind is most readable and most intuitive. The javadoc tutorial features a BB example that should be implemented as static method IMHO.
  • a lambda expression has parameter types. A method ref has none and must be converted to a lambda expression. Where does the compiler get the parameter types? I think they are inferred from the calling context.

QuickSelect: select nth key#O(1)space

Quickselect has average O(N) runtime but worst-case O(N²), if the partitioning goes bad repeatedly. A randomized pivot can reduce the chance, but I don’t think we can eliminate it.

Space complexity is O(1) despite the apparent recursive set-up. Tail recursion optimization is usually available to reuse the same stack frame. Otherwise, recursion can be replaced by a loop.

std::nth_element() is linear on average but quadratic in worst case — explains QuickSelect algo

Quickselect is by the same inventor as quicksort (Tony Hoare).
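A minimal Python sketch of quickselect with a randomized pivot and a loop instead of recursion, matching the O(1) space claim above (Lomuto partition; the function name is my own). Note it mutates the input array:

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element (0-based). Average O(N) time,
    O(1) extra space via iteration instead of recursion. Mutates a."""
    lo, hi = 0, len(a) - 1
    while True:
        if lo == hi:
            return a[lo]
        # randomized pivot reduces the chance of O(N^2) behavior
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot = a[hi]
        store = lo
        for i in range(lo, hi):  # Lomuto partition
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]  # pivot into final position
        if k == store:
            return a[store]
        elif k < store:
            hi = store - 1
        else:
            lo = store + 1
```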

get_majority_elem]unsorted array,O(1)space #90%

Q: Given an array of size n, find the majority element. The majority element is the element that appears more than ⌊ n/2 ⌋ times. You may assume that the array is non-empty and the majority element always exists in the array.

worst input: odd length, only X and Y. X occurs once more than Y.

A hash table solution needs O(N) space since there can be N/2 distinct values. To improve space complexity, how about quickselect? Discard the smaller side and pick another random pivot.

Median-finder algorithm can solve this problem, using std::nth_element() which uses QuickSelect… O(1) space despite recursive set-up.

— idea 3 O(1) space: random pick then verify
Random pick and count the occurrence of this pick. Is it more than N/2? Within a few trials we should find a good pick.
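Idea 3 in a few lines (a sketch; the function name is mine). Since the majority element occupies more than half the slots, each random pick hits it with probability above 1/2, so the loop terminates quickly in expectation — relying on the problem's guarantee that a majority element exists:

```python
import random

def majority_element(nums):
    """Idea 3: random pick, then verify by counting. O(1) space,
    expected O(N) time (expected number of trials is below 2)."""
    n = len(nums)
    while True:
        candidate = random.choice(nums)
        # verification pass: O(N) count of the candidate
        if sum(1 for x in nums if x == candidate) > n // 2:
            return candidate
```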

K-palindrome #careercup 20%

Q (Facebook India, careercup): A k-palindrome is a string which transforms into a palindrome on removing at most k characters. Implement check(str, k). str has at most 20,000 characters. 0<=k<=30

Sample Test Case: Input : abdxa 1 -> false
It’s crucial to zoom into the worst input early on — binary string with the correct “kills” well-dispersed.

Insight — once we group the chars by ascii code, I can see a successful palindrome is really an A-palindrome interleaved with a B-palindrome and C-palindrome…

This problem is related to the (tougher) problem of “max palindrome sub-sequence”

Eg k=33.
–idea 3: per-club distribution table
each char gets a dedicated club/distro, where we record all positions occupied by this char.
Since K is small, then every distro should be almost “balanced” around the midpoint.
— idea 7
There can be only up to 33 possible midpoints, including those in-between. (If all kills hit the upper end, then midpoint drops by 33/2.) Let’s try each possible midpoint.

Insight — once we pick a midpoint, we know how many we can kill above/below. For example, if we pick the highest midpoint, then all kills must hit lower half. The extreme midpoints are easier due to stringent constraints.

If we pick a central midpoint, then we can kill about 33/2 in each half, but the kills must match — nice constraint.

For each midpoint picked, we run Algo B:
At init-time, we compute the defects in each club/distro using Algo B1:
given the midpoint and the positions of all J’s, the defect count is the number of kills (killing a J or non-J) to balance J’s distro…

We will compile all 26 defect counts. Now we start growing our seed at the chosen midpoint. We expand both ways. If the next 2 positions belong to different clubs, J vs L, we evaluate the two kill options using Algo B2.
One of the two choices will reduce overall defects. We first look at J’s distro. The J kill would improve or worsen J club’s defect.

If there’s an efficient way to compare/evaluate the two kills or update the defect counts, then we have a decent solution.

–idea 9: frq table. If there are 35 odd clubs, then 33 of them can become even, but 2 odd clubs remain –> immediate failure. However, this idea fails with the binary string.
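Separate from the ideas above, the related "max palindrome sub-sequence" problem (mentioned earlier) gives a plain DP sketch: str is a k-palindrome iff len(str) − LPS(str) ≤ k, where LPS (longest palindromic subsequence) equals LCS of the string with its reverse. This is O(n²), so it would be too slow for the 20,000-char bound — a correctness sketch, not the optimal O(n·k) algo:

```python
def is_k_palindrome(s, k):
    """True iff s becomes a palindrome after removing at most k chars.
    The chars we keep form a palindromic subsequence, so the minimum
    number of removals is len(s) - LPS(s)."""
    t = s[::-1]
    n = len(s)
    # rolling-row LCS(s, reversed s); prev[j] = LCS of s[:i-1] vs t[:j]
    prev = [0] * (n + 1)
    for i in range(1, n + 1):
        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return n - prev[n] <= k
```

On the sample test case "abdxa" with k=1, the LPS is 3 ("aba" or "axa"), so 5−3=2 removals are needed and the check correctly returns false.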

## coding drill 19 Jul

  2. — merge sort in-place

— presumably solved

  3. — pacific/atlantic
  4. — max-product subarray
  7. find min in rotated array
  8. — T.isSubtree(S)
  9. — longest run@same char
  10. 3-sum

— previously solved problem forgotten and refreshed

  1. Leetcode #139 word-break. .. not really “medium”
  2. — max sub difference
  3. — min range to include all clubs
  4. — find lone-wolf among AAABBBCCC
  5. — flip binTree
  6. — serialize linked list: each node points to a random node

— unsolved:

  2. — K-palindrome


T.isLikeSubtree(S) #60%

Q (Leetcode 572): Given two non-empty binary trees s and t, check whether tree t has exactly the same structure and node values as a subtree of s. A subtree of s is a tree consists of a node in s and all of this node’s descendants. The tree s could also be considered as a subtree of itself.

==== analysis
The cited solution relies (as “primary check”) on the payloads of each tree node, but in some trees all payloads are empty, or all payloads are either True or False. In these cases, payload comparison is only usable as a secondary check; the primary check must be structural. See key in a tree node

The O(N+K) is plain wrong.

I guess the 2nd solution (and possibly 1st solution) would compare every node in S to the root of T. I think there are more efficient solutions using subtree size and subtree height as secondary checks – more reliable than payload check.

My solution below uses BFT + pre/post/in-order walk !

— Preliminary step: post-order walk to get subtree-size, subtree-height at each S node + T root. (I will skip other T nodes.). Suppose T is size 22 height 4. We will look for any node Y of size 22 and height 4 and a matching payload. This would eliminate lots of S nodes:

If T height is more than 2, then lots of low-level S nodes are eliminated.
If T height is 2 or 1, then T size would be at most 3. Most high-level S nodes are eliminated.

— Solution 1: For both T and S, we take an in-order walk to assign incremental IDs, then take a pre-order walk to produce an array of IDs that represent the tree structure.

Can we run a level-aware BFT? Only one level needs to be examined … wrong!

I think the in-order walk itself is what I need. Find any node Y in S that matches size+height+payload of T root. Suppose ID(Y)=44 but ID(T root) = 4, then simply shift down by 40 and do a linear scan. Must check payload, but not height/size.


min int_range2include all clubs#presorted 60% careercup

Q (careercup): You have k uneven lists of pre-sorted integers, N ints in total. Find the smallest range that includes at least one number from each of the k lists. For example,
List 1: [4, 10, 15, 24, 26]
List 2: [0, 9, 12, 20]
List 3: [5, 18, 22, 30]
The smallest range here would be [20, 24] as it contains 24 from list 1, 20 from list 2, and 22 from list 3
==== Analysis
Looks rather contrived but very popular.
— solution 1 O(N): moving window algo:
Merge all K clubs’ items into a big array — O(N logK) with a K-way merge, effectively O(N) when K is a small constant — where each item’s payload includes its original clubId. We end up with
[0/2nd 4/1st 5/3rd 9/2nd 10/1st …]

Once I come up with this big array, I feel confident to conceive a O(N) moving-window algo.

A ‘basic window’ is any subarray having representatives from each club.
A ‘tight window’ is a basic window that can’t shrink further, i.e. the 1st and last items are the only reps of their clubs.

Now findInitialWindow(), similar to findNextWin().
Simple procedure —
As we build the window, we update a hashtable like {club1: position 1,3,6; club2: position 0,5; club3: position 2,4,..}
Basically an array of queues.
We update this table each time we increment the front pointer in search of the next basic window.
When we find a basic window, this table should have all K queues non-empty and at least one of them singular, probably the last updated queue.

Now shrink this window via shrink(). Here’s a fast algo but more tricky and no O() impact —
For each queue, find the last added i.e. right-most position.
Put these K positions in a container(actually no container needed).
Now out of these K positions, find the minimum. Say 22. I claim 22 is the left edge of a smaller window.
Keep shrinking if we can but I doubt we can.

Here’s a slow shrink algo —
For the current window, look at the clubId of the left edge (back ptr).
Use the clubId to locate a queue in the table.
Head item in the queue should match the position of back ptr.
If not empty, pop this queue and increment the back ptr by one; else exit shrink()

Now we have a tight window, we remember its size (and update current winner).

After shrink(), we call findNextWin() —
Suppose the basic window’s earliest item is a clubX.
Now we move back ptr (clubX queue is now empty).
Now we move front pointer i.e. right edge of the window, looking for the next clubX item.

* findNextWin() is O(1) per front ptr increment. Only touches the tail of queues
* shrink() is O(1) per back ptr increment. Only touches the head of queues
* therefore, at most N front ptr moves and N back ptr moves.. O(N)

PriorityQ can speed up some, but won’t improve bigO
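The moving-window algo above can be sketched as below (a simplification of my own: per-club counts replace the table of position queues, which suffices for the slow-shrink variant; heapq.merge handles the big-array merge):

```python
import heapq
from collections import defaultdict

def smallest_range(lists):
    """Merge the K pre-sorted lists into one array of (value, clubId)
    pairs, then run a moving window that always covers every club.
    Merge is O(N logK); the window scan itself is O(N)."""
    k = len(lists)
    merged = list(heapq.merge(*[[(v, cid) for v in lst]
                                for cid, lst in enumerate(lists)]))
    counts = defaultdict(int)   # clubId -> reps inside the current window
    covered = 0                 # how many clubs have at least one rep
    best = None
    back = 0
    for front, (val, cid) in enumerate(merged):
        if counts[cid] == 0:
            covered += 1
        counts[cid] += 1
        while covered == k:     # shrink from the back while still "basic"
            lo, hi = merged[back][0], val
            if best is None or hi - lo < best[1] - best[0]:
                best = (lo, hi)
            bcid = merged[back][1]
            counts[bcid] -= 1
            if counts[bcid] == 0:
                covered -= 1    # back item was the sole rep of its club
            back += 1
    return best
```

On the example in the question it returns the range (20, 24).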

## reasons4salary_diff ] ibank #Davis,Jay

Primarily supply-demand driven

* Reason (demand-side): age discrimination — Davis pointed out that many ibanks and other employers accept (as reality) paying a higher salary to a fresh grad than to an old-timer, because these employers long for fresh blood

* Reason (demand-side): core dev teams — are more central than peripheral teams like devops (eg QAPM), QA, DBA, reporting,.. Consider Ling, the QAPM devops scripting guru.

* Reason (demand-side): Jay Hu pointed out that hard-core dev skillset — is perceived as more valuable, harder than “tools” including reporting tools, devops tools, automation tools, vendor products that encapsulate the underlying complexities.

Similarly, c++ skillset is often perceived as harder than coreJava
Similarly, coreJava skillset is often perceived as harder than jxee
Similarly, c++/java skillset is often perceived as harder than scripting

On the supply side, there are fewer hardcore c++ developers than coreJava developers, and a lot more jxee developers. However, there’s also the curious case of quant dev — tight supply, but dwindling demand.

If you have some brain power (like the 华中理工 compScience friend of Jay Hu), then don’t be put off by hard-core dev, even though it’s dry, boring, tough, unglamorous. See my blogpost on my geek profile.

Pacific/Atlantic #51%

Q: Given an m*n matrix of non-negative integers representing the height of each unit cell in a continent, the “Pacific ocean” touches the left and top edges of the matrix and the “Atlantic ocean” touches the right and bottom edges. Water can only flow in four directions (up, down, left, or right) from a cell to another one with height equal or lower.

Print all cells where water can flow to both the Pacific and Atlantic ocean.

peek? not yet
— solution 1:
Create m * n nodes in a 2D grid. Two more nodes:
All north|west boundary nodes have an edge From Pacific node
All south|east boundary nodes have an edge From Atlantic node

Between two adj cells, there’s 1 or 2 directed edges, from low to high. Intuition — tracing pollutant upstream. An arrow from P to Q means Q is suspect.

How do I represent the graph? Each node can hold a list of “suspect neighbors” (higher or equal).

Run BFT or DFT from Pacific node and turn on flag in each reachable node.

Run the same from Atlantic. Every reachable node with the flag set is a double-suspect to be printed.

clean-up: non-overlapping intervals #70%

Q (L435): Given a collection of intervals, find the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.


  1. You may assume the interval’s end point is always bigger than its start point.
  2. Intervals like [1,2] and [2,3] have borders “touching” but they don’t overlap each other.

==== analysis:

I think this is the same as the greedy room scheduler. Schedule the earliest-ending task, so as to maximize accepted meetings.

A deselected meeting is an interval removed.
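The greedy room-scheduler view can be sketched as (a minimal sketch; sorting by end point, with touching borders counted as non-overlapping per note 2):

```python
def erase_overlap_intervals(intervals):
    """Greedy scheduler: keep the earliest-ending compatible interval;
    every deselected meeting is one removal. O(N logN) for the sort."""
    intervals.sort(key=lambda iv: iv[1])  # sort by end point
    kept = 0
    last_end = float('-inf')
    for start, end in intervals:
        if start >= last_end:  # borders may touch; that's not an overlap
            kept += 1
            last_end = end
    return len(intervals) - kept
```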

recent algo questions: mostly presented in intArray+binTree

binary tree + int_array represent 60% of my recent algo questions. With string and matrix included, we get 80%.

  • “int_array” also include array of int_tuples. Other arrays are rarely quizzed
  • In the real world, binTree is less common than arrays, but trees in general are more common than binTree. I think interview questions tend to get watered down to the binary species. K-ary trees are rarely quizzed
  • matrix — 95% of Matrix problems are 2D grid problems or integer array problems.
  • graph — Most graph problems are presented as matrix or binary trees, occasionally with cycle.
  • slist — shows up in 5-8% of problem descriptions. Some problems need slist as auxDS.

Leetcode is representative of a constellation of websites.

iterate K pre-sorted uneven immutable lists #FB

Interviewer (Richard) was kind enough to offer me a tip early on, so we didn’t waste time (which could easily have resulted in running out of time).

Q: given K pre-sorted immutable lists, each up to N items, return an iterator that on demand yields each of the (up to K*N) items in sorted sequence.

Estimate time and space complexities.


— I first proposed pair-wise merge. Since there are logK passes, Time complexity == O(NK logK)

Space complexity is tricky. In the very first pass I have to create a single list of up to NK items. Then I can reuse this list in each merge, so space complexity == NK [1], but I said NK*logK. The interviewer wasn’t familiar with this solution and didn’t correct me.

[1] See Suppose 8 lists to merge. I will merge A+B into first quarter of the big array (size NK), then C+D into 2nd quarter… In next pass, I will merge AB + CD in-place using the first half of my big array.

The advantage of this solution — once I create a single merged array, each iteration is O(1). This can be valuable if run-time iteration speed is crucial but initial warm-up speed is unimportant.

bigO insight — merging K pre-sorted arrays (N items in total) is O(N logK); with K=N singleton arrays that’s O(N logN), same as merge sort.

— Then interviewer suggested iterating over the K lists so I implemented the solution in

  • Space complexity = K
  • Time complexity:
    • next() O(logK) due to popping. I said Fib heap has O(1) insert
    • init() O(K)
    • hasNext() O(1)

How about a RBTree instead of heap? Space complexity is unchanged.  Time complexity:

  • next() O(logK) for both remove and insert
  • init()  O(K logK), worse than priorityQ
  • hasNext() O(1)
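The heap-based iterator the interviewer suggested can be sketched as below (class and method names are my own; Python's heapq is a binary heap, so insert is O(logK) rather than the Fib-heap O(1) mentioned above):

```python
import heapq

class KWayIterator:
    """Size-K min-heap of (value, listIndex, positionInList).
    init O(K), next() O(logK), hasNext() O(1), space O(K)."""
    def __init__(self, lists):
        self.lists = lists
        self.heap = [(lst[0], li, 0) for li, lst in enumerate(lists) if lst]
        heapq.heapify(self.heap)  # O(K)

    def has_next(self):
        return bool(self.heap)

    def next(self):
        val, li, pos = heapq.heappop(self.heap)
        if pos + 1 < len(self.lists[li]):
            # replace the popped item with its successor from the same list
            heapq.heappush(self.heap, (self.lists[li][pos + 1], li, pos + 1))
        return val
```

The lists stay immutable; the iterator only stores K cursor positions.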

[19] zbs cf to GTD^compiler+syntax expertise

Why bother — I spend a lot of time accumulating zbs, in addition to QQ halos and GTD know-how

I have t_zbs99 and other categories/tags on my blogposts showcasing zbs (真本事/real expertise) across languages. Important to recognize the relative insignificance of zbs

  • GTD — localSys or std tools … goal is PIP, stigma, helping colleagues
  • QQ — goal is mobility. See the halo* tags. However, I often feel fake about these QQ halos.
  • zbs — goal is self-esteem, respect and “expert” status
    • Sometimes I can convert zbs knowledge pearls to QQ halos, but the chance is lower than I wished, so I often find myself overspending here.

I also have blog categories on (mostly c++) build quirks + syntax tricks. These knowledge pearls fall under GTD or zbs.

post-order walk: create valuable subtree statistics

The implementation is rather simple and short, showing stats like

  • max path sum from current node to any subtree node. Note a path must be non-empty, so this max can be negative
    • We can even save in each node this actual path
  • subtree height,
  • subtree size,
  • d2root — tested but doesn’t use subtree statistics
  • — can be easily implemented:
  • subtree payload sum
  • subtree max payload… These stats are powerful AuxDS

Applicable on k-ary tree
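Such a post-order walk can be sketched on a k-ary tree (the Node class and field names are my own assumptions):

```python
class Node:
    def __init__(self, payload, children=None):
        self.payload = payload
        self.children = children or []
        self.size = self.height = self.path_sum = None  # filled by the walk

def post_order_stats(node):
    """Post-order walk computing subtree statistics: subtree size,
    subtree height, and max path sum from this node down into its
    subtree. The path must be non-empty, so path_sum can be negative."""
    size, height, best = 1, 0, node.payload
    for child in node.children:
        post_order_stats(child)          # children first: post-order
        size += child.size
        height = max(height, child.height + 1)
        best = max(best, node.payload + child.path_sum)
    node.size, node.height, node.path_sum = size, height, best
```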

reverse post-order mirrors pre-order #diagram+Dot

For a given binTree the pre-order sequence is mirrored by the reverse-post-order sequence.

“Reverse” := “right_child then left_child”

I had an “Aha” moment when reading P 389 [[discrete mathematics]]. Insightful illustrated path-tracer diagram showing that pre-order = ABCDEFGHIJ with root as first visited, and J=right most leaf node. The reverse-post-order = JIHGFEDCBA with root as the last visited, due to post-order. Note the DOT is on the Left for both reverse-post and pre.

( has standard diagrams with a DOT on the left for pre-order; post-order has dot on the right of each node)

Note any post-order curve is less intuitive and less visual (than pre-order) because the curve hits the root of a subtree upon leaving it! The earlier encounters are touch-n-go. The dot can reinforce this key insight. The dot can help bridge the conceptual gap.

Similarly post-order sequence is mirrored by reverse-pre-order. For the same binTree above, reverse-pre-order = AFGHJIBDEC (not mentioned in the book). Post-order = CEDBIJHGFA, as printed in the middle of P 389. Note root is last visited due to post-order.
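The mirror property is easy to verify in code (a small sketch, with "reverse" meaning right child before left child, as defined above):

```python
class BNode:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def pre_order(n, out):
    if n:
        out.append(n.val)            # root first
        pre_order(n.left, out)
        pre_order(n.right, out)

def reverse_post_order(n, out):
    if n:
        reverse_post_order(n.right, out)  # right child first
        reverse_post_order(n.left, out)
        out.append(n.val)            # root last
```

Reading the reverse-post-order sequence backwards reproduces the pre-order sequence exactly.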

max proceeds selling 2 homes #Kyle

Q: You own a residential and an investment property. You have the historical price time series for both. You want the max aggregate proceeds you could have achieved selling both, provided you sell the investment property before the residential property. You pick a time to sell the investment and a strictly later time to sell the residential.

I created this question for
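One possible O(N) approach (my own sketch, not necessarily the intended answer): a single reverse pass maintains the suffix max of the residential series, so each investment sell-time i can be paired with the best later residential price:

```python
def max_proceeds(invest, resid):
    """Pick time i to sell the investment property and a strictly later
    time j > i for the residential one, maximizing invest[i] + resid[j].
    O(N) time, O(1) space."""
    best = float('-inf')
    suffix_max = float('-inf')   # max resid price strictly after position i
    for i in range(len(invest) - 2, -1, -1):
        suffix_max = max(suffix_max, resid[i + 1])
        best = max(best, invest[i] + suffix_max)
    return best
```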

find offending section ] originally-sorted array

I rephrase Leetcode 581 Q: There was an ascending int array possibly containing duplicates. Someone reshuffled some elements so it is no longer sorted. Write a function to determine the shortest subarray that needs re-sorting. Before and after this subarray, no change is needed.

O(N) time and O(1) space.

==== analysis
Not “easy” as labelled.

I want to find lowest integer that’s out of place. (Do the same on the other end and problem completed.)

first scan to find global min and max values, and ensure they are correctly placed.

Next scan from beginning on the rising slope. At first down-turn, treat the array as two partitions. In the right partition after the first peak, we determine the lowest value, say, -55. Then we will find the correct position for -55, within the left partition! We will look for the last item same or lower than -55. This is the key constraint of this entire problem.

Insight — We know -55 must move left and we don’t care about any other item in the right partition.

Insight — if within left partition, -55 should be after #7, then we know all earlier items up to #7 are already correctly placed. Why? All subsequent items in the left section are strictly higher than -55; and all other items in right section are all (equal or)higher than -55.

— my idea 1 in O(N) space: radix sort a clone of the array, then compare it to the input array.
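A known two-scan technique (not the partition idea above) also meets the O(N) time / O(1) space bound; here is a sketch returning the length of the offending section:

```python
def unsorted_subarray(nums):
    """Forward scan tracks the running max; the last position below that
    max is the right edge of the offending section. A mirrored backward
    scan with the running min finds the left edge."""
    n = len(nums)
    right = -1
    run_max = float('-inf')
    for i in range(n):
        if nums[i] < run_max:
            right = i           # nums[i] is out of place
        run_max = max(run_max, nums[i])
    if right == -1:
        return 0                # already sorted
    left = n
    run_min = float('inf')
    for i in range(n - 1, -1, -1):
        if nums[i] > run_min:
            left = i
        run_min = min(run_min, nums[i])
    return right - left + 1
```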

count unique squares ] matrix #70%

Q1: Count unique squares within a R*C matrix. A unique square is identified by two opposite corners. For example, [0,0] and [1,1] identifies a 4-cell square; [2,3], [2,3] identifies a 1-cell square.

Q2: print out all of them.


Q1 feels like a math problem, but printing is an algo question.

— solution 1: Let’s focus on height-2 for now. I will denote the subset of squares having height-2 as “2-subset”. (The 1-subset is trivial.)

Insight — 2-subset is naturally non-overlapping with the 3-subset, 4-subset etc. These non-overlapping subsets constitute the whole solution. This fact is extremely valuable to the count algorithm.

For each height-2 square, I will focus on the north-west corner, i.e. the corner among the four having the lowest r and c IDs. We can divide the 2-subset by rowID. For a 55-row/44-column matrix, there are 43 height-2 squares with rowID=0 (colID running 0 to 42). Same count for each rowID from 0 to 53.

Aha — each square can be identified by the north-west corner {rowID, colID} + height. So at the higher level I divide the whole solution set by height. At the lower level, I sub-divide the set by the “low rowID”
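Since the h-subsets are non-overlapping, the Q1 count is just a sum over heights (a one-liner sketch; count_squares is my own name):

```python
def count_squares(R, C):
    """Side-h squares have (R-h+1)*(C-h+1) possible north-west corners;
    the h-subsets don't overlap, so sum over every feasible height h."""
    return sum((R - h + 1) * (C - h + 1)
               for h in range(1, min(R, C) + 1))
```

For printing (Q2), the same double loop over h and the NW corner coordinates enumerates every square exactly once.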

bigO CIV: clarify_requirement means..worst_data

———- Forwarded message ———
Date: Tue, 9 Jul 2019 at 23:10

Big-O means “worst” case complexity. We need to be doubly sure what “worst” case data look like. (Note it doesn’t mean ridiculously rare data like integer overflow.)

On high-end coding interviews, these “worst data set” frequently shed light on the structure of the original problem, and often point to the correct direction for an efficient solution. (Note a nice-looking solution is not efficient if it exhibits poor time complexity in the face of a worst yet realistic data set. Such a system may not scale well in practice, given its Achilles’ heel.)

If interviewer wants a solution optimized for bigO time complexity, you can waste precious time tackling the wrong problem. The wrong problem is the problem defined by your idea of “worst data set”, but the real worst data set is very different.

Sometimes, the worst data is _subtly_ different but it points to a different algorithm direction. It may point to an iterative solution rather than recursion, for example.

The original interview question may not be that hard IF we identify the worst data set early on; sadly, we often run out of time.

For some tough problems, the first (sometimes the only) challenge is quickly identifying worst data set. Interviewer always gives you the simple data set, NEVER the worst data set. It’s your job to identify the worst data set… often the key insight into the problem.

It’s no exaggeration to say — identifying the worst data set early or too late can make-or-break your chance at this employer. You may kiss good-bye to this job opportunity exactly because you are too slow to identify the worst data set. I know what this means. Other candidates probably got shot by this arrow on the heel, without knowing what happened.

Time is ticking as soon as the problem is presented to you. If interviewer says time is not a factor… take your time.. we want quality and thought process … I just ignore it. Our speed is always part of assessment. Therefore, a very common failure scenario is —

.. tractable problem but candidate runs out of time after wasting time on preliminaries like using incorrect worst data set to reason about the (wrong) problem.

print height@subtree at every node #post-order

For these problems, elegant implementation is expected.

Note in-order is for binary tree, but post-order is for any tree.

== Q: given Any tree without cycle, print the size of subtree at every node

I think post-order walk is usable.

== Q: given Any tree without cycle, print the height of subtree at every node. Also print the node address just for identification

Intuition — dft post-order walk. Need to become more familiar with the detailed steps. has details.

== Q (Leetcode): Given a binary tree, determine if it is height-balanced — a binary tree in which the depth of the two subtrees of every node never differ by more than 1.

I can try it on Leetcode .. no need to set up test harness myself but I might be frustrated

Intuition — post-order walk. A simple recursive algo — for any node, find its subtree height including itself and return it. Before returning, also compare the left vs right subtree heights.
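The recursive algo above can be sketched as follows (a sketch with a minimal assumed BNode class; returning -1 short-circuits as soon as any subtree is found unbalanced):

```python
class BNode:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def is_balanced(root):
    """Post-order walk: each call returns the subtree height,
    or -1 once a depth difference greater than 1 is detected."""
    def height(node):
        if node is None:
            return 0
        lh = height(node.left)
        if lh < 0:
            return -1
        rh = height(node.right)
        if rh < 0 or abs(lh - rh) > 1:
            return -1
        return 1 + max(lh, rh)
    return height(root) >= 0
```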

array of std::byte^unsigned-char as “byte array”

Background: Suppose you need to store or transfer arbitrary binary data.

In RTS, we used char-array. Some say we should use unsigned char. It is the only data type that is guaranteed (by the ANSI C Standard) to have no padding bits. So all 8 bits in an unsigned char contribute to the value. None of them is a padding bit. I think std::uint8_t is similar but less common.

Contrary to some online posts, unsigned-char type is different from “char” —

In C++17, std::byte is probably preferred because only the bitwise operations are defined. I believe you can reinterpret_cast a pointer to std::byte*. In, I used none of the above — I used placement-new 🙂

In java we use the primitive “byte” type — an 8-bit signed integer.

realities@Canada universities #Davis

Davis agreed the quality of education is good in Canadian universities, but students are less aggressive (I would say hungry) than in the U.S. He studied in a lesser-known college in Canada. His (and his classmates’) salary grew but remained much lower than in the U.S., even pre-tax.

I feel U.S. pays high salary to software engineers. Canada may not.

(Same can be said about U.K. and France.)

Davis said “no investment banking jobs” (Same can be said about Australia.)

He gave examples to show — job opportunities are much fewer there.

(Same can be said about Australia.)

U.S. college graduates can find internships + job in the U.S., but Canadian college grads can’t so they usually work in Canada, with much lower salaries. Like in other countries, only a minority can break into more lucrative job markets overseas. There’s clearly an invisible barrier between Canada and U.S. job market. I thought there was none. Now I think the barrier is visa —

Canadian graduates need a visa to work in the U.S. The TN visa is usable, but employers still need extra legwork. Overall, Davis perceives a real barrier to entry. In 2016 I experienced something similar — beyond the heavy door there’s a land of lucrative job opportunities. I wrote a blogpost about it.

  • I overestimated the “level playing ground” in U.S. job market. In reality, local candidates have an advantage bigger than I assumed.
  • I tend to overestimate my (or my kids’) raw talent. Even without extraordinary talent we can kill the interview, but we are not so extraordinary.
  • I underestimated the entry barrier.
  • Best example — I can see the advantage of Wall St as a job market relative to London (per Ram) and Singapore. Even a mediocre java guy can earn 150k. However, so many competent techies in Asia experience many obstacles when they try to break into Wall St. Most of them don’t bother to try.

hash table of integer_ratios #slope

It’s unreliable to use arbitrary ratios, stored as floating-point values, as hash table keys — two equal ratios may not compare equal after rounding. We hit this problem when matching line slopes. Python’s fractions module provides a solution —

  • the gcd(int1, int2) function finds the greatest common divisor of two integers. This is the key to our problem.
  • convert a string like ‘x/y’ to a simplified Fraction object
  • convert a decimal.Decimal to a Fraction
  • convert a float to a Fraction

I think other languages also provide the same gcd functionality.
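For example, C++17 provides std::gcd. A hedged sketch of reducing a slope dy/dx to a canonical integer pair usable as a hash key (the sign and vertical/horizontal conventions here are my own choices):

```cpp
#include <cstdlib>
#include <numeric>   // std::gcd (C++17)
#include <utility>

// reduce slope dy/dx to a canonical pair; equal slopes map to equal pairs
// convention: dx kept positive; vertical line -> (1,0); horizontal -> (0,1)
std::pair<int,int> canonSlope(int dy, int dx) {
    if (dx == 0) return {1, 0};              // vertical
    if (dy == 0) return {0, 1};              // horizontal
    int g = std::gcd(std::abs(dy), std::abs(dx));
    dy /= g; dx /= g;                        // fully reduced fraction
    if (dx < 0) { dy = -dy; dx = -dx; }      // normalize the sign
    return {dy, dx};
}
```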

low_churn+low_stress domains

–high churn high expectation is worst, but I have no experience with it

  • devops – churn due to integration with automation tools

–low churn high expectation

  • xp: Macq quant-dev domain
  • xp: PWM: SQL + Perl + java
  • xp: Qz

–high churn low expectation (low caliber)

  • xp: dotnet at OC
  • mediocre web shops

–low churn low expectation is ideal

  • xp: RTS socket + mkt data
  • xp: citi muni: bond math + coreJava


Minor QnA with Stroustrup

Q: memory efficiency is a traditional advantage of c++. Is that still true?
A: yes

Q: some of the top innovators in c++ including yourself, Andrei Alexandrescu, Herb Sutter, … are not contributing to java, c# or other languages. I wonder why.
A: Andrei is working on D.
%%A: GC languages have their constraints. Many of the c++ innovations don’t easily apply. However, the cross pollination is more natural between c and c++. Stroustrup said he might be one of the biggest contributors to C feature improvements.

Q: strongholds of c++?
A: complex code, close-to-hardware

Q: do you agree that the new crop of language features are used mostly for system programmers, library developers … not application developers?
A: See the stronghold answer

Q: what competitors do you see over the next 20 years? Go? Rust?
A: none

std::sort() beating qsort() #inline

Stroustrup was the first one to tell me c++ std::sort() can beat C qsort() easily. One explanation:

Since the qsort() code is compiled ahead of time and is found inside the shared libc binary, there is no chance that the comparator function, passed as a function pointer, can be inlined. Another explanation:

For qsort(), since the function is passed as a pointer, and the elements are passed as void pointers as well, it means that each comparison costs three indirections and a function call.

In C++, the std::sort is a template algorithm, so that it can be compiled once for each type. The operator< of the type is usually baked into the sort algorithm as well (inlining), reducing the cost of the comparison significantly.
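A minimal sketch of the two call styles (the comparator and helper function are my own illustration):

```cpp
#include <algorithm>
#include <cstdlib>   // std::qsort
#include <vector>

// C-style comparator: reached through a function pointer, via void*,
// so the precompiled libc qsort() machinery cannot inline it
int cmpInt(void const* a, void const* b) {
    int x = *static_cast<int const*>(a), y = *static_cast<int const*>(b);
    return (x > y) - (x < y);
}

std::vector<int> sortBothWays(std::vector<int> v) {
    std::vector<int> w = v;
    std::qsort(v.data(), v.size(), sizeof(int), cmpInt);  // 3 indirections + call per compare
    std::sort(w.begin(), w.end());   // operator< baked into the instantiation (inlinable)
    return v == w ? w : std::vector<int>{};  // both must produce the same order
}
```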

QQ study=burning_joy

See also the pleasure^chore framework

Unbelievably, QQ study feels like a “burning” pleasure. After I complete some chore like a coding drill or taxes, I long for some QQ study!

There are very few burning pleasures:

  • tech study including zbs, halo…
  • localSys hacking — camp_out
  • weekend coding assignments
  • jogging in stadium
  • learning Oracle or unix in my younger days — camp_out

signed-int: longest K-sum subArray #60%

Q: given a signed-int array, find the longest subarray having sum=K

realistic scenario: we know someone executed a buy and a sell and made $K profit. When I review the price points, there are many possible buy–sell pairs. We know this trader held the position as long as possible to gain exposure, so which buy–sell points did he use?


First scan to build cumulative sum — the sum-array. Looks like a price chart. Initial value is the first original element.

Now scan this sum-array. At each element s[j],

  • I save s[j] in hashtable {s[j] -> j} iFF no such key yet.
  • Then I look for the (by design, earliest) element s[i] such that s[i] + K == s[j]. If found in the hash table, then compute hold:=j-i and update the global longest_hold if necessary.

Note the hash table’s values (positions) are all earlier than j, since we are scanning forward.

Can consolidate to one scan.

Tip — a hash table can reduce the big-O here, thanks to the exactly-K requirement.
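The one-scan consolidation can be sketched as follows (my own C++ reconstruction of the algo above; names are mine):

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

// prefix-sum + hash table, single scan
// returns length of the longest subarray summing exactly to K (0 if none)
int longestKSum(std::vector<int> const& a, long long K) {
    std::unordered_map<long long,int> first;  // prefix sum -> earliest index
    first[0] = -1;                            // empty prefix, before index 0
    long long s = 0; int best = 0;
    for (int j = 0; j < (int)a.size(); ++j) {
        s += a[j];
        auto it = first.find(s - K);          // earliest i with s[i] + K == s[j]
        if (it != first.end()) best = std::max(best, j - it->second);
        first.emplace(s, j);                  // keep only the first occurrence
    }
    return best;
}
```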

Q: what could derail work-till-70 plan

Q: what things can derail my work-till-70 plan? Let’s be open and include both realistic and theoretical factors.

  • A: I think health is unlikely to be the derailer. In contrast, IV competition and age discrimination are more likely. The ruthless march of technology also means demand for my skillset will decline.
  • A: mental health? Look at GregM at RTS. See other blogposts on brain aging.
  • A: On the job, my energy, absorbency and sustained focus might go down (or up) with age. I wrote more in another blogpost — As I age, brown-field localSys will become harder to absorb. I may need to stay at one system longer.
    • On the other hand, if offered the chance to convert from contractor to FTE, I may need to resist and possibly move out.
  • A: On interviews, I think my QQ knowledge will remain competitive for many years.
  • A (pessimistic view): green field vs brown field — as I age, my capacity to handle green field may go down. My speed to learn brown field codebase may also go down but after I learn it, I may be able to retain the knowledge.
  • A1: #1 derailer is demand for my skills. In fact, besides doctors, Wall St tech might be one of the most enviable domains for work-till-70. Note “tech” also includes BAU, sysAdmin, projMgr, productMgr and other support functions.
  • A1b: Rumor has it that the west coast is more competitive and age-unfriendly, so techies there in their 40’s may find it harder to remain hands-on than on Wall St. I have a natural bias towards the WallSt contract market. If the rumor is confirmed, then Wall St is better for older programmers.

a few common memory techniques in my HRT assignment

Technique: reinterpret_cast

Technique: memcpy

Technique: pre-allocated buffer as a local static object … can be better than “stack” allocation, but only for a large buffer.

Unlike local variables, local static objects can be returned by pointer — major flexibility in low-level memory management. Another blogpost discusses the runtime cost of static-local vs local.
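A tiny sketch of the static-buffer technique (names and size are my own illustration; note it is not thread-safe as written — every call shares one buffer):

```cpp
#include <cstddef>

// a pre-allocated buffer as a function-local static: unlike a stack
// array, its address remains valid after the function returns
char* scratchBuffer(std::size_t& cap) {
    static char buf[1 << 20];   // 1 MiB, allocated once, not on the stack
    cap = sizeof buf;
    return buf;                 // returning a local (stack) array here would be a bug
}
```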

std::vector capacity reduction

Q: how would vector’s backing array reduce in size? In other words, how would the capacity ever reduce?
%%A: The only hope is shrink_to_fit(), which is a non-binding request to the implementation. The implementation may ignore it.

If capacity reduction does happen at runtime, then reallocation would probably happen.

I believe resize() assign() clear() etc will never reduce vector capacity.
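A small demo of the non-reduction behavior (my own illustration; exact capacities are implementation-dependent):

```cpp
#include <cstddef>
#include <vector>

// capacity survives a downsizing resize(); shrink_to_fit() afterwards is
// only a non-binding request (most implementations honor it by reallocating)
std::size_t capAfterDownsize() {
    std::vector<int> v(1000, 7);
    v.resize(10);              // size drops to 10...
    return v.capacity();       // ...but capacity stays >= 1000
}
```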

[17] string(+vector) reserve() to avoid re-allocation #resize()wrong

See also std::vector capacity reduction and vector internal array: always on heap; cleaned up via RAII

string/vector use expandable arrays, which have to be allocated on heap, not stack.

For vector, it’s very simple —

  • resize() creates dummy elements iFF upsizing
  • reserve() only allocates spare “raw” capacity without creating elements to fill it. See P279 [[c++stdLib]]

[[Optimized C++]] P70 confirms that std::string can get re-allocated when it grows beyond current capacity. The bullets below compare std::string.resize() vs reserve().

  • Backgrounder: Capacity — allocated capacity is often uninitialized memory reserved for this string.
  • Backgrounder: size — length of the string that’s fully initialized and accessible. Some of the characters (nulls) could be invisible.
  • string.resize() can increase the string’s size by filling in space (at the end) with dummy characters (or a user-supplied char).
    • After resize(55) the string size is 55. All 55 characters are part of the string proper.
    • changes string’s size
  • string.reserve() doesn’t affect string size. This is a request to increase or shrink the “capacity” of the object. Capacity (say 77) is always at least the size (say 55). The 77 – 55 = 22 extra slots allocated are uninitialized! They only get populated after you push_back or insert into the string.
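A small demo of the size/capacity contrast (the values 77 and 55 echo the bullets above; otherwise arbitrary):

```cpp
#include <cassert>
#include <string>

// resize() changes the size (filling with '\0' or a supplied char);
// reserve() changes only the capacity, leaving size untouched
void demoResizeVsReserve() {
    std::string s = "abc";
    s.reserve(77);                                 // capacity request only
    assert(s.size() == 3 && s.capacity() >= 77);   // spare slots uninitialized
    s.resize(55, 'x');                             // 55 real characters now, padded with 'x'
    assert(s.size() == 55 && s[54] == 'x');
}
```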

python bisect #cod`IV

The bisect module is frequently needed in coding tests esp. codility. In this write-up, I will omit all other function parameters.

* bisect.bisect_right(x)  # returns a position “i” just after the last hit, if any:
all values are <= x in a[lo:i] on the left side, and all values are > x in a[i:hi] on the right side. So the “hit” items are on the left, unconditionally 🙂
* bisect.bisect_left(x) # returns the index “i” of the first hit (or the insertion point if no hit):
all values are < x in a[lo:i] on the left side, and all values are >= x in a[i:hi] on the right side.

My sound bites:

  • bisect_left(needle) returns the first index above or matching needle.
  • bisect_right(needle) returns the first index above needle, unconditionally.

A few scenarios:

  1. If no perfect hit, then both functions return the same value.
    • Common scenario: if the needle is higher than all elements, then both return lastIndex + 1.
    • Common scenario: if the needle is lower than all elements, then both return 0.
    • In all cases, you can always insert before this position.
  2. If you get a perfect hit on a list value, bisect_left would return that “perfect” index, so bisect_left() is more useful than bisect_right(). I feel this is similar to std::lower_bound.
    • This is confusing, but bisect_right() would return an i such that a[i-1] == x, so the returned “i” value is higher. Therefore, bisect_right() never returns the “perfect” index.
  3. If you have a lower-bound input value (like minimum sqf) that might hit, then use bisect_left(). If it returns i, then all list elements qualify from i to end of list
  4. If you have an upper-bound input value that might hit, then use bisect_left(). If it returns i, then values a[0:i] are strictly below the bound, and a[i] also qualifies if it’s a perfect hit. I never use bisect_right.
  5. Note the slicing syntax in python a[lo] to a[i-1] == a[lo:i] where the lower bound “lo” is inclusive but upper bound “i” is exclusive.
import bisect
needle = 2
float_list = [0, 1, 2, 3, 4]
left = bisect.bisect_left(float_list, needle)
print('left (should be lower) =', left)  # 2

right = bisect.bisect_right(float_list, needle)
print('right (should be higher) =', right)  # 3

new languages lose limelight soon

New languages come into and go out of fashion, like hairstyle and cars.

  • Java has lost the limelight for 10 years but is still fairly popular, thanks to the job market and high-profile, “cool” companies like Google.
  • C and C++ are among the top 5 longest-living languages. They are unpopular among the young, but they remain important.
  • c# is a curious case. I think the really important windows applications are still written in c++.
  • Many high-level languages go out of fashion, including C#, D, ..
  • Most scripting or dynamic languages go out of fashion


G6 c++skills { Stroustrup chat

  1. memory mgmt – for time and space efficiency
  2. reusable techniques to use stack objects instead of heapy thingies
    • reduce reliance on dynamic containers
  3. virtual — reusable techniques to avoid virtual
  4. other compiler internals that impact latency
    1. knowledge of inline — to trade off latency against code size
  5. safe c++ techniques, to reduce the chance of run-time errors. C++ is particularly vulnerable as it is
    • more complex than C and others and
    • more close-to-hardware than Java and other languages.
  6. TMP — to trade off latency against complexity. But this zbs domain is too big for me


vector internal array: always on heap; cleaned up via RAII

Across languages, vector is the #1 most common and useful container. Hashtable might be #2.

I used to feel that a vector (and hashtable) can grow its “backing array” without heap. There’s global memory and stack memory and perhaps more. Now I think that’s naive.

  • Global area is allocated at compile time. The array would be fixed in size.
  • For an array to be allocated on stack, the runtime must know its size. Once it’s allocated on that stack frame, it can’t grow beyond that stack frame.
  • In fact, no array can grow at all.

The vector grows its backing array by allocating a bigger array and copying the payload. That bigger array can’t be in the global area, because this type of on-demand reallocation happens at run time, not compile time.

Anything on heap is accessed by pointer. Vector owns this hidden pointer. It has to delete the array on heap as in RAII. No choice.

Now the painful part — heap allocation is much slower than stack allocation. Static allocation has no runtime cost at all after program starts.
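A small demo of the on-demand growth (my own illustration; the exact growth factor is implementation-dependent):

```cpp
#include <cstddef>
#include <vector>

// pushing past capacity triggers allocate-bigger-then-copy on the heap;
// the hidden pointer inside the vector ends up pointing at a new array
std::size_t growthDemo() {
    std::vector<int> v;
    v.reserve(4);                                  // small initial capacity
    for (int i = 0; i < 100; ++i) v.push_back(i);  // forces reallocation(s)
    return v.capacity();                           // grew well beyond the initial 4
}   // RAII: the heap array is deleted here, no choice
```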

kernels(+lang extensions)don’t use c++

Q: why are kernels usually written in C, not c++? This question underpins the value and longevity of the two languages.

I asked Stroustrup. He clearly thinks c++ can do the job. As to why C still dominates, he cited a historical reason. Kernels were written long before C++ was invented.

Aha — I think there are conventions and standard interfaces (POSIX is one of them)… always in C.

I said “The common denominator among various languages is always a C API”. He said that’s also part of what he meant.

benchmark c++^newer languages

C++ can be 5x faster than java if both programs are well-tuned — A ball-park estimate given by Stroustrup.

In many questionable benchmarks, the c++ code is written like java code — lots of pointers, virtual functions, no inlining, perhaps too many heap allocations rather than stack variables.

Many other benchmarks are similarly questionable. The newer languages out there are usually OO and rely on GC + pointer indirection. If you translate their code literally to C++, the resulting c++ code would be horribly inefficient, not taking advantage of the c++ compiler’s powers. An expert c++ developer would rewrite everything to avoid virtual functions and favor local variables and inlining, and possibly use compile-time programming. The binary would usually become comparable in benchmarks. The c++ compiler is more sophisticated and has more optimization opportunities, so it usually produces faster code.

append+maxInRangeK #Rahul

Q: given a sequence of integers, implement two operations

  1. Operation 1: append another integer. Size becomes N.
  2. Operation 2: given two arbitrary subscripts into the sequence (they define a sub-array of size K), find the max in that sub-array.

Rahul saw discussions without any optimal solution. I think a simple vector-based algo can achieve amortized O(1) for Append and O(logK) for Max.

AuxDS required.

— idea 1: segment the sequence into fixed segments
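Idea 1 can be fleshed out as sqrt-decomposition (my own sketch; caveat — the range-max here costs about O(√N) per query, not the O(logK) hoped for above; a segment tree would achieve O(logN)):

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// sqrt-decomposition: fixed-width segments, each caching its running max
// append: O(1); rangeMax over [lo,hi]: O(B + N/B), about O(sqrt N)
struct MaxSeq {
    static constexpr int B = 64;    // segment width (tunable)
    std::vector<int> a;             // the sequence
    std::vector<int> segMax;        // max of each (possibly partial) segment

    void append(int x) {
        if (a.size() % B == 0) segMax.push_back(x);        // a new segment opens
        else segMax.back() = std::max(segMax.back(), x);
        a.push_back(x);
    }
    int rangeMax(int lo, int hi) const {                   // inclusive bounds
        int best = INT_MIN;
        for (int i = lo; i <= hi; ) {
            if (i % B == 0 && i + B - 1 <= hi) {           // whole segment inside range
                best = std::max(best, segMax[i / B]);
                i += B;
            } else {                                       // partial segment: walk elements
                best = std::max(best, a[i]);
                ++i;
            }
        }
        return best;
    }
};
```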


A/Bob/C: cod`IV #vec for python list

Here are some variable names frequently needed in coding puzzles:

  • Once you declared a long, meaningful var name, you could use abbr of it afterwards.
  • if you have an array of composite objects like “frame” or “trade”, each represented by an inner array, then be careful with frame[1]. Better to call it aFrame[1] or frames[0].
  • what about “sorted” which is a python function name:( …. I would use ascd by default.
  • B4 means “before”; Af means “after”
  • hm means hashtable; tm means treeMap or std::map
  • targetRepetition or tgtRep
  • “val” — payload data filed in a Node class. This name is short, easy to type. Harder to global-replace but LG in a short program
  • “candd” — candidate
  • “End” for the last element in a container
  • li means list … vec means vector … arr means array
    • In python, we need extra caution to use “vec” not “li” for the python list !
  • cnt means count or counter
— for a matrix
  • M, m or mat for the matrix
  • r as the 1st index; c as the 2nd index. Avoid x/y
  • height/rows as number of rows; width/cols as number of columns
  • I will refer to each element as a “cell”

—It’s more useful to have personal names than mere “Person A”. Shorter names are easier to write. Gender doesn’t matter.

  1. Alan, Alice, Aaron,
  2. Bob, Ben (Big Ben), Bill,
  3. Cody,Chris, Charlie
  4. Dan, David
  5. Ed, Emily, Elizabeth (queen)
  6. Fred
  7. Greg, Gail
  8. Henry, Harry,
  9. Ilya, Ivy, Ian
  10. Jen, Jim, Joe, Jack, John, Jason,
  11. Ken, Kate, Kevin,
  12. Lucy, Lee, Lilly, Larry,
  13. Mike, Mary, Matt,
  14. Nick, Nancy,
  15. Oliver, Oscar, …
  16. Pam, Peter, Paul, Peggy
  17. Quincy
  18. Raj, Rob, Ray,
  19. Sam, Sue,
  20. Tom, Ted, Tim,
  21. Uday ….
  22. Vic, Venkat
  23. Wendy, Will
  24. Xavier, Xander/Zander …
  25. Yixin, Yang ….
  26. Zofia, Zack,

when2introduce new tokens into O()

The extra tokens like A, B, C in an O(A + B·B + logC) are not always independent dimensions.

  • Case: DFT/BFT have O(V+E) but some people may say V <= E, so why not remove V and say O(E).

I think dropping V loses information — E could greatly outnumber V, and if another algo is O(V+logE), keeping both tokens makes the comparison obvious.

  • case: radix sort is O(W·N), but some may say W <= N, so why not remove W and say O(N·N)?

Well, W is typically log(unique count among N), though W can be half N.

Some bigO students don’t like too many tokens and may simplify O(M·N) to O(K·K) where K is the larger dimension, but I think it’s imprecise. If M is very small, like logK, then O(M·N) is much faster than O(K·K).

Sunday night: if in the mood4localSys

Realistic scenario — I find myself in the mood for localSys learning on a Sunday night 11pm.

I think it’s better to sleep in the office than to go home; but in Singapore, I had better go home to sleep, taking a taxi.

I think it’s better to work on the localSys till 2am (or later). Those 3 hours are precious engagement … Burning pleasure. I don’t get such 3-hour engagements in a week.

I used to feel “why not work on my QQ or coding practice now, and focus on work Monday morning?” It turns out that I’m usually less motivated on Monday morning, for whatever reasons.

Friday night is similar. I often find my highest appetite and absorbency on Friday nights (or eve of public holidays). So better go home late, sleep, then come back to office Saturday early morning to capture the mood.

[19] y WallSt_contract=my best Arena #Grandpa

Competition arenas … we all choose the arena to compete in. It’s a choice, implicit or explicit. I would say it’s better to be conscious about this choice.

Not much new content in this blogpost. I feel very convinced to stick with the WallSt contract market. Here is a ranking of the reasons why I consider it a rational decision, though our decisions are shaped by our deeply personal experiences and are inherently irrational.

Beware of attachment !

  1. low stress, low expectation — my #1 reason as of 2019
  2. low-caliber competitors, mostly due to the “off-putting” factors listed below
  3. age friendly
  4. I get to practice interviews and keep a precious burning-pleasure for a month each year on average.
    1. In contrast, If I were an ibank VP I would have multiple obstacles on that front.
  5. — other reasons to prefer Wall St contract
  6. higher probability of greenfield projects
  7. leverage on domain knowledge
  8. I can easily explain my job hopping profile
  9. a number of firms to hop around

Now the downsides — the off-putting factors. Many bright or young competitors are put off, reducing the competition:

  • salary can drop; furlough
  • frequent job change required — often forced to change job.
  • no job security — Employers tend to cut contractors first.
  • no compensation package
  • no growth, no bonus
  • no chance to move off hands-on dev and become prj mgr etc
  • no medical benefits
  • restricted to mostly ibanks — many young bright people are interested in buy-side or non-finance.

baseclass(template)use subclass field+!vptr

A very restrictive context —

  1. the base and sub classes represent market data messages, used in a reinterpret_cast context, with zero padding. Therefore, vptr would add a pointer and mess up reinterpret_cast.
  2. multiple (say 11) subclasses have a “price” field, so I don’t want code duplication 11 times
  3. The field order is fixed in the struct definition. Without this restriction, I would define the “price” field in a base struct along with a getPrice() method. With a subclass instance, CRTP could probably work like static_cast<Subclass const*>(ptr)->getPrice(), but the “price” field’s physical offset would be dictated by the compiler, not by my struct definition.

My technique uses CRTP but no SFINAE no enable_if.

My technique is easier to /internalize/, as it relies on simple overload resolution + simple type deduction. In contrast, the SFINAE technique used in my RTS codebase is off-putting and alien to me
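A hedged reconstruction of the technique (struct names and fields are my own illustration; this is not the RTS code):

```cpp
#include <cstring>

// CRTP: the empty base template reads the subclass's "price" field;
// no vptr is introduced, so the byte layout stays reinterpret_cast-friendly
template <class Msg>
struct PriceReader {
    double getPrice() const {
        return static_cast<Msg const*>(this)->price;  // field lives in the subclass
    }
};

struct QuoteMsg : PriceReader<QuoteMsg> {   // empty base: adds zero bytes
    char   side;     // field order dictated by *this* struct, not the base
    double price;
};

// simulate the market-data use: treat received bytes as a QuoteMsg
double readPriceFromWire() {
    QuoteMsg m{}; m.side = 'B'; m.price = 101.5;
    alignas(QuoteMsg) unsigned char wire[sizeof(QuoteMsg)];
    std::memcpy(wire, &m, sizeof m);                     // the "received" bytes
    auto const* q = reinterpret_cast<QuoteMsg const*>(wire);
    return q->getPrice();                                // base reads subclass field
}
```

All 11 price-carrying message structs would derive from the same PriceReader, removing the duplication without adding a vptr.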

wildcard matching #more tractable than regex_FB

Q (Leetcode44): Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for ‘?’ and ‘*’. The matching should cover the entire input string (not partial).

‘?’ Matches any single character.
‘*’ Matches any sequence of characters (including the empty sequence). The star is not a quantifier. I think a single star is like the “.*” in perl.

no space allowed.

I think a wildcard can sit at end of string

I think we could see two wildcards in a row. Two stars can be treated as one star.


I feel this is simpler than the regex-matching problem, therefore easier to absorb. I think either DP or my original recursive solution should work.

Once we have a topdn-memoization solution (published on github) working, we can work on a botup.
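A sketch of a topdn-memoization solution (my own reconstruction, not the github version):

```cpp
#include <string>
#include <vector>

// memo[i][j]: 0 = unknown, 1 = match, 2 = no-match, for s[i:] vs p[j:]
bool matchHelper(std::string const& s, std::string const& p,
                 std::size_t i, std::size_t j,
                 std::vector<std::vector<char>>& memo) {
    char& m = memo[i][j];
    if (m) return m == 1;                   // already solved this subproblem
    bool ans;
    if (j == p.size())
        ans = (i == s.size());              // pattern exhausted: need string exhausted too
    else if (p[j] == '*')                   // star: match empty, or consume one char of s
        ans = matchHelper(s, p, i, j + 1, memo) ||
              (i < s.size() && matchHelper(s, p, i + 1, j, memo));
    else                                    // '?' or a literal character
        ans = i < s.size() && (p[j] == '?' || p[j] == s[i]) &&
              matchHelper(s, p, i + 1, j + 1, memo);
    m = ans ? 1 : 2;
    return ans;
}

bool isMatch(std::string const& s, std::string const& p) {
    std::vector<std::vector<char>> memo(s.size() + 1,
                                        std::vector<char>(p.size() + 1, 0));
    return matchHelper(s, p, 0, 0, memo);
}
```

Note two consecutive stars are harmless here — the first star simply matches empty.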

Accounts merge #disjoint set

Q: .. too long

Union-find + BFT

— solution 1:
Data structure:

  • Acct object {name, vector of Email objects, isVisited flag }
  • Email object {addr, set of Accts }
  • hash table {address -> Email objects}

First scan to create one Acct object for each row. When populating the vector, check if the address exists in the hash table. If yes, then save the existing Email object in the vector. Also, the Email object’s set is updated with the new Acct object.

Second scan to print everything. Iterate over the accounts. Need to merge all the emails belonging to John —

Scan the vector. If any Email object has more than one Acct (named John), visit each Acct following BFT, until all John accounts are visited. Collect all of these accounts’ emails in one tree-set, then print them.
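A compact sketch of the disjoint-set alternative (my own reconstruction, using union-find over rows instead of the BFT walk described above; names are mine):

```cpp
#include <map>
#include <numeric>
#include <set>
#include <string>
#include <vector>

// union-find over account rows: two rows sharing any email get unioned
struct DSU {
    std::vector<int> parent;
    explicit DSU(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    void unite(int a, int b) { parent[find(a)] = find(b); }
};

// each row: {owner name, email, email, ...}; returns groups keyed by root row id
std::map<int, std::set<std::string>>
mergeAccounts(std::vector<std::vector<std::string>> const& rows) {
    DSU dsu((int)rows.size());
    std::map<std::string, int> firstRow;        // email -> first row that had it
    for (int r = 0; r < (int)rows.size(); ++r)
        for (std::size_t k = 1; k < rows[r].size(); ++k) {
            auto [it, fresh] = firstRow.emplace(rows[r][k], r);
            if (!fresh) dsu.unite(r, it->second);  // email seen before: merge rows
        }
    std::map<int, std::set<std::string>> groups;   // tree-set keeps emails sorted
    for (auto const& [email, r] : firstRow)
        groups[dsu.find(r)].insert(email);
    return groups;
}
```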


max-product subArray #51% untested

Q: Leetcode 152: given signed int array, find the subarray (minimum size 1) with highest product. Return the product.

Note if there’s a 0 in your subarray, its product is 0.

peek? Not yet

— solution 1: O(N)

One scan to identify all the zeros. They partition the array.  For each partition, we need algorithm-A:

Count the neg elements in the partition. If even, then return the product of the entire partition. Else

Accumulate from the left, stopping before the last neg element -> prodL.
Accumulate from the right, stopping before the first neg element -> prodR. Compare the two.

trivial edge case — if the zeros create only single-neg partitions, then zero is the best.

trivial edge case — If subarray has only one element, be it zero or negative, then that’s the best for that subarray.
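As a cross-check, here is a compact one-pass variant — the classic running max/min-product trick, not the zero-partition method above (my own sketch; assumes a non-empty input):

```cpp
#include <algorithm>
#include <vector>

// track both the max and min product ending at each element;
// the min matters because a negative element can flip it into the new max
int maxProduct(std::vector<int> const& a) {   // precondition: a is non-empty
    int best = a[0], hi = a[0], lo = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        int x = a[i];
        if (x < 0) std::swap(hi, lo);         // negative flips max/min roles
        hi = std::max(x, hi * x);             // extend the run, or restart at x
        lo = std::min(x, lo * x);
        best = std::max(best, hi);
    }
    return best;
}
```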



U.S.startups: often less selective,lower caliber

U.S. startups may represent a sweet spot among my job choices. I think they are less selective than HFT or top tech shops. They attract fewer top geeks.

  • Workload – presumably comparable to ibanks like MS/Baml/Barc. I guess some startups demand more, but fewer than 50% of them. Mithun described 3 startups but he has no firsthand experience.
  • Salary — Many startups pay on-par with ibanks, according to CYW
  • Leaves – possibly fewer than ibanks
  • Cowoker benchmark – possibly lower caliber
    • Respect – could be slightly better than in an ibank if you earn the respect

Some of the big tech shops are actually less selective than HFT – Amazon, Apple

in each treeNode, save d2root

My friend YH showed me a simple BFT algo where each parent node AA saves AA’s level+1 into the two children nodes AAL and AAR.

So every node remembers its own level from root 🙂

I think this technique can be used in pre-order DFT too. For post-order, it is possible but less natural, as shown in my github code subTreeStats.cpp.

Note pre|post-order traversal can work on B-trees, not only binary trees.

If nodes are immutable, we can use a hashmap {node address -> level}
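A minimal sketch of YH’s BFT idea (the TNode struct is my own illustration):

```cpp
#include <initializer_list>
#include <queue>

struct TNode { int d2root = 0; TNode *left = nullptr, *right = nullptr; };

// BFT: each parent writes its level+1 into both children,
// so every node ends up knowing its own distance-to-root
void labelDepths(TNode* root) {
    if (!root) return;
    root->d2root = 0;
    std::queue<TNode*> q; q.push(root);
    while (!q.empty()) {
        TNode* n = q.front(); q.pop();
        for (TNode* c : {n->left, n->right})
            if (c) { c->d2root = n->d2root + 1; q.push(c); }
    }
}
```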

##IV Q:Name some big-impact projects U contributed to

Kyle pointed out that your change could be a 10-minute job with big impact, but only because there was already an infrastructure — you leverage the infrastructure to create the impact. For a story to be compelling, it has to show complexity and personal contributions.

  1. NYSE integrated feed. Most of the financial data websites get nyse data from us
  2. Baml muni trade execution — rewrite from scratch
  3. billing/commission calc at PWM.
    1. eg: mortgage commissions from scratch
    2. eg: special investment (hedge funds) commission rewrite
  4. Citi muni re-offer engine. Biggest muni house in the U.S. Lots of retail investors buy muni bonds through their investment advisors. My changes are actually incremental.