## significant coding IV P/F xp

See also ##failed c++cod`IV: home^IDE^whiteboard:4 beat-fronts

  • paper — is a shorthand for paper, dumb editor, white-board etc
  • webex vs onsite are somewhat similar, usually pair-programming. Take-Home setting is vastly different.
  • pts (points) — awarded to myself, indicating the performance, value, difficulty ..
  • p/f are only based on interviewer’s subjective judgement
  • p! means passed convincingly, with flying colors
  • [1] I focused on working solution! See %% priorities in a take-home coding IV
  • [2] highlight: where practice can make a difference in a make-or-break test
lang location IDE? firm/person pts 1-5 Notes category type@practice needed [2]
c home hackerrank p! Flex 3 2018 max sum+link cleanup#github std algo array+list problems
java home hackerrank 😦 Pimco #190m 2 2018 Q9: path through matrix. Ineffi [1] std algo too hard, but some 2D practice can help
java home hackerrank 😦 Pimco #190m 2018 Q8: buy 1 sell any. Buggy algo array problems
c onsite paper p LiquidNet 2 2018 30 minutes algo array problems
c home codility 😦 Mako 4 2018 monkey. Ineffi [1] algo too hard
c home codility 😦 Mako 3 2018 quasiconstant [1] algo array practice can
c onsite paper p SIG 2 2018 15-30min … RAII smart ptr std lang
c onsite paper p SIG 3 2018 push_back std lang
py onsite IDE p SIG 3 2018 movie query dStruct algo practice can
c home hackerrank pp Promethean 5 2018 2 tough graph problems algo too hard but some QQ can help
c/py home hackerrank p! Kraken 3 2018 3 (small?) problems algo algo
any onsite paper 😦 Nsdq 2018 oracle day-trading algo array problems
any onsite paper 😦 Nsdq 2018 check array 0-N in-situ algo array@int problems
java home IDE p! Romain 4 2018 drone: medium-level implementation design hard to prepare
c onsite paper p! CVA 3 2018 O(N) in-situ filtering algo array problems
c onsite paper 😦 FB 2018 2 short problems std algo algo
c onsite 😦 Trex 2018 exchange messaging SDI no practice can help
py onsite IDE p! Trex 1 2018 3 short tricky problems. only 1 implemented algo algo practice can help
c onsite paper p! Quantum 1 2018 virt func not declared in base trivial lang
c webex paper p! bbg 4 2018 single-thr pool QQ lang
c webex paper p! bbg 3 2018 shared ptr ctor/dtor QQ lang
py onsite paper p! bbg 5 2018 continuousSentence algo backtracking, string problems
c onsite paper ? Wells 2 2017 concurrent queue std threading
c onsite paper p Wells 1 2017 remove_spaces. Ineffi trivial string problems
c onsite paper p bbg 2 teams 2 2017 array reshuffle lang
c onsite IDE p bbg 2 teams 1 2017 isBST() std tree problems
c onsite paper P! bbg 2 teams 1 2017 string string problems
c onsite paper 😦 bbg 2 teams 2017 filters on screen dStruct heavy dStruct
C home hackerrank 😦 Thesys 2017 3 problems out of 5 algo algo
C home IDE p! Thesys 5 2017 first N prime Fibonacci hard to prepare
c webex IDE p! bbg 2 2017 top N active stocks dStruct heavy dStruct
c onsite paper p! BAML 1 2017 various lang
c home IDE 😦 GS 3 2017 TickEngine dStruct heavy dStruct problems can help
C webex IDE p! bbg 2 2017 free slots dStruct heavy algo
C webex paper p bbg 1st round 2 2017 tree serialization std algo tree algo
Java onsite paper p! BGC 2 2017 multiple short programs lang
Java home IDE p! BGC 3 2017 connected or not hard to prepare
Java home IDE p HSBC 4 2017 barely passed hard to prepare
Java home IDE p! pimco 5 2017 iterator again lang
Java onsite paper p! pimco-Zoltan 2 2017 a few short problems lang?
c webex IDE 😦 Citadel 2017 array shrinking array problems
py webex paper ? Broadway 2017 hashtable. Code (on github) works std algo implement std containers
cpp home IDE ? iRage 3 2015 order book again. No response dStruct heavy hard to prepare
Java home codility 😦 baml 2015 Qz FX option team. too lazy algo
c home codility 😦 Jump 3rd 2015 algo
c home IDE p! jump 1st 3 2012 order book dStruct heavy hard to prepare
c home IDE 😦 DRW 3 2015 Tetris. code quality hard to prepare
c webex paper 😦 bbg -London 2015 unable to understand Q1 of 4 std algo algo
c# webex paper p! eikon 2 2013 short, easy questions QQ lang won’t practice
java home IDE 😦 MS 4 2011 FIX. Big take-home test hard to prepare
swing webex IDE p! Barx 2 2012 swing QQ lang won’t practice
C home IDE 😦 Mac 2012 2 short problems lang
java home IDE 😦 MS-comm 2012 too lazy hard to prepare
c onsite paper p! Cantor 1 2011 auto-ptr QQ lang won’t practice
java onsite paper 😦 Barc SOR 2010 recursion recursion
java onsite IDE p! RBC 1 2010 30-45 min, bond tick conversion lang
java onsite IDE 😦 UBS 2011 Suntec lang
java onsite IDE 😦 Lab49 2010 executor lang
java home IDE p! Gelber 3 2011 multithreaded hard to prepare
C home IDE 😦 Amazon 2011 didn’t take it seriously algo
C onsite paper 😦 FB 2012 regex QQ std algo too hard, but some QQ can make a diff
any onsite paper 😦 Goog 2007 algo

monkey crossing river #codility

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/66miscIVQ/monkey_Mak.cpp is my tested solution. Correct, but Performance was rated at 33%, I have since used a set<Timestamp> to improve performance. Still it can’t meet the time complexity.

Q: A monkey wants to get to the other side of a river. The monkey is initially located on one bank of the river (position −1) and wants to get to the opposite bank (position N). The monkey can jump any (integer) distance between 1 and D. If D is less than or equal to N then the monkey cannot jump right across the river. Luckily, there are many stones hidden under the
water. The water level is constantly decreasing, and soon some of the stones will be out of the water. The monkey can jump to and from the stones, but only when the particular stone is already out of the water.

The stones in the river are described in array A consisting of N integers. A[K] represents a time when the stone at position K will be out of the water (A[K] = −1 means that there is no stone at position K). You can assume that no two stones will surface simultaneously. The goal is to find the earliest time when the monkey can get to the other side of the river.

For example, consider integer D = 3 and the following array A consisting of N = 6 integers:
A[0] = 1
A[1] = -1
A[2] = 0
A[3] = 2
A[4] = 3
A[5] = 5

Initially, the monkey cannot jump across the river in a single jump. However, at time 2, there will be three stones out of the water.

   p o s i t i o n
  -1 0 1 2 3 4 5 6 |<-- 6 is the destination; -1 is Starting point
     1 N 0 2 3 5   |<-- when each stone surfaces
         0         |Time 0 //one stone at position 2
     1   0         |Time 1 //two stones 0,2
     1   0 2       |Time 2 //three stones 0,2,3
                time ⤴

Time 2 is the earliest moment when the monkey can jump across the river (for example, by jumps of length 1, 3 and 3, as marked on the picture above).

Write a function: int solution(vector<int> &A, int D); that, given an array A consisting of N integers, and integer D, returns the earliest time when the monkey can jump to the other side of the river. If the monkey can leap across the river in just one jump, the function should return 0. If the monkey is never able to jump to the other side of the river, the function should return −1.
For example, given array A and integer D as de􀃂ned above, the function should return 2 as explained above.

For given A = [3, 2, 1] and D = 1, the function should return 3, since the monkey has to wait for each stone.

For given A = [1, 2, 3, 4, −1, −1, −1] and D = 3, the function should return −1, since the monkey will never get to the other side of the river.

Each element of array A is an integer within the range [−1..100,000], where -1 means no-stone.

expected worst-case time complexity is O(N+max(A));
expected worst-case space complexity is O(N+max(A)), beyond input storage (not counting the storage required for input arguments).

clone a std::vector by memcpy@@ #Deepak

In general, we can’t use memcpy to clone the internal array of a vector.

If we have a vector of simple value types like an int or float, memcpy is probably simpler.

If there’s any pointer member data, then we are in trouble. For example, if there’s any std::string in the internal array, memcpy will bypass the string copy operations. The internal pointer in the string will be copied incorrectly. When the original array content gets deleted, our cloned strings will be messed up.

There are many other element types that involve pointer:

  • the element may have a smart ptr data member. All smart ptr classes provide special copy control, not to be bypassed.
  • the element may be another container. Every container I know uses pointers. They all have special copy control.
  • the element may have some shared resource like a mutex or a connection. These resources are typically allocated on heap. When the original vector gets destroyed, the resource will need release or clean-up

Q: For vector of int, is memcpy faster than copying element by element?
A: Probably not. Both O(N). A for-loop to copy an array of integers may compile to the same binary.

proof-of-work ] blockchain #key questions

Based on my discussions with some experts in the field..

Say Cody just paid Bob some amount in bitcoins. At this time, there could be a large number of pending (unaccepted) transactions like this one waiting to be accepted into the linked list (i.e. the block chain).

One or more miners without coordination, can pick up this transaction + up to 999 other transactions and attempt to add them to the block chain.

They have to overcome a challenge, a computationally intensive challenge, by brute force — They need to find a number (called nonce) such that the integer hash code from hashing function

H(localhost timestamp, //not always in sync with other hosts
[T1, T2, … T1000], // 1 or more pending transactions
the minder’s Id, //typically the IP
previous block’s hash // <– forming a linked list or block-chain
) < barrier

The barrier is a binary number with 3 (or more) leading zeros. In other words, the challenge is mining for an nonce to give a low-enough hash code with enough leading zero bits. Once a miner finds a “solution” and creates a valid block, she broadcasts the block’s hash code (with enough leading zeros) to the network. If one node verifies it, then other nodes would soon verify it.  There’s no central authority to decide when to accept. A node accepts it and uses it as “previous block hash” in a new block, as described in TOWP.

The head block of a linked list is special — no previous block! (A Git repo has only one such “block”, but not in blockchain.) 2nd block includes the first block’s hash. 3rd block includes 2nd block’s hash, and indirectly includes first block’s hash. Therefore, for any given block KK, its entire ancestry lineage is hashed into KK.

  • TOWP — TheOriginalWhitePaper
  • POW — proofOfWork
  • lucky — Some miner could be lucky and find a “solution” in the first try, but the challenge is so tough that there’s only brute force.
  • LevelOfDifficulty — is the number of leading zeros, like 3. When hardware speeds improve, someone (perhaps the merchant) will increase the LevelOfDifficulty, as explained in TOWP
  • immutable — Just like Git, every block is immutable once accepted into the chain.
  • The most common hash function is SHA and SCRYPT.

Q: how fast is the verification vs POW
A: POW is supposed to take 10 min on average. Verification should take nanosec, according to the experts I spoke to. Beside the “attached” transactions, I don’t know what inputs there are to the verification, but it computes a hashcode and compares it to something.

Q: Fundamentally, how would verification fail when an owner double-spends a coin?

Q: what if two miners both pick Cody/Bob’s transaction and broadcasts their “solution” hash code? The first miner would win the race (and get the reward after some delay)

Q: How are two independent linked lists handled in one host?

Q: what’s the need for POW?

Q: what’s the motivation for the a miner to take up a transaction?
A: there’s a 25-coin reward, or possibly higher if the Cody or Bob increases the reward

Q: does each coin have an id?
AA: no, but there is ID for each transaction, each block and each account.

Q5: what if verification fails?
%%A: I feel it’s rare but possible. If happens, the node would ignore it.

Q5b: how would the miner know? Not sure.


movie query #SIG

Like the other coding question from the same employer, this is mostly about nested data structure.

Q1: Given sample csv file below (unsorted), implement an efficient query function get_movies(genre, Start_year, End_year). Optimize for query, not initial loading. Bonus points if you make it work within 75 minutes.

  1,Toy Story 3,2010,Adventure|Animation|Children|Comedy|Fantasy
  2,Toy Story,1995,Adventure|Animation|Children|Comedy|Fantasy
  ...... 20,000 more lines

https://github.com/tiger40490/repo1/blob/py1/py/movieQuery_SIG.py is my working code, good enough for a small data set like movies, but inefficient for bigger data set like newspaper articles.

Q2: what if the number of genres could be thousands, like a tag cloud.

— Q1 analysis
My initial thought was on binary search, with O(logN) + O(E-S) which is often equivalent to O(N). Locate first and last item in a sorted list, then iterate over enclosed items.

After some brief discussion with interviewer, I came up with a design of O(1) + O(E-S), arguably the fastest, featuring a jagged 2D array:

  • Assumption: genres are shared and shows up repeatedly in the file, rather than random strings that may show up only once.
  • use contiguous array to hold all movies, indexed by year.
  • Within each “year bucket”, store multiple movies in a contiguous array indexed by genreId.
    • Note there is no gap in the genreId /numbers/. I initially said hardcoded genre enum but interview (Ken) challenged me with frequently changing genres. I now feel we can generate genreId as we load the csv file. This is comparable to some special hashing function described in Q2 analysis below.
  • each “cell” holds a list-of-shared_ptr-to-movie
  • each movie object is on heap, and owned by shared_ptr’s

— Q2 analysis:
If tags are user-generated, random and uncoordinated, then the total population of tags could be unlimited. Our genreId would go up to millions. Perhaps use a separate hash table within each “year bucket” instead.

However, since the query is “by tag” it implies that the most important tags are shared.

At this point, I feel design should depend on the real, not imaginary, context, so it’s kind of pointless to speculate what type of tags we will get.

OMS skill !! standardized !! portable

I now feel the glorified OMS know-how is vague and poorly-standardized.

  • My B2bTradingEngine at 95G does a lot of OMS.
  • Smart order router module often does some OMS.
  • FIX connectivity module often does some OMS.
  • VWAP and other execution algos are usually considered a category of OMS.
  • … Many of the above are very different skills and have almost nothing in common.

I am afraid that even after 3Y or 5Y in some OMS system, when I apply to another OMS job they may realize I’m not a veteran.

Q: In comparison, which domain is more standardized in terms of skillset?
A: raw mkt data. Note 2nd-hand market data processing is much less valuable
A: basic bond math. Note more advanced bond math is rarely needed
A: FIX connectivity

  • How about VaR?
  • quote distribution systems in Citi and OC? poorly standardized

specialty jobs outpay commodity financial IT jobs@@

In 2007 I worked on the most common type of financial IT job — 1) java servlet applications with 2) a big database and stored-procedures  3) small amount of client-side javascript (possibly with ajax). 4) In my case, there was also some scheduled batch job.
It was in Private-Wealth-Management but no financial domain knowledge required.
(I have since moved on to “fancier” systems, not only pricing or market data.) I started to think the most common dev job like the PWM job doesn’t pay as well, but what’s the reality that you see throughout your career?
Q: does the most common type of financial IT job pay lower because it only needs commodity skills?
[Mithun] Not always. It depends on how critical the demands are and how much $ impact it has.
Trading as such they have money to spend, sometimes demanding & definitely impacts $ 
there is a short-term trend —  depends on demand & supply. 
I have lately seen lot of money thrown at angular even thought it might not be challenging as concurrency, but then u need many angular developers for a few server side developers. But then this skill fast retires.
Perhaps it pays just as well, or 5% lower than the specialty dev jobs in trading systems?
Commodity java dev will get around 145k while concurrency guy will get 175k…. 
Things have changed in that “most common type of financial IT job”, which I can only define in vague terms.
* once there was lots of spring/hibernate usage
* once there were lots of xml-related server-side libraries
* I guess there are more RESTful interface?
* more javascript code
* fewer stored procedures
* more noSQL systems, less SQL
* I guess there’s now more Hadoop??
yes Hadoop , spark, but I see a lot of demand for python, and they are coming with rich libraries.

buggy RAII smart_ptr #SIG

Q1: given a well-behaved class X, someone wrote a RAII wrapper below. What problems do you see?

Q1b: how would you fix them?

%%A1: memory leak of x1, due to the synthesized op=()
%%A1: double-delete on x2

Q1c: OK Good. Now what if X has subclass Y,  all with virtual dtor?
%%A1c: X and Y each create virtual clone() to return a pointer to the host type

Q2: what’s the outcome of double-delete?
AA: undefined behavior. You are lucky if it crashes.

struct A{
  A(X* x): myX_(x){}
  ~A(){delete myX_;}
  X* myX_;
void demo(){
  A a1(new X()); //unnamed x1 object on heap
  A a2(new X()); //unnamed x2 object on heap
  a1 = a2;
/////////// above is original code; below is my Q1b answer -- Add op=()
A & operator=(A const & rhs){
  if (this != &rhs){
    auto tmp = this->myX_; //delete only after successful "new"
    this->myX_ = new X(*rhs.myX_); //clone the X object of rhs
    // if new or ctor throws, the memory is not really allocated 🙂
    delete tmp; //let's assume no exception here.
  return *this;