FB 2019 ibt Indeed onsite

  • coding rounds — not as hard as the 2011 FB interview — regex problem
    • Eric gave positive feedback to confirm my success but perhaps other candidates did even better
    • No miscommunication as happened in the VersionedDict.
    • The 2nd Indeed interviewer failed me even though I “completed” the problem. The pregnant interviewer may follow suit.
  • Data structures were fundamental to all of today’s problems.
  • SDI — was still my weakness but I think I did slightly better this time
  • Career round — played to my advantage as I have multiple real war stories. I didn’t prepare much. The stories just came out raw and hopefully authentic
  • the mini coding rounds — played to my advantage as I reacted fast, thanks to python, my leetcode practice …

So overall I feel I’m getting much closer to passing. Now I feel one successful interview is all it takes to enter a new world:

* higher salary than HFT
* possibly more green field
* possibly more time to research, not as pressed as in ibanks

##algo Q categories: %%strength/weakness

— relative strengths, compared to other candidates:

  • data-structure heavy
  • union find
  • generator — more experienced than average
  • geometry
  • sliding window
  • whiteboard relative to IDE

— weakness

  1. DP applied to string
  2. DP — as I usually can’t visualize and wrap my mind around it,
    • I have put in more effort than others, but not connecting the dots and developing intuition
  3. recursion in a loop — over my head
  4. 2D grid and matrix
    1. but I have put in more effort than others
  5. num-array. CSY is stronger
  6. priorityQ

 

speed^soft qualities: CIV scoring

If you are very strong, you can achieve both

  • complete the required number of problems in time with no serious bugs
  • demonstrate requirement clarification, edge cases, dry-run, thorough error handling, detailed complexity analysis, coding style.

In reality, some speed-coding tests at FB, Indeed etc really favor speed over everything else.

When we take such a test, there is always internal tension between two motivations — speedy completion vs demonstrating other qualities.

Thinking out loud is a common recommendation, but it can create unnecessary conversation and slow you down. However, some of those conversations may lead you to the right path.

## coding drill 19 Jul

  1. https://leetcode.com/problems/factorial-trailing-zeroes/
  2. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31671&action=edit — merge sort in-place

— presumably solved

  1. https://leetcode.com/problems/group-anagrams/
  2. https://leetcode.com/problems/merge-two-sorted-lists/
  3. https://wordpress.com/post/bintanvictor.wordpress.com/31650 — pacific/atlantic
  4. https://wordpress.com/post/bintanvictor.wordpress.com/31657 — max-product subarray
  5. https://leetcode.com/problems/contains-duplicate/
  6. https://leetcode.com/problems/majority-element/
  7. https://wordpress.com/post/bintanvictor.wordpress.com/21764 find min in rotated array
  8. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31592&action=edit — T.isSubtree(S)
  9. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31792&action=edit — longest run@same char
  10. 3-sum

— previously solved problem forgotten and refreshed

  1. Leetcode #139 word-break … not really “medium”
  2. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31745&action=edit — max sub difference
  3. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31749&action=edit — min range to include all clubs
  4. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31736&action=edit — find lone-wolf among AAABBBCCC
  5. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31755&action=edit — flip binTree
  6. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31758&action=edit — serialize linked list: each node points to a random node

— unsolved:

  1. https://leetcode.com/problems/valid-parenthesis-string/
  2. https://bintanvictor.wordpress.com/wp-admin/post.php?post=31825&action=edit — K-palindrome

 

bigO CIV: clarify_requirement means..worst_data

———- Forwarded message ———
Date: Tue, 9 Jul 2019 at 23:10

Big-O means “worst” case complexity. We need to be doubly sure what “worst” case data look like. (Note it doesn’t mean ridiculously rare data like integer overflow.)

On high-end coding interviews, these “worst data set” frequently shed light on the structure of the original problem, and often point to the correct direction for an efficient solution. (Note a nice-looking solution is not efficient if it exhibits poor time complexity in the face of a worst yet realistic data set. Such a system may not scale well in practice, given its Achilles’ heel.)

If the interviewer wants a solution optimized for big-O time complexity, you can waste precious time tackling the wrong problem: the problem defined by your own idea of the “worst data set”, when the real worst data set is very different.

Sometimes, the worst data is _subtly_ different but it points to a different algorithm direction. It may point to an iterative solution rather than recursion, for example.

The original interview question may not be that hard if we identify the worst data set early on, but sadly, we often run out of time before that.

For some tough problems, the first (sometimes the only) challenge is quickly identifying worst data set. Interviewer always gives you the simple data set, NEVER the worst data set. It’s your job to identify the worst data set… often the key insight into the problem.

It’s no exaggeration to say that identifying the worst data set early (or too late) can make or break your chance at this employer. You may kiss goodbye to a job opportunity exactly because you are too slow to identify the worst data set. I know what this means. Other candidates probably got shot by this arrow in the heel, without knowing what happened.

Time is ticking as soon as the problem is presented to you. If the interviewer says “time is not a factor… take your time… we want quality and thought process”, I just ignore it. Our speed is always part of the assessment. Therefore, a very common failure scenario is —

.. tractable problem but candidate runs out of time after wasting time on preliminaries like using incorrect worst data set to reason about the (wrong) problem.
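To make this concrete, here is a minimal sketch (my own example, not from any particular interview) of how the obvious data set and the worst data set can differ: for a quicksort that always picks the first element as pivot, the worst data set is not random or reversed input but already-sorted input.

```python
def quicksort_first_pivot(arr):
    """Naive quicksort with first-element pivot. Average O(N log N), but on
    already-sorted input every partition is maximally lopsided, so the
    recursion goes N levels deep and time degrades to O(N^2)."""
    if len(arr) <= 1:
        return arr
    pivot, rest = arr[0], arr[1:]
    return (quicksort_first_pivot([x for x in rest if x < pivot])
            + [pivot]
            + quicksort_first_pivot([x for x in rest if x >= pivot]))

assert quicksort_first_pivot([3, 1, 2, 5, 4]) == [1, 2, 3, 4, 5]
# worst data set: sorted input -- recursion depth becomes N, not log N
```

If you reason only about random input, you solve the wrong problem and miss the pivot-selection insight entirely.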

A/Bob/C: cod`IV #vec for python list

Here are other variable names frequently needed in coding puzzles:

  • Once you have declared a long, meaningful var name, you can use an abbreviation of it afterwards.
  • if you have an array of composite objects like “frame” or “trade”, each represented by an inner array, then be careful with frame[1]. Better to call it aFrame[1] or frames[0].
  • what about “sorted” which is a python function name:( …. I would use ascd by default.
  • B4 means “before”; Af means “after”
  • hm means hashtable; tm means treeMap or std::map
  • targetRepetition or tgtRep
  • “val” — payload data field in a Node class. This name is short, easy to type. Harder to global-replace but LG in a short program
  • “candd” — candidate
  • “End” for the last element in a container
  • li means list … vec means vector … arr means array
    • In python, we need extra caution to use “vec” not “li” for the python list !
  • cnt means count or counter
  • for a matrix: M, m or mat
  • r as the 1st index; c as the 2nd index. Avoid x/y
  • height/rows as number of rows; width/cols as number of columns
  • I will refer to each element as a “cell”
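A tiny hypothetical snippet applying the conventions above:

```python
hm = {'Alice': 3, 'Bob': 1}   # hm = hashtable
vec = [3, 1, 2]               # a python list named "vec", not "li"
cnt = 0                       # cnt = counter
ascd = sorted(vec)            # "ascd" avoids shadowing the builtin sorted()
for r in range(2):            # r = 1st (row) index of a matrix ...
    for c in range(3):        # ... c = 2nd (column) index; avoid x/y
        cnt += 1
assert ascd == [1, 2, 3] and cnt == 6
```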

—It’s more useful to have personal names than mere “Person A”. Shorter names are easier to write. Gender doesn’t matter.

  1. Alan, Alice, Aaron,
  2. Bob, Ben (Big Ben), Bill,
  3. Cody,Chris, Charlie
  4. Dan, David
  5. Ed, Emily, Elizabeth (queen)
  6. Fred
  7. Greg, Gail
  8. Henry, Harry,
  9. Ilya, Ivy, Ian
  10. Jen, Jim, Joe, Jack, John, Jason,
  11. Ken, Kate, Kevin,
  12. Lucy, Lee, Lilly, Larry,
  13. Mike, Mary, Matt,
  14. Nick, Nancy,
  15. Oliver, Oscar, …
  16. Pam, Peter, Paul, Peggy
  17. Quincy
  18. Raj, Rob, Ray,
  19. Sam, Sue,
  20. Tom, Ted, Tim,
  21. Uday ….
  22. Vic, Venkat
  23. Wendy, Will
  24. Xavier, Xander/Zander …
  25. Yixin, Yang ….
  26. Zofia, Zack,

convert 512bit int to bit-vector #python=best

In coding questions, int-to-bit-vector conversion is common. I now think python offers a simple, clean solution, applicable to integers of any size :)

'{0:032b} <-32|8-> {0:08b}'.format(myInt) #convert to a 32-element list, simultaneously to an 8-element
  • first zero refers to argument #0 (we have only one argument).
  • The zero after colon means zero-padding
  • The last ‘b’ means binary
  • Don’t forget your (dental) braces 🙂

Note this is the str.format() method, not the global builtin function format()

Return value is a python string, but easily convertible to a vector

What if the number is too large? My {0:08b} example shows that if the int is above 255, then python does the safe thing: extending the output beyond 8 characters, rather than truncating 🙂

To convert a bit string back to an int, use int(bitStr, 2) # 2 is the base i.e. binary. Note int() expects a string; if you have a list of bits, join it into a string first.
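Putting the two directions together, a minimal sketch (function names are my own):

```python
def to_bits(n, width=32):
    """int -> list of 0/1 ints, zero-padded to `width`; if n needs more
    than `width` bits, the output extends rather than truncates."""
    return [int(ch) for ch in '{0:0{1}b}'.format(n, width)]

def from_bits(bits):
    """bit list -> int; int() wants a string, so join first, then base 2."""
    return int(''.join(str(b) for b in bits), 2)

assert to_bits(5, 8) == [0, 0, 0, 0, 0, 1, 0, 1]
assert len(to_bits(300, 8)) == 9          # 300 > 255: safely extended to 9 bits
assert from_bits(to_bits(300, 8)) == 300
```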

CIV: we wish all corner cases+worst data sets r given

In a few problems, the really important “Aha” is getting the full list of corner cases. This list defines the boundary of the problem, and often points out the direction immediately.

Of course the coding test won’t give you the time. Part of the skill being tested is “clarifying the requirement”. That is the real meaning of the phrase.

Sugg: probe the interviewer

Sugg: Gayle pointed out some techniques to create some test data to reveal the boundary

XOR: key points4 (coding)IV

  • usage: swap two int variables. I think the “minus” trick is easier to understand
  • usage: doubly-linked list … MLP-sg connectivity java IV#2
  • usage: ‘1’ XOR an unknown bit among neighboring bits => TOGGLE the bit
  • usage: If you apply an all-1 toggle (bit vector), you get the “bitwise NOT” also known as “one’s complement”
  • Like AND, OR, this is bitwise meaning each position is computed independently — nice simplicity

If you are looking for a one-phrase intro to the 2-input XOR, consider

) TOGGLE ie toggle the selected bits. If you apply a toggle twice, you get the original.
) DIFFERENCE ie difference gate

See https://hackernoon.com/xor-the-magical-bit-wise-operator-24d3012ed821

— how about a bunch of bits to XOR together?

Wikipedia points out —  A chain of XORs — a XOR b XOR c XOR d (and so on) — is true whenever an odd number of the inputs are true.
Wikipedia’s Venn diagram for a xor b xor c (the image “Venn 0110 1001.svg”) shows this; red means True. Think of throwing a dart: if it lands inside the ‘a’ circle, then a=True; outside it, a=False. The dart lands on red (i.e. True) only when it is encircled an odd number of times.

Insight — start with False and toggle it an odd number of times: it becomes True. Therefore, if among the N input bits the count of True inputs (i.e. toggles) is odd, then the result is True.

Does operand order matter? No; XOR is commutative and associative.

https://leetcode.com/problems/single-number/ has an O(N) time, O(1) space solution using this trick. It is applicable to a collection of floats, dates, or any serializable objects, by XOR-ing their bit representations.
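For the record, a sketch of the single-number trick (the standard XOR-fold):

```python
from functools import reduce
from operator import xor

def single_number(nums):
    """Every element appears exactly twice except one. XOR-ing everything
    cancels each pair (a ^ a == 0), leaving the lone element.
    O(N) time, O(1) extra space; operand order doesn't matter."""
    return reduce(xor, nums, 0)

assert single_number([4, 1, 2, 1, 2]) == 4
```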

 

multi-file weekend coding: c++is worst lang

In my weekend assignments, the tcost (time cost) of BP is higher for c++ than for other languages.

In any language, function names must match across the files. Ditto for type names and variable names.

Java additionally requires consistency between supertype^subtype, across files. I often rely on Eclipse for that.

C++ adds one more major headache — header files. My C++ coding uses vi or notepad++.

  • more #includes are needed when I extract a header file
  • in header files, I can’t “using namespace ..”, so I must manually add the namespace prefixes on many names
  • function declaration^definition must match across header files vs implementation files

In a weekend coding assignment, I need to avoid multiple-file coding.

Just keep everything in one file. Tidy up in the end and adhere to Best Practices.

 

 

versioned dict #indeed

Suppose a resume consists of key/value pairs of strings. We need to support two operations —

  1. Each write(key, value) operation creates or updates a key with a new value, and returns an incremented integer vid i.e. version ID. A version ID represents a point in history, i.e. a snapshot of the entire resume
  2. Each read(vid) returns a key/value dictionary as a complete resume snapshot
  3. * In a realistic system, we also need a deleteVersion(vid) but no need to optimize for it

Single threaded system. Version 0 is an empty resume.

Q: design an efficient solution.
Q: after creating N versions via N write() operations, what’s the time and space complexity of next write()? What’s the time and space complexity of the next read()?

——-

Without loss of generality, let’s say there are K fields in resume version N. Note K <= N. I should have differentiated K and N from the start; the distinction is useful for clarity of thinking.

—Design 5 in hindsight, optimized for read(): the simple solution snapshots the entire dict at each write() i.e. each version. Space complexity is bad (though I now believe time complexity is more important). Worst case — K==N, as version 1 creates key1, version 5 creates key5, etc.

A single read() is presumably O(K) just to read the K fields. I was convinced this was the theoretical limit for read() complexity, because I was thinking of network serialization — either my system or the client system needs to iterate over all fields of the resume. Now in hindsight I feel the interviewer was thinking in java mode — returning the entire dict by reference is O(1) rather than O(K) … I didn’t get his hint when he said O(K) read was not really the theoretical limit.

Each write() clones the dict for a new vid and saves the new dict in a vector of pointers (!) . Therefore write() is O(K).

—My design 1: based on the same idea, I used a vector for each key. However, my read() had to create the returned dict on the fly in O(K). I didn’t think of return-by-reference/pointer.

—My design 2 is a minor optimization to remove the cloning of unchanged values. I thought I was on to something efficient but in java (and python?), all strings are copied by reference so my optimization is meaningless in java.

—My design 3 is lazy write. So write() only appends to one of the K vectors. The other K-1 vectors are updated only when needed by a later read() or write(). Amortized cost?

This O(1) write() and O(?) read() complexity can be emulated and surpassed by …

—My design 4 used K RBTrees (small trees) to optimize for frequent write() at O(1). Each write appends to one tree. STL insert with a hint is O(1). No such feature in java or python. Read() would construct a new hashmap and return it.

Design 4b would use a separate vector of Record{vid, value} to replace the small tree.

Read(vid) requires binary search in each [3] of K trees. After N updates, total node count across all K trees is N, so even a naive search would not exceed O(N).

If needed, I can also maintain at O(1) cost a “replay vector of originals” i.e. original {key, value} from each write(). We can use this vector for replay.

[3] Q1: Can we avoid scanning all K trees when vid is small or K is much smaller than N ?

A: Like rebus, let’s maintain a separate vector (or RBTree) “birthday” holding records {vid, pointer to a single small tree}. Birthday expands (O(1)) by one element whenever a new small-tree is born i.e. new key is created. My read(vid=55) would binary-search in this birthday data structure using vid to locate the last-born small-tree before version 55, and then iterate backward to visit each small tree born before version 55. This optimized read() uses binary-search throughout to eliminate unnecessary linear search.

Q1b: Worst case read():  K == N i.e. every write() creates a new key. So read() is still O(N)?

A: read() can become O(A) where A:=difference between queried vid and any smaller vid previously cached.
A: If we find out in practice K is close to N, we can use additional data structure to memoize previous read(vid) results i.e. the new hashmaps. I can use a sparse vector of records {pointer to read(vid) result}, indexed by vid. Each read() would save, at O(1) cost [4], the result in this memoization vector. For the next read(vid=77) I would use binary search in this memoization vector to find the closest lower neighbor (eg 55), then replay from vid=77 towards  55, using XR’s algorithm.

[4] see my blogpost on memoization keyed by auto-increment id

Note RBTree supports efficient deletion from both birthday tree + small tree.

—-

Now I feel a big string representing the entire resume is still under 100KB. I had assumed we were talking about millions of fields of gigabytes each.

For a long time I was confused and assumed that each write() could update multiple key/value pairs.
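A minimal python sketch of the per-key history idea (roughly designs 1/4b above; the class and method names are my own). write() is O(1); read(vid) binary-searches each key’s history, so O(K log N):

```python
import bisect

class VersionedDict:
    def __init__(self):
        self.vids = {}   # key -> ascending list of vids at which key was written
        self.vals = {}   # key -> values parallel to self.vids[key]
        self.vid = 0     # version 0 is the empty resume

    def write(self, key, value):
        """O(1): append to this key's history; return the new version id."""
        self.vid += 1
        self.vids.setdefault(key, []).append(self.vid)
        self.vals.setdefault(key, []).append(value)
        return self.vid

    def read(self, vid):
        """O(K log N): for each key, binary-search the last write <= vid."""
        snap = {}
        for key, vlist in self.vids.items():
            i = bisect.bisect_right(vlist, vid)
            if i:
                snap[key] = self.vals[key][i - 1]
        return snap

d = VersionedDict()
d.write('name', 'Alice')          # vid 1
d.write('city', 'NY')             # vid 2
d.write('name', 'Bob')            # vid 3
assert d.read(0) == {}
assert d.read(2) == {'name': 'Alice', 'city': 'NY'}
assert d.read(3) == {'name': 'Bob', 'city': 'NY'}
```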

short+efficient : holy grail ] algo search

For most of my tough algo problems (such as leetcode problems)

  1. many celebrated solutions are very short, but not so efficient
    • I think about 60% of them are easier to memorize, thanks to their brevity.
    • if you can understand and memorize them, they are quicker to write due to fewer keystrokes
    • About half of them are easier to understand, thanks to the smaller source code size
    • eg: yield-based generators are a prime example
    • eg: recursive solutions are often brief but inefficient
  2. many great solutions are efficient in terms of O(), but verbose
    • eg: bottom-up DP
    • eg: top-down DP with memoization
    • eg: using a tree (when not required) can sometimes give an efficient solution

However, these two goals are often hard to harmonize. It is my holy grail to meet both criteria simultaneously, but I won’t try too hard.

  1. For real coding problems, I will prefer brevity
  2. For verbal discussions, I will concentrate on efficient solutions
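A toy illustration of the trade-off, using Fibonacci as a stand-in (my example, not from any interview):

```python
def fib_short(n):
    """Brief and memorable, but exponential time -- the 'short' camp."""
    return n if n < 2 else fib_short(n - 1) + fib_short(n - 2)

def fib_dp(n):
    """More keystrokes, but O(n) time and O(1) space -- the 'efficient' camp."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert fib_short(10) == fib_dp(10) == 55
```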

xp: pure algo questions often require QQ 打通脉络 (“connecting the dots”)

  • eg: the bbg 2017 pure algo question on tree serialization required a standard BFS. Just a few lines, but I struggled for a long time. If you memorize the key parts and practice enough, then the *knowledge* shows up as an algo skill
  • eg: the facebook regex problem is very hard for anyone I know, unless they have worked on a similar problem before and *know* the tricks.
  • eg: generate permutations, combinations. Can be tricky unless you *know* the key points.
  • eg: DP bottom-up vs top-down with memoization
  • eg: SCB-FM IV: stack-based queue
  • .. many more examples

打通知识脉络 (“connecting the veins of knowledge”) is the Chinese phrase.
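For reference, the standard BFS serialization really is just a few lines once you *know* it (a sketch; ‘#’ marks a null child):

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def serialize(root):
    """Level-order (BFS) walk; record '#' for each missing child."""
    out, q = [], deque([root])
    while q:
        node = q.popleft()
        if node is None:
            out.append('#')
        else:
            out.append(str(node.val))
            q.append(node.left)
            q.append(node.right)
    return ','.join(out)

tree = Node(1, Node(2), Node(3, None, Node(4)))
assert serialize(tree) == '1,2,3,#,#,#,4,#,#'
```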

required experience: QQ imt CIV

A few years into my technology career, I observed that Solaris/Oracle was harder to self-teach at home than linux/mysql/php/python/perl/javascript because even a high-school student can install and hack with the latter. Entry-barrier was non-existent.

Similarly, I now observe that Wall St QQ-IV demands longer experience than west-coast-style coding IV. Coding drill can use Leetcode over months; QQ requires not so much project experience as a lot of interview experience. It takes more than months of exploration and self-learning.

  • example — tcp/ip. My friend Shanyou preferred to read books systematically, but a book touches on hundreds of topics. He wouldn’t know which topics interviewers like to dig into.
  • example — c++ TMP. A fresh grad can read about it but won’t know the favorite topics for interviewers
  • Compelling Example — java concurrency. A fresh grad can build up theoretical knowledge but won’t have my level of insight

Many inexperienced candidates were highly appreciated in west coast interviews. No such appreciation on Wall St, because Wall St VP or contract roles require work experience .. tried-n-tested.

  • Wall St interviews are selective in terms of experience
  • West Coast coding interviews are selective in terms of … speed and optimality

%% absorbency: experiment/SDI imt speed-coding

When you find yourself in high-absorbency mood, favor (#1 most draining) speed-coding. See list of less-draining drills at the bottom.

I can read dry QQ topics for hours each day for many days, but tend to lose steam after coding for a few hours. Speed coding drill drains my “laser” energy faster than QQ reading/blogging/experiment. When my laser is weakened (by boredom), I must drill harder to go through the “brick walls”.

I guess many fellow programmers enjoy coding more than reading. I feel lucky that in my interviews, knowledge tests still outweigh coding test.

Q: What part of coding drill is worst on my absorbency?

A: speed-coding implementation is worst. It drains my laser energy fastest. After coding for a few hours I always feel like a deflated balloon and a discharged battery, and need a full day to recharge.

I think frustration is the key. Self-expectation (about progress and traction) and self-image create the frustration and the drain.

Instead of traction, I often feel stuck and overspent.

I feel disappointed with myself (Deepak self-identifies as “annoyed”).

Q: Can we stop comparing with others and just compare with our past? Doable sometimes. Consider Leetcode speed-coding contest #Rahul

— Just like yoga

  • if I work on easy problems I feel wasting my time
  • if I work on tough problems I find it painful and draining and want to give up. After the practice I need hours to recover.

Q: … So can we find easier coding drills that I could enjoy (as Rahul suggested)? Definitely not easy. I think the first difficult step is self-acceptance that I can’t improve much at this age.

Q (excellent question): What type of “coding” drill can I do for hours like reading/blogging?

  • pseudo-code algo on paper/whiteboard is lighter. No ECT so I am swift and efficient. Less draining/frustrating.
  • SDI is most fun, least boring, not draining/frustrating. I can spend hours on a SDI. I feel a bit of accu. More like QQ less like coding drill.
  • concurrency coding questions are less draining as other guys are not faster
  • c++/java language-feature QQ experiments are more like QQ. I can spend hours on a QQ experiment. More interesting as there’s no time-line, no benchmark, no frustration. Also other guys are not stronger. I feel some accu, exactly like reading about these features
  • review of my previous code is much less draining (than writing new solutions) as there’s no time-line and code is already working
  • analyzing patterns and reusable techniques (very few) in past problems. Thick->thin is the holy grail. I work hard towards it.
  • reading syntax and ECT tips in books and my blog

 

#1(reusable)AuxDS for algo challenges

Here’s a Reusable data structure for many pure algo challenges:

Pre-process to construct a static data store holding a bunch of “structs” in a linked list -OR- growing vector -OR- growing RBTree, all with O(1) insertion :). Then we can build multiple “indices” pointing to the nodes.

Here are a few O(1) indices: (Note O(1) lookup is the best we can dream of)

  • hashtable {a struct field like a string -> iterator into the data store}
  • array indexed by a struct field like small int id, where payload is an iterator from the data store
  • If the structs have some non-unique int field like age, then we can use the same key lookup to reach a “group”, and within the group use one (or multiple) hashtable(s) keyed by another struct field

I think this is rather powerful and needed only in the most challenging problems like LRU cache.
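As a sketch of the pattern, the classic LRU cache pairs exactly such a data store (a doubly-linked list of entries) with a hashtable index into its nodes. In python, OrderedDict bundles the two, giving O(1) get/put:

```python
from collections import OrderedDict

class LRUCache:
    """Store + index pattern: OrderedDict keeps a linked list of entries
    (in recency order) plus a hashtable index into the list nodes."""
    def __init__(self, capacity):
        self.cap = capacity
        self.od = OrderedDict()

    def get(self, key):
        if key not in self.od:
            return -1
        self.od.move_to_end(key)          # O(1): mark most-recently-used
        return self.od[key]

    def put(self, key, value):
        if key in self.od:
            self.od.move_to_end(key)
        self.od[key] = value
        if len(self.od) > self.cap:
            self.od.popitem(last=False)   # O(1): evict least-recently-used

c = LRUCache(2)
c.put(1, 1); c.put(2, 2)
assert c.get(1) == 1
c.put(3, 3)                # evicts key 2, the least recently used
assert c.get(2) == -1 and c.get(3) == 3
```

Hand-rolling the doubly-linked list plus hashmap gives the same complexity; OrderedDict just keeps the sketch short.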

Will tough coding IV decline]10Y@@ #Rahul

It’s possible that as easy coding IVs spread or grow, tough coding IVs decline i.e. less wide-spread. In such a case, my t-investment now will have lower ROTI than expected.

It’s possible that as easy coding IVs spread or grow, tough coding IVs also grow … for differentiation of talent.

Rahul responded to my question and pointed out the two styles of interviewers he observed:

* interested in can/can’t solve — can this candidate get the problem solved in time?
* interested in thought process

Rahul felt the first type will want to use ever tougher problems to pick outstanding candidates… selectivity

Rahul felt the second type will want to use easier problems to get more insight into the candidate’s thought process.

Rahul felt that, to differentiate, employers can compare completion times on easy questions.

%% pure-algo/dStructure skill@recent CIV

Hi Friends,

Over recent (about 12) months I have attempted several coding interviews

  • passed a Standard Chartered pure algo test over phone, short questions but non-trivial
  • probably passed a big weekend coding assignment given by an e-commerce start-up. I later found out my solution is similar to a published topological sorting algorithm so my algo is probably close to optimal.
  • passed another weekend big coding assignment given by Nasdaq
  • passed two separate paper+pencil coding tests at two reputable sell-side firms (i.e. banks or brokers)
  • passed a speed coding test at a reputable internet company headquartered in Texas.
  • probably passed a few bloomberg coding tests, both remote and onsite, on computers or whiteboard.
  • (All recent failed coding tests happened at High-frequency trading shops .. very picky.)

All of these positive experiences convinced me that my algo and esp. data structure skills are improving. So I gave myself a rating of “A-minus” among Wall Street candidates. I felt esp. confident with white-board coding — with a real compiler, my proficiency is lower than other candidates.

Then I failed, on technical ground, at two big internet companies (FB and … let’s say company XX). Ironically, these were white-board pure-algo problems involving non-trivial data structures — my traditional strengths. So what’s going on? Unexpected failures deserve analyses.

— First, I think I was still too slow by their standard. One interviewer (at XX) told me they wouldn’t reject candidates because of slow completion, but I believe that’s not what she actually meant to say (she is not a native speaker). Suppose a candidate takes a long time to come up with a sub-optimal solution; I actually believe that given more time he would find an optimal solution, because I am that type of candidate. But interviewers would have to assume he is weaker than those who complete the optimal solution quickly.

Interviewers often say they look out for thought process, but that’s probably something they look out for in-addition-to finding decent solutions in time. I think I tend to show clear (not always correct) thought process but that’s not enough.

I think those interviewers really look for “near-optimal solutions, but not because the candidate memorized them beforehand”, so candidates need to 1) find a good solution 2) be able to explain it when challenged.

— Second, I think the competitors are stronger (faster) than on Wall St. I call them “west-coast competitors” though they could be working in any country, all competing for west-coast tech jobs. I guess they practice hundreds of problems on Leetcode. Such heavy practice doesn’t guarantee success but increases their winning chance.

— Third, I think my big-O estimate needs improvement. I made multiple mistakes at XX coding rounds, and elsewhere.

Big-O has never been a show-stopper in my Wall St interviews, so I didn’t pay close attention to the details.

— Fourth, I think I have not tried often enough. If I keep trying similar internet companies I might get lucky and score a technical win.

—-

Given the stiff competition and high standard at west coast coding tests, I now feel it’s extremely important to set intermediate goals

  1. goal — pass some quick algo quizzes given by west coast interviewers over phone. I used to get those quizzes from Google phone round
  2. goal — pass some initial (simpler) coding round via webex/coderpad
  3. goal — impress at least one interviewer at one coding round. I can often sense that an interviewer is delighted to hear my idea
  4. goal — impress multiple interviewers, even if your solution is not optimal. I achieved this at Bloomberg
  5. goal — pass all coding rounds at a reputable west-coast tech shop — which is my original goal, never achieved
  6. goal — 100% technical win i.e. pass all technical rounds including system-design-interviews. Note you can still fail the soft skill test.
  7. goal — get an offer

With these intermediate goals, the picture is no longer black and white, but grayscale. We might have hit some of the intermediate goals already 🙂

prevalence@speed coding among coding-IV

25-33% of my past coding IVs (counted by hour) were tough speed-coding. The remaining hours were:

  • Easy speed-coding problems, easy to me
  • no-coding Pure-algo quizzes, often telephonic
  • Take-home assignments
  • Threading
  • latency engineering
  • test of language knowledge
  • SDI without implementation

— west-coast coding IV≅speed-coding contest

This skillset is unneeded on any job. For GTD or PKI, it is not in the top 10.

You need everything fast. Many students practice heavily to gain speed.

 

%%AuxDS strength ] coding test #algoQQ

Update — Indeed experience..

  1. when a tough problem requires us to construct an explicit and non-trivial auxiliary data structure (AuxDS), I often have an intuitive feel for the data structures needed, though not always the optimal ones.
  2. when a tough problem requires a simpler AuxDS + unusual algo, I don’t have an edge.
  3. When a tough problem requires only an unusual algo, I don’t have an edge.
    * eg: edit distance
    * eg: sliding window max
    * eg: maximal sub-matrix

  • Suggestion: invest in algoQQ. I feel they will stick with me.
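Taking sliding-window-max as an example of “simpler AuxDS + unusual algo”: the AuxDS is just a deque, but the monotonic-deque algo is the part you need to *know* (a sketch):

```python
from collections import deque

def sliding_window_max(nums, k):
    """Monotonic deque of indices with decreasing values; the front is
    always the max of the current window. Each index enters and leaves
    the deque at most once, so the whole pass is O(N)."""
    dq, out = deque(), []
    for i, x in enumerate(nums):
        while dq and nums[dq[-1]] <= x:
            dq.pop()                 # smaller elements can never be the max
        dq.append(i)
        if dq[0] <= i - k:
            dq.popleft()             # front index fell out of the window
        if i >= k - 1:
            out.append(nums[dq[0]])
    return out

assert sliding_window_max([1, 3, -1, -3, 5, 3, 6, 7], 3) == [3, 3, 5, 5, 6, 7]
```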

short coding IV: ibank^HFT^inetShop

See also prevalence@speed coding among coding-IV

  • “InetShop” includes all west-coast type of hiring teams including China shops
  • Ibanks are mostly interested in language knowledge, so they use coding test (including multithreaded) for that purpose.  Ibanks also use simpler coding questions for basic screening.
  • HFT shops (and bbg) have an unpredictable mix of focuses

I predict that Wall St will Not adopt the west coast practice due to .. poor test-coverage — Not testing language knowledge, latency engineering, threading etc.

The table below excludes take-home tests and includes hackerrank, white-board, webex, and by-phone algo questions.

|   | ibanks | web2.0 shops | HFT |
| --- | --- | --- | --- |
| language knowledge; latency NOT bigO | #1 focus | never | key focus, not #1 |
| simple-medium speed coding | rare | #1 focus, high bar | common |
| medium-simple algo without implementation (logistically easiest => popular) | common #SCB; minimum bar is low | common | key focus, not #1 |
| tough algo | never | sometimes | rare |
| big-O of your solution, NOT latency | LGlp, low bar | #2 focus, high bar | LGlp |
| concurrency coding test | G5 focus | never #python/ruby/js | sometimes |

 

prior knowledge do make you look brainy: algo IV

After solving the min-stack problem, I found something paradoxical, even nonsensical — someone said this question is considered “easy”, yet if you have not seen it before you may find it extremely hard. I thought O(1) worst case was impossible. I have experienced many similar situations where prior knowledge can make you look very brainy.
  • example — detect loops in a singly linked list, where the loop could be far out.
  • example — reverse an extremely long singly linked list which breaks any recursive solution, so you must use non-recursion
  • example — in a regular unsorted binary tree (every node has up to 2 child nodes and no uplink to parent) how to print all nodes in-order but with O(1) additional space and O(N) time
I feel west coast (and Bloomberg) tend to ask this type of questions because they expect their candidates to study and acquire the knowledge before applying.
If you don’t study, and hope to come up with a reasonable solution on the spot, then you must be extremely intelligent or lucky. I think the math Olympiad gold medalists also need to study before their competitions.
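The loop-detection example is a canonical prior-knowledge case: Floyd’s tortoise-and-hare is trivial once known, and baffling otherwise (a sketch):

```python
class ListNode:
    def __init__(self, val):
        self.val, self.next = val, None

def has_loop(head):
    """Floyd's cycle detection: fast moves 2 steps per slow's 1;
    the two pointers meet iff there is a cycle. O(N) time, O(1) space."""
    slow = fast = head
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
        if slow is fast:
            return True
    return False

a, b, c = ListNode(1), ListNode(2), ListNode(3)
a.next, b.next = b, c
assert has_loop(a) is False
c.next = b                  # create a loop far out in the list
assert has_loop(a) is True
```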

xx: weekendCIV^codingDrill^work

The amount of tech learning and zbs growth from weekend coding assignments is different from coding drills or work projects:

  • intensity — highest
  • QQ halos — comparable to my language feature experiments — improves QQ
  • scale — larger than coding drill but not too large like work projects
  • BP — is a major learning focus and evaluation criteria
  • absorbency — highest
  • sustained focus — lasting over a few days, pretty rare for me

I separate pure-algo from real-coding questions

Many job seekers simply say “algorithm question” or “coding interview” as if they meant the same thing. I see significant differences between the two types of coding test — compilable code vs pure algo.

Here’s one difference to start with — real-coding questions require us to be very familiar with syntax and very fast edit-compile-test cycles. I often struggle even when my idea is correct, and I often run out of time. There is no such requirement in a white-board algo test.

I try to visualize a spectrum image like

<=== harder on language+implementation (blue side) … (red side) increasingly harder on pure algorithm ===>

  • (On the far-left “ultraviolet” side of the spectrum) multi-threading coding questions are light on pure computer-science algorithms.
  • (Also on the far left) some weekend coding assignments are very serious about design trade-offs, reasonable assumptions, thorough automated testing … no such requirement in a pure-algo test.
  • (On the mid-left) real coding questions can have a code-smell issue, esp. in the weekend take-home assignment. I failed many times there even though my code worked fine.
  • (In the middle) many Hackerrank and Codility questions involve a difficult algorithm plus fast edit-compile-test. Very hard for me.
  • (On the right side) pure algorithm questions are sometimes on a white board, or even over the phone if very short. Light on syntax; no edit-compile-test; no code smells.
  • (On the far right side) some west coast white-board interviews require candidates to ask clarifying questions, then eyeball-test their code without a compiler. No such requirement in a real-coding interview.
  • (On the extreme right end) very tough algorithm questions care only about O(), not language efficiency — pick your language. The toughest algorithm questions don’t even care about O() so long as you can find any solution within 15 minutes.

I often perform better with pure algo questions than real-coding questions. My worst game is codility/hackerrank.

I find it imprecise to call all of them “algorithm questions” and disregard the key distinctions between the two types. Both types have some algorithm elements, but real-coding questions often require additional skills such as quick completion, language efficiency, and knowledge of code smells. A bug can easily take 10 minutes to locate — something rare in pure-algo questions.

Effective strategies to prepare for real-coding vs pure algo interviews are rather different. In this email I will only mention a few.
1) for speed, we need to practice and practice writing real code and solving short problems; you can try timing yourself.
2) for pure algo, we can use python; we can look at other people’s solutions; we still need to practice implementing the ideas
3) for white-board, we need to practice talking through a problem solution… need to practice asking clarifying questions and eyeball-testing
4) for weekend take-home tests, practice is not effective for me. I won’t prepare.
5) for multi-threading problems, we need to practice basic multi-threading problems.

## Are those leetcoders really stronger@@

You wrote “I think many leetcoders are fresh graduates, or young people, who have more time and energy.  I know I cannot compete them …”

Q: Are those leetcoders really stronger in coding tests? Generally yes, but it depends on the type of coding question.

  • For weekend take-home assignments …. I am not sure they would produce better-quality code than us. (I’m not very good either.) Code smells matter to my past interviewers.
    • They may not run as many tests as we do.
    • I tend to spend 200%+ more time on the assignment
  • For white-board ….. or remote dumb editor pair programming (west coast favorite), they may not be able to explain their thought process as well as I do.
    • My 2007 Google white-board interview used pseudo code, so the leetcoders’ advantage would be much lower.
  • For completely original questions created by a hiring manager and sent via the HackerRank platform, leetcoders may not be fast enough to solve them, since leetcode can’t include every newly invented problem.
    • I hit two problems related to shortest-path algorithm, which can be similar to leetcode problems. I also hit very unusual problems.
  • For multi-threading coding questions, leetcode doesn’t help.
    • hit me at least 8 times — UBS, Barcap, Gelber, Bbg ..

prefer std::deque when you need stack/queue

  • std::stack is unnecessarily troublesome for dumping and troubleshooting … see https://github.com/tiger40490/repo1/blob/cpp1/cpp/2d/maxRectangle.cpp
  • std::queue has a similar restriction.

Philosophically, std::stack and std::queue are container adapters designed to deny random access, often for safety. They are often based on an underlying “crippled deque”. As soon as you need to dump the container content, you want a full-fledged deque.
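The same trade-off exists outside C++. A quick Python illustration (based on my recollection of the stdlib: queue.LifoQueue hides its contents behind get/put, while collections.deque can be dumped directly):

```python
from collections import deque

stack = deque()            # full-fledged deque used as a stack
for x in (1, 2, 3):
    stack.append(x)

snapshot = list(stack)     # dump the content anytime -- great for troubleshooting
top = stack.pop()          # still O(1) stack semantics
```

In C++ the analogous move is to declare a std::deque directly instead of std::stack, so iterators stay available for dumping.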

nsdq=lightweight west coast CIV

  • like west-coast (and unlike bbg), no coding questions on threading
  • like west-coast, no memory-mgmt questions
  • like west-coast, no questions on containers as implemented in any language
  • arch design question — more in-depth than Indeed
  • algo + whiteboard
  • unlike west-coast, there were a few knowledge questions, but nothing in-depth like on Wall St.
  • unlike west-coast, there was a weekend assignment

Now I feel some west coast interviews might be similar to nsdq interview.

IDE IV=easier4some !!4me

I guess many (a majority?) of candidates consider an IDE coding IV easier than white-board. I think they rely on the IDE as a coding aid. Many programmers are not used to “demo” coding on a white board … stressful, unnatural?

The “coding aid” feature helps me too, but it helps my competitors more !

If on white-board (or paper) my relative rating is A-, then on IDE I would score B.

  • my relative weakness on IDE — ECT speed
  • my relative strength on white-board — clarity of thinking/explanation

## coding IV P/F

See also ##failed c++cod`IV: home^IDE^whiteboard:4 beat-fronts

  • paper — is a shorthand for paper, dumb editor, white-board etc
  • webex and onsite are somewhat similar, usually pair-programming. The take-home setting is vastly different.
  • pts (points) — awarded to myself, indicating the performance, value, difficulty ..
  • p/f are only based on interviewer’s subjective judgement
  • pp means passed convincingly, with flying colors
  • [1] I focused on working solution! See %% priorities in a take-home coding IV
  • [2] highlight: where practice can make a difference in a make-or-break test
lang | location | IDE? | P/F | firm/person | pts 1-5 | year | Notes | category | type@practice needed [2]
py home hackerrank Optiver 3 2019 8H: Morse;prod@2BigInt tough algo
c home hackerrank p FlowTrader #Ashish 3 2019 algo
c home hackerrank p Ziliqa #Ashish 2 2019 see mail 20 Oct std algo speed-coding; memorize syntax
py webex IDE p GS-HK 4 2019 See highest point ]%%visible python proficiency
py onsite paper f FB 2019 full day
c/py onsite paper p FB 5 2019 first round algo Leetcode medium/hard
c weekend IDE HRT 5 2019 parser dStruct heavy
py onsite paper 😦 Indeed 3 2019 algo algo practice
c weekend IDE they gave up redMart 5 2019
c/py home hackerrank no news Altonomy 4 2019 4 problems #Deepak
c onsite paper p Quoine 1 2019 balanced brackets #5-min job algo
py webex IDE p Indeed 5 2019 see my blogposts algo speed-coding; memorize syntax
c home hackerrank p TradeWeb 1 2019 CSY: basic STL map manipulation algo STL + basic algo practice needed definitely
c phone phone p SCB-FM 2 2018 pure algo 1 easy + 2 hard ones. I over-performed algo array/slist problems
c weekend IDE 😦 quantlab 4 2018 Houston HFT weekend coding drill ROTI dStruct
c/thr home IDE p Pimco quant 3 2018 2H parallel programming thr study pthreads
c home hackerrank p FlexTrade 3 2018 slist cleanup#github std algo array+list problems
java home hackerrank 😦 Pimco #190m 2 2018 Q9: path through matrix. Ineffi [1] std algo too hard, but some 2D practice can help
java home hackerrank 😦 Pimco #190m 2018 Q8: buy 1 sell any algo array problems
c onsite paper p LiquidNet 2 2018 30 minutes algo array problems
c home codility 😦 Mako 4 2018 monkey. Ineffi [1] algo too hard
c home codility 😦 Mako 3 2018 quasiconstant [1] algo array practice
c onsite paper p SIG 2 2018 15-30min … RAII smart ptr std lang
c onsite paper p SIG 3 2018 push_back std lang
py onsite IDE p SIG 3 2018 movie query dStruct algo practice
c weekend hackerrank p Promethean 5 2018 2 tough graph problems algo too hard but some QQ can help
c/py home hackerrank pp Kraken 3 2018 3 (small?) problems algo algo
any onsite paper 😦 Nsdq 2018 oracle day-trading algo array problems
any onsite paper 😦 Nsdq 2018 check array 0-N in-situ algo array@int problems
java home IDE pp Nsda 4 2018 drone: medium-level implementation SDI hard to prepare
c onsite paper pp CVA 3 2018 O(N) in-situ filtering algo array problems
c onsite paper 😦 FB 2018 2 short problems std algo algo
c onsite 😦 Trex 2018 exchange messaging SDI no practice can help
py onsite IDE pp Trex 1 2018 3 short tricky problems. only 1 implemented algo algo practice can help
c onsite paper pp Quantum 1 2018 virt func not declared in base trivial lang
c/thr webex paper pp bbg 4 2018 single-thr pool QQ lang
c webex paper pp bbg 3 2018 shared ptr ctor/dtor QQ lang
py onsite paper pp bbg 5 2018 continuousSentence algo backtracking, string problems
c/thr onsite paper ? Wells 2 2017 concurrent queue std
c onsite paper p Wells 1 2017 remove_spaces. Ineffi trivial string problems
c onsite paper p bbg 2 teams 2 2017 array reshuffle lang
c onsite IDE p bbg 2 teams 1 2017 isBST() std tree problems
c onsite paper P! bbg 2 teams 1 2017 string string problems
c onsite paper 😦 bbg 2 teams 2017 filters on screen dStruct heavy dStruct
C home hackerrank 😦 Thesys 2017 3 problems out of 5 algo algo
C home IDE pp Thesys 5 2017 first N prime Fibonacci hard to prepare
c webex IDE pp bbg 2 2017 top N active stocks dStruct heavy dStruct
c onsite paper pp BAML 1 2017 various lang
c weekend IDE 😦 GS 3 2017 TickEngine dStruct heavy dStruct problems can help
C webex IDE pp bbg 2 2017 free slots dStruct heavy algo
C webex paper p bbg 1st round 2 2017 tree serialization std algo tree algo
Java onsite paper pp BGC 2 2017 multiple short programs lang
Java weekend IDE pp BGC 3 2017 connected or not hard to prepare
Java/thr weekend IDE p HSBC 4 2017 3 big questions barely passed hard to prepare
Java/thr home IDE pp pimco 4 2017 iterator again lang
Java onsite paper pp pimco-Zoltan 2 2017 a few short problems lang?
c webex IDE 😦 Citadel 2017 array shrinking array problems
py webex paper ? Broadway 2017 hashtable. Code (on github) works std algo implement std containers
cpp weekend IDE ? iRage 3 2015 order book again. No response dStruct heavy hard to prepare
py home codility p baml 2016 Ashish: applying to Qz algo
Java home codility 😦 baml 2015 Qz FX option team. too lazy algo
c home codility 😦 Jump 3rd 2015 algo
c weekend IDE pp jump 1st 3 2012 order book dStruct heavy hard to prepare
c weekend IDE 😦 DRW 3 2015 Tetris. code quality hard to prepare
c webex paper 😦 bbg -London 2015 unable to understand Q1 of 4 std algo algo
c# webex paper pp eikon 2 2013 short, easy questions QQ lang won’t practice
java home IDE 😦 MS 4 2011 FIX. Big take-home test hard to prepare
swing webex IDE pp Barx 2 2012 swing QQ lang won’t practice
C home IDE 😦 Mac 2012 2 short problems lang
java home IDE 😦 MS-comm 2012 too lazy hard to prepare
c onsite paper pp Cantor 1 2011 auto-ptr QQ lang won’t practice
java onsite paper 😦 Barc SOR 2011 recursion recursion
java onsite IDE pp RBC 1 2011 30-45 min, bond tick conversion lang
java onsite IDE 😦 UBS 2011 Suntec lang
java/thr onsite whiteboard 😦 UBS 2011 Jack Yang lockfree stack QQ
java/thr onsite whiteboard 😦 Barc 2011 wait/notify + lockfree QQ
java/thr onsite IDE 😦 Lab49 2010 executor lang
java/thr home IDE pp Gelber 3 2011 multithreaded hard to prepare
C home IDE 😦 Amazon 2011 didn’t take it seriously algo
C onsite whiteboard 😦 FB 2012 regex QQ std algo too hard, but some QQ can make a diff
any onsite whiteboard 😦 Goog 2007 algo

bitwise coding questions: uncommon

Don’t over-invest.

Q: How prevalent are these questions in coding interviews?

  • I feel these questions are usually contrived, and therefore low-quality and unpopular among hiring firms.
  • There are no classic comp-science constructs at the bitwise level, and bitwise doesn’t play well with those constructs
  • the bitwise hacks must be language-neutral wrt python and javascript, so quality questions are scarce.
  • phone round? impossible

white-board real estate usage

  1. We often insert lines later at …
    1. Leave 2 lines of space before and after a loop header.
    2. Leave 2 lines of space after a function header.
    3. Leave space before and after an if-header.
  2. Example data — should be drawn at bottom right, but make sure you can see them as you code
  3. global var list — I tend to keep them at bottom left on one line
  4. I like many utility functions
    1. short utility function — you can start it near the bottom, or above existing code, with limited “runway”
    2. lengthy yet easy utility — If you worry a utility function may be lengthy, you can start writing on paper and stick it near the board
  5. top right — is for a 2nd major function, perhaps extracted from the first major function
  6. top level driver function — is never complicated but can grow.  I can write it below an existing function
  7. center of the board — leave empty for eyeball test, other examples, graph etc

AuxDS ] each node2simplify algo #shadow matrix

In some algo questions, I like to add a tiny “metadata” field to the VO.

eg: BFT showing level

Q: how likely is interviewer to accept it?

A: I feel west coast interviews tend to entertain crazy ideas since their algo challenges are more free-flowing, exploratory, thought-provoking. I remember the Bbg FX interviewer (Karin?) said my “in-place” solution was OK if I append to the existing vector then truncate the original portion.

A: I feel if it’s clever then they may appreciate it.

It’s a space/time trade-off. The metadata field

  • can be a pointer to parent or next node (like LinkedHashMap)
  • can be an iterator into another (static) data structure
  • can be the element’s own address if not already available
  • can be a sequence id as sorting key
    • insertion id is popular,
  • can be a bunch of flags packed into 8-bit integer
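A sketch of the BFT-showing-level example above — the tiny metadata field here is a level tag on each node (class and function names are my own):

```python
from collections import deque

class TNode:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right
        self.level = None          # the tiny metadata field

def bft_with_level(root):
    """Breadth-first traversal, tagging each node with its level."""
    out = []
    q = deque([(root, 0)])
    while q:
        node, lvl = q.popleft()
        if node is None:
            continue
        node.level = lvl           # write the metadata
        out.append((node.val, lvl))
        q.append((node.left, lvl + 1))
        q.append((node.right, lvl + 1))
    return out
```

The extra field costs a few bytes per node but removes bookkeeping from the main algorithm.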

 

sentinel node trick: array/slist

  • eg: STL uses sentinel nodes. I believe container.end() returns that.
  • eg: c-string uses \0 as sentinel value.
  • eg: a binary-search needle can be too high, forcing us to test whether the return value is a real element or the end-of-container; it’s extremely useful to add another sentinel to guarantee a successful search

When solving timed coding questions on “clean” arrays, it can be useful to append a sentinel node. It can simplify some algorithms which take action after each segment.

P 90 [[Pearls]] introduced a loop coding idiom, applicable whenever we loop over a container with an early-exit “break”. Such a loop has exactly two [1] conditional tests per iteration, so it can run faster if we combine the two into one conditional test. This is a small halo idiom for latency, but beyond latency there are more interesting benefits such as reduced cognitive complexity.

For example, consider the business logic after reaching the “normal” end of the container without hitting the early exit. Rarely can we “forget” this business logic and simply exit the function via the implicit return. Instead, this normal-completion scenario requires careful handling — a cognitive burden. To remind myself, I often place a comment after the end of the loop. (Python provides an else clause for a for-loop.)

In some cases, the business logic after the end of the loop sits two pages away from the early-exit business logic, though they really belong in one module. I dislike this situation so much that I always set a flag and use a “break”, so that the two routines are both written after the end of the loop.

In all these scenarios, it’s often simpler to append an artificial sentinel element at the end of the container! The sentinel is guaranteed to trigger the early exit, so the loop never exhausts all elements and never completes “normally”. Therefore we can combine the two tests into one. 🙂

I usually replace “break” with “exit”, further reducing the control-flow complexity. Such micro simplifications can pay huge dividends in high-pressure timed tests.

Right before the exit/break we may still need to handle the normal-completion scenario, by checking whether we are at the sentinel. Now we can combine the logic of normal completion vs early exit; in many cases they are (nearly) identical, so we can express the logic in very structured code.

Whether they are identical or not, handling the two scenarios (early vs normal completion) in a centralized module is usually simpler and more structured, reducing the cognitive complexity.

[1] what if three tests (two breaks)? The mental burden is even worse. The sentinel could reduce it
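A minimal sketch of the Pearls sentinel idiom on a linear search — the appended sentinel guarantees the early exit always fires, so the loop keeps a single conditional test per iteration, and early-exit vs normal-completion are handled together in one place after the loop:

```python
def find_index(arr, needle):
    """Linear search with a sentinel: one conditional test per iteration."""
    arr.append(needle)        # sentinel: the search can no longer "fail"
    i = 0
    while arr[i] != needle:   # single test; no separate bounds check
        i += 1
    arr.pop()                 # restore the container
    # centralized handling: early exit (real hit) vs normal completion (sentinel hit)
    return i if i < len(arr) else -1
```

Without the sentinel, the loop would need both `i < len(arr)` and `arr[i] != needle` on every iteration.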

Gayle: no rush2write code: point allocation

Gayle strongly advocates getting a brute-force solution as soon as possible, then walking through the “idea” in detail before coding. However, I’d rather start coding up my idea much earlier, since I need more time for coding.

I guess Gayle’s experience shows that some interviewers care less about runnable working code than about the thought process. For a tricky algo, if the idea is sound, then you basically pass…? See “with algo cracked, ECT = 雕虫小技 (a trivial craft)”.

However, I feel an elaborated/detailed “idea” (item 2 below) alone is not enough to pass. I frequently run out of time implementing things like https://www.geeksforgeeks.org/print-a-given-matrix-in-spiral-form/. The algorithm is simple and not worth 40%, but I was too slow completing a basic implementation. “Done is better than perfect” — a basic implementation is better than no implementation. You need to demonstrate you can implement a working solution.

Here’s my estimate of the point distribution in a typical whiteboard interview:

  1. 10% clear understanding of requirements + clarifying questions
  2. 35(50)% detailed idea (even unimplemented) with decent performance enhancement
  3. 35(20-40)% basically working implementation.
  4. 5% performance analysis — can be tricky
  5. 5% readability — modularized, clean
  6. 5% eyeball test — can be hard to get right if code is convoluted with many variables
  7. 5% minor bugs, corner cases

Gayle: use big,ugly sample2explore; tiny sample4test

–In the initial exploration (when the direction is only 50% clear), a big ugly sample clarifies many assumptions.

Any mis-assumption corrected is a potential life-saver since the mis-assumption could lead us down the wrong track.

Such a sample will help us see the pattern in data and get to the gist of the problem.

–However, during the eyeball testing on white board, a large sample would take too much time. A tiny sample can flush out the bugs more quickly

## BRING reminder sheet : cod`IV

  1. edge case — my loop may get skipped completely
  2. edge case — empty output
  3. drive to improve after it works
  • bring rectangular post-it + breakable tape

—-todo:

bbg: Problem-Solving skill

I think this skill is like IQ…

I feel this is broadly similar to west coast interviews, perhaps with subtle differences in how Mistakes are assessed.

One Bbg interviewer (young Chinese guy) shared with me what he meant by “problem solving skill” — given a well-defined programming problem

  • not superb performance
  • not syntax perfection
  • solution should be “reliable”. I guess he means ‘free of major bugs’
  • problem is often realistic — I have noticed this feature repeatedly
  • I said many candidates go on the wrong track and (usually) get stuck. He agreed.
  • I said many solutions are broken because candidate misses some key factor. He agreed.
    • perhaps a factor about requirement
  • I now think some candidates have no experience and no basic understanding of a given programming topic (say, DP, or a regex parser, or condVar), so they can never get off the wrong track.

Rahul’s reasoning — on the job, most of the time a new hire would receive inputs from other people, and need to think through a problem with many obstacles and unknowns. Interviewer can’t really pose a realistic work site problem, so interviewers use algo questions as a proxy.

I think that besides the technical capabilities, there are many personality and motivation factors that determine a candidate’s progress on a given problem

  • Is the candidate confident (rather than nervous) to take the hint and work on it?
  • Is the candidate too ready to give up?
  • Is the candidate unable to use the hint, yet unwilling to seek clarification?

Rahul as an interviewer tries to make the candidate relax and perform at her best.

c++matrix using deque@deque #python easier

My own experiment https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/spiral_FB.cpp shows

  • better to default-populate with zeros; afterwards, you can easily overwrite individual cells without bound checks.
  • it’s easy to insert a new row anywhere. Vector would be inefficient.
  • To insert a new column, we need a simple loop

For python, # zero-initialize a 5-row, 8-column matrix:
width, height = 8, 5
Matrix = [[0 for x in range(width)] for y in range(height)]
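Continuing the Python sketch, the two insertion operations from the bullets above — insert a whole row anywhere, and insert a column via a simple loop:

```python
width, height = 8, 5
matrix = [[0] * width for _ in range(height)]   # zero-initialized 5x8

# insert a new row anywhere -- one list insert
matrix.insert(2, [7] * width)

# insert a new column -- a simple loop over the rows
for row in matrix:
    row.insert(3, 9)
```

The pile-of-horizontal-arrays structure is visible here: a “column” is never a real object, only a loop over rows.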

In any programming language, the underlying data structure is a uniform pile-of-horizontal-arrays, therefore it’s crucial (and tricky) to understand indexing. It’s very similar to matrix indexing — Mat[0,1] refers to first row, 2nd element.

Warning: a 2D array is hard to pass as a function argument in C++, based on personal experience 😦 You often need to specify the size in the receiving function declaration. Even if that’s feasible, it’s unwanted legwork.

[1] Warning — The concept of “column” is mathematical (matrix) and non-existent in our implementation, therefore misleading! I will avoid any mention of it in my source code. No object no data structure for the “column”!

[2] Warning — another confusion due to mathematics training. Better to avoid Cartesian coordinates. Point(4,1) is the 5th item on the 2nd row, therefore arr[1][4] — you need to swap the subscripts.

|  | 1st subscript | 2nd subscript |
|---|---|---|
| max subscript | 44 | 77 |
| height #rowCnt | 45 (not an index value) |  |
| width #rowSz [1] |  | 78 (not an index value) |
| example value | 1 picks 2nd row | 4 picks 5th item in the row |
| variable name | rowId, whichRow | subId [1] |
| Cartesian coordinate [2] | y=1 (left index) | x=4 |

FB cod`IV tips: #official

bucket sort; merge sort; graph

watch more videos

go through the Prep emails.

  • Practice.. Test your code on the board, not in your head.
  • You can confirm with the interviewer if you want to start coding earlier. I tend to run out of time solving the problem; Kelly said just practice more to speed up.
  • You can request to skip the small talk and start the technical part earlier.
  • Messy code is OK for some interviewers, as long as they can follow your thoughts. Poor recall of language details is probably OK, but not fundamental mistakes.
  • Always .. The interviewer will under-specify the problem, waiting for you to ask clarifying questions. The more corner cases you identify the better.
  • Always .. for all coding interviews, the problem is solvable within 30-40 minutes.
  • Always .. Done is more important than perfect. You can improve an imperfect solution in terms of big-O.
  • Always .. walk through your solution with some unusual cases. Some interviewers don’t care about trivial corner cases, but some unusual inputs are valid and important, like an extremely lopsided tree.
  • Better be “deep” with one language so you can demonstrate your depth (esp. about big-O), but you could also choose different languages for different problems.
  • FB is very specific about big-O
  • some interviewers are very specific about details in the output. Empty input? What are the exact characters in the output? A client program will need the output to be exactly right.
  • FB is very specific about details in the input. Ask clarifying questions.
  • Never invent facts that don’t exist.
  • You should be constantly talking while solving the problem. Talk through more than you usually do.

y I use lots of if..{continue;}

Inside a loop, many people prefer if/elif/else. To them, it looks neat and structured.

However, I prefer the messier if…continue; if…continue; if..continue. Justification?

I don’t have to look past a pageful of if/elif/elif/…/else to see what else happens to my current item — I can ignore the rest of the loop body.

Besides the current item, I can also safely let go of (rather than keep tracking) all the loop-local variable values earlier. All of them will be wiped out and reset to new values.
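A tiny sketch of the if…continue style (the validation rules are hypothetical), where each guard dismisses the current item so the rest of the loop body can be ignored:

```python
def process(items):
    """Each guard dismisses the current item; the single 'real'
    action at the bottom stays un-nested."""
    out = []
    for x in items:
        if x is None:
            continue          # done with this item
        if x < 0:
            continue          # done with this item
        if x % 2:
            continue          # skip odd values; done with this item
        out.append(x * 10)    # the one real action, at zero extra nesting
    return out
```

The if/elif/else version of the same logic would bury the real action one level deeper and force the reader to scan every branch.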

## G5 algoIV constructs x-lang #dataStruct++

  1. elegant, legit simplifications
  2. Judicious use of global -vs- argument -vs- local variables
  3. iterator and  Pointer in node/VO class
  4. Construct tree, hashmap or Sorted array as auxiliary, or for memoisation
  5. Recursion
  6. sentinel node trick: array/slists

The tools below are more specialized and less widely relevant

  • Graph traversal, recursive or iterative..
  • Permutations etc, recursive or iterative ..
  • Std::lower_bound etc
  • sorting

The topics below are Not really for a coding test:

  • DP
  • concurrency

Codility: ignore overflow+bigO

The machine won’t check your bigO!

Focus on getting it to work. If you have time, address the overflow. If you get bogged down by the overflow concerns, you lose focus and lose speed.

–Here are my practical strategies for codility

  1. Don’t aim for high marks, since these tests are very hard-to-please and not very popular.
  2. Get friends to help crack it.
  3. Remember that ROTI is lower than other coding tests like Bloomberg style.  ECT and thinking speed is way too demanding.

##elegant/legit simplifications ] cod`IV

  • eg: reverse link list in K-groups — (CS algo challenge) assume there’s no stub, solve the real problem, then deal with the stub
  • eg: consumer thread dequeue() method. When empty, it “should” be waiting for a notification, according to Jun of Wells Fargo, but a simpler design returns a special value to indicate empty. The thread can then do other things or return to the thread pool or just die. I think the wait model is not ideal. It can waste a thread resource. We could end up with lots of waiting threads.
  • Eg: https://bintanvictor.wordpress.com/2017/09/10/lru-cache-concise-c-implementation/ requirement is daunting, so it’s important and completely legitimate to simplify lookup() so it doesn’t insert any data. API is simpler, not incomplete
  • Eg: find every combination adding up to a given target #Ashish permutation within each combination is a separate issue and not the main issue
  • recursive solution is often a quicker route to working code. Once we cross that milestone, we could optimize away the recursion.
  • eg: extract isPrime() as an unimplemented function and simply assume it’s easy to implement when I get around to it.
  • Eg: show free slots between meetings #bbg I solved a similar and more familiar problem.
  • eg: violation check for Sudoku is a tedious but simple utility function. We could assume it’s available
  • eg: violation check for n-queens is likewise slightly harder
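A sketch of the consumer-dequeue simplification above — return a special “empty” value instead of waiting (the class name is my own; a production version would add a lock for thread safety):

```python
from collections import deque

class SimpleQueue:
    """Instead of blocking when empty, dequeue() returns a special
    value (None); the caller decides what to do next -- do other
    work, return to the pool, or die."""
    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        if not self._items:
            return None        # special 'empty' value, no waiting
        return self._items.popleft()
```

This avoids tying up a thread in a wait, at the cost of the caller polling or rescheduling itself.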

 

## G20 operations(4cod`IV)on containers+str

“Operation” means logical operations any programmer (java, Perl, javascript, whatever) frequently performs on a common data structure. STL offers about 30 to 60 everyday operations. A beginner can benefit from a good short list.

The list below leans towards sequential containers (which show up in 80% of coding questions), but also includes essential operations on associative containers.

  1. ) print — using copy and ostream_iterator. See post. See stl-tut
  2. ) find, find_all
  3. slicing i.e. subsequence
  4. [iv] sort — using custom “myLess” as a type param. See post on functor-type. See effSTL
  5. [iv] construct a sorted container using a custom “myLess” as type param. See blog
  6. filter
  7. [iv] nested container
  8. truncate at a point
  9. append, in bulk
  10. pop front
  11. binary search in pre-sorted
  12. [iv] deep copy
  13. mass-populate with one default value
  14. remove/erase — see effSTL
  15. plug back_inserter into copy, remove_copy, replace_copy …
  16. [iv] plug bind2nd into replace_if/replace_copy_if, count_if, remove_if/remove_copy_if …. See blog
  17. initialize with hardcoded literals — relatively easy
  18. conversion like between string and vector of char. See blog. See [[ stl tut ]]
  19. —-2nd tier
  20. insert mid-stream, in bulk
  [iv] = possibly picked by interviewers
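A few of the operations above, sketched in Python for cross-language comparison — custom-key sort (4), binary search in a pre-sorted container (11), filter (6), and slicing (3):

```python
import bisect

words = ["pear", "fig", "banana", "kiwi"]
words.sort(key=len)                 # 4) sort with a custom "less" -- by length, stable

nums = [2, 5, 5, 9]                 # pre-sorted
pos = bisect.bisect_left(nums, 5)   # 11) binary search: leftmost position of 5

evens = [n for n in nums if n % 2 == 0]   # 6) filter
head = nums[:2]                     # 3) slicing i.e. subsequence
```

The STL equivalents would be std::sort with a functor, std::lower_bound, remove_copy_if, and iterator ranges.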

##9dataStruct(+..)for c++speed cod`

  1. linked node manipulation in a graph context
  2. vector (more useful than array), std::string (more useful than cStr). Many string operations are still unfamiliar
    1. Array-based data structures are required in 80% of my coding tests.
    2. More than 50% of all my coding tests require nothing but arrays.
    3. Most of my toughest coding tests are presented in arrays but may need maps as helpers
  3. string in std::string or c-str
  4. iterator manipulation like erase, lower_bound, find,
  5. sorting, operator<(), upper_bound, binary search … on containers
  6. sorted data structure like std::map
  7. [w] stringstream — ECT to improve

Very few Wall St interviewers would test you on graph or DP. Here are other less important constructs:

  1. [w] binary tree is common and simple, but they can ask very tough questions on it!
  2. [w] double pointer is trickier and my advantage
  3. [w] Node class in a linked data structure.
  4. [w] stack, queue.
  5. [w] grid or matrix
  6. file I/O? only for IDE tests, not pure algo or take-home tests
  7. [w] basic syntax for pointer arithmetic.
  8. [w] dtor, copier, op=? only for design questions, not algo questions.
  9. [w] shared_ptr? Only for design questions, never needed in algo questions!
  10. [w] ref variable only as function I/O.
  11. stl algo? Only Citadel array-shrink
  12. never exception
  13. never template
  14. no (very, very seldom) threading in coding Q
  15. adv: pointer to function
  16. adv: circular buffer
  17. [w = no weakness]

 

##failed cod`IV

Here I only list the failed ones. There are many successful ones in ##some@the significant coding IV experiences

 

| QQ | BP | ECT speed | syntax | algo | where |
|---|---|---|---|---|---|
| ? | 😦 | NA | NA | ? | home c GS tick-engine |
| NA | 😦 | NA | NA | ? | home c Tetris |
| ? | 😦 | NA | NA | ? | home c Macq |
| 😦 | 😦 | NA | NA | NA | home j MS-commodity |
| ? | ? | NA | NA | 😦 | home j Amazon |
| 😦 | ? | NA | NA | NA | home j MS itr |
| 😦 | ? | NA | NA | NA | home j MS FIX |
| ? | NA | good | good | 😦 | codility c/j Mako |
| NA | NA | 😦 | ? | ? | codility j FXoption |
| NA | NA | 😦 | ? | ? | codility c Jump #3 |
| NA | NA | 😦 slow | good |  | IDE c Bbg 2017 |
| NA | ? | 😦 slow | good |  | webex c Citadel array-shrink |
| none | ? | 😦 | good | simple | IDE j UBS Suntec |
| 😦 | NA | NA | NA | ? | paper c Nsdq array 0-N |
| NA | NA | NA | NA | 😦 | paper c Nsdq day-trading |
| 😦 | NA | NA | NA | 😦 | paper c FB regex |
| 😦 | NA | NA | NA | 😦 | paper c Goog |
| 😦 | NA | NA | NA | ? | paper j Barc SmartOrderRouter |
| 😦 | ? | ? | NA | NA | paper j UBS YACC |
| 😦 | NA | NA | NA | NA | paper c Bbg London |
| 😦 | ? | NA | NA | NA | paper c Imagine Software |

See C++(n java)IV tend to beat us]4fronts.

Q: where is my biggest coding IV weakness? crack the algo? complete in pseudo code? or get the damn thing to compile and work?

  1. take-home tests usually beat me in terms of Best Practice
    • if I must give up something, BP is the first thing I would give up 😦
  2. speed coding tests usually beat me in terms of ECT speed
  3. white-board tests usually beat me in algoQQ

c++cod`IV: no threading/boost/template/socket..

See C++(n java) high-end IV tend2beat us]4ways

This post is about the c++ topics. The BP/ECT/algo fronts primarily use basic constructs in ##10 basic programm`constructs for c++ cod`IV

| QQ | BP | ECT | algo#pure | topic |
|--------|----------|--------|-----------|----------------------------|
| heavy | no | never | never | thread |
| heavy | yes | seldom | never | smart ptr |
| seldom | never | never | never | other boost |
| yes | never | never | never | rvr+move |
| seldom | optional | never | other | c++11 |
| heavy | seldom | never | never | memory tricks |
| yes | never | never | never | templates meta.. |
| heavy | yes | never | never | c^c++ |
| heavy | never | never | never | sockets |
| never | never | never | once | object graph serialization |
| heavy | never | never | never | inline |
| heavy | never | never | never | cpu cache |
| yes | never | never | never | exception safety |
| yes | never | never | never | pimpl |

don’t feel so bad about recursive solutions #CSY

If I only have a recursive solution and interviewer points out it could overflow the stack, then here’s my answer

"Indeed I wish there’s an iterative solution. It would be more efficient. However, if there are multiple threads available, then iterative solution might be less adaptable to use multiple threads.

As to stack overflow, I understand the run time stack space is much smaller than the heap space, so if I am given a large data set and I hit a million nested recursive calls, then I will follow standard procedure to convert my recursive solution to an iterative solution, and maintain my own stack in heap. This will relieve the pressure on stack space"

This answer applies to all your recursive solutions.
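To make the conversion concrete, here is a sketch (python; the tree-sum problem is my own illustrative example). The recursive version risks overflowing the call stack on a deep tree; the iterative version maintains its own stack in heap memory, exactly as the answer above describes.

```python
def sum_tree_recursive(node):
    """node is (value, left, right) or None. Clean, but each nested call
    consumes a call-stack frame."""
    if node is None:
        return 0
    val, left, right = node
    return val + sum_tree_recursive(left) + sum_tree_recursive(right)

def sum_tree_iterative(root):
    """Same traversal, but with our own stack living in the heap,
    so a million-deep tree is no problem."""
    total, stack = 0, [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        val, left, right = node
        total += val
        stack.append(left)
        stack.append(right)
    return total
```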

10front-stage(+backstage)data structures@cod`IV

Based on my experience of algorithm interviewing at investment banks, high or low frequency hedge funds, 3rd-party financial firms like data providers, liquidity venues, a few deep-pocketed startups and a few WestCoast tech giants.. here’s the breakdown in terms of the data structure used in the original problem description.

  • 5% of them are presented in nothing but a single integer or two integers or all integers under 10000. Calculation required.
  • 5% of them are presented in huge stacks or queues
  • — arrays in 1D or 2D
  • 10% of them are presented in huge integer arrays, that require comparison, summation, subtraction or other simple calculations
  • 15%+ of them are presented in a huge sequence of value-objects, not pre-sorted
    • This is the default way to receive input data
  • 10% of them are presented in huge 2D arrays including matrices, mazes and primitive images
    • my #1 weakness
  • — strings
  • 15% of them are presented in nothing but a single char-array or two strings
  • 10% of them are presented in huge collection of strings like words
  • — linked node data structures:
  • 10% of them are presented in long linked lists
  • 10% of them are presented in huge unsorted trees
  • 5% of them are presented in huge binary search trees
  • 4% of them are presented in other huge directed graphs but not binary trees or linked lists
  • 1% of them are presented as huge collection of points on an x/y coordinate plane. Always some math required, therefore rare.
  • =====
  • 100% in total

Any of these problems could require a solution using DP/Greedy, swapping, and auxiliary data structures like binary trees, hash tables, linked lists, stacks, queues or priority queues..

  1. 70% need auxiliary arrays or hash table
  2. 50% can be solved with recursion excluding DFT
  3. 25% need a breadth/depth-first tree traversal
  4. 25% can use DP or greedy algo, usually in one iteration — tough
  5. 20% can use an auxiliary BST
  6. 10% can use an auxiliary tree but not a BST — tough and non-intuitive. Eg: many matrix problems
  7. 10% can use an auxiliary linked list
  8. 10% require recursion in a loop — tough
  9. 10% can use an auxiliary stack or queue excluding BFT
  10. 10% can use an auxiliary sorted vector
  11. 5% can use an auxiliary matrix — tough and non-intuitive. Eg: EditDistance
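For the EditDistance example in the last bullet, the auxiliary matrix looks like this (a standard Levenshtein-distance sketch in python, not tied to any particular interview question):

```python
def edit_distance(a, b):
    """Minimum insert/delete/replace operations to turn a into b,
    via the classic (len(a)+1) x (len(b)+1) auxiliary matrix."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i              # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j              # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]       # chars match, free
            else:
                dp[i][j] = 1 + min(dp[i - 1][j],      # delete
                                   dp[i][j - 1],      # insert
                                   dp[i - 1][j - 1])  # replace
    return dp[m][n]
```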

44tasks@array,str,dict ] algoIV

See also ##9dataStruct(+..)for TIMED c++cod`IV

See also EPI300

With STL, py, java, c#, and every other language, we need to develop fast fingers (ECT) over these basic tasks. Muscle memory is best.

  1. custom comparator for a payload Trade class; specify it in vector sorting or BST
    • custom hashcode is only popular in Java algoIV
  2. — custom tree/link node structs
  3. declare node class
  4. initialize a bunch of nodes
  5. — #1 vector of int (vector of other data type is rarely quizzed):
  6. populate with hard-coded data
  7. populate with default values to a fixed size
  8. copy the container — py/java require explicit ctor
  9. join 3 vectors
  10. slice — harder for c++
  11. max
  12. truncate
  13. reverse-copy and reverse-in-place
  14. append, prepend, insert
  15. binary search, lower_bound
  16. dump, iterate, size
  17. — #2 string:
  18. convert between string and vector<char>
  19. initialize with content
  20. prepend, append, trim
  21. insert_by_pos — not easy
  22. replace_substr
  23. read-write char by index
  24. reverse-copy
  25. successive find of target string from left (or from right)
  26. comp, sort among strings
  27. split at custom delimiter
  28. join 3 strings
  29. search
  30. substr
  31. dump, iterate each char, size
  32. — #3 lookup (usually dict or std::map)
  33. populate with hard-coded data
  34. strictly insert
  35. strictly update
  36. lookup, possibly with failure
  37. dump, iterate, size
  38. — tree map: lower_bound
  39. — Queue: enqueue, dequeue, front, back
  40. — Stack: push, pop, top
  41. — linked list? no special tasks
  42. — misc essential iterator tasks
  43. prev(), next(),
  44. distance()
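Task #1 above, in python terms, might look like the sketch below. The Trade fields and the composite sort key are made up purely to exercise the muscle memory:

```python
from dataclasses import dataclass

@dataclass
class Trade:          # hypothetical payload class
    symbol: str
    qty: int
    price: float

def sort_trades(trades):
    """Custom comparator: price descending, then symbol ascending.
    In python this is a key function rather than a c++ functor."""
    return sorted(trades, key=lambda t: (-t.price, t.symbol))
```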

 

9tips: hacker rank test #cout-macro

similar to codility…

  • I may need a macro for cout, so I can disable all cout quickly before uploading my source.
    • #define ss if(1>2)cout // change 1>2 to 1>0 to re-enable
  • I keep my own container data dumper function for instrumentation.
  • you can submit multiple times. So as soon as you can pass some tests, please submit.
  • 🙂 You can look at multiple questions at the same time, so feel free to skip tough or time-consuming questions.
  • You absolutely need your own local compiler and ECT environment.
    • I put a small dos window (for g++ and a.exe) over a notepad++ editor. I use F2 to save

The online IDE is too slow and provides only a single screen. You can copy-paste code from a local IDE, but you can’t copy-paste test data. You could try downloading test data, but you need to be really efficient there. Setting up tests can take 5 minutes out of an average 20 minutes per question.

  • There could be hidden test cases. Don’t worry too much about them. It’s a real achievement if you can pass all the visible test cases.
  • Ignore integer overflow. If there’s a hidden test for it, you will see the failure
  • Don’t ever worry about minor inefficiency. You won’t have the time. Just get it to work first.
  • pass by value (clone) by default; easier to debug
  • avoid long variable names. Use std abbreviations like li, vec, ma (for map)
  • I maintain many (unnecessary) global variables of generic names like i1, s2
  • 😦 lost connectivity? timer keeps ticking
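The cout-macro idea translates to other languages too. Here is a python sketch of a globally-silenceable instrumentation printer plus a home-made container dumper (function names are mine; it returns the printed line only to make itself easy to test):

```python
DEBUG = True  # flip to False before uploading, like changing 1>0 to 1>2 in the macro

def ss(*args):
    """Throwaway instrumentation printer, cheap to silence globally."""
    if not DEBUG:
        return None
    line = " ".join(str(a) for a in args)
    print(line)
    return line

def dump(container, label="dump"):
    """Home-made container dumper for quick inspection."""
    return ss(label, "size =", len(container), ":", container)
```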

 

cod`IV – favored by the smartest companies

XR,

I see a few categories of IV questions:

1a) fundamentals — Some tough Wall St interviews (also in Asia) like deep, low-level (not obscure) topics such as threading, memory, vtbl, RB-tree ..

1b) lang — Some (IMO mediocre) interviewers (like programming language lawyers) focus on language details unrelated to 1a)
2) Some (IMO mediocre) interviewers like architecture or high level design questions (excluding algo designs) like OO rules and patterns but I feel these are not so tough.

3) algo — Some tough interviews focus on algo problem solving in pseudo-code. See http://bigblog.tanbin.com/2007/09/google-interviews-apply-comp-science.html. I got these at Google and Facebook. Some quants get these too.

4) coding questions (code that must compile and run) — another tough interview type, be it take-home, onsite or webex.
With some exceptions (like easy coding questions), each of these skills is somewhat “ivory tower” i.e. removed from everyday reality, often unneeded in commercial projects. However these skills (including coding) are heavily tested by the smartest employers, the most respected companies including Google, Facebook, Microsoft… You and I may feel like the little boy in “emperor’s new dress”, but these smart guys can’t all be wrong.
I will again predict that coding questions would grow more popular as the logistic cost is driven down progressively.
Candidate screening is tough, time consuming and, to some extent, hit-and-miss. With all of these screening strategies, the new hire can still fail on TECHNICAL grounds. Perhaps she/he lacks some practical skills — code reading; debugging using logs; automation scripts; tool knowledge (like version control, text search/replace tools, build tools, and many linux commands).
Needless to say, new hires more often fail on non-technical grounds like communication and attitude — another topic altogether.

In terms of real difficulty, the toughest comp-science problems revolve around algorithms and data structures, often without optimal solutions. Algo interview questions are the mainstay at big tech firms, but not on Wall St. Some say “practice 6 months”. Second toughest would be coding questions —
* Often there’s too little time given
* Sometimes interviewer didn’t like our style but onsite coding tends to focus on substance rather than style.

Tan bin

P.S. I had webex style coding interviews with 1) ICE Feb 2011, 2) Barclays (swing), 3) Thomson Reuters, 4) Bloomberg

P.S. I tend to have more success with algo interviews and onsite coding than take-home coding. See blog posts.. (
http://tigertanbin2.blogspot.com/2015/05/sticky-weakness-revealed-by-interviews-c.html
http://tigertanbinpripri.blogspot.com/2015/03/high-end-developer-interviews-tend-to.html
)

quicksort learning notes #no src

Quick sort is not the most efficient in many situations. It’s not the implementation of sort() in many popular languages. Yet it’s the most valuable sort to learn, primarily because of interviews. I think quicksort is a good test of coding abilities. The challenges:

#1 high-level goal of first pass — hard to remember!
#2 the gist of the first pass
#3 implementation in any language — many pitfalls

http://baike.baidu.com/link?url=rl5hQIbAqdmt53Pmxp7FXhrksHQxjOIb8Knd7dI4xQ0lRSjLNdSbcVj_Dcav-V17dKBe1ggrOWXsDinNPINd22RnFZ2cQ5MDkWdGM8XZwc1TTqtMNfc8kM-0LuT6iCrDR-RigbcKPpYQwK_gDWWOQ_ has real code.

High-level keywords:

  • pivot Object — the rightmost object in a section can be designated the pivot object. Some people use the 1st object or a random object within the section. To keep things simple, we will assume the values are unique
  • final resting place — goal is to put the pivot object into its final resting place within the section, using scan-and-swap
  • swaps — are the mechanism of movements
  • scan — you must scan in some direction
  • partition — after the first pass, the pivot object is in its final resting place and all smaller objects are on its left.

First we need to understand the high-level goal. Then the next challenge is the partition algorithm. The only way to remember the implementation details is writing the code.

On P 146 [[introduction to algorithms]], the procedure partition(A,p,r) does the following on the Section A[p,r] inclusive.

  • it progressively shifts the rightmost (pivot) object from r to the grave/anchor position within the section
  • it keeps the rightmost object value as the benchmark value throughout the procedure.
  • it returns the new index of this object. The index is defined in the entire array A[].
  • it shuffles 0 or more elements within the section
  • it doesn’t try to sort any subsection

Upon receiving an unsorted section, the procedure simply puts the rightmost thingy into the grave position within the section.

Corollary: first scene in the quicksort movie actually completes the job of putting the rightmost object into its final resting place as an anchor within the entire array. After that, we focus on sorting the “left-section” and the “right-section” (in separate threads) without worrying about the first anchor object. Within the left-section, first scene completes the job of putting the rightmost object into its grave, a final resting place within the entire array.

Coding note — the recursion is not using a single function name like A calling A itself. Instead, qsort() calls partition() then qsort(). Most of the work is in partition().

Coding note — partition() function isn’t recursive in itself.
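A sketch of the CLRS-style procedure described above (python; per the earlier simplification, values are assumed unique, and A[r] is the designated pivot):

```python
def partition(A, p, r):
    """CLRS-style partition on section A[p..r] inclusive. A[r] is the
    pivot; afterwards the pivot sits in its final resting place and
    its new index (within the entire array) is returned."""
    pivot = A[r]
    i = p - 1                         # right edge of the "smaller" zone
    for j in range(p, r):             # scan left to right
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]   # swaps are the mechanism of movement
    A[i + 1], A[r] = A[r], A[i + 1]   # drop pivot into its grave position
    return i + 1

def qsort(A, p=0, r=None):
    """Note the recursion shape: qsort() calls partition() then qsort();
    partition() itself is not recursive and does most of the work."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = partition(A, p, r)        # pivot now anchored at index q
        qsort(A, p, q - 1)            # sort the left-section
        qsort(A, q + 1, r)            # sort the right-section
```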

Wall St cod`IV don’t care about efficiency

Most challenging algo interviews (onsite) are really about “solve it by hook or by crook” regardless of efficiency. (Usually there is no obvious brute-force solution; when there is one, it earns you no credit anyway.)

In the toughest algo questions, usually there's no memory capacity issue to worry about.
Occasionally, interviewer stipulates O(N logN) or O(1)
Once we solve it we can improve efficiency. It's a second phase. Can be easier than first phase.
In that case threading, sorting … are unimportant. Instead we need judicious use of trees, recursions, pattern recognition… and very clear thinking.
See [[EPI300]] and the green book by Zhou Xinfeng

prefer for(;;)+break: cod`IV

For me at least, the sequencing of the 3-piece for-loop is sometimes trickier than I thought. It’s supposedly simple rule(s), but I don’t get it exactly right sometimes. Can you always intuitively answer these simple questions? (Answers scattered.)

A87: ALWAYS absolutely nothing
A29: the entire loop body, possibly many statements.

Q1: how many times (minimum, maximum) does the #1 piece execute?
Q2: how many times (minimum, maximum) does the #2 piece execute?
Q3: how many times (minimum, maximum) does the #3 piece execute?
Q: Does the number in A2 always exceed A3, or the reverse, or is there no always-rule?
Q29: what might happen between #2 and #3 statements?
Q30: what might happen between #3 and #2? I feel nothing could happen.
Q87: what might happen between #1 and #2 statements?
Q: what’s the very last statement (one of 3 pieces or a something in loop body) executed before loop exit? Is it an “always” rule?

If there’s a q(continue), then things get less intuitive. http://stackoverflow.com/questions/16598222/why-is-continue-statement-ignoring-the-loop-counter-increment-in-while-loop explains the subtle difference between while-loop vs for-loop when you use “continue”.

In contrast, while-loop is explicit. So is do-while. In projects, for-loop is concise and often more expressive. In coding interviews, conditions are seldom perfect, simple and straightforward, so for-loop is error prone. White-board coding IV (perhaps bbg too) is all about nitty-gritty details. The condition hidden in the for-loop is not explicit enough! I would rather use for(;;) and check the condition inside and break.

The least error-prone is for(;;) with breaks. I guess some coding interviewers may not like it, but the more tricky the algorithm is, the more we appreciate the simplicity of this coding style.

Always safe to start your coding interview with a for(;;) loop and carefully add to the header. You can still have increments and break/continue inside.
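In python terms, the equivalent of the for(;;)+break style is a bare while True with every exit condition spelled out in the body. A sketch on a made-up first-duplicate problem:

```python
def first_duplicate(nums):
    """while-True + break/return style: every exit condition is explicit
    in the body; nothing hides in a loop header."""
    seen = set()
    i = 0
    while True:                  # analogue of for(;;)
        if i >= len(nums):       # explicit exit #1: ran off the end
            return None
        x = nums[i]
        if x in seen:            # explicit exit #2: found the answer
            return x
        seen.add(x)
        i += 1                   # explicit increment, easy to see
```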

%% priorities in a take-home cod`IV

A lot of times the requirements are way too numerous or stringent, given the time limit. Must give up some. Here are my priorities:

  1.  basic functionality. The essential problem should be solved, or shown to be solvable.
    • pass the essential test cases only
  2. On that foundation, add as many big features (in the problem statement) as I can within the time limit. Leave out the small features.
  3. simplify. Simplicity is the first element of clean, readable code.

I guess some hiring managers are very particular about code quality and readability. I feel it’s like a beauty contest. I would give up in those cases. I was told JPM follows [[Clean Code]] when judging/scoring/ranking submitted code.

Need to ensure file timestamps are all within a short window. Prefer to use fewer files. Java solutions require more files :(. If I claim 2H, then better leave some rough edges in the code like input validations.

tips for bbg cod`IV

  • chrome (not Opera) supports Alt-R
  • Show 67 lines of source code
    • chrome F11 to toggle full screen
    • chrome can zoom out 75% to show 67 lines in editor, if you can’t find a bigger monitor.
    • use hackerrank top right icon -> settings -> SMALL font
    • put multiple lines in one line as much as possible
    • Can we hide the address bar?
  • hackerrank top right icon -> settings -> light (not dark) scheme shows line highlight better, but dark shows text better.
  • add comments to show your thinking

Here’s some code we can copy-paste:

#include <iostream>
#include <string>
#define ss if(1>0)cout
using namespace std;   int i1, i2, i3, N;   string s, s1, s2, s3;
int main(){
    ss<<"s = "<<s<<endl;
}

 

[07] google-style interviews: apply comp-science Constructs on realistic problems

Hi LS,

Now I think yesterday’s toughest interview questions are similar in spirit to Google interview questions. I think the key skill tested yesterday was “how does this candidate apply computer science constructs to real world problems”. That’s what I meant when I said “connecting the algorithms/data-structures with the given requirements“. Before you can identify any algo/data-structure to model the requirements, you have no clue how to support the requirements. Whenever we see a first-in-first-out system like Green Card applicants, we automatically think of a Queue data structure — ie connecting the Queue construct to a real world problem.

A 6-year veteran developer like me is supposed to be familiar with
1) associative arrays, hashmaps, sets, binary search trees, doubly linked lists, multi-dimensional arrays and (hopefully) circular linked lists, binary heaps, red-black trees, and any combination of the above like a stack of hashmaps,
2) the concepts of space efficiency, time efficiency, random access, sequential access, what can/can’t be sorted …
3) OO concepts of dependency, coupling, polymorphism, static, multiple inheritance, HAS-A vs IS-A, setters (extremely important) …
4) OO design patterns

But this veteran may not know how to APPLY these constructs to a real-world problem, according to my observation. All of Google’s and my recent problem-solving interview questions attempt to test a candidate’s analysis skill to “apply tools to problems”. In my case, I couldn’t have applied the Strategy design pattern unless I spotted the similarity between the given requirement and the Strategy pattern’s requirements.

The 4 groups of tools mentioned above are simple tools when studied individually, but any design that needs tools from 3 of these groups is likely to be non-trivial. Consider algorithms to delete a node from a binary search tree.

Finally I have some idea why Google likes those questions, including the question about “1-7 random number generator using a 1-5 generator”.
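That rand-7-from-rand-5 question is usually solved with rejection sampling. A sketch, assuming the given generator is uniform on 1..5:

```python
import random

def rand5():
    """Stand-in for the given uniform 1-5 generator."""
    return random.randint(1, 5)

def rand7():
    """Two rand5 calls give a uniform number in 1..25; keep only 1..21
    (a multiple of 7) and fold it into 1..7. Rejected draws just retry."""
    while True:
        v = (rand5() - 1) * 5 + rand5()   # uniform on 1..25
        if v <= 21:
            return (v - 1) % 7 + 1
```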

recursion – whiteboard cod`IV tips

  • spend some time writing down the external interface of the recursive func — exactly what to print/return, and what global variables to update.
  • define end-of-recursion condition explicitly. Interviewers want to see that.
  • list all important local variables vs global variables on the side.
  • Some variables must be local — on the stack. All other variables had better be global variables.
    • sometimes local variables are simpler — throw-away temp
    • sometimes global variables are simpler — fewer stateful objects.
    • This simplifies the visualization. Any variable passed into a recursive call is additional state on every stack frame and makes the logic harder to reason about. This is more serious when there are many such local states to keep in mind.
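To make these tips concrete, a sketch (python, with permutations as a stand-in problem): results plays the “global variable” role shared across frames, while path is the per-frame local state, undone after each branch. The external interface and end-of-recursion condition are stated up front, as the tips suggest.

```python
def permutations(items):
    """External interface: returns a list of all orderings of items."""
    results = []                       # "global" state, shared across frames

    def recurse(remaining, path):
        if not remaining:              # explicit end-of-recursion condition
            results.append(path[:])    # copy, since path keeps mutating
            return
        for i in range(len(remaining)):
            path.append(remaining[i])  # local, per-frame choice
            recurse(remaining[:i] + remaining[i + 1:], path)
            path.pop()                 # undo before trying the next branch

    recurse(list(items), [])
    return results
```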