speed^soft qualities: CIV scoring

If you are very strong, you can achieve both:

  • complete the required number of problems in time, with no serious bugs
  • demonstrate requirement clarification, edge-case handling, dry-runs, thorough error handling, detailed complexity analysis, and good coding style.

In reality, some speed coding tests at FB, Indeed etc really favor speed over these other qualities.

When we take such a test, there is always an internal tension between two motivations: speedy completion vs demonstrating the other qualities.

Thinking out loud is a common recommendation but it can create unnecessary conversation and slow you down. However, some of these conversations may lead you to the right path.

bigO CIV: clarify_requirement means..worst_data

———- Forwarded message ———
Date: Tue, 9 Jul 2019 at 23:10

Big-O means “worst case” complexity. We need to be doubly sure what the “worst case” data looks like. (Note it doesn’t mean ridiculously rare data like integer overflow.)

On high-end coding interviews, these “worst data sets” frequently shed light on the structure of the original problem, and often point to the correct direction for an efficient solution. (Note a nice-looking solution is not efficient if it exhibits poor time complexity in the face of a worst yet realistic data set. Such a system may not scale well in practice, given this Achilles’ heel.)

If the interviewer wants a solution optimized for big-O time complexity, you can waste precious time tackling the wrong problem: the problem defined by your own idea of the “worst data set”, when the real worst data set is very different.

Sometimes, the worst data is _subtly_ different but it points to a different algorithm direction. It may point to an iterative solution rather than recursion, for example.
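
To illustrate that last point with a sketch of my own (not from the original email): suppose the problem is summing the payloads of a singly linked list. The interviewer's sample has five nodes; the realistic worst data set has millions, which immediately rules out naive recursion and points to a plain loop.

```cpp
// Hypothetical illustration: the worst realistic data set (a multi-million-node list)
// kills the naive recursive solution and points to an iterative one.
struct ListNode {
    int val;
    ListNode* next;
};

// recursive: short and pretty, but stack depth == list length on the worst data set
long sumRecursive(const ListNode* head) {
    return head ? head->val + sumRecursive(head->next) : 0;
}

// iterative: same O(N) time, O(1) extra space, safe on the worst data set
long sumIterative(const ListNode* head) {
    long total = 0;
    for (const ListNode* cur = head; cur; cur = cur->next)
        total += cur->val;
    return total;
}
```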

The original interview question may not be that hard if (and only if) we identify the worst data set early on; sadly, we often run out of time before we do.

For some tough problems, the first (sometimes the only) challenge is quickly identifying the worst data set. The interviewer always gives you a simple data set, NEVER the worst data set. It’s your job to identify the worst data set, which is often the key insight into the problem.

It’s no exaggeration to say that identifying the worst data set early, or too late, can make or break your chances at this employer. You may kiss goodbye to the job opportunity exactly because you were too slow to identify the worst data set. I know what this means. Other candidates were probably shot by this arrow in the heel, without knowing what happened.

Time is ticking as soon as the problem is presented to you. If the interviewer says “time is not a factor… take your time… we want quality and thought process”, I just ignore it. Our speed is always part of the assessment. Therefore, a very common failure scenario is:

a tractable problem, but the candidate runs out of time after wasting the early minutes on preliminaries, like using an incorrect worst data set to reason about the (wrong) problem.

A/Bob/C: cod`IV #vec for python list

Here are other variable names frequently needed in coding puzzles (several of them appear together in the sketch after this list):

  • Once you have declared a long, meaningful variable name, you can use an abbreviation of it afterwards.
  • if you have an array of composite objects like “frame” or “trade”, each represented by an inner array, then be careful with frame[1]. Better to call it aFrame[1] (a field of one frame) or frames[0] (the first frame).
  • what about “sorted”, which is a python built-in function name :( …. I would use ascd by default.
  • B4 means “before”; Af means “after”
  • hm means hashtable; tm means treeMap or std::map
  • targetRepetition or tgtRep
  • “val” — payload data field in a Node class. This name is short and easy to type. Harder to global-replace, but LG in a short program
  • “candd” — candidate
  • “End” for the last element in a container
  • li means list … vec means vector … arr means array
    • In python, we need extra caution to use “vec” not “li” for the python list !
  • cnt means count or counter
  • — for a matrix
  • M, m or mat for the matrix
  • r as the 1st index; c as the 2nd index. Avoid x/y
  • height/rows as number of rows; width/cols as number of columns
  • I will refer to each element as a “cell”
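
Here is a tiny sketch of my own (the function and its purpose are made up purely for illustration) pulling several of these conventions together: mat for the matrix, rows/cols for its dimensions, r/c for the two indices (not x/y), “cell” for one element, cnt for a counter, hm for a hashtable.

```cpp
// Illustrative only: count how many distinct values appear more than once in a matrix.
#include <unordered_map>
#include <vector>

int countRepeatedValues(const std::vector<std::vector<int>>& mat) {
    int rows = static_cast<int>(mat.size());
    int cols = rows ? static_cast<int>(mat[0].size()) : 0;
    std::unordered_map<int, int> hm;          // hm: cell value -> occurrence count
    for (int r = 0; r < rows; ++r)            // r = 1st index (row)
        for (int c = 0; c < cols; ++c)        // c = 2nd index (column)
            ++hm[mat[r][c]];                  // each element is a "cell"
    int cnt = 0;                              // cnt = counter of repeated values
    for (const auto& kv : hm)
        if (kv.second > 1) ++cnt;
    return cnt;
}
```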

—It’s more useful to have personal names than mere “Person A”. Shorter names are easier to write. Gender doesn’t matter.

  1. Alan, Alice, Aaron,
  2. Bob, Ben (Big Ben), Bill,
  3. Cody,Chris, Charlie
  4. Dan, David
  5. Ed, Emily, Elizabeth (queen)
  6. Fred
  7. Greg, Gail
  8. Henry, Harry,
  9. Ilya, Ivy, Ian
  10. Jen, Jim, Joe, Jack, John, Jason,
  11. Ken, Kate, Kevin,
  12. Lucy, Lee, Lilly, Larry,
  13. Mike, Mary, Matt,
  14. Nick, Nancy,
  15. Oliver, Oscar, …
  16. Pam, Peter, Paul, Peggy
  17. Quincy
  18. Raj, Rob, Ray,
  19. Sam, Sue,
  20. Tom, Ted, Tim,
  21. Uday ….
  22. Vic, Venkat
  23. Wendy, Will
  24. Xavier, Xander/Zander …
  25. Yixin, Yang ….
  26. Zofia, Zack,

CIV: we wish all corner cases+worst data sets r given

In a few problems, the really important “Aha” is getting the full list of corner cases. This list defines the boundary of the problem, and would point out the direction, often immediately.

Of course the coding test won’t give you the time to work them all out. Part of the skill being tested is “clarifying requirements”; this is the real meaning of that phrase.

Sugg: probe the interviewer

Sugg: Gayle points out some techniques for creating test data that reveal the boundary

short+efficient : holy grail ] algo search

For most of my tough algo problems (such as leetcode problems):

  1. many celebrated solutions are very short, but not so efficient
    • I think about 60% of them are easier to memorize, thanks to their short length.
    • if you can understand and memorize them, they are quicker to write due to fewer keystrokes
    • About half of them are easier to understand, thanks to the small source code size
    • eg: yield-based generators are a prime example
    • eg: recursive solutions are often brief but inefficient
  2. many great solutions are efficient in terms of O(), but verbose
    • eg: bottom-up DP
    • eg: top-down DP with memoization
    • eg: using a tree (when not required) can sometimes give an efficient solution

However, these two goals are often hard to harmonize (see the Fibonacci sketch at the end of this section for a concrete contrast). It is my holy grail to meet both criteria simultaneously, but I won’t try too hard.

  1. For real coding problems, I will prefer brevity
  2. For verbal discussions, I will concentrate on efficient solutions
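
To make the contrast concrete, here is a small sketch of my own (not from the original post) using Fibonacci: the recursive version is the shortest and easiest to memorize; the bottom-up DP version costs more keystrokes but is efficient.

```cpp
// Brevity vs efficiency, illustrated on Fibonacci (my own example).
#include <cstdint>
#include <vector>

// short and memorable, but O(2^n) time: the "celebrated but inefficient" style
uint64_t fibRecursive(int n) {
    return n < 2 ? n : fibRecursive(n - 1) + fibRecursive(n - 2);
}

// bottom-up DP: more keystrokes, but O(n) time
uint64_t fibBottomUp(int n) {
    if (n < 2) return n;
    std::vector<uint64_t> dp(n + 1);
    dp[0] = 0;
    dp[1] = 1;
    for (int i = 2; i <= n; ++i)
        dp[i] = dp[i - 1] + dp[i - 2];
    return dp[n];
}
```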

## Are those leetcoders really stronger@@

You wrote “I think many leetcoders are fresh graduates, or young people, who have more time and energy. I know I cannot compete with them …”

Q: Are those leetcoders really stronger in coding tests? Generally yes, but it depends on the type of coding question.

  • For a weekend take-home assignment …. I am not sure they would produce better-quality code than us. (I’m not very good either). Code smells matter to my past interviewers.
    • They may not run as many tests as we do.
    • I tend to spend 200%+ more time on the assignment
  • For white-board ….. or remote dumb editor pair programming (west coast favorite), they may not be able to explain their thought process as well as I do.
    • My 2007 Google white-board interview used pseudo code, so the leetcoders’ advantage would be much lower.
  • For completely original questions created by a hiring manager and then sent via the HackerRank platform, leetcoders may not be fast enough to solve them, since leetcode problems can’t cover every newly invented problem.
    • I hit two problems related to shortest-path algorithm, which can be similar to leetcode problems. I also hit very unusual problems.
  • For multi-threading coding questions, leetcode doesn’t help.
    • hit me at least 8 times — UBS, Barcap, Gelber, Bbg ..

bitwise coding questions: uncommon

Don’t over-invest.

Q: How prevalent are these questions in coding interviews?

  • I feel these questions are usually contrived, and therefore low-quality and unpopular among hiring firms.
  • There are no classic comp-science constructs at the bitwise level, and bitwise operations don’t play well with those constructs
  • the bitwise hacks must be language-neutral wrt python and javascript, so quality questions are scarce.
  • phone round? impossible

white-board real estate usage

  1. We often need to insert lines later, so leave room:
    1. Leave 2 lines of space before  and after loop header.
    2. leave 2 lines of space after function header
    3. leave space before and after if-header
  2. Example data — should be drawn at bottom right, but make sure you can see them as you code
  3. global var list — I tend to keep them at bottom left on one line
  4. I like many utility functions
    1. short utility function — you can start it near the bottom, or above existing code, with limited “runway”
    2. lengthy yet easy utility — If you worry a utility function may be lengthy, you can start writing on paper and stick it near the board
  5. top right — is for a 2nd major function, perhaps extracted from the first major function
  6. top level driver function — is never complicated but can grow.  I can write it below an existing function
  7. center of the board — leave empty for eyeball test, other examples, graph etc

Gayle: no rush2write code: point allocation

Gayle strongly advocates getting a brute-force solution as soon as possible and walking through the “idea” in detail before coding. However, I’d rather start coding up my idea much earlier, since I need more time for the coding itself.

I guess Gayle’s experience shows that some interviewers are less worried about runnable working code than about the thought process. For a tricky algo, if the idea is sound, then you basically pass….? See the post “with algo cracked, ECT = 雕虫小技 (a trivial skill)”.

However, I feel an elaborated/detailed “idea” (item 2 below) alone is not enough to pass. I frequently run out of time implementing things like https://www.geeksforgeeks.org/print-a-given-matrix-in-spiral-form/. The algorithm is simple and not worth 40% but I was too slow completing a basic implementation. “Done is better than perfect”. Basic implementation is better than no implementation. You need to demonstrate you can implement a working solution.
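
For reference, a basic implementation of that spiral print looks roughly like this (a sketch in the spirit of the linked geeksforgeeks article, assuming a rectangular, non-jagged matrix):

```cpp
// Basic spiral print of a rectangular matrix (sketch).
#include <iostream>
#include <vector>

void printSpiral(const std::vector<std::vector<int>>& mat) {
    if (mat.empty() || mat[0].empty()) return;
    int top = 0, bottom = static_cast<int>(mat.size()) - 1;
    int left = 0, right = static_cast<int>(mat[0].size()) - 1;
    while (top <= bottom && left <= right) {
        for (int c = left; c <= right; ++c) std::cout << mat[top][c] << ' ';   // top row
        ++top;
        for (int r = top; r <= bottom; ++r) std::cout << mat[r][right] << ' '; // right column
        --right;
        if (top <= bottom) {                                                   // bottom row
            for (int c = right; c >= left; --c) std::cout << mat[bottom][c] << ' ';
            --bottom;
        }
        if (left <= right) {                                                   // left column
            for (int r = bottom; r >= top; --r) std::cout << mat[r][left] << ' ';
            ++left;
        }
    }
    std::cout << '\n';
}
```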

Here’s my estimate of the point distribution in a typical whiteboard interview:

  1. 10% clear understanding of requirements + clarifying questions
  2. 35(50)% detailed idea (even unimplemented) with decent performance enhancement
  3. 35(20-40)% basically working implementation.
  4. 5% performance analysis — can be tricky
  5. 5% readability — modularized, clean
  6. 5% eyeball test — can be hard to get right if code is convoluted with many variables
  7. 5% minor bugs, corner cases

Gayle: use big,ugly sample2explore; tiny sample4test

–In the initial exploration (when the direction is only 50% clear), the big ugly sample would clarify many assumptions.

Any mis-assumption corrected is a potential life-saver since the mis-assumption could lead us down the wrong track.

Such a sample will help us see the pattern in data and get to the gist of the problem.

–However, during eyeball testing on the white board, a large sample would take too much time. A tiny sample can flush out the bugs more quickly.

## BRING reminder sheet : cod`IV

  1. edge case — my loop may get skipped completely
  2. edge case — empty output
  3. drive to improve after it works
  • bring rectangular post-it + breakable tape


bbg: Problem-Solving skill

I think this skill is like IQ…

I feel this is broadly similar to west coast interviews, perhaps with subtle differences in how Mistakes are assessed.

One Bbg interviewer (a young Chinese guy) shared with me what he meant by “problem solving skill”, given a well-defined programming problem:

  • not superb performance
  • not syntax perfection
  • solution should be “reliable”. I guess he means ‘free of major bugs’
  • problem is often realistic — I have noticed this feature repeatedly
  • I said many candidates go on the wrong track and (usually) get stuck. He agreed.
  • I said many solutions are broken because candidate misses some key factor. He agreed.
    • perhaps a factor about requirement
  • I now think some candidates have no experience and no basic understanding of a given programming topic (say, DP, a regex parser, or condVar), so they will never get off the wrong track.

Rahul’s reasoning: on the job, most of the time a new hire receives inputs from other people and needs to think through a problem with many obstacles and unknowns. Interviewers can’t really pose a realistic work-site problem, so they use algo questions as a proxy.

I think that besides the technical capabilities, there are many personality and motivation factors that determine a candidate’s progress on a given problem:

  • Is the candidate confident (rather than nervous) to take the hint and work on it?
  • Is the candidate too ready to give up?
  • Is the candidate unable to use the hint, yet unwilling to seek clarification?

Rahul as an interviewer tries to make the candidate relax and perform at her best.

FB cod`IV tips: #official

bucket sort; merge sort; graph

watch more videos

go through the Prep emails.

  • Practice. Test your code on the board, not in your head.
  • You can confirm with the interviewer if you want to start coding earlier. I tend to run out of time solving the problem. Kelly said to just practice more to speed up.
  • You can request to skip the small talk and start the technical part earlier.
  • Messy code is ok for some interviewers, as long as they can follow your thoughts. Poor recall of language details probably OK, but not fundamental mistakes.
  • Always .. Interviewer will under-specify the problem, waiting for you to ask clarifying questions. The more corner cases you identify the better.
  • Always .. for all coding interviews, the problem is solvable within 30-40 minutes.
  • Always .. Done is more important than perfect. You can improve an imperfect solution in terms of big-O.
  • Always .. walk through your solution with some unusual cases. Some interviewers don’t care about trivial corner cases, but some unusual inputs are valid and important, like an extremely lopsided tree.
  • Better be “deep” with one language so you can demonstrate your depth (esp. about big-O), but you could also choose different languages for different problems.
  • FB is very specific about big-O
  • some interviewers are very specific about details in the output. Empty input? What are the exact characters in the output? A client program will need the output to be exactly right.
  • FB is very specific about details in the input. Ask clarifying questions.
  • Never invent facts that don’t exist.
  • You should be constantly talking while solving the problem. Talk through more than you usually do.

## G5 algoIV constructs x-lang #dataStruct++

  1. elegant, legit simplifications
  2. Judicious use of global -vs- argument -vs- local variables
  3. iterator and  Pointer in node/VO class
  4. Construct tree, hashmap or Sorted array as auxiliary, or for memoisation
  5. Recursion
  6. sentinel node trick for arrays/slists (see the sketch below this list)
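
A minimal sketch (my own illustration) of the sentinel trick for the slist case: a dummy head node lets the deletion loop treat the real head like any other node.

```cpp
// Sentinel (dummy head) trick: remove all nodes with a given value
// without special-casing the head. Ownership/delete ignored in this sketch.
struct ListNode {
    int val;
    ListNode* next;
};

ListNode* removeValue(ListNode* head, int target) {
    ListNode dummy{0, head};               // sentinel sits in front of the real head
    for (ListNode* prev = &dummy; prev->next; ) {
        if (prev->next->val == target)
            prev->next = prev->next->next; // unlink
        else
            prev = prev->next;
    }
    return dummy.next;                     // new head, even if the old head was removed
}
```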

The tools below are more specialized and less widely relevant

  • Graph traversal, recursive or iterative..
  • Permutations etc, recursive or iterative ..
  • std::lower_bound etc
  • sorting

The topics below are Not really for a coding test:

  • DP
  • concurrency

Codility: ignore overflow+bigO

The machine won’t check your bigO!

Focus on getting it to work. If you have time, address the overflow. If you get bogged down by the overflow concerns, you lose focus and lose speed.

–Here is my practical strategy for codility:

  1. Don’t aim for high marks, since these tests are very hard-to-please and not very popular.
  2. Get friends to help crack it.
  3. Remember that the ROTI is lower than for other coding tests like the Bloomberg style. The demands on ECT and thinking speed are way too high.

##failed cod`IV

Here I only list the failed ones. There are many successful ones in ##some@the significant coding IV experiences

 

| QQ | BP | ECT speed | syntax | algo | medium | lang | where |
|---|---|---|---|---|---|---|---|
| ? | 😦 | NA | NA | ? | home | c | GS tick-engine |
| NA | 😦 | NA | NA | ? | home | c | Tetris |
| ? | 😦 | NA | NA | ? | home | c | Macq |
| 😦 | 😦 | NA | NA | NA | home | j | MS-commodity |
| ? | ? | NA | NA | 😦 | home | j | Amazon |
| 😦 | ? | NA | NA | NA | home | j | MS itr |
| 😦 | ? | NA | NA | NA | home | j | MS FIX |
| ? | NA | good | good | 😦 | codility | c/j | Mako |
| NA | NA | 😦 | ? | ? | codility | j | FXoption |
| NA | NA | 😦 | ? | ? | codility | c | Jump #3 |
| NA | NA | 😦 | slow | good | IDE | c | Bbg 2017 |
| NA | ? | 😦 | slow | good | webex | c | Citadel array-shrink |
| none | ? | 😦 | good | simple | IDE | j | UBS Suntec |
| 😦 | NA | NA | NA | ? | paper | c | Nsdq array 0-N |
| NA | NA | NA | NA | 😦 | paper | c | Nsdq day-trading |
| 😦 | NA | NA | NA | 😦 | paper | c | FB regex |
| 😦 | NA | NA | NA | 😦 | paper | c | Goog |
| 😦 | NA | NA | NA | ? | paper | j | Barc SmartOrderRouter |
| 😦 | ? | ? | NA | NA | paper | j | UBS YACC |
| 😦 | NA | NA | NA | NA | paper | c | Bbg London |
| 😦 | ? | NA | NA | NA | paper | c | Imagine Software |

See the post “C++ (and java) IV tend to beat us on 4 fronts”.

Q: where is my biggest coding IV weakness? crack the algo? complete in pseudo code? or get the damn thing to compile and work?

  1. take-home tests usually beat me in terms of Best Practice
    • possibly give up if I must give up something 😦
  2. speed coding tests usually beat me in terms of ECT speed
  3. white-board tests usually beat me in algoQQ

don’t feel so bad about recursive solutions #CSY

If I only have a recursive solution and the interviewer points out it could overflow the stack, then here’s my answer:

"Indeed I wish there’s an iterative solution. It would be more efficient. However, if there are multiple threads available, then iterative solution might be less adaptable to use multiple threads.

As to stack overflow, I understand the runtime stack space is much smaller than the heap space, so if I am given a large data set and I hit a million nested recursive calls, then I will follow the standard procedure of converting my recursive solution to an iterative one, maintaining my own stack on the heap. This will relieve the pressure on stack space"

This answer applies to all your recursive solutions.
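
Here is what that standard conversion looks like, as a sketch of my own using in-order traversal of a binary tree: the recursion is replaced by an explicit std::stack, whose storage lives on the heap.

```cpp
// Recursive in-order traversal converted to an iterative one with an explicit stack,
// so a million-level-deep (degenerate) tree no longer threatens the call stack.
#include <iostream>
#include <stack>

struct Node {
    int val;
    Node* left;
    Node* right;
};

void inorderIterative(Node* root) {
    std::stack<Node*> st;                  // our own stack, backed by heap memory
    Node* cur = root;
    while (cur || !st.empty()) {
        while (cur) {                      // slide down the left spine
            st.push(cur);
            cur = cur->left;
        }
        cur = st.top();
        st.pop();
        std::cout << cur->val << ' ';      // visit
        cur = cur->right;                  // then traverse the right subtree
    }
}
```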

10front-stage(+backstage)data structures@cod`IV

Based on my experience of algorithm interviewing at investment banks, high or low frequency hedge funds, 3rd-party financial firms like data providers, liquidity venues, a few deep-pocketed startups and a few WestCoast tech giants.. here’s the breakdown in terms of the data structure used in the original problem description.

  • 5% of them are presented in nothing but a single integer or two integers or all integers under 10000. Calculation required.
  • 5% of them are presented in huge stacks or queues
  • — arrays in 1D or 2D
  • 10% of them are presented in huge integer arrays, that require comparison, summation, subtraction or other simple calculations
  • 15%+ of them are presented in a huge sequence of value-objects, not pre-sorted
    • This is the default way to receive input data
  • 10% of them are presented in huge 2D arrays including matrices, mazes and primitive images
    • my #1 weakness
  • — strings
  • 15% of them are presented in nothing but a single char-array or two strings
  • 10% of them are presented in huge collection of strings like words
  • — linked node data structures:
  • 10% of them are presented in long linked lists
  • 10% of them are presented in huge unsorted trees
  • 5% of them are presented in huge binary search trees
  • 4% of them are presented in other huge directed graphs but not binary trees or linked lists
  • 1% of them are presented as huge collection of points on an x/y coordinate plane. Always some math required, therefore rare.
  • =====
  • 100% in total

Any of these problems could require a solution using DP/Greedy, swapping, and auxiliary data structures like binary trees, hash tables, linked lists, stacks, queues or priority queues..

  1. 70% need auxiliary arrays or hash table
  2. 50% can be solved with recursion excluding DFT
  3. 25% needs a breadth/depth-first tree traversal
  4. 25% can use DP or greedy algo, usually in one iteration — tough
  5. 20% can use an auxiliary BST
  6. 10% can use an auxiliary tree but not a BST — tough and non-intuitive. Eg: many matrix problems
  7. 10% can use an auxiliary linked list
  8. 10% require recursion in a loop — tough
  9. 10% can use an auxiliary stack or queue excluding BFT
  10. 10% can use an auxiliary sorted vector
  11. 5% can use an auxiliary matrix — tough and non-intuitive. Eg: EditDistance

9tips: hacker rank test #cout-macro

similar to codility…

  • I may need a macro for cout, so I can disable all cout instrumentation quickly before uploading my source (see the sketch after this list).
    • #define ss if(1>2)cout // flip 1>2 to 1>0 to re-enable
  • I keep my own container data dumper function for instrumentation.
  • you can submit multiple times. So as soon as you can pass some tests, please submit.
  • 🙂 You can look at multiple questions at the same time, so feel free to skip tough or time-consuming questions.
  • You absolutely need your own local compiler and ECT environment.
    • I put a small dos window (for g++ and a.exe) over a notepad++ editor. I use F2 to save
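
A sketch of the macro and dumper mentioned in the list above (the exact shape is my own; adjust to taste):

```cpp
// Disable-able cout macro plus a simple container dumper.
// Flip 1>2 to 1>0 to re-enable all instrumentation locally; flip back before uploading.
#include <iostream>
#include <vector>

#define ss if (1 > 2) std::cout   // every "ss << ..." line compiles but prints nothing

template <typename C>
void dump(const C& container, const char* label = "") {
    ss << label << " [ ";
    for (const auto& item : container) ss << item << ' ';
    ss << "]\n";
}

int main() {
    std::vector<int> vec{3, 1, 4, 1, 5};
    ss << "starting\n";
    dump(vec, "vec");
    return 0;
}
```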

The online editor is too slow and provides only a single screen. You can copy-paste code from a local IDE, but you can’t copy-paste test data. You could try downloading the test data, but you need to be really efficient there. Setting up a test can take 5 minutes out of an average of 20 minutes per question.

  • There could be hidden test cases. Don’t worry too much about them. It’s a real achievement if you can pass all the visible test cases.
  • Ignore integer overflow. If there’s a hidden test for it, you will see the failure
  • Don’t ever worry about minor inefficiency. You won’t have the time. Just get it to work first.
  • pass by clone by default. Easier to debug
  • avoid long variable names. Use std abbreviations like li, vec, ma (for map)
  • I maintain many (unnecessary) global variables of generic names like i1, s2
  • 😦 lost connectivity? timer keeps ticking

 

cod`IV – favored by the smartest companies

XR,

I see a few categories of IV questions:

1a) fundamentals — Some Wall St (also in Asia) tough interviews like deep, low-level (not obscure) topics like threading, memory, vtbl, RB-tree ..

1b) lang — Some (IMO mediocre) interviewers (like programming language lawyers) focus on language details unrelated to 1a)
2) Some (IMO mediocre) interviewers like architecture or high level design questions (excluding algo designs) like OO rules and patterns but I feel these are not so tough.

3) algo — Some tough interviews focus on algo problem solving in pseudo-code. See http://bigblog.tanbin.com/2007/09/google-interviews-apply-comp-science.html. I got these at Google and Facebook. Some quants get these too.

4) coding question (in a real compiler, not pseudo-code) — another tough interview type, be it take-home, onsite or webex.
With some exceptions (like easy coding questions), each of these skills is somewhat “ivory tower” i.e. removed from everyday reality, often unneeded in commercial projects. However these skills (including coding) are heavily tested by the smartest employers, the most respected companies including Google, Facebook, Microsoft… You and I may feel like the little boy in “emperor’s new dress”, but these smart guys can’t all be wrong.
I will again predict that coding questions will grow more popular as the logistical cost is progressively driven down.
Candidate screening is tough, time-consuming and, to some extent, hit-and-miss. With all of these screening strategies, the new hire can still fail on TECHNICAL grounds. Perhaps she/he lacks some practical skills: code reading; debugging using logs; automation scripts; tool knowledge (like version control, text search/replace tools, build tools, and many linux commands).
Needless to say, new hires more often fail on non-technical grounds like communication and attitude, but that is another topic altogether.

In terms of real difficulty, the toughest comp-science problems revolve around algorithms and data structures, often without optimal solutions. Algo interview questions are the mainstay at big tech firms, but not on Wall St. Some say “practice 6 months”. The second toughest would be coding questions —
* Often there’s too little time given
* Sometimes the interviewer doesn’t like our style, but onsite coding tends to focus on substance rather than style.

Tan bin

P.S. I had webex style coding interviews with 1) ICE Feb 2011, 2) Barclays (swing), 3) Thomson Reuters, 4) Bloomberg

P.S. I tend to have more success with algo interviews and onsite coding than take-home coding. See blog posts.. (
http://tigertanbin2.blogspot.com/2015/05/sticky-weakness-revealed-by-interviews-c.html
http://tigertanbinpripri.blogspot.com/2015/03/high-end-developer-interviews-tend-to.html
)

WallSt cod`IV don’t care about efficiency

Most challenging algo interviews (onsite) are really about “solve it by hook or by crook”, regardless of efficiency. (Usually there is no obvious brute-force solution; even when there is one, it earns you no credit anyway.)

In the toughest algo questions, usually there's no memory capacity issue to worry about.
Occasionally, interviewer stipulates O(N logN) or O(1)
Once we solve it we can improve efficiency. It's a second phase. Can be easier than first phase.
In that case threading, sorting … are unimportant. Instead we need judicious use of trees, recursions, pattern recognition… and very clear thinking.
See [[EPI300]] and the green book by Zhou Xinfeng

%% priorities in a take-home cod`IV

A lot of times the requirements are way too numerous or stringent, given the time limit. Must give up some. Here are my priorities:

  1.  basic functionality. The essential problem should be solved, or shown to be solvable.
    • pass the essential test cases only
  2. On that foundation, add as many big features (in the problem statement) as I can within the time limit. Leave out the small features.
  3. simplify. Simplicity is the first element of clean, readable code.

I guess some hiring managers are very particular about code quality and readability. I feel it’s like a beauty contest. I would give up in those cases. I was told JPM follows [[Clean Code]] when judging/scoring/ranking the code submitted.

Need to ensure the file timestamps are all within a short window. Prefer to use fewer files; Java solutions require more files :(. If I claim the work took 2 hours, then I had better leave some rough edges in the code, such as minimal input validation.

[07] google-style interviews: apply comp-science Constructs on realistic problems

Hi LS,

Now I think yesterday’s toughest interview questions are similar in spirit to Google interview questions. I think the key skill tested yesterday was “how does this candidate apply computer science constructs to real world problems”. That’s what I meant when I said “connecting the algorithms/data-structures with the given requirements“. Before you can identify any algo/data-structure to model the requirements, you have no clue how to support the requirements. Whenever we see a first-in-first-out system like Green Card applicants, we automatically think of a Queue data structure — ie connecting the Queue construct to a real world problem.

A 6-year veteran developer like me is supposed to be familiar with
1) associative arrays, hashmaps, sets, binary search trees, doubly linked lists, multi-dimensional arrays and (hopefully) circular linked lists, binary heaps, red-black trees, and any combination of the above like a stack of hashmaps,
2) the concepts of space efficiency, time efficiency, random access, sequential access, what can/can’t be sorted …
3) OO concepts of dependency, coupling, polymorphism, static, multiple inheritance, HAS-A vs IS-A, setters (extremely important) …
4) OO design patterns

But this veteran may not know how to APPLY these constructs to a real-world problem, according to my observation. All of Google’s and my recent problem-solving interview questions attempt to test a candidate’s analysis skill, to “apply tools to problems”. In my case, I couldn’t apply the Strategy design pattern unless I spotted the similarity between the given requirement and the problem the Strategy pattern is designed for.

The 4 groups of tools mentioned above are simple when studied individually, but any design that needs tools from 3 of these groups is likely to be non-trivial. Consider, for example, the algorithm to delete a node from a binary search tree (sketched below).
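
This sketch is my own, not from the original email; it shows how even a textbook operation combines pointer manipulation, recursion and the BST ordering invariant:

```cpp
// Delete a key from a binary search tree; returns the new subtree root.
struct Node {
    int key;
    Node* left;
    Node* right;
};

Node* minNode(Node* n) {
    while (n->left) n = n->left;
    return n;
}

Node* erase(Node* root, int key) {
    if (!root) return nullptr;                 // key absent: no-op
    if (key < root->key) {
        root->left = erase(root->left, key);
    } else if (key > root->key) {
        root->right = erase(root->right, key);
    } else if (!root->left) {                  // zero or one child: splice it in
        Node* r = root->right;
        delete root;
        return r;
    } else if (!root->right) {
        Node* l = root->left;
        delete root;
        return l;
    } else {                                   // two children: use in-order successor
        Node* succ = minNode(root->right);
        root->key = succ->key;
        root->right = erase(root->right, succ->key);
    }
    return root;
}
```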

Finally I have some idea why Google likes those questions, including the one about “a 1-7 random number generator using a 1-5 generator”.
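
For the record, the standard approach to that last question is rejection sampling. A sketch of my own, assuming the interview supplies a rand5() that returns a uniform integer in 1..5:

```cpp
// rand7() from rand5() via rejection sampling.
#include <cstdlib>

int rand5() { return std::rand() % 5 + 1; }   // stand-in; the interview supplies this

int rand7() {
    while (true) {
        // two rand5 calls give a uniform value in [1,25]
        int v = (rand5() - 1) * 5 + rand5();
        if (v <= 21)              // keep 21 outcomes: exactly 3 for each of 1..7
            return (v - 1) % 7 + 1;
        // otherwise reject and retry
    }
}
```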

recursion – whiteboard cod`IV tips

  • spend some time writing down the external interface of the recursive func — exactly what to print/return, and what global variables to update.
  • define the end-of-recursion condition explicitly. Interviewers want to see that. (See the sketch after this list.)
  • list all important local variables vs global variables on the side.
  • Some variables must be local — on the stack. All other variables had better be global variables.
    • sometimes local variables are simpler — throw-away temp
    • sometimes global variables are simpler — fewer stateful objects.
    • Globals simplify the visualization: any variable passed into a recursive call is additional state on every stack frame and makes the logic harder to reason about. This is more serious when there are many such local states to keep in mind.
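
A small sketch of my own showing these tips applied to a simple recursion (subset-sum existence): the interface comment, the explicit end-of-recursion conditions, and a single global for the read-only input.

```cpp
// "Is there a subset of g_nums summing to target?" -- illustrative only.
#include <vector>

// global, read-only input: keeps every stack frame small
std::vector<int> g_nums;

// external interface: returns true/false, updates no globals;
// 'idx' = next element to consider, 'remaining' = target still to be covered
bool subsetSum(int idx, int remaining) {
    // end-of-recursion conditions, stated explicitly
    if (remaining == 0) return true;
    if (idx == static_cast<int>(g_nums.size())) return false;

    // two branches: take g_nums[idx] or skip it
    return subsetSum(idx + 1, remaining - g_nums[idx])
        || subsetSum(idx + 1, remaining);
}

// driver
bool demo() {
    g_nums = {3, 34, 4, 12, 5, 2};
    return subsetSum(0, 9);   // true: 4 + 5 == 9
}
```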