generate combinationSum compositions #backtrack up]trie #2review


Given a set of unique candidate numbers and a target number, find all unique combinations of candidates, where each combination sums to target. Each candidate may be used repeatedly.

My solution is , showing a reusable backtracking technique described below. It’s clever and brief. However, the efficiency is questionable. Memoization might be applicable here.

This backtracking relies on a key insight. Suppose we have target = 7 and X=2,Y=3,Z=4 as the candidates.

  • when we try a Y, we don’t need to try any more Xs. For example, If we are trying XY, then all XX* solutions are already handled by earlier recursive calls.
  • each combo sequence is naturally pre-sorted internally.
  • Also, “lower” formulas are generated before “higher” formulas
          x        y     z
      /   |  \
     x    y   z       
    / \   | \  \
 xxx  xxy | xyz \ 
         xyy     xzz
void //can return something if needed

recurs( solutionsFound &, //growing
        curPartialSolution &, 
// above collections could be global variables, to simplify things

        remainingCandidates, /*probably an immutable global array. 
If we need the remaining array to shrink, we can still rely on startIndex to skip used candidates.*/

        startIndex, //used to select from remaining candidates

Inside this function, we scan remaining candidates starting from startIndex. Typically in one iteration

  1. we add a new candidate into curPartialSolution
  2. we call recurs
  3. we remove the last added candidate from curPartialSolution to restore the original curPartialSolution — backtracking up the tree.
  4. move on to the next candidate

I feel this is more greedy than DP


touch not cross: path between 2 corners

Q1: from p 183 [[discrete math]]: given a n x n grid. Start from north west corner moving south or east each step, towards that corner. The diagonal connecting them can be touched from north, but not crossed. print all paths

easier to treat origin as [0,0] and end as [N,N]

DFT will require deep recursion.
BFT (with color) where each node remembers all paths-from-root? Kinda brute force

Insight — Actually this is not necessarily a graph problem though it can be solved that way.

Q2 (accepted@leetcode ): Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses. For example, given n = 3, a solution set is:

[ “((()))”,
“()()()” ]

Four related problems — These two problems are related to the abbreviation generator i.e. the combination generator.

However, the abbr/combo generators are more versatile and possibly overkill for this problem. These two problems can use a bit array to represent the output. I think my solution on github is probably considered inelegant but I don’t care. BigO insight —  number of paths (or valid strings) is O(N!) so any solution would not be any better than O(N!)

In general, our own ideas are often inefficient. If efficient, then often inelegant by some arbitrary interview standard. Still more valuable than learning standard solutions. One of the biggest values is xRef which helps build insight, intuition, thick->thin.

staircase:1or2 each step

Q: You are climbing a stair case. It takes n steps to reach to the top. Each time you can either climb 1 or 2 steps. In how many distinct ways can you climb to the top?

I think this problem is not “easy”. similar to Fib:

f(3) = f(1) + f(2) where f(1) = how many unique paths after taking a double-step; f(2) how many unique paths after taking a single step.

%%approaches to classic generator problemS

* p2u in next_perm

* insert at each position of each array, in a growing collection to create new arrays for the collection. Optionally, all arrays are immutable.

Actually, only two approaches so far, though each of them can be adapted in many ways.

Note many (simple) classic generator algos are required in coding tests or (practical) algo problems

master The yield-generator uses python q[ yield ] to implement classic generators of

  • combinations
  • permutations
  • abbreviations
  • redraw

Key features and reasons why we should try to memorize it

  1. very brief code, not too dense (cryptic), .. helps us remember.
  2. reliability — brief means fewer hiding place for a bug. I added automated tests for basic quality assurance. Published solution
  3. versatile — one single function to support multiple classic generators.
  4. yield functions’ suspend-resume feature is particular powerful and possibly a power drill. This is my first opportunity to learn it.
  5. instrumentation — relatively easy to add prints to see what’s going on.
  6. halo — yield-based generator functions are a halo and also a time-saver
  7. elegant — brief, simple, relatively easy to understand
  8. recursive — calling itself recursively in a loop therefore fairly brief but powerful
  9. useful in take-home assignments
  10. identity-aware — The second time you call myYieldFunc(44), it would usually return the same generator object as the first time you call it with that same argument (i.e. 44). If that generator object has reached end of it’s execution, then you won’t get any more output.

— How I might try to memorize it

  1. If needed, we just need to reproduce it quickly from memory a few times
  2. I added comments to help me understand and memorize it

generate loopfree paths: graph node A→B

Q1: given 2 nodes in a graph containing N (eg 121) nodes, potentially with cycles, generate all simple paths between the pair. A simple path has no cycle. (In other words, for a simple path, length + 1 ==  # unique nodes)

I think there are classic math algorithms for it, because this is part of basic graph theory. Here are some applications of this type of algorithms —

  • Q1b (special case of Q1): given 2 nodes in a C-by-R rectangular grid, where every node is connected to (up to) four neighbors, generate all cycle-free paths between the pair.
    • Below, I solved this problem in python
  • Q2 (simple application of Q1 algo): generate all simple paths between ALL node pair in a graph. The shortest simple path has length=0. Longest simple path can potentially visit every node exactly once.
  • A: first generate all 121-Choose-2 node pairs. For each pair, solve Q1. Lastly generate the 121 trivial paths of length=0.
    • repetitive 😦
  • Q2b (special case of Q2): In a C-by-R rectangular grid, where every node is connected to (up to) four neighbors, generate all cycle-free paths between ALL pairs.
  • Q2c (easy one based on Q2): given a binary tree containing no cycles, generate all paths.

— A1b: my DFT implementation (probably not 100% correct) , where each “trail” either fails or becomes a path.

  1. from NodeA start a breadcrumb/trail. We can’t revisit any node already visited on current breadcrumb,
    1. if this is a matrix, then instead of a hashtable, we can also use a shadow matrix, but the breadcrumb is much smaller than a shadow matrix
  2. if we reach a node surrounded by nodes on the same breadcrumb, then the trail fails at a dead-end
  3. else we will reach NodeB 🙂 Print the breadcrumb

By construction, we won’t see duplicate paths 🙂 is the implementation

–BFT? I don’t think BFT can print each unique path

generate all abbr starting from longest.. +! recursion

I won’t insist on relative ordering among the shortest.

Idea 1() — Start with longest abbreviation i.e. the original string S, assuming 5 characters.

  1. populate the smallHM with the original word
  2. copy every char except the first. save into bigHM, then print/process this abbrevation.
  3. copy every char except the 2nd and print
  4. ..
  5. copy every char except the last. Now we have 5 strings in bigHM (a Level-4 hashmap), each length S-1=4
  6. make smallHM point to bigHM object; point bigHM to an empty hm
  7. now take a string from smallHM (Level-4 collection) and generate 4 shorter strings and save them in bigHM (a Level-3 collection), each length S-2=3
  8. now take 2nd string from Level-4 …
  9. After we finish Level-4, we have generated 20 strings in Level-3, but there are only 10 distinct items! so we need a L3 hashmap.


staircase problem #CSY@Bbg

Q (bbg question posed to CSY): given a staircase of height N (eg 3), you can reach the top by three steps 1,1,1 or two steps 1,2 or 2,1, or a single step of 3. Generate all paths for a given N.

I think the solution(s) need more review and internalization.

This problem has many classic solutions

  • bottom-up
  • top-down recursion with memoization
    • yield-based generator function
  • BFT without recursion without duplicates

I feel this is simpler than AQR factorization and the CombinationSum problems.

Is this same as N-boy-split? No… split is harder. With the split, ABC can have a “step” of AC.

easy to test — was my non-recursive BFT solution. is my yield-based recursive solution. Only 5 lines in the implementation. I added lots of tests.

Q: why path count == 2n-1?
Answer from CSY: f(1)=1, f(2)=2, f(n) = f(n-1)+f(n-2)…+f(1) = 2n-1
A: For a staircase of n, you can take step of 1 and have f(n-1) paths, or you can take step of 2 and have f(n-2) paths…

—jargon file:

a STEP from LEVEL 0 to 2 has LENGTH=2, and skips level 1; a PATH consists of 1 or more STEPS; a FORMULA is a complete PATH, and always starts at level 0 and may hit intermediate levels #2, #5, #6 …

— key observations:
* Observation 1: Every formula starts with firstStepOf1, or firstStepOf2, (or firstStepOf3…) with not duplicate between these two groups !
* Observation 2: After taking firstStepOf1, the remaining steps are really a formula for staircase of N-1, a smaller problem. This invites a bottom-up DP solution.

—BFT solution. suppose N = 5. We model each path as a directory path from root. Each queue item is a path represented by a linked list or vector of capacity N.

I hope this solution avoids duplicates… Yes it does:)

  1. enqueue all “first levels” /1; /2; /3; /4; /5
  2. then dequeue and visit first path i.e. “1”. In this case, there are 4 levels remaining in the staircase, so enqueue /1/1;  /1/2;  /1/3;  /1/4
  3. then dequeue and visit 2nd path in the queue i.e. “/2”. enqueue /2/1; /2/2;  /2/3

Every time (after dequeue) we see a path has no remaining level, we have a formula.

—DP (bottom-up) iterative solution to be implemented

  1. first, build the formulas for a length-2 staircase. They are /1/1 + /2 i.e. two formulas saved in FormulaArrayForStaircase2 i.e. “fa2”
  2. then build the formulas for a length-3 staircase — fa3 = firstStepOf2 -> fa1 + firstStepOf1 -> fa2
  3. then build the formulas for a length-4 — fa4 = firstStepOf3 -> fa1 + firstStepOf2 -> fa2 + firstStepOf1 -> fa3

— recursive: CSY gave a recursive solution and interviewer asked “Can you find a non-recursive solution”?

I think the recursive solution repeats the same sub-problem (such as a 3-level staircase) over and over again, but we can make the recursive function more efficient with memoization — it should short-circuit when the input already has a solution saved in a global hashtable.

—- python yield/generator recursive solution

My github code shows simple generator function without memoization is about half the speed of bottom-up. With memoization, it’s 20% faster than bottom-up.

next_perm@N color socks #complex #100%

A common IDE coding challenge — given x pairs socks of various colors, generate all the permutations, in ascending order. Each color has a value.

–Solution 1: std::next_permutation() and prev_permutation()

–solution 2: I can probably write my own next_perm(). Using This function we can generate an ascending sequence of permutations starting from the current content of a vector. is my iterative solution, but should use lower_bound() is my python solution, overcoming the lower_bound problem.

relating my perm/comb algos

–next_perm: I have an iterative solution

–next perm 5C3: iterative algo, growing global collection

–next_combo 5C3: I have a (approx) 10-line recursive algo. (Iterative algo is non-ideal). Passing in two collections.

–next_abbreviation of any length: I have a (approx) 10-line recursive algo. Using global collection, this algo is simpler than next_combo

–So far, these algos are decent but have nothing in common.


next abbreviation@word with repeat char #2^N #100%

This is a real coding question at 10gen/mongoDB is a simple iterative py solution. Not ascending.

Q1: Given a word with unique letter (like “abc”), Can you write c++ program to generate all abbreviations in any order?
Q2: same question, but don’t use any external storage beside character arrays.
Q3: same question, but output in ascending order? Expected output:

  1. a
  2. ab
  3. abc
  4. ac
  5. b
  6. bc
  7. c

It’s daunting to generate this sequence without recursion. is my recursive solution to Q3.

Q4: what if the word has non-unique letters, like “mongo”? The only solution I know relies on an external lookup device.

generate random line-up@Any combo@N boys #70%

A standard permutation/combination problem in some coding tests. You are often required to iterate all of them.

Given N cities, how many permutations of any combinations are there in total.

My iterative sum formula: answer(N)= N_choose_0 + N_choose_1 * 1! + N_choose_2 * 2! + … + N_choose_N * N!

N_choose_0 = 1 !

–recursive algo: use answer(N-1) to answer(N)

–iterative algo:

  • Suppose we have an iterative_next_perm(list) function already written.
  • suppose we have an iterative_next_combo(N, 3) that generates all combinations of 3 out of N distinct chars.

Then, here’s a solution — call

iterative_next_combo(N,N), in a loop (of N iterations).

Suppose one of the N calls generated 221 combinations. Run a loop (of 221 iterations) each time pass one combination into iterative_next_perm(combo). So our main program has only 2 nested loops. Most of the complexities are encapsulated in iterative_next_perm() and iterative_next_combo()

next Combo@3,using5distinct chars,permitting redraw

Modified from Leetcode problem 17. Suppose there is nothing but one red button. and there are L (like 5) letters on it.

Q: With 4 (=C) draws from 3 (=L) letters (same phone pad button), how many permutations? L^C.
Q: With 4 (=C) draws from 3 (=L) letters, how many combinations?

For C=1, Ans=3
For C=2, Ans=6
For C=3, Ans=10= 6 + 3 + 1?
For C=4, Ans=15=10+(10-6)+1
For C=5, Ans=21=15+(15-10)+1

–My explanation of the count:

Key: Each combo is represented as a sorted vector (ascending). There’s one-to-one mapping between such a vector and a combo.

let’s say L=3 and C grows from 1 to 3 .. For C=2, I put all combos on 3 rows (3 vectors or lists)

  • aa ab ac #all that starts with a
  • bb bc #all that start with b
  • cc # lone wolf

For C=3, Ans

  • first container is populated by prepending ‘a’ to all 3 “old” containers of items
  • 2nd container is populated by prepending ‘b’ to 2nd-and-lower old containers
  • 3rd container is always a lone wolf

Both solutions below are simple, reusable and implemented in

–sol-1 efficient incremental solution — for C=3, given one combo as a string, find the “next” combo. For example after bbc, we should get bcc. Bonus feature — output is naturally sorted 🙂

–sol 2: append the next char ..

next split@N boys #51%

N boys are all unique “entities”, each identified by student id. Later we can look into “next split of N colored balls, having J unique colors”.

In a split, every boy belongs to exactly one group (minimum 1 member). We could rely on python’s default sort order, but for now let’s define a sort order between two groups AA and BB:

  1. sort by group size
  2. If same size, sort by the lowest student id in AA vs in BB.

Every “split” is recorded as a sorted list of groups. I will define a sort order between 2 splits:

  1. sort by list size
  2. if same size, sort by first group, the by 2nd group..

Q1: how many ways to split N boys?
%%A: First we line up N boys — there are N! line-ups. In each line-up, we have 2^(N-1) splits because there are N-1 slots to place a group-boundary. This is a way to iterate over ALL splits, but without uniqueness.

Q2: Write a program to generate all splits in some order. For example, with 3 boys h,i,j:

  1. h/i/j in a giant group…
  2. –3 split-of-2-groups: h, i/j
  3. i, h/j
  4. j, h/i…
  5. –1 split of 3 singular groups: h, i, j
  6. ==== total 5 distinct splits showing 10 group objects but not all distinct

With 3+1 boys h,i,j,k: First add k to each of 10 group objects:

  1. h/i/j/k
  2. h/k, i/j
  3. h, i/j/k
  4. i/k, h/j
  5. i, h/j/k
  6. j/k, h/i
  7. j, h/i/k
  8. h,i, j/k
  9. h,j, i/k
  10. i,j, h/… — so far, 10 splits. Now 5 more splits featuring k in its own singular group
  11. k, h/i/j
  12. h, k, i/j
  13. i, k, h/j
  14. j, k, h/i
  15. h,i,j,k

—- analysis: I don’t know how many splits there will be for N boys. An iterative function will probably work:

After I listed out exactly 5 splits of 3 boys, and I added another boy. So here’s the algo I used — For each of the 5 (existing) splits, add this boy to each of 10 group objects, or put himself as a new singular group.

next_Perm@3boys out@5 #80%

algo-practice: generate permutation@3, using5distinct chars

Such a skill might be needed in some coding IV sooner or later. Let’s write this in py or c++. Formally,

Q1: generate all permutations of 3, from 5 distinct chars, in any order.
Q2: generate all permutations of 3, from 5 distinct chars, in ascending order. You can sort the 5 chars first.
Q2b: once you have generated one permutation, how do you identify The next?

Note the same solution is a generalization of std::next_permutation(), so once you have this solution you have that solution.

How many? 5-choose-3 * 3! = 5!/(5-3)! = 5*4*3 = 60, denoted Answer(3). Paradoxically, Answer(4) == Answer(5) = 120

–algorithm 1 for Q1

  • 1st generate 5 single-char strings;
  • then for each generate 4 two-char strings. We get 20 strings.

–algorithm 1b for Q1: rec(5,3) will call rec(5,2). rec(5,2) has 20 items, each can generate 3 items for rec(5.3), because each item has 2 characters and a void to be filled by the 3 unused chars.

The 20 items returned should be a pair{vector of 2, vector of 3}

This produces a sorted collection:)

See my code




next_Combo@3 boys out@5 # 100%

Given, abcde, How many combinations of 3? 5-choose-3 = 10 standard formula, but let’s implement in py or c++.

It’s useful to define a score for a combination — sort the combination into a string. The string is the score. Now,

Q1: generate all combinations@3 from 5 distinct chars, in ascending score.
Q2 (more importantly) given any combination of 3, generate the next higher combination of 3.  Output each combination as a sorted string to keep things clean. Extremely effective constraint and simplification.

1) is a decent recursive.

2) fixed-length abbreviation generator also generates exactly the same  sequence!

3) Perhaps we can modify the iterative generate random perm@3boys out@5 algorithm:

Key: Scan from end of string to identify position2upgrade. Compare current (i.e.last) position with the unused characters. If can’t upgrade me [eg abe], then move left which is guaranteed to be a lower character. Now compare the new “me” with only the unused characters (in this algo we don’t allow swap within the string) If can’t upgrade “me”, then move left.

If current position is upgradeable, then set it as p2u. Upgrade to the lowest unused character. Then reset all characters on my right and return.

  1. abc
  2. abd
  3. abe
  4. acd
  5. ace
  6. ade //last 2 letters are the max. p2u = 0
  7. bcd
  8. bce
  9. bde
  10. cde


factorize a natural number #AQR

My friend Dilip received this question in a 2017 AQR on-site.

Q: given a natural number (like 8), write a function to output every factorization such as (2,4) (2,2,2). You can ignore or include the trivial factorization (1,8). You can use recursion if you want.
— (incomplete) Analysis

  1. I would first generate all the prime numbers up to sqrt(N)
  2. among them, i would find all the factors. Once I find a prime factor x, keep dividing by x so I know in total I have 3 x’s, 2 y’s and 1 z, or (x,x,x,y,y,z). I call them 6 non-distinct “prime factors”.

From there, I might be able to write a (recursive?) function to output the factorization formulas. The ideal algo automatically avoids duplicate factorizations but Here’s my non-ideal design: generate all 2-way “splits”, all 3-way splits… If I keep all my “splits” in a hashtable, I can detect duplicates. So just treat the 6 factors as 6 distinct factors. Now the problem is well-defined — next split@N boys .

— trie solution based on generate combinationSum compositions #backtrack up] trie+tree

Candidates are the non-distinct prime factors and their products, each a factor of the big number.

— recursive solution by CSY

  • no prime number needed! A major drawback — if the target number is odd, we would still keep testing 2, 4, 6, 8 as possible divisors! is very short solution by CSY. Here’s my analysis —

  • I believe every time the factorize(60) function finds a small factor like 2, it pushes the factor onto a global stack, then run factorize() on the quotient i.e. 30 — wherein every factorization formula on 30 is “decorated” with the stack. is my modified/improved python solution

  • I replaced the global vector with a local immutable list on each call stack. It helps me reason. This is also thread-friendly, if the target number is large.
  • It’s highly instructive to work out the expected output from the recursive loops, as in my comments.
  • Just like the continuousSentence problem, the recursive solution is clever-looking but not scalable.