%%approaches to classic generator problemS

* p2u in next_perm

* insert at each position of each array, in a growing collection to create new arrays for the collection. Optionally, all arrays are immutable.

Actually, only two approaches so far, though each of them can be adapted in many ways.


versioned-queue problem

I think this problem is mostly about data structure, not algorithm.

Q: Design and implement a Version-Queue. A Version-Queue maintains a version number along with normal Queue functionality. Every version is a snapshot of the entire queue. Every operation[Enqueue/Dequeue] on the Queue increments its version.

Implement the following functions:

1. Enqueue – appends an element at the end of the queue.
2. Dequeue – returns the top element of the queue.
3. Print – it takes a version number as input and prints the elements of the queue of the given version. The version number input can also be an old/historical version number.

E.g. if the current version number of the queue is 7 and the input to this function is 5, then it should print the elements of the queue
when its version number was 5.

For simplicity, assume the elements are integers.

We expect you to write a helper program to test the above data structure which will read input from stdin and prints output to

Input format:
First line should have an integer n (number of operations). This should be followed by n number of lines, each denoting one operation.
e 1
e 4
e 5
p 2
p 4

‘e’ stands for enqueue

— My design —
In addition to the regular queue data structure, we need a few helper data structures.

All current + historical queue elements are saved as individual elements in a “Snapshot” vector, in arrival order. This vector never decreases in size even during dequeue. Two pointers represent the trailing edge (dequeue) and leading edge (enqueue).

(minor implementation detail — Since it’s a vector, the pointers can be implemented as 2 integer index variables. Pointers and iterators get invalidated by automatic resizing.)

Every enqueue operation increments the right pointer to the right;
Every dequeue operation Increments the left pointer to the Right;
(No decrement on the pointers.)

With this vector, I think we can reconstruct each snapshot in history.

Every pointer increment is recorded in a Version table, which is a vector of Version objects {Left pointer, Right pointer}. For example, if Version 782 has {L=22, R=55} then the snapshot #782 is the sub-array from Index 22 to 55.

Additional space costs:
O(K) where K = number of operations

Additional time costs:
Enqueue operation — O(1). Insert at end of the Snapshot vector and Version vector
Dequeue operation — O(1). Move a pointer + insert into Version vector Print operation — O(M) where M = maximum queue capacity

Minor point — To avoid vector automatic resize, we need to estimate in advance the limit on K i.e. number of operations. If you tell me you get millions of operations a microsecond, then over a day there
could be trillions of “versions” so the Versions vector and the Snapshot vector need sufficient initial capacity.

Note we could get 9000 enqueue operations in a row.

average O(): hashtable more imperfect than qsort

In these celebrated algorithms, we basically accept the average complexity as if they are very very likely in practice.

In comp science problems, hashtable’s usage and importance is about 10 times higher than qsort

  • I would say qsort is faster than many O(N logN) sorts. Qsort can use random pivot. It degrades only if extremely “lucky:)” like getting a 6 on every dice roll.
  • In contrast, hashtable performance depends mostly on programmer skill in designing the hash function, less on luck.

Performance compared to alternatives — qsort relative performance is usually as good as expected, but hashtable relative performance often falls short.

generate simple paths between 2 graph nodes

Q1: given 2 nodes in a graph containing N (eg 121) nodes, potentially with cycles, generate all simple paths between the pair. A simple path has no cycle. (In other words, length + 1 ==  # unique nodes in a simple path)

  • I think there are classic math algorithms for it, because this is part of basic graph theory. Here are some applications of this type of algorithms —
  • Q1b (special case of Q1): given 2 nodes in a C by R matrix grid, where every node is connected to (up to) four neighbors, generate all cycle-free paths.
    • I can solve this problem in python
  • Q2 (easy one based on Q1): generate all simple paths between any node pair in a graph. The shortest simple path has length=0. Longest simple path can potentially visit every node exactly once.
  • A: first generate all 121-Choose-2 node pairs. For each pair, solve Q1. Lastly generate the 121 trivial paths of length=0.
  • Q2b (special case of Q2): given a C by R (eg 11×11) matrix grid, where every node is connected to (up to) four neighbors, generate all simple paths.
  • Q2c (easy one based on Q2): given a binary tree containing no cycles, generate all paths.

— my DFT implementation (probably not 100% correct) , where each “trail” either fails or becomes a path.

  1. from NodeA start a breadcrumb/trail. We can’t revisit any node already visited on current breadcrumb,
    1. if this is a matrix, then instead of a hashtable, we can also use a shadow matrix, but the breadcrumb is much smaller than a shadow matrix
  2. if we can reach a node surrounded by nodes on the same breadcrumb, then the trail fails
  3. else we will reach NodeB 🙂 Print the breadcrumb

By construction, we won’t see duplicate paths 🙂

https://github.com/tiger40490/repo1/blob/py1/py/grid/classic_count4waySimplePaths.py is the implemnetation

–BFT? I don’t think it can print each unique path

generate all abbr starting from longest.. +! recursion

I won’t insist on relative ordering among the shortest.

Idea 1() — Start with longest abbreviation i.e. the original string S, assuming 5 characters.

  1. populate the smallHM with the original word
  2. copy every char except the first. save into bigHM, then print/process this abbrevation.
  3. copy every char except the 2nd and print
  4. ..
  5. copy every char except the last. Now we have 5 strings in bigHM (a Level-4 hashmap), each length S-1=4
  6. make smallHM point to bigHM object; point bigHM to an empty hm
  7. now take a string from smallHM (Level-4 collection) and generate 4 shorter strings and save them in bigHM (a Level-3 collection), each length S-2=3
  8. now take 2nd string from Level-4 …
  9. After we finish Level-4, we have generated 20 strings in Level-3, but there are only 10 distinct items! so we need a L3 hashmap.


O(1)space,O(1)search,O(N)sort: tricks

Every time I see O(1) space required on an array problem, I think of …. swapping.

Every time I see O(1) space required on a list problem, I ask is it a ….. linked list.

Every time I see O(N) time required on an array problem, I think of … radix sort, applicable to 64-bit integers, 64-bit floats and strings.

Every time I see O(1) search, I think of … hash table and radix array

Tower IV: c++,algo,past prj..

Q1: how could you use enum to improve performance
AA: use it to replace strings
AA: enum is used in template programming to move computation from run time to compile time. See Q4 and SFINAE in github.
%%A: see clever use of enum(char?) in demanding c++ app

Q1b: you put in your change but it turns out to hurt performance. Any idea?

Q4: how do you compute Fib(N) like Fib(99999) at compile time using template programming?
A: See my Fibonacci code in github

Q: how would you use multicast to send a message to a group of registered users?
%%A: less data copying than TCP

Q: In O(1) space, Given an unsorted array of natural numbers, how do you find any pair to produce a target sum? What’s your best time complexity?
%%A: if I assume the integer size is 32-bit then a fixed-size O(1) space radix structure can sort the array in O(N). See also my blog post on [[locate a pair with targetSum=55 #bbg IV #Morris]]
%%A: if integer size is unlimited, then I think the best is O(NN)

Q: How is your muni bond pricing engine designed internally?
Q2b: what kind of pricing rules live in the cache?
Q2c: how many threads to react to a given event?

Q: how did you calculate FX fwd points?
%%A: use the interest rates in the two currencies and the spot FX rate.

Q: how is your implied volatility computed from an option price?
%%A: if there’s not a closed-form formula, then Newton’s method or something similar should be able to converge quickly.

Q3 (OO design): How would you design a game software with Animal objects, various birds, and fish?
Q3a: Say the system is running fine, but now you need to add ostrich class?

binary search in rotated sorted array

https://leetcode.com/problems/search-in-rotated-sorted-array/description/ has the requirement. I don’t want to look at other people’s solution, so I have reproduced the requirements below. I have not encountered this problem in any coding interview.

Q: Suppose an array sorted in ascending order is rotated at some pivot unknown to you beforehand. (i.e., [0,1,2,4,5,6,7] might become [4,5,6,7,0,1,2]). You are given a target value to search. If found in the array return its index, otherwise return -1. You may assume no duplicate exists in the array. Your algorithm’s runtime complexity must be in the order of O(log n).

https://github.com/tiger40490/repo1/blob/cpp1/cpp/array/binSearchRoatedArr.cpp is my solution

–Solution 2:

first run a binary search to locate the pivot point. Then run a O(1) test to discard one of the 2 segments. We are left with the remaining segment as a regular sorted array. Run binary search on it.

bbg: allocate half the players to NY^SF

Always good to be concrete, so without loss of generality, let’s say J=99 i.e. 198 players.

Q1: Given 2J players, we need to allocate exactly half of them to NY and the rest to San Francisco, to form two teams for a inter-city contest. We receive the NY fare and SF fare for each player, as 2J (i.e. 198) pairs of integers. How can we minimize the total fares?

Q2(bonus): what if 2J+1 players, and you can decide which city gets an extra player?

— Analysis: how many possible allocations? 2J-choose-J. Very large number of allocations. Something like O(J!) so brute force is impractical.

If for any player the two fares both go up by $9800, it doesn’t change our algorithm at all. Therefore, we only care about the fare difference (N-S) for each player.

— solution: I will assume most players live near SF so SF fares are lower. I tabulate and sort the 198 fare differences “NY fare – SF fare” and suppose indeed at least 99 (half) are positive[1]. Therefore, my “base” is SF i.e. my base allocation is everyone-allocated-to-SF. Next I must pick 99 (half) of them and allocate to NY.

I will avoid the higher values in the table.

I simply pick the lower 99 (half) in the table, because these are the 99 “extra” costs I will incur. I want to minimize the total of these 99 values, whether they are mostly positive or mostly negative.

  • Note the highest number in the table indicates the most expensive player for NY relative to SF. If this number is negative, then SF is more expensive than NY for All players so follow the rule in [1] but notice he is the least expensive players for SF relative to NY. Regardless of positive/negative sign, we want to keep this person in SF .
  • Note the lowest number in the table indicates the least expensive player for NY relative to SF. If this number is negative, then SF is more expensive than NY for this player — normal. Regardless, we want to allocate her to NY.

[1] if proven otherwise (by the data), then we could easily treat NY as base to keep the solution intuitive. Actually even if we don’t change the base, the algorithm still works, albeit unintuitively.

A2: Say J=99 so total 199 players We already picked 99 for NY. Do we pick one more for NY or keep remaining 100 in SF?

Just look at the next lowest value in the table. If positive, then NY is more expensive for him, so keep him in SF.

bucketSort: O(N+k) could be better or worse than O(N)

Based on the top answer in https://stackoverflow.com/questions/7311415/how-is-the-complexity-of-bucket-sort-is-onk-if-we-implement-buckets-using-lin

It’s obvious (by definition) that O(2N) is equivalent to O(N).

O(N+k) is worse than O(2N) when k is larger, like 4000 times N. For a concrete illustration, say, we have N=5 million strings to sort, using k = 20 billion buckets.

  • It takes constant time to bucket each string, so the first step takes O(N) i.e. grows proportional to N.
  • 2nd step is now dominated by k, since all 20 billion buckets have to be examined. Time complexity is O(N+k). This is explained in the stackoverflow answer.
  • Therefore, total complexity is O(N+k) i.e. cN + bk with two constant factors c and b. Even if b is small (complexity of looking at an empty bucket), as k grows, at sometime the k term would overtake N term and become dominant factor.

In practice, k is often chosen to be a smaller integer than N, so I don’t think this is a practical issue.

25-horse puzzle #Google

Q: A single racetrack. P=25 mechanical horses each with a unique speed. You can race up to 5 horses each time, and get a printout of the ranking, but no timing.

What’s the minimum number of races to identify fastest #1 #2 #3 horses.

Youtube and my green quant-IV book both have the ananlysis.


Quick elimination — each time I can eliminate minimum 2 horses but can we eliminate faster?

What if T is 1 and P is 9? Two races exactly.

Here’s my algo #1

  1. first pick 5 random horses,
  2. race them and eliminate slowest 2 (reduce population by 2). Name the top 3 as AA/BB/CC
  3. race CC with 4 untested random horses. Four scenarios:
    1. if 3 or more beat CC, then eliminate the two slowest including CC, and put the top 3 in a group with AA/BB. Go back to Step b)
    2. if 2 beat CC, then eliminate the three slowest including CC, and put top 2 and an untested horse in a group with AA/BB. Go back to b)
    3. if only one horse beat CC, then eliminate the four slowest including CC, and put the #1 and 2 untested horses in a group with AA/BB. Go back to step b)
    4. if CC is #1 (happy path), then eliminate the other four horse and go back to step c)

Worst case — first race actually had the slowest 5, then we hit case C1 every time. So we only eliminate 2 horses each race

best case — first race had 5 fastest. total 6 races [1] but this is statistically unlikely. If you are not the most lucky, then we will need more than 6 races.

[1] first race eliminates 2… next 5 races each eliminates 4 only if CC is always fastest in every race.

–algo #2:

  1. [5 races] Split into 5 groups and run 5 unrelated races. Discard slowest 2 in each group. We end up with 15 horses.
  2. [1 race] Now race the 5 number one’s. Name them ABCDE. The D and E groups are completely eliminated (-6) and the C group is left with C alone (-2), and B group’s #3 is eliminated (-1). We end up with 6 horses. I will name the seven as A/B/C and the less-tested A2 A3 B2. There are only three possibilities:
    1. ABC
    2. A A2 A3
    3. A B B2
  3. [1 race] so I will race 5 of the 6 horses, sparing A since it’s already the final champion.
  4. — total 7 races best or worst case.


fastestSortFor: one str; arrayOfStr; 32bit ints #non-comparison

I think the big-O coding question may require this knowledge, because

….real world sorting problems can often use key-based sorting, rather than comparison-based sorting.

Therefore, if your overall solution requires sorting, don’t assume it would become O(J*K+M^2+P*N logN) based on your O(N logN) sorting assumption. Here are some achievable linear-time sorts:

sort O(..) data type ref notes
counting O(N) chars
 radix O(N) 64-bit ints O(Nw) -> O(N)
 radix O(N) variable-size strings [1/2] fastest
 burst O(N) variable-size strings 2nd fastest. key-based, trie-based
 radix O(N) floats [3] IEEE format
 radix O(N) date/time converting to fixed-length string or fixed-size integers
 bucket? O(N) random ints in python/perl unbounded. No limit on #digits

Note if data is nearly sorted, then insertion-sort will outperform everyone else.

[1] https://www.cs.princeton.edu/~rs/AlgsDS07/18RadixSort.pdf

[2] https://link.springer.com/chapter/10.1007%2F978-3-540-89097-3_3 paper presented a MSD radix sort for large collections of strings, faster than any other string sorting algorithm.

[3] http://www.boost.org/doc/libs/1_59_0/libs/sort/doc/html/sort/sort_hpp/float_sort.html

coin problem #all large-enough amounts are decomposable

This is not really a algo IV question, but more like brain teaser problem.

Based on https://en.wikipedia.org/wiki/Coin_problem — For example, the largest amount that cannot be obtained using only coins of 3 and 5 units is 7 units. The solution to this problem for a given set of coin denominations is called the Frobenius number of the set. The Frobenius number exists as long as the set of coin denominations has no common divisor.

Note if a common divisor exists as in {2,4} then all the odd amounts will be non-decomposable.

Q: why a very large amount is always decomposable ? Give an intuitive explanation for 2 coin values like 3 and 5.

Here’s an incomplete answer — 15 (=3*5), 16, 17 are all decomposable. Any larger number can be solved by adding 3’s .

In fact, it was proven that any amount greater than (not equal to) [xy-x-y] are always decomposable. So if we are given 2 coin values (like 4,5, where x is the smaller value) we can easily figure out a range

xy-x-y+1  to xy-y

are each decomposable. Note this range has x distinct values. So any higher amount are easily solved by adding x’s

Also note xy-y is obviously decomposable as (x-1)y.


GS c++/dataStruct IV

Q1: what’s the problem to be solved by virtual inheritance
A: diamond..

Q2: consider such a diamond with A as grandparent, B/C virtually deriving from A, then grandchild D. Suppose class C has a foo() method that uses a field A.a. Now, within C, the offset of A.a depends on the inheritance hierarchy, so how does the compiler compile C.foo()?

A: I do see a problem here. In this ABCD case, the offset of A.a is one value, depending on the size of B. However, in a ABJKCD case (J/K virtually subclass A), the offset of A.a will be a different value, depending on the sizes of B/J/K. So the C.foo() method will compile to different object code! Yet the compiler has a single version of C.foo().

The trick is the offset of the A sub-object within C. This offset value can be saved in the vtable of C.

Note if there’s no virtual inheritance, then simpler. Within C, the A sub-object’s offset is a constant. No need to look up vtable.

Q3: appending a million integers to a list vs a vector, which is faster and by how much?
%%A: vector requires reallocation, roughly 20 times, assuming doubling each time, much fewer than 1M allocations for the list.  The reallocation entails element-by-element copy, but that’s cheaper than allocations. If I must estimate how much faster, I need to estimate how many allocations and how many copying:

  • for list, 1M allocations + 1M copying
  • for vector, 20 allocations + about 2M copying

%%A: I still think allocation is a hotspot and much more expensive than copying, perhaps by 10 or 100 times.

%%A: of course, you can pre-allocate for vector, but for this problem we ignore that feature.

Q4: deque — known for fast append/prepend, and random access. How?
%%A: I think it’s implemented as segments of vectors, with the front and back segments not fully occupied and growable. Once one of them reaches a certain size, a new segment is created. RandomAccess is implemented by identifying the host segment. To optimize, segments should have equal length.

Q4b: what if we want to support fast midstream insertion?
%%A: this is a vector limitation inherited by deque. If we need to support it, we may need to allow unequal segment lengths. RandomAccess is now slightly slower. For a given subscript 16, we need to maintain a step function like

  • 5-10 in Segment 2
  • 11-13 in Segment 3
  • 14-20 in Segment 4
  • 21-30 in Segment 5
  • 31-32 in Segment 6
  • … so each time any segment expands, the step function needs an update
  • Once we have the step function, a binary search will do, at log(S) where S = Segment count

deque with uneven segments #GS

Q: (2017 GS interview question) We know std::deque offers efficient insert at both ends. But now we want efficient insert mid-stream, and willing to relax O(1) lookup.

I proposed a solution using uneven segments. Each segment has a max capacity. When reached, a new segment would be allocated and linked in. Lookup using a staircase function.

Segment sizes (not capacity) is a mutable int array. From this array, we lazily derive another int array, the “threshold array” or staircase array. Every lookup requires a binary search in this threshold array to locate the target segment. If we keep a hard limit (like 1024) on segment count, then the threshold array would be at most 1024 long, so within 10 hits we always locate the target segment — better than O(log N)

The two int arrays should be L1-cached.

## insertion sort: phrasebook

  • nearly sorted — for nearly-sorted sample, beat all other sorts  including counting sort
  • small — for small sample, beats MSD radix sort and all comparison sort
    • switch — many comparison sort implementations switch to insertion-sort when the sub-array becomes small
  • poker — Intuitively, this is how players sort poker cards
    • shift — requires shifting part of the sorted portion.

factorial is worst O(n!) > O(2^n)

Factorial complexity is (much) worse than exponential complexity. Look at n! and 53^n, without loss of generality. Model them as two snowballs rolling over 1,2,3,..1000… Initially, the exponential (E) snowball is bigger, but after we pass n=53, the factorial(F) snowball grows by a factor of 54, while the snowball E grows by a factor of 53. From then on, F grows faster and faster and will outgrow E sooner or later.

[[Discrete Mathematics]] puts Factorial as the worst algorithm, but both Factorial and Exponential are unacceptable, while Polynomial is acceptable.

[[algo in a nutshell]] bucket sort, insertion sort..

== bucket sort
This book asserts that bucket sort is #1 top dog when data is uniformly distributed (not normally distributed!) — i.e. can be uniformly partitioned using a fast hashing function. Number of buckets created equals the input data size, just like in a standard hash table. Book shows bench-marking on large samples of random floating point numbers. However, I disagree.
  1.  for special data types like strings and small ints, radix sort is probably faster than bucket sort
  2. Bucket sort is #1 for large data volumes, beating quick sort, heap sort etc. But for nearly-sorted data, Insertion-sort is faster.
  • Linked list — used inside each bucket. Arrays are possible alternative, but 50% slower.
  • —-Relation to other sorts:
  • Can be seen as generalization of counting sort
  • A cousin of radix sort in the most-to-least significant digit (MSD) flavor i.e. top-down radix sort
–hash sort, a variation of bucket sort, customized for random strings.
Book shows benchmarking on random strings. Hash sort is #1 for large data volumes, beating quick sort, heap sort etc

I guess if strings are unconstrained (no guarantee of randomness), hash sort will not have any advantage

in both cases, the hash code must be strictly “ordered” , so bucket #1 must hold lowest data items.

== insertion sort
P100. for nearly-sorted data, insertion sort is faster than all other sorts (including bucket sort, radix sorts), often by an order of magnitude

https://stackoverflow.com/questions/736920/is-there-ever-a-good-reason-to-use-insertion-sort shows insertion sort is better than divide-n-conquer for 1) nearly-sorted or 2) small collection

https://www.cs.princeton.edu/~rs/AlgsDS07/18RadixSort.pdf shows that MSD radix sort also need to switch to insertion-sort when sample size is small.

== counting sort
P315. “fastest” sort for the right data set. I think a single English word is suitable. However, for nearly sorted data, insertion sort is still faster.

I guess it’s still slower than insertion sort if nearly-sorted.

== Binary search tree vs hash table
P130 discusses when to use which

locate a pair with targetSum=55 #bbg IV #Morris


Q: any O(1) space sort?

Q: From an unsorted array of positive integers, is it possible to find a pair of integers that sum up to a given sum?

Constraints: This should be done in O(n) and in-place without any external storage like arrays, hash-maps, but you can use extra variables/pointers.

If this is not possible, can there be a proof given for the same?

—–Initial analysis—-
I wish I were allowed to use a hash table of “wanted ” values. (iterate once and build hashtable. For Each new value encountered, check if it is in the “wanted” list…)

I feel this is typical of west coast algorithm quiz.

I feel it’s technically impossible, but proof?  I don’t know the standard procedure to prove O(n) is impossible. Here’s my feeble attempt:

Worst case — the pair happens to be the 1st and last in the array. Without external storage, how do we remember the 1st element?  We can only use a small number of variables, even if the array size is 99999999. As we iterate the array, I guess there would be no clue  that 1st element is worth remembering. Obviously if we forget the 1st element, then when we see the last element we won’t recognize they are the pair.
—–2nd analysis—-
If we can find a O(n) in-place sort then problem is solvable [1]. Let’s look at Radix sort, one of the most promising candidates. https://stackoverflow.com/questions/463105/in-place-radix-sort has an implementation for DNA strings.

Assumption 1: the integers all have a maximum “size” in terms of digits. Let’s say 32-bit. then yes radix is O(n) but not sure about space complexity. Now, with any big-O analysis we impose no limit on the sample size. For example we could have 999888777666555444333 integers. Now, 32-bit gives about 4 billion distinct “pigeon-holes”, so by the pigeon-hole principle most integers in our sample have to be duplicates!

Therefore, Assumption 1 is questionable. In fact, some programming languages impose no limit on integer size. One integer, be it 32 thousand bits or 32 billion bits, could use up as much memory as there is in the system. Therefore, Assumption 1 is actually superfluous.

Without Assumption 1, and if we allow our sample to be freely distributed, we must assume nothing about the maximum number of digits. I would simply use

Assumption 2: maximum number of digits is about log(n). In that case radix sort is O(n log(n)), not linear time:(

—– new analysis as of Jun 2017 —-
Can Morris algo sort an array (external storage)? If yes,  then use the 2-moving-pointer algo in locate a pair whose diff=55

However, Morris needs a tree not an array.

[1] locate a pair with targetSum=55 : pre-sorted #O(N) 1-pass

to a novice programmer — the importance of data structures in financial IT systems

Data structure is essential and non-trivial to every programming language on wall street — java (collections), c++ (STL), python/perl (high-level data types)… A lot of algorithms (in the traditional, strict sense of the word) are created on or using classic data structures. There are a lot of jargons you need to grasp. Everyday programming requires you to be familiar with common operations on these data structures, but these are relatively straightforward. However, here are some of the non-trivial aspects of data structures and


* sort python dict or tuple

* internals of hash table and ConcurrentHashMap

* STL container holding pointers

* STL pair

* STL stream iterators

* customize a sorted set/map for MyClass in C++, which is harder than java

* sort a map by value (real interview question) in java. Perl and python is easy.

* whiteboard-implement a blocking stack/queue (UBS)

* whiteboard-implement a basic hash table (MS)

* find top 99 in a large array (Barc) but no need to return them sorted

* choose array vs circular array vs linked list, in terms of memory efficiency (JPM).

* given 2 nodes in a binary tree, with no pointers except pointer-to-parent, find lowest common ancestor (Google)

* iterative tree walk (Barc)

* filtering iterator (MS)

* STL transform(), erase(), partition() etc

* STL allocators

* STL functors, binders

* STL iterator adaptors (back_inserter etc)

Posted By familyman to learning finance,c++,py… <http://bigblog.tanbin.com/2011/05/to-novice-programmer-importance-of-data.html> at 5/03/2011 07:34:00 PM

100-gunmen puzzle

Problem — Suppose 100 gunmen stand in a big circle arranged by height, clockwise from #1 (lowest) to #100 (highest). In first shooting round, #1 starts off by shooting clockwise, so #2 gone. Then #3 shoots clockwise at #4, and so on. First round ends, and the lowest among the remaining gunmen continues to shoot clockwise. How many rounds will there be and who will be the last standing?
Assume each gunmen stays in his spot, and his spot has his name.
Analysis —
Let me first clarify the terminology.
* Each round has a “starter” shooter. The round ends immediately before he shoots again or immediately before he gets shot.
* Each round has a InitialCount being the number of gunmen immediately before that round.
Solution —
Round 1: starter is #1. InitialCount=100. This is an even round, so the starter remains.
Round 2: starter is #1. InitialCount=50. This is an even round, so the starter remains.
End of Round 2 the remaining shooters are #1 #5 #9… #97. They are 4 (i.e. 2^2) stops apart.
Round 3: starter is #1. InitialCount = 25. This is an odd round, so starter will shift anticlockwise by 2^2 to #97. Why am I so sure? Since InitialCount is odd, the highest (among the 25) gunmen will NOT die and will become the next round’s starter.
End of Round 3 the remaining gunmen are 8 (i.e. 2^3) stops apart. Therefore, highest two are #89 #97.
Round 4: starter is #97. InitialCount = 13. This is an odd round, so starter will shift anticlockwise by 2^3 to #89.
End of Round 4, the starter (#97) is a “sitting duck” to be shot soon.
Round 5: starter is #89. InitialCount = 7. This is an odd round, so starter will shift anticlockwise by 2^4 to #73.
End of Round 5, #89 is a “sitting duck” to be shot soon.
Round 6: starter is #73. InitialCount = 4, which is a power of 2, so #73 will remain the starter for all subsequent rounds and therefore the last standing.

dynamic programming problems – my first look

The concept was well defined by the inventor but is now a bit vague. http://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf is the best I have seen, but also check out http://mat.gsia.cmu.edu/classes/dynamic/dynamic.html . Below are some hallmarks of DP —

[C=comp science]

* solve sub-problems before solving the bigger problem. This approach/strategy is quite general
* [M] Principle of optimality
* [C] optimal substructure
* [M] stochastic dynamic programming
* [M] tail problems
* [C] DAGs

A major category of DP problems involve decision-making at discrete-time stages. Without stochastic elements, all the “contingent” decision-making could be (brought forward and) optimized upfront with clinical precision, like a robot analyzing a maze. In that case the stages are artificial, even unnecessary.

Therefore I now feel _ u n c e r t a i n t y _ is the major differentiator among DP problems. With uncertainty at each stage, we have no choice but break up the problem into periods. Without uncertainty, we can break up the problem artificially by period, or by segment…

Therefore, here are my classification of 3 categories of common DP problems (commonly found in tutorials) —
1) with uncertainty
2) with time periods without uncertainty
3) without time period. You create artificial stages. Often the simplest DP illustrations.

dynamic glass-drop

http://en.wikipedia.org/wiki/Dynamic_programming#Egg_dropping_puzzle is a similar but more ambiguous problem.

If we toss a wine glass out of a 3rd floor window it may break, but it may not if tossed from a 2nd floor window.

Q: with exactly two wine glasses taken from a (Japanese:) production line, you want to identify starting at exactly which level of a 11-floor tower this brand of glass will break if tossed out of the window. Equivalently, determine the highest safe floor. Optimization goal — minimize worst-case cost. Top floor is known to break, so answer might be 1st floor, 2nd floor,… or top floor — 11 possible answers, but 10 floors to test.

Needless to say, if the glass breaks at 6th floor, it will surely break at a higher level.

—- Analysis —-
With just a single glass, you have no strategy. You have to try from 1st floor up. If your classmate starts on 2nd floor and breaks her only glass there, she is disqualified because with no more glass to use, it’s now impossible to know whether it would survive1st floor.

I feel this is NOT a probability problem, so each toss is bound to pass or fail with 100% certainty but we just don’t know it.

We don’t want to toss first glass @2nd, @4th, @6th… because too many first-glass tests. Neither do we want to toss first glass @5th because if broken, then 2nd glass must try @1@2@3@4. Therefore first glass toss should be somewhere between ( @2, @5 ) exclusive.

—-%% solution, with heavy use of symbols —–
I feel this problem is tractable by dynamic programming. Let
* L denote the highest _safe__ floor known so far, and
* U denote the lowest _unsafe_ floor known so far. U is initially at top floor, and L is initially 0, not 1
* u denote U-1

u >= L is an invariant. The Upper and Lower bounds. (u-L) is the maximum number of floors to test. When we get u-L = 0 and final Answer is the current value of U.

Here’s a typical example
6 —- U known unsafe
5 —- u, so 3 floors to test (u-L)
2 —- L known safe

z8 denotes the worst case # tests required to locate the Answer with both glasses when u-L = 8 i.e. 8 floors to test. Could be z(L=0, u=8) or z(L=1,  u=9). Same thing.

For example, z(1,3) means worst case # tests when 1st and 4th storeys need no testing. Note z(1,3) == z(2,4) == z2

z1 = 1 i.e. one toss needed
z2 = 2
z3 = 2 // B! C || BA. The untested floors are named (upward) ABCDEFG (if 7 floors to try), and trailing ! means broken.

z4 = 3 // B! CD || BA
z5 = 3 // C! +2 || C then z2 // if C broken, then must test last glass at D and E
z6 = 3 // C! +2 || C z3  // ABCDEF
z7 = 4 // (D! + 3 || D z3) or (C! + 2 || C z4)
z8 = 4 // (D! + 3||D z4) or the smarter (C! + 2||C z5)
z9 = 4 // (D! + 3||D z5) or (C! + 2 || C z6)
z10 = 4 // (D!+3||D z6). Can’t use (C! + 2 || C z7) — for a 11-storey tower, we need 4 tosses.

I think there are cleverer implementations. This is not brute force. It builds up useful results with lower z() values and use them to work out higher z() values — dynamic programming.

This solution not only finds where to start the test and the worst-case # tests. It also works out the exact decision tree.

dynamic poker game in Zhou Xinfeng book

Q: you are dealt the 52 cards in a random sequence.
Each red card earns you $360
Each black card costs you $360
Game ends at end of the 52 draws or any time you quit. How do you optimize your earning and how much is admission ticket worth?

Let’s denote the amount of money you take home as H, a random variable. Your net profit/loss would be H minus admission price. If 555 reasonable/intelligent people play this game, then there would be five hundred and fifty-five values of H. What’s the average? That would be the answer.

p or “+” denotes the # of positive cards earned so far
n or “-” denotes the # of negative cards suffered so far

Exp(H|p=25,n=26) = Exp(H|p=26,n=26) = $0.
There’s an intrinsic value in our hands, Defined as i=(p-n). The e-value or Extrinsic value Evaluated from Expectation, may be higher or lower. Whenever i-value > e-value, we should exit. This is our strategy/policy.

Whenever intrinsic value <$0, E(H) is out of the money but always non-negative because we can wait till maturity and get all 52 cards.

E(p24,n26) = p(next is p)E(p25,n26) = 100%E(p25,n26) = $0
E(p26,n25) = $360     because we will exit

E(p25n25) = Prob(next is p)E(p26,n25) + Prob(next card is n)E(p25,n26) = 50%$360+0 = $180
E(p24n25) = p(p)E(p25,n25) + p(n)E(p24,n26)= p(p)E(p25,n25)+$0 = 66%$180= $120
E(p25n24)= p(p)E(26,n24)+p(n)E(p25,n25)= 33%$360×2 + 66%$180= $360

E(p24,n24)= half of E(25,24)+E(24,25)= half of  $360+$120= $240
E(p23,n25)= p(p)E(24,25)
+ p(n)E(p23,n26) # $0
=3/4x$120= $90
E(p23,n24)= p(p)E(24,24)+p(n)E(23,25)= 3/5 .$240+2/5 .$90

16bit page-counter to Estimate when a webpage is hit the 1,000,000th time

We don’t need to estimate the hit Frequency which could be time-varying.

If we can use a 2 counters, we can precisely determine the 1,000,000th hit. One slow-counter one regular fast counter. But that means we use 32 bits for our counters, like a 32-bit counter!

void increment(){ // was named count()?
  static int count16bit;

  long unsigned int now = system_time_in_nanos();
  if (now%16 == 0){ //just check last 4 bits of “now”
     //now check count16bit
     if  (count16bit <= 0){cout<<"got it"<<endl; }

https://en.wikipedia.org/wiki/Approximate_counting_algorithm#Algorithm is the standard implementation of a similar idea.

O(n) sort – count`sort

#1 requirement: keys [note 1] be very small integers, small enough to be array indices.
# 1.1 I think this requirement is met whenever you notice an enum of “possible values”

[1] “values of keys” is perhaps more accurate.

This requirement alone reduces its applicability to a “niche”, but an extremely important niche. In fact, i think every time this requirement is met, we should consider count`sort. How about sorting the letters in a single english word? Remember the “anagram” question?

If the values are small integers of 0,1,2…k, then u need a temporary array of k+1.

count`sort is stable, and therefore usable in radix sort.

practical sort algorithms in popular languages

Important for …. real world software, interviews..

# 1) mergesort — chosen as
– primary sorter of java,
– primary sorter of perl,
– primary sorter of python

#2) quicksort — chosen by
– primary sorter of dotnet’s List.sort()
– primary sorter of STL (in a modified form — introsort)

heapsort – used in embedded, also part of the market-leading introsort
[iv] counting sort? important to natural words and numbers?
radix sort — important to natural words and numbers

# eor
[iv] bubble sort — easiest to code in an interview:)
[iv] insertion sort — simple
[iv] bucket
[iv] counting
[iv] radix


difference between Set and List

* Fundamentally, a list is maintained in insertion order physically, whereas a Set can optionally maintain insertion order (such as LinkedHashSet).

* Because of the insertion order, List can be the basis of Queues (FIFO) and Stacks (LIFO). A Set can’t.

* A set can stay sorted (eg TreeSet), whereas a list is seldom sorted. The most useful List implementations don’t “like” to stay sorted.

* Look-up i.e. finding an element, is faster in Set than List.

* In terms of API, the primary difference is — Set won’t add an object if it is equal to any existing element (determined by equals() method).

java hash table taking in a new node

incoming “guest” — a key-value pair, to be stored. We attach a 3rd item — computed hash code or compressed index — to each guest.

Actually there are other “things” attached to each guest. These tend to distract you when you first study hash tables —

* a pointer. When storing data, we usually store pointers to original data. In Java, Original Keys and Values are always Objects never primitives, so they are always always stored as pointers.

* a bucket id ie the array slot

* a next-pointer since each node is in a linked list

additional space requirement of quicksort

In-place sorting like quicksort has low space requirement. O(log N) space complexity is considered low. I think it’s probably the minimum for comparison-sorts.
Iterative is an alternative to recursive. Iterative lets first call complete before starting 2nd call — no call stack. Recursive is characterized by a deep call stack (of depth log N). Think onion. Think LIFO.
For any resursive sort, you usually need O(s) space where s = stack-depth. Since most (?) well-known divide-and-conquer sorts involve log N divisions and log N nested calls, space complexity is presumably O(log N)?
Now what’s the constant factor in O(log N)? At least the number of local variables defined in the subroutine. Each call on the stack maintains that number of local variables.
“additional” means besides space for the N elements.

a couple of realistic sorting challenges

Update: [[algo in a nutshell]] written by professors, has practical problem-solving suggestions.
Q: what if the storage system holding the long list doesn’t support random access, such as tape?
%%A: read into memory first, by chunk, then merge sort?

Q: what if there are too many items to hold in a computer’s RAM available within our budget? merge sort, but the bucket idea below is better.

Q: what if data is still arriving when we need to start sorting?
%%A: standard merge sort
%%A: if you know data is string and distribution is fairly uniform, you may want to create 26 (or 26+10) trees for A – Z, 0-9 and put incoming data therein. If data rate is too high, you can build 26 queues/circular arrays as the buffer in a producer/consumer pattern.
%%A: even with numeric input, you can do the same provided you know the distribution. In real system, use histogram to estimate the distribution.

I feel this is one case that needs real world experience. Designing a sorter without any idea of distribution is too academic and a waste of time.

hash search again, briefly

Q: how can hash search be O(1) when the separate chaining gets longer and longer as input size grows?
A: hash table length grows in proportional to input size. I guess expected separate-chaining length doesn’t get longer but is independent of input size.

Most hash functions assume positive integer keys. If u have some other inputs, convert them to natural numbers.

Java probably uses MAD (multiply-add-divide) compression after hashing.

a less ambiguous explanation of big O

–If you can only remember one phrase, remember “bound-from-above”

If we say “Algorithm R is O(lg n)”, then number-of-iterations are bound-from-above by lg n multiplied by a constant.

–If you can remember a 2nd phrase, remember “worst case upper bound”

Worst case upper bound is often different from average-case upper bound of a program’s running time. For quicksort, they are O (n-squared) vs. O(n lg n). If we talk of O() in the absence of *-case, it means worst-case-upper-bound. A really long translation of “Algorithm R is O(n-squared)” would be

“iterations are bound from above by n-squared even for the worst-case input array of size n.”

A shorter version: “At most n-squared interations for the worst input”

Average-case-upper-bound calculation often (if not always) involves probability. Amortized analysis is probability-free.

O(n) sort – radix sort

The principles are relevant to pure-algo interviews.

Q: quicksort is universally usable. How about radix sort?
A: not widely mentioned as a drawback against quicksort. (In contrast, I think Counting-sort is not universally usable…)
A: wikipedia says integers could [1] represent limited-length strings of characters (e.g., names or dates) and specially formatted floating point numbers, radix sort is not limited to integers.

Radix sort is good for parallel computing. See http://www.drdobbs.com/parallel/parallel-in-place-radix-sort-simplified/229000734

Radix-sort Expected running time is faster than quicksort? Could be but….
… but the hidden constant factor in O() is much worse than quicksort.

[1] It’s hard to prove “cannot”

KMP algorithm for string search

Q: Why is std::search() and string::find() use simple O(N*M) solution and not using kmp?
A: kmp requires extra memory to store the pre-computed data structure. Allocating the storage is likely more costly than the big-O savings,
A: in the common use cases, the strings are not super long and there are not many repetitions in the pattern, so kmp will not help a lot
A: therefore, overall, the simple implementation is faster, even though big-O is inferior. I believe big-O comparison really compares large values of N
–prefix-function in one sentence: “longest border of a partial-match needle[1..j-1] that is not followed by the character needle[j] “. In other words, we are looking for the largest integer k such that

* needle[1..k-1] is a border of needle[1..j-1] and

* needle[1..k] is not a border of needle[1..j].

needle[1..j-1] is a “partial-match” and needle[1..k-1] is “border” of the partial-match. The border occurs at both ends of the partial-match.


finite automata for string matching

–First, Take the pattern as an array [Note 4]. You can move a pointer to between any neighbours [note 1], to indicate how many leading characters [note 2] match at this moment.

There’s a rule — pointer can move forward (ie rightward) only one position at a time. However, the pointer often jumps backward (leftward).

I call it the s-pointer. It’s a secret pointer. It’s also a state-pointer as it determines the state the automaton is in.

[ Note 4: array of chars, laid out left-to-right ]
[Note 1: Each position of the pointer is one of the so-called states. ]
[Note 2: of the pattern. The “leading charaters” form a so-called prefix of the pattern ]

–Once you understand the s-pointer, you can now appreciate how s-pointer creeps forward or jumps back as you “eat” input characters one by one.

As you move rightward (“eat”) through any text [Note 3] one char at a time, s-pointer creeps forward or jumps back to indicate how many leading characters of pattern match at this moment.

[ Note 3: the text can grow indefinitely. Our automaton just keeps shouting “Match, Match” ]

O(1) constant complexity means ….

1st type of time-complexity to study is the “linear growth” — O (N). As list size N tripples, running time tripples.

“constant complexity” means …. As list size N grows, running time doesn’t grow with N. It’s independent of input size. I think running time may still vary as list grows, but much slower than log N .

Example: somehow, hash search has a constant complexity for search and insert, independent of list size or hash table size(?)

hash search, briefly

hash function produces a table *address* for each input key — no comparison required. Therefor hash represents the 2nd major type of search algorithm, after comparison-based algorithms like Binary Search Tree, or red-black tree

usually the fastest choice for symbol table search

widely used.

constant time for insert and search, but not some other operations like … sort.

hash output should be evenly distributed across table addresses.

modular hashing should use large prime numbers but why? P174 [[java generics]]

modular hashing need adjustment for long alphanumeric keys cos hardware can’t handle 4049040490934 bits for a single variable. Perhaps apply mod-multiple(by 256)-add to (MSB, next MSB, next next MSB..)

Step 1 of hashing algorithm: hashcode(key) to convert key into table address
Step 2 of hashing algorithm: collision resolution — separate chaining. One separate unordered linked list per shared table-address

big O, in%%lang

O, o, Omega and theta all try to answer the question “for this algorithm, how many more iterations when N triples?” (Avoid “doubles” to avoid potential ambiguity about 2), where N measures the input.

Example: For the quicksort algorithm, answer is O (N * log N), meaning “iterations increase by a factor of (3 * log 3) when input triples”
Example: For the binary search algorithm, answer is O (log N), meaning “(log 3) times more iterations when input triples”

O () means “no more than iterations”.
O is an upper bound, not a tight bound