# sorting collection@chars|smallIntegers

Therefore, given a single (possibly long) string, sorting it should Never be O(N logN) !

Whenever an algo problem requires sorting a bunch of English letters, it’s always, always better to use counting sort. O(N) rather O(N logN)

Similarly, in more than one problems, we are given a bunch of integers bound within a limited range. For example, the array index values could be presented to us as a collection and we may want to sort them. Sorting such integers should always , always be counting sort.

# topN,N-th ranked within a bag: quickselect

For any top-N problem, and for any problems about N-th ranked item,

Fairly efficient solution is the quickselect algo

# find any black-corner subMatrix #52%

https://www.geeksforgeeks.org/find-rectangle-binary-matrix-corners-1/

Q: given a black/white matrix, find any rectangle whose all four corners are black.
Q2: list all of them

— idea 2: record all black cell locations and look out for 3-corner groups

a Rec class with {northwest corner, northeast corner, southwest corner}

first pass, For each pair on the same row, create a pair object and save it in hashmap {northwest cell -> list of x-coordinates on my right} We will iterate the list later on.

2nd pass, scan each column. For each pair on the same col, say cell A and C, use A to look-up the list of northeast corners in the hashmap. Each northeast corner B would give us a 3-corner group. For every 3-corner pack, check if the forth corner exists.

— idea 1: row by row scan.
R rows and C columns. Assuming C < R i.e. slender matrix

For first row, record each ascending pair [A.x,B,x] (No need to save y coordinate) in a big hashset. If S black cells on this row, then O(SS).

In next row, For each new pair, probe the hashset. If hit, then we have a rectangle, otherwise add to the hashset. If T black cells on this row, then O(TT) probes and (if no lucky break) O(TT) inserts.

Note one single hashset is enough. Any pair matching an earlier pair identifies a rectangle. The matching would never occur on the same row 🙂 Optionally, We can associate a y-coordinate to each record, to enable easy calculation of area.

After all rows are processed, if no rectangle, we have O(SS+TT+…). Worst case the hashset can hold C(C-1)/2 pairs, so we are bound by O(CC). We also need to visit every cell, in O(CR)

If C > R, then we should process column-by-column, rather than row-by-row

Therefore, we are bound by O( min(CC,RR) + CR). Now min(CC,RR) < CR, so we are bound by O(CR) .. the theoretical limit.

— idea 4: for each diagonal pair found, probe the other corners
If there are H black cells, we have up to HH pairs to check 😦 In each check, the success rate is too low.

— Idea 5: Brute force — typewriter-scan. For each black cell, treat it as a top-left, and look rightward for a top-right. If found, then scan down for a pair of bottom-left/bottom-right. If no then complete the given row and discard the row.

• once I find a black cell on current row, I put it in a “horizontal” vector.

# average_case ^ amortized

–> bigO (and big-Theta) is about input SIZE, therefore, usable on average or worst case data quality

There’s no special notation like bigX for average case.

–> Worst vs average case … is about quality of input data.

With hash table, we (novices) are permitted to say “quality of hash function” but this is incorrect.  My book [[algorithms in a nutshell]] P93 states that IFF given a uniform hash function, even in worst case bucket sort runs in linear time O(N).

–> Amortized … is about one expensive step in a long series of steps like inserts. Note Sorting has no amortized analysis AFAIK.

worst case amortized — is a standard analysis
Eg: vector expansion. Worst-case doesn’t bother us.

eg: quicksort+quickSelect can degrade to O(NN). Amortization doesn’t apply.

average case amortized — is a standard analysis
eg: hash table collision. A few operations happen to hit long chain, while most operations hit short chain, so O(1) on average.

Amortization is relevant iFF an insert triggers a hash table expansion.

What if the input is so bad that all hash codes are identical? Then amortization won’t help. For hash tables, people don’t worry about worst case, because a decent, usable hash function is always, always possible.

average case per-operation — can be quite pessimistic
eg: vector insert

worst case per-operation —

# RBTree O(1)insert quite common ] coding IV

— for single-insert
https://en.cppreference.com/w/cpp/container/map/insert is precise saying that hinted insert is O(1).
— for mass insert
Special case — If we know the input sequence is pre-sorted and going to be “appended” to existing tree nodes, then each insert will always hit the end(). We can rely on hinted insert to achieve O(N) .

http://www.cplusplus.com/reference/map/map/insert/  is even more encouraging — ” If N elements are inserted, Nlog(size+N) in general, but linear in size+N if the elements are already sorted” so as long as pre-sorted, then we get O(N)

## For 64-bit integer or float inputs, we can radix sort in O(N) then mass-insert in O(N).

— for construction from a pre-sorted sequence
https://en.cppreference.com/w/cpp/container/map/map confirms it’s O(N) if the input range is already sorted.

# pre-allocated array as backing store for graph nodes #java No

I think any graph node can use the same technique, but here I present a simple yet interesting use case — a linked list with each node allocated from an array. https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_66mem/slistFromArray.cpp shows three home-made implementations:

1. backing array of dummy link nodes, pre-allocated at compile time
2. backing array of dummy link nodes, pre-allocated from free store aka DMA
3. backing array is a byte array on heap or data section. Each link node is constructed via placement-new.

Here are a few Advantages that I consider minor because linked list is seldom needed in low-latency

1. d-cache efficiency
2. eliminates runtime load on heap allocator, since memory is pre-allocated. See malloc=long considered costly

Advantage #3: For c++ algo questions, this set-up has an interesting advantage — The node address is now an index into the backing array. This index is a natural auto-increment ID , based on creation order.

Now, the biggest advantage of linked list over vector is mid-stream insert/delete. One of the biggest disadvantages is lack of random-access. If nothing happens mid-stream (as in coding questions), then we can achieve random-access-by-id using array as backing store.

If nothing happens mid-stream, then this linked list is physically similar to an array with extra capacity.

This technique won’t work in java because java array of Node is array-of-pointers.

# QuickSelect: select nth key#O(1)space

Quickselect has average O(N) runtime but worst-case O(NN), if the partitioning goes bad repeatedly. Randomized pivot can reduce the chance but I don’t think we can eliminate it

Space complexity is O(1) despite the apparent recursive set-up. Tail recursion optimization is usually available to reuse the same stack frame. Otherwise, recursion can be replaced by a loop.

std::nth_element() is linear on average but quadratic in worst case — https://stackoverflow.com/questions/11068429/nth-element-implementations-complexities explains QuickSelect algo

Quickselect is by the same inventor of quicksort.

https://en.wikipedia.org/wiki/Quickselect

# iterate K pre-sorted uneven immutable lists #FB

Interviewer (Richard) was kind enough to offer me a tip early enough. so we didn’t waste time (which could easily result in out-of-time)

Q: given K pre-sorted immutable lists, each up to N items, return an iterator that on demand yields each of the (up to K*N) items in sorted sequence.

Estimate time and space complexities.

====analysis

— I first proposed pair-wise merge. Since there are logK passes, Time complexity == O(NK logK)

Space complexity is tricky. Very first pass i have to create a single list up to NK items. Then I can reuse this list in each merge. so space complexity == NK [1], but I said NK*logK. Interviewer wasn’t familiar with this solution and didn’t correct me.

[1] See https://www.geeksforgeeks.org/sort-array-two-halves-sorted/. Suppose 8 lists to merge. I will merge A+B into first quarter of the big array (size NK), then C+D into 2nd quarter… In next pass, I will merge AB + CD in-place using the first half of my big array.

The advantage of this solution — once I create a single merged array, each iteration is O(1). This can be valuable if run-time iteration speed is crucial but initial warm-up speed is unimportant.

### bigO insight — merging N pre-sorted arrays is O(N logN), same as merge sort?

— Then interviewer suggested iterating over the K lists so I implemented the solution in https://github.com/tiger40490/repo1/blob/py1/py/88miscIVQ/itr_K_presortedLists_FB.py

• Space complexity = K
• Time complexity:
• next() O(logK) due to popping. I said Fib heap has O(1) insert
• init() O(K)
• hasNext() O(1)

How about a RBTree instead of heap? Space complexity is unchanged.  Time complexity:

• next() O(logK) for both remove and insert
• init()  O(K logK), worse than priorityQ
• hasNext() O(1)

— FB interviewer asked why I prefer to keep state in global var rather than a per-record field

%%A: Runtime heap allocation is slower if the record is bigger. In contrast, the global dictionary is tiny and likely lives in L1-cache

# when2introduce new tokens into O()

The extra tokens like A  B C in a O(A+BB+logC) are not always independent dimensions.

• Case: DFT/BFT have O(V+E) but some people may say V <= E, so why not remove V and say O(E).

I think this estimate misses the point that E could greatly outnumber V. If another algo is O(V+logE) it would probably win.

• case: radix sort is O(WN), but some may say W <=N, so why not remove W and say O(NN)

Well, W is typically log(unique count among N), though W can be half N.

Some bigO students don’t like too many tokens and may simplify it to O(KK), but I think it’s imprecise. If M or N is very small like logK, then O(MN) is much faster than O(KK).

# bigO usable in Average | worst-case #CSY

bigO is about large input data SIZE or GROWTH, not about input data quality.

Therefore, both average-case and worst-case can use bigO notation. Multiple sources specify “average case O(N logN)” —

big-Oh means upper-bound. big-Theta means tight-bound. They refer to the growth of running time as input data grows, regardless of data quality.

# FizzBuzz in O(N)

Q (Leetcode ) Write a program that outputs the string representation of numbers from 1 to n. But for multiples of three it should output “Fizz” instead of the number and for the multiples of five output “Buzz”. For numbers which are multiples of both three and five output “FizzBuzz”.

What if the divisors in the requirement is not two but much more and each a prime number?

====analysis

In the standard solution, there’s a hashtable O(NK) algo — for every int 1 to N, check the K divisors.  Here’s a O(N) solution:

• first populate the return vector with the basic strings
• Then make a strided iteration to append “Fizz” on every multiple of 3.
• Then make a strided iteration to append “Buzz” on every multiple of 5.

The fact that prime numbers become very high very soon help our bigO:

N/3 + N/5 +… + N/31 + N/37… will be lower than 5N

# ## halo knowledge to impress west coast interviews

My naive list of "halo" technical knowledge that might impress some west coast interviewers.

• order stat tree
• Morris tree walk – O(1) space
• path compression in union-find
• hash table worst case is no longer O(N) as in java8
• shortest-path algorithms
• Fibonacci and other advanced heaps — I discussed it at Facebook
• tail recursion to reduce space complexity
• worst case quick sort and randomized pivot selection
• bucket sort
• insertion sort as the fastest sort
• radix sort for floats and strings

# BST: finding next greater node is O(1) #Kyle

https://stackoverflow.com/questions/11779859/whats-the-time-complexity-of-iterating-through-a-stdset-stdmap shows that in a std::map, the time complexity of in-order iterating every node is O(N) from lowest to highest.

Therefore, each step is O(1) amortized. In other words, finding next higher node in such a BST is a constant-time operation.

https://en.cppreference.com/w/cpp/iterator/next confirms that std::next() is O(1) if the steps to advance is a single step. Even better, across all STL containers, iterator movement by N steps is O(N) except for random-access iterators, which are even faster, at O(1).

# make unstable sorts stable

For any unstable sort, we can make it stable with this simple technique.

Aha — I have to assume two items of equal key have visible differences such as address. If they are indistinguishable then sort stability is moot.

I will first build a hashtable {address -> origPos}

After the unstable sort, all the items with same key=77 will cluster together but not in original order. Within this segment of the output, I will run another mini-sort based on the origPos values.

This doesn’t affect time complexity. In terms of space complexity, it’s O(N).

# CIV: I still prefer RBtree #Rahul

• goal #1 in CIV — speedy completion of an optimal solution in terms of O().
• j4 #1 — RBTree is more intuitive more natural to my problem solving, more than priorityQ and sorted vector. Given my #1 goal, I would go for my favorite tools.
• j4 — if the interviewer gives a new requirement, my RBtree may become useful (51% chance) or become unusable (49% chance)
• Drawback #1 of RBtree — not supported in python
• Drawback  — array sorting can achieve O(N) using radix sort or counting sort, esp. in the contrived contexts of Leetcode problems.

Q: what if interviewer feels RBtree is overkill and over-complicated?
A: I would say overall bigO is not affected.
A: I would say RBTree supports future features such as delete and dynamic data set.  Realistic reasons are powerful arguments.

Q: what if interviewer gives a follow-up request to simplify the design?
A: if I already have an optimal solution, then yes I don’t mind replacing my RBTree

# hash table time complexity: O(N)^O(logN)^O(1)

Q: When interviewer ask about time complexity, shall I point out my hash table is O(log N) not O(1)?
A: if time permits, I would briefly mention O(N) and O(logN) worst case
A: if no time, then just say typically O(1). I don’t think we would lose many points.

https://stackoverflow.com/questions/9214353/hash-table-runtime-complexity-insert-search-and-delete has a fairly authoritative answer. Basically, it is indeed O(N) worst case in general but

• can improve to O(logN) in many situations where java8 technique applies
• in other cases it is O(N) in theory, but extremely rare in practice. The only known case is a deliberate DoS attack.

# nth largest element in unsorted array #QuickSelect

Q: Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

I think this is mostly a bigO algoQQ problem.

std::nth_element is linear on average .. https://stackoverflow.com/questions/11068429/nth-element-implementations-complexities talks about QuickSelect algo

— idea 6: priority Q (Fib-heap) of size k
if any item is higher than the min, then pop min O(logK) and insert in O(1)
— idea 6: priority Q
Linear time to build it
— idea 5: 95th percentile problem from NYSE
— idea 4: multiple scans
— idea 3: segments
— Sol2: O(N). use the O(N) algo in the blog on “given int array, find median ] O(N)”. Then discard one of the two segments. Then repeat.
Note: Each time the target element must exist in one of the 2 segments.

O(N) + O(N/2) + O(N/4) … -> O(N)

— Sol2a: Use the O(N) qsort partition algo to anchor a (random) pivot element to create two segments. Our target must exist in one of the two, so discard the other by adjusting the le/ri boundaries.

This idea is same as the most voted solution in leetcode discussion.
O(N) on average — we get O(N)+O(N/2) + O(N/4) + … < O(2N)

Note average complexity is acceptable in hashtable!

# union-find O(1) data structure

[[CLRS]] has a chapter on disjoint set. It shows a simple implementation of disjoint set using a linked list. Another implementation is a tree. I also used a hashset.

The tree implementation is described in https://en.wikipedia.org/wiki/Disjoint-set_data_structure . This data structure is probably not used in the union-find algo challenges.

The IV algo challenges are usually small scale. They can use a primitive, trivial version of disjoint-set but doesn’t really gain any meaningful performance advantage. I think this attitude towards disjoint-set is similar to the prevailing attitude towards hash table — it is just a means to achieve big-O demands. With or without this data structure, the problem is solvable but the real challenge is the big-O requirement.

Two main operations can both be optimized — 1) Union() can use by-rank or by-size and 2) Find() can use path compression. Both optimizations keep the tree height very low. High fan-out is desirable and achievable.

Using both path compression and union by rank or size ensures that the amortized time per operation is essentially O(1).

# #1(reusable)AuxDS for algo challenges

Here’s a Reusable data structure for many pure algo challenges:

Pre-process to construct a static data store to hold a bunch of “structs” in linked list -OR- growing vector -OR- growing RBTree , all O(1) insertion :). Then we can build multiple “indices” pointing to the nodes

Here are a few O(1) indices: (Note O(1) lookup is the best we can dream of)

• hashtable {a struct field like a string -> iterator into the data store}
• array indexed by a struct field like small int id, where payload is an iterator from the data store
• If the structs have some non-unique int field like age, then we can use the same key lookup to reach a “group”, and within the group use one (or multiple) hashtable(s) keyed by another struct field

I think this is rather powerful and needed only in the most challenging problems like LRU cache.

# word ladder: build graph +! test`every pair

Leetcode Q 127: word ladder — Given two words (beginWord and endWord), and a word list, find the length of shortest transformation sequence from beginWord to endWord, such that:

* Only one letter can be changed at a time.
* Each transformed word must exist in the word list. Note that beginWord is not a transformed word.
* Return 0 if there is no such transformation sequence.
* All words have the same length.
* All words contain only lowercase alphabetic characters.
* You may assume no duplicates in the word list.

Example 1:
beginWord = “hit”,
endWord = “cog”,
wordList = [“hot”,”dot”,”dog”,”lot”,”log”,”cog”]
Output: 5
Explanation: As one shortest transformation is “hit” -> “hot” -> “dot” -> “dog” -> “cog”,
return its length 5.

Example 2:
beginWord = “hit”
endWord = “cog”
wordList = [“hot”,”dot”,”dog”,”lot”,”log”]
Output: 0

==== analysis:
First scan O(NN)to build the bidirectional edges of the graph. Given an existing graph of N words (N can be 1), a new word is compared against each to construct the edge list.

N(N+1)/2 comparisons. No need to be extra clever as this simple O(NN) algo is optimal for small N. No need to worry about the visualization of the graph either because the edge list is a proven representation

Aha — depending on the relative size of N and S (the standard length of every word), the optimal algo is different !

— Idea 2

Now I realize there’s a more efficient algo to build the graph, based on the neglected fact that all strings have equal length.

• for a new word (always same length S), like abc, create S(=3) “patterns” — *bc, a*c, ab*.
• each pattern will try to join an existing club or create a new club. Clubs are maintained in a hashtable of {pattern -> list of original words} In java or python, each word is basically a pointer to an an immutable global object.
• If joining an existing club, then all existing club members are linked to the new word, so new word will now hold a reference to this club as an “edge list”
• I think this algo is O(N*S). If S is small, then this algo is more efficient than O(NN)

At the end of this phase, if beginWord and endWord belong to disjoint sets then return 0. However I see no simple implementation of disjoint set. Therefore, I will run 2nd scan O(N+E) BFS. But there are many cycles, so we need a hashset “Seen”, or array of size N.

Insight — Array is more elegant than hash table in this case.

To compare two words, char-wise subtraction should give all zero except one char. This last routine can be extracted to a “simple routine to be implemented later”, so no need to worry about it in a white board session.

RBTree O(1) insert is quite common in coding questions.

[1] Contrary to popular belief, RBTree mass insert (including mass-construction) is O(logN) rather than O(1) per node in the general case. However, see link above.

See lecture notes https://courses.cs.washington.edu/courses/cse373/02au/lectures/lecture11l.pdf and SOF post on
https://stackoverflow.com/questions/6147242/heap-vs-binary-search-tree-bst

# Alien dictionary

https://leetcode.com/problems/verifying-an-alien-dictionary/

Suppose Total C characters, and N words

====analysis

Mostly implementation challenge.

insight — Published solution is mediocre performance as it scans each word exactly TWICE, but luckily “twice” doesn’t affect bigO — O(total char count across all words)

— idea 1: maintain a linked list of “clusters”. Each cluster is {pos, startWordID, optional lastWordID} Each cluster has words with the same prefix up to pos.

copy first letter of N words into an N-array. verify this array is sorted. Now separate the words into up to 26 clusters. Suppose we a cluster of 55 words. This cluster is the payload of a link node. When we look at 2nd char within this cluster, we see up to 26 sub-clusters, so we replace the big cluster with these sub-clusters.

Invariant — the original order among the N words is never changed.

Even if this idea is overkill, it can be useful in other problems.

the verify(arrayOf1stChar) is a util function.

— Idea 4: convert each word to an English word, in O(C).

Then sort the collection. What’s the O()? O(N logN C/N) = O(C logN)

— idea 5: compute a score for each word and check the words are sorted in O(N)

Q: create an unordered multiset with O(1) add(Item), del(Item) and a getRandom() having the  probability of returning any item  based on the PMF.

Rahul posed this question to our Princeton PhD candidate, who needed some help on the Bag version.

====my solution:
On the spot, I designed a vector<Item> + hashmap<Item, hashset<Pos>>. The hashset records the positions within the vector.

Aha — Invariant — My vector will be designed to have no empty slots even after many del(). Therefore vec[ random() * vec.size() ] will satisfy getRandom() PMF.

add() would simply (in O(1)) append to the vector, and to the hashset.

— del() algo is tricky, as Rahul and I agreed. Here’s an illustration: Let’s say Item ‘A’ appears at positions 3,7,8,11,16 and B appears at positions 2,5,31 (the last in the vector). del(A) needs to remove one of the A’s and move the B@31 into that A’s position.

1. Suppose the PMF engine picks vec[11] which is an A.
2. unconditionally O(1) find the item at the last position in vector. We find a B, which is different from our ‘A’
3. Here’s how to physically remove the A from position 11:
4. O(1) replace ‘A’ with ‘B’ at position 11 in the vector
5. O(1) remove 11 from A’s hashset and add 11 into B’s hashset, so A’s hashset size decrements.
6. O(1) remove 31 from B’s hashset, so B’s hashset size remains

# average O(): hashtable more IMperfect than qsort

In these celebrated algorithms, we basically accept the average complexity as if they were very likely in practice. Naive…

In comp science problems, hash table’s usage and importance is about 10 times higher than qsort

• I would say qsort is faster than many O(N logN) sorts. Qsort can use random pivot. It degrades only if extremely “lucky;)” like getting a “6” on all ten dice.
• In contrast, hash table performance depends mostly on programmer skill in designing the hash function, less on luck.

Performance compared to the alternatives — qsort competitive performance is pretty good in practice, but hash table relative performance is often underwhelming compared to red-black trees or AVL trees in practice. Recall RTS.

# fastestSortFor: 1-str; arr@Str; 64bit floats #non-comparison

I think the big-O coding question may require this knowledge, because

….real world sorting problems can often use key-based sorting, rather than comparison-based sorting.

Therefore, if your overall solution requires sorting, don’t assume it would become O(J*K+M^2+P* N logN) based on your O(N logN) sorting assumption. Here are some achievable linear-time sorts:

 sort O(?) data type ref notes counting O(N) chars radix O(N) 64-bit ints O(Nw) -> O(N) radix O(N) floats [3][4] IEEE format radix O(N) variable-size strings [1/2] #1 fastest burst O(N) variable-size strings #2 fastest. key-based, trie-based radix O(N) date/time converting to fixed-length string or fixed-size integers bucket? O(N) random ints in python/perl unbounded… No limit on #digits

# O(1)space,O(1)search,O(N)sort: tricks

• every problem asking “top N” — quick-select gives average linear time. Real upper bound is O(NN)
• Every time I see O(1) space required on an array problem, I think of …. swapping.
• Every time I see O(1) space required on small int arrays, I think of … save (2D) array indices in the array as payload
• Every time I see O(1) space required on a list problem, I ask is it a ….. linked list.
• Every time I see O(N) time required on an array problem, I think of … radix sort, applicable to 64-bit integers, 64-bit floats and strings.
• Every time I see O(1) search, I think of … hash table and radix …?
• Every time I see O(1) operation on a sorted data structure, I think of …. append on BST/priorityQ, deque

# merge K presorted lists #O(what)

Q: Merge k sorted linked lists and return it as one sorted list. Analyze and describe its complexity.

Note K could be much larger than N.

I feel this is mostly an optimization challenge. I can think of a few solutions

–Sol1: merge 2nd list into first. Then merge 3rd list into first …

https://leetcode.com/problems/merge-k-sorted-lists/solution/ shows that this has higher runtime cost than the brackets solution.

Reason is, each 2-merge-to-1 must visit every node in both lists. So the first list nodes get visited K times!

–Sol1b: brackets.

There are only (log K) levels in the bracket so any list gets visited that many times.

–Sol3: in-place (inefficient)

We maintain K node-pointers for the K lists (K teams)

We also maintain a pointer to the last-added node in the merged list.

first node in K lists are put into a min-heap. Winner (smallest) team would be the “current list”. Now the winner team offers next node and add it into the heap. Winning team ..

What if N=1 and K is 1 billion?

# bucketSort: O(N+k) could be better or worse than O(N)

Based on the 53-vote top answer in https://stackoverflow.com/questions/7311415/how-is-the-complexity-of-bucket-sort-is-onk-if-we-implement-buckets-using-lin

It’s obvious (by definition) that O(2N) is equivalent to O(N).

O(N+k) is worse than O(2N) when k is larger, like 4000 times N. For a concrete illustration, say, we have N=5 million strings to sort, using k = 20 billion buckets.

• It takes constant time to bucket each string, so the first step takes O(N) i.e. grows proportional to N.
• 2nd step is now dominated by k, since all 20 billion buckets have to be examined. Time complexity here is O(N+k). This is explained in the stackoverflow answer.
• Therefore, total complexity is O(N+k) i.e. cN + bk with two constant factors c and b. Even if b is small (tcost of looking at an empty bucket), as k grows, at sometime the k term would dominate the N term.

In practice, k is often chosen to be comparable to N, so I don’t think this is a practical issue.

# min-stack #bbg

My friend Abhinav (not his real name, to protect his privacy) got this question at Bloomberg internship interview. I added some details to make it clearer:

Q: Design a stack that supports push, pop, top, and retrieving the minimum element all with O(1) time complexity in the worst case.

There exists a function compare(Item A, Item B) that returns 1 if A is greater, 0 if equal, and -1 if A is smaller.

• getMin() — Retrieve the minimum element in the stack.
• push(x) — Push element x onto stack.
• pop() — Removes the element on top of the stack.
• top() — Get the top element.

==== analysis =====

The most efficient getMin() data structure is the binary heap, but insert/delete runs in O(logN). Therefore I felt the requirements here are computationally impossible. But in this context, we only need to support deleting the last added item 🙂

Key insight — popping is a constrained form of deletion. It’s hard to hit O(1) while supporting unconstrained deletions, BUT with a constraint on deletions, all operations can be O(1).

I need a regular stack + a helper data structure. A linked list or vector can support the stack operations

— The helper — a naturally sorted stack (or vector or deque) to hold record-breakers and record-matchers.

IFF a new minimum (record breaker) or another item matching the existing minimum (record-matcher) is added, we push it to the sorted stack.

After every pop(), we check the popped item. If equal to top of sorted stack, then pop the sorted stack.

At any time, top of the sorted stack is the current minimum.

Vector and deque are actually fine. Interviewer may feel they are inferior to a stack, but with a modified requirement, they may become very useful.

— Here’s a design to beat binary_heap, based on finite-sized (32 or 64-bit int) keys

Assuming the comparison is based on 32-bit integer key (string or floats can also use radix sort). I will use a radix array structure. Perhaps 4 arrays of 256-elements.  Or perhaps a 4×256 matrix, Ultimately the payload stored in the data structure are pointers to stack nodes. This saves memory since a stack node may be 99MB.

Every time we push a node, we derive the new key value in O(1), use the key value to locate its ‘home’ in the radix structure and store the new node’s address in a linked list therein. After the push(), we can compare and update (in constant time) a global variable pointing to the minimum node.

Each stack node also has a back-pointer to the iterator into the list, so before pop() we can use the iterator to locate the object in the radix structure and delete it from the host linked list. We will also update the global variable.

# factorial is worst O(n!) > O(2^n)

Factorial complexity is (much) worse than exponential complexity. Look at n! and 53^n, without loss of generality. Model them as two snowballs rolling over 1,2,3,..1000… Initially, the exponential (E) snowball is bigger, but after we pass n=53, the factorial(F) snowball grows by a factor of 54, while the snowball E grows by a factor of 53. From then on, F grows faster and faster and will outgrow E sooner or later.

[[Discrete Mathematics]] puts Factorial as the worst algorithm, but both Factorial and Exponential are unacceptable, while Polynomial is acceptable.

# [[algo in a nutshell]] bucket sort, insertion sort..

== bucket sort
This book asserts that bucket sort is #1 top dog when data is uniformly distributed (not normally distributed!) — i.e. can be uniformly partitioned using a fast hashing function. Number of buckets created equals the input data size, just like in a standard hash table. Book shows bench-marking on large samples of random floating point numbers. However, I disagree.
1.  for special data types like strings and small ints, radix sort is probably faster than bucket sort
2. Bucket sort is #1 for large data volumes, beating quick sort, heap sort etc. But for nearly-sorted data, Insertion-sort is faster.

Other notes

• Linked list — used inside each bucket. Arrays are possible alternative, but 50% slower.
• —-Relation to other sorts:
• Can be seen as generalization of counting sort
• A cousin of radix sort in the most-to-least significant digit (MSD) flavor i.e. top-down radix sort
–hash sort, a variation of bucket sort, customized for random strings.
Book shows benchmarking on random strings. Hash sort is #1 for large data volumes, beating quick sort, heap sort etc

I guess if string distribution is unconstrained/unknown (without guarantee of randomness), hash sort will not have any advantage

in both cases, the hash code must be strictly “ordered” , so bucket #1 must hold lowest data items.

== insertion sort
P100. for nearly-sorted data, insertion sort is faster than all other sorts (including bucket sort, radix sorts), often by an order of magnitude

https://stackoverflow.com/questions/736920/is-there-ever-a-good-reason-to-use-insertion-sort shows insertion sort is better than divide-n-conquer for 1) nearly-sorted or 2) small collection

https://www.cs.princeton.edu/~rs/AlgsDS07/18RadixSort.pdf shows that MSD radix sort also need to switch to insertion-sort when sample size is small.

== counting sort
P315. “fastest” sort for the right data set. I think a single English word is suitable. However, for nearly sorted data, insertion sort is still faster.

I guess it’s still slower than insertion sort if nearly-sorted.

== Binary search tree vs hash table
P130 discusses when to use which

# O(n) sort – count`sort

#1 requirement: keys [note 1] be very small integers, small enough to be array indices.
# 1.1 I think this requirement is met whenever you notice an enum of “possible values”

[1] “values of keys” is perhaps more accurate.

This requirement alone reduces its applicability to a “niche”, but an extremely important niche. In fact, i think every time this requirement is met, we should consider count`sort. How about sorting the letters in a single english word? Remember the “anagram” question?

If the values are small integers of 0,1,2…k, then u need a temporary array of k+1.

count`sort is stable, and therefore usable in radix sort.

# a less ambiguous explanation of big O

–If you can only remember one phrase, remember “bound-from-above”
If we say “Algorithm R is O(lg n)”, then number-of-iterations are bound-from-above by lg n multiplied by a constant.

–If you can remember a 2nd phrase, remember “worst case upper bound”
Worst case upper bound is often different from average-case upper bound of a program’s running time. For quicksort, they are O (n-squared) vs. O(n lg n). If we talk of O() in the absence of *-case, it means worst-case-upper-bound. A really long translation of “Algorithm R is O(n-squared)” would be

“iterations are bound from above by n-squared even for the worst-case input array of size n.”

A shorter version: “At most n-squared iterations for the worst input”

Average-case-upper-bound calculation often (if not always) involves probability. Amortized analysis is probability-free.

# O(1) constant complexity means ….

1st type of time-complexity to study is the “linear growth” — O (N). As list size N tripples, running time tripples.

“constant complexity” means …. As list size N grows, running time doesn’t grow with N. It’s independent of input size. I think running time may still vary as list grows, but much slower than log N .

Example: somehow, hash search has a constant complexity for search and insert, independent of list size or hash table size(?)

# big O, in%%lang

O, o, Omega and theta all try to answer the question “for this algorithm, how many more iterations when N triples?” (Avoid “doubles” to avoid potential ambiguity about 2), where N measures the input.

Example: For the quicksort algorithm, answer is O (N * log N), meaning “iterations increase by a factor of (3 * log 3) when input triples”
Example: For the binary search algorithm, answer is O (log N), meaning “(log 3) times more iterations when input triples”

O () means “no more than iterations”.
O is an upper bound, not a tight bound