# sorting collection@chars|smallIntegers

Therefore, given a single (possibly long) string, sorting it should Never be O(N logN) !

Whenever an algo problem requires sorting a bunch of English letters, it’s always, always better to use counting sort. O(N) rather O(N logN)

Similarly, in more than one problems, we are given a bunch of integers bound within a limited range. For example, the array index values could be presented to us as a collection and we may want to sort them. Sorting such integers should always , always be counting sort.

# zero out rows/columns +! auxDS

Q (Leetcode 73): Given a m x n matrix, if an element is 0, set its entire row and column to 0. Do it in-place in O(1) space. No time complexity.

I will assume all signed ints.

====analysis
I find this problem very well-defined, but the O(1) space constraint is highly contrived. I think it only needs some clever technique, not really reusable technique.

Reusable technique — for mutable array of small positive integers, with stringent space complexity —> try to save indices in the original array

Aha — I only need to know the full set of rowIDs and columnIDs.

— My O(minimum(m,n)) space solution 1:
zeroRowCnt:=how many rows to be zeroed out
zeroColCnt  :=how many columns to be zeroed out

Compare the two. Suppose zeroRowCnt == 11 is smaller. I will save the 11 rowID’s in a collection. Then first scan horizontally to zero out by column. Then use the rowIDs to zero out by row

–My O(1) space idea 2 — more elaborate than the published solution.

Aha — Can we save the 11 rowID’s in a column to be zeroed out?

insight — The “save indices in orig int array” technique is a halo trick in contrived coding problems. Is it practically useful in real world? I doubt it.

Compare zeroRowCnt and zeroColCnt as earlier. Get first rowID among the 11. Suppose it’s Row #3.

Now we know Row#3 has some zeros, so find the first column having a zero. It might be the last column (farthest east). Wherever it is, we pick that column as our “bookkeeper column”.

Visual insight — Suppose bookkeeper is Column #33. Then a[3,33] would be the first zero if we scan entire matrix by-row-and-internally-by-column

We scan row by row again (since we don’t remember those 11 rowIDs), starting after that first rowID. For every rowID found, we will zero out one corresponding cell in bookkeeper column.

Insight — We should end up with exactly 11 zeros in that column. Can’t exceed 11 (only 11 rows having zeros). Can’t fall below 11 (we save all 11 rowIDs)

From now on, freeze that column until further notice. Now zero out each Column to be zeroed out, but leave out our bookkeeper column.

Lastly, follow our bookkeeper column to zero out every “dirty row”.

# merge 2 binTrees by node position

Q (leetcode 617): https://leetcode.com/problems/merge-two-binary-trees/submissions/

==== Analysis

https://github.com/tiger40490/repo1/blob/py1/py/algo_tree/merge2Tree.py is a short, simple solution fully tested on leetcode.com but hard to run offline. Elegant in term of implementation.

Insight — Challenge is implementation. Input and return type of DFT function are tricky but server as useful implementation techniques.

Labelled as easy, but pretty hard for me.

— Idea 1: BFT. When a node has any null child, put null into queue. Not so simple to pair up the two iterations

— Solution 2: DFT on both trees. Always move down in lock steps. When I notice a node in Tree A is missing child link that Tree B has, then I need to suspend the Tree A DFT?

My DFT function would have two parameters nodeInA and nodeInB. One of them could be null , but the null handling is messy.

Aha — I set the input parameters to to dummy objects, to avoid the excessive null check. In this problem, this technique is not absolutely necessary, but very useful in general

# nth largest element in unsorted array #QuickSelect

Q: Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.

I think this is mostly a bigO algoQQ problem.

std::nth_element is linear on average .. https://stackoverflow.com/questions/11068429/nth-element-implementations-complexities talks about QuickSelect algo

— idea 6: priority Q (Fib-heap) of size k
if any item is higher than the min, then pop min O(logK) and insert in O(1)
— idea 6: priority Q
Linear time to build it
— idea 5: 95th percentile problem from NYSE
— idea 4: multiple scans
— idea 3: segments
— Sol2: O(N). use the O(N) algo in the blog on “given int array, find median ] O(N)”. Then discard one of the two segments. Then repeat.
Note: Each time the target element must exist in one of the 2 segments.

O(N) + O(N/2) + O(N/4) … -> O(N)

— Sol2a: Use the O(N) qsort partition algo to anchor a (random) pivot element to create two segments. Our target must exist in one of the two, so discard the other by adjusting the le/ri boundaries.

This idea is same as the most voted solution in leetcode discussion.
O(N) on average — we get O(N)+O(N/2) + O(N/4) + … < O(2N)

Note average complexity is acceptable in hashtable!

# rotate square matrix 90-degree #clever swap

``````/*
* clockwise rotate
* first reverse up to down, then swap the x/y subscripts
* 1 2 3     7 8 9     7 4 1
* 4 5 6  => 4 5 6  => 8 5 2
* 7 8 9     1 2 3     9 6 3
*/``````

For anticlockwise, personally I would reverse horizontally, then same swap

This technique relies on swap(). What if the payload objects are blackboxes (or immutable) so you can’t even see contents? Just swap the 2 pointers.

— My own solution is more legwork but possibly more versatile. Suppose the matrix is N by N

m=N-1. Start from [0 0]. find new home, move in and kick out old tenant [0 m] …Each group has 4 nodes. a-> b -> c -> d -> a

See also spiral … If N is odd, then the kernel node stays changed. If N is even, then the innermost shell has a 2×2 matrix.

# check array@0-N +! auxDS #Nsdq#contrived

— Q1: Given a size-5 array of integers. For every element x: 0<= x <=4. Array would be considered Good if all elements are unique, i.e. all the numbers 0 to 4 show up exactly once each. Please write a program to check if the array is good. If Bad, then print every number that’s showing more than once. O(N) time and O(1) space please.

We will denote array size as sz, so N == sz-1.  N=4 and sz=5 in the opening example.

I feel this question (even without the bonus) is too hard to complete on the spot, unless you have done it before.

I made a mistake that would be unforgivable in west coast and Bloomberg interviews — failed to clarify requirements and assumed input array is immutable.

— Q1b (bonus): Also print every missing number. (Requires iterating entire input.)

Example A {1,2,0,2,0} has repeating 0 and 2, and missing 3 and 4. Example B {3,1,4,2,0} is Good.

If you feel the O() complexity requirements are daunting, then work out a less efficient solution first, and improve on it.

I feel in real projects, we always have enough memory to build a separate array, so the O(1) space requirement is contrived. A O(N) time and O(N) space solution is easier. Therefore, I rate this coding question as contrived and relatively non-standard. Yet the required techniques are “universal” in many high-end interviews.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/algo_arr/checkArr0-N_Nsdq.cpp

has my 99% solution. The unfinished part is trivial.

# locate a pair with targetSum==55 #bbg IV #Morris

Update

Q: any O(1) space sort?

Q: From an unsorted array of positive integers, is it possible to find a pair of integers that sum up to a given sum?

Constraints: This should be done in O(n) and in-place without any external storage like arrays, hash-maps, but you can use extra variables/pointers.

If this is not possible, can there be a proof given for the same?

—–Initial analysis—-
I wish I were allowed to use a hash table of “wanted ” values. (iterate once and build hashtable. For Each new value encountered, check if it is in the “wanted” list…)

I feel this is typical of west coast algorithm quiz.

I feel it’s technically impossible, but proof?  I don’t know the standard procedure to prove O(n) is impossible. Here’s my feeble attempt:

Worst case — the pair happens to be the 1st and last in the array. Without external storage, how do we remember the 1st element?  We can only use a small number of variables, even if the array size is 99999999. As we iterate the array, I guess there would be no clue  that 1st element is worth remembering. Obviously if we forget the 1st element, then when we see the last element we won’t recognize they are the pair.
—–2nd analysis—-
If we can find a O(n) in-place sort then problem is solvable [1]. Let’s look at Radix sort, one of the most promising candidates. https://stackoverflow.com/questions/463105/in-place-radix-sort has an implementation for DNA strings.

Assumption 1: the integers all have a maximum “size” in terms of digits. Let’s say 32-bit. then yes radix is O(n) but not sure about space complexity. Now, with any big-O analysis we impose no limit on the sample size. For example we could have 999888777666555444333 integers. Now, 32-bit gives about 4 billion distinct “pigeon-holes”, so by the pigeon-hole principle most integers in our sample have to be duplicates!

Therefore, Assumption 1 is questionable. In fact, some programming languages impose no limit on integer size. One integer, be it 32 thousand bits or 32 billion bits, could use up as much memory as there is in the system. Therefore, Assumption 1 is actually superfluous.

Without Assumption 1, and if we allow our sample to be freely distributed, we must assume nothing about the maximum number of digits. I would simply use

Assumption 2: maximum number of digits is about log(n). In that case radix sort is O(n log(n)), not linear time:(

—– new analysis as of Jun 2017 —-
Can Morris algo sort an array (external storage)? If yes,  then use the 2-moving-pointer algo in locate a pair whose diff=55

However, Morris needs a tree not an array.