find any black-corner subMatrix #52%

https://www.geeksforgeeks.org/find-rectangle-binary-matrix-corners-1/

Q: given a black/white matrix, find any rectangle whose all four corners are black.
Q2: list all of them
Q3 (google): find the largest

— idea 2: record all black cell locations and look out for 3-corner groups

a Rec class with {northwest corner, northeast corner, southwest corner}

first pass, For each pair on the same row, create a pair object and save it in hashmap {northwest cell -> list of x-coordinates on my right} We will iterate the list later on.

2nd pass, scan each column. For each pair on the same col, say cell A and C, use A to look-up the list of northeast corners in the hashmap. Each northeast corner B would give us a 3-corner group. For every 3-corner pack, check if the forth corner exists.

— idea 1: row by row scan.
R rows and C columns. Assuming C < R i.e. slender matrix

For first row, record each ascending pair [A.x,B,x] (No need to save y coordinate) in a big hashset. If S black cells on this row, then O(SS).

In next row, For each new pair, probe the hashset. If hit, then we have a rectangle, otherwise add to the hashset. If T black cells on this row, then O(TT) probes and (if no lucky break) O(TT) inserts.

Note one single hashset is enough. Any pair matching an earlier pair identifies a rectangle. The matching would never occur on the same row 🙂 Optionally, We can associate a y-coordinate to each record, to enable easy calculation of area.

After all rows are processed, if no rectangle, we have O(SS+TT+…). Worst case the hashset can hold C(C-1)/2 pairs, so we are bound by O(CC). We also need to visit every cell, in O(CR)

If C > R, then we should process column-by-column, rather than row-by-row

Therefore, we are bound by O( min(CC,RR) + CR). Now min(CC,RR) < CR, so we are bound by O(CR) .. the theoretical limit.

— idea 4: for each diagonal pair found, probe the other corners
If there are H black cells, we have up to HH pairs to check 😦 In each check, the success rate is too low.

— Idea 5: Brute force — typewriter-scan. For each black cell, treat it as a top-left, and look rightward for a top-right. If found, then scan down for a pair of bottom-left/bottom-right. If no then complete the given row and discard the row.

For a space/time trade-off,

  • once I find a black cell on current row, I put it in a “horizontal” vector.
Advertisements

zero out rows/columns +! auxDS

Q (Leetcode 73): Given a m x n matrix, if an element is 0, set its entire row and column to 0. Do it in-place in O(1) space. No time complexity.

I will assume all signed ints.

====analysis
I find this problem very well-defined, but the O(1) space is highly contrived. I think it only needs some clever technique, not really reusable technique.

Reusable technique — for array of integers with stringent space complexity, we can save indices in the array

Aha — I only need to know the full set of rowIDs and columnIDs.

— My O(minimum(m,n)) space solution 1:
zeroRowCnt:=how many rows to be zeroed out
zeroColCnt  :=how many columns to be zeroed out

Compare the two. Suppose zeroRowCnt == 11 is smaller. I will save the 11 rowID’s in a collection. Then scan horizontally to zero out by column. Then use the rowIDs to zero out by row

–My O(1) space idea 2 — more elaborate than the published solution.

Aha — Can we save the 11 rowID’s in a column to be zeroed out?

Compare zeroRowCnt and zeroColCnt as earlier. Get first rowID among the 11. Suppose it’s Row #3.

Now we know Row#3 has some zeros, so find the first column having a zero. It might be the last column (farthest east). Wherever it is, we pick that column as our “bookkeeper column”.

Visual insight — Suppose bookkeeper is Column #33. Then a[3,33] would be the first zero if we scan entire matrix by-row-and-internally-by-column

We scan row by row again (since we don’t remember those 11 rowIDs), starting after that first rowID. For every rowID found, we will zero out one corresponding cell in bookkeeper column.

Insight — We should end up with exactly 11 zeros in that column. Can’t exceed 11 (only 11 rows having zeros). Can’t fall below 11 (we save all 11 rowIDs)

From now on, freeze that column until further notice. Now zero out each Column to be zeroed out, but leave out our bookkeeper column.

Lastly, follow our bookkeeper column to zero out every “dirty row”.

minimize average distance to N cells]matrix #Kyle 60%

Q: given N red cells in a white matrix, find the best cell anywhere in the matrix, whose average distance to the N red cells is minimized. You can imagine N homes deciding to build a waterpoint.

Distance is like how many horizontal or vertical steps.

=====analysis

Insight — vertical vs horizontal dimensions are independent, so this is really a one-dimension problem.

center of gravity?

Insight — rate of change should be zero at the trough…

getByRank() in sorted matrix: priorityQ^RBTree

https://leetcode.com/problems/kth-smallest-element-in-a-sorted-matrix/

====analysis

recombinant binTree pyramid, where “water” flows east or south.

  • first level has one node .. lowest value. Add it to pq (i.e. priorityQ)
  • pop the pq and insert the two downstream nodes
  • total K pops, each pop is followed by up to 2 inserts

Heap will grow to up to K items, so each pop will be up to logK

Total O(K logK). To achieve this time complexity, we can also use a RBTree. The tree nodes can come from a pre-allocated array.

count unique squares ] matrix #70%

Q1: Count unique squares within a R*C matrix. A unique square is identified by two opposite corners. For example, [0,0] and [1,1] identifies a 4-cell square; [2,3], [2,3] identifies a 1-cell square.

Q2: print out all of them.

====analysis

Q1 Feels like a math problem, but printing is a algo question

— solution 1: Let’s focus on height-2 for now. I will denote the subset of squares having height-2 as “2-subset”. (The 1-subset is trivial.)

Insight — 2-subset is naturally non-overlapping with the 3-subset, 4-subset etc. These non-overlapping subsets constitute the whole solution. This fact is extremely valuable to the count algorithm.

For each height-2 square, I will focus on the north-west corner i.e. the corner among four, having lowest r and c IDs. We can divide the 2-subsets by the rowID. For a 55-row/44-column matrix, there are 54 height-2 squares with rowID=0. Same count with rowID=1 etc.

Aha — each square can be identified by the north-west corner {rowID, colID} + height. So at the higher level I divide the whole solution set by height. At the lower level, I sub-divide the set by the “low rowID”

recombinant binTree^2D_grid

I prefer “recombinant binary tree”. As a special 2D-grid, it has BOTH [2] child links pointing down [1]

[1] or “eastward” in a treeDump() output.

[2] Note if there are four child links then no binTree.

This constraint is important. Important for visualization. Easy to remember:

  • Every edge is slanted at 45 degrees, either southeast or southwest
  • Consequently, we can visualize that all nodes 33 steps below root line up horizontally on the 33rd Level 🙂

Now I feel that any 2D-grid with this feature should be drawn (and used) as a recombinant binary tree. This includes half the 2D-grid problems and path-through-matrix problems 🙂

 

compute any sum@subMatrix by O(1) hashmap lookup

suppose origin cell is at northwest [1,1] not [0,0]

Pre-processing — We can incrementally compute each the rooted-submatrix SumToOrigin[r,c] in linear time. Save them in a s2o hashmap (shadow matrix is actually better)

Now Consider the submatrix identified by [2,2] and [3,4] enclosing 6 cells. This submatrix has sum =

s2o[3,4]+s2o[1,1]-s2o[1,4]-s2o[3,1]

12 cells + 1 cell  – 4 cells – 3 cells

So any submatrix sum can be computed in O(1). To go through all NNMM of them requires O(NNMM) where N and M are the board dimensions

max all-black subMatrix #ZR

Same problem as https://leetcode.com/problems/maximal-rectangle/description/

Q: Given a 2D binary matrix (L by N) filled with white(0) and black(1) cells, find the largest all-black rectangle. See raiserchu’s mail on 12 Sep 13. There is a clever DP solution, probably O(LN).

—Analysis:

Worst case — A standard chess board? We can’t do better than O(LN) since there are LN cells to read.

–O(LN) leetcode solution based on histogram

https://github.com/tiger40490/repo1/blob/cpp1/cpp/2d/maxRectangle.cpp is latest code with my adaptations and my detailed comments.

— sol5:

First scan O(LN) to record, in each cell {bar height; horizontalBarStart/End}.

— idea 4unfinished

Scan #1 O(LN): build a shadow matrix “histogram” where each integer in the cell is the height (possibly 0) of the bar anchored therein.

Scan #2 O(LN) for each cell, remember the currentRunStart column index i.e. from that column until current column, we have an all-black box of height == current bar height

— sol3 O(LNS) new idea based on max rectangle ] histogram treat top 2 (denote J:=2) rows as a histogram. Find the max rectangle therein. Then J:=3 …

  • Scan #1 O(LN): build a shadow matrix “histogram” where each integer in the cell is the height (possibly 0) of the bar anchored therein. In other words, if a cell value=5 then there are exactly 4 consecutive black cells above this (black) cell. Build it incrementally, level by level, top to bottom.
  • Scan #2a: for each row in the shadow matrix, we run the proven algo in O(NS), Note there’s no help from previous row:(
    • S:= #unique heights, N:= matrix width 
  • Scan #2 := the entire scan of L rows. so worst case we hit O(LNS)

Q: Can we do better by reducing scan #2a complexity to O(N), by making use of the previous row results?

— My brute force solution 1: Each rectangle is identified by 2 vertices, i.e 4 integers. Without loss of generality, We require the “high” corner to have higher x-coordinate and higher y-coordinate than the “low” corner. (We can assume y-axis run upward.) With this O(N^4) nested loop we can iterate over all possible rectangles:

Lock low corner
Move high corner in typewriter (zigzag) steps i.e.
  hold highY and move highX step by step
  process the (series of) resulting rectangles
  increment highY and repeat
Move the lower corner in typewriter steps and repeat

Key observation: any “bad pixel” disqualifies every rectangle containing it.

— Here’s my partial solution:
We can effectively ignore all the “good pixels”.

1) Look at the x coordinates of all bad pixels. Sort them into an array. Find the largest gap. Suppose it’s between x=22 and x=33. Our candidate rectangle extends horizontally from 23 to 32, exactly. Notice there’s no bad pixel within this vertical band [1].
2) Look at the y coordinates of all bad pixels. Sort them into an array. Find the largest gap. Suppose it’s between y=15 and y=18. Our candidate rectangle extends vertically from 16 to 17, exactly.
[1] This candidate rectangle can expand All the way vertically, though it may give a bigger rectangle
Ditto horizontally.

longest consecutive ints ] matrix 70% done

View at Medium.com

Q: Given a n*n square matrix where all numbers are distinct, find the maximum length path (starting from any cell) such that all cells along the path are in increasing order with a difference of 1. We can move in 4 directions from a given cell (i, j), i.e., we can move to (i+1, j) or (i, j+1) or (i-1, j) or (i, j-1) with the condition that the adjacent cells have a difference of 1.

Example:

Input: mat[][] = {
{1, 2, 9}
{5, 3, 8}
{4, 6, 7}}
Output: 4
The longest path is 6-7-8-9.

–sol1, probably O(NN) since each node is checked no more than 5 times.

1) typewriter search for the first unvisited node. Exit if all visited. Once a node is found, designate it as cur.
2) down-search .. explore cur’s four neighbors to see if any == cur-1.
3) if found, then desingate that node as cur and continue the down-search.
4) At end of down-search, Go back to #2) and start up-search.
5) At end of up-search we have a linked list. IFF list size > 1, then all nodes on it are marked as visited.
7) go back to #1.

rotate square matrix 90-degree #clever swap

https://leetcode.com/problems/rotate-image/discuss/18872/A-common-method-to-rotate-the-image shows

/*
 * clockwise rotate
 * first reverse up to down, then swap the x/y subscripts
 * 1 2 3     7 8 9     7 4 1
 * 4 5 6  => 4 5 6  => 8 5 2
 * 7 8 9     1 2 3     9 6 3
*/

For anticlockwise, personally I would reverse horizontally, then same swap

This technique relies on swap(). What if the payload objects are blackboxes (or immutable) so you can’t even see contents? Just swap the 2 pointers.

— My own solution is more legwork but possibly more versatile. Suppose the matrix is N by N

m=N-1. Start from [0 0]. find new home, move in and kick out old tenant [0 m] …Each group has 4 nodes. a-> b -> c -> d -> a

See also spiral … If N is odd, then the kernel node stays changed. If N is even, then the innermost shell has a 2×2 matrix.

update each cell with d2nearest 0 #DeepakCM

(Deepak 2019) tough matrix problem: given a black/white but mostly white matrix, for each white cell, compute the least horizontal/vertical steps (shortest distance) to nearest white cell.

Given a Matrix with 1’s and very few 0’s, replace all the 1’s in the matrix with the adjacent distance to nearest 0. There can be more than one ‘0’ in the matrix
Ex : Input: Matrix contains more than one ‘0’ Matrix = {
1, 1, 1, 1, 1,
1, 1, 1, 1, 1,
0, 1, 0, 1, 1,
1, 1, 1, 1, 1,
1, 1, 0, 1, 1 };
Output = {
2, 3, 2, 3, 4,
1, 2, 1, 2, 3,
0, 1, 0, 1, 2,
1, 2, 1, 2, 3,
2, 1, 0, 1, 2 }

——————-
Theoretical limit: O(N)

I will denote W := # white cells. Optimal solution should/might be O(N). For N = 1 quintillion, the run-time should NOT grow even when there are more white locations.

If we know for sure the scores in four neighbors, then it’s O(1) to work out my score. Which cells have known scores? those next to the zeros.

–idea 4: same as frontier but using a queue to hold frontier cells.

–idea 3 (frontier):
Does this algo work with any graph?
What if the white cells are on the perimeter and the frontier shrinks?
How would two frontiers join?

initial-scan #0a to initialize all non-white locations to -1 (indicating “green”). Save the number of greens in a “greenCount”
initial-Scan #0b to collect all white locations. For each white location F with some green neighbor, saves F in a “frontier” collection, perhaps a linked list.

When “saving” also cache (using a 4-bit integer) the nongreen neighbors, to avoid repeated memory access.

Also create an empty “new frontier” collection.

The initial scans can be combined but at the cost of simplicity.

Invariants before any subsequent update-scan —

  • Every frontier location has some green neighbor.
  • new frontier collection is empty.
  • greenCount is up to date.

update-scan #1 update each adjacent green location to the frontier. Set score to 1, finalized and no longer green. iif a finalized location F has a green neighbor, then save F in the new frontier collection.

After the scan, assert the new and old frontier collections have no overlap. Now swap old and new frontier collections and clear the new collection.

update-scan #2 for each location in the frontier, update adjacent green locations to 2, finalized and no longer green. If such a finalized location F has a green neighbor, then save F in the new frontier collection.

Green count should now reduce. When it becomes 0 we are done.

big-O? At each scan, the new frontier is usually larger than the old frontier until an inflection point. If before each update-scan we keep a count of the frontier collection size, i think they won’t add up to exceed N. Therefore, total complexity is O(N) provided the fanout is a constant like 4. If fanout is unlimited, then possibly O(V+E) since we visit each node and each ege up to 3 times.

–idea 2 (shells):
scan #1 to save all white cell locations, and save all black cell locations in a shadow matrix (bool shadow matrix of the same size as original matrix) and a blacklist (hashtable indexed by cell location)
For each while, compute distance to center. At end of this scan, designte one whte cell as red i.e. closest to center.

scan #1b[O(N)] update all black cells with d2red. Now we have some baseline values, to be improved
scan #2 [O(W)] for each white, update all cells around it with distance 1. Remove the updated cells from the “blacklist”

Also set the bool in the shadow matrix

Scan #3 for each white, update the 2nd shell

details?

If a black cell is updated by 5 white cells in the same iteration, then all 5 whites would be equally distant, so the first of them would remove the black cell from blacklist.

So each black cell is only updated once .. O(N)?

–idea 1 (DP) incomplete:
Scan#1 from top and left, update each cell with a “TL score” i.e. shortest dist ignoring the Bottom-Right quadrant of cells i.e. cells right-and-lower of current.

consider a typical cell on 2nd row. what’s tl score? Can compute using upper and left neighbors? will it ignore a white on the right?

Scan#2 from bottom right, to compute a BR score for each cell

Scan#3 (can be part of Scan#2) combine the data

Rationale — for each white cell, the shortest path can be in either BR quadrant (Scan2) or (Scan1) the other 3 quadrants.

N-queen

Q1: generate one solution

Q2: generate all solution to the 4-queen problem without
A: there are up to 4^4 = 256 possible solutions so just iterate over all, but let’s say we want to extend the Q1 solution

https://github.com/tiger40490/repo1/blob/py1/py/2d/classic_nQueens.py is my tested solution. The only important function is placeOnRow(), which took me 17 minutes to write on white-board, but I missed something very important — restoring Mat[r][c] to UNFILLED before continue

4×4 sudoku: backtracking #FB

4-queen is one of the simplest problems in the backtracking category, but I will look at a mini sudoku. Sudoku was once asked in a Facebook 45-min white-board coding test. https://github.com/tiger40490/repo1/blob/py1/py/2d/sudoku4.py is my tested implementation.

I name the four values as color1  color2  color3  color4. A placement is {a color, row index, column index}.

Basic idea — at each empty cell, we try all four possible colors. If color1 is allowed, then we call the same function recursively to fill the next empty cell. Then (crucially) we check the status. If OK, then game over. If NOK, then we try another color. If no more color to try, then (crucially) we reset current cell to unfilled and return NOK

Key to the algo — backtracking is implemented by the restoration and returning NOK

Key to the algo — each empty cell would start a new recursive stack frame. But what data is saved on that stack frame?

Key to the implementation — know what variables go into the loop, what go to the stack and what can be global (simplest). In this case the returned next cell indices CANNOT be globals because each stack frame must remember “next-cell-after-me”

Key to the implementation — simplify state. Each stack frame doesn’t need to remember the “breadcrumb” of all previous placements. Each layer of recursive call only need to remember the colors already tried for its own current cell.

Key to the implementation — factor out simple utilities if they are well-defined and easy to implement. See comment in code.

The important function requires no more than 10 lines! So an interviewer could say “This is something a candidate can write in 45 minutes, assuming all the simple utilities are already available to him, namely failedSomething() and findNextCell2fill().

Truth is, most of us can’t write these 10 lines in an hour on the white-board if we don’t have the correct idea. My initial white-board implementation took 18 minutes and missed one important part —

Mat[r][c] = UNFILLED_COLOR # restore

I then took about an hour to make it work.

max-sum subMatrix O(N^3)

Q: given a square matrix (height N) of signed ints, find a submatrix with maxi sum

====analysis

compute submatrix sum by hashmap lookup is a O(N^4) solution.

https://www.geeksforgeeks.org/dynamic-programming-set-27-max-sum-rectangle-in-a-2d-matrix/ has a O(N^3) solution for square matrix. I think brute force is O(N^6). If you know the trick it is doable.

Clearly the max-profit algo is inapplicable. The Kadane algo may help.

Need auxDS for sure.

— simple, elegant DP idea based on hint:

For every left-right column pair, find the best submatrix having touching those two columns . For example, given column pair 3/7, the best submatrix must extend from column3 to column7. I think this constraint is extremely liberating, and leads to an elegant DP solution:

For pair 1/2, construct a new column array temp[] where temp[6]:= row-wise sum across columns 1/2. Once we have temp[], apply Kadane’s algo on temp[] to find the max submatrix sum. Entire process is O(N)

Q: Is each pair evaluated as an independent problem? I doubt it. I think after we have done pair 1/2, adding column3 into the mix is O(N), so the pairs 1/2, 1/3, 1/4 .. 1/N can be completed in O(NN). All pairs would O(NNN)

So how is adding column3 into the mix O(N)? I think we don’t use the prev Kadane result, but we use the previous temp[] to gain efficiency! We can update existing temp[] in O(N), then apply Kadane’s algo on it in O(N)

What if I have a 99×88 matrix? Do we use a O(NNM) or O(NMM) algo? Since there are fewer columns, I will stick to the column-pair algo presented.  (If there are fewer rows i.e. flat matrix then I would use row pairs.)

FB: spiral number pattern

This is a fairly popular coding question.

Q: given an positive integer N, print all integers from 1 to N2 in a N-by-N matrix, following a spiral pattern, starting from top-left. Please implement it in

int[][] spiral(int n); //to output

9 8 7
2 1 6
3 4 5

Facebook focuses on corner cases, complexity analysis.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/spiralFB.cpp is my implementation — iterative, deque<deque> matrix.

https://www.quora.com/How-do-I-print-spiral-number-patterns-using-only-loops-conditions-and-basic-math-i-e-without-the-use-of-libraries-data-structures-arrays-on-c

Flatten 2D array and convert subscripts #MS

Here’s a standard way to flatten a 2D array into a 1D array. Suppose A is an 2D array int[3][5]. I flattens to 1D array F of int[15]. Imagine A as 3 rows of 4 items each. Denote constants R = 3, S = 5. So A(1,2) → F(7). All subscripts are 0-based.

Q1: Implement

int convertSubscripts(r,s)

Q2: Now B is 3D array int [Q][R][S], where Q, R and S are the sizes. Implement

int convertSubscripts(q,r,s)

You need to work out the algorithm on the white board. I actually drew the 3D array on white board to show my thinking.

for every black matrix cell, paint its entire row/col black

https://www.geeksforgeeks.org/a-boolean-matrix-question/

Q: Given a boolean matrix mat[M][N] of size M X N, modify it such that if a matrix cell mat[i][j] is 1 then make all the cells of ith row and jth column as 1.

–sol1
1) typewriter scan for the next black (i.e. 1)
2) once found, change it to brown (i.e. 2) and paint all cells on its row and its column to brown.
3) go back to #1 until no more
4) change all brown to black.

count coordinate points enclosed by polygon #codecon 70%

Q: On a 49-by-49 super-sized GO-board, there are 2500 coordinate points. Now I give you a 15-side polygon described using the [x,y] coordinates of the 15 vertexes. List all enclosed points.

A: Here’s my tentative algorithm. We will paint and repaint all coordinate points in lime and red as we process the 15 vertexes one by one. At the end of drawing the polygon, we will know which color is enclosed. (Without loss of generality, assume vertexes are never on board boundary.)

1st, assume lines are never completely vertical. We will easily deal with vertical lines later.

  1. with 1st vertex, nothing to paint
  2. After 2nd vertex, we have a closed line (i.e. both ends defined), but we will treat it as an “open line” which is a line with both ends on board boundary.
    1. We paint half the board lime and the other half red. All 2500 coordinate points are painted. Any point on the actual closed line (not the open line) is left unpainted.
    2. how we decide the color? Basically when moving from Vertex 1 to 2, points on the left are painted lime. Specifically, if line is to the right, then below is painted red; if line is to the left, then below is lime.
  3. When we process 3rd vertex, we create a half-open line, meaning it has only one end defined (the previous vertex). The other end is actually the new vertex but we pretend the line extends beyond it and ends at boundary of the board. This is a divider into either lime or red section — new vertex’s current color will tell us. Suppose it cuts into the red, then part of the red must turn lime , i.e. we repaint the affected points lime. We will repeat this step for each remaining vertex.
    1. so which part to change color? follow the rule in 2.2
    2. Note we only repaint part of the newly-divided region. We don’t touch points in the unaffected region.
  4. When we process 4th vertex, we again create a dividing half-open line, either in the red or lime region.
  5. After we process the last vertex, in addition to the half-open line, we create a final line, this time a closed line with both ends defined. We repaint the “outside” points
  6. Completed. The color of [0,0] is the outside color.

Once we have a “database” or “lookup function” of all the “inside” coordinate points, let’s return to the original Bloomberg algo question (https://codecon.bloomberg.com/contest/5948031613881024512/3908)

First, we paint all unpainted points (i.e. on the closed lines) as inside points. Then we iterate over the 49 x 49 squares on the board one by one.

  • IFF all 4 corners are “inside” then it’s an inside square.

Math-wise, the main challenge appears to be Step 2.2. Solution: Suppose we hold x=2 constant. Check all the coordinate points. All the points “below” the “open line” can be pained lime .

IFF the line is exactly vertical, then all the points to the right are considered “below”.

Data structure — 2D array for the board, similar to my Tetris program. A Point class