(Deepak 2019) tough matrix problem: given a black/white but mostly white matrix, for each white cell, compute the least horizontal/vertical steps (shortest distance) to nearest white cell.

Given a Matrix with 1’s and very few 0’s, replace all the 1’s in the matrix with the adjacent distance to nearest 0. There can be more than one ‘0’ in the matrix

Ex : Input: Matrix contains more than one ‘0’ Matrix = {

1, 1, 1, 1, 1,

1, 1, 1, 1, 1,

0, 1, 0, 1, 1,

1, 1, 1, 1, 1,

1, 1, 0, 1, 1 };

Output = {

2, 3, 2, 3, 4,

1, 2, 1, 2, 3,

0, 1, 0, 1, 2,

1, 2, 1, 2, 3,

2, 1, 0, 1, 2 }

——————-

Theoretical limit: O(N)

I will denote W := # white cells. Optimal solution should/might be O(N). For N = 1 quintillion, the run-time should NOT grow even when there are more white locations.

If we know for sure the scores in four neighbors, then it’s O(1) to work out my score. Which cells have known scores? those next to the zeros.

–idea 4: same as frontier but using a queue to hold frontier cells.

–idea 3 (frontier):

Does this algo work with any graph?

What if the white cells are on the perimeter and the frontier shrinks?

How would two frontiers join?

initial-scan #0a to initialize all non-white locations to -1 (indicating “green”). Save the number of greens in a “greenCount”

initial-Scan #0b to collect all white locations. For each white location F with some green neighbor, saves F in a “frontier” collection, perhaps a linked list.

When “saving” also cache (using a 4-bit integer) the nongreen neighbors, to avoid repeated memory access.

Also create an empty “new frontier” collection.

The initial scans can be combined but at the cost of simplicity.

Invariants before any subsequent update-scan —

- Every frontier location has some green neighbor.
- new frontier collection is empty.
- greenCount is up to date.

update-scan #1 update each adjacent green location to the frontier. Set score to 1, finalized and no longer green. iif a finalized location F has a green neighbor, then save F in the new frontier collection.

After the scan, assert the new and old frontier collections have no overlap. Now swap old and new frontier collections and clear the new collection.

update-scan #2 for each location in the frontier, update adjacent green locations to 2, finalized and no longer green. If such a finalized location F has a green neighbor, then save F in the new frontier collection.

Green count should now reduce. When it becomes 0 we are done.

big-O? At each scan, the new frontier is usually larger than the old frontier until an inflection point. If before each update-scan we keep a count of the frontier collection size, i think they won’t add up to exceed N. Therefore, total complexity is O(N) provided the fanout is a constant like 4. If fanout is unlimited, then possibly O(V+E) since we visit each node and each ege up to 3 times.

–idea 2 (shells):

scan #1 to save all white cell locations, and save all black cell locations in a shadow matrix (bool shadow matrix of the same size as original matrix) and a blacklist (hashtable indexed by cell location)

For each while, compute distance to center. At end of this scan, designte one whte cell as red i.e. closest to center.

scan #1b[O(N)] update all black cells with d2red. Now we have some baseline values, to be improved

scan #2 [O(W)] for each white, update all cells around it with distance 1. Remove the updated cells from the “blacklist”

Also set the bool in the shadow matrix

Scan #3 for each white, update the 2nd shell

details?

If a black cell is updated by 5 white cells in the same iteration, then all 5 whites would be equally distant, so the first of them would remove the black cell from blacklist.

So each black cell is only updated once .. O(N)?

–idea 1 (DP) incomplete:

Scan#1 from top and left, update each cell with a “TL score” i.e. shortest dist ignoring the Bottom-Right quadrant of cells i.e. cells right-and-lower of current.

consider a typical cell on 2nd row. what’s tl score? Can compute using upper and left neighbors? will it ignore a white on the right?

Scan#2 from bottom right, to compute a BR score for each cell

Scan#3 (can be part of Scan#2) combine the data

Rationale — for each white cell, the shortest path can be in either BR quadrant (Scan2) or (Scan1) the other 3 quadrants.