max rectangle ] histogram

Q: Given N possibly recurring non-negative integers representing the histogram’s bar heights, and given the width of each bar is 1, find the largest rectangle in the histogram. Return its area.

Visually well-defined problem. Perhaps naturally-occurring. Very simple data structure. No O() requirement, so I will just try my own solution. is my solution. 100% passed on Leetcode.

==== analysis — heavy on data structure design.

Key insight — one scan to update a clever data structure.

key insight — data structure is not per bar, but per height!

For every bar J, there exists an enclosing max-rectangle of J’s height. We can just compare all of these rectangles.

We might start with two extreme candidates
1) the peak — whose enclosing rectangle is likely slender — O(N) one scan to find all the peaks
2) the lowest bar — whose enclosing rectangle has width N — O(N)

If we paint the histogram as a binary matrix, then this is equivalent to anther problem max all-black submatrix #DP #zhurongbut I think there exists better solutions like O(N logN) or O(N*S) …

–homegrown algo with O[N*S] where S:= #unique heights. The binary search doesn’t show up as logS.

A pre-scan to get all distinct heights. For each distinct height, we maintain a RunRecord object {bestRun, currentRunStart, height}, in a sorted map {height -> record}. In py, I can use a pre-sorted vector of Records, sorted on height

In main scan, As we encounter a new bar of height J, we update these records.

  • if not falling or rising
    • record-J and each record-H below J must have a current run … extend that run (no-op)
  • if rising from height H
    • each record up to H must have a current run … extend that run by no-op
      • iterate the treemap up to H
    • iterate treemap from H+1 to J. start a new run for each record
  • if falling from height P to J
    • record-J and each record-H (where H <J) must have a current run … extend that run
    • iterate treemap from J+1 to P … each record-K must have a current run, indicated by a valid currentRunStart, then this record’s current run has just ended. We update bestRun and put a invalid value into currentRunStart.

At end of the main scan, every record has a bestRun i.e. the duration. I can then calc the area under each bestRun and return the max.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s