TreeNode lock/unlocking #Rahul

Q: given a static binary tree, provide O(H) implementation of lock/unlocking subject to one rule — Before committing locking/unlocking on any node AA, we must check that all of AA’s subtree nodes are currently in the “free” state, i.e. already unlocked. Failing the check, we must abort the operation.

H:= tree height

====analysis

— my O(H) solution, applicable on any k-ary tree.

Initial tree must be fully unlocked. If not, we need pre-processing.

Each node will hold a private hashset of “locked descendants”. Every time after I lock a node AA, I will add AA into the hashset of AA’s parent, AA’s grand parent, AA’s great grandparent etc. Every time after I unlock AA, I will do the reverse i.e. removal.

I said “AFTER locking/unlocking” because there’s a validation routine

bool canChange(node* AA){return AA->empty(); } # to be used before locking/unlocking.

Note this design requires uplink. If no uplink available, then we can run a pre-processing routine to populate an uplink lookup hash table { node -> parent }

— simplified solution: Instead of a hashset, we may make do with a count, but the hashset provides useful info.

 

Advertisements

post-order walk: create valuable subtree statistics

https://github.com/tiger40490/repo1/blob/cpp1/cpp/algo_binTree/subTreeStats.cpp is rather simple and short, showing stats like

  • max path sum from current node to any subtree node. Note a path must be non-empty, so this max can be negative
    • We can even save in each node this actual path
  • subtree height,
  • subtree size,
  • d2root — tested but doesn’t use subtree statistics
  • — can be easily implemented:
  • subtree payload sum
  • subtree max payload… These stats are powerful AuxDS

Applicable on k-ary tree

print height@subtree at every node #post-order

For these problems, elegant implementation is expected.

Note in-order is for binary tree, but post-order is for any tree.

== Q: given Any tree without cycle, print the size of subtree at every node

I think post-order walk is usable.

== Q: given Any tree without cycle, print the height of subtree at every node. Also print the node address just for identification

Intuition — dft post-order walk. Need to become more familiar with the detailed steps. https://www.techiedelight.com/calculate-height-binary-tree-iterative-recursive/ has details.

== Q (Leetcode): Given a binary tree, determine if it is height-balanced — a binary tree in which the depth of the two subtrees of every node never differ by more than 1.

I can try it on Leetcode .. no need to set up test harness myself but I might be frustrated

Intuition– post-order walk. A simple recursive algo — for any node, find the its subtree height including itself and return it. Before returning, also compare the left vs right subtree heights

in each treeNode, save d2root

My friend YH showed me a simple BFT algo where each parent node AA saves AA’s level+1 into the two children nodes AAL and AAR.

So every node remembers its own level from root 🙂

I think this technique can be used in pre-order DFT too. For post-order, it is possible but less natural, as shown in my github code subTreeStats.cpp.

Note pre|post-order can work B-trees not only binary trees.

If nodes are immutable, we can use a hashmap {node address -> level}

generate combinationSum compositions #backtrack up]trie #2review

Q: https://leetcode.com/problems/combination-sum/description/

Given a set of unique candidate numbers and a target number, find all unique combinations of candidates, where each combination sums to target. Each candidate may be used repeatedly.

My solution is https://github.com/tiger40490/repo1/blob/cpp1/cpp/combo_permu/comboSum.cpp , showing a reusable backtracking technique described below. It’s clever and brief. However, the efficiency is questionable. Memoization might be applicable here.

This backtracking relies on a key insight. Suppose we have target = 7 and X=2,Y=3,Z=4 as the candidates.

  • when we try a Y, we don’t need to try any more Xs. For example, If we are trying XY, then all XX* solutions are already handled by earlier recursive calls.
  • each combo sequence is naturally pre-sorted internally.
  • Also, “lower” formulas are generated before “higher” formulas
                 root
          x        y     z
      /   |  \
     x    y   z       
    / \   | \  \
 xxx  xxy | xyz \ 
         xyy     xzz
void //can return something if needed

recurs( solutionsFound &, //growing
        curPartialSolution &, 
// above collections could be global variables, to simplify things

        remainingCandidates, /*probably an immutable global array. 
If we need the remaining array to shrink, we can still rely on startIndex to skip used candidates.*/

        gapFromTarget, 
        startIndex, //used to select from remaining candidates
) 

Inside this function, we scan remaining candidates starting from startIndex. Typically in one iteration

  1. we add a new candidate into curPartialSolution
  2. we call recurs
  3. we remove the last added candidate from curPartialSolution to restore the original curPartialSolution — backtracking up the tree.
  4. move on to the next candidate

I feel this is more greedy than DP

represent trees with unbounded fanout

When I first read it in [[CLRS]], I didn’t know why trees exist with unbounded fanout, until I read about https://en.wikipedia.org/wiki/Suffix_tree.

Insight —- If we don’t know how many child nodes a parent can have, then it’s not easy to define TreeNode fields to hold children. If you use a vector or linked list within a TreeNode (perhaps a java developer), your space efficiency will be poor.

The only way I know is a left-child-right-sibling scheme — TreeNode A has a pointer to A’s first child. If A is also one of the children of TreeNode P, then A will also have a pointer to a nextSibling , possibly null.

insight —- https://en.wikipedia.org/wiki/Left-child_right-sibling_binary_tree reveals that this scheme uses a binary tree to represent a k-ary tree

This is one of the few usages of linked list to my knowledge.

simple tree applied to DP++

This is part of my thick->thin effort, to connect the dots and pattern-recognition. May or may not be highly effective, but no harm trying

By my own definition, A “simple tree” is non-recombinant and cycle-free, and every node has one uplink only. I find these trees very visual. They often help me visualize tough Dynamic Programming and other algorithms:

  • decision tree
  • (Rahul pointed out) DP algos are often recursive, and recursive algos often have a call-tree structure
  • recursive call within each loop iteration — this powerful algo can often be represented as a call-tree or decision-tree
  • backtracking — often uses a tree
  • generating all solutions (paths, formulas..) — often uses a tree
    • paths-from-root — each path often maps to a solution
    • eg: punctuation — to print out all sentences
    • eg: AQR factorization problem may be solvable using the comboSum algo.

real-world treeNode has uplink

I think only in contrived interview questions do we find tree nodes without uplink.

Uplink basically, means every edge is bidirectional. With uplinks, from any node we can easily trace to the ancestor nodes.

In real world trees, uplink is cheap-yet-valuable most of the time. AVL tree node has uplink, but let’s look at RBTree:

  • RBTree needs uplink for tree rotation.
  • Also, I believe when you give an insertion hint, the “engine” needs to validate the hinted node, by checking parent + another ancestor.

depth-first-traversal, height-aware

Latest: https://github.com/tiger40490/repo1/blob/cpp1/cpp1/binTree/DFT_show_level.cpp


struct Node {
    int data;
    Node *left, *right, *next;
    Node(int x, Node * le = NULL, Node * ri = NULL) : data(x), left(le), right(ri), next(NULL) {}
};
    Node _15(15);
    Node _14(14);
    Node _13(13);
    Node _12(12);
    Node _11(11);
    Node _10(10);
    Node _9(9);
    Node _8(8);
    Node _7(7, &_14, &_15);
    Node _6(6, NULL, &_13);
    Node _5(5, &_10, NULL);
    Node _4(4, NULL, &_9);
    Node _3(3, &_6,  &_7);
    Node _2(2, &_4,  &_5);
    Node root(1, &_2, &_3);

int maxD=0;
void recur(Node * n){
  static int lvl=0;
  ++lvl;
  if (lvl>maxD) maxD = lvl;
  if (n->left){ recur(n->left); }
  cout<<n->data<<" processed at level = "<<lvl<<endl;
  if (n->right){ recur(n->right); }
  --lvl;
}
int maxDepth(){
    recur(&root);
    cout<<maxD;
}
int main(){
   maxDepth();
}

Breadth-first-traversal, level-aware

BST visits root level (one node only), and 2nd generation, and 3rd generation etc. The BST below is aware of the height or “generation” currently visited.

Latest: https://github.com/tiger40490/repo1/blob/cpp1/cpp1/binTree/BST_show_level.cpp

struct Node {
    int data;
    Node *left, *right, *next;
    Node(int x, Node * le = NULL, Node * ri = NULL) : data(x), left(le), right(ri), next(NULL) {}
};
    Node _15(15);
    Node _14(14);
    Node _13(13);
    Node _12(12);
    Node _11(11);
    Node _10(10);
    Node _9(9);
    Node _8(8);
    Node _7(7, &_14, &_15);
    Node _6(6, NULL, &_13);
    Node _5(5, &_10, NULL);
    Node _4(4, NULL, &_9);
    Node _3(3, &_6,  &_7);
    Node _2(2, &_4,  &_5);
    Node root(1, &_2, &_3);
    Node marker(-1); //to be added to queue to mark new level

Node * dequeue(queue<Node*> & q){
    assert(!q.empty());
    Node * n = q.front();
    q.pop();
    return n;
}
void BFT(){
  queue<Node *> q;
  size_t height=0;

  for( q.push(&marker), q.push(&root); !q.empty();){
    Node * n = dequeue(q);
    if (n == &marker){
        if (q.empty()){
                cout<<"  ^^^^^^^^^^"<<endl;
                break;
        }
        n = dequeue(q);
        q.push(&marker); //move the marker to end of queue
        cout<<"starting height # "<<++height<<endl;
    }
    int dataL = 0, dataR=0;
    if (n->left){
        cout<<"pushing "<<n->left->data<<endl; q.push(n->left);
        dataL = n->left->data;
    }
    if (n->right){
        cout<<"pushing "<<n->right->data<<endl; q.push(n->right);
        dataR = n->right->data;
    }
    cout<<dataL<<" <- "<<n->data<<" -> "<<dataR<<endl;
// "  #  next is "<<dataN<<endl;
  }
}
int main() {
    BFT();
}

BFT^DFT, pre^in^post-order #trees++

Here are my own “key” observations, possibly incomplete, on 5 standard tree walk algos, based on my book [[discrete math]]

  • in-order is defined for binary trees only. Why? in-order means “left-subtree, parent, right-subtree”, implying only 2 subtrees.
  • In contrast, BFS/DFS are more versatile. Their input can be any graph. If not a tree, then BFS/DFS could produce a tree — you first choose a “root” node arbitrarily.
  • The 3 *-order walks are recursively defined. I believe they are three specializations of DFS. See https://www.cs.bu.edu/teaching/c/tree/breadth-first/ and https://en.wikipedia.org/wiki/Tree_traversal
  • The difference of the 3 *-order walks are visually illustrated via the black dot on  https://en.wikipedia.org/wiki/Tree_traversal#Pre-order_(NLR)

—-That’s a tiny intro. Below are more advanced —

BFT uses an underlying queue. In contrast DFS is characterized by backtracking, which requires a stack. However, you can implement the queue/stack without recursion.

BFT visits each node twice; DFT visits each node at least twice — opening (to see its edges) and leaving (when all edges are explored)

DFT backtrack in my own language —

  1. # start at root.
  2. At each node, descend into first [1] subtree
  3. # descend until a leaf node A1
  4. # backtrack exactly one level up to B
  5. # descend into A1’s immediate next sibling A2’s family tree (if any) until a leaf node. If unable to move down (i.e. no A2), then move up to C.

Visually — if we assign a $val to each node such that these nodes form a BST, then we get the “shadow rule”. In-order DFS would probably visit the nodes in ascending order by $value

[1] the child nodes, of size 1 or higher, at each node must be remembered in some order.

binary tree in-order walk #famed bbg Q

Is there a way to design an iterator over a binary search tree with the following properties?

1. Elements are visited in ascending order (i.e. an in-order traversal)
2. next() and hasNext() queries run in O(1) time.
3. Memory usage is O(1)

——-
I feel O(1) memory is impossible since this is recursive (though all recursions can convert to iteration with a queue bigger than O(1)), with a stack data structure of size H := height of tree.

Similarly, hasNext() need to work through the stack of size H, so O(1) run time impossible

Morris may meet the requirement.

store a billion phone numbers for fast lookup #Barc

The solution I managed to recall in 2015: tree. Each node has the count as payload, and up to 10 child nodes.

Q: efficiently store a billion phone numbers for fast lookup.

I assume realistic usage is read-mostly rather than read-only, but this solution is reasonably efficient in read-write scenarios too. My home-grown idea happens to resemble http://en.wikipedia.org/wiki/Trie.

Assumption — phone numbers across the world have at most 15 digits i.e. short strings. Our trie would be at most 15 levels deep. Fan-out at any parent node is at most 10.

( This 15-digit assumption is not as restrictive as it may appear. The distribution of any realistic “population” of interest to an information system is often /bounded/ by some important constraints. We seldom need to store data of arbitrary length or arbitrary precision. )

Compared to a naïve array data structure (converting each phone number to integer and treat it as subscript) which requires 10^15 array elements (about a million billion elements), our trie would create only a billion leaf nodes and a comparable number [1] of intermediate nodes.

[1] I’d estimate this number: on 14th level, there can be only a few hundred million nodes. On the 13rd lowest, even fewer. Phone numbers share area codes, so on the 4th or 5th lowest level, we have clusters.

Search is O(1) given the Assumption above — just follow the pointers. I feel performance is more predictable than hash table and not susceptible to bad hashing or skewed distribution.

Insert is O(1) given the Assumption above.

How about a hash table? Most standard implementations use a fill factor below 100%, so we need a bucket array of 1billion+. Memory efficient. I’m not sure about quality of hash code distribution. I feel it’s possible to get many collisions.

recursive tree walk -> iterative #barcap

Suppose there’s a deep directory tree (no hard/soft links) with millions of files and directories. Write a tree walker to list files excluding directories. For memory efficiency, turn the recursive algorithm into iterative.

I feel recursive thinking is far more powerful than iterative, because it solves problems otherwise unsolvable. Most classic algo problems fall into this category. I always try to train myself to think recursively. In this regard, It’s good to study and compare iterative algorithms. What kind of recursive algo can be converted to iterative? I feel only simple ones.

Inspired by the article in the link, here are my rules for converting recursion to iteration

Rule: one push (on the stack) per call of the recursive func.

Rule: a stack to hold the arg of the recursive calls, in the correct order. Assuming m(1) -> m(2) -> m(3), then we should push 1,2,3 on the stack. Ideally, the order in our stack should match the recursive call stack.

Rule: since it’s a stack, you remove from the top and add to the top too.

Rule: first step when you reenter the while(not_empty) loop, you simulate a new recursive call by _popping_ the stack. The popped item is no longer a “pending” todo item.

Next Question: how to parallelize? You don’t know how deep each sub-tree is.


public class LowLatencyTreeWalker {
static java.util.LinkedList envelopesToOpen = new java.util.LinkedList();

static void factorial() { // a simple iterative solution to compute
// factorials
envelopesToOpen.add(6);
long ret = 1;
while (!envelopesToOpen.isEmpty()) {
int currNum = envelopesToOpen.removeLast();
System.out.println(ret = ret * currNum);
if (currNum > 1)
envelopesToOpen.add(currNum - 1);
}
}

static void walkTree(String args[]) { // a harder iterative algorithm for
// tree-walk
envelopesToOpen.add(new File("C:/"));
while (!envelopesToOpen.isEmpty()) {
// once we open an envelope, we remove from the todo stack
File f = envelopesToOpen.removeLast();
System.out.println(f.getAbsolutePath());
if (f.listFiles() != null)
envelopesToOpen.addAll(Arrays.asList(f.listFiles()));
}
}
}