partition list into good^bad sections #c++LiquidNet

Q: template<typename I, typename P>
I myPartition(I bb, I const end, P isgood); //return first Bad node in the range after shuffling all Bad ones to the end. If no Bad node, then return the original end.

You can call it like firstBad = shuffle(li.begin(), li.end(), isEven). Itr and Pred are template params or well-defined typedefs.

Note we can’t assume the iterator provide sort order. We can only inc/dec and check equality.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/linkedList/partition_LNet.cpp is my solution but it’s hard to reason because I move two rather than one pointer. Interviewer gave a simpler solution with similar time complexity.

Lessons:

  • in a two-iterator algo, if every move on each iterator has some logic, then it is probably over-complicated. In this case, we could move one iterator according to some logic, and simply move the other iterator to the same spot.
  • “bb should be at or before ee” — the invariant is critical but hard to maintain in the face of complex logic
  • There were many scenarios in my design. There are fewer scenarios in the interviewer’s design.

depth-first-traversal, height-aware

Latest: https://github.com/tiger40490/repo1/blob/cpp1/cpp1/binTree/DFT_show_level.cpp


struct Node {
    int data;
    Node *left, *right, *next;
    Node(int x, Node * le = NULL, Node * ri = NULL) : data(x), left(le), right(ri), next(NULL) {}
};
    Node _15(15);
    Node _14(14);
    Node _13(13);
    Node _12(12);
    Node _11(11);
    Node _10(10);
    Node _9(9);
    Node _8(8);
    Node _7(7, &_14, &_15);
    Node _6(6, NULL, &_13);
    Node _5(5, &_10, NULL);
    Node _4(4, NULL, &_9);
    Node _3(3, &_6,  &_7);
    Node _2(2, &_4,  &_5);
    Node root(1, &_2, &_3);

int maxD=0;
void recur(Node * n){
  static int lvl=0;
  ++lvl;
  if (lvl>maxD) maxD = lvl;
  if (n->left){ recur(n->left); }
  cout<<n->data<<" processed at level = "<<lvl<<endl;
  if (n->right){ recur(n->right); }
  --lvl;
}
int maxDepth(){
    recur(&root);
    cout<<maxD;
}
int main(){
   maxDepth();
}

bbg-Eq: longest abbreviation #easier

Given a long word with N non-unique characters, we know there can be 2^N abbreviations (including the original word, and including the empty string).

Requirement: I give you a list of strings and you suspect some of them could be an abbreviation of your word. Find the longest string among them that’s a valid abbreviation. Optimize for time complexity.

I feel this problem appears simple but not easy to complete quickly

https://github.com/tiger40490/repo1/blob/py1/py/str/testAbbr_ez.py is my python solution, different from the c++ solution below. Not sure which has more bugs.


I started with string::find() and hit the problem of non-unique characters. Interviewer hinted char-matching — the key point I needed for this type of problem.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

typedef string candidate;

vector<candidate> v{"monkey","aaaaaaaaaaaaaaaaaa", "mamalon", "monk"};
string const haystack{"mamalonkey"};
size_t hsize=haystack.size();

//Tip: can use typdef alias in argument:)
bool lenComp(candidate const & a, candidate const & b){
  return a.size()<=b.size();
}
ostream & operator<<(ostream & os, vector<candidate> const & c){
  size_t const sz = c.size();
  for (int i=0;i<sz; ++i){
        os<<c[i]<<" ";
  }
  os<<endl; return os; } //single-pass 🙂 
} 

bool check1candate(candidate const & c){ 
  if (c.size() > hsize) return false;

  for(int i=0, h=0; h<hsize; ++h){
        if (c[i] == haystack[h]){
          if (i==c.size()-1) return true;
          ++i;
        }
  }
  return false;
}
int main(){
  sort(v.begin(), v.end(), lenComp);
  cout<<v; for (int i=v.size()-1; i>=0; --i){
    if (check1candate(v[i])){
        cout<<"winner is "<<v[i];
        return 0;
    }
  }
}

 

bbg-FX: check if a binary tree is BST #no deep recursion

Binary search tree has an “ascending” property — any node in my left sub-tree are smaller than me.

Q1: given a binary tree, check if it’s a BST. (No performance optimization required.) Compile and run the program.

Suppose someone tells you that the lastSeen variable can start with a random (high) initial value, what kind of test tree would flush out the bug? My solution is below, but let’s Look at Q1b.

Q1b: what if the tree could be deep so you can’t use recursion?
%%A: use BFT. When we actually “print” each node, we check greatest node in its left subtree (and least node in the right subtree)


I made a mistake with lastSeen initial value. There’s no “special” initial value to represent “uninitialized”. Therefore I added a boolean flag.

Contrary to the interviewer’s claim, local statics are automatically initialized to zerohttps://stackoverflow.com/questions/1597405/what-happens-to-a-declared-uninitialized-variable-in-c-does-it-have-a-value, but zero or any initial value is unsuitable (payload can be any large negative value), so we still need the flag.

#include <iostream>
#include <climits>
using namespace std;

struct Node {
    int data;
    Node *le, *ri;
    Node(int x, Node * left = NULL, Node * right = NULL) : data(x), le(left), ri(right){}
};
/*    5
  2       7
 1 a
*/
Node _7(7);
Node _a(6);
Node _1(1);
Node _2(2, &_1, &_a);
Node _5(5, &_2, &_7); //root

bool isAscending = true; //first assume this tree is BST
void recur(Node * n){
//static int lastSeen; // I don't know what initial value can safely represent a special value

  //simulate a random but unlucky initial value, which can break us if without isFirstNodeDone
  static int lastSeen = INT_MAX; //simulate a random but unlucky initial value
  static bool isFirstNodeDone=false; //should initialize to false

  if (!n) return;
  if (n->le) recur(n->le);

  // used to be open(Node *):
  if (!isAscending) return; //check before opening any node
  cout<<"opening "<<n->data<<endl; 
  if (!isFirstNodeDone){ 
        isFirstNodeDone = true; 
  }else if (lastSeen > n->data){
        isAscending=false;
        return; //early return to save time
  }
  lastSeen = n->data;

  if (n->ri) recur(n->ri);
}

int main(){
  recur(&_5);
  cout<< (isAscending?"ok":"nok");
}

next_Perm@3boys out@5 #80%

algo-practice: generate permutation@3, using5distinct chars

Such a skill might be needed in some coding IV sooner or later. Let’s write this in py or c++. Formally,

Q1: generate all permutations of 3, from 5 distinct chars, in any order.
Q2: generate all permutations of 3, from 5 distinct chars, in ascending order. You can sort the 5 chars first.
Q2b: once you have generated one permutation, how do you identify The next?

Note the same solution is a generalization of std::next_permutation(), so once you have this solution you have that solution.

How many? 5-choose-3 * 3! = 5!/(5-3)! = 5*4*3 = 60, denoted Answer(3). Paradoxically, Answer(4) == Answer(5) = 120

–algorithm 1 for Q1

  • 1st generate 5 single-char strings;
  • then for each generate 4 two-char strings. We get 20 strings.

–algorithm 1b for Q1: rec(5,3) will call rec(5,2). rec(5,2) has 20 items, each can generate 3 items for rec(5.3), because each item has 2 characters and a void to be filled by the 3 unused chars.

The 20 items returned should be a pair{vector of 2, vector of 3}

This produces a sorted collection:)

See my code  https://github.com/tiger40490/repo1/blob/py1/py/combo_perm/nextPerm%403boysFrom5.py

 

 

 

next_Combo@3 boys out@5 # 100%

Given, abcde, How many combinations of 3? 5-choose-3 = 10 standard formula, but let’s implement in py or c++.

It’s useful to define a score for a combination — sort the combination into a string. The string is the score. Now,

Q1: generate all combinations@3 from 5 distinct chars, in ascending score.
Q2 (more importantly) given any combination of 3, generate the next higher combination of 3.  Output each combination as a sorted string to keep things clean. Extremely effective constraint and simplification.

1) https://github.com/tiger40490/repo1/blob/py1/py/combo_perm/nextComboOf3from6boys.py is a decent recursive.

2) fixed-length abbreviation generator also generates exactly the same  sequence!

3) Perhaps we can modify the iterative generate random perm@3boys out@5 algorithm:

Key: Scan from end of string to identify position2upgrade. Compare current (i.e.last) position with the unused characters. If can’t upgrade me [eg abe], then move left which is guaranteed to be a lower character. Now compare the new “me” with only the unused characters (in this algo we don’t allow swap within the string) If can’t upgrade “me”, then move left.

If current position is upgradeable, then set it as p2u. Upgrade to the lowest unused character. Then reset all characters on my right and return.

  1. abc
  2. abd
  3. abe
  4. acd
  5. ace
  6. ade //last 2 letters are the max. p2u = 0
  7. bcd
  8. bce
  9. bde
  10. cde

 

show free slots between meetings #bbg c++11

Q: You are given a list of teleconference booking like {9,10} meaning from 9am to 10am. We can multi-task, so overlap is fine. Can you print out all the free slots within the day? You can safely assume all the input data are sensible and between 9 to 18, i.e. our workday.

I solved this problem on the spot. My weakness is still the ECT cycle esp. the compiler errors.

I was familiar with the inverse problem of listing all busy sessions. So I decided to *focus* on that. I took a calculated risk that once I get that to work, I will have the cool confidence to solve the original problem.

corner case: two adjacent meetings. I added comment to specify exactly how to check for that
corner case: what if the entire day is one busy session?

int const SOD = 0, EOD=24;

struct Intv{ //can be a single meeting, a joined session or a free window
  int bgn, end; //end = 10 means 10 o'clock sharp. Will become float
  Intv(int a, int b): bgn(a), end(b){}
  bool operator<(Intv const & other) const { return this->bgn < other.bgn;}

  //can be a free function if no private fields
  friend ostream & operator<<(ostream & os, Intv const & a){
        os<<a.bgn<<'-'<<(float)a.end-0.41;
        return os;
  }
};
template<typename T> ostream & operator<<(ostream & os, vector<T> const & v){
        for(int i = 0; i<v.size(); ++i) os<<v[i]<<"  ";
        os<<endl;
}
void printBusy(vector<Intv> const & arg){
  if (arg.empty()) return;
  vector<Intv> & v = const_cast<vector<Intv>&> (arg);
  sort(v.begin(), v.end());
  size_t const sz = v.size(); //cache
  Intv growable = v[0];
  for(size_t i = 1; i<sz; ++i){
        Intv & req = v[i];
        if (growable.end < req.st){
          cout<<growable<<endl; //close current session
          growable = req;//start new session
          continue;
        }
        growable.end = max(growable.end, req.end);
  }
  cout<<"last session : "<<growable<<endl;
}
void printWindow(vector<Intv> const & arg){ //subsequent exercise
        if (arg.empty()){
                cout<<"free slot "<<Intv(SOD, EOD)<<endl;
                return;
        }
        vector<Intv> &v = const_cast<vector<Intv> &>(arg); //avoid vector deep cloning
        sort(v.begin(), v.end());
        cout<<v<<endl;
        if (v[0].bgn > SOD){
          cout<<"SOD free slot "<<Intv(SOD, v[0].bgn)<<endl;
        }
        Intv growing(v[0]); //current busy period
        for(int i=1; i<v.size(); ++i){
          //Heart of the algo. each new meeting either extends the current
          //busy period or prints a free window before it
          Intv & meeting = v[i];
          //cout<<i<<" a meeting: "<<meeting<<endl;
          //cout<<"growing = "<<growing<<endl;
          if (growing.end < meeting.bgn){
                cout<<"free slot "<<Intv(growing.end, meeting.bgn)<<endl;
                growing = meeting;
          }else{
                growing.end = max(growing.end, meeting.end);
          }
        }
        if (growing.end < EOD)
                cout<<"EOD free slot "<<Intv(growing.end, EOD)<<endl;
}
int main() {
        //pass in initializer list to initliaze a const vector
        printBusy({Intv(15,17), Intv(7,9), Intv(5,7)});
}

Breadth-first-traversal, level-aware

BST visits root level (one node only), and 2nd generation, and 3rd generation etc. The BST below is aware of the height or “generation” currently visited.

Latest: https://github.com/tiger40490/repo1/blob/cpp1/cpp1/binTree/BST_show_level.cpp

struct Node {
    int data;
    Node *left, *right, *next;
    Node(int x, Node * le = NULL, Node * ri = NULL) : data(x), left(le), right(ri), next(NULL) {}
};
    Node _15(15);
    Node _14(14);
    Node _13(13);
    Node _12(12);
    Node _11(11);
    Node _10(10);
    Node _9(9);
    Node _8(8);
    Node _7(7, &_14, &_15);
    Node _6(6, NULL, &_13);
    Node _5(5, &_10, NULL);
    Node _4(4, NULL, &_9);
    Node _3(3, &_6,  &_7);
    Node _2(2, &_4,  &_5);
    Node root(1, &_2, &_3);
    Node marker(-1); //to be added to queue to mark new level

Node * dequeue(queue<Node*> & q){
    assert(!q.empty());
    Node * n = q.front();
    q.pop();
    return n;
}
void BFT(){
  queue<Node *> q;
  size_t height=0;

  for( q.push(&marker), q.push(&root); !q.empty();){
    Node * n = dequeue(q);
    if (n == &marker){
        if (q.empty()){
                cout<<"  ^^^^^^^^^^"<<endl;
                break;
        }
        n = dequeue(q);
        q.push(&marker); //move the marker to end of queue
        cout<<"starting height # "<<++height<<endl;
    }
    int dataL = 0, dataR=0;
    if (n->left){
        cout<<"pushing "<<n->left->data<<endl; q.push(n->left);
        dataL = n->left->data;
    }
    if (n->right){
        cout<<"pushing "<<n->right->data<<endl; q.push(n->right);
        dataR = n->right->data;
    }
    cout<<dataL<<" <- "<<n->data<<" -> "<<dataR<<endl;
// "  #  next is "<<dataN<<endl;
  }
}
int main() {
    BFT();
}

maxProfit, max-sum subArray #Kadane

One of the top 3 all-time favorite algo questions, worth studying in-depth. I know only two algorithms — (Kanade) disposableCurSubArray and lowWaterMark (my own and Xinfeng Zhou). I think both are equivalent from a few angles.

My algo is intuitive (to me) for level input (delta input can be transformed into levels) . One pass. I feel it’s not inferior in any way.

In contrast, disposableCurSubArray is intuitive for delta input i.e. maxSubArray.

https://github.com/tiger40490/repo1/tree/py1/py/array has a tested solution with lots of explanations.

https://www.geeksforgeeks.org/largest-sum-contiguous-subarray/ has a simpler solution, to be understood. I rewrote it in https://github.com/tiger40490/repo1/blob/cpp1/cpp/array/maxSubarraySum.cpp

  1. special case: If all numbers are negative, then final answer is the “smallest” negative number as a lonewolf array.
  2. special case: If all numbers are positive, then final answer is the entire array
  3. so the tricky case must have a mix of negative/positive

Here’s my explanation of Kadane’s algo:

  • Delta Input array has arr[0], arr[1] .. arr[n]. let’s denote maxSubsumEndingAt(i) as B[i]. The max Subarray Ending At #55 is a continuous subarray starting somewhere before #55. It can be a lone-wolf containing only #55, as an important special case. In fact, in the mixed positive/negative array, we usually encounter this case.
  • (2-pointer algo) We maintain a left marker and a right marker. Both move to the right. maxSubArrayEndingAt(i) is basically an accumulating subarray from left marker to right marker.
  • B[i] == max(B[i-1] + arr[i]    vs  arr[i] )
    • if  B[3] is negative i.e. all 4 sub-array sums ending in arr[3] are all negative, then B[4] should not include any of them. We can discard them for good and “start afresh“(while keeping the global maxSumSeen)
    • else, there exists a “positive” sub-array ending at arr[3], so we keep growing it until it becomes negative.
    • (See github python code) I can imagine a left-marker, the start of the current max subarray. We will move the right marker to grow the sub-array if the current sub-array is useful i.e. positive. Otherwise, we start afresh and the left marker jumps and coincide with right-marker
  • Note sign(B[i-1]) is key but sign(arr[i]) is irrelevant

Here’s my attempt to connect my lowWaterMark to Kanade’s algo:

I suspect that whenever we move the left marker, we always have a new lowWaterMark.

(Before the first element, Level is defined as 0 . In case III, the low water mark is usually a negative level.) Suppose A few elements after hitting low water mark level=-66, we hit level=-22. This is not a new water mark. maxSubsumEndingAt[this element] is actually positive since there exists a subarray starting right after the “level=-66” element!

When we hit level=-77, a new water mark, the maxSubsumEndingAt[this element] is basically zero, as the disposableCurSubArray is discarded. We start afresh to accumulate. Essentially, we reset the “base” to be the new water mark.



			

[12] locate]sentence all permutations@99words #dragonSearch

https://leetcode.com/problems/substring-with-concatenation-of-all-words/description/ is very similar

Q1: you are given a list of (possibly repeating) words denoted L that are all the same length , and a long string denoted S. Find a substring of S that is a concatenation of each word in the list exactly once and without any intervening characters. Such a substring is known as a dragon.

To simplify the problem, you can assume it occurs at least once in the sentence.

Example:.
L: “fooo”, “barr”, “wing”, “ding”, “wing”.
S: “lingmindraboofooowingdingbarrwingmonkeypoundcake”.
Answer — fooowingdingbarrwing.

For O(), let’s assume there are H characters in haystack, W characters in each word, and N words (possibly repeating)

Q1b: what if word length is not uniform?

Solution-A — See https://github.com/tiger40490/repo1/blob/py1/py/str/dragonSearch.py with O(HN)

I hope this can be adapted for Q1b — see source comments

Solution-B1 (Brute force) — For each permutation, scan by indexOf() or strstr() or string::find(). Guaranteed to find a solution in finite time. Doable if N is small, since there’s O(N!) element.

Solution-B2 (Brute force) — WN-long sliding window over haystack. Each window is examined to determine if it’s a dragon.

This is reasonable if H is small relative to N.

Solution-C — a potentially faster solution if the list is large — Permutation of 5 would be 120 😦

Phase 1: For each word in the list, i’d count how many times it occurs. The word with lowest occurrence would be a “candidate”.

If every word has the same occurrence (say 3), I think this is still faster than Solution-B, because all but one of the occurrences is a false positive. It’s relatively easy to screen them out.

As soon as I find a word with occurrence=1 exactly, I exit Phase 1 with a super candidate. For this discussion, I assume we have a super candidate.

Updae — In such a case, I can also use the Solution A approach. Even if the best candidate has occurrence = 2 I can consider this approach…

Phase 2a: remove the super candidate from the list. Blindly read the next 4 character in S after our super candidate. See if any word starts with that…..

Phase 2b: do the same leftward. Perhaps by reversing all the strings.