# tail-recursion Fibonacci

Tail recursion is a “halo” skill in coding interviews.

The easiest tail-recursion exercise is factorial(N). For Fibonacci, https://stackoverflow.com/questions/22111252/tail-recursion-fibonacci has a very short python implementation. Let me rephrase the question:

Q: Given that f(firstVal=0, secondVal=1, length=0) returns 0 and f(0,1,1) returns 1, can you implement f(0,1,N) using recursion, in O(N) time and O(1) space? Note Fib(N) == f(0,1,N)

Key points in the python solution:

• use 2 extra arguments to hold the last two intermediate values, like Fib(2), Fib(3) etc
• We saw in the iterative solution that memory usage is O(1), a good sign that tail recursion might be possible.
• if you observe the sequence of Fib() values computed inside the blackbox, you actually see Fib(2), Fib(3) … up to Fib(N), exactly like the iterative solution.
• the solution is extremely short but non-trivial

# GS c++/dataStruct IV

Q1: what’s the problem to be solved by virtual inheritance?
A: the diamond problem, i.e. duplicate grandparent sub-objects.

Q2: consider such a diamond with A as grandparent, B/C virtually deriving from A, then grandchild D. Suppose class C has a foo() method that uses a field A.a. Now, within C, the offset of A.a depends on the inheritance hierarchy, so how does the compiler compile C.foo()?

A: I do see a problem here. In this A-B-C-D case, the offset of A.a is one value, depending on the size of B. However, in an A-B-J-K-C-D case (J/K also virtually subclassing A), the offset of A.a will be a different value, depending on the sizes of B/J/K. So C.foo() would seem to need different object code for each hierarchy, yet the compiler emits a single version of C.foo()!

The trick is the offset of the A sub-object within C. This offset value is not a compile-time constant; it can be saved in the vtable of C (as a virtual-base offset, in typical implementations).

Note that without virtual inheritance, things are simpler: within C, the A sub-object’s offset is a compile-time constant, so there is no need to look up the vtable.

Q3: appending a million integers to a list vs a vector, which is faster and by how much?
%%A: vector requires reallocation, roughly 20 times assuming the capacity doubles each time, far fewer than the 1M allocations for the list. Each reallocation entails element-by-element copying, but copying is cheaper than allocation. If I must estimate how much faster, I need to estimate the number of allocations and the number of copies:

• for list, 1M allocations + 1M copies
• for vector, 20 allocations + about 2M copies

%%A: I still think allocation is a hotspot and much more expensive than copying, perhaps by 10 or 100 times.

%%A: of course, you can pre-allocate (reserve) for vector, but for this problem we ignore that feature.

Q4: deque — known for fast append/prepend, and random access. How?
%%A: I think it’s implemented as segments of vectors, with the front and back segments not fully occupied and growable. Once one of them reaches a certain size, a new segment is created. Random access is implemented by first identifying the host segment. To keep that fast, segments should have equal length.

Q4b: what if we want to support fast midstream insertion?
%%A: this is a vector limitation inherited by deque. If we need to support it, we may need to allow unequal segment lengths. Random access is then slightly slower. For a given subscript (say 16), we need to maintain a step function like

• 5-10 in Segment 2
• 11-13 in Segment 3
• 14-20 in Segment 4
• 21-30 in Segment 5
• 31-32 in Segment 6
• … so each time any segment expands, the step function needs an update
• Once we have the step function, a binary search will do, at log(S) where S = Segment count

# show free slots between meetings #bbg c++11

Q: You are given a list of teleconference bookings like {9,10}, meaning from 9am to 10am. We can multi-task, so overlap is fine. Can you print out all the free slots within the day? You can safely assume all the input data are sensible and between 9 and 18, i.e. our workday.

I solved this problem on the spot. My weakness is still the ECT (edit-compile-test) cycle, especially the compiler errors.

I was familiar with the inverse problem of listing all busy sessions. So I decided to *focus* on that. I took a calculated risk that once I get that to work, I will have the cool confidence to solve the original problem.

corner case: two adjacent meetings. I added a comment to specify exactly how to check for that.
corner case: what if the entire day is one busy session?

```
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

struct Intv { // can be a single meeting, a joined session or a free slot
    size_t st, end; // 9 means 9 o'clock; could be float
    Intv(size_t a, size_t b) : st(a), end(b) {}
    bool operator<(Intv const & other) const { return this->st < other.st; }
    friend ostream & operator<<(ostream & os, Intv const & a) {
        os << a.st << '-' << a.end;
        return os;
    }
};
void printBusy(vector<Intv> const & arg) {
    if (arg.empty()) return;
    vector<Intv> & v = const_cast<vector<Intv>&>(arg); // sort in place despite the const ref
    sort(v.begin(), v.end());
    size_t const sz = v.size(); // cache
    Intv growable = v[0];
    for (size_t i = 1; i < sz; ++i) {
        Intv & req = v[i];
        if (growable.end < req.st) { // use <= instead if adjacent meetings count as separate sessions
            cout << growable << endl; // close current session
            growable = req; // start new session
            continue;
        }
        growable.end = max(growable.end, req.end);
    }
    cout << "last session : " << growable << endl;
}
int main() {
    // pass an initializer list to initialize a const vector
    printBusy({Intv(15, 17), Intv(7, 9), Intv(5, 7)});
}
```

Breadth-first traversal visits the root level (one node only), then the 2nd generation, then the 3rd generation etc. The BFT below is aware of the height or “generation” currently being visited. (The hard-coded tree is a plain binary tree, not a search tree.)

```
#include <cassert>
#include <iostream>
#include <queue>
using namespace std;

struct Node {
    int data;
    Node *left, *right, *next;
    Node(int x, Node * le = NULL, Node * ri = NULL) : data(x), left(le), right(ri), next(NULL) {}
};
Node _15(15);
Node _14(14);
Node _13(13);
Node _12(12);
Node _11(11);
Node _10(10);
Node _9(9);
Node _8(8);
Node _7(7, &_14, &_15);
Node _6(6, NULL, &_13);
Node _5(5, &_10, NULL);
Node _4(4, NULL, &_9);
Node _3(3, &_6,  &_7);
Node _2(2, &_4,  &_5);
Node root(1, &_2, &_3);
Node marker(-1); // to be added to queue to mark a new level

Node * dequeue(queue<Node*> & q) {
    assert(!q.empty());
    Node * n = q.front();
    q.pop();
    return n;
}
void BFT() {
    queue<Node *> q;
    size_t height = 0;

    for (q.push(&marker), q.push(&root); !q.empty();) {
        Node * n = dequeue(q);
        if (n == &marker) {
            if (q.empty()) {
                cout << "  ^^^^^^^^^^" << endl;
                break;
            }
            n = dequeue(q);
            q.push(&marker); // move the marker to end of queue
            cout << "starting height # " << ++height << endl;
        }
        int dataL = 0, dataR = 0;
        if (n->left) {
            cout << "pushing " << n->left->data << endl;
            q.push(n->left);
            dataL = n->left->data;
        }
        if (n->right) {
            cout << "pushing " << n->right->data << endl;
            q.push(n->right);
            dataR = n->right->data;
        }
        cout << dataL << " <- " << n->data << " -> " << dataR << endl;
    }
}
int main() {
    BFT();
}
```

# imaginary IV question: BAG data structure

A “Bag” data structure is a generalized Set that allows duplicates. Implement it in your choice of language. Let’s take python for example:

myBag = Bag()
myBag.add('x'); myBag.add('x'); myBag.add('y')
myBag.size() == 3
myBag.count('x') == 2
myBag.remove1('x')
myBag.count('x') == 1

# Monkey-jump problem #Ashish IV@Baml #solved

On the 2D-coordinate plane, a monkey starts from point [a,b] and wants to reach [P,Q]. From any point, the monkey can only jump to one of two points:

1. [x,y] up to [x, x+y]
2. [x,y] out to [x+y, y]

Q: Can you write bool canReach(unsigned long long a, unsigned long long b, unsigned long long P, unsigned long long Q)?
A: work backwards from [P,Q], where there is no decision to make, since each point has at most one valid predecessor!

```
#include <iostream>
using namespace std;
struct Point {
    int x, y;
    Point(int x, int y) : x(x), y(y) {}
    friend ostream & operator<<(ostream & os, Point & n) {
        os << n.x << "," << n.y;
        return os;
    }
};
bool canReach(unsigned int a, unsigned int b, unsigned int P, unsigned int Q) {
    Point lower(a, b), higher(P, Q);
    for (Point p = higher;; cout << p << endl) {
        if (p.x == lower.x && lower.y == p.y) {
            cout << "Good:)" << endl;
            return true;
        }
        if (p.x == p.y || p.x < lower.x || p.y < lower.y) {
            cout << "Failed:( Can't jump further" << endl;
            return false;
        }
        if (p.x < p.y) { p.y = p.y - p.x; continue; }
        if (p.x > p.y) { p.x = p.x - p.y; continue; }
    }
}
int main() {
    cout << canReach(1, 10, 57, 91);
}
```

# sorted circular array max() in O(log N)

http://practice.geeksforgeeks.org/problem-page.php?pid=819

A sorted array A[] with distinct elements is rotated at some unknown point; the task is to find the max element in it.

Expected time complexity: O(log n)

—-analysis:

It takes constant time to determine whether the list is ascending or descending.

(Take the first 3 elements. If they are not monotonic, the entire task is simple: the answer is the max among the three, because we have hit the turning point.)

Suppose we know it’s ascending; use binary search to locate the turn.

```
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
vector<double> v{57, 56, 55, 91, 82, 73, 64};
int N = v.size();

int main() {
    auto aa = v[0], bb = v[1], cc = v[2];
    if ((aa - bb) * (bb - cc) < 0) {
        cout << max(max(aa, bb), cc) << " found within first 3 elements" << endl;
        return 0;
    }

    if (aa < bb) { // ascending, possibly rotated
        for (int probe = N / 2, left = 0, right = N - 1;; probe = (left + right) / 2) {
            if (left + 1 == right) {
                cout << max(v[left], v[right]) << endl;
                return 0;
            }
            if (v[left] < v[probe]) // probe still in the first (larger-value) run
                left = probe;
            else
                right = probe;
        }
    }
    else { // descending
        if (v[0] > v[N - 1]) { // not rotated at all: max is the first element
            cout << v[0] << endl;
            return 0;
        }
        for (int probe = N / 2, left = 0, right = N - 1;; probe = (left + right) / 2) {
            if (left + 1 == right) {
                cout << max(v[left], v[right]) << endl;
                return 0;
            }
            if (v[left] > v[probe]) // probe still in the first (smaller-value) run
                left = probe;
            else // probe already past the jump up to the max
                right = probe;
        }
    }
}
```

# c++CollabEdit/Broadway IV: implement hash table #I used py

Q: implement a hash table class in any language. You could use existing implementations of linked list, array, hash function…

Q: talk about how you would implement rehash?
%%A: the hash codes of the key objects won’t change, but I would rerun the modulus against the new bucketCount. Based on the new index values, I would create the linked lists in each new bucket. Every pair needs to be relocated. Lastly, I would get rid of the old bucket array.

Q: how would you test your hash table?
%%A: try inserting (key1, val1), then (key1, val2), then look up key1
%%A: if I know any common weakness of the hash function, then test those.
%%A: trigger rehash

Q: what could go wrong in a multi-threaded context?
%%A: things like lost update or duplicate entries

Q: What concurrency solution would you choose for best performance?
%%A: could use a lockfree algo at each point of writing to the bucket array or writing to a linked list.

# algo: minimum-cost array shrinking #Citadel

Input is a vector of positive integers. You are to shrink it progressively. At each step you remove 2 selected elements and replace them with their sum, so the vector size drops by 1 per step, until there’s one element left.

At each step there’s a cost, which is defined as that sum.

Eg: {4,2,1} becomes {4,3} after you combine 2 and 1. The cost at this step is the sum 2+1=3.

Q1: For a given vector, find a sequence of steps with the lowest total cost. Test your code in c++
Q2: how would you optimize your code, assuming high data volume.

```
#include <vector>
#include <queue>
#include <algorithm>
#include <iostream>
#include <string>
#include <functional> // std::greater
using namespace std;

vector<int> vec = { 3, 2, 1 }; // a std::set would fail when duplicate values show up, like {3,3}
priority_queue<int, vector<int>, std::greater<int> > pq(vec.begin(), vec.end());

void dumpVector(string msg) {
    cout << msg << ", size = " << vec.size() << endl;
    for (auto it = vec.begin(); it != vec.end(); ++it) cout << *it << ' ';
    cout << endl;
}

// the naive O(N^2) alternative, kept for comparison: scan for the min each time
int operateVector(int sum = 0) {
    auto lowestItem = min_element(vec.begin(), vec.end());
    sum += *lowestItem;
    vec.erase(lowestItem); // lowestItem is now an invalidated iterator and unusable

    // remove() is bad as it removes ALL (not one) items having the target value!
    // v.erase(std::remove(v.begin(), v.end(), *lowestItem), v.end());

    dumpVector("after erase");
    return sum;
}

void dumpHeap(string msg) {
    auto clone = pq;
    cout << msg << ", size = " << clone.size() << endl;
    for (; !clone.empty(); clone.pop()) {
        std::cout << clone.top() << ' ';
    }
    cout << endl;
}
int operateHeap(int sum = 0) {
    sum += pq.top();
    pq.pop();
    //dumpHeap("after pop");
    return sum;
}

int f1(int sum = 0) {
    return operateHeap(sum);
}
int main() {
    int totalCost = 0;
    for (; pq.size() > 1;) {
        int sum = f1(f1()); // call f1() twice, feeding the first sum into the second call
        pq.push(sum);
        dumpHeap("after push");
        totalCost += sum;
    }
    cout << "total cost = " << totalCost << endl;
    return 0;
}
```

# construct graph from list of connections #BGC

Given an input file showing a list of {string, string} pairs, build a connection graph.

If you have a correct connection graph, then you can easily determine the connectedness (a bool) of any 2 nodes. In a social network, this bool flag indicates whether 2 individuals are completely unconnected or somehow connected.

—-analysis:
I see this as a social network. Any pair represents an edge connecting 2 nodes. At any time there are a number of disconnected islands. The next pair could 1) merge 2 islands, 2) add a node to an existing island, 3) create a new island, or 4) do nothing, if the 2 nodes are already in the same island.

• Any known node appears exactly once in the entire graph, in exactly one of the islands.
• All nodes are contained in a lookup table or hashmap {node -> island}
• Each island can be implemented as a hashset of nodes.

So here’s a proposed algo to process a new pair {A, B}. Look for A and B in the graph. 3 scenarios plus a dummy scenario:

• (Scenario 3) If both A and B are newcomers, then they form a new island.
• if both A and B are already in the graph,
• (Scenario 4) if they are in the same island, then exit. Nothing to do.
• (Scenario 1) else we can merge the 2 islands
• (Scenario 2) If A is in island 3 but B is a newcomer, then B joins island 3

The merge operation is expensive, since the big lookup table would need an update for every moved node, but here’s an alternative:

• At merge time, the smaller island has all its nodes moved to the bigger island. Once the island is empty, it gets a pointer “this.redirect” to the bigger island.
• The lookup table needs no update, avoiding locking a global object.
• At query time, we look up the table to get the original island, then follow its redirect pointer (defaulting to null) until we reach a non-empty island.
• Endless loop? That would only be a programming error, since redirects always point from an emptied island to its merge target.