Market Value of this knowledge — I feel this level of intricate knowledge is useful mostly in bigO coding interviews. If you get the bigO wrong there, you can lose major points. Versioned dict #indeed is one typical coding interview question, where the result of read(id) can be memoized to speed up subsequent reads.
Q: what’s the optimal time complexity of maintaining the memoization data structure? User requests can hit us with read(9), read(88), read(15), read(0), read(15) again…
Notation — N:= number of id values.
- Option 0 (unusable): hash table — nearest-neighbor search impossible
- Option 1 (sub-optimal): RBTree — Lookup is O(logN). In general, insertion costs O(logN).
( Special case — If the read() id never decreases, then STL RBtree insert-with-hint is O(1). )
- Option 2: vector — indexed by id . Lookup is binary search O(logN). Insertion is always amortized O(1). Here’s why.
Eg: after read(9), we get read(88). We would need to null-fill the vector until position 88. This null-fill design looks sub-optimal.
But how is the id 88 created after 87 ? Probably in some write() operation. So write() can incrementally expand this vector 🙂
Actually, this incremental-insert design is not faster. I would rather leave write() alone as the null-fill design is a mass-insert and minimizes re-allocation.
 Is it common or lucky to have id values as auto-increment integers? I think it’s quite common in practice and in algo interviews. Auto-increment Integer id is light-weight and fast to generate.