efficient memoization: keyed by auto-increment id

Market Value of this knowledge — I feel this level of intricate knowledge is useful mostly in bigO coding interviews. If you get the bigO wrong there, you can lose major points. Versioned dict #indeed is one typical coding interview question, where the result of read(id) can be memoized to speed up subsequent reads.

Q: what’s the optimal time complexity of maintaining the memoization data structure? User requests can hit us with read(9), read(88), read(15), read(0), read(15) again…

Notation — N:= number of id values.

  • Option 0 (unusable): hash table — nearest-neighbor search impossible
  • Option 1 (sub-optimal): RBTree —  Lookup is O(logN). In general, insertion costs O(logN).

( Special case — If the read() id never decreases, then STL RBtree insert-with-hint is O(1). )

  • Option 2: vector — indexed by id [1]. Lookup is binary search O(logN). Insertion is always amortized O(1). Here’s why.

Eg: after read(9), we get read(88). We would need to null-fill the vector until position 88. This null-fill design looks sub-optimal.

But how is the id 88 created after 87 ? Probably in some write() operation. So write() can incrementally expand this vector 🙂

Actually, this incremental-insert design is not faster. I would rather leave write() alone as the null-fill design is a mass-insert and minimizes re-allocation.

[1] Is it common or lucky to have id values as auto-increment integers? I think it’s quite common in practice and in algo interviews. Auto-increment Integer id is light-weight and fast to generate.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s