stateless (micro)services #%%1st take

In 2018 I have heard of more and more sites that push the limits of stateless design. I think this “stateless” trend is innovative and /bold/. Like any architecture, these designs have their own “problems” and limitations, so you need to keep a lookout, deal with them, and adjust your solution.

Stateless means simplicity, sometimes “extreme simplicity” (Trexquant)

Stateless means lightweight. Easy to “provision”, easy to relocate.

Stateless means easy scale-out.

Stateless means easy clustering. HTTP is an example: if a cluster of identical instances is stateless, then no “conversation” state needs to be maintained across requests.


#undef NDEBUG before including assert.h

#undef NDEBUG
#include <assert.h>

If you include the header before the undef, then the undef has no effect!

You need this technique if the build system permanently, brutally defines NDEBUG and tries to kill all asserts.

RDBMS performance boosts to compete with noSQL

  • In-memory? I think this can give a 100X boost to throughput
  • memcached? “It is now common to deploy a memory cache server in conjunction with a database to improve performance.”, according to [1]. Facebook uses memcached.
  • most-used indices are usually cached
  • early mysql was much faster than traditional RDBMSes partly because it lacked ACID transaction support


[18] fastest threadsafe queue, minimal synchronization #CSY

I got this question in a 2017 Wells white-board coding interview, and discussed it with my friend Shanyou. We hoped to avoid locks and also other synchronization devices such as volatile and atomic variables.

Q1: suppose there is only a single producer thread and a single consumer thread, and no other threads.

I put together a java implementation that can enqueue without synchronization most of the time.

Q1b: Is it possible to avoid synchronization completely, i.e. code it as if single-threaded?
A: No. The consumer thread would otherwise have absolutely no idea how close it is to the producer end. We need a memory barrier at the very least.

Q2: what if there are multiple producer/consumer threads?

I believe we can use two separate locks for the two ends, rather than a global lock. This is more efficient but invites the tricky question of “how to detect when the two ends meet”. Acquiring and releasing a lock does enforce a memory barrier, so visibility between the two ends is covered.

Alternatively, we could use CAS on both ends.


thread^process: lesser-known differences #kernel

This is largely a QQ question.  Some may consider it zbs.

  • To the kernel, there are many similarities between the thread construct and the process construct. In fact, a (non-kernel) thread is often referred to as an LWP (lightweight process) in many kernels such as Solaris and Linux
  • socket — threads in a process can access the same socket; two processes usually can’t access the same socket, unless … parent-child. See post on fork()
  • memory — thread AA can access all heap objects, and even thread BB’s stack objects. Two processes can’t share these, except via shared memory.
  • context switching — is faster between threads
  • creation — some thread libraries can create threads without the kernel knowing. No such thing for a process.
  • a non-kernel thread can never exist without an owner process. A process always has a parent process which could be long gone.

SDI: URL shortening

Q: Design TinyURL or bitly (a URL shortening service)

Given a (typically) long URL, how would you design a service that generates a shorter, unique alias for it?

Discuss things like:

  • How to generate a unique ID for each URL?
  • How would you generate unique IDs at scale (thousands of URL shortening requests coming every second)?
  • How would your service handle redirects?
  • How would you support custom short URLs?
  • How to delete expired URLs etc?
  • How to track click stats? This alone is a long discussion.

SDI: DoS-guard #Nsdq

Q: Design an API Rate Limiter (e.g. for Firebase or Github)

You are expected to develop a rate-limiter service that can:

  • Limit the number of requests an entity can send to an API within a time window, e.g. 15 requests per second.
  • Work in a distributed setup, as the APIs are accessible through a cluster of servers.

(A similar question was asked at Nsdq… )

Q2: how does your cluster of cache servers detect that a given IP on the Internet is sending requests too frequently, causing denial of service? How do you protect yourself?

Q2b: After you blacklist a client IP, it goes quiet, then it sends a single request again. How do you decide whether to ignore the request?

Q2c: what algorithm to decide if a client IP has legitimate need to send lots of requests vs another client IP engaging in Denial of Service attack?

Q2d: what about a distributed DoS attack? This has practical solutions.

UPSTREAM tech domains: defined by eg

It pays to specialize in a domain that helps you break into similar domains. (Unsuccessful with my quant adventure so far.)

  • eg: [L] socket —- C++ is the alpha wolf, leader of the wolf pack
  • eg: [L] latency —- C++ is the leader of the pack in low-level optimizations .. In-line, cache-efficiency, footprint..
  • eg: collections —- C++ is the leader of the pack. There are a lot of low-level details you gain in STL that help you understand the java/c#/python collections
  • eg: threading —- java is the leader of the pack. In c++ threading is too hard, so once I master some important details in java threading, it helps me build zbs in c/c#, but only to some extent
  • eg: [L]pbref^val —- C++ is the leader. C# has limited complexity and java has no complexity
  • eg: regex/string —- Perl is the leader
  • eg: alpha —- equities trading is the leader
  • [L] heap/stack —- C++ is the leader. Java and c# are cleaner, simpler and hides the complexities.
  • defensive coding —- (and crash management) C++ is the leader, because c++ hits more errors
  • lambda —- C# is the leader. Scripting languages follow a different pattern.
  • list/stream —- Linq is the leader
  • Black-Scholes —- feels like a leader or upstream, but not /convincing/
  • execution algo —- sell side equities desk seems to be the leader
  • data analytics —- python feels like the leader
  • [L = lowLevel]


2nd in-depth c++ job .. critical mass


  • After Citi-muni, even though I had enough experience to pass similar job interviews, I didn’t feel confident in GTD, so I took a 2nd real-time trading system job at Baml, and reached critical mass
    • I did learn more in the following 3 months than I would have over another 3 months in Citi 
  • RTS is similar. I could already pass real-time c++ interviews, but I didn’t feel confident in GTD.
    • Note the Mac job couldn’t count as a c++ job.
  • How about c#? I actually feel confident about GTD in another c# team, so I didn’t need a 2nd c# job