GTD: algo trading engine from scratch: ez route

Start by identifying some high-quality, flexible, working code base that’s as close to our requirement as possible. Then slowly add X) business features + Y) optimizations (on throughput, latency etc.). I feel [Y] is harder than [X], though [X] gives higher business value. Latency tuning is seldom high-value, but data volume could be a show-stopper.

Both X/Y enhancements could benefit from the trusty old SQL or an in-memory data store[1]. We can also introduce MOM. These are mature tools to help X+Y. [3]

As I told many peers, my priorities as architect are 1) instrumentation, 2) transparent languages, 3) product maturity.

GTD Show-stopper: data rate overflow. Already addressed
GTD Show-stopper: frequent[4] crashes. Unlikely to happen if you start with a mature working code base. Roll back to last-working version and retest incrementally. Sometimes the crash is intermittent and hard to reproduce 😦 Good luck with those.

To blast through the stone walls, you need power tools like instrumentation, debuggers … I feel these are more important to GTD than optimization skills.

To optimize, you can also introduce a memory manager such as the ring buffer and custom allocator in TP, or the custom malloc() in Facebook. If performance doesn’t improve, just roll back as in Rebus.
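To make the ring buffer idea concrete, here is a rough sketch in Python. It is not the TP implementation; the class name `RingBuffer` and its methods are made up for illustration. The point is that storage is allocated once up front and then reused, so the steady state does no further allocation.

```python
class RingBuffer:
    """Fixed-capacity FIFO that overwrites the oldest item when full.

    Hypothetical sketch for illustration only.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity  # preallocated once, then reused
        self._head = 0                   # index of the oldest item
        self._size = 0

    def push(self, item):
        cap = len(self._slots)
        tail = (self._head + self._size) % cap
        self._slots[tail] = item
        if self._size == cap:
            # Buffer was full: the oldest item has just been overwritten.
            self._head = (self._head + 1) % cap
        else:
            self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from empty ring buffer")
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        self._size -= 1
        return item

    def __len__(self):
        return self._size
```

Overwriting the oldest entry is only one possible policy; a market data feed might prefer to block or to count drops instead.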

For backend, there are many high or low cost products, so they are effectively out of scope, including things like EOD PnL, position management, risk management, reporting. Ironically, many products in these domains advertise themselves as “trading platforms”. In contrast, what I consider in-scope would be algo executor, OMS[2], market data engine [2], real time PnL.

— The “easy route” above is probably an over-simplification, but architects must be cautiously optimistic to survive the inevitable onslaught of adversities and setbacks —

It’s possible that such a design gradually becomes outdated like GMDS or the Perl codebase in PWM-commissions, but that happens to many architects, often through no fault of their own. The better architects may start with a more future-proof design, but more likely, the stronger architects are better at adjusting both the legacy design + new requirements.

Ultimately, you are benchmarked against your peers in terms of how fast you figure things out and GTD….

Socket tuning? Might be required to cope with data rate. Latency is seldom a hard requirement.
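One common kind of socket tuning for data rate is asking the kernel for a larger receive buffer so bursts are less likely to be dropped. A minimal sketch with Python's standard `socket` module; the function name and the 1 MB figure are arbitrary assumptions, and the OS may cap or adjust whatever you request.

```python
import socket

def make_udp_receiver(requested_rcvbuf=1024 * 1024):
    """Create a UDP socket and request a larger kernel receive buffer."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_rcvbuf)
    # Read back what the kernel actually granted; it can differ from the request.
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted
```

Checking the granted size matters because a silently capped buffer looks fine in code review but still drops packets under load.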

Threading? A single-threaded model is probably best. Multiple processes rather than multiple threads.

Shared memory? Even though shared memory is the fastest way to move data between processes, the high-performance and high-throughput ticker plant uses TCP/Multicast instead.

MOM? For a high-speed market data gateway, many banks use MOM because it’s simpler and more flexible.

Inter-process data encoding? TP uses a single simplified FIX-like, monolithic format “CTF”. There are thousands of token types defined in a “master” data dictionary — semi-static data.
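I can't show the real CTF layout here, so the following is only a generic tag=value sketch of the same idea: numeric tokens on the wire, with a data dictionary mapping each token to a meaning. The token numbers, field names and the `|` delimiter are all made up.

```python
# Hypothetical data dictionary: token number -> field name (semi-static data).
DICTIONARY = {1: "symbol", 2: "bid", 3: "ask", 4: "last_size"}
NAME_TO_TOKEN = {name: token for token, name in DICTIONARY.items()}

def encode(fields, delimiter="|"):
    """Turn {"symbol": "IBM", ...} into a "1=IBM|..." message."""
    return delimiter.join(
        f"{NAME_TO_TOKEN[name]}={value}" for name, value in fields.items()
    )

def decode(message, delimiter="|"):
    """Turn a "1=IBM|..." message back into a dict of field name -> string."""
    fields = {}
    for pair in message.split(delimiter):
        token, _, value = pair.partition("=")
        fields[DICTIONARY[int(token)]] = value
    return fields
```

Values come back as strings; typing each token is another job for the dictionary.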

GUI for trader? I would think HTML+javascript is most popular and quick. For a barebones trading engine, the GUI is fairly simple.

Scheduled tasks? Are less common in high speed trading engines and seldom latency-sensitive. I would rely on database or java/c++ async timers. For the batch tasks, I would use scripts/cron.

Testing? I would use scripts as much as possible.

[1] e.g. the GMDS architect chose memory-mapped file, which was the wrong choice.
[2] both require an exchange interface
[3] data store is a must; MOM is optional
[4] If it crashes once a day we could still cope. Most trading engines can shut down when the market is closed.

execution algo^algo strategy

I think the “exec algo” type is lesser known.

  • VWAP is best known example
  • bulk orders
  • used by many big sell-sides (as well as buy-sides) to fill client orders
  • the machine acts as a robot rather than a strategist
  • goal is not to generate alpha, but efficient execution of a given bulk order
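For reference, the VWAP that such an algo is measured against is just the volume-weighted average of trade prices. A minimal sketch; the function name and the (price, quantity) input format are my own choice.

```python
def vwap(trades):
    """Volume-weighted average price of an iterable of (price, quantity) pairs."""
    trades = list(trades)
    total_qty = sum(qty for _, qty in trades)
    if total_qty == 0:
        raise ValueError("no volume to average over")
    return sum(price * qty for price, qty in trades) / total_qty
```

The same function works for the market's trades (benchmark VWAP) and for our own fills (execution VWAP).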


essential GTD know-how to build a HFT infrastructure#early 2015


(See also post on HFT)

Just to share some observations and reflections. More than one recruiter in Asia (and to a lesser extent the US) has reached out to me as a potential lead developer for an HFT engine, to be created from scratch. I believe there are not many old hands in Singapore. Even in the US, this is a relatively small circle. Not a commodity skill.

A small trading shop would have very different needs than a big bank, so their HFT engine will use off-the-shelf tools for all but the most critical, customized modules. (I had a brief blog post on it.) What are the 10 essential pieces of know-how, i.e. the essential functionalities you must know how to create (and prove)?

• executing strategy for order origination, i.e. machine trading
** market data processor
** order book? perhaps at the center of the engine
** not needed — how to come up with strategies
• in-sample testing
• barebones GUI, or perhaps command line interface?
• FIX or other exchange APIs
• store historical data for analysis. Perhaps SQL or KDB.
• data persistence without SQL
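Since the order book probably sits at the center of the engine, here is a barebones sketch of one: aggregate quantity per price level on each side, plus best bid/ask. Class and method names are hypothetical, and a real engine would use a sorted structure rather than scanning a dict.

```python
class OrderBook:
    """Price-level order book for one instrument (illustrative sketch only)."""

    def __init__(self):
        self.bids = {}  # price -> aggregate quantity
        self.asks = {}  # price -> aggregate quantity

    def update(self, side, price, qty):
        """Set the aggregate quantity at a price level; qty 0 removes the level."""
        if side == "bid":
            levels = self.bids
        elif side == "ask":
            levels = self.asks
        else:
            raise ValueError("side must be 'bid' or 'ask'")
        if qty == 0:
            levels.pop(price, None)
        else:
            levels[price] = qty

    def best_bid(self):
        """Highest bid price, or None if there are no bids."""
        return max(self.bids) if self.bids else None

    def best_ask(self):
        """Lowest ask price, or None if there are no asks."""
        return min(self.asks) if self.asks else None
```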

Low-level tech knowledge
• threading
• [W] Boost
• debugger
• [N] memory leak detection
• [N] unit testing
• [NW] socket programming

[N = Not needed in every shop, but often required by interviewer]
[W = my weakness, relatively speaking]

mean reversion, deviation detector, pattern recognition

Nothing, including real time mkt data analyzers, can predict the future. They can point out unusual deviations which often precede reversions to norm. In such a case timing is unpredictable though.

Case in point: when I saw historical highs in copper price, I thought it would drop (reversion) within hours, or at most a couple of days, but it just kept rising and destroyed my position. (A necessary condition for my undoing is margin. No margin, no collapse.)

I guess China Aviation Oil might have something like this?

Such a reversion is one type of pattern. Some patterns have a strong rational, logical basis. Consider historical vol’s mean reversion pattern. Consider the widening spread on-the-run vs off-the-run.
Mean reversion is one type of deviation detector.
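A deviation detector can be as simple as a z-score against recent history. This is a sketch under my own assumptions (sample standard deviation, a threshold of 3); it only flags an unusual deviation and says nothing about when, or whether, a reversion will come.

```python
from statistics import mean, stdev

def deviation_zscore(history, latest):
    """How many sample standard deviations `latest` sits from the mean of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two historical points")
    sigma = stdev(history)
    if sigma == 0:
        raise ValueError("history has no variation")
    return (latest - mean(history)) / sigma

def is_unusual(history, latest, threshold=3.0):
    """Flag the latest value if it deviates by at least `threshold` sigmas."""
    return abs(deviation_zscore(history, latest)) >= threshold
```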

execution risk in a VWAP execution algo

Background: VWAP strategy is known to have minimal market impact but bad “execution risk”.

Suppose you are given a large (500,000 shares) Sell order. Suppose your goal is minimal market impact, i.e. avoid pushing down the price. What execution strategy? I don't know about other strategies, but VWAP strategies generally participate according to market volume, so given a decent implementation the market impact is often … reduced.

I think the idea of exec risk is the _uncertainty_ of the final block price. If an implementation offers very tight control and results in a well-controlled final block price, then exec risk is small. One source explains with an example:

Suppose MSFT trades 30 million shares on an average day. If a trader has three million MSFT shares to trade, a VWAP algorithm may be appropriate. However, if the trader gets 30,000 shares of MSFT to trade, then the savings of market impact (by spreading the trade over the whole day) is not significant compared against the opportunity cost the trader could save by trading the stock within the next few minutes. Quick execution means the uncertainty (or std) in “final block price” is much reduced. With a small order you would achieve something close to the arrival price.
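To put rough numbers on the example above: the first function gives order size relative to daily volume, and the second gives a back-of-envelope price uncertainty over the execution window. The second rests on an assumption of mine, not the source's: a driftless random walk, so the std grows with the square root of time. Function names are made up.

```python
import math

def participation_rate(order_qty, avg_daily_volume):
    """Order size as a fraction of average daily volume."""
    return order_qty / avg_daily_volume

def price_uncertainty(price, daily_vol, fraction_of_day):
    """Rough std (in price units) of the price move over the execution window.

    Assumes a driftless random walk, so std scales with sqrt(time).
    daily_vol is the one-day return std as a fraction, e.g. 0.02 for 2%.
    """
    return price * daily_vol * math.sqrt(fraction_of_day)
```

Three million shares is 10% of the 30 million daily volume, while 30,000 shares is 0.1%. Under the square-root assumption, working an order over a quarter of the day carries half the price uncertainty of working it over the full day.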

vwap execution chasing the wrong signal – my guess

A vwap algo starts with a “model profile”, which tells us what percentage of the total daily volume each hour (or minute) of the trading day typically accounts for.

Then the algo tries to execute according to the model profile, executing 10% in the first hour. The actual market profile may show a spike in the second hour. Suppose 2nd hour usually gets below half of first hour according to the model profile, but we see it's going to more than double the first hour, because the past 5 minutes show a very high spike in volume.
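Executing according to the model profile amounts to slicing the parent order in proportion to each bucket's expected share of volume. A minimal sketch, assuming integer share quantities; the function name is hypothetical and the profile weights in the usage note are invented.

```python
def vwap_schedule(total_qty, profile):
    """Split total_qty across time buckets in proportion to the profile weights.

    profile: expected share of daily volume per bucket (any positive scale).
    Returns one integer quantity per bucket; the quantities sum to total_qty.
    """
    weight_sum = sum(profile)
    if weight_sum <= 0:
        raise ValueError("profile weights must sum to a positive number")
    slices = [int(total_qty * w / weight_sum) for w in profile]
    # Rounding leftovers go to the last bucket so nothing is left unscheduled.
    slices[-1] += total_qty - sum(slices)
    return slices
```

For example, `vwap_schedule(500000, [10, 4, 86])` puts 10% of the order in the first bucket and 4% in the second. This static schedule is exactly what falls behind when the actual volume profile departs from the model.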

Question is, should we increase our trade rate? I guess there's reason to do so. When the volume spikes, we should trade bigger chunks so as to let the spike “mask and absorb” our market impact. If we don't capture this spike, then 2nd hour might end up being 80% of daily volume, but we only put in 4% of our quantity, so our remaining quantity would cause market impact.

However, it's also possible to chase the wrong signal. The spike might cause a large rise or (more likely in realistic panic-prone markets) drop in price, which could reverse soon. Suppose we are selling a big quantity and the spike causes a big drop. Our active participation would further deepen the drop. We might do better to patiently wait for a reversal.

HFT – some defining features

I learnt it from a seminar. I didn't do any research. Just some personal observations.

* HFT is different from a broker execution algo in terms of holding period. HFT never holds a position overnight.

* HFT is alpha-driven. In contrast, a sell-side trading engine is driven by customer flow.

** however, HFT doesn't use market orders as much as limit orders, so it may appear to be market-making.

* an HFT engine makes many, many trades in a day, but so does a broker execution algo.

* HFT usually makes no directional bet. In contrast, fundamental strategies have a view. I feel a sell-side often has a view and may hold inventory overnight.

I think the distinction can become unclear between HFT and execution algos.

slippage – several similar meanings

slippage = difference between 2 prices — the earlier quoted by an OTC dealer (FX, bond, swap…) vs the requote when you take that quote. In many quote-driven markets, all public quotes are always indicative.

slippage = difference between 2 prices – the EOD benchmark vwap vs your block order’s execution vwap, assuming order filled within a day.

slippage = difference between 2 prices – the price that triggered your trading signal vs execution price.
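All three meanings share one shape: an execution price compared with some reference price (the earlier quote, the benchmark vwap, or the signal price). A small sketch; the sign convention, where positive means we did worse than the reference, is my own assumption.

```python
def slippage(side, execution_price, reference_price):
    """Signed slippage per unit; positive means worse than the reference.

    reference_price can be the earlier dealer quote, the EOD benchmark vwap,
    or the price that triggered the signal, depending on which meaning applies.
    """
    if side == "buy":
        return execution_price - reference_price
    if side == "sell":
        return reference_price - execution_price
    raise ValueError("side must be 'buy' or 'sell'")
```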

VWAP market impact = slippage

Suppose we have a 5000 IBM buy-order. VWAP is the benchmark chosen. Our goal is to predict market impact. If the predicted MI is too large, then we should either be very careful, or decline the block order.

The predicted market impact is also the predicted slippage.

Let's put in some fake but concrete numbers. Suppose we estimate that submitting this block buy would raise today's eod_vwap by 100 bps. That means whatever today's eod_vwap happens to be, our execution_vwap will be higher by 1%.

If eod_vwap is $70 then our market impact is about $0.70

If eod_vwap is $75 then our market impact is about $0.75
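The arithmetic above is just a basis-point conversion (1 bp = 0.01%). As a sketch, with a function name of my own:

```python
def impact_in_dollars(eod_vwap, impact_bps):
    """Convert a market-impact estimate in basis points into dollars per share."""
    return eod_vwap * impact_bps / 10_000
```

With the fake 100 bps estimate, an eod_vwap of $70 gives $0.70 per share and $75 gives $0.75.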

I think we should inform our client that we can execute the block order at the benchmark eod_vwap + some spread. I would quote a spread like $1 given spot is $69 and I estimate today's eod_vwap to be $69.50.

Suppose client accepts and we execute the order and execution_vwap = $69.9. What's the realized market impact? We can only imagine “what if I had not done this big order”. However, we can observe the benchmark eod_vwap = $69.10. We performed better than quoted. Slippage = execution_vwap – eod_vwap, so our slippage is smaller than our quote!
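The same comparison as a sketch, for a buy order. The function name is hypothetical; the inputs are the fake numbers from this example.

```python
def realized_vs_quote(execution_vwap, eod_vwap, quoted_spread):
    """Return (realized slippage, whether it stayed within the quoted spread).

    Written for a buy order, where paying above the benchmark is the cost.
    """
    realized = execution_vwap - eod_vwap
    return realized, realized <= quoted_spread
```

Calling `realized_vs_quote(69.90, 69.10, 1.00)` gives a realized slippage of about $0.80, inside the $1 spread quoted to the client.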

[[all about HFT]]

Author is an options specialist (currently teaching derivatives at a university). Many mentions of HFT on options.
chapters (80p) on technology. Author believes the 3 legs are {strategy, math, tech}
chapter (50p) on strategy
**first part seems to be uninteresting, math-light but might be important in practice
**chapter (12p) on arbitrage strategies
1 page on native API vs FIX.
a few pages on cpu offloading, including running Monte Carlo on GPGPU
compares C++ vs C#/Java in a HFT context
compares buy vs build in a HFT context