Several practitioners say MOM is unwelcome due to added latency:
- The HSBC hiring manager Brian R was the first to point out to me that MOM adds latency. Their goal is to get the raw (market) data from producer to consumer as quickly as possible, with minimum stops in between.
- 29West documentation echoes this: “Instead of implementing special messaging servers and daemons to receive and re-transmit messages, Ultra Messaging routes messages primarily with the network infrastructure at wire speed. Placing little or nothing in between the sender and receiver is an important and unique design principle of Ultra Messaging.”
- Then I found that the ICE/RTS systems (not ultra-low-latency) have no middleware between the feed parser and the order book engine (named Rebus).
However, HFT doesn’t always avoid MOM. P143 of [[all about HFT]] (published 2010) says an HFT shop such as Citadel often subscribes to both individual stock exchanges and CTS/CQS, and multicasts the market data to the other components of the HFT system. This design inherently has additional buffers. The first layer receives raw external data via a socket buffer. The 2nd-layer components then receive the multicast data via their own socket buffers.
One key reason to subscribe to redundant feeds: CTS/CQS may deliver a tick message faster!
Lehman’s market data is redistributed over TIBCO RV, in FIX format.
Piroz told me that trading IT job interviews tend to emphasize multi-threading and MOM. Some use SQL too. I now feel all of these are unwelcome in low latency trading.
A) MOM – see HFT mktData redistribution via MOM
B) threading – Single-Threaded-Mode (STM) is generally the fastest in theory and in practice. (I only have a small observed sample size.) I feel the fastest trading engines are STM, with no shared mutable state. The new Nasdaq platform (in Java) is STM.
MT is OK if the threads don’t compete for resources like CPU, I/O or locks. Compared to STM, most lock-free systems introduce latency such as retries and additional memory barriers. In STM, compiled code needs no such memory barriers, so the compiler can optimize freely.
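To make the "retries" cost concrete, here is a minimal sketch of a lock-free counter built on a compare-and-set loop. The class name `CasCounter` and the thread and iteration counts are my own illustration, not taken from any trading system. In STM the same work would be a plain `count++` on an ordinary field, with no retry loop and no atomic instruction.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasCounter {
    private final AtomicLong value = new AtomicLong();
    private final AtomicLong retries = new AtomicLong();

    /** Lock-free increment: loop until compareAndSet succeeds, counting failed attempts. */
    void increment() {
        while (true) {
            long seen = value.get();
            if (value.compareAndSet(seen, seen + 1)) {
                return;
            }
            retries.incrementAndGet(); // another thread won the race, so try again
        }
    }

    long value() { return value.get(); }

    long retries() { return retries.get(); }

    /** Runs the given number of threads, each incrementing perThread times, and waits for them. */
    static CasCounter run(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        CasCounter c = run(4, 100_000);
        // The final value is always threads * perThread; the retry count varies from run to run.
        System.out.println("value=" + c.value() + ", retries=" + c.retries());
    }
}
```

The total is always correct, but under contention some increments need more than one attempt, and that extra work is latency that a single-threaded engine never pays.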
C) SQL – as stated elsewhere, flat files are much faster than relational DB. How about in-memory relational DB?
Rebus, the order book engine, is in-memory.
A Singapore ANZ telephone interviewer (Ivan? 2011?) drilled me: “just why is MOM more reliable than a blocking synchronous call without a middleware?” I feel this is a typical “insight” question, but by no means academic or theoretical. There are theories and (more importantly) there is empirical evidence. Here I will just talk about the theoretical explanations.
Capacity – MOM can hold a lot more pending requests than a synch service. An RMI or web server has a limited queue. The TCP socket can also hold requests in a queue, but all of these are limited. In contrast, a MOM queue can be on disk or in the broker host’s memory, giving hundreds or possibly millions of times higher capacity.
A burst of requests can bring down an RMI system even if it is lightly loaded 99% of the time.
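The capacity argument can be sketched with two in-process queues. This is only an analogy under my own assumptions: the `ArrayBlockingQueue` of capacity 10 stands in for a synchronous server's small backlog, the unbounded `LinkedBlockingQueue` stands in for a broker queue, and the burst size of 1000 is made up. No consumer runs during the burst.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BurstDemo {
    /** Offers `burst` requests while no consumer is running; returns how many were refused. */
    static int rejected(BlockingQueue<Integer> queue, int burst) {
        int refused = 0;
        for (int i = 0; i < burst; i++) {
            if (!queue.offer(i)) { // offer() returns false instead of blocking when the queue is full
                refused++;
            }
        }
        return refused;
    }

    public static void main(String[] args) {
        int burst = 1000; // hypothetical burst size
        // Stand-in for a small, fixed backlog (capacity 10 is an assumption).
        System.out.println("bounded backlog refused: "
                + rejected(new ArrayBlockingQueue<>(10), burst));
        // Stand-in for a broker queue whose capacity far exceeds the burst.
        System.out.println("broker-style queue refused: "
                + rejected(new LinkedBlockingQueue<>(), burst));
    }
}
```

With these numbers the bounded backlog refuses 990 of the 1000 requests, while the larger queue absorbs the whole burst and lets the consumer catch up later.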
But what if the synch service has enough capacity so no caller needs to wait? I feel this is wishful thinking. For the same hardware capacity, MOM can support 10x or 100x more concurrent requests. For now, let’s assume capacity isn’t the issue.
Long-running – if some of the requests take a long time (like a few seconds) to complete, then we don’t want too many “on-going” tasks at the same time. They compete for CPU/memory/bandwidth and can reduce stability and reliability. Even logging can benefit from an async MOM design.
But again let’s assume the requests take no time to complete.
ACID — Reliable MOM always persists messages before replying with a positive ACK.
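The persist-before-ACK rule can be sketched as follows. This is a toy journal under my own assumptions, not how any particular broker is implemented: `PersistThenAck`, `acceptMessage` and the temp-file journal are hypothetical names. The only point is the ordering: write, force to disk, then acknowledge.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class PersistThenAck {
    /** Appends the message to the journal, forces it to disk, and only then returns true (the ACK). */
    static boolean acceptMessage(Path journal, String message) {
        ByteBuffer buf = ByteBuffer.wrap((message + "\n").getBytes(StandardCharsets.UTF_8));
        try (FileChannel ch = FileChannel.open(journal,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // flush content and metadata to storage before acknowledging
        } catch (IOException e) {
            throw new UncheckedIOException(e); // the write failed, so no ACK is returned
        }
        return true; // positive ACK, sent only after the write has been forced
    }

    /** Writes one message to a temporary journal (hypothetical file) and reads the journal back. */
    static List<String> demo() {
        try {
            Path journal = Files.createTempFile("journal", ".log");
            acceptMessage(journal, "order-1");
            return Files.readAllLines(journal);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the process dies before `force` returns, the sender never sees an ACK and can resend. If it dies afterwards, the message is already in the journal.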
AutoReo, BidWanted and many trading engines in Citi Muni are all distributed architectures glued together by MOM. This is echoed by the Citadel HFT architecture described in [[all about HFT]].
[[Java CAPS basics]] has a chapter on request/reply patterns, using
* jms — the primary contender
* http without web service
* SOAP — another major choice
5 years apart after I was born, 1975 – 1980.
If you remember one endpoint only, this is it
tibrvsend -daemon mhs-apps149-d.lvt.us.ml.com:7500 // the argument of -daemon is the end point.
The -listen parameter of rvd corresponds to the daemon parameter of the transport creation call
Q1: pros and cons of vector vs linked list?
Q1b: Given a 100-element collection, compare performance of … (iteration? Lookup?)
Q: UDP vs TCP diff?
%%A: multicast needs UDP.
Q: How would you add reliability to multicast?
Q: How would you use tibco for trade messages vs pricing messages?
Q5: In your systems, how serious was data loss in non-CM multicast?
%%A: Usually not a big problem. During peak volatile periods, messaging rates could surge 500%, and data loss would get worse.
Q5b: how would you address the high data loss?
%%A: test with a target message rate. Beyond the target rate, we don’t feel confident.
Q7: how is order state managed in your OMS engine?
%%A: if an order is half-processed and pending the 3nd reply from ECN, the single thread would block.
Q7b: even if multiple orders (for the same security) are waiting in the queue?
%%A: yes. To allow multiple orders to enter the “stream” would be dangerous.
Now I think the single thread should pick up and process all new orders and keep all pending orders in cache. Any incoming exchange messages would join the same task queue (or a separate task queue) – the same single thread.
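A minimal sketch of that revised design, under my own assumptions: `SingleThreadedOms` and its methods are hypothetical names, an order is reduced to an id plus the number of exchange replies it still expects, and new orders and exchange replies share one task queue. The single thread never blocks on a pending order. It parks the order in the cache and moves on to the next task.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class SingleThreadedOms {
    /** One entry in the task queue: either a new order or an exchange reply. */
    private static final class Task {
        final boolean isNewOrder;
        final String orderId;
        final int repliesExpected; // only meaningful for new orders

        Task(boolean isNewOrder, String orderId, int repliesExpected) {
            this.isNewOrder = isNewOrder;
            this.orderId = orderId;
            this.repliesExpected = repliesExpected;
        }
    }

    private final Queue<Task> tasks = new ArrayDeque<>();          // orders and replies share one queue
    private final Map<String, Integer> pending = new HashMap<>();  // orderId -> replies still expected
    final List<String> completed = new ArrayList<>();

    void submitNewOrder(String orderId, int repliesExpected) {
        tasks.add(new Task(true, orderId, repliesExpected));
    }

    void submitExchangeReply(String orderId) {
        tasks.add(new Task(false, orderId, 0));
    }

    /** Drains the task queue on the calling thread; never waits for a reply. */
    void drain() {
        Task task;
        while ((task = tasks.poll()) != null) {
            if (task.isNewOrder) {
                pending.put(task.orderId, task.repliesExpected); // park the order and move on
                continue;
            }
            Integer left = pending.get(task.orderId);
            if (left == null) {
                continue; // reply for an unknown order: ignored in this sketch
            }
            if (left <= 1) {
                pending.remove(task.orderId);
                completed.add(task.orderId);
            } else {
                pending.put(task.orderId, left - 1);
            }
        }
    }

    int pendingCount() { return pending.size(); }
}
```

For example, submit order A expecting 2 replies and order B expecting 1, then one reply for each, and call `drain()`: B completes while A stays in the cache with one reply outstanding, and neither order holds up the other.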
3 main infrastructure teams
* exchange connectivity – order submission
* exchange connectivity – pricing feed. I think this is incoming-only, probably higher volume. Probably similar to Zhen Hai’s role.
* risk infrastructure – no VaR mathematics.
If in one transaction you send a request and then read the reply off the queue/topic, I think you will get stuck. With the commit pending, the send won’t reach the broker, so no reply can ever arrive and you, the requester, will deadlock with yourself forever.
An unrelated design of transactional request/reply is “receive then send 2nd request” within a transaction. This is obviously for a different requirement, but known to be popular. See the O’Reilly book [[JMS]]
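The stuck requester can be illustrated with a toy model. To be clear, this is NOT the JMS API: `ToyTransactedSession` is a hypothetical class that mimics only one rule of a transacted session, namely that sends are held back until commit. The responder is also simulated, and "receive" is a non-blocking poll where `null` plays the role of a timeout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class ToyTransactedSession {
    // Toy model only: sends are buffered locally and reach the "broker" only on commit().
    private final List<String> uncommittedSends = new ArrayList<>();
    private final Queue<String> replyQueue = new ArrayDeque<>();

    void send(String request) {
        uncommittedSends.add(request); // held back until commit
    }

    /** Non-blocking stand-in for a receive with timeout; null means "timed out". */
    String receiveReply() {
        return replyQueue.poll();
    }

    void commit() {
        for (String request : uncommittedSends) {
            // Hypothetical responder: it can answer only once the broker actually has the request.
            replyQueue.add("reply-to-" + request);
        }
        uncommittedSends.clear();
    }

    public static void main(String[] args) {
        ToyTransactedSession session = new ToyTransactedSession();
        session.send("req1");
        System.out.println("before commit: " + session.receiveReply()); // null, nothing was delivered
        session.commit();
        System.out.println("after commit: " + session.receiveReply());
    }
}
```

In this model the reply shows up only after `commit()`. A blocking receive with no timeout, issued before the commit, would wait forever, which is the self-deadlock described above. The usual way out is to commit the send first and receive the reply afterwards, outside that transaction.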