## observations@high-volume, latency-sensitive eq trading sys #CSY

This is probably the biggest sell-side equity order-management system (OMS) out there, written in c++11. Daily order volume is probably the highest among all investment banks, presumably 6 to 7 figures based on my speculation, though a lot of those orders get canceled, rejected or go unfilled. I can’t reveal too many internal details due to compliance.

In contrast, GS used to get about a million individual trades a day, probably not counting the high-frequency small trades.

  • I have not seen a message queue so far, but one could be hidden somewhere. Earlier, people told me Tibco (and similar messaging middleware) was popular in fixed income and other trading, but now I doubt it. Queues add latency.
    • We do use some pub-sub MOM, but not for order messages, so it is not part of the order flow.
  • I haven’t noticed any locking or condition variables so far. I think single-threaded mode is faster than synchronized multi-threading. Multiple instances of the same software run in parallel across machines. I think this is in many ways better than one big monolithic process hosting many threads. We have 4 threads per instance in some cases.
  • Socket programming is not needed in any module. I believe the applications communicate via FIX, SOAP etc., on top of well-encapsulated TCP library modules.
  • The RDBMS is loaded into cache at start-of-day and seldom accessed intra-day. I confirmed this with an ex-DBA colleague.
  • no garbage collection, unlike java and dotnet
  • heavy use of CRTP; I don’t remember seeing many virtual functions (see the sketch after this list).
  • The most important message is the order object, represented by a FIX message. The order object gets enriched and modified by multiple functions in a chain, then sent out via a FIX session to the next machine. As in any OMS, the order object is stateful. I still don’t know where the order objects are saved. I would think they are persisted somewhere so a crash won’t wipe out pending orders.
    • (Elsewhere, I have seen very lean and mean buy-side OMS systems that don’t persist any order! After a crash, they would query the exchange for order states.)
  • The 2nd most important message is probably the response object, also represented by a FIX message. If there are 100,000 order objects then there are roughly 300,000 response objects. Each order generates multiple responses such as Rejection, PendingNew, New, PartialFill, PendingCancel, Cancelled… Response objects probably don’t need to be persisted, in my view.
  • The 3rd most common message is the report message object, again in FIX format. Each order object probably generates at least one report, even if rejected. Report objects sound simple but they carry essential responsibilities: not only regulatory reporting and client confirmations, but also trade booking, trade capture… If we miss an execution report, the essential books and records (inventory, positions…) would be messed up. However, these reports are not so latency-sensitive.
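To make two of the bullets above concrete (CRTP instead of virtual dispatch, and an order enriched by a chain of functions), here is a minimal C++11 sketch; the enricher names and rules are hypothetical, not the system’s actual classes.

```cpp
#include <string>

struct Order {                      // stands in for the FIX-backed, stateful order object
    std::string symbol;
    std::string account;
    double      price = 0;
};

template <typename Derived>
struct Enricher {                   // CRTP base: the call resolves at compile time, no virtual dispatch
    void enrich(Order& o) { static_cast<Derived*>(this)->doEnrich(o); }
};

struct AccountEnricher : Enricher<AccountEnricher> {
    void doEnrich(Order& o) { if (o.account.empty()) o.account = "DEFAULT"; }  // hypothetical rule
};

struct PriceChecker : Enricher<PriceChecker> {
    void doEnrich(Order& o) { if (o.price < 0) o.price = 0; }                  // hypothetical rule
};

// The chain: each stage modifies the same order before it goes out via a FIX session.
inline void runEnrichmentChain(Order& o) {
    AccountEnricher accountStage;
    PriceChecker    priceStage;
    accountStage.enrich(o);
    priceStage.enrich(o);
}
```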

## sell-side eq e-trading arch features #MS,Baml..

Mostly inspired by the MS equity order-management “frameworks”

  • message-based, not necessarily MOM.
    • FIX messages are the most common
    • SOAP messages are also possible.
    • BAML system is based on MOM (tibrv)
  • message routing based on rules? Seems to be central to some sell-side /bloated/ “platforms” consisting of a constellation of processes.
  • event-driven
    • client newOrder, cancel requests
    • trading venue (partial) fills
    • Citi muni reoffer is driven by market data events, but here I focus on equity systems
    • Stirt realtime risk is driven by market data events + new trade booking events
    • buy-side would have order-origination events, but here I focus on sell-side systems
  • market data subscription? Actually not so important to some eq trading engines. Buy-side would make trading decisions based on market data, but a sell-side won’t.

conflation: design question

I have hit this same question twice — Q: in a streaming price feed, you get IBM prices in the queue but you don’t want consumer thread AA to use “outdated” prices. Consumer BB needs a full history of the prices.

I see two conflicting requirements from the interviewer, and I would point out this conflict to the interviewer.

I see two channels needed: in-band plus out-of-band.

  1. in-band only — if full tick history is important, then the consumers have to /process/ every tick, even if outdated. We can have dedicated systems just to record ticks, with latency. For example, Rebus receives every tick, saves it and sends it out without conflation.
  2. dual-band — If your algo engine needs to catch opportunities at minimal latency, then it can’t afford to care about history. It must ignore history. I will focus on this requirement.
  3. in-band only — Combining the two, if your super-algo-engine needs to analyze tick-by-tick history and also react to the opportunities, then the “producer” thread alone has to do all work till order transmission, but I don’t know if it can be fast enough. In general, the fastest data processing system is single-threaded without queues and minimal interaction with other data stores. Since the producer thread is also the consumer thread for the same message, there’s no conflation. Every tick is consumed! I am not sure about the scalability of this synchronous design. FIFO Queue implies latency. Anyway, I will not talk further about this stringent “combo” requirement.

https://tabbforum.com/opinions/managing-6-million-messages-per-second?print_preview=true&single=true says “Many firms mitigate the data they consume through the use of simple time conflation. These firms throw data on the floor based solely on the time that data arrived.”

In the Wells interview, I proposed a two-channel design (sketch below). The producer simply updates a “notice board” with the latest price for each of 999 tickers. Registered consumers get notified out-of-band, on some messaging thread, to re-read the notice board[1]. The async design adds latency; I don’t know how tolerable that is. I feel async and MOM are popular and tolerable in algo trading. I should check my book [[all about HFT]]…
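A minimal sketch of that notice-board idea, assuming a fixed ticker universe and a quote that fits in one atomically updatable value (a real multi-field quote would need a seqlock or similar); the names are mine, not from the interview.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

constexpr std::size_t kTickers = 999;        // fixed ticker universe, per the post

struct NoticeBoard {
    // Latest price per ticker. The producer overwrites in place, so consumers
    // only ever see the freshest value: conflation happens right here.
    std::array<std::atomic<double>, kTickers> latest{};

    void publish(std::size_t tickerId, double px) {
        latest[tickerId].store(px, std::memory_order_release);
    }
    double read(std::size_t tickerId) const {
        return latest[tickerId].load(std::memory_order_acquire);
    }
};

// The out-of-band notification carries ONLY the ticker id, never the price,
// otherwise we are back to a FIFO design (see footnote [1]).
struct Notification { std::uint32_t tickerId; };
```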

In-band only — However, the HSBC manager (Brian?) seems to imply that for minimum latency, the socket reader thread must run the algo all the way and send the order out to the exchange in one big function.

Out-of-band only — two market-leading investment bank gateways actually publish periodic updates regardless of how many raw input messages hit them. Not event-driven and not monitoring every tick!

  • Lehman eq options real time vol publisher
  • BofA Stirt Sprite publishes short-term yield curves on the G10 currencies.

[1] The notification should not contain price numbers. Doing so defeats conflation and brings us back to a FIFO design.

real-time symbol reference-data: arch #RTS

Real Time Symbol Data is responsible for sending out all security/product reference data in real time, without duplication.

  • latency — typically about 2 ms (not microseconds) from receiving the data to sending the enriched reference data downstream.
  • persistence — any data worth sending out needs to be saved. In fact, every hour the same system sends a refresh snapshot downstream.
    • performance penalty of disk writes — handled by InnoDB. Most database access is in-memory; disk writes are rare. There is enough memory to hold 30 GB of data. https://bintanvictor.wordpress.com/2017/05/11/exchange-tickers-and-symbols/ shows how many symbols there are across all trading venues.
  • insert is actually slower than update. But first, the system must check whether an insert or update is needed at all; if nothing changed, don’t save or send the data (see the sketch after this list).
  • burst / surge — the main performance headache. We could have a million symbols/messages flooding in.
  • relational DB with mostly in-memory storage
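A minimal sketch of the “skip if unchanged” check from the list above, assuming a simple in-memory cache keyed by symbol id; the field and class names are illustrative.

```cpp
#include <string>
#include <unordered_map>

struct SymbolRecord {
    std::string symbol;
    std::string exchange;
    double      lotSize = 0;
    bool operator==(const SymbolRecord& o) const {
        return symbol == o.symbol && exchange == o.exchange && lotSize == o.lotSize;
    }
};

class SymbolCache {
    std::unordered_map<std::string, SymbolRecord> byId_;   // in-memory, loaded from the RDBMS
public:
    // Returns true if the record is new or changed and therefore needs to be
    // persisted and sent downstream; unchanged records are silently dropped.
    bool upsertIfChanged(const std::string& id, const SymbolRecord& incoming) {
        auto it = byId_.find(id);
        if (it != byId_.end() && it->second == incoming)
            return false;                 // no change: neither save nor send
        byId_[id] = incoming;             // insert (slower) or update (faster)
        return true;                      // caller persists and publishes downstream
    }
};
```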

## stateful OMS class design: observations

Here’s a well-established and large-scale order manager class design. It handles millions of orders a day.

  • The entire process is restarted every trading day. Before the restart, all pending orders are cancelled! The OM is probably a per-thread singleton in the process.
  • The OM stores all the orders for the day, including each closed order, in case it needs cancellation.
  • The OM keeps all the partial executions (aka partial fills) for a given order, because each execution could be busted.
  • Each action on an order (such as validation or partial execution) is performed by a dedicated object. For 1000 orders, if there are 5 actions, then there would be 5000 distinct “action objects”. The OM has pointers to all of these action objects.
  • Most action objects are stateful. ALL action objects are persisted somewhere so as to support busting/cancellation (see the sketch below).
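A hedged sketch of the shape these observations describe; the class and member names are invented for illustration, not the actual design.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct Execution { double px = 0; long qty = 0; bool busted = false; };

struct Order {
    std::string id;
    long qty = 0;
    std::vector<Execution> partialFills;   // kept because each execution could be busted
    bool closed = false;                   // closed orders are kept too, in case of cancellation
};

struct Action {                            // validation, partial execution, ...
    std::string type;
    std::string orderId;
    // stateful details, persisted elsewhere so busting/cancellation can be supported
};

class OrderManager {                       // effectively a per-thread singleton
    std::unordered_map<std::string, Order> ordersOfTheDay_;   // ALL orders, even closed ones
    std::vector<std::unique_ptr<Action>>   actions_;          // e.g. 1000 orders x 5 actions = 5000 objects
public:
    Order& order(const std::string& id) { return ordersOfTheDay_[id]; }
    void   record(std::unique_ptr<Action> a) { actions_.push_back(std::move(a)); }
};
```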

 

private bank trade/order/quote/execution flow

Remember — Most non-exchange traded products are voice executed. Only a few very dominant, high volume products are electronically executed. Perhaps 1% of the products account for 99% of the trades — by number of trades. By dollar amount, IRS alone is probably 50% of all the trades, and IRS is probably voice-executed, given the large notional amounts.

The products traded between the bank and its clients (not interbank) are often customized private offerings, unavailable from any other bank. (Remember the BofA puttable floats.)

An RM / PWA would get live quotes from the dealer and give them to a client. Sometimes the dealer publishes quotes on an internal network, but RFQ is more common. At any time, the quote could be executed between the RM and the client. The RM would book the new position into the bank's database. As soon as the trade is executed (before the booking), the bank has a position, but the dealer knows about the position only after the booking, and would hedge quickly.

The dealer initially only responds to RFQs. The trade is usually executed without her knowledge, just like an ECN flow.

I think in BofA's wealth management platform, many non-equity products (muni bonds are largely sold to retail clients) trade in the same way. Dealer publishes quotes on an intranet website. RM negotiates with client and executes over the phone. During trade booking, the price and quantity would be validated. Occasionally (volatile market), trade fails to go through and RM must inform client to retry. Perhaps requote. Fundamentally, the dealer gets a last look, unlike the exchange flow.

I believe structured products (traded between bank and clients) are usually not fast and volatile — less requote. However, when dealer hedges the position, I think she often uses vanilla instruments.

Terminology warning — some places use “trade” to mean many things including orders. I think in exchange flow, “order” is a precise word.

max-thruput quote distribution: 6designs#CAS,socket

Update — the fastest design would require a single-threaded model with no shared mutable state.

Suppose a live feed of market quotes pumps in messages at the max speed of the network (up to 100 Gbit/sec). We have thousands (perhaps 5,000) of hedge fund clients, each with some number (not sure how large, perhaps hundreds) of subscriptions to these quotes. Each subscription sets up a filter that may look like some combination of “Symbol = IBM”, “bid/ask spread < 0.2…”, or “size at the best bid price….”. All the filters only reference fields of the quote object such as symbol, size and price. We need the fastest distribution system. The bottleneck should be the network, not our application.

–memory allocation and copying–
If an IBM /quote/ matches 300 filters, then we need to send it to 300 destinations, therefore copying it 300 times, but not making 300 allocations within the JVM. We want to minimize allocation within the JVM. I believe the standard practice is to send just one copy as a message and let the receiver (a different machine) forward it to those 300 hedge funds. Non-certified RV is probably efficient, but unicast JMS is fine too.

–socket reader thread latency–
Given the messaging rate, the socket reader thread should be as lean as possible. I suggest it should blindly drop each message into a buffer, without looking at it. Asynchronously, consumer threads can apply the filters and distribute the quotes.

A fast wire format is fixed-width. The socket reader takes 500 bytes, assumes it is one complete quote object, and blindly drops this 500-byte array into the buffer.

–multicast rather than concurrent unicast–
See single/multi-thread TCP servers contrasted

–cpu dedication–
Each thread is busy and important enough to deserve a dedicated cpu. That CPU is never given to another thread.
————-
Now let me introduce my design. One thread per filter. The buffer is a circular array — bounded, but with efficient pre-allocation (see the sketch below). Pre-allocation requires fixed-size nodes, probably byte arrays of 500 each. I believe de-allocation is free — recycling. Another friend (csdoctor) suggested an unbounded linked list of arrays. Total buffer capacity should exceed the *temporary* queue build-up. The slowest consumer thread must be faster than the producer, though momentarily the reverse could happen.
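A minimal sketch of the pre-allocated circular buffer of fixed 500-byte slots, assuming a single producer (the socket reader) and leaving consumer bookkeeping out; the slot count and names are illustrative.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstring>

constexpr std::size_t kSlotBytes = 500;    // one fixed-width quote per slot
constexpr std::size_t kSlots     = 4096;   // must exceed any temporary queue build-up

struct RingBuffer {
    std::array<std::array<char, kSlotBytes>, kSlots> slots;  // pre-allocated once, recycled forever
    std::atomic<std::size_t> writeSeq{0};                    // next slot to write

    // Producer (socket reader): blindly copy 500 raw bytes; no parsing, no allocation, no free.
    void push(const char* rawQuote) {
        std::size_t seq = writeSeq.load(std::memory_order_relaxed);
        std::memcpy(slots[seq % kSlots].data(), rawQuote, kSlotBytes);
        writeSeq.store(seq + 1, std::memory_order_release);  // publish to consumers
    }
    // Each consumer (filter thread) keeps its own read position and must stay
    // ahead of the producer wrapping around; that bookkeeping is not shown here.
};
```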

—-garbage collection—-
Note jvm gc can’t free the memory in our buffer.

–Design 3–
Allocate a counter in each quote object. Each filter applied will decrement the counter; the thread that hits zero will free the object. But this incurs an allocation cost for the counter (see the sketch below).
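A minimal sketch of this counter idea, assuming the counter lives inside each pre-allocated slot; names are illustrative.

```cpp
#include <atomic>

struct QuoteSlot {
    char raw[500];                          // the fixed-width quote bytes
    std::atomic<int> remainingFilters{0};   // set by the producer before publishing
};

// Called by each filter thread after it has applied its filter to the slot.
inline void onFilterDone(QuoteSlot& s) {
    if (s.remainingFilters.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        // The last filter just finished: this thread recycles the slot (no GC involved),
        // e.g. by marking it reusable for the producer.
    }
}
```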

–Design 6–
Each filter thread records in a global variable its current position within the queue, advancing through the queue and incrementing its global variable. One design is based on the observation that, given the dedicated CPUs, the slowest thread stays the slowest in the wolfpack. That designated thread would free the memory after applying its filter.

However, it’s possible for 2 filters to be equally slow.

–Design 8– We can introduce a sweeper thread that periodically wakes up to sequentially free all allocations that have been visited by all filters.

–Design 9– One thread to apply all filters for a given HF client. This works if filter logic is few and simple.

–Design A (CAS)– Create any number of “identical” consumer threads. We can expand this thread pool at any time.
while (true) {
  1) read BigArrayBuffer[++myThreadPtr] into this thread’s registers and examine the fields, without converting to a Quote instance.
  2) examine the Taken boolean flag. If it is already set, simply “continue” the loop. This step may be needed if CAS is costly.
  3) CAS to set the flag.
  4a) If successful, apply ALL filters on the quote, then somehow free the memory (without the GC), perhaps by setting another boolean flag to mark this fixed-length block as reusable storage.
  4b) Else just “continue”, since another thread will process and free it.
}
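Here is a hedged, self-contained C++ rendering of the same loop; the slot layout, buffer size and filter routine are placeholders, and producer coordination and wrap-around handling are omitted.

```cpp
#include <atomic>
#include <cstddef>

constexpr std::size_t kSlots = 4096;

struct Slot {
    char raw[500];                        // fixed-width quote bytes
    std::atomic<bool> taken{false};       // claimed by exactly one consumer
    std::atomic<bool> reusable{false};    // once true, the producer may overwrite
};

static Slot bigArrayBuffer[kSlots];       // shared pre-allocated ring

static void applyAllFilters(const char* /*rawQuote*/) {
    // placeholder: evaluate every client filter against the raw fields
}

void consumerLoop(std::size_t myThreadPtr) {
    for (;; ++myThreadPtr) {
        Slot& s = bigArrayBuffer[myThreadPtr % kSlots];
        // step 2: cheap pre-check before the (possibly costly) CAS
        if (s.taken.load(std::memory_order_acquire)) continue;
        // step 3: CAS to claim the slot
        bool expected = false;
        if (!s.taken.compare_exchange_strong(expected, true,
                                             std::memory_order_acq_rel))
            continue;                     // step 4b: another thread won the race
        // step 4a: the winner applies ALL filters, then "frees" the block without GC
        applyAllFilters(s.raw);
        s.reusable.store(true, std::memory_order_release);
    }
}
```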

hide client names and address

I proposed a system to a buy-side asset manager shop. I said client names don't need to be stored in the central database. Maybe the salesforce and investment advisors need the names, but they don't need to save those in a shared central database for everyone else to see.

Each client is identified by account id, which might include an initial.

When client logs in to a client-facing website, they will not see their name but some kind of relatively public information such as their self-chosen nick name, investment objectives, account balance, and last login time.

Client postal address is needed only for those who opt for paper statement. And only one system needs to access it — the statement printing shop.

A veteran in a similar system told me this is feasible and proposed an enhancement — encrypt sensitive client information.

What are some of the inconveniences in practice?

no 2 thread for 1 symbol: fastest mkt-data distributor

Quotes (and other market data) sent downstream should be in FIFO sequence, not out-of-sequence (OOS).

In FX and cash equities (eg EURUSD), I know many major market data aggregators design the core of the core feed engine to be single-threaded — each symbol is confined to a single “owning” thread. I was told the main reason is to avoid synchronization between 2 load-sharing threads. 2 threads improve throughput but can introduce OOS risk.

You can think of a typical stringent client as a buy-side high-frequency trader (HFT). This client assumes a later-delivered quote was physically “generated” later. If 2 quotes arrive on the same name, one by one, then the later one always overwrites the earlier one – conflation.

A client’s HFT can react in microseconds, from receiving a quote (data entering the client’s network) to placing orders (data leaving the client’s network). For such a fast client, a little bit of delay can be quite bad, but not as bad as OOS. I feel OOS delivery makes the data feed unreliable.

I was told many automated algo trading engines (including automatic offer/bid pricers in bonds) send fake orders just to test the market. The engine sends a test order and waits for the response in the data feed. An OOS delivery would confuse this “observer”.

A HFT could be trend-sensitive. It monitors the rise and fall of sizes of the quotes on a given name (say SPX). It assumes the market data are delivered in-sequence.

common technical challenges in buy-side software systems

10 to 30% of Wall St IT jobs are on the buy-side, such as funds, portfolio and asset management.

* Core challenge – the sub ledger. In one system there were more than 100,000 client accounts in the sub ledger. Each is managed by professional financial advisors. I believe each account on average could have hundreds of positions. (Thousands would be overwhelming, I feel.) Since clients (and FAs) need instant access to their “portfolio”, there are too many positions to keep up to date. The core of the entire IT infrastructure is the subledger.

** The number of trades per day is a 2nd challenge. These aren’t high-frequency traders, but many Asian clients do nothing but brokerage (equity) trades. Per account it is not many, but all the accounts combined amount to a lot of overnight processing. These must be added to the sub ledger by the next day.

* quarterly performance reporting
** per-fund, per-account, per-position
** YTD, quarterly, annual etc

I guess there is also monthly performance reporting requirement in some cases.

* asset allocation and periodic portfolio re-balancing — for each client. Key differentiators. Investors get a large “menu” of funds, products … For comparison, they may want performance metrics.

– VaR or realtime risk? Probably important to large funds
– pricing? Probably important to large funds

– swing/wpf not really required. Web is adequate.
– trade booking? not a challenge

Database: limited usage]real time trading

“Database” and “Real-time trading” don’t rhyme!

See http://bigblog.tanbin.com/2009/03/realtime-communication-in-front-desk.html. Trading systems use lots of MOM and distributed cache.

In comparison, DB offers perhaps the most effective logging/audit. I feel every update sent to MOM or cache should ideally be asynchronously persisted in DB. I would probably customize an optimized DB persistence service to be used across the board.

Just about any update in cache needs to be persisted, because cache is volatile memory. Consider a flat file as an alternative.

[11] real time high volume FX quote processing #letter

Horizontal scale-out (distributing to different boxes) is the design of choice when we are cpu-bound. For instance, if we get hundreds of updates a sec and each update requires repricing a large number of objects.

Ideally, you would want cpu to be saturated. (By using twice the hardware threads, you want throughput to double.) Our pricing engine didn’t have that much cpu load, so we didn’t scale out to more than a few boxes.

The complication of scale-out is that the data required to reprice one object may reside on different boxes. People try many solutions like memory virtualization (non-trivial synchronization cost + network latency), message-passing, RMI, … but I personally prefer the one-big-machine approach. Throw in 16 (or 128) processors, each with say 4 to 8 hardware threads, run 64-bit, and throw in 256 GB of RAM. No network latency. No RMI/messaging latency. I think this hardware is rather costly. Eight smaller machines with comparable total CPU power would cost much less, so most big banks prefer that – so-called grid computing.

According to my observations, most practitioners in your type of situations eventually opt for scale-out.

It sounds like after routing a message, your “worker” process has all it needs in its local memory. That would be an ideal use case for parallel processing.

I don’t know if FX spot real-time pricing is that ideal. Specifically, suppose a worker process is *dedicated* to updating and publishing the eur/usd spot quote. I know you would listen to the eurusd quotes from all liquidity providers, but do you also need to watch usd/jpy and eur/jpy?

15,000 quotes repriced within a minute

One of my bond pricing engines could price about 15,000 offers/bids in about a minute. Four slow lanes to avoid:
 
1) database persistence is done asynchronously by gemfire write-behind.

2) offers/bids we produce must be verified by another system, which officially owns the OutgoingQuote table. The verification takes a long time. We avoid that overhead by pricing all the offers/bids in gemfire, then sending them out in a batch, then waiting for the result. The one-minute figure excludes the verification.

3) all reference data is preloaded into gemfire, so no more disk I/O.

4) minimal serialization overhead, since most of the objects needed are in local JVM.

In contrast, a more complex engine, the mark-to-market engine, needs a few minutes to price 15,000 positions. This engine doesn't need real-time performance.

y trading systems use so many stored procedures

A popular Wall Street interview question is the pros and cons of stored procedures. Here are a few:
#1 single point of access from java, c++ …
#2 modular encapsulation. separation of concern
+ network efficiency
+ access control
+ reusable. DRY
+ easy version control
– readability
– exception handling
– hard to log actual query

Perhaps the biggest motivation is to avoid recompiling a binary in an emergency fix. Many sites have extremely strict controls on binary build/deployment [1]. Every release always builds from version control. If you need a bug-fix release, then you must deal with all the changes checked into cvs but not yet approved!

Redeploying a binary can also break any number of (or all) other applications.

Procs are the answer to your prayer. In some places, every select/insert/update/delete statement is extracted into a proc. Changing the logic in them feels almost painless compared to a binary build/release. Hibernate is a big departure from the proc tradition.

1) Wall Street users want frequent changes, not bound by software release controls. Control vs time-to-market makes a healthy contention.
2) Wall Street code is often extremely (quick and) dirty, so fixing bugs without a software release is often a life saver.

About half of all business logic, both features (1) and bug fixes (2), is often expressed in SQL. Now you see how useful it is to have flexible ways to change the SQL logic.

If you think hard and forecast which business logic might need to change, then you can strategically extract that SQL into stored procedures.

[1] Given the huge sums involved, Wall St wants control over software. They can't control code quality but can control build/release: many, many levels of approvals, and numerous staging, integration, QA and pre-QA environments.

## [11] y no dotnet on sell-side server side@@

(A fairly sketchy, limited, amateurish write-up.)
I was recently asked: “dotnet has formidable performance and other strengths compared to java, but in the trading-engine space, why is dotnet making inroads only on the user interface, never on the server side?”

Reason — as a casual observer, I feel Windows was designed as a GUI operating system with a single user at any given time. Later, WinNT tried to extend the kernel to support multiple concurrent users. In contrast, Unix/Linux was designed from Day 1 to be multi-user, with the command line as the primary UI. (Personally, I used to feel GUI is a distraction to high-volume data-processing OS designers.) A trading server needs no GUI.

Reason — Java and c/c++ were created on Unix; dotnet runs only on a windowing operating system. I feel a web server is a lightweight application, so both java and dotnet (and a lot of scripting languages) are up to the job [1], but truly demanding server-side apps need Unix AND java/c++. I guess Windows is catching up. In terms of efficiency, I guess java and c# are comparable, and below C++.

Reason — Sell-side trading systems are an arms race. (Perhaps the same among hedge funds.) Banks typically buy expensive servers and hire expensive system engineers, and then try to push the servers to the max. C/C++ makes the most efficient use of system resources, but Java offers many advantages over C++. Since the late 90’s, trading servers have progressively migrated from C++ to Java. Java and C++ are proven on the high-performance server side. Not dotnet.

Reason — I still feel *nix is more stable than Windows under high load. See http://efreedom.com/Question/1-214362/Java-Large-Heap-Sizes. However, I think you can create big clusters of Windows servers.

Reason — (from a friend) — *nix is considered more secure than windows. A GUI desktop can affect one trader if compromised, but a sell-side trading server affects all the traders from all the institutional and retail clients if compromised. So security risk is more serious on server side than GUI side.

The reasons below are arguments for java over dotnet in general, but don’t really explain why java is NOT on the GUI side and dotnet is still chosen on the GUI side.

Reason — big banks need stronger support than a single vendor company. What if Microsoft makes a mistake and dotnet loses technical direction and momentum? Java and *nix enjoy broader industry support.

[1] unless you are google or facebook, who needed c++ for their demanding requirements.

overnight risk reporting in portfolio management

I talked to a big portfolio mgmt (PM) firm. The team owns and delivers nightly risk reports to traders (+ perhaps fund managers). According to the team mgr, the most important sister team is the quant team, who are often PhDs but not professional coders. Quants are really qualified to create models, but these quants actually implement their models in c++ themselves.

There's a large amount of data in the DB. The nightly job reads in this data, analyzes it using the c++ models, then writes data back into the DB.

This is a heavy-duty number crunching batch job, heavy on DB, light on network – no socket programming.

Logic is mostly in perl, c++, shell and the DB. The DB holds a significant amount of logic, just like Goldman Sachs PrivateWealthManagement. It turned out c++ implements more business logic than perl. These perl scripts are considered low-logic, but if there's a lot of perl, then I believe there's a lot of logic.

Perf is the biggest issue. The job must complete in 12 hours, before a 3am deadline, without breaking. If it breaks, there will be … delays and …? The bottleneck is the DB. There's spare hardware capacity underutilized, but the DB server is on its knees. I have heard the same many times, in GS, Citi… so I guess this is hard to avoid. The risk system is probably worst affected.

%Q: stress testing? Monte Carlo?
A: the reporting system doesn't do those. Those are probably the job of quants.

%Q: is VaR the key output?
A: no. duration, curve duration, spread duration

%Q: are matrices and “vectors” used in the c++ code, like those in matlab? So it goes well beyond STL?
A: yes quants use matlab and mathematica to develop the concept, and then use c++ to implement it. We do have our own data structures beyond STL.

%Q: how much domain knowledge required in the analytical work?
A: more of an aptitude and attitude to learn

multiple intermediate data storage]real time trading servers

For easy prod support, get your first stage of processing to save intermediate output to cache, DB or MOM, and the 2nd stage to pick up from there. You can have many stages (ie pipes) and pipe connectors.
This might help your job security if other developers can’t easily figure out all of your techniques for saving, accessing, investigating (in prod), filtering, and monitoring the intermediate data. Remember gemfire doesn’t have a working data browser?
This helps testing. Remember Mithun’s DBank cash management project.
This helps prod monitoring.
This helps everyone understand the business, as they can see the intermediate data in flesh and blood. You can get interesting statistics.
Recall Reo has limited logging, so we don’t know why some events don’t happen upon a user action or market update.

java RMI in trading systems

Now I feel rmi is rather easy, battle-tested, proven, mature, well-researched, … compared to many alternative solutions. Here's RMI usage in a trading system circa 2011 —

Nobody calls Neo server via RMI. The only way you can talk to Neo is via JMS/Protobuf. So even if you have 100 instances of Neo servers, JMS distributes the messages across them.

Neo does make _outbound_ RMI calls to PricingControl, Arb/prop/model trading engine, and various other systems.

bond trade capture system use-cases

A trading system architect must know these essential use cases:

A hypothetical bond trade booking sys – named Blo (for Blotter)

Blo use case 1: phone execution, then the trader enters the trade into Blo.

Blo use case 2: traders advertise offers and bids on an internal network. Our salesperson lifts an offer. The trade is confirmed on the spot, and the system automatically books it into Blo. This flow converts the Order into a Trade automatically. It’s possible for 2 salespersons to lift the same offer; the system will reject A and book B.

Blo use case 3: advertise offers to an external venue, where they are lifted automatically. The external venue sends us a confirmation and the trade is booked.

Blo use case 4: a trader responds to an external bid-wanted (RFQ) and her bid is selected, becoming a trade. The external venue sends a confirmation to us and the trade is booked.

In Eq, there’s often a big OMS to manage the order state from an initial request to a completed trade.

pricing control in a bond dealer desk

Pricing (along with pnl) is one of the most important data sets to monitor and control. There are multiple levels of price control.
* Offer/bid price limits, to block out-of-range offer/bid advertisements
* the price in a response to an IFB is typically sent out via a system and is probably subject to price control, to prevent bidding too high.
* After trade execution, Middle Office would check the price against some reference prices. If a trader executed at an unusual price, she may be held responsible. I was told MO only bothers with unfinished (i.e. unclosed) positions.
* Pricing exception reports and attestation. I think this is internal compliance.
* There could be regulations on unusual execution prices in some regulated securities. It’s conceivable that the government wants to know every trade’s price in a particular derivative so as to prevent another bank collapse.

trade booking, trade capture, position management, sub ledger

The standard OTC trade booking system (TBS) — after you finalize, i.e. execute, a trade with your counterparty, typically over the phone, you enter the completed trade in the TBS. Some call it trade capture. I think this used to be the trade blotter. Before the TBS, people used spreadsheets. This is one of the earliest and most essential IT systems for traders.

The other absolutely essential trading system is the position management system (PMS), aka sub ledger. The TBS records all trade activity, independently computes current positions by accumulation, and syncs up with the PMS every day.

How about a pricing engine? In OTC, the trader can decide the price with a pencil or with a sophisticated pricing engine. I think it’s the firm’s money but the trader’s decision, so it’s up to her.

eq listed drv desk

Some basic info from a friend –

Equity Listed derivatives – mostly options on single stocks or options on index/future, but also variance-swaps. Even if a stock has no listed options, we would still create a vol surface so as to price OTC options on it, but the technique would be different — The standard technique if given many pairs of {expiration, strike} is to fit a curve on a single expiration, then create similar curves for other expirations on the same underlyer (say IBM), then try to consolidate all IBM curves into a smooth IBM vol surface. Each “point” on the surface is an implied vol value. I was told some of the more advanced “fitting” math is extracted out into a C++ quant lib.

Instrument pricing has to be fast, not multi-second. I guess this is pre-trade, RFQ bid/offer pricing, similar to bond markets’ bid-wanted. In contrast, the more “real” need for the vol surface is position pricing (or mark-to-market), which provides unrealized PnL. I feel this is usually end-of-day, but some traders actually want it real time. Besides the traders on the flow[3]/listed/OTC derivative desks, the vol surface is also used by many other systems such as structured derivatives, which are entirely OTC.

It’s quite hard to be truly event-driven since events are too frequent, instruments too numerous, and the pricing algo non-trivial, exactly like FX option real-time risk. Instead, you can schedule periodic repricing batches every few minutes.

About 3500 underliers and about 450,000 derivative instruments, i.e. on average over 100 derivatives on each underlier (combinations of strike/tenor). The S&P 500 has more than 1000 derivatives on it.

Market data vendors — Reuters, Wombat, Bloomberg.

Inputs to vol calculation — product reference (strike/tenor), live market quotes, dividend, interest rate …

One of the most common OTC equity derivatives is barrier option.

Pricing and risk tend to be the most mathematically challenging.

Exchange connectivity is usually c++, client connectivity (clients to send orders or receive market data) is usually java.

[3] Flow means agency trading, mostly for institutional clients. The retail clients here are very wealthy; ordinary retail investors won’t use an investment bank. Flow equity derivatives can be listed or OTC.

trade booking/capture in the big picture

For a novice who wonders just how important trade-capture is…

b/c (i.e. trade booking/capture) is the #1 essential component of trading systems, if you look across assets. B/c is often the _heart_ of an OTC or voice trading desk. That is not true for trading desks facing an exchange/interdealer broker, because pre-trade apps take center stage and post-trade flow becomes middle-office.

b/c (along with position/pnl and the trade blotter) was the first task to be computerized on wall street.

b/c is the basis of the position master, i.e. sub-ledger (often in mainframes), one of the most essential systems in any trading firm. The sub-ledger is the basis of pnl.

I feel b/c is relatively _low_tech_ compared to market data, low latency and some pre-trade systems. However, I feel in an exchange or a large sell-side firm, execution volume can be high.

I feel b/c demands more precision, more stability, a lower error rate, more robustness… than most pre-trade systems. This is because b/c is the point of no return — after an order is executed, it can’t be canceled effortlessly.

In a voice trading desk, b/c is actually post-trade, because the trader executes the trade over phone and simply enters data into the books. Remember MTSTradeEngine? In contrast, the fully electronic b/c is not post-trade but sits at the choke point right between pre-trade and post-trade.

FX option trading – a typical arch

Just as in equity options, the core component is risk engine, because positions are large and long-term. See other reasons in my post on option trading systems.

— My hypotheses —
  • I guess for both fx options and IRD, the core engine is a realtime event-driven position updater (the other side of the same coin as the risk engine). Each position has a lot of contract attributes and risk attributes, all subject to frequent updates. A typical FX option desk probably has “too many” positions, each reacting to a lot of events, and each update is complex and time-consuming.
* In contrast, cash desk has fewer positions and simpler positions.
* In bond trading, any non-flat position is also subject to updates in terms of marking and unrealized PnL, but calc is simpler.
———————————–
FX options are an OTC market – “no electronic trading” (I guess no ECN either), but there are electronic trade messages in addition to manual trade booking. There’s also a plan to access CME-listed FX options. Note this plan is not about fx options on futures, and not about PHLX.

%Q: so is it voice based?
A: various means.

Clearing could be done at the London Clearing House. I guess London is a bigger center than NY.

A lot of “exotic” fx option products come online every year. There’s pressure to automate and speed up new product launches. I would guess 1) position management and 2) booking are among the most essential features needed by any new FX option instrument. The system must be able to persist positions in these exotic options. If automatic STP booking is hard, then ops can manually enter them, assuming volume is low on new products.

Volume of trades – the FX options desk gets about 1500 trades/day. In contrast, the FX cash desk (including futures + forwards) gets about 100 times the volume, but its profit is perhaps only 2 to 3 times that of the FX options desk: obviously different margins.

Volume of positions – the FX cash desk keeps most positions flat, so very few positions are non-flat. The FX options desk has “too many” open positions, a big headache for the risk engine.

The entire FX options desk needs about 20 desk-specific developers worldwide. Besides, I guess there are many supporting systems owned by other teams outside the desk. These teams include (but are not limited to) firmwide teams, probably further away from the profit centers.

FX option trading is more complex than FX cash trading.

Q: Are FX derivatives simpler than equity derivatives?
A: not necessarily. FX involves 2 interest rates. Eq involves dividends.

— system modules owned by dedicated desk developers–
FIX server (perhaps for market data, not e-trading?)
GUI is in Tcl, early versions of C# platform and WPF.
Market data is a major component in FX. Many modules react to market data —
– risk
– pricing
To traders, real-time pricing is presumably more important than risk is. I guess they need to send out updated bids/offers. RT pricing uses spot prices (market data) and volatility data for calculation. For any pair of currencies, (every?) market data update could trigger automatic price updates across all strikes and expirations.

Actual option valuation math is in c++/JNI.

The biggest headache in the fx option risk engine is performance. FX option valuation is slow, and FX option position volume is too large for real-time risk updates. Instead, the risk “report” system is on-demand and covers a requested subset of the full portfolio, presumably those positions belonging to one trader. Such a report takes a few minutes. If market data has changed by then, the report is obsolete.

Risk rolls up from trader level to entity level to firm level. There’s an external team responsible for the analytics library, and they call the FX options system’s services to get positions. I guess that external system is a firm-wide analytics or risk engine.

The #1 essential component (among the distributed components spread over 30 servers) on the trading desk is trade capture/booking, written primarily in c++ plus some java. There’s a c++ valuation module for FX options. The plan is to slowly phase out c++. Other than that, the desk is mostly java.

–core architecture–
Since an option (or any derivative) is not settled right away like cash trades, there’s a _lifecycle_ to each derivative trade. Each derivative trade takes on a life of its own and is subject to many “lifecycle events” like
– origination, cancels, amends/modifications
– knock-in, knock-out
– fixings
– market data affecting risk reassessment

Just like the bond repricing engine, this is a Service Oriented Architecture – MQ facilitates the event-driven architecture, but there are other ways to pass messages, like SOAP over TCP (not HTTP).
1) MQ for high volume messages
2) SOAP for slow, complex processing. Possibly a few trades a day! I guess these are exotic products.
A typical event-driven server here is a socket server, holding a thread pool, started with main(). No container or web server (see the sketch below).
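For concreteness, a hedged sketch of that shape: a plain socket server plus a thread pool started from main(), no container. POSIX sockets, the fixed port and pool size, and the missing error handling are all simplifications.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

int main() {
    // work queue of accepted connections
    std::queue<int> pending;
    std::mutex mtx;
    std::condition_variable cv;

    // worker pool: each thread handles one connection at a time
    std::vector<std::thread> pool;
    for (int i = 0; i < 8; ++i)
        pool.emplace_back([&] {
            for (;;) {
                int fd;
                {
                    std::unique_lock<std::mutex> lk(mtx);
                    cv.wait(lk, [&] { return !pending.empty(); });
                    fd = pending.front(); pending.pop();
                }
                char buf[512];
                while (read(fd, buf, sizeof buf) > 0) { /* decode lifecycle event, process */ }
                close(fd);
            }
        });

    // accept loop on the main thread; main() never exits
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);                        // illustrative port
    bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
    listen(listener, 64);
    for (;;) {
        int client = accept(listener, nullptr, nullptr);
        { std::lock_guard<std::mutex> lk(mtx); pending.push(client); }
        cv.notify_one();
    }
}
```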

wall street infrastructure — security trading systems

XR,
(Another blog. No need to reply.)

A large part of the wall street core infrastructure is built around the (regulated) exchanges and (unregulated) ECN’s, and includes the major trading houses’ trading systems — front and back ends, equities, fixed income, currency and commodities, including risk. Developers in these systems are the backbone of wall street.

I feel less than 50% of my company’s technology staff are application developers. Among them, less than 50% develop apps for real-time trading. The rest of the developers support reporting, end-of-day risk, post-trading (like my team), GL, compliance, surveillance, price and other data feeds into and out of trading systems, maintenance of accounts and other reference data, ….

Trading system developers are employed by brokerage houses (aka securities firms), hedge funds, mutual funds, prop traders, exchanges, and many boutique firms(?). On the other hand, retail banking, consumer banking, corporate banking (which all take deposits and give loans) and the advisory business of investment banks don’t have infrastructure to trade securities. I don’t think they have access to the securities exchanges. The IPO, M&A, privatization… investment bankers do need some access to trading systems, as they issue securities.

Overall, what percentage of financial IT people are in the “backbone”? I guess not more than 10%.

basic (essential?) trading server arch q&&a

Every trading server invariably uses some non-http network daemon. There’s always more than 1 process (JVM, C# or c++) on the server side. There’s usually some MOM daemon, such as JMS, tibrv or a gemfire notification daemon. Here are some fundamental questions:

Q22: on top of tcp/udp, what specific network protocol between the server-side and GUI?
A: I have seen rmi and protobuf over tib ems.

Q22a: how about JMS between server and swing? Did we see 160 subscribers on a given topic, due to that many swing installations?

Q33: on top of tcp/udp, what specific network protocol among the server-side processes?
A: I have seen tibrv, JMS, RMI, gemfire data distribution protocols …

Q44: since most trading servers must avoid DB latency, where does the trading data live? In memory?
A: I have seen gemfire, rttp, …

Q45: in case of distributed cache (not replicated), how does one cache listener update another node?

Q55: how does the daemon stay alive after main() exits?
A: Look at ion, gemfire, activemq. There’s often at least 1 (1 is enough) non-daemon thread that’s stuck in wait().

realtime inter-VM communication in front desk trading sys

Inter-VM is our focus.

* [s] MOM — async
** FIX over RV in Lehman Eq
* [s] distributed cache — async?

The above mechanisms notify listeners. Note listeners are usually async and multi-threaded.

* DB writes by one app, and periodic DB polling by receiving app
* [s] RMI
* [s] EJB? infrequent. I think this is less efficient than MOM
* [s] web service? not sure
* FTP? not real-time but at SOD (startOfDay) and EOD
* email? none

MOM is the clear favorite. Most efficient. Guaranteed delivery.

Within my front office app, RMI, MOM and cache are dominant. Within a related ticketing system (iticket), MOM and RMI are dominant.

DB is an extreme form of synchronous pub/sub.

[s=needs object serialization. cross-VM often requires serializable]

Studying biz rule tables used on Wall St

Background: many rule based systems (on Wall St or elsewhere) have thousands of rules in database tables. The simplest and most used pattern is the lookup table. A lookup table has one (or more) input columns and one (or more) output columns.

Each combination of input fields maps to exactly one row. There should be a unique composite index on the input columns as a group.
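As a code-level analogy (not how the rule tables are actually implemented), here is a hedged sketch of the same idea: a composite key of input fields mapping to exactly one output row, mirroring the unique composite index; the column names are invented.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <tuple>

struct RuleOutput { std::string destinationDesk; int priority = 0; };

// inputs: (product, clientTier, region) -> exactly one output row
using RuleKey   = std::tuple<std::string, std::string, std::string>;
using RuleTable = std::map<RuleKey, RuleOutput>;

RuleOutput lookup(const RuleTable& table, const std::string& product,
                  const std::string& tier, const std::string& region) {
    auto it = table.find(std::make_tuple(product, tier, region));
    if (it == table.end())
        throw std::runtime_error("no rule for this input combination");
    return it->second;
}
```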

The concept is simple, but sometimes needs to be hand-crafted and perfected, because
* these rules are often part of the most important 1% source code in a big code base
* users change rules frequently, at least in some Wall St firms
* we often need to query the tables with complex criteria. Such complex queries are usually implemented in source code, but for quick production support we often need to run them by hand, tweak and rerun. In some cases, we end up learning all the intricacies by heart. If your full time job is a spell-checker you won’t need the dictionary much.
* we often need to explain how rules are disqualified, filtered out, ranked, grouped, applied, overlayed, re-applied…. Many IT veterans would say blow-by-blow knowledge of a code module is needed in extremely rare situations.. Welcome to Wall St these business rules are examined and tweaked on a daily basis by multiple groups of users.

Compare with Informatica lookup transform.

1900 tiers of quotes, RFS over FIX, indicative/executable quotes

One of the REAL bottlenecks in a large SELL-side FX dealer system is the tiered pricer. The trigger event could be a market data change. Since such an event could trigger an avalanche of messages, the frequency of such events is not very high, probably below 10 events/second on 1 currency pair. If you have 10 (or 50, or whatever) active currency pairs, then you could get 100 triggers/sec through your entire system.

Once a trigger is activated, the pricer computes new bid/ask quotes for Tier 1 Gold clients. The pricer then adds a distinct spread for each tier (see the sketch below). For an active pair like EUR/USD, there can be 1900 tiers. There can be up to 1900 (non-unique) pairs of bid/ask quotes. Typically, the “best” quotes would have a bid/ask spread of 2 to 3 pips, applicable to the best and largest clients. For a *retail* client, it could be 20 to 40 pips.
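A minimal sketch of that tier-spread step, assuming the per-tier spreads are already configured (for example by the rule engine); the pip size and names are illustrative.

```cpp
#include <vector>

struct Quote { double bid; double ask; };

// spreadPips[i] = extra half-spread (in pips) for tier i; tier 0 = best clients.
std::vector<Quote> priceAllTiers(const Quote& tier0Quote,
                                 const std::vector<double>& spreadPips) {
    const double kPip = 0.0001;                 // EUR/USD pip size
    std::vector<Quote> out;
    out.reserve(spreadPips.size());             // ~1900 tiers for an active pair
    for (double extra : spreadPips)
        out.push_back({tier0Quote.bid - extra * kPip,    // widen the bid down
                       tier0Quote.ask + extra * kPip});  // widen the ask up
    return out;
}
```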

Core of the tiered pricer is a Drools rule engine.

Another module is the messaging engine, using Nirvana by My-Channels. If a particular bid/ask quote applies to all (say, 20) tiers in the Silver category, then the pricer broadcasts the quote to a topic like Quote.EURUSD.Silver. This is a kind of alias for 20 different “tier” topics. For efficiency, this is probably multicast.

In the worst case, one event can trigger an avalanche of 1900 messages for one currency pair alone.

The last module is the FIX engine. Quotes often go out the door in FIX format, just as RFQs come in. Now, there are 1900 tiers but the number of clients could be higher or lower than 1900. If there are more than 1900 clients and all of them subscribe to our quote, then each must be sent the quote. I know a Chicago prop trading firm (Gelber?) subscribes to a lot of “bank feeds”.

The most demanding type of subscription is a RequestForStream (RFS). A typical RFS could ask for a stream of EURUSD quotes for 10 minutes (up to 120 minutes), during which time all quotes must be delivered.

Unlike RFQ, RFS requires special approval. The bid/ask quotes in an RFS can be indicative or *executable* (similar to Firm, but see separate blog post). If a client hits an executable bid or lifts an executable offer, then the trade is considered executed, though I believe cancellation is still a possibility, just like any Firm quote in bonds.

How does a dealer make sure he has enough position to honor the quote? Perhaps by setting aside reserve quantities, or by monitoring the open market.

Unlike bidwanted systems (non-negotiable quotes), it’s possible for a client to negotiate on our quote electronically, though I feel manual negotiation is more practical.

Merrill S’pore: fastest stock broadcast

Updates — RV or multicast topic; msg selector

I think this is a typical wall-street interview question for a senior role. System requirement as remembered by my friend the interviewee: ML needs a new relay system to receive real-time stock updates from a stock exchange such as SGX. Each ML client, one of many thousands [1], will install new client software [3] to receive updates on the stocks [2] she is interested in. Some clients use algorithmic trading systems and need the fastest feed.

[1] Not clear about the order of magnitude. Let’s target 10,000.
[2] Not clear how many stocks per client on average. Let’s target 100.
[3] Maintenance and customer support for custom client software is a nightmare and perhaps impractical. Practically, the client software has to be extremely mature, such as a browser or email client.

Q: database locking?
A: I don’t think so. only concurrent reading. No write-contention.

Key#1 to this capacity planning is how to identify bottlenecks. Bandwidth might be a more severe bottleneck than other bottlenecks described below.

Key#2 — 2 separate architectures for algorithmic clients and traditional clients. Each architecture would meet a different minimum latency standard, perhaps a few seconds for traditional and sub-second for algorithmic.

Solution 0: Whatever broadcasting system SGX uses. In an ideal world, no budget constraint. Highest capacity desired.

Solution 2: no MQ? No asynchronous transmission? As soon as an update is received from SGX, the relay calls each client directly. Server-push.

Solution 1: MQ — the standard solution in my humble opinion.

Solution 1A: topics. One topic per stock. If 2000 clients want IBM updates, they all subscribe to this topic.

Q: client-pull? I think this is the bottleneck.

Q: Would Client-pull introduce additional delays?

Solution 1B: queues. One queue per client per stock.

If 2000 clients want IBM updates, the relay needs to make that many copies of an update and send them to that many queues: duplication of effort. I think this is the bottleneck. I am not absolutely sure whether this affects relay system performance. Massively parallel processing is required, with thousands of native CPU threads (not java green threads).

cache size control in trading engines#%%ideas

My answer in 2007 – separate thread to monitor the size and send out alerts. JMX-based clean-up operation.

Now I feel we can take the swing EDT approach. “EDT” here means some kind of separate thread/data-structure for cache size control — a watchdog.

Each insert/delete operation on any thread would send a msg to the watchdog queue. Watchdog can decide how many messages to drop before it decides to take a look. If size is close to the limit, then it could decide to be more vigilant. Once limit is hit, watchdog could turn on some flag in the cache system to remind the inserters.

But how does the watchdog keep the size within the limit? It has to remove the least-recently-accessed item. A priority queue keyed by last-accessed time might be good. See my post on LRU for a java built-in solution; a sketch follows below.
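A minimal C++ sketch of that LRU bookkeeping (the java built-in would be LinkedHashMap in access order); this is illustrative, not a production design.

```cpp
#include <list>
#include <string>
#include <unordered_map>

class LruIndex {
    std::list<std::string> order_;                                    // front = most recently used
    std::unordered_map<std::string, std::list<std::string>::iterator> pos_;
public:
    // called by the watchdog for every insert/access message it samples
    void touched(const std::string& key) {
        auto it = pos_.find(key);
        if (it != pos_.end()) order_.erase(it->second);
        order_.push_front(key);
        pos_[key] = order_.begin();
    }
    // called when the cache size limit is hit; assumes the index is non-empty
    std::string evictLeastRecentlyUsed() {
        std::string victim = order_.back();
        order_.pop_back();
        pos_.erase(victim);
        return victim;                    // caller removes this key from the cache itself
    }
};
```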

More importantly, the app has to be tolerant of involuntary loss of cache items: “If the cache doesn’t have what I inserted, I will insert again.”

batch jobs in financial trading system

According to a friend in investment banking technology, ALL trading systems need batch jobs to complement online applications. I think MQ applications are a third type. Typical batch:

* save in DB historical volume/day-high/day-low/day-open/day-close … — “open” information open to the public

* save in DB all market players’ trades. That’s my own terminology referring to “our own” hedge funds, other firms’ hedge funds, other firms’ traders… Our own traders’ activities are probably captured during transaction — no batch required.

* We (the brokerage) may also have large institutional clients whose data need to be recorded in DB. Such data may need batch processing if it is not recorded automatically.

app design in a fast-paced financial firm#few tips

#1 design goal? flexibility (for change). Decouple. Minimize colleagues’ source code change.

characteristic: small number of elite developers in-house (on wall street)
-> learn to defend your design
-> -> learn design patterns
-> automate, since there isn’t enough manpower

characteristic: too many projects to finish but too few developers and too little time
-> fast turnaround

characteristic: reputation is more important here than other firms
-> unit testing
-> automated testing

characteristic: perhaps quite a large data volume, quite data-intensive
-> perhaps “seed” your design around data and data models

characteristic: widespread use of stored procs, but many java designs aren’t designed to work well with stored procs. Consider Hibernate.
-> learn coping strategies

characteristic: “approved technologies”
characteristic: developers move around
-> maintenance left to other guys
-> documentation is ideally “less necessary” if your design is easy to understand
-> learn documentation tools like javadoc