async messaging-driven #FIX

A few distinct architectures:

  • architecture based on UDP multicast. Most of the other architectures are based on TCP.
  • architecture based on FIX messaging, modeled after the exchange-bank messaging, using multiple request/response messages to manage one stateful order
  • architecture based on pub-sub topics, much more reliable than multicast
  • architecture based on one-to-one message queue

##observations@high-volume,latency sensitive eq trading sys #CSY

This is probably the biggest sell-side equity order-management-system (OMS) out there, written in c++11. Daily order volume is probably the highest among all investment banks, presumably 7 figures based on my speculation, though a lot of the orders get canceled, rejected or unfilled. I can’t reveal too many internal details due to compliance.

In contrast, GS used to get about a million individual trades a day, probably not counting the high-frequency small trades.

  • MOM — I have not seen a message queue so far but they could be hidden somewhere. Earlier I heard other ibanks’ employees telling me Tibco (and similar messaging middlewares) were popular in fixed income but now I doubt it. Queues add latency.
    • We do use some pub-sub MOM but not for order messages therefore not part of order flow.
  • synchronization — I haven’t noticed any locking or condition variable so far. I think single-threaded mode is faster than synchronized multi-threading. Multiple instances of the same software run in parallel across machines. I think this is in many ways better than one big monolithic process hosting many threads. We have 4 threads per instance in some cases.
  • socket — is not needed in any module. I believe the applications communicate via FIX, SOAP etc, on top of well-encapsulated TCP library modules.
  • ticking market data — is available, though I don’t know if my OM system needs them.
  • RDBMS — is loaded into cache at Start-of-Day and seldom accessed intra-day. I confirmed it with an ex-DBA colleague.
    • However, some product DB system sends intra-day real time updates via messaging (not FIX)
  • garbage collection — no GC like in java and dotnet
  • CRTP — heavy use of CRTP. I don’t remember seeing many virtual functions.
  • The most important message is the order object, represented by a FIX message. The order object gets enriched and modified by multiple functions in a chain. Then it is sent out via FIX session to the next machine. As in any OMS, the order object is stateful. I still don’t know where the order objects are saved. I would think they are persisted somewhere so a crash won’t wipe out pending orders.
    • (Elsewhere, I have seen very lean and mean buy-side OMS systems that don’t persist any order! After crash, it would query the exchange for order states.)
  • The 2nd most important message is probably the response object, represented by a FIX msg. If there are 100,000 order objects then there are roughly 300,000 response objects. Each order generates multiple responses such as Rejection, PendingNew, New, PartialFill, PendingCancel, Cancelled… Response objects probably don’t need to be persisted in my view.
  • The 3rd most common message is the report message object, again in FIX format. Each order object probably generates at least one report, even if rejected. Report objects sound simple but they carry essential responsibilities: not only regulatory reporting and client confirmations, but also trade booking, trade capture… If we miss an execution report, the essential books and records (inventory, positions..) would be messed up. However, these reports are not so latency sensitive.
  • Many order objects are persisted to disk so that a recovery would reinstate them. So which “order objects”?
    • All pending orders and (for busting support) closed orders.
    • Those stateful order-management objects created on the fly to support an order.
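A minimal sketch of the stateful order object described above, assuming a simplified state set borrowed from the response types in the list (PendingNew, New, PartialFill, ...). The `Order` class, its fields and the transition table are hypothetical, and no real FIX library is involved.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified transitions; a real OMS has many more states and rules.
TRANSITIONS = {
    "PendingNew": {"New", "Rejected"},
    "New": {"PartialFill", "Filled", "PendingCancel"},
    "PartialFill": {"PartialFill", "Filled", "PendingCancel"},
    "PendingCancel": {"Cancelled", "PartialFill", "Filled"},
}


@dataclass
class Order:
    order_id: str
    qty: int
    state: str = "PendingNew"
    filled: int = 0
    history: list = field(default_factory=list)

    def on_response(self, new_state, fill_qty=0):
        """Apply one response message to this order."""
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise ValueError(f"{self.state} -> {new_state} not allowed")
        self.filled += fill_qty
        self.state = new_state
        self.history.append((new_state, fill_qty))


order = Order("ord-1", qty=100)
order.on_response("New")
order.on_response("PartialFill", fill_qty=40)
order.on_response("Filled", fill_qty=60)
```

One order ends up with several responses in its history, which matches the rough 1:3 ratio of orders to responses mentioned above.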

## sell-side eq e-trading arch features #MS,Baml..

Mostly inspired by the MS equity order-management “frameworks”

  • message-based, not necessarily MOM.
    • FIX messages are the most common
    • SOAP messages are also possible.
    • BAML system is based on MOM (tibrv)
  • message routing based on rules? Seems to be central to some sell-side /bloated/ “platforms” consisting of a constellation of processes.
  • event-driven
    • client newOrder, cancel requests
    • trading venue (partial) fills
    • Citi muni reoffer is driven by market data events, but here I focus on equity systems
    • Stirt realtime risk is driven by market data events + new trade booking events
    • buy-side would have order-origination events, but here I focus on sell-side systems
  • market data subscription? Actually not so important to some eq trading engines. Buy-side would make trading decisions based on market data, but a sell-side won’t.

##simplicity@design pushed to the limit

Note in these designs, the complexity can never disappear or reduce. Complexity shifts to somewhere else more manageable.

  • [c] stateless — http
  • microservices
    • complexity moves out of individual services
  • [c] pure functions — without side effects
  • use the database concept in solving algo problems such as the skyline #Gelber
  • stateless static functions in java — my favorite
  • EDT — swing EDT
  • singleton implemented as a static local object, #Scott Meyers
  • [c] garbage collection — as a concept.
    • Complexity shifts from application into the GC module
  • STM
  • REST
  • in c# and c++, all nested classes are static, unlike in java
  • python for-loop iteration over a dir, a file, a string … See my blog post
  • [c] immutable — objects in concurrent systems
  • [c] pipe — the pipe concept in unix is a classic
  • [c=classic, time-honored]

stateless (micro)services #%%1st take

In 2018, I heard about more and more sites that push the limits of stateless designs. I think this “stateless” trend is innovative and /bold/. Like any architecture, these architectures have inherent “problems” and limitations, so you need to keep a lookout, deal with them and adjust your solution.

Stateless means simplicity, sometimes “extreme simplicity” (Trexquant)

stateless means easy to stop, restart, backup or recover

Stateless means lightweight. Easy to “provision”, easy to relocate.

Stateless means easy scale-out? Elastic…

Stateless means easy clustering. Http is an example. If a cluster of identical instances is stateless, then no “conversation” needs to be maintained.

## good-looking designs/ideas #white elephant

I think a true architect knows the difference. The best design is often not so good-looking and hopelessly outdated.

Not all of them are classified “white elephants”. Not all of them are “designs”.

  1. lock-free — worthwhile?
  2. multi-threading — not always significantly faster than multi-Processing. I find single-threaded mode fastest and cleanest
  3. sharedMem — not always significantly faster than sockets. I briefly discussed these two choices with my friend Deepak M, in the parser/rebus context. I feel it may not be faster.
    1. other fancy IPC techniques? I am most familiar with sockets …
  4. java generic module (beyond collections) — look impressive, can be hard to maintain but doesn’t buy us much
  5. converting java system to c++ — doesn’t always bring significant performance gains
  6. forward() and move() instead of cloning
  7. hash-table — not always faster than RBTree
  8. noSQL — not always significantly faster than RDBMS with lots of RAM. I find rdbms much more reliable and well understood. The indices, temp tables, joins, column constraints, triggers, stored procs add lots of practical value that can dramatically simplify the main application. I understand the limitations of rdbms, but most of my data stores are not so big.
  9. RPC and web services? Probably necessary, but I still don’t know how reliable they are
  10. thick client? I still feel web UI is simplest


microservices arch #MSA #phrasebook

  • SOA — is the ancestor
  • communication protocol — lightweight, but more costly than in-process communication
    • http — is commonly used for communication. Presumably not asynchronous
    • messaging — metaphor is often used for communication. I doubt there’s any MOM or message queue.
  • modularity
  • in-process — services can be hosted in a single process, but less common
  • cloud-friendly
  • scalability — hopefully better
  • devops — is a driver
    • testability — each service is easy to test, but not integration test
    • loosely coupled — decentralized, autonomous dev teams
    • deployment — is ideally independent for each service, and continuous, but overall system deployment is complicated

conflation: design question

I have hit this same question twice — Q: in a streaming price feed, you get IBM prices in the queue but you don’t want consumer thread AA to use “outdated” prices. Consumer BB needs a full history of the prices.

I see two conflicting requirements by the interviewer. I will point out to the interviewer this conflict.

I see two channels — in-band + out-of-band needed.

  1. in-band only — if full tick history is important, then the consumers have to /process/ every tick, even if outdated. We can have dedicated systems just to record ticks, with latency. For example, Rebus receives every tick, saves it and sends it out without conflation.
  2. dual-band — If your algo engine needs to catch opportunities at minimal latency, then it can’t afford to care about history. It must ignore history. I will focus on this requirement.
  3. in-band only — Combining the two, if your super-algo-engine needs to analyze tick-by-tick history and also react to the opportunities, then the “producer” thread alone has to do all work till order transmission, but I don’t know if it can be fast enough. In general, the fastest data processing system is single-threaded, without queues and with minimal interaction with other data stores. Since the producer thread is also the consumer thread for the same message, there’s no conflation. Every tick is consumed! I am not sure about the scalability of this synchronous design. FIFO Queue implies latency. Anyway, I will not talk further about this stringent “combo” requirement. One source says “Many firms mitigate the data they consume through the use of simple time conflation. These firms throw data on the floor based solely on the time that data arrived.”

In the Wells interview, I proposed a two-channel design. The producer simply updates a “notice board” with latest prices for each of 999 tickers. Registered consumers get notified out-of-band to re-read the notice board[1], on some messaging thread. Async design has a latency. I don’t know how tolerable that is. I feel async and MOM are popular and tolerable in algo trading. I should check my book [[all about HFT]]…

In-band only — However, the HSBC manager (Brian?) seems to imply that for minimum latency, the socket reader thread must run the algo all the way and send order out to exchange in one big function.

Out-of-band only — two market-leading investment bank gateways actually publish periodic updates regardless of how many raw input messages hit them. Not event-driven and not monitoring every tick!

  • Lehman eq options real time vol publisher
  • BofA Stirt Sprite publishes short-term yield curves on the G10 currencies.

[1] The notification should not contain price numbers. Doing so defeats conflation and brings us back to a FIFO design.
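Here is a rough sketch of the two-channel “notice board” idea, written single-threaded for clarity. The class and method names are hypothetical; a multi-threaded version would need a lock or atomic updates around the shared state. The point is that the notification carries only the symbol, so a slow consumer always reads the latest price.

```python
import queue


class NoticeBoard:
    """Latest price per symbol; older prices are overwritten (conflated)."""

    def __init__(self):
        self._latest = {}
        self._dirty = set()                 # symbols with an unread update
        self.notifications = queue.Queue()  # out-of-band channel: symbols only

    def publish(self, symbol, price):
        self._latest[symbol] = price
        # Notify at most once per unread symbol; the notice carries no price.
        if symbol not in self._dirty:
            self._dirty.add(symbol)
            self.notifications.put(symbol)

    def read(self, symbol):
        self._dirty.discard(symbol)
        return self._latest[symbol]


board = NoticeBoard()
for px in (100.0, 100.5, 101.0):   # three ticks arrive before the consumer wakes up
    board.publish("IBM", px)

symbol = board.notifications.get_nowait()
latest = board.read(symbol)        # the consumer sees only the newest price
```

Consumer BB, which needs the full history, would still need its own in-band FIFO feed; this sketch covers only consumer AA.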

blocking scenario ] CPU-bound system

Q: can you describe a blocking scenario in a CPU-bound system?

Think of a few CPU bound systems like

  • database server
  • O(N!) algo
  • MC simulation engine
  • stress testing

I tend to think that a thread submitting a heavy task is usually the same thread that processes the task. (Such a thread doesn’t block!)

However, in a task-queue producer/consumer architecture, the submitter thread enqueues the task and can do other things or return to the thread pool.

A workhorse thread picks up the task from queue and spends hours to complete it.

Now, I present a trivial blocking scenario in a CPU bound system —

  • Any of these threads can briefly block in I/O if it has big data to send. Still, system is CPU-bound.
  • Any of these threads can block on a mutex or condVar
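The task-queue scenario can be sketched as follows, assuming Python's standard `queue.Queue`. The workhorse thread blocks inside `get()` while the queue is empty, even though its real work is CPU-bound. The task (a sum of squares) is only a stand-in for a heavy computation.

```python
import queue
import threading

tasks = queue.Queue()
results = []


def workhorse():
    while True:
        n = tasks.get()          # blocks while the queue is empty
        if n is None:            # sentinel: shut down
            break
        results.append(sum(i * i for i in range(n)))  # the CPU-bound part


worker = threading.Thread(target=workhorse)
worker.start()
tasks.put(1000)   # the submitter enqueues and is free to do other things
tasks.put(None)
worker.join()
```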

json^protobuf points out

  • —limitations of protobuf:
  • Lack of resources. You won’t find that many resources (do not expect a very detailed documentation, nor too many blog posts) about using and developing with Protobuf.
  • Smaller community. Probably the root cause of the first disadvantage. On Stack Overflow, for example, you will find roughly 1,500 questions marked with Protobuf tags, while JSON has more than 180 thousand questions on the same platform.
  • not human readable
  • schema is extra legwork for quick and dirty project
  • — advantages of protobuf over Json
  • very dense, and binary, data
  • up to 5 times faster, but optimized json parser could reduce the performance gap.
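To get a feel for the “dense, binary” point, here is a small comparison using only the standard library. Note the assumption: `struct` with a hand-picked fixed layout stands in for a binary encoding; this is not protobuf and says nothing about protobuf's actual wire format or speed. The quote fields are made up.

```python
import json
import struct

quote = {"symbol": "IBM", "bid": 135.25, "ask": 135.27, "size": 300}

as_json = json.dumps(quote).encode("utf-8")

# Hypothetical fixed layout: 8-byte symbol, two doubles, one unsigned int.
LAYOUT = struct.Struct("<8sddI")
as_binary = LAYOUT.pack(
    quote["symbol"].encode("ascii"), quote["bid"], quote["ask"], quote["size"]
)

# Decoding requires knowing the layout in advance, i.e. a schema.
symbol, bid, ask, size = LAYOUT.unpack(as_binary)
decoded = {
    "symbol": symbol.rstrip(b"\0").decode("ascii"),
    "bid": bid,
    "ask": ask,
    "size": size,
}
```

The binary form is smaller but unreadable without the layout, which mirrors both the advantage and the “not human readable” limitation listed above.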


real-time symbol reference-data: arch #RTS

Real Time Symbol Data is responsible for sending out all security/product reference data in real time, without duplication.

  • latency — typically 2ms (not microsec) latency, from receiving to sending out the enriched reference data to downstream.
  • persistence — any data worth sending out needs to be saved. In fact, every hour the same system sends a refresh snapshot to downstream.
    • performance penalty of disk write — is handled by innoDB. Most database access is in-memory. Disk write is rare. There is enough memory to hold 30GB of data. shows how many symbols there are across all trading venues.
  • insert is actually slower than update. But first, system must check if there’s a need to insert or update. If no change, then don’t save the data or send out.
  • burst / surge — is the main performance headache. We could have a million symbols/messages flooding in
  • relational DB with mostly in-memory storage
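The check-before-save rule can be sketched like this. It is an in-memory stand-in, not the real system: `SymbolStore`, its fields and the attribute names are hypothetical, and a plain list replaces the downstream publisher.

```python
class SymbolStore:
    """In-memory reference data; saves and forwards only real changes."""

    def __init__(self):
        self.rows = {}
        self.sent = []   # stands in for the downstream publisher

    def on_message(self, symbol, attrs):
        current = self.rows.get(symbol)
        if current == attrs:
            return "unchanged"           # no save, no send
        action = "insert" if current is None else "update"
        self.rows[symbol] = dict(attrs)
        self.sent.append((symbol, action))
        return action


store = SymbolStore()
a1 = store.on_message("IBM", {"exch": "NYSE", "lot": 100})
a2 = store.on_message("IBM", {"exch": "NYSE", "lot": 100})  # duplicate, dropped
a3 = store.on_message("IBM", {"exch": "NYSE", "lot": 10})
```

During a burst, dropping unchanged messages early keeps both the database and the downstream feed free of duplicates.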

STL+smart_pointer for SQL DTO

Are there any best practices online?

Q1: Say I have a small db table of 10 columns x 100 rows. Keys are
non-unique. To cache it we want to use STL containers. What container?
%%A: multimap or list. unordered_multimap? I may start with a vector, for simplicity. Note that a plain map (unique keys) would lose data when 2 rows share a key but aren’t 100% identical; a multimap keeps both rows.

Q1a: search?
%%A: for a map, just look up using its find() member. For a list, iterate using the generic std::find() or std::find_if()

Q1c: what if I have a list of keys to search?
%%A: there is a std::set_intersection() algorithm, but it needs both ranges sorted. Otherwise I would write my own nested iteration: loop through the target keys, and find() on each.
A: for_each()?

Q1e: how do you hold the 10 col?
%%A: each object in container will have 10 fields. They could be 10 custom data classes or strings, ints, floats. Probably 10 smart pointers for maximum flexibility.

Q1h: what if I have other tables to cache too?
%%A: parametrize the CacheService class. CacheService class will be a wrapper of the vector. There will be other fields beside the vector.

Q1m: how about the data class? Say you have a position table and account table to cache
%%A: either inheritance or template.
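The questions above are framed in STL terms. As a language-neutral illustration, here is the same idea in Python, where `defaultdict(list)` plays the role of a multimap and a key function replaces the template parameter. `CacheService` is the name used in the answers; its methods and the sample rows are hypothetical.

```python
from collections import defaultdict


class CacheService:
    """Caches the rows of one table, grouped by a possibly non-unique key."""

    def __init__(self, rows, key):
        self._by_key = defaultdict(list)
        for row in rows:
            self._by_key[key(row)].append(row)

    def find(self, k):
        # .get() avoids creating an empty entry for a missing key
        return list(self._by_key.get(k, []))

    def find_many(self, keys):
        return [row for k in keys for row in self.find(k)]


positions = [
    {"acct": "A1", "sym": "IBM", "qty": 100},
    {"acct": "A1", "sym": "MSFT", "qty": 50},
    {"acct": "B2", "sym": "IBM", "qty": 70},
]
cache = CacheService(positions, key=lambda r: r["acct"])
```

The same class can cache an account table by passing a different key function, which corresponds to the "parametrize the CacheService" answer.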

## stateful OMS class design: observations

Here’s a well-established and large-scale order manager class design. It handles millions of orders a day.

  • The entire process is restarted on every trading day. Before the restart, all pending orders are cancelled! The OM is probably a per-thread singleton in the process.
  • The OM stores all the orders for the day, including each closed order in case it needs cancellation.
  • The OM keeps all the partial executions (aka partial fills) for a given order, because each execution could be busted.
  • Each action on an order (such as validation, partial execution ..) is performed by a dedicated object. For 1000 orders, if there are 5 actions, then there would be 5000 distinct “action objects”. The OM has pointers to all of these action objects.
  • Most action objects are stateful. ALL action objects are persisted somewhere so as to support busting/cancellation.
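A small sketch of why the OM keeps every partial fill: busting one execution should only require flipping a flag on the stored object. The class names and fields are hypothetical, and persistence is left out.

```python
class Execution:
    """One partial fill; kept so that it can be busted later."""

    def __init__(self, exec_id, qty, price):
        self.exec_id = exec_id
        self.qty = qty
        self.price = price
        self.busted = False


class OrderManager:
    def __init__(self):
        self.executions = {}   # order_id -> list of Execution objects

    def on_fill(self, order_id, exec_id, qty, price):
        self.executions.setdefault(order_id, []).append(Execution(exec_id, qty, price))

    def bust(self, order_id, exec_id):
        for ex in self.executions.get(order_id, []):
            if ex.exec_id == exec_id:
                ex.busted = True
                return True
        return False

    def filled_qty(self, order_id):
        return sum(ex.qty for ex in self.executions.get(order_id, []) if not ex.busted)


om = OrderManager()
om.on_fill("ord-1", "ex-1", 40, 10.0)
om.on_fill("ord-1", "ex-2", 60, 10.1)
om.bust("ord-1", "ex-1")
```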


[09]%%design priorities as arch/CTO

Priorities depend on industry, target users and managers’ experience/preference… Here are my Real answers:

A: instrumentation (non-opaque ) — #1 priority to an early-stage developer, not to a CTO.

Intermediate data store (even binary) is great — files; reliable[1] snoop/capture; MOM

[1] seldom reliable, due to the inherent nature — logging/capture, even error messages are easily suppressed.

A: predictability — #2 (I don’t prefer the word “reliability”.) related to instrumentation. I hate opaque surprises and intermittent errors like

  • GMDS green/red LED
  • SSL in Guardian
  • thick, opaque libraries like Spring
  1. Database is rock-solid predictable.
  2. javascript was predictable in my pre-2000 experience
  3. automation Scripts are often more predictable, but advanced python is not.

(bold answers are good interview answers.)
A: separation of concern, encapsulation.
* any team dev needs task breakdown. PWM tech department consists of teams supporting their own systems, which talk to each other on an agreed interface.
* Use proc and views to allow data source internal change without breaking data users (RW)
* ftp, mq, web service, ssh calls, emails between departments
* stable interfaces. Each module’s internals are changeable without breaking client code
* in GS, any change in any module must be done along with other modules’ checkout, otherwise that single release may impact other modules unexpectedly.

A: prod support and easy to learn?
* less support => more dev.
* easy to reproduce prod issues in QA
* easy to debug
* audit trail
* easy to recover
* fail-safe
* rerunnable

A: extensible and configurable? It often adds complexity and workload. Probably the #1 priority among managers I know on Wall St. It’s all about predicting what features users might add.

How about time-to-market? Without testability, changes take longer to regression-test? That’s pure theory. In trading systems, there’s seldom automated regression testing.

A: testability. I think Chad also liked this a lot. Automated tests are less important to Wall St than other industries.

* each team’s system to be verifiable to help isolate production issues.
* testable interfaces between components. Each interface is relatively easy to test.

A: performance — always one of the most important factors if our system is ever benchmarked in a competition. Benchmark statistics are circulated to everyone.

A: scalability — often needs to be an early design goal.

A: self-service by users? reduce support workload.
* data accessible (R/W) online to authorized users.

A: show strategic improvement to higher management and users. This is how to gain visibility and promotion.

How about data volume? important to eq/fx market data feed, low latency, Google, facebook … but not to my systems so far.

DB=%% favorite data store due to instrumentation

The noSQL products all provide some GUI/query, but not very good. Piroz had to write a web GUI to show the content of gemfire. Without the GUI it’s very hard to manage anything that’s built on gemfire.

As data stores, even binary files are valuable.

Note snoop/capture is no data-store, but falls in the same category as logging. They are easily suppressed, including critical error messages.

Why is RDBMS my #1 pick? ACID requires every datum to be persistent/durable, therefore viewable from any 3rd-party app, so we aren’t dependent on the writer application.

Y more threads !! help throughput if I/O bound

To keep things concrete, you can think of the output interface as the I/O.

The paradox — given an I/O bound busy server, the conventional wisdom says more threads could increase CPU utilization [1]. However, the work queue for CPU gets quickly /drained/, whereas the I/O queue is constantly full, as the I/O subsystem is working at full capacity.

[1] In a CPU bound server, adding 20 threads will likely create 20 idle, starved new threads!

Holy Grail is simultaneous saturation. Suggestion: “steal” a cpu core from this engine and use it for unrelated tasks. Additional threads or processes basically achieve that purpose. In other words, the cpu cores aren’t dedicated to this purpose.

Assumption — adding more I/O hardware is not possible. (Instead, scaling out to more nodes could help.)

If the CPU cores are dedicated, then there’s no way to improve throughput without adding more I/O capacity. At a high level, I clearly see too much CPU /overcapacity/.
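A toy model makes the argument concrete. It assumes every request needs both a CPU stage and an I/O stage, so throughput is capped by the slower stage. The capacities are made-up numbers, not measurements.

```python
def throughput(cpu_capacity, io_capacity):
    """Requests/sec for a pipeline where every request needs both CPU and I/O."""
    return min(cpu_capacity, io_capacity)


def utilization(cpu_capacity, io_capacity):
    rate = throughput(cpu_capacity, io_capacity)
    return {"cpu": rate / cpu_capacity, "io": rate / io_capacity}


before = utilization(cpu_capacity=10000, io_capacity=2000)
after = utilization(cpu_capacity=20000, io_capacity=2000)  # CPU-side capacity doubled
```

In this model, doubling CPU-side capacity leaves throughput at 2000 requests/sec and merely halves CPU utilization, which is the overcapacity described above.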

dotnet remoting and related jargon

P4 [[.net 1.1 remoting, reflection and threading]] shows an insightful history leading to dotnet remoting —
#1) RPC (pre-OO).
OO movement brought about the Next generation in the form of distributed objects (aka distributed components) —
#2) CORBA, RMI (later ejb) and dcom, which emerged around the same time.
COM is mostly for in-process and dcom is distributed
#3) soap and web services , which are OO-agnostic
I feel soap is more like RPC… The 2 distinct features of soap — xml/http. All predecessors are based on binary protocols (efficient), and the “service component” is often not hosted in any server.
#4) dotnet remoting feels more like RMI to me…According to the book above, remoting can use either
1) http channel with the soap formatter, or
2) tcp channel with the binary formatter

Therefore, I feel remoting is an umbrella technology with different implementations for different usage scenarios.

#5) WCF
Remoting vs wcf? See other post.

private bank trade/order/quote/execution flow

Remember — Most non-exchange traded products are voice executed. Only a few very dominant, high volume products are electronically executed. Perhaps 1% of the products account for 99% of the trades — by number of trades. By dollar amount, IRS alone is probably 50% of all the trades, and IRS is probably voice-executed, given the large notional amounts.

The products traded between the bank and its clients (not interbank) are often customized private offerings, unavailable from any other bank. (Remember the BofA puttable floats.)

RM / PWA would get live quotes from dealer and give to a client. Sometimes dealer publishes quotes on an internal network, but RFQ is more common. At any time the quote could be executed between RM and client. RM would book the new position into the bank's database. As soon as the trade is executed (before the booking), the bank has a position, but dealer knows the position only after the booking, and would hedge quickly.

Dealer initially only responds to RFQ. It's usually executed without her knowledge, just like an ECN flow.

I think in BofA's wealth management platform, many non-equity products (muni bonds are largely sold to retail clients) trade in the same way. Dealer publishes quotes on an intranet website. RM negotiates with client and executes over the phone. During trade booking, the price and quantity would be validated. Occasionally (volatile market), trade fails to go through and RM must inform client to retry. Perhaps requote. Fundamentally, the dealer gets a last look, unlike the exchange flow.

I believe structured products (traded between bank and clients) are usually not fast and volatile — less requote. However, when dealer hedges the position, I think she often uses vanilla instruments.

Terminology warning — some places use “trade” to mean many things including orders. I think in exchange flow, “order” is a precise word.

##[12] bottlenecks in a high performance data "flow" #abinitio


#1 probably most common — database, both read and write operations. Therefore, ETL solutions achieve superior throughput by taking data processing out of database. ETL uses DB mostly as dumb storage.

  • write – if a database data sink is too slow, then the entire pipe is limited by its throughput, just like sewage.
    • relevant in mkt data and high frequency trading, where every execution must be recorded
  • read – if you must query a DB to enrich or lookup something, this read can be much slower than other parts of the pipe.

#2 (similarly) flat files. Write tends to be faster than database write. (Read is a completely different story.)
* used in high frequency trading
* used in high volume market data storage — Sigma2 for example. So flat file writing is important in industry.
* IDS uses in-memory database + some kind of flat file write-behind for persistence.

#? Web service

#? The above are IO-bound. In contrast, CPU-bound compute-intensive transforms can (and do) also become bottlenecks.

max-thruput quote distribution: 6designs#CAS,socket

Update — fastest would require single-threaded model with no shared mutable state

Suppose a live feed of market quotes pumps in messages at the max speed of the network (up to 100gigabit/sec). We have (5) thousands of hedge fund clients, each with some number (not sure how large, perhaps hundreds) of subscriptions to these quotes. Each subscription sets up a filter that may look like some combination of “Symbol = IBM”, “bid/ask spread < 0.2…”, or “size at the best bid price….”. All the filters only reference fields of the quote object such as symbol, size and price. We need the fastest distribution system. Bottleneck should be network, not our application.

–memory allocation and copying–
If an IBM /quote/ matches 300 filters, then we need to send it to 300 destinations, therefore copying 300 times, but not 300 allocations within JVM. We want to minimize allocation within JVM. I believe the standard practice is to send just one copy as a message and let the receiver (different machine) forward it to those 300 hedge funds. Non-certified RV is probably efficient, but unicast JMS is fine too.

–socket reader thread latency–
Given the messaging rate, socket reader thread should be as lean as possible. I suggest it should blindly drop each msg into a buffer, without looking at it. Asynchronously consumer threads can apply the filters and distribute the quotes.

A fast wire format is fixed-width. Socket reader takes 500 bytes, assumes it’s one complete quote object, and blindly drops this 500-long byte array into the buffer.

–multicast rather than concurrent unicast–
See single/multi-thread TCP servers contrasted

–cpu dedication–
Each thread is busy and important enough to deserve a dedicated cpu. That CPU is never given to another thread.
Now let me introduce my design. One thread per filter. Buffer is a circular array — bounded but efficient pre-allocation. Pre-allocation requires fixed-sized nodes, probably byte arrays of 500 each. I believe de-allocation is free — recycling. Another friend (csdoctor) suggested an unbounded linked list of arrays. Total buffer capacity should exceed the *temporary* queue build-up. Slowest consumer thread must be faster than producer, though momentarily the reverse could happen.
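The fixed-width reader and the pre-allocated circular array can be sketched together. The original discussion is about a JVM; this Python version only illustrates the shape of the idea. `RingBuffer` and `read_records` are hypothetical names, an in-memory stream replaces the socket, and the sketch does not guard against overwriting slots that consumers have not read yet.

```python
import io

RECORD_SIZE = 500


class RingBuffer:
    """Bounded circular array of pre-allocated fixed-size slots."""

    def __init__(self, slots):
        self._slots = [bytearray(RECORD_SIZE) for _ in range(slots)]
        self._written = 0   # total records written so far

    def put(self, record):
        if len(record) != RECORD_SIZE:
            raise ValueError("expected one complete fixed-width record")
        # Copy into the existing slot; no new slot is allocated.
        self._slots[self._written % len(self._slots)][:] = record
        self._written += 1

    def get(self, seq):
        return bytes(self._slots[seq % len(self._slots)])


def read_records(stream, ring):
    """Drop each 500-byte chunk into the ring without looking inside it."""
    count = 0
    while True:
        chunk = stream.read(RECORD_SIZE)
        if len(chunk) < RECORD_SIZE:   # end of stream (or a trailing partial record)
            return count
        ring.put(chunk)
        count += 1


ring = RingBuffer(slots=4)
data = b"".join(bytes([i]) * RECORD_SIZE for i in range(3)) + b"xx"
n_read = read_records(io.BytesIO(data), ring)
```

A real socket can return fewer bytes than requested, so a production reader would have to accumulate bytes until a full record is available.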

—-garbage collection—-
Note jvm gc can’t free the memory in our buffer.

–Design 3–
Allocate a counter in each quote object. Each filter applied will decrement the counter. The thread that hits zero will free it. But this incurs allocation cost for that counter.

–Design 6–
Each filter thread records in a global var its current position within the queue. Each filter thread advances through the queue and increments its global var. One design is based on the observation that given the dedicated CPU, the slowest thread is always the slowest in the wolfpack. This designated thread would free the memory after applying its filter.

However, it’s possible for 2 filters to be equally slow.

–design 8–We can introduce a sweeper thread that periodically wakes up to sequentially free all allocations that have been visited by all filters.

–Design 9– One thread to apply all filters for a given HF client. This works if filter logic is few and simple.

–Design A (CAS)– Create any # of “identical” consumer threads. Any time we can expand this thread pool.
1)read BigArrayBuffer[++MyThreadPtr] into this thread’s register and examine the fields, without converting to a Quote instance.
2) examine the Taken boolean flag. If already set, then simply “continue” the loop. This step might be needed if CAS is costly.
3) CAS to set this flag
4a) if successful, apply ALL filters on the quote. Then somehow free up the memory (without the GC). Perhaps set another boolean flag to indicate this fixed-length block is now reusable storage.
4b) else just “continue” since another thread will process and free it.
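Steps 1 to 4b can be sketched as below. Python's standard library has no compare-and-set primitive, so a per-slot lock emulates it here; in Java the natural choice would be `AtomicBoolean.compareAndSet`. The `Slot` class and the payloads are hypothetical, and "applying all filters" is reduced to recording who claimed the slot.

```python
import threading


class Slot:
    def __init__(self, payload):
        self.payload = payload
        self.taken = False
        self._lock = threading.Lock()   # emulates a hardware CAS

    def try_claim(self):
        """Emulated compare-and-set of `taken` from False to True."""
        if self.taken:                  # step 2: cheap pre-check
            return False
        with self._lock:                # step 3: the "CAS"
            if self.taken:
                return False
            self.taken = True
            return True


def consumer(slots, handled, name):
    for slot in slots:                  # step 1: every thread scans every slot
        if slot.try_claim():            # step 4a: the winner processes the quote
            handled.append((name, slot.payload))
        # step 4b: losers simply continue


slots = [Slot(i) for i in range(1000)]
handled = []
threads = [threading.Thread(target=consumer, args=(slots, handled, t)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each slot is processed by exactly one thread, whichever thread reaches it first.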

hide client names and address

I proposed a system to a buy-side asset manager shop. I said client names don't need to be stored in the central database. Maybe the salesforce and investment advisors need the names but they don't need to save those in a shared central database for everyone else to see.

Each client is identified by account id, which might include an initial.

When client logs in to a client-facing website, they will not see their name but some kind of relatively public information such as their self-chosen nick name, investment objectives, account balance, and last login time.

Client postal address is needed only for those who opt for paper statement. And only one system needs to access it — the statement printing shop.

A veteran in a similar system told me this is feasible and proposed an enhancement — encrypt sensitive client information.

What are some of the inconveniences in practice?

Tx Monitoring System: distributed cache

I believe I learnt this from an Indian consultant while working in Barcap. Perhaps gigaspace?

Basic function is to host live transaction data in a huge cache and expose them to users. Includes outgoing orders and incoming execution reports. I think market quotes can also be hosted this way.
Consumers are either sync or async :
1) Most common client mode is synchronous call-and-wait. Scenario — consumer can’t proceed without the result.
2) Another common mode is subscription based.
3) A more advanced mode is query-subscription (similar to continuous query), where
– consumer first makes a sync call to send a query and get the initial result
– then MOM service (known as the “broker”) creates a subscription based on query criteria
– consumer must create an onMsg() type of listener.

Query criteria are formatted in SQL format. In a select A,B.. A actually maps to an object in the cache.
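The query-subscription mode can be sketched as follows. This is an in-process toy, not the actual product: `Broker` and its methods are hypothetical, a Python predicate stands in for the SQL-style criteria, and a plain callback stands in for the onMsg() listener.

```python
class Broker:
    def __init__(self):
        self.cache = {}           # key -> record
        self.subscriptions = []   # (predicate, listener) pairs

    def query_subscribe(self, predicate, on_msg):
        """Sync call: return current matches, then keep pushing later matches."""
        initial = [rec for rec in self.cache.values() if predicate(rec)]
        self.subscriptions.append((predicate, on_msg))
        return initial

    def update(self, key, record):
        self.cache[key] = record
        for predicate, on_msg in self.subscriptions:
            if predicate(record):
                on_msg(record)


broker = Broker()
broker.update("o1", {"sym": "IBM", "qty": 100})

received = []
initial = broker.query_subscribe(lambda r: r["sym"] == "IBM", received.append)

broker.update("o2", {"sym": "IBM", "qty": 200})   # pushed to the listener
broker.update("o3", {"sym": "MSFT", "qty": 50})   # filtered out
```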

Major challenge — volume. Millions of orders/day, mostly eq, futures and options. Gigabytes of data per day. Each order is 5KB – 10KB. One compression technique is FIX style data-dictionary — Requester and reply systems communicate using canned messages, so network is free of recurring long strings.

All cache updates are MOM-based.

Q: when to use async/sync?

A: Asynchronous query – needed by Live apps – need latest data
A: Synchronous query – reporting apps

no 2 thread for 1 symbol: fastest mkt-data distributor

Quotes (and other market data) sent downstream should be in FIFO sequence, not out-of-sequence (OOS).

In FX and cash equities (eg EURUSD), I know many major market data aggregators design the core of the feed engine to be single-threaded — each symbol is confined to a single “owning” thread. I was told the main reason is to avoid synchronization between 2 load-sharing threads. 2 threads improve throughput but can introduce OOS risk.
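One common way to confine each symbol to an owning thread is a stable hash of the symbol. The sketch below assumes that approach (the source does not say how the aggregators actually assign symbols) and uses `zlib.crc32` so the mapping stays the same across runs. Function names are hypothetical, and lists stand in for per-thread queues.

```python
import zlib


def owner_thread(symbol, n_threads):
    """Stable mapping so one symbol is always handled by the same thread."""
    return zlib.crc32(symbol.encode("ascii")) % n_threads


def partition(quotes, n_threads):
    """Split a quote stream into per-thread FIFO lanes."""
    lanes = [[] for _ in range(n_threads)]
    for seq, (symbol, price) in enumerate(quotes):
        lanes[owner_thread(symbol, n_threads)].append((seq, symbol, price))
    return lanes


quotes = [
    ("EURUSD", 1.10), ("IBM", 135.2), ("EURUSD", 1.11),
    ("IBM", 135.3), ("EURUSD", 1.12),
]
lanes = partition(quotes, n_threads=4)
```

Different symbols may be processed in parallel, but all quotes for one symbol sit in one lane in arrival order, so they cannot overtake each other.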

You can think of a typical stringent client as a buy-side high-frequency trader (HFT). This client assumes later-delivered quote is physically “generated” later. If 2 quotes arrive on the same name, one by one, then the later one always overwrites the earlier one – conflation.

A client’s HFT can react in microseconds, from receiving quote (data entering client’s network) to placing orders (data leaving client’s network). For such a fast client, a little bit of delay can be quite bad, but not as bad as OOS. I feel OOS delivery makes the data feed unreliable.

I was told many automated algo trading engines (including automatic offer/bid pricers in bond) send fake orders just to test the market. It sends a test order and waits for the response in the data feed. An OOS delivery would confuse this “observer”.

An HFT could be trend-sensitive. It monitors the rise and fall of sizes of the quotes on a given name (say SPX). It assumes the market data are delivered in-sequence.

learning design patterns #letter to Mithun

Another thing about design patterns – each author has a different description.

Some describe it in 2 paragraphs or a 10-line example using one dog/cat class. Others write 3 pages and 5 classes. Yet others write a whole chapter on it in a design pattern book.

If I go by the simplistic interpretation to describe a pattern, then the interviewer may think I don’t really know it.

If I give an in-depth example, then it could be too complicated to describe (I’m not extremely good at describing complexity) and interviewer may not be an expert on that pattern to understand me.

My suggestion is to focus on 1 or 2 patterns. Understand them inside out and also remember some simple examples in your own app. Our knowledge of a pattern should grow from thin to thick to thin, yes back to thin. Only when we have a streamlined understanding can we describe it with clarity.

Here are some complex patterns I have struggled with for years – visitor, bridge, strategy, composite, chain of responsibility, memento, command, observer …. Each of them can be summarized in 1 paragraph, but let’s be honest – these aren’t simple.

async (almost)always requires buffer and additional complexity

Any time I see asynchronous (swing, MOM etc), I see additional complexity. Synchronous is simpler. Synchronous means blocking, and requires no object besides the caller actor and service actor. The call is confined to a single call stack.

In contrast, async almost always involves 2 call stacks, requires a 3rd object in the form of a buffer [1]. Async means caller/sender can return before responder/callback even gets the message. In that /limbo/, the message must be kept in the buffer. If responder were a doctor then she might be “not accepting new patients“.

Producer/consumer pattern … (details omitted)
Buffer has capacity and can overflow.
Buffer is usually shared by different producer threads.
Buffer can resend.
Buffer can send the messages out of order.
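A minimal sketch of the "capacity and overflow" point, using Python's standard bounded queue as the buffer. The capacity of 3 and the drop-on-full policy are arbitrary choices for illustration.

```python
import queue

buffer = queue.Queue(maxsize=3)   # the 3rd object between sender and responder
dropped = []

# Producer side: the responder is not draining yet, so the buffer fills up.
for msg in range(5):
    try:
        buffer.put_nowait(msg)
    except queue.Full:
        dropped.append(msg)       # overflow: this sketch simply drops the message

# Consumer side: drain whatever survived.
drained = []
while not buffer.empty():
    drained.append(buffer.get_nowait())
```

Blocking the producer instead of dropping is the other common policy, but then the "async" sender is no longer free to return immediately.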

[1] I guess the swing event object must be kept not just on the 2 call stacks, but on the event queue — the buffer

Q: single-threaded can be async?
A: yes the task producer can enqueue to a buffer. The same thread periodically dequeues. I believe swing EDT thread can be producer and consumer of tasks i.e. events. Requirement — each task is short and the thread is not overloaded.
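A bare-bones sketch of single-threaded async, loosely modeled on an event queue. The `submit` and `handle_click` names are made up and this is not the swing API.

```python
from collections import deque

tasks = deque()   # the buffer, here an event queue
log = []


def submit(task):
    """Enqueue and return immediately; nothing runs yet."""
    tasks.append(task)


def handle_click():
    log.append("click handled")
    submit(lambda: log.append("repaint"))   # a consumer can also be a producer


submit(handle_click)
log.append("submit returned")   # the caller moves on before the task has run

# The same thread later dequeues and runs each short task.
while tasks:
    tasks.popleft()()
```

The log shows the caller returning before the handler runs, even though only one thread exists.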

Q: timer callback in single-threaded?
A: yes. Xtap is single-threaded and uses epoll timeout to handle both sockets and timer callbacks. If the thread is busy processing socket buffers it has to ignore the timer, otherwise the socket buffer will fill up. Beware of the two “buffers”:

  • NIC hardware buffer is very small, perhaps a few bytes only, processed by hardware interrupt handler, without pid.
  • kernel socket buffer is typically 64-256MB, processed under my parser pid.
    • some of the functions are kernel tcp/udp functions, but running under my parser pid
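The single-threaded socket-plus-timer loop described above can be sketched roughly as follows. This is not Xtap code. I use Python's `selectors` module (which picks epoll on Linux) and a local socket pair, and the interval, run time and variable names are arbitrary.

```python
import selectors
import socket
import time

reader, writer = socket.socketpair()
reader.setblocking(False)
sel = selectors.DefaultSelector()
sel.register(reader, selectors.EVENT_READ)

TIMER_INTERVAL = 0.01   # seconds, arbitrary
received = []
timer_fires = 0
writer.sendall(b"tick-data")

start = time.monotonic()
next_timer = start + TIMER_INTERVAL
while time.monotonic() - start < 0.1:
    # Block until a socket is readable or the next timer is due.
    timeout = max(0.0, next_timer - time.monotonic())
    events = sel.select(timeout)
    for key, _ in events:
        received.append(key.fileobj.recv(4096))   # sockets take priority
    if not events and time.monotonic() >= next_timer:
        timer_fires += 1                          # timer runs only when sockets are idle
        next_timer += TIMER_INTERVAL

sel.unregister(reader)
sel.close()
reader.close()
writer.close()
```

One thread serves both roles. The price is that a flood of socket data postpones the timer callback.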

See which thread/pid drains NIC_buffer}socket_buffer

common technical challenges in buy-side software systems

10 – 30% of wall st IT jobs are on the buy-side such as funds, portfolio and asset management.

* Core challenge – sub ledger. In one system there were more than 100,000 client accounts in the sub ledger. Each is managed by professional financial advisors. I believe each account on average could have hundreds of positions. (Thousands would be overwhelming I feel.) Since clients (and FA) need instant access to their “portfolio”, there are too many positions to keep up to date. Core of the entire IT infrastructure is the subledger.

** Number of trades per day is a 2nd challenge. These aren’t high-frequency traders, but many Asian clients do nothing but brokerage (equity) trades. Per-account not many, but all the accounts combined is a lot of processing overnight. These must add to the sub ledger by next day.

* quarterly performance reporting
** per-fund, per-account, per-position
** YTD, quarterly, annual etc

I guess there is also monthly performance reporting requirement in some cases.

* asset allocation and periodic portfolio re-balancing — for each client. Key differentiators. Investors get a large “menu” of funds, products … For comparison, they may want performance metrics.

– VaR or realtime risk? Probably important to large funds
– pricing? Probably important to large funds

– swing/wpf not really required. Web is adequate.
– trade booking? not a challenge

Database: limited usage]real time trading

“Database” and “Real-time trading” don’t rhyme!

See Trading systems use lots of MOM and distributed cache.

In comparison, DB offers perhaps the most effective logging/audit. I feel every update sent to MOM or cache should ideally be asynchronously persisted in DB. I would probably customize an optimized DB persistence service to be used across the board.

Just about any update in cache needs to be persisted, because cache is volatile memory. Consider flat file.
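A rough sketch of what I mean by asynchronous persistence: the caller updates the cache and returns, while a background thread writes the audit row. SQLite, the `audit` table and the `put` helper are stand-ins I made up, not any particular product's write-behind feature.

```python
import queue
import sqlite3
import threading

cache = {}
pending = queue.Queue()   # buffer between the hot path and the DB writer

# check_same_thread=False because the writer thread uses this connection;
# the main thread only reads it after the writer has finished.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE audit (key TEXT, value TEXT)")


def writer() -> None:
    while True:
        item = pending.get()
        if item is None:        # shutdown sentinel
            break
        db.execute("INSERT INTO audit VALUES (?, ?)", item)
    db.commit()


def put(key: str, value: str) -> None:
    cache[key] = value          # fast path: memory only
    pending.put((key, value))   # caller does not wait for the DB


t = threading.Thread(target=writer)
t.start()
put("order-1", "NEW")
put("order-1", "FILLED")
pending.put(None)
t.join()

rows = db.execute("SELECT key, value FROM audit ORDER BY rowid").fetchall()
```

The cache holds only the latest state, while the DB keeps every update in order, which is the audit trail.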

[11] real time high volume FX quote processing #letter

Horizontal scale-out (distributing to different boxes) is the design of choice when we are cpu-bound. For instance, if we get hundreds of updates a sec and each update requires repricing a large number of objects.

Ideally, you would want cpu to be saturated. (By using twice the hardware threads, you want throughput to double.) Our pricing engine didn’t have that much cpu load, so we didn’t scale out to more than a few boxes.

The complication of scale-out is, data required to reprice one object may reside in different boxes. People try many solutions like memory virtualization (non-trivial synchronization cost + network latency), message-passing, RMI, … but I personally prefer the one-big-machine approach. Throw in 16 (or 128) processors, each with say 4 to 8 hardware threads, run 64-bit, throw in 256G RAM. No network latency. No RMI/messaging latency. I think this hardware is rather costly. 8 smaller machines with comparable total CPU power would cost much less, so most big banks prefer that route – so-called grid computing.

According to my observations, most practitioners in your type of situations eventually opt for scale-out.

It sounds like after routing a message, your “worker” process has all it needs in its local memory. That would be an ideal use case for parallel processing.

I don’t know if FX spot real time pricing is that ideal. Specifically, suppose a worker process is *dedicated* to update and publish eur/usd spot quote. I know you would listen to the eurusd quotes from all liquidity providers, but do you also need to watch usd/jpy and eur/jpy?

15,000 quotes repriced within a minute

One of my bond pricing engines could price about 15,000 offers/bids in about a minute. 4 slow lanes to avoid:
1) database persistence is done asynchronously by gemfire write-behind.

2) offers/bids we produce must be verified by another system, which officially owns the OutgoingQuote table. The verification takes a long time. We avoid that overhead by pricing all the offers/bids in gemfire, then send them out by batch, then wait for the result. The 1 minute speed is without the verification.

3) all reference data is preloaded into gemfire, so no more disk I/O.

4) minimal serialization overhead, since most of the objects needed are in local JVM.

In contrast, a more complex engine, the mark-to-market engine needs a few minutes to price 15,000 positions. This engine doesn't need real time performance.

y java is dominant in enterprise app

What's so good about OO? Why do the 3 most “relevant” enterprise app dev languages all happen to be OO – java, c# and c++?

Why is google choosing java, c++ and python?

(Though this is not really a typical “enterprise app”) Why is apple choosing to promote a compiled OO language — objective C?

Why is microsoft choosing to promote a compiled OO language more vigorously than

But why is facebook (and yahoo?) choosing php?

Before c++ came along, most enterprise apps were developed in c, cobol, fortran…. Experience in the field shows that c++ and java require more learning but do offer real benefits. I guess it all boils down to the 3 base OO features of encapsulation, inheritance and polymorphism.

enterprise reporting with/without cache #%%xp


(A personal blog) We discussed enterprise reporting on a database with millions of new records added each day. Some reflections…

One of my tables had about 10G data and more than 50 million rows. (100 million is kind of minimum to qualify as a large table.) This is the base table for most of our important online reports. Every user hits this table (or its derivative summary tables) one way or another. We used more than 10 special summary tables and Business Objects and the performance was good enough.

With the aid of a summary table, users can modify specific rows in main table. You can easily join. You can update main table using complex SELECT. The most complex reporting logic can often be implemented by pure SQL (joins, case, grouping…) without java. None of these is available to gigaspace or hibernate users. These tools simply get in the way of my queries, esp. when I do something fancy.
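A small illustration of the pure-SQL style I have in mind, run here through SQLite for convenience. The table names, columns and the 200 threshold are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE trades (
    id INTEGER PRIMARY KEY, account TEXT, amount REAL, flagged INTEGER DEFAULT 0);
INSERT INTO trades (account, amount) VALUES ('A', 100), ('A', 300), ('B', 50);

-- Summary table derived from the main table
CREATE TABLE account_summary AS
    SELECT account, COUNT(*) AS n, SUM(amount) AS total
    FROM trades GROUP BY account;

-- Update the main table using a SELECT against the summary table
UPDATE trades SET flagged = 1
 WHERE account IN (SELECT account FROM account_summary WHERE total > 200);
""")

# Join main table and summary table for the report
report = db.execute("""
    SELECT t.id, t.account, s.total, t.flagged
      FROM trades t JOIN account_summary s ON s.account = t.account
     ORDER BY t.id
""").fetchall()
```

No application code touches the rows; grouping, joining and the conditional update all stay in SQL, where prod support can read and rerun them.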

In all the production support (RTB) teams I have seen on wall street, investigating and updating DB is the most useful technique at firefighting time. If the reporting system is based on tables without cache, prod support will feel more comfortable. Better control, better visibility. The fastest cars never use automatic gear.

Really need to limit disk I/O? Then throw enough memory in the DB.

y trading systems use so many stored procedures

A popular Wall Street interview question is the pros and cons of stored
proc. Here are a few
#1 single point of access from java, c++ …
#2 modular encapsulation. separation of concern
+ network efficiency
+ access control
+ reusable. DRY
+ easy version control
– readability
– exception handling
– hard to log actual query

Perhaps the biggest motivation is to avoid recompiling binary in an
emergency fix. Many sites have extremely strict control on binary
build/deployment [1]. Every release always builds from version control.
If you need a bug fix release, then deal with all the changes checked
into cvs but not approved!

Redeploying a binary can also break any number of (or all) other applications.

Proc is the answer to your prayer. In some places, every
select/insert/update/delete statement is extracted into a proc. Changing
the logic in them feels almost painless compared to a binary
build/release. Hibernate is a big departure from the proc tradition.

1) Wall Street users want frequent changes, not bound by software
release controls. Control-vs-time-to-market makes a healthy contention.
2) Wall Street code is often extremely (quick and) dirty, so fixing bugs
without software release is often a life saver.

About half of all business logic, both features (1) and bugs (2), are
often expressed in SQL. Now you see how useful it is to have flexible
ways to change the SQL logic.

If you think hard and always forecast which business logic might need
change, then you can strategically extract those SQL into stored procs.

[1] Given the huge sums involved, wall st wants control on software.
They can't control code quality but can control build/release. Many,
many levels of approvals. Numerous staging, integration, QA, preQA

## [11] y no dotnet on sell-side server side@@

(A fairly sketchy, limited, amateurish write-up.)
I was recently asked “dotnet has formidable performance and other strengths compared to java, but in trading engines space, why is dotnet making inroads only on the user-interface, never on the server side?”

Reason — as a casual observer, I feel Windows was designed as a GUI operating system with a single user at any given time. Later WinNT tried to extend the kernel to support multiple concurrent users. In contrast, Unix/Linux was designed from Day 1 to be multi-user, with the command line as the primary UI. (Personally I used to feel GUI is a distraction to high volume data processing OS designers.) A trading server needs no GUI.

Reason — Java and c/c++ were created on Unix; dotnet runs only on a windowing operating system. I feel web server is a light weight application, so both java and dotnet (and a lot of scripting languages) are up to the job[1], but truly demanding server-side apps need Unix AND java/c++. I guess Windows is catching up. In terms of efficiency, I guess java and c# are comparable and below C++.

Reason — Sell-side trading system is arms race. (Perhaps same among hedge funds.) Banks typically buy expensive servers and hire expensive system engineers, and then try to push the servers to the max. C/C++ makes the most efficient use of system resources, but Java offers many advantages over C++. Since the late 90’s, trading servers have progressively migrated from C++ to Java. Java and C++ are proven on the high-performance server side. Not dotnet.

Reason — I still feel *nix is more stable than Windows under high load. However, I think you can create big clusters of Windows servers.

Reason — (from a friend) — *nix is considered more secure than windows. A GUI desktop can affect one trader if compromised, but a sell-side trading server affects all the traders from all the institutional and retail clients if compromised. So security risk is more serious on server side than GUI side.

The reasons below are arguments for java over dotnet in general, but don’t really explain why java is NOT on the GUI side and dotnet is still chosen on the GUI side.

Reason — big banks need stronger support than a single vendor company. What if Microsoft makes a mistake and dotnet loses technical direction and momentum? Java and *nix enjoy broader industry support.

[1] unless you are google or facebook, who needed c++ for their demanding requirements.

overnight risk reporting in portfolio management

I talked to a big portfolio mgmt (PM) firm. Team owns and delivers nightly risk reports to traders (+ perhaps fund managers). According to the team mgr, the most important sister team is the quant team, who are often PhD's but not professional coders. Quants are really qualified to create models but these quants actually implement their models in c++.

There's a large amount of data in DB. Nightly job reads in these data and analyzes them using the c++ models, then writes data back into DB.

This is a heavy-duty number crunching batch job, heavy on DB, light on network – no socket programming.

Logic is mostly in perl, c++, shell and DB. DB holds significant amount of logic, just like Goldman Sachs PrivateWealthManagement. It turned out c++ implements more business logic than perl. These perl scripts are considered low-logic, but if there's a lot of perl, then I believe there's a lot of logic.

Perf is the biggest issue. Job must complete in 12 hours, before a 3am deadline, without break. If it breaks, there will be … delays and …? Bottleneck is DB. There's spare hardware capacity underutilized but the DB server is on its knees. I have heard of the same many times, in GS, citi… so I guess this is hard to avoid. Risk system is probably worst affected.

%Q: stress testing? Monte Carlo?
A: the reporting system doesn't do those. Those are probably the job of quants.

%Q: is VaR the key output?
A: no. duration, curve duration, spread duration

%Q: are matrices and “vectors” used in the c++ code, like those in matlab? So it goes well beyond STL?
A: yes quants use matlab and mathematica to develop the concept, and then use c++ to implement it. We do have our own data structures beyond STL.

%Q: how much domain knowledge required in the analytical work?
A: more of an aptitude and attitude to learn

multiple intermediate data storage]real time trading servers

For easy prod support, get your first stage of processing to save
intermediate output to cache, DB or MOM, and 2nd stage to pick up from
there. You can have many stages (ie pipes) and pipe connectors.
This might help your job security if other developers can’t easily
figure out all of your techniques saving, accessing, investigating (in
prod), filtering, monitoring the intermediate data. Remember gemfire
doesn’t have a working data browser?
This helps testing. Remember Mithun’s DBank cash management project.
This helps prod monitoring.
This helps everyone understand the business as they can see the
intermediate data in blood and flesh. You can get interesting
Recall Reo has limited logging so we don’t know why some events don’t
happen upon a user action or market update.

java RMI in trading systems

Now I feel rmi is rather easy, battle-tested, proven, mature,
well-researched, … compared to many alternative solutions. Here's
RMI usage in a trading system circa 2011 —

Nobody calls Neo server via RMI.  The only way you can talk to Neo is
via JMS/Protobuf.  So even if you have 100 instances of Neo servers,
JMS distributes the messages across them.

Neo does make _outbound_ RMI calls to PricingControl, Arb/prop/model
trading engine, and various other systems.

bond trade capture system use-cases

Trading system architect must know such essential use cases:

A hypothetical bond trade booking sys – named Blo (for Blotter)

Blo use case 1: phone execution, then trader enter trade into Blo.

Blo use case 2: traders advertise offers and bids on an internal network. Our salesperson lifts an offer. Trade is confirmed on the spot. System automatically books trade into Blo. This flow converts the Order into a Trade automatically. It’s possible for 2 salespersons to lift the same offer. System will reject A and book B.

Blo use case 3: advertise offers to external venue, lifted automatically. External venue sends us confirmation and trade booked.

Blo use case 4: trader responds to external bid-wanted (RFQ) and her bid is selected, becoming a trade. External venue sends confirmation to us, trade booked.
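The race in use case 2 (2 salespersons lifting the same offer) can be sketched like this. The `Offer` class and its `lift` method are hypothetical; a real system would more likely do the check-and-set inside a DB transaction or a cache entry lock.

```python
import threading


class Offer:
    def __init__(self, offer_id: str) -> None:
        self.offer_id = offer_id
        self.lifted_by = None
        self._lock = threading.Lock()

    def lift(self, salesperson: str) -> bool:
        """Atomically check and mark the offer; only the first caller wins."""
        with self._lock:
            if self.lifted_by is not None:
                return False            # reject: someone already lifted it
            self.lifted_by = salesperson
            return True                 # accept: book the trade


offer = Offer("OFFER-1")
results = {}


def attempt(name: str) -> None:
    results[name] = offer.lift(name)


threads = [threading.Thread(target=attempt, args=(n,)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread gets the lock first books the trade and the other gets a rejection, so exactly one trade results from one offer.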

In Eq, there’s often a big OMS to manage the order state from an initial request to a completed trade.

DB as audit trail for distributed cache and MOM

MOM and distributed cache are popular in trading apps. Developers tend to shy away from DB due to latency. However, for rapid development, relational DB offers excellent debugging, tracing, and correlation capabilities in a context of event-driven, callback-driven, concurrent processing. When things fail mysteriously and intermittently, logging is the key, but u often have multiple log files. You can query the cache but much less easily than DB.

Important events can be logged in DB tables and

* joined (#1 most powerful)
* sorted,
* searched in complex ways
* indexed
* log data-mining. We can discover baselines, trends and anti-trends.
* Log files are usually archived (less accessible) and then removed, but DB data are usually more permanent. Don't ask me why:)
* selectively delete log events, easily, quickly.

* Data can be transformed.
* accessible by web service
* concurrent access
* extracted into another, more usable table.
* More powerful than XML.
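To show the kind of correlation I mean by "joined", here is a sketch with 2 made-up event tables, one written by the MOM listener and one by the cache updater. SQLite is used only to keep the example self-contained.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE mom_events   (order_id TEXT, ts REAL, event TEXT);
CREATE TABLE cache_events (order_id TEXT, ts REAL, event TEXT);
INSERT INTO mom_events VALUES
    ('O1', 1.0, 'received'), ('O2', 1.1, 'received'), ('O3', 1.2, 'received');
INSERT INTO cache_events VALUES
    ('O1', 1.05, 'cached'), ('O3', 1.9, 'cached');
""")

# Messages that arrived on MOM but never made it into the cache
missing = [row[0] for row in db.execute("""
    SELECT m.order_id FROM mom_events m
      LEFT JOIN cache_events c ON c.order_id = m.order_id
     WHERE c.order_id IS NULL
""")]

# Messages that took more than half a second to reach the cache
slow = [row[0] for row in db.execute("""
    SELECT m.order_id FROM mom_events m
      JOIN cache_events c ON c.order_id = m.order_id
     WHERE c.ts - m.ts > 0.5
     ORDER BY m.order_id
""")]
```

Answering the same 2 questions from separate log files would take a script and some patience.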

Perhaps the biggest logistical advantage of DB is easy availability. Most applications can access the DB.

Adding db-logging requires careful design. When time to market is priority, I feel the debug capability of DB can be a justification for the effort.

A GS senior manager preferred logging in DB. Pershing developers generally prefer searching the same data in DB rather than file.

y a regular developer need design patterns

I asked a friend familiar with design patterns. Here's his answer + my comments.

* Sometimes you need to provide an API to another developer. I feel it's often beneficial to provide a familiar API based on a familiar pattern.
* When you refactor existing code
* A lot of frameworks out there embody design patterns. If you have to create your own framework (for whatever reason), you might need to decipher and follow the same design patterns.

In all these scenarios, concept is more important than knowledge. Variations on the theme needed.

pricing control in a bond dealer desk

Pricing (along with pnl) is one of the most important data to monitor and control. There’re multiple levels of price controls.
* Offer/bid price limits, to block out-of-range offer/bid advertisements
* the price in a response to an IFB is typically sent out via a system and is probably subject to price control, to prevent bidding too high.
* After trade execution, Middle Office would check the price against some reference prices. If a trader executed an unusual price, she may be responsible. I was told MO only bothers with unfinished (i.e. unclosed) positions.
* Pricing exception report and attestation. I think this is internal compliance.
* There could be regulations on unusual execution prices in some regulated securities. It’s conceivable that government wants to know every trade’s price in a particular derivative so as to prevent another bank collapse.
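The first level of control (blocking out-of-range offers/bids) might look like this in its simplest form. The 2% band, the reference price and the function names are assumptions for illustration, not the actual rule of any desk.

```python
def within_limits(price: float, reference: float, max_deviation: float = 0.02) -> bool:
    """True if price is within +/- max_deviation (as a fraction) of the reference."""
    return abs(price - reference) <= max_deviation * reference


def screen_quotes(quotes, reference_prices):
    """Split outgoing quotes into those allowed out and those blocked."""
    allowed, blocked = [], []
    for bond, price in quotes:
        ok = within_limits(price, reference_prices[bond])
        (allowed if ok else blocked).append((bond, price))
    return allowed, blocked


reference_prices = {"BOND-X": 100.0}
quotes = [("BOND-X", 101.0), ("BOND-X", 105.0), ("BOND-X", 97.0)]
allowed, blocked = screen_quotes(quotes, reference_prices)
```

Blocked quotes would typically go to an exception report for someone to attest to, rather than being silently discarded.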

trade booking, trade capture, position management, sub ledger

The standard OTC trade booking system (TBS) — After u finalize i.e. execute a trade with your counterparty, typically over phone, you enter the completed trade in the TBS. Some call it trade capture. I think this used to be the trade blotter. Before TBS, people used spreadsheet. This is one of the earliest and most essential IT systems for traders.

The other absolutely essential trading system is the position management system (PMS), aka sub ledger. TBS records all the trade activities, and independently computes current positions by accumulation, and synchs up with the PMS every day.
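A sketch of that daily sync, under my own simplifying assumptions: positions are plain signed quantities per symbol, and the function names and sample data are invented.

```python
from collections import defaultdict


def positions_from_trades(trades):
    """TBS side: accumulate signed trade quantities into positions."""
    positions = defaultdict(int)
    for symbol, signed_qty in trades:
        positions[symbol] += signed_qty
    return dict(positions)


def reconcile(tbs_positions, pms_positions):
    """Return the breaks: symbols where the 2 systems disagree."""
    breaks = {}
    for symbol in set(tbs_positions) | set(pms_positions):
        tbs_qty = tbs_positions.get(symbol, 0)
        pms_qty = pms_positions.get(symbol, 0)
        if tbs_qty != pms_qty:
            breaks[symbol] = (tbs_qty, pms_qty)
    return breaks


trades = [("IBM", 100), ("IBM", -40), ("T 4.5 2030", 1000)]
tbs = positions_from_trades(trades)
pms = {"IBM": 60, "T 4.5 2030": 900}   # sub ledger snapshot
breaks = reconcile(tbs, pms)
```

An empty result means the 2 independent computations agree; anything else is a break for ops to investigate.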

How about pricing engine? In OTC, trader can decide the price with a pencil or a sophisticated pricing engine. I think it’s firm’s money but trader’s decision, so it’s up to her.

eq listed drv desk

Some basic info from a friend –

Equity Listed derivatives – mostly options on single stocks or options on index/future, but also variance-swaps. Even if a stock has no listed options, we would still create a vol surface so as to price OTC options on it, but the technique would be different — The standard technique if given many pairs of {expiration, strike} is to fit a curve on a single expiration, then create similar curves for other expirations on the same underlyer (say IBM), then try to consolidate all IBM curves into a smooth IBM vol surface. Each “point” on the surface is an implied vol value. I was told some of the more advanced “fitting” math is extracted out into a C++ quant lib.

Instrument pricing has to be fast, not multi-second. I guess this is pre-trade, RFQ bid/offer pricing, similar to bond markets’ bid-wanted. In contrast, the more “real” need for vol surface is position pricing (or mark-to-market), which provides unrealized PnL. I feel this is usually end-of-day, but some traders actually want it real time. Beside the traders on the flow[3]/listed/OTC derivative desks, the vol surface is also used by many other systems such as structured derivatives, which are entirely OTC.

It’s quite hard to be really event-driven since events are too frequent, instruments too numerous, and the pricing algo non-trivial, exactly like FX option real time risk. Instead, you can schedule periodic repricing batches once every few minutes.
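One way to picture the batch approach: events only mark an underlier as dirty, and a scheduled job reprices each dirty underlier once. This is a sketch with invented names; the real scheduling and pricing are of course far heavier.

```python
dirty = set()        # underliers touched since the last batch
reprice_calls = []


def on_market_event(underlier: str) -> None:
    """Cheap event handler: just remember that something changed."""
    dirty.add(underlier)


def run_batch(reprice) -> list:
    """Called by the scheduler every few minutes."""
    batch = sorted(dirty)
    dirty.clear()
    for underlier in batch:
        reprice(underlier)   # expensive vol-surface and derivative repricing
    return batch


# A burst of 1000 ticks across 2 underliers between 2 batches
for i in range(1000):
    on_market_event("IBM" if i % 2 else "SPX")

batch = run_batch(reprice_calls.append)
```

1000 events collapse into 2 repricings, at the cost of prices being up to one batch interval stale.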

About 3500 underliers and about 450,000 derivative instruments. On average over 100 derivatives on each underlier (over 100 combinations of strike/tenor). S&P500 has more than 1000 derivatives on it.

Market data vendors — Reuters, Wombat, Bloomberg.

Inputs to vol calculation — product reference (strike/tenor), live market quotes, dividend, interest rate …

One of the most common OTC equity derivatives is barrier option.

Pricing and risk tend to be the most mathematically challenging.

Exchange connectivity is usually c++, client connectivity (clients to send orders or receive market data) is usually java.

[3] Flow means agency trading, mostly for institutional clients. Retail clients are very wealthy. Those ordinary retail investors won’t use an investment bank. Flow equity derivative can be listed or OTC.

Spring can add unwanted (unnecessary) complexity

[5] T org.springframework.jms.core.JmsTemplate.execute(SessionCallback action, boolean startConnection) throws JmsException
Execute the action specified by the given action object within a JMS Session. Generalized version of execute(SessionCallback), allowing the JMS Connection to be __started__ on the fly, magically.
Recently I had some difficulties understanding how jms works in my project. ActiveMQ hides some sophisticated stuff behind a simplified “facade”. Spring tries to simplify things further by providing a supposedly elegant and even simpler facade (JmsTemplate etc), so developers don’t need to deal with the JMS api[4]. As usual, spring hides some really sophisticated stuff behind that facade.

Now I have come to the view that such a setup adds to the learning curve rather than shortening it. Quickest learning curve is found in a JMS project using nothing but standard JMS api. This is seldom a good idea overall, but it surely reduces learning curve.

[4] I don’t really know how complicated or dirty it is to use standard JMS api directly!

In order to be proficient and become a problem solver, a new guy joining my team probably needs to learn both the spring stuff and the JMS stuff [1]. When things don’t behave as expected[2], perhaps showing unexpected delays and slightly out-of-sync threads, you don’t know if it’s some logic in spring’s implementation, or our spring config, or incorrect usage of JMS or a poor understanding of ActiveMQ. As an analogy, when an alcoholic-myopic-diabetic-cancer patient complains of dizziness, you don’t know the cause.

If you are like me, you would investigate _both_ ActiveMQ and Spring. Then it becomes clear that Spring adds complexity, not reduces complexity. This is perhaps one reason some architects decide to create their own frameworks, so they have full control and don’t need to understand a complex framework created by others.

Here’s another analogy. If a grandpa (like my dad) wants to rely on email everyday, then he must be prepared to “own” a computer with all the complexities. I told my dad a computer is nothing comparable to a cell phone, television, or camera as a fool-proof machine.

[1] for example, how does the broker thread start, at what time, and triggered by what[5]? Which thread runs onMessage(), and at what point during the start-up? When and how are listeners registered? What objects are involved?

[2] even though basic functionality is there and system is usable

trade booking/capture in the big picture

For a novice who wonders just how important trade-capture is…

b/c (i.e. trade booking/capture) is the #1 essential component of trading systems, if you look across assets. B/c is often the _heart_ of an OTC trading desk or voice trading desk. But this is not true for trading desks against an exchange/interdealer, because pre-trade apps take center stage, and post-trade flow becomes middle-office.

b/c (along with position/pnl and trade blotter) is the first task to be computerized on wall street.

b/c is the basis of position master ie sub-ledger (often in mainframes), one of the most essential systems in any trading system. Sub-ledger is basis of pnl.

I feel b/c is relatively _low_tech_ compared to market data, low latency and some pre-trade systems. However, I feel in an exchange or a large sell-side firm, execution volume can be high.

I feel b/c demands more precision, more stability, better error rate, more robustness… than most pre-trade systems. This is because b/c is the point of no return — After an order is executed it can’t be canceled effortlessly.

In a voice trading desk, b/c is actually post-trade, because the trader executes the trade over phone and simply enters data into the books. Remember MTSTradeEngine? In contrast, the fully electronic b/c is not post-trade but sits at the choke point right between pre-trade and post-trade.

FX option trading – a typical arch

Just as in equity options, the core component is risk engine, because positions are large and long-term. See other reasons in my post on option trading systems.

— My hypotheses —
* I guess for both fx option and IRD, core engine is a realtime event-driven position updater (another side of the same coin as risk engine). Each position has a lot of contract attributes and risk attributes, all subject to frequent updates. A typical FX option desk probably has “too many” positions each reacting to a lot of events, but each update is complex and time-consuming.
* In contrast, cash desk has fewer positions and simpler positions.
* In bond trading, any non-flat position is also subject to updates in terms of marking and unrealized PnL, but calc is simpler.
FX option is an OTC market – “no electronic trading” (I guess no ECN either), but there are electronic trade messages in addition to manual trade booking. There’s also a plan to access CME listed FX options. Note this plan is not about fx options on futures, and not about PHLX.

%Q: so is it voice based?
A: various means.

Clearing could be done at the London Clearing House. I guess London is a bigger center than NY.

A lot of “exotic” fx option products come online every year. There’s pressure to automate and speed up new product launch. I would guess 1) position management and 2) booking are among the most essential features needed by any new FX option instrument. System must be able to persist positions in these exotic options. If automatic STP booking is hard, then ops can manually enter them, assuming volume is low on new products.

Volume of Trades – FX options desk gets about 1500 trades/day. In contrast, FX cash desk (includes futures + forwards) gets about 100 times the volume, but profit is perhaps 2 to 3 times that of FX options desk, obviously different margins.

Volume of Positions – FX cash desk keeps most positions flat so very few positions are non-flat. FX options desk has “too many” open positions, a big headache to risk engine.

Entire FX options desk needs about 20 desk-specific developers world-wide. Besides, I guess there are many supporting systems owned by other teams outside the desk. These teams include (not limited to) firmwide teams, probably further away from the profit centers.

FX option trading is more complex than FX cash trading.

Q: Are FX derivatives simpler than equity derivatives?
A: not necessarily. FX involves 2 interest rates. Eq involves dividends.

— system modules owned by dedicated desk developers–
FIX server (perhaps for market data, not e-trading?)
GUI is in Tcl, early versions of C# platform and WPF.
Market data is a major component in FX. Many modules react to market data —
– risk
– pricing
To traders, real time pricing is presumably more important than risk is. I guess they need to send out updated bid/offer. RT pricing uses spot prices (market data) and volatility data for calculation. For any pair of currencies, (every?) market data could trigger automatic price updates across all strikes and expirations.

Actual option valuation math is in c++/JNI.

Biggest headache in fx option risk engine is performance. FX Option Valuation is slow. FX option position Volume is too large for real time risk update. Instead, the risk “report” system is on-demand and covers a requested subset of the full portfolio, presumably those positions belonging to a trader. Such a report takes a few minutes. If market data has changed by then, report is obsolete.

Risk rollup from trader-level to entity-level to firm-level. There’s an external team responsible for analytics library and they call FX options system’s services to get positions. I guess that external system is a firm-wide analytics or risk engine.
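A toy version of the rollup, assuming a simple additive risk number (say delta) per trader. The entity names, trader names and values are made up, and real rollups also deal with non-additive measures, which this ignores.

```python
from collections import defaultdict

# (entity, trader, delta) rows, one per trader-level report
trader_risk = [
    ("London", "alice", 120),
    ("London", "bob", -20),
    ("NewYork", "carol", 50),
]


def rollup(rows):
    """Aggregate trader-level risk up to entity level and firm level."""
    by_trader = defaultdict(int)
    by_entity = defaultdict(int)
    firm_total = 0
    for entity, trader, delta in rows:
        by_trader[(entity, trader)] += delta
        by_entity[entity] += delta
        firm_total += delta
    return dict(by_trader), dict(by_entity), firm_total


by_trader, by_entity, firm_total = rollup(trader_risk)
```

Offsetting positions net out as you go up: London shows 100, not 140.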

#1 essential component (among the distributed components over 30 servers) in the trading desk is trade capture/booking, written in c++ primarily + some java. There’s some c++ valuation module for FX options. Plan is to slowly phase out c++. Other than that, desk is mostly java.

–core architecture–
Since an option (or any derivative) is not settled right away like cash trades, there’s a _lifecycle_ to each derivative trade. Each derivative trade takes on a life of its own and is subject to many “lifecycle events” like
– origination, cancels, amends/modifications
– knock-in, knock-out
– fixings
– market data changes triggering risk reassessment

Just like bond repricing engine, this is Service Oriented Architecture – MQ facilitates the event-driven architecture, but there are other ways to pass messages like SOAP over TCP (not http).
1) MQ for high volume messages
2) SOAP for slow, complex processing. Possibly a few trades a day! I guess these are exotic products.
A typical event-driven server here is a socket server, holding a thread pool, started with main(). No container or web server.

wall street infrastructure — security trading systems

(Another blog. No need to reply.)

A large part of the wall street core infrastructure is built around the (regulated) exchanges and (unregulated) ECN’s, and includes the major trading houses’ trading systems — front and back ends, equities, fixed income, currency and commodities, including risk. Developers in these systems are the backbone of wall street.

I feel less than 50% of my company’s technology staff are application developers. Among them, less than 50% develop apps for real time trading. The rest of the developers support reporting, end-of-day risk, post-trading (like my team), GL, compliance, surveillance, price and other data feeds into trading systems and data feeds out of trading systems, maintain accounts and other reference data, ….

Trading system developers are employed by brokerage houses (aka securities firms), hedge funds, mutual funds, prop traders, exchanges, and many boutique firms(?) On the other hand, retail banking, consumer banking, corporate banking (they all take deposits and give loans) and the advisory business of investment banks don’t have infrastructure to trade securities. I don’t think they have access to the security exchanges. The IPO, M&A, privatization… investment bankers do need some access to trading systems, as they issue securities.

Overall, how many percent of the financial IT people are in the “backbone”? I guess not more than 10%.

basic (essential?) trading server arch q&&a

Every trading server invariably uses some non-http network daemon. There’s always more than 1 process (JVM, C# or c++) on the server side. There’s usually some MOM daemon such as JMS, tibrv and gemfire notification daemon. Here are some Fundamental questions:

Q22: on top of tcp/udp, what specific network protocol between the server-side and GUI?
A: I have seen rmi and protobuf over tib ems.

Q22a: how about JMS between server and swing? Did we see 160 subscribers on a given topic, due to that many swing installations?

Q33: on top of tcp/udp, what specific network protocol among the server-side processes?
A: I have seen tibrv, JMS, RMI, gemfire data distribution protocols …

Q44: since most trading servers must avoid DB latency, where does the trading data live? In memory?
A: i have seen gemfire, rttp,

Q45: in case of distributed cache (not replicated), how does one cache listener update another node?

Q55: how does the daemon stay alive after main() exits?
A: Look at ion, gemfire, activemq. There’s often at least 1 (1 enough) non-daemon thread that’s stuck in wait()
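A minimal sketch of the answer to Q55, assuming a plain JVM with no container. The class name KeepAliveDemo and the latch used as a shutdown signal are my own illustration, not taken from ion, gemfire or activemq. The point is only that one non-daemon thread blocked in a wait is enough to keep the JVM up after main() returns.

```java
import java.util.concurrent.CountDownLatch;

class KeepAliveDemo {
    // Hypothetical shutdown signal; a real server might trip it from an admin command.
    static final CountDownLatch shutdown = new CountDownLatch(1);

    static Thread startKeeper() {
        Thread keeper = new Thread(() -> {
            try {
                shutdown.await();   // blocks here, much like a thread stuck in wait()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "keeper");
        keeper.setDaemon(false);    // user (non-daemon) thread: the JVM will not exit while it runs
        keeper.start();
        return keeper;
    }

    public static void main(String[] args) {
        startKeeper();
        // Demo only: a daemon helper releases the keeper so this program eventually ends.
        Thread stopper = new Thread(() -> {
            try { Thread.sleep(300); } catch (InterruptedException ignored) { }
            shutdown.countDown();
        });
        stopper.setDaemon(true);
        stopper.start();
        System.out.println("main() returns now; the JVM exits only after the keeper ends");
    }
}
```

In a real daemon there would be no stopper thread; the keeper (or a listener thread doing real work) simply never returns until shutdown is requested.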

web services j4, features — briefly

web service is an old technology (RPC) given a new lease of life 10 years ago.

* [#1 selling point] cross platform for eg between dotnet frontend and java backend
* loosely coupled
* good for external partner integration. Must be up all the time.
* beats MOM when immediate response is required.
* web service (soap) over MOM? should be feasible. One listener thread for the entire client system — efficiency

design patterns often increase code size

Using nested if and instanceof.., you can implement complex logic in much fewer lines, and the logic is centralized and clearly visible.

If you refactor using OO design techniques, you often create many Types and scatter the logic.

Case in point — Error Memos workflow kernel. We could use factory, polymorphism, interfaces…

Case in point — chain of responsibility

make 2 custom exceptions only — one checked, one unchecked

Lots of interviewers asked me about my exception handling strategy. Here’s a tentative exception object design for a small project — Maintain just 2 custom exception classes.

1) MyCheckedEx — a big custom checked exception (by extending Exception) containing an enum[2] field. Each value represents a specific error condition[1].

2) MyUncheckedEx — a similar thing but unchecked (extending RuntimeException?).

Now the usage:

1) When you want a given error situation to always be handled in every *context* and never ignored, then register it in MyCheckedEx. Every place it can possibly occur, compiler will be your trusted friend to ensure it’s declared and handled explicitly.

2) things you put into MyUncheckedEx are nicely flexible, less strict. Callers (other developers) can choose to catch them or ignore them, silently escalating them to upstairs.

What about standard JVM exceptions? You can wrap them into your 2 exceptions. In particular, if you don’t like to handle many of the standard checked exceptions, you can wrap them in MyUncheckedEx. I think Spring wraps lots of Checked exceptions into unchecked. Conversely, you can wrap an unchecked exception into MyCheckedEx.

Another requirement — We often need a hierarchy of exceptions. Gmail uses labels instead of folders, so likewise, shall we declare some marker interfaces so an exception can be labeled as both validation and synchronization and billing exceptions? Will this beat a custom inheritance tree?

[1] a priority — i prefer fine-grained types of exception, for better control
[2] or int. You can use these in switch-case. I don’t like String.
non-priority — exception parent-child family. Unnecessary complexity.
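Here is a rough sketch of the two-exception design, just to make it concrete. The enum values, the load() helper and the describe() method are hypothetical names I made up for illustration. It shows a standard checked exception wrapped into the unchecked class, and the enum field used in a switch.

```java
// Hypothetical error codes; each value stands for one specific error condition.
enum ErrorCode { INVALID_INPUT, DB_UNAVAILABLE, TIMEOUT }

class MyCheckedEx extends Exception {
    final ErrorCode code;
    MyCheckedEx(ErrorCode code, String msg, Throwable cause) {
        super(msg, cause);
        this.code = code;
    }
}

class MyUncheckedEx extends RuntimeException {
    final ErrorCode code;
    MyUncheckedEx(ErrorCode code, String msg, Throwable cause) {
        super(msg, cause);
        this.code = code;
    }
}

class ExceptionDemo {
    // Wraps a standard checked exception into the unchecked custom type,
    // so callers may catch it or let it propagate.
    static void load(boolean fail) {
        try {
            if (fail) throw new java.io.IOException("connection refused");
        } catch (java.io.IOException e) {
            throw new MyUncheckedEx(ErrorCode.DB_UNAVAILABLE, "load failed", e);
        }
    }

    // The enum field makes switch-case handling straightforward.
    static String describe(MyUncheckedEx e) {
        switch (e.code) {
            case DB_UNAVAILABLE: return "retry later";
            case TIMEOUT:        return "retry now";
            default:             return "give up";
        }
    }

    public static void main(String[] args) {
        try {
            load(true);
        } catch (MyUncheckedEx e) {
            System.out.println(e.code + " -> " + describe(e));
        }
    }
}
```

A method that declares "throws MyCheckedEx" would force every caller to handle or re-declare it, which is the compiler help described in usage 1).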

realtime inter-VM communication in front desk trading sys

Inter-VM is our focus.

* [s] MOM — async
** FIX over RV in Lehman Eq
* [s] distributed cache — async?

Above mechanisms notify listeners. Note Listeners are usually async and multi-threaded.

* DB writes by one app, and periodic DB polling by receiving app
* [s] RMI
* [s] EJB? infrequent. I think this is less efficient than MOM
* [s] web service? not sure
* FTP? not real-time but at SOD (startOfDay) and EOD
* email? none

MOM is the clear favorite. Most efficient. Guaranteed

Within my front office app, RMI, MOM and cache are dominant. Within a related ticketing system (iticket), MOM and RMI are dominant.

DB is an extreme form of synchronous pub/sub.

[s=needs object serialization. cross-VM often requires serializable]

separation of concern — one of my top 3 design priorities

for a long time my #1 ideal is testability. Separation of concern is related. Many other related ideals —

* modularize
* layering — extremely popular modularization pattern
* stable interfaces between modules
* interchangeable parts
* encapsulation of implementation details

But “separation of concern” is the best phrase. Now specific examples in an environment of multiple systems and teams:
* a basic idea — a table which multiple systems can read/write
** use views rather than the base table
** system to call a proc rather than select/update on the underlying table
* MOM (RV or mq)
* SOA service bus (MOM)
* web service
* producer/consumer
* getter/setters rather than public fields
* declare return types and variables as interface types

swing/server using 2 http requests (trading

A third party trading platform (like FXAll, tradeweb, Bloomberg, creditex, espeed, brokerTec) could install a swing GUI on traders’ screens. Every time a trader requests info from the remote server, there are 2 swing threads involved —

– one responsible for synchronously uploading the request message, via an http tunnel through firewall
– another responsible for synchronously downloading the response message – http tunnel. I guess this requires a URL including a requestID.

If server is busy or slow, then the 2nd synchronous call should take place after some delay. Otherwise it would block for a long time.

Q: I wonder how server could push market data to swing? I feel the http tunnel means the http client can’t be a server.
%%A: i guess the client makes periodic requests for updates. This is not really real time.

Studying biz rule tables used on Wall St

Background: many rule based systems (on Wall St or elsewhere) have thousands of rules in database tables. The simplest and most used pattern is the lookup table. A lookup table has one (or more) input columns and one (or more) output columns.

Each combination of input fields maps to exactly one row. There should be a unique composite index on the input columns as a group.
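An in-memory analogy of such a lookup table, as a sketch only. The class RuleTable and the sample rule values are hypothetical. The map key plays the role of the input columns, and the duplicate check mimics the unique composite index.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RuleTable {
    // Key = the input columns taken together; value = the output column.
    private final Map<List<String>, String> rows = new HashMap<>();

    // Mirrors a unique composite index: a second row for the same inputs is rejected.
    void addRule(String output, String... inputs) {
        List<String> key = List.of(inputs);
        if (rows.putIfAbsent(key, output) != null) {
            throw new IllegalStateException("duplicate rule for " + key);
        }
    }

    // Returns the single matching output, or null when no rule matches.
    String lookup(String... inputs) {
        List<String> key = List.of(inputs);
        return rows.get(key);
    }

    public static void main(String[] args) {
        RuleTable t = new RuleTable();
        t.addRule("DESK_A", "BOND", "USD");   // hypothetical routing rule
        t.addRule("DESK_B", "BOND", "EUR");
        System.out.println(t.lookup("BOND", "EUR"));
    }
}
```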

The concept is simple, but sometimes needs to be hand-crafted and perfected, because
* these rules are often part of the most important 1% source code in a big code base
* users change rules frequently, at least in some Wall St firms
* we often need to query the tables with complex criteria. Such complex queries are usually implemented in source code, but for quick production support we often need to run them by hand, tweak and rerun. In some cases, we end up learning all the intricacies by heart. If your full time job is a spell-checker you won’t need the dictionary much.
* we often need to explain how rules are disqualified, filtered out, ranked, grouped, applied, overlaid, re-applied... Many IT veterans would say blow-by-blow knowledge of a code module is needed only in extremely rare situations. Welcome to Wall St, where these business rules are examined and tweaked on a daily basis by multiple groups of users.

Compare with Informatica lookup transform.

1900 tiers of quotes, RFS over FIX, indicative/executable quotes

One of the REAL bottlenecks in a large SELL-side FX dealer system is the tiered pricer. The trigger event could be a market data change. Since such an event could trigger an avalanche of messages, the frequency of such events is not very high, probably below 10 events/second on 1 currency pair. If you have 10 (or 50 or whatever) active currency pairs, then you could get 100 triggers/sec through your entire system.

Once a trigger is activated, the pricer computes new bid/ask quotes for Tier 1 Gold clients. The pricer then adds a distinct spread for each tier. For an active pair like EUR/USD, there can be 1900 tiers. There can be up to 1900 (non-unique) pairs of bid/ask quotes. Typically, the “best” quotes would have a bid/ask spread of 2 to 3 pips, applicable to the best and largest clients. For a *retail* client, it could be 20 – 40 pips.
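A toy sketch of the per-tier widening step, under my own assumptions: the pip size of 0.0001, the tier names, the extra spreads and the even split of the extra spread between bid and ask are all made up for illustration. The real pricer (a rule engine) is certainly more involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TieredPricer {
    static final double PIP = 0.0001;   // assumption: pip size for a pair like EUR/USD

    // Widens the Tier 1 quote by extraPips in total, half on each side.
    // Returns {bid, ask} for the tier.
    static double[] widen(double tier1Bid, double tier1Ask, double extraPips) {
        double half = extraPips * PIP / 2.0;
        return new double[] { tier1Bid - half, tier1Ask + half };
    }

    public static void main(String[] args) {
        // Hypothetical tiers; the system described above can have up to 1900.
        Map<String, Double> extraPipsByTier = new LinkedHashMap<>();
        extraPipsByTier.put("Gold-1", 0.0);
        extraPipsByTier.put("Silver-7", 6.0);
        extraPipsByTier.put("Retail-3", 28.0);

        double bid = 1.0999, ask = 1.1001;   // Tier 1 quote after one market data trigger
        for (Map.Entry<String, Double> e : extraPipsByTier.entrySet()) {
            double[] q = widen(bid, ask, e.getValue());
            System.out.printf("%s bid=%.5f ask=%.5f%n", e.getKey(), q[0], q[1]);
        }
    }
}
```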

Core of the tiered pricer is a Drools rule engine.

Another module is the messaging engine using Nirvana by My-Channels. If a particular bid/ask quote applies for all (say, 20) tiers in Silver category, then the pricer broadcasts the quote to a topic like Quote.EURUSD.Silver. This is a kind of alias for 20 different “tier” topics. For efficiency, this is probably multicast.

In the worst case, one event can trigger an avalanche of 1900 messages for one currency pair alone.

Last module is the FIX engine. Quotes often go out the door in FIX format, just as RFQ. Now there are 1900 tiers but the number of clients could be higher or lower than 1900. If there are more than 1900 clients and if all of them subscribe to our quote, then each must be sent the quote. I know a Chicago prop trading firm (Gelber?) that subscribes to a lot of “bank feeds”.

The most demanding type of subscription is a RequestForStream (RFS). A typical RFS could ask for a stream of EURUSD quotes for 10 (up to 120) minutes, during which time all quotes must be delivered.

Unlike RFQ, RFS requires special approval. The bid/ask quotes in an RFS can be indicative or *executable* (similar to Firm, but see separate blog post). If a client hits an executable bid or lifts an executable offer, then the trade is considered executed, though I believe cancellation is still a possibility, just like any Firm quote in bonds.

How does a dealer make sure he has enough position to honor the quote? Perhaps by setting aside reserve quantities, or by monitoring the open market.

Unlike bidwanted systems (non-negotiable quotes), it’s possible for a client to negotiate on our quote electronically, though I feel manual negotiation is more practical.

constructor to throw checked exception

Hi XR,
This is just a java learner’s personal opinion — i feel constructors should avoid throwing checked exceptions. Traditional if/while/switch tests are more natural, more convenient, and more maintainable than throwing exceptions. Only in a few special scenarios should a constructor throw a checked exception.
My reasons against checked exceptions thrown from a constructor? My answer is long. When a constructor throws, no object is returned, so a reference declared before the try block is left at its initial value (typically null). The caller method has to 1) test for null before using the new object. The caller also needs to 2) try-catch the exception. Of course the constructor must 3) detect the invalid input before throwing. At least 3 tests.
Whenever possible, i would move Test #3 out of the constructor into the caller, so the constructor neither leaves the caller with a null reference nor throws checked exceptions. How to achieve this? One standard solution is a factory. The factory does Test #3 and handles the invalid condition. The factory may return null, so the caller still needs Test #1. So 2 tests to perform.
This leads to an even simpler solution. If possible, the caller method should detect invalid input (Test #3) before passing it to the constructor/factory. 1 test only.
I guess this is rudimentary. Somehow, after programming java for years, i don’t always see the simplest solution. I think there are a lot of java design patterns, java idioms, common java constructs, such as throwing checked exceptions from constructors. I learnt these and didn’t know when NOT to apply them.
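A small sketch of the factory variant and the validate-first variant, using a hypothetical Account class and a made-up validity rule (non-blank id). It is only meant to show where Test #3 can live once it leaves the constructor.

```java
class Account {
    final String id;

    // The constructor trusts its input, so it declares no checked exception.
    private Account(String id) { this.id = id; }

    // Test #3 lives here; a caller can run it before asking for an object.
    static boolean isValidId(String id) {
        return id != null && !id.trim().isEmpty();
    }

    // Factory variant: runs Test #3 itself and returns null on invalid input,
    // so the caller keeps only Test #1 (the null check).
    static Account create(String id) {
        return isValidId(id) ? new Account(id) : null;
    }

    public static void main(String[] args) {
        Account a = Account.create("ACC-1");
        System.out.println(a != null ? "created " + a.id : "rejected");

        // Validate-first variant: one test in the caller, then no null check needed.
        String input = "  ";
        if (Account.isValidId(input)) {
            System.out.println("created " + Account.create(input).id);
        } else {
            System.out.println("rejected before construction");
        }
    }
}
```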

database access control solutions for enterprise

solution: deny access to command line tools like sql+ or sqsh, deny access via standard windows clients like toad, aqua studio. Require every interactive user to access via a browser.

solution: custom api for java clients. block access via jdbc.

Solution: views to limit access to subset of rows, subset of columns or derived columns. deny access to underlying tables.

Solution: sproc and function to limit access. deny direct access to underlying tables. I think this is the most flexible.

Merrill S’pore: fastest stock broadcast

Updates — RV or multicast topic; msg selector

I think this is a typical wall-street interview question for a senior role. System requirement as remembered by my friend the interviewee: ML needs a new relay system to receive real-time stock updates from a stock exchange such as SGX. Each ML client, one of many thousands[1], will install a new client-software [3] to receive updates on the stocks [2] she is interested in. Some clients use algorithmic trading systems and need the fastest feed.

[1] Not clear about the order of magnitude. Let’s target 10,000
[2] Not clear how many stocks per client on average. Let’s target 100.
[3] Maintenance and customer support for a custom client-software is a nightmare and perhaps impractical. Practically, the client-software has to be extremely mature, like browsers or email clients.

Q: database locking?
A: I don’t think so. only concurrent reading. No write-contention.

Key#1 to this capacity planning is how to identify bottlenecks. Bandwidth might be a more severe bottleneck than other bottlenecks described below.

Key#2 — 2 separate architectures for algorithmic clients and traditional clients. Each architecture would meet a different minimum latency standard, perhaps a few seconds for traditional and sub-second for algorithmic.

Solution 0: Whatever broadcasting system SGX uses. In an ideal world, no budget constraint. Highest capacity desired.

Solution 2: no MQ? No asynchronous transmission? As soon as an update is received from SGX, the relay calls each client directly. Server-push.

Solution 1: MQ — the standard solution in my humble opinion.

Solution 1A: topics. One topic per stock. If 2000 clients want IBM updates, they all subscribe to this topic.

Q: client-pull? I think this is the bottleneck.

Q: Would Client-pull introduce additional delays?

Solution 1B: queues. one queue for each client each stock.

If 2000 clients want IBM updates, the relay needs to make that many copies of an update and send them to that many queues, a duplication of effort. I think this is the bottleneck. Not absolutely sure if this affects relay system performance. Massively parallel processing is required, with thousands of native threads (not java green threads)

from a database row to an object

“a datatype with methods”. In a longer sentence, “a datatype with specific operations defined for it”

For example, a “student” datatype has fields representing the object’s state and operations appropriate for a student such as enroll(courseID), payFees()….

This is a good example of the simplest type of class — a data class.

Another short answer — “a C struct with methods”

what realworld entity a class represents

The Challenge: given a complex class, how do i quickly find out what kind of realworld thingy, if any, it models? I hope this post offers practical, quick solutions.

Conversely, a good OO-design can help show what kind of entity the class represents. Beside naming and comments, a designer can consider the list below.

Tip: Occasionally, you notice a field of a collection type. The entities inside the collection are the realworld thingy represented by this class. Common design pattern. Perhaps a simple pattern, but NOT obvious!

Tip: See if it has relatively few instance variables. In a good design, instance variables collectively represent object *state*.

Tip: See if the constructors are simple and meaningful. Some constructors receive key argument(s) that reveal the role of this class among the /domain-objects/.

Tip: See if the class advertises an obvious “service” method, perhaps supported by helper-methods

Tip: always see the base types.

Tip: getters and setters are an obvious clue.

rule compiler — learning notes

a rule-compiler compiles rule-source-code into executable-rules. You deploy executable-rules just as any other java class, to JVM.

rule-compiler contains a rule-parser.

executable-rule files have an encoding format to hold the condition/action.

rule-compilation is an initialization overhead to minimize, perhaps by caching.

Q: How does this compilation affect the edit-cycle?

rule-engine vs rules

If you remember one knowledge pearl about business rule engines, i think you may want to remember the relationship between rule engine and rules. I think these are the 2 main entities to *deploy* to an ent app.

Verizon’s circuit fault-isolator is a typical enterprise application using JRules. Think of your ent app as a host-app (or user or caller) of the rule stuff. There are quite a lot of rule stuff to *deploy* and you will soon realize the 2 main thingies are the (A) generic rule-engine and (B) your rules.

– The rule-engine is written by ILOG (or JBoss or whoever) but the rules are written by you.
– Rule Engine is a standard, generic component but the rules are specific to your business.
– The rule-engine is first and foremost the interpreter of your rules

* a good analogy is found in XSL transformer vs xsl stylesheet. Your host application needs to load both of them into memory
* A similar relationship exists between spring the framework and the spring-beans you create.

cache size control in trading engines#%%ideas

My answer in 2007 – separate thread to monitor the size and send out alerts. JMX-based clean-up operation.

Now I feel we can take the swing EDT approach. “EDT” here means some kind of separate thread/data-structure for cache size control — a watchdog.

Each insert/delete operation on any thread would send a msg to the watchdog queue. Watchdog can decide how many messages to drop before it decides to take a look. If size is close to the limit, then it could decide to be more vigilant. Once limit is hit, watchdog could turn on some flag in the cache system to remind the inserters.

But how does the watchdog keep the size within limit? It has to remove the least recently accessed item. A priority queue with a last-accessed-time might be good. See post on LRU for a java builtin solution.

More importantly, the app has to be tolerant of involuntary loss of cache items. “If cache doesn’t have what I inserted, i will insert again.”
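A sketch of the size-bounded LRU idea, assuming the "java builtin solution" is LinkedHashMap in access order with removeEldestEntry() overridden (that is my guess at what the LRU post covers). The class name BoundedCache and the sample keys are hypothetical. This version evicts inline on insert rather than through a separate watchdog thread, and it is not thread-safe on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int limit;

    BoundedCache(int limit) {
        super(16, 0.75f, true);   // true = access order: eldest entry is the least recently used
        this.limit = limit;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > limit;    // evict once an insert pushes the size over the limit
    }

    public static void main(String[] args) {
        BoundedCache<String, Double> cache = new BoundedCache<>(2);
        cache.put("EURUSD", 1.10);
        cache.put("USDJPY", 150.0);
        cache.get("EURUSD");            // touch EURUSD so USDJPY becomes the eldest
        cache.put("GBPUSD", 1.27);      // pushes size to 3, so USDJPY is evicted
        System.out.println(cache.containsKey("USDJPY"));

        // Tolerating involuntary loss: if the entry is gone, insert it again.
        cache.computeIfAbsent("USDJPY", k -> 150.0);
        System.out.println(cache.size());
    }
}
```

For multi-threaded inserters the map would need external synchronization (for example Collections.synchronizedMap) or a single owning thread, which is where the watchdog idea above comes back in.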

batch jobs in financial trading system

According to a friend in investment banking technology, ALL trading systems need batch jobs to complement online applications. I think MQ applications are a third type. Typical batch:

* save in DB historical volume/day-high/day-low/day-open/day-close … — “open” information open to the public

* save in DB all market players’ trades. That’s my own terminology referring to “our own” hedge funds, other firms’ hedge funds, other firms’ traders… Our own traders’ activities are probably captured during transaction — no batch required.

* We (the brokerage) may also have large institutional clients whose data need to be recorded in DB. Such data may need batch processing if it is not recorded automatically.

which methods to make static — FTTP perspective

Key differences between static ^ instance methods

* static methods can directly access only the static fields and static methods of this class (instance members need an object reference), but many important “members” are non-static
* “public static” methods serve as “services” to clients (and deserve public service awards). Example: Arrays.*, Collections.*, Math.*
* static methods can run before the constructor ==> Many common operations in a class must be static — main(), getInstance()
* abstract method can’t be static. See other posts for the complicated reasons.
* overriding^hiding
* “synchronized” has different meanings
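To illustrate the overriding^hiding bullet, a tiny sketch with hypothetical classes Base and Derived. An instance method is overridden and resolved by the runtime type, whereas a static method is only hidden and resolved by the compile-time type.

```java
class Base {
    static String kind() { return "Base.kind"; }     // static: can be hidden, never overridden
    String name() { return "Base.name"; }            // instance: can be overridden
}

class Derived extends Base {
    static String kind() { return "Derived.kind"; }  // hides Base.kind()
    @Override String name() { return "Derived.name"; }
}

class HidingDemo {
    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(b.name());        // Derived.name : runtime type decides
        System.out.println(Base.kind());     // Base.kind    : chosen at compile time
        System.out.println(Derived.kind());  // Derived.kind
    }
}
```

On the last bullet: a synchronized static method locks the Class object, while a synchronized instance method locks "this", so the two never block each other.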

tiers ] ports^1-jvm

Refer to the overview post on ports^1-jvm.

Suffering from the same abuse-of-terminology as “server”, the word “tier” can now refer to not only separate-jvm but also to 1-jvm AR. Consider

“data tier” —
“dao tier” — basically the same as “dao modules”, as a layer of abstraction
“orm tier”
“object tier”
“presentation tier” — jsp, struts views, usually within 1-jvm

server/client mean — ports^1-jvm

a “server” is traditionally a separate jvm or unix process but nowadays occasionally can refer to a method (or set of methods) your “client objects” can call within a container.

In the same vein, a “client” used to mean a number — a process id (or thread), often on a different host, but now in Java literature it often means a “caller of a method”. A caller of a method may or may not be an object in memory, but always refers to some “calling context” to be altered by the method.

Every OO student would eventually come to realize when to say “client” and what that implies.

The j2ee community seems to pay little attention to the question “same or separate jvm?”

A “service” is often provided by the container.

The “server” usually provides some utility service like water and electricity.


Perhaps not a good example. jndi is obviously a container service. It can be filesystem-based.

ports^1-jvm to decouple web tier

2 common architectures to decouple any object-oriented web (or non-web) system. Master them. Don’t try to add a 3rd architecture to overload your memory.

— A) ports ie tcp ports. You Separate a chunk of java code into another unix-process with a port. You end up with a “tier” in a multi-tier AR. Beware “tier” can now refer to single-jvm modules too, in an abuse of terminology.

connection pools @@

Examples: EJB, ActiveMQ, CPF Single-sign-on-server, crystal-report-server, web services

— B) single-jvm solution. Instantiate intelligent components from an off-the-shelf or /3rd-party/ (3p) jar
thread issues
In a web tier, the 3p objects could be too big to re-create ==> put in session

Examples: struts, spring, hibernate, nanoXML, log4j, shopping cart

Example: the M in MVC could be a 3rd-party module (shopp`cart) or even a legacy ERP, but almost always there are some M-classes within the MVC jvm.

capacity management, a Unix /perspective/

aim for simultaneous saturation and eliminate bottleneck? More for perf tuning than cap management

“Capacity” is largely (don’t /sweat/ it) about “resources”.

— can’t add resource?
identify critical resources (bandwidth, simultaneous oracle conn, disk throughput..)
collect usage pattern esp. peak usage for each resource
increase effi 4 each resource ie reduce wastage
Identify — most of the time, perf is cpu-bound, mem-bound, disk io-bound, network io-bound … Same for a Weblogic server

— can add resources?
follow the same suggestions above
do cap plann`
do load forecast

perf techniques in T J W’s project–ws,mq,tx

Q: request wait-queuing (toilet queue)? I know weblogic can configure the toilet queue
A: keep the queue entries small. we only keep object id while the objects are serialized to disk (?!)

Q: is 1kB too large?
A: no

q: most common cause of perf issue?
A: mem leak. still present after regression test

q: jvm tuning?
A: yes important, esp mem related

q: regression test?
a: important

q: perf tools?
a: no tools. primarily based on logs. eg. track a long-running transaction and compute the duration between soap transaction start and end.

Q: web services?
A: Many of the transactions are based on soap, axis. TCP monitor can help with your perf investigation.

Q: tx?
A: yes we use two phase commits. Too many transactions involved. really complex biz logic. Solution is async.

Q: multi-threaded?
A: handled by weblogic.

Q: how is the async and queue implemented?
A: weblogic-mq with persistent store, crash-proof

create functionality != jvm restart]strategy pattern

The Strategy pattern allows you to define a family of interchangeable algorithms, to be selected at runtime. In extreme circumstances, a new algorithm is to be created and added immediately without jvm restart. This would be a higher level of flexibility.

Perhaps the highest level of flexibility is offered by a DB containing classnames in the “family”. After you create a new algorithm, you insert its classname into the DB.

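// getDeclaredConstructor().newInstance() replaces the deprecated Class.newInstance()
Class<?> c = Class.forName("com.myPackage.Myclass");
Thing t = (Thing) c.getDeclaredConstructor().newInstance();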
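A slightly fuller sketch of the same idea, with made-up names (TextStrategy, UpperCase, Reversed, StrategyLoader). The class name string stands in for a value read from the DB; here it is obtained from the class literal only so the demo runs anywhere. Picking up a brand new class without restart additionally assumes the class file is already reachable by the class loader.

```java
interface TextStrategy {
    String apply(String s);
}

class UpperCase implements TextStrategy {
    public String apply(String s) { return s.toUpperCase(); }
}

class Reversed implements TextStrategy {
    public String apply(String s) { return new StringBuilder(s).reverse().toString(); }
}

class StrategyLoader {
    // className would come from a DB row in the scheme described above.
    static TextStrategy load(String className) {
        try {
            Class<?> c = Class.forName(className);
            return (TextStrategy) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load strategy " + className, e);
        }
    }

    public static void main(String[] args) {
        // Stand-ins for class names fetched from the DB.
        String[] names = { UpperCase.class.getName(), Reversed.class.getName() };
        for (String n : names) {
            System.out.println(n + " -> " + load(n).apply("fix"));
        }
    }
}
```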

Also see detailed sample code in

runtime change to object behaviour

[[ head first design patterns ]] repeatedly favors *runtime* change to program functionality, rather than compile-time ie source code change. I assume they have a *practical* reason instead of a doctrine.

Related concepts: Strategy pattern, Decorator pattern,

When we need to change from an old functionality to a new functionality, a good approach is
* we try to create a new functionality class, if at all possible,
* at runtime, use existing setters to assign the new functionality, replacing the old, when needed.
* minimize edits to existing, tested classes

See also post on [[ create functionality without jvm restart]strategy ]]

I think this probably incurs least-impact to existing, tested functionalities.
=> regression test@@ no need
=> Low stress for fellow developers, managers, clients, internal users and any non-technies.
=> no need to worry “Did we miss any other existing classes that need edit?”
( documentation on interdependencies is crucial but often neglected by developers. )

batch feature wishlist

[x = lesser-known but fairly regular requirement in my experience]
A “record” means one of a (potentially large) number of input data to be processed

* [x] step-by-step manual confirmation, each with a single keystroke. Just like rm -i
* skip certain steps
* reshuffle some steps — arguably tapping on one of the strengths of interpreted languages.
* [x] re-run a certain step only
* share codebase with other on-going projects, to avoid forking and ease maintenance
* persistent xml config + command-line config
* be nice (Unix terminology) to other processes. Batch jobs can quickly eat up shared resources.

— infrastructure support needed, because standard batch languages can’t
* self-profiling and benchmarking on the batch application, to record time/mem/DB/bandwidth… usage for performance analysis
* scheduled retry or manual retry
* “easy” multi-threading (with data sharing) to exploit multi-threaded processors like our T2000’s 32 kernel threads. Multi-threading is non-trivial, esp. with data sharing. Many batch developers won’t have the time/expertise to create it or test it. Infrastructure support could lower the barrier and bring multi-threading to the “masses”

Re: NextGen server mean time to failure?

(A draft email) Hi,

Thanks for your quick reply. Sorry I’m unable to give any suggestion. Just some nagging worries. I’m trying to be critical yet objective.

My experience suggests that many java-based daemons are fairly susceptible to degradation with a concurrent load level high enough. Similar to denial-of-service attacks.

I’m not easily convinced that any piece of software (including my favorite — apache httpd) can keep up performance without restart for a few months under heavy load. For example, over 20 years solaris went through continuous improvements in terms of self-healing, daemon/service availability — a clear sign that the system can sustain “injuries” and lose performance. If it can happen to OS, what is immune?

I remember Siva told me the FTTP workload could be quite high and it’s not easy to handle that load. I think he said a few thousand cases a day. Will keep us busy:)

tan bin

app design in a fast-paced financial firm#few tips

#1 design goal? flexibility (for change). Decouple. Minimize colleagues’ source code change.

characteristic: small number of elite developers in-house (on wall street)
-> learn to defend your design
-> -> learn design patterns
-> automate, since there isn’t enough manpower

characteristic: too many projects to finish but too few developers and too little time
-> fast turnaround

characteristic: reputation is more important here than other firms
-> unit testing
-> automated testing

characteristic: perhaps quite a large data volume, quite data-intensive
-> perhaps “seed” your design around data and data models

characteristic: wide-spread use of stored proc, but Many java designs aren’t designed to work well with stored proc. Consider hibernate.
-> learn coping strategies

characteristic: “approved technologies”
characteristic: developers move around
-> maintenance left to other guys
-> documentation is ideally “less necessary” if your design is easy to understand
-> learn documentation tools like javadoc

transparency ] j2ee AR

warning: “transparent” has 2 unrelated meanings in java.

[[better, lighter faster java]]

Key concept: coupling. The tighter, the less transparent
Key concept: put “peripherals SERVICES” out of the DOMAIN MODEL
– persistence service
– messaging service
– tx service
– sec service,
– serialization service
– printing service
– email service

For example, Tight coupling between a serialization service and the domain model means “the service is CUSTOMIZED for this biz”. Changes to the domain model require changes to the service.

For example, Look at persistence. a transparent persistent SERVICE persists any, yes any, object.

For example, look at serialization SERVICE, which serializes any, yes any, object

Most imp technique –> reflection <–

Q: what u already understand (LJ) transparency@@
A: see-through. readable logic. an extra layer or functionality should not impede overall AR readability

declarative control — enterprise design pattern

justification: reduce source change, which can introduce bugs

justification: reduce test effort

justification: Maintain users’, bosses’ and colleagues’ confidence. Confidence that the source didn’t change, so existing functionalities aren’t affected.

justification: slightly better man-day estimate compared to hacking source code

justification: Adaptable, Flexible

justification: more readable than source code

justification: something of a high-level documentation, well-structured

Examples: DD, spring config file, struts config file, hibernate config file.

I feel this is a habit (unit test is another habit). Initially it’s not easy to apply this idea. A lot of times you feel “not applicable here”, only to witness others applying it here. Easier to justify for component-based, inter-dependent modules. Other projects may find declarative control an overkill, and may opt for a properties file.

Q: Alternative to declarative?
A: The information must move into some place, usually source code. How about a properties file?
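
A rough sketch of the properties-file flavour of declarative control. Everything named here is an assumption made for illustration: the key "formatter.class", the two formatter classes and the load() helper. The choice of component lives in configuration text, and the java source only knows how to read it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.function.UnaryOperator;

public class DeclarativeWiring {
    // Two hypothetical interchangeable components.
    public static class UpperCaseFormatter implements UnaryOperator<String> {
        public String apply(String s) { return s.toUpperCase(); }
    }
    public static class TrimFormatter implements UnaryOperator<String> {
        public String apply(String s) { return s.trim(); }
    }

    // Instantiates whichever class the (hypothetical) key "formatter.class" names.
    @SuppressWarnings("unchecked")
    public static UnaryOperator<String> load(String config) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(config));
            Class<?> type = Class.forName(props.getProperty("formatter.class"));
            return (UnaryOperator<String>) type.getDeclaredConstructor().newInstance();
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("bad configuration", e);
        }
    }

    public static void main(String[] args) {
        // In a real project this text would sit in an external .properties file;
        // it is built inline here only to keep the sketch self-contained.
        String config = "formatter.class=" + UpperCaseFormatter.class.getName() + "\n";
        System.out.println(load(config).apply("order accepted")); // prints ORDER ACCEPTED
    }
}
```

Switching to TrimFormatter means editing one line of configuration, with no source change and no recompile, which is the "reduce source change" justification above.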

Q: is this a design pattern?
A: Purists tend to avoid the term. The idea applies to both OO and non-OO designs.

biz rules ] DB

What are business rules? They are set by the business. These guys have written rules. Don’t ask me exactly what qualifies and what doesn’t qualify as a business rule. Business rules can be implemented in java, javascript or batch.

Many business rules are best implemented inside the DB. Reason? The concept of biz rule is popularized and heavily influenced by DB industry, vendors and practitioners. Most things that /pass as/ biz rules are defined in terms of DB records (real world objects represented by records). As a result, these biz rules can be and often are best described, saved, encoded in a DB format.

Below are just a few buzzwords, not meant to be an orthogonal, mutually exclusive list of things.

– unique constraint — eg: member id must be unique
– not-null — eg: “We can’t leave this field blank”
– RI — May not be a rule set by business, but closely related to other business rules. eg: “When this salesperson resigns, all her customers must be assigned a replacement salesperson.”
– check constraint — Can be complex. I think (??? confirmed) they should be applied at modification time.
– triggers — can implement RI, check constraints
– – > input-validation trigger is an important, well-defined type of trigger
– derived data — insert or update “derived data” via triggers, to let java classes select them without “deriving”. The derivation formula contains business rules.
– authorization and access control via views and stored-procs. May not qualify as business rules.
– stored proc — most flexible. Can implement the most complex rules set by business, involving multiple objects.
– – > multi-table correlated modification via stored programs
– cascade delete
– views — can contain business rules in the view’s definition query. eg: “This class of users can only read/modify this subset of data — not those protected columns or irrelevant rows. They should always see the details of each purchase — by a table join.”

An architect should learn this list of techniques. Move business rules from java classes into DB whenever possible, to reduce the complexity of java classes. A large system usually has 60-90% of the business logic implemented in application source code (like java). That’s too much to manage. It’s good to move some to javascript or DB.

[[ pl/sql for dummies ]] advocates putting most “business logic” in DB rather than java. The most complex business logic would need big guns like
* procedures
* functions
* triggers
* complex views, perhaps containing functions in their definitions and having instead-of triggers defined on them.