SDI: 3 ways to expire cached items

server-push update ^ TTL ^ conditional-GET # write-through is not cache expiration

Few online articles list these solutions explicitly. Some of these are simple concepts but fundamental to DB tuning and app tuning. One article compares write-through ^ write-behind ^ refresh-ahead. I think refresh-ahead is similar to TTL.

B) cache-invalidation — some “events” would trigger an invalidation. Without invalidation, a cache item would live forever with an infinite TTL, like the list of China provinces.

After cache proxies get the invalidation message in a small payload (bandwidth-friendly), the proxies discard the outdated item, and can decide when to request an update. The request may be skipped completely if the item is no longer needed.

B2) cache-update by server push — IFF bandwidth is available, server can send not only a tiny invalidation message, but also the new cache content.

IFF combined with TTL, or with reliability added, then multicast can be used to deliver cache updates, as explained in my other blogposts.
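A minimal sketch of the two event-driven variants above (B and B2). The `CacheProxy` class, its method names and the dict standing in for the master data store are all hypothetical, chosen only for illustration.

```python
class CacheProxy:
    """Hypothetical cache proxy; `store` is a dict standing in for the master data store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.fetches = 0  # how many times we went back to the store

    def get(self, key):
        if key not in self.cache:          # miss, or discarded after an invalidation
            self.cache[key] = self.store[key]
            self.fetches += 1
        return self.cache[key]

    def on_invalidate(self, key):
        """B) tiny message: drop the item; refetch lazily, or never."""
        self.cache.pop(key, None)

    def on_push(self, key, value):
        """B2) bigger message: the new content arrives with the notification."""
        self.cache[key] = value


store = {"px": 100}
proxy = CacheProxy(store)
proxy.get("px")                      # first read populates the cache
store["px"] = 101
proxy.on_invalidate("px")            # server sends only the key
after_invalidate = proxy.get("px")   # lazy refetch: one more trip to the store
store["px"] = 102
proxy.on_push("px", 102)             # server sends key + new value
after_push = proxy.get("px")         # served from cache, no extra trip
```

With invalidation the proxy pays a round trip only if someone reads the item again; with push it never pays the round trip but always pays the bandwidth.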

T) TTL — more common. Each “cache item” embeds a time-to-live data field, a.k.a. expiry timestamp. HTTP cookie is the prime example.

In Coherence, it’s possible for the cache proxy to pre-emptively request an update on an item that is about to expire (refresh-ahead). This would reduce latency but requires a multi-threaded cache proxy.
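Here is a rough sketch of TTL expiry plus a refresh-ahead step. It is not Coherence's API: `TtlCache`, `refresh_ahead` and the injected fake clock are assumptions made for a deterministic example, and the refresh runs synchronously here whereas a real proxy would do it on a background thread.

```python
import time

class TtlCache:
    """Hypothetical TTL cache; `loader` fetches a fresh value from the data store."""

    def __init__(self, loader, ttl, clock=time.monotonic):
        self.loader, self.ttl, self.clock = loader, ttl, clock
        self.items = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None or self.clock() >= entry[1]:   # missing or expired
            entry = (self.loader(key), self.clock() + self.ttl)
            self.items[key] = entry
        return entry[0]

    def refresh_ahead(self, window):
        """Reload items that expire within `window`, before any reader asks."""
        now = self.clock()
        for key, (_, expiry) in list(self.items.items()):
            if expiry - now <= window:
                self.items[key] = (self.loader(key), now + self.ttl)


# Fake clock so the example is deterministic.
now = [0.0]
loads = []

def loader(key):
    loads.append(key)
    return f"{key}@{now[0]}"

cache = TtlCache(loader, ttl=10, clock=lambda: now[0])
first = cache.get("a")          # loaded at t=0, expires at t=10
now[0] = 5.0
second = cache.get("a")         # still fresh, no load
now[0] = 9.0
cache.refresh_ahead(window=2)   # expires within 2s, so reload now
now[0] = 11.0
third = cache.get("a")          # fresh thanks to the early reload
```

Without the `refresh_ahead` call, the read at t=11 would have hit an expired item and paid the load latency itself.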

G) conditional-GET in HTTP is a proven industrial strength solution described in my 2005 book [[computer networking]]. The cache proxy always sends a GET to the database but with an If-Modified-Since header. This reduces unnecessary database load and network load.
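The exchange can be sketched without a real network. `Origin` below is a hypothetical stand-in for the server or database, and it mimics only the 200 / 304 behaviour of If-Modified-Since; timestamps are plain integers for simplicity.

```python
class Origin:
    """Stands in for the server/database behind the cache proxy (hypothetical)."""

    def __init__(self, body, last_modified):
        self.body, self.last_modified = body, last_modified
        self.bodies_sent = 0

    def get(self, if_modified_since=None):
        # Mimics HTTP: 304 with no body if unchanged, else 200 with the body.
        if if_modified_since is not None and self.last_modified <= if_modified_since:
            return 304, None, self.last_modified
        self.bodies_sent += 1
        return 200, self.body, self.last_modified


class ConditionalGetProxy:
    def __init__(self, origin):
        self.origin = origin
        self.body = None
        self.validator = None    # Last-Modified value of our cached copy

    def get(self):
        status, body, last_modified = self.origin.get(self.validator)
        if status == 200:        # changed (or first fetch): replace the cached copy
            self.body, self.validator = body, last_modified
        return self.body         # on 304 the cached copy is still valid


origin = Origin("v1", last_modified=1000)
proxy = ConditionalGetProxy(origin)
r1 = proxy.get()     # 200: full body transferred
r2 = proxy.get()     # 304: only the small request/response crosses the wire
origin.body, origin.last_modified = "v2", 2000
r3 = proxy.get()     # 200 again, new body
```

Every read still costs a round trip, but the payload is transferred only when the item has actually changed.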

W) write-behind (asynchronous) or write-through — in some contexts, the cache proxy is not only handling Reads but also Writes. So the Read requests will read or add to cache, and Write requests will update both cache proxy and the master data store. Drawback — In a distributed topology, updates from other sources are not visible to “me” the cache proxy, so I still rely on one of the other 3 means.
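To make the difference concrete, a small sketch follows. Both classes are hypothetical, and a plain dict stands in for the master data store; the last lines illustrate the drawback mentioned above.

```python
from collections import deque

class WriteThroughCache:
    def __init__(self, store):
        self.store, self.cache = store, {}

    def write(self, key, value):
        self.cache[key] = value
        self.store[key] = value             # synchronous: store is updated before we return


class WriteBehindCache:
    def __init__(self, store):
        self.store, self.cache = store, {}
        self.pending = deque()              # writes not yet applied to the store

    def write(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))   # return immediately

    def flush(self):
        """Called later (e.g. by a timer or background thread) to drain the backlog."""
        while self.pending:
            key, value = self.pending.popleft()
            self.store[key] = value


store = {}
wt = WriteThroughCache(store)
wt.write("a", 1)
seen_after_write_through = store.get("a")   # store already has the value

wb = WriteBehindCache(store)
wb.write("b", 2)
seen_before_flush = store.get("b")          # store lags the cache
wb.flush()
seen_after_flush = store.get("b")

# The drawback: an update from another source bypasses the cache entirely.
store["a"] = 99
stale = wt.cache["a"]                       # the cache still holds the old value
```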

| scenario | TTL | eager server-push | conditional-GET |
|---|---|---|---|
| if frequent query, in-frequent updates | efficient | efficient | frequent but tiny requests between DB and cache proxy |
| if latency important | OK | lowest latency | slower lazy fetch, though efficient |
| if in-frequent query | good | waste DB/proxy/NW resources as “push” is unnecessary | efficient on DB/proxy/NW |
| if frequent update | unsuitable | high load on DB/proxy/NW | efficient conflation |
| if frequent update+query | unsuitable | can be wasteful | perhaps most efficient |


central data-store updater in write-heavy system

I don’t know how often we encounter this stringent requirement —

Soccer world cup final, or big news about Amazon … millions of users post comments on a web page and all comments need to be persisted and shown on some screen.

Rahul and I discussed some simple design. At the center is a single central data store.

  • In this logical view, all the comments are available at one place to support queries by region, rating, keyword etc.
  • In the physical implementation, could use multiple files or shared-memory or distributed cache.

Since the comments come in a burst, this data store becomes the bottleneck. Rahul said there are two unrelated responsibilities on the data store updaters. (A cluster of updaters might be possible.)

  1. immediately broadcast each comment to multiple front-end read-servers
  2. send an async request to some other machine that can store the data records. Alternatively, wait to collect enough records and write to the data store in a batch

Each read-server has a huge cache holding all the comments. The server receives the broadcast and updates its cache, and uses this cache to service client requests.
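The two updater responsibilities and the read-server cache can be sketched as below. The class names, the `batch_size` parameter and the list standing in for the data store are my own illustration, not part of the design we discussed.

```python
class ReadServer:
    def __init__(self):
        self.cache = []            # all comments seen so far

    def on_broadcast(self, comment):
        self.cache.append(comment)

    def query(self, keyword):
        return [c for c in self.cache if keyword in c]


class Updater:
    def __init__(self, read_servers, persist, batch_size):
        self.read_servers = read_servers
        self.persist = persist     # callable taking a list of records
        self.batch_size = batch_size
        self.batch = []

    def on_comment(self, comment):
        for server in self.read_servers:      # responsibility 1: broadcast immediately
            server.on_broadcast(comment)
        self.batch.append(comment)            # responsibility 2: persist later, in a batch
        if len(self.batch) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.batch:
            self.persist(self.batch)
            self.batch = []


batches = []                                   # stands in for the data store
servers = [ReadServer(), ReadServer()]
updater = Updater(servers, batches.append, batch_size=3)
for text in ["goal!", "what a goal", "offside?", "penalty"]:
    updater.on_comment(text)

hits = servers[1].query("goal")
```

Readers see all four comments at once, while the data store has received only one batch of three; the fourth waits for the next flush.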

always separate write^read traffic

Rahul pointed out my “simplistic” thinking. Now I feel there’s no good reason to create a web server to handle both read and write requests.

A Read server has a sizable data cache to service client requests. This cache gets updated ….

A Write server (“writer”) has no such data cache, but it might have an incoming request queue + a downstream queue.

The incoming queue introduces delay, but users who send updates often understand that writes take longer than reads.

The downstream queue is relatively new to me, so here’s my hypothesis —

Say 100 writers all need to get their records persisted in a central data store. The infrastructure at the central data store is now a bottleneck, so the 100 writers send their records into a queue rather than wait indefinitely. The writers can then handle other incoming requests.
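My hypothesis can be sketched with Python's standard `queue` and `threading` modules. The single `persister` thread and the list standing in for the central data store are assumptions for the example; a real downstream queue would more likely be a separate middleware process.

```python
import queue
import threading

downstream = queue.Queue()      # shared by all writers
store = []                      # stands in for the central data store

def persister():
    """Single consumer draining the queue at the store's own pace."""
    while True:
        record = downstream.get()
        if record is None:      # sentinel: shut down
            break
        store.append(record)

def writer(writer_id, n_records):
    for i in range(n_records):
        downstream.put((writer_id, i))   # enqueue and move on; no waiting on the store

consumer = threading.Thread(target=persister)
consumer.start()
writers = [threading.Thread(target=writer, args=(w, 10)) for w in range(100)]
for t in writers:
    t.start()
for t in writers:
    t.join()
downstream.put(None)            # all writers done; tell the persister to stop
consumer.join()
```

The writers finish as soon as their records are enqueued; only the persister is paced by the slow store.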

mktData parser: multi-threaded]ST mode #Balaji#mem

Parsers at RTS: a high-volume c++ system I worked on has only one thread. I recall some colleague (Shubin) saying the underlying framework is designed in single-threaded mode, so if a process has two threads they must share no mutable data. In such a context, I wonder what’s better

  1. EACH application having only one thread
  2. Each application has 2 independent threads sharing nothing

I feel (1) is simple to set up. I feel if the executable + reference data footprint is large like 100MB, then (2) would save memory since the 100MB can be shared between threads.

–Answer from a c++ trading veteran
Two independent instances is much simpler time/effort wise. Introducing multiple threads in a single threaded application is not trivial. But I do not see any value if you create multiple threads which do not share any (mutable) data.

–My response:
Executable and reference data take up memory. 1) I feel executable can take up 10MB+, sometimes 100MB. 2) If there’s a lot of immutable reference data loaded from some data store, they can also add megabytes to process footprint. Suppose their combined footprint is 100MB, then two instances would each need 100MB, but the multi-threaded design would need only 100MB in a single instance hosting multiple threads.

My RTS team manager required every market data parser process to reduce footprint below 60MB or explain why. My application there used 200MB+ and I had to find out why.

Apparently 100MB difference is non-trivial in that context.

EnterpriseServiceBus phrasebook #ESB

Mostly based on

  • enterprise — used in big enterprises, but now out of fashion
  • HTTP/MQ — HTTP is a more popular protocol than MQ
  • MOM — I think this middleware is a separate process. One source says MOM vendors call their products ESB
  • async — no synchronous http call between client and server
  • latency — perhaps not popular for real time trading, which prefers FIX
  • SOA — ESB jargon was created along with SOA
  • jxee — you may need to know this jargon in the jxee job market.
  • churn 😦

Nsdq onsite IV #emerging Nsdq architecture

Q: Java NonBlocking IO?
%%A: one thread monitoring multiple sockets
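The same idea exists outside Java. As an illustration only (this is Python's standard `selectors` module, not Java NIO), the sketch below has one thread watching three sockets; the `socketpair` connections merely stand in for real client connections.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = {}

# Three connected socket pairs stand in for three client connections.
pairs = [socket.socketpair() for _ in range(3)]
for i, (client, server) in enumerate(pairs):
    server.setblocking(False)
    sel.register(server, selectors.EVENT_READ, data=i)   # data tags the connection
    client.sendall(f"hello from {i}".encode())

# One thread, one loop, many sockets.
while len(received) < len(pairs):
    for key, _events in sel.select(timeout=1):
        received[key.data] = key.fileobj.recv(1024).decode()

for client, server in pairs:
    sel.unregister(server)
    client.close()
    server.close()
sel.close()
```

The thread blocks only in `select`, never on an individual socket, which is the essence of the interview answer.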

Q: many simple Objects with complex relationships/interactions vs one (or very few) huge complex object?

Q: thread Futures? Any comments?

–some of my findings

  • microservices and containers are important
  • According to the system architect (last interviewer), the new framework uses one dedicated instance of matching engine serving a single symbol such as AAPL.
  • the new java framework only supports a few smaller exchanges initially, but is designed to roll out to more “exchanges” esp. those to be installed on client sites (like CSP). It is (theoretically) designed to be capable enough to implement the main stock exchange.
  • the codebase is mostly java + javascript (Amber, react) with a small amount of c++
  • the new java framework was deliberately designed to operate in single-threaded mode, despite opposition. Therefore, multiple “clients” calling the same library concurrently would be unsafe.
  • java8
  • noSQL is adopted in addition to PostgreSQL, but the principal architect is less concerned about data store, which is only used in individual components

prod write access to DB^app server@@

Q: Is production write access more dangerous in DB or app server?
A: I would say app server, since a bad software update can wipe out production data in unnoticeable ways. It could be a small subset of the data and unnoticeable for a few days.

It’s not possible to log all database writes. Such logging would slow down the live system and take up too much disk space. It’s basically seen as unnecessary.

However, tape backup is “protected” from unauthorized writes. It is usually not writable by the app server. There’s a separate process and separate permission to create/delete backup tapes.

async messaging-driven #FIX

A few distinct architectures:

  • architecture based on UDP multicast. Most of the other architectures are based on TCP.
  • architecture based on FIX messaging, modeled after the exchange-bank messaging, using multiple request/response messages to manage one stateful order
  • architecture based on pub-sub topics, much more reliable than multicast
  • architecture based on one-to-one message queue

strategic value of MOM]tech evolution

What’s the long-term value of MOM technology? “Value” to my career and to the /verticals/ I’m following such as finance and internet. JMS, Tibrv (and derivatives) are the two primary MOM technologies for my study.

  • Nowadays JMS (tibrv to a lesser extent) seldom features in job interviews and job specs, but the same can be said about servlet, xml, Apache, java app servers .. I think MOM is falling out of fashion but is not a short-lived fad technology. MOM will remain relevant for decades. I saw this longevity when deciding to invest my time.
  • Will socket technology follow the trend?
  • [r] Key obstacle to MOM adoption is perceived latency “penalty”. I feel this penalty is really tolerable in most cases.
  • — strengths
  • [r] compares favorably in terms of scalability, efficiency, reliability, platform-neutrality.
  • encourages modular design and sometimes decentralized architecture. Often leads to elegant simplification in my experience.
  • [r] flexible and versatile tool for the architect
  • [rx] There has been extensive lab research and industrial usage to iron out a host of theoretical and practical issues. What we have today in MOM is a well-tuned, time-honored, scalable, highly configurable, versatile, industrial strength solution
  • works in MSA
  • [rx] plays well with other tech
  • [rx] There are commercial and open-source implementations
  • [r] There are enterprise as well as tiny implementations
  • — specific features and capabilities
  • [r] can aid business logic implementation using content filtering (doable in rvd+JMS broker) and routing
  • can implement point-to-point request/response paradigm
  • [r] transaction support
  • can distribute workload as in 95G
  • [r] can operate in-memory or backed by disk
  • can run from firmware
  • can use centralized hub/spoke or peer-to-peer (decentralized)
  • easy to monitor in real time. Tibrv is subject-based, so you can easily run a listener on the same topic
  • [x=comparable to xml]
  • [r=comparable to RDBMS]

##observations@high-volume,latency sensitive eq OMS #CSY

This is probably the biggest sell-side equity order-management-system (OMS) on Wall St, written in c++11. Daily order volume is probably highest among all investment banks, presumably 7 figures based on my speculation, though a lot of them get canceled, rejected or unfilled. I am not allowed to reveal too many internal details due to compliance.

In contrast, GS used to get about a million individual trades (perhaps the partial fills of an order?) a day, probably not counting the high-frequency small trades.

  • synchronization — I haven’t noticed any locking or condition variable so far. I think single-threaded mode is faster than synchronized multi-threading. Multiple instances of the same software run in parallel across machines. I think this is in many ways better than one big monolithic process hosting many threads. We have 4 threads per instance in some cases.
  • ticking market data — is available, though I don’t know if my OM system needs them beside the restriction indicators
  • For crash recovery, every order and every fill is persisted in non-volatile memory, and often swapped out to disk to free up memory. These records are never cleared until EOD. Consequently, any fill can be busted any time before EOD. A recovery would reinstate them. So which “order objects”?
    • All pending orders and (for busting support) closed orders. Basically all orders.
    • Each logical order requires a chain of stateful FlowElement objects created on the fly to support the order.
  • data persistence — the OMS enriches every order and also generates new orders. These orders are persisted automatically in case of a server crash and restart. The persistence files are binary and cleared at EOD
  • RDBMS — is loaded into cache at Start-of-Day and seldom accessed intra-day. I confirmed it with an ex-DBA colleague.
    • However, some product DB system sends intra-day real time updates via messaging (not FIX)
  • MOM — I have not seen a message queue so far but they could be hidden somewhere. Earlier I heard other ibanks’ employees telling me Tibco (and similar messaging middlewares) were popular in fixed income but now I doubt it. Queues add latency.
    • We do use some pub-sub MOM (CPS) but not for order messages therefore not part of order flow.
  • socket — is not needed in any module. I believe the applications communicate via FIX, SOAP etc, on top of well-encapsulated TCP library modules.
  • garbage collection — no GC like in java and dotnet
  • CRTP — heavy use of CRTP. I don’t remember seeing many virtual functions.
  • The most important message is the order object, represented by a FIX message. The order object gets enriched and modified by multiple functions in a chain. Then it is sent out via FIX session to the next machine. As in any OMS, the order object is stateful. All order objects are persisted somewhere so a crash won’t wipe out pending orders.
    • (Elsewhere, I have seen very lean and mean buy-side OMS systems that don’t persist any order! After crash, it would query the exchange for order states.)
  • The 2nd most important message is probably the response object, represented by a FIX msg. If there are 100,000 order objects then there are roughly 300,000 response objects. Each order generates multiple responses such as Rejection, PendingNew, New, PartialFill, PendingCancel, Cancelled… Response objects probably don’t need to be persisted in my view.
  • The 3rd most common message is the report message object, again in FIX format. Each order object probably generates at least one report, even if rejected. Report objects sound simple but they carry essential responsibilities, not only regulatory reporting and client confirmations, but also trade booking, trade capture… If we miss an execution report the essential books and records (inventory, positions..) would be messed up. However, these reports are not so latency sensitive.

## good-looking designs/ideas #white elephant

I think a true architect knows the difference. The best design is often not so good-looking and hopelessly outdated.

Not all of them are classified “white elephants”. Not all of them are “designs”.

  1. lock-free — worthwhile?
  2. multi-threading — not always significantly faster than multi-Processing. I find single-threaded mode fastest and cleanest
  3. sharedMem — not always significantly faster than sockets. I briefly discussed these two choices with my friend Deepak M, in the parser/rebus context. I feel it may not be faster.
    1. other fancy IPC techniques? I am most familiar with sockets …
  4. java generic module (beyond collections) — looks impressive, can be hard to maintain but doesn’t buy us much
  5. converting java system to c++ — does not always bring significant performance gains
  6. forward() and move() instead of cloning
  7. hash-table — not always faster than RBTree
  8. noSQL — not always significantly faster than RDBMS with lots of RAM. I find rdbms much more reliable and well understood. The indices, temp tables, joins, column constraints, triggers, stored procs add lots of practical value that can dramatically simplify the main application. I understand the limitations of rdbms, but most of my data stores are not so big.
  9. RPC and web services? Probably necessary, but I still don’t know how reliable they are
  10. thick client? I still feel web UI is simplest


## sell-side eq e-trading arch features #MS,Baml..

Mostly inspired by the MS equity order-management “frameworks”

  • message-based, not necessarily MOM.
    • FIX messages are the most common
    • SOAP messages are also possible.
    • BAML system is based on MOM (tibrv)
  • message routing based on rules? Seems to be central to some sell-side /bloated/ “platforms” consisting of a constellation of processes.
  • event-driven
    • client newOrder, cancel requests
    • trading venue (partial) fills
    • Citi muni reoffer is driven by market data events, but here I focus on equity systems
    • Stirt realtime risk is driven by market data events + new trade booking events
    • buy-side would have order-origination events, but here I focus on sell-side systems
  • market data subscription? Actually not so important to some eq trading engines. Buy-side would make trading decisions based on market data, but a sell-side won’t.

## clean design may look over-complicated

What is simple/clean really depends on (your knowledge of) best practices.

  • #1 eg: shared_ptr as member variable. Looks complicated but simplifies many things.
  • eg: nested container without pointer — looks scary[1] to java/python developers, but I think in c++ this is a proven design and simplifies many things.
  • eg: most of the 23 GOF design patterns introduce additional classes to address coupling and cohesion. These are classic and clean designs but add complexity. MVC pattern ditto.
  • try-with-resource looks complicated to me
  • RAII looks complicated to me

[1] inserting an entire sub-container into the umbrella container requires copying the sub-container.

save lookup-key in payload object: data duplication

In c++ data structure designs, I often have a key/payload pair in a lookup map, and face the design choice — should I duplicate data by saving the key in the payload object?

Context (simple) — if every time we access any payload object, we always have the key on hand, then no need. It could be unnecessary complication to copy the key into the payload instance, since we may not have the key at payload construction time.

Context — if sometimes we manipulate the payload object without (easy) access to the key, then reconsider. For example, the key object may be a private field of some Acct class so we would need a handle on that Acct instance. I would incur the cost of data duplication by saving the key inside payload, upon construction.

If key object address is “permanent” .. like maintained in some global collection or allocated on heap, then I prefer reference/pointer. (In java, every object is a reference.) This way, key object content change won’t affect me.

(Special case) If the key object is an integral type and immutable, I would simply copy its content into the payload object.

(complex special case) if the key object is BASED-ON a simple data type (string or int) but needs to be a custom MyKey type, then we have the (KISS) choice of saving the string as a payload field, but half the time when we read this field from a payload, we may need to create a temp MyKey. I usually prefer KISS.

I treat this as a design issue more than merely a low-level implementation issue, because I have many concerns about the data duplication:

  1. memory footprint
  2. !! More important than (1) is artificial complexity and higher chance of bugs. I’m very lucky if the key object location is permanent as described above.
  3. code maintenance

##simplicity@design pushed2the limit #

Here I collect simple concepts proven rather versatile, resilient, adaptable. Note in these designs, the complexity can never disappear or reduce. Complexity shifts to somewhere else more manageable.

  • [c] stateless — http
  • [!s] microservices — complexity moves out of a big service into the architecture
  • [c] pure functions — without side effects
  • use the database concept in solving algo problems such as the skyline #Gelber
  • stateless static functions in java — my favorite
  • [c !s] garbage collection — as a concept. Complexity shifts from application into the GC codebase
  • REST
  • in c# and c++, all nested classes are static, unlike in java
  • [!s] python for-loop iteration over a dir, a file, a string … See my blog post
  • [c] immutable — objects in concurrent systems
  • [c] STM i.e. single-threaded mode, without shared mutable #Nsdq
  • [c] pipe — the pipe concept in unix is a classic
  • JSON
  • [c] hash table as basis of OO — py, javascript, perl..
  • [c] sproc (+trigger) — as a simple concept “data storage as guardian of its data, and a facade hiding the internal complexities”
  • [!s] dependency injection
  • [c !s] EDT — swing EDT and WPF
  • [c] RAII
  • [!s] smart pointers as a concept
  • singleton implemented as a static local object, #Scott Meyers
  • DougLea’s singleton — all-static class
  • [c=celebrated, classic, time-honored, ..]
  • [!s = not so simple in implementation]


##RDBMS=architect’s favorite

A specific advantage .. stored proc can _greatly_ simplify business logic as the business logic lives with the data …

Even without stored proc, a big join can replace tons of application code implementing non-trivial business logic. Hash table lookup can be implemented in SQL (join or sub-query) with better clarity and instrumentation because

* Much fewer implicit complexities in initialization, concurrency, input validation, null pointers, state validity, invariants, immutabilities, …
* OO and concurrency design patterns are often employed to manage these complexities, but SQL can sidestep these complexities.
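A small illustration of the point, using Python's built-in `sqlite3`. The `account` and `trade` tables and their columns are made up for the example. The join performs the "hash table lookup" and the aggregation in one statement, and the hand-written version below it computes the same answer in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (acct_id INTEGER PRIMARY KEY, region TEXT NOT NULL);
    CREATE TABLE trade   (trade_id INTEGER PRIMARY KEY,
                          acct_id INTEGER NOT NULL REFERENCES account(acct_id),
                          qty INTEGER NOT NULL);
    INSERT INTO account VALUES (1, 'APAC'), (2, 'EMEA');
    INSERT INTO trade   VALUES (10, 1, 100), (11, 2, 50), (12, 1, 25);
""")

# The join does the lookup (trade -> account) and the aggregation.
sql_result = conn.execute("""
    SELECT a.region, SUM(t.qty)
    FROM trade t JOIN account a ON a.acct_id = t.acct_id
    GROUP BY a.region ORDER BY a.region
""").fetchall()

# The same logic hand-written in application code.
accounts = {1: "APAC", 2: "EMEA"}
trades = [(10, 1, 100), (11, 2, 50), (12, 1, 25)]
totals = {}
for _trade_id, acct_id, qty in trades:
    region = accounts[acct_id]            # KeyError if the account is missing
    totals[region] = totals.get(region, 0) + qty
app_result = sorted(totals.items())
```

In the SQL version the NOT NULL constraints and the declared foreign key carry the input validation that the application version has to handle itself.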

Modularity is another advantage. The query logic can be complex and maintained as an independent module (Similarly, on the presentation layer, javascript offers modularity too). Whatever modules depend on the query logic have a dependency interface that’s well-defined, easy to test and easy to investigate.

In fact testability might be the most high-profile feature of RDBMS.

I think one inflexibility is adding new column. There are probably some workarounds but noSQL is still more flexible.

Another drawback is concurrency though there are various solutions.

stateless (micro)services #%%1st take

In 2018, I heard of more and more sites that push the limits of stateless designs. I think this “stateless” trend is innovative and bold. Like any architecture, these architectures have inherent “problems” and limitations, so you need to keep a lookout and deal with them and adjust your solution.

Stateless means simplicity, sometimes “extreme simplicity” (Trexquant)

stateless means easy to stop, restart, backup or recover

Stateless means lightweight. Easy to “provision”, easy to relocate.

Stateless means easy scale-out? Elastic…

Stateless means easy cluster. Http is an example. If a cluster of identical instances are stateless then no “conversation” needs to be maintained.
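A toy sketch of why a stateless cluster needs no “conversation”: every instance runs the same handler, and the reply depends only on the request. The `Instance` class and the round-robin dispatch are assumptions for illustration.

```python
import itertools

def handle(request):
    """Stateless handler: the reply depends only on the request itself."""
    return {"user": request["user"], "total": sum(request["items"])}

class Instance:
    def __init__(self, name):
        self.name = name

    def serve(self, request):
        return handle(request)

cluster = [Instance("a"), Instance("b"), Instance("c")]
round_robin = itertools.cycle(cluster)

request = {"user": "u1", "items": [3, 4]}
# The same request may land on any instance and still gets the same reply.
replies = [next(round_robin).serve(request) for _ in range(6)]
```

Because no instance remembers anything, any of them can be stopped, restarted or added without the client noticing.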

p2p messaging beats MOM ] low-latency trading

example — RTS exchange feed dissemination infrastructure uses raw TCP and UDP sockets and no MOM

example — the biggest sell-side equity OMS network uses MOM only for minor things (eg?). No MOM for market data. No MOM carrying FIX order messages. Between OMS nodes on the network, FIX over TCP is used

I read and recorded the same technique in 2009… in this blog

Q: why is this technique not used on west coast or main street?
%%A: I feel on west coast throughput outweighs latency. MOM enhances throughput.

microservices “MSA” #phrasebook

I feel MSA is more of an architect interview topic, not a developer interview topic. Dev complexity is low by design.

eg: error acct lookup, receiving productId + possibly a clientId, returning an error acct

Now the phrasebook:

  • jxee — As of 2019, I guess jxee has the best support for MSA
  • enterprise — enterprise-bias. Most of the practices used in SOA/MSA come from developers who have created software applications for large enterprise organizations.
  • SOA — is the ancestor and now out of fashion. I think MSA will also fall out of fashion.
  • stateless — stateless microservice is best. Can be highly concurrent and scaled out
  • scalability — hopefully better
  • decentralized — rather than monolithic
  • modularity
  • communication protocol — supposedly lightweight, but more costly than in-process communication
    • http — is commonly used for communication. Presumably not asynchronous
    • messaging — metaphor is often used for communication. I doubt there’s any MOM or message queue.
  • cloud-friendly — cheaper
  • flexible — in the face of changing requirements, though I’m not sure time-to-market will improve
  • simple-facade — (of a big monolithic service) is now replaced by more complex interface, so I suspect this is not always popular.
  • complexity — (various forms) is the public enemy but I don’t know which weapon (REST,SOA,ESB,MOM,Spring) actually works
  • in-process — services can be hosted in a single process, but less common
  • devops — is a driver
    • testability — each service is easy to test, but not integration test
    • loosely coupled — decentralized, autonomous dev teams
    • deployment — is ideally independent for each service, and continuous, but overall system deployment is complicated

blocking scenario ] CPU-bound system

Q: can you describe a blocking scenario in a CPU-bound system?

Think of a few CPU bound systems like

  • database server
  • O(N!) algo
  • MC simulation engine
  • stress testing

I tend to think that a thread submitting a heavy task is usually the same thread that processes the task. (Such a thread doesn’t block!)

However, in a task-queue producer/consumer architecture, the submitter thread enqueues the task and can do other things or return to the thread pool.

A workhorse thread picks up the task from queue and spends hours to complete it.

Now, I present a trivial blocking scenario in a CPU bound system —

  • Any of these threads can briefly block in I/O if it has big data to send. Still, system is CPU-bound.
  • Any of these threads can block on a mutex or condVar
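The task-queue scenario above can be sketched with the standard `queue` and `threading` modules. The sum-of-squares job is just a placeholder for a heavy CPU-bound task.

```python
import queue
import threading

tasks = queue.Queue()
results = {}

def workhorse():
    while True:
        n = tasks.get()          # blocks inside the queue (on a condition variable) when idle
        if n is None:
            break
        results[n] = sum(i * i for i in range(n))   # the CPU-bound part

worker = threading.Thread(target=workhorse)
worker.start()                   # nothing to do yet, so the worker blocks in get()

for n in (10, 1000, 100000):
    tasks.put(n)                 # submitter enqueues and is free to do other things
tasks.put(None)
worker.join()                    # here the submitter itself blocks until the work is done
```

So even in a CPU-bound system there are two trivial blocking points: the idle worker waiting on the queue, and any thread that waits for a result.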

json^protobuf

One article points out

  • —limitations of protobuf:
  • Lack of resources. You won’t find that many resources (do not expect a very detailed documentation, nor too many blog posts) about using and developing with Protobuf.
  • Smaller community. Probably the root cause of the first disadvantage. On Stack Overflow, for example, you will find roughly 1,500 questions marked with Protobuf tags, while JSON has more than 180 thousand questions on this same platform.
  • not human readable
  • schema is extra legwork for quick and dirty project
  • — advantages of protobuf over Json
  • very dense, and binary, data
  • up to 5 times faster, but optimized json parser could reduce the performance gap.
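To get a feel for the density argument without installing protobuf, the sketch below uses Python's `struct` module as a stand-in for a schema-driven binary encoding. This is NOT protobuf's wire format; the record fields and the fixed layout are assumptions for illustration.

```python
import json
import struct

tick = {"symbol_id": 12345, "price": 101.25, "size": 300}

as_json = json.dumps(tick).encode()

# Fixed binary layout agreed in advance (the "schema"):
# little-endian uint32, float64, uint32.
LAYOUT = struct.Struct("<IdI")
as_binary = LAYOUT.pack(tick["symbol_id"], tick["price"], tick["size"])

# The reader needs the same layout to make sense of the bytes.
symbol_id, price, size = LAYOUT.unpack(as_binary)
```

The binary form is 16 bytes and unreadable without the layout; the JSON form is several times larger but self-describing. That is the same trade-off as in the list above.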


real-time symbol reference-data: arch #RTS

Real Time Symbol Data is responsible for sending out all security/product reference data in real time, without duplication.

  • latency — typically 2ms (not microsec) latency, from receiving to sending out the enriched reference data to downstream.
  • persistence — any data worth sending out needs to be saved. In fact, every hour the same system sends a refresh snapshot to downstream.
    • performance penalty of disk write — is handled by innoDB. Most database access is in-memory. Disk write is rare. Enough memory to hold 30GB of data. shows how many symbols there across all trading venues.
  • insert is actually slower than update. But first, system must check if there’s a need to insert or update. If no change, then don’t save the data or send out.
  • burst / surge — is the main performance headache. We could have a million symbols/messages flooding in
  • relational DB with mostly in-memory storage
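The check-before-write rule can be sketched as follows. The real system used innoDB; here an in-memory `sqlite3` database stands in, and the `symbol` table, its columns and the `sent` list (playing the downstream feed) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE symbol (ticker TEXT PRIMARY KEY, lot_size INTEGER NOT NULL)")
sent = []                                   # stands in for the downstream feed

def on_reference_data(ticker, lot_size):
    row = conn.execute("SELECT lot_size FROM symbol WHERE ticker = ?", (ticker,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO symbol VALUES (?, ?)", (ticker, lot_size))
        action = "insert"
    elif row[0] != lot_size:
        conn.execute("UPDATE symbol SET lot_size = ? WHERE ticker = ?", (lot_size, ticker))
        action = "update"
    else:
        return "skip"                       # no change: neither save nor send
    sent.append((ticker, lot_size))
    return action

actions = [on_reference_data("AAPL", 100),
           on_reference_data("AAPL", 100),   # duplicate
           on_reference_data("AAPL", 10)]    # genuine change
```

The duplicate message costs one in-memory read and nothing else, which matters during a burst.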

## stateful OMS class design: observations

More details are in email…

Here’s a well-established and large-scale order manager class design. It handles millions of orders a day.

  • The entire process is restarted on every trading day. Before the restart, all pending orders are cancelled! The OM is probably a per-thread singleton in the process.
  • The OM stores all the orders for the day, including each closed order in case it needs cancellation.
  • The OM keeps all the partial executions (aka partial fills) for a given order, because each execution could be busted.
  • Each action on an order (such as validation, partial execution ..) is performed by a dedicated object. For 1000 orders, if there are 5 actions, then there would be 5000 distinct “action objects”. The OM has pointers to all of these action objects.
  • Most action objects are stateful. ALL action objects are persisted somewhere so as to support busting/cancellation.
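A much-simplified sketch of the design described above. The `Action` and `OrderManager` names, fields and methods are hypothetical; the real classes are far richer. The point is that busting a fill is only possible because every action object is retained for the whole day.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    order_id: int
    kind: str              # e.g. "validate", "partial_fill"
    qty: int = 0
    busted: bool = False

@dataclass
class OrderManager:
    actions: dict = field(default_factory=dict)   # order_id -> list of Action, kept all day

    def record(self, order_id, kind, qty=0):
        action = Action(order_id, kind, qty)
        self.actions.setdefault(order_id, []).append(action)
        return action

    def bust(self, action):
        action.busted = True        # possible only because the fill was retained

    def filled_qty(self, order_id):
        return sum(a.qty for a in self.actions.get(order_id, [])
                   if a.kind == "partial_fill" and not a.busted)


om = OrderManager()
om.record(1, "validate")
fill_a = om.record(1, "partial_fill", qty=300)
fill_b = om.record(1, "partial_fill", qty=200)
before = om.filled_qty(1)
om.bust(fill_a)
after = om.filled_qty(1)
```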


[09]%%design priorities as arch/CTO

Priorities depend on industry, target users and managers’ experience/preference… Here are my Real answers:

A: instrumentation (non-opaque ) — #1 priority to an early-stage developer, not to a CTO.

Intermediate data store (even binary) is great — files; reliable[1] snoop/capture; MOM

[1] seldom reliable, due to the inherent nature — logging/capture, even error messages are easily suppressed.

A: predictability — #2 (I don’t prefer the word “reliability”.) related to instrumentation. I hate opaque surprises and intermittent errors like

  • GMDS green/red LED
  • SSL in Guardian
  • thick, opaque libraries like Spring
  1. Database is rock-solid predictable.
  2. javascript was predictable in my pre-2000 experience
  3. automation Scripts are often more predictable, but advanced python is not.

(bold answers are good interview answers.)
A: separation of concern, encapsulation.
* any team development needs task breakdown. PWM tech department consists of teams supporting their own systems, which talk to each other on an agreed interface.
* Use proc and views to allow data source internal change without breaking data users (RW)
* ftp, mq, web service, ssh calls, emails between departments
* stable interfaces. Each module’s internals are changeable without breaking client code
* in GS, any change in any module must be done along with other modules’ checkout, otherwise that single release may impact other modules unexpectedly.

A: prod support and easy to learn?
* less support => more dev.
* easy to reproduce prod issues in QA
* easy to debug
* audit trail
* easy to recover
* fail-safe
* rerunnable

A: extensible and configurable? It often adds complexity and workload. Probably the #1 priority among managers i know on wall st. It’s all about predicting what features users might add.

How about time-to-market? Without testability, changes take longer to regression-test? That’s pure theory. In trading systems, there’s seldom automated regression testing.

A: testability. I think Chad also liked this a lot. Automated tests are less important to Wall St than other industries.

* each team’s system to be verifiable to help isolate production issues.
* testable interfaces between components. Each interface is relatively easy to test.

A: performance — always one of the most important factors if our system is ever benchmarked in a competition. Benchmark statistics are circulated to everyone.

A: scalability — often needs to be an early design goal.

A: self-service by users? reduce support workload.
* data accessible (R/W) online to authorized users.

A: show strategic improvement to higher management and users. This is how to gain visibility and promotion.

How about data volume? important to eq/fx market data feed, low latency, Google, facebook … but not to my systems so far.

DB=%% favorite data store due to instrumentation

The noSQL products all provide some GUI/query, but not very good. Piroz had to write a web GUI to show the content of gemfire. Without the GUI it’s very hard to manage anything that’s built on gemfire.

As data stores, even binary files are valuable.

Note snoop/capture is not a data store; it falls in the same category as logging. Both are easily suppressed, including critical error messages.

Why is RDBMS my #1 pick? ACID requires every datum to be persistent/durable, therefore viewable from any 3rd-party app, so we aren’t dependent on the writer application.

dotnet remoting and related jargon

P4 [[.net 1.1 remoting, reflection and threading]] shows an insightful history leading to dotnet remoting:
#1) RPC (pre-OO).
OO movement brought about the Next generation in the form of distributed objects (aka distributed components) —
#2) CORBA, RMI (later ejb) and dcom, which emerged around the same time.
COM is mostly for in-process and dcom is distributed
#3) soap and web services , which are OO-agnostic
I feel soap is more like RPC… The 2 distinct features of soap — xml/http. All predecessors are based on binary protocols (efficient), and the “service component” is often not hosted in any server.
#4) dotnet remoting feels more like RMI to me…According to the book above, remoting can use either
1) http channel with the soap formatter, or
2) tcp channel  with the binary formatter

Therefore, I feel remoting is an umbrella technology with different implementations for different usage scenarios.

#5) WCF
Remoting vs wcf? See other post.

private bank trade/order/quote/execution flow

Remember — Most non-exchange traded products are voice executed. Only a few very dominant, high volume products are electronically executed. Perhaps 1% of the products account for 99% of the trades — by number of trades. By dollar amount, IRS alone is probably 50% of all the trades, and IRS is probably voice-executed, given the large notional amounts.

The products traded between the bank and its clients (not interbank) are often customized private offerings, unavailable from any other bank. (Remember the BofA puttable floats.)

RM / PWA would get live quotes from dealer and give them to a client. Sometimes dealer publishes quotes on an internal network, but RFQ is more common. The quote could be executed between RM and client at any time. RM would then book the new position into the bank's database. As soon as the trade is executed (before the booking), the bank has a position, but dealer knows the position only after the booking, and would then hedge quickly.

Dealer initially only responds to RFQ. It's usually executed without her knowledge, just like an ECN flow.

I think in BofA's wealth management platform, many non-equity products (muni bonds are largely sold to retail clients) trade in the same way. Dealer publishes quotes on an intranet website. RM negotiates with client and executes over the phone. During trade booking, the price and quantity would be validated. Occasionally (volatile market), trade fails to go through and RM must inform client to retry. Perhaps requote. Fundamentally, the dealer gets a last look, unlike the exchange flow.

I believe structured products (traded between bank and clients) are usually not fast and volatile — less requote. However, when dealer hedges the position, I think she often uses vanilla instruments.

Terminology warning — some places use “trade” to mean many things including orders. I think in exchange flow, “order” is a precise word.

[12] sub-millis OMS architecture

I feel ideally you want to confine entire OMS to one single process (like the barebones mvea), minimizing IPC latency [1]. In practice however, even for one symbol OMS is often split into multiple processes.

[1] Data parallelism (into multiple processes) is perfectly fine.

So what’s the IPC? It turns out that in sub-millis trading, FIX/Solace/Tibrv messaging is the IPC of choice. [2] I mentioned synchronous call and shared memory, but my veteran friend said messaging performs better in practice. I still believe shared mem beats messaging.

[2] mvea had about 150 micros in speedway, and about 10 micros in the single-process OMS.

The main component is one big JVM instance with an internal order lookup cache for order state maintenance.

Multi-queue – if there are 50,001 symbols, there will be 50,001 queues. Once a queue is assigned a given thread T351, it is permanently bound to T351. This is to prevent multiple threads handling events on the same symbol. Obviously we don’t want 50,001 threads. Therefore, some kind of multiplexing is in place.
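A minimal sketch of the multiplexing idea, assuming a stable hash of the symbol picks one of a few worker threads. The names `SymbolMultiplexer` and `lane_of`, and the choice of `zlib.crc32`, are my own illustration and not how any particular OMS does it. Because the hash of a symbol never changes, the symbol stays bound to the same thread, so no two threads ever handle events on the same symbol.

```python
import queue
import threading
import zlib


class SymbolMultiplexer:
    """Multiplex many per-symbol event streams onto a few worker threads (hypothetical design)."""

    def __init__(self, num_threads):
        self.lanes = [queue.Queue() for _ in range(num_threads)]
        # processed[i] records what worker i handled, in handling order
        self.processed = [[] for _ in range(num_threads)]
        self.workers = [
            threading.Thread(target=self._run, args=(i,), daemon=True)
            for i in range(num_threads)
        ]
        for w in self.workers:
            w.start()

    def lane_of(self, symbol):
        # crc32 is stable across runs (the built-in hash() of str is salted),
        # so the symbol-to-thread binding is effectively permanent.
        return zlib.crc32(symbol.encode()) % len(self.lanes)

    def submit(self, symbol, event):
        self.lanes[self.lane_of(symbol)].put((symbol, event))

    def _run(self, i):
        while True:
            item = self.lanes[i].get()
            if item is None:  # shutdown sentinel
                return
            self.processed[i].append(item)

    def shutdown(self):
        for lane in self.lanes:
            lane.put(None)
        for w in self.workers:
            w.join()
```

With 4 threads and 50,001 symbols, each thread serves roughly a quarter of the symbols, and events on any one symbol are still handled strictly in submission order.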

##[12] bottlenecks in a high performance data "flow" #abinitio


#1 probably most common — database, both read and write operations. Therefore, ETL solutions achieve superior throughput by taking data processing out of database. ETL uses DB mostly as dumb storage.

  • write – if a database data-sink is too slow, then the entire pipe is limited by its throughput, just like a sewage pipe.
    • relevant in mkt data and high frequency trading, where every execution must be recorded
  • read – if you must query a DB to enrich or lookup something, this read can be much slower than other parts of the pipe.

#2 (similarly) flat files. Write tends to be faster than database write. (Read is a completely different story.)
* used in high frequency trading
* used in high volume market data storage — Sigma2 for example. So flat file writing is important in industry.
* IDS uses in-memory database + some kind of flat file write-behind for persistence.

#? Web service

#? The above are IO-bound. In contrast, CPU-bound compute-intensive transforms can (and do) also become bottlenecks.

hide client names and address

I proposed a system to a buy-side asset manager shop. I said client names don't need to be stored in the central database. Maybe the salesforce and investment advisors need the names but they don't need to save those in a shared central database for everyone else to see.

Each client is identified by account id, which might include an initial.

When client logs in to a client-facing website, they will not see their name but some kind of relatively public information such as their self-chosen nick name, investment objectives, account balance, and last login time.

Client postal address is needed only for those who opt for paper statement. And only one system needs to access it — the statement printing shop.

A veteran in a similar system told me this is feasible and proposed an enhancement — encrypt sensitive client information.

What are some of the inconveniences in practice?

Tx Monitoring System: distributed cache

I believe I learnt this from an Indian consultant while working in Barcap. Perhaps gigaspace?

Basic function is to host live transaction data in a huge cache and expose them to users. Includes outgoing orders and incoming execution reports. I think market quotes can also be hosted this way.
Consumers are either sync or async :
1) Most common client mode is synchronous call-and-wait. Scenario — consumer can’t proceed without the result.
2) Another common mode is subscription based.
3) A more advanced mode is query-subscription (similar to continuous query), where
– consumer first makes a sync call to send a query and get the initial result
– then MOM service (known as the “broker”) creates a subscription based on query criteria
– consumer must create an onMsg() type of listener.
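The three steps of query-subscription can be sketched as below. This is only an illustration under my own assumptions: `Broker`, `query_subscribe` and `update` are hypothetical names, and a Python predicate stands in for the SQL-style query criteria.

```python
class Broker:
    """Toy cache + broker supporting query-subscription (hypothetical API)."""

    def __init__(self):
        self.cache = {}          # order id -> order dict
        self.subscriptions = []  # (predicate, on_msg listener) pairs

    def query_subscribe(self, predicate, on_msg):
        # Step 1: synchronous query returns the initial result set.
        initial = [o for o in self.cache.values() if predicate(o)]
        # Step 2: the broker creates a subscription on the same criteria.
        self.subscriptions.append((predicate, on_msg))
        return initial

    def update(self, order):
        # Cache update arriving from upstream (MOM-based in the real system).
        self.cache[order["id"]] = order
        # Step 3: push matching updates to each consumer's listener.
        for predicate, on_msg in self.subscriptions:
            if predicate(order):
                on_msg(order)
```

The consumer gets a consistent starting snapshot from the sync call, then only the deltas that match its criteria.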

Query criteria are written in SQL syntax. In a “select A,B..” the A actually maps to an object in the cache.

Major challenge: volume. Millions of orders/day, mostly eq, futures and options. Gigabytes of data per day. Each order is 5KB – 10KB. One compression technique is a FIX-style data dictionary: requester and reply systems communicate using canned messages, so the network is free of recurring long strings.
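To illustrate the data-dictionary idea, here is a small sketch where both sides share a dictionary mapping long field names to short numeric tags, so only the tags travel on the wire. The tag numbers are the standard FIX tags for these fields, but the `encode`/`decode` helpers are my own simplification and do not produce a complete FIX message (no header, body length or checksum).

```python
# Shared data dictionary: field name -> FIX tag number.
FIELD_TO_TAG = {"MsgType": 35, "Symbol": 55, "Side": 54, "OrderQty": 38, "Price": 44}
TAG_TO_FIELD = {tag: name for name, tag in FIELD_TO_TAG.items()}
SOH = "\x01"  # FIX field delimiter


def encode(msg):
    """Replace each field name with its tag; values are sent as strings."""
    return SOH.join(f"{FIELD_TO_TAG[name]}={value}" for name, value in msg.items()) + SOH


def decode(wire):
    """Rebuild the named fields from tag=value pairs using the same dictionary."""
    pairs = (field.split("=", 1) for field in wire.split(SOH) if field)
    return {TAG_TO_FIELD[int(tag)]: value for tag, value in pairs}
```

The saving comes from never repeating the field names (or other canned strings) in every message, at the cost of both ends having to agree on the dictionary.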

All cache updates are MOM-based.

Q: when to use async/sync?

A: Asynchronous query – needed by Live apps – need latest data
A: Synchronous query – reporting apps

no 2 thread for 1 symbol: fastest mkt-data distributor

Quotes (and other market data) sent downstream should be in FIFO sequence, not out-of-sequence (OOS).

In FX (eg EURUSD) and cash equities, I know many major market data aggregators design the core of the feed engine to be single-threaded per symbol: each symbol is confined to a single “owning” thread. I was told the main reason is to avoid synchronization between 2 load-sharing threads. 2 threads improve throughput but can introduce OOS risk.

You can think of a typical stringent client as a buy-side high-frequency trader (HFT). This client assumes later-delivered quote is physically “generated” later. If 2 quotes arrive on the same name, one by one, then the later one always overwrites the earlier one – conflation.

A client’s HFT can react in microseconds, from receiving quote (data entering client’s network) to placing orders (data leaving client’s network). For such a fast client, a little bit of delay can be quite bad, but not as bad as OOS. I feel OOS delivery makes the data feed unreliable.

I was told many automated algo trading engines (including automatic offer/bid pricers in bond) send fake orders just to test the market. It sends a test order and waits for the response in the data feed. An OOS delivery would confuse this “observer”.

An HFT could be trend-sensitive. It monitors the rise and fall of sizes of the quotes on a given name (say SPX). It assumes the market data are delivered in-sequence.

learning design patterns #letter to Mithun

Another thing about design patterns – each author has a different description.

Some describe it in 2 paragraphs or a 10-line example using one dog/cat class. Others write 3 pages and 5 classes. Yet others write a whole chapter on it in a design pattern book.

If I go by the simplistic interpretation to describe a pattern, then the interviewer may think I don’t really know it.

If I give an in-depth example, then it could be too complicated to describe (I’m not extremely good at describing complexity) and interviewer may not be an expert on that pattern to understand me.

My suggestion is to focus on 1 or 2 patterns. Understand them inside out and also remember some simple examples in your own app. Our knowledge of a pattern should grow from thin to thick to thin, yes back to thin. Only when we have a streamlined understanding can we describe it with clarity.

Here are some complex patterns I have struggled with for years – visitor, bridge, strategy, composite, chain of responsibility, memento, command, observer …. Each of them can be summarized in 1 paragraph, but let’s be honest – these aren’t simple.

async (almost)always requires buffer and additional complexity

Any time I see asynchronous (swing, MOM etc), I see additional complexity. Synchronous is simpler. Synchronous means blocking, and requires no object besides the caller actor and service actor. The call is confined to a single call stack.

In contrast, async almost always involves 2 call stacks and requires a 3rd object in the form of a buffer [1]. Async means caller/sender can return before responder/callback even gets the message. In that /limbo/, the message must be kept in the buffer. If responder were a doctor then she might be “not accepting new patients”.

Producer/consumer pattern … (details omitted)
Buffer has capacity and can overflow.
Buffer is usually shared by different producer threads.
Buffer can resend.
Buffer can send the messages out of order.

[1] I guess the swing event object must be kept not just on the 2 call stacks, but on the event queue — the buffer

Q: single-threaded can be async?
A: yes the task producer can enqueue to a buffer. The same thread periodically dequeues. I believe swing EDT thread can be producer and consumer of tasks i.e. events. Requirement — each task is short and the thread is not overloaded.
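The single-threaded case can be sketched as follows, assuming a plain in-memory queue as the buffer. `EventLoop`, `post` and `run_pending` are hypothetical names for illustration, loosely in the spirit of an event-dispatch thread, not the swing API.

```python
from collections import deque


class EventLoop:
    """One thread acts as both producer and consumer of tasks."""

    def __init__(self):
        self.buffer = deque()  # the "3rd object" between caller and responder

    def post(self, task, *args):
        # Async send: enqueue and return before the task has run.
        self.buffer.append((task, args))

    def run_pending(self):
        # The same thread that posted now dequeues and runs each task.
        while self.buffer:
            task, args = self.buffer.popleft()
            task(*args)


def demo():
    loop, handled = EventLoop(), []

    def on_event(n):
        handled.append(n)
        if n < 3:
            loop.post(on_event, n + 1)  # the consumer acting as producer

    loop.post(on_event, 1)
    before = list(handled)  # still empty: post() returned before on_event ran
    loop.run_pending()
    return before, handled
```

Here `demo()` shows the async property without any second thread: nothing has been handled when `post()` returns, and each handler can enqueue follow-up work for the same loop.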

Q: timer callback in single-threaded?
A: yes. Xtap is single-threaded and uses epoll timeout to handle both sockets and timer callbacks. If the thread is busy processing socket buffers it has to ignore the timer, otherwise the socket buffer will fill up. Beware of the two “buffers”:

  • NIC hardware buffer is very small, perhaps a few bytes only, processed by hardware interrupt handler, without pid.
  • kernel socket buffer is typically 64-256MB, processed under my parser pid.
    • some of the functions are kernel tcp/udp functions, but running under my parser pid

See which thread/pid drains NIC_buffer}socket_buffer
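A rough sketch of a single-threaded loop that serves both sockets and a timer through the select timeout. This is my own illustration using Python's `selectors` module (which uses epoll on Linux), not Xtap's actual code; `run_loop` and its parameters are hypothetical. Sockets are drained first, and the timer is only checked afterwards, so a busy socket delays the timer rather than the other way round.

```python
import selectors
import socket
import time


def run_loop(sock, timer_interval, stop_after_ticks):
    """Serve one socket and one periodic timer on a single thread."""
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    received, ticks = [], 0
    next_tick = time.monotonic() + timer_interval
    while ticks < stop_after_ticks:
        # Block until data arrives or the timer is due, whichever is first.
        timeout = max(0.0, next_tick - time.monotonic())
        for key, _ in sel.select(timeout):
            data = key.fileobj.recv(4096)  # drain the socket buffer first
            if data:
                received.append(data)
        # Only after socket work is done do we look at the timer.
        if time.monotonic() >= next_tick:
            ticks += 1  # the timer callback would run here
            next_tick += timer_interval
    sel.unregister(sock)
    sel.close()
    return received, ticks
```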

common technical challenges in buy-side software systems

10 – 30% of wall st IT jobs are on the buy-side such as funds, portfolio and asset management.

* Core challenge – sub ledger. In one system there were more than 100,000 client accounts in the sub ledger. Each is managed by professional financial advisors. I believe each account on average could have hundreds of positions. (Thousands would be overwhelming I feel.) Since clients (and FA) need instant access to their “portfolio”, there are too many positions to keep up to date. Core of the entire IT infrastructure is the subledger.

** Number of trades per day is a 2nd challenge. These aren’t high-frequency traders, but many Asian clients do nothing but brokerage (equity) trades. Not many per account, but all the accounts combined add up to a lot of overnight processing. These trades must be added to the sub ledger by the next day.

* quarterly performance reporting
** per-fund, per-account, per-position
** YTD, quarterly, annual etc

I guess there is also monthly performance reporting requirement in some cases.

* asset allocation and periodic portfolio re-balancing — for each client. Key differentiators. Investors get a large “menu” of funds, products … For comparison, they may want performance metrics.

– VaR or realtime risk? Probably important to large funds
– pricing? Probably important to large funds

– swing/wpf not really required. Web is adequate.
– trade booking? not a challenge

Database: limited usage]real time trading

“Database” and “Real-time trading” don’t rhyme!

See Trading systems use lots of MOM and distributed cache.

In comparison, DB offers perhaps the most effective logging/audit. I feel every update sent to MOM or cache should ideally be asynchronously persisted in DB. I would probably customize an optimized DB persistence service to be used across the board.

Just about any update in cache needs to be persisted, because cache is volatile memory. Consider flat file.

[11] real time high volume FX quote processing #letter

Horizontal scale-out (distributing to different boxes) is the design of choice when we are cpu-bound, for instance if we get hundreds of updates a sec and each update requires repricing a large number of objects.

Ideally, you would want cpu to be saturated. (By using twice the hardware threads, you want throughput to double.) Our pricing engine didn’t have that much cpu load, so we didn’t scale out to more than a few boxes.

The complication of scale-out is, data required to reprice one object may reside in different boxes. People try many solutions like memory virtualization (non-trivial synchronization cost + network latency), message-passing, RMI, … but I personally prefer the one-big machine approach. Throw in 16 (or 128) processors, each with say 4 to 8 hardware threads, run 64-bit, throw in 256G RAM. No network latency. No RMI/messaging latency. I think this hardware is rather costly. 8 smaller machines with comparable total CPU power would cost much less, so most big banks prefer that approach – so-called grid computing.

According to my observations, most practitioners in your type of situations eventually opt for scale-out.

It sounds like after routing a message, your “worker” process has all it needs in its local memory. That would be an ideal use case for parallel processing.

I don’t know if FX spot real time pricing is that ideal. Specifically, suppose a worker process is *dedicated* to update and publish eur/usd spot quote. I know you would listen to the eurusd quotes from all liquidity providers, but do you also need to watch usd/jpy and eur/jpy?

15,000 quotes repriced within a minute

One of my bond pricing engines could price about 15,000 offers/bids in about a minute.

4 slow lanes to avoid:
1) database persistence is done asynchronously by gemfire write-behind.

2) offers/bids we produce must be verified by another system, which officially owns the OutgoingQuote table. The verification takes a long time. We avoid that overhead by pricing all the offers/bids in gemfire, then send them out by batch, then wait for the result. The 1 minute speed is without the verification.

3) all reference data is preloaded into gemfire, so no more disk I/O.

4) minimal serialization overhead, since most of the objects needed are in local JVM.

In contrast, a more complex engine, the mark-to-market engine needs a few minutes to price 15,000 positions. This engine doesn't need real time performance.

4 infrastructure features@Millennium

 swing trader station + OMS on the server-side + smart order router over low-latency connectivity layer

* gemfire distributed cache. why not DB? latency too high.
* tibrv is the primary MOM
* between internal systems — FIX based protocol over tibrv, just like Lehman equities. Compare to protobuf object serialization
* there’s more advanced math in risk system; but the highest latency requirements are on the eq front office systems.

y java is dominant in enterprise app

What's so good about OO? Why do the 3 most “relevant” enterprise app dev languages all happen to be OO – java, c# and c++?

Why is google choosing java, c++ and python?

(Though this is not really a typical “enterprise app”) Why is apple choosing to promote a compiled OO language — objective C?

Why is microsoft choosing to promote a compiled OO language more vigorously than

But why is facebook (and yahoo?) choosing php?

Before c++ came along, most enterprise apps were developed in c, cobol, fortran…. Experience in the field shows that c++ and java require more learning but do offer real benefits. I guess it all boils down to the 3 base OO features of encapsulation, inheritance and polymorphism.

enterprise reporting with^without cache #%%xp


(A personal blog) We discussed enterprise reporting on a database with millions of new records added each day. Some reflections…

One of my tables had about 10G data and more than 50 million rows. (100 million is kind of minimum to qualify as a large table.) This is the base table for most of our important online reports. Every user hits this table (or its derivative summary tables) one way or another. We used more than 10 special summary tables and Business Objects and the performance was good enough.

With the aid of a summary table, users can modify specific rows in main table. You can easily join. You can update main table using complex SELECT. The most complex reporting logic can often be implemented by pure SQL (joins, case, grouping…) without java. None of these is available to gigaspace or hibernate users. These tools simply get in the way of my queries, esp. when I do something fancy.
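As a small illustration of doing this kind of work in pure SQL, the sketch below builds a summary table, updates the main table from a SELECT against it, then reports with join/case/grouping. It runs on an in-memory SQLite database purely for convenience; the table names, columns and the 1000 threshold are invented for the example and have nothing to do with my real schema.

```python
import sqlite3


def build_report(rows):
    """rows: iterable of (account, qty). Returns [(account, bucket, flagged_rows)]."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE trade (id INTEGER PRIMARY KEY, account TEXT, "
        "qty INTEGER, flagged INTEGER DEFAULT 0)"
    )
    db.executemany("INSERT INTO trade (account, qty) VALUES (?, ?)", rows)
    # Summary table derived from the main table.
    db.execute(
        "CREATE TABLE account_summary AS "
        "SELECT account, COUNT(*) AS n_trades, SUM(qty) AS total_qty "
        "FROM trade GROUP BY account"
    )
    # Update specific rows of the main table, driven by the summary table.
    db.execute(
        "UPDATE trade SET flagged = 1 WHERE account IN "
        "(SELECT account FROM account_summary WHERE total_qty > 1000)"
    )
    # Reporting logic in SQL only: join + case + grouping.
    result = db.execute(
        "SELECT s.account, "
        "CASE WHEN s.total_qty > 1000 THEN 'large' ELSE 'small' END AS bucket, "
        "SUM(t.flagged) AS flagged_rows "
        "FROM account_summary s JOIN trade t ON t.account = s.account "
        "GROUP BY s.account, s.total_qty ORDER BY s.account"
    ).fetchall()
    db.close()
    return result
```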

In all the production support (RTB) teams I have seen on wall street, investigating and updating DB is the most useful technique at firefighting time. If the reporting system is based on tables without cache, prod support will feel more comfortable. Better control, better visibility. The fastest cars never use automatic gear.

Really need to limit disk I/O? Then throw enough memory in the DB.

y trading systems use so many stored procedures

A popular Wall Street interview question is the pros and cons of stored
proc. Here are a few
#1 single point of access from java, c++ …
#2 modular encapsulation. separation of concern
+ network efficiency
+ access control
+ reusable. DRY
+ easy version control
– readability
– exception handling
– hard to log actual query

Perhaps the biggest motivation is to avoid recompiling binary in an
emergency fix. Many sites have extremely strict control on binary
build/deployment [1]. Every release always builds from version control.
If you need a bug fix release, then deal with all the changes checked
into cvs but not approved!

Redeploying a binary can also break any number of (or all) other applications.

Proc is the answer to your prayer. In some places, every
select/insert/update/delete statement is extracted into a proc. Changing
the logic in them feels almost painless compared to a binary
build/release. Hibernate is a big departure from the proc tradition.

1) Wall Street users want frequent changes, not bound by software
release controls. Control-vs-time-to-market makes a healthy contention.
2) Wall Street code is often extremely (quick and) dirty, so fixing bugs
without software release is often a life saver.

About half of all business logic, both features (1) and bugs (2), are
often expressed in SQL. Now you see how useful it is to have flexible
ways to change the SQL logic.

If you think hard and always forecast which business logic might need
change, then you can strategically extract those SQL into stored procs.

[1] Given the huge sums involved, wall st wants control on software.
They can't control code quality but can control build/release. Many,
many levels of approvals. Numerous staging, integration, QA, preQA

overnight risk reporting in portfolio management

I talked to a big portfolio mgmt (PM) firm. Team owns and delivers nightly risk reports to traders (+ perhaps fund managers). According to the team mgr, the most important sister team is the quant team, who are often PhD's but not professional coders. Quants are primarily qualified to create models, but these quants also implement their models in c++ themselves.

There's a large amount of data in DB. Nightly job reads in these data and analyzes them using the c++ models, then writes data back into DB.

This is a heavy-duty number crunching batch job, heavy on DB, light on network – no socket programming.

Logic is mostly in perl, c++, shell and DB. DB holds significant amount of logic, just like Goldman Sachs PrivateWealthManagement. It turned out c++ implements more business logic than perl. These perl scripts are considered low-logic, but if there's a lot of perl, then I believe there's a lot of logic.

Perf is the biggest issue. Job must complete in 12 hours, before a 3am deadline, without break. If it breaks, there will be … delays and …? Bottleneck is DB. There's spare hardware capacity underutilized but the DB server is on its knees. I have heard of the same many times, in GS, citi… so I guess this is hard to avoid. Risk system is probably worst affected.

%Q: stress testing? Monte Carlo?
A: the reporting system doesn't do those. Those are probably the job of quants.

%Q: is VaR the key output?
A: no. duration, curve duration, spread duration

%Q: are matrices and “vectors” used in the c++ code, like those in matlab? So it goes well beyond STL?
A: yes quants use matlab and mathematica to develop the concept, and then use c++ to implement it. We do have our own data structures beyond STL.

%Q: how much domain knowledge required in the analytical work?
A: more of an aptitude and attitude to learn

multiple intermediate data storage]real time trading servers

For easy prod support, get your first stage of processing to save
intermediate output to cache, DB or MOM, and 2nd stage to pick up from
there. You can have many stages (ie pipes) and pipe connectors.
This might help your job security if other developers can’t easily
figure out all of your techniques for saving, accessing, investigating (in
prod), filtering, monitoring the intermediate data. Remember gemfire
doesn’t have a working data browser?
This helps testing. Remember Mithun’s DBank cash management project.
This helps prod monitoring.
This helps everyone understand the business as they can see the
intermediate data in blood and flesh. You can get interesting
Recall Reo has limited logging so we don’t know why some events don’t
happen upon a user action or market update.

java RMI in trading systems

Now I feel rmi is rather easy, battle-tested, proven, mature,
well-researched, … compared to many alternative solutions. Here's
RMI usage in a trading system circa 2011 —

Nobody calls Neo server via RMI.  The only way you can talk to Neo is
via JMS/Protobuf.  So even if you have 100 instances of Neo servers,
JMS distributes the messages across them.

Neo does make _outbound_ RMI calls to PricingControl, Arb/prop/model
trading engine, and various other systems.

10 (random) arch features of HFT

When talking to low-latency shops, I realize the focus shifts from pricing, trade booking, position mgmt … to market data, message formatting and sockets – rather low-level stuff. A high-frequency trading engine has many special features at architectural and impl levels, but here I will focus on some important architectural features that make a difference. By the way, my current system happens to show many of these features.

1) message-driven, often using RV or derivatives. Most trading signals come in as market data, tick data, benchmark shifts, position adjustments (by other traders of our own bank). Among these, I feel market data poses the biggest challenge from the latency perspective.
2) huge (reluctantly distributed – see other post) cache to minimize database access
3) judicious use of async and sync IPC, if one-big-machine is undesirable.
4) optimized socket layer, often in C rather than c++. No object-orientation needed here:)
5) server collocation
6) large number of small orders to enable fine-grained timing/cancel and avoid disrupting market
7) market data gateway instantiates a large number of small objects
8) smart order router, since an order can often execute on multiple liquidity venues

Beyond the key features, I guess there’s often a requirement to immediately change a parameter in the runtime rather than updating a database and waiting for the change to be noticed by the runtime. I feel messaging is one option, and RMI/JMX is another.

bond trade capture system use-cases

Trading system architect must know such essential use cases:

A hypothetical bond trade booking sys – named Blo (for Blotter)

Blo use case 1: phone execution, then trader enters the trade into Blo.

Blo use case 2: traders advertise offers and bids on an internal network. Our salesperson lifts an offer. Trade is confirmed on the spot. System automatically books trade into Blo. This flow converts the Order into a Trade automatically. It’s possible for 2 salespersons to lift the same offer. System will reject A and book B.

Blo use case 3: advertise offers to external venue, lifted automatically. External venue sends us confirmation and trade booked.

Blo use case 4: trader responds to external bid-wanted (RFQ) and her bid is selected, becoming a trade. External venue sends confirmation to us, trade booked.
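Use case 2 has a race: two salespersons may lift the same offer. A minimal sketch of one way to resolve it, assuming the lift that reaches the system first is booked and the other is rejected. `OfferBoard`, `advertise` and `lift` are hypothetical names, and a simple lock stands in for whatever the real system uses.

```python
import threading


class OfferBoard:
    """Internal offers; lifting an offer atomically converts it into a booked trade."""

    def __init__(self):
        self._lock = threading.Lock()
        self._live = {}    # offer_id -> (bond, price, qty)
        self.blotter = []  # trades booked into Blo

    def advertise(self, offer_id, bond, price, qty):
        with self._lock:
            self._live[offer_id] = (bond, price, qty)

    def lift(self, offer_id, salesperson):
        # Check-and-remove under one lock, so only one lift can succeed.
        with self._lock:
            offer = self._live.pop(offer_id, None)
            if offer is None:
                return None  # rejected: offer already lifted or never existed
            bond, price, qty = offer
            trade = {"offer_id": offer_id, "bond": bond, "price": price,
                     "qty": qty, "salesperson": salesperson}
            self.blotter.append(trade)
            return trade
```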

In Eq, there’s often a big OMS to manage the order state from an initial request to a completed trade.

DB as audit trail for distributed cache and MOM

MOM and distributed cache are popular in trading apps. Developers tend to shy away from DB due to latency. However, for rapid development, relational DB offers excellent debugging, tracing, and correlation capabilities in a context of event-driven, callback-driven, concurrent processing. When things fail mysteriously and intermittently, logging is the key, but u often have multiple log files. You can query the cache but much less easily than DB.

Important events can be logged in DB tables and

* joined (#1 most powerful)
* sorted,
* searched in complex ways
* indexed
* log data-mining. We can discover baselines, trends and anti-trends.
* Log files are usually archived (less accessible) and then removed, but DB data are usually more permanent. Don't ask me why:)
* selectively delete log events, easily, quickly.

* Data can be transformed.
* accessible by web service
* concurrent access
* extracted into another, more usable table.
* More powerful than XML.

Perhaps the biggest logistical advantage of DB is easy availability. Most applications can access the DB.

Adding db-logging requires careful design. When time to market is priority, I feel the debug capability of DB can be a justification for the effort.

A GS senior manager preferred logging in DB. Pershing developers generally prefer searching the same data in DB rather than file.

gemfire write-behind and gateway queue #conflation, batched update

The source says (simplified by me):
In the Write-Behind mode, updates are asynchronously written to DB. GemFire uses Gateway Queue. Batched DB writes. A bit like a buffered file writer.

With the asynch gateway, low-latency apps can run unimpeded. See blog on offloading non-essentials asynchronously.

GemFire’s best known use of Gateway Queue technology is for the distribution/propagation of cache update events between clusters separated by a WAN (thus they are referred to as ‘WAN Gateways’).

However, Gateways are designed to solve a more fundamental integration problem shared by both disk and network IO — 1) disk-based databases and 2) remote clusters across a WAN. This problem is the impedance mismatch when update rates exceed absorption capability of downstream. For remote WAN clusters the impedance mismatch is network latency–a 1 millisecond synchronously replicated update on the LAN can’t possibly be replicated over a WAN in the same way. Similarly, an in-memory replicated datastore such as GemFire with sustained high-volume update rates provides a far greater transaction throughput than a disk-based database. However, the DB actually has enough absorption capacity if we batch the updates.

Application is insulated from DB failures as the gateway queues are highly available by default and can be configured to allow zero data loss.

Reduce database load by enabling conflation — Multiple updates of the same key can be conflated and only the final entry (containing all updates combined) written to the database.

Each Gateway queue is maintained on at least 2 nodes, internally arranged in a primary + (one or multiple) secondary configuration.
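The batching and conflation described above can be sketched in a few lines. This is my own toy model, not GemFire's API: `WriteBehindQueue` is a hypothetical class, `write_batch` stands for whatever performs the batched DB write, and the flush is called explicitly here whereas a real gateway would drain the queue asynchronously in the background.

```python
class WriteBehindQueue:
    """Buffer cache updates, conflate by key, and write to the DB in batches."""

    def __init__(self, write_batch, batch_size):
        self.write_batch = write_batch  # callable taking a list of (key, value)
        self.batch_size = batch_size
        self.pending = {}               # key -> latest value not yet written

    def put(self, key, value):
        # Conflation: a later update on the same key overwrites the earlier one,
        # so only the final entry reaches the database.
        self.pending[key] = value

    def flush(self):
        items = list(self.pending.items())
        self.pending.clear()
        # Batching: one DB round trip per batch instead of one per update.
        for i in range(0, len(items), self.batch_size):
            self.write_batch(items[i:i + self.batch_size])
```

Four updates across three keys below would result in only three rows written, in two batches when `batch_size` is 2.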

y a regular developer need design patterns

I asked a friend familiar with design patterns. Here's his answer + my comments.

* Sometimes you need to provide an API to another developer. I feel it's often beneficial to provide a familiar API based on a familiar pattern.
* When you refactor existing code
* A lot of frameworks out there embody design patterns. If you have to create your own framework (for whatever reason), you might need to decipher and follow the same design patterns.

In all these scenarios, concept is more important than knowledge. Variations on the theme needed.

pricing control in a bond dealer desk

Pricing (along with pnl) is one of the most important data items to monitor and control. There’re multiple levels of price controls.
* Offer/bid price limits, to block out-of-range offer/bid advertisements
* the price in a response to an IFB is typically sent out via a system and is probably subject to price control, to prevent bidding too high.
* After trade execution, Middle Office would check the price against some reference prices. If a trader executed at an unusual price, she may be held responsible. I was told MO only bothers with unfinished (i.e. unclosed) positions.
* Pricing exception report and attestation. I think this is internal compliance.
* There could be regulations on unusual execution prices in some regulated securities. It’s conceivable that government wants to know every trade’s price in a particular derivative so as to prevent another bank collapse.
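The first level of control, offer/bid price limits, could look roughly like this. It is a sketch under my own assumptions: the limit is modelled as a percentage band around a reference price, and `within_price_limit` and `screen_quotes` are hypothetical names. Real desks may define the band in price points or yield instead.

```python
def within_price_limit(price, reference_price, max_deviation_pct):
    """True if price is within +/- max_deviation_pct percent of the reference price."""
    band = reference_price * max_deviation_pct / 100.0
    return abs(price - reference_price) <= band


def screen_quotes(quotes, reference_prices, max_deviation_pct):
    """Split (bond, side, price) quotes into accepted and blocked lists."""
    accepted, blocked = [], []
    for bond, side, price in quotes:
        ok = within_price_limit(price, reference_prices[bond], max_deviation_pct)
        (accepted if ok else blocked).append((bond, side, price))
    return accepted, blocked
```

Blocked quotes would then feed something like the pricing exception report rather than being silently dropped.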

trade booking, trade capture, position management, sub ledger

The standard OTC trade booking system (TBS): after you finalize i.e. execute a trade with your counterparty, typically over the phone, you enter the completed trade in the TBS. Some call it trade capture. I think this used to be the trade blotter. Before TBS, people used spreadsheets. This is one of the earliest and most essential IT systems for traders.

The other absolutely essential trading system is the position management system (PMS), aka sub ledger. TBS records all the trade activities, and independently computes current positions by accumulation, and synchs up with the PMS every day.

How about pricing engine? In OTC, trader can decide the price with a pencil or a sophisticated pricing engine. I think it’s firm’s money but trader’s decision, so it’s up to her.

eq listed drv desk

Some basic info from a friend –

Equity Listed derivatives – mostly options on single stocks or options on index/future, but also variance-swaps. Even if a stock has no listed options, we would still create a vol surface so as to price OTC options on it, but the technique would be different — The standard technique if given many pairs of {expiration, strike} is to fit a curve on a single expiration, then create similar curves for other expirations on the same underlyer (say IBM), then try to consolidate all IBM curves into a smooth IBM vol surface. Each “point” on the surface is an implied vol value. I was told some of the more advanced “fitting” math is extracted out into a C++ quant lib.

Instrument pricing has to be fast, not multi-second. I guess this is pre-trade, RFQ bid/offer pricing, similar to bond markets’ bid-wanted. In contrast, the more “real” need for vol surface is position pricing (or mark-to-market), which provides unrealized PnL. I feel this is usually end-of-day, but some traders actually want it real time. Besides the traders on the flow[3]/listed/OTC derivative desks, the vol surface is also used by many other systems such as structured derivatives, which are entirely OTC.

It’s quite hard to be really event-driven since events are too frequent, instruments too numerous, and the pricing algo non-trivial, exactly like FX option real time risk. Instead, you can schedule periodic repricing batches once every few minutes.
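One way to picture the periodic batch is a "dirty set": each market data event only marks an underlier as needing a reprice, and a scheduled task drains the set every few minutes. This is a minimal sketch under my own assumptions, not the desk's actual design. BatchRepricer, onMarketData and runBatch are hypothetical names, and the real pricing call is left as a comment.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: coalesce frequent events into periodic repricing batches.
class BatchRepricer {
    // Underliers touched by market data since the last batch (thread-safe set).
    private final Set<String> dirty = ConcurrentHashMap.newKeySet();

    // Called on every market data event: cheap, no pricing happens here.
    void onMarketData(String underlier) {
        dirty.add(underlier);
    }

    // Drains the dirty set and returns the underliers picked up by this batch.
    Set<String> runBatch() {
        Set<String> batch = new HashSet<>();
        for (String u : dirty) {
            if (dirty.remove(u)) {
                batch.add(u);
            }
        }
        // A real system would reprice every derivative on each underlier in 'batch' here.
        return batch;
    }

    // Runs a batch every periodMinutes; the caller owns the returned scheduler.
    ScheduledExecutorService start(long periodMinutes) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(this::runBatch, periodMinutes, periodMinutes, TimeUnit.MINUTES);
        return ses;
    }
}
```

However many ticks arrive on IBM between two batches, IBM is repriced once per batch, which is the point of giving up on being fully event-driven.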

About 3500 underliers and about 450,000 derivative instruments. Average 100 derivatives on each underlier (100 combinations of strike/tenor). S&P500 has more than 1000 derivatives on it.

Market data vendors: Reuters, Wombat, Bloomberg.

Inputs to vol calculation — product reference (strike/tenor), live market quotes, dividend, interest rate …

One of the most common OTC equity derivatives is barrier option.

Pricing and risk tend to be the most mathematically challenging.

Exchange connectivity is usually c++, client connectivity (clients to send orders or receive market data) is usually java.

[3] Flow means agency trading, mostly for institutional clients. The retail clients are very wealthy; ordinary retail investors won’t use an investment bank. Flow equity derivatives can be listed or OTC.

Spring can add unwanted (unnecessary) complexity

[5] T org.springframework.jms.core.JmsTemplate.execute(SessionCallback action, boolean startConnection) throws JmsException
Execute the action specified by the given action object within a JMS Session. Generalized version of execute(SessionCallback), allowing the JMS Connection to be __started__ on the fly, magically.
Recently i had some difficulties understanding how jms works in my project. ActiveMQ hides some sophisticated stuff behind a simplified “facade”. Spring tries to simplify things further by providing a supposedly elegant and even simpler facade (JmsTemplate etc), so developers don’t need to deal with the JMS api[4]. As usual, spring hides some really sophisticated stuff behind that facade.

Now i have come to the view that such a setup adds to the learning curve rather than shortening it. Quickest learning curve is found in a JMS project using nothing but standard JMS api. This is seldom a good idea overall, but it surely reduces learning curve.

[4] I don’t really know how complicated or dirty it is to use standard JMS api directly!

In order to be proficient and become a problem solver, a new guy joining my team probably needs to learn both the spring stuff and the JMS stuff [1]. When things don’t behave as expected[2], perhaps showing unexpected delays and slightly out-of-sync threads, you don’t know if it’s some logic in spring’s implementation, or our spring config, or incorrect usage of JMS or a poor understanding of ActiveMQ. As an analogy, when an alcoholic-myopic-diabetic-cancer patient complains of dizziness, you don’t know the cause.

If you are like me, you would investigate _both_ ActiveMQ and Spring. Then it becomes clear that Spring adds complexity, not reduces complexity. This is perhaps one reason some architects decide to create their own frameworks, so they have full control and don’t need to understand a complex framework created by others.

Here’s another analogy. If a grandpa (like my dad) wants to rely on email everyday, then he must be prepared to “own” a computer with all the complexities. I told my dad a computer is nothing comparable to a cell phone, television, or camera as a fool-proof machine.

[1] for example, how does the broker thread start, at what time, and triggered by what[5]? Which thread runs onMessage(), and at what point during the start-up? When and how are listeners registered? What objects are involved?

[2] even though basic functionality is there and system is usable

trade booking/capture in the big picture

For a novice who wonders just how important trade-capture is…

b/c (i.e. trade booking/capture) is the #1 essential component of trading systems, if you look across assets. B/c is often the _heart_ of an OTC trading desk or voice trading desk. But not true for trading desks against an exchange/interdealer, because pre-trade apps take center stage, and post-trade flow becomes middle-office.

b/c (along with position/pnl and trade blotter) is the first task to be computerized on wall street.

b/c is the basis of position master ie sub-ledger (often in mainframes), one of the most essential systems in any trading system. Sub-ledger is basis of pnl.

I feel b/c is relatively _low_tech_ compared to market data, low latency and some pre-trade systems. However, I feel in an exchange or a large sell-side firm, execution volume can be high.

I feel b/c demands more precision, more stability, lower error rate, more robustness… than most pre-trade systems. This is because b/c is the point of no return: after an order is executed it can’t be canceled effortlessly.

In a voice trading desk, b/c is actually post-trade, because the trader executes the trade over phone and simply enters data into the books. Remember MTSTradeEngine? In contrast, the fully electronic b/c is not post-trade but sits at the choke point right between pre-trade and post-trade.

FX option trading – a typical arch

Just as in equity options, the core component is risk engine, because positions are large and long-term. See other reasons in my post on option trading systems.

— My hypotheses —
* I guess for both fx option and IRD, core engine is a realtime event-driven position updater (another side of the same coin as risk engine). Each position has a lot of contract attributes and risk attributes, all subject to frequent updates. A typical FX option desk probably has “too many” positions each reacting to a lot of events, but each update is complex and time-consuming.
* In contrast, cash desk has fewer positions and simpler positions.
* In bond trading, any non-flat position is also subject to updates in terms of marking and unrealized PnL, but calc is simpler.
FX option is an OTC market – “no electronic trading” (i guess no ECN either), but there are electronic trade messages in addition to manual trade booking. There’s also a plan to access CME listed FX options. Note this plan is not about fx options on futures, and not about PHLX.

%Q: so is it voice based?
A: various means.

Clearing could be done at the London Clearing House. I guess London is a bigger center than NY.

A lot of “exotic” fx option products come online every year. There’s pressure to automate and speed up new product launch. I would guess 1) position management and 2) booking are among the most essential features needed by any new FX option instrument. System must be able to persist positions in these exotic options. If automatic STP booking is hard, then ops can manually enter them, assuming volume is low on new products.

Volume of Trades – FX options desk gets about 1500 trades/day. In contrast, FX cash desk (includes futures + forwards) gets about 100 times the volume, but profit is perhaps 2 to 3 times that of FX options desk, obviously different margins.

Volume of Positions – FX cash desk keeps most positions flat so very few positions are non-flat. FX options desk has “too many” open positions, a big headache to risk engine.

Entire FX options desk needs about 20 desk-specific developers world-wide. Besides, I guess there are many supporting systems owned by other teams outside the desk. These teams include (not limited to) firmwide teams, probably further away from the profit centers.

FX option trading is more complex than FX cash trading.

Q: Are FX derivatives simpler than equity derivatives?
A: not necessarily. FX involves 2 interest rates. Eq involves dividends.

— system modules owned by dedicated desk developers–
FIX server (perhaps for market data, not e-trading?)
GUI is in Tcl, early versions of C# platform and WPF.
Market data is a major component in FX. Many modules react to market data —
– risk
– pricing
To traders, real time pricing is presumably more important than risk is. I guess they need to send out updated bid/offer. RT pricing uses spot prices (market data) and volatility data for calculation. For any pair of currencies, (every?) market data update could trigger automatic price updates across all strikes and expirations.

Actual option valuation math is in c++/JNI.

Biggest headache in fx option risk engine is performance. FX Option Valuation is slow. FX option position Volume is too large for real time risk update. Instead, the risk “report” system is on-demand and covers a requested subset of the full portfolio, presumably those positions belonging to a trader. Such a report takes a few minutes. If market data has changed by then, report is obsolete.

Risk rollup from trader-level to entity-level to firm-level. There’s an external team responsible for analytics library and they call FX options system’s services to get positions. I guess that external system is a firm-wide analytics or risk engine.

#1 essential component (among the distributed components over 30 servers) in the trading desk is trade capture/booking, written in c++ primarily + some java. There’s some c++ valuation module for FX options. Plan is to slowly phase out c++. Other than that, desk is mostly java.

–core architecture–
Since an option (or any derivative) is not settled right away like cash trades, there’s a _lifecycle_ to each derivative trade. Each derivative trade takes on a life of its own and is subject to many “lifecycle events” like
– origination, cancels, amends/modifications
– knock-in, knock-out
– fixings
– market data effecting risk reassessment

Just like bond repricing engine, this is Service Oriented Architecture – MQ facilitates the event-driven architecture, but there are other ways to pass messages like SOAP over TCP (not http).
1) MQ for high volume messages
2) SOAP for slow, complex processing. Possibly a few trades a day! I guess these are exotic products.
A typical event-driven server here is a socket server, holding a thread pool, started with main(). No container or web server.

WallSt infrastructure — security trading systems

(Another blog. No need to reply.)

A large part of the wall street core infrastructure is built around the (regulated) exchanges and (unregulated) ECN’s, and includes the major trading houses’ trading systems — front and back ends, equities, fixed income, currency and commodities, including risk. Developers in these systems are the backbone of wall street.

I feel less than 50% of my company’s technology staff are application developers. Among them, less than 50% develop apps for real time trading. The rest of the developers support reporting, end-of-day risk, post-trading (like my team), GL, compliance, surveillance, price and other data feeds into trading systems and data feeds out of trading systems, maintain accounts and other reference data, ….

Trading system developers are employed by brokerage houses (aka securities firms), hedge funds, mutual funds, prop traders, exchanges, and many boutique firms(?) On the other hand, retail banking, consumer banking, corporate banking (they all take deposits and give loans) and the advisory business of investment banks don’t have infrastructure to trade securities. I don’t think they have access to the security exchanges. The IPO, M&A, privatization… investment bankers do need some access to trading systems, as they issue securities.

Overall, what percentage of the financial IT people are in the “backbone”? I guess not more than 10%.

essential trading server arch QnA #tibrv,gemfire

Every trading server invariably uses some non-http network daemon. There’s always more than 1 process (JVM, C# or c++) on the server side. In FI/commodities (not low-latency eq) there’s often some MOM daemon such as JMS, tibrv or a gemfire notification daemon. Here are some fundamental questions:

Q22: on top of tcp/udp, what specific network protocol between the server-side and GUI?
A: I have seen rmi and protobuf over tib ems.

Q22a: how about JMS between server and swing? Did we see 160 subscribers on a given topic, due to that many swing installations?

Q33: on top of tcp/udp, what specific network protocol among the server-side processes?
A: I have seen tibrv, JMS, RMI, gemfire data distribution protocols …

Q44: since most trading servers must avoid DB latency, where does the trading data live? In memory?
A: I have seen gemfire, rttp …

Q45: in case of distributed cache (not replicated), how does one cache listener update another node?

Q55: how does the daemon stay alive after main() exits?
A: Look at ion, gemfire, activemq. There’s often at least 1 (1 enough) non-daemon thread that’s stuck in wait()
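A minimal sketch of the answer to Q55, assuming nothing about how ion, gemfire or activemq actually implement it: one non-daemon thread parks in wait(), so the JVM stays up after main() returns. KeepAliveDemo and its methods are hypothetical names of mine. The demo's main() releases the thread after one second only so that the example terminates.

```java
// Hypothetical demo; the class and method names are illustrative only.
class KeepAliveDemo {
    private final Object lock = new Object();
    private boolean shutdown = false;   // guarded by lock
    final Thread keeper;

    KeepAliveDemo() {
        keeper = new Thread(() -> {
            synchronized (lock) {
                while (!shutdown) {     // loop guards against spurious wakeups
                    try {
                        lock.wait();    // parked here for the life of the server
                    } catch (InterruptedException e) {
                        return;         // treat interruption as a shutdown request
                    }
                }
            }
        }, "keep-alive");
        keeper.setDaemon(false);        // non-daemon: the JVM cannot exit while this thread lives
        keeper.start();
    }

    // Releases the keeper thread so the JVM is free to exit.
    void stop() {
        synchronized (lock) {
            shutdown = true;
            lock.notifyAll();
        }
    }

    // Waits up to millis for the keeper thread to finish; returns true if it has exited.
    boolean awaitExit(long millis) {
        try {
            keeper.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !keeper.isAlive();
    }

    public static void main(String[] args) {
        KeepAliveDemo demo = new KeepAliveDemo();
        // A daemon "admin" thread calls stop() after one second so the demo terminates.
        Thread admin = new Thread(() -> {
            try { Thread.sleep(1000); } catch (InterruptedException ignored) { }
            demo.stop();
        });
        admin.setDaemon(true);
        admin.start();
        System.out.println("main() returns now; the JVM stays up until stop() is called");
    }
}
```

In a real server the stop() call would come from a shutdown command or a JMX operation rather than a timer.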

web services j4, features — briefly

web service is an old technology (RPC) given a new lease of life 10 years ago.

* [#1 selling point] cross platform for eg between dotnet frontend and java backend
* loosely coupled
* good for external partner integration. Must be up all the time.
* beats MOM when immediate response is required.
* web service (soap) over MOM? should be feasible. One listener thread for the entire client system — efficiency

design patterns often increase code size

Using nested if and instanceof.., you can implement complex logic in much fewer lines, and the logic is centralized and clearly visible.

If you refactor using OO design techniques, you often create many Types and scatter the logic.

Case in point — Error Memos workflow kernel. We could use factory, polymorphism, interfaces…

Case in point — chain of responsibility

make 2 custom exceptions only — one checked, one unchecked

Lots of interviewers asked me about my exception handling strategy. Here’s a tentative exception object design for a small project — Maintain just 2 custom exception classes.

1) MyCheckedEx – a big custom checked exception (by extending Exception) containing an enum[2] field. Each value represents a specific error condition[1].

2) MyUncheckedEx — a similar thing but unchecked (extending RuntimeException?).

Now the usage:

1) When you want a given error situation to always be handled in every *context* and never ignored, then register it in MyCheckedEx. Every place it can possibly occur, compiler will be your trusted friend to ensure it’s declared and handled explicitly.

2) things you put into MyUncheckedEx are nicely flexible, less strict. Callers (other developers) can choose to catch them or ignore them, silently letting them propagate up the call stack.

What about standard JVM exceptions? You can wrap them into your 2 exceptions. In particular, if you don’t like to handle many of the standard checked exceptions, you can wrap them in MyUncheckedEx. I think Spring wraps lots of Checked exceptions into unchecked. Conversely, you can wrap an unchecked exception into MyCheckedEx.

Another requirement — We often need a hierarchy of exceptions. Gmail uses labels instead of folders, so likewise, shall we declare some marker interfaces so an exception can be labeled as both validation and synchronization and billing exceptions? Will this beat a custom inheritance tree?

[1] a priority — i prefer fine-grained types of exception, for better control
[2] or int. You can use these in switch-case. I don’t like String.
non-priority — exception parent-child family. Unnecessary complexity.
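
Here is a rough sketch of the 2-exception design, assuming MyCheckedEx extends Exception and MyUncheckedEx extends RuntimeException. The ErrorCode values and the ExceptionDemo helpers are hypothetical, made up only to show the usage: a checked error the compiler forces callers to handle, a standard checked exception wrapped into the unchecked one, and a switch on the enum.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical error conditions; a real project would register its own.
enum ErrorCode { INVALID_INPUT, IO_FAILURE }

// The single custom checked exception, carrying a fine-grained code.
class MyCheckedEx extends Exception {
    final ErrorCode code;
    MyCheckedEx(ErrorCode code, String message, Throwable cause) {
        super(message, cause);
        this.code = code;
    }
}

// The single custom unchecked exception, same shape.
class MyUncheckedEx extends RuntimeException {
    final ErrorCode code;
    MyUncheckedEx(ErrorCode code, String message, Throwable cause) {
        super(message, cause);
        this.code = code;
    }
}

class ExceptionDemo {
    // Checked: every caller must catch or declare it, enforced by the compiler.
    static void validateQuantity(int qty) throws MyCheckedEx {
        if (qty <= 0) {
            throw new MyCheckedEx(ErrorCode.INVALID_INPUT, "quantity must be positive: " + qty, null);
        }
    }

    // Wraps a standard checked exception (IOException) into the unchecked one.
    // The reader is deliberately not closed here; the caller owns it.
    static String firstLine(Reader in) {
        try {
            return new BufferedReader(in).readLine();
        } catch (IOException e) {
            throw new MyUncheckedEx(ErrorCode.IO_FAILURE, "read failed", e);
        }
    }

    // Handling code can switch on the enum field.
    static String describe(int qty) {
        try {
            validateQuantity(qty);
            return "ok";
        } catch (MyCheckedEx e) {
            switch (e.code) {
                case INVALID_INPUT: return "rejected";
                default:            return "error";
            }
        }
    }
}
```

Callers of firstLine() are free to ignore MyUncheckedEx, whereas describe() would not compile without the catch block.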

realtime inter-VM communication in front desk trading sys

Inter-VM is our focus.

* [s] MOM — async
** FIX over RV in Lehman Eq
* [s] distributed cache — async?

Above mechanisms notify listeners. Note Listeners are usually async and multi-threaded.

* DB writes by one app, and periodic DB polling by receiving app
* [s] RMI
* [s] EJB? infrequent. I think this is less efficient than MOM
* [s] web service? not sure
* FTP? not real-time but at SOD (startOfDay) and EOD
* email? none

MOM is the clear favorite. Most efficient. Guaranteed delivery.

Within my front office app, RMI, MOM and cache are dominant. Within a related ticketing system (iticket), MOM and RMI are dominant.

DB is an extreme form of synchronous pub/sub.

[s=needs object serialization. cross-VM often requires serializable]

separation of concern: a G3 design priority

For a long time my #1 ideal has been testability. Separation of concern is related. Many other related ideals:

* modularize
* layering — extremely popular modularization pattern
* stable interfaces between modules
* interchangeable parts
* encapsulation of implementation details

But “separation of concern” is the best phrase. Now specific examples in an environment of multiple systems and teams:
* a basic idea — a table which multiple systems can read/write
** use views rather than the base table
** system to call a proc rather than select/update on the underlying table
* MOM (RV or mq)
* SOA service bus (MOM)
* web service
* producer/consumer
* getter/setters rather than public fields
* declare return types and variables as interface types

swing/server using 2 http requests (trading

A third party trading platform (like FXAll, tradeweb, Bloomberg, creditex, espeed, brokerTec) could install a swing GUI on traders’ screens. Every time a trader requests info from the remote server, there are 2 swing threads involved —

– one responsible for synchronously uploading the request message, via an http tunnel through firewall
– another responsible for synchronously downloading the response message – http tunnel. I guess this requires a URL including a requestID.

If server is busy or slow, then the 2nd synchronous call should take place after some delay. Otherwise it would block for a long time.

Q: I wonder how server could push market data to swing? I feel the http tunnel means the http client can’t be a server.
%%A: i guess the client makes periodic requests for updates. This is not really real time.

Studying biz rule tables used on WallSt

Background: many rule based systems (on Wall St or elsewhere) have thousands of rules in database tables. The simplest and most used pattern is the lookup table. A lookup table has one (or more) input columns and one (or more) output columns.

Each combination of input fields maps to exactly one row. There should be a unique composite index on the input columns as a group.
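
To make the lookup-table idea concrete, here is a small in-memory sketch. I assume two input columns and one output column. The column names (assetClass, region, desk) and the class RuleTable are hypothetical, not taken from any real rule table. The putIfAbsent check plays the role of the unique composite index.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup table: (assetClass, region) -> desk.
class RuleTable {
    // The composite key is the list of input column values, in a fixed order.
    private final Map<List<String>, String> rows = new HashMap<>();

    // Rejects a second row for the same input combination, like a unique composite index.
    void addRule(String assetClass, String region, String desk) {
        List<String> key = List.of(assetClass, region);
        if (rows.putIfAbsent(key, desk) != null) {
            throw new IllegalStateException("duplicate rule for " + key);
        }
    }

    // Each input combination maps to at most one row.
    Optional<String> lookup(String assetClass, String region) {
        return Optional.ofNullable(rows.get(List.of(assetClass, region)));
    }
}
```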

The concept is simple, but sometimes needs to be hand-crafted and perfected, because
* these rules are often part of the most important 1% source code in a big code base
* users change rules frequently, at least in some Wall St firms
* we often need to query the tables with complex criteria. Such complex queries are usually implemented in source code, but for quick production support we often need to run them by hand, tweak and rerun. In some cases, we end up learning all the intricacies by heart. If your full time job is a spell-checker you won’t need the dictionary much.
* we often need to explain how rules are disqualified, filtered out, ranked, grouped, applied, overlaid, re-applied…. Many IT veterans would say blow-by-blow knowledge of a code module is needed only in extremely rare situations. Welcome to Wall St, where these business rules are examined and tweaked on a daily basis by multiple groups of users.

Compare with Informatica lookup transform.

1900 tiers of quotes, RFS over FIX, indicative/executable quotes #400w

One of the REAL bottlenecks in a large SELL-side FX dealer system is the tiered pricer. The trigger event could be a market data change. Since such an event could trigger an avalanche of messages, the frequency of such events is not very high, probably below 10 events/second on 1 currency pair. If you have 10 (or 50 or whatever) active currency pairs, then you could get 100 triggers/sec through your entire system.

Once a trigger is activated, pricer computes new bid/ask quotes for Tier 1 Gold clients. Pricer then adds a distinct spread for each tier. For an active pair like EUR/USD, there can be 1900 tiers, so there can be up to 1900 (non-unique) pairs of bid/ask quotes. Typically, the “best” quotes would have a bid/ask spread of 2 to 3 pips, applicable to the best and largest clients. For a *retail* client, it could be 20 – 100 pips.
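
The per-tier step can be sketched as below. I am assuming the tier spread is applied as a symmetric markup in pips, with the bid lowered and the ask raised by the same amount. The real pricer may well skew the two sides differently. TieredPricer, quoteForTier and the tier names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of widening one base quote into per-tier quotes.
class TieredPricer {
    static final BigDecimal PIP = new BigDecimal("0.0001"); // pip size for EUR/USD

    // Returns {bid, ask} widened by markupPips on each side (assumed symmetric).
    static BigDecimal[] quoteForTier(BigDecimal baseBid, BigDecimal baseAsk, int markupPips) {
        BigDecimal delta = PIP.multiply(BigDecimal.valueOf(markupPips));
        return new BigDecimal[] { baseBid.subtract(delta), baseAsk.add(delta) };
    }

    // One trigger event fans out into one quote pair per tier.
    static Map<String, BigDecimal[]> reprice(BigDecimal baseBid, BigDecimal baseAsk,
                                             Map<String, Integer> markupByTier) {
        Map<String, BigDecimal[]> quotes = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : markupByTier.entrySet()) {
            quotes.put(e.getKey(), quoteForTier(baseBid, baseAsk, e.getValue()));
        }
        return quotes;
    }
}
```

With a base quote of 1.1000/1.1002 and a 10-pip markup per side, the tier quote becomes 1.0990/1.1012, i.e. a 22-pip spread, in the range quoted above for retail.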

Core of the tiered pricer is a Drools rule engine.

Another module is the messaging engine using Nirvana by My-Channels. If a particular bid/ask quote applies for all (say, 20) tiers in Silver category, then pricer broadcasts the quote to a topic like Quote.EURUSD.Silver. This is a kind of alias for 20 different “tier” topics. For efficiency, this is probably multicast.

In the worst case, one event can trigger an avalanche of 1900 messages for one currency pair alone.

Last module is the FIX engine. Quotes often go out the door in FIX format, just as RFQ. Now there are 1900 tiers but the number of clients could be higher or lower than 1900. If there are more than 1900 clients and if all of them subscribe to our quote, then each must be sent the quote. I know a Chicago prop trading firm (Gelber?) that subscribes to a lot of “bank feeds”.

The most demanding type of subscription is a RequestForStream (RFS). A typical RFS could ask for a stream of EURUSD quotes for 10 (up to 120) minutes, during which time all quotes must be delivered.

Unlike RFQ, RFS requires special approval. The bid/ask quotes in an RFS can be indicative or *executable* (similar to Firm, but see separate blog post). If a client hits an executable bid or lifts an executable offer, then the trade is considered executed, though I believe cancellation is still a possibility, just like any Firm quote in bonds.

How does a dealer make sure he has enough position to honor the quote? Perhaps by setting aside reserve quantities, or by monitoring the open market.

Unlike bidwanted systems (non-negotiable quotes), it’s possible for a client to negotiate on our quote electronically, though I feel manual negotiation is more practical.

constructor to throw checked exception

Hi XR,
This is just a java learner’s personal opinion: I feel constructors should avoid throwing checked exceptions. Traditional if/while/switch tests are more natural, more convenient, and more maintainable than throwing exceptions. Only in a few special scenarios should a constructor throw a checked exception.
My reasons against checked exceptions thrown from a constructor? My answer is long. A constructor never returns null; when it throws, the new-expression is abandoned and the variable being assigned keeps its previous value, typically null. So the caller method often has to 1) test for null before using the new object. The caller also needs to 2) try-catch the exception. Of course the constructor must 3) detect the invalid input before throwing. At least 3 tests.
Whenever possible, I would move Test #3 out of the constructor, so the constructor doesn’t throw checked exceptions and the caller’s variable is not left null by a failed construction. How to achieve this? One standard solution is a factory. Factory does Test #3 and handles the invalid condition. Factory may return null, so the caller still needs Test #1. So 2 tests to perform.
This leads to an even simpler solution. If possible, the caller method should detect invalid input (Test #3) before passing it to the constructor/factory. 1 test only.
I guess this is rudimentary. Somehow, after programming java for years, I don’t always see the simplest solution. I think there are a lot of java design patterns, java idioms and common java constructs, such as throwing checked exceptions from constructors. I learnt these and didn’t know when NOT to apply them.
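
A small sketch of the factory variant and the caller-validates variant, under my own assumptions. DemoAccount, AccountCaller and the validity rule (non-empty id) are hypothetical, chosen only to count the tests.

```java
// Hypothetical domain class; the constructor trusts its input and throws nothing checked.
class DemoAccount {
    final String id;

    DemoAccount(String id) {
        this.id = id;
    }

    // Test #3, kept outside the constructor. The rule itself is made up for illustration.
    static boolean isValidId(String id) {
        return id != null && !id.isEmpty();
    }

    // Factory variant: does Test #3 itself and may return null,
    // so its caller still needs Test #1 (the null check).
    static DemoAccount create(String id) {
        return isValidId(id) ? new DemoAccount(id) : null;
    }
}

class AccountCaller {
    // Caller-validates variant: a single test, no try-catch, no null check.
    static String open(String id) {
        if (!DemoAccount.isValidId(id)) {
            return "rejected";
        }
        return "opened " + new DemoAccount(id).id;
    }
}
```

The trade-off is that a constructor which trusts its input can be misused by a careless caller, which is why I keep the validity rule next to the class as a static method.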

database access control solutions for enterprise

solution: deny access to command line tools like sql+ or sqsh, deny access via standard windows clients like toad, aqua studio. Require every interactive user to access via a browser.

solution: custom api for java clients. block access via jdbc.

Solution: views to limit access to subset of rows, subset of columns or derived columns. deny access to underlying tables.

Solution: sproc and function to limit access. deny direct access to underlying tables. I think this is the most flexible.

from a database row to an object

“a datatype with methods”. In a longer sentence, “a datatype with specific operations defined for it”

For example, a “student” datatype has fields representing the object’s state and operations appropriate for a student such as enroll(courseID), payFees()….

This is a good example of the simplest type of class — a data class.

Another short answer — “a C struct with methods”

what realworld entity a class represents

The Challenge: given a complex class, how do i quickly find out what kind of realworld thingy, if any, it models? I hope this post offers practical, quick solutions.

Conversely, a good OO-design can help show what kind of entity the class represents. Beside naming and comments, a designer can consider the list below.

Tip: Occasionally, you notice a field of a collection type. The entities inside the collection are the realworld thingy represented by this class. Common design pattern. Perhaps a simple pattern, but NOT obvious!

Tip: See if it has relatively few instance variables. In a good design, instance variables collectively represent object *state*.

Tip: See if the constructors are simple and meaningful. Some constructors receive key argument(s) that reveal the role of this class among the /domain-objects/.

Tip: See if the class advertises an obvious “service” method, perhaps supported by helper-methods

Tip: always see the base types.

Tip: getters and setters are an obvious clue.

rule compilation in java #basic learning notes

a rule-compiler compiles rule-source-code into executable-rules. You deploy executable-rules to the JVM just as any other java class.

rule-compiler contains a rule-parser.

executable-rule files have an encoding format to hold the condition/action.

rule-compilation is an initialization overhead to minimize, perhaps by caching.

Q: How does this compilation affect the edit-cycle?

rule-engine vs rules

If you remember one knowledge pearl about business rule engines, i think you may want to remember the relationship between rule engine and rules. I think these are the 2 main entities to *deploy* to an ent app.

Verizon’s circuit fault-isolator is a typical enterprise application using JRules. Think of your ent app as a host-app (or user or caller) of the rule stuff. There is quite a lot of rule stuff to *deploy* and you will soon realize the 2 main thingies are the (A) generic rule-engine and (B) your rules.

– The rule-engine is written by ILOG (or JBoss or whoever) but the rules are written by you.
– Rule Engine is a standard, generic component but the rules are specific to your business.
– The rule-engine is first and foremost the interpreter of your rules

* a good analogy is found in XSL transformer vs xsl stylesheet. Your host application needs to load both of them into memory
* A similar relationship exists between spring the framework and the spring-beans you create.

cache size control in trading engines#%%ideas

My answer in 2007 – separate thread to monitor the size and send out alerts. JMX-based clean-up operation.

Now I feel we can take the swing EDT approach. “EDT” here means some kind of separate thread/data-structure for cache size control — a watchdog.

Each insert/delete operation on any thread would send a msg to the watchdog queue. Watchdog can decide how many messages to drop before it decides to take a look. If size is close to the limit, then it could decide to be more vigilant. Once limit is hit, watchdog could turn on some flag in the cache system to remind the inserters.

But how does watchdog keep the size within limit? It has to remove the least recently accessed item. A priority queue ordered by last-accessed-time might be good. See post on LRU for a java builtin solution.

More importantly, the app has to be tolerant of involuntary loss of cache items. “If cache doesn’t have what I inserted, i will insert again.”
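
For reference, the java builtin route is LinkedHashMap in access order with removeEldestEntry overridden. The sketch below is that builtin approach, not the watchdog-thread design described above. BoundedCache and getOrReload are hypothetical names. The class is not thread-safe as written, so a trading engine would need to guard it with a lock or Collections.synchronizedMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical size-bounded LRU cache built on LinkedHashMap.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int limit;

    BoundedCache(int limit) {
        super(16, 0.75f, true);   // true = access order, so get() refreshes an entry
        this.limit = limit;
    }

    // Called by put(); returning true evicts the least recently accessed entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > limit;
    }

    // Tolerates involuntary loss: if the item is gone, load it and insert again.
    V getOrReload(K key, Function<K, V> loader) {
        V value = get(key);
        if (value == null) {
            value = loader.apply(key);
            put(key, value);
        }
        return value;
    }
}
```

With a limit of 2, inserting a third item silently drops whichever of the first two was touched least recently, and getOrReload() brings it back on demand.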

batch jobs in financial trading system

According to a friend in investment banking technology, ALL trading systems need batch jobs to complement online applications. I think MQ applications are a third type. Typical batch:

* save in DB historical volume/day-high/day-low/day-open/day-close … — “open” information open to the public

* save in DB all market players’ trades. That’s my own terminology referring to “our own” hedge funds, other firms’ hedge funds, other firms’ traders… Our own traders’ activities are probably captured during transaction — no batch required.

* We (the brokerage) may also have large institutional clients whose data need to be recorded in DB. Such data may need batch processing if it is not recorded automatically.

which methods to make static — FTTP perspective

Key differences between static ^ instance methods

* static methods have no “this”, so they can directly access only static fields and static methods of this class (instance members need an explicit object reference), but many important “members” are non-static
* “public static” methods serve as “services” to clients (and deserve public service awards). Example: Arrays.*, Collections.*, Math.*
* static methods can run before the constructor ==> Many common operations in a class must be static — main(), getInstance()
* abstract method can’t be static. See other posts for the complicated reasons.
* overriding^hiding
* “synchronized” has different meanings
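
The last two bullets can be shown in a few lines. A static method is hidden, not overridden, so the call is resolved by the declared type of the reference. A static synchronized method locks the Class object, while an instance synchronized method locks "this". The class names below are hypothetical.

```java
class Base {
    static String kind() { return "Base.static"; }
    String name() { return "Base.instance"; }
}

class Derived extends Base {
    static String kind() { return "Derived.static"; }       // hides Base.kind()
    @Override String name() { return "Derived.instance"; }  // overrides Base.name()
}

class StaticDemo {
    // The static call binds to the declared type (Base); the instance call
    // dispatches on the runtime type (Derived).
    static String viaBaseRef() {
        Base b = new Derived();
        return b.kind() + " / " + b.name();   // compiles, usually with a warning on b.kind()
    }

    // static synchronized: the monitor is the Class object.
    static synchronized boolean staticLockIsClass() {
        return Thread.holdsLock(StaticDemo.class);
    }

    // instance synchronized: the monitor is this object.
    synchronized boolean instanceLockIsThis() {
        return Thread.holdsLock(this);
    }
}
```

viaBaseRef() returns "Base.static / Derived.instance", which is the whole difference between hiding and overriding.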

CurrencyHoliday DB table

To my surprise, there’s actually a CurrencyHoliday table in the FX trading system. Traders/dealers are alerted to holidays that impact their trade’s settlement.

FX Forward traders get a calendar automatically adjusted based on this table, to show the value dates of T/N, S/N, 1m, 2m etc

The related TblCurrencyPair table holds standard lot size and settlement convention (like T+1 for CAD)
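
A simplified sketch of how a holiday table could feed a value date calculation. I assume the holidays for both currencies of the pair have already been merged into one set, and that the settlement lag (T+1, T+2) is a plain count of business days. Real FX value date rules have more special cases than this. ValueDateCalc is a hypothetical name.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Set;

// Hypothetical value date helper driven by a set of currency holidays.
class ValueDateCalc {
    static boolean isBusinessDay(LocalDate d, Set<LocalDate> holidays) {
        DayOfWeek w = d.getDayOfWeek();
        return w != DayOfWeek.SATURDAY && w != DayOfWeek.SUNDAY && !holidays.contains(d);
    }

    // Moves forward from tradeDate until 'lag' business days have been counted.
    static LocalDate addBusinessDays(LocalDate tradeDate, int lag, Set<LocalDate> holidays) {
        LocalDate d = tradeDate;
        int remaining = lag;
        while (remaining > 0) {
            d = d.plusDays(1);
            if (isBusinessDay(d, holidays)) {
                remaining--;
            }
        }
        return d;
    }
}
```

For a trade on Friday 2024-03-01 with a T+2 convention, the value date is Tuesday 2024-03-05, or Wednesday 2024-03-06 if Monday 2024-03-04 appears in the holiday set.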

tiers ] ports^1-jvm

Refer to the overview post on ports^1-jvm.

Suffering from the same abuse-of-terminology as “server”, the word “tier” can now refer to not only separate-jvm but also to 1-jvm AR. Consider

“data tier” —
“dao tier” — basically the same as “dao modules”, as a layer of abstraction
“orm tier”
“object tier”
“presentation tier” — jsp, struts views, usually within 1-jvm

server/client mean — ports^1-jvm

a “server” is traditionally a separate jvm or unix process but nowadays occasionally can refer to a method (or set of methods) your “client objects” can call within a container.

In the same vein, a “client” used to mean a number — a process id (or thread), often on a different host, but now in Java literature it often means a “caller of a method”. A caller of a method may or may not be an object in memory, but always refers to some “calling context” to be altered by the method.

Every OO student would eventually come to realize when to say “client” and what that implies.

The j2ee community seems to pay little attention to the question “same or separate jvm?”

A “service” is often provided by the container.

The “server” usually provides some utility service like water and electricity.


Perhaps not a good example. jndi is obviously a container service. It can be filesystem-based.

ports^1-jvm to decouple web tier

2 common architectures to decouple any object-oriented web (or non-web) system. Master them. Don’t try to add a 3rd architecture to overload your memory.

— A) ports ie tcp ports. You Separate a chunk of java code into another unix-process with a port. You end up with a “tier” in a multi-tier AR. Beware “tier” can now refer to single-jvm modules too, in an abuse of terminology.

connection pools @@

Examples: EJB, ActiveMQ, CPF Single-sign-on-server, crystal-report-server, web services

— B) single-jvm solution. Instantiate intelligent components from an off-the-shelf or /3rd-party/ (3p) jar
thread issues
In a web tier, the 3p objects could be too big to re-create ==> put in session

Examples: struts, spring, hibernate, nanoXML, log4j, shopping cart

Example: the M in MVC could be a 3rd-party module (shopp`cart) or even a legacy ERP, but almost always there are some M-classes within the MVC jvm.

capacity management, a Unix /perspective/

aim for simultaneous saturation and eliminate bottleneck? More for perf tuning than cap management

“Capacity” is largely (don’t /sweat/ it) about “resources”.

— can’t add resource?
identify critical resources (bandwidth, simultaneous oracle conn, disk throughput..)
collect usage pattern esp. peak usage for each resource
increase effi 4 each resource ie reduce wastage
Identify — most of the time, perf is cpu-bound, mem-bound, disk io-bound, network io-bound … Same for a Weblogic server

— can add resources?
follow the same suggestions above
do cap plann`
do load forecast

perf techniques in T J W’s project–ws,mq,tx

Q: request wait-queuing (toilet queue)? I know weblogic can configure the toilet queue
A: keep the queue entries small. we only keep object id while the objects are serialized to disk (?!)

Q: is 1kB too large?
A: no

q: most common cause of perf issue?
A: mem leak. still present after regression test

q: jvm tuning?
A: yes important, esp mem related

q: regression test?
a: important

q: perf tools?
a: no tools. primarily based on logs. eg. track a long-running
transaction and compute the duration between soap transaction start
and end.
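
A log-based duration check like the one described could be sketched as below. The log line format ("timestamp START|END txId") is entirely hypothetical, since I don't know what the project's logs looked like. The sketch only shows the idea of pairing start and end entries per transaction.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TxDurationDemo {
    // Assumed log format: "<ISO local timestamp> <START|END> <txId>"
    static Map<String, Duration> durations(List<String> lines) {
        Map<String, LocalDateTime> started = new HashMap<>();
        Map<String, Duration> result = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) continue;              // ignore unrelated lines
            LocalDateTime ts = LocalDateTime.parse(parts[0]);
            if (parts[1].equals("START")) {
                started.put(parts[2], ts);
            } else if (parts[1].equals("END")) {
                LocalDateTime begin = started.remove(parts[2]);
                if (begin != null) result.put(parts[2], Duration.between(begin, ts));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
            "2024-01-01T10:00:00 START tx42",
            "2024-01-01T10:00:42 END tx42");
        System.out.println(durations(log).get("tx42").getSeconds()); // prints 42
    }
}
```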

Q: web services?
A: Many of the transactions are based on soap, axis. TCP monitor
can help with your perf investigation.

Q: tx?
A: yes we use two phase commits. Too many transactions involved.
really complex biz logic. Solution is async.

Q: multi-threaded?
A: handled by weblogic.

Q: how is the async and queue implemented?
A: weblogic-mq with persistent store, crash-proof

create functionality +! jvm restart]strategy pattern

The Strategy pattern allows you to define a family of interchangeable algorithms, to be selected at runtime. In extreme circumstances, a new algorithm must be created and added immediately, without a jvm restart. This would be a higher level of flexibility.

Perhaps the highest level of flexibility is offered by a DB containing classnames in the “family”. After you create a new algorithm, you insert its classname into the DB.

Class<?> c = Class.forName("com.myPackage.Myclass");
Thing t = (Thing) c.getDeclaredConstructor().newInstance();
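
A slightly fuller sketch of the same idea follows. The Thing interface, the UpperCaseThing class and the load() helper are hypothetical. In a real system the class name string would come from a DB query, while here it is obtained in-process just to keep the example self-contained. Note this only covers instantiation by name. Getting a freshly compiled class onto the classpath of a running jvm is a separate problem not shown here.

```java
class StrategyLoader {
    // Instantiate a strategy from its class name via its no-arg constructor.
    static Thing load(String className) {
        try {
            Class<?> c = Class.forName(className);
            return (Thing) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load strategy " + className, e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a classname read from the DB "family" table.
        String nameFromDb = UpperCaseThing.class.getName();
        Thing t = load(nameFromDb);
        System.out.println(t.apply("abc")); // prints ABC
    }
}

// Hypothetical strategy interface shared by the whole family.
interface Thing {
    String apply(String input);
}

// One hypothetical member of the family.
class UpperCaseThing implements Thing {
    public String apply(String input) { return input.toUpperCase(); }
}
```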

Also see detailed sample code in

runtime change to object behaviour

[[ head first design patterns ]] repeatedly favors *runtime* change to program functionality, rather than compile-time ie source code change. I assume they have a *practical* reason instead of a doctrine.

Related concepts: Strategy pattern, Decorator pattern,

When we need to change from an old functionality to a new functionality, a good approach is
* we try to create a new functionality class, if at all possible,
* at runtime, use existing setters to assign the new functionality, replacing the old, when needed.
* minimize edits to existing, tested classes
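
The three steps above can be sketched as follows. All names (PricingRule, Quoter, HalfPriceRule) are invented for illustration. The point is only that the new behaviour lives in a new class and is plugged in through an existing setter, so the tested Quoter class is not edited.

```java
class SetterSwapDemo {
    public static void main(String[] args) {
        Quoter q = new Quoter(new FullPriceRule());
        System.out.println(q.quote(100.0)); // prints 100.0
        q.setRule(new HalfPriceRule());     // runtime swap, no edit to Quoter
        System.out.println(q.quote(100.0)); // prints 50.0
    }
}

// The "functionality" abstraction; existing and new behaviours implement it.
interface PricingRule {
    double price(double base);
}

// Old functionality, already tested.
class FullPriceRule implements PricingRule {
    public double price(double base) { return base; }
}

// New functionality, added as a new class.
class HalfPriceRule implements PricingRule {
    public double price(double base) { return base * 0.5; }
}

// Existing, tested class. It already exposes a setter for its rule.
class Quoter {
    private volatile PricingRule rule;
    Quoter(PricingRule rule) { this.rule = rule; }
    void setRule(PricingRule rule) { this.rule = rule; }
    double quote(double base) { return rule.price(base); }
}
```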

See also post on [[ create functionality without jvm restart]strategy ]]

I think this probably incurs least-impact to existing, tested functionalities.
=> regression test@@ no need
=> Low stress for fellow developers, managers, clients, internal users and any non-techies.
=> no need to worry “Did we miss any other existing classes that need edit?”
( documentation on interdependencies is crucial but often neglected by developers. )

batch feature wishlist

[x = lesser-known but fairly regular requirement in my experience]
A “record” means one of a (potentially large) number of input data to be processed

* [x] step-by-step manual confirmation, each with a single keystroke. Just like rm -i
* skip certain steps
* reshuffle some steps — arguably tapping on one of the strengths of interpreted languages.
* [x] re-run a certain step only
* share codebase with other on-going projects, to avoid forking and ease maintenance
* persistent xml config + command-line config
* be nice (Unix terminology) to other processes. Batch jobs can quickly eat up shared resources.

— infrastructure support needed, because standard batch languages can’t
* self-profiling and benchmarking on the batch application, to record time/mem/DB/bandwidth… usage for performance analysis
* scheduled retry or manual retry
* “easy” multi-threading (with data sharing) to exploit multi-threaded processors like our T2000’s 32 kernel threads. Multi-threading is non-trivial, esp. with data sharing. Many batch developers won’t have the time/expertise to create it or test it. Infrastructure support could lower the barrier and bring multi-threading to the “masses”
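
As a rough idea of what “easy” multi-threading support could look like in Java, here is a sketch built on the standard java.util.concurrent thread pool. The processAll() helper and the toy per-record work (squaring a number) are assumptions. The shared total is an AtomicLong so that data sharing between threads stays safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

class BatchPoolDemo {
    // Process every record on a fixed pool and accumulate a shared result.
    static long processAll(List<Integer> records, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong total = new AtomicLong(); // shared data, updated atomically
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Integer r : records) {
                futures.add(pool.submit(() -> { total.addAndGet((long) r * r); }));
            }
            for (Future<?> f : futures) f.get(); // wait; surfaces task failures
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of(1, 2, 3, 4), 4)); // prints 30
    }
}
```

The batch developer only supplies the per-record work. Pool size, waiting and error propagation sit in the infrastructure helper.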

Re: NextGen server mean time to failure@@

(A draft email) Hi,

Thanks for your quick reply. Sorry I’m unable to give any suggestion. Just some nagging worries. I’m trying to be critical yet objective.

My experience suggests that many java-based daemons are fairly susceptible to degradation with a concurrent load level high enough. Similar to denial-of-service attacks.

I’m not easily convinced that any piece of software (including my favorite — apache httpd) can keep up performance without restart for a few months under heavy load. For example, over 20 years solaris went through continuous improvements in terms of self-healing, daemon/service availability — a clear sign that the system can sustain “injuries” and lose performance. If it can happen to OS, what is immune?

I remember Siva told me the FTTP workload could be quite high and it’s not easy to handle that load. I think he said a few thousand cases a day. Will keep us busy:)

tan bin

app design in a fast-paced financial firm#few tips

#1 design goal? flexibility (for change). Decouple. Minimize colleagues’ source code change.

characteristic: small number of elite developers in-house (on wall street)
-> learn to defend your design
-> -> learn design patterns
-> automate, since there isn’t enough manpower

characteristic: too many projects to finish but too few developers and too little time
-> fast turnaround

characteristic: reputation is more important here than other firms
-> unit testing
-> automated testing

characteristic: perhaps quite a large data volume, quite data-intensive
-> perhaps “seed” your design around data and data models

characteristic: wide-spread use of stored proc, but many java designs aren’t designed to work well with stored proc. Consider hibernate.
-> learn coping strategies

characteristic: “approved technologies”
characteristic: developers move around
-> maintenance left to other guys
-> documentation is ideally “less necessary” if your design is easy to understand
-> learn documentation tools like javadoc

transparency ] j2ee AR

warning: “transparent” has 2 unrelated meanings in java.

[[better, lighter faster java]]

Key concept: coupling. The tighter, the less transparent
Key concept: put “peripherals SERVICES” out of the DOMAIN MODEL
– persistence service
– messaging service
– tx service
– sec service,
– serialization service
– printing service
– email service

For example, tight coupling between a serialization service and the domain model means “the service is CUSTOMIZED for this biz”. Changes to the domain
model require changes to the service.

For example, look at persistence. A transparent persistence SERVICE
persists any, yes any, object.

For example, look at serialization SERVICE, which serializes any, yes
any, object

Most imp technique –> reflection <–
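
To make the reflection point concrete, here is a minimal sketch of a “transparent” service that renders any object as text without knowing its class. ReflectiveDump and Trade are hypothetical names. A real serialization or persistence service would also handle nested objects, inheritance and cycles, which this sketch ignores. Fields are sorted by name because the reflection API does not promise an order.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

class ReflectiveDump {
    // Works on any object: the service has no compile-time link to the domain model.
    static String dump(Object o) {
        Field[] fields = o.getClass().getDeclaredFields();
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringJoiner sj = new StringJoiner(", ", o.getClass().getSimpleName() + "{", "}");
        for (Field f : fields) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            f.setAccessible(true); // read private fields too
            try {
                sj.add(f.getName() + "=" + f.get(o));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return sj.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump(new Trade("USD/CAD", 5))); // prints Trade{lots=5, pair=USD/CAD}
    }
}

// A hypothetical domain class; it knows nothing about the dump service.
class Trade {
    private final String pair;
    private final int lots;
    Trade(String pair, int lots) { this.pair = pair; this.lots = lots; }
}
```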

Q: what u already understand (LJ) transparency@@
A: see-through. readable logic. an extra layer or functionality should not
impede overall AR readability

declarative control — enterprise design pattern

justification: reduce source change, which can introduce bugs

justification: reduce test effort

justification: Maintain users’, bosses’ and colleagues’ confidence. Confidence that the source didn’t change, so existing functionalities aren’t affected.

justification: slightly better man-day estimate compared to hacking source code

justification: Adaptable, Flexible

justification: more readable than source code

justification: something of a high-level documentation, well-structured

Examples: DD, spring config file, struts config file, hibernate config file.

I feel this is a habit (unit test is another habit). Initially it’s not easy to apply this idea. A lot of times you feel “not applicable here”, only to witness others applying it here. Easier to justify for component-based, inter-dependent modules. Other projects may find declarative control an overkill, and may opt for a properties file.

Q: Alternative to declarative?
A: The information must move into some place, usually source code. How about a properties file?
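
The properties-file alternative can be sketched like this. The key name retry.limit and the default of 3 are made up. The text is parsed from a string only to keep the example self-contained, whereas a real application would load it from a file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class DeclarativeDemo {
    // Parse properties text; in production this would come from a file.
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Behaviour is controlled by config, with a default when the key is absent.
    static int retryLimit(Properties p) {
        return Integer.parseInt(p.getProperty("retry.limit", "3"));
    }

    public static void main(String[] args) {
        System.out.println(retryLimit(parse("retry.limit=5\n"))); // prints 5
        System.out.println(retryLimit(parse("")));                // prints 3
    }
}
```

Changing the retry limit is then an edit to the config, not to tested source code.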

Q: is this a design pattern?
A: Purists tend to avoid the term. The idea applies to both OO and non-OO designs.

biz rules ] DB

What are business rules? They are set by the business. These guys have written rules. Don’t ask me exactly what qualifies and what doesn’t qualify as a business rule. Business rules can be implemented in java, javascript or batch.

Many business rules are best implemented inside the DB. Reason? The concept of biz rule is popularized and heavily influenced by DB industry, vendors and practitioners. Most things that /pass as/ biz rules are defined in terms of DB records (real world objects represented by records). As a result, these biz rules can be and often are best described, saved, encoded in a DB format.

Below are just a few buzzwords, not meant to be an orthogonal, mutually exclusive list of things.

– unique constraint — eg: member id must be unique
– not-null — eg: “We can’t leave this field blank”
– RI — May not be a rule set by business, but closely related to other business rules. eg: “When this salesperson resigns, all her customers must be assigned a replacement salesperson.”
– check constraint — Can be complex. I think (??? confirmed) they should be applied at modification time.
– triggers — can implement RI, check constraints,
– – > input-validation trigger is an important, well-defined type of trigger
– derived data — insert or update “derived data” via triggers, to let java classes select them without “deriving”. The derivation formula contains business rules.
– authorization and access control via views and stored-procs. May not qualify as business rules.
– stored proc — most flexible. Can implement the most complex rules set by business, involving multiple objects.
– – > multi-table correlated modification via stored programs
– cascade delete
– views — can contain business rules in the view’s definition query. eg: “This class of users can only read/modify this subset of data — not those protected columns or irrelevant rows. They should always see the details of each purchase — by a table join.”

An architect should learn this list of techniques. Move business rules from java classes into DB whenever possible, to reduce the complexity of java classes. A large system usually has 60-90% of the business logic implemented in application source code (like java). That’s too much to manage. It’s good to move some to javascript or DB.

[[ pl/sql for dummies ]] advocates putting most “business logic” in DB rather than java. The most complex business logic would need big guns like
* procedures
* functions
* triggers
* complex views, perhaps containing functions in their definitions and having instead-of triggers defined on them.