Q1: pros and cons of vector vs linked list?
Q1b: Given a 100-element collection, compare performance of … (iteration? Lookup?)
Q: UDP vs TCP diff?
%%A: multicast needs UDP.
Q: How would you add reliability to multicast?
Q: How would you use tibco for trade messages vs pricing messages?
Q5: In your systems, how serious was data loss in non-CM multicast?
%%A: Usually not a big problem. During peak volatile periods, messaging rates could surge 500%, and data loss would get worse.
Q5b: how would you address the high data loss?
%%A: test with a target message rate. Beyond the target rate, we don’t feel confident.
Q7: how is order state managed in your OMS engine?
%%A: if an order is half-processed and pending the 3rd reply from ECN, the single thread would block.
Q7b: even if multiple orders (for the same security) are waiting in the queue?
%%A: yes. To allow multiple orders to enter the “stream” would be dangerous.
Now I think the single thread should pick up and process all new orders and keep all pending orders in cache. Any incoming exchange messages would join the same task queue (or a separate task queue) – the same single thread.
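The revised design above (one thread, one task queue, pending orders parked in a cache) could look roughly like the sketch below. Every name in it (SingleThreadOms, Event, REPLIES_NEEDED) is hypothetical, and the rule that an order completes after 3 exchange replies is only an assumption borrowed from the "3rd reply" example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative sketch only; not taken from any real OMS. */
final class SingleThreadOms {
    // Assumption: an order is complete after 3 exchange replies.
    static final int REPLIES_NEEDED = 3;

    /** Either a new order or an exchange reply for an existing order. */
    static final class Event {
        final String orderId;
        final boolean isNewOrder;
        private Event(String orderId, boolean isNewOrder) {
            this.orderId = orderId;
            this.isNewOrder = isNewOrder;
        }
        static Event newOrder(String id) { return new Event(id, true); }
        static Event reply(String id) { return new Event(id, false); }
    }

    // New orders and exchange replies share one queue.
    private final BlockingQueue<Event> tasks = new LinkedBlockingQueue<>();
    // Pending-order cache: orderId -> replies seen so far.
    // Touched only by the engine thread, so a plain HashMap is enough.
    private final Map<String, Integer> repliesSeen = new HashMap<>();

    /** Safe to call from any thread (order gateway, exchange listener). */
    void submit(Event e) { tasks.add(e); }

    /**
     * Processes everything currently queued. A real engine would instead
     * loop on tasks.take() in its one dedicated thread.
     */
    void drain() {
        Event e;
        while ((e = tasks.poll()) != null) {
            handle(e);
        }
    }

    /** Never blocks waiting for a reply: it only updates the cache. */
    private void handle(Event e) {
        if (e.isNewOrder) {
            repliesSeen.put(e.orderId, 0);
        } else {
            // Returning null removes the order once all replies have arrived.
            repliesSeen.computeIfPresent(e.orderId,
                (id, seen) -> seen + 1 >= REPLIES_NEEDED ? null : seen + 1);
        }
    }

    int pendingCount() { return repliesSeen.size(); }
}
```

Because the thread never waits on a specific reply, a second order for the same security is accepted while the first is still pending, and state stays consistent without locks since only one thread touches the cache.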
3 main infrastructure teams
* exchange connectivity – order submission
* exchange connectivity – pricing feed. I think this is incoming-only, probably higher volume. Probably similar to Zhen Hai’s role.
* risk infrastructure – no VaR mathematics.
http://solacesystems.com/news/fastest-jms-broker/ says the solace JMS broker (Solace Message Router) supports 100,000 messages per second in persistent mode and 10 million in non-persistent mode. A more detailed article, http://solacesystems.com/solutions/messaging-middleware/jms/ shows 11 million 100-byte non-persistent messages.
A major sell-side’s messaging platform chief said his most important consideration was the deviation of peak-to-average latency and outliers. A small amount of deviation and (good) predictability were key. They chose Solace.
In all cases (Solace, Tibco, Tervela), hardware-based appliances *promise* at least a 10-fold boost in performance compared to software solutions. Latency within the appliance is predictably low, but the end-to-end latency is not. Because of the separate /devices/ and the network hops between them, the best-case latency is in the tens of microseconds. The next logical step is to integrate the components into a single system to avoid all the network latency and intermediate memory copies (including serializations). Solace has demonstrated sub-microsecond latencies by adding support for inter-process communications (IPC) via shared memory. Developers will be able to fold the ticker feed function, the messaging platform, and the algorithmic engine into the same “application”, and use shared memory IPC as the data transport (though I feel a single-application design needs no IPC).
For best results you want to keep each “application” on the same multi-core processor, and nail individual application components (like the feed handler and algo engine) to specific cores. That way, application data can be shared between the cores in the Level 2 cache.
 Each “application” is potentially a multi-process application with multiple address spaces, and may need IPC.
Benchmark — Solace ran tests with a million 100-byte messages per second, achieving an average latency of less than 700 nanoseconds using a single Intel processor. As of 2009, OPRA topped out at about a million messages per second. OPRA hit 869,109 mps (msg/sec) in Apr 2009.
Solace vs RV appliance — Although Solace already offers its own appliance, that appliance runs other messaging software. The Tibco version runs Rendezvous (implemented in ASIC+FPGA), providing a clear differentiator between the Tibco and Solace appliances.
Solace 3260 Message Router is the product chosen by most Wall St. customers.
http://kirkwylie.blogspot.com/2008/11/meeting-with-solace-systems-hardware.html provides good tech insights.
Update – Si-Valley also needs elites – small number of expert developers.
Financial (esp. trading) IT feels like an elite sector – small number of specialists
– with multi-skilled track record
– familiar with rare, specialized tools — JNI, KDB, FIX, tibrv, sockets, sybase
– also, many mainstream tools used in finance IT are used to an advanced level — threading, memory, SQL tuning, large complex SQL
If you compare the track record and skills of a finance IT guy with a “mainstream” tech MNC consultant, the finance guy probably appears too specialized.
That’s one psychological resistance facing a strong techie contemplating a move into finance. It appears risky to move from mainstream into a specialized field.
Culprit: all threads in the pool are blocked in wait(), lock() or …
Culprit: bounded queue is full. Sometimes the thread that adds a task to the queue is blocked while doing that.
Culprit: in some systems, there’s a single task dispatcher thread like the swing EDT. That thread can sometimes get stuck.
Suggestion: dynamically turn on verbose logging in the messaging module within the engine, so it always logs something to indicate activity. It’s like the flashing LED in your router. You can turn on such logging by JMX.
Suggestion: for tibrv, you can easily start a Windows tibrv listener on the same subject as the listener inside the trading engine. This can reveal activity on the subject.
swing trader station + OMS on the server-side + smart order router over low-latency connectivity layer
* gemfire distributed cache. why not DB? latency too high.
* tibrv is the primary MOM
* between internal systems — FIX based protocol over tibrv, just like Lehman equities. Compare to protobuf object serialization
* there’s more math in the risk system; but the most stringent latency requirements are on the eq front office systems.
jms message selector is executed on the broker.
rvd executes the same duty — “Filter subject-addressed messages.”
Among the defining features of Tibrv, [ A) decentralization and B) SBA ] are 2 sides of the same coin. For me, it's a struggle to find out the real difference between SBA vs jms topics. Here's my attempt.
#1) In SBA (subject-based-addressing), there's no central server holding a subject. In tibrv, any sender/receiver can specify any subject. Nice — No admin to create subjects. I tested it with tibrvlisten/tibrvsend. By contrast, in JMS, “The physical creation of topics is an administrative task” on the central broker, not on the producer or consumer.
) If a jms broker goes down, so do topics therein.
) a jms publisher (or subscriber) must physically connect to a physical broker process. The broker has a physical network address. Our topic is tied to that physical address. A tibrv subject has no physical address.
) tibrv SBA (subject-based-addressing) uses a subject tree. No such tree among JMS topics. The tree lets a subscriber receive q(prices.stocks.*) but also q(*.your.*). See rv_concepts.
 such as the weblogic server in autoreo.
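To make the subject-tree point concrete, here is a minimal sketch of element-wise wildcard matching. It is not the TIBCO API; SubjectMatcher is a hypothetical helper, and the rules it encodes ("*" matches exactly one element, ">" matches one or more trailing elements and must come last) are my reading of rv_concepts.

```java
/** Illustrative matcher for dot-separated subjects; not part of tibrv. */
final class SubjectMatcher {
    private SubjectMatcher() {}

    static boolean matches(String pattern, String subject) {
        String[] p = pattern.split("\\.");
        String[] s = subject.split("\\.");
        for (int i = 0; i < p.length; i++) {
            if (p[i].equals(">")) {
                // '>' must be the last pattern element and needs
                // at least one subject element left to consume.
                return i == p.length - 1 && s.length > i;
            }
            if (i >= s.length) {
                return false; // subject ran out of elements
            }
            if (!p[i].equals("*") && !p[i].equals(s[i])) {
                return false; // literal element mismatch
            }
        }
        // Without '>', both must have the same number of elements.
        return p.length == s.length;
    }
}
```

With these rules, matches("prices.stocks.*", "prices.stocks.IBM") and matches("*.your.*", "all.your.base") are both true, while "prices.stocks.*" does not match the four-element subject "prices.stocks.IBM.bid". JMS topic names have no equivalent in the spec; any hierarchy there is a broker-specific extension.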
1) offload to worker thread — When inbound messages require lengthy processing, we recommend shifting the processing load asynchronously. Quickly extract data from the message, and process it in another thread.
* compare EDT/swingWorker
In a ML muni trading engine, we designed several “grabber” listeners
– grab messages and route them to different processing queues, based on hashcode.
– grab messages, append a short code to the subject, then republish to the same queue. Not sure if this works.
2) reduce message size — Avoid XML. RV now supports integer field identifiers similar to FIX. They are 16 bits each, much smaller than String field names.
3) reduce logging — I always felt logging hurts performance. Now confirmed in Tibrv manual! When logging is required for monitoring or auditing, shift the I/O burden to another computer to log messages without introducing a time penalty
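Tip 1 (hand off quickly, and route to different processing queues by hashcode) can be sketched as below. This is only an illustration under my own assumptions: GrabberRouter and its methods are hypothetical names, and each "processing queue" is modelled as a single-thread executor so that messages with the same key stay in order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative "grabber": the listener thread only routes, workers process. */
final class GrabberRouter {
    // One single-thread executor per lane; each owns its own FIFO queue.
    private final List<ExecutorService> lanes = new ArrayList<>();

    GrabberRouter(int laneCount) {
        for (int i = 0; i < laneCount; i++) {
            lanes.add(Executors.newSingleThreadExecutor());
        }
    }

    /** floorMod keeps the index non-negative even for negative hash codes. */
    int laneFor(String key) {
        return Math.floorMod(key.hashCode(), lanes.size());
    }

    /** Called on the listener thread: extract the key, hand off, return. */
    void onMessage(String key, Runnable work) {
        lanes.get(laneFor(key)).execute(work);
    }

    /** Stops accepting work; returns true if all lanes finished in time. */
    boolean shutdownAndWait() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes) {
                if (!lane.awaitTermination(5, TimeUnit.SECONDS)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Keying on something like the security id means all messages for one security land on the same lane, so per-security ordering is preserved while different securities are processed in parallel.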
Imagine a typical request/reply messaging system. I think in JMS it’s usually based on temp queues, reply-to and correlation-Id — See other blog post. In contrast, RV has no broker. It’s decentralized into multiple peer rv daemons. No difference in this case —
Suppose a message broker holds a lot of queues. One of the queues is for a request message, from requester system to a pricing system. Another queue is for pricing system to return the new price to the requester.
Now, pricing system is slow. Requester should wait for no more than 5 minutes. If the new price comes back through the reply-queue 301 sec later, requester will ignore this stale price since it’s too risky to place an order on a stale price in a fast market. How do you implement this?
My design — Requester main thread can wait(5*60*1000). Another thread in the requester JVM can block forever in onMsg(), and notify the main thread when something is received.
(I actually implemented this in a few trading engines.)
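A minimal sketch of that wait/notify design follows. It is not the code from those engines; TimedReplyWaiter and its method names are hypothetical, and I assume one waiter object per outstanding request (the role a correlation id plays in JMS).

```java
import java.util.concurrent.TimeUnit;

/** Illustrative one-shot waiter for a single request/reply exchange. */
final class TimedReplyWaiter {
    private final Object lock = new Object();
    private String reply;      // guarded by lock
    private boolean abandoned; // guarded by lock; set once the requester gives up

    /** Called from the listener thread, e.g. inside onMsg(). */
    void onReply(String price) {
        synchronized (lock) {
            if (abandoned) {
                return; // stale price: the requester already timed out
            }
            reply = price;
            lock.notifyAll();
        }
    }

    /** Returns the reply, or null on timeout or interruption. */
    String await(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            try {
                // Loop guards against spurious wakeups.
                while (reply == null) {
                    long remainingMs =
                        TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remainingMs <= 0) {
                        abandoned = true;
                        return null;
                    }
                    // Never pass 0 here: wait(0) means "wait forever".
                    lock.wait(remainingMs);
                }
                return reply;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                abandoned = true;
                return null;
            }
        }
    }
}
```

The requester would call await(5 * 60 * 1000L) after sending the request. A price that arrives at 301 sec finds the waiter abandoned and is dropped, so no order is placed on it.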