sybase 15 dedicated cpu

In one of my sybase servers with large database load, we managed to mark one of 16 cpu cores to a specific stored proc, so no other process could use that core.

We also managed to dedicate a processor to a specific JDBC connection.

This let us ensure a high priority task gets enough CPU resource allocation.

What’s the name of the feature?

vector insertion — implicit copy-ctor frequently

When you insert your object into a vector, the original object (B) is passed by value. Copy-ctor called exactly once for each insert.

    vector v(3); // default ctor — 3 calls.
// update — now I see one default ctor call, and 3 copy ctor calls.

    cout<<v.capacity()<<endl; // capacity = 3. Next insert would reallocate to 6
    v.push_back(B()); // temp B instance created, copied in, then destroyed. The reallocation calls copy ctor 3 times on the existing elements. 3 original objects destroyed
    cout<<v.capacity()<<endl; // Capacity is now 6
    v.push_back(B()); // temp B instance created, copied in, then destroyed. No reallocation
    cout<<v.capacity()<<endl; // Capacity remains 6

Here are some useful instrumentation code —

struct B {
    static int count;
    int id;
    B() {
        ++count;
        cout << count << " B()\n";
        id = count;
    }
    B(B const & rhs) :
        id(rhs.id + 10) {
        cout <<id << " copied\n";
    }
    ~B(){
        cout<<id<<" ~B\n";
    }
};
int B::count = 0;
ostream & operator<<(ostream & os, B const & b){
    os<<b.id<<endl;
}

3 types of domain knowledge in finance IT

Domain knowledge is specialized, hard-to-find, and an entry barrier. I see 3 broad categories of domain knowledge — Lingo (jargon), Math and Architecture

1) Lingo (Jargon) — Technical or Non-technical professionals in a given trading system must share a few hundred terms, phrases, verbs and adjectives. Each jargon term often has a superficial “face” meaning and connects with other jargon terms. The full meaning in context often requires a wikipedia page. Some Random examples that came to my mind — fixings; basis risk; option knock-in; tick DB; Basel; lockfree; move-semantic; rvalue ref. 

Roughly half the jargon terms involve math. An IT guy can either memorize the derived mathematical conclusions or examine the underlying math. Random examples of such jargon include — volatility surface; yield; price sensitivities; yield’s impact on FX options; when to use stress testing vs VaR. I feel many people in IT don’t really have a firm grip on these. We go into a restaurant of financial jargon and look for fast-food. I call it fast-food culture, or quick-answer culture. Anything we don’t easily understand, we like to say we don’t need to understand. If we don’t make a conscious effort, we will always stay a financial laymen. As a result, a “conscious” programmer completely new to finance for 6 months can understand concepts deeper than a 5-year veteran.

Rather than “Jargon” I prefer “Lingo” — more general and less technical.

2) the Math part is a body of knowledge created by quants and PhD’s, over the past 50 years or so, perhaps starting with bond math. Needs high school math and later some calculus. For many IT folks who left school 5+ years ago, even the high school math pieces are not a piece of cake. If I don’t invest hours of spare time, i won’t fully understand half the theories in bond math which is all high-school.

The math in finance looks simple but needs to be rigorous. I feel pricing calculations in fixed income and in options often use a precision of 5 to 10. A shallow understanding could overlook details.

Financial math is a sizable body of theory, but most trading IT developers need only the basics. If I could say that I needed 6 months to grow from zero-finance-knowledge to be competent enough to help build trading engines, then obviously my roles didn’t need a lot of financial math.

3) architectures of trading systems — components like pricing, risk analytics, high vol market data, pnl explains, stress test, visualization, exchange gateways, SOR, DMA, mark-to-market, trade capture …
It’s possible to talk about an architecture of a complete suite of components but I prefer to talk about architectures of individual components. I prefer this because architectures are vastly different in terms of volume and real-time nature.

Banks want to hire experienced developers to help them re-architect, so as to stay competitive in the “arms race”.

Architecture domain knowledge changes faster than math or jargon knowledge. Just like jargon and math, architectural knowledge is specialized and not a commodity skill, so relatively few people in finance IT has it, creating a has-vs-hasnot divide. It’s difficult to gain that insight and knowledge because developers don’t need to know all the design decisions once made to fix up the live system they are now supporting. Team is set up such that you are given a lot to do just to get-things-done so you have no spare bandwidth to learn other components, but non-trivial insight into neighboring components are necessary for an architectural understanding.

I have seen a few impeccable blue-prints that fail in practice and need major changes.  Anyone can come up with great-sounding architectures but most of them are /sub-optimal/. Reason can be “hard to debug”, “inflexible”, “learn curve”… Therefore real world, proven architectural domain knowledge is rare and valuable.

Most architectures rely on solid, time-honored infrastructure software like Message-oriented-middle-wares, RDBMS, distributed cache and grids, cross-platform RPC/ORB, thread libraries, xml. Vendors always say “we are perfect” but good knowledge about their weaknesses and limitations are hallmarks of real architects.

Some say fixed income has more domain knowledge but that’s true only in the math area. Equities (and FX?) probably have a larger body of domain knowledge in the architecture space.

object, referent, and reference — intro to weak reference

see also post on pointer.

— physical explanation of the 3 jargons
A regular object (say a String “Hi”) can be the target of many references
* a regular strong reference String s=”Hi”;
* another regular strong reference String ss1=s; // remote control duplicated
* a weak ref WeakReference wr1 = new WeakReference(ss1);
* another wak ref wr2 = new WeakReference(s);

1) First, distinguish a regular object “Hi” from a regular strong reference ss1.
* The object “Hi” is a “cookie” made from a “cookie cutter” (ie a class), with instance and static methods defined for it.
** like all objects, this one is created on the heap. Also, think of it as an onion — see other posts.
** “Hi” is nameless, but with a physical address. The address is crucial to the “pointer” below.

* strong reference ss1 is a “remote control” of type String. See blog post [[an obj ref = a remote control]]
** People don’t say it but ss1 is also a real thingy in memory. It can be duplicated like a remote control. It has a name. It’s smaller than an object. It can be nullified.
** It can’t have methods or fields. But it’s completely different from primitive variables.
** method calls duplicate the remote control as an argument. That’s why java is known for pass-by-value.

2) Next, distinguish a strong reference from a pointer.
* a regular strong reference ss1 wraps a pointer to the object “Hi”, but the pointer has no technical name and we never talk about the pointer inside a strong reference.
** the pointer uses the “address” above to locate the object
** when you duplicate a remote control, you duplicate the pointer. You can then point the old remote control at another object.
** when no pointers point to a given object, it can be garbage collected.

Summary — a pointer points to the Object “Hi”, and a Strong reference named ss1 wraps the pointer. Note among the 3, only the strong ref has a name ss1. The other strong ref s is another real thingy in memory.

3) Now, weak reference. A weakref wr1 wraps a pointer to an obj (ie an onion/cookie), just like a strong ref. In a weak reference (unlike strong reference) the pointer is known as a “referent in the reference”. Note wr1’s referent is not ss1, but the object “Hi” referenced by ss1. That’s why “Hi” is target of 4 pointers (4 reference). This object’s address is duplicated inside all 4 variables (4 references).

“Referent in a reference” is the pointer to object “Hi”. If you “set the referent to null” in wr1, then wr1 doesn’t point to any object in memory.

Unlike strong references, a weak reference is sometimes mentioned like an object, which is misleading.

Q: Is the weak ref wr1 also an object? We explained a strong ref ss1 differs completely from the object “Hi”.

A: Yes and no. you create it with new weakReference(..) but Please don’t treat it like a regular object like “Hi”. In our discussion, an object means a regular object, excluding reference objects. [[Hardcore Java]] page 267 talks about “reference object” and “referenced object” — confusing.
A: java creators like to dress up everything in OO, including basic programming constructs like a thread, a pointer, the VM itself, a function (Method object), or a class (Class object). Don’t overdo it. It’s good to keep some simple things non-OO.

A: functionally, wr1 is comparable to the remote control ss1. Technically, they are rather different

top 10 fields in NOS ^ exec report

—- top 5 fields common to NOS (NewOrderSingle) and exec report —
#1) 22 securityIDSource. Notable enum values
1 = CUSIP
5 = RIC code
*** 48 SecurityID, Requires tag 22 ==NOS/exec
*** 55 optional human-readable symbol ==NOS/exec

#2) 54 side. Notable enum values
1 = Buy
2 = Sell

#3) 40 ordType. Notable enum values
1 = Market order
2 = Limit order

#4) 59 TimeInForce. Notable enum values
0 = Day (or session)
1 = Good Till Cancel (GTC)
3 = Immediate Or Cancel (IOC)
4 = Fill Or Kill (FOK)

—- other fields common to NOS and exec type
*11 ClOrdID assigned by buy-side
*49 senderCompID
*56 receiving firm ID
*38 qty
*44 price

—- exec report top 5 fields
* 150 execType
* 39 OrdStatus. Notable enum values
0 = New
1 = Partially filled
2 = Filled

* OrderID assigned by sell-side. Not in NOS. Less useful than the ClOrdID

15,000 quotes repriced within a minute

One of my bond pricing engines could price about 15,000 offers/bids in about a minute. 4 slow lanes to avoid
 
1) database persistence is done asynchronously by gemfire write-behind.

2) offers/bids we produce must be verified by another system, which officially owns the OutgoingQuote table. The verification takes a long time. We avoid that overhead by pricing all the offers/bids in gemfire, then send them out by batch, then wait for the result. The 1 minute speed is without the verification.

3) all reference data is preloaded into gemfire, so no more disk I/O.

4) minimal serialization overhead, since most of the objects needed are in local JVM.

In contrast, a more complex engine, the mark-to-market engine needs a few minutes to price 15,000 positions. This engine doesn't need real time performance.