Affirm^Confirm in FX drv

In both the FX derivatives and equity derivatives worlds, there’s an Affirmation process before the Confirm process. We will focus on FX derivatives, but most of it applies to equity derivatives.

Confirm is a legally binding document, accessible by clients, often on paper. Affirm is not legally binding. It’s often based on a phone call between the 2 counterparties. Think of Affirm as a preliminary Confirm.

Q: can we (operations) complete Affirm on a trade without checking with counterparty?
A: Forbidden. Operations must verify with the counterparty before completing Affirm.

Given its legal power, it’s good to know what a confirm document contains. A BofA veteran told me it might say “Blackrock shall pay BofA $1m on 1/1/2013”. There are 2 such confirms, on both sides, to be matched.

For a derivative deal executed on T+0 and settling at T + 3mth, Affirmation could happen on T + 0/1/2, but Confirmation usually happens right before T + 3mth, i.e. the real-world settlement. The settlement system actually receives the trade around T + 0/1/2, but keeps it as unsettled and often reports such unsettled trades on a daily basis.

##feeling like a pretender making up the numbers among trading developers #impostor syndrome

(another blog post. No need to reply.)

Working in a trading system [1], every now and then I feel like a pretender making up the numbers (impostor syndrome), especially if I ask myself

Q: if i benchmark myself among java developers with 5+ year experience in trading, are the majority higher than me?

I can’t categorically say YES. A lot of them are not obviously better than me. For one thing, I feel most of them aren’t battle-tested in demanding places like Goldman Sachs ;) However, some of the trading developers I see are more experienced than me on the technical fronts below, and are faster [2]. Then again, as stated in my post on “perl defensible turf”, 5 years in trading doesn’t mean you know threads or MOM (for example) inside out.

* threading — idioms and implementation techniques, and to a lesser extent, design techniques. Devil in the details. Compared to other developers, i place more emphasis on low-level implementation skill. If you want a competent and productive[2] threading developer, test her implementation skill, not architecture.
* MOM ie messaging
* data grid
* trouble-shooting MOM, serialization, Spring
* trouble-shooting eclipse. I single out this one as the most frequent weakness and most neglected area.
*** (For a balanced perspective, I should point out I rate myself above average on Unix, SQL and scripting. In fact i had a recent “debate” with a younger java developer with narrower experience than me. He is less comfortable putting complex biz logic into queries and procs.)

Bottom line ==> If i were to lead a team in S’pore, I had better catch up.

[1] I feel lucky this is a mainstream wall street trading system rather than a no-name trading house with a cheap, home-made system.
[2] trading systems are more fast-paced than anywhere i know including telecom, media, manufacturing, dotcom, e-government, healthcare… Managers really do benchmark developer productivity.

Bayes’ formula with simple quiz #my take

Tree diagram: useful in Bayes’ problems.

Wikipedia has a very simple example — If someone told you they had a nice conversation in the train, the probability it was a woman they spoke with is 50%. If they told you the person they spoke to was going to visit a quilt exhibition, it is far more likely than 50% it is a woman. This is because women enjoy the comforting feel of a quilt. Call the event “they spoke to a woman” W, and the event “a visitor of the quilt exhibition” Q. Then pr(W) = 50%, but with the knowledge of Q the updated value is pr(W|Q) that may be calculated with Bayes’ formula.

Let’s be concrete. Let’s say out of 100 women, 10 would mention their visit to a quilt exhibition, and out of 100 men, 2 would. We do a large number (10,000) of experiments and record the occurrence of W and Q.

pr(W and not Q) = 0.5 * (1 - 0.10) = 45%
pr(W and Q) = 0.5 * 0.10 = 5%
pr(M and Q) = 0.5 * 0.02 = 1%
pr(M and not Q) = 0.5 * (1 - 0.02) = 49%

These 4 scenarios are mutually exclusive and exhaustive (45% + 5% + 1% + 49% = 100%). Among the Q scenarios (5% + 1% = 6%), what percentage are W? It’s 5/6 = 83.3% = pr(W|Q). This is Bayes’ formula in action. In general,

pr(W|Q) = pr(W and Q) / pr(Q) , where

pr(Q) = pr(Q|W)pr(W) + pr(Q|!W)pr(!W)

Another common (and symmetrical) form is the “frequentist” interpretation of Bayes’ formula:

pr(W|Q)pr(Q) =pr(W and Q)= pr(Q|W)pr(W)

I feel in quiz problems, we often have some information about pr(Q| !W), or pr(Q|W) or pr(W|Q) or pr(Q) or pr(W), and need to solve for the other Probabilities. Common problem scenarios:
* We have pr(A|B) and we need pr(B|A)

I think you inevitably need to calculate pr(A and B) in this kind of problem. I think you usually need to calculate pr(A) as well, since the unknown probability = …/pr(A)
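The quilt numbers above can be checked with a few lines of Python. This is only a sketch: the function name posterior and its parameter names are my own, and exact fractions are used so the 5/6 comes out without rounding.

```python
from fractions import Fraction

def posterior(prior_w, q_given_w, q_given_not_w):
    """Return pr(W|Q) = pr(W and Q) / pr(Q)."""
    joint_w_q = prior_w * q_given_w                # pr(W and Q)
    joint_m_q = (1 - prior_w) * q_given_not_w      # pr(!W and Q)
    pr_q = joint_w_q + joint_m_q                   # total probability of Q
    return joint_w_q / pr_q

# 50% women; 10 in 100 women and 2 in 100 men mention the quilt exhibition
p = posterior(Fraction(1, 2), Fraction(10, 100), Fraction(2, 100))
print(p)   # 5/6
```

The two joint probabilities inside the function are the two Q branches of the tree diagram, which is why pr(W and Q) and pr(Q) both show up in any "given pr(A|B), find pr(B|A)" quiz.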

custom allocator usage

Some container classes use custom allocators, because the default allocator (std::allocator) can lead to memory fragmentation. Tip from Wang, a c++ veteran at Nomura.

Custom allocators can also improve “locality of reference”, which lets the hardware caches work more effectively.

Overall, custom allocators are an optimization technique.

[[STL tutorial]] has a small example

save a literal string in various simple data holders

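-- basic char pointer
char * ptr  = "and"; // legal in C; ill-formed in C++11 and later because the literal is const
char const * const_ptr = "and"; // preferred: the literal is read-only

-- char array
char charArray[] = "and"; // writable copy, 4 chars including the trailing '\0'
char const  constCharArray[] = "and";

-- std::string
std::string std_string = "and";

-- std::string to vector of char.
std::vector<char> vc (std_string.begin(), std_string.end()); // copies the 3 chars, no trailing '\0'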

(2/3 of C/C++ coding tests involve strings…)

don’t repeat yourself ] sproc

Basic constructs/ideas/techniques to avoid repeating yourself ] sproc:

* dynamic sql
* nested sproc — sproc1 calling sproc2, which calls sproc3. i think the code in the nested sproc can be invoked repeatedly but coded once only
* app to construct the “sister statements”
* converge — short, non-repetitive queries insert into a shared table, then common logic applied to the shared table
* temp tables are essential to many of the constructs above
* case-expressions are powerful tools for cutting such repetition

t-sql: GO,

Think of it like this: Cut up your script into multiple files, separated by the “GO” statement. Run each of these files individually, but use the same connection. That’s all “GO” does.

Server never sees GO. GO is a keyword in client apps such as sqsh. I think java/perl apps don’t use GO.
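Here is a rough sketch, in Python, of the client-side cutting described above. The function name split_batches is made up, and real clients such as sqsh do more than this; the sketch only shows that GO is consumed by the client and never reaches the server.

```python
import re

def split_batches(script):
    """Split a script into batches on lines that contain only GO (any case)."""
    batches, current = [], []
    for line in script.splitlines():
        if re.fullmatch(r"\s*go\s*", line, flags=re.IGNORECASE):
            if current:
                batches.append("\n".join(current))
            current = []          # GO itself is dropped, never sent to the server
        else:
            current.append(line)
    if current:                   # trailing statements without a final GO
        batches.append("\n".join(current))
    return batches

script = """select 1
go
begin tran
update t set c = 1
GO
commit
"""
batches = split_batches(script)
for i, batch in enumerate(batches, 1):
    # A real client would send each batch over the same open connection here.
    print(f"batch {i}: {batch!r}")
```

Note that the second batch opens a transaction and the third one commits it, which only works because all batches share one connection.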

Until you are clear on the fundamentals above, avoid the confusing questions over auto-commit, transaction and GO. In auto-commit (unchained) mode each statement, rather than each GO-batch, is its own transaction unless you open one explicitly. If a GO-batch issues a begin tran without ending it, the server keeps that transaction open across the following batches on the same connection.

Another confusing question is GO and stored proc.
* i believe a java/perl app calling a proc won’t use GO at all.
* when sqsh calls the proc, it needs GO to mark end-of-batch
* Most complicated scenario is when you create the proc in sqsh. I usually wrap the create-begin-end piece in one go-batch.

modify a check constraint ] sybase

According to my brief internet search, I think you must do a drop-add.

1) sp_helpconstraint table1 -- to see the old constraint name
2) alter table table1 drop constraint con1
3) alter table table1 add constraint con2 check (….) -- must specify a constraint name [1].

[1] when adding/creating columns, you don’t need to specify a constraint name

Constraints do not apply to the data that already exists in the table at the time the constraint is added.

how I estimate workload #XR

Hi XR,

There’s a lot of theory and many book chapters and courses on app development estimation. I don’t know how managers in my department do it, but I’m sure the factors below affect their estimates.

* the right way of solving a problem can shorten a 2-day analysis/discussion to 2 hours, provided someone points out the correct way in the beginning. In our bigger department (100+ developers), many managers make estimates assuming each developer knows the standard (not always best) solutions from the onset. If I am new i could spend hours and hours trying something wrong.

* as in your struts form-validation example, a good framework/design can reduce a 1-month job to 1-week.

* tool — the right software tool can reduce a 3-hour job to 30-minutes. Example — I once mentioned to you testing tools. Also consider automated regression testing.

* A common complaint in my department (100+ developers) — a poorly documented system can turn a 2-day learning curve into a week.

* I suspect my colleagues do fewer test cases than i do, given the same system requirement. This is partly because they are confident and familiar with the system. This could mean 30% less effort over the entire project.

outer join generalized

A outer join B on
condition1_involving_both and
condition2_involving_both and

In the worktable, system generates at least one result row for each A row.

System takes the A row and evaluates the condition set against every B row. If 4 B rows match, then 4 result rows. If no B row matches, then still one result row with (A.*, null as B.col1, null as B.col2..)

Note that conditions may be any predicate, not only
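The worktable logic above can be simulated in a few lines of Python. This is a sketch of the semantics only, not of how any server implements it; left_outer_join, b_width and the sample rows are all invented for illustration.

```python
def left_outer_join(a_rows, b_rows, predicate, b_width):
    """For each A row emit one row per matching B row, or one null-padded row."""
    result = []
    for a in a_rows:
        matches = [b for b in b_rows if predicate(a, b)]
        if matches:
            result.extend(a + b for b in matches)
        else:
            result.append(a + (None,) * b_width)   # A.* plus nulls for B's columns
    return result

A = [(1, "x"), (2, "y")]
B = [(1, 10), (1, 20), (3, 30)]
# The condition set is an arbitrary predicate over both rows:
# here an equality on the first column plus a range test on B.
rows = left_outer_join(A, B, lambda a, b: a[0] == b[0] and b[1] >= 10, b_width=2)
for row in rows:
    print(row)
```

Row (1, "x") matches two B rows and so appears twice; row (2, "y") matches nothing and still appears once, padded with None.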

US work culture encourages out-spoken assertiveness

(to be published on my blog.)
There are limits but US work culture is more expressive, liberal and permissive in terms of employee communications. 

I don’t have a lot of personal experience — this is just a casual observer’s personal bias — in this country workers are expected to protest, to complain, to argue (sometimes), to protect his/her self-interest. If you really push the limits (but not exceed them), you can earn people’s respect.
“Squeaky wheel gets the oil”

Not sure about China, but Singapore workplaces are more strict, more disciplined, more “uniform”. Workers are reluctant to push the limits, perhaps because the limits are not so pushable — they are more rigid than in US.
Compared to SG, in the US it’s your job to get your job done in time, your job to get the support you need, your job to get rid of the road-blockers. The system (in many companies) is not as perfect and functioning as in SG companies. SG workplaces often present a well-managed, well-controlled environment, partly because subordinates are more obedient.
If you are unhappy, i think you can raise your concerns to your onsite manager, or your offsite managers. You deserve their attention.

solace^tibcoAppliance #OPRA volume

Solace JMS broker (Solace Message Router) supports 100,000 messages per second in persistent mode and 10 million messages per second non-persistent. A more detailed article shows 11 million 100-byte non-persistent messages.

A major sell-side’s messaging platform chief said his most important consideration was the deviation of peak-to-average latency and outliers. A small amount of deviation and (good) predictability were key. They chose Solace.

In all cases (Solace, Tibco, Tervela), hardware-based appliances *promise* at least a 10-fold boost in performance compared to software solutions. Latency within the appliance is predictably low, but the end-to-end latency is not. Because of the separate /devices/ and the network hops between them, the best-case latency is in the tens of microseconds. The next logical step is to integrate the components into a single system to avoid all the network latency and intermediate memory copies (including serializations). Solace has demonstrated sub-microsecond latencies by adding support for inter-process communications (IPC) via shared memory. Developers will be able to fold the ticker feed function, the messaging platform, and the algorithmic engine into the same “application” [1], and use shared memory IPC as the data transport (though I feel a single-application design needs no IPC).

For best results you want to keep each “application” [1] on the same multi-core processor, and nail individual application components (like the feed handler and algo engine) to specific cores. That way, application data can be shared between the cores in the Level 2 cache.

[1] Each “application” is potentially a multi-process application with multiple address spaces, and may need IPC.

Benchmark — Solace ran tests with a million 100-byte messages per second, achieving an average latency of less than 700 nanoseconds using a single Intel processor. As of 2009, OPRA topped out at about a million messages per second. OPRA hit 869,109 mps (msg/sec) in Apr 2009.

Solace vs RV appliance — Although Solace already offers its own appliance, it runs other messaging software. The Tibco version runs Rendezvous (implemented in ASIC+FPGA), providing a clear differentiator between the Tibco and Solace appliances.

Solace 3260 Message Router is the product chosen by most Wall St. customers.

thread quizzes+paradoxes

Q: class SimMessageConsumer extends Thread and overrides run(). In my main() method, I instantiate myThr1 = new SimMessageConsumer() and call myThr1.run(). On which thread does run() run?
A: on the main thread. The real world thread linked to myThr1 is not yet created by VM. That happens when you call myThr1.start().

As stated in another post, a Thread object is a poor handle on a real VM thread.

Q: if MyThread defines a (static or non-static) method m1(), and I call m1() from main(), on which thread does m1() run? Note this is not best practice. Avoid defining any other methods in a Thread derivative.
A: main thread.
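Python’s threading.Thread has the same run()/start() split as java.lang.Thread, so the first quiz can be tried out quickly in Python instead of Java. A minimal sketch; the class name is borrowed from the quiz and the ran_on list is my own addition for recording which thread executed run().

```python
import threading

class SimMessageConsumer(threading.Thread):
    def __init__(self):
        super().__init__()
        self.ran_on = []      # names of the threads that executed run()

    def run(self):
        self.ran_on.append(threading.current_thread().name)

caller = threading.current_thread().name
t = SimMessageConsumer()
t.run()      # plain method call: executes on the calling thread
t.start()    # only now is a real thread created, and run() executes on it
t.join()
print(caller, t.ran_on)
```

The first entry in ran_on is the caller’s own thread name and the second is the new thread’s name, which is the “poor handle” point: the Thread object exists before any real thread does.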

include an irrelevant table in an n-way join with no effect

Say you use a 6-table join to return 1000 rows for a large set of conditions. Now for some reason[1] you need to include another table, but want exactly the same output. The additional table should have zero impact on query output and near-zero impact on performance.
[1] one reason: to combine this query with another query.
Solution: try to find a condition involving only columns in the irrelevant table, to return exactly one row. Usually the primary key can help. However, this condition can be very hard to understand.
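A small demonstration using Python’s built-in sqlite3 module. The tables orders and audit are invented, and a one-table query stands in for the 6-table join to keep it short; the point is only that a primary-key condition on the extra table pins it to exactly one row, so the output is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orders (id integer primary key, qty integer);
    create table audit  (id integer primary key, note text);
    insert into orders values (1, 100), (2, 200), (3, 300);
    insert into audit  values (1, 'a'), (2, 'b');
""")

# The original query, without the irrelevant table.
base = conn.execute(
    "select o.id, o.qty from orders o where o.qty >= 200 order by o.id"
).fetchall()

# Same query with the extra table; x.id = 1 selects exactly one audit row.
padded = conn.execute(
    "select o.id, o.qty from orders o, audit x "
    "where o.qty >= 200 and x.id = 1 order by o.id"
).fetchall()

print(base == padded, padded)
```

If the pinning condition matched zero rows the output would become empty, and if it matched several the rows would be duplicated, so the chosen key value must be known to exist.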

java type bounds, c++ concepts, templates

If you have a c++ parametrized class C<T> whose T must have “operator>”, then java has a better solution:

class C<T extends Comparable<T>>

Now, if a parametrized function f(T input) requires T to have “operator>”, I would guess java would ditch the generics and simply declare f(Comparable input).

In the general scenario, a template param T has constraints such as “having run() method”, “numeric”, “copyable”, “assignable”, “dereferenceable like a smart ptr”. In a /degenerate/ case, there’s no constraint — vector can take any type[1]. In another degenerate case, a template param can have so many constraints that the template class can take nothing but USPresident class — Better drop the template.

[1] actually vector requires T to be copyable.

Back to the constraints, c++ compilers actually check some of these constraints and won’t let you instantiate a template with an incompatible type. Examples:

* if you instantiate natural_log() with a non-numeric type, the compile fails.
* if you instantiate std::sort() with a non-random-access iterator (e.g. from std::list), you get a compile-time error, usually a cryptic one from deep inside the template, because the iterator arithmetic that sort relies on doesn’t exist.
* ARM p343: linker detects a non-comparable type specializing a sorting template.

After reading, I feel c++concept (cconcept) resembles java type bounds of the form

Java interface is better. The constraints above usually translate to methods. In that case they can be implemented using interfaces, without generics. Occasionally, you can use something like

raiserror, fatal^non-fatal errors ] sybase

– fatal errors (caused by one statement) cause the sproc to abort immediately at the offending statement, ignoring subsequent statements. You can’t “handle” or “react to” these errors, and @@error won’t let you read them.

– non-fatal errors let subsequent statements run. These are the only type of errors you can handle.

"star shines too bright" (but less than ++)

The asterisk i.e. dereference operator binds Tighter than +/- operators, so

   *ptr+1 // bug

is evaluated as (*ptr)+1, i.e. the pointee’s value plus one, not the next element. I call this problem “star Binds too Tight” or “star Shines too Bright”.

Solution 1 — For pointer arithmetic, you must do *(ptr+1)

Solution 2 — for arrays, no parens needed here. Subscript binds tighter than star

   *orderLookup[i] // same as *(orderLookup[i])

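Solution 3: prefix “++” and the star are both right-to-left unary operators, so the one nearer the operand applies first and you can use

   *++ptr // same as *(++ptr): advance first, then dereference

Solution 3XXX (wrong!): postfix “++” does bind tighter than the star, but this means something else

   *ptr++ // same as *(ptr++): dereferences the OLD position, then advances ptr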


filter^output column

2 types of columns

Most (if not all) columns in a table/view are either
– filter columns, used in where, having, on …
– output columns, returned in select-list
– or both

Examples of filter columns: most join columns, id ….

Given a complex query, if you need a quick summary of one of the tables it’s worthwhile to classify the columns this way. This is obvious stuff but it pays to develop this instinct. It should become second nature.

de-couple – using abstract^concrete class

When authors say a client class A “depends on a concrete class” B, they mean A instantiates new B(). This is less advisable than depending on an abstract class/interface. Now I think there are many hidden costs and abstract arguments.

One of the most understandable arguments is testability. But in this post, we talk about another argument: the “expected services” between client object A and “service” object B.

Scenario 0: client object A instantiates new B(). The maintainers of A and B have to assume every public method [1] is required. Suppose A and B are maintained in separate companies like spring^hibernate. The 2 maintainers dare not change many things in A or B. But all software needs changes!

Scenario 1: Interface offers more flexibility, i.e. decoupling. If A only uses interface C, then both maintainers know the services that B must support. A’s maintainer can swap in (at runtime) another implementation of interface C. Likewise, B’s maintainer has more room for maneuver.
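Scenario 1 can be sketched in Python, with an abstract base class standing in for the java interface. The names lookup, internal_helper and StubB are made up for this illustration; only A, B and C come from the scenarios above.

```python
from abc import ABC, abstractmethod

class C(ABC):
    """The only services that client A expects."""
    @abstractmethod
    def lookup(self, key): ...

class B(C):
    def __init__(self):
        self._data = {"k": "from B"}

    def lookup(self, key):
        return self._data.get(key)

    def internal_helper(self):      # not part of C, so B's maintainer may change it freely
        return len(self._data)

class StubB(C):
    """Another implementation of C, swapped in at runtime."""
    def lookup(self, key):
        return "from stub"

class A:
    def __init__(self, service: C):  # A is handed a C instead of calling B() itself
        self._service = service

    def fetch(self, key):
        return self._service.lookup(key)

print(A(B()).fetch("k"), A(StubB()).fetch("k"))
```

Because A never names B, the set of "expected services" is exactly the one method declared in C, and the same swap is what makes A easy to test.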

Analogy: I wrote HTTP clients in telnet and Perl. They follow the HTTP protocol so they can interact with any web server, thanks to a published standard on the expected behaviour of client and server. Both sides depend on the abstract HTTP protocol only. A java Interface strives for the same flexibility and decoupling.

During OO design, I think it’s fine to introduce as many interfaces as you like. I think interfaces don’t add baggage or tie us down as classes do. Remember your team lead makes one DAO interface for every dao class!

[1] Public fields, too.

import+package+namespace ]py/c++/java/perl

Namespace is a common challenge, and a common feature. Physically, module files invariably live in a hierarchical file system.

Perl’s solution revolves around the concept of package and symbol table….

–java’s solution is rooted in fully qualified type names. Every named type has a FQTN. They naturally form a hierarchical namespace tree.

Q: Can java import a global object like System.out, i.e. the equivalent of the c++ cout handle object?
A: java global objects are always implemented as static fields (never namespace-level or package-level variables). Therefore we use static import.

Python instantiates a namespace object “os” (or sys, re) when you say “import os, sys, re”, so you can use the dot notation like os.path. Thanks to this “object”, Python’s introspection, instrumentation and meta programming capabilities shine through.

Python’s “from os import environ” imports the environ variable into the current namespace, just like c++ “using std::cout”
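Both Python points can be checked in the interpreter. A tiny sketch that uses only the standard library:

```python
import os
import types
from os import environ

is_object = isinstance(os, types.ModuleType)   # the namespace is an ordinary object
same_name = environ is os.environ              # the bare name is bound to the same object
has_path = "path" in dir(os)                   # introspection works on the namespace object

print(is_object, same_name, has_path)
```

All three come out True: “import os” hands you a module object you can inspect like any other, while “from os import environ” merely binds a second name to something that object already holds.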

–C++ namespaces are well-covered in [[absolute c++]], and also concisely covered in effC++
using namespace std; // imports all the vars and functions into “here” so we don’t need to say std::cout
using std::cout; // imports just one var

However, in both cases your own “cout” variable will clash with the cout imported.

There’s a 3rd usage of “using myBaseClass::method2” on P413 of ARM

Note, “using” can be nested in a class.