the base Delegate class isn’t a delegate type

Q: Why is the base Enum class not an enum type?
I feel an instance of this particular class has no integer representation, unlike an enum instance.

Q: Why is the ValueType class not a value type?
I feel this class exists to provide Equals() and GetHashCode() overrides suitable for most structs. I feel this is like an empty abstract class. It has no fields, so copying a ValueType instance should not be a copy by value.

Q: why is the Delegate class not a delegate type?
I feel this type doesn’t specify a method signature and doesn’t have an invocation list.

what kind of entity is "str" in a python script

(Warning — for a novice introspection is overly powerful and complex in dynamic languages like python. We won’t get to the bottom any time soon.)

Everything (except literals and operators) you see in a python script is usually one of

– a variable name
– a reserved keyword like for/while, import, in,
– a module name like re, string, os. I think these are objects backed by dictionaries — each module’s attributes live in its __dict__
– a type-friendly builtin function — you can put the identifier into type() and dir() — like round, type.
– a type-unfriendly builtin function — print, ..

* “str” is a very strange animal. It’s callable like a function, but type(str) says it’s a type (a class)
* “print” is a strange animal. You can’t examine it using dir(), type() — in Python 2, print is a statement, not an object

Python docs list all builtin functions, but a few won’t “type” as builtin function — str, print, type, tuple…

For this sort of investigation, you need a few instrumentation tools — str(), repr(), type(), dir(). I call these meta-functions. They operate on other functions, variables ….

print dir(type)
print dir(str)
print dir(dir)
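
In Python 3 this probing gets easier: print became a real builtin function, so both names can be examined. A minimal hedged sketch using only builtins:

```python
# Python 3: probing "str" and "print" with the meta-functions type() and dir().
kind_of_str = type(str)      # str is itself a type (a class), not a plain function
kind_of_print = type(print)  # print is a builtin function object in Python 3

report = {
    "str is a type": isinstance(str, type),
    "print is callable": callable(print),
    "str has methods": "upper" in dir(str),
}
print(report)
```

So in Python 3 the “type-unfriendly” animals largely disappear; only in Python 2 does print refuse to go into type() or dir().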

factory-method vs FactoryBean in Spring

I feel there’s deeper and richer support in Spring than most people realize. Notes from the Spring doc —

— “factory bean” vs FactoryBean —
When the Spring documentation makes mention of a ‘factory bean’, this will be a reference to a bean that is configured in the Spring container that will create objects via an instance or static factory method. When the documentation mentions a FactoryBean (notice the capitalization) this is a reference to a Spring-specific FactoryBean.

— getting the FactoryBean object itself —
Finally, there is sometimes a need to ask a container for an actual FactoryBean instance itself, not the bean it produces. This may be achieved by prepending the bean id with ‘&’ when calling the getBean method of the BeanFactory (such as the ApplicationContext bean factory).

So for a given FactoryBean with an id of myBean, invoking getBean(“myBean”) on the container will return the product of the FactoryBean, but invoking getBean(“&myBean”) will return the FactoryBean instance itself.

which city is at the spearhead of finance IT

Even if Asian IT jobs were to pay higher than US jobs, the US is still spearheading industry best practices. I’m in the financial IT space, but see similar patterns in other IT sectors.

It’s revenue share, stupid.

The more revenue a country office generates, the more budget it controls. The office usually prefers an onsite development team, so that team tends to become the hotbed of innovation. For the foreseeable future, NY and London will be the hotbeds.

American vs European call option valuation

Hey Hai Tao,

My CFA textbook had a conclusion I don't believe.

Say there is a Microsoft stock call option expiring 7/1/2011, X = $20, in-the-money, American style.
Say there is an identical Microsoft call, but European style.

Assumption: the underlying stock makes no dividend or other cash payment before expiration.

Under this assumption, textbook says both call options are worth the same.

Earlier the same author said that an American option is worth at least as much as a European-style option, but under this Assumption, he claims they have equal valuation.

As a layman, my intuition tells me the American option is more valuable. Suppose my analysis tells me Microsoft might drop to $19. Then the American option lets me pay $2,000 today for 100 shares and sell them today for $2,467, earning a profit of $467. The European option may expire out of the money and worthless.
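
A quick arithmetic sketch of the example above (all numbers are the hypothetical figures from this letter, not market data; the assumed $24.67 spot is implied by the “$2467” sale proceeds):

```python
# Early-exercise profit in the letter's example (hypothetical numbers).
strike = 20.0          # X = $20
spot = 24.67           # assumed current price, implied by "$2467"
contract_size = 100    # shares per contract

exercise_cost = strike * contract_size   # pay $2,000 to exercise early
sale_proceeds = spot * contract_size     # sell the 100 shares at market
profit = sale_proceeds - exercise_cost
print(round(profit, 2))  # 467.0
```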

Do you agree?

necessity: some trading module imt others

“non-optional + non-trivial” is the key.

Context – trading systems.

I feel trade booking/capture is among the “least optional”. Similarly, settlement, cash mgmt, GL, position database, daily pnl (incl. unrealized). Even the smallest trading shops have these automated. Reasons – automation, reliability, volume. A relational database is necessary and proven. These are generally the very first wave of boring, low-pay IT systems. In contrast, a lot of new, hot technologies look experimental, evolving, and not indisputably necessary or proven —

* Sophisticated risk engine is less proven. I don’t know if traders really trust it.
* Pre-trade analysis is less proven.
* Huge market data volumes often feed into the research department and risk/analysis systems. I feel only a small portion of market data is strictly necessary.
* models
* Algo trading, often based on market data and models
* object DB, dist cache, cloud aren’t always needed
* MOM? I guess some trading systems don’t use MOM, but those are very rare.

Quiz: who can "intercept" returns from a java method

Jolt: When you put in a “return”, you think “the method would exit right here and bypass everything below”, but think again! If you use a return in a try{} or catch{}, it is at the /mercy/ of finally{}.

P477 [[ thinking in java ]] Only one guy can stop a return statement from exiting a method. This guy is finally, also mentioned in one of my posts on try-catch-finally execution order.

Same deal for break, continue.
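
The same rule holds in Python, which makes for a compact demo: a return inside try is at the mercy of finally.

```python
def sneaky():
    try:
        return "from try"      # you might think the function exits here...
    finally:
        return "from finally"  # ...but finally hijacks the return value

print(sneaky())  # prints "from finally"
```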

equity-linked notes, briefly

From the dealer/issuer’s stand point, ELN is a debt issue. ELN is a note i.e. a short-term bond, therefore a debt to be repaid. The issuer Borrows money from the Investor and will pay it back upon maturity.

So far, this sounds like a regular bond. What’s the difference? Well, ELN is a debt issue + an equity option written by the issuer.

I believe an ELN contract typically stipulates a reference stock index. If the index gains 50% and the participation rate is 80%, then the investor receives $1.40 for each dollar invested. This is a contractual agreement. The issuer doesn’t have to invest in the index.

However, if the index doesn’t gain, the investor is protected 100% but receives no interest — like a zero-coupon bond. Therefore the investor effectively buys a zero-coupon bond + an option from the issuer.

Q: How does the issuer profit?
– if index falls, issuer borrows money interest-free
– if index gains, issuer could invest the money in higher-return securities and pass on only a small portion of the gain. Even if the issuer invests in the reference index directly, only part of the gain needs to be returned to the investor.

Q: How can a retail investor replicate an ELN of $10,000 + 80% participation? I guess you can buy a discount note + an index option.
+ buy a note with face value $10,000 matching the target maturity. Current price is by definition a discount below $10,000
+ suppose index is $1000 now, just buy 8 ATM call options with the same maturity. Premium might be $400. If index falls they expire worthless but we still get the $10,000 from the note.
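
The replication arithmetic above can be sketched in a few lines of Python (a hedged illustration: eln_payoff, replication_payoff and all the numbers are made up for this example, and the note discount and option premium are ignored on the cost side):

```python
def eln_payoff(index_return, principal=10_000, participation=0.8):
    """Contractual ELN payoff: principal protected, upside scaled by participation."""
    return principal * (1 + participation * max(index_return, 0.0))

def replication_payoff(index_final, index_start=1000.0, n_calls=8, principal=10_000):
    """Zero-coupon note repaying principal + 8 ATM calls on the index."""
    call_payoff = n_calls * max(index_final - index_start, 0.0)
    return principal + call_payoff

# Index gains 50%: both structures pay out $14,000
print(round(eln_payoff(0.50), 2), round(replication_payoff(1500.0), 2))
# Index falls 20%: both return the $10,000 principal
print(round(eln_payoff(-0.20), 2), round(replication_payoff(800.0), 2))
```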

single-threaded UI update – wpf vs swing

In a WPF app every visual “element” object has an “owning” thread and can be modified only by that thread. This sounds like “every child has a mother”. Indeed WPF lets you have 2 owning threads if you have 2 unrelated window instances in one Process[3]. In practice, most developers avoid that unnecessary flexibility, and use a single owning thread — known as the dispatcher thread.

[3] each having a bunch of visuals.

In WPF, any time you update a Visual “element”, the work is sent through the Dispatcher to the UI thread. The control itself can only be touched by its owning thread. If you try to do anything with a control from another thread, you’ll get a runtime exception.

// marshal the update onto the owning (dispatcher) thread
myCheckBox.Dispatcher.Invoke(new Action(
    delegate() { myCheckBox.IsChecked = true; }
));

Very similar to Swing’s invokeAndWait(new Runnable()…). Dispatcher.BeginInvoke is the counterpart of invokeLater().

mkt-data is primarily collected for … quants in research dept

(background — there are many paper designs to process market data. The chosen designs on wall street reflect the intended use of this data.)

I won’t say “by traders” since it’s too much data for human consumption. It must be digested. Filtering is one of many (primitive) forms of digestion.

I won’t say “by trading systems” as that’s vague. Which part of the trading system?

I won’t say “by algo trading engines”. What’s the algo? The abstract algo (before the implementation) is designed by quants based on models, not designed by IT. Traders may design too.

Q: Who has the knowledge to interpret/analyze such a flood of market data?
A: not IT. We don’t have the product knowledge
A: not traders. In its raw form such market data is probably unusable.
A: quantitative researchers by definition are responsible for analyzing quantitative data.
A: data scientists also need to understand the domain, and can use the data to extract insights.

make a series of random colors – no 2 similar colors in a row

brightness — colors should not be so light that they become hard to read on a white background.

saturation — push to the max

private Color makeColor() {
    float piePieces = 50; // how many pizza slices in the hue wheel
    for (;;) {
        int tmpHue = this.rand.nextInt((int) piePieces + 1);
        // reject any hue within 11% of the previous one, so no 2 similar colors in a row
        if (Math.abs(tmpHue - this.intHue) / piePieces > 0.11) {
            // 0.5 would mean the exact opposite color
            this.intHue = tmpHue;
            float hue = tmpHue / piePieces; // 0 to 1
            return Color.getHSBColor(hue, 0.99999f, 0.77f);
        }
    }
}

Oracle article on exclusive/shared locks, row/table level locks, non-repeatable-read

Here is a small sample of the knowledge pearls —

– If a transaction obtains a row lock for a row, the transaction also acquires a table lock for the corresponding table. The table lock prevents conflicting DDL operations (like alter table).

– A detailed comparison of read-committed vs serializable isolation levels in Oracle

To my surprise,

* “Readers of data do not wait for writers of the same data rows”. Maybe the reader simply goes ahead and reads the committed version — either the BEFORE or the AFTER image of the row. Now, if isolation is at Serializable, there is still no wait, but this transaction will throw an exception if it re-reads the row and detects an update. This is a classic non-repeatable-read error.

another large Interest Rate trading desk

Gov desk + IRD desk

IRD trading desk is very close to the Gov bond trading desk, which includes agency bonds.

IRD covers IRS, ED futures, T futures, vanilla derivatives and exotic derivatives .. some of these derivatives take a long time to price.

Core of the entire IRD system is the risk engine. It “Provides real time risk assessment to traders” — a one-sentence explanation by an IT manager in the team.

Gov desk has higher trade volume, mostly flow trades, with some prop trades. IRD vol is lower – about 1000 trades a day, with 8000 positions in one of the top International i-banks.

5 ways my GS managers make new guys so productive so fast

# 1) My GS managers have a feel for how much production support each developer is taking on. Therefore managers can monitor the learning pace of the new hire.

(Note Tier 1 or Tier 2 production support teams take on a lot, but they can’t read source code. The developer team always receives a lot of escalated prod support requests.)

Sooner or later, we realize there needs to be a way to numerically measure the accumulation of _local_system_knowledge_ by the new hire. Many new hires intuitively feel “now i know close to half the systems”, or “now i have 75% of the system knowledge of my mentor”. A more objective way to measure is to simply count the number of system issues (big or small) resolved by each developer *independently*. I also count all user questions as system issues. Together, they represent the “load” on the production support team. The more a new hire can help “offload”, the better. If a new hire is not taking on enough after 6 months, improve her training.

Crucially, only the knowledge relevant to “offloading” is true “system knowledge”. Manager needs to know the difference — a lot of know-how isn’t relevant and doesn’t constitute system knowledge.

#1b) Some of my GS managers realize that every proficient prod support person needs to get their hands really dirty with logs, autosys JIL, DB investigation, tracing through java/SQL/scripts — an overwhelming body of “traces” or clues. A foundation of the “local system knowledge”.

In other banks, some new hires grow familiar with 1/3 of those traces in a year — too slow. It’s possible to achieve 70% familiarity in a year. 100% is defined as the minimum level among primary production support guys.

#1c) my GS managers asked me to spend a session each week with a system knowledge expert colleague, going through the weekly 300+ emails sent to the team. A lot of production support tasks originate in such emails. Personally, I feel any amount of such knowledge sharing can’t replace resolving an issue independently. A new hire needs to use her own head and interpret, guess, rule-out-possibilities, not just follow a cookbook. A lot of the “traces” are misleading or incorrect.

# 2 ) Perhaps the manager’s most important guide is the determination and conviction that “we can and will train this person up in 6 months to start delivering real business value”. A crude analogy — once you pay big money for a power drill before a home renovation, you want to get real value from the investment. Given this conviction, the manager makes everyone in the team aware that the new hire’s survival depends on their help, so “be prepared for the hundreds of questions the new hire may ask”. In another bank, a senior mgr told me to reduce the volume of questions and learn to “slide into the new team”.

My GS manager spent lots of time trying to understand how I learn, how I spent my time, what questions I ask, where I often get stuck. This is what mentors do. Micro-managing? Perhaps.

3 ) My personal favorite (possibly biased opinion). Some of the new hires were put into “intensive production support” mode, a kind of immersion training, spending 70% of their time on prod support, far more than on green-field or BAU work. This lasts 3 months to almost a year.

4 ) This is nothing special in GS — let a new hire take ownership of a production support task and follow up till closure. I feel other banks too easily allow a new hire to pass on a “hard” problem to colleagues. In GS, the new hire is encouraged to ask many “why” questions afterward, and mentor/colleagues should give in-depth answers. Both sides should be aware that some questions are less relevant and too early.

My GS managers, just as in other fast-paced banks, like to mention note taking. The new hire’s note taking is frequently criticized in many banks. He can’t write down everything he hears, since what he hears doesn’t make sense to him anyway. He can’t afford to spend too much time going into all the “why” either.

5 ) This is nothing special in GS — sometimes there’s a dedicated “mentor” to transfer basic system knowledge. A good mentor can make a big difference, such as pointing out the wrong things to delve into. Some new hires prefer asking many different colleagues.

I was a decent mentor. I shared quick tips on setting up tests, pointed out the tools worth learning … I spent a lot of time sitting with the new hire.

In my GS department, people are generally more available to help or answer questions than in other companies. This extends across department boundaries. If I need help from another business division, I can send mail or call them and get immediate answers, or they will get back to me.

software engineering where@@ wallSt @@

When I look at a durable product to be used for a long time, I judge its quality by its details. Precision, finishing, raw materials, build quality against wear and tear…. Such products include wood furniture, musical instruments, leather jackets, watches, … I often feel Japanese and German (among others) manufacturers create quality.

Quick and dirty wall street applications are low quality by this standard, esp. for code maintainers.

Now software maintainability requires a slightly different kind of quality. I judge that quality, first and foremost, by correctness/bugs, then coverage for edge/corner cases such as null/error handling, automated tests, and code smell. Architecture underpins but is not part of code quality. Neither is performance, assuming our performance is acceptable.

There's a single word that sums up what's common between manufacturing quality and software quality — engineering. Yes software *is* engineering but not on wall street. Whenever I see a piece of quality code, it's never in a fast-paced environment.

observer can be MT or SingleT

The Observer pattern can either go MT or stay ST i.e. single-threaded. Both are common.

The textbook showcase implementation is actually single threaded. The event happens on the same thread as the notifications, often looping through the list of observers. That’s actually a real world industrial-strength implementation.

(—> Notably, the single logical call chain can span processes — one method blocks while another method runs in an RMI server. This is the so-called synchronous, blocking mode.)

Other implementations are MT — event origination and notifications happen on different threads. As explained in the post on async and buffer ( ), we need a buffer to hold the event objects, as the observer could suffer any amount of latency.
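
Both flavors can be sketched in a few lines (a hedged Python illustration; the class names are made up). The ST subject notifies on the caller's thread; the MT subject only enqueues, and a background thread drains the buffer:

```python
import queue
import threading

class Subject:
    """Textbook single-threaded observer: notify on the event's own thread."""
    def __init__(self):
        self.observers = []

    def fire_single_threaded(self, event):
        for obs in self.observers:   # blocks until every observer returns
            obs(event)

class AsyncSubject(Subject):
    """MT flavor: a buffer decouples event origination from notification."""
    def __init__(self):
        super().__init__()
        self.buffer = queue.Queue()  # absorbs observer latency
        threading.Thread(target=self._drain, daemon=True).start()

    def fire(self, event):
        self.buffer.put(event)       # returns immediately

    def _drain(self):
        while True:
            event = self.buffer.get()
            for obs in self.observers:
                obs(event)

received = []
s = Subject()
s.observers.append(received.append)
s.fire_single_threaded("tick")
print(received)  # ['tick']
```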

instrumentation overhead: latency/throughput in low latency apps

(Note these overheads primarily hurt latency, but throughput-sensitive apps would also suffer, to a lesser extent I presume. Here’s my reason — if it takes 20 minutes (or hours) to finish a big batch, 5% longer would not matter that much. However, in low-latency trading a consistent 5% additional latency can mean many lost opportunities. Inconsistent latency is an even bigger problem.)

A top engineer in an ultra-high-speed options exchange told me truss is the #1 instrumentation tool (they run linux). I believe this is for occasional use. truss is intrusive and can slow an app down to HALF its speed, according to Solaris tuning experts.

The DTrace jvm provider has method-entry and method-exit probes and can add overhead too. Overall, dtrace is less intrusive than truss.


— logging —
By default, logging is intrusive and adds latency to the low-latency critical path. We still need real-time logging for production support. Solutions? Someone told me ramdisk is one option. Someone told me async logging is available in log4j.

Given that disk is block-oriented storage (see other posts on this blog), you can also increase the disk buffer and write in bigger blocks. This is non-ideal for real-time monitoring — minutes of delay is common.

When quizzed in interviews, I gave some impromptu suggestions —
* main thread puts instrumentation/logging data into a shared data structure in local memory. A background thread reads it asynchronously — non-intrusive.  Shared mutable => locking required.
* we can also send messages to a logging server. In this case, I feel few-big messages beat many-smaller messages.
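
The first suggestion is essentially what log4j's async appenders do, and Python's stdlib has a ready-made pair for it. A hedged sketch (the logger name and message are made up):

```python
import logging
import logging.handlers
import queue

# Hot path does only a cheap queue put; a listener thread does the slow I/O.
log_q = queue.Queue(-1)                        # unbounded in-memory buffer
slow_handler = logging.StreamHandler()         # the real (slow) handler
listener = logging.handlers.QueueListener(log_q, slow_handler)
listener.start()

logger = logging.getLogger("critical_path")
logger.setLevel(logging.INFO)
logger.addHandler(logging.handlers.QueueHandler(log_q))

logger.info("order sent id=%s", 42)            # fast: enqueue only
listener.stop()                                # drains the queue on shutdown
```

QueueHandler/QueueListener are the stdlib's direct analogue of an async appender: the producing thread never touches the disk or socket.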

block-oriented storage – j4 columnar

As suggested elsewhere, reading a megabyte of sequentially stored disk data takes no more time than a single random access on block-oriented storage. I call it wholesale vs retail. In such a context, storing related items physically close together is an essential optimization.

I believe this is a fundamental fact-of-life in tuning, algorithms and system design.

I believe this is the basis of the b+tree RDBMS index. In contrast, in-memory indexing doesn’t sit on block-oriented storage and probably doesn’t use a b+tree.

I believe this is one of the top 3 fundamental premises of columnar DB. Imagine a typical user query selects columns of data, not rows of data…

q{FIND} | perl,grep,notepad++

q(grep -r --color) is a simple solution but I still worry about lack of control.

# use grep inside perl, without xargs
find . -type f|perl -nle 'print "$_ --\n$a" if /\.(C|h)/ and $a=qx(grep -i "btmodels" "$_") '
find . -type f|perl -nle 'print "$_ --\n$a" if !/\.git/ and $a=qx(grep -i "btmodels" "$_") '

MSWE search is unreliable for full-text search. Ditto for MSVS search. Don’t waste time on them!
Try notepad++. You can click the search result, and get keyword highlight, like in google!
Try findstr in

What’s the difference between these 2 sql’s — a condition to put into WHERE or ON

Q: What’s the difference between these 2 queries?

left outer join CodeLegend c on spv.StatusUid=c.CodeLegendUid
where c.FullName=’Active’
— versus —
left outer join CodeLegend c on spv.StatusUid=c.CodeLegendUid
and c.FullName=’Active’

%%A: all rows from the outer (left) table still show up in the intermediate table when the “Active” condition is put into the ON clause — unmatched rows simply appear with NULL in the c columns. Put into WHERE, the same condition filters out those NULL rows, effectively turning the outer join into an inner join.

If in doubt, I usually put conditions in WHERE (not ON) because WHERE is less implicit and more explicit. Intermediate table tends to show more meaningful data this way. Not sure about this case though.

Another article explains the right-table predicate in the ON-clause.

The more obscure issue is an outer-table predicate in the ON-clause, explained in another blog post…
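
The difference is easy to reproduce with sqlite3 (a hedged demo: the spv/CodeLegend rows below are made up to mirror the queries above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE spv (SpvUid INTEGER, StatusUid INTEGER);
    CREATE TABLE CodeLegend (CodeLegendUid INTEGER, FullName TEXT);
    INSERT INTO spv VALUES (1, 10), (2, 20);
    INSERT INTO CodeLegend VALUES (10, 'Active'), (20, 'Retired');
""")

# Condition in WHERE: NULL-extended rows are filtered out afterwards,
# so the outer join degenerates into an inner join.
in_where = con.execute("""
    SELECT s.SpvUid FROM spv s
    LEFT OUTER JOIN CodeLegend c ON s.StatusUid = c.CodeLegendUid
    WHERE c.FullName = 'Active'
    ORDER BY s.SpvUid
""").fetchall()

# Condition in ON: it only restricts which CodeLegend rows may match;
# every spv row still appears, unmatched ones with NULL c columns.
in_on = con.execute("""
    SELECT s.SpvUid, c.FullName FROM spv s
    LEFT OUTER JOIN CodeLegend c
      ON s.StatusUid = c.CodeLegendUid AND c.FullName = 'Active'
    ORDER BY s.SpvUid
""").fetchall()

print(in_where)  # [(1,)]
print(in_on)     # [(1, 'Active'), (2, None)]
```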

where is the svn ignore-list physically saved

If you instruct[1] svn to ignore a dir, that instruction is physically saved in your local parent_directory/dir-props.

Now Tortoise will show parent_directory as “modified”.

Tortoise context menu -> svn properties — will show it.
Tortoise context menu -> check for modification -> look into the “Status” column (one word) — will also show the exact change

[1] by IDE, Tortoise or command line

secDB/Slang – resembles python inside OODB

A few hypotheses to be validated —

1) you know oracle lets you write stored proc using java (OO). Imagine writing stored proc using python, manipulating not rows/tables but instances/classes

2) you know gemfire lets you write cache listeners in java. Now Imagine writing full “services” in java, or in python.

I feel secDB is an in-memory OODB running in a dedicated server farm. Think of it as a big gemfire node. It runs 24/7.

Why python? Well, Python objects are dict-based. I believe secDB objects are too.

IPC sockets^memory mapped files

Unix domain socket is also known as IPC socket.

Requirement — Large volume data sharing across jvm and other unix processes.

  1. sockets are well supported in java, c, perl and python, but still require copying lots of data. I think only the unix domain socket is relevant here, not inet sockets.
  2. memory mapped files, via a RandomAccessFile and MappedByteBuffer? A pure java solution — no JNI needed. I feel it’s not so “popular”. C# and C++ also support it.
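
Option 2 is easy to sketch with Python's mmap module (a hedged single-process demo; in real IPC the two views would live in different processes, each mapping the same file):

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shared.dat")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)        # pre-size the backing file

producer_fd = open(path, "r+b")
producer = mmap.mmap(producer_fd.fileno(), 4096)   # shared writable mapping

consumer_fd = open(path, "r+b")
consumer = mmap.mmap(consumer_fd.fileno(), 4096)   # second view of same file

producer[0:5] = b"hello"           # write through one mapping...
print(consumer[0:5])               # ...visible through the other: b'hello'
```

No bytes are pushed through a socket; both sides read and write the same pages.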

FactoryBean to avoid hardcoded classnames in spring xml

Without this, you would need to hard-code one class name in a windows-specific xml, and a different class name in the linux-specific xml.

public class CalcLibFactoryBean implements FactoryBean {
    public CalcLibWrapper getObject() throws Exception {
        // "os.name" is the standard system property for OS detection
        if (System.getProperty("os.name").toLowerCase().startsWith("win")) {
            return new PxYldCalcWrapperStub();
        } else {
            return new PxYldCalcWrapper();
        }
    }
    public Class getObjectType() { return CalcLibWrapper.class; }
    public boolean isSingleton() { return true; }
}
no blocking feature in ThreadPoolExecutor

A known issue — the Sun bug database describes the unexpected immediate-rejection behavior of the “blocking” thread pool, and mentions a Spring forum discussion —

“You have to modify the java.util.concurrent.ThreadPoolExecutor by overriding the execute method and place a task with a put instead of an offer with a zero timeout on the workqueue. Or you could use the BlockingExecutor (and the ThreadPoolBlockingExecutor) from Prometheus. The primary reason of existence of these classes was to create an Executor that is able to block.”

The JDK ThreadPoolExecutor javadoc makes it clear — “New tasks submitted in method ThreadPoolExecutor.execute will be rejected when the Executor … uses finite bounds for work queue capacity, and is saturated.” The root cause beneath the non-blocking behavior of the blocking queue is the use of queue.offer(). “Offer” sounds like “try inserting” and is therefore non-blocking; it returns false if the queue is full. In contrast, the put() method is blocking.

   public void execute(Runnable command) {
       if (poolSize >= corePoolSize || !addIfUnderCorePoolSize(command)) {
           if (runState == RUNNING && workQueue.offer(command)) {

Personal experience #1 — we created a simple blocking thread pool. Avoid ThreadPoolExecutor.execute(). Manually create an array blocking queue of fixed capacity, and add tasks using put().

Personal experience #2 — An even simpler solution is a “shared unbounded queue” (see javadoc), free of blocking altogether —

     virtThrPool = (ThreadPoolExecutor) Executors.newFixedThreadPool(howManyThreads);

(You don’t have to cast, but casting gives access to many convenient methods of ThreadPoolExecutor.)

You can easily submit() tasks and never get blocking or rejection.
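
Personal experience #1 translates almost mechanically into Python (a hedged sketch; BlockingPool is a made-up class). A bounded queue.Queue plays the role of ArrayBlockingQueue, and its put() gives the blocking submit:

```python
import queue
import threading

class BlockingPool:
    """Fixed worker pool draining a bounded task queue; submit blocks when full."""
    def __init__(self, n_threads, capacity):
        self.tasks = queue.Queue(maxsize=capacity)  # ~ArrayBlockingQueue
        for _ in range(n_threads):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, fn):
        self.tasks.put(fn)      # blocks like put(), never rejects like offer()

    def _work(self):
        while True:
            fn = self.tasks.get()
            fn()
            self.tasks.task_done()

pool = BlockingPool(n_threads=2, capacity=4)
results = []
for i in range(10):
    pool.submit(lambda i=i: results.append(i * i))
pool.tasks.join()               # wait until every task has finished
print(sorted(results))          # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```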

demo from XR, showing non-blocking behavior of blocking queue

You can get blocking behavior if you avoid ThreadPoolExecutor.execute(). You can create an Array blocking queue (myABQ) and submit tasks using myABQ.put().

I learnt this from my Taiwan colleague.

On Fri, Apr 8, 2011 at 11:56 PM, Bin TAN (Victor)  wrote:

ThreadPoolExecutor api made it clear — New tasks submitted in method ThreadPoolExecutor.execute will be rejected when the Executor has been shut down, and also when the Executor uses finite bounds for both maximum threads and work queue capacity, and is saturated. A Sun bug report describes the non-blocking behavior of the blocking thread pool, and mentions a Spring forum discussion —

You have to modify the java.util.concurrent.ThreadPoolExecutor by overriding the execute method and place a task with a put instead of an offer with a zero timeout on the workqueue.

Or you could use the BlockingExecutor (and the ThreadPoolBlockingExecutor as standard implementation) from Prometheus. The primary reason of existence of these classes was to create an Executor that is able to block.

create unbounded queue using STL deque – threading

There’s an STL container adapter — queue over deque. (Probably no queue over vector exists, since one end of a vector is hard to operate on.) Like all STL containers it’s thread-unsafe. But in this context, let’s ignore this adapter and try to design our own.

How many condition variables? I think 1 only — isEmpty.

Q1: How many mutexes? At most 1 at each end of the queue.
%%A: probably 1. See below.

Q1b: Is it possible to allow consumers and producers to operate simultaneously?
%%A1b: I don’t think so. See below.

Q2: What if there’re 0 elements now and both threads are operating on that 1 new element?

For a queue backed by a linked list, Answer 1b might be yes — I feel the producer would button up the entire node and put its address into the head pointer in one instruction. For a deque though, the copying is not an atomic one-stepper. It’s either operator=() or copy-ctor. Before the producer finishes copying, consumer reads the half-copied node! In conclusion, I feel we need a big lock over the entire data structure.
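
The one-big-lock conclusion can be sketched directly (a hedged Python rendering; UnboundedQueue is a made-up name, and Python's deque stands in for std::deque):

```python
import collections
import threading

class UnboundedQueue:
    """One mutex guards both ends; one 'isEmpty' condition for consumers."""
    def __init__(self):
        self._items = collections.deque()
        self._big_lock = threading.Lock()            # single lock over the structure
        self._not_empty = threading.Condition(self._big_lock)

    def push_back(self, item):                       # producer side
        with self._not_empty:
            self._items.append(item)
            self._not_empty.notify()

    def pop_front(self):                             # consumer side, blocks if empty
        with self._not_empty:
            while not self._items:                   # guard against spurious wakeup
                self._not_empty.wait()
            return self._items.popleft()
```

Producers and consumers never touch the deque simultaneously, so the half-copied-node hazard cannot arise, at the cost of zero concurrency between the two ends.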

decreting vs accreting curves in a YC Group — basics

Suppose you hold an asset (say a stock) on borrowed money. Over 1 year,

A) You pay interest on the loan. Interest amount is determined by your credit rating and the prevailing risk-free rate (but more commonly Libor). If your credit is excellent, then this interest amount primarily reflects inflation and the gradual diminishing purchasing power of one unit of USD (or whatever currency).

B) The asset in your portfolio also accrues in market-value in USD, due to dividend or coupon income, or you can lend it out on the repo market to earn a fee.

Both of these sums grow with holding duration. A partially offsets B. Net appreciation of the asset is often positive. Note A depends on the borrower and B depends on the asset.

I believe most professional traders (buy-side or sell-side) do most trades on borrowed money — Leverage. They start with $1m, and through credit relationship, can use $10m — leverage ratio of 10.

Even if you trade using your own $500k, you are still “paying” or forgoing the interest (opportunity cost) you could earn on the $500k.

underlying price is equally likely to +25% or -20%

See also P402 [[CFA textbook on stats]], which says the Black-Scholes “model is based on a normal distribution of underlying asset returns which is the same thing as saying that the underlying asset prices themselves are log-normally distributed.” Actually, many non-BS models make the same assumption, but my focus today is the 2nd part of the sentence.

At expiration, the asset has exactly one price as reported on WSJ. However, if we simulate 1000 experiments, we get 1000 (non-unique) expiration prices. If we plot them in a __histogram__, we get a kind of bell curve. But in Black-Scholes’ (and other people’s) simulations, the curve will resemble a log-normal bell. Reason? …..

Well, they tweak their simulator according to their model. They assume underlying price is a random walker taking many small steps, whose probability of reaching 125% equals probability of dropping to 80% at each step. (But remember the walks are tiny steps, so 80% is huge;) Now the reason behind the paradoxical numbers —

  log(new_px/old_px) is normally distributed, so log(1.25)=0.097 and log(0.8)= -0.097 (base-10 logs) are equally likely.

Now if we do 1000 experiments and compute the log(price_relative), we get another histogram – a normal (NOT log-normal) curve. Note Price-relative is the ratio of new_Price / old_Price over a holding period.

Here’s another experiment to illustrate log-normal. Imagine a volatile stock (say SUN) price is now $64. How about after a year? Black-Scholes basically says it’s

   equally likely to double or halve.
Double to $128 or halve to $32. log2(new_Price / old_Price) would be 1 or -1 with equal likelihood. Intuitively,

   log (new_Price / old_Price) is normally distributed.

Now consider prices after Year1, Year2, Year3… log2(S2/currentPx) = log2(S2/S1  *  S1/currentPx) = log2(S2/S1) + log2(S1/currentPx). In English this says base-2 log of overall price-relative is sum of the log of annual price-relatives. Among the 3 possible outcomes below, the $256 likelihood equals the $16 likelihood, and is 50% the $64 likelihood.
double-double -> $256
double-half -> $64 unchanged
half-double -> $64 unchanged
half-half -> $16

This stock can also appreciate/drop to other values beside $256,$64,$16, but IF the $256 likelihood is 1.71%, then so is the $16 likelihood, and the $64 likelihood would be 3.42%. We assume no other price “path” will end up at $64 — an unsound assumption but ok for now.

Since log(S2/S1) is normally distributed, so is the sum-of-log. Therefore log(S2/currentPx) is normally distributed.

     log(price-relative) is normal.
     log(cumulative price-relative) is normal for any number of intervals. For example,

Price_After_2years/current_Price is equally likely to double or halve.
Price_After_2years/current_Price is equally likely to grow to 125% or drop to 80%.

More realistic numbers — when we shrink the interval to 1 day, the expected price relative looks more like

      “equally likely to hit 101.0101% or drop to 99%”
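
The paradoxical numbers are easy to check by simulation (a hedged sketch; step sizes and counts are arbitrary). Each step multiplies the price by 1.25 or 0.8 with equal probability, so each step adds +0.097 or -0.097 to log10(price relative), and the sum over many steps is symmetric around 0: normal in log space, log-normal in price space.

```python
import math
import random

random.seed(7)  # deterministic demo

def simulate_log_relative(n_steps=100):
    """log10 of the cumulative price relative after n random-walk steps."""
    log_rel = 0.0
    for _ in range(n_steps):
        log_rel += math.log10(random.choice([1.25, 0.8]))  # +/-0.097 per step
    return log_rel

samples = [simulate_log_relative() for _ in range(5000)]
mean = sum(samples) / len(samples)
print(round(mean, 3))  # close to 0: up and down moves balance in log space
```

A histogram of samples would show the normal bell; exponentiating each sample gives the log-normal price histogram.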

Func/Action – 2 super delegates

Func is an extremely versatile super-delegate. So is Action. In my limited experience using c#, every delegate Type I used has an FA form i.e. a derivative delegate type from Func or Action. Examples –

ThreadStart has an FA form “Action”
ParameterizedThreadStart has an FA form “Action<object>”
KeyEventHandler has an FA form “Action<object, KeyEventArgs>”
Predicate<T> has an FA form “Func<T, bool>”

However, Func/Action are not always popular in the BB case as defined in the post about 2 fundamental categories.

Q: so what’s the difference between the FA form and the original type?

Any method to be “chained” into a KeyEventHandler delegate Instance can also be chained into a delegate Instance of the FA form, but the 2 Types aren’t 2 aliases of the same Type. To understand this, let’s go back to basics. In java or c#, Golfer and Violinist interfaces can both comprise a single void play() method, so the 2 interfaces are equivalent, but the VM regards them as 2 distinct types and won’t allow casting.

It’s impractical to work exclusively with the super delegates. These types have meaningless names.

Q: So where can the super delegate Types be useful?
A: when you need a throwaway delegate type, like the AA case as defined in the post about 2 fundamental categories.
A: LINQ has similar comments on the super delegates

GS xp: 1)decent design 2)faster n faster turnaround

In my first 12 months at GS, I would work hard on a feature, complete it in a day, then proudly present it to my team lead, only to hear from him “good work, but why did it take so long? So-and-so could have done it in 2…”

This was just one of those awakening moments that slowly (yes, slowly) made it clear to me that my TL wanted 1) decent design 2) faster and faster turnaround.

Black-Scholes: flawless for volatility inversion

Inverting a given option premium to an implied vol via BS is uncontroversial and unaffected by volatility skew/smile. If the risk-free rate, spot price and time to expiry (aka TTL) are all /unanimously/ observed, then according to the BS equation there’s a one-to-one mapping between option premium and implied vol. It’s like converting kilograms to pounds. Therefore exchanges often quote premiums in vol terms, forcing everyone to use BS to back out the dollar values.

That doesn’t mean we agree to all the BS assumptions, including the constant-vol assumption.

BS is perfectly fine for inversion + … perhaps … greek calculations….

We can safely use the original BS equation for inversion, and then completely discard the BS model after that. We can use a different model (usually related to BS) to describe or model the dataset of implied vol numbers. Some common models include

local-vol model
SABR model
(SABR is a stochastic-vol model; local-vol is, strictly speaking, a deterministic-vol-function model, though both are calibrated to the same smile.)
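
A sketch of the inversion in python, with illustrative numbers (not from any real quote): bisection works because the BS call price increases monotonically in vol.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, t, r, vol):
    """Black-Scholes European call premium (no dividends)."""
    d1 = (math.log(spot / strike) + (r + 0.5 * vol * vol) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    return spot * norm_cdf(d1) - strike * math.exp(-r * t) * norm_cdf(d2)

def implied_vol(premium, spot, strike, t, r, lo=1e-4, hi=5.0):
    """Invert premium -> vol by bisection; valid because price is monotonic in vol."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, t, r, mid) < premium:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round-trip: a 25% vol prices the option, and inversion recovers 25%.
px = bs_call(100.0, 100.0, 1.0, 0.02, 0.25)
assert abs(implied_vol(px, 100.0, 100.0, 1.0, 0.02) - 0.25) < 1e-6
```

This is exactly the one-to-one mapping: once the other inputs are agreed upon, premium and vol are interchangeable quotes.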

which (vendor-)implementations to invest in

I may have talent in comp science theories. But I place my highest emphasis on familiarity with implementations — tools, utilities, products, packages, jars, building blocks — download-able stuff.

With theories, a talented student can progressively go deeper, higher, sharper, and in less time as her foundation strengthens and broadens. I feel less so with tools. Familiarity with implementations takes a hell of time.

For example, when something doesn’t work with something else, it has often taken me anywhere from a few minutes to a few days to find out why. Therefore I have a strong aversion to new technologies and a strong affinity for mature, time-honored tools.

For example, countless designs work great on paper or during interviews. The #1 thing about any tool is the limitations you are likely to hit. Even a minor drawback can derail your entire project.

Some tools I’m investing in (either my old turf or new targets, marked +) and some I’m divesting (marked –):
++ c/java debuggers
++ sybase, oracle, mssql
+ rv, MQ (JMS is a spec, not an implementation)
+ unix sockets
+ eclipse
+ pthreads implemented in linux and solaris
+ python + solaris/linux latency tuning
+ pl/sql?
– spring / hibernate
– bash customization
– http
– mysql, php
– vi — needs 2 years. See [[Productive Programmer ]]
– any local system knowledge

Milestone events in a growing thread pool

M22a) core threads all busy
M22) start queuing

M33a) queue full
M33) create first thread beyond core threads

M44) 1st rejected task

As shown above, task queue has a max size; thread pool has a core size and max size.
P194 [[Java threads]] has a good illustration of these milestones in a “bounded pool” i.e. a pool with a bounded queue. Here’s my modified version based on P194:
* Pool starts with no thread no task
* First task -> create first thread
* tasks come in a flurry -> M22
* Queue full -> M33
* Pool capacity reached -> M44

How about an unbounded pool? M33 never happens.
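
The milestones can be sketched as a toy decision function in python. This is a model of the submit logic of java.util.concurrent.ThreadPoolExecutor, not real executor code:

```python
def on_submit(pool_size, core_size, max_size, queue_len, queue_cap):
    """Toy model of a bounded thread pool's decision when a task arrives."""
    if pool_size < core_size:
        return "create core thread"      # before M22a
    if queue_len < queue_cap:
        return "enqueue"                 # M22: core threads all busy, start queuing
    if pool_size < max_size:
        return "create extra thread"     # M33: queue full (M33a), grow beyond core
    return "reject"                      # M44: pool and queue both saturated

# Unbounded queue (queue_cap = infinity) never reaches M33 or M44:
assert on_submit(4, 2, 4, 10**6, float("inf")) == "enqueue"
```

Walking core=2, max=4, queue_cap=10 through a flurry of tasks reproduces the milestone sequence on P194.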

handle all tibrv advisory messages

For each tibrv transport object, you need to link it to a callback
object like this. If you leave one transport unattended, advisories
arriving on it will go unhandled. Below is a partial solution that is almost water-tight.

new TibrvListener(new TibrvQueue(), mySysAdvisoryCallback,
        transport, "_RV.>", null); // args: queue, callback, transport, subject filter, closure

static private TibrvMsgCallback mySysAdvisoryCallback = new TibrvMsgCallback() {
    public void onMsg(TibrvListener listener, TibrvMsg msg) {
        if ("_RV.WARN.SYSTEM.LICENSE.EXPIRE".equals(msg.getSendSubject())) {
            log.warn("Received advisory message: " + msg);
        }
    }
};

convenient spring jdbc methods

1a) Query for map – if your query/proc returns just a single row and you
don’t want to create a new bean class.
* Column names had better be distinct, which is usually easy.
* Similar to boost tuple.
2) Query for object (of your custom class) – if your proc/query returns
a single row, and you already have a bean class to hold all attributes.
3) queryForObject(String sql, String.class) – if you are sure to get
only one row and one column, returned as a single String
4a) Query for int – if your query/proc returns nothing but an integer
4b) Query for object – if your proc/query returns a single Date
5) Row mapper – if your query/proc returns a result set

##[11]upstreams(sg gov)for next 10Y #wallSt tech

(See also blog post on 2010 GS annual report.) Singapore government is smart to always look out for upstream domains. Similarly, in financial IT, there are some upstream *domains* too.

Upstream: real time risk — upstream in derivative risk … but poor Market Depth
Upstream: grid — Memory virtualization, grid computing
Upstream: Eq(and FX) HFT and algo trading is upstream to other asset classes
Upstream: Option-related (including mtg) analytics tend to be more advanced, and probably upstream. Any derivative with optionality tends to be dominated by vol in terms of pricing complexity.
Upstream: connectivity — collocation, streaming,
Upstream: latency — sub-m, multi-core, non-blocking…
Upstream: voluminous market data processing — storage, distribution…FX and eq option has the highest volume, and often dominates entire infrastructure. Many asset classes are far behind
Upstream: SecDB — firmwide singleton mkt risk engine. Upstream in m-risk technology. Most shops have data silos.

However, upstream technologies often become fads.
– C++? not really upstream. A strong c++ guy isn’t necessarily strong in java or C#.
– web service
– solace-like messaging hardware
– CEP? How much real value does it add?
– rule engines
– MortgageBackedSecurity pricing? Not really upstream, though most complex analytically.
– ETL?
– hibernate?
– python?

which "Technology" were relevant to 2010 GS annual report #upstream

(Another blog post. See also the post on Upstream)

Re the GS annual report page about Technology… When yet another market goes electronic, commissions drop, bid/ask spreads drop, profit margins drop, trade volumes increase, competitions intensify. So which IT systems will rise?

market data – tends to explode when a market goes electronic.
* tick data
trade execution including
* order state management
* order matching, order book
real time risk assessment
real time deal pricing
offer/bid price adjustment upon market events
database performance tuning
distributed cache
MOM ie Message-oriented-middleware
multi-processor machines
grid computing

How about back office systems? If volumes escalate, I feel back office systems will need higher capacity but no stringent real time requirements.

On the other hand, what IT systems will shrink, fade away, phase out? Not sure, but overall business user population may drop as system goes low-touch. If that happens, then IT budget for some departments will shrink, even though overall IT budget may rise.

In a nutshell, some systems will rise in prominence, while others fall.

Executor vs ExecutorService — javadoc ties together
– ExecutorService — the extended interface
– Executor — the parent interface
– Executors — the all-static utility class, similar to Collections and Arrays.

This Javadoc hints that

JDK *implementations* i.e. concrete classes provided in this package implement ExecutorService, which is a more extensive interface than Executor. The Executors class provides convenient factory methods for these Executors.

Remember the Lab49 IV question? There are just 2 executor services (bundled in JDK), so better know them
* ThreadPoolExecutor
* ScheduledThreadPoolExecutor

finance presents distinct risks to different people

Investors talk about returns, growth, opportunities, aggressive/conservative, hedging, tail risk, risk profile… —> market risk. Not credit risk, not liquidity risk, not counter-party risk.

Regulators talk about systemic risk, controls, reserves, transparency, protecting (those to be protected). They mean —> liquidity risk.

Exchanges talk about integrity, stability… —> c-risk

Traders are accused of taking the profit but not the risk, since it’s other people’s money. We are talking about —> market risk, not liquidity risk. Credit analysis and approval is, I guess, not the trader’s job.

In an economy, investment banks are dwarfed by commercial banks. For their credit card, car loan, student loan and mortgage departments, risk means —> credit risk, not liquidity risk, not market risk.

Hedge fund and mutual fund traders? market risk

Hedge fund owners during a crisis? liquidity risk

STL algo – iterators always; containers never

The ObjectSpace STL manual said “STL algorithms only access data via
iterators”. I don’t know if this is 100% true today, but the STL creators
indeed designed their algorithms to be container-agnostic.

STL was designed, first and foremost, around building-block data structures
known as containers. Essential operations on containers are provided in
2 forms

– methods of the container classes (class templates actually). These are
“trusted” members and therefore can modify container content.
– container-agnostic free-functions known as STL algorithms

The link between the algorithms and containers is the iterator.
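
The iterator-centric design has a close analogue in python, where “algorithms” accept any iterable rather than a specific container. A hedged sketch (my_count_if is a made-up name, modeled on std::count_if):

```python
from typing import Iterable

def my_count_if(it: Iterable, pred) -> int:
    """Container-agnostic, like std::count_if(first, last, pred):
    it only pulls elements through the iteration protocol."""
    return sum(1 for x in it if pred(x))

# One implementation serves list, set and generator alike:
assert my_count_if([1, 2, 3, 4], lambda x: x % 2 == 0) == 2
assert my_count_if({1, 2, 3, 4}, lambda x: x > 1) == 3
assert my_count_if((x * x for x in range(5)), lambda x: x > 5) == 2
```

In both languages the payoff is the same: M algorithms x N containers need M implementations, not M x N.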

bond duration vs option sensitivity greeks

Just as an option’s market value has sensitivities to underlier price, vol, time to expiration… collectively known as the greeks, a fixed-income portfolio valuation has a much-watched sensitivity to yield, known as Modified-Duration.
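
To make the duration-as-sensitivity analogy concrete, here is a python sketch that computes modified duration by bumping the yield of a plain annual-coupon bond (illustrative numbers):

```python
def bond_price(face, coupon_rate, years, y):
    """Price an annual-coupon bullet bond by discounting each cash flow at yield y."""
    c = face * coupon_rate
    return sum(c / (1 + y) ** t for t in range(1, years + 1)) + face / (1 + y) ** years

def modified_duration(face, coupon_rate, years, y, dy=1e-6):
    """Modified duration = -(1/P) * dP/dy, via a central finite difference."""
    up = bond_price(face, coupon_rate, years, y + dy)
    dn = bond_price(face, coupon_rate, years, y - dy)
    return -(up - dn) / (2 * dy) / bond_price(face, coupon_rate, years, y)

# A 10y 5% bond at a 5% yield prices at par, with modified duration ~7.72:
assert abs(bond_price(100, 0.05, 10, 0.05) - 100.0) < 1e-6
assert abs(modified_duration(100, 0.05, 10, 0.05) - 7.7217) < 1e-3
```

The bump-and-reprice trick is the same one used for numerical option greeks, which is exactly the analogy in this post.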

Interestingly, the most complex part of bond duration is … option-adjusted spread, for bonds with embedded options!

A 2nd common type of bond-with-options is caps/floors. Is any Greek widely used for these? Yes, since these instruments are mathematically equivalent to calls/puts on an underlying bond.

I feel caps/floors are less popular than swaption, whose market is quite large.

delegate recipe:utility static class2call instance method@any object

Say you have a (shared) _static_utility_ class freely usable from any client class, but this utility class should not “know” about (i.e. see) those client classes. Suppose the utility class’s main job is to asynchronously call some public WCF query service and provide a static callback method that handles the data received.

Now suppose some Student1 object (observer) is interested in the query result and would like to receive a callback. Here’s a naïve solution – the Student class ignores the utility class, directly invokes the WCF query and privately handles the data received. This duplicates code between 1) the utility class and 2) the Student class.

The better solution is for the static utility class’s callback method (observable) to invoke a (non-static) method on Student1. But the utility class should NOT hold a compile-time reference to Student1, right? The utility class source code should be free of the Student class or any client class. If the util class is in namespace Util, then it should not import the Student class into its Util namespace. This way, the util class is reusable in another project without the Student class. If the util class were to include “using The.Name.Space.Of.Student” then it won’t compile without the Student class — such a util class would be tightly coupled with the Student class.

Solution – the utility static class exposes a static event whose type is a dummy void-void delegate. You could also “upgrade” to the standard System.Action delegate, but to illustrate the simplicity of the technique I will use the void-void delegate, probably the simplest delegate in the world.

        public delegate void Dlg1();
        public static event Dlg1 SnapshotsReceived;
        private static void TheCallback()
        {
            if (SnapshotsReceived != null)
                SnapshotsReceived(); // fire event (null until someone subscribes)
        }

Now our Student1 object can register interest in SnapshotsReceived, by passing a delegate wrapping a Student1 reference (a reference to itself) into the static util class! Under the hood, the static class has a hidden static invocation list, each entry holding a 2-pointer thingy (target object + method). The Student1 reference sits in the invocation list.
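
Here is a rough python analogue of the recipe (QueryUtil, Student1 and the method names are invented): a module-level list plays the role of the event’s hidden invocation list, and each registered bound method is the “2-pointer thingy” of target object plus method.

```python
class QueryUtil:
    """Static-style utility; imports cleanly without knowing any client class."""
    snapshots_received = []           # the "invocation list": bound methods

    @classmethod
    def the_callback(cls, data):      # e.g. invoked when the async query returns
        for handler in cls.snapshots_received:
            handler(data)             # "fire the event"

class Student1:
    def __init__(self):
        self.latest = None
        # register interest: a bound method carries a reference to self
        QueryUtil.snapshots_received.append(self.on_snapshot)

    def on_snapshot(self, data):      # instance method invoked via the "event"
        self.latest = data

s = Student1()
QueryUtil.the_callback("px=99")
assert s.latest == "px=99"
```

As in the C# version, QueryUtil never names Student1; the coupling lives entirely in the subscriber.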

dependency graph weaved into a mkt-risk engine (secDB@@)

Imagine 2 objects in your risk engine’s memory space. If Object A2 depends on Object I3, then an I3 modification automatically triggers a change in A2 — the defining feature of a dependency-graph system. All spreadsheet implementations meet this requirement.

By the way, disk-based OODBs are unsuitable by themselves. Both objects must be “loaded” from the OODB into memory for this to work, just as a spreadsheet must be loaded from disk into memory.

For automatic chain reaction, every dependent part of A2’s object state (say, the volatility attribute) should be implemented as something other than a plain member variable.

Q3a: How is this attribute implemented physically? It is serializable and can be “stored” to the disk-based OODB. You can imagine a hidden object holding this volatility value.
A: There’s an alternative to a regular field — whenever anyone requests the vol value, it is recomputed from the upstream objects I3 etc. That may hit many nodes/cells/objects every time. Performance-wise, the A2 object gets a hidden field to hold the cached-function-result of the vol() method call.

Q3b: So in our example, is vol a method or a field?
A: Probably a method + a hidden field to hold the cached result.

Q3c: In many dependency-graph systems, the syntax makes vol look like a field of the A2 object. Why?
A: C++ allows you to overload “operator()”, so a call to “vol()” can get and set a field. Likewise in python, you can have a vol() method that gets and sets a field. The field is the cached-function-return-value. In python, the vol attribute may look like both a field variable and a method, but it’s a method.

Q: How does the downstream object A2 get triggered when upstream object I3 changes value? An async cache trigger?
A: I feel a spreadsheet drives the chain reaction in one thread.

Online chatter — “I’ve looked at Slang a bit. It’s an interpreted _dataflow_ language running on an in-memory database called SecDB. Untyped, Pascal-ish, __single-threaded__. Like a spreadsheet, it only needs to recompute the subgraph that has changed.”

“like a spreadsheet, it only needs to recompute the subgraph that has changed.” Yes that’s right – and that is a key feature for GS – they need to be able to revalue huge trading books ~ real time. (Most other banks estimate impact of intraday moves, and do a re-price overnight in batch)

In one implementation of dependency-graph, “Assigning to a cell method {like our volatility()} will execute a setValue. This will propagate the changes to any other methods/cells/nodes that are dependent upon it.”

To facilitate spreadsheet-like programming,

* dependency relationships between functions and intermediate results
** I guess this means dependencies among objects, functions (which are objects in python) and methods
* Python language feature of decorators to define the nodes (aka: cells) in the graph
* Cells know how to serialize and deserialize themselves into and from the object DB
* When a given cell is tweaked, dependents change accordingly, including the UI
* framework is lazy and caches intermediate values to increase performance
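
A minimal python sketch of such a dependency graph, with class and method names invented for illustration (this is not Slang or SecDB code): each cell caches its value, a tweak invalidates the downstream subgraph, and the dirty cells are recomputed lazily on the next read.

```python
class Cell:
    """A graph node: lazily computed, cached, invalidated when upstream changes."""
    def __init__(self, compute, upstream=()):
        self.compute = compute              # function of the upstream values
        self.upstream = list(upstream)
        self.dependents = []
        for u in self.upstream:
            u.dependents.append(self)       # wire the reverse edges
        self._cache, self._dirty = None, True

    def set(self, value):                   # tweak a leaf cell
        self.compute = lambda: value
        self._invalidate()

    def _invalidate(self):                  # chain reaction, single-threaded
        self._dirty = True
        for d in self.dependents:
            d._invalidate()

    def get(self):
        if self._dirty:                     # recompute only when stale
            self._cache = self.compute(*(u.get() for u in self.upstream))
            self._dirty = False
        return self._cache

spot = Cell(lambda: 100.0)                  # upstream object "I3"
a2 = Cell(lambda s: 2 * s, [spot])          # downstream object "A2"
assert a2.get() == 200.0                    # computed once, then cached
spot.set(50.0)                              # tweak -> invalidates a2
assert a2.get() == 100.0                    # recomputed lazily
```

Untouched branches of the graph keep their cached values, which is the spreadsheet-style laziness described above.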

secDB – designed for mkt-making and prop trading

GS 2010 annual report said “A particular technological competitive advantage for GS is that GS has only one central risk system.” Presumably, this is a market risk system. Not much credit risk, or  liquidity risk, or counter-party risk.

It’s instructive to compare several major business models
+ prop-trading and principal-investment — maintain positions ==> need secDB
+ dealing and market-making — need to maintain (and hedge) positions ==> probably needs SecDB esp. option/swap market making
+ security lending — holds client’s positions under bank’s name ==> Needs secDB
– brokerage and agency-trading — don’t need to maintain positions by strict definition. In practice I guess they often lend assets to clients and hold client’s positions as collateral. That would need secDB
– asset management — managing client’s money only, so maintain positions on behalf of clients. If you care about the fund manager’s positions, then you need secDB. In addition, I guess managers often co-invest — need secDB
– high-volume, low-margin electronic trading is usually(??) agency trading
+ if there’s a high volume system that needs secDB, I’d guess it’s prop trading similar to hedge funds

Looking into the design of SecDB (dependency graph + OODB), I feel it’s designed for prop desk.

In citigroup, the research department supports prop desk as its #1 user. I guess the market maker desk would be #2.

When GS say “aggregate positions across desks”, I feel it means prop desk + sell-side “house” desks. I think they mean house money i.e. positions under GS own name, not client names. Risk means risk to the bank’s positions, not client’s positions. (No bank will spend so much money to assess risk to clients’ positions.)

secDB: one-sentence description of its goodness

“The system enables us to take virtually every position we have in the firm and revalue them thousands of times every night under all sorts of different extreme scenarios to work out what sorts of risk we have”, said Robert Berry, Head of Market Risk for Goldman Sachs.

SecDB evolved to include diverse capabilities, but at the core it’s a “dependency-graph-aware, firmwide, position valuation engine”. A few key points —
* “every position” — firmwide. Few banks have a single, firmwide risk engine. They have data silos.

* all these positions must load into memory so their attributes can automatically update in response to their upstream objects. It helps a lot if there’s some “infrastructure software” that permits these positions to load into distributed nodes virtualized into one in-memory database.

* “thousands of times” — such performance requires in-memory operation. GS veterans told me secDB is an in-memory DB, but I guess it’s probably disk-based but can quickly load into memory, like KDB and other time-series databases.

It’s clear to me that among all the key data types (trades, instruments, accounts, rates, correlation coefficients, …) the first among equals is the position. As in any risk engine on any trading desk, I feel positions are the most important entity in the SecDB risk engine. To an IT guy, each position object must have an instrument and an account, but to the business, positions are the basic element of analysis.

risk system can be front office #quartz

Most market-risk (and most credit-risk) IT systems are considered middle office, but Kirat Singh’s presentations suggest a specialized developer role in the risk/research space is more “front office”, because these developers interact with quants and traders.

How do we identify such a developer position among millions of risk developer positions?
+ real time (at least intraday) risk numbers
+ job responsibility often mentions risk and pricing in one breath
+ pre-trade analysis
+ a key part of the trading strategy
+ a key part of models
+ business users are prop traders or fund managers, not agency traders
+ derivatives need more risk management than cash products

In contrast, middle office risk systems
– are heavily batch-driven
– i feel some traders treat those risk numbers as unreliable, unlike the real time risk numbers they look at before every trade.

portions@secDB – created by quants, for quants@@

I feel Quants and others around them don’t always point out the real difference between Traders, Coders and Quants aka strategists. In this context we have to.

Once secDB core is built, SecDB is “programmed” in each desk by regular software developers, not  secDB core team. Business end-users are risk managers and traders, majority of whom are probably not trained for or fascinated by Slang or python.

In terms of formal training, regular software developers are trained in engineering; quants in math; traders in investment. (Think of prop trading, to keep things simple.)

However, some quants qualify as software developers. (Actually, most c++ analytic libraries are owned by these part-time developers:-). Some sharp minds in the camp eventually realize 1) the inherent dependency graph in this world, and then rediscover 2) OODB. Put the customized quantitative dependency-graph-aware objects into an OODB and SecDB is born. Grossly oversimplified, but at this juncture we have to swallow that in order to wrap our mind around a monstrous concept, step back and see the big picture. (signal/noise ratio and focus badly needed.)

Another article said secDB “allows Goldman’s sales force and traders to model, value and book a transaction”, confirming that model and valuation are among the first goals of secDB.

Quants create this platform for themselves (ie for quantitative research, back testing, model tuning…), and then persuade developers to adopt. In such a world, quants are the law-makers. Power shifts from traders and IT to quants. In the traditional world, IT resembles the powerful equipment manufacturers of the dotcom era.

Now I feel the creators of SecDB may not be the typical desk quant, risk quant or strategist. He or she might be a quant developer

I guess Quants’ job is to analyze 1) financial instruments and 2) markets, and try to uncover the quantitative “invisible hands”. (Elsewhere, I question the value of this endeavor.) They _numerically_ analyze real time market data, positions, recent market data (all fast changing), historical data, product data…. Too much data to be digestible by business, so they condense them into mathematical “models” that attempt to explain how these numbers change in relation to each other. A classic example is the term structure and the BS diffusion model. A proven model acquires a life of its own —

– become part of a trading strategy
– sell-side often offer these models as value-added services to hedge funds. I guess they can send trading signals or digested, derived data to hedge funds.
– risk mgr are often the #1 user of these models. Risk mgr and quants validate models by back testing.

xaml code behind – phrasebook

Views — only views can have a code behind.
Behind — a view. The code exists Behind a view

*.cs sits behind *.xaml — code behind is a someName.xaml.cs file, behind a someName.xaml file

fields — named elements defined in the xaml are directly accessible as instance fields of the code behind class.

Part of the view — the code behind class is part of the view, not the VM or M.

event registration — often registered in xaml or (programmatically) in the code behind.

data binding?

basic eclipse c++ Q&A

Q: project rebuild fails — the exe is permission-denied
A: perhaps the debugger still holds the exe in use.

Q: funny/weird characters in error messages and tooltips?
A: define environment variable LANG=en_US.ISO-8859-1 in Window->Preferences->C/C++->Environment
A: In toolchainEditor (TCE) if you chose cross GCC, you can get cleartext error messages. You also see Includes in Project view

Q: binary not found
A: maybe there’s nothing for this project under runConfig? Your *.exe (full path) should show up there.

Q: basic includes for every *.cpp file?
A: #include <iostream>
using namespace std;

Q: “This application has requested the Runtime to terminate it in an unusual way. Please contact the application’s support team for more information.”
A: perhaps you access argv[1] without any command line arg?

Q: where’s the *.exe file generated for my project?
A: in the Debug folder

Q: ‘strlen’ was not declared in this scope
A: #include <cstring>

Q: undefined reference to `WinMain@16′
A: no main()