#1 driver of long term FX #my take

Q: what is the #1 fundamental driver of long term FX rate between 2 currencies?

Is it PPP? I don’t think so.
Is it supply/demand? Short term yes; long term … this seems less relevant
%%A: “gold” backing.

In the beginning, every pound issued by the British monarch was as good as an ounce of gold (or whatever fixed amount of gold). Therefore anyone holding a pound note could exchange it for the equivalent amount of gold. The monarch then decided to print more pound notes. Logically, all the existing and new pounds devalued. But the country also exported and, in a world of universal gold standard, earned gold. In a more realistic world, what is earned is foreign currency.

I believe every Singapore dollar issued is backed by some amount of “gold”, which in the modern context means foreign reserves held in a basket of hard currencies. By the way, there’s nothing harder than gold. As the national economy expands, more goods are produced domestically so the CB could issue more SGD, but I guess the CB waits until the goods are exported[2] and the foreign currency is earned. Such a prudent practice helps ensure every SGD is fully backed by sufficient “gold”.

[2] incoming tourists also spend their own currency like JPY, therefore contributing to the Singapore foreign reserve. If tourists sell their JPY for SGD outside Singapore, the JPY amounts tend to flow into the global banking systems and back to Singapore banks. I think non-Singapore banks don’t want to go long or short SGD for large amounts, so they would eventually exchange with Singapore banks including Citibank, HSBC, SCB, BOC.

In general, a hard currency is more deeply *backed* by gold than a weak currency. Alternatively, the issuing central bank has other capabilities (besides gold reserves) to maintain the *strength* of its currency. This strength depends on the Current Account and economic growth, the 2 fundamentals. Another factor is global competitiveness, but this is often measured by the capacity to export and compete on global markets. In the end, it still seems to boil down to earning enough foreign currency to back our own currency: “gold” backing in disguise.


insert() — more versatile than assign() and range-ctor

[[effSTL]] P31 points out that range-iterators are used consistently across containers —

– Every (yes both sequence/associative) container supports a ctor taking a pair of range iterators
– All sequence (not associative) containers support an assign() method taking such a pair
– Every (yes both sequence/associative) container supports an erase() method taking such a pair

However, I’d argue the most versatile is the insert() method, found in every sequence and associative container.
* insert() can emulate the range-ctor
* insert() can emulate assign()

This is also more versatile than operator=().
This member function is also simpler than the free function copy().
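To make the two "emulate" claims concrete, here is a minimal sketch. The helper names buildByInsert, assignByInsert and setByInsert are hypothetical, made up for illustration; only the container insert() overloads themselves come from the standard library.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Emulate the range-ctor: start with an empty container, then insert the range.
template <class Seq, class It>
Seq buildByInsert(It first, It last) {
    Seq c;
    c.insert(c.end(), first, last);
    return c;
}

// Emulate assign(): wipe the old content, then insert the new range.
template <class Seq, class It>
void assignByInsert(Seq& c, It first, It last) {
    c.clear();
    c.insert(c.end(), first, last);
}

// Associative containers accept the same range, just without a position argument.
template <class It>
std::set<int> setByInsert(It first, It last) {
    std::set<int> s;
    s.insert(first, last);
    return s;
}
```

Unlike the range-ctor or assign(), the same insert() call can also add a range in the middle of a sequence without discarding what is already there, which is where I see the extra versatility.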

y create custom STL iterators, briefly

Q: when would you write your own STL iterator?

I think if you create your own customized container, you probably need custom iterator Objects. You probably need to return such an Object from rbegin() etc.
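A minimal sketch of what such a custom iterator Object could look like. IntStack is a hypothetical home-made container invented for this illustration, and it only offers a forward iterator via begin()/end(); a reverse iterator for rbegin() would need a bidirectional structure.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical home-made container: a minimal singly linked stack of ints.
class IntStack {
    struct Node { int value; Node* next; };
    Node* head_;
public:
    // The custom iterator Object: wraps a Node* and knows how to advance it.
    class iterator {
        Node* cur_;
    public:
        // These typedefs let std::iterator_traits and STL algorithms understand us.
        typedef std::forward_iterator_tag iterator_category;
        typedef int value_type;
        typedef std::ptrdiff_t difference_type;
        typedef int* pointer;
        typedef int& reference;

        explicit iterator(Node* n = nullptr) : cur_(n) {}
        int& operator*() const { return cur_->value; }
        iterator& operator++() { cur_ = cur_->next; return *this; }
        iterator operator++(int) { iterator old(*this); ++*this; return old; }
        bool operator==(const iterator& o) const { return cur_ == o.cur_; }
        bool operator!=(const iterator& o) const { return cur_ != o.cur_; }
    };

    IntStack() : head_(nullptr) {}
    IntStack(const IntStack&) = delete;            // keep the sketch simple: no copying
    IntStack& operator=(const IntStack&) = delete;
    ~IntStack() { while (head_) { Node* n = head_; head_ = n->next; delete n; } }

    void push(int v) { head_ = new Node{v, head_}; }
    iterator begin() { return iterator(head_); }
    iterator end() { return iterator(nullptr); }
};
```

Once the iterator exposes the five typedefs and the usual operators, standard algorithms such as std::accumulate or std::distance work on the home-made container unchanged.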

In python, c# and (less commonly) in java, many data structures can be made iterable. There’s an implicit iterator.

[[STL tutorial]] has a 3-page chapter showing a debuggable-iterator that reveals interesting details of the inner workings of STL containers and STL algorithms.

libor vs gov bond — 2 benchmarks

Most credit instruments, all (secured/unsecured) loans, all IR products, and most derivatives based on FX, IR or credit (I'd say virtually all derivatives) need to apply a spread on a reference yield. If the deal has a “maturity date”, “delivery date” or “call date” [1], then we look for that date on the reference yield curve and read a yield number (like 222 bps pa) off that curve. That number is the reference yield, a.k.a. reference spot rate. You can convert that to a discount factor easily. There are also straightforward and well-defined conversions to/from fwd rates, driven by arbitrage principles.

[1] perhaps among a series of such dates.
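The conversions mentioned above can be sketched as follows. This is only an illustration under one loud assumption: annual compounding. Real desks use various compounding and day-count conventions, and the function names discountFactor and forwardRate are made up here.

```cpp
#include <cassert>
#include <cmath>

// Discount factor for a spot rate quoted in bps per annum.
// ASSUMPTION: annual compounding, i.e. DF = 1 / (1 + r)^t.
double discountFactor(double spotBps, double years) {
    const double r = spotBps / 10000.0;
    return 1.0 / std::pow(1.0 + r, years);
}

// Forward rate between t1 and t2 implied by the two discount factors.
// No-arbitrage: investing to t2 directly must equal investing to t1 and
// rolling at the forward rate, so (1 + f)^(t2 - t1) = DF(t1) / DF(t2).
double forwardRate(double df1, double t1, double df2, double t2) {
    assert(t2 > t1);
    return std::pow(df1 / df2, 1.0 / (t2 - t1)) - 1.0;
}
```

For a flat curve at 222 bps, the 1-year discount factor is 1/1.0222 and the 1y-to-2y forward rate comes back as 2.22%, as expected.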

Question is which reference yield curve to use. Most companies use a single, consolidated curve for each currency. One of the biggest muni security trading desks in the world has just one yield curve for USD, which is typical. Another megabank has a single live Libor curve for the entire bank, updated by the minute.

If you use more than one yield curve built from different data sources, then for any maturity date, you would read 2 yield numbers off them. If the two numbers are sufficiently different, you create an arbitrage opportunity against yourself and your valuations become inconsistent.

On the short end of the IR space the reference curves are 1) Libor 2) Fed Funds (USD only). Libor is more popular.

On the long end, T-bond dominates the USD market. Many governments issue similar bonds to create a reference riskless rate.

However, the most liquid IR instruments are probably more realistic and reliable as a reflection of market sentiment. ED futures, Bund futures, T-bond inter-dealer market rates are examples.

c++highly parallel numerical programm`IV(smart ptr, threading…

These are mostly QQ type. Japanese firm https://www.numtech.com

Q: What are the differences between win32 and linux threading Implementation? I probably won’t spend time reading this.

Q: given a bunch of 10-year old linear algebra c++ functions (using many VD or MD templates as function inputs), how would you go about extracting and packaging them into a DLL? No experience. Not on my Tier 1/2.
%%A: stateless
%%A: pure functions.
%%A: Thread safe
%%A: remove code duplication
%%A: for each function, there should usually be default parameters, so we can call it with 2 args, 3 args, 4 args etc
A: function parameters should not be templates. Ints, double are probably fine. DLL interface is defined by the platform, not the language.
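A minimal sketch of the last answer, assuming a hypothetical legacy routine dotProduct and a hypothetical exported name la_dot. The template stays inside the library; only raw pointers, a size and an int return code cross the boundary. On Windows one would normally also mark the wrapper for export (for example with __declspec(dllexport) or a .def file); that part is left out to keep the sketch portable.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one of the old template-heavy linear algebra functions.
template <class Vec>
double dotProduct(const Vec& a, const Vec& b) {
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// Exported wrapper: C linkage, no templates, no classes in the signature.
// Stateless and free of shared data, so it is safe to call from many threads.
extern "C" int la_dot(const double* a, const double* b, std::size_t n, double* out) {
    if (!a || !b || !out) return -1;              // report bad input via return code
    std::vector<double> va(a, a + n), vb(b, b + n);
    *out = dotProduct(va, vb);
    return 0;
}
```

Returning an error code rather than throwing is deliberate in this sketch: letting C++ exceptions cross a DLL boundary is fragile when caller and library are built with different compilers.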

Q: what’s wrong with boost libraries?
AA: (STL is fine) many of them aren’t proven — just 10 years

Q: Since you said shared_ptr is the most popular boost module, how is reference count done in heavily parallel programming
%%A: for regular shared_ptr, thread safety is a big design goal, so probably fine[1]. For intrusive_ptr, the pointee class must expose certain mutator methods, which must be made thread-safe by the pointee class author, not boost authors

[1] This isn’t correct. The testing on shared_ptr thread safety isn’t sufficient for a large number (thousands) of threads. See https://bintanvictor.wordpress.com/2017/07/09/shared_ptr-thread-safety-take/
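A small experiment one could run on the reference count, sketched with std::shared_ptr and std::thread (C++11) as stand-ins for the boost equivalents; that substitution is my assumption, and hammerRefCount is a made-up name. It only exercises concurrent copying of the smart pointer itself. It says nothing about thread safety of the pointee, and a passing run with a handful of threads proves little about thousands.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Every thread repeatedly copies and drops its own copy of the same shared_ptr.
// Returns the reference count observed after all threads have finished.
long hammerRefCount(int nThreads, int copiesPerThread) {
    std::shared_ptr<int> sp = std::make_shared<int>(42);
    std::vector<std::thread> pool;
    for (int t = 0; t < nThreads; ++t) {
        // sp is captured by value, so each thread owns one copy for its lifetime.
        pool.emplace_back([sp, copiesPerThread] {
            for (int i = 0; i < copiesPerThread; ++i) {
                std::shared_ptr<int> local = sp;   // ref count incremented
                (void)local;                       // ref count decremented at scope exit
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return sp.use_count();
}
```

If the increments and decrements were not atomic, the final count would drift away from 1 (or the control block would be freed early). Depending on the platform, building this may need the -pthread flag.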

 

double-ptr usage #2b — swap 2 pointers

(Note this is a special case of “special reseating” — http://bigblog.tanbin.com/2011/04/double-pointer-usage-2-special.html)

Q1: write a utility function to swap the content of 2 pointers.
%%A: swap(int** a, int** b) {….}
Obviously, the 2 pointer Variables must be declared to be compatible (to be elaborated [1]), such as
int *a1, *b1; …. swap(&a1, &b1);

To really understand why we need double pointers, consider

Q2: function to swap 2 nonref variables. In other words, after I call swap(..), x should have the value of y. If y used to be -17.2, x now has that value. Note this number isn’t an address.

In this case, you need to pass the address of x like &x….

To understand this Q2, it’s actually important to be thoroughly familiar with

Q3: what does swap(char i, char j) do?
%%A: something other than swapping. I doubt it does anything meaningful at all. It receives two 8-bit chars by value. It doesn’t know where the original variables are physically, so it can’t put new values into those original variables, so it can’t swap them.

Any swap must use some form of pass-by-reference.
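The three questions side by side, as a minimal sketch. The function names are mine, chosen to avoid clashing with std::swap.

```cpp
#include <cassert>

// Q3: receives two copies, so the caller's variables are never touched.
void swapByValue(char i, char j) { char tmp = i; i = j; j = tmp; }

// Q2: swap two plain (nonref) doubles through their addresses.
void swapDoubles(double* x, double* y) { double tmp = *x; *x = *y; *y = tmp; }

// Q1: swap two pointer Variables. To modify a pointer Variable from inside a
// function we need ITS address, hence the double pointer.
void swapPointers(int** a, int** b) { int* tmp = *a; *a = *b; *b = tmp; }
```

Note that swapPointers leaves the two pointees alone; only the two pointer Variables exchange the addresses they hold.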

Q1b: function to swap 2 arrays. After the swap, array1 is still a pointer but it points to the physical location of array2
%%A: same signature as Q1.

[1] void ** is probably a feasible idea but may require casting. Usually the 2 arguments a1 and b1 should have the same declaration. If a1 is Animal* and b1 is Dog* (Note the single asterisks), then such a polymorphic swap is no longer straight-forward and fool-proof.

Forex trading is Singapore’s strength

(A personal blog.)

I think for the past few months in the Singapore job market (java or c++), I didn't notice any fixed income, credit, equity domain roles. There are some cross-asset system positions, and there are commodities and FX positions. I feel FX is roughly half (up to 66%) of all the roles that pay a reasonable salary.

This is a hard lesson learned — I have to deepen my FX knowledge and track record otherwise the biggest chunk of jobs stay beyond my reach.

I feel in terms of domain knowledge, FX is more relevant (to high-end software jobs in Singapore) than volatility, bond math, exchange trading, structured products, etc

But why does FX pay above other fields? Here's what I came up with
– equity is small, less active in S'pore than HK. The high-paying eq-related jobs are usually in HK
– FX is perhaps more profitable at this moment
– FX is Singapore's traditional strength for decades. #4 behind Ldn, Nyk, Tky
– FX is high speed and high volume (in terms of market-data) so this places some stringent criteria on developer skill
– FX is more electronic, more standardized, more inter-connected than FI, commodities or derivative markets, on par with cash equities. More technical skills required.

It's interesting that FixedIncome has more complexity and has more profit potential but doesn't really pay comparable salary.

real/fake market-order + limit order in FX ECN

Some popular institutional ECNs offer 3 main types of orders. Taking buy orders for example, you can place

– limit order – to buy at or below a specified price. It might match some offers immediately. All remaining amounts on the order remain in the market
– IOC – fake mkt order – buy at or below a specified price. All unfilled amount is cancelled. This is more popular than the real mkt order
– real mkt order – order without a price
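The difference between the 3 order types can be sketched as one toy matching function. Everything here (Offer, OrderType, Result, matchBuy) is hypothetical and not any real ECN's API. I also assume that whatever a real market order cannot fill against an exhausted book is cancelled, which the description above does not specify.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct Offer { double price; long qty; };            // a resting sell order
enum OrderType { LIMIT, IOC, MARKET };
struct Result { long filled; long resting; long cancelled; };

// Match one buy order against offers sorted by ascending price.
Result matchBuy(std::vector<Offer>& offers, OrderType type, long qty, double limitPx = 0) {
    // A real market order carries no price, so every offer is acceptable.
    const double cap = (type == MARKET) ? std::numeric_limits<double>::infinity() : limitPx;
    Result r = {0, 0, 0};
    for (Offer& o : offers) {
        if (qty == 0 || o.price > cap) break;         // done, or next offer too expensive
        const long take = (o.qty < qty) ? o.qty : qty;
        o.qty -= take;
        qty -= take;
        r.filled += take;
    }
    if (type == LIMIT) r.resting = qty;               // remainder stays in the market
    else r.cancelled = qty;                           // IOC (and, by assumption, MARKET)
    return r;
}
```

The only difference between the limit order and the IOC is the last step: same price cap, but the unfilled remainder either rests or is cancelled.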

secDB — helps drv more than cash traders?

(Personal speculations only)

Now I feel secDB is more useful to prop traders or market makers with persistent positions in derivatives. There are other target users but I feel they get less value from SecDB.

In an investment bank, equity cash and forex spot desks (I guess ED futures and Treasury too) have large volume but few open positions at end of day [1]. In one credit bond desk, average trade volume is 5000, and open positions number between 10,000 and 15,000. An ibank repo desk does 3000 – 20,000 trades/day.

In terms of risk, credit bonds are more complex than eq/fx cash positions, but *simpler* than derivative positions. Most credit bonds have embedded options, but Treasury doesn't.

In 2 European investment banks, eq derivative risk (real time or EOD) needs a server farm with hundreds of nodes to recalculate market risk. That's where secDB adds more value.

[1] word of caution — Having many open positions intra-day is dangerous as market often jumps intra-day. However, in practice, most risk systems are EOD. I was told only GS and JPM have serious real time risk systems.

island rainfall problem – C array/pointer algo

#include <cstdlib>
#include <cstdio>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <assert.h>
using std::cout; // not "using namespace std": C++17 std::size would clash with the global size below
int const island[] = { 54, 50, 54, 54, 52, 55, 51, 59, 50, 56, 52, 50 };
///////////////   Pos # 0   1   2   3   4   5   6   7   8   9  10  11
int const size = sizeof(island) / sizeof(int);
int accu = 0;
template<class ForwardIterator>
ForwardIterator max_element_last(ForwardIterator first, ForwardIterator last) {
    ForwardIterator ret = first;
    if (first == last)
        return last;//empty range
    while (++first != last)
        if (*ret <= *first)
            ret = first;
    return ret;
}
void print1(int const* const a, char const * const label) {
    printf("%s=%d/%d ", label, *a, static_cast<int>(a - island)); // pointer difference is ptrdiff_t, not int
}
void printAll(int const* const L, int const* const l, int const* const h,
        int const* const H) {
    if (l < h) {
        print1(L, "wallL");
        print1(l, "ptr");
        printf("  ");
        print1(h, "ptr");
        print1(H, "wallH");
    } else {
        print1(H, "wallH");
        print1(h, "ptr");
        printf("  ");
        print1(l, "ptr");
        print1(L, "wallL");
    }
    printf("%d=accumulated\n", accu);
}
void onePassAlgo(){
    int*wallLo, *wallHi;
    int*l, *h; //moving pointers
    wallLo = l = const_cast<int*> (island);
    wallHi = h = const_cast<int*> (island) + size - 1;
    if (*l > *h) {
        std::swap(l, h);
        std::swap(wallLo, wallHi);
    }
    printAll(wallLo,l,h,wallHi);
    printf("All pointers initialized\n");
    while (l != h) {
        if (*l > *wallHi) {
            wallLo = wallHi;
            wallHi = l;
            std::swap(l, h);
            //printf("new wallHi:");
        } else if (*l >= *wallLo) {
            wallLo = l;
            //printf("new wallLo:");
        } else {
            accu += *wallLo - *l;
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wallLo - *l,
                    static_cast<int>(l - island), accu);
        }
        printAll(wallLo,l,h,wallHi);
        //now move the scanner
        if (l < h)
            ++l;
        else
            --l;
    }
}
void twoPassAlgo() {
    int const* const peak = max_element_last(island, island + size);
    printf("highest peak (last if multiple) is %d, at Pos %d\n", *peak,
            static_cast<int>(peak - island));
    //(island, island + size, ostream_iterator<int> (cout, " "));
 
    //forward scan towards peak
    int* pos = const_cast<int*> (island); //left edge of island
    int* wall = pos;
    for (++pos; pos < peak; ++pos) {
        if (*wall > *pos) {
            accu += *wall - *pos; // accumulate water
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wall - *pos,
                    static_cast<int>(pos - island), accu);
            continue;
        }
        //ALL new walls must match or exceed previous wall.
        printf("found new wall of %d^ at Pos#%d\n", *pos, static_cast<int>(pos - island));
        wall = pos;
    }
    cout << "^^^ end of fwd scan ; beginning backward scan vvv\n";
    //backward scan
    pos = const_cast<int*> (island) + size - 1;
    wall = pos;
    for (--pos; pos > peak; --pos) {
        if (*wall > *pos) {
            accu += *wall - *pos; // accumulate water
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wall - *pos,
                    static_cast<int>(pos - island), accu);
            continue;
        }
        //Note all new walls must match or exceed previous wall.
        printf("found new wall of %d^ at Pos#%d\n", *pos, static_cast<int>(pos - island));
        wall = pos;
    }
}
int main(int argc, char *argv[]) {
    onePassAlgo();
}
/*
 Requirement -- an island is completely covered with columns of bricks. If
 between Column
 A(height 9) and Column B(10) all columns are lower, then we get a basin to
 collect rainfall. Watermark level will be 9.  We can calculate the
 amount of water. If I give you all the columns, give me total rainfall collected.
 Code showcasing
 - stl algo over raw array
 - array/pointer manipulation
 - array initialization
 - array size detection
 - std::max_element modified
 - std::swap
 */

kurtosis — thick tail AND slender

All normal distributions have kurtosis == 3.000. Any positive “excess kurtosis” is known as leptokurtic and is a sign of thick tail.

The word leptokurtic originally means slender: in a histogram, the center bar is higher, i.e. higher concentration towards the mean. To my surprise, the extreme left/right bars are also __higher__, indicating thick tails. To compensate, the rest of the bars must be shorter, since all the bars in a histogram must add up to 100%.

In short, excess kurtosis means 1) slender and 2) thick tail.
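For reference, excess kurtosis can be computed from a sample as the 4th central moment divided by the squared variance, minus 3. The sketch below uses plain 1/n (population) moments, which is my assumption; statistics packages often apply small-sample corrections, and the name excessKurtosis is made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Excess kurtosis = m4 / m2^2 - 3, where m2 and m4 are the 2nd and 4th
// central moments computed with a 1/n divisor (no small-sample correction).
double excessKurtosis(const std::vector<double>& x) {
    assert(x.size() > 1);
    const double n = static_cast<double>(x.size());
    double mean = 0;
    for (double v : x) mean += v;
    mean /= n;
    double m2 = 0, m4 = 0;
    for (double v : x) {
        const double d = v - mean;
        m2 += d * d;
        m4 += d * d * d * d;
    }
    m2 /= n;
    m4 /= n;
    return m4 / (m2 * m2) - 3.0;
}
```

A sample like {-10, 0, 0, 0, 0, 0, 0, 0, 0, 10} shows both features at once: most of the mass sits exactly at the mean (slender) and two far outliers form the tails, giving a positive excess kurtosis of 2. An evenly spread sample like {1, 2, 3, 4, 5} comes out negative (-1.3).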

Thick tail is more important to many users as thick tail means more unexpected extreme deviations (from mean) than in the Normal distribution. Thick tail is unexplainable by Normal distribution and indicates a different, unidentified distribution.

However, the “Slender” feature is more visible and is the meaning of “leptokurtic”. The thick tail is almost invisible unless plotted logarithmically — see http://en.wikipedia.org/wiki/Kurtosis

island rainfall problem

#include <cstdlib>
#include <cstdio>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <assert.h>
using std::cout; // not "using namespace std": C++17 std::size would clash with the global size below
int const island[] = { 54, 50, 54, 54, 52, 55, 51, 59, 50, 56, 52, 50 };
///////////////   Pos # 0   1   2   3   4   5   6   7   8   9  10  11
int const size = sizeof(island) / sizeof(int);
int accu = 0;
//adapted from STL
template<class ForwardIterator>
ForwardIterator max_element_last(ForwardIterator scanner, ForwardIterator const end) {
    ForwardIterator ret = scanner;
    if (scanner == end)
        return ret;//empty range, with zero element!
    while (++scanner != end)
        if (*ret <= *scanner) //"=" means find LAST
            ret = scanner;
    return ret;
}
//print height and address of a column
void print1(int const* const pos, char const * const label) {
    //int const height = *pos;
    //pointer difference is ptrdiff_t, so cast before printing with %d
    printf("%s=%d/%d ", label, *pos, static_cast<int>(pos - island));
}
void printAll(int const* const L, int const* const l, int const* const h,
        int const* const H) {
    if (l < h) {
        print1(L, "wallL");
        print1(l, "ptr");
        printf("  ");
        print1(h, "ptr");
        print1(H, "wallH");
    } else {
        print1(H, "wallH");
        print1(h, "ptr");
        printf("  ");
        print1(l, "ptr");
        print1(L, "wallL");
    }
    printf("%d=Accu\n", accu);
}
//Rule: move the lo-side pointer only
void onePassAlgo(){
    int*loptr; //moving pointer, moving-inward.
    int*wallLo, *wallHi; //latest walls
    int*h;

    //1st we ASSUME the first left side wall will be lower than the first right side wall
    wallLo = loptr = const_cast<int*> (island);
    wallHi = h = const_cast<int*> (island) + size - 1;
    //2nd, we validate that assumption
    if (*wallLo > *wallHi) {
        std::swap(wallLo, wallHi);
        std::swap(loptr, h);
    }
    // now lo is confirmed lower than the hi side
    printAll(wallLo,loptr,h,wallHi);
    printf("All pointers initialized (incl. 2 walls)\n");
    while (loptr != h) {
        if (*loptr > *wallHi) {
            wallLo = wallHi;
            wallHi = loptr;
            std::swap(loptr, h);
            //printf("new wallHi:");
        } else if (*loptr >= *wallLo) {//see the >=
            wallLo = loptr;
            //printf("wallLo updated:");
        } else {
            assert (*loptr < *wallLo);
            accu += (*wallLo - *loptr);
            printf("adding %d liter of water at Pos_%d (%d=A)\n", *wallLo - *loptr,
                    static_cast<int>(loptr - island), accu);
        }
        printAll(wallLo,loptr,h,wallHi);
        // only by moving the loptr (not h) can we confidently accumulate water
        if (loptr < h)
            ++loptr; //lo side is on the left, move loptr right
        else
            --loptr; //lo side is on the right, move loptr left
    }
}
void twoPassAlgo() {//less convoluted
    int const* const peak = max_element_last(island, island + size);
    printf("highest peak (last if multiple) is %d, at Pos %d\n", *peak,
            static_cast<int>(peak - island));
    //(island, island + size, ostream_iterator<int> (cout, " "));

    //forward scan towards peak
    int* pos = const_cast<int*> (island); //left edge of island
    int* wall = pos;
    for (++pos; pos < peak; ++pos) {
        if (*wall > *pos) {
            accu += *wall - *pos; // accumulate water
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wall - *pos,
                    static_cast<int>(pos - island), accu);
            continue;
        }
        //ALL new walls must match or exceed previous wall.
        printf("found new wall of %d^ at Pos#%d\n", *pos, static_cast<int>(pos - island));
        wall = pos;
    }
    cout << "^^^ end of fwd scan ; beginning backward scan vvv\n";
    //backward scan
    pos = const_cast<int*> (island) + size - 1;
    wall = pos;
    for (--pos; pos > peak; --pos) {
        if (*wall > *pos) {
            accu += *wall - *pos; // accumulate water
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wall - *pos,
                    static_cast<int>(pos - island), accu);
            continue;
        }
        //Note all new walls must match or exceed previous wall.
        printf("found new wall of %d^ at Pos#%d\n", *pos, static_cast<int>(pos - island));
        wall = pos;
    }
}
int main(int argc, char *argv[]) {
    twoPassAlgo();
    accu = 0;
    cout<<"-----------------------------\n";
    onePassAlgo();
}
/*
 Requirement: a one-dimensional island is completely covered with columns of bricks.
 If  between Column 
 A(height 9) and Column B(10) all columns are lower, then we get a basin to
 collect rainfall. Watermark height (absolute) will be 9.  We can easily calculate the
 amount of water. If I give you all the column heights, give me total rainfall collected.
 Code showcasing
 – stl algo over raw array
 – array/pointer manipulation
 – array initialization
 – array size detection
 – std::max_element modified
 – std::swap
 */

object DB + built-in language = secDB

I have noticed at least 2 original creators (U/ML) describing secDB as a 2-piece suite [1] — an OODB (secDB) + a built-in language (Slang).

[1] that's my own language, not theirs.

(By the way, different secDB *users* refer to it using different terms. End-developers who build business apps atop secDB mostly talk about __Slang__, perhaps because that's what they express business logic, then debug/test every day. End-developers don't modify secDB core engine at all, so they see secDB as a data store. Business users talk about __secDB__, because they don't care about Slang programming. Generally, Business users are more interested in data, and application features, not implementations.)

But let's come back to the 2-piece suite.  Initial motivation is characterized by a small number of key-keywords — Positions, market-risk, what-if, chain-reaction, object-graph… — keep these key concepts in focus.

I think the OODB idea came first. Loading all Positions across all desks into one virtualized memory is valuable to risk scenario analysis. Obviously a position's risk profile depends on many variables (product/account, interest rates, FX rates, index vol, credit spread, product characteristics) so all of these must be represented as objects in the same virtualized memory. A big object graph.

In each SecDB-alike system, there's a dedicated team building an in-house customized OODB. In some cases, the OODB being built is quite similar to Gemfire or Tangosol. I was skeptical but one of the original SecDB creators confirmed “that particular OODB project” is indeed part of the SecDB challenger system.

The other part — the language — will be another blog post. For now, I'll just mention that in each secDB-derivative system, there's a dedicated team creating a customized language to manipulate the object graph. In some cases, python is chosen, with DAG features added. In other cases,  some of the secDB core team members were hired to create a new language.

FMD is somewhat similar in purpose to Slang.

%%top 3 tips on unix permissions

Scripts must be readable AND executable [1] but compiled programs need only be executable.

[1] exception — It is possible to run a script without execute permission by entering sh myscript

You don’t have to be the owner of a file or have write permission on it to rename or delete it!  You only need write permission on the directory that contains the file.
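A quick way to convince yourself of the rename rule, sketched with POSIX calls from C++. The function name renameWithoutFilePermission and the /tmp location are my assumptions. It creates a private directory, strips every permission bit from a file inside it, and then renames the file anyway.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Returns true if a file with NO permission bits at all can still be renamed,
// thanks to write (and search) permission on the directory containing it.
bool renameWithoutFilePermission() {
    char tmpl[] = "/tmp/permdemoXXXXXX";       // ASSUMPTION: /tmp exists and is writable
    char* dir = mkdtemp(tmpl);                 // new directory, mode 0700, owned by us
    if (!dir) return false;
    const std::string oldName = std::string(dir) + "/locked";
    const std::string newName = std::string(dir) + "/renamed";
    const int fd = open(oldName.c_str(), O_CREAT | O_WRONLY, 0600);
    if (fd < 0) { rmdir(dir); return false; }
    close(fd);
    chmod(oldName.c_str(), 0);                 // file is now ---------- (no r, w or x)
    const bool ok = (std::rename(oldName.c_str(), newName.c_str()) == 0);
    unlink(ok ? newName.c_str() : oldName.c_str());   // deleting also needs only the directory
    rmdir(dir);
    return ok;
}
```

The cleanup step makes the same point for deletion: unlink() succeeds on a file we can neither read nor write, because it modifies the directory, not the file.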

A directory isn’t really a program that you can run even if it has execute permission.  The execute bit is *reused* (like C++ union) rather than wasting space on additional permission bits.

Besides controlling a user’s ability to cd into some directory, the execute permission is required on a directory to use the stat() system call on files within that directory. This stat() returns file inode details. Therefore, to use ls -l file (i.e., to use stat() system call), you must have execute on the directory, the directory’s parent, and all ancestor directories up to and including “/” (the root directory). If execute permission is required for a directory, it is usually required for each enclosing directory component on the full path to that directory.

———- The tips below are less understood —
The execute bit on a directory is sometimes called search permission.  For example, to read a file /foo/bar, before the file can be accessed you must first search the directory foo for the inode of file bar.  This requires search (“x”) permission on the directory /foo.  (Note you don’t[2] need read permission on the directory to search in this case!  You would need read permission on a directory if you were to list its contents.)

[2] With execute but not read permission on a directory, users cannot list the contents of the directory but can access files within it if they know about them.

binomial tree: y identical diamonds

The standard CRR btree is always drawn with all straight lines, equally spaced vertically and equally spaced horizontally. Therefore you always see nothing but a strict pattern of identical diamonds. Let’s zoom into this “geometry”.

First, let’s set the stage for the discussion. In this conceptual “world”, a price (say IBM) can only be observed/sampled at periodic discrete moments, either once a second, or once a day, though the interval should be small relative to time to maturity. Price may change mid-interval, but we can’t observe that. Further, during each interval, the price either moves up or down. It can remain unchanged only in a trinomial tree — not popular in industry.

Why diamonds? Because of interlocking/recombinant. See http://bigblog.tanbin.com/2011/06/option-pricing-recombinant-binomial.html

Why equally spaced horizontally? Because the intervals are fixed and constant — at each clock tick, the variable must either rise or fall, never stay flat like a trinomial.

Why equally spaced vertically? Because the y-axis is log(price). An interesting feature of the CRR btree. If after n-1 intervals you plot the n price values in a bar chart, they don’t fit a straight line — but try plotting log(price).

Why are all the diamonds identical? Because the nodes are equally spaced both vertically and horizontally.
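The vertical spacing claim is easy to check numerically. The sketch below assumes the usual CRR parameterization, u = exp(sigma*sqrt(dt)) and d = 1/u; the function names crrPrices and equallySpacedInLog are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Prices at step n of a CRR tree: S0 * u^j * d^(n-j) for j = 0..n up-moves.
// ASSUMPTION: u = exp(sigma * sqrt(dt)) and d = 1/u, so u^j * d^(n-j) = u^(2j-n).
std::vector<double> crrPrices(double s0, double sigma, double dt, int n) {
    const double u = std::exp(sigma * std::sqrt(dt));
    std::vector<double> px;
    for (int j = 0; j <= n; ++j)
        px.push_back(s0 * std::pow(u, 2 * j - n));
    return px;
}

// True if consecutive log(price) gaps are all equal (within tol).
bool equallySpacedInLog(const std::vector<double>& px, double tol = 1e-12) {
    for (std::size_t i = 2; i < px.size(); ++i) {
        const double gapPrev = std::log(px[i - 1]) - std::log(px[i - 2]);
        const double gapCur = std::log(px[i]) - std::log(px[i - 1]);
        if (std::fabs(gapCur - gapPrev) > tol) return false;
    }
    return true;
}
```

With S0 = 100, sigma = 20% and dt = 0.01, the 5 nodes after 4 steps are roughly 92.31, 96.08, 100, 104.08, 108.33: unequal gaps in price, but a constant gap of 2*sigma*sqrt(dt) = 0.04 in log(price).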

I feel the regularity is a great simplification and helps us focus on the real issue — the probability of an upswing at each node — the transition probability function, which is individually determined at each node position.

dev-cpp and mingw integration

My dev-cpp used to work. I later installed minGW and dev-cpp stopped working, even when compiling the simplest program.

Solution — http://www.computing.net/answers/programming/dev-c-compier-just-not-working-/24305.html
1. Download new MINGW compiler at http://www.mingw.org/ and install it in C:\MinGW
2. in Dev c++. right click on menu Tools>Compiler options.
3. in tab “Directory”, right click on “Libraries” and change “C:\Program Files\Dev-Cpp\Lib” to “C:\MinGW\lib”
4. Compile!

sequence-container =\= array-based

My blog http://bigblog.tanbin.com/2009/04/4-basic-foundations-of-all-collection.html claims all java/c#/STL structures are based on 4 basic data structures

– array
– linked graph
….

Official STL documentation uses the term “sequence containers”. Now, Sequence-container can be non-array-backed. Linked list is one example.

The deque is not completely array-based. See ObjectSpace manual. A more detailed description of the HP implementation of STL deque is on P139 [[stl tutorial]]. It consists of multiple mini-arrays (segments) linked to a lookup construct.

Therefore deque combines array and linked graph.
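To visualize "mini-arrays linked to a lookup construct", here is a toy sketch. SegmentedSeq is hypothetical and much simpler than any real deque (no push_front, no erase, fixed segment size of 4); it is not the HP or ObjectSpace implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy segmented sequence: fixed-size mini-arrays plus a table of segment pointers.
class SegmentedSeq {
    static const std::size_t SEG = 4;       // elements per mini-array (arbitrary choice)
    std::vector<int*> map_;                 // the "lookup construct"
    std::size_t size_;
public:
    SegmentedSeq() : size_(0) {}
    SegmentedSeq(const SegmentedSeq&) = delete;
    SegmentedSeq& operator=(const SegmentedSeq&) = delete;
    ~SegmentedSeq() { for (int* seg : map_) delete[] seg; }

    void push_back(int v) {
        // Growing adds a new segment; existing elements are never relocated.
        if (size_ % SEG == 0) map_.push_back(new int[SEG]);
        map_[size_ / SEG][size_ % SEG] = v;
        ++size_;
    }
    // Random access in constant time: one hop through the map, one array index.
    int& operator[](std::size_t i) { return map_[i / SEG][i % SEG]; }
    std::size_t size() const { return size_; }
};
```

The segments give array-like indexing, while the pointer table plays the role of the links between segments; growth never moves existing elements, unlike a vector reallocation.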

See also http://bigblog.tanbin.com/2011/09/deque-advantage-over-vector.html.

Deque is one of the few random-access containers.

STL algos and their _if() and _copy() derivatives

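Note all the _if() variants accept only Unary predicates i.e. filters. Often, you start with a binary predicate like less
… then you use bind2nd() to convert the binary predicate to a unary one (bind2nd() was deprecated in C++11 and removed in C++17; std::bind or a lambda does the same job)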

—-These algos have both _if and _copy derivatives–
remove
remove_if
remove_copy
remove_copy_if
replace
replace_if
replace_copy
replace_copy_if
—-These algos have an _if derivative–
count
count_if
find
find_if
—-These algos have an _copy derivative–
partial_sort
partial_sort_copy
rotate
rotate_copy
unique
unique_copy
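A short sketch of the binary-to-unary conversion feeding an _if and a _copy_if algorithm. It uses std::bind (C++11) in place of bind2nd(), since the latter no longer exists in C++17; the helper names lessThan, countBelow and dropBelow are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Build a unary predicate from the binary std::less<int> by fixing its 2nd
// argument: the job bind2nd() used to do.
inline std::function<bool(int)> lessThan(int limit) {
    using namespace std::placeholders;
    return std::bind(std::less<int>(), _1, limit);
}

// _if flavour: count the elements below the limit.
inline long countBelow(const std::vector<int>& v, int limit) {
    return std::count_if(v.begin(), v.end(), lessThan(limit));
}

// _copy_if flavour: copy everything EXCEPT the elements below the limit.
// The source range is left untouched.
inline std::vector<int> dropBelow(const std::vector<int>& v, int limit) {
    std::vector<int> out;
    std::remove_copy_if(v.begin(), v.end(), std::back_inserter(out), lessThan(limit));
    return out;
}
```

The _copy variants are handy when the source must stay intact, for instance when it is const as in dropBelow above.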

smart_ptr < double* > i.e. smart ptr of ptr

Should or can we ever put ptr-to-ptr into a smart ptr like smart_ptr<double*>?

My problem is the delete in the smart_ptr dtor — “delete ptr2ptr2dbl” where ptr2ptr2dbl is pointing at a 32-bit object (let’s call it “Dog123”). Dog123 is a 32-bit pointer to a double object. Is Dog123 on heap or on stack or in global area?

If dog123 is not in heap, the delete is UndefinedBehavior (typically a crash): we would be deleting a pointer to stack (or global) memory.

If dog123 is in heap, then how is it allocated? (Now it’s best to draw memory layout.) I feel it is almost always a field in a class/struct. Suppose the 32-bit object dog123 is also referred to as student1.pointerToFee, like

class student{
public:
   double * pointerToFee;
// in this case, our smart pointer is perhaps set up like smart_ptr<double*> sptr( &(student1.pointerToFee) )
};

Now, if our “delete ptr2ptr2dbl” works, then the memory location occupied by Dog123 is reclaimed.
– that means the 64-bit double object is never reclaimed — memory leak
– also, student1.pointerToFee is in trouble. It’s not a wild pointer (its 64-bit pointee is ok), but the 32-bit memory location holding pointerToFee is reclaimed. On the real estate of the student1 instance, there’s now a 32-bit hole: 4 bytes de-allocated.

Q: how can we safely use this kind of smart pointer?
%%A: I guess we must use a custom deleter

Q: When would you need this kind of smart pointer?
%%A: I don’t think we need it at all.
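For completeness, the custom deleter idea could look like the sketch below. It uses std::shared_ptr as a stand-in for the generic smart_ptr (my assumption), and Student, FeeDeleter and deleterFreesOnlyTheDouble are hypothetical names. The deleter frees the double and never touches the memory slot that holds the pointer.

```cpp
#include <cassert>
#include <memory>

struct Student { double* pointerToFee; };

// Custom deleter: receives the pointer-to-pointer, frees the DOUBLE it leads to
// and nulls the field. The field itself belongs to the Student and is not deleted.
struct FeeDeleter {
    void operator()(double** slot) const {
        delete *slot;
        *slot = nullptr;
    }
};

// Returns true if, after the smart pointer dies, the fee has been released and
// the Student's field is still a valid (now null) member.
bool deleterFreesOnlyTheDouble() {
    Student student1{ new double(1200.5) };
    {
        std::shared_ptr<double*> sptr(&student1.pointerToFee, FeeDeleter());
        **sptr = 1500.0;                 // two hops: slot -> double
    }                                    // FeeDeleter runs here
    return student1.pointerToFee == nullptr;
}
```

This works, but it mostly confirms the second answer: a plain smart pointer to the double itself, held as the field, does the same job with one hop less.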

self-rating in java GTD+theory/zbs #halos,letter to YH

Hi YH,

Self-rating is subjective, against one’s personal yardsticks. My own yard stick doesn’t cover those numerous add-on packages (swing, spring, hibernate, JDBC, xml, web services, jms, gemfire, junit, jmock, cglib, design patterns …) but does include essential core jdk packages such as —

– anything to do with java concurrency,
    – reading thread dump
– anything to do with java data structures,
– [2017] Garbage collection
    – memory profiling (jvm tools)
    – GC log analysis
– [2017] java.lang.instrument, mostly for memory
– networking/socket, file/stream I/O
– [!2017] generics, esp. type erasure
– RMI,
– reflection, dynamic proxy, AOP
– serialization,
– class loaders
– JNI, esp. related to memory
– JMX, jconsole,
– difference between different JDK vendors,
– real time java
– reading bytecode

(The highlighted areas are some of my obvious weaknesses.) Now I feel my self-rating should be 8/10, but I still feel no one in my circle really knows more about the areas outlined above. I guess that’s because we don’t need to. Many of these low-level implementation details are needed only for extreme latency, where c++ has a traditional advantage.

Another purpose of this list — answer to this

Q: what kind of java (theoretical) knowledge would give you the halos in a Wall St interview?

fail fast – fundamental and Practical principle

Fail-fast is one of those low-level coding habits that deserves its place among the valuable habits to be adopted by practicing app developers in industry. In contrast, library developers (including open-source authors) generally adopt fail-fast by default.

Principle — prefer crashing the entire program. Don’t keep going. Don’t hope the dubious condition will be tolerated. Don’t hope the corrupted data will be left alone and left untouched. Sooner or later what you fear will happen. In such a case it can be very hard to find the root cause. The crash site might be far away from the buggy code.

Example — when dealing with DAM issues, there are specific tools to make the program crash as soon as detected. I think they intercept, replace or integrate with malloc/free.

Example: if a null pointer can cause a problem, check for it as early as possible. In java and c#, a lot of error messages simply say null pointer (like “unknown exception”). It can take hours to find out the real cause.

Example — fail-fast iterators.
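A minimal sketch of the null-pointer example in C++. Quote, midPrice and rejectsNull are hypothetical names. The check sits at the entry point and the error names the culprit; an exception that nobody catches terminates the program, which is the fail-fast outcome.

```cpp
#include <cassert>
#include <stdexcept>

struct Quote { double bid; double ask; };   // hypothetical data type

// Fail fast: validate at the entry point instead of letting a null pointer or
// a corrupted quote travel deeper into the system.
double midPrice(const Quote* q) {
    if (q == nullptr)
        throw std::invalid_argument("midPrice: null Quote pointer");
    if (!(q->bid > 0) || !(q->ask >= q->bid))
        throw std::domain_error("midPrice: crossed or non-positive quote");
    return (q->bid + q->ask) / 2;
}

// Demonstration helper: true if the null check fires as intended.
bool rejectsNull() {
    try {
        midPrice(nullptr);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```

The point is the distance between bug and symptom: here the failure is reported in the very function that received the bad input, with a message saying what was wrong.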

beta, briefly

Beta is calculated using regression analysis, and you can think of beta as the tendency of a security’s percentage Returns (not the continuously compounded return) to respond to swings in the market (represented by a benchmark). A beta of 1 indicates that the security’s price will move at the same magnitude with the market. A beta of less than 1 means that the security will be less volatile than the market. A beta of greater than 1 indicates that the security’s price will be more volatile than the market. For example, if a stock’s beta is 1.2, it’s theoretically 20% more volatile than the market.

For example, many utilities stocks have a beta of less than 1. Conversely, most high-tech stocks have a beta of greater than 1, offering the possibility of a higher rate of return, but also posing more risk.

Zero beta means 0 correlation with the index (i.e. the market), i.e. insulated from market swings. (Strictly speaking, uncorrelated is a weaker condition than independent.)

Negative beta means anti-correlation, or bucking the market.

If a stock’s return is always exactly twice the market’s return (market up 10%, stock up 20%; market down 5%, stock down 10%), the correlation is one (correlation measures direction, not magnitude). However, beta takes into account both direction and magnitude, so in the same example the beta would be 2 (the stock moves twice as much as the market).
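The contrast between correlation and beta can be checked numerically. Below is a minimal sketch using the textbook regression slope, beta = cov(stock, market) / var(market). The return series are invented for illustration, with the stock return set to exactly twice the market return.

```python
def beta_and_correlation(stock_returns, market_returns):
    """Return (beta, correlation) of a stock against a benchmark.

    beta is the OLS slope: cov(stock, market) / var(market).
    """
    n = len(market_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(stock_returns, market_returns)) / (n - 1)
    var_s = sum((s - mean_s) ** 2 for s in stock_returns) / (n - 1)
    var_m = sum((m - mean_m) ** 2 for m in market_returns) / (n - 1)
    beta = cov / var_m
    correlation = cov / (var_s * var_m) ** 0.5
    return beta, correlation


# Hypothetical percentage returns per period (not real market data)
market = [0.10, -0.05, 0.02, 0.07, -0.03]
stock = [2 * r for r in market]          # always twice the market move

beta, corr = beta_and_correlation(stock, market)
print(round(beta, 6), round(corr, 6))
```

Scaling the stock series by any positive factor changes beta but leaves correlation at one, which is the point of the example above.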

I feel Beta is more important to the buy-side than the sell-side. Note many sell-side megabanks have buy-side units too.

Besides the standard beta on Return, there’s also what I call “vol-space” beta, where a beta of 1 means IBM realized vol over the past 2 years has ups and downs of the same magnitude as the S&P (the benchmark) realized vol. This vol-space beta is calculated using 2 years of historical volatility numbers.

fitting cost vs Local-Vol cost

(I think this applies to any vol surface fitting — Eq, FX, IR…)

As a concept, fitting-cost is part of fitting, not validation, not extrapolation. Extrapolation has no fitting-cost since there’s no fitting.

LV-cost is different from fitting-cost.  LV-cost measures smoothness in LV — there’s an LV value at each point on the vol surface. Extrapolation calibration tries to minimize LV-cost.

– High fitting-cost means fitted curve deviates too much from targets i.e. input data.
– High LV-cost means some LV values are too high (or too low!!) compared to other LV values.

Bump check is one of the many post-fitting checks on the new surface.

max-thruput quote distribution: 6designs#CAS,socket

Update: the fastest design would require a single-threaded model with no shared mutable state.

Suppose a live feed of market quotes pumps in messages at the max speed of the network (up to 100gigabit/sec). We have (5) thousands of hedge fund clients, each with some number (not sure how large, perhaps hundreds) of subscriptions to these quotes. Each subscription sets up a filter that may look like some combination of “Symbol = IBM”, “bid/ask spread < 0.2…”, or “size at the best bid price….”. All the filters only reference fields of the quote object such as symbol, size and price. We need the fastest distribution system. Bottleneck should be network, not our application.

–memory allocation and copying–
If an IBM /quote/ matches 300 filters, then we need to send it to 300 destinations, therefore copying 300 times, but not 300 allocations within JVM. We want to minimize allocation within JVM. I believe the standard practice is to send just one copy as a message and let the receiver (different machine) forward it to those 300 hedge funds. Non-certified RV is probably efficient, but unicast JMS is fine too.

–socket reader thread latency–
Given the messaging rate, socket reader thread should be as lean as possible. I suggest it should blindly drop each msg into a buffer, without looking at it. Asynchronously consumer threads can apply the filters and distribute the quotes.

A fast wire format is fixed-width. Socket reader takes 500 bytes, assumes it’s one complete quote object, and blindly drops this 500-long byte array into the buffer.

–multicast rather than concurrent unicast–
See single/multi-thread TCP servers contrasted

–cpu dedication–
Each thread is busy and important enough to deserve a dedicated cpu. That CPU is never given to another thread.
————-
Now let me introduce my design. One thread per filter. Buffer is a circular array — bounded but efficient pre-allocation. Pre-allocation requires fixed-sized nodes, probably byte arrays of 500 each. I believe de-allocation is free — recycling. Another friend (csdoctor) suggested an unbounded linked list of arrays . Total buffer capacity should exceed the *temporary* queue build-up. Slowest consumer thread must be faster than producer, though momentarily the reverse could happen.
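The circular-array idea can be sketched as follows. This is a single-threaded illustration only: the class name `QuoteRingBuffer` and its methods are hypothetical, it has one reader rather than one thread per filter, and it is not thread-safe. It shows only the pre-allocation and slot recycling.

```python
class QuoteRingBuffer:
    """Bounded circular buffer of fixed-width slots, allocated once up front."""

    def __init__(self, slot_count, slot_size=500):
        self.slot_count = slot_count
        self.slot_size = slot_size
        self._storage = bytearray(slot_count * slot_size)  # the only allocation
        self._head = 0    # next slot to read
        self._tail = 0    # next slot to write
        self._used = 0    # number of occupied slots

    def put(self, message):
        """Copy one fixed-width message in; return False if the buffer is full."""
        if len(message) != self.slot_size:
            raise ValueError("message must be exactly one slot wide")  # fail fast
        if self._used == self.slot_count:
            return False
        start = self._tail * self.slot_size
        self._storage[start:start + self.slot_size] = message  # overwrite in place
        self._tail = (self._tail + 1) % self.slot_count
        self._used += 1
        return True

    def get(self):
        """Return the oldest message as bytes, or None if the buffer is empty."""
        if self._used == 0:
            return None
        start = self._head * self.slot_size
        message = bytes(self._storage[start:start + self.slot_size])
        self._head = (self._head + 1) % self.slot_count
        self._used -= 1   # the slot is simply reusable; nothing is de-allocated
        return message


ring = QuoteRingBuffer(slot_count=2, slot_size=4)   # tiny sizes for the demo
ring.put(b"IBM1")
ring.put(b"IBM2")
full = ring.put(b"IBM3")       # rejected: producer outran the consumer
first = ring.get()
accepted = ring.put(b"IBM3")   # the freed slot is recycled
```

A rejected `put` is the "temporary queue build-up exceeded capacity" case; a real system would have to decide whether to drop, block or grow at that point.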

—-garbage collection—-
Note jvm gc can’t free the memory in our buffer.

–Design 3–
Allocate a counter in each quote object. Each filter applied will decrement the counter. The thread that hits zero will free it. But this incurs allocation cost for that counter.

–Design 6–
Each filter thread records in a global var its current position within the queue. Each filter thread advances through the queue and increments its global var. One design is based on the observation that given the dedicated CPU, the slowest thread is always the slowest in the wolfpack. This designated thread would free the memory after applying its filter.

However, it’s possible for 2 filters to be equally slow.

–design 8–We can introduce a sweeper thread that periodically wakes up to sequentially free all allocations that have been visited by all filters.

–Design 9– One thread to apply all filters for a given HF client. This works if filter logic is few and simple.

–Design A (CAS)– Create any # of “identical” consumer threads. Any time we can expand this thread pool.
while(true){
1)read BigArrayBuffer[++MyThreadPtr] into this thread’s register and examine the fields, without converting to a Quote instance.
2) examine the Taken boolean flag. If already set, then simply “continue” the loop. This step might be needed if CAS is costly.
3) CAS to set this flag
4a) if successful, apply ALL filters on the quote. Then somehow free up the memory (without the GC). Perhaps set another boolean flag to indicate this fixed-length block is now reusable storage.
4b) else just “continue” since another thread will process and free it.
}

Stoch volatility ^ local-volatility

SV doesn’t refer to the random walk of a stock price (or a forex rate). SV refers to the random walk of instant volatility value [2]. This instant volatility (IV) can take on a value of 10%pa now, and 11.5%pa an hour later [1]. If a stock price were to fluctuate constantly by the micro second, then we would be able to record these movements during each second and compute realized/historical IV values for each interval.

[1] Note all volatility values are annualized, just as we compare different rice brands by per-kg price.
[2] realized vol or implied vol? Irrelevant. In the BS theory, volatility is a concept related to Brownian motion. Both r-vol and i-vol are indications of that theoretical volatility. I feel in this /context/, there’s no differentiation of implied vs realized vol.

I feel many people agree that it’s a sound assumption to assume IV follows a random walk, but there are very different random walks. For example, the stock price itself also follows a random walk, but that random walk is carefully modeled by the drift + the Brownian motion. That’s one type of random walk. The IV random walk is different and I call it a special random walk (SRW), for want of a better word.

Basically, SV models assume
1) the stock price follows a random walk characterized by an IV variable, along with a drift
2) this variable doesn’t assume a constant value as BS suggested, but follows a SRW. This SRW is described by a state variable, which depends on current stock price and has a mean-reverting tendency.

I find the mean-reversion assumption quite convincing (yes I do). In reality, if we measure the realized IBM volatility over each trading day and write down those realized-vol values on a table top calendar, we will see it surges and drops but always stays within a range instead of growing steadily. The stock price may grow steadily (drift) but the realized vol doesn’t.

SABR and local vol were said to be 2 models describing stochastic volatility, but veterans told me LV isn’t stochastic at all. I believe LV doesn’t include a dB term in sigma_t i.e the Instantaneous volatility.

LV — when IV is described merely as a function of underlier price St and of time t, we have a local volatility model. The local volatility model is a useful and simple SV model, according to some.

Another veteran in Singapore told me that local vol (like SV) is designed to explain skew. During the diffusion, IV is assumed to be deterministic, and a function of 2 inputs only — spot price at that “instant” i.e. St and t. I guess what he means is, after 888888 discrete steps of diffusion, the underlier could be at any of 888888 levels (in a semi-continuous binomial tree). At each of those levels, the IV for the next step is a function of 2 inputs — that level of underlier price and the TTL.
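To make the contrast concrete, here is a sketch in standard textbook notation. The mean-reverting variance process below is the Heston-style form, chosen only as one illustration of a "special random walk"; the symbols \kappa, \theta, \xi and \rho are labels introduced here, not taken from any particular desk's model.

```latex
% Black-Scholes: one constant volatility
dS_t = \mu S_t\,dt + \sigma S_t\,dW_t

% Local vol: volatility is a deterministic function of spot and time,
% so there is no second source of randomness (no extra dW term)
dS_t = \mu S_t\,dt + \sigma(S_t, t)\,S_t\,dW_t

% Stochastic vol (Heston-style, for illustration): the variance v_t has
% its own Brownian driver and mean-reverts towards \theta at speed \kappa
dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^{(1)}
dv_t = \kappa(\theta - v_t)\,dt + \xi\sqrt{v_t}\,dW_t^{(2)},
\qquad dW_t^{(1)}\,dW_t^{(2)} = \rho\,dt
```

Seen this way, the veterans' point is visible in the equations: the LV equation has a single dW, while the SV pair has two.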

inputIterator category — unneeded?

Q: why there's a category “Input Iterator” at all, where is it used? (Output iterator is a similar story, so this post will omit it.)

I think the only major use is input stream.

There are some algorithms like std::merge(…) that require a tiny *subset* of a C pointer's full capabilities.

To make merge() useful on an input stream, STL authors put the dummy type name “InputIterator” into the merge() template declaration as a *hint*: in the implementation, only that *subset* of pointer capabilities is used. This is a hint to containers:

“Hey Containers, if you have an iterator capable of dereference-then-read, then you can use me.”

It turns out the iterators of all containers have that capability. Only write-only iterators, such as those over an output stream, lack it.

ActionListener^Action

javadoc on Action.java says

Note that Action implementations tend to be more expensive in terms of *storage* than a typical ActionListener implementation class, which does not offer the benefits of centralized control of functionality and *broadcast of property changes*. For this reason, you should take care to only use Actions where their benefits are desired, and use simple ActionListeners elsewhere.

I feel ActionListener objects are typically stateless. I feel they are pure functor objects. The word “Action” in “ActionListener” means very much like the generic “Event”. However, the word “Action” in Action.java seems to mean something else, such as the Control in Microsoft lingo.

Anyway, I feel the word “Action” is ambiguous and overloaded. Let’s avoid it but do understand its various meanings.

remain relevant in a given technology(fundamentals

Q: At age 55, will you still qualify as a swing developer, or SQL developer, or python developer, or Unix admin?

1) An important factor is the stability of the language and the changing demand on specific skillset in that Language (contrast java vs SQL), but let’s focus on another factor — fundamentals.

2) In every language, there’s a body of fundamental knowledge (“knowledge-pearls”) that are Empowering and Instrumental. They facilitate your learning of “superstructure”. Many hiring managers believe strong fundamentals are the most important skill with a given Language.

I feel low-level knowledge will help you remain /relevant/ over 10 years. Some of the most tricky and frequently quizzed sub-topics are low-level — threading, memory-mgmt, collections, subclass memory layout, casting, RTTI, vptr, c++ big 3, …

3) I feel an appreciation of relative strengths/weaknesses of alternative technologies will help you remain relevant. Fundamental knowledge would help you appreciate them.

4) essential libraries of threading, collections, I/O, networking, … are not strictly part of the fundamentals, but part of mandatory skills anyway.

#1 usage of volatility surface

I’d say end-of-day unrealized PnL is the most IMPORTANT usage. An integral part of it is mark-to-market. (However, for liquid option products with numerous “tight” market quotes, I don’t know if we really need the vol surface for PnL.)

A more “fundamental” need for vol surface is the valuation of non-liquid volatility contracts, including structured, exotic, tailor-made contracts with optionality features. I prefer the words “contract” or “deal” rather than “instrument”, “product”, “security” or “asset”. Contract means there are at least 2 counter-parties. If they really do the deal, at contract termination each will end up with a realized PnL, potentially humongous. The estimation, risk-management and analysis of that realized PnL is often the biggest job in a trading desk.

In Equities and FX, vol surface is often the centerpiece (at least part thereof) of the valuation framework for such “contracts”. In a valuation framework, most other factors are simpler compared to the volatility factor.

In a real London structured eq vol desk, such a valuation requires a Monte Carlo simulation which queries one or more vol surfaces repeatedly. However, I don’t think the valuation needs to use the parameters (like skew, tail…). The surface is treated as a black box to query.

The vol surface must be constructed by taking into consideration a variety of observed market data. Therefore a good surface is consistent with a diverse variety of market data, including but not limited to
– dividend forecast,
– tax schedule on dividends,
– calendar convention,
– holiday schedules….

But the most important market data is the premium on the liquid instruments, which typically cover the first few years only. Long-dated instruments are much less liquid.

fopen in various languages (file input/output

–C++
ofstream outfile("out.txt");
ifstream infile("in.txt"); // ifstream is a typedef of the class template basic_ifstream<char>

–C
FILE * pFile = fopen("myfile.txt", "w");

–php follows C
<?php
$handle = fopen("a.txt", "r");
?>

— python:
outfile = open("a.txt", "w") # semicolon is usually omitted

–perl
open (OUTFILE, ">>append.txt") or die …  ### No dollar sign. parentheses are optional but help readability

–c# offers many convenient solutions —
TextReader rd = new StreamReader("in.txt");
TextWriter tw = new StreamWriter("out.txt");

Alternatively, File class offers variations of
static string ReadAllText(string path)
static void WriteAllText(string path, string contents) //creates or overwrites file

–java
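I have written so many of them but paradoxically can’t recall which class we need to instantiate. (Commonly a BufferedReader wrapping a FileReader for reading, and a PrintWriter or BufferedWriter wrapping a FileWriter for writing.)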
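For comparison with the one-liners above, here is a slightly fuller Python sketch that writes a file and reads it back. It assumes nothing beyond the standard library. The helper name `write_then_read` and the sample lines are made up, and a temporary directory is used so the example leaves nothing behind in the working directory.

```python
import os
import tempfile


def write_then_read(path, lines):
    """Write the lines to path, then read them back as a list of strings."""
    with open(path, "w") as outfile:       # "w" creates or overwrites
        for line in lines:
            outfile.write(line + "\n")
    # the with-block has closed (and flushed) the file at this point
    with open(path, "r") as infile:
        return [line.rstrip("\n") for line in infile]


path = os.path.join(tempfile.mkdtemp(), "a.txt")
round_trip = write_then_read(path, ["IBM 101.5", "MSFT 30.2"])
print(round_trip)
```

The `with` statement plays the role of an explicit fclose() in C: the handle is released even if an exception is thrown inside the block.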

java regex — replace with captured substring but modified

Any time you have a string with lots of x.xx000001 or x.xx99999, it’s probably noise you want to get rid of. Here’s a java solution. Perl can do this in 1 line (at most 2).

    // Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
    // Class-level constant: Java does not allow a static local variable.
    private static final Pattern PATTERN9999 = Pattern
            .compile("(\\d\\.\\d*)([0-8]9999+)(\\d\\s)");

    public static String cleanUp999or000(String orig) {
        Matcher m = PATTERN9999.matcher(orig);
        StringBuffer sb = new StringBuffer();
        String without999 = orig;
        String without999_or_000 = orig;
        try {
            while (m.find()) {
                // e.g. captured "39999" becomes 40000, carrying the rounding upward
                final long intEndingIn999 = Long.parseLong(m.group(2));
                final long intEndingIn000 = intEndingIn999 + 1;
                System.out.println(intEndingIn000);
                m.appendReplacement(sb, m.group(1) + intEndingIn000 + m.group(3));
            }
            m.appendTail(sb);
            without999 = sb.toString();
        } catch (NumberFormatException e) {
            e.printStackTrace();
            without999 = orig;
        } finally {
            // drop a run of 4+ zeros together with the trailing noise digit
            without999_or_000 = without999.replaceAll(
                "(\\d\\.\\d+?)0000+\\d(\\s)", "$1$2");
        }
        return without999_or_000;
    }
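The same "replace with a captured substring, but modified" trick is shorter in a language where the replacement can be a function. Below is a sketch in Python that mirrors the two regular expressions of the Java method. The names `NINES`, `ZEROS` and `clean_up_999_or_000` are chosen here for illustration. As in the Java version, a number is only cleaned if it is followed by whitespace.

```python
import re

# a digit run ending in 4+ nines, followed by one noise digit and whitespace
NINES = re.compile(r"(\d\.\d*)([0-8]9999+)(\d\s)")
# a run of 4+ zeros, followed by one noise digit and whitespace
ZEROS = re.compile(r"(\d\.\d+?)0000+\d(\s)")


def clean_up_999_or_000(text):
    """Round away x.xx99999-style and x.xx000001-style floating point noise."""

    def bump(match):
        # modify the captured substring: "39999" becomes "40000"
        rounded = str(int(match.group(2)) + 1)
        return match.group(1) + rounded + match.group(3)

    without_999 = NINES.sub(bump, text)
    # after the bump, every number to clean ends in zeros plus a noise digit
    return ZEROS.sub(r"\1\2", without_999)


cleaned = clean_up_999_or_000("px 1.2399999 qty 0.5000001 end")
print(cleaned)
```

Passing a function to `re.sub` does the job of the `appendReplacement` / `appendTail` loop in the Java version.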

3 types of PWM traders — typical of buy-sides

Clients can trade by themselves through brokerage accounts, but for discretionary accounts, the traders are the investment professionals (IP) and portfolio managers (PM).  PM are the bigger traders.

A PM is a product specialist, and manages a portfolio of several SMA (separately managed accounts) invested in the product she specializes in. In terms of trading, PM are at the top of the food chain in wealth management. See also FX MA in [[Forex revolution]]

PM actually use specialized trading systems, custom made for them, with a dedicated IT team. They could, if they want to and are equipped to, engage in high-frequency trading. As buy-side traders in a sell-side bank, they enjoy preferential treatment in terms of transaction cost, provided they engage the parent firm’s sell-side trading desk. They can also bypass the parent firm and execute on the street, i.e. through a competitor’s sell-side traders. This does happen and they get slapped on the wrist, hard.

They have no relationship with clients.  They don’t handle asset allocation — financial advisers do that.

swing OMS screen (PWM), briefly

PWM screen, used till today (2012). Handles Eq, FX, FI, derivatives (esp. options), futures….

Real time OMS — “order state management”, including but not limited to
* real time status updates
* order entry
* manual trade booking

* manual order cancel/mod before fully executed

Web GUI won’t provide the responsiveness and volume —

At least 30,000 orders/day in US alone. 50-200 executions/order, executed in-house or on external liquidity venues. Typically 10 MOM messages per execution.
– new order placed
– acknowledged
– partial fill
– cancel/mod

Each swing JVM has its own MOM subscriber(s).

This codebase isn’t part of the IC build and has a frequent release cycle.

#1 famous undefined behavior C++

(See below for smart ptr, template, non-RTTI)

Deleting [1] a derived[3] object via a base[4] pointer is undefined behavior if base[6] class has non-virtual dtor, with or without vtable.

This is well-known but it applies to a very specific situation. Many similar situations aren’t described by this rule —
[1a] This rule requires pointer delete. In contrast, automatic destruction of a non-ref “auto” variable (on stack) is unrelated.

[1b] This rule requires a heap object. Deleting a pointee on stack is a bug but it’s outside this rule.

[1c] This rule is about delete-expression, not delete[]

[3] if the object’s run-time type is base, then this rule is Inapplicable

[4] if the pointer is declared as pointer-to-derived, then Inapplicable, as there is no ambiguity which dtor to run

[3,4] if the object run time type is base, AND pointer is declared pointer-to-derived? Inapplicable — compiler or runtime would have failed much earlier before reaching this point.

[6] what if derived class has non-virtual dtor? Well, that implies base non-virtual too. So Yes applicable.

*) P62 [[effC++]] points out that even in the absence of virtual functions (i.e. in a world of non-RTTI objects), you can still hit this UB by deleting a subclass instance via a base pointer.

**) The same example also shows a derived class-template is considered just like a derived class. Let me spell out the entire rule — deleting an instance of a derived-class-template via a pointer to base-class-template is UB if the base class-template has a non-virtual dtor.

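What if the pointee is deleted by a smart_ptr destructor? I think you can hit this UB, for example with a std::unique_ptr<Base> holding a Derived. A std::shared_ptr<Base> constructed directly from a Derived* is different, because it records a deleter for the derived type at construction.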

xaml code behind

Each xaml file describes the components, layout, data binding, event-handler and command binding associated with the visual components. The *.xaml.cs file is the code behind file, implementing among others, the event-handlers mentioned in the xaml.

The listeners don’t have to be part of an interface. It is probably mandatory to take 2 inputs — source and event.

table cell render – per column, per row, per data type

Typically, you associate a table cell render (Instance, not class) on a column. (A column is a homogeneous data type.)

You can also associate a render object to a class (such as Account or Student).

It's less natural to associate a render to a row. You can, however, adapt the render behavior to a row, perhaps based on the row number, or any property of the actual “value” object

go short on tail-risk — my take

Many sell-side [1] traders are described as being short tail-risk. In other words, they go short on tail-risk.

[1] some hedge funds too

*** If you are long tail-risk (insurance buyers), you are LONGING for it to increase. You stand to profit if tail risk increases, such as underlier moving beyond 3sigma. Eg — buy deep OTM options, buy CDS insurance.

*** If you are short tail-risk (insurance sellers), you hope tail risk drops; you mentally downplay the extreme possibilities; you stand to Lose if tail risk actually escalates. Eg: sell OTM options, sell CDS insurance aggressively (below the market).

As a result, you would earn premiums quarter after quarter, but when an extreme tail risk does materialize, your loss might not be fully compensated by the premiums, because the insurance was (statistically) underpriced, because you underestimated the probability and magnitude of tail risk.

Maybe you (the trader) have already been paid the bonus, so the consequence is borne by the insurance seller firm. In this sense, the compensation system encourages traders to go short on tail risk.

convert a reference variable into a pointer variable

You can’t declare a variable as a pointer to a reference, but we often take the address of a reference variable. I think it’s same as address of the referent.

Q: If you need to pass a pointer to a 3rd party library, but you only received a reference variable, perhaps as a function input argument, how?
A: Well, you can treat the variable as a non-ref and simply pass its address to the library.

 

ATM ^ ATF European calls, briefly

Refer to my simplified BS formula in http://bigblog.tanbin.com/2011/06/my-simplified-form-of-bs.html.

Q: for an ATF European call, where K == S*exp(rt) i.e. struck slightly Above current spot, how would the BS formulas be simplified?
A: d1 = -d2 = 0.5σ√t = \frac{\sigma\sqrt{t}}{2} (in more visual form)

C(S,t) = S * [ N(d1) – N(-d1) ] = S * [2N(d1)-1], which depends only on sigma scaled Up by √t, with t being 2.5 years (our t)

Q: how about an ATM European call, where S==K?
A: the ATM call (slightly Lower strike than ATF) has more moneyness than the ATF call , because stock will drift past K long before expiry. The diffusion of the stock prices is “centered” around the drift.
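The ATF simplification and the ATM comparison can both be checked numerically. Below is a minimal sketch using the standard Black-Scholes call formula with no dividends. The inputs (spot 100, rate 3%, vol 25%) are invented for illustration; t is the 2.5 years mentioned above.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(spot, strike, rate, sigma, t):
    """Full Black-Scholes European call value, no dividends."""
    d1 = (log(spot / strike) + (rate + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-rate * t) * norm_cdf(d2)


def atf_call(spot, sigma, t):
    """Simplified form when K == S*exp(rt): d1 = -d2 = 0.5*sigma*sqrt(t)."""
    d1 = 0.5 * sigma * sqrt(t)
    return spot * (2.0 * norm_cdf(d1) - 1.0)


# Hypothetical inputs for illustration
spot, rate, sigma, t = 100.0, 0.03, 0.25, 2.5
forward = spot * exp(rate * t)

full_atf = bs_call(spot, forward, rate, sigma, t)
short_atf = atf_call(spot, sigma, t)
atm = bs_call(spot, spot, rate, sigma, t)
print(full_atf, short_atf, atm)
```

The first two numbers agree to within floating-point error, and the interest rate drops out of the simplified form entirely. The ATM call comes out more valuable than the ATF call, consistent with its lower strike.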

##common c++ run time errors

In java, the undisputed #1 common run time error is NPE, so much so that half of all error checks are null-pointer-checks. In c++, a divide-by-zero check is similarly a must. But there are more…

– divide by zero
– pointer-move beyond bounds: remember an array never grows by itself. Its size is fixed at declaration or at allocation time, and realloc() may hand back a different block.
– read/write a heap object (heapy thingy) via a reference variable after de-allocation
– return by reference an object allocated on the stack — bulldozed
– dereference (unwrap) a dangling pointer
– c-style cast failure
– double-free
– dereference (unwrap) a null pointer — undefined behavior, unlike java NPE. We should always check null before dereferencing.

—-less common
– delete a pointer to non-heap
– free a pointer to non-heap
– hold pointer/reference to a field of an object, not knowing it’s soon to be reclaimed/bulldozed. Object could be on heap or stack. More likely in multi-threaded programs.
– misusing delete/delete[] on a pointer created with new/new[]. The variable always looks the same — plain pointers.

————-
For any of the above, if it happens in a dtor, you are in double trouble, because the dtor could be executing due to another run time error.

The authoritative [[essential c++]] says that for every exception, there must be a “throw” you can find. Divide-by-zero is something directly done on “hardware” so no chance to throw. I feel many error conditions in C++ are treated same as in C, without “throw”. In contrast,

  • I feel operator-new is a typical “managed” low level operation that can (therefore does) use exception.
  • dynamic_cast() is another “managed” low level operation added over C, so it does throw in some cases

Now, JVM “wraps” up all these runtime error Conditions into exceptions. Java creators generally prefer to push these error conditions to the compilation phase, to reduce the variety of Runtime errors. What remain are wrapped up as various RuntimeExceptions. It’s rather remarkable that JVM lets you catch all(?) of these exceptions. No undefined behavior.

case/group-by/self-join — #2 or top 9

(Simplest solution is at the end, which also returns #2 alone ….)
I have seen many tricks in SELECT queries (Most using joins – Fine.), but now if I must name one keyword to be the most powerful yet unappreciated keyword, it has to be
1) CASE.

2) The combined power of case and group-by is even more impressive.
3) Yet more unthinkable is the combination of case/group/self-join.
4) correlated subquery (slow?) in SELECT, FROM, WHERE and HAVING.

http://www.informit.com/articles/article.aspx?p=26856 explains “In addition to subqueries in the WHERE and HAVING clauses, the ANSI standard allows a subquery in the FROM clause, and some relational database management systems (RDBMSs) permit a subquery in the SELECT clause.”

Key to understanding case_group_self-join is the intermediate work table. Mind you, intermediate table is the key to _every_ tricky self-join. P30 [[ transact-sql cookbook]] among many chapters, has a wonderful solution to find the top 5 values in a table without a 5-way self-join. Based on that, here’s a simplified but full example, showing several distinct solutions. These solutions can be adapted to find 5th largest value in a table, too.

— setting up the data —-
drop table public.students
create table public.students(
student varchar(44),
score decimal(4,1),
primary key (student)
)
insert into students values( 'Andrew', 15.6)
insert into students values( 'Becky', 13)
insert into students values( 'Chuck', 12.2)
insert into students values( 'Dan', 25.6)
insert into students values( 'Eric', 15.6)
insert into students values( 'Fred', 5.6)
insert into students values( 'Greg', 5.6)
select * from students


Solution 1
----- top 3 scores
-- intermediate table
select * from students h right join students L
on h.score > L.score

select L.student, 'is lower than', count(h.student), 'competitors'
from students h right join students L
on h.score > L.score
group by L.student
having count(L.student) < 3

----- lowest 4 scores
-- intermediate table
select * from students h left join students L
on h.score > L.score

select h.student, 'defeats', count(L.student), 'competitors'
from students h left join students L
on h.score > L.score
group by h.student
having count(L.student) < 4


---- Solution 2, using case
-- intermediate table
select *, (case when h.score > L.score then 1 else 0 end)
from students h left join -- inner join ok since cartesian
students L on 1=1

select h.student, 'defeats', sum(case when h.score > L.score then 1 else 0 end), 'competitors'
from students h left join -- inner join ok since cartesian
students L on 1=1
group by h.student
having sum(case when h.score > L.score then 1 else 0 end) < 4

-- same solution tested on http://sqlzoo.net/howto/source/z.dir/tip915069/sqlserver
select h.name, h.area, 'is smaller than', sum(case when h.area < L.area then 1 else 0 end), 'countries'
from cia h, cia L
group by h.name, h.area
having sum(case when h.area < L.area then 1 else 0 end) < 4


----- (concise) Solution 3, using correlated sub-select, P295 [[sql hacks]], without intermediate table
select * from students o where  (select count(1) from students where score < o.score) < 4

-- Same solution tested in http://sqlzoo.net/howto/source/z.dir/tip915069/sqlserver:
select * from cia o where  (select count(1) from cia where area > o.area) < 4 -- shows top 4

…and (select count(1) from cia where area > o.area) >= 3 -- returns #4 alone
…and (select count(1) from cia where area > o.area) >= 2 -- returns #4 #3 exactly

-- This technique shows its power when you want top 2 in each continent, without group-by
select * from cia o where  (select count(1) from cia where area > o.area and region=o.region) < 2
order by o.region, o.area
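The correlated sub-select can be tried without a database server. Below is a sketch that loads the students table above into an in-memory SQLite database through Python's sqlite3 module. SQLite is a stand-in here, not the engine the queries were originally tested on, and the helper names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table students(student varchar(44) primary key, score decimal(4,1))")
conn.executemany(
    "insert into students values(?, ?)",
    [("Andrew", 15.6), ("Becky", 13), ("Chuck", 12.2), ("Dan", 25.6),
     ("Eric", 15.6), ("Fred", 5.6), ("Greg", 5.6)])


def top_students(n):
    """Names of students with fewer than n strictly higher scores."""
    sql = """select student from students o
             where (select count(1) from students where score > o.score) < ?
             order by score desc, student"""
    return [row[0] for row in conn.execute(sql, (n,))]


def students_with_exactly(higher):
    """Names of students with exactly this many strictly higher scores."""
    sql = """select student from students o
             where (select count(1) from students where score > o.score) = ?
             order by student"""
    return [row[0] for row in conn.execute(sql, (higher,))]


top3 = top_students(3)
second_place = students_with_exactly(1)   # "#2 alone", ties included
print(top3, second_place)
```

Note how ties behave: Andrew and Eric share the second-highest score, so both are returned for "#2", and nobody has exactly two higher scores.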

I feel for a student/practitioner, it pays to think in terms of CASE. This strikingly simple solution can be rewritten using (messy) CASE.

swing automated test, briefly

3) http://www.codework-solutions.com/testing-tools/qfs-test/ says …event is constructed and inserted artificially into the system’s EventQueue. To the SUT it is indistinguishable whether an event was triggered by an actual user or by QF-Test. These artificial events are more reliable than “hard” events that could be generated with the help of the AWT-Robot, for example, which could be used to actually move the mouse cursor across the screen. Such “hard” events can be intercepted by the operating system or other applications.

2) http://jemmy.java.net/Jemmy3UserTutorial.html and http://wiki.netbeans.org/Jemmy_Tutorial explain some fundamentals about component searching. Jemmy can simulate user input by mouse and keyboard operations.

1) java.awt.Robot is probably the most low-level — Using the class to generate input events differs from posting events to the AWT event queue —  the events are generated in the platform's native input queue. For example, Robot.mouseMove will actually move the mouse cursor instead of just generating mouse move events.

option valuations – a few more intuitions

It’s quite useful to develop a feel for how much option valuation moves when underlier spot doubles or halves. Also, what if implied vol doubles or halves? What if TTL (time to expiration) halves?

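For OTM / ITM / any option, annualized i-vol multiplied by the square root of TTL is the real vol. For example, if you double vol and halve TTL twice (i.e. cut it to a quarter), this product is unchanged and so is the valuation, ignoring interest rates and dividends.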

If you compare a call vs a put with identical strike/expiry (E or A style), the ITM instrument and the OTM instrument have identical time value. Their valuations differ by exactly the intrinsic value of the ITM instrument. (See http://www.cboe.com/LearnCenter/OptionCalculator.aspx.)  — Consistent with European option’s PCP, but to my surprise, American style also shows this exact relationship. I guess it’s because the put valuation is computed from a synthetic put (http://25yearsofprogramming.com/blog/20070412.htm).

For ATM options, theoretical option valuation is approximately proportional to vol and to the square root of TTL, i.e. time-to-live. http://www.cboe.com/LearnCenter/OptionCalculator.aspx and other calculators show that
– when you change the vol number, valuation changes almost linearly
– when you double TTL while holding vol constant, valuation grows by about 41% (the square root of 2), i.e. slower than linearly.
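These intuitions can be checked with a few lines of code. Below is a minimal sketch using the Black-Scholes call formula with zero interest rate and no dividends, which is a simplifying assumption, so that the valuation depends on vol and TTL only through vol * sqrt(TTL). The spot, strike and vol numbers are invented for illustration.

```python
from math import erf, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_zero_rate(spot, strike, sigma, ttl):
    """Black-Scholes European call, assuming zero rates and no dividends."""
    total_vol = sigma * sqrt(ttl)          # the "real vol" over the option's life
    d1 = log(spot / strike) / total_vol + 0.5 * total_vol
    d2 = d1 - total_vol
    return spot * norm_cdf(d1) - strike * norm_cdf(d2)


# ATM option: spot == strike (hypothetical numbers)
base = call_zero_rate(100.0, 100.0, 0.20, 1.0)
double_vol = call_zero_rate(100.0, 100.0, 0.40, 1.0)
double_ttl = call_zero_rate(100.0, 100.0, 0.20, 2.0)
print(double_vol / base)    # close to 2: almost linear in vol
print(double_ttl / base)    # close to 1.41: square root of TTL

# OTM option: double the vol, quarter the TTL, same valuation
otm_a = call_zero_rate(100.0, 110.0, 0.20, 1.0)
otm_b = call_zero_rate(100.0, 110.0, 0.40, 0.25)
print(otm_a, otm_b)
```

With a non-zero interest rate the last pair would no longer match exactly, because discounting depends on TTL alone.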

For OTM options? non-linear

For ITM options, it’s approximately the OTM valuation + intrinsic value.