Containers are what people want; algo/adapters/functors are "accessories"

Most developers come to the “STL supermarket” looking for containers to complement the basic array. They like a container and use it, and soon realize they must return to the supermarket and pick up the associated algorithms, iterators etc. Some developers find it hard to avoid the STL algorithms, but many feel that in their project the algorithms are avoidable if they write their own home-grown functions to access the containers. Iterators, however, are harder to do without.

Functors are somewhat arcane to those from C or non-STL environments, but functors were created out of practical necessity.
Adapters (container adapters, iterator adapters, functor adapters) are also fairly necessary. See http://bigblog.tanbin.com/2009/11/adapters-for-stl-1-container-2-iterator.html
So you entered the supermarket looking for containers, but to work with containers you came back to the supermarket for supplementary tools, which are not “free”: all require a bit of learning.
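
To make the "accessories" concrete, here is a minimal sketch of one container working with iterators, an algorithm, a functor and a container adapter. The names GreaterThan, countAbove and smallest are made up for illustration, and smallest() assumes a non-empty input.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Functor: an object that can be called like a function.
// Unlike a plain function pointer, it carries state (the threshold).
struct GreaterThan {
    int threshold;
    explicit GreaterThan(int t) : threshold(t) {}
    bool operator()(int x) const { return x > threshold; }
};

// Container + iterators + algorithm + functor working together.
inline int countAbove(const std::vector<int>& v, int threshold) {
    return static_cast<int>(
        std::count_if(v.begin(), v.end(), GreaterThan(threshold)));
}

// Container adapter: std::priority_queue wraps a vector and exposes only
// push/top/pop. std::greater<int> (a library functor) makes it a min-heap.
// Assumption: v is not empty.
inline int smallest(const std::vector<int>& v) {
    std::priority_queue<int, std::vector<int>, std::greater<int> >
        pq(v.begin(), v.end());
    return pq.top();
}
```

With a vector holding 5, 1, 4, 2, countAbove(v, 3) counts the two elements above 3 and smallest(v) returns 1. A home-grown loop could replace count_if, but the iterators are still what connect the container to everything else.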

implied vol vs forecast-realized-vol

In option pricing, we encounter realized vs implied vol (not to be elaborated here). In market risk (VaR etc), we encounter past-realized-vol vs forecast-realized-vol. Therefore, we have 3 flavors of vol:

PP) past realized vol, for a historical period, such as Year 2012

FF) forecast realized vol, for a start/end date range that's after the reference date or valuation date. This valuation date is typically today.

II) implied vol, for a start/end date range that's after the reference date or valuation date. This valuation date is typically today.

PP has a straightforward definition, which is the basis of FF/II.

Why FF? To assess VaR of a stock (I didn't say “stock option”) over the next 365 days, we need to estimate variation in the stock price over that period.

FF calculation (whenever you see a FF number) is based on historical data (incidentally the same data underlying PP), whereas II calculation (whenever you see an II number like 11%) is based on quotes on options: the option prices imply an (annualized) vol of 11% over the options' remaining TTL.

See http://www.core.ucl.ac.be/econometrics/Giot/Papers/IMPLIED3_g.pdf, which compares FF and II.
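
Since PP is the basis of the other two, here is a rough sketch of how a PP number could be computed from a series of closing prices. The conventions are my assumptions, not something stated above: log returns, sample standard deviation (n-1), and 252 periods per year. The function name realizedVol is made up. A naive FF number could simply reuse this historical figure as the forecast; real forecasting models are more elaborate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Annualized past-realized vol (PP) from a series of closing prices.
// Assumed conventions: log returns, sample std dev, 252 periods a year.
inline double realizedVol(const std::vector<double>& closes,
                          double periodsPerYear = 252.0) {
    if (closes.size() < 3) return 0.0;  // need at least 2 returns

    // Step 1: period-over-period log returns.
    std::vector<double> r;
    for (std::size_t i = 1; i < closes.size(); ++i)
        r.push_back(std::log(closes[i] / closes[i - 1]));

    // Step 2: sample variance of the returns.
    double mean = 0.0;
    for (std::size_t i = 0; i < r.size(); ++i) mean += r[i];
    mean /= r.size();
    double sumSq = 0.0;
    for (std::size_t i = 0; i < r.size(); ++i)
        sumSq += (r[i] - mean) * (r[i] - mean);
    const double variance = sumSq / (r.size() - 1);

    // Step 3: annualize by scaling the variance, then take the root.
    return std::sqrt(variance * periodsPerYear);
}
```

A price series that grows by the same percentage every period has identical returns, so its realized vol is zero. A series that zigzags between 100 and 110 shows a large vol even though it ends where it started.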

derivative/ integral of exponential/log functions

I feel the #1 most useful integral (and derivative) is that of exponential function and the natural log function. Here is a cheatsheet to be internalized.

for f (x) = e^x, f ' (x) = e^x

for f (x) = a^x, f ' (x) = ln(a) a^x

for f (x) = ln(x), f ' (x) = 1/x

for f (x) = log_a(x), simply recognize f (x) = ln(x) / ln(a), so f ' (x) = 1 / (x ln(a)).

Now the integrals

For f ' (x) = a^x, f (x) = a^x / ln(a)

For f ' (x) = ln(x), f (x) = x ln(x) - x. See http://www.math.com/tables/integrals/more/ln.htm. Classic integration-by-parts.

For f ' (x) = log_a(x), simply recognize f ' (x) = ln(x) / ln(a), so f (x) = (x ln(x) - x) / ln(a).
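
One way to convince yourself the cheatsheet is right is a numerical spot check. The sketch below compares a central-difference derivative against the closed forms. The helper names (numericDerivative, dExpBase, dLogBase, integralLn) are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Central-difference approximation of f'(x).
template <typename F>
double numericDerivative(F f, double x, double h = 1e-5) {
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Closed forms from the cheatsheet.
// d/dx a^x = ln(a) a^x
inline double dExpBase(double a, double x) {
    return std::log(a) * std::pow(a, x);
}
// d/dx log_a(x) = 1 / (x ln(a))
inline double dLogBase(double a, double x) {
    return 1.0 / (x * std::log(a));
}
// An antiderivative of ln(x): x ln(x) - x
inline double integralLn(double x) {
    return x * std::log(x) - x;
}
```

For example, differentiating integralLn numerically at x = 1.5 should give back ln(1.5) to within rounding, which confirms the integration-by-parts result.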

%%unconventional code readability tips – for prod support

[[The Art of Readable Code]] has tips on naming, early return (from functions), de-nesting,  variable reduction, and many other topics…. Here are my own thoughts.

I said in another blog that “in early phases (perhaps including go-live) of an enterprise SDLC, a practical priority is instrumentation.” Now I feel a 2nd practical priority is readability and traceability esp. for the purpose of live support. Here are a few unconventional suggestions that some authorities will undoubtedly frown upon.

Avoid cliche method names as they carry less information in the log. If a util function isn’t virtual but widely used, try to name it uniquely.

Put numbers in names — the less public but important class/function names. You don’t want other people outside your  team to notice these unusual names, but names with embedded numbers are more unique and easier to spot.

Avoid function overloads (unavailable in C). They reduce traceability without adding value.

Log with individualized, colorful even outlandish words at strategic locations. More memorable and easier to spot in log.

Asserts – make them easy to use and encourage yourself to use them liberally. Convert comments to asserts.

If a variable (including a field) is edited from everywhere, then allow a single point of write access – choke point. This is more for live support than readability.

How do global variables fit in? If a Global is modified everywhere (no choke point), then it’s hard to understand its life cycle. I always try to go through a single point of write-access, but Globals are too accessible and too open, so the choke point is advisory and easily bypassed.
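
Here is a small sketch of the choke point idea combined with the naming and assert tips. The Order class, its status field and the setStatus5 name are all hypothetical.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical class: every write to status_ goes through setStatus5(),
// so one breakpoint or one log line covers all the writers.
class Order {
public:
    Order() : status_("NEW"), writeCount_(0) {}

    // Reads are unrestricted.
    const std::string& status() const { return status_; }
    int writeCount() const { return writeCount_; }

    // The single point of write access. The digit in the name makes it
    // unique and easy to grep for in source and logs.
    void setStatus5(const std::string& newStatus, const char* caller) {
        assert(!newStatus.empty());  // what would otherwise be a comment
        std::clog << "setStatus5 " << status_ << " -> " << newStatus
                  << " by " << caller << '\n';
        status_ = newStatus;
        ++writeCount_;
    }

private:
    std::string status_;  // private, so the choke point can't be bypassed
    int writeCount_;
};
```

Because the field is private, the choke point is enforced by the compiler. With a global variable there is no such enforcement, which is why the choke point becomes advisory.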

MxV is beneath every linear transformation

Q: can we say every linear transformation (linT) can be /characterized/expressed/represented/ as a multiplication by a specific (often square) matrix[1]? Yes. See P168 [[the manga guide to LT]]

[1] BTW, The converse is easier to prove — every multiplication by a matrix is a linT, assuming input is a columnar vector.

Before we can learn the practical techniques applying MxV on LinT, we have to clear a lot of abstract and confusing points. LinT is one of the more abstract topics.

1) What kind of inputs go into a LinT? By LinT definition, real numbers can qualify as input to a LinT. With this kinda input, a LinT is nothing but a linear function of the input variable x. Both the Domain and the Range of the linT consist of real numbers.

2) This kinda linear transform is too simple, not too useful, kinda degenerate. The kinda input we are more interested in are vectors, expressed as columnar vectors. With this kinda inputs, each LinT is represented as a matrix. A simple example is a “scaling” where input is a 3D vector (x,y,z). You can also say every point in the 3D Space enters this LinT and “maps” to a point in another 3D space. This transform specifies how to map Every single point in the input space. “Any point in the 3D space I know exactly how to map!”. Actually this is a kind of math Function. Actually Function is a fundamental concept in Linear Transformation.

This particular transform doesn’t restrict what value of x,y or z can come in. However, the parameters of the function itself are locked down and very specific. This is a specific Function and a specific Mapping.

3) Now, since matrix multiplication can happen between 2 matrices, so what if input is a matrix? Will it be a LinT? I don’t know too much but I feel this is not practically useful. The most useful and important kind of Linear transformation is the MxV.

4) So what other inputs can a LinT have? I don’t know.

To recap, there are unlimited types of linear transformations, and each LinT has an unlimited, unconstrained Domain. This makes LinT a rather abstract topic. We must divide and conquer.

First divide the “world of linear transforms” by the type of input. The really important type of input is columnar vector. Once we limit ourselves to columnars, we realize every LinT can be written as a LHS multiplying matrix.

To get a concrete idea of LinT, we can start with the 2D space — so all the input columnars come from this space. These can be represented as points in the 2D space.
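
As a concrete 2D illustration of footnote [1], the sketch below implements MxV for a 2x2 matrix so that the two linearity properties, M(u+v) = Mu + Mv and M(ku) = k(Mu), can be checked on sample vectors. The type and function names (Vec2, Mat2, mxv, add, scale) are mine.

```cpp
#include <array>
#include <cassert>

typedef std::array<double, 2> Vec2;  // a columnar vector in 2D
typedef std::array<Vec2, 2> Mat2;    // row-major: m[row][col]

// MxV: LHS matrix multiplying a RHS columnar.
inline Vec2 mxv(const Mat2& m, const Vec2& v) {
    return Vec2{{m[0][0] * v[0] + m[0][1] * v[1],
                 m[1][0] * v[0] + m[1][1] * v[1]}};
}

inline Vec2 add(const Vec2& a, const Vec2& b) {
    return Vec2{{a[0] + b[0], a[1] + b[1]}};
}

inline Vec2 scale(double k, const Vec2& a) {
    return Vec2{{k * a[0], k * a[1]}};
}
```

Take the matrix with rows (1,2) and (3,4), u = (1,2) and v = (3,4). Then mxv(m, add(u, v)) and add(mxv(m, u), mxv(m, v)) both come out as (16,36). The matrix is locked down, while the input can be any point in the 2D space.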

MxV – matrix multiplying columnar vector

To my surprise, practically all of the important concepts in introductory linear algebra are related to one operation – a LHS “multiplier” matrix multiplying a RHS columnar (i.e. a columnar vector). I call it a MxV.

I guess LA as a branch grew to /characterize/abstract/ and solve real problems in physics, computer graphics, statistics etc. I guess many of the math tools are about matrices, vectors and … hold your breath … MxV:

– Solving linear system of equations. The coefficients form a LHS square matrix and the list of unknowns form the columnar vector

– transforming 3D space to 2D space — when the columnar is 3D and the matrix is 2×3
– range, image … of a transform function — the function often represented as a MxV multiplication.
– inverse matrix
– linear transform
– eigenXXX

eigen vector, linear transform – learning notes

To keep things simple and concrete, let’s limit ourselves to square matrices up to 3D.

I’m no expert on linear transform (LinT). I feel LinT is about mapping a columnar vector (actually ANY columnar in a 2D space) to another vector in another 2D space. Now, there are many (UNLIMITED actually) such 2D mapping functions. Each _specific_ mapping function can be characterized by a _specific_ LHS multiplying matrix. MxV again!

Now eigenvector is about characterizing such a matrix. Suppose we are analysing a 3x3 matrix. The matrix accepts ANY columnar (from the 3D space) and transforms it. Again there are UNLIMITED number of input vectors, but someone noticed one (among a few) input vector is special to this particular matrix. It goes into the transform and comes out perfectly scaled. Suppose this special input vector is (2,1,3) [1], it comes out as (20,10,30). The scaling factor (10 in this case) is the eigenvalue corresponding to the eigenvector.

Let’s stop for a moment. This is rare. Most input vectors don’t come out perfectly scaled – they get linearly transformed but not perfectly scaled. This particular input vector (and any scaled version of it) is special to this matrix. It helps to characterise the matrix.

It turns out for a 3D square matrix, there are up to 3 such special vectors (not counting scaled versions of each other): the eigenvectors of the matrix.

The set of all eigenvectors of a matrix, each paired with its corresponding eigenvalue, is called the eigensystem of that linear transform.

Instead of “eigen”, the terms “characteristic vector and characteristic value” are also used for these concepts.

[1] should be written as a columnar actually.
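
To see the (2,1,3) example in action, here is a sketch that checks whether a given input vector comes out "perfectly scaled". The matrix used in the usage note is one I made up so that (2,1,3) maps to (20,10,30); the post does not say which matrix it had in mind. The names Vec3, Mat3, mxv and isEigenvector are also mine.

```cpp
#include <array>
#include <cassert>
#include <cmath>

typedef std::array<double, 3> Vec3;  // a columnar vector in 3D
typedef std::array<Vec3, 3> Mat3;    // row-major: m[row][col]

// MxV for a 3x3 matrix.
inline Vec3 mxv(const Mat3& m, const Vec3& v) {
    Vec3 out = {{0.0, 0.0, 0.0}};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}

// Returns true if m*v is a scaled copy of v. On success, lambda holds
// the scaling factor, i.e. the eigenvalue.
inline bool isEigenvector(const Mat3& m, const Vec3& v, double& lambda,
                          double eps = 1e-9) {
    const Vec3 w = mxv(m, v);
    // Find a non-zero component of v to compute the candidate factor.
    int pivot = -1;
    for (int i = 0; i < 3; ++i)
        if (std::fabs(v[i]) > eps) { pivot = i; break; }
    if (pivot < 0) return false;  // the zero vector is excluded
    lambda = w[pivot] / v[pivot];
    // Every component must be scaled by the same factor.
    for (int i = 0; i < 3; ++i)
        if (std::fabs(w[i] - lambda * v[i]) > eps) return false;
    return true;
}
```

With the made-up matrix whose rows are (10,0,0), (2,6,0) and (0,3,9), the input (2,1,3) passes the check with lambda 10, while (1,0,0) comes out as (10,2,0) and fails. Most inputs fail, which is what makes the eigenvectors special.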

Multicast delegate in c# (+AWT), briefly

cf swing AWTEventMulticaster …
———
All c# delegate types are multicast. The word “multicast” is /superfluous/redundant/, just like Reentrant in ReentrantLock. You can similarly say silly things like “Secure SSL”, or “Reliable TCP”.

According to http://msdn.microsoft.com/en-us/library/system.multicastdelegate.aspx, a MulticastDelegate has a linked list of delegates, known as an invocation list. When a multicast delegate is invoked, the delegates in the invocation list are called _synchronously_ in the order in which they appear.

http://blogs.msdn.com/b/brada/archive/2004/02/05/68415.aspx says “We abandoned the distinction between Delegate and MulticastDelegate towards the end of V1.” Even though the MulticastDelegate type extends the Delegate type, the system will not even allow you to derive from Delegate directly. Instead, all delegate types extend MulticastDelegate.

auto vs star – in xaml, briefly

… in wpf (not Silverlight) means that the first column is 10x wider than the second. It's like saying “10 parts column 1, and 1 part column 2.” The cool thing about this is that your columns will resize proportionally.

////////////////

//eat up all available space that enclosing container has

////////////////

//Take up as much space as the contents of the column need

///////////////

//Fixed width: 100 pixels

string split c#

Requirement – Split by comma (or another delimiter). Handle consecutive delimiters. Trim white spaces around delimiters.

Solution –

string[] tmp = (commified ?? "").Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries);

string[] tmp2 = Array.ConvertAll(tmp, p => p.Trim());

tabbing policy – swing

See http://docs.oracle.com/javase/tutorial/uiswing/misc/focus.html

The policy of a Swing application is determined by LayoutFocusTraversalPolicy. You can set a focus traversal policy on any Container by using the setFocusCycleRoot method.

Alternatively you can pass focus-traversal-policy-providers to the FocusTraversalPolicy methods instead of focus cycle roots.

Use the isFocusTraversalPolicyProvider() method to determine whether a Container is a focus-traversal-policy-provider. Use the setFocusTraversalPolicyProvider() method to set a container for providing focus traversal policy.

Timer to edit UI objects – WPF

http://www.silverlightshow.net/items/Tip-Asynchronous-Silverlight-Execute-on-the-UI-thread.aspx shows a common scenario. Say you want to periodically call WCF and in the callback method update UI directly — by editing UI objects. Not good practice, but sometimes this is quick and dirty.

1)
Attempted solution: System.Threading.Timer, which uses background threads in the per-Process thread pool. Unfortunately these threads aren’t allowed to edit UI objects (only UI thread can). Result: blank UI.

Solution: this.Dispatcher.BeginInvoke(), but how do you get this.Dispatcher? Well, every UI object has a property this.Dispatcher, but is it easy to get a handle on the UI object?

In my case, WCF callback method is part of a static class. In this callback I raise a (static) event. One of the View Model objects subscribes to it. However, the View Model object should not have a handle on the UI objects, by OO principle.

Solution: the CodeBehind is the melting pot. It is part of the View and has access to every View Model object. So my VM exposes a public property this.Dispatcher which is set by the CB. Dependency injection.

2) DispatcherTimer is the popular solution. However you can’t change the timer interval mid-flight.

premium adjusted delta – basic illustration

http://www.columbia.edu/~mh2078/FX_Quanto.pdf says “When computing your delta it is important to know what currency was used to pay the premium. Returning to the stock analogy, suppose you paid for an IBM call option in IBM stock that you borrowed in the stock-lending market. Then I would inherit a long delta position from the option and a short delta position position from the premium payment in stocks. My overall net delta position will still be long (why?), but less long than it would have been if I had paid for it in dollars.”

Suppose we bought an ATM call, so the option position itself gives us +50 delta and lets us “control” 100 shares. Suppose the premium costs 8 IBM shares (leverage of 12.5). Net delta would be 50-8=42. Our effective exposure is 42%.

The long call gives us positive delta (or “positive exposure”) of 50 shares as underlier moves. However, the short stock position reduces that positive delta by 8 shares, so our portfolio is now slightly “less exposed” to IBM fluctuations.

2nd scenario. Say a VOD ATM call costs 44 VOD shares. Net delta = 50 - 44 = 6. As underlier moves, we are pretty much insulated, with only 6% exposure. Delta is significantly reduced after the premium adjustment.
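
The arithmetic in the two scenarios can be written down as a tiny sketch. The function names are made up, and delta is measured in shares, as in the examples above.

```cpp
#include <cassert>
#include <cmath>

// Net delta (in shares) when the premium is paid in borrowed stock:
// the long call contributes +delta, the short stock subtracts the premium.
inline double premiumAdjustedDelta(double optionDeltaShares,
                                   double premiumInShares) {
    return optionDeltaShares - premiumInShares;
}

// Exposure as a fraction of the shares the option "controls".
inline double exposureFraction(double netDeltaShares,
                               double sharesControlled) {
    return netDeltaShares / sharesControlled;
}

// Leverage: shares controlled per share of premium paid.
inline double leverage(double sharesControlled, double premiumInShares) {
    return sharesControlled / premiumInShares;
}
```

For the IBM case, premiumAdjustedDelta(50, 8) gives 42 and leverage(100, 8) gives 12.5. For the VOD case, premiumAdjustedDelta(50, 44) gives 6, i.e. 6% exposure on 100 shares.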

You may wonder why 2nd scenario’s ATM premium is so high. I guess
* either TTL(i.e. expiration) is too far,
* or implied vol is too high,
* or bid ask spread is too big, perhaps due to market domination/manipulation

quiet confidence on go-live day

I used to feel “Let’s pray no bug is found in my code on go-live day. I didn’t check all the null pointers…”

I feel it’s all about …blame, even if managers make it a point to avoid blame.

Case: I once had a timebomb bug in my code. All tests passed but production system failed on the “scheduled” date. UAT guys are not to blame.

Case: Suppose you used hardcoding to pass UAT. If things break on go-live, you bear the brunt of the blame.

Case: if a legitimate use case is mishandled on go-live day, then
* UAT guys are at fault, including the business users who signed off. Often the business come up with the test cases. The blame question is “why this use case isn’t specified”?
* Perhaps a more robust exception framework would have caught such a failure gracefully, but often the developer doesn’t bear the brunt of the blame.
**** I now feel business reality discounts code quality in terms of airtight error-proofing
**** I now feel business reality discounts automated testing for Implementation Imperfections (II). See http://bigblog.tanbin.com/2011/02/financial-app-testing-biz.html

Now I feel if you did a thorough and realistic UAT, then you should have quiet confidence on go-live day. Live data should be no “surprise” to you.

double-ptr usage #4 – array of pointer@@

Whenever you have an array of pointers, you probably have a double-ptr. See http://stackoverflow.com/questions/5558382/double-pointers-and-arrays

In some contexts, you actually new up an array of (uninitialized) pointers, then initialize each ptr. This example is from [[Programming with Visual C++: Concepts and Projects]]

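int* arr;
int** ptr;
arr = new int[8];   // 8 uninitialized ints
ptr = new int*[8];  // 8 uninitialized pointers; ptr itself is the double-ptr
for (int i=0; i<8; ++i) ptr[i] = &arr[i]; // initialize each ptr
delete[] ptr;       // frees the array of pointers only
delete[] arr;       // frees the ints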

y differentiate Read vs Write to encapsulated/internal fields

Say my class C has an array field, and provides operator[]. MoreEffC++ shows some techniques to handle read access vs write access differently. As a result, when you call cout << myObj[2] you invoke the readonly accessor,  but you hit another accessor when you assign myObj[2] += 9999

Q1: Now, why do we ever Need to differentiate read vs write?
Q2: Why can’t I return a const reference under read, and a non-const reference under write?
A2: That’s a simple and useful design. To implement it you need to differentiate read vs write, and it’s not easy.

A1: More importantly, in some cases such as copy-on-write, we need to do something special when writing.
A1: suppose I want to provide read/write lock
A1: suppose i want to count how many times I’m accessed in RO vs RW mode

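The standard mechanism to differentiate: let operator[] return a proxy object, then overload the proxy's operator= (and operator+= etc) for writes, plus a conversion operator for reads.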

The simpler mechanism to differentiate — java style getter/setter, instead of op overloading. I feel the entire challenge in the Scott Meyers chapter is c++ specific and motivated by read/write Operator. Since java doesn’t support op overloading, the techniques don’t apply. Uncommon in C# even though c# supports op overloading.
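
Below is a minimal sketch of the proxy technique, applied to the "count RO vs RW access" motivation from A1. It is my own condensed version, not the code from MoreEffC++. The class name Counted, the fixed size of 8 and the counters are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

class Counted {
public:
    // operator[] can't know whether the caller will read or write, so it
    // returns a Proxy and lets the proxy find out later.
    class Proxy {
    public:
        Proxy(Counted& owner, std::size_t i) : owner_(owner), i_(i) {}

        // Read access: the proxy is converted to an int.
        operator int() const {
            ++owner_.reads_;
            return owner_.data_[i_];
        }
        // Write access: the proxy is assigned to.
        Proxy& operator=(int v) {
            ++owner_.writes_;
            owner_.data_[i_] = v;
            return *this;
        }
        // myObj[1] = myObj[2] is a read of the RHS plus a write of the LHS.
        Proxy& operator=(const Proxy& rhs) {
            return *this = static_cast<int>(rhs);
        }
        // += is a read followed by a write.
        Proxy& operator+=(int v) {
            ++owner_.reads_;
            ++owner_.writes_;
            owner_.data_[i_] += v;
            return *this;
        }

    private:
        Counted& owner_;
        std::size_t i_;
    };

    Counted() : reads_(0), writes_(0) {
        for (std::size_t i = 0; i < 8; ++i) data_[i] = 0;
    }
    Proxy operator[](std::size_t i) { return Proxy(*this, i); }
    int reads() const { return reads_; }
    int writes() const { return writes_; }

private:
    int data_[8];
    int reads_;
    int writes_;
};
```

After c[2] = 5; int x = c[2]; c[2] += 9999; the object has recorded 2 writes and 2 reads. The cost is that every operator you want to support on an element has to be forwarded through the proxy, which is why the getter/setter style is simpler.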

c-str cheatsheet – remarks

cString manipulation involves many more details than I anticipated. This is a first cheatsheet. I will put new tricks into separate posts for easy management. This cheatsheet is Sufficient for Most coding tests. Remember testers are more interested in your algo and data structure, not string operations.

Only a minority of questions require editing an existing string.

– Manufacture a std::string from a given c-string? http://stackoverflow.com/questions/5328544/returning-char-array-as-stdstring

?populate an array of strings, and save them to a vector
? convert int to a c-string? http://faq.cprogramming.com/cgi-bin/smartfaq.cgi?id=1043284385&answer=1043808026
– duplicate a string on stack
– char* strstr()
– int indexOf()
– compare
– clearContent
– concat/append
– erase_by2pos
– substr_by2pos
– replace_by2pos
– replace_a_substr
– insert_by_pos
? remove white spaces
?rtrim trailing comments

c-str cheatsheet – code

// printf("%.*s", len, markets); // prints the first len chars of a c-str that has no trailing \0
#include <iostream>
#include <string.h>
#include <algorithm>
#include <iterator>
#include <vector>
using namespace std;
void clone(const char* orig) {
    char dup[strlen(orig) + 1];//VLA: a GCC/Clang extension, not standard C++
    strcpy(dup, orig);
    cout << "-- duplicate a string on stack\n" << dup << endl;
    dup[0] = 0;
    cout << "-- now clear its content. New strlen() == \n" << strlen(dup)
            << endl;
}

int main() {
    char a[] = "Four score and seven years ago our fathers brought forth //comment";
    char * const pos5 = strstr(a, "//");
    cout << "-- position of sub-string == " << (pos5 - a) << endl; //pos5 is a pointer; subtract a to get the index
    *pos5 = '\0';
    cout << "-- rtrim trailing comments -->" << a << "<---\n";
    ////////
    clone(a);
    //////////
    char const b[] = "(extra)";
    char * news = new char[strlen(a) + strlen(b) + 1];
    strcpy(news, a);
    strcat(news, b);
    cout << "-- append and return a heapy\n" << news << endl;
    news[0] = '\0'; //clearContent
    cout << "-- after clearing, strlen ==" << strlen(news) << endl;
    delete[] news; //must come after the last use of news
    ////////////
    const char * pattern = "and"; //string literals are const
    int const indexOf = strstr(a, pattern) - a;
    cout << "-- indexOf == " << indexOf << endl;
    //////////
    int const pos = 4;
    news = new char[strlen(a) + strlen(b) + 1];
    strncpy(news, a, pos); //the left half
    strcpy(news + pos, b); //insert
    strcpy(news + strlen(news), a + pos); //the right half
    cout << "-- insert_by_index at position " << pos << endl << news << endl;
    delete[] news;
    //////
    news = new char[strlen(a) + strlen(b) + 1];
    int pos1 = indexOf;
    int pos2 = indexOf + strlen(pattern); //1 past the region
    strncpy(news, a, pos1); //the left half
    strcpy(news + pos1, b);
    strcpy(news + pos1 + strlen(b), a + pos2);//the right half
    cout << "-- replace_by2pos or by substring\n" << news << endl;
    delete[] news;
    //////////
    news = new char[strlen(a) + 1];
    strcpy(news, a);
    pos1 = 1;
    pos2 = 6; //1 past the region
    *(news + pos2) = 0;
    cout << "-- substr_by_pos\n" << news + pos1 << endl;
    /////////
    strcpy(news, a);
    pattern = "ou";
    cout << "-- progressive matching (counting, lastIndexOf..)\n" << endl;
    char * cursor = news; //advance a separate cursor so news still points at the allocation
    while (1) {
        cursor = strstr(cursor, pattern);
        if (!cursor)
            break;
        cout << cursor << endl;
        cursor += strlen(pattern);
    }
    delete[] news;
    /////////////
    const char* strArray[] = {"aa", "bb"};
}

VBA Application.VLookup vs Application.WorksheetFunction.VLookup

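On no-match, WorksheetFunction.VLookup raises a run-time error that IsError() can’t catch! In the snippet below, the IsError() line is never reached on no-match; you would need On Error handling instead.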

Dim pair As Variant
pair = Application.WorksheetFunction.VLookup(Key, haystack, 2, False)
If IsError(pair) Then pair = ""

If you remove “WorksheetFunction”, Application.VLookup returns an error value on no-match instead of raising an error, so the failure is catchable with IsError().

vlookup last arg (“sorted closest match”) defaults to TRUE!
The table array can include headers. It doesn’t matter.

most popular/important instruments by Singapore banks

I spoke to a derivative market data vendor’s presales. Let’s just say it’s a lady named AA.

Without referring specifically to Singapore market, she said in all banks (i guess she means trading departments) FX is the bread and butter. She said FX desk is the heaviest desk. She said interest rate might be the 2nd most important instrument. Equities and commodities are not …(heavy/active?) among banks.

I feel commercial banks generally prefer currencies and high quality bonds over equities, unrated bonds and commodities. Worldwide, Commercial banks’ lending business model is most dependent on interest rates. Singapore being an import/export trading hub, its banks have more forex exposure than US or Japanese banks. Their use of credit products is interesting.

AA later cited credit derivative as potentially the 2nd most useful Derivative market data for a typical Singapore bank. (FXVol being the #1). Actually, Most banks don’t trade a lot of credit derivatives, but they need the market data for analysis (like CVA) and risk management. She gave an example: say your bank enters a long-term OTC contract with BNP. You need to assess BNP’s default probability as part of counterparty risk. The credit derivative market data would be relevant. I think the most common is the CDS spread.

(Remember this vendor is a specialist in derivative market data.)

The FX desk of most banks makes the bulk of its money from FXO, not FX spot. She felt spot volume is higher but margin is as low as 0.1 pip, with competition from EBS and other electronic liquidity venues. What she didn’t say is that FXO market is less crowded.

She agreed that many products are moving to the exchanges, but OTC model is more flexible.

%errorlevel% in DOS, briefly

Errorlevel is like a “dynamic readonly” whiteboard (NOT an environment variable, see below). The whiteboard is wiped clean and overwritten by DOS after each command is executed. This happens whether you run the commands interactively or in batch using *.bat files.

You can say this whiteboard shows _transient_ values. For a crude analogy, consider the variable behind LED temperature display. It keeps changing.

If you want to capture a particular transient value of this object, save it *immediately* in your own variable (before echoing %errorlevel%). See http://www.coretekservices.com/2012/06/28/dos-batch-file-error-level-checking-tricks/

Q: What’s %ERRORLEVEL% vs bare ERRORLEVEL ?
A: %ERRORLEVEL% will expand into a String representation of the current value of ERRORLEVEL, provided that there is not already an environment variable with the name ERRORLEVEL, in which case you will get its value instead. 
A: To see the value, you must use %xxx%. Consistent with other DOS variables such as %OS%

every async operation involves a sync call

I now feel just about every asynchronous interaction involves a pair of (often remote) threads. (Let’s give them simple names — The requester RR vs the provider PP). An async interaction goes through 2 phases —

Phase 1 — registration — RR registers “interest” with PP. When RR reaches out to PP, the call must be synchronous, i.e. Blocking. In other words, during registration RR thread blocks until registration completes. RR thread won’t return immediately if the registration takes a while.

If PP is remote, then I was told there’s usually a local proxy object living inside the RR Process. Registration against proxy is faster, implying the proxy schedules the actual, remote registration. Without the scheduling capability, proxy must complete the (potentially slow) remote registration on the RR thread, before the local registration call returns. How slow? If remote registration goes over a network or involves a busy database, it would take many milliseconds. Even though the details are my speculation, the conclusion is fairly clear — registration call must be synchronous, at least partially.

Even in Fire-and-forget mode, the registration can’t completely “forget”. What if the fire throws an exception at the last phase after the “forget” i.e. after the local call has returned?

Phase 2 — data delivery — PP delivers the data to an RR2 thread. RR2 thread must be at an “interruption point” — Boost::thread terminology. I was told RR2 could be the same RR thread in WCF.

greeks on the move – intuitively

When learning option valuations and greeks, people often develop quick reflexes about what-if’s. Even a non-technical person can develop some of these intuitions. Because these are quick and often intuitive, this knowledge is often more practical and useful than the math details.

Some of these observations are practically important while others are obscure.

Q3: How would all indicators of an ATM instrument move when underlier rises/falls?
QQ: What if the instrument has very low/high volatility?
QQ: What if the instrument is far/close to expiry?

Q5: How would all indicators of a deep OTM (deep ITM is rare) instrument move when underlier moves towards/from strike?
QQ: What if the instrument has very low/high volatility?
QQ: What if the instrument is far/close to expiry?

Q7: How would all indicators of a deep-OTM/ATM instrument move when sigma_imp rises/falls?
QQ: What if the instrument has very low/high volatility?
QQ: What if the instrument is far/close to expiry?

Q9: How would all indicators of a deep-OTM/ATM instrument move when approaching maturity?
QQ: What if the instrument has very low/high volatility?

“Indicators” include all greeks and option valuation. The “instrument” can be a European/American call/put/straddle.
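
To test such reflexes against numbers, a small calculator helps. The sketch below covers just one indicator, the delta of a European call, under the Black-Scholes model with no dividends. That model choice is my assumption; the questions above also cover puts, straddles and American options, which this does not handle. The function names are made up.

```cpp
#include <cassert>
#include <cmath>

// Standard normal CDF, via the complementary error function.
inline double normCdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Black-Scholes delta of a European call on a non-dividend stock.
// vol is annualized sigma_imp, years is time to expiry, both must be > 0.
inline double callDelta(double spot, double strike, double vol,
                        double years, double rate = 0.0) {
    const double d1 =
        (std::log(spot / strike) + (rate + 0.5 * vol * vol) * years) /
        (vol * std::sqrt(years));
    return normCdf(d1);
}
```

A few what-ifs with strike 100: at spot 100, 20% vol and 1 year to expiry, delta is a bit above 0.5. Move spot to 80 (OTM) and delta drops. Shorten the expiry to 0.1 year and the OTM delta collapses towards 0. Raise vol to 40% and the OTM delta rises again.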