sys instrumentation: win32 murkier than linux #c++@@

I recently browsed [[inside windows debugging]]. In the past I also browsed a few *nix system programming books. I’m more afraid of windows than *nix. Windows is darker and murkier; linux is more open and consistent. As a beginner, I feel in linux the challenge is interpreting tool output, but on windows it’s getting past the installation and security restrictions.

  • complication: GUI — GUI tends to hide a lot of features in layers and compartments. Console tools provide the same amount of power features …
    • if there’s a security or registry or other issues that reduce the provided feature set to below 100%, a console tool is more likely to report it to the user but GUI tools often “intelligently” handle it as if it’s normal.
    • MSVS is the primary windows instrumentation tool – very complicated.
  • complication: tool installation – Linux tools are not easy to install, but Windows tools tend to be even harder to install.
    • ** registry.
    • ** On any platform GUI tools are, on average, harder to install. Linux tools are less likely GUI.
  • complication: security restrictions – for a novice, 1/3 of the problems with windows tools are related to security/permission. Even under an administrator account, there are many security restrictions we may need to painstakingly disable! I don’t have any example now. I feel windows was criticized as vulnerable so Microsoft added a lot of security features.
    • ** some security features seem to be controlled by the domain administrators.
  • complication: environment factors – If you run the same tool in 2 machines with different results it’s often due to environment factors. In Linux, environment variables and some utilities are the only environment factors I know. In windows there seem to be many more places to check.
    • ** registry
    • ** certificates
    • ** control panel items
    • ** secpol.msc
  • complication: tool availability – many windows tools are commercial. On windows, I generally find the open-source tools better-quality, but for some tasks there’s only one tool – shipped by Microsoft – and I feel it is often crippled. On linux the best tools are invariably open-source.
  • complication: undocumented features — Even if you follow all the documented procedures, you may still hit something unexpected, due to undocumented features. This happens more often on windows than linux.

Note undocumented features affect not only instrumentation tools, but other components of the platform, which affect our debugging.

Fwd: pure language complexity ^ tool complexity

(“Tool” includes the entire system used in the dev process.)


You asked me why my OC manager likes to beat me up. I thought about it. Here’s one reason.

My manager complains I’m slow with my development projects. In this job (as in many other jobs), people don’t say my code quality or design is bad. I think most of the code they write isn’t great either. We are just hoping to churn out working code on time. I do hope to, but won’t try too hard to make my code beautiful, low-maintenance, testable, easy to change, easy to understand. Implicitly, that’s the unspoken agreement with the boss. One quality where I exceed the minimum requirement is error-condition handling — I want my code to behave reasonably under any conceivable condition. I don’t want “undefined behavior”. I don’t want surprises. Anyway, my boss only complains about how slowly I complete projects, not about quality. Invariably it boils down to troubleshooting skills like googling, asking around, understanding logs, and diagnostic tools. Sometimes not easy for anyone. Most of these technical problems involve more than code complexity in my c# codebase, and now I realize that’s the key.

Now I believe my c# language skill is perhaps better than the average c# developer. For eg, I’m confident taking on technical challenges involving threading, linq, complex data structures, complex delegates+events. (The toughest coding interviews on the west coast sometimes cover nothing but a little tight/dense algorithm. These can be extremely IQ-intensive, more than I can handle but I feel comfortable that I can beat most c# developers at this game.) In a nutshell, if the problem is purely about c# language, I am confident. Most real world technical problem-solving goes beyond that. There are too many of these problems to enumerate. Below are just a few examples.

eg: When the same code works in one environment but not in another, I feel 30% of the time it’s due to some permission or security issue. The error message is usually misleading. It’s even worse when nothing happens, without any error msg. It can take ages to uncover the root cause.

eg: GUI troubleshooting always involves some tool and some infrastructure that’s not written by me…

eg: GUI system is invariably a client to some server-side. There’s always some networking involved.

eg: Async communication might be considered a pure language complexity but in reality troubleshooting often involves some vendor-supplied infrastructure code which sets up the communication and threading.

eg: WCF and remoting always involves tools to generate proxy, network layer, serialization…

eg: My biggest source of troubleshooting headache is VisualStudio… I had many problems with debugger, with build, with project references…

In order to reproduce a problem, I often need to check out and build many projects, configure many modules, hunt down all the required libraries, set up all the dependencies, install a bunch of infrastructure software… The set-up can take days for the uninitiated, and is error-prone and not fully documented. This happens everywhere so I’m not complaining, but it often leads to a lot of delay and complaints from the boss.

[[inside windows debugging]]

P38 has a half-pager comparison of the 2 main debuggers

^ MSVS – dev environment, with source code available

^ windbg – production environment. For post-coding, without source code.

** script/SQL debugging – only MSVS

P38 points out the free MSVS-express lacks certain debugging features. WinDBG is completely free.

Q: does “windows debugger” mean windbg + minor tools?

–symbol files, symbol servers, *.pdb files


P54 – “public symbols” and the microsoft online public symbol server.

replicate eq-fwd contract, assuming a single dividend

Note replication portfolio is always purchased as a bundle, sometime (time t) before expiry (denoted time T).

First, let’s review how to replicate a forward contract in the absence of dividends. The replication portfolio is {long 1 share, short K discount bonds}. To verify: at T the portfolio payout is exactly like a long forward. By the arbitrage argument, the portfolio value must equal the fwd contract’s price at all times before expiry. I will spare you the math formula, since the real key behind the math is the replication and arbitrage.

Now, suppose there’s a percentage dividend D paid out at time Td before T. In this case, let’s assume the dividend rate D is announced in advance. To reduce the abstractness, let’s assume D=2%, K=$100, the stock is IBM. We are going to reinvest the dividend, not use it to offset the purchase price $100. (This strategy helps us price options on IBM.)

The initial replication portfolio now adjusts to { long 0.98 (≈ 1/1.02) IBM shares, short 100 discount bonds }. At T, the portfolio is exactly like long 1 forward contract. Please verify!

(In practice, dividends are declared as fixed amount like $0.033 per share whatever the stock price, but presumably an analyst could forecast 2%.)

In simple quant models, there’s a further simplification i.e. continuous dividend yield q (like 2% annually). Therefore reinvesting over a period A (like 1Y), 1 share becomes exp(qA) shares, like exp(0.02*1) = 1.0202 shares.

Q: delta of such a fwd contract’s pre-maturity value? Math is simple given a good grip on fwd contract replication.
A: rep portfolio is { exp(-qT) shares (value S*exp(-qT)),     -K bonds }.
A: key concept — the number of shares (not share price) in the portfolio “multiplies” (like rabbits)  at a continuous compound rate of q. Think of q = 0.02.
A: In other words

   F0 = S0*exp(-qT) – K*Z0

Differentiating wrt S0, delta = exp(-qT), which degenerates to 1 when q=0.
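The verification can be written out. A sketch, assuming a constant continuous dividend yield q to maturity T; Z0 denotes the time-0 price of a $1 discount bond (my notation):

```latex
% At t=0: buy e^{-qT} shares and short K discount bonds (each pays $1 at T).
% Dividends are continuously reinvested, so the share count grows at rate q:
%   e^{-qT}\ \text{shares at } t=0 \;\longrightarrow\; e^{-qT}e^{qT} = 1\ \text{share at } T
% Portfolio payoff at T:
%   V_T = S_T - K \quad\text{(identical to a long forward)}
% Hence pre-maturity value and delta:
F_0 = S_0 e^{-qT} - K Z_0, \qquad
\Delta = \frac{\partial F_0}{\partial S_0} = e^{-qT} \;\to\; 1 \text{ as } q \to 0
```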

barebones RAII smart ptr classes

If RAII and memory management are the main purpose, I feel a better word for the class template is a “capsule class”. I feel the boost implementations are too complicated and hard to modify. Here are some small but realistic implementations.

[[c++FAQ]] has a one-pager RAII capsule class.

[[safe c++]] has a two-page source listing for a real-world capsule, including both ref-counted and scoped versions. It introduces a very simple RAII capsule, one of the simplest yet useful smart pointers. I think the entire class needs nothing but … big3 + ctor.

A few observations —

* It’s not reliable enough if the capsule ctor can throw. For RAII to provide the ironclad guarantee, the ctor of the “capsule” class (smart ptr or not) should be simple, robust and no-throw. If the resource (e.g. a lock) isn’t even acquired during construction, then the capsule offers no protection and no guarantee.

* (A smart pointer always needs a copy policy.) It’s meaningless unless every client “goes through” the wrapper to get to the raw pointer. Direct access to the raw pointer is dangerous and should be completely avoided.

* It’s meaningless without the exception risk. RAII is mostly for exceptions and programmer forgetfulness.
* It’s meaningless if there’s no need for an ironclad “guarantee“.
* It’s meaningless if there’s no need for “automatic” cleanup.
* It relies on the dtor feature. RAII relies on dtor. Smart pointer relies on dtor.
* It relies on local stack variables, but …
* It’s unusable unless the original pointer is for a heap object.
… These keywords are the heart of RAII idiom.

Given the heavy reliance on dtor, the dtor should be fairly simple. I don’t think it should be virtual.

##windows GTD skillset, my take

(master copy? Recrec)

Dotnet is like java, right? Dotnet developers operate in a sandbox similar to the java developer?

Wrong! I feel a windows developer team operates in an ecosystem that’s fundamentally different from a java development ecosystem. Differences lie beneath the surface.

Since all of my java servers run in linux (or unix), I also need some linux platform knowledge, but nothing in-depth.

A windows developer needs a lot of specific (platform) skills. The list below is what I feel I need to pick up. [[Inside windows debugging]] by an MS veteran points out loads of essential know-how required for dotnet (or pre-dotnet) hacking beyond the MSVS debugger. From this book I get the sense that every serious windows developer needs (non-trivial) windows know-how.

As an analogy, serious sql coders need (non-trivial) knowledge of optimizer and query plan.

I feel the average windows developer doesn’t have the capacity to know all of these. Perhaps some guys specialize in winforms, some in wpf, some specialize in web development, some specialize in VB, some specialize in c# server side… Yet it’s not inconceivable for some lead developer/architect to have some experience with all of these, and deep experience in a few topics.

* windows TCP/IP – Microsoft has its own implementation
* windows threading – many small differences from java threading, even though the concepts match.
* windows web service (SOAP etc) – Microsoft has its own implementation
* debugger – remote debugger
* xml parsing – Microsoft has its own implementation
* office integration (COM), excel addin
* IDE – many, many must-know features
* project configuration, 
* application configuration
* GUI dev – silverlight
* GUI dev – wpf
* set up WCF service and host
* c# loggers
* powershell, DOS scripting

* iis
* messaging server
* sharepoint server 
* exchange server 
* GUI dev – winforms
* GUI dev – MFC
* web development – silverlight
* web development – ASP,
* web development – vbscript, jscript etc

The blog title says “windows” because I believe a serious dotnet developer always knows windows well beyond dotnet. Many __dotnet__ dev skills require windows platform knowledge. (In java projects platform knowledge is useful but not as essential as in dotnet.) I believe a dotnet developer has to become a windows developer to be truly competent.

– COM integration
– MS-Office integration
– interop with VB, VC, both rich-history technologies retrofitted into dotnet
– windows debuggers
– sysinternals tools
– registry
– application installer/uninstaller
– event viewer
– performance counters
– dotnet threading is based on win32
– WCF is based on earlier Microsoft remoting technologies

java spring IV questions from friends #YH++

Q: How many ways of instantiating a Spring container?

Q: How many ways of injecting a bean in Spring XML config

Q: What are the drawbacks of using the constructor as the injection means? With a circular reference, what exception will be thrown?

Q: Spring annotations. If a bean must be provided with a constructor with another bean injected, what attribute of the annotation should be used to enforce it?

Q: What are the scopes available in a bean tag in Spring XML?

Q: If the scope is prototype, what will be returned from “getBean()”?

c++ stream — IKM findings

——tellg and tellp – a short explanation

tellg() and tellp() both return the position of their respective (get/put) pointer
ostream& endl (ostream& os);
stream::rdbuf() changes the filebuf…
endl – flushes implicitly
If your variable to “populate” is an int, then the extraction operator stops when hitting “any character that couldn’t be part of an int”, which is not limited to white space.
—-which stream classes can be used for writing to a file
——which ios modes are used for reading a file

math tools used in option pricing vs risk mgmt – my take

In general, I feel statistics, as applied math, is a more widely used branch of math than probability. Both are used in finance. I feel their usage is different in the field of option pricing vs risk mgmt. Both efforts attempt to estimate the future movements of underlier prices. Both rely on complicated probability and statistics theories. Both try to estimate the “histogram” of a portfolio’s market value on a future date.

In option pricing, the future movement of the Underlyer is precisely modeled as a GBM (geometric Brownian motion). IMHO Stochastic is probability, not stats, and is used in option math. When I google “stochastic”, “volatility” always shows up. “Rocket science” in finance is usually about implied volatility — more probability less statistics.

In VaR, future is extrapolation from history. Risk manager doesn’t trust theoretical calculations but relies more [1] on historical data. “Statistical risk management” clearly shows the use of statistics in risk management.

In contrast, historical data is used much less in option pricing. Calibration uses current day’s market data.

[1] the “distribution” of past daily returns is used as the distribution of future returns, like a plant’s measured growth rate: there’s no reason to believe the plant will grow any faster/slower in the future.

See other posts on probability vs stats. Risk management uses more stats than Option pricing.

Incidentally, if a portfolio includes options, then VaR would need both theoretical probability and statistics.

stoch Process^random Variable: !! same thing

I feel a “random walk” and “random variable” are sometimes treated as interchangeable concepts. Watch out. Fundamentally different!

If a variable follows a stoch process (i.e. a type of random walk) then its Future [2] value at any Future time has a Probability distribution. If this PD is normal, then mean and stdev will depend on (characteristics of) that process, but also depend on the distance in time from the last Observation/revelation.

Let’s look at those characteristics — In many simple models, the drift/volatility of the Process are assumed unvarying[3]. I’m not familiar with the more complicated, real-world models, but suffice to say volatility of the Process is actually time-varying. It can even follow a stoch Process of its own.

Let’s look at the last Observation — an important point in the Process. Any uncertainty or randomness before that moment is irrelevant. The last Observation (with a value and its timestamp) is basically the diffusion-start or the random-walk-start. Recall Polya’s urn.

[2] Future is uncertain – probability. Statistics on the other hand is about past.
[3] and can be estimated using historical observations

Random walk isn’t always symmetrical — Suppose the random walk has an upward trend, then the PD at a given future time won’t be a nice bell centered around the last observation. Now let’s compare 2 important random walks — Brownian Motion (BM) vs GBM.
F) BM – If the process is BM i.e. Wiener Process,
** then the variable at a future time has a Normal distribution, whose stdev is proportional to sqrt(t)
** Important scenario for theoretical study, but how useful is this model in practice? Not sure.
G) GBM – If the process is GBM,
** then the variable at a future time has a Lognormal distribution
** this model is extremely important in practice.
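In symbols, a summary sketch (assuming zero drift and constant volatility σ for BM; constant μ, σ for GBM):

```latex
% F) Brownian motion: increments are normal, stdev grows like sqrt(t)
W_t - W_0 \sim N(0,\ \sigma^2 t), \qquad \text{stdev} = \sigma\sqrt{t}
% G) Geometric Brownian motion: log-returns are normal, so S_t is lognormal
\ln\frac{S_t}{S_0} \sim N\!\Big(\big(\mu - \tfrac{\sigma^2}{2}\big)t,\ \sigma^2 t\Big)
```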

c++ template technicality – IKM findings

Never quizzed when I declare I’m no template wizard…

to specialize a func template,

template <typename T> void fn(T a){}
template <> void fn(char c){…} //empty angle brackets to specialize an existing func template

template<class T> void f(T arg){ cout<<arg<<endl; }
template<> void f<char>(char arg){ cout<<"char: "<<arg<<endl; }
// the <char> after f is optional !
int main() { f(1); f('a'); }

///////////////// Some<int> is distinct from Some<char> -- which is the basis of my friend caching factory in 
template <typename T> class Some{
  public: static int stat;
};

template<typename T> int Some<T>::stat = 10;
int main(){
  Some<int>::stat = 5; // Some<char>::stat is untouched
}

16 IKM c++ Q (no Answer)

Beware — studying these is less efficient because there’s no correct answer given!
———–Question 1 (hit me) on virtual, dynamic_cast…:

//Someone else’s code, e.g. library
class IGlyph {
public:
    virtual ~IGlyph(){}
    virtual std::string Text()=0;
    virtual IIcon*      Icon()=0;
};

class IWidgetSelector {
public:
    virtual ~IWidgetSelector(){}
    virtual void    AddItem(IGlyph*)=0;
    virtual IGlyph* Selection()=0;
};

//Your code
class MyItem : public IGlyph {
public:
    virtual std::string Text(){
        return this->text;
    }
    virtual IIcon* Icon() {
        return this->icon.get();
    }
    void Activate() {
        std::cout << "My Item Activated" << std::endl;
    }
    std::string          text;
    std::auto_ptr<IIcon> icon;
};

void SpiffyForm::OnDoubleClick(IWidgetSelector* ws) {
    IGlyph* glyph = ws->Selection();
    MyItem* item  = dynamic_cast<MyItem*>(glyph);
    if(item) item->Activate();
}
A. The dynamic_cast is necessary since we cannot know for certain what concrete type is returned by IWidgetSelector::Selection().
B. The dynamic_cast is unnecessary since we know that the concrete type returned by IWidgetSelector::Selection() must be a MyItem object.
C. The dynamic_cast ought to be a reinterpret_cast since the concrete type is unknown.
D. The dynamic_cast is redundant, the programmer can invoke Activate directly, e.g. ws->Selection()->Activate();
E. A polymorphic_cast should be used in place of the dynamic_cast.
———–2. (hit me)
Which of the following declarations of function main are standard or standard conforming extensions?
(Please note that some compilers accept ill-formed main declarations, these should be considered incorrect).
A. void main(char* argv[], int argc)
B. int main()
C. void main()
D. int main(int argc, char* argv[])
E. int main(int argc, char* argv[], char* arge[])
———–3. Which of the following statements accurately describe the condition that can be used for conditional compilation in C++?
A. The condition can depend on the value of program variables.
B. The condition can depend on the values of any const variables.
C. The condition can use the sizeof operator to make decisions about compiler-dependent operations, based on the size of standard data types.”
D. The condition can depend on the value of environmental variables.
E. The condition must evaluate to either a “0” or a “1” during pre-processing.
———–4. In C++, which of the following are valid uses of the std::auto_ptr template considering the class definition below?
class Object {
public: virtual ~Object() {}
A. std::auto_ptr pObj(new Object);
B. std::vector <std::auto_ptr > object_vector;
C. std::auto_ptr pObj(new Object);
D. std::vector <std::auto_ptr > object_vector;
E. std::auto_ptr source() {
return new Object;
———–5. Which of the following statements correctly describe C preprocessor directives in C++?
A. The #pragma directive is machine-independent.
B. Preprocessor directives are processed before macros are expanded.
C. The #import directive is used to copy code from a library into the program’s source code.
D. Any number of #else directives can be used between an #if and an #endif.
E. The #include directive is used to identify binary files that will be linked to the program.
———–6. Which of the following statements describe the results of executing the code snippet below in C++?
int var = 1;
void main() {
int i = i;
A. The i within main will have an undefined value.
B. The i within main will have a value of 1.
C. The compiler will not allow this statement.
D. The compiler will allow this statement, but the linker will not be able to resolve the declaration of i.
E. The result is compiler-dependent.
———–7. (hit me) Which of the following statements regarding the benefits of using template functions over preprocessor #define macros are correct?
A. A preprocessor macro expansion cannot work when user-defined types are passed to it as arguments.
B. While expanding #define macros, the preprocessor does no type checking on the arguments to the macro.
C. Since the preprocessor does the macro expansion and not the compiler, the build process takes a longer period of time.
D. A preprocessor macro expansion incurs a performance overhead at runtime.
E. It is simple to step into a template function code during the debugging process.
———–8. In a hierarchy of exception classes in C++, which of the following represent possible orders of catch blocks when a C++ developer wishes to catch exceptions of more than one class from a hierarchy of exception classes?
A. Classes belonging to the same hierarchy cannot be part of a common set of catch blocks.
B. The most derived classes must appear first in the catch order, and the parent classes must follow the child classes.
C. The most derived classes must appear last in the catch order, and the parent classes must precede the child classes.
D. The order is unimportant as exception handling is built into the language.
E. Multiple classes can be caught in a single catch clause as multiple arguments.
———–9. (hit me) Which of the following statements provide a valid reason NOT to use RTTI for distributed (i.e. networked between different platforms) applications in C++?
A. RTTI is too slow.
B. RTTI does not have standardized run-time behavior.
C. RTTI uses too much memory.
D. RTTI’s performance is unpredictable/non-deterministic.
E. RTTI frequently fails to function correctly at run-time.
———–10. (hit me) Which of the following options describe the expected overhead for a class that has 5 virtual functions?
A. Every object of the class holds the address of the first virtual function, and each function in turn holds the address of the next virtual function.
B. Every object of the class holds the address of a link list object that holds the addresses of the virtual functions.
C. Every object of the class holds the addresses of the 5 virtual functions.
D. Every object of the class holds the address of a structure holding the addresses of the 5 virtual functions.
E. Every object of the class holds the address of the class declaration in memory, through which the virtual function calls are resolved.
———–11. (hit me) A C++ developer wants to handle a static_cast<char*>() operation for the class String shown below. Which of the following options are valid declarations that will accomplish this task?
class String {
//declaration goes here
A. char* operator char*();
B. operator char*();
C. char* operator();
D. String operator char*();
E. char* operator String();
———–12. Which of the following options describe the functions of an overridden terminate() function?
A. It performs the desired cleanup and shutdown processing, and then throws a termination_exception.
B. It performs the desired cleanup and shutdown processing, and then returns an error status value to the calling function.
C. It performs the desired cleanup and shutdown processing, and then calls abort() or exit().
D. It performs the desired cleanup and shutdown processing, and if it has restored the system to a stable state, it returns a value of “-1” to indicate successful recovery.
E. It performs the desired cleanup and shutdown processing, and then calls the unexpected() handler.
———–13. Which of the following options are returned by the typeid operator in C++?
A. A reference to a std::type_info object
B. A const reference to a const std::type_info object
C. A const std::type_info object
D. A reference to a const std::type_info object
E. A const reference to a std::type_info object
———–14. Which of the following statements accurately describe unary operator overloading in C++?
A. A unary operator can be overloaded with no parameters when the operator function is a class member.
B. A unary operator can be overloaded with one parameter when the operator function is a class member.
C. A unary operator can be overloaded with one parameter when the operator function is free standing function (not a class member).
D. A unary operator can only be overloaded if the operator function is a class member.
E. A unary operator can be overloaded with no parameters when the operator function is a free standing function (not a class member).
———–15. (hit me) What is the correct syntax for portable fstream file paths?
(e.g. the string you would pass to std::fstream::open() to open a file.)
A. “::directory:file.bin”
B. “C:/Directory/File.bin”
C. “/directory/file.bin”
D. “C:\Directory\File.bin”
E. std::fstream file paths are not portable.
———–16. (hit me) When a Copy Constructor is not written for a class, the C++ compiler generates one. Which of the following statements correctly describe the actions of this compiler-generated Copy Constructor when invoked?
A. The compiler-generated Copy Constructor makes the object being constructed, a reference to the object passed to it as an argument.
B. The compiler-generated Copy Constructor does not do anything by default.
C. The compiler-generated Copy Constructor performs a member-wise copy of the object passed to it as an argument, into the object being constructed.
D. The compiler-generated Copy Constructor tags the object as having been Copy-Constructed by the compiler.
E. The compiler-generated Copy Constructor invokes the assignment operator of the class.

[[safeC++]] – concise, pragmatic, unconventional wisdom

First off, this is a 120-page thin book, including about 30 pages of source code in the appendices. Light-weight, concise. Very rare.

I feel the author is bold to advocate avoidance of popular c++ features such as
– “Avoid using pointer arithmetic at all.”
– For class fields, avoid built-in types like int. Use Int type — no need to initialize.
– “Use the new operator only without bracket”. Prefer Vector to new[]
– “Whenever possible, avoid writing copy ctor and assignment operators for your class”

I feel these suggestions are similar to my NPE tactics in java. Unconventional wisdom, steeped in a realistic/pessimistic view of human fallibility, rather tedious, all about … low-level details.

Amidst so many books on architecture and software design, I find this book so distinctive and it speaks directly to me — a low-level detailed programmer.

I feel this programmer has figured out the true cost/benefit of many c++ features, through real experience. Other veterans may object to his unconventional wisdom, but I feel there’s no point proving or convincing. A lot of best practices [1] are carefully put aside by veterans, often because they know the risks and justifications. These veterans would present robust justifications for their deviation — debatable but not groundless.

[1] like “avoid global variables and gotos”

Given the author’s role as a quant developer I believe all of the specific issues raised are relevant to financial applications. When you read about some uncommon issue (examples in [1]), you are right to question if it’s really important to embedded, or telecom, or mainframe domains, but it is certainly relevant to finance.

Incidentally, most of the observations, suggestions are tested on MSVS.

–assert, smartPtr…
I like the sample code. Boost smart ptr is too big to hack. The code here is pocket-sized, even bite-sized, and digestible and customizable. I have not seen any industrial strength smart pointer so simple.

The sample code provided qualifies as library code, and therefore uses some simple template techniques. Good illustration of template techniques used in finance.

[1] for eg the runtime cost of allocating the integer ref count on P49; or the date class.

Any negative?

prevent heap instantiation of my class#Part 2#nQuant

I might have blogged about this…. [[more effC++]] by Scott Meyers has a chapter on this. The basic ideas, expressed in my own language for maximum absorption —

case 1) it’s relatively easy to prevent qq( new MyClass ). P157 basically says to declare a new-operator as a private static class member and leave it undefined.

— below cases 2 and 3 are non-issues in “our” projects, where team is small and guys are expected to read the source code comments —

case 2) MyClass could be subclassed to Der. If Der is beyond my control, then MyClass could (against my will) be instantiated on heap as a subobject

case 3) MyClass could be a field in class Umbrella. If Umbrella is beyond my control, then MyClass could (against my will) be instantiated on heap as a subobject

I feel in many enterprise app teams, Cases 2/3 are rare and impractical, because Der and Umbrella would be owned by myself or my own team.

Therefore, in many real world contexts it is feasible to prevent heap instantiation of my class, reducing memory leaks.

c++syntax – class template – brief notes

[[absoluteC++]] covers ordinary class template and subclassing an /unconcretized/ class template. The syntax can be quite “flexible”. I guess some compilers might allow you to omit a symbol here or there. I prefer to follow the full syntax, same habit as with c# lambda expressions. Less likely to confuse myself when reading the code 6 months later.

In practice, we just need to follow a consistent syntax that works. No need to know if some unusual syntax is standard-conforming or not. But on some occasions like online quizzes, we may need to play the language lawyer. Let’s look at one simple aspect — It’s good to know where the dummy type name “T” or “S” should (not) be repeated.

After template <typename T>, when we first name our new class, we don’t put <T> on its head —

template <typename T> class BB {};

However, subclassing this base class uses slightly different syntax —

template <typename T> class DD : public BB<T> {};

For an IKM c++ quiz, I used windows gcc to FAIL all of the below —

template <typename T> class DD<T> : public BB<T> {}; //WRONG – new template must not be “decorated”
template <typename T> class DD : public BB {}; // WRONG – base template must be “decorated”
template <typename T> class DD<T> : public BB {}; //WRONG – breaking both rules

Q: Do we ever use the class template BB without a concrete type and without a dummy type?
%%A: I haven’t seen any example. Looks like non-standard or obscure syntax

main() method requirements – briefly

Q: in c++ what signatures can main() have?
AA: return type must be “int”
int main() //tested
int main(int argc, char *argv[]) //tested
void main() //won’t compile.

Q: in c++ must main() actually return an int?
AA: No. If main() ends without a “return” statement, the compiler supplies an implicit “return 0;” — main() is the only function with this privilege.

——– c# ———-
Main() can only declare to return int or void.
Main() can be private

preprocessor – random notes

I now feel many C/C++ old hands exploit these preprocessor tricks to the fullest. These tricks are often impossible without the preprocessor.

Some useful tricks —

1) conditional Compilation is nice, but conditional Preprocessing offers even more tricks.
1b) you can undefine or redefine a macro including a “macro-function”; you can test if a macro is defined.
These are powerful features only available in the preprocessor. Arguably the #1 popular preprocessor feature.

2) #     error "#error inside ifdef"  // I used it to debug the preprocessor
    #     error "compiler should never reach here"   // like JUnit fail().

3) __LINE__, __DATE__
4) indent your preprocessor source code with a tab before/after the “#”

——- from [[c++without fear]] ——-
A) #if can comment out code chunks like /* multi-line comment */ but additionally supports nesting!!
A2) #if 0 … #endif comments a chunk out; change the 0 to 1 to un-comment

B) qq(defined) comes WITHOUT the pound; it is a Boolean operator almost exclusively used with #elif (and #if)
C) qq(#define) WITH the pound is frequently used with just a name, without “content”. Example –

#define MY_CLASS1_INCLUDED // and later …
#if defined(MY_CLASS1_INCLUDED)
// …
#endif

Preprocessor directives can be 1) macros, 2) conditional compilation or 3) includes. Not every directive is a macro.

For debugging, you could even printf from a conditional compile, outside a macro. Compare with #error —
a) no leading “#”. Remember this is conditional compile —

#  ifdef language
//#   error "inside ifdef"
      printf("%d", language);
#  endif

b) printf can only appear inside a C++ function — basically conditional compilation of the printf statement. In contrast,  #error can appear anywhere in a source file, even before any c++ code.

preprocessor syntax – a few tips #IKM

Never quizzed except in online QQ tests.

Based on an IKM question and my research on P372 [[ARM]]

1) redefining and #undef-ing a name is fine. I feel the #undef feature is rather useful, with no counterpart in the compiler proper.

2) can check if a macro is defined or not, using
#if defined AA
#ifdef AA
#if !defined AA
#ifndef AA

3) a macro can appear inside other preprocessor code, though such code is frowned upon
#define AA2(jkl, xyz) {#xyz #jkl}
char result[] = AA2(def, abc); // produces “abcdef” without space — adjacent string literals concatenate (pasting two string literals with ## does not form a valid token)

ITM binary call as TTL -> 0 or infinity

I was asked these questions in an exam (Assuming r = 0, and S_0 > K).

Given standard GBM dynamics of the stock, binary call price today is N(d2) i.e. risk-neutral probability of ITM.

As ttl -> 0, i.e. approaching expiry, the stock has little chance of falling below K. The binary call is virtually guaranteed ITM. So binary call price -> $1.

As ttl -> inf, my calc shows d2 -> −inf, so N(d2) -> 0. For an intuitive explanation, see the next post.

Pr(S_T > S_0 | r==0) == 50/50@@

This is an ATM binary call with K = S_0 .
Wrong thought – given zero drift, the underlier price process is stochastic but “unbiased” or “balanced”, so underlier is equally likely to finish above K or below K. This wrong thought assumes a BM not GBM.
Actually, the dynamics is given by
    dS = σ S dW
This is a classic GBM without drift, so S_T has a lognormal distribution – like a distorted bell shape. So we can’t conclude that probability[1] is 50%. Instead,
Pr(S_T > S_0 | r=0) = N(d2) = N(−σ√T/2)    … which is ≤ 50%
This probability becomes 50% only if sigma is 0, meaning the price doesn’t move at all, like a zero-interest bank account. In that limit the GBM degenerates to a frozen, timeless constant.
[1] under risk-neutral measure

today’s price == today’s expectation of tomorrow’s price

“today’s [3] price[1] equals today’s[3] expectation[2] of tomorrow’s price” — is a well-known catch phrase. Here are some learning notes I jotted down.

[1] we are talking about tradeable assets only. Counter-examples – an interest rate is not the price of a tradeable asset, and a dividend-paying stock leaks value to dividends, so neither follows this rule.

[2] expectation is always under some probability distribution (or probability “measure”). Here the probability distro is inferred from all market prices observable Today. The prices on various derivatives across different maturities enable us to infer such a probability distribution. Incidentally, the prices have to be real, not some poor bid/ask spread that no one would accept.

[3] we use Today’s prices of other securities to back out an estimated fair price (of the target security) that’s fair as of Today. Fair meaning consistent with other prices Today. This estimate is valid to the extent those “reference prices” are valid. As soon as reference prices change, our estimate must re-adjust.

[[debug it]] c++, java.. — tips

I find this book fairly small and practical. No abstract theories. Uses c++  java etc for illustrations.

Covers unix, windows, web app.

=== debugging memory allocators
memory leaks
uninitialized variable access
variable access after deallocation
Microsoft VC++ has a debugging mem allocator built in. Also try Electric Fence.

–P201 DTrace – included in Mac OS X
–P202 WireShark, similar to tcpdump
–P203 firebug – client-side debugging
edit DOM
full javascript debugging

–P188 rewrites – pitfalls

–A chapter on reproducing bugs — quite practical

GBM formulas – when to subtract 0.5σ^2 from u

Background – I often get confused when (not) to subtract. Here’s a brief summary.
The standard GBM dynamic is
                dS = mu S dt + σ S dW …. where mu and σ are time-invariant.
The standard solution is to find the dynamics of logS, denoted L,
                dL = (mu − 0.5σ²) dt + σ dW …  a BM, not GBM. No L on the RHS.
                L(time=T)  ~  N(mean = L(time=0) + (mu − 0.5σ²)T, std = σ√T)
So it seems our mu can’t get rid of the −0.5σ² thingy … until we take expectation of S(time=T):
                E S(time=T) = S(time=0) exp(mu·T)     … no σ² term
When we write down the Black-Scholes PDE we use mu without the −0.5σ² thingy.
The BS formula uses mu without the −0.5σ² thingy.

##[13]strategic(5Y+) investment in tech xx

custom wait/notify framework?

Q3: Out of so many tech topics, what can I invest now? In 5 years when I look back, what kind of learning today would prove strategic? What would give me the long-term confidence, the non-trivial long-term competitive edge [1], the sense of long-term employ-ability and long term market value, the long-term buoyant demand. What tech skills are churn-resistant?

[1] “competence” or zbs is less relevant — when competing for a role our potential on-the-job competence is evaluated only through tech quizzes. I have witnessed countless competent developers showing lack of theoretical knowledge in tech quizzes. You can complete 3 WCF projects with barely 1% of WCF knowledge — #1 reality in developer knowledge. Wizards like Venkat and the Swedish hacker in can get many tricky things done in no time, but their theoretical knowledge (not wizardry) is what matters on job market.

Q5: Looking back at my last 7 years (after those 3 years of self-employment), which specific acquired skills have proven strategic?

A key factor is getting into Wall St. Without entering Wall St, my “world view” would be completely different.

Fundamentally, the boat we are in is unstable. Technology Churn. Competition from young Indian guys. Unlike medical doctors, the older guys lose market value. Any in-demand skill may fall out of favor in 10 years. I doubt I can pick some skill to invest, which will provide an iron rice bowl till age 55. Most of my prognosis would be uncertain, but some are less uncertain than others.

Less uncertain — As I age, salary and job security are becoming more important than glamour, buzzwords… but I want to remain relevant.
Less uncertain — better to keep the focus on interview topics not on-the-job GTD skills including localSys. Real world problem solving skills (untested in IV) are less market-strategic
Less uncertain — shelf life is longer in the US, so pay attention to US-style IV
Less uncertain — diminishing return on additional investment in java or SQL
Less uncertain — c++ xx has paid off in numerous job interviews.
Less uncertain — knowledge of “major” languages offers better shelf-life and value than scripting but python might buck the trend.
Less uncertain — what gives me self-confidence in any tech is not GTD experience, but successful IV experience. Many know SQL for a long time but fail non-trivial IV questions.
Uncertain — better stand ready to accept and embrace relocation
Uncertain — get closer to trading and decision making
Uncertain — c++ will hold its ground for 10Y
uncertain — continue to identify and invest in the non-trivial, somewhat niche technical skills with high market value. Avoid mainstream stuff like ASP or web technology.
Uncertain — java may not maintain its dominance, though it has so far
Uncertain — financial centers may or may not continue outsourcing

A5: (loosely ranked) threading, collections, STL, smart pointers, memory management, socket, linux, SQL, sort/search algo
A5: the non-trivial experience applying comp sci constructs to real problems such as my custom wait/notify framework in BAML. This is one of the few successful and innovative designs
A5: big outer-joins experience in GS
A5: dnlg – 3 types
A5: not sure at this moment —  other boost libraries, python, ajax, FIX

A3: learning more java ecosystem (like frameworks, messaging…) has diminishing returns
A3:  More c#  zbs may not help, but more interviews will.
A3: More c++ zbs may not help, but more interviews will.

a contract paying log S_T or S^2

Background – It’s easy to learn BS without knowing how to price simpler contracts. As shown in the 5 (or more) examples below, there are only a few simple techniques. We really need to step back and see the big picture.
Here’s a very common pricing problem. Suppose IBM stock price follows GBM with mu and σ. Under RN, the drift becomes r, the bank account’s constant IR (i.e. the riskfree rate). Therefore,

Given a (not an option) contract that on termination pays log S_T, how much is the contract worth today? Note the payoff can be negative.
Here’s the standard solution —
1) change to RN measure, but avoid working with the discounted price process (too confusing). 
2) write the RN dynamics as a BM or GBM. Other dynamics I don’t know how to handle.
Denote L := log S_t and apply Ito’s
dL = A dt + σ dW … where A is a time-invariant constant. So the log of any GBM is a BM.
I think A = r − 0.5σ² but the exact formula is irrelevant here.
3) so at time T, L ~ N(mean = L_0 + A·T, std = …)
4) so the RN-expectation of the time-T value of L is L_0 + A·T
5) discount the expectation to PV
Note L isn’t the price process of a tradable, so the below is wrong.
E(L_T / B_T) = L_0 / B_0   … CANNOT apply the martingale formula
— What if payout = S_T² ? By HW4 Q3a the variable J_t := S_t² is a GBM with some drift rate B and some volatility.  Note this random process J_t is simply derived from the random process S_t. As such, J_t is NOT the price of any tradable asset [1].
Expectation of J’s terminal value = J_0 exp(B·T)
I guess B = 2r + σ² but it’s irrelevant here.

[1] if J_t were a price process, then its discounted value would be a martingale i.e. 0 drift rate. Our J_t isn’t a martingale. It has a drift rate, but that drift rate isn’t equal to the riskfree rate; only a tradable’s price process has such a drift rate. To clear the confusion, there are 3 common cases
1) if J_t is a price process (GBM or otherwise), then under the RN measure the drift term in it must be r J_t. See P5.16 by Roger Lee
2) if J_t is a discounted price process, then under the RN measure the drift rate is 0 — a martingale.
3) if J_t is not a price process, then under the RN measure the drift rate can be anything.

— What if payout = max[0, S_T² − K]?  This requires the CBS formula.
— What if payout = max[0, log S_T − K]? Once you know the RN distribution of log S_T is normal, this is tractable.
— What if payout = max[0, S_T − K] but the stock pays continuous dividend rate q? Now the stock price process is not a tradeable.
No, we don’t change the underlier to the tradeable bundle. We derive the RN dynamics of the non-tradeable price S as
dS = (r − q) S dt + σ S dW … then apply the CBS formula.
So far all the “variables” are non-tradeable, so we can’t apply the MG formula.
— What if payout = S_T − X_T where both are no-dividend stock prices? Now this contract can be statically replicated. Therefore we take an even simpler approach. The price today is exactly S_0 − X_0.

##basic steps in vanilla IRS valuation, again

* First build a yield curve using all available live rates. This “family photo” alone should be enough to evaluate any IRS
* Then write down all (eg 20) reset dates aka fixing date.
* Take first reset date and use the yield curve to infer the forward 3M Libor rate for that date.
* Find the difference between that fwd Libor rate and the contractual fixed rate (negotiated on this IRS contract). Could be +/-
* Compute the net cashflow to occur on that fixing/reset date.
* Discount that cashflow to PV. The discounting curve could be OIS or Libor based.
* Write down that amount.

Repeat for the next reset date, until we have an amount for each reset date. Notice all 20 numbers are inferred from the same “family photo”. Tomorrow under a new family photo, we will recalc/reval all 20 numbers.

Add up the 20 amounts to get the net PV in that position. Since the initial value of the position is $0, this net value is also the PnL.

chat with Yi Hai on quant finance education

I see more and more young people with this kind of degree getting good salaries (50% to 66% of our salaries within 3 years of graduation) as researchers or analysts. This skill has higher value than programming skill. More importantly, this skill suffers no “technology churn”.

My training so far is on derivative pricing (+ modeling) theories and portfolio theory. These are all classic topics with well-established theories. very academic, idealized theories, but surprisingly generations of students get trained on these and go into the field and make good money.

I have working knowledge in option pricing, bond math and a bit of personal experience as an equity investor. I find the theories too academic, too idealist, far removed from real trading. It feels like traditional Chinese landscape painting — seriously distorted and out of perspective.

These theories are actually the content of all the major quant finance programs. Therefore, many young quants every year are trained on these theories.

In fact, the professors can't teach anything but academic theories. These are what they are best at. Professors aren't traders or fund managers, so their strength is not in application of theories to practical challenges in order to achieve investment success.

The theories are impressive efforts to simplify the complicated realities in finance. (Best examples would be tail risk, volatility skew) But they are still rather imperfect. However, the trainees often went on to get good jobs and make good money.

(instantaneous) fwd rate

I believe fwd rate refers to an interest rate from a future start date (like next Aug) to a future maturity date (next Nov). We are talking about the market rate to transpire on that start date. That yet-unknown rate could be inferred (in a risk-neutral sense) today, using the live market rates.

The basic calc is documented in my blog …

When the loan tenor becomes overnight (or, theoretically, shorter than a nanosec), we call it the instantaneous fwd rate. This rate, again, can be estimated. Given observation time is today, we can estimate the fwd rate for different “fwd start dates”, denoted tau. We can plot this fwd rate as a function of tau.

BS-F -> Black model for IR-sensitive underliers

Generalized BS-Formula assumes a constant interest rate Rgrow :
Call (at valuation time 0) := C0 = Z0 * ( F0 N(d1) – K N(d2) ), where
Z0 := P(0,T) := observed time-0 price of a T-maturity zero coupon bond. There’s no uncertainty in this price as it’s already revealed and observed. We don’t need to assume constant interest rate.
F0 := S0 exp(Rgrow * T), which also appears inside d1 and d2
Q: How does this model tally with the Black model?
A: Simply redefine
F0 := S0 / Z0 , which is the time-0 “fwd price” of the asset S
Now the pricing formula becomes the Black formula for interest-rate-sensitive options. Luckily this applies even if S is the price of 5Y junk bond (or muni bond), and we know 5Y interest rate is stochastic and changes every day.

FRA/ED-Fut: discount to fwd settlement date

–Example (from Jeff’s lecture notes)–
Assume on 12 Nov you buy (borrow) a 3×9 FRA struck at 5.5% (paying 5.5%) on 1M notional. On 12 Feb, 6M Libor turns out to be 5.74% , compensation due to you =

$1M x (0.0574-0.055) * 180/360 / (1 + 0.0574*180/360) = $1166.52
——–Notation ——-
Libor fixing date = 12 Feb

“accrual end date” (my terminology) = 12 Aug.

settlement could be either before or (occasionally) after the 6M loan tenor. This example uses (more common) fwd settlement.
disc factor from 12 Aug to 12 Feb = 1/ (1 + 0.0574 * 180/360)
Note the “interest due date” is always end of the 6M accrual period. Since we choose fwd settlement, we discount that cashflow to the fixing date.

annualized interest Rate difference = 5.74 %- 5.5%
pro-rated  interest Rate difference = (0.0574-0.055) * 180/360
difference in interest amount (before discounting) = $1M x (0.0574-0.055) * 180/360. This would be the actual settlement amount if we were to settle after the 6M loan period. Since we choose fwd settlement …

discounting it from 12 Aug to 12 Feb = $1166.52
Now we come to the differences between FRA and ED Futures.
1) a simple difference is the accrual basis. ED futures always assume exactly 90/360; FRA is act/360.
2) Another simple difference is, ED Futures always uses 3M libor, so our example must be set on Mars where ED futures are 6M-Libor-based.

3) The bigger difference is the discounting to fwd settlement date or fixing date.
– EDF gets away without the PV discounting. It takes Libor rate as upfront interest rate like in Islamic banking. Since Libor turns out to be 5.74% but you “bought” at 5.5%, the difference in interest amount is, under EDF, due immediately, without discounting to present value.
– the payout, or price, is linear with the Libor rate L.
– this is essentially due to daily mark-to-market margin calculation
* FRA takes Libor rate as a traditional loan rate, where interest is due at end of loan period.
** under late settlement, the amount is settled AFTER the 6M, on the proper “interest due date”. (Linear with L)
** under fwd settlement, the amount is settled BEFORE the 6M, but PV-discounted. This leads to a non-linear relationship with libor rate and convexity adjustment.

fwd disc factor, fwd rate … again

(See other posts in this blog. I think they offer simpler explanations.)

(Once we are clear on fwd disc factor, it’s easy to convert it to fwd rate.)

basic idea — discount a distant future income to a nearer future date, rather than to today.

First we need to understand all the jargon around PV discounting which discounts to today…

The fwd discount factor discounts an income (or outflow) from a distant future date M (eg Nov) to a “nearer day” T [1] (eg Aug), based on information available as of today “t” — a snapshot “family photo”. That discount factor could be 0.98. We write it as P(today, Aug, Nov) = 0.98. The fwd discount function P(t, T, M) can be interpreted as discounting $1 of income from Nov (M) to Aug (T), given information available as of today (t) — something like P( Nov -} Aug | today), reversing the order of the 3 dates. As t moves forward, more info becomes available, so we adjust our expectation, perhaps to a rather different value like 0.80.

The core math concept is very simple once you get used to it. $0.7 today grows to $1 in Aug, and $1.25 in Nov. These 2 numbers are implied/derived from today’s prices. These are the risk-neutral expectations of the “growth”. So $1.25 in Nov is worth $0.7 today, i.e.

  P(Nov -} today) = 0.7/1.25. Similarly
  P(Aug -} today) = 0.7/1

These are simple discount factors, Now fwd discounting is

  P( Nov -} Aug | today) = 1/1.25 = 0.8

The original notation is P(today, Aug, Nov) = 0.8.

Note the 0.80 value is not discounted to today, but discounted to next month i.e. Aug only. For PV calculation, we often need to apply discounting on top of the fwd discount factor.

fwd rate is like an interest rate. 0.8 would mean 25% fwd rate.

family photo ^ family video – yield curve

snapshot – The yield curve (yc) is a snapshot.
snapshot – term structure of IR is another name of the yc.
snapshot – discount curve is the same thing

On a given snapshot, we see today’s market prices, yields and rates of various tenors[1]. From this snapshot, we can derive[2] a forward discount factor between any 2 dates. Likewise, we can derive the forward 3M-Libor rate for any target date.

Looking at the formula connecting the various rates, it’s easy to mix the family photo vs the family video.
– family photo is the snapshot
– family video shows the evolution of all major rates (about 10-20) on the family photo.
** an individual video shows the evolution of a particular rate, say the 3M rate. Not a particular bond, since a given bond’s maturity will shrink from 3M to 2M29D in the video.
All the rate relationships are defined on a snapshot, not on a video.

I guess we should never differentiate wrt “t”, though we do, in a very different context (Black), integrate wrt “t”, the moving variable in the video.

An example of a confusing formula is the forward rate formula. It has “t” all over the place, but “t” is really held constant. The t in the formula means “on a given family photo dated t”. When studying fixed income (and derivatives) we will encounter many such formulas. The photo/video distinction is part of the lingo, so learn it well.

Also, Jeff’s HJM slide P12 shows how the discount bond’s price observed at time t is derived by integrating the inst fwd rates over each day (or each second) on a family photo.

[1] in an idealized, fitted yc, we get a yield for every real-valued tenor between 0 and 30, but in reality, we mostly watch 10 to 20 major tenors.

[2] The derivation is arbitrage free and consistent in a risk-neutral sense.

barebones web server in WCF: no app.config needed

using System;
using System.IO;
using System.ServiceModel;
using System.ServiceModel.Web;

namespace StaticWebServer
{
    public class WebServerHost
    {
        public static void Main()
        {
            var host = new WebServiceHost(typeof(StaticContentService), new Uri("http://localhost:8000/"));
            host.AddServiceEndpoint(typeof(IStaticContentService),
                new WebHttpBinding() { TransferMode = TransferMode.Streamed }, // to support large files like 70MB
                "");
            host.Open();
            while (true) Console.ReadKey(true);
        }
    }

    [ServiceContract]
    public interface IStaticContentService
    {
        [OperationContract]
        Stream GetStaticContent(string content);
    }

    /// <summary>
    /// In total, exactly *2* files deployed to C:/temp --
    /// Since I copied my compiled EXE (#1) to C:/temp, I needed a dummy C:/temp/www/a.html (#2).
    /// My browser could then show http://localhost:8000/www/a.html
    /// - no config file -- You can remove app.config completely.
    /// - no dll,
    /// - no IIS,
    /// - no windows service.
    /// This WCF service is hosted in a console host.
    /// </summary>
    class StaticContentService : IStaticContentService
    {
        [WebInvoke(Method = "GET", BodyStyle = WebMessageBodyStyle.Bare, UriTemplate = "/www/{*content}")]
        public Stream GetStaticContent(string content)
        {
            Console.WriteLine("GetStaticContent() " + content);
            var response = WebOperationContext.Current.OutgoingResponse;
            string path = "www/" + (string.IsNullOrEmpty(content) ? "a.html" : content);
            if (File.Exists(path))
            {
                response.ContentType = "text/html";
                response.StatusCode = System.Net.HttpStatusCode.OK;
                return File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
            }
            response.StatusCode = System.Net.HttpStatusCode.NotFound;
            return null;
        }
    }
}

drv contract to swap 2 assets

nice brain teaser. No math required. Just common sense.

Suppose assets S and X have GBM dynamics (with mu_s, mu_x, sigma_s, sigma_x etc).
Suppose assets S’ and X’ have GBM dynamics too.

There’s a contract C paying (S_T − X_T) on termination. Could be a call option or any other contract.
There’s a contract C’ paying (S’_T − X’_T) on termination.

Q: Given X vs X’ have the same price today (and ditto S vs S’), what can we say about the C vs C’ price today?

To be concrete, say X_0 = X’_0 = $0.5 and S_0 = S’_0 = $3.

A replication portfolio for contract C — long 1 unit of S and short 1 unit of X. This portfolio has current price S_0 − X_0 = $2.5. A similar replication portfolio for C’ also has current price $2.5. Tomorrow, the replication portfolios may have different prices. On expiration, they may differ in value, though each portfolio remains a perfect replication.

So the GBM dynamics are irrelevant!

N(d1) N(d2) – my summary

n1 := N(d1)
n2 := N(d2)
Note n1 and n2 are both about probability distributions, so they always assume some probability measure. By default, we operate under (not the physical measure but) the risk-neutral measure with the money-market account as the “if-needed, standby” numeraire. 
– n2 is the implied probability of the stock finishing above the strike, implied from various live prices. RN measure.
– n1 is the same probability but implied under the share-measure. Therefore,
S_T · N(d1) would be the weighted average payoff (i.e. expected payoff) of the asset-or-nothing call, under the share-measure.
S_t · N(d1) would be the PV of that payoff, i.e. the current price of the asset-or-nothing call. Note as soon as we talk about price, it is automatically measure-independent.
Remember n2 is between 0 and 1, so it reminds us of … the binary call. I think N(d2) is the weighted average payoff of the binary call. RN measure. Therefore,
N(d2) exp(−Rdisc·T) — if we discount that weighted average payoff to Present Value, we get the current price of the binary call. Note all prices are measure-independent.
N(d1) is also the delta of the Vanilla call, measure-independent. Given the call’s delta, using PCP, we work out the put’s delta (always negative) = N(d1) − 1 = −N(−d1)

The pdf N′(d1) appears in the gamma and vega formulas, measure-independent, i.e.
gamma = N′(d1) × some function of (S, t)

vega = N′(d1) × some other function of (S, t)

Notice we only put K (never S) in front of N(d2)

GBM + zero drift

I see zero-drift GBM in multiple problems
– margrabe option
– stock price under zero interest rate
For simplicity, let’s assume X_0 = $1. Given

        dX =σX dW     …GBM with zero drift-rate

Now denoting L:= log X, we get

                dL = −½σ² dt + σ dW    … a BM, not GBM. No L on the RHS.
Now L as a process is a BM with a linear drift (rather than exponential growth).
log X_t ~ N ( log X_0 − ½σ²t ,   σ²t )
E log X_t = log X_0 − ½σ²t  ….. [1]
=> E log( X_t / X_0 )  = −½σ²t  …. so expected log return is negative?
E X_t = X_0 …. X_t follows a lognormal squashed-bell distribution whose x-axis extends over (0, +inf) [3].

Look at the lower curve below.
Mean = 1.65 … a pivot here shall balance the “distributed weights”
Median = 1.0 …half the area-under-curve is on either side of Median i.e. Pr(X_t < median) = 50%

Therefore, even though E X_t = X_0 [2], as t goes to infinity, paradoxically Pr(X_t<X_0) goes to 100% and most of the area-under-curve would be squashed towards 0, i.e. X_t likely to undershoot X_0.

The diffusion view — as t increases, more and more of the particles move towards 0, although their average distance from 0 (i.e. E X_t) is always X_0. Note 2 curves below are NOT progressive.

The random walker view — as t increases, the walker is increasingly drawn towards 0, though the average distance from 0 is always X_0. In fact, we can think of all the particles as concentrated at the X_0 level at the “big bang” of diffusion start.

Even if t is not large, Pr(X_t &lt; X_0) &gt; 50%, as shown in the taller curve below.

[1] horizontal center of of the bell shape become more and more negative as t increases.
[2] this holds for any future time t. Eg: 1D from now, the GBM diffusion would have a distribution, which is depicted in the PDF graphs.
[3] note like all lognormals, X_t can never go negative 

[figure: “Comparison mean median mode.svg” — lognormal density, showing mode &lt; median &lt; mean]

bond ^ deposit , briefly

Bond and deposit are the 2 basic, basic FI instruments, underlying most interest rate derivatives.

Both pay interest, and therefore have an accrual basis, like act/360 or 30/360

Both have settlement conventions, such as T+2. Note Fed Fund deposit is T+0.

# 1 difference in pricing theories — maturity value is known for a bond; in contrast, for some important deposits (money-market deposits) we only know the total market value tomorrow, not beyond. Though many real-life fixed deposits have a long tenor comparable to bonds, the deposits used in pricing theories are “floating” overnight deposits.

# 2 difference — a bond has maturity value of exactly $1 and trades at a discount before maturity, making it an ideal embodiment of the discount factor. A deposit starts at $1 and grows in value due to interest.

–1) Bonds
eg of bonds — all treasury debts, corp debts, muni debts.

Has secondary market

bonds are the most popular asset for repo.

–2) Deposits are fairly similar to zero-coupon bonds.
eg of deposit — Fed Fund deposit, or deposits under other central banks. Unsecured
eg of deposit — Eurodollar deposit, in about 20 major currencies. Unsecured

OIS is based on deposits (Fed Fund deposit)

Libor is based on eurodollar deposits, for a subset (5) of the currencies.

Libor IRS and OIS IRS – all based on deposits.

No secondary market.

I feel deposits tend to be short term (1Y or less)

what c++ topics are valuable in the long run on WallSt

When you first pick up a language (like c++) from 11 textbooks, you form a view of what topics (know-how/insights) are important, and worth spending time on.

However, once in the field you will get a different view. Over the years, I realized
++ remote debugger is critical

++ lots of C techniques are widely used on Wall St.

++ MSVS is a dominant tool, as is the GNU toolkit.

++ multiple inheritance is still relevant

— most whiteboard coding will use C constructs + vector/string, not fancy c++ features

— STL containers are widely adopted, but not everything in STL is equally important.
— portability isn’t an issue.

— template techniques are powerful but seldom used on Wall St
— boost shared_ptr is widely used, but not other boost libraries

— strings — many teams have their own in-house string classes, so std::string is less important

— I feel OO techniques are not as popular/easy as in java/c#

So over 50 years (if I were to work that long), I’m going to discover all the important topics. Now the question is how to discover them quickly, so as to focus on the most valuable.

Answer — I feel a beginner is better off changing project frequently. The more diverse projects you take on, the faster you discover those important “topics”.

A rolling stone gathers no moss? Well, I feel this is not the case for the beginner dynamic traveling consultant 🙂

test if a key exists in multimap: count() perf tolerable

map::count() is simpler and slightly slower 🙂 than map::find()

Even for a multimap, count() is slower but good enough in a quick coding test. Just add a comment to say “will upgrade to find()”

In terms of cost, count() is only slightly slower than find(). Note multimap::count() complexity is Logarithmic in map size, plus linear in the number of matches… Probably because the matching entries live together in the RB tree.

std::map key is always field@payload@@

I encounter many situations like map<string, Trade>, where the string tradeRef is also a field of the Trade instance.

Q: is it inefficient to have each string value saved twice in the data structure?

  • A1: in pre-C++11 libraries std::string was often reference counted (copy-on-write), so the duplication was cheap; C++11 forbids COW, so the string really is stored twice
  • A2: map keys should be small. Big key values slow down everything
  • A3: my colleague Paul pointed out a limited technique using a set<Trade> that sorts itself by tradeRef.
    • limitation — no lookup by key. Sometimes, lookup is the main usage!
  • A4: See my solution in bbg coding IV: N most active stocks

yield curve -> fwd rate, spot rate …

This is yet another blog post about yield curve, fwd rate, spot rate etc

Let’s say we have a bunch of similar derivative instruments [1] on IBM. Each has an expiry date at each month end. For the Feb instrument, on the expiry date (end of Feb) all uncertainties would vanish and the value of the instrument would be determined/fixed. Therefore it’s practically possible to cash settle on that day. Alternatively the contract may specify a later maturity date (eg 3M from expiry/fixing) for the actual cashflow to occur.

Today, I can record all the current prices of this family of (eg 9) instruments. A minute later I can record their new prices… I keep doing it and get 9 (time-series) streams of live prices.

The “live yield curve” is something similar. The 9 instruments are the 9 deposit maturities we monitor, perhaps {1M, 3M, 6M, 1Y, 2Y, 3Y, 5Y, 10Y, 30Y …} These prices, after converting to yield numbers, actually comprise a 9-point yield curve. From this snapshot yc, we can derive many useful rates, such as (instantaneous) forward rates, spot rates, short rates… all valid at this moment only.

An additional complexity is discounting the cash flow. Whether the cash flow occurs on fixing date or on maturity date, we need to discount to valuation time (moment of observation), using a discounting curve such as the OIS curve.

Every minute, we re-sample live prices, so this 9-point yield curve (and the discount curve) shifts and wiggles by the minute.

[1] Could be a bunch of forward contracts, or a bunch of binary put options, etc.