dynamic dice game (Zhou Xinfeng book)

P126 [[Zhou Xinfeng]] presents this game:
Game rule: you toss a fair die repeatedly until you choose to stop or you lose everything by rolling a 6. If you roll 1/2/3/4/5, you earn an additional $1/$2/$3/$4/$5. This game has an admission price. What is a fair price? In other words, what is the expected take-home earning at the end of the game?

Let’s denote the amount of money you take home as H. Your net profit/loss would be H minus admission price. If 555 reasonable/intelligent people play this game, then there would be 555 H values. What’s the average? That would be the answer.

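It’s easy to see that if your cumulative earning (denoted h) is $14 or less, then you should keep tossing: one more toss followed by stopping is worth (5h+15)/6 on average, which exceeds h exactly when h < 15.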

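Exp(H|h=14) is based on 6 equiprobable outcomes. Let’s denote Exp(H|h=14) as E14. After one more toss from $14 you either hold nothing or hold at least $15 and stop, so with h=14:
E14 = 1/6 × $0 + 1/6(h+1) + 1/6(h+2) + 1/6(h+3) + 1/6(h+4) + 1/6(h+5) = $85/6 ≈ $14.167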

E15 = 1/6 × $0 + 1/6(h+1) + 1/6(h+2) + 1/6(h+3) + 1/6(h+4) + 1/6(h+5) where h=15, so E15 = $15. When we have accumulated $15, tossing again and stopping are worth the same, so we can either stop or roll again.

It’s trivial to prove that E16=$16, E17=$17 etc. because we should definitely leave the game once we hold more than $15: one more toss is worth (5h+15)/6 on average, which is less than h, so we have too much at stake.

How about E13? It’s based on 6 equiprobable outcomes.
E13 = 1/6 $0 +1/6(E14) + 1/6(E15) + 1/6(E16) + 1/6(E17) + 1/6(E18) = $13.36111
E12 = 1/6 $0 + 1/6(E13) +1/6(E14) + 1/6(E15) + 1/6(E16) + 1/6(E17) = $12.58796296

E1 =  1/6 $0 + 1/6(E2) +1/6(E3) + 1/6(E4) + 1/6(E5) + 1/6(E6)

Finally, at start of game, expected end-of-game earning is based on 6 equiprobable outcomes —
E0 =  1/6 $0 + 1/6(E1) + 1/6(E2) +1/6(E3) + 1/6(E4) + 1/6(E5) = $6.153737928
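The recursion above can be checked with a short backward-induction sketch. It assumes the stopping rule derived earlier (keep tossing while h < 15, stop otherwise); the function name and parameters are illustrative and not from the book.

```python
from fractions import Fraction

def expected_take_home(stop_at=15, max_gain=5, sides=6):
    """Return {h: E_h}, the expected take-home amount when holding h dollars,
    assuming the player keeps tossing while h < stop_at and stops otherwise."""
    top = stop_at + max_gain - 1  # largest holding reachable before stopping
    # At or above the stopping level we simply keep what we hold.
    E = {h: Fraction(h) for h in range(stop_at, top + 1)}
    # Work backwards from h = stop_at - 1 down to h = 0.
    for h in range(stop_at - 1, -1, -1):
        # A 6 pays $0; faces 1..5 move us to h+1 .. h+5.
        E[h] = sum(E[h + k] for k in range(1, max_gain + 1)) / sides
    return E

E = expected_take_home()
for h in (14, 13, 12, 0):
    print(h, float(E[h]))
```

Exact fractions are used so that E14 comes out as 85/6 rather than a rounded float; E0 is the fair admission price.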

Essential BS-M, my 2nd take

People ask me to give a short explanation of Black-Scholes Model (not BS-equ or BS-formula)…

I feel random variable problems always boil down to the (inherent) distribution, ideally in the form of a probability density function.

Back to basics. Look at the heights of all the kids in a pre-school: there’s a distribution. The simplest way to describe this kind of distribution is a histogram with bins such as [0.8-1m], [1-1.2m], [1.2-1.4m] … A probability distribution is a precise description of how the individual heights are “distributed” in a population.

Now consider another distribution: toss 10 fair dice at once and add up the points to a “score”. Keep tossing to get a bunch of scores and examine the distribution of scores. If we know the inherent, natural distribution of the scores, we have the best possible predictor of all future outcomes. If we get one score per day, we can then estimate how soon we are likely to hit a score above 25. We can also estimate, by the 30th toss, how “surely” the cumulative score would have exceeded 44.
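For the 10-dice score, the inherent distribution can be written down exactly. The sketch below builds it by repeated convolution and then reads off the chance of a score above 25; the function name is illustrative, and the "expected days" line assumes independent daily tosses (a geometric waiting time).

```python
from fractions import Fraction

def dice_sum_distribution(n_dice=10, sides=6):
    """Exact probability of each total when n_dice fair dice are tossed."""
    dist = {0: Fraction(1)}
    for _ in range(n_dice):
        nxt = {}
        for total, p in dist.items():
            for face in range(1, sides + 1):
                # Each face is equally likely, so spread p evenly over the faces.
                nxt[total + face] = nxt.get(total + face, Fraction(0)) + p / sides
        dist = nxt
    return dist

dist = dice_sum_distribution()
mean = sum(score * p for score, p in dist.items())
p_above_25 = sum(p for score, p in dist.items() if score > 25)
print("mean score:", float(mean))
print("P(score > 25):", float(p_above_25))
# Expected number of daily tosses until the first score above 25.
print("expected days:", float(1 / p_above_25))
```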

For most random variables in real life, the inherent distribution is not a simple math function like our little examples. Instead, practitioners work out a way to *characterize* the distribution. This is the standard route to solve random variable problems because characterizing the underlying distribution (of the random variable) unlocks a whole lot of insights.

Above are random variables in a static context. Stock price is an evolving variable. There’s a Process. In the following paragraphs, I have mixed the random process and the random variable at the end of the process. The process has a σ and the variable (actually its value at a future time) also has a σ.
In option pricing, the original underlying Random Process Variable (RPV) is the stock price, which is not easy to characterize. Instead, the pioneers picked an alternative RPV, R, defined as ln(S_{n+1} / S_n), and managed to characterize R’s behavior. Specifically, they characterized R’s random walk using a differential equation parametrized by σinst, i.e. the instantaneous volatility [1]. This is the key parameter of the random walk, or the Geometric Brownian motion.

Binomial-tree is a popular implementation of BS. B-tree models a stock price [2] as a random walker taking up/down steps every interval (say every second). To characterize the step size S_{n+1} – S_n, we would like to get the distribution of step sizes, but that is too hard. As an alternative, we assume R follows a standard Wiener process so the value of R at any future time is normally distributed. But what is this distribution about?

Remember R is an observable random variable recorded at a fixed sampling frequency. Let’s denote the R values at each sampling point (i.e. each step of the random walk) as R1, R2, R3, R4 … We treat each of them as an independent random variable. If we record a large series of R values, we see a distribution, but this is the wrong route. We don’t want to treat the time series values R1, R2 … as observations of the same random variable. Instead, imagine a computer picking an R value at each step of the random walk (like once a second). The distribution of each random pick is programmed into the computer. Each pick has a distinct Normal distribution with a distinct σinst_1, σinst_2, σinst_3 … [4]
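The "computer picking an R value at each step" picture can be sketched as below. This is only an illustration under stated assumptions: the function name, the `drift` parameter and the annualized-volatility convention are mine, and the mean of each pick uses the usual log-return adjustment (drift − σ²/2)·dt for a geometric Brownian motion.

```python
import math
import random

def simulate_path(s0, sigmas, dt, drift=0.0, seed=None):
    """Simulate one stock-price path.

    sigmas: one instantaneous volatility (annualized) per step.
    dt:     step length in years.
    drift:  annualized drift of S (an assumption of this sketch).
    """
    rng = random.Random(seed)
    path = [s0]
    for sigma in sigmas:
        # Each R = ln(S_{n+1} / S_n) is drawn from its own Normal distribution.
        r = rng.gauss((drift - 0.5 * sigma**2) * dt, sigma * math.sqrt(dt))
        path.append(path[-1] * math.exp(r))
    return path

# A constant sigma list is the BS simplification; a varying list is the general picture.
path = simulate_path(100.0, [0.2] * 252, dt=1 / 252, seed=1)
print(len(path), path[-1])
```

Passing a list of volatilities, rather than a single number, keeps the per-step distributions distinct, as described above.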

In summary, we must analyze the underlying distribution (of S or R) to predict where S might be in the future[3].
[4] A major simplifying assumption of BS is a time-invariant σinst, which characterizes the distribution of R at each step of the random walk. Evidence suggests the diffusion parameter σinst does vary and primarily depends on time and current stock price. The characterization of σinst as a function of time and S is a cottage industry in its own right and is the subject of skew modelling, vol surface, term structure of vol etc.

[1] All other parameters of the equation pale in significance — risk-free interest i.e. the drift etc.
[2] While S is a random walker, R is not really a random walker. See other posts.
[3] Like in the dice case, we can’t predict the value of S but we can predict the “distribution” of S after N sampling periods.

speak freely, westerner, humor…

Sometimes you tire of paralanguage-monitoring and self-shrinking. It can feel tiring[4] (for the untrained) to be always on our toes and to avoid sticking out. Quiet people are presumably more comfortable with that, but I'm not a quiet person (though I'm rather introspective or “looki”).

Sometimes you just want to, for a moment, be yourself, express (not in a loud, in-your-face way, but in a Passive way) your individuality and leave the judgement to “them”. I seem to have many family members + colleagues/bosses having that tendency, though each of them decides when to show it and when to Control it.

Humour is a decisive part of it. Some would say “No humour, don't try it.” I'm not humorous even though I find many words I speak somewhat amusing.

99% of the time I decide to “let my hair down” and speak “freely”, it has been a (conscious or semi-conscious) gamble since I have no control over the situation [1]. I suspect a lot of times the negative reaction in the audience couldn't be completely offset by the positive. Let's face it, if there's any trace of negative reaction after you say something, it tends to last a long time, even if the positive is much stronger. This is especially true if the negative is taken personally, as with a joke about age or weight. Showbiz people like to take on those sensitive topics… because they can, but it's foolhardy to “try it at home”. You also see people communicating rather directly in movies and in publications, but it's an exaggerated/distorted version of reality. In reality that kind of speech is rare and shocking. It's like playing with fire.

In my (somewhat biased) perception, I tend to see westerners as less restrained, more individualistic, speaking-for-own-self, carefree, less careful, less rule-bound… This long list of perceptions would eventually lead to the American ideal of individual “freedom”, but that word alone would be oversimplification.

Experience — In my first 5 years of working, I was often the youngest team member. I didn't have to “image-manage” myself as a future leader. I rarely had junior staff looking up to me.

[1] major exception is the last few days on any job.

[4] I suspect most of us can get used to it, just like children getting used to self-restraint once in school.

PCP – 3 basic perspectives

#) 2 portfolio asset values – at expiration or pre-expiration. This simple view IGNORES premium paid
See http://bigblog.tanbin.com/2011/06/pcp-synthetic-positions-before.html

#) 2 traders' accounts each starting with $100k cash. This angle takes premium into account.

#) current mid-quote prices of the put/call/underlier/futures should reflect PCP, assuming good liquidity. In reality?
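The first and third perspectives can be made concrete with a small sketch. It assumes European options on a non-dividend-paying underlier, for which PCP reads call + K·exp(−r·t) = put + spot; the function names are illustrative.

```python
import math

def call_payoff(spot, strike):
    """Value of a long call at expiration."""
    return max(spot - strike, 0.0)

def put_payoff(spot, strike):
    """Value of a long put at expiration."""
    return max(strike - spot, 0.0)

def parity_gap(call, put, spot, strike, r, t):
    """(call + discounted strike) minus (put + spot); zero when PCP holds.

    Assumes European options and no dividends.
    """
    return call + strike * math.exp(-r * t) - (put + spot)

# Perspective 1: at expiration the two portfolios have equal value for any spot.
strike = 100.0
for spot in (60.0, 100.0, 140.0):
    portfolio_a = call_payoff(spot, strike) + strike  # call + cash
    portfolio_b = put_payoff(spot, strike) + spot     # put + underlier
    print(spot, portfolio_a, portfolio_b)
```

For the third perspective, feeding current mid-quotes into `parity_gap` shows how far the market is from parity; a non-zero gap may reflect bid/ask spread, dividends or early-exercise features rather than an arbitrage.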

skew bump on a smile curve

Background — in volatility smile (rather than term structure) analysis, we often use a number (say, -2.1212) to measure the skewness of a given smile curve. Skew is one of several parameters in a calibrated formula that determines the exact shape of a given smile curve. Along with anchor volatility, skew is among the most important parameters in a parameterization scheme.

We often want to bump the skew value (say by 0.0001) and see how the smile changes.

A bump in skew would make the smile curve Steeper at the ATM point on the curve. You need to look at both the put (low strike) and call (high strike) sides of the smile curve. If the bump causes put side to move even higher and call side to move even lower, then skewness is further increased. In many cases, skew value of the entire curve is (approximately?) equal to the slope (first derivative) at the ATM point.

If you mistakenly look at only one side of the smile curve, say the put side, you might notice the curve flattening out when skew is bumped. Entire left half may become less steep, while the right half was rather flat to start with. So you may feel both halves become less steep when skew is bumped. That’s misleading!

Note skew is usually negative for equities. A bump is a bump in the magnitude.
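To see both sides of the curve move, here is a toy sketch. The quadratic smile in log-moneyness k = ln(K/F) and all parameter values are hypothetical, chosen only so that skew equals the slope at the ATM point; real parameterization schemes differ.

```python
def smile_vol(k, anchor=0.20, skew=-0.5, curvature=1.0):
    """Toy smile: implied vol as a quadratic in log-moneyness k = ln(K/F).

    Hypothetical parameterization; skew is the slope at the ATM point k = 0.
    """
    return anchor + skew * k + curvature * k * k

def bump_skew(skew, bump=0.0001):
    """Bump the magnitude of skew while keeping its sign."""
    return skew - bump if skew < 0 else skew + bump

base = -0.5
bumped = bump_skew(base)
for k in (-0.2, 0.0, 0.2):  # put side, ATM, call side
    change = smile_vol(k, skew=bumped) - smile_vol(k, skew=base)
    print(k, change)
```

In this toy model the put side rises, the call side falls and the ATM vol is unchanged, so the curve gets steeper at the ATM point even though neither half alone tells the whole story.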

theoretical numbers ^ vol surface

After you work in the volatility field for a while, you may figure out when (and when not) to use the word “theoretical”. There’s probably no standard definition of it. I guess it basically means “according-to-BS”. It can also mean risk-neutral. All the greeks and many of the pricing formulas are theoretical.

The opposite of theoretical is typically “observed on the market”, or adjusted for skew or tail.

Now, the volatility smile, the volatility term structure and the vol surface are a departure from BS. These are empirical models, fitted against observed market quotes. Ignoring outliers among raw data, the fitted vol surface must agree with observed market prices — empirical.