Monte Carlo is the only way to estimate it…
I used to understand these things very well.
I feel Current Yield is a simplistic yardstick, not popular with quants or “serious” investors. It’s simply the annual coupon amount divided by the current market price.
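A tiny sketch of the calculation, with made-up numbers (the bond below is hypothetical):

```python
# Current yield = annual coupon amount / current market price.
# Hypothetical bond: $1,000 face value, 5% coupon, trading at $950.
face_value = 1_000.0
coupon_rate = 0.05
market_price = 950.0

annual_coupon = face_value * coupon_rate          # $50 per year
current_yield = annual_coupon / market_price      # 50 / 950

print(f"current yield = {current_yield:.4f}")     # about 0.0526, i.e. 5.26%
```

Note the current yield (5.26%) exceeds the coupon rate (5%) exactly because the bond trades below par.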
Rule 1: For a non-dividend-paying stock, early exercise of an American call is never optimal.
Rule 1b: Therefore, its price equals that of the corresponding European call. In other words, the early-exercise feature is worthless.
To simplify (not over-simplify) the explanation, it’s useful to assume a zero interest rate.
The key insight is that short-selling the stock is always at least as good as exercising, and sometimes strictly better. Suppose the strike is $100 but the current price is super high at $150.
* Exercise means “sell at $150 immediately after buying underlier at $100”.
* Short means “sell at $150 but delay the buying till expiry”
Why *delay* the buy? Because we hold a right not an obligation to buy.
– If the terminal price is $201 or anything above strike, we exercise at expiry and the final buy is at $100, same as the Exercise route.
– If the terminal price is $89 or anything below strike, we let the option expire and buy in the market below strike, BETTER than the Exercise route.
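The two routes above can be sketched in a few lines (strike, spot, and terminal prices taken from the text; the function names are mine):

```python
# Compare the two routes for an American call: strike 100, spot 150,
# zero interest rate as assumed in the text. Terminal prices hypothetical.
strike = 100.0
spot = 150.0

def exercise_now(terminal_price):
    # Sell at 150 immediately after buying the underlier at 100.
    # Locks in 50 regardless of where the stock ends up.
    return spot - strike

def short_and_delay(terminal_price):
    # Sell at 150 now; at expiry, buy back at min(strike, terminal price),
    # because the call gives a right, not an obligation, to buy at strike.
    return spot - min(strike, terminal_price)

for terminal in (201.0, 89.0):
    print(terminal, exercise_now(terminal), short_and_delay(terminal))
```

For a terminal price of 201 both routes pay 50; for 89 the short route pays 61 vs the exercise route’s 50, so short-and-delay dominates.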
You can also think in terms of a super-replicating portfolio, but I find it less intuitive.
So in real markets, when the stock is very high and you are tempted to exercise, don’t sit there and risk losing the opportunity:
1) Short sell if you are allowed.
2) Exercise if you can’t short sell.
When a positive interest rate is present, the argument is only slightly different: invest the short-sale proceeds in a bond, and the interest earned makes the Short route even more attractive.
(Equations were created in Outlook then sent to WordPress by HTML email. )
My starting point is https://bintanvictor.wordpress.com/2016/06/29/probability-density-clarified-intuitively/. Look at the cross section at X=7.02. This is a 2D area, so its volume (i.e. probability mass) is exactly zero, not just close to zero. Hard to work with. To get a proper probability mass, I prefer a very thin but 3D “sheet”, obtained by cutting again at X=7.02001, i.e. 7.02 + deltaX. The probability mass in this sheet divided by deltaX is a finite number. I think it’s the marginal density value at X=7.02.
The standard formula for the marginal density function is on https://www.statlect.com/glossary/marginal-probability-density-function:

f_X(x) = ∫ f(x,y) dy, integrating y over its entire range.
How is this formula reconciled with our “sheet”? I prefer to start from our sheet, since I don’t like to deal with zero probability mass. Sheet mass divided by the thickness i.e. deltaX:

( ∫ from 7.02 to 7.02+deltaX of [ ∫ f(x,y) dy ] dx ) / deltaX

Since f(x,y) is assumed not to change with x within the thin sheet, this expression simplifies to

∫ f(7.02, y) dy

Now it is the same as the standard formula. The advantage of my “sheet” way is that the numerator is always a sensible probability mass. The integral in the standard formula doesn’t look like a probability mass to me, since the cross section has zero width.
The simplest and most visual bivariate illustration of marginal density — throwing a dart at a map of Singapore drawn on an x:y grid. The joint density is a constant (you can easily work out its value: 1 over the island’s area). You can immediately tell that the marginal density at X=7.02 is proportional to the island’s width at X=7.02. The standard formula would tell us the same: the marginal density equals the island’s width at X=7.02 times the constant joint density.
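A quick Monte Carlo sanity check of the dart picture, using a made-up wedge-shaped “island” (not Singapore’s real outline; all shapes and numbers below are my own assumptions):

```python
import random

# Dart experiment: uniform joint density on a hypothetical "island",
# the region between y_lo(x) and y_hi(x) for 0 <= x <= 10.
# The marginal density at x should equal width(x) / area.
random.seed(0)

def y_lo(x):
    return 0.1 * x

def y_hi(x):
    return 5.0 - 0.2 * x

def width(x):
    return y_hi(x) - y_lo(x)     # = 5 - 0.3x

area = 35.0   # integral of (5 - 0.3x) over [0, 10] = 50 - 15

# "Sheet" estimate of the marginal density at X=7.02: probability mass
# in the thin slab [7.02, 7.02 + dx], divided by dx.
x0, dx = 7.02, 0.05
n = 1_000_000
hits_island = 0
hits_sheet = 0
for _ in range(n):
    x = random.uniform(0.0, 10.0)   # throw darts uniformly on the
    y = random.uniform(0.0, 5.0)    # bounding box of the island
    if y_lo(x) <= y <= y_hi(x):
        hits_island += 1
        if x0 <= x <= x0 + dx:
            hits_sheet += 1

sheet_mass = hits_sheet / hits_island   # P(X in sheet | dart on island)
mc_density = sheet_mass / dx
exact_density = width(x0) / area        # what the standard formula gives
print(mc_density, exact_density)
```

The two printed numbers should agree to within Monte Carlo noise, which is the “sheet divided by deltaX” argument in numerical form.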
See also my book on numerical methods
See also [[Crack]]
See also D Duffy’s c++ book
I think the PDE (in physical science or finance) doesn’t have dW terms. The hedging derivation on Crack P75 seems to cancel out the dW term – a crucial step in the derivation.
I believe the BS-equation (a famous PDE) is not a stochastic differential equation, simply because there’s no dW term in it.
An SDE is really an equation between two integrals, one on each side. At least one of them must be a stochastic (Itô) integral.
Some (not all) of the derivations of BS-E use stochastic integrals.
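In standard notation, the contrast looks like this (a textbook-style sketch, not taken from [[Crack]]):

```latex
% The stock follows a GBM *SDE* -- note the dW term:
dS_t = \mu S_t \, dt + \sigma S_t \, dW_t

% The Black-Scholes equation is a *PDE* in V(S,t) -- no dW term,
% because the delta-hedging step cancels it out:
\frac{\partial V}{\partial t}
  + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}
  + r S \frac{\partial V}{\partial S} - r V = 0
```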
In a monte carlo simulation, I feel we should never remove any outlier.
The special event we are trying to capture could be an extreme event, such as a deep OTM option getting exercised. Perhaps one in 9 billion realizations is an interesting data point.
Removing any outlier would alter the probability distribution, so our Monte Carlo estimate would no longer be unbiased.
However, if a data point is corrupted by an operational or technical error, it needs to be removed. Keeping it would also distort the probability distribution.
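Here’s a sketch of the bias: price a deep OTM call by Monte Carlo under GBM, then “clean” the sample by dropping the top 1% of terminal prices. All parameters are hypothetical:

```python
import math
import random

# Deep OTM European call priced by Monte Carlo under risk-neutral GBM.
# Hypothetical parameters: spot 100, strike 180, zero rate, 25% vol, 1y.
random.seed(42)
S0, K, r, sigma, T = 100.0, 180.0, 0.0, 0.25, 1.0
n = 200_000

terminals = [
    S0 * math.exp((r - 0.5 * sigma**2) * T
                  + sigma * math.sqrt(T) * random.gauss(0.0, 1.0))
    for _ in range(n)
]

payoffs = [max(s - K, 0.0) for s in terminals]
price_all = math.exp(-r * T) * sum(payoffs) / n

# "Clean" the sample by dropping the top 1% of terminal prices -- which
# are exactly the rare paths where this deep OTM call pays off.
cutoff = sorted(terminals)[int(0.99 * n)]
trimmed = [max(s - K, 0.0) for s in terminals if s < cutoff]
price_trimmed = math.exp(-r * T) * sum(trimmed) / len(trimmed)

print(price_all, price_trimmed)   # the trimmed estimate is visibly smaller
```

Because the option only pays off on roughly 1-in-150 paths, trimming the top 1% of “outliers” wipes out essentially all the payoff mass, badly biasing the estimate downward.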
Grid means the PDE pricer; tree means the recombining binomial tree pricer.
In the grid, the price levels are equidistant. In the tree, the log price levels are presumably equidistant, since the up nodes are log(X0)+log(u) -> log(X0)+2 log(u) -> log(X0)+3 log(u)
I believe the price levels in the tree are related to GBM.. but that’s another blog…
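A quick check of the equal-log-spacing claim, using CRR-style up/down factors (parameters hypothetical):

```python
import math

# Node levels after n steps of a recombining binomial tree, with
# CRR-style factors u = exp(sigma*sqrt(dt)), d = 1/u.
S0, sigma, dt, n = 100.0, 0.2, 0.01, 4
u = math.exp(sigma * math.sqrt(dt))
d = 1.0 / u

# After n steps, the levels are S0 * u**j * d**(n-j) for j = 0..n.
tree_levels = [S0 * u**j * d**(n - j) for j in range(n + 1)]

# The tree's LOG price levels are equidistant (every gap = 2*log(u))...
log_gaps = [math.log(b / a) for a, b in zip(tree_levels, tree_levels[1:])]
# ...but the raw price levels are not:
price_gaps = [b - a for a, b in zip(tree_levels, tree_levels[1:])]

# A PDE grid, by contrast, typically uses equidistant raw prices:
grid_levels = [80.0 + 10.0 * i for i in range(5)]

print(log_gaps)
print(price_gaps)
```

The equal log spacing is exactly the `log(X0) + j*log(u)` pattern described above, which is why the tree sits naturally on GBM’s lognormal dynamics.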
There are many ways to derive the BS-E(quation). See [[Crack]]. Roger Lee covered at least two routes.
There are many ways to derive the BS-F(ormula). See P116 [[Crack]]
There are many ways to interpret the BS-F. Roger Lee and [[Crack]] covered them extensively.
Q: BS-F is a solution to the BS-E, but is BS-F based on BS-E?
A: I would say yes, though some BS-F derivations don’t use any PDE (BS-E is PDE) at all.
BS-E is simpler than BS-F IMO. The math operations in the BS-F are non-trivial and not so intuitive.
BS-F only covers European calls and puts.
BS-E covers American and more complex options. See P74 [[Crack]]
BS-E has slightly fewer assumptions:
– The stock is assumed to follow GBM.
– No assumption about the boundary condition, so it can accommodate American or exotic options.
– constant vol?
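For reference, the standard European-call BS-F in textbook notation. The boundary condition max(S_T − K, 0) is already baked into it, which is why BS-F covers only European calls/puts while BS-E does not commit to any boundary condition:

```latex
C = S_0\,N(d_1) - K e^{-rT} N(d_2), \qquad
d_{1,2} = \frac{\ln(S_0/K) + \left(r \pm \sigma^2/2\right)T}{\sigma\sqrt{T}}
```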
I asked a relatively young quant I respect.
She said most sell side models do not have jump feature. The most advanced models tend to be stochastic vol. A simpler model is the local vol model.
I said the Poisson jump model is well-regarded – but she said it’s not that mature.
I said the Poisson jump model is needed since a stock price often exhibits jumps – but her answer gave me the impression that a model without this “indispensable” feature can be good enough in practice.
When you actually put the jump model into practice, it may not work better than a no-jump model. This is reality vs theory.