generate Fib(N) in O(1)time and space: 2 techniques

Like all Constant-recursive sequences, the Fib sequence has a closed-form formula Fib(N) that returns an integer but in floating point form. By rounding, we can generate Fib(N) in constant time. has the formula we can easily program.

[ Phin – ( -Phi)-n ] /√5  # Notice the three minus signs

where the constant Phi := (1+√5) / 2.

Similarly, if we are given 89 as a Fib number, we can use logarithm to compute the rank of this Fib number. shows two O(1) solutions

1)num-theory 2)combinatorics: most-quizzed math

  • For regular programmers, most quizzed non-trivial subdomains of math are 1) number theory and 2) combinatorics.
  • For Quant developer interview is a different game, but these subdomains are still highly relevant.

These math sub-domains would have remained obscure if programming as (a profession) had not become this wide spread, mainstream, lucrative (油水) and economically important. Since the 2000’s, more and more IV focus is shifted to math, largely in these two sub-domains.

This trend is bad news for the GTD types like Ashish/CSY, the domain experts and the managers like Shuo etc

I think some strong STEM students don’t bother with coding skill since it is less emphasized, less quizzed in school.

I suspect that in some colleges, math is regarded as more refined, more classic than coding skill in any language such as python or javascript. Coding is considered blue-collar.

find median@2sorted arrays #Trex untested is similar except X and Y can be unequal length. My solution solves the harder, generalized problem.

This “coding” question is really math problem. Once you work out the math techniques, the coding is simple.

Designate arr1 as the shorter array. compare med(arr1) vs med(arr2)

Suppose former is lower, i can discard lower half of arr1 (s items). Can i discard highest s items in arr2? I think so because upper half of arr2 cannot have that median element, so any subset of it can be discarded

repeat until arr1 is completely discarded or left to a single element .. might be the final median. Now answer is close to the med of the remaining arr2.

–For the equal-length problem, My own idea on the spot — find the median of X and median of Y. If med(X) < med(Y) then discard the lower portion of X i.e. the “XB group”, and higher portion of Y (“YA group”). Then repeat.

  • Note len(XB) == len(YA) == min(len(X), len(Y))/2 := K. So every iteration would shrink the shorter array by half (i.e. K), and shrink the longer array by K. K would drop in value in next iteration.
  • loop exit — When the shorter of the two (say it’s X) shrinks to length 1, we are lucky — find the numbers around median(Y) and adjust the answer based on X[0].

Insight — Why can’t the final “winner”be somewhere in XB group? Because XA + YA already constitute half the population, and all of them are higher.

I always like concrete examples. So Suppose there are 512 items in the lower portion “XB group”, and the higher portion “XA” has 512 items. Suppose there are 128 items each in YB and YA groups. So in this iteration, we discard YA and the lowest 128 items in XB.

Definition of lower portion —
* all lower items up to but not including med(X) If len(X) is odd
* exactly the lower half of X if len(X) is even

nearest-rank percentile definition has a few concise pointers

  • a percentile value is always an original object, never interpolated
  • we can think of it as a mapping function percentile(int100) -> an original object, where int100 data type can be a value from 1 to 100 (zero disallowed). Two distinct inputs can map to the same output object.
  • 100th percentile is always the largest object in the list. No such thing as 0th. 1st percentile is not always the smallest object.

find 4-digit natural number X, where (X^2)%10000==X #csdoctor

Lets denote the number X^2 as Y;
Let’s denote the lowest digit of Y as Y_1 and 2nd lowest digit as Y_2 ..
Let’s denote the four digits of X as a b c d (where each a single-digit integers), so X=1000a + 100b + 10c + d. Multiplying out this expression, we get

  • the ten’s digit of Y, denoted Y_2 = [2cd + carry from dd]%10 = c
  • the hundred’s digit of Y, denoted Y_3 = [(200 db + 100 cc)%100 + carry]%10 = (2bd+cc+carry)%10 = b

What can d be?

dd  %10 = d so d can only be 0 || 1 || 5 || 6. I will now rule out 0, 1 and 5

  • if d = 0, then the Y_2 expression evaluates to 0 i.e c = 0 i.e. X is a multiple of 100, so Y is a multiple of 10000, so X = 0000, an invalid integer.
  • if d = 1, Y_2 evaluates to 2c%10 = c, i.e. c = 0. Then Y_3 evaluates to 2b%10 = b i.e. b = 0, so X is “?001” … easy to prove this is impossible
  • if d = 5, then Y_2 evaluates to [10c + 2]%10 = 2 = c. Then Y_3 evaluates to (cc+carry from 20cd)%10= 6 = b i.e. X looks like “?625” .. easy to prove this is impossible
  • So d can only be 6

Now let’s work out c:

Y_2 evaluates to (12c+3)%10 = c, so c can only be 7, i.e. X looks like “? ? 7 6”

Now let’s work out b:

Y_3 evaluates to (2 b + 7) %10 = b so b can only be 3, i.e. X looks like “?376”

I tried all 10 possible values for a, to find a must be 9.

DeMorgan’s law in boolean algebra

{\displaystyle {\begin{aligned}{\overline {A\cup B}}&={\overline {A}}\cap {\overline {B}},\\{\overline {A\cap B}}&={\overline {A}}\cup {\overline {B}},\end{aligned}}}

Note the perfect symmetry 🙂

The symmetry makes it easy to remember the pair, but the two laws are actually related. You can easily derive one from the other if you denote A’ as C … and B’ as D.

Starting from first rule

(A or B)’ = A’ and B’ # now negate both sides
A or B = (A or B)” = (A’ and B’)’ = (C and D)’ # now rewrite everything in C/D
C’ or D’ = (C and D)’ # which is the same as the 2nd rule 🙂

e^pi vs pi^e

Hi Prof Lee,

Thanks for the lunch (including the advice part). I came up with some ideas about this brain teaser —

Q: which is bigger e^pi vs pi^e

One solution I can think of is, suppose e has a value close to 2 and pi is much larger.

Suppose e = 2 and pi = 10. Clearly e^pi wins.

Another way is, define 2 functions

f1(x) = 2.718281828^x and find the growth rate when x is slightly above e. This growth rate is e^x,

f2(x) = x^2.718281828 and find the growth rate when x is slightly above e. This grow rate is e/x * x^e, which is smaller, since x is slightly bigger than e.

Therefore, f1 grows faster than f2, over the range of (e , 3.15). Therefore e^pi wins.

find the line passing the most points #solved might be similar

Q: Given N points with positive integer coordinates, find the straight line passing through the most points
A: For each of (N*N-N)/2 pairs of points, compute a Line object identified by 2 numbers:
* a slope S = (y2 – y1)/(x2 – x1)
* intercept on y-axis.

So these 2 numbers can be computed easily from any pair.

Save each Line object as key in a hashmap. When a pair gives a Line that’s already seen, increment its count.

Intercept formula y_inter(int, int, int, int) can be assumed to exist. Writing this function isn’t relevant to a coding interview:

Suppose this value is y3, so the incept point is (0,y3), so

(y3-y1)/(0-x1) = S, so y3 = y1 – S x1

2 overlapping rectangles


Q: You have a rectangle a and b. Determine if the two rectangles overlap. That is at least some part of either rectangle should be within the other.

I will assume the coordinates are given.

%%A: find both centers.
– case 1: identify box A’s corner that’s closes to B’s center. Check if this corner is inside B.

– case 2: if they lie on one vertical line, then only the bottom border (of upper box) vs top border (of lower box) matters.


I have seen a few PDE/SDE combos. There’s a pattern among them.

The SDE tends to describe a process as a signal-noise description. It is actually rather precise, as precise as it gets — you can compute the exact probability of the “particle” falling into a range at a given time t.

However, the SDE won’t enable us to compute today’s price as an equation can, such as “sin(x) + log(x) – sqrt(x) = pi”. Reason? The dW term is an obstacle. Therefore, we need to somehow get rid of the dW term. We end up with a differential equation, often a PDE. If there are sufficient boundary conditions, we could solve the equation to get a precise time-0 price

PDE solver – phrasebook

[[Numerical methods and optimization in finance]] has simple examples

of a PDE numerical solver. This book is more in-depth than [[basic


I think the Daniel Duffy book also covers Finite Difference method

Brute-force — often there's no other solution (like mathematician

tackling a proof) to a PDE. Numerical solution is kinda versatile.

Fwd: comparing exp( (a+b)/2 ) vs 0.5 exp(a) + 0.5 exp(b)

How about Jensen’s inequality?

Hi Richard (Qu Miao),

brain teaser — compare
A = exp( (a+b)/2 )    vs
B = 0.5 exp(a) + 0.5 exp(b)
Here’s my solution. See if it is correct.
Denote f = 2B/A. So f = exp(.5a  – .5b)  +  exp(.5b  – .5a) ….. symmetry
Denote x = .5a – .5b , so f(x) = exp(x) + exp(-x).
Now f(x) curve goes to infinity on both sides. So f(x) has minimum value of 2 occurring at x=0.
That means 2B/A has a minimum value of 2. In other words, B >= A

100-lightbulb problem, with my solution

Q: There are 100 switched-off light bulbs along a staircase. Visitor #1 walks upstairs, pushing the on/off switch at each bulb. Visitor #2 walks upstairs hitting bulb #2,4,6…100. Visitor #3 to #100 each walks upstairs. In the end how many light bulbs are on? Can you tell me which bulbs are on? Can you tell me how many times each bulb has turned on?

For each integer between 1 and 100, factor it into “prime” factors. For eg, 72 = 2x2x2x3x3. Each integer is then expressed in the “Beautiful” form like A^n x B^m x C^p… where ABC are increasing prime numbers and nmp are exponents. For eg, 72 = 2^3 x 3^2.

Now (n+1)(m+1)(p+1)… is the number of “toggles” received by the light bulb. For eg, Bulb#72 gets 12 toggles. Bulb 10 gets 4 toggles (due to 2^1 x 5^1). Bulb 13 gets 2 toggles (due to 13^1).

Now let me defined Group-E and Group-O —
If (n+1)(m+1)(p+1)… is even, then the light bulb is in Group-E.
If (n+1)(m+1)(p+1)… is odd, then the light bulb is in Group-O.

Now, (n+1)(m+1)(p+1)… is usually an even integer. It’s odd only if n,m,p… are all even — for eg Bulb#4, #64, #36, …– Group-O. It’s easy to list all of them.

2^2,  2^4,  2^6,  3^2 , 3^4,  5^2,  7^2
2^2×3^2,  2^2×5^2

For all the “prime” bulbs like #2, #3 … #37…, the beautiful form looks like A^1, so these prime bulbs each gets 2 toggles exactly. Therefore Group-E.

Special cases — Bulb #1 gets togged only once — Group-O.

It turned out Group-O is the group of square numbers.

derivative/ integral of exponential/log functions

I feel the #1 most useful integral (and derivative) is that of exponential function and the natural log function. Here is a

cheatsheet to be internalized.

for f (x) = e^x, f ' (x) = e^x

for f (x) = a^x, f ' (x) = ln(a) a^x

for f (x) = ln(x), f ' (x) = 1/x

for f (x) = log_a(x), then simpliy recognize f (x) = ln(x) / ln(a). The rest is really really simple.

Now the integrals

For f ' (x) = a^x, f (x) = a^x / ln(a)

For f ' (x) = ln(x), f (x) = x ln(x) – x See Classic integration-by-parts

For f ' (x) = log_a(x), then simpliy recognize f ' (x) = ln(x) / ln(a). The rest is really really simple.

y is sphere surface 4pi R^2

Imagine a giant hoop passing through south and north poles. If I build a railroad along the equator and pull the hoop one complete

round, the hoop would have swept the entire surface exactly twice. The hoop perimeter is 2?R or 2? assuming R=1.

Imagine yourself holding one point of the hoop during the sweep. You can further imagine all the people (about 7,000,000,000)

holding on the hoop shoulder to shoulder from North pole to the equator. We would cover a quarter hoop. In one round we would sweep

half the sphere area. Each person's travel path distance will be different. The longest path is the equator — 2?R. At the pole the

sweep path is 0.

Q: what's the average travel among all those people? I think answer turns out to be 4R, which is slightly short of 66.666% of the

equator length.

Q: (To keep this question simpler, we can keep the hoop still.) What's the average distance from the axis? I think it's 2R/?

domain ^ range of a given function

Option pricing engine is always implemented in matrices. In the math theory beneath, we encounter the concepts of domain/range of a given function.
If I have a function that accepts prime numbers and returns sqroot, then I guess the domain would be a bunch of positive integers and the range would be a bunch of irrationals.

Strictly, a function’s domain is a set; a function’s range is another set.

Re: essential tools of finding antiderivative

substitution of variables during integration is related to the differentiation chain rule.

In general, Integrating an unfamiliar Expression of x requires that we tinker with it until it looks like one we recognize as an antiderivative. This skill comes with experience. Therefore we need to recognize the “output” function form of chain-rule, and also product-rule.

Undoing differentiation by “guess and check” methods

2 notations of differentiation

Given a function f(x), the derivative function can be written as either f'(x) or df/dx.

The prime notation (Newton style) signifies that

1) f'(x) is a function of x
2) f' is derived from f

The quotient notation (Leibniz style) highlights

a) its origin as a quotient
b) the variable x with respect to which the derivative is taken.

I like to treat the derivative as another variable, written as f' or as j or any letter you pick. AT each point on the x-axis, there's a unique value of f (being a function of x), there's also a value of j. Therefore j depends on x and is by definition a function x.

This value of j isn't really a quotient, but rather the quotient of delta_f/delta_x pushed to the limit. (a) is therefore misleading.

(b) is extremely useful when dealing with multiple variables, the chain rule, the inverse function or substitution of variables

segments to form a polygon

What is the probability that a stick of length 1 divided randomly into 3 parts can form a triangle with the three parts?

& then, General Bonus asks u…Soldier, what’s the probability that a polygon with (N+1) sides can be made from (N+1) segments obtained by randomly dividing a stick of length l into (N+1) parts?
You gave me the probability formula . Here’s my explanation —
Suppose we mark a circle’s 12 o’clock position with “0”, 3 o’clock with 90, 6 o’clock with 180 etc. So each position on the circle has a unique number from 0 to 359.999999999.
We put N+1 random dots (all at once) on the circle’s circumference, creating N+1 segments. (If any segment exceeds half the circle’s circumference, then we would fail to form a polygon.)
Rotate the clock to align 12 o’clock position to any random dot. Break the circle at that dot (and losing that dot). Now we have N dots and N+1 segments.
Now we color the segments. Say black segment is first from the 12 o’clock position, 2nd is red segment.
P (black segment exceeding half the circle) = P (all N dots landing beyond 6 o’clock position) = 2N
P (red segment exceeding half the circle) ? Now keep all the dots and the colors. Rejoin the circle. Rotate it by the length of black segment. Break circle there. Now red segment is the new 1st segment. P (red segment exceeding half) = 2N
There are N+1 segments, so P (any segment exceeding half the circle) = (N+1)*2N

Leibniz notation of integration

Look at the standard integral notation
   ab(some expression f of running variable x)dx ……….(1)
This usually but NOT always means (f of x) multiplying dx, then integrating from a to b, even though in some contexts it looks exactly like that and you feel “no doubt it means exactly that”.
In better computer graphic, it’s written as \int_a^b \! f(x)\,dx \,where f(x) is our complicated expression of x .

Instead, this Leibniz notation is based on the algebraic summation of n items, each a product of (f * Δ),

\sum_{i=1}^{n} f(t_i) \Delta_i ; …………..(2)

What is expressed in the integral in (1) is that we integrate over infinitesimal divisions. These divisions divide up and cover the entire continuous range [a,b]. Therefore (1) denotes the sum in (2) as Δ goes infinitesimal. Indeed, most of the time we can treat the (…)dx as a product as in (2), but i just feel there are some special contexts /with invisible broken glass on the floor/. I can’t identify exactly those contexts, but here are some hints.

People often put funny things after the /innocent-looking/ “d”. like
(…expression of x) d(-x/2)
(…expression of x) dw(x)
(…expression of x)dy dx/dy

dw(x) means dw, where w is treated as a variable, even though we know that w is a function of x.

I don’t know if teachers ever do these things, but students do. They muddy the water.

How about double integral
   cd(∫ab(f of x and y)dx)dy

Whenever it’s confusing I feel we had better refer to the the original definition of Leibniz’s notation, based on (2)

exponential functions in calculus

Exponential function (and its twin sister the logarithmic) is one of the top 3 most important functions in finance. (Can’t name the others though.)

Derivative of the standard exponential function exp(x) is exp(x) itself. Consequently, 2nd derivative (and nth derivative) is again itself. Furthermore, when you know that a mysterious function’s derivative function is itself, you may wonder whether that function is related to exp(). Answer is that function must be exp(). No one else have this unique property.

All exponential functions (i.e. functions whose x is the exponent, i.e. up there) are related to the standard exponential function exp(x)=ex. This makes the standard exponential function very useful esp. in calculus.

5x is also commonly written as 5^x or 5**x

85*85 45*45 115*115

For all of them, the last 2 digits are “25”. To work out the Other digits,

1) take the input’s leading digits before “5” — 8 in this case
2) multiple it by it’s bigger brother — 8 * 9

For 115, “the other digits” are 11*12 = 132, so 115*115=13225
For 175, “the other digits” are 17*18 = 289+17 = 306