All Heads when tossing a coin – prob^stats

A well-programmed computer simulates a biased coin. For this illustration, we can also use a physical coin. First toss is a head. You update your estimate of the true parameter (m) of the coin. Second toss is a head. You update it again. Third toss is a head… So how exactly do you update it?

Is this a statistics or a probability problem? More like statistics IMO, since it starts from real data. Still, there’s a lot of probability math in this statistics problem.

I guess you start with an estimate of m before the first toss. A safe choice is 50%. As you see more heads, you probably increase your estimate, but exactly how? Perhaps some MLE?
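One common way to make the update concrete (an assumption here, not something this post settles on) is to put a Beta prior on m. The sketch below compares it with the plain MLE. The function names and the uniform Beta(1, 1) prior, whose mean is the 50% starting estimate, are illustrative choices.

```python
from fractions import Fraction


def mle(heads, tosses):
    """Maximum-likelihood estimate of m: the observed fraction of heads."""
    return Fraction(heads, tosses)


def posterior_mean(heads, tosses, prior_heads=1, prior_tails=1):
    """Mean of the Beta posterior after observing the tosses.

    prior_heads and prior_tails are pseudo-counts (assumed Beta(1, 1) prior,
    i.e. a starting estimate of 50%).
    """
    return Fraction(prior_heads + heads, prior_heads + prior_tails + tosses)


if __name__ == "__main__":
    for n in (1, 2, 3):  # every toss so far is a head
        print(n, mle(n, n), posterior_mean(n, n))
    # 1 1 2/3
    # 2 1 3/4
    # 3 1 4/5
```

The MLE jumps to 100% after the very first head, while the posterior mean climbs gradually: 2/3, 3/4, 4/5.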

statistical independence ≠> no causal influence

When I was young, I ate twice as much rice as noodles; now I still do. So the ratio of rice to noodle intake is independent of my age. This independence doesn’t imply that my age has no influence on the ratio. It only appears to have no influence. It is possible for 2 RVs to be each controlled by a family of “driver random variables”, and yet be independent of each other!

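Note the mathematical definition of independence is based on the joint distribution: X and Y are independent when P(X and Y) = P(X)*P(Y) for every pair of values. Covariance is only a partial check, since independence implies zero covariance but not the other way round. Either way there must be a stream of paired data points. I would say the mathematical meaning of independence is fundamentally different from everyday English, so intuition often gets in the way.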
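A small check of why a covariance alone can mislead when judging independence. The variables below (X uniform on -1, 0, 1 and Y = X squared) are a standard textbook construction, not from this post: their covariance is exactly zero even though Y is fully determined by X.

```python
from fractions import Fraction

# Joint distribution of (X, Y) with X uniform on {-1, 0, 1} and Y = X**2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}


def expectation(f):
    """Expected value of f(x, y) under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())


def marginal(index, value):
    """P(X == value) for index 0, P(Y == value) for index 1."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)


covariance = (expectation(lambda x, y: x * y)
              - expectation(lambda x, y: x) * expectation(lambda x, y: y))

p_joint = joint.get((0, 0), Fraction(0))     # P(X=0 and Y=0)
p_product = marginal(0, 0) * marginal(1, 0)  # P(X=0) * P(Y=0)

print(covariance, p_joint, p_product)  # 0 1/3 1/9
```

The joint probability (1/3) differs from the product of the marginals (1/9), so X and Y are dependent despite the zero covariance.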

Special context — time series. We record 2 processes X(t) and Y(t). Both could be influenced by several (perhaps shared) factors. In this context, the layman’s independence is somewhat close to the mathematical definition.
* historical data – we could analyze the paired data points and compute a covariance. If it is near zero we might conclude, tentatively, that the two processes are independent based on the data. We can’t be sure there isn’t some common factor that could later give rise to a strong covariance.
* future – we are actually more interested in the future, but we often have to rely on historical data.


abc pairwise independence ≠> Pa*Pb*Pc

Q: If we know events A, B, C are pairwise independent, does Pa*Pb*Pc mean anything?

Q: does Pa*Pb*Pc equal P(A & B & C)?
A: no. The multivariate normal model would imply exactly that, but this is just one possibility, not guaranteed.

Jon Frye gave an excellent example to show that P(A & B & C) can have any value ranging from 0% to minimum(Pa, Pb, Pc, Pab, Pac, Pbc). Suppose Pa = Pb = Pc = 10%. Pairwise independence means Pab = Pac = Pbc = 1%, whereas Pa*Pb*Pc = 0.1%. However, P(abc) can be anywhere from 0% to 1% (1% meaning that whenever A and B both happen, C also happens).
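The 10% example can be checked by writing out the 8 joint outcomes explicitly. This is a minimal sketch. The helper names build_distribution and prob are made up for illustration, and the way the remaining probability is spread over the outcomes is one construction that happens to fit the constraints.

```python
from fractions import Fraction
from itertools import product

P_SINGLE = Fraction(1, 10)   # Pa = Pb = Pc
P_PAIR = Fraction(1, 100)    # Pab = Pac = Pbc under pairwise independence


def build_distribution(p_abc):
    """Return {(a, b, c): probability} over the 8 outcomes for a chosen P(abc)."""
    dist = {}
    for outcome in product((0, 1), repeat=3):
        hits = sum(outcome)
        if hits == 3:
            dist[outcome] = p_abc
        elif hits == 2:      # exactly two of the events happen
            dist[outcome] = P_PAIR - p_abc
        elif hits == 1:      # exactly one of the events happens
            dist[outcome] = P_SINGLE - 2 * (P_PAIR - p_abc) - p_abc
        else:
            dist[outcome] = Fraction(0)  # placeholder, fixed below
    dist[(0, 0, 0)] = 1 - sum(dist.values())
    return dist


def prob(dist, *events):
    """Probability that all listed events (0 = A, 1 = B, 2 = C) happen."""
    return sum(p for outcome, p in dist.items() if all(outcome[e] for e in events))


if __name__ == "__main__":
    for p_abc in (Fraction(0), Fraction(1, 100)):
        dist = build_distribution(p_abc)
        print(prob(dist, 0), prob(dist, 0, 1), prob(dist, 0, 1, 2))
    # 1/10 1/100 0
    # 1/10 1/100 1/100
```

Both distributions are pairwise independent with the same marginals, yet neither gives P(abc) = Pa*Pb*Pc = 1/1000.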

School Prom illustration – each student decides whether to go, regardless of any other individual student, but if everyone else goes, then your decision is swayed.

stable job4H1 guys#le2HenryWu

Hi Henry,

See if you can connect me to your H1 sponsor at your earliest convenience.

For most H1 immigrants, having a stable job is a top priority. We all worry about losing our job, losing the H1 status and Green card petition.

Therefore, many prefer a big, reputable employer. Some prefer a consulting firm that can help maintain our H1 status even when we change projects from time to time. There are definitely risks of “gaps” between 2 jobs. In my experience, gaps of 1 to 3 months are tolerable. Beyond that, there are probably other solutions. It all depends on the last employer and the lawyer. Remember I’m not an immigration attorney.

In the worst scenario, the employer cancels the H1 right away. The USCIS regulation probably allows us (“the aliens”) to stay in the US for a few weeks looking for the next job. If we can’t find any, we should ask our lawyer when we have to leave the country. We would re-enter once we find a new employer.

The exit/reentry can (in my imagination) be a real hassle for someone with a big family, esp. if kids are in school. It might be best to avoid the exit/reentry. I guess this is one reason many H1 families are fearful of layoffs and prefer a stable job even at a lower salary. (Overall, Singapore companies are less likely to lay off large numbers of staff.)

Therefore, if I were you I would prefer a stable job. As a risk taker, I will take a gamble that I could reduce the “gap” between jobs to 2 months, by being flexible on the salary.

real J4 embracing U.S.: ez2get jobs till60

ez2get jobs; abundance of jobs: at the risk of oversimplifying things, I would single out this one as the #1 fundamental justification of my bold and controversial move to the US.

There are many ways to /dice-n-slice/ this justification:
* It gives me confidence that I can support my kids for many years.
* I don’t worry so much about aging as a techie
* I don’t worry so much about outsourcing and a shrinking job pool
* when I don’t feel extremely happy on a job, I don’t need to feel trapped like I do in a Singapore job.
* I feel like an “attractive girl” not someone desperately seeking on “a men’s market”.

Look at Genn. What if I really plan to stay in one company for 10 years? I guess after 10 years I may still face a problem changing jobs in a market like Singapore.

fwd contract often has negative value, briefly

An option “paper” is a right but not an obligation, so its holder has no obligation, so this paper is always worth a non-negative value.

If the option holder forgets about it, she could get automatically exercised or receive the cash-settlement income. No one would go after her.

In contrast, an obligation requires you to fulfill your duty.

A fwd contract to buy some asset (say oil) is an obligation, so the pre-maturity value can be negative or positive. Example – a contract to “buy oil at $3333” but now the price is below $50. Who wants this obligation? This paper is a liability not an asset, so its value is negative.
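To put numbers on the oil example, here is a minimal sketch using the textbook value of a long forward on an asset with no income or storage cost, spot minus the discounted delivery price. Real oil forwards also involve storage costs and convenience yield, which are ignored here. The function names and the zero default interest rate are assumptions for illustration.

```python
import math


def long_forward_value(spot, delivery_price, rate=0.0, years_to_maturity=0.0):
    """Pre-maturity value of an obligation to buy at delivery_price.

    Simplified textbook formula (assumed: no income, storage cost or
    convenience yield on the asset; continuously compounded rate).
    """
    return spot - delivery_price * math.exp(-rate * years_to_maturity)


def call_payoff(spot, strike):
    """Exercise value of a right (not obligation) to buy at strike."""
    return max(spot - strike, 0.0)


if __name__ == "__main__":
    print(long_forward_value(50, 3333))  # -3283.0, a liability
    print(call_payoff(50, 3333))         # 0.0, the holder simply walks away
```

The forward holder must still buy at $3333, so the paper is deeply negative, whereas the option with the same strike bottoms out at zero.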

Probability of default ^ bond rating

* 1-Y Probability of default (denoted PD) is defined for a single issuer.
* Rating (like AA) is defined for one bond among many by the same issuer.

Jon Frye confirmed that an AA rating is not an expression of the firm’s status/viability/strength/health.

I guess the rating does convey something about the LossGivenDefault, another attribute of the bond, not the issuer.

network engineer^accountant@60: Raymond paradox

See my other posts on civil engineers.

I told Raymond about Junli’s lament over constant learning expected of an IT guy. It’s an imposed extra workload that eats into our spare time.
Raymond’s sister is an accountant, and Raymond is now in the infrastructure team. Raymond pointed out that network engineers don’t need to learn anything new. So the constant learning is not the key difference here.

We do see accountants in their 60’s, but why don’t we see the same among network engineers? Perhaps it’s supply/demand.

Explanation – demand. Accountants enjoy much bigger demand. Every big or small company needs accountants. Raymond felt only big companies need network engineers. If indeed one out of 50 professionals in each domain is 60+, then old accountants are easier to find because the absolute number is much higher than for network engineers.

I feel network engineering is a more specialized domain. How about brain surgeons? Specialist too. The bigger demand doesn’t translate to higher salary — look at taxi drivers. So specialization is not the explanation.

Explanation – globalization. Supply is in the accountant’s favor. It’s less common to hire accountants from developing countries. (Those who have worked locally for years would be treated like locals.) The IT skillset is more globalized and standardized, as standardized as it gets. Employers easily tap into the overseas candidate pool. That’s Raymond’s observation. However, the US also has an overseas candidate pool, and yet there are old techies. So the overseas talent pool is not the complete explanation.

The value of experience is higher in medical, accounting … and less in c++. I know c++ didn’t change a lot over 20 years, but ….? But the young programmers can accumulate the experience very quickly, often in a few years. The entry barrier is too low. Many aggressive, ambitious and determined young programmers can pick it up at home, just like I did. How about network engineers? Higher entry barrier. So entry barrier is still not the explanation.

Hongzhi pointed out Singapore salary is lower for accountants (not auditors) than network engineers. If indeed a typical old accountant earns $4k but a network engineer typically earns $8k, then this would be one valid contributing factor. The less lucrative/competitive jobs would be easier to get for an old job seeker.

Hongzhi also pointed out the Singaporean perception that every IT job skill is subject to churn and therefore favors the young. The lay public doesn’t realize that network engineering and online apps have vastly different churn rates.

Q: why don’t I see network engineers in their 60’s on the Singapore job market? Maybe there are some but they are not job hunting! I think in Singapore there are electronic equipment engineers at that age.

find the line passing the most points #solved might be similar

Q: Given N points with positive integer coordinates, find the straight line passing through the most points
A: For each of (N*N-N)/2 pairs of points, compute a Line object identified by 2 numbers:
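* a slope S = (y2 - y1)/(x2 - x1)
* intercept on y-axis.

So these 2 numbers can be computed easily from any pair, except for a vertical line (x2 = x1), which has no finite slope and needs a special key such as its x value.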

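Save each Line object as a key in a hashmap. When a pair gives a Line that’s already seen, increment its count. A line through k of the points is produced by k(k-1)/2 pairs, so the Line with the highest count is the answer.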

Intercept formula y_inter(int, int, int, int) can be assumed to exist. Writing this function isn’t relevant to a coding interview, but the derivation is short:

Suppose this value is y3, so the intercept point is (0,y3), so

(y3-y1)/(0-x1) = S, so y3 = y1 - S*x1
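The pair-by-pair hashmap idea can be sketched as below. Two details are my own choices rather than part of the answer above: slopes are kept as exact fractions so that equal lines hash to the same key, and each key stores the set of points seen (instead of a pair count) so the answer can be read off directly. Vertical lines get a separate key.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations


def line_key(p, q):
    """Hashable identity of the line through two distinct points."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return ("vertical", x1)          # no finite slope
    slope = Fraction(y2 - y1, x2 - x1)   # exact, avoids float rounding
    intercept = y1 - slope * x1          # y3 = y1 - S*x1
    return (slope, intercept)


def max_points_on_a_line(points):
    """Largest number of the given points lying on one straight line."""
    points = list(set(points))           # ignore duplicate points
    if len(points) < 2:
        return len(points)
    on_line = defaultdict(set)
    for p, q in combinations(points, 2):  # (N*N-N)/2 pairs
        on_line[line_key(p, q)].update((p, q))
    return max(len(members) for members in on_line.values())


if __name__ == "__main__":
    pts = [(1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 1)]
    print(max_points_on_a_line(pts))  # 4
```

This takes O(N*N) time and, in the worst case, O(N*N) memory for the map.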

rotate array by K slots in O(1)space O(N) time

Can write a c++ solution as an ECT practice. Easy to verify. Added to enroute.txt.

[[EPI300]] 6.13 is probably a much shorter and more readable solution!

A: find the greatest common divisor of N and K. If it’s 3, then the array consists of 3 interleaved subsequences. We can “process” the subsequences one at a time or simultaneously, since they are completely independent. The first subsequence consists of Slots 0,3,6,9… Within each subsequence, K’ and N’ (scaled down by the divisor) have no common factor, therefore an iteration will visit every element exactly once until we revisit the first element.

Q: rotate a large array (N elements) by K positions. Both N and K are potentially large, so we need 1) time efficiency 2) memory efficiency. Ideally, O(1) space and O(N) time.

If K = 1, then just shift every element right, rotating the last element to the head. If K = 2, just repeat once. But K is large.

You can assume it’s an array of pointers, or you can assume it’s an array of integers — Same challenge.

The brute-force solution requires O(K*N), shifting all N elements K times.

Without loss of generality, let’s rotate by 5 slots. If N is a multiple of 5, like 30, then there are 5 individual, independent, interleaved arrays of 6. Just rotate each by 1 slot.

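If N is not a multiple of 5, like 31, then view the array as a circular array and we hop by 5 slots each time. So our journey will start from position 0 (or any position) and cover every position and come back to the starting position. This can be proven by contradiction:

Sooner or later we must be re-visiting some position. Suppose the first position revisited is not position 0 but some P. Each hop is reversible (step back 5 slots), so the position just before P would have been revisited one hop earlier, a contradiction. So position 0 must be the first one revisited. It comes back after exactly N hops, because the total distance travelled must be a multiple of both 5 and N, and 5 and 31 share no common factor.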
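The hopping argument can be sketched as an in-place rotation. This is one possible implementation, not the [[EPI300]] solution. The function name rotate_right and the choice to rotate to the right are assumptions.

```python
from math import gcd


def rotate_right(arr, k):
    """Rotate arr right by k slots in place: O(N) time, O(1) extra space."""
    n = len(arr)
    if n == 0:
        return arr
    k %= n
    if k == 0:
        return arr
    # gcd(n, k) independent interleaved cycles; each is walked exactly once.
    for start in range(gcd(n, k)):
        carried = arr[start]
        pos = start
        while True:
            pos = (pos + k) % n                   # hop k slots on the circular array
            arr[pos], carried = carried, arr[pos]  # drop the carried item, pick up the old one
            if pos == start:                       # back at the start: cycle complete
                break
    return arr


if __name__ == "__main__":
    print(rotate_right(list(range(6)), 2))  # [4, 5, 0, 1, 2, 3]
```

Every element is written exactly once, so the total work is N moves regardless of K.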


2 overlapping rectangles


Q: You have two rectangles, a and b. Determine if the two rectangles overlap, that is, whether at least some part of either rectangle lies within the other.

I will assume the coordinates are given.

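%%A: find both centers.
– case 1: identify box A’s corner that’s closest to B’s center. Check if this corner is inside B.

– case 2: if the two centers lie on one vertical line, then only the bottom border (of upper box) vs top border (of lower box) matters.

Caveat: a corner test alone misses cross-shaped overlaps, where a tall thin box and a wide flat box intersect but neither contains a corner of the other.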
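For comparison, here is a sketch of a different and more common technique than the center/corner idea above: two axis-aligned rectangles overlap exactly when their x-intervals overlap and their y-intervals overlap. The Rect type, its field names, and the decision not to count rectangles that merely touch along an edge are assumptions for illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle given by its edges (assumed left < right, bottom < top)."""
    left: float
    bottom: float
    right: float
    top: float


def overlaps(a, b):
    """True if the two rectangles share some interior area."""
    x_overlap = a.left < b.right and b.left < a.right
    y_overlap = a.bottom < b.top and b.bottom < a.top
    return x_overlap and y_overlap


if __name__ == "__main__":
    wide = Rect(0, 4, 10, 6)
    tall = Rect(4, 0, 6, 10)
    print(overlaps(wide, tall))                             # True (cross shape)
    print(overlaps(Rect(0, 0, 1, 1), Rect(2, 2, 3, 3)))     # False
```

The cross-shaped pair is a case where neither rectangle contains a corner of the other, yet they clearly overlap.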

##thread cancellation techniques: java #pthread,c#

Cancellation is required when you decide a target thread should be told to give up halfway. Cancellation is a practical technique, too advanced for most IV.

Note in both java and c#, cancellation is cooperative. The requester (on its own thread) can’t force the target thread to stop.

C# has comprehensive support for thread cancellation (CancellationToken etc). Pthreads also offer a cancellation feature. Java uses a number of simpler constructs, described concisely in [[thinking in java]]. Doug Lea discussed cancellation in his book.

Here are the java techniques

  • interrupt
  • loop polling – the preferred method if your design permits.
  • thread pool shutdown, which calls thread1.interrupt(), thread2.interrupt() …
  • Future — myFuture.cancel(true) can call underlyingThread.interrupt()

Some blocking conditions are clearly interruptible, as indicated by the compulsory try block surrounding wait() and sleep(), which throw the checked InterruptedException. Other blocking conditions are immune to interrupt.

NIO is interruptible but the traditional I/O isn’t.

The new Lock objects support lockInterruptibly(), but the traditional synchronized() lock grab is immune to interrupt.

bookmarking in vi

— based on

Any line can be “Book Marked” for a quick cursor return. Type the letter “m” and any other letter to identify the line. This “marked” line can be referenced by the keystroke sequence ' (apostrophe) and the identifying letter.

Example: “mt” will mark a line by the identifier “t”. “'t” will return the cursor to this line at any time. I prefer mm and 'm

A block of text may be referred to by its marked lines, e.g. 't,'b as the range of an ex command (:'t,'b d deletes the block).


I have seen a few PDE/SDE combos. There’s a pattern among them.

The SDE tends to describe a process as a signal-noise description. It is actually rather precise, as precise as it gets — you can compute the exact probability of the “particle” falling into a range at a given time t.

However, the SDE won’t enable us to compute today’s price the way an ordinary equation can, such as “sin(x) + log(x) – sqrt(x) = pi”. Reason? The dW term is an obstacle. Therefore, we need to somehow get rid of the dW term. We end up with a differential equation, often a PDE. If there are sufficient boundary conditions, we could solve the equation to get a precise time-0 price.
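The best-known instance of this pattern is Black-Scholes, which I use here as an assumed illustration (the post above doesn’t name a specific combo). The symbols follow the usual textbook convention: S is the stock price, V(S,t) the derivative price, r the risk-free rate. A delta-hedged portfolio cancels the dW term, and the no-arbitrage condition then yields the PDE.

```latex
% SDE: signal + noise
dS = \mu S\,dt + \sigma S\,dW

% Ito's lemma applied to V(S,t)
dV = \Big( V_t + \mu S V_S + \tfrac{1}{2}\sigma^2 S^2 V_{SS} \Big)\,dt + \sigma S V_S\,dW

% Hedged portfolio \Pi = V - V_S S : the dW terms cancel
d\Pi = dV - V_S\,dS = \Big( V_t + \tfrac{1}{2}\sigma^2 S^2 V_{SS} \Big)\,dt

% Riskless, so it must earn r:  d\Pi = r\,\Pi\,dt = r\,(V - S V_S)\,dt
V_t + \tfrac{1}{2}\sigma^2 S^2 V_{SS} + r S V_S - r V = 0
```

With a terminal payoff as the boundary condition, solving the last line gives the time-0 price.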

openssh^putty key files

openssh key format is the one used in id_rsa /

  • Proven tip: I actually took the pair of id_rsa files and used them in any windows or linux account. It’s like a single fingerprint used everywhere. The key is not tied to any machine.
    • The authorized_keys file on ALL destination machines has the SAME line containing that public key.
  • Proven tip: To support ssh support@rtppeslo2 with ssh key, I copy a standard authorized_keys to rtppeslo2:~support/.ssh. See

transformation — Puttygen can read the same “fingerprint” to generate ppk files, in putty’s format. Then a putty session can use the ppk file to auto-login, in place of a password

Note: when the permission on my remote_host home dir was too open, sshd ignored the ssh key and I had to enter the password.