creative writing on CV #CSY

Hi Shanyou,

Sharing my observations…

Creative resume writing is an “art”. Over the years I have worked out some rules of thumb.

  • Be careful with the dates in the CV, as they can be used as evidence of cheating.
    • o I sometimes specify only the year without month. If recruiter asks for the month, I would say, it means entire year is on that project
    • o I don’t massage the dates in the last 7 years, but earlier than that, I’m more creative
    • o I’m more careful with *perm employee* project dates as the employer often has a compliance requirement to release the dates when requested
    • o Contract agencies may close down or change name. The account managers in charge of my assignment often change job. The dates they have in their system is less reliable.
      • Also, Under one agency, I could have 2 assignments at two sites, so the dates are fuzzy.
    • o Since I changed jobs too many times, I sometimes combine the earliest 3 jobs into one, when I know the employer is already gone, and it’s 12 years ago.
  • Job duty is really up to me to write, esp. with my contract jobs. Also jobs done 7 years ago are not so relevant, so the background checkers are less concerned. I often shift or copy my “job duties” section from one job to another job.
  • The technical experience or domain experience are up to me to write.
    • o I used to mention java swing in 5 out of 7 past jobs. This way, my resume looked like a java swing veteran.
    • o I used to mention connectivity in 5 out of 7 past jobs.
    • o I used to mention c# in all of my past jobs.
    • o I used to mention Forex in 4 out of 6 past jobs (To create an impression of “Forex focus” I delete all jobs that are unrelated to forex. If recruiter ask about the gap, I say it’s irrelevant or I say I was jobless). Actually, only 2 jobs had some forex element.
  • I keep 3 versions of resume. I create a temporary version when a job application requires it. I don’t spend more than 20 minutes creating each version, as the effort is unlikely to pay dividends.

This is a trial-and-error process. I sometimes become over-creative and test the market. If no one notices or questions me over a few (10?) job interviews, then it’s considered very safe creativity. If they do spot any inconsistency, then I back off and admit a typo mistake.

I now think some hiring managers are suspicious or very perceptive so they could see through my creativity but won’t say anything, so I am completely unaware.

I see the resume as advertisement. The goal is an initial interview. If I ace the interview, they basically accept the resume as is.


SiliconValley grad salary: statistical sampling case study #XR

As a statistics student, I see problems in your sampling approach.

Suppose we start with a Random sample of 2017 fresh graduates in U.S. across all universities. Then filter out those who didn’t apply to software jobs in Silicon Valley (SV). So we have a random, unbiased sample of applicants.

Q: what percentage of the sample don’t get any offer from these companies?

The more selective employers probably make an offer to 1 in 10 candidates. Bloomberg has selectivity = 1/50. Facebook is probably worse…. I will not pursue this question further.

For each graduate with multiple job offers, let’s pick only the highest base salary. Now we have a sample of “personal best”. This is a random sample from a “population”. We can now look at some statistics on this sample.

Q: what’s the percentile of a 250k base?
A: I would think it’s above the 98th percentile, i.e. one in 50 graduates gets such an offer. This data point is possibly an outlier.

The fact that this graduate gets multiple offers above 250k doesn’t mean anything. That’s why it counts as a single data point in my sampling methodology. Every outlier candidate can easily get multiple offers from, say, Oracle JVM dev team, Microsoft Windows kernel team, Google AdSense core team … Each of these teams does hire fresh graduates but is very selective and can pay million-dollar salaries.

It’s dangerous to treat an outlier data point as a “typical” data point.

I know people who either worked in SV, applied to SV companies or have friends in SV.

  • In 2016 or 2017, an experienced engineer (friend of my colleague Deepak) was hired by Apple as a software engineer — 150k base.
  • Facebook recruiter (Chandni) told me 200k base is uncommon even for experienced candidates.
  • in 2017 I applied to a 2nd-tier Internet company in SV. The CTO told me $120k base was reasonable. We spoke for half an hour. He was sincere. I have no reason to doubt him.
  • Yi Ge was working in SV. He knows this CTO. He said some candidate asked for 200k base and was considered too expensive.
  • An ex-colleague c++ guy (Henry Wu) spent a year in SV then joined Bloomberg in NY. Clearly he didn’t get 250k base in SV.
  • a Computer Science PhD friend (Junli) applied to LinkedIn and another SV firm a few years ago. He said base was definitely below 200k, more like 150k.
  • A MS colleague with 1Y experience had a friend (junior dev) receiving an Amazon offer of $100k. He said average is 120-130k. GOOG/FB can pay 135k max including bonus + stocks. He said Bloomberg is a top payer with base 120k for his level.
  • In 2007 I applied to some php lead dev job in SV. Base salary $110k. A fresh grad at that time could probably get up to 100k.
  • in 2007 Yahoo offered 90-95k base to fresh grads.
  • in 2011 some Columbia graduate said west coast (not necessarily SV) offers were higher than Wall St, at about 120k. Not sure if it’s base or base+ guaranteed first-year bonus + signon bonus

None of my examples is a fresh graduate, but …

Q: if we compare two samples(of base salaries) — fresh grad vs 5Y experienced hires, we have two bell-curves. Which bell is on the higher side?

Q: is your sample unbiased?
A: you don’t hear my “low” data points because they are not worth talking about. The data points we hear tend to be on the higher side … sampling bias. That’s why I said “start with a random sample”, not “start with voluntary self-declared data points”. Even if some young, bright graduate says “me and my fellow graduates all got 250k offers”, a statistician would probably discard such a data point as it is not part of a random sample.

Q: what’s your sample size?

My sample size is 5 to 10. To get a reasonable estimate of the true “population mean”, a common rule of thumb is a sample size of at least 30, under many assumptions. Otherwise, our estimate has an unacceptable margin of error.
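To make the sample-size point concrete, here is a minimal Python sketch. The salary "population" is entirely made up (a lognormal with hypothetical parameters, not real salary data), and the function name is my own. It only shows how much more the sample mean wobbles at n=5 than at n=30.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, trials=2000, seed=1):
    """Draw many random samples and return the standard deviation of their means."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

# Hypothetical population of "personal best" base salaries (assumption, not data).
population_rng = random.Random(0)
population = [population_rng.lognormvariate(11.7, 0.3) for _ in range(20000)]

spread_5 = spread_of_sample_means(population, 5)
spread_30 = spread_of_sample_means(population, 30)
print("spread of the sample mean, n=5 :", round(spread_5))
print("spread of the sample mean, n=30:", round(spread_30))
```

The spread shrinks roughly with the square root of the sample size, so going from 5 to 30 data points cuts the wobble by a bit more than half. None of this helps if the sample is biased to begin with.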

Imagine we are researching professional footballers’ income. We hear a salary from some lesser-known player: $500k. We assume it’s a typical figure. We may assume he could be in the 66th percentile, slightly above average. But this sample size is so small that any estimate of population-mean is statistically meaningless. The true population-mean could be 30k or 70k or 800k.

## turn on asserts: 5 lang

===python: enabled by default

-O and -OO python parameters will strip away your assertions. Tested.
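A small sketch of how to check this yourself. It launches a child interpreter twice, once with default flags and once with -O, and runs the same one-line snippet in each. The helper name run_snippet is my own, and the sketch assumes the PYTHONOPTIMIZE environment variable is not set.

```python
import subprocess
import sys

SNIPPET = "assert False, 'boom'; print('assert skipped')"

def run_snippet(extra_flags):
    """Run SNIPPET in a child interpreter started with the given flags."""
    cmd = [sys.executable, *extra_flags, "-c", SNIPPET]
    return subprocess.run(cmd, capture_output=True, text=True)

default_run = run_snippet([])        # asserts active: AssertionError expected
optimized_run = run_snippet(["-O"])  # asserts stripped: the print is reached

print("default  :", default_run.returncode, default_run.stderr.strip().splitlines()[-1:])
print("with -O  :", optimized_run.returncode, optimized_run.stdout.strip())
```

Inside a program, the built-in constant `__debug__` is True in the default mode and False under -O, which is a cheap way to tell which mode you are in.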

===C: enabled by default, but you need to #include <assert.h>

To disable, #define NDEBUG before the #include <assert.h>. Or you can pass -DNDEBUG to the GCC compiler.

===java: disabled by default

-ea enables assertions.

===Perl: disabled (unavailable) by default

use Carp::Assert; # like C include

===c#: disabled only in Release mode by default


trade bust by exchange^swap-dealer

Trade bust is rare on real exchanges, usually for some extreme scenarios.

It’s more common in an equity swap dealer system than an agency broker system. Assuming a buy, there are two transactions:

  1. client leg: contract between dealer and client, client buying IBM from dealer
  2. exchange leg: regular buy on nyse.

After a swap trade is executed i.e. after the hedge order has been executed on nyse, the dealer can bust the client leg. So for the time being there’s only the hedge position on the dealer’s book — risky. Now dealer will execute another client leg transaction at a new price.

linux command elapse time

–q(time) command

output can be a bit confusing.

Q: does it work with redirection?

–$SECONDS variable

sticky~most widely useful content on my blog

The 2 major + some minor categories

#1 tech tips, observations, experiences —
#2 finance/trading/quant knowledge
# wall st tech workplace realities

I feel #1 is more widely useful. My #2 content is often amateurish.

Introductory topics are more widely useful, but there are often good explanations online and in print, so you may need to specialize on a niche topic. Most importantly, you need to develop a communication style to reach out to beginners, using clear English.

label: original_content ^ _orig

original_content — should be original AND worthwhile

good enough to show on don’t be too strict. Most readers only look at the few posts on the first page.

I don’t want to “contrive” for the short term. My labels, once applied, should slowly build up like a snowball.
Over the long run, when we have 600 original_content posts, the defining feature would be “original”

— _orig label
If too many items pending “promotion/approval”, then perhaps move to “orig2”??

sticky~resist publishing ..

— before publishing
* check labels
* check dates
* check title

draft status means … need refresh, or not finalized…
drafts are easier to edit
drafts are easier to backdate
drafts serve as refreshers
Resist temptation to publish drafts. You can mass publish drafts but not reverse.

too many posts in draft –> may lose data? nah

sticky~chipaway prioritize – drafts, oq, unlabelled, _orig

chipaway on _orig label ie original content candidates
chipaway on drafts
chipaway to quality-control original_content
chipaway on unlabelled
chipaway on vague titles — hard to be specific And brief … so be long winded!
chipaway to delete worthless posts
chipaway append “educated-guesses” to post titles
chipaway on vague label

Bear in mind … addiction…, time well spent?

DOS replacement, again


not yet tested

–%% wishlist
edit command history
shell integration

🙂 conEmu can search throughout console output
🙂 conEmu here
🙂 conEmu double-click to select text
🙂 console2 requires research (no time!) on tweaking to fix copy/paste.
🙂 console2 here
😦 PowerCmd requires admin right every time to use

master a skill^depend@colleagues – you decide

Nikhil surprised me when he told me he frequently asks for help… Until then, I always thought I need to go through the problem solving by myself, unaided, or I will never learn.

Context — trading engine team workload is often 3 times the normal, comfortable level. Headcount is often below 33% normal headcount. (This is a result of time-to-market delivery experience — The fewer developers, the lower communication overhead, the more nimble.)

Challenge – Suppose you are suddenly asked to finish something quick and dirty in half the time previously allocated.

Higher pressure usually means you need to depend more heavily on other developers in your own team. Many things you should not *want* to understand. You need to decide what analysis/research/investigation to “outsource”, but still retain control.

A component could be your colleague’s responsibility, but familiarity improves your *velocity* and reduces your reliance on the author. With mastery, you often discover tweaks and shortcuts, saving time when there’s no time at all. You can become more knowledgeable than the author about its usage!

Look at GS colleagues Nikhil, Anil, Nicholas… They ask each other dozens of questions for everyday BAU, but each mastered how to read and experiment with JIL, sybase stored proc, CVS, unix commands, …They don’t master every subject, and they don’t lean on others in every domain.

Eg: how to programmatically insert reference data into an empty data store, rather than through a complicated (buggy) GUI. The GUI route is un-automated if you must repeatedly rebuild from scratch.
Eg: key tables. Out of 20 to 50 frequently used tables, about 2 to 5 tables should be familiar to everyone, like Comm, trade, offer. Before you go through these steep learning curves, you aren’t initiated into the club.
Eg: how to locate and parse the logs
Eg: how to test-send and test-receive messages
Eg: how to restart some of the important servers
Eg: how to identify common issues in each essential server
Eg: parse a few cornerstone config files
Eg: parse autosys JIL to see dependencies and flows

But as a Greenfield dev consultant, you aren’t expected to get too deep into prod support or BAU.

A short-term greenfield developer consultant is not a “Subject Matter Expert”.

architect’s helicopter view of a system

This (pearl) post is not only describing some role models. This is a career tip, possibly survival tip.

Bottom line: my capabilities (code reading…) are not very strong, so as architect/owner of any large codebase I would tend to struggle. There are some ways to survive, but not easy. I would think a high volume, high latency batch+web/quant system in java/SQL/python/javascript would be easier for me and harder for my peers.

Prime eg: Yang. Nikhil calls it “his system knowledge” – Without knowing the source code, Yang would ask sharp questions that baffle the actual coders. If the answer smells bad, then either the answer is not exactly truthful, or the existing system design is imperfect. Yang is quick to notice “inconsistency with reality”, a very sharp observation. This pattern is critical in troubleshooting and production support as well. I told XR that production support can be challenging and requires logical, critical thinking about the systems as a whole.

An __effective__ geek needs more than high-level system knowledge. Somehow, every one of them soon gets down to the low level, unless her role is a purely high-level role … like presales?

Not everyone has the same breadth and depth of helicopter view. Often I don’t need it in the first few years. You would think after many years, each would acquire enough low-level and high-level knowledge, but I imagine some individuals would refuse to put in the extra effort, and keep their jobs by taking care of their own “territory” and never stepping out.

— my own experience —

At Citi muni, I largely avoided the complexity. I did try to trace through the autoreo codebase but stopped when I was given other projects. In hindsight, I feel it was too slow and probably not going to work. Mark deMunk pointed out the same.

At 95 Green and Barcap, I was a contractor and didn’t need to step out of my territory. This might be one reason I had an easier job and exceeded expectations. See post on “slow decline” and, both in the pripri blog.

At OC, the Guardian codebase I did take up fully. Quest was given to me after I /fell out/ with boss, so I did enough research to implement any required feature. No extra effort to get a helicopter view.

Stirt was tough. I spent a few months trying to learn some but not all of the nuts and bolts in Quartz. This knowledge is fundamental and crucial to my job and I didn’t have too much time learning the Stirt system, which I did learn but didn’t master. In fact, I asked many good system-level questions about Sprite and Stirt-Risk, but failed to gain Yang’s insight.


python RW global var hosted in a module

Context: a module defines a top-level global var VAR1, to be modified by my script. Reading it is relatively easy:

from mod3 import *
print(VAR1)

Writing is a bit tricky. I’m still looking for best practices.

Solution 1: mod3 to expose a setter setVAR1(value)

Solution 2:
import mod3
mod3.VAR1 = 'new_value'

Note that after “from mod3 import *”, assigning to VAR1 in the script only rebinds the script’s own name; it doesn’t propagate the new value back to the module. See example below.

#!/usr/bin/python -u
# ---- the script ----
from mod3 import *

def main():
  ''' Line below is required to propagate new value back to mod3
      Also note the semicolon -- to put two statements on one line '''
  import mod3; mod3.VAR1 = 'new value'
  mod3func()  # prints VAR1 = new value

main()

# ---- mod3.py ----
VAR1 = 'initial value'
def mod3func():
  print('VAR1 = %s' % VAR1)
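A single-file sketch of the same behaviour, for readers who want to run it without creating two files. It builds a stand-in for mod3 in memory with types.ModuleType. The helper read_var1 and the in-memory construction are my own scaffolding, not part of the original set-up.

```python
import sys
import types

# Build a stand-in for mod3.py in memory (hypothetical content).
mod3 = types.ModuleType("mod3")
exec(
    "VAR1 = 'initial value'\n"
    "def read_var1():\n"
    "    return VAR1\n",
    mod3.__dict__,
)
sys.modules["mod3"] = mod3  # so that 'import mod3' finds it

# Importing the name copies the binding into this namespace...
from mod3 import VAR1
VAR1 = "rebound in script"            # ...so this only rebinds the local name
seen_after_rebind = mod3.read_var1()  # the module still sees its own VAR1

# Writing through the module object changes the module's global.
import mod3
mod3.VAR1 = "new value"
seen_after_module_write = mod3.read_var1()

print(seen_after_rebind)
print(seen_after_module_write)
```

Solution 1 (a setter inside mod3) works for the same reason: code that lives in the module assigns to the module's own global.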

[11]python mutable^immutable type variables

Page numbers refer to [[automate the boring stuff with python]]
Note this kind of knowledge is interview relevant only in java/c++. low-level knowledge is not appreciated in python interviews… as stated in other blog posts.
List, Dict and user defined classes are mutable. They provide “mutation” methods (and operators) to edit the content in-place.
As explained in, numbers, strings, and tuples are immutable, in the bitwise sense not the java sense — If a tuple’s 2nd element is a dict, then yes the dict is still mutable. Why? the reference thing..
– According to the book (P99), a python variable of immutable type contains the object, not a reference. In fact every python variable holds a reference; immutability merely makes the difference unobservable.
– A python variable of mutable type contains a reference to the object.
Two common use cases:
  1. When we reassign to an existing/loaded variable, we never overwrite the object in-place. Instead, the reference gets reseated, like a java reference variable reassignment. This applies to strings, integers, tuples and all mutable types. In c++ lingo, this is pointer reseat rather than operator=
  2. When we pass a variable into a function, we copy the reference. I think this applies to all types, including integers.
Our intuition is usually good enough, so most working developers don’t bother to understand these /nuances/.
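A short sketch of the points above: in-place mutation through a shared reference, reseating on reassignment, a mutable dict inside an immutable tuple, and passing into a function. All names are illustrative.

```python
def append_item(seq):
    seq.append(99)        # mutates the caller's list in place

def try_to_increment(n):
    n = n + 1             # reseats the local name only
    return n

# Mutable type: two names, one object.
nums = [1, 2]
alias = nums
append_item(nums)

# Immutable type: the function cannot change the caller's integer.
count = 5
returned = try_to_increment(count)

# Reassignment reseats the name; the old string object is untouched.
s = "abc"
old = s
s += "d"

# A tuple is immutable, but a dict stored inside it is still mutable.
t = (1, {"k": "v"})
t[1]["k"] = "changed"

print(alias, count, returned, old, s, t)
```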

mapReduce – rent cloud resources#IaaS

MR is popular on clouds — offered by AWS/Azure/Google.

I used to question the validity of the on-demand/rent proposition. I felt during a peak period everyone is busy and in an off-peak period everyone has low volume. Now I think Hadoop is a nice example.
It often runs for a short time but involves many compute nodes. This could be in an off-peak period. Without cloud, the Hadoop job would have to make do with far fewer compute nodes.
Further, any batch job in general could be scheduled off-peak.

cloud DB sharding – phrasebook#eg PaaS

I read quite a few books touching on sharding, but one book puts it succinctly into the big picture of noSQL, big data, web2.0 and cloud — [[cloud architecture patterns]]

PWM partition — The position database in PWM was partitioned into regions like Asia, Europe, Latam, …  This is an early example of sharding.
custom built — early sharding tends to be custom-built, like PWM. Nowadays standard sharding support is available in many databases, so no more custom-built.
noSQL and SQL databases — both support sharding
Autonomous — Shards do not depend on (or reference) each other.
Static data tables — are not sharded but replicated. 
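A toy sketch of the phrases above, assuming a region-keyed position store. The class name, method names and sample data are all hypothetical; this is not how PWM or any real database implements sharding. Each region gets its own autonomous shard, and the small static table is copied to every shard rather than split.

```python
class RegionShardedStore:
    """Toy position store: one independent shard per region."""

    def __init__(self, regions, static_data):
        # Sharded data: each region owns its positions and never reads another shard.
        self._positions = {region: {} for region in regions}
        # Static data: replicated, i.e. every shard holds a full copy.
        self._static = {region: dict(static_data) for region in regions}

    def put(self, region, account, position):
        self._positions[region][account] = position

    def get(self, region, account):
        return self._positions[region].get(account)

    def lookup_static(self, region, key):
        return self._static[region][key]

store = RegionShardedStore(["Asia", "Europe", "Latam"], {"USD": "US Dollar"})
store.put("Asia", "acct-1", 100)
store.put("Europe", "acct-1", 250)   # same account id, different shard

print(store.get("Asia", "acct-1"), store.get("Europe", "acct-1"))
print(store.lookup_static("Latam", "USD"))
```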

frameworks, facades …"simplifiers"@@

I have seen a lot of new api (or library, or framework) created to “simplify” something, to shield developers from the “tedious, error-prone details” i.e. complexity. These façades or wrappers often make the common tasks easy but could make the uncommon tasks harder. 
I feel the technical complexities often won't vanish into thin air. If you have
– demanding performance requirements (often requires some engineering and tuning)
– complex, unusual business requirements
– fast-changing requirements (which often requires higher flexibility than supported by the simplifiers)
– large distributed team 
… you have to, one way or another, deal with the details beneath the carpet or behind the facade. The raw material is more flexible.
I have heard many veterans comment that swing is very flexible or C is very flexible.
Here are some endorsed/sanctioned “simplifiers”: 
eg: garbage collector
eg: drag-n-drop programming
eg: high-level concurrency constructs
eg: dotnet wrappers over win32 constructs
eg: any tool to make a remote component feel like a local one

MSVS build (!!clean-build) unable to detect upstream changes

Let's avoid the confusing word “rebuild”. In MSVS it is the name of the command that performs a clean-build. In everyday English it means

1) An action to build again, or

2) The observation that a project built again, whatever the trigger

Therefore the word “rebuild” is vague and misleading.

Many of my 200 projects use boost 1.47. I kept the boost 1.47 files unchanged, and added 1.57 files in a brand new folder. I first changed the include path and linker path inside the shared property sheet, to use boost 1.57. Unknown to me, a few projects still reference boost 1.47 files outside the property sheets. Everything built fine.

Then I physically removed the 1.47 files. I thought all the 1.47-dependent /renegade/ projects would detect the upstream change and build themselves. We wish to avoid a lengthy clean-build at solution level, so we choose the incremental build.

Indeed as expected some projects “turned themselves in” as secret dependents of boost 1.47. But to my surprise, some projects didn't. So after build-all, these projects still use boost 1.47.

Lesson learnt – after removing the boost 1.47 files, we have to clean-build all. Cutting corners here could leave some 1.47 residuals in the artefacts.

delegate: Yamaha makes motorbikes + pianos

label: lambda/delegate
when we say “delegate” we should but don't make an explicit distinction between a delegate typename vs a delegate instance.
Also, we seldom make an explicit distinction between the multicast delegate and unicast delegate usages, but these 2 have too little overlap to be discussed as the same concept. 
Yamaha makes motorbikes + pianos

git | branch sync – learning notes

Branching is different in git vs svn/cvs. By itself git branching isn’t too complex.

The git local/remote set-up is new to me. By itself, not too complex.

However, in reality we have to manage the combination of branching and sync – a higher level of complexity addressed in many online discussions. I would prefer to rely on the git commands (rather than GUI tools or Stash) to understand and manage it.

I like the philosophy in

–comparing the local branch vs remote branch

git diff feature/my_br6 origin/feature/my_br6 -- path/to/your/file

–to see the mapping between local ^ remote branches

git branch -vv


–check out a remote branch? local branch to track remote branch?

–delete a branch locally, then remotely:

conditional probability given y==77 : always magnified

Look at the definition of cond probability. We are mostly interested in the continuous case, though the discrete case is *really* clearer than the continuous.

It’s a ratio of one integral over another. Example: Pr(poker card is below 3, given it’s not JQK) is defined as the ratio of 2 probabilities: Pr(below 3 and not JQK) over Pr(not JQK).

I feel often if not always, the numerator integral is being magnified, or scaled up, due to the denominator being smaller than 1.

In the important bivariate case, there’s a 3D pdf surface. Volume under entire surface = 1.0. If we cut vertically at y=3.3, on the cross-section view we get a curve of z vs x, where z is the vertical axis. This curve looks like a density function. We hope total area under this curve = 1.0 but highly unlikely.

To get 1.0, we need to scale the curve by something like 1/Pr(Y=3.3). This is correct in the discrete case, but in continuous case, Pr(Y=3.3) is always 0. What we use is f_Y(y=3.3) i.e. the marginal density function, evaluated at y=3.3.
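Written out in standard notation, the scaling described above looks like this. The symbols f_{X,Y}, f_Y and f_{X|Y} (joint, marginal and conditional density) are my notation, not the post's.

```latex
% Discrete case: conditioning divides by a probability that is at most 1
\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)}, \qquad \Pr(B) > 0

% Continuous bivariate case: the cross-section at y is rescaled by the marginal density
f_{X \mid Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)},
\qquad
f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx

% The rescaled cross-section integrates to 1, as a density must
\int_{-\infty}^{\infty} f_{X \mid Y}(x \mid y)\, dx
  = \frac{1}{f_Y(y)} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, dx = 1
```

One caveat on "magnified": in the discrete case the denominator is a probability, so dividing by it never shrinks the numerator. In the continuous case f_Y(y) is a density value and can exceed 1, in which case the cross-section curve is scaled down rather than up.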

pointer equality – counter-intuitive in c++

label – ptr/ref, book

Background: Pointer (identity) comparison is widely used in java, c# and python. 
[[c++common knowledge]] points out that if class DD subclasses CC and BB (classic MI), then pointer d, c, b follows:
assert(d == c); // since the pointer to DD object can be cast to pointer to CC
assert(d == b); // since the pointer to DD object can be cast to pointer to BB
However, c and b are two different addresses, because the 2 sub-objects can't possibly have the same starting point in the address space.

[15]1st deep dive@linker + comparison with compiler

mtv: I feel linker errors are common. Linker is less understood than pre-processor or compiler. This know-how is more practical than a lot of c++ topics like MI, templates, op-new … Most real veterans (not just bookworm generals) would deal with some linker errors and develop some insight. These errors can take a toll when your project is running late. My textbook knowledge isn’t enough to give me the insight needed.

I believe compiler produces object files; whereas linkers take in object or library files and produce library or executable files.

Q: can linker take in another linker’s output? seems to be more detailed, but I have yet to read it through.

This object file contains the compiled code (in binary form) of the symbols defined in the input. Symbols in object files are referred to by name.

Object files can refer to symbols that are not defined. This is the case when you use a declaration, and don’t provide a definition for it. The compiler doesn’t mind this, and will happily produce the object file as long as the source code is well-formed.

(I guess the essence of linking is symbol resolution i.e. translating symbols to addresses) It links all the object files by replacing the references to undefined symbols with the correct addresses. Each of these symbols can be defined in other object files or in libraries.

During compilation, if the compiler could not find the definition for a particular function, it would just assume that the function was defined in another file. If this isn’t the case, there’s no way the compiler would know — it doesn’t look at the contents of more than one file at a time.

So what the compiler outputs is rough machine code that is not yet fully built, but is laid out so we know the size of everything, in other words so we can start to calculate where all of the absolute addresses will be located. The compiler also outputs a symbol table of name/address pairs. Each symbol relates a name to a memory offset in the module’s machine code, the offset being the distance from the start of the module to the symbol’s location. That’s where we get to the linker. The linker first slaps all of these blocks of machine code together end to end and notes down where each one starts. Then it calculates the addresses to be fixed by adding together the relative offset within a module and the absolute position of the module in the bigger layout.
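The lay-out-then-add-offsets step can be sketched as a toy in Python. This is a deliberately simplified model with made-up names (ObjectFile, link) and made-up sizes. Real linkers also handle sections, relocations, alignment and libraries, none of which appear here.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectFile:
    name: str
    size: int                                       # bytes of machine code in this module
    defined: dict = field(default_factory=dict)     # symbol -> offset within the module
    referenced: list = field(default_factory=list)  # symbols used but not defined here

def link(object_files):
    """Place modules end to end, then turn each (module, offset) into an absolute address."""
    module_base = {}
    cursor = 0
    for obj in object_files:
        module_base[obj.name] = cursor   # note down where each module starts
        cursor += obj.size

    addresses = {}
    for obj in object_files:
        for symbol, offset in obj.defined.items():
            if symbol in addresses:
                raise ValueError("duplicate symbol: " + symbol)
            addresses[symbol] = module_base[obj.name] + offset

    # The compiler trusted that these symbols exist somewhere; the linker checks.
    missing = sorted(
        {s for obj in object_files for s in obj.referenced if s not in addresses}
    )
    if missing:
        raise LookupError("undefined symbols: " + ", ".join(missing))
    return addresses

main_o = ObjectFile("main.o", size=40, defined={"main": 0}, referenced=["helper"])
util_o = ObjectFile("util.o", size=24, defined={"helper": 8})
layout = link([main_o, util_o])
print(layout)
```

Linking main.o on its own raises the toy equivalent of an "undefined symbol" linker error, because nothing defines helper. That is the kind of error the compiler cannot catch by looking at one file at a time.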

inline getter methods

  • In C/C++, if you have a simple getter on a private field, you would typically request compiler to inline it, thus eliminating the function call run-time overhead.
  • In c#, the “property” getter is often inlined, but there’s no such requirement on the compiler.
  • Java getters can get inlined too, esp. if the field is final.

lambda – final local var — rationales, loopholes

Possibly a popular interview question…..

For Both lambda and inner classes to use a given variable:

A) local vars in enclosing method must be (as of Java 8, "effectively") final
** subversion / loophole: a final local variable in the form of "array of 1" would let the lambda modify the content
**** similarly, you can provide a wrapper object, which is a final local variable but not immutable.
B) enclosing Class instance fields don’t need finality to be usable in lambda.

The most important motivation for the (effective) finality restriction is thread safety. For (B), this restriction is impractical.

Incidentally (perhaps another post), in java 8, your lambda is compiled not as an inner class, but as a static method with a special helper class.

## innovative features of python

Here’s my answer to a friend’s question “what innovative features do you see in python”

  • decorators. Very powerful. Perhaps somewhat similar to AOP. Python probably borrowed it from Haskell?
  • dynamic method/attribute lookup. Somewhat similar to C# “dynamic” keyword. Dangerous technique similar to java reflection.
  • richer introspection than c# (which is richer than java)
  • richer metaprogramming support (including decorator and introspection) … Vague answer!
  • enhanced for-loop for a file, a string,
  • listcomp and genexpr
  • Mixin?
  • I wrote a code gen to enrich existing modules before importing them. I relied on hooks in the importation machinery.

prefer for(;;)+break: cod`IV

For me at least, the sequencing of the 3-piece for-loop is sometimes trickier than I thought. It’s supposedly simple rule(s), but I don’t get it exactly right sometimes. Can you always intuitively answer these simple questions? (Answers scattered.)

A87: ALWAYS absolutely nothing
A29: many statements. They are separated by many statements.

Q1: how many times (minimum, maximum) does the #1 piece execute?
Q2: how many times (minimum, maximum) does the #2 piece execute?
Q3: how many times (minimum, maximum) does the #3 piece execute?
Q: Does the number in A2 always exceed A3 or the reverse, or no always-rule?
Q29: what might happen between #2 and #3 statements?
Q30: what might happen between #3 and #2? I feel nothing could happen.
Q87: what might happen between #1 and #2 statements?
Q: what’s the very last statement (one of 3 pieces or a something in loop body) executed before loop exit? Is it an “always” rule?
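One way to check your answers is to spell the loop out. Python has no 3-piece for-loop, so the sketch below emulates a C-style `for (i = 0; i < n; ++i) { body }` with an explicit while-loop and records the order in which the pieces run. The function name and trace labels are mine.

```python
def trace_for_loop(n):
    """Emulate  for (i = 0; i < n; ++i) { body }  and record the order of the pieces."""
    trace = []
    i = 0
    trace.append("#1 init")
    while True:
        trace.append("#2 cond")
        if not (i < n):
            break                 # the loop exits right after a failed condition check
        trace.append("body")
        i += 1
        trace.append("#3 step")
    return trace

trace = trace_for_loop(3)
counts = {piece: trace.count(piece) for piece in ("#1 init", "#2 cond", "#3 step")}
empty_trace = trace_for_loop(0)

print(trace)
print(counts)
print(empty_trace)
```

When the body has no break, piece #1 runs exactly once, piece #2 runs one more time than piece #3, and the last thing executed is piece #2. A break inside the body exits without running #3 or #2 again. In C-family languages a continue still runs piece #3 before the next condition check.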

If there’s a q(continue), then things get less intuitive. explains the subtle difference between while-loop vs for-loop when you use “continue”.

In contrast, while-loop is explicit. So is do-while. In projects, for-loop is concise and often more expressive. In coding interviews, conditions are seldom perfect, simple and straightforward, so for-loop is error prone. White-board coding IV (perhaps bbg too) is all about nitty-gritty details. The condition hidden in the for-loop is not explicit enough! I would rather use for(;;) and check the condition inside and break.

The least error-prone is for(;;) with breaks. I guess some coding interviewers may not like it, but the more tricky the algorithm is, the more we appreciate the simplicity of this coding style.

Always safe to start your coding interview with a for(;;) loop and carefully add to the header. You can still have increments and break/continue inside.