accumulation: contractor vs FTE


You said that we contractors don’t accumulate (积累) as FTE do.

I do agree that after initial 2Y of tough learning, some FTE could reap the (monetary) rewards whereas consultants are often obliged (due to contract) to leave the team. Although there are long-term contracts, they don’t always work out as promised.

My experience — I stayed in GS for 2.5Y. My later months had much lower "bandwidth" tension i.e. the later months required less learning and figure-things-out. Less stress, fewer negative feedbacks, less worry about my own competence, more confidence , more in-control (because more familiar with the local system). If my compensation becomes 150k I would say that money amounts to "reaping the reward".

As a developer stays longer, the accumulation in terms of his value-add to the team is natural and very likekly [1]. Managers like to point out that after a FTE stays in the team for 2Y her competence, her design, her solutions, her suggestions, her value-add per year grows higher every year. If her initial value-add to the company can be quantified as $100k, every year it grows by 30%. Alas, that doesn’t always translate to compensation.

That’s accumulation in personal income. How about accumulation in tech skill? Staying in one system usually means less exposure to other (newer) technologies. Some developers prefer to be shielded from newer technologies. I embrace them. I feel my technical accumulation is higher when I keep moving from company to company.

[1] There are exceptions. About 5% of the old timers are, in my view, organization dead weights. Their value-add doesn’t grow and is routinely surpassed by bright new joiner within a year. Company can’t let them go due to political, legal or ethical reasons.

You said IV questions change over time so much (ignoring the superficial changes) that the IV skills we acquire today is uesless in 5Y and we have to again learn new IV skills. Please give one typical example if you can without a lot of explaining (I understand your time constraints). I guess you mean technology churn? If I prepare for a noSQL or big data interview, then I will probably face technology churn.

On the other hand, in my experience, many interview topics remain ever-green including some hard topics — algorithms (classic algos and creative algos), classic data structures, concurrency, java OO, pass-by reference/value, SQL, unix commands, TCP/UDP sockets, garbage collection, asynchronous/synchronous concepts, pub/sub, producer/consumer, thread pool concepts… In the same vein, most coding tests are similar to 10 yeas ago when I first received them. So the study of these topics do accumulate to some extent.

prod-release a single file in a complex python^c++system

Q1: suppose you work in a big, complex system with 1000 source files, all in python, and you know a small but critical change to a single file will only affect one module, not a core module. You have tested it + ran a 90-minute automated unit test suit, but without a prolonged integration test that’s part of the department-level full release. Would you and approving managers have the confidence to release this single python file?
A: yes

Q2: change “python” to c++ (or java or c#). You already followed the routine to build your change into a dynamic library, tested it thoroughly and ran unit test suite but not full integration test. Do you feel safe to release this library?
A: no.

Assumption: the automated tests were reasonably well written. I never worked in a team with a measured test coverage. I would guess 50% is hard and often impractical. Even with high measured test coverage, the risk of bug is roughly the same. I never believe unit test coverage is a vaccination. Diminishing return. Low marginal benefit.

Why the difference between Q1 and Q2?

One reason — the source file is compiled into a library (or a jar), along with many other source files. This library is now a big component of the system, rather than one of 1000 python files. The managers will see a library change in c++ (or java) vs a single-file change in python.

Q3: what if the change is to a stored proc? You have tested it and run full unit test suit but not a full integration test. Will you release this single stored proc?
A: yes. One reason is transparency of the change. Managers can understand this is an isolated change, rather than a library change as in the c++ case.

How do managers (and anyone except yourself) actually visualize the amount of code change?

  • With python, it’s a single file so they can use “diff”.
  • With stored proc, it’s a single proc. In the source control, they can diff this single proc
  • with c++ or java, the unit of release is a library. What if in this new build, beside your change there’s some other change , included by accident? You can’t diff a binary 😦

So I feel transparency is the first reason. Transparency of the change gives everyone (not just yourself) confidence about the size/scope of this change.

Second reason is isolation. I feel a compiled language (esp. c++) is more “fragile” and the binary modules more “coupled” and inter-dependent. When you change one source file and release it in a new library build, it could lead to subtle, intermittent concurrency issues or memory leaks in another module, outside your library. Even if you as the author sees evidence that this won’t happen, other people have seen innocent one-line changes giving rise to bugs, so they have reason to worry.

  • All 1000 files (in compiled form) runs in one process for a c++ or java system.
  • A stored proc change could affect DB performance, but it’s easy to verify. A stored proc won’t introduce subtle problems in an unrelated module.
  • A top-level python script runs in its own process. A python module runs in the host process of the top-level script, but a typical top-level script will include just a few custom modules, not 1000 modules. Much better isolation at run time.

There might be python systems where the main script actually runs in a process with hundreds of custom modules (not counting the standard library modules). I have not seen it.

##minimum py know-how4(compiler)cod`test#xr

Hi XR,

My friend Ashish Singh (in cc) said “For any coding tests with a free choice of language, I would always choose python”. I agree that perl and python are equally convenient, but python is the easiest languages to learn, perhaps even easier than javascript and php in my opinion. If you don’t already have a general-purpose scripting language as a handy tool, then consider python as a duct tape or Swiss army knife

(Actually linux automation still requires shell scripting. Perl and python are both useful additions.)

You can easily install py on windows. Linux has it pre-installed. You can then write a script in any text editor and test-run, without compilation. On windows the bundled IDLE tool is optional but even easier – see

(Ashish please feel free to add to this list) For coding tests, a beginner would need to learn

  • · String common operations. Regex not needed
  • · list and dict data structures and common operations. A “Set” may be useful occasionally. Tuple not needed.
  • · “in” operator on string, list, dict
  • · if/elif/else; while loop with beak and next
  • · for-each loop is more useful in coding test, esp. iterating list, dict, string, range()
  • · range() function – frequently needed in coding test
  • · Define simple functions. Recursion is frequently coding-tested.
  • · No need to handle exceptions
  • · No need to create classes
    • I think “struct-type” classes with nothing but data fields are useful in coding tests, but not yet needed in my experience.
  • · No need to learn OO features
  • · No need to use list comprehension and generator expressions, though very useful features of python
  • · No need to use lambda, map()/reduce()/filter()/zip(), though essential for functional programming
  • · No need to use import os and sys modules or open files, which are essential for practical automation scripts

Locate msg]binary feed #multiple issues solved

Hi guys, thanks to all your help, I managed to locate the very first trading session message in the raw data file.

We hit and overcame multiple obstacles in this long “needle search in a haystack”.

  • · Big Obstacle 1: endian-ness. It turned out the raw data is little-endian. For my “needle”, the symbol integer id 15852(in decimal) or 3dec(in hex) is printed swapped as “ec3d” when I finally found it.

Solution: read the exchange spec. It should be mentioned.

  • · Big Obstacle 2: my hex viewers (like “xxd”) adds line breaks to the output, so my needle can be missed during my search. (Thanks to Vishal for pointing this out.)

Solution 1: xxd -c 999999 raw/feed/file > tmp.txt; grep $needle tmp.txt

The default xxd column size is 16 so every 16 bytes output will get a line break — unwanted! So I set a very large column size of 999999.

Solution 2: in vi editor after “%!xxd -p” if you see line breaks, then you can still search for “ec\_s*3d”. Basically you need to insert “\_s*” between adjacent bytes.

Here’s a 4-byte string I was able to find. It span across lines: 15\_s*00\_s*21\_s*00

  • · Obstacle 3: identify the data file among 20 files. Thanks to this one obstacle, I spent most of my time searching in the wrong files 😉

Solution: remove each file successively, starting from the later hours, and retest, until the needle stops showing. The last removed file must contain our needle. That file is a much smaller haystack.

o one misleading info is the “9.30 am” mentioned in the spec. Actually the message came much earlier.

o Another misleading info is the timestamp passed to my parser function. Not sure where it comes from, but it says 08:00:00.1 am, so I thought the needle must be in the 8am file, but actually, it is in the 4am file. In this feed, the only reliable timestamp I have found is the one in packet header, one level above the messages.

  • · Obstacle 4: my “needle” was too short so there are too many useless matches.

Solution: find a longer and more unique needle, such as the SourceTime field, which is a 32-bit integer. When I convert it to hex digits I get 8 hex digits. Then I flip it due to endian-ness. Then I get a more unique needle “008e0959”. I was then able to search across all 14 data files:

for f in arca*0; do

xxd -c999999 -p $f > $f.hex

grep -ioH 008e0959 $f.hex && echo found in $f


  • · Obstacle 5: I have to find and print the needle using my c++ parser. It’s easy to print out wrong hex representation using C/C++, so for most of this exercise I wasn’t sure if I was looking at correct hex dump in my c++ log.

o If you convert a long byte array to hex and print without whitespace, you could see 15002100ffffe87600,but when I added a space after each byte, it looks like 15 00 21 00 ffffe876 00, so the 3rd byte was overflowing without warning!

o If you forget padding, then you can see a lot of single “0” when you should get “00”. Again, if you don’t include white space you won’t notice.

Solution: I have worked out some simplified code that works. I have a c++ solution and c solution. You can ask me if you need it.

  • · Obstacle 6: In some cases, sequence number is not in the raw feed. In this case the sequence number is in the feed, so Nick’s suggestion was valid, but I was blocked by other obstacles.

Tip: If sequence number is in the feed, you would probably spot a pattern of incrementing hex numbers periodically in the hex viewer.

focus+engagement2dive into a tech topic#Ashish

(Blogging. No need to reply)

Learning any of the non-trivial parts of c++ (or python) requires focus and engagement. I call it the “laser”. For example, I was trying to understand all the rules about placing definitions vs declarations in header files. There are not just “3 simple rules”. There are perhaps 20 rules and they have exceptions. To please the compiler and linker you have various strategies.

OK this is not the most typical example of what I want to illustrate. Suffice to say that, faced with this complexity (or ambiguity, and “chaos”) many developers at my age simply throw up their hands. People at my age are bombarded with kids’ schooling, kids’ enrichment, baby-sitting, home repair [1], personal investment [2], home improvement… It’s hard to find a block of 3 hours to dive in/zoom in on some c++ topic.

As a result, we stop digging after learning the basics. We learn only what’s needed for the project.

Sometimes, without the “laser”, you can’t break through the stone wall. You can’t really feel you have gained some insight on that topic. You can’t connect the dots. You can’t “read a book from thin to thick, then thick to thin again”. You can’t gain traction even though you are making a real effort. Based on my experience, on most of the those tough topics the focus and engagement is a must.

I’m at my best with my “laser”. Gaining that insight is what I’m good at. I relied on my “laser” to gain insights and compete on the job market for years.

Now I have the time and bandwidth, I need to capitalize on it.

[1] old wood houses give more problems than, say, condos with a management fee

[2] some spend hours every day

2H life-changing xp# income,home location,industry…

Here’s a real story in 2010 — I was completely hopeless and stuck in despair after my Goldman Sachs internal transfer was blocked in the last stage. I considered moving my whole family back to Singapore with an offer, and start my job search there. I was seriously considering a S$100k job in a back office batch programming job. Absolutely the lowest point in my entire career. After licking the would for 2 months, I started looking for jobs outside Goldman and slowly found my foot hold. Then in early 2010, I passed a phone screening and attended a Citigroup “superday”. I spent half an hour each with 3 interviewers. By end of the day, recruiter said I was the #1 pick. I took the offer, at a 80% increment. In the next 12 months, I built up my track record + knowledge in

  1. real time trading engine components, esp. real time pricing engine
  2. fixed income math,
  3. c++ (knowledge rebuild)

I have never looked back since. Fair to say that my family won’t be where we are today, without this Citigroup experience. With this track record I was able to take on relatively high-end programming jobs in U.S. and Singapore. I was able to live in a convenient location, and buy properties and send my kids to mid-range preschools (too pricey in hind sight). Obviously I wanted this kind of job even in 2009. That dream became reality when I passed the superday interview. That interview was one of the turning points in my career.

Fast track to 2017 — I had a 20-minute phone interview with the world’s biggest asset management firm (Let’s call it PP), then I had a 2-hour skype interview. They made an offer. I discussed with the recruiter the proposal —

  • I would relocate to California
  • I would get paid around 200k pretax and possibly with an increment in 6 months. PP usually increase billing rate after 12 months if contractor does well.
  • Unlike investment banks, PP has long-term contracts not subject to trading desk profits, so this contract is (supposedly) more stable. In a few years, I would consider buying a home in California
  • recruitment agency CEO said he would transfer my visa and sponsor green card.

If I were to take this offer, my life would be transformed. (I would also have a better chance to break into the  high tech industry in nearby silicon valley, because I would have local friends in that domain.) Such a big change in my life is now possible because … I did well [1] in the interview.

Stripped to the core, that’s the reality in our world of contract programmers.  Project delivery, debugging, and relationship with boss can get you promoted, but those on-the-job efforts have much lower impact than your performance during an interview. Like an NBA playoff match. A short few hour under the spot light can change your life forever.

This is not a rare experience. There are other high-paying contract job offers that could “change your life”, and you only need to do well in the interviews to make it happen.

I feel this is typical of U.S. market and perhaps London. In Singapore. contract roles can’t pay this much. A permanent role has a few subtle implications so I feel it’s a different game.

[1] The 7 interviewers felt I was strong in c++ (not really), java and sql, and competent in fixed income math (I only worked with it for a year). Unlike other high-end interviews, there are not many tough tech questions like threading, algorithms, or coding tests. I feel they liked my interview mostly because of the combination of c++/java/fixed income math — not a common combination.

tech advantages of c# over java#le2XR

Hi XR,

Based on whatever little I know, here are some technical advantages of c# over java.

(Master these c# feature and mention them in your next java interview 🙂

  • C# has many more advantages on desktop GUI, but today let’s focus on server side.
  • [L] generics —- c# generics were designed with full knowledge of java/c++ shortcomings. Simpler than c++ (but less powerful), but more complete than java (no type erasure). For example see type constraints.
  • [L] delegates —- Rather useful. Some (but not all) of its functionalities can be emulated in java8.
  • [L] c# can access low-level windows concurrency constructs such as event wait handles. Windows JVM offers a standardized, “reduced-fat” facade. If you want optimal concurrency on windows, use VC++, or c#.
  • [L] reflection —- is more complete than java. Over the years java reflection proved to be extremely powerful. Not sure if c# has the same power, but c# surely added a few features such as Reflection.Emit.
  • concurrency —- dotnet offers many innovative concurrency features. All high level features, so probably achievable in java too.
  • tight integration with COM and MS Office. In fact, there are multiple official and unofficial frameworks to write Excel add-ins in c#
  • tight integration with high-level commercial products from Microsoft like MSSQL, sharepoint
  • tight integration with windows infrastructure like Windows Services (like network daemons), WCF, Windows networking, Windows web server, windows remoting, windows registry, PowerShell, windows software installation etc
  • c# gives programmers more access to low-level windows system API, via unmanaged code (I don’t have examples). In contrast, Java programmers typically use JNI, but I guess the java security policy restricts this access.
  • probably higher performance than JVM on windows
  • CLR offers scripting languages, F#, IronPython etc, whereas JVM supports scripting languages javascript, scala, groovy, jython etc.

[L = low-level feature]

If you want highest performance on Windows, low-level access to windows OS, but without the complexity of VC++ and MFC, then c# is the language of choice. It is high-level, convenient like java but flexible enough to let you go one level lower when you need to.

Another way to address your question — listen to the the complaints against java. (Put aside the complaints of GUI programmers.)

Even if a (rational, objective) architect doesn’t recognize any of these as important advantages, she may still favor c# over java because she is familiar and competent ONLY in the Microsoft ecosystem. She could point out countless features in Visual Studio and numerous windows development tools that are rather different from the java tool set, so different that it would take months and years to learn.

Also, there are many design trade-off and implementation techniques built on and for Dotnet. If she is reliant on and comfortable in this ecosystem, she would see the java ecosystem as alien, incomplete, inconvenient and unproductive. Remember when we first moved to U.S. — everything inconvenient.

On a more serious note, her design ideas may not be achievable using java. So java would appear to be missing important features and tools. In a nutshell, for her java is a capable and complete ecosystem theoretically, but in practice an incomplete solution.