Y alpha-geeks keep working hard #speculation

Based on my speculation, hypothesis, imagination and a tiny bit of observation.

Majority of the effective, efficient, productive tech professionals don’t work long hours because they already earn enough. Some of them can retire if they want to.

Some percentage of them quit a big company or a high position, sometimes to join a startup. One of the reasons — already earned enough. See my notes on envy^ffree

Most of them value work-life balance. Half of them put this value not on lips but on legs.

Many of them still choose to work hard because they love what they do, or want to achieve more, not because no-choice. See my notes on envy^ffree

expertise: Y I trust lecture notes more than forums

Q: A lot of times we get technical information in forums, online lecture notes, research papers. why do I trust some more than others?

  1. printed books and magazines — receive the most scrutiny because once printed, mistakes are harder to correct. Due to this fundamental reason, there is usually more scrutiny.
  2. research papers — receive a lot of expert peer review and most stringent editorial scrutiny, esp. in a top journal and top conference. There’s also internal review within the authors’ organization, who typically cares about its reputation.
  3. college lecture notes — are more “serious” than forums,
    • mostly due to the consequence of misinforming students.
    • When deciding what to include in lecture notes, many professors are conservative and prudent when adding less established, less proven research findings. The professor may mention those findings verbally but more prudent about her posted lecture notes.
    • research students ask lots of curious questions and serve as a semi-professional scrutiny of the lecture notes
    • lecture notes are usually written by PhD holders
  4. Stackoverlow and Quora — have a quality control via voting by accredited members.
  5.  the average forum — Least reliable.

QQ=fake; zbs_learning=treacherous

I used to feel my QQ knowledge was not zbs but now I think many interviewers (like myself) ask zbs questions. zbs is a vague concept. QQ has a well-established range of topics. Experts sizing up each other … is mostly QQ

Biggest danger with zbs learning — when I spend my precious spare time on zbs not aimed at QQ, I inevitably regret. Without QQ, the learning lacks reinforcement, positive feedback loop… and fades from memory.
I think in SG context the dearth of interviews make zbs learning treacherous

Therefore, zbs learning must aim at xref and thick->thin.

One guy with theoretical zbs (either strong or poor QQ) may be very mediocre joining a new team. It depends on the key skill needed in the team. Is it localSys, design …? I tend to feel what’s really needed is localSys.

long-term value: QQ imt ECT speed

The alpha geeks — authors, experts, open source contributors … are they fast enough to win those coding contests?

The speed-coding contest winners … are they powerful, influential, innovative, creative, insightful? Their IQ? Not necessarily high, but they are nobody if not superfast.

The QQ knowledge is, by definition, not needed on projects, usually obscure, deep, theoretical or advanced technical knowledge. As such, QQ knowledge has some value, arguably more than the ECT speed.

Some say a self-respecting programmer need some of this QQ expertise.

##tech skill superficial exposure: 5 eg

see also exposure: semi-automatic(shallow)Accu #$valuable contexx and low-complexity topics #JGC..

  • –(defining?) features (actually limitations) of superficial exposure
  • accu — (except bash, sql) I would say very few of the items below offer any accu comparable to my core-c++ and core-java accu
  • entry barrier — (except SQL) is not created since you didn’t gain some real edge
  • commodity skill —
  • traction — spinning the wheel
  1. JGC — i think most guys have only textbook knowledge, no GTD knowledge.
  2. lockfree — most guys have only textbook knowledge
  3. [e] spring, tibrv — xp: I used it many times and read many books but no breakthrough. In-depth knowledge is never quizzed
  4. bash scripting — xp: i read books for years but only gained some traction by writing many non-trivial bash scripts
  5. SQL — xp: 5 years typical experience is less than 1Y@PWM
  6. –other examples (beware oth)
  7. [e] EJB, JPA (hibernate), JMS, Hadoop(?)
  8. memory profilers; leak detectors
  9. design patterns — really just for show.
  10. Ajax
  11. [e] Gemfire
  12. java latency tuning — is an art not a science.
    • Poor portability. JVM tuning is mostly about .. GC but at application level, real problem is usually I/O like data store + serialization. Sometimes data structure + threading also matter receives disproportionate limelight.
    • conclusion — superficial, textbook knowledge in this skillset probably won’t help with latency tuning.
  13. [e=ecosystem like add-on packages outside the core language.] Ecosystem skills are hard to enable deep learning and traction as employers don’t need deep insight. GTD is all they need.

zbs=vague,cf to QQ+GTD

Many zbs learning experiences (me or DQH) were arguably irrelevant if never quizzed.

–iv questions on malloc, tmalloc and the facebook malloc
In latency benchmarks, c++ should simply 1) avoid DMA or 2) pre-allocate. Therefore, if the iv question touches on benchmark, then the malloc question become moot.

–iv questions on lockfree in c++
messier and more low-level than java lockfree. I would say lockfree is seldom written by a c++ app developer. Therefore not a GTD topic at all. Purely QQ halo topic.

–iv question on volatile
Again, many interviewers use volatile as concurrency feature in c++ but it’s non-standard ! Even if it has a side effect under gcc, the practice is “not something I would recommend”, though I won’t criticize hiring team’s current practice.

Not a GTD skill, not zbs either.

–iv question on throwing dtor
c++ standard says UDB but still some interviewers ask about it. So this topic is not zbs not GTD but really QQ

–iv question on debugger breakpoint
Not GTD, not even zbs

–iv question (GS-HK) on CAS no slower than uncontended lock acquisition
I feel I don’t need to know how much faster. My knowledge is good enough for GTD

–iv question on how many clock cycles per instruction
no GTD but zbs for a latency enthusiast


##3 java topics I can build zbs #concurrency

C++ has many zbs topics with accu + job market relevance, but I won’t list them here. In contrast, java has very few qualifying topics —

  • [b] concurrency engineering — low-level techniques + high-level designs.
  • latency improvement to c++level — Accumulated experience can become outdated but relatively long shelf-life
  • [b] GC and JIT — Accumulated experience soon becomes outdated
  • –non-ideal choices
  • large java app tuning? can’t read. High churn. Accumulated experience can become outdated
  • [b] reflection? Powerful but seldom quizzed. Low mkt value
  • [b] java5 generics topics as in my book? low industry adoption. Seldom quizzed.
  • jvm source code? never required in IV
  • [b=luckily there are books. I’m good at reading those books.]

##any eg@portableGTD skills: valuable but never quizzed

Background — survival skills not tested in IV are usually [1] localSys and [2] portable GTD

  • — dev-tool knowledge
  • my posts on c++ build errors
  • Many posts on gdb
  • most posts on eclipse and MSVS
  • posts on git/cvs
  • posts on bash
  • posts on vi, less, find
  • — beyond tool knowledge
  • correlate logs, config, source code, data store…
  • script automation and shell customization skills
  • techniques and familiarity with jira

##what UPSTREAM tech domains: defined by eg

It pays to specialize in a domain that helps you break into similar domains. (Unsuccessful with my quant adventure so far.)

Note “wrappers” API are never upstream. Very common in big banks including Qz.

  • #1 eg: [L] socket —- C++ is the alpha wolf, leader of the wolf pack
  • [D] market data —- Nyse/Nasdaq (binary) feed parer seems to be the leader, over the slower FIX feeds
  • [D] execution algo —- sell side equities desk seems to be the leader in terms of volume and variety of algos
  • data analytics language —- python looks like the leader
  • eg: [D] alpha —- equities trading is the leader
  • eg: threading —- java is the leader of the pack. In c++ threading is too hard, so once I master some important details in java threading, it helps me build zbs in c/c#, but only to some extent
  • eg: regex/string —- Perl is the leader
  • eg: [D] consolidated firm-wide risk —- SecDB is the leader. Applicable to buy-side? I think so.
  • lambda —- C# is the leader. Scripting languages follow a different pattern!
  • list/stream —- Linq is the leader
  • drv pricing theory —- Black-Scholes —- feels like a leader or upstream, but not /convincing/
  • [L] heap/stack —- C++ is the leader. Java and c# are cleaner, simpler and hides the complexities.
  • defensive coding —- (and crash management) C++ is the leader, because c++ hits more errors, like Japan on earthquake management.
  • eg: collections —- C++ is the leader of the pack. There’s a lot of low level details you gain in STL that help you understand the java/c#/python collections
  • eg: [L] latency —- C++ is the leader of the pack in low-level optimizations .. In-line, cache-efficiency, footprint..
  • eg: [L] pbref^val —- C++ is the leader. C# has limited complexity and java has no complxity
  • [L = lowLevel]
  • [D= well-defined application domain]
  • ? automated pricing —- no upstream
  • ? Fixed income trading systems —- no upstream. IRS^Treasury^credit bonds (trading platforms) have limited features in common.
  • [D] VaR/risk analytics —- FI derivatives in big banks seem to be the leader, but I feel one trading desk in one firm may not respect another system

2 pitfalls on my accu path #portable,MktDepth..

Say you stay in one team for 5 years, hoping to accumulate expertise and insight

  • Trap 1: local system knowledge, 100% nonportable
    • eg: Qz
    • Tibrv wrappers in 95G is not so bad
  • Trap 1b: limited standardization across companies.
    • eg: cryptocurrency
    • eg: OMS framework? M b2bTradigEngine ^ SIG ^ ETSFlow
    • eg: software mkt data dissemination cf raw mkt data parsing
  • Trap 2 : poor market depth
    • eg: vol fitter

## 2 portable skills gained{each job

These are portable, longevity skills either for IV or GTD.

+1 +2 +3 .. = subjective valuation as of 2018
[grey color = portable financial domain skill, not tech skill ]

  • NIE +1 — 1) hacking open source php
  • GS +3 higher value than all other jobs —
    • 1) java GTD 2) SQL 3) database tuning 4) Database batch processing architecture
  • Citi +1~2 — 1) bond math 2) market-maker quote pricing system “architecture”
  • 95G +2 high value over mere 5 months —
    • 1) java wait/notify 2) store-proc for trade booking 3) FIX connectivity 4) Tibrv (basics only) and MOM system architecture
  • Barc +1 — 1) option math 2) analytics library integration
  • OC +1 — 1) c# GTD 2) basic MSVS 3) quote distribution architecture
  • Stirt +0 — 1) curve building basics. This job paid reasonably well, so I won’t prefer it for a low-paying top “contexx”
  • Mac +1 — 1) serious use of standard (portable) python 2) devops including git
  • RTS +2 — 1) c++ instrumentation 2) raw mkt data feed 3) order book replication 4) bash/perl for personal automation 5) socket (basics only)
  • mvea +1 — 1) lowLatency, high volume eq trading architecture 2) c++ (not C) GTD in one of the biggest c++ trading platforms including multi-file build, gdb, crash investigation 3) advanced language features including TMP and shared_ptr

##[18]orgro lens:which past accu proved long-term # !!quant

(There’s a recoll on this accumulation lens concept…. )

This post is Not focused on IV or GTD. More like zbs.

Holy grail is orgro, thin->thick->thin…, but most of my endeavors fell short. I have no choice but keep shifting focus. A focus on apache+mysql+php+javascript would have left me with rather few options.

  • —-hall of famers
  • 1) [T] data structure theory + implementation in java, STL, c# for IV — unneeded in projects
  • 2) [CRT] core java knowledge including java OO has seen rather low churn,
    • comparable to c++
    • much better than j2EE and c#
  • 3) [T] threading? Yes insight and essential techniques. Only for interviews. C# is adding to the churn.
  • 4) [j] java/c++/c# instrumentation using various tools. Essential for real projects and indirectly helps interviews
  • [C] core C++ knowledge
  • [C] GTD knowledge in perl/python/sh scripting
  • [j] google-style algo quiz — Only for high-end tech interviews. Unneeded in any project
  • [R] SQL? yes but not a tier one skill like c++ or c#
  • coding IV — improved a lot at RTS
  • ————————also-ran :
  • devops
  • [C] personal productivity scripts
  • [T] probability IV
  • [C] regex – needed for many coding interviews and real projects
  • [C] low level C skills@RTS {static; array; cStr; reinterpret_cast;  enum; typedef; namespace; memcpy}
  • [!T] bond math? Not really my chosen direction, so no serious investment
  • [!T] option math?
  • SQL tuning? not much demand in the trading interviews, but better in other interviews
  • [R] Unix — power-user GTD skills.. instrumentation, automation? frequently used but only occasionally quizzed
  • [R] Excel + VBA? Not my chosen direction
  • [jR !D] JGC +jvm tuning

C= churn rate is comfortable
D = has depth, can accumulate
R= robust demand
T= thin->thick->thin achieved
j|J = relevant|important to job hunting

%%c++keep crash` I keep grow`as hacker #zbs#AshS

Note these are fairly portable zbs, more than local GTD know-how !

My current c++ project has high data volume, some business logic, some socket programming challenges, … and frequent crashes.

The truly enriching part are the crashes. Three months ago I was afraid of c++, largely because I was afraid of any crash.

Going back to 2015, I was also afraid of c++ build errors in VisualStudio and Makefiles, esp. those related to linkers and One-Definition-Rule, but I overcame most of that fear in 2015-2016. In contrast, crashes are harder to fix because 70% of the crashes come with no usable clue. If there’s a core file I may not be able to locate it. If I locate it, it may not have symbols. If it has symbols the crash site is usually in some classes unrelated to any classes that I wrote. I have since learned many lessons how to handle these crashes:

  • I have a mental list like “10 common crash patterns” in my log
  • I have learned to focus on the 20% of my codebase that are most convoluted, most important, most tricky and contribute most to debugging difficulties. I then invest my time strategically to rewrite (parts of) that 20% and dramatically simplify them. I managed to get familiar and confident with that 20%.
    • If the code belongs to someone else including 3rd party, I try to rewrite it locally for my dev
  • I have learned to pick the most useful things to log, so they show a *pattern*. The crashes usually deviate from the patterns and are now easier to spot.
  • I have developed my binary data dumper to show me the raw market data received, which often “cause” crashes.
  • I have learned to use more assertions and a hell lot of other validations to confirm my program is not in some unexpected *state*. I might even overdo this and /leave no stoned unturned/.
  • I figured out memset(), memcpy(), raw arrays are the most crash-prone constructs so I try to avoid them or at least build assertions around them.
  • I also figured signed integers can become negative and don’t make sense in my case so I now use unsigned int exclusively. In hind sight not sure if this is best practice, but it removed some surprises and confusions.
  • I also gained quite a bit of debugger (gdb) hands-on experience

Most of these lessons I picked up in debugging program crashes, so these crashes are the most enriching experience. I believe other c++ programs (including my previous jobs) don’t crash so often. I used to (and still do) curse the fragile framework I’m using, but now I also recognize these crashes are accelerating my growth as a c++ developer.

focus+engagement2dive into a tech topic#AshS

(Blogging. No need to reply)

Learning any of the non-trivial parts of c++ (or python) requires focus and engagement. I call it the “laser”. For example, I was trying to understand all the rules about placing definitions vs declarations in header files. There are not just “3 simple rules”. There are perhaps 20 rules, with exceptions. To please the compiler and linker you have various strategies.

OK this is not the most typical example of what I want to illustrate. Suffice to say that, faced with this complexity (or ambiguity, or “chaos”) many developers at my age simply throw up their hands. People at my age are bombarded with kids’ schooling, kids’ enrichment, baby-sitting, home repair [1], personal investment [2], home improvement… It’s hard to find a block of 3 hours to dive in/zoom in on some c++ topic.

As a result, we stop digging after learning the basics. We learn only what’s needed for the project.

Sometimes, without the “laser”, you can’t break through the stone wall. You can’t really feel you have gained any insight on that topic. You can’t connect the dots. You can’t “read a book from thin to thick, then thick to thin again”. You can’t gain traction even though you are making a real effort. Based on my experience, on most of the those tough topics the focus and engagement is a must.

I’m at my best when I have my “laser” on. Gaining that insight is what I’m good at. I relied on my “laser” to gain insights and compete on the job market for years.

Now I have the time and bandwidth, I need to capitalize on it.

[1] old wood houses give more problems than, say, condos with a management fee

[2] some spend hours every day

## meaningful endeavor(Now)for20Y career#dnlg;zbs;IV

Q: what efforts go towards 20Y-career building?
A: instrumentation in c++(java is a bit more churn); socket programming; STL internals; JVM internals; shell scripting; pthreads; FIX; linux internal?

This question is slightly vague, but it’s subtly different from

Q2: what efforts create social value over 20Y?
Q3: what efforts are cumulative over 20Y?

I feel the key question in this write-up is about building a deep and broad “foundation”, upon which we can start to plan (possible) social value, retirement planning, parenting/family building etc.

  • 😦 teaching and research —- Somehow I tend to feel research creates long-term value but I suspect most research efforts are not valuable!
  • 😦 open-source software —- I tend to feel OSS projects create social value, but most of them are not successful and have no lasting influence
  • 😦 Deepening your domain knowledge such as (German’s) security lending —- It does increase your value-add to some employers and help you move up to management, but I don’t call it “foundation”.
    • MSFM and other formal training —- I feel this is less effective, more ivory-tower, but it does build a firm theoretical foundation.
  • fitness and health .. more important than I feel (as shown by my action)
  • zbs such as instrumentation skills? Yes zbs helps your KPI, and also indirectly helps you with interviews. It’s portable skill, unlike local system knowledge. Can the value last 20Y? Depends on the technology churn. c++ and java are safer.
  • IV muscle building? My favorite career building effort. 20Y? 10Y yes.
  • English/Chinese writing and vocab

skill: deepen^diversify^stack up

Since 2010, I have carefully evaluated and executed 3 broad strategies:

  1. deepen – for zbs + IV
  2. diversify or branch-out. Breaking into new markets
  3. stack-up – cautiously
  • eg: Deepened my java/SQL/c++/py knowledge for IV and GTD. See post on QQ vs ZZ.
  • eg: diversified to c++, c#, quant, swing…
  • eg: diversify? west coast.
  • eg: diversify? data science
  • eg: diversify? research + teach
  • eg: stack-up to learn spring, hibernate, noSQL, GWT.

Stack-up — These skills are unlikely to unlock new markets. Lower leverage.

GTD stress/survival on the job? None of these help directly, but based on my observation GTD skill seldom advance my career as a contractor. It could create a bit of spare time, but it’s a challenge to make use of the spare time.

I feel German’s strategy goes beyond tech skills. He treats tech skills as a tool, used to implement his business ideas to create business value. I feel his target role is a mix of architect/BA but more like “product visionary”. It’s somewhat like stack-up.

in-depth^cursory study@advanced program`topics

I have seen both, and tried both.

AA) A minority of developers spend the time studying and understanding the important details of threading, generics, templates … before writing code.

BB) Majority of developers don’t have the time, vision or patience, and only learn the minimum to get the (sloppy) job done. They copy-paste sample code and rely on testing, but the test cases are limited to realistic UAT test cases and will not cover edge cases that have yet to occur. For concurrency designs, such tests are not good enough.

AA is often seen as unnecessary except:
* Interviewers often go in-depth
* In a product vendor rather than an in-house application team
* If requirement is unusual (eg: B2BTradingEngine), then BB won’t even pass basic developer self-tests. Sample code may not be available — see my post http://wp.me/p74oew-1oI. So developers must bite the bullet and take AA approach. This is my sweet spot. See
https://bintanvictor.wordpress.com/2016/10/31/strength-gtd-with-non-trivial-researchinnovation/ https://bintanvictor.wordpress.com/2016/03/30/theoretical-complexity-strength/

On Wall St projects, I learned to abandon AA and adopt BB.

On Wall St interviews, I still adopt AA.

On West Coast, I think AA is more needed.

each year grow towards a principal developer

A lead developer of any system (not just a component) needs a few years experience [1] building a similar system. Each year, I hope to accumulate relevant know-how [2]. I’m not a people manager or project manager but I can be the principal developer.

Pause ….

However, I’m not accumulating “everything relevant” because … A lot of the know-how is never quizzed in tech interviews. Here are a few examples
* build/release/test/cont-int
* version control
* peer review/QA/release process
* wiki, jira
* back-up
* system admin scripting
* job schedulers

As a wall st contractor, I used to work in kinda greenhouse environment with all of the infrastructure taken care of, so I just focused on “get-code-done” and get productive with it. It’s an easier life.

Non-lead interviews often focus on implementation details. Benny Zhao once /identified/ just 4 or 5 clusters like 1) algo 2) language details 3) OO design. This skill set is insufficient for a principal developer.

[1] but look at Saurabh, Sundip, Kirupa, Yang … they came in with only something slightly similar
[2] not useful to focus on coding year in year out.

##coding guru tricks (tools) learnt across WallSt teams

(Blogging. No need to reply.)

Each time I join a dev team, I tend to meet some “gurus” who show me a trick. If I am in a team for 6 months without learning something cool, that would be a low-calibre team. After Goldman Sachs, i don’t remember a sybase developer who showed me a cool sybase SQL trick (or any generic SQL trick). That’s because my GS colleagues were too strong in SQL.

After I learn something important about an IDE, in the next team again I become a newbie to the IDE since this team uses other (supposedly “common”) features.

eg: remote debugging
eg: hot swap
eg: generate proxy from a web service
eg: attach debugger to running process
eg: visual studio property sheets
eg: MSBuild

I feel this happens to a lesser extent with a programming language. My last team uses some c++ features and next c++ team uses a new set of features? Yes but not so bad.

Confucius said “Among any 3 people walking by, one of them could be teacher for me“. That’s what I mean by guru.

Eg: a Barcap colleague showed me how to make a simple fixed-size cache with FIFO eviction-policy, based on a java LinkedHashMap.
Eg: a guy showed me a basic C# closure in action. Very cool.
Eg: a Taiwanese colleague showed me how to make a simple home-grown thread pool.
Eg: in Citi, i was lucky enough to have a lot of spring veterans in my project. They seem to know 5 times more spring than I do.
Eg: a sister team in GS had a big, powerful and feature-rich OO design. I didn’t know the details but one thing I learnt was — the entire OO thing has a single base class
Eg: GS guys taught me remote debugging and hot replacement of a single class
Eg: a guy showed me how to configure windows debugger to kick-in whenever any OS process dies an abnormal death.
Eg: GS/Citi guys showed me how to use spring to connect jconsole to the JVM management interface and change object state through this backdoor.
Eg: a lot of tricks to investigate something that’s supposed to work
Eg: a c# guy showed me how to consolidate a service host project and a console host project into a single project.
Eg: a c# guy showed me new() in generic type parameter constraints

These tricks can sometimes fundamentally change a design (of a class, a module or sub-module)

Length of experience doesn’t always bring a bag of tricks. It’s ironic that some team could be using, say, java for 10 years without knowing hot code replacement, so these guys had to restart a java daemon after a tiny code change.

Q: do you know anyone who knows how to safely use Thread.java stop(), resume(), suspend()?
Q: do you know anyone who knows how to read query plan and predict physical/logical io statistic spit out by a database optimizer?

So how do people discover these tricks? Either learn from another guru or by reading. Then try it out, iron out everything and make the damn code work.

2kinds@essential developer tools #WallSt+elsewhere

Note: There are also “common investigative” tools, but for now i will ignore them. Reasons is that most developers have only superficial knowledge of them, so the myriad of advanced features actually fall into #2).

Note: in the get-things-done stage, performance tools are much less “useful” than logic-revealing tools. In my experience, these tools seldom shed light on perf problems in DB, java, or MOM. Performance symptoms are often resolved by understanding logical flow. Simplest and best tools simply reveal intermediate data.
I think most interview questions on dev tools largely fall into these two categories. Their value is more “real” than other tools.

1) common tools, either indispensable or for productivity
* turn on/off exception breakpoint in VS
* add references in VS
* resolve missing assembly/references in VS
* app.config in VS projects
* setting classpath in IDE, makefile, autosys…
* eclipse plugin to locate classes in jars anywhere in windows
* literally dozens of IDE pains
* cvs,
* maven, ant,
* MS excel
* junction in windows
* vi
* bash
* grep, find, tail
* browser view-source

2) Specialized investigative/analytical/instrumentation tools — offers RARE insights. RARELY needed but RARE value. Most (if not all) of the stronger developers I met have these secret weapons. However, most of us don’t have bandwidth or motivation to learn all of these (obscure) features, because we don’t know which ones are worth learning.
* JMS — browser, weblogic JMS console…
* compiler flags
* tcpdump, snoop, lsof
* truss
* core dump analysis
* sar, vmstat, perfmeter,
* query plan
* sys tables, sp_lock, sp_depend
* set statistics io
* debuggers (IDE)
* IDE call hierarchy
* IDE search
* thread dump
* jvmstart tools — visualgc, jps,
* bytecode inspector, decompiler
* profilers
* leak detector
* jconsole
* jvmstat tools — visualgc, jps,..
* http://wiki.caucho.com/Category:Troubleshooting
* any tools for code tracing Other productivity skills that no software tools can help:
* log analysis
* when I first came to autoreo, i quickly characterized a few key tables.

##instrumentation[def]skill sets u apart

Have you seen an old doctor in her 60’s, or 70’s? What (practical) knowledge make her ?

[T = Some essential disagnostic Tools are still useful after 100 years, even though new tools get invented. These tools must provide reliable observable data. I call it instrumentation knowledge]
[D = There’s constant high Demand for the same expertise for the past 300 years]
[E = There’s a rich body of literature in this long-Established field, not an Emerging field]

As a software engineer, I feel there are a few fields comparable to the practice of medicine —
* [TDE] DB tuning
* [DE] network tuning
* [T] Unix tuning
* [TDE] latency tuning
* [] GC tuning
* [DE] web tuning

Now the specific instrumentation skills

  • dealing with intermittent issues — reproduce
  • snooping like truss
  • binary data inspection
  • modify/create special input data to create a specific scenario. Sometimes you need a lot of preliminary data
  • prints and interactive debugger
  • source code reading, including table constraints
  • log reading and parsing — correlated with source code
  • data dumper in java, perl, py. Cpp is harder
  • pretty-print a big data structure
  • — above are the generic, widely useful items —
  • In the face of complexity, boil down the data and code components to a bare minimum to help you focus. Sometimes this is not optional.
  • black box testing — Instrumentation usually require source code, but some hackers can find out a lot about a system just by black-box probing.
  • instrumentation often requires manual steps — too many and too slow. In some cases you must automate them to make progress.
  • memory profiler + leak detector

self-rating in java GTD+theory/zbs #halos#YH

Hi YH,

Self-rating is subjective, against one’s personal yardsticks. My own yard stick doesn’t cover those numerous add-on packages (swing, spring, hibernate, JDBC, xml, web services, jms, gemfire, junit, jmock, cglib, design patterns …) but does include essential core jdk packages such as —

  • – anything to do with java concurrency,
    • reading thread dump
  • – anything to do with java data structures,
  • – [2017] Garbage collection
    • memory profiling (jvm tools)
    • GC log analysis
  • – [2017] java.lang.instrument, mostly for memory
  • – networking/socket, file/stream I/O
  • [!2017] generics, esp. type erasure
  • – RMI,
  • – reflection, dynamic proxy, AOP
  • – serialization,
  • – class loaders
  • – JNI, esp. related to memory
  • – JMX, jconsole,
  • – difference between different JDK vendors,
  • – real time java
  • – reading bytecode

(The highlighted areas are some of my obvious weaknesses.) Now I feel my self-rating should be 8/10, but i still feel no one in my circle knows really more about the areas outlined above. I guess that’s because we don’t need to. Many of these low-level implementation details are needed only for extreme latency, where c++ has a traditional advantage.

Another purpose of this list — answer to this

Q: what kind of java (theoretical) knowledge would give you the halos in a Wall St interview?

Yang KPI:learn’huge codebase-goto person, connect user-expectations+codebase

This is an annotated conversation with a team mate.

> “Y (our team lead) has good system knowledge” like how things *should* work.

If Y asks a developer a specific question about how something actually works in the current implementation, he often points out “discrepancies” thanks to his system knowledge. In that case, either the developer is wrong or there’s a bug — Y’s system knowledge is quite sound. I (Bin) feel Y is sharp, quick and can filter and absorb the gist into his big picture.

I feel current implementation is probably fairly close to the “expecation” because it has been used (therefore verified) for years.

I feel Y approaches a complex system from top down, focusing on the how-things-should-work, or so-called “system knowledge” which is non-trivial. That’s one reason Y is more interested in the SQL queries. SQL is useful because DB serves as “checkpoint”.

Part of his learning is weekly user meeting on (high-level or detailed) requirements, bugs … — boils down to user EXPECTATIONS. Team members don’t get that insight. I feel that is more than biz knowledge — such insight connects user expectation with implementation. Much of the insight is summarized in jira and you can tell how much insight there is.

Another part of his learning is knowledge transfer session from his predecessor.

I feel Y asks better questions than I did. “system knowledge” questions. Roland told me he has difficulty getting answers but Y is *persistent* and asked lots of sharp questions until he got all his high/low level doubts cleared. My focus tended to be at a lower level.

Y became a go-to person quickly.

< By system knowledge i think you also mean the PATTERNS and SUMMARIES of logic in codebase. After you study codebase bottom up you inevitably see such patterns.
> “yes”

< how did you pick up in the beginning? There's so much code to get through.
> “only 2 ways to learn — support or development. I worked on development of pgrid and GCA. i followed emails between users and support guys and went into code to verify.”
I (Bin) think there’s also a lot of help from the Infosystem colleagues.

I didn’t follow all the email chains and take those as learning opportunities. Email overload.

Y gave me a plausible learning opportunity — document the haircut rules. But there’s too much logic. I studied it a few times but didn’t connect to the log messages or user issues. I didn’t learn enough to make me a go-to person. Now i know support is the way to go — i get to know what’s important among the hundreds of complex rules and also how-things-should-work.

## know what “nice” solutions work only on paper

We are talking about real technical value-adding, not interviews. Context — non-trivial trading system design. Team of 5 – 20 developers.

If you can get at least one credible [1] dose of non-trivial first-hand experience in pricing / booking / real time mark2market -> real time pnl -> real time VaR / market data feed / execution … then you can give real inputs to design discussions. Real inputs, not the typical speculative inputs from an “armchair quarterback”. In other words, if you have done it, then you can speak from experience. Speculators can’t be sure if their solutions will work. If it’s a top brain, she can be fairly confident, but actual development invariably requires many implementation decision. Even for a genius, It’s impossible to make an educated guess at every juncture and be right every time.

Most of the tools are made by “strangers”. No one can be sure about the hidden limitations esp. at integration time. Some experience is better than no experience.

Here are a small sample of those tools that came to mind. These aren’t trivial tools, so real experience will teach you what to *avoid*. In short, every solutions works on paper, but Speculators don’t know what don’t work.

  • temp queue
  • lock free
  • rv
  • cache triggers and gemfire updaters
  • gemfire
  • camel
  • large scale rmi
  • large scale web service
  • DB-polling as a notification framework

(I have a fundamental bias for time-honored tools like MOM, SQL, RMI..)

[1] need to be a *mainstream* system, not a personal trading system. Each trading system could have quite different requirements, so ideally you want to sample a good variety of, say, pricing engines.

##what zbs skills differentiate java trading developers

I said before that core java trading system developers need a few special skills

1) multi-threading
2) messaging
3) data structures and algorithms
?) distributed caching — trading system often avoid hitting database, so they keep large amounts of data in memory
?) java tuning
?) distributed computing using RMI, web services, ejb etc
?) Spring DI beyond MVC
?) serialization

#1 and 2 are the most broadly recognized requirements.

Now I feel most of these skills are built on top of fundamental java skills. Messaging system uses threading, serialization, data structures. Caching uses threading, data structures, serialization …

I don’t know a well-defined list of “fundamental java skills” but interviewers often zoom into threading, collections and OO. Among these, threading presents the highest complexity, but most applications stick to well-tested fool proof patterns.

I feel java generic collections (in a multi-threaded environment) is perhaps the 2nd in complexity. (I realized this after reading c++. C++ generic collections are quite a bit messier than java.) Most java developers I know never need to go beyond scratching the surface of generics.

In conclusion, to pass interviews, over-prepare on threading and collections, and prepare for high-level questions on messaging. To be a good trading programmer, become really good at threading and serialization… among other things

stronger]Perl than java though I spent more years]java (theory^GTD)

Q: what elements make me Feel strong in c# or c++ or perl…?
A21: passing a large number of non-trivial interviews
A36: discussion with veterans. Can follow. Can contribute.
A78: help veterans solve problems
A96: independently solve non-trivial problems (colleagues couldn’t help you solve)
A01: hands-on skill with the tough and core parts of the language, like threading, generics, Reflection …
A10: tools, tools and tools

Q: what elements make me Look strong in c# or c++ or perl…?
A63: track record – a wide range of non-trivial mainstream projects, often with substantial (or impressive) volume and performance
A81: certification or the type of knowledge therein
A56: insight into strange errors that other veterans don’t dig into

For c#, I have A36, A01, A96
For java, I can improve A78, A10
In any programming language there’s a body of abstract theory (including design principles, language features…) on one hand, and practical how-to on the other hand. I’m a theory guy meaning I understand theory faster. Practical on-the-ground problem solving skill is ….mediocre.

Now, Why am I stronger in Perl than java even though I spent more years writing java? Here’s my new answer — I understand java fundamentals better than Perl fundamentals; but I know more Perl how-to than java how-to. Most practical Perl problems I can resolve faster than others. (Well, not many people spent that many years using Perl.) I have worked with many Perl developers and many of them tell me I know a lot of little tricks.

Here’s one example of a killer java skill — Multithreading is the most complex part of java (followed by generics and reflection, IMO). Complex in the JVM ie “theory”. In practice, app developers stick to code idioms and proven constructs. These “best practices” shield programmers from the underlying complexities. If you know some of these “best practices”, then you don’t need a deep understanding of the JVM internals. What I just said is all within my comfort zone. However, when threads meet swing, messaging, distributed cache, app servers…, we are challenged. Not much theory. Just a lot of code on the ground. You need resourceful, quick problem solvers. Good understanding of java fundamentals can help to some extent.

Another example – eclipse IDE. Never tested in interviews. No theory. But can really make a programmer so much faster or slower.

Perl: defensible turf but y@@

Reason: in every team including GS, no one has a deeper or more complete experience than mine.

reason: i explored way beyond everyday language features. if i walk into a perl shop, most of the features used i already know.

Lesson: “3 years on a … job” seems to prove qualification for similar jobs, but nonono, because those x years could be very limited.

Look at C. i used it in Chartered Semi but almost no pointers, no memory management, no threading, no system programming… For java i actually used a lot of things, but there are many more experienced guys.

A lot of 5-year java developers are completely unprepared against threading challenges.

Which area can I target as my next?

* threading?
* MOM?
* SQL + relational design?