intellij=cleaner than eclipse !

intellij (the community version) is much cleaner than eclipse, and no less rich in features.

On a new job, my choice of java IDE is based on:
1) other developers in the team, as I need their support

2) online community support — as most questions are usually answered there.
On this point, I think eclipse beats intellij.

3) longevity — I hate to learn a java IDE and lose the investment when it loses relevance.
I think eclipse beats intellij, due to open-source.

4) other factors include “clean”

The most popular tools are often vastly inferior for me. Other examples:
* my g++ install in strawberryPerl is better than all the other windows C++ toolchains, esp. msvs
* my git-bash + strawberryPerl is a better IDE than all the fancy GUI tools
* wordpress beats blogger.com hands down
* wordpad is a far simpler rich text editor than msword or browsers or mark-down editors

“Strategic” needs a re-definition #fitness

“Strategic” i.e. long-term planning/t-budgeting needs a re-definition. Quant and c# were two wake-up calls that I tragically missed.

For a long time, the No.1 strategic t-expense was quant, then c#/c++QQ, then codingDrill (the current yellowJersey).

Throughout 2019, I considered workout time as inferior to coding drill .. Over weekends or evenings I often felt nothing-done, even though I pushed myself to do “a bit of” yoga, workout, math-with-boy, or expense tracking.

Now I feel yoga and other fitness t-spend is arguably more strategic than tech muscle building. I say this even though fitness improvement may not last.

Fitness has arguably the biggest impact on brain health and career longevity

c++low-^high-end job market prospect

As of 2019, c++ low-end jobs are becoming scarce but high-end jobs continue to show robust demand. I think you can see those jobs across many web2.0 companies.

Therefore, it appears that only high-end developers are needed. The way they select candidates is … QQ. I have just accumulated the minimum critical mass for self-sustained renewal.

In contrast, I continue to hold my position in high-end coreJava QQ interviews.

##command line c++dev tools: Never phased out

Consider the C++ build chain + dev tools on the command line. They never got phased out, never lost relevance, never became useless — and won’t, at least till my 70s. New tools always, always keep the old features. In contrast, java and newer languages don’t need so many dev tools. Their tools are more likely to use a GUI.
  • — Top 5 examples of similar things (I don’t have a good adjective)
  • unix command line power tools
  • unix shell scripting for automation
  • C API: socket API, not the concepts
  • — secondary examples
  • C API: pthreads
  • C API: shared memory
  • concepts: TCP+UDP, http+cookies
Insight — unix/linux tradition is more stable and consistent. Windows tradition is more disruptive.

Note this post is more about churn (phased-out) and less about accumulation (growing depth)

@outage: localSys know-how beats generic expertise

When a production system fails to work, do you contact

  • XX) the original developer or the current maintainer over last 3Y (or longer) with a no-name college degree + working knowledge of the programming language, or
  • YY) the recently hired (1Y history) expert in the programming language with PhD and published papers

Clearly we trust XX more. She knows the localSys and likely has seen something similar.

Exception — what if YY has a power tool like a remote debugger? I think YY may gain fresh insight that XX is lacking.

XX may be poor at explaining the system design. YY may be a great presenter without low-level and hands-on know-how.

If you discuss the language and underlying technologies with XX, he may show very limited knowledge… Remember Andrew Yap and Viswa of RTS?

Q: Who would earn the respect of teammates, mgr and external teams?

XX may have a hard time getting a job elsewhere .. I have met many people like XX.

WallSt=kind to older guys like me@@

I would say Wall St is open to older techies.

Q: I sometimes feel WallSt hiring managers are kind to older techies like me, but really?
A: I feel WallSt hiring managers are generally a greedy species, but there are some undercurrents:

  • 🙂 traditionally, they have been open to older techies who are somewhat less ambitious, less driven to move up, less energetic, less willing to sacrifice personally
  • 🙂 U.S.hir`mgr may avoid bright young candd #Alan
  • 🙂 some older guys do perform well above expectation
  • 😦 if an older guy needs to be cut, many hiring managers won’t hesitate.

Overall, I think Wall St hiring managers are open to older guys but not sympathetic or merciful. They are profit-driven, not compassionate. The fact that I am so welcome on Wall St is mostly due to my java/c++ QQ, not anyone’s kindness.

I thank God. I don’t need to thank Wall St.

Sunday night: if in the mood4localSys

Realistic scenario — I find myself in the mood for localSys learning on a Sunday night 11pm.

I think it’s better to sleep in the office than to go home; but in Singapore, I had better go home and sleep, taking a taxi.

I think it’s better to work on the localSys till 2am (or later). Those 3 hours are precious engagement … Burning pleasure. I don’t get such 3-hour engagements in a week.

I used to feel “why not work on my QQ or coding practice now, and focus on work Monday morning?” It turns out that I’m usually less motivated on Monday morning, for whatever reasons.

Friday night is similar. I often find my highest appetite and absorbency on Friday nights (or eve of public holidays). So better go home late, sleep, then come back to office Saturday early morning to capture the mood.

U.S.startups: often less selective,lower caliber

U.S. startups may represent a sweet spot among my job choices. I think they are less selective than HFT or top tech shops. They attract fewer top geeks.

  • Workload – presumably comparable to ibanks like MS/Baml/Barc. I guess some startups demand more, but fewer than 50%. Mithun described 3 startups but he has no firsthand experience.
  • Salary — many startups pay on par with ibanks, according to CYW
  • Leave – possibly less than ibanks
  • Coworker benchmark – possibly lower caliber
    • Respect – could be slightly better than in an ibank if you earn the respect

Some of the big tech shops are actually less selective than HFT – Amazon, Apple

##longevity rating: java is outlier ] language_war

Among languages, java is way ahead of the pack. We need to treat java as an exceptional outlier. With that, c++ looks rather solid and decent. Shall we merge this table into my pastTechBet.xlsx sheet? Partially merged, but no need to update the sheet.

| longevity rating #bias | tech skill | mkt-share trend | prominent since | domains |
| --- | --- | --- | --- | --- |
| 80% | java | robust | 2000s | |
| 40% | py | rising | 2000s | |
| 50% | c/c++ | fell | 1980s | HFT, gaming, telco, embedded, AI/ML |
| 20% | c#/dotnet | fell | 2010s | |
| 30% | php | ? | 2000s | |
| 10% | perl | FELL | 2000s | automation |
| 40% | javascript | rising | 2000s | |
| 30% | RDBMS/sql | fell | 1990s | |
| 70% | socket | robust | 1990s (probably) | |
| 90% | TCP/IP | dominant | 1970s | |
| 20% | MOM | robust | | |
| 90% | http stack | dominant | 2000s | |
| 90% | unix “tradition” | dominant | beyond memory | |

localSys,big codebase..taxing@the aging memory

A.Brooks talked about innovative brain power. I’m talking about memory capacity.

Now I see that localSys on a big system is taxing on the aging memory. I guess GregM (RTS) might be a cautionary tale. GregM relied on his theoretical knowledge to pass interviews, but was not fast enough with the local codebase.

Is a green-field codebase better than a brown-field one? I guess so, based on personal experience. Green-field projects are rarely given to a new joiner, but contractors are often hired on green-field budgets — like Citi, 95G, volFitter and RTS 🙂

Now consider this separate question:

Q: Are the c++11 QQ topics a form of churn on the interview arena?
A: Yes but c++11 QQ is lighter (on the aging memory) than localSys

Q: how about the Coding IV?
A: still lighter stress than localSys.

c++TMP^other QQ topics #java

Alexandrescu’s TMP techniques (not “designs”) are very tricky (not “complex”). They require absorbency, but do they improve latency? Do they get you higher jobs with lower stress?

I need to make time-allocation decisions among QQ topics, including TMP

In terms of latency — well, java can now rival c++. The technical reasons are neither obvious nor intuitive, but they are not my focus today. It is just an observed fact which discredits conventional wisdom and our assumptions.

— zbs, based on continued relevance:

TMP is needed when reaching the next level in c++ zbs.

TMP is more time-honored than many c++0x features.

Many new c++0x features were added for TMP. I feel TMP is the main innovation front across c++ language and standard development. C++ lost many battles in the language war, but no other language offers anything close to TMP features.

— As QQ

Will C++TMP (and rvr) QQ turn out similar to java bytecode engineering, reflection, generics? (Even in such a scenario, TMP still offers better ROTI than Qz.) Actually TMP is quizzed more than those. The c++ guru interviewers often adore TMP .. cult following.

EJB is an add-on package .. different category, not an advanced core language feature.

When TMP is not quizzed you may still get opportunities to showcase your halo. Many interviewers ask open-ended questions.

TMP techniques would remain a halo for years to come. Classic QQ topic.

— GTD: templates are never needed in green-field projects. Occasionally relevant in understanding existing codebases such as etsflow, STL, boost..

Q: are there rare projects using TMP and offer me an opportunity to outshine others, gain GTD advantage ..?
A: I guess it’s one in 10 or 20. Here are some factors:

Within a given project codebase, TMP is a powerful tool for DRY improvement and re-usability, but such reusability is over-rated in most projects.

DRY (don’t repeat yourself) is practiced more widely, but I feel TMP techniques replace 100 lines of code duplication with 20 lines of unreadable code.
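To make the trade-off concrete, here is a minimal sketch (hypothetical names, not from any project): one SFINAE-based entry point replacing a family of near-duplicate overloads. Fewer lines, DRY, but arguably less readable.

```cpp
#include <iostream>
#include <string>
#include <type_traits>

// One templated entry point replaces several near-identical overloads.
// Readable? Debatable -- which is exactly the point above.
template <typename T>
typename std::enable_if<std::is_arithmetic<T>::value, std::string>::type
encode(T v) { return std::to_string(v); }       // numeric branch

template <typename T>
typename std::enable_if<!std::is_arithmetic<T>::value, std::string>::type
encode(const T& v) { return std::string(v); }   // string-like branch

int main() {
    std::cout << encode(42) << " " << encode(2.5) << " " << encode("hi") << "\n";
}
```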

 

QQ=fake; zbs_learning=treacherous

I used to feel my QQ knowledge was not zbs, but now I think many interviewers (like myself) ask zbs questions. zbs is a vague concept; QQ has a well-established range of topics. Experts sizing up each other … is mostly QQ.

Biggest danger with zbs learning — when I spend my precious spare time on zbs not aimed at QQ, I inevitably regret it. Without QQ, the learning lacks reinforcement, a positive feedback loop … and fades from memory.
I think in the SG context the dearth of interviews makes zbs learning treacherous.

Therefore, zbs learning must aim at xref and thick->thin.

One guy with theoretical zbs (either strong or poor QQ) may be very mediocre joining a new team. It depends on the key skill needed in the team. Is it localSys, design …? I tend to feel what’s really needed is localSys.

java isn’t challenged by c++/c# but by high-level languages

  • I noticed at the beginning of this decade (2010-2012) c# mounted a credible challenge to dethrone java but has since lost momentum.
  • c++ had a renaissance with c++1x, but after 8 years I would say it failed to become a threat to java.

The other challengers to java seem to be domain-specific high-level languages like javascript (browser and beyond), ruby-on-rails, php (web server side), perl (text processing), q (kdb). Not sure about Golang.

Python is the strongest challenger among them. Here I consider python a domain-specific language, given its visible strengths in data science, machine learning, black-box testing, and traditional process management. I sometimes consider python a general-purpose language like java and c++, but I feel it’s not widely perceived as such, perhaps due to dynamic typing and efficiency concerns.

C/c++ remains the default system programming language — that’s clear. Less clearly, Java remains the “default” application-development language on server side — but in this sentence the key words are vague.

Luckily for java developers, nowadays most of the work is on the server side, including many mobile and cloud applications. In addition, java is viable for offline/batch jobs, data science, transaction-processing systems … Actually, I listed javascript as a java challenger because I think a lot of application functionality is moving to the client side.

Luckily for java developers, a few internet powerhouses led by Google (and Alibaba) have chosen Java as the default language on server side. So have most Wall St firms.

deadly delays@ project^feature levels

Background — an insightful article on managing tight deadlines.

— deadly delay at feature level, without user impact

This is a real stressor, primarily because there exist team colleagues who can GTD faster on the same task.

For the serious delay, the user impact is … exaggerated! This is true across all my projects — user impact doesn’t really matter. In fact, the feature may not be used much, whether you deliver it at high or low quality, within or over budget.

— blatant delay at project level when you are architect / team lead

In theory, if you promise to deliver some new feature, i.e. a green-field project, it can be tough to deliver on time. In reality, project time/budget overrun is very common. You only need a good explanation.

Users never care that much about a new system compared to current production systems. New systems often hit various teething problems and remain functionally unreliable for months.

OK, users don’t really care that much, but there’s visible technology-budget impact which makes the technology MD look bad. The MD must look for an “explanation” and may cut headcount, as in StirtRisk and RTS-Trep.

Whose fault? Many fingers point at you the architect, but often it’s partially the fault of the user-facing manager, due to immaturity and overconfidence in timeline estimates.

The delay is mostly the architect’s fault only if another architect could deliver faster, but I can’t imagine two architects working on two comparable projects at the same time. Therefore it’s never “mostly the architect’s fault”. In reality, management does apply pressure and you suffer, but it’s really for political reasons (sacrificial lamb, scapegoat).

eg: RTS Trep
eg: PWM Special-Investment-management
eg: Post-Saurabh GMDS — multiple delays and scope reductions

eg@traction[def2] in GTD^IV #Sunil

Sunil is not the only one who tried but failed to break into java. Sunil was motivated by the huge java job market. I believe he had opportunities to work on (probably small) java projects and gained confidence, but that’s really the easy part. That GTD experience was completely insufficient to crack java interviews. He needs IV traction.

  • Zhurong also tried java
  • [g] Venkat actually got a java job in SCB but didn’t like it. I feel he was lacking GTD traction
  • [g] XR and the Iranian java guy had some c# projects at work but didn’t gain traction.
  • CSY had a lot to tell me about breaking into java.
  • [gi] CSY and Deepak CM both had java projects at work but no traction
  • [i=IV traction]
  • [g=GTD traction]

IV/QQ traction — I experienced better IV traction in c# than c++. I think it’s because half my c++ interviews were HFT shops.

GTD traction — I had good GTD traction with javascript, php …, much better than c++. My C# GTD traction was also better than c++. C++ is probably the hardest language in terms of GTD. As I explained to Kyle, my friend Ashish experienced tremendous GTD traction, but can he crack the interviews? Hiring teams can’t assess your GTD, but they can ask tough QQ questions.

##[19] am competent at..as a professional thanks2tsn

Background — When I listen to a professional musician or comedian from some unfamiliar country, I wonder if they are actually good. Similarly, when I consult a doctor or dentist, I wonder if they are qualified.

“Self-respecting programmer” — Yi Hai’s motivation.

I have been tested and proven on the big stage i.e. U.S. tech interviews + GTD

  • [p t w] java/c++/c#
  • [w] algo
  • [w] coding test
  • [t] SQL
  • [t] socket
  • [t] unix power user
  • swing
  • [p] web app developer across php, javascript, java
  • py, perl (and shell/javascript?) — are these professional games?
  • [t = long tradition over 20Y]
  • [w = worldwide contest]
  • [p = a well-known standard profession]

became expert via external QQ benchmark` !!localSys xx

Suppose I invest heavily and become very productive on a local c++ system. Similarly, a java, SQL, socket, or python .. system. The point is — any local system uses only a small portion of the language features .. at most 5% of the typical high-end interview topics on that language.

Therefore the project experience won’t give me confidence to be an expert on the language, but acing multiple QQ interviews does, as these interviews compare dozens of competitive, motivated and qualified candidates across a wide field.

I grew very confident of java through this type of external benchmarking, in contrast to internal benchmarking within the local team.

required experience: QQ imt CIV

A few years into my technology career, I observed that Solaris/Oracle was harder to self-teach at home than linux/mysql/php/python/perl/javascript because even a high-school student can install and hack with the latter. Entry-barrier was non-existent.

Similarly, I now observe that Wall St QQ-IV demands longer experience than west-coast style coding IV. Coding drill can be built up with Leetcode over months. QQ requires not so much project experience as a lot of interview experience. It takes more than months of exploration and self-learning.

  • example — tcp/ip. My friend Shanyou preferred to read books systematically, but a book touches on hundreds of topics. He wouldn’t know which topics interviewers like to dig into.
  • example — c++ TMP. A fresh grad can read about it but won’t know the favorite topics for interviewers
  • Compelling Example — java concurrency. A fresh grad can build up theoretical knowledge but won’t have my level of insight

Many inexperienced candidates were highly appreciated in west coast interviews. No such appreciation on Wall St, because Wall St VP or contract roles require work experience .. tried-n-tested.

  • Wall St interviews are selective in terms of experience
  • West Coast coding interviews are selective in terms of … speed and optimality

cod`drill^quant self-study

quant knowledge used to be a halo, now replaced by coding IV skills like

  • classic generators
  • DP, recursion
  • clever data structures
  • clever graph algos
  • speed coding

In 2012, when I was studying quant in my spare time, I was on a powerful ascent. In the subsequent years, I gradually lost steam. The quant sector was shrinking and market depth was disappointing.

Since 2018, my coding drill has felt very relevant, even though the gap behind the strong players continues to be a pain and a de-motivation. When I stop comparing myself with the stronger players, I can feel my improvement in a highly valuable skill.

west-coast IV need no java/c++insight..our t-investment lost

I attended a few technical interviews at “west coast” type of companies over the years — google, amazon, VMWare … and recently Indeed, Facebook and some well-known Chinese tech shops.

These interviewers never asked about java/c++ language details or data structures (as implemented in those standard libraries), or Linux+compiler system knowledge. (I know many of these shops do use java or c++ as the firm-wide primary language.) They do require data structure knowledge in any language you choose.

My conclusion from these experiences — if we compete for these jobs, we can’t rely on the prior interview experiences gained from all the financial-domain tech interviews. Wall St vs West Coast are too different, so much so that Wall St essential tech knowledge is not useful for west coast interviews.. We have to leave that wealth of knowledge behind when we start on a new journey (to the West) of learning, trying (trying our luck at various interviews), failing and retrying.

Michael Jordan once quit NBA and tried his hand at professional baseball. I see ourselves as experienced basketball players trying baseball. Our interview skills, interview practice, in-depth knowledge of crucial interview topics have no value when we compete in west-coast interviews.

West coast shops mostly rely on algo interviews. You can use any (real) language. The most common are java/c++/python. You need only a tiny subset of the language knowledge to compete effectively in these coding tests. In contrast, financial firms quiz us on much wider and deeper knowledge of java/c++/c#/Linux etc.

Q: What if a west-coast candidate were to try the financial tech jobs like ibanks or hedge funds etc? I think they are likely to fail on those knowledge tests. I think it would take more than a year for them to acquire the obscure knowledge required at high-end financial tech jobs. In contrast, it takes months to practice a few hundreds leetcode problems. You can decide for yourself which side is more /impenetrable/ — Wall St or West Coast.

Ironically, neither the west-coast algo skill nor the financial tech obscure knowledge is needed in any real project. All of these high-end employers on both coasts invent artificial screening criteria to identify “cream of the crop”. What’s the correlation of on-the-job performance to a candidate’s algo skill and obscure knowledge? I would say zero correlation once we remove the intelligence/diligence factors. In other words, algo skill or obscure knowledge are poor predictors of job performance, but intelligence/diligence are good predictors.

In the bigger picture, these tech job markets are as f**ked up as decades ago, not improving, not worsening. As long as there are big salaries in these jobs, deep-pocketed employers will continue to use their artificial screening criteria. We have to play by their rules, or get out of the way.

competitive strengthS offer different$values #speedCod`,math

  • competitive strength in speed coding contests — such contests are now more prevalent and the skills are more valued
  • competitive strength in dStruct/algo beyond the classics
  • competitive strength in core cpp QQ
  • competitive strength in core java QQ — bigger job market than cpp
  • competitive strength in socket QQ
  • competitive strength in SQL QQ (and perl GTD) — better than swing
  • competitive strength in math before college level — huge and long-term impact
  • competence in localSys — no long-term impacts, so Ashish’s t-investment is unwise
  • improvement in yoga and fitness

In each “competitive” case, you build up competitive strength over H years but may lose it in K (possibly many) years. Each strength has very different long-term impacts and long-term value, not “zero” (transient) as we sometimes perceive them to be.

Any valuable resource (including a lucrative job) is scarce and invites competition. A competitive strength (in any one of these domains) has long-term impact on my health, mental aging, stress level, my job choices, my commute, and my amount of annual leave.

For a concrete comparison, let’s compare speed coding vs math. In school, math is far more valuable. It enhances your physics, chemistry, economics… There are many math competitions at various levels. After we turn 30, math, in the form of logical and number-theory quizzes, still plays a small part in some job interviews. However, speed-coding strength (which I am building) now has an appreciating value on the high-end competitive interview scene. Over the next 10Y, speed coding will have far more impact on those aspects listed earlier.

However, if you want to invest in building such a strength, beware of huge disappointments. You can find imperfections in every woman’s natural beauty when you see that woman every day, because our idea of perfect beauty is based on paintings and photos, not live humans. Similarly, every endeavor’s ROTI has imperfections compared to our naive, idealized concepts.

If you look for imperfections you will always find some, but such fixation on imperfections is cynical, demoralizing and unproductive research.

We need to zoom into our own strategic strengths + long term interests such as low-level, theoretical stuff, pure algo, dstruct, and avoid our weaknesses.

  • low level or theoretical QQ — my strength
  • low level investigation using tools + codebase — my weakness
  • picking up new GTD challenges — my relative weakness but I did well before joining Wall St.
  • picking up new IV topic — my relative strength

unnoticed gain{SG3jobs: seeing through quantDev

All three jobs were java-lite, with some quantDev exposure. Through these jobs, I gained crucial clarity about the bleak reality of the quantDev career direction. That clarity enabled me to take the bold decision to stop the brave but costly TSN attempts to secure a foothold. A foothold is simply too tough and futile.

Traditional QuantDev in derivative pricing is a shrinking job pool. Poor portability of skills without any standard set of interview topics.

At the same pay, I would now prefer the eq domain over drv pricing, due to mkt depth and job pool.

QuantDev offers no contract roles!

Instead, I successfully established some c#/py/c++ trec. The c++ accu, though incomplete, was especially difficult and precious.

Without this progress, I would lack the confidence in py/c#/c++ professional dev that enabled me to work towards and achieve multiple job offers. I would still be stuck in the quantDev direction.

market-depth^elite domains #jxee

I used to dismiss “commodity” skills like market data, risk system, JXEE… I used to prefer high-end specializations like algo-trading, quant-dev, derivative pricers. In reality, average salary is only slightly different and a commodity job can often outpay a specialist job.

As I get older, it makes sense to prefer market depth over “elite” (high-end niche) domains. A job market with depth (eg jxee, market-data, risk systems) offers a large number of positions. The typical salaries of the top 10% vs the median are not very different — small gaps. In contrast, the elite domains feature bigger gaps. As I grow older, I may need to reconsider the specialist vs generalist-manager choice.

Reminders about this preference (See also the spreadsheet):

  1. stagnation in my orgradient
  2. may or may not use my specialist skills in math, concurrency, algorithms, or SQL …
  3. robust demand
  4. low churn — a critical criterion whenever I mention “market depth”. I don’t like the market depth of javascript and web java.
  5. salary probabilities(distro): mgr^NBA#marketDepth etc

— case study: Algo trading domain

The skillset overlap between HFT and other algo systems (sell-side, OTC, RFQ, automated pricing/execution..) is questionable. So is “accumulation” across the boundary. There seems to be a formidable “dragon gate” — like the carp leaping over it in the Chinese idiom.

Within c++-based HFT, accumulation is conceivable. The job pool is so small that I worry about market depth. My friend Shanyou agreed that most of the technical requirement is latency. C/C++ latency techniques are different from java’s.

However, HFT developers seldom need to optimize latency

Outside HFT, the level of sophistication and latency-sensitivity varies. Given the vague definition, there are many (mostly java) jobs related to algo trading i.e. better market depth. Demand is more robust. Less elitist.

rvr usually shows up as function param ONLY

r-value reference is a type, and therefore a compile-time thing, not a runtime thing as far as I know. At runtime, there’s no r-value reference variable, only addresses and pointer objects.

(I believe at runtime there is probably no lvr reference variable either.)

Compiler recognizes the RHS’s type and decides how to bind the RHS object to a variable, be it an rvr-variable, lvr-variable, nonref-variable, or const-lvr-variable.

About the only place I would use a "&&" variable is a function parameter. I don’t think I would ever need to declare a local variable or a field with "&&".

Do I ever assign an rvr variable as the RHS to something else? Only in one case, as described in [[effModernC++]] P162 and possibly Item 25. This kind of usage is really needed only for QQ interviews and never in any job … never. It’s too tricky and doesn’t buy us anything significant.
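A minimal sketch of that usage (hypothetical names): the rvr appears only as a function parameter, and inside the body the parameter is itself an lvalue, so passing it along needs std::move, the P162/Item-25 pattern.

```cpp
#include <string>
#include <utility>
#include <vector>

// The one natural home for "&&": a function parameter.
void sink(std::string&& s) {
    static std::vector<std::string> store;
    store.push_back(std::move(s));  // s is an lvalue here; move it along
}

int main() {
    std::string tmp = "hello";
    // sink(tmp);          // won't compile: an lvalue can't bind to &&
    sink(std::move(tmp));  // OK: xvalue binds to the rvr parameter
    sink("literal");       // OK: a temporary string binds directly
}
```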

hands-on dev beats mgr @same pay

BAs, project managers, even mid-level managers in some companies can earn the same 160k salary as a typical “developer” role. For a manager in finance IT, salary is often higher, but for a statistically meaningful comparison I will use a 160k benchmark. Note that in finance IT or tech firms, 160k is not high, but on main street many developer positions pay below 160k.

As stated in other blogposts, at the same salary, developers enjoy higher mobility, more choices, higher career security…

Leetcode speed-coding contest #Rahul

  • don’t look at ranking
  • yoga — I CAN keep up this practice. This practice is good for my mental health and family well-being
  • yoga — I feel even if I don’t improve visibly, the fact that my participation count is increasing means I’m improving
  • if I don’t do the contest then I may not do any coding drill at all
  • What if I give up after one or a few episodes?
  • impact on family well-being?
  • leverage over long term?

In [[who moved my cheese]], we discover the world has changed. The harsh reality is, in this profession, your experience (like doctors) is heavily discounted. Your current runtime performance is easily benchmarked, just like a painter, or pianist, or chef.

wasm: a distant threat to javascript

https://medium.com/javascript-scene/what-is-webassembly-the-dawn-of-a-new-era-61256ec5a8f6 is a good WebAssembly intro.

  • wasm is an in-browser language, like javascript
  • wasm offers low-level (web-ASSEMBLY) programming constructs to complement javascript
  • I feel wasm will only be considered by extreme-performance browser apps. It’s too low-level, too inconvenient to be popular.
  • How many percentage points of market share will javascript lose to wasm? 0.1% to 0.5%
  • The fact that wasm is cited as the main challenger to javascript means javascript is unchallenged as the dominant choice on the browser

real-world treeNode has uplink

I think only in contrived interview questions do we find tree nodes without uplink.

Uplink basically means every edge is bidirectional. With uplinks, from any node we can easily trace to the ancestor nodes.

In real world trees, uplink is cheap-yet-valuable most of the time. AVL tree node has uplink, but let’s look at RBTree:

  • RBTree needs uplink for tree rotation.
  • Also, I believe when you give an insertion hint, the “engine” needs to validate the hinted node, by checking parent + another ancestor.
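A minimal sketch of such a node (a plain binary tree for illustration; field names are my own):

```cpp
#include <vector>

// A "real-world" tree node with an uplink.
struct TreeNode {
    int       val;
    TreeNode* left   = nullptr;
    TreeNode* right  = nullptr;
    TreeNode* parent = nullptr;  // the uplink: every edge is bidirectional
    explicit TreeNode(int v) : val(v) {}
};

// With uplinks, tracing to the ancestors is trivial: no recursion and
// no explicit path stack, which is what rotation or hint-validation needs.
std::vector<TreeNode*> pathToRoot(TreeNode* n) {
    std::vector<TreeNode*> path;
    for (; n != nullptr; n = n->parent) path.push_back(n);
    return path;
}
```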

success@concurrency features] java^c++^c#

I’m biased towards java.

I feel c# concurrency is less impactful because most of the important concurrent systems use *nix servers not windows, and most concurrent programming jobs do not go to windows developers.

Outside windows, c++ concurrency is mostly based on the C-level pthreads library, non-OO and inconvenient compared to java/c#.

The c++11 thread classes are the next generation following pthreads, but not widely used.
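A minimal sketch of the contrast (assuming linux, built with -std=c++11 -pthread):

```cpp
#include <iostream>
#include <thread>

void work(int id) { std::cout << "worker " << id << "\n"; }

int main() {
    // c++11 style: object-oriented, type-safe argument passing.
    std::thread t(work, 1);
    t.join();
    // The pthreads equivalent would be C-style:
    //   pthread_t h;
    //   pthread_create(&h, nullptr, workerFn, &arg);  // void* arg, C linkage
    //   pthread_join(h, nullptr);
}
```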

Java’s concurrency support is the most successful among languages, well designed from the beginning and rather stable. It’s much simpler than the c++11 thread classes, at its core having only the Thread.java and Runnable.java types. More than half of java interviews ask about threading, because java threading is understandable and usable by the average programmer, who can’t really understand c++ concurrency features.

c++complexity≅30% above java #c#=in_between

Numbers are just gut feelings, not based on any measurement. I often feel “300% more complexity” but it’s nicer to say 30% 🙂

  • in terms of interview questions — I have already addressed this in numerous blog posts.
  • see also mkt value@deep-insight: java imt c++
  • — tool-chain complexity in compiler+optimizer+linker… The c++ compiler is 200% to 400% (not merely 30%) more complex than java’s… see my blogpost on buildQuirks. Here are some examples:
  • undefined behaviors … see my blogposts on iterator invalidation
  • RVO — top example of the optimizer frustrating anyone hoping to verify basic move-semantics.
  • See my blogpost on gdb stepping through optimized code
  • See my blogpost on implicit
  • — syntax — c++ >> c# > java
  • java is very very clean yet powerful 😦
  • C++ has too many variations, about 100% more than c# and 300% more than java
  • — core language details required for GTD:
  • my personal experience shows me c++ errors are more low-level.
  • Java runtime problems tend to be related to the (complex) packages you adopt from the ecosystem. They often use reflection.
  • JVM offers many runtime instrumentation tools, because JVM is an abstract, simplified machine.
  • — opacity — c++ > c# > java
  • dotnet IL bytecode is very readable. Many authors reference it.
  • java is even cleaner than c#. Very few surprises.
  • — more low-level — c++ > c# > java.
  • JVM is an excellent abstraction, probably the best in the world. C# CLR is not as good as JVM. A thin layer above the windows OS.

short-term ROTI ] java IV: tuning imt threading

Analogy — physics vs chemistry in high school — physics problems involve more logic, more abstraction; chemistry is more factual. Similarly, java threading challenges are more complex, while java tuning knowledge-pearls are relatively easy to memorize and show off.

“Short-term” here means 3-5Y horizon.

I’m many times more successful with java threading — accu, deepening knowledge, a lead over the pack.

I hope to build some minor zbs in JVM tuning, mostly GC tuning. However, many important parts of JVM tuning skills suffer from churn and poor shelf life.

consolidate into single-process: low-latency OMS

Example #2 — In a traditional sell-side OMS, a client FIX order propagates through at least 3 machines in a chain —

  1. client order gateway
  2. main OMS engine
  3. exchange gateway such as smart order router or benchmark execution engine supporting VWAP etc

The faster version consolidates all of the above into a single process, cutting latency from 2ms to 150 micros .. latency in eq OMS.

Example #1 — In 2009 I read about, or heard from interviewers about, single-JVM designs replacing multi-stage architectures.

Q: why is this technique not used on west coast or main street ?
%%A: I feel on west coast throughput outweighs latency. So scale-out is the hot favorite. Single-JVM is scale-up.

decline@quant-dev domain: low value-add

Q: why were drv pricing quant jobs once more abundant, paying a premium, even though the economic value-add of the quant skill has always been questionable?
%%A: because the traders made money for a few years and subsequently received a big budget. Now budget is lower.

Same can happen to quant funds in NY, Chicago, HK…

Some UChicago lecturer once commented that everyone should get some quant training .. (almost the way every high school student learns math). But I think it is not necessary….

I am more theoretical and was naturally attracted to this domain but the value-add reality soon prevailed in my case.

Tech shops love algo challenges and speed-coding, which have dubious value-add similar to the quant skills. However, the hot money in the tech sector is bigger and lasts longer.

short coding IV: ibank^HFT^inetShop

See also prevalence@speed coding among coding-IV

  • “InetShop” includes all west-coast type of hiring teams including China shops
  • Ibanks are mostly interested in language knowledge, so they use coding tests (including multithreaded ones) for that purpose. Ibanks also use simpler coding questions for basic screening.
  • HFT shops (and bbg) have an unpredictable mix of focuses

I predict that Wall St will not adopt the west coast practice, due to its poor test coverage — it would not test language knowledge, latency engineering, threading etc.

The table below excludes take-home tests and includes hackerrank, white-board, webex, and by-phone algo questions.

|  | ibanks | web2.0 shops | HFT |
| --- | --- | --- | --- |
| language knowledge; latency, NOT bigO | #1 focus | never | key focus, not #1 |
| simple-medium speed coding | rare | #1 focus; high bar | common |
| medium-simple algo without implementation (logistically easiest => popular) | common #SCB; minimum bar is low | common | key focus, not #1 |
| tough algo | never | sometimes | rare |
| big-O of your solution, NOT latency | LGlp; low bar | #2 focus; high bar | LGlp |
| concurrency coding test | G5 focus | never #python/ruby/js | sometimes |

 

## marketable syntax nlg: c++ > j/c#

Every language has poorly understood syntax rules, but only in c++ have these become fashionable — halos in job interviews!

  • ADL
  • CRTP (see the sketch after this list)
  • SFINAE
  • double pointers
  • hacks involving void pointers
  • operator overloading to make smart ptr look like original pointers
  • TMP hacks using typedef
  • TMP hacks using non-type template param
  • universal reference vs rvr
  • rval: naturally occurring vs moved
    • const ref to extend lifetime of a naturally occurring rval object
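Since CRTP tops many such lists, a minimal sketch (hypothetical names): compile-time polymorphism with no vtable and no virtual dispatch.

```cpp
#include <iostream>

// CRTP: the base class is templated on the derived class.
template <typename Derived>
struct Shape {
    double area() const {
        // resolved at compile time; no virtual call
        return static_cast<const Derived*>(this)->areaImpl();
    }
};

struct Square : Shape<Square> {
    double side = 2.0;
    double areaImpl() const { return side * side; }
};

int main() {
    Square s;
    std::cout << s.area() << "\n";  // prints 4
}
```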

value@xRef imt tricks@frontier_Qn: retention

My absorbency is limited — more limited than my time (that’s why I camp out in the office when in the mood). When I’m in the “absorbency” mood,….

… I need to apply my laser energy to the most valuable learning. Many coding-drill guys would focus on unfamiliar problems; however, learning a few tricks on a few unfamiliar categories of problems is unlikely to be organic growth .. limited xRef .. not connecting the dots .. no thick->thin. Nowadays I think the more valuable learning is building xRef, as in the examples under [3] below.

Even if my xRef leads to a non-optimal solution, it is still extremely valuable. The non-optimal solution is sometimes more versatile and can become more effective in a modified problem [3].

If you focus on the optimal solution, you may experience wipe-out. The same wipe-out effect comes from fixation on passing leetcode tests and fixation on “#new problems solved”.

A shift from “frontier” exploration to xRef means slowing down and revisiting old problems.

In some coding problems, you really need some new technique [2], but across 90% of coding problems, the number of constructs needed is below 10.

[2] any example?

[3] example:

  • find-all-paths
  • in-order-walk on a non-sorted tree to assign auto-increment nodeIDs, then use pre-order or post-order
  • reusable technique — RBTree as an alternative to heap or sorted vector (see the sketch after this list)
  • reusable technique — yield-based cookbook
  • reusable technique — radix sort on ints, floats and fixed-length strings as an O(N) wall-blaster. Variable-length strings can also achieve O(N), as described in my blogpost on “FastestSortFor”
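For the RBTree-as-heap technique above, a minimal sketch (std::multiset is the usual RBTree stand-in): same O(log N) push and pop-min as a binary heap, plus O(log N) removal of arbitrary elements, which std::priority_queue cannot do. That versatility is the point.

```cpp
#include <iostream>
#include <set>

int main() {
    std::multiset<int> heap = {5, 1, 4, 1, 3};  // RBTree standing in for a heap
    heap.erase(heap.find(4));                   // arbitrary removal: O(log N)
    while (!heap.empty()) {
        std::cout << *heap.begin() << ' ';      // pop-min
        heap.erase(heap.begin());
    }                                           // prints: 1 1 3 5
}
```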

c++ecosystem[def]questions are tough #DeepakCM

C++ interviewers may demand c++ecosystem knowledge; java also has its own ecosystem of add-on packages.

As I told my friend and fellow c++ developer Deepak CM,

  1. c++ecosystem QQ questions can be more obscure and tougher than core c++ questions
    • tool chain — compiler, linker, debugger, preprocessor
    • IPC, socket, pthreads and other C-level system libraries
    • kernel interface — signals, interrupts, timers, device drivers, virtual memory + system programming in general # see the blog category
    • processor cache tuning
    • (at a higher level) boost, design patterns, CORBA, xml
    • cross-language integration with python, R, pyp, Fortran + other languages
  2. java ecosystem QQ questions are easier than core java questions. In other words, toughest java QQ questions are core java.
    • java ecosystem questions often come and go — high churn

Low level topics are tough

  1. c++ ecosystem questions are mostly in C and very low-level
  2. java ecosystem questions are usually high-level
    • JVM internals, GC … are low-level and core java

 

competitiveness: Pre-teen sprinter + HFT selectivity

I was a decent sprinter from age 6 through 13; then I realized I was not good enough to compete at the Beijing municipal level.

There are quite a number of HFT shops in Singapore. I think they have a higher bar than ibanks. At ibank interviews I felt again like that pre-teen sprinter, but HFT interviews are generally too hard for me.

op=(): java cleaner than c++ #TowerResearch

A Tower Research interviewer asked me to elaborate why I claimed java is a very clean language compared to c++ (and c#). I said “clean” means consistency and fewer special rules, such that regular programmers can reason about the language standard.

I also said python is another clean language, but it’s not a compiled language so I won’t compare it to java.

See c++complexity≅30% mt java

— I gave the interviewer the example of q[=]. In java, this is either a content update at a given address (for primitive types) or a pointer reseat (for reference types). No ifs or buts.

In c++, q[=] can invoke the copy ctor, move ctor, copy assignment, move assignment, conversion ctor, or conversion operator. In addition to the standard meanings:

  • Its meaning is somewhat special for a reference variable at the site of initialization vs update.
  • For (dereferenced) pointer variables, there are additional subtleties.
  • You can even put a function call on the LHS.
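A minimal sketch of the c++ side (hypothetical struct W), labeling which function each q[=] invokes:

```cpp
#include <string>
#include <utility>

struct W { std::string s; };

int main() {
    W a{"x"};
    W b = a;             // copy ctor (initialization, not assignment!)
    W c = std::move(a);  // move ctor
    b = c;               // copy assignment operator
    b = std::move(c);    // move assignment operator
    W& r = b;            // for a reference, q[=] at initialization means binding...
    r = c;               // ...but afterwards it means assigning through the reference
}
```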

c++changed more than coreJava: QQ perspective

Recap — A QQ topic is defined as a “hard interview topic that’s never needed in projects”.

Background — I used to feel as new versions of an old language get adopted, the QQ interview topics don’t change much. I can see that in java7, c#, perl6, python3.

To my surprise, compared to java7/8, c++0x has more disruptive impact on QQ questions. Why? Here are my guesses:

  • Reason: low-level — c++ is more low-level than java, at least in terms of interview topics. Both java8 and c++0x introduced many low-level changes, but java interviewers don’t care that much.
  • Reason: performance — c++0x changes have performance impact, esp. latency impact, which is the hot focus of my target c++ employers. In contrast, java8 doesn’t have much performance impact, and java employers are less latency-sensitive.
  • Reason: template — the c++0x feature set has a disproportionate amount of TMP features, which are very hard. No such “big rock” in java.
    • move/forward, enable_if, type traits

Q: if that’s the case, for my career longevity, is c++ a better domain than java?
A: I’m still biased in favor of low-level languages.

Q: is that a form of technology churn?
A: yes since the c++11 QQ topics are likely to show up less over the years, replaced by newer features.

%%geek profile cf 200x era, thanks2tsn

Until my early 30’s I was determined to stick to perl, php, javascript, mysql, http [2] … the lighter, more modern technologies, and avoided [1] the traditional enterprise technologies like java/c++/c#/SQL/MOM/Corba. As a result, my rating in the “body-building contest” was rather low.

Like assembly programming, I thought the “hard” (hardware-friendly) languages were giving way to easier, “productivity” languages in the Internet era. Who would care about a few microseconds? Wrong…. The harder languages still dominate high-end jobs.

Analogy?

* An electronics engineering graduate stuck in a small, unsuccessful wafer fab
* An uneducated pretty girl unable to speak well, dress well.

Today (2017) my resume features java/c++/py + algo trading, quant, latency … and I have some accumulated insight on core c++/c#, SQL, sockets, connectivity, ..

[1] See also fear@large codebase
[2] To my surprise, some of these lighter technologies became enterprise —

  1. linux
  2. python
  3. javascript GUI
  4. http intranet apps

strong coreJava candd are most mobile

Jxee guys face real barriers when breaking into core java. Same /scene/ when a c++ interviewer /grills/ a java candidate. These interviewers view their technical context as a level deeper and (for the skill level) a level higher. I have seen and noticed this perception many times.

Things are different in reverse — core java guys can move into jxee with relatively small effort. When jxee positions grow fast, the hiring teams often lower their requirements and take in core java guys. Fundamentally (as some would agree with me) the jxee add-on packages are not hard; otherwise they wouldn’t be popular.

Luckily for java guys, Javaland now has the most jobs and often well-paying jobs but ..

  1. c# guys can’t move into the huge java market. Ellen has seen many.
  2. c++ is a harder language than java, but a disproportionate percentage (80%?) of c++ guys face a real entry barrier when breaking into javaland.
  3. I wonder why..
  • I have heard many c++ and c# veterans complain about the huge ecosystem in javaland.
  • I have spoken to several c++/c# friends. I think they don’t sink their teeth into java. Also, a typical guy doesn’t want to abandon his traditional stronghold, even though in reality the stronghold is shrinking.
  • Age is a key factor. After you have gone though not only many years but also a tough accumulation phase on a steep learning curve, you are possibly afraid of going through the same.

Most c++guys have no insight in STL

When I recall my STL interviews on Wall St, it’s clear that the majority of c++ app developers use STL almost like a black box, year after year for 5-10 years. Only 1% have the level of insight of Henry Wu.

Reason? such insight is never needed on the job. This factor also explains my advantage in QQ interviews.

  • Analogy — Most of us live in a house without civil engineering insight.

STL uses many TMP and memory techniques. Some may say STL is simple, but I doubt they have even scratched the surface on any of the advanced topics.

  • Analogy — some may say VWAP execution algo is simple.

Many interview questions drill into one or two essential pieces of STL functionality (which everyone would have used), just below the surface. These questions filter out the majority [1] of candidates. Therein I see an opportunity — you can pick one or two topics you like, and grow an edge and a halo, just as Henry did. Intellectual curiosity vs intellectual laziness — settling for shallow understanding, swallowing knowledge whole without chewing — I see it in many self-taught developers. That’s exactly what these Wall St interview questions are designed to identify and screen out.

How about west coast and other high-end tech interviews? I’m not sure.

[1] sometimes 90%, sometimes 60%.

 

##[18] 4 qualities I admire ] peers #!!status

I ought to admire my peers’ [1] efforts and knowledge (not their STATUS) on:

  1. personal wellness
  2. parenting
  3. personal finance, not only investment and burn rate
  4. mellowness to cope with the multitude of demands, setbacks, disappointments, difficulties, realities about the self and the competition
  5. … all to be compared to:
    • zbs, portable GTD, not localSys
    • how to navigate and cope with office politics and big-company idiosyncrasies.

Even though some of my peers are not the most /accomplished/, they make a commendable effort. That attitude is admirable.

[1] Many people crossing my path are … not really my peers, esp. those managers in China. Critical thinking required.

I don’t have a more descriptive title..

77 c++IV paperTigers

Avoid spending too much time on this list…. These c++ topics appeared non-trivial (perhaps daunting, intimidating) for years, until I started cracking the QnA interviews. Then I realized in-depth expertise isn’t required.

 

  1. — real tigers i.e. non-trivial nlg is quizzed
  2. [A] CRTP, SFINAE — real tigers. I got asked about these around 5 times, sometimes in-depth
  3. socket: non-blocking
  4. — Now the paper tigers
  5. open-source or commercial instrumentation for memory and dependencies. See my blogpost to Ashish
  6. [s] what debuggers and memory leak detectors are you familiar with?
  7. [a] singleton, factory, pimpl and other design patterns
  8. [s] c++ regex
  9. [s] stream I/O and manipulators
  10. —sockets # many more paper tigers to be listed
  11. udp, multicast, select()
  12. [s] socket buffers
  13. [a] byte alignment
  14. endianness
  15. TCP flow control
  16. TCP handshake and disconnect
  17. ack numbers
  18. partial send
  19. close() vs shutdown()
  20. — STL # many more paper tigers to be listed
  21. [s] STL binders
  22. [s] STL allocators
  23. adapters for containers, iterators and functors
  24. [a] iterator invalidation rules
  25. [s] how is deque different from a vector
  26. RBtree
  27. —concurrency # many more paper tigers to be listed
  28. [A] atomic types
  29. pthread functions
  30. [A] IPC mutex
  31. mutex in shared memory
  32. recursive lock
  33. read-write lock
  34. what if a thread dies while holding a lock?
  35. [s] RAII scoped lock
  36. — multi-file build
  37. forward class declarations (as required in pimpl) and their limitations
  38. [s] C/C++ integration, extern-C — heavily quizzed, but no in-depth
  39. [s] what C features are not supported by c++ compiler
  40. circular dependency between libraries — confusing. real tiger but seldom quizzed
  41. [As] shared lib vs static lib
  42. —integration and data exchange
  43. [A] shared memory
  44. [A] CORBA, RPC
  45. [a] serialization in compact wire format — only primitive data types!
  46. [s] OS system calls vs std library (glibc) — sounds daunting to most developers
  47. —exception
  48. catch by ref or by value or by pointer?
  49. [s] exception guarantees
  50. [s] stack unwinding due to exception
  51. throwing destructors — various questions
  52. —memory
  53. which part of memory do static data members go to? How about file-scope static variables? How about global variables?
  54. [s] preventing heap/stack allocation of my class
  55. [s] custom new/delete,  set_new_handler()
  56. [s] intrusive smart ptr, weak_ptr
  57. [s] ref counting
  58. custom deleter in shared_ptr (see the sketch after this list)
  59. [s] reinterpret_cast  # always on pointers
  60. [A] custom allocators
  61. [A] free list in the free store
  62. what if you call delete on a pointer that’s created by array-new?
  63. placement new
  64. out-of-memory in operator-new
  65. —inheritance
  66. dynamic_cast
  67. [A] multiple inheritance
  68. [s] virtual inheritance… which base class ctor gets called first? See https://isocpp.org/wiki/faq/multiple-inheritance#mi-vi-ctor-order
  69. [a] slicing problem
  70. [a] private inheritance
  71. [s] pure virtual
  72. —other low level topics
  73. [s] setjmp, jmp_buf… See the dedicated blog post jmp_buf/setjmp() basics for IV #ANSI-C
  74. [s] cpu cache levels
  75. [s] translation lookaside buffer
  76. [s] what data structures are cache-friendly?
  77. [a] memcpy, memset
  78. [s] ++myItr vs myItr++ how are they implemented differently?
  79. —other language features
  80. [s] RAII
  81. [s] operator-overload
  82. [A] template specialization — part of the STL fabric but largely transparent
  83. [s] ptr to member (function) — seldom used outside library code. I tried the syntax in my binary tree serializer
  84. [A] std::forward() std::move(), rvalue-ref
  85. const and constexpr
  86. [a] lambda with capture
  87. [a] double-pointer .. why do you need it?
  88. —-
  89. [s == shallow book knowledge is enough]
  90. [a == actually not that deep, IMHO]
  91. [A == actually deep topic]

7 clusters@HFT-c++IV questions

Every single HFT interview question is about low latency. Furthermore, ibank algo-trading interviews revolve around the same clusters.

Even though I’m not aiming for HFT jobs, these topics are still very relevant to ibank and the “3rd type” of c++ shops.

  1. socket — lots of details as Shanyou and I agreed
  2. template meta-programming — deep topic but never quizzed in-depth beyond “tricks”
  3. move-semantics

— Themes are less visible in these clusters:

  1. pthreads and c++ threading, but I seldom get c++11 questions here
  2. STL container internals, mostly shared_ptr, raw array, vector, RBTree, and hashmap
  3. (bag of tricks) memory optimization techniques using allocators, cache-optimization, malloc(), placement-new, object-pool, memset,
  4. miscellaneous core OO features like big4, virtual, MI, pbref/pbval

— other HFT topics are dispersed/scattered, not showing any strong central theme

  1. shared memory
  2. linux system calls
  3. compiler details
  4. selected boost libraries

reasons to limit tcost@SG job hunt #XR

XR said a few times that it is too time-consuming to prepare for job interviews each time. The 3 or 4 months he spent have no long-term value. I immediately voiced my disagreement, because I take IV fitness training as a lifelong mission, just like jogging, yoga or chin-ups.

This view remains my fundamental perspective, but my disposable time is limited. If I can save the time and spend it on some meaningful endeavors [1], then it’s better to have a shorter job hunt.

[1] Q: what endeavors?
A: yoga
A: diet
A: stocks? takes very little effort
A: ?

throughput^latency #wiki

High bandwidth often means high latency 😦 .. see also linux tcp buffer^AWS tuning params

  • RTS is throughput driven, not latency-driven.
  • Twitter/FB fanout is probably throughput-driven, not latency-driven
  • I feel MOM is often throughput-driven and introduces latency.
  • I feel HFT OMS like in Mvea is latency-driven. There are probably millions of small orders, many of them cancelled.

https://en.wikipedia.org/wiki/Network_performance#Examples_of_latency_or_throughput_dominated_systems shows

  • satellite links have high latency, regardless of throughput
  • offline data transfer (by trucks) has poor latency, excellent throughput

highest leverage: localSys^4beatFronts #short-term

Q: For the 2018 landscape, what t-investments promise the highest leverage and impact?

  1. delivery on projects
  2. local sys know-how
  3. portable GTD+zbs irrelevant for IV
  4. pure algo (no real coding) — probably has the highest leverage over the mid-term (like 1-5Y)
  5. QQ
  6. –LG2
  7. obscure QQ topics
  8. ECT+syntax — big room for improvement for timed IDE tests only
  9. Best practices — room for improvement for weekend IDE tests only

average O(): hashtable more IMperfect than qsort

In these celebrated algorithms, we basically accept the average complexity as if it were near-guaranteed in practice. Naive…

In comp-science problems, the hash table’s usage and importance is about 10 times higher than qsort’s.

  • I would say qsort is faster than many O(N logN) sorts. Qsort can use a random pivot. It degrades only if extremely “lucky” 😉 — like rolling a “6” on all ten dice.
  • In contrast, hash table performance depends mostly on the programmer’s skill in designing the hash function, less on luck.

Performance compared to the alternatives — qsort’s relative performance is pretty good in practice, but a hash table’s relative performance is often underwhelming compared to red-black trees or AVL trees. Recall RTS.
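A minimal random-pivot quicksort sketch (my own helper names) showing why its average case is dependable: degradation needs a long run of unlucky pivots, whereas a hash table's performance hinges on a human-designed hash function.

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

// Lomuto partition with a randomized pivot: the O(N^2) worst case
// requires consistently terrible pivot draws -- "6 on all ten dice".
void qsortRand(std::vector<int>& v, int lo, int hi) {
    if (lo >= hi) return;
    std::swap(v[lo + std::rand() % (hi - lo + 1)], v[hi]);  // random pivot
    int p = v[hi], small = lo;
    for (int i = lo; i < hi; ++i)
        if (v[i] < p) std::swap(v[i], v[small++]);
    std::swap(v[small], v[hi]);
    qsortRand(v, lo, small - 1);
    qsortRand(v, small + 1, hi);
}

int main() {
    std::vector<int> v{5, 3, 8, 1, 9, 2};
    qsortRand(v, 0, static_cast<int>(v.size()) - 1);  // v is now sorted
}
```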

##Y c++IV improved much faster]U.S.than SG #insight{SCB breakthru

Hi XR,

I have received 9 c++ offers since Mar 2017, mostly from the U.S. In contrast, over the 4.5 years I spent in Singapore, I received only 3 c++ offers, including a 90% offer from the HFT firm WorldQuant (a c++ job but not hardcore).

  1. Reason: buy-side employers — too picky. Most of the Singapore c++ jobs I tried were buy-side jobs. Many of the teams were not seriously hiring and only wanted rock stars.
    • In contrast, since 2010 I tried about 6 Singapore ibank c++ jobs (Citi, Barclays, Macquarie, Standard Chartered Bank) and had much better technical wins than at buy-side interviews.
  2. Reason: much fewer c++ jobs than in the U.S.
  3. Reason: employee status — I was always an employee while in Singapore and dared not attend frequent interviews.
  4. Reason: my c++ jobs in the U.S. are more mainstream, so I had more opportunities to experiment on mainstream c++ interview topics. Experiments built up my confidence and depth.
  5. Reason: I had much more personal time to study and practice coding. This factor alone is not decisive. Without the real interviews, I would mostly waste my personal time.

Conclusion — availability of reasonable interview opportunities is a surprisingly oversized factor in my visible progress.

By the way, Henry Wu (whom I told you about) had more successful c++ interviews. He joined WorldQuant and Bloomberg, two companies that didn’t take me even after my technical wins.

Q: passive income ⇒ reduce GTD pressure#positive stress

My (growing) passive income does reduce cash-flow pressure… but it has had no effect so far on my work GTD pressure.

Q: Anything more effective more practical?

  1. take more frequent unpaid leaves, to exercise, blog or visit family
  2. expensive gym membership

How about a lower-salary job (key: low-caliber team)? No — I still want some challenge, some engagement, some uphill, some positive stress.

low-latency: avoid concurrency #ST-mode

Backgrounder — CPU speed is increasing more gradually than before. The technology industry as a whole is advancing more horizontally — increasing parallelism. Yet the best low-latency designs don’t use lock-free techniques or concurrency at all.

I asked Martin Thompson — To really push the limit of latency, should we avoid concurrency as much as possible, completely eliminating it if possible? Answer is yes. Martin pointed out the difference between

  • parallel design — use multitasking (possibly in ST-mode, i.e. independent single-threaded pipelines) to “do multiple things at the same time”
  • concurrent design — deal with multitasking, i.e. “deal with multiple things at the same time”. The expression “deal with” implies complexities, hazards, risks, control, management.

One of the hidden hazards Martin pointed out is heap memory de-allocation, but that’s for another blogpost.

Proven GTD: worthless in candidate ranking #JackZ

I feel that Jack Zhang is competent with localSys GTD but weak on c++ and comp science.

Does he have working knowledge of c++? I assume so. Working knowledge is attainable in a couple of months for a clean language, and up to a year for c++.

The required level of working knowledge and basic skill is very low for localSys GTD.

His c++ knowledge is probably barely enough to do the job. Remember I didn’t know what things live on java heap vs stack.

Based on my guesstimate, he would fail any c++ interview and any algo interview. He can write simple SQL in an interview, but I am not sure if he can write complex joins.

The fact that Jack and DeepakM are proven on GTD is useless and lost in the conversation.

How about CSY? He can solve many algo problems without practice, but he is reluctant to practice.

I think the self-sense of on-the-job competency is misleading. Many in their positions might feel GTD competency is more important than IV skills. They are so afraid of the benchmark that they don’t want to study for it.

When the topic of tech interview comes up, I think they wish to escape or cover their ears.

tried 3″hard”leetcode Q’s #tests !! 100%

I tried Q4, Q10, Q23.

Observation — they are not really harder in terms of pure algo. I found some “medium” questions actually harder than Q4/Q23 in terms of pure algo.

Beside the algorithm, there are other factors that make a problem hard. For me and my peers, coding speed and syntax are a real problem. So the longer my program, the harder the problem becomes. Some of the “medium” questions require longer solutions than these “hard” problems.

Logistics of instrumentation is another factor. Some problems are easy to set up and easy to debug, whereas 3D, graph or recursive problems are tedious to set up and often confusing when you try to debug with prints.

There’s another factor that can make any “medium” problem really hard

pick java if you aspire 2be arch #py,c#

If you want to be architect, you need to pick some domains.

Compared to python.. c#.. cpp, Java appears to be the #1 best language overall for most enterprise applications.

  • Python performance limitations seem to require proprietary extensions. I rarely see a pure python server that’s heavy-duty.
  • c# is less proven, less mature. More importantly it doesn’t work well with the #1 platform — linux.
  • cpp is my 2nd pick. Some concerns:
    • much harder to find talents
    • Fewer open-source packages
    • java is one of the cleanest languages. cpp is a blue-collar language, rough around the edges and far more complex.

spend more$ to prolong engagement+absorbency

  • increase spend on good mobile data and mobile devices to capture the creativity, absorbency …
  • increase spend on printer
  • increase spend on hotel stay near office (or taxi home) to capture the engagement on localSys
  • spend on flights to gain family time, staying engaged
  • spend unpaid leave to attend interviews .. to gain joy, motivation, engagement, precious insight into their selection priority

Not so sure about …

  • Regrettable — spent unpaid leave before starting Macq job .. to gain peaceful family time? low ROI
  • questionable spend (gave up higher pay rate) to gain … c++ skills like QQ, GTD, zbs

price sensitivities = #1 valuable output of risk-run

[[complete guide]] P433, P437 …

After reading these pages, I can see that per-deal PnL and mark-to-market numbers are essential, but to the risk manager, the most valuable output of the deal-by-deal “risk run” is the family of sensitivities such as delta, gamma, vega, dv01, duration, convexity, correlation to a stock index (which is different from beta), …

Factor-shocks (stress test?) would probably use the sensitivity numbers too.

In Baml, the sensitivity numbers are known as “risk numbers”. A position has high risk if it has high sensitivity to its main factor (whatever that is.)
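To make “sensitivity” concrete, here is a hedged c++ sketch of bump-and-reprice, the common finite-difference way a risk run can produce a delta. The price() function is hypothetical:

// central finite difference: bump the spot up and down, reprice, take the slope
double delta(double (*price)(double spot), double spot, double bump = 0.01) {
  return (price(spot * (1 + bump)) - price(spot * (1 - bump)))
         / (2 * bump * spot);
}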

“didn’t like my face”: we aren’t his top-favorite #bbg

Hi Deepak,

I now think there’s another reason that SIG, Bloomberg, LiquidNet, CapitalDynamics and other employers didn’t make me an offer even though I probably passed technical screening with a technical win.

In our chats, I used the generic term “didn’t like my face” as an umbrella term for several different factors. Today I want to mention a new factor — “what if this candidate takes my offer and continues to shop around?”

I believe some companies shun that risk. When in doubt, they reject. When they make an offer they want to ensure the candidate will accept. They want to see “Hey we are clearly the favorite in his mind and he is in a hurry. If we make him an offer he will likely accept right away.”

Clearly, I’m not that type of candidate. I often come across as a “job shopper”, through my non-verbal language, or even through my explicit verbal answers. For example, when asked “Why are you looking to change job” I often answer “I’m actually doing fine on my current job but there are better opportunities like the role in your company.”

[19] new problems!=the best drill

I used to focus on how many new problems I solved each week, but that’s very challenging and not necessarily most effective. In contrast, reviewing existing code is easier, i.e. requires lower absorbency, but can still take hours. Worth the time!

There’s a real risk to overspend time on new problems.

  • We don’t always fully digest the solved problems. No one can remember them well without refreshing. By investing time to refresh, I am growing stronger than before and stronger than my peers who don’t invest this time.
  • a bigger risk is burnout, stress, rat race against friends who “solved X new problems in Y weeks”. Reality is, they don’t necessarily perform better or grow stronger.


SQL expertise(+tuning)as competitive advantage: losing mkt val

Opening example — I have received very few selective/tough SQL interview questions. The last interview with non-trivial SQL was Lazada.

If you want to rely on SQL/RDBMS skill as a competitive advantage, you will be disappointed.

I think many teams still use SQL, but use it lightly. Simple queries, simple stored proc, small tables, no tuning required. Therefore, interview questions are dumbing down…

I believe I over-invested in SQL. The last job that required any non-trivial SQL was 95G…

My post “strategic value of MOM]tech evolution” is about MOM, but similar things can be said about SQL and RDBMS.

This is a cautionary tale in favor of TrySomethingNew. If my friend Victoria stays within the familiar domain of SQL/Perl/PHP/Apache, then her skillset will slowly lose market value.

Don’t forget — TSN can have low ROTI. We have to accept that possibility.

[18]fastest threadsafe queue,minimal synchronization #CSY

I got this question in a 2017 Wells white-board coding interview, and discussed it with my friend Shanyou. We hoped to avoid locks and also avoid other synchronization devices such as atomic variables.

Q1: what if there is only a single producer thread and a single consumer thread, and no other threads?

I put together a java implementation that can enqueue without synchronization, most of the time … See https://wp.me/p74oew-7mE

Q1b: Is it possible to avoid synchronization completely, i.e. operate as if in single-threaded mode?
A: No. The consumer thread would have absolutely NO idea whatsoever how close it is to the producer end. We need a memory barrier at the very least.
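Here is a minimal c++ sketch of the single-producer single-consumer idea: a fixed-size ring buffer whose only “synchronization devices” are two atomic indices; the acquire/release ordering supplies the memory barrier. My own illustration, not the exact Wells answer:

#include <atomic>
#include <cstddef>

template <typename T, std::size_t N>
class SpscQueue { // safe only for one producer thread + one consumer thread
  T buf[N];
  std::atomic<std::size_t> head{0}, tail{0}; // head: consumer side; tail: producer side
public:
  bool enqueue(const T& x) { // called by the producer only
    std::size_t t = tail.load(std::memory_order_relaxed);
    if ((t + 1) % N == head.load(std::memory_order_acquire)) return false; // full
    buf[t] = x;
    tail.store((t + 1) % N, std::memory_order_release); // publish the element
    return true;
  }
  bool dequeue(T& out) { // called by the consumer only
    std::size_t h = head.load(std::memory_order_relaxed);
    if (h == tail.load(std::memory_order_acquire)) return false; // empty
    out = buf[h];
    head.store((h + 1) % N, std::memory_order_release); // free the slot
    return true;
  }
};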

Q2: what if there are multiple producer/consumer threads?

I believe we can use 2 separate locks for the two ends, rather than a global lock. This is more efficient but invites the tricky question “how to detect when the two ends meet”. I am not sure. The locks do at least enforce memory barriers — lock/unlock imply acquire/release semantics.

Alternatively, we could use CAS on both ends, but see lockfree queue #popular IV


c++^java..how relevant ] 20Y@@

See [17] j^c++^c# churn/stability…

C++ has survived more than one wave of technology churn. It has lost market share time and time again, but hasn’t /bowed out/. I feel SQL, Unix and shell-scripting are similar survivors.

C++ is by far the most difficult language to use and learn. (You can learn it in 6 months, but likely only superficially.) Yet many companies still pick it instead of java, python, ruby — a sign of strength.

C is low-level. C++ usage can be equally low-level, but c++ is more complicated than C.

compare%%GTD to single-employer java veterans

Me vs a java guy having only a single long-term java project, who has more zbs (mostly nlg) and GTD power, including

  • performance tuning in thread pool, DB interface, serialization
  • hot-swap and remote debugging
  • JMX
  • tuning java+DB integration

When it comes to QQ and coding test scores, the difference is more visible than it is with GTD/zbs.

Conclusion — over 10 years, your portable GTD power grows too slowly if you stick with one (or very few) system.

Am I advocating job hopping? Yes if you want to remain an individual contributor not aiming to move up.


Data Specialist #typical job spec

Hi friends,

I am curious about data scientist jobs, given my formal training in financial math and my (limited) work experience in data analysis.

I feel this role is a typical type — a generic “analyst” position in a finance-related firm, with some job functions related to … data (!):

  • some elementary statistics
  • some machine-learning
  • cloud infrastructure
  • some hadoop cluster
  • noSQL data store
  • some data lake
  • relational database query (or design)
  • some data aggregation
  • map-reduce with Hadoop or Spark or Storm
  • some data mining
  • some slice-n-dice
  • data cleansing on a relatively high amount of raw data
  • high-level python and R programming
  • reporting tools ranging from enterprise reporting to smaller desktop reporting software
  • spreadsheet data analysis — most end users still consider the spreadsheet the primary user interface

I feel these are indeed elements of data science, but even if we identify a job with 90% of these elements, it may not be a true blue data scientist job. Embarrassingly, I don’t have clear criteria for a real data scientist role (there are precise definitions out there), but I feel “big-data” and “data-analytics” are so vague and so full of hot air that many employers jump on the bandwagon and portray themselves as data science shops.

I worry that after I work on such a job for 2 years, I may not gain a lot of insight or add a lot of value.

———- Forwarded message ———-
Date: 22 May 2017 at 20:40
Subject: Data Specialist – Full Time Position in NYC

Data Specialist– Financial Services – NYC – Full Time

My client is an established financial services consulting company in NYC looking for a Data Specialist. You will be hands on in analyzing and drawing insight from close to 500,000 data points, as well as instrumental in developing best practices to improve the functionality of the data platform and overall capabilities. If you are interested please send an updated copy of your resume and let me know the best time and day to reach you.

Position Overview

As the Data Specialist, you will be tasked with delivering benchmarking and analytic products and services, improving our data and analytical capabilities, analyzing data to identify value-add trends and increasing the efficiency of our platform, a custom-built, SQL-based platform used to store, analyze, and deliver benchmarking data to internal and external constituents.

  • 3-5 years’ experience, financial services and/or payments knowledge is a plus
  • High proficiency in SQL programming
  • High proficiency in Python programming
  • High proficiency in Excel and other Microsoft Office suite products
  • Proficiency with report writing tools – Report Builder experience is a plus


churn !! bad ] mktData #socket,FIX,.. unexpected!

I feel the technology churn is remarkably low.

New low-level latency techniques are coming up frequently, but these topics are actually “shallow” and low complexity to the app developer.

  • epoll replacing select()? yes churn, but much less tragic than the stories with swing, perl, structs
  • most of the interview topics are unchanging
  • concurrency? not always needed. If needed, then often fairly simple.

rvalue Object holding a resource : rather rare

I think naturally-occurring rvalue objects rarely hold a resource.

  • literals — but these objects don’t hold any resources via a heap pointer
  • string1 + “.victor”
  • myInventoryLevel – 5000
  • myVector.push_back(Trade(12345)) — there is actually a temp Trade object. Compiler will call the rvr overload of push_back(). https://github.com/tiger40490/repo1/blob/cpp1/cpp/rvr/rvrDemo_NoCtor.cpp is my investigation. My temp object actually holds a resource via a heap pointer. But this usage scenario is rare in my opinion.

However, if you have a regular nonref variable Connection myConn(“hello”), you can generate an rvr variable:

Connection && rvr2 = std::move(myConn);

By using std::move(), you promise to the compiler not to use myConn object afterwards.
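The same promise applies to any movable type. A trivial c++ sketch with std::string:

#include <string>
#include <utility>

std::string s = "hello";
std::string t = std::move(s); // t steals s's internal buffer, if any
// s is now in a valid but unspecified state: assign to it or destroy it, nothing else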


#1 server-side language@WallSt: evolution

Don’t spend too much time.. Based on my limited observations,

  • As of 2007, the top dog was java.
  • The dominance is even stronger in 2018.
  • Q: how about in 10 years?
  • A: I feel java will remain #1

Look at the innovation leaders — West coast. For their (web) server side, they seem to have shifted slightly towards python, javascript, RoR

Q: Why do I consider buy-side, sell-side and other financial tech shops as a whole and why don’t I include google finance?
A: … because there’s mobility between sub-domains within, and entry barrier from outside.

Buy-side tend to use more c++; Banks usually favor java; Exchanges tend to use … both c++ and java. The latency advantage of c++ isn’t that significant to a major exchange like Nsdq.


CV-competition: Sg 10x tougher than U.S.

Sg is much harder, so … I’d better focus my CV effort on the Sg/HK/China market.

OK U.S. job market is not easy, but statistically, my CV had a reasonable hit rate (like 20% at least) because

  • contract employers don’t worry about my job hopper image
  • contract employers have quick decision making
  • some full time hiring managers are rather quick
  • age…
  • Finally, the number of jobs is so much higher than in Sg


socket^swing: separate(specialized skill)from core lang

  • I always believe swing is a distinct skill from core java. A regular core Java or jxee guy needs a few years’ experience to become a swing veteran.
  • Now I feel socket programming is similarly a distinct skill from core C/c++

In both cases, since the core language knowledge won’t extend to this specialized domain, you need to invest personal time outside work hours .. look at CSY. That’s why we need to be selective about which domain.

Socket domain has much better longevity (shelf-life) than swing!

learn new tech for IV(!!GTD): learn-on-the-job is far from enough

Example — you programmed java for 6+ months, but you scored below 50% on those (basic) java knowledge questions I asked you in skype chat. You only know what to study when you attend interviews. Without interviews, you won’t encounter those topics in your projects.

Example — I used SQL for at least 3 years before I joined Goldman Sachs. Until then I used no outer join, no self-join, no HAVING clause, no CASE, no correlated sub-query, no index tweaking. These topics were lightly used in Goldman but needed in interviews. So without interviews, I wouldn’t know to pay attention to these topics.

Example — I programmed tcp sockets many times. The socket interview questions I got from 2010 to 2016 were fairly basic. When I came to ICE I looked a bit deeper into our socket codebase but didn’t learn anything in particular. Then my interviews started showing me the direction. Among other things, interviewers look for in-depth understanding of

  • Blocking/non-blocking
  • Fast/slow receivers
  • Buffer overflow
  • Reliability
  • Ack

How the hell can we figure out that these are the high-value topics in TCP without interviews? I would say no way, even if I spent 2 years on this job.
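For instance, the blocking/non-blocking topic boils down to a few lines of standard fcntl() usage. A sketch:

#include <fcntl.h>

// switch an existing socket to non-blocking mode; recv() then returns -1
// with errno EWOULDBLOCK/EAGAIN instead of waiting for data
int makeNonBlocking(int sockfd) {
  int flags = fcntl(sockfd, F_GETFL, 0);
  if (flags < 0) return -1;
  return fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);
}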

coreJava^big-data java job #XR

In the late 2010’s, Wall street java jobs were informally categorized into core-java vs J2EE. Nowadays “J2EE” is replaced by “full-stack” and “big-data”.

The typical core java interview requirements have remained unchanged — collections, threading, JVM tuning, compiler details (including keywords, generics, overriding, reflection, serialization ), …, but relatively few add-on packages.

(With the notable exception of java collections) Those add-on packages are, by definition, not part of the “core” java language. The full-stack and big-data java jobs use plenty of add-on packages. It’s no surprise that these jobs pay on par with core-java jobs. More than 5 years ago J2EE jobs, too, used to pay on par with core-java jobs, and sometimes higher.

My long-standing preference for core-java rests on one observation — churn. The add-on packages tend to have a relatively short shelf-life. They become outdated and lose relevance. I remember some of the add-on packages:

  • Hadoop, Spark
  • functional java
  • SOAP, REST
  • GWT
  • NIO
  • Protobuf, json
  • Gemfire, Coherence, …
  • ajax integration
  • JDBC
  • Spring
  • Hibernate, iBatis
  • EJB
  • JMS, Tibco EMS, Solace …
  • XML-related packages (more than 10)
  • Servlet, JSP
  • JVM scripting including scala, groovy, jython, javascript@JVM… (I think none of them ever caught on outside one or two companies.)

None of them is absolutely necessary. I have seen many enterprise java systems using only one or two of these add-on packages.

Q: Just when do App (!! lib) devs write std::move()

I feel move ctor (and move-assignment) is extremely implicit and “in-the-fabric”. I don’t know of any common function with an rvr parameter. Such a function is usually in some library, but I don’t know any std lib function like that. Consequently, in my projects I have not seen any user-level code that calls “std::move(…)”.

Let’s look at move ctor. “In the fabric” means it’s mostly rather implicit i.e. invisible. Most of the time move ctor is picked by compiler based on some rules, and I have basically no influence over it.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp shows when I need to call move() but it’s a contrived example — I have some object (holding a resource via heap pointer), I use it once then I don’t need it any more, so I “move” its resource into a container and abandon the crippled object.

Conclusion — as app developers I seldom write code using std::move.

  • P20 [[c++ std lib]] shows myCollection.insert(std::move(x)); // where x is a local nonref variable, not a heap pointer!
    • in this case, we should provide a wrapper function over std::move() named getRobberAliasOf()
    • I think you do this only if x has part of its internal storage allocated on heap, and only if the type X has a move ctor.

I bet that most of the time when an app developer writes “move(…)”, she doesn’t know if the move ctor will actually get picked by compiler. Verification needed.

–Here’s one contrived example of app developer writing std::move:

string myStr=input;
vectorOfString.push_back(std::move(myStr)); //we promise to compiler we won’t use myStr any more.

Without std::move, a copy of myStr is constructed in the vector. I call this a contrived example because

  • if input is a char-array, then emplace_back() is more efficient (sketch below)
  • if input is another temp string, then we can simply use push_back(input)
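To spell out the first bullet, a sketch (demo() is my hypothetical wrapper):

#include <string>
#include <vector>

std::vector<std::string> v;

void demo(const char* input) {
  v.emplace_back(input); // constructs the string in place: no temporary, no move
  // vs. v.push_back(std::string(input)); which builds a temp string, then moves it in
}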

c++QQ/zbs Expertise: I got some

As stated repeatedly, c++ is the most complicated and biggest language used in industry, at least in terms of syntax (tooManyVariations) and QQ topics. Well, I have impressed many expert interviewers on my core-c++ language insight.

That means I must have some expertise in c++ QQ topics. For my c++ zbs growth, see separate blog posts.

Note socket, shared mem … are c++ ecosystem topics, like OS libraries.

Deepak, Shanyou, Dilip .. are not necessarily stronger. They know some c++ sub-domains better, and I know some c++ sub-domains better, in both QQ and zbs.

–Now some of the topics to motivate myself to study

  • malloc and relatives … internals
  • enable_if
  • email discussion with CSY on temp obj
  • UDP functions

how I achieved%%ComfortableEconomicProfile by44

(I want to keep this blog in recrec, not tanbinvest. I want to be brief yet incisive.)

See 3 ffree scenarios: cashflow figures. What capabilities enabled me to achieve my current Comfortable Economic profile?

  • — top 3
  • by earning SGP citizenships
  • by developing my own investment strategies, via trial-n-error
  • by staying healthy
  • — the obvious
  • by high saving rate thanks to salary + low burn rate — efficiency inspired by SG gov
  • by consistent body-building with in-demand skills -> job security. I think rather few of my peers have this level of job security. Most of them work in one company for years. They may be lucky when they need a new job, but they don’t have my inner confidence and level of control on that “luck”. Look at Y.W.Chen. He only developed that confidence/control after he changed job a few times.

When I say “Comfortable” I don’t mean “above-peers”, nor complete financial freedom, but rather … an easily affordable lifestyle without the modern-day pressure to work hard just to make a living. In my life there are still too many pressures to cope with, but I don’t need to work so damn hard trying to earn enough to make ends meet.

A higher salary or promotion is “extremely desirable” but not needed. I’m satisfied with what I have now.

I can basically retire comfortably.

python: value@ECT+syntax > deep insight

c++ interviews value deep insight more than interviews in any other language do. Java and c# interviews also value it highly, but not python interviews.

Reminder — zoom in and dig deep in c++, java and c# only. Don’t do that in python too much.

Instead of deep insight, accumulate ECT syntax … highly valued in TIMED coding tests.

Use brief blog posts with catchy titles

p2p messaging beats MOM ] low-latency trading

example — RTS exchange feed dissemination infrastructure uses raw TCP and UDP sockets and no MOM

example — the biggest sell-side equity OMS network uses MOM only for minor things (eg?). No MOM for market data. No MOM carrying FIX order messages. Between OMS nodes on the network, FIX over TCP is used

I read and recorded the same technique in 2009… in this blog

Q: why is this technique not used on west coast or main street ?
%%A: I feel on west coast throughput outweighs latency. MOM enhances throughput.

latency QQ ]WallSt IV #java,c++..

Latency knowledge

  • is never needed on the job but … high Market Value
  • is not GTD at all but … is part of zbs
  • is not needed in py jobs
  • is needed in many c++ interview topics but .. in java is concentrated in JIT and GC
  • is an elite skill but … many candidates try
  • some depth is needed for IV and other discussions but … these are relatively low-complexity topics #eg:GC/socket

mgr role obligation: relationships

A technical or contract role is less complicated, though relationships are also important and can make your life very stressful or relatively easy.

In ## 2 heaviest work stressors, I listed “figure-things-out” as a key stressor — if I’m reasonably fast on this front, then the relationships have limited impact on my stress level at work. Not true for a manager role — even if you get things done fast enough, relationships can still mess up your life.

  • the relationship with the immediate boss is most critical. I had many problems in the past.
  • relationship with other teams. Dependency means … stressful relationship
  • relationship with big bosses
  • relationship with subordinates can also become difficult. Shuo told me it was not for him, but I feel some managers depend on some key subordinates. Dependency means stress.
    • managing a non-performing subordinate … is not easy at all. I could see Kevin had headaches with me.
  • relationship with key business users. I feel Venkat (ICE) is under that pressure.

volume alone doesn’t make something big-data

The Oracle nosql book lists four “V”s that qualify a system as a big data system. I added my annotations:

  1. Volume
  2. Velocity
  3. Variety of data format — If any two data formats account for more than 99% of your data in your system, then it doesn’t meet this definition. For example, FIX is one format.
  4. Variability in value — Does the system treat each datum equally?

Most of the so-called big-data systems I have seen don’t have these four V’s. All of them have some volume but none has the Variety or the Variability.

I would venture to say that

  • 1% of the big-data systems today have all four V’s
  • 50%+ of the big-data systems have no Variety no Variability
    • 90% of financial big-data systems are probably in this category
  • 10% of the big-data systems have 3 of the 4 V’s

The reason that these systems are considered “big data” is the big-data technologies applied. You may call it “big data technologies applied on traditional data”

See #top 5 big-data technologies

Does my exchange data qualify? Definitely high volume and velocity, but no Variety or Variability.

%%c++keep crash` I keep grow`as hacker #zbs#Ashish

Note these are fairly portable zbs, more than local GTD know-how !

My current c++ project has high data volume, some business logic, some socket programming challenges, … and frequent crashes.

The truly enriching part is the crashes. Three months ago I was afraid of c++, largely because I was afraid of any crash.

Going back to 2015, I was also afraid of c++ build errors in VisualStudio and Makefiles, esp. those related to linkers and the One-Definition-Rule, but I overcame most of that fear in 2015-2016. In contrast, crashes are harder to fix because 70% of the crashes come with no usable clue. If there’s a core file I may not be able to locate it. If I locate it, it may not have symbols. If it has symbols, the crash site is usually in some classes unrelated to any classes that I wrote. I have since learned many lessons about how to handle these crashes:

  • I have a mental list like “10 common crash patterns” in my log
  • I have learned to focus on the 20% of my codebase that are most convoluted, most important, most tricky and contribute most to debugging difficulties. I then invest my time strategically to rewrite (parts of) that 20% and dramatically simplify them. I managed to get familiar and confident with that 20%.
    • If the code belongs to someone else including 3rd party, I try to rewrite it locally for my dev
  • I have learned to pick the most useful things to log, so they show a *pattern*. The crashes usually deviate from the patterns and are now easier to spot.
  • I have developed my binary data dumper to show me the raw market data received, which often “cause” crashes.
  • I have learned to use more assertions and a hell lot of other validations to confirm my program is not in some unexpected *state*. I might even overdo this and /leave no stone unturned/.
  • I figured out memset(), memcpy(), raw arrays are the most crash-prone constructs so I try to avoid them or at least build assertions around them.
  • I also figured out that signed integers can become negative and don’t make sense in my case, so I now use unsigned int exclusively. In hindsight I am not sure if this is best practice, but it removed some surprises and confusions.
  • I also gained quite a bit of debugger (gdb) hands-on experience

Most of these lessons I picked up in debugging program crashes, so these crashes are the most enriching experience. I believe other c++ programs (including my previous jobs) don’t crash so often. I used to (and still do) curse the fragile framework I’m using, but now I also recognize these crashes are accelerating my growth as a c++ developer.
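A sketch of the kind of validation I mean around memcpy (names are hypothetical):

#include <cassert>
#include <cstring>

// validate bounds before touching raw memory, so a bad size crashes
// right here at the true crime scene, not three stack frames later
void copyField(char* dest, std::size_t destCap, const char* src, std::size_t n) {
  assert(dest != nullptr && src != nullptr && n <= destCap);
  std::memcpy(dest, src, n);
}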

##fastest container choices: array of POD #or pre-sized vector

relevant to low-latency market data.

  • raw array is “lean and mean” — the most memory efficient; vector is very close, but we need to avoid reallocation (see the sketch after this list)
  • std::array is less popular but should offer similar performance to vector
  • all other containers are slower, with bigger footprint
  • For high-performance, avoid containers of node/pointer — cache affinity loves contiguous memory. After accessing the 1st element, accessing the 2nd element is likely a cache hit
    • set/map, linked list suffer the same
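A sketch of the pre-sized vector idea (Quote, book, onQuote are hypothetical names):

#include <cstddef>
#include <vector>

struct Quote { double px; int qty; }; // POD: contiguous and cache-friendly

std::vector<Quote> book;

void init(std::size_t expectedDepth) {
  book.reserve(expectedDepth); // one-time warm-up: the hot path never reallocates
}

void onQuote(const Quote& q) {
  book.push_back(q); // no allocation until capacity is exceeded
}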

c++11,sockets,IPC in QnA IV #Ashish

[18] Update — This has become one of the more valuable posts in my c++ blog. It cuts down the amount of info overload and learning-curve steepness by half, then again by half.

Hi Ashish,

I now have a little theory on the relative importance of several c++ tech skills in a job candidate. I feel all of the skills below are considered “secondary importance” by most of the (15 – 30) interviewers I have met. These skills are widely used in projects, but if we say “no experience” in them BUT demonstrate strength in core c++, then we win respect.

[e=c++ecosystem topics]
[c=core c++topics]

—#2 [e] c++threading —— many job descriptions say they use threading but it’s probably very simple threading. Threading is such a complex topic that only the well-reviewed, proven designs are safe to use. If a project team decides to invent their own concurrency design (I invented once), with any non-trivial deviation from the proven designs, they may unknowingly introduce bugs that may not surface for years. So the actual threading usage in any project is always minimal, simple and isolated to a very small number of files.

The fastest systems tend to have nothing shared-mutable, so multi-threading presents zero risk and requires no design. No locking no CAS. Essentially single-threaded.

However, we don’t have the luxury to say “limited experience” in threading. I have a fair amount of concurrency design experience across languages and thread libraries (pthreads, ObjectSpace, c#), using locks + condition variables + CAS as building blocks, but the c++ thread library used in another team is probably different. I used to say I used boost::thread a lot but it backfired.

—#99 [c] OO design patterns ——- never required in coding tests. In fact, basically no OO features show up in short coding tests. 90% of short coding tests are about GettingThingsDone, using algorithms and data structures.

Design patterns do show up in QnA interviews but are usually not difficult. I usually stick to a few familiar ones — Singleton, Factory, TemplateMethod, ProducerConsumer. I’m no expert with any of these but I feel very few candidates are. I feel most interviewers have a reasonable expectation. I stay away from complex patterns like Visitor and Bridge.
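For the record, the Singleton I would write on a whiteboard is the Meyers version, thread-safe since c++11 because the compiler synchronizes static-local initialization. A minimal sketch:

class Config {
  Config() = default; // private ctor: no outside construction
public:
  Config(const Config&) = delete; // no copies of the single instance
  Config& operator=(const Config&) = delete;
  static Config& instance() {
    static Config theOne; // initialized exactly once, even under concurrency (c++11)
    return theOne;
  }
};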

—#1 [c] c++11 —— is not yet widely used. Many financial jobs I applied to have old codebases they don’t want to upgrade. Most of the c++11 features we use as developers are optional convenience features. Some features are fundamental (decltype, constexpr …) yet treated as simple convenience features. I feel move semantics and r-value references are fairly deep, but these are really advanced features for library writers, not application developers. Beside shared_ptr, c++11 features seldom affect system design. I have “limited project experience using c++11”.

Interviewers often drill into move semantics. If I admit ignorance I could lose. Therefore, I’m actively pursuing the thin->thick->thin learning path.

—#5 [e] Boost —— is not really widely used. 70% of the financial companies I tried don’t use anything beyond shared_ptr. Most of the boost features are considered optional and high-level, rather than fundamental. If I tell them I only used shared_ptr and no other boost library, they will usually judge me on other fronts, without deducting points. I used many boost libraries but understand only shared_ptr well.

Q: Are there job candidates who are strong with some boost feature (beside shared_ptr) but weak on core c++?
A: I have not seen any.

Q: Are there programmers strong on core c++ but unfamiliar with boost?
A: I have seen many

—#4 [c] templates —— another advanced feature primarily for library developers. Really deep and complex. App developers don’t really need this level of wizardry. A few experienced c++ guys told me their teams each have a team member using fancy template meta-programming techniques that no one else could understand or maintain. None of my interviewers went very deep on this topic. I have only limited meta-programming experience, but I will focus on 2 common template techniques — CRTP and SFINAE — and try to build a good understanding (thin->thick->thin).

—#6 [c] IKM —— is far less important than we thought. I know many people who score very high but fail the tech interview badly. On the other hand, I believe some candidates score mediocre marks but impress the interviewers when they come on-site.

—#3 [e] linux system programming like sockets —— is relevant to low-latency firms, and I think half the c++ finance firms are in that category. Call them infrastructure team, or engineering team, or back-end team. I just happened to apply to too many of this kind. To them, socket knowledge is essential, but to the “mainstream” c++ teams, socket is non-essential. I have “modest socket experience” but could impress some interviewers.

(Actually socket api is not a c/c++ language feature, but a system library. Compared to STL, socket library has narrower usage.)

Messaging is a similar skill like sockets. There are many specific products so we don’t need to be familiar with each one.

InterProcessCommunication  (Shared memory, pipes… ) is a similar “C” skill to sockets but less widely required. I usually say my system uses sockets, database or flat files for IPC, though shared memory is probably faster. I hope interviewers don’t mind that. If there’s some IPC code in their system, it’s likely isolated and encapsulated (even more encapsulated than threading code), so hopefully most developers don’t need to touch it

Memory Manager is another specialized skill just like IPC. Always isolated in some low-level module in a library, and seldom touched. A job spec may require it, but actually few people have the experience, and it’s never a real show stopper.

If a role requires heavy network programming (heavy IPC or messaging development is less common) but we have limited experience, then it can be a show-stopper.

— #99[c] A) instrumentation and B) advanced build tools for memory, dependency etc … are big paper tigers. They can be open-source or commercial. Often useful for GTD, but some interviewers ask about these tools only to check your real-world experience. 99% of c++ programmers have only superficial knowledge, because we don’t need to understand the internals of GPS to use it in a car.

Note basic build-chain knowledge is necessary for GTD but remains relatively “basic” and is seldom quizzed.

—————–
These topics are never tested in short coding questions, and seldom in longer coding questions.

What c++ knowledge is valued more highly? See ##20 C++territories for QQ IV

As CTO, I’d favor transparent langs, wary of outside libraries

If I were to propose a system rewrite, or start a new system from scratch without constraints like legacy code, then Transparency (+instrumentation) is my #1 priority.

  • c++ is the most opaque. Just look at complex declarations, or linker rules, the ODR…
  • I feel more confident debugging java. The JVM is remarkably well-behaving (better than CLR), consistent, well discussed on-line
  • key example — the SOAP stub/skeleton hurts transparency, and so do the AOP proxies. These semi-transparent proxies are not like regular code you can edit and play with in a sandbox
  • windows is more murky than linux
  • There are many open-source libraries for java, c++, py etc, but many of them affect transparency. I think someone like Piroz may say Spring is a transparent library
  • SQL is transparent except performance tuning
  • Based on my 1990’s experience, I feel javascript is transparent but I could be wrong.
  • I feel py, perl are still more transparent than most compiled languages. They too can become less transparent, when the codebase grows. (In contrast, even small c++ systems can be opaque.)

This letter is about which language, but allow me to digress briefly. For data store and messaging format (both require serialization), I prefer the slightly verbose but hugely transparent solutions, like FIX, CTF, json (xml is too verbose) rather than protobuf. Remember 99% of the sites use only strings, numbers, datetimes, and very rarely involve audio/visual data.

contractor^mgr^low-VP 3way-compare #XR

See also hands-on dev beats mgr @same pay and my discussion with Youwei in pureDev beats techLead: paycut=OK #CYW

In the U.S. context, I feel the FTE developer position is, on average, least appealing, though some do earn a lot, such as some quant developers. My very rough ranking of total income is

  1. senior mgr
  2. contractor
  3. FTE-dev including entry-level lead roles

Without bonus, the FTE-dev is often lowest. However, bonus is not guaranteed.

I exclude the pure quants (or medical doctors) as a different profession from IT.


edit 1 file in big python^c++ production system #XR

Q1: suppose you work in a big, complex system with 1000 source files, all in python, and you know a change to a single file will only affect one module, not a core module. You have tested it + run a 60-minute automated unit test suite. You didn’t run a prolonged integration test that’s part of the department-level full release. Would you and the approving managers have the confidence to release this single python file?
A: yes

Q2: change “python” to c++ (or java or c#). You already followed the routine to build your change into a dynamic library, tested it thoroughly and ran the unit test suite but not the full integration test. Do you feel safe to release this library?
A: no.

Assumption: the automated tests were reasonably well written. I never worked in a team with measured test coverage. I would guess 50% is too high and often impractical. Even with high measured test coverage, the risk of bugs is roughly the same. I have never believed higher unit test coverage is a vaccination. Diminishing returns. Low marginal benefit.

Why the difference between Q1 and Q2?

One reason — the source file is compiled into a library (or a jar), along with many other source files. This library is now a big component of the system, rather than one of 1000 python files. The managers will see a library change in c++ (or java) vs a single-file change in python.

Q3: what if the change is to a single shell script, used for start/stop the system?
A: yes. Manager can see the impact is small and isolated. The unit of release is clearly a single file, not a library.

Q4: what if the change is to a stored proc? You have tested it and run the full unit test suite but not a full integration test. Will you release this single stored proc?
A: yes. One reason is transparency of the change. Managers can understand this is an isolated change, rather than a library change as in the c++ case.

How do managers (and anyone except yourself) actually visualize the amount of code change?

  • With python, it’s a single file so they can use “diff”.
  • With stored proc, it’s a single proc. In the source control, they can diff this single proc. Unit of release is traditionally a single proc.
  • with c++ or java, the unit of release is a library. What if in this new build, beside your change there’s some other change, included by accident? You can’t diff a binary 😦

So I feel transparency is the first reason. Transparency of the change gives everyone (not just yourself) confidence about the size/scope of this change.

Second reason is isolation. I feel a compiled language (esp. c++) is more “fragile” and the binary modules more “coupled” and inter-dependent. When you change one source file and release it in a new library build, it could lead to subtle, intermittent concurrency issues or memory leaks in another module, outside your library. Even if you as the author sees evidence that this won’t happen, other people have seen innocent one-line changes giving rise to bugs, so they have reason to worry.

  • All 1000 files (in compiled form) run in one process for a c++ or java system.
  • A stored proc change could affect DB performance, but it’s easy to verify. A stored proc won’t introduce subtle problems in an unrelated module.
  • A top-level python script runs in its own process. A python module runs in the host process of the top-level script, but a typical top-level script will include just a few custom modules, not 1000 modules. Much better isolation at run time.

There might be python systems where the main script actually runs in a process with hundreds of custom modules (not counting the standard library modules). I have not seen it.

effi^instrumentation ] new project

I always prioritize instrumentation over effi/productivity/GTD.

A peer could be faster than me in the beginning but if she lacks instrumentation skill with the local code base there will be more and more tasks that she can’t solve without luck.

In reality, many tasks can be done with superficial “insight”, without instrumentation, with old-timer’s help, or with lucky search in the log.

What if the developer had not added that logging? You are dependent on that developer.

I could be slow in the beginning, but once I build up (over x months) real instrumentation insight I will be more powerful than my peers, including some old timers. I think the Stirt-tech London team guru (John) was such a guy.

In reality, even though I prioritize instrumentation it’s rare to make visible progress building instrumentation insight.

C for latency^^TPS can use java

I’m 98% confident — low latency favors C/C++ over java [1]. FPGA is _possibly_ even faster.

I’m 80% confident — throughput (in real time data processing) is achievable in C, java, optimized python (Facebook?), optimized php (Yahoo?) or even a batch program. When you need to scale out, Java seems the #1 popular choice as of 2017. Most of the big data solutions seem to put java as the first among equals.

In the “max throughput” context, I believe the critical java code path is optimized to the same efficiency as C. JIT can achieve that. A python or php module can achieve that too, perhaps via native extensions.

[1] Actually, java bytecode can run faster than compiled C code (See my other posts such as https://bintanvictor.wordpress.com/2017/03/20/how-might-jvm-beat-cperformance/)

[17] 5 unusual tips@initial GTD

See also https://bintanvictor.wordpress.com/wp-admin/edit.php?s&post_status=all&post_type=post&action=-1&m=0&cat=560907660&filter_action=Filter&paged=1&action2=-1

* build up instrumentation toolset
* Burn weekends, but first … build momentum and foundation including the “instrumentation” detailed earlier
* control distractions — parenting, housing, personal investment, … I didn’t have these in my younger years. I feel they take up O2 and also sap the momentum.
* Focus on output that’s visible to boss, that your colleagues could also finish so you have nowhere to hide. Clone if you need to. CSDoctor told me to buy time so later you can rework “under the hood” like quality or design

–secondary suggestions:
* Limit the amount of “irrelevant” questions/research, when you notice they are taking up your O2 or dispersing the laser. Perhaps delay them.

Inevitably, this analysis relies on past work experiences. Productivity (aka GTD) is a subjective, elastic yardstick. #1 most important is the GTD rating by your boss. It sinks deep… #2 is self-rating https://bintanvictor.wordpress.com/2016/08/09/productivity-track-record/

## low-complexity QQ topics #JGC/parser..

java GC is an example of “low-complexity domain”. Isolated knowledge pearls. (Complexity would be high if you delve into the implementation.)

Other examples

  • FIX? slightly more complex when you need to debug source code. java GC has no “source code” for us.
  • socket programming? conceptually, a relatively small number of variations and combinations. But when I get into a big project I am likely to see its true colors.
  • stateless feed parser coded against an exchange spec

2H life-changing xp#Pimco#income,home location,industry…

Here’s a real story in 2010 — I was completely hopeless and stuck in despair after my Goldman Sachs internal transfer was blocked in the last stage. I considered moving my whole family back to Singapore without any offer, and starting my job search there. I was seriously considering a S$100k back office batch programming job. Absolutely the lowest point in my entire career. After licking the wound for 2 months, tentatively I started looking for jobs outside Goldman and slowly found my foothold. Then in early 2010, I passed a phone screening and attended a Citigroup “superday”. I spent half an hour each with 3 interviewers. By the end of the day, the recruiter said I was the #1 pick. I took the offer, at an 80% increment. In the next 12 months, I built up my track record + knowledge in

  1. real time trading engine components, esp. real time pricing engine
  2. fixed income math,
  3. c++ (knowledge rebuild)

I have never looked back since. Fair to say that my family wouldn’t be where we are today without this Citigroup experience. With this track record I was able to take on relatively high-end programming jobs in U.S. and Singapore. I was able to live in a convenient location, buy properties and send my kids to mid-range preschools (too pricey in hindsight). Obviously I wanted this kind of job even in 2009. That dream became reality when I passed the superday interview. That interview was one of the turning points in my career.

Fast forward to Apr 2017 — I had a 20-minute phone interview with the world’s biggest asset management firm (Let’s call it PP), then I had a 2-hour skype interview. They made an offer. I discussed with my recruiter their proposal —

  • I would relocate to California
  • I would get paid around 200k pretax, possibly with an increment in 6 months. PP usually increases the billing rate after 12 months if the contractor does well.
  • recruitment agency CEO said he would transfer my visa and sponsor green card.

If I were to take this offer, my life would be transformed. (I would also have a better chance to break into the high tech industry in nearby silicon valley, because I would have local friends in that domain.) Such a big change in my life is now possible because … I did well [1] in the interview.

Stripped to the core, that’s the reality in our world of contract programmers. Project delivery, debugging, and the relationship with your boss can get you promoted, but those on-the-job efforts have much lower impact than your performance during an interview. Like an NBA playoff match. A short few hours under the spotlight can change your life forever.

This is not a rare experience. There are higher-paying contract job offers that could “change your life”, and you only need to do well in the interviews to make it happen.

I feel this is typical of the U.S. market and perhaps London. In Singapore, contract roles can’t pay this much. A permanent role has a few subtle implications, so I feel it’s a different game.

[1] The 7 interviewers felt I was strong in c++ (not really), java and sql, and competent in fixed income math (I only worked with it for a year). Unlike other high-end interviews, there are not many tough tech questions like threading, algorithms, or coding tests. I feel they liked my interview mostly because of the combination of c++/java/fixed income math — not a common combination.

big data is!! fad; big-data technologies might be

(blogging)

My working definition — big data is the challenges and opportunities presented by the large volume of disparate (often unstructured) data.

For decades, this data has always been growing. What changed?

* One recent change in the last 10 years or so is data processing technology. As an analogy, oil sand has been known for quite a while but the extraction technology slowly improved to become commercially viable.

* Another recent change is social media, creating lots of user-generated content. I believe this data volume is a fraction of the machine-generated data, but it’s more rich and less structured.

Many people see opportunities to make use of this data. I feel the potential usefulness of this data is somewhat /overblown/, largely due to aggressive marketing. As a comparison, consider location data from satellites and cellular networks — useful but not life-changing useful.

The current crop of big data technologies are even more hyped. I remember XML, Bluetooth, pen computing, optical fiber .. they also had their prime times under the spotlight. I feel none of them lived up to the promise (or the hype).

What are the technologies related to big data? I only know a few — NOSQL, inexpensive data grid, Hadoop, machine learning, statistical/mathematical python, R, cloud, data mining technologies, data warehouse technologies…

Many of these technologies had real, validated value propositions before big data. I tend to think they will confirm and prove those original value propositions in 30 years, after the fads have long passed.

As an “investor” I have a job duty to try and spot overvalued, overhyped, high-churn technologies, so I ask

Q: Will Hadoop (or another in the list) become more widely used (therefore more valuable) in 10 years, as newer technologies come and go? I’m not sure.

http://www.b-eye-network.com/view/17017 is a concise comparison of big data and data warehouse, written by a leading expert of data warehouse.

[09]%%design priorities as arch/CTO

Priorities depend on industry, target users and managers’ experience/preference… Here are my Real answers:

A: instrumentation (non-opaque) — #1 priority to an early-stage developer, not to a CTO.

Intermediate data store (even binary) is great — files; reliable[1] snoop/capture; MOM

[1] seldom reliable, due to the inherent nature — logging/capture, even error messages are easily suppressed.

A: predictability — #2 (I don’t prefer the word “reliability”.) related to instrumentation. I hate opaque surprises and intermittent errors like

  • GMDS green/red LED
  • SSL in Guardian
  • thick, opaque libraries like Spring
  1. Database is rock-solid predictable.
  2. javascript was predictable in my pre-2000 experience
  3. automation Scripts are often more predictable, but advanced python is not.

(bold answers are good interview answers.)
A: separation of concern, encapsulation.
* any team dev needs task breakdown. PWM tech department consists of teams supporting their own systems, which talk to each other on an agreed interface.
* Use proc and views to allow data source internal change without breaking data users (RW)
* ftp, mq, web service, ssh calls, emails between departments
* stable interfaces. Each module’s internals are changeable without breaking client code
* in GS, any change in any module must be done along with other modules’ checkout, otherwise that single release may impact other modules unexpectedly.

A: prod support and easy to learn?
* less support => more dev.
* easy to reproduce prod issues in QA
* easy to debug
* audit trail
* easy to recover
* fail-safe
* rerunnable

A: extensible and configurable? It often adds complexity and workload. Probably the #1 priority among managers i know on wall st. It’s all about predicting what features users might add.

How about time-to-market? Without testability, changes take longer to regression-test? That’s pure theory. In trading systems, there’s seldom automated regression testing.

A: testability. I think Chad also liked this a lot. Automated tests are less important to Wall St than other industries.

* each team’s system to be verifiable to help isolate production issues.
* testable interfaces between components. Each interface is relatively easy to test.

A: performance — always one of the most important factors if our system is ever benchmarked in a competition. Benchmark statistics are circulated to everyone.

A: scalability — often needs to be an early design goal.

A: self-service by users? reduce support workload.
* data accessible (R/W) online to authorized users.

A: show strategic improvement to higher management and users. This is how to gain visibility and promotion.

How about data volume? important to eq/fx market data feed, low latency, Google, facebook … but not to my systems so far.

DB=%% favorite data store due to instrumentation

The noSQL products all provide some GUI/query, but not very good ones. Piroz had to write a web GUI to show the content of gemfire. Without the GUI it’s very hard to manage anything that’s built on gemfire.

As data stores, even binary files are valuable.

Note snoop/capture is no data-store, but falls in the same category as logging. They are easily suppressed, including critical error messages.

Why is RDBMS my #1 pick? ACID requires every datum to be persistent/durable, therefore viewable from any 3rd-party app, so we aren’t dependent on the writer application.

[17]FASTEST muscle-growth=b4/af job changes]U.S.

I now recall that my muscle-building and, to a lesser extent, zbs growth are clearly fastest in the 3 months around each job change. I get frequent interviews and positive feedback. This is a key (subconscious) reason why I prefer contracting even at a lower salary. I get the kick each time I change job.

My blogging activity shows the growth…

  • #1 factor … positive feedback from real offers from good companies.
  • #2 factor — I actually feel real zbs growth, though it tends to be less strategic in hindsight.
  • factor — on a new job, I am curious to learn things I have wanted to learn like Xaml, FIX, Tibco, kdb, SecDB, multicast, orderbook, curve building

Beside the months immediately b4/af job change, I also experienced significant growth in

No such environment in Singapore:(

##tough n high-leverage c++topics#IV QQ

I used to feel I have so much learning(absorption) capacity, but now I feel in my finite career I can’t really master and remember all the tough c++ topics.

Practical solution — Classify each difficult c++topic into one of

  1. QQ: high impact on QnA interview, probably the only type of high-leverage tough topic. Largely textbook knowledge. As such I’m basically confident I can learn all the basics on my own (perhaps slower than Venkat), provided I know the topics.
    1. including “coding questions” designed really for knowledge test, rather than algorithm thinking
    2. eg: HFT, Venkat…
  2. high impact on algo coding IV? No such topic. These coding IV are not about knowledge in tough topics!
  3. ZZ: high impact on GTD zbs — inevitably Low leverage during job hunt
  4. 00: no high impact on anything

Q: Is there a tough topic in both QQ and ZZ? I don’t know any.

I think 00 will be the biggest category:

  • [00] template wizardry;
  • [00] template specialization
  • [00] MI;
  • [00] operator overloading;
  • [00] pthread
  • ————-
  • [QQ] move semantics
  • [QQ] [p] boost common lib
  • [QQ] optimization tricks. Remember MIAX and SCB IV by Dmitry
  • [QQ] [p] singleton implementation — not really tough
  • [QQ] pimpl — not really tough
  • [QQ] op-new/malloc (interacting with ctor)
  • [QQ] memory layout
  • [QQ] [p] struct alignment
  • [QQ] specific threading constructs
  • [QQ] smart ptr details
  • [QQ] ptr as a field
  • [QQ] implement auto_ptr or ref-counting string
  • [QQ] [p] UDP —
  • [QQ] [p] multicast
  • [QQ] select()
  • [ZZ] IDE set-up
  • [ZZ] compiler/linker/makefile details
  • [ZZ] debuggers
  • [ZZ] crash analysis, like segmentation fault
  • [ZZ] c^c++:SAME debugging/tracing/instrumentation skills #ZZ

[p=paper tiger. ]

##[18] Algo problem solving as hobby@@

  • how does this compare to board game?
  • how does this compare to jigsaw puzzles
  • how does this compare to small mobile app development as a hobby?
    • small mobile game development
  • how does this compare to photography as a hobby?
  • how does this compare to blogging as a hobby
  • how does this compare with DIY home improvement
    • woodwork
  • how does this compare to auto tuning and bike building
  • how does this compare with hi-fi tuning

Every one of them is better than TV, gaming, sight-seeing or drinking. Each one requires consistent effort and sustained focus. Each one becomes more frustrating and less exciting before you reach the next level.

##high complexity high mkt-value specializations

Opening example 1: Quartz — High complexity. Zero market value as the deep insight gained is decidedly local and won’t make you a stronger developer on another team.

Opening example 2: IDE, Maven, git, Unix + shell scripting — modest complexity; Makes me stronger developer in real projects, but no premium on the labor market.

My best experiences on Wall St — tech skills with high market value + complexity high enough that few developers could master them. On projects involving these I get a better lifestyle and lower stress… Examples:

  • threading
  • java collections
  • SQL complex queries + stored proc. Declining demand in high-end jobs?
  • SQL tuning
  • MOM-based, high volume system implementation — reasonable complexity and market value, but not mainstream. Mostly used in trading only 😦
  • pricing math — high market value but too specialized 😦
  • trading algorithms, price distribution, … Specialized 😦

Let’s look at a few other tech skills:

  • c++ build automation — modest complexity; low value
  • c++ low latency — high value;  barrier too high for me 😦
  • java reflection, serialization — high complexity high practical value, but market value is questionable 😦
  • .NET — some part can be high complexity, but demand is a bit lower than 2011 😦
  • Java tuning — high complexity; not high value practically
  • python — modest complexity, growing market value
  • PHP — lower complexity and lower market value than py, IMHO

retreat to raw ptr from smart ptr ASAP

Raw ptr is in the fabric of C. Raw pointers interact/integrate with countless parts of the language in complex ways. Smart pointers are advertised as drop-in replacements but that advertisement may not cover all of those “interactions”:

  • double ptr
  • new/delete/free
  • ptr/ref layering
  • ptr to function
  • ptr to field
  • 2D array
  • array of ptr
  • ptr arithmetic
  • compare to NULL
  • ptr to const — smart ptr to const should be fine
  • “this” ptr
  • factory returning ptr — can it return a smart ptr?
  • address of ptr object

Personal suggestion (unconventional) — stick to the known best practices of smart ptr (such as storing them in containers). In all other situations, do not treat them as drop-in replacements; retrieve and use the raw ptr, as sketched below.
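
A minimal sketch of that suggestion, assuming a legacy function (hypothetical name legacyPrint) that only understands raw pointers:

#include <cstdio>
#include <memory>

// hypothetical legacy API that knows nothing about smart pointers
void legacyPrint(const int* p) { if (p != nullptr) std::printf("%d\n", *p); }

int main() {
    std::shared_ptr<int> sp = std::make_shared<int>(42); // smart ptr owns the heap object
    legacyPrint(sp.get()); // retrieve the raw ptr for the "interaction"
    int* raw = sp.get();   // compare to nullptr, take its address etc. on the raw ptr
    // but never delete raw: ownership stays with sp
    return 0;
}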

[16]python: routine^complex tasks #XR

XR,

Further to our discussion: I used perl for many years. 95% of my perl tasks were routine tasks. With py, I would say a majority of my tasks are routine tasks, i.e. solutions are easy to find on-line.

  • routine tasks include automated testing, shell-script replacement, text file processing, query XML, query various data stores, query via http post/get, small-scale code generation, simple tcp client/server.
  • For “complex tasks”, at least some part is tricky and not easily solved by Googling. Routine reflection / concurrency / c++Integration / importation … are documented widely, with sample code, but these techniques can be pushed to the limit.
    • Even if we use these techniques exactly as documented, combining them in unusual ways means a Google search will not be enough.
    • Beware — decorators, meta-programming, polymorphism, on-the-fly code-generation, serialization, remote procedure call … all rely on reflection.

When you say py is not as easy as xxx and takes several years to learn, I think you referred to complex tasks.

It’s quite impressive to see that some powerful and useful functionalities can be easily implemented by following online tutorials. By definition these are routine tasks. One example that jumps out at me is reflection, in any language. Python reflection can be powerful like a power drill breaking a stone wall. Without such a power drill the technical challenge can look daunting. I guess metaclass is one such power drill. Decorator is another power drill; I used it in %%logging decorator with optional args

I can see a few reasons why managers choose py over java for certain tasks. I heard there are a few jvm-based scripting languages (scala, groovy, clojure, jython …) but I guess python beats them on several fronts including more packages (i.e. wheels) and more mature, more complete and proven solutions, familiarity, reliability + wider user base.

One common argument to prefer any scripting language over any compiled language is faster development. True for routine tasks. For complex tasks, “your mileage may vary”. As I said, if the software system requirement is inherently complex, then implementation in any language will be complex. When the task is complex, I actually prefer more verbose source code — possibly more transparent and less “opaque”.

Quartz is one example of a big opaque system for a complex task. If you want, I can describe some of the complex tasks (in py) I have come across though I don’t have the level of insight that some colleagues have.

When you said the python debugger was less useful to you than java debugger, it’s a sign of java’s transparency. My “favorite” opaque parts of py are module import and reflection.

If any language has excellent performance/stability + good on-line resources [1] + a reasonable library of components comparable to the mature languages like Java/c++, then I feel sooner or later it will catch on. I feel python doesn’t quite have the performance. In contrast, I think php and javascript can achieve very high performance in their respective usage domains.

[1] documentation is nice-to-have but not sufficient. Many programmers don’t have time to read documentation in-depth.

[16]standard practice around q[delete]

See also post about MSDN overview on mem mgmt…

  • “delete” risk: too early -> stray pointer
  • “delete” risk: too “late” -> leak. You will see steadily growing memory usage
  • “delete” risk: too many times -> double free
  • … these risks are resolved in java and dotnet.

For all simple scenarios, I feel the default and standard idiom to manage the delete is RAII. This means the delete is inside a dtor, which in turn means there’s a wrapper class, perhaps some smart ptr class (see the sketch after this list).

It also means we create stack instances of
– the wrapper class
– a container holding the wrapper objects
– an umbrella holding the wrapper object
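
A bare-bones illustration of that idiom, with an invented Wrapper class:

// the delete lives inside the dtor of a wrapper class
class Wrapper {
    int* payload; // heap resource
public:
    Wrapper() : payload(new int(0)) {}
    ~Wrapper() { delete payload; } // delete runs exactly once: never too early, too late, or twice
    Wrapper(const Wrapper&) = delete;            // forbid copying, else double free
    Wrapper& operator=(const Wrapper&) = delete;
};

void f() {
    Wrapper w; // stack instance of the wrapper class
} // dtor fires here, even if f() exits via an exception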

Should every departure/deviation be documented?

I feel it’s not the best idea to pass a pointer into some cleanup function which deletes the passed-in pointer. What if the pointee is already deleted, or still /in active service/, or non-heap, or embedded in a container or umbrella object? See P52 [[safe c++]]

prefer for(;;)+break: cod`IV

For me at least, the sequencing of the 3-piece for-loop is sometimes trickier than I expect. The rules are supposedly simple, but I don’t get them exactly right every time. Can you always intuitively answer these simple questions? (Answers scattered.)

A87: ALWAYS absolutely nothing
A29: many statements; the entire loop body runs between them.

Q1: how many times (minimum, maximum) does the #1 piece execute?
Q2: how many times (minimum, maximum) does the #2 piece execute?
Q3: how many times (minimum, maximum) does the #3 piece execute?
Q: Does the number in A2 always exceed the number in A3, or the reverse, or is there no always-rule?
Q29: what might happen between #2 and #3 statements?
Q30: what might happen between #3 and #2? I feel nothing could happen.
Q87: what might happen between #1 and #2 statements?
Q: what’s the very last statement (one of the 3 pieces, or something in the loop body) executed before loop exit? Is it an “always” rule?

If there’s a q(continue), then things get less intuitive. http://stackoverflow.com/questions/16598222/why-is-continue-statement-ignoring-the-loop-counter-increment-in-while-loop explains the subtle difference between while-loop vs for-loop when you use “continue”.

In contrast, while-loop is explicit. So is do-while. In projects, for-loop is concise and often more expressive. In coding interviews, conditions are seldom perfect, simple and straightforward, so for-loop is error prone. White-board coding IV (perhaps bbg too) is all about nitty-gritty details. The condition hidden in the for-loop is not explicit enough! I would rather use for(;;) and check the condition inside and break.

The least error-prone is for(;;) with breaks. I guess some coding interviewers may not like it, but the more tricky the algorithm is, the more we appreciate the simplicity of this coding style.

It is always safe to start your coding interview with a for(;;) loop and carefully add to the header later. You can still have increments and break/continue inside.
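
For example, a hypothetical linear search written in this style:

// for(;;) + break: every condition is explicit inside the loop body
int findTarget(const int* arr, int n, int target) {
    int i = 0;
    for (;;) {
        if (i >= n) break;          // exit condition, spelled out
        if (arr[i] == target) return i;
        ++i;                        // the increment can't be silently skipped
    }
    return -1; // not found
}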

low-churn professions often pay lower#le2Henry

category – skillist, gzThreat

I blogged about several slow-changing professions — medical, civil engineers, network engineers, teachers, quants, academic researchers, accountants (including financial controllers in banks).

My overall impression is, with notable exceptions, many of the slow-changing domains don’t pay so well. We will restrict ourselves to white-collar, knowledge intensive professions.

Sometime between 2013 and 2015, a tech author commented that, compared to the communities around the newer languages (javascript, ruby, objective-C etc), java programmers are a more traditional, more mature, more stable, more enterprise-oriented community.

https://bintanvictor.wordpress.com/2014/11/03/technology-churn-ccjava-letter-to-many/ is my comparison of java, c#, c++. Basically I’m after the rare combination of

– mainstream,
– sustained, robust demand over 15 to 30 years
– low churn

Someone mentioned entry barrier. A valuable feature, but I think it is neither a necessary nor a sufficient condition.

SQL and shell scripting are good examples. Very low churn; robust demand, mainstream. Salary isn’t highest, but decent.

[12]too many add-on packages piling up ] java^C++

(blogging) Biggest problem facing a new or intermediate java developer — too much new “stuff”, created by open source or commercial developers. Software re-usability? Software Component industry?…

Some job candidates are better able to articulate these — an advantage. On the real job, I don’t feel a developer needs to know so many java utilities (architects perhaps?)

More than 3 C++ developers told me they prefer c++ over java for this reason. They told me that about the only add-on library they use is STL. Everything else is part of the core language. Some of them tell me in their trading/finance systems, other libraries are less used than STL — smart pointers + a few boost modules + some threading library such as pthreads. In contrast, I can sense they feel a modern day java system requires so many add-on items that it looks daunting and overwhelming.

The most influential books on c++ were written in the early 90’s (or earlier?)… Bottom line — If you know core language + STL you qualify for c++ jobs today. By the way, you don’t need deep expertise in template meta-programming or multiple inheritance as these are rarely used in practice.

In contrast, Java has many core (and some low-level add-on) components kept stable — such as the memory model and core multi-threading, basic collections, RMI, serialization, bytecode instrumentation, reflection, JNI … This should in theory reassure developers and job seekers. In reality, on the java (job) market the stable core/infrastructure/fundamentals are overshadowed and drowned out by the (noisy) new add-on libraries such as spring, hibernate, JSF, gwt, ejb, rich faces …

I feel the java infrastructure technologies are more important to a java app (also to a java project or a java team), but I continually see the hiring side asking for x years of hands-on experience with this or that flavor-of-the-month add-on gadget. Are any of these packages in the core language layers? I don’t feel very sure.

(I feel some are — cglib, leak detectors… but these aren’t in job specs….)

I suspect many hiring managers don’t care about those extra keywords and just want to hire strong fundamentals, but they are forced to add the flavor-of-the-month keywords to attract talent. Both sides assume the hot new things are attractive to the other side, so both want to “flash” the new gadgets.

Whatever the motivation, the result is a lot of new add-on gadgets we developers are basically forced to learn. “Keep running or perish.” — it’s tiring.

op-new : no DCBC rule

B’s op-new is bypassed by D’s op-new [1].
B’s ctor is always used (never bypassed) by D’s ctor.

This is an interesting difference.

Similarly, an umbrella class’s op-new [1] would not call a member object’s op-new. See [[more effC++]]

These issues are real concerns if you want to use op-new to prohibit heap instantiation of your class.

See http://bigblog.tanbin.com/2012/01/dcbc-dtor-execution-order.html

[1] provided these two classes each define an op-new()

By the way, op-new is a static member operator, but is still inherited.
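
A sketch of the first point (class names invented; error handling omitted):

#include <cstddef>
#include <cstdio>
#include <cstdlib>

struct B {
    static void* operator new(std::size_t sz) {
        std::printf("B::operator new\n");
        return std::malloc(sz);
    }
    static void operator delete(void* p) { std::free(p); }
    B() { std::printf("B ctor\n"); }
    virtual ~B() {}
};

struct D : B {
    static void* operator new(std::size_t sz) { // hides, i.e. bypasses, B's op-new
        std::printf("D::operator new\n");
        return std::malloc(sz);
    }
};

int main() {
    B* p = new D; // prints "D::operator new" then "B ctor": B's op-new bypassed, B's ctor not
    delete p;     // D inherits B::operator delete, since op-new/op-delete are inherited statics
}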

mv-semantic: keywords

I feel all the tutorials miss some important details and sell a kind of propaganda. Maybe [[c++ recipes]] is better?

[s = I believe std::string is a good illustration of this keyword]

  • [s] allocation – mv-semantic efficiently avoids memory allocation on heap or on stack
  • [s] resource — is usually allocated on heap and accessed via a pointer field
  • [s] pointer field – every tutorial shows a class with a pointer field. Note a reference field is much less common.
  • [s] deep-copy – is traditional. Mv-semantics uses some special form of shallow-copy. Has to be carefully managed.
  • [s] temp – the RHS of mv-semantic must strictly be a temp object. I believe by using the move() function and the r-val reference (RVR) we promise to the compiler not to access the temp object afterwards. If we access it, I guess bad things could happen, similar to UndefBehv? See [[c++standard library]]
  • promise – see above
  • containers – All standard STL container classes (including std::string) provide mv-semantics. Here, the entire container instance is the payload! Inserting a float into a container won’t need mv-semantics.
  • [s] expensive — allocation and copying assumed expensive. If not expensive, then the move is not worthwhile.
  • [s] robbed — the source object of the move is crippled, robbed, abandoned and should not be used afterwards. Its “resource” is already stolen, so the pointer field to that resource should be set to NULL.

——–
http://www.boost.org/doc/libs/1_59_0/doc/html/move/implementing_movable_classes.html says “Many aspects of move semantics can be emulated for compilers not supporting rvalue references and Boost.Move offers tools for that purpose.” I think this sheds light…
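
A skeletal string class to put the keywords above in code. Just a sketch, not production quality:

#include <cstring>
#include <utility>

class MyString {
    char* resource; // [s] pointer field to a heap-allocated resource
public:
    explicit MyString(const char* s) : resource(new char[std::strlen(s) + 1]) {
        std::strcpy(resource, s); // traditional deep-copy, assumed expensive
    }
    MyString(MyString&& temp) noexcept : resource(temp.resource) { // special shallow-copy
        temp.resource = nullptr; // the robbed source must not keep the stolen resource
    }
    ~MyString() { delete[] resource; } // safe even after the rob: delete[] nullptr is a no-op
    MyString(const MyString&) = delete; // copy omitted to keep the sketch minimal
    MyString& operator=(const MyString&) = delete;
};

int main() {
    MyString a("hello");
    MyString b(std::move(a)); // we promise not to use a afterwards
}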

math power tools transplanted -> finance

南橘北枳 — “a tangerine south of the river turns bitter north of it”, i.e. the same thing changes its nature when transplanted to a different environment.

* martingale originates in gambling…
* Brownian motion originates in biology.
* Heat equation, Monte Carlo, … all have roots in physical science.

These models worked well in the original domains, because the simplifications and assumptions are approximately valid even though clearly imperfect. Assumptions are needed to simplify things and make them /tractable/ to mathematical analysis.

In contrast, financial mathematicians had to make blatantly invalid assumptions. You can find fatal flaws from any angle. Brian Boonstra told me all good quants appreciate the limitations of the theories. A small “sample”:

– The root of the randomness is psychology, or human behavior, not a natural phenomenon. The outcome is influenced fundamentally by human psychology.
– The data shows skew and kurtosis (fat tails).
– There’s often no way to repeat an experiment.
– There’s often just a single sample — past market data. Even if you sample it once a day, or once a second, you still operate on the same sample.

tech zbs( !!GTD) outweigh other(soft)job skills #%%belief

label: big20

In many teams (including …), I feel technical zbs/GTD capabilities are still the #1 determinant of job security, comfort level, competence level, and work-life balance. Secondary yet important factors include

– relative caliber within the team
– boss relationship
– criticality of the role within the team, within the firm
– reliable funding of the project, team or role
– closeness-to-money
– long work hours
– fire-n-hire culture

So zbs is key to keeping a job; for finding a job, something else is needed.

test if a key exists in multimap: count() perf tolerable

map::count() is simpler and slightly slower 🙂 than map::find()

Even for a multimap, count() is slower but good enough in a quick coding test. Just add a comment to say “will upgrade to find()”

In terms of cost, count() is only slightly slower than find(). Note multimap::count() complexity is Logarithmic in map size, plus linear in the number of matches… Probably because the matching entries live together in the RB tree.
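
A quick sketch of the two idioms (illustrative values):

#include <cstdio>
#include <map>

int main() {
    std::multimap<int, const char*> mm{{1, "a"}, {1, "b"}, {2, "c"}};
    if (mm.count(1) > 0)           // log(size) + linear in #matches; fine in a coding test
        std::puts("key 1 exists"); // interview comment: "will upgrade to find()"
    if (mm.find(2) != mm.end())    // log(size) only; stops at the first match
        std::puts("key 2 exists");
}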

linq – a scratch seeking an itch

The more I read about linq2collection, the more convinced I am that it’s not strictly necessary. Old-fashioned for-loops are cleaner and easier to read. Exception handling is cleaner. Branching is cleaner. Mid-stream abort is cleaner.

A linq proponent may say OO was not strictly necessary (compared to procedural), and relational DB was not strictly necessary (compared to non-relational DB), yet both offer real value, and so does linq. Here’s my argument, focusing on linq2collections.

– I feel this is a local implementation choice, not a sweeping, strictly enforced architectural choice. As such, a particular method can choose not to use linq.
– Another comparison is ORM. Once ORM is chosen in a project, you can hardly choose not to use it in one particular method. Linq is more flexible, less enforced and more “sociable”.

A java guy had no such tool yet was perfectly productive, until java 8 streams…

I feel linq gained mindshare on the back of c#’s success. If F#, python or delphi had invented such a cool feature, it would have remained a niche feature. I don’t see another language embracing linq, until java 8.

Nevertheless, we must master linq for the sake of job interviews. This is somewhat comparable to c++ multiple inheritance — not used widely but frequently quizzed by interviewers.

every async operation involves a sync call

I now feel just about every asynchronous interaction involves a pair of (often remote) threads. (Let’s give them simple names — The requester RR vs the provider PP). An async interaction goes through 2 phases —

Phase 1 — registration — RR registers “interest” with PP. When RR reaches out to PP, the call must be synchronous, i.e. Blocking. In other words, during registration RR thread blocks until registration completes. RR thread won’t return immediately if the registration takes a while.

If PP is remote, then I was told there’s usually a local proxy object living inside the RR Process. Registration against proxy is faster, implying the proxy schedules the actual, remote registration. Without the scheduling capability, proxy must complete the (potentially slow) remote registration on the RR thread, before the local registration call returns. How slow? If remote registration goes over a network or involves a busy database, it would take many milliseconds. Even though the details are my speculation, the conclusion is fairly clear — registration call must be synchronous, at least partially.

Even in Fire-and-forget mode, the registration can’t completely “forget”. What if the fire throws an exception at the last phase after the “forget” i.e. after the local call has returned?

Phase 2 — data delivery — PP delivers the data to an RR2 thread. RR2 thread must be at an “interruption point” — Boost::thread terminology. I was told RR2 could be the same RR thread in WCF.
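
A toy sketch of the two phases in one process (all names invented; a real system would do this across machines):

#include <cstdio>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

std::mutex mu;
std::vector<std::function<void(int)>> registry; // PP's list of interested parties

// Phase 1: registration. The RR thread blocks, however briefly, until the registry is updated.
void registerInterest(std::function<void(int)> cb) {
    std::lock_guard<std::mutex> lk(mu); // synchronous by nature
    registry.push_back(std::move(cb));
}

// Phase 2: data delivery, on a separate provider thread (the "RR2" thread).
void deliver(int data) {
    std::lock_guard<std::mutex> lk(mu);
    for (auto& cb : registry) cb(data);
}

int main() {
    registerInterest([](int d) { std::printf("got %d\n", d); }); // blocking call
    std::thread pp(deliver, 42); // the data arrives asynchronously
    pp.join();
}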

delegate instance is passed by value@@

myEvent += myDelegate;

Q: Is myDelegate passed by reference?
A: No; member-wise copying. In other words, each item in the delegate inv list is passed in (by reference?).
A: No. The invocation list (say 2 callbacks) is appended to myEvent’s internal inv list, and the new list is rebound to myEvent’s internal delegate variable.

That’s a short answer. For the long answer, we need to forget events and callbacks and focus on pure delegates as list objects holding simple “items”. Suppose dlg1 variable points to address1, and dlg2 points to address2.

dlg1 += dlg2 // translates to
dlg1 = Delegate.Combine(dlg1, dlg2). // Note: re-binding dlg1

Now dlg1 points to newly allocated address11. At this address there’s a new-born delegate object whose inv list is

[item1 at addr1,
item2 at addr1 …
itemN at addr1,
item1 at addr2,
item2 at addr2, ….
itemM at addr2]

According to http://msdn.microsoft.com/en-us/library/30cyx32c.aspx, some of the items could be duplicates — Legal.

30 seconds after the Combine() operation, suppose dlg2 is subsequently modified (actually re-bound). Instead of 2 callbacks at addr2, variable dlg2 [1] now points to addr22 with 5 callbacks. However, this modification doesn’t affect the inv list at addr11, which dlg1 now points to.

Therefore, we can safely say that in a Combine() operation the RHS delegate’s inv list is copied-over, passed by value or “pbclone”.

[1] like all java reference variables, a c# delegate variable is a 32-bit pointer object. As explained in
http://bigblog.tanbin.com/2012/03/3-meanings-of-pointer-tip-on-delete.html, the compiler is likely to allocate 32 bit of heap for this 32-bit object.

y java object always allocated on heap, again

Short answer #1: Because java objects are passed by reference.

Suppose a holder object (say a collection, or a regular object holding a field) holds a Reference to a student1 object. If student1 object were allocated on stack, and de-allocated when stack frame unwinds, the holder would hold a stray pointer. Java is allergic to and won’t tolerate a single stray pointer.

Java primitive entities are always passed by value (pbclone). No stray pointer – no pointer involved at all.

Short answer #2: all pbref entities live safer on heap.

How about c# and C++? Somewhat more complicated.

greeks(sensitivity) – theoretical !! realistic

All the option/CDS/IRS … pricing sensitivities (known as greeks) are always defined within a math model. These greeks are by definition theoretical, and not reliably verifiable by market data. It’s illogical and misleading to ask the question —
Q1: “if this observable factor moves by 1 bp (i.e. 0.01%) in the market, how much change in this instrument’s market quote?”

There are many interdependent factors in a dynamic market. Eg FX options – If IR moves, underlier prices often move. It’s impossible to isolate the effect of just one input variable, while holding all other inputs constant.

In fact, to compute a greek you absolutely need a math model. Without a model, you can say the instrument’s valuation will appreciate, but you can’t say by how much.

2 math models can give different answers to Q1.
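
To make the point concrete: even the most familiar greek is defined only inside a model. Under Black-Scholes (no dividends), the delta of a European call is

\Delta = \frac{\partial V}{\partial S} = N(d_1), \qquad
d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)\,T}{\sigma\sqrt{T}}

Swap in a different model (local vol, stochastic vol …) and the same market quote yields a different value for the same partial derivative.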

how important is automated pricing engine@@ ask veterans

I proposed a hypothesis to a few pricing-engine veteran friends: the big make-or-break trades and successful traders don’t rely much on pricing engines, because they already have an idea of the price they want to sell or buy at. However, my friends pointed out a few cases where a pricing engine can be important.

– Risk management on a single structured/exotic position can involve a lot of simulations.
– If you have too many live positions, pricing them for pre-trade purposes OR for risk can be too time-consuming and error-prone.
– If you have many small RFQs, trader may want to focus elsewhere and let the system answer them.

How about those make-or-break big trades? Volume of such trades is by definition small. Decision maker would take the pricing engine output as reference only. They would never let a robot make such big decisions.

secDB — helps drv more than cash traders@@

(Personal speculations only)

Now I feel secDB is more useful to prop traders or market makers with persistent positions in derivatives. There are other target users but I feel they get less value from SecDB.

In an investment bank, equity cash and forex spot desks (I guess ED futures and Treasury too) have large volume but few open positions at end of day [1]. On one credit bond desk, average trade volume is 5,000, and open positions number between 10,000 and 15,000. An ibank repo desk does 3,000 – 20,000 trades/day.

In terms of risk, credit bonds are more complex than eq/fx cash positions, but *simpler* than derivative positions. Most credit bonds have embedded options, but Treasury doesn't.

In 2 European investment banks, eq derivative risk (real time or EOD) needs server farms with hundreds of nodes to recalculate market risk. That’s where secDB adds more value.

[1] word of caution — Having many open positions intra-day is dangerous as market often jumps intra-day. However, in practice, most risk systems are EOD. I was told only GS and JPM have serious real time risk systems.

iterator = simple smart ptr

– Iterators are *simple* extensions of raw pointers, whereas
– smart pointers are *grand* extensions of pointers.

They serve different purposes.

If an iterator is implemented as a class (template) then it usually defines
– operator=
– operator==, >=
– operator* i.e. dereference
– operator++, operator--
– operator+, += i.e. long jumps
….
However, an iterator class (template) won’t define any operation unsupported by raw pointers, therefore no named member functions beyond the operators! Smart pointers do define public ctors, get() etc.
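
A minimal iterator over a raw array makes this concrete (a sketch, names invented):

template <typename T>
class SimpleIter { // a thin wrapper over a raw pointer
    T* p;
public:
    explicit SimpleIter(T* p_) : p(p_) {}
    T& operator*() const { return *p; }               // dereference
    SimpleIter& operator++() { ++p; return *this; }   // step forward
    SimpleIter& operator--() { --p; return *this; }   // step back
    SimpleIter operator+(long n) const { return SimpleIter(p + n); } // long jump
    bool operator==(const SimpleIter& o) const { return p == o.p; }
    bool operator!=(const SimpleIter& o) const { return p != o.p; }
};
// every operation above maps one-to-one onto a raw-pointer operation;
// contrast a smart ptr, which adds named members like get() and use_count()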

Software arch vs enterprise arch

In many companies architects (are perceived to) operate at a high level of abstraction — high-level system modules/interfaces, high-level concepts, high-level decisions. Their audiences include non-technical users, who speak business language. I guess some architects absolutely avoid low-level tech jargon almost like profanity — it’s all about the image of an architect. The wrong word on the wrong occasion would create a permanent bad impression.

To many people, “architecture” implies some network layer, networking protocol, network nodes. These architects might feel less than comfortable if their system has just one machine or just one jvm instance [1], without messaging, database, remote-procedure-call, web service, distributed caching, IPC, shared memory, shared data format (like XML) or communication protocol.

In contrast, software architects can be extremely low-level, too. Their audience are developers. (Examples below include a lot of c/c++…)

– Consider the creator of STL — an architect?
– connectivity engine team lead?
– FIX engine architect?
– Any low-latency trading engine architect?
– Creator of perl/python/boost regex engines?
– All boost libraries I know are low-level. Are the authors architects?
– Tomcat is a single-jvm system. Authors qualify as architects?
– Hibernate isn’t a big system with big modules — so are the authors low-level architects?
– JNI component designers — single-VM
– java GC innovators
– Creators of any reusable component like a jar, a dll — Just a pluggable module — not even a standalone “server”. Do they qualify as architects?
– The pluggable look-and-feel framework in Swing is designed by someone — an architect?
– spring framework, before all the add-on modules were added
– the tcp/ip/udp software stack was designed by some architect, but that person probably worked mostly on 1 machine, occasionally on 2 machines for testing only.
– GCC creator
– apache creator
– mediawiki creator
– creators of the dotnet CLR and the crown-jewel of Sun, the JVM.

I feel majority of the really important software [2] used in the world today were created by truly powerful developers who focused on a single module in a single system, without messaging, database, remote-procedure-call, web service, distributed caching, shared data format (like XML) or communication protocol.

Now I think there’s a difference between software architect vs enterprise system architect (ESA). To the software/internet industry, the software architect adds more value, whereas to a non-tech company (like finance/ telecom/logistics…), the ESA is the master mind.

I feel some ESA may need to solve performance problems but not implementation/debugging problems.

ESA often covers network/hardware configuration and design too. Security is often part of the “network piece”. If the database is small, then the ESA often covers that too. Tuning/optimization is cross-discipline and touches all the pieces, but in most of my enterprise systems, hardware/network tuning tends to require fewer specialists than DB or application (including MOM and cache). Therefore, ESA sometimes becomes the optimization lead.

(I recently encountered some concepts, methodologies and paradigms at architect level but not sure where to place them — design patterns, TDD/BDD, closures/lambda, meta-programming, functional programming, aspect-oriented… and other programming paradigms — http://en.wikipedia.org/wiki/Programming_paradigm.)

[2] By the way, these so-called “really important” software products are predominantly in C.
[1] Many of the items listed can add real value in a single process or multiple processes.

how much math is needed in … #a few finance IT fields

I always maintain that math is key domain knowledge. In the few areas I know, when a bit of math is required, a lot of regular developers would need to brush up before they can “play in the band” (as in a heavy metal band).

– Bond trading platforms? Close to zero math. Definitely less than bond pricing. Developers do need to remember basics like when interest rate rises, bond prices drop, and zero-coupon bonds trade at huge discounts (due to price/yield math relation). Also high-grade bonds tend to trade at lower yield — minimal math.

– Bond pricing? Slightly more than zero math. Price/yield conversion, yield-to-worst, modified duration, dv01 rollup, time-value-of-money are widely used, but developers usually don't need any mathematical insight into them to do the job. Developers just need basic understanding. Hiring managers may think differently.

– Forex trading? Arithmetic only, I believe. Add/subtract/multiply/divide only. Bond math requires high school math. In contrast, lots of Forex trading IT jobs require primary school math.

– Option pricing? A bit of statistics and calculus. Like in bond pricing, developers don't need any mathematical insight. However, they need a basic grasp of historical vol calculation, implied-vol inversion, delta and other greeks. Even more fundamental concepts include PCP and the option payoff diagrams of about 10 basic strategies, but I don't think developers need these.

– VaR on a portfolio including bonds and options? Requires more than bond math and option math. More statistics. But I suspect a dedicated developer in a bond pricing system may need more in-depth bond math than a VaR developer, because a stringent math requirement on software developers is impractical, overkill and giving the wrong focus to the wrong role. Instead, the real math is usually for the quant analysts. Risk systems use more math than pricing systems, and employ more quants.
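
For reference, the price/yield relation behind “rates up, prices down” is plain discounting:

P = \sum_{t=1}^{T} \frac{C_t}{(1+y)^t}

A higher yield y mechanically lowers P, and a zero-coupon bond (a single cash flow at maturity) trades at a deep discount to face value.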

All(@@) OO methods are "static" under the hood

C++ supports “free functions”, static method and non-static method. 
– Static methods are very similar to free functions. (Difference? Perhaps inheritance and access modifiers.)
– Non-static methods are implemented as free functions accepting “this” as the first argument.

C++ is largely based on classic-C. Basically everything translates to classic-C code, which uses free functions only.

Once we understand the c++ world, we can look at java and c# — Essentially same as c++.

Looking deeper, any method, function or subroutine consists of a sequence of instruction — a “code object” if you like. This object is essentially singleton because there’s no reason to duplicate this object or pass it by value. This object has an address in memory. You get that address in C by taking the address of a function — so-called function pointers.

In a multi-threaded environment, this code object is kind of singleton and stateless (if you know what I mean!). Think of Math.abs(). The code object itself is never thread-unsafe, but invoking it as a non-static method often is. Indeed, invoking it as a free function can also be thread-unsafe if the function uses a stateful object. Example — localtime().

Non-static method has to be implemented based on this code object.
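
In c++ terms, the non-static method below compiles to something very close to the free function next to it (names invented):

struct Account {
    double balance;
    void deposit(double amt) { balance += amt; } // non-static method
};

// roughly what the compiler generates: a free function taking "this" explicitly
void Account_deposit(Account* this_, double amt) { this_->balance += amt; }

int main() {
    Account a{0.0};
    a.deposit(5.0);           // method-call syntax
    Account_deposit(&a, 5.0); // the same thing, under the hood
}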

spot ^ fx-options — 2 "extreme" segments of FX market

Within the FX space, I get the impression that the trend of open standards, transparency, inter-connectedness, commoditization, shrinking spreads and margins, widening competition among liquidity pools (decentralization) … and all the other “good” stuff (good for users) is primarily in the spot market, not so much in the FX options market. The reasons beneath are intriguing and offer a glimpse of the true drivers in FX markets.

Spot and Options are kind of 2 extremes in terms of commoditization…..

FX futures (mostly on CME) are fully standardized and “bulldozed”, as equities markets were bulldozed decades earlier. But I was told the forward market is still bigger and more popular. Why?

Buy-side probably prefer standardized instruments — cheaper, transparent, fierce sell-side competition … but
Sell-side probably prefers the good old OTC contracts.

Dealers like things murky, and worry about too much transparency. The more murky, the more they can charge big spreads for the “service” they provide. A major reason for the shrinking margin in equity dealing desks over the past decades was the regulatory change to drive trading on to exchanges.

FX options are largely unaffected. Dealers don’t want to publish their quotes. Instead they respond to RFQ, according to a friend. (In theory, there might exist some *private* market maintained by a single market-maker, but I don’t know any.)

Fwd is similar. However, I was told the standard spot trade has a T+2 settlement and works like a short-term forward contract. Also, spot traders do a lot of rolling, so fwd trades and rates are ubiquitous.

Bank sends quotes (perhaps RFQ replies) to ECNs and also privately to individual clients. Known as “Bank feeds”. Notably, for a given currency pair at a given time, a bank sends slightly different spreads depending on audience — tiered quote pricer. A sign of market fragmentation, IMHO. Note, bank feeds aren’t always from commercial megabanks — investment banks also produce bank feeds.

hibernate losing popularity on Wall St@@

XR,

I’m always rather suspicious about newer app-dev technologies – android, distributed cache, weblogic, ruby, php, jaxb, xls, web service… I feel all of these (including superpowers Spring and WPF) will have to stand more tests of time.

With the decline of EJB, I don’t know what weblogic, websphere and JBoss are good for. Servlet containers? But tomcat is quite adequate?

Now I feel hibernate is not as popular as it used to be, at least in finance. My recent Wall St projects all decided not to use hibernate. We chose JDBC. It’s not that we don’t know hibernate. We know it well enough to avoid it.

It actually took this many years (5+) for wall st developers to see the value/cost/benefit/limitations of Hibernate. Therefore, in the first few years of a new technology, it’s hard to predict whether it will catch on, dominate and become entrenched. Over the past 30 years, only a small number of app development technologies achieved that status — SQL, MOM, threading, browser-based enterprise apps, Excel, sockets, javascript, and of course java/c++.

finance tech is non-mainstream in techland (elite)

Update – Si-Valley also needs elites: a small number of expert developers.

Financial (esp. trading) IT feels like an elite sector – a small number of specialists
– with multi-skilled track record
– familiar with rare, specialized tools — JNI, KDB, FIX, tibrv, sockets, sybase
– Also, Many mainstream tools used in finance IT are used to an advanced level — threading, memory, SQL tuning, large complex SQL

If you compare the track record and skills of a finance IT guy with a “mainstream” tech MNC consultant, the finance guy probably appears too specialized.

That’s one psychological resistance facing a strong techie contemplating a move into finance. It appears risky to move from mainstream into a specialized field.

## practical app documentation in front office

+ Email? default “documentation” system, grossly inefficient but #1 documentation tool in practice.
+ Source code comments? default documentation system

— above are “organic” products of regular work; below are special documentation effort — tend to be neglected —-

+ Jira? rarely but can be used as organic documentation. easy to link
+ [jsw] Cheatsheet, run book and quick guide? same thing, very popular.
+ [jsw] Troubleshooting guide on common production issues
+ Design document? Standard practice but often gets out of date. A lot of effort. Often too high-level and not detailed enough for developers/implementation guys — written for another audience. Some people tell me a requirement spec should be detailed enough that developers could write code against it, but in reality I have seen extremely detailed specs, and developers always have questions needing real-time answers.

+ Unit test as documentation? Not sure. In theory only.
+ Wiki? widespread. Easy to link
+ sharepoint? less widely used than wiki as documentation tool. organized into a file tree.

The more “organic”, the easier, and the better its take-up rate; The more artificial, the more resistance from developers.

[w = wiki]
[j = jira]
[s = sharepoint]

how important are QA/BA/DBA/RTB ..#my take

This is a frank, honest but personal observation/reflection of the reality I witnessed. We should always bear in mind how (in)dispensable we really are to the business.

+ Many wall st trading system development teams have no QA personnel, so developers do that. I did that.
+ Many wall st trading system development teams have no BA, so the project lead or a key developer does that. I did that.
+ Many wall st system development teams have no RTB (Run-The-Bank, as opposed to Build-The-Bank) support team, so developers do that. I did that. RTB can probably handle major FO (front office) apps, but a lot of apps aren’t in that league.

+ None of the wall st trading system development teams I have seen has a hands-off architect. If there’s an architect, this person is first a coder, then an architect.

+ Some wall st trading system development teams have no need for a DBA or unix SA (system admin). Without going into specifics, much of that expertise isn’t needed before release to production if (not a big if) developers _can_ build their apps themselves. In rare cases developers need some tuning advice. See also http://bigblog.tanbin.com/2010/12/hands-off-manager-according-to-lego.html

– However, a proj mgr is crucial, every time. Most wall st systems are interdependent with extremely quick turnaround, requiring non-trivial communication, coordination, estimation and monitoring. I feel half the time the key developers don’t play the PM role.

eg@3rd domain knowledge #Apama

An i-bank may hire someone from a software vendor (like mysis, sunguard, murex …) but, according to Lin Y, they prefer the core engine developer, not a presales architect. The PS architect’s insight and value-add are tied to that core product (hopefully long-lasting), not very generic.

If you remove the “unremarkable” code, I guess the __core__ of the product is a kind of winning architecture + some patented algos/optimizations. They are integrated, but each is an identifiable feature. Together they constitute an entry barrier. If you don’t have the insights the core Apama developers have, then you can’t help break the entry barrier and build a similar trading engine at an ibank. The ibank is more likely to hire someone with core engine development experience from another ibank.

Q: What features can qualify for a patent?

%%A: Not the high-level architecture, but some mid-level techniques, or “tricks” in Shaun’s words. I feel the high-level architecture may not be an entry barrier (or qualify as a patent) since in the world of trading platforms there are only a small number of common architectures at the high level. However, as a job candidate or a team lead you are expected to communicate in high-level architectural lingo. The low-level jargon is harder for management, or some fellow developers, to understand.

What if this inflated Wall-St IT salary bubble bursts@@

Update: I feel it’s unlikely. The industry is becoming more tech, more data, more automated. Complexity increases. Software has to deal with more complexity, not less.

Tip: remember survival = job market beauty contest, not on-the-job productivity.
** Stay in the leading pack – marathon.
** know your weaknesses on the job_market and on the job

Tip: stay relevant to the business. Business needs critical apps rolled out fast so they can trade, and they are willing to sponsor such projects. So what are the critical, non-trivial apps?
– trade booking, sub-ledger, daily unrealized pnl
– pricing – usually non-trivial
– market data – sometimes trivial as traders can simply get a low-tech data feed
– OMS – only in exchange trading

Tip: know the high margin trading desks in each city. These profit centers will continue to pay high. SG (FX …) HK (eq), Ldn (FX, IR …)

Tip: remember pure tech firms might catch up with Wall St IT pay.

—(Rest of the items may not address the issue in the subject.)—

Tip? broaden trec into GUI. Many of my server-side systems have a GUI counterpart.
Tip? broaden trec from java/c++ to dotnet, math, market data.
Tip?? move up to architect, team lead
Tip?? broaden trec into BA and prj mgmt
Tip?? deepen — unix tuning, DB tuning, memory tuning, math

python: very few named free functions

In python, most operations are instance/static methods, and the *busiest* operations are operators.

Free-standing functions aren’t allowed in java/c#, but are the mainstay of C, Perl and PHP. Python has them, but very few.

— perl-style free functions are a much smaller population in python, therefore each is worth knowing. See the 10-pager P135 [[py ref]] —
len(..)
map(), apply()
open()
min() max()

— advanced free functions such as introspection
repr() str() — related to __repr__() and __str__()
type(), id(), dir()
isinstance() issubclass()
eval() execfile()
getattr() setattr() delattr() hasattr()
range(), xrange()
q(yield) — not a free-function, but a keyword!

which city is at the spearhead of finance IT

Even if Asia IT jobs were to pay higher than US, US is still spearheading industry best practices. I’m in the financial IT space, but see similar patterns in other IT sectors.

It’s revenue share, stupid.

The more revenue a country office generates, the more budget it controls. The office usually prefers an onsite development team, so that team tends to become the hotbed of innovation. For the foreseeable future, NY and London will be the hotbeds.

software engineering where@@ wall street@@

When I look at a durable product to be used for a long time, I judge its quality by its details. Precision, finishing, raw material, build for wear and tear…. Such products include wood furniture, musical instruments, leather jackets, watches, … I often feel Japanese and German (among others) manufacturers create quality.

Quick and dirty wall street applications are low quality by this standard, esp. for code maintainers.

Now software maintainability requires a slightly different kind of quality. I judge that quality, first and foremost, by correctness/bugs, then coverage for edge/corner cases such as null/error handling, automated tests, and code smell. Architecture underpins but is not part of code quality. Neither is performance, assuming our performance is acceptable.

There's a single word that sums up what's common between manufacturing quality and software quality — engineering. Yes software *is* engineering but not on wall street. Whenever I see a piece of quality code, it's never in a fast-paced environment.

secDB/Slang – resembles python inside OODB

A few hypotheses to be validated —

1) you know oracle lets you write stored proc using java (OO). Imagine writing stored proc using python, manipulating not rows/tables but instances/classes

2) you know gemfire lets you write cache listeners in java. Now imagine writing full “services” in java, or in python.

I feel secDB is an in-memory OODB running in a dedicated server farm. Think of it as a big gemfire node. It runs 24/7.

Why python? Well, python’s object is dict-based. I believe so are secDB objects.

secDB – designed for mkt-making and prop trading

GS 2010 annual report said “A particular technological competitive advantage for GS is that GS has only one central risk system.” Presumably, this is a market risk system — not so much credit risk, liquidity risk, or counter-party risk.

It’s instructive to compare several major business models
+ prop-trading and principal-investment — maintain positions ==> need secDB
+ dealing and market-making — need to maintain (and hedge) positions ==> probably needs SecDB esp. option/swap market making
+ security lending — holds client’s positions under bank’s name ==> Needs secDB
– brokerage and agency-trading — don’t need to maintain positions by strict definition. In practice I guess they often lend assets to clients and hold client’s positions as collateral. That would need secDB
– asset management — managing client’s money only, so maintain positions on behalf of clients. If you care about the fund manager’s positions, then you need secDB. In addition, I guess managers often co-invest — need secDB
– high-volume, low-margin electronic trading is usually(??) agency trading
+ if there’s a high volume system that needs secDB, I’d guess it’s prop trading similar to hedge funds

Looking into the design of SecDB (dependency graph + OODB), I feel it’s designed for prop desk.

In citigroup, the research department supports prop desk as its #1 user. I guess the market maker desk would be #2.

When GS say “aggregate positions across desks”, I feel it means prop desk + sell-side “house” desks. I think they mean house money i.e. positions under GS own name, not client names. Risk means risk to the bank’s positions, not client’s positions. (No bank will spend so much money to assess risk to clients’ positions.)