2 processes’ heap address spaces interleaved@@

Within a process AA, the stack occupies one contiguous address range. That is not true of AA’s heap address space, which can have “holes” (i.e. deallocated memory blocks) between two allocated blocks. However, how about these 3 consecutive blocks… would this be a problem?

Allocated block 1 belongs to process AA
Allocated block 2 belongs to process BB
Allocated block 3 belongs to process AA

I think several full chapters of [[linux kernel]] cover memory management. The brk() syscall is the key, though large allocations usually go through mmap() instead.

I think the actual addresses may be virtualized, i.e. each process sees only its own virtual address space.

Kernel page table?
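
Here is a minimal sketch of the virtualization idea, assuming a Linux/POSIX host. The names compare_heap_after_fork, ForkResult and ChildReport are hypothetical, made up for this illustration. After fork(), parent and child both hold a heap block at the same virtual address, yet a write in the child is invisible to the parent, because each process has its own mapping from virtual to physical pages. If that is right, any interleaving of AA's and BB's blocks can only exist in physical memory, never inside either process's own address space.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdint>
#include <memory>

// Hypothetical result type for this illustration.
struct ForkResult {
    bool same_address;   // did parent and child see the block at the same virtual address?
    int parent_value;    // value the parent reads after the child has written
    int child_value;     // value the child reported after its own write
};

// Message the child sends back to the parent through a pipe.
struct ChildReport {
    std::uintptr_t address;
    int value;
};

ForkResult compare_heap_after_fork() {
    ForkResult failed{false, -1, -1};
    std::unique_ptr<int> block(new int(1));   // one heap block, allocated before fork
    int fds[2];
    if (pipe(fds) != 0) return failed;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return failed; }
    if (pid == 0) {
        // Child: same virtual address, but the write lands on its own private page.
        *block = 2;
        ChildReport report{reinterpret_cast<std::uintptr_t>(block.get()), *block};
        ssize_t written = write(fds[1], &report, sizeof report);
        _exit(written == static_cast<ssize_t>(sizeof report) ? 0 : 1);
    }

    // Parent: read the child's report, then reap the child.
    close(fds[1]);
    ChildReport report{0, 0};
    ssize_t got = read(fds[0], &report, sizeof report);
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    if (got != static_cast<ssize_t>(sizeof report)) return failed;

    return ForkResult{
        report.address == reinterpret_cast<std::uintptr_t>(block.get()),
        *block,          // unchanged: the child's write never reached the parent
        report.value};
}
```

Calling compare_heap_after_fork() should report the same address in both processes but different contents, which is what per-process page tables would predict.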

— why do we worry about holes?

  1. I think holes are worrisome due to wasted memory. Imagine you deallocate a huge array half the size of your host memory.
  2. Hunting for a large enough block among the holes can be time-consuming.
  3. If your graph nodes (in a large data structure like a linked list or tree) are scattered, then d-cache efficiency suffers.

— So why do we worry about interleaving?

If we need not worry, then interleaving may be permitted and exploited for efficiency.

cloud: more relevant to system- than software-architects

Software architects worry about libraries, OO designs, software components, concurrency etc.

They also worry about build and release.

How about app integration? I think this can be job for system architect or software architect. If the apps run in the same machine, then probably software architect.

— 10 considerations to develop for cloud

https://www.developer.com/cloud/ten-considerations-for-realizing-the-potential-of-the-cloud.html was a 2015 short article written by a java developer. I like the breakdown into 10 items.

Some of the 10 items are more devops than developer considerations. The other items are more for system architect than software architect.

However, hiring managers expect senior candidates like me to demonstrate this QQ knowledge. By definition, QQ is not needed on the job.

 

lucky I didn’t invest in Scala #java8/c++11 #le2Greg

Hi Greg,

Am writing another reply to your earlier mail, but I feel you wouldn’t mind reading my observations of Scala and java8 on the WallSt job market.

Let me state my conclusion up-front. Conclusion: too much hype and too many fads in java and across the broader industry. I feel my bandwidth and spare time are limited (compared to some techies) so I must avoid investing myself in high-churn domains.

You told me about Scala because MS hiring managers had a hard time selling Scala to the job candidates. The harder they try to paint Scala as part of the Future of java, the more skeptical I become. To date, I don’t know any other company hiring Scala talent.

If I get into a MS Scala job, I would have to spend my bandwidth (on the job or off the job) learning Scala. In contrast, in my current role with all-familiar technologies, I have more spare time on the job and off the job. On a Scala job, I would surely come across strange Scala errors and wrestle with them (same experience with python, and every other language). This would be valuable learning experience if I needed Scala in future jobs, but nobody else is hiring Scala.

Therefore, am not convinced that Scala is worth learning. It is not growing and not compelling enough to take over (even a small part of) the java world. I would say the same about Go-lang.

If scala is a bit of a hype, then Java8 is a bit of a fad.

I remember you said in 2019 that java8 knowledge had become a must in some java interviews. I actually spent several weeks reading and blogging about key features of java8. Almost none of them has ever been quizzed.

Java8 seems to be just another transitional phase in the evolution of java. My current system uses java8 compiler (not java8 features) , but java 9,10,11,12,13,14 and 15 have come out. There are so many minor new features that interviewers can only ask a small subset of important features. The "small subset" often boils down to an empty set — interviewers mostly ask java1 to java5 "classic" language features such as threading, collections, java polymorphism.

Some hiring teams don’t even ask java8 interview questions beyond the superficial. Yet they say java8 experience is required on the job!

Lastly, I will voice some similar frustrations about c++11/14/17. Most teams use a c++17 compiler without using any new c++ features. Most of the interview questions on "new" c++ revolve around move semantics, a very deep and challenging topic, but I don’t think any team actually uses move semantics in their work. Nevertheless, I spent months studying c++11 esp. move semantics, just to clear the interviews.

[20] SG tech talent pool=insufficient: expertise^GTD

Listening to LKY’s final interviews (2013 ?), I have to agree that Singapore — counting citizens+PRs — doesn’t have enough technical talent across many technical domains, including software dev. https://www.gov.sg/article/why-do-we-need-skilled-foreign-workers-in-singapore is a 2020 publication, citing software dev as a prime example.

A telltale sign of the heavy reliance on foreign talent — If an employer favors a foreigner, it faces penalty primarily (Russell warning) in the form of ban on EP application/renewal. This penalty spotlights the reliance on EPs at multinationals like MLP, Goog, FB, Macq.

The relatively high-end dev jobs might be 90% occupied by foreigners, not citizens like me. I can recall my experience in OC, Qz, Macq, INDEED.com interview… Why? One of the top 2 most obvious reasons is the highly selective interview. High-end tech jobs always feature highly selective tech interviews — I call it “Expertise screening”.

Expertise is unrelated to LocalSys knowledge. LocalSys is crucial in GTD competence.

As I explained to Ashish and Deepak CM, many GTD-competent [1] developers in SG (or elsewhere) are not theoretical enough, or lack the intellectual curiosity [1], to pass these interviews. In contrast, I do have some Expertise. I have proven it in so many interviews, more than most candidates.

(On a side note, even though I stand out among local candidates, the fact remains that I need a longer time to find a job in SG than Wall St. )

[1] As my friend Qihao explained, most rejected candidates (including Ashish) probably have the competence to do the job, but that’s not his hiring criterion. That criterion is too low. Looks like SG has some GTD-competent developers but not enough with expertise or curiosity.

— Math exams in SG and China

Looking at my son’s PSLE math questions, I was somehow reminded that the real challenge in high-end tech IV is theoretical/analytical skills — “problem-solving” skill as high-end hiring teams say, but highly theoretical in nature. This kind of analytical skill including vision and pattern recognition is similar to my son’s P5 math questions.

In high-end tech IV, whiteboard algo and QQ are the two highly theoretical domains. ECT and BP are less theoretical.

What’s in common? All of these skills can be honed (磨练). Your percentile measures your abilities + effort (motivation, curiosity[1]). I’m relatively strong in both abilities and effort.

So I know the math questions are similar in style in SG and China. I have reason to believe East-European and former-Soviet countries are similar. I think other countries are also similar.

rvalue objects before/after c++11

Those rvalue objects, i.e. unnamed temp objects, have been around for years. So why is rvr needed to handle rvalue objects?

  • C++11 added language features (move,forward,rvr..) only to support resource stealing where resource is almost always some heapy thingy.
  • Before c++11, rvalue objects didn’t need a special notation and didn’t need a special handle (i.e. rvr). They were treated as just a special kind of object. You were able to steal resources from them, but doing so was error-prone and unsafe.
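
A minimal sketch of the resource stealing described above. Buffer is a hypothetical class invented for this illustration. The copy ctor is what a pre-c++11 class had to use even when the source was a dying temporary. The move ctor takes an rvr, so it can only be invoked on an rvalue and can safely take over the heap array.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical class owning a heap array, for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new char[n]()) {}

    // Copy ctor: allocates a fresh array and duplicates the content.
    Buffer(const Buffer& other) : size_(other.size_), data_(new char[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    // Move ctor (c++11): the rvr parameter binds only to rvalues, so stealing is safe.
    Buffer(Buffer&& other) noexcept : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;   // leave the source empty but destructible
    }

    // Assignment is left out to keep the sketch short.
    Buffer& operator=(const Buffer&) = delete;
    Buffer& operator=(Buffer&&) = delete;

    ~Buffer() { delete[] data_; }

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    char* data_;
};
```

With this class, Buffer b(std::move(a)) ends up owning the array that a allocated, while Buffer c(b) allocates a second array.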

op=(): java cleaner than c++ #TowerResearch

A Tower Research interviewer asked me to elaborate why I claimed java is a very clean language compared to c++ (and c#). I said “clean” means consistency and fewer special rules, such that regular programmers can reason about the language standard.

I also said python is another clean language, but it’s not a compiled language so I won’t compare it to java.

See c++complexity≅30% mt java

— I gave interviewer the example of q[=]. In java, this is either content update at a given address (for primitive data types) or pointer reseat (for reference types). No ifs or buts.

In c++ q[=] can invoke the copy ctor, move ctor, copy assignment, move assignment, cvctor (conversion ctor), or OOC (conversion operator).

  • for a reference variable, its meaning differs at the site of initialization (binding) vs update (assignment to the referent).
  • LHS can be a dereferenced pointer… there are additional subtleties.
  • You can even put a function call on the LHS
  • cvctor vs OOC when LHS and RHS types differ
  • member-wise assignment and copying, with implications on STL containers
  • whenever a composite object has a pointer field, the q[=] implementations could be complicated.  STL containers are examples.
  • exception safety in the non-trivial operations
  • implicit synthesis of move functions .. many rules
  • when RHS is an rvalue object, then LHS can only be ref to const, nonref, …
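
To make the c++ side concrete, here is a small sketch. Tracer, Wrapper and trace_equals_sign are hypothetical names invented for this illustration. Each special member records its own name, so we can see which one a given q[=] actually invoked.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type: 'last' names the special member that most recently ran on this object.
struct Tracer {
    std::string last;
    Tracer() : last("default ctor") {}
    Tracer(int) : last("conversion ctor") {}               // cvctor
    Tracer(const Tracer&) : last("copy ctor") {}
    Tracer(Tracer&&) : last("move ctor") {}
    Tracer& operator=(const Tracer&) { last = "copy assignment"; return *this; }
    Tracer& operator=(Tracer&&) { last = "move assignment"; return *this; }
};

// Hypothetical type with a conversion operator (OOC) to Tracer.
struct Wrapper {
    operator Tracer() const { return Tracer(); }
};

std::vector<std::string> trace_equals_sign() {
    std::vector<std::string> log;
    Tracer a;
    Tracer b = a;              log.push_back(b.last);  // initialization from lvalue
    Tracer c = std::move(a);   log.push_back(c.last);  // initialization from rvalue
    b = c;                     log.push_back(b.last);  // update from lvalue
    b = std::move(c);          log.push_back(b.last);  // update from rvalue
    b = 7;                     log.push_back(b.last);  // cvctor builds a temporary first
    b = Wrapper{};             log.push_back(b.last);  // OOC builds a temporary first
    return log;
}
```

The six statements all use the same = sign but the log should read: copy ctor, move ctor, copy assignment, move assignment, move assignment, move assignment. In the last two, the temporary produced by the conversion is an rvalue, which is why move assignment wins over copy assignment.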

##[18]G4qualities I admire]peers: !!status #mellow

I ought to admire my peers’ [1] efforts and knowledge (not their STATUS) on :

  1. personal wellness
  2. parenting
  3. personal finance, not only investment and burn rate
  4. mellowness to cope with the multitude of demands, setbacks, disappointments, difficulties, realities about the self and the competition
  5. … to be Compared to
    • zbs, portable GTD, not localSys
    • how to navigate and cope with office politics and big-company idiosyncrasies.

Even though some of my peers are not the most /accomplished/ , they make a commendable effort. That attitude is admirable.

[1] Many people crossing my path are … not really my peers, esp. those managers in China. Critical thinking required.

I don’t have a more descriptive title for this blogpost.

latency zbs in java: lower value cf c++@@

Warning — latency measurement gotchas … is zbs but not GTD or QQ

— My tech bet — Demand for latency QQ will remain higher in c++ than java

  • The market’s perception would catch up with reality (assuming java is really no slower than c++), but the catch-up could take 30 years.
  • the players focused on latency are unused to the interference [1] by the language. C++ is more free-wheeling
  • Like assembly, c++ is closer to hardware.
  • In general, by design Java is not as natural a choice for low latency as c++ is, so even if java can match c++ in performance, it requires too much tweaking.
  • related to latency is efficiency. java is a high-level language and less efficient at the low level.

[1] In the same vein, (unlike UDP) TCP interferes with data transmission rate control, so even if I control both sender and receiver, I still have to cede control to TCP, which is a kernel component.

— jvm performance tuning is mainstream and socially meaningful iFF we focus on
* machine saturation
* throughput
* typical user-experience response time

— In contrast, a narrow niche area is micro-latency as in HFT

After listening to FPGA, off-heap memory latency … I feel the arms race of latency is limited to high-speed trading only. latency technology has limited economic value compared to mobile, cloud, cryptocurrency, or even data science and machine learning.

Churn?

accu?

 

c++nlg pearls: xx new to refresh old 知新而后温故

Is this applicable in java? I think so, but my focus here is c++.

— 温故而知新 is less effective at my level. thick->thin, reflective.

— 知新而后温故 — x-ref, thin->thick->thin learning.

However, the pace of learning new knowledge pearls could appear very slow and disappointing. 5% new learning + 95% refresh. In such a case, the main benefit and goal is the refresh. Patience and Realistic expectation needed.

In some situations, the most effective learning is 1% new and 99% refresh. If you force yourself to 2% new and 98% refresh, learning would be less effective.

This technique is effective with distinct knowledge PEARLS. Each pearl can be based on a sentence in an article but developed into a blogpost.

 

half%%peers could be Forced into retirement #Honglin

Reality — we are living longer and healthier.

Observation — compared to old men, old women tend to have more of a social life and more involvement with grandchildren.

I suspect that given a choice, half the white-collar guys in my age group actually wish to keep working past 65 (or 70), perhaps at a lower pace. In other words, they are likely to retire not by choice. My reasoning for the suspicion — Beside financial needs, many in this group do not have enough meaningful, “engaging” things to do. Many would suffer.

It takes long-term planning to stay employed past 65.

I think most of the guys in this category do not prepare well in advance and will find themselves unable to find a suitable job. (We won’t put it this way, but) They will be kinda forced into early retirement. The force could be health or in-demand skillset or …

intellij=cleaner than eclipse !

intellij (the community version) is much cleaner than eclipse, and no less rich in features.

On a new job, my choice of java ide is based on
1) other developers in the team, as I need their support

2) online community support — as most questions are usually answered there
I think eclipse beats intellij

3) longevity — I hate to learn a java ide and lose the investment when it loses relevance.
I think eclipse beats intellij, due to open-source

4) other factors include “clean”

The most popular tools are often vastly inferior for me. Other examples:
* my g++ install in strawberryPerl is better than all the windows g++ installs esp. msvs
* my git-bash + strawberryPerl is a better IDE than all the fancy GUI tools
* wordpress beats blogger.com hands down
* wordpad is a far simpler rich text editor than msword or browsers or mark-down editors

“Strategic” needs a re-definition #fitness

“Strategic” i.e. long-term planning/t-budgeting needs a re-definition. quant and c# were two wake-up calls that I tragically missed.

For a long time, the No.1 strategic t-expense was quant, then c#/c++QQ, then codingDrill (the current yellowJersey).

Throughout 2019, I considered workout time as inferior to coding drill. Over weekends or evenings I often feel nothing-done even though I push myself to do “a bit of” yoga, workout, math-with-boy, or exp tracking.

Now I feel yoga and other fitness t-spend is arguably more strategic than tech muscle building. I say this even though fitness improvement may not last.

Fitness has arguably the biggest impact on brain health and career longevity

job losses across WallSt: 90% budget-related

In 99.9% of the involuntary job losses (not due to disciplinary action), the victim loses his (or her) job due to more than one reason, despite the single “official” reason such as performance or budget/redundancy. The individual is selected after passing through several approvals. Several protections must fail to protect him before he can lose his job. Remember, Lord Voldemort must lose all seven Horcruxes before he can be killed.

  1. protection: good performance (with a vague criterion of “good”). This protection is effective to the extent that it helps the immediate manager look good.
  2. protection: threat of law suit for discriminatory hiring/firing
  3. protection: adequate budget is the biggest protection. When SIA suffers, every staff suddenly loses this big protection.
  4. protection: internal transfer opportunities
  5. protection: financial compensation — is a deterrent that protects perm staff. In contrast, ending a contractor is much easier. Not much approval.
  6. protection: a guardian angel. Immediate manager is one of your guardian angels, but there could be other powerful figures protecting you. They can veto the decision to get you laid off

I have seen many strong performers getting laid off, sometimes even without budget pressure. So good performance is not ironclad protection.

The pattern: managers put in lots of effort to identify, select and train a new hire. Sacking a person without a budget constraint (such as a department-wide downsize) is too visible and dramatic, even humiliating. It looks bad, is too harsh, and has too much impact on the victim and the team morale. I feel most managers are reluctant to do that.

  • They would rather pay a doughnut bonus and wait for the person to leave
  • They can also offer internal transfer to the individual, as in OC/BAML/Macq
  • They can also lower performance expectation on the individual and close one eye. As Kyle Stewart said, “As long as you put in effort”.

Some examples:

  • eg: KhorSiang of Zed

Exceptions prove the rule. Some managers are trigger happy, without budget pressure — deMunk, Stephen of Macq

c++low-^high-end job market prospect

As of 2019, c++ low-end jobs are becoming scarce but high-end jobs continue to show robust demand. I think you can see those jobs across many web2.0 companies.

Therefore, it appears that only high-end developers are needed. The way they select candidates is … QQ. I have just accumulated the minimum critical mass for self-sustained renewal.

In contrast, I continue to hold my position in high-end coreJava QQ interviews.

##command line c++toolchain: Never phased out

Consider C++ build chain + dev tools on the command line. They never get phased out, never lose relevance, never become useless, at least till my 70s. New tools always, always keep the old features. In contrast, java and newer languages don’t need so many dev tools. Their tools are more likely to use GUI.
Top 5 examples of similar things (I don’t have good adjectives):
  • unix command line power tools
  • unix shell scripting for automation
  • C API: socket API, not the concepts
Secondary examples:
  • C API: pthreads
  • C API: shared memory
  • concepts: TCP+UDP, http+cookies
Insight — unix/linux tradition is more stable and consistent. Windows tradition is more disruptive.

Note this post is more about churn (phased-out) and less about accumulation (growing depth)

@outage: localSys know-how beats generic expertise

When a production system fails to work, do you contact

  • XX) the original developer or the current maintainer over last 3Y (or longer) with a no-name college degree + working knowledge of the programming language, or
  • YY) the recently hired (1Y history) expert in the programming language with PhD and published papers

Clearly we trust XX more. She knows the localSys and likely has seen something similar.

Exception — what if YY has a power tool like a remote debugger? I think YY may gain fresh insight that XX is lacking.

XX may be poor at explaining the system design. YY may be a great presenter without low-level and hands-on know-how.

If you discuss the language and underlying technologies with XX, she may show very limited knowledge… Remember Andrew Yap and Viswa of RTS?

Q: Who would earn the respect of teammates, mgr and external teams?

XX may have a hard time getting a job elsewhere .. I have met many people like XX.

latency favors STM; lockfree constructs target moderate latency

Strictly single-threaded mode means no shared mutable state. Therefore, really low-latency apps should probably avoid lockfree programming, which assumes the presence of shared mutable state. Without shared mutable state, the lockfree constructs would exert an unwanted performance penalty.

Now I think the real low-latency systems always prefer Single-Threaded-Mode. But is it feasible?

  • xtap and (loosely) Rebus are both STM — very high performance, proven designs.
  • Nasdaq’s new java-based architecture is STM, including their matching engine.
  • The matching engines in many exchanges/ECNs are STM. Remember FXAll interview?
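
A minimal sketch of the STM vs lockfree trade-off, using a sequence-number generator as the example. PlainSequence, AtomicSequence and last_of are hypothetical names for illustration. Both generators return the same numbers when driven by one thread. The difference is that the lockfree version performs an atomic read-modify-write on every call, which on typical hardware costs more than a plain increment even when no other thread exists.

```cpp
#include <atomic>
#include <cstdint>

// STM style: the state is owned by exactly one thread, so a plain integer is enough.
class PlainSequence {
public:
    std::uint64_t next() { return ++value_; }
private:
    std::uint64_t value_ = 0;
};

// Lockfree style: safe to share across threads, but every call is an atomic read-modify-write.
class AtomicSequence {
public:
    std::uint64_t next() { return value_.fetch_add(1, std::memory_order_seq_cst) + 1; }
private:
    std::atomic<std::uint64_t> value_{0};
};

// Drive a generator from a single thread and return the last number issued.
template <typename Sequence>
std::uint64_t last_of(std::uint64_t calls) {
    Sequence seq;
    std::uint64_t last = 0;
    for (std::uint64_t i = 0; i < calls; ++i) last = seq.next();
    return last;
}
```

In a strictly single-threaded design, AtomicSequence buys nothing over PlainSequence. I deliberately give no timing numbers here, since they depend on the compiler and the CPU.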

 

WallSt=age-friendly to older guys like me

I would say WallSt is Open to older techies.

I wouldn’t say WallSt is kind to old techies.

I would say WallSt is age-friendly

I would say WallSt offers a bit of the best features of age-friendly professions such as doctors and accountants.

Q: I sometimes feel WallSt hiring managers are kind to older techies like me, but really?
A: I feel WallSt hiring managers are generally a greedy species but there are some undercurrents :

  • 🙂 traditionally, they have been open to older techies who are somewhat less ambitious, less driven to move up, less energetic, less willing to sacrifice personally. This tradition has been prevalent for decades in U.S. work culture and I believe it will stay. No such tradition in China and Singapore.
  • 🙂 U.S. hiring managers may avoid bright young candidates #Alan
  • 🙂 some older guys do perform well above expectation. Capability decline with age is very mild and questionable in many individuals, but work ethics differ among individuals, unrelated to age.
  • 😦 if an older guy needs to be cut, many hiring managers won’t hesitate… merciless.

Overall, I think Wall St hiring managers are open to older guys but not sympathetic or merciful. They are profit-driven, not compassionate. The fact that I am so welcome on Wall St is mostly due to my java/c++ QQ, not anyone’s kindness.

I thank God. I don’t need to thank Wall St.

Sunday night: if in the mood4localSys

Realistic scenario — I find myself in the mood for localSys learning on a Sunday night 11pm.

I think it’s better to sleep in office than to go home, but in Singapore, I had better go home and sleep, by taking taxi.

I think it’s better to work on the localSys till 2am (or later). Those 3 hours are precious engagement … Burning pleasure. I don’t get such 3-hour engagements in a week.

I used to feel “why not work on my QQ or coding practice now, and focus on work Monday morning?” It turns out that I’m usually less motivated on Monday morning, for whatever reasons.

Friday night is similar. I often find my highest appetite and absorbency on Friday nights (or eve of public holidays). So better go home late, sleep, then come back to office Saturday early morning to capture the mood.

U.S.startups: often less selective,lower caliber

U.S. startups may represent a sweet spot among my job choices. I think they are less selective than HFT or top tech shops. They attract fewer top geeks.

  • Workload – presumably comparable to ibanks like MS/Baml/Barc. I guess some startups are higher, but I guess fewer than 50%. Mithun described 3 startups but he has no firsthand experience.
  • Salary — Many startups pay on-par with ibanks, according to CYW
  • Leaves – possibly fewer than ibanks
  • Coworker benchmark – possibly lower caliber
    • Respect – could be slightly better than in an ibank if you earn the respect

Some of the big tech shops are actually less selective than HFT – Amazon, Apple

localSys,big codebase..taxing@the aging memory

A.Brooks talked about innovative brain power. I’m talking about memory capacity.

Now I see that localSys on a big system is taxing on the aging memory. I guess GregM (RTS) might be a cautionary tale. GregM relied on his theoretical knowledge to pass interviews, but was not fast enough with the local codebase.

Is green field better than brown-field codebase? I guess so, based on personal experience. Green-field projects are rarely given to a new joiner but contractors are often hired on green field budget — like Citi, 95G, volFitter and RTS 🙂

Now consider this separate question:

Q: Are the c++11 QQ topics a form of churn on the interview arena?
A: Yes but c++11 QQ is lighter (on the aging memory) than localSys

Q: how about the Coding IV?
A: still lighter stress than localSys.

coding drill outlasting C++QQ study

Am confident that my coding drill will become a “long-term” hobby like java/c++ QQ, better than c#/quant/swing QQ:

  • Anti-aging.
  • Enjoyable? Yes, possible and real if there is no time-limit pressure and no measurable target
  • Not theoretical; more immediate result

I think coding drill may even outlast c++ QQ self-study but crucially my c++ QQ has reached … critical mass ! In contrast, some pure algo interviews are harder for me and I have yet to reach critical mass. For example some DP or graph problems.

c++TMP^other QQ topics #java

Alexandrescu’s TMP techniques (not “designs”) are very tricky (not “complex”). They require absorbency, but do they enhance latency? Do they get you higher jobs with lower stress?

I need to make time-allocation decisions among QQ topics, including TMP

In terms of latency, Well, java can now rival c++ in latency. The technical reasons are not obvious nor intuitive, but not my focus today. Just an observed fact which discredits conventional wisdom and our assumptions.

— zbs, based on continued relevance :

TMP is needed when reaching next level in c++ zbs.

TMP is more time-honored than many c++0x features.

Many new c++0x features were added for TMP. I feel TMP is the main innovation front across c++ language and standard development. C++ lost many battles in the language war but no other languages offer anything close to TMP features.

— As QQ

Will C++TMP (and rvr) QQ turn out similar to java bytecode engineering, reflection, generics? (Even in such a scenario, TMP still offers better roti than Qz.) Actually TMP is quizzed more than those. The c++ guru interviewers often adore TMP.. cult following.

EJB is an add-on package .. different category, not an advanced core language feature.

When TMP is not quizzed you may still get opportunities to showcase your halo. Many interviewers ask open-ended questions.

TMP techniques would remain a halo for years to come. Classic QQ topic.

— GTD: templates are never needed in greenfield projects. Occasionally relevant in understanding existing code base such as etsflow, STL, boost..

Q: are there rare projects using TMP and offer me an opportunity to outshine others, gain GTD advantage ..?
A: I guess it’s one in 10 or 20. Here are some factors:

Within a given project codebase, TMP is a powerful tool for DRY improvement and re-usability, but such reusability is over-rated in most projects.

DRY (don’t repeat yourself) is practiced more widely, but I feel TMP techniques replace 100 lines of code duplication with 20 lines of unreadable code.
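
A tiny sketch of that trade-off, written in c++11 style. The names is_one_of and twice are hypothetical, invented for this illustration. One constrained template stands in for three hand-written overloads (int, long, long long), at the price of a recursive type trait that many readers would find harder to follow than the duplication it removes.

```cpp
#include <type_traits>

// Primary template: an empty candidate list never matches.
template <typename T, typename... Candidates>
struct is_one_of : std::false_type {};

// Recursive case: match the first candidate, otherwise recurse on the rest.
template <typename T, typename First, typename... Rest>
struct is_one_of<T, First, Rest...>
    : std::conditional<std::is_same<T, First>::value,
                       std::true_type,
                       is_one_of<T, Rest...>>::type {};

// One template instead of three overloads; other types are rejected at compile time.
template <typename T>
typename std::enable_if<is_one_of<T, int, long, long long>::value, T>::type
twice(T x) {
    return x + x;
}

static_assert(is_one_of<long, int, long>::value, "long is in the list");
static_assert(!is_one_of<double, int, long>::value, "double is not in the list");
```

Here twice(21) and twice(21L) compile, while twice(1.5) is rejected because double is not in the list. The compiler error for the rejected call is typically long, which is part of the unreadability I complain about.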

 

QQ=fake; zbs_learning=treacherous

I used to feel my QQ knowledge was not zbs but now I think many interviewers (like myself) ask zbs questions. zbs is a vague concept. QQ has a well-established range of topics. Experts sizing up each other … is mostly QQ

Biggest danger with zbs learning — when I spend my precious spare time on zbs not aimed at QQ, I inevitably regret. Without QQ, the learning lacks reinforcement, positive feedback loop… and fades from memory.
I think in SG context the dearth of interviews make zbs learning treacherous

Therefore, zbs learning must aim at xref and thick->thin.

One guy with theoretical zbs (either strong or poor QQ) may be very mediocre joining a new team. It depends on the key skill needed in the team. Is it localSys, design …? I tend to feel what’s really needed is localSys.

max salary: simple game-plan

The strategy: “Whether I ask for a big base or modest base, there’s a chance I may have a problem with manager expectation. So let’s just go for the max salary and forget about learning/tsn.”

    • algo trading? tend to pay a premium, but I wonder how they assess my trec.
    • java/c++ combo role? will Not pay lower
    • Some quant skill? tend to pay a premium
    • If a HFT shop makes a real offer at S$150k base I will decline — no real upside for me. Similarly, if a quant dev job pays $170k base I will decline — the promised accu (across jobs) is a big mirage. Accu can happen within a single job, but so is the technical accu on a single job.

Max-salary game plan must not ignore :

  • correlation between salary and expectation — as observed in some past jobs but not in every lucrative role. My Barclays and 95G roles were great.
  • the stigma, damagedGoods and high expectations in Stirt and Macq…. Ashish’s view — just earn the money for 6 months and leave if not happy.
  • commute
  • reputation risk at the major banks.

Am i still a survivor? I would say YES in OC and GS, and yes in Macq based on the internal transfer offer.

Mithun suggested — Are we traumatized/scarred and fixated on the stigma? I said the same to Deepak CM.

java isn’t challenged by c++/c# but by high-level languages

  • I noticed at the beginning of this decade (2010-2012) c# mounted a credible challenge to dethrone java but has since lost momentum.
  • c++ had a renaissance with c++1x, but after 8 years I would say it failed to become a threat to java.

The other challengers to java seem to be domain-specific high-level languages like javascript (browser and beyond), ruby-on-rails, php (web server side), perl (text processing), q (kdb). Not sure about Golang.

Python is the strongest challenger among them. Here I consider python a domain-specific language given its visible strengths in data science, machine-learning, black-box testing, and traditional process management. I sometimes consider python a general purpose language like java and c++, but i feel it’s not widely perceived as such, perhaps due to dynamic typing and efficiency.

C/c++ remains the default system programming language — that’s clear. Less clearly, Java remains the “default” application-development language on server side — but in this sentence the key words are vague.

Luckily for java developers, nowadays most of the work is on server side, including many mobile and cloud applications. In addition, java is viable for offline/batch job, data science, transaction-processing systems … Actually, I listed javascript as a java challenger because I think a lot of application functionalities are moving to client-side.

Luckily for java developers, a few internet powerhouses led by Google (and Alibaba) have chosen Java as the default language on server side. So have most Wall St firms.

deadly delays@ project^feature levels

insightful article: managing tight deadlines is the background

— deadly delay at feature level, without user impact

This is a real stressor primarily because there exist team colleagues who can GTD faster on this same task.

For the serious delay, user impact is … exaggerated! This is true across all my projects — user impact doesn’t really matter. In fact, the feature may not be used much whether you deliver it in high or low quality, within or exceeding budget.

— blatant delay at project level when you are architect / team lead

In theory if you promise to deliver some new feature i.e. green field project, then it can be tough to deliver on time. In reality, project time/budget overrun is very common. You only need a good explanation.

Users never care that much about any new system cf current production systems. New systems often hit various teething problems and remain functionally unreliable for months.

OK, users don’t really care that much, but there’s visible technology budget impact which makes the technology MD look bad. MD must look for an “explanation” and may cut headcount as in StirtRisk and RTS-Trep.

Whose fault? Many fingers point at you the architect, but often it’s partial fault of the user-facing manager due to immaturity and overconfidence with timeline estimate.

Delay is mostly architect’s fault if another architect can deliver faster but I can’t imagine two architects working on two comparable projects at the same time. Therefore it’s never “mostly architect’s fault”. In reality, management do apply pressure and you suffer, but it’s really for political reason (like sacrificial lamb or scapegoat)

eg: RTS Trep
eg: PWM Special-Investment-management
eg: Post-Saurabh GMDS — multiple delays and scope reductions

eg@traction[def2] in GTD^IV #Sunil

Sunil is not the only one who tried but failed to break into java. Sunil was motivated by the huge java job market. I believe he had opportunities to work on (probably small) java projects and gained confidence, but that’s really the easy part. That GTD experience was completely insufficient to crack java interviews. He needs IV traction.

  • Zhurong also tried java
  • [g] Venkat actually got a java job in SCB but didn’t like it. I feel he was lacking GTD traction
  • [g] XR and the Iranian java guy had some c# projects at work but didn’t gain traction.
  • CSY had a lot to tell me about breaking into java.
  • [gi] CSY and Deepak CM both had java projects at work but no traction
  • [i=IV traction]
  • [g=GTD traction]

IV/QQ traction — I experienced better IV traction in c# than c++. I think it’s because half my c++ interviews were HFT shops.

GTD traction — I had good GTD traction with javascript, php …, much better than c++. My C# GTD traction was also better than c++. C++ is probably the hardest language in terms of GTD. As explained to Kyle, my friend Ashish experienced tremendous GTD traction but can he crack the interviews? Hiring teams can’t assess your GTD but they can ask tough QQ questions.

##[19] am competent at..as a professional thanks2tsn

Background — When I listen to a professional musician or comedian from some unfamiliar country, I wonder if they are actually good. Similarly, when I consult a doctor or dentist, I wonder if they are qualified.

“Self-respecting programmer” — Yi Hai’s motivation.

I have been tested and proven on the big stage i.e. U.S. tech interviews + GTD

  • [p t w] java/c++/c#
  • [w] algo
  • [w] coding test
  • [t] SQL
  • [t] socket
  • [t] unix power user
  • swing
  • [p] web app developer across php, javascript, java
  • py, perl (and shell/javascript?) — are these professional games?
  • [t = long tradition over 20Y]
  • [w = worldwide contest]
  • [p = a well-known standard profession]

became expert via external QQ benchmark` !!localSys xx

Suppose I invest heavily and become very productive on a local c++ system. The same applies to a java, SQL, socket or python system. The point is that any local system uses only a small portion of the language features, at most 5% of the typical high-end interview topics on that language.

Therefore the project experience won’t give me confidence to be an expert on the language, but acing multiple QQ interviews does, as these interviews compare dozens of competitive, motivated and qualified candidates across a wide field.

I grew very confident of java through this type of external benchmarking, in contrast to internal benchmarking within the local team.

required experience: QQ imt CIV

A few years into my technology career, I observed that Solaris/Oracle was harder to self-teach at home than linux/mysql/php/python/perl/javascript because even a high-school student can install and hack with the latter. Entry-barrier was non-existent.

Similarly, I now observe that Wall St QQ-IV demands longer experience than coding IV of west-coast style. Coding drill can use Leetcode over months. QQ requires not so much project experience but a lot of interview experience. It takes more than months of exploration and self-learning.

  • example — tcp/ip. My friend Shanyou preferred to read books systematically, but a book touches on hundreds of topics. He wouldn’t know which topics interviewers like to dig into.
  • example — c++ TMP. A fresh grad can read about it but won’t know the favorite topics of interviewers
  • Compelling example — java concurrency. A fresh grad can build up theoretical knowledge but won’t have my level of insight

Many inexperienced candidates were highly appreciated in west coast interviews. No such appreciation on Wall St, because Wall St VP or contract roles require work experience, i.e. tried-n-tested candidates.

  • Wall St interviews are selective in terms of experience
  • West Coast coding interviews are selective in terms of … speed and optimality

cod`drill^quant self-study

quant knowledge used to be a halo, now replaced by coding IV skills like

  • classic generators
  • DP, recursion
  • clever data structures
  • clever graph algos
  • speed coding

In 2012 When I was studying quant in my spare time, I was on a powerful ascent. In the subsequent years, I gradually lost steam. The quant sector was shrinking and market depth was disappointing.

Since 2018, my coding drill feels very relevant, even though the gap behind the strong players continues to be a pain and a de-motivation. When I stop comparing myself with the stronger players, I would feel my improvement in a highly valuable skill

[19]c++guys becom`very unlucky cf java

See also c++developer=strongest due2hard language@@

On 22 Apr 2019 I told Greg that c++ developers like me, Deepak, CSY.. are just so unlucky — most of the WallSt c++ job interviews are too demanding in terms of latency engineering, either on buy-side or sell-side.

Greg agreed that java interviews are much easier to pass. Greg said if you have reasonable java (interview) skills, then you can get a job in a week.

I told Greg that the only way Deepak or CSY could get an offer is through one of the few easy-entry c++jobs, but there are relatively few such jobs i.e. without a deep moat.

 

web2.0[def] IV need no j/c++insight..our t-investment lost#500w

I attended a few technical interviews at web2.0 [1] type of companies over the years — google, amazon, VMWare … and recently Indeed, Facebook and some well-known Chinese tech shops.

These interviewers never asked about java/c++ language details or data structures (as implemented in standard libraries), or Linux+compiler system knowledge. (I know many of these shops do use java or c++ as firm-wide primary language.) They do require data structure knowledge in any language you choose.

My conclusion from these experiences — if we compete for these jobs, we can’t rely on the prior interview experiences gained from all the financial-domain tech interviews. Wall St and West Coast are too different, so much so that essential Wall St tech knowledge is not useful for west coast interviews. We have to leave that wealth of knowledge behind when we start on a new journey (to the West) of learning, trying (trying our luck at various interviews), failing and retrying.

Michael Jordan once quit NBA and tried his hand at professional baseball. I see ourselves as experienced basketball players trying baseball. Our interview skills, interview practice, in-depth knowledge of crucial interview topics have no value when we compete in west-coast interviews.

West coast shops mostly rely on algo interviews. You can use any (real) language. The most common are java/c++/python. You just need a tiny subset of the language knowledge to compete effectively in these coding tests. In contrast, financial firms quiz us on much wider and deeper knowledge of java/c++/c#/Linux etc.

Q: What if a west-coast candidate were to try the financial tech jobs like ibanks or hedge funds etc? I think they are likely to fail on those knowledge tests. I think it would take more than a year for them to acquire the obscure knowledge required at high-end financial tech jobs. In contrast, it takes months to practice a few hundred leetcode problems. You can decide for yourself which side is more /impenetrable/ — Wall St or West Coast.

Ironically, neither the west-coast algo skill nor the financial tech obscure knowledge is needed in any real project. All of these high-end employers on both coasts invent artificial screening criteria to identify “cream of the crop”. What’s the correlation of on-the-job performance to a candidate’s algo skill and obscure knowledge? I would say zero correlation once we remove the intelligence/diligence factors. In other words, algo skill or obscure knowledge are poor predictors of job performance, but intelligence/diligence are good predictors.

In the bigger picture, these tech job markets are as f**ked up as decades ago, not improving, not worsening. As long as there are big salaries in these jobs, deep-pocketed employers will continue to use their artificial screening criteria. We have to play by their rules, or get out of the way.

— [1] web2.0 defined

I call them “web2.0” shops — second wave, second generation of Internet tech powerhouses.

Some of them are focused on cloud, AI, bigData, but usually with a deep integration, heavy reliance on the Internet ecosystem.

One day, I may apply to a new-age tech shop not related to the Internet, but the tech questions are the same. Therefore, I may have to keep my definition of web2.0 fluid and vague.

competitive strengthS offer different$values #speedCod`/math

— update:

if you are fast at coding, your skill is easily recognized and valued! Depth of market
if you are good at cooking, your skill is easily recognized. Depth of market
if you are good at stock or FX trading?
if you are good at gadgets? Better get hired at a profitable firm
if you are good at GUI design?
if you are great at sports? Very few can make it to the professional league
if you are great at writing (Bo Rong?)
if you are great at musical instruments or singing?
if you are great at drawing?
if you are great at public speaking? Very few can make a living
if you are great at teaching kids? Singapore private tuition centers would be good

—-

  • competitive strength in speed coding contests — such contests are now more prevalent and the skills are more valued
  • competitive strength in dStruct/algo beyond the classics
  • competitive strength in core cpp QQ
  • competitive strength in core java QQ — bigger job market than cpp
  • competitive strength in socket QQ
  • competitive strength in SQL QQ (and perl GTD) — better than swing
  • competitive strength in math before college level — huge and long-term impact
  • competence in localSys — no long-term impacts, so Ashish’s t-investment is unwise
  • improvement in yoga and fitness

In each “competitive” case, you build up competitive strength over H years but may lose it in K (could be long) years. Each strength has very different long-term impacts and long-term value, not “zero” (transient) as we sometimes perceived them to be.

Any valuable resource (including a lucrative job) is scarce and invites competition. A competitive strength (in any one of these domains) has long term impact on my health, mental aging, stress level, my job choices, my commute, amount of annual leave.

For a concrete comparison, let’s compare speed coding vs math. In school, math is far more valuable. It enhances your physics, chemistry, economics… There are many math competitions at various levels.  After we turn 30, math, in the form of logical and number-theory quizzes, still plays a small part in some job interviews. However, speed coding strength (am building) now has such an appreciating value on the high-end competitive interview scene.  Over the next 10Y, speed coding will have far more impact on those aspects listed earlier.

However, if you want to invest in building such a strength, beware of huge disappointments. You can say every woman’s natural beauty has imperfections when you see that woman everyday. This is because our idea of perfect beauty is based on paintings and photos, not live humans. Similarly, every endeavor’s ROTI has imperfections compared to our naive, idealized concepts.

If you look for imperfections you will always find some, but such fixation on imperfections is cynical, demoralizing and unproductive research.

We need to zoom into our own strategic strengths + long term interests such as low-level, theoretical stuff, pure algo, dstruct, and avoid our weaknesses.

  • low level or theoretical QQ — my strength
  • low level investigation using tools + codebase — my weakness
  • picking up new GTD challenges — my relative weakness but I did well before joining Wall St.
  • picking up new IV topic — my relative strength

unnoticed gain{SG3jobs: 看破quantDev

All three jobs were java-lite, with some quantDev exposure. Through these jobs, I gained the crucial clarity about the bleak reality of the quantDev career direction. The clarity enabled me to take the bold decision to stop the brave but costly TSN attempts to secure a foothold. Securing a foothold is simply too tough and futile.

Traditional QuantDev in derivative pricing is a shrinking job pool. Poor portability of skills without any standard set of interview topics.

At the same pay, I would now prefer the eq domain over the drv pricing domain, due to mkt depth and job pool.

QuantDev offers no contract roles !

Instead, I successfully established some c#/py/c++ trec. The c++ accu, though incomplete, was especially difficult and precious.

Without this progress, I would be lacking the confidence in py/c#/c++ professional dev that enabled me to work towards and achieve multiple job offers. I would still be stuck in the quantDev direction.

market-depth^elite domains #jxee

I used to dismiss “commodity” skills like market data, risk system, JXEE… I used to prefer high-end specializations like algo-trading, quant-dev, derivative pricers. In reality, average salary is only slightly different and a commodity job can often outpay a specialist job.

As I get older, it makes sense to prefer market depth rather than “elite” (high-end niche) domains. A job market with depth (eg jxee, market-data, risk systems) offers a large number of positions. The typical salary of the top 10% is not very different from the median, i.e. small gaps. In contrast, the elite domains feature bigger gaps. As I grow older, I may need to reconsider the specialist vs generalist-manager choices.

Reminders about this preference (See also the spreadsheet):

  1. stagnation in my orgradient
  2. may or may not use my specialist skills in math, concurrency, algorithms, or SQL …
  3. robust demand
  4. low churn — a critical criterion whenever I mention “market depth”. I don’t like the market depth of javascript and web java.
  5. salary probabilities(distro): mgr^NBA#marketDepth etc

–case study: Algo trading domain

The skillset overlap between HFT vs other algo systems (sell-side, OTC, RFQ, automated pricing/execution..) is questionable. So is “accumulation” across the boundary. There seems to be a formidable “dragon gate”, as in the Chinese saying about the carp that must leap over the dragon gate.

Within c++ based HFT, accumulation is conceivable. Job pool is so small that I worry about market depth. My friend Shanyou agreed that most of the technical requirement is latency. C/C++ latency techniques are different from java.

However, HFT developers seldom need to optimize latency

Outside HFT, the level of sophistication and latency-sensitivity varies. Given the vague definition, there are many (mostly java) jobs related to algo trading i.e. better market depth. Demand is more robust. Less elitist.

rvr(+lvr)usually shows up as function param ONLY

r-value reference is a type, and therefore a compile-time thing, not a runtime thing as far as I know. At runtime, there’s no r-value reference variable, only addresses and pointer objects (8 bytes each on a 64-bit build, 4 bytes on a 32-bit build).

(I believe at runtime there is probably no lvr reference variable either.)

Compiler recognizes the RHS’s type and decides how to bind the RHS object to a variable, be it an rvr-variable, lvr-variable, nonref-variable, or const-lvr-variable.

About the only place I would use a “&&” variable is a function parameter. I don’t think I would ever need to declare a local variable or a field with “&&”.

Do I ever assign an rvr variable as a RHS to something else? Only in one case, as described in [[effModernC++]] P162 and possibly Item 25. This kind of usage is only needed for QQ interviews and never in any job…. never. It’s too tricky and doesn’t buy us anything significant.
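Here is a minimal sketch of the binding rules above, assuming a c++11 compiler. The function names `which`, `pass_along` and `pass_along_moved` are hypothetical, invented for illustration. It shows that the overload is chosen at compile time from the argument's value category, and that a named "&&" parameter behaves as an l-value once inside the function, which is the tricky case when using an rvr variable as a RHS.

```cpp
#include <string>
#include <utility>

// Hypothetical overload pair: the compiler picks one at compile time,
// based on the value category of the argument.
inline std::string which(const std::string&) { return "lvr"; }
inline std::string which(std::string&&)      { return "rvr"; }

// Inside the function body the named parameter `s` is an l-value,
// even though its declared type is an r-value reference.
inline std::string pass_along(std::string&& s) {
    return which(s);             // selects the const-lvr overload
}

// std::move casts `s` back to an r-value, so the rvr overload is selected.
inline std::string pass_along_moved(std::string&& s) {
    return which(std::move(s));
}
```

Calling `which(std::string("tmp"))` binds the temporary to the "&&" parameter, while `which(named)` on a named variable binds to the const-lvr parameter.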

hands-on dev beats mgr @same pay

BA, project mgr, even mid-level managers in some companies can earn the same 160k salary as a typical “developer” role. For a manager in finance IT, salary is often higher, but for a statistically meaningful comparison I will use a 160k benchmark. Note in finance IT or tech firms, 160k is not high but on main street many developer positions pay below 160k.

As stated in other blogposts, at the same salary, developers enjoy higher mobility, more choices, higher career security…

Leetcode speed-coding contest #Rahul

  • don’t look at ranking
  • yoga — I CAN keep up this practice. This practice is good for my mental health and family well-being
  • yoga — I feel even if I don’t improve visibly, the fact that my participation count is increasing means I’m improving
  • if I don’t do the contest then I may not do any coding drill at all
  • What if I give up after one or a few episodes?
  • impact on family well-being?
  • leverage over long term?

In [[who moved my cheese]], we discover the world has changed. The harsh reality is, in this profession, your experience (like doctors) is heavily discounted. Your current runtime performance is easily benchmarked, just like a painter, or pianist, or chef.

wasm: a distant threat to javascript

https://medium.com/javascript-scene/what-is-webassembly-the-dawn-of-a-new-era-61256ec5a8f6 is a good WebAssembly intro.

  • wasm runs in the browser, like javascript, but as a low-level bytecode that other languages compile to
  • wasm offers low-level (web-ASSEMBLY) programming constructs to complement javascript
  • I feel wasm will only be considered by extreme-performance browser apps. It’s too low-level, too inconvenient to be popular.
  • How many percent market share will javascript lose to wasm? 0.1% to 0.5%
  • The fact that wasm is cited as the main challenger to javascript means javascript is unchallenged as the dominant choice on the browser

real-world treeNode has uplink

I think only in contrived interview questions do we find tree nodes without uplink.

Uplink basically means every edge is bidirectional. With uplinks, from any node we can easily trace to the ancestor nodes.

In real world trees, uplink is cheap-yet-valuable most of the time. AVL tree node has uplink, but let’s look at RBTree:

  • RBTree needs uplink for tree rotation.
  • Also, I believe when you give an insertion hint, the “engine” needs to validate the hinted node, by checking parent + another ancestor.
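A minimal sketch of a tree node with an uplink, to show how cheap the ancestor trace becomes. The `Node` type and the helpers `attach` and `ancestors` are hypothetical names for illustration, not taken from any library implementation of RBTree or AVL.

```cpp
#include <vector>

// Hypothetical node type: the parent pointer is the "uplink".
struct Node {
    int key;
    Node* parent = nullptr;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Hook `child` under `parent` on the chosen side and set the uplink,
// so the edge can be walked in both directions.
inline void attach(Node& parent, Node& child, bool as_left) {
    (as_left ? parent.left : parent.right) = &child;
    child.parent = &parent;
}

// Follow the uplinks from a node to the root, collecting ancestor keys
// (nearest ancestor first). No search from the root is needed.
inline std::vector<int> ancestors(const Node* n) {
    std::vector<int> out;
    for (const Node* p = n->parent; p != nullptr; p = p->parent) {
        out.push_back(p->key);
    }
    return out;
}
```

Without the `parent` field, the same trace would need a fresh search from the root or an explicit stack kept by the caller.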

success@concurrency features] java^c++^c#

I’m biased towards java.

I feel c# concurrency is less impactful because most of the important concurrent systems use *nix servers not windows, and most concurrent programming jobs do not go to windows developers.

Outside windows, c++ concurrency is mostly based on the C library pthreads, non-OO and inconvenient compared to java/c#

The c++11 thread classes are the next generation following pthreads, but not widely used.

Java’s concurrency support is the most successful among languages, well-designed from the beginning and rather stable. Its core is much simpler than the c++11 thread classes, essentially just the Thread and Runnable types. More than half the java interviews would ask about threading, because java threading is understandable and usable by the average programmer, who can’t really understand c++ concurrency features.

c++complexity≅30% above java #c#=in_between

Numbers are just gut feelings, not based on any measurement. I often feel “300% more complexity” but it’s nicer to say 30% 🙂

  • in terms of interview questions, I have already addressed in numerous blog posts.
  • see also mkt value@deep-insight: java imt c++
  • — tool chain complexity in compiler+optimizer+linker… The c++ compiler is 200% to 400% (not merely 30%) more complex than java… see my blogpost on buildQuirks. Here are some examples:
  • undefined behaviors … see my blogposts on iterator invalidation
  • RVO — top example of optimizer frustrating anyone hoping to verify basic move-semantics.
  • See my blogpost on gdb stepping through optimized code
  • See my blogpost on implicit
  • — syntax — c++ >> c# > java
  • java is very very clean yet powerful 😦
  • C++ has too many variations, about 100% more than c# and 300% more than java
  • — core language details required for GTD:
  • my personal experience shows me c++ errors are more low-level.
  • Java runtime problems tend to be related to the (complex) packages you adopt from the ecosystem. They often use reflection.
  • JVM offers many runtime instrumentation tools, because JVM is an abstract, simplified machine.
  • — opacity — c++ > c# > java
  • dotnet IL bytecode is very readable. Many authors reference it.
  • java is even cleaner than c#. Very few surprises.
  • — more low-level — c++ > c# > java.
  • JVM is an excellent abstraction, probably the best in the world. C# CLR is not as good as JVM. A thin layer above the windows OS.
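To illustrate the RVO bullet above, here is a minimal sketch of the kind of experiment that gets frustrated by the optimizer. The `Tracked` type and its counters are hypothetical. The move count observed is an assumption about your compiler and flags: it is usually 0 when the move is elided, and can be 1 or 2 when elision is disabled on an older language standard.

```cpp
// Hypothetical instrumented type: counts copy and move constructions.
struct Tracked {
    static int moves;
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::moves = 0;
int Tracked::copies = 0;

// Returning a named local is an NRVO candidate: the compiler is allowed
// to construct `t` directly in the caller's storage and skip the move.
inline Tracked make_tracked() {
    Tracked t;
    return t;
}

// Reports how many move constructions one call actually triggered.
// The answer depends on the compiler and its flags, not on the source.
inline int moves_for_one_call() {
    const int before = Tracked::moves;
    Tracked t = make_tracked();
    (void)t;
    return Tracked::moves - before;
}
```

So a learner who expects to "see" the move constructor run will often see nothing at all, even though the code is a textbook use of move semantics.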

short-term ROTI ] java IV: tuning imt threading

Analogy — physics vs chemistry in high-school — physics problems involve more logic and are more abstract; chemistry is more factual. Similarly, Java threading challenges are more complex. Java tuning knowledge-pearls are relatively easy to memorize and show off.

“Short-term” here means 3-5Y horizon.

I’m many times more successful with java threading– accu, deepening knowledge; lead over the pack.

I hope to build some minor zbs in JVM tuning, mostly GC tuning. However, many important parts of JVM tuning skills suffer from churn and poor shelf life.

consolidate into single-process: low-latency OMS

Low-latency designs are not highly distributed. Scale up rather than scale out? Scalability isn’t a priority.

Example #2– In a traditional sell-side OMS, a client FIX order propagates through at least 3 machines in a chain —

  1. client order gateway
  2. main OMS engine
  3. exchange gateway such as smart order router or benchmark execution engine supporting VWAP etc

The faster version consolidates all-of-the-above into a single Process, cutting latency from 2ms to 150 micros .. latency in eq OMS

Example #1– In 2009 I read about or heard from interviewers about single-JVM designs to replace multi-stage architecture.

Q: why is this technique not used on west coast or main street ?
%%A: I feel on west coast throughput outweighs latency. So scale-out is the hot favorite. Single-JVM is scale-up.

Example — MS fastest colocation OMS (proxima?)

Example — MLP Client-direct EPA absorbs OMS into the FIX gateway
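To make the consolidation idea concrete, here is a minimal c++ sketch, not taken from any real OMS. The `Order` record, the three stage functions and `handle` are all hypothetical names. It only shows the structural point: once the three stages live in one process, the chain becomes plain function calls with no serialization or network hop in between.

```cpp
#include <string>

// Hypothetical order record; field names are illustrative only.
struct Order {
    std::string symbol;
    int qty;
    long id;
};

// Stage 1: client order gateway. Rejects obviously bad orders.
inline bool gateway_accept(const Order& o) {
    return !o.symbol.empty() && o.qty > 0;
}

// Stage 2: main OMS engine. Assigns an order id from a caller-owned counter.
inline void oms_book(Order& o, long& next_id) {
    o.id = next_id++;
}

// Stage 3: exchange gateway. Encodes the outbound message as a string.
inline std::string exchange_encode(const Order& o) {
    return std::to_string(o.id) + ":" + o.symbol + ":" + std::to_string(o.qty);
}

// The whole chain is three in-process calls on the calling thread.
inline std::string handle(Order o, long& next_id) {
    if (!gateway_accept(o)) return "REJECT";
    oms_book(o, next_id);
    return exchange_encode(o);
}
```

In the multi-machine design each arrow between stages would be a message on the wire; here it is a call and a return, which is where the latency saving would come from.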

decline@quant-dev domain: low value-add

Q: why were drv pricing quant jobs once more abundant and better paid, even though the economic value-add of the quant skill has always been questionable?
%%A: because the traders made money for a few years and subsequently received a big budget. Now budget is lower.

Same can happen to quant funds in NY, Chicago, HK…

Some UChicago lecturer once commented that everyone should get some quant training .. (almost as every high school student does math). But I think it is not necessary….

I am more theoretical and was naturally attracted to this domain but the value-add reality soon prevailed in my case.

Tech shops love algo challenges and speed-coding, which have dubious value-add similar to the quant skills. However, the hot money in the tech sector is bigger and lasts longer.

short coding IV: ibank^HFT^webShop

See also prevalence@speed coding among coding-IV

  • “web Shop” includes all west-coast type of hiring teams including China shops
  • Ibanks are mostly interested in language knowledge, so they use coding test (including multithreaded) for that purpose.  Ibanks also use simpler coding questions for basic screening.
  • HFT shops (and bbg) have an unpredictable mix of focuses

I predict that Wall St will Not adopt the west coast practice due to .. poor test-coverage — Not testing language knowledge, latency engineering, threading etc.

The table below excludes take-home tests and includes hackerrank, white-board, webex, and by-phone algo questions.

| | ibanks | web2.0 Shops | HFT |
|---|---|---|---|
| language knowledge; latency; NOT bigO | #1 focus | Lglp | key focus, not #1 |
| simple/medium speed coding | rare | #1 focus. High bar | common |
| medium/simple algo without implementation [1] | common #SCB. minimum bar is low | common | key focus, not #1 |
| tough algo | never | sometimes | rare |
| big-O of your solution, NOT latency | LGlp. low bar | #2 focus. high bar | LGlp |
| concurrency coding test | G5 focus | Lglp #python/ruby/js | sometimes |

[1] the harder problems often become algo-only, without implementation. Algo-only is also logistically easiest => popular

## marketable syntax nlg: c++ > j/c#

Every language has poorly understood syntax rules, but only in c++ have these become fashionable, and halos in job interviews!

  • ADL
  • CRTP
  • SFINAE
  • double pointers
  • hacks involving void pointers
  • operator overloading to make smart ptr look like original pointers
  • TMP hacks using typedef
  • TMP hacks using non-type template param
  • universal reference vs rvr
  • rval: naturally occurring vs moved
    • const ref to extend lifetime of a naturally occurring rval object
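As an illustration of the last item in the list, here is a minimal sketch of a const ref extending the lifetime of a naturally occurring rval object. The `Probe` type and its `live` counter are hypothetical, added only to make the lifetime observable.

```cpp
// Hypothetical instrumented type: tracks how many instances are alive.
struct Probe {
    static int live;
    Probe() { ++live; }
    Probe(const Probe&) { ++live; }
    ~Probe() { --live; }
};
int Probe::live = 0;

// Returns a temporary, i.e. a naturally occurring rval.
inline Probe make_probe() { return Probe(); }

// Binding the temporary to a const ref keeps it alive until `ref`
// goes out of scope, so exactly one Probe is alive at the return statement.
inline int live_while_bound() {
    const Probe& ref = make_probe();
    (void)ref;
    return Probe::live;
}

// With no reference bound, the temporary dies at the end of the
// full expression, so none is alive on the next line.
inline int live_after_discard() {
    make_probe();
    return Probe::live;
}
```

`live_while_bound()` returns 1 and `live_after_discard()` returns 0, and the count is back to 0 once either function has returned.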

new c++features=mostly4lib developers

In C++0x, half of the major new language features are designed for the standard library developers.

  • The unspoken assumption — these features are equally useful to other library developers.
  • The follow-up assumption — app developers also need to use some (and understand all) of the nitty-gritty details.

In reality, these nitty-gritty details are Not relevant to GTD for app developers.

c++ecosystem[def]questions are tough #DeepakCM

C++ interviewers may demand <del>c++ecosystem knowledge</del> but java also has its own ecosystem like add-on packages.

As I told my friend and fellow c++ developer Deepak CM,

  1. c++ecosystem QQ questions can be more obscure and tougher than core c++ questions
    • tool chain — compiler, linker, debugger, preprocessor
    • IPC, socket, pthreads and other C-level system libraries
    • kernel interface — signals, interrupts, timers, device drivers, virtual memory + system programming in general # see the blog category
    • processor cache tuning
    • (at a higher level) boost, design patterns, CORBA, xml
    • cross-language integration with python, R, pyp, Fortran + other languages
  2. java ecosystem QQ questions are easier than core java questions. In other words, toughest java QQ questions are core java.
    • java ecosystem questions are often come-n-go, high-churn

Low level topics are tough

  1. c++ ecosystem questions are mostly in C and very low-level
  2. java ecosystem questions are usually high-level
    • JVM internals, GC … are low-level and core java

 

competitiveness: Pre-teen sprinter + HFT selectivity

I was a decent sprinter at age 6 through 13, then I realized I was not good enough to compete at Beijing municipal level.

There are quite a number of HFT shops in Singapore. I think they have a higher bar than ibanks. At ibank interviews I felt again like that pre-teen sprinter, but HFT interview is generally too hard for me

c++changed more than coreJava: QQ perspective

Recap — A QQ topic is defined as a “hard interview topic that’s never needed in projects”.

Background — I used to feel as new versions of an old language get adopted, the QQ interview topics don’t change much. I can see that in java7, c#, perl6, python3.

To my surprise, compared to java7/8, c++0x has more disruptive impact on QQ questions. Why? Here are my guesses:

  • Reason: low-level —- c++ is more low-level than java at least in terms of interview topics. Both java8 and c++0x introduced many low-level changes, but the java interviewers don’t care that much.
  • Reason: performance —- c++0x changes have performance impact esp. latency impact, which is the hot focus of my target c++ employers. In contrast, java8 doesn’t have much performance impact, and java employers are less latency-sensitive.
  • Reason: template  —- c++0x feature set has a disproportionate amount of TMP features which are very hard. No such “big rock” in java.
    • move/forward, enable_if, type traits
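A small taste of the TMP “big rock”: a minimal sketch combining enable_if with type traits, using only standard c++11 facilities. The function name `describe` is hypothetical. Each overload is visible only for the types its trait accepts, so overload resolution never sees the other one.

```cpp
#include <string>
#include <type_traits>

// Participates in overload resolution only when T is an integral type;
// for any other T, substitution fails and this overload silently drops out.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
describe(T) {
    return "integral";
}

// Participates only when T is a floating-point type.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
describe(T) {
    return "floating";
}
```

`describe(42)` and `describe('c')` pick the first overload, `describe(3.14)` picks the second, and a call such as `describe("text")` would fail to compile because neither overload survives substitution.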

Q: if that’s the case, for my career longevity, is c++ a better domain than java?
A: I’m still biased in favor of low-level languages

Q: is that a form of technology churn?
A: yes since the c++11 QQ topics are likely to show up less over the years, replaced by newer features.

%%geek profile cf 200x era, thanks2tsn

Until my early 30’s I was determined to stick to perl, php, javascript, mysql, http [2] … the lighter, more modern technologies and avoided [1] the traditional enterprise technologies like java/c++/c#/SQL/MOM/Corba . As a result, my rating in the “body-building contest” was rather low.

Like assembly programming, I thought the “hard” (hardware-friendly) languages were giving way to easier, “productivity” languages in the Internet era. Who would care about a few microsec? Wrong…. The harder languages still dominate high-end jobs.

Analogy?

* An electronics engineering graduate stuck in a small, unsuccessful wafer fab
* An uneducated pretty girl unable to speak well, dress well.

Today (2017) my resume features java/c++/py + algo trading, quant, latency … and I have some accumulated insight on core c++/c#, SQL, sockets, connectivity, ..

[1] See also fear@large codebase
[2] To my surprise, some of these lighter technologies became enterprise —

  1. linux
  2. python
  3. javascript GUI
  4. http intranet apps

finIT: strong coreJava candd=most mobile

Jxee guys face real barriers when breaking into core java. Same /scene/ when a c++ interviewer /grills/ a java candidate. These interviewers view their technical context as a level deeper and (for the skill level) a level higher. I have seen and noticed this perception many times.

Things are different in reverse — core java guys can move into jxee with relatively small effort. When jxee positions grow fast, the hiring teams often lower their requirements and take in core java guys. Fundamentally (as some would agree with me) the jxee add-on packages are not hard otherwise they won’t be popular.

Luckily for java guys, Javaland now has the most jobs and often well-paying jobs but ..

  1. c# guys can’t move into the huge java market. Ellen has seen many
  2. c++ is a harder language than java, but a disproportionate percentage (80%?) of c++ guys face a real entry barrier when breaking into javaland.
  3. I wonder why..
  • I have heard of many c++ and c# veterans complain about the huge ecosystem in javaland.
  • I have spoken to several c++/c# friends. I think they don’t sink their teeth in java. Also a typical guy doesn’t want to abandon his traditional stronghold, even though in reality the stronghold is shrinking.
  • Age is a key factor. After you have gone through not only many years but also a tough accumulation phase on a steep learning curve, you are possibly afraid of going through the same again.

[20] java≠a natural choice 4 latency #DQH

I think java could deliver similar latency numbers to c/c++, but the essential techniques are probably unnatural to java:

  • STM — Really low latency systems should use single-threaded mode. STM is widely used and well proven. Concurrency is the biggest advantage of java but unfortunately not effective in serious latency engineering.
  • DAM — (dynamically allocated memory) needs strict control, but DAM usage permeates mainstream java.
  • arrays — Latency engineering favors contiguous data structures i.e. arrays, rather than object graphs including hash tables, lists, trees, or arrays of heap pointers. C pointers were designed based on tight integration with arrays, and subsequent languages have all moved away from arrays. Programming with raw arrays in java is unnatural.
    • struct — Data structures in C have a second dimension beside arrays, namely structs. Like arrays, structs are very compact, wasting no memory and can live on heap or non-heap. In java, this would translate to a class with only primitive fields. Such a class is unnatural in java.
  • GC — Low latency doesn’t like a garbage collector thread that can relocate objects. I don’t feel confident discussing this topic, but I feel GC is a handicap in the latency race. Suppressing GC is unnatural for a GC language like java.
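For the arrays and struct points, here is a minimal c++ sketch of the layout that latency engineering favors. The `Quote` record and its fields are hypothetical. The size check assumes a mainstream platform where the three fields pack into 16 bytes with no padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record with primitive fields only: no pointers, no vtable.
struct Quote {
    std::int64_t px;     // price in ticks
    std::int32_t qty;
    std::int32_t venue;
};

// Assumption: 8 + 4 + 4 bytes with no padding on mainstream platforms.
static_assert(sizeof(Quote) == 16, "unexpected padding in Quote");

// Neighbouring elements sit back to back in one block of memory, so the
// byte distance between element 0 and element 1 equals sizeof(Quote).
// Requires at least two elements.
inline std::ptrdiff_t stride_bytes(const std::vector<Quote>& v) {
    return reinterpret_cast<const char*>(&v[1]) -
           reinterpret_cast<const char*>(&v[0]);
}

// A linear scan over the block touches memory sequentially, which is
// friendly to the d-cache; no pointer is chased per element.
inline std::int64_t total_qty(const std::vector<Quote>& v) {
    std::int64_t sum = 0;
    for (const Quote& q : v) sum += q.qty;
    return sum;
}
```

The mainstream java equivalent would be a list of references to separately allocated objects, which is exactly the "array of heap pointers" shape described above.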

My friend Qihao commented —

There are more management barriers than technical barriers towards low latency java. One common example is with “suppressing gc is unnatural”.

Most c++guys have no insight in STL

When I recall my STL interviews on Wall St, it’s clear that the majority of c++ app developers use STL almost like a black box, every year for 5-10 years. Only 1% have the level of insight of Henry Wu.

Reason? such insight is never needed on the job. This factor also explains my advantage in QQ interviews.

  • Analogy — Most of us live in a house without civil engineering insight.

STL uses many TMP and memory techniques. Some may say STL is simple but I doubt they have even scratched surface on any of the advanced topics.

  • Analogy — some may say VWAP execution algo is simple.

Many interview questions drill in on one or two essential STL functionalities (which everyone would have used), just below the surface. These questions filter out the majority[1] of candidates. Therein I see an opportunity — You can pick one or two topics you like, and grow an edge and a halo, just as Henry did. Intellectual curiosity vs intellectual laziness. Being content with a shallow understanding, swallowing things whole without chewing: I see it in many self-taught developers. That’s exactly what these Wall St interview questions are designed to identify and screen out.

How about west coast and other high-end tech interviews? I’m not sure.

[1] sometimes 90%, sometimes 60%.

 

77 c++IV paperTigers

Avoid spending too much time on this list…. These c++ topics appeared non-trivial (perhaps daunting, intimidating) for years, until I started cracking the QnA interviews. Then I realized in-depth expertise isn’t required, so Venkat can impress interviewers with his Wikipedia knowledge. In contrast, DQH would run complicated coding experiments to build insights.

  1. make_shared, enable_shared_from_this
  2. … these are some new items to be sorted…
  3. — real tigers i.e. non-trivial nlg is quizzed
  4. [A] CRTP (SFINAE is worse) — real tiger. I got asked about these around 5 times, sometimes in-depth
  5. socket: non-blocking
  6. std::forward()
  7. — Now the paper tigers
  8. open source or commercial instrumentation for memory, dependency instrumentation. See blogpost to Ashish
  9. [s] what debuggers and memory leak detectors are you familiar with?
  10. [a] singleton, factory, pimpl and other design patterns
  11. —sockets # many more paper tigers to be listed
  12. udp, multicast, select()
  13. [s] socket buffers
  14. [a] byte alignment
  15. endianness
  16. TCP flow control
  17. TCP handshake and disconnect
  18. ack numbers
  19. partial send
  20. close() vs shutdown()
  21. — STL # many more paper tigers to be listed
  22. [s] STL binders
  23. [s] STL allocators
  24. adapters for containers, iterators and functors
  25. [a] iterator invalidation rules
  26. [s] how is deque different from a vector
  27. RBtree
  28. —concurrency # many more paper tigers to be listed
  29. [A] atomic types
  30. pthread functions
  31. [A] IPC mutex
  32. mutex in shared memory
  33. recursive lock
  34. read-write lock
  35. what if a thread dies while holding a lock?
  36. [s] RAII scoped lock
  37. — multi-file build
  38. forward class declarations (as required in pimpl) and their limitations
  39. [s] C/C++ integration, extern-C — heavily quizzed, but no in-depth
  40. [s] what C features are not supported by c++ compiler
  41. circular dependency between libraries — confusing. real tiger but seldom quizzed
  42. [As] shared lib vs static lib
  43. —integration and data exchange
  44. [A] shared memory
  45. [A] CORBA, RPC
  46. [a] serialization in compact wire format — only primitive data types!
  47. [s] OS system calls vs std library (glibc) — sounds daunting to most developers
  48. —exception
  49. catch by ref or by value or by pointer?
  50. [s] exception guarantees
  51. [s] stack unwinding due to exception
  52. throwing destructors — various questions
  53. —memory
  54. which part of memory do static data members go? How about file scope static variables? How about global variables
  55. [s] preventing heap/stack allocation of my class
  56. [s] custom new/delete,  set_new_handler()
  57. [s] intrusive smart ptr, weak_ptr
  58. [sA] ref counting
  59. [sA] union, inherited from C
  60. custom deleter in shared_ptr
  61. [s] reinterpret_cast  # always on pointers
  62. [A] custom allocators
  63. [A] free list in the free store
  64. what if you call delete on a pointer that’s created by array-new?
  65. placement new
  66. out-of-memory in operator-new
  67. —inheritance
  68. dynamic_cast, dynamic_pointer_cast
  69. [A] multiple inheritance
  70. [s] virtual inheritance… which base class ctor gets called first? See https://isocpp.org/wiki/faq/multiple-inheritance#mi-vi-ctor-order
  71. [a] slicing problem
  72. [a] private inheritance
  73. [s] pure virtual
  74. —other low level topics
  75. [s] setjmp, jmp_buf… See the dedicated blog post jmp_buf/setjmp() basics for IV #ANSI-C
  76. [s] cpu cache levels
  77. [s] translation lookaside buffer
  78. [s] what data structures are cache-friendly?
  79. [a] memcpy, memset
  80. [s] ++myItr vs myItr++ how are they implemented differently?
  81. —other language features
  82. [s] RAII
  83. [s] operator-overload
  84. [A] template specialization — part of the STL fabric but largely transparent
  85. [s] ptr to member (function) — seldom used outside library code. I tried the syntax in my binary tree serializer
  86. [A] std::forward() std::move(), rvalue-ref
  87. const and constexpr
  88. [a] lambda with capture
  89. [a] double-pointer .. why do you need it?
  90. —-
  91. [s == shallow book knowledge is enough]
  92. [a == actually not that deep, IMHO]
  93. [A == actually deep topic]

7 clusters@HFT-c++IV questions

Every single HFT interview question is about low-latency. Furthermore, the ibank algo-trading interviews revolve around the same clusters.

Even though I’m not aiming for HFT jobs, these topics are still very relevant to ibank and the “3rd type” of c++ shops.

  1. socket — lots of details as Shanyou and I agreed
  2. template meta-programming — deep topic but never quizzed in-depth beyond “tricks”
  3. move-semantics

— Themes are less visible in these clusters:

  1. pthreads and c++ threading but I seldom get c++11 question here
  2. STL container internals, mostly shared_ptr, raw array, vector, RBTree, and hashmap
  3. (bag of tricks) memory optimization techniques using allocators, cache-optimization, malloc(), placement-new, object-pool, memset
  4. miscellaneous core OO features like big4, virtual, MI, pbref/pbval

— other HFT topics are dispersed/scattered, not showing any strong central theme

  1. shared memory
  2. linux system calls
  3. compiler details
  4. selected boost libraries

reasons to limit tcost@SG job hunt #XR

XR said a few times that it is too time consuming each time to prepare for job interviews. The 3 or 4 months he spent has no long-term value. I immediately voiced my disagreement because I took IV fitness training as a lifelong mission, just like jogging or yoga or chin-up.

This view remains my fundamental perspective, but my disposable time is limited. If I can save the time and spend it on some meaningful endeavors [1] then it’s better to have a shorter job hunt.

[1] Q: what endeavors?
A: yoga
A: diet
A: stocks? takes very little effort
A: ?

throughput^latency #wiki

High bandwidth often means high-latency:( .. see also linux tcp buffer^AWS tuning params

  • RTS is throughput driven, not latency-driven.
  • Twitter/FB fanout is probably throughput-driven, not latency-driven
  • I feel MOM is often throughput-driven and introduces latency.
  • I feel HFT OMS like in Mvea is latency-driven. There are probably millions of small orders, many of them cancelled.

https://en.wikipedia.org/wiki/Network_performance#Examples_of_latency_or_throughput_dominated_systems shows

  • satellite is high-latency, regardless of throughput
  • offline data transfer (by trucks) has poor latency but excellent throughput

highest leverage: localSys^4beatFronts #short-term

Q: For the 2018 landscape, what t-investments promise the highest leverage and impact?

  1. delivery on projects + local sys know-how
  2. pure algo (no real coding) — probably has the highest leverage over the mid-term (like 1-5Y)
  3. QQ
  4. –LG2
  5. portable GTD+zbs irrelevant for IV
  6. obscure QQ topics
  7. ECT+syntax — big room for improvement for timed IDE tests only, not relevant to web2.0 onsite interviews.
  8. Best practices — room for improvement for weekend IDE tests only, not relevant to web2.0 shops.

average O(): hashtable more IMperfect than qsort

In these celebrated algorithms, we basically accept the average complexity as if they were very likely in practice. Naive…

In comp science problems, hash table’s usage and importance is about 10 times higher than qsort

  • I would say qsort is faster than many O(N logN) sorts. Qsort can use random pivot. It degrades only if extremely “lucky;)” like getting a “6” on all ten dice.
  • In contrast, hash table performance depends mostly on programmer skill in designing the hash function, less on luck.

Performance compared to the alternatives — qsort competitive performance is pretty good in practice, but hash table relative performance is often underwhelming compared to red-black trees or AVL trees in practice. Recall RTS.
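The claim that hash table performance hinges on the hash function can be illustrated with a small C++ sketch. `ConstantHash` and the helper names are hypothetical, and the constant hash is a deliberately extreme case rather than anything seen in production.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Deliberately poor hash: every key maps to the same value, hence the same bucket.
struct ConstantHash {
    std::size_t operator()(int) const noexcept { return 42; }
};

// Size of the fullest bucket. A lookup in that bucket is a linear scan,
// so this number is the worst-case lookup cost for the set.
template <typename Set>
std::size_t longestBucket(const Set& s) {
    std::size_t worst = 0;
    for (std::size_t b = 0; b < s.bucket_count(); ++b) {
        if (s.bucket_size(b) > worst) worst = s.bucket_size(b);
    }
    return worst;
}

// Insert the keys 0..n-1 using the given hash and report the fullest bucket.
template <typename Hash>
std::size_t longestBucketFor(int n) {
    std::unordered_set<int, Hash> s;
    for (int i = 0; i < n; ++i) s.insert(i);
    return longestBucket(s);
}
```

With `ConstantHash` all n keys pile into one bucket and every lookup degrades to O(N). With `std::hash<int>` the keys spread out. No luck is involved, only the quality of the hash function.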

##Y c++IV improved much faster]U.S.than SG #insight{SCB breakthru

Hi XR,

I received 9 c++ offers since Mar 2017, mostly from U.S. In contrast, over the 4.5 years I spent in Singapore, I received only 3 c++ offers including a 90% offer from HFT firm WorldQuant (c++ job but not hardcore).

  1. Reason: buy-side employers — too picky. Most of the Singapore c++ jobs I tried are buy-side jobs. Many of the teams are not seriously hiring and only wanted rock stars.
    • In contrast, Since 2010 I tried about 6 Singapore ibank c++ jobs (Citi, Barclays, Macquarie, Standard Chartered Bank) and had much better technical wins than at buy-side interviews.
  2. Reason: Far fewer c++ jobs than in U.S.
  3. Reason: employee — I was always an employee while in Singapore and dared not attend frequent interviews.
  4. Reason: my c++ jobs in the U.S. are more mainstream so I had more opportunities to experiment on mainstream c++ interview topics. Experiments built up my confidence and depth.
  5. Reason: I had much more personal time to study and practice coding. This factor alone is not decisive. Without the real interviews, I would mostly waste my personal time.

Conclusion — availability of reasonable interview opportunities is a surprisingly oversize factor for my visible progress.

By the way, Henry Wu (whom I told you about) had more successful c++ interviews. He joined WorldQuant and Bloomberg, two companies who didn’t take me up even after my technical wins.

Q: passive income ⇒ reduce GTD pressure#positive stress

See also 3stressors: FOMO^PIP^ livelihood[def]

My (growing) Passive income does reduce cash flow pressure… but it has no effect so far on my work GTD pressure.

Q: Anything more effective more practical?

  1. take more frequent unpaid leaves, to exercise, blog or visit family
  2. expensive gym membership

How about a lower salary job (key: low caliber team)? No I still want some challenge some engagement, some uphill, some positive stress.

mgr|risk| age-unfriendly job mkt cf contractor

Statistically, very few IT managers can maintain the income level beyond age 55.

I believe those younger managers in 30’s and 40’s are often more competitive and more hungry (ambitious), more capable at least in terms of tech learning.

Even if you are above average as a manager, the chance of rising up is statistically slim and you end up contending against the younger, hungrier, /up-and-coming/ rising stars.

low-latency: avoid concurrency #ST-mode

Backgrounder — CPU speed is increasing more gradually than before. The technology industry as a whole is advancing more horizontally — increasing parallelism. Yet the best designs don’t use lock-free or concurrency at all.

I asked Martin Thompson — To really push the limit of latency, should we avoid concurrency as much as possible, completely eliminating it if possible? Answer is yes. Martin pointed out the difference between

  • parallel design — use multitasking, in ST-mode, or “do multiple things at the same time”
  • concurrent design — deal with multitasking, or “deal with multiple things at the same time“. The expression “deal with” implies complexities, hazards, risks, control, management.

One of the hidden hazards Martin pointed out is heap memory de-allocation, but that’s for another blogpost.

Proven GTD: worthless in candidate ranking #JackZ

I feel that Jack Zhang is competent with localSys GTD but weak on c++ and comp science.

Does he have working knowledge of c++? I assume so. Working knowledge is attainable in a couple of months for a clean language, and up to a year for c++

The required level of working knowledge and basic skill is very low for localSys GTD.

His c++ knowledge is probably barely enough to do the job. Remember I didn’t know what things live on java heap vs stack.

Based on my guesstimate, he would fail any c++ interview and any algo interview. He can write simple SQL in an interview, but I am not sure if he can write complex joins.

The fact that Jack and DeepakM are proven on GTD is useless and lost in the conversation.

How about CSY? He can solve many algo problems without practice, but he is reluctant to practice.

I think the self-sense of on-the-job competency is misleading. Many in their positions might feel GTD competency is more important than IV skills. They are so afraid of the benchmark that they don’t want to study for it.

When the topic of tech interview comes up, I think they wish to escape or cover their ears.

tried 3″hard”leetcode Q’s #tests !! 100%

I tried Q4, Q10, Q23.

Observation — they are not really harder in terms of pure algo. I found some “medium” questions actually harder than Q4/Q23 in terms of pure algo.

Beside the algorithm, there are other factors that make a problem hard. For me and my peers, coding speed and syntax are a real problem. So the longer my program, the harder it becomes. Some of the “medium” questions require longer solutions than these “hard” problems.

Logistics of instrumentation is another factor. Some problems are easy to set up and easy to debug, whereas 3D, graph or recursive problems are tedious to set up and often confusing when you try to debug with print’s.

There’s another factor that can make any “medium” problem really hard

pick java if you aspire 2be arch #py,c#

If you want to be architect, you need to pick some domains.

Compared to python.. c#.. cpp, Java appears to be the #1 best language overall for most enterprise applications.

  • Python performance limitations seem to require proprietary extensions. I rarely see pure python server that’s heavy-duty.
  • c# is less proven, less mature. More importantly it doesn’t work well with the #1 platform — linux.
  • cpp is my 2nd pick. Some concerns:
    • much harder to find talents
    • Fewer open-source packages
    • java is one of the cleanest languages. cpp is a blue-collar language, rough around the edges and far more complex.

[18] ##spend more$ to prolong engagement+absorbency

  • increase spend on good mobile data and mobile devices to capture the creativity, absorbency …
  • increase spend on printer
  • increase spend on hotel stay near office (or taxi home) to capture the engagement on localSys
  • spend on flights to gain family time, engaged
  • spend unpaid leave to attend interviews .. to gain joy, motivation, engagement, precious insight into their selection priority

Not so sure about …

  • Regrettable — spent unpaid leave before starting Macq job .. to gain peaceful family time? low ROI
  • questionable spend (gave up higher pay rate) to gain … c++ skills like QQ, GTD, zbs

price sensitivities = #1 valuable output of risk-run

[[complete guide]] P433, P437 …

After reading these pages, I can see that per-deal PnL and mark-to-market numbers are essential, but to the risk manager, the most valuable output of the deal-by-deal “risk run” is the family of sensitivities such as delta, gamma, vega, dv01, duration, convexity, correlation to a stock index (which is different from beta), ...

Factor-shocks (stress test?) would probably use the sensitivity numbers too.

In Baml, the sensitivity numbers are known as “risk numbers”. A position has high risk if it has high sensitivity to its main factor (whatever that is.)

“didn’t like my face”: we aren’t his top-favorite #bbg

Hi Deepak,

I now think there’s another reason that SIG, Bloomberg, LiquidNet, CapitalDynamics and other employers didn’t make me an offer even though I probably passed technical screening with a technical win.

In our chats, I used the generic term “didn’t like my face” as an umbrella term for several different factors. Today I want to mention a new factor – “what if this candidate takes my offer and continues to shop around?”

I believe some companies shun that risk. When in doubt, they reject. When they make an offer they want to ensure the candidate will accept. They want to see “Hey we are clearly the favorite in his mind and he is in a hurry. If we make him an offer he will likely accept right away.”

Clearly, I’m not that type of candidate. I often come across as a “job shopper”, through my non-verbal language, or even through my explicit verbal answers. For example, when asked “Why are you looking to change job” I often answer “I’m actually doing fine on my current job but there are better opportunities like the role in your company.”

[19] problems+!published solutions: better4me

Nowadays I feel problems without published solutions, without extensive test cases are better for me, since I don’t worry about … wipe-out, like bloodshed.

-> better avoid Leetcode. Careercup and bbg codecon are fine. Hackerrank?

I used to worry about not knowing correct solutions -> wasting my time. Now I know from extensive experience that coming up with my homemade solutions is good enough and rewarding, even if incorrect/incomplete.

I don’t always need or want to know the correct solution, not every time.

Further, I often wish there’s no known solution, since they always wipe out my satisfaction and self-esteem.

 

[19] new problems!=the best drill

I used to focus on how many new problems solved each week, but that’s very challenging and not necessarily most effective. In contrast, Reviewing existing code is easier i.e. requires lower absorbency, but can still take hours.  Worth the time!

There’s a real risk to overspend time on new problems.

  • We don’t always fully digest the solved problems. No one can remember so well without refresh. Therefore, I am growing stronger than before and stronger than my peers who don’t invest this time to refresh.
  • a bigger risk is burnout, stress, rat race against friends who “solved X new problems in Y weeks”. Reality is, they don’t necessarily perform better or grow stronger.

 

SQL expertise(+tuning)as competitive advantage: losing mkt val

Opening example — I have received very few selective/tough SQL interview questions. The last interview with non-trivial SQL is Lazada.

If you want to rely on SQL/RDBMS skill as a competitive advantage, you will be disappointed.

I think many teams still use SQL, but use it lightly. Simple queries, simple stored proc, small tables, no tuning required. Therefore, interview questions are dumbing down…

I believe I over-invested in SQL. The last job that required any non-trivial SQL was 95G…

strategic value of MOM]tech evolution  is about MOM, but similar things can be said about SQL and RDBMS.

This is a cautionary tale in favor of TrySomethingNew. If my friend Victoria stays within the familiar domain of SQL/Perl/PHP/Apache, then her skillset would slowly lose market value.

Don’t forget — TSN can have low ROTI. We have to accept that possibility

[18]fastest threadsafe queue,minimal synchronization #CSY

I got this question in a 2017 Wells white-board coding interview, and discussed with my friend Shanyou. We hoped to avoid locks and also avoid other synchronization devices such as atomic variables..

Q1: only a single producer thread and a single consumer thread and no other threads.

I put together a java implementation that can enqueue without synchronization, most of the time … See https://wp.me/p74oew-7mE

Q1b: Is it possible to avoid synchronization completely, i.e. single-threaded mode?
A: No. Consumer thread would have absolutely NO idea whatsoever how close it is to the producer end. We need a memory barrier at the very least.
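As a rough illustration of the "memory barrier at the very least" point, here is a C++ sketch of a single-producer, single-consumer ring buffer. It is not the java implementation linked above. The class name `SpscRing` and the helper `transferSum` are hypothetical. It does use atomic indices, because in C++ acquire/release atomics are the standard way to express that barrier without a lock.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical single-producer / single-consumer ring buffer.
// Usable capacity is N-1: one slot stays empty to tell "full" from "empty".
template <typename T, std::size_t N>
class SpscRing {
public:
    // Call from the producer thread only.
    bool tryPush(const T& item) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        slots_[tail] = item;
        tail_.store(next, std::memory_order_release);  // publish the filled slot
        return true;
    }

    // Call from the consumer thread only.
    bool tryPop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = slots_[head];
        head_.store((head + 1) % N, std::memory_order_release);  // hand the slot back
        return true;
    }

private:
    std::array<T, N> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to read, written by the consumer only
    std::atomic<std::size_t> tail_{0};  // next slot to write, written by the producer only
};

// Demo helper: one thread pushes 1..count, another pops and sums.
inline long long transferSum(int count) {
    SpscRing<int, 64> q;
    long long sum = 0;
    std::thread consumer([&] {
        int v = 0;
        for (int received = 0; received < count;) {
            if (q.tryPop(v)) { sum += v; ++received; }
            else std::this_thread::yield();
        }
    });
    for (int i = 1; i <= count; ++i) {
        while (!q.tryPush(i)) std::this_thread::yield();  // wait for free space
    }
    consumer.join();  // join makes the consumer's writes to sum visible here
    return sum;
}
```

Each index has exactly one writer, so no lock and no CAS is needed. The release store on one side paired with the acquire load on the other is what tells the consumer how close it is to the producer end.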

Q2: what if there are multiple producer/consumer threads?

I believe we can use 2 separate locks for the two ends, rather than a global lock. This is more efficient but invites the tricky question “how to detect when the two ends meet“. I am not sure. I just hope the locks enforce a memory barrier.

Alternatively, we could use CAS on both ends, but see lockfree queue #popular IV

 

c++^java..how relevant ] 20Y@@

See [17] j^c++^c# churn/stability…

C++ has survived more than one wave of technology churn. It has lost market share time and time again, but hasn’t /bowed out/. I feel SQL, Unix and shell-scripting are similar survivors.

C++ is by far the most difficult language to use and learn. (You can learn it in 6 months but likely very superficially.) Yet many companies still pick it instead of java, python, ruby — sign of strength.

C is low-level. C++ usage can be equally low-level, but c++ is more complicated than C.

unexpected longevity@FOSS

Conclusion — my tech-bets and investment in many FOSS technologies proved to be correct. In contrast, only a few of my tech bets on commercial software proved correct — MSVS, Oracle, Sybase, Excel+VBA.

I didn’t want to spend too much effort analyzing the forces around FOSS, but to my surprise, those forces keep growing and evolving.

  • Eg: weblogic was once dominant, but left behind by Tomcat and Jboss
  • Eg: Microsoft has to contend with Linux, Java, Apache
  • Eg: Oracle has to keep developing OpenSolaris, and MySQL
  • Eg: IBM, Oracle … have to support Linux
  • Eg: SUN, HP-UX all lost the battle against Linux. SUN has no choice but OpenSolaris
  • Most of them have to face the stiff challenge by a single FOSS — GNU/Linux

Because a FOSS needs no revenue no payroll to stay alive, there’s no survival risk or financial uncertainty in a FOSS project. Therefore, a FOSS often has better longevity.

Some of the most influential, dominant, enduring and low-churn softwares are FOSS and are unlikely to change:

  1. linux, BSD-unix
  2. java and GCC
  3. python, perl, and most scripting languages
  4. most development tools in *nix
  5. many javascript frameworks
  6. many browsers

Q: what forces power the FOSS and provide the energy, momentum?
A: alpha-geeks who want to create an impact and legacy?

Apparently, you need just one (or a few) alpha-geek to create a formidable competitor to a software vendor’s army of developers.

compare%%GTD to single-employer java veterans

Me vs a java guy having only a single long-term java project, who has more zbs (mostly nlg) and GTD power, including

  • performance tuning in thread pool, DB interface, serialization
  • hot-swap and remote debugging
  • JMX
  • tuning java+DB integration

When it comes to QQ and coding test scores, the difference is more visible than it is with GTD/zbs.

Conclusion — over 10 years, your portable GTD power grows too slow if you stick with one (or very few) system.

Am I advocating job hopping? Yes if you want to remain an individual contributor not aiming to move up.

 

Data Specialist #typical job spec

Hi friends,

I am curious about data scientist jobs, given my formal training in financial math and my (limited) work experience in data analysis.

I feel this role is a typical type — a generic “analyst” position in a finance-related firm, with some job functions related to … data (!):

  • some elementary statistics
  • some machine-learning
  • cloud infrastructure
  • some hadoop cluster
  • noSQL data store
  • some data lake
  • relational database query (or design)
  • some data aggregation
  • map-reduce with Hadoop or Spark or Storm
  • some data mining
  • some slice-n-dice
  • data cleansing on a relatively high amount of raw data
  • high-level python and R programming
  • reporting tools ranging from enterprise reporting to smaller desktop reporting software
  • spreadsheet data analysis — most end users still consider spreadsheet the primary user interface

I feel these are indeed elements of data science, but even if we identify a job with 90% of these elements, it may not be a true blue data scientist job. Embarrassingly, I don’t have clear criteria for a real data scientist role (there are precise definitions out there) but I feel “big-data”, “data-analytics” are so vague and so much hot air that many employers would jump on the bandwagon and portray themselves as data science shops.

I worry that after I work on such a job for 2 years, I may not gain a lot of insight or add a lot of value.

———- Forwarded message ———-
Date: 22 May 2017 at 20:40
Subject: Data Specialist – Full Time Position in NYC

Data Specialist– Financial Services – NYC – Full Time

My client is an established financial services consulting company in NYC looking for a Data Specialist. You will be hands on in analyzing and drawing insight from close to 500,000 data points, as well as instrumental in developing best practices to improve the functionality of the data platform and overall capabilities. If you are interested please send an updated copy of your resume and let me know the best time and day to reach you.

Position Overview

As the Data Specialist, you will be tasked with delivering benchmarking and analytic products and services, improving our data and analytical capabilities, analyzing data to identify value-add trends and increasing the efficiency of our platform, a custom-built, SQL-based platform used to store, analyze, and deliver benchmarking data to internal and external constituents.

  • 3-5 years’ experience, financial services and/or payments knowledge is a plus
  • High proficiency in SQL programming
  • High proficiency in Python programming
  • High proficiency in Excel and other Microsoft Office suite products
  • Proficiency with report writing tools – Report Builder experience is a plus

 

churn !! bad ] mktData #socket,FIX,.. unexpected!

I feel the technology churn is remarkably low.

New low-level latency techniques are coming up frequently, but these topics are actually “shallow” and low complexity to the app developer.

  • epoll replacing select()? yes churn, but much less tragic than the stories with swing, perl, structs
  • most of the interview topics are unchanging
  • concurrency? not always needed. If needed, then often fairly simple.

rvalue Object holding a resource : rather rare

I think naturally-occurring rvalue objects  rarely hold a resource.

  • literals — but these objects don’t hold any resources via a heap pointer
  • string1 + “.victor”
  • myInventoryLevel – 5000
  • myVector.push_back(Trade(12345)) — there is actually a temp Trade object. Compiler will call the rvr overload of push_back(). https://github.com/tiger40490/repo1/blob/cpp1/cpp/rvr/rvrDemo_NoCtor.cpp is my investigation. My temp object actually holds a resource via a heap pointer. But this usage scenario is rare in my opinion.

However, if you have a regular nonref variable Connection myConn (“hello”), you can generate a rvr variable:

Connection && rvr2 = std::move(myConn);

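By using std::move(), you promise the compiler that you will no longer rely on the content of myConn. Note std::move() is only a cast. Nothing is transferred until a move ctor or move-assignment receives the result, after which myConn should only be destroyed or re-assigned.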
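A small sketch of what happens to the resource, assuming a hypothetical `Connection` class that owns a heap buffer (the real class is not shown in this post, so every member below is invented for illustration):

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Hypothetical stand-in for the Connection class mentioned above.
class Connection {
public:
    explicit Connection(const char* name) : buf_(new char[std::strlen(name) + 1]) {
        std::strcpy(buf_, name);
    }
    // Move ctor: steal the heap buffer and leave the source empty.
    Connection(Connection&& other) noexcept : buf_(other.buf_) { other.buf_ = nullptr; }
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
    ~Connection() { delete[] buf_; }

    bool holdsResource() const { return buf_ != nullptr; }

private:
    char* buf_;  // the resource held via a heap pointer
};

// Returns true if binding an rvr variable leaves the source untouched
// and only the move ctor actually transfers the buffer.
inline bool moveIsOnlyACast() {
    Connection myConn("hello");
    Connection&& rvr2 = std::move(myConn);             // cast only, nothing moved yet
    const bool stillOwned = myConn.holdsResource();    // expected: true
    Connection taker(std::move(rvr2));                 // rvr2 is a named lvalue, so cast again
    return stillOwned && taker.holdsResource() && !myConn.holdsResource();
}
```

Notice the second std::move(): the variable rvr2 has a name, so it is an lvalue and would not select the move ctor by itself.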

 

 

dominant server-side language@ibank: evolution

Don’t spend too much time.. Based on my limited observations,

  • As of 2007, the top dog was java.
  • The dominance is even stronger in 2018.
  • Q: how about in 10 years?
  • A: I feel java will remain #1

Look at the innovation leaders — West coast. For their (web) server side, they seem to have shifted slightly towards python, javascript, RoR

Q: Why do I consider buy-side, sell-side and other financial tech shops as a whole and why don’t I include google finance?
A: … because there’s mobility between sub-domains within, and entry barrier from outside.

Buy-side tends to use more c++; banks usually favor java; exchanges tend to use … both c++ and java. The latency advantage of c++ isn’t that significant to a major exchange like Nsdq.

 

CV-competition: Sg 10x tougher than U.S.

Sg is much harder, so … I better focus my CV effort on the Sg/HK/China market.

OK U.S. job market is not easy, but statistically, my CV had a reasonable hit rate (like 20% at least) because

  • contract employers don’t worry about my job hopper image
  • contract employers have quick decision making
  • some full time hiring managers are rather quick
  • age…
  • Finally, the number of jobs is so much more than Sg

 

socket^swing: separate(specialized skill)from core lang

  • I always believe swing is a distinct skill from core java. A regular core Java or jxee guy needs a few years experience to become swing veteran.
  • Now I feel socket programming is similarly a distinct skill from core C/c++

In both cases, since the core language knowledge won’t extend to this specialized domain, you need to invest personal time outside work hours .. look at CSY. That’s why we need to be selective which domain.

Socket domain has much better longevity (shelf-life)  than swing!

learn new tech for IV(!!GTD): learn-on-the-job is far from enough

Example — you programmed java for 6+ months, but you scored below 50% on those (basic) java knowledge questions I asked you in skype chat. You only know what to study when you attend interviews. Without interviews, you won’t encounter those topics in your projects.

Example — I used SQL for at least 3 years before I joined Goldman Sachs. Until then I used no outer join no self-join no HAVING clause, no CASE, no correlated sub-query, no index tweaking. These topics were lightly used in Goldman but needed in interviews. So without interviews, I wouldn’t know to pay attention to these topics.

Example — I programmed tcp sockets many times. The socket interview questions I got from 2010 to 2016 were fairly basic. When I came to ICE I looked a bit deeper into our socket codebase but didn’t learn anything in particular. Then my interviews started showing me the direction. Among other things, interviewers look for in-depth understanding of

  • Blocking/non-blocking
  • Fast/slow receivers
  • Buffer overflow
  • Reliability
  • Ack

How the hell can we figure out these are the high-value topics in TCP without interviews? I would say No Way even if I spend 2 years on this job.
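For the blocking/non-blocking item (and the related "partial send" interview topic), a minimal POSIX sketch in C++. It assumes a Linux-like system. The helper names are hypothetical, and the busy retry on EAGAIN is a simplification, since a real system would wait in select() or epoll instead.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Switch a descriptor to non-blocking mode; returns false on failure.
inline bool setNonBlocking(int fd) {
    const int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Keep calling send() until every byte is accepted. A single send() may
// take only part of the buffer (partial send), e.g. when the receiver is slow
// and the socket buffer is nearly full. Returns false on a hard error.
inline bool sendAll(int fd, const char* data, std::size_t len) {
    std::size_t sent = 0;
    while (sent < len) {
        const ssize_t n = send(fd, data + sent, len - sent, 0);
        if (n > 0) {
            sent += static_cast<std::size_t>(n);
            continue;
        }
        // Simplification: retry immediately when the buffer is full or we were interrupted.
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)) continue;
        return false;
    }
    return true;
}

// True when a non-blocking recv() finds nothing to read right now.
// MSG_PEEK leaves any available data in the socket buffer.
inline bool wouldBlockOnRecv(int fd) {
    char byte = 0;
    const ssize_t n = recv(fd, &byte, 1, MSG_PEEK);
    return n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

The same calls apply to a connected TCP socket. On a blocking socket recv() would simply park the thread, whereas here the caller gets EAGAIN and decides what to do next.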

## HFT c++knowhow is distinct like .. swing

I figured out long ago that java swing is a distinct skill, distinct from mainstream java.

Now I feel HFT c++ is also such a skill — distinct from mainstream c++. Interviewers tend to ask questions (almost) irrelevant to mainstream c++. Shanyou said focus is on system-level not application-level. I feel superficial knowledge is often enough.

  1. [u] avoid “virtual” for latency
  2. CPU cache management
  3. translation lookaside buffer
  4. [P] kernel bypass
  5. [uP] socket tuning
  6. [P] memory and socket performance monitoring
  7. shared memory
  8. [u] placement new and custom allocator
  9. system calls
  10. [u] in-lining
  11. [u = used in my systems to some extent]
  12. [P = no programming, pure system knowledge]

–These topics are relevant to mainstream c++ but more relevant to HFT

  • setjmp
  • IPC mutex
  • reference counting

c++QnA interviews(HFT): !! pickier than Facebook

I feel my HFT interviews are very picky on low-level kernel or compiler optimizations or network card engineering, but such QQ knowledge is not widely needed.

Don’t feel inferior to them.

  • Q: what if a HFT programmer goes to a Facebook interview?
  • Q: what if a coding contest champion from Facebook goes to an HFT interview?
  • Q: what if these guys go to a tough logic or probability quiz?

 

coreJava^big-data java job #XR

In the late 2010’s, Wall street java jobs were informally categorized into core-java vs J2EE. Nowadays “J2EE” is replaced by “full-stack” and “big-data”.

The typical core java interview requirements have remained unchanged — collections, threading, JVM tuning, compiler details (including keywords, generics, overriding, reflection, serialization ), …, but relatively few add-on packages.

(With the notable exception of java collections) Those add-on packages are, by definition, not part of the “core” java language. The full-stack and big-data java jobs use plenty of add-on packages. It’s no surprise that these jobs pay on par with core-java jobs. More than 5 years ago J2EE jobs, too, used to pay on par with core-java jobs, and sometimes higher.

My long-standing preference for core-java rests on one observation — churn. The add-on packages tend to have a relatively short shelf-life. They become outdated and lose relevance. I remember some of the add-ons:

  • Hadoop, Spark
  • functional java
  • SOAP, REST
  • GWT
  • NIO
  • Protobuf, json
  • Gemfire, Coherence, …
  • ajax integration
  • JDBC
  • Spring
  • Hibernate, iBatis
  • EJB
  • JMS, Tibco EMS, Solace …
  • XML-related packages (more than 10)
  • Servlet, JSP
  • JVM scripting including scala, groovy, jython, javascript@JVM… (I think none of them ever caught on outside one or two companies.)

None of them is absolutely necessary. I have seen many enterprise java systems using only one or two of these add-on packages.

Q:Just when do App(!! lib)devs write std::move

I feel move ctor (and move-assignment) is extremely implicit and “in-the-fabric”. I don’t know of any user function with a rvr parameter. Such a function is usually in some library. Consequently, in my projects I have not seen any user-level code that shows “std::move(…)”

Let’s look at move ctor. “In the fabric” means it’s mostly rather implicit i.e. invisible. Most of the time move ctor is picked by compiler based on some rules, and I have basically no influence over it.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp shows when I need to call move() but it’s a contrived example — I have some object (holding a resource via heap pointer), I use it once then I don’t need it any more, so I “move” its resource into a container and abandon the crippled object.

Conclusion — as app developers I seldom write code using std::move.

  • P20 [[c++ std lib]] shows myCollection.insert(std::move(x)); // where x is a local nonref variable, not a heap pointer!
    • in this case, we should provide a wrapper function over std::move() named getRobberAliasOf()
    • I think you do this only if x has part of its internal storage allocated on heap, and only if the type X has a move ctor.

I bet that most of the time when an app developer writes “move(…)”, she doesn’t know if the move ctor will actually get picked by compiler. Verification needed.

— P544 [[c++primer]] offers a “best practice” — Outside of class implementations (like big4++), use std::move only when you are certain that you need to do a move and it is guaranteed safe.

Basically, the author believes user code seldom needs std::move.

— Here’s one contrived example of app developer writing std::move:

string myStr=input;
vectorOfString.push_back(std::move(myStr)); //we promise to compiler we won’t use myStr any more.

Without std::move, a copy of myStr is constructed in the vector. I call this a contrived example because

  • if input is a char-array, then emplace_back() is more efficient
  • if input is another temp string, then we can simply use push_back(input), which would bind to the rvr overload anyway.
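To check whether the move ctor really gets picked in a case like the push_back above, one option is a small counting type. This is a sketch only: Tracer, storeByCopy and storeByMove are hypothetical names, and the counters assume the vector has spare capacity so that no reallocation adds extra moves.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Counts how often its copy ctor and move ctor run.
struct Tracer {
    static int copies;
    static int moves;
    std::string payload;
    explicit Tracer(std::string s) : payload(std::move(s)) {}
    Tracer(const Tracer& o) : payload(o.payload) { ++copies; }
    Tracer(Tracer&& o) noexcept : payload(std::move(o.payload)) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

void storeByCopy(std::vector<Tracer>& v, const std::string& input) {
    Tracer t(input);
    v.push_back(t);              // lvalue: binds to push_back(const T&), copy ctor runs
    assert(v.back().payload == input);
}

void storeByMove(std::vector<Tracer>& v, const std::string& input) {
    Tracer t(input);
    v.push_back(std::move(t));   // cast to rvalue: binds to push_back(T&&), move ctor runs
    assert(v.back().payload == input);
    // t is still a valid object here, but its payload should no longer be relied on
}
```

With a vector that has had reserve() called, one call to each function should leave Tracer::copies and Tracer::moves at 1 each.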

c++QQ/zbs Expertise: I got some

As stated repeatedly, c++ is the most complicated and biggest language used in industry, at least in terms of syntax (tooManyVariations) and QQ topics. Well, I have impressed many expert interviewers on my core-c++ language insight.

That means I must have some expertise in c++ QQ topics. For my c++ zbs growth, see separate blog posts.

Note socket, shared mem … are c++ ecosystem, like OS libraries.

Deepak, Shanyou, Dilip .. are not necessarily stronger. They know some c++ sub-domains better, and I know some c++ sub-domains better, in both QQ and zbs.

–Now some of the topics to motivate myself to study

  • malloc and relatives … internals
  • enable_if
  • email discussion with CSY on temp obj
  • UDP functions

how I achieved%% ComfortableEconomicProfile by44

(I want to keep this blog in recrec, not tanbinvest. I want to be brief yet incisive.)

See 3 ffree scenarios: cashflow figures. What capabilities enabled me to achieve my current Comfortable Economic profile?

  • — top 3
  • by earning SGP citizenships
  • by developing my own investment strategies, via trial-n-error
  • by staying healthy
  • — the obvious
  • by high saving rate thanks to salary + low burn rate — efficiency inspired by SG gov
  • by consistent body-building with in-demand skills -> job security. I think rather few of my peers have this level of job security. Most of them work in one company for years. They may be lucky when they need a new job, but they don’t have my inner confidence and level of control on that “luck”. Look at Y.W.Chen. He only developed that confidence/control after he changed job a few times.

When I say “Comfortable” I don’t mean “above-peers”, and not complete financial freedom, but rather … easily affordable lifestyle without the modern-day pressure to work hard and make a living. In my life there are still too many pressures to cope with, but I don’t need to work so damn hard trying to earn enough to make ends meet.

A higher salary or promotion is “extremely desirable” but not needed. I’m satisfied with what I have now.

I can basically retire comfortably.

python: value@ECT+syntax > deep insight

c++ interviews value deep insight more than any language. Java and c# interviews also value them highly, but not python interviews.

Reminder — zoom in and dig deep in c++, java and c# only. Don’t do that in python too much.

Instead of deep insight, accumulate ECT syntax … highly valued in TIMED coding tests.

Use brief blog posts with catchy titles

p2p messaging beats MOM ] low-latency trading

example — RTS exchange feed dissemination infrastructure uses raw TCP and UDP sockets and no MOM

example — the biggest sell-side equity OMS network uses MOM only for minor things (eg?). No MOM for market data. No MOM carrying FIX order messages. Between OMS nodes on the network, FIX over TCP is used

I read and recorded the same technique in 2009… in this blog

Q: why is this technique not used on west coast or main street ?
%%A: I feel on west coast throughput outweighs latency. MOM enhances throughput.

latency QQ ]WallSt IV #java,c++..

Latency knowledge

  • is never needed on the job but … high Market Value
  • is not GTD at all but … is part of zbs
  • is not needed in py jobs
  • is needed in many c++ interview topics but .. in java is concentrated in JIT and GC
  • is an elite skill but … many candidates try
  • some depth is needed for IV and other discussions but … relatively low-complexity .. low-complexity topics #eg:GC/socket

mgr|obligation| relationships

A technical or contract role is less complicated, though relationships are also important and can make your life very stressful or relatively easy.

In ## 2 heaviest work stressors, I listed “figure-things-out” as a key stressor — if I’m reasonably fast on this front, then the relationships have limited impact on my stress level at work. Not true for a manager role — even if you get things done fast enough, relationships can still mess up your life.

  • the relationship with the immediate boss is most critical. I had many problems in the past.
  • relationship with other teams. Dependency means … stressful relationship
  • relationship with big bosses
  • relationship with subordinates can also become difficult. Shuo told me it was not for him, but I feel some managers depend on some key subordinates. Dependency means stress.
    • managing a non-performing subordinate … is not easy at all. I could see Kevin had headaches with me.
  • relationship with key business users. I feel Venkat (ICE) is under that pressure.

volume alone doesn’t qualify a system as big-data

The Oracle nosql book has these four “V”s to qualify any system as big data system. I added my annotations:

  1. Volume
  2. Velocity
  3. Variety of data format — If any two data formats account for more than 99.9% of your data in your system, then it doesn’t meet this definition. For example, FIX is one format.
  4. Variability in value — Does the system treat each datum equally?

Most of the so-called big data systems I have seen don’t have these four V’s. All of them have some volume but none has the Variety or the Variability.

I would venture to say that

  • 1% of the big-data systems today have all four V’s
  • 50%+ of the big-data systems have no Variety no Variability
    • 90% of financial big-data systems are probably in this category
  • 10% of the big-data systems have 3 of the 4 V’s

My friend JunLi said most of the data stores he has seen are strictly structured data, and cited credit bureau report as an example.

The reason that these systems are considered “big data” is the big-data technologies applied. You may call it “big data technologies applied on traditional data”

See #top 5 big-data technologies

Does my exchange market data qualify? Definitely high volume and velocity, but no Variety or Variability. So not big-data.

%%c++keep crash` I keep grow`as hacker #zbs#AshS

Note these are fairly portable zbs, more than local GTD know-how !

My current c++ project has high data volume, some business logic, some socket programming challenges, … and frequent crashes.

The truly enriching part is the crashes. Three months ago I was afraid of c++, largely because I was afraid of any crash.

Going back to 2015, I was also afraid of c++ build errors in VisualStudio and Makefiles, esp. those related to linkers and One-Definition-Rule, but I overcame most of that fear in 2015-2016. In contrast, crashes are harder to fix because 70% of the crashes come with no usable clue. If there’s a core file I may not be able to locate it. If I locate it, it may not have symbols. If it has symbols the crash site is usually in some classes unrelated to any classes that I wrote. I have since learned many lessons how to handle these crashes:

  • I have a mental list like “10 common crash patterns” in my log
  • I have learned to focus on the 20% of my codebase that are most convoluted, most important, most tricky and contribute most to debugging difficulties. I then invest my time strategically to rewrite (parts of) that 20% and dramatically simplify them. I managed to get familiar and confident with that 20%.
    • If the code belongs to someone else including 3rd party, I try to rewrite it locally for my dev
  • I have learned to pick the most useful things to log, so they show a *pattern*. The crashes usually deviate from the patterns and are now easier to spot.
  • I have developed my binary data dumper to show me the raw market data received, which often “cause” crashes.
  • I have learned to use more assertions and a hell of a lot of other validations to confirm my program is not in some unexpected *state*. I might even overdo this and /leave no stone unturned/.
  • I figured out memset(), memcpy(), raw arrays are the most crash-prone constructs so I try to avoid them or at least build assertions around them.
  • I also figured signed integers can become negative and don’t make sense in my case so I now use unsigned int exclusively. In hindsight I’m not sure if this is best practice, but it removed some surprises and confusion.
  • I also gained quite a bit of debugger (gdb) hands-on experience
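Two of the lessons above (assertions around memcpy, and unsigned lengths) could look like the sketch below. checkedCopy and remainingBytes are hypothetical helpers written for this illustration, not code from the project described.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Validate sizes before touching raw memory. Null pointers are treated as
// programming errors (assert); an oversized length, e.g. from a corrupt
// length field in a packet, is rejected at run time instead of overrunning dst.
bool checkedCopy(void* dst, std::size_t dstCap, const void* src, std::size_t len) {
    assert(dst != nullptr && src != nullptr);
    if (len > dstCap) return false;
    std::memcpy(dst, src, len);
    return true;
}

// With unsigned lengths, "consumed > total" never goes negative; the
// subtraction wraps around to a huge value. Checking first avoids that.
bool remainingBytes(unsigned total, unsigned consumed, unsigned& out) {
    if (consumed > total) return false;
    out = total - consumed;
    return true;
}
```

The second helper shows one trade-off of going all-unsigned: negative values disappear, but wrap-around takes their place, so the check has to happen before the subtraction.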

Most of these lessons I picked up in debugging program crashes, so these crashes are the most enriching experience. I believe other c++ programs (including my previous jobs) don’t crash so often. I used to (and still do) curse the fragile framework I’m using, but now I also recognize these crashes are accelerating my growth as a c++ developer.
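The binary data dumper mentioned in the list above is not shown in this post. A minimal sketch of the idea, with a hypothetical hexDump function, might be:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Render a raw buffer as space-separated hex bytes, suitable for logging
// a received packet next to the parsed fields.
std::string hexDump(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::string out;
    char byteText[4];
    for (std::size_t i = 0; i < len; ++i) {
        std::snprintf(byteText, sizeof byteText, "%02x", static_cast<unsigned>(data[i]));
        if (i != 0) out += ' ';
        out += byteText;
    }
    return out;
}
```

For the four bytes 0x01 0xab 0x00 0xff this yields the string "01 ab 00 ff".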

##fastest container choices: array of POD #or pre-sized vector

relevant to low-latency market data.

  • raw array is “lean and mean” — the most memory efficient; vector is very close, but we need to avoid reallocation
  • std::array is less popular but should offer similar performance to vector
  • all other containers are slower, with bigger footprint
  • For high-performance, avoid container of node/pointer — Cache affinity loves contiguous memory. After accessing 1st element, then accessing 2nd element is likely a cache-hit
    • set/map, linked list suffer the same
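The "avoid reallocation" point for vector can be checked with a small experiment. The sketch below assumes a hypothetical POD record named Tick and counts capacity changes while filling a vector, with and without reserve().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Tick {        // hypothetical POD record
    long seq;
    double px;
};

// Fill a vector with n records and count how many times its capacity changed,
// i.e. how many reallocations (each copying all elements) took place.
std::size_t reallocationsDuringFill(std::size_t n, bool presize) {
    std::vector<Tick> v;
    if (presize) v.reserve(n);           // one allocation up front
    std::size_t changes = 0;
    std::size_t lastCap = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(Tick{static_cast<long>(i), 1.0});
        if (v.capacity() != lastCap) {
            ++changes;
            lastCap = v.capacity();
        }
    }
    assert(v.size() == n);
    return changes;
}
```

With presize set, the count stays at zero; without it, the count depends on the library's growth factor but is well above one for a thousand elements.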

c++11,sockets,IPC in QnA IV #AshS

[18] Update — This has become one of the more valuable posts in my c++ blog. It cuts down the amount of info overload and learning curve steepness by half then again by half.

Hi Ashish,

I now have a little theory on the relative importance of several c++ tech skills in a job candidate. I feel all of the skills below are considered “secondary importance” to most of the (15 – 30) interviewers I have met. These skills are widely used in projects, but if we say “no experience” in them, BUT demonstrate strength in core c++ then we win respect.

For each topic, I want to classify it as core or ecosystem. Besides, I want to gauge its market value in terms of IV (not GTD), depth with moat.

[e=c++ecosystem topics]
[c=core c++topics]

— #AA [e] c++threading —— many job descriptions say they use threading but it’s probably very simple threading. Threading is such a complex topic that only the well-reviewed proven designs are safe to use. If a project team decides to invent their own concurrency design (I invented one once), and has any non-trivial deviation from the proven designs, they may unknowingly introduce bugs that may not surface for years. So the actual threading usage in any project is always minimal, simple and isolated to a very small number of files.

The fastest systems tend to have nothing shared-mutable, so parallel processing presents zero risk and requires no design. No locking no CAS. Essentially single threaded mode.

However, we don’t have the luxury to say “limited experience” in threading. I have a fair amount of concurrency design experience across languages and thread libraries (pthreads, ObjectSpace, c#), using locks+condition variables+CAS as building blocks, but the c++ thread library used in another team is probably different. I used to say I used boost::thread a lot but it back-fired.

[e] threading library support? not so standardized as in java. pthreads is standard but not widely used in projects and not a common knowledge

[c] threading support in c++11? Not widely used or widely quizzed.

This domain does have depth with moat 🙂 but not as popular as rvr.

— #AAA [c] c++11 —— is not yet widely used. Many financial jobs I applied for have old codebases they don’t want to upgrade. Most of the c++11 features we use as developers are optional convenience features. Some features are fundamental (decltype, constexpr …) yet treated as simple convenience features. I feel move semantics and r-value references are fairly deep but these are really advanced features for library writers, not application developers. Besides shared_ptr, C++11 features seldom affect system design. I have “limited project experience using c++11”.

Interviewers often drill into move semantics. If I admit ignorance I could lose. Therefore, I’m actively pursuing the thin->thick->thin learning path.

This topic is an interview favorite, and has market value and depth with moat.

— #A [c] templates including TMP —— another advanced feature primarily for library developers. Really deep and complex. App developers don’t really need this level of wizardry. A few experienced c++ guys told me their teams each has a team member using fancy template meta-programming techniques that no one could understand or maintain. None of my interviewers went very deep on this topic. I have only limited meta-programming experience, but I will focus on 2 common template techniques — CRTP and SFINAE and try to build a good understanding (thin->thick->thin)

This topic is popular, with real depth and moat.
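For reference, the two template techniques named above can each be shown in a few lines. This is a sketch with hypothetical names (Handler, Doubler, Negator, describe), using only C++11 facilities.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// CRTP: the base class calls into the derived class through a static_cast,
// resolved at compile time, so no virtual table is involved.
template <typename Derived>
struct Handler {
    int handle(int msg) { return static_cast<Derived*>(this)->onMessage(msg); }
};
struct Doubler : Handler<Doubler> {
    int onMessage(int m) { return 2 * m; }
};
struct Negator : Handler<Negator> {
    int onMessage(int m) { return -m; }
};

// SFINAE via enable_if: each overload drops out of overload resolution
// unless T falls in the matching category.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
describe(T) { return "integral"; }

template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
describe(T) { return "floating"; }

// Usage check: d.handle() dispatches to Doubler::onMessage without "virtual".
inline void crtpUsage() {
    Doubler d;
    assert(d.handle(21) == 42);
}
```

CRTP also connects to the "avoid virtual for latency" item in the HFT list, since the call can be inlined.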

— #C [e] Boost —— is not really widely used. 70% of the financial companies I tried don’t use anything beyond shared_ptr. Most of the boost features are considered optional and high-level, rather than fundamental. If I tell them I only used shared_ptr and no other boost library, they will usually judge me on other fronts, without deducting points. I used many boost libraries but only understand shared_ptr better.

Q: Are there job candidates who are strong with some boost feature (beside shared_ptr) but weak on core c++?
A: I have not seen any.

Q: Are there programmers strong on core c++ but unfamiliar with boost?
A: I have seen many

This topic is not so popular, never quizzed in depth.

—#D+ [c] IKM —— is far less important than we thought. I know many people who score very high but fail the tech interview badly. On the other hand, I believe some candidates score mediocre but impress the interviewers when they come on-site.

Not so popular in IV. Not much depth with moat

—#B [e] linux system programming like sockets —— is relevant to low-latency firms. I think half the c++ finance firms are. Call them infrastructure team, or engineering team, or back-end team. I just happened to apply for too many of this kind. To them, socket knowledge is essential, but to the “mainstream” c++ teams, socket is non-essential. I have “modest socket experience” but could impress some interviewers.

(Actually socket api is not a c/c++ language feature, but a system library. Compared to STL, socket library has narrower usage.)

Messaging is a similar skill like sockets. There are many specific products so we don’t need to be familiar with each one.

InterProcessCommunication  (Shared memory, pipes… ) is a similar “C” skill to sockets but less widely required. I usually say my system uses sockets, database or flat files for IPC, though shared memory is probably faster. I hope interviewers don’t mind that. If there’s some IPC code in their system, it’s likely isolated and encapsulated (even more encapsulated than threading code), so hopefully most developers don’t need to touch it

If a role requires heavy network programming (heavy IPC or messaging development is less common) but we have limited experience, then it can be a show-stopper.

— #B+ [e] Memory Management ——- another specialized skill just like IPC.

Deeper bookish knowledge into malloc/new, custom allocator, buffer management (DeepakCM)…

Such code is always isolated in some low-level module in a library, and seldom touched. A job spec may require it, but in reality few people have the experience and it is never a real show-stopper.

I feel this topic is gaining popularity perhaps due to c++ competitive advantage. Has depth with moat.

— #D [c] OO design patterns ——- never required in coding tests. In fact, basically no OO features show up in short coding tests. 90% of short coding tests are about GettingThingsDone, using algorithms and data structures.

No depth no moat. Not so popular. Design patterns do show up in QnA interviews but usually not difficult. I usually stick to a few familiar ones — Singleton, Factory, TemplateMethod, ProducerConsumer. I’m no expert with any of these but I feel very few candidates are. I feel most interviewers have a reasonable expectation. I stay away from complex patterns like Visitor and Bridge.

— #F [c] A) instrumentation and B) advanced build tools for memory, dependency etc … are big paper tigers. Can be open-source or commercial. Often useful to GTD but some interviewers ask about these tools to check your real-world experience. 99% of c++programmers have superficial knowledge only because we don’t need to understand the internals of GPS to use it in a car.

Note basic build-chain knowledge is necessary for GTD but relatively “basic” and seldom quizzed.

—————–
These topics are never tested in short coding questions, and seldom in longer coding questions.

What c++ knowledge is valued more highly? See ##20 C++territories for QQ IV

As CTO, I’d favor transparent langs, wary of outside libraries

If I were to propose a system rewrite, or start a new system from scratch without constraints like legacy code, then Transparency (+instrumentation) is my #1 priority.

  • c++ is the most opaque. Just look at complex declarations, or linker rules, the ODR…
  • I feel more confident debugging java. The JVM is remarkably well-behaving (better than CLR), consistent, well discussed on-line
  • key example — the SOAP stub/skeleton hurts transparency, so does the AOP proxies. These semi-transparent proxies are not like regular code you can edit and play with in a sandbox
  • windows is more murky than linux
  • There are many open-source libraries for java, c++, py etc but many of them affect transparency. I think someone like Piroz may say Spring is a transparent library
  • SQL is transparent except performance tuning
  • Based on my 1990’s experience, I feel javascript is transparent but I could be wrong.
  • I feel py, perl are still more transparent than most compiled languages. They too can become less transparent, when the codebase grows. (In contrast, even small c++ systems can be opaque.)

This letter is about which language, but allow me to digress briefly. For data store and messaging format (both require serialization), I prefer the slightly verbose but hugely transparent solutions, like FIX, CTF, json (xml is too verbose) rather than protobuf. Remember 99% of the sites use only strings, numbers, datetimes, and very rarely involve audio/visual data.

contractor^mgr^low-VP 3way-compare #XR

See also hands-on dev beats mgr @same pay and my discussion with Youwei in pureDev beats techLead: paycut=OK #CYW

In the U.S. context, I feel the FTE developer position is, on average, least appealing, though some do earn a lot, such as some quant developers. My very rough ranking of total income is

  1. senior mgr
  2. contractor
  3. FTE-dev including entry-level lead roles

Without bonus, the FTE-dev is often lowest. However, bonus is not guaranteed.

I exclude the pure quants (or medical doctors) as a different profession from IT.

 

edit 1 file in big python^c++ production system #XR

Q1: suppose you work in a big, complex system with 1000 source files, all in python, and you know a change to a single file will only affect one module, not a core module. You have tested it + run a 60-minute automated unit test suite. You didn’t run a prolonged integration test that’s part of the department-level full release. Would you and approving managers have the confidence to release this single python file?
A: yes

Q2: change “python” to c++ (or java or c#). You already followed the routine to build your change into a dynamic library, tested it thoroughly and ran unit test suite but not full integration test. Do you feel safe to release this library?
A: no.

Assumption: the automated tests were reasonably well written. I never worked in a team with a measured test coverage. I would guess 50% is too high and often impractical. Even with high measured test coverage, the risk of bug is roughly the same. I never believe higher unit test coverage is a vaccination. Diminishing return. Low marginal benefit.

Why the difference between Q1 and Q2?

One reason — the source file is compiled into a library (or a jar), along with many other source files. This library is now a big component of the system, rather than one of 1000 python files. The managers will see a library change in c++ (or java) vs a single-file change in python.

Q3: what if the change is to a single shell script, used for start/stop the system?
A: yes. Manager can see the impact is small and isolated. The unit of release is clearly a single file, not a library.

Q4: what if the change is to a stored proc? You have tested it and run full unit test suite but not a full integration test. Will you release this single stored proc?
A: yes. One reason is transparency of the change. Managers can understand this is an isolated change, rather than a library change as in the c++ case.

How do managers (and anyone except yourself) actually visualize the amount of code change?

  • With python, it’s a single file so they can use “diff”.
  • With stored proc, it’s a single proc. In the source control, they can diff this single proc. Unit of release is traditionally a single proc.
  • with c++ or java, the unit of release is a library. What if in this new build, beside your change there’s some other change , included by accident? You can’t diff a binary 😦

So I feel transparency is the first reason. Transparency of the change gives everyone (not just yourself) confidence about the size/scope of this change.

Second reason is isolation. I feel a compiled language (esp. c++) is more “fragile” and the binary modules more “coupled” and inter-dependent. When you change one source file and release it in a new library build, it could lead to subtle, intermittent concurrency issues or memory leaks in another module, outside your library. Even if you as the author see evidence that this won’t happen, other people have seen innocent one-line changes giving rise to bugs, so they have reason to worry.

  • All 1000 files (in compiled form) run in one process for a c++ or java system.
  • A stored proc change could affect DB performance, but it’s easy to verify. A stored proc won’t introduce subtle problems in an unrelated module.
  • A top-level python script runs in its own process. A python module runs in the host process of the top-level script, but a typical top-level script will include just a few custom modules, not 1000 modules. Much better isolation at run time.

There might be python systems where the main script actually runs in a process with hundreds of custom modules (not counting the standard library modules). I have not seen it.

effi^instrumentation ] new project

I always prioritize instrumentation over effi/productivity/GTD.

A peer could be faster than me in the beginning but if she lacks instrumentation skill with the local code base there will be more and more tasks that she can’t solve without luck.

In reality, many tasks can be done with superficial “insight”, without instrumentation, with old-timer’s help, or with lucky search in the log.

What if developer had not added that logging? You are dependent on that developer.

I could be slow in the beginning, but once I build up (over x months) a real instrumentation insight I will be more powerful than my peers including some older timers. I think the Stirt-tech London team guru (John) was such a guy.

In reality, even though I prioritize instrumentation it’s rare to make visible progress building instrumentation insight.

C for latency^^TPS can use java

I’m 98% confident — low latency favors C/C++ over java [1]. FPGA is _possibly_ even faster.

I’m 80% confident — throughput (in real time data processing) is achievable in C, java, optimized python (Facebook?), optimized php (Yahoo?) or even a batch program. When you need to scale out, Java seems the #1 popular choice as of 2017. Most of the big data solutions seem to put java as the first among equals.

In the “max throughput” context, I believe the critical java code path is optimized to the same efficiency as C. JIT can achieve that. A python and php module can achieve that, perhaps using native extensions.

[1] Actually, java bytecode can run faster than compiled C code (See my other posts such as https://bintanvictor.wordpress.com/2017/03/20/how-might-jvm-beat-cperformance/)

[17] 5 unusual tips@initial GTD

See also https://bintanvictor.wordpress.com/wp-admin/edit.php?s&post_status=all&post_type=post&action=-1&m=0&cat=560907660&filter_action=Filter&paged=1&action2=-1

* build up instrumentation toolset
* Burn weekends, but first … build momentum and foundation including the “instrumentation” detailed earlier
* control distractions — parenting, housing, personal investment, … I didn’t have these in my younger years. I feel they take up O2 and also sap the momentum.
* Focus on output that’s visible to boss, that your colleagues could also finish so you have nowhere to hide. Clone if you need to. CSDoctor told me to buy time so later you can rework “under the hood” like quality or design

–secondary suggestions:
* Limit the amount of “irrelevant” questions/research, when you notice they are taking up your O2 or dispersing the laser. Perhaps delay them.

Inevitably, this analysis relies on the past work experiences. Productivity(aka GTD) is a subjective, elastic yardstick. #1 Most important is GTD rating by boss. It sinks deep… #2 is self-rating https://bintanvictor.wordpress.com/2016/08/09/productivity-track-record/

## low-complexity QQ topics #JGC/parser..

java GC is an example of “low-complexity domain”. Isolated knowledge pearls. (Complexity would be high if you delve into the implementation.)

Other examples

  • FIX? slightly more complex when you need to debug source code. java GC has no “source code” for us.
  • socket programming? conceptually, relatively small number of variations and combinations. But when I get into a big project I am likely to see the true color.
  • stateless feed parser coded against an exchange spec
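To make "stateless parser" concrete, here is a minimal sketch that splits a FIX-style tag=value message into fields. parseTagValue is a hypothetical function for illustration; a real exchange-feed parser would follow the exchange spec and handle malformed input more carefully (std::stoi throws on a non-numeric tag).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Split "tag=value<delim>tag=value<delim>..." into a map keyed by tag number.
// Stateless: the result depends only on the message passed in, so messages
// can be parsed in any order and the function is trivially testable.
// FIX uses the SOH byte (0x01) as the field delimiter.
std::map<int, std::string> parseTagValue(const std::string& msg, char delim = '\x01') {
    assert(delim != '=');
    std::map<int, std::string> fields;
    std::size_t pos = 0;
    while (pos < msg.size()) {
        std::size_t end = msg.find(delim, pos);
        if (end == std::string::npos) end = msg.size();
        std::size_t eq = msg.find('=', pos);
        if (eq != std::string::npos && eq > pos && eq < end) {
            int tag = std::stoi(msg.substr(pos, eq - pos));
            fields[tag] = msg.substr(eq + 1, end - eq - 1);
        }
        pos = end + 1;
    }
    return fields;
}
```

Passing '|' as the delimiter makes test messages readable, e.g. "35=D|55=IBM|38=100" (MsgType, Symbol, OrderQty).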

MOM+threading Unwelcome ] low latency@@ #FIX/socket

Piroz told me that trading IT job interviews tend to emphasize multi-threading and MOM. Some use SQL too. I now feel all of these are unwelcome in low latency trading.

A) MOM – see also HFT mktData redistribution via MOM. For order processing, FIX is the standard. FIX can use MOM as transport, but that is not popular and is unfamiliar to me.

FIX does use buffers to hold a burst of incoming or outgoing messages. The buffers resemble message queues.

B) threading – Single-Threaded-Mode is generally the fastest in theory and in practice. (I only have a small observed sample size.) I feel the fastest trading engines are STM. No shared mutable. Nsdq new platform (in java) is STM

Multithreading is OK if the threads don’t compete for resources like CPU, I/O or locks. Compared to STM, most lockfree systems introduce latency like retries, and additional memory barrier. By default compiler optimization doesn’t need such memory barriers.
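The "retries" cost of lock-free code can be seen in a plain compare-and-swap loop. The sketch below uses std::atomic from C++11; casAdd is a hypothetical helper, and the memory orderings chosen are one reasonable option rather than the only correct one.

```cpp
#include <atomic>
#include <cassert>

// Lock-free add built on compare-and-swap. Returns how many times the CAS
// had to be retried; under contention each retry is added latency that a
// single-threaded design never pays.
int casAdd(std::atomic<long>& target, long delta) {
    assert(target.is_lock_free());   // assumption: no hidden lock on this platform
    int retries = 0;
    long seen = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads the current value into `seen`.
    while (!target.compare_exchange_weak(seen, seen + delta,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        ++retries;   // another thread won the race, or a spurious failure
    }
    return retries;
}
```

Even with zero retries, the atomic read-modify-write acts as a barrier that restricts compiler and CPU reordering, which plain single-threaded code is free of.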

C) SQL – as stated elsewhere, flat files are much faster than relational DB. How about in-memory relational DB?

Rebus, the order book engine, is in-memory.

2H life-changing xp#Pimco#income,home location,industry…

Here’s a real story in 2010 — I was completely hopeless and stuck in despair after my Goldman Sachs internal transfer was blocked in the last stage. I considered moving my whole family back to Singapore without any offer, and starting my job search there. I was seriously considering a S$100k job in back office batch programming. Absolutely the lowest point in my entire career. After licking the wound for 2 months, tentatively I started looking for jobs outside Goldman and slowly found my foothold. Then in early 2010, I passed a phone screening and attended a Citigroup “superday”. I spent half an hour each with 3 interviewers. By end of the day, recruiter said I was the #1 pick. I took the offer, at an 80% increment. In the next 12 months, I built up my track record + knowledge in

  1. real time trading engine components, esp. real time pricing engine
  2. fixed income math,
  3. c++ (knowledge rebuild)

I have never looked back since. Fair to say that my family won’t be where we are today, without this Citigroup experience. With this track record I was able to take on relatively high-end programming jobs in U.S. and Singapore. I was able to live in a convenient location, and buy properties and send my kids to mid-range preschools (too pricey in hindsight). Obviously I wanted this kind of job even in 2009. That dream became reality when I passed the superday interview. That interview was one of the turning points in my career.

Fast forward to Apr 2017 — I had a 20-minute phone interview with the world’s biggest asset management firm (Let’s call it PP), then I had a 2-hour skype interview. They made an offer. I discussed with my recruiter their proposal —

  • I would relocate to California
  • I would get paid around 200k pretax and possibly with an increment in 6 months. PP usually increases billing rate after 12 months if contractor does well.
  • recruitment agency CEO said he would transfer my visa and sponsor green card.

If I were to take this offer, my life would be transformed. (I would also have a better chance to break into the  high tech industry in nearby silicon valley, because I would have local friends in that domain.) Such a big change in my life is now possible because … I did well [1] in the interview.

Stripped to the core, that’s the reality in our world of contract programmers. Project delivery, debugging, and relationship with boss can get you promoted, but those on-the-job efforts have much lower impact than your performance during an interview. Like an NBA playoff match. A few short hours under the spotlight can change your life forever.

This is not a rare experience. There are higher-paying contract job offers that could “change your life”, and you only need to do well in the interviews to make it happen.

I feel this is typical of U.S. market and perhaps London. In Singapore, contract roles can’t pay this much. A permanent role has a few subtle implications so I feel it’s a different game.

[1] The 7 interviewers felt I was strong in c++ (not really), java and sql, and competent in fixed income math (I only worked with it for a year). Unlike other high-end interviews, there are not many tough tech questions like threading, algorithms, or coding tests. I feel they liked my interview mostly because of the combination of c++/java/fixed income math — not a common combination.

big data is!! fad; big-data technologies might be

(blogging)

My working definition — big data is the challenges and opportunities presented by the large volume of disparate (often unstructured) data.

For decades, this data has always been growing. What changed?

* One recent change in the last 10 years or so is data processing technology. As an analogy, oil sand has been known for quite a while but the extraction technology slowly improved to become commercially viable.

* Another recent change is social media, creating lots of user-generated content. I believe this data volume is a fraction of the machine-generated data, but it’s richer and less structured.

Many people see opportunities to make use of this data. I feel the potential usefulness of this data is somewhat /overblown/ , largely due to aggressive marketing. As a comparison, consider location data from satellites and cellular networks — useful but not life-changing useful.

The current crop of big data technologies is even more hyped. I remember XML, Bluetooth, pen computing, optical fiber .. also had their prime times under the spotlight. I feel none of them lived up to the promise (or the hype).

What are the technologies related to big data? I only know a few — NOSQL, inexpensive data grid, Hadoop, machine learning, statistical/mathematical python, R, cloud, data mining technologies, data warehouse technologies…

Many of these technologies had real, validated value propositions before big data. I tend to think they will confirm and prove those original value propositions in 30 years, after the fads have long passed.

As an “investor” I have a job duty to try and spot overvalued, overhyped, high-churn technologies, so I ask

Q: Will Hadoop (or another in the list) become more widely used (therefore more valuable) in 10 years, as newer technologies come and go? I’m not sure.

http://www.b-eye-network.com/view/17017 is a concise comparison of big data and data warehouse, written by a leading expert of data warehouse.

[09]%%design priorities as arch/CTO

Priorities depend on industry, target users and managers’ experience/preference… Here are my Real answers:

A: instrumentation (non-opaque ) — #1 priority to an early-stage developer, not to a CTO.

Intermediate data store (even binary) is great — files; reliable[1] snoop/capture; MOM

[1] seldom reliable, due to the inherent nature — logging/capture, even error messages are easily suppressed.

A: predictability — #2 (I don’t prefer the word “reliability”.) related to instrumentation. I hate opaque surprises and intermittent errors like

  • GMDS green/red LED
  • SSL in Guardian
  • thick, opaque libraries like Spring

In contrast, these are predictable:

  1. Database is rock-solid predictable.
  2. javascript was predictable in my pre-2000 experience
  3. automation scripts are often more predictable, but advanced python is not.

(bold answers are good interview answers.)
A: separation of concern, encapsulation.
* any team dev needs task breakdown. PWM tech department consists of teams supporting their own systems, which talk to each other on an agreed interface.
* Use proc and views to allow data source internal change without breaking data users (RW)
* ftp, mq, web service, ssh calls, emails between departments
* stable interfaces. Each module’s internals are changeable without breaking client code
* in GS, any change in any module must be done along with other modules’ checkout, otherwise that single release may impact other modules unexpectedly.

A: prod support and easy to learn?
* less support => more dev.
* easy to reproduce prod issues in QA
* easy to debug
* audit trail
* easy to recover
* fail-safe
* rerunnable

A: extensible and configurable? It often adds complexity and workload. Probably the #1 priority among managers I know on Wall St. It’s all about predicting what features users might add.

How about time-to-market? Without testability, changes take longer to regression-test? That’s pure theory. In trading systems, there’s seldom automated regression testing.

A: testability. I think Chad also liked this a lot. Automated tests are less important to Wall St than other industries.

* each team’s system to be verifiable to help isolate production issues.
* testable interfaces between components. Each interface is relatively easy to test.

A: performance — always one of the most important factors if our system is ever benchmarked in a competition. Benchmark statistics are circulated to everyone.

A: scalability — often needs to be an early design goal.

A: self-service by users? reduce support workload.
* data accessible (R/W) online to authorized users.

A: show strategic improvement to higher management and users. This is how to gain visibility and promotion.

How about data volume? important to eq/fx market data feed, low latency, Google, facebook … but not to my systems so far.

DB=%% favorite data store due to instrumentation

The noSQL products all provide some GUI/query, but not very good. Piroz had to write a web GUI to show the content of gemfire. Without the GUI it’s very hard to manage anything that’s built on gemfire.

As data stores, even binary files are valuable.

Note snoop/capture is not a data store, but falls in the same category as logging. Both are easily suppressed, including critical error messages.

Why is RDBMS my #1 pick? ACID requires every datum to be persistent/durable, therefore viewable from any 3rd-party app, so we aren’t dependent on the writer application.

[17]FASTEST muscle-growth=b4/af job changes]U.S.

I now recall that my muscle-building and, to a lesser extent, zbs growth are clearly fastest in the 3 months around each job change. I get frequent interviews and positive feedback. This is a key (subconscious) reason why I prefer contracting even at a lower salary. I get the kick each time I change jobs.

My blogging activity shows the growth…

  • #1 factor … positive feedback from real offers from good companies.
  • #2 factor … I actually feel real zbs growth, though it tends to be less strategic in hindsight.
  • factor — on a new job, I am curious to learn things I have wanted to learn like Xaml, FIX, Tibco, kdb, SecDB, multicast, orderbook, curve building

Beside the months immediately b4/af job change, I also experienced significant growth in

No such environment in Singapore:(

## retirement disposable time usage

See also my framework: Chore^Pleasure activities

  • exercise in the park everyday .. like grandma
  • reflective blogging — likely to be a big time-killer
  • reading as a pastime? GP said at his age, he still loves reading and has many good books at home, but has insufficient physical energy
  • sight-seeing, burning your cash reserve? Grandpa said he is physically unable to
  • — now the more productive endeavors:
  • volunteering for a worthy cause?
  • helping out as grandparents
  • ! … semi-retirement is clearly superior as I would have a real occupation with a commitment and a fixed work schedule

Grandpa pointed out that there are Actually-bigger factors than finding things to do

  1. cash flow
  2. health

##[17]tough n high-leverage c++topics #QQ[def]

I used to feel I have so much absorption capacity, but now I feel in my finite career I can’t really master and remember all the tough c++ topics.

Practical solution — Classify each difficult c++topic into one of

  1. QQ: high impact on QnA interview, probably the only type of high-leverage tough topic. Largely textbook knowledge. As such I’m basically confident I can learn all the basics on my own (perhaps slower than Venkat), provided I know the topics.
    1. including a subset of coding questions designed really for knowledge test, rather than algorithm thinking
    2. eg: HFT, Venkat…
  2. high impact on algo coding IV? rather few such topics. See t_algoQQ. Rather few coding interviews are about knowledge in tough topics!
  3. ZZ: high impact on GTD zbs — inevitably Low leverage during job search
  4. 00: no high impact on anything

Q: Is there a tough topic in both QQ and ZZ? I doubt it.

  • [00] template wizardry;
  • [00] operator overloading;
  • [00] pthreads
  • ————-
  • [QQ]
  • [QQ] template specialization
  • [QQ] MI;
  • [QQ] move semantics
  • [QQ] [p] boost common lib
  • [QQ] optimization tricks. Remember MIAX and SCB IV by Dmitry
  • [QQ] [p] singleton implementation — not really tough
  • [QQ] pimpl — not really tough
  • [QQ] op-new/malloc (interacting with ctor)
  • [QQ] memory layout
  • [QQ] [p] struct alignment
  • [QQ] specific threading constructs
  • [QQ] smart ptr details
  • [QQ] ptr as a field
  • [QQ] implement auto_ptr or ref-counting string
  • [QQ] [p] UDP —
  • [QQ] [p] multicast
  • [QQ] select()
  • [QQ]
  • [ZZ] IDE set-up
  • [ZZ] compiler/linker/makefile details
  • [ZZ] debuggers
  • [ZZ] crash analysis, like segmentation fault
  • [ZZ] c^c++:SAME debugging/tracing/instrumentation skills #ZZ

[p=paper tiger. ]

##[18] Algo problem solving as hobby@@

  • how does this compare to board game?
  • how does this compare to jigsaw puzzles
  • how does this compare to small mobile app development as a hobby?
    • small mobile game development
  • how does this compare to photography as a hobby?
  • how does this compare to blogging as a hobby
  • how does this compare with DIY home improvement
    • woodwork
  • how does this compare to auto tuning and bike building
  • how does this compare with hi-fi tuning

Every one of them is better than TV, gaming, sight-seeing or drinking. Each one requires consistent effort and sustained focus. Each one will become more frustrating and less exciting before you reach the next level.

##high complexity high mkt-value specializations

Opening example 1: Quartz — High complexity. Zero market value as the deep insight gained is decidedly local and won’t make you a stronger developer on another team.

Opening example 2: IDE, Maven, git, Unix + shell scripting. Modest complexity; makes me a stronger developer in real projects, but no premium on the labor market.

My best experiences on Wall St — tech skills with high market value + complexity high enough that few developers could master it. On a project involving these I get better lifestyle, lower stress… Examples:

  • threading
  • java collections
  • SQL complex queries + stored proc. Declining demand in high-end jobs?
  • SQL tuning
  • MOM-based, high volume system implementation — reasonable complexity and market value, but not mainstream. Mostly used in trading only 😦
  • pricing math — high market value but too specialized 😦
  • trading algorithms, price distribution, … Specialized 😦

Let’s look at a few other tech skills:

  • c++ build automation — modest complexity; low value
  • c++ low latency — high value;  barrier too high for me 😦
  • java reflection, serialization — high complexity high practical value, but market value is questionable 😦
  • .NET — some part can be high complexity, but demand is a bit lower than 2011 😦
  • Java tuning — high complexity; not high value practically
  • python — modest complexity, growing market value
  • PHP — lower complexity and lower market value than py, IMHO

retreat to raw ptr from smart ptr ASAP

Raw ptr is in the fabric of C. Raw pointers interact/integrate with countless parts of the language in complex ways. Smart pointers are advertised as drop-in replacements but that advertisement may not cover all of those “interactions”:

  • double ptr
  • new/delete/free
  • ptr/ref layering
  • ptr to function
  • ptr to field
  • 2D array
  • array of ptr
  • ptr arithmetic
  • compare to NULL
  • ptr to const — smart ptr to const should be fine
  • “this” ptr
  • factory returning ptr — can it return a smart ptr?
  • address of ptr object

Personal suggestion (unconventional) — stick to the known best practices of smart ptr (such as storing them in containers). In all other situations, do not treat them as drop-in replacements but retrieve and use the raw ptr.
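
A minimal sketch of that suggestion, assuming a hypothetical Node type and a legacy-style readValue() function that expects a raw pointer: the smart pointers live in a container (a known best practice), and everywhere else the raw ptr is retrieved with get(). Ownership never leaves the smart pointer.

```cpp
#include <memory>
#include <vector>

// Hypothetical payload type, for illustration only.
struct Node { int value; };

// Legacy-style function: takes a raw pointer, never takes ownership.
int readValue(const Node* n) { return n ? n->value : -1; }

int sumAll(const std::vector<std::shared_ptr<Node>>& nodes) {
    int sum = 0;
    for (const auto& sp : nodes) {
        Node* raw = sp.get();   // retrieve the raw ptr; sp still owns the Node
        sum += readValue(raw);  // compare-to-NULL etc. now follow plain C rules
    }
    return sum;
}
```

The raw pointer must not outlive the smart pointer it came from, and must never be passed to delete.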

[16]python: routine^complex tasks #XR

XR,

Further to our discussion, I used perl for many years. 95% of my perl tasks are routine tasks. With py, I would say “majority” of my tasks are routine tasks i.e. solutions are easy to find on-line.

  • routine tasks include automated testing, shell-script replacement, text file processing, query XML, query various data stores, query via http post/get, small-scale code generation, simple tcp client/server.
  • For “Complex tasks” , at least some part of it is tricky and not easily solved by Googling. Routine reflection / concurrency / c++Integration / importation … are documented widely, with sample code, but these techniques can be pushed to the limit.
    •  Even if we just use these techniques as documented, but we combine them in unusual ways, then Google search will not be enough.
    • Beware — decorators , meta-programming, polymorphism, on-the-fly code-generation, serialization, remote procedure call … all rely on reflection.

When you say py is not as easy as xxx and takes several years to learn, I think you referred to complex tasks.

It’s quite impressive to see that some powerful and useful functionalities can be easily implemented by following online tutorials. By definition these are routine tasks. One example that jumps out at me is reflection in any language. Python reflection can be powerful like a power drill to break a stone wall. Without such a power drill the technical challenge can look daunting. I guess metaclass is one such power drill. Decorator is a power drill I used in %%logging decorator with optional args

I can see a few reasons why managers choose py over java for certain tasks. I heard there are a few jvm-based scripting languages (scala, groovy, clojure, jython …) but I guess python beats them on several fronts including more packages (i.e. wheels) and more mature, more complete and proven solutions, familiarity, reliability + wider user base.

One common argument to prefer any scripting language over any compiled language is faster development. True for routine tasks. For complex tasks, “your mileage may vary”. As I said, if the software system requirement is inherently complex, then implementation in any language will be complex. When the task is complex, I actually prefer more verbose source code — possibly more transparent and less “opaque”.

Quartz is one example of a big opaque system for a complex task. If you want, I can describe some of the complex tasks (in py) I have come across though I don’t have the level of insight that some colleagues have.

When you said the python debugger was less useful to you than java debugger, it’s a sign of java’s transparency. My “favorite” opaque parts of py are module import and reflection.

If any language has excellent performance/stability + good on-line resources [1] + reasonable library of components comparable to the mature languages like Java/c++, then I feel sooner or later it will catch on. I feel python doesn’t have exactly the performance. In contrast, I think php and javascript can achieve very high performance in their respective usage domains.

[1] documentation is nice-to-have but not sufficient. Many programmers don’t have time to read documentation in-depth.

[16]standard practice around q[delete]

See also post about MSDN overview on mem mgmt…

  • “delete” risk: too early -> stray pointer
  • “delete” risk: too “late” -> leak. You will see steadily growing memory usage
  • “delete” risk: too many times -> double free
  • … these risks are resolved in java and dotnet.

For all simple scenarios, I feel the default and standard idiom to manage the delete is RAII. This means the delete is inside a dtor, which in turn means there’s a wrapper class, perhaps some smart ptr class.

It also means we create stack instances of
– the wrapper class
– a container holding the wrapper objects
– an umbrella holding the wrapper object

Should every departure/deviation be documented?

I feel it’s not the best idea to pass a pointer into some cleanup function, and inside the function, delete the passed-in pointer. What if the pointee is already deleted, or still /in active service/, or a non-heap pointee, or a pointee embedded in a container or umbrella… See P52 [[safe c++]]
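
The RAII idiom described above can be sketched as follows. Trade and TradeHolder are hypothetical names; in real code std::unique_ptr would usually play the role of the wrapper. The point is that the only delete sits inside the dtor, so it cannot run too early, too late or twice.

```cpp
// Hypothetical payload type that tracks how many instances are alive.
struct Trade {
    static int liveCount;
    Trade()  { ++liveCount; }
    ~Trade() { --liveCount; }
};
int Trade::liveCount = 0;

// Minimal RAII wrapper: the only "delete" lives in the dtor.
class TradeHolder {
public:
    TradeHolder() : ptr_(new Trade) {}
    ~TradeHolder() { delete ptr_; }                       // runs once, at scope exit
    TradeHolder(const TradeHolder&) = delete;             // no copies => no double free
    TradeHolder& operator=(const TradeHolder&) = delete;
    Trade* get() const { return ptr_; }
private:
    Trade* ptr_;
};

// Returns the live count observed while the wrapper is in scope.
int countInsideScope() {
    TradeHolder holder;        // stack instance of the wrapper class
    return Trade::liveCount;   // evaluated before holder's dtor runs
}
```

After countInsideScope() returns, Trade::liveCount is back to zero without any explicit cleanup call.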

prefer for(;;)+break: cod`IV

For me at least, the sequencing of the 3-piece for-loop is sometimes trickier than I thought. The rules are supposedly simple, but I don’t get them exactly right sometimes. Can you always intuitively answer these simple questions? (Answers scattered.)

A87: ALWAYS absolutely nothing
A29: many statements. They are separated by many statements.

Q1: how many times (minimum, maximum) does the #1 piece execute?
Q2: how many times (minimum, maximum) does the #2 piece execute?
Q3: how many times (minimum, maximum) does the #3 piece execute?
Q: Does the number in A2 always exceed A3 or the reverse, or no always-rule?
Q29: what might happen between #2 and #3 statements?
Q30: what might happen between #3 and #2? I feel nothing could happen.
Q87: what might happen between #1 and #2 statements?
Q: what’s the very last statement (one of 3 pieces or something in loop body) executed before loop exit? Is it an “always” rule?

If there’s a q(continue), then things get less intuitive. http://stackoverflow.com/questions/16598222/why-is-continue-statement-ignoring-the-loop-counter-increment-in-while-loop explains the subtle difference between while-loop vs for-loop when you use “continue”.
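
A small sketch of that difference, with hypothetical function names. In the for-loop, continue jumps to piece #3 (the increment) and then to the condition. In the while-loop, continue jumps straight to the condition, so the increment has to happen before it.

```cpp
// Sums the odd numbers in [0, n) with a for-loop.
int sumOddFor(int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0) continue;   // ++i still runs after this continue
        sum += i;
    }
    return sum;
}

// Same result with a while-loop.
int sumOddWhile(int n) {
    int sum = 0;
    int i = 0;
    while (i < n) {
        int cur = i;
        ++i;                        // increment first; a continue placed before
                                    // this line would skip it and loop forever
        if (cur % 2 == 0) continue;
        sum += cur;
    }
    return sum;
}
```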

In contrast, while-loop is explicit. So is do-while. In projects, for-loop is concise and often more expressive. In coding interviews, conditions are seldom perfect, simple and straightforward, so for-loop is error prone. White-board coding IV (perhaps bbg too) is all about nitty-gritty details. The condition hidden in the for-loop is not explicit enough! I would rather use for(;;) and check the condition inside and break.

The least error-prone is for(;;) with breaks. I guess some coding interviewers may not like it, but the more tricky the algorithm is, the more we appreciate the simplicity of this coding style.

Always safe to start your coding interview with a for(;;) loop and carefully add to the header. You can still have increments and break/continue inside.
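
As a minimal sketch of this style (the function indexOf is just an illustration), the three pieces of the for-loop header are spelled out as ordinary statements, so their sequencing is visible on the whiteboard:

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the first element equal to target, or -1 if absent.
int indexOf(const std::vector<int>& v, int target) {
    std::size_t i = 0;                     // piece #1: initialization, runs once
    for (;;) {
        if (i >= v.size()) break;          // piece #2: exit condition, explicit
        if (v[i] == target) return static_cast<int>(i);
        ++i;                               // piece #3: increment, in plain sight
    }
    return -1;
}
```

Once the logic is right, the pieces can be moved back into the header if the interviewer prefers a conventional loop.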

low-churn professions often pay lower#le2Henry

category – skillist, gzThreat

I blogged about several slow-changing professions — medical, civil engineers, network engineers, teachers, quants, academic researchers, accountants (including financial controllers in banks).

My overall impression is, with notable exceptions, many of the slow-changing domains don’t pay so well. We will restrict ourselves to white-collar, knowledge intensive professions.

Sometime between 2013 and 2015, a tech author commented that compared to the newer languages of javascript, ruby, objective-C etc, java programmers are a more traditional, more mature, more stable, more enterprise community.

https://bintanvictor.wordpress.com/2014/11/03/technology-churn-ccjava-letter-to-many/ is my comparison of java, c#, c++. Basically I’m after the rare combination of

– mainstream,
– sustained, robust demand over 15 to 30 years
– low churn

Someone mentioned entry barrier. Valuable feature, but I think it is neither necessary nor sufficient a condition.

SQL and shell scripting are good examples. Very low churn; robust demand, mainstream. Salary isn’t highest, but decent.

[12]too many add-on packages piling up ] java^C++

(blogging) Biggest problem facing a new or intermediate java developer — too much new “stuff”, created by open source or commercial developers. Software re-usability? Software Component industry?…

Some job candidates are better able to articulate about these — advantage. On the real job, I don’t feel a developer needs to know so many java utilities (Architects?)

More than 3 C++ developers told me they prefer c++ over java for this reason. They told me that about the only add-on library they use is STL. Everything else is part of the core language. Some of them tell me in their trading/finance systems, other libraries are less used than STL — smart pointers + a few boost modules + some threading library such as pthreads. In contrast, I can sense they feel a modern day java system requires so many add-on items that it looks daunting and overwhelming.

The most influential books on c++ were written in the early 90’s (or earlier?)… Bottom line — If you know core language + STL you qualify for c++ jobs today. By the way, you don’t need deep expertise in template meta-programming or multiple inheritance as these are rarely used in practice.

In contrast, Java has many core (and some low-level add-on) components kept stable, such as memory model and core multi-threading, basic collections, RMI, serialization, bytecode instrumentation, reflection, JNI … This should in theory give reassurance to developers and job seekers. In reality, on the java (job) market stable core/infrastructure/fundamentals are overshadowed and drowned out by the (noisy) new add-on libraries such as spring, hibernate, JSF, gwt, ejb, rich faces.

I feel the java infrastructure technologies are more important to a java app (also to a java project or to a java team), but I continually meet hiring side asking for x years of hands-on experience with this or that flavor-of-the-month add-on gadget. Are any of these packages in the core language layers? I don’t feel very sure.

(I feel some are — cglib, leak detectors… but these aren’t in job specs….)

I suspect many hiring managers don’t care about those extra keywords and just want to hire strong fundamentals, but they are forced to add those flavor-of-the-month keywords to attract talent. Both sides assume those hot new things are attractive to the other side, so they want to “flash” those new gadgets.

Whatever the motivation, result is a lot of new add-on gadgets we developers are basically forced to learn. “Keep running or perish.” — it’s tiring.

op-new : no DCBC rule

B’s op-new is bypassed by D’s op-new [1]
B’s ctor is always used (never bypassed) by D’s ctor.

This is an interesting difference.

Similarly, an umbrella class’s op-new [1] would not call a member object’s op-new. See [[more effC++]]

These issues are real concerns if you want to use op-new to prohibit heap instantiation of your class.

See http://bigblog.tanbin.com/2012/01/dcbc-dtor-execution-order.html

[1] provided these two classes each define an op-new()

By the way, op-new is a static member operator, but is still inherited.
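
A minimal sketch of the contrast, assuming two toy classes B and D that each define an op-new (as in footnote [1]). The counters are only there to make the calls observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

int baseNewCalls = 0;
int derivedNewCalls = 0;
int baseCtorCalls = 0;

struct B {
    B() { ++baseCtorCalls; }
    virtual ~B() {}
    static void* operator new(std::size_t sz) {
        ++baseNewCalls;
        if (void* p = std::malloc(sz)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) { std::free(p); }
};

struct D : B {
    // Hides B's op-new: "new D" calls this one only.
    static void* operator new(std::size_t sz) {
        ++derivedNewCalls;
        if (void* p = std::malloc(sz)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) { std::free(p); }
};

void allocateOneD() {
    D* d = new D;   // D's op-new runs once; B's op-new is bypassed; B's ctor still runs
    delete d;
}
```

If D did not define its own op-new, "new D" would use the inherited B::operator new, which is why the size argument matters in a base-class op-new.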

mv-semantic: keywords

I feel all the tutorials seem to miss some important details and sell propaganda. Maybe [[c++ recipes]] is better?

[s = I believe std::string is a good illustration of this keyword]

  • [s] allocation – mv-semantic efficiently avoids a fresh heap allocation (and the copying that goes with it)
  • [s] resource — is usually allocated on heap and accessed via a pointer field
  • [s] pointer field – every tutorial shows a class with a pointer field. Note a reference field is much less common.
  • [s] deep-copy – is traditional. Mv-semantics uses some special form of shallow-copy. Has to be carefully managed.
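  • [s] temp – the RHS of mv-semantic is typically a temp object (an rvalue). A named object can also be moved from via the move() function, which merely casts it to an r-val reference (RVR); by doing so I believe we promise to the compiler not to rely on the old value afterwards. For standard library types the moved-from object is left in a valid but unspecified state, so accessing it is not UndefBehv per se, but its content should not be relied on. See [[c++standard library]]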
  • promise – see above
  • containers – All standard STL container classes (including std::string) provide mv-semantics. Here, the entire container instance is the payload! Inserting a float into a container won’t need mv-semantics.
  • [s] expensive — allocation and copying assumed expensive. If not expensive, then the move is not worthwhile.
  • [s] robbed — the source object of the move is crippled, robbed, abandoned and should not be used afterwards. Its “resource” is already stolen, so the pointer field to that resource should be set to NULL.
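
The keywords above can be tied together in a minimal sketch. Buffer is a hypothetical class with a pointer field owning a heap resource: the copy ctor does the traditional deep-copy, while the move ctor does a shallow copy and leaves the source robbed, with its pointer set to null.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new char[n]) {
        std::memset(data_, 0, n);
    }
    ~Buffer() { delete[] data_; }   // delete[] on nullptr is a harmless no-op

    // Deep copy: new allocation plus memcpy (the expensive path).
    Buffer(const Buffer& rhs) : size_(rhs.size_), data_(new char[rhs.size_]) {
        std::memcpy(data_, rhs.data_, size_);
    }

    // Move: steal the pointer, no allocation, then cripple the source.
    Buffer(Buffer&& rhs) noexcept : size_(rhs.size_), data_(rhs.data_) {
        rhs.size_ = 0;
        rhs.data_ = nullptr;
    }

    Buffer& operator=(const Buffer&) = delete;   // kept out to keep the sketch short

    std::size_t size() const { return size_; }
    const char* data() const { return data_; }

private:
    std::size_t size_;
    char* data_;   // the pointer field every tutorial shows
};
```

A usage example: after `Buffer b(std::move(a));` the new object b holds the very same heap block that a used to hold, and a.data() is null.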

——–
http://www.boost.org/doc/libs/1_59_0/doc/html/move/implementing_movable_classes.html says “Many aspects of move semantics can be emulated for compilers not supporting rvalue references and Boost.Move offers tools for that purpose.” I think this sheds light…

Within finance, technology outlives most job functions

Look at these job functions —

* Many analysts in finance need to learn data analytics software ….
* Risk managers depend on large risk systems…
* Quants need non-trivial coding skill…
* Everyone in finance needs Excel, databases, and … financial data.
…. while the IT department faces no threat, except outsourcing. Why?

Surely … Financial data volume is growing
Surely … Automation reduces human error, enforces control — operational risk…
Computer capabilities are improving
Financial data quality is improving
Computers are good at data processing, esp. repetitive, multi-step…
Financial info tech is important and valuable (no need to explain), not simple, requires talent, training, experience and a big team. Not really blue-collar.

Many techies point out the organizational inefficiencies and suggest there’s no need for so many techies, but comparatively is there a need for so many analysts, or so many risk managers, or so many accountants or so many traders? Every role is dispensable! Global population is growing and getting better educated, so educated workforce must work.

c++IV: importance: knowledge imt dev xp

1) Many hard-core tech interviewers (Yaakov, Jump, 3Arrows, Bbg, nQuants …) often asked me to explain a language feature, then drill in to see if I really do understand the key ideas, including the rationale, motivation and history. This knowledge is believed to … /separate the wheat from the chaff/

This knowledge can’t be acquired simply by coding. In fact, a productive coder often lacks such knowledge since it’s usually unnecessary theoretical knowledge.

2) West Coast always drills in on algo (+ data structure). No way to pick up this skill in projects…

1+2 —> many interviewers truly believe a deep thinker will always learn faster and design better.

%% priorities in a take-home cod`IV

A lot of times the requirements are way too numerous or stringent, given the time limit. Must give up some. Here are my priorities:

  1.  basic functionality. The essential problem should be solved, or shown to be solvable.
    • pass the essential test cases only
  2. On that foundation, add as many big features (in the problem statement) as I can within the time limit. Leave out the small features.
  3. simplify. Simplicity is the first element of clean, readable code.

I guess some hiring managers are very particular about code quality and readability. I feel it’s like a beauty contest. I would give up in those cases. I was told JPM follows [[Clean Code]] when judging/scoring/ranking code submitted.

Need to ensure file timestamps are all within a short window. Prefer to use fewer files. Java solutions require more files :(. If I claim 2H, then better leave some rough edges in the code like input validations.

math power tools transplanted -> finance

南橘北枳

* martingale originates in gambling…
* Brownian motion originates in biology.
* Heat equation, Monte Carlo, … all have roots in physical science.

These models worked well in the original domains, because the simplifications and assumptions are approximately valid even though clearly imperfect. Assumptions are needed to simplify things and make them /tractable/ to mathematical analysis.

In contrast, financial mathematicians had to make blatantly invalid assumptions. You can find fatal flaws from any angle. Brian Boonstra told me all good quants appreciate the limitations of the theories. A small “sample”:

– The root of the randomness is psychology, or human behavior, not a natural phenomenon. The outcome is influenced fundamentally by human psychology.
– The data shows skew and kurtosis (fat tail).
– There’s often no way to repeat an experiment
– There’s often just a single sample — past market data. Even if you sample it once a day, or once a second, you still operate on the same sample.

tech zbs( !!GTD) outweigh other(soft)job skills #%%belief

label: big20

In many teams (including …), I feel technical zbs/GTD capabilities are still the #1 determinant of job security, comfort level, competence level, work-life balance. Secondary yet important factors include

– relative caliber within the team
– boss relationship
– criticality of the role within the team, within the firm
– reliable funding of the project, team or role
– closeness-to-money
– long work hours
– fire-n-hire culture

So zbs is key to keeping a job, but for finding a job, something else is needed.

test if a key exists in multimap: count() perf tolerable

map::count() is simpler and slightly slower 🙂 than map::find()

Even for a multimap, count() is slower but good enough in a quick coding test. Just add a comment to say “will upgrade to find()”

In terms of cost, count() is only slightly slower than find(). Note multimap::count() complexity is Logarithmic in map size, plus linear in the number of matches… Probably because the matching entries live together in the RB tree.
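
Both ways of testing for a key can be sketched side by side. The helper names are hypothetical; count() and find() are the standard std::multimap members.

```cpp
#include <map>
#include <string>

// Quick-and-simple version: logarithmic in map size plus linear in the number of matches.
bool hasKeyByCount(const std::multimap<std::string, int>& m, const std::string& k) {
    return m.count(k) > 0;
}

// The "upgrade": stops at the first match, logarithmic in map size.
bool hasKeyByFind(const std::multimap<std::string, int>& m, const std::string& k) {
    return m.find(k) != m.end();
}
```

The two always agree; the difference only shows up when a key has many matching entries.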

linq – a scratch seeking an itch

The more I read about linq2collection, the more convinced I feel that it’s not strictly necessary. Old fashioned for-loops are cleaner and easier to read. Exception handling is cleaner. Branching is cleaner. Mid-stream abort is cleaner.

A linq proponent may say OO was not strictly necessary (compared to procedural), and relational DB was not strictly necessary (compared to non-relational DB) but offer real values and so is linq. Here’s my argument, focusing on linq2collections.

– I feel this is a local implementation choice, not a sweeping, strictly enforced architectural choice. As such, a particular method can choose not to use linq.
– Another comparison is ORM. Once ORM is chosen in a project, you can hardly choose not to use it in one particular method. Linq is more flexible, less enforced and more “sociable”.

A java guy has no such tool but is perfectly productive, until java 8 streams…

I feel linq gained mindshare on the back of c# success. If F#, python or delphi invented such a cool feature, it would remain a niche feature. I don’t see another language embracing linq, until java 8.

Nevertheless, we must master linq for the sake of job interviews. This is somewhat comparable to c++ multiple inheritance – not used widely but frequently quizzed by interviewers.