fear@large codebase #web/script coders

One Conclusion — my c++ /mileage/ made me a slightly more confident, and slightly more competent programmer, having “been there; done that”, but see the big Question 1 below.

— Historical view

For half my career I avoided enterprise technologies like java/c++/c#/SQL/storedProc/Corba/sockets… and favored light-weight technologies like web apps and scripting languages. I suspect that many young programmers also feel the same way — no need to struggle with the older, harder technologies.

Until GS, I was scared of the technical jargon, complexities, low-level APIs, debuggers/linkers/IDEs, compiler errors and opaque failures in java/SQL … (even more scared of C and Windows). Scared of the larger, more verbose codebases in these languages (cf the small php/perl/javascript programs)… so scared that I had no appetite to study these languages.

— many guys are unused to large codebases

Look around your office. Many developers have at most a single (rarely two) project involving a large codebase. Large like 50k to 100k lines of code excluding comments.

I feel the devops/RTB/DBA or BA/PM roles within dev teams don’t require the individual to take on those large codebases. Since it’s no fun, time-consuming and possibly impenetrable, few of them would take it on. In other words, most people who try would give up sooner or later. Searching in a large codebase is perhaps their first challenge. Even figuring out a variable’s actual type can be a challenge in a compiled language.

Compiling can be a challenge esp. with C/c++, given the more complex tool chain, as Stroustrup told me.

Tracing code flow is a common complexity across languages but worse in compiled languages.

In my experience, perl, php, py and javascript codebases are usually small like pets. When they grow into big creatures they are as daunting and formidable as compiled-language projects. Some personal experiences:
* Qz? Not a python codebase at all
* pwm comm perl codebase? I would STILL say codebase would be bigger if using a compiled language

Many young male/female coders are not committed to large scale dev as a long-term career, so they probably don’t like this kinda tough, boring task.

— on a new level

  • Analogy — if you have not run marathons you would be afraid of it.
  • Analogy — if you have not coached a child on big exams you would be afraid of it.

I feel web (or batch) app developers often lack the “hardcore” experience described above. They operate at a higher level, cleaner and simpler. Note Java is cleaner than c++. In fact I feel weaker as a java programmer compared to a c++ programmer.

Q1: I have successfully mastered a few sizable codebases in C++, java, c#. So how many more successful experiences do I need to feel competent?
A: ….?

Virtually every codebase feels too big at some time during the first 1-2 years, often when I am in a low mood, despite the fact that in my experience, I was competent with many of these large codebases.
I think Ashish, Andrew Yap etc were able to operate well with limited understanding.
I now see the whole experience as a grueling marathon. Tough for every runner, but I tend to start the race assuming I’m the weakest — impostor syndrome.
Everyone has to rely on logs and primitive code browsing tools. Any special tools are usually of marginal value. With java, the live debugger is the most promising tool but still offers limited pain relief. Virtually all of my fellow developers face exactly the same challenges so we all have to guess. I mean Yang, Piroz, Sundip, Shubin, … virtually all of them, even the original authors of the codebase. Even after spending 10Y with a codebase, we could face opaque issues. However, these peers are more confident in the face of ambiguity.

pre-allocated array as backing store for graph nodes #java No

I think any graph node can use the same technique, but here I present a simple yet interesting use case — a linked list with each node allocated from an array. https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_66mem/slistFromArray.cpp shows three home-made implementations:

  1. backing array of dummy link nodes, pre-allocated at compile time
  2. backing array of dummy link nodes, pre-allocated from free store aka DMA (dynamic memory allocation)
  3. backing array is a byte array on heap or data section. Each link node is constructed via placement-new.
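The third variant can be sketched roughly as below. This is only an illustration under my own assumptions, not the code in slistFromArray.cpp: the names `Node`, `NodePool`, `create` and `demoSum` are hypothetical, and the byte array here is a class member rather than a heap or data-section buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical node type for illustration; not taken from slistFromArray.cpp.
struct Node {
    int payload;
    Node* next;
    Node(int p, Node* n) : payload(p), next(n) {}
};

// A fixed-capacity pool: raw, suitably aligned bytes on which nodes are
// constructed one after another via placement-new.
template <std::size_t Capacity>
class NodePool {
public:
    // Construct a node in the next free slot; returns nullptr when the pool is full.
    Node* create(int payload, Node* next) {
        if (used_ == Capacity) return nullptr;
        void* slot = buf_ + used_ * sizeof(Node);
        ++used_;
        return new (slot) Node(payload, next);  // placement-new: no heap allocation
    }
    std::size_t size() const { return used_; }

private:
    alignas(Node) unsigned char buf_[Capacity * sizeof(Node)];
    std::size_t used_ = 0;
};

// Build the list 3 -> 2 -> 1 by pushing at the head, then sum the payloads.
inline int demoSum() {
    NodePool<4> pool;
    Node* head = nullptr;
    for (int i = 1; i <= 3; ++i) head = pool.create(i, head);
    int sum = 0;
    for (Node* p = head; p != nullptr; p = p->next) sum += p->payload;
    return sum;
}
```

`Node` is trivially destructible here, so the sketch never calls destructors explicitly; a node type that owns resources would need an explicit destructor call for each constructed slot.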

Here are a few Advantages that I consider minor because linked list is seldom needed in low-latency

  1. d-cache efficiency
  2. eliminates runtime load on heap allocator, since memory is pre-allocated. See malloc=long considered costly

Advantage #3: For c++ algo questions, this set-up has an interesting advantage — The node address is now an index into the backing array. This index is a natural auto-increment ID , based on creation order.

Now, the biggest advantage of linked list over vector is mid-stream insert/delete. One of the biggest disadvantages is lack of random-access. If nothing happens mid-stream (as in coding questions), then we can achieve random-access-by-id using array as backing store.

If nothing happens mid-stream, then this linked list is physically similar to an array with extra capacity.
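A minimal sketch of the index-as-ID idea, assuming an append-only list. The names `IdxNode`, `ArrayList`, `append`, `payloadOf` and `nthByWalking` are mine, for illustration only; links are stored as array indices, with -1 meaning "no next node".

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; -1 stands for "no next node".
struct IdxNode {
    int payload = 0;
    int next = -1;  // index of the next node in the backing array
};

template <std::size_t Capacity>
class ArrayList {
public:
    // Append at the tail; the returned index doubles as an auto-increment ID.
    int append(int payload) {
        assert(count_ < Capacity);
        const int id = static_cast<int>(count_++);
        nodes_[id].payload = payload;
        nodes_[id].next = -1;
        if (tail_ >= 0) nodes_[tail_].next = id; else head_ = id;
        tail_ = id;
        return id;
    }
    // Random access by ID: O(1), no traversal needed.
    int payloadOf(int id) const { return nodes_[id].payload; }
    // Conventional access: follow the links from the head, n hops.
    int nthByWalking(int n) const {
        int cur = head_;
        while (n-- > 0) cur = nodes_[cur].next;
        return nodes_[cur].payload;
    }

private:
    std::array<IdxNode, Capacity> nodes_{};
    std::size_t count_ = 0;
    int head_ = -1;
    int tail_ = -1;
};
```

As long as nothing is inserted or deleted mid-stream, creation order equals list order, so `payloadOf(id)` and `nthByWalking(id)` return the same value.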

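This technique won’t work in java because a java array of Node is an array of pointers (references). Each Node object is still allocated individually on the heap, so the array cannot serve as the backing store for the nodes themselves.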

c++toolchain complexity imt new languages #%%advantage

The modern languages all feature a dramatically simplified tool chain. In contrast, the c++ tool chain feels much bigger to me, including profilers, static analyzers, binary file dumpers, linkers ..

This is one of the real obstacles to new entrants, young or old. This is also my (slowly growing) competitive advantage. I feel some people (like Kevin of Macq) know more, but most developers have a cursory working knowledge in this field.

I was frustrated for years by the complex and messy build tools in c++. Same for the other new entrants — Rahul spent a month setting up Eclipse CDT…

This learning curve, entry barrier … is a direct consequence of the c++ “sweet spot” as Stroustrup described: inherently complex codebases close to hardware.

I wrote dozens of blogposts about c++ build issues. For example, on windows, my strawberryPerl + git_bash + notepad++ setup is unknown to many. These fellow developers struggle with MSVS or Eclipse !

Due to the bigger ecosystem needed to support c++, new features are added at a slower pace than in languages that have a central organization.

c#^ java+cpp journeys

My c# xx journey was exciting for 6M before OC and in my first year in OC. In contrast, my c++/coreJava (less so for jxee) journeys have generated superior ROTI (elusive) because 1. the interview topics are stable 2. market waves steered me to stick to (not abandon) these career directions. Leverage? c# is lower but not bad. See separate blogpost. In terms of my expectations:

  • java – exceeding my expectations in churn. Found 2nd life in web2.0.
  • c# – missed my expectations. Displaced in web2.0. Google CIV uses 5 languages, without c#
  • c++ – matching my expectation. slow decline. Efficiency advantage is eclipsed by java and some new languages
  • py – exceeding my expectation
  • javascript – exceeding expectation

For all languages, there is no salary hike and no strategic value, so at that level all are underwhelming.

c++IV=much harder than GTD #Mithun

c++ IV is much harder than c++ job GTD, as I told Mithun.

  • GTD is no different from java jobs, even though the build process can be slightly hairy. Java build can also get messy.
  • In contrast, C++ IV is a totally different game.

You need a rating of 1/10 to do a decent job, but need 7/10 to pass ibank interviews. This gap is wider in c++ than in java as java interview bar is much lower.

Most technical challenges on the job are localSys, so you can just look at existing code and imitate it (draw a tiger using a cat as the model, follow the established recipe), as AndrewYap does. Venkat of RTS said we shouldn’t but we still do.

Corollary: after 3Y in a full time c++ job, you may still fail to pass those interviews. Actually I programmed C for 2Y but couldn’t pass any C interview whatsoever.

[19]c++guys becom`very unlucky cf java

On 22 Apr 2019 I told Greg that c++ developers like me, Deepak, CSY.. are just so unlucky — most of the WallSt c++ jobs are too demanding in terms of latency engineering, either on buy-side or sell-side.

Greg agreed that java interviews are much easier to pass. Greg said if you have reasonable java skills, then you can get a job in a week.

I told Greg that the only way Deepak or CSY could get an offer is through one of the few easy-entry c++jobs, but there are relatively few such jobs i.e. without a high entry barrier.

— widespread view that c++ developers are perceived as strongest due to c++ being a hard language

My conclusion

  1. yes in terms of QQ and zbs
  2. no in terms of GTD, as GTD challenge is mostly due to localSys. Even a python codebase can be hard.

success@concurrency features] java^c++^c#

I’m biased towards java.

I feel c# concurrency is less impactful because most of the important concurrent systems use *nix servers not windows, and most concurrent programming jobs do not go to windows developers.

Outside windows, c++ concurrency is mostly based on the C library pthreads, non-OO and inconvenient compared to java/c#

The c++11 thread classes are the next generation following pthreads, but not widely used.
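For comparison with the C-style pthreads API, here is a small sketch using the c++11 thread classes. The function name `countWithThreads` and its parameters are made up for this illustration; on some platforms the program must be linked with `-pthread`.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Each worker adds `perThread` increments to a shared counter under a mutex.
// std::thread takes a lambda directly, unlike the void*(*)(void*) entry point
// that pthread_create requires.
inline long countWithThreads(int numThreads, int perThread) {
    long counter = 0;
    std::mutex mtx;
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&counter, &mtx, perThread] {
            for (int i = 0; i < perThread; ++i) {
                std::lock_guard<std::mutex> guard(mtx);  // RAII lock and unlock
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();  // wait for every worker to finish
    return counter;
}
```

Calling `countWithThreads(4, 1000)` should return 4000, since every increment happens while the mutex is held.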

Java’s concurrency support is the most successful among languages, well-designed from the beginning and rather stable. It’s much simpler than c++11 thread classes, having only the Thread.java and Runnable.java data types. More than half the java interviews would ask about threading, because java threading is understandable and usable by the average programmer, who can’t really understand c++ concurrency features.

criticalMass[def]against churn ] tech IV

See also body-building impact{c++jobs#^self-xx

IV knowledge Critical mass (eg: in core java self-study) is one of the most effective strategies against technology churn in tech interviews. Once I accumulate the critical mass, I don’t need a full time job to sustain it.

I have reached critical mass with core java IV, core c++ IV, swing IV (no churn) and probably c# IV.

The acid test is job interviews over a number of years.

Q: how strongly is it (i.e. critical mass) related to accumulation?
A: not much AFTER you accumulate the critical mass. With core java I did it through enough interviews and reading.

Q: how strongly is it related to leverage?
A: not much though Critical mass enhances leverage.

Q: why do some domains offer no critical mass?
A: some (jxee) interview topics have limited depth
A: some (TMP, py) interview topics have No pattern I could identify from interview questions.

 

c++complexity≅30% above java #c#=in_between

Numbers are just gut feelings, not based on any measurement. I often feel “300% more complexity” but it’s nicer to say 30% 🙂

  • in terms of interview questions, I have already addressed this in numerous blog posts.
  • see also mkt value@deep-insight: java imt c++
  • tool chain complexity in compiler+optimizer+linker… The c++ compiler is 200% to 400% (not merely 30%) more complex than java… see my blogpost on buildQiurks. Here are some examples:
    • undefined behaviors … see my blogposts on iterator invalidation
    • RVO: top example of the optimizer frustrating anyone hoping to verify basic move-semantics.
    • See my blogpost on gdb stepping through optimized code
    • See my blogpost on implicit
  • syntax: c++ >> c# > java
    • java is very very clean yet powerful 😦
    • C++ has too many variations, about 100% more than c# and 300% more than java
  • core language details required for GTD:
    • my personal experience shows me c++ errors are more low-level.
    • Java runtime problems tend to be related to the (complex) packages you adopt from the ecosystem. They often use reflection.
    • JVM offers many runtime instrumentation tools, because JVM is an abstract, simplified machine.
  • opacity: c++ > c# > java
    • dotnet IL bytecode is very readable. Many authors reference it.
    • java is even cleaner than c#. Very few surprises.
  • more low-level: c++ > c# > java.
    • JVM is an excellent abstraction, probably the best in the world. C# CLR is not as good as JVM. A thin layer above the windows OS.
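The RVO frustration mentioned in the list above can be reproduced with a small instrumented class. This is a sketch under my own assumptions: `Tracker`, `makeNamed`, `movesViaFactory` and `movesViaExplicitMove` are hypothetical names, and the number of moves seen through the factory depends on the compiler, the language standard and flags such as `-fno-elide-constructors`.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Returns a named local by value; the compiler may apply NRVO and elide the move.
inline Tracker makeNamed() {
    Tracker t;
    return t;
}

// Moves observed when initializing from the factory. With elision this is 0,
// so the move constructor you hoped to observe never runs.
inline int movesViaFactory() {
    Tracker::moves = 0;
    Tracker t = makeNamed();
    (void)t;
    return Tracker::moves;
}

// An explicit std::move from a named object is never elided: exactly one move.
inline int movesViaExplicitMove() {
    Tracker::moves = 0;
    Tracker src;
    Tracker dst = std::move(src);
    (void)dst;
    return Tracker::moves;
}
```

The contrast between the two helper functions is the point: the explicit move is always visible, while the factory path usually shows no move at all in an optimized (or even default) build.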

## marketable syntax nlg: c++ > j/c#

Every language has poorly understood syntax rules, but only in c++ did these become fashionable, and halos in job interviews!

  • ADL
  • CRTP
  • SFINAE
  • double pointers
  • hacks involving void pointers
  • operator overloading to make smart ptr look like original pointers
  • TMP hacks using typedef
  • TMP hacks using non-type template param
  • universal reference vs rvr
  • rval: naturally occurring vs moved
    • const ref to extend lifetime of a naturally occurring rval object
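The last item, lifetime extension through a const reference, can be illustrated with a type that counts live instances. The names `Noisy`, `makeNoisy`, `aliveWhileBound` and `aliveAfterDiscard` are hypothetical, chosen only for this sketch.

```cpp
#include <cassert>

// Hypothetical type that tracks how many instances are currently alive.
struct Noisy {
    static int alive;
    Noisy() { ++alive; }
    Noisy(const Noisy&) { ++alive; }
    ~Noisy() { --alive; }
};
int Noisy::alive = 0;

// Returns a naturally occurring rvalue (a temporary).
inline Noisy makeNoisy() { return Noisy(); }

// Binding the temporary to a const reference extends its lifetime to the
// lifetime of the reference, so it is still alive on the next line.
inline int aliveWhileBound() {
    const Noisy& ref = makeNoisy();
    (void)ref;
    return Noisy::alive;  // evaluated before ref goes out of scope
}

// Without a reference binding, the temporary is destroyed at the end of the
// full expression that created it.
inline int aliveAfterDiscard() {
    makeNoisy();
    return Noisy::alive;
}
```

Here `aliveWhileBound()` returns 1 and `aliveAfterDiscard()` returns 0, and the global count is back to 0 after both calls.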

c++ecosystem[def]questions are tough #DeepakCM

C++ interviewers may demand c++ecosystem knowledge but java also has its own ecosystem like add-on packages.

As I told my friend and fellow c++ developer Deepak CM,

  1. c++ecosystem QQ questions can be more obscure and tougher than core c++ questions
    • tool chain — compiler, linker, debugger, preprocessor
    • IPC, socket, pthreads and other C-level system libraries
    • kernel interface: signals, interrupts, timers, device drivers, virtual memory + system programming in general # see the blog category
    • processor cache tuning
    • (at a higher level) boost, design patterns, CORBA, xml
    • cross-language integration with python, R, pyp, Fortran + other languages
  2. java ecosystem QQ questions are easier than core java questions. In other words, toughest java QQ questions are core java.
    • java ecosystem questions are often come-n-go, high-churn

Low level topics are tough

  1. c++ ecosystem questions are mostly in C and very low-level
  2. java ecosystem questions are usually high-level
    • JVM internals, GC … are low-level and core java

 

c++changed more than coreJava: QQ perspective

Recap — A QQ topic is defined as a “hard interview topic that’s never needed in projects”.

Background — I used to feel as new versions of an old language get adopted, the QQ interview topics don’t change much. I can see that in java7, c#, perl6, python3.

To my surprise, compared to java7/8, c++0x has more disruptive impact on QQ questions. Why? Here are my guesses:

  • Reason: low-level —- c++ is more low-level than java at least in terms of interview topics. Both java8 and c++0x introduced many low-level changes, but the java interviewers don’t care that much.
  • Reason: performance —- c++0x changes have performance impact esp. latency impact, which is the hot focus of my target c++ employers. In contrast, java8 doesn’t have much performance impact, and java employers are less latency-sensitive.
  • Reason: template  —- c++0x feature set has a disproportionate amount of TMP features which are very hard. No such “big rock” in java.
    • move/forward, enable_if, type traits
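To give a flavor of these features, here is a small sketch combining `std::enable_if`, type traits and `std::forward`. The function names `describe`, `category` and `relay` are invented for this illustration; only the standard library facilities are real.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Overload that participates only for integral types (SFINAE removes it otherwise).
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
describe(T) { return "integral"; }

// Overload that participates only for floating-point types.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
describe(T) { return "floating"; }

// Reports whether the argument arrived as an lvalue or an rvalue.
inline std::string category(const std::string&) { return "lvalue"; }
inline std::string category(std::string&&) { return "rvalue"; }

// Perfect forwarding: std::forward preserves the caller's value category.
template <typename T>
std::string relay(T&& arg) { return category(std::forward<T>(arg)); }
```

With these definitions, `describe(42)` yields "integral", `describe(3.14)` yields "floating", `relay(s)` for a named `std::string s` yields "lvalue", and `relay(std::string("y"))` yields "rvalue".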

Q: if that’s the case, for my career longevity, is c++ a better domain than java?
A: I’m still biased in favor of low-level languages

Q: is that a form of technology churn?
A: yes since the c++11 QQ topics are likely to show up less over the years, replaced by newer features.

##3 java topics I can build zbs #concurrency

C++ has many zbs topics with accu + job market relevance, but I won’t list them here. In contrast, java has very few qualifying topics —

  • [b] concurrency engineering — low-level techniques + high-level designs.
  • latency improvement to c++level — Accumulated experience can become outdated but relatively long shelf-life
  • [b] GC and JIT — Accumulated experience soon becomes outdated
  • –non-ideal choices
  • large java app tuning? can’t read. High churn. Accumulated experience can become outdated
  • [b] reflection? Powerful but seldom quizzed. Low mkt value
  • [b] java5 generics topics as in my book? low industry adoption. Seldom quizzed.
  • jvm source code? never required in IV
  • [b=luckily there are books. I’m good at reading those books.]

Hadoop apps: Is java preferred@@

I didn’t hear about any negative experience with any other languages, but I would assume yes: java is preferred, and the most proven choice. If you go with the most popular “combination” then you can find the thriving ecosystem, with the most resources online and the widest range of support tools.

–According to one website:

Hadoop itself is written in Java, with some C-written components. The Big Data solutions are scalable and can be created in any language that you prefer. Depending on your preferences, advantages, and disadvantages presented above, you can use any language you want.

[18]t-investment: c++now surpassing java

My learning journey has been more uphill in c++. Up to 2018, I probably have invested more effort in c++ than any language including java+swing.

I analyzed c++QQ more than java QQ topics, because java is Significantly easier, more natural for me.

I read and bought more c++ books than java+swing books.

If I include my 2Y in Chartered and 2Y in Macq, then my total c++ professional experience is comparable to java.

Q: why until recently I felt my GTD mileage was less than in java+swing?

  • A #1: c++ infrastructure is a /far cry/ from the clean-room java environment. More complicated compilation and more runtime problems.
  • A: I worked on mostly smaller systems… less familiar with the jargons and architecture patterns
  • A: not close to the heart of bigger c++ systems

Q: why until recently I didn’t feel as confident in c++ as java+swing?

  • A #1: interview experiences. About 30% of my c++ interviews were HFT. I always forgot I had technical wins at SIG and WorldQuant
  • A #2: GTD mileage, described above.

jvm^c++ as infrastructure

c/c++ is part of the infrastructure of many new technologies, and consequently will last for decades whereas java may not.

😦 This doesn’t mean there will be enough c++ jobs for me and my C++ friends.

  • JVM is an infrastructure for a relatively small number of new languages and new frameworks like spring, hadoop,.. However, the machine learning community seems to regard python and c++ as the mainstay.
  • Java (not JVM) serves as infrastructure in the new domains of MSA, cloud, big data etc, but not Machine Learning.

java’s amazing perf^simplicity #c++/c#

Paradoxically
* java’s syntax simplicity is on par with python, better than c# and much better than c++.
* java’s performance is on par with c# and c++, largely due to JVM and JIT

java has been popular on web servers, and crucially the newer mobile, cloud, big-data platforms, beating c++, c#, python

java’s adoption rate as a foundation platform or integration-target… is better than all other languages. Many products are built for java, on JVM or with java in mind. I’m deliberately vague here because I don’t want to spend too much time analyzing this vague, general observation.

c++^java..how relevant ] 20Y@@

See [17] j^c++^c# churn/stability…

C++ has survived more than one wave of technology churn. It has lost market share time and time again, but hasn’t /bowed out/. I feel SQL, Unix and shell-scripting are similar survivors.

C++ is by far the most difficult language to use and learn. (You can learn it in 6 months but likely only superficially.) Yet many companies still pick it instead of java, python, ruby, which is a sign of strength.

C is low-level. C++ usage can be equally low-level, but c++ is more complicated than C.

c++lib to build simple http endpoint #Mark

Note this is about c++ add-on packages (like boost), less popular in c++ than in jxee !

After overspending on boost (and hibernate, gemfire, struts), I won’t repeat the mistake.

My preference is not feature set or flexibility, but simplicity.  In my professional experience, I remember the java servlet and C# WCF implementations are robust, feature-rich and industry-strength. They are not simple. I like the simplicity in python and perl http endpoints.

  • The simpler, the faster.
  • The simpler, the fewer mistakes we tend to make.
  • The simpler, the more adaptable.
  • The simpler, the easier to integrate with other components
Now some c++ libraries to implement a simple http endpoint:
  1. https://cpp-netlib.org/0.9.1/hello_world_server.html  shows a simple http endpoint constructed with cpp-netlib
  2. https://code.google.com/archive/p/mongoose/ is a simple library to construct http endpoints
  3. https://www.boost.org/doc/libs/1_55_0/doc/html/boost_asio/example/cpp11/http/server/main.cpp is a demo using boost::asio, which is industry-strength, not simple

c++interview tougher than java on WallSt #XR

My friend XR is first to point this out.

Q1: for candidates with 5+ years of experience, is the passing rate really worse in c++ IV than java IV? Let’s limit ourselves to sell-side.
%%A: indeed slightly lower.

Q2: is c++ paying slightly higher? I don’t think so.

Both java and c++ (c# too but not python) interviews go to crazy low-level details. Most tough questions in java/c#/python are low-level. C/C++ is a lower-level language.

c++(!!Java)ecosystem questions are tough #Deepak

Perhaps the most important reason for my Q1 answer is selectivity. A java hiring team can be highly selective but on average, c++ hiring teams have higher selectivity:

  • higher bar
  • more time allocated to selecting the right candidate
  • less interested in commodity skills and more demanding on in-depth knowledge
  • less likely to settle for second best talent

What does it mean for Shanyou and Deepak? They are unlucky to hit a rising bar in a shrinking domain… increasingly competitive.

3changes]SG job market #cautious optimism

  1. Many non-finance companies now can pay 150k base or higher for a senior dev role. In my 2015 job search, I didn’t find any
  2. Many smaller fintech companies (not hedge funds) can pay 150k base or higher
  3. c++ (and c#) is /conceding/ market share to java, partly due to the two categories above. Apparently, java is growing more dominant than before. I guess java is more proven, better supported by a bigger ecosystem, and has a bigger talent pool. In contrast, c++ skill is harder to find in Singapore?
    1. Overall good news for me since my java arm is still stronger than c++ arm
  4. remote hiring — more Singapore teams are willing to hire from overseas. Lazada said “mostly over skype”
  5. no longer blue-collar — programmer used to be blue-collar support staff for the revenue staff. Some of the companies listed above treat programmers as first-class citizens.

GTD skill is harder,lasts longer in c++ than in Cleaner languages

In terms of troubleshooting, C++ is 90% same as C, which is a low-level language, close to the hardware.

In contrast, higher level languages strive to have the low level details encapsulated, so developers only need to deal with a simplified, standardized, cleaner façade. Some call it a virtualization.

Eg: sockets

Eg: c++ threading vs java threading

C for latency^^TPS can use java

I’m 98% confident — low latency favors C/C++ over java [1]. FPGA is _possibly_ even faster.

I’m 80% confident — throughput (in real time data processing) is achievable in C, java, optimized python (Facebook?), optimized php (Yahoo?) or even a batch program. When you need to scale out, Java seems the #1 popular choice as of 2017. Most of the big data solutions seem to put java as the first among equals.

In the “max throughput” context, I believe the critical java code path is optimized to the same efficiency as C. JIT can achieve that. A python and php module can achieve that, perhaps using native extensions.

[1] Actually, java bytecode can sometimes run faster than compiled C code (See my other posts such as https://bintanvictor.wordpress.com/2017/03/20/how-might-jvm-beat-cperformance/)

[12]too many add-on packages piling up ] java^C++

(blogging) Biggest problem facing a new or intermediate java developer — too much new “stuff”, created by open source or commercial developers. Software re-usability? Software Component industry?…

Some job candidates are better able to articulate about these — advantage. On the real job, I don’t feel a developer needs to know so many java utilities (Architects?)

More than 3 C++ developers told me they prefer c++ over java for this reason. They told me that about the only add-on library they use is STL. Everything else is part of the core language. Some of them tell me in their trading/finance systems, other libraries are less used than STL — smart pointers + a few boost modules + some threading library such as pthreads. In contrast, I can sense they feel a modern day java system requires so many add-on items that it looks daunting and overwhelming.

The most influential books on c++ were written in the early 90’s (or earlier?)… Bottom line — If you know core language + STL you qualify for c++ jobs today. By the way, you don’t need deep expertise in template meta-programming or multiple inheritance as these are rarely used in practice.

In contrast, Java has many core (and some low-level add-on) components kept stable, such as memory model and core multi-threading, basic collections, RMI, serialization, bytecode instrumentation, reflection, JNI … This should in theory give reassurance to developers and job seekers. In reality, on the java (job) market the stable core/infrastructure/fundamentals are overshadowed and drowned out by the (noisy) new add-on libraries such as spring, hibernate, JSF, gwt, ejb, rich faces…

I feel the java infrastructure technologies are more important to a java app (also to a java project or to a java team), but I continually meet hiring teams asking for x years of hands-on experience with this or that flavor-of-the-month add-on gadget. Are any of these packages in the core language layers? I don’t feel very sure.

(I feel some are — cglib, leak detectors… but these aren’t in job specs….)

I suspect many hiring managers don’t care about those extra keywords and just want to hire strong fundamentals, but they are forced to add those flavor-of-the-month keywords to attract talents. Both sides assume those hot new things are attractive to the other side, so they want to “flash” those new gadgets.

Whatever the motivation, result is a lot of new add-on gadgets we developers are basically forced to learn. “Keep running or perish.” — it’s tiring.

low latency: C++preferred over java

(Fastest is FPGA, but Let’s put that aside. )

Low latency developers generally prefer C and avoid OO (runtime overhead), but they do use C++ templates to achieve power and flexibility.

In terms of data structures, I think they use STL too. Array is essential. In contrast, graph data structures incur additional allocation due to the graph Node objects — no such overhead in arrays.

Java’s issues —
* autoboxing: market data consists mostly of primitive values, which get boxed into objects when stored in standard collections
* Every Object.java instance carries a header of something like 8+ bytes.
* Indeterminate garbage collection
* virtual function overhead
* Even objects referenced only by local variables are often put on the heap, with clean-up delayed until garbage collection
* JVM could reach good throughput wrt c++, but only after a slow warm-up.

c++^realtimeJVM for trading

Java might outperform c++ sometimes, but java performance is not consistent or controllable due to GC jitters — #1 motivation behind real time jvm.

IBM realtime jvm limits GC frequency so the GC may struggle to keep up with the demand, and heap gets bigger than in Sun jvm.

Sun realtime jvm requires some code change.

Some people feel that to benefit from real time jvm, you must code very carefully … so why not use c++? C++ is the incumbent in low-latency systems.

Market-facing gateways — 1) order execution (FIX or custom), and 2) market data feed — still use c++ primarily for another reason – gateway API is often in c++.