##%%predictionS@prob(PIP)=unreliable

  • [o] RTS — I estimated 30% risk but actually zero
  • [o] OC — I felt I was doing nothing for months (10% risk) but they offered me a transfer
  • [o] citi — I wasn’t productive (20% risk) but they extended my contract for 6M
  • [u] Macq — I felt I was decent (2% risk), but expectation was way too high
  • [u] Stirt — I felt I was moving up (2% risk), but headcount pressure came as a surprise
  • [u] barclays — I did a superb job (0% risk) but headcount pressure came as a surprise
  • [u] 95G — I did a superb job (0% risk) but headcount pressure came as a surprise
  • [o=overestimate of prob(PIP or layoff)]
  • [u=underestimate]

[19]cod`drill: XR discussion #low-end WC jobs

sustainable learning — Am not aiming for efficient learning. Instead, am aiming for sustainable learning. So I prefer problems without known solutions.

Life-long — Coding drill could become a lifelong hobby just like yoga. My tech-learning used to be 10% on coding drill but now 50% on coding drill. Ditto with yoga.

I can see my “west-coast coding IV” performance improving. It is still below the bar at Indeed, but I made big progress at the unfamiliar task dependency problem.

The harsh reality is, we are competing with fresh grads or even interns. Better to accept the reality than bury our heads in the sand.

There are a large number of easier jobs on the west coast, so perhaps apply remotely and attend the remote coding rounds for practice.

The west-coast CIV is a more /level playing field/ than QQ IV. More objective. Am not negative about it.

if not4$$, y I sacrifice so much2reenter U.S.#again

Q: As of 2016 to 2019, I didn’t need a high salary so badly, so what are the real reasons I sacrificed so much to re-enter the U.S.?

A#1: foundation — rebuild confidence about career/financial foundation for next 20->25-30Y, since my passive-income/asset/burn-rate profile was (still is) far from comfortable
* age discrimination
* green card
* lower calibre requirement on a typical job .. “semi-retirement job”

A#2: self-esteem rebuild — after multiple blows (三人成虎 — “three men make a tiger”, i.e. repetition makes an accusation believable; three-strikes) .. stigma
A#3: far bigger job market, providing much better sense of career safety

versioned dict #indeed

Suppose a resume consists of key/value pairs of strings. We need to support these operations —

  1. Each write(key, value) operation creates or updates a key with a new value, and returns an incremented integer vid i.e. version ID. A version ID represents a point in history, i.e. a snapshot of the entire resume
  2. Each read(vid) returns a key/value dictionary as a complete resume snapshot
  3. In a realistic system, we also need a deleteVersion(vid), but no need to optimize for it

Single threaded system. Version 0 is an empty resume.

Q: design an efficient solution.
Q: after creating N versions via N write() operations, what’s the time and space complexity of next write()? What’s the time and space complexity of the next read()?

——-

Without loss of generality, let’s say there are K fields in resume version N. Note K <= N. I should have differentiated K and N from the start — separating them is useful for clarity of thinking.

—Design 5 in hindsight, optimized for read(): the simple solution would snapshot the entire dict at each write(), i.e. each version. Space complexity is bad (though I now believe time complexity is more important). Worst case — K == N, as version 1 creates key1, version 5 creates key5, etc.

A single read() is presumably O(K) just to read the K fields. I was convinced this was the theoretical limit for read() complexity, because I was thinking of network serialization — either my system or the client system needs to iterate over all fields of the resume. Now in hindsight I feel the interviewer was thinking in java mode — returning the entire dict by reference is O(1) rather than O(K) … I didn’t get his hint when he said O(K) read was not really the theoretical limit.

Each write() clones the dict for a new vid and saves the new dict in a vector of pointers(!). Therefore write() is O(K).

—My design 1: based on the same idea, I used a vector for each key. However, my read() had to create the returned dict on the fly, in O(K). I didn’t think of return-by-reference/pointer.

—My design 2 is a minor optimization that removes the cloning of unchanged values. I thought I was on to something efficient, but in java (and python?) all strings are passed by reference anyway, so my optimization is meaningless in java.

—My design 3 is lazy write. So write() only appends to one of the K vectors. The other K-1 vectors are updated only when needed by a later read() or write(). Amortized cost?

This O(1) write() and O(?) read() complexity can be emulated and surpassed by …

—My design 4 used K RBTrees (small trees) to optimize for frequent write() at O(1). Each write appends to one tree. STL insert with a correct hint is amortized O(1). No such feature in java or python.

Read(vid) requires binary search in each [1] of K trees. After N updates, total node count across all K trees is N, so even a naive search would not exceed O(N).

Worst case read():  K == N i.e. every write() creates a new key. So read() is O(N)

Note RBTree supports efficient deletion.

[1] Q: Can we avoid scanning all K trees when vid == 1?
A: Yes maintain a separate vector (or RBTree) “birthday” holding records {vid, pointer to small trees}. Birthday expands (O(1)) by one element whenever a new small-tree is born i.e. new key is created. My read(vid=55) would binary-search in this birthday data structure using vid to locate the last born small-tree before version 55, and then iterate backward to visit each small tree born before version 55.
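Designs 1–4 can be sketched in a few lines of python. The class below is my own illustration (not the interviewer’s model answer): each key keeps an append-only (vid, value) history, and a “birthday” index lets read(vid) skip keys created after vid. write() is O(1) amortized; read(vid) is O(K log N) worst case.

```python
import bisect

class VersionedDict:
    def __init__(self):
        self.vid = 0                  # version 0 = empty resume
        self.vids = {}                # key -> ascending list of write vids
        self.vals = {}                # key -> values, parallel to vids
        self.birth_vids = []          # ascending vids at which keys were born
        self.birth_keys = []          # keys, parallel to birth_vids

    def write(self, key, value):      # O(1) amortized: two appends
        self.vid += 1
        if key not in self.vids:
            self.vids[key], self.vals[key] = [], []
            self.birth_vids.append(self.vid)   # record the key's "birthday"
            self.birth_keys.append(key)
        self.vids[key].append(self.vid)
        self.vals[key].append(value)
        return self.vid

    def read(self, vid):              # O(K log N) for K keys born <= vid
        snapshot = {}
        n = bisect.bisect_right(self.birth_vids, vid)  # skip later-born keys
        for key in self.birth_keys[:n]:
            i = bisect.bisect_right(self.vids[key], vid) - 1
            if i >= 0:
                snapshot[key] = self.vals[key][i]      # last value at/before vid
        return snapshot
```

Note read() builds a fresh dict in the python way; the O(1) return-by-reference trick the interviewer hinted at would require persisting each snapshot instead.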

—-

Now I feel a big string representing the entire resume is still under 100KB. I had assumed we were talking about millions of fields of gigabytes each.

For a long time I was confused and assumed that each write() could update multiple key/value pairs.

cod`drill:YOUR satisfactions@@ wipe-out defined #Rahul

Rahul used the word “satisfaction”. (Sometimes I call it “intrinsic motivation” or “joy”.) The satisfaction factor is crucial to absorbency. It enables Rahul to put in so many hours.

Q: What are my satisfactions i.e. intrinsic motivation?
A: discover my own solutions that are reasonably efficient even if not optimal. Almost never elegant and simple.

I don’t mind reading a classic solution in a real (possibly online) publication, but I hate reading solutions in forums as XR does. Those forum posts lead to a wipe-out

  • completely wipe out that precious satisfaction.
  • positive feedback loop broken
  • self-esteem destroyed
  • self-mastery destroyed
  • sense of progress wiped out
  • sense of self-improvement wiped out
  • absorbency wiped out
  • I feel /diminished/ and worth-less.
  • I feel hopeless, like giving up

It’s possible that the forum posters also learned the solutions from publications. But I basically assume they are just so brainy.

camp-out = %%strength

If I am able to get myself engaged for some time, I would be able to spend extra quiet hours including weekends working in office and make progress on localSys. I would be able to apply myself and match my brain power against the accumulated brainpower baked into the local code base.

Usually the engaging period is a honeymoon but hopefully it can become a more sustained focus. If it does happen (I won’t push myself too hard…) I would hit better GTD, productivity, and earn some respect from some colleagues (not necessarily the boss).

More importantly, I would feel proud of myself and free of regret or guilt. I may still feel inferior (damagedGoods) but I would be decisive about leaving. Consider Qz and Macq.

— camp-out is a concrete form of engagement

By “Camp-out” I mean all forms of extra time spent working in office.

I used to feel I’m kind of inadequate if I have to camp out and sacrifice family time just to keep my head above water.

Wrong! Most of my peers simply don’t have the absorbency, and productive solitude to do this. Many can’t even imagine it.

I said my son may have more or less math ability, but his effort (chiefly absorbency) is probably lower than his classmates’. My daughter’s abilities are unknown, but her absorbency is better than her brother’s. Similarly, my absorbency is higher than my peers’, as demonstrated by camp-out.

— Q: What if I receive a PIP.. will I be able to persuade myself to camp out?
A: Yes. I did in Qz and Macq.

I like the imagery — I felt like a boxer on my back-foot, not hopeful but not pessimistic either.

No honeymoon, but I was able to overcome the huge negative energy, convert it partially to motivation and get myself to camp out and focus on GTD.

MI loses type info about subclasses

[[Alexandrescu]] pointed out a fundamental weakness in MI for library design — base classes “do not have enough type information to carry out their tasks”, and “MI loses type information (about subclasses) which abounds in templates”

This statement can only be understood in the context of TMP. I like this statement and will repeat it in QQ interviews. If the interviewer is knowledgeable enough about TMP to quiz me further, I would say

“TMP can be combined with inheritance. I don’t remember the various TMP techniques that make use of the type information of subtypes”.

TMP is a compile-time technique so type information is more available.

max salary: simple game-plan#SCB

The strategy — “Whether I ask for a big base or a modest base, there’s a chance I may have problems with manager expectation. So let’s just go for the max salary and forget about learning/tsn.”

    • algo trading? tend to pay a premium, but I wonder how they assess my trec.
    • java/c++ combo role? will not pay lower
    • Some quant skill? tend to pay a premium
    • If an HFT shop makes a real offer at S$150k base I will decline — no real upside for me. Similarly, if a quant dev job pays $170k base I will decline — the promised accu (across jobs) is a big mirage. Accu can happen within a single job, but the same is true of technical accu within a single job.

Max-salary game plan must not ignore :

  • correlation between salary and expectation — as observed in some past jobs but not in every lucrative role. My Barclays and 95G roles were great.
  • the stigma, damagedGoods and high expectations in Stirt and Macq…. Ashish’s view — just earn the money for 6 months and leave if not happy.
  • commute
  • reputation risk at the major banks.

Am I still a survivor? I would say YES in OC and GS, and yes in Macq based on the internal transfer offer.

Mithun suggested — Are we traumatized/scarred and fixated on the stigma? I said the same to Deepak CM.

premium salary to compensate for intrinsic motivation@@

I recall that at more than one juncture in my job hunting career, I feel overwhelmed by a premium offer and said in my head "if I accept this offer and earn 20% more than the standard rate, then the ensuing pride, self-image boost etc would surely create a wellspring of positive motivation."

How naive… in hindsight. The real factors affecting my job satisfaction were usually unrelated to premium salary. See the spreadsheet about job satisfaction.

too many DB-writes: sharding insufficient #Indeed

Context: each company receives many many reviews. In a similar scenario, we can say a stock receives many investor comments

Interviewer: OK you said horizontal sharding by company id can address highly concurrent data store updates. Now what if one company, say, Amazon, by itself gets 20% of the updates so sharding can’t help this one company.

me: I guess the update requests would block and possibly time out. We can use a task queue.

This is similar to WordPress import/export requests.

  • Each task takes a few seconds, so we don’t want user to wait.
  • If high server load, the wait could be longer.
  • More important — this task need not be immediate. It can be scheduled.
  • We should probably optimize for server efficiency rather than latency or throughput

So similarly, all the review submissions on Amazon can be queued and scheduled.
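The task-queue idea can be sketched with python stdlib only (all names here are illustrative, not a real system): submissions return immediately while a single worker drains the queue, so the hot shard sees serialized writes instead of a thundering herd.

```python
import queue, threading

reviews = queue.Queue()
saved = []                           # stand-in for the real datastore

def worker():
    while True:
        review = reviews.get()
        if review is None:           # poison pill shuts the worker down
            break
        saved.append(review)         # the slow DB write happens here

t = threading.Thread(target=worker)
t.start()
for i in range(3):
    reviews.put("review-%d" % i)     # returns immediately; the user never waits
reviews.put(None)
t.join()
```

In a real deployment the queue would be durable (e.g. a message broker) rather than in-process, but the shape is the same.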

In FIX protocol, an order sender can receive 150=A meaning PendingNew. I call it “queued for exchange”.  The OMS Server sends the acknowledgement in two steps.

  1. Optional. Execution Report message on order placement in OMS Server’s Order Book. ExecType (150) = A (Pending New)
  2. Execution Report message on receiving acknowledgement from exchange. ExecType (150) = 0 (New). I guess this indicates placement in the exchange order book

Note this optimization is completely different from HFT latency engineering.

G5 unexpected big wins]%%life #absorbency

should create pointer blogposts in ‘ffree’ and ‘diet’ blogs

20 uphill breakthroughs #wellness/xx is a similar list.

My unexpected success at weight improvement and diet is rather rare … the latest of my top 5 achievements of my entire life. Other significant (often unexpected) successes:

  • outstanding project performance in 95G, Barcap and RTS. In other teams, I was able to do a reasonable job among bright young professionals.
  • QQ technical wins on high-end c++ positions (10+ times each), a golden ticket to high salaries on Wall St and then SG
  • [r] high absorbency for self-learning till this age.
    • MSFM
    • 😦 otherwise, the 4.5 years in SG were a low period.
  • [r] ffree first as bachelor and again in 2018.
  • [r=really rare]

How about my Cambodia investments? Not really a personal effort per se.

tough retrans questions from IV+ from me

Many mkt data interviewers ask me high-level biz-logic questions about retrans, without implementation details.

Q1 (IV x 2): do you track gaps for Line A and also track gaps in Line B?
%%A: no. I use both lines to fill the gaps.

Q2 (IV x 2): Taking parser + orderbook (i.e. rebus) as a single black box, when you notice a gap (say seq #55/56/57), do you continue to process seq # 58, 59 … or do you warehouse these #58, 59… messages and wait until you get the resent #55/56/57 messages?

Q2b (IV): In your orderbook engine (like Rebus), suppose you get a bunch of order delete/exec/modify messages, but the orderId is unrecognized and possibly pending retrans. Rebus doesn’t know about any pending retrans. What would rebus do about those messages?
%%A: I don’t know the actual design [3], but if I were the architect I would always check the orderId. If orderId is unknown then I warehouse the message. If it is a known order Id in Rebus, I will apply the message on the order book. Risks? I can’t think of any.

[3] It’s important to avoid stating false facts, so I will add the disclaimer.

Q2c (IV): what data structures would you use to warehouse those pending messages? ( I guess this question implies warehousing is needed.)
%%A: a linked list would do. Duplicate seqNum check is taken care of by parser.

Q3 (IV): do you immediately send a retrans request every time you see a gap like (1-54, then 58,59…)? Or do you wait a while?
A: I think we do need to wait since UDP can deliver #55 out of sequence.

The above questions were probably the most important questions in a non-tech interview. In other words, if an interview has no coding and no QQ, then most of the questions would be simpler than these retrans questions! These questions test your in-depth understanding of a standard mkt data feed parser design — the 3rd type of domain knowledge.

Q: after you detect a gap, what does your parser do?
A (Deepak): parser saves the gap and moves on. After a configured timeout, parser sends out the retrans request. Parser monitors messages on both Line A and B.
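Deepak’s answer can be sketched roughly as below (my own illustration, not the actual RTS parser). Packets from both Line A and Line B feed one tracker; a gap is noted but only requested after a timeout, since UDP may still deliver the “missing” packet out of order on either line.

```python
import time

class GapTracker:
    def __init__(self, timeout=0.5):
        self.next_seq = 1            # next expected sequence number
        self.pending = {}            # missing seq -> time the gap was noticed
        self.timeout = timeout

    def on_packet(self, seq):
        # feed packets from BOTH Line A and Line B through here
        if seq in self.pending:      # gap filled by the other line or retrans
            del self.pending[seq]
        elif seq >= self.next_seq:
            now = time.time()
            for missing in range(self.next_seq, seq):
                self.pending[missing] = now     # note the gap, but move on
            self.next_seq = seq + 1
        # else: duplicate of an already-processed seq; discard

    def gaps_to_request(self):
        # only request retrans for gaps older than the timeout
        cutoff = time.time() - self.timeout
        return [s for s, t in self.pending.items() if t <= cutoff]
```

A per-gap timer then becomes a scan of `pending`, which also answers Q3c above in spirit.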

Q: if you go on without halting the parser, then how would the rebus cope?

  • A: if we are missing the addOrder, then rebus could warehouse all subsequent messages about unknown order IDs. Ditto for a Level 1 trade msg.

Deepak felt this warehouse could build up quickly, since the ever-increasing permanent gaps could contain tens of thousands of missing sequence numbers. I feel orderId values are increasing and never reused within a day, so we can check whether an “unknown” orderId is very low and immediately discard it, assuming the addOrder is permanently lost in a permanent gap.

  • A: if we are missing an order cancel (or trade cancel), i.e. the last event in the life cycle, then we don’t need to do anything special. When the Out-of-sequence message shows up, we just apply it to our internal state and send it to downstream with the OOS flag.

If an order cancel is lost permanently, we could get a crossed order book. After a few refreshes (15-min interval), the system would discard stale orders sitting atop a crossed book.

In general, crossed book can be fixed via the snapshot feed. If not available in integrated feed, then available in the open-book feed.

  • A: If we are missing some intermediate msg like a partial fill, then we won’t notice it. I think we just proceed. The impact is smaller than in FIX.

OOS messages are often processed at the next refresh time.

Q3b: But how long do we wait before requesting retrans?
Q3c: how do you keep a timer for every gap identified?

Q: after you send a retrans request but gets no data back, how soon do you resend the same request again? Do you maintain a timer for every gap?

Q: You said the retrans processing in your parser shares the same thread as regular (main publisher) message processing. What if the publisher stream is very busy so the gaps are neglected? In other words, the thread is overloaded by the publisher stream.

short+efficient : holy grail ] algo search

For most of my tough algo problems (such as leetcode problems)

  1. many celebrated solutions are very short, but not so efficient
    • I think about 60% of them are easier to memorize due to length.
    • if you can understand and memorize them, they are quicker to write due to fewer keystrokes
    • About half of them are easier to understand, thanks to the source code size
    • eg: yield-based generators are a prime example
    • eg: recursive solutions are often brief but inefficient
  2. many great solutions are efficient in terms of O(), but verbose
    • eg: bottom-up DP
    • eg: top-down DP with memoization
    • eg: using a tree (when not required) can sometimes give an efficient solution

However, these two goals are often hard to harmonize. It is my holy grail to meet both criteria simultaneously, but I won’t try too hard.

  1. For real coding problems, I will prefer brevity
  2. For verbal discussions, I will concentrate on efficient solutions

pass generator output to next generator

I think this technique can be extremely time-saving in coding tests.

https://github.com/tiger40490/repo1/blob/py1/py/algo_combo_perm/1fromEachSet.py my code demos:

for myset in pool:
    output = list(gen(output, myset))

The gen() function uses yield. For the first call to gen(), we exhaust all of its items and save them into a list named “output”.

Then we pass this list into the second gen(), this time with a different myset
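A minimal self-contained demo of the technique (my gen() here is illustrative and may differ from the repo code): each pass materializes one generator’s output and feeds it to the next.

```python
def gen(partials, myset):
    for p in partials:               # exhaust the previous generation
        for item in myset:
            yield p + [item]         # extend each partial combo by one element

pool = [[1, 2], ['a', 'b']]
output = [[]]                        # seed: a single empty combo
for myset in pool:
    output = list(gen(output, myset))   # materialize, then feed the next gen()
# output now holds one-from-each-set combos: [[1,'a'], [1,'b'], [2,'a'], [2,'b']]
```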

template specialization based on NDTTP=true

We know it’s possible to specialize a template for a concrete type like int or std::string, but I didn’t know that it’s also possible to

… specialize a (class or function) template for a particular compile-time const value (like “true”) of a NDTTP (like “bool flag”)

  • On [[Alexandrescu]] Page xii, Scott Meyers showed an elegant example of specializing for “true”. Note “true” is a value, not a data type!
  • P 34 has a longer example.

Note on reading TMP code — the template specialization syntax is clumsy and can add noise to the signal. Better ignore the syntax rules for now to focus on the gist.

more python mock IV questions #Deepak

Q (real IV): how do you verify a given string represents an integer? For example, "3e5" would be invalid.
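One simple sketch of an answer (my own; a try/except around int(s) also works, since int("3e5") raises ValueError):

```python
def represents_int(s):
    # "3e5", "3.0", "" are rejected; "42", "-7", "+0" are accepted
    s = s.strip()
    if s and s[0] in "+-":
        s = s[1:]                    # allow one leading sign
    return s.isdigit()               # note: isdigit() also accepts non-ASCII digits
```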

Q (real IV): compare multi-threading vs multi-processing in python. Which one have you used in your projects?

Q: Have you used a FIFO data structure (like a queue) in python? It’s OK if you have not, but how would you create such a data structure in python?

Q: Compare and contrast deep copy vs shallow copy for a python list
Q: Compare and contrast deep copy vs shallow copy for a python dictionary

Q (advanced question from me): When is a deep copy required? Give some scenarios

Q : Have you used context managers? (I think they are useful.)

Q (advanced question from me): compare and contrast list comprehension vs generator expression

Q: write a Student class with lastName, firstName, birthDate fields. Also include a class attribute "instCnt" showing the number of student objects created. As you create a new Student instance, this instCnt should increment.
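A sketch answer to the Student question (field names as stated; the datetime usage is my own choice):

```python
import datetime

class Student:
    instCnt = 0                       # class attribute shared by all instances

    def __init__(self, lastName, firstName, birthDate):
        self.lastName = lastName
        self.firstName = firstName
        self.birthDate = birthDate
        Student.instCnt += 1          # bump the shared counter, not self.instCnt

s1 = Student("Tan", "Victor", datetime.date(1980, 1, 1))
s2 = Student("Lee", "Amy", datetime.date(1990, 2, 2))
# Student.instCnt is now 2
```

The common pitfall is writing `self.instCnt += 1`, which creates a per-instance attribute shadowing the class attribute.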

Q: have you heard of the LGB rule in python?

Q (basic question from me, never asked in interviews): what data types can’t be used as dictionary keys?

Q (basic question from me, never asked in interviews): what’s the usage of "global" keyword? Give some usage scenarios
Q (basic question from me, never asked in interviews): what builtin data types, apart from tuples, are immutable? Give some examples.

##fully engaged+!ROTI still beats boredom

Every learning direction has naysayers, and virtually all learning efforts generate low ROTI. ROTI and strategic investment are holy grails, mirages or white elephants.

We can’t dismiss $ROTI but I think we really need to look beyond $ROTI — too elusive. As Mithun put it, java multithreading alone can fetch you the highest salary.

Fully engaged for a few short months but without ROTI is actually not bad. It’s like satisfactory sex without simultaneous climax. I enjoyed the journey, without worrying about the destination.

I’m able to stay fully engaged for a few months, while my peers can’t!

I have blogposts on spare time usage. Tech learning (even the non-strategic topics) is among the most productive uses. There’s really nothing more worth learning.

  • — most of these are failed technology bets, but 10x better than not engaged (i.e. boredom)
  • 2012-2013 c# study
  • HFT-specific QQ topics
  • MSFM
  • 2011-2012 self-learning volatility
  • self-learning swing
  • MOM study
  • gemfire study
  • — less disappointing ROTI in terms of self-confidence, thick=>thin/zbs, mobility, broad-base
  • bond math, socket, FIX, data structure, algo, threading,
  • coding drill, including c++ language experiments
  • specific SDI and pure algo challenges

Realistically ideal next SG job #neglected factors

Essential factors: Reasonable coworker benchmark (within my grasp); Reasonable boss, salary, commute

Here are some of the neglected factors:

  • Ideally: enough spare time in office —- for blogging and tech exploration, AFTER I clear the initial hump.
  • Mainstream and low-churn tech —- like java, c++, py. In contrast GO, javascript, .. would create discontent and other negativity in me.
  • ideally: Mainstream dnlg —- to keep me engaged for a few months. Hopefully some math some low-level complexities
  • too much monotonous, mindless work, could get me disengaged intellectually. Any past examples? OC approval-seeking

java primitives have no address #unlike C

In C, any variable, including those on stack, can have its address printed.

In java, the primitive variables have no address. Every reference-type object has an address, by definition (“reference” means address).

C# is somewhat mixed and I’m not going into it.

Python rule is extreme, simple and consistent. Every object has an address. You can print id(x) or getrefcount(x)

>>> from sys import getrefcount as rc
>>> i=40490
>>> rc(i)

##a few Advanced algo IV categories #study

By “Advanced” I mean — the average programmer without focused study will likely struggle.

  • problems that require recursion-in-loop — rather common
    • but I have analyzed quite a few. Now at a plateau. No thick->thin
  • grid or matrix problems — more common than I thought. My weakness.
    • but I have invested quite a bit. Fast early ascent is over
  • graph problems — sometimes harder than average. Hard for everyone but I often fare better.
  • DP and greedy problems — occasionally easy, but usually non-intuitive
  • single-str or two-str problems — usually hard
  • num-array problems
  • generator problems — not so common
  • problems that require backtracking — rare

However, some of the hardest problems involve nothing but an int array or a single string.

[17]#1 impactful endeavor(Now)4family: IV^zbs^gym..

meaningful endeavor Now: algo^zbs^…

In the quiet hours, inevitably I would get my share of self-doubt about the value of my endeavors. See also what efforts go towards 20Y-career building

At other times, I would feel the impacts (of my effort today) on the people who depend on me — grandparents, kids, wife. There’s a lot I can do to make their lives safer, richer, easier, … For example, the interview-preparation effort looks short-term and less meaningful than zbs, but actually has more impact on family well-being such as education, health-care, housing and life-chances. Re zbs, now I realize zbs accumu is rather theoretical and actually limited. Interviews build my confidence and capacity to provide for them.

Looking at my peers … I feel their personal endeavors are not much better than mine:

  • move up to leadership positions. I think that’s a good direction if you CAN move up. I gave up long ago. So I see myself almost like a specialist consultant for hire
  • personal (property) investments
  • kids grades and top schools
Columns: accumu #not a benefit; #3 mental fitness, anti-aging; #2 career after 65 (RnD/teach); #1 family well-being: impact 1-5; #4 lasting social value?

  • IV: QQ/BP #incl. algo ideas — accumu: good if low churn; mental fitness: good; career after 65: minimal; family impact: 4; lasting social value: minimal
  • IV: algo practice — accumu: good; mental fitness: excellent; career after 65: none; family impact: 4; lasting social value: none
  • …cf: yoga, jog — accumu: good; mental fitness: excellent; career after 65: N.A.; family impact: 5; lasting social value: none
  • zbs #+GTD, instrumentation — accumu: good if low churn; mental fitness: good; career after 65: possible; family impact: 3 #helps PKI !! IV; lasting social value: minimal
  • data science, machine learning — accumu: churn; mental fitness: ask HuKun; career after 65: too volatile; family impact: 0; lasting social value: none
  • portable dnlg(+quant) — accumu: some; mental fitness: some esp. quant; career after 65: none; family impact: 1 {– 2; lasting social value: none
  • xx local sys — accumu: none; mental fitness: some; career after 65: none; family impact: 4 #if stay`long. Can help move-up but low correlation; lasting social value: none
  • investment analysis #unlike XR — accumu: some; mental fitness: minimal; career after 65: none; family impact: 1 #questionable; lasting social value: can help kids, can’t teach
  • … cf: family expense management — accumu: NA; mental fitness: minimal; career after 65: none; family impact: 2 {– 1; lasting social value: none
  • En/Ch learning — accumu: some; mental fitness: some; career after 65: possible; family impact: 0; lasting social value: high
  • coding practice — accumu: churn; mental fitness: good; career after 65: minimal; family impact: 0 #churn! Get Real no bandwidth! contribute to OSS??

“spread%%nlg” as a column? trivial.

Q:how much syntax2memorize4speed cod`#python

I would say 5 times more than needed on the job.

Treat it like timed coding competition.

Compare to exam preparation in China.

—- python coding drill for indeed:
  • small problems using lots of python data structures around chars, strings, arrays
  • fuxi (revise) python blog posts
  • list comp
  • define classes with fields
  • combine multiple lines into 1
  • triple-quote comments

simple tree applied to DP++

This is part of my thick->thin effort, to connect the dots and pattern-recognition. May or may not be highly effective, but no harm trying

By my own definition, a “simple tree” is non-recombinant and cycle-free, and every node has only one uplink. I find these trees very visual. They often help me visualize tough Dynamic Programming and other algorithms:

  • decision tree
  • (Rahul pointed out) DP algos are often recursive, and recursive algos often have a call-tree structure
  • recursive call within each loop iteration — this powerful algo can often be represented as a call-tree or decision-tree
  • backtracking — often uses a tree
  • generating all solutions (paths, formulas..) — often uses a tree
    • paths-from-root — each path often maps to a solution
    • eg: punctuation — to print out all sentences
    • eg: AQR factorization problem may be solvable using the comboSum algo.
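As a tiny illustration of recursion-in-loop and paths-from-root, here is a comboSum sketch (whether it fits the AQR problem is my own conjecture): each root-to-leaf path in the decision tree is one solution.

```python
def combo_sum(candidates, target):
    results, path = [], []

    def descend(start, remaining):
        if remaining == 0:
            results.append(path[:])          # a complete path = one solution
            return
        for i in range(start, len(candidates)):   # recursion inside a loop
            c = candidates[i]
            if c <= remaining:
                path.append(c)               # take a downlink in the tree
                descend(i, remaining - c)    # i (not i+1): reuse allowed
                path.pop()                   # backtrack to the parent node

    descend(0, target)
    return results
```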

union-find O(1) data structure

https://en.wikipedia.org/wiki/Disjoint-set_data_structure describes a specific data structure, but it is probably not strictly needed in the union-find algo challenges.

The IV algo challenges are usually small scale. They can use a primitive, trivial version of disjoint-set without gaining any meaningful performance advantage. I think this attitude towards disjoint-set is similar to the prevailing attitude towards hash tables — it is just a means to achieve big-O demands. With or without this data structure the problem is solvable; the real challenge is the big-O requirement.

Two main operations can both be optimized — 1) Union() can use by-rank or by-size and 2) Find() can use path compression. Both optimizations keep the tree height very low. High fan-out is desirable and achievable.

Using both path compression and union by rank (or size) ensures that the amortized time per operation is essentially O(1) — formally O(α(N)), where α is the inverse Ackermann function.
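Both optimizations fit in a short sketch (a generic textbook implementation, not tied to any particular challenge):

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own root
        self.size = [1] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # second pass: path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):             # union by size: small tree under big
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False               # already in the same set
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True
```

Both tricks keep tree height tiny, which is exactly the high fan-out mentioned above.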

deadly delays@ project^feature levels

insightful article: managing tight deadlines is the background

— deadly delay at feature level, without user impact

This is a real stressor, primarily because there exist team colleagues who can GTD faster.

For a serious delay, user impact is … exaggerated! This is true across all my projects — user impact doesn’t really matter. In fact, the feature may not be used much whether you deliver it in high or low quality, within or over budget.

— blatant delay at project level when you are architect / team lead

In theory, if you promise to deliver some new feature, i.e. a green-field project, then it can be tough to deliver on time. In reality, project time/budget overrun is very common. You only need a good explanation.

Users never care that much about a new system compared to current production systems. New systems often hit various teething problems and stay functionally unreliable for months.

OK, users don’t really care that much, but there’s a visible technology-budget impact which makes the technology MD look bad. The MD must look for an “explanation” and may cut headcount, as in StirtRisk and RTS-Trep.

Whose fault? Many fingers point at you the architect, but often it’s partial fault of the user-facing manager due to immaturity and overconfidence with timeline estimate.

A delay is the architect’s fault only if another architect could deliver faster, but I can’t imagine two architects working on two comparable projects at the same time. Therefore it’s never mostly the architect’s fault. In reality, management do apply pressure and you suffer, but it’s really for political reasons (sacrificial lamb or scapegoat).

eg: RTS Trep
eg: PWM Special-Investment-management
eg: Post-Saurabh GMDS — multiple delays and scope reductions

eg@traction[def2] in GTD^IV #Sunil

Sunil is not the only one who tried but failed to break into java. Sunil was motivated by the huge java job market. I believe he had opportunities to work on (probably small) java projects and gained confidence, but that’s really the easy part. That GTD experience was completely insufficient to crack java interviews. He needs IV traction.

  • Zhurong also tried java
  • [g] Venkat actually got a java job in SCB but didn’t like it. I feel he was lacking GTD traction
  • [g] XR and the Iranian java guy had some c# projects at work but didn’t gain traction.
  • CSY had a lot to tell me about breaking into java.
  • [gi] CSY and Deepak CM both had java projects at work but no traction
  • [i=IV traction]
  • [g=GTD traction]

IV/QQ traction — I experienced better IV traction in c# than c++. I think it’s because half my c++ interviews were HFT shops.

GTD traction — I had good GTD traction with javascript, php …, much better than c++. My C# GTD traction was also better than c++. C++ is probably the hardest language in terms of GTD. As explained to Kyle, my friend Ashish experienced tremendous GTD traction, but can he crack the interviews? Hiring teams can’t assess your GTD but they can ask tough QQ questions.

##[19] am competent at..as a professional thanks2tsn

Background — When I listen to a professional musician or comedian from some unfamiliar country, I wonder if they are actually good. Similarly, when I consult a doctor or dentist, I wonder if they are qualified.

“Self-respecting programmer” — Yi Hai’s motivation.

I have been tested and proven on the big stage i.e. U.S. tech interviews + GTD

  • [p t w] java/c++/c#
  • [w] algo
  • [w] coding test
  • [t] SQL
  • [t] socket
  • [t] unix power user
  • swing
  • [p] web app developer across php, javascript, java
  • py, perl (and shell/javascript?) — are these professional games?
  • [t = long tradition over 20Y]
  • [w = worldwide contest]
  • [p = a well-known standard profession]

##[15] I can professionally qualify as …

See also ##[19] am competent at..as a professional

See also blogpost on “N years’ experience — unimportant” and #1 career safety enhancer@past5years

On Wall St there are “combo” jobs to my advantage, but unheard of in SG.

I can qualify as :

##spare time: what can peers do to “get ahead”

Exec Summary — for all of us, strategic orgro, ROTI etc is the holy grail … elusive. My wife/in-laws often say “You spend so much family time on your studies, but are you earning more?” Translating personal endeavor into income is the holy grail .. frustrating..

I believe many professionals don’t have the abilities to convert spare time to tangible personal growth.

Tangibility is defined by each individual, and requires a high degree of self-knowledge.

  • investment analysis (HuKun, XR)? I doubt any of them has any ROTI
  • deeper in java for higher pay or promotion? Higher pay is basically hopeless for many of us who are already at the higher end. GTD depends more on localSys. Promotion has no correlation with deeper java knowledge.
  • coding drill
  • tsn like mobile, data science (XR), java (Sunil)
  • personal investment
  • formal education in spare time like CFA, MBA
  • Stephen Keith was able to write academic papers in his spare time .. very rare

In this realistic analysis, my c++/c#/quant/swing attempts now look commendable.

 

## candd are assessed mostly@tech skill !!dnlg

I feel mostly we as candidates are assessed on technical not domain knowledge.

Q: Among your past job interviews which one had the highest emphasis on dnlg? In the interview, which one of the 3 dnlg categories? Usually math I would assume.

I over-invested in dnlg, relying on it to stand out and differentiate. Well, it's not “reliable” as a competitive advantage, just as SQL/Unix skill isn't a very selective screening criterion. In contrast, I would say about half of my job interviews had highly selective tech screening.

  • eg: west-coast
  • eg: HFT shops
  • pure algo challenge — About 30% are tough, including a few west-coast challenges
  • whiteboard threading — easy for me, but 60% hard for most guys
  • online tests — MCQ (30%) or coding (80% hard)
  • interactive coding, remote/onsite, whiteboard/IDE — 70% are tough
  • core java/c++/c# knowledge — 50% are tough

python closure and global variables

One way to minimize global variables is converting a regular function into nested functions. A nested function can automatically READ local variables (like rootNode) of the enclosing function outer(). No complication. This follows the usual LEGB rule.

On the other hand, if nested_func() needs to rebind outer()'s rootNode variable to another Node object, then one workaround is to declare “global rootNode” in both nested_func() and outer(). (Python 3's nonlocal keyword achieves the same without promoting the variable to module scope.) This is tested in https://github.com/tiger40490/repo1/tree/py1/py/88lang. This “partial-global” variable is NOT used outside outer().

Another way to avoid global variables — call mutator methods on the variable, rather than reseating/rebinding it. Best examples are list and dict objects. Inside nested_func(), if I were to rebind myDict to an empty dict, it would have no effect on the myDict in outer(); hence “global” is needed. The alternative is to clear the content of myDict and repopulate it. The id(myDict) value remains unchanged. This is the standard java idiom.
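A minimal sketch of the two workarounds above. Names like rootNode and myDict follow the text; the string/number values are just placeholders.

```python
rootNode = None  # module-level, i.e. the "partial-global" variable

def outer():
    global rootNode
    rootNode = "oldNode"
    def nested_func():
        global rootNode       # needed to REBIND; reading alone needs no declaration
        rootNode = "newNode"
    nested_func()
    return rootNode           # nested_func's rebinding is visible here

def outer2():
    myDict = {"k": 1}
    def nested_func():
        myDict.clear()        # mutator calls need no global/nonlocal
        myDict["k"] = 2       # id(myDict) is unchanged throughout
    nested_func()
    return myDict
```

Calling outer() returns "newNode"; outer2() returns {"k": 2} without any global declaration, because only mutators were used.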

resource leak due to dtor bug

[[moreEffC++]] has a chapter dedicated to resource leaks in dtor. I have yet to read it, but here are my views:

A “Resource” means a heap object as a data member, in every case I know. In such a case, “Leak” means failure to call q[ delete ] on the data member.

To guarantee the q[ delete ], I feel one of the simplest yet most reliable strategies is a smart ptr as a data member. In particular

  • If a base-class subobject is already fully constructed when subclass ctor throws, the base-class dtor would run. Any resource in base-class is released if using smart ptr.
  • If a component subobject is already fully constructed when host ctor throws, the component dtor would run.
  • In the normal case of a fully constructed subclass or host object, then obviously its dtor would trigger the base-class dtor or component class dtor, in reverse order

However, replacing every ptr field with a smart ptr is costly and impractical.

G9 asset classes,by dev-job mkt depth

Beware many big domains don’t need lots of developers.

  1. Bonds including sovereign
  2. [V] Eq (including ETF) cash and swap
  3. [V] FX cash and fwd
  4. Eq/FX options
  5. [Q] IRS
  6. IR (including bond) futures
  7. [Q] CDS
  8. [Q] MBS
  • [d=will see higher demand for developers??]
  • [V=volume and velocity drive the demand for developers]
  • [Q=low volume, but the j4 automation is quantitative in terms of automated risk and pricing.] I believe these quantitative asset classes play to my “theoretical” strength and are not too niche, but these domains aren’t growing.

## tech xx backlogS ] blog++

  • 10 most recent posts — have the highest visibility but undeserved and should be back-dated aggressively !
  • oq11 blogposts
  • fuxi_* blogposts
  • bookmarks in local browsers
  • 0zoo blogposts — recent and often needs review and quick-refresh
  • “softer subjects” rather than mostly-tech topics
  • gmail 7day tag
  • my open blog

embed char-array ] java object: %%ideas #XR

With market data it’s common to use some Message class(es) that “embed” a fixed-length character array, 20-char for example.

Allocating an array object off-site on heap is very costly in memory footprint. One extra allocation per Message.

Also slower reading at run time due to data-cache inefficiency. Data cache favors contiguous data structures. See CPU(data)cache prefetching

c/c++ and c# (via struct) can easily support exactly this “embedding”. I feel java also has some support. Besides JNI, I wonder if there’s another, pure-java solution.

Q: in java, how can I have embedded fixed-length char-array field in my Message or Acct object, rather than a separate array object allocated somewhere off-site?

  1. Solution: If the fixed length is small like 10, I could maintain 10 individual char fields.
  2. Solution: assuming the chars are ascii (8-bit rather than 16-bit in java), I can group first eight chars into a 64-bit long int field. Provide a translation interface when reading/writing the field. With 10 such fields I can support 80-char embedded.
  3. Solution: If not possible, I would use a gigantic singleton off-site char array to hold fixed-length “segments”. Then I need a single int “position”. Every Acct object has a field this.position, where this.position * fixedLength = offset, to identify one segment.

Among them, not sure which solution is fastest in practice.
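Solution 2's packing idea is language-neutral. Here is a sketch in Python for brevity (helper names pack8/unpack8 are mine; in java the container would be a 64-bit long field rather than Python's arbitrary-precision int):

```python
def pack8(s):
    """Pack up to 8 ascii chars into one 64-bit-sized int (first char in low byte)."""
    assert len(s) <= 8
    out = 0
    for i, ch in enumerate(s):
        code = ord(ch)
        assert code < 128, "ascii only"
        out |= code << (8 * i)   # one byte per char, as in Solution 2
    return out

def unpack8(packed):
    """Inverse of pack8: recover the string, dropping zero padding bytes."""
    chars = []
    while packed:
        chars.append(chr(packed & 0xFF))
        packed >>= 8
    return "".join(chars)
```

With 10 such packed fields, an 80-char string is embedded with zero off-site allocation, at the cost of a small translation layer on every read/write.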

homemade ringBuffer@pre-allocated objects to preempt JGC

Goal — to eliminate JGC completely.

Design 1: I will want Order.java to use primitive fields only and avoid reference fields [1] at all cost, so the total footprint of an Order is known in advance. Say it’s 100 bytes. I will create 10M of dummy Order instances, possibly scattered in heap, not adjacent as in c++, and hold their 10M addresses in an Order array… about 1GB footprint for the Order objects + 80M footprint for the array of 8-byte pointers.

(Note I reuse these Order instances in this object pool and never let them get garbage-collected.)

Then I need a few subscripts to identify the “activeRegion” of the ring, but how about released slots enclosed therein?

[1] timestamps will be ints; symbolIDs and clientIDs are ints; short ascii strings will use 64-bit ints (8 characters/int); free-form strings must be allocated off-site:(

Design 2a: To avoid the “scatter” and to place the Order instances side by side, can we use a byte[100] array object to represent one Order? Can we use one gigantic byte array to hold all Orders, eliminating the 80M footprint?

Design 2b: https://blog.bramp.net/post/2015/08/26/unsafe-part-2-using-sun.misc.unsafe-to-create-a-contiguous-array-of-objects/ shows a contiguous array of java objects, like std::vector<MyObject>

Design 2c: https://www.ibm.com/support/knowledgecenter/en/SSYKE2_7.1.0/com.ibm.java.lnx.71.doc/user/packed_optimizing.html is a feature in IBM jvm

Ring buffer is good if the object lifetimes are roughly equal, giving us a FIFO phenomenon. This occurs naturally in market data or message-passing gateways. Otherwise, we may need a linked list (free list) of released slots in addition to a pair of subscripts to identify the active region.

It might be better to allocate a dedicated buffer for each thread, to avoid contention. Drawback? One buffer may get exhausted when another stays unused.
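A rough sketch of Design 1's pool-plus-subscripts structure, in Python for brevity (class and field names are mine; Python has its own GC, so this only illustrates the ring mechanics, not the JGC benefit):

```python
class Order:
    __slots__ = ("price", "qty")   # primitive-like fields only, fixed footprint
    def __init__(self):
        self.price = 0
        self.qty = 0

class OrderRing:
    """Fixed pool of pre-allocated Orders, reused forever (never released to GC).
    Assumes roughly-FIFO lifetimes, as in the market-data scenario above."""
    def __init__(self, capacity):
        self.pool = [Order() for _ in range(capacity)]  # the "array of addresses"
        self.head = 0    # subscript of the oldest live Order
        self.tail = 0    # subscript of the next slot to hand out
        self.live = 0
    def acquire(self):
        if self.live == len(self.pool):
            raise RuntimeError("pool exhausted")
        o = self.pool[self.tail]          # reuse a dummy instance; no allocation
        self.tail = (self.tail + 1) % len(self.pool)
        self.live += 1
        return o                          # caller overwrites the fields
    def release_oldest(self):
        assert self.live > 0
        self.head = (self.head + 1) % len(self.pool)
        self.live -= 1
```

Releasing out of FIFO order is exactly the gap this sketch shares with Design 1: a free list would be needed for that.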

##RDBMS strategies4faster read/write #widely useful#Sunil

(Further to our chat in mid Apr 2019…) Hi Sunil,

I generally agree that RDBMS such as Sybase, Oracle, DB2, MSSQL, … have good performance for Select and slower performance for Inserts, as you concluded in our last call. However, this conclusion is full of ifs and buts.

  • Read performance can be improved with partitioning — Imagine if the table is relatively large like 100 million rows. Even if the queries use an index, the index can have fairly large footprint (with implications). Often the Asia users only query the Asia records while Europe users only query Europe records.
  • Read performance can be improved with pre-computed data tables. If the same calculation is frequently requested via some frequent queries, then the calculation results can be saved in a pre-computed result table.
  • Read performance can be improved with stored procedures.
  • Read performance can benefit from de-normalization.
  • Read performance can improve if entire table is in-memory, as confirmed by my 2009 DB2 trainer in NY.

Now I think most RDBMS performance-tuning techniques target Select, since a slow Select is the most common and the most severe pain.

Insert is typically much slower than read, but user expectation is also less demanding. In my observation, most RDBMS databases are either mostly-select or mostly-insert. The mostly-insert database can benefit from batch insert (bcp in sybase/MSSQL), or a background writer thread (Gemfire).

However, sometimes real-time insert is needed. I think the most common strategy is sharding (horizontal partitioning) like splitting 100 million rows into two tables of 50 millions each.

A related strategy is normalization (vertical partitioning). Normalization removes duplicate data and helps inserts.
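The batch-insert point can be sketched with stdlib sqlite3 standing in for a real RDBMS (executemany plays the role of bcp-style batching; the table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (sym TEXT, qty INTEGER)")

rows = [("IBM", 100), ("MSFT", 200), ("GS", 300)]

# Row-by-row insert: one statement execution (and, on a networked RDBMS,
# potentially one round-trip) per row.
for sym, qty in rows:
    conn.execute("INSERT INTO trades VALUES (?, ?)", (sym, qty))

# Batch insert: one prepared statement executed over the whole batch,
# analogous to bcp in Sybase/MSSQL.
conn.executemany("INSERT INTO trades VALUES (?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM trades").fetchone()[0]  # 6 rows total
```

On a real server the batch path also amortizes logging and index maintenance, which is where most of the insert cost lives.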

function returning rvr or rvalue object

The rules are hard to internalize but many interviewers like to zero in on this topic. [[effModernC++]] explains

  • case: if return type is nonref and in the function the thingy-to-be-returned is a …. rvr parameter [1], then by default, it goes through copy-ctor on return.
    • You should apply explicit return move(thingy)
    • [1] extremely rare except in interviews
  • case: if return type is nonref and in the function the thingy-to-be-returned is a …. stack (non-static) object behind a local variable (NOT nameless object) .. a very common scenario, then RVO optimization usually happens.
    • Such a function call is always (P175 bottom) seen by the caller as evaluating to a naturally-occurring nameless rvalue object.
    • the book says move() should never be used in this case
  • case: if return type is rvr? avoid it if possible. I don’t see any use case.
  • case: if return type is lvr (not rvr) and returned thingy is a local object, then compilation fails as explained in the blogpost above

sharding=horizontal partition #too many rows

Assuming an 88-column, 6-million-row table, “horizontal” means pushing a knife horizontally across all 88 columns, splitting the 6 million rows into two tables of 3 million each.

Sharding can work on noSQL too.

GS PWM positions table had 9 horizontal partitions for 9 regions.

“Partitioning” is a more generic term and can mean 1) horizontal (sharding) or 2) vertical cutting like normalization.
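A minimal sketch of the two common routing styles (function and table names are mine): region-based routing as in the GS PWM example, and hash/modulo routing for even splits:

```python
def shard_for(region, regions=("asia", "europe", "americas")):
    """Region-based routing: each horizontal partition holds one region's rows."""
    assert region in regions
    return "positions_" + region    # hypothetical per-region table name

def shard_by_id(row_id, num_shards=2):
    """Modulo routing: splits rows roughly evenly, e.g. 6M rows into 2 x 3M."""
    return row_id % num_shards
```

Region routing keeps each user's queries on one shard (the read benefit above); modulo routing balances insert load across shards.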

WallSt server-side: java gain`mkt share #Greg#Sunil

I spoke to Greg on 22 Apr 2019.

If server-side java / c# / c++ jobs add up to 100%, what percentage is java? I said about 80%. He said 70-80%.

I told him “c# is mostly client-side”. He agreed and said “some buy-sides use c# on server-side”.

I also told Greg about my OC friend Sunil. OC uses server-side c# but according to Sunil, on the job market c# is mostly on GUI and java dominates server-side job market.

I told Greg that over the last 7 years, on server-side java has gained market share relative to c# and c++. Greg agreed.

I now feel the momentum is in java’s favor. Java is clearly easier to use than c++ and not that much slower. Besides c++ and c#, the old challengers, is there any new challenger to java?

fear@large codebase #web/script coders

One Conclusion — my c++ /mileage/ made me a more confident and slightly more competent programmer, having “been there; done that”

For half my career I avoided enterprise technologies like java/c++/c#/SQL/storedProc/MOM/Corba/sockets/turboC++…

Until GS, I was scared of the technical jargon, complexities, low-level APIs, debuggers/linkers/IDEs, compiler errors and opaque failures in java/SQL … (even more scared of C and Windows). I was scared of the larger, more verbose codebases in these languages (cf the small php/perl programs) … so scared that I had no appetite to study these languages.

Look around your office. Many people have at most a single (rarely two) project involving a large codebase. Large like 50k to 100k lines of code (excluding comments and empty lines). I feel the RTB/DBA or BA/PM roles within dev teams usually don’t require the individual to take on those large codebases. Since it’s no fun, time-consuming and possibly impenetrable, few of them would take it on. In other words, most people who try would give up sooner or later.

Searching in a large codebase is a first challenge. Even figuring out a variable’s actual type can be a challenge in a compiled language.

Compiling can be a challenge esp. with C/c++.

Tracing code flow is a common complexity across languages but worse in compiled languages.

In my experience, perl/php/py/javascript codebases are usually small, like pets. When they grow into big creatures they become daunting and formidable, just like compiled-language projects. Some personal experiences —
* Qz? Not a python codebase at all
* pwm comm? I would STILL say codebase would be bigger if using a compiled language

  • Analogy — if you have not run marathons you would be afraid of it.
  • Analogy — if you have not coached a child on big exams you would be afraid of it.

I feel web (or batch) app developers using standard tools lack the hardcore experience. They operate at a higher level, cleaner and simpler.

Note Java is much cleaner than c++

 

xp: pure algo questions often require QQ 打通脉络

  • eg: bbg 2017 pure algo question on tree serialization required a standard BFS. Just a few lines, but I struggled for a long time. If you memorize the key parts and practice enough, then the *knowledge* would show as an algo skill
  • eg: the facebook regex problem is very hard for anyone I know, unless they worked on a similar problem before and *know* the tricks.
  • eg: generate permutations, combinations. Can be tricky unless you *know* the key points.
  • eg: DP bottom-up vs top-down with memoization
  • eg: SCB-FM IV: stack-based queue
  • .. many more examples

打通知识脉络 (roughly, “connecting up the knowledge network”) is the Chinese phrase

[19] assignment^rebind in python^c++^java

For a non-primitive, java assignment is always rebinding. Java behavior is well-understood and simple, compared to python.

Compared to python, c++ assignment is actually well-documented .. comparable to a mutator method.

Afaik, python assignment is always rebinding, even for an integer. Integer objects are immutable and reference-counted.
In python, if you want two functions to share a single mutable integer variable, you can declare a global myInt.
It would be in the global idic/namespace. q[=] has special meaning like

idic['myInt'] = ..

Alternatively, you can wrap the int in a single-element list and call list mutator methods, without q[=].

See my experiment in github py/88lang and my blogpost on immutable arg-passing
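A quick experiment confirming the rebind-vs-mutate distinction via id():

```python
# Rebinding: even for an int, q[=] makes the name refer to a different object.
x = 42
id_before = id(x)
x = x + 1                          # rebinding, not mutation
rebound = (id(x) != id_before)     # True: x now refers to a new int object

# Mutation: calling a mutator method leaves the object's identity unchanged.
myList = [1, 2, 3]
id_list = id(myList)
myList.clear()                     # mutator call, no rebinding
myList.append(99)
same_object = (id(myList) == id_list)   # True: same list object throughout
```

This is why two functions can share a list or dict freely, while sharing a rebindable int needs global/nonlocal.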

became expert via external QQ benchmark` !!localSys xx

Suppose I invest heavily and become very productive on a local c++ system. Similarly for a java, SQL, socket or python system, but the point is — any local system uses only a small portion of the language features .. at most 5% of the typical high-end interview topics on that language.

Therefore the project experience won’t give me confidence to be an expert on the language, but acing multiple QQ interviews does, as these interviews compare dozens of competitive, motivated and qualified candidates across a wide field.

I grew very confident of java through this type of external benchmarking, in contrast to internal benchmarking within the local team.

required experience: QQ imt CIV

A few years into my technology career, I observed that Solaris/Oracle was harder to self-teach at home than linux/mysql/php/python/perl/javascript because even a high-school student can install and hack with the latter. Entry-barrier was non-existent.

Similarly, I now observe that Wall St QQ-IV demands longer experience than coding IV of west-coast style. Coding drill can use Leetcode over months. QQ requires not so much project experience but a lot of interview experience. It takes more than months of exploration and self-learning.

  • example — tcp/ip. My friend Shanyou preferred to read books systematically, but a book touches on hundreds of topics. He wouldn’t know which topics interviews like to dig into.
  • example — c++ TMP. A fresh grad can read about it but won’t know the favorite topics for interviewers
  • Compelling Example — java concurrency. A fresh grad can build up theoretical knowledge but won’t have my level of insight

Many inexperienced candidates were highly appreciated in west coast interviews. No such appreciation on Wall St, because Wall St VP or contract roles require work experience .. tried-n-tested.

  • Wall St interviews are selective in terms of experience
  • West Coast coding interviews are selective in terms of … speed and optimality

low-level^high-level expertise

Compared to laymen on the street, I have accumulated fairly deep, specific and demonstrable expertise in two lucrative fields i.e. software dev + finance

  • concurrency details (theory++) in real languages
  • dStruct smart choices and nesting
  • dStruct implementation details in java/c++/python
  • SQL joins and tuning
  • sockets? no deep insight but beats most peers
  • Most standard algo problems are familiar to me
  • drv pricing math, VaR statistics, bond math. My expertise goes beyond the degree

These trophies are won based on tsn, interviews and personal time investment … not automatically

— Let me focus on low-level vs high-level
Most of the dev expertise domains are low-level (and consequently very specific). I can even teach these subjects. The more low-level, the rarer the expertise. Laymen developers have only a vague idea of the low-level details, for many reasons.

High-level understanding is often more useful (than low-level, theoretical QQ knowledge) in projects, but in job interviews, low-level knowledge is differentiator.

Practically all (98%) high-end java/c++ interviews use low-level questions as differentiator. I excluded the start-ups as I’m unfamiliar with them… Eg: Quoine.

In my dev (not system architect) interviews, they rarely asked high-level stuff like spring rationale, design patterns…
I tend to believe these questions can’t /separate the wheat from the chaff/

The finance dnlg also serves as a differentiator among developers. The most specific sub-domains are quant math, the “architecture” and some of the jargon.

C++ is more low-level than java, which is more low-level than python…

Q: Paradoxically, C and assembly are even more low-level but not more valuable?
%%A: Some C knowledge is highly valued. For example, kernel knowledge is high-value and mostly at the level of C, compiler, assembly, and hardware
%%A: assembly expertise in device drivers or embedded is not really relevant to HFT, and too far from the money

MOM is not low-level and seldom asked.

c++IV=much harder than GTD #Mithun

c++ IV is much harder than c++ job GTD, as I told Mithun.

  • GTD is no different from java jobs, even though the build process can be slightly hairy. Java build can also get messy.
  • In contrast, C++ IV is a totally different game.

You need a rating of 1/10 to do a decent job, but need 7/10 to pass ibank interviews. This gap is wider in c++ than in java as java interview bar is much lower.

Most technical challenges on the job are localSys, so you can just look at existing code and 照猫画虎, 如法炮制 (imitate existing examples), as AndrewYap does. Venkat of RTS said we shouldn’t, but we still do.

Corollary — after a 3Y full-time c++ job, you may still fail to pass those interviews. Actually I programmed C for 2Y but couldn’t pass any C interview whatsoever.

python *args **kwargs: cheatsheet

“asterisk args” — I feel these features are optional in most cases. I think they can create additional maintenance work. So perhaps no need to use these features in my own code.

However, some codebases use these features so we had better understand the syntax rules.

— Inside the called function astFunc(),

Most common way to access the args is a for-loop.

It’s also common to forward these asterisk arguments:

def astFunc(*args, **kwargs):
    anotherFunc(*args, **kwargs)

I also tested reading the q[ *args ] via list(args) or args[:]

— how to use these features when calling a function:

  • astFunc(**myDict) # astFunc(**kwa)
  • simpleFunc(**myDict) # simpleFunc(arg1, arg2) can also accept **myDict

See my github
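A small end-to-end sketch combining both sides (simpleFunc/astFunc as above; anotherFunc is replaced by simpleFunc so the snippet is self-contained):

```python
def simpleFunc(arg1, arg2):
    return arg1 + arg2

def astFunc(*args, **kwargs):
    for a in args:                        # most common access pattern: iterate
        pass
    return simpleFunc(*args, **kwargs)    # forward both asterisk arguments verbatim

myDict = {"arg2": 10}
r1 = astFunc(5, **myDict)                   # becomes simpleFunc(5, arg2=10) -> 15
r2 = simpleFunc(**{"arg1": 1, "arg2": 2})   # a plain function also accepts **myDict -> 3
```

Note the dict keys must match the parameter names exactly, which is part of the maintenance cost mentioned above.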

java singleton^immutable classes #enum

[[effJava]] explains that an immutable class needs no copying.

However, we don’t need to work hard trying to make it a singleton.

If an immutable class happens to be used as one-instance data type, then lucky. But tomorrow it may get instantiated twice.. no worries 🙂

If boss asks you to make this class singleton, you should point out the legwork required and the numerous loopholes to fix before achieving that goal. Worthwhile?

Java enum types are special. The JVM guarantees each enum constant to be a singleton, even in the face of serialization; by convention enums are also immutable. See P 311.

%% absorbency: experiment/SDI imt speed-coding

When you find yourself in high-absorbency mood, favor (#1 most draining) speed-coding. See list of less-draining drills at the bottom.

I can read dry QQ topics for hours each day for many days, but tend to lose steam after coding for a few hours. Speed coding drill drains my “laser” energy faster than QQ reading/blogging/experiment. When my laser is weakened (by boredom), I must drill harder to go through the “brick walls”.

I guess many fellow programmers enjoy coding more than reading. I feel lucky that in my interviews, knowledge tests still outweigh coding tests.

Q: What part of coding drill is worst on my absorbency?

A: speed-coding implementation is worst. It drains my laser energy fastest. After coding for a few hours I always feel like a deflated balloon and a discharged battery, and need a full day to recharge.

I think frustration is the key. Self-expectation (about progress and traction) and self-image create the frustration and the drain.

Instead of traction, I often feel stuck and overspent.

I feel disappointed with myself (Deepak self-identifies as “annoyed”.)

Q: Can we stop comparing with others and just compare with our past? Doable sometimes. Consider Leetcode speed-coding contest #Rahul

— Just like yoga

  • if I work on easy problems I feel wasting my time
  • if I work on tough problems I feel pain and drain, and want to give up. After the practice I need hours to recover.

Q: … So can we find easier coding drills that I could enjoy (as Rahul suggested)? Definitely not easy. I think the first difficult step is self-acceptance that I can’t improve much at this age.

Q (excellent question): What type of “coding” drill can I do for hours like reading/blogging?

  • pseudo-code algo on paper/whiteboard is lighter. No ECT so I am swift and efficient. Less draining/frustrating.
  • SDI is most fun, least boring, not draining/frustrating. I can spend hours on a SDI. I feel a bit of accu. More like QQ less like coding drill.
  • concurrency coding questions are less draining as other guys are not faster
  • c++/java language-feature QQ experiments are more like QQ. I can spend hours on a QQ experiment. More interesting as there’s no timeline, no benchmark, no frustration. Also other guys are not stronger. I feel some accu, exactly like reading on these features
  • review of my previous code is much less draining (than writing new solutions) as there’s no time-line and code is already working
  • analyzing patterns and reusable techniques (very few) in past problems. Thick->thin is the holy grail. I work hard towards it.
  • reading syntax and ECT tips in books and my blog

 

#1(reusable)AuxDS for algo challenges

Here’s a Reusable data structure for many pure algo challenges:

Construct a data store to hold a bunch of “structs” in a linked list OR a growing vector (O(1) insertion). Then we can build multiple “indices” pointing to the nodes.

Here are a few average-O(1) indices:
(Note O(1) lookup is the best we can dream of)

  • hashtable keyed by a struct field like a string
  • array indexed by a struct field like small int id
  • If there’s a non-unique int field, then we can use the same array lookup to reach a “group”, and within it use one (or multiple) hashtable(s) keyed by another field
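The three indices above can be sketched in a few lines (field names and MAX_ID are illustrative):

```python
from collections import defaultdict

MAX_ID = 100
store = []                     # growing list of "struct" records: O(1) append
by_name = {}                   # hashtable keyed by a string field
by_id = [None] * MAX_ID        # array indexed by a small-int field
by_group = defaultdict(dict)   # non-unique int field -> group -> inner hashtable

def insert(name, small_id, group_id):
    node = {"name": name, "small_id": small_id, "group_id": group_id}
    store.append(node)              # the store owns the node
    by_name[name] = node            # all indices point at the SAME node
    by_id[small_id] = node
    by_group[group_id][name] = node
    return node

node_a = insert("acct1", 3, 7)
```

Every index holds a reference to the same node object, so an update through one index is visible through all of them.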

Will tough coding IV decline]10Y@@ #Rahul

It’s possible that as easy coding IVs spread or grow, tough coding IVs decline i.e. become less widespread. In such a case, my t-investment now will have lower ROTI than expected.

It’s possible that as easy coding IVs spread or grow, tough coding IVs also grow … for differentiation of talent.

Rahul responded to my question and pointed out the two styles of interviewers he observed:

* interested in can/can’t solve — can this candidate get the problem solved in time?
* interested in thought process

Rahul felt the first type will want to use ever tougher problems to pick outstanding candidates… selectivity

Rahul felt the second type will want to use easier problems to get more insight into the candidate’s thought process.

Rahul felt that, to differentiate, employers can compare completion times on easy questions

python nested function2reseat var] enclos`scope

My maxPalindromeSubstr code in https://github.com/tiger40490/repo1/tree/py1/py/algo_str demos the general technique, based on https://stackoverflow.com/questions/7935966/python-overwriting-variables-in-nested-functions

Note — inside your nested function you can’t simply assign to such a variable. This is like assigning to a local reference variable in java.

https://jonskeet.uk/java/passing.html explains the fundamental property of java reference parameter/argument-passing. Basically same as the python situation.

In c# you probably (99% sure) need to use ref-parameters. In c++, you need to pass in a double-pointer. Equivalently, you can pass in a reference to a pre-existing 64-bit ptr object.
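A minimal sketch of both routes (the py3 nonlocal route and the older single-element-list workaround; the “longest string” logic here is just a stand-in for the maxPalindromeSubstr code):

```python
def outer():
    best = ""                        # variable in the enclosing scope
    def consider(candidate):
        nonlocal best                # py3: allows reseating outer()'s variable
        if len(candidate) > len(best):
            best = candidate
    for s in ("ab", "abcd", "a"):
        consider(s)
    return best

def outer_py2_style():
    best = [""]                      # workaround: wrap the value in a list
    def consider(candidate):
        if len(candidate) > len(best[0]):
            best[0] = candidate      # mutating the list, not rebinding 'best'
    for s in ("ab", "abcd", "a"):
        consider(s)
    return best[0]
```

Both return "abcd". Without nonlocal (or the wrapper), the inner assignment would silently create a new local variable, as the java analogy above suggests.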

y some backoffice tech jobs=more stressful than FO

Paradox. Here are some answers from colleagues

  • Reason: a front office system can be very stable and mature
  • Reason: a front office system’s business logic can be fairly simple. An outsider would think business functions A/B/C/D should be part of this FO system, but in reality they are offloaded to other (FO or MO) systems.
  • Reason: a FO team manager can have good control on the requirements given by business and external teams.
  • Reason: even though front office users can be demanding and want immediate answers, good localSys knowledge can help you find those immediate answers. Back office users can also be demanding.

 

[16] Fwd: techies’ common complaints about jobs

We complain about high churn, but why the hell do none of us go teach math?

We complain about low $ROTI, but what percentage of techies get any $ROTI from personal investment or self-learning?

We complain about the low (if any) lasting social value in our work, but why the hell do none of us choose an RnD career?

Hi friends,

Most techies (including developers) probably feel undervalued, and have a lot of potential not utilized on the current job.

We blame our company or our team or our job. Maybe it’s not challenging enough; maybe too repetitive; maybe too niche.

We look up at some high flyer and ask “what if I’m given that role… I may not do better than that person, but surely I will be competent and up to the job.  It may be boring and stressful but Hey I will earn so much more!”

In many over-staffed IT departments, about 20% of the roles are critical and some 20% of the roles are dedicated to “peripheral” systems that no business users care about. Perhaps that system is lightly used, and users don’t trust the output anyway…

Well, my current (devops) job gives me a lot of opportunities to push myself higher. It’s not too niche (like Quartz/Athena/SecDB). It offers complexity and depth. Not mindless and repetitive. Not something I feel already too familiar with (and jaded). I can see the impact quickly. The impact is on many people. The impact is on front office.

Still I’m not fired-up. I guess there are always better roles out there.

We had better condition our mind not to think that way. Instead make the best use of the current role. “When life gives you lemons, make lemonade”

plowback for zbs(+GTD)@@

For many years I was deeply convinced and motivated by the “Shanyou” notion that I ought to “plowback” for zbs and, to a lesser extent, GTD, not just aim at IV as the Mithuns would.

After working for 20 years, I now believe ANY tech knowledge, accu, deepening/stack-up, sustained focus … has no leverage and is basically worthless if not relevant to IV

  • GTD skills? dominated by localSys. Tool knowledge can help but localSys is 10x more important.
    • localSys xx is almost always cursory rather than in-depth, since unnecessary digging (pwm-comm bottom-up xx plan) is invariably unsustainable — the local systems are patchworks, destined for rewrite, illustrating few best practices
  • BestPractice design? Almost never mattered in my career. After EMPWorld, I have never risen to decision-maker level.
  • BestPractice implementation? What’s “best” is mostly a matter of personal preference of the manager
  • zbs? Consider the most practical books like [[Pearls]] — the classics. If not relevant to IV then this zbs is only for self-satisfaction

This is one reason why my reinforcement loop completely broke down in Singapore.

… In contrast, my NY years were filled with joys of self improvement, driven by … interviews.

3Hr@knapsack problem..overspent@@

I spent 2-4 hours implementing/refining my own knapsack DP solution.

Many “peers” would consider it overspent. This kind of “conventional” view tends to destroy the precious satisfaction, the wellspring of positive energy.

Some peers complete it in a hurry and move on.

They may not compare it against the comboSum problem … x-ref is crucial to pattern recognition

They may not learn a lot by doing it quickly. I wouldn’t learn much of anything if I did that .. 囫囵吞枣 (swallowing it whole, without digesting)

sg19: risks@tsn

This time my TSN attitude is not driven by fear of stagnation, but a free-spirited, fun-seeking desire to explore.

My past TSN failures (and some successes) are the best guide. Given the multiple TSN failures, my expectation is rather low. However, the pain of stigma, confidence loss, and impact on family can still be very real. That’s why I wrote in my blog that the safest job is back office java+SQL+scripting with low-calibre colleagues.

fellow experts sizing up each other≈ QQ IV

  • Scenario — a passenger falls sick on a flight and two doctors (from different countries) come to the rescue. They size up each other to decide who should take the lead.
  • Scenario — two musicians from different genres play together impromptu for the first time. (Musicians are usually cooperative rather than competitive, though.)
  • Scenario — a comp science student visits another campus, logs in on the local network, starts exploring (not “hacking”) and encounters local students on the network.
  • Scenario — a provincial basketball player comes to a new town and plays a first one-on-one game with a top local player.
  • Scenario — an interesting algo challenge is openly discussed, among programmers from different countries and across industries. A problem to be solved in any language without external tools.

These scenarios are similar to QQ interviews. (Algo interviews are subtly different but mostly similar.) Venkat of OC impressed me with his c++ and c# QQ knowledge.

To really size up each other, we need to discuss common topics, not Your or My localsys.

.. but many people only talk about, in vague or high-level terms, how some localSystems were supposed to work, often based on hearsay. They leave out the important details. It’s impossible to verify any of those claims.

If I were a 5Y VP (or ED) I would have self-doubt — am I just a localSys expert or an accredited expert as proven on benchmarks/interviews? Look at Shubin/Steve, the Sprite expert in London, Richard of Quoine ..
Their expertise and value-add can be marginalized in the context of the current mainstream technologies.

BFT phrasebook

BFT is one of the top 10 heavily used constructs in coding tests.

recursion-free — BFT feels like the most intuitive traversal algo, since it needs no recursion.

two hits — each node is accessed twice — appended to the FIFO and later popped from the FIFO. We need to be careful about what to do at the first vs second visit. You can get confused if unaware of this subtlety.

graph — BFT is defined on graphs, not only trees

tree — BFT produces a tree … the BFT tree.

mailer — Example: send a mailer to all linkedin contacts:

# send to each direct contact
# send to each second-degree contact …
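The phrasebook above condenses into a short java sketch (class and method names are my own), showing the FIFO, the Seen set against cycles, and the two “hits” per node:

```java
import java.util.*;

class Bft {
    // Level-order traversal of an unweighted graph given as adjacency lists.
    // Two "hits" per node: first when enqueued (and marked Seen), second when dequeued.
    static List<Integer> traverse(Map<Integer, List<Integer>> adj, int root) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();   // guards against cycles; unneeded on a tree
        Deque<Integer> fifo = new ArrayDeque<>();
        fifo.add(root);
        seen.add(root);                        // hit #1: mark on enqueue, not on dequeue
        while (!fifo.isEmpty()) {
            int cur = fifo.remove();           // hit #2: process on dequeue
            order.add(cur);
            for (int next : adj.getOrDefault(cur, List.of()))
                if (seen.add(next))            // add() returns false if already seen
                    fifo.add(next);
        }
        return order;
    }
}
```

In the linkedin-mailer example, all direct contacts would be dequeued before any second-degree contact.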


[19]c++guys=becom`very unlucky cf java guys

On 22 Apr 2019 I told Greg that c++ developers like me, Deepak, CSY.. are just so unlucky — most of the WallSt c++ jobs are too demanding in terms of latency engineering, either on buy-side or sell-side.

Greg agreed that java interviews are much easier to pass. Greg said if you have reasonable java skills, then you can get a job in a week.

I told Greg that the only way Deepak or CSY could get an offer is through one of the few easy-entry c++ jobs, i.e. those without a high entry barrier, but such jobs are relatively rare.


word ladder #50%done

Leetcode Q 127: word ladder — Given two words (beginWord and endWord), and a dictionary’s word list, find the length of shortest transformation sequence from beginWord to endWord, such that:

* Only one letter can be changed at a time.
* Each transformed word must exist in the word list. Note that beginWord is not a transformed word.
* Return 0 if there is no such transformation sequence.
* All words have the same length.
* All words contain only lowercase alphabetic characters.
* You may assume no duplicates in the word list.

Example 1:
beginWord = "hit",
endWord = "cog",
wordList = ["hot","dot","dog","lot","log","cog"]
Output: 5
Explanation: one shortest transformation is "hit" -> "hot" -> "dot" -> "dog" -> "cog",
so return its length 5.

Example 2:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]
Output: 0

–analysis:
First scan, O(N²), to build the bidirectional edges of the graph. Given an existing graph of N words (N can be 1), each new word is compared against every existing word to construct the edge list — N(N+1)/2 comparisons in total. No choice. No need to be extra clever, as this simple O(N²) algo is optimal IMO. No need to worry about visualizing the graph either, because the edge list is a proven representation.

At the end of this phase, if beginWord and endWord belong to disjoint sets then return 0. However I see no simple implementation of it.

2nd scan O(N+E) BFS. But there are many cycles, so we need a hashset “Seen”, or array of size N. Array is more elegant than hash table in this case.

To compare two words, char-wise subtraction should give all zeros except at one position. This last routine can be extracted as a “simple routine to be implemented later”, so no need to worry about it in a white-board session.
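A BFS sketch in java along the lines of the analysis (names are mine). For brevity it discovers one-apart neighbors on the fly rather than pre-building the O(N²) edge list; the disjoint-set case then needs no special handling — BFS simply exhausts and returns 0:

```java
import java.util.*;

class WordLadder {
    // do two equal-length words differ in exactly one position?
    static boolean oneApart(String a, String b) {
        int diff = 0;
        for (int i = 0; i < a.length(); i++)
            if (a.charAt(i) != b.charAt(i) && ++diff > 1) return false;
        return diff == 1;
    }

    // length of the shortest transformation sequence, or 0 if unreachable
    static int ladderLength(String begin, String end, List<String> dict) {
        Set<String> unseen = new HashSet<>(dict);   // doubles as the "Seen" guard
        Deque<String> fifo = new ArrayDeque<>();
        fifo.add(begin);
        int steps = 1;                              // the count includes both endpoints
        while (!fifo.isEmpty()) {
            for (int k = fifo.size(); k > 0; k--) { // process one BFS level at a time
                String cur = fifo.remove();
                if (cur.equals(end)) return steps;
                Iterator<String> it = unseen.iterator();
                while (it.hasNext()) {
                    String w = it.next();
                    if (oneApart(cur, w)) { fifo.add(w); it.remove(); }
                }
            }
            steps++;
        }
        return 0;
    }
}
```

Removing a word from `unseen` the moment it is enqueued is exactly the “first hit” marking from the BFT phrasebook.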


[17]predict 2-5Y job satisfaction #OC surprise

Q: did the learning actually help with my job interviews (am not salary-focused like Mithun)? This is a key feedback.
A: Yes to some extent

The “table” is subject to frequent change, so I keep it in recoll (it was a google sheet). Here are some notes:

  • stigma/appreciation/respect(zbs) — turns out to be the #1 key to job satisfaction, but appreciation is tricky. A bonus can break it, a performance review can break it, other things can break it too. I often felt like “damaged goods”.
    • In Mac and Stirt (and OC too), managers considered me good enough for transfer to other teams.
    • retrospective can be more negative than dissatisfaction then-n-there, for Macq, Stirt
  • salary + other income — turns out to be insignificant when I have inner confidence and self-esteem. It is still one of the most important factors. When it’s very high, it overrides almost everything else.
  • distractions — do hurt my O2 for GTD, zbs development and self-learning
  • traction — positive feedback, includes zbs growth, instrumentation, self confidence, IV, catching up or surpassing peers
  • strategic orgro/stagnation — turns out to be a white elephant.
  • Many of the top factors are “perceptions” rather than “hardships”
    • perceptions — self-esteem@comp; strategic tsn; engaging… #best eg@perception — peer comparison
    • hardships — mkt depth (job pool size); workload; family time; commute; salary

! OC job was actually not bad if there were some appreciation from boss. However, the new, strategic specialization didn’t bring enough satisfaction.

! Verizon job experience was rather positive. I was on the rise, in my prime. It all ended when I moved to GS. I should have quit GS earlier. Citi was the start of another prime period. Prime mostly in terms of self-confidence, self-esteem …

My prediction — to have a satisfying (not necessarily strategic) job next time,

  • I need the #1 factor — appreciation.
  • A well-paid java job will most likely make me feel good.
  • LearningSomethingNew and engagement will NOT be a deciding factor (Recall c#/py experiences). I will still make time to learn something, just like in 95G


west-coast IVs need no java/c++deep knowledge..our t-investment lost

I attended a few technical interviews at “west coast” type of companies over the years — google, amazon, VMWare … and recently Indeed, Facebook and some well-known Chinese tech shops.

These interviewers never asked about java/c++ language details or data structures (as implemented in the standard libraries), or Linux+compiler system knowledge. (I know many of these shops do use java or c++ as the firm-wide primary language.) They do require data structure knowledge in any language you choose.

My conclusion from these experiences — if we compete for these jobs, we can’t rely on the prior interview experience gained from all the financial-domain tech interviews. Wall St and West Coast are too different, so much so that Wall St essential tech knowledge is not useful for west coast interviews. We have to leave that wealth of knowledge behind when we start on a new journey (to the West) of learning, trying (trying our luck at various interviews), failing and retrying.

Michael Jordan once quit NBA and tried his hand at professional baseball. I see ourselves as experienced basketball players trying baseball. Our interview skills, interview practice, in-depth knowledge of crucial interview topics have no value when we compete in west-coast interviews.

West coast shops mostly rely on algo interviews. You can use any (real) language. The most common are java/c++/python. You just need a tiny subset of the language knowledge to compete effectively in these coding tests. In contrast, financial firms quiz us on much wider and deeper knowledge of java/c++/c#/Linux etc.

What if a west-coast candidate were to try the financial tech jobs at ibanks or hedge funds etc? I think they are likely to fail those knowledge tests. I think it would take more than a year for them to acquire the obscure knowledge required at high-end financial tech jobs. In contrast, it takes months to practice a few hundred leetcode problems. You can decide for yourself which side is more /impenetrable/ — Wall St or West Coast.

Ironically, neither the west-coast algo skill nor the financial tech obscure knowledge is needed in any real project. All of these high-end employers on both coasts invent artificial screening criteria to identify “cream of the crop”.

What’s the correlation of on-the-job performance to a candidate’s algo skill and obscure knowledge? I would say zero correlation once we remove the intelligence/diligence factors. In other words, algo skill or obscure knowledge are poor predictors of job performance, but intelligence/diligence are good predictors.

In the bigger picture, these tech job markets are as f**ked up as decades ago, not improving, not worsening. As long as there are big salaries in these jobs, deep-pocketed employers will continue to use their artificial screening criteria. We have to play by their rules, or get out of the way.

%% pure-algo/dStructure skill@recent CIV

Hi Friends,

Over recent (about 12) months I have attempted several coding interviews

  • passed a Standard Chartered pure algo test over phone, short questions but non-trivial
  • probably passed a big weekend coding assignment given by an e-commerce start-up. I later found out my solution is similar to a published topological sorting algorithm so my algo is probably close to optimal.
  • passed another weekend big coding assignment given by Nasdaq
  • passed two separate paper+pencil coding tests at two reputable sell-side firms (i.e. banks or brokers)
  • passed a speed coding test at a reputable internet company headquartered in Texas.
  • probably passed a few bloomberg coding tests, both remote and onsite, on computers or whiteboard.
  • (All recent failed coding tests happened at High-frequency trading shops .. very picky.)

All of these positive experiences convinced me that my algo and esp. data structure skills are improving. So I gave myself a rating of “A-minus” among Wall Street candidates. I felt especially confident with white-board coding; with a real compiler, my proficiency is lower than other candidates’.

Then I failed, on technical ground, at two big internet companies (FB and … let’s say company XX). Ironically, these were white-board pure-algo problems involving non-trivial data structures — my traditional strengths. So what’s going on? Unexpected failures deserve analyses.

— First, I think I was still too slow by their standard. One interviewer (at XX) told me they wouldn’t reject candidates because of slow completion, but I believe that’s not what she actually meant to say (she is not a native speaker). Suppose a candidate takes a long time to come up with a sub-optimal solution. I actually believe that, given more time, he would find an optimal solution — I am that type of candidate. But interviewers would have to assume he is weaker than those who complete the optimal solution quickly.

Interviewers often say they look for thought process, but that’s probably something they look for in addition to finding decent solutions in time. I think I tend to show a clear (not always correct) thought process, but that’s not enough.

I think those interviewers really look for “near-optimal solutions, but not because the candidate memorized them beforehand”, so candidates need to 1) find a good solution 2) be able to explain it when challenged.

— Second, I think the competitors are stronger (faster) than on Wall St. I call them “west-coast competitors” though they could be working in any country, all competing for west-coast tech jobs. I guess they practice hundreds of problems on Leetcode. Such heavy practice doesn’t guarantee success but increases their winning chance.

— Third, I think my big-O estimate needs improvement. I made multiple mistakes at XX coding rounds, and elsewhere.

Big-O has never been a show-stopper in my Wall St interviews, so I didn’t pay close attention to the details.

— Fourth, I think I have not tried often enough. If I keep trying similar internet companies I might get lucky and score a technical win.

—-

Given the stiff competition and high standard at west coast coding tests, I now feel it’s extremely important to set intermediate goals

  1. goal — pass some quick algo quizzes given by west coast interviewers over phone. I used to get those quizzes from Google phone round
  2. goal — pass some initial (simpler) coding round via webex/coderpad
  3. goal — impress at least one interviewer at one coding round. I can often sense that an interviewer is delighted to hear my idea
  4. goal — impress multiple interviewers, even if your solution is not optimal. I achieved this at Bloomberg
  5. goal — pass all coding rounds at a reputable west-coast tech shop — which is my original goal, never achieved
  6. goal — 100% technical win i.e. pass all technical rounds including system-design-interviews. Note you can still fail the soft skill test.
  7. goal — get an offer

With these intermediate goals, the picture is no longer black and white, but grayscale. We might have hit some of the intermediate goals already 🙂

##dry$valuable topics 4 high(absorbency)period #flight

This ranking was originally compiled for “in-flight edutainment”. Warning — If high churn, low accu, low traction, or short shelf life then the precious absorbency effort is wasted

See also the pastTechBet.xlsx listing 20+ tech topics and comparing their mkt-depth, demand-trend, churn-resistance … My conclusion from that analysis is — any non-trivial effort is either on a niche or a non-growing tech skill, with notable exceptions of coreJava+threading. All mainstream and churn-resistant skills need only trivial effort.

  1. coding drill — esp. the hot picks. Look at tag “top50”. Practice solving them again quickly.
    • 😦 low accu?
  2. java concurrency book by Lea. More valuable than c++ concurrency because java threading is an industry-standard reference implementation and widely used.
  3. java tuning, effJava,
  4. c++ TMP?
    • 😦 seldom asked but might be asked more in high-end interviews. TMP is heavily used in real world libraries.
    • 😦 low traction as seldom used
  5. effModernC++
  6. linux kernel as halo?
    • 🙂 often needed in high-end interviews
    • 😦 low traction since I’m a few layers removed from the kernel internals
    • 😦 no orgro
  7. c++11 concurrency features?
    • 😦 low traction since seldom asked in-depth

friend-circle #Union-Find#40%

Leetcode Q547 (union-find) There are N students in a class. Some of them are friends, while some are not. If A is a direct friend of B, and B is a direct friend of C, then A is an indirect friend of C. And we defined a friend circle is a group of students who are direct or indirect friends.

Given a N*N matrix M representing the friend relationship between students in the class. If M[i][j] = 1, then the ith and jth students are direct friends with each other, otherwise not. And you have to output the total number of friend circles among all the students.

— analysis:
Rated “medium” on leetcode but my Design #1 is easier than many “easy” questions. Clearly this is a data-structure question … my traditional stronghold.

Challenge is merging.

— design 1:
lookup map{studentId -> circleId}
Circle class{ circleId, presized vector of studentId}

When we merge two circles, the smaller circle’s students would /each/ update their circleId. This merge process has modest performance but is simple.

In reality, students outnumber circles, so here’s an alternative ..

— design 2:
map remains the same (not optional!).
Circle class is now {circleId, parentCircleId (default -1)}

The swallowed circle will have this->parentCircleId set to a top-level circleId… Path-compression as described in disjoint set.
The merge would only update this one field in one or more Circles. O(H) i.e. height of tree. H is usually very small because at any time, each circle’s parentCircleId is either -1 or a top-level circle — I hope to maintain this invariant.

Scenario:

  1. circles AA, BB, CC created
  2. circle a2 acquired by AA
  3. circle a3 acquired by a2 ultimately “branded” by AA
  4. circle b2 and b3 acquired by BB
  5. a2 swallows b2 –> need to update BB as acquired. When we try to update b2.parentCircleId, we realize it’s already set, so we follow the uplink to trace to the top-level node BB, and update ALL nodes on the path, including b2 as b2 is on the “path” to BB, but do we have to update b3 which is off the path? Suppose I don’t. I think it’s safe.
  6. circle c2 acquired by CC
  7. c2 now swallowed by b3. Now c2 will get branded by AA, and so should the nodes on the path ( b3 -> BB -> AA) This chain-update would speed up future mergers. Should C2’s old parent (CC) also get branded by AA? I think so.

After the data structures are fully updated, we simply return the count of top-level circles. (Each time a top-level circle gets created or disappears, we update that count.)

Additional field in Circle: The vector of studentId is needed only if we need to output the individual students in a given circle.
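For comparison, a compact union-find sketch in java — the plain textbook disjoint-set with path compression, rather than design 1 or 2 above; names are mine:

```java
class FriendCircles {
    // classic disjoint-set over student ids, counting surviving components
    static int circleCount(int[][] m) {
        int n = m.length;
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;  // every student starts alone
        int circles = n;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)         // matrix is symmetric
                if (m[i][j] == 1) {
                    int ri = find(parent, i), rj = find(parent, j);
                    if (ri != rj) { parent[ri] = rj; circles--; }  // merge two circles
                }
        return circles;
    }

    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];          // path halving, a simple compression
            x = parent[x];
        }
        return x;
    }
}
```

The decrement-on-merge is the same idea as maintaining the count of top-level circles described above.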


competitive strengthS offer different$values #speedCod`,math

  • competitive strength in speed coding contests — such contests are now more prevalent and the skills more valued
  • competitive strength in dStruct/algo beyond the classics
  • competitive strength in core cpp QQ
  • competitive strength in core java QQ — bigger job market than cpp
  • competitive strength in socket QQ
  • competitive strength in SQL QQ (and perl GTD) — better than swing
  • competitive strength in math before college level — huge and long-term impact
  • competence in localSys — no long-term impacts, so Ashish’s t-investment is unwise
  • improvement in yoga and fitness

In each “competitive” case, you build up competitive strength over H years but may lose it in K (could be long) years. Each strength has very different long-term impacts and long-term value, not “zero” (transient) as we sometimes perceived them to be.

Any valuable resource (including a lucrative job) is scarce and invites competition. A competitive strength (in any one of these domains) has long-term impact on my health, mental aging, stress level, my job choices, my commute, and amount of annual leave.

For a concrete comparison, let’s compare speed coding vs math. In school, math is far more valuable. It enhances your physics, chemistry, economics… There are many math competitions at various levels. After we turn 30, math, in the form of logic and number-theory quizzes, still plays a small part in some job interviews. However, speed coding strength (which I am building) now has an appreciating value on the high-end competitive interview scene. Over the next 10Y, speed coding will have far more impact on those aspects listed earlier.

However, if you want to invest in building such a strength, beware of huge disappointments. You can say every woman’s natural beauty has imperfections when you see that woman everyday. This is because our idea of perfect beauty is based on paintings and photos, not live humans. Similarly, every endeavor’s ROTI has imperfections compared to our naive, idealized concepts.

If you look for imperfections you will always find some, but such fixation on imperfections is cynical, demoralizing and unproductive research.

We need to zoom into our own strategic strengths + long term interests such as low-level, theoretical stuff, pure algo, dstruct, and avoid our weaknesses.

  • low level or theoretical QQ — my strength
  • low level investigation using tools + codebase — my weakness
  • picking up new GTD challenges — my relative weakness but I did well before joining Wall St.
  • picking up new IV topic — my relative strength

balloon burst #DP optimization #50%

Q [ Leetcode 312]: not really classic: Given n (up to 500) balloons, indexed from 0 to n-1. Each balloon is painted with a number, represented by the array “nums”. You are asked to burst all the balloons one by one. If you burst balloon i you will get nums[left] * nums[i] * nums[right] coins, where left and right are the adjacent indices of i. After the burst, left and right then become adjacent. Find the maximum coins you can collect by bursting the balloons wisely.

If you burst a leftmost balloon, you collect 1*it*rightNeighbor coins. In other words, when multiplying 3 numbers, any absentee is a one.

0 ≤ nums[i] ≤ 100

Example: Input: [3,1,5,8]
Output: 167
Explanation: nums = [3,1,5,8] –> [3,5,8] –> [3,8] –> [8] –> []
coins = 3*1*5 + 3*5*8 + 1*3*8 + 1*8*1 = 167
==analysis:
int-array optimization problem.
Might be related to some classic problem.

Let’s define a generic math-function of 3 balloon IDs score(myle, me, myri). In this problem, score() is simply “return myle*me*myri “, but in the next problem, score() could be any math function of the three inputs.

I see each possible snapshot (having K balloons, i.e. at level K) as a graph node. Exactly 2^N nodes in the graph, i.e. 2^N possible snapshots i.e. 2^N combinations of these N balloons.

Every edge has a score. To compute the score, we only need the two nodes (snapshots) of the edge to identify the 3 balloons for score().

Pyramid — Let’s assume at the bottom is “origin” i.e. the snapshot of the original array .. Level N (up to 500); on top is “phi” i.e. the snapshot of the empty array .. Level 0.

The problem transforms into a max path sum problem between these 2 nodes.

–solution-1 DP
From origin to any given node, there are many distinct paths each with a total score up to that node. If a node has 55 paths to it, the max sum among the 55 paths would be the uprank (upward rank) of the node.

If the node also has 44 paths from phi, the max sum among the 44 paths would be the downrank (downward rank) of the node. This is an interesting observation, but not needed in this solution since every edge is evaluated exactly once.

To our delight, uprank of a node AA at Level-5 depends only on the six Level-6 parent node upranks, so we don’t need to remember all the distinct paths to AA:). Our space complexity is the size of previous level + current level.

We just need to compute the uprank of every node at Level 6, then use those numbers to work out Level 5…. the Level 4 … all the way to phi.

If there are x nodes at Level 6 and y nodes at Level 5, then there are 6x == (N-5)y edges linking the two levels — each Level-6 node has 6 children, and each Level-5 node has N-5 parents.

Time complexity is O(V+E) i.e. visit every edge.

Level n: 1 node
Level n-1: n nodes
Level n-2: nc2 nodes

Level 2: nc2 nodes
Level 1: n nodes
Level 0: 1 node

Each node at level K has K child nodes above. This graph now suggests the max-path-sum algo (with edge scores), but it might be the only way to solve the problem, like the bbg odometer.

consider a DP algo to update the score at each node at level K, ie the max sum from root till here, via one of the K-1 nodes at level K-1

But Level N/2 has too many (N-choose-N/2) nodes. Can we prune the tree, from either origin or phi?
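For comparison, the widely known interval-DP formulation (different from my level-graph view above) avoids the 2^N snapshots altogether: between two fixed “walls”, choose which balloon bursts last, so its neighbors at burst time are exactly the walls. A java sketch with illustrative names:

```java
class BalloonBurst {
    // best[l][r] = max coins from bursting everything strictly between
    // walls l and r in the padded array [1, nums..., 1]
    static int maxCoins(int[] nums) {
        int n = nums.length;
        int[] a = new int[n + 2];
        a[0] = a[n + 1] = 1;                        // virtual 1-valued walls
        for (int i = 0; i < n; i++) a[i + 1] = nums[i];
        int[][] best = new int[n + 2][n + 2];
        for (int len = 2; len <= n + 1; len++)      // gap between the two walls
            for (int l = 0; l + len <= n + 1; l++) {
                int r = l + len;
                for (int k = l + 1; k < r; k++)     // k bursts LAST inside (l,r)
                    best[l][r] = Math.max(best[l][r],
                        best[l][k] + a[l] * a[k] * a[r] + best[k][r]);
            }
        return best[0][n + 1];
    }
}
```

This runs in O(n³) time and O(n²) space, so no pruning is needed for n up to 500.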

if external lib return object by pointer #Rahul

My colleague Rahul used some external library util and found out it returns a pointer. We briefly discussed the implications. Here are some afterthoughts.

  • Pattern — if I pass in an original object by reference, the lib util may return a wrapper object by pointer.  In such a context, is this wrapper object on heap? Always?

I would say usually on heap. But since the lib implementation is invisible, it may do other things.

In an alternative design, the library could update an object pre-existing in a ring buffer (or 3D matrix) and make it a wrapper. There’s no heap allocation in this round trip. The ring buffer itself could be previously allocated on heap or data segment, not as a local object in main() function. Reason — main() can end before other threads 😦

  • Pattern — if I pass in some parameters, the lib util could create a brand new object, based on my parameters. So in this context is this new-born on heap? Always?

I would say usually on heap. But since the lib implementation is invisible, it may use non-heap.

In a specialized design, the library could return singletons held in a lookup table. The singletons are usually pre-existing in heap or data-segment.

Q: Now suppose the library does return a new heap object, who would be responsible to release the memory?
A: One common design (Sutter) requires the library factory to own it. Reference counted smart pointers are useful, as the factory knows the reference count of the new heap object.

Q: Now suppose the library factory returns a new heap object by raw pointer, who would be responsible to release the memory?

## sg19 java QQ to brush up

  • — jxee
  • [10=10 min ]
  • [20] Containers
  • Cloud native? LG
  • JPA? LG
  • — other ecosystem items
  • containers, cgroups
  • jdk tools for heap analysis
  • tuning best practices
  • GC in general? bookish ok
  • Ajax integration? no one asked
  • — coreJava
  • java8/9 fundamentals?
  • high-level concurrency tools — how they are related to the low-level. Lower mkt value than low-level tools
  • serialization? not popular in IV
  • advanced generics like wildcard? not popular in IV
  • reflection? not popular in IV

c++factory usually heapy

Java factory has little (any?) choice, but focus here is c++ factory techniques. I think it usually has to manufacture on heap.

In specialized cases, my factory can maintain a private pre-allocated object pool, carved out from heap, so the factory returns objects from this pool, without calling new().

To my surprise, some c++ factories actually return by value, as illustrated in both [[c++cookbook]] and https://herbsutter.com/2013/05/30/gotw-90-solution-factories/ — RVO and move-semantics would likely kick in.

longer u stay,more localSys nlg⇒safer@@ #5%GS

As I said in this blog, FTE-dev is often worse off than contractors.

I think if you stay for many years without moving up while some of your colleagues move up, you may or may not get a stigma.  Some of the newer members may become your manager:) But this is not the main focus here.

The longer you stay, the more knowledgeable you become about the local system. Hopefully more relaxed and less stressed? Partly true, but a very dangerous slippery slope. You hope that you can make do with 80% of the earlier /intensity/, but I think it won’t happen in most ibanks [1].

In my observation, most VP-level old timers operate under /unrelenting/ pressure (“threat”) to maintain high productivity. They are expected to be more proficient and more productive than earlier, not allowed to slow down and take it easy … No retirement home here.

Otherwise, they would fail the peer benchmark

Another Part of the threat comes from hungrier, younger colleagues able to reach (then surpass) half your speed within a year or two, driven by /formidable/ brain-power (energy, intelligence, ..)

[1] There are exceptions but I only know a few so I don’t want to spend too much analyzing. Exception — if you were the original architect and you keep an eye on the evolution of your brainchild, and remain knowledgeable about it, but this scenario requires some brain-power.

That’s the harsh competitive reality even if you don’t seek increasing responsibilities. A small percentage of the people are ambitious ladder climbers. (Am clearly not, because at my current level I already feel the heavy workload.)

Many people I talk to want an “easy life”, not increasing responsibilities. However, If you don’t take up increasing responsibilities, you may become too expensive. You may get a token bonus. I think you may even get a humiliating bonus.

Overall, in “peacetime”, long service without moving up can feel embarrassing and uncomfortable at times, for some individuals. (It’s more noticeable if most of the peers at your level are much younger, as in Macq and OC.) Some corporate cultures may tolerate that but still stigmatize it.

Employers claim they prefer employees staying longer rather than shorter. That’s blatant marketing. In reality, some employers wish some old timers would leave on their own, to make way for younger, cheaper fresh blood. GS’s annual 5% cull in peacetime is widely reported in WSJ, Independent... A few common motivations:

  1. Old timers are sometimes perceived as obstacles to change, to be removed.
  2. some employers believe younger, newer workers are cheaper and more motivated on average
  3. Whenever a new manager comes in he would bring in his old friends, otherwise he is weak.

Down turn? All hell breaks loose. Rather than protecting you, your long service track record may make you vulnerable. You may be seen as an opportunity to “replenish fresh blood”. In contrast, the less-productive but newer colleagues may show potential, and the hiring manager doesn’t want to look bad — hiring then firing new guys. In other words, it’s sometimes safer for the manager to sacrifice an old timer than a recent new hire. This is different from my Stirt experience.

My personal biased conclusions —

  • no such thing as long service recognition. No such thing as two-way commitment.
  • If you can be replaced cheaper, you are at risk. The more you earn, the more risky
  • If you are earning above the market rate then you need enough value-add, regardless how long you have served.

float/int 1:1 mapping#java support

[category javaOrphan ]

See last paragraph of https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/atomic/package-summary.html

  • Useful if you need to precisely compare a 64-bit float value. For example, you can check if a value has changed at all. The value could be user-supplied or shared mutable.
  • Useful if you need to store float values in a hashtable, though I’m not sure about performance.
  • Useful if you need radix sort on floats
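A tiny illustration of the 1:1 mapping via Double.doubleToLongBits (the helper name sameBits is my own). Note the contrast with ==: all NaNs collapse to one canonical bit pattern, while 0.0 and -0.0 have different bits:

```java
class FloatBits {
    // bit-for-bit equality of two doubles, via the canonical 64-bit mapping
    static boolean sameBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }
}
```

The raw variant Double.doubleToRawLongBits preserves distinct NaN payloads; doubleToLongBits canonicalizes them, which is what a hashtable key wants.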

##specializations fac`same IV quizzes 20Y on#socket !! c++11

(tagline: the most churn-resistant specializations.)

Socket IV questions have remained unchanged for 20Y — Unmatched stability and churn-resistance, but not necessarily accumulating

  • Preservation of t-investment
  • Preservation of accumulation
  • Preservation of deep learning? Socket programming has modest depth.

Q: Beside the specialization of socket programming, are there other specializations that are dominated by the same old QQ questions 20 years on?

  • [S] classic data structures
  • [S] classic sort/search algorithms on int-array, char-array, list ..
  • [S] classic traversal algorithms on trees, general graphs
  • [s] classic recursive, DP, greedy algorithms beyond graphs
  • [S] pre-c++0x core-C++ (a specialization!) questions are largely unchanged. C++11 questions are rooted in the old knowledge base.. BUT most of the c++11 QQ topics will likely fall out of interview fashion
  • [s] cross-language concurrency primitives.
  • unix file/process/string/signal manipulation
  • unix shell scripting — low market value
  • [S] SQL — including join, index design … but seldom quizzed in depth nowadays
  • [S] regex — seldom quizzed, but often needed in coding
  • [S=classic, well-defined specialization]
  • [s=slightly less well-defined specialization]

Now the disqualified skills

  1. JGC + jvm tuning — high churn over 20Y
  2. TMP — new features introduced in c++11

## strategic TSN among%%abandoned

Q: name the most strategic trySomethingNew domains that I have since abandoned, given up, quit. How about the new plan to take on coding drills as a lifelong hobby?

Don’t spend too much time, because the answers are nothing new even though this is a decent question.

  1. — ranked by surprise
  2. algo trading? actually very few roles spread across a number of firms
  3. c#
  4. drv pricing quant
  5. real time risk as in GS, Qz and Athena
  6. RDBMS tuning
  7. MOM, async, message-driven design knowhow
  8. distributed cache like Coherence and Gemfire
  9. Solaris sys admin and DBA
  10. perl, SQL, Unix/Linux power-user knowledge? No longer a top 10 focus

—-

  • python? Not yet abandoned
  • web dev for dotcom? I did this for years then abandoned it. Many of the tech skills are still relevant like sessions, java thread safety, long-running jobs

EnumSet^regular enum

[category javaOrphan]
A java enum type usually represents .. (hold your breath) .. a radio-button-group. A variable of this type will bind to exactly one of the declared enum constants.

eg: Continent — there are only 7 declared constants. A Continent variable binds to Africa or Antarctic but not both.
eg: SolarPlanet — there are only 8 declared constants
eg: ChemicalElement — there are only 118 declared constants
eg: ChinaProvince — there are only 23 declared constants

In contrast, an enum type has a very different meaning when used within an EnumSet (I will give this meaning a name soon). Each enum constant is an independent boolean flag. You can mix and match these flags.

Eg: Given enum BaseColor { Red,Yellow,Blue} we can have only 2^3 = 8 distinct combinations. R+Y gives orange color. R+Y+B gives white color.

Therefore, the BaseColor enum represents the 3 dimensions of color composition.

EnumSet was created to replace bit vector. If your bit vector has a meaning (rare!) then the underlying enum type would have a meaning. Here’s an example from [[effJava]]

Eg: enum Style {Bold, Underline, Italic, Blink, StrikeThrough, Superscript, Subscript… } This enum represents the 7 dimensions of text styling.

[[effJava]] reveals the telltale sign — if the enum type has up to 64 declared constants (only three in BaseColor.java), then the entire EnumSet is represented as a single 64-bit long. This shows that our three enum constants are really three boolean flags.
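A minimal demo of the flag semantics, using the Style example above (my own sketch):

```java
import java.util.EnumSet;

public class StyleDemo {
    enum Style { BOLD, UNDERLINE, ITALIC, BLINK, STRIKETHROUGH, SUPERSCRIPT, SUBSCRIPT }

    public static void main(String[] args) {
        // each constant is an independent boolean flag -- mix and match
        EnumSet<Style> heading = EnumSet.of(Style.BOLD, Style.UNDERLINE);
        heading.add(Style.ITALIC);                          // switch on a third flag
        System.out.println(heading.contains(Style.BOLD));   // true
        System.out.println(heading.contains(Style.BLINK));  // false
        // 7 constants <= 64, so the whole set is backed by one long (RegularEnumSet)
        System.out.println(EnumSet.allOf(Style.class).size()); // 7
    }
}
```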

pre-allocate DTOs@SOD #HFT #RTS

example — RTS pre-allocates outgoing message objects from a ring buffer’s head, and “returns” to the ring buffer at the tail… See How RTS achieved 400-700 KMPS #epoll

example — Sell-side HFT OMS also uses pre-allocation. Suppose for every new order there are 3 new DataTransferObjects A/B/C to be instantiated on heap. Traditional approach would make 3 allocation requests in real time. I think the free-list manager becomes a hotspot, even if there’s a per-thread free list.

Basically HFT avoids new/malloc after market opens.

Pre-allocation is a popular technique. We compute at compile time the sizes of A/B/C based on their class declarations. For DTO class A, sizeof(A) just adds up the non-static data field sizes. Then we estimate how many orders we will get a day (say 7 million). Then we pre-allocate 7 million A objects in an array. The allocation happens at start-up, though the sizes are compile-time constants.

When an order comes in, the A/B/C DTO objects are already allocated but empty.

Byte-array is an alternative, but this smells like the raw free list to me…
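The same pre-allocation idea can be sketched in Java too (class and field names below are hypothetical; a real HFT pool is far more elaborate): allocate the whole pool at start-of-day, then acquire/release DTOs instead of calling new on the hot path.

```java
import java.util.ArrayDeque;

// Hypothetical DTO: fields are overwritten on reuse, the object is never reallocated
class OrderDto {
    long orderId;
    double price;
    void clear() { orderId = 0; price = 0.0; }
}

public class DtoPool {
    private final ArrayDeque<OrderDto> free = new ArrayDeque<>();

    DtoPool(int capacity) {
        for (int i = 0; i < capacity; i++)   // ALL allocation happens at start-up
            free.add(new OrderDto());
    }

    OrderDto acquire() { return free.poll(); }        // no new/malloc after market opens
    void release(OrderDto d) { d.clear(); free.add(d); } // "return" for reuse, like the ring buffer

    public static void main(String[] args) {
        DtoPool pool = new DtoPool(7_000);   // scaled-down daily estimate
        OrderDto d = pool.acquire();
        d.orderId = 42; d.price = 99.5;      // fill the pre-allocated, empty object
        pool.release(d);
    }
}
```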

java enum: elegant

“Elegant” (and “clean”) is probably the best adjective. My comments below are mostly based on [[effJava]]

  • 🙂 immutable in the face of serialization
  • 🙂 strict singleton in the face of serialization .. P311
  • 🙂 simple enough to be easily embedded as a static nested class
  • 🙂 you can add behavior (and data) unique to Jupiter, using a constant-specific class body
  • 🙂 You can switch on enum values
  • compiler adds two implicit static methods values() and valueOf(), not listed on the official javadoc 😦
  • values() returns a fixed array of the predefined enum values
  • valueOf(planetNameStr) would return the matching enum instance
    • Note this method is unrelated to String.valueOf()
    • you can even add your own fromString(abbreviatedPlanetName) method. see [[effJava]]

EnumSet (see separate blogpost) and EnumMap built on the strength of enum feature
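A small sketch pulling several bullets together — constant-specific class body, the implicit values() and valueOf() (my own example, loosely modeled on the Planet example in [[effJava]]):

```java
public class EnumDemo {
    enum Planet {
        EARTH(5.976e24, 6.378e6),
        JUPITER(1.9e27, 7.149e7) {
            @Override String note() { return "largest planet"; }  // constant-specific class body
        };

        private final double mass, radius;
        Planet(double mass, double radius) { this.mass = mass; this.radius = radius; }
        String note() { return "a planet"; }
        double surfaceGravity() { return 6.674e-11 * mass / (radius * radius); }
    }

    public static void main(String[] args) {
        for (Planet p : Planet.values())              // implicit static values()
            System.out.println(p + ": " + p.note());
        Planet e = Planet.valueOf("EARTH");           // implicit static valueOf()
        System.out.printf("%.2f%n", e.surfaceGravity()); // roughly 9.8
    }
}
```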

Q:are java primitive+reference on heap or stack #escape

An old question but my answers are not really old 🙂

In Java, a so-called “referent” is a non-primitive thingy with a unique address on heap, accessed via heap pointers.

In java, a referent is always an Object, and an Object is always on heap therefore always a referent.

(Java language defines “reference types” in terms of primitive types, so we need a clear understanding of primitive types first.)

In java, a primitive thingy is either part of a (heapy) Object or a local thingy on stack

(In C++ lingo, object can be a new int(88)…)

A reference is, at run-time really a heap pointer. Assuming 32-bit machine, the pointer itself occupies 4 bytes and must be allocated somewhere. If the reference is local to a method, like a parameter or a local variable, then the 4 bytes are on stack like a 32-bit primitive local variable. If it’s part of an object then it’s on heap just like a 32-bit primitive field.

— advanced topic: escape

Escape analysis is enabled by default. EA can avoid constructing an Object on heap, by using the individual fields as local variables.

— advanced topic: arrays

Arrays are special and rarely quizzed. My unverified hypothesis:

  • at run-time an array of 3 ints is allocated like an Object with 3 int-fields
  • at run-time an array of 3 Dogs is allocated like an Object with 3 Dog fields. This resembles std::vector<shared_ptr<Dog>>
  • Q: how about std::vector<Dog>?
  • %%A: I don’t think java supports it.
  • The array itself is an Object

c++QQ critical-mass[def2] has started growing!

In May 2019, I feel I have achieved enough critical mass on c++ QQ topics. Critical mass is defined by the acid-test questions.

Q1: without a full-time c++ job, but with enough interviews, will my c++ QQ insight/understanding show resilience against churn and memory fading, as in coreJava?
Q2: thick->thin achieved? Not yet, but a cross-reference graph has been built up as a defense against fading memory

This java career review provides a valuable context.

What visible progress gave me the confidence? Recent technical wins show my improving ranking among c++ candidates.

  1. SCB-FM
  2. CVA
  3. SIG
  4. TradeWeb core team

Note I have invested more effort on c++ QQ than java… [18]t-investment: c++now surpassing java

  • — now a sample of critical-mass topics, roughly ranked by importance on high-end interviews, mostly at HFT and ibanks
  • coding tests
  • [e] sockets
  • mv-semantics
  • containers
  • smart ptr
  • [e] linux
  • memory mgmt including new..
  • polymorphism including MI #44 posts in the category
  • [e] pthreads + c++11 threads
  • [e] cache efficiency, compiler optimizations, build tools
  • TMP
  • [e=ecosystem topics]

##gains{Stirt job loss

Immediately after I joined Macq, I realized I had traded up for a better job, better in every way.

  • gain: self-confidence that I can comfortably handle the ensuing financial impact on family
  • gain: self-confidence that I have survived again, albeit with a scar
  • gain: a protective scar — as of 2019 I’m still traumatized but slowly I’m healing from inside
  • gain: lesson learned the hard way — avoid FTE
  • $gain: compensation package more than covered my bench time but still I prefer the … something else instead of the compensation

unnoticed gain{SG3jobs: 看破quantDev

All three jobs were java-lite, with some quantDev exposure. Through these jobs, I gained crucial clarity about the bleak reality of the quantDev career direction. That clarity enabled me to take the bold decision to stop the brave but costly TSN attempts to secure a foothold. Securing a foothold is simply too tough and futile.

Traditional quantDev in derivative pricing is a shrinking job pool, with poor portability of skills and no standard set of interview topics.

At the same pay, I would now prefer the eq domain over drv pricing, due to mkt depth and job pool.

QuantDev offers no contract roles !

Instead, I successfully established some c#/py/c++ trec. The c++ accu, though incomplete, was especially difficult and precious.

Without this progress, I would lack the confidence in py/c#/c++ professional dev that enabled me to work towards and achieve multiple job offers. I would still be stuck in the quantDev direction.

initialize matrix with invalid value, not 0 #DP

In simple bottom-up dynamic programming algos, we often need to build a matrix of previous results to derive later results.

In contrast, harder DP problems may need other tools, but often in addition to, not in place of, a matrix.

Q: (often neglected question): what initial value to put into matrix?

  • In some problems, the top-right triangular area is not used. So the dump() had better use some obviously invalid values to highlight the used cells. 999999 is often a reasonable choice, but then dump() would need alignment.
    • If you use the read_matrix technique discussed in another blogpost, then I think dump() should use read_matrix
  • Sometimes we can use an initial high value because the DP algo will use min() to compare. I think this is fine.
  • Zero is often a reasonable-looking initial value, but a lot of times zero is a legit value in the algorithm! The initial value then looks like a computed value, which is very confusing.
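A tiny sketch of the sentinel idea (my own example, using 999999 as the obviously-invalid value and a trivial recurrence):

```java
import java.util.Arrays;

public class DpInit {
    static final int INVALID = 999_999;  // obviously-invalid sentinel; 0 would masquerade as a real result

    static int[][] build(int n) {
        int[][] dp = new int[n][n];
        for (int[] row : dp) Arrays.fill(row, INVALID);
        dp[0][0] = 0;                                  // base case
        for (int r = 1; r < n; r++)
            for (int c = 0; c < r; c++)                // only the lower-left triangle is used
                if (dp[r - 1][c] != INVALID)           // never derive from an uninitialized cell
                    dp[r][c] = Math.min(dp[r][c], dp[r - 1][c] + 1);
        return dp;
    }

    public static void main(String[] args) {
        for (int[] row : build(4)) System.out.println(Arrays.toString(row));
        // the unused upper-right cells still show 999999 -- clearly never touched
    }
}
```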

 

[19]if I had stayed ] java

I think every tsn experience and proposal has some “buts”, so does it mean we should stop trying?

No. If I had stayed within java/sql/perl then I would have been worse off —

  • fewer job opportunities, less broadened technical base
  • slightly more worry about job security
  • more worry about churn
  • more worry about outsourcing
  • no differentiation from millions of java guys
  • left behind by some of the alpha geeks who branch out and establish new strongholds.
    • My technical mind is the BROAD-n-deep, curious explorer type so it is stifled if confined to java
  • sql and perl both falling out of fashion

But ….

  • possibly more competent in terms of figure-things-out relative to team peers
  • possibly fewer stigmas and more respect
  • ^^ These factors depend mostly on localSys knowledge
  • not so much stress, so much painful struggle in the beginning
  • possibly an architect role, provided I stay long and invest heavily in localSys

Was leverage good on my multiple tsn attempts after GS? reasonable leverage in some attempts.

checked STL^checked java Collections

jdk checkedList, checkedMap etc are designed to catch type errors — checking that any newly added item has the correct type for the collection. See P246 [[java generics]]

STL checked containers check very different coding errors. See http://www.informit.com/articles/article.aspx?p=373341, which is extracted from the book [[c++codingStd]]
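A quick demo of the jdk side (my own sketch) — the wrong-typed add fails immediately, instead of surfacing later at a distant get():

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CheckedDemo {
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static void main(String[] args) {
        List<String> safe = Collections.checkedList(new ArrayList<>(), String.class);
        safe.add("ok");
        List raw = safe;                  // simulate legacy raw-type code
        try {
            raw.add(Integer.valueOf(42)); // wrong type: rejected HERE, at add time
        } catch (ClassCastException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(safe.size());  // 1 -- the bad element never got in
    }
}
```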

biasedLocking^lockfree^ST-mode

This is a pure QQ topic, but you can score a few points by mentioning it.

[[javaPerf2014]] P268 says biased locking is enabled by default but can be disabled to improve performance in app servers using a thread pool and contended locks. It explains the reason i.e. biased locking comes with a price — a runtime overhead.

https://stackoverflow.com/questions/9439602/biased-locking-in-java is very concise — un-contended locking may incur zero cost, as claimed in the accepted answer

  • .. incurs no synchronization cost. I suppose this is typically geared towards overly conservative code that performs locks on objects without ever exposing them to another thread. The actual synchronization overhead will only kick in once another thread tries to obtain a lock on the object.
  • Biased locking is the default in java6 (-XX:+UseBiasedLocking), improving the performance of uncontended synchronization. An object is “biased” toward the thread which first acquires its monitor; subsequent monitor-related operations are relatively faster. Some applications with significant amounts of uncontended synchronization may attain significant speedups

Is this technique considered lockfree? No, but some may speculate that it might be even faster than lockfree. So if you suspect most of the time there’s only one thread accessing a critical section, you could choose to rely on the (java6 default) biased locking rather than lockfree solution. Most of the time this mutex-based design would challenge (if not beat) a lockfree performance.

However, I believe single-threaded mode is still faster, where a thread isn’t aware of other threads, as if it were the only thread accessing those objects i.e. no shared-mutable. There would be no lock grab, no memory fencing at all. [[javaPerf2014]] P375 agrees.

Q:which linux c++thread is stuck #CSY/Vanguard

This is a typical “c++ecosystem question”. It’s not about c++ or C; it’s about linux instrumentation tools.

Q1: Given a multi-threaded server, you see telltale signs that the process is stuck and you suspect only one of the threads is stuck while the other threads are fine. How do you verify?

Q2: What if it’s a production environment?
A: I guess all my solutions should be usable in production, since the entire machine is non-functioning — we can’t make it any worse. If the machine is still doing useful work, then we should probably wait till end of day to investigate.

–Method: thread dump? Not popular for c++ processes. I have reason to believe it’s a JVM feature, since java threads are always jvm constructs, usually based on operating system threads [1]. JVM has full visibility into all threads and provides comprehensive instrumentation interface.

https://www.thoughtspot.com/codex/threadstacks-library-inspect-stacktraces-live-c-processes shows a custom c++ thread dumper but you need custom hooks in your c++ source code.

[1] Note “kernel-thread” has an unrelated meaning in the linux context

–Method: gdb

thread apply all bt – prints a stack trace of every thread, allowing you to somewhat easily find the stuck one

I think in gdb you can release each thread one by one and suspend only one suspect thread, allowing the good threads to continue

–Method: /proc — the dynamic pseudo file system

For each process, a lot of information is available in /proc/12345. Information on each thread is available in /proc/12345/task/67890 where 67890 is the kernel thread ID. This is where ps, top and other tools get thread information.

 

closestMatch in sorted-collection: j^python^c++

–java is cleanest. P236 (P183 for Set) [[java generics]] lists four methods belonging to the NavigableMap interface

  • ceilingEntry(key) — closest entry higher or equal
  • higherEntry(key) — closest entry strictly higher than key
  • lowerEntry
  • floorEntry
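A quick demo of the four methods (my own sketch):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class ClosestMatch {
    public static void main(String[] args) {
        NavigableMap<Integer, String> m = new TreeMap<>();
        m.put(10, "ten"); m.put(20, "twenty"); m.put(30, "thirty");

        System.out.println(m.ceilingEntry(20)); // 20=twenty  (closest >= key)
        System.out.println(m.higherEntry(20));  // 30=thirty  (closest >  key)
        System.out.println(m.floorEntry(25));   // 20=twenty  (closest <= key)
        System.out.println(m.lowerEntry(10));   // null       (closest <  key: none)
    }
}
```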

2 threads taking turn #std::ref #CSY

Vanguard Q: write a program containing exactly two threads printing 1/3/5/7/… and 2/4/6/8… respectively, but in lock steps, as if passing a token between them.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/thr/takeTurn.cpp is my solution.

I managed to avoid sleep() and condVar. The idea — all threads run the same function which

  1. check the shared mutable variable Next. If it’s “my” turn then
  2. grab lock, print my next number, update Next, release lock
  3. end-if
  4. yield and exit

I used an atomic<char> “Next” set to lastOutput%something, that’s visible to all threads, even if not holding a lock.
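For comparison, here is a rough Java analogue of the same approach (my own sketch, not the Vanguard answer) — two threads sharing an atomic “Next”, no sleep() or condVar:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class TakeTurn {
    static final AtomicInteger next = new AtomicInteger(1); // shared mutable "Next", visible without a lock

    static Runnable worker(int parity, int count) {
        return () -> {
            for (int printed = 0; printed < count; ) {
                if (next.get() % 2 == parity) {                 // my turn?
                    System.out.println(next.getAndIncrement()); // print, then pass the token
                    printed++;
                } else {
                    Thread.yield();  // not my turn: spin politely
                }
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        Thread odd = new Thread(worker(1, 4));   // prints 1,3,5,7
        Thread even = new Thread(worker(0, 4));  // prints 2,4,6,8
        odd.start(); even.start();
        odd.join(); even.join();
    }
}
```

Only the thread whose turn it is ever increments Next, so the read-then-increment pair is race-free.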

 

contains(): Set^SortedSet

–in java # See P247/32 [[java generics]]

  • Set<Acct> contains() uses Acct.equals()
  • SortedSet<Acct> contains() uses Comparable<Acct> or a custom comparator class, and ignores Acct.equals()
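A quick demo of the java behavior (my own sketch — the comparator deliberately ignores equals()):

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class ContainsDemo {
    public static void main(String[] args) {
        // comparator looks only at length; TreeSet.contains() ignores equals()
        SortedSet<String> byLen = new TreeSet<>(Comparator.comparingInt(String::length));
        byLen.add("cat");
        System.out.println(byLen.contains("dog")); // true! "equivalent" under the comparator

        Set<String> hashed = new HashSet<>();      // uses equals() (and hashCode)
        hashed.add("cat");
        System.out.println(hashed.contains("dog")); // false
    }
}
```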

–In STL

The tree-based containers use “equivalence” to determine containment, basically the same as the java comparator.

The hash-based containers use hashCode + an equality predicate. The implementation details are slightly complicated since both the hash function and the predicate function are template params. Upon template instantiation, the two concrete types become “member types” of the host class. If the host class is unordered_set<string>, then we get two concrete member types:

unordered_set<string>::hasher and unordered_set<string>::key_equal

These member types can be implemented as typedefs or nested classes. See ARM.

%%AuxDS strength ] coding test #algoQQ

Update — Indeed experience..

  1. when a tough problem requires us to construct an explicit and non-trivial auxiliary data structure (AuxDS), I often have an intuitive feel for the data structures needed, though not always the optimal ones.
  2. when a tough problem requires a simpler AuxDS + unusual algo, I don’t have an edge.
  3. When a tough problem requires only an unusual algo, I don’t have an edge.
    * eg: edit distance
    * eg: sliding window max
    * eg: maximal sub-matrix

    • Suggestion: invest in algoQQ. I feel they will stick with me.

Can C implement basic containers@@

Templates — templates are unavailable in C, and I don’t know how C could emulate them, except with macros.

C doesn’t support exceptions but luckily STL uses very few exceptions.

Most STL container classes have relatively few member functions. I think C can implement them as free functions taking “this” as an additional parameter.

Bulk of the basic operations on container are global functions in the form of STL algorithms. I think C can implement them.

Iterators are tricky. In STL I think they are usually nested classes. I think I would just give up. C supports no operator overloading so there’s no point providing an iterator.

##STL iterator is implemented as ..

  • implemented as raw ptr — if your data is held in an array
  • implemented as member class (sugarcoated as member typedef) — most common
  • implemented as friend class
  • implemented as wrapper over internal container’s iterator — (cheat) if your custom container is kind of wrapper over an STL container, then just use the internal container’s iterator as your iterator.

Remember an iterator class is a form of smart pointer by definition, since it implements operator->() and operator*()

##java8 features #MethodRef

  • lambda
  • default methods
  • streaming
  • method references? Supposed to be an important feature in the core language/syntax. related to lambda syntax, presumably a syntactic sugar coating.
  • java.util.Optional<T> introduced to reduce NPE. Compare boost::optional
  • CompletableFuture<T> to promote async, event-driven programming
  • PermGen disappears in java8
  • –not really java8 features:
  • java7 introduced G1 garbage collector though default GC remains ParallelGC, for both java8 and java7

pureDev beats techLead: paycut=OK #CYW

My hypothesis based on discussions with CYW:

Bottom line — YW said if the pure-tech salary is only 20k lower, then he may prefer that workload. The work-life balance, the freedom-from-care and simplicity .. are often highlighted by my tech-lead friends (like Su from GS?), though I suspect most of them still take tech-lead roles (or higher).

A techLead has (usually difficult) deliverables for multiple business groups pushing conflicting priorities. The techLead also needs to enlist support from external teams. As such, he has nontrivial additional workload including supervising his own team members, managing users, liaising with other teams … all in addition to the development workload.

  • 😦 吃力不讨好 (thankless, strenuous work)
  • 😦 It’s not easy to mobilize other people to work for your deliverables.
  • 😦 The extra workload often requires overtime. RTS’s Venkat might be an example when he claimed he works 7 days a week.

The power dynamics are based on loyalty. Some big boss might value your loyalty more than your GTD capacity. Loyalty is a tricky investment. If you invest in loyalty to big boss A, but A gets replaced by big boss B (as often happens), and B invariably brings in his own loyal lieutenants, then you have a problem. I think Jerry Zhang (Citi) faced this situation when Srini took over.

Unlike the tech lead, the typical pure-tech contributor doesn’t need to demonstrate loyalty.

I think some senior developer roles are pure-tech-contributors.

both DEPTH+breadth expected of%%age: selective dig

Given my age, many interviewers expect me to demonstrate insight into many essential (not obscure) topics such as lockfree (Ilya).

Interviewers expect a tough combination of breadth + some depth in a subset of those topics.

To my advantage I’m a low-level digger, and also a broad-based avid reader. The cross-reference in blog is esp. valuable.

Challenge for me — identify which subtopic to dig deeper, among the breadth of topics, given my limited absorbency and the distractions.

##java heap allocation+!explicit q[new]

Most of these are java compiler tricks.

  • (pre-runtime) enum instance instantiation — probably at class-loading time
    • P62 [[java precisely]]
  • String myStr = “string1”; // see string pool blogpost
    • P11 [[java precisely]]
    • NOT anonymous temp object like in c++
  • “string1” + “str2” — is the same as myStr.concat(..). So a method can often new up an object like this.
    • P10/11 [[java precisely]]
  • boxing
  • (most tricky) array initialization
    • int[] days ={31,28,31/* instantiates the array on heap */};
    • most tricky
    • P17 [[java precisely]] has examples
    • P16 [[java precisely]] also shows an alternative syntax “new int[]”
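A tiny demo packing several of these cases together (my own sketch):

```java
public class HiddenAllocation {
    enum Suit { HEART }                   // enum instance created at class-loading time

    public static void main(String[] args) {
        String myStr = "string1";         // string-pool literal, no explicit new
        String s2 = myStr + args.length;  // concatenation news up a String on heap
        Integer boxed = 1000;             // autoboxing allocates (outside the -128..127 cache)
        int[] days = {31, 28, 31};        // array initializer allocates the array on heap
        System.out.println(s2 + " " + boxed + " " + days.length + " " + Suit.HEART);
    }
}
```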

crypto exchange revenue #hearsay

–Jasper wrote “Trading Fees: When it comes to crypto-to-fiat trading pairs, there are no fees. As for crypto-to-crypto pairs, 0.15% of transaction value for market takers and -0.075% of transaction value (a rebate) for market makers is the rule.”

$100M daily trading volume is a figure I heard. $150k of fee income, assuming 15 bps fee.

I guess the rebate is like an income tax rebate. Market makers still pay a fee but a lower fee than market-takers.

–one-time listing fee

I was told that some creators of new cryptocurrencies pay $5mil to a popular exchange for a listing. It’s a one-time listing fee.

 

standard SQL to support pagination #Indeed

Context — In the Indeed SDI, each company page shows the latest “reviews” or “articles” submitted by members. When you scroll down (or hit Next10 button) you will see the next most recent articles.

Q: What standard SQL query can support pagination? Suppose each record is an “article” and page size is 10 articles.

I will assume each article has an auto-increment id (easier than a unique timestamp) maintained in the database table. This id enables the “seek” method. The first page (usually the latest 10 articles) is sent to the browser. Then the “fetch-next” command from the browser would contain the last id fetched. When this command hits the server, should we return the next 10 articles after that id (AA), or should we check the latest articles again and skip the first 10 (BB)? I prefer AA. BB can wait until the user refreshes the web page.

The SQL-2008 industry standard supports both the (XX) top-N feature and the (YY) offset feature, but for several reasons [1], only XX is recommended:

select * from Articles where id < lastFetchedId order by id desc fetch first 10 rows only

[1] http://www.use-the-index-luke.com clearly explains that the “seek” method is superior to the “offset” method. The BB scenario above is one confusing scenario affecting the offset method. Performance is also problematic when offset value is high. Fetching the 900,000th page is roughly 900,000 times slower than the first page.

3ways to expire cached items

server-push update ^ TTL ^ conditional-GET

Online articles hint at these modes, but few list them explicitly. This is a simple concept but fundamental to DB tuning and app tuning.

A) TTL — more common. Each “cache item” embeds a time-to-live data field a.k.a expiry timestamp

B) cache-invalidation — some “events” would trigger an invalidation. Without invalidation, a cache item would live forever with an infinite TTL, like the list of China provinces.

If TTL is predictable, you should combine A and B. I think cookie is an example.

G) conditional-GET in HTTP is a proven industrial-strength solution described in the 2005 edition of [[computer networking]]. The cache server always sends a GET to the database but with an If-Modified-Since header. This reduces unnecessary database load and network load.

TTL ^ eager server-push ^ conditional-GET, compared scenario by scenario:

  • if frequent query, infrequent updates — TTL: efficient; server-push: efficient; conditional-GET: high network load, but limited to tiny requests
  • if latency important — TTL: OK; server-push: lowest latency; conditional-GET: slower lazy fetch, though efficient
  • if infrequent query — TTL: good; server-push: wastes DB/client/NW resources as the “push” is unnecessary; conditional-GET: efficient on DB/client/NW
  • if frequent update — TTL: unsuitable; server-push: high load on DB/client/NW; conditional-GET: efficient conflation
  • if frequent update+query — TTL: unsuitable; server-push: can be wasteful; conditional-GET: fairly efficient

 

market-depth^elite domains #jxee

I used to dismiss “commodity” skills like market data, risk system, JXEE… I used to prefer high-end specializations like algo-trading, quant-dev, derivative pricers. In reality, average salary is only slightly different and a commodity job can often outpay a specialist job.

As I get older, it makes sense to prefer market depth rather than “elite” (high-end niche) domains. A job market with depth (eg jxee, market-data, risk systems) offers a large number of positions. The typical salary of the top 10% vs the median is not very different — a small gap. In contrast, the elite domains feature bigger gaps. As I grow older, I may need to reconsider the specialist vs generalist-manager choice.

Reminders about this preference (See also the spreadsheet):

  1. stagnation in my orgradient
  2. may or may not use my specialist skills in math, concurrency, algorithms, or SQL …
  3. robust demand
  4. low churn — a critical criterion whenever I mention “market depth”. I don’t like the market depth of javascript and web java.
  5. salary probabilities(distro): mgr^NBA#marketDepth etc

–case study: Algo trading domain

The skillset overlap between HFT vs other algo systems (sell-side, OTC, RFQ, automated pricing/execution..) is questionable. So is “accumulation” across the boundary.  There seems to be a formidable “dragon gate” — 鲤鱼跳龙门.

Within c++ based HFT, accumulation is conceivable. Job pool is so small that I worry about market depth. My friend Shanyou agreed that most of the technical requirement is latency. C/C++ latency techniques are different from java.

However, I suspect most HFT developers seldom need to optimize latency in their day-to-day work

Outside HFT, the level of sophistication and latency-sensitivity varies. Given the vague definition, there are many (mostly java) jobs related to algo trading i.e. better market depth. Demand is more robust. Less elitist.

jvm footprint: classes can dominate objects

P56 of the official [[java platform performance]], written by the SUN java dev team, has pie charts showing that

  • a typical Large server app can have about 20% of heap usage taken up by classes, rather than objects.
  • a typical small or medium client app usually has more RAM used by classes than data, with up to 66% of heap usage taken up by classes.

The same page also says it’s possible to reduce the class footprint.

mgr role risk: promotion=hazard #Alex

“mgr” in this context means any lead role.

When I feel left behind on the slow track, it’s usually from comparing myself to the manager peers.

Some Morgan Stanley developer rose to ED but after a while he wanted hands-on dev, so he stopped managing teams and became a very senior dev. But his performance/value-add was benchmarked against those manager EDs. After a while, presumably, he was seen as too expensive as a dev and got the golden handshake in 2019 Mar.

When my friend Alex told me this story, I gave this illustration — suppose with hard work I am competent at Level 5 (senior VP) and very comfortable at Level 4 (junior VP), but struggle a bit at Level 6 (ED) when benchmarked against Level 6 peers. In such a case, for job safety I may want to remain at Level 5 rather than moving up. For an easy life, I may even stay at Level 4. If I get promoted to Level 6, I face stiff peer competition from guys competent at Level 6. The benchmark risk can be real. 高处不胜寒 (it is cold at the top)

When you get promoted to Level 6, you can’t avoid the peer bench-marking. You will get bench-marked, whether you like it or not. I find this peer benchmark unhealthy and hazardous, but Alex is very explicit about the benchmark/calibration system in many firms.

Better remain at a level you can perform well relative to peers.

GP+Ashish: 随遇而安@PIP #GS

grandpa’s advice is 随遇而安 (take things as they come) — “Do your best. If they decide it’s a role mismatch then look for another job”. I will expand on his advice and add relevant tips and observations

  • academic self-image .. fragile — Ashish pointed out I was academically too successful and unable to cope with put-downs
  • best effort — I don’t need to bend over backward and sacrifice family
  • no shame
  • no fear of stigma — sounds impossible but it is possible !
  • no regret
  • guilt — the guilt should be on employer for making a wrong hire and creating hardship in my life.
  • stay positive — there’s a chance I can survive for 1-2 years
  • peer caliber — Ashish said those guys aren’t rock stars
  • Saurabh attitude — I believe at a high salary or as the first technology hire for Julian, expectation would be rather high. Can I withstand the pressure as Saurabh did?
  • GS pressure cooker — I survived there, so I should be able to survive anywhere else.
  • learning to cope — At GS/Qz/Macq, did I learn coping strategies to manage the pressure? I hope so.

The pressure to perform would likely create real stress in the family, as i’m not as ‘carefree’ as in Bayonne. I feel some of the past stigmas would come back to haunt me.

See also https://bintanvictor.wordpress.com/wp-admin/post.php?post=28830&action=edit

try{}must be completed by ..#j^c++^c#

— Java before 7 and c#
try{} should be completed by at least a catch or finally. Lone wolf try{} block won’t compile. See https://www.c-sharpcorner.com/UploadFile/skumaar_mca/exception-handling-in-C-Sharp/

In particular, try/finally without catch is a standard idiom.

— java 7:
try{} can be completed by a catch, a finally, both, or neither .. four configurations 🙂

The try/finally configuration now has an important special case i.e. try-with-resources, where the finally is implicit so you won’t see it anywhere.
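A small java demo of the try/finally idiom and the implicit-finally of try-with-resources (my own sketch):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;

public class TryForms {
    public static void main(String[] args) throws IOException {
        // try/finally without catch -- the standard idiom
        StringWriter w = new StringWriter();
        try {
            w.write("hello");
        } finally {
            w.flush();                        // runs whether or not write() threw
        }

        // java7 try-with-resources: no visible catch or finally; close() is implicit
        try (BufferedReader r = new BufferedReader(new StringReader("line1"))) {
            System.out.println(r.readLine()); // line1
        }
        System.out.println(w);                // hello
    }
}
```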

— c++ as of 2019
C++ has no finally.

try{} must be followed by catch.

rvr usually shows up as function param ONLY

r-value reference is a type, and therefore a compile-time thing, not a runtime thing as far as I know. At runtime, there’s no r-value reference variable, only addresses and pointer-sized objects.

(I believe at runtime there is probably no lvr reference variable either.)

Compiler recognizes the RHS’s type and decides how to bind the RHS object to a variable, be it an rvr-variable, lvr-variable, nonref-variable, or const-lvr-variable.

About the only place I would use a "&&" variable is a function parameter. I don’t think I would ever need to declare a local variable or a field with "&&".

Do I ever assign an rvr variable as a RHS to something else? Only in one case, as described in [[effModernC++]] P162 and possibly Item 25. This kind of usage is really needed for QQ interviews and never in any job…. never. It’s too tricky and doesn’t buy us anything significant.

RTS feed for Internet clients #Mark #50%

Q: Based on the RTS market data dissemination system, what if some clients subscribe over a slow Internet connection and your orderbook (TCP) feed needs to provide delta updates each time a client reconnects?

Default solution: similar to FIX/TCP .. the sender maintains per-client state. Kenny of Trecquant said his trading app can be extremely simple if the exchange maintains state. I won’t elaborate. Here is my own solution.

Note on terminology — in multicast there’s no TCP-style “server” . Instead, there’s a sender engine for many receiver clients.

Suppose we have too many clients. To minimize per-client state management, my engine would simply multicast to all clients real time updates + periodic snapshots.

A client AA can request a snapshot on any symbol group, and I will immediately multicast the snapshots on a refresh channel. If client BB never requests anything, BB can ignore the refresh multicast channel.

Request quota — each client like AA can request X free snapshots a day. Beyond that, AA’s requests would be regulated as follows:

  • levied a fee
  • queued with lower priority

It’s feasible to replace the refresh multicast group with a unicast UDP channel per client, but to me the multicast-refresh solution offers clear advantages without major drawbacks.

  1. if there is an outage affecting two clients, each would request the same snapshots, creating unnecessary work on the engine. The request quota would incentivize each client to monitor the refresh group and avoid sending the same request as someone else
  2. multicast can be valuable to clients who like more frequent snapshots than the periodic.
  3. My operations team can also utilize this refresh channel to broadcast unsolicited (FreeOfCharge) snapshots. For this task, I would avoid using the publisher channel as it can be busy sending real time updates. We don’t want to interleave real time and snapshot messages.
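To make the quota idea concrete, here is a minimal Python sketch of the regulation logic. All names (SnapshotEngine, request_snapshot, the return tags) are my own invention, not from any real system; a real engine would also handle fee billing and the actual multicast I/O.

```python
from collections import defaultdict

class SnapshotEngine:
    """Toy sketch: each client gets a few free snapshot requests per day;
    beyond that, requests are queued at lower priority (or levied a fee)."""

    def __init__(self, free_quota=3):
        self.free_quota = free_quota
        self.used = defaultdict(int)  # client id -> requests used today

    def request_snapshot(self, client, symbol_group):
        self.used[client] += 1
        if self.used[client] <= self.free_quota:
            # within quota: multicast immediately on the refresh channel,
            # so other clients can piggyback on the same snapshot
            return ("refresh-multicast", symbol_group)
        # over quota: regulated, e.g. queued with lower priority
        return ("low-priority-queue", symbol_group)

engine = SnapshotEngine(free_quota=2)
print(engine.request_snapshot("AA", "FX"))  # within quota
print(engine.request_snapshot("AA", "FX"))
print(engine.request_snapshot("AA", "FX"))  # over quota
```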

[19] body-build`can hit low visPgress#leverage,disengaged

In Singapore (and very few NY) jobs, I noticed a pattern — the longer I stayed on a stable job, the more my motivation, incentives and positive feedback/reinforcement dropped for IV body-building including QQ, coding..

Every time I pass a non-trivial tech screening, I feel a real boost … Reinforcement of absorbency and reinforcement of a wellspring of energy lasting a few days to a few months … sometimes years thanks to my retrospective blogging. My Singapore experience is missing this crucial IV element. Without it, my absorbency of dry technical learning is hard to sustain. This also explains why my absorbency of localSys is hard to sustain.

(My son has never experienced such positive reinforcement.)

To gain perspective I find it necessary to compare with other endeavors. My conclusion — body-building has the highest leverage. See also meaningful endeavor(Now)4family: IV^zbs^gym..

Whenever I feel guilty/ashamed of my fixation on IV, and try to learn zbs, localSys, GTD etc,  eventually (often quickly) I hit a broken reinforcement loop and a weak or depleted “energy supply” and invariably give up, very rationally.

Q: is there any /endeavor/ with higher visPgress than IV body-building?

| chores that require absorbency | visPgress #immediate | $ROTI | leverage over long-term, on family well-being |
|---|---|---|---|
| body-building | Yes, in the form of blog+github… | not $$ [1] | reasonably high, given huge $gain and huge t-investment; 😦 higher leverage than everything else combined [2], despite the churn |
| … cf localSys | highly visible | respect, not $$ | necessary but insufficient condition for move-up |
| non-prop investment | easily visible but small | low given the small profit | no leverage so far |
| yoga (+ fitness) | some but hard to keep | $zero | high leverage, well-known |
| diet for BMI | highest but hard to keep | $zero | 😦 low since it’s hard to keep up |

[1] I think many of my peers can’t keep up the body-building effort precisely because there’s no $ROTI… personal projects: any ROI ] salary@@
[2] direct fruits of the endeavor:

  • made my nice TPY home possible
  • enabled me to move between countries
  • made GC possible
  • gave wife the SPR then Singapore citizenship
  • gave me decent salary
  • supported my property investments

ibank c# shops moving to java #Ellen+Sunil

Update — a c# veteran in OC said server-centric systems prefer java, while client-centric systems still prefer c#. The OC c# system was server-centric, so presumably he has witnessed the decline of c# in his own space.

I also confirmed with two Wall Street WPF veterans that browser GUI systems have grown in capabilities and can now emulate WPF. These systems leverage west coast innovations in javascript.

—-

A Singapore banking recruiter shared with me her observation. I had earlier spoken to her a few times and she had earned my respect for her insight.

She said a few Singapore ibank departments were using c# before but now hiring java developers instead. I said c# is a newer language and I have never heard of such a migration. She didn’t address that but said java is apparently more open-source than c#.

I think this is a relatively isolated case. Some time in the past or future you may see java shops moving to c#.

She is very close to Standard Chartered bank (a c++ shop, as she acknowledged) and said SCB is hiring more java developers than before.

She said nowadays just about the only Singapore employer of c++ is HFT. I disagree. SCB, Macq and some ibanks still use some c++.

She said java is now the dominant language for internet companies, to my surprise. She said there’s now improving mobility between java developers in ibanks vs internet shops. I think she meant the big-data java roles, not core-java roles.

She said the SG banking job pool is dominated by java — out of 10 jobs, 9 are java jobs — “crazy” as she said. I guess less than half of those 9 jobs are coreJava.

hands-on dev beats mgr @same pay

BA, project mgr, even mid-level managers in some companies can earn the same 160k salary as a typical “developer” role. For a manager in finance IT, salary is often higher, but for a statistically meaningful comparison I will use a 160k benchmark. Note that in finance IT or tech firms 160k is not high, but on main street many developer positions pay below 160k.

As stated in other blogposts, at the same salary, developers enjoy higher mobility, more choices, higher career security…

jvm heap histogram simple console tool

The simplest among many similar tools. This type of analysis is easier in java than in c++ because jvm manages memory allocations. In fact, GC can even relocate objects.

~$jcmd test GC.class_histogram
58161:

num #instances #bytes class name
----------------------------------------------
 1: 1020 93864 [C
 2: 481 54856 java.lang.Class
 3: 526 26072 [Ljava.lang.Object;
 4: 13 25664 [B
 5: 1008 24192 java.lang.String
 6: 79 5688 java.lang.reflect.Field
 7: 256 4096 java.lang.Integer
 8: 94 3760 java.lang.ref.SoftReference
 9: 91 3712 [I
 10: 111 3552 java.util.Hashtable$Entry
 11: 8 3008 java.lang.Thread
...

REST^SOAP

REST stands for Representational State Transfer … basically, each unique URL is a representation of some object. You can get the contents of that object using an HTTP GET, and use a POST, PUT, or DELETE to modify the object (in practice most of the services use a POST for this).

— soap vs REST (most interviewers probably focus here) —

  • REST has only GET POST PUT DELETE; soap uses custom methods “setAge()” etc
  • SOAP takes more dev effort, despite its name (“Simple” Object Access Protocol)
  • SOAP used to dominate enterprise apps, though XR used REST in ibanks.

–real REST URLs

https://restfulapi.net/resource-naming/ shows some examples. It also says

URIs should not be used to indicate that a CRUD function is performed. URIs should be used to uniquely identify resources and not any action upon them. HTTP request methods should be used to indicate which CRUD function is performed.

HTTP GET http://api.example.com/device-management/managed-devices  //Get all devices
HTTP POST http://api.example.com/device-management/managed-devices  //Create new Device
HTTP GET http://api.example.com/device-management/managed-devices/{id}  //Get device for given Id
HTTP PUT http://api.example.com/device-management/managed-devices/{id}  //Update device for given Id
HTTP DELETE http://api.example.com/device-management/managed-devices/{id}  //Delete device for given Id
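The CRUD-to-method mapping above can be captured in a few lines of Python. This is an illustrative sketch only; rest_request and the base URL mirror the restfulapi.net examples, not any real API client.

```python
# Base URL taken from the restfulapi.net examples above
BASE = "http://api.example.com/device-management/managed-devices"

def rest_request(crud, device_id=None):
    """Map a CRUD intent to (HTTP method, URL).
    The URI names the resource; the method names the action."""
    method = {"create": "POST", "read": "GET",
              "update": "PUT", "delete": "DELETE"}[crud]
    url = BASE if device_id is None else f"{BASE}/{device_id}"
    return method, url

print(rest_request("read"))        # list all devices
print(rest_request("update", 42))  # update one device by id
```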

importance@formal edu4tech career

I tend to dismiss the value of formal education (including degrees) and I tend to overvalue on-the-job quick-n-dirty learning. Compared to Asia, U.S. tech culture is less fixated on formal education.

  • Eg: Some inexperienced developer colleagues seem to have a good grasp of option math
  • Eg: my theoretical knowledge of comp science is completely self-taught, including concurrency, SQL, OO,..
  • eg: A more extreme example of such a domain is DynamicProgramming + Greedy algorithms. An intelligent programmer (regardless of age) can become highly competent without formal training.

Some seemingly theoretical domains, such as OO design, are really grounded in field practice.

Even if you have a solid education, either formally, or on the job over a few focused years, we all face the same challenge of continuing education —

  1. my web dev experience (self-education) is now outdated, according to my interviews at Indeed, ByteDance ..
    • But luckily there are many interviews I can attend and learn from.
  2. data science, machine learning … is not so easy to self-learn
    • luckily there are many online learning resources.


parent/child pairs→tree algos #Indeed

As a speed-coding test, this problem requires you to apply common computer science constructs to a realistic problem, and then challenges you

“How many seconds do you need to build a working directed graph from raw data, and run BFT/DFT on it?”

45 minutes given. Target is to complete first 2 questions with working code. Can some candidates complete all 3 questions? I guess so.

Q: You are given some raw data like

parent_child_pairs = [ (1, 3), (2, 3), (3, 6), (5, 6), (5, 7), (4, 5), (4, 8), (8, 10), (11,2) ]

Suppose we have some input data describing a graph of relationships between parents and children over multiple generations. The data is formatted as a list of (parent, child) pairs, where each individual is assigned a unique integer identifier.

For example, in this diagram, 3 is a child of 1 and 2, and 5 is a child of 4:

  11
   \
1   2   4
 \ /   / \
  3   5   8
   \ / \   \
    6   7   10

Q1: write a function to output all individuals having no parents (like 1 and 4) in one list, and another list of individuals having a single parent (like 8 and 7)

Q2: write a bool function to determine if two named individuals have any common ancestor. 3 and 6 yes; 3 and 1 no!

I wrote a DFT solution .. https://github.com/tiger40490/repo1/blob/py1/py/tree/commonAncestor_Indeed.py Not very efficient, but I really should care less about that since the real challenge is .. timely completion. I was not used to writing DFT on the spot within minutes, but I hacked it together under a ticking clock, first time in my career!

To find if two sets intersect, I was forced to make a quick judgment call to write my own loop.

  • I didn’t know if there’s a simple and reliable solution online
  • i didn’t know how much effort is required to search online and understand it
  • i didn’t know how much effort is required to adapt standard solution to suit my needs
  • My own loop means more legwork but more control if requirements turn out to be non-standard.

Q3: (original wording) Write a function that, for a given individual in our dataset, returns their earliest known ancestor — the one at the farthest distance from the input individual. If there is more than one ancestor tied for “earliest”, return any one of them. If the input individual has no parents, the function should return null (or -1).

Sample input and output:

findEarliestAncestor(parentChildPairs, 8) => 4
findEarliestAncestor(parentChildPairs, 7) => 4
findEarliestAncestor(parentChildPairs, 6) => 11
findEarliestAncestor(parentChildPairs, 1) => null or -1
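Below is a hedged Python sketch of all three questions, traversing breadth-first up the parent links. It is my own reconstruction, not the interview's reference solution; for Q3 it takes the deepest BFS level reachable via parent links, which matches the sample outputs.

```python
from collections import defaultdict, deque

parent_child_pairs = [(1, 3), (2, 3), (3, 6), (5, 6), (5, 7),
                      (4, 5), (4, 8), (8, 10), (11, 2)]

def build(pairs):
    parents = defaultdict(set)  # child -> set of parents
    individuals = set()
    for p, c in pairs:
        parents[c].add(p)
        individuals.update((p, c))
    return parents, individuals

def zero_and_one_parent(pairs):  # Q1
    parents, individuals = build(pairs)
    zero = [i for i in individuals if len(parents[i]) == 0]
    one = [i for i in individuals if len(parents[i]) == 1]
    return zero, one

def ancestors(node, parents):  # BFT up the parent links
    seen, q = set(), deque([node])
    while q:
        for p in parents[q.popleft()]:
            if p not in seen:
                seen.add(p)
                q.append(p)
    return seen

def have_common_ancestor(pairs, a, b):  # Q2
    parents, _ = build(pairs)
    return bool(ancestors(a, parents) & ancestors(b, parents))

def earliest_ancestor(pairs, node):  # Q3: last non-empty BFS level
    parents, _ = build(pairs)
    if not parents[node]:
        return -1
    frontier = seen = set(parents[node])
    last = None
    while frontier:
        last = frontier
        nxt = set()
        for n in frontier:
            nxt |= parents[n] - seen
        seen = seen | nxt
        frontier = nxt
    return next(iter(last))  # any one of the tied "earliest"
```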

Indeed: hearsay algo questions#speedPractice

Q: The first interview consisted of interviews with two separate people. The first asked for a function that takes in a string of words separated by spaces and prints the first duplicate word. The second asked for a function that takes in two sorted unique int arrays and returns an array of the duplicates between the two arrays.

Q: Implement an LRU (least-recently-used) cache. Question was: I have a cache storing just 50 objects; how do you ensure the cache holds 50 unique elements, such that adding a 51st kicks out the least recently used of the 50?

Q: Shunting Yard Algorithm
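Hedged Python sketches of the first two hearsay questions (the shunting-yard algorithm is lengthy and well documented elsewhere). The LRU cache leans on collections.OrderedDict; the function and class names are my own.

```python
from collections import OrderedDict

def first_duplicate_word(s):
    """First word that repeats in a space-separated string, else None."""
    seen = set()
    for w in s.split():
        if w in seen:
            return w
        seen.add(w)
    return None

def sorted_intersection(a, b):
    """Duplicates between two sorted unique int arrays, via two pointers."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

class LRUCache:
    """Capacity-bounded cache; inserting beyond capacity evicts the LRU."""
    def __init__(self, capacity=50):
        self.cap = capacity
        self.od = OrderedDict()  # insertion/access order = recency order

    def get(self, k):
        if k not in self.od:
            return None
        self.od.move_to_end(k)  # mark as most recently used
        return self.od[k]

    def put(self, k, v):
        if k in self.od:
            self.od.move_to_end(k)
        self.od[k] = v
        if len(self.od) > self.cap:
            self.od.popitem(last=False)  # evict least recently used
```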

containers+MSA: cloud-affinity

I feel both the MSA architecture and the low level container technology can have a long-term impact thanks to cloud.

The implementations might be replaced by newer implementations in a few years, but some of the ideas may live on. Now is still the volatile early phase… ##[17] proliferation → consolidation.. beware of churn

Both of them are easy to launch on elastic cloud

Both of them can be started on multiple machines as a wolf pack to handle a bigger workload.

22 notable features added to c++

https://en.cppreference.com/w/cpp/language/history briefly mentions

  • [90] exception handling
  • [90] templates
  • [98] cast operators
  • [98] dynamic_cast and typeid()
  • [98] covariant return type
  • [07] boost ref wrapper .. see std::reference_wrapper
  • [11] GarbageCollector interface .. See c++ GC interface
  • [11] std::next(), prev(), std::begin(), std::end() .. see favor std::begin(arrayOrContainer)
  • [11] exception_ptr? not sure how useful
  • [14] shared_lock — a RW lock
  • [14] shared_timed_mutex .. see try_lock: since pthreads
  • [14] std::exchange — comparable to std::swap() but doesn’t offer the atomicity of std::atomic_exchange()

cpu sharing among Docker container for jvm

Note cgroup is also usable beyond jvm and Docker, but i will just focus on jvm running in a Docker container..

Based on https://jaxenter.com/nobody-puts-java-container-139373.html

CPU shares are the default CPU sharing mechanism and basically provide a priority weighting across all cpu time slots across all cores.

The default weight value of any process is 1024, so if you start a container as follows q[ docker run -it --rm -c 512 stress ] it will receive fewer CPU cycles than a default process/container.

But how many cycles exactly? That depends on the overall set of processes running at that node. Let us consider two cgroups A and B.

sudo cgcreate -g cpu:A
sudo cgcreate -g cpu:B
sudo cgset -r cpu.shares=768 A   # cgroup A: 75%
sudo cgset -r cpu.shares=256 B   # cgroup B: 25%

Cgroup A has CPU shares of 768 and the other has 256. That means that if nothing else is running on the system, A is going to receive 75% of the CPU time and B will receive the remaining 25%.

If we remove cgroup A, then cgroup B would end up receiving 100% of CPU shares.
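The weighting arithmetic is simply each cgroup's cpu.shares over the total of all competing cgroups. A tiny illustrative helper (my own, not part of any cgroup tooling):

```python
def cpu_fraction(my_shares, all_shares):
    """Expected CPU fraction of one busy cgroup when all listed cgroups
    are busy: its cpu.shares weight divided by the total weight."""
    return my_shares / sum(all_shares)

# The A=768 / B=256 example from the text:
assert cpu_fraction(768, [768, 256]) == 0.75
assert cpu_fraction(256, [768, 256]) == 0.25
# Remove cgroup A, and B gets 100%:
assert cpu_fraction(256, [256]) == 1.0
```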

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/resource_management_guide/sec-cpu has more precise details.

https://scoutapp.com/blog/restricting-process-cpu-usage-using-nice-cpulimit-and-cgroups compares q(nice), cpulimit and cgroups. It provides more precise info on cpu.shares.

cpulimit can be used on an existing PID 1234:

cpulimit -l 50 -p 1234 # limit process 1234 to 50% of cpu timeslots. The remaining cpu timeslots can go to other processes or go to waste.

Leetcode speed-coding contest #Rahul

  • don’t look at ranking
  • yoga — I CAN keep up this practice. This practice is good for my mental health and family well-being
  • yoga — I feel even if I don’t improve visibly, the fact that my participation count is increasing means I’m improving
  • if I don’t do the contest then I may not do any coding drill at all
  • What if I give up after one or a few episodes?
  • impact on family well-being?
  • leverage over long term?

In [[who moved my cheese]], we discover the world has changed. The harsh reality is, in this profession, your experience (unlike a doctor’s) is heavily discounted. Your current runtime performance is easily benchmarked, just like a painter’s, pianist’s, or chef’s.

G9 workhorse-algos ] speedCoding #XR

XR said we should take speed coding contests to find out what basic algos are needed.

Opening example — parent/child pairs→tree algos #Indeed is one example of realistic problem that requires common comp-science constructs…

Need to memorize enough to implement on the spot, as these are basic “comp science constructs” needed for “realistic problems”

  • dft/bft, with levels
  • BST find predecessor
  • Binary search in array
  • Merge-sort/insertion-sort/partitioning
  • .. realistic string problems:
  • .. realistic int array problems:
  • max profit; max subarray sum
  • .. realistic matrix problems:

Below are basic algos for hackerrank/codility but NOT applicable to realistic problems typical of Indeed/FB

  • Linked list: remove dupes
  • string: collapse consecutive chars
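A few of the workhorse constructs above, sketched in Python as memorization targets. These are my own minimal versions for on-the-spot recall, not canonical library code:

```python
def bft_levels(root, children):
    """Breadth-first traversal returning nodes grouped by level.
    `children` is an adjacency dict: node -> list of child nodes."""
    levels, frontier, seen = [], [root], {root}
    while frontier:
        levels.append(frontier)
        nxt = []
        for n in frontier:
            for c in children.get(n, []):
                if c not in seen:
                    seen.add(c)
                    nxt.append(c)
        frontier = nxt
    return levels

def binary_search(arr, target):
    """Classic binary search in a sorted array; -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def max_subarray_sum(nums):
    """Kadane's algorithm for the max-subarray-sum problem."""
    best = cur = nums[0]
    for x in nums[1:]:
        cur = max(x, cur + x)   # extend the run, or restart at x
        best = max(best, cur)
    return best
```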

##Java9 features #fewer than java8

  1. #1 Most important – modular jars featuring declarative module-descriptors i.e. requires and exports
  2. #2 linux cgroup support.. For one example, see Docker/java9 cpu isolation/affinity
  3. #3 G1 becoming default JGC.. CMS JGC: deprecated in java9
  4. REPL JShell
  5. private interface methods, either static or non-static
  6. Minor: C++11 style collection factory methods like

List<String> strings = List.of("first", "second");


It’s unbelievable but not uncommon in Java history —

  • Java9 release introduced significantly fewer and less impactful features than java8.
  • Similarly, java5 overshadows java6 and java7 combined


jvm spawns(approx)32 JGC threads on 32-core

Based on https://jaxenter.com/nobody-puts-java-container-139373.html

jvm would spawn 32 GC threads on a 32-core box [1].  As of 2018, this is the default, but you can change it with jvm parameters like -XX:ParallelGCThreads and -XX:ConcGCThreads

Java 9 introduced automatic detection of cpu-set when jvm runs in a cgroup (such as a docker container) with a cpu-set. For example, JVM detects there are 3 cores in the cpu-set, and spawns 3 GC threads.

[1] presumably for the parallel GC or the CMS GC but i’m not sure. https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/parallel.html and https://blog.codecentric.de/en/2013/01/useful-jvm-flags-part-6-throughput-collector/ says for the parallelGC the default would be 5/8*32 + 3 = 23 threads.
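The cited default can be expressed as a one-liner. This mirrors the formula in the linked articles (one thread per core up to 8 cores, then 5/8 of the remaining cores, which is algebraically the 5/8*N + 3 in the text); treat it as an approximation of HotSpot's actual ergonomics, which vary by version.

```python
def default_parallel_gc_threads(ncpus):
    """Approximate default ParallelGCThreads per the cited articles:
    ncpus when ncpus <= 8, else 8 + 5/8 of the remaining cores."""
    if ncpus <= 8:
        return ncpus
    return int(8 + (ncpus - 8) * 5 / 8)

print(default_parallel_gc_threads(32))  # the 32-core example: 23 threads
```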

wasm: a distant threat to javascript

https://medium.com/javascript-scene/what-is-webassembly-the-dawn-of-a-new-era-61256ec5a8f6 is a good WebAssembly intro.

  • wasm is an in-browser language, like javascript
  • wasm offers low-level (web-ASSEMBLY) programming constructs to complement javascript
  • I feel wasm will only be considered by extreme-performance browser apps. It’s too low-level, too inconvenient to be popular.
  • How much market share will javascript lose to wasm? My guess: 0.1% to 0.5%
  • The fact that wasm is cited as the main challenger to javascript means javascript is unchallenged as the dominant choice on the browser

##[19] cited strengths@java

In this post we compare c++, python, javascript, c#

  • [G3] Scalability and performance [1] – James Governor has a saying: “When web companies grow up, they become Java shops”. Java is built with scalability in mind, which is why it is so popular among enterprises and scaling startups. Twitter moved from Ruby to Java for scaling purposes.
  • [G9] community support [1] such as stackoverflow —
  • [G9] portability [1] — on Linux and Android.
  • [G9] versatile — for web, batch jobs, and server-side. MS is a java shop, using java for trading, but DDAM is not a typical MS app. DDAM has many batch jobs, but the UI is all in web java.
    • python and c++ are also versatile
  • [G5] Java has high correlation with fashionable technologies — hadoop; cloud; big data; microservices… Python and javascript are also in the league.
  • [G3] proven —
    • web apps are the biggest market segment. Some (js/php/ruby) of the top 10 most popular languages are used exclusively for web. Java is more proven than c#, python, c++.
    • enterprise apps (complex business logic + DB) are my primary focus. java is more proven than python, javascript, php, c#
  • [G3=a top-3 strength]

[1] https://stackify.com/popular-programming-languages-2018/ explains Java’s popularity


scaffolding around try{} block #noexcept

Updates — Stroustrup has more to say about noexcept and the scaffolding in http://www.stroustrup.com/C++11FAQ.html#noexcept

[[ARM]] P358 says that all local non-static objects on the current call stack fully constructed since start of the try-block are “registered” for stack unwinding. The registration is fine-grained in the form of partial destruction —

  • for any array with 3 out of 9 objects fully constructed, the stack unwinding would only destruct those 3
  • for a half constructed composite object with sub-objects, all constructed sub-objects will be destructed
  • Any half-constructed object is not registered since the dtor would be unsafe.

I guess this registration is an overhead at run time.

For the local non-static objects inside a noexcept function, this “registration” is not required, so the compiler may or may not call their destructors.


long-term value: QQ imt ECT speed

The alpha geeks — authors, experts, open source contributors … are they fast enough to win those coding contests?

The speed-coding contest winners … are they powerful, influential, innovative, creative, insightful? Their IQ? Not necessarily high, but they are nobody if not superfast.

The QQ knowledge is, by definition, not needed on projects, usually obscure, deep, theoretical or advanced technical knowledge. As such, QQ knowledge has some value, arguably more than the ECT speed.

Some say a self-respecting programmer need some of this QQ knowledge.

RAII phrasebook

See [[ARM]] P358 and [[Essential C++]] P199.

  • local — local nonstatic object required. See [[ARM]]
  • dtor — is required.
  • stack unwinding — either by exception or normal return. Note noexcept may skip stack unwinding.
  • partial destruction — see other blog posts
  • scaffolding — see other blog posts
  • exception guarantee — RAII is the only exception guarantee
  • exception strategy — RAII is the best exception strategy
  • double-exception — what if an unhandled exception triggers unwinding but en-route a new exception is born? No good strategy.
  • == for memory management .. RAII is the #1 most important memory management technique.
  • memory leak prevention
  • smart ptr — example of RAII for memory management.

##tech skill superficial exposure: 5 eg

see also exposure: semi-automatic(shallow)Accu #$valuable context and low-complexity topics #JGC..

  • –(defining?) features (actually limitations) of superficial exposure
  • accu — (except bash, sql) I would say very few of the items below offer any accu comparable to my core-c++ and core-java accu
  • entry barrier — (except SQL) is not created since you didn’t gain some real edge
  • commodity skill —
  • traction — spinning the wheel
  1. JGC — i think most guys have only textbook knowledge, no GTD knowledge.
  2. lockfree — most guys have only textbook knowledge
  3. [e] spring, tibrv — xp: I used it many times and read many books but no breakthrough. In-depth knowledge is never quizzed
  4. bash scripting — xp: i read books for years but only gained some traction by writing many non-trivial bash scripts
  5. SQL — xp: 5 years typical experience is less than 1Y@PWM
  6. –other examples (beware oth)
  7. [e] EJB, JPA (hibernate), JMS, Hadoop(?)
  8. memory profilers; leak detectors
  9. design patterns — really just for show.
  10. Ajax
  11. [e] Gemfire
  12. java latency tuning — is an art not a science.
    • Poor portability. JVM tuning is mostly about .. GC, but at the application level the real problem is usually I/O such as data store + serialization. Sometimes data structure + threading also matter, yet GC receives disproportionate limelight.
    • conclusion — superficial, textbook knowledge in this skillset probably won’t help with latency tuning.
  13. [e=ecosystem, i.e. add-on packages outside the core language.] Ecosystem skills rarely enable deep learning and traction, as employers don’t need deep insight. GTD is all they need.