[19] zbs cf to QQ+GTD #compiler+syntax expertise

Why bother — I spend a lot of time accumulating zbs, in addition to QQ halos and localSys GTD

I have t_zbs99 and other categories/tags on my blogposts showcasing zbs (真本事/real expertise) across languages. Important to recognize the relative insignificance of zbs

  • #1 QQ — goal is career mobility. See the halo* tags. However, I often feel fake about these QQ halos.
  • #2 GTD — ( localSys or external tools …) goal is PIP, stigma, helping colleagues. Basic skill to Make the damn thing work. LG2 : quality, code smell, maintainability etc
  • #3 zbs — goal is self-esteem, respect, curiosity, expertise and “expert” status. By definition, zbs knowledge pearls are often not needed for GTD. In other words zbs is “Deeper expertise than basic GTD”. Scope is inherently vague but..
    • Sometimes I can convert zbs knowledge pearls to QQ halos, but the chance is lower than I wish, so I often find myself overspending on zbs. Therefore I consider the zbs topics a distant Number 3.
    • Zbs (beyond GTD) is required for architects, lead developers and decision makers.
    • Yi Hai likes to talk about self-respect
    • curiosity -> expertise — Dong Qihao talked about curiosity. Venkat said he was curious about many tech topics and built up his impressive knowledge.

I also have blog categories on (mostly c++) bulderQuirks + syntax tricks. These knowledge pearls fall under GTD or zbs.

Q: As I grow older and wealthier (Fuller wealth), do I have more reason to pursue my self-esteem, respect, curiosity etc?
A (as of 2020): no. My wealth is not so high, and I still feel a lot of insecurity about career longevity. QQ and GTD remain far more important.

demanding mental power: localSys imt QQ for me-only

I told grandpa that I have noticed signs that my mental power is declining, mostly in localSys and local codebase absorbency. However, localSys has always been a weakness in me. The decline, if any, is very gradual.

QQ learning is still in good progress. Experience, accumulation, deepening.. thick->thin.

In contrast, I suspect that many peers find QQ more demanding than localSys.

Note A.Brooks’ article talks about creativity, innovation, analytical … My issue is mostly memory capacity and speed. Recall the Indeed interview.

 

##[19]if I were2re-select%%tsn bets #tbudget=30m

Another post, ##[17] checklist@tsn #engaging, provides a checklist, but here’s a FRESH look, which may be less in-depth, balanced, and comprehensive.

Q: Hypothetically, knowing what I know now, besides c++ and python, which trySomethingNew bets would I have chosen in 2010, after GS?

  1. mkt data and sockets
  2. forex, equities, options, IRS — I like the well-defined body of dnlg as entry-barrier. I learned it fast 🙂
  3. trading engine dev including pricing, OMS, connectivity
  4. risk analytics?
  5. devops?
  6. big data including hadoop, cloud etc?
  7. — not-so-great
  8. c# — heavy investment, a lot of legwork but insufficient ROTI
  9. MOM and Gemfire — shrinking demand
  10. swing? Fun but very poor job market
  11. quantDev? extremely disappointing job market
  12. HFT? entry barrier too high

highest leverage: localSys^4beatFronts #short-term

Q: For the 2018 landscape, what t-investments promise the highest leverage and impact?

  1. delivery on projects + local sys know-how
  2. pure algo (no real coding) — probably has the highest leverage over the mid-term (like 1-5Y)
  3. QQ
  4. –LG2
  5. portable GTD+zbs irrelevant for IV
  6. obscure QQ topics
  7. ECT+syntax — big room for improvement for timed IDE tests only, not relevant to web2.0 onsite interviews.
  8. Best practices — room for improvement for weekend IDE tests only, not relevant to web2.0 shops.

##[18]orgro lens:which past accu proved long-term # !!quant

(There’s a recoll on this accumulation lens concept…. )

This post is Not focused on IV or GTD. More like zbs.

Holy grail is orgro, thin->thick->thin…, but most of my endeavors fell short. I have no choice but to keep shifting focus. A focus on apache+mysql+php+javascript would have left me with rather few options.

  • —-hall of famers
  • 1) [T] data structure theory + implementation in java, STL, c# for IV — unneeded in projects
  • 2) [CRT] core java knowledge including java OO has seen rather low churn,
    • comparable to c++
    • much better than j2EE and c#
  • 3) [T] threading? Yes insight and essential techniques. Only for interviews. C# is adding to the churn.
  • 4) [j] java/c++/c# instrumentation using various tools. Essential for real projects and indirectly helps interviews
  • [C] core C++ knowledge
  • [C] GTD knowledge in perl/python/sh scripting
  • [j] google-style algo quiz — Only for high-end tech interviews. Unneeded in any project
  • [R] SQL? yes but not a tier one skill like c++ or c#
  • coding IV — improved a lot at RTS
  • ————————also-ran :
  • devops
  • [C] personal productivity scripts
  • [T] probability IV
  • [C] regex – needed for many coding interviews and real projects
  • [C] low level C skills@RTS {static; array; cStr; reinterpret_cast;  enum; typedef; namespace; memcpy}
  • [!T] bond math? Not really my chosen direction, so no serious investment
  • [!T] option math?
  • SQL tuning? not much demand in the trading interviews, but better in other interviews
  • [R] Unix — power-user GTD skills.. instrumentation, automation? frequently used but only occasionally quizzed
  • [R] Excel + VBA? Not my chosen direction
  • [jR !D] JGC +jvm tuning

–strengths (legend for the letter codes above)
C = churn rate is comfortable
D = has depth, can accumulate
R = robust demand
T = thin->thick->thin achieved
j|J = relevant|important to job hunting
A leading “!” negates the letter, as in [!T].

de-multiplex by-destPort: UDP ok but insufficient for TCP

When people ask me what is the purpose of the port number in networking, I used to say that it helps demultiplex. Now I know that’s true for UDP but TCP uses more than the destination port number.

Background — Two processes X and Y on a single-IP machine  need to maintain two private, independent ssh sessions. The incoming packets need to be directed to the correct process, based on the port numbers of X and Y… or is it?

If X is sshd with a listening socket on port 22, and Y is a forked child process from accept(), then Y’s “worker socket” also has local port 22. That’s why in our linux server, I see many ssh sockets where the local ip:port pairs are indistinguishable.

TCP demultiplex uses not only the local ip:port, but also remote (i.e. source) ip:port. Demultiplex also considers wild cards.

                                    TCP                         UDP
socket has local ip:port            yes                         yes
socket has remote ip:port           yes                         no such thing
2 sockets with same local port 22
.. living in two processes          allowed (sshd + worker)     not allowed
.. living in one process            allowed                     not allowed
2 msgs with same dest ip:port
but different source ports          addressed to 2 sockets      addressed to the
                                    (2 ssh sessions)            same socket
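The sshd scenario above can be reproduced in a few lines. This is my own demo (using an ephemeral loopback port as a stand-in for port 22): the kernel hands both connections to sockets with the same local ip:port, distinguishable only by the remote endpoint.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))     # ephemeral port, standing in for port 22
listener.listen(2)
port = listener.getsockname()[1]

c1 = socket.create_connection(("127.0.0.1", port))
w1, peer1 = listener.accept()       # worker socket for client 1
c2 = socket.create_connection(("127.0.0.1", port))
w2, peer2 = listener.accept()       # worker socket for client 2

# both worker sockets share the same local port, like the many port-22 sshd sockets
local1, local2 = w1.getsockname()[1], w2.getsockname()[1]
# ...yet TCP demultiplexes them by the differing remote (source) ports in peer1/peer2
for s in (c1, c2, w1, w2, listener):
    s.close()
```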

## retirement disposable time usage

See also my framework: Chore^Pleasure activities

  • exercise in the park everyday .. like grandma
  • reflective blogging — likely to be a big time-killer
  • reading as a pastime? GP said at his age, he still loves reading and has many good books at home, but has insufficient physical energy
  • sight-seeing, burning your cash reserve? Grandpa said he is physically unable to
  • — now the more productive endeavors:
  • volunteering for a worthy cause?
  • helping out as grandparents
  • ! … semi-retirement is clearly superior as I would have a real occupation with a commitment and a fixed work schedule

Grandpa pointed out that there are actually bigger factors than finding things to do

  1. cash flow
  2. health

FIX tag#54

  • In a NOS (i.e. NewOrderSingle), tag 54 (Side) = buy/sell;
  • in a fill report, Side = bought/sold, i.e. past tense.

For spread order (rather than NOS), #54 may not have a simple value like buy or sell. The #624 (LegSide) tag is often populated to be specific.

In one system I know, #54 and #624 are both populated with value of 1(Buy).

socket API !=designed4microsecond response2signals

I think signals are popular and proven tools for socket programmers.

When a signal is sent to a socket program, the response is not within a few clock cycles. I think the receiving program must notice it and react to it.

I think if faster response is required, then the “event” should be delivered via the socket itself. The program would pick it up synchronously.

technology churn on WallSt^internet

my perception of churn has been fundamentally shaped by two phases in my professional experiences — 1) pre-2007 internet career in SG and 2) post-2007 WallSt career in US/SG.

The technologies in the internet space have faster churn.

A big chunk of WallSt infrastructure is browser-based, intranet or internet. They experience similar levels of churn. Jxee is affected. Remember the PWM servlet-based framework?

Another chunk of WallSt infrastructure is low churn, including coreJava, c++, SQL, MOM, FIX, sockets, batch jobs

By the way, Linux is dominant on WallSt and in the broader internet ecosystem.

UI technology is usually high-churn. Increasingly browser-based.

high-paying sectors: wife’s envy #oil,banking.. #w1r1

My wife told me, “My colleague’s husband is in the oil industry, and it looks like everyone in the oil industry makes a comfortable living.” This question doesn’t require inside knowledge, but it does require analytical skills, and looking beyond the mass-media propaganda.

  • — Some poster-child industries, half-ranked by my familiarity (not necessarily level of “confidence” in my assessment)
  • [v] quants
  • [v !w] traders
  • [s !w] web2.0 + e-commerce tech professionals but only the few top tier firms
  • [v] technology MNC firms
  • [s] banking (commercial or investment banking)
  • [sv] IC design
  • [sw] medical, but only if you are a full doctor. Some would say “only specialists”
  • [] casino .. high profit margin but I don’t think the employees are highly paid
  • [sv] oil sector, but only if your job is in the lucrative subsectors, not a gas station employee.
  • [sw] academia in top institutes

Now some key determinants of the total income level over 10Y+

— [w=long window of peak earning years] For example, sports and show-biz stars have a relatively short window of peak earning years.
— [s=specialized skills as moat]. Even if a sector enjoys high profit margin, only the moat-protected specializations would command a premium. This is esp. significant in the medical, energy, banking sectors.

— [v= volatility of profitability over 10Y+]
* some HFT and AI/ML players may fall off the leading pack.
* the energy industry experiences boom-n-bust cycles, but some players are resilient.
* tech giants like IBM, HP, Motorola, Nokia, Nortel, Sybase, BEA, Yahoo.. face constant challenges and are often acquired.

git | diff | make file6 appear before file2

The pain — a big commit touches on too many files, 3 of them major changes, while 5 files have cosmetic changes. In git-diff or git-difftool, I want to focus on the 3 major files.

Some developers would want to split the big commit into a “major” and a “cosmetic” commit, but one of the snapshots would be half-cooked and unusable. In such a case, the commit comment would warn “This commit is part of a bigger change. Please do not build this commit alone”.

As an alternative we can make the 3 major files show up first (or last) in diff. You can put the content below in src/diffOrder.txt and use it as instructed. All unspecified files would show up in natural order.

The line-ending issue is the most common mistake with -O !

# git diff -O src/diffOrder.txt
# check EACH line below to ensure unix line ending, 
# even if entire file is unix-style.
# wildcards are needed below.

*/ProductCacheUtil*
*/CSV*
*/FutureDataCacheUtil.java
*/RJOBRIENDROPTools.java
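Given the line-ending warning above, a tiny sanity check can catch a CRLF sneaking into the orderfile. This helper is my own (not a git feature):

```python
# My own sanity-check helper (not part of git): verify an orderfile for
# git diff -O uses unix line endings on every line.
def has_unix_line_endings(text: str) -> bool:
    return "\r" not in text

sample_good = "*/ProductCacheUtil*\n*/CSV*\n"
sample_bad = "*/ProductCacheUtil*\r\n*/CSV*\n"   # one CRLF line can break the -O pattern match
assert has_unix_line_endings(sample_good)
assert not has_unix_line_endings(sample_bad)
```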

bigO ^ most@the-time-in-practice

I believe quicksort is actually faster than mergesort (and most other sorts) most of the time in practice, but quicksort performance can degrade badly on rare input datasets. I think this is one of the reasons mergesort is often chosen by library authors: for “predictable performance”, for safety. In the worst case, mergesort is O(N logN) vs O(N²) for quicksort.
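To illustrate the point, here is a quick comparison-count experiment (my own sketch, using the naive first-element-pivot variant; real library quicksorts mitigate this with better pivot selection): on an already-sorted input the naive quicksort does the full O(N²) work.

```python
import random

def quicksort_comparisons(arr):
    """Count comparisons made by naive first-element-pivot quicksort."""
    if len(arr) <= 1:
        return 0
    pivot = arr[0]
    rest = arr[1:]
    less = [x for x in rest if x < pivot]
    more = [x for x in rest if x >= pivot]
    return len(rest) + quicksort_comparisons(less) + quicksort_comparisons(more)

n = 200
sorted_input = list(range(n))          # the rare "bad" dataset for this pivot choice
random.seed(0)
shuffled = random.sample(range(n), n)  # a typical input

worst = quicksort_comparisons(sorted_input)   # n*(n-1)/2 comparisons, i.e. O(N^2)
typical = quicksort_comparisons(shuffled)     # roughly N logN comparisons
assert worst == n * (n - 1) // 2
assert typical < worst // 4
```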

Analog: Consider vaccine testing. One in several thousand volunteers  develops severe reactions, and the vaccine would be disqualified unless the underlying health condition can be identified and isolated.

https://github.com/tiger40490/repo1/blob/py1/py/algo_str/shortestPerfectDivisor.py is possibly a similar example.

D:=length of a candidate divisor
H:=length of the haystack
S:=length of the incremental suffix appended to the last profiled candidate

The simple solution runs slowCheck on every candidate (every left-substring up to the midpoint of the haystack). In other words, try the first k chars of the haystack as a candidate, for each k. So I would say O(H²) overall.

Also, the winning shortest divisor itself could be lengthy. So you may need to try a lot of shorter candidates before getting lucky.

The frequency table quickCheck() will quickly disqualify 99% of the bad candidates, so I expect a huge performance gain in practice. However, for a “qualified” candidate, we have to run a O(H) slowCheck() to be sure. So overall my bigO is similar to the simple solution i.e. no improvement 😦

So in_practice vs in_theory bigO might be vastly different.

— I actually implemented an incremental profile construction. Hopefully, the amortized profile-construction cost across all candidates is O(1).

SlowCheck is O(H).

Each time quickCheck takes O(S) to construct a new profile based on the last profiled candidate, and may need a O(26) loop to check the frequency tables. Therefore quickCheck is O(S + 26).

There are up to H/2 candidates, so aggregate cost is O(sum(S) + 26*H/2) = O(H), because sum of S is H/2 exactly.

Now we need some estimate of the hit/miss ratio. A miss is a quickCheck pass but a slowCheck fail, assumed to be rare. A hit is a pass in both quickCheck and slowCheck, and program exits after first hit.

  • If we pessimistically estimate a lot of misses before one hit, then we could run many slowChecks, and end up with O(H²), no better than the simple solution.
  • If we optimistically estimate there’s an upper limit to the number of misses, then total slowCheck cost is O(H), so aggregate cost is O(H+D)=O(H)
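The quickCheck/slowCheck idea above can be sketched as below. This is my own reconstruction, not the github code, and the names are my own: find the shortest prefix that, repeated, reproduces the haystack.

```python
from collections import Counter

def shortest_perfect_divisor(haystack: str) -> str:
    H = len(haystack)
    full_profile = Counter(haystack)
    cand_profile = Counter()
    for k in range(1, H // 2 + 1):
        cand_profile[haystack[k - 1]] += 1    # incremental O(S) profile update
        if H % k:                              # length must divide H
            continue
        reps = H // k
        # quickCheck: O(26)-style frequency-table comparison
        if any(cand_profile[c] * reps != full_profile[c] for c in full_profile):
            continue
        # slowCheck: O(H) verification of a "qualified" candidate
        if haystack[:k] * reps == haystack:
            return haystack[:k]
    return haystack    # the haystack is always a trivial divisor of itself

assert shortest_perfect_divisor("ababab") == "ab"
assert shortest_perfect_divisor("abcabcabc") == "abc"
assert shortest_perfect_divisor("abcab") == "abcab"
```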

interpreting git diff output #numbers

— based on https://stackoverflow.com/questions/2529441/how-to-read-the-output-from-git-diff

@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv, ...

It is in the format @@ from-file-range to-file-range @@ [header]. The from-file-range is in the form -<start line>,<number of lines>, and to-file-range is +<start line>,<number of lines>. Start-line and number-of-lines refer to the position and length of the hunk in the preimage and postimage, respectively. If the number-of-lines is omitted, it defaults to 1.

The num-of-lines in the preimage can consist entirely of context lines. This is the case when the change is a pure insert.

The num-of-lines in the postimage includes context lines too.
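The format above can be captured by a tiny parser. This is my own illustration; note that in the unified-diff format an omitted count defaults to 1:

```python
import re

def parse_hunk_header(line):
    """Parse '@@ -start,count +start,count @@ header' into a 5-tuple."""
    m = re.match(r"@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)", line)
    if m is None:
        return None
    pre_start, pre_len, post_start, post_len, header = m.groups()
    to_int = lambda s: int(s) if s is not None else 1  # omitted count means 1
    return (int(pre_start), to_int(pre_len),
            int(post_start), to_int(post_len), header)

hunk = parse_hunk_header("@@ -18,6 +19,8 @@ int cmd_http_fetch(int argc, const char **argv)")
```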

sending unsupported FIX tag on an algo order

EPA (End Point Adapter) is a generic name for a buy-side FIX gateway, responsible for crafting FIX orders and transmitting them to a FIX liquidity venue. A venue could be an ECN, a pass-through broker, or a private liquidity venue in an ibank…

Some brokers declare that “Orders sent that include Algorithmic parameters that are not supported for that strategy will be rejected.” This warning relates directly to our earlier discussion “validation based on algo name”. Our conclusion: the EPA should pass through any supported or unsupported params to the broker, and let the broker decide to reject or ignore.

The alternative is an EPA internal reject. Based on my experience, inviting a broker reject is no worse than an internal reject. Broker reject is usually preferred. When the EPA rejects, there’s always a lingering doubt: “Are we sure this unsupported tag would be rejected by the broker? If not 100% certain, then why don’t we try placing the order? Maybe the broker would accept the order even though they say they would reject.”

In other words, EPA internal reject is premature pessimism.

— validation based on algo name

For example, if tag 8081 is unsupported for the algo in my order, then I believe it’s usually harmless to pass it through to broker. Unsupported means “irrelevant” or “inapplicable” therefore no effect whatsoever. Conversely, If this #8081 had any side effect on this particular algo, then the side-effect would have been documented for this algo, and become a relevant tag for this algo.

So I think it’s harmless to pass through to broker. Whether broker ignores this unsupported tag or rejects, it’s anyway no worse than EPA implementing the same ignoring/rejecting.

conEmu tab-bar customize

— Problem: current active tab is indistinguishable from other tabs. https://github.com/Maximus5/ConEmu/issues/552 describes one solution using “tab templates”. Here’s my own adaptation of this solution:

%m ★ m %s%m ★ m

Explanation: The q[ %m ★ m ] inserts a star iFF the host tab is the active tab.

— Problem: tab label should show only dir name and hostname without noisy clutter.

My solution — use SkipWordsFromTitle feature, with pipe-separated list of bad words

MINGW64:/N/repos/|MI_EPA/|.mlp.com – PuTTY

Macq2015: I decided to ask for no raise

This retrospective aims to shed some light on my sentiments

  • background: at that time, my visible burn rate was around 6k/Mo, so my  brbr was more than 2.0
  • my fear: bonus stigma. I was driven to do everything I could to avoid another traumatic experience. With low base, it was easier for manager to pay me some bonus.
  • .. now I feel I can be stronger in my psychological defense. Am a weather-proof adult, not a young graduate.
  • my hope: low base (160 vs 200) would make me less expensive, and therefore lighten the weight of expectation on me. I really wanted an easy job like a contract job.

##some defectors among techies #w1r2

Provided I limit the tcost, it can be useful to spend some time comparing some of the “peers” who switched from tech to other careers (mostly in finance).

— career longevity and age-discrimination
lifetime income:
Let’s first look at the tech->trading switch. I think a trading career is not always better than a tech career. I think some of these defectors probably enjoy higher lifetime income, but often with higher stress.

So if you successfully control family burn rate (including college+med), then the additional lifetime income may not be so important. Consider the FIRE leading lights.

In “net asset imt I need this life time”, I argued that some defectors can lose the dream, the cash-flow high ground of “more money than I need in this life time”.

— absorbency: I feel most of these individuals don’t have my level of absorbency for dry technical topics or lifelong learning, including coding drill. My dev-till-70 and interview strategy is uncommon. I can imagine that within a few years of their tech career, many of them realized they actually prefer more “business-facing” roles like BA, front office or middle office, rather than deep technical roles.

Some of them can write code but probably don’t envision themselves doing that full time for 50Y. Remember Ken Li’s comment.

age-65 #demanding           moved into            who                        relevance to me
demanding field             biz                   He #Barc                   unfamiliar
no clue                     biz                   CCC team lead              unfamiliar
no precedence; demanding    desk quant            Derek Li, Eric Zhu #Barc   unfamiliar
yet shrinking field 😦
demanding field             trading@@             Nick #Barc                 10Y younger
demanding field             trading+dev@@         Wang @Dymon                10Y younger, unfamiliar
demanding field             PM<-researcher        Jack Phung                 10Y younger
                            risk quant<-BA<-dev   Bertrand #OC               younger
Kun felt confident          risk quant            Kun.h                      10Y younger
unsure                      risk mgmt             Tomas #Macq
Alex shows wisdom           localSys expert       Alex V #MS                 older, unfamiliar
I have more confidence      localSys expert [1]   Siddesh #B@A               younger
in Siddesh

[1] I feel his quant knowledge is mostly localSys.

## WallSt contract career: drawbacks=valuable #double-edged sword

Compared to web2.0, HFT, or VP roles in ibanks, the WallSt contract career has several key drawbacks or “bad smells”, half-ranked by familiarity

  • smell: progressively shrinking job pool
  • smell: overall job security. Need to keep interviewing
  • smell: no leaves
  • smell: no accumulation in localSys value-add, in brank
  • smell: no career growth in current or similar firms
  • smell: outdated, non-mainstream technology compared to the web shops
  • salary is better than main street, yet uninspiring compared to web2.0
  • no bonus,
  • smell: no health insurance, so total comp probably pales against perm jobs
  • no limelight

No wonder very few bright developers perceive it as more than a temporary solution. Remember Gerald Robinson.

However, these drawbacks are an extremely valuable feature in my view. They underpin my dev-till-70 plan.

The smells are a double-edged sword. They put off many strong competitors, but I’m used to them.

Deepak singled out the technical writer (Gordon) career…

isRisingAfterRemoving1() #AshS

Q: For an int array, design a boolean function to determine whether the array can be made strictly ascending by removing at most one element. In other words,

if it’s possible to “clean up” the int array using 0 or 1 deletion, then return true.
if it’s impossible then return false.

Same problem: https://stackoverflow.com/questions/44276036/java-program-testing-for-a-strictly-increasing-array-after-removing-at-most-one
Same problem: https://stackoverflow.com/questions/44914456/how-to-find-if-an-array-is-strictly-in-an-increasing-order-after-omitting-only-o/44914648

==== analysis: I like this simply-defined, well-understood challenge.

— O(N) SolA: try removing the offender (cc) and try removing the prev (bb).
Scan from left end. When we encounter the first offender cc, we try two “kills” and ensure there’s no more offender to the right.

  1. try killing cc .. check the subarray after truncating all before bb
  2. try killing bb .. check the subarray after truncating all before cc
  3. if both fail, then return false.

https://github.com/tiger40490/repo1/tree/cpp1/cpp/algo_arr has my solution, tested with 10 test cases.
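SolA can be sketched in a few lines. This is my own code, not the github version; the two “kills” each re-check the whole remainder, which still keeps the overall cost O(N) because only the first offender triggers them.

```python
def is_rising_after_removing_1(arr):
    """True if arr can become strictly ascending with 0 or 1 deletion."""
    def strictly_ascending(a):
        return all(x < y for x, y in zip(a, a[1:]))
    for i in range(1, len(arr)):
        if arr[i - 1] >= arr[i]:          # first offender: bb=arr[i-1], cc=arr[i]
            return (strictly_ascending(arr[:i] + arr[i + 1:])      # try killing cc
                    or strictly_ascending(arr[:i - 1] + arr[i:]))  # try killing bb
    return True   # already strictly ascending, 0 deletions needed

assert is_rising_after_removing_1([1, 3, 2, 4])    # kill 3 or 2
assert is_rising_after_removing_1([10, 1, 2, 3])   # kill 10
assert not is_rising_after_removing_1([3, 3, 3])
```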

 

max height@subArr bumps #AshS

My friend Ashish gave me https://www.hackerrank.com/challenges/crush/problem. The mail on 29 Oct included a pictorial illustration.

====analysis
I didn’t write any code.
— my sorting solution, inspired by the skyline problem
Suppose there are J=55 operations. Each operation is a bump-up by k, on a subarray. The subarray has left boundary = a, and right boundary = b.
Step 1: Sort the left and right boundaries. This step is O(J logJ).

a  b  k
1  5  3
4  8  7
6  9
After sorting the boundaries, we get 1 4 5 6 8 9.

Step 2: one pass through the sorted boundaries. This step is O(J).
* There’s a single integer of “current height”.
* each time I encounter a left boundary (like the #4), I bump current height by the K value for that operation ( +7 in this case)
* each time I encounter a right boundary (like the #5), I reduce current height by the K value of that operation ( -3 in this case)

So during the Step 2 iteration, the currentHeight variable would “trace” the fluctuating height of entire array of N elements. If that’s a correct trace, then simply remember the max height ever reached. At the end of the iteration, return that max height.

— AshS O(J+N) solution without sorting, elegant for small N large J. Maintain a shadow array (size N) of “deltas”. For example, d[1] would record the change from arr[0] to arr[1]. This shadow array is updated J times, before a final pass to find the max.

For each of the J operations, we increment d[a] by k and decrement d[b] by k.

So after processing the J operations, d[77] would record the cumulative effect of every operation starting at arr[77] or ending at arr[77]. Note d[77] is unchanged if an operation bumps arr[77] as an interior of a sub-array.

In the final pass, we use d[] to trace the bumped “skyline” of original arr[0..N] and remember the max height achieved.
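The AshS delta-array solution can be sketched as follows. This is my own code, assuming 0-based inclusive boundaries; I apply the decrement at d[b+1], which matches the post’s d[b] convention if its right boundary is exclusive.

```python
def max_height(n, operations):
    """operations: list of (a, b, k) bumps on arr[a..b] inclusive, 0-based."""
    d = [0] * (n + 1)          # shadow array of deltas
    for a, b, k in operations:
        d[a] += k              # the bump starts at arr[a]
        d[b + 1] -= k          # ...and ends after arr[b]
    height = best = 0
    for delta in d[:n]:        # final pass: trace the bumped "skyline"
        height += delta
        best = max(best, height)
    return best

# two overlapping bumps: arr[1..3] +7 and arr[2..4] +3 -> peak 10 on arr[2..3]
assert max_height(5, [(1, 3, 7), (2, 4, 3)]) == 10
```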

jvm stack size; pthreads usage

— some findings from https://pangin.pro/posts/stack-overflow-handling

Most of the interesting implementation techniques are C/C++ techniques, not usable as java coding techniques. Low level techniques, mostly dealing with memory.

Most of the content is low market value because too obscure for IV, not required for GTD at all, and not even relevant as zbs or expertise.

  • — some worthwhile knowledge pearls:
  • stack size for a single thread .. configurable via -Xss command line switch, and defaults to 1MB for a 64-bit system (8 bytes/reference on a frame).
  • .. “This is usually enough to place several thousand average frames.”
  • jvm codebase includes pthreads function calls. “HotSpot explicitly disables glibc guard for all Java threads by calling pthread_attr_setguardsize …” I think author was talking about the non-Windows JVM.

Note some parts of this article discuss non-HotSpot JVMs. These JVM implementations can be very obscure and low-value, therefore quite a distraction.

G1^ZGC

Warning: high churn

ZGC is relevant to low-latency trading. I think G1 is general-purpose and probably not designed for low-latency trading, but it could be more than adequate for your specific latency requirements. Remember Martin Thompson’s comment on “good enough”.

ZGC was introduced in java11 as an experimental GC.

G1 is the default GC since java9, replacing the Parallel GC (not the ConcurrentMarkAndSweep).

 

[20] hiding^overriding ] java inheritance

Upcast from a subclass instance to a superclass variable ..

  • can call the overshadowed (hidden) static methods of the superclass
  • can’t call the replaced nonstatic methods of the superclass (for example nonstaticMeth()).
  • .. I learned this first from my Verizon colleague from India. If you have a subclass instance, then the JVM probably removes all access to the replaced nonstaticMeth() of superclass

There’s also an obscure mechanism involving private methods

— based on stackoverflow:

All methods that are accessible are inherited by subclasses. –> I believe Private methods are not inherited.

From the Sun Java Tutorials:

A subclass inherits all of the public and protected members of its parent, no matter what package the subclass is in. If the subclass is in the same package as its parent, it also inherits the package-private members of the parent. You can use the inherited members as is, replace them, hide them, or supplement them with new members

The only difference with inherited static (class) methods and inherited non-static (instance) methods is that when you write a new static method with the same signature, the old static method is just hidden, not overridden.

From the page on the difference between overriding and hiding.

The distinction between hiding and overriding has important implications. The version of the overridden method that gets invoked is the one in the subclass. The version of the hidden method that gets invoked depends on whether it is invoked from the superclass or the subclass

subclass add`VIRTUAL@inherited nonVirt func

See also

Suppose superclass has a public nonVirtual say() method , and subclass redefines it as virtual. (See my typeid_WithoutVptr.cpp in github.)

Q: Is this a practical scenario? I feel very rarely.
Q: Is this common in interviews? Less rare.

The vptr/polymorphic behavior is not widely understood.

Q: Is the behavior even standardized? Not sure and I won’t bother to find out.

Below is my empirical understanding based on experiment. Suppose we have B* b = new D() and then call b->say().

  • if B has nonVirtual say() and D has virtual say() as in the opening example, then b->say() hits base class B, bypassing any vptr.
  • .. I guess the “B” subobject inside the D object has no vptr. The B class could be written long ago, so the compiler of its ctor would NOT allocate a vptr hidden field.
  • if B has virtual say(), then D (and all descendant classes) has an implicit virtual keyword. Therefore, b->say() hits subclass D, via the vptr in the pointed-to object.
  • … if both B and D declare say() virtual? Same as above, because the virtual in D is implicit and redundant.

 

keep2solutions]1py: print funcName+retVal

Here, stack() returns a list of stack frames. First frame is the caller of stack(), i.e. myFunc. The 4th item in this “frame record” is the function name, but this level of detail is never appreciated in interviews.

The technique, though, is useful in many coding drills, and even coding tests.

import inspect

def myFunc():
  ret = ...   # whatever the function computes
  print(inspect.stack()[0][3], "returning", ret)
  return ret

— alternatively
Note the second stack frame is used in this case, because the first stack frame belongs to ‘fn’ itself.

def myFunc():
  ret = ...   # whatever the function computes
  print(fn(), ret)
  return ret

def fn():
  return inspect.stack()[1][3] + " returning"

##major non-convertible currencies #Korea

A non-convertible currency is a currency used primarily for domestic transactions and not openly traded on FX markets.

“Blocked currencies” is another word.

I think all Non-deliverable-forward currencies are probably blocked currencies?

I feel these governments tend to be either weaker or more prudent, choosing to protect their own currencies.

  • CNY and TWD (Taiwan)
  • INR – India
  • BRL – Brazil
  • Korea and Taiwan — competitive export-driven economies
  • MYR
  • PHP
  • IDR – Indonesia

java.util.Objects methods #@NotNull

This class consists of static util methods for operating on any java object. Mostly for zbs and GTD. Only rarely useful in QQ interviews, but useful in coding IV.

For each method, the purpose is quite practical, but the practical usability depends on your project and requires field testing.

— deepEquals(nullable1, nullable2) — arguments can be arrays.
— equals(nullable1, nullable2)
— hash(obj1, obj2, obj3…) — see javadoc for usage
— hashCode(nullableObj) is useful when the argument is nullable
— requireNonNull(obj, “custom exception msg”)
https://stackoverflow.com/questions/45632920/why-should-one-use-objects-requirenonnull explains why this fail-fast method can be used. I see it as alternative to assert, as executable documentation.

Typical usage of requireNonNull is validation of some arg1 passed to myMethod. If you use requireNonNull the stack trace would show this method as the site of exception. If you don’t use requireNonNull, then the stack trace may not even show myMethod and you may have a hard time locating where the null comes from.

When in a hurry, we can first document the nonNull requirement using @NotNull. Later we can add the requireNonNull() validation. See https://stackoverflow.com/questions/34094039/using-notnull-annotation-in-method-argument

 

noSql@cloud #lesson 0.5

I have spent about 5 minutes on this topic so far.

Background — If your app runs in the elastic cloud, then local disk files are quite “dangerous”. I like short sentences like that, even though it’s vague. It means “not best practice” and “dangerous design”. Among the alternatives, noSQL is one of the most popular and best-supported.

https://searchcloudcomputing.techtarget.com/tip/Compare-NoSQL-database-types-in-the-cloud hints that NoSQL and elastic cloud are natural partners.

Q: which of the four types (warning: churn) of the current /crops/ of noSQL solutions are good for cloud?

— churn: Both cloud and noSql are evolving. Granted, there are stable features in both but the best practices are likely to change over 20Y as new stuff replace the old.

One case in point is the meaning of noSql. NoSQL stands for Not Only SQL, but some online articles may treat it as NotSql.

 

getRichQuick: meaningful vocation@@

There are many GRQ (getRichQuick) advertisements such as

  • personal e-commerce project
  • property investment
  • stock picking
  • MLM

All of them emphasize low effort. However, I want some meaning in my vocation. I also want some level of intellectual challenge to fend off boredom.

Many of these “quick” ideas are unsustainable, unable to provide long-term “engagement” for 90% of the participants even if we remove the unsuccessful people from our sample.

Most of them offer no or limited organic growth and personal development.

— How do they stack up against my traditional ideas of semi-retirement occupations like

  • traditional: teach math, coding,,,
  • traditional: dev-till-70
  • traditional: technical writer?

I guess some of these GRQ ideas might become more meaningful, more engaging to me.

refresh@blog beats friend tech discussion

There’s accumulation of insights in this blog for 温故知新 (reviewing the old to learn the new).

In my recent experience (as of 2020) this is arguably more fruitful than email exchanges with friends, and more fruitful than reading books.

Email exchange is far more efficient than calling friends, unless there’s something confusing.

Reading books is generally more rewarding than reading online.

  • online — URL and text can be copied into blog
  • reading — easier to refresh

restricted FIX session ^ socket shutdown()

— Based on a discussion with my colleague.

In a case of FIX session issue (any type) we can run the session in a restricted mode which allows us to send cancels (but not new/replace) and receive reports.

If incoming sequence numbers are beyond repair, then we will not see the reports but we can still send cancels. Not an ideal situation, but it lets you get out of the market.

In many venues we have an agreement of cancel on disconnect, so we would simply disconnect.

— socket shutdown() has a comparable feature
The syscall shutdown() can receive a “direction” argument of 0, 1 or 2.

0 (SHUT_RD) disables the incoming direction; 1 (SHUT_WR) disables the outgoing direction; 2 (SHUT_RDWR) shuts down both.

 

##eg@ MainSt IT salary: SGD 5k

Multiple blogposts should refer to this blogpost

— 2020 HaiFeng
Years ago HaiFeng’s senior at NCS was around 35 (based on HF’s calculation), earning $5k. Now in 2020 this guy is around my age, earning 6k in some main street (healthcare?) sector.

Therefore, HaiFeng felt that within the broader IT job market, the specific sub-sector is crucial.

— 2020 Raymond’s job search
— 2015 my own job hunting experience: Software engineer salary doesn’t rise faster than inflation.
— hardware engineers —

  • many mid-level engineers earn up to SGD 5k in their 30’s, such as Integrated Circuit designers.
  • Canon — My 2012 OC intern from a polytechnic said his dad, engineering team lead in a Japanese electronics giant (Canon?), only earned $5k.

search-tree algos

This blogpost is based on a “Path finding in AI” chapter in [[Algo in a nutshell]] . The illustration problem is the 8-puzzle. The chapter’s highlight is the famed A*search, but I will focus on the brute-force DFS/BFS and other basic stuff.

Each tree node is a “board-state”, or “snapshot”. Each directed edge between two nodes is a “move”.

(Not every developer candidate has the “tree” concept for a board problem. Recall the BBG odometer problem?)

The tree is recombinant, since two paths can reach the same snapshot. Therefore, the search tree is a directed graph. In practice, this directed graph is more like a tree, so it’s useful to treat it as a tree.

Q: shall we use edge set to represent this directed graph?
A: I doubt it, because the graph is being constructed as we go. The graph is not given to us as static input.

Equivalence between two snapshots is based on a snapshot.key() method. Conceptually, two equivalent snapshots can be mirror images or rotated images of each other.

snapshot.storedData() returns arbitrary stored data for a given tree node.

— BFS on this type of search tree

  • would evaluate a tremendous (costly) number of snapshots
  • 🙂 no recursion
  • 🙂 queue operations (enqueue/dequeue) are constant-time
  • 🙂 guaranteed to find the shortest path

— DFS on this type of search trees

  • maintains a relatively small stack of “open” nodes and a large set of “closed” nodes.  Closed means fully explored dead-ends.
  • .. I think open means not fully explored, including the white and grey nodes defined on P144.
  • .. what if we come back to a node existing on the current path? It would represent a useless move, not one of the validMoves()
  • DFS (not BFS) features back-tracking
  • 😦 DFS doesn’t know when the current node is just a few steps away from the endgame

— relevance to CIV?
I guess some advanced interviews (Google?) may pose an unusual problem that has elegant solutions based on the essentials of path-finding. The problem might be solvable on paper in 45 minutes IFF you have those basic concepts, from prior reading, raw intelligence, or “intelligent conversation with the interviewer”. I would say prior reading is a huge advantage.

In reality, most candidates won’t have the knowledge or the raw intelligence, so the conversation is really the only way out. The interviewer knows that you the candidate may have zero prior experience, so she attempts to drop hints to “guide” you, as a test of your communication/problem-solving capacity. I remember Rahul said some candidates (like Deepak) simply couldn’t get the hint. Sadly, I don’t always get the hint. Recall the first Facebook onsite.

 

c++macro-function: concat as string arg

The context: In C++ (as in most languages), we often need to pass into a function an argument like

“some text” + someNumber + someBoolean

In this context, java makes it easy. Python requires str(someNumber) or the comma as connector. In C++, strict typing is quite restrictive. The standard solution uses a stringstream object, but it is clumsy compared to java. The more flexible concat is

cout<<“some text” << someNumber ….

However, this concat is unusable in our context, which requires a string argument. The trick, found on P12 of [[safe C++]], is a macro function SCPP_ASSERT(). You can invoke it like

SCPP_ASSERT(someBooleanCondition,  “some text”<<someNumber );

 

 

semi-smart ptr: 2 parts@SMART

[[safeC++]] has a 2-page chapter on a semi-smart ptr. The title of this chapter is de-referencing null ptr, so it actually covers just one of the two parts of SMART.

So traditional smart pointers have two essential Smarts, plus lesser ones:

  1. ownership i.e. delete
  2. deref control
  3. … other Smarts …. such as copy control

The semi-smart ptr in this chapter features only one of these Smarts.

This semi-smart ptr class is smart about … deref.

— deref — involves two operators i.e. 1) the arrow and 2) the asterisk. Therefore, the semi-smart ptr in this chapter overloads these two operators, to detect null pointers
— footprint — Is the size of this semi-smart ptr instance the same as a raw ptr? I guess so. We can write a test program.
— simplicity — Compared to the traditional smart pointers, this semi-smart ptr is very simple.

  • easy to reason with.
  • Easy to incorporate into my own projects.

FIX server can cope with 2 client IPs sharing 1 compID

In a legit scenario, ip1 and ip2 are the failover machines running identical software, so both would use the same compId7.

However, due to mis-configuration, two unrelated testers on ip1 and ip2 could be using compId7 unknowingly. The sequence numbers would be inconsistent between the two testers.

The FIX server would assume these two FIX sessions are operated in tandem. As long as the sequence numbers from ip1 and ip2 stay in sync, the two could coexist.

 

fill+bust+correction+amend: 4 exec-reports

In a typical buy-side FIX gateway, people are only interested in about 10 important FIX reports from a venue to the order originator (let’s call it the Portfolio Manager). Besides these 10, there are probably many other reports, but it’s usually harmless to filter them out and throw them down the drain.

Among the 10 report types, fill, bust, correction and amend are in the same category with these common traits:

  • unsolicited — these 4 reports are generated due to venue events, not as an immediate reply to the PM. Some call these 4 reports “unsolicited”. Actually, there’s an irrelevant 5th unsolicited report — the unsolicited cancel of an unfilled order (or a portion thereof).
  • position impact — these 4 reports have immediate impact on the PM’s position and therefore need to go into the PM’s books immediately.
  • execution reports — these 4 reports are true execution reports. I feel the other reports tend to update order status.

I believe a correction report represents a corrected version of an erroneous fill report. I feel this is as rare as the bust.

Note in addition to an exchange, a liquidity venue or a broker/dealer can issue such corrections and busts; any sell-side can. (I worked on a bust processor at a big equity broker.) Of course, the sell-side needs to clean up its own records before sending these amendments.

Note both the bust and the correction target a “trade” meaning a fill, not an order i.e. request-to-trade.

I guess overfill can be a reason for a bust, but I don’t see any online discussion of it, so I guess overfill is often taken as is.

##UChicago t$cost overspent: similar items

Q: are these t-investments below similar to my UChicago t-investment overspend? This is another retrospective, after a few others.

— socket programming?
I feel sockets are similar to quant skillset in churn-resistance.

sockets are slightly closer to my current skillset than quant is. Therefore socket skillset is more likely to be relevant.

— latency (micro-)engineering?
Latency skillset is a QQ halo on WallSt, similar to quant.

Unfortunately, latency is similar to quant as an unlikely domain for my future jobs.

My t-spend on this front is guided by interview experience, not a formal college education.

— bigO analysis, DP+greedy, speed coding (ECT etc)
I am relatively strong in theoretical domains like bigO, just as in quant.

Could I overspend time here? Am I overrating this skill like quant skills?

My t-spend on this front is guided by interview experience, not a formal college education.

— FIX protocol?

social chat: pantry^cubicle: appearance++

I tend to prefer pantry chat but there are disadvantages

— risk: you can miss an urgent issue, a meeting reminder…. During office hours we are supposed to be available.

This is esp. important when you are on support rather than a dev project

— disadvantage: unprofessional appearance.

We all have prejudice about “those Indians” or “Those Chinese workers”, or “Those young developers”

Don’t reinforce their impression of you !

cumulative ack #again

http://www.tcpipguide.com/free/t_TCPSlidingWindowAcknowledgmentSystemForDataTranspo-8.htm has some good pointers, probably an online version of the same content in my tcp/ip textbook

  • cumulative ack — receiver using one number to inform sender of the last byte of a continuous range of received bytes.
  • if some bytes above that “ack number” were already received out of order (too early, before the lower segments), then those bytes are not mentioned or implied in the ack message. The receiver warehouses them secretly, and the sender may need to resend them. This seems inefficient, but has proven simple and effective, letting a single ack number carry out a huge job.
  • .. An optional feature called selective acknowledgments does allow non-contiguous blocks of data to be acknowledged, but not popular

Paraphrasing the author: the sliding-window cumulative ack protocol is a very powerful (yet simple) technique, which allows TCP to easily acknowledge an arbitrary number of bytes using a single acknowledgment number, thus providing reliability to the byte-oriented protocol without spending time on an excessive number of acknowledgments.

3modes: writ`@(hash/ordered)map

  • insert_Or_Update — (most common map “Writing”): java uses put(); c++ uses operator[].
  • update_iFF_Existing — c++ needs if-block with count(); java uses replace() which returns null if nothing replaced
  • put_IFF_Absent — java uses putIfAbsent(); c++ uses insert()
  • .. Note java computeIfAbsent() is more efficient than putIfAbsent(). With putIfAbsent() the caller unconditionally constructs the new “value” (often an allocated object) even though there’s a 50/50 chance this value is unneeded — i.e. the incumbent scenario. computeIfAbsent() invokes the value factory only when the key is absent.

professional trader: longevity@@ #start-up #Kun.h

Look at some young traders like Zhen Fang of Macq, or the Newport “neighbor” Miss Gu (顾小姐). I used to believe they had good career longevity. Yet I knew these traders probably would NOT be full-time trading into their 40’s, because trading seems to be a young men’s game. I thought they could start their own prop trading businesses, with family money or a raised fund. Now I feel it’s not easy at all.

Q: At what age can they start on their own? Typically in their 30’s or 40’s. I interviewed with a Singapore trader who started his own FX trading firm. I think he was in his 40’s or 50’s.

Q: how much savings can they amass by that age? I think their income is not much higher than coders, so possibly 0.5 ->1-2 million. However, they also need to buy a home and raise kids.

Q: how much personal money can they use as risk capital? I feel up to a million. Neck was run by two former CIMB stock traders but had only SGD 3M in total capital.

Suppose they achieve a 10% return; the 300k gross profit must cover payroll, rent, market data and technology. If they take any outside money (say, from an uncle), they must distribute a decent return (like 6%) to compensate the uncle for the risk assumed.

— hot money: most investors i.e. clients, including me, want the freedom to leave. Huge instability for the prop firm.
— competition: clients can choose from many different prop shops. Open competition.
— stress: full time trading for a big bank or hedge fund is stressful, but doing it for your own company is worse.
— Is this blogpost oth_risk? Yes, relative to the other blogposts under t_semiHighFlyer.

jGC: pinned objects]heap affect`big-array allocation

In [[enterprise java perf]] published by SunMicrosystems and authored by IBM researchers, there was a brief warning against pinned objects in the java heap.

( Note there’s also off-heap storage, discussed in other blogposts. )

According to the authors, if there are too many pinned objects, then allocating for a large array would be hard, and can lead to OOM. Without evidence, I assumed this risk was low.

Pinned objects prevent the sweep/relocation during garbage collection, and can lead to fragmentation.

I don’t know how we could end up with too many pinned objects.

AshS’s impl idiom: array index bound-check

Payload aa= ( i-1 >=0)? arr[ i-1 ] : INT_MIN;  // <– This is the coding style I learnt from my friend Ashish Singh … in the comparison operation, he uses the exact same subscript expression 🙂

I used to “simplify” it to

Payload aa= ( i >=1)? arr[ i-1 ] : INT_MIN; // <– This “simplification” is error-prone (it’s easy to write i>=2 or i>=0 by mistake), demands mental focus, and tires us out. The code is arguably less readable.

Note Ashish’s idiom requires a signed int type: with an unsigned size_t, i-1 underflows to a huge value when i is 0, so the guard ( i-1 >= 0 ) always passes. This is the main reason to avoid size_t when declaring an array subscript variable. I think the google style guide says the same.

— bigger example

if ( i+2 > sz -1) return true; // should we “simplify” the expression? No ! Explained below.
payload next = arr[i+2];

Here we compare against sz-1 because the subscript expression i+2 must not exceed sz-1. Therefore, Ashish’s coding style is more readable once you get it.

switch statement: C compiler optimization

I feel this could show up in an interview, either as a question or as a chance to showcase this knowledge.

Based on http://lazarenko.me/switch/, probably an ebook

clang and gcc are two of the most popular C compilers.

— 3 common implementations

  • Jump table — the simple and common implementation
  • if-elsif-elsif — probably the least efficient in most cases, but can be the most efficient for a very small switch block.
  • binary search — used for a switch block having sparse case values

windows startup apps: eg@high-churn skill

I remember discovering and documenting knowledge pearls about how to manage windows startup apps and other windows usability features.

Now these knowledge pearls have lost their value, because Microsoft doesn’t bother to keep backward compatibility.

Some may question if this is a common malaise/problem/drawback of GUI apps in general. Perhaps Mac and Unix GUI systems also suffer from the same churn.

seq received too high #Captain America unfrozen

Let’s first focus on the incoming sequence number.

The FIX software uses a ledger file to remember the expected sequence number. If at logon I expect 111 but receive 999, then I will NOT give up but simply request retrans.

Analogy – Captain America wakes up from his long frozen sleep, having missed all the episodes of his favorite television drama. He has to /binge-watch/ all the missed episodes before he can understand what’s going on. This analogy illustrates why the “received too high” scenario is common and expected.

How about the reversed mismatch? It’s rare. If the received seq number is unfortunately too LOW, then there is no hope of recovery: something is seriously wrong, too serious to repair. In my test environment, I do hit this unrealistic scenario. We have to manually update the expected inbound sequence number in QuickFIX’s ledger file.

bonus apprehension: circle@control^@concern

In Nov 2020 I had a 30-min phone session with a Cigna counsellor. She pointed out

  • If you focus on and work within your circle of control, you may gain/grow/accumulate confidence.
  • On the other hand if you spend too much time outside the circle of control and into the circle of concern, then you may suffer.

Note the context of the conversation is not PIP but bonus and minor improvement suggestions during annual appraisal.

— on 21 Dec 2020, my apprehension about the bonus call was /palpable/, mostly due to 1) the respect (approval/appreciation), not so much due to 2) the one-time financial impact.

The one-time financial impact would probably die down soon. The base increment is small but more long-lasting.

A situation without the respect factor — When waiting for H1b lottery result, there was no apprehension.

 

git | binary search4 some changes ] a given file(s)

git diff <commit> path/to/file

My shell function below can quickly show the diff between a given past commit and the current tip.

You need to hardcode the filename(s) under investigation.

rollback() { # roll back given files to a given commit
  target=$1 # some git commit id
  pushd /N/repos/MI_EPA/
  git diff HEAD --name-status
  set -x
  git checkout $target -- tbl/Generic/ReplaceOrder.tbl # bin/javagen.pl #tbl/MSF/NewOrder.tbl
  git diff HEAD --name-status # git diff alone won't work
  # the command above should reveal only the named files as changed
  git diff HEAD
  set +x
  popd
}

— to sort git tags by date: https://bintanvictor.wordpress.com/wp-admin/post.php?post=36920&action=edit&classic-editor

containers(!!VM) thrive in elastic cloud

I think containers beat VM in this game.

I think containers take fewer resources (esp. memory), and are therefore faster to launch. One physical machine can host many more container instances than VM instances.

A container’s footprint (more likely disk space than memory) is usually below 100MB, but a VM takes gigabytes.

[[ A Comparative Study of Containers and Virtual Machines in Big Data Environment  ]] is a 2018 IBM-led research finding. It shows

  • containers are much faster to boot up, probably more than 10 times faster. Bootup latency refers to the period from starting a VM or container to the time (right before starting any hosted application) that it can provide services to clients. This is an important factor that cloud service providers usually consider. There are several reasons. On one hand, a shorter bootup latency means a smaller latency to provide services, which brings better user experiences. On the other hand, faster bootup also allows more elastic resource allocation, since the resources do not have to be reserved for the VM or Docker that is starting up.
  • Each machine can run up to 100 active containers but at most half that many active VM instances.
  • If we create (i.e. boot up) idle instances, then each machine can support up to 1000 containers but only around 100 VM instances.
  • the amount of memory allocated to a container is very small at the beginning, and then increases (to, say, 10GB) based on the demands of the applications in the container. However, the VM instance uses 16GB from the very beginning till the end.
  • a container releases its memory after it finishes its workload, while a VM still holds the memory even after it becomes idle.
  • the authors concluded that “with the four big data workloads , Dockers container shows much better scalability than virtual machines.”

2 processes’ heap address spaces interleaved@@

The address space of a stack is contiguous within a process AA. Not true for AA’s heap address space, which can have “holes” (i.e. deallocated memory blocks) between two allocated blocks. However, how about these 3 consecutive blocks… would this be a problem?

Allocated block 1 belongs to process AA
Allocated block 2 belongs to process BB
Allocated block 3 belongs to process AA

I think several full chapters of [[linux kernel]] cover memory management. The brk() syscall is the key.

I think the addresses here are virtual: each process has its own virtual address space, so BB’s block 2 could not appear between AA’s blocks in AA’s own address space. Such interleaving could only happen in physical memory.

Kernel page table?

— why do we worry about holes?

  1. I think holes are worrisome due to wasted memory. Imagine you deallocate a huge array half the size of your host memory
  2. hunting for a large block among the holes can be time-consuming
  3. if your graph nodes (in a large data structure like a linked list or tree) are scattered, then d-cache efficiency suffers.

— So why do we worry about interleaving?

If we need not worry, then interleaving may be permitted and exploited for efficiency.

sys calls, cpu instructions for heap^stack

My book [[linux kernel]] P301 lists just 5 kernel functions for managing the heap of a given process. Only one of them, brk(), is implemented as a sys call.

Q: how is brk() implemented in terms of CPU instructions?
A: I think the complex logic is in software i.e. kernel functions. CPU executes the individual instructions of that logic

Q: does CPU have different instructions for heap vs stack?
%%A: I think the cpu has special registers for stack management, e.g. the stack pointer register used by push/pop/call/ret instructions. Stack is probably simpler than heap, so the CPU manages (most of) it very efficiently. See https://gribblelab.org/CBootCamp/7_Memory_Stack_vs_Heap.html

Q: how does CPU allocate for a given stack?
%%A: Perhaps it just adjusts the stack pointer (in a special register), typically decrementing it since stacks grow downward, to allocate a new stack frame?

Q: does CPU have instructions for allocating memory?
%%A: I don’t think so. Allocating is too high-level and involves multiple instructions. Allocation requires some data structures, which are maintained by kernel. The memory allocation data structure is itself in RAM, perhaps heap

Q: Kernel functions definitely use the stack, but does kernel use the free store?
%%A: I think so. That’s how kernel data structures can grow.

 

cloud-native^MSA

Given a requirement, you can go the traditional approach or go cloud-native. The latter approach is

  1. Containerized, probably using Docker,
  2. Dynamically orchestrated, usually using Kubernetes,
  3. Microservices-oriented

Source: Frequently Asked Questions (FAQ) – Cloud Native Computing Foundation

So I think MSA is a major part of cloud-native dev expertise.

—  churn, longevity

  • I feel linux container technology will continue to improve, perhaps incrementally, becoming even more stable and efficient. Containers are useful even without the cloud or MSA.
  • Docker and Kubernetes ride the cloud wave but may be displaced by newer solutions
  • Cloud computing has proven effective and will grow further before stabilizing.
  • MSA and REST are trendy ideas and possibly /faddish/. They probably existed before cloud, and can be useful without the cloud.

 

serverless: part of cloud-native dev expertise

I feel serverless is the extreme form of cloud-native. A lot of cloud-native discussions refer to serverless “architecture” as a typical example of cloud-native, without naming it.

Below is Mostly based on this RedHat article

There are still servers (I think the author means “hosts”) in serverless, but they are abstracted away from app development. A cloud provider handles the routine work of provisioning, maintaining, and scaling the server infrastructure. Developers can simply package their code in containers for deployment. I guess the artifact in a deployment “package” is a container image.

With serverless, routine tasks such as managing the operating system and file system, security patches, load balancing, capacity management, scaling, logging, and monitoring are all offloaded to a cloud services provider.

— dynamic billing

“…when a serverless function[1] is sitting idle, it doesn’t cost anything.” I think in this case there’s no server host created, so no utilization of electricity, disk, bandwidth, memory etc.

[1] I think this means a functional unit of deployment.

— What are some serverless use cases?

Serverless architecture is ideal for 1) asynchronous, stateless apps that can be started instantaneously. Likewise, serverless is a good fit for use cases that 2) see infrequent, unpredictable surges in demand.

Think of a task like batch processing of incoming image files, which might run infrequently but also must be ready when a large batch of images arrives all at once. Or a task like watching for incoming changes to a database and then applying a series of functions, such as checking the changes against quality standards, or automatically translating them.

Serverless apps are also a good fit for use cases that involve 3) incoming data streams, chat bots, scheduled tasks.

cloud prefers linux to other kernels, due2container

https://www.cbtnuggets.com/blog/certifications/open-source/why-linux-runs-90-percent-of-the-public-cloud-workload claims that linux runs 90% of the public cloud workload. About 30 percent of the virtual machines that Microsoft Azure uses are Linux-based.

I would think the other 70% of virtual machines on Azure are linux-free.

IBM CFO James Kavanaugh said in 2020:

“The next chapter of cloud will be driven by mission-critical workloads managed in a hybrid multi-cloud environment. This will be based on a foundation of Linux with Containers and Kubernetes.”

— containers without linux?

Containers are seen as a fundamental enabler/catalyst in cloud computing, but are they a linux-only feature?

Note a linux-vm on windows still uses linux. Is there a container implementation without linux? https://containerjournal.com/topics/container-ecosystems/5-container-alternatives-to-docker/  lists a few, but I think they are all way behind standard linux containers:

Other Container Runtimes

  • Windows Server Containers.
  • Linux VServer.
  • Hyper-V Containers.
  • Unikernels.
  • Java containers.

On Azure, there are many marketing jargon terms, and Microsoft tries to downplay the reliance on linux, so it’s harder to find out what windows-native container solutions there are, but here are a few:

cloud-expertise for devs: G3 aspects

Mostly for interview, i.e. body-building, but also design best practices for the team lead and architect.

Are we assuming PaaS or IaaS? I feel more IaaS

  1. persistence and storage — including file system and data store
  2. integration with other systems across the local network
  3. admin interface and real time config changes?
  4. — less relevant aspects:
  5. configuration management? App config data must not be local files. I think they should be deployed along with the binary. As such, this is More of a concern for devops.
  6. high availability, cluster, failover? LG. More of a concern for the system architect
  7. security? LG. More of a concern for the system architect

cloud: more relevant to system- than software-architects

Software architects worry about libraries, OO designs, software components, concurrency etc.

They also worry about build and release.

How about app integration? I think this can be job for system architect or software architect. If the apps run in the same machine, then probably software architect.

— 10 considerations to develop for cloud

https://www.developer.com/cloud/ten-considerations-for-realizing-the-potential-of-the-cloud.html was a 2015 short article written by a java developer. I like the breakdown into 10 items.

Some of the 10 items are more devops than developer considerations. The other items are more for system architect than software architect.

However, hiring managers expect a senior candidate like me to demonstrate such QQ knowledge. By definition, QQ is not needed on the job.

 

stay relevant2new economy#cloud, ML,bigData

  • cloud, edge computing, virtualization,,,
  • AI, ML,
  • big data, data analytics
  • MSA, REST

Q: what’s the risk to my dev-till-70 if I choose to pay minimal attention to anything irrelevant to trading engines?
A: Appearance — I might appear out of date to the younger interviewers. I may need to know the jargon terms (and their relationships) enough to follow the conversations.

Job specs increasingly list these new technologies. I may be /displaced/sidelined by younger candidates, even disqualified. However, my advantage is my diverse experience across industries.

Putting on the black hat over the “displacement” concern, I feel some of my personal experiences (in the “Churn” section below) cast doubt on it.

— technology bet i.e. picking real winners from a fountain of new buzzwords and hypes

I feel cloud is the most enduring technology in my list above. In dev environments (am a developer), Cloud infrastructure may become widespread just like git, linux, virtual hosts.

— Churn — defined as the risk of investing (my precious time) into perishable stuff, faddish stuff or hypes

xp in java-land: I stayed away from spring/hibernate. I didn’t get deep into WebLogic or EJB, because the majority of WallSt elites value coreJava more than jxee (or even c++, but that’s another story).

Coherence/Gemfire etc also faded away.

Functional java also faded away.

cloud4java developers

— Iaas: I agree with the general observation that IaaS doesn’t impact us significantly.

AWS is mostly an IaaS environment. Same for my vultr experience.

— Saas: I feel SaaS doesn’t impact us either. SaaS could offer devops (build/delivery) services for java, c# and c++ developer teams.

— PaaS has the biggest impact on java developers.

We have to use the API/SDK provided by the PaaS vendor. Often there is no SQL DB. We can’t access a particular host’s file system. MOM is rarely provided.

Am an enterprise java developer, not a web developer. I feel PaaS is designed more for web developers.

lucky I didn’t invest in Scala #java8/c++11 #le2Greg

Hi Greg,

Am writing another reply to your earlier mail, but I feel you wouldn’t mind reading my observations of Scala and java8 on the WallSt job market.

Let me state my conclusion up-front. Conclusion: too many hypes and fads in java and across the broader industry. I feel my bandwidth and spare time are limited (compared to some techies), so I must avoid investing myself in high-churn domains.

You told me about Scala because MS hiring managers had a hard time selling Scala to the job candidates. The harder they try to paint Scala as part of the Future of java, the more skeptical I become. To date, I don’t know any other company hiring Scala talent.

If I get into a MS Scala job, I would have to spend my bandwidth (on the job or off the job) learning Scala. In contrast, in my current role with all-familiar technologies, I have more spare time on the job and off the job. On a Scala job, I would surely come across strange Scala errors and wrestle with them (same experience with python and every other language). This is valuable learning experience, provided I need Scala in future jobs, but nobody else is hiring Scala.

Therefore, am not convinced that Scala is worth learning. It is not growing and not compelling enough to take over (even a small part of the java) world. I would say the same about Go-lang.

If scala is a bit of a hype, then Java8 is a bit of a fad.

I remember you said in 2019 that java8 knowledge had become a must in some java interviews. I actually spent several weeks reading and blogging about key features of java8. Almost none of them is ever quizzed.

Java8 seems to be just another transitional phase in the evolution of java. My current system uses the java8 compiler (not java8 features), but java 9,10,11,12,13,14 and 15 have come out. There are so many minor new features that interviewers can only ask a small subset of important features. The "small subset" often boils down to an empty set — interviewers mostly ask java1 to java5 "classic" language features such as threading, collections, java polymorphism.

Some hiring teams don’t even ask java8 interview questions beyond the superficial. Yet they say java8 experience is required on the job!

Lastly, I will voice some similar frustrations about c++11/14/17. Most teams use a c++17 compiler without using any new c++ features. Most of the interview questions on "new" c++ revolve around move semantics, a very deep and challenging topic, but I don’t think any team actually uses move semantics in their work. Nevertheless, I spent months studying c++11 esp. move semantics, just to clear the interviews.

##[18]realistic 2-10Y career plann`guidelines #300w

Background: not easy to have a solid plan that survives more than 3Y. Instead of a detailed plan, I will try to manage using a few guidelines.

  • — top 3 “guidelines” [1]
  • respect/appreciation/appraisal (esp. by manager) — PIP/stigma/trauma/damagedGood. Let’s accept: may not be easy to get
  • Singapore — much fewer choices. Better consider market-depth^elite domain
  • Expertise accu (for dev-till-70) or sustained focus — holy grail
  • ——– secondary:
  • dev-till-70
  • family time — how2get more family time #a top3 priority4Sg job. Some usage is optional (play time) while others are a matter of responsibility.
  • personal time — short commute, flexible time, low workload, freedom to blog]office… is proving to be so addictive that I have forgotten the other guidelines.
  • interviews — Let’s accept : extremely important to me but much harder in Singapore. Even in the U.S. I may need to cut down.
  • distractions — Let’s accept and try to contain them.
  • Entry-barrier — could be too high for me in the popular domains like algo trading
  • Entry-barrier — could be too low for some young guys — the popular domains will have many young guys breaking in
  • FOLB Peer pressures — and slow-track… Let’s accept.
  • trySomethingNew — may/not be justifiable
    • stagnation — could be the norm
    • engaging — keep yourself engaged, challenged, learning, despite the stagnation
  • Shrinking Choices — many employers implicitly prefer younger programmers
  • Churn — Avoid
  • non-lead dev role — Let’s embrace. Don’t feel you must move out or move up. Hands-on coding is gr8 for me. Feel good about it

[1] I didn’t say “priorities”

tech IV: assess` communication

In the U.S. competitive landscape, communication skill often means English speaking, listening and esp. explaining complex technical topics. On the job, communication skills often include email writing.

At a higher level, communication skills also include empathy/rapport, persuasion and relationship-building. Some interviews, esp. the leadership interviews, do try to assess the relationship skills, but such assessment is highly inaccurate and highly subjective.

A new hire may show very different ratings on these two fronts (English vs relationship). My interview with the Japanese NumTech failed partly due to lack of humor.

— immigrant
Why is English skill in question? Based on field observations, probably because the talent pool is dominated by immigrants from East Europe, Asia and also Latam.

Also, some (Asian++) cultures are known to discourage out-spoken assertiveness or self-expression, but in many tricky situations and team discussions, self-expression is critical. Just look at the EPA team meetings.

— contract vs leadership
Perm employee interviews esp. leadership roles, often require more than superficial assessment on relationship skills.

In contrast, contractor interviews are light on relationship skills.

— what if you are a mainstream writer, like Borong? A technical writer is good at explanatory writing or showcasing.

.so.2: linker^dynamic loader

— Based on https://unix.stackexchange.com/questions/475/how-do-so-shared-object-numbers-work

In my work projects, most of the linux SO files have a suffix like libsolclient_jni.so.1.7.2. This is to support two executables using different versions of a SO at the same time.

Q: How is the linker able to find this file when we give linker a command line option like “-lsolclient_jni”? In fact, java System.loadLibrary(“solclient_jni”) follows similar rules. That’s why this example uses a java native library.

A: Actually, linker (at link time) and dynamic loader (at run time) follow different rules

  • at link time (build time), the executable binary saves (hardcoded) info about which version of a SO to load into memory. You can run “ldd /the/executable/file” to reveal the exact versions recorded in the executable.
  • at run time, the executable consults the hardcoded info and loads libsolclient_jni.so.1.7.2 into memory
  • at link time, the linker only uses the latest version, so there’s usually a symlink like libsolclient_jni.so (without suffix) pointing to it

— static libraries:

I think static libraries like libc.a do not have this complexity.

During static linking the linker copies all library routines used in the program into the executable image. This of course takes more space on disk and in memory than dynamic linking, but a statically linked executable does not require the presence of the library on the system where it runs.

KenLi: absorbency n dev-till-70

Ken Li said something like … If you enjoy tech (context of our conversation: dev) work then you have job security in tech sector till old age — I would say well into 60’s.

I think he said that because there exist sooo many not-so-demanding tech jobs in the U.S., where my absorbency advantage _alone_ is more than sufficient to sustain a long career.

 

##conEmu git-bash color schemes

To cope with the dark background, it’s important to increase display hardware brightness+contrast.

Most color schemes are eye-hurting for the git-commit message editor. Only the following 3 choices come close to acceptable.

I would say git-diff is more important than git-commit.

— choice: SolarizedLuke … most acceptable overall
— choice: SolarizedLight … bad for git-diff
— choice: Tomorrow … very bad for git-diff

#1 usage@member template: smartPtr upcast

Does this use SFINAE? Not important. The exact definition of SFINAE is arcane, but the technique here is just as magical as SFINAE.

— OOC member template “function”

P 176 [[moreEffC++]] uses member template functions to generate OOC members for a smart ptr class. Suppose the host type is smartPtr<T>, then the OOC converts from host type to smartPtr<B>

This OOC is generated iFF ptr-to-T can convert to ptr-to-B. I think SCB-FM IV by architect #shared_ptr upcast is relevant.

OOC is an overloaded conversion operator, not an ordinary named function, but I consider it a member function.

— cvctor instead of OOC

For a similar purpose, https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_33template/shPtrUpcastCopy.cpp features a cvctor template.

This cvctor template is an alternative to the OOC member template.

[20] SG tech talent pool=insufficient: expertise^GTD

Listening to LKY’s final interviews (2013 ?), I have to agree that Singapore — counting citizens+PRs — doesn’t have enough technical talent across many technical domains, including software dev. https://www.gov.sg/article/why-do-we-need-skilled-foreign-workers-in-singapore is a 2020 publication, citing software dev as a prime example.

A telltale sign of the heavy reliance on foreign talent — if an employer favors foreigners, it faces a penalty, primarily (the Russell warning) in the form of a ban on EP applications/renewals. This penalty spotlights the reliance on EPs at multinationals like MLP, Goog, FB, Macq.

The relatively high-end dev jobs might be 90% occupied by foreigners, not citizens like me. I can recall my experience in OC, Qz, Macq, INDEED.com interview… Why? One of the top 2 most obvious reasons is the highly selective interview. High-end tech jobs always feature highly selective tech interviews — I call it “Expertise screening”.

Expertise is unrelated to LocalSys knowledge. LocalSys is crucial in GTD competence.

As I explained to Ashish and Deepak CM, many GTD-competent [1] developers in SG (or elsewhere) are not theoretical enough, or lack the intellectual curiosity [1], to pass these interviews. In contrast, I do have some Expertise. I have proven it in so many interviews, more than most candidates.

(On a side note, even though I stand out among local candidates, the fact remains that I need a longer time to find a job in SG than Wall St. )

[1] As my friend Qihao explained, most rejected candidates (including Ashish) probably have the competence to do the job, but that is not his hiring criterion; that bar is too low. Looks like SG has some GTD-competent developers but not enough with expertise or curiosity.

— Math exams in SG and China

Looking at my son’s PSLE math questions, I was somehow reminded that the real challenge in high-end tech IV is theoretical/analytical skills — “problem-solving” skill as high-end hiring teams say, but highly theoretical in nature. This kind of analytical skill including vision and pattern recognition is similar to my son’s P5 math questions.

In high-end tech IV, whiteboard algo and QQ are the two highly theoretical domains. ECT and BP are less theoretical.

What’s in common? All of these skills can be honed (磨练). Your percentile measures your abilities + effort (motivation, curiosity[1]). I’m relatively strong in both abilities and effort.

So I know the math questions are similar in style in SG and China. I have reason to believe East-European and former-Soviet countries are similar. I think other countries are also similar.

rvalue objects before/after c++11

Those rvalue objects i.e. unnamed temp objects have been around for years. So why is rvr needed to handle rvalue objects?

  • C++11 added language features (move,forward,rvr..) only to support resource stealing where resource is almost always some heapy thingy.
  • Before c++11, rvalue objects didn’t need a special notation or a special handle (i.e. rvr). They were treated just like a special type of object. You were able to steal resources, but it was error-prone and unsafe.

private-Virtual functions: java^c++

Q: in c++ and java, is private virtual function useful?
A: both languages allow this kind of code to compile. C++ experts use it for some advanced purposes, but in java any private method is effectively non-virtual, so any subclass method is unrelated to the baseclass private method.

— C++ is more nuanced
The trigger of this blogpost is P68 [[c++ coding standard]] by two top experts Sutter and Alexandrescu, but I find this “coding standard” unconvincing.

Private virtual functions seem to be valuable in some philosophical sense but I don’t see any practical use.

— java
See also hiding^overriding in java

Q: beside hiding (static methods), overriding and overloading, does java support another mechanism where subclass can redefine a (non-static) method?

  • in GTD this is very low value.
  • in terms of zbs and expertise this actually reveals something fundamental, esp. between java and c++
  • .. also highlights some important but lesser-known details of java inheritance
  • in terms of IV, this can be a small halo whenever we talk about overriding^overloading

A: Code below is neither overriding nor overloading, but it does compile and run, so yes this is another mechanism. I will call it “xxxx” or “redefinition”. The xxxx method is unrelated to the baseclass private method, so the compiler has no confusion. (In contrast, with overriding and overloading, the compiler or the runtime follows documented rules to pick an implementation.)

Note if super and subclasses both use “public”, we get overriding. Therefore, “xxxx” requires “private in superclass, public in subclass”

Code below is based on https://stackoverflow.com/questions/19461867/private-method-in-inheritance-in-java

class A { // A and D are package-private so all 3 classes fit in Tester.java
    private void say(int number){
        System.out.print("A:"+number);
    }
}
class D extends A{
    // a public method xxxx/redefining a baseclass private method
    public void say(int number){
        System.out.print("Over:"+number);
    }
}
public class Tester {
    public static void main(String[] args) {
        A a=new D();
        //a.say(12); // compilation error ... private
        ((D)a).say(12); // works, prints "Over:12"
    }
}

array/deque based order book

For HFT + mkt data + internal matching or market making .. this is a favorite interview topic, kind of similar to retransmission.

==== For a first design, look at … https://web.archive.org/web/20141222151051/https://dl.dropboxusercontent.com/u/3001534/engine.c has a brief design doc, referenced by https://quant.stackexchange.com/questions/3783/what-is-an-efficient-data-structure-to-model-order-book

  • TLB?
  • cache efficiency

— no insert/delete of array

Order cancel and full-fill both result in deletion of an array element .. shifting. Random inserts mid-stream also require shifting in the array. To preempt shifts, the design adopted in engine.c is “one array element for every possible price point”.

  1. when an existing order gets wiped out, its quantity becomes zero. It’s also possible to use a flag, but zero quantity is probably more space efficient.
  2. when we get a limit order at a new price of 9213, we don’t insert but update the dummy price point object in-situ.

What if all the price points in use are only 0.01% of the price points allocated? To answer that question, we need some estimates of the likely price levels and the footprint of the array element. Luckily, price levels are not floating points but integers essentially — a key constraint in the requirement.

  • An array element can be very space efficient — a nullable pointer.
  • Alternatively, it can be an integer subscript into another array of “received price points”. Dummy price point would be represented by “0”, a special subscript. Subscript can be double-byte, a big saving cf 8-byte pointers.
  • Likely price levels could range from 30D min to 30D max plus some margin. Such a range would be up to 10,000 price levels.
    • But what if price plunges at SOD or mid-day? “Not sure how my company system was designed, but here’s my idea –” we would need to allocate then initialize a new array of price levels. Deque would help.
  • Unlikely price levels (for outlier orders) would be hosted in a separate data structure, to support a super low bid (or super high ask). These outlier orders can tolerate longer delays.
  • a deque would support efficient insert near Both ends.

Good design for a penny stock (few price levels), but …

Q: how about a “pricey stock”? Its price levels would be so numerous, almost like float 😦
A: Take the bid book for example. The top 10,000 price levels can use the penny-stock deque design.

==== vector insert as a second design

A fairly complete reference implementation for Nasdaq’s ITCH protocol supports order lookup, order reduction (partial fill or full fill), order cancel and new order. New order at a fresh price level uses vector::insert —
  • If this insertion happens near the end of the vector (top of book), then the elements shifted would be very few
  • if the market slowly declines, then the best bid would be wiped out, shrinking the vector at the back end. New price levels would be inserted at lower price points, but again the elements above it would be few.
  • If this insertion happens far from the back end, as in an outlier order, then lots of elements shifted, but this scenario is rare.

Note a single element’s footprint can be very small, as explained earlier.

==== switch to an RBTree if some abnormal price movement is detected. This idea may sound crazy, but consider the RBTree used inside java8 HashMap.

bSearch in array^BSTree

A) bSearch in sorted array
B) bSearch in a BST

Both are logN but here are some key differences:

  • #1 diff (for static data): d-cache efficiency
    • .. if data volume is larger than L3 cache or even main memory, then you may hit page faults and need virtual memory, then the d-cache benefit is reduced
  • diff: dynamic .. i.e. insert/deletes. array is hopeless
    • .. what if inserts and deletes only happen at the two ends? Then a deque is still usable and possibly faster than BST
  • diff: BST may become unbalanced, derailing the logN performance. Self-balancing BST is easily available.

##xp staying-on after PIP, with dignity

  • +xp GS: informal, unofficial PIP, before I went to Kansas City for July 4th. After it, I stayed for close to a year.
  • +xp Stirt: I overcame the setback at end of 2014 and scored Meet/Meet albeit without a bonus.

Now I think it is possible to stay on with dignity, even pride. I didn’t do anything disgraceful.

I hope at my age now, I would grow older and wiser, less hyper-sensitive less vulnerable, more mellow more robust, or even more resilient.

That would be a key element of personal growth and enlightenment.

market data warehouse #lightning talk

I am not permitted to reveal identities. Let’s say we are talking about a financial institution. You can think of a bank, sovereign fund, insurer or asset manager.

  • total footprint ~ 2.5 PB = 2500 TB, in the original form as received from vendor, and is usually pre-compressed by vendor.
  • daily increment ~ 3 TB and growing
  • biggest subset is tick data, probably 800 TB. One vendor can require 0.5 TB/day after decompression.
  • Most common data dissemination (from vendor) is FTP. The new kid on the block is vendor API whereby clients can pull data from vendor.

— historical market data

This warehouse is for historical data. It can poll a vendor system every 5 minutes to receive latest data.

ICE RTS defines “real time market data” using a latency threshold of 30 minutes. Therefore, some data in this warehouse can be considered “near real time”.

— cloud:

Many vendors are on AWS i.e. provide an AWS dissemination interface, so this MDW is also moving to AWS.

If a vendor (Reuters?) is only on Google cloud, then dissemination requires an AWS-Google bridge, but no good bridge exists as of 2020.

merge pre-processing loop into main loop@@ No

Q: can we remove pre-processing loop and merge its logic into the main loop?

No. My experience shows that it’s often a good practice to separate out that initial processing. After that initial loop, we have a checkpoint/milestone where we can use asserts and prints to verify a number of key assumptions or pre-conditions. This checkpoint can yield huge benefit in reduction of uncertainties.

This preprocessing step seems to introduce additional complexity, but the complexity is not additional; the step merely shifts some amount of complexity from the main loop out to the preprocessing loop.

This is similar to my long-time preference of shifting complex logic from the main java app to the DB (including stored procs) and to the client side such as JavaScript.

Some people call it “separation of concern”. In this case, the job responsibility of the pre-processing loop is well-defined and easy to verify.

[20]OC-effective: overrated

Today on my jog along the Marina Promenade, I reminded myself that my parents gave me a good body and i have taken good care of it. that’s why I’m able to enjoy life to the fullest.

Then it occurred to me that those “effective” people (in the OC sense) usually don’t or can’t make the same claim. They are often overworked, overweight, overeating, lacking exercise. It’s rare to see one of them fit and slim (TYY might be).

Remember OC-effective is defined in the corporate context. Mostly corporate managers.

— OC-effective people are no more likely to be healthier than average. Their life expectancy is not longer. I actually believe health is wealth.

— OC-effective ≠ (Fuller) wealth, whether measured in Brbr or otherwise. Many people are wealthier but are not MNC managers.

— OC-effective ≠ “effective with the team”, as in the SG sense. Sometimes it is largely inter-personal (with the boss) effectiveness.

— OC-effective is mostly theoretical, and the assessment can be very biased. Promotion is decided by upper management, so what team members feel doesn’t count. 360-review is marketing.

— OC-effective ≠ true leadership. We all know some lousy managers getting promoted (RTS, deMunk). However, I think many good leaders have OC-effectiveness. Listening is essential.

— OC-effective ≠ satisfaction with life. Grandpa often says these individuals 活得好累. They often have more stress due to heavy responsibilities.

— OC-effective = effective in that one organization and may not be in another organization. Consider Honglin. In contrast, hands-on developers like me possess more portable skills mostly in the form of IV.

— OC-effective ≠ adding social value. The corporation may not create lasting social value. OC-effectiveness means effective at helping the company reach its goals, regardless of value. In SG context, social value is rather important.

— We often say

  • My memory/eyesight/hearing/teeth/BMI is not so good
  • My stamina/fitness is not that good
  • My savings rate is not high
  • My marriage is not successful
  • My Chinese (or English or math) is not so good
  • I don’t have a great sense of humor

But why don’t we hear anyone saying something like … the things below? If I must take a shot at the reason, I would say the above qualities are unconditionally valuable, useful, life-enhancing, but in contrast the “achievements” below are less universally recognized.

  • “I’m not good at organizational effectiveness.”, in the OCBC sense.
  • “I’m not good at moving up.”
  • “I’m not good at meeting my manager’s key objectives” — key to moving up

MSOL zoom by mouse/touchpad/touchscreen

My home laptop screen resolution is such that zoom-in is required when reading outlook messages. How do I use keyboard to easily zoom in on a new message?

— touch screen two-finger pinch works, even though I disabled it in my laptop

— Ctrl + right-scrollUp is similar to the slider.

With an external mouse, this gesture is probably proven and reliable. With a touchpad, I relied on two-finger scroll:

Two-finger up-scroll + Ctrl does zoom in 🙂

— two-finger page scroll without Ctrl key : is unrelated to zooming

Windows gives us two choices. I chose “Down motion means scroll down”. Intuitively, the two-finger gesture feels like holding the vertical scroll-bar on a webpage and in IntelliJ.

Note IntelliJ scroll is otherwise impossible with touch-screen or touchpad!

cloud ^ data_center@@

Q1: does “data center rack space” always mean some form of cloud?
A:???

Q2: if not, then what’s the difference between a traditional/old-fashioned data center and a cloud-enabled data center?
A: ????

Two major industry sectors:

  • financial powerhouses — generally use private data centers, probably not cloud (except build etc). These shops like grid/farm which could offer elasticity i.e. compute nodes enlisted on demand, but I don’t know if docker containers are created on the fly.
  • web sites — often use the cloud, but not necessarily the elastic cloud

The other industry sectors are unfamiliar to me.

So whenever I hear people talk about cloud or grid, I should inquire about containers.

scalability^latency on WallSt^web2.0

Web2.0 shops care only about scalability and don’t care about micro latency engineering including multi-threading, language/compiler features.

However, web2.0 shops are not always impressed with WallSt scalability. OPRA feed handler copes with 20,000 kmps and requires scalability. Similarly, my NYSE XDP feed handlers can cope with 370 kmps per thread. These systems are operated by a dotcom!

Many WallSt architects don’t care about scalability per-se. For other architects, scalability is achieved with hardware and architecture.

Both camps care about data structures, which are fundamental to scalability and latency.

pointer as class member #key points

Pointer as a field of a class is uninitialized by default, as explained in uninitialized is either pointer or primitive. I think Ashish’s test shows it.

However, such a field creates an opportunity for mv-ctor. The referent object is usually on heap or in static memory.

If you replace the raw ptr field with a smart pointer, then your copier, op=, dtor etc can all become simpler, but this technique is somehow not so popular.

Note “pointer” has a well-defined meaning as explained in 3 meanings of POINTER + tip on q[delete this]

[17] prepare]advance for RnD career

Grandpa became too old to work full time. Similarly, at age 75 I may not be able to work 8 hours a day. Some job functions are more suitable for that age…

I guess there’s a spectrum of “insight accumulation” — from app developer to tuning, to data science/analysis to academic research and teaching. The older I get (consider age 70), the more I should consider a move towards the research end of the spectrum…

My master’s degree from a reputable university is a distinct advantage. Without it, this career choice would be less viable. (Perhaps more importantly) It also helps that my degree is in a “hard” subject. A PhD may not give me more choices.

For virtually all of these domains, U.S. has advantages over Singapore. Less “difficult/unlikely” in U.S.

In theory I could choose an in-demand research domain within comp science, math, investment and asset pricing … a topic I believe in, but in reality entry barrier could be too high, and market depth poor

Perhaps my MSFM and c++ investment don’t bear fruit for many years, but become instrumental when I execute a bold career switch.

 

FTE^contractor: lowest stress as family]US@@

After hibernation, for the first few years in the U.S. as a family, quite possibly my stress level (and wife’s) would grow to a record high.

Q: Would a contract or VP or AVP job give me stress relief?
A: Clearly, a contract job feels lighter, despite the lower salary and lower job security.

Boss personality is the biggest wildcard, but I would say job nature (esp. expectation) is crucial/vital. VPs are benchmarked and ranked, and AVP is similar but less visible. As contractor, I think renewal based on budget is the main appraisal.

I think the main stress so far is 1) PIP stress, including bonus-stigma and workload. In addition, my own stress profile includes another main stressor: 2) livelihood — brbr

Wife’s stress and kids’ stress don’t depend on FTE^contract choice.

— livelihood:
health insurance is a key (possibly main) benefit of FTE.

Job loss is rare and therefore a mild stress.

speed=wrong goal #aging pianist

傅聪 is an outspoken and articulate pianist. His words probably describe many aging pianists — at a certain age, you don’t have the “hardware capacity” to compete on speed.  Today I want to talk about speed.

  • — ^ GTD on the job:
    sense of urgency is good.
  • — ▼ code reading and figure things out
    As we age, this kind of speed will become tougher, so it’s wise to lower the expectation.
  • — ▼ jogging
    speed is the wrong goal and can backfire. My ultimate goal is endurance in heart and lung, weight loss etc.
  • — ▼ grandpa learning computer
    expectation should be lower.
  • — ▼ 最强大脑 had a contestant in his 70’s, aiming to recite the first 5000 digits of pi.
    I believe as we age, we need more refreshing (like DRAM) + more time

====coding interview

— ▼ online test cases !
To run any online test case, you first need (a lot of) ECT to “make it work”. That would take a lot of time, highly discouraging.

— ▼ time limit such as “pass N tests before alighting” — a recipe for self-disappointment, self-hate and unnecessary pressure

— ▼ speed coding IV: by default not appropriate for older guys like me, with exceptions.
ECT, syntax … are harder for older coders. However, for some people speed-coding practice can be anti-aging.

For codility etc, enlist friends.

I now prefer (OO) design questions and weekend assignment

CIV implement hash_table/vector

  • eg: Broadway Tech
  • eg: GS-sg
  • eg: AshS GrassHopper .. (shallow implement) interviewer didn’t come to the coding stage.
  • eg: SIG onsite: vector push_back
  • eg: GS-London: deque: no-code

WallSt interviewers have a broader tradition of onsite implementation of other standard components

  • eg: auto_ptr in eSpeed
  • eg: shared_ptr
  • eg: lockfree stack

I think many fellow candidates in my circle would fail at these questions, which are designed to reveal detailed understanding of the nitty-gritty. If I consider myself 50% shallow, then those guys are … 10% shallow?

Why is WallSt obsessed with it? It reveals efficiency in detailed implementation.

Why web2.0 seldom ask this? Not much algo involved.

cancel^thread.stop() ^interrupt^unixSignal

Cancel, interrupt and thread.stop() are three less-quizzed topics that show up occasionally in java/c++ interviews. They are fundamental features of the concurrency API. You can consider this knowledge as zbs.

As described in ##thread cancellation techniques: java #pthread,c#, thread cancellation is supported in c# and pthreads, whereas java indirectly supports it.

— cancel and java thread.stop() are semantically identical but java thread.stop() is forbidden and unsafe.

PTHREAD_CANCEL_ASYNCHRONOUS is usable in very restricted contexts. I think this is similar to thread.stop().

— cancel and interrupt both define stop points. In both cases, the target thread can choose to ignore the cancellation/interrupt request, or check it at the stop points.

Main difference between cancel and interrupt ? Perhaps just the wording. In pthreads there’s only cancel, no interrupt, but in java there’s no cancel.

https://stackoverflow.com/questions/16280418/pthread-cancel-asynchronous-cancels-the-whole-process

Note interrupted java thread probably can’t resume.

Not sure if Unix signal handling also supports stop points.

Java GC on-request is also cooperative. You can’t force the GC to start right away.

Across the board, the only immediate (non-cooperative) mechanism is power loss. Even a keyboard “kill” is subject to software programmed behavior, typically the OS scheduler. 

comfort,careerMobility: CNA+DeepakCM #carefree

https://www.channelnewsasia.com/news/commentary/mid-career-mobility-switch-tips-interview-growth-mindsets-11527624 says

“The biggest enemy of career mobility is comfort … Comfort leads us to false security and we stop seeking growth, both in skills and mindset agility. I see all the time, even amongst very successful senior business people, that the ones who struggle with career advancement, are the ones whose worlds have become narrow – they engage mainly with people from their industry or expertise area, and their thinking about how their skills or experience might be transferable can be surprisingly superficial.”

The comfort-zone danger … resonates with Deepak and me.

— My take:

The author cast a negative light on comfort-zone, but comfort is generally a good thing. Whereas, admittedly, those individuals (as described) pay a price for their comfort-zones, there are 10 times more people who desire that level of comfort, however imperfect this comfort is.

Comfort for the masses means livelihood. For me, comfort has the additional meaning of carefree.

Whereas the author’s focus is maximum advancement, not wasting one’s potential i.e. FOLB and endless greed, our goal is long-term income security (including job security [1]). This goal is kinda holy grail for IT professionals.  Comfort as described in CNA is temporary, not robust enough. Therefore, I see three “states of comfort in livelihood”

  • Level 1 — no comfort. Most people are in this state. Struggling to make ends meet, or unbearable commute or workload (my GS experience)
  • Level 2 — short (or medium) term comfort, is the subject of the CNA quote.  Definition of “short-term” is subjective. Some may feel 10Y comfort is still temporary. Buddhists would recognize the impermanence in the scheme.
  • Level 3 — long-term comfort in livelihood. Paradoxically, a contingency contractor can achieve this state of livelihood security if he can, at any time, easily find a decent job, like YH etc. I told Deepak that on Wall St, (thanks to dumb luck) Java provides a source of long-term comfort and livelihood security. Detachment needed!

[1] income security is mostly job security. Fewer than 1% of the population can achieve income security without job security. These lucky individuals basically have financial freedom. But do take into account (imported) inflation, medical, housing, unexpected longevity,,

Deepak pointed out a type of Level-2 comfort — professional women whose main source of meaning, duty, joy is the kids they bring up. For them, even with income instability, the kids can provide comfort for many years.

Deepak pointed out a special form of Level-3 carefree comfort — technical writers. They can have a job till their 80’s. Very low but stable demand. Very little competition. Relatively low income.

Deepak felt a key instability in the IT career is technology evolution (“churn” in my lingo), which threatens to derail any long-term comfort. I said the “change of the guard” can be very gradual.

— Coming back to the carefree concept. I feel blessed with my current carefree state of comfort. Probably temporary, but pretty rare.

Many would point to my tech learning, and challenge me — Is that carefree or slavery to the Churn? Well, I have found my niche, my sweet spot, my forte, my 夹缝, with some moat, some churn-resistance.

Crucially, what others perceive as constant learning driven by survival instinct, is now my lifelong hobby that keeps my brain active. Positive stress in a carefree life.

The “very successful senior business people”, i.e. high-flyers, are unlikely to be carefree, given the heavy responsibilities. Another threat is the limelight, the competition for power, glory and riches. In contrast, my contractor position is not nearly as desirable or enviable, similar to the technical writers Deepak pointed out.

2^(10^9) imt number@atoms]universe

One coding question says that a DNA is represented by a large binary_number denoted B, which is treated as a lengthy bit_array. The positions of the non-zero bits are saved in an unsigned integer array. For example 0101 -> [2,0],  1100 -> [3,2], 1000 0010 -> [7,1]. Then it went on to say that a value in the integer array can be as high as 10^9, denoted 1000M (i.e. 1000 million).

That means the underlying bit_array can have array length = 1000M. So the max binary_number is more than 2^(10^9). What kind of /mind-boggling/ large binary_number is that? Well, we know it has a billion bits, rather than 64. To back up this single number B, we would need many tapes.

I decided to compare it to the number of atoms in the universe, denoted E, as a practical, everyday application of high school math. The latter number is about 10^80.

— first try
Taking a ratio: B/E = 2^(10^9) / 10^80 … is still too hard to estimate, so take the base-10 logarithm (my favorite technique):

log10(B/E) = 10^9 * log10(2) – 80 ≅ 10^9 * 0.301 – 80 ≅ 301,000,000 – 80 ≅ 300,000,000

Even if the number of atoms in the universe is taken at the higher limit of 10^82, my estimate still holds.

So the large binary_number B is so large that after dividing it by the numberOfAtomsInUniverse, the ratio is a one followed by 300 million zeros.

Frankly, the size of this ratio is still mind-boggling. So I would say, the large binary_number B is more than 100 times larger than the numberOfAtomsInUniverse.

— Let me retry to compare the two number without logarithm.
2^(10^9)  = (2^10)^100M = 1024^100M ≅ 1000^100M = 10^(3*100M) = 10^300M

B/E ≅ 10^300M/10^80 = 10^(300M-80) basically same as 10^300M.

300M is roughly the U.S. population. If the U.S. population declines by 80, it makes no difference.
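A quick machine check of the same logarithm estimate (a throwaway sketch; the function name is mine):

```cpp
#include <cassert>
#include <cmath>

// log10(B) = log10( 2^(10^9) ) = 10^9 * log10(2) ≈ 301,029,995 ≈ 300M.
// Subtracting log10(E) = 80 barely changes it, confirming B/E ≈ 10^300M.
long long log10OfB() {
    return static_cast<long long>(1e9 * std::log10(2.0));
}
```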

 

[20]what protects family livelihood: IV^GTD skill #AshS

Hi Ashish,

After talking to you about PIP (i.e. performance improvement process) in my past and in future scenarios, and considering my financial situation (wife not working, 2 young kids + 2 elderly grandparents) over the 20Y horizon , I came up with this question —

Q: Between two factors: AA) my competitive position on the tech hiring market, BB) job security at MLP, which factor has more impact on my family livelihood?

Answer: I remain as convinced now as 10 years ago: AA is the dominant factor. I won’t allow myself to rely on some employer to provide my family a stable income for 20Y, even if I do a good job. There are countless stories among my peers who worked hard but lost jobs.

Answer (during covid19 mass layoff): I’m even more convinced that AA is the dominant factor. MLP is doing well, but MLP owner is not my dad. See my email to my sister sent on 19 Aug.

If I do a good job in the current team, what’s the chance of keeping this job for 10Y? 10%? There are individuals (like my manager) who stay in one team for 10+ years, but I think anyone like him has witnessed dozens of coworkers who have no choice but leave, for various reasons (not counting those who had a choice to stay but hopped higher elsewhere.)

That’s the basic answer to my opening question, but there are a few important sub-factors to point out.

Family livelihood includes housing, medical and education. In the U.S., I would incur $3k/M rental + 2k/M health insurance. Therefore, livelihood in the U.S. is more precarious, less secure.

My Health — is a big hidden factor. Stamina and mental capacity have a direct impact on our “performance” in the competition, both on the job market and on-the-job. I think you probably need a lot of physical and mental energy, stamina,,, to deep dive into an unfamiliar local system or codebase, to become so confident, right?

company stability — is a sub-factor of BB. Some investment banks (GS, Barclays, MS) are known to aggressively cut headcount even in profitable years, to stay lean and mean.

Aging — is a factor affecting AA slightly more than BB. Age discrimination is something I seem to notice more as I grow older. So far I’m able to maintain my “competitive fitness” on the job market. If I rely on BB too much as I age, then I believe I would pay less attention to AA, and grow older and weaker. To strengthen the foundation of my family livelihood as I age, I tell myself to see the reality — as I age I would face a less friendly job market + instability on any job. Therefore I need to give priority to AA, by maintaining/improving my tech skills for competitive interviews.

Demand — for developers continues to grow in the job markets. This is a fundamental reason why AA is so valuable and reliable. This robust demand doesn’t help BB at all.

Overall, my view is biased in favor of AA. This is deliberate. With PIP or without PIP, any high-paying tech job (like yours or mine) comes with an expectation and risk of job loss. AA is the parachute.

— A follow-up mail
My employer is very successful this year. Nevertheless, I never assume my employer’s success, strength and stability gives me an iron rice bowl. Things can turn bad for any reason, at any time, but I always have 3 dependents to feed (my wife stopped working last year).

So my answer remains — AA is the dominant factor.
However secure my employer is, I am fundamentally insecure.  Reality is — just about every job is insecure. Paradoxically, the more we see through the smoke and mirror and recognize the inherent insecurity in any job, the more cool-headed we can become, the more likely we can find motivation to invest in our own competitiveness on the job market.
I don’t know why so few thought leaders point out the truth that’s fundamental in many careers — that tech interview skill is the most important survival skill for many developers like us, in good times or bad times.
What if you are a very senior developer, perhaps a library developer, or some well-known open source contributor? I guess the next hiring manager would still use some screening test! Your previous value-add is not easily portable, so you must learn a new codebase, and grow your value-add from scratch. Your value-add is what the next employer pays for.

some realities4China developers #wage,age

Above 35 … considered too old for a coding job, screened out by many hiring teams. Most guys above 35 need to move into mgmt or consider something else or somewhere else.

— wage:
In Shanghai, above RMB 30k/M is hard (Jason Fu). MS fresh grad (probably in Shanghai) RMB 300k/Y, then slow increment towards 500k/Y

My in-law was an IT manager in Shanghai. CNY 400k is achievable, but CNY 600k is tough. See https://bintanvictor.wordpress.com/wp-admin/post.php?post=5321&action=edit&classic-editor

I can see the livelihood stress level.

— 996 culture — 9am-9pm, six days a week. I think this is fairly realistic, not exaggerated.
“Ten years ago, people rarely complained about 996”, said one insider on https://www.nytimes.com/2019/04/29/technology/china-996-jack-ma.html

An engineer’s day at one of China’s tech giants started at 8:00 a.m., and should have ended at 8:00 p.m., but he said that nobody wanted to be the first to leave, for fear of being labeled unprofessional or uncommitted. Reported on https://www.cbsnews.com/news/chinese-workers-are-starting-to-rebel-against-grueling-9-9-6-workplace-culture/

references: primarily used as params

Many books talk about references used as field, or return values, or perhaps payload saved in containers.

Many interviews also touch on these scenarios.

However, in my projects, 99.9% of the usage of reference is function parameters, both const and non-const.

  • if your goal is GTD, then focus on the primary usage
  • if your goal is IV or “expertise”, then don’t limit your learning to the primary usage.

3stressors: FOMO^PIP^ livelihood[def1]

  • PIP
  • FOMO/FOLB including brank envy
  • burn rate stress esp. the dreaded job-loss

jolt: FSM dividend has never delivered the de-stressor as /envisioned/. In contrast, my GRR has produced /substantial/ nonwork income, but still insufficient to /disarm/ or blunt the threat of PIP ! In my experience, Job-loss stressor is indeed alleviated by this income or the promise thereof 🙂

Excluding the unlucky (broken, sick,,) families, I feel most “ordinary” people’s stress primarily comes from burn rate i.e. making ends meet, including job-loss fear. Remember the OCBC pandemic survey: 70% of Singaporeans can’t last beyond 6M if jobless. I feel the middle class families around me could survive at a much lower theoretical burn rate of SGD 3.5-4.5k (or USD 6k perhaps… no 1st-hand experience) …. but they choose the rat race of keeping up with the Joneses (FOMO). Therefore, their burn rate becomes 10k. See also SG: bare-bones ffree=realistic #WBank^wife^Che^3k and covid19$$handout reflect`Realistic burn rate

For some, FOLB takes them to the next level — bench-marking against the high-flyers.

—– jolt: PIP^job-loss fear

Note Many blogposts (not this one) explore FOMO^livelihood.

For the middle class, any burn rate exceeding 3k is a real (albeit subconscious) stressor because the working adults now need to keep a job and build up a job-loss contingency reserve. Remember Davis Wei….3-month is tough for him? How about 30 years? In a well-publicized OCBC survey during covid19, most Singaporean workers said they couldn’t last 6M.

With a full time job, salaried workers experience a full spectrum of stressors including PIP. PIP would be impotent/toothless if the job is like a hobby. I would say very few people have such a job.

Aha .. Contract career is free of PIP.

For me (only me), job loss is a lighter stressor than PIP fear. In fact, I don’t worry about end of contract [2] and bench time. I worry more about a humiliating bonus. I’d rather lose a contract job than receive a token bonus after a PIP.

I think PIP is the least universal, shared stressor among the three stressors[1]. Even though some percentage of my fellow IT professionals have experienced PIP, they seem to shrug it off. In contrast, I lick my wound for years, even after it turns into a permanent scar. Most people assume that my PIP fear was fundamentally related to cashflow worry, but I am confident about job hunting. So my PIP fear is all about self-esteem and unrelated to cashflow.

[1] In the covid19 aftermath (ongoing), SG government worry mostly about job loss i.e. livelihood. Next, they worry about career longevity, in-demand skills, long-term competitiveness, housing, healthcare and education… all part of the broader “livelihood” concept. As parents, we too worry about our kids’ livelihood.

[2] Because I have a well-tested, strong parachute, I’m not afraid of falling out (job loss)

Q: imagine that after Y2, this job pays me zero bonus, and boss gives some explicit but mild message of “partial meet”. Do I want to avoid further emotional suffering and forgo the excellent commute + flexible hours + comfortable workload + hassle-free medical benefit?
A: I think Khor Siang of Zed would stay on. I think ditto for Sanjay of OC/MLP. Looking at my OC experience, I think I would stay on.

— PIP^benchtime experience: My PIP experience was traumatic enough to leave permanent scars, but my unpaid benchtime experiences left no scars:
* before Polaris, after layoff from OceanLake
* between 95G and Barcap
* between RTS and mvea

— what are (not) part of “livelihood” concerns. These clarifications help define “livelihood/生计”

  • housing — smallish, but safe, clean home is part of livelihood
  • healthcare — polyclinic, TCM, public healthcare system in Malaysia … are important components of an adequate healthcare infrastructure, which is livelihood. Anything beyond is luxury healthcare
  • commute to work/school — 1.5H commute is still acceptable. in 1993 I had a 1.5 hour commute to HCJC. A desire for a shorter commute is kinda luxury, beyond livelihood.
  • job security for those of you aged 40-65 — is NOT a livelihood concern if you already have enough nonwork income to cover basic living expenses. Job is really a luxury, for joy, occupation, contribution. Consider grandpa.
    • job security above 65 — is clearly NOT livelihood, unless there’s insufficient retirement income.
  • Life-chances — are more about livelihood and less about FOMO.

— Deepak’s experience

Deepak converted from contractor to perm in mid 2014, but on 30 Oct 2014, lost his job in the UK. He sat on the bench for thirteen months and started working in Nov 2015, in the U.S. This loss of income was a real blow, but in terms of the psychological scars, I think the biggest were 1) visa 2) job-interview difficulties. He didn’t have a PIP scar.

 

op=(): java cleaner than c++ #TowerResearch

A Tower Research interviewer asked me to elaborate why I claimed java is a very clean language compared to c++ (and c#). I said “clean” means consistency and fewer special rules, such that regular programmers can reason about the language standard.

I also said python is another clean language, but it’s not a compiled language so I won’t compare it to java.

See c++complexity≅30% mt java

— I gave interviewer the example of q[=]. In java, this is either content update at a given address (for primitive data types) or pointer reseat (for reference types). No ifs or buts.

In c++ q[=] can invoke the copy ctor, move ctor, copy assignment, move assignment, cvctor (conversion ctor), or OOC (conversion operator).

  • for a reference variable, its meaning is somewhat special  at site of initialization vs update.
  • LHS can be an unwrapped pointer… there are additional subtleties.
  • You can even put a function call on the LHS
  • cvctor vs OOC when LHS and RHS types differ
  • member-wise assignment and copying, with implications on STL containers
  • whenever a composite object has a pointer field, the q[=] implementations could be complicated.  STL containers are examples.
  • exception safety in the non-trivial operations
  • implicit synthesis of move functions .. many rules
  • when RHS is a rvalue object, then LHS can only be ref to const, nonref,,,
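A small sketch showing a few of these distinct meanings of q[=] in action (the contrived Wrapper class is mine, purely for illustration):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Wrapper exists only to demonstrate cvctor and OOC on one tiny class.
struct Wrapper {
    std::string payload;
    Wrapper(const char* s) : payload(s) {}           // cvctor: const char* -> Wrapper
    operator std::string() const { return payload; } // OOC: Wrapper -> string
};

bool demoAssignMeanings() {
    std::string a = "x";    // q[=] here is initialization, not assignment
    std::string b("y");
    b = a;                  // copy assignment
    b = std::move(a);       // move assignment; a left valid-but-unspecified
    Wrapper w = "hello";    // q[=] invokes the conversion ctor
    std::string s = w;      // q[=] invokes the conversion operator (OOC)
    return s == "hello" && b == "x";
}
```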

unrolled linked list with small payload: d-cache efficiency

This blogpost is partly based on https://en.wikipedia.org/wiki/Unrolled_linked_list

This uniform-segment data structure is comparable to deque and can be a good talking point in interviews.

Unrolled linked list is designed as a replacement for vanilla linked list , with 2 + 1 features

  1. mid-stream insert/delete — is slightly slower than linked list, much better than deque, which is efficient only at both ends.
  2. traversal — d-cache efficiency during traversal, esp. with small payloads
    • This is the main advantage over vanilla linked list

There’s a 3rd API operation, somewhat uncommon for linked list

  1. jump by index — quasi-random access. Almost constant-time. Count of non-dummy elements in each segment is non-uniform.  See [17]deque with uneven segments #GS

— comparison with deque, which has a fixed segment length, except the head and tail segments

  • Deque offers efficient insert/delete at both ends only. Mid-stream insert/delete would require shifting, just as in vector
  • Deque offers O(1) random access by index, thanks to the fixed segment length
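To make the contrast concrete, here is a bare-bones sketch of an unrolled-list segment (capacity and names are illustrative, not a real implementation):

```cpp
#include <cassert>

// Each segment packs several payloads contiguously, so traversal touches far
// fewer cache lines than a vanilla linked list of single ints.
struct Segment {
    static const int kCap = 8; // chosen so a node spans roughly a cache line
    int count = 0;             // non-uniform across segments, unlike deque
    int payload[kCap];
    Segment* next = nullptr;
};

int totalElements(const Segment* head) {
    int n = 0;
    for (const Segment* s = head; s; s = s->next)
        n += s->count;         // one pointer chase per kCap payloads
    return n;
}
```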

[20] charmS@slow track #Macq mgrs#silent majority

another critique of the slow track.

My Macq managers Kevin A and Stephen Keith are fairly successful old-timers. Such an individual would hold a job for 5-10 years, growing in know-how and effectiveness (influence,,,). Once every few years they move up the rank. In their eyes, a job hopper or contractor like me is hopelessly left on the slow track — a rolling stone gathers no moss.

I would indeed have felt that way if I had not gained the advantages of burn rate + passive incomes. No less instrumental are my hidden advantages like

  • relatively carefree hands-on dev job, in my comfort zone
  • frugal wife
  • SG citizenship
  • stable property in HDB
  • modest goal of an average college for my kids
  • See also G5 personal advantages: Revealed over15Y

A common cognitive/perception mistake is missing the silent majority of old timers who don’t climb up. See also …

read/write volatile var=enter/exit sync block

As explained in 3rd effect@volatile introduced@java5

  • writing a volatile variable is like exiting a synchronized block, flushing all temporary writes to main memory;
  • reading a volatile variable is like entering a synchronized block, reloading all cached shared mutables from main memory.

http://tutorials.jenkov.com/java-concurrency/volatile.html has more details.

https://stackoverflow.com/questions/9169232/java-volatile-and-side-effects also addresses “other writes“.
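In c++ terms (my analogy, not from the linked pages), the same two bullets map onto release/acquire atomics — plain writes before the release become visible to the thread that acquires:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> ready{false}; // plays the role of the volatile variable
int plainShared = 0;            // a non-volatile "other write"

int releaseThenAcquire() {
    std::thread writer([] {
        plainShared = 42;                             // plain write first
        ready.store(true, std::memory_order_release); // like a volatile write
    });
    while (!ready.load(std::memory_order_acquire)) {} // like a volatile read
    int seen = plainShared; // guaranteed 42: the acquire saw the release
    writer.join();
    return seen;
}
```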

denigrate%%intellectual strength #ChengShi

I have a real self-esteem problem as I tend to belittle my theoretical and low-level technical strength. CHENG, Shi was the first to point out “你就是比别人强” (you really are stronger than the rest).

  • eg: my grasp of middle-school physics was #1 strongest across my entire school (a top Beijing middle school) but I often told myself that math was more valuable and more important
  • eg: my core-java and c++ knowledge (QQ++) is stronger than most candidates (largely due to absorbency++) but i often say that project GTD is more relevant. Actually, to a technical expert, knowledge is more important than GTD.
  • eg: I gave my dad an illustration — medical professor vs GP. The Professor has more knowledge but GP is more productive at treating “common” cases. Who is a more trusted expert?
  • How about pure algo? I’m rated “A-”, stronger than most, but pure algo has far lower practical value than low-level or theoretical knowledge. Well, this skill is highly sought-after by many world-leading employers.
    • Q: Do you dismiss pure algo expertise as worthless?
  • How about quant expertise? Most of the math has limited and questionable practical value, though the quants are smart individuals.

Nowadays I routinely trivialize my academic strength/trec relative to my sister’s professional success. To be fair, I should say my success would look more admirable if measured against an objective standard.

Q: do you feel any IQ-measured intelligence is overvalued?

Q: do you feel anything intellectual (including everything theoretical) is overvalued?

Q: do you feel entire engineering education is too theoretical and overvalued? This system has evolved for a century in all successful nations.

The merit-based immigration process focuses on expertise. Teaching positions require expertise. When laymen know you are a professional they expect you to have expertise. What kind of knowledge? Not GTD but a published body of jargon and “bookish” knowledge based on verifiable facts.

edu credential: I beat most@those brank guys

In terms of college brand recognition and exclub status, most of those “brank” guys are actually at or below my level

  • shuo
  • Jiang.Zhu
  • Zhao
  • LN.Qiao
  • Mo.Zhu
  • those MDs

I tend to dismiss this educational achievement but if I reverse role with one of the brank guys, and look at Tan Bin’s UChicago achievement, I would recognize

  1. [h] he achieved it at age 43
  2. [h] his GPA
  3. [h] he never copied homework of others, and others often copied his homework
  4. [h] he has one of the highest class attendance rates
  5. [h] Most students took 2 Pass/Fail but he took only one.
  6. [h] most students took about 1 year to finish the program, but he took it over 2.5 years, and learned more.
  7. UChicago has heavy homework and non-trivial exams
  8. [h=hard facts, not subject to interpretation]

scan an array{both ends,keep`min/max #O(N)

I feel a reusable technique is

  • scan the int array from Left  and keep track of the minimum_so_far, maximum_so_far, cumSum_so_far. Save all the stats in an array
  • scan the int array from right and keep track of the minimum_so_far, maximum_so_far, cumSum_so_far. Save all the stats in an array
  • save the difference of min_from_left and max_from_right in an array
  • save the difference of max_from_left and min_from_right in an array

With these shadow arrays, many problems can be solved visually and intuitively, in linear time.

eg: max proceeds selling 2 homes #Kyle

How about the classic max profit problem?

How about the water container problem?
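As one concrete instance, the classic max-profit problem falls out of the left-scan minimum_so_far shadow stat (a sketch; here the stat is kept as a running variable rather than a full array):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One buy then one sell: each day's best profit is price[i] - minSoFar,
// where minSoFar is the left-scan "shadow" statistic. Single linear pass.
int maxProfit(const std::vector<int>& price) {
    if (price.empty()) return 0;
    int best = 0, minSoFar = price[0];
    for (int p : price) {
        minSoFar = std::min(minSoFar, p);  // maintain the shadow stat
        best = std::max(best, p - minSoFar);
    }
    return best;
}
```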

range bump-up@intArray 60% #AshS

https://www.hackerrank.com/challenges/crush/problem Q: Starting with a 1-indexed array of zeros and a list of operations, for each operation add a “bump-up” value to each of the array element between two given indices, inclusive. Once all operations have been performed, return the maximum in your array.

For example, given array 10 of zeros . Your list of 3 operations is as follows:

    a b k
    1 5 3
    4 8 7
    6 9 1

Add the values of k between the indices a and b inclusive:

index->	 1 2 3  4  5 6 7 8 9 10
	[0,0,0, 0, 0,0,0,0,0, 0]
	[3,3,3, 3, 3,0,0,0,0, 0]
	[3,3,3,10,10,7,7,7,0, 0]
	[3,3,3,10,10,8,8,8,1, 0]

The largest value is 10 after all operations are performed.

====analysis

This (contrived) problem is similar to the skyline problem.

— Solution 1 O(minOf[N+J, J*logJ ] )

Suppose there are J=55 operations. Each operation is a bump-up by k, on a subarray. The subarray has left boundary = a, and right boundary = b.
Step 1: Sort the left and right boundaries. This step is O(N+J) by counting sort, or O(J·logJ) by comparison sort. A conditional implementation can achieve O(minOf[N+J, J*logJ]).

In the example, after sorting, we get 1 4 5 6 8 9.

Step 2: one pass through the sorted boundaries. This step is O(J).
Aha — the time complexity of this solution boils down to the complexity of sorting 2J small positive integers whose values are bounded by N.
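For comparison, the other well-known approach to this exact problem is the difference array — O(N+J) with no sorting at all (a sketch of that alternative, not of Solution 1):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Difference array: record +k at a and -k at b+1; a single prefix-sum pass
// then reconstructs every element while tracking the running maximum.
long long arrayManipulation(int n, const std::vector<std::vector<int>>& ops) {
    std::vector<long long> diff(n + 2, 0); // 1-indexed, with room for b+1
    for (const auto& op : ops) {
        int a = op[0], b = op[1], k = op[2];
        diff[a] += k;
        diff[b + 1] -= k;
    }
    long long running = 0, best = 0;
    for (int i = 1; i <= n; ++i) {
        running += diff[i];                // value of element i
        best = std::max(best, running);
    }
    return best;
}
```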

3overhead@creating a java stackframe]jvm #DQH

  • additional assembly instruction to prevent stack overflow… https://pangin.pro/posts/stack-overflow-handling mentions 3 “bang” instructions for each java method, except some small leaf methods
  • safepoint polling, just before popping the stackframe
  • (If the function call receives more than 6 arguments) put the first 6 args in registers and the remaining args on the stack. The ‘mov’ to the stack involves more instructions than to registers, and the subsequent retrieval from the stack likely hits L1 cache, slower than a register read.

age40-50career peak..really@@stereotype,brainwash,,

stereotype…

We all hear (and believe) that the 40-50 period is “supposed” to be the peak period in the life of a professional man. This expectation is created by the mass media (and social media such as LinkedIn) brainwash that presents middle-aged managers as the norm. If not a “manager”, then a technical architect or a doctor.

[[Preparing for Adolescence]] illustrates the peer pressure (+self-esteem stress) felt by the adolescent. I feel a Deja-vu. The notion of “normal” and “acceptable” is skewed by the peer pressure.

Q: Out of 100 middle-aged (professional or otherwise) guys, how many actually reach the peak of their career in their 40’s?
A: Probably below 10%.

In my circle of 40-somethings, the norm is plateau or slow decline, not peak. The best we could do is keep up our effort and slow down the decline, be it wellness, burn rate, learning capacity, income,,,

It’s therefore hallucinatory to feel left behind on the slow track.

Q: at what age did I peak in my career?
A: I don’t want to overthink this question. Perhaps towards the end of my first US era, in my late 30s.

I think middle-aged professional guys should read [[reconciliations]] by Theodore Rubin. The false expectation creates immense burden.

const data member initialization: simple on the surface

The well-known Rule 1 — a const data member must be initialized exactly once, no more no less.

The lesser-known Rule 2 — for class-type data member, there’s an implicit default-initialization feature that can kick in without us knowing. This default-init interacts with ctor initializer in a strange manner.

On a side note, [[safeC++]] P38 makes clever use of Rule 2 to provide primitive wrappers. If you use such a wrapper in place of a primitive field (non-const), then you eliminate the operational risk of “forgetting to initialize a non-const primitive field”.

The well-known Rule 3 — the proper way to explicitly initialize a const field is the ctor initializer, not inside ctor body.

The lesser-known Rule 4 — at run-time, once control passes into the ctor body, you can only modify/edit an already-initialized field. Illegal for a const field.

To understand these rules, I created an experiment in https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_misc/constFieldInit.cpp

— for primitive fields like int, Rule 2 doesn’t apply, so we must follow Rule 1 and Rule 3.

— for a class-type field like “Component”,

  • We can either leave the field “as is” and rely on the implicit Rule 2…., or
  • If we want to initialize explicitly, we must follow Rule 3. In this case, the default-init is suppressed by compiler.

In either case, there’s only one initialization per const field (Rule 1)
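A compilable sketch of the rules (class and field names are mine, for illustration only):

```cpp
#include <cassert>
#include <string>

// Component has a user-provided default ctor, so Rule 2 (implicit
// default-initialization) can kick in even for a const Component field.
struct Component {
    std::string label;
    Component() : label("default") {}
};

struct Holder {
    const int id;         // primitive: Rule 2 doesn't apply, so Rule 1 + Rule 3
    const Component comp; // class type: left "as is", default-init kicks in
    explicit Holder(int i) : id(i) { // Rule 3: ctor initializer, not ctor body
        // id = i; // Rule 4: illegal -- would modify an already-initialized
        //         // const field inside the ctor body
    }
};
```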

joinable instance of std::thread

[[effModernC++]] P 252 explains why in c++ joinable std::thread objects must not get destroyed. Such a destruction would trigger std::terminate(), therefore, programmers must make their std::thread objects non-joinable before destruction.

The key is a basic understanding of “joinable”. Informally, I would say a joinable std::thread has a real thread attached to it, even if that real thread has finished running. https://en.cppreference.com/w/cpp/thread/thread/joinable says “A thread that has finished executing code, but has not yet been joined is still considered an active thread of execution and is therefore joinable.”

An active std::thread object becomes unjoinable

  • after it is joined, or
  • after it is detached, or
  • after it is “robbed” via std::move()

The primary mechanism to transition from joinable to unjoinable is via join().
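The life cycle can be demonstrated directly (a minimal sketch):

```cpp
#include <cassert>
#include <thread>

// joinable after construction with a target function (even once the code has
// finished); unjoinable after join(), detach(), or being moved-from.
// Destroying a still-joinable std::thread would call std::terminate().
bool demoJoinable() {
    std::thread t([] { /* finishes almost immediately */ });
    bool wasJoinable = t.joinable();      // true even if already finished
    std::thread thief = std::move(t);     // t is "robbed", now unjoinable
    bool robbedIsJoinable = t.joinable(); // false
    thief.join();                         // primary joinable->unjoinable path
    bool afterJoin = thief.joinable();    // false
    return wasJoinable && !robbedIsJoinable && !afterJoin;
}
```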

std::thread key points

To become eligible for scheduling, a Java thread needs start(), but a c++ std::thread becomes eligible immediately after construction, i.e. once it is initialized with its target function.

For this reason, [[effModernC++]] dictates that between an int field and a std::thread field in a given class Runner, the std::thread field should be the last initialized in constructor. The int field needs to be already initialized if it is needed in the new thread.

Q1: Can you initialize the std::thread field in the constructor body?
A: yes unless the std::thread field is a declared const field

Now let’s say there’s no const field.

Q2: can the Runner copy ctor initialize the std::thread field in the ctor body, via move()?
A: yes provided the ctor parameter is non-const reference to Runner.
A: no if the parameter is a const reference to Runner. move(theConstRunner) would evaluate to a const rvalue reference, which can’t bind to a non-const rvr. std::thread ctor and op= only accept rvr, because std::thread is move-only

See https://github.com/tiger40490/repo1/tree/cpp1/cpp/sys_thr for my experiments.
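A minimal sketch of that ordering advice (Runner, limit etc are illustrative names, not from the book):

```cpp
#include <cassert>
#include <thread>

// The std::thread field is declared last, so `limit` is fully initialized
// before the new thread -- which reads it -- starts running.
class Runner {
public:
    explicit Runner(int n)
        : limit(n), worker([this] { result = limit * 2; }) {}
    ~Runner() { if (worker.joinable()) worker.join(); }
    int waitResult() {
        if (worker.joinable()) worker.join(); // join synchronizes the write
        return result;
    }
private:
    int limit;          // initialized first: declared before the thread
    int result = 0;
    std::thread worker; // declared (hence initialized) last
};
```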

##[18]G4qualities I admire]peers: !!status #mellow

I ought to admire my peers’ [1] efforts and knowledge (not their STATUS) on :

  1. personal wellness
  2. parenting
  3. personal finance, not only investment and burn rate
  4. mellowness to cope with the multitude of demands, setbacks, disappointments, difficulties, realities about the self and the competition
  5. … to be Compared to
    • zbs, portable GTD, not localSys
    • how to navigate and cope with office politics and big-company idiosyncrasies.

Even though some of my peers are not the most /accomplished/ , they make a commendable effort. That attitude is admirable.

[1] Many people crossing my path are … not really my peers, esp. those managers in China. Critical thinking required.

I don’t have a more descriptive title for this blogpost.

2011 white paper@high-perf messaging

https://www.informatica.com/downloads/1568_high_perf_messaging_wp/Topics-in-High-Performance-Messaging.htm is a 2011 white paper by some experts. I have saved the html in my google drive. Here are some QQ  + zbs knowledge pearls. Each sentence in the article can expand to a blogpost .. thin->thick.

  • Exactly under what conditions would TCP provide low-latency
  • TCP’s primary concern is bandwidth sharing, to ensure “pain is felt equally by all TCP streams“. Consequently, a latency-sensitive TCP stream can’t have priority over other streams.
    • Therefore, one recommendation is to use a dedicated network having no congestion or controlled congestion. Over this network, the latency-sensitive system would not be victimized by the inherent speed control in TCP.
  • to see how many received packets are delayed (on the receiver end) due to OOS, use netstat -s
  • TCP guaranteed delivery is “better later than never”, but latency-sensitive systems prefer “better never than late”. I think UDP is the choice.
  • The white paper features an in-depth discussion of group rate. Eg: one mkt data sender feeding multiple (including some slow) receivers.

 

j4 analyzing my perception@reality: c++/bigO..

Using words and numbers, am trying to “capture” my perceptions (intuitions + observations+ a bit of insights) of the c++/java job market trends, past and future. There’s some reality out there but each person including the expert observer has only a limited view of that reality, based on limited data.

Those numbers look impressive, but actually similar to the words — they are mostly personal perceptions dressed up as objective measurements.

If you don’t use words or numbers then you can’t capture any observation of the “reality”. Your impression of that reality [1] remains hopelessly vague. I now believe vague is the lowest level of comprehension, usually as bad as a biased comprehension. Using words + numbers we have a chance to improve our perception. Clarity is a basic improvement.

[1] (without words you can’t even refer to that reality)

My perceptions shape my decisions, and my decisions affect my family’s life chances.

My perceptions shape my selective listening. Gradually, actively, my perception would modify my “membrane” of selective listening! All great thinkers, writers update their membrane.

Am not analyzing reality. Instead, am basically analyzing my perception of the reality, but that’s the best I could do. I’m good at analyzing myself as an object.

Refusing to plan ahead because of high uncertainty is lazy, is pessimistic, is doomed. This applies to retirement planning, what-if scenario planning.

latency zbs in java: lower value cf c++@@

Warning — latency measurement gotchas … is zbs but not GTD or QQ

— My tech bet — Demand for latency QQ will remain higher in c++ than java

  • The market’s perception would catch up with reality (assuming java is really no slower than c++), but the catch-up could take 30 years.
  • the players focused on latency are unused to the interference [1] by the language. C++ is more free-wheeling
  • Like assembly, c++ is closer to hardware.
  • In general, by design Java is not as natural a choice for low latency as c++ is, so even if java can match c++ in performance, it requires too much tweaking.
  • related to latency is efficiency. java is a high-level language and less efficient at the low level.

[1] In the same vein, (unlike UDP) TCP interferes with data transmission rate control, so even if I control both sender and receiver, I still have to cede control to TCP, which is a kernel component.

— jvm performance tuning is mainstream and socially meaningful iFF we focus on
* machine saturation
* throughput
* typical user-experience response time

— In contrast, a narrow niche area is micro-latency as in HFT

After listening to FPGA, off-heap memory latency … I feel the arms race of latency is limited to high-speed trading only. latency technology has limited economic value compared to mobile, cloud, cryptocurrency, or even data science and machine learning.

Churn?

accu?

 

find all subArrSums divisibleByK

Q: modified slightly from Leetcode 974: Given an array of Signed integers, print all (contiguous, non-empty) subarrays having a sum divisible by K, including zero-sum subarrays.

https://github.com/tiger40490/repo1/blob/py1/py/algo_arr/subarrayDivisibleByK.py is my one-pass, linear time solution. I consider this technique an algoQQ. Without prior knowledge, a O(N) solution is inconceivable.

I received this problem in an Altonomy hackerrank. I think Kyle gave me this problem too.

So 1) Leetcode 2) Kyle 3) Altonomy featured this question.

===analysis

Sliding window? I didn’t find any use.

Key idea — My homegrown solution kSub3() shows more insight ! Treat array as array of deltas. For every level reached, compute its modulus-K and save it in a multimap.
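A minimal Python sketch of this level/multimap idea (my own sketch along the same lines, not the linked kSub3; names are mine). The running prefix sum is the "level"; two prefixes sharing the same modulus-K bracket a qualifying subarray:

```python
from collections import defaultdict

def subarrays_divisible_by_k(arr, k):
    """Return (start, end) index pairs of every contiguous non-empty
    subarray whose sum is divisible by k, including zero-sum subarrays."""
    seen = defaultdict(list)   # residue -> indices of prefixes at that "level"
    seen[0].append(-1)         # empty prefix, so subarrays starting at 0 count
    level = 0
    out = []
    for i, x in enumerate(arr):
        level += x                  # treat array as deltas; track the level
        r = level % k               # python % is non-negative for k > 0
        for j in seen[r]:           # equal residue => the gap sums to 0 mod k
            out.append((j + 1, i))
        seen[r].append(i)
    return out
```

The residue bookkeeping is one-pass O(N); printing every pair is of course bounded by the output size, which can be O(N^2).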

enumerate()iterate py list/str with idx+val

The built-in enumerate() is a nice optional feature. If you don’t want to remember this simple syntax, then yes you can just iterate over xrange(len(the_sequence))

https://www.afternerd.com/blog/python-enumerate/#enumerate-list is illustrated with examples.

— to enumerate backward,

Since enumerate() returns a one-shot iterator, which can’t be reversed directly, you need to convert it to a list first.

for i, v in reversed(list(enumerate(vec))): print(i, v)
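Both directions as a tiny self-checking snippet (also showing the optional start offset for the counter):

```python
vec = ['a', 'b', 'c']

# forward: idx+val in one loop, with an optional start offset
assert list(enumerate(vec, start=1)) == [(1, 'a'), (2, 'b'), (3, 'c')]

# backward: materialize first -- the enumerate iterator has no
# __reversed__ and can only be consumed once
assert list(reversed(list(enumerate(vec)))) == [(2, 'c'), (1, 'b'), (0, 'a')]
```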

c++nlg pearls: xx new to refresh old 知新而后温故

Is this applicable in java? I think so, but my focus here is c++.

— 温故而知新 is less effective at my level. thick->thin, reflective.

— 知新而后温故 — x-ref, thin->thick->thin learning.

However, the pace of learning new knowledge pearls could appear very slow and disappointing. 5% new learning + 95% refresh. In such a case, the main benefit and goal is the refresh. Patience and Realistic expectation needed.

In some situations, the most effective learning is 1% new and 99% refresh. If you force yourself to 2% new and 98% refresh, learning would be less effective.

This technique is effective with distinct knowledge PEARLS. Each pearl can be based on a sentence in an article but developed into a blogpost.

 

non-volatile field can have volatile behavior #DQH

Unsafe.getObjectVolatile() and setObjectVolatile() should be the only access to the field.

I think for an integer or bool field (very important use cases), we need to use Unsafe.putIntVolatile() and Unsafe.getIntVolatile()

Q: why not use a volatile field?
A: I guess in some designs, a field need not be volatile at most access points, but at one access point it needs to behave like a volatile field. Qihao agrees that we want to control when to insert a load/store fence.

Non-volatile behavior usually has lower latency.

 

half%%peers could be Forced into retirement #Honglin

Reality — we are living longer and healthier.

Observation — compared to old men, old women tend to have more of a social life and more involvement with grandchildren.

I suspect that given a choice, half the white-collar guys in my age group actually wish to keep working past 65 (or 70), perhaps at a lower pace. In other words, they (and I) are likely to retire not by choice. My reasoning for the prediction — regardless of financial needs, many in this group do not have enough meaningful, “engaging” things to do. Many would suffer in retirement.

It takes long-term planning to stay employed past 65.

I think most of the guys in this category do not prepare well in advance and will find themselves unable to find a suitable job.

(We won’t put it this way, but) These guys will be kinda forced into early retirement. The force could be health or in-demand skillset or …

— Honglin is a positive example. He had the sharp vision that he was forced into retirement.

observer@middle-aged SG workers#all-the-way

“Most of them in the 40s are already stable and don’t want to quit. Even though the pay may not be so good, they’re willing to work all the way[1]. It’s an easy-going life.”

The observer was comparing SG (white or blue collar) employees across age groups, and this is the brief observation of the 40-something.

This observation is becoming increasingly descriptive of me… Semi-retired on the back of my passive income streams. Easy life.

[1] I interpret “all the way” as all the way to retirement age, no change of direction, not giving in to boredom, sticking to the chosen career despite occasional challenges (pains, disappointments, setbacks).

local variables captured in nested class #Dropcopy

If an (implicitly final) local variable [1] is captured inside a nested class, where is the variable saved?

https://stackoverflow.com/questions/43414316/where-is-stored-captured-variable-in-java explains that the anonymous or local class instance has an implicit field to hold the captured variable !

[1] The local variable can be an arg passed into the enclosing function. It could be a primitive type or a reference type i.e. a heapy thingy.

The java compiler secretly adds this hidden field. Without this field, a captured primitive would be lost and a captured heapy would be unreachable when the local variable goes out of scope.

A few hours later, when the nested class instance needs to access this data, it relies on the hidden field.

 

lambda^anon class instance ] java

A java lambda expression is used very much like an instance of an anonymous class. However, http://tutorials.jenkov.com/java/lambda-expressions.html#lambda-expressions-vs-anonymous-interface-implementations pointed out one interesting difference:

The anonymous instance in the example can carry a field (state); a lambda expression cannot have such fields. A lambda expression is thus said to be stateless.

get collection sum after K halving operations #AshS

Q: given a collection of N positive integers, you perform K operations like “half the biggest element and replace it with its own ceiling”. Find the collection sum afterwards.

Note the collection size is always N. Note K (like 5) could exceed N (like 2), but I feel that case would be trivial.

====analysis====

This is a somewhat contrived problem.

I think O(N + K log min(N,K)) is pretty good if feasible.
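A hedged Python sketch of the obvious max-heap approach (function name is mine). This runs in O(N + K log N); the tighter bound above would need extra tricks, e.g. quickselecting only the top min(N,K) candidates before heapifying:

```python
import heapq

def sum_after_k_halvings(nums, k):
    """Replace the biggest element with ceil(biggest/2), k times,
    then return the collection sum. Max-heap via negated values."""
    heap = [-x for x in nums]
    heapq.heapify(heap)                             # O(N)
    for _ in range(k):                              # K pops/pushes: O(K log N)
        biggest = -heapq.heappop(heap)
        heapq.heappush(heap, -((biggest + 1) // 2)) # ceiling of half, for ints
    return -sum(heap)
```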

git | merge-commits and pull-requests

Key question — Q1: which commit would have multiple parents?

— scenario 1a:

  1. Suppose your feature branch brA has a commit hash1 at its tip; and master branch has tip at hashJJ, which is the parent of hash1
  2. Then you decide to simply q[ git merge brA ] into master

In this simple scenario, your merge is a fast-forward merge. The updated master would now show hash1 at the tip, whose only parent is hashJJ.

A1: No commit would have multiple parents. Simple result. This is the default behavior of git-merge.

Note this scenario is similar to https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges#rebase-and-merge-your-pull-request-commits

However, the github or bit-bucket pull-request flows don’t support it exactly.

— scenario 1b:

Instead of a simple git-merge, what about a pull request? A pull-request uses q[ git merge --no-ff brA ] which (I think) unconditionally creates a merge-commit hashMM on master.

A1: now hashMM has two parents. In fact, git-log shows hashMM as a “Merge” with two parent commits.

Result is unnecessarily complex. Therefore, in such simple scenarios, it’s better to use git-merge rather than pull request.

https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges explains the details.

— Scenario 2: What if ( master’s tip ) hashJJ is Not parent of hash1?

Now master and brA have diverged. I think you can’t avoid a merge commit hashMM.

A1: hashMM

— Scenario 3: continue from Scenario 1b or Scenario2.

3. Then you commit on brA again , creating hash2.

Q: What’s the parent node of hash2?
A: I think git actually shows hash1 as the parent, not hashMM !

Q: is hashMM on brA at all?
A: I don’t think so but some graphical tools might show hashMM as a commit on brA.

I think now master branch shows hashMM having two parents (hashJJ + hash1), and brA shows hash1 -> hash2.

I guess that if after the 3-way-merge, you immediately re-create (or reset) brA from master, then hash2’s parent would be hashMM.


Note

  • direct-commit on master is implicitly fast-forward, but merge can be fast-forward or non-fast-forward.
  • fast-forward merge can be replaced by a rebase as in Scenario 1a. Result is same as direct-commit.
  • no-ff merge (Scenario 1b) and 3-way merge (Scenario 2) both create a merge-commit.
  • git-pull includes a git-merge without --no-ff

Optiver coding hackathon is like marathon training

Hi Ashish,

Looking back at the coding tests we did together, I feel it’s comparable to a form of “marathon training” — I seldom run longer than 5km, but once a while I get a chance to push myself way beyond my limits and run far longer.

Extreme and intensive training builds up the body capacity.

On my own, it’s hard to find motivation to run so long or practice coding drill at home, because it requires a lot of self-discipline.

Nobody has unlimited self-discipline. In fact, those who run so much or take on long-term coding drills all have something beside self-discipline. Self-discipline and brute-force willpower are insufficient to overcome the inertia in every one of these individuals. Instead, the invisible force, the wind beneath their wings, is some form of intrinsic motivation. These individuals find joy in the hard drill.

( I think you are one of these individuals — I see you find joy in lengthy sessions of jogging and gym workout. )

Without enough motivation, we need “organized” practice sessions like real coding interviews or hackathons. This Optiver coding test could probably improve my skill level from 7.0 to 7.3, in one session. Therefore, these sessions are valuable.

[18]latency stat: typical sell-side DMA box:10 μs

(This topic is not GTD not zbs, but relevant to some QQ interviewers.)

https://www.youtube.com/watch?v=BD9cRbxWQx8 is a 2018 presentation.

  1. AA is when a client order hits a broker
  2. Between AA and BB is the entire broker DMA engine in a single process, which parses the client order, maintains order state, consumes market data and creates/modifies the outgoing FIX msg
  3. BB is when the broker ships the FIX msg out to exchange.

Edge-to-edge latency from AA to BB, if implemented in a given language:

  • python ~ about 50 times longer than java
  • java – can aim for 10 micros if you are really really good. Dan recommends java as a “reasonable choice” iFF you can accept 10+ micros. Single-digit microsecond shops should “take a motorbike not a bicycle”.
  • c# – comparable to java
  • FPGA ~ about 1 micro
  • ASIC ~ 400 ns

— c/c++ can only aim for 10 micros … no better than java.

The stronghold of c++, the space between java and fpga, is shrinking … “constantly” according to Dan Shaya. I think “constantly” is like the growth of Everest.. perhaps by 2.5 inches a year

I feel c++ is still much easier, more flexible than FPGA.

I feel java programming style would become more unnatural than c++ programming in order to compete with c++ on latency.

Kenneth of MLP said his engine gets a CMF-format order message from PM (AA) does some minimal checks and (BB) sends it as FIX to broker. Median latency from AA to BB is 40 micros.

— IPC latency

Shared memory beats TCP hands down. For an echo test involving two processes:

Using an Aeron same-host messaging application, 50th percentile is 250 ns. I think NIC and possibly kernel (not java or c++) are responsible for this latency.

Kenneth said shared memory latency (also Aeron same-host) is 1-4 micros measured between XX) PM writes the order object into shm AA) engine reads the order from shm.

AlmostIncreasing #AshS

Q: (from GrassHopper Nov 2020): given an int array (size < 100000), can you make the array strictly increasing by removing at most one element?

https://github.com/tiger40490/repo1/tree/cpp1/cpp/algo_arr has my solution tested locally.

====analysis: well-understood, simple requirement. Simple idea as implemented in sol2kill(). But there are many clumsy ways to implement this idea.

There are also several less-simple ideas that could be useful in other problems

— idea: scanning from both ends. When left pointer hits roadblock (stop rising) at CC, we know BB and CC may need to go, but AA is safe, so we can say max-from-left is AA or higher.

When the right pointer hits a roadblock at BB, then we know the only roadblock in existence is AA.BB.CC.DD. So min-from-right is DD or lower. So DD must exceed AA, and one of BB and CC must be strictly between AA/DD.

If the right pointer hits a roadblock far away from CC then it’s probably hopeless.

This idea is truly one-pass, whereas my simple idea is arguably two-pass.
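My simple idea, as a hedged Python transliteration (the tested C++ version lives in the repo; names here are mine): find the first violation, then test deleting either offender.

```python
def almost_increasing(nums):
    """Can the array be made strictly increasing by deleting
    at most one element?"""
    def strictly_increasing(skip):
        # check the array with index `skip` removed
        prev = None
        for i, x in enumerate(nums):
            if i == skip:
                continue
            if prev is not None and x <= prev:
                return False
            prev = x
        return True

    for i in range(1, len(nums)):
        if nums[i] <= nums[i - 1]:
            # first roadblock: one of the two offenders must go
            return strictly_increasing(i - 1) or strictly_increasing(i)
    return True   # already strictly increasing
```

Arguably two-pass (one scan to find the violation, one re-check per candidate deletion), but still O(N) since only the first violation spawns re-checks.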

SG dev salary: FB^banks

Overall, it’s a positive trend that non-finance employers are showing increasing demand for candidates at my salary level. More demand is a good thing for sure.

Even though these tech shops can’t pay me the same as MLP does, 120k still makes a good living.

— https://news.efinancialcareers.com/sg-en/3001699/salaries-pay-facebook-singapore  is a curated Aug-2020 review of self-reported salary figures on glassdoor.com

Mid-level dev base salary SGD 108k,  much lower than U.S. This is a vanilla dev role without any specialized skill mentioned (by the self-reporter) such as data science or security.

FB Singapore has 1000 headcount including devs, but I think the mix might be similar to the BAML mix in Harborfront — most of the devs are not in front office.

— google SG: https://news.efinancialcareers.com/sg-en/3001375/google-salaries-pay-bonuses-singapore is on google, but I find the dev salary figures unreliable.

https://www.quora.com/How-much-do-Google-Singapore-Software-Engineers-earn is another curated review.

— banking tech: https://news.efinancialcareers.com/sg-en/3000694/banking-technology-salaries-singapore is published under the same author but could be authored by someone else.

This EFC site is finance-centric, so this data is more detailed, better curated. Very close to my first-hand observations.

sponsored DMA

Context — a buy-side shop (say HRT) uses a DMA connection sponsored by a sell-side like MS (or Baml or Instinet) to access NYSE. MS provides a DMA platform like Speedway.

The HRT FIX gateway would implement the NYSE FIX spec. Speedway also has a FIX spec for HRT to implement. This spec should include minor customization on the NYSE spec.

I have seen the HPR spec. (HPR is like an engine running in Baml or GS or whatever.) The HPR spec seems to talk about customization for NYSE, Nsdq etc …re Gary chat.

Therefore, the HRT FIX gateway to NYSE must implement, in a single codebase,

  1. NYSE spec
  2. Speedway spec
  3. HPR spec
  4. Instinet spec
  5. other sponsors’ spec

The FIX session would be provided (“sponsored”) by MS or Baml, or Instinet. I think the HRT FIX gateway would connect to some IP address belonging to the sponsor like MS. Speedway would forward the FIX messages to NYSE, after some risk checks.

VWAP=a bmark^exectionAlgo

In the context of broker algos (i.e. execution algos offered by a broker), vwap is

  • A benchmark for a bulk order
  • An execution algo aimed at the benchmark. The optimization goal is to minimize slippage against this benchmark. See other blogposts about slippage.

The vwap benchmark is simple, but the vwap algo implementation is non-trivial, often a trade secret.
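The benchmark half really is simple arithmetic. A minimal sketch over (price, quantity) fills:

```python
def vwap(trades):
    """Volume-weighted average price over (price, quantity) pairs --
    the benchmark an execution algo's fills are measured against."""
    notional = sum(p * q for p, q in trades)
    volume = sum(q for _, q in trades)
    return notional / volume
```

Slippage for a buy order would then be (your average fill price − vwap); keeping that small over the day is the hard, secret part.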

Avichal: too-many-distractions

Avichal is observant and sat next to me for months. Therefore I value his judgment. Avichal is the first to point out I was too distracted.

For now, I won’t go into details on his specific remarks. I will simply use this simple pointer to start a new “thread”…

— I think the biggest distraction at that time was my son.

I once (never mind when) told grandpa that I want to devote 70% of my energy to my job (and 20% to my son), but now whenever I want to settle down and dive deep into my work, I feel the need and responsibility to adjust my schedule, cater to my son, and try to entice him to study a little bit more.

My effort on my son is like Driving uphill with the hand-brake on.

As a result, I couldn’t have a sustained focus.

gradle: dependency-jar refresh, cache, Intellij integration..

$HOME/.gradle holds all the jars from all previous downloads.

[1] When you turn on debug, you can see the actual download: gradle build --debug.

[2] Note IDE java editor can use version 123 of a jar for syntax check, but the command line compilation can use version 124 of the jar. This is very common in all IDEs.

When I make a change to a gradle config,

  • Intellij prompts for gradle import. This seems to be unnecessary re-download of all jars — very slow.
  • Therefore, I ignore the import. I think as a result, the IntelliJ java editor [2] would still use the previous jar version as if the old gradle config were in effect. I live with this because my focus is on the compilation.
  • For compilation, I use the gradle “build” action (probably similar to a command line build). Very fast but why? Because only one dependency jar is refreshed [3]
  • Gary used debug build [1] to prove that this triggers a re-download of specific jars iFF you delete the jars from $HOME/.gradle/caches/modules-2/files-2.1

[3] For a given dependency jar, “refresh” means download a new version as specified in a modified gradle config.

— in console, run

gradle build #there should be a ./build.gradle file

Is java/c# interpreted@@No; CompiledTwice!

category? same as JIT blogposts

Q: are java and c# interpreted? QQ topic — academic but quite popular in interviews.

https://stackoverflow.com/questions/8837329/is-c-sharp-partially-interpreted-or-really-compiled shows one explanation among many:

The term “interpreter” referencing a runtime generally means existing code interprets some non-native code. There are two large paradigms — parsing: read the raw source code and take logical actions; bytecode execution: first compile the code to a non-native binary representation, which requires far fewer CPU cycles to interpret.

Java originally compiled to bytecode, then went through an interpreter; now, the JVM reads the bytecode and just-in-time compiles it to native code. CIL does the same: The CLR uses just-in-time compilation to native code.

C# compiles to CIL, which the JIT then compiles to native; by contrast, Perl immediately compiles a script to bytecode, and then runs this bytecode through an interpreter.
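CPython itself follows that Perl-style flow — compile to bytecode, then interpret. The standard `dis` module makes the compiled form visible:

```python
import dis

def add(a, b):
    return a + b

# CPython compiled the body to bytecode when the def executed; the VM
# then interprets these instructions (stock CPython has no native JIT).
ops = [ins.opname for ins in dis.get_instructions(add)]
print(ops)   # includes a BINARY_* opcode for the addition
```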

bone health for dev-till-70 #CSY

Hi Shanyou,

I have a career plan to work as a developer till my 70’s. When I told you, you pointed out bone health, to my surprise.

You said that some older adults suffer a serious bone injury and become immobile. As a result, other body parts suffer, including weight, heart, lung, and many other organs. I now believe loss of mobility is a serious health risk.

These health risks directly affect my plan to work as a developer till my 70’s.

Lastly, loss of mobility also affects our quality of life. My mom told me about this risk 20 years ago. She has since become less vocal about this risk.

Fragile bones become more common when we grow older. In their 70’s, both my parents suffered fractures and went through surgeries.

See ## strengthen our bones, reduce bone injuries #CSY for suggestions.

available time^absorbency[def#4]:2 limiting factors

see also ## identify your superior-absorbency domains

Time is a quintessential /limiting factor/ — when I try to break through and reach the next level on some endeavor, I often hit a /ceiling/ not in terms of my capacity but in terms of my available time. This is a common experience shared by many, therefore easy to understand. In contrast, a more subtle experience is the limiting factor of “productive mood” [1].

[1] This phrase is vague and intangible, so sometimes I speak of “motivation” — not exactly same and still vague. Sometimes I speak of “absorbency” as a more specific proxy.

“Time” is used as per Martin Thompson.

  • Specific xp: Many times I took leaves to attend an IV. The time + absorbency is a precious combination that leads to breakthrough in insight and muscle-building. If I only provide time to myself, most of the time I don’t achieve much.
    • I also take leave specifically to provide generic “spare time” for myself but usually can’t achieve the expected ROTI.
  • Specific xp: yoga — the heightened absorbency is very rare, far worse than jogging. If I provide time to myself without the absorbency, I won’t do yoga.
  • the zone — (as described in my email) I often need a block of uninterrupted hours. Time is clearly a necessary but insufficient condition.
  • time for workout — I often tell my friends that lack of time sounds like an excuse given the mini-workout option. Well, free time still helps a lot, but motivation is more important in this case.
  • localSys — absorbency is more rare here than coding drill, which is more rare than c++QQ which is more rare than java QQ
  • face time with boy — math practice etc.. the calm, engaged mood on both sides is very rare and precious. I tend to lose my cool even when I make time for my son.
  • laptop working at train stations — like MRT stations or 33rd St … to capture the mood. Available time by itself is useless

exec algo: with-volume

— WITH VOLUME
Trade in proportion to actual market volume, at a specified trade rate.

The participation rate is fixed.

— Relative Step — with a rate following a step-up algo.

This algo dynamically adjusts aggressiveness (participation rate) based on the relative performance of the stock versus an ETF. The strategy participates at a target percentage of overall market volume, adjusting aggressiveness when the stock is significantly underperforming (buy orders) or outperforming (sell orders) the reference security since today’s open.

An order example: “Buy 90,000 shares 6758.T with a limit price of ¥2500. Work the order with a 10% participation rate, scaling up to 30% whenever the stock is underperforming the Nikkei 225 ETF (1321.OS) by 75 basis points or more since the open.”

If we notice the reference ETF has a 2.8% return since open and our 6758.T has a 2.05% return, then the engine would assume 6758.T is significantly underperforming its peers (in its sector). The engine would then step up the participation to 30%, buying more aggressively, perhaps using bigger and faster slices.

What if the ETF has dropped 0.1% and 6758.T has dropped 0.85%? This would be unexpected since our order is a large order boosting the stock. Still, the other investors might be dumping this stock. The engine would still perceive the stock as underperforming its peers, and step up the buying speed.
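The step-up logic above can be sketched as a tiny rate function (all names, the two-level step and the bare threshold test are my illustrative assumptions; a real engine is far more nuanced):

```python
def participation_rate(stock_ret, ref_ret, base=0.10, stepped=0.30,
                       threshold_bp=75, side='buy'):
    """Relative Step sketch: step up participation when the stock
    underperforms (buy) or outperforms (sell) the reference ETF
    since the open by at least threshold_bp basis points."""
    diff_bp = (stock_ret - ref_ret) * 10000   # relative return, in bp
    if side == 'buy' and diff_bp <= -threshold_bp:
        return stepped                        # underperforming: buy faster
    if side == 'sell' and diff_bp >= threshold_bp:
        return stepped                        # outperforming: sell faster
    return base
```

For example a stock up 2.0% against an ETF up 2.8% (−80 bp relative) steps a buy order up to the 30% rate; equal returns stay at the base 10%.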

Y alpha-geeks keep working hard #speculation #Ajay

Based on my speculation, hypothesis, imagination and a tiny bit of observation.

Majority of the effective, efficient, productive tech professionals don’t work long hours because they already earn enough. Some of them can retire if they want to.

Some percentage of them quit a big company or a high position, sometimes to join a startup. One of the reasons — already earned enough. See my notes on envy^ffree

Most of them value work-life balance. Half of them don’t just pay lip service to this value; they act on it.

Many of them still choose to work hard because they love what they do, or want to achieve more, not because they have no choice. See my notes on envy^ffree

fixtag-Num,fixtag-nickname,fixtag-Val #my jargon

A fixtag is not an atomic item. Instead, a fixtag usually comprises two parts, namely the identifier and the value.

The identifier is usually a fixtag-num.

Note the fixtag-name is not always unique! It’s more like a fixtag-descriptor or fixtag-nickname: not part of the actual wire message, merely a standard nickname.

“Payload” is not a good term. “pair” is kinda uncommon.
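To make the three terms concrete, a toy parser (tags 35/55/44 and their standard nicknames MsgType/Symbol/Price are real FIX; the function and the partial lookup table are my illustrations):

```python
# fixtag-nickname lookup: a dictionary on our side, never on the wire
NICKNAMES = {35: 'MsgType', 55: 'Symbol', 44: 'Price'}  # partial, illustrative

def parse_fields(raw, delim='\x01'):
    """Split raw SOH-delimited FIX fields into
    (fixtag-num, fixtag-Val, fixtag-nickname) triples."""
    out = []
    for field in raw.strip(delim).split(delim):
        num, _, val = field.partition('=')   # identifier '=' value
        out.append((int(num), val, NICKNAMES.get(int(num))))
    return out
```

Only the num=val pairs travel in the message; the nickname column is attached locally, which is exactly why two data dictionaries can disagree on names while agreeing on numbers.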