[19] zbs cf to QQ+GTD #compiler+syntax expertise

Why bother — I spend a lot of time accumulating zbs, in addition to QQ halos and localSys GTD

I have t_zbs99 and other categories/tags on my blogposts showcasing zbs (真本事/real expertise) across languages. Important to recognize the relative insignificance of zbs

  • #1 QQ — goal is career mobility. See the halo* tags. However, I often feel fake about these QQ halos.
  • #2 GTD — ( localSys or external tools …) goal is PIP, stigma, helping colleagues. Basic skill to Make the damn thing work. LG2 : quality, code smell, maintainability etc
  • #3 zbs — goal is self-esteem, respect, curiosity, expertise and “expert” status. By definition, zbs knowledge pearls are often not needed for GTD. In other words zbs is “Deeper expertise than basic GTD”. Scope is inherently vague but..
    • Sometimes I can convert zbs knowledge pearls to QQ halos, but the chance is lower than I wished, so I often find myself overspending on zbs. Therefore I consider the zbs topics a distant Number 3.
    • Zbs (beyond GTD) is required for architects, lead developers and decision makers.
    • Yi Hai likes to talk about self-respect
    • curiosity -> expertise — Dong Qihao talked about curiosity. Venkat said he was curious about many tech topics and built up his impressive knowledge.

I also have blog categories on (mostly c++) bulderQuirks + syntax tricks. These knowledge pearls fall under GTD or zbs.

Q: As I grow older and wealthier (Fuller wealth), do I have more reason to pursue my self-esteem, respect, curiosity etc?
A (as of 2020): no. My wealth is not so high, and I still feel a lot of insecurity about career longevity. QQ and GTD remain far more important.

highest leverage: localSys^4beatFronts #short-term

Q: For the 2018 landscape, what t-investments promise the highest leverage and impact?

  1. delivery on projects + local sys know-how
  2. pure algo (no real coding) — probably has the highest leverage over the mid-term (like 1-5Y)
  3. QQ
  4. –LG2
  5. portable GTD+zbs irrelevant for IV
  6. obscure QQ topics
  7. ECT+syntax — big room for improvement for timed IDE tests only, not relevant to web2.0 onsite interviews.
  8. Best practices — room for improvement for weekend IDE tests only, not relevant to web2.0 shops.

professional trader: start-up

Look at some young trader like Zheng of Macq.

I used to believe they have good career longevity. I understand they probably can’t trade full-time into their 40’s, because trading seems to be a young man’s game. I thought they could start their own prop trading business, with family money or a raised fund. Now I feel it’s not easy at all.

Q: At what age can they start on their own? Typically in their 30’s or 40’s. I interviewed with a Singapore trader who started his own FX trading firm. I think he was in his 40’s

Q: how much savings can they amass by then? I think their income is not much higher than coders’, so possibly 0.5 to 2 million. However, they also need to buy a home and raise kids.

Q: how much personal money can they use as risk capital? I feel up to a million. Neck was run by two former CIMB stock traders but had only SGD 3M in total capital.

Suppose they achieve a 10% return; the 300k gross profit must cover payroll, rent, market data and technology. If they have any outside money (say, from an uncle), they must distribute a decent return (like 6%) to compensate for the risk assumed by the uncle.

— hot money: most investors i.e. clients, including me, want the freedom to leave. Huge instability for the prop firm.

— competition: clients can choose from many different prop shops. Open competition.

— stress: full time trading for a big bank or hedge fund is stressful, but doing it for your own company is worse.

switch statement: C compiler optimization

I feel this could show up in an interview, either as a question or as a chance to showcase this knowledge.

Based on http://lazarenko.me/switch/, probably an ebook

clang and gcc are two of the most popular C compilers.

  • — 3 common implementations
  • Jump table — the simple and common implementation
  • if-elsif-elsif — probably least efficient in most cases, but can be the most efficient for a very small switch block.
  • binary search — is used for a switch block having sparse case values
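Below is a small C++ sketch of my own (not from the article): with optimization on, gcc/clang typically lower the dense switch to a jump table, while the sparse one tends to become a binary search or if-else chain. You can verify by inspecting q[g++ -O2 -S] output.

#include <cstdio>

int denseSwitch(int x) {            // contiguous case values 0..5: jump-table friendly
  switch (x) {
    case 0: return 10;
    case 1: return 11;
    case 2: return 12;
    case 3: return 13;
    case 4: return 14;
    case 5: return 15;
    default: return -1;
  }
}

int sparseSwitch(int x) {           // widely spaced case values: usually binary search or if-else chain
  switch (x) {
    case 7:      return 1;
    case 900:    return 2;
    case 52000:  return 3;
    case 700000: return 4;
    default:     return -1;
  }
}

int main() { std::printf("%d %d\n", denseSwitch(3), sparseSwitch(900)); }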

windows startup apps: eg@high-churn skill

I remember discovering and documenting knowledge pearls about how to manage windows startup apps and other windows usability features.

Now these knowledge pearls have lost their value, because Microsoft doesn’t bother to keep backward compatibility.

Some may question if this is a common malaise/problem/drawback of GUI apps in general. Perhaps Mac and Unix GUI systems also suffer from the same churn.

seq received too high #CapAmerica

Let’s first focus on the incoming sequence number.

The FIX software uses a ledger file to remember the expected sequence number. If at logon I expect 111 but receive 999, then I will NOT give up but simply request retrans.

Analogy – Captain America wakes up from his long sleep and has missed all the episodes of his favorite television drama. He has to /binge-watch/ all the missed episodes before he can understand the conversations. This analogy illustrates why the “received too high” scenario is common and expected.

How about the reversed mismatch? It’s rare — if the received seq number is unfortunately too LOW, then there is no hope of recovery. Something is seriously wrong, too serious to repair. In my test environment, I do hit this almost unrealistic scenario. We have to manually update the expected inbound sequence number in QuickFix’s ledger file.
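A minimal sketch of my own (not QuickFix code) of the logon-time decision, assuming too-high is recoverable via a resend request while too-low is fatal:

#include <cstdio>
#include <stdexcept>

void onLogonSeq(int expectedSeq, int receivedSeq) {
  if (receivedSeq == expectedSeq) return;           // normal case
  if (receivedSeq > expectedSeq) {
    // common and expected: we "slept" and missed messages -> request retransmission
    std::printf("sending ResendRequest from %d onward\n", expectedSeq);
    return;
  }
  // receivedSeq < expectedSeq: counterparty is behind our ledger -- not recoverable
  // by the protocol; typically fixed by hand-editing the ledger file
  throw std::runtime_error("inbound seq too low: manual intervention needed");
}

int main() { onLogonSeq(111, 999); }   // prints the resend-request line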

git | binary search for a change

git diff <commit> path/to/file

My shell function below is relevant.

rollback() { # restore the named files to their state at a given commit
  local target=$1 # some git commit id
  pushd /N/repos/MI_EPA/ || return
  git diff HEAD --name-status # show what is already modified, before the checkout
  set -x
  git checkout "$target" -- tbl/Generic/ReplaceOrder.tbl # bin/javagen.pl #tbl/MSF/NewOrder.tbl
  git diff HEAD --name-status # git diff alone won't work

  # the command above should reveal only the named files as changed

  git diff HEAD
  set +x
  popd
}

containers(!!VM)thrive in elastic cloud

I think containers beat VM in this game.

I think containers consume fewer resources (esp. memory) and are therefore faster to launch. One physical machine can host many more container instances than VM instances.

A container’s footprint (more likely disk space rather than memory) is usually below 100MB, but a VM takes gigabytes.

[[ A Comparative Study of Containers and Virtual Machines in Big Data Environment ]] is a 2018 IBM-led study. It shows

  • containers are much faster to boot up, probably more than 10 times faster. Bootup latency refers to the period from starting a VM or container to the time (right before starting any hosted application) that it can provide services to clients. This is an important factor that cloud service providers usually consider. There are several reasons. On one hand, a shorter bootup latency means a smaller latency to provide services, which brings better user experiences. On the other hand, faster bootup also allows more elastic resource allocation, since the resources do not have to be reserved for the VM or Docker that is starting up.
  • Each machine can run up to 100 active containers but at most half that many active VM instances.
  • If we create (i.e. boot up) idle instances, then each machine can support up to 1000 containers but only around 100 VM instances.
  • the amount of memory allocated to a container is very small at the beginning, and then increases (to, say, 10GB) based on the demands of the applications in the container. However, the VM instance uses 16GB from the very beginning till the end.
  • a container releases its memory after it finishes its workload, while a VM still holds the memory even after it becomes idle.
  • the authors concluded that “with the four big data workloads , Dockers container shows much better scalability than virtual machines.”

2 processes’ heap address spaces interleaved@@

The address space of a stack is linear within a process AA. Not true for AA’s heap address space. This address space can have “holes” (i.e. deallocated memory blocks) between two allocated blocks. However, how about these 3 consecutive blocks… would this be a problem?

Allocated block 1 belongs to process AA
Allocated block 2 belongs to process BB
Allocated block 3 belongs to process AA

I think several full chapters of [[linux kernel]] cover memory management. The brk() syscall is the key.

I think the actual addresses may be virtualized.

Kernel page table?

— why do we worry about holes?

  1. I think holes are worrisome due to wasted memory. Imagine you deallocate a huge array half the size of your host memory
  2. hunting for a large block among the holes can be time-consuming
  3. if your graph nodes (in a large data structure like linked lists, trees) are scattered then d-cache efficiency suffers.

— So why do we worry about interleaving?

If we need not worry, then interleaving may be permitted and exploited for efficiency.
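A minimal sketch of my own, only to illustrate the “holes” point (glibc behavior assumed): free the middle of three heap blocks and its address range becomes a gap between two live allocations. Note each process sees only its own virtual addresses, so BB’s blocks never appear in AA’s address space.

#include <cstdio>
#include <cstdlib>

int main() {
  void* b1 = std::malloc(1024);
  void* b2 = std::malloc(1024);
  void* b3 = std::malloc(1024);
  std::printf("b1=%p b2=%p b3=%p\n", b1, b2, b3);
  std::free(b2);                       // b2's range is now a hole between b1 and b3
  void* b4 = std::malloc(1024);        // a same-size request will likely reuse the hole;
  std::printf("b4=%p\n", b4);          // a much larger request cannot -- the fragmentation worry above
  std::free(b1); std::free(b3); std::free(b4);
}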

sys calls, cpu instructions for heap^stack

My book [[linux kernel]] P301 lists just 5 kernel functions for managing the heap of a given process. Only one of them, brk(), is implemented as a sys call.

Q: how is brk() implemented in terms of CPU instructions?
A: I think the complex logic is in software i.e. kernel functions. CPU executes the individual instructions of that logic

Q: does CPU have different instructions for heap vs stack?
%%A: I think cpu has special registers for stack management. Stack is probably simpler than heap, so CPU manages (most of) it very efficiently. See https://gribblelab.org/CBootCamp/7_Memory_Stack_vs_Heap.html

Q: how does CPU allocate for a given stack?
%%A: Perhaps it just increments a pointer (in some special register) to allocate a new stack frame?

Q: does CPU have instructions for allocating memory?
%%A: I don’t think so. Allocating is too high-level and involves multiple instructions. Allocation requires some data structures, which are maintained by kernel. The memory allocation data structure is itself in RAM, perhaps heap

Q: Kernel functions definitely use the stack, but does kernel use the free store?
%%A: I think so. That’s how kernel data structures can grow.
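A minimal sketch of my own (linux/glibc assumptions): sbrk(0) reports the current program break, and a small first malloc typically extends the break via the brk/sbrk syscall rather than mmap.

#include <cstdio>
#include <cstdlib>
#include <unistd.h>

int main() {
  void* before = sbrk(0);              // current end of the heap segment (program break)
  void* p = std::malloc(64 * 1024);    // below glibc's mmap threshold -> usually served by the brk heap
  void* after = sbrk(0);
  std::printf("break before=%p after=%p, moved by %ld bytes\n",
              before, after, (long)((char*)after - (char*)before));
  std::free(p);
}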

 

cloud-native^MSA

Given a requirement, you can go the traditional approach or go cloud-native. The latter approach is

  1. Containerized, probably using Docker,
  2. Dynamically orchestrated, usually using Kubernetes,
  3. Microservices-oriented

Source: Frequently Asked Questions (FAQ) – Cloud Native Computing Foundation

So I think MSA is a major part of cloud-native dev expertise.

—  churn, longevity

  • I feel linux container technology will continue to improve, perhaps incrementally, and become even more stable and efficient. Containers are useful even without the cloud or MSA.
  • Docker and Kubernetes ride the cloud wave but may be displaced by newer solutions
  • Cloud computing has proven effective and will grow further before stabilizing.
  • MSA and REST are trendy ideas and possibly /faddish/. They probably existed before cloud, and can be useful without the cloud.

 

serverless: part of cloud-native dev expertise

I feel serverless is the extreme form of cloud-native. A lot of cloud-native discussions refer to serverless “architecture” as a typical example of cloud-native, without naming it.

Below is Mostly based on this RedHat article

There are still servers (I think the author means “hosts”) in serverless, but they are abstracted away from app development. A cloud provider handles the routine work of provisioning, maintaining, and scaling the server infrastructure. Developers can simply package their code in containers for deployment. I guess the artifact in a deployment “package” is a container image.

With serverless, routine tasks such as managing the operating system and file system, security patches, load balancing, capacity management, scaling, logging, and monitoring are all offloaded to a cloud services provider.

— dynamic billing

“…when a serverless function[1] is sitting idle, it doesn’t cost anything.” I think in this case there’s no server host created, so no utilization of electricity, disk, bandwidth, memory etc.

[1] I think this means a functional unit of deployment.

— What are some serverless use cases?

Serverless architecture is ideal for 1) asynchronous, stateless apps that can be started instantaneously. Likewise, serverless is a good fit for use cases that 2) see infrequent, unpredictable surges in demand.

Think of a task like batch processing of incoming image files, which might run infrequently but also must be ready when a large batch of images arrives all at once. Or a task like watching for incoming changes to a database and then applying a series of functions, such as checking the changes against quality standards, or automatically translating them.

Serverless apps are also a good fit for use cases that involve 3) incoming data streams, chat bots, scheduled tasks.

cloud prefers linux to other kernels, due2container

https://www.cbtnuggets.com/blog/certifications/open-source/why-linux-runs-90-percent-of-the-public-cloud-workload claims that linux runs 90% of the public cloud workload. About 30 percent of the virtual machines that Microsoft Azure uses are Linux-based.

I would think the other 70% of virtual machines on Azure are Linux-free.

IBM CFO James Kavanaugh said in 2020:

“The next chapter of cloud will be driven by mission-critical workloads managed in a hybrid multi-cloud environment. This will be based on a foundation of Linux with Containers and Kubernetes.”

— containers without linux?

Containers are seen as a fundamental enabler/catalyst in cloud computing, but are they a linux-only feature?

Note a linux-vm on windows still uses linux. Is there a container implementation without linux? https://containerjournal.com/topics/container-ecosystems/5-container-alternatives-to-docker/ lists a few, but I think they are all way behind linux standard containers:

Other Container Runtimes

  • Windows Server Containers.
  • Linux VServer.
  • Hyper-V Containers.
  • Unikernels.
  • Java containers.

On Azure, there are many marketing jargon terms, and Microsoft tries to downplay the reliance on linux, so it’s harder to find out what windows native container solutions there are, but here are a few:

cloud-expertise for devs: G3 aspects

Mostly for interview, i.e. body-building, but also design best practices for the team lead and architect.

Are we assuming PaaS or IaaS? I feel more IaaS.

  1. persistence and storage — including file system and data store
  2. integration with other systems across the local network
  3. admin interface and real time config changes?

  — less relevant aspects:

  4. configuration management? App config data must not be local files. I think they should be deployed along with the binary. As such, this is more of a concern for devops.
  5. high availability, cluster, failover? LG. More of a concern for the system architect
  6. security? LG. More of a concern for the system architect

cloud: more relevant to system- than software-architects

Software architects worry about libraries, OO designs, software components, concurrency etc.

They also worry about build and release.

How about app integration? I think this can be a job for the system architect or the software architect. If the apps run on the same machine, then probably the software architect.

— 10 considerations to develop for cloud

https://www.developer.com/cloud/ten-considerations-for-realizing-the-potential-of-the-cloud.html was a 2015 short article written by a java developer. I like the breakdown into 10 items.

Some of the 10 items are more devops than developer considerations. The other items are more for system architect than software architect.

However, hiring managers expect a senior candidate like me to demonstrate this QQ knowledge. By definition, QQ is not needed on the job.

 

stay relevant2new economy#cloud, ML,bigData

  • cloud, edge computing, virtualization,,,
  • AI, ML,
  • big data, data analytics
  • MSA, REST

Q: what’s the risk to my dev-till-70 if I choose to pay minimal attention to anything irrelevant to trading engines?
A: Appearance — I might appear out of date to the younger interviewers. I may need to know the jargon terms (and their relationships) enough to follow the conversations.

Job specs increasingly list these new technologies. I may be /displaced/sidelined by younger candidates, even disqualified. However, my advantage is my diverse experience across industries.

Putting on the black hat over the “displacement” concern, I feel some of my personal experience (in the “Churn” section below) casts doubt on it.

— technology bet i.e. picking real winners from a fountain of new buzzwords and hypes

I feel cloud is the most enduring technology in my list above. In dev environments (I am a developer), cloud infrastructure may become widespread just like git, linux and virtual hosts.

— Churn — defined as the risk of investing (my precious time) into perishable stuff, faddish stuff or hypes

xp in java-land: I stayed away from spring/hibernate. I didn’t get deep into WebLogic or EJB…. because the majority of WallSt elites value coreJava more than jxee (or even c++, but that’s another story).

Coherence/Gemfire etc also faded away.

Functional java also faded away.

lucky I didn’t invest in Scala #java8/c++11 #le2Greg

Hi Greg,

Am writing another reply to your earlier mail, but I feel you wouldn’t mind reading my observations of Scala and java8 on the WallSt job market.

Let me state my conclusion up-front. Conclusion: too many hypes and fads in java and across the broader industry. I feel my bandwidth and spare time are limited (compared to some techies), so I must avoid investing myself in high-churn domains.

You told me about Scala because MS hiring managers had a hard time selling Scala to the job candidates. The harder they try to paint Scala as part of the Future of java, the more skeptical I become. To date, I don’t know any other company hiring Scala talent.

If I get into a MS Scala job, I would have to spend my bandwidth (on the job or off the job) learning Scala. In contrast, in my current role with all-familiar technologies, I have more spare time on the job and off the job. On a Scala job, I would surely come across strange Scala errors and wrestle with them (same experience with python, and every other language). This is valuable learning experience, provided I need Scala in future jobs, but nobody else is hiring Scala.

Therefore, am not convinced that Scala is worth learning. It is not growing and not compelling enough to take over (even a small part of) the java world. I would say the same about Go-lang.

If scala is a bit of a hype, then Java8 is a bit of a fad.

I remember you said in 2019 that java8 knowledge had become a must in some java interviews. I actually spent several weeks reading and blogging about key features of java8. Almost None of them is ever quizzed.

Java8 seems to be just another transitional phase in the evolution of java. My current system uses the java8 compiler (not java8 features), but java 9,10,11,12,13,14 and 15 have come out. There are so many minor new features that interviewers can only ask a small subset of important features. The "small subset" often boils down to an empty set — interviewers mostly ask java1 to java5 "classic" language features such as threading, collections, java polymorphism.

Some hiring teams don’t even ask java8 interview questions beyond the superficial. Yet they say java8 experience is required on the job!

Lastly, I will voice some similar frustrations about c++11/14/17. Most teams use a c++17 compiler without using any new c++ features. Most of the interview questions on "new" c++ revolve around move semantics, a very deep and challenging topic, but I don’t think any team actually uses move semantics in their work. Nevertheless, I spent months studying c++11 esp. move semantics, just to clear the interviews.

##[18]realistic 2-10Y career plann`guidelines #300w

Background: not easy to have a solid plan that survives more than 3Y. Instead of a detailed plan, I will try to manage using a few guidelines.

  • — top 3 “guidelines” [1]
  • respect/appreciation/appraisal (esp. by manager) — PIP/stigma/trauma/damagedGood. Let’s accept: may not be easy to get
  • Singapore — much fewer choices. Better consider market-depth^elite domain
  • Expertise accu (for dev-till-70) or sustained focus — holy grail
  • ——– secondary:
  • dev-till-70
  • family time — how2get more family time #a top3 priority4Sg job. Some usage is optional (play time) while others are a matter of responsibility.
  • personal time — short commute, flexible time, low workload, freedom to blog]office… is proving to be so addictive that I have forgotten the other guidelines.
  • interviews — Let’s accept : extremely important to me but much harder in Singapore. Even in the U.S. I may need to cut down.
  • distractions — Let’s accept and try to contain it.
  • Entry-barrier — could be too high for me in the popular domains like algo trading
  • Entry-barrier — could be too low for some young guys — the popular domains will have many young guys breaking in
  • FOLB Peer pressures — and slow-track… Let’s accept.
  • trySomethingNew — may/not be justifiable
    • stagnation — could be the norm
    • engaging — keep yourself engaged, challenged, learning, despite the stagnation
  • Shrinking Choices — many employers implicitly prefer younger programmers
  • Churn — Avoid
  • non-lead dev role — Let’s embrace. Don’t feel you must move out or move up. Hands-on coding is gr8 for me. Feel good about it

[1] I didn’t say “priorities”

.so.2: linker^dynamic loader

— Based on https://unix.stackexchange.com/questions/475/how-do-so-shared-object-numbers-work

In my work projects, most of the linux SO files have a suffix like libsolclient_jni.so.1.7.2. This is to support two executables using different versions of a SO at the same time.

Q: How is the linker able to find this file when we give linker a command line option like “-lsolclient_jni”? In fact, java System.loadLibrary(“solclient_jni”) follows similar rules. That’s why this example uses a java native library.

A: Actually, linker (at link time) and dynamic loader (at run time) follow different rules

  • at link time, the linker records (hardcodes) in the executable which version of the SO to load at run time. You can run “ldd /the/executable/file” to reveal the exact versions the executable was linked against.
  • at run time, the dynamic loader consults the hardcoded info and loads libsolclient_jni.so.1.7.2 into memory
  • also at link time, the linker itself only uses the latest version, via a symlink like libsolclient_jni.so (without version suffix)

— static libraries:

I think static libraries like libc.a do not have this complexity.

During static linking the linker copies all library routines used in the program into the executable image. This of course takes more space on disk and in memory than dynamic linking. But a statically linked executable does not require the presence of the library on the system where it runs.

#1 usage@member template: smartPtr upcast

Does this use SFINAE? Not important. The exact definition of SFINAE is arcane, but the technique here is just as magical as SFINAE.

— OOC member template “function”

P 176 [[moreEffC++]] uses member template functions to generate OOC members for a smart ptr class. Suppose the host type is smartPtr<T>; then the OOC converts from the host type to smartPtr<B>.

This OOC is generated iff ptr-to-T can convert to ptr-to-B. I think SCB-FM IV by architect #shared_ptr upcast is relevant.

OOC is an overloaded operator rather than a named function, but I still consider it a member function.

— cvctor instead of OOC

For a similar purpose, https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_33template/shPtrUpcastCopy.cpp features a cvctor template.

This cvctor template is an alternative to the OOC member template.
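A minimal sketch of my own (not the book’s code) of the OOC flavor: a toy smart pointer (no ownership management) whose member-template conversion operator compiles only when T* converts to B*.

#include <iostream>

template<typename T>
class smartPtr {
  T* raw;
public:
  explicit smartPtr(T* p) : raw(p) {}
  T* get() const { return raw; }

  // member template OOC: instantiated for any B such that raw (a T*) converts to B*
  template<typename B>
  operator smartPtr<B>() const { return smartPtr<B>(raw); }
};

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

int main() {
  smartPtr<Derived> d(new Derived);   // toy: never deleted
  smartPtr<Base> b = d;               // upcast works via the generated conversion
  std::cout << (b.get() != nullptr) << '\n';
  // smartPtr<int> i = d;             // would not compile: Derived* doesn't convert to int*
}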

[20] SG tech talent pool=insufficient: expertise^GTD

Listening to LKY’s final interviews (2013 ?), I have to agree that Singapore — counting citizens+PRs — doesn’t have enough technical talent across many technical domains, including software dev. https://www.gov.sg/article/why-do-we-need-skilled-foreign-workers-in-singapore is a 2020 publication, citing software dev as a prime example.

A telltale sign of the heavy reliance on foreign talent — If an employer favors a foreigner, it faces penalty primarily (Russell warning) in the form of ban on EP application/renewal. This penalty spotlights the reliance on EPs at multinationals like MLP, Goog, FB, Macq.

The relatively high-end dev jobs might be 90% occupied by foreigners, not citizens like me. I can recall my experience in OC, Qz, Macq, INDEED.com interview… Why? One of the top 2 most obvious reasons is the highly selective interview. High-end tech jobs always feature highly selective tech interviews — I call it “Expertise screening”.

Expertise is unrelated to LocalSys knowledge. LocalSys is crucial in GTD competence.

As I explained to Ashish and Deepak CM, many GTD-competent [1] developers in SG (or elsewhere) are not theoretical enough, or lack the intellectual curiosity [1], to pass these interviews. In contrast, I do have some Expertise. I have proven it in so many interviews, more than most candidates.

(On a side note, even though I stand out among local candidates, the fact remains that I need a longer time to find a job in SG than Wall St. )

[1] As my friend Qihao explained, most rejected candidates (including Ashish) probably have the competence to do the job, but that’s not his hiring criterion — that bar is too low. Looks like SG has some GTD-competent developers but not enough with expertise or curiosity.

— Math exams in SG and China

Looking at my son’s PSLE math questions, I was somehow reminded that the real challenge in high-end tech IV is theoretical/analytical skills — “problem-solving” skill as high-end hiring teams say, but highly theoretical in nature. This kind of analytical skill including vision and pattern recognition is similar to my son’s P5 math questions.

In high-end tech IV, whiteboard algo and QQ are the two highly theoretical domains. ECT and BP are less theoretical.

What’s in common? All of these skills can be honed (磨练). Your percentile measures your abilities + effort (motivation, curiosity[1]). I’m relatively strong in both abilities and effort.

So I know the math questions are similar in style in SG and China. I have reason to believe East-European and former-Soviet countries are similar. I think other countries are also similar.

rvalue objects before/after c++11

Those rvalue objects, i.e. unnamed temp objects, have been around for years. So why is rvr needed to handle rvalue objects?

  • C++11 added language features (move,forward,rvr..) only to support resource stealing where resource is almost always some heapy thingy.
  • Before c++11, rvalue objects needed no special notation and no special handle (i.e. rvr); they were treated just like any other object. You could still steal resources, but it was error-prone and unsafe.
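A minimal sketch of my own: before c++11 a temporary could only bind to a const reference (copy, no stealing); with c++11 an rvr overload can steal the temporary’s heap buffer.

#include <iostream>
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> sink;

void save(const std::string& s) { sink.push_back(s); }            // pre-c++11 style: must copy
void save(std::string&& s) { sink.push_back(std::move(s)); }      // c++11: steal the heap buffer

int main() {
  std::string named = "long enough to live on the heap..........";
  save(named);               // lvalue -> copy overload
  save(named + " suffix");   // unnamed temporary (rvalue) -> move overload
  std::cout << sink.size() << '\n';   // 2
}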

private-Virtual functions: java^c++

Q: in c++ and java, is private virtual function useful?
A: both languages allow this kind of code to compile. C++ experts use it for a purpose, but in java private methods are not really virtual, so a subclass method is unrelated to the baseclass private method.

— java

Q: beside overriding and overloading, does java support a 3rd mechanism where subclass can redefine a (non-static) method?

  • in GTD this is very low value.
  • in terms of zbs and expertise this actually reveals something fundamental, esp. between java and c++
  • in terms of IV, this can be a small halo whenever we talk about overriding^overloading

A: The code below is neither overriding nor overloading, but it does compile and run, so yes, this is a 3rd mechanism. I will call it “hiding” or “redefinition”. The hiding method is unrelated to the baseclass private method, so the compiler has no confusion. (In contrast, with overriding and overloading, the compiler or the runtime follows documented rules to pick an implementation.)

Code below is based on https://stackoverflow.com/questions/19461867/private-method-in-inheritance-in-java

class A {
    private void say(int number){
        System.out.print("A:"+number);
    }
}
class D extends A{
    // a public method hiding/redefining a baseclass private method
    public void say(int number){
        System.out.print("Over:"+number);
    }
}
public class Tester { // the three classes can live in one file, Tester.java
    public static void main(String[] args) {
        A a=new D();
        //a.say(12); // compilation error: say() is private in A
        ((D)a).say(12); // works -- prints "Over:12" because the static type is D
    }
}

— C++ is more nuanced

The trigger of this blogpost is P68 [[c++ coding standard]] by two top experts Sutter and Alexandrescu, but I find this “coding standard” unconvincing.

Private virtual functions seem to be valuable in some philosophical sense but I don’t see any practical use.
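For context, the purpose c++ experts usually cite is the NVI (non-virtual interface) idiom. A minimal sketch of my own, not from the book:

#include <iostream>

class Engine {
public:
  void run() {                 // public non-virtual method fixes the calling protocol
    std::cout << "pre-checks\n";
    step();                    // private virtual customization point
    std::cout << "post-checks\n";
  }
  virtual ~Engine() = default;
private:
  virtual void step() { std::cout << "default step\n"; }
};

class TurboEngine : public Engine {
private:
  // overriding is allowed even though the base step() is private
  void step() override { std::cout << "turbo step\n"; }
};

int main() {
  TurboEngine t;
  t.run();   // pre-checks / turbo step / post-checks
}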

array/deque based order book

For HFT + mkt data + internal matching or market making .. this is a favorite interview topic, kind of similar to retransmission.

==== For a first design, look at … https://web.archive.org/web/20141222151051/https://dl.dropboxusercontent.com/u/3001534/engine.c has a brief design doc, referenced by https://quant.stackexchange.com/questions/3783/what-is-an-efficient-data-structure-to-model-order-book

  • TLB?
  • cache efficiency

— no insert/delete of array

Order cancel and full fill both result in deletion of an array element, which normally means shifting. Random inserts mid-stream also require shifting in the array. To preempt shifts, the design adopted in engine.c is “one array element for every possible price point” (see the sketch after the list below).

  1. when an existing order gets wiped out, its quantity becomes zero. It’s also possible to use a flag, but zero quantity is probably more space efficient.
  2. when we get a limit order at a new price of 9213, we don’t insert but update the dummy price point object in-situ.

What if all the price points in use are only 0.01% of the price points allocated? To answer that question, we need some estimates of the likely price levels and the footprint of the array element. Luckily, price levels are not floating points but integers essentially — a key constraint in the requirement.

  • An array element can be very space efficient — a nullable pointer.
  • Alternatively, it can be an integer subscript into another array of “received price points”. Dummy price point would be represented by “0”, a special subscript. Subscript can be double-byte, a big saving cf 8-byte pointers.
  • Likely price levels could range from 30D min to 30D max plus some margin. Such a range would be up to 10,000 price levels.
    • But what if the price plunges at SOD or mid-day? “Not sure how my company system was designed, but here’s my idea –” we would need to allocate then initialize a new array of price levels. Deque would help.
  • Unlikely price levels (for outlier orders) would be hosted in a separate data structure, to support a super low bid (or super high ask). These outlier orders can tolerate longer delays.
  • a deque would support efficient insert near Both ends.
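A minimal sketch of my own (not engine.c itself) of the “one element per price point” idea: integer price levels index directly into a pre-allocated array, zero quantity marks a dummy/wiped-out level, and there is no shifting.

#include <cstdint>
#include <vector>

struct PriceLevel { uint32_t totalQty = 0; };   // 0 == dummy/empty level, no separate flag

class BidBook {
  std::vector<PriceLevel> levels;   // one slot per tick between lo and hi
  int32_t lo;                       // lowest representable price, in ticks
  int32_t bestBid = -1;             // index of current best bid, -1 if book is empty
public:
  BidBook(int32_t loTick, int32_t hiTick) : levels(hiTick - loTick + 1), lo(loTick) {}

  void add(int32_t priceTick, uint32_t qty) {     // new order or size increase: O(1), no shifting
    int32_t i = priceTick - lo;
    levels[i].totalQty += qty;
    if (i > bestBid) bestBid = i;
  }
  void reduce(int32_t priceTick, uint32_t qty) {  // cancel or (partial) fill; assumes qty <= resting qty
    int32_t i = priceTick - lo;
    levels[i].totalQty -= qty;
    while (bestBid >= 0 && levels[bestBid].totalQty == 0) --bestBid;   // walk down to next live level
  }
  int32_t bestBidTick() const { return bestBid < 0 ? -1 : bestBid + lo; }
};

int main() {
  BidBook book(9000, 11000);
  book.add(10000, 5);
  book.add(10002, 7);
  book.reduce(10002, 7);                          // full fill wipes the level
  return book.bestBidTick() == 10000 ? 0 : 1;
}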

==== vector insert as a second design

A fairly complete reference implementation for Nasdaq’s ITCH protocol supports order lookup, order reduction (partial fill or full fill), order cancel and  new order. New order at a fresh price level uses vector::insert —
  • If this insertion happens near the end of the vector (top of book), then the elements shifted would be very few
  • if the market slowly declines, then the best bid would be wiped out, shrinking the vector at the back end. New price levels would be inserted at lower price points, but again the elements above it would be few.
  • If this insertion happens far from the back end, as in an outlier order, then lots of elements shifted, but this scenario is rare.

Note a single element’s footprint can be very small, as explained earlier.

==== switch to an RBTree if abnormal price movement is detected. This idea may sound crazy, but consider the RBTree used inside the java8 HashMap

##xp staying-on after PIP, with dignity

  • +xp GS: informal, unofficial PIP, before I went to Kansas City for July 4th. After it, I stayed for close to a year.
  • +xp Stirt: I overcame the setback at end of 2014 and scored Meet/Meet albeit without a bonus.

Now I think it is possible to stay on with dignity, even pride. I didn’t do anything disgraceful.

I hope at my age now, I would grow older and wiser, less hyper-sensitive less vulnerable, more mellow more robust, or even more resilient.

That would be a key element of personal growth and enlightenment.

market data warehouse #lightning talk

I am not permitted to reveal identities. Let’s say we are talking about a financial institution. You can think of a bank, sovereign fund, insurer or asset manager.

  • total footprint ~ 2.5 PB = 2500 TB, in the original form as received from vendor, and is usually pre-compressed by vendor.
  • daily increment ~ 3 TB and growing
  • biggest subset is tick data, probably 800 TB. One vendor can require 0.5 TB/day after decompression.
  • Most common data dissemination (from vendor) is FTP. The new kid on the block is vendor API whereby clients can pull data from vendor.

— historical market data

This warehouse is for historical data. It can poll a vendor system every 5 minutes to receive latest data.

ICE RTS defines “real time market data” using a latency threshold of 30 minutes. Therefore, some data in this warehouse can be considered “near real time”.

— cloud:

Many vendors are on AWS i.e. provide an AWS dissemination interface, so this MDW is also moving to AWS.

If a vendor (Reuters?) is only on google cloud, then dissemination requires an AWS-Google bridge, but there was no good bridge as of 2020.

[20]OC-effective: overrated

Today on my jog along the Marina Promenade, I reminded myself that my parents gave me a good body and i have taken good care of it. that’s why I’m able to enjoy life to the fullest.

Then it occurred to me that those “effective” people (in the OC sense) usually don’t or can’t make the same claim. They are often overworked, overweight, overeating, lacking exercise. It’s rare to see one of them fit and slim (TYY might be).

Remember OC-effective is defined in the corporate context. Mostly corporate managers.

— OC-effective people are no healthier than average. Their life expectancy is not longer. I actually believe health is wealth.

— OC-effective ≠ (Fuller) wealth, whether measured in Brbr or otherwise. Many people are wealthier but are not MNC managers.

— OC-effective ≠ “effective with the team”, as in the SG sense. Sometimes it is largely inter-personal (with the boss) effectiveness.

— OC-effective is mostly theoretical and assessment can be very biased. Promotion is decided by upper management, so what team members feel doesn’t count. 360-review is marketing.

— OC-effective ≠ true leadership. We all know some lousy managers getting promoted (RTS, deMunk). However, I think many good leaders have OC-effectiveness. Listening is essential.

— OC-effective ≠ satisfaction with life. Grandpa often says these individuals 活得好累. They often have more stress due to heavy responsibilities.

— OC-effective is effective in that one organization and may not be in another organization. Consider Honglin. In contrast, hands-on developers like me possess more portable skills mostly in the form of IV.

— OC-effective ≠ adding social value. The corporation may not create lasting social value. OC-effectiveness means effective at helping the company reach its goals, regardless of value. In SG context, social value is rather important.

MSOL zoom by mouse/touchpad/touchscreen

My home laptop screen resolution is such that zoom-in is required when reading outlook messages. How do I use the keyboard to easily zoom in on a new message?

— touch screen two-finger pinch works, even though I disabled it in my laptop

— Ctrl + right-scrollUp is similar to the slider.

With an external mouse, this gesture is probably proven and reliable. With a touchpad, I relied on two-finger scroll:

Two-finger up-scroll + Ctrl does zoom in 🙂

— two-finger page scroll without Ctrl key : is unrelated to zooming

Windows gives us two choices. I chose “Down motion means scroll down”. Intuitively, the two-finger gesture feels like holding the vertical scroll-bar on a webpage or in IntelliJ.

Note IntelliJ scrolling is otherwise impossible with a touch-screen or touchpad!

pointer as class member #key points

Pointer as a field of a class is uninitialized by default, as explained in uninitialized is either pointer or primitive. I think Ashish’s test shows it

However, such a field creates an opportunity for a mv-ctor. The referent object is usually on the heap or in static memory.

If you replace the raw ptr field with a smart pointer, then your copier, op=, dtor etc can all become simpler, but this technique is somehow not so popular.

Note “pointer” has a well-defined meaning as explained in 3 meanings of POINTER + tip on q[delete this]
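A minimal sketch of my own contrasting the two designs — a raw-pointer field that enables a cheap mv-ctor but forces hand-written special members, versus a smart-pointer field that keeps them simple:

#include <memory>
#include <string>
#include <utility>

struct RawHolder {
  std::string* payload;   // would be left uninitialized by the implicit default ctor
  RawHolder() : payload(nullptr) {}
  explicit RawHolder(std::string s) : payload(new std::string(std::move(s))) {}
  RawHolder(RawHolder&& other) noexcept : payload(other.payload) { other.payload = nullptr; } // cheap steal
  ~RawHolder() { delete payload; }
  RawHolder(const RawHolder&) = delete;            // deep copy would have to be written by hand
  RawHolder& operator=(const RawHolder&) = delete;
};

struct SmartHolder {
  std::unique_ptr<std::string> payload;            // dtor/move generated correctly; copying disabled
  explicit SmartHolder(std::string s) : payload(std::make_unique<std::string>(std::move(s))) {}
};

int main() {
  RawHolder r1("hello");
  RawHolder r2(std::move(r1));     // mv-ctor: pointer steal + null out the source
  SmartHolder s1("world");
  // SmartHolder s2 = s1;          // won't compile: unique_ptr field disables copying
  return (r2.payload && s1.payload) ? 0 : 1;
}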

speed=wrong goal #aging pianist

傅聪 (Fou Ts’ong) is an outspoken and articulate pianist. His words probably describe many aging pianists — at a certain age, you don’t have the “hardware capacity” to compete on speed. Today I want to talk about speed.

  • — ^ GTD on the job:
    sense of urgency is good.
  • — ▼ code reading and figure things out
    As we age, this kind of speed will become tougher, so it’s wise to lower the expectation.
  • — ▼ jogging
    speed is the wrong goal and can backfire. My ultimate goal is endurance in heart and lung, weight loss etc.
  • — ▼ grandpa learning computer
    expectation should be lower.
  • — ▼ 最强大脑 had a contestant in his 70’s, aiming to recite the first 5000 digits of pi.
    I believe as we age, we need more refreshing (like DRAM) + more time

====coding interview

— ▼ online test cases !
To run any online test case, you first need (a lot of) ECT to “make it work”. That would take a lot of time, highly discouraging.

— ▼ time limit such as “pass N tests before alighting” — a recipe for self-disappointment, self-hate and unnecessary pressure

— ▼ speed coding IV: by default not appropriate for older guys like me, with exceptions.
ECT, syntax … are harder for older coders. However, for some people speed-coding practice can be anti-aging.

For codility etc, enlist friends.

I now prefer (OO) design questions and weekend assignment

cancel^thread.stop() ^interrupt^unixSignal

Cancel, interrupt and thread.stop() are three less-quizzed topics that show up occasionally in java/c++ interviews. They are fundamental features of the concurrency API. You can consider this knowledge as zbs.

As described in ##thread cancellation techniques: java #pthread,c#, thread cancellation is supported in c# and pthreads, whereas java indirectly supports it.

— cancel and java thread.stop() are semantically identical, but java thread.stop() is deprecated and unsafe.

PTHREAD_CANCEL_ASYNCHRONOUS is usable in very restricted contexts. I think this is similar to thread.stop().

— cancel and interrupt both define stop points (pthreads calls them cancellation points). In both cases, the target thread can choose to ignore the cancellation/interrupt request, or check for it at the stop points.

Main difference between cancel and interrupt? Perhaps just the wording. In pthreads there’s only cancel, no interrupt, but in java there’s no cancel.

https://stackoverflow.com/questions/16280418/pthread-cancel-asynchronous-cancels-the-whole-process
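A minimal pthread sketch of my own (deferred cancellation, compile with -pthread): the cancel request is honored only at the target thread’s next cancellation point, i.e. a “stop point”.

#include <pthread.h>
#include <time.h>
#include <cstdio>

void* worker(void*) {
  for (;;) {
    // ... a chunk of work ...
    pthread_testcancel();                  // explicit cancellation point ("stop point")
    timespec ts{0, 100 * 1000 * 1000};     // 100ms
    nanosleep(&ts, nullptr);               // nanosleep is also a cancellation point
  }
  return nullptr;
}

int main() {
  pthread_t t;
  pthread_create(&t, nullptr, worker, nullptr);
  timespec one{1, 0};
  nanosleep(&one, nullptr);
  pthread_cancel(t);                       // a request only, not an immediate kill
  pthread_join(t, nullptr);
  std::puts("worker cancelled cooperatively");
}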

Note interrupted java thread probably can’t resume.

Not sure if Unix signal handlers also support stop points.

Java GC on-request is also cooperative. You can’t force the GC to start right away.

Across the board, the only immediate (non-cooperative) mechanism is power loss. Even a keyboard “kill” is subject to software programmed behavior, typically the OS scheduler. 

comfort,careerMobility: CNA+DeepakCM #carefree

https://www.channelnewsasia.com/news/commentary/mid-career-mobility-switch-tips-interview-growth-mindsets-11527624 says

“The biggest enemy of career mobility is comfort … Comfort leads us to false security and we stop seeking growth, both in skills and mindset agility. I see all the time, even amongst very successful senior business people, that the ones who struggle with career advancement, are the ones whose worlds have become narrow – they engage mainly with people from their industry or expertise area, and their thinking about how their skills or experience might be transferable can be surprisingly superficial.”

The comfort-zone danger … resonates with Deepak and me.

— My take:

The author casts a negative light on the comfort-zone, but comfort is generally a good thing. Admittedly, those individuals (as described) pay a price for their comfort zones, but there are 10 times more people who desire that level of comfort, however imperfect this comfort is.

Comfort for the masses means livelihood. For me, comfort has the additional meaning of carefree.

Whereas the author’s focus is maximum advancement and not wasting one’s potential (i.e. FOLB and endless greed), our goal is long-term income security (including job security [1]). This goal is kind of a holy grail for IT professionals. Comfort as described in CNA is temporary, not robust enough. Therefore, I see three “states of comfort in livelihood”:

  • Level 1 — no comfort. Most people are in this state. Struggling to make ends meet, or unbearable commute or workload (my GS experience)
  • Level 2 — short (or medium) term comfort, is the subject of the CNA quote.  Definition of “short-term” is subjective. Some may feel 10Y comfort is still temporary. Buddhists would recognize the impermanence in the scheme.
  • Level 3 — long-term comfort in livelihood. Paradoxically, a contingency contractor can achieve this state of livelihood security if he can, at any time, easily find a decent job, like YH etc. I told Deepak that on Wall St, (thanks to dumb luck) Java provides a source of long-term comfort and livelihood security. Detachment needed!

[1] income security is mostly job security. Fewer than 1% of the population can achieve income security without job security. These lucky individuals basically have financial freedom. But do take into account (imported) inflation, medical, housing, unexpected longevity,,

Deepak pointed out a type of Level-2 comfort — professional women whose main source of meaning, duty, joy is the kids they bring up. For them, even with income instability, the kids can provide comfort for many years.

Deepak pointed out a special form of Level-3 carefree comfort — technical writers. They can have a job till their 80’s. Very low but stable demand. Very little competition. Relatively low income.

Deepak felt a key instability in the IT career is technology evolution (“churn” in my lingo), which threatens to derail any long-term comfort. I said the “change of the guard” can be very gradual.

— Coming back to the carefree concept. I feel blessed with my current carefree state of comfort. Probably temporary, but pretty rare.

Many would point to my tech learning, and challenge me — Is that carefree or slavery to the Churn? Well, I have found my niche, my sweet spot, my forte, my 夹缝 (crevice), with some moat, some churn-resistance.

Crucially, what others perceive as constant learning driven by survival instinct is now my lifelong hobby that keeps my brain active. Positive stress in a carefree life.

The “very successful senior business people”, i.e. high-flyers, are unlikely to be carefree, given the heavy responsibilities. Another threat is the limelight, the competition for power, glory and riches. In contrast, my contractor position is not nearly as desirable or enviable — similar to the technical writers Deepak pointed out.

[20]what protects family livelihood: IV^GTD skill #AshS

Hi Ashish,

After talking to you about PIP (i.e. performance improvement process) in my past and in future scenarios, and considering my financial situation (wife not working, 2 young kids + 2 elderly grandparents) over the 20Y horizon , I came up with this question —

Q: Between two factors: AA) my competitive position on the tech hiring market, BB) job security at MLP, which factor has more impact on my family livelihood?

Answer: I remain as convinced now as 10 years ago: AA is the dominant factor. I won’t allow myself to rely on some employer to provide my family a stable income for 20Y, even if I do a good job. There are countless stories among my peers who worked hard but lost jobs.

Answer (during covid19 mass layoff): I’m even more convinced that AA is the dominant factor. MLP is doing well, but MLP owner is not my dad. See my email to my sister sent on 19 Aug.

If I do a good job in the current team, what’s the chance of keeping this job for 10Y? 10%? There are individuals (like my manager) who stay in one team for 10+ years, but I think anyone like him has witnessed dozens of coworkers who have no choice but to leave, for various reasons (not counting those who had a choice to stay but hopped higher elsewhere).

That’s the basic answer to my opening question, but there are a few important sub-factors to point out.

Family livelihood includes housing, medical and education. In the U.S., I would incur $3k/M rental + 2k/M health insurance. Therefore, livelihood in the U.S. is more precarious, less secured.

My Health — is a big hidden factor. Stamina and mental capacity have a direct impact on our “performance” in the competition, both on the job market and on the job. I think you probably need a lot of physical and mental energy, stamina,,, to deep-dive into an unfamiliar local system or codebase, to become so confident, right?

company stability — is a sub-factor of BB. Some investment banks (GS, Barclays, MS) are known to aggressively cut headcount even in profitable years, to stay lean and mean.

Aging — is a factor affecting AA slightly more than BB. Age discrimination is something I seem to notice more as I grow older. So far I’m able to maintain my “competitive fitness” on the job market. If I rely on BB too much as I age, then I believe I would pay less attention to AA, and grow older and weaker. To strengthen the foundation of my family livelihood as I age, I tell myself to see the reality — as I age I would face a less friendly job market + instability on any job. Therefore I need to give priority to AA, by maintaining/improving my tech skills for competitive interviews.

Demand — for developers continues to grow in the job markets. This is a fundamental reason why AA is so valuable and reliable. This robust demand doesn’t help BB at all.

Overall, my view is biased in favor of AA. This is deliberate. With PIP or without PIP, any high-paying tech job (like yours or mine) comes with an expectation and risk of job loss. AA is the parachute.

references: primarily used as params

Many books talk about references used as fields, or return values, or perhaps payloads saved in containers.

Many interviews also touch on these scenarios.

However, in my projects, 99.9% of the usage of reference is function parameters, both const and non-const.

  • if your goal is GTD, then focus on the primary usage
  • if your goal is IV or “expertise”, then don’t limit your learning to the primary usage.

3stressors: FOMO^PIP^ livelihood[def1]

  • PIP
  • FOMO/FOLB including brank envy
  • burn rate stress esp. the dreaded job-loss

jolt: FSM dividend has never delivered the de-stressor as /envisioned/. In contrast, my GRR has produced /substantial/ nonwork income, but still insufficient to /disarm/ or blunt the threat of PIP! In my experience, the job-loss stressor is indeed alleviated by this income or the promise thereof 🙂

Excluding the unlucky (broken, sick,,) families, I feel most “ordinary” people’s stress primarily comes from burn rate i.e. making ends meet, including job-loss fear. Remember the OCBC pandemic survey: 70% of Singaporeans can’t last beyond 6M if jobless. I feel the middle class families around me could survive at a much lower theoretical burn rate of SGD 3.5-4.5k (or USD 6k perhaps… no 1st-hand experience) …. but they choose the rat race of keeping up with the Joneses (FOMO). Therefore, their burn rate becomes 10k. See also SG: bare-bones ffree=realistic #WBank^wife^Che^3k and covid19$$handout reflect`Realistic burn rate

For some, FOLB takes them to the next level — bench-marking against the high-flyers.

—– jolt: PIP^job-loss fear

Note Many blogposts (not this one) explore FOMO^livelihood.

For the middle class, any burn rate exceeding 3k is a real (albeit subconscious) stressor, because the working adults now need to keep a job and build up a job-loss contingency reserve. Remember Davis Wei…. 3 months is tough for him? How about 30 years? In a well-publicized OCBC survey during covid19, most Singaporean workers can’t last 6M.

With a full time job, salaried workers experience a full spectrum of stressors including PIP. PIP would be impotent/toothless if the job is like a hobby. I would say very few people have such a job.

Aha .. Contract career is free of PIP.

For me (only me), job loss is a lighter stressor than PIP fear. In fact, I don’t worry about end of contract [2] and bench time. I worry more about a humiliating bonus. I’d rather lose a contract job than receive a token bonus after a PIP.

I think PIP is the least universal, shared stressor among the three stressors[1]. Even though some percentage of my fellow IT professionals have experienced PIP, they seem to shrug it off. In contrast, I lick my wound for years, even after it turns into a permanent scar. Most people assume that my PIP fear was fundamentally related to cashflow worry, but I am confident about job hunting. So my PIP fear is all about self-esteem and unrelated to cashflow.

[1] In the covid19 aftermath (ongoing), SG government worry mostly about job loss i.e. livelihood. Next, they worry about career longevity, in-demand skills, long-term competitiveness, housing, healthcare and education… all part of the broader “livelihood” concept. As parents, we too worry about our kids’ livelihood.

[2] Because I have a well-tested, strong parachute, I’m not afraid of falling out (job loss)

Q: imagine that after Y2, this job pays me zero bonus, and the boss gives some explicit but mild message of “partial meet”. Do I want to avoid further emotional suffering and forgo the excellent commute + flexible hours + comfortable workload + hassle-free medical benefits?
A: I think Khor Siang of Zed would stay on. I think ditto for Sanjay of OC/MLP. Looking at my OC experience, I think I would stay on.

— what are (not) part of “livelihood” concerns. These clarifications help define “livelihood/生计”

  • housing — smallish, but safe, clean home is part of livelihood
  • healthcare — polyclinic, TCM, public healthcare system in Malaysia … are important components of an adequate healthcare infrastructure, which is livelihood. Anything beyond is luxury healthcare
  • commute to work/school — a 1.5H commute is still acceptable. In 1993 I had a 1.5 hour commute to HCJC. A desire for a shorter commute is kinda luxury, beyond livelihood.
  • job security for those of you aged 40-65 — is NOT a livelihood concern if you already have enough nonwork income to cover basic living expenses. Job is really a luxury, for joy, occupation, contribution. Consider grandpa.
    • job security above 65 — is clearly NOT livelihood, unless there’s insufficient retirement income.
  • Life-chances — are more about livelihood and less about FOMO.

— Deepak’s experience

Deepak converted from contractor to perm in mid 2014, but on 30 Oct 2014, lost his job in the UK. He sat on the bench for thirteen months and started working in Nov 2015, in the U.S. This loss of income was a real blow, but in terms of the psychological scars, I think the biggest were 1) visa 2) job-interview difficulties. He didn’t have a PIP scar.

 

op=(): java cleaner than c++ #TowerResearch

A Tower Research interviewer asked me to elaborate why I claimed java is a very clean language compared to c++ (and c#). I said “clean” means consistency and fewer special rules, such that regular programmers can reason about the language standard.

I also said python is another clean language, but it’s not a compiled language so I won’t compare it to java.

See c++complexity≅30% mt java

— I gave the interviewer the example of q[=]. In java, this is either a content update at a given address (for primitive data types) or a pointer reseat (for reference types). No ifs or buts.

In c++ q[=] can invoke the copy ctor, move ctor, copy assignment, move assignment, cvctor (conversion ctor), or OOC (conversion operator).

  • for a reference variable, its meaning is somewhat special at the site of initialization vs update.
  • LHS can be an unwrapped pointer… there are additional subtleties.
  • You can even put a function call on the LHS
  • cvctor vs OOC when LHS and RHS types differ
  • member-wise assignment and copying, with implications on STL containers
  • whenever a composite object has a pointer field, the q[=] implementations could be complicated.  STL containers are examples.
  • exception safety in the non-trivial operations
  • implicit synthesis of move functions .. many rules
  • when the RHS is an rvalue object, then the LHS can only be a ref to const, nonref,,,
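A minimal sketch of my own showing how many different member functions q[=] can end up invoking in c++, depending on value category and types — something java’s q[=] never does:

#include <cstdio>
#include <string>
#include <utility>

struct Txt {
  std::string s;                                                         // heapy field
  Txt(const char* c) : s(c) { std::puts("cvctor"); }                     // conversion ctor
  Txt(const Txt& o) : s(o.s) { std::puts("copy ctor"); }
  Txt(Txt&& o) noexcept : s(std::move(o.s)) { std::puts("move ctor"); }
  Txt& operator=(const Txt& o) { s = o.s; std::puts("copy ="); return *this; }
  Txt& operator=(Txt&& o) noexcept { s = std::move(o.s); std::puts("move ="); return *this; }
};

int main() {
  Txt a("aa");     // cvctor
  Txt b = a;       // q[=] in an initialization context -> copy ctor, not operator=
  b = a;           // copy assignment
  b = Txt("cc");   // cvctor builds a temporary, then move assignment
  b = "dd";        // cvctor again, then move assignment
}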

unrolled linked list with small payload: d-cache efficiency

This blogpost is partly based on https://en.wikipedia.org/wiki/Unrolled_linked_list

This uniform-segment data structure is comparable to deque and can be a good talking point in interviews.

Unrolled linked list is designed as a replacement for the vanilla linked list, with 2 + 1 features

  1. mid-stream insert/delete — is slightly slower than linked list, much better than deque, which is efficient only at both ends.
  2. traversal — d-cache efficiency during traversal, esp. with small payloads
    • This is the main advantage over standard linked list

There’s a 3rd task, slightly uncommon for linked list

  1. jump by index — quasi-random access.. See [17]deque with uneven segments #GS

— comparison with deque, which has a fixed segment length, except the head and tail segments

  • Deque offers efficient insert/delete at both ends only. A mid-stream insert would require shifting, just as in a vector
  • Deque offers O(1) random access by index, thanks to the fixed segment length
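A minimal sketch of my own of an unrolled-linked-list node and append: each node packs several small payloads contiguously, so traversal touches far fewer cache lines than a one-payload-per-node std::list (cleanup omitted for brevity):

#include <cstddef>
#include <iostream>

template<typename T, std::size_t CAP = 16>
struct UNode {
  T items[CAP];           // payloads stored contiguously -> d-cache friendly traversal
  std::size_t count = 0;  // slots in use; nodes may be part-full after mid-stream insert/delete
  UNode* next = nullptr;
};

template<typename T, std::size_t CAP>
void pushBack(UNode<T, CAP>*& head, UNode<T, CAP>*& tail, const T& v) {
  if (!tail || tail->count == CAP) {        // grow by one segment when the tail is full
    auto* n = new UNode<T, CAP>;
    if (tail) tail->next = n; else head = n;
    tail = n;
  }
  tail->items[tail->count++] = v;
}

int main() {
  UNode<int>* head = nullptr; UNode<int>* tail = nullptr;
  for (int i = 0; i < 40; ++i) pushBack(head, tail, i);
  long sum = 0;
  for (auto* n = head; n; n = n->next)      // one pointer hop per CAP payloads
    for (std::size_t i = 0; i < n->count; ++i) sum += n->items[i];
  std::cout << sum << '\n';                 // 780
}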

[20] charmS@slow track #Macq mgrs#silent majority

another critique of the slow track.

My Macq managers Kevin A and Stephen Keith are fairly successful old-timers. Such an individual would hold a job for 5-10 years, growing in know-how and effectiveness (influence,,,). Once every few years they move up the ranks. In their eyes, a job hopper or contractor like me is hopelessly left on the slow track — a rolling stone gathers no moss.

I would indeed have felt that way if I had not gained the advantages of burn rate + passive incomes. No less instrumental are my hidden advantages like

  • relatively carefree hands-on dev job, in my comfort zone
  • frugal wife
  • SG citizenship
  • stable property in HDB
  • modest goal of an average college for my kids
  • See also G5 personal advantages: Revealed over15Y

A common cognitive/perception mistake is missing the silent majority of old timers who don’t climb up. See also …

read/write volatile var=enter/exit sync block

As explained in 3rd effect@volatile introduced@java5

  • writing a volatile variable is like exiting a synchronized block, flushing all temporary writes to main memory;
  • reading a volatile variable is like entering a synchronized block, reloading all cached shared mutables from main memory.

http://tutorials.jenkov.com/java-concurrency/volatile.html has more details.

https://stackoverflow.com/questions/9169232/java-volatile-and-side-effects also addresses “other writes”.

denigrate%%intellectual strength #ChengShi

I have a real self-esteem problem as I tend to belittle my theoretical and low-level technical strength. CHENG, Shi was the first to point out “你就是比别人强” (you simply are stronger than others).

  • eg: my grasp of middle-school physics was #1 strongest across my entire school (a top Beijing middle school) but I often told myself that math was more valuable and more important
  • eg: my core-java and c++ knowledge (QQ++) is stronger than most candidates’ (largely due to absorbency++) but I often say that project GTD is more relevant. Actually, to a technical expert, knowledge is more important than GTD.
  • eg: I gave my dad an illustration — medical professor vs GP. The Professor has more knowledge but GP is more productive at treating “common” cases. Who is a more trusted expert?
  • How about pure algo? I’m rated “A-”, stronger than most, but pure algo has far lower practical value than low-level or theoretical knowledge. Well, this skill is highly sought-after by many world-leading employers.
    • Q: Do you dismiss pure algo expertise as worthless?
  • How about quant expertise? Most of the math has limited and questionable practical value, though the quants are smart individuals.

Nowadays I routinely trivialize my academic strength/trec relative to my sister’s professional success. To be fair, I should say my success was more admirable if measured against an objective standard.

Q: do you feel any IQ-measured intelligence is overvalued?

Q: do you feel anything intellectual (including everything theoretical) is overvalued?

Q: do you feel entire engineering education is too theoretical and overvalued? This system has evolved for a century in all successful nations.

The merit-based immigration process focuses on expertise. Teaching positions require expertise. When laymen know you are a professional, they expect you to have expertise. What kind of knowledge? Not GTD but a published body of jargon and “bookish” knowledge based on verifiable facts.

scan an array{both ends,keep`min/max #O(N)

I feel a reusable technique is

  • scan the int array from Left  and keep track of the minimum_so_far, maximum_so_far, cumSum_so_far. Save all the stats in an array
  • scan the int array from right and keep track of the minimum_so_far, maximum_so_far, cumSum_so_far. Save all the stats in an array
  • save the difference of min_from_left and max_from_right in an array
  • save the difference of max_from_left and min_from_right in an array

With these shadow arrays, many problems can be solved visually and intuitively, in linear  time.

eg: max proceeds selling 2 homes #Kyle

How about the classic max profit problem?

How about the water container problem?
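
As one concrete illustration, here is my sketch of the classic single-trade max-profit problem solved with two of the shadow arrays (min-from-left and max-from-right); the other problems above follow the same pattern.

    #include <vector>
    #include <algorithm>
    #include <cassert>

    // minFromLeft[i]  = min of prices[0..i]
    // maxFromRight[i] = max of prices[i..n-1]
    int maxProfitOneTrade(const std::vector<int>& prices) {
        const int n = (int)prices.size();
        if (n < 2) return 0;
        std::vector<int> minFromLeft(n), maxFromRight(n);
        minFromLeft[0] = prices[0];
        for (int i = 1; i < n; ++i) minFromLeft[i] = std::min(minFromLeft[i-1], prices[i]);
        maxFromRight[n-1] = prices[n-1];
        for (int i = n-2; i >= 0; --i) maxFromRight[i] = std::max(maxFromRight[i+1], prices[i]);

        int best = 0;  // buy at the running minimum, sell at the best price to its right
        for (int i = 0; i < n; ++i)
            best = std::max(best, maxFromRight[i] - minFromLeft[i]);
        return best;
    }

    int main() {
        assert(maxProfitOneTrade({7, 1, 5, 3, 6, 4}) == 5);  // buy at 1, sell at 6
        return 0;
    }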

range bump-up@intArray 60% #AshS

https://www.hackerrank.com/challenges/crush/problem Q: Starting with a 1-indexed array of zeros and a list of operations, for each operation add a “bump-up” value to each of the array elements between two given indices, inclusive. Once all operations have been performed, return the maximum in your array.

For example, given an array of 10 zeros, your list of 3 operations is as follows:

    a b k
    1 5 3
    4 8 7
    6 9 1

Add the values of k between the indices a and b inclusive:

index->	 1 2 3  4  5 6 7 8 9 10
	[0,0,0, 0, 0,0,0,0,0, 0]
	[3,3,3, 3, 3,0,0,0,0, 0]
	[3,3,3,10,10,7,7,7,0, 0]
	[3,3,3,10,10,8,8,8,1, 0]

The largest value is 10 after all operations are performed.

====analysis

This (contrived) problem is similar to the skyline problem.

— Solution 1 O(minOf[N+J, J*logJ ] )

Suppose there are J=55 operations. Each operation is a bump-up by k, on a subarray. The subarray has left boundary = a, and right boundary = b.
Step 1: Sort the left and right boundaries. This step is O(N) by counting sort, or O(J logJ) by comparison sort. A conditional implementation can achieve O(minOf[N+J, J*logJ ] )

In the example, after sorting, we get 1 4 5 6 8 9.

Step 2: one pass through the sorted boundaries. This step is O(J).
Aha — the time complexity of this solution boils down to the complexity of sorting J small positive integers whose values are below N.
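
Here is my sketch of Solution 1 as a boundary sweep. I use a std::map for the sorting step (the O(J logJ) variant); a counting-sort bucket array over the positions would give the O(N+J) variant.

    #include <vector>
    #include <map>
    #include <algorithm>

    struct Op { long a, b, k; };  // bump indices a..b (1-based, inclusive) by k

    // sort the boundary events, then one pass with a running "level";
    // the answer is the highest level ever reached
    long maxAfterBumps(const std::vector<Op>& ops) {
        std::map<long, long> delta;       // position -> net change when the sweep reaches it
        for (const Op& op : ops) {
            delta[op.a]     += op.k;      // level rises entering index a
            delta[op.b + 1] -= op.k;      // level drops just after index b
        }
        long level = 0, best = 0;
        for (const auto& kv : delta) {    // one pass over the sorted boundaries
            level += kv.second;
            best = std::max(best, level);
        }
        return best;
    }

    int main() {
        std::vector<Op> ops = { {1,5,3}, {4,8,7}, {6,9,1} };
        return maxAfterBumps(ops) == 10 ? 0 : 1;  // matches the worked example above
    }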

3overhead@creating a java stackframe]jvm #DQH

  • additional assembly instruction to prevent stack overflow… https://pangin.pro/posts/stack-overflow-handling mentions 3 “bang” instructions for each java method, except some small leaf methods
  • safepoint polling, just before popping the stackframe
  • (If the function call receives more than 6 arguments) put the first 6 args in registers and the remaining args on the stack. The ‘mov’ to the stack involves more instructions than a register move. The subsequent retrieval from the stack likely hits L1 cache, still slower than a register read.

age40-50career peak..really@@stereotype,brainwash,,

stereotype…

We all hear (and believe) that the 40-50 period is “supposed” to be the peak period in the life of a professional man. This expectation is created by the mass-media (and social-media such as LinkedIn) brainwash that presents middle-aged managers as the norm. If not a “manager”, then a technical architect or a doctor.

[[Preparing for Adolescence]] illustrates the peer pressure (+self-esteem stress) felt by the adolescent. I feel a Deja-vu. The notion of “normal” and “acceptable” is skewed by the peer pressure.

Q: Out of 100 middle-aged (professional or otherwise) guys, how many actually reach the peak of their career in their 40’s?
A: Probably below 10%.

In my circle of 40-somethings, the norm is plateau or slow decline, not peak. The best we could do is keep up our effort and slow down the decline, be it wellness, burn rate, learning capacity, income,,,

It’s therefore hallucinatory to feel left behind on the slow track.

Q: at what age did I peak in my career?
A: I don’t want to overthink about this question. Perhaps towards the end of my first US era, in my late 30s.

I think middle-aged professional guys should read [[reconciliations]] by Theodore Rubin. The false expectation creates immense burden.

const data member initialization: simple on the surface

The well-known Rule 1 — a const data member must be initialized exactly once, no more no less.

The lesser-known Rule 2 — for class-type data member, there’s an implicit default-initialization feature that can kick in without us knowing. This default-init interacts with ctor initializer in a strange manner.

On a side note, [[safeC++]] P38 makes clever use of Rule 2 to provide primitive wrappers. If you use such a wrapper in place of a (non-const) primitive field, then you eliminate the operational risk of “forgetting to initialize a non-const primitive field”.

The well-known Rule 3 — the proper way to explicitly initialize a const field is the ctor initializer, not inside ctor body.

The lesser-known Rule 4 — at run-time, once control passes into the ctor body, you can only modify/edit an already-initialized field. Illegal for a const field.

To understand these rules, I created an experiment in https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_misc/constFieldInit.cpp

— for primitive fields like int, Rule 2 doesn’t apply, so we must follow Rule 1 and Rule 3.

— for a class-type field like “Component”,

  • We can either leave the field “as is” and rely on the implicit Rule 2…., or
  • If we want to initialize explicitly, we must follow Rule 3. In this case, the default-init is suppressed by the compiler.

In either case, there’s only one initialization per const field (Rule 1)
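
Below is a condensed version of that experiment, with my own class names (not the actual contents of constFieldInit.cpp).

    #include <iostream>

    struct Component {
        int val;
        Component() : val(-1) { std::cout << "default-init\n"; }    // Rule 2 can invoke this silently
        Component(int v) : val(v) { std::cout << "explicit init\n"; }
    };

    class Host {
        const int prim;        // primitive const field: Rule 2 unavailable, so Rule 1 + Rule 3
        const Component comp;  // class-type const field: Rule 2 or Rule 3, never both
    public:
        Host() : prim(0) {}                    // comp omitted -> implicit default-init (Rule 2)
        Host(int v) : prim(v), comp(v) {       // comp listed -> Rule 3; default-init suppressed
            // prim = 42;                      // Rule 4: illegal -- once inside the ctor body,
        }                                      // a const field can no longer be modified
    };

    int main() {
        Host h1;     // prints "default-init"
        Host h2(7);  // prints "explicit init"
        (void)h1; (void)h2;
        return 0;
    }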

joinable instance of std::thread

[[effModernC++]] P 252 explains why in c++ a joinable std::thread object must not get destroyed. Such a destruction would trigger std::terminate(); therefore, programmers must make their std::thread objects non-joinable before destruction.

The key is a basic understanding of “joinable”. Informally, I would say a joinable std::thread has a real thread attached to it, even if that real thread has finished running. https://en.cppreference.com/w/cpp/thread/thread/joinable says “A thread that has finished executing code, but has not yet been joined is still considered an active thread of execution and is therefore joinable.”

An active std::thread object becomes unjoinable

  • after it is joined, or
  • after it is detached, or
  • after it is “robbed” via std::move()

The primary mechanism to transition from joinable to unjoinable is via join().
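
A tiny demo of these transitions (my sketch, not the book’s code):

    #include <thread>
    #include <iostream>

    int main() {
        std::thread t([]{ std::cout << "worker done\n"; });
        std::cout << std::boolalpha << t.joinable() << "\n";  // true, even after the lambda finishes

        std::thread t2(std::move(t));         // t is "robbed", so t becomes unjoinable
        std::cout << t.joinable() << "\n";    // false
        std::cout << t2.joinable() << "\n";   // true

        t2.join();                            // the primary transition to unjoinable
        std::cout << t2.joinable() << "\n";   // false
        return 0;                             // both objects are unjoinable, so no std::terminate()
    }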

std::thread key points

A Java thread needs start() before it becomes eligible for scheduling, but a c++ std::thread becomes eligible immediately after initialization, i.e. as soon as it is constructed with its target function.

For this reason, [[effModernC++]] dictates that between an int field and a std::thread field in a given class Runner, the std::thread field should be the last one initialized in the constructor. The int field needs to be already initialized if it is needed in the new thread.

Q1: Can you initialize the std::thread field in the constructor body?
A: yes unless the std::thread field is a declared const field

Now let’s say there’s no const field.

Q2: can the Runner copy ctor initialize the std::thread field in the ctor body, via move()?
A: yes provided the ctor parameter is non-const reference to Runner.
A: no if the parameter is a const reference to Runner. std::move(theConstRunner) would yield a const rvalue, not a usable rvr; std::thread’s ctor and op= only accept a (non-const) rvr, because std::thread is move-only.

See https://github.com/tiger40490/repo1/tree/cpp1/cpp/sys_thr for my experiments.
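
Here is my sketch of the Runner idea (class and member names are mine). It shows both the declare-the-std::thread-field-last advice and the Q2 point about move() inside a ctor taking a non-const reference.

    #include <thread>
    #include <utility>

    class Runner {
        int input;           // must be fully initialized before the new thread reads it ...
        std::thread worker;  // ... so the std::thread member is declared, and initialized, last
    public:
        Runner(int in) : input(in), worker(&Runner::loop, this) {}

        // Q2: a "copy" ctor taking a NON-const reference, stealing the thread via std::move().
        // With a const Runner& parameter this would not compile: std::move(other.worker)
        // would be a const rvalue, which cannot bind to std::thread's move ctor.
        Runner(Runner& other) : input(other.input), worker(std::move(other.worker)) {}

        ~Runner() { if (worker.joinable()) worker.join(); }  // keep it non-joinable before destruction
    private:
        void loop() { /* reads this->input */ }
    };

    int main() {
        Runner r1(42);
        Runner r2(r1);  // r1's thread is moved into r2; r1.worker is now unjoinable
        return 0;
    }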

##[18]G4qualities I admire]peers: !!status #mellow

I ought to admire my peers’ [1] efforts and knowledge (not their STATUS) on :

  1. personal wellness
  2. parenting
  3. personal finance, not only investment and burn rate
  4. mellowness to cope with the multitude of demands, setbacks, disappointments, difficulties, realities about the self and the competition
  5. … to be Compared to
    • zbs, portable GTD, not localSys
    • how to navigate and cope with office politics and big-company idiosyncrasies.

Even though some of my peers are not the most /accomplished/ , they make a commendable effort. That attitude is admirable.

[1] Many people crossing my path are … not really my peers, esp. those managers in China. Critical thinking required.

I don’t have a more descriptive title for this blogpost.

2011 white paper@high-perf messaging

https://www.informatica.com/downloads/1568_high_perf_messaging_wp/Topics-in-High-Performance-Messaging.htm is a 2011 white paper by some experts. I have saved the html in my google drive. Here are some QQ  + zbs knowledge pearls. Each sentence in the article can expand to a blogpost .. thin->thick.

  • Exactly under what conditions would TCP provide low-latency
  • TCP’s primary concern is bandwidth sharing, to ensure “pain is felt equally by all TCP streams”. Consequently, a latency-sensitive TCP stream can’t have priority over other streams.
    • Therefore, one recommendation is to use a dedicated network having no congestion or controlled congestion. Over this network, the latency-sensitive system would not be victimized by the inherent speed control in TCP.
  • to see how many received packets are delayed (on the receiver end) due to OOS (out-of-sequence arrival), use netstat -s
  • TCP guaranteed delivery is “better later than never”, but latency-sensitive systems prefer “better never than late”. I think UDP is the choice.
  • The white paper features an in-depth discussion of group rate. Eg: one mkt data sender feeding multiple (including some slow) receivers.

 

analyzing my perception of reality

Using words and numbers, am trying to “capture” my perceptions (intuitions + observations+ a bit of insights) of the c++/java job market trends, past and future. There’s some reality out there but each person including the expert observer has only a limited view of that reality, based on limited data.

Those numbers look impressive, but actually similar to the words — they are mostly personal perceptions dressed up as objective measurements.

If you don’t use words or numbers then you can’t capture any observation of the “reality”. Your impression of that reality [1] remains hopelessly vague. I now believe vague is the lowest level of comprehension, usually as bad as a biased comprehension. Using words + numbers we have a chance to improve our perception.

[1] (without words you can’t even refer to that reality)

My perceptions shape my decisions, and my decisions affect my family’s life chances.

My perceptions shape my selective listening. Gradually, actively, my selective listening would modify my “membrane” of selective listening! All great thinkers, writers update their membrane.

Am not analyzing reality. Instead, am basically analyzing my perception of the reality, but that’s the best I could do. I’m good at analyzing myself as an object.

Refusing to plan ahead because of high uncertainty is lazy, is pessimistic, is doomed.

latency zbs in java: lower value cf c++@@

Warning — latency measurement gotchas … is zbs but not GTD or QQ

— My tech bet — Demand for latency QQ will remain higher in c++ than java

  • The market’s perception would catch up with reality (assuming java is really no slower than c++), but the catch-up could take 30 years.
  • the players focused on latency are unused to the interference [1] by the language. C++ is more free-wheeling
  • Like assembly, c++ is closer to hardware.
  • In general, by design Java is not as natural a choice for low latency as c++ is, so even if java can match c++ in performance, it requires too much tweaking.
  • related to latency is efficiency. java is a high-level language and less efficient at the low level.

[1] In the same vein, (unlike UDP) TCP interferes with data transmission rate control, so even if I control both sender and receiver, I still have to cede control to TCP, which is a kernel component.

— jvm performance tuning is mainstream and socially meaningful iFF we focus on
* machine saturation
* throughput
* typical user-experience response time

— In contrast, a narrow niche area is micro-latency as in HFT

After listening to FPGA, off-heap memory latency … I feel the arms race of latency is limited to high-speed trading only. latency technology has limited economic value compared to mobile, cloud, cryptocurrency, or even data science and machine learning.

Churn?

accu?

 

find all subsums divisible by K #Altonomy

Q: modified slightly from Leetcode 974: Given an array of signed integers, print all (contiguous, non-empty) subarrays having a sum divisible by K.

https://github.com/tiger40490/repo1/blob/py1/py/algo_arr/subarrayDivisibleByK.py is my one-pass, linear-time solution. I consider this technique an algoQQ. Without prior knowledge, an O(N) solution is inconceivable.

I received this problem in an Altonomy hackerrank. I think Kyle gave me this problem too.

===analysis

Sliding window? I didn’t find any use.

Key idea — build a data structure to remember cumulative sums, and remember all the positions hitting a given “cumsum level”. My homegrown solution kSub3() shows more insight!
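
My linked py file stays the reference; below is my c++ sketch of the same prefix-sum idea, bucketing the positions of each cumsum residue (mod K). The scan itself is one pass; total work is the scan plus the size of the printed output.

    #include <vector>
    #include <cstdio>

    // prefix[j] % K == prefix[i] % K  <=>  the subarray (i, j] has a sum divisible by K
    void printSubarraysDivisibleByK(const std::vector<int>& a, int K) {
        std::vector<std::vector<int>> posByResidue(K);
        posByResidue[0].push_back(-1);  // the empty prefix, so subarrays starting at index 0 count
        long prefix = 0;
        for (int j = 0; j < (int)a.size(); ++j) {
            prefix += a[j];
            int r = (int)(((prefix % K) + K) % K);  // handle negative running sums
            for (int i : posByResidue[r])           // every earlier prefix with the same residue
                std::printf("subarray [%d, %d]\n", i + 1, j);  // 0-based, inclusive
            posByResidue[r].push_back(j);
        }
    }

    int main() {
        printSubarraysDivisibleByK({4, 5, 0, -2, -3, 1}, 5);  // the Leetcode 974 sample input
        return 0;
    }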

enumerate()iterate py list/str with idx+val

The built-in enumerate() is a nice optional feature. If you don’t want to remember this simple syntax, then yes you can just iterate over xrange(len(the_sequence))

https://www.afternerd.com/blog/python-enumerate/#enumerate-list is illustrated with examples.

— to enumerate backward,

Since enumerate() returns an iterator that can’t be reversed directly, you need to convert it to a list first.

for i, v in reversed(list(enumerate(vec))):

c++nlg pearls: xx new to refresh old 知新而后温故

Is this applicable in java? I think so, but my focus here is c++.

— 温故而知新 (reviewing the old to learn the new) is less effective at my level. Thick->thin, reflective.

— 知新而后温故 (learning the new, then revisiting the old) — x-ref, thin->thick->thin learning.

However, the pace of learning new knowledge pearls could appear very slow and disappointing: 5% new learning + 95% refresh. In such a case, the main benefit and goal is the refresh. Patience and realistic expectations are needed.

In some situations, the most effective learning is 1% new and 99% refresh. If you force yourself to 2% new and 98% refresh, learning would be less effective.

This technique is effective with distinct knowledge PEARLS. Each pearl can be based on a sentence in an article but developed into a blogpost.

 

non-volatile field can have volatile behavior #DQH

Unsafe.getObjectVolatile() and setObjectVolatile() should be the only access to the field.

I think for an integer or bool field (very important use cases), we need to use Unsafe.putIntVolatile() and Unsafe.getIntVolatile()

Q: why not use a volatile field?
A: I guess in some designs, a field need not be volatile at most access points, but at one access point it needs to behave like a volatile field. Qihao agrees that we want to control when to insert a load/store fence.

Non-volatile behavior usually has lower latency.

 

half%%peers could be Forced into retirement #Honglin

Reality — we are living longer and healthier.

Observation — compared to old men, old women tend to have more of a social life and more involvement with grandchildren.

I suspect that given a choice, half the white-collar guys in my age group actually wish to keep working past 65 (or 70), perhaps at a lower pace. In other words, they are likely to retire not by choice. My reasoning for the suspicion — besides financial needs, many in this group do not have enough meaningful, “engaging” things to do. Many would suffer.

It takes long-term planning to stay employed past 65.

I think most of the guys in this category do not prepare well in advance and will find themselves unable to find a suitable job. (We won’t put it this way, but) They will be kinda forced into early retirement. The force could be health or in-demand skillset or …

[19] keen observer@SG workers in their 40s

“Most of them in the 40s are already stable and don’t want to quit. Even though the pay may not be so good, they’re willing to work all the way[1]. It’s an easy-going life.”

The observer was comparing SG (white or blue collar) employees across age groups, and this is the brief observation of the 40-something. This observation is becoming increasingly descriptive of me… Semi-retired on the back of my passive income streams.

[1] I interpret “all the way” as all the way to retirement age, no change of direction, not giving in to boredom, sticking to the chosen career despite occasional challenges (pains, disappointments, setbacks).

 

local variables captured in nested class #Dropcopy

If an (implicitly final) local variable [1] is captured inside a nested class, where is the variable saved?

https://stackoverflow.com/questions/43414316/where-is-stored-captured-variable-in-java explains that the anonymous or local class instance has an implicit field to hold the captured variable !

[1] The local variable can be an arg passed into the enclosing function. It could be a primitive type or a reference type, i.e. a heapy thingy

The java compiler secretly adds this hidden field. Without this field, a captured primitive would be lost and a captured heapy would be unreachable when the local variable goes out of scope.

A few hours later, when the nested class instance needs to access this data, it would have to rely on the hidden field.

 

lambda^anon class instance ] java

A java lambda expression is used very much like an instance of an anonymous class. However, http://tutorials.jenkov.com/java/lambda-expressions.html#lambda-expressions-vs-anonymous-interface-implementations pointed out one interesting difference:

The anonymous instance in the example has an instance field. A lambda expression cannot have such fields. A lambda expression is thus said to be stateless.

get collection sum after K halving operations #AshS

Q: given a collection of N positive integers, you perform K operations like “halve the biggest element and replace it with the ceiling of that half”. Find the collection sum afterwards.

Note the collection size is always N. Note K(like 5) could exceed N(like 2), but I feel it would be trivial.

====analysis====

This is a somewhat contrived problem.

I think O(N + K log min(N,K)) is pretty good if feasible.
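
The straightforward approach I would code first is a max-heap, which is O(N + K log N) rather than the tighter bound above:

    #include <queue>
    #include <vector>
    #include <numeric>

    long long sumAfterKHalvings(const std::vector<long long>& nums, int K) {
        std::priority_queue<long long> heap(nums.begin(), nums.end());  // max-heap, O(N) build
        long long total = std::accumulate(nums.begin(), nums.end(), 0LL);
        for (int i = 0; i < K; ++i) {
            long long biggest = heap.top(); heap.pop();
            long long halved = (biggest + 1) / 2;  // the ceiling of biggest/2
            total -= biggest - halved;             // adjust the running sum instead of re-summing
            heap.push(halved);
        }
        return total;
    }

    int main() {
        // {10, 3}: halve 10 -> 5, then halve 5 -> 3 (ceiling); the sum becomes 3 + 3 = 6
        return sumAfterKHalvings({10, 3}, 2) == 6 ? 0 : 1;
    }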

git | merge-commits and pull-requests

Key question — Q1: which commit would have multiple parents?

— scenario 1a:

  1. Suppose your feature branch brA has a commit hash1 at its tip; and master branch has tip at hashJJ, which is the parent of hash1
  2. Then you decide to simply q[ git merge brA ] into master

In this simple scenario, your merge is a fast-forward merge. The updated master would now show hash1 at the tip, whose only parent is hashJJ.

A1: No commit would have multiple parents. Simple result. This is the default behavior of git-merge.

Note this scenario is similar to https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges#rebase-and-merge-your-pull-request-commits

However, the github and bitbucket pull-request flows don’t support it exactly.

— scenario 1b:

Instead of a simple git-merge, what about a pull request? A pull-request uses q[ git merge --no-ff brA ], which (I think) unconditionally creates a merge-commit hashMM on master.

A1: now hashMM has two parents. In fact, git-log shows hashMM as a “Merge” with two parent commits.

Result is unnecessarily complex. Therefore, in such simple scenarios, it’s better to use git-merge rather than pull request.

https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges explains the details.

— Scenario 2: What if ( master’s tip ) hashJJ is Not parent of hash1?

Now master and brA have diverged. I think you can’t avoid a merge commit hashMM.

A1: hashMM

— Scenario 3: continue from Scenario 1b or Scenario2.

3. Then you commit on brA again , creating hash2.

Q: What’s the parent node of hash2?
A: I think git actually shows hash1 as the parent, not hashMM !

Q: is hashMM on brA at all?
A: I don’t think so but some graphical tools might show hashMM as a commit on brA.

I think now the master branch shows hashMM having two parents (hash1 + the old master tip), and brA shows hash1 -> hash2.

I guess that if after the 3-way-merge, you immediately re-create (or reset) brA from master, then hash2’s parent would be hashMM.


Note

  • direct-commit on master is implicitly fast-forward, but merge can be fast-forward or non-fast-forward.
  • fast-forward merge can be replaced by a rebase as in Scenario 1a. Result is same as direct-commit.
  • a no-fast-forward merge (Scenario 1b) and a 3-way merge (Scenario 2) both create a merge-commit.
  • git-pull includes a git-merge without --no-ff

Optiver coding hackathon is like marathon training

Hi Ashish,

Looking back at the coding tests we did together, I feel it’s comparable to a form of “marathon training” — I seldom run longer than 5km, but once a while I get a chance to push myself way beyond my limits and run far longer.

Extreme and intensive training builds up the body capacity.

On my own, it’s hard to find motivation to run so long or practice coding drill at home, because it requires a lot of self-discipline.

Nobody has unlimited self-discipline. In fact, those who run so much or take on long-term coding drills all have something besides self-discipline. Self-discipline and brute-force willpower are insufficient to overcome the inertia in every one of these individuals. Instead, the invisible force, the wind beneath their wings, is some form of intrinsic motivation. These individuals find joy in the hard drill.

( I think you are one of these individuals — I see you find joy in lengthy sessions of jogging and gym workout. )

Without enough motivation, we need “organized” practice sessions like real coding interviews or hackathons. This Optiver coding test could probably improve my skill level from 7.0 to 7.3, in one session. Therefore, these sessions are valuable.

[18]latency stat: typical sell-side DMA box:10 μs

(This topic is not GTD not zbs, but relevant to some QQ interviewers.)

https://www.youtube.com/watch?v=BD9cRbxWQx8 is a 2018 presentation.

  1. AA is when a client order hits a broker
  2. Between AA and BB is the entire broker DMA engine in a single process, which parses the client order, maintains order state, consumes market data and creates/modifies the outgoing FIX msg
  3. BB is when the broker ships the FIX msg out to exchange.

Edge-to-edge latency from AA to BB, if implemented in a given language:

  • python ~ about 50 times longer than java
  • java – can aim for 10 micros if you are really really good. Dan recommends java as a “reasonable choice” iFF you can accept 10+ micros. Single-digit microsecond shops should “take a motorbike not a bicycle”.
  • c# – comparable to java
  • FPGA ~ about 1 micro
  • ASIC ~ 400 ns

— c/c++ can only aim for 10 micros … no better than java.

The stronghold of c++, the space between java and fpga, is shrinking … “constantly” according to Dan Shaya. I think “constantly” is like the growth of Everest.. perhaps by 2.5 inches a year

I feel c++ is still much easier, more flexible than FPGA.

I feel java programming style would become more unnatural than c++ programming in order to compete with c++ on latency.

Kenneth of MLP said his engine gets a CMF-format order message from the PM (AA), does some minimal checks and (BB) sends it as FIX to the broker. Median latency from AA to BB is 40 micros.

— IPC latency

Shared memory beats TCP hands down. For an echo test involving two processes:

Using an Aeron same-host messaging application, 50th percentile is 250 ns. I think NIC and possibly kernel (not java or c++) are responsible for this latency.

Kenneth said shared-memory latency (also Aeron same-host) is 1-4 micros, measured between XX) the PM writes the order object into shm and AA) the engine reads the order from shm.

SG dev salary: FB^banks

Overall, it’s a positive trend that non-finance employers are showing increasing demand for candidates at my salary level. More demand is a good thing for sure.

Even though these tech shops can’t pay me the same as MLP does, 120k still makes a good living

https://news.efinancialcareers.com/sg-en/3001699/salaries-pay-facebook-singapore is a curated review of self-reported salary figures on glassdoor.com

Mid-level dev base salary SGD 108k, much lower than U.S.

FB SG has 1000 headcount.

— google SG

https://news.efinancialcareers.com/sg-en/3001375/google-salaries-pay-bonuses-singapore is on google, but I find the dev salary figures unreliable.

https://www.quora.com/How-much-do-Google-Singapore-Software-Engineers-earn is another curated review.

— banks

https://news.efinancialcareers.com/sg-en/3000694/banking-technology-salaries-singapore?_ga=2.5598835.1762609396.1594807443-15595788.1594807443

sponsored DMA

Context — a buy-side shop (say HRT) uses a DMA connection sponsored by a sell-side like MS (or Baml or Instinet) to access NYSE. MS provides a DMA platform like Speedway.

The HRT FIX gateway would implement the NYSE FIX spec. Speedway also has a FIX spec for HRT to implement. This spec should include minor customization on the NYSE spec.

I have seen the HPR spec. (HPR is like an engine running in Baml or GS or whatever.) The HPR spec seems to talk about customization for NYSE, Nsdq etc …re Gary chat.

Therefore, the HRT FIX gateway to NYSE must implement, in a single codebase,

  1. NYSE spec
  2. Speedway spec
  3. HPR spec
  4. Instinet spec
  5. other sponsors’ spec

The FIX session would be provided (“sponsored”) by MS or Baml, or Instinet. I think the HRT FIX gateway would connect to some IP address belonging to the sponsor like MS. Speedway would forward the FIX messages to NYSE, after some risk checks.

VWAP=a bmark^executionAlgo

In the context of broker algos (i.e. execution algos offered by a broker), vwap is

  • A benchmark for a bulk order
  • An execution algo aimed at the benchmark. The optimization goal is to minimize slippage against this benchmark. See other blogposts about slippage.

The vwap benchmark is simple, but the vwap algo implementation is non-trivial, often a trade secret.

Avichal: too-many-distractions

Avichal is observant and sat next to me for months. Therefore I value his judgment. Avichal is the first to point out I was too distracted.

For now, I won’t go into details on his specific remarks. I will simply use this simple pointer to start a new “thread”…

— I think the biggest distraction at that time was my son.

I once (never mind when) told grandpa that I want to devote 70% of my energy to my job (and 20% to my son), but now whenever I want to settle down and dive deep into my work, I feel the need and responsibility to adjust my schedule, cater to my son, and try to entice him to study a little bit more.

My effort on my son is like Driving uphill with the hand-brake on.

As a result, I couldn’t have a sustained focus.

gradle: dependency-jar refresh, cache, Intellij integration..

$HOME/.gradle holds all the jars from all previous downloads.

[1] When you turn on debug, you can see the actual download: gradle build --debug.

[2] Note IDE java editor can use version 123 of a jar for syntax check, but the command line compilation can use version 124 of the jar. This is very common in all IDEs.

When I make a change to a gradle config,

  • Intellij prompts for gradle import. This seems to be unnecessary re-download of all jars — very slow.
  • Therefore, I ignore the import. I think as a result, the Intellij java editor [2] would still use the previous jar version, as the old gradle config is in effect. I live with this because my focus is on the compilation.
  • For compilation, I use the gradle “build” action (probably similar to a command-line build). Very fast, but why? Because only one dependency jar is refreshed [3]
  • Gary used debug build [1] to prove that this triggers a re-download of specific jars iFF you delete the jars from $HOME/.gradle/caches/modules-2/files-2.1

[3] For a given dependency jar, “refresh” means download a new version as specified in a modified gradle config.

— in console, run

gradle build #there should be a ./build.gradle file

Is java/c# interpreted@@No; CompiledTwice!

category? same as JIT blogposts

Q: are java and c# interpreted? QQ topic — academic but quite popular in interviews.

https://stackoverflow.com/questions/8837329/is-c-sharp-partially-interpreted-or-really-compiled shows one explanation among many:

The term “Interpreter” referencing a runtime generally means existing code interprets some non-native code. There are two large paradigms — Parsing: reads the raw source code and takes logical actions; bytecode execution : first compiles the code to a non-native binary representation, which requires much fewer CPU cycles to interpret.

Java originally compiled to bytecode, then went through an interpreter; now, the JVM reads the bytecode and just-in-time compiles it to native code. CIL does the same: The CLR uses just-in-time compilation to native code.

C# compiles to CIL, which the JIT then compiles to native code; by contrast, Perl immediately compiles a script to bytecode, and then runs this bytecode through an interpreter.

bone health for dev-till-70 #CSY

Hi Shanyou,

I have a career plan to work as a developer till my 70’s. When I told you, you pointed out bone health, to my surprise.

You said that some older adults suffer a serious bone injury and become immobile. As a result, other body parts suffer, including weight, heart, lung, and many other organs. I now believe loss of mobility is a serious health risk.

These health risks directly affect my plan to work as a developer till my 70’s.

Lastly, loss of mobility also affects our quality of life. My mom told me about this risk 20 years ago. She has since become less vocal about this risk.

Fragile bones become more common when we grow older. In their 70’s, both my parents suffered fractures and went through surgeries.

See ## strengthen our bones, reduce bone injuries #CSY for suggestions.

available time^absorbency[def#4]:2 limiting factors

see also ## identify your superior-absorbency domains

Time is a quintessential /limiting factor/ — when I try to break through and reach the next level on some endeavor, I often hit a /ceiling/ not in terms of my capacity but in terms of my available time. This is a common experience shared by many, therefore easy to understand. In contrast, a more subtle experience is the limiting factor of “productive mood” [1].

[1] This phrase is vague and intangible, so sometimes I speak of “motivation” — not exactly same and still vague. Sometimes I speak of “absorbency” as a more specific proxy.

“Time” is used as per Martin Thompson.

  • Specific xp: Many times I took leaves to attend an IV. The time + absorbency is a precious combination that leads to breakthrough in insight and muscle-building. If I only provide time to myself, most of the time I don’t achieve much.
    • I also take leave specifically to provide generic “spare time” for myself but usually can’t achieve the expected ROTI.
  • Specific xp: yoga — the heightened absorbency is very rare, far worse than jogging. If I provide time to myself without the absorbency, I won’t do yoga.
  • the zone — (as described in my email) i often need a block of uninterrupted hours. Time is clearly a necessary but insufficient condition.
  • time for workout — I often tell my friends that lack of time sounds like an excuse given the mini-workout option. Well, free time still helps a lot, but motivation is more important in this case.
  • localSys — absorbency is more rare here than coding drill, which is more rare than c++QQ which is more rare than java QQ
  • face time with boy — math practice etc.. the calm, engaged mood on both sides is very rare and precious. I tend to lose my cool even when I make time for my son.
  • laptop working at train stations — like MRT stations or 33rd St … to capture the mood. Available time by itself is useless

exec algo: with-volume

— WITH VOLUME
Trade in proportion to actual market volume, at a specified trade rate.

The participation rate is fixed.

— Relative Step — with a rate following a step-up algo.

This algo dynamically adjusts aggressiveness(participation rate) based on the
relative performance of the stock versus an ETF. The strategy participates at a target percentage of overall
market volume, adjusting aggressiveness when the stock is
significantly underperforming (buy orders) or outperforming (sell orders) the reference security since today’s open.

An order example: “Buy 90,000 shares 6758.T with a limit price of ¥2500.
Work order with a 10% participation rate, scaling up to 30%
whenever the stock is underperforming the Nikkei 225 ETF (1321.OS)
by 75 basis points or more since the open.”

If we notice the reference ETF has a 2.8% return since open and our 6758.T has a 2.05% return, then the engine would assume 6758.T is significantly underperforming its peers (in its sector). The engine would then step up the participation to 30%, buying more aggressively, perhaps using bigger and faster slices.

What if the ETF has dropped 0.1% and 6758.T has dropped 0.85%? This would be unexpected since our order is a large order boosting the stock. Still, the other investors might be dumping this stock. The engine would still perceive the stock as underperforming its peers, and step up the buying speed.

Y alpha-geeks keep working hard #speculation

Based on my speculation, hypothesis, imagination and a tiny bit of observation.

Majority of the effective, efficient, productive tech professionals don’t work long hours because they already earn enough. Some of them can retire if they want to.

Some percentage of them quit a big company or a high position, sometimes to join a startup. One of the reasons — already earned enough. See my notes on envy^ffree

Most of them value work-life balance. Half of them put this value not on lips but on legs.

Many of them still choose to work hard because they love what they do, or want to achieve more, not because no-choice. See my notes on envy^ffree

fixtag-Num,fixtag-nickname,fixtag-Val #my jargon

a fixtag is not an atomic item. Instead, a fixtag usually comprises two parts, namely the value and the identifier.

The identifier is usually a fixtag-num.

Note the fixtag-name is not always unique ! It’s more like a fixtag-descriptor or fixtag-nickname, not part of the actual message, so they are merely standard nicknames!

“Payload” is not a good term. “pair” is kinda uncommon.

 

by-level traversal output sequence: isBST #AshS

Q: given a sequence of payload values produced from a by-level traversal of a binary tree, could the tree be a BST?

Ashish gave me this question. We can assume the values are floats.

====analysis

(Not contrived, somewhat practical)

— idea 1:

Say we have lined up the values found on level 11. We will split the line-up into sections, each split point being a Level-10 value.

Between any two adjacent values on level 10 (previous level), how many current-level values can legally fit in? I would say up to 2.

In other words, each section can have two nodes or fewer. I think this is a necessary (sufficient?) condition.

— idea 2: One-pass algo

Try to construct a BST as we consume the sequence.

Aha — there’s only one possible BST we can build. If we can’t build one, then return false. I think most sequences can produce a BST, but not all. For example, the sequence 2,3,1 cannot: 3 must be the right child of 2, and then 1 (smaller than 2) has no legal slot that would still place it after 3 in the by-level output.
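
Below is my sketch of the one-pass check, a known greedy technique: walk the sequence and hand out BST “slots”, each slot being the open interval the next value may occupy; a value consumes the earliest slot it fits, and a valid sequence consumes every value. Values are assumed distinct.

    #include <queue>
    #include <vector>
    #include <limits>

    bool canBeBstLevelOrder(const std::vector<double>& a) {
        if (a.empty()) return true;
        struct Slot { double lo, hi, val; };  // a placed node plus its still-open range
        const double INF = std::numeric_limits<double>::infinity();
        std::queue<Slot> q;
        q.push({-INF, INF, a[0]});
        std::size_t i = 1;
        while (!q.empty() && i < a.size()) {
            Slot s = q.front(); q.pop();
            if (i < a.size() && s.lo < a[i] && a[i] < s.val)  // becomes the left child of s
                { q.push({s.lo, s.val, a[i]}); ++i; }
            if (i < a.size() && s.val < a[i] && a[i] < s.hi)  // becomes the right child of s
                { q.push({s.val, s.hi, a[i]}); ++i; }
        }
        return i == a.size();  // every value found a legal slot
    }

    int main() {
        bool ok1 = canBeBstLevelOrder({7, 4, 12, 3, 6, 8, 1, 5, 10});  // true
        bool ok2 = canBeBstLevelOrder({2, 3, 1});                      // false, per the note above
        return (ok1 && !ok2) ? 0 : 1;
    }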

(unordered)map erase: prefer by-key !! itr #AshS

Relevant in coding tests like speed-coding, take-home, onsite. Practical knowledge is power !

As shown in https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_misc/mapEraseByVal.cpp

.. by-key is cleaner, not complicated by the iterator invalidation complexities.

You can save all the “bad” keys, and later erase one by one, without the invalidation concerns. You can also print the keys.

if you were to save the “bad” iterators, then once you erase one iterator, are the other iterators affected? No, but I don’t want to remember.

STL iterator invalidation rules, succinctly has a succinct summary, but I don’t prefer to deal with these complexities when I have a cleaner alternative solution.
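
A self-contained variant of the by-key approach (my sketch, not the exact mapEraseByVal.cpp):

    #include <unordered_map>
    #include <string>
    #include <vector>
    #include <iostream>

    int main() {
        std::unordered_map<std::string, int> scores = {{"aa",1},{"bb",-2},{"cc",3},{"dd",-4}};

        std::vector<std::string> badKeys;  // collect the "bad" keys during the scan ...
        for (const auto& kv : scores)
            if (kv.second < 0) badKeys.push_back(kv.first);

        for (const auto& k : badKeys) {            // ... then erase by key, one by one
            std::cout << "erasing " << k << "\n";  // we can also log/print the keys
            scores.erase(k);                       // by-key overload: no iterator juggling
        }
        return scores.size() == 2 ? 0 : 1;
    }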

std::pair mvctor = field-by-field

std::pair has no pointer field so I thought it needs no meaningful mvctor, but actually std::pair mvctor is defaulted i.e. field-wise move i.e. each field is moved.

If a pair holds a vector and a string, then the vector would be move-constructed, and so would the string.

Q1: So what kind of simple class would have no meaningful mvctor?
%%A: I would say a class holding no pointer whatsoever. Note it can embed another class instance as a field.

Q2: so why is std::pair not one of them?
A: std::pair is a template, so the /concrete/ field type can be a dynamic container including std::string.

All dynamic containers use pointers internally.
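
A quick demo of the field-wise move (my sketch):

    #include <utility>
    #include <vector>
    #include <string>
    #include <cassert>

    int main() {
        std::pair<std::vector<int>, std::string> src{ {1, 2, 3}, "hello" };
        auto dst = std::move(src);      // pair's defaulted mvctor: each field is moved

        assert(dst.first.size() == 3);  // the vector's heap buffer was stolen, not copied
        assert(dst.second == "hello");  // the string was moved too (chars may be copied under SSO)
        // src is now in a moved-from (valid but unspecified) state, typically with empty fields
        return 0;
    }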

fine print ] source code

Q1: “Here is 95% of the logic” — In what contexts is such a documentation considered complete? Answered at the end.

When we programmers read source code and document the “business logic” implemented thereby, we are sometimes tempted to write “My write-up captures the bulk of the business logic. I have omitted minor details, but they are edge cases. At this stage we don’t need to worry about them”. I then hope I have not glossed over important details. I hope the omitted details are just fine prints. I was proven wrong time and again.

Sound byte: source code is all details, nothing but details.

Sound byte: Everything in source code is important detail until proven otherwise. The “proof” takes endless effort, so in reality, Everything in source code is important detail.

The “business logic” we are trying to capture actually consists of not only features and functionalities, but functional fragments i.e. the details.

When we examine source code, a visible chunk of code with explicit function names, variable names, or explicit comments are hard to miss. Those are the “easy parts”, but what about those tiny functional fragments …

  • Perhaps a short condition buried in a complicated if/while conditional
  • Perhaps a seemingly useless catch block among many catches.
  • Perhaps a break/continue statement that seems to serve no purpose
  • Perhaps an intentional switch-case fall-through
  • Perhaps a seemingly unnecessary sanity check? I tend to throw in lots of them
  • Perhaps some corner-case error-handling module that looks completely redundant and forgettable, esp. compared to other error handlers.
  • Perhaps a variable initialization “soon” before an assignment
  • Perhaps a missing curly brace after a for-loop header

( How about the equal sign in “>=” … Well, that’s actually a highly visible fragment, because we programmers have trained vision to spot that “=” buried therein. )

Let me stress again. The visibility, code size … of a functional fragment is no indication of its relative importance. The low-visibility, physically small functional fragments can be just as important as a more visible functional fragment.

To the computer, all of these functional fragments are equally significant. Each could have impact on a production request or real user.

Out of 900 such “functional fragments”, which ones deal with purely theoretical scenarios that would never arise (eg extraneous data) .. we really don’t know without analyzing tons of production data. One minor functional fragment might get activated by a real production data item. So the code gets executed unexpectedly, usually with immediate effect, but sometimes invisibly, because its effect is concealed by subsequent code.

I would say there are no fine-prints in executable source code. Conversely, every part of executable source code is fine print, including the most visible if/else. Every executable code has a real impact, unless we use real production data to prove otherwise.

A1: good enough if you have analyzed enough production data to know that every omitted functional fragment is truly unimportant.

intellij=cleaner than eclipse !

intellij (the community version) is much cleaner than eclipse, and no less rich in features.

On a new job, My choice of java ide is based on
1) other developers in the team, as I need their support

2) online community support — as most questions are usually answered there
I think eclipse beats intellij

3) longevity — I hate to learn a java ide and lose the investment when it loses relevance.
I think eclipse beats intellij, due to open-source

Other factors include “clean”.

The most popular tools are often vastly inferior for me. Other examples:
* my g++ install in strawberryPerl is better than all the windows g++ installs esp. msvs
* my git-bash + strawberryPerl is a better IDE than all the fancy GUI tools
* wordpress beats blogger.com hands down
* wordpad is a far simpler rich text editor than msword or browsers or mark-down editors

value-based type as mutex #Optional.java

Value-based type is a new concept in java. Optional.java is the only important example I know, so I will use it as illustration.

One of the main ideas about value types is the lack of object identity (or perhaps their identity is detectable only to the underlying implementation i.e. JVM not Java applications). In such a world, how could we tell whether variables aa and bb “really” are the same or different?

Q: why avoid locking on value-based objects?
%%A: locking is based on identity. See why avoid locking on boxed Integers
A: https://stackoverflow.com/questions/34049186/why-not-lock-on-a-value-based-class

Side note — compared to java, c++ has a smaller community and collective brain power so discussions are more limited.

Y avoid us`boxed Integer as mutex

https://stackoverflow.com/questions/34049186/why-not-lock-on-a-value-based-class section on “UPDATE – 2019/05/18” has a great illustration

Auto-boxing of a small value like “33” returns the same cached Integer object every time (the -128..127 cache), while larger values usually produce distinct objects. Either way, the compiler/JVM has leeway, just as in c++, so object identity is unreliable.

Remember: locking is based on object identity.

wage+homePrice: biased views@China colleagues

— salary

Many China colleagues (YH, CSY, CSDoctor, Jenny as examples) say their classmates back in China earn much more than they themselves earn in the U.S. These colleagues seem to claim the typical Chinese counterpart out-earns the U.S.-based professional.

Reality — those classmates are outliers. The average salary in China is much lower than U.S. Just look at statistics.

Some of these guys (like CSY) feel inferior and regret coming to the U.S.

— cost of reasonable lifestyle

Many Chinese friends complain that cost level is higher in China than U.S. and Singapore. A young MLP colleague (Henry) said a RMB 500k/year feels insufficient to a Chinese 20-something.

In reality, the per-sqft property price is indeed higher in some Chinese cities than in the U.S. For everything else in China, the cost level is much lower than in the U.S. Just look at statistics.

success in long-term learning: keen≠interest

For both my son and my own tech learning over the long term, “interest” is not necessarily the best word to capture the key factor.

I was not really interested in math (primary-secondary) or physics (secondary). In college, I tried to feel interested in electronics, analog IC design etc but was unsuccessful. At that level, extrinsic motivation was the only “interest” and the real motivation in me. Till today I don’t know if I have found a real passion.

Therefore, the strongest period of my life to look at is not college but before college. Going through my pre-U schools, my killer strength was not so much “interest” but more like keenness — sharp, quick and deep, absorbency…

Fast forward to 2019, I continue to reap rewards due to the keenness — in terms of QQ and zbs tech learning. Today I have stronger absorbency than my peers, even though my memory, quick-n-deep, sharpness .. are no longer outstanding.

Throughout my life, Looking at yoga, karaoke, drawing, sprinting, debating, piano, .. if I’m below-average and clueless, then I don’t think I can maintain “interest”.

Optional.java notes

Q: if an optional is empty, will it remain forever empty?

— An Optional.java variable could be null but should never be, as the instance needs state to hold at least the boolean isPresent.

If a method is declared to return Optional<C>, then the author needs to ensure she doesn’t return a null Optional! This is not guaranteed by the language.

https://dzone.com/articles/considerations-when-returning-java-8s-optional-from-a-method illustrates a simple rule — use a local var retVal throughout then, at the very last moment, return Optional.ofNullable(retVal). This way, retVal can be null but the returned reference is never null.

If needed, an Optional variable should be initialized to Optional.empty() rather than null.

https://dzone.com/articles/considerations-when-returning-java-8s-optional-from-a-method shows the pitfalls of returning Optional from a method.

I feel this is like borrowing from a loan shark to avoid accidental credit card interest charge.

If you are careful, it can help you avoid the NPE of the traditional practice, but if you are undisciplined (like most of us), then this new strategy is even worse —

Traditional return type is something like String, but caller has to deal with nulls. New strategy is supposed to remove the risk/doubt of a null return value, but alas, caller still needs to check null first, before checking against empty!

— immutability is tricky

  1. the referent object is mutable
  2. the Optional reference can be reseated, i.e. not q[ final ]
  3. the Optional instance itself is immutable.
  4. Therefore, I think an Optional == a mutable ptr to a const wrapper object enclosing a regular ptr to a mutable java object.

Similarity to String.java — [B/C]

Compare to shared_ptr instance — [A] is true.

  • C) In contrast, a shared_ptr instance has mutable State, in terms of refcount etc
  • B) I say Not-applicable as I seldom use a pointer to a shared_ptr

— get() can throw exception if not present

— not serializable

— My motivation for learning Optional is 1) QQ 2) simplify my design in a common yet simple scenario

https://www.mkyong.com/java8/java-8-optional-in-depth/ is a demo, featuring … flatMap() !!

OO-modeling: c++too many choices

  • container of polymorphic Animals (having vtbl);
  • Nested containers; singletons;
  • class inheriting from multiple supertypes ..

In these and other OO-modeling decisions, there are many variations of “common practices” in c++ but in java/c# the best practice usually boils down to one or two choices.

No-choice is a Very Good Thing, as proven in practice. Fewer mistakes…

These dynamic languages rely on a single big hammer and make everything look like a nail….

This is another example of “too many variations” in c++.

dev jobs ] Citi SG+NY #Yifei

My friend Yifei spent 6+ years in ICG (i.e. the investment banking arm) of Citi Singapore.

  • Over 6Y no layoff. Stability is Yifei’s #1 remark
  • Some old timers stay for 10+ years and have no portable skill. This is common in many ibanks.
  • Commute? Mostly in Changi Biz Park, not in Asia Square
  • Low bonus, mostly below 1M
  • VP within 6 years is unheard-of for a fresh grad

I feel Citi is rather profitable and not extremely inefficient, just less efficient than other ibanks.

Overall, I have a warm feeling towards Citi and I wish it would survive and thrive. It offers good work-life balance, much better than GS, ML, LB etc

[17] deadlock involving 1 lock(again) #StampedLock

— Scenario 3: If your app uses just one lock, you can still get deadlock. Eg: ThreadM (master) starts ThreadS (slave). S acquires a lock. M calls the now-deprecated suspend() method on S, then tries to acquire the lock, which is held by the suspended S. S is waiting to be resumed by M — deadlock.

Now the 4 conditions of deadlock:
* wait-for cycle? indeed
* mutually exclusive resource? Just one resource
* incremental acquisition? indeed
* non-Preemptive? Indeed. But if there’s a third thread, it can resume ThreadS.

— Scenario 2: a single thread using a single non-reentrant lock (such as StampedLock.java and many c++ lock objects) can deadlock with itself. A common mis-design.

— Scenario 1:

In multithreaded apps, I feel single-lock deadlock is very common. Here’s another example — “S gets a lock, and wait for ConditionA. Only M can create conditionA, but M is waiting for the lock.”

Here, ConditionA can be many things, including a Boolean flag, a JMS message, a file, a count to reach a threshold, a data condition in DB, a UI event where UI depends partly on ThreadM.

P90 [[eff c#]] also shows a single-lock deadlock. Here’s my recollection — ThrA acquires the lock and goes into a while-loop, periodically checking a flag to be set by ThrB. ThrB wants to set the flag but needs the lock.
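
Scenario 2 in c++, as a minimal sketch with std::mutex (the second lock() by the same thread is undefined behavior and in practice hangs forever):

    #include <mutex>

    std::mutex m;  // a NON-reentrant lock, unlike std::recursive_mutex

    void inner() { std::lock_guard<std::mutex> g(m); /* ... */ }  // innocently re-locks

    void outer() {
        std::lock_guard<std::mutex> g(m);
        inner();  // single-thread, single-lock self-deadlock
    }

    int main() {
        // outer();  // commented out: calling it would hang the program
        return 0;
    }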

 

debugger stepping into library

I often need my debugger to step into library source code.

Easy in java:

c++ is harder. I need to find more details.

  • in EclipseCDT, STL source code is available to the IDE (probably because class templates are usually in the form of header files), and the debugger is able to step through it, but not so well.

Overall, I feel debugger support is significantly better in VM-based languages than in c++, even though debuggers were invented before these new languages.

I guess the VM or the “interpreter” can serve as an “interceptor” between debugger and target application. The interceptor can receive debugger commands and suspend execution of the target application.

complacent guys]RTS #comfortZone #DeepakCM

Deepak told me that Rahul, Padma etc stayed in RTS for many years and became “complacent” and uninterested in tech topics outside their work. I think Deepak has sharp observations.

I notice many Indian colleagues (compared to East European or Chinese colleagues) are uninterested in zbs or QQ topics. I think many of them learn the minimum to pass tech interviews. CSY has this attitude on coding IV but the zbs attitude on socket knowledge.

–> That’s a fundamental reason for my QQ strength on the WallSt body-building arena.

If you regularly benchmark yourself externally, often against younger guys, you are probably more aware of your standing, your aging, the pace of tech churn, … You live your days under more stress, both negative and positive stress.

I think these RTS guys may benchmark internally once in a while, if ever. If the internal peers are not very strong, then you would get a false sense of strength.

The RTS team may not have any zbs benchmark, since GTD is (99.9%) the focus for the year-end appraisal.

These are some of the reasons Deepak felt 4Y is the max .. Deepak felt H1 guys are always on their toes and therefore more fit for survival.

fear@large codebase #web/script coders

One Conclusion — my c++ /mileage/ made me a slightly more confident, and slightly more competent programmer, having “been there; done that”, but see the big Question 1 below.

— Historical view

For half my career I avoided enterprise technologies like java/c++/c#/SQL/storedProc/Corba/sockets… and favored light-weight technologies like web apps and scripting languages. I suspect that many young programmers also feel the same way — no need to struggle with the older, harder technologies.

Until GS, I was scared of the technical jargon, complexities, low-level API’s debuggers/linkers/IDE, compiler errors and opaque failures in java/SQL … (even more scared of C and Windows). Scared of the larger, more verbose codebases in these languages (cf the small php/perl/javascript programs)… so scared that I had no appetite to study these languages.

— many guys are unused to large codebases

Look around your office. Many developers have at most a single (rarely two) project involving a large codebase. Large like 50k to 100k lines of code excluding comments.

I feel the devops/RTB/DBA or BA/PM roles within dev teams don’t require the individual to take on those large codebases. Since it’s no fun, time-consuming and possibly impenetrable, few of them would take it on. In other words, most people who try would give up sooner or later. Searching in a large codebase is perhaps their first challenge. Even figuring out a variable’s actual type can be a challenge in a compiled language.

Compiling can be a challenge esp. with C/c++, given the more complex tool chain, as Stroustrup told me.

Tracing code flow is a common complexity across languages but worse in compiled languages.

In my experience, perl,php,py,javascript codebases are usually small like pets. When they grow to big creatures they are daunting and formidable just like compiled language projects. Some personal experiences —
* Qz? Not a python codebase at all
* pwm comm perl codebase? I would STILL say codebase would be bigger if using a compiled language

Many young male/female coders are not committed to large scale dev as a long-term career, so they probably don’t like this kinda tough, boring task.

— on a new level

  • Analogy — if you have not run marathons you would be afraid of it.
  • Analogy — if you have not coached a child on big exams you would be afraid of it.

I feel web (or batch) app developers often lack the “hardcore” experience described above. They operate at a higher level, cleaner and simpler. Note Java is cleaner than c++. In fact I feel weaker as a java programmer compared to a c++ programmer.

Q1: I have successfully mastered a few sizable codebases in C++, java, c#. So how many more successful experiences do I need to feel competent?
A: ….?

Virtually every codebase feels too big at some time during the first 1-2 years, often when I am in a low mood, despite the fact that in my experience, I was competent with many of these large codebases.
I think Ashish, Andrew Yap etc were able to operate well with limited understanding.
I now see the whole experience as a grueling marathon. Tough for every runner, but I tend to start the race assuming I’m the weakest — impostor syndrome.
Everyone has to rely on logs and primitive code-browsing tools. Any special tools usually add marginal value. With java, a live debugger is the most promising tool but still offers limited pain-relief. Virtually all of my fellow developers face exactly the same challenges, so we all have to guess. I mean Yang, Piroz, Sundip, Shubin, … virtually all of them, even the original authors of the codebase. Even after spending 10Y with a codebase, we could face opaque issues. However, these peers are more confident against ambiguity.

[19] 4 Deeper mtv2work4more$ After basic ffree

Note sometimes I feel my current ffree is so basic it’s not real ffree at all. At other times I feel it is real, albeit basic, ffree. After achieving my basic ffree, here are 4 deeper motivations for working hard for even more money:

  • am still seeking a suitable job for Phase B. Something like a light-duty, semi-retirement job providing plenty of free time (mostly for self-learning, blogging, helping kids). This goal qualifies as a $-motivation because … with more financial resources, I can afford to take some desirable Phase-B jobs at lower pay. In fact, I did try this route in my 2019 SG job search.
  • I wish to spend more days with grandparents — need more unpaid leaves, or work in BJ home
  • more respect, from colleagues and from myself
  • stay relevant for 25 years. For the next 10 years, I still want more upstream yet churn-resistant tech skills like c++.

–Below are some motivations not so “deep”

  • better home location (not size) — clean streets; shorter commute; reasonable school.. Eg Bayonne, JC
  • Still higher sense of security. Create more buffers in the form of more diversified passive incomes.

— Below are some secondary $-motivations

  • more time with kids? Outside top 10 motivations.
  • better (healthy) food? Usually I can find cheaper alternatives
  • workout classes? Usually not expensive

profilers for low-latency java

Most (simple) java profilers are based on jvm safepoints. At a safepoint, they can use the JVM API to query the JVM. Safepoint-based profiling is relatively easy to implement.

s-sync (Martin) is not based on safepoint.

Async-profiler targets openJDK/HotSpot, though some features are usable on the Zing JVM. It is a sampling profiler (OS-level perf events plus AsyncGetCallTrace), so it can’t really measure micro-latency with any precision.

perf is a Linux OS-level profiler, based on hardware performance counters and kernel sampling.

“Strategic” needs a re-definition #fitness

“Strategic” i.e. long-term planning/t-budgeting needs a re-definition. quant and c# were two wake-up calls that I tragically missed.

For a long time, the No.1 strategic t-expense was quant, then c#/c++QQ, then codingDrill (the current yellowJersey).

Throughout 2019, I considered workout time inferior to coding drill. Over weekends or evenings I often feel nothing-done even though I push myself to do “a bit of” yoga, workout, math-with-boy, or exp tracking.

Now I feel yoga and other fitness t-spend is arguably more strategic than tech muscle building. I say this even though fitness improvement may not last.

Fitness has arguably the biggest impact on brain health and career longevity

job losses across WallSt: 90% budget-related

In 99.9% of involuntary job losses (not due to disciplinary action), the victim loses his (or her) job due to more than one reason, despite the single “official” reason such as performance or budget/redundancy. The individual is selected after passing through several approvals. Several protections must fail before he can lose his job. Remember, Lord Voldemort had to lose all seven Horcruxes before he could be killed.

  1. protection: good performance — (with a vague criterion of “good”) This protection is effective to the extent that it helps the immediate manager look good.
  2. protection: threat of law suit for discriminatory hiring/firing
  3. protection: adequate budget is the biggest protection. When SIA suffers, every staff suddenly loses this big protection.
  4. protection: internal transfer opportunities
  5. protection: financial compensation — is a deterrent that protects perm staff. In contrast, ending a contractor is much easier. Not much approval.
  6. protection: a guardian angel. Immediate manager is one of your guardian angels, but there could be other powerful figures protecting you. They can veto the decision to get you laid off

I have seen many strong performers getting laid off, sometimes even without budget pressure. So good performance is not ironclad protection.

The pattern — managers put in lots of effort to identify, select and train a new hire. Sacking a person without a budget constraint (such as a department-wide downsize) is too visible and dramatic, even humiliating; it looks bad, is too harsh, and is too impactful on the victim and on team morale. I feel most managers are reluctant to do that.

  • They would rather pay a doughnut bonus and wait for the person to leave
  • They can also offer internal transfer to the individual, as in OC/BAML/Macq
  • They can also lower performance expectation on the individual and close one eye. As Kyle Stewart said, “As long as you put in effort”.

Some examples:

  • eg: KhorSiang of Zed

Exceptions prove the rule. Some managers are trigger-happy even without budget pressure — deMunk, Stephen of Macq

grouping 1-min interval stats into 5-min intervals #Deepak

Requirements based on a 5 Sep 2020 email from Deepak.

— Input: Each row in the files represents the trading activity in 1 stock over a 1 minute period. Open is the price of the first trade in that period. High is the price of the highest trade. Low is the price of the lowest trade. Close is the price of the last trade. Volume is the total number of shares traded in that period.

http://equitiessampledata.s3-website-us-east-1.amazonaws.com/compGenericOutput.tar.gz

The naming convention for files is <ticker>.txt over a number of days. The data is ordered by time and formatted in the following manner:

Date,Time,Open,High,Low,Close,Volume

Date values are recorded in the “mm/dd/yyyy” format. Time values are in the EST time zone in the “hh:mm” format.

All files include pre-market and after-market transactions.

Note that time values are for the beginning of the interval, and intervals with no trades are not included.

— requirements:

  1. Generate similarly organized files containing 5 minute interval data with the same data format, appropriately aggregating each column. Note that similar to the 1 minute input interval data, these are non-overlapping, nonempty, and should be “every 5 minutes on the 5 minutes.”
  2. Include option to exclude off market transactions from the output. Market is open 9:30am-4pm so all other times would be excluded.
  3. Include option for intervals of other lengths.
  4. Include option to include empty intervals in the output.

Extra credit:

  1. Add a Return column. The Return is the percentage change from the Close price of the most recent previous interval to the Close price of the current interval. For example, if the previous close is $10.00 and the current close is $10.25, the return is 2.5% and should be expressed as 0.025.
  2. Create an output of “index” intervals that compute the Return of all tickers traded in that interval. This return can simply be an unweighted average of those returns. Don’t worry about Open, Close, High, Low, and Volume.

— Sample data

01/05/1998,10:11,48.584,48.653,48.584,48.653,32168.0
01/05/1998,10:12,48.722,48.722,48.722,48.722,12878.0
01/05/1998,10:13,48.758,48.758,48.689,48.689,67538.0


My solution is published in https://github.com/tiger40490/repo1/tree/py1/py/88miscIVQ.

  • coding tip: global variables like windowStart and currentOutputInterval are convenient in python, but every function that assigns to a global variable must declare “global thatVariable”
  • algorithm tip: two simultaneous loops — scanning the input records while generating the 5-min output intervals one by one. This two-simultaneous-loop pattern is a rather frequent challenge, and my solution here is a common one (see the sketch below).
  • #1 implementation pitfall: it is very easy to omit an input record. I relied on assertions like lineCnt == cumSize and size <= 5 to give much-needed assurance
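
For illustration only, here is a minimal C++ sketch of the core bucketing loop (the published solution above is in python, and handles dates, parsing, options and output formatting that are omitted here). The Bar struct, the aggregate() function and the minuteOfDay simplification are hypothetical names for this sketch, not part of the original solution.

    #include <algorithm>
    #include <vector>

    // Hypothetical 1-minute OHLCV record; minuteOfDay stands in for the Date,Time
    // fields (e.g. 10:11 -> 611). Date handling and CSV parsing are omitted.
    struct Bar {
        int    minuteOfDay;
        double open, high, low, close, volume;
    };

    // Aggregate 1-minute bars into non-overlapping windows aligned "on the 5 minutes".
    // One pass over the input: open a new output bar whenever the input crosses into
    // a new window, otherwise fold the input bar into the current output bar.
    std::vector<Bar> aggregate(const std::vector<Bar>& input, int windowLen = 5) {
        std::vector<Bar> output;
        for (const Bar& b : input) {
            int windowStart = (b.minuteOfDay / windowLen) * windowLen;
            if (output.empty() || output.back().minuteOfDay != windowStart) {
                Bar nb = b;                  // first trade of a new window
                nb.minuteOfDay = windowStart;
                output.push_back(nb);
            } else {
                Bar& cur = output.back();    // extend the current window
                cur.high    = std::max(cur.high, b.high);
                cur.low     = std::min(cur.low,  b.low);
                cur.close   = b.close;       // close = last trade seen in the window
                cur.volume += b.volume;
            }
        }
        return output;
    }

The off-market filter and the empty-interval option from the requirements would be layered on top of this loop; the Return column only needs each output bar’s close and the previous output bar’s close.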

git | reword historical commit msg

Warning — may not work if there’s a recent merge-in on your branch

Find the target commit and its immediate parent commit.

git rebase -i the.parent.commit

First commit in the list would be your target commit. Use ‘r’ for the target commit and don’t change other commits. You will land in vim to edit the original bad commit msg. Once you save and quit vim, the rebase will complete, usually without error.

Now you can reword subsequent commit messages.

c++low-^high-end job market prospect

As of 2019, c++ low-end jobs are becoming scarce but high-end jobs continue to show robust demand. I think you can see those jobs across many web2.0 companies.

Therefore, it appears that only high-end developers are needed. The way they select candidates is … QQ. I have just accumulated the minimum critical mass for self-sustained renewal.

In contrast, I continue to hold my position in high-end coreJava QQ interviews.

conclusions: mvea xp

Is there an oth risk? As with MSFM, my perception of the whole experience shapes my outlook and future decisions.

  • Not much positive feedback besides ‘providing new, different viewpoints’, but Josh doesn’t give positive feedback anyway
  • should be able to come back to MS unless very stringent requirement
  • Josh might remember Victor as more suitable for greenfield projects.
  • I think Josh likes me as a person and understands my priorities. I did give him 4W notice and he appreciated it.
  • I didn’t get the so-called “big picture” that Josh probably valued. Therefore I was unable to “support the floor” when the team was out. The last time I achieved that was in Macq.
  • work ethic — A few times I worked hard and made personal sacrifices. Josh noticed.
  • In the final month, I saw myself as fairly efficient to wrap up my final projects including the “serialization” listed below

Q: I was brought in as a seasoned c++ old hand. Did I live up to that image? Note I never promised to be an expert
A: I think my language knowledge (zbs, beyond QQ) was sound
A: my tool chain GTD knowledge was as limited as other old hands.

Q: was the mvea c++ codebase too big for me?
A: No, given that my projects were always localized, and there were a few old hands to help me out.

I had a few proud deliveries where I had some impetus to capture the momentum (camp out). Listed below. I think colleagues were impressed to some extent even though other people probably achieved more. Well, I don’t need to compare with those and feel belittled.

This analysis revealed that Josh is not easily impressed. Perhaps he has high standards, as he never praised anyone openly.

  • I identified two stateless calc engines in pspc. Seeing the elegant simplicity in the design, I quickly zoomed in, stepped over and documented the internal logic, and replicated it in a spreadsheet.
  • my pspc avg price sheet successfully replicated a prod “issue”, shedding light on a hitherto murky part of the codebase
  • I quickly figured out serialization as the root cause of the outage, probably within a day.
  • I had two brave attempts to introduce my QOT innovation
  • My 5.1 Brazil pspc project was the biggest config project to date. I single-handedly overcame many compilation (gm-install) and startup errors. In particular, I didn’t leave the project half-cooked, even though I had the right to do so.
  • I made small contributions to the python test set-up

##With time2kill..Come2 jobjob blog

  • for coding drill : go over
    • [o] t_algoClassicProb
    • [o] t_commonCodingQ22
    • t_algoQQ11
    • open questions
  • go over and possibly de-list
    1. [o] zoo category — need to clear them sooner or later
    2. [o] t_oq tags
    3. t_nonSticky tags
    4. [o] t_fuxi tags
    5. Draft blogposts
    6. [o] *tmp categories? Sometimes low-value
    7. remove obsolete tags and categories
  • Hygiene scan for blogposts with too many categories/tags to speed up future searches? Low value?
  • [o=good for open house]

## smart ptr: practical values ] design

This is a common interview question.

  • j4: reduce (if not eliminate) leak of shared resource. Resource can be memory or some database connection etc. You can forget to destruct or call resource->release().
  • j4: avoid uninitialized ptr. [[safeC++]] mentions a simple ptr wrapper for this purpose
  • j4: simplify the big-4 of MyClass by replacing a T* field with a smartPtr<T> field. My blogpost briefly described this technique, based on an article. I now think this technique is safer than I originally thought.

— automatic clean-up with RAII

With RAII, you instantiate a smart ptr instance on stack. The ctor would be custom made to acquire some resource. Then you can rely on the RAII dtor to clean up.

A raw ptr has no ctor/dtor because it is not a class.
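
A minimal sketch of that RAII idea, assuming a hypothetical C-style resource (DbConnection with acquireConn/releaseConn); std::unique_ptr with a custom deleter plays the role of the stack-based smart ptr:

    #include <cstdio>
    #include <memory>

    // Hypothetical C-style resource with explicit acquire/release calls.
    struct DbConnection { /* ... */ };
    DbConnection* acquireConn()                { std::puts("acquired"); return new DbConnection; }
    void          releaseConn(DbConnection* c) { std::puts("released"); delete c; }

    void doWork() {
        // The smart ptr lives on the stack; its dtor runs on every exit path,
        // including exceptions, so releaseConn() cannot be forgotten.
        std::unique_ptr<DbConnection, void (*)(DbConnection*)> conn(acquireConn(), releaseConn);
        // ... use conn.get() ...
    }   // RAII dtor calls releaseConn() here

    int main() { doWork(); }

The same unique_ptr also gives the exclusive ownership discussed below: copying is disabled, so only one owner can ever release the resource.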

— implement exclusive ownership, which is error-prone with raw ptr

Note shared ownership is rarely needed.

 

y smart ptr popular in c++IV

I think even if a hiring team doesn’t use smart ptr (for that matter, also TMP, threading, …), they would drill into this topic as it is a good probe on big4 and a lot of other core language features.

See the essential requirements on a smart ptr acting as a drop-in replacement for a raw ptr. Some of those items often come up in an interview discussion on smart ptr.

q[throw]: always(?)explicit in c++

In java, a Throwable object can come from some invisible code, perhaps in the jvm implementation code.

In c++ a catch clause supposedly “always” catches something explicitly thrown by a c++ code module, “never” something from a C library. Now I think the “always” and “never” are simply wrong, because there are real-life examples to prove otherwise. P 115 of [[c++coding standards]], written by top experts, also says operating systems can wrap low-level errors in exceptions.

However, https://stackoverflow.com/questions/28925878/do-c-standard-library-functions-which-are-included-in-c-throw-exception shows a corner case of undefined behavior in strcpy(). In such a case, the c++ compiler (for this C function) can do anything, including throwing an exception.

Note C functions like strcpy() don’t throw because C doesn’t support exceptions, but a c++ compiler is permitted to, and in some cases does, generate throwing code.
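
As a small, hedged illustration of a throw that my own module never writes: the std::out_of_range below is thrown inside the library implementation of vector::at(), not by any explicit throw in my code (this is the library case, not the OS-wrapping case mentioned above).

    #include <iostream>
    #include <stdexcept>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        try {
            // No 'throw' keyword anywhere in my module: at() does the bounds
            // check and throws std::out_of_range from inside the library.
            std::cout << v.at(99) << '\n';
        } catch (const std::exception& e) {
            std::cout << "caught: " << e.what() << '\n';
        }
        // strcpy() with a bad destination, in contrast, would be undefined
        // behavior -- no exception is guaranteed.
    }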

 

##command line c++toolchain: Never phased out

Consider the C++ build chain + dev tools on the command line. They never get phased out, never lose relevance, never become useless, at least till my 70s. New tools always, always keep the old features. In contrast, java and newer languages don’t need so many dev tools, and their tools are more likely to use a GUI.
— Top 5 examples of similar things (I don’t have good adjectives)
  • unix command line power tools
  • unix shell scripting for automation
  • C API: socket API, not the concepts
— secondary examples
  • C API: pthreads
  • C API: shared memory
  • concepts: TCP+UDP, http+cookies
Insight — unix/linux tradition is more stable and consistent. Windows tradition is more disruptive.

Note this post is more about churn (phased-out) and less about accumulation (growing depth)

FOLB: ManagingDirector ex-peers: brank envy

See also c++ interview rock stars

In many business teams (trading, stock analysts, portfolio managers, IBD deal makers), an MD is an individual contributor. In other departments, she could lead a small team, or a large team as in technology. I have ex-peers now occupying MD or director positions. I felt the FOLB, and FOLB is the focus of this blogpost.

Q: How does an MD job compare to a respected GP doctor, or my dad, or me in the following dimensions?

  • exclusive club based on net worth — I think Honglin was MD a few times. His assets are not 10 times higher than mine. See below
  • brbr — may not be better than mine. Basically, I don’t really need so much money or the exclusive club status. Remember Rajat Gupta
  • career longevity — likely inferior to mine
  • mobility — inferior to mine. As I said in “OC-effective: overrated”, some of these MD individuals may be unable to add value in another company.
  • carefree peaceful leisure — likely inferior to mine
  • stress from competition and replacement — remember Honglin’s words?

— OC-effective? These MD’s may not be more effective. See OC-effective: overrated

— exclub — I think my dad would say we don’t need to join those exclusive clubs

Fundamentally, I don’t need so much disposable income or net worth.

Since my middle school days, I have been brainwashed to admire achievers – top school students, top college graduates, PhD’s, professors, technopreneurs, managing directors… Those (within my age group) lacking such a badge of honor are perceived as substandard, inferior, unsuccessful.

This perception is deep and pervasive. Hard to escape. I can see my wife is still trapped. So am I.

Reminds me of the perceptions during the Cultural Revolution. I always wondered why intelligent people fell prey to such absurd propaganda and degraded themselves. Now I see I’m also trapped by a similar brain-washing machine.

力争上游,人往高处走 (strive for the top; people always move higher), as I tell my son…

— striking a balance: Be humble without self-degradation.

I have been practicing self-degradation for decades. Now I know I am a dependable father, husband, son, brother, colleague and friend to the people around me.

My mom and dad gave me a good body. I took care of it and I’m proud of myself as they are proud of me.

More importantly, my worth is not due to the people dependent on me or whom I help. I’m valuable in myself. Call it self-esteem.

finding1st is easier than optimizing

  • Problem Type: iterate all valid choices without duplicates — sounds harder than other types, but usually the search space is quite constrained and tractable
    • eg: regex
    • eg: multiple-word search in matrix
  • Problem Type: find best route/path/combo, possibly pruning large subtrees in the search space — often the hardest type
  • Problem Type: find first — usually easier

In each case, there are recurring patterns.

automation scripts for short-term GTD

Background — automation scripts have higher value in some areas than others

  1. portable GTD
    • Automation scripts often use bash, git, SQL, gradle, python/perl… similar to other portable GTD skills like instrumentation know-how
  2. long-term local (non-portable) GTD
  3. short-term local (non-portable) GTD

However, for now let’s focus on short-term local GTD. Automation scripts are controversial in this respect. They take up lots of time but offer some measurable payoff

  • they could consolidate localSys knowledge .. thick -> thin
  • They can serve as “executable-documentation”, verified on every execution.
  • They reduce errors and codify operational best practices.
  • They speed up repeated tasks. This last benefit is often overrated. In rare contexts, a task is so repetitive that we get tired and have to slow down.

— strength — Automation scripts are a competitive strength of mine, even though I’m not the strongest in this area.

— respect – Automation scripts often earn respect and start a virtuous cycle

— engagement honeymoon — I often experience rare engagement

— Absorbency — sometimes I get absorbed, though the actual value-add is questionable .. we need to keep in mind the 3 levels of value-add listed above.

FOLB: startup riches: statistically unlikely #ZhuJiang

In “wake up1day..left behind the pack”, one of the longest-running envies and FOLB pains is about tech start-ups. Zhu Jiang (ZJ) is the poster-boy story, but it is completely imaginary, without evidence. Yet, in my mind, the numbers are quite exaggerated:

  • the start-up is valued at around $1b (an easy number for this blogpost), i.e. somewhere between 100m and a few billion
  • ZJ would be one of the top 20 key employees and would receive some 2m of real cash, not stocks
  • the company grows for a few years and is acquired or otherwise dies a glorious death. Then ZJ would go on to another start-up.

Now I question every part of this glorified fantasy.

Even if your startup employer (eg Mdaq) emerges out of a thousand others, you the employee are unlikely to become a real-money millionaire. I think YiGe gave me some figures like a $200k cash-out.

— valuation — A unicorn start-up is as rare as a lottery winner.

U.S. is only slightly better than SG.

I tried Catcha and Zed but didn’t become a key employee.

Zed had big valuations; Catcha was approved for IPO and probably a unicorn.

I tried vibrasoft and EmpWorld as a key employee. There are millions of vibrasofts for one unicorn.

— company cash flow #cf hedge funds

A unicorn probably has total funding around 100m to burn. If you want to cash out 2m, you are drawing blood from a struggling young patient!

A struggling patient fighting for her life. Remember most startups are extremely vulnerable, facing constant threats of extinction, without an established stronghold in the market like a hedge fund has. Even a small hedge fund would make money from the first year 🙂

So if you get $2m, it’s probably all-stock-no-cash. The cash-out would be pushed out by years, until an acquisition or IPO.

Mdaq may become a unicorn but how likely is a top 20 key employee to receive $2m stocks? You have to join early and be very very lucky.

— key employee

How likely are you to become a top-20 key employee in a successful start-up of unicorn caliber, if you are not capable in other jobs?

Note the key employees include marketing, sales, product visionaries and the CFO, so you may need to be more than the lead architect to qualify.

Note if everyone takes 2m cash, then we are talking about 40m cash allocation from a pool of 100m. Investors would kill you.

I think when a start-up hits IPO or gets acquired, the founders probably have (5->20m-100m) in paper wealth. Since you are not a founder, 2m of paper wealth is rather high.

— failure rate

Many thriving tech start-ups fail eventually. I feel that’s the nature of the tech business — high churn. The cumulative profit is probably a small fraction of the peak valuation. Perhaps a few millions.

If you are able to cash out $2m before the down turn, then you are lucky.

Zeng Sheng felt decline (in valuation or popularity) is very common, even among the successful startups

Among the unicorns, a small minority get acquired as unicorns. The majority must survive through tech churn, make a living and prove their business model.

— another start-up ?

It’s never guaranteed. ZJ is in the networking domain, displaced out of the limelight by blockchain, cryptocurrency, AI/ML and cloud.

Your age is a growing concern by the time you try another start-up. This is an ageism bias.

[19]Aug16 weekend drill

Target: up to 10 problems in total including 5 new ones.

— fully tested:

  1. https://leetcode.com/problems/merge-two-binary-trees/submissions/
  2. https://bintanvictor.wordpress.com/wp-admin/post.php?post=33914&action=edit — offending section ] sorted arr
  3. https://leetcode.com/problems/maximum-length-of-repeated-subarray/
  4. https://bintanvictor.wordpress.com/wp-admin/post.php?post=33925&action=edit — Alien dictionary
  5. https://leetcode.com/problems/find-minimum-in-rotated-sorted-array-ii/
  6. https://github.com/tiger40490/repo1/blob/cpp1/cpp/lang_misc/add2BigBinary.cpp

— presumably solved

  1. https://leetcode.com/problems/path-sum-iii/
  2. https://leetcode.com/problems/convert-bst-to-greater-tree/
  3. https://leetcode.com/problems/diameter-of-binary-tree/
  4. https://bintanvictor.wordpress.com/wp-admin/post.php?post=33889&action=edit — count squares in matrix
  5. https://bintanvictor.wordpress.com/wp-admin/post.php?post=33938&action=edit — longest full path name
  6. https://leetcode.com/problems/k-closest-points-to-origin/ — O(N) by quick-select
  7. https://leetcode.com/problems/first-missing-positive

— pending

— reviewed

jGC heap: 2 unrelated advantages over malloc

Advantage 1: faster allocation, as explained in other blogposts

Advantage 2: the programmer can "carelessly" create a "local" Object in some method (call it method1), pass the object (by reference) into other methods and happily forget about freeing the memory.

In this extremely common set-up, the reference itself is a stack variable in method1, but the heapy thingy is "owned" by the GC.

In contrast, c/c++ requires some "owner" to free the heap memory, otherwise memory would leak. There’s also the risk of double-free. Therefore, we absolutely need clearly documented ownership.
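
A minimal C++ sketch of that "documented ownership" burden (Order, makeOrder and consume are hypothetical names): with a raw pointer a comment is the only record of who must free, whereas std::unique_ptr encodes the owner in the type and makes the hand-over visible.

    #include <memory>
    #include <string>
    #include <utility>

    struct Order { std::string id; };

    // Raw-pointer style: the comment is the only documentation of ownership.
    // Caller takes ownership and must call delete exactly once -- leak or
    // double-free if anyone gets it wrong.
    Order* makeOrderRaw(const std::string& id) { return new Order{id}; }

    // unique_ptr style: ownership is documented and enforced by the type;
    // whoever holds the unique_ptr last frees the Order automatically.
    std::unique_ptr<Order> makeOrder(const std::string& id) {
        return std::unique_ptr<Order>(new Order{id});
    }

    void consume(std::unique_ptr<Order> o) { /* freed when 'o' goes out of scope */ }

    int main() {
        auto o = makeOrder("ord-1");
        consume(std::move(o));   // ownership transfer is explicit at the call site
        // no manual free, no leak, no double-free
    }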

microwave WAN for low latency trading

RF (i.e. microwave) is gaining popularity for low latency WAN (not co-location) connectivity.

— reliability

RF is less reliable than fiber due to air interference and data loss etc.

— latency

RF wins on two counts: the radio path is shorter (closer to line-of-sight) than the fiber route, and microwave through air propagates at close to the speed of light in vacuum, whereas light in fiber travels at roughly two-thirds of that speed.

— bandwidth

RF (i.e. microwave) about 10 Mbps

fiber about 100 Gbps.

I think an RF trading network has to share the radio spectrum with other users such as cellular networks. I guess a WAN cable can be dedicated.

— example:

RF links between colocation sites

  • Mahwah (NYSE)
  • Carteret (NSDQ)
  • Secaucus (Bats etc)
  • some Chicago sites.

I think this link is needed because of cross-exchange trading opportunities, which are latency-critical. Market data received at one exchange could trigger a trading decision on the same exchange or on another exchange. For the latter, some signal has to travel the distance between the two colocation sites. Even if CME offers a data center at Mahwah, the signal still needs to travel from the CME matching engine to the trader’s system and then to the NYSE matching engine.

max-sum non-adjacent subSequence: signedIntArr#70%

Q: given a signed int array a[], find the best sub-sequence sum, where adjacent elements of a[] must not both be chosen. O(N) time and O(1) space.

====analysis

— O(1) idea 2:

Two simple variables are needed: m_1 is m[cur-1] (the best sum over a[0..cur-1]) and m_2 is m[cur-2]. At each position, compare a[cur] + m_2 vs m_1 and take the larger as the new m[cur].

https://github.com/tiger40490/repo1/blob/py1/py/algo_arr/nonAdjSeqSum.py is self-tested DP solution.
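
For reference, a hedged C++ sketch of the same O(1)-space recurrence (the python file above is the self-tested version; maxNonAdjSum is a hypothetical name). It assumes the empty subsequence is allowed, so an all-negative input returns 0; dropping that assumption only changes the initial values.

    #include <algorithm>
    #include <cassert>
    #include <vector>

    // Max sum of a subsequence in which no two adjacent elements are chosen.
    // m1 == best sum over a[0..i-1]; m2 == best sum over a[0..i-2].
    // Recurrence: best(i) = max(m1, m2 + a[i]).
    long long maxNonAdjSum(const std::vector<int>& a) {
        long long m1 = 0, m2 = 0;          // empty subsequence allowed -> start at 0
        for (int x : a) {
            long long cur = std::max(m1, m2 + x);
            m2 = m1;
            m1 = cur;
        }
        return m1;
    }

    int main() {
        assert(maxNonAdjSum({3, 2, 7, 10}) == 13);            // 3 + 10
        assert(maxNonAdjSum({-5, -1, -8}) == 0);              // take nothing
        assert(maxNonAdjSum({5, 5, 10, 100, 10, 5}) == 110);  // 5 + 100 + 5
    }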

IC design wage: much lower than SG finIT+westCoast

The reality is, some sectors have so much money that even the mediocre guys in those domains get paid so much higher than the top performers in IC design.

eg: web2.0 shops have so much money, and their main cost is salary. Once you pass their interviews, even a mediocre performance would earn you much more than in any other tech sector

eg: For a commercial bank like OCBC to compete for trading developers, it has to pay higher than it pays commercial banking tech developers.

On one hand I feel blessed, but on the other hand it’s crucial to avoid holding on too tight. Try gratitude with detachment.

— I single out IC design as it was, for decades, one of the most coveted job titles in the entire hardware industry (or the technology industry), at the very center of the digital revolution and the knowledge industry, a “poster child” profession with:

  • moat — deep knowledge? It looks like west coast coding IV requires a similar amount of knowledge.
  • moat — work experience is extremely hard to get? It looks like there are plenty of IC designers and trading engine developers with sufficient experience, and the west coast doesn’t need any work experience, just motivation and IQ
  • extremely upstream? doesn’t mean much in terms of salary
  • explosive growth? yes in chip production but the amount of custom design has reduced after industry standardization.
    • Similarly, client-side software dev simplified significantly after browser became the standard client platform.

— IC designer salary in SG

https://www.glassdoor.sg/Salaries/singapore-ic-design-engineer-salary-SRCH_IL.0,9_IM1123_KO10,28.htm shows about S$50k/Y base but I know ibanks in SG pay about S$120-150k base

Even a senior/staff IC designer is only up to S$80k

Principal IC designer — S$140-160k. Might be a 1st percentile among IC designers in SG

— IC designer (senior) in Shanghai

up to USD 100k according to Zhang Jiong, pretty much identical to Ying Hui at RMB 60k/M

 

IC design wage, unexpected explanations #le2Jiong

Thanks Zhang Jiong for sharing your insight. For many years IC design was my dream job.

https://www.glassdoor.sg/Salaries/singapore-ic-design-engineer-salary-SRCH_IL.0,9_IM1123_KO10,28.htm shows SGD 48k annual base salary as the average

  • across Singapore,
  • regardless of seniority or how much experience,
  • “very high confidence” estimate

This figure is surprisingly low. I guess many data points in this sample are recent immigrants from India or China or South-East Asia. These individuals are often satisfied with lower salaries.

https://www.glassdoor.sg/Monthly-Pay/Broadcom-IC-Design-Engineer-US-Monthly-Pay-EJI_IE6926.0,8_KO9,27_IL.28,30_IN1.htm?experienceLevel=SEVEN_TO_NINE shows USD $117k annual base salary for

  • Broadcom U.S. IC designers with 7-9 years experience
  • high confidence estimate

I think this figure is on the lower half of your estimate of 100k – 150k, because your “sample” might consist of more successful, more competent engineers. Possible?

I completed a Master’s degree program in financial mathematics, with a lot of training on statistics. So I know a thing or two about sampling bias 🙂

A summary of the factors affecting IC designer salary:

  • supply — pool of qualified, experienced IC designers is growing so the skill is not so rare.
  • demand — many U.S. employers are driven to downsize due to China competition
  • demand — the IC design industry has reached maturity and possibly the downward part of the curve (a sunset industry?), because product differentiation can be achieved using industry-standard modules, without custom designs such as ASICs.
  • demand — demand is fundamentally affected by company profit. Overall profit across IC design companies is possibly declining because Chinese competitors can undercut everyone else due to cheaper labor and export subsidies.

FB 2019 ibt Indeed onsite

  • coding rounds — not as hard as the 2011 FB interview — regex problem
    • Eric gave positive feedback to confirm my success but perhaps other candidates did even better
    • No miscommunication as happened in the VersionedDict.
    • The 2nd Indeed interviewer failed me even though I “completed” the problem. The pregnant interviewer may follow suit.
  • data structures were fundamental to all the problems today.
  • SDI — was still my weakness but I think I did slightly better this time
  • Career round — played to my advantage as I have multiple real war stories. I didn’t prepare much. The stories just came out raw and hopefully authentic
  • the mini coding rounds — played to my advantage as I reacted fast, thanks to python, my leetcode practice …

So overall I feel I’m getting much closer to passing. Now I feel one interview is all it takes to enter a new world:

* higher salary than HFT
* possibly more green field
* possibly more time to research, not as pressed as in ibanks

sorting collection@ chars||small Integers

Given a single (possibly long) string, sorting its characters should never need O(N logN)!

Whenever an algo problem requires sorting a bunch of single English letters, it’s always, always better to use counting sort: O(N) rather than O(N logN).

Similarly, in more than one problem, we are given a bunch of integers bound within a limited range. For example, the array index values could be presented to us as a collection and we may want to sort them. Sorting such integers should always, always be counting sort.

This is an immediate win.

— How about sorting a bunch of studentID or SSN?

Range-bound, possibly (though unlikely) sparse integers. A studentID is essentially an array subscript.

When the sample size N grows, as in big-O analysis, N eventually exceeds the constant range R, so counting sort is O(N). What about initializing the R buckets for the R possible values? That costs O(R), which is a constant independent of N; and if R is a compile-time constant, no dynamic allocation is required either.
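
A minimal C++ sketch of the counting-sort idea for lowercase letters (R = 26); countingSort is a hypothetical name, and the same pattern extends to any small, known value range such as array indices.

    #include <array>
    #include <cassert>
    #include <string>

    // Counting sort for a string of lowercase letters: O(N + R) time, R = 26.
    // The R-sized count array is a fixed, compile-time cost independent of N.
    std::string countingSort(const std::string& s) {
        std::array<int, 26> count{};            // zero-initialized buckets
        for (char c : s) ++count[c - 'a'];      // tally each letter

        std::string sorted;
        sorted.reserve(s.size());
        for (int v = 0; v < 26; ++v)
            sorted.append(count[v], static_cast<char>('a' + v));
        return sorted;
    }

    int main() {
        assert(countingSort("edcba") == "abcde");
        assert(countingSort("banana") == "aaabnn");
    }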