egrep: highlight pattern]file !!filtering !!scrolling

https://github.com/tiger40490/repo1/blob/bash/bash/send.sh shows:

cat some.txt | egrep --color "40490.$rand|"

Note the trailing pipe inside the quoted regex adds an empty alternative that matches every line, so it gets rid of filtering 🙂

— other solutions:

less +/pattern # not ideal as it scrolls

dev career]U.S.: fish out of water in SG

Nowadays I feel in demand 1) on Wall St and 2) with the web2.0 shops. I also feel welcomed by 3) the U.S. startups. In Singapore, this feeling of being in demand was sadly missing. Even the bank hiring managers considered me a bit too old.

Singapore banks only have perm jobs for me, which feel unsuitable, unattractive and stressful.

In every Singapore bank where I worked, I felt visibly old and left behind. Guys my age were all managers… painful.

doubly-linked list as AuxDS for BST

I find it a very natural auxDS. Every time the BST gets an insert/delete, this list can be updated easily.

Q: How about the self-adjustment after an insert or delete?
%%A: I think this list is unaffected, since rotations rearrange parent/child links but preserve the in-order sequence

With this list, in-order walk becomes really easy.

https://leetcode.com/problems/kth-smallest-element-in-a-bst/solution/ is the first place I saw this simple technique.
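
A minimal sketch of the idea, covering inserts only and resting on my own assumptions: the Node layout, the class name ListedBst and the head_ pointer are hypothetical, and delete plus rebalancing are left out. A new node always starts as a leaf, so it lands immediately before its parent in the list (left child) or immediately after it (right child).

```cpp
#include <cassert>
#include <vector>

// Hypothetical node: the usual BST links plus prev/next list links.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* prev = nullptr;  // in-order predecessor
    Node* next = nullptr;  // in-order successor
    explicit Node(int k) : key(k) {}
};

class ListedBst {
public:
    ListedBst() = default;
    ListedBst(const ListedBst&) = delete;
    ListedBst& operator=(const ListedBst&) = delete;
    ~ListedBst() {
        for (Node* n = head_; n != nullptr;) {  // the list reaches every node
            Node* doomed = n;
            n = n->next;
            delete doomed;
        }
    }
    void insert(int key) {
        Node* fresh = new Node(key);
        if (root_ == nullptr) { root_ = head_ = fresh; return; }
        for (Node* p = root_;;) {
            if (key < p->key) {
                if (p->left == nullptr) {   // new left leaf sits just before p
                    p->left = fresh;
                    fresh->prev = p->prev;
                    fresh->next = p;
                    if (p->prev) p->prev->next = fresh; else head_ = fresh;
                    p->prev = fresh;
                    return;
                }
                p = p->left;
            } else {
                if (p->right == nullptr) {  // new right leaf sits just after p
                    p->right = fresh;
                    fresh->next = p->next;
                    fresh->prev = p;
                    if (p->next) p->next->prev = fresh;
                    p->next = fresh;
                    return;
                }
                p = p->right;
            }
        }
    }
    std::vector<int> inOrder() const {      // no recursion, no stack
        std::vector<int> out;
        for (const Node* n = head_; n != nullptr; n = n->next) out.push_back(n->key);
        return out;
    }
private:
    Node* root_ = nullptr;
    Node* head_ = nullptr;  // node with the smallest key
};
```

With the list in place, inOrder() is a plain linked-list walk, and the k-th smallest element is k hops from head_.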

ask`lower base2reduce mgr expectation

This plan didn’t work out at Macq. Expectation was still too high.

The logic is, if my coworkers get total comp 200k and I ask only 160k, then I’m more likely to get some bonus. Even if I underperform them, I would still hit somewhere below 200k.

Now I think if I qualify to stay, then there will be some bonus even if my base is, say 190k. Hiring managers would not agree to a 200k base and run the risk of paying a doughnut (zero) bonus to a qualified employee.

j8 MethodRef #cheatsheet

Need to develop more low-level insights as QQ…

  • Out of the four kinds of method refs, only AA) static-method and BB) specific-instance kinds are popular. The other two types are obscure.
  • q[ :: ] is used in all four kinds
  • I think the static-method kind is most readable and most intuitive. The javadoc tutorial features a BB example that should be implemented as static method IMHO.
  • a lambda expression has parameter types. A method ref has none and must be converted to a lambda expression. From where does the compiler infer the parameter types? I think they are inferred from the calling context (the target functional interface).

std::vector capacity reduction

Q: how would vector’s backing array reduce in size? In other words, how would the capacity ever reduce?
%%A: The only hope is shrink_to_fit(), which is a non-binding request to the library implementation (not the compiler). The implementation may ignore it.

If capacity reduction does happen at runtime, then reallocation would probably happen.

I believe resize() assign() clear() etc will never reduce vector capacity.
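
A small experiment sketch to check these beliefs on a given toolchain. The function names are mine. The third function uses the classic copy-and-swap idiom, which is not mentioned above; I add it as another way to drop the spare capacity. The standard does not promise an exact capacity for the copy, though mainstream implementations allocate just enough for the live elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::size_t capacityAfterClear() {
    std::vector<int> v(1000, 7);
    v.clear();                    // size drops to 0, capacity is kept
    return v.capacity();
}

std::size_t capacityAfterShrinkRequest() {
    std::vector<int> v(1000, 7);
    v.resize(10);                 // shrinking resize keeps the capacity too
    v.shrink_to_fit();            // non-binding request, may be ignored
    return v.capacity();
}

std::size_t capacityAfterSwapTrick() {
    std::vector<int> v(1000, 7);
    v.resize(10);
    std::vector<int>(v).swap(v);  // copy the 10 live elements, then swap buffers
    return v.capacity();
}
```

The only portable guarantee is capacity() >= size(), so I would treat any exact number printed by these functions as implementation-specific.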

PIP@Macq: tough judge@%%design

If I were the judge, then Kevin’s solution may get rejected or rated mediocre.

I think the judgement can be unreasonably tough when the judge herself is a practitioner — consider Yang and Sundip Jangi.

On the other hand,

  • Yang liked my OO design in EOS
  • Sundip liked my personalization design

The outcome (PIP etc) doesn’t mean my work (i.e. output) is sub-standard. The outcome has many reasons and causes.

I need to be fair and impartial to myself. [[learned optimism]] uses the three P’s. One of them is Personal.

technology xyz is dead

I often hear people say “technology XX is dead” .. exaggerated attention-grabbing gimmick, common on social media user posts.

I like the “dead”, old-fashioned, time-honored, stable technologies that are robust (not only resilient) against churn.

The alternative technologies are use-less, worth-less, hope-less, less proven, low-value, more likely to die young or even dead-on-arrival

STL map[k]=.. Always uses assignment@runtime

The focus is on the value2, of type Trade.

-- someMap[key1] = value2; //uses default-ctor if needed. Then Unconditionally invokes Trade::operator=() for assignment! See P344 [[Josuttis]]

If Trade lacks a default-ctor or operator=, this code probably won’t compile.

-- someMap.emplace(key1, value2); //uses some ctor, probably the copy-ctor, or the cv-ctor if value2 type is not Trade
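
A counting sketch to verify the claim. This Trade is a hypothetical stand-in with instrumented special members, and runMapDemo is my own name. The copy-ctor count for emplace() is what I expect from mainstream library implementations, not something I checked against the standard wording.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Trade {
    static int defaultCtors, copyCtors, assignments;
    int qty;
    Trade() : qty(0) { ++defaultCtors; }
    explicit Trade(int q) : qty(q) {}
    Trade(const Trade& other) : qty(other.qty) { ++copyCtors; }
    Trade& operator=(const Trade& other) {
        qty = other.qty;
        ++assignments;
        return *this;
    }
};
int Trade::defaultCtors = 0;
int Trade::copyCtors = 0;
int Trade::assignments = 0;

void runMapDemo() {
    Trade::defaultCtors = Trade::copyCtors = Trade::assignments = 0;
    std::map<std::string, Trade> someMap;
    const Trade value2(100);
    someMap["key1"] = value2;         // miss: default-ctor, then operator=
    someMap["key1"] = value2;         // hit: operator= only
    someMap.emplace("key2", value2);  // constructs in place, no assignment
}
```

After runMapDemo() I would expect one default construction and two assignments from the two operator[] lines, and no assignment from emplace().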

specialize class- but !!function- templates

A fundamental TMP technique is class template specialization.

Class templates’ flexibility can only be achieved via specialization but function templates’ flexibility can be achieved via overload ! Overload is much simpler than specialization.

In [[c++coding standard]], Alexandrescu/Sutter said “Don’t specialize func templates”, for multiple reasons.

After remembering this sound bite, it’s probably important to remember one of the reasons, for IV (halo) and zbs.
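
One of the reasons, as I understand it: explicit specializations do not take part in overload resolution. Only the base templates compete, and a specialization is consulted only after its base template has won. The sketch below follows the well-known three-declaration example. The names pick and choose are mine.

```cpp
#include <cassert>

template <class T> int pick(T)  { return 1; }  // (a) base template
template <> int pick<>(int*)    { return 2; }  // (b) explicit specialization of (a)
template <class T> int pick(T*) { return 3; }  // (c) second base template, overloads (a)

int pickForIntPointer() {
    int* p = nullptr;
    return pick(p);  // only (a) and (c) compete; (c) is more specialized and wins
}

// The overload-based alternative: a plain function always takes part.
template <class T> int choose(T)  { return 1; }
template <class T> int choose(T*) { return 3; }
inline int choose(int*)           { return 2; }

int chooseForIntPointer() {
    int* p = nullptr;
    return choose(p);  // the non-template exact match is preferred
}
```

So the specialization (b), which looks like the perfect match for int*, is silently bypassed, whereas the plain overload is picked as expected.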

real effort but below-bar@@ only Macq

Macq is probably the only job where I focused on localSys GTD but still fell below the bar.

The PIP cast a long shadow and left a deep scar. Am still recovering. This scar is deeper than Stirt …

Remember the abused children, who grew up traumatized? I was not traumatized as a kid. My traumatic experience is still /devastating/ but I can handle it. I have the maturity to handle it.

Adults are often traumatized by failed marriage, in-law conflicts,..

pre-order: Polish notation #any binTree

  • Pre-order walk of an expression tree produces the expression in Polish (prefix) notation
  • Post-order walk of an expression tree produces the expression in reverse-Polish (postfix) notation

https://en.wikipedia.org/wiki/Tree_traversal#Uses has a concise example.

This usage is fairly popular in my past coding interviews.

In fact, the wikipedia article goes on to say that pre/post order walk can create a “representation  of a binary tree” … see my blogposts for details.
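
A minimal sketch for the expression (1 + 2) * 3. The Expr struct and the helper names are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Expr {
    std::string token;                  // operator or operand
    std::unique_ptr<Expr> left, right;  // both null for an operand
};

std::unique_ptr<Expr> node(std::string token,
                           std::unique_ptr<Expr> l = nullptr,
                           std::unique_ptr<Expr> r = nullptr) {
    std::unique_ptr<Expr> e(new Expr);
    e->token = std::move(token);
    e->left = std::move(l);
    e->right = std::move(r);
    return e;
}

void preOrder(const Expr* e, std::string& out) {
    if (e == nullptr) return;
    out += e->token + ' ';              // visit the node first
    preOrder(e->left.get(), out);
    preOrder(e->right.get(), out);
}

void postOrder(const Expr* e, std::string& out) {
    if (e == nullptr) return;
    postOrder(e->left.get(), out);
    postOrder(e->right.get(), out);
    out += e->token + ' ';              // visit the node last
}

std::string polish(const Expr& root) {
    std::string s;
    preOrder(&root, s);
    s.pop_back();                       // drop the trailing space
    return s;
}

std::string reversePolish(const Expr& root) {
    std::string s;
    postOrder(&root, s);
    s.pop_back();
    return s;
}

std::unique_ptr<Expr> sampleTree() {    // (1 + 2) * 3
    return node("*", node("+", node("1"), node("2")), node("3"));
}
```

For the sample tree, polish() gives "* + 1 2 3" and reversePolish() gives "1 2 + 3 *".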

premium salary to compensate for intrinsic motivation@@

I recall that at more than one juncture in my job hunting career, I felt overwhelmed by a premium offer and said in my head “if I accept this offer and earn 20% more than the standard rate, then the ensuing pride, self-image boost etc would surely create a wellspring of positive motivation.”

How naive….in hindsight. The real factors affecting my job satisfaction were usually unrelated to premium salary. See the spreadsheet about job satisfaction.

I think the premium salary didn’t improve my life chances. A satisfying job does.

stay]shape 4CIV+QQ till 50-55

I would say QQ remains my stronger arm. (I don’t need to care about HFT shops’ assessment of me.)

  • QQ benefits from thick->thin (and xRef) … one of the key competitive advantages I could develop through blogging and continuous refresh.
  • CIV also benefits from blogging and continuous practice … my competitive advantages. Remember David Okao’s question “what is your secret weapon?”

Note High-end CIV is only needed at top west-coast shops. I think most of the top performers are young but I could stand out among my age group.

sharding:=horizontal partition #too many rows

Assuming an 88-column, 6mil-row table, “Horizontal” means horizontally pushing a knife across all 88 columns, splitting 6 million rows into two shards of 3 million rows each.

Sharding can work on noSQL too.

GS PWM positions table had 9 horizontal partitions for 9 regions.

“Partitioning” is a more generic term and can mean 1) horizontal (sharding) or 2) vertical cutting like normalization.

unnoticed gain{SG3jobs: 看破quantDev

All three jobs were java-lite , with some quantDev exposure. Through these jobs, I gained the crucial clarity about the bleak reality of the quantDev career direction. The clarity enabled me to take the bold decision to stop the brave but costly TSN attempts to secure a foothold. Foothold is simply too tough and futile.

Traditional QuantDev in derivative pricing is a shrinking job pool. Poor portability of skills without any standard set of interview topics.

At the same pay, I would now prefer the eq domain over the drv pricing domain, due to mkt depth and job pool.

QuantDev offers no contract roles !

Instead, I successfully established some c#/py/c++ trec. The c++ accu, though incomplete, was especially difficult and precious.

Without this progress, I would be lacking the confidence in py/c#/c++ professional dev that enabled me to work towards and achieve multiple job offers. I would still be stuck in the quantDev direction.

closestMatch in sorted-collection: j^python^c++

–java is cleanest. P236 (P183 for Set) [[java generics]] lists four methods belonging to the NavigableMap interface

  • ceilingEntry(key) — closest entry higher or equal
  • higherEntry(key) — closest entry strictly higher than key
  • lowerEntry
  • floorEntry
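
For the c++ side, a sketch of rough equivalents built on std::map::lower_bound() and upper_bound(). The free-function names simply mirror the java names and are my own; each returns end() where java would return null.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

using Book = std::map<int, std::string>;
using It = Book::const_iterator;

// closest entry higher or equal
It ceilingEntry(const Book& m, int key) { return m.lower_bound(key); }

// closest entry strictly higher
It higherEntry(const Book& m, int key) { return m.upper_bound(key); }

// closest entry lower or equal: step back from the first entry above key
It floorEntry(const Book& m, int key) {
    It it = m.upper_bound(key);
    return it == m.begin() ? m.end() : std::prev(it);
}

// closest entry strictly lower: step back from the first entry >= key
It lowerEntry(const Book& m, int key) {
    It it = m.lower_bound(key);
    return it == m.begin() ? m.end() : std::prev(it);
}
```

The two "step back" functions are where c++ is less clean than java, since the begin() check is easy to forget.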

both DEPTH+breadth expected of%%age: selective dig

Given my age, many interviewers expect me to demonstrate insight into many essential (not-obscure) topics such as lockfree (Ilya).

Interviewers expect a tough combination of breadth + some depth in a subset of those topics.

To my advantage I’m a low-level digger, and also a broad-based avid reader. The cross-reference in blog is esp. valuable.

Challenge for me — identify which subtopic to dig deeper, among the breadth of topics, given my limited absorbency and the distractions.

jvm footprint: classes can dominate objects

P56 of The official [[java platform performance]], written by SUN java dev team, has pie charts showing that

  • a typical Large server app can have about 20% of heap usage taken up by classes, rather than objects.
  • a typical small or medium client app usually has more RAM used by classes than data, with up to 66% of heap usage taken up by classes.

The same page also says it’s possible to reduce class footprint.

long-term value: QQ imt ECT speed

The alpha geeks — authors, experts, open source contributors … are they fast enough to win those coding contests?

The speed-coding contest winners … are they powerful, influential, innovative, creative, insightful? Their IQ? Not necessarily high, but they are nobody if not superfast.

The QQ knowledge is, by definition, not needed on projects, usually obscure, deep, theoretical or advanced technical knowledge. As such, QQ knowledge has some value, arguably more than the ECT speed.

Some say a self-respecting programmer needs some of this QQ expertise.

noexcept impact on RAII

If a function (esp. dtor) is declared noexcept, compiler can choose to omit stack-unwinding “scaffolding” around it.  Among other things, there’s a runtime performance gain. This gain is a real advantage of using noexcept instead of empty throw() which is deprecated in c++11.

Q: Is there any impact on RAII?
%%A: yes

Q: Can we even use RAII in such a context?
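%%A: I think we can. Destructors of locals still run on normal return and for exceptions caught inside the function. If an exception does escape the noexcept function, then std::terminate() runs, and the destructors may never run (whether the stack is unwound first is implementation-defined).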
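
A sketch of the safe case only: RAII inside a noexcept function where the exception is handled locally and never escapes. Guard and the function name are hypothetical. I deliberately do not demonstrate the escaping case, since that ends in std::terminate().

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <type_traits>

struct Guard {
    bool& released;
    ~Guard() { released = true; }  // destructors are noexcept by default
};

bool cleanupRanInsideNoexcept() noexcept {
    bool released = false;
    try {
        Guard g{released};
        throw std::runtime_error("handled locally");
    } catch (const std::exception&) {
        // g was destroyed during unwinding, before this handler ran
    }
    return released;
}

static_assert(noexcept(cleanupRanInsideNoexcept()), "declared noexcept");
static_assert(std::is_nothrow_destructible<Guard>::value, "dtor is noexcept");
```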

q[static thread_local ] in %%production code

static thread_local std::vector<std::vector<std::string>> allFills; // I have this in my CRAB codebase, running in production.

Justification — In this scenario, data is populated on every SOAP request, so keeping them as non-static data members is doable but considered pollutive.

How about static field? I used to think it’s thread-safe …

When thread_local is applied to a variable of block scope, the storage-class-specifier static is implied if it does not appear explicitly. In my code I make it explicit.
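
A tiny sketch of the per-thread behavior, using a hypothetical counter rather than my allFills vector. Each thread gets its own instance of the block-scope variable, so a fresh thread starts counting from zero no matter what the main thread has done.

```cpp
#include <cassert>
#include <thread>

int bumpPerThreadCounter() {
    static thread_local int calls = 0;  // one instance per thread
    return ++calls;
}

int countSeenByFreshThread() {
    int seen = 0;
    std::thread worker([&seen] {
        bumpPerThreadCounter();
        seen = bumpPerThreadCounter();  // second call on this thread
    });
    worker.join();
    return seen;
}
```

With a plain static (no thread_local) the counter would be shared across threads and would need a lock or an atomic.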

TCP_NODELAY to improve latency #unclear

https://www.extrahop.com/company/blog/2016/tcp-nodelay-nagle-quickack-best-practices/#3

https://stackoverflow.com/questions/3761276/when-should-i-use-tcp-nodelay-and-when-tcp-cork

The default Nagle’s algo helps in applications like telnet. However, it may increase latency when sending streaming data.

In the case of interactive applications or chatty protocols with a lot of handshakes such as SSL, Citrix and Telnet, Nagle’s algorithm can cause a drop in performance, whereas enabling TCP_NODELAY can improve the performance.

In such cases, disabling Nagle’s algorithm is a better option. Enabling the TCP_NODELAY option disables Nagle’s algorithm.
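
A minimal POSIX sketch of enabling the option on an already-created TCP socket. The two helper names are mine; setsockopt(), getsockopt(), IPPROTO_TCP and TCP_NODELAY are the standard socket API names.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Enabling TCP_NODELAY disables Nagle's algorithm on this socket.
bool disableNagle(int fd) {
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == 0;
}

// Reads the option back, mainly useful for troubleshooting.
bool isNagleDisabled(int fd) {
    int value = 0;
    socklen_t len = sizeof(value);
    return getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) == 0 && value != 0;
}
```

Typical usage is one disableNagle(fd) call right after socket() or accept(), before the first small write.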

 

temp object binding preferences: rvr,lvr.. #SCB

(Note I used “temp object” as a loose shorthand for “rval-object”.)

Based on https://www.codesynthesis.com/~boris/blog/2012/07/24/const-rvalue-references/

  • a const L-value reference … … can bind to a naturally-occurring rvalue object (or a robbed object after std::move)
  • a non-const r-value reference can bind to a naturally-occurring rvalue object
  • a const r-value reference (crvr) can bind to a naturally-occurring rvalue object but canNOT bind to an lvalue object

Q: so in the presence of all overloads, what kind of reference can naturally occurring temp objects bind to?
A: in order of preference, a non-const rvr, then a const rvr, then a const lvr. In particular, such an object prefers to bind to a const rvalue reference rather than a const lvalue reference. There are some obscure use cases for this binding preference.

More important is the fact that const lvr (param type of copy-ctor) can bind to temp object. [[c++primer]] P540 has a section describing that if you pass a temp into Foo ctor you may hit the copy-ctor:

Foo z(std::move(anotherFoo)) // compiles and runs fine even if move-ctor is unavailable. This is THE common scenario before c++11. No change.

Compiler doesn’t bother to synthesize the move-ctor, when copy-ctor is defined!
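
A sketch that lets the compiler show the preferences. Widget, whichRef and CopyOnly are hypothetical names. CopyOnly declares a copy-ctor, so no move-ctor is synthesized and std::move() quietly falls back to the copy-ctor.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Widget {};

std::string whichRef(Widget&)        { return "lvr"; }
std::string whichRef(const Widget&)  { return "const lvr"; }
std::string whichRef(Widget&&)       { return "rvr"; }
std::string whichRef(const Widget&&) { return "const rvr"; }

const Widget makeConstWidget() { return Widget(); }  // yields a const rvalue

struct CopyOnly {
    int copies = 0;
    CopyOnly() = default;
    CopyOnly(const CopyOnly& other) : copies(other.copies + 1) {}
    // copy-ctor is user-declared, so no implicit move-ctor exists
};

int copiesAfterMove() {
    CopyOnly anotherFoo;
    CopyOnly z(std::move(anotherFoo));  // binds to const CopyOnly&, i.e. copies
    return z.copies;
}
```

If the two rvalue-reference overloads were removed, every call would still compile by binding to the const lvr overload (or the plain lvr overload for a non-const lvalue).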

 

nonVirtual1() calling this->virt2() #templMethod

http://www.cs.technion.ac.il/users/yechiel/c++-faq/calling-virtuals-from-base.html has a simple sample code. Simple idea but there are complexities:

  • the given print() should never be used inside base class ctor/dtor. In general, I believe any virt2() like any virtual function behaves non-virtual in ctor/dtor.
  • superclass now depends on subclass. The FAQ author basically says this dependency is by-design. I believe this is template-method pattern.
  • pure-virtual is not strictly required here. virt2() can also be an ordinary virtual with a default implementation.
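
A sketch of the pattern, not the FAQ's own code: Shape, Circle, describe() and name() are hypothetical names, with describe() playing nonVirtual1() and name() playing virt2(). The constructor records what the virtual call resolves to during base-class construction.

```cpp
#include <cassert>
#include <string>

class Shape {
public:
    std::string nameSeenInCtor;

    Shape() : nameSeenInCtor(name()) {}  // resolves to Shape::name(), never the subclass
    virtual ~Shape() = default;

    // nonVirtual1(): fixed skeleton that delegates one step to the subclass
    std::string describe() const { return "shape: " + name(); }

protected:
    // virt2(): ordinary virtual with a default, so not pure
    virtual std::string name() const { return "generic"; }
};

class Circle : public Shape {
protected:
    std::string name() const override { return "circle"; }
};
```

For a Circle, describe() returns "shape: circle" while nameSeenInCtor stays "generic". If name() were pure virtual, the call inside the ctor would be undefined behavior.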

returning^throwing local object

Tag line – always catch by reference. [[moreEffC++]] has a chapter with this title.

  • If you return a local object by reference, the caller gets a dangling reference because the object is destroyed when the function returns. Using it is undefined behavior (often a run time crash). Compiler is likely to give a warning.
  • If you throw a local object and catch by reference up the call stack, it is actually considered best practice, because the thrown exception object is a separate copy (copy-initialized from the local object), so it outlives the local.
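
A sketch of the throwing half, with hypothetical function names. The local is gone by the time the handler runs, yet the handler sees a valid object whose dynamic type is intact because it was caught by reference.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

void throwLocal() {
    std::out_of_range local("local object");
    throw local;  // the exception object is a separate copy of `local`
}

std::string catchByReference() {
    try {
        throwLocal();
    } catch (const std::exception& e) {  // by reference: no slicing
        bool typeKept = dynamic_cast<const std::out_of_range*>(&e) != nullptr;
        return std::string(e.what()) + (typeKept ? " (out_of_range)" : " (sliced)");
    }
    return "no exception";
}
```

Catching by value as std::exception would slice the object down to the base class.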

xchg between register/memory triggers fencing

I guess xchg is used in some basic concurrency constructs, but I need to research more.

See my blogpost hardware mutex, based@XCHG instruction

https://c9x.me/x86/html/file_module_x86_id_328.html points out the implicit locking performed by CPU whenever one of the two operands is a memory location. The other operand must be a register.

This implicit locking is quite expensive according to https://stackoverflow.com/questions/50102342/how-does-xchg-work-in-intel-assembly-language, but presumably cheaper than user-level locking.

This implicit locking involves a memory fence.
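
A sketch of a spinlock built on an atomic exchange. XchgSpinLock is a hypothetical name. My assumption, not verified for every compiler: on x86 std::atomic<bool>::exchange() is typically compiled down to an xchg with a memory operand, which is what makes this the software face of the instruction above.

```cpp
#include <atomic>
#include <cassert>

class XchgSpinLock {
public:
    // Atomically writes true and returns whether the old value was false.
    bool tryLock() { return !locked_.exchange(true, std::memory_order_acquire); }

    void lock() {
        while (!tryLock()) {
            // spin until the holder calls unlock()
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```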

compile-time ^ run-time linking

https://en.wikipedia.org/wiki/Dynamic_linker describes the “magic” of linking *.so files with some a.out at runtime. This is less known and more “magical” than compile-time linking.

“Linking is often referred to as a process that is performed when the executable is compiled, while a dynamic linker is a special part of an operating system that loads external shared libraries into a running process”

I now think when the technical literature mentions linking or linker I need to ask “early linker or late linker?”

lower workload ⇏ quality free time #family

Lower workload CAN mean

… more time for kids + workout, but can also mean

… more time wasted .. burn/rot

The free time saved due to lower workload is often spent on reflective blogging… controversial

A great example of quality free time is the Bayonne -> MS commute in 2018/2019. For a few weeks I was able to do coding drill on commute, despite the segmented commute. For a few months I was doing /productive/ git-blogging

WallSt contract as fallback career plan: 2019summary

After talking to you and a few friends, now I feel Wall St contract job market is a comfortable, lucrative fallback career plan for me. (See [19] y WallSt_contract=my best Arena #Grandpa) The alternatives are:

  • ibank VP jobs — I know many people in these roles including our friend Youwei. Looks too stressful.
  • startups? No real experience but probably more stressful
  • web2.0 shops — (Goog, FB etc) I assume the expectation will be too high esp. for older guys like me
  • Singapore jobs — mostly as stressful as ibank VP jobs.

Hadoop apps: Is java preferred@@

I didn’t hear about any negative experience with any other languages, but I would assume yes, java is preferred and is the most proven choice. If you go with the most popular “combination” then you can find the thriving ecosystem, meaning the most resources online and the widest support tools.

–According to one website:

Hadoop itself is written in Java, with some C-written components. The Big Data solutions are scalable and can be created in any language that you prefer. Depending on your preferences, advantages, and disadvantages presented above, you can use any language you want.

Virtual_Machine^guest_OS

https://www.virtualbox.org/manual/ch01.html#virtintro explains that

Guest OS runs in a virtual machine or “vm”. A “vm”

  • usually refers to a container process if it’s “live”
  • More often, a vm means a vm-config i.e. a collection of parameters defining a physical container process to-be-started. It’s important to realize (windows host OS as example) a vm is strictly an application with a window, like a browser or shell. As such, this application has its own config data saved on disk.

 

reasons to limit tcost@SG job hunt #XR

XR said a few times that it is too time consuming each time to prepare for job interviews. The 3 or 4 months he spent has no long-term value. I immediately voiced my disagreement because I took IV fitness training as a lifelong mission, just like jogging or yoga or chin-up.

This view remains my fundamental perspective, but my disposable time is limited. If I can save the time and spend it on some meaningful endeavors  [1] then it’s better to have a shorter job hunt.

[1] Q: what endeavors?
A: yoga
A: diet
A: stocks? takes very little effort
A: ?

POSIX countingSemaphore ^ lock+condVar #Solaris docs

https://docs.oracle.com/cd/E19120-01/open.solaris/816-5137/sync-11157/index.html points out a lesser-known difference in the Solaris context:

Because semaphores need not be acquired and be released by the same thread, semaphores can be used for asynchronous event notification, such as in signal handlers (but presumably not interrupt handlers). And, because semaphores contain state, semaphores can be used asynchronously without acquiring a mutex lock as is required by condition variables. However, semaphores are not as efficient as mutex locks.

The same page also shows a POSIX countingSemaphore can be used for IPC or between threads.
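
A single-threaded sketch of "semaphores contain state", using the POSIX unnamed-semaphore calls (tested mentally against Linux only; the function name is mine). Two posts made before anyone waits are not lost, unlike a condition-variable signal with no waiter.

```cpp
#include <cassert>
#include <semaphore.h>

int pendingAfterTwoPosts() {
    sem_t events;
    // second argument 0: shared between threads of this process, not across processes
    if (sem_init(&events, 0, 0) != 0) return -1;

    sem_post(&events);  // no mutex needed, and nobody is waiting yet
    sem_post(&events);

    int consumed = 0;
    while (sem_trywait(&events) == 0) ++consumed;  // non-blocking drain

    sem_destroy(&events);
    return consumed;
}
```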

prefer ::at()over operator[]read`containers#UB

::at() throws exception … consistently 🙂

  • For (ordered or unordered) maps, I would prefer ::at() for reading, since operator[] silently inserts for lookup miss.
  • For vector, I would always favor vector::at() since operator[] has undefined behavior when index is beyond the end.
    1. worst outcome is getting trash without warning. I remember getting trash from an invalid STL iterator.
    2. better is consistent seg fault
    3. best is exception, since I can catch it
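
A sketch contrasting the two lookups. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

bool mapAtThrowsOnMiss(const std::map<std::string, int>& m, const std::string& key) {
    try {
        (void)m.at(key);  // at() also works on a const map, unlike operator[]
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Takes the map by value so the caller's copy stays untouched.
std::size_t sizeAfterBracketLookup(std::map<std::string, int> m, const std::string& key) {
    (void)m[key];         // lookup miss silently inserts a default value
    return m.size();
}

bool vectorAtThrows(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;      // v[index] here would be undefined behavior instead
    }
}
```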

 

## 9 c++realized $ROTI

I wanted c++ ROTI. After so many years of trial and error, I got two

  • G3 [18] The CVA $122/hr offer
  • G3 [18] SCB-FM S$210k offer, unthinkable in my Singapore job search.
    • In terms of base, This one is about $$190k. My “reasonable” target was S$150k and my “high” target was $170k.
  • G5 [18] SIG technical win
  • G5 [12] BNP forex prop trading contract offer
  • [19] MLP-sg java connectivity team actually has a small c++ requirement.
  • G9 overcame fear@large codebase]c++/j and emerged above most developers.

Machine Learning #notes

Machine Learning — can be thought of as a method of data analysis, but a method that can automate analytical model building. As such, this method can find hidden insights unknown to the data scientist. I think the AlphaGo Zero is an example .. https://en.wikipedia.org/wiki/AlphaGo_Zero

Training artificial intelligence without datasets derived from human experts is… valuable in practice because expert data is “often expensive, unreliable or simply unavailable.”

AlphaGo Zero’s neural network was trained using TensorFlow. The robot engaged in reinforcement learning, playing against itself until it could anticipate its own moves and how those moves would affect the game’s outcome

So the robot’s training is by playing against itself, not studying past games by other players.

The robot discovered many playing strategies that human players never thought of. In the first three days AlphaGo Zero played 4.9 million games against itself and learned more strategies than any human can.

In the game of GO, world’s strongest players are no longer humans. Strongest players are all robots. The strongest strategies humans have developed are easily beaten by these robots. Human players can watch these top (robot) players fight against each other, and try to understand why their strategies work.

I created a shared_ptr with a local object address..

In my trade busting project, I once created a local object, and used its address to construct a shared_ptr (under an alias like TradePtr).

Luckily, I hit consistent crashes. I think the reason is — shared_ptr likes heap objects. When my function returns, the shared_ptr tried to call delete on the raw ptr, which points at the local stack, leading to crash.

The proven solution — make_shared()
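
A sketch of the fix. This Trade struct and makeTrade() are hypothetical stand-ins for my project code, and the buggy pattern is kept only as a comment.

```cpp
#include <cassert>
#include <memory>

struct Trade {
    int qty;
    explicit Trade(int q) : qty(q) {}
};
using TradePtr = std::shared_ptr<Trade>;

// Buggy pattern, do not use:
//   Trade local(100);
//   TradePtr p(&local);  // the last owner will call delete on a stack address

TradePtr makeTrade(int qty) {
    return std::make_shared<Trade>(qty);  // object lives on the heap, owned by the ptr
}
```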

Q: passive income ⇒ reduce GTD pressure#positive stress

My (growing) Passive income does reduce cash flow pressure… but it has no effect so far on my work GTD pressure.

Q: Anything more effective more practical?

  1. take more frequent unpaid leaves, to exercise, blog or visit family
  2. expensive gym membership

How about a lower salary job (key: low caliber team)? No I still want some challenge some engagement, some uphill, some positive stress.

job market: SG still much slower than NY

See also my mail to Ellen. I must stop /romanticizing/ about the “improvement” in SG job market.

  • Singapore tech shops are mostly not keen about my profile. U.S.? Not sure.
  • Singapore fintech shops ? zero interest shown, even when I asked 150k
  • Singapore buy-sides are interested but way too selective and kinda slow.
  • Note except GS I didn’t try the ibank jobs this time round.

There’s oth risk because … I’m not so keen about SG job market.

 

 

unique_ptr implicit copy : only for rvr #auto_ptr

P 470-471 [[c++primer]] made it clear that

  • on a regular unique_ptr variable, explicit copy is a compilation error. Different from auto_ptr here.
  • However returning an unnamed temp unique_ptr (rvalue object) from a function is a standard idiom.
    • Factory returning a unique_ptr by value is the most standard idiom.
    • This is actually the scenario in my SCB-FM interview by the team architect

Underlying reason is what I have known for a long time: unique_ptr is move-only. What I didn’t know (well enough to impress the interviewer) is the implication for implicit copy. What looks like an implicit copy in a return statement is really an implicit move, and it is the most common usage of unique_ptr.
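
A sketch of the factory idiom. Order, makeOrder() and passThrough() are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Order {
    int id;
    explicit Order(int i) : id(i) {}
};

std::unique_ptr<Order> makeOrder(int id) {
    std::unique_ptr<Order> p(new Order(id));
    return p;  // looks like a copy, but a returned local is moved (or elided)
}

std::unique_ptr<Order> passThrough(std::unique_ptr<Order> p) {
    return p;  // a by-value parameter is also moved implicitly on return
}

// std::unique_ptr<Order> b = a;  // explicit copy of a named variable: compile error
```

Outside a return statement, transferring ownership from a named variable needs an explicit std::move().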

mgr role risk: age-unfriendly

Statistically, very few IT managers can maintain the income level beyond age 55.

I believe those managers in 30’s and 40’s are often more capable, more competitive and more ambitious.

Even if you are above average as a manager, the chance of rising up is statistically slim and you end up contending against the younger, hungrier, /up-and-coming/ rising stars.

array=#1 important data structure

C supports only array (horizontal) and struct (vertical). I feel most standard libraries across languages are designed based on the same. Array + graph are about the only data structures in those libraries and in CIV.

For cross-language coding drill, we should probably keep our focus on arrays.

For comp science algorithm research, there’s more energy focused on array than any other data structure.

performance domain is low-level. Array (among various data structures) is the real focus of micro tuning and hardware optimizations.

jvm^c++ as infrastructure

c/c++ is part of the infrastructure of many new technologies, and consequently will last for decades whereas java may not.

😦 This doesn’t mean there will be enough c++ jobs for me and my C++ friends.

  • JVM is an infrastructure for a relatively small number of new languages and new frameworks like spring, hadoop,.. However, the machine learning community seem to regard python and c++ as the mainstay.
  • Java (not JVM) serves as infrastructure in the new domains of MSA, cloud, big data etc, but not Machine Learning.

git | merge conflict #personal tips

I prefer cherry-pick.

Often the merge/rebase/cherry-pick/stash-pop operation would silently put some of the changes in _staging_, therefore "git diff" would fail to show them. I have to use "git diff HEAD" instead :(

–If you have no choice then here’s the standard procedure

  1. git rebase master
  2. you get a merge conflict and file1 now contains conflict markers <<<<<<< ======= >>>>>>>
  3. vi file1 # to remove the bad stuff
  4. git add file1 # on the unnamed branch
  5. # no commit needed
  6. git rebase --continue # will bring you back to your feature branch

 

overcoming exchange-FIX-session throughput limit

Some exchanges (CME?) limit each client to 30 orders per second. If we have a burst of orders to send, I can see two common solutions A) upstream queuing B) multiple sessions

  1. upstream queuing is a must in many contexts. I think this is similar to TCP flow control.
    • queuing in MOM? Possible but not the only choice
  2. an exchange can allow 100+ FIX sessions for one big client like a big ibank.
    • Note a big exchange operator like nsdq can have dozens of individual exchanges.
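
A sketch of the gate that an upstream queue would consult before releasing each order to one FIX session. Everything here is my own assumption: the class name OrderThrottle, the sliding-window policy, and the caller-supplied millisecond clock (injected to keep the sketch deterministic). A real exchange may count its 30 per second differently.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

class OrderThrottle {
public:
    OrderThrottle(std::size_t maxPerWindow, long windowMillis)
        : maxPerWindow_(maxPerWindow), windowMillis_(windowMillis) {}

    // True: the order may go out now. False: keep it queued upstream and retry later.
    bool trySend(long nowMillis) {
        // forget sends that have aged out of the window
        while (!sent_.empty() && nowMillis - sent_.front() >= windowMillis_) {
            sent_.pop_front();
        }
        if (sent_.size() >= maxPerWindow_) return false;
        sent_.push_back(nowMillis);
        return true;
    }

private:
    std::size_t maxPerWindow_;
    long windowMillis_;
    std::deque<long> sent_;  // timestamps of recent sends, oldest first
};
```

With solution B, I imagine one such throttle per session and a dispatcher picking any session whose trySend() succeeds.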

Q: is there any (sender self-discipline) flow control in intranet FIX?
A: not needed.

xx: weekendCIV^codingDrill^work

The amount of tech learning and zbs growth over weekend coding assignments is different from coding drill or work projects

  • intensity — highest
  • QQ halos — comparable to my language feature experiments — improves QQ
  • scale — larger than coding drill but not too large like work projects
  • BP — is a major learning focus and evaluation criteria
  • absorbency — highest
  • sustained focus — lasting over a few days, pretty rare for me

dummy param in c++func

I used to name unused parameters of myfunction as “dummy”.

  • now I also use ‘_’. I think this also works in python
  • now I also use nothing as in void fn(T*)

Not always better than ‘dummy’ but at least this kind of compiler knowledge is useful in troubleshooting.

I don’t name the parameter “unused” as it is a confusing name.

prefer std::deque when you need stack/queue

  • std::stack is unnecessarily troublesome for dumping and troubleshooting … see https://github.com/tiger40490/repo1/blob/cpp1/cpp/2d/maxRectangle.cpp
  • std::queue has a similar restriction.

Philosophically, std::stack and std::queue are container adapters designed to deny random access, often for safety. They are often based on an underlying “crippled deque”. As soon as you need to dump the container content, you want a full-fledged deque.
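
A sketch of a std::deque used directly as a stack, with the dump that std::stack would not allow without popping. The dump() helper is a hypothetical name.

```cpp
#include <cassert>
#include <deque>
#include <sstream>
#include <string>

// Prints bottom-to-top without disturbing the content.
std::string dump(const std::deque<int>& stack) {
    std::ostringstream out;
    for (int item : stack) out << item << ' ';
    return out.str();
}

std::deque<int> sampleStack() {
    std::deque<int> stack;
    stack.push_back(1);  // push
    stack.push_back(2);
    stack.push_back(3);
    stack.pop_back();    // pop
    return stack;        // back() is the top
}
```

For queue usage the same deque works with push_back() and pop_front().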

frequency@(hardware)timer interrupts #KHz

In the 2003 edition of [[LinuxKernel]] P195, 1200 interrupts/second was the highest frequency, above 1KHz.

The CPU must process these timer interrupts. If it were easy to “nullify” these interrupts and eliminate the CPU overhead, then we would have millions of timer interrupts per second but I doubt it. The overhead is probably unavoidable.

Scheduler is a kernel component. I think scheduler divides cpu time into slots. I guess the smallest quantum is limited by this frequency, among other factors.

P350 of The same book says that scheduler runs upon timer interrupts + keyboard interrupts.

Note many people get confused by software timer vs hardware timer. Hardware timer is my focus.

Many people get confused by software interrupt vs hardware interrupts. I have a separate blogpost (interrupts=hardware interrupts #by default), but basically most interrupts are hardware interrupts. Software interrupts are simulated.

 

c++^java..how relevant ] 20Y@@

See [17] j^c++^c# churn/stability…

C++ has survived more than one wave of technology churn. It has lost market share time and time again, but hasn’t /bowed out/. I feel SQL, Unix and shell-scripting are similar survivors.

C++ is by far the most difficult language to use and learn. (You can learn it in 6 months but likely very superficially.) Yet many companies still pick it instead of java, python, ruby, which is a sign of strength.

C is low-level. C++ usage can be equally low-level, but c++ is more complicated than C.

CAS cpu-instruction takes 3 inputs

A CVA interviewer asked me to explain the cmpxchg cpu-instruction. I now believe it COMPARES two values (expected^current) and IIF matched, updates the memory location to a “newValue”.

Out of these 3 “inputs”, only the expected and newValue are inputs to the function. The 3rd item “current” is NOT an input parameter to the function, but discovered in the hardware.

See P1018 [[the c++StdLib]] by Josuttis
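
A sketch using std::atomic<int>::compare_exchange_strong(), the library-level face of CAS. CasResult and tryCas() are hypothetical names. Note that on failure the library writes the discovered "current" value back into the expected variable.

```cpp
#include <atomic>
#include <cassert>

struct CasResult {
    bool swapped;
    int expectedAfter;  // on failure, overwritten with the current value
    int stored;         // what the memory location holds afterwards
};

CasResult tryCas(std::atomic<int>& cell, int expected, int newValue) {
    // Only expected and newValue are passed in; "current" is read from cell itself.
    bool ok = cell.compare_exchange_strong(expected, newValue);
    return CasResult{ok, expected, cell.load()};
}
```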

convert a recursive algo to iterative #inOrderWalk

Suppose you have just one function being called recursively. (2-function scenario is similar.) Say it has 5 parameters. Create a struct named FRAME (having 5 fields + possibly a field for lineNo/instructionPointer.)

Maintain a stack holding the Frame instances. Each time the recursive algorithm adds to the call stack, we add to our stack too.

Wiki page on inorder tree walk  has very concise recursive/iterative algos. https://github.com/tiger40490/repo1/blob/py1/py/tree/iterative_InOrderWalk.py is my own attempt that’s not so simple. Some lessons:

  • Differentiate between popping vs peeking the top.
  • For a given node, popping and printing generally happen at different times without any clear pattern.
    • the sequence of pop() is probably a pre-order tree walk
    • the sequence of print is an in-order tree walk
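
A sketch of the Frame technique applied to in-order walk, in c++ rather than the python of my own attempt, and not a port of it. TreeNode, Frame and the stage numbering are hypothetical; stage plays the role of the lineNo/instructionPointer field, and a recursive in-order function has only one parameter, so the Frame has one data field.

```cpp
#include <cassert>
#include <vector>

struct TreeNode {
    int val;
    TreeNode* left;
    TreeNode* right;
};

struct Frame {
    const TreeNode* node;  // the single parameter of the recursive function
    int stage;             // 0: not started, 1: left subtree done, 2: right subtree done
};

std::vector<int> inOrderIterative(const TreeNode* root) {
    std::vector<int> printed;
    std::vector<Frame> stack;  // our own call stack
    if (root != nullptr) stack.push_back(Frame{root, 0});
    while (!stack.empty()) {
        Frame& top = stack.back();  // peek, do not pop yet
        if (top.stage == 0) {       // "call" into the left subtree
            top.stage = 1;
            if (top.node->left) stack.push_back(Frame{top.node->left, 0});
        } else if (top.stage == 1) {  // back from left: print, then go right
            printed.push_back(top.node->val);
            top.stage = 2;
            if (top.node->right) stack.push_back(Frame{top.node->right, 0});
        } else {
            stack.pop_back();       // "return" from this frame
        }
    }
    return printed;
}
```

In this particular sketch a frame is printed at stage 1 but popped only at stage 2, which shows how popping and printing can happen at different times.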

reactive java #learning notes

After a 5-minute glance, I feel this is yet another (one of a group) jxee add-on package. Not sure about its shelf-life.

There are many jargon terms in, or related to, this concept. Presumably too many (and intimidating) to a new comer.

https://spring.io/blog/2016/06/07/notes-on-reactive-programming-part-i-the-reactive-landscape is a 2016 Spring article.

https://dzone.com/articles/rxjava-part-1-a-quick-introduction is a 2016 tutorial to RxJava library.

bitwise coding questions: uncommon

Don’t over-invest.

Q: How prevalent are these questions in coding interviews?

  • I feel these questions are usually contrived, and therefore low-quality and unpopular among hiring firms.
  • There are no classic comp-science constructs at bitwise level, and bitwise doesn’t play well with those constructs
  • the bitwise hacks must be language-neutral wrt python and javascript, so quality questions are scarce.
  • phone round? impossible

How is debugger breakpoint implemented@@ brief notes #CSY

This is a once-only obscure interview question. I said up-front that CPU interrupts were needed. I still think so.

I believe CPU support is needed to debug assembly programs, where kernel may not exist.

For regular C program I still believe special CPU instructions are needed.

https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints seems to agree.

It says Interrupt #3 is designed for debugging. It also says SIGTRAP is used in linux, but windows supports no signals.

 

NxN matrix: graph@N nodes #IV

Simon Ma of CVA team showed me this simple technique.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/tokenLinked_Friend.cpp is my first usage of it.

  • I only needed half of all matrix cells (excluding the diagonal cells) because relationships are bilateral.
  • Otherwise, if graph edges are directed, then we need all N(N-1) off-diagonal cells since A->B is not the same as B->A.

My discrete math textbook shows this is a simplified form of representation and can’t handle self-link or parallel edge. The vertex-edge matrix is more robust but space-inefficient.
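
A minimal sketch of the half-matrix idea for bilateral relationships. The class and method names are made up for illustration and are not taken from the linked file; it assumes node ids 0..N-1 and no self-links.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Undirected graph on n nodes, storing only the cells below the diagonal.
class HalfMatrixGraph {
public:
    explicit HalfMatrixGraph(std::size_t n) : n_(n), cells_(n * (n - 1) / 2, false) {}

    void connect(std::size_t a, std::size_t b) { cells_[index(a, b)] = true; }
    bool linked(std::size_t a, std::size_t b) const { return cells_[index(a, b)]; }
    std::size_t cellCount() const { return cells_.size(); }

private:
    // Map an unordered pair {a, b}, a != b, to one slot: row = larger id, col = smaller id.
    std::size_t index(std::size_t a, std::size_t b) const {
        assert(a != b && a < n_ && b < n_);   // no self-links in this representation
        if (a < b) std::swap(a, b);
        return a * (a - 1) / 2 + b;
    }
    std::size_t n_;
    std::vector<bool> cells_;
};
```

With 4 nodes this keeps 6 cells instead of 16, and linked(1,3) gives the same answer as linked(3,1).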

churn !! bad ] mktData #socket,FIX,.. unexpected!

I feel the technology churn is remarkably low.

New low-level latency techniques are coming up frequently, but these topics are actually “shallow” and low complexity to the app developer.

  • epoll replacing select()? yes churn, but much less tragic than the stories with swing, perl, structs
  • most of the interview topics are unchanging
  • concurrency? not always needed. If needed, then often fairly simple.

UDP recv()from 1 send()at most

P116 [[tcp/ip sockets in C]] made it very clear.

A call to recv on the receiver machine will return data from at most one send() on the sender machine.

It can be a partial message, but would be the first part. See https://stackoverflow.com/questions/13317532/receiving-a-part-of-packet-via-recvfrom-udp

I believe the entire payload of one send()/sendto() is packaged into an envelope. The kernel would never deliver two envelopes to one recv()/recvfrom() call. Therefore the receiver can only receive one envelope at a time. If the entire envelope is too large for the receive buffer, then only the first part of the payload is delivered (the rest is discarded).
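
A small experiment that shows the envelope behaviour. Big assumption: to keep it runnable without any network setup, I use a local AF_UNIX SOCK_DGRAM socketpair as a stand-in for a real UDP sender and receiver. The datagram boundary rules are the same, but this is not UDP itself. Linux/POSIX only.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct RecvSizes { ssize_t first, second, third; };

// Send two datagrams, then read them back with differently sized buffers.
RecvSizes demoDatagramBoundaries() {
    int fds[2];
    int rc = socketpair(AF_UNIX, SOCK_DGRAM, 0, fds);   // stand-in for a UDP pair
    assert(rc == 0);
    (void)rc;

    send(fds[0], "AAAA", 4, 0);      // envelope 1: 4 bytes
    send(fds[0], "BBBBBB", 6, 0);    // envelope 2: 6 bytes

    char big[64];
    char tiny[3];
    RecvSizes out;
    out.first  = recv(fds[1], big, sizeof big, 0);        // only envelope 1, despite the roomy buffer
    out.second = recv(fds[1], tiny, sizeof tiny, 0);      // first 3 bytes of envelope 2, rest dropped
    out.third  = recv(fds[1], big, sizeof big, MSG_DONTWAIT);  // nothing left: -1 (would block)

    close(fds[0]);
    close(fds[1]);
    return out;
}
```

The three recv() calls return 4, 3 and -1: one envelope per call, and the tail of the truncated envelope never shows up again.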

childThr.get_id() after join()

Not sure about pthreads but here is c++11 std::thread::get_id() behavior:
“If the thread object is not joinable, the function returns a default-constructed object of member type thread::id.”
I believe after you join a childThr, that thread is no longer joinable, so get_id() will return a meaningless boilerplate value (the default-constructed thread::id).

Solution: to use that id, you need to save it before joining
https://github.com/tiger40490/repo1/blob/cpp1/cpp/thr/takeTurn.cpp is my experiment
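
A minimal sketch of the save-before-join idea (not the code in the linked experiment; the struct and function names are mine):

```cpp
#include <cassert>
#include <thread>

struct IdCheck {
    bool savedIdIsReal;     // id captured before join() identifies a real thread
    bool idAfterJoinEmpty;  // id queried after join() equals the default thread::id
};

IdCheck demoGetIdAfterJoin() {
    std::thread childThr([] { /* no work needed for this demo */ });

    const std::thread::id saved = childThr.get_id();  // save it while still joinable
    childThr.join();                                  // childThr is no longer joinable

    IdCheck out;
    out.savedIdIsReal    = (saved != std::thread::id());
    out.idAfterJoinEmpty = (childThr.get_id() == std::thread::id());
    return out;
}
```

Both flags come out true: the saved id stays usable, while the id queried after join() compares equal to a default-constructed thread::id.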

## optimize code for i-cache: few tips

I don’t see any ground-breaking suggestions. I think only very hot functions (confirmed by oprofile + cachegrind) require such micro-optimization.

I like the function^code based fragmentation framework on https://www.eetimes.com/document.asp?doc_id=1275472 (3 parts)

  • inline: footprint+perf can backfire. Can be classified as embedding
  • use table lookup to replace “if” ladder — minimize jumps
  • branching — refactor a lengthy-n-corner-case (not “hot”) code chunk out to a function, so 99% of the time the instruction cache (esp. the pre-fetch flavor) doesn’t load a big chunk of cold stuff.
    • this is the opposite of embedding !
  • Trim the executable footprint. Reduce code bloat due to inlining and templates?
  • loop unrolling to minimize jumps. I think this is practical and time-honored — at aggressive optimization levels some compilers actually perform loop unrolling! Programmers can do it manually.
  • Use array (anything contiguous) instead of linked list or maps to exploit d-cache + i-cache
  • https://software.intel.com/en-us/blogs/2014/11/17/split-huge-function-if-called-by-loop-for-best-utilizing-instruction-cache is a 2014 Intel paper — split huge function if it’s invoked in a loop.
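
To illustrate the table-lookup tip, here is a hypothetical fee function written both ways. The order types and fee values are invented. Whether the table version actually wins depends on the compiler and the data, so measure before trusting it.

```cpp
#include <array>
#include <cassert>

// "if" ladder: a chain of compare-and-jump instructions.
int feeByLadder(int orderType) {
    if (orderType == 0) return 5;
    else if (orderType == 1) return 7;
    else if (orderType == 2) return 12;
    else if (orderType == 3) return 20;
    return 0;                       // unknown type
}

// Table lookup: one bounds check, then a single indexed load.
int feeByTable(int orderType) {
    static constexpr std::array<int, 4> kFee = {5, 7, 12, 20};
    return (orderType >= 0 && orderType < 4) ? kFee[orderType] : 0;
}
```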

 

reinterpret_cast(zero-copy)^memcpy: raw mktData parsing

Raw market data input comes in as array of unsigned chars. I “reinterpret_cast” it to a pointer-to-TradeMsgStruct before looking up each field inside the struct.

Now I think this is the fastest solution. Zero-cost at runtime.

As an alternative, memcpy is also popular but it requires a bitwise copy. It often requires allocating a tmp variable.
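
A sketch of the two styles side by side. The field layout of TradeMsgStruct below is hypothetical. Caveat: the zero-copy view formally depends on the compiler tolerating the aliasing and the unaligned access, which mainstream compilers do for packed structs. The memcpy version is well-defined for any alignment, and for a small fixed-size struct compilers often optimize the copy away anyway.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical wire layout; the field names are made up for this sketch.
#pragma pack(push, 1)
struct TradeMsgStruct {
    std::uint32_t seqNum;
    std::uint16_t qty;
    char          side;
};
#pragma pack(pop)
static_assert(sizeof(TradeMsgStruct) == 7, "packed layout expected");

// Zero-copy view: no bytes move, the struct pointer aliases the raw buffer.
inline const TradeMsgStruct* viewInPlace(const unsigned char* raw) {
    return reinterpret_cast<const TradeMsgStruct*>(raw);
}

// Copying alternative: needs a temporary, but works for any alignment.
inline TradeMsgStruct copyOut(const unsigned char* raw) {
    TradeMsgStruct tmp;
    std::memcpy(&tmp, raw, sizeof tmp);
    return tmp;
}
```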

JGC G1 Metaspace: phrasebook #intern

Incidentally, NIO buffer is also in native memory

 

c++condVar 2 usages #timedWait

poll()as timer]real time C : industrial-strength #RTS is somewhat similar.

http://www.stroustrup.com/C++11FAQ.html#std-condition singles out two distinct usages:

1) notification
2) timed wait — often forgotten

https://en.cppreference.com/w/cpp/thread/condition_variable/wait_for shows std::condition_variable::wait_for() takes a std::chrono::duration parameter, which has nanosec precision.

Note java wait() also has nanosec precision.

std::condition_variable::wait_until() can be useful too, featured in my proposal RTS pbflow msg+time files #wait_until
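
A minimal sketch of both usages with wait_for(). The function and struct names are mine; the timeouts are arbitrary.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

struct WaitResult { bool timedOut; bool notified; };

WaitResult demoTimedWait() {
    std::mutex mtx;
    std::condition_variable cv;
    bool ready = false;
    WaitResult out;

    {   // Usage 2, timed wait: nobody notifies, so wait_for() gives up after 20 ms.
        std::unique_lock<std::mutex> lk(mtx);
        out.timedOut = !cv.wait_for(lk, std::chrono::milliseconds(20), [&] { return ready; });
    }

    // Usage 1, notification: a helper thread flips the flag and notifies.
    std::thread notifier([&] {
        { std::lock_guard<std::mutex> g(mtx); ready = true; }
        cv.notify_one();
    });
    {
        std::unique_lock<std::mutex> lk(mtx);
        // Generous timeout so the predicate, not the clock, ends the wait.
        out.notified = cv.wait_for(lk, std::chrono::seconds(5), [&] { return ready; });
    }
    notifier.join();
    return out;
}
```

The predicate overload returns the final value of the predicate, so false means "timed out with the condition still unmet".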

killing a stuck thread #cancellation points#CSY

Q: once you know one of many threads is stuck in a production process, what can you do? Can you kill a single thread?
A: there will not be a standard construct provided by the OS or thread library because killing a thread is inherently unsafe. Look at java Thread.stop(), which is deprecated.
A: yes if I have a builtin kill-hook in the binary

https://www.thoughtspot.com/codex/threadstacks-library-inspect-stacktraces-live-c-processes describes a readonly custom hook. It’s conceivable to add a kill feature:

  • Each thread runs a main loop to check an exit-condition periodically.
  • This exit-condition would be similar to pthreads “cancellation points”

https://stackoverflow.com/questions/10961714/how-to-properly-stop-the-thread-in-java shows two common kill hooks — interrupt and Boolean flag
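
A sketch of the Boolean-flag hook in c++. The names are illustrative, and a real hook would be wired to some admin command or signal. Note this only helps a thread that still reaches its check. A thread stuck inside a blocking call never sees the flag.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Cooperative "kill hook": the worker polls a flag on every pass of its main loop.
class StoppableWorker {
public:
    StoppableWorker() : stop_(false), loops_(0), thr_([this] { run(); }) {}
    ~StoppableWorker() { requestStop(); if (thr_.joinable()) thr_.join(); }

    void requestStop() { stop_.store(true); }
    void join() { thr_.join(); }
    long loops() const { return loops_.load(); }

private:
    void run() {
        while (!stop_.load()) {                 // the exit-condition check
            ++loops_;                           // stand-in for one unit of real work
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
    std::atomic<bool> stop_;
    std::atomic<long> loops_;
    std::thread thr_;   // declared last so the flags exist before the thread starts
};
```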

 

socket^swing: separate(specialized skill)from core lang

  • I always believe swing is a distinct skill from core java. A regular core Java or jxee guy needs a few years experience to become swing veteran.
  • Now I feel socket programming is similarly a distinct skill from core C/c++

In both cases, since the core language knowledge won’t extend to this specialized domain, you need to invest personal time outside work hours .. look at CSY. That’s why we need to be selective which domain.

Socket domain has much better longevity (shelf-life)  than swing!

5concerns@bigData(+quant)domains#10Y

  1. fads — vaguely I feel these are fads.
  2. salary — (Compare to financial IT) absolute profit created by data science is small but headcount is high ==> most practitioners are not well-paid. Only buy-side data science stands out
  3. volatile — I see data science too volatile and churning, like javascript, GUI and c#.
  4. shrink — I see traditional derivative-pricing domain shrinking.
  5. entry barrier — quant domain requires huge investment but may not reward me financially
  6. value — I am suspicious of the economic value they claim to create.

##functions(outside big4)using either rvr param or move()

Q: Any function(outside big4) with rvr param?
%%A: Such functions are rare. I don’t know any.
AA: [[effModernC++]] has a few functions taking rvr param, but fairly contrived as I remember.
AA: P544 [[c++primer]] says class methods could use rvr param
* eg: push_back()

 

Q: any function (outside big4) using std::move?

  • [[effModernC++]] has a few functions.
  • P544 [[c++primer]] says rarely needed
  • [[josuttis]] p20
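
One hypothetical example covering both questions: a method outside the big4 that takes an rvr param (in the same spirit as push_back) and uses std::move inside. The class is invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

class NameList {
public:
    void add(const std::string& s) { names_.push_back(s); }        // lvalue arg: copies
    void add(std::string&& s) { names_.push_back(std::move(s)); }  // rvalue arg: moves from it
    const std::vector<std::string>& names() const { return names_; }
private:
    std::vector<std::string> names_;
};
```

Inside the second overload the parameter s is a named variable, hence an lvalue, so std::move() is needed to pass it on as an rvalue.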

 

down-cast a reference #idiom

ARM P69 says down-cast a reference is fairly common. I have never seen it.

Q: Why not use ptr?
%%A: I guess pointer can be null so the receiver function must face the risk of a null ptr.
%%A: 99% of references I have seen in my projects are function parameters, so references are extremely popular and proven in this use case. If you receive a ref-to-base, you can down cast it.

See post on new-and-dynamic_cast-exceptions
see also boost polymorphic_cast
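
A minimal sketch of down-casting a ref-to-base received as a function parameter. The types are made up. Unlike the pointer form, which yields a null pointer, a failed reference cast throws std::bad_cast.

```cpp
#include <cassert>
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base { int tag() const { return 42; } };
struct Other : Base {};

// Receives ref-to-base and down-casts it; reports -1 when the cast fails.
int tagOrMinusOne(Base& b) {
    try {
        Derived& d = dynamic_cast<Derived&>(b);
        return d.tag();
    } catch (const std::bad_cast&) {
        return -1;
    }
}
```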

favor std::begin(arrayOrContainer)

https://stackoverflow.com/questions/7593086/why-use-non-member-begin-and-end-functions-in-c11 explains some important details.

Q: So how do we choose between

  • this free global function
  • the container member function cont::begin() / end()?

%%A: Basically, I would always use std::begin() instead of cont.begin(), esp. in template-enabled programs.
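
A small sketch of why the free functions suit templates: one function body serves both containers and built-in arrays, which have no member begin(). The function name is mine.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Works for any sequence that std::begin()/std::end() understand.
template <typename Seq>
long sumAll(const Seq& seq) {
    long total = 0;
    for (auto it = std::begin(seq); it != std::end(seq); ++it) total += *it;
    return total;
}
```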

slist in python@@ #no std #Ashish

A quick google search shows

* python doesn’t offer linked list in standard library

* python’s workhorse list like [2,1,5] is an expandable array, i.e. vector. See https://stackoverflow.com/questions/3917574/how-is-pythons-list-implemented and https://www.quora.com/How-are-Python-lists-implemented-internally

* {5, 1, 0} braces can initialize a set. I very seldom use a set since a dict is almost always good-enough.

mgr role risk: smaller job pool

When not comfortable (under threat), or job lost … the prospect of finding a similar job is much worse than for a hands-on developer, because the number of senior mgr jobs is much smaller.

Avichal basically said he would avoid hands-off manager roles.

As contractor, most of the time I feel very relaxed about moving in and out. The price to pay, of course, is lower salary.

## emulators of secDB #GregR

— secDB derivatives/cousins/emulators:

  • Pimco and Macquarie licensed Beacon
  • CS hired two secDB veterans to build a similar system on java
  • MS has a RCE (risk calc engine) project, based on scala and java
  • UBS tried it too. I applied for this job in 2011 or 2012
  • Athena and Quartz
  • BlackRock Aladdin (1988), written in java, for risk management across portfolios. All other functionalities are secondary.

I feel you need such a system only if your books have many derivative contracts that need constant “revaluation”. This is a core feature of derivative risk systems.

Q: beyond risk systems, why is Quartz also supporting trade booking and execution?

I think the secret key is in the data store, which is central to those systems. SecDB systems feature a specially designed in-memory and replicated data store, which can be the basis of those systems.

A special data store is live and reference market data.

y I use lots of if..{continue;}

Inside a loop, many people prefer if/elif/else. To them, it looks neat and structured.

However, I prefer the messier if…continue; if…continue; if..continue. Justification?

I don’t have to look past a pageful of if/elif/elif/…/else to see what else happens to my current item. I can ignore the rest of the loop body.

Besides the current item, I can also safely let go of (rather than keep tracking) all the loop-local variable values. All of them will be wiped out and reset to new values on the next iteration.
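
A hypothetical line filter written in the if..continue style. The filtering rules are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Counts "real" order lines, skipping blanks, comments and over-long lines.
int countOrderLines(const std::vector<std::string>& lines) {
    int count = 0;
    for (const std::string& line : lines) {
        if (line.empty()) continue;        // done with this item, nothing below applies
        if (line[0] == '#') continue;      // comment line
        if (line.size() > 80) continue;    // malformed: too long

        ++count;                           // only the survivors reach this point
    }
    return count;
}
```

Each continue is a full stop for the current item, so the reader never needs to scan for a matching else.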

pass temp into func by val: mv-ctor skipped #RVO CSY

Suppose we have a class MoveOnlyStr which has only move-ctor, no copy-ctor. Suppose we pass an unnamed temporary instance of this class into a function by value, like void func1(MoveOnlyStr arg_mos).

Q: Will the move-ctor be used to create the argument object arg_mos? We discussed this in your car last time we met up.

A: If the temp is produced by a function, then No. My test shows I was right to predict that the compiler optimizes away the temporary, due to RVO. So the move-ctor is NOT used. This RVO optimization has existed long before c++11.

A: if the temp is not produced by a function, then RVO is irrelevant (nothing “Returned”) but I don’t know if there’s still some copy-elision.
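
A small probe that counts move-ctor calls for both cases. This MoveOnlyStr is my own stand-in, not the class from the original test. As far as I know, since c++17 the elision is guaranteed in both cases (an unnamed temporary initializing a by-value parameter of the same type); before c++17 it is a permitted optimization, which GCC and Clang apply at default settings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

struct MoveOnlyStr {
    static int moveCount;                       // how many times the move-ctor ran
    std::string text;
    explicit MoveOnlyStr(std::string s) : text(std::move(s)) {}
    MoveOnlyStr(MoveOnlyStr&& other) : text(std::move(other.text)) { ++moveCount; }
    MoveOnlyStr(const MoveOnlyStr&) = delete;   // no copy-ctor
};
int MoveOnlyStr::moveCount = 0;

MoveOnlyStr makeStr() { return MoveOnlyStr("from function"); }   // returns an unnamed temporary

std::size_t func1(MoveOnlyStr arg_mos) { return arg_mos.text.size(); }

// Returns the number of move-ctor calls observed across both call styles.
int countMovesForTemporaries() {
    MoveOnlyStr::moveCount = 0;
    func1(makeStr());                    // temp produced by a function
    func1(MoveOnlyStr("inline temp"));   // temp created at the call site
    return MoveOnlyStr::moveCount;
}
```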

c++GC interface

https://stackoverflow.com/questions/27728142/c11-what-is-its-gc-interface-and-how-to-implement

GC interface is partly designed to enable

  • reachability-based leak detectors
  • garbage collection

The probe program listed in the URL shows that as of 2019, all major compilers provide trivial support for GC.

Q: why does c++ need GC, given RAII and smart pointers?
A: system-managed automatic GC instead of manual deallocation, without smart pointers

Y allocate static field in .c file %%take

Why do we have to define the static field myStaticInt in a cpp file?

For a non-static field myInt, the allocation happens when the class instance is allocated on stack, on heap (with new()) or in global area.

However, myStaticInt isn’t taken care of. It’s not on the real estate of the new instance. That’s why we need to declare it in the class header, and then define it exactly once (ODR) in a cpp file. Its storage is reserved at build time rather than at run time, i.e. static allocation.
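
A single-file sketch of the declare-then-define split. The Account class is hypothetical; the comments mark which part would go in the header and which in the cpp file.

```cpp
#include <cassert>

// ---- what would live in the header ----
struct Account {
    int myInt;                 // per-instance: lives wherever the instance lives
    static int myStaticInt;    // declaration only: no storage reserved here
    Account() : myInt(0) { ++myStaticInt; }
};

// ---- what would live in exactly one .cpp file ----
int Account::myStaticInt = 0;  // the single definition (ODR) that reserves the storage

int instancesCreated() { return Account::myStaticInt; }
```

sizeof(Account) equals sizeof(int), confirming the static field takes no space inside an instance. Since c++17 an inline static data member can be defined right in the header instead.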

p2p messaging beats MOM ] low-latency trading

example — RTS exchange feed dissemination infrastructure uses raw TCP and UDP sockets and no MOM

example — the biggest sell-side equity OMS network uses MOM only for minor things (eg?). No MOM for market data. No MOM carrying FIX order messages. Between OMS nodes on the network, FIX over TCP is used

I read and recorded the same technique in 2009… in this blog

Q: why is this technique not used on west coast or main street ?
%%A: I feel on west coast throughput outweighs latency. MOM enhances throughput.

CMS JGC: deprecated in java9

Java9/10 default GC is G1. CMS is officially deprecated in Java 9.

Java8/7 default GC is ParallelGC, not CMS. See https://stackoverflow.com/questions/33206313/default-garbage-collector-for-java-8

Note parallelGC uses

  • parallel in young gen
  • serial in old gen

…whereas parallelOldGC uses parallel in all generations.

Q: why is CMS deprecated?
A: one blogger seems to know the news well. He said the JVM engineering team needs to focus on new GC engines and needs to let go of the most high-maintenance but outdated codebase, the CMS. As a result, new development will cease on CMS but the CMS engine is likely to be available for a long time.

efficient swap(): two containers-of-T

Background — template function std::swap(T&, T&) works for int, float etc, but the same implementation will not work efficiently for vector, list, map or set. Therefore I suspected there might be specializations of swap() template function.

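As it turns out, vector (and the other containers) provide a swap() member function. So the implementation of vector swap is indeed different from the generic std::swap(). The standard library also overloads std::swap() for these containers to call the member function, so both spellings are cheap.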
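
A quick check that vector::swap() exchanges the internal buffers instead of copying elements. The function name is mine.

```cpp
#include <cassert>
#include <vector>

// After the swap, each vector owns the other's original storage.
bool swapExchangesBuffers() {
    std::vector<int> a(1000, 1);
    std::vector<int> b(5, 2);
    const int* aBuf = a.data();
    const int* bBuf = b.data();

    a.swap(b);   // constant time, regardless of element count

    return a.data() == bBuf && b.data() == aBuf && a.size() == 5 && b.size() == 1000;
}
```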

q[inline] to avoid a.. jump^stackFrame

Dino of BBG FX team asked me — when you mark a small f1() function inline (like manually copying the code into main()), you save yourself a jump or a new stack frame?

A: both a jump and a new stack frame.

It turns out a new stack frame would require a jump, because after the new stack frame is created, thread jumps to the beginning of f1().

However, there’s something to set up before the jump. Suppose f1() is on Line 5 in main(), then Line 6’s address has to be saved to CPU register, otherwise the thread has no idea where to “jump back” after returning from f1(). According to my codebashing training (required at RTS team), this Line 6’s address is saved in the main() stack frame, not the f1() stack frame!

Note the Line 6’s address is neither a heap address nor a stack address but a pointer into the code area.

jmp_buf/setjmp() basics for IV #ANSI-C

Q: how is longjmp different from goto? See http://ecomputernotes.com/what-is-c/function-a-pointer/what-is-the-difference-between-goto-and-longjmp-and-setjmp

A: longjmp can 1) jump across functions, and 2) restore program state from a jmp_buf, which was saved earlier by setjmp.
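
A minimal sketch of jumping across two function frames, something goto can’t do. All names are mine. In c++ this is only safe when the skipped frames hold no objects with non-trivial destructors, as is the case here.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf g_checkpoint;     // program state saved by setjmp()
static volatile int g_depthReached = 0;

static void innermost() {
    g_depthReached = 2;
    std::longjmp(g_checkpoint, 1);    // unlike goto, jumps out of this function entirely
}

static void middle() {
    innermost();
    g_depthReached = 99;              // never runs: longjmp skipped the normal return path
}

// Returns the deepest call level reached before control came back here.
int runWithCheckpoint() {
    g_depthReached = 0;
    if (setjmp(g_checkpoint) == 0) {  // first pass: setjmp returns 0 after saving state
        middle();
    }
    // second pass: we arrive here again, with setjmp "returning" 1
    return g_depthReached;
}
```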

fileHandle/socket/dbConnection are thread Unsafe

There’s a buffer in a DB connection, in a file handle, in a socket …

The buffer is a shared mutable object. Consider a read-buffer. The host object knows how much of the data in the buffer was already delivered, to avoid repeated delivery. There’s some kind of watermark, which is moved by a consumer thread.

Like all shared mutables, these objects are thread unsafe.

All of these objects can also be allocated on stack and therefore invisible to other threads. Therefore, this could be the basis of a thread-safe design.

 

python run complex external commands #subprocess

I prefer one single full-feature solution that’s enough for all my needs. The os.system() solution is limited. The subprocess module is clearly superior. One of the simplest features is

>>> import subprocess
>>> subprocess.call(["ls", "-l"])

If you need redirection and background, then try the single-string version

>>> subprocess.call('ls /tmp > /tmp/a.log &', shell=True) # returns only the shell's exit status; the output itself lands in /tmp/a.log

c++4beatFronts: which1to LetGo if I must pick1

Q: Which one to let go, given I have limited O2/laser/bandwidth and mental capacity?

  1. give up BP — biggest room for improvement but least hope
  2. give up codility style — Just get other friends to help me. See codility: ignore overflow, bigO

How about pure algo?

  • already decent? Can improve.
  • diminishing return? Smaller room for improvement? but I can learn a few key ideas about the G100 questions

container of smart^raw pointer

In many cases, people need to store addresses in a container. Let’s use std::vector for example. Both smart ptr and raw ptr are common and practical

  • Problem with raw ptr — stray pointer. Usually the vector doesn’t “own” the pointees, and won’t delete them. But what if a pointee is deleted somewhere else and we access the stray pointer in this vector? A smart pointer would solve this problem nicely.
  • J4 raw ptr — footprint efficiency. Raw ptr object is smaller.
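
A sketch of both points, using shared_ptr as the smart pointer (my choice; the Quote type is invented). The size comparison holds on the mainstream implementations I know of, but the standard doesn’t promise it.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Quote { double px; };

// The vector of shared_ptr co-owns each pointee, so an element can never dangle
// even after the original owner lets go.
double lastPriceAfterOwnerReset() {
    std::vector<std::shared_ptr<Quote>> book;
    std::shared_ptr<Quote> owner = std::make_shared<Quote>(Quote{101.5});
    book.push_back(owner);
    owner.reset();                  // the "somewhere else" that releases the pointee
    return book.back()->px;         // still valid: use_count dropped from 2 to 1, not to 0
}

// The price of that safety: a bigger element (typically two words instead of one).
constexpr bool kSmartPtrIsBigger = sizeof(std::shared_ptr<Quote>) > sizeof(Quote*);
```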

WordPad : rich-text note taker

I found WordPad RTF files a good middle ground between ascii text and MSWord files.

  • 🙂 basic text effects like background+font colors
  • 🙂 footprint is comparable to ascii, much smaller than MSWord files.
    • Q: How about after compression?
  • 🙂 how about copying to Linux? tested

Verdict —

  1. default general-purpose note-taker remains ascii, but
  2. for office work notes with text highlighting, let’s try WordPad more often. But let’s not become dependent. Convert old rtf notes to ascii.

select^poll # phrasebook

Based on https://www.ulduzsoft.com/2014/01/select-poll-epoll-practical-difference-for-system-architects/, which I respect.

  • descriptor count — up to 200 is fine with select(); 1000 is fine with poll(); Above 1000 consider epoll
  • time-out precision — poll/epoll has millisec precision. select() takes a struct timeval with microsec resolution, a thousand times finer (pselect() takes a timespec with nanosec resolution), but only embedded devices need such precision.
  • single-threaded app — poll is just as fast as epoll. epoll() excels in MT.

https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/rzab6/poll.htm has sample code on poll().
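
A tiny poll() sketch of my own (not the IBM sample). Assumption: a pipe stands in for a socket so that it runs without any network. The third argument is the timeout in millisec.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

struct PollOutcome { int quiet; int afterWrite; bool readable; };

// Watches the read end of a pipe with a millisecond timeout.
PollOutcome demoPoll() {
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    (void)rc;

    struct pollfd watch;
    watch.fd = fds[0];
    watch.events = POLLIN;
    watch.revents = 0;

    PollOutcome out;
    out.quiet = poll(&watch, 1, 10);          // nothing to read: returns 0 after about 10 ms

    ssize_t written = write(fds[1], "x", 1);
    (void)written;
    out.afterWrite = poll(&watch, 1, 1000);   // returns 1 at once: one descriptor is ready
    out.readable = (watch.revents & POLLIN) != 0;

    close(fds[0]);
    close(fds[1]);
    return out;
}
```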

collection-of-abstract-shape: j^c++

In java, this usage pattern is simple and extremely common — Shape interface.

In c++, we need a container of shared_ptr to pure abstract base class 😦

  • pure abstract interface can support MI
  • shared_ptr supports reference counting, in place of garbage collection
  • pointer instead of nonref payload type, to avoid slicing.

This little “case study” illustrates some fundamental differences between java and c++, and showcases some of the key advantages of java.
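
For comparison, the c++ side might look like this minimal sketch. The shape classes are made up.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Shape {                               // pure abstract interface
    virtual ~Shape() {}
    virtual double area() const = 0;
};
struct Square : Shape {
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
    double side_;
};
struct Rect : Shape {
    Rect(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }
    double w_, h_;
};

double totalArea(const std::vector<std::shared_ptr<Shape>>& shapes) {
    double sum = 0;
    for (const auto& s : shapes) sum += s->area();   // virtual dispatch, no slicing
    return sum;
}
```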

gdb --tui #split screen

You can also start gdb normally, then switch to split screen, with “ctrl-x  ctrl-a” (thanks to Gregory).

Upper screen shows the source code with a moving marker

–Here’s my full gdb command showing

  • --args to run the target executable with arguments
  • redirect stderr to a file so my gdb screen isn’t messed up — not always effective

gdb --tui --args $base/shared/tp_xtap/bin/xtap -D 9 -c $base/etc/test_replay.cfg 2 >

big guns: template4c++^reflection4(java+python)

Most complex libraries (or systems) in java require reflection to meet the inherent complexity;

Most complex libraries in c++ require template meta-programming.

But these are for different reasons… which I’m not confident to point out.

Most complex python systems require … reflection + import hacks? I feel python’s reflection (as with other scripting languages) is more powerful, less restricted. I feel reflection is at the core of some (most?) of the power features in python – import, polymorphism

## low-complexity QQ topics #JGC/parser..

java GC is an example of “low-complexity domain”. Isolated knowledge pearls. (Complexity would be high if you delve into the implementation.)

Other examples

  • FIX? slightly more complex when you need to debug source code. java GC has no “source code” for us.
  • socket programming? conceptually, relatively small number of variations and combinations. But when I get into a big project I am likely to see the true color.
  • stateless feed parser coded against an exchange spec

zbs/GTD/KPI/effi^productivity #succinctly

See zbs^GTD^QQ

  • zbs — real, core strength (内功) of the tech foundation for GTD and IV.
  • GTD —
  • KPI — boss’s assessment often uses productivity as the most important underlying KPI, though they won’t say it.
  • productivity — GTD level as measured by __manager__, at a higher level than “effi”
  • effi — a lower level measurement than “Productivity”

joining/leaving a multicast group

Every multicast address is a group address. In other words, a multicast address identifies a group.

Sending a multicast datagram is much simpler than receiving…

[1] http://www.tldp.org/HOWTO/Multicast-HOWTO-2.html is a concise 4-page introduction. Describes joining/leaving.

[2] http://ntrg.cs.tcd.ie/undergrad/4ba2/multicast/antony/ has sample code to send/receive. Note there’s no server/client actually.

 

Single-Threaded^STM #jargon

SingleThreadedMode means each thread operates without regard to other threads, as if there’s no shared mutable data with other threads.

Single-Threaded can mean

  • no other thread is doing this task. We ensure there’s strict control in place. Our thread still needs to synchronize with other threads doing other tasks.
  • there’s only one thread in the process.

Inlining is THE optimization

MSVS and g++ debug build both disable inline (presumably to ease debugging). The performance difference vis-a-vis release build is mostly due to this single factor, according to [[optimized c++]]

The same author asserts that inlining is probably the most powerful code optimization.

Stroustrup singled out inline as a key feature of c++.

In his low-latency seminar, Martin Thompson also quoted a computer scientist saying “Inlining is THE optimization”. This is now a memorable quote.

I think one of my java performance books singled out inlining as the single most effective optimization compiler can do.

Pimpl effectively disables inlining

vptr-based dispatch also disables inlining

C++faq gives many reasons for and against inline.

simple implementation of memory allocator#no src

P9 [[c++game development primer]] has a short implementation without using heap. The memory pool comes from a large array of chars. The allocator keeps track of allocated chunks but doesn’t reuse reclaimed chunks.

It showcases the header associated with each allocated chunk. This feature is also part of a real heap allocator.

reinterpret_cast is used repeatedly.
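
Since I have no source, here is my own sketch in the same spirit: the pool is a char array, each chunk carries a header, and reclaimed chunks are not reused. Layout, names and sizes are my assumptions, not the book’s code.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class ArenaAllocator {
    static constexpr std::size_t kAlign = alignof(std::max_align_t);
    static constexpr std::size_t kPoolSize = 4096;

public:
    struct Header { std::size_t payloadSize; };   // sits just in front of every chunk

    // Carves [Header][payload] off the unused tail of the pool; nullptr when full.
    void* allocate(std::size_t bytes) {
        const std::size_t payload = (bytes + kAlign - 1) / kAlign * kAlign;
        if (payload + kAlign > kPoolSize - used_) return nullptr;
        new (pool_ + used_) Header{payload};       // header occupies one kAlign-sized slot
        used_ += kAlign + payload;
        ++live_;
        return pool_ + used_ - payload;
    }

    // Bookkeeping only: the chunk's bytes are never handed out again.
    void deallocate(void* p) { if (p) --live_; }

    // Recovers the header from a pointer that allocate() returned earlier.
    static std::size_t chunkSize(const void* p) {
        const unsigned char* raw = static_cast<const unsigned char*>(p);
        return reinterpret_cast<const Header*>(raw - kAlign)->payloadSize;
    }

    std::size_t liveChunks() const { return live_; }

private:
    static_assert(sizeof(Header) <= kAlign, "header must fit in one aligned slot");
    alignas(std::max_align_t) unsigned char pool_[kPoolSize];
    std::size_t used_ = 0;
    std::size_t live_ = 0;
};
```

A real heap allocator would add a free list so that deallocate() can recycle chunks; the header is what makes that possible.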

y concurrentHM.size() must lock entire map#my take

Why not lock one segment, get the subcount, unlock, then move to next segment?

Here’s my take. Suppose 2 threads concurrently insert one item each, into two different segments. Before that, there are 33 items. Afterwards, there are 35 items. So 33 and 35 are both "correct" answers. 34 is incorrect.

If you lock one segment at a time, you could count an old value in one segment then a new value in another segment.

multicast address 1110xxxx #briefly

By definition, multicast addresses all start with 1110 in the first half byte. Routers seeing such a destination (never a source) address know the msg is a multicast msg.

However, routers don’t forward any msg with destination address 224.0.0.0 through 224.0.0.255 because these are local multicast addresses. I guess these local multicast addresses are like 192.168.* addresses.
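
The two rules as bit tests, in a small sketch. The helper names are mine, and addresses are held in host byte order for simplicity.

```cpp
#include <cassert>
#include <cstdint>

// Packs a dotted quad into a host-order 32-bit value, e.g. ipv4(224, 0, 0, 1).
constexpr std::uint32_t ipv4(std::uint32_t a, std::uint32_t b, std::uint32_t c, std::uint32_t d) {
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// Multicast: the top four bits are 1110, i.e. 224.0.0.0 through 239.255.255.255.
constexpr bool isMulticast(std::uint32_t addr) { return (addr >> 28) == 0xEu; }

// Local block 224.0.0.0 through 224.0.0.255, which routers do not forward.
constexpr bool isLocalMulticast(std::uint32_t addr) {
    return (addr >> 8) == (ipv4(224, 0, 0, 0) >> 8);
}
```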

serialize access to shared mutable: mutex^CAS

[[optimized c++]] P290 points out that, in addition to mutex, CAS construct also serializes access to shared mutable objects.

I feel it’s nothing but a restatement of the definition of “shared mutable”.  More relevant question is

Q: what constructs support unimpeded concurrent access to shared mutable?
A: read-write lock lets multiple readers proceed, in the absence of writers.
A: RCU lets all readers proceed, but writers are restricted.
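
To make the mutex^CAS comparison concrete, here is a sketch of the same shared counter protected both ways. The function names are mine. In both versions the updates take effect one at a time; the CAS version just retries instead of blocking.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

long countWithMutex(int threads, int perThread) {
    long counter = 0;
    std::mutex mtx;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < perThread; ++i) {
                std::lock_guard<std::mutex> g(mtx);   // one thread at a time past this point
                ++counter;
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}

long countWithCas(int threads, int perThread) {
    std::atomic<long> counter(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < perThread; ++i) {
                long seen = counter.load();
                // Retry until no other thread changed the value between our read and our write.
                while (!counter.compare_exchange_weak(seen, seen + 1)) { /* seen was refreshed */ }
            }
        });
    for (auto& th : pool) th.join();
    return counter.load();
}
```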