throughput^latency #wiki

High bandwidth often means high-latency:( .. see also linux tcp buffer^AWS tuning params

  • RTS is throughput driven, not latency-driven.
  • Twitter/FB fanout is probably throughput-driven, not latency-driven
  • I feel MOM is often throughput-driven and introduces latency.
  • I feel HFT OMS like in Mvea is latency-driven. There are probably millions of small orders, many of them cancelled.

https://en.wikipedia.org/wiki/Network_performance#Examples_of_latency_or_throughput_dominated_systems shows

  • satellite is high-latency, regardless of throughput
  • offline data transfer (by trucks) has poor latency but excellent throughput

op=() will call implicit cvctor if RHS is!!correct type

Below is an online c++ question, to predict the output (12310). The assignment “i=10” looks like a call to the synthesized assignment operator, but I overlooked the fact that the RHS is a primitive int while the LHS is an Integer, and the assignment operator requires both sides to be of the Integer class type!

So an implicit conversion-ctor is required before the assignment operator is picked by the compiler.

A second knowledge gap — the synthesized assignment operator performs a member-wise copy (not a bit-wise copy), including the “num” field.

#include <iostream>
using namespace std;

class Integer{
  int num;
public:
  Integer(){
    num=0;
    cout<<"1";
  }
  //explicit //would break the assignment
  Integer(int arg){
    cout<<"2";
    num=arg;
  }
  int getValue(){
    cout<<"3";
    return num;
  }
};
int main(){
  Integer i;
  i = 10;
  cout<<i.getValue();
}

semaphore: often !! ] thread library

lock and condVar are essential components of any thread library. Counting Semaphore is not.

  • In (C library) POSIX the semaphore functions do not start with pthread_, unlike the lock and condVar functions.
  • In (C library) SysV, the semaphore API is not part of any thread library whatsoever
  • In java, locks and condVars are integrated with every object, but Semaphore is a separate class
  • Windows is different

Important Note — both dotnet and java are a few abstraction levels higher than the thread or semaphore “libraries” provided on top of an operating system. These libraries [1]  are sometimes NOT part of kernel, even if the kernel provides basic thread support.

[1] ObjectSpace and RogueWave both provide thread libraries, not built into any operating system whatsoever.
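To illustrate why a counting semaphore is not an essential primitive, here is a minimal sketch that builds one from a lock plus a condVar. The class name CountingSemaphore and its method names are hypothetical, chosen for this illustration; this is not how POSIX, SysV, java or Windows actually implement their semaphores.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical class: a counting semaphore layered on a lock + condVar.
class CountingSemaphore {
    std::mutex mtx_;
    std::condition_variable cv_;
    int permits_;
public:
    explicit CountingSemaphore(int initial) : permits_(initial) {}

    // Block until a permit is available, then take it.
    void acquire() {
        std::unique_lock<std::mutex> lk(mtx_);
        cv_.wait(lk, [this] { return permits_ > 0; });
        --permits_;
    }

    // Take a permit only if one is available right now.
    bool tryAcquire() {
        std::lock_guard<std::mutex> lk(mtx_);
        if (permits_ == 0) return false;
        --permits_;
        return true;
    }

    // Return a permit. Any thread may call this, not only the acquirer.
    void release() {
        {
            std::lock_guard<std::mutex> lk(mtx_);
            ++permits_;
        }
        cv_.notify_one();  // wake one waiter, if any
    }
};
```

The permit count is the "state" carried by the semaphore, which is why a release() issued before any acquire() is not lost, unlike a condVar notification with no waiter.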

[17] semaphore^mutex

See also CSY blogpost on binary semaphore vs mutex..

I usually trust official documentation by Microsoft, Oracle, Linux .. more than online forums, but I did find some worthwhile read in
https://stackoverflow.com/questions/62814/difference-between-binary-semaphore-and-mutex . Multiple authors pointed out the notification feature of semaphore. I agree.

There are at least 4 different semaphore APIs
* java has Semaphore class
* dotnet has a Semaphore class based on Windows semaphore
* system-V semaphore API
* POSIX semaphore API

The windows (including dotnet) semaphore API is somewhat different from the rest. Notification feature is built into windows semaphore, but I don’t think the other 3 API documentations have any mentions of notification or signaling features. I believe ownership is NOT the key point. By “ownership” the (Windows) semaphore discussions basically means “notification”. Non-owners of a semaphore can announce intention-to-acquire.

POSIX countingSemaphore ^ lock+condVar #Solaris docs

https://docs.oracle.com/cd/E19120-01/open.solaris/816-5137/sync-11157/index.html points out a lesser-known difference in the Solaris context:

Because semaphores need not be acquired and be released by the same thread, semaphores can be used for asynchronous event notification, such as in signal handlers (but presumably not interrupt handlers). And, because semaphores contain state, semaphores can be used asynchronously without acquiring a mutex lock as is required by condition variables. However, semaphores are not as efficient as mutex locks.

The same page also shows a POSIX countingSemaphore can be used for IPC or between threads.

compare-up too much; too little compare-out

Comparison with “peers” is really hard to rationalize away.

Now my habit is 80% compare-up; and only 20% of the time we feel fortunate comparing to the wider population.

It’s better to become 20% compare-up.

Reason – Those up-comparisons are usually impractical, irrational and with wrong groups i.e. managers.

highest leverage: localSys^4beatFronts #short-term

Q: For the 2018 landscape, what t-investments promise the highest leverage and impact?

  1. delivery on projects
  2. local sys know-how
  3. portable GTD+zbs irrelevant for IV
  4. pure algo (no real coding) — probably has the highest leverage over the mid-term (like 1-5Y)
  5. QQ
  6. –LG2
  7. obscure QQ topics
  8. ECT+syntax — big room for improvement for timed IDE tests only
  9. Best practices — room for improvement for weekend IDE tests only

average O(): hashtable more IMperfect than qsort

In these celebrated algorithms, we basically accept the average complexity as if they were very likely in practice. Naive…

In comp science problems, hash table’s usage and importance is about 10 times higher than qsort

  • I would say qsort is faster than many O(N logN) sorts. Qsort can use random pivot. It degrades only if extremely “lucky;)” like getting a “6” on all ten dice.
  • In contrast, hash table performance depends mostly on programmer skill in designing the hash function, less on luck.

Performance compared to the alternatives — qsort competitive performance is pretty good in practice, but hash table relative performance is often underwhelming compared to red-black trees or AVL trees in practice. Recall RTS.

given unsorted ints, find median in O(N)

Classic problem — Q: Given an unsorted int array, find its median in O(N) time. For simplicity, we can first assume array size is odd.

I think we can use any data structure. We can scan the array multiple times.

—-simple partition algo

Pick a random item in the array as pivot value and partition the array. Let’s say we are unlucky and get 12% low vs 88% high. So we discard the 12% and repeat. Suppose then we get 66% vs 22% (within high section). We would then discard the 22%.

So we are likely to require about 2N “visits”. I don’t think it would degrade to O(N logN).
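The simple partition algo can be sketched as below. This is my own illustration under a few assumptions: the function names kthSmallest and median are hypothetical, the array is non-empty with odd size, the random seed is fixed for reproducibility, and I use a three-way partition so that duplicate values cannot stall the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Returns the k-th smallest item (0-based). Works on a copy because
// partitioning reorders the items. Assumes v is non-empty and k < v.size().
int kthSmallest(std::vector<int> v, std::size_t k) {
    std::mt19937 rng(12345);            // fixed seed: an assumption for repeatability
    std::size_t lo = 0, hi = v.size();  // the live section is [lo, hi)
    while (hi - lo > 1) {
        std::uniform_int_distribution<std::size_t> pick(lo, hi - 1);
        int pivot = v[pick(rng)];
        // Three-way partition: [lo,lt) < pivot, [lt,gt) == pivot, [gt,hi) > pivot
        std::size_t lt = lo, i = lo, gt = hi;
        while (i < gt) {
            if (v[i] < pivot)      std::swap(v[i++], v[lt++]);
            else if (v[i] > pivot) std::swap(v[i], v[--gt]);
            else                   ++i;
        }
        if (k < lt)       hi = lt;   // target is in the low section, discard the rest
        else if (k >= gt) lo = gt;   // target is in the high section, discard the rest
        else return pivot;           // target equals the pivot value
    }
    return v[lo];
}

// Median of an odd-sized array.
int median(const std::vector<int>& v) {
    return kthSmallest(v, v.size() / 2);
}
```

Each round only visits the surviving section, which is where the N+N/2+N/4 style estimate comes from when the pivots are reasonably lucky.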

—-fancy idea — weighted pivot

In one scan find the max and min. Calculate the mid value. Say this is 23.5, not even an integer. I will use this as a pivot value to partition the array into 2 segments.

Suppose N=101, so I’m looking for #51 item. Suppose left segment has 10 and right segment has 91, so I discard left segment. Now I’m looking for the #41 in the remaining array.

Now I find max and min again (if necessary). Now, Instead of getting the mid point between them, I will use a weighted pivot of (10*min+91*max)/101, so this pivot is shifted to the right, based on suspicion of a right-heavy histogram.

–complexity?

On average, I should get N+N/2+N/4 … <2N

Worst case? The earlier illustration is rather unlucky since that histogram happens to be right-heavy. In such a case, my “weighted pivot” idea should alleviate it.

[18]t-investment: c++now surpassing java

My learning journey has been more uphill in c++. Up to 2018, I probably have invested more effort in c++ than any language including java+swing.

I analyzed c++QQ more than java QQ topics, because java is Significantly easier, more natural for me.

I read and bought more c++ books than java+swing books.

If I include my 2Y in Chartered and 2Y in Macq, then my total c++ professional experience is comparable to java.

Q: why until recently I felt my GTD mileage was less than in java+swing?

  • A #1: c++ infrastructure is a /far cry/ from the clean-room java environment. More complicated compilation and more runtime problems.
  • A: I worked on mostly smaller systems… less familiar with the jargons and architecture patterns
  • A: not close to the heart of bigger c++ systems

Q: why until recently I didn’t feel as confident in c++ as java+swing?

  • A #1: interview experiences. About 30% of my c++ interviews were HFT. I always forgot I had technical wins at SIG and WorldQuant
  • A #2: GTD mileage, described above.

prefer ::at()over operator[]read`containers#UB

::at() throws exception … consistently 🙂

  • For (ordered or unordered) maps, I would prefer ::at() for reading, since operator[] silently inserts for lookup miss.
  • For vector, I would always favor vector::at() since operator[] has undefined behavior when index is beyond the end.
    1. worst outcome is getting trash without warning. I remember getting trash from an invalid STL iterator.
    2. better is consistent seg fault
    3. best is exception, since I can catch it
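A small sketch of the difference, using helper names (mapAtThrows, vectorAtThrows, sizeAfterBracketLookup) that I made up for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// True if reading `key` via map::at() throws std::out_of_range.
bool mapAtThrows(const std::map<std::string, int>& m, const std::string& key) {
    try { (void)m.at(key); return false; }
    catch (const std::out_of_range&) { return true; }
}

// True if reading index `idx` via vector::at() throws std::out_of_range.
bool vectorAtThrows(const std::vector<int>& v, std::size_t idx) {
    try { (void)v.at(idx); return false; }
    catch (const std::out_of_range&) { return true; }
}

// operator[] on a lookup miss silently inserts a default-constructed value.
// Returns the map size after the lookup so the side effect is visible.
std::size_t sizeAfterBracketLookup(std::map<std::string, int>& m,
                                   const std::string& key) {
    (void)m[key];
    return m.size();
}
```

Note that map::operator[] does not even compile on a const map, whereas at() does, which is another reason to prefer at() for reading.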

 

get()vs operator*()on c++11 smart pointers #Rahul

Same for unique_ptr and shared_ptr

  • shared_ptr::get() returns a raw ptr like T*
  • shared_ptr::operator*() returns a lvalue reference, similar to operator*() on raw ptr
    • equivalent to: q[  * get() ]

Rahul wrote

   auto newShPtr = make_shared<Acct>(*existingShPtr);

After much discussion, I now believe

  • q[*existingShPtr] evaluates to a reference to the Acct instance on heap.
  • Therefore, at run time we hit the Acct class copy ctor.
  • We thereby instantiate a second Acct instance on heap — deep clone.
  • The address of this second instance is saved in a new “club” of shared_ptr instances. This new club is about the second Acct instance, unrelated to the existing club, so there’s no risk of double-delete. In contrast, the version below would probably create a (disaster) new club around the same Acct instance on heap :
  auto newShPtr = shared_ptr<Acct>(existingShPtr.  get  ());

You can also call make_shared<Acct>() to invoke the default ctor. Note there might be a default argument like

Acct(int id=0); // declared in the public section of the class
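The two-club picture can be checked with use_count(). In this sketch the Acct struct is a hypothetical stand-in for the real class, and cloneAcct is a helper name I made up:

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the Acct class discussed above.
struct Acct {
    int id;
    explicit Acct(int i = 0) : id(i) {}
};

// Deep clone: *existing yields an Acct&, so make_shared hits the Acct copy ctor
// and the new heap object is owned by a brand-new club.
std::shared_ptr<Acct> cloneAcct(const std::shared_ptr<Acct>& existing) {
    return std::make_shared<Acct>(*existing);
}
```

After cloning, each club has its own use_count, and changing one Acct leaves the other untouched. By contrast, plain copying of the shared_ptr (auto c = a;) joins the existing club.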

## 9 c++realized $ROTI

I wanted c++ ROTI. After so many years of trial and error, I got two

  • G3 [18] The CVA $122/hr offer
  • G3 [18] SCB-FM S$210k offer, unthinkable in my Singapore job search.
    • In terms of base, This one is about $$190k. My “reasonable” target was S$150k and my “high” target was $170k.
  • G5 [18] SIG technical win
  • G5 [12] BNP forex prop trading contract offer
  • [19] MLP-sg java connectivity team actually has a small c++ requirement.
  • G9 overcame fear@large codebase]c++/j and emerged above most developers.

my”native”language=C :feel-good

See also post on CivilEngineers.

Context: speaking to interviewers, colleagues, I like to say my native programming language is … C

C is the first language I studied in-depth on my own, in 1994. C was also the first professional programming language in my very first job. I’m proud of my association with C because :

  • My dad is a specialist on Mozi (墨子). C is like Confucius (孔子). C never completely fell out of fashion for system programmers.
  • C is the most important language to Unix system programming (kernel, sockets, standard library…). As of 2019, system programming knowledge is growing progressively more important to my job interviews.
  • threading and data structure are among the top 5 most important and evergreen interview topics, both “born” in C.
    • Most thread implementations are related to system libraries.
    • all important data structures are low level and implemented in C more efficiently than other languages
  • In terms of depth — I think C, c++, java, c# have the most depth. I am slowly building my grasp of this depth. I think the accumulation is good.
  • In terms of churn and accu — C is among the best. See [17] j^c++^c# churn/stability…
  • In terms of its relation to other languages — C is the #1 most important, as Confucius is in Chinese culture. Java shows barely visible heritage from C. In contrast, C#, python, perl etc show a visible heritage from C. I feel most popular languages today inherit from C or are created in C.
  • In terms of longevity — C is #1, the grand-daddy in the short history of programming languages. (C++ might come 2nd.) In contrast, all the popular languages will probably come and go — java, python, c#, javascript
  • Mark of Quoin seems to suggest that my low-level experience is less valuable than experience using c++ libraries, but I think most people would agree that the high-level experience is superficial, lower accumulation, high churn, and offers no insight.

##[18]portable skills like coreJava/c++

I have found core java and core c++ outstanding domains for my personality. I feel there are not too many similar domains with

  1. feature: standardized knowledge highly portable across teams
  2. feature: requires more than such (simple) knowledge that anyone can pick up in a crash course
  3. feature: low churn high stability

… so we can hope to accumulate. Here are some comparable domains:

  • pure algo 🙂 yes standardized, non-trivial questions. I can accumulate. Better than brain teasers
  • socket and sys programming:) Yes standardized questions, including malloc, sharedMem
  • SQL 🙂 good despite shrinking demand.
    • Can we say the same (“shrinking”) about ANSI-C? Well, lots of c++ questions are C.
  • bond math 🙂 not so deep but good
  • JGC + jvm tuning? 😦 churn
  • python ? 😦 Deep questions are rare and non-standardized, like concurrency, meta-programming..
  • quant-dev domain 😦 questions are standard, but too few jobs
  • algo trading? 😦 i don’t think they ask any standard question

longest substring+!repeating chars #untested

Q(leetcode Q3): Given a string, find the longest substring without repeating characters.

–Sol1 O(N):
keep a never-shrinking sliding window + a “hashmap” of chars in it. Actually the HM is a 26-element integer array of frequencies.

Every time the lagging edge of the window moves by one, by definition one char drops out, so we remove that char from the HM, by decrementing its frequency. If hitting 0 then we also decrement a global var uniqCnt describing the HM.

IFF uniqCnt == windowSz then the window is clean.

Every time we see a clean window and it’s longer than the longest clean window, we update our record.
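Sol1 can be sketched as below. Assumptions in this sketch: the function name longestCleanWindow is mine, and I use a 256-slot frequency array instead of 26 so that any byte value works, not only lowercase letters.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Length of the longest substring without repeating chars,
// using a never-shrinking sliding window [lo, hi].
std::size_t longestCleanWindow(const std::string& s) {
    std::array<int, 256> freq{};   // the "hashmap": char -> frequency in the window
    std::size_t lo = 0, uniqCnt = 0, best = 0;
    for (std::size_t hi = 0; hi < s.size(); ++hi) {
        // Leading edge moves by one: a new char enters the window.
        if (++freq[static_cast<unsigned char>(s[hi])] == 1) ++uniqCnt;
        const std::size_t windowSz = hi - lo + 1;
        if (uniqCnt == windowSz) {
            best = windowSz;       // clean window, one longer than the previous record
        } else {
            // Not clean: move the lagging edge by one so the size stays put.
            if (--freq[static_cast<unsigned char>(s[lo])] == 0) --uniqCnt;
            ++lo;
        }
    }
    return best;
}
```

Because the window grows only when it is clean and otherwise slides, its size always equals the record so far, and each char enters and leaves the HM at most once, giving O(N).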

##Y c++IV improved much faster]U.S.than SG #insight{SCB breakthru

Hi XR,

I received 9 c++ offers since Mar 2017, mostly from U.S. In contrast, over the 4.5 years I spent in Singapore, I received only 3 c++ offers including a 90% offer from HFT firm WorldQuant (c++ job but not hardcore).

  1. Reason: buy-side employers — too picky. Most of the Singapore c++ jobs I tried are buy-side jobs. Many of the teams are not seriously hiring and only wanted rock stars.
    • In contrast, Since 2010 I tried about 6 Singapore ibank c++ jobs (Citi, Barclays, Macquarie, Standard Chartered Bank) and had much better technical wins than at buy-side interviews.
  2. Reason: Much fewer c++ jobs than in U.S.
  3. Reason: employee — I was always an employee while in Singapore and dare not attend frequent interviews.
  4. Reason: my c++ jobs in the U.S. are more mainstream so I had more opportunities to experiment on mainstream c++ interview topics. Experiments built up my confidence and depth.
  5. Reason: I had much more personal time to study and practice coding. This factor alone is not decisive. Without the real interviews, I would mostly waste my personal time.

Conclusion — availability of reasonable interview opportunities is a surprisingly oversized factor for my visible progress.

By the way, Henry Wu (whom I told you about) had more successful c++ interviews. He joined WorldQuant and Bloomberg, two companies who didn’t take me up even after my technical wins.

##taking on c# ] 2012 #j4..

See post on j4 stick2c++: Score big{losing@quant/c#

This review is mostly for future planning, not nostalgia

  • — Q: what were the motivation/j4 in 2012?
  • c# was #1 on front end in banks + some buy-side… Now it is losing mind share to web GUI. Very little heard on WPF.
    • lousy technology bet
  • c# was challenging java in a small number of banks … Now it has taken too long to mount that challenge
  • After the “conquest” of java (QQ and GTD), I felt c# was fairly close to Java and a “low-hanging big fruit” compared to c++ and python
  • I witnessed a few systems with java back-end and c# front-end and a demand for versatile developers….Now there are very few.
  • On Wall St I saw more c# than c++ jobs … now unsure. Python and java have since gained market share from c# and c++
  • — Q: why I stopped pushing on the c# front? See 2 reasons y I stayed with c++NOT c#
  • I don’t like the Windows platform. My focus has shifted away. No single big reason.
  • The absolute amount of energy (and time) I had to spend on both GTD and QQ were precious. The “focus shift” means a huge write-off, and disappointing ROTI, comparable to the MSFM write-off.
  • — Q: how was my …. –>  planning and execution?
  • I feel 80% successful. I feel in my c# first 12M I gained more confidence (5/10) than my java first 12M
  • experience (in GTD and … also IV) grew from 0 to 5/10 in windows __serverside__ dev and scripting, c# language
  • attending interviews remotely… Worked to some extent
  • taking on a wide range of c# GTD tasks … worked, including WCF, Excel, WindowsService, vbscript integration, …
  • chipping away at the biggest GTD rock namely MSVS .. worked. Now more confidence
  • — intangible gains { c# endeavor , a fresh look as of 2019. NO…. No need to consolidate with the above
  • self-image boost in absorbency, sustained focus and deep-dive.
  • Precious engagement for 12M. Over the last 20Y, some of the best months were in those 12M.
  • I’m no longer afraid of a decline of java in the face of a Microsoft challenge. Right now, java is challenged mostly by python and javascript
  • removed most of my fear of joining a windows dev team like Ashish’s
  • insight into the evolution and cross pollination among the big3 languages
  • leverage? not bad
  • market depth? good except the high-end usually requires WPF
  • MSVS was a scary monster. After my c# experience I became familiar with this monster.

parts@mv-semantics impl: helicopter/hist view

Move semantics is 90% compile-time + 10% run-time programming. 95% in-the-fabric and invisible to us.

  • 90% TMP
    • 70% is about special casts — in the form of std::move and std::forward
    • std::swap
  • 10% traditional programming
    • RAII to destroy the “robbed” rvalue object
    • heap ptr assignment [1] to rob the resource

[1] This “stealing” is tip of the iceberg, the most visible part of move semantics. Bare-hand stealing was doable in c++03, but too dangerous. The rvr is a fundamental language feature to make the “steal” safer:

  • a safety device around the dangerous “steal”
  • a safety glove
  • a controlled destruction

The resource is typically a heapy thingy, common fixture in all STL containers and+ std::string + smart pointers. Therefore, move-semantics is widely used only in libraries, not in applications.
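The 10% run-time part (the steal plus the RAII cleanup of the robbed object) can be sketched with a hypothetical resource-owning class named Buffer. This is an illustration only, not taken from any library:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class owning a heapy thingy.
class Buffer {
    int* data_;
    std::size_t size_;
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}

    // RAII: also runs on the robbed object, where delete[] nullptr is a no-op.
    ~Buffer() { delete[] data_; }

    // Move ctor: the "steal" is a plain heap ptr assignment.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;   // leave the victim safely destructible
        other.size_ = 0;
    }

    // Copying is left out to keep the sketch short.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }
};
```

std::move(a) itself does nothing at run time. It is only the cast that lets overload resolution pick the Buffer&& ctor, which is the compile-time part of the picture.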

IV Q: implement op=()using copier #swap+RAII

A 2010 interviewer asked:

Q: do you know any technique to implement the assignment operator using an existing copy ctor?
A: Yes. See P100 [[c++codingStd]] and P347 [[c++cookbook]] There are multiple learning points:

  • RAII works with a class owning some heap resource.
  • RAII works with a local stack object owning a heapy thingy
  • RAII provides a much-needed exception safety. For the exception guarantee to hold, std::swap and operator delete should never throw, as echoed in my books.
  • I used to think only half the swap (i.e. transfer-in) is needed. Now I know the transfer-out on the old resource is more important. It guarantees the old resource is released even-if hitting exceptions, thanks to RAII.
  • The std::swap() trick needed here is powerful because there’s a heap pointer field. Without this field, I don’t think std::swap will be relevant.
  • self-assignment check — not required as it is rare and tolerable
  • Efficiency — considered highly efficient. The same swap-based op= is used extensively in the standard library.
  • idiomatic — this implementation of operator= for a resource-owning class is considered idiomatic, due to simplicity, safety and efficiency

http://www.geeksforgeeks.org/copy-swap-idiom-c/ shows one technique. Not sure if it’s best practice. Below is my own

C & operator=(C const & rhs){
  C localCopy(rhs); //This step is not needed for move-assignment
  std::swap(localCopy.heapResource, this->heapResource);
  return *this;
}// at end of this function, localCopy is destructed, and the original this->heapResource is deleted

C & operator=(C && rhs){ //move-assignment
  std::swap(rhs.heapResource, this->heapResource);
  return *this;
}
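For completeness, here is a self-contained sketch of a class around the two operators above. The single int payload, the ctor signatures and the value() accessor are my assumptions for illustration:

```cpp
#include <cassert>
#include <utility>

// Hypothetical class with exactly one heap pointer field.
class C {
    int* heapResource;
public:
    explicit C(int v) : heapResource(new int(v)) {}
    C(C const& rhs) : heapResource(new int(*rhs.heapResource)) {}  // the existing copier
    C(C&& rhs) noexcept : heapResource(rhs.heapResource) { rhs.heapResource = nullptr; }
    ~C() { delete heapResource; }

    C& operator=(C const& rhs) {
        C localCopy(rhs);                                 // may throw, but *this is untouched so far
        std::swap(localCopy.heapResource, heapResource);  // pointer swap, never throws
        return *this;
    }   // localCopy is destructed here, releasing the old resource

    C& operator=(C&& rhs) noexcept {                      // move-assignment
        std::swap(rhs.heapResource, heapResource);        // old resource now dies with rhs
        return *this;
    }

    // Precondition: not called on an object robbed by the move ctor.
    int value() const { return *heapResource; }
};
```

Self-assignment goes through the same path: a wasted copy, but no corruption, which is why the check is tolerable to omit.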

 

dlopen^LoadLibrary # plugin

Windows doesn’t have the dlopen API, but many techniques are similar on Windows LoadLibrary API.

https://en.wikipedia.org/wiki/Dynamic_loading#In_C/C++

  • UNIX-like operating systems such as macOS, Linux … use dlopen(), dlsym() etc
  • Windows uses LoadLibrary() and GetProcAddress() etc

I was never asked about this in interviews, and never needed this feature. Useful knowledge for a c++ veteran though.

Machine Learning #notes

Machine Learning — can be thought of as a method of data analysis, but a method that can automate analytical model building. As such, this method can find hidden insights unknown to the data scientist. I think the AlphaGo Zero is an example .. https://en.wikipedia.org/wiki/AlphaGo_Zero

Training artificial intelligence without datasets derived from human experts is… valuable in practice because expert data is “often expensive, unreliable or simply unavailable.”

AlphaGo Zero’s neural network was trained using TensorFlow. The robot engaged in reinforcement learning, playing against itself until it could anticipate its own moves and how those moves would affect the game’s outcome

So the robot’s training is by playing against itself, not studying past games by other players.

The robot discovered many playing strategies that human players never thought of. In the first three days AlphaGo Zero played 4.9 million games against itself and learned more strategies than any human can.

In the game of GO, world’s strongest players are no longer humans. Strongest players are all robots. The strongest strategies humans have developed are easily beaten by these robots. Human players can watch these top (robot) players fight against each other, and try to understand why their strategies work.

## placement new: pearls to impress IV

container{string}: j^c++

In java, any container (of string or int or anything) holds pointers only.

I think c# collections (i.e. containers) contain pointers if T is a reference type.

In cpp,

  • container of int always contains nonref, unlike java
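  • container of container holds the inner containers as nonref objects, but each inner container is a small handle holding a ptr to its heap storage, so the effect is similar to java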
  • but container of string is widely used, and invariably contains nonref std::string !

Q: is there any justification for container<(smart) ptr to string>? I found rather few online discussions.
A: See boost::ptr_container

Q: what if the strings are very large?
A: some (older) std::string implementations use COW to speed up copying + assignment. However, the standard specifies O(lengthOfString) for the string copy ctor, and since c++11 it effectively rules out COW! So in a standard-compliant implementation copy and assignment would be expensive, so I believe we must use container<(smart) ptr to string>

 

##advanced python topics !! IV

I read a few books and sites listing “advanced python” topics, but they don’t agree on what features are important.

Anyone can list 20 (or 100) obscure and non-trivial python features and call it a “list of advanced python features”.

  • Mixin/AbstractBaseClass, related to Protocols
  • Protocol, as an informally defined Interface
  • coroutines?
  • Futures for async? python3.2 😦
  • Properties? Similar to c# properties; provides better encapsulation than __myField3
  • q[yield] keyword

std::defer_lock #kill1@4deadlock factors

Only in the lock() call does the thread grab both locks together. This breaks “incremental acquisition”, one of the four deadlock conditions.

Sample code from https://en.cppreference.com/w/cpp/thread/lock_tag

  std::unique_lock<std::mutex> ulock1(mutex1,std::defer_lock);
  std::unique_lock<std::mutex> ulock2(mutex2,std::defer_lock);

  // now grab both locks
  std::lock(ulock1,ulock2); 
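A fuller sketch of the same pattern, where the Account struct and the transfer() function are hypothetical names for illustration. Two threads calling transfer(a,b,..) and transfer(b,a,..) concurrently name the locks in opposite order, yet std::lock acquires both all-or-nothing.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical type: each account is guarded by its own mutex.
struct Account {
    std::mutex mtx;
    int balance;
    explicit Account(int b) : balance(b) {}
};

// Precondition: from and to are different accounts
// (locking the same std::mutex twice would be undefined behavior).
void transfer(Account& from, Account& to, int amount) {
    // defer_lock: associate each unique_lock with its mutex without locking yet
    std::unique_lock<std::mutex> lockFrom(from.mtx, std::defer_lock);
    std::unique_lock<std::mutex> lockTo(to.mtx, std::defer_lock);

    std::lock(lockFrom, lockTo);   // grab both together, no incremental acquisition

    from.balance -= amount;
    to.balance += amount;
}   // both unique_lock dtors release the mutexes here
```

The unique_lock wrappers are only there for RAII release. The deadlock avoidance comes from std::lock itself.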

ECN domain knowledge IV questions, by an MD interviewer

  • Q: what kind of data are stored in your orderbook?
  • Q: what kind of data fields are sent by the matching engine in the order messages?
  • Q3: In an NYSE orderbook, typically do you see smaller or bigger order quantities at the next level below top of book?
  • Q3b: Suppose you, as a dealer on a private ECN, maintains a best offer and the 2nd best offer for IBM (or gold or whatever). Which of the two offer quantities would you make bigger and why?
  • Q: your server hardware capacity?
  • Q: how many threads in your parser process?
  • Q: name a few resiliency features in your parser
  • Q: what happens when a parser process is down?
  • Q: why do you consider Line A + Line B a resilience feature if the same thread consumes both? The entire parser can crash, right?
  • %%A3b: 2nd best. The average execution price would be better when some market-taker /lifts both offers/. When we put out our inventory at a discount we don’t want to give too much — we say first 100 customers only.
  • A3b: 2nd best. risk minimization… easier to hedge the ensuing exposure created when some market-taker /lifts both offers/.

 

##5 understandable-yet-useful type traits for TMP

The <type_traits> header defines too many constructs but only a few are easy to understand and unambiguous:

  1. enable_if
  2. is_same
  3. conditional — https://en.cppreference.com/w/cpp/types/conditional. Not necessarily widely used but my favorite
  4. is_base_of — https://en.cppreference.com/w/cpp/types/is_base_of
  5. is_polymorphic — https://en.cppreference.com/w/cpp/types/is_polymorphic about virtual function
  6. — not so understandable
  7. is_pointer — ambiguous. I think it only knows about regular ptr and function ptr
  8. is_class and is_scalar — possibly useful to check T is a class vs a “simple” type
  9. — understandable but not so valuable
  10. is_abstract — having pure virtual function
  11. — neither very understandable nor highly valuable

I guess all of these type traits are class templates. The usual way to access the result is the (boolean) ::value static member, though enable_if and conditional expose a ::type member typedef instead (enable_if_t evaluates to that type directly).
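A few of the understandable traits in action. Base, Derived, Plain and Accumulator are hypothetical names invented for this sketch:

```cpp
#include <cassert>
#include <type_traits>

struct Base { virtual ~Base() {} };   // has a virtual function, so polymorphic
struct Derived : Base {};
struct Plain {};

// ::value is a compile-time bool, so it can feed static_assert directly.
static_assert(std::is_base_of<Base, Derived>::value, "Derived inherits from Base");
static_assert(!std::is_base_of<Derived, Base>::value, "but not the other way round");
static_assert(std::is_polymorphic<Base>::value, "Base has a virtual function");
static_assert(!std::is_polymorphic<Plain>::value, "Plain has none");

// conditional yields a ::type rather than a ::value
static_assert(std::is_same<std::conditional<true, int, double>::type, int>::value,
              "true picks the first type");

// Hypothetical helper: choose an accumulator type at compile time.
template <typename T>
struct Accumulator {
    typedef typename std::conditional<std::is_floating_point<T>::value,
                                      double, long long>::type type;
};
```

With this helper, Accumulator<float>::type is double and Accumulator<int>::type is long long.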

c++TMP: 9 fundamental features + little tricks

ranked:

  1. SFINAE — fundamental compiler rule for overload resolution
  2. template specialization
  3. NDTTP — non-type template param, widely used, probably most of the standard TMP
  4. within a class template, define member function templates or nested class templates, with additional dummy types, probably required by SFINAE
  5. Even if an actual type (eg, Trade) behind a dummy type T doesn’t support an operation in vector<T>, compiler can still instantiate vector<Trade> if you don’t use that operation. [[c++Primer]] P329 has examples.
  6. default arguments — for type-param (or non-type-param), a small syntactical feature with BIG usages

Tricks:

  1. #include <type_traits>
  2. member typedefs — a regular class can define member typedefs. but this trick is used much more in class templates

enable_if{bool,T=void} #sfinae,static_assert

  • enable_if is a TMP technique that hides a function overload (or template specialization) — convenient way to leverage SFINAE to conditionally remove functions from overload resolution based on type traits and to provide separate function overloads for different type traits. [3]
    • I think static_assert doesn’t do the job.
  • typically used with std::is_* compile-time type checks provided in the <type_traits> header
  • enable_if_t evaluates to either a valid type or nothing. You can use enable_if_t as a function return type. See [4]. This is the simplest usage of enable_if to make the target function eligible for overload resolution. You can also use enable_if_t as a function parameter type but too complicated for me. [3]
  • There are hundreds of references to it in the C++11 standard template library [1]
  • type traits — often combined with enable_if [1]
  • sfinae — is closely related to enable_if [1]
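  • by default (common usage), the enable_if_t evaluates to void if “enabled”. I have a github experiment specifically on enable_if_t. I think the q(*) advice applies when you use enable_if_t as the type of a dummy template param rather than as return type: there you had better put a q(*) after it (and default it to nullptr), since a param can’t have type void.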
    • Deepak’s demo in [4] uses bool as enable_if_t
  • static_assert — is not necessary for enable_if but can be used for compile-time type validation. Note static_assert is unrelated to sfinae. https://stackoverflow.com/questions/16302977/static-assertions-and-sfinae explains that
    • sfinae checks declarations of overloads. sfinae = Substitution failure is not a (compiler) error. An error would break the compilation.
    • static assert is always in definitions, not declarations. static asserts generate compiler errors.
    • Note the difference between failure vs error
    • I think template instantiation (including overload resolution) happens first and static_assert happens later. If static_assert fails, it’s too late. SFINAE game is over. Compilation has failed irrecoverably.
    • Aha moment on SFINAE !
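A minimal sketch of enable_if as a return type, written with the c++11 spelling (typename std::enable_if<...>::type) rather than enable_if_t. The function name describe is hypothetical:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// This overload takes part in overload resolution only when T is integral.
// For other T, substitution fails and the overload is silently dropped (SFINAE).
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
describe(T) { return "integral"; }

// This overload takes part only when T is a floating-point type.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
describe(T) { return "floating"; }
```

Here the second template argument of enable_if (std::string) becomes the return type when the condition holds. Calling describe with, say, a pointer finds no viable overload and fails to compile, but that is an ordinary "no matching function" error, not a static_assert.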

[1] https://eli.thegreenplace.net/2014/sfinae-and-enable_if/

[2] https://stackoverflow.com/questions/30556176/template-detects-if-t-is-pointer-or-class

[3] https://en.cppreference.com/w/cpp/types/enable_if

[4] https://github.com/tiger40490/repo1/blob/cpp1/cpp/template/SFINAE_Deepak.cpp

[12] closure in java^c#

I feel c# Action<T1, T2 ..> and Func<T1,….> constructs are a good illustration of closure. The code block has access to local variables in the enclosing block. Static/non-static Fields are accessible too.

I feel the c# syntax is much simpler than java. http://stackoverflow.com/questions/5443510/closure-in-java-7  said “Since Java 1.1, anonymous inner class have provided this facility in a highly verbose manner. they also have a restriction of only being able to access final (and definitely assigned) variables.”

A practical scenario — When I extract common code into a static (sometimes non-static) utility function, I often end up passing in lots of local variables. It often pays to refactor the function into a closure. (More verbose in java but technically doable.) With closures you don’t need to pass those variables as they are implicitly passed into the closure (or need no passing at all).

Obviously, if the common code is complicated (above 10 lines) the closure would look bulky. Solution?  Keep the static utility function as is when you refactor, and have the closure call it. Suppose the static function compute() takes 5 local variable arguments. Closure can be invoked with none but closure will invoke compute() with the 5 local variables “implicitly passed in”.

[15] template type constraint : j^c++

java and c# generics can have constraints. If the generic code uses T->length() then the constraint says T must subtype a certain interface containing a length() method. C++ templates handle it differently.

(http://stackoverflow.com/questions/874298/c-templates-that-accept-only-certain-types presents other solutions like boost static_assert…)

http://stackoverflow.com/questions/122316/template-constraints-c points out

You can call any functions you want upon a template-typed value, and the only instantiations that will be accepted are those for which that method is defined. For example:

template <typename T>
int compute_length(T *value)
{
return value->length();
}
We can call this method on a pointer to any type which declares the length() method to return an int. Thusly:
string s = "test";
wstring ws = L"test"; // std::vector has size() but no length(), so a second string type is used here
int i = 0;

compute_length(&s);
compute_length(&ws);

//…but not on a pointer to a type which does not declare length():
compute_length(&i); //This third example will not compile.

This works because C++ compiles a new version of the template function (or class) for each instantiation. As it performs that compilation, it makes a direct, almost macro-like substitution of the template instantiation into the code prior to type-checking. If everything still works with that template, then compilation proceeds and we eventually arrive at a result. If anything fails (like int* not declaring length()), then we get the dreaded six page template compile-time error.

t-investment: QQ^coding^localSys

My investment in QQ category is more adequate than my investment in

  • coding drill
  • local sys knowledge

All 3 are important to family well-being, sense of security, progress, self-esteem..

However, localSys learning is obviously non-portable. Still at the current juncture localSys is the most important area lacking “sunshine”

## template tricks for type constraint

Here are the compile-time validation/checks offered by c++:

  1. std::is_pointer and family — each type trait checks one specific category of type. Very specific and clean
  2. q[ = delete ]  — to eliminate specific concrete type args. Laser removal .. extremely clean. Scott Meyers pointed out this usage of q[=delete]. Does this work with SFINAE? I didn’t find anything online
  3. std::enable_if — see separate blogpost. Designed more for SFINAE than type constraint
  4. sfinae — probably the most versatile, flexible and powerful
  5. static_assert? Can be combined with other techniques to implement compile-time validation and constraints. See https://github.com/tiger40490/repo1/blob/cpp1/cpp/template/SFINAE_ptrCheck.cpp
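A hedged sketch of three of the techniques listed above, assuming C++11. The names derefLength, isZero and twice are hypothetical helpers made up for this illustration; they are not from the linked repo.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// static_assert + std::is_pointer: reject non-pointer arguments with a readable message
template <typename T>
int derefLength(T ptr) {
  static_assert(std::is_pointer<T>::value, "T must be a pointer type");
  return static_cast<int>(ptr->length());
}

// =delete: remove one specific concrete type argument
template <typename T>
bool isZero(T v) { return v == 0; }
template <>
bool isZero<bool>(bool) = delete;  // isZero(true) no longer compiles

// std::enable_if: this overload exists only for integral T
template <typename T>
typename std::enable_if<std::is_integral<T>::value, T>::type
twice(T v) { return v + v; }

// ... and this one only for floating-point T
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, T>::type
twice(T v) { return v * 2; }

void constraintDemo() {
  std::string s = "test";
  assert(derefLength(&s) == 4);
  assert(isZero(0));
  assert(twice(21) == 42);
  assert(twice(1.5) == 3.0);
}
```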

covariant return type: c++98→java

java “override” rule permits covariant return — an overriding method may return a type D that’s a subtype of B, where B is the original return type of the overridden method.

— ditto c++

https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Covariant_Return_Types. Most common usage is a “clone” method that

  • returns ptr to Derived, in the Derived class
  • returns ptr to Base in the Base class

Covariant return types work with multiple inheritance and with protected and private inheritance — these simply affect the access levels of the relevant functions.

I was wrong to say virtual mechanism requires exact match on return type.

CRT was added in c++98. ARM P211 (c 1990) explains why CRT was considered problematic in the Multiple Inheritance context.
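The clone() usage mentioned above can be sketched as follows. Base and Derived are placeholder class names, and the override keyword assumes C++11 (the covariant return itself does not need it).

```cpp
#include <cassert>

struct Base {
  virtual ~Base() {}
  virtual Base* clone() const { return new Base(*this); }
  virtual int id() const { return 1; }
};

struct Derived : Base {
  // Return type differs from Base::clone(), yet this still overrides it
  Derived* clone() const override { return new Derived(*this); }
  int id() const override { return 2; }
  int derivedOnly() const { return 22; }
};

int covariantDemo() {
  Derived d;
  Derived* d2 = d.clone();   // static type is Derived*: no downcast needed
  int viaDerived = d2->derivedOnly();

  Base& b = d;
  Base* b2 = b.clone();      // dynamic dispatch still reaches Derived::clone()
  int viaBase = b2->id();
  assert(viaBase == 2);

  delete d2;
  delete b2;
  return viaDerived + viaBase;
}
```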

python Protocols #phrasebook

  • interface — similar to java Interface
  • unenforced — unlike java Interface, the python interpreter doesn’t enforce anything about protocols
  • eg: ContextManager protocol defines __enter__() and __exit__() methods
  • eg: Sequence protocol defines __len__() and __getitem__() methods
  • partial — you can partially implement the required functions of a protocol
    • eg: your class can implement just the __getitem__() and still works as a Sequence

c++ lockfree data types: mostly bool/int/pointer

  • In general, only atomic bool/int/pointer can be lock-free. Bigger types need special implementation techniques and I have read that lock-free is usually not possible.
  • std::atomic_flag is lock-free (this is the only type guaranteed to be lock-free on all library implementations)
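A small sketch, assuming C++11, of how to exercise std::atomic_flag and how to query lock-freedom at runtime. The result of is_lock_free() is platform-specific, so the sketch deliberately does not assert on it.

```cpp
#include <atomic>
#include <cassert>

// Returns how many times test_and_set() found the flag already set
int lockFreeDemo() {
  std::atomic_flag flag = ATOMIC_FLAG_INIT;  // the one type guaranteed lock-free
  int alreadySet = 0;
  if (flag.test_and_set()) ++alreadySet;     // first call: flag was clear
  if (flag.test_and_set()) ++alreadySet;     // second call: flag was set
  flag.clear();

  std::atomic<int> counter(0);
  counter.fetch_add(5);
  assert(counter.load() == 5);

  // Runtime query; typically true for int on mainstream hardware, but not guaranteed
  bool intIsLockFree = counter.is_lock_free();
  (void)intIsLockFree;
  return alreadySet;
}
```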

numpy,scipy,pandas #cheatsheet

  • numpy provides efficient and large matrices
    • also linear algebra
  • scipy extends numpy
  • scipy compete against matlab and provides
    • mathematical optimizations including optimize.curve_fit()
    • stats
    • numerical integration
    • spline smoothing
  • pandas extends numpy
  • pandas provides key data structures like
    • time series
    • dataFrame i.e. spreadsheet

concurrent python #my take

I’m biased against multi-threading, biased towards multiprocessing because …

  1. threading is for high-performance, but java/c++ leaves python in the dust
  2. GIL in CPython, which is the default download version of python. The standard doc says “If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing”

(For my academic curiosity ….) Python thread library offers common bells and whistles:

  • join()
  • timers
  • condVar
  • lock
  • counting semaphore
  • barrier
  • concurrent queue
  • isDaemon()
  • Futures? python3.2 😦

 

[19]%%competitive strengths as professional techie

In my 30’s I would look at the managers sitting in their private offices, and tell myself “I’m not a manager type. I’m the technical type”. However, I have always failed to rise above the rest and reach tech leadership, like senior architect or tech fellow.

Eventually I concluded my strength was limited to theoretical interview topics in java and math.

After spending 2 years self-learning and 3 years in UChicago on quant, I still feel stronger in math than my fellow developers, but math is becoming less and less important to most developer jobs. I can’t afford to invest more time in math (like CSDoctor does), because jobs are scarce.

Now finally I can add “c++” after “java” in my profile.

Pure algorithm (without real-coding test) was my traditional strength, but now I know the west coast and HFT standard. I think it’s still my strength by ibank standard. With more mileage and a clearer view of how the topics connect, I can grow stronger.

I was seldom very strong in GetThingsDone, but ironically the GTD guys seldom rise up. GTD productivity is a necessary but insufficient condition for promotion.

I created a shared_ptr with a local object address..

In my trade busting project, I once created a local object, and used its address to construct a shared_ptr (under an alias like TradePtr).

Luckily, I hit consistent crashes. I think the reason is — shared_ptr expects to own a heap object. When my function returned, the shared_ptr called delete on the raw ptr, which pointed into the stack frame. That is undefined behavior and led to the crash.

The proven solution — make_shared()
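A minimal sketch of the fix, assuming C++11. Trade is a hypothetical stand-in for the real class behind the TradePtr alias, and makeTrade() is an invented helper.

```cpp
#include <cassert>
#include <memory>

struct Trade {  // hypothetical stand-in for the real trade class
  int id;
  explicit Trade(int i) : id(i) {}
};
typedef std::shared_ptr<Trade> TradePtr;

// BAD (left as a comment on purpose): the default deleter would call
// delete on a stack address when the last owner goes away.
//   Trade local(1);
//   TradePtr p(&local);

TradePtr makeTrade(int id) {
  return std::make_shared<Trade>(id);  // object is heap-allocated and owned from birth
}

int makeSharedDemo() {
  TradePtr p = makeTrade(7);
  TradePtr q = p;               // second owner of the same object
  assert(p.use_count() == 2);
  return q->id;
}
```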

subclass ctor/dtor using virtual func

See https://stackoverflow.com/questions/13440375/invoking-virtual-method-in-constructor-difference-between-java-and-c

Suppose both superclass and subclass defines a virtual cleanup() method.

— c++

… you lose the virtual i.e. “dynamic dispatch” feature. While the base ctor/dtor runs, the subclass part of the instance is not present, so only the base class implementation of cleanup() could run.

–java: let’s focus on ctor

… the subclass implementation of cleanup() would run, even though the subclass instance is not initialized — dangerous! See P70 [[elements of java style]]
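The c++ half of this comparison can be sketched as below. Super, Sub and the trace string are illustrative names; the sketch records which cleanup() implementation runs at each stage.

```cpp
#include <cassert>
#include <string>

static std::string trace;  // records which implementation ran

struct Super {
  Super() { cleanup(); }           // resolves to Super::cleanup() during construction
  virtual ~Super() { cleanup(); }  // and again during destruction
  virtual void cleanup() { trace += "S"; }
};

struct Sub : Super {
  void cleanup() override { trace += "D"; }
};

std::string ctorDtorDemo() {
  trace.clear();
  {
    Sub s;        // Super ctor runs first: appends "S"
    s.cleanup();  // fully constructed object: appends "D"
  }               // Super dtor runs last: appends "S"
  return trace;
}
```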

std::weak_ptr phrasebook

ALWAYS need to compare with raw ptr + shared_ptr, to understand the usage context, motivations and justifications

http://www.stroustrup.com/C++11FAQ.html#std-weak_ptr is concise.

— Based on the chapter in [[effModernC++]]:

#1 feature — detect dangle

  • use case — a subject that keeps track of its observers who might become dangling pointers
  • use case — objects A and B pointing to each other with ref count … leading to island. Using raw pointers exclusively is possible but requires explicit deletion, as pointed out on P 84 [[Josuttis]]
  • In both use cases, Raw ptr won’t work since dangle becomes unnoticed.
  • Achilles’ heel of the #1 feature — manual “delete” on the raw ptr is beneath the radar of reference counting, and leads to chaos and subversion of ownership control, as illustrated —
#include <iostream>
#include <memory>
using namespace std;

void f1(){
  auto p = new int(55);
  shared_ptr<int> sp(p);
  weak_ptr<int> wp(sp);

  cout<<"expired()? "<<wp.expired()<<endl; // false
  cout<<"deleting from down below\n";
  delete p; // sp.reset();
  cout<<"expired()? "<<wp.expired()<<endl; // still false!
  // at end of this function, shared_ptr would double-delete (undefined behavior)
  // as the manual delete is beneath the radar of reference counting:(
}
int main(){
  f1();
}
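For contrast, a sketch of the well-behaved path, assuming C++11: the object is released through its owner (sp.reset()) rather than by a manual delete, so the weak_ptr does notice the dangle. weakPtrDemo is an invented name.

```cpp
#include <cassert>
#include <memory>

int weakPtrDemo() {
  std::shared_ptr<int> sp = std::make_shared<int>(55);
  std::weak_ptr<int> wp(sp);
  assert(!wp.expired());

  int seen = 0;
  if (std::shared_ptr<int> locked = wp.lock()) {  // promote to a temporary owner
    seen = *locked;
  }

  sp.reset();            // release through the owner, not via delete
  assert(wp.expired());  // the dangle is now detected
  assert(!wp.lock());    // lock() hands back an empty shared_ptr
  return seen;
}
```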

##[18] y dotnet hasn’t caught on #enterprise-focus

Don’t spend too much time on this unless you are considering taking up c#.

In 2012/13 I invested my time into c# just in case c# would become more widespread and high-paying. However, after 2013, I have not noticed this trend strengthening. If I have to guess why, here are some personal glimpses:

  • web-GUI — Rise of alternative user interfaces such as mobiles and javascript-based web UI, eating away at the strongest stronghold of C#. Nowadays, which new build chooses a fat client built in swing, javaFX, winforms, WPF or Silverlight? Fewer and fewer, to my surprise.
  • Linux — Maturity and further performance improvement of Linux machines compared to windows machines. For large server side apps, or cloud apps, I see more and more adoption of Linux.
  • Java@server-side — (in financial sector) Strengthening of java as the technology of choice on the server side. c# made a valiant but unsuccessful attempt to dethrone java in this game. Java may have limitations (do we know any?), but somehow the decision makers are not worried about them.
  • Open source — there’s continuing community effort to improve the java ecosystem (+ python, javascript, c++ ecosystems). C# is losing out IMO.
  • C++@windows — (My personal speculation) I feel c# offers higher performance than java on windows machines, but vc++ is even faster.

Among ibanks, the main reasons for the current c# situation are 1) Linux 2) web-GUI

— Some update based on Wallace chat.

I think 1) microsoft stack, 2) jvm stack and 3) web 2.0 stack all have vast ecosystems, but at enterprise level, only microsoft^jvm

How about server side c++ or python? much smaller mind-share.

Wallace said on Wall St (where he worked for 5+ years) he knows no ibank using c# on the back-end. Wallace believed Unix/Linux is the main reason.

How about outside ibanks? Wallace said c# back-end is not so rare.

##all c++ explicit casts #std::move included

  • — implicit casts
  • cvctor — is the most important implicit cast
  • conversion operator member-functions
  • numerical type conversions specified by C standard

=== now the explicit casts

  • — C:
  • parentheses cast
  • — C++98 added four casts
  • dynamic_cast — usable on references and pointers.
  • const_cast
  • reinterpret_cast — usually on pointers
  • static_cast
  • — c++11
  • std::move() — only for references, not pointers
  • std::forward()
  • numerous type transformations in the <type_traits> header
  • — boost:
  • boost::polymorphic_cast? is similar to dynamic_cast but much less popular and not essential knowledge expected of a c++ veteran
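A compact sketch touching several of the explicit casts listed above, assuming C++11. Shape, Circle and Square are placeholder classes invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Shape { virtual ~Shape() {} };
struct Circle : Shape { int radius = 3; };
struct Square : Shape {};

int castDemo() {
  Circle c;
  Shape* s = &c;

  // dynamic_cast on a pointer yields nullptr when the runtime type does not match
  Circle* asCircle = dynamic_cast<Circle*>(s);
  Square* asSquare = dynamic_cast<Square*>(s);
  assert(asCircle != nullptr && asSquare == nullptr);

  // static_cast: explicit numeric conversion, truncates toward zero
  int truncated = static_cast<int>(3.9);

  // const_cast: strips const; writing is safe only because n itself is non-const
  int n = 1;
  const int& cref = n;
  const_cast<int&>(cref) = 2;

  // std::move: an unconditional cast to rvalue reference, enabling the move ctor
  std::string src = "abc";
  std::string dst = std::move(src);
  assert(dst == "abc");

  return asCircle->radius + truncated + n;  // 3 + 3 + 2
}
```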

 

## G5 move-only types #std::atomic

[1] P106 [[effModernC++]]

enable_shared_from_this #CRTP

  • std::enable_shared_from_this is a base class template
  • shared_from_this() is the member function provided
  • CRTP is needed when you derive from this base class.
  • underlying problem (target of this solution) — buggy design where two separate “clubs” centered around the same raw ptr. Each club thinks it owns the raw ptr and would destroy it independently.
  • usage example —
    • your host class is already managed via a shared_ptr
    • your instance method needs to create new shared_ptr objects from “this”
    • Note if you only need to access “this” you won’t need the complexity here.
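The usage example above can be sketched like this, assuming C++11. Session and its self() method are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// CRTP: the derived class passes itself as the template argument
struct Session : std::enable_shared_from_this<Session> {
  std::shared_ptr<Session> self() {
    // Joins the existing "club" instead of starting a second control block.
    // The buggy alternative would be: return std::shared_ptr<Session>(this);
    return shared_from_this();
  }
};

long sharedFromThisDemo() {
  // The host object must already be managed by a shared_ptr
  std::shared_ptr<Session> owner = std::make_shared<Session>();
  std::shared_ptr<Session> another = owner->self();
  assert(another.get() == owner.get());
  return owner.use_count();  // both handles share one reference count
}
```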

Pimco asked it in 2017!

both base classes “export” conflicting typedef #MI

Consider class Der: public A, public B{};

If both A and B expose a public member typedef for Ptr, then Der::Ptr will be ambiguous. The compiler error message will explicitly highlight A::Ptr and B::Ptr as “candidates”!

Solution — inside Der, declare

typedef B::Ptr Ptr; //to exclude the A::Ptr typedef
// This solution works even if B is a CRTP base class like

class Der: public A, public B{
  typedef B::Ptr Ptr;
};
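A self-contained version of the same idea that can be compiled as is. The concrete typedef targets (int* and double*) and the Ambiguous class are assumptions added purely for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct A { typedef int* Ptr; };
struct B { typedef double* Ptr; };

struct Ambiguous : A, B {};
// Ambiguous::Ptr p;  // would not compile: A::Ptr and B::Ptr are both candidates

struct Der : A, B {
  typedef B::Ptr Ptr;  // a name declared in Der hides the same name in both bases
};

bool typedefDemo() {
  static_assert(std::is_same<Der::Ptr, B::Ptr>::value, "Der::Ptr resolves to B::Ptr");
  double d = 1.5;
  Der::Ptr p = &d;
  assert(*p == 1.5);
  return std::is_same<Der::Ptr, double*>::value;
}
```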

Q: passive income ⇒ reduce GTD pressure#positive stress

My (growing) Passive income does reduce cash flow pressure… but it has no effect so far on my work GTD pressure.

Q: Anything more effective more practical?

  1. take more frequent unpaid leaves, to exercise, blog or visit family
  2. expensive gym membership

How about a lower salary job (key: low caliber team)? No I still want some challenge some engagement, some uphill, some positive stress.

Docker+java9 cpu isolation/affinity #2006

https://jaxenter.com/nobody-puts-java-container-139373.html is a 2018 article with some concrete examples demonstrating cpu isolation.

A Docker cgroup can specify a cpu-set (like core0 + core3 + core14) and limit itself to this cpu-set. Performance Motivation — preventing a process from hopping between cores.

The “cpu-set” scheme provides conceptually simpler cpu isolation, but less popular than the “cpu-share” scheme.

Java9 offers support for cpu isolation if you adopt the cpu-set scheme but not the cpu-share scheme, as explained succinctly in the article.

A historical note — In 2006 (Mansion/Strategem) I spoke to a Sun Microsystems consultant. An individual Solaris “zone” can specify which cpu core to use. This was my first encounter with CPU isolation/affinity.

const vector{string} won’t let you modify strings #ChengShi

Point #1: a const vector only gives out const references. In this case, reference to const string objects.

My friend ChengShi said he has never seen const vector<const string>, because const vector<string> is good.

Point #2: In fact, my experiment of q[vector<const string>] can’t compile. Why?

https://stackoverflow.com/questions/17313062/vector-of-const-objects-giving-compile-error explains that in vector<T>, T must be assignable!
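Point #1 can be checked at compile time. A sketch assuming C++11; the vector contents and the name constVectorDemo are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

std::size_t constVectorDemo() {
  const std::vector<std::string> names = {"ab", "cde"};

  // operator[] on a const vector yields a reference to const string
  static_assert(std::is_same<decltype(names[0]), const std::string&>::value,
                "const vector hands out const references");

  // names[0] = "x";        // would not compile: assignment through a const reference
  // names[0].append("x");  // would not compile: non-const member on a const string

  // read-only access is fine
  return names[0].size() + names[1].size();
}
```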