##5 understandable-yet-useful type traits for TMP

type_traits.h defines too many constructs but only a few are easy to understand and unambiguous:

  1. enable_if
  2. is_same
  3. conditional — https://en.cppreference.com/w/cpp/types/conditional. Not necessarily widely used but my favorite
  4. is_base_of — https://en.cppreference.com/w/cpp/types/is_base_of
  5. is_polymorphic — https://en.cppreference.com/w/cpp/types/is_polymorphic about virtual function
  6. — not so understandable
  7. is_pointer — ambiguous. I think it only knows about regular ptr and function ptr
  8. is_class and is_scalar — possibly useful to check T is a class vs a “simple” type
  9. — understandable but not so valuable
  10. is_abstract — having pure virtual function
  11. — neither very understandable nor highly valuable

I guess all of these type traits are class templates. The way to access the result is the (boolean) ::value member typedef usually, though enable_if_t evaluates to a type.

Advertisements

your brain power^brain power invested ] local system

Background – “how fast you figure things out relative to your peers”.

For each team member AA, the struggle is the same — AA’s brain power vs the cumulative brain power that has gone into the local system which measures the complexity. If the local system complexity is too high then AA would struggle and take a long time (before he gives up).

The “local system” could include firmwide frameworks, or something open-source.

I prefer a local system created by low site-specific brain power, like one with standard SQL/stored-procs, standard noSQL, standard data encoding (FIX, Json..), standard java/c++ libraries.

  • RTS and OC – relatively small amount of site-specific brain power in the system.
  • PWM comm – actually small amount of local system complexity but time given is too short
  • Barc – brand new codebase .. low site-specific brain power.
  • Quartz — the worst

find min substring containing(but not limited to)all my chars

Q (leetcode): Given a string Haystack and a string T, find the minimum window in Haystack which contains (at least) all the characters in T according to the frequencies. Time complexity O(n). Eg: minWindow(ccbabccbabcb, bbc)==bcb

If there is such a window, you are guaranteed that there will always be only one unique minimum window in Haystack. <– I thought this guarantee means something but it doesn’t.

Without loss of generality, I will assume the chars are a-z. I believe those Leetcode corner cases will use only 3 chars

—analysis—

For single-string problem, use array indexed by ascii code. I can convert T to such an array to store the required frequencies (reqFrq)

I can construct a shadow array, same length as Haystack with these payloads:

  • if the hay is not in reqFrq, then payload is a special value like nullptr
  • if the hay is in reqFrq, then….?

–SolSW: sliding-window based

  1. Scan Haystack from left and keep count of actual frequency (check against reqFrq each time). I will inevitably find the earliest good window. By construction, both ends of this window are in reqFrq.
    • Note the entire haystack is more than a good window.
  2. Now I slide the fixed-sized window. If I find another good window, with extra chars on the left, then I have found a shorter window, so I truncate my window on the left
  3. continue Step 2

reference(instead of ptr) to smart ptr instance

I usually pass smart pointers by value (copy-constructor or move-constructor), just like copying a raw ptr.  Therefore the code below looks unnatural:

unique_ptr<Trade> & ref2smartPtr

Actually it is rather common because

  • As Herb Sutter suggested, when we need to put pointer into containers, we should avoid raw ptr. Unique ptr is the default choice, and the first choice, followed by shared_ptr
  • I often use unique_ptr as map value . The operator[] return type is a reference to the value type i.e. reference to unque_ptr
  • I may also put unique_ptr into a vector…. ditto for vector operator[]

limit-IOC ^ market-IOC

Limit IOC (Immediate-or-Cancel): Can be used for FX Spot and CFD.

An instruction to fill as much of an order as possible within pre-defined tolerances of a limit price, immediately (5 second Time-to-Live).

Unlike Market IOC orders, Limit IOC orders allow a Client to control the maximum slippage that they are willing to accept.

Under normal market conditions a Market IOC order will be filled in full immediately. In the event that it isn’t, any residual amount will be cancelled. Price Tolerance cannot be added on a Market IOC order, meaning that a client cannot control slippage.

thread^process: lesser-known differences #IV

Popular IV question. Largely a QQ question.  Some may consider it zbs.

To the kernel, there are man similarities between the “thread” construct vs the “process” construct. In fact, a (non-kernel) thread is often referenced as a LightWeightProcess in many kernels such as Solaris and Linux.

  • context switching — is faster between threads than between processes. In linux, context switching between kernel-threads is even faster.
  • creation — some thread libraries can create threads without the kernel knowing. No such thing for a process.
  • socket — 2 threads in a process can access the same socket; two processes usually can’t access the same socket, unless … parent-child. See post on fork()
  • memory — thread AA can access all heap objects, and even Thread BB’s stack objects via pointers. Two processes can’t share these, except via shared memory.
  • a non-kernel thread can never exist without an owner process. In contrast, every process always has a parent process which could be long gone.

 

CVA c++ IV 2 #oq

void func(ResourceMgr & rm){ 
  int connId = rm.getConn();
  double d=externalFunc(connId); 
  cout<<d<<endl;
  rm.reclaim(connId); //release the resource to the mgr
}
  • Q5: In the code above, external func can throw, so write a RAII wrapper to prevent resource leak.
  • Q5b: what if you aren’t allowed to use RAII? Ok you said catch-all. Is an empty catch block enough?
    AA: need to re-throw the original exception, to mimic the RAII behavior.
  • Q(paper coding): Iterate over two vectors in lock steps. Which is faster — iterator vs vector index?
  • Q (bond math): Is there any uncertainty in the present value calc on an pre-existing vanilla IRS?
  • q: what design patterns do you use?
  • q: write a singleton skeleton
  • Q: how do we make a class “final” in pre-c++11
    %%A: either make dtor private or make all constructors private
  • Q: is shared_ptr thread safe?
  • Q: any difference — const shared_ptr<T> vs shared_ptr<const T>?
  • %%Q: does it make sense to pass a shared_ptr by const reference? I feel it’s cleaner to pass by raw ptr
    AA!: pass by const-reference is faster and recommended

–Zheng

  • Q: why use weak_ptr instead of raw ptr. Valid question.
    A: See [[std c++lib]] P84.
    A: See last paragraph in https://bintanvictor.wordpress.com/2010/01/20/weak_ptr-can-access-inner-ptr-only-through-a-shared_ptr/
  • Q: you said atomic<int> operator++ probably wraps a CAS in a while loop internally, so is it spinlock?
    %%A: I think so.   http://www.cs.cornell.edu/courses/cs4410/2015su/lectures/lec06-spin.html is a detailed explanation
  • Q34: various synchronization techniques?
  • Q34b: what’s the c++ function to force a memory barrier?
    A: std::atomic_thread_fence(),
    but i feel if an application (not a library) uses this function then something is seriously wrong.
  • Q: can we implement a mutex using CAS?
    AA: blockingMutex implementation ] kernel
    AA: spinlock, not blockingMutex, can be implemented as in   https://www.andrew.cmu.edu/course/15-440-sp09/applications/ln/lecture4.html
  • ? Q (paper coding): write a piecewise linear interpolation Curve class with a ctor taking a vector<pair<double, double>>.  See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/curveInterpolation_CVA.cpp
  • Q: What are r-values and r-value references
  • Q: Explain your use of SFINAE. See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/template/SFINAE_demo1.cpp
  • Q: what’s the c++ library function for cmpxch?
    A: atomic::compare_exchange_weak()
  • Q: is your barcap vol surface used for EOD risk only?
    %%A: no. A trader might suspect a lesser known product is currently mis-priced. She could confirm the live bid/ask are obvious outliers on the fitted surface.
    %%A: a dealer desk price new deals based on the fitted surface, whether or not the strike/expiry requires interpolation.

–Simon Ma:

  • Q (paper coding): implement Fib() recursively and iteratively
  • Q: what’s inline?
  • Q: semaphore vs mutex
  • Q: how did you compute greeks like gamma?
  • Q (bond math): given 6M spot IR = 5% and 12M = 10%, what’s the 6M rate 6M forward?
  • Q: Assign first half of a vector to another vector
    %%A: vector::assign() probably takes two iterators so I can pass v.begin(), v.begin()+5
  • Q (obscure): What data structure to represent a directed graph of N nodes? See NxN matrix for graph of N nodes
    %%A: create a Node class with a single ptr field…
  • Q: use parallel algorithm to compute sum of a vector<int>
    %%A: only the last stage global aggregation has a shared mutable that needs protection. The sub-aggregation can be single-threaded.
  • Q (probability problem): Both Alan and Bob agree to show up at Cafe9 sometime between 12 and 2pm. Whoever arrives first would wait for 15 minutes only. What’s the probability of them actually meeting
  • Q: what’s thread_local?
    %%A (correct): a new storage class that’s similar to static
  • Q (paper coding): A natural number is Good if it is a product of only 3, 5 and 7. The smallest Good numbers are 1,3,5,7,9,15,21,25,27,35… Write a print(int N) to print the Nth Good number. Hint: write a isGood(int k). See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/isGood357_CVA.cpp
  • Q (unclear): implement subtract(int a, int b) using only operator+ and comparison operators. I feel this question is unclear. Can we use bitwise? Can we multiply?
    %%A: just try all the integers (i) one by one a == b+i
  • Q (obscure): what operators can’t be overloaded?
    %%A: q(?:) correct
    AA: address-of CAN be overloaded!

–Mikhail the mgr

  • Q: inserting 1000,000 items into a list vs a vector without reserve()
    A: vector wins
  • Q: do you define your exception classes?
  • Q: in what context is a deque completely unusable whereas vector is perfectly fine?
    A: a C function taking an array (i.e. a ptr + a count). Vector::data() is designed for this. In Pre-c++11, we can pass &v[0] and v.size()
    A: if the code takes one element and uses pointer increment to access other elements. Deque would fail at segment boundaries.
  • Q89 (coding question): given a vector of T [passed in by const ref], write a template function to return [by const reference] the minimum element.
  • Q89b: how do you handle an empty vector?
  • ? Q89c (obscure): given q(vector const & vec), what can you say about q{auto it = vec.begin()} versus q{vector::const_iterator it=vec.begin()}

y FIX needs session seqNo over TCP seqNo #reset

My friend Alan said … Suppose your FIX process crashed or lost power, reloads (from disk) the last sequence received and reconnects (resetting tcp seq#). It would then receive a live seq # higher than expected. CME documentation states:

… a given system, upon detecting a higher than expected message sequence number from its counterparty, requests a range of ordered messages resent from the counterparty.

Major difference from TCP sequence number — FIX has no Ack. See Ack in FIX^TCP

— Sequence number reset policy:

After a logout, sequence numbers is supposed to reset to 1, but if connection is terminated ‘non-gracefully’ sequence numbers will continue when the session is restored. In fact a lot of service providers (eg: Trax) never reset sequence numbers during the day. There are also some, who reset sequence numbers once per week, regardless of logout.

IP fragmentation #MTU,offset

A Trex interviewer said something questionable. I said fragmentation is done at IP layer and he said yes but not reassembly.

I was talking about IP layer breaking up , say, a 4KB packet (TCP or UDP packet) into three IP-fragments no bigger than 1500B [1]. The reassembly task is to put all 3 fragments back together in sequence (and detect missing fragments) and hand it over to TCP or UDP.

This reassembly is done in IP layer. IP uses an “offset” number in each fragment to identify the sequencing and to detect missing fragments. The fragment with the highest offset also has a flag indicating it’s the last fragment of a given /logical/ packet.

Therefore, IP detects and will never deliver partial packets to UDP/TCP (P328 [[computer networking]]), even though IP is considered an unreliable service.

[1] MTU for some hardware is lower than 1500 Bytes …

addicted to c++for()loop@@

I find myself thinking in c++ while writing python for loops. Inefficient.

for i in range(0,len(mystrLiTup),1): # explicit is best
  mystrLiTup[i]...

What if you need to adjust the index “i” in the loop? A transitional construct (until we get rid of c++ thinking):

mystrLiTup='sentence. with. periods. end'
i=-1
while True:
  i+=1 # increment explicitly
  if i >= len(mystrLiTup): break
  print mystrLiTup[i]
  if mystrLiTup[i] == '.':
    i+=1 

Gayle: no rush2write code: point allocation

Gayle strongly advocates to get a brute-force solution as soon as possible, and walk through the “idea” in detail, before coding. However, I’d rather start coding up my idea much earlier since I need more time coding.

I guess Gayle’s experience shows that some interviewers are less worried about runnable working code than the thought process. For a tricky algo, if the idea is sound, then you basically pass….? See with algo cracked,ECT=雕虫小技

However, I feel an elaborated/detailed “idea” (item 2 below) alone is not enough to pass. I frequently run out of time implementing things like https://www.geeksforgeeks.org/print-a-given-matrix-in-spiral-form/. The algorithm is simple and not worth 40% but I was too slow completing a basic implementation. “Done is better than perfect”. Basic implementation is better than no implementation. You need to demonstrate you can implement a working solution.

Here’s my estimate of the point distribution in a typical whiteboard interview:

  1. 10(% clear understanding of requirements + clarifying questions
  2. 35(50)% detailed idea (even unimplemented) with decent performance enhancement
  3. 35(20-40)% basically working implementation.
  4. 5% performance analysis — can be tricky
  5. 5% readability — modularized, clean
  6. 5% eyeball test — can be hard to get right if code is convoluted with many variables
  7. 5% minor bugs, corner cases

bucketSort: O(N+k) could be better or worse than O(N)

Based on the top answer in https://stackoverflow.com/questions/7311415/how-is-the-complexity-of-bucket-sort-is-onk-if-we-implement-buckets-using-lin

It’s obvious (by definition) that O(2N) is equivalent to O(N).

O(N+k) is worse than O(2N) when k is larger, like 4000 times N. For a concrete illustration, say, we have N=5 million strings to sort, using k = 20 billion buckets.

  • It takes constant time to bucket each string, so the first step takes O(N) i.e. grows proportional to N.
  • 2nd step is now dominated by k, since all 20 billion buckets have to be examined. Time complexity is O(N+k). This is explained in the stackoverflow answer.
  • Therefore, total complexity is O(N+k) i.e. cN + bk with two constant factors c and b. Even if b is small (complexity of looking at an empty bucket), as k grows, at sometime the k term would overtake N term and become dominant factor.

In practice, k is often chosen to be a smaller integer than N, so I don’t think this is a practical issue.

ValueType default ctor needed in std::map[]

Gotcha — Suppose you define a custom Value type for your std::map and you call mymap[key] = Value(…). There’s an implicit call to the no-arg ctor of Value class!

If you don’t have a no-arg ctor, then avoid operator[](). Use insert(make_pair()) and find(key) methods.

Paradox — for both unordered_map and map, a lookup HIT also requires a no-arg ctro to be available. Reason is,

  • since your code uses map operator[], compiler will “see” the template member function operator[]. This function explicitly calls the ctor within a if-block.
    • If you have any lookup miss, this ctor is called at run time
    • If you have no lookup miss, this ctor is never called at run time
    • compiler doesn’t know if lookup miss/hit may happen in the future, so both branches are compiled into a.out
  • If your source code doesn’t use operator[], then at compile time, the template function operator[] is not included in a.out, so you don’t need the default ctor.

 

##5 algoIV constructs beyond dataStruct : xLang

  1. elegant, legit simplifications
  2. Judicious use of global -vs- argument -vs- local variables
  3. iterator and  Pointer in node/VO class
  4. Construct tree, hashmap or Sorted array as auxiliary, or for memoisation
  5. Recursion
  6. sentinel node trick: array/slists

The tools below are more specialized and less widely relevant

  • Graph traversal, recursive or iterative..
  • Permutations etc, recursive or iterative ..
  • Std::lower_bound etc
  • sorting

The topics below are Not really for a coding test:

  • DP
  • concurrency

extern^static on var^functions

[1] https://stackoverflow.com/questions/14742664/c-static-local-function-vs-global-function confirmed my understanding that static local function is designed to allow other files to define different functions under this name.

extern var file-scope static var static func extern-C func
purpose 1 single object shared across files [1] applies. 99% sure [1] name mangling
purpose 2 private variable private function
alternative 1 static field anon namespace no alternative
alternative 2 singleton
advantage 1 won’t get used by mistake from other file
disadv 1 “extern” is disabled if you provide an initializer no risk. very simple
disadv 2
put in shared header? be careful  should be fine  not sure

conflation: design question

I have hit this same question twice — Q: in a streaming price feed, you get IBM prices in the queue but you don’t want consumer thread AA to use “outdated” prices. Consumer BB needs a full history of the prices.

I see two conflicting requirements by the interviewer. I will point out to the interviewer this conflict.

I see two channels — in-band + out-of-band needed.

  1. in-band only — if full tick history is important, then the consumers have to /process/ every tick, even if outdated. We can have dedicated systems just to record ticks, with latency. For example, Rebus receives every tick, saves it and sends it out without conflation.
  2. dual-band — If your algo engine needs to catch opportunities at minimal latency, then it can’t afford to care about history. It must ignore history. I will focus on this requirement.
  3. in-band only — Combining the two, if your super-algo-engine needs to analyze tick-by-tick history and also react to the opportunities, then the “producer” thread alone has to do all work till order transmission, but I don’t know if it can be fast enough. In general, the fastest data processing system is single-threaded without queues and minimal interaction with other data stores. Since the producer thread is also the consumer thread for the same message, there’s no conflation. Every tick is consumed! I am not sure about the scalability of this synchronous design. FIFO Queue implies latency. Anyway, I will not talk further about this stringent “combo” requirement.

https://tabbforum.com/opinions/managing-6-million-messages-per-second?print_preview=true&single=true says “Many firms mitigate the data they consume through the use of simple time conflation. These firms throw data on the floor based solely on the time that data arrived.”

In the Wells interview, I proposed a two-channel design. The producer simply updates a “notice board” with latest prices for each of 999 tickers. Registered consumers get notified out-of-band to re-read the notice board[1], on some messaging thread. Async design has a latency. I don’t know how tolerable that is. I feel async and MOM are popular and tolerable in algo trading. I should check my book [[all about HFT]]…

In-band only — However, the HSBC manager (Brian?) seems to imply that for minimum latency, the socket reader thread must run the algo all the way and send order out to exchange in one big function.

Out-of-band only — two market-leading investment bank gateways actually publish periodic updates regardless how many raw input messages hit it. Not event-driven and not monitoring every tick!

  • Lehman eq options real time vol publisher
  • BofA Stirt Sprite publishes short-term yield curves on the G10 currencies.

[1] The notification should not contain price numbers. Doing so defeats conflation and brings us back to a FIFO design.

tcp: one of 3 client-receivers is too slow

See also no overflow]TCP slow receiver #non-blocking sender

This is a real BGC interview question https://bintanvictor.wordpress.com/2017/04/08/bgc-iv-c-and-java/

Q: server is sending data fast. One of the clients (AA) is too slow.

Background — there will be 3 worker-sockets. The local address:port will look identical among them if the 3 clients connect to the same network interface, from the same network segment.

The set-up is described in simultaneous send to 2 tcp clients #mimic multicast

Note every worker socket for every client has identical local port.

See https://stackoverflow.com/questions/1997691/what-happens-when-tcp-udp-server-is-publishing-faster-than-client-is-consuming

I believe the AA connection/session/thread will be stagnant. At a certain point [1] server will have to remove the (mounting) data queue and release memory — data loss for the AA client.

[1] can happen within seconds for a fast data feed.

I also feel this set-up overloads the server. A TCP server has to maintain state for each laggard client, assuming single-threaded multiplexing(?). If each client takes a dedicated thread then server gets even higher load.

Are 5 client Connections using 5 sockets on server? I think so. Can a single thread multiplex them? I think so.

no overflow]TCP slow receiver #non-blocking sender

Q: Does TCP receiver ever overflow due to a fast sender?

A: See http://www.mathcs.emory.edu/~cheung/Courses/455/Syllabus/7-transport/flow-control.html

A: should not. When the receiver buffer is full, the receiver sends AdvertizedWindowSize to informs the sender. If sender app ignores it and continues to send, then sent data will remain in the send buffer and not sent over the wire. Soon the send buffer will fill up and send() will block. On a non-blocking TCP socket, send() returns with error only when it can’t send a single byte. (UDP is different.)

Non-block send/receive operations either complete the job or returns an error.

Q: Do they ever return with part of the data processed?
A: Yes they return the number of bytes transferred. Partial transfer is considered “completed”.

 

UDP/TCP socket read buffer size: can be 256MB

For my UDP socket, I use 64MB.
For my TCP socket, I use 64MB too!

These are large values and required kernel turning. In my linux server, /etc/sysctl.conf shows these permissible read buffer sizes:

net.core.rmem_max = 268435456 # —–> 256 MB
net.ipv4.tcp_rmem = 4096   10179648   268435456 # —–> 256 MB

Note a read buffer of any socket is always maintained by the kernel and can be shared across processes [1]. In my mind, the TCP/UDP code using these buffers is kernel code, like hotel service. Application code is like hotel guests.

[1] Process A will use its file descriptor 3 for this socket, while Process B will use its file descriptor 5 for this socket.

jmp_buf/setjmp() basics for IV #ANSI-C

Q: how is longjmp different from goto? See http://ecomputernotes.com/what-is-c/function-a-pointer/what-is-the-difference-between-goto-and-longjmp-and-setjmp

A: longjmp can 1) jump across functions, and 2) restore program state from a jmp_buf, which was saved earlier by setjmp.

defining+using your swap() #gotcha

Gotcha! If you define your own swap() then it may not get picked up depending on some subtle factors. In the demo below, when the args are declared as “int” variables, then the hidden std::swap() gets picked up instead of your custom swap(size_t, size_t)!

Note there’s no specific #include required.

  • Solution : if practical, avoid “using namespace std” even in cpp files
  • solution : except the outermost main(), enclose everything  in an anonymous namespace, to force the unqualified swap() to resolve to your custom version.
#include <iostream>
#include <assert.h>

//namespace{

using namespace std;

int value1 = 5;
int calls=0;
template <typename T> void swap(size_t a, size_t b){
        ++calls;
        std::cout<<"in my swap()"<<std::endl;
}

int mymain(){
  int a=0, b=1;
  int oldCount = calls;
  swap<int>(a,b); //int arguments won't invoke my swap()
  assert (calls == oldCount);

  std::cout<<"after 1st call to swap()"<<std::endl;
  swap<int>(0,1); //calls my swap()
  assert (calls == 1+oldCount);

  std::swap<int>(a,b);  //can compile even without any #include
  // no return required
}

//}

int main(){
  mymain();
}

(Revisit) senior manager IV: what they watch out for

Common, non-trivial, perhaps hidden issues in a candidate, ranked:

  • twist and turn and not /candid/ about his past
  • not listening
    • jumping to conclusions
    • assuming too much
    • not waiting for a long-winded interviewer to finish, perhaps interrupting
    • jumping to the defensive, when deliberately provoked
  • desperate for the job and losing his perspective
  • opinionated
  • choosy on projects
  • conceited beyond “cool confidence”
  • —–secondary
  • impulsive answers ^ elusive, guarded answers
    • elusive answers? common
  • bad-mouth past managers
  • lack of humor? common
  • tensed up and not cool and calm, perhaps prone to workplace stress
  • exaggeration about his past? common
  • not articulate enough? common
  • poor eye contact? common

find anagram groups]file #1-pass, FB

Requirement: if any 2 words are anagrams, that’s a group. Within a file, identify all such groups.

Normalize each word using counting sort on the characters of the word. There’s no competing alternative in this isolated part of the solution. Main part of the solution is sorting all words based on the normalized version. There are many alternative solutions to this sort.

desirable — memory efficiency. create no or few objects besides the strings read in from the file.

desirable — avoid the need to create/allocate linkNode or wrapper objects around original words. In c++, such a wrapper is smaller than in java, probably a pointer to the original word + a link pointer

Solution 1 — insert into a sorted set, using a customized comparison (subclass binary_function) based on the normalized string and something unique, such as line number or object address. After all the inserts, when you iterate the set, an anagram group would show up in a row. We can also keep a separate array or iterators (pointers) that point to the first word in each group — emulating a multimap.

Without the something unique, the set would drop words.

I think insertion is inefficient since a normalized string is regenerated for the same word over and over. To overcome this, we could create a wrapper object with 2 fields — normalized string + a pointer to the original word. However, if “SAT” is a very common normalized string then we create multiple copies of it.

Solution 2 — separate chaining. Maintain a hashed map (faster) or sorted map keyed by the normalized strings, just a single copy for each normalized string. Value of each map entry is a linked list of the original words.

#include <iostream>
#include <iomanip>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <map>
#include <algorithm>
using namespace std;

map<vector<char>, size_t> m;
template<typename T> ostream & operator<<(ostream & out , vector<T> const & v){
        for(int i=0; i<v.size(); ++i) out<<setw(1)<<v[i];
        out<<" ";
}
int main(){
  ifstream fs("words.txt");
  string line, word;
  while(getline(fs, line)){
     stringstream ss(line);
     while(getline(ss, word, ' ')){
        //trim
        int endpos = word.find_last_not_of(" ");
        if (endpos < string::npos) word = word.substr(0, endpos+1);
        if (word.empty()) continue;
        //lower case
        transform(word.begin(), word.end(), word.begin(), ::tolower);
        //sort
        vector<char> sorted(word.begin(), word.end()); sort(sorted.begin(), sorted.end());

        int count = ++m[sorted]; //1st time we get 1 🙂
        cout<<sorted<<"\t<- "<<word<<" :" <<count<<endl;
     }
  }
}

 

c# static classes : java/c++

–c++:

use a (possibly nested) namespace to group related free functions. See google style guide.

–java:

Java 8 allows static methods in interfaces. See https://stackoverflow.com/questions/512877/why-cant-i-define-a-static-method-in-a-java-interface

–c# is the most avant-garde on this front

  • C# static class can be stateful but rarely are
  • it can have a private ctor

multiple hits: lower_bound gives earliest

Looking for lower_bound (2) in {0,1,2,2,3,4}, you get the earliest perfect hit among many, i.e. the left-most “2”.

No such complexity in upper_bound since upper_bound never returns the perfect hit.

No such complexity in set.lower_bound since it won’t hold duplicates.

int main(){
  vector<int> s{0,1,2,2,3,4};
  vector<int>::iterator it = lower_bound(s.begin(), s.end(), 2);
  cout<<"to my left: "<<*(it-1)<<endl;
  cout<<"to my right: "<<*(it+1)<<endl;
  cout<<"to my right's right: "<<*(it+2)<<endl;
}

lower_bound() may return end() #gotcha

lower_bound() may return end().

If your target value is too high and nothing qualifies, all 6 functions return the right end of the range. If you look at the (key value in) return value i.e. end-of-range,

  • This end-of-range node is a dummy. Never read its key value.
  • After lower_bound or upper_bound, always validate before reading the return value

I spent hours puzzled by the wrong data returned after lower_bound()->first. Specifically, if the map/set is keyed by integer, then the end()->first can be normal int value even when it fails and returns map.end()!

Consistent across 6 functions:

  • std::lower_bound
  • std::upper_bound
  • set/map methods

What if the target value is too low? Easier — upper bound should return left boundary iterator, and lower_bound returns the same iterator! See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/curveInterpolation_CVA.cpp

 

memset: a practical usage #Gregory

  • memset is a low-level C function.
  • memset takes a void pointer.
  • Fast and simple way to zero out an array of struct, having primitive data members. No std::string please. No ptr please. Use sizeof to get the byte count.
  • Useful in low level wire coding
// illustrates packed and memset
#include <iostream>
using namespace std;

struct A{
  unsigned int i1; //4 bytes
  bool b; //1 byte
  char cstr[2];
  int* ptr; //8 bytes
} __attribute__((packed));
size_t const cnt = 3;
A arr[cnt];
int main(){
  cout<<sizeof(A)<<endl;
  size_t sz = sizeof(arr);
  cout<<sz<<endl;
  memset(arr, 0, sz);
  for(size_t i=0; i<cnt; ++i){
    A* tmp = &arr[i];
    cout<<"i1 = "<<tmp->i1<<"; b = "<<tmp->b<<" ; cstr[1] = "<<(int)tmp->cstr[1]<<" ptr = "<<tmp->ptr<<endl;
  }
}

demo: static method declare/define separately n inherited

Low complexity in this demo, but if you lack a firm grip on the important details here, they will add to the complexity in a bigger code base.

  • When subclass is compiled, compiler complains about undefined sf() since it sees only the declaration. You need “g++ der.cpp base.cpp”.
  • Note the static method is inherited automatically, so you could call der::sf().
#include <iostream>
struct base{
  static void sf();
};
///////// end of header /////////
#include "base.h"
using namespace std;
void base::sf(){ // no "static" please
  cout<<"sf"<<endl;
}
///////// end of base class /////////
#include "base.h"
using namespace std;
struct der: public base{};
int main(){
  der::sf();
}
///////// end of sub class /////////

SFINAE #sizeof#ptr2member as template param

https://jguegant.github.io/blogs/tech/sfinae-introduction.html is relatively simple, concise. Shows how to test T has method1()

https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/SFINAE is shorter and uses the same sizeof trick.

https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Member_Detector is another illustration

–all 3 resource above use sizeof and function template (not class template) —

https://github.com/tiger40490/repo1/blob/cpp1/cpp/template/includes useful demo of my own code in production, powering the nyse+nyseAmerican real time market data parsers behind most of the biggest financial data sites.

When the compiler evaluates sizeof, which is a compile-time task, it would try one of the 3 func() overloads and check the winner’s return type[1] . Always exactly one of the 3 overloads can compile.

When T is AddRefreshOrderStruct, the compiler tries 1st overload, where AddRefreshOrderStruct needs a long field, and AddRefreshOrderStruct needs a sendTimeNS field. Pass! So the return type is int.

When T is EmptyStruct, the compiler tries the 1st overload, where EmptyStruct needs a long field … failed:( Only the last overload, the default overload, passes.

[1] the size of the return type is used to initialize the static const field!

The asterisk at end of the func declarations is needed as the func() argument will be NULL pointer. NULL pointer can match a pointer to any type.

Breadth-first-traversal, height-aware

BST visits root level (one node only), and 2nd generation, and 3rd generation etc. The BST below is aware of the height or “generation” currently visited.

Latest: https://github.com/tiger40490/repo1/blob/cpp1/cpp1/binTree/BST_show_level.cpp

struct Node {
    int data;
    Node *left, *right, *next;
    Node(int x, Node * le = NULL, Node * ri = NULL) : data(x), left(le), right(ri), next(NULL) {}
};
    Node _15(15);
    Node _14(14);
    Node _13(13);
    Node _12(12);
    Node _11(11);
    Node _10(10);
    Node _9(9);
    Node _8(8);
    Node _7(7, &_14, &_15);
    Node _6(6, NULL, &_13);
    Node _5(5, &_10, NULL);
    Node _4(4, NULL, &_9);
    Node _3(3, &_6,  &_7);
    Node _2(2, &_4,  &_5);
    Node root(1, &_2, &_3);
    Node marker(-1); //to be added to queue to mark new level

Node * dequeue(queue<Node*> & q){
    assert(!q.empty());
    Node * n = q.front();
    q.pop();
    return n;
}
void BFT(){
  queue<Node *> q;
  size_t height=0;

  for( q.push(&marker), q.push(&root); !q.empty();){
    Node * n = dequeue(q);
    if (n == &marker){
        if (q.empty()){
                cout<<"  ^^^^^^^^^^"<<endl;
                break;
        }
        n = dequeue(q);
        q.push(&marker); //move the marker to end of queue
        cout<<"starting height # "<<++height<<endl;
    }
    int dataL = 0, dataR=0;
    if (n->left){
        cout<<"pushing "<<n->left->data<<endl; q.push(n->left);
        dataL = n->left->data;
    }
    if (n->right){
        cout<<"pushing "<<n->right->data<<endl; q.push(n->right);
        dataR = n->right->data;
    }
    cout<<dataL<<" <- "<<n->data<<" -> "<<dataR<<endl;
// "  #  next is "<<dataN<<endl;
  }
}
int main() {
    BFT();
}

memory leak demo; memset(); std::string

valgrind proves the leak.

using namespace std;
size_t const len=15;
struct A{
        string s4;
        A(string s="default std::string"): s4(s){ cout<<"ctor"<<endl; }
};
size_t const  cnt=2;
size_t siz=cnt * sizeof(A);
A arr[cnt], ar2[cnt];

char fname[] = "/tmp/,.dat";
void leakDemo1() {
        {
                arr[0]=A("grin");
                arr[1]=A("frown"); //somehow skipping this causes core dump
        }
        cout<<"before write()"<<endl;

        int fd = open(fname, O_CREAT | O_WRONLY, S_IRUSR | S_IWUSR);
        write(fd, arr, siz);
        close(fd);

        if(1){
                int fd2 = open(fname, O_RDONLY);
                read(fd2, ar2, siz);
                close(fd2);
        }
        for (int idx = 0; idx < cnt; ++idx){
                A * tmp = ar2 + idx;
                cout<<tmp->s4<<endl;
        }
}
void leakDemo2(){
        int * intp = new int(11); //not deleted. valgrind detected the leak
        memset(&intp, 0, 8); //overwrite the 8-byte pointer object
        delete intp;  //deleting on the fake address from memset
}
void leakDemo3(){
        string s="hello";
        cout<<"sie of string == size of pointer = " << sizeof(string)<<endl; //inside the string object, there's nothing but a poitner!
        memset(&s, 0, 1); // overwite pointer object itself, so it now points to somewhere else ... leak

        //somehow, memset(....,2) causes seg fault?
}
int main() {
        leakDemo1();
}

## stateful OMS class design: observations

Here’s a well-established and large-scale order manager class design. It handles millions of orders a day.

  • The entire process is restarted on every trading day. Before the restart, all pending orders are cancelled! The OM is probably a per-thread singleton in the process.
  • The OM stores all the orders for the day, including each closed order in case it needs cancellation.
  • The OM keeps all the partial executions (aka partial fills) for a given order, because each execution could be busted.
  • Each action on an order (such as validation, partial execution ..) is performed by a dedicated object. For 1000 orders, if there are 5 actions, then there would be 5000 distinct “action objects”. The OM has pointers to all of these action objects.
  • Most action objects are stateful. ALL action objects are persisted somewhere so as to support busting/cancellation.

 

std::vector-of-containers: initializer list

Typical example: If you heavily use a vector of map, it’s tempting to use a vector of pointers to maps. The java way.

If you drop the “pointers to”, then when you retrieve the map from the vector, you often get a copy, unless you save the return value in a reference variable

By the way, here’s an initializer for std::map:

vec.push_back(map<int, int>{{32,1}} );

thread-unsafe shared_ptr: tiny examples

##shared_ptr thr-safety: 3 cases is a “framework”.

http://www.boost.org/doc/libs/1_35_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety gave simple examples of unsafe access.

//— Example 3, where the shared mutable is a club member p3 ——————
// thread A
p = p3; // reads p3, writes p

// thread B
p3.reset(); // writes p3: Undefined, simultaneous read/write

I gave this example correctly to a Sep 2018 SCB interview 🙂

===> In the same clock cycle, p3 Object is accessed on 2 threads. This is nothing to do with the ref count, or the pointee object.

//— Example 4,  where the shared mutable is a club member p2 ——————
// thread A
p3 = refToP2; // reads p2, writes p3

// thread B
// p2 goes out of scope: Undefined, simultaneous read/write since the destructor is considered a “write access”

===> In the same clock cycle, p2 Object gets write-accessed.

##y MultiCast favored over TCP

Reason: data rate constraints inherent in TCP protocol. Congestion Control?
Reason: TCP to a large group would be one-by-one unicast, highly inefficient and too much load on the sender. Reason: TCP has more data-overhead in the form of non-payload data. * TCP header is typically 20 bytes vs 8 bytes for UDP
* Receiver need to acknowledge

socket accept() key points often missed

I have studied accept() many times but still unfamiliar.

Useful as zbs, and perhaps QQ, rarely for GTD…

Based on P95-97 [[tcp/ip socket in C]]

  • used in tcp only
  • used on server side only
  • usually called inside an endless loop
  • blocks most of the time, when there’s no incoming new connections. The existing clients don’t bother us as they communicate with the “child” sockets independently. The accept() “show” starts only upon a new incoming connection
    • thread remains blocked, starting from receiving the incoming until a newborn socket is fully Established.
    • at that juncture the new remote client is probably connected to the newborn socket, so the “parent thread[2]” have the opportunity/license to let-go and return from accept()
    • now, parent thread has the newborn socket, it needs to pass it to a child thread/process
    • after that, parent thread can go back into another blocking accept()
  • new born or other child sockets all share the same local port, not some random high port! Until now I still find this unbelievable. https://stackoverflow.com/questions/489036/how-does-the-socket-api-accept-function-work confirms it.
  • On a host with a single IP, 2 sister sockets would share the same local ip too, but luckily each socket structure has at least 4 [1] identifier keys — local ip:port / remote ip:port. So our 2 sister sockets are never identical twins.
  • [1] I omitted a 5th key — protocol as it’s a distraction from the key point.
  • [2] 2 variations — parent Thread or parent Process.

##minimum python know-how for cod`IV

Hi XR,

My friend Ashish Singh (in cc) said “For any coding tests with a free choice of language, I would always choose python”. I agree that perl and python are equally convenient, but python is the easiest languages to learn, perhaps even easier than javascript and php in my opinion. If you don’t already have a general-purpose scripting language as a handy tool, then consider python as a duct tape and Swiss army knife

(Actually linux automation still requires shell scripting. Perl and python are both useful additions.)

You can easily install py on windows. Linux has it pre-installed. You can then write a script in any text editor and test-run, without compilation. On windows the bundled IDLE tool is optional but even easier. For the ECT cycle – see https://stackoverflow.com/questions/6513967/running-python-script-from-idle-on-windows-7-64-bit

I actually find some inconveniences — IDLE uses Alt-P to get previous command. Also copy-paste doesn’t work at all. On Windows The basic python command-line shell is better than IDLE!

For coding tests, a beginner would need to learn

  • String common operations — learn 30 and master 20 of them
    • “in” operator on string, list, dict — is one of the operations to master
    • slicing on string and list — is one of the operations to master
    • converting between string, list, dict etc — is one of the operations to master
    • Regex not needed since many developers aren’t familiar with it
  • list and dict data structures and common operations — learn 30 and master 20 of them
    • A “Set” rarely needed. I never create tuple but some built-in functions return tuples.
  • Define simple functions
    • Recursion is frequently coding-tested.
    • multiple return values are possible but not required
  • “global” keyword used inside functions
  • if/elif/else; while loop with beak and next. Switch statement doesn’t exit.
  • for-each loop is useful in coding test, esp. iterating list, dict, string, range(), file content
  • range() and xrange() function – frequently needed in coding test
  • check 2 object have same address
  • null pointer i.e. None
  • what counts as true / false
  • · No need to handle exceptions
  • · No need to create classes
    • I think “struct-type” classes with nothing but data fields are useful in coding tests, but not yet needed in my experience.
  • · No need to learn OO features
  • · No need to use list comprehension and generator expressions, though very useful features of python
  • · No need to use lambda, map()/reduce()/filter()/zip(), though essential for functional programming
  • · No need to use import os and sys modules or open files, which are essential for everyday automation scripts

addiction2low-level hacking:keep doing; no shame

Update: low-level hacking is generally easier in c++ than java.

When I become interested in a tech topic, I often throw cold water over my head — “This is such a /juvenile/, albeit productive and wholesome, hobby. Look at ex-classmates/colleagues so and so, with their business wing. They deal with business strategies. My tech stuff is so low-level and boring compared to what they deal with.”

Damaging, harmful, irrational, demoralizing SMS! Get Real, Man! Let’s assess our own situation

  • A) On one hand, I need to avoid spending too much time becoming expert in some low-leverage or high-churn technology (php? XML? ASP?).
  • B) On the other hand, the enthusiasm and keen interest is hard to get and extremely valuable. They could be the catalyst that grow my zbs and transform me into a veteran over a short few years. Even with this enthusiasm and depth of interest, such a quick ascent is not easy and not likely. Without them, it’s simply impossible.

Case: grandpa. His research domain(s) is considered unglamorous 冷门 but he is dedicated and passionate about it. He knows that in the same Academy of social sciences, economics, geopolitics and some other fields are more important. He often feels outside the spotlight (kind of sidelined but for valid reasons). That is a fact which had a huge impact on my own choice of specialization. But once he decided to dig in and invest his whole life, he needed to deal with that fact and not let it affect his motivation and self-image. As a senior leader of these unglamorous research communities, he has to motivate the younger researchers.

Case: Greg Racioppo, my recruiter, treats his work as his own business. The successful recruiters are often in the same business for many years and make a long term living and even create an impact for their employees (and people like me). They could easily feel “boring” compared to the clients or the candidates, but they don’t have to.

Case: PWM wealth advisors. They could feel “boring” compared to the filthy rich clients they deal with, but in reality, these advisors are more successful than 99% of the population.

Case: The ratio of support staff to traders is about 50:1, but I don’t feel “boring” because of them.

Case: Look at all the staff in a show, movie, supporting the stars.

q[nm] instrumentation #learning notes

When you want to reduce the opacity of the c++ compiled artifacts, q(nm) is instrumental. It is related to other instrumentation tools like

c++filt
objdump
q(strings -a)

Subset of noteworthy features:
–print-file-name
–print-armap? Tested with my *.a file. The filename printed is different from the above
–line-numbers? Tested
–no-sort
–demangle? Similar to c++filt
–dynamic? for “certain” types of shared libraries
–extern-only

My default command line is


nm --print-armap --print-file-name --line-numbers --demangle
nm --demangle ./obj/debug/ETSMinervaBust/src.C/ReparentSource.o //worked better

In May 2018, I ran nm on a bunch of *.so files (not *.a) to locate missing symbol definitions. Once I found a needed symbol is exported by libabc.so, I had to add -labc to my g+ command line.

[17] 5 unusual tips@initial GTD

See also https://bintanvictor.wordpress.com/wp-admin/edit.php?s&post_status=all&post_type=post&action=-1&m=0&cat=560907660&filter_action=Filter&paged=1&action2=-1

* build up instrumentation toolset
* Burn weekends, but first … build momentum and foundation including the “instrumentation” detailed earlier
* control distractions — parenting, housing, personal investment, … I didn’t have these in my younger years. I feel they take up O2 and also sap the momentum.
* Focus on output that’s visible to boss, that your colleagues could also finish so you have nowhere to hide. Clone if you need to. CSDoctor told me to buy time so later you can rework “under the hood” like quality or design

–secondary suggestions:
* Limit the amount of “irrelevant” questions/research, when you notice they are taking up your O2 or dispersing the laser. Perhaps delay them.

Inevitably, this analysis relies on the past work experiences. Productivity(aka GTD) is a subjective, elastic yardstick. #1 Most important is GTD rating by boss. It sinks deep… #2 is self-rating https://bintanvictor.wordpress.com/2016/08/09/productivity-track-record/

low-complexity topics #eg:GC/socket

java GC is an example of “low-complexity domain”. Isolated knowledge pearls. (Complexity would be high if you delve into the implementation.)

Other examples

  • FIX? slightly more complex when you need to debug source code. java GC has no “source code” for us.
  • tibrv set-up
  • socket programming? relatively small number of variations and combinations.
  • stateless feed parser against an exchange spec. Can become more complex when the code size increases.

java GC frequency and duration #100ms

Actually, GC overhead is more important than GC frequency or duration, except in low latency systems. This blog has many posts on overhead.

–frq

Could be Every 10 sec , as documented in my blog

–stop-the-world duration:

100 msec duration is probably good enough for most apps but too long for latency sensitive apps, according to my blog.

For a 32GB JVM in a latency-sensitive Barclays system, worst long pause is about 300ms.

q[return]: bash func^sourced^regular script

http://stackoverflow.com/questions/9640660/any-way-to-exit-bash-script-but-not-quitting-the-terminal says

you can use return instead of exit. Its main purpose is to return from a shell function, but if you use it within a sourced script, it returns from that script.

In a regular shell script,  “return” also kind-of works. It is illegal so it fails the script, but all previous commands actually execute as expected.

return exit
from function 🙂 probably dangerous
from sourced script 🙂 immediate disconnection 😦
from standalone script fails but OK 🙂 nice
from shell fails but OK immediate disconnection 😦

blockchain #phrasebook

A blockchain is a peer-to-peer network that timestamps records by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work.

In contrast, a distributed ledger is a peer-to-peer network that uses a defined consensus mechanism to prevent modification of an ordered series of time-stamped records. All blockchains are distributed ledgers, but not all distributed ledgers are blockchains.

Keywords:

  • Peer-to-peer — no central single-point-of-failure
  • Immutable — records of past transactions
  • Ever-growing — the chain keeps growing and never shrinks. Is there some capacity issue in terms of storage, backup, search performance?
  • Double-spend — is a common error to be prevented by blockchain

Does inline really make a difference@@ Yes

MSVS and g++ debug build both disable inline (presumably to ease debugging). The performance difference vis-a-vis release build is mostly due to this single factor, according to [[optimized c++]]

The same author asserts that inlining is probably the most powerful code optimization.

Pimpl effectively disables inline.

However, c++faq gives many reasons for and against inline.

SO_REUSEPORT TCP server socket option – hungry chicks

With SO_REUSEPORT option, multiple TCP server processes could bind() to the same server endpoint. Designed for the busiest multithreaded servers.

http://i.dailymail.co.uk/i/pix/2011/03/03/article-1362552-0D7319F3000005DC-882_634x357.jpg – a bunch of hungry chicks competing to get the next worm the mother delivers. The mother can only give the worm to one chick at a time. SO_REUSEPORT option sets up a chick family. When an incoming connection hits the accept(), kernel picks one of the accepting threads/processes and delivers the data to it alone.

See https://lwn.net/Articles/542629/  + my socket book P102.

(server)promiscuous socket^connected socket

[[tcp/ip sockets in C]] P100 has a diagram showing that an incoming packet will be matched against multiple candidate listening sockets:

  • format: {local address:local port / remote address:remote port}
  • Socket 0: { *:99/*:*}
  • Socket 1: {10.1.2.3:99/*:*}
  • Socket 2: {192.168.3.2:99/ 172.1.2.3:30001} — this one has the remote address:port populated because it’s an Established connection)

An incoming packet need to match all fields otherwise it’s rejected.

However it could find multiple candidate sockets. Socket 0 is very “promiscuous”. The rule (described in the book) is — the more wild cards, the less likely selected.

(Each packet must be delivered to at most 1 socket as far as I know.)

##google-searchable dev technologies(!!Qz):lower stress

This topic is Rather important to my stress level, my available time for learning, my available time for family….

For example, with Quartz, I must ask around and experiment (millions of times). Slow to build a coherent understanding. Slow ramp-up. (In contrast, with Python I could do that by reading good articles online.) So my productivity lag/gap remains even after a few years.

Other Negative examples – Tibrv, Autosys, Accurev, Less-known boost libraries,..

MSVS? Most of the search results are about c#, so it’s somewhat harder to solve problems.

Eclipse CDT? Most of the search results are about Eclipse java.

Positive examples – vbscript, DOS batch,

Yet, this stressor is mild compared to “performance warnings”.

readLock +! writeLock – same as no lock@@

Q9: With regard to shared-mutable access semantics, using a readLock and discarding the writeLock (i.e. unable to use it) is similar to using no lock, but are they semantically the same?

This is a theoretical discussion. The implementations are not familiar to me and not relevant. Before we answer  Q9, let’s look at

Q1: in the absence of shared-mutable, do we need any lock?
A1: no

Q2: what if we use readLock without writeLock?
A2: should be fine.

Q3″ what if we do use a lock in a shared function?
A3: system behavior is modified — serialized access, which is unnecessary

A9: should be semantically identical. Everyone gets a free ticket at entrance.

cpp parse DateTime using stringstream #no boost

This is the simplest way I have found.

#include <ctime>
#include <iomanip>
#include <iostream>
#include <sstream>
using namespace std;

//withou Boost, parsing string to DateTime and back
// from http://arsenmk.blogspot.sg/2014/07/converting-string-to-datetime-and-vice.html
int main(){
 stringstream ss{ "1970-01-01 8:00:01" };
 tm simpleStruct; //construct a placeholder on stack
 //parse and output to the placeholder
 ss >> get_time(&simpleStruct, "%Y-%m-%d %H:%M:%S");

 time_t secSinceEpoch = mktime(&simpleStruct);
 if (secSinceEpoch < 0) {
 cout << "parsing failed. (Very strict.) " << secSinceEpoch << endl;
 return -1;
 }
 cout << secSinceEpoch <<" seconds since Epoch (1970/1/1 midnight GMT) is -> ";
 cout << asctime(localtime(&secSinceEpoch));
}

c++SCB eFX IV#Dmitry

100% QQ type, as defined in https://bintanvictor.wordpress.com/2017/02/15/qqzz-mutual-exclusion-cjava/.I feel many are micro optimizations with questionable improvement. I wonder how much value such obscure knowledge adds to the team.

Q: Scanning a vector of int (like finding the average or max). Forward iteration vs backward iteration, which one could be faster, considering all possible compiler optimizations.

%%A: forward. Memory read into cpu cache will be in chunks, not one element at a time. Easy for forward iteration. Not sure about backward.

Q: Which one could be fastest:

void f(double arg){…..}
void f(double & arg){….}

%%A: inlining for first but not 2nd?
A: See http://stackoverflow.com/questions/722257/should-i-take-arguments-to-inline-functions-by-reference-or-value esp. the long answer.

Q: Thr1 and Thr2 on 2 CPU’s both update an object s, having 2 fields. Thr1 only updates s.field1. Thr2 only updates s.field2. No interference. No synchronization required. We observe the performance is slower than using one thread to update both fields. Any explanation?
%%A: caching in cpu

Q: weak_ptr justification, when we have shared_ptr already? I feel [[effModernC++]] has a good chapter on it.

Ashish pointed out in some apps, you could identify a clear risk of circular dependency. Replace with weak_ptr.

Q: given an 2D array arr[10][5], how do you use pointer arithmetic to hit arr[1][5]

A: Contiguous. see http://stackoverflow.com/questions/7784758/c-c-multidimensional-array-internals. Note this is different from an array of pointers.

Q: what would you see if a TCP socket server has a full queue
%%A: TCP requires handshake, so if server is unable to accept a request the client would know it.
%%A: connection refused?

Q: what STL algorithms did you use?
%%A: foreach(), find(), copy_if(), transform(), reverse(), sort(), replace_if, remov_if

##which common UDP/TCP functions are blocking

A non-blocking send fails when it can’t send a single byte, usually because destination send buffer (for TCP) is full. For UDP, see [4]

Note read() and write() has similar behavior. See send()recv() ^ write()read() @sockets and http://www.mathcs.emory.edu/~cheung/Courses/455/Syllabus/7-transport/flow-control.html

[1] meaning of non-blocking connect() on TCP is special. See P51[[tcp/ip sokets in C]] and https://www.scottklement.com/rpg/socktut/nonblocking.html
[2] non-blocking accept() is obscure knowledge — See https://www.scottklement.com/rpg/socktut/nonblocking.html
[3] select() on a non-blocking socket is obscure knowledge —  See https://www.scottklement.com/rpg/socktut/nonblocking.html
[4] UDP has no send buffer but some factors can still cause blocking

default flags arg to func fcntl on entire socket touching TCP buffers?
select blocking not supported? still blocking! [3]  no
epoll blocking not supported?  no
recv blocking can change to NB can change to NB  yes
send/write TCP blocking frequently can change to NB can change to NB  yes
recvfrom blocking can change to NB can change to NB  yes
sendto UDP blocking sometimes [4] can change to NB can change to NB  yes
accept blocking not supported? can change to NB [2]  yes
connect()TCP blocking not supported? can change to NB [1]  no
connect()UDP NB. Saves the destination
for future transfers
Not Applicable

##c++good habits like java q[final]

  • Good habit — internal linkage
    • ‘static’ keyword on file-level static variables
    • anonymous namespace
  • if (container.end() ==itrFromLower_bound) {…}
  • use typedef to improve readability of nested containers
  • typedef — “typedef int FibNum” as source documentation
  • initialize local variables upon declaration.
  • use ‘const/constexpr’ whenever it makes sense. The c++11 constexpr is even better.
  • [c++11] – use ‘override’ or ‘final’ keywords when overriding inherited virtual methods. Compiler will break if there’s nothing to override.

how many degrees(upTo180)btw hr^min hands #Dilip

Dilip had an elegant solution by hand.

3:15 — MH is at 90 degrees; HH is slightly over 90. It’s 1/4 way towards 120 degrees i.e. 90+7.5 degrees. Therefore, the answer is 7.5 degrees

3.10 — MH is at 60 degrees; HH is slightly over 90. It’s 1/6 way towards 120 degrees, i.e. 95 degrees. Answer is 95 – 6 = 35 degrees.

Note the MH is always at an integer degree but HH can be at a xxx.5 degrees.

[12]back-scan any container,print`every Other item #MS

Have I overspent my time on this once-asked question?

The umbrella question — write a utility function to iterate any container and print out every Other element backwards?

Good coding practice! I think this is all about iterator syntax knowledge (my weakness) not algorithm (my strength)!

Note this is really about knowledge not  coding abilities. QQ not ZZ.

Iterator declaration is a can of worm 😦 I might need to give up on this.

#include <iostream>
#include <vector>
#include 	<list>
#include <set>
using namespace std;

template<class _InIt>  void printAlternateItem2itr(_InIt _First, _InIt _Last){
	bool flag = true;
	// if the iterator is from rbegin, then ++ would reverse it!
	for (_InIt it = _First; it != _Last; ++it, flag=!flag) {
		if (flag) cout << *it << ' ';
	}
	cout << endl;
}
template <typename CONT> void printAlternateItemBackward(CONT const & cont) {
	printAlternateItem2itr(cont.rbegin(), cont.rend());
}
int main() {
	//vector<int> cont = { 11,2,3,4,5,6,7,18 };
	//list<int> cont = { 11,2,3,4,5,6,7,18 };
	string cont = "0123456789a";
	set<int> cont2 = { 11,2,33,44,55,66,77,88,99 };
	printAlternateItemBackward(cont);
	printAlternateItemBackward(cont2);
	int arr[] = { 11,2,3,4,5,6,7,18,9 };
	int size = sizeof(arr) / sizeof(arr[0]);
	printAlternateItem2itr(arr, arr + size); //forward only
}

—-
Q: is comparison defined on all iterators?
A: now I think linked list doesn’t. Now I think only random access itr does.

%%Q: what’s the signature of STL find()? I will use those declarations of iterators in my function. (Actually the map and set containers have member functions find() outperforming std::find)

%%Q: from a const container, can u get a non-const iterator?

Q: why don’t you take a container as input? Why must you take iterators?
%%A: it’s more common to take iterator, but in this case container will do. All containers provide rbegin() or begin() including string. Raw array doesn’t but the iterator increment won’t work for raw arrays anyway.


Separate question
Q: OO design — how would you represent Order state transition graph in an OMS?

## Y in sg(^U.S.)u can’t be developer till 65,succinctly

In the US, at 65 you could work as a developer. (Actually that’s not the mainstream for most immigrant techies. What do they do? Should ask Ed? Anirudh? Liu Shuo, ZR…)

Why SG is different? Here’s my answer, echoing my earlier posts.

  1. US culture (job market, managers…) has a tradition of being more open to older techies
  2. US culture respects technologists. Main street techies get paid significantly higher than SG main street techies
  3. high-end (typical VP-level) technical work – more comon to get in the US than SG, partly because wage premium is smaller, like 100k -> 150k

mv-semantic: keywords

I feel all the tutorials seem to miss some important details and selling a propaganda. Maybe [[c++ recipes]] is better?

[s = I believe std::string is a good illustration of this keyword]

  • [s] allocation – mv-semantic efficiently avoids memory allocation on heap or on stack
  • [s] resource — is usually allocated on heap and accessed via a pointer field
  • [s] pointer field – every tutorial shows a class with a pointer field. Note a reference field is much less common.
  • [s] deep-copy – is traditional. Mv-semantics uses some special form of shallow-copy. Has to be carefully managed.
  • [s] temp – the RHS of mv-semantic must strictly be a temp object. I believe by using the move() function and the r-val reference (RVR) we promise to the compiler not to access the temp object afterwards. If we access it, i guess bad things could happen. Similar to UndefBehv? See [[c++standard library]]
  • promise – see above
  • containers – All standard STL container classes (including std::string) provide mv-semantics. Here, the entire container instance is the payload! Inserting a float into a container won’t need mv-semantics.
  • [s] expensive — allocation and copying assumed expensive. If not expensive, then the move is not worthwhile.
  • [s] robbed — the source object of the move is crippled, robbed, abandoned and should not be used afterwards. Its “resource” is already stolen, so the pointer field to that resource should be set to NULL.

——–
http://www.boost.org/doc/libs/1_59_0/doc/html/move/implementing_movable_classes.html says “Many aspects of move semantics can be emulated for compilers not supporting rvalue references and Boost.Move offers tools for that purpose.” I think this sheds light…

non-virtual dtor: some random scenarios tested #IV

Background — How important are these scenarios? First off, tech quizzes are extremely important since you are judged just over a few questions. Second, these scenarios pop up by accidents, rather than be design, all the time in real projects. You better learn to deal with a drunken driver while on the road.

Q1: what if Base and Derived dtor both non-virtual and an autoVar is destroyed?
AA: Tested — ~Derived() and then ~Base().  See post on DCBC.

Q2: What if Base dtor is non-virtual but Derived is virtual, and a Derived auto variable is destroyed on the stack?
AA: Same as Q1. For an autoVariable that’s not deleted via a ptr, Derived ctor (virtual or not) runs, followed by Base dtor. Same DCBC

Q3: Base dtor is non-virtual but Derived is virtual. Derived heap pointer is assigned to a B pointer and deleted?
AA: only ~Base runs , in my g++ test, though officially UB.

Q4: Base and Derived are virtual. Derived heap pointer is assigned to a B pointer and deleted?
AA: ~Derived() then ~Base(). DCBC + dynamic dispatch

Note the well-known __undefinedBehavior__ affects delete only, not stack variables or static variables.Note virtual keyword affects pointer variable. Non-ref variables aren’t affected.

The object to destruct is a Derived
~B ~D st? delete heap D via B* or D* test result notes
1 nv nv stack ~D~B DCBC + static dispatch
2 nv virtual stack ~D~B DCBC + static dispatch
3 nv virtual B*  ~B only static dispatch. “virtual” ignored
4 v virtual D*  ~D~B DCBC + dynamic dispatch
5 v implicitly v D*  ditto ditto
struct B{
  ~B(){ cout<<"~B\n";}
};
struct D: public B{
  virtual ~D(){ cout<<"~D\n";}
};
int main(){
  B* myD=new D();
  delete myD;
}

don’t use cash instrument in replication strategies

update — use “bank account” …

Beginners like me often intuitively use cash positions when replicating some derivative position such as a fwd, option or swap.

I think that’s permissible in trivial examples, but in the literature, such a cash position is replaced by a bond position or a MMA. I think the reason is, invariably the derivative position has a maturity, so when we lend or borrow cash or deploy our savings for this replication strategy, there’s a fixed period with interest . It’s more like a holding a bond than a cash position.

buying (i.e. long) a given interest rate

Tony (FX lecturer) pointed out “buying” any variable means executing at the current “level” and hope the “level” moves up. (Note a mathematician would point out an interest rate is not directly tradeable, but never mind.)

Therefore, buying an interest rate means borrowing (not lending) at a rock bottom rate.
Wrong intuition — “locking in the interest income stream”.
Eg: Say gov bond interest is super low, we would borrow now, and hope for a rise.
Eg: Say swap rate is super low, we would lock it in — pay fixed and lock in the floating income stream, and hope for the swap rate and floating stream both to rise.

GC tuning: will these help@@

Q: have a small nursery generation, so as to increase the frequency of GC collections?

AA: An optimal nursery size for maximum application throughput is such that as many objects as possible are garbage collected by young collection rather than old collection. This value approximates to about half of the free heap.

Q: have maximum heap size exceeding RAM capacity.
A: 32-bit JVM won’t let you specify more than 4G even with 32 GB RAM. Suppose you use 64-bit JVM, then actually JVM would start and would likely use up all available RAM and starts paging.

2013 Citadel IV – c#, C++

Q: System.Array.CopyTo() vs Clone()
%%A: Clone() is declared in a supertype IClonable, whereas CopyTo is only in Array class – confirmed
%%A: Clone() is controversial – confirmed. http://www.dotnetperls.com/clone
%%A: Clone() returns a new object whereas CopyTo requires a target array of the correct size to pre-exist – correct

AA: Array class has  a static method Clone(), but I feel the real important point is the Clone() controversy.

Q4: What does the “…” mean in C++ catch(…)?
%%A: If the troublemaker has no (not empty) exception spec, then “…” means catch-all. Any type of exception can be thrown and will be caught here
%%A: if the troublemaker has an exception spec of type B, then “…” means all subtypes of B. If another  type of exception is thrown, then unexpected() triggers, with the catch(…) ignored. Tested 🙂
A: i think my answer was correct.

Q4b: What’s the c# counter part?
%%A: catch(Exception) or an empty catch{/**/} — confirmed

Q3: delegate — what is it and what’s the usage?

Q3b: what’s the c++ equivalent (or closest)
Q: What design patterns are behind the c# delegate?
A: for multicast …. observer
A: for unicast … command? The delegate instance usually has a host object
Q: how do you manage memory leaks?

mutable data types: python ilt java #dictKey

As illustrated in http://bigblog.tanbin.com/2012/03/objectvariable-type-value-immutability.html and P29 [[python essential ref]], python simple int Age variable is implicitly a pointer to a reference-counted, copy-on-write, IMmutable pointee object. That begs the question ..

Q: so, how do 2 variables share a Mutable object?
A: Use instance methods like mutablePerson.setAge()???
A: Lists and dictionaries also offer versatile “shared mutable objects”. listofMutables[0] would return a reference to the first element.

I feel these 2 answers cover 80% of the use cases.

In summary,
– “scalar” or primitive type Variables point to immutable Objects. Example — Everyday strings and numbers
– most “composite” object are mutable, such as dict, list and user-defined objects
– between the 2 well-understood categories, there exist some special data types
** tuples are composite but immutable
** method objects?
** class objects?
** modules?

Incidentally, (in a bold departure from java/c#[2]) only immutables (string, tuple,”myInt” variables) can be dictionary keys[1]. Lists, dictionaries and most user-defined objects are Mutable therefore disqualified. A “fake” immutable tuple of list also disqualifies — just try it. In real project, we only use strings and ints as keys.

[1] the underlying Object must be immutable, even though the variables can re-bind.
[2] c# expects but does not require keys to be immutable

python pseudo constructors4everyday tasks

These Python builtin functions have something in common:

* pseudo [1] constructors — manufacture an object of the specified type
* conversion constructors — converting some input value into an object of the specified type

[1] Actually builtin functions rather than ctor. I guess the fine differences between builtin functions, keywords and operators are not that important at this stage.

P64 [[essential ref]] lists these and more, as “type conversion” functions.

– str()
– dict()
– list() — see py list initialize []^list()
– tuple()
– set()
– int()

– file() — very similar to open()

default-initialize^value-initialize — new-expressions

Q: difference between new Dog and new Dog() with the parentheses?

http://www.cplusplus.com/forum/general/37962/ answers clearly
The first constructor – naked – provides what is called default initialization (unrelated to default ctor). Default initialization leaves the values of fundamental types (int, double, etc) uninitialized, i.e. arbitrary as in the graveyard. Therefore, with new Dog you get uninitialized chunk of memory with arbitrary values in the fields.

The second constructor – with parenthesis – provides what is called value initialization. Value initialization zeros fundamental types. Therefore, with new-Demo() you get zero-initialized chunk of memory – all fields in this case will be zeros.

–C++Primer P407 made it even clearer that default-initialize on the heap basically leaves the bits (of the real estate carved out of a graveyard) as they were left behind by the dead/freed/bulldozed corpses.

http://stackoverflow.com/questions/620137/do-the-parentheses-after-the-type-name-make-a-difference-with-new/620402#620402 has good illustrations of 3 categories of classes and shows the behavior of new-Dog vs new-Dog(). Another contributor pointed out that the difference is usually negligible. “If there is such a constructor it will be used. For 99.99% of sensibly designed classes there will be such a constructor, and so the issues can be ignored.” However in practice, how many classes out of 100 are sensibly designed?

— to initialize the field (of type T) of a class template, 
P687 [[absoluteC++]] says we can skip the init since there’s no good default value for the unknown type T.

Stanley Lippman on P246 [[essentialC++]] says we could use T() even if T happens to be int.

##[12] bottlenecks in a high performance data "flow" #abinitio

Bottlenecks:

#1 probably most common — database, both read and write operations. Therefore, ETL solutions achieve superior throughput by taking data processing out of database. ETL uses DB mostly as dumb storage.

  • write – if a database data-sink capacity is too slow, then entire pipe is limited by its throughput, just like sewage.
    • relevant in mkt data and high frequency trading, where every execution must be recorded
  • read – if you must query a DB to enrich or lookup something, this read can be much slower than other parts of the pipe.

#2 (similarly) flat files. Write tends to be faster than database write. (Read is a completely different story.)
* used in high frequency trading
* used in high volume market data storage — Sigma2 for example. So flat file writing is important in industry.
* IDS uses in-memory database + some kind of flat file write-behind for persistence.

#? Web service

#? The above are IO-bound. In contrast, CPU-bound compute-intensive transform can (and do) also become bottlenecks.

RAM insufficient? telltale signs #scanRate

Scan Rate is the most important indicator. There’s a threshold to look for. It indicates “number of pages per second scanned by the page stealing daemon” in unix but not windows. http://www.dba-oracle.com/oracle10g_tuning/t_server_ram.htm is concise

Paging (due to insufficient Physical memory) is another important indicator, correlated with Scan Rate. There’s a threshold to look for.

remain relevant (survival) to your team n global economy

When we feel job insecurity, we look around and see who’s more secure, more relevant, more current, and more in-demand. We may find some colleagues more relevant to your team or to the company. Maybe they are local system experts. Maybe they have deep expertise in the base vendor product (Tibco, Murex, LotusNotes, WebLogic…). We may want to _specialize_ like them.

Resist!

Job security derives from skill relevance, but relevant to who? The industry, not the local team, which could disappear if entire team becomes obsolete and irrelevant. I have seen people (including myself) specializing in DNS, VB.NET, Swing, Perl, EJB, let alone home-grown systems such as SecDB. These guys are (extremely) relevant but for a limited time only.

Have a long-term perspective — If you want job security through skill relevance, then invest in more permanent skills such as Math, algorithms, data structures, threading, SQL,…

Have a global perspective — Job security is a serious, almost life-and-death issue. You probably should look beyond your local industry. What skills are truly relevant on the global arena? Some skill (say Murex) might be more relevant locally but less globally.

Avoid niche domains unless it helps boost your mainstream skills.

##common c++ run time errors

In java, the undisputed #1 common run time error is NPE, so much so that half of all error checks are null-pointer-checks. In c++, divide-by-zero is similarly a must. But there are more…

– divide by zero
– pointer-move beyond bounds — remember all array sizes are compile-time constants, even with realloc()
– read/write a heap object (heapy thingy) via a reference variable after de-allocation
– return by reference an object allocated on the stack — bulldozed
– dereference (unwrap) a dangling pointer
– c-style cast failure
– double-free
– dereference (unwrap) a null pointer — undefined behavior, unlike java NPE. We should always check null before dereferencing.

—-less common
– delete a pointer to non-heap
– free a pointer to non-heap
– hold pointer/reference to a field of an object, not knowing it’s soon to be reclaimed/bulldozed. Object could be on heap or stack. More likely in multi-threaded programs.
– misusing delete/delete[] on a pointer created with new/new[]. The variable always looks the same — plain pointers.

————-
For any of the above, if it happens in a dtor, you are in double trouble, because the dtor could be executing due to another run time error.

The authoritative [[essential c++]] says that for every exception, there must be a “throw” you can find. Divide-by-zero is something directly done on “hardware” so no chance to throw. I feel many error conditions in C are treated same as in C, without “throw”. In contrast,

  • I feel operator-new is a typical “managed” low level operation that can (therefore does) use exception.
  • dynamic_cast() is anther “managed” low level operation added over C, so it does throw in some cases

Now, JVM “wraps” up all these runtime error Conditions into exceptions. Java creators generally prefer to push these error conditions to the compilation phase, to reduce the variety of Runtime errors. What remain are wrapped up as various RuntimeExceptions. It’s rather remarkable that JVM let’s you catch  all(?) of these exceptions. No undefined behavior.

offload non-essential processing to helper threads

update: https://wiki.sei.cmu.edu/confluence/display/java/LCK09-J.+Do+not+perform+operations+that+can+block+while+holding+a+lock mentions a queue for logging requests.

In many real time trading systems, a novice threading developer may decide to dispatch some non-essential logging[1] logic to a task queue backed by a thread pool, without measuring the resulting latency. I don’t always have time to profile such trivial code, so I am not sure if latency might be better if we simply execute the logging code in the current “request processing” thread.

However — here’s the key point — even if separate threading doesn’t help in this case, it’s still important to know how to use threading in similar contexts.

I’d rather have a developer with threading know-how even though we don’t need it now, than a developer uninitiated to the intricacies of threading.

If you are wondering why the fuss over non-essentials, here’s an example — In some of the busiest, most time-critical trading engines in the exchanges/ECNs, there’s just a single thread sorting/matching quotes for a given symbol. This thread has to be lean and mean. All non-essentials must be delegated. We just mentioned a naive Design #1.

Design #2) Another common technique is messaging. Send a message with all the details, and leave the rest to the receivers. I feel this is slower than threading. Serialization, data copy, network bottleneck, socket handshake. Fastest thread should be network-free and disk-free.

Design #3) Another producer/consumer design uses a pre-allocated (circulalr) array of Requests. Main thread just set a flag to mark a request as executed and move on. Some consumer threads would clean it up and remove it from the array. Typically, we have a single producer and a configurable bunch of consumers, just like a thread pool. If there are enough consumer threads, then the buffer [2] will be almost empty. I feel it’s helpful if the buffer fits into CPU cache.

Note all the designs are producer/consumer.

[1] or other non-essential work such as serialization, persistence, replication, comparison, verification, DB query, or web download. See http://bigblog.tanbin.com/2010/10/gemfire-write-behind-and-gateway-queue.html

[2] You can call it a task queue, but physically it’s really a simple memory buffer, the simpler the faster.

eg – c++ exception-passing ^ param-passing

Update — [[c++primer]] points out the key concept of ExceptionObject (exo for short) in a try/catch. This is the object thrown, assuming we don’t throw a pointer.

exo isn’t the object created (“object1”) and passed to the “throw” statement. That object1 can have a short lifespan and destroyed much sooner than the exo, which needs a long lifespan — described on P1039. Note exo is copy-constructed from object1. See R1 below.

exo isn’t the object caught either, unless you catch by reference. P1037.

exo IS the object re-thrown if you use the empty throw statement.
—–
Based on [[More effC++]] item on the difference between exception-passing and function-parameter-passing. Illustrates a few rules — R1 R2 R3

class B: public std::exception{};
class C: public B{}; //and so on
class D: public C{}; //and so on
class K: public I{}; //and so on

f(){ D* ex = new K();
  throw *ex; // Copy#1 created [R1] and thrown, sliced [1] to D
}
main(int){
  try{ f();}
  catch (exception * ex){} // silently ignored because no “address” is thrown — type mismatch [R2]

  //catch (B ex){} //sliced Copy#2a created, just like function parameter pbclone
  // we will leave the above catch clause commented without further discussion.

  catch(B & exCaught){ //Copy#1 caught without slicing
      //throw; //throws Copy#1 [R3], type D
      throw exCaught; //Copy#2b created [R1] and thrown, sliced[1] to B
}

[1] slicing by copy-ctor.

pure virtual with/out implementation #IV

Item 44 in [[EffC++]] offers a summary that basically says

1) a pure virtual means only interface is inherited (since parent provides no implementation)
2) a simple virtual means interface plus a default implementation is inherited
3) a non-virtual means interface plus a mandatory implementation is inherited and subclasses are advised to keep the implementation. C++ actually allows subclass to /deviate/ by redefining and hiding the parent implementation. Java has a cleaner solution in the “final” keyword.

I’d like to add

1b) a pure virtual with an implementation tells the subclass author

      “Inherit this interface. By the way I also offer an implementation upon request, but uninheritable.”

This differs from (2). Both are illustrated in Item 36.

polymorphic equals() in java/c# ? NO

A java interviewer quizzed me why my equals() implementation insists on both objects having exactly the same runtime type, and rejecting a derived object. He questioned what if we happen to have

B var1 = new Derived(1,2); B var2 = new B(1);

and the B portion of both objects are identical?

Biggest problem with this type of polymorphic equals() — X === Y === Z doesn’t mean X === Z. Suppose the base sub-object of X/Y/Z are identical. Y is a base object. Therefore Y.equals(X) == true == Y.equals(Z), but X and Z are subtype instances, and different.

I’m a purist with high standard. If my equals() determines 2 objects to be equals, then I “swear” they are indistinguishable and interchangeable carbon copies.

It’s generally safe to be strict writing equals() — insist both objects’ actual types at runtime are identical.

1st SCB IV #comm #private inheritance

Q: you have identical rows in a table. How do you clean it up?
%A: select into a new table, then overwrite the old, but this involves a lot of disk space and IO
%A: perhaps delete where (select count(*)…) > 1 and rowid > 1 — using oracle rowid

Q: given a long string of letters A-Z, print a histogram like A:865 times, B:9932 times….

Q: copy a vector to another vector. What if the source is very large? Correct — you get reallocation, so how do you avoid that?
%A: either reserve or initialize the target vector with sufficient capacity. However, the 2nd option default-constructs a large number of payload objects!

Q: when would you use private inheritance?
A: This is rarely needed or quizzed. Not on my Tier 1/2. [[effC++]] P204 has an example of MI and also an example of private inheritance — a public inheritance of a pure interface and also a private inheritance of a concrete class. However, Google style guide strictly states that private inheritance should be replaced with composition.

[11]Cantor/eSpeed c++ #manyQ

Mostly knowledge-based QQ questions. I need to be selective what to study.

Q1: The reverse is common but when do we provide c wrapper around c++ classes?
A: internal implementation is c++ but clients need C API

Q1b: can c call c++ # <—— very common question
%%A: at compile time, c++ code may be uncompilable by a c compiler. CFRONT converts c++ code into c source code until exceptions were added to the c++ language. At run time, if I have object code precompiled from c++, can c call it? I think it’s often possible.
A: check out the C Application Binary Interface. If your C and C++ code are interoperable, it is because your C and C++ compilers also conform to the same ABI.
A: yes that’s what extern “C” is for.

Q: tool for binary code dependency? (I guess the dependencies are dynamic libraries, rather than static libs linked in.)
AA: ldd or objdump. See http://ask.xmodulo.com/check-library-dependency-program-process-linux.html

Q9: call java from c++, what’re the c++ function(s)?
A: create_vm(), CallStaticVoidMethod(), CallVoidMethod() etc

Q9b: How do I access an int field of a java class?
%%A: if I have a pointer to the java object, I should be able to directly access the int field

Q: how do u implement a singleton

Q: semaphore(1) == mutex?
%%A: no since semaphore in most implementations uses a mutex internally. Functionally they are identical
%%A: semaphore is often cross process. Mutex can also be used on shared memory.

Q: for a const method getAge(), can it return a ptr to a non-const int which is a field?
%%A: i have seen sample code, so probably compilable
A: tricky. P215 effSTL says that inside a const method, all non-static fields become const.

Q3a: what’s the return type of operator*()?
%%A: can be anything, just like operator[], but not void not empty

Q3b: How about qq[ operator T*() ] ?
A(now): is it the OOC to a ptr? Correct. See http://en.cppreference.com/w/cpp/language/cast_operator

Q: if Base class has a virtual method f(int), and Derived class has an overload virtual method f(float, char), is the Base f() accessible via a D object?
%%A: the D f() hides the base f(), but base f() may still be accessible using A::f()

Q: 1 process, 30 threads vs 10 processes * 3 threads each? Assuming there’s data sharing among the threads
%%A: avoid ITC and serialization
%%A (Now): I feel single-process has a single point of failure without redundancy.
A: depends on the IPC cost. If non-trivial cost then favor 30-threads.

Q: how do u change the default fair-share scheduling among threads? #<— uncommon
%%A: I can make some threads sleep or wait. I can control thread-lib-level priorities, but the kernel thread priority levels could be very coarse
%%A(now): if userland thread, then scheduling is done in the thrd lib.

Q: unix performance monitoring?
%%A: jconsole; top; vmstat; perfmeter

 

Q: Dynamic loadable library. How do we configure the system to load a new version of the library?
%%A: $LD_LIBRARY_PATH
%%A: just overwrite the existing *.so file??

Q: how to convert a ptr-to-non-const to a ptr-to-const?
AA: just assign it. The reverse assignment needs const_cast. Tested in g++. I feel this is an obscure technicality.

— communication layer
Q: how do I see syscalls and sockets open by a process?
%%A: truss; lsof; snoop/tcpdump; netstat

Q: practical use of signals?
%%A: kill, core dump, thread dump, suspend,..
A: I used q(trap) to install my signal handler in my bash

Q: what are named pipe? #<—– uncommon
%%A: I think they exist in the file system as fake files. “cat | more” uses an unnamed pipe I believe.
A: now I think I should just say “never used it”.

pipe, stream, pesudo file, tcp-socket#xLang

* Java has a family of IO classes;
* c++ has a family of stream classes derived from class ios;
* unix has various file-like things.

“pipe” and “stream” are 2 general terminologies for an input/output destination with sequential[1] access. A file is one subtype of stream. Therefore java and c++ access files using their family of classes above.

[1] c++ gives each sequential stream a get-ptr and a put-ptr.
A named pipe is an entity in the file system….

TCP (not UDP) socket is a stream-oriented protocol. Sockets, either unix domain or TCP (not udp), is also like a pipe/stream. In fact, Unix treats any socket as a file so you can read and write to a it like a file. Perl liberally “confuses” file handles and sockets, just as C uses the same type of file-descriptor-number for files and sockets.

TCP (not UDP) is ordered — if two messages are sent over a connection in sequence, the first message will reach the receiving application first. When data segments arrive in the wrong order, TCP buffers the out-of-order data until all data can be properly re-ordered and delivered to the application.

Python wraps the socket descriptor number in a socket object, but you can still get the underlying descriptor number by mySocket.fileno()–

import socket
mySocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print mySocket.fileno()

AtomicReference = a global lookup table

As briefly mentioned in another post, an AtomicReference object is (serious!) a global [1] lookup table.

Example 1: Imagine a lookup service to return the last trade on NYSE. If 2 trades execute in the same clock cycle (rare in a GHz server), then return a Collection.

This is a lookup table with just one default key and a value which is an object’s address. A simple example, but we may not need CAS or AtomicReference here — each thread just overwrite each other.

Example 2: say a high frequency trading account must maintain accurate cash balance and can’t afford to lose an update, because each cash balance change is large. CashBalance isn’t a float (there’s no AtomicFloat anyway) but an object.

This is also a single-entry lookup table. Now how do you update it lock-free? CAS and AtomicReference.

[1] In reality, the AtomicReference is not globally accessible. You need to get a reference to it, but it’s better to ignore this minor point and focus on the key concept.

vague^normal^specific answers in non-tech interviews

My communication style is detail-oriented and I tune in to specific details. (That’s why I can write.) When another person’s or my own answer is rather vague or very specific, i often notice it before others do.

There are vague answers, normal answers and specific answers in any job interview. You can actually observer the interviewer’s own styleFor tech questions, the more specific, the better. For personality questions, probably not.

After you pass the tech interviews, you don’t have to sell any more. Non-tech interviewers are already sold and basically sniff for potential personality weaknesses. Non-tech interviews need to see “nothing suspicious” in you and not seeking “star qualities”. If a non-tech interview is busy or have seen many candidates, he can only remember very few things about you. If you just give stock answers, you might be less memorable, but that’s perfectly fine. They always notice that you can explain complex things, can understand and speak English — you pass.

Very specific answers to non-tech questions often show a type of personality. Leaders aren’t always that type. High-level thinker/communicator. As a leader, you often need to communicate to different team members. Not all of them detail-oriented communicators.

Non-specific answers don’t sound evasive and fake.

I often react to surprise questions with slightly vague answers. (These answers could be **better** than the detailed but contrived answers I sometimes come up with.) I feel rather bad about my imprecise *words* therein because I’m sensitive to individual words. I think most people are less sensitive to individual words but rather listen to the whole sentence and the whole person. I think a lot of IT candidates aren’t particularly articulate and eloquent and often beat around the bush with vague, bland answers.

I almost never give brief answers to behavioral questions, but some good answers are fairly short, possibly guarded or evasive, perhaps depending on the listener.

You don’t want to be evasive when answering a pointed question like “why you left that company?”

If you can’t think of specifics to substantiate your answer, then it’s probably ok to be less specific. In your effort to sell yourself and demonstrate your personal quality, you might reveal a very strong flavor of communication style. I think for technical non-lead roles, most candidates answers are not that specific but that’s fine.

If you are detailed-oriented, story-telling may come natural to you, even if you tell the story backward, even if you use imprecise words. Most of my stories are relevant, and most interviewers are interested. Some may be too busy, so it’s good to stop and test their appetite. See other post on a list of good stories.

mkt-data subscription engine java IV #barc eq drv

Say you have market data feeds from Reuters, Wombat, Bloomberg, eSpeed, BrokerTec, ION… Data covers some 4000 underliers and about half a million derivative instruments on these underliers. For each instrument, there can be new bid/offer/trade ticks at any millisecond mark[1]. Volume is similar to option data feed like OPRA.

Say you have institutional clients (in additional to in-house systems) who register to receive IBM ticks when a combination of conditions occur, like “when bid/ask spread reaches X, and when some other pricing pattern occurs”. There are other conditions like “send me the 11am IBM snapshot best bid/ask”, but let’s put those aside. For each of the instruments, there are probably a few combination of conditions, but each client could have a different target value for a condition — 2% for u, 2.5% for me. Assuming just 10 combination for each instrument, we have 5 million combination to monitor. To fulfill clients, we must continuously evaluate these conditions. CEP and Gemfire continuous query have this functionality.

I proposed a heavily multi-threaded architecture. Each thread is event-driven (primary event) and wakes up to reevaluate a bunch of conditions and generate secondary events to be sent out. It can drop the new 2ndary event into a queue so as to quickly return. The “consumer” can pick up the 2ndary events and send out by multicast.

Each market data vendor (Reuters, e-speed, ION, even tibrv) provides a “client-runtime” in the form of a jar or DLL. You embed the client-runtime into your VM, and it may create private threads dedicated to communicating with the remote publisher.

[1] Each IBM tick actually has about 10 fields, but each IBM update from vendor only contains 2 fields if the other field the symbol didn’t change. So we need something like Gemfire to reconstruct the entire 10-field object.

boost weak_ptr can access inner ptr only through a shared_ptr

Unlike shared_ptr, scoped_ptr and auto_ptr, a boost weak_ptr doesn’t overload operator-} and operator*. Ditto for std::weak_ptr.

There’s an important implication.[2]

“If I want to veto deletion of internal ptr, then I must join the club, i.e. the club must register me as –veto– member”.

As a passive observer, weak_ptr is helpless to prevent the internal pointer’s demise. There’s an important implication again [3].

[2] No direct means to get the raw pointer from a weak_ptr. You must create and go through a shared_ptr, either this->lock() or shared_ptr conversion ctor.  I feel a weak_ptr instance is an observer using a shared_ptr instance as a lens.

[3] User of weak_ptr can’t reliably use the referenced resource, since the resource could disappear any time.

Crucially, a std::weak_ptr can be empty, as explained on P96 [[std c++lib]]. We can detect weak_ptr emptiness when the inner ptr is deleted. In contrast, raw ptr doesn’t offer this feature — if the underlying heap object is reclaimed, the raw ptr becomes dangling and we have absolutely no way to detect that, according to my friend Paul M.

stack to heap trend in languages

  •  C is mostly stack. I guess you seldom see malloc() and free() in business applications. I have seen a number of c applications but didn’t notice them. However, how does c pointers deal with automatic destruction of the objects they reference? I guess all such objects are declared and used in the same scope, never declared in an inner func and used in an outer func?
  • C++ biz apps use lots of new(), but by default, if you don’t call new() or malloc(), C++ uses the stack and global area. In terms of language support and encouragement, stack is still easier to use than heap in C++. I consider C++ as a transition between C and newer languages.
  • STL seems to be mostly heap, to support expansion
  • Java uses the heap more than the stack. Every object is on the heap. Memory leak becomes too likely so java had to provide garbage collector.
Conclusion: c -> c++ -> STL -> java, I see less stack and more heap usage.

## c++ syntax monsters

* constructor calls without new — see other posts.
* initializers in constructors
* array of pointers (P234 [[24]])
* function parameters — reference types
* copy constructor
* constant pointer to constant object
* assignment operator overloading
* function pointers, and (worse still) function pointers to member functions
* generic functions and (worse still) generic classes

A crude progress bar of a learner is these “chapters”….

const nonref variable: reasonably immutable

Reading the c++ FAQ 18.10, I got the impression that most const variables are ptr or ref.

const nonref primitive is like a java constant or java immutable.

Q: are there const nonref class object? Are they immutable?
A: Yes according to http://www.learncpp.com/cpp-tutorial/810-const-class-objects-and-member-functions/

Once a const class object has been initialized via constructor, any 
attempt to modify the member variables of the object is disallowed, as it 
would violate the constness of the object.

Just like the built-in data types (int, double, char, etc…), class objects 
can be made const by using the const keyword.

Q: can you cast away its constness?
A: very tricky.  See effC++ and the IBM arcitlce.  Short answer is no.

##triggers for GC: various comments

Some big trading engine (UBS?) runs just 1 GC cycle each day. I wonder how often GC runs normally.

P155 [[SanFrancisco]] nicely outlines that GC can start
* upon alloc-failure
* when jvm explicitly waits (IO, sleep …)
* every 10 seconds, but at a low priority

– In addition, application can request GC.

Synchronous GC is predominant, but google “asynchronous garbage collection idle periods of low activity” and you see GC can start when system is idle. Creditex java guys also hinted at that.

In my experiment, I do see GC springs into action when app is paused.

http://java.sun.com/docs/hotspot/gc1.4.2/faq.html says about JDK1.4 —
* In the default garbage collector a generation is collected when it is full (i.e., when no further allocations can be done from that generation).
* The concurrent low pause collector starts a collection when the occupancy of the tenured generation reaches a specified value (by default 68%).
* The incremental low pause collector collects a portion of the tenured generation during each young generation collection.
* A collection can also be started explicitly by the application.

http://www.softwareengineeringsolutions.com/blogs/2010/04/30/garbage-collection-in-java-part-2/ drills into alloc-failure —

The JVM is unable to fulfill the request (alloc-failure), and so it awakens the garbage collector to come to its aid. This results in a Young Generation collection cycle.

A Young Generation collection is called for because there isn’t enough space available in Eden to satisfy this allocation request (alloc-failure). However, unlike the earlier scenario, the promotion of the middle-aged object from the active Survivor space to being an elderly object in the old generation cannot proceed because there is no free space available in the Old Generation. In this case, the Young Generation collection is put on hold, while an Old Generation collection cycle is spun up to free up space in the Old Generation.

Callable tasks, Future results

A few learning notes based on P196 [[Java Threads]]. See also
http://bigblog.tanbin.com/2010/07/futurejava-designed-for.html
http://bigblog.tanbin.com/2009/05/callable-tasks-simple-thread-pool-set.html
http://bigblog.tanbin.com/2011/01/exception-passing-between-threads.html

Callable is an enhanced Runnable. Whenever possible, I’d design with Callable rather than Runnable. However, Callable doesn’t extend Runnable. For example, there’s no “new Thread(Callable)”.

Q: Does it make sense to implement both Runnable and Callable interfaces, so my class can be used in both contexts?
%%A: conceivable.

Q: Can this technique be used to convert existing Runnable designs to use Callable, assuming run() calls call()?
%%A: you might find it troublesome to get the task output (either result or exception) when you use the task object as a Runnable.

A Future object represents the result of a task. I feel it’s a window into that real world task in JVM. The (either real or symbolic) task is *stateful*.

I think of the “future” construct as a concept first. It represents a yet-to-complete async task. It could be ready when you read it. As such, I think it’s implemented using condVars.

FutureTask (like SwingWorker) is one of the implementations of Future interface.

Just as a Thread is a (rather imperfect) handle on a JVM thread, a Callable object with the corresponding FutureTask object are handles on the real world task.

Thread object as a lock object: myThr.wait()sometimes useful

Someone (HSBC?) asked me what happens if I call myThread.wait().

MyThread thrd = new MyThread();
thrd.start();
synchronized(thrd){
 thrd.wait(); // similar to thrd.join()
}

Using a Thread.java object as a monitor object is legal, but not advisable. Unlike c++, java has no undefined behavior, so the behavior is probably “defined” in the java language spec. I guess the behavior is that main thread waits in the “waiting room” of the thrd object, but who will notify main thread?

My experiment shows that main thread actually gets notified when thrd ends. Before we uncover the mysterious notifier, let’s remember that it’s unusual, confusing and never necessary to use a Thread.java object as a lock. There’s always a simpler alternative.

If you really use a Thread.java object as a lock/conditionVar, there are hidden issues. Here’s one issue I know –

As a condition variable, a Thread.java object has the undocumented behavior of calling this.notifyAll() when it dies. Someone speculated — “In fact this is how the join() method is implemented, if I remember correctly: it waits on the thread object until it dies.” Note join() is an instance method of an Thread.java object. When main thread calls thrd2.join(), it treats thrd2 as a conditionVar and waits on it until another thread calls thrd2.notifyAll(). Notifying thread is actually the thrd2 itself.

up and down casting non-ref/ref/ptr

Primary usage of dynamic_cast is down-cast
* base/derived class must have vptr, or you get compile-time error
* original and target types must be ptr/ref, or you get compile-time error [1]
* there’s just one object involved, not 2
** That one object must be a complete and full[2] Derived object, otherwise you get runtime (not compile time) failure, indicated by 1) null ptr or 2) exception (during reference down-cast)
*** boost polymorphic_cast adds uniformity by throwing exception for both

[1] static_cast can cast nonref.
[2] static_cast doesn’t incur the overhead to check that

Q: up-casting a nonref??
A: no casting operator required, but you get sliced. qq/ myB = myD; /
A: by the way, no such thing in java, since java doesn’t use “nonref”

Q: down-casting a nonref??
A: impossible in dynamic_cast. “dynamic_cast can be used only with pointers and references to objects”

Q: down-casting a ptr (to a polymorphic object only)?
A: dynamic_cast. May return NULL. java has a similar feature.
A: see also boost polymophic_cast
Q: down-casting a ref (to a polymorphic object only)?
A: dynamic_cast. Never returns NULL. See post on new-and-dynamic_cast-exceptions
A: see also boost polymophic_cast

Q: down-cast a base ptr/ref to a derived object but no vtbl/vptr no virt func?
A: impossible. dynamic_cast won’t compile.

Q: up-cast a ptr?
A: common in c++ and java. everyday virtual function scenario. no operator required.

Q: up-cast a ref?
A: less common than ptr. See post on virtual^redefining

best way to sanity-check java method arguments?

Say you are writing a utility Class1.getByIndex(int arrayIndex). Best way to enforce that arrayIndex is positive?

– throw an unchecked error and crash the jvm? Recommended by many experts. I guess the downside is tolerable, and there are no better candidates as a standard, universal solution.
– throw a checked exception and force other programmers to try-catch? I think the overhead grows when you set this as standard practice.
– return a boolean result? Many methods/constructors can’t.

Ideally, you want to remind other programmers to check their input first, but not force them to put up a useless try-catch.

dependency, rigid, fragile — OO design concepts

— based on http://www.objectmentor.com/resources/articles/oodmetrc.pdf

rigidity (in-flexible, un-adaptable) is due to the fact that a single change to heavily interdependent software begins a cascade of changes in dependent modules. When the extent of that cascade of change cannot be predicted by the designers or maintainers the impact of the change cannot be estimated. This makes the cost of the change impossible to estimate. Managers, faced with such unpredictability, become reluctant to authorize changes.

Fragility (non-robust) is the tendency of a program to break in many places when a single change is made. Often the new problems are in areas that have no conceptual relationship with the area that was changed. Simple changes to one part of the application lead to failures in other parts that appear to be completely unrelated.

———————
I think this “change-impact” analysis would help answer the question “how to reduce subclass/superclass dependency”

O(n) sort – radix sort

The principles are relevant to pure-algo interviews.

Q: quicksort is universally usable. How about radix sort?
A: not widely mentioned as a drawback against quicksort. (In contrast, I think Counting-sort is not universally usable…)
A: wikipedia says integers could [1] represent strings of characters (e.g., names or dates) and specially formatted floating point numbers, radix sort is not limited to integers.

Radix sort is good for parallel computing. See http://www.drdobbs.com/parallel/parallel-in-place-radix-sort-simplified/229000734

Radix-sort Expected running time is faster than quicksort? Could be but….
… but the hidden constant factor in O() is much worse than quicksort.

[1] It’s hard to prove “cannot”