(de)serialize binary tree(!!BST)using funptr #BFS bbg

struct Node{
  Node * left, right;
  int data;

… If you simply output as “data,leftId,rightId | data,leftId,rightId | …” then the graph structure i.e. link is lost. Therefore, I feel the stream need to identify each node: “Id,data,leftId,rightId|…
Next question is how to assign the id values. Easiest — use node address as a string id!

Suppose there’s a treewalk algo walk(Node * root, void ( *callback ) (Node *)), i will call it like

walk(root, serialize1node); // where serialize1node is the callback.

Note the position of the asterisk:

  • It’s on the right of type name Type1 — read left to right .. pointer to Type1
  • It’s on the left of variable name var2 — read left to right .. var2 is a pointer to …

https://github.com/tiger40490/repo1/blob/cpp1/cpp/binTree/serialize_bbg.cpp is my tested solution

—-idea from my friend Deepak: pre-order tree walk and output one line per node {payload, direction from root, direction from Level 1 ancestor, direction from Level 2 ancestor..}

  • first node in the file is always the root
  • the output file always includes the higher level nodes before the lower-level nodes
  • delimiter needed between nodes

—-If the tree is a BSTree, then this idea is based on the fact that a reading a series of unsorted nodes can construct a BSTree. For this solution, we have no requirement on the payload values. They can be booleans.

first pass — in-order walk to build a hash table of {node address -> integer id}. If we treat the Id’s as payload, then the tree is a BST.

2nd pass — BFT the original tree. For each node, we write original payload, id to the file, without writing the links or addresses. File looks like T,5|T,4|T,6|T,2|…

When we reconstruct the tree, we read each node from the file, and insert that node into our new tree, using the id value to decide where.


SIG phone round

See retrans questions on retrans biz logic if I were CTO

Q: how do you extract the seq number from packet header?
A: reinterpret_cast to a pktHeaderStruct ptr, then read the field

Q: is reinterpret_cast safe?

Q: why is RBTree sometimes faster than hashtable for a small collection? See unordered_map^map performance: unpredictable

Q4: speed of std::sort() on a vector vs list::sort
%%A: probably vector due to cache efficiency
A: also the stability of list::sort comes at a cost

Q4b: how are the two sorting implemented?
%%A: possibly two variations of quicksort

Q: std::sort() vs qsort() in C. Same algorithm, but which is faster?
%%A: std::sort, due to functor inlining
AA: Item 46 in [[effectiveSTL]]

Q: given a csv file with a million rows, you use c/java/c#/py to (convert to number) sum up 2nd column and produce a single output number, which language is fastest?
%%A: main cost is file read, but let’s suppose entire file is loaded into memory.
%%A: I feel python bytecode (even optimized) will be slightly slower. Java (due to JIT) and C are likely faster.
A: now in hindsight I feel integer parsing will cost much more than arithmetics.

CVA c++IV 1

Q: when would you pass (in/out of functions) by ptr rather than non-const ref?
%%A: if the argument can be null, or void ptr
A: if I need to pass a double ptr
A: if I need to pass a raw array
%%A: A factory often returns by ptr or smart ptr, seldom by reference

Q: do you catch exception by value, ptr or reference?
A: slicing

Q: iterating vector by int index vs iterator… which is faster?
A: iterator wins since vec[i] needs a O(1) pointer arithmetic operation to locate the element

Coding question  https://github.com/tiger40490/repo1/blob/cpp1/cpp1/array/filterInSitu.cpp

mvea phone IV c++

Q: any experience with FIX?
Q: how many percent of your time is on support vs development?
Q: are you involved in requirement gathering and analysis?
Q: in your project, where do you use python vs c++?
Q: describe the reliability/resilience features in your NYSE system
%%A: retransmission; hot standby; database dump/restore;
%%A: after a mid-day restart, parser would request replay or snapshot to get up to date with the live feed

Q: What’s “pure virtual” vs virtual?
Q: what’s polymorphism
Q: why do you use smart pointers?
Q: what c++11 features did you use?
Q: what’s the “auto” keyword?
Q: j4 stored proc? See https://bintanvictor.wordpress.com/2007/08/01/j4stored-procedures-a-sybase-perspective/
Q: std::vector vs std::list
%%A: list offers efficient splice, insert/delete in the middle and front. Never reallocates.

Up to 700,000 orders received. Half of them are latency sensitive. System validates each order and generates a “principal order” to be sent out to all the stock exchanges in the U.S. (and sometimes Americas), in FIX. These principal trades provide perfect hedging to MS.

Each of the 700,00 orders look just like a regular cash equity order, except a flag to indicate it’s a OTC swap..

find median@2 sorted arrays(same len) #Trexquant

This "coding" question is really math problem. Once you work out the math techniques, the coding is rather simple.

My own idea — find the median of X and median of Y. If med(X) < med(Y) then discard the lower portion of X and higher portion of Y (two portions of equal size). Then repeat.

Why can’t the final "winner"be somewhere among the lower portion of X? Because the higher portion of X and higher portion Y already constitute half the population, and all of them are higher.

Definition of lower portion —
* all lower items up to but not including med(X) If len(X) is odd
* exactly the lower half of X if len(X) is even

Trexquant QnA IV #std::forward,noexcept

Q: how would a tcp consumer-client know the server process died rather than a quiet server?

Q: how would a tcp producer-server know a client process died? Signal?

Q: What’s perfect forwarding?

Q: std::forward() vs std::move()? See std::move=unconditional-cast #forward=conditional-cast

Q1: in c++03, a myVectorOfTrade.push_back( Trade(22, 0.7) ) uses how many ctor/dtor invocations?
A: regular ctor, copy-ctor, dtor of the temporary

Q1b: how about c++11?
A: regular ctor, mv-ctor, dtor of temporary. See P293…. We assume there’s a Trade mv-ctor and there’a some pointer field in Trade, otherwise the mv-ctor has nothing to steal and is probably same as copy-ctor

Q1c: what about something like emplace_back(22, 0.7)
A: in-place ctor using placement-new. P294 [[eff modern c++]] perfect forwarding eliminates temporaries

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp is my illustration.


Q: how would “noexcept” improve runtime performance?
AA: P 91 [[effModernC++]] has a short paragraph explaining “noexcept” functions are more compiler-optimizable

Q: please implement a simple Stack class backed by a vector. Implement push(), pop(), top().



tcp: detect wire unplugged

In general for either producer or consumer, the only way to detect peer-crash is by probing (eg: keepalive, 1-byte probe, RTO…).

  • Receiver generally don’t probe and will remain oblivious.
  • Sender will always notice a missing Ack. After retrying, TCP module will give up and generate SIGPIPE.
send/recv buffers full buffer full then receiver-crash receiver-crash then sender has data to send receiver-crash amid active transmission
visible symptom 1-byte probe from sender triggers Ack containing AWS=0 The same Ack suddenly stops coming very first expected Ack doesn’t come Ack was coming in then suddenly stops coming
retrans by sender yes yes yes yes
SIGPIPE no probably yes probably

Q20: if a TCP producer process dies After transmission, what would the consumer get?
AA: nothing. See http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html — Receiver is ready to receive data, and has no idea that sender has crashed.
AA: Same answer on https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html

Q21: if a TCP producer process dies During transmission, what would the consumer get?
%A: ditto. Receive has to assume sender stopped.

Q30: if a TCP consumer process dies during a quiet period After transmission, what would the producer get?
AA: P49 [[tcp/ip sockets in C]] Sender doesn’t know right away. At the next send(), sender will get -1 as return value. In addition, SIGPIPE will also be delivered, unless configured otherwise.

Q30b: Is SIGPIPE generated immediately or after some retries?
AA: https://superuser.com/questions/911808/what-happens-to-tcp-connections-when-i-remove-the-ethernet-cable/911829 describes Ack and re-transmission. Sender will notice a missing Ack and RTO will kick in.
%A: I feel TCP module will Not give up prematurely. Sometimes a wire is quickly restored during ftp, without error. If wire remains unplugged it would look exactly like a peer crash.

Q31: if a TCP consumer dies During transmission, what would the producer get?
%A: see Q30.

Q32: if a TCP consumer process dies some time after buffer full, what would the producer get?
%A: probably similar to above, since sender would send a 1-byte probe to trigger a Ack. Not getting the Ack tells sender something. This probe is builtin and mandatory , but functionally similar to (the optional) TCP Keepalive feature

I never studied these topics but they are quite common.

notable linux system calls: QQ question

https://syscalls.kernelgrok.com can sort the functions by function id

http://asm.sourceforge.net/syscall.html is ordered by function id

  • fork()
  • waitpid()
  • open() close() read() write()
  • –socket
  • socket() connect() accept()
  • recvfrom() sendto()
  • shutdown() is for socket shutdown and is more granular than the generic close()
  • select()
  • epoll family
  • –memory
  • brk

elaborate python project/challenge #PWM

Key is “technical challenge”. Without it, I may look like just another python coder. Some of the interviewers may not be easy to impress. I will only target the majority.

Different companies might have different needs for python skill. I can only guess

  1. Usage: data analysis, data science
  2. Usage: automation, glue language
  3. …. py as primary language to implement business logic in an enterprise application? Only in Macquarie, baml and jpm
    1. I could remake my PWM Commissions system in python

For 2), I used

  • devops
  • integration with git, maven, g++, code generation, test result parsing, …. for devops
  • automated testing
  • simple network servers
  • subprocess
  • integration with shell script
  • automatic upload
  • generate c++ source code to be compiled into a native python extension
  • import hack
  • logging decorators — https://github.com/tiger40490/repo1/blob/py1/py/loggingDecorator.py
  • python code attached to trade object. Code to be evaluated at reval time.