hearsay c++IV: Cubist

Q: TCP connection close .. handshakes?

Q3: why is new() so slow?

Q3b: what if I used array-new but then regular delete?

Q: implement Fib calculation at compile time

Q: write code to extract names and floating point numbers from a free-form string (no constraints)

Advertisements

LiquieNet IV c++

Q1: partition list into good^bad sections

Q3b: fix a sample source code that passes unique_ptr<int> by value to a foo() function:

unique_ptr<int> foo(unique_ptr<int> kk)

Q3b.1: give examples of move semantic

Q3b.2: write a template solution to check if the template argument T is a pointer type. I don’t have a good solution but I mentioned SFINAE.

Q3a: improve

string & find(list<Person> li, string name){
  for (auto i=li.begin(); i!=li.end(); i++)
    if (*i == name) return i->nickname;
  return "";
}

%%A: last return will crash due to return-by-ref. If my function’s return value can be empty, then I usually return by pointer
%%A: equality check is dubious. Better be explicit like Person(name) or i->convertToString()
%%A: li.end() can be cached
%%A: ++i
%%A: li and name should really be const ref

Q4: predict output
%%A: I feel #2/3 are bad code but compiler may not be able to

struct B{void foo(){cout<<"hello\n";}; b = new B(); b->foo(); //#1
delete b;
b->foo(); //#2 .. what?
b=NULL;
b->foo(); //#3 .. what?

LiquidNet IV questions beyond c++

Each orders is big, at least 40,000 shares… latency unimportant. Two clients can negotiate without revealing identities — Any chat is monitored. No auction supported.

Other dev teams outside the “core” MatchingEngine team:

  • execution-algo engine — Client can also send an algo order like a vwap order. The LN ME (matching engine) will forward it to an execution-algo engine (owned by another team), which slices it and send the pieces to “lit markets” like NYSE. Alternatively, it can send them back to the same LN ME.
  • There are WPF apps for clients to use.
  • There are post-trade analytics systems (Venkat’s interest?).
  • There are settlement systems

Q: how do you go about testing your c++ code? LN uses python to automate tests. C++ developers must write tests for their own features.

–QA interviewer: a junior BA proposed a system with system F (perhaps some hedge fund) integrating with system L. L would periodically (5-sec interval) use an API to query F to get a list of orderId’s. Then L would convert each orderId into a FIX message and send to F. Please improve.

I feel this question requires domain knowledge.

  • Sugg: batch mode — one big message containing 999 items is more efficient than 999 small messages. Not sure about FIX
  • Sugg: In the current “pull” model, 5-sec interval may be too frequent if F is quiet; 5-sec could be too infrequent if F has a burst of new orders, so it’s best to make F “push” to L.
  • Sugg: the push can make use of 2-way FIX messaging

–retrans question (Good): In your order-book engine (like Rebus), suppose you get a bunch of order delete/exec/modify messages, but the orderId is unrecognized and possibly pending retrans. Rebus doesn’t know about any pending retrans. What would rebus do about those messages?

A: I feel warehousing is the only choice. After some timeout, should clear the outdated/hopeless … memory footprint. Can use a FIFO cache.

Nsdq onsite IV

Q: Java NonBlocking IO?
%%A: one thread monitoring multiple sockets

Q: many simple Objects with complex relationships/interactions vs one (or very few) huge complex object?

Q: thread Futures? Any comments?

–some of my findings

  • microservices and containers are important
  • According to the architect, the new framework uses one dedicated instance of matching engine serving a single symbol such as AAPL.
  • the new java framework only supports a few smaller exchanges initially, but is designed to roll out to more “exchanges” esp. those to be installed on client sites (like CSP). It is (theoretically) designed to be capable enough to implement the main stock exchange.
  • is Mostly java + javascript (Amber, react) with small amount of c++
  • the new java framework was deliberately designed to operate in single-threaded mode, despite oppositions. Therefore, multiple “clients” calling the same library concurrently would be unsafe.
  • java8
  • noSQL is adopted in addition to PostgreSQL, but the principal architect is less concerned about data store, which is only used in individual components

SIG phone round

See retrans questions on retrans biz logic if I were CTO

Q: how do you extract the seq number from packet header?
A: reinterpret_cast to a pktHeaderStruct ptr, then read the field

Q: is reinterpret_cast safe?

Q: why is RBTree sometimes faster than hashtable for a small collection? See unordered_map^map performance: unpredictable

Q4: speed of std::sort() on a vector vs list::sort
%%A: probably vector due to cache efficiency
A: also the stability of list::sort comes at a cost

Q4b: how are the two sorting implemented?
%%A: possibly two variations of quicksort

Q: std::sort() vs qsort() in C. Same algorithm, but which is faster?
%%A: std::sort, due to functor inlining
AA: Item 46 in [[effectiveSTL]]

Q: given a csv file with a million rows, you use c/java/c#/py to (convert to number) sum up 2nd column and produce a single output number, which language is fastest?
%%A: main cost is file read, but let’s suppose entire file is loaded into memory.
%%A: I feel python bytecode (even optimized) will be slightly slower. Java (due to JIT) and C are likely faster.
A: now in hindsight I feel integer parsing will cost much more than arithmetics.

CVA c++IV 1

Q: when would you pass (in/out of functions) by ptr rather than non-const ref?
%%A: if the argument can be null, or void ptr
A: if I need to pass a double ptr
A: if I need to pass a raw array
%%A: A factory often returns by ptr or smart ptr, seldom by reference

Q: do you catch exception by value, ptr or reference?
A: slicing

Q: iterating vector by int index vs iterator… which is faster?
A: iterator wins since vec[i] needs a O(1) pointer arithmetic operation to locate the element

Coding question https://github.com/tiger40490/repo1/blob/cpp1/cpp/array/inSituFilter_CVA.cpp

mvea phone IV c++

Q: any experience with FIX?
Q: how many percent of your time is on support vs development?
Q: are you involved in requirement gathering and analysis?
Q: in your project, where do you use python vs c++?
Q: describe the reliability/resilience features in your NYSE system
%%A: retransmission; hot standby; database dump/restore;
%%A: after a mid-day restart, parser would request replay or snapshot to get up to date with the live feed

Q: What’s “pure virtual” vs virtual?
Q: what’s polymorphism
Q: why do you use smart pointers?
Q: what c++11 features did you use?
Q: what’s the “auto” keyword?
Q: j4 stored proc? See https://bintanvictor.wordpress.com/2007/08/01/j4stored-procedures-a-sybase-perspective/
Q: std::vector vs std::list
%%A: list offers efficient splice, insert/delete in the middle and front. Never reallocates.

Up to 700,000 orders received. Half of them are latency sensitive. System validates each order and generates a “principal order” to be sent out to all the stock exchanges in the U.S. (and sometimes Americas), in FIX. These principal trades provide perfect hedging to MS.

Each of the 700,00 orders look just like a regular cash equity order, except a flag to indicate it’s a OTC swap..

find median@2 sorted arrays #Trex

https://leetcode.com/problems/median-of-two-sorted-arrays/description/ is similar except X and Y can be unequal length. My solution solves the harder, generalized problem.

This “coding” question is really math problem. Once you work out the math techniques, the coding is simple.

–My own idea on the spot — find the median of X and median of Y. If med(X) < med(Y) then discard the lower portion of X i.e. the “XB group”, and higher portion of Y (“YA group”). Then repeat.

  • Note len(XB) == len(YA) == min(len(X), len(Y))/2 := K. So every iteration would shrink the shorter array by half (i.e. K), and shrink the longer array by K. K would drop in value in next iteration.
  • loop exit — When the shorter of the two (say it’s X) shrinks to length 1, we are lucky — find the numbers around median(Y) and adjust the answer based on X[0].

Insight — Why can’t the final “winner”be somewhere in XB group? Because XA + YA already constitute half the population, and all of them are higher.

I always like concrete examples. So Suppose there are 512 items in the lower portion “XB group”, and the higher portion “XA” has 512 items. Suppose there are 128 items each in YB and YA groups. So in this iteration, we discard YA and the lowest 128 items in XB.

Definition of lower portion —
* all lower items up to but not including med(X) If len(X) is odd
* exactly the lower half of X if len(X) is even

Trex QnA IV #std::forward,noexcept

Q: how would a tcp consumer-client know the server process died rather than a quiet server?

Q: how would a tcp producer-server know a client process died? Signal?

Q: What’s perfect forwarding?

Q: std::forward() vs std::move()? See std::move=unconditional-cast #forward=conditional-cast

Q1: in c++03, a myVectorOfTrade.push_back( Trade(22, 0.7) ) uses how many ctor/dtor invocations?
A: regular ctor, copy-ctor, dtor of the temporary

Q1b: how about c++11?
A: regular ctor, mv-ctor, dtor of temporary. See P293…. We assume there’s a Trade mv-ctor and there’a some pointer field in Trade, otherwise the mv-ctor has nothing to steal and is probably same as copy-ctor

Q1c: what about something like emplace_back(22, 0.7)
A: in-place ctor using placement-new. P294 [[eff modern c++]] perfect forwarding eliminates temporaries

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp is my illustration.

 

Q: how would “noexcept” improve runtime performance?
AA: P 91 [[effModernC++]] has a short paragraph explaining “noexcept” functions are more compiler-optimizable

Q: please implement a simple Stack class backed by a vector. Implement push(), pop(), top().

 

 

tcp: detect wire unplugged

In general for either producer or consumer, the only way to detect peer-crash is by probing (eg: keepalive, 1-byte probe, RTO…).

  • Receiver generally don’t probe and will remain oblivious.
  • Sender will always notice a missing Ack. After retrying, TCP module will give up and generate SIGPIPE.
send/recv buffers full buffer full then receiver-crash receiver-crash then sender has data to send receiver-crash amid active transmission
visible symptom 1-byte probe from sender triggers Ack containing AWS=0 The same Ack suddenly stops coming very first expected Ack doesn’t come Ack was coming in then suddenly stops coming
retrans by sender yes yes yes yes
SIGPIPE no probably yes probably

Q20: if a TCP producer process dies After transmission, what would the consumer get?
AA: nothing. See http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html — Receiver is ready to receive data, and has no idea that sender has crashed.
AA: Same answer on https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html

Q21: if a TCP producer process dies During transmission, what would the consumer get?
%A: ditto. Receive has to assume sender stopped.


Q30: if a TCP consumer process dies during a quiet period After transmission, what would the producer get?
AA: P49 [[tcp/ip sockets in C]] Sender doesn’t know right away. At the next send(), sender will get -1 as return value. In addition, SIGPIPE will also be delivered, unless configured otherwise.

Q30b: Is SIGPIPE generated immediately or after some retries?
AA: https://superuser.com/questions/911808/what-happens-to-tcp-connections-when-i-remove-the-ethernet-cable/911829 describes Ack and re-transmission. Sender will notice a missing Ack and RTO will kick in.
%A: I feel TCP module will Not give up prematurely. Sometimes a wire is quickly restored during ftp, without error. If wire remains unplugged it would look exactly like a peer crash.

Q31: if a TCP consumer dies During transmission, what would the producer get?
%A: see Q30.

Q32: if a TCP consumer process dies some time after buffer full, what would the producer get?
%A: probably similar to above, since sender would send a 1-byte probe to trigger a Ack. Not getting the Ack tells sender something. This probe is builtin and mandatory , but functionally similar to (the optional) TCP Keepalive feature

I never studied these topics but they are quite common.

Bbg QnA IV

Q: how would you write your own program to see if local system is big-endian or little-endian.

Q2: what design patterns do you use?
A: factory, template-method are used a lot in my code.
A: producer-consumer is an essential pattern in async designs
A: adapter — wrapper functions?
A: When needed, I use visitor as a big gun.

Q2b: you said template method is often used implicitly. Can you give some examples in STL?
A: would be hard since classic template method uses virtual functions but there are very few of them in STL. std::sort() fixes the procedure and let user specify the comparison, using template techniques instead of virtual

Q: how would you design multicast to a number of registered users?

 

notable linux system calls: QQ question

https://syscalls.kernelgrok.com can sort the functions by function id

http://asm.sourceforge.net/syscall.html is ordered by function id

  • fork()
  • waitpid()
  • open() close() read() write()
  • –socket
  • socket() connect() accept()
  • recvfrom() sendto()
  • shutdown() is for socket shutdown and is more granular than the generic close()
  • select()
  • epoll family
  • –memory
  • brk

elaborate python project/challenge #PWM

Key is “technical challenge”. Without it, I may look like just another python coder. Some of the interviewers may not be easy to impress. I will only target the majority.

Different companies might have different needs for python skill. I can only guess

  1. Usage: data analysis, data science
  2. Usage: automation, glue language
  3. …. py as primary language to implement business logic in an enterprise application? Only in Macquarie, baml and jpm
    1. I could remake my PWM Commissions system in python

For 2), I used

  • devops
  • integration with git, maven, g++, code generation, test result parsing, …. for devops
  • automated testing
  • simple network servers
  • subprocess
  • integration with shell script
  • automatic upload
  • generate c++ source code to be compiled into a native python extension
  • import hack
  • logging decorators — https://github.com/tiger40490/repo1/blob/py1/py/loggingDecorator.py
  • python code attached to trade object. Code to be evaluated at reval time.

celebrity search #MS ez

Q: write an algorithm to search for celebrities in a collection of people with the help of a blackbox predicate that can tell whether one person knows the other (bool knows(a, b)) which is an asymmetric relation. Celebrities are those who know no one but are known by everyone else.


Analysis: There can be 0 or 1 celebrity. There are exactly N*N relationships in the system, so brute force is O(N*N) but I propose a 2-pass O(N) solution:

  1. First pass: shrink the collection of Nodes A, B .. to N
    1. check a random pair. if A knows B, then remove A from list; else, remove B from list. So first step always shrinks list by one
    2. Now check any two) persons in remaining list. Check either direction. Will remove exactly one person.
    3. list will shrink to 1 person (Dan) after N-1 steps
  2. Second pass: check this Dan throughly. Dan must be known to everyone and must not know anyone. We end up with either no celebrity OR one celebrity in Dan.

Lesson learned — better use A/B/C or Alice/Bob/Cody instead of #1, #2, #3.

thesys mgr QnA IV

Q: why do you say look-up can be slower on hash table than RBTree?
A: hash collision; slow hashing;

Q: Given a fast parser but a slow disk writer, how do you persist all the messages received?
%%A: socket for IPC
%%A: shared mem is probably faster, even with synchronization cost
%%A: we can also use threads. Now I feel we should use coarse-grained concurrency. Each task is a big chunk of data. One of multiple consumers/writers pick up a chunk and spend a sizable amount of time writing to disk.

Q: what’s your concept of multiplexing?

Q: you notice packet dropping due to overcapacity. What would you do to improve throughput?

Q: what system calls have you used?

bbg cod`IV: %%clarifications

I had four technical discussions (before and after the wonderful lunch chat with Chitan and Neel). For three of them, I have a few echnical clarifications, as in a product design discussion.

  1. FX team:  in-place list shuffle — At home, I wrote and tested a splice() solution, but splice() works only on a linked list, which has roughly double the memory footprint of a vector of integers. If we forcibly apply some kind of splice on a vector, runtime cost of repeated shifts will be unacceptable. If there’s an in-place solution with O(1) space complexity and O(N) time complexity, then I have not found it.
  2. FX team: local static variables are zero-initialized by default, contrary to what we assumed in our discussion. This represents a bug in my solution. The extreme-value tests will indeed flush out this bug.
  3. Equities team: filters in screen —

At Akhil’s hint I sketched a conceptual design based on sorted map<priceValue, ticker>. I will call it a RBTree. This RBTree must be updated with each message, by erase/insert. Suppose we care only about filter latency, not about update latency. Or suppose we get relatively infrequent updates but many users perform applyAllFilters(). Then this RBTree can help. We will use lower_bound followed by backward iteration (for example, Price < 150). This is presumably faster than my simple solution of full hash table iteration. With both designs, we get the list of “survivor tickers” from this filter.

That’s the first filter in a screen. How about the second filter?  In my simple solution, I take the survivor list (of size N) from first apply_filter and then look up N times against filter2’s hash table (no full iteration). With the RBTree design, we have a choice. Let’s name the filter2’s RBTree as “tree2”.

  • if tree2.lower_bound() leaves very few (like 5) qualified tickers, but the survivor list from filter1 is large, then we should take each qualified ticker from tree2.lower_bound() and check if it is in filter1’s survivor list.
  • If tree2.lower_bound() leaves many (like 99999999) tickers to be checked, then we should drive from the survivor list. The tree2 won’t be used.
  • However, in a RBTree, how many nodes come before a given node is hard to count. This counting requires expensive iteration. so I feel the choice cannot be made at runtime.

As I told Akhil and Rajan, the RBTree vs my simple solution should be benchmarked in a realistic setting. Depending on the context, one will outperform the other.

private dtor restricts allocation to heap only

My test proves that merely declaring the a stack instance or global instance is illegal because compiler generates a (illegal) call to the private dtor!

Friend can also call dtor!

class PrivDtor{
  ~PrivDtor(){
        cout<<"dtor\n";
  }
public:
  static void destroy(PrivDtor* p){p->~PrivDtor();}
  static void del(PrivDtor* p){delete p;}
};

//PrivDtor globalInstance;  //won't compile
int main(){
  //PrivDtor stackInstance; //won't compile

  PrivDtor * p = new PrivDtor;
  //PrivDtor::del(p); // works!
  PrivDtor::destroy(p);  //works!
}

Wells IV #socket/c++/threading

This is one of the longest tech interviews and one of the most enriching 🙂 even though I don’t “like” all their questions — very academic/theoretical, text-book driven, too low-level to be practical, not testing practical zbs.

Q: Any consistently reliable technique to detect stale order in your book?

Q: what’s your frame size if your tcp/ucp buffers sizes are so large?

Q: experience storing ticks?

—— C++ questions:
Q1: how can you avoid the cost of virtual function?
%%A: enum/switch or CRTP

Q1b: what’s CRTP

Q: what if your copy ctor signature is missing the “const”?
%%A: you can’t pass in temporaries or literal values. Such arguments must pass into “const &” parameter or “const” parameter. Correct.

Q: double delete?

Q: policy class? traits class?

Q: STL binders.. use case?

Q: how many types of smart pointers do you use?

Q: difference between java generics vs c++ templates?
%%A: type erasure. Java Compiler enforces many rules, but bytecode saves no info about the type argument, so we can get strange runtime errors.
%%A: template specialization. Template meta-programming
%%A (accepted): code bloat since each template instantiation is a separate chunk of code in the object file and in memory.
A: I have a dedicated blog post on this.

Q: what’s returned by a std::queue’s dequeue operation when it’s empty?
AA: undefined behavior, so we must check empty() before attempting dequeue. I believe ditto for a std::stack

Q: why template specialization?
%%A: customize behavior for this particular vector since the standard implementation is not suitable.

Q: how do you implement a thread-safe c++singleton
A: not obvious. See concurrent lazy singleton using static-local var

Q12: in a simple function I have
vector v1 = {“a”, “b”}; vector v2 = v1; cout<<….
What happens to the ctor, dtor etc?
A: 2 std::strings constructed on heap, vector constructed on stack; 2nd vector copy-constructed on stack; 2 new strings constructed on heap; vector destructors deletes all four strings
A: the actual char array is allocated only once for each actual value, due to reference counting in the current std::string implementation.

Q12b: you mean it’s interned?

Coding question: implement void remove_spaces(char * s) //modify the s char array in place. See %%efficient removeSpaces(char*) #Wells

—— threading, mostly in java
Q: What are the problems of CAS solutions?
A: too many retries. CAS is optimistic, but if there are too many concurrent writes, then the assumption is invalid.
%%A: missed update? Not a common issue so far.

%%Q: Synchronized keyword usage inside a static method?
AA: you need be explicit about the target object, like synchronized(MyClass.class)

Q21: Name 3 effects of java volatile keyword — Advanced. See 3effects@volatile ] java5.

Q21b: analyze the double-checking singleton implementation.
staticInst = new Student(); // inside synchronized block, this can assign an incomplete object’s address to staticInst variable, which will be visible to an unsynchronized reader. Solution – declare staticInst as volatile static field

—— System programming (ANSI-C) questions
Q: have you used kernel bypass?

Q5: how do you construct a Student object in a memory-mapped-file?
%%A: placement new

Q5b: what if we close the file and map it again in the same process?
%%A: the ptr I got from malloc is now a dangling ptr. Content will be intact, as in Quest

Q5c: what if Student has virtual functions and has a vptr.
%%A: I see no problem.

Q: you mentioned non-blocking TCP send(), so when your send fails, what can you do?
%%A: retry after completing some other tasks.

Q4: why is UDP faster than TCP
%%A: no buffering; smaller envelopes; no initial handshake; no ack;
%%A: perhaps most important for market data — multiple recipient with TCP is very heavy on sender and on network

Q4b: why buffering in tcp?
%%A: resend

Q: can tcp client use bind()?
%%A: bind to a specific client port, rather than a random port

Q6: is there socket buffer overflow in TCP?
A: probably not

Q6b: what if I try to send a big file by TCP, and the receiver’s buffer is full? Will sender care about it? Is that reliable transmission?
A: Aquis sends 100MB file. See no overflow]TCP slow receiver #non-blocking sender

Q: in your ticking risk system, how does the system get notified with new market data?
%%A: we use polling instead of notification. The use case doesn’t require notification.

Q: any network engineering techniques?

Q: what kernel parameters did you tune?

 

—— Stupid best-practice questions:
Q: what’s the benefit of IOC?

Q: Fitnesse, mock objects

Q6: motivation of factory pattern?
Q6b: why prevent others calling your new()?
%%A: there are too many input parameters to my ctor and I want to provide users a simplified façade
%%A: some objects are expensive to construct (DbConnection) and I need tight control.
%%A: similarly, after construction, I have some initialization in my factory, but I may not be allowed to modify the ctor, or our design doesn’t favor doing such things in the ctor. I make my factory users’ life easier if they don’t call new() directly.
%%A: I want to centralize the logic of how to decide what to return. Code reuse rather than duplication
%%A: I can have some control over the construction. I could implement some loose singleton

Thesys QnA IV

Q: what’s the average O() complexity of insertion sort
AA: N^2 in most implementations — comparison-based, but using contiguous memory. NlogN if using skip list

Q: what’s the average O() complexity of heap sort
AA: NlogN, since it’s comparison-based.

Q: what are the synchronization primitives on the operating systems you know

Q: how many levels of caches are in a typical Intel processor
%%A: at least 3 possibly 4.
AA: some intel processors do have 4 levels

Q: name some IPC mechanisms
%%A: 1) shared mem 2) sockets 3) pipes

–2nd round (Don Bush?)

Q: Requirement: insert into a std::map only if key is absent
AA: the insert() function does that by default! Returns a pair<Iterator-where-inserted, bool-isInserted>. I used this same technique in my handleInputInternal()

Q: if my class Student has a private dtor, will it compile?
%%A: yes it’s possible to define a member function to invoke the dtor on a pointer passed in
%%A: deleting a ptr-to-Student won’t compile. Correct!
AA: have a friend call the dtor!

Q: Any guess why STL stack designer didn’t combine top() and void pop() like in other languages? hint: what if the copy ctor throws?
%%A: it’s more efficient if a function doesn’t need to return anything, so we don’t want to force the pop() users to pay a price for something they don’t need.

Q: do you create your own templates?
A: I should name the SFINAE hack!

Q: any restriction on a non-type template parameter’s type?
%%A: int and bool are common. Pointers are allowed. https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.2.0/com.ibm.zos.v2r2.cbclx01/non-type_template_parameters.htm shows even arrays and ptr-to-functions are allowed.
AA: has to be a discrete type, not a float

%%Q: what’s the usage of this non-type parameter?
A: std::array uses it for the size. It can’t be a ctor argument since if you construct an array on stack the size must be known at compile time.

Q: can you use new-arrays with shared_ptr? I think he wanted vector<shared_ptr>
%%A: if I have a new-array to store somewhere, then I think shared_array or scoped_array
A: how about shared_ptr<vector>?

Q: what if ctor failes in Foo * p = new Foo() // will memory leak?
%%A: i know it’s common practice to throw from ctor…

Q: name some c++11 features you used in your recent projects
%%A: unique_ptr, shared_ptr, auto, unordered_map, initializer list for vector/map

Nomura C++quant IV

Q: vector.at() vs operaotr[]?
%%A: operator[] throws exception only if you specify a compiler flag like D_GLIBCXX_DEBUG. I think this is correct

Q: given a vector or ints, how would you sort it by abs value?
%%A: free function
%%A: subclass “less” with operator(). body is same as the free function

Q: clustered index vs non-clustered?

Q: inputs to price an equity option?
%%A: 5 inputs: K, S, T, r, dividend rate. I missed “implied volatility”, which is a soft market data derived from a raw market data.

Q: inputs to a mortgage prepayment model (or callable bond)? ….
%%A: Black model to price a callable bond. Inputs include yield, maturity, interest rate, but there’s no strike per-se? I think the implied volatility (in which random process?) is the most important factor and needs to be backed out from some market data

Q: how did you construct your yield curves at Barcap?

Q: what’s local vol
A: the instantaneous volatility is a deterministic function of 2 inputs. Deterministic because there’s no dB … See other blog posts.

Q: zero-volatility spread?

Q: how do you compute OAS?
%%A: I think OAS is a curve.

Q: how about dv01?

Q: duration of a 10Y T-bond vs 10Y muni bond? I answered with the (more “visual”) Macaulay duration, which is slightly different from the (most useful) modified duration

TradeWeb c++IV

100% QnA interview. No coding no white-board no ECT no BP no algo no data structure.

System is Multi-threaded VC++/MSSQL.

One technical interviewer only. (The other 2 are behavior interviewers.) He is a director and looks very technical. I have met perhaps 3 to 5 interviewers focused on the fundamentals of a language/compiler. They pick one of 10 important parts of a language (like java, c++ or c#) and try to determine how well a candidate understands the key details and the rationales.

Below, Q’s are questions from this interviewer. A’s are my answers/guesses.

Q1: let’s talk about non-virtual dtor. You said there are problems in deleting a subclass instance via a base class pointer. Show an example.
A: Eg: subclass holds some resources to be released in its dtor, but in our scenario only the base dtor is called. Subclass dtor is not invoked.

Q1b: if subclass dtor is invoked, does it always run the base dtor?
%%A: Yes guaranteed. See my blog.

Q1c: if a subclass dtor is virtual and you delete an instance via a subclass ptr, you said this is good and subclass dtor will run, so is the “virtual” unnecessary? What’s the difference virtual dtor  vs non-virtual dtor?
A: The dtor is still picked at run time (and slower), but yes in both cases the subclass dtor runs.  If your system is written such that you delete a subclass instance only via a subclass ptr and never superclass ptr, then drop the “virtual” to improve runtime performance.

Q3: Let’s talk about a static member function sf2() in class Base. If you have a reference to a Base instance, Can you call myInstance.sf2()?
A: yes

Q3b: if Base and Der both have a static method sf2()?
A: subclass sf2 tends to hide Base sf2. Static type of myInstance variable is used to determine which version is used.

Q3c: how about myPtr->sf2()? Obscure details that I don’t want to spend too much time on.
%%A: looks very odd, but if it compiles, then it’s static, never dynamic, binding.

Q4: when must you use the pointer “this”? Obscure details that I don’t want to spend too much time on.
%%A: ctor chaining? Actually chaining is supported in c++11 but not using “this”.
A: if there’s a field and a local variable of the same name, I use “this->” to disambiguate. CORRECT:) See
https://stackoverflow.com/questions/22832001/access-member-field-with-same-name-as-local-variable-or-argument
A: in my ctor (or method), i may want to print my address. Yes you can usually do that in the caller afterwards but not always
A (hindsight): Similarly, in my ctor (or method) I may want to conditionally save my address in some container. The conditional logic is not doable after the ctor.
A (hindsight): if I know the host object is in an array, I could do array arithmetic using “this” with precaution.
A: delete this. I have seen people doing it.
A (hindsight): CRTP
A (hindsight): if parent class has a const method m1() and non-const overload m1(), you can cast “this” to call either of them.
A (hindsight): what if inside a method you need to access a parent object field hidden by a host object field? Can you cast “this”? Yes tested but not a “must” use case. Alternative solution is B::nonStaticField
AA: sometimes you must return *this where the return type is HostClass&

Q4b: can you think of an example of using “this” to avoid access violation?

Q4h: why use ctor chaining even if it’s supported? Why not call a common (non-static) setter method s2() from multiple ctors of the same class?
A: that setter has to be non-static!

Q4i: You think ctor should not call a non-static method?
A: it can, but I don’t do it. The host object is half-initialized, so the setter method must strictly perform field initialization and nothing else.

Q5: why do people use class templates? Note — Interviewer didn’t ask function templates.
%%A: avoid the cost of virtual functions

Q5b: Let’s look at combining class hierarchy with templates.
A: not a good idea to combine them. STL has probably no virtual functions.

Q5c: Let’s see. You said you have a bunch of “payload” classes and you want to use a container of PL1 and container of PL2. Ok You said that’s a common use case of class templates unrelated to virtual functions. Suppose the container template requires each payload class to implement a method f2(). If f2 is provided by a base class, wouldn’t it be easier (can be virtual or non-virtual)?
A: If there’s a family of payload classes, like Shape, Rectangle, Square… then I feel it’s best practice to use a template of smart pointers. I think it reduces the risk of slicing.
A: if the payload classes don’t form a family hierarchy, then there is probably some template meta-programming technique to provide a default implementation of f2(). I’m no expert on template meta-programming.

3groupsOf3digits #YiHai

Q: Find three 3-digit numbers with a ratio of 1:2:3;
These numbers pick their 3 digits from a range of 1 to 9;
All digits that form these numbers must be completely unique, i.e. you must use up all distinct 9 digits. For example, a set of 123:246:369 doesn’t qualify.

def failed_to_add(myset, digit): # return False when failed
    return digit == 0 or len(myset) == ( myset.add(digit) or len(myset))

def other_number_failed(myset, num): # returns False by default if everything good
    for digit in(num/100, num/10%10, num%10):
        if failed_to_add(myset, digit):     return True

def check(number1, myset):
    number2 = 2*number1
    if other_number_failed(myset, number2): return 
    number3 = 3*number1
    if other_number_failed(myset, number3): return 
    print number1, number2, number3, myset
    winners.append([number1, number2, number3])

winners=list()
nine_digits = tuple(range(1,10))
for digit1 in (1,2,3):
    last8 = list(nine_digits)
    last8.remove(digit1)
    for digit2 in last8:
        last7 = list(last8)
        last7.remove(digit2)
        for digit3 in last7:
            number1 = digit1*100 + digit2*10 + digit3
            if number1 > 329: break
            check(number1, {digit1, digit2, digit3})
print "-- lucky winners --", winners

ref-counted copy-on-write string #MIAX exchange IV

I completely failed this 2011 IV question from MIAX options exchange:

Q: outline a ref-counted copy-on-write string class, showing all the function declarations
A: here’s my 2017 answer

struct Payload{ //this object must live outside any Str instance. If 1st Str instance in the group is destroyed, this object must live on.
	char * const arr;
	size_t const length;
	mutable size_t refCount;
	Payload(std::string const &);
};
class Str{
	Payload const * payload;
public:
	~Str(); //decrement refCount and if needed, delete the payload
	//Str(); //default ctor is useless since after construction we can't really modify this instance, due to copy-on-write
	Str(std::string const &);
	Str(Str const &); // similar to shared_ptr

	Str & Operator=(Str const & other) const; // will return a reference to a new instance constructed on heap (never on stack!). The new instance will have a ref-count initialized to 1
	Str & replace_with(std::string const &) const; //ditto

// optional utilities
	char const * c_str()    const; // <-- Hey mine is very similar to std::string
	Str(char const * arr, size_t const len); // <-- Hey mine looks similar to std::string
	friend ostream & operator<<(ostream &, Str const &);
};

java GC frequency and duration

Actually, GC overhead is more important than GC frequency or duration, except in low latency systems. This blog has many posts on overhead.

–frq

Could be Every 10 sec , as documented in my blog

–stop-the-world duration: (Concurrent collection probably doesn’t worry us as much.)

100 msec duration is probably good enough for most apps but too long for latency sensitive apps, according to my blog.

 

[11] threading,boost IV #Sapient perhaps

Read-guard, write-guard ?

Q: how to test if current thread already holds a target lock?
%%A: Though there might be platform-specific tricks, I think it’s not widely supported. Best to use recursive mutex to bypass the problem.
AA: pthread_self() is a C function (=> non-method) that returns the thread id of the current thread.
AA: https://blogs.msdn.microsoft.com/oldnewthing/20130712-00/?p=3823 has a microsoft specific solution
AA: using this thread::id object a lock can “remember” which thread is holding it.  https://stackoverflow.com/questions/3483094/is-it-possible-to-determine-the-thread-holding-a-mutex shows a gdb technique, not available at runtime.

Q: Is scoped_lock reentrant by default? Can you make it reentrant?
AA: With a boost recursive mutex, a single thread may lock the same mutex several times and must unlock the mutex the **same-number-of-times**.

AA: pthreads supports it. http://www.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html says — If the mutex type is PTHREAD_MUTEX_RECURSIVE and the mutex is currently owned by the calling thread, the mutex lock count shall be incremented by one and the pthread_mutex_trylock() function shall immediately return success.

Q: Boost supports try_lock()?
AA: yes

Q: Boost supports lockInterruptibly()?
A: probably not, but there’s timed_lock

Q: Boost supports reader/writer locks?
AA: read/write lock in the form of boost::shared_mutex

Q: difference between mutex and condition var?

Q: can you write code to implement “recursive” locking with reader/writer threads?
A: key technique is a bool isHoldingLock(targetlock)

Q: what types of smart pointers did you use and when. What are the differences? I’d like to focus on the fundamentals not the superstructures.

Q: what other boost lib?
%%A: I briefly used boost bind. I believe STL binders are imperfect, and lambdas are superior

Q2: if I want a custom class as a map key?
%%A: must overload the less-than operator

Q2c: what if I can’t change the source code?
%%A: derive from binary_function to get a functor, and pass it to map constructor as the optional type param. However, the new functor only has “operator()” overloaded? I think it will still work, as the function doesn’t have to be named “less()”.
AA: Item 42 [[eff STL]]

%%Q2g: is operator less-than a friend func or a method?
AA: can be a non-friend(if everything public) or a friend class/func.

PIMCO(accrual account`)c++/java/sql IV

Q: (no good answer) You said users sometimes don’t know what they are requesting so their requirements actually don’t make sense. We as developers need good knowledge to discuss with them and solve the problem, not just take their instructions. Give an example where you challenged a user to understand better what she wants and then proposed a solution that better solves her problem.

Q: bond with sinking fund provision?

Q: suppose you bought at $104.5 with the $4.5 premium (or discount), how do you amortize that amount over the remaining 23 years of the bond? I guess the book value changes over time unlike stocks.

Q: compare a stock and a bond on the same company
%%A: coupon payments follow a predetermined schedule, not discretionary
%%A: bond has an end of life
%%A: future cashflow of the bond is known in advance
%%A: creditor vs partial owner, in the event of bankruptcy
A: bonds (esp. long-duration bonds) are more sensitive to interest rate changes

Q: what’s a bullet bond?
AA: Most bonds repay the principal as a single payment at the end of its maturity, which is known as a bullet payment. For this reason, conventional bonds are also known as bullet bonds. Amortizing bonds, on the other hand, repay the principal over a series of payments rather than all at once on the maturity date, so some or all of the periodic payments will consist of both interest and principal.

Q: how would you describe OO to a procedural programming student
%%A: use a real yet simple example like students/projects/grades/classes/labs/semesters/events/teams to illustrate the essential OO features

Q: how would you describe the spring framework?

q: java vs c++ for a concurrency system like a basic consumer/producer design?

q: java vs c++ for hadoop

%%Q: could I say std::transform() is a swiss army knife?

Q: a SAM interface in c++? Makes sense?

q: describe factory method pattern in c++?

q: bridge pattern implemented in pimpl?

q: can you use scoped_ptr or unique_ptr as class fields?
%%a: scoped doesn’t make sense
AA: wrong interpretation of “scope”. It can be a class field!

q: enable_shared_from_this ? I guess this has to do with the common scenario of two shared_ptr “clubs” each thinking it has sole ownership of a raw pointer

q: how do you declare an interface in c++?
%%A: no field. All methods are pure virtual. No ctor but dtor should be virtual

q: if you don’t declare a copy ctor what does the default copy ctor do?
%%A: bit-wise copy

q: can move ctor be synthesized if you don’t declare one?
%%A: some times it can, but I don’t think it’s always possible.
AA: Correct. for some classes, the standard forbids synthesized move ctor

q: what’s martingale
%%a: a process whose expected value at any future time is equal to the last realized or observed value. No drift.

q: what’s hitting time
%%a: expected time the process will hit a certain level

q: what if my join query produces duplicates, and those tables I join are not my table so I can’t redesign them?
%%a: use distinct or order by.
A: now I think it can happen if you select nothing but “gender”

q: what access modifiers can apply to a java interface?
%%a: private and protected doesn’t make sense. If it’s only used within a package and should not be exposed outside, then the default access may do.
AA: correct

RTS c++IV

One of the less challenging (but not easy) c++ interviews. No coding test.

  • Q: what editors do you use in your current c++ project
  • Q: function returning pointer to a local object — spot the bug and to fix it.
  • Q: implement a simple singleton class.
  • Q: something about overflow in integer data types
  • Q: non-virtual member function inherited by subclass but problematic when you call it on a subclass instance. It can’t access the subclass field
  • Q: spot the memory leak
  • Q: swap 2 integer variables without using a temp variable.
  • Q: implement sqrt(int) without using the math library.

c++base class method accessing subclass field #RTS IV

Surprisingly, the non-virtual base class method can Never know that it’s actually operating within a subclass instance. It always behaves as a strictly-baseclass method and can’t access subclass data. The non-virtual method getId() is compiled into base class binary, so it only knows the fields of the base class. When a subclass adds a field of the same name, it is not accessible to the getId(). A few workarounds:

  1. curiously recurring template pattern
  2. virtual function in subclass
  3. down cast the pointer and directly access the field
#include <iostream>
using namespace std;
struct B {
	char id = 'B';
	virtual ~B() {}
	char getId() {
		return id;
	}
} b;
struct C: public B{
	char id = 'C';
} c;
int main() {
	B* ptr = new C();
	C* ptr2 = dynamic_cast<C*>(ptr);

	cout << "base class nonref: " << b.getId() << endl;
	cout << "sub class nonref: " << c.getId() << endl; //still base class method. 
	// The fact that host object is subclass doesn't matter.

	cout << "base class ptr to subclass instance: " << ptr->getId() << endl; 
	cout << "downcast ptr non-virtual method: " << ptr2->getId() << endl; //B
	cout << "downcast ptr direct access: " << ptr2->id << endl; //C
	return 0;
}

 

UBS java IV 2017

Q: How could you get short theta?

Q: For a double-linked list, what if the prev pointer field is final?

Q1b: Can you construct such a list?

Q1c: What operation can you perform on it?

Q: How do you tell if a java app (or part thereof) is live-locked?

Q: How do you know if a jvm is short of memory, without seeing OOM error?

%%A: GC log would show much higher activity than the “baseline”.

Q: if a java app has hundreds of threads but most of them are not doing work, then what thread states would they have?

%%A: they must be blocked

pimco java IV onsite

Hint 81: fib(a,b, N) can be implemented as fib(b, a+b, N-1). For example, 8th Fibonacci number would be fib(0,1, 8), which would call fib(1,1,7), which would call fib(1,2,6)… This is related
to http://stackoverflow.com/questions/310974/what-is-tail-call-optimization

Q: can CMS ever stop the world?
AA: yes

Consider Fibonacci sequence. Write a function fib(N) that returns the N’th number in the sequence. fib(0) := 0, fib(1) :=1, fib(N) := fib(N-2) + fib(N-1).

Q1a: write a recursive version. What’s the time complexity and space complexity?
Q1b: write a non-recursive version. What’s the time complexity and space complexity?
Q1c: write a non-recursive version with space complexity of O(1) Q1d: write a tail-recursive version with time complexity O(N) and space complexity O(1). If no clue, look for the hint hidden nearby:)

Q2: Consider a simplified shared shopping cart. It contains any number of orders. Each order has {id, productName, quantity}. Only id is immutable; other fields are writable by multiple threads. Implement the following methods without locking (you could use atomic or other solutions to ensure thread safety):

int addOrder(productName, quantity);
boolean modifyOrder(id, productName, quantity); // two threads could modify the same order concurrently.
[Optional feature] boolean deleteOrder(id);
[Optional feature] boolean isOrderCompleted(id);

To keep requirements simple, It’s tolerable to have two orders both for the same productName.

Q3: At a high level, list all the common approaches to thread safety %%A: Read-Copy-Update; thread local; immutable objects; copy-on-write; mutex; CAS

Q4: how do you implement an immutable class with only primitive fields? %%A: make all fields private; provide no setter

Q4b: the fields need to be final?
%%A: I think unnecessary.

Q4c: class should be final?
A: I said yes but Now I think subclassing can be allowed since my fields will not be writable by subclasses. For example, a ImmutablePerson{SSN, DOB} can be sub-classed by Student{nationality, department}

Q5: what kind of classes can be a key in a map?
%%A: I feel the relevant fields should be primitive, string and enums only. Any other field I want to be careful about.
%%A: key class is effectively immutable — after the key goes into a map, the relevant fields should not change because the position of the key will not be updated.

Q6: for an overnight batch program, what GC algorithm is good for throughput?
%%A: long pauses are tolerable.
AA: Use parallel GC. See my blog
post https://bintanvictor.wordpress.com/2017/04/05/two-gc-algos-for-latencyt hroughput/

Q7: how frequent is the minor GC in your app?

Q8: how do you deal with OOM in production?
%%A: use jconsole, or somehow print a memory “report” to the log.

Q10: compare vector and linked list data structures
%%A: vector is random-access. Appending at the end is usually fast but can cause reallocation; linked list can insert anywhere but each node is too big.

Vector takes smaller space and is more cache-friendly.
See https://bintanvictor.wordpress.com/2017/04/05/contiguous-memory-data-str uctures-page-faultscache-miss/

BGC IV – c++ and Java

Q: how do you mark a memory region as always resident?
A: See mlock() : prevent paging ] real time apps

—socket
Q: is select() blocking or non-blocking?
A: Yes select() takes a timeout argument to specify the blocking interval!
A: More interestingly, https://stackoverflow.com/questions/5351994/will-read-ever-block-after-select/5352634#5352634 says after select(), you should use read() which normally won’t block.

Q: server uses socket() listen() bind() accept(). How about client side?

Q1: if a producer thread uses select to send packets to 22 receivers via 22 tcp sockets and a single receiver has very low capacity, what would happen? See tcp: one of 3 client-receivers is too slow

Q1b: select() would show that one sender socket as writable or failed?

Q1c: what if producer keeps sending? What happens next?
%%A: the producers function stack will get an error code.  Wrong!

Q1d: what if UDP rather than TCP
%%A: then no error would occur in the producer system. The slow consumer would be left unnoticed.  There’s no flow-control in UDP

%%Q: can a socket specify a wild card for its local port?
%%A: no. Wildcard is only for local IP
—IPC
Q: fastest IPC method between producer and consumer? OK you said shared memory, so how do you synchronize the producer and consumer?
%%A: use a named sys-V semaphore in the kernel
A: See boos::interprocess documentation. This is a common requirement addressed in boost.

Q: how does the producer/consumer actually use the shared memory?
%%A: memory mapped file

—brain teasers:
Q: Only a 5L and a 3L pail and a running tap. How to get exactly 4L water in the 5L pail?

Q: you are given a bag of coins (like 1c 1c 1c 5c 5c 10c 25c). Tell me whats the smallest exact payment thats impossible. Every payment must be exact. Hint: sort the coins first, then a clever O(1) single-pass will do.
AA: coin problem #all large-enough amounts are decomposable

Q: given a list of integers, find if any pair adds up to 18
%%A: in one pass, build a hashset of brown items i.e. every item processed so far. For every new (i.e green) item, compute the partner, and look for it in the hashset. If not there, then add the green item as a brown item.
AA: https://bintanvictor.wordpress.com/2015/07/21/locate-a-pair-whose-diff55/ is a clever solution.

—whiteboard coding challenge
Q: given a singly linked list with a start node and null-end node, print each node in reverse, without using any “new” implicitly/explicitly
A: reverse the list first (tail recursion or iteratively), then print normally.

Q: Given an arbitrary directed graph where each parent node has 0 to K children, write a function hasCycle() to check for existence of cycle, visiting each node and each edge once only..
%%A: My verbal algorithm — use breadth-first traversal to trace every branch from root to a leaf node. Detect cycle in each branch. But I wonder whether two parent nodes can have the same descendant node.

Whenever we branch out to 2 child branches, duplicate the parent branch, so that each child branch object has a full path from root.

Each branch object could be a (possibly linked[1]) hashset.
[1] for instrumentation.

Q: given a blackbox utility function String convert(String), write a parallelized Collection parellelConvert(Collection theSequence)
Requirement: theSequence need to be maintained. Size of input is preserved Requirement: Out of 100 items in theSequence, #3 and #5 might be identical, but they still need to show up as “converted” in the output
Requirement: converter uses a very slow and expensive remote system, so we don’t want to send it the same string twice.

—-
%%Q: is countdown latch and join() implemented using wait/notify? %%Q: readResolve vs readObject
%%Q: can CMS ever stop the world?
%%Q: CMS is used in which region of the heap?
Q: what if you suspect theres leak?
Q: what can cause mem leak in java?
%%A: registered listeners; static variables pointing to a collection

Q: why is swing memory footprint so bad?
Q: array blocking queue — when would it block?
%%A: I think a circular array is the underlying.

pimco java IV#Zoltan

Hint 81: fib(a,b, N) can be implemented as fib(b, a+b, N-1). For example, 8th Fibonacci number would be fib(0,1, 8), which would call fib(1,1,7), which would call fib(1,2,6)… This is related
to http://stackoverflow.com/questions/310974/what-is-tail-call-optimization

Q: can CMS ever stop the world?

Consider Fibonacci sequence. Write a function fib(N) that returns the N’th number in the sequence. fib(0) := 0, fib(1) :=1, fib(N) := fib(N-2) + fib(N-1).

Q1a: write a recursive version. What’s the time complexity and space complexity?
Q1b: write a non-recursive version. What’s the time complexity and space complexity?
Q1c: write a non-recursive version with space complexity of O(1) Q1d: write a tail-recursive version with time complexity O(N) and space complexity O(1). If no clue, look for the hint hidden nearby:)

Q2: Consider a simplified shared shopping cart. It contains any number of orders. Each order has {id, productName, quantity}. Only id is immutable; other fields are writable by multiple threads. Implement the following methods without locking (you could use atomic or other solutions to ensure thread safety):

int addOrder(productName, quantity);
boolean modifyOrder(id, productName, quantity); // two threads could modify the same order concurrently.

[Optional feature] boolean deleteOrder(id);
[Optional feature] boolean isOrderCompleted(id);

To keep requirements simple, It’s tolerable to have two orders both for the same productName.

Q3: At a high level, list all the common approaches to thread safety %%A: CAS; mutex; Read-Copy-Update; immutable objects; copy-on-write; thread local

Q4: how do you implement an immutable class with only primitive fields? %%A: make all fields private; provide no setter

Q4b: the fields need to be final?
%%A: I think unnecessary.

Q4c: class should be final?
A: I said yes but Now I think subclassing can be allowed since my fields will not be writable by subclasses. For example, a ImmutablePerson{SSN, DOB} can be sub-classed by Student{nationality, department}

Q5: what kind of classes can be a key in a map?
%%A: I feel the relevant fields should be primitive, string and enums only. Any other field I want to be careful about.
%%A: key class is effectively immutable — after the key goes into a map, the relevant fields should not change because the position of the key will not be updated.

Q6: for an overnight batch program, what GC algorithm is good for throughput?
%%A: long pauses are tolerable.
AA: Use parallel GC. See my blog
post https://bintanvictor.wordpress.com/2017/04/05/two-gc-algos-for-latencyt hroughput/

Q7: how frequent is the minor GC in your app?

Q8: how do you deal with OOM in production?
%%A: use jconsole, or somehow print a memory “report” to the log.

Q10: compare vector and linked list data structures
%%A: vector is random-access. Appending at the end is usually fast but can cause reallocation; linked list can insert anywhere but each node is too big.

Vector takes smaller space and is more cache-friendly.
See https://bintanvictor.wordpress.com/2017/04/05/contiguous-memory-data-str uctures-page-faultscache-miss/

## how might jvm surpass c++]latency #MS

A Shanghai Morgan Stanley interviewer asked in a 2017 java interview.

One hypothesis — no free() or delete() in java, so the memory manager doesn’t need to handle reclaiming and reusing the memory.  [[optimizedC++]] P333 confirmed the c++ mem mgr need that. Instead the GC uses a very different algorithm — see below.

One hypothesis — after a warm-up period, based on heuristics JIT compiler could aggressively compile bytecode into machine code with speedy shortcuts for the “normal” code path + special code path to handle the abnormal conditions. Here’s an analogy — if every borrower seen so far has acceptable credit score, the bank may simplify credit check and have special procedure to deal with defaults.  For most of the cases this work flow is faster than the traditional. P76 [[java performance]] gave an example. A java method calls obj1.equals(obj2). Based on heuristics, JIT knows that all previous calls have obj1 being a String object, so JIT can generate assembly code to

  1. call String.equals() and go ahead to read some field of the String object obj1
  2. if no such field, then obj1 is not String, then backtrack and use obj1 vtable to look up the virtual method obj1.equals()

It may turn out that  99.9% of the time we can skip the time-consuming Step 2.

http://www.javaworld.com/article/2076593/performance-tests-show-java-as-fast-as-c&#8211;.html is a 1998 research.

In my GS-PWM days, a colleague circulated a publication that java could match C in performance, but they didn’t say “surpass”

https://stackoverflow.com/questions/1984856/java-runtime-performance-vs-native-c-c-code is not a published expert but he says:

On average, a garbage collector is far faster than manual memory management, for many reasons:

  • on a managed heap, dynamic allocations can be done much faster than the classic heap
  • shared ownership can be handled with negligible amortized cost, where in a native language you’d have to use reference counting which is awfully expensive
  • in some (possibly rare and contrived) cases, object destruction is vastly simplified as well (Most Java objects can be reclaimed just by GC’ing the memory block. In C++ destructors must always be executed)

earliest java IV@2017 #MS+HSBC

  • Q: When is JIT compiled code performance higher than c++? See separate blog
  • Q: difference between JVM stack vs native stack?
  • Q: ThreadLocal internal implementation?
  • Q: data structures with concurrent modification notifications — how is it implemented?
  • — IPC between processes (language-neutral) —
  • Q: how is shared memory managed?
  • Q: messaging uses sockets and has high overhead. What other solutions can maintain FIFO?
  • %%A: nothing new. The earliest MOM has dealt with this problem long ago. Perhaps multiple files with single producer and single consumer would be ideal. The 2 processes need to operate on both ends of the file. (There could be some kernel support for this.) See https://coderanch.com/t/278842/java/Reading-writing-concurrent-threads-file
  • Q: what kind of jdk locks have you used?
  • %%A: readwrite lock, reentrant lock
  • Q: How would you size your thread pool, based on processor count?
  • Q: For a market data gateway, when would additional threads help (and when would they be useless or counterproductive)?
  • %%A: I/O bound, the processors could be 99% idle. More threads would increase the utilization rate. Ideal is simultaneous saturation.
  • Q: thread cancellation without using Futrues?
  • Q: default methods in interface — is it a breaking change? http://stackoverflow.com/questions/22618493/does-introducing-a-default-method-to-an-interface-really-preserve-back-compatibi has a concise answer
  • Q: how is lambda implemented in java 8? See my separate blog post.
  • Q: Singleton pattern — what issues do you know?

c++SCB eFX IV#Dmitry

100% QQ type, as defined in https://bintanvictor.wordpress.com/2017/02/15/qqzz-mutual-exclusion-cjava/.I feel many are micro optimizations with questionable improvement. I wonder how much value such obscure knowledge adds to the team.

Q: Scanning a vector of int (like finding the average or max). Forward iteration vs backward iteration, which one could be faster, considering all possible compiler optimizations.

%%A: forward. Memory read into cpu cache will be in chunks, not one element at a time. Easy for forward iteration. Not sure about backward.

Q: Which one could be fastest:

void f(double arg){…..}
void f(double & arg){….}

%%A: inlining for first but not 2nd?
A: See http://stackoverflow.com/questions/722257/should-i-take-arguments-to-inline-functions-by-reference-or-value esp. the long answer.

Q: Thr1 and Thr2 on 2 CPU’s both update an object s, having 2 fields. Thr1 only updates s.field1. Thr2 only updates s.field2. No interference. No synchronization required. We observe the performance is slower than using one thread to update both fields. Any explanation?
%%A: caching in cpu

Q: weak_ptr justification, when we have shared_ptr already? I feel [[effModernC++]] has a good chapter on it.

Ashish pointed out in some apps, you could identify a clear risk of circular dependency. Replace with weak_ptr.

Q: given an 2D array arr[10][5], how do you use pointer arithmetic to hit arr[1][5]

A: Contiguous. see http://stackoverflow.com/questions/7784758/c-c-multidimensional-array-internals. Note this is different from an array of pointers.

Q: what would you see if a TCP socket server has a full queue
%%A: TCP requires handshake, so if server is unable to accept a request the client would know it.
%%A: connection refused?

Q: what STL algorithms did you use?
%%A: foreach(), find(), copy_if(), transform(), reverse(), sort(), replace_if, remov_if

[17]java MS java threading IV#phone

See also https://bintanvictor.wordpress.com/2009/03/21/realtime-inter-vm-communication-in-front-desk-trading-sys/

These are in the QQ category i.e. skills required for QnA IV only.

Q1: 3 threads to print the numbers 1,2,3,4,5… in deterministic, serial order. Just like single-threaded.

Q1b: what if JVM A has T1, T2, and JVM B has T3? How do they coordinate?
%%A: in C++ shared memory is the fastest IPC solution for large data volume. For signaling, perhaps a semaphore or named pipe
%%A: I feel the mutex is probably an kernel object, accessible by multiple processes.

On Windows, mutex, wait handle, … are all accessible cross-process, but java (on Windows or unix) is designed differently and doen’t have these cross-process synchronization devices.

%%A: use a database table with one row one column. Database can notify a JVM.
AA: The java.nio.file package provides a file change notification API, called the Watch Service API. The registered JVM has a thread dedicated to watching.AA: in java, the JDK semaphore is NOT a wrapper of the operation system semaphore so not usable for IPC
A: java Semaphore? Probably not an IPC construct in java.

Q2: have you used any optimized Map implementations outside the JDK?

Q3: to benchmark your various threading solutions how do you remove the random effects of GC and JIT compilation?
%%A: allocate enough memory to avoid GC. Turn off JIT to compile every code path. Perhaps give the JVM some warm-up period to complete the JIT compilation before we start the benchmark.

Trex QnA IV #low-level C

Q: false sharing? cache coherence? cache line?

Q5: 3 threads in my process, and one of them calls fork(). What are the threads in the child process and which thread will be running?
Q5b: what potential problems do you see?
%%A: some resources like locks could be in a bad state

Q: how do you start a thread? %%A: pthreads_create()
Q1: how do you start a process? %%A: fork()
Q1b: what happens to the parent’s signal handlers, file handles etc

Q: how does the parent process wait for the child process to end?
AA: wait()

Q: sigkill vs sigterm?
%%A: can’t catch sigkill

See other questions discussed in unix signal #Trexquant IV

Q: why we need isnan(myFloat) rather than if (myFloat == Nan)
Q2: I have a float variable1=20180213, and I cast it to int. What’s the int value?
Q2b: two floating point numbers are both very close to 1. In fact they are the closest possible pair. What’s your estimate of their difference?
Q2c: what’s the representation of a float object?

c++method default arg isn’t saved]vtable #Ashish

Google c++ style guide forbids default args in virtual functions.

Ashish said he was asked a common question in multiple c++ interviews. The default arg value is “declared” in base class and (or) derived class but never saved in the vtable. It’s basically resolved statically, at compile time.

If you use a pointer to Base to invoke the virtual method, then the Base default arg value applies unconditionally, even though the subclass method is chosen at run time, via the vtable. Highly confusing.

C++ ICE mkt-data – some of the more difficult questions

Q: you said you write code on Windows, but production platform is linux, so how do you test your code?
Q: difference between a C struct and a C++ struct?
Q: have you used memory leak detectors?
Q: How do you link c and c++ code? dlopen()?
Q: have you used memory leak detectors?
Q: scope resolution operator?
Q: default constructor?

Q: what other things are synthesized?
%%A: dtor, copier, op= but c++11 add some move operators
A: indeed, c++11 would synthesize 1) move ctor and 2) move-assignment operator under strict conditions. See http://stackoverflow.com/questions/3734247/what-are-all-the-member-functions-created-by-compiler-for-a-class-does-that-hap

Q: given template<typename T> void f(T a, T b), can f(3, 0.8) compile?

Q: how do you use gdb to view a core file?

Q: write a sqrt(int) function without using the math library.  Estimate to 0.01 precision.

Q: UDP vs TCP?
A: virtual circuit. Single remote socket. In contrast UDP can send to multiple receivers efficiently – broadcast or multicast. A: transmission control, with data loss prevention

Q: what’s wrong with multiple inheritance? How do you deal with it?
%%A: diamond problem. I don’t use virtual inheritance. I use MI when base class is an interface without any field.

Q1: how do you create a thread in c++?
A: std::thread is RAII so the constructor starts the thread, and the dtor must run after joining or detaching.

Q1b: Have you used posix_thread?
A: it’s a c library, so no constructor magic. Probably some pthread_something() function. Correct! pthread_create()

how many degrees(upTo180)btw hr^min hands #Dilip

Dilip had an elegant solution by hand.

3:15 — MH is at 90 degrees; HH is slightly over 90. It’s 1/4 way towards 120 degrees i.e. 90+7.5 degrees. Therefore, the answer is 7.5 degrees

3.10 — MH is at 60 degrees; HH is slightly over 90. It’s 1/6 way towards 120 degrees, i.e. 95 degrees. Answer is 95 – 6 = 35 degrees.

Note the MH is always at an integer degree but HH can be at a xxx.5 degrees.

find the line passing the most points #solved

https://leetcode.com/problems/max-points-on-a-line/description/ might be similar

Q: Given N points with positive integer coordinates, find the straight line passing through the most points
A: For each of (N*N-N)/2 pairs of points, compute a Line object identified by 2 numbers:
* a slope S = (y2 – y1)/(x2 – x1)
* intercept on y-axis.

So these 2 numbers can be computed easily from any pair.

Save each Line object as key in a hashmap. When a pair gives a Line that’s already seen, increment its count.

Intercept formula y_inter(int, int, int, int) can be assumed to exist. Writing this function isn’t relevant to a coding interview:

Suppose this value is y3, so the incept point is (0,y3), so

(y3-y1)/(0-x1) = S, so y3 = y1 – S x1

2 overlapping rectangles

bloomberg-lp-interview-question.

Q: You have a rectangle a and b. Determine if the two rectangles overlap. That is at least some part of either rectangle should be within the other.

I will assume the coordinates are given.

%%A: find both centers.
– case 1: identify box A’s corner that’s closes to B’s center. Check if this corner is inside B.

– case 2: if they lie on one vertical line, then only the bottom border (of upper box) vs top border (of lower box) matters.

c++DRW IV

Q: a vector needs to allocate N instances of Account but Account has no default ctor, but it just works. How does the compiler achieve it? (Actually it won’t compile if compiler can’t synthesize the no-arg ctor)
A: indeed the call to array new would prevent the vector concrete class from compiling due to SFINAE rule or something like that. (Java and c# would use type constraints instead.) The compiler must be using 2 separate lines – one to allocate raw memory, the other to initialize the object.

However, when is vector internal array allocated by array new? I discussed with Ashish but found no answer.

Q: why did you say you needed to write your own smart ptr?
A: super simple one, just to deal with some issue of the raw ptr… probably dangle pointer, by overriding the deref operator

Q: why would anyone use unique ptr when shared ptr is general purpose
A: threading…
A: some pointee objects are designed with sole-ownership

Q: is the lambda functionality doable in c++98? what is the c++98 equivalent?
A: some functor with a challenging syntax I can’t remember.

Q: OK you don’t use c++11 at work, but do you hack around at home?

Q: why is  the bid/ask spread much wider in options than the underlier?
A: must be the risk to the writer. Competition didn’t drive down the bid/ask spread like it did in FX and cash equities.
AA: delta hedge adjustment can’t be done every second. Before the next adjust, the risk to the writer (market maker and quoter) would be quite high.

c++IV Art@Click #Jens

Q: buffer overflow?
%%A: avoid arrays, use vector
Q: XOR usage?
AA: usages? easy to google
Q: why bitwise shift?
%%A: mostly an optimization as far as I know, but the compiler probably translates integer multiply/divide already.
AA: usages? easy to google
Q: what’s wrong with pointers?
Q: dangling pointer?
Q3: what are the common exceptions in c++?
%%A: c++ has a few standard exceptions and a lot of UB; java has lots of standard exceptions and no UB. q(new) and dynamic_cast…
Q3b: undefined behavior?
%%A: much worse than exceptions or error codes
%%A: perhaps fairly consistent on one platform, but I know writing beyond an array’s limit is indeterminate. See [[c++ debugging)]
Q: exceptions – why do you not want to use it in your API?
%%A: can of worm. If I throw I can’t control how clients use this API. What if it’s thrown in a dtor? What if they don’t catch by reference? What if they catch a sliced one or a copy rather than the original exception object I want them to get? What if they catch by pointer and try to delete or forget to delete? Java cleaned it up.
%%A: I don’t see a lot of well-regarded API’s exposing exceptions
%%A: there’s performance cost
A: the best practice has always frowned on exception specifications. C++11 favors “noexcept”
A: now I think we should be consistent throughout – either throw exceptions consistently or never.
Q: memory leak – what is it and how do you deal with it?
%%A: valgrind replaces malloc with …?
%%A: provide class-specific op-new (and delete), which is safer (see effC++) than a customized global op-new. Add your own house keeping code therein…
Q: how is semaphore different from a mutex
%%A: I think a mutex is more basic and usually provided by the kernel (For a userland thread the thread library not the kernel must provide the mutex). I guess the counting semaphore is implemented using mutex + condition variables, since the semephore may need to inform the waiting threads.
Q: preprocessors?
%%A: 3 usages in my projects – #includes, macros and conditional compile. Now I think template meta programming also uses a lot of macros.
Q: stack trace?
%%A: very useful, that’s why java, c#, python, perl provide it, and GDB too.
A: [[safe c++]] shows simple and robust technique to build a stack trace upon assertion failure
I said many times “I’m philosophical about that” – meaning “it’s controversial IMO and I have my views which may look naive or extreme or eccentric”

C++bbg Standard IV Questions { Zack

1. How is keyword virtual used in C++?

2. Does better O() algorithm always works faster in real cases?

3. Between pre-increment (++i) and post-increment (i++) operator, which one is more efficient in both built-in and overloading case? …….. [[more eff c++]] item 6

4. Please compare array and linked-list in terms of insertion and access

5. Please name some sorting algorithms and their time complexity

6. Can you tell me the difference between stack and heap?

7. What Linux command do I use the find a file in a directory recursively?

8. what is the virtual functions and destructors

9. what is the constructor destructor orders of a class then an inherited class

10. what is the most known sort algorithms , and what is the complexity of each

11. Smart Pointer, how does it work.

12. Templates, when use instead of inheritance…….. [[eff c++]] Item 41

13. Polymorphism, when use multi inheritance, problems that can happen, ……. [[eff c++]]

14. Sockets, monitoring sockets.

15. Multithreading, give a small example of Producer, Consumer.

16. Debugging issues, using GDB

19. TCP/IP and Multicast

20. STL and Boost libraries

rope cliff – puzzle #Roger Lee

You are on the top of a 200m cliff. You have a 150m long rope and a knife. You can only tie your rope where you stand, or 100m below on a tree. Can you reach the bottom of the cliff alive, and how?

Any Free fall is considered deadly.

——-

trick is a loop

Suppose the cliff is point C and the tree is T and the midpoint is point M.

Cut (1st and last cut) 100M of rope and put into pocket. Tie the other 50M at C then descend to M. Convert the end of this rope into a loop at the point M. Lubricate the loop then pass the 100M rope through, to make a 50+50 double rope. Descend to T then retrieve the entire 100M.

c++advanced IV questions – contributed by real interviewers

http://www.quora.com/Anyone-can-list-good-common-and-advanced-questions-for-testing-C++-STL-skill-during-interview

These authors are real interviewers. They really do care about the topics I mentioned to you Monday night. They are seeking c++ specialists who understand the low level details. They don't care about the high level, architecture, design, picking the right framework or library etc.
Victor

Fwd: difficult phone screen with Credit Suisse

Hi Bin,
I just finished a frustrating interview with Credit Suisse.
It's just a phone screen, but many questions I cannot answer. When you have time, please add some comments to below questions.
And because I work from home for the interview, so I can record the whole call. It's shared to you in another email. Please give me some comments about how I behaved, just assume if you were the interviewer, what's your feeling about this candidate?
It's not that urgent, so just do it only when you can find time. I know you have a lot of work to do, so don't waste too much of you time.
——————————————————
1. describe one of your best coding challenges in 2 to 3 years, that can show off your coding abilities and your object oriented skills.
Rong: I mentioned NIO server I had hands on experience.
Interrupted by him, he said: the key thing I'm asking for is the specific coding that you personally did.
Rong: I should not mention I lead a team, because maybe that's the reason he suspect I describe other developers' work as mine. I feel for developer role, management experience is a downside factor.
2. Describe how did the single threaded application avoid blocking IO.
Rong: I cannot articulate how NIO works. I actually indeed created a NIO server. That's true, but I don't know how to explain well in phone screen, and I cannot recall some details such as the class, methods.
3. He continued to asked some details about NIO, and I cannot recall details.
4. Now he start normal quiz procedure. I only list difficult ones here, but I attached the full list FYI.
……
Would you want to implement a maximum queue depth?(I didn't know what's maximum queue depth mean)
What happens if your queue backsup?(I don't known what's that mean)
Would you want a maximum length? Why?
What happens if your producer hit the maximum length?
Do you know what's the completion services?
There is an alternative that you can implement producer/consumer without a queue?
What are the 3 different ways of creating a thread?(implement Runnable, extend thread, what's the third?)
Different between Runnable and Callable? (Callable return result, Callable throw Exception)
Any other differences or advantages for one over the other?
Let's say we have a process that receives request to do work. And each request has no dependency on each other.
And the request queue backs up. How would you solve this? 
(I don't know what's backs up mean. So I assume it means the queue is full. So requesters are blocked and processor will process one by one.)
When you want to use a timer of some kind, what are the different way you can do it in your code?
(I said there is a scheduler class?? not sure if there is)
How would you shut down the multi-threaded system? let's assume you write your own multi-threaded system.
(use countdownlatch?)
What if the thread is blocked?
(call thread.interrupt())
That's it?
What's the exception chaining? (I don't know)
Apart from try-catch block, how can you ensure all exceptions in thread pool are caught by our code?
What's meant by safe publication and the context thread safty shared object retrieval?
fail-fast vs fail-safe?
How would you use the future task? Which method would you use? How would you retrieve the result later?
Do you know the method, what's that called?
Difference between semophore and mutex?
……

Fwd: vague question { xr – production troubleshoot

This is another vague question I couldn’t handle well.
He said that if the production application is hung, how do you find out what caused the problem?
Rong: I check CPU, if CPU is busy… (He interrupted: “suppose CPU is not busy.”)
Rong: I will check memory usage, if …(He interrupted: “suppose not memory used up.”)
Rong: I will check if there is I/O blocking …(He interrupted “suppose not because of I/O blocking.”)
Rong: That’s my way to analyse issue, I will rule out something to … 
(He interrupted “ok, let’s suppose it’s I/O issue, but it’s million line code application, 
and could be thousands of part involves I/O, how do you solve the problem?)
Rong: For this huge application, troubleshooting needs deep understanding of the codes.
(He said: suppose you know the code very well, and suppose you wrote the code yourself.)
I thought for a while and cannot answer. (I guess he has a specific answer, but I just cannot spot it.)
He: it’s ok, let’s move on to next question.
Do you have any idea about what he want? And this is also a Indian. I tend to conclude that either Indian think in a way different from mine, or he intended to mess up the interview, because all interviewers who throw me a vague question and refuse to give further hints are Indians.

Fwd: java threading questions {xr

Question: 
Since vector is synchronized, is below code thread safe? If 2 threads are running below code, what will happen?
if(!vector.contains(element)) {
vector.add(element);
}
Rong answer: not thread safe, because another thread could add element in between vector.contains(element) and vector.add(element);
(my own question, not from interview, but just my own thought: What does “Vector is synchronized” mean?
my own answer: It means all methods that can update data are synchronized. So calling each single method is thread safe, 
but a transaction involving 2 synchronized methods is not thread safe. Only if a transaction is atomic, we can say it’s thread safe.
Question: 
synchronize(obj) {
obj.wait();
}
if 2 threads are running above code the same time, what will happen?
Rong answer: one thread acquire lock, then release lock, another thread acquire lock and then release lock, both wait forever.
follow up question: if a third thread call obj.notifyAll(), what will happen?
Rong answer: both waiting threads will be woken up, then one thread acquire lock and finish, then another thread acquire lock and finish.
Question:
synchronize(obj) {
print(“A”);
obj.wait();
Thread.sleep(100000000);
}
what will happen if 2 threads are running above code?
Rong answer: both print “A” then wait; if notified by a third thread, then will sleep for a long time.

Fwd: MS IV 20141211

Question 1:
void transfer(Account accA, Account accB, double amount) {
synch(accA) {
sync(accB) {
}
}
}
There is a deadlock issue, e.g Thread1 transfer from accA to accB, and Thread2 transfer from accB to accA. How to fix this bug?
Question 2: 
class Trade
OptionTrade extends Trade
BondTrade extends Trade
SwapTrade extends Trade
Pricer
double getPV(OptionTrade trade)
double getPV(BondTrade trade)
double getPV(SwapTrade trade)
Trade trade = …//you got a trade instance
Pricer pricer = …//you got a pricer instance
//how to get price value of the trade
if(trade instanceOf OptionTrade) {
pricer.getPV((OptionTrade)trade) 
} else if(… instanceOf …) {
} else if (… instanceOf …){
}
He siad, this works, but looks he is not satisfied with this solution. He asked : “Is there a better way to do that?”
I cannot think of another way…, He followed with a question “what's polymorphism?”, so maybe the solution should be related to polymorphism.

Fwd: Bloomberg interview questions 2/13/2015 {XR

String getRegularExpression(String str1, String str2);
Base on input 2 strings, return a regex String that can cover 2 input strings.
e.g “Bloomberg”, “BloomingDale” should output “Bloom.*g.*”

I proposed a solution:
step 1: find out common segments of the 2 strings
step 2: insert “.*” in between the common segments to form a string

He asked: how to find the common segments?
also he let me think of below examples, but I really couldn’t find any
clue from his example.
“Bloomberg”, “BloomingDale”
“StarBucksCoffee”, “RedCoffee”

In the end, he said my proposed solution is not technically optimized.
Obviously, he has better solution, but he didn’t give me the hint.
So I still have no idea at all.

Can you think of a solution?

Fwd: Markit phone interview, full time position, 10/13/14

———- Forwarded message ———-
Date: Tue, Oct 14, 2014 at 4:02 AM
Subject: Markit phone interview, full time position, 10/13/14

Which Java package do you particularly like?

To design a cache framework, in such a scenario, you have already 50 objects in the map, that's its capacity. Now 51st object comes in, you have to remove the first object in the cache (this implies the map is sorted) to make room for the new comer. How do you go with that?
The answer is to use LinkedBlockingQueue as the map's placeholder
What's weak reference? Explain how GC works? Do you know any GC algorithm? If to choose one GC algo for your application, how do you set the JVM parameter?
Java Serialization. If there is a serializable Singleton to be wired from one JVM to another, what do you do to guarantee the other JVM to have only one instance of the Singleton class v.s multiple instances?
Threadpool. Can you set the max size of a threadpool? How do you go about it? What if the max size is 10, what will happen to the new coming task but not serviced? What is the queue used in Thread pool? What if the queue is full? Why uses block queue?
Database. How do you detect deadlock in database? How do you read Threaddump?
Database. If you were handed with a complex query running against 10 tables, and there is a performance issue, how do you proceed to resolve it? Have you used Execution plan?

Fwd: Barcap onsite interview, 10/09/14

———- Forwarded message ———-
Date: Fri, Oct 10, 2014 at 7:56 AM
Subject: Barcap onsite interview, 10/09/14
Coding, no pseudo code, must be workable code. On paper.

1. Read a file, find out the occurrence of the alphanumeric letters, like “122113, Hello world”, returns 1:3; 2:2; 3:1; H:1;e:1; l:3; o:2; r:1;d:1
2. find out if a string brackets come in pairs. For example, “[{}]{([])}” will return true, but “[(])” will return false;
An algo: from an array of length of n, pick up m elements randomly, m<=n. Must be no dups, meaning once an element has been picked, it can't be picked again. Best way to implement it.
In an extreme, for a length of 1000, pick up 999 elements.
A bunch of multi-choice stupid Java questions, like “What's AWT stand for”
Some designing questions, how to implement N-Tiers applications; What's its pro and cons;
What's it that you don't like about Java? 
This is just some open discussion: my answer was that Java is now too huge a monster that tries to please everyone, which is against its original philosophy of simplicity. And I gave example of lamda expression, and multiple inheritance in Java 8, which is the center of debates. 
Then he asked if w/o lamda, how to do functional programming in java? and my answer was that other JVM language such as Scala can do it just fine or even better since it's created for that purpose. Also since it's in JVM ecosystem it can fit in seamlessly.
How to sync up with database in Hibernate?
Discussion:
What's asynchronous IO? How to achieve highly efficient concurrent processing services. 
I couldn't impress him as I didn't know it but it seems that I managed to get his attention by mentioning the emerging technologies such as Node.js, which is designed for that purpose
In a scenario, we have 10 requests being handled by all available threads in a thread pool, they are long processing tasks, thus more requests come in but can't be processed, then what? How to design an architecture that can prioritize the requests so that clients can have reasonable responsiveness.
 

Fwd: 10/08/14 phone interview with BNP Paribas, GWT UI developer position, Jersey City

———- Forwarded message ———-
From: Hai Yi

JAVA

1. How GWT communicate from frontend to backend;

2. what's interface?
3. What's serialization? How to customize serialization
4. hashing in Java, explain.
5. How to tune GC parameters
SPRING
6. By default, Spring bean is singlton, what's the other type?
7. What's autowired?
HIBERNATE
8. In Hibernate, whats the difference b/w Session and Transaction?
9. There are classes of inheritance relationships, for example, a Person class, a Employee class and a Manager class, what's the different strategies to design tables to map these classes? If only one table is used, how to differentiate those classes?
10. How does Hibernate handle multiple databases?
DATABASE
11. Whats table partition?
12. How do you do to improve performance for data retrival/saving?
13. What's composite primary key?
XML
14. XML. Efficiency compare: SAXand DOM. Whats API do you use?
15. Web service. SOAP/RESET, what tools do you use?
OTHER
16. How do you do unit test and integration test? Have you used mockit?
17. Maven. The “Runtime” or “Test” inside Dependency tag, what are they used for?
 

Fwd: onsite BofA IV{phone IV 10/01/14

———- Forwarded message ———-
From: Hai Yi
Date: Wed, Oct 8, 2014 at 9:42 AM
Subject: onsite BofA interview, continuance of the phone interview on 10/01/14

Coding on the computer and white board

1) code on Eclipse
There is a folder containing stock files, each line of which includes info as below, and delimiter is “t”.
date in the format “YYYY-MM-DD”
indicator (BUY/SELL), this is counter party BUY from or SELL to BofA
symbol
counter party
quantity
price
There are 100 such files and each file contains 10,000 lines.
In the same folder there is also a short file, marks.txt in which there are two columns in each line showing the symbol and market price, for example:
AAPL 350.00
Task: reading the file and figure out the long positions that BofA hold on the stocks, list top 20 in the descendant order with regard to value (quantity*price) and the respective market value. For instance, BofA might buy/sell AAPL, if the net is long, you have to figure it out and count it as required above.
2) white board
Create a high efficient cache solution for computation, it must be thread safe.
public interface Computable {
  public V compute(K k);
}
public class Cache implements Computable{
    //the Caculator below is a delegate to do the heavy lift, but the result can be cached so it's supposed to call only once
    private Computable caculator = new Caculator();
    private Map cache = new HashMap();
    public V compute(K k){
            if(cache.contains(k){
              return cache.get(k);
            }
            V val = caculator.compute(k);
             cache.put(k,val);
             return val; 
    };      
}
What's the problem with the above implementation? How to make it highly performant and thread safe?
I wrote a good implementation with synchronization and he complimented “nearly perfect” but he continue to request to create a “lock-free” solution, I couldn't do it and he wrote one using FutureTask, I don't remember the details. And he also suggested a even better one using AKKA's actor.
I didn't say but IMHO the common idea that synchronization is a performance penalty is just a urban legend – with the nowaday's modern Java compiler the overhead of synchronization is just trivial.
    
3) White board Singlton pattern.
4) Eclipse a Lock implementation, use only AtomicBoolean
public interface Lock {
  public void lock();
  public void unlock();
}
public class LockImpl implements Lock{
  private AtomicBoolean bool = new AtomicBoolean();
  public void lock(){
    //What's in here?
  } 
  public void unlock(){
    //what's in here?
  }
}

Fwd: 10/01/14, 5:30PM, BofA phone interview with Wilson

———- Forwarded message ———-

From: Hai Yi

Date: Thu, Oct 2, 2014 at 11:47 PM

Subject: 10/01/14, 5:30PM, BofA phone interview with Wilson

Java

1. Explain how HashMap works

2. Compare HashTable and ConcurrentHashMap. Explain how CHM handle concurrency.

3. There is an arraylist of 1,2,2,3, how do you output unique values?

how do you only output dup values?

4. Can you remove an element during iteration? How do you do it properly?

5. Can you update elements in an arrayList if its reference is defined

as “final”. If so, what do you do to prevent it?

6. Name JSP internal objects. If in URL there is “id=12345”, how to

retrieve that value?

7. Difference b/w StringBuffer and StringBuilder; and their difference

with String?

8. What design patterns can you name? What's singleton?

9. How do you implement Producer-consumer pattern?

10. Explain Executor framework and Threadpool

11. Explain volatile and transient

12. Explain Breath-first traversal and depth-first traversal

13. What's the new feature in JDK 8? What's the “default” keyword used

for in JDK 8 context?

Database

1. What's Union and Union all?

2. explain left join and inner join

3. There is a table “Salary” with two columns “id” and “salary”, this

is oracle, write sql to list first 4. 100 record with salary from

highest to lowest

4. How do you dump the data from table A to table B, if table B

doesn't exist. Do it in one SQL statement.

5. Difference b/w function and store procedure

6. What's view? What's trigger?

7. How to call a store procedure from Java? The JDBC API.

8. What are the transaction isolation levels?

9. How do you get the current date using Oracle?

10. In PL/SQL, how do you throw an exception and how to catch it?

Unix

1. what's the command to find the files that were updated in the past 10 days?

2. How to find a file with a particular name, like “hello.java”?

3. What's the command to kill a running Java process and generate threaddump?

4. There is a file of 10 million lines, whats' the command to list

first 10 line?

5. What's the command to count the line number in a file, what parameter to use?

6. In shell script, how to split a string with dilimiter?

7. In shell script, how to check if a file exists or not?

8. what's the command to sort a file?

9. what's the command to list all running processes?

10. In VI, how to replace a repeated word in the whole text?

minimize locking – queue for producer/consumer

I was asked this question in a big bank IV.

Q: if our queue needs to be as fast as possible, we want to avoid a global lock. How?

%%A1: multi-queue, based on cusip or account prefix. I implemented this partially in a JPM take-home coding test

%%A2: if were are very sure the queue is never depleted, then use 2 locks at both ends, so consumer threads only need to contend for the consumer lock.

%%A3: lock free queues are probably quite common in c#, java and c++

For A2, On one hand, we want to keep things simple by having fewer locks. On the other hand, we want to increase parallelism by breaking up one big sync group (of threads) into independent groups, each having a “smaller” lock.
Coarse-grained parallelism is key. If the 2 smaller sync groups never cross paths, then the additional locks won’t add complexity. We may need a way to ensure that the 2 ends of the queue never cross, hopefully without needing both locks. The risk — the consumer is too fast and “eats” an item that’s not yet added. We can make sure this always fails reliable in production and stress test, rather than UndefinedBehavior.

I feel in reality, there is usually some bound on the capacity of producer, consumer or queue. Sometimes producer will be too fast (overflow), or too slow (cross), so a queue without any bound check is unreliable in practice.

python shortcomings for large enterprise app

– performance, efficiency. Compare — C statements translate to brief machine codes.
-? scalability
– threading – GIL, no precise control
– industrial strength commercial support. Only c++/java/c# has big tech powerhouses. Linux is widely adopted by enterprises, but is an exception that proves the rule.
– ? commercial auxiliary software to support the ecosystem, including IDE, debuggers, jars, code generators, profilers, … I feel python has these.
-? most enterprise apps are now in java or c# as these are proven, in terms of performance, resource scalability, commercial support etc.

Fwd: IDC interview (12/09/13) Desktop

Subject: IDC interview (12/09/13) Desktop

I got the offer for this but turned it down today – HSBC seems a better one.

a lot of questions about Swing, since the project is a desktop data consolidation application using Swing and JIDE.

OO concepts:

Open-closed principle

Encapsulation and inheritance, your understanding?

aggregation or composition, comments?

Collections:

enumerate the Collection APIs

Which Set implementation or interface retains the natural order of the elements?

[TB] Natural order is like the order in an English dictionary, right? TreeSet/SkipListSet, SortedSet (interface)

Algo:

whiteboard Pascal triangle

Quick sort

binary search

How to refactor the following code snippet (they took out a code from Effective Java):

public class Person {

private final Date birthDate;

public boolean isBabyBoomer() {

Calendar gmtCal = Calendar.getInstance(TimeZone.getTimeZone(“GMT”));

gmtCal.set(1946, Calendar.JANUARY, 1, 0, 0, 0);

Date boomStart = gmtCal.getTime();

gmtCal.set(1965, Calendar.JANUARY, 1, 0, 0, 0);

Date boomEnd = gmtCal.getTime();

return birthDate.compareTo(boomStart) >= 0 && birthDate.compareTo(boomEnd) < 0;

}

}

[TB] I feel isBabyBoomer should become a static utility method, like

isBabyBoomer(Person) or isBabyBoomer(Date)

The boomStart and boomEnd should be static constants. No need to create them as temp objects in each invocation.

Alternatively, we only need to look at a given year of birth “YOB”. Is baby boomer IIF 1946 <= YOB <= 1964 i.e. YOB between 1946 and 1964.

Am I right?

Fwd: HSBC interview (12/11/13, Jersey City), a Dodd Frank Recon Report project

Looks like most of java questions are on threading and spring, and nothing else?

See my answers below.

JAVA

if wait/notify is called outside the sync block, what exception is thrown?

[TB] Doing this won’t work. It’s a programmer error, not system error like out of memory. I don’t remember this exception because I have never made this mistake. 

So this is an obscure question.

enumerate Executors’s static methods to create a thread pool

[TB] I remember ExecutorS.java has a bunch of static factory methods. They let us specify the queue data structure used, the size of the queue, (min/max) size of the pool. There are also methods to create a single-thread pool. I think there are also methods to create timer-enabled thread pool.

how do you do to ensure a thread processed prior to other active threads. What do you in 1.4 and in 1.5 or later?

[TB] Thread priority is a platform-dependent feature. Usually java threads are kernel threads, so the kernel scheduler decides whether to honour the priority I set on my java threads. Sometimes the kernel scheduler ignores the priority I set. If that happens, I call yield() in many places in the methods executed by other threads. However, the “important” thread may still get pre-empted (context-switched out). In such a case, I would need to use lock or condition variables to block all other threads. Another techniques is a global Boolean flag, set by the important thread, and read-only by all other threads. If it’s 0, then all other threads will go back to a sleep loop.

Volatile variable? What’s your comment?

SPRING

How many ways of instantiating a Spring container?

How many ways of injecting a bean in Spring XML config 

What’s drawbacks of using constructor as injection mean? circular reference, what’s exception will be thrown?

Spring annotation. If a bean must be provided with a constructor with another bean injected, what’s the attribute of the annotation should be used to enforce it.

What’re the scopes in a bean tag in Spring XML?

If a scope being prototype, what will return from using “getBean()”

UNIX

How to check the long running java process, is it done, deadlock, hanging there or still running.

[TB] Good question. If it is done or deadlocked then cpu usage would be 0 for a long time (like a few minutes). Hang is always for specific reasons like deadlock or blocked on IO.

If the java is a network server then it could be waiting for requests, so no-activity is OK. We had better know the actual type of program it is.

[TB] Ideally there should be log entry from the process. For a network server log could say “waiting for requests”. In reality, the programmer may have forgotten to do the right things.

kill -3 to get the threasdump

 

c++iv: Jump#2 #overflow catch

Mostly QQ type of questions. I feel i may have to give up on some of the very low level (perf optimization) topics. I feel java and c# interviews are not so low.

Most of the questions are fairly … uncommon

Q: Stack overflow – who can detect it and print an error msg? JVM can do it but what if there’s no VM?

Q: What data type would you use for the tasks in a thread pool??
(I find this question too advanced. c++11 offers Futures…)
%%A: look at pthread-create. a func ptr taking a void ptr

Q: After malloc(), how do you cast the pointer to MyClass* ? Do you call the ctor? How?
(This is asked again by Alex of DRW)
A: placement-new?

  • Inter-thread communications in thread pool – how does it work?
  • Thread pool — Your resume mentioned your home-made thread pool? How?
  • Boost::any, boost::bind, boost::function
  • CPU cache – how do you use it to improve performance? Any specific techniques?
  • Stack size – who controls it? at Compile time or run time?
  • Shared ptr – how is it implemented?
  • Scoped lock – what is it, why use it?
  • Your bash shell customizations as a cpp developer?
  • $LD_LIBRARY_PATH — what is it?

c++Chicago/Sing (Jump) IV Aug 2012

Q: UDP vs TCP diff?
%%A: multicast needs UDP.
%%A: UDP is faster – no connection setup/teardown no error check no ACK, no sequence number; shorter emvelope

Q: How would you add reliability to multicast?
%%A: sequence number

Q: How would you use tibco for trade messages vs pricing messages?
%%A: the trade msg must be delivered reliably to back office?
%%A: one of them is real time?

Q5: In your systems, how serious was data loss in non-CM multicast?
%%A: Usually not a big problem. During peak volatile periods, messaging rates could surge 500%. Data loss would deteriorate.

Q5b: how would you address the high data loss?
%%A: test with a target message rate. Beyond the target rate, we don’t feel confident.
A: tune the tibco reliability parameter —  http://javarevisited.blogspot.sg/2011/04/dataloss-advisory-in-tibco-rendezvous.html

Q7: how is order state managed in your OMS engine?
%%A: if an order is half-processed and pending the 3nd reply from ECN, the single thread would block.

Q7b: even if multiple orders (for the same security) are waiting in the queue?
%%A: yes. To allow multiple orders to enter the “stream” would be dangerous.

Now I think the single thread should pick up and process all new orders and keep all pending orders in cache. Any incoming exchange messages would join the same task queue (or a separate task queue) – the same single thread.

3 main infrastructure teams
* exchange connectivity – order submission
* exchange connectivity – price feed i.e. market data. I think this is incoming-only, probably higher volume. Probably similar to Zhen Hai’s role.
* risk infrastructure – no VaR mathematics.

IV spr/hib, SQL, …

———- Forwarded message ———-

From: Hai Yi
Date: 23 January 2014 11:24
Subject: Citi interview (01/22/14)

this is a tough one, a real torture

input an array of strings, how to eliminate the dups and output a new array. Don’t use iteration of each element.
That is to take advantage of Set#addAll and Set#toArray
Write Factorial in two ways
recursive and non-recursive
How to customize serialization, what methods to use?
Normally the serialization in Java works in this way:
An object to be serialized, say A, needs to implement Serializable interface;
From program, use ObjectInputStream/ObjectOutStream to read in / write A.
To customize the serialization, you have to implement readObject/writeObject inside A, and both methods should be “private”.
A second way to customize serialization is to implement Externlizable – be noted this one is NOT a mark interface and it has two methods readExternal / writeExternal
the dynamic proxy in Java, whats its name?
Proxy
JNI is loaded into which area of the heap? Say for example, an oracle driver
How to record the hit counter from an application server in a multi-threading environment.
AtomicInteger
optimistic lock and pessimistic lock, what exactly the SQL in Oracle to implement them?
for optimistic lock, use an additional column in Table, with type being “Timestamp” or “version” and check it before updating.
For pessimistic lock, add “For Update” clause in the SQL thus effectively lock the access to the table.
what’s the default autowiring in Spring?
Autowired by type
whats the JVM argument is used for GC?
What the tools for memory dump?
1 mil transaction per sec down to 1 transaction in an application server, what’s the primary suspect?
How to prevent the field from being persisted.
transient keyword
How to implement lazy loading in Hibernate?
lazy load in Hibernate is one of Hibernate’s fetching strategies, its purpose is to load the related children of an entity only when the user needs them.
if using XML config, in hbm.xml, use attribute fetch = “select”
If using annotation, do these:
    @OneToMany(fetch = FetchType.LAZY, mappedBy = "stock")
     @Fetch(FetchMode.SELECT)
How to use Query in Hibernate? And Criteria ?
What are the patterns implemented in Hibernate?
ApplicationContext, what is more than BeanFactory?
maven, phase:goal, explain.
Macro processor in Linux?
Now I know you are just playing me…

c++buy-side data support IV (wq)

(world quant?)

local vs my in perl?
what’s bless in perl?
multiprocessing vs multithreading modules in python?

From a simple salary table, select the row with the 2nd highest salary. Address in my blog on top9

toss a dice 3 times. What’s pr(getting 3 different numbers)

Q: advantages of operator overloading?
A: sometimes you have no choice, like op= and Smart pointers
A: eg: STL containers offer bracket operator
A: eg: iterators offer increment
A: if you have a special_number class, you would consider +/-

Q: vector vs linked list

Q: difference between private and protected keywords in c++? Seldom quizzed!
A: P613 [[c++primer]] says

  • members of the subclass are unaffected by these priv/prot derivation-access-specifier. (Instead, they are controlled by the base class member’s priv/prot classification.)
  • users and children of the subclass are affected!

Q: Scan a multi-line text file just once to pick a line at “random” i.e. where each line has the same probability. Not well-defined problem. Don’t spend too much time.
A: save the lines in a vector. At end of the scan, we know the vector size. Pick a random non-negative int below vector.size()

 

overload operator<< as non-friend #MIAX

Q: without the “friend” keyword, how do you support “dump” output like q(cout<<myDog<<endl), where

Dog myDog; // has private fields

Note you can modify Dog class implementation.

%%A: add const pointers to private fields, so the non-friend q(operator<<) can print them.
%%A: change Dog from a class to a struct.
A: c++ reflection is harder than java. Probably overkill here.

java spring IV questions from friends #YH etc

Q: How many ways of instantiating a Spring container?

Q: How many ways of injecting a bean in Spring XML config

Q: What's drawbacks of using constructor as injection mean? circular reference, what's exception will be thrown?

Q: Spring annotation. If a bean must be provided with a constructor with another bean injected, what's the attribute of the annotation should be used to enforce it.

Q: What're the scopes in a bean tag in Spring XML?

Q: If a scope being prototype, what will return from using “getBean()”

prevent heap instantiation of my class#Part 2#nQuant

I might have blogged about this…. [[more effC++]] by Scott Meyers has a chapter on this. The basic ideas, expressed in my own language for maximum absorption —

case 1) it’s relatively easy to prevent qq( new MyClass ) P157 basically says customize a new-operator as a private static class member and leave it undefined.

— below cases 2 and 3 are non-issues in “our” projects, where team is small and guys are expected to read the source code comments —

case 2) MyClass could be subclassed to Der. If Der is beyond my control, then MyClass could (against my will) be instantiated on heap as a subobject

case 3) MyClass could be a field in class Umbrella. If Umbrella is beyond my control, then MyClass could (against my will) be instantiated on heap as a subobject

I feel in many enterprise app teams, Cases 2/3 are rare and impractical, because Der and Umbrella would be owned by myself or my own team.

Therefore, in many real world contexts it is feasible to prevent heap instantiation of my class, and reduces memory leak.

c++ low-latency connectivity IV (nQuant) #2

This IV is heavy on low-level QQ in C/C++. Such obscure knowledge won’t help GTD and is not significant zbs. They may improve your design though.

Q3: Memory alignment – what if on the stack I declare 2 char variables? See post on memory alignment.

Q3b: what if I have 2 char fields in a struct?

Q3c: I have two 64-bit ints, one misaligned. When I use them what problems will I have?
A: not much performance penalty. See https://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/

Q1: If inside a member function I call “delete this”, what happens?
%%A: what if this “this” points to an object embedded as a field of an umbrella object? The deallocation would happen, but the destruction of the umbrella object may again deallocate it? This is confirmed in the FAQ (https://isocpp.org/wiki/faq/freestore-mgmt#delete-this)
%%A: how do we know the host obj is on heap, stack or global area i.e. data section.

Q1b: To achieve heap-only, my class has private ctors and private op= and a static factory method. Will it work?
%%A: according to moreEffC++ P146, I would say yes, with certain caveats.

Q2: What’s reinterpret_cast vs dynamic_cast vs static_cast?

Q2b: What other casts are there?

Q: Placement new – can I use the regular “delete”?
%%A: probably no. Need to call the dtor manually? See P42 moreEffC++

Q: How does tcp handshake work? (I don’t know why this nlg is even relevant)
A: http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Connection_establishment

Q: Some tcp parameter to speed it up?
A: larger TCP window size?
AA: TCP_NODELAY

Q: tcp client to specify a non-random port? See post on bind()

Q: If a c++ app runs fine in debug build (compiler optimizations removed), but crashes in release mode, what guesses/clues do you have?
%%A: conditional compilation, like in my c# project
%%A: the compiler optimization leads to unusual execution speed between 2 threads, and cooks up a rare corner case
%%A: I have seen assertions turned on in debug build (a debug STL can also be used), so we know one data file F1 is unusable, and another data file F2 is usable. In release build, someone else tries F1 and it crashes somewhere else.

See c++debug build can modify app behavior!

nQuant c++IV #1 #monitoring

Overall, I remember nQuant IV was rather low-level and heavy on optimization. Mostly QQ.

Q6: how could do you implement a userland thread?
%%A: setjmp and longjmp, as in early JVM

Q6b: how? I feel better be modest and offer my tentative guesses.

Q6c: how is setjump different from goto? Seldom asked!
AA: http://ecomputernotes.com/what-is-c/function-a-pointer/what-is-the-difference-between-goto-and-longjmp-and-setjmp and http://www.geekinterview.com/question_details/3330

Q: by default, is a linux mutex cross-process?
%%A: I guess the mutex in pthreads by default isn’t. Outside linux, the windows mutex is cross-process.
AA: Beyond the default, linux mutex can be cross-process, by using shared memory — http://stackoverflow.com/questions/9389730/is-it-possible-to-use-mutex-in-multiprocessing-case-on-linux-unix

Q: in linux, what’s the difference between a process and a thread?

Q: start 5 copies of the current process, using fork? (Can use the example in [[head first c]] )

Q: If a variable in (2-process) shared memory is marked volatile, is it a reasonable usage of volatile keyword?
AA: now i think so. Similar to a variable writable by a temperature sensor. http://embeddedgurus.com/barr-code/2012/01/combining-cs-volatile-and-const-keywords/ shows use of “const volatile” on a variable in shared memory.
AA: [[moving from c to c++]] section on Volatile (P75) seems to agree

Q: what’s a limitation of the select() system call when there are too many sockets to check, and each messages (~ 2KB) is important.
A: select has max socket count. Use epoll() — http://stackoverflow.com/questions/5357445/select-max-sockets

Q: what linux command to monitor memory usage by a given process, showing size of heap, stack, text, even broken down to shared lib vs static lib
A: cat /proc/{pid}/smaps
A: pmap -x {pid}

Q: given a one-line c program “while(1);”, launch it and /usr/bin/top would show 100% cpu usage. What does it mean?

Q: what’s inside a socket? My socket book has detailed diagrams.
%%A: http://bigblog.tanbin.com/2011/06/5-parts-in-socket-object.html

Q: can a socket be shared by 2 processes?
AA: Yes. See https://bintanvictor.wordpress.com/2017/04/29/socket-shared-between-2-processes/

———I feel most of the questions below are rarely asked.

Q: how many sockets can a single process open? Not too sure. A few hundred?
A: http://stackoverflow.com/questions/410616/increasing-the-maximum-number-of-tcp-ip-connections-in-linux

Q: what linux command to monitor network performance?
%%A: Beside netstat, I have seen tools that report error rates that indicate saturation
A: ss

 

Q: how to remove the word “option” from a resume, unless it is a sub-word? Use perl or python
%%A: Word boundary symbol?

Q: when a user land thread makes a syscall, what’s the implication?
%%A: the thread enters kernel mode?

Q: what’s offered on Layer 2? Can IP simply operate on top of physical layer without Layer 2? Too deep and seldom asked.
A: ethernet is a L2 technology.
A: Each device on a network has a hardware address or MAC address, used by the data link layer

c# coding drill for IV

These questions will seldom hit the phone round…

Q: Implement IEnumerable foo(IEnumerable aa, IEnumerable b) such that foo returns string in either aa or b but not both.

To keep things simple, let’s assume aa holds unique elements, and b too.

Q: regex engine supporting quantifier “*”, and the wild-card “.”
Q: IteratorTest from MS
Q: circular buffer
Q: reverse a linked list in-place
Q: rotate a long array by 3
Q: spreadsheet concretization

(610610 blog has a letter to XR, which shows a list of coding practice projects I completed.)

c++IV data analyst(WorldQuant

Struct base{
char * buffer
base(){ buffer = new char[1000]; }
..
}; //how do we improve this class?

What if the new() throws exception

Q: if you make buffer a smart array then what do you do with the big3?

Q: diff between scoped_array vs shared_array?

Q: what if I have a derived class without virtual dtor?

base * a = new derived;
delete a;

Q: What if I make derived a virtual subclass and have a derivedDerived class?

c++buy-side data support IV (wq) – phone round

Q: inline function vs macro. Not on my Tier 1/2
%%A: macro is frowned upon. Inline is a hint to compiler

Q: reverse a single linked list in-place. How many temp variables needed and how?
A: 3. I wrote this, in this blog

Q: volatile keyword in c++
A: used to be hardware-related. Also used for threading? Not according to a stackoverflow post
Q: throw exception in constructors? Best practices?
A: memory cleanup needed … JVM/CLR handles it!
Q: given an unsigned integer, how do you test if it’s a power of 2.
A: x-1 would be all-1…

2013 Citadel IV – c#, C++

Q: System.Array.CopyTo() vs Clone()
%%A: Clone() is declared in a supertype IClonable, whereas CopyTo is only in Array class – confirmed
%%A: Clone() is controversial – confirmed. http://www.dotnetperls.com/clone
%%A: Clone() returns a new object whereas CopyTo requires a target array of the correct size to pre-exist – correct

AA: Array class has  a static method Clone(), but I feel the real important point is the Clone() controversy.

Q4: What does the “…” mean in C++ catch(…)?
%%A: If the troublemaker has no (not empty) exception spec, then “…” means catch-all. Any type of exception can be thrown and will be caught here
%%A: if the troublemaker has an exception spec of type B, then “…” means all subtypes of B. If another  type of exception is thrown, then unexpected() triggers, with the catch(…) ignored. Tested 🙂
A: i think my answer was correct.

Q4b: What’s the c# counter part?
%%A: catch(Exception) or an empty catch{/**/} — confirmed

Q3: delegate — what is it and what’s the usage?

Q3b: what’s the c++ equivalent (or closest)
Q: What design patterns are behind the c# delegate?
A: for multicast …. observer
A: for unicast … command? The delegate instance usually has a host object
Q: how do you manage memory leaks?

100-lightbulb problem, with my solution

Q: There are 100 switched-off light bulbs along a staircase. Visitor #1 walks upstairs, pushing the on/off switch at each bulb. Visitor #2 walks upstairs hitting bulb #2,4,6…100. Visitor #3 to #100 each walks upstairs. In the end how many light bulbs are on? Can you tell me which bulbs are on? Can you tell me how many times each bulb has turned on?

For each integer between 1 and 100, factor it into “prime” factors. For eg, 72 = 2x2x2x3x3. Each integer is then expressed in the “Beautiful” form like A^n x B^m x C^p… where ABC are increasing prime numbers and nmp are exponents. For eg, 72 = 2^3 x 3^2.

Now (n+1)(m+1)(p+1)… is the number of “toggles” received by the light bulb. For eg, Bulb#72 gets 12 toggles. Bulb 10 gets 4 toggles (due to 2^1 x 5^1). Bulb 13 gets 2 toggles (due to 13^1).

Now let me defined Group-E and Group-O —
If (n+1)(m+1)(p+1)… is even, then the light bulb is in Group-E.
If (n+1)(m+1)(p+1)… is odd, then the light bulb is in Group-O.

Now, (n+1)(m+1)(p+1)… is usually an even integer. It’s odd only if n,m,p… are all even — for eg Bulb#4, #64, #36, …– Group-O. It’s easy to list all of them.

2^2,  2^4,  2^6,  3^2 , 3^4,  5^2,  7^2
2^2×3^2,  2^2×5^2

For all the “prime” bulbs like #2, #3 … #37…, the beautiful form looks like A^1, so these prime bulbs each gets 2 toggles exactly. Therefore Group-E.

Special cases — Bulb #1 gets togged only once — Group-O.

It turned out Group-O is the group of square numbers.

c#Reuters database IV

Database Question

Table E
=========
int id
string empType
string email
string firstName
string lastName

indices:
======
CLUSTERED (id, empType)
UNIQUE(id)
UNIQUE (empType, id, firstName)
DUPLICATE (empType, firstName)

Table E has: 10,000,000,000 rows

Table E has: one row with firstName “bettlejuice”

Table E has: 1,000,000,000 rows with empType = ‘fulltime’

SELECT * FROM E WHERE email = ‘foo@example.com‘;

SELECT * FROM E WHERE id = 100 AND firstName = ‘beetlejuice’;

SELECT * FROM E WHERE empType = ‘fulltime’ AND firstName = ‘beetlejuice’;

SELECT * FROM E WHERE empType = ‘fulltime’ AND firstName = ‘beetlejuice’;

How many pages for leaf nodes does the above query read?

INSERT: Writes how many pages?

c# bbg IV – 15minutes

Q: Can a struct implement an interface
%%A: an interface is a reference type. When casting to the interface type, you get autoboxing
A: http://stackoverflow.com/questions/63671/is-it-safe-for-structs-to-implement-interfaces shows many standard structs implement
IEquatable and other interfaces

Q1 When I create a custom class, what’s the type I derive from?
%%A: System.Object, which is a reference type

Q1b What methods do you usually override in your custom class
%%A: tostring, gethashcode, equals. Fairly good answer.

Q: what are the value types you normally create
%%A: enums and structs

Q: what’s IDisposable

Q: what’s the relationship – IDisposable vs Finalizer

Q: Mutex vs Monitor

never edit table model on non-EDT?

A Swing interviewer asked me

Q: Can a non-EDT thread ever writes to the underlying data-structure of a (table/tree/list) model? 
A: If it does, we are at risk. EDT could be reading it — uncoordinated and unexpected. EDT might see corrupted data. Classic reader/writer race scenario. Murphy’s Law — if it could happen it will.

(In theory, EDT can also update the data due to a message listener or a DB operation –writer/writer race, but) This one scenario alone is reason enough. All underlying data queried by EDT should Never be modified outside EDT.

In one real world live trading app though, some MOM listener threads do indeed update underlying data structure. When that happens, users are not allowed to trigger a refresh — we can disable some buttons. After the update, we send invokeLater(fireXXX()) to refresh the view.

You may want to lock-protect the underlying data, but then EDT might be blocked. Yet, this is widespread. Look at DefaultTableModel in which EDT does access the underlying Vector, which is lock-protected. If the Vector is large, growing it (perhaps 2 folds) can take log(N) time, and EDT would be blocked for that duration. However, in practice this latency is probably below 50ms and unnoticeable.

As a minimum requirement, ALL models need to be thread-safe. That often entails locking or CAS or copy-on-write.

Swing models normally do not use large data structures (since a screen can’t realistically show even 1000 rows). If it does, update should ideally be designed with a worker thread, to populate a replacement data structure, and then ask EDT to swap it in. Many non-ideal situations exist in practice, which calls for specific techniques and tactical solutions.

c# IV – Eikon pre-screening

This is a 1) Knowledge based and 2) algo-based IV. The areas covered are evergreen and stable.

TCP vs UDP
ArrayList vs List
Mutex vs Semaphore — see other blog

Q: lock vs Mutex
%%A: any reference type object can be a lock…

Q3a: implement a data structure with random access and fast add/remove on both ends.
%%A: deque, using sements linked by a directory, as in STL
A: now I know circular array is simpler.

Q3b: How about the directory data structure in your deque
%%A: perhaps a hashtable?
A: Now I know it’s a vector of pointers

Q: 2 servers to synch their clocks on startup. No need for correct time, just the same time.
%%A: Side B will query A (or receive broadcasts) and update itslef. Side A will blindly broadcast. Both query and broadcast are
simple TCP operations.

c# IV questions found online

I want to see some knowledge of things like the GAC, CLR and JIT. Here are some sample C#-specific questions: 
– Where are the framework assemblies stored? How is this useful? 
– Explain the abstract keyword and what is an example of its use? 
– Can you prevent a class from being inherited by another class? If so, how? 
Easy questions:
    * Is System.String a class or a struct (or reference or value type if you prefer).
    * Explain IDisposable and the 'using' statement.
    * Explain public, protected, private, and internal.
Intermediate questions:
    * What's the difference between Hashtable and Dictionary?
    * What does int? mean? Explain the relationship with Nullable.
Hard questions:
    * Explain the following snippet of code: 'from x in collection select new { x.Foo }'. What is the compiler doing? What is the CLR executing?
    * Explain the “yield” keyword. What is the compiler doing internally?
——
# What are Generics ? How do they affect performance.
# How would you define an Anonymous Functions? What is the difference between Anonymous Methods and Lambda Expressions?
# How does an Enumerator (IEnumerable & IEnumerator) work ? How does an Iterator work in C# 2.0 ?
# What is Garbage Collection ? How does it work ? What are generations in GC ?
# What is the role of an Extension Method ? When would you use it ?
# What is a Lambda Expression ? Where would you use it ?
# How would you define a new custom event ?
# How would you choose between a pure abstract base class and an interface.
# What is MVC how does it differ from MVP ?
# What is Dependency Injection / Inversion of Control ?
———
Everyone who writes code
    * Describe the difference between a Thread and a Process?
    * What is a Windows Service and how does its lifecycle differ from a “standard” EXE?
    * What is the maximum amount of memory any single process on Windows can address? Is this different than the maximum virtual memory for the system? How would this affect a system design?
    * What is the difference between an EXE and a DLL?
    * What is strong-typing versus weak-typing? Which is preferred? Why?
    * Corillian's product is a “Component Container.” Name at least 3 component containers that ship now with the Windows Server Family.
    * What is a PID? How is it useful when troubleshooting a system?
    * How many processes can listen on a single TCP/IP port?
    * What is the GAC? What problem does it solve?
Mid-Level .NET Developer
    * Describe the difference between Interface-oriented, Object-oriented and Aspect-oriented programming.
    * Describe what an Interface is and how it’s different from a Class.
    * What is Reflection?
    * What is the difference between XML Web Services using ASMX and .NET Remoting using SOAP?
    * Are the type system represented by XmlSchema and the CLS isomorphic?
    * Conceptually, what is the difference between early-binding and late-binding?
    * Is using Assembly.Load a static reference or dynamic reference?
    * When would using Assembly.LoadFrom or Assembly.LoadFile be appropriate?
    * What is an Asssembly Qualified Name? Is it a filename? How is it different?
    * Is this valid? Assembly.Load(“foo.dll”);
    * How is a strongly-named assembly different from one that isn’t strongly-named?
    * Can DateTimes be null?
    * What is the JIT? What is NGEN? What are limitations and benefits of each?
    * How does the generational garbage collector in the .NET CLR manage object lifetime? What is non-deterministic finalization?
    * What is the difference between Finalize() and Dispose()?
    * How is the using() pattern useful? What is IDisposable? How does it support deterministic finalization?
    * What does this useful command line do? tasklist /m “mscor*”
    * What is the difference between in-proc and out-of-proc?
    * What technology enables out-of-proc communication in .NET?
    * When you’re running a component within ASP.NET, what process is it running within on Windows XP? Windows 2000? Windows 2003?
Senior Developers/Architects
    * What’s wrong with a line like this? DateTime.Parse(myString);
    * What are PDBs? Where must they be located for debugging to work?
    * What is cyclomatic complexity and why is it important?
    * Write a standard lock() plus “double check” to create a critical section around a variable access.
    * What is FullTrust? Do GAC’ed assemblies have FullTrust?
    * What benefit does your code receive if you decorate it with attributes demanding specific Security permissions?
    * What does this do? gacutil /l | find /i “Corillian”
    * What does this do? sn -t foo.dll
    * What ports must be open for DCOM over a firewall? What is the purpose of Port 135?
    * Contrast OOP and SOA. What are tenets of each?
    * How does the XmlSerializer work? What ACL permissions does a process using it require?
    * Why is catch(Exception) almost always a bad idea?
    * What is the difference between Debug.Write and Trace.Write? When should each be used?
    * What is the difference between a Debug and Release build? Is there a significant speed difference? Why or why not?
    * Does JITting occur per-assembly or per-method? How does this affect the working set?
    * Contrast the use of an abstract base class against an interface?
    * What is the difference between a.Equals(b) and a == b?
    * In the context of a comparison, what is object identity versus object equivalence?
    * How would one do a deep copy in .NET?
    * Explain current thinking around IClonable.
    * What is boxing?
    * Is string a value type or a reference type?
    * What is the significance of the “PropertySpecified” pattern used by the XmlSerializer? What problem does it attempt to solve?
    * Why are out parameters a bad idea in .NET? Are they?
    * Can attributes be placed on specific parameters to a method? Why is this useful?
C# Component Developers
    * Juxtapose the use of override with new. What is shadowing?
    * Explain the use of virtual, sealed, override, and abstract.
    * Explain the importance and use of each component of this string: Foo.Bar, Version=2.0.205.0, Culture=neutral, PublicKeyToken=593777ae2d274679d
    * Explain the differences between public, protected, private and internal.
    * What benefit do you get from using a Primary Interop Assembly (PIA)?
    * By what mechanism does NUnit know what methods to test?
    * What is the difference between: catch(Exception e){throw e;} and catch(Exception e){throw;}
    * What is the difference between typeof(foo) and myFoo.GetType()?
    * Explain what’s happening in the first constructor: public class c{ public c(string a) : this() {;}; public c() {;} } How is this construct useful?
    * What is this? Can this be used within a static method?

dynamic glass-drop

http://en.wikipedia.org/wiki/Dynamic_programming#Egg_dropping_puzzle is a similar but more ambiguous problem.


If we toss a wine glass out of a 3rd floor window it may break, but it may not if tossed from a 2nd floor window.

Q: with exactly two wine glasses taken from a (Japanese:) production line, you want to identify starting at exactly which level of a 11-floor tower this brand of glass will break if tossed out of the window. Equivalently, determine the highest safe floor. Optimization goal — minimize worst-case cost. Top floor is known to break, so answer might be 1st floor, 2nd floor,… or top floor — 11 possible answers, but 10 floors to test.

Needless to say, if the glass breaks at 6th floor, it will surely break at a higher level.

—- Analysis —-
With just a single glass, you have no strategy. You have to try from 1st floor up. If your classmate starts on 2nd floor and breaks her only glass there, she is disqualified because with no more glass to use, it’s now impossible to know whether it would survive1st floor.

I feel this is NOT a probability problem, so each toss is bound to pass or fail with 100% certainty but we just don’t know it.

We don’t want to toss first glass @2nd, @4th, @6th… because too many first-glass tests. Neither do we want to toss first glass @5th because if broken, then 2nd glass must try @1@2@3@4. Therefore first glass toss should be somewhere between ( @2, @5 ) exclusive.

—-%% solution, with heavy use of symbols —–
I feel this problem is tractable by dynamic programming. Let
* L denote the highest _safe__ floor known so far, and
* U denote the lowest _unsafe_ floor known so far. U is initially at top floor, and L is initially 0, not 1
* u denote U-1

u >= L is an invariant. The Upper and Lower bounds. (u-L) is the maximum number of floors to test. When we get u-L = 0 and final Answer is the current value of U.

Here’s a typical example
6 —- U known unsafe
5 —- u, so 3 floors to test (u-L)
4
3
2 —- L known safe

z8 denotes the worst case # tests required to locate the Answer with both glasses when u-L = 8 i.e. 8 floors to test. Could be z(L=0, u=8) or z(L=1,  u=9). Same thing.

For example, z(1,3) means worst case # tests when 1st and 4th storeys need no testing. Note z(1,3) == z(2,4) == z2

z1 = 1 i.e. one toss needed
z2 = 2
z3 = 2 // B! C || BA. The untested floors are named (upward) ABCDEFG (if 7 floors to try), and trailing ! means broken.

z4 = 3 // B! CD || BA
z5 = 3 // C! +2 || C then z2 // if C broken, then must test last glass at D and E
z6 = 3 // C! +2 || C z3  // ABCDEF
z7 = 4 // (D! + 3 || D z3) or (C! + 2 || C z4)
z8 = 4 // (D! + 3||D z4) or the smarter (C! + 2||C z5)
z9 = 4 // (D! + 3||D z5) or (C! + 2 || C z6)
z10 = 4 // (D!+3||D z6). Can’t use (C! + 2 || C z7) — for a 11-storey tower, we need 4 tosses.

I think there are cleverer implementations. This is not brute force. It builds up useful results with lower z() values and use them to work out higher z() values — dynamic programming.

This solution not only finds where to start the test and the worst-case # tests. It also works out the exact decision tree.

dynamic poker game in Zhou Xinfeng book

Q: you are dealt the 52 cards in a random sequence.
Each red card earns you $360
Each black card costs you $360
Game ends at end of the 52 draws or any time you quit. How do you optimize your earning and how much is admission ticket worth?

Let’s denote the amount of money you take home as H, a random variable. Your net profit/loss would be H minus admission price. If 555 reasonable/intelligent people play this game, then there would be five hundred and fifty-five values of H. What’s the average? That would be the answer.

p or “+” denotes the # of positive cards earned so far
n or “-” denotes the # of negative cards suffered so far

Exp(H|p=25,n=26) = Exp(H|p=26,n=26) = $0.
There’s an intrinsic value in our hands, Defined as i=(p-n). The e-value or Extrinsic value Evaluated from Expectation, may be higher or lower. Whenever i-value > e-value, we should exit. This is our strategy/policy.

Whenever intrinsic value <$0, E(H) is out of the money but always non-negative because we can wait till maturity and get all 52 cards.

E(p24,n26) = p(next is p)E(p25,n26) = 100%E(p25,n26) = $0
E(p26,n25) = $360     because we will exit

E(p25n25) = Prob(next is p)E(p26,n25) + Prob(next card is n)E(p25,n26) = 50%$360+0 = $180
E(p24n25) = p(p)E(p25,n25) + p(n)E(p24,n26)= p(p)E(p25,n25)+$0 = 66%$180= $120
E(p25n24)= p(p)E(26,n24)+p(n)E(p25,n25)= 33%$360×2 + 66%$180= $360

E(p24,n24)= half of E(25,24)+E(24,25)= half of  $360+$120= $240
E(p23,n25)= p(p)E(24,25)
+ p(n)E(p23,n26) # $0
=3/4x$120= $90
E(p23,n24)= p(p)E(24,24)+p(n)E(23,25)= 3/5 .$240+2/5 .$90

dynamic dice game (Zhou Xinfeng book

P126 [[Zhou Xinfeng]] presents — 
Game rule: you toss a fair dice repeatedly until you choose to stop or you lose everything due to a 6. If you get 1/2/3/4/5, then you earn an incremental $1/$2/$3/$4/$5. This game has an admission price. How much is a fair price? In other words, how many dollars is the expected take-home earning by end of the game?

Let’s denote the amount of money you take home as H. Your net profit/loss would be H minus admission price. If 555 reasonable/intelligent people play this game, then there would be 555 H values. What’s the average? That would be the answer.

It’s easy to see that if your cumulative earning (denoted h) is $14 or less, then you should keep tossing.

Exp(H|h=14) is based on 6 equiprobable outcomes. Let’s denote Exp(H|h=14) as E14
E14=1/6 $0 + 1/6(h+1) + 1/6(h+2) + 1/6(h+3) + 1/6(h+4) + 1/6(h+5)=$85/6= $14.166

E15=1/6 $0 + 1/6(h+1) + 1/6(h+2) + 1/6(h+3) + 1/6(h+4) + 1/6(h+5) where h=15, so E15=$15 so when we have accumulated $15, we can either stop or roll again.

It’s trivial to prove that E16=$16, E17=$17 etc because we should definitely leave the game — we have too much at stake.

How about E13? It’s based on 6 equiprobable outcomes.
E13 = 1/6 $0 +1/6(E14) + 1/6(E15) + 1/6(E16) + 1/6(E17) + 1/6(E18) = $13.36111
E12 = 1/6 $0 + 1/6(E13) +1/6(E14) + 1/6(E15) + 1/6(E16) + 1/6(E17) = $12.58796296

E1 =  1/6 $0 + 1/6(E2) +1/6(E3) + 1/6(E4) + 1/6(E5) + 1/6(E6)

Finally, at start of game, expected end-of-game earning is based on 6 equiprobable outcomes —
E0 =  1/6 $0 + 1/6(E1) + 1/6(E2) +1/6(E3) + 1/6(E4) + 1/6(E5) = $6.153737928

2-headed coin – HeardOnTheStreet & sunrise problem

See [[Heard On the Street]] Question 4.18. 10 heads in a row is very unlikely on a fair coin, so you may feel Prob(unfair) exceeds 50%.

However, it can be fairly unlikely to pick the unfair coin.

In fact, they are equally unlikely. It turns out the unlikelihood of “10 heads/fair” is numerically close to the unlikelihood of “picking an unfair”. P(Fair coin picked) ~= 50/50

This brings me to the sunrise problem — after seeing 365 * 30 (roughly 10,000) consecutive days of sun rise or 10,000 consecutive tosses showing Head, you would think P(H) is virtually 100%, i.e. you believe you have a 2-headed coin, but what if you discover there’s only 1 such coin in a pool of 999,888,999,888,999,777,999 coins? Do you think you picked that one? Or do you think you had 10,000 lucky tosses? It may turn out you need more luck to pick that special coin than getting 20 heads on a fair coin. In such a case, you are more likely to have a fair coin than an unfair coin. Your next toss would be 50/50.

mother of 2 kids, at least 1 boy

A classic puzzle showing most people have unreliable intuition about Cond Prob.

Question A: Suppose there’s a club for mothers of exactly 2 kids — no more no less. You meet Alice and you know she has at least one boy. What’s Prob(both boys)?
Question K: You meet Kate (at clubhouse) along with her son. What’s P(she has 2 boys)?
Question K2: You also see the other kid in the stroller but not sure Boy or Girl. What’s P(BB)? This is essentially the same question on P166 [[Cows in the maze]]

Solution A: 4 equi-events BB/BG/GB/GG of 25% each. GG is ruled out, so she is equally likely to be BB/BG/GB. Answer=33%

Solution K: 8 equi-events BB1/BB2/BG1/GB2/BG2/GB1/GG1/GG2. The latter 4 cases are ruled out, so what you saw was equally likely to be BB1/BB2/BG1/GB2. Answer=50%

Question C: Each mother wears a wrist lace if she has a boy and 2 if 2 boys (Left for 1st born, Right for 2nd born). Each mother comes with a transparent (hardly visible) hairband if she has either 1 or 2 boys. There are definitely more wrist laces than hairbands in the clubhouse. If you notice a mother with a hairband, you know she has either 1 or 2 wrist laces. If you see a wrist lace, you know this mother must have a hairband.

C-A: What’s P(BB) if you see a mother with a hairband?
C-K: What’s P(BB) if you see a mother with a wrist lace on the left hand?

Solution C-A: Out of 2000 mothers, 1500 have hairband. 500 have 2 boys. P(BB) = 33%
Solution C-K: 500 have 2 wrist laces; 500 have only a left wrist lace; 500 have only a right wrist lace. P(BB) = 50%

Seeing a wrist lace is not the same as seeing a hairband. The 2 statements are NOT equivalent. Wrist laces (2000) outnumber hairbands (1500). A wrist lace sighting guarantees a hairband, so a wrist lace is more Rare, and a hairband  sighting is more Common. Within the clubhouse, 3 out of 4 hairband “tests” are positive, but only 2 out of 4 wrist lace tests are positive.

 Applied to original questions…
* Alice wears hairband but perhaps One of her wrists might be naked. If she brings one child each time to clubhouse, we may not always see the a boy.
* Kate wears at least one wrist lace (so we know she has a hairband too).

$ if we randomly “test” Alice for wrist lace on a random hand, she may fail
$ if we randomly “test” Alice for hairband, sure pass.
–> the 2 tests are NOT equivalent.

$$ if we randomly “test” Kate for wrist lace on a random hand, she may fail
$$ if we randomly “test” Kate for hairband, sure pass.
–> the 2 tests are NOT equivalent for Kate either

The wrist-lace-test pass implies hairband-test pass, but the same knowledge object contains additional knowledge. The 2 tests aren’t equivalent.

—– How is Scenario K2 different from A?
–How many mothers are like K2? We need to divide the club into 8 equal groups
* perhaps Kate is from the BB group and you saw the first kid or the 2nd kid
* perhaps Kate is from the BG group and you saw the first kid – BG1
* perhaps Kate is from the GB group (500 mothers) and you saw the 2nd kid – GB2. Now if you randomly pick one hand from each GB mother then 250 of them would show left hand (GB1) and 250 of them would show right hand (GB2). Dividing them into 2 groups, we know Kate could be from the GB2 group.
=} Kate could be from bb1, bb2, bg1, gb2 groups. In other words, all these 4 groups are “like Kate”. They (1000 mothers) all wear wrist lace, but not all having wrist lace are like-Kate — The bg2 (250) and gb1 (250) mothers are not like-Kate

–How many mothers are like Alice? 75% consisting of BB BG GB
——-
^ Spotting a hairband, the wearer (Alice) is equally likely from the 3 groups — BB(33%) BG(33%) GB(33%)
^ Spotting a wrist lace, the wearer (Kate) is more likely from the BB group (50%) than BG(25%) or GB(25%) group.

If I hope to meet a BB mother, then spotting a wrist lace is more valuable “signal” than a hairband.  Reason? Out of the 2000 mothers, there are 2000 wrist laces, half of them from-BB. There are 1500 hairbands, and a third of them are from-BB.

Further suppose each twin-BB mother gets 100 free wrist laces (because wrist lace manufacturer is advertising?), and all the BB mothers claim to have a twin-BB. As a result, wrist laces explode. Virtually every wrist lace you see is from-BB.
——-
There are many simple ways of reasoning behind the 33% and 50%, but they don’t address the apparent similarity and the subtle difference between A and K. When would a reasoning become inapplicable? It’s good to get to the bottom of the A-vs-K difference, the subtle but fundamental. A practitioner needs to spot the difference (like an eagle).

reliably convert Any c++ source to C : IV

More than one person asked me “Can you compile a c++ app if you only have a c compiler?”

Some of the foremost experts on c/c++ compiler said on http://www.edg.com/faq/convert —

If you mean “can you convert C++ source to C source, and run the C through a C compiler to get object code“, as a way to run C++ code on a system that has only a C compiler, yes it is possible to implement all of the features of ISO standard C++ by translation to C source code, and except for exception handling this produces object code with efficiency comparable to that of the code generated by a conventional compiler.

For exception handling, it is possible to do an implementation using setjmp/longjmp that is completely conformant, but the code generated will be 5-20% slower than code generated by a true c++ compiler.

16bit page-counter to Estimate when a webpage is hit the 1,000,000th time

We don’t need to estimate the hit Frequency which could be time-varying.

If we can use a 2 counters, we can precisely determine the 1,000,000th hit. One slow-counter one regular fast counter. But that means we use 32 bits for our counters, like a 32-bit counter!

void increment(){ // was named count()?
  static int count16bit;

  long unsigned int now = system_time_in_nanos();
  if (now%16 == 0){ //just check last 4 bits of “now”
     count16bit++;
     //now check count16bit
     if  (count16bit <= 0){cout<<"got it"<<endl; }
  }
}

https://en.wikipedia.org/wiki/Approximate_counting_algorithm#Algorithm is the standard implementation of a similar idea.

BNP (NY) FX algo trading IV 2012

Tick data needed for … back testing, analysis? Quants use R, matlab etc to analyze data. IT don’t use those.

Q: how can you help my desk make more money?

Q: from a low-level interviewer — describe to me a challenging technical design (and expect me to drill in).

Q: We know the drawback of auto_ptr. Now write an auto_ptr template class spec (without function bodies) with private op=()

Q: describe some threading challenge in your low-latency system.

Q: how would you use pigeons for IPC (possibly across machines)?
Q: what’s a unix domain socket?

Q: is Throwable checked or unchecked exception?

Q: how many incoming/outgoing orders a day?
Q: peak incoming order rate

(I feel they noticed my pretrade pricer as it’s similar to their market making engine.)

Q3: what pretrade pricing rules did you have? Describe how they work in real life
Q3b: what’s the peak price update frequency?

Q: in your pretrade pricing engine, how many messages do you send out each day?

Q: how does the pretrade pricing system interact with other systems?
Q: are the conduits more retail or institutional?
Q: how does the asset valuation work?

%%Q: what do quants here do differently from quants in der pricing?
A: tune parameters; analyze data and come up with strategies; Sounds simple but a lot of work.
%%A: constantly improve the strategies and parameters.

%%Q: 20,000 euro loss would be realized loss?
A: yes

%%Q: what if multiple strategies trade the same currency pair without knowing each other?
A: multiple live strategies can trade the same pair. They should be designed to minimize correlation and know each other.

c++/vol IV: FX risk (barcap?) mid2012

Q: basic exception guarantee
Q: stochastic vol vs local vol?
Q: what’s RAII? What are the major classes you know using RAII
%%A: smart pointers, locks, stl containers, strings??

Q: what synchronization classes are there in c++?
Q: can a static member function be const?
%%A: no. the const is on “this”

Q: is it ok to mark a field mutable
%%A: I think so. student.getAge() can modify lastAccessed timestamp.

Q: what are the option models you know

client conn IV#FIX/java

Q: OrdStatus vs ExecType in FIX?
http://fixwiki.fixprotocol.org/fixwiki/ExecutionReport
http://fixprotocol.org/FIXimate3.0/en/FIX.5.0SP2/tag39.html — good to memorize all of these

Each order can generate multiple execution reports. Each report has an exectype describing the report. Once an order status changes form A to B, OrdStatus field changes from A to B, but in the interim the exectype in the interim execution reports can take on various values.

Enum values of exectype tag and ordstatus tag are very similar. For example, a partial fill msg will have exectype=F (fill) and ordstatus=1 (partial fillED).

exectype is the main carrier of the cancellation message and the mod message.

In a confusing special case, The exectype value can be the letter “i”, representing OrderStatus, spelt “OrderStatus”.

Q: what’s a test request in FIX
AA: force opposite party to send a heartbeat.

Q: what’s send Message() vs post Message in COM?
http://wiki.answers.com/Q/What_is_the_difference_between_PostMessage_and_SendMessage

Q: intern()?

Q: are string objects created in eden or ….?

Nomura FX option IV #Ldn video link

Q: what are the products traded on your desk
%%A: stock/ETF/index options + var swap for flow vol. For EFS, there are a lot – digital options, barrier options, Asian style options, structures with 2 underliers
Q: how do you monitor the barriers – discrete or continuous or …?
Q: what does a typical quant library API function looks like?
Q: You said c++ is more challenging, but what technical challenges do you see using c++ compared to java?
(Now I would say pointer, memory mgmt, STL, undefined behaviors…)
Q: how is a libor yield curve constructed?
Q: what’s the rationale for using swap rates to derive spot rates for a long tenor?
Q: what’s the rationale for using libor futures rates to derive spot rates? 
Q: for risk management, what features of the vol surface are measured/monitored?
%%A: skew bump, tail bump. Wings are known to be problematic.
Q: did you work on PnL explain?
Q: Our trades are often long gamma. What happens to their thetas?

Nomura FX option + java IV

Q: Does your system cover tenor based risk?

Q: Between Equity vol and FX vol systems, do you monitor the same greeks?
%%A: in eq, it’s about delta gamma vega and theta. In FX, I believe these are all important. But I guess rho is probably more important than in eq, since FX is very sensitive to interest rate and there are 2 interest rates in each currency pair.
A: delta/gama/vega/theta are the big 4. rho is insignificant in both eq and FX. See http://bigblog.tanbin.com/2011/06/rho-of-vanilla-fx-option.html

Q: You mentioned that you build thousands of Libor yield curves each day? How do you do that?
%%A: using deposit rates, futures rates, swap rates, and year-end turns

Q: You mentioned EOD marking that feeds into downstream PnL attribution. How do you do that?

What Unit test tools did you use?

What’s a good unit test?

In java how do you compare 2 strings?

Why is java string designed to be immutable ?

SCB threading design IV: caching remote DB

This could be a good story to tell during interviews.

The MOM design is still unfinished but I think a practitioner could see we are on the right track.

Q: A remote position data source (a DB or some opaque data source) is super slow. It takes up to 500ms to return a position object when queried. The read(int key) method is provided by the DB’s client-side API and is thread safe, so multiple threads can call read() concurrently. Now, we need some kind of caching to minimize the 500ms latency, but we want a lazy cache, partly to avoid overloading this busy DB — We don’t want to load all 9 million positions when we are interested in only a few positions. Our cache shall provide an API like a Cache class with a method

“public Position lookup(int key) const”

If a local lookup thread is blocked due to a cache miss (resulting in a DB read), it should not block other unrelated lookup threads. Also, if 2 threads happen to call lookup(123) in quick succession which is a cache miss, 2nd thread should block and gets the result as soon as first thread receives the position and populates the cache. Perhaps some kind of notification.
———– Analysis ———–
Similar to my B2bTradingEngine. I feel we can address this either with low-level tools or with high level tools. Perhaps the interviewer is interested in the low-level know-how, or perhaps interested in the higher-level design techniques and principles. I feel it’s more challenging to assume high volume and slow connections, more relevant to the growing big data economy. Perhaps a distributed infrastructure.

It’s generally easier to scale down an over-engineered design than to scale-up a simplistic design.
———– MOM based idea ———-
(not yet a “design”). If the request volume is too high, then we run the risk of creating thousands of blocked threads. MOM to the rescue. I will refer to the MOM-based server process as the “engine”.

The client thread would have to be “decoupled” from the engine thread, otherwise 5000 concurrent requests would map to 5000 threads in the MOM server. To decouple, the client thread (one of 5000 or 55000) could block as a synchronous MOM consumer, or the client thread could register an onMsg() callback.

On the MOM server, we check request against the cache and returns the Position if hit — synchronously. Position data goes out in a response “topic”, so dozens of clients could filter the responses to look for their desired position. What if cache miss? Request goes into another queue (cacheMissQueue). Add the key to a task queue (producer-consumer). The consumer thread pool could have 1 or more work threads. Assuming the read() method is blocking, the worker thread must block. Upon failure, we update the cache with a special record for key 123. Upon success, we update the global cache. Either way we publish to the response topic.

Note If a request key 123 is already being processed or in the task queue, then we won’t even add it to the task queue. The task queue should “see” each key only once.
——–Low level design ——-
Now let’s assume this is a low-level single-process threading challenge, without MOM. The lookup() method could be called on 50 threads simultaneously. If 2 threads both request key 123 (cache miss), they must both block. Each reader thread should create a lock tagged with the key. During the read(), the lock should not be held. Upon successful read, the reader thread should lock, update cache, notify, unlock, then return the Position object. If in the interim a 2nd thread (a want2read thread) also has a cache miss on key 123, it would look for the lock tagged with 123, lock, then wait in it. Upon wake-up, it check cache. If found, it exits the loop and returns the Position object.

If read() times out or fails, the reader thread would fake a special Position object with an error msg, and do the same as above.

%%A: We need a thread pool to consume the queue. The pool threads will execute processOneReadTask(), “complete” the Position object in cache and notify on the position id

class Cache{
private:
   std::map<.....> * map; // should use shared_ptr<position>
   Mutex mutex;
   Condition cond;
   Queue<int> queue; //thread safe queue, not std::queue
   DB * db;
   Cache():
     map(new std::map<....>),
     mutex(createMutex()),
     cond(mutex),
     queue(createQueue()),
     db(getRemoteDB()) {}
   void submitReadTask(int key){
         queue.enqueueIfNew(key); //thread-safely enqueue, if key has never been enqueued
   }
public:
   //callable by clients
   Position * lookup(int key){
       ScopedLock lock(mutex); //may block
       Position * found = map.findByKey(key);
       if (found) return found;
       this.submitReadTask(key);
       while(1){
          cond.wait();
          Position * found = map.findByKey(key);
          if (found) return found;
       }
   }
   // executed by pool threads
   void processOneReadTask(){
       int key = queue.pop();
       if (key == -1) return; // empty queue
       Position * readOut = this.db.read(key); //slow
       ScopedLock lock(this.mutex); //may block
       map.populate(key, readOut);
       this.cond.signalAll();
    }
}; //end of class

How about the listener solution instead of wait/notify? If we have 500 threads requesting position 123. They all block in wait() — memory-intensive. The listener solution means each thread would package its local state info into a state object and save it into a global collection. A few threads would be enough. For example, a single thread could wake up upon notification and sequentially complete the tasks of the 500 requests.

weekend IV – Ab Initio

Which project has the largest data volume and complexity?

How do you usually implement xml transform — in java or using stylesheet?

How do you manage both latency and throughput?

What’s the level of latency you deal with — micros or millis?

(Real time fraud detection 2000 transactions/sec, with millis latency.)

How do you diagnose latency issues?

How do you measure latency?

If you use simple logging technique to record latency, then how do you minimize the latency added by logger itself?

In-memory databases — how do you manage the disk read/write.

time-series databases — how do you query them?

Complex data analysis — use RDBMS or implement in java?

Experience with http://en.wikipedia.org/wiki/OLAP_cube?

Experience with SybaseIQ?

Many real world data processing systems receive xml messages, process it in DB then send back an xml reply. How would you deal with this? XSL transform is slower than parsing in java. I feel xml transform needs to traverse the entire xml tree, so it needs the DOM tree, which is slower than SAX parser.

Present a past project with parallel processing, large volume, possibly async messaging, good latency and throughput.

IV: local volatility by OCBC Bertrand

Q: tell me 1 or 2 low-latency challenges in your projects.
Q: what’s a variance swap?
Q: what’s your favorite messaging vendor product? (MQ is firm standard but too slow for this team.)
Q: Where do you get your implied vol numbers? From exchange or you do your own inversion? (I guess exchanges do publish quotes in vol.)
Q: how many tenors on your vol surface?
Q: how do you model term structure of vol?
Q: how do you perform interpolation when you query the vol surface?
Q: you mentioned various vol models (taylor, cubic etc) but how do you decide which model to use for each name?
Q: describe BS equation in simple language, and what are the inputs?
Q: FX vol is delta sticky. On X-axis you see 25 delta, 10 delta, atm vol etc. How about eq?
Q: any example of structured vol products?
Q: what’s dv01?
Q: which asset class are you most comfortable with?
Q: in your systems there are often multiple languages. How do you make them interoperable?
Q: what’s java generic?
Q: value type vs reference type in c#
%%Q: who will validate the prices and decide if it’s a reasonable or valid price?
A: traders. Even though they aren’t fully trained as quants are. Some managers also have pricing knowledge. Perhaps no real quants.
———
Q1: what’s fwd vol vs local vol?
Q: you said variance is additive? Can you elaborate?
Q: have you heard of total variance?
Q3: what’s put-call parity?
%%A: C + K = P + F. Each of the 4 can be synthesized using the other 3

Q3b: can you show me one synthetic portfolio, say a synthetic long call?
Q: Can you explain PCP in terms of deltas?

Q: you mentioned code bloat due to c++ templates, so how does c++ deal with that?
Q: have you heard of c++ strong exception guarantees vs basic exception guarantees?
A: http://en.wikipedia.org/wiki/Exception_guarantees

Q: how does java generic differ from c++ and c# — take a hashmap for example?
Q: c# hashmap of integer? Is it special?
Q: you mentioned java hashmap of integer involves autoboxing, so what happens during autoboxing?
Q: What smart pointers have you used?
Q: if i were to use a smart pointer with my legacy class, do I have to modify my class?

Q7: you mentioned java generic was designed for backward compatibility, but why not add a new type HashMap in addition to the old non-generic HashMap class?
%%A: old and new must be interoperable

Q7b: what do you mean by interoperable?

Q: tell me one project with low latency

— some answers revealed —
A1: local vol is designed to explain skew. During the diffusion, instantaneous vol is assumed to be deterministic, and a function of spot price at that “instant” and TTL.

I guess what he means is, after 88 discrete steps of diffusion, the underlier could be at any of 888888 levels (in a semi-continuous context). At each of those levels, the instantaneous vol for the next step is a function of 2 inputs — that level of underlier price and the TTL.

A3: easiest way is “call premium – put premium == fwd price” (cash amount to be added to LHS?)

c++ Push all zeros in array to the end #easy

Basic programming exercise

#include <cstdlib>
#include <iostream>
#include <iterator>
#include <assert.h>

using namespace std;
int a[] = {0,0,-1,2,0,4,0,0,8,0};
/*
Push all the zero's of a given array to the end of the array. In place only. Ex 1,2,0,4,0,0,8 becomes 1,2,4,8,0,0,0
*/
int main(int argc, char *argv[])
{
   size_t size = sizeof(a)/sizeof(a[0]);
   copy(a, a+size, ostream_iterator<int>(cout, " "));
   int left = 0;
   int right = size -1;
   while(left < right){ if (0==a[left]){ while (0==a[right]) --right; if (left >= right) break;
          swap(a[left], a[right]);
       }
       ++left;
   }
   cout<<"\n new: \n";
   copy(a, a+size, ostream_iterator<int>(cout, " "));
   cin.get();
}

Fwd: MS IRD IV

MS
Gel (testing algo)

1.       Write a blockingqueue using wait / notify
2.       Traversal of non-binary-tree,
3.       What if the leaf pointing back to the root, (infinitive loop), how to prevent it?

Solamon (testing threading)
1.       Write a thread safe class, getValue, incr (int val++) (sync on both, use volatile, or use AtomicInteger);
2.       If AtomicInteger is used, do u know how its method incrementAndSet work? oldVal = value; newVal = value; if(newVal != value)…
3.       Besides the performance penaty of lock, what’s the big problem locking have? Deadlock
4.       How to generate a deadlock?
5.       The performance issue with locking? Like val++, that require two machine inst, but sync around it will take hundreds of inst; on the other hand, database ops take million of intrs so 100 is neglective
6.       Lock free system?
7.       Sorting types. Bubble sorting, selection sorting, merge sorting and quick sorting. Performance of each, in big O notation. For merge sorting, is nlogn. N for divide and logn for merge.
Joe (testing architecture)

1.       Write customized list class that will work as general list but with version history info. The signature like get(int index, int verNum)
set(int index, String val) it will return new version number;

2.       2*3-5, parse the string, the final string form is 23*5-, input is a bin-tree like
      –
     /
   *   
  /   
2 3   5

Do the pre-order transversal, the output will be 23*5-

3.       List, break down to a batch of lists, list<list>, the size of inner list is about 10.

throwing dtor: %%justified use cases

Status — still I don’t have a well-justified use case.

I feel throwing dtor is frowned upon but not “illegal”. In practice it’s seldom done by design. When used, it’s not worth the analysis. In interviews, however, this receives disproportionate attention.

However, in practice, there are reasons to break the rule. Suppose I have

int i1, i2;// temp variables,
try{
  i1 = myobj->eat();
  myobj->drink(&i1);
  //....
  delete myobj;
}catch(business_exception & be){
  //handle exception using the temp variables i1, i2 etc
}

Since eat(), drink() etc and dtor all throw the same business_exception, this code is clean and maintainable. If we need to throw more than one exception type from those 3 functions, we can easily add the code. The same exception handler is used as a catch-all.

It would be messy to pass the temp variables i1, i2 etc into myobj dtor and replicate the same exception-handling logic therein.

So in this case, I’d make myobj dtor throw business_exception.

Now, as described in [[moreEffC++]] myobj dtor is invoked as part of stack unwinding due to another exception? [1] Well, in this case, I know that’s a fatal scenario and I do want system to crash anyway, like an assertion error, so the terminate() behavior is not unacceptable.

In other words, myobj’s class is written such that its dtor should throw exception only under normal object destruction and should never be part of an exceptional stack unwinding. In such a case, no one should misuse this class in an exception-unsafe context. If they ignore the restrictions on this class, they could get this dtor invoked as part of an exceptional stack unwinding, and the consequence is something they must deal with.

[1] in c++11, system will trigger std::terminate() whether or not this is part of unwinding. See https://akrzemi1.wordpress.com/2011/09/21/destructors-that-throw/

c++highly parallel numerical programm`IV(smart ptr, threading…

These are mostly QQ type. Jap firm https://www.numtech.com

Q: What are the differences between win32 and linux threading Implementation? I probably won’t spend time reading this.

Q: given a bunch of 10-year old linear algebra c++ functions (using many VD or MD templates as function inputs), how would you go about extracting and packaging them into a DLL? No experience. Not on my Tier 1/2.
%%A: stateless
%%A: pure functions.
%%A: Thread safe
%%A: remove code duplication
%%A: for each function, there should usually be default parameters, so we can call it with 2 args, 3 args, 4 args etc
A: function parameters should not be templates. Ints, double are probably fine. DLL interface is defined by the platform, not the language.

Q: what’s wrong with boost libraries?
AA: (STL is fine) many of them aren’t proven — just 10 years

Q: Since you said shared_ptr is the most popular boost module, how is reference count done in heavily parallel programming
%%A: for regular shared_ptr, thread safety is a big design goal, so probably fine[1]. For intrusive_ptr, the pointee class must expose certain mutator methods, which must be made thread-safe by the pointee class author, not boost authors

[1] isn’t correct. The testing on shared_ptr thread safety isn’t sufficient for large number (thousands) of threads. See https://bintanvictor.wordpress.com/2017/07/09/shared_ptr-thread-safety-take/

 

island rainfall problem – C array/pointer algo

#include <cstdlib>
#include <cstdio>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <assert.h>
using namespace std;
int const island[] = { 54, 50, 54, 54, 52, 55, 51, 59, 50, 56, 52, 50 };
///////////////   Pos # 0   1   2   3   4   5   6   7   8   9  10  11
int const size = sizeof(island) / sizeof(int);
int accu = 0;
template<class ForwardIterator>
ForwardIterator max_element_last(ForwardIterator first, ForwardIterator last) {
    ForwardIterator ret = first;
    if (first == last)
        return last;//empty range
    while (++first != last)
        if (*ret <= *first)
            ret = first;
    return ret;
}
void print1(int const* const a, char const * const label) {
    printf("%s=%d/%d ", label, *a, a - island);
}
void printAll(int const* const L, int const* const l, int const* const h,
        int const* const H) {
    if (l < h) {
        print1(L, "wallL");
        print1(l, "ptr");
        printf("  ");
        print1(h, "ptr");
        print1(H, "wallH");
    } else {
        print1(H, "wallH");
        print1(h, "ptr");
        printf("  ");
        print1(l, "ptr");
        print1(L, "wallL");
    }
    printf("%d=accumulated\n", accu);
}
void onePassAlgo(){
    int*wallLo, *wallHi;
    int*l, *h; //moving pointers
    wallLo = l = const_cast<int*> (island);
    wallHi = h = const_cast<int*> (island) + size - 1;
    if (*l > *h) {
        std::swap(l, h);
        std::swap(wallLo, wallHi);
    }
    printAll(wallLo,l,h,wallHi);
    printf("All pointers initialized\n");
    while (l != h) {
        if (*l > *wallHi) {
            wallLo = wallHi;
            wallHi = l;
            std::swap(l, h);
            //printf("new wallHi:");
        } else if (*l >= *wallLo) {
            wallLo = l;
            //printf("new wallLo:");
        } else {
            accu += *wallLo - *l;
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wallLo - *l,
                    l - island, accu);
        }
        printAll(wallLo,l,h,wallHi);
        //now move the scanner
        if (l < h)
            ++l;
        else
            --l;
    }
}
void twoPassAlgo() {
    int const* const peak = max_element_last(island, island + size);
    printf("highest peak (last if multiple) is %d, at Pos %d\n", *peak, peak
            - island);
    //(island, island + size, ostream_iterator<int> (cout, " "));
 
    //forward scan towards peak
    int* pos = const_cast<int*> (island); //left edge of island
    int* wall = pos;
    for (++pos; pos < peak; ++pos) { if (*wall > *pos) {
            accu += *wall - *pos; // accumulate water
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wall - *pos,
                    pos - island, accu);
            continue;
        }
        //ALL new walls must match or exceed previous wall.
        printf("found new wall of %d^ at Pos#%d\n", *pos, pos - island);
        wall = pos;
    }
    cout << "^^^ end of fwd scan ; beginning backward scan vvv\n";
    //backward scan
    pos = const_cast<int*> (island) + size - 1;
    wall = pos;
    for (--pos; pos > peak; --pos) {
        if (*wall > *pos) {
            accu += *wall - *pos; // accumulate water
            printf("adding %d liter of water at Pos#%d (T=%d)\n", *wall - *pos,
                    pos - island, accu);
            continue;
        }
        //Note all new walls must match or exceed previous wall.
        printf("found new wall of %d^ at Pos#%d\n", *pos, pos - island);
        wall = pos;
    }
}
int main(int argc, char *argv[]) {
    onePassAlgo();
}
/*
 Requirement -- an island is completely covered with columns of bricks. If
 between Column
 A(height 9) and Column B(10) all columns are lower, then we get a basin to
 collect rainfall. Watermark level will be 9.  We can calculate the
 amount of water. If I give you all the columns, give me total rainfall collected.
 Code showcasing
 - stl algo over raw array
 - array/pointer manipulation
 - array initialization
 - array size detection
 - std::max_element modified
 - std::swap
 */

divide-by-zero: c++ y no excp; y java throws

https://stackoverflow.com/questions/8208546/in-java-5-0-statement-doesnt-fire-sigfpe-signal-on-my-linux-machine-why explains best.

http://stackoverflow.com/questions/6121623/catching-exception-divide-by-zero — c++ standard says division-by-zero results in undefined behavior (just like deleting Derived via a Base pointer without virtual dtor). Therefore programmer must assume the responsibility to prevent it.

A compliant c++ compiler could generate object code to throw an exception (nice:) or do something else (uh :-() like core dump.

If you are like me you wonder why no exception. Short answer — c++ is a low-level language. Stroustrup said, in “The Design and Evolution of C++” (Addison Wesley, 1994), “low-level events, such as arithmetic overflows and divide by zero, are assumed to be handled by a dedicated lower-level mechanism rather than by exceptions. This enables C++ to match the behavior of other languages when it comes to arithmetic. It also avoids the problems that occur on heavily pipelined architectures where events such as divide by zero are asynchronous.”.

C doesn’t have exceptions and handles division-by-zero with some kind of run time error (http://en.wikibooks.org/wiki/C_Programming/Error_handling). C++ probably inherited that in spirit. However, [[c++primer]] shows you can create your own divideByZero subclass of a base Exception class.

java has no “undefined behavior” and generates an exception instead.

c++low latency phone IV #MS-shanghai

I think this is all QQ… i.e theoretical knowledge. Not really harder than the mvea interview. Easier than the CVA interview.

Q: why we should not throw exception from dtor? Why do you say sometimes you can break the rule? See throwing dtor: %%justified use cases
%%A: if i have a utility routine that may throw, and I want to use it in my dtor, we should assess the danger. If we know entire process should crash under this condition, whether this dtor throws exception or swallow it, then it’s fine to use that routine without try/catch. Even without double-exception situation, the default behavior is std::terminate()

Q: divide by zero — is that an exception?
%%A: yes in java but UB in c++
AA: UB in c++

Q: is there something you can do in C but can’t in c++?
%%A: I guess you mean “C code uncompilable under c++ compiler”? I don’t think so.
A: “include <stdio>” won’t compile in g++
%%A: in pure-C environment if you receive a pointer at run time and it’s known to point to heap, you can call free() but in a C++ environment, that pointer may be created by new/array-new so free() is problematic.
A: https://isocpp.org/wiki/faq/big-picture#back-compat-with-c hints that if a C source code has no function prototype, then c++ compiler will complain.

Q: at what threshold of latency requirement would you switch from java to c++?
%%A: 100 μs. but I think nowadays java can rival c++.

Q: singleton — how would you implement
%%A: no friend class; declare but not define copier/op=. Provide a static method getInstance

Q: size of an empty c++ class’s instance
%%A: 1 byte (Correct ! https://stackoverflow.com/questions/6552319/c-sizeof-of-a-class-with-functions)

Q: diff between struct and class
Q: can you implement inheritance and polymorphism in C?
%%A: yes. Most c++ features (until exception) used to be converted to C source code.
Q: what’s complete vs partial specialization of template? (Jargon question)
Q: what happens to stack when an exception is thrown?

Q: what’s your reaction when you see “delete this” in a program?
A: After that, need to be careful not to reference anything in the class, so as to avoid dereferencing a dangling pointer.

 

IV – HFT trading operation support

Q: What’s a busted trade. How can it be busted?

Q: If I send a IOC sell of 12.05 but the best bid is $12, how could this be filled?

I believe IOC / FOK are all limit orders. Market orders are implicitly IOC — if orderbook is (almost) empty on this side, the market order is (partially) cancelled.

Q: what’s a resting order?

Q: if a sell limit order is $27 and a new buy limit order comes in at $28, what would be trade price?
A: $27. new comer is given the best price available. Note the buyer doesn’t know what the best offer would become when his order hits the book, so he uses limit order not a market order.

Q: what’s away order?

Q: whats intermarket sweep order

how many tosses to get 3 heads in a row

A common probability quiz — Keep tossing a fair coin, how many tosses on average until you see 4 heads in a row? There’s a Markov chain solution, but here’s my own solution.

I will show a proposed solution using math induction. Suppose the answer to this question is A4. We will first work out A2 and A3.

Q1: how many tosses to see a head?

You can get h, th, tth, ttth, or tttth …  Summing up 1/2 + 2/4 + 3/8 + 4/16 + 5/32 == 2.0. This is A1.

Q2: how many tosses to see 2 heads in a row?

t: Total tosses in this senario = 1+A2. Probabily = 50%
hh: 2. P = 25%
ht: 2 + A2. P = 25%

(1+A2:50%) + (2:25%) + (2+A2:25%) = A2. A2 works out to be 6.0

Q3: how many tosses to see 3 heads in a row?
After we have tossed x times and got hh, we have 2 scenarios only
…..hhh — total tosses = x + 1. Probability = 50%
…..hht: x + 1 + A3 : 50%

(x+1: 50%) + (x+1+A3 : 50%) = x+1 + 0.5x A3= A3, So A3 = 2*(1+x), where x can be 2 or 3 or 4 ….

In English, this says

“If in one trial it took 5 tosses to encounter 2 heads in row, then this trial has expected score of 2*(1+5) = 12 tosses.”
“If in one trial it took 9 tosses to encounter 2 heads in row, then this trial has expected score of 2*(1+9) = 20 tosses.”

Putting on my mathematics hat, we know x has some probability distribution with an average of 6 because A2 = 6. We can substitute 6 for x, so

A3 = 2x(1+A2) = 14.0. This is my answer to Q3.

Now we realize A2 and A1 are related the same A2 == 2x(1+A1)

I believe A4 would be 2x(1+A3)==30.0. Further arithmetic shows A[n] = 2(A[n-1]+1) = 2^n – 2

slist iteration with concurrent add()

Mithun,

Thanks for the interview question —

Q: Can you iterate over a LinkedList while another thread adds or removes elements? How might you make it thread safe?

If a long linked list is super busy with a lot of add/remove, then we may never finish iterating — we get ConcurrentModificationException every time.

I feel solution depends on length of the list, how critical is iteration vs add()/remove(), performance requirement etc.

I feel your initial idea is plausible — make a copy of the entire linked list before iterating. But when we make a copy, we need to iterate — chicken and egg problem.

How about iteration-free copying? Get a node, copy it, then read the “nextNode” pointer to reach the next node. This way I implement my own loop. Must handle concurrent modification. Not easy, but at least I don’t get exceptions. In contrast, Similar home-made solution is harder on vector/hashtable/deque as their reallocation is more complex.

I also like your 2nd idea of a custom linked list, where add() operations are customized. This is fine if iteration can finish quickly or add() is infrequent. However, If add()/remove() are time-critical, then we need other solutions.

If we need to iterate a long linked list that’s super busy, then the __design__ is probably wrong. Why not use a database?

Alternatively, split the long linked list into shorter lists. Easier to iterate.

java IV for quant #ANZ

JNI is not thread safe? Suppose we have 10 simultaneous requests from Swing GUI and 10 threads in the java server, each calling the same JNI quant function. What’s the threading scenario? Will there be 10 independent JNI calls that don’t interfere with each other? I feel in the c++ codebase the 10 threads may clash.

Q: What RMI reliability issues and solutions did you see (you said JMS is rather reliable)?
%%A: In my apps, there are few reliability issues observed — because it’s production, but the request volume is probably much lower than the JMS requests. Once a while, there are error messages in the log, possibly related to RMI calls.

Q: For swing-server communication, when would you use RMI and when JMS?
%%A: JMS for large volume, request queuing.
%%A: JMS for request forwarding.
%%A: JMS is flexible for asynchronous request/reply
%%A: JMS requires queue/topic set-up in the broker (except temporary queues) — extra admin effort. Tibrv is clean and nice. RMI is also simpler.
%%A: RMI is quick and dirty (inefficient). Requestor blocks briefly to get a quick response.
%%A: if you must get a response before proceeding, then RMI or web service is better than MOM

What’s the memory organization in an JNI process?
%%A: java+C stack + java heap + c heap + c globals

Q: Between sync and async, What’s the difference in your application code?
%%A: sender app is indifferent. Receiver uses sync receive() call or onMessage()

polymorphic equals() in java/c# ? NO

A java interviewer quizzed me why my equals() implementation insists on both objects having exactly the same runtime type, and rejecting a derived object. He questioned what if we happen to have

B var1 = new Derived(1,2); B var2 = new B(1);

and the B portion of both objects are identical?

Biggest problem with this type of polymorphic equals() — X === Y === Z doesn’t mean X === Z. Suppose the base sub-object of X/Y/Z are identical. Y is a base object. Therefore Y.equals(X) == true == Y.equals(Z), but X and Z are subtype instances, and different.

I’m a purist with high standard. If my equals() determines 2 objects to be equals, then I “swear” they are indistinguishable and interchangeable carbon copies.

It’s generally safe to be strict writing equals() — insist both objects’ actual types at runtime are identical.

c++Citi (Changi Biz Park) winows IV (method hiding

Q: Base class has non-virtual float f(int); Derived class has void f(int);
Base& ref = aDerivedInstance; ref.f(333) invokes which one?

My test proved — base and derived classes’ f() can both be invoked in the same program, but on 2 distinct variables (Base vs Derived). I believe compiler/runtime can distinguish the 2 functions. Remember this is no inheritance nor redefinition — If you only have a Derived variable, then the Base f() is hidden, regardless Base f() is virtual or not.

If Base and Derived both define methods of the same NAME f(), then Derived either overrides/redefines or hides Base f() functions. I don’t think you could get to the Base versions via a Derived object — Base version can’t coexist in Derived with Derived version. But When would method inheritance actually work? Well, only if D doesn’t contain a method of the same NAME. As soon as it does, all B methods of the same NAME are either overridden or hidden.

Incidentally, EffC++ P 114 points out that c++ compiler condones lots of ambiguities and only generates a COMPILER (not runtime) error when you write a call that’s ambiguous. Javac compiler can also complain about ambiguous method calls

Incidentally, if you call a declared but undefined func, you get a linker error, not a compiler error. EffC++ P116.

Q: Is option delta always between 0 and 1?
A: no

Q: write a c++ singleton

Q: write a reader/writer lock class given a basic mutex class. I don’t feel confident since there’s no one to help review my code.
%%A: lock release should be in dtor. Therefore, acquire should be in ctor.

c++/java MS commodity IV – 2010

Q: SQL when would u turn on dirty read?

Q: can you tell me some of the drawbacks of stored-proc?

Q: challenges in database refactor?

Q: technical/project challenges you faced in your career?

Q: how do you put across your argument in that challenging situation?

Q: when I enter some.pl in a shell, what happens?

Q: 3 threads are adding/remove/reading single elements in a hashmap. How do you optimize synchronization?
A (now i think): i guess lockfree is perhaps the best
A: concurrent hashmap with each segment holding multiple keys

Q: but if remover and reader use different keys, then why should they wait for each other?
A: they might hit the same bucket

Q: what other maps beside hashmap?
A: treemap, a red-black tree

Q: what’s a red-black tree?
A: perhaps a balanced tree

Q: how is it balanced?
A: perhaps by shifting the root?

Q: What are the list implementations?

Q: is insertion faster in linked list or array-based list

Q: is insertion faster in linked list or hashmap

Where is c++ used? quant lib for risk, PnL and trade booking. Is c++ going away? no.

2 main specializations — physical scheduling + risk including pnl attribution

1st SCB IV #comm #private inheritance

Q: you have identical rows in a table. How do you clean it up?
%A: select into a new table, then overwrite the old, but this involves a lot of disk space and IO
%A: perhaps delete where (select count(*)…) > 1 and rowid > 1 — using oracle rowid

Q: given a long string of letters A-Z, print a histogram like A:865 times, B:9932 times….

Q: copy a vector to another vector. What if the source is very large? Correct — you get reallocation, so how do you avoid that?
%A: either reserve or initialize the target vector with sufficient capacity. However, the 2nd option default-constructs a large number of payload objects!

Q: when would you use private inheritance?
A: This is rarely needed or quizzed. Not on my Tier 1/2. [[effC++]] P204 has an example of MI and also an example of private inheritance — a public inheritance of a pure interface and also a private inheritance of a concrete class. However, Google style guide strictly states that private inheritance should be replaced with composition.

IV – UDP/java

My own Q: How do you make UDP reliable?
A: sequence number + gap management + retransmission

My own Q: can q(snoop) capture UDP traffic?

Q: Why would multicast use a different address space? Theoretical question?
A: each MC address is a group…

Q: why would a server refuse connection? (Theoretical question?)
%%A: perhaps tcp queue is full, so application layer won’t see anything

——————

Q: How do you avoid full GC
Q: what’s the impact of 64 bit JVM?
Q: how many package break releases did your team have in a year?
Q: In a live production system, how do you make configuration changes with minimal impact to existing modules?

MS S’pore IV – Aug 2010

* what’s the class loaders hierarchy?
* why do we need custom class loaders?
* dynamic proxy — how do you use it?
* are java objects always always on the heap? Escape analysis

Q: besides heap and stack, what else?
A: code area; global variable area, memory mapped file, DB, shared mem and …?

FXall java IV

Q: what’s the difference between sleep() and wait()
Q: what’s an inner class?
Q: can an inner class access variables outside the inner class?
Q: what’s a deadlock? How do you avoid deadlocks?
Q: how do you tune garbage collection?
Q: what’s isolation level in database? What do you know about isolation levels?
Q: what’s concurrent hash map?
Q: hash map iterator can throw concurrent mod exception. In the same scenario, if I use CHM instead of HM, what happens?
Q: override equals() in your own String class
Q: implement a static sort() method accepting a collection of numbers. Each member must be a particular type, which is a subtype of Number.java.

Stamford equity drv IV

Q: describe some of the GC algorithms?
(Now I think) A: ref counting + root object /descent/. Ref-counting alone isn’t reliable due to islands.

Q: For a given specification, a Java version uses more memory than C and many languages. Does this impact performance, and how?
%%A: more swapping

Q: what if enough free memory?
%%A: then no performance penalty
Now i think reading/writing to 20% more memory takes more time.

Q: Is JVM a performance problem, with the new JIT compilers?

Q: Beside JVM and extra mem usage, what else would you say to criticize java performance among competing languages?

Q: How many KB for a java Object?

Q: Why do you say java userland threads are light-weight relative to kernel threads?
%%A: memory. One address space per JVM

Q: does Object.java have any field?
%%A: serial number
A: none

Q: How does the GC decide who should live in old-generation area and who in young-generation area.

Does new objects live on young or old generation area?
%%A: young, except static fields

Q: what’s hashmap load factor?

Q: What can you tune in JVM
%%A: heap size, native/green thread, young generation size

Q: We agree that threads, statics and JNI are 3 types of root objects, but I’ve not heard of JNDI as a 4th type. Are you sure?

Q: What’s so good about java’s thread feature compared to other languages?
I only know 2 comparable languages — c# and c++.
%%A(2013): memory model; concurrent collections

Q: key challenges of large java projects in your past?

%%FX IV – math

Q: Suppose you converted home currency (USD) into GBP and JPY. Avarege rate is computed as net short position in home currency (say 17,982,000 USD) divided by net long position in a foreign currency (say 10m GBP), in a single account. Same for JPY average rate, but in another account. But what if you did some cross trades between GBP and JPY? How would you compute your GBP average rate?
%%A: I would need to work out how much nett GBP was converted to JPY

IV < — Q: if a cross pair AAA/BBB can be priced using EUR, and also can be priced using USD, then what’s the algorithm to choose the correct pricing? Choose USD if it has tighter spread?
%%A: I feel it’s possible to go with EUR and publish a quote with a tri-arb loophole in it. I feel the bid quote on the cross must be the safest bid, and the offer quote on the cross must be the safest offer.

Note ECN’s don’t generate cross rates. Only market-makers (dealers) do that. Its their responsibility to check for triangular-arb loopholes.

%%FX IV

Q772: does your system give your clients margin accounts? Do you let clients margin-trade FX options, FX futures or FX spot?
Q1802: is your system Retail-facing or Institutional-facing, an interbank trading system or prop trading?

Q: since NY traders often need to hand over the book to Tokyo traders, do they have different base currencies in their respective trader accounts?
%%A: Tokyo has a caretaker/baby-sitter trader for each NY account.  He babysits for the absent parents.

Q1234: how do you detect tri-arb (Triangular Arbitrage)? Both Outgoing quotes by your traders and incoming quotes need to checked? http://forexmentalist.com/forextrading/forex-arbitrage-trading-a-short-lived-trading-strategy/ is an introduction.
Q987: what live feeds do you use?
Q2323: typical size of your trades?
Q8792: how many cross currency pairs are actively traded on your system?
Q623: Beside the crosses, how many currency pairs are actively traded on your system?  Do these pairs always have USD as one of the 2 currencies?
Q: how many users do you support? Do they use website or WPF or swing or Quartz GUI?
Q: which ECN or interbank trading venues do you connect to?

Q: what’s the typical bid/ask spread on a non-cross currency pair?
A: 2-3 pips for top tier clients, and 20-40 pips for retail. I believe the dealer makes little if any profit on these trades, but they make more from the clients on other deals such as FX fwd, options or lending.
A: 1-2 pips for EUR/USD.

Q: is your system buy-side or sell-side?
(I believe fund managers, asset management firms are all buy-side. I guess Retail online trading websites are probably buy-side too.)

Q: typically how many percent of the trades are voice trades, and how many percent electronic?
(I know many big banks still have voice trades.)

Q: how many types of FX options are processed in your system? Barrier options, Binary options.
Q: what’s the process of FX option trading?
Q: what’s no dealing desk?
Q: what’s ledger balance?
Q: for a fx option, What’s strip? Strip leg?

A1234 (from a veteran): checking the quote frequency of crosses. for example, if EURJPY update rate is the sum of EURUSD update rate and USDJPY update rate. then that could mean one is using EURUSD and USDJPY to make market on EURJPY.
A772: I have seen real examples of retail traders using 50:1 leverage in FX spot trading, but not sure about institutional clients.
A8792 (Piroz): 5. Most of the active pairs are non-crosses.
A8792 (ZJW): 20-30 is common, and often high volume
A623 (ZJW): 200 – 300 is common in a large bank
A1802 (P): Inst sell-side system with about 700 web users.
A987 (P): Bloomberg, FXAll, Reuters
A987 (QEMS): Bloomberg

%%FX IV – IT

[L]Q: describe some multithreading challenges in your system
Q: where is the performance bottleneck of your system?
A: in the space of trade matching or “order-book”, there’s a single thread responsible for EUR/USD matching. I think this is the only thread to *removes* items from the EUR/USD order book. Allowing 2 threads to remove is probably impractical. Is this a bottleneck? I don’t know.
A: for FX options, it’s risk/PnL real time update across a large number of positions.
A: tiered pricer. See blog post.
A: Message volume and throughput. For multi-threaded processes, internal state tracking and event trigger sequences would be quite tricky.

Q: what are the functional components in your trading platform? I guess trade booking, quoting, end-of-day risk batch and PnL batch, market data gateway, connectivity to interbank trading venues i.e. FIX engine to send/receive orders and quotes?

Q: which database tables have a history table behind them and why?
(I know Trade, Quote and Position need history tables so the main table can be kept small and fast.)

[L] Q: how is tick data processed?

%%FX IV – processing

[L] Q: what database tables hold unrealized PnL data? I guess it’s Position table
[L] Q: how is general ledger reports generated from unrealized PnL tables?
IV <– Q: how long is a quote (in response to a RFQ) good for?
A: For an extremely competitive quote, it could be good for 1 minute only.

Q: I know derivative instruments have expiration dates, but must all CASH trades be closed sooner or later?
(Someone said you must close all your trades by converting back to your base currency, but I was told it’s possible to keep a long position in a non-base currency forever, just like you keep an IBM position forever.)

Q: after an open trade is closed, does it disappear from the Position table?
IV <– Q: is there any real time risk data presented to traders?
Q: if a trading account start out with USD 100m, and after a month becomes USD 50m + Eur 30m, how is unrealized PnL calculated?
(I think each account must choose 1 base currency. All positions are converted to it when computing unrealized PnL)

[L] Q: how is settlement done? Using some CounterPartyAccount table?

%%FX IV – volume stats

Q6601: how many quotes do you publish each day, and how many RFQ do you get each day?
Q3244: On a typical incoming market data feed, how many message a day?

A6601 (Pz): possibly a few thousand RFQ. There are probably fewer RequestForStreams

A3244: for a large ECN, hundreds of millions of incoming quotes a day, or 30,000 messages/sec/client, probably aggregate from liquidity providers.
A3244 (Pz): given the tiered pricer, we act on not more than 10msg/sec/currencyPair. (Presumably more are received but only this many trigger outgoing messages.)
A3244 (QEMS): Peak load was 30,000 messages per minute per client, but GUI get refreshed only 4 times ps. only few clients… 6 with GUI and 1 headless client

Barc eq drv IV

Q: can you create your own thread pool?

Q: if an exception is thrown on a thread, will other threads get affected?

Q: how do you implement a scheduling engine embeddable in a client JVM. The entry point is a static/nonstatic schedule(Runnable, long initialDelay). A bonus feature — cancel a given task.

[11]Miami exchange HFT IV

First of my 10+ HFT style QQ interviews, where the QQ topics were completely alien to me.

Q: 3 no-choice scenarios to use initializer list. Efficiency is a 4th reason.
%A: const or reference field, or if one of the fields has no op=
%A: if a base class lacks a no-arg ctor
A: if a data member’s class lacks a no-arg ctor
A: what if a data member is a reference-counted class like a shared_ptr?
A: what if a data member’s class keeps track of instances constructed

Q: What exchanges send you market data feed?

Q: outline a ref-counted copy-on-write string class, showing all the function declarations
See separate post https://bintanvictor.wordpress.com/2017/05/29/ref-counted-copy-on-write-string-miax-exchange-iv/

Q: how do you see what system calls a running process is calling?
%%A: strace for linux
AA: also DTrace/truss/tusc on other unix variants.

Q: RTTI is too slow, but somehow (can’t remember the context) we need to tell what class (in a hierarchy) this object is. How do you achieve that? See clever use of enum(char?) in demanding c++ app

Can’t remember the exact question….
Q: tell me a clean solution to support cout<<myDog; where Dog/Cat/Ant/.. classes are derived from Animal
Hint from questioner — you can edit Dog class.
A: see c++friend function calling a virtual method

P156 [[essentialC++]] has some simple tip

Note overloaded operator is like a method and can be virtual — http://stackoverflow.com/questions/2969966/c-polymorphism-of-operator-overloading. However, our operator<< has to be a friend!

create unbounded queue using STL deque – threading

There’s a STL container adapter — queue over deque. (Probably no queue over vector exists since one end of the vector is hard to operate.) Like all STL containers it’s thread unsafe. But in this context, let’s ignore this adapter and try to design our own adapter.

How many condition variables? I think 1 only — isEmpty.

Q1: How many mutexes? At most 1 at each end of the queue.
%%A: probably 1. See below.

Q1b: Is it possible to allow consumers and producers to operate simultaneously?
%%A1b: I don’t think so. See below.

Q2: What if there’re 0 elements now and both threads are operating on that 1 new element?

For a queue backed by a linked list, Answer 1b might be yes — I feel the producer would button up the entire node and put its address into the head pointer in one instruction. For a deque though, the copying is not an atomic one-stepper. It’s either operator=() or copy-ctor. Before the producer finishes copying, consumer reads the half-copied node! In conclusion, I feel we need a big lock over the entire data structure.

2011 Wells Fargo IV

Q: from vi, how do you import output from a command?
A: cmd | vi –

Q: what thread libraries did you use?
%A: the java thread uses different thread libs on Win32, Linux or Solaris; boost threads; perl thread; pthreads

Q: how do you create a socket in java or c?

%A: syscall socket() returns (ptr??) to the new (yes!) socket object? Wrong!!!!! A socket file _descriptor_ is returned.

Q: const ptr vs a reference?
A: you can cast away the constness
A: dynamic_cast the reference either succeeds or triggers exception

Q: 100,000,000-row table. efficiently select first 5 rows?
%A: cursor or resultSet processing minimizes disk I/O
%A: order-by is often bad as it must read all rows from disk before finding the first 5. Exception: if you order by (“front portion” of) an
index then you only read a small part of the index tree and only the data pages holding those 5 rows
%A if table is a heap, then no order-by can help.
A: set row count

Q: how do you find out number of processors in Unix?
%A: top, psrinfo, startup messages. Perhaps uptime

Q: if an off-the-run bond is lightly traded, how do you know if you have missed some update?
%%A: I just hope the exchange has a refresh/snapshot channel to send periodic snapshots
%%A: some exchanges may have a query service to request per-symbol snapshots.
%%A: we monitor all sequence number gaps

##finance jargons asked by interviewers

what’s theta for an option?
What’s forward rate?
[Barc] Define bond duration. Are there other definitions of duration?
[MS] Bond with an option
Dv01 – how to calculate
[MS] when and why would issuer call a bond?
[MS] How to price an interest rate swap
[MS] Call-put parity
[MS] How does volatility affect a call option’s price? How about a put? Why?
[MS] How is callable bond priced. Why do investors like/dislike callable bonds?
[Citi] How are interest rate and bond price related?
[MS] risk-free bond
[Ms] Price-yield conversion – both ways
[MS] Why is the price-yield curve convex?
[CS] On-the-run treasuries vs off-the-run
[MS] Can discount factor be greater than 1.00?
How do you compute PnL in your system?
Bond pricing algorithms/models in your system
Benchmarks used in your system?
Butterfly strategy
Busted trade
Market order vs limit order

2011 Pimco c++ #+perl/SQL #done

Many obscure QQ questions. Be selective what to study!

Q: diff between malloc() and new()?
%%A: must cast void ptr; must call ctor
A(hind sight): malloc need to be told the size. In contrast. new() relies on knowledge about the class
A(hind sight): array new can be done with malloc too
A (hind sight): new can be a (static) method, therefore inherited or hidden.
A: q(new) would put some housekeeping data in a header. Therefore, you can’t use free() on a new() ..

Q: ok, so malloc doesn’t call ctor, but can I pass the malloc void ptr to a ctor for initialization?
%%A: placement new will initialize a block of memory without allocation.
%%A: it’s not common. C++ language doesn’t encourage this operation.
A(hind sight): ctor doesn’t take a ptr argument?
AA: placement-new — https://stackoverflow.com/questions/2995099/malloc-and-constructors. Also see https://bintanvictor.wordpress.com/2017/06/28/op-new-allocateconstruct/

Q3: can you call a dtor explicitly?
%%A: yes though not a good idea. See C++ FAQ
A (hindsight): placement delete

Q3b: when would yo do that?
%%A: when i want to free a large block of heap memory as early as possible, and the dtor is not scheduled to run any time soon. But without delete() the heap memory isn’t freed. Probably delete can achieve the same purpose without looking ugly.
%%A: again, not  common and not encouraged by C++ language. I don’t think i would ever do that.
A(hind sight): placement new? Correct 🙂

Q5: can a dtor throw exception as the last statement?
%%A: no cos the dtor can be invoked as part of stack unwinding

Q5b: so what?
%%A: stack is unwinding due to another exception, and now your dtor creates a 2nd exception object. System is going to lose critical information in the 2 exceptions. The original exception handling flow is halfway through.

Q: given an instance of an absolutely empty class, can I downcast a ptr (or reference) to it?
%%A: assuming this class inherits from another empty class, you can’t since there’s no vtbl.

Q4: can you specialize a template with a ptr type as the type argument?
%%A: yes vector of raw ptr is common

Q4b: what must this template do to accommodate ptr template arguments? Too advanced and not one of my Tier 1/2 specializations
%%A: handle dtor, assignment and copier.
%%A: any memory-management operations must be customized. Default behavior is not good for ptr type arguments

Q: which part of c++ is most interesting to you?
%%A: low level access. Few other languages provide the low-level access

The remaining (perl and sql) questions are less daunting.

Q: perl pass an array by reference into a sub?
%%A: you can pass the address of the array. Array can be modified in-placer by the receiving sub
A: http://www.troubleshooters.com/codecorn/littperl/perlsub.htm

Q: chomp?

Q: perl open a file in append mode?
A: use “>>”

Q: when would you create a non-clustered index?

Q: select 2nd highest salary from a salary table?

Q: if my query need all 3 columns of an index (id, name, salary), how would you order the 3 columns when creating the index? Suppose salary has the highest selectivity
%%A: salary first

slist: remove consecutive dupes #Macquarie

#include
#include
struct litem {
     char data;
     litem* next;
};
void myDelete(litem * i){
  delete i;
}
int remove_consecutive_duplicates( litem*& list ){
  vector trash; // collection of bad nodes' address, to be DELETE'd
  litem* keep = list; // this pointer only points to “good” nodes to keep
  for (litem* p = keep->next; p;  p=p->next ){
    if (keep->data != p->data) {
      keep->next=p; // often unnecessary, but very cheap
      keep=p;
      continue;
    }
    trash.push_back(p);
  }
  keep->next = 0; // keep is now the last good node. It may or may not point to null. This  is cheap insurance.
  int ret = (int) trash.size();
  std::for_each(trash.begin(), trash.end(), myDelete);
  // now, trash.size() == ret probably, as trash holds a bunch of stray pointers, but just in case size changes in the foreach loop, we save the original size in advance.
  return ret;
}
void dump( litem*& list){
    for (litem * p=list; p; p=p->next){
      cout<

data< “;
    }
    cout<<endl;
}
int main23(){
   litem* a = new litem();
   a->data='a';
   litem* a2 = new litem();
   a2->data='a';
   litem* a3 = new litem();
   a3->data='a';
  
   litem* b = new litem();
   b->data='b';
   litem* b2 = new litem();
   b2->data='b';
   litem* b3 = new litem();
   b3->data='b';
   litem* b4 = new litem();
   b4->data='b';
  
   litem* c = new litem();
   c->data='c';
   litem* c2 = new litem();
   c2->data='c';
   litem* c3 = new litem();
   c3->data='c';
  
   a->next = b;
   b->next = c;
   c->next = c2;
   c2->next = a2;
   a2->next = b2;
   b2->next = b3;
   b3->next=b4;
   b4->next=a3;
   dump(a);
   remove_consecutive_duplicates(a);
   dump(a);
   cin.get();
}

CDS trading platform – IV

%%Q: how is JNI used here?
A: some calculation module is in C; some external API is in C. It’s a pain to rewrite things into java, but “we do want to”.

%%Q3: how does a CDS dealer hedge its exposure after it does a deal with a client?
A: It creates an offsetting deal with someone else

%%Q3b: but the dealer can’t disappear from the scene, right?
A: right. Consider an original mtg lender who sells the mtg off to Fannie. The mtg is completely off its books, but every month it still collects the installment. Similarly in the CDS case, the dealer still collects the quarterly premium, with or without the hedge.

Q: personal experience with MOM tuning?
%%A: message rate; subscriber population (RV handles this gracefully); message size and depth
%%A: one slow subscriber can cause message build-up in the broker, but TTL can help.

Q: how is quick sort used on array list and linked list?
%%A: random access needed. In STL, the sort function template is usable on only selected containers, whereas linked list (and other containers) has its own sort() method because the standard sort function is inapplicable.

macquarie – remove constness from a char const*

This is a tested solution to an investment banking c++ interview question . I forgot to add the const for L/R. Note there’s no need to cast away constness. L is a reseatable ptr to a const char.

#include
bool is_palindrome(char const * str){
  if ( !strlen(str) ) return true; // 0 size string is considered palindrome
  char const* L= str; // left ptr
  char const* R= str -1 + strlen(str); // the char before the . right ptr

  for (;L<R; L++, R–){
    while( tolower(*L) <'a' || 'z'< tolower(*L)) L++; // skip all non-alphabets
    while( tolower(*R) <'a' || 'z'< tolower(*R)) R–; // skip all non-alphabets
    if (*L != *R) return false;  
  }
  return true;
}
int main(){
   bool ret = is_palindrome(“ab ca”);
   cout<<ret;
   cin.get();
}

MS IR drv IV

Q: how do you test a critical change to your market-making logic?
%A: mostly based on experience
%A: guided by user test cases + dev test cases
%A: brain storm on corner cases

Q: any size limit on TCP/UDP data transfer?
Q: UDP packet sequencing?
Q4: what do u know about class loaders?
Q4b: how can a singleton be “broken” due to class loaders, if there’s the up and down rule?
Q: How does J2EE uses class loaders?
Q: define scalability?
Q: real time features in your system?
Q: why do issuers want to recall a bond?
%A: to stop paying high coupons and exploit the current low interest rate environment

Q: why is bond recall a disaster for a holder?
%A: The high coupons were a windfall.

Q: what’s a bond with options?
Q(probably a ticket data question): NFS vs SAN

some Design questions at my interviews

Q: when you start a green field project and multi-threading is required, what are the key points you keep in mind about multi-threading.
A (my Answer): keep things really simple. For example, immutables; fewer locks; less granular locks;
A (now i think): have an idea where MT can really help and where it has marginal impact. In many cases ST works better, without any data races or locks.
A: It’s also possible to go for multi-processing instead of MT — lower risk (fewer MT hazards); simpler; better-understood
A: sometimes it makes sense to leverage on existing MT features in a database such as running code in the DB

Q: in your sybase/java/perl system at GS, what are the performance bottlenecks? How did you address them?
A: DB query.

Q (MS): how do you define scalability, in your specific contexts?
%%A: generically, it means nearly linear throughput improvement without code change.

Q: what strengths/complaints do you know about C++ STL?
A (my Answer): STL is more powerful than anything before or after it, but too complex. For example, I can have a vector of reference to pointer to pointer to int. No such thing allowed in java.
A(my Answer): STL debugging is harder
A (now i think): not much OO, but heavy usage of templates. Some people like/dislike it
A (now i think): thread-safety is minimal
A (now I think): storing pointers in containers is widely required but not well-supported without smart pointers.
A (now I think): storying interface or base types is widely required in Java but similar situation as above
A: performance very good.

Q(BGC): key differences between Perl and c++?

IV at MS-IRD (java) + Imagine c++ #done

Q: implement getAngleBetweenHourAndMinuteHands(byte hr, byte min)

Q: implement getFirstNPrimeNumbers(int howMany)

Q1: Given a list of 99 random integers (uniqueness not guaranteed), write a function to find any pair that adds up to 1000. The function takes 2 arguments – a target int and a list of int.
%%A: I blogged about this. Build a hashset to hold all 99 numbers in 1st pass. In 2nd pass, for each number, compute the “complement” and look for it in the hashset.

Q1a: what if the list is pre-sorted?
%%A: no change. Same performance. The pre-sort doesn’t help.

Q2: how do you price an IRS? I will assume an existing vanilla IRS
%%A: the existing contract’s fixed rate is compared to the prevailing swap rate. The difference represents a periodic cash flow that’s fixed! Just discount that to PV.

Q2a: what if you have a discount curve?

——c++
Q: Given a singly linked list, write a recursive function to reverse the list

Class Slist{
Node * head;
Node * tail;
}
A: https://bintanvictor.wordpress.com/2017/06/26/reverse-a-linked-list-without-recursion/

Q (ez coding question): for an array of characters (null-terminated string), write a (recursive or iterative) function to reverse print it.

Q: given a simple class with an int field, implement pre-increment operator and post-increment operator
A: I have never done this in work, but the syntax is easy to find on-line.

Credit Suisse IV questions (automated bond trading)

Q: What’s your ideal dev environment, in terms of release, code sharing, code review, test-driven,

Q: What design patterns did you use to improve flexibility and testability?

Q: what’s your testing strategy (i think this is a very important question and challenge.)

Q: what are the pros and cons of implementing Runnable vs extending Thread?
%A: Runnable is either better or as good as Thread. I don’t see any advantage in extending Thread.

Q: volatile bool vs AtomicBool?
%A: AtomicBool is bigger but offers atomic operations unavailable in volatile.

Q: choose iterative vs recursive
%A: speed very similar. Same O(n)
%A: stack over flow
%A: concurrency – parallelism

MS FX trading IV

Q: what if someone successfully de-serializes a singleton? What’s the value of that static field?

%A: null, but if it de-serializes into the same class loader as the original singleton, then the field will be non-null

%A: whether it’s null or not, this new instance is usable. Whoever has a handle on this new instance can use it without calling getInstance().

Q: How do you prevent cloning of a singleton?
A: override Object.clone() to throw.

Q: a base class has a static method and field. Can subclass declare those same members?
%A: yes hiding.

Q: what’s hiding vs overriding?
%A: compile-time vs runtime binding

Q: If I have a base type variable pointing to a derived object, and I call a static method defined in both, which version runs?
%A: base version, due to compile-time binding.

Q: what are the generations in a GC? Does an object jump from Eden to another generation or does it move incrementally?

Q: what objects are in each generation esp. the permanent generation?

Credit Suisse tick data C++IV

Mostly QQ!

Q: auto_ptr vs scoped_ptr? See my other blog post

Q: when did you use a binary functor?
%%A: provide that class to a template to concretize it

Q: if I have an empty class, what’s the size of an instance?
AA: 1 byte

Q: offsetof() macro in C?

Q: ref vs ptr

Q: when is copy ctor implicitly invoked?

Q: given a vector of ints, how do I add 1 to each of them, in-place?
%A: for_each(), transform()

Q: what are functors?

Q: did you use priority queue?

Q: in network programming, how Did you use select?
%%A: My system used it but I didn’t study it in-depth

Q1 (actually easy): Anything wrong with a vector of references
%%A: won’t compile

Q1b: can I put references into containers?
%A: never seen it.
%A: I know it’s sometimes ok to put pointers in containers, or const pointers. Since references are very much like pointers, it might be ok
A: compiler breaks saying “pointer to reference is illegal”. I think it’s because iterator of vector is a regular pointer to the underlying_type. If underlying_type is an int ref, then the iterator would be a pointer to an int ref.
A: In STL source code there’s

typedef T* pointer;

If T is int&, this code won’t compile.

Amazon cod`IV(full source) – bitwise operator

–Method 1, submitted and rejected:
/**
 * Q: program to test whether an input long integer’s binary representation is
 * the same left-to-right or right-to-left.
 */
public class ShiftTest {
    public static void main(String[] args) {
        long input = 0xff00ff0180ff00ffL;
        char[] bitsForDisplay = populateArray(input);
        dumpArray(bitsForDisplay);
        for (short start = 0, end = 63; start < end; start++, end–) {
            if (bitsForDisplay[start] != bitsForDisplay[end]) {
                System.out.format(“nBad – Mismatch between Positions #%d and #%d. (1-based)”,
                    ++start, ++end);
                return;
            }
        }
        System.out
                .println(“nGood – Binary representation reads identical from left or right”);
    }

    private static void dumpArray(char[] array64) {
        int subscript = 0;
        for (char element : array64) {
            System.out.print(element);
            if (++subscript % 8 == 0)
                System.out.print(‘ ‘);
        }
    }

    private static char[] populateArray(long input) {
        char[] bitsForDisplay = new char[64];
        for (int subscript = 0; subscript < 64; subscript++, input <<= 1) {
            // if the shifted num is negative, then leading bit must be 1
            bitsForDisplay[subscript] = ((input < 0) ? '1' : '0');
        }
        return bitsForDisplay;
    }
}

–A similar method without converting to array
Maintain two masks. with
– 1st bit set only
– last bit set only

Apply each mask on the input — the AND result should be both 0 or both positive.

Now shift both masks and repeat.