TowerResearch IV: c++,algo,past prj..

Q1: how could you use enum to improve performance
AA: use it to replace strings
AA: enum is used in template programming to move computation from run time to compile time. See Q4 and SFINAE in github.
%%A: see clever use of enum(char?) in demanding c++ app

Q1b: you put in your change but it turns out to hurt performance. Any idea?

Q4: how do you compute Fib(N) like Fib(99999) at compile time using template programming?
A: See my Fibonacci code in github

Q: how would you use multicast to send a message to a group of registered users?
%%A: less data copying than TCP

Q: In O(1) space, Given an unsorted array of natural numbers, how do you find any pair to produce a target sum? What’s your best time complexity?
%%A: if I assume the integer size is 32-bit then a fixed-size O(1) space radix structure can sort the array in O(N). See also my blog post on [[locate a pair with targetSum=55 #bbg IV #Morris]]
%%A: if integer size is unlimited, then I think the best is O(NN)

Q: How is your muni bond pricing engine designed internally?
Q2b: what kind of pricing rules live in the cache?
Q2c: how many threads to react to a given event?

Q: how did you calculate FX fwd points?
%%A: use the interest rates in the two currencies and the spot FX rate.

Q: how is your implied volatility computed from an option price?
%%A: if there’s not a closed-form formula, then Newton’s method or something similar should be able to converge quickly.

Q3 (OO design): How would you design a game software with Animal objects, various birds, and fish?
Q3a: Say the system is running fine, but now you need to add ostrich class?

Advertisements

movie query #SIG

Like the other coding question from the same employer, this is mostly about nested data structure.

Q1: Given sample csv file below (unsorted), implement an efficient query function get_movies(genre, Start_year, End_year). Optimize for query, not initial loading. Bonus points if you make it work within 75 minutes.

  #movieId,title,date,genres
  1,Toy Story 3,2010,Adventure|Animation|Children|Comedy|Fantasy
  2,Toy Story,1995,Adventure|Animation|Children|Comedy|Fantasy
  ...... 20,000 more lines

https://github.com/tiger40490/repo1/blob/py1/py/movieQuery_SIG.py is my working code, good enough for a small data set like movies, but inefficient for bigger data set like newspaper articles.

Q2: what if the number of genres could be thousands, like a tag cloud.

— Q1 analysis
My initial thought was on binary search, with O(logN) + O(E-S) which is often equivalent to O(N). Locate first and last item in a sorted list, then iterate over enclosed items.

After some brief discussion with interviewer, I came up with a design of O(1) + O(E-S), arguably the fastest, featuring a jagged 2D array:

  • Assumption: genres are shared and shows up repeatedly in the file, rather than random strings that may show up only once.
  • use contiguous array to hold all movies, array indexed by year.
  • Within each “year bucket”, store multiple movies in a contiguous array indexed by genreId.
    • Note there is no gap in the genreId /numbers/. I initially said hardcoded genre enum but interview (Ken) challenged me with frequently changing genres. I now feel we can generate genreId as we load the csv file. This is comparable to some perfect-hashing described in Q2 analysis below.
  • each “cell” holds a list-of-shared_ptr-to-movie
  • each movie object is on heap, and owned by shared_ptr’s

— Q2 analysis:
If tags are user-generated, random and uncoordinated, then the total population of tags could be unlimited. Our genreId would go up to millions. Solution? Perhaps use a separate hash table within each “year bucket” instead.

However, since the query is “by tag” it implies that the most important tags are shared.

At this point, I feel design should depend on the real, not imaginary, context, so it’s kind of pointless to speculate what type of tags we will get.

SIG phone round

See retrans questions on retrans biz logic if I were CTO

Q: how do you extract the seq number from packet header?
A: reinterpret_cast to a pktHeaderStruct ptr, then read the field

Q: is reinterpret_cast safe?

Q: why is RBTree sometimes faster than hashtable for a small collection? See unordered_map^map performance: unpredictable

Q4: speed of std::sort() on a vector vs list::sort
%%A: probably vector due to cache efficiency
A: also the stability of list::sort comes at a cost

Q4b: how are the two sorting implemented?
%%A: possibly two variations of quicksort

Q: std::sort() vs qsort() in C. Same algorithm, but which is faster?
%%A: std::sort, due to functor inlining
AA: Item 46 in [[effectiveSTL]]

Q: given a csv file with a million rows, you use c/java/c#/py to (convert to number) sum up 2nd column and produce a single output number, which language is fastest?
%%A: main cost is file read, but let’s suppose entire file is loaded into memory.
%%A: I feel python bytecode (even optimized) will be slightly slower. Java (due to JIT) and C are likely faster.
A: now in hindsight I feel integer parsing will cost much more than arithmetics.

SDI: elevator design #DeepakCM #70%

My friend Deepak gave me this Basic Requirements:

  • N-level building. You can assume 5 for now
  • Each level’s lift lobby has up/down buttons
  • Inside each lift there are N buttons for the N target floors
  • Any time, system can receive requests from any button

My basic design is not optimized for efficiency. The number of pending requests will stay below 20 since there are only that many buttons, so we iterate over all requests frequently.

My design effort is heavily focused on data structure — The more complex the requirements, the more I need to focus on clean, concise, sound data structure. They may not be necessary — a less optimal data structure can also work, but an optimal data structure helps us tremendously to cope with the complexity. I feel this problem is tractable once the data structures take shape.

Q: what if lift is in motion towards some target when a lift-lobby button is pressed and it happens to be serviceable?
A: Like the pencil solution to the space-pen challenge, my endless loop in main() may qualify as a simple solution to this daunting challenge. The system wakes up frequently to check for new requests. When sleeping, it ignores all inputs.

This simple design, if viable, avoids asynchronous or multi-threading complexities.  http://pubs.vmware.com/foundry1/pg/wwhelp/wwhimpl/common/html/wwhelp.htm?context=pg&file=Foundry_PG_concepts.3.30.html is a similar single-threaded design

For simplicity, I assume the new requests from all buttons show up in some buffer (or database), so we can poll it to see them. There’s no interrupt or callback.

Q: in your design, there are “targets” set by in-lift passengers vs. up/down requests (from lift lobbies) assigned by the system. How do you prioritize between the two types?
A: In general, targets are at higher priority than assignments, but if lift is already moving down towards Level 2 for an assigned request, it will move down all the way till that level.

I don’t want to spend too much time on any module since the correct emphasis/focus could be different in a real world design or design interview.

given int array,find triplets: x divides y;y divides z #Trex

Within a sequence of unique natural numbers, count the occurrence of so-called triplets of (x,y,z) where x divides y and y divides z.

I feel this is not a classic, but the technique is somewhat reusable.

https://github.com/tiger40490/repo1/blob/py1/py/array/tripletsInPreSorted_Trex.py is the improved solution after lots of hints.

— a useful technique:

Construct each {factor, multiple} pair and save it in a collection. Any new pair constructed will be checked against all existing pairs. The “check” is the trick. My 25 Apr 2018 commit shows the most efficient “check” I can think of.

list of pairs -> list of triplet -> list of quadruplet -> list of quintuplet … Build up pairs to construct triplets. Use triplets to construct quadruplets. Use quadruplets to construct quintuplets.

The pairs collection doesn’t need to be fully built before we start constructing triplets.

convert-to-0 using4transformations #Trex

Q: given a natural number, write a function to "transform" it to zero in a minimum number of transformations. Four transformations allowed:A) add 1
B) subtract 1
C) divide by 2 if divisible
D) divide by 4 if divisible

My own idea:

1) attempt D for as long as possible. Repeat for C. Now we have an odd number K
2) if K can be written as 4N-1, then do A, otherwise B.
3) go back to step 1

https://github.com/tiger40490/repo1/blob/py1/py/trivial/4conversion_ez.py is my code

find median@2sorted arrays #Trex untested

https://leetcode.com/problems/median-of-two-sorted-arrays/description/ is similar except X and Y can be unequal length. My solution solves the harder, generalized problem.

This “coding” question is really math problem. Once you work out the math techniques, the coding is simple.

Designate arr1 as the shorter array. compare med(arr1) vs med(arr2)

Suppose former is lower, i can discard lower half of arr1 (s items). Can i discard highest s items in arr2? I think so because upper half of arr2 cannot have that median element, so any subset of it can be discarded

repeat until arr1 is completely discarded or left to a single element .. might be the final median. Now answer is close to the med of the remaining arr2.

–For the equal-length problem, My own idea on the spot — find the median of X and median of Y. If med(X) < med(Y) then discard the lower portion of X i.e. the “XB group”, and higher portion of Y (“YA group”). Then repeat.

  • Note len(XB) == len(YA) == min(len(X), len(Y))/2 := K. So every iteration would shrink the shorter array by half (i.e. K), and shrink the longer array by K. K would drop in value in next iteration.
  • loop exit — When the shorter of the two (say it’s X) shrinks to length 1, we are lucky — find the numbers around median(Y) and adjust the answer based on X[0].

Insight — Why can’t the final “winner”be somewhere in XB group? Because XA + YA already constitute half the population, and all of them are higher.

I always like concrete examples. So Suppose there are 512 items in the lower portion “XB group”, and the higher portion “XA” has 512 items. Suppose there are 128 items each in YB and YA groups. So in this iteration, we discard YA and the lowest 128 items in XB.

Definition of lower portion —
* all lower items up to but not including med(X) If len(X) is odd
* exactly the lower half of X if len(X) is even

Trex QnA IV #std::forward,noexcept

Q: how would a tcp consumer-client know the server process died rather than a quiet server?

Q: how would a tcp producer-server know a client process died? Signal?

Q: What’s perfect forwarding?

Q: std::forward() vs std::move()? See std::move=unconditional-cast #forward=conditional-cast

Q1: in c++03, a myVectorOfTrade.push_back( Trade(22, 0.7) ) uses how many ctor/dtor invocations?
A: regular ctor, copy-ctor, dtor of the temporary

Q1b: how about c++11?
A: regular ctor, mv-ctor, dtor of temporary. See P293…. We assume there’s a Trade mv-ctor and there’a some pointer field in Trade, otherwise the mv-ctor has nothing to steal and is probably same as copy-ctor

Q1c: what about something like emplace_back(22, 0.7)
A: in-place ctor using placement-new. P294 [[eff modern c++]] perfect forwarding eliminates temporaries

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/rvrDemo.cpp is my illustration.

Q: how would “noexcept” improve runtime performance?
AA: P 91 [[effModernC++]] has a short paragraph explaining “noexcept” functions are more compiler-optimizable
AA: P 25 [[c++stdLib]] says noexcept functions don’t require stack unwinding.

Q: please implement a simple Stack class backed by a vector. Implement push(), pop(), top().

 

ref-counted copy-on-write string #MIAX exchange IV

I completely failed this 2011 IV question from MIAX options exchange:

Q: outline a ref-counted copy-on-write string class, showing all the function declarations
A: here’s my 2017 answer

struct Payload{ //this object must live outside any Str instance. If 1st Str instance in the group is destroyed, this object must live on.
	char * const arr;
	size_t const length;
	mutable size_t refCount;
	Payload(std::string const &);
};
class Str{
	Payload const * payload;
public:
	~Str(); //decrement refCount and if needed, delete the payload
	//Str(); //default ctor is useless since after construction we can't really modify this instance, due to copy-on-write
	Str(std::string const &);
	Str(Str const &); // similar to shared_ptr

	Str & Operator=(Str const & other) const; // will return a reference to a new instance constructed on heap (never on stack!). The new instance will have a ref-count initialized to 1
	Str & replace_with(std::string const &) const; //ditto

// optional utilities
	char const * c_str()    const; // <-- Hey mine is very similar to std::string
	Str(char const * arr, size_t const len); // <-- Hey mine looks similar to std::string
	friend ostream & operator<<(ostream &, Str const &);
};

minimum-cost array shrinking #Citadel

Input is a vector of positive integers. You are to shrink it progressively. Each step you remove 2 selected elements, and replace with their sum. Therefore vector size drops by 1 at each step, until there’s one element left.

At each step there’s a cost, which is defined as that sum.

Eg: {4,2,1} becomes {4,3} after you combine 2/1. The cost at this step is the sum 2+1=3.

Q1: For a given vector, find a sequence of steps with the lowest total cost. Test your code in c++
Q2: how would you optimize your code, assuming high data volume.

#include <vector>
#include <queue>
#include <algorithm>
#include <iostream>
#include <string>
#include <functional> // std::greater
using namespace std;

vector<int> vec = { 3,2,1 }; // a std::set will fail when duplicate values show up like {3,3}
priority_queue<int, vector<int>, std::greater<int> > pq(vec.begin(), vec.end());

void dumpVector(string msg) {
 cout << msg << ", size = " << vec.size() << endl;
 for (auto it = vec.begin(); it != vec.end(); ++it) cout << *it << ' ';
 cout << endl;
}

int operateVector(int sum = 0) {
 auto lowestItem = min_element(vec.begin(), vec.end());
 sum += *lowestItem;
 vec.erase(lowestItem); // now s1 is an invalidated iterator and unusable

 //remove() is bad as it removes all not one item having target value!
 //v.erase(std::remove(v.begin(), v.end(), *lowestItem), v.end()); 

 dumpVector("afer erase");
 return sum;
}

void dumpHeap(string msg) {
 auto clone = pq;
 cout << msg << ", size = " << clone.size() << endl;
 for (; !clone.empty();clone.pop()) {
 std::cout << clone.top() << ' ';
 }
 cout << endl;
}
int operateHeap(int sum = 0) {
 sum += pq.top();
 pq.pop();
 //dumpHeap("afer pop");
 return sum;
}

int f1(int sum = 0) {
 return operateHeap(sum);
}
int main87() {
 int totalCost = 0;
 for (; pq.size()>1;) {
 int sum = f1(f1()); //call f1() twice recursively.
 pq.push(sum);
 dumpHeap("afer push");
 totalCost += sum;
 }
 cout << "total cost = " << totalCost << endl;
 return 0;
}

Trex QnA IV #low-level C

Q: false sharing? cache coherence? cache line?

Q5: 3 threads in my process, and one of them calls fork(). What are the threads in the child process and which thread will be running?
Q5b: what potential problems do you see?
%%A: some resources like locks could be in a bad state

Q: how do you start a thread? %%A: pthreads_create()
Q1: how do you start a process? %%A: fork()
Q1b: what happens to the parent’s signal handlers, file handles etc

Q: how does the parent process wait for the child process to end?
AA: wait()

Q: sigkill vs sigterm?
%%A: can’t catch sigkill

See other questions discussed in unix signal #Trexquant IV

Q: why we need isnan(myFloat) rather than if (myFloat == Nan)
Q2: I have a float variable1=20180213, and I cast it to int. What’s the int value?
Q2b: two floating point numbers are both very close to 1. In fact they are the closest possible pair. What’s your estimate of their difference?
Q2c: what’s the representation of a float object?

c++DRW IV

Q: a vector needs to allocate N instances of Account but Account has no default ctor, but it just works. How does the compiler achieve it? (Actually it won’t compile if compiler can’t synthesize the no-arg ctor)
A: indeed the call to array new would prevent the vector concrete class from compiling due to SFINAE rule or something like that. (Java and c# would use type constraints instead.) The compiler must be using 2 separate lines – one to allocate raw memory, the other to initialize the object.

However, when is vector internal array allocated by array new? I discussed with Ashish but found no answer.

Q: why did you say you needed to write your own smart ptr?
A: super simple one, just to deal with some issue of the raw ptr… probably dangle pointer, by overriding the deref operator

Q: why would anyone use unique ptr when shared ptr is general purpose
A: threading…
A: some pointee objects are designed with sole-ownership

Q: is the lambda functionality doable in c++98? what is the c++98 equivalent?
A: some functor with a challenging syntax I can’t remember.

Q: OK you don’t use c++11 at work, but do you hack around at home?

Q: why is  the bid/ask spread much wider in options than the underlier?
A: must be the risk to the writer. Competition didn’t drive down the bid/ask spread like it did in FX and cash equities.
AA: delta hedge adjustment can’t be done every second. Before the next adjust, the risk to the writer (market maker and quoter) would be quite high.

c++11 QnA IV 3arrow

Q: In a move constructor, is the parameter “x” an rvalue reference? is there another rvalue reference in the call?
%%A: x is a rvr, but as a variable is an l-value expression since it is named and has a Location.

Q: What’s an rvalue reference actually, like a std::string && rvr1
A: I feel it’s similar to a regular reference variable and often treated as a pointer. Since “pointer” has multiple meanings, I would not say that. I speculate that compiler treats rvr1 as an special alias to the original object. A special name plate on the memory location. Compiler knows to treat the object-behind as suitable-for-stealing.

Q: what’s lockfree? How did you make it work in your projects?
A: see my blog about atomic{int}

Q: What part of the boost thread library did you use?

Q: for-loop in c++11?
AA: work for C-style arrays, initializer lists, and any type that has begin() and end() functions defined for it that return iterators.

Q: Why did you implement your own smart pointer and wrapper over int?
A: to avoid uninitialized variables. See post on uninitialized ..

Q: Can ctor throw exception? Why do you say it’s not best practice?
A: now I think it’s not necessarily best practice. Exception is the only way constructors can signal failure.

Q: What kind of algo is qsort? Average and worst runtime complexity?
A: average nLog(n), worst case n^2

Q: Recursive vs iterative, which is faster?
A: comparable, but space complexity lower for iterative?

Q: How did you use parallel processing in GS?
A: data parallellism, threading, and other techniques. Coarse-grained is ideal.
A: i guess pipelining parallellism is also relevant, using task queues

Q: (rarely quizzed) Translation lookaside buffer

Q: mutable keyword’s usage? How about in c++11?
AA: closure – captured variables can be modified if “mutable”.
http://stackoverflow.com/questions/105014/does-the-mutable-keyword-have-any-purpose-other-than-allowing-the-variable-to

Q(seldom quizzed): noexcept
AA: both an operator and a function specifier…

 

c++Chicago/Sing (Jump) IV Aug 2012

Q: UDP vs TCP diff?
%%A: multicast needs UDP.
%%A: UDP is faster – no connection setup/teardown no error check no ACK, no sequence number; shorter emvelope

Q: How would you add reliability to multicast?
%%A: sequence number

Q: How would you use tibco for trade messages vs pricing messages?
%%A: the trade msg must be delivered reliably to back office?
%%A: one of them is real time?

Q5: In your systems, how serious was data loss in non-CM multicast?
%%A: Usually not a big problem. During peak volatile periods, messaging rates could surge 500%. Data loss would deteriorate.

Q5b: how would you address the high data loss?
%%A: test with a target message rate. Beyond the target rate, we don’t feel confident.
A: tune the tibco reliability parameter —  http://javarevisited.blogspot.sg/2011/04/dataloss-advisory-in-tibco-rendezvous.html

Q7: how is order state managed in your OMS engine?
%%A: if an order is half-processed and pending the 3nd reply from ECN, the single thread would block.

Q7b: even if multiple orders (for the same security) are waiting in the queue?
%%A: yes. To allow multiple orders to enter the “stream” would be dangerous.

Now I think the single thread should pick up and process all new orders and keep all pending orders in cache. Any incoming exchange messages would join the same task queue (or a separate task queue) – the same single thread.

3 main infrastructure teams
* exchange connectivity – order submission
* exchange connectivity – price feed i.e. market data. I think this is incoming-only, probably higher volume. Probably similar to Zhen Hai’s role.
* risk infrastructure – no VaR mathematics.

c++ low-latency connectivity IV (nQuant) #2

This IV is heavy on low-level QQ in C/C++. Such obscure knowledge won’t help GTD and is not significant zbs. They may improve your design though.

Q3: Memory alignment – what if on the stack I declare 2 char variables? See post on memory alignment.

Q3b: what if I have 2 char fields in a struct?

Q3c: I have two 64-bit ints, one misaligned. When I use them what problems will I have?
A: not much performance penalty. See https://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/

Q1: If inside a member function I call “delete this”, what happens?
%%A: what if this “this” points to an object embedded as a field of an umbrella object? The deallocation would happen, but the destruction of the umbrella object may again deallocate it? This is confirmed in the FAQ (https://isocpp.org/wiki/faq/freestore-mgmt#delete-this)
%%A: how do we know the host obj is on heap, stack or global area i.e. data section.

Q1b: To achieve heap-only, my class has private ctors and private op= and a static factory method. Will it work?
%%A: according to moreEffC++ P146, I would say yes, with certain caveats.

Q2: What’s reinterpret_cast vs dynamic_cast vs static_cast?

Q2b: What other casts are there?

Q: Placement new – can I use the regular “delete”?
%%A: probably no. Need to call the dtor manually? See P42 moreEffC++

Q: How does tcp handshake work? (I don’t know why this nlg is even relevant)
A: http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Connection_establishment

Q: Some tcp parameter to speed it up?
A: larger TCP window size?
AA: TCP_NODELAY

Q: tcp client to specify a non-random port? See post on bind()

Q: If a c++ app runs fine in debug build (compiler optimizations removed), but crashes in release mode, what guesses/clues do you have?
%%A: conditional compilation, like in my c# project
%%A: the compiler optimization leads to unusual execution speed between 2 threads, and cooks up a rare corner case
%%A: I have seen assertions turned on in debug build (a debug STL can also be used), so we know one data file F1 is unusable, and another data file F2 is usable. In release build, someone else tries F1 and it crashes somewhere else.

See c++debug build can modify app behavior!

nQuant c++IV #1 #monitoring

Overall, I remember nQuant IV was rather low-level and heavy on optimization. Mostly QQ.

Q6: how could do you implement a userland thread?
%%A: setjmp and longjmp, as in early JVM

Q6b: how? I feel better be modest and offer my tentative guesses.

Q6c: how is setjump different from goto? Seldom asked!
AA: http://ecomputernotes.com/what-is-c/function-a-pointer/what-is-the-difference-between-goto-and-longjmp-and-setjmp and http://www.geekinterview.com/question_details/3330

Q: by default, is a linux mutex cross-process?
%%A: I guess the mutex in pthreads by default isn’t. Outside linux, the windows mutex is cross-process.
AA: Beyond the default, linux mutex can be cross-process, by using shared memory — http://stackoverflow.com/questions/9389730/is-it-possible-to-use-mutex-in-multiprocessing-case-on-linux-unix

Q: in linux, what’s the difference between a process and a thread?

Q: start 5 copies of the current process, using fork? (Can use the example in [[head first c]] )

Q: If a variable in (2-process) shared memory is marked volatile, is it a reasonable usage of volatile keyword?
AA: now i think so. Similar to a variable writable by a temperature sensor. http://embeddedgurus.com/barr-code/2012/01/combining-cs-volatile-and-const-keywords/ shows use of “const volatile” on a variable in shared memory.
AA: [[moving from c to c++]] section on Volatile (P75) seems to agree

Q: what’s a limitation of the select() system call when there are too many sockets to check, and each messages (~ 2KB) is important.
A: select has max socket count. Use epoll() — http://stackoverflow.com/questions/5357445/select-max-sockets

Q: what linux command to monitor memory usage by a given process, showing size of heap, stack, text, even broken down to shared lib vs static lib
A: cat /proc/{pid}/smaps
A: pmap -x {pid}

Q: given a one-line c program “while(1);”, launch it and /usr/bin/top would show 100% cpu usage. What does it mean?

Q: what’s inside a socket? My socket book has detailed diagrams.
%%A: http://bigblog.tanbin.com/2011/06/5-parts-in-socket-object.html

Q: can a socket be shared by 2 processes?
AA: Yes. See https://bintanvictor.wordpress.com/2017/04/29/socket-shared-between-2-processes/

———I feel most of the questions below are rarely asked.

Q: how many sockets can a single process open? Not too sure. A few hundred?
A: http://stackoverflow.com/questions/410616/increasing-the-maximum-number-of-tcp-ip-connections-in-linux

Q: what linux command to monitor network performance?
%%A: Beside netstat, I have seen tools that report error rates that indicate saturation
A: ss

 

Q: how to remove the word “option” from a resume, unless it is a sub-word? Use perl or python
%%A: Word boundary symbol?

Q: when a user land thread makes a syscall, what’s the implication?
%%A: the thread enters kernel mode?

Q: what’s offered on Layer 2? Can IP simply operate on top of physical layer without Layer 2? Too deep and seldom asked.
A: ethernet is a L2 technology.
A: Each device on a network has a hardware address or MAC address, used by the data link layer

c++IV data analyst(WorldQuant

Struct base{
char * buffer
base(){ buffer = new char[1000]; }
..
}; //how do we improve this class?

What if the new() throws exception

Q: if you make buffer a smart array then what do you do with the big3?

Q: diff between scoped_array vs shared_array?

Q: what if I have a derived class without virtual dtor?

base * a = new derived;
delete a;

Q: What if I make derived a virtual subclass and have a derivedDerived class?

c++buy-side data support IV (wq) – phone round

Q: inline function vs macro. Not on my Tier 1/2
%%A: macro is frowned upon. Inline is a hint to compiler

Q: reverse a single linked list in-place. How many temp variables needed and how?
A: 3. I wrote this, in this blog

Q: volatile keyword in c++
A: used to be hardware-related. Also used for threading? Not according to a stackoverflow post
Q: throw exception in constructors? Best practices?
A: memory cleanup needed … JVM/CLR handles it!
Q: given an unsigned integer, how do you test if it’s a power of 2.
A: x-1 would be all-1…

2013 Citadel IV – c#, C++

Q: System.Array.CopyTo() vs Clone()
%%A: Clone() is declared in a supertype IClonable, whereas CopyTo is only in Array class – confirmed
%%A: Clone() is controversial – confirmed. http://www.dotnetperls.com/clone
%%A: Clone() returns a new object whereas CopyTo requires a target array of the correct size to pre-exist – correct

AA: Array class has  a static method Clone(), but I feel the real important point is the Clone() controversy.

Q4: What does the “…” mean in C++ catch(…)?
%%A: If the troublemaker has no (not empty) exception spec, then “…” means catch-all. Any type of exception can be thrown and will be caught here
%%A: if the troublemaker has an exception spec of type B, then “…” means all subtypes of B. If another  type of exception is thrown, then unexpected() triggers, with the catch(…) ignored. Tested 🙂
A: i think my answer was correct.

Q4b: What’s the c# counter part?
%%A: catch(Exception) or an empty catch{/**/} — confirmed

Q3: delegate — what is it and what’s the usage?

Q3b: what’s the c++ equivalent (or closest)
Q: What design patterns are behind the c# delegate?
A: for multicast …. observer
A: for unicast … command? The delegate instance usually has a host object
Q: how do you manage memory leaks?

simplified order book design doc: jump

It’s tempting to use virtual function processMessage() to process various order types (A, M, X …) and trade types, but virtual functions add runtime overhead. Template specialization is a more efficient design, but due to the limited timeframe I implemented an efficient and simple alternative.

Assumption: M messages can only change quantity. Price change not allowed — Sender would need to cancel and submit new order. The B/S and price fields of the order should not change, but validation is omitted in this version.

Assumption: T messages and the corresponding M and X messages (also the initiator A message) are assumed consistent and complete. Validation is technically possible but omitted. Validation failure indicates lost messages.

The cornerstone of the design is the data structure of the order book — a RB-tree of linked lists. Add is O(logN) due to the tree-insert. Modify is O(1) thanks to the lookup array. Remove is O(1) — eliminating tree search. This is achieved with the lookup array, and by saving iterator into the order object.

There are 2 containers of pointers — the map of lists and the lookup-array. It would be better to use container of smart pointers to ease memory management, but STL doesn’t provide any smart pointer.

All equality test on doubles are done using “==”. Should use some kind of tolerance if time permits.

Here’s the documentation in the lookup array class

/*This class encapsulates an array of pointers.
Assumption 1 — Exchanges is likely to generate auto-incrementing orderID’s. Here’s my reasoning. OrderID’s are unique, as stated in the question. If orderID generation isn’t continuous, then the generator has 2 choices about the inevitable gap between 2 adjacent ID numbers. It can keep the gap forever wasted, or somehow “go back” into a gap and pick a number therein as a new orderID. To do the latter it must keep track of what numbers are already assigned — rather inefficient. There are proven in-memory algorithms to generate auto-increment identity numbers. I assume an exchange would use them. Auto-increment numbers make a good candidate as array index, but what about the total number range?

Assumption 2 — each day the number range has an upper limit. Exchange must agree with exchange members on the format of the orderID. It’s likely to be 32 bits, 64 bits etc and won’t be a million bits.

Question 1: Can the exchange use OrderID 98761234 on both IBM and MSFT during a trading day? I don’t know and i feel it doesn’t matter. Here’s the reason.

Case 1: suppose exchange uses an *independent* auto-increment generator for each stock. So IBM and MSFT generators can both generate 98761234. My design would use one array for IBM and one array for MSFT. For basket orders, additional generator instances might be needed.

Case 2: suppose exchange uses an independent auto-increment generator for each stock, but each stock uses a non-overlap number range. 98761234 will fall into IBM number range. My design would need to know the number range so as to convert orderID to array index and conserve memory.

Case 3: suppose exchange uses a singleton auto-increment generator across all stocks (bottleneck inside the exchange). My design would use one gigantic array. Given Assumption 1, the numbers would be quasi-continuous rather than sparse — below 50% of the range is assigned. Suppose the range is S+1, S+2 … S+N, then my array would be allocated N elements (where S is orderIDBase). There’s a limit on N in reality. Every system is designed for a finite N — no system can handle 10^9999 (that’s one followed by ten thousand zeros) orders in a day. Each array element is a pointer. For a 64-bit machine, N elements take 64N bits or 8N bytes. If I have 640GB memory, N can be 80 billion but not higher. To scale out horizontally, we would hope Case 1 or 2 is the case.

Therefore the answer to Question 1 shows array of pointer is feasible for the purpose of lookup by orderID. In a real system hash table is likely to be time/space efficient. In this exercise, only STL is available, which provides no hash table. Tree based map has logN time complexity — too slow. My choice is between a built-in array vs a non-expanding vector. I chose array for simplicity.
*/