c++debug^release build can modify app behavior #IV

This was actually asked in an interview, but it’s also good GTD knowledge.

https://stackoverflow.com/questions/4012498/what-to-do-if-debug-runs-fine-but-release-crashes points out —

  • fewer uninitialized variables — a debug build is more forgiving because it is often configured to zero-initialize variables that have not been explicitly initialized.
    • For example, perhaps you’re deleting an uninitialized pointer. In debug mode it works because the pointer was nulled, and delete ptr is a no-op on NULL. In release it holds some rubbish address, so delete ptr will actually cause a problem.
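A minimal sketch of that scenario (Order is a hypothetical name; the commented-out delete is the debug-vs-release landmine):

```cpp
struct Order { int qty = 0; };

void buggy() {
    Order* ptr;             // uninitialized: a debug build may zero-fill it; release leaves garbage
    // delete ptr;          // UB -- "works" in debug if ptr happens to be null, crashes in release
    (void)ptr;
}

void safer() {
    Order* ptr = nullptr;   // explicit initialization removes the debug/release difference
    delete ptr;             // well-defined: delete on a null pointer is a no-op
}
```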

https://stackoverflow.com/questions/186237/program-only-crashes-as-release-build-how-to-debug points out —

  • guard bytes on the stack frame — the debug build puts more padding on the stack, so you’re less likely to overwrite something important.

I have frequently run into reading/writing beyond an array’s limit.

https://stackoverflow.com/questions/312312/what-are-some-reasons-a-release-build-would-run-differently-than-a-debug-build?rq=1 points out —

  • relative timing between operations is changed by the debug build, hiding or exposing race conditions

Echoed on P260 of [[art of concurrency]], which says it’s (in theory) possible to hit a threading error with optimization and see no such error without optimization — which would represent a bug in the compiler.

P75 of [[moving from c to c++]] hints that compiler optimization may lead to “critical bugs”, but I don’t think so.

  • poor use of assert can have side effects in a debug build. A release build turns off all assertions (assertion-failure messages are unwelcome in production), so those side effects disappear.
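A sketch of the assert pitfall, with hypothetical names — the side effect silently disappears when the release build defines NDEBUG:

```cpp
#include <cassert>

int pop_count = 0;
int pop() { return ++pop_count; }   // has a side effect

void process() {
    assert(pop() > 0);  // debug build: pop() runs. Under -DNDEBUG the entire
                        // expression is compiled out, so pop_count stays 0.
}
```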

asymmetry lower_bound^upper_bound #IFF lookup miss

For a “perfect” hit, both set::lower_bound() and std::lower_bound() return a value equivalent to the target, whereas upper_bound() returns something strictly higher than the target.

To achieve symmetry, we need to decrement (if legal) the iterator returned from upper_bound.
———-
If no perfect hit, then lower_bound() and upper_bound() both give the next higher node, i.e. where you would insert the target value.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

vector<int> v{1,3,5};
int main(){
  vector<int>::iterator it;
  it = lower_bound(v.begin(), v.end(), 2); cout<<*it<<endl; // 3 — the next higher node
  it = upper_bound(v.begin(), v.end(), 2); cout<<*it<<endl; // 3 — same node, since 2 is a lookup miss
}

multiple hits: lower_bound() gives the Earliest

Looking for lower_bound (2) in {0,1,2,2,3,4}, you get the earliest perfect hit among many, i.e. the left-most “2”.

No such complexity in upper_bound since upper_bound never returns the perfect hit.

No such complexity in set::lower_bound() since a set holds no duplicates.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

int main(){
  vector<int> s{0,1,2,2,3,4};
  vector<int>::iterator it = lower_bound(s.begin(), s.end(), 2); // left-most “2”
  cout<<"to my left: "<<*(it-1)<<endl;           // 1
  cout<<"to my right: "<<*(it+1)<<endl;          // 2 — the duplicate
  cout<<"to my right's right: "<<*(it+2)<<endl;  // 3
}

non-local static class-instance: pitfalls

Google style guide and this MSDN article both warn against non-local static objects with a ctor/dtor.

  • (MSDN) construction order is tricky, and not thread-safe
  • dtor order is tricky. Some code might access an object after destruction 😦
  • (MSDN) regular access is also thread-unsafe, unless immutable, for any static object.
  • I feel any static object including static fields and local statics can increase the risk of memory leak since they are destructed very very late. What if they hold a growing container?

I feel stateless global objects are safe, but perhaps they don’t need to exist.
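A sketch of the construction-order issue and the common construct-on-first-use workaround (all names hypothetical):

```cpp
#include <string>

struct Logger { std::string prefix = "[log] "; };

// Non-local static: its ctor runs some time before main(). If another
// static in a different .cpp file used logger_ during its own construction,
// the result would depend on link order -- undefined across translation units.
Logger logger_;

// Workaround -- construct-on-first-use (Meyers-style local static):
Logger& safeLogger() {
    static Logger l;   // constructed on first call, thread-safe since C++11
    return l;
}
```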

mgr position stress: project delay #cf FTE/contractor

A contractor is the most care-free. Even as an employee, the pressure to deliver is lower than on the mgr.

As a junior VP (perhaps a system owner) you could still stay behind a shield (defend yourself) — “I did my best given the limitations and constraints”. However, as mgr, you are expected to own the task and solve those problems at a higher level of effectiveness, including negotiations with other departments.

“Results or reasons?” … is the manager’s performance review.

Recall Yang, Stirt-risk …

  • —- past barometer due to project delivery pressure —-
  • GS – 10/10,  “if i quit GS I may have to quit this country; I must Not quit”
  • Stirt – 8
  • Mac – 7
  • OC – 5, largely due to fear of bonus stigma
  • 95G, Barc – 3, due to mgr pressurizing
  • Citi – 2

+! trySomethingNew] sg, what could I have(possibly)got

See also past vindicative specializations

  • I would still do my MSFM
  • I would still fail to get into algo trading or quant dev — too few jobs and extremely high bar
  • I would likely fail to get into leadership roles. I was considered for leadership roles at 1 to 3 companies

However,

  • I could possibly have focused on a specialization such as risk system + some analytics
  • would probably have joined citi, barc, baml, UBS, SC or .. in sg
  • probably java or swing or connectivity
  • would Not have achieved the c#/py/c++ ZBS growth
  • would Not have the skills to get this ICE raw mkt data job or the other c++ job offers.
  • no guarantee to become a manager or app owner. There could be many old timers in the team.
  • possibly less stress and pain. Lower chance of performance stress (#1 biggest stressor), because my GTD/KPI would be higher due to my java/SQL zbs.

lower_bound() may return end() #gotcha

If your target value is too high and nothing qualifies, all 6 functions return the right end of the range, i.e. end(). Looking at the (key value in) that return value —

  • This end-of-range node is a dummy. Never read its key value.
  • After lower_bound or upper_bound, always validate the returned iterator against end() before reading it

I spent hours puzzled by the wrong data returned from lower_bound()->first. Specifically, if the map/set is keyed by integer, end()->first can look like a normal int value, so the failure (returning map.end()) goes unnoticed!

Consistent across 6 functions:

  • std::lower_bound
  • std::upper_bound
  • set/map methods

What if the target value is too low? Easier — upper_bound() returns the left boundary iterator, and lower_bound() returns the same iterator! See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/curveInterpolation_CVA.cpp
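The validation habit can be sketched like this (lookup is a hypothetical helper):

```cpp
#include <map>
#include <string>

// look up the first key >= target, validating before reading the key
std::string lookup(const std::map<int,int>& m, int target) {
    auto it = m.lower_bound(target);
    if (it == m.end()) return "miss";   // never read end()->first -- it's a dummy node
    return std::to_string(it->first);
}
```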


ETF share creation #over-demand context

http://www.etf.com/etf-education-center/7540-what-is-the-etf-creationredemption-mechanism.html is detailed.

Imagine a Vanguard ETF tracking the DJ index has NAV = $99,000 per share but is trading at $101,000 — overpriced. So the AP will jump in for arbitrage — by Buying the underlying stocks and Selling a single ETF unit. Here’s how the AP does it.

  1. AP Buys the underlying DJ constituent stocks at the exact composition, for $99,000
  2. AP exchanges those for one unit of ETF from Vanguard.
    1. No one is buying the ETF in this step, contrary to the intuition.
    2. So now a brand new unit of this ETF is created and is owned by the AP
  3. AP SELLs this ETF unit on the open market for $101,000 putting downward pressure on the price.

Q: So is the hot money used to create the new ETF shares?
A: No. The hot money becomes profit to the earlier ETF investors. Neither the ETF provider nor the AP receives the hot money.

## vi (+less) cheatsheet

https://github.com/tiger40490/repo1/blob/bash/bash/vimrc has some tricks including how to make vim remember last edit location.

  • ~~~~ command mode #roughly ranked
  • [2/3] :↑ (i.e. up-arrow) — cycle through previous :commands
  • [3] dt — “dta” delete until the next “a”
  • [2] 6x — delete 6 chars
  • [2] 9s — wipe out 9 characters (including current) and enter insert-mode. Better than R when you know how many chars (9) to change
    • to delete 5 characters … there is NO simpler keystroke sequence
  • R — Overwrite each character one by one until end of line. Useful if the replacement content is similar to original?
  • Ctrl-R to re-do
  • cw — wipe out from cursor to end of word and puts you into insert mode
    • c2w or 2cw
  • :se list (or nolist) to reveal invisible chars
  • C — wipe out from cursor to END of line and puts you into insert-mode
  • capital O — open new line above cursor
  • A — to append at END of current line
  • from inside q(LESS), type a single “v” to launch vi

–paging commands in vi and less

  • jump to end of file: capital G == in both vi and LESS
  • jump to head of file: 1G == in both vi and LESS
  • page dn: Ctrl-f == in both; LESS also uses space
  • page up: Ctrl-b == in both; LESS also uses b

— q[less] searching feature

  • after you have searched for “needle1”, how do you expand on the pattern? You can hit 2 keys
    • [2] /↑ (i.e. <upArrow>) to load “needle1”. Now you can edit it or add an alternative like
    • [2+] /↑ (i.e. <upArrow>) |needle2|needle3

[3/4] means vi receives 3 keystrokes; we hit 4 keys including shift or ctrl …

vi on multiple files


–“split” solution by Deepak M

vi file1 # load 1st file

  • :sp file2 # to show 2nd file upstairs
  • :vsp file3 # to show 2nd file side by side
  • You end up with  — file2 and file3 side by side upstairs, and file1 downstairs!
  • [2/3] ctrl-ww # To move cursor to the “next” file, until it cycles back

–the q( :e ) solution

vi file1 # load 1st file

  • :e file2 # to put 2nd file on foreground
  • [1/3] ctrl-^ — to switch to “the other file”
  • This solution is non-ideal for moving data between files, since you must save active file before switching and you can’t see both files

–editing 3 or more files

  1. vi file1 file2 file3
  2. q(:n) to switch to next, q(:N) for previous…
  3. q(:args) shows all files
  • –Suppose now you are at file2.
  • q(:e file4) works. q(ctrl-^) will toggle between file2 and file4
  • However, q(:n :N  :args) only work on the original list, not the new files from q(:e)

q(:n :N ^) always shows the current filename in status bar:)

how many rolls to see all 6 values

Q: A fair die has 6 colors. What’s the expected number of rolls to see all 6 colors?

This is a probability (not IT) interview question my friend Shanyou received.

My analysis:

Suppose it takes some expected number of rolls (say 3.1357913) to get 2 distinct colors. How many additional rolls does it take to get the next distinct color? This is equivalent to

“How many coin tosses to get a head, given Pr(head)=4/6 (i.e. another distinct value)?” — a Geometric distribution. Why 4/6? Because out of six colors, the four “new” colors are considered successes.

Once we solve this problem then it’s easy to solve “how many additional rolls to get the next distinct value” until we get all 6 values.

https://math.stackexchange.com/questions/28905/expected-time-to-roll-all-1-through-6-on-a-die is an accepted solution.
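Following this analysis, each stage is geometric with success probability remaining/6, so its expected length is 6/remaining, and the answer is the sum 6/6 + 6/5 + 6/4 + 6/3 + 6/2 + 6/1 = 14.7. A quick sketch:

```cpp
// expected number of rolls to see all `sides` distinct values
double expectedRolls(int sides) {
    double e = 0;
    for (int remaining = sides; remaining >= 1; --remaining)
        e += static_cast<double>(sides) / remaining;  // geometric stage: mean = 1/p
    return e;   // for sides=6: 1 + 1.2 + 1.5 + 2 + 3 + 6 = 14.7
}
```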

##9 specific Short-term goals I look fwd 2keep motivated

This question is relevant to “meaningful endeavors”, “next direction” and “sustained focus”.

Q: Over the last 10Y, what I looked forward to :

  • before GS — it’s all about earning capacity.
  • GS– promotion and salary increment, but I soon realized limitations in me, in the environment etc
  • contracting phase — in-demand, muscle building; try something new; billing rate
  • sg 3 jobs –See in SG: realistic next motivation for working hard
  • after re-entry to U.S. — IV batting average, as gauge of my market value

Q: what positive feedback can I look forward to, to keep me motivated?

  1. success with tricky coding questions from real interviews (perhaps from friends)
  2. more time for myself (but not in bad mood) — blogging, reading, exercise, sight-seeing.
  3. more time to reunite with family and grandparents. Remember the [[about time]] movie theme?
  4. more income to provide for kids, grandparents and my dear wife
  5. more savings — to achieve more investment success
  6. more savings — buy a home nearer to office to cut commute
  7. more IV success, perhaps in quant or HFT domains?
  8. growing IV capabilities towards better jobs
  9. positive feedback from mgr like Anand and Ravi K.
    • promotion?
  10. build zbs in c++/py — unrelated to IV, but gives me the much-needed respect, cool confidence, freedom from stress …?
  11. weight and fitness improvement
  12. more insights to publish on my blog, a sign of my accumulation

[17]orgro^unconnecteDiversify: tech xx ROTI

Update — Is the xx fortified with job IV success? Yes to some extent.

Background – my learning capacity is NOT unlimited. In terms of QQ and ZZ (see post on tough topics with low leverage), many technical subjects require substantial amount of /laser energy/, not a few weeks of cram — remember FIX, tibrv and focus+engagement2dive into a tech topic#Ashish. With limited resources, we have to economize and plan long term with vision, instead of shooting in all directions.

Actually, at the time, c#+java was a common combination, and FIX, tibrv … were all considered orgro to some extent.

Example – my time spent on XAML now looks not organic growth, so the effort is likely wasted. So is Swing…

Similarly, I always keep a distance from the new web stuff — spring, javascript, mobile apps, cloud, big data …

However, on the other extreme, staying in my familiar zone of java/SQL/perl/Linux is not strategic. I feel stagnant and left behind by those who branch out (see https://bintanvictor.wordpress.com/2017/02/22/skill-deependiversifystack-up/). More seriously, I feel my GTD capabilities are possibly reducing as I age, so I feel a need to find new “cheese station”.

My Initial learning curves were steeper and exciting — cpp, c#, SQL.

Since 2008, this has felt like a fundamental balancing act in my career.

Unlike most of my peers, I enjoy (rather than hate) learning new things. My learning capacity is 7/10 or 8/10 but I don’t enjoy staying in one area too long.

How about data science? I feel it’s kind of organic based on my pricing knowledge and math training. Also it could become a research/teaching career.

I have a habit of “touch and go”. Perhaps more appropriately, “touch, deep dive and go”. I deep dived on 10 to 20 topics and decided to move on (ranked by significance):

  • sockets
  • linux kernel
  • classic algorithms for IV #2D/recur
  • py/perl
  • bond math, forex
  • black Scholes and option dnlg
  • pthreads
  • VisualStudio
  • FIX
  • c#, WCF
  • Excel, VBA
  • xaml
  • swing
  • in-mem DB #gemfire
  • ION
  • functional programming
  • java threading and java core language
  • SQL joins and tuning, stored proc

Following such a habit I could spread myself too thin.

mgr position limitation: never made mgr, but changing job/country is easier

I chose to remain a programmer. One benefit is that it’s relatively easy to go back to work in Singapore, and just as easy to return to the U.S. a few years later.

Managers have no such flexibility. They can’t change jobs too frequently, because the resume would suffer, and manager openings are far fewer in number. Many types of manager positions exist only in China and have no equivalent in another country — for example, the state-owned enterprise where 鲁诺 works, or the China-subsidiary president role a classmate holds at a foreign company.

Companies hire managers very cautiously, whereas hiring programmers is relatively quick and simple. This works in my favor when job hunting.

SDI: order-resend timer #CSY

Requirement: each time we send an order (with a unique orderID number), we wait about 5 seconds. If no Ack is received on this id, we resend it using the same id. Please design a data structure and algo to achieve it.

I believe we must keep data structure size under control, so when there are too many pending orders then very old pending orders would be dropped according to a reasonable policy.

A reasonable assumption — For simplicity, we resend any order only once and drop the order. If needed, we could send the same or a modified order but under a new orderID.

For now, I will relax the timing precision so that a little longer than 5 seconds is tolerable in practice. I would hope it takes sub-millis to iterate through any data structure under size control.

Note TCP has an extremely robust, efficient and well-thought-out design for a similar challenge, tested millions of times every second throughout the world. However, I will not reference it. Below is ..

—- my homemade design —-

System is driven by 4 types of events — timer, ack, new-order, resend. The first 3 are asynchronous, primary events, whereas the resend is a secondary event after a primary event. To minimize data races, I will use a single thread, so all event handlers must be brief.

Ring-buffer is the most popular underlying data structure for this type of system. I will implement a linked queue where each queue node is allocated from a ring buffer, and returned to buffer after delete/erase. Note a contiguous array will NOT accommodate mid-stream deletion.

  • Hashmap holds {orderId -> address of link node}
  • Each link node has {integer orderId; expiry time, pointer to next node; other trade details}.
  • We enqueue only at the tail, but we could erase either from head (dequeue) or the middle (ack received)
  • If we take a snapshot at any time, all link nodes are always ordered by expiry.
  • Only one timer is needed. It is either empty or has a single expiry time.

Event-handler algorithms:

  • –after sending a new order,
  • iterate from the head of the queue. If any node has an expiry time already passed, then resend it and dequeue it. Once we see a node that’s not expired yet, iteration ends.
  • enqueue the new id. If there’s no capacity, then simply remove the oldest node i.e. head of queue.
  • –After a resend,
  • Always erase (the node for) the resent id, usually mid-stream. This is where linked lists beat arrays.
  • If this resend is due to a timer event, then we need to set the timer to the expiry time of the queue head.
  • (No data structure scan since this is a secondary event.)
  • –After a timer event,
  • iterate from the head of the queue. If any node has an expiry time already passed, then resend it and dequeue it. Once we see a node that’s not expired yet, iteration ends.
  • set the timer to the expiry time of the current queue head.
  • –After an ack is received,
  • get the id in the ack message
  • use it to look up in hashmap to get the order object.
  • erase the node from linked queue
  • iterate from the head of the queue. If any node has an expiry time already passed, then resend it and dequeue it. Once we see a node that’s not expired yet, iteration ends.
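The handlers above can be condensed into a sketch. I use std::list as a stand-in for the ring-buffer-backed linked queue, and integer “ticks” as a stand-in for the real timer; all names (Order, Resender …) are hypothetical:

```cpp
#include <list>
#include <unordered_map>
#include <iterator>
#include <cstddef>

struct Order { int id; long expiry; };   // expiry in ticks

class Resender {
    std::list<Order> queue;                                    // always sorted by expiry
    std::unordered_map<int, std::list<Order>::iterator> byId;  // id -> queue node
    std::size_t capacity;
public:
    int resent = 0;                                            // resend count, for illustration
    explicit Resender(std::size_t cap) : capacity(cap) {}

    void onTimer(long now) {            // resend-once-and-drop every expired head
        while (!queue.empty() && queue.front().expiry <= now) {
            ++resent;                   // "resend" here, then drop per the one-resend policy
            byId.erase(queue.front().id);
            queue.pop_front();
        }
    }
    void onNewOrder(int id, long now) {
        onTimer(now);                   // flush anything already expired
        if (queue.size() >= capacity) { // full: drop the oldest pending order
            byId.erase(queue.front().id);
            queue.pop_front();
        }
        queue.push_back({id, now + 5}); // 5-tick timeout; tail keeps expiry order
        byId[id] = std::prev(queue.end());
    }
    void onAck(int id, long now) {
        auto it = byId.find(id);
        if (it != byId.end()) {         // mid-stream erase: linked lists beat arrays here
            queue.erase(it->second);
            byId.erase(it);
        }
        onTimer(now);
    }
    std::size_t pending() const { return queue.size(); }
};
```

A real system would set a single OS timer to the head node’s expiry instead of polling ticks, as described above.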

std::dynamic_pointer_cast

Returns a copy (of the given shared_ptr) belonging to the same “club”, i.e. sharing the same control block.

The target type must be related to the original (typically a subtype) — equivalent to a dynamic_cast() on the raw ptr.

http://www.cplusplus.com/reference/memory/dynamic_pointer_cast/ example doesn’t have virtual function, so dynamic_cast() isn’t needed !

https://github.com/tiger40490/repo1/blob/cpp1/cpp/template/shPtrDownCast.cpp is my own experiment.
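A minimal sketch (Base/Derived are hypothetical names; the virtual dtor makes the classes polymorphic, so the dynamic cast is meaningful):

```cpp
#include <memory>

struct Base { virtual ~Base() = default; };   // polymorphic
struct Derived : Base { int extra = 42; };

long demo() {
    std::shared_ptr<Base> b = std::make_shared<Derived>();
    std::shared_ptr<Derived> d = std::dynamic_pointer_cast<Derived>(b);
    // d joined the same "club": one control block, two members
    return d ? b.use_count() : -1;   // 2 -- the cast succeeded and shares ownership
}
```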

personal learn`]difficult workplace: %%tips #XR

I understand your boss could be pushing hard and you have colleagues around who may notice what you do… not much wiggle room for self-study. A few suggestions:

  • try to put some technical ideas into the code. I did manage to put in some threading, some anonymous inner classes, some memcpy(), some local byte array buffer into my project, so I get to practice using them. (I understand the constraints of a tight time line…)
  • it takes a few minutes to write down new findings discovered at work. I put them in my blog. I also try to spend a little more time later on to research on the same topic, until I feel confident enough to talk about it in an interview.
  • I try to identify some colleagues who are willing to discuss technical issues in my project. I try to discuss only when boss is not paying attention. Boss is likely to feel we are taking too much time on some unimportant issue.

If the learning topic is not related to work, then I feel it’s similar to checking personal investment account at work. (In the ICE office now, some employees get a cubicle with 4 walls so they get more freedom than me.) Do your colleagues check their investment accounts at lunch time? I believe they always get a bit of personal time. In GS, on average very roughly 1 out of 9 working hours is spent on personal matters, and the other companies have higher than that. We all need personal time to call insurance, immigration, repair, … The managers might need even more personal time. I would guess at least 60 minutes a day is yours. Question is how to avoid drawing attention. I don’t care that much about drawing attention, so I often print technical articles to read, or read on-line, or blog on-line.

It’s more discreet to write email to record your technical learning. I often send those emails to my blog (email-to-publish) or to my personal email address.


Personal time (be it 60 minutes or 3 hours at some banks) is never enough. We just have to try harder to squeeze a bit more out of the 9 hours. If you are serious about learning in your personal time, then I see two much bigger obstacles
1) family responsibility and distractions
2) insufficient motivation and persistent effort (三天打鱼两天晒网)

In my Singapore years (4.5 years), I felt overwhelmed not by work but family duties, so my weekends/evenings were almost never put to good use for personal learning. I can’t just blame my family members though. I do get quiet time 10.30 pm to 12.30 and many hours on weekends. Somehow, I didn’t put in persistent effort so I didn’t experience significant growth in my technical capabilities.

A very capable colleague was able to do his math research at home and make progress. His wife is probably a full time home maker and takes care of their 3 kids. He is wealthy so he may have a maid and a separate study room at home. However, I feel a more important factor is his persistent effort and focus. A rolling stone gathers no moss. By the way, this same guy runs about 5 miles at least 4 times a week. Determined and efficient. Good time management and correct priorities.

If (a big IF) we are sufficiently motivated, we will find time or make time, either the 60 minutes at work, or on trains, or at home, or in Starbucks. In reality, very few individuals have that level of motivation, so I believe some external factors can help, such as (my favorite) —

* jot down some idea in a draft email and do a bit of research whenever I get time to build on the idea, until it’s complete and fairly substantial. The idea could be something I overheard, or an idea I’m truly interested in. The learning is mostly in the research but also in the subsequent reviews. If I don’t review the email, I will forget most of it. When I do review it, I not only refresh my memory, but I often discover connections with other things I studied, or find new ideas to learn — 温故而知新 (reviewing the old, one learns the new). Learning is associative, like growing a spider web.

c#/c++/quant – accumulated focus

Update — such a discussion is a bit academic. I don’t always have a choice to focus on one area. I can’t afford to focus too much. Many domains are very niche and there are very few jobs.

If you choose the specialist route instead of the manager route, then you may find that many of the successful role models need focus and accumulation. An individual’s laser energy is a scarce resource. Most people can’t focus on multiple things, but look at Hu Kun!

eg: I think many but not all the traders I know focus for a few years on an asset class to develop insight, knowledge, … Some do switch to other asset classes though.
eg: I feel Sun L got to focus on trading strategies….
eg: my dad

All the examples I can think of fall into a few professions – medical, scientific, research, academic, quant, trading, risk management, technology.

By contrast, in the “non-specialist” domains focus and accumulation may not be important. Many role models in the non-specialist domains do not need focus. Because focus+accumulation requires discipline, most people would not accumulate. “Rolling stone gathers no moss” is not a problem in the non-specialist domains.

I have chosen the specialist route, but it takes discipline, energy, foresight … to achieve the focus. I’m not a natural. That’s why I chose to take on full time “engagements” in c#, c++ and UChicago program. Without these, I would probably self-teach these same subjects on the side line while holding a full time java job, and juggling the balls of parenting, exercise, family outings, property investment, retirement planning, home maintenance….[1] It would be tough to sustain the focus. I would end up with some half-baked understanding. I might lose it due to lack of use.

In my later career, I might choose a research/teaching domain. I think I’m reasonably good at accumulation.

–See also
[1]  home maintenance will take up a lot more time in the US context. See Also
https://1330152open.wordpress.com/2015/08/22/stickyspare-time-allocation-history/ — spare time allocation
https://1330152open.wordpress.com/2016/04/15/set-measurable-target-with-definite-time-frame-or-waste-your-spare-time/
https://1330152open.wordpress.com/2016/04/26/spare-time-usage-luke-su-open/

mgr position risk: forced out

An engineer can be forced out, too, due to performance or attitude, but a mgr can be forced out through no fault of her own — a change of upper management.

The “like” factor is more important in a manager than an engineer. In a sense, a mgr keeps her place by pleasing her superior, in addition to doing her job (of getting things done.)

Therefore, a mgr position can feel more /precarious/ than an engineer position.

mgr position risk: targeted hatred

“Hatred” is a stronger word than “dislike”. Hatred demands action.

Hatred can emerge among subordinate employees, superiors, downstream teams, or lateral colleagues.

If an employee feels unfairly treated, usually she puts up with it or quits, but a fair percentage (30%?) could decide to take action. I once reached out to HR at OC. Some lodge an official complaint.

Even if the employee doesn’t take action, the intense dislike is bound to spread and become infectious.

How easy is it to neutralize or contain hatred? Get real.

How easy is it to remain fair to every employee? Get real.

mgr position stress: inferiority,rivalry

At the senior mgr level, your position in the hierarchy is highly visible to everyone and also in your own mind. The higher, the more visible. You are more likely to feel inferior (or superior) to other people across the industry. In contrast, regular employees and contractors are not in a position to feel that much inferiority — /blissful oblivion/ means happiness and being carefree.

Some would say the inferiority is /part and parcel/ of moving up, so most people would willingly accept it. I think each individual reacts differently. Some may be more affected by it as they move up.

Rivalry is another side of the same coin. It can get ruthless. I remember Mark in PWM.

Demotions and promotions are more intense than the annual bonus situation.


senior mgr position risk: temptations

A risk underestimated at the senior mgr position — seduction, temptation. You will be a target. I guess the operators are sharp observers; they can spot your weakness. It’s only human to be attracted to the opposite sex. You can’t completely hide your vulnerability.

A friend ML told me it can be very hard to resist at the right time and right place.

Alcohol is a common “weakening” factor, or possibly a weapon used by the operator.

python to dump binary data in hex digits

Note hex() is a built-in, but I find it inconvenient — I need to print each byte as two digits with a leading 0.

Full source is hosted in https://github.com/tiger40490/repo1/blob/py1/tcpEchoServer.py

def Hex(data): # generator function: two hex digits per byte, extra gap every 8 bytes
  i=0
  for code in map(ord, data):
    yield "%02x " % code
    i += 1
    if i%8==0: yield ' '

print(''.join(Hex("\x0a\x00")))  # prints: 0a 00

friend class Fren need!!be fwd-declared

http://www.cplusplus.com/forum/articles/10627/ is a forum post, but I basically trust it:

There are two basic kinds of dependencies you need to be aware of:
1) stuff that can be forward declared
2) stuff that needs to be #included

If, for example, class A uses class B, then class B is one of class A’s dependencies. Whether it can be forward declared or needs to be included depends on how B is used within A:

- do nothing if: The only reference to B is in a friend declaration <-- I tested this myself.
- forward declare B if: A contains a B pointer or reference: B* myb;
- forward declare B if: one or more functions has a B object/pointer/reference
as a parameter, or as a return type:

B MyFunction(B myb);

- #include "b.h" if: B is a parent class of A
- #include "b.h" if: A contains a B object: B myb;
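A sketch of the “do nothing” case from the list above — the friend declaration alone introduces the name Fren, with no #include and no prior forward declaration:

```cpp
class A {
    friend class Fren;   // this both declares Fren and grants it access
    int secret = 7;
};

class Fren {             // the full definition can come later
public:
    int peek(const A& a) const { return a.secret; }  // allowed: Fren is A's friend
};
```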

unix family tree #MacOS

This is academic knowledge for the self-respecting techie.

https://upload.wikimedia.org/wikipedia/commons/c/cd/Unix_timeline.en.svg  and https://en.wikipedia.org/wiki/UNIX_System_V#/media/File:Unix_history-simple.svg show

  • MacOS is based on BSD
  • iOS and MacOS are based on Darwin
    • Darwin is based on BSD
  • linux contains no BSD or Unix codebase
  • most commercial Unix versions are based on sysV

housekeeping^payload fields: vector,string,shared_ptr

See also std::string/vector are on heap; reserve() to avoid re-allocation

std::vector — payload is an array on heap. Housekeeping fields hold things like size, capacity, pointer to the array. These fields are allocated either on stack or heap or global area depending on your variable declaration.

  • Most STL (and boost) containers are similar to vector in terms of memory allocation
  • std::string — payload is a char-array on heap, so it can expand both ways. Housekeeping data includes size…
  • shared_ptr — payload includes a ref counter and a raw-pointer object [1] on heap. This is the control-block shared by all “club members”. There’s still some housekeeping data (pointer to the control block), typically allocated on stack if you declare the shared_ptr object on stack and then use RAII.

If you use “new vector” or “new std::string”, then the housekeeping fields will also live on the heap, but I find this practice less common.

[1] this is a 32-byte pointer object, not a pure address. See 3 meanings of POINTER + tip on q(delete this)
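A small sketch of the housekeeping/payload split. The exact sizes are implementation-defined, so the comments only show typical 64-bit values:

```cpp
#include <vector>
#include <memory>
#include <cstddef>

// the housekeeping part has a small, fixed size no matter how big the heap payload grows
std::size_t vecHeader() { return sizeof(std::vector<int>); }     // often 24: begin/end/capacity pointers
std::size_t spHeader()  { return sizeof(std::shared_ptr<int>); } // often 16: raw ptr + control-block ptr
```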

git| origin/br1 ^ remotes/origin/br1

In q[ git branch -a ], we see

remotes/origin/br1
remotes/origin/br2

I tested that

q[ git log remotes/origin/br1 ] and q[ git log origin/br1 ] are identical

— my incomplete understanding:

There can be many remotes in a git repo. Each remote is “pointer” to a peer repo at some URL, identified by a nickname. “origin” is the default nickname of the remote you cloned from.

q[ git remote -v ] shows the URL and nickname of each remote.

array^pointer variables types: indistinguishable

  • int i; // a single int object
  • int arr[9]; //the name is a nickname for the starting address of the array, very similar to a pure-address const pointer
  • int * const constPtr;
  • <— the above two data types are similar; the below two are similar —>
  • int * pi; //a regular pointer variable,
  • int * heapArr = new int[9]; //data type is same as pi
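A sketch showing the compiler does distinguish the two pairs, even though the array name decays to a pointer in most expressions:

```cpp
#include <type_traits>

int arr[9];               // type: int[9]
int* heapArr = nullptr;   // same type as any other int*

static_assert(!std::is_same<decltype(arr), int*>::value,
              "int[9] is its own type...");
static_assert(std::is_same<decltype(+arr), int*>::value,
              "...but decays to int* when used in an expression");
```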

c++big4: prefer synthesized

I think it’s the author of [[safe c++]] who pointed out that if we have to maintain a non-default big4, then it’s extra workload for the maintenance programmer. He argued convincingly that it’s not a good idea to require other programmers (or yourself) to “always remember to do something”.

pointer as field –#1 pattern in c++ explains that shared_ptr as a pointer field allows us to use the default big4.
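A sketch of that pattern (Trade is a hypothetical class): holding the pointer field via shared_ptr keeps all four synthesized members correct, with nothing to “always remember”:

```cpp
#include <memory>
#include <string>

class Trade {
    std::shared_ptr<std::string> note;   // instead of a raw std::string* field
public:
    explicit Trade(std::string s) : note(std::make_shared<std::string>(std::move(s))) {}
    const std::string& text() const { return *note; }
    // no dtor, copy-ctor, or operator= written: the synthesized versions
    // simply copy/destroy the shared_ptr, which manages the heap string
};
```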

array as field #implementation pattern

An array field is less common in java/c# than in c++, since those languages offer richer collections such as vector, hashtable and deque.

As an alternative, consider replacing the array with a vector. The vector uses heap memory, but total memory usage is probably similar.

  • benefit — lower risk of seg fault due to index out of range
  • benefit — growable, though in many cases this is unneeded
  • benefit — different instances can have different sizes, and the size is accessible at run time.
  • benefit — compared to a heap array as a field, vector offers RAII safety

ensure operator<< is visible via header file

If you define operator<<() for a basic ValueObject class like Cell, to be used in higher-level class like Board, then you need to make this declaration visible to Board.cpp via header files.

If you only put the definition of operator<<() in ValueObj.cpp, with no declaration in ValueObj.h, then linking ValueObj.o and Board.o may still succeed. But when Board.cpp streams a Cell object, overload resolution cannot see your operator<< and will bind to some other viable overload (if any) rather than yours.
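A minimal sketch (Cell and its member are made-up names) of the declaration that belongs in ValueObj.h — here both pieces are shown in one file for brevity:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

struct Cell { char glyph; };

// This declaration is what must live in ValueObj.h so Board.cpp can see it:
std::ostream& operator<<(std::ostream& os, const Cell& c);

// The definition can stay in ValueObj.cpp:
std::ostream& operator<<(std::ostream& os, const Cell& c) {
    return os << '[' << c.glyph << ']';
}
```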

2obj files compiled@different c++toolchains can link@@

(many interviewers asked…)

Most common situation — two static libs pre-compiled on toolchains A and B, then linked. Usually we just try our luck; if that fails, we re-compile all source files on the same toolchain.

Toolchains A and B could differ by version, by compiler brand, or by language (c vs c++)… I guess compatibility hinges on the Application Binary Interface (ABI) of each toolchain.

https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html says that it’s possible (“straightforward”) to link C++03 and C++11 code together.

##teaching the privileged to get ahead@@

It’s often easier, more lucrative to focus on the affluent consumers, but consider “value”.

Example — trading techniques. This kinda teaching doesn’t really have much social value, except .. risk reduction? Zero-sum game … you help some win, so other investors must lose.

Example — coaching brainy kids to get into gifted classes. This is gaming the competitive “system”. Actually, the poor kids need your help more.

Example — coaching kids to win table-tennis competitions. Arguably you help improve the game, but how much social value is there? Mostly you are helping those few individual kids get ahead.

Many other teaching subjects do have social value

  • languages, writing
  • tech, math, science
  • programming
  • health care
  • financial literacy
  • arts

EarlyRetireExtreme: learning as pastime !! mainstay

The ERE author enjoys learning practical skills as a hobby. In fact, his learning programs could be more than a hobby, since he has no full-time job.

However, I am a very different human being from him. I feel very few such learning programs could be the mainstay of my semi- or full retirement. Why?

  • I need to work towards some level of commitment, and a daily routine.
  • I need to make some contribution and be paid for it
  • I prefer interaction with other people

q[less] functor ^ operator<() # map/sort/lower_bound

  • In coding tests, you can use any solution below, so I will use a global operator<(Trade, Trade)
  • In QQ, we need some basic know-how to discuss the alternatives but …. seldom quizzed

Let’s summarize the rules for QQ — I wanted to say “simple rules” but… non-trivial.

1) multiset/multimap/set/map use functor “less” .
** That’s by default. You can (but probably no practical reason to) specify any functor when instantiating the multiset class template. See post on [[allocator, vptr…]]
** Note q[less] itself is a class template; a functor is an instance of a class generated from it.

2) each instantiation of the functor template q[less] typically calls operator<()
** note this “operator<()” can be a non-static method, or a global/namespace thing
** If you have an entity class Trade, you can customize by overloading this operator as a global function accepting 2 Trade arguments.
** However, Warren of CS (market data system) said it should often be a (non-static?) method of the Trade entity class. I feel this is best practice.

The dogmatic [[effective stl]] P179 doesn’t mention overloading operator< but advocates subclassing binary_function and giving it a unique name… But that is significantly more (and unfamiliar) code to write — and std::binary_function was deprecated in C++11 and removed in C++17 — so for simplicity, simply overload operator<() and don’t touch q(less).

ptr-to-Trade as a key — see [[effSTL]] P88. Basically, you need a custom functor class deriving from std::binary_function. Beware the syntax pitfall highlighted in my blog post. Note shared_ptr is an option, not a must.

If you don’t need a red-black tree container, but need sorting, binary_search, lower_bound etc — then you have flexibility. Simplest is a pointer to a global bool function. See https://bintanvictor.wordpress.com/2017/10/01/binary-search-in-sorted-vector-of-tick-pointer/


How about std::lower_bound()? Same defaults — less and operator<()

How about std::sort()? Same defaults — less and operator<()
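A minimal sketch of the “global operator<” approach I favor in coding tests — `Trade` and its `price` field are hypothetical — showing the same overload driving set, sort and lower_bound:

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

struct Trade { int price; };   // hypothetical entity class

// global operator<, picked up by the default less<Trade>
// in std::set, std::sort and std::lower_bound alike
bool operator<(const Trade& a, const Trade& b) { return a.price < b.price; }
```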


big-data arch job market #FJS Boston

Hi YH,

My friend JS left the hospital architect job and went to some smaller firm, then to Nokia. After Nokia was acquired by Microsoft he stayed for a while then moved to the current employer, a health-care related big-data startup. In his current architect role, he finds the technical challenges too low so he is also looking for new opportunities.

JS has been a big-data architect for a few years (current job 2Y+ and perhaps earlier jobs). He shared many personal insights on this domain. His current technical expertise includes noSQL, Hadoop/Spark and other unnamed technologies.

He also used various machine-learning software packages, either open-source or in-house, but when I asked him for package names, he cautioned me that there’s probably no need to research any one of them. I get the impression that the number of software tools in machine-learning is rather high and there is not yet an emerging consensus, presumably because there has been no consolidation among the products. If that’s the case, then learning a few well-known machine-learning tools won’t enable us to add more value to a new team using another machine-learning tool. I feel these are the signs of a nascent “cottage industry” in its early formative phase, before the much-needed consolidation and consensus-building among competing vendors. The value proposition of machine-learning is proven, but the technologies are still evolving rapidly. In one word — churning.

If one were to switch career and invest oneself into machine-learning, there’s a lot of constant learning required (more than in my current domain). The accumulation of knowledge and insight is lower due to the churn. Job security is also affected by the churn.

Bright young people are drawn into new technologies such as AI, machine-learning and big data, and less drawn into “my current domain” — core java, core c++, SQL, script-based batch processing… With the new technologies, since I can’t effectively accumulate insight (and value-add), I am less able to compete with the bright young techies.

I still doubt how much value is added by machine-learning and big-data technologies in a typical set-up. I feel 1% of the use-cases have high value-add, but the other use cases are embarrassingly trivial when you actually look into them. I guess a typical set-up mostly consists of

  1. collecting lots of data
  2. storing it in SQL or noSQL, perhaps on a grid or “cloud”
  3. running clever queries to look for patterns — data mining

See https://bintanvictor.wordpress.com/2017/11/12/data-mining-vs-big-data/. Such a set-up has been around for 20 years, long before big-data became popular. What’s new in the last 10 years probably include

  • new technologies to process unstructured data (requires human intelligence or AI)
  • new technologies to store the data
  • new technologies to run queries against the data store

container of smart^raw pointer

In many cases, people need to store addresses in a container. Let’s use std::vector for example. Both smart ptr and raw ptr are common and practical

  • Problem with raw ptr — stray pointer. Usually the vector doesn’t “own” the pointees and won’t delete them. But what if a pointee is deleted somewhere else and we access the stray pointer in this vector? A smart pointer would solve this problem nicely.
  • J4 raw ptr — footprint efficiency. Raw ptr object is smaller.
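A minimal sketch of the smart-pointer choice — `Tick` and `makeVec` are made-up names — showing the vector keeping the pointee alive even after the “original owner” lets go:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Tick { int ts; };   // hypothetical payload

// The vector co-owns the Tick, so there is no stray pointer
// even after the original shared_ptr is reset.
std::vector<std::shared_ptr<Tick>> makeVec() {
    auto t = std::make_shared<Tick>(Tick{42});
    std::vector<std::shared_ptr<Tick>> v{t};
    t.reset();   // "original owner" gone; the vector still owns the Tick
    return v;
}
```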

##fastest container choices: array of POD #or pre-sized vector

relevant to low-latency market data.

  • raw array is “lean and mean” — the most memory efficient; vector is very close, but we need to avoid reallocation
  • std::array is less popular but should offer similar performance to vector
  • all other containers are slower, with bigger footprint
  • For high performance, avoid containers of nodes/pointers — cache affinity loves contiguous memory. After accessing the 1st element, accessing the 2nd is likely a cache-hit
    • set/map, linked list suffer the same
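A small sketch of the “pre-sized vector” idea — `makePreSized` is a hypothetical helper — showing that after one up-front `reserve`, pushing elements within capacity never reallocates (so the buffer stays contiguous and stable):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// pre-sizing avoids reallocation on the hot path
std::vector<int> makePreSized(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);                 // one allocation up front
    v.push_back(0);
    const int* before = v.data(); // remember the buffer address
    for (std::size_t i = 1; i < n; ++i) v.push_back(int(i));
    assert(v.data() == before);   // no reallocation happened
    return v;
}
```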

json^protobuf

https://auth0.com/blog/beating-json-performance-with-protobuf/ points out

  • —limitations of protobuf:
  • Lack of resources. You won’t find that many resources (do not expect a very detailed documentation, nor too many blog posts) about using and developing with Protobuf.
  • Smaller community. Probably the root cause of the first disadvantage. On Stack Overflow, for example, you will find roughly 1,500 questions tagged Protobuf, while JSON has more than 180,000 questions on the same platform.
  • not human readable
  • schema is extra legwork for quick and dirty project
  • — advantages of protobuf over Json
  • very dense, and binary, data
  • up to 5 times faster, though an optimized json parser could reduce the performance gap.


q[less] functor ^ operator<() again, briefly

[[effSTL]] P177 has more details than I need. Here are a few key points:

std::map and std::set — by default use less<Trade>, which typically calls a “method” operator<() of the Trade class

  • If you omit this operator, you get verbose STL build error messages about missing operator<()
  • this operator<() must be a const method, otherwise you get lengthy STL build errors.
  • See https://stackoverflow.com/questions/1102392/stdmaps-with-user-defined-types-as-key
  • The other two (friendly) alternatives are
  • function pointer — easiest choice for quick coding test
  • binary functor class with an operator()(Trade, Trade) — more code, but most efficient; arguably best practice.
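A minimal sketch of that functor alternative — `Trade` and `TradeLess` are made-up names; post-C++11 there is no need to derive from std::binary_function (deprecated in C++11, removed in C++17):

```cpp
#include <cassert>
#include <map>

struct Trade { int price; };   // hypothetical entity class

// named binary functor: the alternative to overloading operator<
struct TradeLess {
    // operator() must be const, just like a const operator<() method
    bool operator()(const Trade& a, const Trade& b) const {
        return a.price < b.price;
    }
};
```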


compute FX swap bid/ask quotes from spotFX+IR quotes #eg calc

Trac Consultancy’s coursebook has an example —

USD/IDR spot = 9150 / 9160
1m USD = 2.375% / 2.5%
1m IDR = 6.125% / 6.25%

Q: USD/IDR forward outright = ? / ?

Rule 1: treat first currency (i.e. USD) as a commodity like silver. Like all currency commodities, this one has a positive carry i.e. interest.

Rule 2: Immediately, notice our silver earns lower interest than IDR, so silver is at fwd Premium, i.e. fwd price must be higher than spot.

Rule 3: in a simple zero-spread context, we know fwd price = spot * (1 + interest differential). This same formula still holds, but now we need to decide which spot bid/ask to use, which 1m-USD bid/ask to use, which 1m-IDR bid/ask to use.

Let’s say we want to compute the fwd _b_i_d_ price (rather than the ask) of the silver. The only fulfillment mechanism is — we the sell-side would borrow IDR, buy silver, lend the silver. At maturity, the total IDR owed divided by the total amount of silver would equal my fwd bid price. In these 3 trades, we the sell-side would NOT cross the bid/ask spread even once, so we always use the favorable side of bid/ask, meaning

Use the Lower 1m-IDR
Use the Lower spot silver price
Use the Higher 1m-silver

Therefore fwd bid = 9150 × [1 + (6.125% − 2.5%)/12] ≈ 9178

…… That’s the conclusion. Let’s reflect —

Rule 4: if we arrange the 4 numbers ascending – 2.375 / 2.5 / 6.125 / 6.25 then we always get interest differential between … either the middle pair (6.125-2.5) OR the outside pair (6.25-2.375). This is because the dealer always uses the favorable quote of the lend and borrow.

Rule 5: We are working out the bid side, which is always lower than the ask, so the spot quote to use has to be the bid. If the spot ask were used, it could be so much higher than the other side (for an illiquid pair) that the final fwd bid price is higher than the fwd ask! In fact this echoes Rule 9 below.

Rule 5b: once we acquire the silver, we always lend it at the ask (i.e. 2.5). From Rule 4, the interest differential is (6.125-2.5)

Rule 9: As a dealer/sell-side, always pick the favorable side when picking the spot, the IR on ccy1 and IR on ccy2.  If at any step you were to pick the unfavorable number, that number could be so extreme (huge bid/ask spread exists) as to make the final fwd bid Exceed the ask.

Let’s apply the rules on the fwd _a_s_k_ = 9160 × [1 + (6.25% − 2.375%)/12] ≈ 9190

Rule 1/2/3/4 same.

Apply Rule 5 – use spot ask (which is the higher quote). Once we sell silver spot, we lend the IDR sales proceeds at the higher side which is 6.25%….
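The arithmetic above can be sketched as a tiny helper (the function name `fwdOutright` and its parameter names are mine, not from the coursebook), applying Rule 3 with the favorable sides chosen per Rule 9:

```cpp
#include <cassert>
#include <cmath>

// Rule 3: fwd = spot * (1 + (ccy2Rate - ccy1Rate) * yearFraction)
// where ccy1 (USD, the "silver") earns ccy1Rate and ccy2 (IDR) earns ccy2Rate
double fwdOutright(double spot, double ccy1Rate, double ccy2Rate, double yearFrac) {
    return spot * (1 + (ccy2Rate - ccy1Rate) * yearFrac);
}
```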

##xp@career diversification #instead of stack-up/deepen

  • biz wing — in addition to my tech wing. I learned a bit but not enough. Not strategic
  • quant? diversify. The on-the-job learning was effective and helped me with subsequent interviews, but the further push (UChicago) is not bearing fruit
  • data science? diversify
  • big data java jobs? stack-up
  • —-diversify within the tech space, where I have proven strengths
  • py? bearing fruits. Confidence.
  • unix -> web dev -> java? extremely successful
  • [10 Citi] c++? slowly turning positive
  • [12 Barc] swing? positive experience
  • [12] dotnet? reasonable
  • [14] real time risk, in-house framework (Quartz)? disillusioned
  • [17] raw mktData
  • [18] high-volume, low-latency equity Order Management
  • [18] FIX? diversify

some international securities have no cusip/isin but never missing both

A BAML collateral system dev told me that some securities in his system lack a cusip and others lack an isin, but every security has at least one of the two.

I believe some international assets pledged as collateral could be missing one of them.

Japanese gov bond is a common repo asset — cross-currency repo. The borrower needs USD but uses Japanese bond as collateral.

In MS product reference database, I see these identifiers:

  • internal cusip
  • external cusip – used in U.S./Canada
  • cins – CUSIP International Numbering System, for “foreign” securities
  • isin – if you want to trade something internationally
  • sedol
  • bloomberg id
  • Reuters RIC code, RT symbol and RT tick


collateral: trade booked before confirmation

In collateral system, a margin call requires the counter party to post additional collateral (within a short window like a day). If the collateral is in the form of a bond (or another security), then it’s considered a “trade”. There are often pre-agreed procedures to automatically transfer the bond.

So the IT system actually books the trade automatically, even before the collateral operations team gets to confirm the trade with the counter party. That’s what I heard from an application owner. However, I suspect these bonds could be held in some special account and transferred and confirmed automatically when required. In such a case, the trade booking is kind of straight-through-processing.

I guess the counter-party is often a margin account owner, perhaps a hedge fund in a prime brokerage system.

tail-recursion Fibonacci # tricky]python

Tail recursion is a “halo” skill in coding interviews. It turns out that most recursive functions can be reworked into the tail-call form, according to http://chrispenner.ca/posts/python-tail-recursion.

The same author also demonstrates

  1. python recursion stack depth is about 1000 only, so deep recursion is unpopular in python
  2. python doesn’t support tail recursion
  3. some decorator trick can simulate tail recursion in python

—————-

Easiest demo problem is factorial(N). For Fibonacci, https://stackoverflow.com/questions/22111252/tail-recursion-fibonacci has a very short python implementation (though I suspect python doesn’t optimize tail recursion). Let me rephrase the question:

Q: Given f(firstVal=0, secondVal=1, length=0) returns 0, f(0,1,1) returns 1, can you implement f(0,1,N) using recursion but in O(N) time and O(1) space? Note Fib(N) ==f(0,1,N)

Key points in the python solution:

  • Start with iterative algo, then convert it to tail recursion.
  • use 2 extra arguments to hold last two intermediate values like Fib(2) Fib(3) etc
  • We saw in the iterative solution that memory usage is O(1), a good sign that tail recursion might be possible.
  • if you observe the sequence of Fib() values computed in the blackbox, actually, you see Fib(2), Fib(3) … up to Fib(N), exactly like the iterative solution.
  • solution is extremely short but non-trivial
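The same algorithm can be sketched in C++ (a sketch in the spirit of the linked FibTailRecurse.cpp, not the exact file); the two accumulator arguments carry consecutive Fibonacci values, and the recursive call is in tail position:

```cpp
#include <cassert>

// fib(0, 1, N) == Fib(N); the two accumulators hold Fib(k) and Fib(k+1).
// O(N) time; O(1) space only if the compiler performs tail-call optimization.
long fib(long first, long second, int length) {
    if (length == 0) return first;
    return fib(second, first + second, length - 1);   // tail call
}
```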

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/FibTailRecurse.cpp is my very brief implementation

##c++QQ topics I took up since Apr 2017

Combine with https://bintanvictor.wordpress.com/wp-admin/post.php?post=24064&action=edit ? no need… Don’t spend too much time! I keep this list only as motivation and reward, but I don’t want a long list. I want heavy-hitters only, non-exhaustive.

Note “Take-up/Conquest” mean … I now feel as competent as my interviewers on that topic.

Note QQ means … hard topics unneeded in any project.

  • RAII+swap(), the killer combination, in the sports sense of “kill”
  • [3] sfinae
  • [3] crtp
  • covariant return type and virtual ctor i.e. clone()
  • static_assert? Non-trivial for those uninitiated in TMP
  • [3] alternative to virtual functions
  • singleton thread-safety
  • [3] heap memory mgmt strategies
  • [3d] make_shared benefits and limitations (Scott Meyers)
  • [3] shared_ptr concurrency
  • [d] Common mistakes involving shared_ptr
  • weak_ptr
  • unique_ptr ownership transfer
  • factory returning smart pointers
  • [d] emplace employing perfect forwarding
  • [3] just when std::move() is needed
  • std::move usage (outside big4)
  • [d] rval objects vs rvr
  • [3] placement new
  • [3] MI using Abstract Base Classes
  • [3] TCP congestion control
  • TCP OOS control
  • reinterpret_cast
  • — topics to be conquered
  • [3] TCP resend
  • TCP buffer sizing
  • std::forward()

[3=top-33 favorite topics among ibank interviewers]
[d=reached some depth of understanding]

##11 c++academic topics that turned out2b popular

  • –ranked by .. surprise and value to my learning
  • lvr/rvr references vs pointers
  • *_cast operators, conversions including std::move
  • TMP – CRTP, SFINAE, type traits
  • multiple inheritance
  • placement new
  • c/c++ integration
  • weak_ptr? small halo
  • ctor/dtor sequence esp. in virtual inheritance
  • public api of any smart ptr class
  • — Now the rarely-quizzed topics:
  • 😦 boost threading and other boost
  • 😦 STL iterators and algorithms
  • 😦 pthreads
  • 😦 exception guarantees
  • 😦 allocators

##c++QQ topics discovered ONLY from IV

I didn’t know these were important when I was reading on my own.

  • socket details
    • tcp handling of OOS
    • tcp flow control, AWS
  • smart ptr api and control-block manipulation
    • make_shared details
    • enable_shared_from_this
    • auto_ptr vs unique_ptr
  • multiple inheritance, casting
  • template techniques
  • std::forward
  • exception catching/re-throwing
  • q[ … ] variadic template params
  • inline: impact on performance
  • throwing dtor
  • details of pure virtual

real-time symbol reference-data: arch #RTS

Real Time Symbol Data is responsible for sending out all security/product reference data in real time, without duplication.

  • latency — typically 2ms (not microseconds) from receiving to sending out the enriched reference data to downstream.
  • persistence — any data worth sending out needs to be saved. In fact, every hour the same system sends a refresh snapshot to downstream.
    • performance penalty of disk write — is handled by innoDB. Most database access is in-memory; disk writes are rare. There is enough memory to hold 30GB of data. https://bintanvictor.wordpress.com/2017/05/11/exchange-tickers-and-symbols/ shows how many symbols there are across all trading venues.
  • insert is actually slower than update, but first the system must check whether an insert or update is needed at all. If there is no change, then don’t save the data or send it out.
  • burst / surge — is the main performance headache. We could have a million symbols/messages flooding in
  • relational DB with mostly in-memory storage

peers’priority^%%priority: top5 #beyond gz

In a nutshell, some of my peers’ priorities I have decidedly given up, and some of my priorities are fairly unique to me.

Actually I only spoke to a small number of (like 10) peers, mostly Chinese and Indian. There are big differences among their views. However, here’s a grossly oversimplified sample of “their” priorities.

  • theirs — top school district
  • theirs — move up (long-term career growth) in a good firm?
    • build long-term relationships with big bosses
  • theirs — green card
  • theirs — early retirement
  • ———-
  • mine — diversify (instead of deepen or stack-up) and avoid stagnation
  • mine — stay technical and marketable, till 70
  • mine — multiple properties and passive income
  • mine — shorter commute
  • mine — smaller home price tag


converting btw epoch, HHMMSS ..

  • Note epoch is second-level, not millisecond level.
  • Note epoch is timezone-agnostic. It’s always defined in UTC.
  • Note struct tm is the workhorse. It can break up a time object into the year/weekday/…/second components

—to convert from epoch to HHMMSS:
https://github.com/tiger40490/repo1/blob/cpp1/cpp/88miscLang/convertEpochInt.cpp

—to convert current time to int epoch number:
https://github.com/tiger40490/repo1/blob/cpp1/cpp/88miscLang/convertEpochInt.cpp

—epoch timestamp is typically in seconds

  • 1513961081 — 10 digits, seconds since Epoch
  • 1511946930032722000 — 19 digits, nanosec since Epoch
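A minimal sketch of the epoch→HHMMSS direction (the helper name `epochToHHMMSS` is mine; `gmtime_r` is POSIX — use `localtime_r` instead for the local zone), with struct tm doing the break-up:

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// break an epoch (seconds, UTC) into HHMMSS via the struct tm workhorse
std::string epochToHHMMSS(std::time_t epoch) {
    std::tm parts{};
    gmtime_r(&epoch, &parts);   // UTC; localtime_r for the local timezone
    char buf[7];
    std::snprintf(buf, sizeof buf, "%02d%02d%02d",
                  parts.tm_hour, parts.tm_min, parts.tm_sec);
    return buf;
}
```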

g++ -D_GLIBCXX_DEBUG #impractical

This is a good story for interviews.

In a simple program I wrote from scratch, this flag saved the day. My input to std::set_difference was not sorted, as detected by this flag. Without this flag, the compiler didn’t complain and I had some unexpected successful runs, but with more data I hit runtime errors.

I had less luck using this flag with an existing codebase. After building my program with this flag, I got random run-time crashes due to “invalid pointer at free()” whenever I used a std::stringstream.


non-local^local static object initialization ] c++

Regarding when such objects are initialized, there are simple rules, as illustrated in the post q[2 simple yet concurrent singleton implementations ] c++]. The two singleton implementations each use one type.

The rules:

  1. lazy init — local statics are initialized in the first function call. GCC guarantees only one thread would initialize it, never concurrently on two threads.
  2. eager init — non-local statics are initialized before main(), on one single thread, but the sequence among them is non-deterministic, as explained by Scott Meyers.
  3. Both are kind of thread-safe
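Rule 1 (lazy init of a local static) is exactly the Meyers-singleton idiom — a minimal sketch, with `Logger` as a made-up class name:

```cpp
#include <cassert>

// lazy init: the local static is constructed on the first call to instance();
// GCC (and C++11 in general) guarantees exactly one thread performs the init.
class Logger {
    Logger() = default;   // clients can't construct their own
public:
    static Logger& instance() {
        static Logger theOne;   // initialized once, thread-safely
        return theOne;
    }
};
```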

j4 factory java^c++ #Wells

A Wells Fargo interviewer asked

Q6: motivation of factory pattern?
Q6b: why prevent others calling your ctor?

  • %%A: some objects are expensive to construct (DbConnection) and I need tight control.
  • %%A: similarly, after construction, I often have some initialization logic in my factory, but I may not be allowed to modify the ctor, or our design doesn’t favor doing such things in the ctor. I make my factory users’ life easier if they don’t call new() directly.
  • AA: more importantly, the post-construction function could be virtual! This is a sanctioned justification for c++ factory on the authoritative [[c++codingStd]] P88
  • %%A: I want to (save the new instance and) throw exception in a post-construction routine, but I don’t want to throw from ctor
  • %%A: I want to centralize the non-trivial business rule of selecting target types to return. Code reuse rather than duplication
  • %%A: a caching factory
  • %%A: there are too many input parameters to my ctor and I want to provide users a simplified façade
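The AA point (virtual post-construction) can be sketched like this — `Conn`, `DbConn`, `init()` and `ready` are all hypothetical names, not from [[c++codingStd]]; a ctor cannot usefully call a virtual, but a factory can call it right after construction:

```cpp
#include <cassert>
#include <memory>

class Conn {
protected:
    Conn() = default;                    // clients must go through make()
    virtual void init() { ready = true; }  // the virtual post-construction step
public:
    bool ready = false;
    virtual ~Conn() = default;

    template <class T>
    static std::unique_ptr<Conn> make() {
        std::unique_ptr<Conn> p(new T);
        p->init();   // dispatches to the derived override -- a ctor can't do this
        return p;
    }
};

class DbConn : public Conn {
    void init() override { Conn::init(); /* e.g. open the connection here */ }
};
```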

shared_ptr dislikes private ctor/dtor in payload class

1) make_shared can’t call a private ctor. If your ctor is private, you have to use

shared_ptr<MyClass> sp(new MyClass); // e.g. inside a static factory method of your class

2) If your MyClass dtor is private, you simply can’t hold it in shared_ptr (with the default deleter).

#include <iostream>
#include <memory>
using namespace std;
class C{
  //~C(){cout<<"dtor\n"; } // private dtor breaks shared_ptr<C>
  C(){cout<<"ctor\n"; }
public:
  static shared_ptr<C> mk(){
    //shared_ptr<C> sp = make_shared<C>(); //won't compile if ctor is private
    return shared_ptr<C>(new C());
  }
};
int main(){ shared_ptr<C> sp = C::mk(); }

custom delimiter for cin operator>> #complicated

Tested, but too hard to remember. Better to use the getline() trick in https://bintanvictor.wordpress.com/2017/11/05/simplest-cway-to-split-string-on-custom-delimiter/

#include <iostream>
#include <locale>
#include <sstream>
#include <string>
using namespace std;

struct comma_is_space : std::ctype<char> { //use comma as delimiter
  comma_is_space() : std::ctype<char>(get_table()) {}
  static mask const* get_table() {
    static mask rc[table_size]; //zero-initialized: only comma is classified as "space"
    rc[','] = std::ctype_base::space;
    return &rc[0];
  }
};

// usage, given a std::string line:
istringstream iss(line);
iss.imbue(locale(cin.getloc(), new comma_is_space));

binary search in sorted vector of Tick pointer

Note the mismatched argument orders required by the two comparator functions.

(I was unable to use a functor class.)

#include <algorithm>
#include <vector>
using namespace std;

struct Tick { unsigned int ts; }; // minimal payload for illustration

vector<Tick const*> vec; // sorted by ts ascending
unsigned int target;

// lower_bound comparator: (element, value)
bool mylessFunc(Tick const * tick, unsigned int target) {
     //cout<<tick->ts<<" against "<<target<<endl;
     return tick->ts < target;
}
// lower_bound(vec.begin(), vec.end(), target, mylessFunc);

// upper_bound comparator: (value, element)
bool mygreaterFunc(unsigned int target, Tick const * tick){
     //cout<<tick->ts<<" against "<<target<<endl;
     return tick->ts > target;
}
// upper_bound(vec.begin(), vec.end(), target, mygreaterFunc);