%%strength ] c++knowhow #Mithun

See also

Mithun asked me “So you are now completely pro in c++?” I replied

  1. On-the-job technical challenges are not very different from java
  2. On interviews the QQ topics are different. That’s the real challenge for a java guy moving into c++.

Now I feel a 3rd element is zbs beyond GTD and interviews. I have written many blogposts about “expert”. I also have many blogpost in the category “c++real”

Some may say c++ is overshadowed by java, and c++ QQ is overshadowed by coding IV. Well, we need sharper perception and judgment, and recognize the many facets of the competitive landscape. I won’t elaborate here, but c++ has withstood many waves and is more robust than many other technologies.

Advertisements

##c++dev tools for GTD+zbs

The primary tool chain (IDE, gcc, gdb…) often provides similar features. Therefore, some of these are no longer needed.

  • c++filt
  • [so] ctags, cxref, cflow
  • depend and cppdepend
  • [s] lint — outdated and no longer popular
  • [x] prof/gprof
  • [x] valgrind
  • [o] nm and objdump
  • [s=works on source files]
  • [o=works on object files]
  • [x=works on executables only]

poll()as timer]real time C : industrial-strength #ST-mode

See also http://www.sourcexr.com/articles/2013/11/12/timer-notifications-using-file-descriptors

RTS Xtap (and earlier framework? probably) use a C timer based on the Linux syscall epoll(). It has millisecond precision [1], good enough for our real time feed parsers that consume all of NYSE, Nasdaq, OPRA etc. I won’t say how many clients we support, but some of the world’s top sites use our feeds.

There’s also a version using poll(). It’s used by default when epoll is unavailable, but linux supports epoll. For simplicity, I will say “poll” from now on.

[1] The millisec resolution is due to the tick length in Linux kernel, described on P 195 [[Linux kernel]]

I guess one advantage of poll() to implement timer is the integration with sockets. A parser is fundamentally event driven. Timer and socket are the two primary event sources.

It looks like the poll() syscalls are capable of supporting both requirements, together, which is our situation.

I call it “industrial-strength” solution because it has powered one of the most reliable, most widely used market data dissemination systems for decades, the unsung hero behind most of the financial data web sites. It has been in use for decades and handles 300 “exchanges feeds”

Concurrency? xtap is single-threaded mode. In the poll() solution, incoming packets AND timer events are inherently processed serially. No synchronization required. When the receiver socket is busy, timer events are .. delayed. Exactly what we want.

–The timer/event loop structure

while(1){ //pseudocode
  check_timers(); //roughly this gets called every 1 ms
  epoll_wait(timeout=1ms);
}

 

[[safeC++]] discourages implicit conversion via OOC/cvctor

See other posts about OOC and cvctor. I am now convinced by [[safeC++]] that it’s better to avoid both. Instead, Use AsXXX() method if converting from YYY to XXX is needed. Reason is type safety. In an assignment (including function input/output), it is slightly hacky if LHS is NOT a base type of RHS. Implicit conversion is like Subversion of compiler’s type enforcement — Given a function declared as f(XXX), it should ideally be illegal to pass in a YYY. However, The implicit converters break the clean rule, from the back door.

As explained concisely on P8 [[safeC++]], The OOC is provided specifically to support implicit conversion. In comparison, The cvctor is more likely to be a careless mistake if without “explicit”.

Favor explicit conversion rather than implicit conversion. Some manager in Millennium pointed out that c++ syntax has too many back doors and is too “implicit”. Reading a piece of code you don’t know what it does, unless you have lots of experience/knowledge about all the “back-doors”.

[[safeC++]]assertion technique(versatile), illustrated with NPE

Update: how to add a custom error message to assert — https://stackoverflow.com/questions/3692954/add-custom-messages-in-assert

This is a thin book. Concise and practical (rather than academic) guidelines.

#1 guideline: enlist compiler to catch as many errors as possible.

However, some errors will pass compiler and only happen at run-time. Unlike on P11, I will treat programmer errors and other run-time errors alike – we need an improvement over the “standard” outcome which is UB (undefined behavior) and we may or may not see any error message anywhere.

#2 guideline: In [[safeC++]], such an improvement is offered in the form of assertions, in other words, run-time checks. The author gives them a more accurate name “diagnostics”.

2.1) now the outcome is guaranteed termination, rather than the undefined behavior.
2.2) now there’s always an error message + a stack trace. Now 2.2) sounds like non-trivial improvement. Too good to be true? The author is a practicing programmer in a hedge fund so I hope his ideas are real-world.

Simplest yet realistic example of #2 is NPE (i.e. null pointer deref). NPE is UB and could (always? probably not) crash. I doubt there’s even an error message. Now with a truly simple “wrapper” presented on P53-54, an NPE could be diagnosed __in_time__ and an fatal exception thrown, so program is guaranteed to terminate, with an error message + stack trace.

Like a custom new/delete (to record allocations), here we replace the raw pointer with a wrapper. There we see a pattern where we replace builtin c++ constructs with our wrappers to avoid UB and get run time diagnostics —

$ this wrapper is actually a simple smart ptr
$ traditional smart ptr templates
$ custom new, delete
$ vector
$ Int class replacing int data type

The key concepts —
% assertion
% diagnostics
% run time

Q: Can every UB condition be diagnosed this way? Not sure, but the most common ones seem to be.

[[safeC++]] – concise, pragmatic, unconventional wisdom

First off, this is a 120-page thin book, including about 30 pages of source code in the appendices. Light-weight, concise. Very rare.

I feel the author is bold to advocate avoidance of popular c++ features such as
– “Avoid using pointer arithmetic at all.”
– For class fields, avoid built-in types like int. Use Int type — no need to initialize.
– “Use the new operator only without bracket”. Prefer Vector to new[]
– “Whenever possible, avoid writing copy ctor and assignment operators for your class”

I feel these suggestions are similar to my NPE tactics in java. Unconventional wisdom, steeped in a realistic/pessimistic view of human fallibility, rather tedious, all about ….low-level details.

Amidst so many books on architecture and software design, I find this book so distinctive and it speaks directly to me — a low-level detailed programmer.

I feel this programmer has figured out the true cost/benefit of many c++ features, through real experience. Other veterans may object to his unconventional wisdom, but I feel there’s no point proving or convincing. A lot of best practices [1] are carefully put aside by veterans, often because they know the risks and justifications. These veterans would present robust justifications for their deviation — debatable but not groundless.

[1] like “avoid global variables and gotos”

–finance
Given the author’s role as a quant developer I believe all of the specific issues raised are relevant to financial applications. When you read about some uncommon issue (examples in [1]), you are right to question if it’s really important to embedded, or telecom, or mainframe domains, but it is certainly relevant to finance.

Incidentally, most of the observations, suggestions are tested on MSVS.

–assert, smartPtr…
I like the sample code. Boost smart ptr is too big to hack. The code here is pocket-sized, even bite-sized, and digestible and customizable. I have not seen any industrial strength smart pointer so simple.

–templates
The sample code provided qualify as library code, and therefore uses some simple template techniques. Good illustration of template techniques used in finance.

[1] for eg the runtime cost of allocating the integer ref count on P49; or the date class.

Any negative?

Simple, clean, pure Multiple Inheritance@@

Update — Google style guide is strict on MI, but has a special exception on Windows.

MI can be safe and clean —

#1) avoid the diamond. Diamond is such a mess. I’d say don’t assume virtual base class is a vaccine

#2) make base classes imitate java interface … This is one proven way to use MI. Rememer Barcalys FI team. All pure virtual methods, No field, No big4 except empty virtual dtor.

#2a) Deviation: java8 added default methods to interfaces

#2b) Deviation: c++ private inheritance from one concrete base class , suggested in [[effC++]]

#3) simple, minimal, low-interference base classes. Say the 2 base classes are completely unrelated, and each has only 1 virtual method. Any real use case? I can’t think of any but when this situation arises i feel we should use MI with confidence and caution. Similarly “goto” could be put to good use once in a blue moon.

g++ frontends ^ JVM/CLR bytecode

Have you ever wondered how gcc can compile so many languages. We know various dotnet languages all compile to the same IL code; JVM similar. Now I know that gcc uses a similar technique —

gcc has a frontend for java
gcc has a frontend for ObjectiveC
gcc has a frontend for c++
gcc has a frontend for Fortran

All of them produce the same intermediate code that feeds the true compiler, which produces assembly code.

c++ uninitialized "static" objects ^ stackVar

By “static” I mean global variables or local static variables. These are Objects with addresses, not mere symbols in source code. Note Some static objects are _implicitly_ static — P221 effC++.

Rule 1: uninitialized local and class fields of primitive types (char, float…) aren’t automatically initialized. See [[programming]] by Stroustrup.

Rule 2: uninitialized local or file-scope statics are automatically initialized to 0 bits at a very early stage of program loading. See https://stackoverflow.com/questions/1597405/what-happens-to-a-declared-uninitialized-variable-in-c-does-it-have-a-value

Rule 3: all class instances are “initialized”, either explicitly, or implicitly via default ctor. Beware… Reconsider Rule 1 — is a field of primitive type of the class initialized? I don’t think so. Therefore, “initialized” means a ctor is called on the new-born instance, but not all fields therein are necessarily initialized. I’d say the ctor can simply ignore a primitive-typed field.

Uninitialized static Objects (stored in BSS segment) don’t take up space in object file. Also, by grouping all the symbols that are not explicitly initialized together, they can be easily zeroed out at once. See
http://stackoverflow.com/questions/9535250/why-is-the-bss-segment-required

Any static Object explicitly initialized by programmer is considered an “initialized” static object and doesn’t go into BSS.

P136 [[understanding and using c pointers]] uses a real example to confirm that a pointer field in a C struct is uninitialized. C has no ctor!

P261 [[programming]] by Stroustrup has a half-pager summary
* globals are default-initialized
* locals and fields are truly uninitialized …
** … unless the data type is a custom class having a default ctor. In that case, you can safely declare the variable without initialization, and content will be a pre-set value.

4 Scopes for operator-overloads #new()

  1. non-static member operators are very common, such as smart ptr operator++(), operator< () in iterators
  2. static member operator new() is sometimes needed. ARM explains why static.
  3. friend operators are fairly common, such as operator<<()
  4. class specific free standing operator is recommended by Sutter/Andrei, to be placed in the same namespace as the target class. Need to understand more. Advanced technique.

static object initialization order #lazy singletons

Java’s lazy singleton has a close relative among C++ idioms, comparable in popularity and implementation.

Basically, local statics are initialized upon first use, exactly. Other static Objects are initialized “sometime” before main() starts. See the Item on c vs c++ in [[More eff c++]] and P222 [[EffC++]].

In real life, this’s hard to control — a known problem but with standard solutions — something like a lazy singleton. See P 170 C++FAQ. Also addressed in [[EffC++]] P221.

[09]global free func^free func]a namespace^static method #ADL

In java there are static methods and nothing else. In c there are “functions” and nothing else. In c++ there are 3 alternatives.

1) static methods — like java
** unlike the free functions, static methods can be private or protected
** can access private static fields of host class
** unlike the free functions, static methods are inherited. See Operator-new in [[eff c++]]
** usually don’t need name space prefix

2) free func/operators in a “package” like a boost library or STL, typically organized into namespaces. Better than global. See ADL technique

3) GLOBAL free func /operators — vestige of C syntax

In quick and dirty applications (not libraries), you see lots of global free functions.