c++debug^release build can modify app behavior #IV

This was actually asked in an interview, but it’s also good GTD knowledge.

https://stackoverflow.com/questions/4012498/what-to-do-if-debug-runs-fine-but-release-crashes points out —

  • fewer uninitialized variables — Debug mode is more forgiving because it is often configured to initialize variables that have not been explicitly initialized.
    • For example, Perhaps you’re deleting an uninitialized pointer. In debug mode it works because pointer was nulled and delete ptr will be ok on NULL. On release it’s some rubbish, then delete ptr will actually cause a problem.

https://stackoverflow.com/questions/186237/program-only-crashes-as-release-build-how-to-debug points out —

  • guard bytes on the stack frame– The debugger puts more on the stack, so you’re less likely to overwrite something important.

I had frequent experience reading/writing beyond an array limit.

https://stackoverflow.com/questions/312312/what-are-some-reasons-a-release-build-would-run-differently-than-a-debug-build?rq=1 points out —

  • relative timing between operations is changed by debug build, leading to race conditions

Echoed on P260 [[art of concurrency]] which says (in theory) it’s possible to hit threading error with optimization and no such error without optimization, which represents a bug in the compiler.

P75 [[moving from c to c++]] hints that compiler optimization may lead to “critical bugs” but I don’t think so.

  • poor use of assert can have side effect on debug build. Release build always turns off all assertions as the assertion failure messages are always unwelcome.
Advertisements

##c++dev tools for GTD+zbs

The primary tool chain (IDE, gcc, gdb…) often provides similar features. Therefore, some of these are no longer needed.

  • c++filt
  • [so] ctags, cxref, cflow
  • depend and cppdepend
  • [s] lint — outdated and no longer popular
  • [x] prof/gprof
  • [x] valgrind
  • [o] nm and objdump
  • [s=works on source files]
  • [o=works on object files]
  • [x=works on executables only]

[[safeC++]] discourages implicit conversion via OOC/cvctor

See other posts about OOC and cvctor. I am now convinced by [[safeC++]] that it’s better to avoid both. Instead, Use AsXXX() method if converting from YYY to XXX is needed. Reason is type safety. In an assignment (including function input/output), it is slightly hacky if LHS is NOT a base type of RHS. Implicit conversion is like Subversion of compiler’s type enforcement — Given a function declared as f(XXX), it should ideally be illegal to pass in a YYY. However, The implicit converters break the clean rule, from the back door.

As explained concisely on P8 [[safeC++]], The OOC is provided specifically to support implicit conversion. In comparison, The cvctor is more likely to be a careless mistake if without “explicit”.

Favor explicit conversion rather than implicit conversion. Some manager in Millennium pointed out that c++ syntax has too many back doors and is too “implicit”. Reading a piece of code you don’t know what it does, unless you have lots of experience/knowledge about all the “back-doors”.

[[safeC++]]assertion technique(versatile), illustrated with NPE

Update: how to add a custom error message to assert — https://stackoverflow.com/questions/3692954/add-custom-messages-in-assert

This is a thin book. Concise and practical (rather than academic) guidelines.

#1 guideline: enlist compiler to catch as many errors as possible.

However, some errors will pass compiler and only happen at run-time. Unlike on P11, I will treat programmer errors and other run-time errors alike – we need an improvement over the “standard” outcome which is UB (undefined behavior) and we may or may not see any error message anywhere.

#2 guideline: In [[safeC++]], such an improvement is offered in the form of assertions, in other words, run-time checks. The author gives them a more accurate name “diagnostics”.

2.1) now the outcome is guaranteed termination, rather than the undefined behavior.
2.2) now there’s always an error message + a stack trace. Now 2.2) sounds like non-trivial improvement. Too good to be true? The author is a practicing programmer in a hedge fund so I hope his ideas are real-world.

Simplest yet realistic example of #2 is NPE (i.e. null pointer deref). NPE is UB and could (always? probably not) crash. I doubt there’s even an error message. Now with a truly simple “wrapper” presented on P53-54, an NPE could be diagnosed __in_time__ and an fatal exception thrown, so program is guaranteed to terminate, with an error message + stack trace.

Like a custom new/delete (to record allocations), here we replace the raw pointer with a wrapper. There we see a pattern where we replace builtin c++ constructs with our wrappers to avoid UB and get run time diagnostics —

$ this wrapper is actually a simple smart ptr
$ traditional smart ptr templates
$ custom new, delete
$ vector
$ Int class replacing int data type

The key concepts —
% assertion
% diagnostics
% run time

Q: Can every UB condition be diagnosed this way? Not sure, but the most common ones seem to be.

[[safeC++]] – concise, pragmatic, unconventional wisdom

First off, this is a 120-page thin book, including about 30 pages of source code in the appendices. Light-weight, concise. Very rare.

I feel the author is bold to advocate avoidance of popular c++ features such as
– “Avoid using pointer arithmetic at all.”
– For class fields, avoid built-in types like int. Use Int type — no need to initialize.
– “Use the new operator only without bracket”. Prefer Vector to new[]
– “Whenever possible, avoid writing copy ctor and assignment operators for your class”

I feel these suggestions are similar to my NPE tactics in java. Unconventional wisdom, steeped in a realistic/pessimistic view of human fallibility, rather tedious, all about ….low-level details.

Amidst so many books on architecture and software design, I find this book so distinctive and it speaks directly to me — a low-level detailed programmer.

I feel this programmer has figured out the true cost/benefit of many c++ features, through real experience. Other veterans may object to his unconventional wisdom, but I feel there’s no point proving or convincing. A lot of best practices [1] are carefully put aside by veterans, often because they know the risks and justifications. These veterans would present robust justifications for their deviation — debatable but not groundless.

[1] like “avoid global variables and gotos”

–finance
Given the author’s role as a quant developer I believe all of the specific issues raised are relevant to financial applications. When you read about some uncommon issue (examples in [1]), you are right to question if it’s really important to embedded, or telecom, or mainframe domains, but it is certainly relevant to finance.

Incidentally, most of the observations, suggestions are tested on MSVS.

–assert, smartPtr…
I like the sample code. Boost smart ptr is too big to hack. The code here is pocket-sized, even bite-sized, and digestible and customizable. I have not seen any industrial strength smart pointer so simple.

–templates
The sample code provided qualify as library code, and therefore uses some simple template techniques. Good illustration of template techniques used in finance.

[1] for eg the runtime cost of allocating the integer ref count on P49; or the date class.

Any negative?

Simple, clean, pure Multiple Inheritance@@

Update — Google style guide is strict on MI, but has a special exception on Windows.

MI can be safe and clean —

#1) avoid the diamond. Diamond is such a mess. I’d say don’t assume virtual base class is a vaccine

#2) make base classes imitate java interface … This is one proven way to use MI. Rememer Barcalys FI team. All pure virtual methods, No field, No big4 except empty virtual dtor.

#2a) Deviation: java8 added default methods to interfaces

#2b) Deviation: c++ private inheritance from one concrete base class , suggested in [[effC++]]

#3) simple, minimal, low-interference base classes. Say the 2 base classes are completely unrelated, and each has only 1 virtual method. Any real use case? I can’t think of any but when this situation arises i feel we should use MI with confidence and caution. Similarly “goto” could be put to good use once in a blue moon.

g++ frontends ^ JVM/CLR bytecode

Have you ever wondered how gcc can compile so many languages. We know various dotnet languages all compile to the same IL code; JVM similar. Now I know that gcc uses a similar technique —

gcc has a frontend for java
gcc has a frontend for ObjectiveC
gcc has a frontend for c++
gcc has a frontend for Fortran

All of them produce the same intermediate code that feeds the true compiler, which produces assembly code.

c++ uninitialized "static" objects ^ stackVar

By “static” I mean global variables or local static variables. These are Objects with addresses, not mere symbols in source code. Note Some static objects are _implicitly_ static — P221 effC++.

Rule 1: uninitialized local and class fields of primitive types (char, float…) aren’t automatically initialized. See [[programming]] by Stroustrup.

Rule 2: uninitialized local or file-scope statics are automatically initialized to 0 bits at a very early stage of program loading. See https://stackoverflow.com/questions/1597405/what-happens-to-a-declared-uninitialized-variable-in-c-does-it-have-a-value

Rule 3: all class instances are “initialized”, either explicitly, or implicitly via default ctor. Beware… Reconsider Rule 1 — is a field of primitive type of the class initialized? I don’t think so. Therefore, “initialized” means a ctor is called on the new-born instance, but not all fields therein are necessarily initialized. I’d say the ctor can simply ignore a primitive-typed field.

Uninitialized static Objects (stored in BSS segment) don’t take up space in object file. Also, by grouping all the symbols that are not explicitly initialized together, they can be easily zeroed out at once. See
http://stackoverflow.com/questions/9535250/why-is-the-bss-segment-required

Any static Object explicitly initialized by programmer is considered an “initialized” static object and doesn’t go into BSS.

P136 [[understanding and using c pointers]] uses a real example to confirm that a pointer field in a C struct is uninitialized. C has no ctor!

P261 [[programming]] by Stroustrup has a half-pager summary
* globals are default-initialized
* locals and fields are truly uninitialized …
** … unless the data type is a custom class having a default ctor. In that case, you can safely declare the variable without initialization, and content will be a pre-set value.