IV Q: can U implement op= using copier #swap+RAII

A 2010 interviewer asked

Q: do you know any technique to implement the assignment operator using an existing copy ctor?
A: Yes. See P100 [[c++codingStd]] There are multiple learning points:

  • RAII works with a class owning some heap resource.
  • RAII works with a local stack object owning a heapy thingy
  • RAII provides a much-needed exception safety. For the exception guarantee to hold, std::swap and operator delete should never throw, as echoed in my books.
  • I used to think only half the swap (i.e. transfer-in) is needed. Now I know the transfer-out on the old resource is more important. It guarantees the old resource is released even-if hitting exceptions, thanks to RAII.
  • The std::swap() trick needed here is powerful because there’s a heap pointer field. Without this field, I don’t think std::swap will be relevant.
  • self-assignment check — not required as it is rare and tolerable
  • Efficiency — considered highly efficient. The same swap-based op= is used extensively in the standard library.
  • idiomatic — this implementation of operator= for a resource-owning class is considered idiomatic, due to simplicity, safety and efficiency
C & operator=(C const & rhs){
  C localCopy(rhs); //not needed for move-assignment
  std::swap(rhs.heapResource, this->heapResource);
  return *this;
}
C & operator=(C const && rhs){ //move-assignment
  std::swap(rhs.heapResource, this->heapResource);
  return *this;
}
Advertisements

c++debug^release build can modify app behavior #IV

This was actually asked in an interview, but it’s also good GTD knowledge.

https://stackoverflow.com/questions/4012498/what-to-do-if-debug-runs-fine-but-release-crashes points out —

  • fewer uninitialized variables — Debug mode is more forgiving because it is often configured to initialize variables that have not been explicitly initialized.
    • For example, Perhaps you’re deleting an uninitialized pointer. In debug mode it works because pointer was nulled and delete ptr will be ok on NULL. On release it’s some rubbish, then delete ptr will actually cause a problem.

https://stackoverflow.com/questions/186237/program-only-crashes-as-release-build-how-to-debug points out —

  • guard bytes on the stack frame– The debugger puts more on the stack, so you’re less likely to overwrite something important.

I had frequent experience reading/writing beyond an array limit.

https://stackoverflow.com/questions/312312/what-are-some-reasons-a-release-build-would-run-differently-than-a-debug-build?rq=1 points out —

  • relative timing between operations is changed by debug build, leading to race conditions

Echoed on P260 [[art of concurrency]] which says (in theory) it’s possible to hit threading error with optimization and no such error without optimization, which represents a bug in the compiler.

P75 [[moving from c to c++]] hints that compiler optimization may lead to “critical bugs” but I don’t think so.

  • poor use of assert can have side effect on debug build. Release build always turns off all assertions as the assertion failure messages are always unwelcome.

[[safeC++]] discourages implicit conversion via OOC/cvctor

See other posts about OOC and cvctor. I am now convinced by [[safeC++]] that it’s better to avoid both. Instead, Use AsXXX() method if converting from YYY to XXX is needed. Reason is type safety. In an assignment (including function input/output), it is slightly hacky if LHS is NOT a base type of RHS. Implicit conversion is like Subversion of compiler’s type enforcement — Given a function declared as f(XXX), it should ideally be illegal to pass in a YYY. However, The implicit converters break the clean rule, from the back door.

As explained concisely on P8 [[safeC++]], The OOC is provided specifically to support implicit conversion. In comparison, The cvctor is more likely to be a careless mistake if without “explicit”.

Favor explicit conversion rather than implicit conversion. Some manager in Millennium pointed out that c++ syntax has too many back doors and is too “implicit”. Reading a piece of code you don’t know what it does, unless you have lots of experience/knowledge about all the “back-doors”.

[[safeC++]]assertion technique(versatile), illustrated with NPE

Update: how to add a custom error message to assert — https://stackoverflow.com/questions/3692954/add-custom-messages-in-assert

This is a thin book. Concise and practical (rather than academic) guidelines.

#1 guideline: enlist compiler to catch as many errors as possible.

However, some errors will pass compiler and only happen at run-time. Unlike on P11, I will treat programmer errors and other run-time errors alike – we need an improvement over the “standard” outcome which is UB (undefined behavior) and we may or may not see any error message anywhere.

#2 guideline: In [[safeC++]], such an improvement is offered in the form of assertions, in other words, run-time checks. The author gives them a more accurate name “diagnostics”.

2.1) now the outcome is guaranteed termination, rather than the undefined behavior.
2.2) now there’s always an error message + a stack trace. Now 2.2) sounds like non-trivial improvement. Too good to be true? The author is a practicing programmer in a hedge fund so I hope his ideas are real-world.

Simplest yet realistic example of #2 is NPE (i.e. null pointer deref). NPE is UB and could (always? probably not) crash. I doubt there’s even an error message. Now with a truly simple “wrapper” presented on P53-54, an NPE could be diagnosed __in_time__ and an fatal exception thrown, so program is guaranteed to terminate, with an error message + stack trace.

Like a custom new/delete (to record allocations), here we replace the raw pointer with a wrapper. There we see a pattern where we replace builtin c++ constructs with our wrappers to avoid UB and get run time diagnostics —

$ this wrapper is actually a simple smart ptr
$ traditional smart ptr templates
$ custom new, delete
$ vector
$ Int class replacing int data type

The key concepts —
% assertion
% diagnostics
% run time

Q: Can every UB condition be diagnosed this way? Not sure, but the most common ones seem to be.

[[safeC++]] – concise, pragmatic, unconventional wisdom

First off, this is a 120-page thin book, including about 30 pages of source code in the appendices. Light-weight, concise. Very rare.

I feel the author is bold to advocate avoidance of popular c++ features such as
– “Avoid using pointer arithmetic at all.”
– For class fields, avoid built-in types like int. Use Int type — no need to initialize.
– “Use the new operator only without bracket”. Prefer Vector to new[]
– “Whenever possible, avoid writing copy ctor and assignment operators for your class”

I feel these suggestions are similar to my NPE tactics in java. Unconventional wisdom, steeped in a realistic/pessimistic view of human fallibility, rather tedious, all about ….low-level details.

Amidst so many books on architecture and software design, I find this book so distinctive and it speaks directly to me — a low-level detailed programmer.

I feel this programmer has figured out the true cost/benefit of many c++ features, through real experience. Other veterans may object to his unconventional wisdom, but I feel there’s no point proving or convincing. A lot of best practices [1] are carefully put aside by veterans, often because they know the risks and justifications. These veterans would present robust justifications for their deviation — debatable but not groundless.

[1] like “avoid global variables and gotos”

–finance
Given the author’s role as a quant developer I believe all of the specific issues raised are relevant to financial applications. When you read about some uncommon issue (examples in [1]), you are right to question if it’s really important to embedded, or telecom, or mainframe domains, but it is certainly relevant to finance.

Incidentally, most of the observations, suggestions are tested on MSVS.

–assert, smartPtr…
I like the sample code. Boost smart ptr is too big to hack. The code here is pocket-sized, even bite-sized, and digestible and customizable. I have not seen any industrial strength smart pointer so simple.

–templates
The sample code provided qualify as library code, and therefore uses some simple template techniques. Good illustration of template techniques used in finance.

[1] for eg the runtime cost of allocating the integer ref count on P49; or the date class.

Any negative?

Simple, clean, pure Multiple Inheritance@@

Update — Google style guide is strict on MI, but has a special exception on Windows.

MI can be safe and clean —

#1) avoid the diamond. Diamond is such a mess. I’d say don’t assume virtual base class is a vaccine

#2) make base classes imitate java interface … This is one proven way to use MI. Rememer Barcalys FI team. All pure virtual methods, No field, No big4 except empty virtual dtor.

#2a) Deviation: java8 added default methods to interfaces

#2b) Deviation: c++ private inheritance from one concrete base class , suggested in [[effC++]]

#3) simple, minimal, low-interference base classes. Say the 2 base classes are completely unrelated, and each has only 1 virtual method. Any real use case? I can’t think of any but when this situation arises i feel we should use MI with confidence and caution. Similarly “goto” could be put to good use once in a blue moon.

g++ frontends ^ JVM/CLR bytecode

Have you ever wondered how gcc can compile so many languages. We know various dotnet languages all compile to the same IL code; JVM similar. Now I know that gcc uses a similar technique —

gcc has a frontend for java
gcc has a frontend for ObjectiveC
gcc has a frontend for c++
gcc has a frontend for Fortran

All of them produce the same intermediate code that feeds the true compiler, which produces assembly code.

c++ uninitialized "static" objects ^ stackVar

By “static” I mean global variables or local static variables. These are Objects with addresses, not mere symbols in source code. Note Some static objects are _implicitly_ static — P221 effC++.

Rule 1: uninitialized local and class fields of primitive types (char, float…) aren’t automatically initialized. See [[programming]] by Stroustrup.

Rule 2: uninitialized local or file-scope statics are automatically initialized to 0 bits at a very early stage of program loading. See https://stackoverflow.com/questions/1597405/what-happens-to-a-declared-uninitialized-variable-in-c-does-it-have-a-value

Rule 3: all class instances are “initialized”, either explicitly, or implicitly via default ctor. Beware… Reconsider Rule 1 — is a field of primitive type of the class initialized? I don’t think so. Therefore, “initialized” means a ctor is called on the new-born instance, but not all fields therein are necessarily initialized. I’d say the ctor can simply ignore a primitive-typed field.

Uninitialized static Objects (stored in BSS segment) don’t take up space in object file. Also, by grouping all the symbols that are not explicitly initialized together, they can be easily zeroed out at once. See
http://stackoverflow.com/questions/9535250/why-is-the-bss-segment-required

Any static Object explicitly initialized by programmer is considered an “initialized” static object and doesn’t go into BSS.

P136 [[understanding and using c pointers]] uses a real example to confirm that a pointer field in a C struct is uninitialized. C has no ctor!

P261 [[programming]] by Stroustrup has a half-pager summary
* globals are default-initialized
* locals and fields are truly uninitialized …
** … unless the data type is a custom class having a default ctor. In that case, you can safely declare the variable without initialization, and content will be a pre-set value.

throwing dtor: %%justified use cases

Status — still I don’t have a well-justified use case.

I feel throwing dtor is frowned upon but not “illegal”. In practice it’s seldom done by design. When used, it’s not worth the analysis. In interviews, however, this receives disproportionate attention.

However, in practice, there are reasons to break the rule. Suppose I have

int i1, i2;// temp variables,
try{
  i1 = myobj->eat();
  myobj->drink(&i1);
  //....
  delete myobj;
}catch(business_exception & be){
  //handle exception using the temp variables i1, i2 etc
}

Since eat(), drink() etc and dtor all throw the same business_exception, this code is clean and maintainable. If we need to throw more than one exception type from those 3 functions, we can easily add the code. The same exception handler is used as a catch-all.

It would be messy to pass the temp variables i1, i2 etc into myobj dtor and replicate the same exception-handling logic therein.

So in this case, I’d make myobj dtor throw business_exception.

Now, as described in [[moreEffC++]] myobj dtor is invoked as part of stack unwinding due to another exception? [1] Well, in this case, I know that’s a fatal scenario and I do want system to crash anyway, like an assertion error, so the terminate() behavior is not unacceptable.

In other words, myobj’s class is written such that its dtor should throw exception only under normal object destruction and should never be part of an exceptional stack unwinding. In such a case, no one should misuse this class in an exception-unsafe context. If they ignore the restrictions on this class, they could get this dtor invoked as part of an exceptional stack unwinding, and the consequence is something they must deal with.

[1] in c++11, system will trigger std::terminate() whether or not this is part of unwinding. See https://akrzemi1.wordpress.com/2011/09/21/destructors-that-throw/

fail fast – fundamental and Practical principle

Fail-fast is one of those low-level coding habits that deserve its place among the valuable habits to be adopted by practicing app developers in industry. In contrast, library developers (including open-source authors) generally adopt fail-fast by default.

Principle — prefer crashing the entire program. Don’t keep going. Don’t hope the dubious condition will be tolerated. Don’t hope the corrupted data will be left alone and left untouched. Sooner or later what you fear will happen. In such a case it can be very hard to find the root cause. The crash site might be far away from the buggy code.

Example — when dealing with DAM issues, there are specific tools to make the program crash as soon as detected. I think they intercept, replace or integrate with malloc/free.

Example — if null pointer can cause problem then check as early as possible. In java and c#, a lot of error messages simply say null pointer (like “unknown exception”) . It can take hours to find out the real cause.

Example — fail-fast iterators.

static object initialization order #lazy singletons

Java’s lazy singleton has a close relative among C++ idioms, comparable in popularity and implementation.

Basically, local statics are initialized upon first use, exactly. Other static Objects are initialized “sometime” before main() starts. See the Item on c vs c++ in [[More eff c++]] and P222 [[EffC++]].

In real life, this’s hard to control — a known problem but with standard solutions — something like a lazy singleton. See P 170 C++FAQ. Also addressed in [[EffC++]] P221.

global free func^free func]a namespace^static method

In java there are static methods and nothing else. In c there are “functions” and nothing else. In c++ there are 3 alternatives.

1) static methods — like java
** unlike the free functions, static methods can be private or protected
** can access private static fields of host class
** unlike the free functions, static methods are inherited. See Operator-new in [[eff c++]]
** usually don’t need name space prefix

2) GLOBAL free func /operators — vestige of C syntax
3) free func in a “package” like a boost library or STL, typically organized into namespaces. Better than global

In quick and dirty applications (not libraries), you see lots of global free functions.