5 constructs: c++implicit singletons

#1 most implicit singleton in c++ is the ubiquitous “file-scope variable”. Extremely common in my projects.

  • — The constructs below are less implicit as they all use some explicit keyword to highlight the programmer’s intent
  • keyword “extern” — file-scope variable with extern
    • I seldom need it and don’t feel the need to remember the the details.. see other blogposts
  • keyword “static” — file-scope static variables
  • keyword “static” within function body — local static variables — have nice feature of predictable timing of initializaiton
  • keyword “static” within a class declaration —  static field

~~~~~~  The above are the 5 implicit singleton constructs ~~~~~~

Aha — it’s useful to recognize that when a data type is instantiated many many times i.e. non-singleton usage, it is usually part of a collection, or a local (stack) variable.

Sometimes we have the ambiguous situation where we use one of the constructs above, but we instantiate multiple instances of the class. It’s best to document the purpose like “instance1 for …; instance2 for …”

shared vars across files: prefer static field

When I have state to maintain and share across compilation units, there are basically three main types of variables I can create. (Non-static Local variables are simple and don’t maintain state.)

  1. nonstatic field of a singleton class — Note any user of this variable need a reference to the single object 😦
  2. file scope var in a *.cpp  — relatively simple usage, but don’t put in a shared header , as explained in global^file-scope variables]c++
  3. public static fields — most versatile design. Any user can access it after they #include the class header file.
  4. — non-contenders
  5. local static variables — (niche usage) You can create a local static var in myfunc(). To share the variable across compilation units, myfunc() can return a reference to this object, so from anywhere you can use the return value of myfunc(). This is a simple for of singleton.
  6. global variables — monster. Probably involves “extern”. See my other blogposts

The advantages of static field is often applicable to static methods too.

In fact, java leaves you with nothing but this choice, because this choice is versatile. Java has no “local static”, no file-scope, no global variables.

Y allocate static field in .c file %%take

why do we have to define static field myStaticInt in a cpp file?

For a non-static field myInt, the allocation happens when the class instance is allocated on stack, on heap (with new()) or in global area.

However, myStaticInt isn’t take care of. It’s not on the real estate of the new instance. That’s why we need to declare it in the class header, and then define it exactly once (ODR) in a cpp file. It is allocated at compile time — static allocation.

c++static field init: rules

See also post on extern…

These rules are mostly based on [[c++primer]], about static Field, not local statics or file-scope static variables.

Rule 1 (the “Once” rule) — init must appear AND execute exactly once for each static field.

In my Ticker Plant xtap experience, the static field definition crucially sets aside storage for the static field. The initial value is often a dummy value.

Corollary: avoid doing init in header files, which is often included multiple times. See exception below.

Rule 2 (the “Twice” rule) — static field Must (See exception below) be DECLARED in the class definition block, and also DEFINED outside. Therefore, the same variable is “specified” exactly twice [1]. However, the run time would “see” the declaration multiple times if it’s included in multiple places.

Corollary: always illegal to init a static field at both declaration and definition.

[1] Note ‘static’ keyword should be at declaration not definition. Ditto for static methods. See P117 [[essential c++]]

The Exception — static integer constant Fields are special, and can be initialized in 2 ways
* at declaration. You don’t define it again.
* at definition, outside the class. In this case, declaration would NOT initialize — Rule 1

The exception is specifically for static integer constant field:

  • if non-const, then you can only initialize it in ctor
  • if non-integer static, then you need to define it outside
  • if non-const static, then ditto
  • if not a field, then a different set of rules apply.

Rule 3: For all other static fields, init MUST be at-definition, outside the class body.

Therefore, it’s simpler to follow Rule 3 for all static fields including integer constants, though other people’s code are beyond my control.

——Here’s an email I sent about the Exception —–
It turned these are namespace variables, not member variables.

Re: removing “const” from static member variables like EXCHANGE_ID_L1

Hi Dilip,

I believe you need to define such a variable in the *.C file as soon as you remove the “const” keyword.

I just read online that “integer const” static member variables are special — they can be initialized at declaration, in the header file. All other static member variables must be declared in header and then defined in the *.C file.

Since you will overwrite those EXCHANE_ID_* member variables, they are no longer const, and they need to be defined in Parser.C.

private-header^shared-header

In our discussions on ODR, global variables, file-scope static variables, global functions … the concept of “shared header” is often misunderstood.

  • If a header is only included in one *.cpp, then its content is effectively part of a *.cpp.

Therefore, you may experiment by putting “wrong” things in such a private header and the set-up may work or fail, but it’s an invalid test. Your test is basically putting those “wrong” things in an implementation file!

 

bbg-FX: check if a binary tree is BST #no deep recursion

Binary search tree has an “ascending” property — any node in my left sub-tree are smaller than me.

Q1: given a binary tree, check if it’s a BST. (No performance optimization required.) Compile and run the program.

Suppose someone tells you that the lastSeen variable can start with a random (high) initial value, what kind of test tree would flush out the bug? My solution is below, but let’s Look at Q1b.

Q1b: what if the tree could be deep so you can’t use recursion?
A: use BFT. When we actually “print” each node, we check left/right child nodes.


I made a mistake with lastSeen initial value. There’s no “special” initial value to represent “uninitialized”. Therefore I added a boolean flag.

Contrary to the interviewer’s claim, local statics are automatically initialized to zerohttps://stackoverflow.com/questions/1597405/what-happens-to-a-declared-uninitialized-variable-in-c-does-it-have-a-value, but zero or any initial value is unsuitable (payload can be any large negative value), so we still need the flag.

#include <iostream>
#include <climits>
using namespace std;

struct Node {
    int data;
    Node *le, *ri;
    Node(int x, Node * left = NULL, Node * right = NULL) : data(x), le(left), ri(right){}
};
/*    5
  2       7
 1 a
*/
Node _7(7);
Node _a(6);
Node _1(1);
Node _2(2, &_1, &_a);
Node _5(5, &_2, &_7); //root

bool isAscending = true; //first assume this tree is BST
void recur(Node * n){
//static int lastSeen; // I don't know what initial value can safely represent a special value

  //simulate a random but unlucky initial value, which can break us if without isFirstNodeDone
  static int lastSeen = INT_MAX; //simulate a random but unlucky initial value
  static bool isFirstNodeDone=false; //should initialize to false

  if (!n) return;
  if (n->le) recur(n->le);

  // used to be open(Node *):
  if (!isAscending) return; //check before opening any node
  cout<<"opening "<<n->data<<endl; if (!isFirstNodeDone){ isFirstNodeDone = true; }else if (lastSeen > n->data){
        isAscending=false;
        return; //early return to save time
  }
  lastSeen = n->data;

  if (n->ri) recur(n->ri);
}

int main(){
  recur(&_5);
  cout<< (isAscending?"ok":"nok");
}

concurrent lazy singleton using static-local var#c++

As explained in another blog post, static local is a shared mutable, but fortunately initialization is thread-safe:

“If multiple threads attempt to initialize the same static local variable concurrently, the initialization occurs exactly once (similar behavior can be obtained for arbitrary functions with std::call_once).” — https://en.cppreference.com/w/cpp/language/storage_duration

https://stackoverflow.com/questions/26013650/threadsafe-lazy-initialization-static-vs-stdcall-once-vs-double-checked-locki has 20 upvotes and looks substantiated. It also considers double-checking, std::call_once, atomics, CAS…

GCC uses platform-specific tricks to ensure a static local variable is initialized only once, on first use. 

Conclusion — On GCC and other compilers, this is the easiest solution for a concurrent lazy singleton.

 

non-local^local static object initialization ] c++

Regarding when such objects are initialized, there are simple rules as illustrated in 2 simple yet concurrent singleton implementations ] c++The two singleton implementations each use one of the 2 types.

The rules:

  1. lazy init — local statics are initialized in the first function call. Only one thread would initialize it, never concurrently on two threads. See concurrent lazy singleton using static-local var
  2. eager init — non-local statics are initialized before main(), on one single thread, but the sequence among them is non-deterministic, as explained by Scott Meyers.
  3. Both are kind of thread-safe

ODR@functions # and classes

Warning – ODR is never quizzed in IV. A weekend coding test might touch on it but we can usually circumvent it.

OneDefinitionRule is more strict on global variables (which have static duration). You can’t have 2 global variables sharing the same name. Devil is in the details:

(As explained in various posts, you declare the same global variable in a header file that’s included in various compilation units, but you allocate storage in exactly one compilation unit. Under a temporary suspension of disbelief, let’s say there are 2 allocated storage for the same global var, how would you update this variable?)

With free function f1(), ODR is more relaxed. http://www.drdobbs.com/cpp/blundering-into-the-one-definition-rule/240166489 (possibly buggy) explains the Lessor ODR vs Greater ODR. Lessor ODR is simpler and more familiar, forbidding multiple (same or different) definitions of f1() within one compilation unit.

My real focus today is the Greater ODR. Obeying Lessor ODR, the same static or inline function is often included via a header file and compiled into multiple binary files. If you want to put non-template free function definition in a shared header file but avoid Great ODR, then it must be static or inline, implicitly or explicitly. I find the Dr Dobbs article unclear on this point — In my test, when a free function was defined in a shared header without  “static” or “inline” keywords, then linker screams “ODR!”

The most common practice is to move function definitions out of shared headers, so the linker (or compiler) sees only one definition globally.

With inline, Linker actually sees multiple (hopefully identical) physical copies of func1(). Two copies of this function are usually identical definitions. If they actually have different definitions, compiler/linker can’t easily notice and are not required to verify, so no build error (!) but you could hit strange run time errors.

Java linker is simpler and never cause any problem so I never look into it.

//* if I have exactly one inline, then the non-inlined version is used. 
// Linker doesn't detect the discrepancy between two implementations.
//* if I have both inline, then main.cpp won't compile since both definitions 
// are invisible to main.cpp
//* if I remove both inline, then we hit ODR 
//* objdump on the object file would show the function name 
// IFF it is exposed i.e. non-inline
::::::::::::::
lib1.cpp
::::::::::::::
#include &amp;lt;iostream&amp;gt;
using namespace std;

//inline
void sameFunc(){
    cout&amp;lt;&amp;lt;"hi"&amp;lt;&amp;lt;endl;
}
::::::::::::::
lib2.cpp
::::::::::::::
#include &amp;lt;iostream&amp;gt;
using namespace std;

inline
void sameFunc(){
    cout&amp;lt;&amp;lt;"hey"&amp;lt;&amp;lt;endl;
}
::::::::::::::
main.cpp
::::::::::::::
void sameFunc(); //needed
int main(){
  sameFunc();
}

 

c++variables: !! always objects

Every variable that holds data is an object. Objects are created either with static duration (sometimes by defining rather than declaring the variable), with automatic duration (declaration alone) or with dynamic duration via malloc().

That’s the short version. Here’s the long version:

  • heap objects — have no name, no host variable, no door plate. They only have addresses. The address could be saved in a “pointer-object”, which is a can of worm.
    • In many cases, this heap address is passed around without any pointer object.
  • stack variables (including function parameters) — each stack object has a name (multiple possible?) i.e. the host variable name, like a door plate on the memory location. This memory is allocated when the stack frame is created. When you clone a stack variable you get a cloned object.
    • Advanced — You could create a reference to the stack object, when you pass the host variable by-reference into another function. However, you should never return a stack variable by reference.
  • static Locals — the name myStaticLocal is a door plate on the memory location. This memory is allocated the first time this function is run. You can return a reference to myStaticLocal.
  • file-scope static objects — memory is allocated at program initialization, but if you have many of them scattered in many files, their order of initialization is unpredictable. The name myStaticVar is a door plate on that memory location, but this name is visible only within this /compilation-unit/. If you declare and define it (in one step!) in a shared header file (bad practice) then you get multiple instances of it:(
  • extern static objects — Again, you declare and define it in one step, in one file — ODR. All other compilation units  would need an extern declaration. An extern declaration doesn’t define storage for this object:)
  • static fields — are tricky. The variable name is there after you declare it, but it is a door plate without a door. It only becomes a door plate on a storage location when you allocate storage i.e. create the object by “defining” the host variable. There’s also one-definition-rule (ODR) for static fields, so you first declare the field without defining it, then you define it elsewhere. See https://bintanvictor.wordpress.com/2017/05/30/declared-but-undefined-variable-in-c/

Note: thread_local is a fourth storage duration, after 1) dynamic, 2) automatic and 3) static

C++build error: declared but undefined variable

I sometimes declare a static field in a header, but fail to define it (i.e. give it storage). It compiles fine and may even link successfully. When you run the executable, you may hit

error loading library /home/nysemkt_integrated_parser.so: undefined symbol: _ZN14arcabookparser6Parser19m_srcDescriptionTknE

Note this is a shared library.
Note the field name is mangled. You can un-mangle it using c++filt:

c++filt _ZN14arcabookparser6Parser19m_srcDescriptionTknE -> arcabookparser::Parser::m_srcDescriptionTkn

According to Deepak Gulati, the binary files only contain mangled names. The linker and all subsequent programs deal exclusively with mangled names.

If you don’t use this field, the undefined variable actually will not bother you! I think the compiler just ignores it.

c++class field defined]header,but global vars obey ODR

Let’s put function declaration/definition aside — simpler.

Let’s put aside local static/non-static variables — different story.

Let’s put aside function parameters. They are like local variables.

The word “static” is heavily overloaded and confusing. I will try to avoid it as far as possible.

The tricky/confusing categories are

  • category: static field. Most complex and better discussed in a dedicated post — See https://bintanvictor.wordpress.com/2017/02/07/c-static-field-init-basic-rules/
  • category: file-scope var — i.e. those non-local vars with “static” modifier
  • category: global var declaration — using “extern”
    • definition of the same var — without “extern” or “static”
  • category: non-static class field, same as the classic C struct field <– the main topic in the post. This one is not about declaration/definition of a variable with storage. Instead, this is defining a type!

I assume you can tell a variable declaration vs a variable definition. Our intuition is usually right.

The Aha — [2] pointed out — A struct field listing is merely describing what constitutes a struct type, without actually declaring the existence of any variables, anything to be constructed in memory, anything addressable. Therefore, this listing is more like a integer variable declaration than a definition!

Q: So when is the memory allocated for this field?
A: when you allocate memory for an instance of this struct. The instance then becomes an object in memory. The field also becomes a sub-object.

Main purpose to keep struct definition in header — compiler need to calculate size of the struct. Completely different purpose from function or object declarations in headers. Scott Meyers discussed this in-depth along with class fwd declaration and pimpl.

See also

global^file-scope variables]c++ #extern

(Needed in coding tests and GTD, not in QnA interviews.)

Any object declared outside a block has “static duration” which means (see MSDN) “allocated at compile time not run time”

“Global” means extern linkage i.e. visible from other files. You globalize a variable by removing “static” modifier if any.

http://stackoverflow.com/questions/14349877/static-global-variables-in-c explains the 2+1 variations of non-local object. I have added my own variations:

  • A single “static” variable in a *.cpp file. “File-scope” means internal linkage i.e. visible only within the file. You make a variable file-scope by adding “static”.
  • an extern (i.e. global mutable) variable — I used this many times but still don’t understand all the rules. Basically in one *.cpp it’s defined, without “static” or “extern”. In other *.cpp files, it’s declared (via header) extern without a initial value.
  • A constant can be declared and defined in one shot as a static (i.e. file-scope) const. No need for extern and separate definition.
  • confusing — if you (use a shared header to) declare the same variable “static” in 2 *.cpp files, then each file gets a distinct file-scope mutable variable of the same name. Nonsense.

https://msdn.microsoft.com/en-us/library/s1sb61xd.aspx

http://en.wikipedia.org/wiki/Global_variable#C_and_C.2B.2B says “Note that not specifying static is the same as specifying extern: the default is external linkage” but I doubt it.

I guess there are 3 forms:

  • static double — file-scope, probably not related to “extern”
  • extern double — global declaration of a var already Defined in a single file somewhere else
  • double (without any modifier) — the single definition of a global var. Will break ODR if in a shared header

Note there’s no special rule about “const int”. The special rule is about const int static FIELD.

//--------- shared.h ---------
#include 
#include 
void modify();

extern std::string global; //declaration without storage allocation
static const int fileScopeConst1=3; //each implementation file gets a distinct copy of this const object
static double fileScopeMutable=9.8; //each implementation file gets a distinct copy of this mutable object
//double var3=1.23; //storage allocation. breaks compiler due to ODR!

// ------ define.C --------
#include "shared.h"
using namespace std;
string global("defaultValue"); //storage allocation + initialization
int main(){
  cout<<"addr of file scope const is "<<&fileScopeConst1<<std::endl;
  cout<<"addr of global var is "<<&global<<std::endl;
  cout<<"before modify(), global = "<<global<< "; double = "<<fileScopeMutable<<endl;
  modify();
  cout<<"after modify(), global = "<<global<< "; double (my private copy) = "<<fileScopeMutable<<endl;
}
// ------ modify.C --------
#include "shared.h"
void modify(){
  global = "2ndValue";
  fileScopeMutable = 700;
  std::cout<<"in modify(), double (my private copy) = "<<fileScopeMutable<<std::endl;
  std::cout<<"in modify(), addr of file scope const is "<<&fileScopeConst1<<std::endl;
  std::cout<<"in modify(), addr of global var is "<<&global<<std::endl;
}

c++ uninitialized "static" objects ^ stackVar

By “static” I mean global variables or local static variables. These are Objects with addresses, not mere symbols in source code. Note Some static objects are _implicitly_ static — P221 effC++.

Rule 1: uninitialized local and class fields of primitive types (char, float…) aren’t automatically initialized. See [[programming]] by Stroustrup.

Rule 2: uninitialized local or file-scope statics are automatically initialized to 0 bits at a very early stage of program loading. See https://stackoverflow.com/questions/1597405/what-happens-to-a-declared-uninitialized-variable-in-c-does-it-have-a-value

Rule 3: all class instances are “initialized”, either explicitly, or implicitly via default ctor. Beware… Reconsider Rule 1 — is a field of primitive type of the class initialized? I don’t think so. Therefore, “initialized” means a ctor is called on the new-born instance, but not all fields therein are necessarily initialized. I’d say the ctor can simply ignore a primitive-typed field.

Uninitialized static Objects (stored in BSS segment) don’t take up space in object file. Also, by grouping all the symbols that are not explicitly initialized together, they can be easily zeroed out at once. See
http://stackoverflow.com/questions/9535250/why-is-the-bss-segment-required

Any static Object explicitly initialized by programmer is considered an “initialized” static object and doesn’t go into BSS.

P136 [[understanding and using c pointers]] uses a real example to confirm that a pointer field in a C struct is uninitialized. C has no ctor!

P261 [[programming]] by Stroustrup has a half-pager summary
* globals are default-initialized
* locals and fields are truly uninitialized …
** … unless the data type is a custom class having a default ctor. In that case, you can safely declare the variable without initialization, and content will be a pre-set value.

func^var in header files

Many authors describe a single set of rules for func declaraion vs variable declaration in header files. However, for some beginners, it might make more sense to assume the rules are largely independent and unrelated between func vs var.

Note I will use “include file” and “header file” interchangeably.

The word “static” is heavily overloaded and confusing. I will try to avoid it as far as possible.

(Perhaps because function in header files are implicit global) We are more familiar with the rules on functions
– Rule: each some.cpp file that uses a shared func1() must have func1 “declared”. Best done via include file.
– Rule: across all the .cpp files, there must be exactly 1 definition of func1, otherwise linker complains. Consequently, this must go into a non-include file and seen once only by the linker.
– exception: implicit inline func definition can be included in headers
– exception: file-scope static func definition can be included in headers

——global variables——-
Let’s start with a big Backgrounder (because it’s rather confusing) — Let’s ignore fields of classes/structs and just focus on classic C variables. The most common and most simple variables are local to a function — function-scope static/non-static variables. The other variables are essentially static-duration variables[1]. A simple type of non-local variable is a file-scope but unshared variable. It can be used across functions within that single file, but here we are interested in global shared variables. I think there are various categories but as hinted in the rather authoritative [[essential c++]] P53, most common is a file-scope static var. The way to globalize such a variable var2 is

Rule: use extern in an include file, so every .cpp file will “see” this extern declaration of var2
Rule: in exactly one .cpp file, define var2 without “extern” or “static”

[1] Someone said online “A local variable is a variable that is declared inside a function. A global variable is a variable that is declared outside all functions“. Scott Meyers discussed “non-local static variable”.

—— static field eg sf2 ——-
See also post on static Field init, and P115 [[essentil c++]]
Usually must be defined exactly once outside class Declaration (which is part of include files). Somewhat similar to shared global variables.

Rule — each some.cpp file that uses this field must have it “declared”. Best done via include file.
Rule — across all the .cpp files, there must be exactly 1 definition of this field. Consequently, this must go into a non-include file and seen once only by the compiler. The field name should be prefixed with the class name, but with the “static” keyword.
Rule — NO “extern” please.
Rule — (special) — const static Integral field can be initialized (not “defined”) in the declaration. See
http://stackoverflow.com/questions/370283/why-cant-i-have-a-non-integral-static-const-member-in-a-class. But watch out for http://stackoverflow.com/questions/3025997/c-defining-static-const-integer-members-in-class-definition

extern !! q[extern C]

This post is about extern on global variables. The other (overload) usages of extern include
1) extern “C” — see other posts
2) extern “anyOtherLang” — illegal in most compilers. Proprietary feature

Basic purpose of extern on non-functions? Create a declaration without definition. See c++Primer P396. Rules

* use extern, not extern “C”
* on a non-function — basically a global variable
* without initializer

Suppose a.cpp Defines a globalVar1 and this is to be shared.

  1. My rule 1: if globalVar1 is shared only with one b.cpp file, then I could put the extern declaration in b.cpp. Nowadays I seldom choose this option. It’s better to follow one standard rule (Rule 2) rather than multiple rules.
  2. My rule 2: if globalVar1 is shared with potentially multiple cpp files, then it is better to put the extern in a.h

static object initialization order #lazy singletons

Java’s lazy singleton has a close relative among C++ idioms, comparable in popularity and implementation.

Basically, local statics are initialized upon first use, exactly. Other static Objects are initialized “sometime” before main() starts. See the Item on c vs c++ in [[More eff c++]] and P222 [[EffC++]].

In real life, this’s hard to control — a known problem but with standard solutions — something like a lazy singleton. See P 170 C++FAQ. Also addressed in [[EffC++]] P221.

static: 3meanings for c++objects

http://www.cprogramming.com/tutorial/statickeyword.html echoes my analysis below.

I feel “static” has too many overloaded meanings, almost unrelated. (Java is better.) Maybe there are just 3 mutually exclusive meanings

1) local static vars — simplest meaning of “static”. as inherited from C, static local var retain their values between function calls. My Chartered system relied heavily on these.

2) static field — the java meaning.

3) file scope — We know local variables (block scope) and global variables (extern). There is one (intermediate?) level of scoping for variables — file scope using “static” keyword. A variable with file scope can be accessed by any function or block within a single file. To declare a file scoped variable, simply declare a variable outside of a block. (Note all of these 3 types are about “objects”, rather than “variables”.) There’s a technical jargon — non-local static object, sometimes IMPLICITLY static. See [[effC++]] P221.

Are all these objects allocated at compile time not dynamically at run time. How about auto variables?

How about static_cast? Irrelevant to this discussion. Nothing to do with object type

3 storage classes for c++variables: lifetime^scope

external
static
thread_local

(Most important type for me is the external…)

https://www.programtopia.net/cplusplus/docs/storage-classes points out that these 4 describe the lifetime and visibility of a variable.

https://docs.microsoft.com/en-us/cpp/cpp/storage-classes-cpp is more authoritative.

initially there are only 2 — auto and external. They added “register” as a subtype of auto. They added static as a subclass of external.

External variables are global variables. I feel globals and file-scope static objects are stored in the same area, not on stack or heap.