5 constructs: c++implicit singletons

#1 most implicit singleton in c++ is the ubiquitous “file-scope variable”. Extremely common in my projects.

  • — The constructs below are less implicit as they all use some explicit keyword to highlight the programmer’s intent
  • keyword “extern” — file-scope variable with extern
    • I seldom need it and don’t feel the need to remember the details.. see other blogposts
  • keyword “static” — file-scope static variables
  • keyword “static” within function body — local static variables — have the nice feature of predictable timing of initialization (on first pass through the declaration)
  • keyword “static” within a class declaration —  static field

~~~~~~  The above are the 5 implicit singleton constructs ~~~~~~
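A minimal sketch of two of the constructs above, contrasting a file-scope static with a local static. The names `eventLog`, `Config`, `theConfig` and `fileScopeCounter` are hypothetical, invented only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to record when initialization happens.
inline std::vector<std::string>& eventLog() {
    static std::vector<std::string> log;   // local static: built on first call
    return log;
}

struct Config {                            // hypothetical singleton-style type
    int level;
    Config() : level(3) { eventLog().push_back("Config constructed"); }
};

// File-scope static: one instance per translation unit.
static int fileScopeCounter = 0;

// Local static: constructed the first time control passes through the
// declaration, and (since C++11) initialized in a thread-safe manner.
Config& theConfig() {
    static Config instance;
    ++fileScopeCounter;
    return instance;
}
```

Every call to `theConfig()` returns the same object, and the constructor runs exactly once, on the first call rather than at some unspecified point before `main()`.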

Aha — it’s useful to recognize that when a data type is instantiated many many times i.e. non-singleton usage, it is usually part of a collection, or a local (stack) variable.

Sometimes we have the ambiguous situation where we use one of the constructs above, but we instantiate multiple instances of the class. It’s best to document the purpose like “instance1 for …; instance2 for …”

local using-namespace

https://stackoverflow.com/questions/223021/whats-the-scope-of-the-using-declaration-in-c

https://stackoverflow.com/questions/4811596/using-namespace-in-function-implementation

Both confirmed my understanding that using-namespace can be safely used within a function … even in a header file 🙂

I used this handy technique in my HRT codebase.
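A small sketch of the technique, with a hypothetical function `joinWords`. The using-directive is confined to the function body, so nothing leaks into files that include this code.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// No using-directive at file scope, so std names must be qualified here.
std::string joinWords(const std::vector<std::string>& words) {
    using namespace std;          // visible only until the closing brace
    ostringstream out;
    for (size_t i = 0; i < words.size(); ++i) {
        if (i) out << ' ';
        out << words[i];
    }
    return out.str();
}
```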

 

##STL iterator is implemented as ..

  • implemented as raw ptr — if your data is held in an array
  • implemented as member class (sugarcoated as member typedef) — most common
  • implemented as friend class
  • implemented as wrapper over internal container’s iterator — (cheat) if your custom container is kind of wrapper over an STL container, then just use the internal container’s iterator as your iterator.

Remember an iterator class is a form of smart pointer by definition, since it implements operator->() and operator*()
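To illustrate the first two bullets together, here is a sketch of a hypothetical array-backed container (`FixedArray`, not from any library) whose iterator is just a member typedef over a raw pointer.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity container whose storage is a plain array.
template <typename T, std::size_t N>
class FixedArray {
    T data_[N];
public:
    FixedArray() : data_() {}         // value-initialize the elements
    typedef T*       iterator;        // member typedef over a raw pointer
    typedef const T* const_iterator;

    iterator begin() { return data_; }
    iterator end()   { return data_ + N; }
    const_iterator begin() const { return data_; }
    const_iterator end()   const { return data_ + N; }
    T& operator[](std::size_t i) { return data_[i]; }
};

int sumAll(const FixedArray<int, 4>& arr) {
    int total = 0;
    for (FixedArray<int, 4>::const_iterator it = arr.begin(); it != arr.end(); ++it)
        total += *it;                 // raw pointer supplies operator* for free
    return total;
}
```

Client code only sees `FixedArray<...>::iterator`, so the typedef could later be swapped for a real member class without breaking callers.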

nonVirtual1() calling this->virt2() #templMethod

http://www.cs.technion.ac.il/users/yechiel/c++-faq/calling-virtuals-from-base.html has a simple sample code. Simple idea but there are complexities:

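  • the given print() should never be used inside the base class ctor/dtor. In general, any virtual function such as virt2() behaves as non-virtual when called from a ctor/dtor: the call resolves to the version belonging to the class whose ctor/dtor is running.
  • superclass now depends on subclass. The FAQ author basically says this dependency is by-design. I believe this is the template-method pattern.
  • pure-virtual is not strictly required here; a plain virtual with a default body works too. Pure-virtual merely forces every concrete subclass to supply virt2().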
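The points above can be sketched as follows. The class and member names (`Base`, `Derived`, `describe`, `name`) are hypothetical and not taken from the FAQ; the virtual step is deliberately a plain virtual with a default body so that calling it from the ctor is legal.

```cpp
#include <string>

class Base {
public:
    Base() { ctorSaw_ = name(); }       // resolves to Base::name(), not Derived's
    virtual ~Base() {}
    // Non-virtual "template method": fixed skeleton, customizable step.
    std::string describe() const { return "<" + name() + ">"; }
    const std::string& seenInCtor() const { return ctorSaw_; }
protected:
    virtual std::string name() const { return "Base"; }  // plain virtual with a default
private:
    std::string ctorSaw_;
};

class Derived : public Base {
protected:
    std::string name() const override { return "Derived"; }
};
```

For a `Derived` object, `describe()` dispatches to `Derived::name()`, while the call made inside the `Base` ctor recorded `Base::name()`.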

avoid if()assert

avoid:

if (…) assert(…)

Instead, put the if-condition into the assertion, which is

  1. more readable since the if-condition is clearly part of an assertion so other people tracing code don’t have one more IF block to check!
  2. leaner in release builds: with NDEBUG defined, the preprocessor removes the whole assert() including the folded-in condition, whereas a separate if-condition would still be compiled and possibly evaluated
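A sketch of the rewrite, using a hypothetical function `average`. The logical implication "if A then B must hold" becomes `!A || B` inside the assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the average; the non-empty precondition applies only in "strict" mode.
double average(const std::vector<int>& v, bool strict) {
    // Avoid:  if (strict) assert(!v.empty());
    // "A implies B" is written as (!A || B), so the whole check
    // vanishes when NDEBUG is defined.
    assert((!strict || !v.empty()) && "strict mode requires a non-empty input");
    if (v.empty()) return 0.0;
    long sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) sum += v[i];
    return static_cast<double>(sum) / v.size();
}
```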

prefer ::at()over operator[]read`containers#UB

::at() throws exception … consistently 🙂

  • For (ordered or unordered) maps, I would prefer ::at() for reading, since operator[] silently inserts for lookup miss.
  • For vector, I would always favor vector::at() since operator[] has undefined behavior when index is beyond the end.
    1. worst outcome is getting trash without warning. I remember getting trash from an invalid STL iterator.
    2. better is consistent seg fault
    3. best is exception, since I can catch it
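A short sketch of both cases. The helper names (`vectorAtThrows`, `lookupOr`, `sizeAfterBracketRead`) are made up for this illustration; the container calls are standard.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Returns true if reading via at() raised std::out_of_range.
bool vectorAtThrows(const std::vector<int>& v, std::size_t idx) {
    try { (void)v.at(idx); return false; }
    catch (const std::out_of_range&) { return true; }
}

// Looks up a key without ever inserting; falls back to a default on a miss.
int lookupOr(const std::map<std::string, int>& m, const std::string& key, int dflt) {
    try { return m.at(key); }
    catch (const std::out_of_range&) { return dflt; }
}

// Demonstrates the silent insertion performed by operator[] on a miss.
// The map is taken by value so the caller's copy is untouched.
std::size_t sizeAfterBracketRead(std::map<std::string, int> m, const std::string& key) {
    (void)m[key];          // a miss inserts key with a value-initialized int
    return m.size();
}
```

Note that `operator[]` is not even available on a const map, which is another reason `at()` suits read-only lookups.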

 

this->myVector as instance field

Sound bite — if you want your class to have a vector field, then a nonref field is the simplest design, unlike the Java convention (below).

I have occasionally seen this->myVector as a nonstatic data member. I think this is normal and should not raise any eyebrows. [[effC++]] P62 has a simple example.

I also used std::map and other containers as fields in my classes, like PSPCDemux.

Java programmers would have a pointer field to a vector constructed on heap, but memory management is simpler with the nonref field. In terms of memory layout, PSPCDemux::myvector has some small footprint [1] embedded in the PSPCDemux object, and the actual container payload has to be allocated on heap, to support container expansion.

[1] Java is different as that “small footprint” shrinks to a single pointer.

These fields don’t need special handling in PSPCDemux ctors. By default an empty container would be allocated “onsite”. PSPCDemux dtor would automatically call the container dtor, which would free the heap memory.

If you adopt the Java convention, then your dtor needs to explicitly delete the heap pointer. This is tricky. What if the dtor throws an exception before deleting? What if the ctor throws an exception after calling new?
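A minimal sketch of the nonref-field design. `Demux` is a hypothetical stand-in loosely modelled on the PSPCDemux description above; its members are invented for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical class with containers held directly as nonref fields.
class Demux {
    std::vector<int> myVector;                 // nonref field, starts empty
    std::map<std::string, int> counts;         // same idea for a map
public:
    // No user-written ctor, dtor or copy operations are needed: the
    // compiler-generated ones construct, copy and destroy both containers.
    void add(const std::string& key, int v) { myVector.push_back(v); ++counts[key]; }
    std::size_t size() const { return myVector.size(); }
    int countOf(const std::string& key) const {
        std::map<std::string, int>::const_iterator it = counts.find(key);
        return it == counts.end() ? 0 : it->second;
    }
};
```

Copying a `Demux` deep-copies both containers, and destroying it frees their heap payload, all without a single `new` or `delete` in user code.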

initialize const field in ctor body #cast

Background: I have a const (int for eg) field “rating” to be initialized based on some computation in the ctor body but c++ compiler requires any const field (like rating) be initialized in initializer only.

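solution: cast away the constness.. See https://stackoverflow.com/questions/3465302/initializing-c-const-fields-after-the-constructor Caveat: writing to an object that was declared const is undefined behavior according to the standard, even when it appears to work, so a safer route is to move the computation into a helper function called from the initializer list.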

I also used it in my github code https://raw.githubusercontent.com/tiger40490/repo1/cppProj/cppProj/concretizeSheet/concretize.cpp

*const_cast<bool*>(&hasUpstream) = tmp_hasUpstream;
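As an alternative to the cast, here is a sketch that keeps the field const and initializes it through a static helper called from the initializer list. The class `Review` and the scoring logic in `computeRating` are hypothetical placeholders for whatever computation the ctor body would have done.

```cpp
#include <string>

class Review {
    const int rating;                       // const field, as in the background above
    // Hypothetical computation that would otherwise live in the ctor body.
    static int computeRating(const std::string& text) {
        int score = 0;
        for (std::string::size_type i = 0; i < text.size(); ++i)
            if (text[i] == '!') ++score;
        return score > 5 ? 5 : score;
    }
public:
    explicit Review(const std::string& text) : rating(computeRating(text)) {}
    int getRating() const { return rating; }
};
```

The helper is static, so it cannot accidentally read half-constructed members; it works purely from the ctor arguments.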

need a safe invalid value for a c++float@@ NaN

— For c++ float, if you need a safe “invalid” value, there’s NaN, with standard support like std::isnan() etc

— For c++ int, you need to pick an actually-valid number like INT_MAX.

Q: How do you find a special value to indicate “variable has an invalid value”?
%%A: I think you need separate boolean flag.
A: boost::optional #NaN #a G9 boost construct is exactly designed for this
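A brief sketch of NaN as the “invalid” marker for a floating-point value. The function names and the price 101.25 are hypothetical; `std::numeric_limits<double>::quiet_NaN()` and `std::isnan` are standard.

```cpp
#include <cmath>
#include <limits>

// Hypothetical price lookup: returns NaN when no price is available.
double lastPrice(bool haveQuote) {
    return haveQuote ? 101.25 : std::numeric_limits<double>::quiet_NaN();
}

bool isValid(double x) {
    // NaN compares unequal to everything, itself included, so (x == NaN)
    // is always false; std::isnan is the reliable test.
    return !std::isnan(x);
}
```

One caveat worth knowing: aggressive floating-point compiler options (for example gcc's `-ffast-math`) may assume NaN never occurs, which can defeat this check.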

t_implBestPractice^t_idiom^t_c++patt^namedDesignPatt^t_c++ECT66^c++tecniq^t_workTimeSaver

  • t_c++pattern — must be a named pattern
  • t_c++idiom — must be well-established small-scale
  • t_ECT and t_c++idiom are mutually exclusive
  • t_tecniq — higher than syntaxTips, less selective than t_c++idiom or t_c++pattern but Should NOT be dumping ground
  • t_implBestPractice is like gentmp

boost::optional #NaN #a G9 boost construct

https://www.boost.org/doc/libs/1_65_1/libs/optional/doc/html/index.html has illustrations using optional<int>.

— #1 Usage: possibly-empty argument-holder
I encountered this in MS library:

void myFunc(optional<ReportType &> reportData_){
  if(reportData_) cout<<"non-empty";
}

In addition, declaration of this function features a default-arg:

void myFunc(optional<ReportType &> reportData_ = optional<ReportType &>());

Q: for an int param, how does this compare with a simple default-arg value of -1?
A: boost::optional wins if all integer values are valid values, so you can’t pick one of them to indicate “missing”
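The int comparison can be sketched as below. To keep the example free of external dependencies it swaps in `std::optional` from C++17 instead of `boost::optional`; the two behave alike for this purpose, although `std::optional` in C++17 does not accept reference types such as `ReportType &`. The function `describeQty` is hypothetical.

```cpp
#include <optional>
#include <string>

// Hypothetical function: every int, including -1, is a legitimate quantity,
// so an empty optional is the only way to say "not supplied".
std::string describeQty(std::optional<int> qty = std::nullopt) {
    if (!qty) return "missing";
    return "qty=" + std::to_string(*qty);
}
```

With a plain `int qty = -1` default, the call `describeQty(-1)` would be indistinguishable from `describeQty()`.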

c++get current command line as str

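#include <fstream>
#include <iostream>
#include <string>
#include <unistd.h> // getpid()

// Linux only: reads /proc/<pid>/cmdline once and caches the result.
// The kernel separates the arguments with '\0' bytes, which stay embedded in the string.
std::string const & get_command_line() {
  static std::string ret;
  if (ret.empty())
  try{
        std::string path="/proc/" + std::to_string((long long)getpid()) + "/cmdline";
        std::cerr<<"initializing rtsd parsername from "<<path<<" ..\n";
        std::ifstream myfile(path);
        if (!myfile) return ret;
        getline(myfile, ret);
  }catch(...){
        // swallow any error and return the (empty) string
  }
  return ret;
}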

c++user-defined constant:(scoped)enum^const static

See https://stackoverflow.com/questions/112433/should-i-use-define-enum-or-const

  • best practice — enclose in a namespace or a class
  • enum can group 2 related constants. Scoped enum is even more descriptive.
  • enum also creates a typename for code documentation
  • enumerator values must be integral (signed or unsigned; no floating-point or string constants) 😦
  • Both enum type and a single static const field can be declared as class members 🙂
  • For singular constants, use enum or const static variables (file-scope, and independently instantiated in each compilation unit). Avoid extern.
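A compact sketch of the options in the list above. The namespace `feed` and all names inside it are hypothetical.

```cpp
#include <cstddef>

namespace feed {                                  // hypothetical enclosing namespace
    // Scoped enum groups related constants and gives them a type name.
    // The underlying type is chosen explicitly (here an unsigned one).
    enum class Side : unsigned char { Buy = 1, Sell = 2 };

    struct Limits {
        static const std::size_t maxSymbolLen = 8;   // in-class static const
        enum { maxLegs = 4 };                        // older unnamed-enum style
    };

    // The enum type documents the parameter better than a bare int would.
    inline int sign(Side s) { return s == Side::Buy ? +1 : -1; }
}
```

Both `Limits::maxSymbolLen` and `Limits::maxLegs` are compile-time constants, so they can size arrays, for example `char buf[feed::Limits::maxSymbolLen];`.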

class scope as a pseudo namespace #goog/effC++

https://google.github.io/styleguide/cppguide.html#Nonmember,_Static_Member,_and_Global_Functions

[[effC++]] P 120 has a concise chapter on namespace vs namespace-emulating class.

So when you see someName::someVar or someName::someClass or someName::someFunc, the “someName” may be a class.

When you see “someVar” without the prefix, don’t assume it’s in your local namespace. It could be typedef of someName::someVar !

#include <xtap/PluginConfig.h> trick

I have seen in many large systems:

The actual path to the header is …/shared/tp_xtap/include/PluginConfig.h, but developers prefer an abbreviated include like #include <xtap/PluginConfig.h>.

#1) Here’s one simple implementation:

ls -l shared/tp_xtap/include/ # should show a symbolic link to be created automatically:

    xtap -> ./

Therefore, -I/full/path/to/shared/tp_xtap/include/ will resolve #include <xtap/PluginConfig.h>

#2) I guess a second-best solution is code generation. Checked-in source file has #include <xtap/PluginConfig.h> but the build system follows configured rewrite-rules, to convert it into #include <some/other/path>

memset: a practical usage #Gregory

  • memset is a low-level C function.
  • memset takes a void pointer.
  • Fast and simple way to zero out an array of struct, having primitive data members. No std::string please. No ptr please. Use sizeof to get the byte count.
  • Useful in low level wire coding
// illustrates packed and memset
#include <cstring> // memset
#include <iostream>
using namespace std;

struct A{
  unsigned int i1; //4 bytes
  bool b; //1 byte
  char cstr[2];
  int* ptr; //8 bytes
} __attribute__((packed));
size_t const cnt = 3;
A arr[cnt];
int main(){
  cout<<sizeof(A)<<endl;
  size_t sz = sizeof(arr);
  cout<<sz<<endl;
  memset(arr, 0, sz);
  for(size_t i=0; i<cnt; ++i){
    A* tmp = &arr[i];
    cout<<"i1 = "<<tmp->i1<<"; b = "<<tmp->b<<" ; cstr[1] = "<<(int)tmp->cstr[1]<<" ptr = "<<tmp->ptr<<endl;
  }
}

file (de)serialization, for array of simple structures #Gregory

Not needed for IV..

// simple and fast (de)serialization to file given an array of structures

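#include <fcntl.h>
#include <sys/stat.h> // S_IRUSR, S_IWUSR
#include <unistd.h>   // read, write, close
#include <cstring>    // strncpy
#include <iostream>
#include <string>
using namespace std;
size_t const len=15;

struct A{
        int i1;
        char cstr[len];
        //string s4; //doesn't really work
        A(int i=999, string cs="default c-string", string s="default std::string"): i1(i){
                strncpy(cstr, cs.c_str(), len-1);
                cstr[len-1] = '\0'; // strncpy does not terminate when the source is too long
        }
};
size_t const  cnt=2, siz=cnt * sizeof(A);
A arr[cnt], ar2[cnt];

char fname[] = "/tmp/,.dat";
int main() {
        arr[0]=A(1,  "grin", "backbone");
        arr[1]=A(2,  "frown", "try/except/else");

        // O_TRUNC discards stale bytes left over from an earlier, longer file
        int fd = open(fname, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
        if (fd < 0 || write(fd, arr, siz) != (ssize_t)siz) { cerr<<"write failed"<<endl; return 1; }
        close(fd);

        int fd2 = open(fname, O_RDONLY);
        if (fd2 < 0 || read(fd2, ar2, siz) != (ssize_t)siz) { cerr<<"read failed"<<endl; return 1; }
        close(fd2);

        for (size_t idx = 0; idx < cnt; ++idx){
                A * tmp = ar2 + idx;
                cout<<tmp->i1<<" ; "<<tmp->cstr<<" ; "<<endl; //tmp->s4<<endl;
        }
}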

instance field: char-array without terminating null

This is a best practice if those char-arrays are short and this class gets instantiated or (de)serialized frequently.

Drawback — when you print this field, the printing function would keep printing beyond the field boundary looking for the terminating null. I saw this in my nyse-intg parser

So as a personal best practice I often save the terminating null in my char-array field
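When the terminating null really must be dropped, the printing problem can be avoided by copying a bounded number of bytes. This is a sketch with a hypothetical record type `Quote` and helper `symbolOf`.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

struct Quote {                       // hypothetical wire-format record
    char symbol[4];                  // exactly 4 bytes, no room for '\0'
    int  price;
};

// Copies at most sizeof(symbol) bytes, stopping early at a '\0' if present,
// so the read never runs past the field boundary.
std::string symbolOf(const Quote& q) {
    const void* nul = std::memchr(q.symbol, '\0', sizeof(q.symbol));
    std::size_t n = nul
        ? static_cast<std::size_t>(static_cast<const char*>(nul) - q.symbol)
        : sizeof(q.symbol);
    return std::string(q.symbol, n);
}
```

Printing `symbolOf(q)` instead of `q.symbol` is safe whether or not the field happens to contain a null.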

retreat to raw ptr from smart ptr ASAP

Raw ptr is in the fabric of C. Raw pointers interact/integrate with countless parts of the language in complex ways. Smart pointers are advertised as drop-in replacements but that advertisement may not cover all of those “interactions”:

  • double ptr
  • new/delete/free
  • ptr/ref layering
  • ptr to function
  • ptr to field
  • 2D array
  • array of ptr
  • ptr arithmetics
  • compare to NULL
  • ptr to const — smart ptr to const should be fine
  • “this” ptr
  • factory returning ptr — can it return a smart ptr?
  • address of ptr object

Personal suggestion (unconventional) — stick to the known best practices of smart ptr (such as storing them in containers). In all other situations, do not treat them as drop-in replacements but retrieve and use the raw ptr.
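A sketch of that suggestion: the smart pointers live in a container (their well-trodden use case) while the non-owning helper works on raw pointers obtained via `get()`. `Order`, `qtyOrZero` and `totalQty` are hypothetical names.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Order { int qty; };                       // hypothetical payload

// Non-owning helper written against a raw pointer: the null check
// behaves exactly as in C.
int qtyOrZero(const Order* o) { return o ? o->qty : 0; }

int totalQty(const std::vector<std::shared_ptr<Order> >& book) {
    int total = 0;
    for (std::size_t i = 0; i < book.size(); ++i)
        total += qtyOrZero(book[i].get());       // retreat to the raw pointer
    return total;
}
```

The raw pointer must not outlive the owning smart pointer, and the helper must never delete it; ownership stays with the container.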

subverting – multiple inheritance

!! With MI, the “this” pointer field is not always identical between the Base object and the Derived object.
** Remember a primitive technique of a home-made “class” is a C struct + a self-pointer field [2]. Similarly, In Python methods, “self” must be the first argument…

[2] I don’t think we should add a 32-bit field to each instance! I guess compiler can help make do without it.

!! In a Derived instance, there’s not always a single Base1 instance.
** Basic example — D extends C1 and C2, which both extend B.
** Even if C1 and C2 both virtually extend B, if D also extends a C3 that extends B without “virtual”, the D instance still embeds 2 B instances.

!! Casting within inheritance hierarchy doesn’t always maintain the address held inside a Derived pointer or Base pointer. SI — always. ARM P 221 says “with MI, casting changes the value of a pointer”. See my experiment  https://github.com/tiger40490/repo1/blob/cpp1/cpp/88miscLang/ptrArithmeticMI.cpp

!! Each Derived instance uses more than one vtbl. If D inherits from B1 and B2, then that instance uses two vtbl’s. See ARM P230.

!! AOB (see other posts) assumes a 0 “delta” but in MI, delta is introduced because it’s not 0! ARM P222

— The rules below still hold, in SI and MI —
dtor sequence is the exact reverse of ctor sequence which is BCDC (see other posts)
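The pointer-adjustment point can be observed with a small sketch. The types `B1`, `B2`, `D` here are hypothetical, and the exact object layout is implementation-defined; the nonzero offset is what mainstream compilers produce, not a guarantee of the standard.

```cpp
#include <cstddef>

struct B1 { int a; virtual ~B1() {} };
struct B2 { int b; virtual ~B2() {} };
struct D : B1, B2 { int d; };

// Byte offset ("delta") of the B2 subobject inside a D.
std::ptrdiff_t deltaOfB2(D& obj) {
    B2* asB2 = &obj;                              // implicit upcast adjusts the address
    return reinterpret_cast<char*>(asB2) - reinterpret_cast<char*>(&obj);
}

// Downcasting applies the opposite adjustment and recovers the original address.
bool roundTrips(D& obj) {
    B2* asB2 = &obj;
    return static_cast<D*>(asB2) == &obj;
}
```

Because the compiler performs the adjustment only when it knows both types, a `reinterpret_cast` or a trip through `void*` between `D*` and `B2*` would skip it and yield a wrong pointer.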

prevent heap instantiation4my class #part2

[[C++FAQ]] and [[More Eff C++]] both cover related topics.

I feel it’s very useful to have classes that can only be instantiated on stack, because stack-variables are automatically reclaimed.

With these classes, no double-delete, no dangling pointer, and most importantly, no memory leak. Remember memory leak is the most hidden — hardest to identify.
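One common way to get such a class, sketched here with a hypothetical `StackOnly` type, is to delete the class-level `operator new` (C++11 syntax; before C++11 the same effect came from declaring it private).

```cpp
#include <cstddef>

class StackOnly {                      // hypothetical class name
    int value_;
public:
    explicit StackOnly(int v) : value_(v) {}
    int value() const { return value_; }

    // Deleting the class-level allocation functions makes
    // "new StackOnly(1)" and "new StackOnly[2]" compile errors.
    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;
};

int useOnStack() {
    StackOnly s(42);                   // automatic storage: reclaimed at scope exit
    // StackOnly* p = new StackOnly(42);   // would not compile
    return s.value();
}
```

This is not airtight: the class can still end up on the heap as a member of some other heap-allocated object, or via the global `::new`.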

c++ creational patterns – return pointers always

Many creation patterns need to return pointers.

Must use pbref not pbclone, so the choice is between pointer vs reference. I feel pointer is more flexible than reference. If creation fails, we can return NULL.

eg: virtual ctor, esp. clone()
eg: factory http://login2win.blogspot.com/2008/05/c-factory-pattern.html, [[ModernC++Design]]
eg: builder http://en.wikibooks.org/wiki/C++_Programming/Code/Design_Patterns/Creational_Patterns
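A sketch of clone() and a factory. Note the swap: instead of raw pointers it returns `std::unique_ptr`, which keeps the "pointer, possibly null" interface argued for above while making ownership explicit. The `Shape` hierarchy and `makeShape` are hypothetical.

```cpp
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const = 0;
    virtual std::unique_ptr<Shape> clone() const = 0;   // "virtual ctor"
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
    std::unique_ptr<Shape> clone() const override {
        return std::unique_ptr<Shape>(new Circle(*this));
    }
};

struct Square : Shape {
    std::string name() const override { return "square"; }
    std::unique_ptr<Shape> clone() const override {
        return std::unique_ptr<Shape>(new Square(*this));
    }
};

// Factory: an empty pointer signals "creation failed", like returning NULL.
std::unique_ptr<Shape> makeShape(const std::string& kind) {
    if (kind == "circle") return std::unique_ptr<Shape>(new Circle);
    if (kind == "square") return std::unique_ptr<Shape>(new Square);
    return std::unique_ptr<Shape>();
}
```

A caller that needs a raw pointer can still call `get()` or `release()` on the result.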

real-world OMS c++data structure for large collection of orders

class OrderMap{
struct data_holder {/*fields only. no methods*/};
unordered_map<int, shared_ptr<data_holder> > orders; // keyed by orderID
… //methods to access the orders
};

Each data_holder is instantiated on heap, and the new pointer goes into a shared_ptr. No need to worry about memory.
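For concreteness, here is a compilable sketch of the outline above. The fields of `data_holder` and the methods `add`, `qtyOf`, `cancel` and `size` are assumptions made for the illustration; the original only specifies the map and the holder struct.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

class OrderMap {
    struct data_holder {               // fields only, no methods
        std::string symbol;            // hypothetical fields
        int qty;
    };
    std::unordered_map<int, std::shared_ptr<data_holder> > orders;  // keyed by orderID
public:
    // Returns false if the orderID is already present.
    bool add(int orderID, const std::string& symbol, int qty) {
        std::shared_ptr<data_holder> p = std::make_shared<data_holder>();
        p->symbol = symbol;
        p->qty = qty;
        return orders.insert(std::make_pair(orderID, p)).second;
    }
    // Returns -1 for an unknown orderID (a hypothetical convention).
    int qtyOf(int orderID) const {
        auto it = orders.find(orderID);
        return it == orders.end() ? -1 : it->second->qty;
    }
    // Erasing the map entry drops the last shared_ptr and frees the holder.
    bool cancel(int orderID) { return orders.erase(orderID) == 1; }
    std::size_t size() const { return orders.size(); }
};
```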

Do you see ANY drawback, from ANY angle?