.so.2: linker^dynamic loader

— Based on https://unix.stackexchange.com/questions/475/how-do-so-shared-object-numbers-work

In my work projects, most of the linux SO files have a suffix like libsolclient_jni.so.1.7.2. This is to support two executables using different versions of a SO at the same time.

Q: How is the linker able to find this file when we give linker a command line option like “-lsolclient_jni”? In fact, java System.loadLibrary(“solclient_jni”) follows similar rules. That’s why this example uses a java native library.

A: Actually, the linker (at link time) and the dynamic loader (at run time) follow different rules

  • at link time, the linker resolves -lsolclient_jni by looking for libsolclient_jni.so (no version suffix) in its search path. So there’s usually a symlink like libsolclient_jni.so pointing to the latest version.
  • at link time, the executable also saves (hardcodes) the name of each SO it needs, normally the SONAME carrying the major version, such as libsolclient_jni.so.1. You can run “ldd /the/executable/file” to reveal which file each recorded name resolves to on the current machine.
  • at run time, the dynamic loader consults the hardcoded name, typically follows a symlink of that name, and loads libsolclient_jni.so.1.7.2 into memory

— static libraries:

I think static libraries like libc.a do not have this complexity.

During static linking the linker copies all library routines used in the program into the executable image. This of course takes more space on the disk and in memory than dynamic linking. But a statically linked executable does not require the presence of the library on the system where it runs.

shared vars across files: prefer static field

When I have state to maintain and share across compilation units, there are basically three main types of variables I can create. (Non-static Local variables are simple and don’t maintain state.)

  1. nonstatic field of a singleton class: note any user of this variable needs a reference to the single object 😦
  2. file scope var in a *.cpp: relatively simple usage, but don’t put it in a shared header, as explained in global^file-scope variables]c++
  3. public static fields: most versatile design. Any user can access it after they #include the class header file.

Non-contenders:

  4. local static variables: (niche usage) You can create a local static var in myfunc(). To share the variable across compilation units, myfunc() can return a reference to this object, so from anywhere you can use the return value of myfunc(). This is a simple form of singleton.
  5. global variables: monster. Probably involves “extern”. See my other blogposts

The advantages of static fields are often applicable to static methods too.

In fact, java leaves you with nothing but this choice, because this choice is versatile. Java has no “local static”, no file-scope, no global variables.
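A minimal sketch contrasting the public static field with the local-static accessor. All names (Settings, hitCount, sharedLabel, bumpBoth) are hypothetical, and everything sits in one file only to keep the sketch short; in a real project the class and the function declarations would live in a header and the definition of the static field in exactly one .cpp file.

```cpp
#include <string>

// Hypothetical names throughout: Settings, hitCount, sharedLabel(), bumpBoth().
struct Settings {
    static int hitCount;      // public static field: declaration only
};
int Settings::hitCount = 0;   // the single definition; belongs in one .cpp file

// Local static exposed by reference: a simple form of singleton.
inline std::string& sharedLabel() {
    static std::string label = "initial";  // constructed on the first call only
    return label;
}

// Any compilation unit that sees the declarations can read and write both.
inline int bumpBoth() {
    sharedLabel() += "!";
    return ++Settings::hitCount;
}
```

Every call to sharedLabel() returns a reference to the same object, so a caller in any file observes changes made by any other file.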

compile-time ^ run-time linking

https://en.wikipedia.org/wiki/Dynamic_linker describes the “magic” of linking *.so files with some a.out at runtime. This is less well known and more “magical” than compile-time linking.

“Linking is often referred to as a process that is performed when the executable is compiled, while a dynamic linker is a special part of an operating system that loads external shared libraries into a running process”

I now think when the technical literature mentions linking or linker I need to ask “early linker or late linker?”

Can a.so file get loaded 5min after process starts@@

Q: Can some shared library abc.so (actually libabc.so) file get loaded 5 min after my process pid123 starts?

https://stackoverflow.com/questions/7767325/replacing-shared-object-so-file-while-main-program-is-running says NO. This abc.so file has to be loaded into pid123 memory (then dynamically linked into the executable) before main() is called.

Among the 3 major mechanisms 1) static linker 2) dynamic linker 3) dlopen, only dlopen is able to achieve this purpose, but I’m unfamiliar with dlopen.

If pid123 reads a config file to decide whether to load abc.so, then dlopen is the only solution. I saw such an industrial strength implementation in 2019.

A remotely related note — The same stackoverflow webpage also shows that after pid123 starts, you can actually remove (and replace) abc.so without affecting pid123, since the old file content is already loaded into pid123 memory.

extern^static on var^functions

[1] https://stackoverflow.com/questions/14742664/c-static-local-function-vs-global-function confirmed my understanding that a static (file-local) function is designed to allow other files to define different functions under the same name.

                      | extern var | file-scope STATIC var | static func | extern-C func
purpose 1             | single object shared across files | [1] applies. 99% sure | [1] | name mangling
purpose 2             |  | private variable | private function |
alternative 1         | static field | anon namespace |  | no alternative
alternative 2         | singleton |  |  |
advantage 1           |  | won’t get used by mistake from other file |  |
disadv 1              | “extern” is disabled if you provide an initializer | no risk. very simple |  |
disadv 2              |  |  |  |
put in shared header? | be careful | should be fine | not sure |

hunt down CORRECT include file+directory

When I get a compile error like an unrecognized (undeclared) symbol, most likely a header file is missing.

This is a relatively easy challenge since it involves ascii source files, not binary. Faster to search.

  1. I start with some known include directories. I run find-grep looking for a declaration of the symbol. Hopefully I find only one declaration and it’s the correct header file to include.
  2. then I need to guess the correct form of #include
  3. Then I need to add the directory as an -I command-line option

 

q[cannot open shared object file] abc.so

strace -e trace=open,openat myprogram can be used on a working instance to see where all the SO files are successfully located. (Recent glibc versions open files via openat rather than open, so trace both.)

— Aug 2018 case: in QA host, I hit “error while loading shared libraries: libEazyToFind.so: … No such file or directory”

I can see this .so file so I used LD_LIBRARY_PATH to resolve it.

Then I get “error while loading shared libraries: libXXXXX.so: … No such file or directory”. I can’t locate this .so, but the same executable is runnable in a separate HostB. (All machines can access the same physical file using the same path.)

I zoomed into the HostB and used “ldd /path/to/executable”. Lo and behold, I can see why HostB is lucky. The .so files are located in places local in HostB … for reasons to be understood.

— May 2018 case:

The wording should be “cannot locate ….”

I fixed this error using $LD_LIBRARY_PATH

The *.so  file is actually specified as a -lthr_gcc34_64 option on the g++ command line, but the file libthr_gcc34_64.so was not found at startup.

I managed to manually locate this file in /a/b/c and added it :

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/a/b/c/

Y allocate static field in .c file %%take

why do we have to define static field myStaticInt in a cpp file?

For a non-static field myInt, the allocation happens when the class instance is allocated on stack, on heap (with new()) or in global area.

However, myStaticInt isn’t taken care of. It’s not on the real estate of the new instance. That’s why we need to declare it in the class header, and then define it exactly once (ODR) in a cpp file. The definition is what sets aside its storage, at build time rather than at run time: static allocation.
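A small sketch of the point above, with a hypothetical class Widget. It assumes a single file for brevity; the out-of-class definition is the line that would go into the cpp file.

```cpp
// Hypothetical class for illustration.
struct Widget {
    int myInt = 0;            // lives inside every Widget instance
    static int myStaticInt;   // declaration only: no storage is set aside here
};

// The one definition. In a real project this line goes into exactly one .cpp file.
int Widget::myStaticInt = 42;

// The static field adds nothing to the per-instance footprint.
static_assert(sizeof(Widget) == sizeof(int), "only myInt occupies the instance");

// All instances refer to the same storage.
inline bool sharesStorage(const Widget& a, const Widget& b) {
    return &a.myStaticInt == &b.myStaticInt;
}
```

The static_assert is checked at compile time, which shows the compiler never counts myStaticInt as part of an instance.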

contents to keep in .C rather than .H file

1) Opening example — Suppose a constant SSN=123456789 is used in a1.cpp only. It is therefore a “local constant” and should be kept in a1.cpp not some .H file.  Reason?

The .H file may get included in some new .cpp file in the future. So we end up with multiple .cpp files dependent (at compile-time) on this .H file. Any change to the value or name of this SSN constant would require recompilation of not only a1.cpp but, unnecessarily, of the other .cpp files 😦

2) #define and #include directives — should be kept in a1.cpp as much as possible, not .H files. This way, any change to  the directives would only require recompiling a1.cpp.

The pimpl idiom and forward-declaration use similar techniques to speed up recompile.

3) documentation comments — some of these documentations are subject to frequent change. If put in .H then any comment change would trigger recompilation of multiple .cpp files

c++static field init: rules

See also post on extern…

These rules are mostly based on [[c++primer]], about static Field, not local statics or file-scope static variables.

Rule 1 (the “Once” rule) — init must appear AND execute exactly once for each static field.

In my Ticker Plant xtap experience, the static field definition crucially sets aside storage for the static field. The initial value is often a dummy value.

Corollary: avoid doing init in header files, which is often included multiple times. See exception below.

Rule 2 (the “Twice” rule): a static field Must (See exception below) be DECLARED in the class definition block, and also DEFINED outside. Therefore, the same variable is “specified” exactly twice [1]. However, the compiler would “see” the declaration multiple times if the header is included in multiple places.

Corollary: always illegal to init a static field at both declaration and definition.

[1] Note ‘static’ keyword should be at declaration not definition. Ditto for static methods. See P117 [[essential c++]]

The Exception: static integer constant Fields are special, and can be initialized in 2 ways
* at declaration. You normally don’t define it again. (A separate definition without initializer is still required if the field is odr-used, e.g. its address is taken, unless it is a constexpr field in C++17 or later.)
* at definition, outside the class. In this case, declaration would NOT initialize, per Rule 1

The exception is specifically for static integer constant field:

  • if non-static, then you initialize it in the ctor (or, since C++11, with a default member initializer)
  • if non-integer static, then you need to define it outside
  • if non-const static, then ditto
  • if not a field, then a different set of rules apply.

Rule 3: For all other static fields, init MUST be at-definition, outside the class body.

Therefore, it’s simpler to follow Rule 3 for all static fields including integer constants, though other people’s code is beyond my control.
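The rules can be sketched with a hypothetical class Feed (the field names are made up for illustration). Only the const integer field is initialized at declaration; the others follow Rule 3.

```cpp
struct Feed {                          // hypothetical class
    static const int MAX_LEVELS = 10;  // the Exception: const integer, initialized at declaration
    static int exchangeId;             // non-const: declaration only (Rule 2)
    static const double TICK;          // non-integer const: declaration only
    int depth[MAX_LEVELS];             // MAX_LEVELS is usable as a compile-time constant
};

// Rule 3: definitions with initializers, outside the class body, in one .cpp file.
// Note the absence of the "static" keyword here.
int Feed::exchangeId = 0;
const double Feed::TICK = 0.01;
```

Adding an initializer to exchangeId inside the class body would be rejected by the compiler, and so would initializing MAX_LEVELS a second time at an out-of-class definition.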

——Here’s an email I sent about the Exception —–
It turned out these are namespace variables, not member variables.

Re: removing “const” from static member variables like EXCHANGE_ID_L1

Hi Dilip,

I believe you need to define such a variable in the *.C file as soon as you remove the “const” keyword.

I just read online that “integer const” static member variables are special — they can be initialized at declaration, in the header file. All other static member variables must be declared in header and then defined in the *.C file.

Since you will overwrite those EXCHANGE_ID_* member variables, they are no longer const, and they need to be defined in Parser.C.

private-header^shared-header

In our discussions on ODR, global variables, file-scope static variables, global functions … the concept of “shared header” is often misunderstood.

  • If a header is only included in one *.cpp, then its content is effectively part of a *.cpp.

Therefore, you may experiment by putting “wrong” things in such a private header and the set-up may work or fail, but it’s an invalid test. Your test is basically putting those “wrong” things in an implementation file!

 

#include <xtap/PluginConfig.h> trick

I have seen in many large systems:

The actual path to the header is …/shared/tp_xtap/include/PluginConfig.h, but developers prefer an abbreviated include like #include <xtap/PluginConfig.h>.

#1) Here’s one simple implementation:

ls -l shared/tp_xtap/include/ # should show a symbolic link to be created automatically:

    xtap -> ./

Therefore, -I/full/path/to/shared/tp_xtap/include/ will resolve #include <xtap/PluginConfig.h>

#2) I guess a second-best solution is code generation. Checked-in source file has #include <xtap/PluginConfig.h> but the build system follows configured rewrite-rules, to convert it into #include <some/other/path>

friend class Fren need!!be fwd-declared

http://www.cplusplus.com/forum/articles/10627/ is a forum post, but i basically trust him:

There are two basic kinds of dependencies you need to be aware of:
1) stuff that can be forward declared
2) stuff that needs to be #included

If, for example, class A uses class B, then class B is one of class A’s dependencies. Whether it can be forward declared or needs to be included depends on how B is used within A:

- do nothing if: The only reference to B is in a friend declaration <-- I tested this myself.
- forward declare B if: A contains a B pointer or reference: B* myb;
- forward declare B if: one or more functions has a B object/pointer/reference
as a parameter, or as a return type:

B MyFunction(B myb);

- #include "b.h" if: B is a parent class of A
- #include "b.h" if: A contains a B object: B myb;
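A compact sketch of the list above in a single file. The class names A, B and Fren follow the forum post; the member names are hypothetical. The definition of B in the middle stands in for #include "b.h".

```cpp
class B;                      // forward declaration: enough for the uses inside A

class A {
public:
    void attach(B& b) { myb = &b; }   // reference parameter: B may be incomplete
    B copyOf(B b);                    // by-value parameter/return, declaration only
    B* myb = nullptr;                 // pointer member: size is known without B's definition
    friend class Fren;                // Fren needs no forward declaration at all
};

// From here on the full definition is visible (normally via #include "b.h").
class B {
public:
    int value = 0;
};

B A::copyOf(B b) { return b; }        // defining the function requires a complete B
```

If A held a B object (B myb;) or derived from B, the forward declaration would no longer be enough and the definition of B would have to come first.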

ensure operator<< is visible via header file

If you define operator<<() for a basic ValueObject class like Cell, to be used in higher-level class like Board, then you need to make this declaration visible to Board.cpp via header files.

If you only put the definition of operator<<() in ValueObj.cpp and don’t declare it in ValueObj.h, then Board.cpp can’t see it even though ValueObj.o and Board.o are linked together. When Board.cpp calls operator<< on a Cell object, the compiler either rejects the call (no matching operator<<) or, if Cell converts implicitly to some built-in type, silently picks a standard operator<< rather than yours.
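A minimal sketch of the arrangement that works, with the three files collapsed into one for brevity. Cell and ValueObj.h come from the text above; the members of Cell and the helper describe() are hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// ---- would live in ValueObj.h ----
struct Cell { int row; int col; };                          // hypothetical members
std::ostream& operator<<(std::ostream& os, const Cell& c);  // the declaration Board.cpp must see

// ---- would live in ValueObj.cpp ----
std::ostream& operator<<(std::ostream& os, const Cell& c) {
    return os << '(' << c.row << ',' << c.col << ')';
}

// ---- would live in Board.cpp, which includes ValueObj.h ----
std::string describe(const Cell& c) {
    std::ostringstream out;
    out << "cell " << c;     // resolves to the overload declared above
    return out.str();
}
```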

2obj files compiled@different c++toolchains can link@@

(many interviewers asked…)

Most common situation — two static libs pre-compiled on toolchain A and B, then linked. Usually we just try our luck. If not working, then we compile all source files on the same toolchain.

Toolchain A and B could differ by version, or compiler brand, or c vs c++ … I guess there’s an Application Binary Interface between different toolchains.

https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html says that it’s possible (“straightforward”) to link C++03 and C++11 code together.

demo: static method declare/define separately n inherited

Low complexity in this demo, but if you lack a firm grip on the important details here, they will add to the complexity in a bigger code base.

  • When the subclass is built on its own, the linker (not the compiler) complains about undefined sf() since the compiler saw only the declaration. You need “g++ der.cpp base.cpp”.
  • Note the static method is inherited automatically, so you could call der::sf().
#include <iostream>
struct base{
  static void sf();
};
///////// end of header /////////
#include "base.h"
using namespace std;
void base::sf(){ // no "static" please
  cout<<"sf"<<endl;
}
///////// end of base class /////////
#include "base.h"
using namespace std;
struct der: public base{};
int main(){
  der::sf();
}
///////// end of sub class /////////

declare^define: additional complexity@c over java/c#

It slowly dawned on me that a big share of c++ programming headaches (confusions, compiler/linker errors, makefile complexity) stem from one basic design of C: names need to be declared to the “world”, and separately defined.

This design gives rise to header files.

Variables/objects vs functions vs custom classes vs templates have different rules.

I think only objects (including static fields) obey this particular rule: the definition allocates (non-stack) memory.

I think class instance fields are completely different. See https://bintanvictor.wordpress.com/2017/05/01/c-class-field-are-defined-in-header-but-global-variables-has-singledefinitionrule/

I think functions are completely different. I blogged about function ODR — https://bintanvictor.wordpress.com/2017/06/15/odrclassesfree-functions/

Pimpl is one of many issues.

LD_LIBRARY_PATH ^ objdump RUNPATH

This is related to q[cannot open shared object file] abc.so

See https://amir.rachum.com/blog/2016/09/17/shared-libraries/#rpath-and-runpath for the RUNPATH

q(objdump) can inspect the binary file better than q(ldd) does.

q(ldd) shows the final, resolved path of each .so file, but (AFAIK) doesn’t show how it’s resolved. The full steps of resolution are described in http://man7.org/linux/man-pages/man8/ld.so.8.html

q(objdump -p) can shed some light: it prints the dynamic section of the binary file, including any RPATH/RUNPATH entry (DT_RUNPATH).

undefined reference to vtable

This error can be hard to track down if the codebase is huge. I have seen it more than 4 times and often it takes a few minutes of investigation because there are multiple causes:

  • If a virt function is defined in a .cpp file but you don’t link in that file, then you can hit this linker error.
  • if you declare a virt function but don’t define it, you hit this linker error
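A sketch of why the message mentions the vtable rather than the function, assuming GCC or Clang on Linux (the Itanium C++ ABI). The classes Shape and Triangle are hypothetical.

```cpp
struct Shape {                      // hypothetical class
    virtual ~Shape() = default;     // defined in the class body, so inline
    virtual int sides() const;      // first non-inline virtual: the "key function"
    virtual int id() const { return 0; }
};

// With GCC/Clang the vtable for Shape is emitted only in the file that holds
// this definition. Delete it, or forget to link its object file, and the
// linker reports "undefined reference to `vtable for Shape'".
int Shape::sides() const { return 0; }

struct Triangle : Shape {
    int sides() const override { return 3; }
};

inline int sidesOf(const Shape& s) { return s.sides(); }
```

So when the error appears, the first thing to check is where the first non-inline virtual function of the named class is defined and whether that object file is on the link line.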

ODR@functions # and classes

Warning – ODR is never quizzed in IV. A weekend coding test might touch on it but we can usually circumvent it.

OneDefinitionRule is more strict on global variables (which have static duration). You can’t have 2 global variables sharing the same name. Devil is in the details:

(As explained in various posts, you declare the same global variable in a header file that’s included in various compilation units, but you allocate storage in exactly one compilation unit. Under a temporary suspension of disbelief, let’s say there are 2 allocated storage for the same global var, how would you update this variable?)

With free function f1(), ODR is more relaxed. http://www.drdobbs.com/cpp/blundering-into-the-one-definition-rule/240166489 (possibly buggy) explains the Lesser ODR vs Greater ODR. Lesser ODR is simpler and more familiar, forbidding multiple (same or different) definitions of f1() within one compilation unit.

My real focus today is the Greater ODR. Obeying Lesser ODR, the same static or inline function is often included via a header file and compiled into multiple binary files. If you want to put a non-template free function definition in a shared header file without violating the Greater ODR, then it must be static or inline, implicitly or explicitly. I find the Dr Dobbs article unclear on this point. In my test, when a free function was defined in a shared header without “static” or “inline” keywords, the linker screamed “ODR!”

The most common practice is to move function definitions out of shared headers, so the linker (or compiler) sees only one definition globally.

With inline, Linker actually sees multiple (hopefully identical) physical copies of func1(). Two copies of this function are usually identical definitions. If they actually have different definitions, compiler/linker can’t easily notice and are not required to verify, so no build error (!) but you could hit strange run time errors.

Java linker is simpler and never causes any problem so I never look into it.

//* if I have exactly one inline, then the non-inlined version is used. 
// Linker doesn't detect the discrepancy between two implementations.
//* if I have both inline, then the build fails at link time, since neither lib
// emits an out-of-line sameFunc() for main.cpp to resolve against
//* if I remove both inline, then we hit ODR 
//* objdump on the object file would show the function name 
// IFF it is exposed i.e. non-inline
::::::::::::::
lib1.cpp
::::::::::::::
#include <iostream>
using namespace std;

//inline
void sameFunc(){
    cout<<"hi"<<endl;
}
::::::::::::::
lib2.cpp
::::::::::::::
#include <iostream>
using namespace std;

inline
void sameFunc(){
    cout<<"hey"<<endl;
}
::::::::::::::
main.cpp
::::::::::::::
void sameFunc(); //needed
int main(){
  sameFunc();
}

 

q[nm] instrumentation #learning notes

When you want to reduce the opacity of the c++ compiled artifacts, q(nm) is instrumental. It is related to other instrumentation tools like

c++filt
objdump
q(strings -a)

Subset of noteworthy features:
--print-file-name
--print-armap? Tested with my *.a file. The filename printed is different from the above
--line-numbers? Tested
--no-sort
--demangle? Similar to c++filt but c++filt is more versatile
--dynamic? for “certain” types of shared libraries
--extern-only

My default command line is


nm --print-armap --print-file-name --line-numbers --demangle
nm --demangle ./obj/debug/ETSMinervaBust/src.C/ReparentSource.o //worked better

In May 2018, I ran nm on a bunch of *.so files (not *.a) to locate missing symbol definitions. Once I found a needed symbol is exported by libabc.so, I had to add -labc to my g++ command line.

incisive eg show`difference with^without extern-C

---- dummy8.c ----
#include <stdio.h> 
//without this "test", we could be using c++ compiler unknowingly 😦
int cfunc(){ return 123; }
---- dummy9.cpp ----
#include <iostream>
extern "C" // Try removing this line and see the difference
  int cfunc();
int main(){std::cout << cfunc() <<std::endl; }

Above is complete source of a c++ application using a pre-compiled C function. It shows the need for extern-C.

/bin/rm -v *.*o *.out
### 1a
g++ -v -c dummy8.c # 
objdump --syms dummy8.o # would show mangled function name _Z5cfuncv
### 1b
gcc -v -x c -c dummy8.c # Without the -x c, we could end up with c++ compiler 😦
objdump --syms dummy8.o # would show unmangled function name "cfunc"
### 2
g++ -v dummy8.o dummy9.cpp  # link the dummy8.o into executable

# The -v flag reveals the c vs c++ compiler versions 🙂
### 3
./a.out

So that’s how to compile and run it. Note you need both a c compiler and a c++ compiler. If you only use a c++ compiler, then you won’t have any pre-compiled C code. You can still make the code work, but you won’t be mixing C and C++ and you won’t need extern-C.

My goal is not merely “make the code work”. It’s very easy to make the code work if you have full source code. You won’t need extern-C. You have a simpler alternative — compile every source file in c++ after trivial adjustments to #include.
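When the C code comes with its own header, the usual idiom is to guard the extern "C" block with the __cplusplus macro so that the same header works for both compilers. A minimal sketch follows; the header name cfunc.h and the caller callIntoC() are hypothetical, and the definition of cfunc() here is only a stand-in for the pre-compiled C object file.

```cpp
// ---- would live in cfunc.h, includable from both C and C++ ----
#ifdef __cplusplus
extern "C" {            // seen by the C++ compiler only: suppresses name mangling
#endif

int cfunc(void);

#ifdef __cplusplus
}
#endif

// ---- stand-in for the pre-compiled C object file ----
// In the real setup this definition is compiled separately by a C compiler.
extern "C" int cfunc(void) { return 123; }

// ---- C++ caller ----
inline int callIntoC() { return cfunc() + 1; }
```

A C compiler never defines __cplusplus, so it sees only the plain declaration of cfunc.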

c++dynamicLoading^dynamicLinking^staticLinking, basics

https://en.wikipedia.org/wiki/Dynamic_loading

*.so and *.dll files are libraries for dynamic linking.
*.a and *.lib files are libraries for static linking.

“Dynamic loading” allows an executable to start up in the absence of these libraries and integrate them at run time, rather than at link time.

You call dlopen("path/to/some.so", RTLD_LAZY), which is a library function declared in dlfcn.h rather than a system call. In Windows it’s LoadLibrary("path/to/some.dll")
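A minimal sketch of dynamic loading on Linux. It assumes the C math library is installed under the name libm.so.6 (true on typical glibc systems, but an assumption nonetheless); the function name cosViaDlopen is hypothetical. On older glibc versions the program must be linked with -ldl.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose

// Loads the named shared library at run time, looks up cos(double) in it and
// calls it. Returns false if the library or the symbol cannot be found.
bool cosViaDlopen(const char* libName, double x, double& result) {
    void* handle = dlopen(libName, RTLD_LAZY);
    if (handle == nullptr) return false;      // dlerror() describes the failure

    using CosFn = double (*)(double);
    CosFn fn = reinterpret_cast<CosFn>(dlsym(handle, "cos"));
    bool found = (fn != nullptr);
    if (found) result = fn(x);

    dlclose(handle);                          // drop our reference to the library
    return found;
}
```

The library name is an ordinary run-time string, so it could just as well come from a config file read long after main() has started.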

C++build error: declared but undefined variable

I sometimes declare a static field in a header, but fail to define it (i.e. give it storage). It compiles fine and may even link successfully. When you run the executable, you may hit

error loading library /home/nysemkt_integrated_parser.so: undefined symbol: _ZN14arcabookparser6Parser19m_srcDescriptionTknE

Note this is a shared library.
Note the field name is mangled. You can un-mangle it using c++filt:

c++filt _ZN14arcabookparser6Parser19m_srcDescriptionTknE -> arcabookparser::Parser::m_srcDescriptionTkn

According to Deepak Gulati, the binary files only contain mangled names. The linker and all subsequent programs deal exclusively with mangled names.

If you never use this field, the missing definition will not bother you! No reference to it ends up in the object file, so the linker and the dynamic loader have nothing to resolve.
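A small sketch of the unused case. The class name echoes the Parser in the error message above, but the fields m_used and m_neverUsed are hypothetical.

```cpp
struct Parser {                  // hypothetical stand-in for the real class
    static int m_used;           // declared here, defined below
    static int m_neverUsed;      // declared here, defined nowhere
};

int Parser::m_used = 1;          // the definition gives m_used its storage

// Only m_used is referenced, so the object file carries no undefined
// reference to Parser::m_neverUsed and nobody ever looks for it.
inline int readUsed() { return Parser::m_used; }
```

Adding a single read of Parser::m_neverUsed anywhere in the program brings back the undefined symbol error.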

c++class field defined]header,but global vars obey ODR

Let’s put function declaration/definition aside — simpler.

Let’s put aside local static/non-static variables — different story.

Let’s put aside function parameters. They are like local variables.

The word “static” is heavily overloaded and confusing. I will try to avoid it as far as possible.

The tricky/confusing categories are

  • category: static field. Most complex and better discussed in a dedicated post — See https://bintanvictor.wordpress.com/2017/02/07/c-static-field-init-basic-rules/
  • category: file-scope var — i.e. those non-local vars with “static” modifier
  • category: global var declaration — using “extern”
    • definition of the same var — without “extern” or “static”
  • category: non-static class field, same as the classic C struct field <– the main topic in the post. This one is not about declaration/definition of a variable with storage. Instead, this is defining a type!

I assume you can tell a variable declaration vs a variable definition. Our intuition is usually right.

The Aha, as [2] pointed out: a struct field listing is merely describing what constitutes a struct type, without actually declaring the existence of any variables, anything to be constructed in memory, anything addressable. Therefore, this listing is more like an integer variable declaration than a definition!

Q: So when is the memory allocated for this field?
A: when you allocate memory for an instance of this struct. The instance then becomes an object in memory. The field also becomes a sub-object.

Main purpose of keeping the struct definition in a header: the compiler needs to calculate the size of the struct. Completely different purpose from function or object declarations in headers. Scott Meyers discussed this in-depth along with class fwd declaration and pimpl.

See also

global^file-scope variables]c++ #extern

(Needed in coding tests and GTD, not in QnA interviews.)

Any object declared outside a block has “static duration” which means (see MSDN) “allocated at compile time not run time”

“Global” means extern linkage i.e. visible from other files. You globalize a variable by removing “static” modifier if any.

http://stackoverflow.com/questions/14349877/static-global-variables-in-c explains the 2+1 variations of non-local object. I have added my own variations:

  • A single “static” variable in a *.cpp file. “File-scope” means internal linkage i.e. visible only within the file. You make a variable file-scope by adding “static”.
  • an extern (i.e. global mutable) variable: I used this many times but still don’t understand all the rules. Basically in one *.cpp it’s defined, without “static” or “extern”. In other *.cpp files, it’s declared (via header) extern without an initial value.
  • A constant can be declared and defined in one shot as a static (i.e. file-scope) const. No need for extern and separate definition.
  • confusing — if you (use a shared header to) declare the same variable “static” in 2 *.cpp files, then each file gets a distinct file-scope mutable variable of the same name. Nonsense.

https://msdn.microsoft.com/en-us/library/s1sb61xd.aspx

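http://en.wikipedia.org/wiki/Global_variable#C_and_C.2B.2B says “Note that not specifying static is the same as specifying extern: the default is external linkage”. I doubted it at first. It is accurate about linkage for non-const variables, but the two forms still differ: “extern double x;” without an initializer is only a declaration, whereas “double x;” is a definition.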

I guess there are 3 forms:

  • static double — file-scope, probably not related to “extern”
  • extern double — global declaration of a var already Defined in a single file somewhere else
  • double (without any modifier) — the single definition of a global var. Will break ODR if in a shared header

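Note the in-class initialization rule is about const int static FIELD, not about a namespace-scope “const int”. (In C++ a namespace-scope const variable has internal linkage by default, which is why a constant can sit in a shared header even without “static”.)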

//--------- shared.h ---------
#include <iostream>
#include <string>
void modify();

extern std::string global; //declaration without storage allocation
static const int fileScopeConst1=3; //each implementation file gets a distinct copy of this const object
static double fileScopeMutable=9.8; //each implementation file gets a distinct copy of this mutable object
//double var3=1.23; //storage allocation. breaks the link step ("multiple definition") due to ODR!

// ------ define.C --------
#include "shared.h"
using namespace std;
string global("defaultValue"); //storage allocation + initialization
int main(){
  cout<<"addr of file scope const is "<<&fileScopeConst1<<std::endl;
  cout<<"addr of global var is "<<&global<<std::endl;
  cout<<"before modify(), global = "<<global<< "; double = "<<fileScopeMutable<<endl;
  modify();
  cout<<"after modify(), global = "<<global<< "; double (my private copy) = "<<fileScopeMutable<<endl;
}
// ------ modify.C --------
#include "shared.h"
void modify(){
  global = "2ndValue";
  fileScopeMutable = 700;
  std::cout<<"in modify(), double (my private copy) = "<<fileScopeMutable<<std::endl;
  std::cout<<"in modify(), addr of file scope const is "<<&fileScopeConst1<<std::endl;
  std::cout<<"in modify(), addr of global var is "<<&global<<std::endl;
}

linker dislikes [non-generic]function definition in shared header

I used to feel header files are optional so we can make do without them if they get in our way. This post shows they aren’t optional in any non-trivial c++ project. There is often only one (or few) correct way to structure the header vs implementation files. You can’t make do without them.

Suppose MyHeader.h is included in 2 cpp files and they are linked to create an executable.

A class definition is permitted in MyHeader.h:

class Test89{
void test123(){}
};

However, if the test123() is a free function, then linker will fail with “multiple definition” of this function when linking the two object files.

http://stackoverflow.com/questions/29526585/why-defining-classes-in-header-files-works-but-not-functions explains the rules

  • repeated definition of function (multiple files including the same header) must be inlined
  • repeated class definition (in a shared header) is permitted for a valid reason (sizing…). Since programmers could not only declare but define a member function in such a class, in a header, the compiler silently treats such member functions as inline
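A sketch of what such a shared header can safely contain. Test89 and test123 come from the example above; the include guard name and test456 are hypothetical.

```cpp
// ---- would live in MyHeader.h, included by two or more .cpp files ----
#ifndef MYHEADER_H
#define MYHEADER_H

class Test89 {
public:
    int test123() { return 89; }      // defined in the class body: implicitly inline
};

inline int test123() { return 123; }  // free function: needs "inline" to survive linking

// int test456() { return 456; }      // without "inline": "multiple definition" at link time

#endif
```

The include guard only prevents a double inclusion within one .cpp file; it does nothing about the same definition showing up in two object files, which is the linker's concern.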

[15]1st deep dive@linker + comparison with compiler

mtv: I feel linker errors are common. Linker is less understood than pre-processor or compiler. This know-how is more practical than a lot of c++ topics like MI, templates, op-new … Most real veterans (not just bookworm generals) would deal with some linker errors and develop some insight. These errors can take a toll when your project is running late. My textbook knowledge isn’t enough to give me the insight needed.

I believe compiler produces object files; whereas linkers take in object or library files and produce library or executable files.

Q: can linker take in another linker’s output?

http://www.lurklurk.org/linkers/linkers.html seems to be more detailed, but I have yet to read it through.

http://stackoverflow.com/questions/6264249/how-does-the-compilation-linking-process-work:

This object file contains the compiled code (in binary form) of the symbols defined in the input. Symbols in object files are referred to by name.

Object files can refer to symbols that are not defined. This is the case when you use a declaration, and don’t provide a definition for it. The compiler doesn’t mind this, and will happily produce the object file as long as the source code is well-formed.

(I guess the essence of linking is symbol resolution i.e. translating symbols to addresses) It links all the object files by replacing the references to undefined symbols with the correct addresses. Each of these symbols can be defined in other object files or in libraries.

During compilation, if the compiler could not find the definition for a particular function, it would just assume that the function was defined in another file. If this isn’t the case, there’s no way the compiler would know — it doesn’t look at the contents of more than one file at a time.

So what the compiler outputs is rough machine code that is not yet fully built, but is laid out so we know the size of everything, in other words so we can start to calculate where all of the absolute addresses will be located. The compiler also outputs a symbol table of name/address pairs. The symbols relate a memory offset in the machine code in the module with a name. The offset being the absolute distance to the memory location of the symbol in the module. That’s where we get to the linker. The linker first slaps all of these blocks of machine code together end to end and notes down where each one starts. Then it calculates the addresses to be fixed by adding together the relative offset within a module and the absolute position of the module in the bigger layout.
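The address arithmetic in the last two sentences can be sketched as a toy model. This is not how a real linker is implemented (it ignores sections, alignment and relocation records), and the types Symbol and Module and the function resolve() are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Symbol { std::string name; std::size_t offset; };   // offset within its module
struct Module { std::size_t size; std::vector<Symbol> symbols; };

// Lay the modules end to end and compute each symbol's final address as
// (start of its module in the combined layout) + (offset within the module).
std::map<std::string, std::size_t> resolve(const std::vector<Module>& modules,
                                           std::size_t loadBase) {
    std::map<std::string, std::size_t> addresses;
    std::size_t moduleStart = loadBase;
    for (const Module& m : modules) {
        for (const Symbol& s : m.symbols) {
            addresses[s.name] = moduleStart + s.offset;
        }
        moduleStart += m.size;   // the next module begins where this one ends
    }
    return addresses;
}
```

For example, with a first module of size 0x100 placed at 0x1000, a symbol at offset 0x10 in the second module ends up at 0x1000 + 0x100 + 0x10 = 0x1110.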

c++compiler must know data type sizes

http://stackoverflow.com/questions/6264249/how-does-the-compilation-linking-process-work points out

So what the compiler outputs is rough machine code that is not yet fully built, but is laid out so we know the size of everything, in other words so we can start to calculate where all of the absolute addresses will be located. The compiler also outputs a list of symbols which are name/address pairs. The symbols relate a memory offset in the machine code in the module with a name. The offset being the absolute distance to the memory location of the symbol in the module.

Stroustrup told me about a key performance advantage of c++ over modern languages: local variables. If we want to use more local variables and fewer heap objects, then I can understand why the compiler needs to know the size of every data type. It has to reserve stack space for each local object at compile time.
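A small sketch of why sizes matter, assuming two hypothetical types Point and Opaque (neither is from the original post). A local Point needs the full type definition, whereas a pointer to an incomplete type compiles fine because a pointer's size does not depend on what it points to.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type used only for illustration.
struct Point {
    double x;
    double y;
};

// An incomplete (forward-declared) type: its size is unknown here.
struct Opaque;

// The compiler reserves sizeof(Point) bytes in this function's stack frame,
// which it can only do because the full definition of Point is visible.
double sum_local_point() {
    Point p{1.5, 2.5};   // local object, no heap allocation
    return p.x + p.y;
}

// A pointer has a known size whatever it points to, so this compiles
// even though Opaque is still incomplete.
std::size_t size_of_opaque_pointer() {
    Opaque* ptr = nullptr;
    return sizeof(ptr);
}

// Uncommenting the next line fails to compile: sizeof needs a complete type.
// std::size_t bad = sizeof(Opaque);
```

Usage: assert(sum_local_point() == 4.0) should hold, and sizeof(Point) is at least 2 * sizeof(double).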

C++ one source file -> one object file

Not sure if you can compile multiple source files into a single *.obj file…

http://stackoverflow.com/questions/6264249/how-does-the-compilation-linking-process-work:

Compilation refers to the processing of source code files (.c, .cc, or .cpp) and the creation of an ‘object’ file. This step doesn’t create anything the user can actually run!

Instead, the compiler merely produces the machine language instructions that correspond to the source code file that was compiled. For instance, if you compile (but don’t link) three separate files, you will have three object files created as output, each with the name .o or .obj (the extension will depend on your compiler). Each of these files contains a translation of your source code file into a machine language file — but you can’t run them yet! You need to turn them into executables your operating system can use. That’s where the linker comes in.

func^var in header files

Many authors describe a single set of rules for func declaration vs variable declaration in header files. However, for some beginners, it might make more sense to assume the rules are largely independent and unrelated between func vs var.

Note I will use “include file” and “header file” interchangeably.

The word “static” is heavily overloaded and confusing. I will try to avoid it as far as possible.

(Perhaps because functions declared in header files are implicitly global) We are more familiar with the rules on functions
– Rule: each some.cpp file that uses a shared func1() must have func1 “declared”. Best done via include file.
– Rule: across all the .cpp files, there must be exactly 1 definition of func1, otherwise linker complains. Consequently, this definition must go into a non-include file, so the linker sees it only once.
– exception: implicit inline func definition can be included in headers
– exception: file-scope static func definition can be included in headers
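The function rules and the two exceptions can be sketched in a single file. The header name util.h and the helpers twice, plus_one and Helper are hypothetical; the comments mark which lines would live in the shared header and which in exactly one .cpp file.

```cpp
#include <cassert>

// --- what a shared header (say, a hypothetical util.h) could contain ---

// Declaration: may be seen by every .cpp file, any number of times.
int func1(int x);
int func1(int x);   // a repeated declaration is harmless

// Exception 1: an inline definition may appear in every translation unit.
inline int twice(int x) { return 2 * x; }

// A member function defined inside the class body is implicitly inline.
struct Helper {
    int triple(int x) const { return 3 * x; }
};

// Exception 2: a file-scope static definition gives each translation unit
// its own private copy, so the linker never sees a clash.
static int plus_one(int x) { return x + 1; }

// --- what exactly one .cpp file would contain ---

// The single non-inline definition of func1.
int func1(int x) { return twice(x) + plus_one(x); }
```

Usage: func1(3) evaluates twice(3) + plus_one(3), so assert(func1(3) == 10) should hold.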

——global variables——-
Let’s start with a big Backgrounder (because it’s rather confusing) — Let’s ignore fields of classes/structs and just focus on classic C variables. The most common and most simple variables are local to a function — function-scope static/non-static variables. The other variables are essentially static-duration variables[1]. A simple type of non-local variable is a file-scope but unshared variable. It can be used across functions within that single file, but here we are interested in global shared variables. I think there are various categories but as hinted in the rather authoritative [[essential c++]] P53, most common is a file-scope static var. The way to globalize such a variable var2 is

Rule: use extern in an include file, so every .cpp file will “see” this extern declaration of var2
Rule: in exactly one .cpp file, define var2 without “extern” or “static”
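A minimal sketch of the two rules, again folded into one file. The header name globals.h and the accessor functions are assumptions made for illustration; var2 is the variable named in the text.

```cpp
#include <cassert>

// --- in the shared include file (hypothetical name: globals.h) ---
extern int var2;          // declaration only: no storage is allocated here

// --- in any .cpp file that includes the header ---
int read_var2() { return var2; }
void bump_var2() { ++var2; }

// --- in exactly one .cpp file ---
int var2 = 42;            // the one definition: no "extern", no "static"
```

Usage: read_var2() returns 42 at first, and 43 after one call to bump_var2(). Defining var2 in a second .cpp file would normally produce a duplicate-symbol error at link time.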

[1] Someone said online “A local variable is a variable that is declared inside a function. A global variable is a variable that is declared outside all functions“. Scott Meyers discussed “non-local static variable”.

—— static field eg sf2 ——-
See also post on static Field init, and P115 [[essential c++]]
Usually must be defined exactly once outside class Declaration (which is part of include files). Somewhat similar to shared global variables.

Rule — each some.cpp file that uses this field must have it “declared”. Best done via include file.
Rule — across all the .cpp files, there must be exactly 1 definition of this field. Consequently, this must go into a non-include file and be compiled only once. In that definition the field name should be prefixed with the class name, but without the “static” keyword.
Rule — NO “extern” please.
Rule — (special) — const static Integral field can be initialized (not “defined”) in the declaration. See
http://stackoverflow.com/questions/370283/why-cant-i-have-a-non-integral-static-const-member-in-a-class. But watch out for http://stackoverflow.com/questions/3025997/c-defining-static-const-integer-members-in-class-definition
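The static-field rules can be sketched as follows. The class Account and the constant kMaxUsers are hypothetical; sf2 is the field name used in the heading above.

```cpp
#include <cassert>

// --- class declaration, as it would appear in a header ---
class Account {                       // hypothetical class for illustration
public:
    static int sf2;                   // declaration only, no storage yet
    static const int kMaxUsers = 8;   // const integral: initialized in-class
    Account() { ++sf2; }
};

// --- in exactly one .cpp file ---
// Class-name prefix, no "static" keyword, no "extern".
int Account::sf2 = 0;

int accounts_created() { return Account::sf2; }
```

Usage: constructing two Account objects raises accounts_created() by 2. Using kMaxUsers as a plain value (for example as an array bound) needs no out-of-class definition; taking its address or binding it to a reference may, which is the pitfall the second link warns about.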

fwd class declaration illustrated #C++Succinctly

[[c++succinctly]] has a good illustration of fwd declaration — ClassB.h uses FCD for ClassA, instead of including ClassA.h. ClassA.h happens to include Windows.h.

Does this FCD help classB compilation? No. ClassB.cpp still has to include ClassA.h (and things like Windows.h — even though unneeded.) because the Implementation of ClassB methods would probably [1] need to “unwrap” the ClassA pointers (or ClassA references) to access ClassA members.

[1] otherwise, we may erase ClassA completely from ClassB source code.
Allow me to repeat —
ClassB.h is physically smaller thanks to FCD, but that doesn’t simplify ClassB compilation.

In that case who, if not ClassB, does this FCD help?

A: any class (like ClassK) header files that include ClassB.h.
A: any class (like ClassK) implementation files that include ClassB.h.

Compiling ClassK could be much faster thanks to this single FCD.

So here’s the takeaway — a fwd class declaration of ClassA only helps “grandchild” ClassK and doesn’t help the immediate downstream ClassB.

By the way, as pointed out in [[c++succinctly]] this FCD would be impossible if ClassB.h were to use any nonref variables of ClassA. Nonref means sizeof(ClassA) is needed to compile ClassB.
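A one-file sketch of the arrangement described above. The members value_ and valueFromA are made up for illustration; only the names ClassA and ClassB come from the text. The comments mark which part would live in which file.

```cpp
#include <cassert>

// --- ClassB.h: a forward declaration instead of #include "ClassA.h" ---
class ClassA;                      // incomplete type: size and members unknown

class ClassB {
public:
    explicit ClassB(ClassA* a) : a_(a) {}
    int valueFromA() const;        // declared only; the body needs ClassA's members
private:
    ClassA* a_;                    // a pointer is fine: its size does not depend on ClassA
    // ClassA copy_;               // would not compile here: sizeof(ClassA) is unknown
};

// --- ClassA.h: the full definition (members are hypothetical) ---
class ClassA {
public:
    explicit ClassA(int v) : value_(v) {}
    int value() const { return value_; }
private:
    int value_;
};

// --- ClassB.cpp: must see the full ClassA to dereference the pointer ---
int ClassB::valueFromA() const { return a_->value(); }
```

Usage: with ClassA a(7); ClassB b(&a); the call b.valueFromA() returns 7. A ClassK that includes only ClassB.h never sees ClassA.h or anything it pulls in.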

[[effC++]] goes one step further to introduce pimpl, which is closely related to FCD

c++ compilation dependency – my take on effC++

(Based on P144 effC++)

* size of each class instance
* #include
* header (assuming one header file per class) contains field listing of the class

These are the key points of ComplDependency. CD means that if an upstream file changes, then a downstream file needs recompiling.

As an analogy, imagine the “downstream” file (your app class) has an auto-generated Table of Contents of a large MS-word document. The Word document includes sub-documents (utility classes), which in turn include other sub-documents (lib classes). An edit in any included document could render the TOC obsolete, so the TOC needs regenerating (a re-compilation).

In the traditional (simple) design, the app class Person has its fields declared as type String, Date, Address etc. Any change in the size of Type Address triggers re-compilation of our class.

In pimpl or the java world, all those fields are moved into a PersonImpl class which is still CD on Type Address. However, the domino effect stops at PersonImpl. Our app class Person needs no recompilation since size of Type Person is unaffected.
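A minimal pimpl sketch under stated assumptions: the fields name and city are stand-ins for the String and Address fields mentioned above, and std::unique_ptr is one common way to hold the implementation pointer (the book's own example may differ). The size of Person stays one pointer regardless of what PersonImpl contains.

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- Person.h: what downstream code compiles against ---
class Person {
public:
    Person(const std::string& name, const std::string& city);
    ~Person();                       // defined where PersonImpl is complete
    std::string label() const;
private:
    struct PersonImpl;               // forward declaration only
    std::unique_ptr<PersonImpl> impl_;
};

// --- Person.cpp: the only place that depends on the field types ---
struct Person::PersonImpl {
    std::string name;
    std::string city;                // stand-in for the Address type in the text
};

Person::Person(const std::string& name, const std::string& city)
    : impl_(new PersonImpl{name, city}) {}

Person::~Person() = default;

std::string Person::label() const {
    return impl_->name + " (" + impl_->city + ")";
}
```

Usage: Person p("Ann", "Oslo"); then p.label() returns "Ann (Oslo)". Adding a field to PersonImpl forces a recompile of Person.cpp only, not of the files that include Person.h.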

extern !! q[extern C]

This post is about extern on global variables. The other (overload) usages of extern include
1) extern “C” — see other posts
2) extern “anyOtherLang” — rejected by most compilers. Beyond “C” and “C++”, support for other language strings is implementation-defined

Basic purpose of extern on non-functions? Create a declaration without definition. See c++Primer P396. Rules

* use extern, not extern “C”
* on a non-function — basically a global variable
* without initializer

Suppose a.cpp Defines a globalVar1 and this is to be shared.

  1. My rule 1: if globalVar1 is shared only with one b.cpp file, then I could put the extern declaration in b.cpp. Nowadays I seldom choose this option. It’s better to follow one standard rule (Rule 2) rather than multiple rules.
  2. My rule 2: if globalVar1 is shared with potentially multiple cpp files, then it is better to put the extern in a.h

troubleshooting bloodshed c++

If you get 
[Linker error] undefined reference to `__dyn_tls_init_callback’
Then maybe 2 mingw versions are in conflict. Try renaming c:\mingw to something else
Here’s a better solution — http://www.allegro.cc/forums/thread/472245
Use the /lib and /mingw32 folders in your own mingw to overwrite those originals in the dev-cpp folder.
It may be safe to remove the mingw32 folder.

static: 3meanings for c++objects

http://www.cprogramming.com/tutorial/statickeyword.html echoes my analysis below.

I feel “static” has too many overloaded meanings, almost unrelated. (Java is better.) Maybe there are just 3 mutually exclusive meanings

1) local static vars — simplest meaning of “static”. As inherited from C, static local vars retain their values between function calls. My Chartered system relied heavily on these.

2) static field — the java meaning.

3) file scope — We know local variables (block scope) and global variables (extern). There is one (intermediate?) level of scoping for variables: file scope using the “static” keyword. A variable with file scope can be accessed by any function or block within a single file. To declare a file-scoped variable, declare it outside of any block with the “static” keyword; without “static” it gets external linkage, i.e. it becomes a global. (Note all of these 3 types are about “objects”, rather than “variables”.) There’s a technical jargon for this: non-local static object, sometimes IMPLICITLY static. See [[effC++]] P221.

Are all these objects allocated at compile time, not dynamically at run time? How about auto variables?

How about static_cast? Irrelevant to this discussion. Nothing to do with object type
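The three meanings can be put side by side in one short sketch. All names here (next_id, Widget, file_scope_calls) are hypothetical.

```cpp
#include <cassert>

// Meaning 3: file scope. Visible to every function in this file only
// (internal linkage), invisible to other .cpp files.
static int file_scope_calls = 0;

// Meaning 1: local static. Initialized once, keeps its value between calls.
int next_id() {
    static int id = 0;
    ++file_scope_calls;
    return ++id;
}

// Meaning 2: static field. One copy shared by all instances, as in Java.
class Widget {                 // hypothetical class
public:
    static int live_count;
    Widget() { ++live_count; }
    ~Widget() { --live_count; }
};
int Widget::live_count = 0;    // the single out-of-class definition

int calls_so_far() { return file_scope_calls; }
```

Usage: two consecutive calls to next_id() return consecutive numbers, and Widget::live_count goes up and down as Widget objects are created and destroyed. All three objects have static storage duration, so none of them lives on the stack or the heap.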

3 storage classes for c++variables: lifetime^scope

external
static
thread_local

(Most important type for me is the external…)

https://www.programtopia.net/cplusplus/docs/storage-classes points out that storage classes describe the lifetime and visibility of a variable.

https://docs.microsoft.com/en-us/cpp/cpp/storage-classes-cpp is more authoritative.

Initially there were only 2: auto and external. “register” was added as a subtype of auto, and static as a subclass of external.

External variables are global variables. I feel globals and file-scope static objects are stored in the same area, not on stack or heap.

extern C on myFunc: 1st demystification

This post is about extern “C”, See other posts (in this blog) on other uses of extern.

extern “C” (along with extern “C++”) is the only form guaranteed by the standard. It is designed to help linking with Fortran, C or even other c++ modules. See the chapter in [[moreEffC++]]

  • Basic purpose on functions? Suppress name mangling. See moreEffC++ for more details. In the most common usage, as in the post “incisive example showing diff: with^without extern-C”,
    • the pre-compiled c library function has no mangling in its name.
    • without extern-C, the c++ calling function would apply mangling to the callee name and fail to match the actual callee name
    • with extern-C, the c++ calling function would not apply mangling to the callee name.
  • extern-c wraps func prototypes. You could make extern-c wrap func implementations, but not common.
  • linker — extern is a linkage feature.
  • Interaction with #include? See P858 [[c++Primer]]

— Re-declarations ?
Forget about extern-C first. Any function prototype can appear multiple times, even in “rapid fire”, like void a(); void a(); void a(). The compiler simply accepts the repeats. Extern-c doesn’t change the rule. See p366 c++Primer.
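A small sketch of both points: extern "C" wrapping prototypes, and harmless re-declarations. The names c_style_add and cpp_add are made up. Normally the body of the extern "C" function would live in a pre-compiled C library; it is defined here only to keep the example self-contained.

```cpp
#include <cassert>

// Prototypes wrapped in extern "C": the symbol is emitted as plain
// "c_style_add", with no C++ name mangling.
extern "C" {
    int c_style_add(int a, int b);
    int c_style_add(int a, int b);   // re-declaration: harmless
}

// Ordinary C++ functions: their symbol names are mangled to encode the
// parameter types, which is what makes overloading possible.
int cpp_add(int a, int b) { return a + b; }
double cpp_add(double a, double b) { return a + b; }

// Less common: extern "C" on the implementation. Normally this body would
// live in a pre-compiled C library instead.
extern "C" int c_style_add(int a, int b) { return a + b; }
```

Usage: c_style_add(2, 3) and cpp_add(2, 3) both return 5. A tool such as nm on the object file would be expected to show the unmangled name c_style_add next to the mangled names for the two cpp_add overloads.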

shared^static library – phrasebook

http://stackoverflow.com/questions/2649334/difference-between-static-and-shared-libraries
is good

zipfile — static lib (not shared lib) is created using q(ar) and is conceptually a zipfile

  • static = unshared. A static library (some.a) is “copied” into your executable image, enlarging it.
  • copied
  • enlarge

shared = dynamic library — In unix, some.so.n means SSSharedObject. In windows some.dll means DDDynamic Link Library.

  • baggage — using Static library, your executable doesn’t carry any external “baggage” i.e. the shared library files.
  • recompile — after a library upgrade, rebuilding (re-linking) your executable is necessary only if using “static” library

http://stackoverflow.com/questions/4250624/ld-library-path-vs-library-path explained it well — your libraries can be static or shared. If it is static then you don’t need to search for the library after your program is compiled and linked. If shared then LD_LIBRARY_PATH is used when searching for the shared library, even after your executable has been successfully compiled and linked. This resembles dotnet DLL or jars — you can upgrade a DLL/jar without recompiling the executable.

c/c++ headers, func prototypes

In C [2], before you use a function[1], you must declare (like abstract) or define (i.e. implement) it. Alex (lab49) explained that c compiler is single-pass. Java and C# are multiple-pass.

Say we don’t like to create func declarations. We must define a function before calling it from another func. We must be careful to arrange func definitions. As a developer, I don’t want to worry about the ordering of func definitions. One worry-free solution is to declare all functions upfront. One step further is header file, which has all the func declarations.

[2] C++ is like C. Objective-C is the same.
[1] variables too.
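A short sketch of a case where no ordering of definitions can work without a prototype: two mutually recursive functions. The names is_even and is_odd are made up for illustration.

```cpp
#include <cassert>

// Forward declaration (prototype). Without this line, is_even below would
// not compile, because the compiler has not yet seen is_odd.
bool is_odd(unsigned n);

bool is_even(unsigned n) {
    return n == 0 ? true : is_odd(n - 1);
}

bool is_odd(unsigned n) {
    return n == 0 ? false : is_even(n - 1);
}
```

Usage: is_even(10) and is_odd(7) both return true. Moving the prototype into a header is the "one step further" described above.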

again, what’s a header file for a c++ class

defining feature: MY Header file is #included by OTHER code, via preprocessor text insertion (similar to macro expansion)

feature: header file usually contains MY field listing so compiler can do sizeof()

Q: Does the object file include field listing?
A: not important

Q: is header file NOT compiled into the object file? In that case, the field listing is not physically saved in the object file?
A: At runtime, the assembly instructions in the object file have enough details to instantiate objects of my class.

Q: header file creates compile-time dependency, but how about run-time dependency?
http://www.gotw.ca/gotw/007.htm
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml – is a real-world coding standard in a real software company.

c++ header file == #define@@

Whenever you see ANY “header file”, think of #define. It’s the same copy-paste process. It’s the same pre-processor. You ship object files not header files to your users, because header files are not referenced at runtime.
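Since a custom header cannot be shown in a single self-contained file, here is a sketch of the same copy-paste behaviour using #define instead. The macro names are hypothetical. The preprocessor pastes text without understanding it, which is why the unparenthesized macro gives a surprising answer.

```cpp
#include <cassert>

// Pure text substitution: the preprocessor pastes the argument in as-is.
#define NAIVE_SQUARE(x) x * x
// Parenthesized version, written defensively because of that pasting.
#define SAFE_SQUARE(x) ((x) * (x))

// Expands to 1 + 2 * 1 + 2, which is 5 rather than 9.
int naive_square_of_sum() { return NAIVE_SQUARE(1 + 2); }

// Expands to ((1 + 2) * (1 + 2)), which is 9.
int safe_square_of_sum() { return SAFE_SQUARE(1 + 2); }
```

Usage: naive_square_of_sum() returns 5 and safe_square_of_sum() returns 9. An #include goes through the same preprocessor: the header's text is pasted in place before the compiler proper sees anything.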

———- Forwarded message ———-
From: Chuck

Hi,

  The trick to header files is that they are nothing but string substitution. If you copied the contents of the header inline in the file where the include directive is, you get exactly the same result. One little trick that can help is, the compiler will take an argument that tells it to preprocess only (‘-E’ I think in g++). You’ll see what the expanded headers and macros look like with that option.