clean language==abstracted away{hardware

Nowadays I use the word “clean” language more and more.

Classic example — java is cleaner than c# and c++. I told a TowerResearch interviewer that java is easier to reason with. Fewer surprises and inconsistencies.

The hardware is not “consistent” as wished. Those languages closer to the hardware (and programmers of those languages) must deal with the “dirty” realities. To achieve consistency, many modern languages make heavy use of heap and smart pointers.

For consistency, everything is treated as a heapy thingy, including functions, methods, metadata about types …

For consistency, member functions are sometimes treated as data members.

new languages lose limelight soon

New languages come into and go out of fashion, like hairstyle and cars.

  • Java has been out of the limelight for 10 years but is still fairly popular due to the job market and high-profile, “cool” companies like google.
  • C and C++ are among the top 5 longest-living languages. They are unpopular among the young, but they remain important.
  • c# is a curious case. I think the really important windows applications are still written in c++.
  • Many high-level languages go out of fashion, including C#, D, ..
  • Most scripting or dynamic languages go out of fashion


kernels(+lang extensions)don’t use c++

Q: why are kernels usually written in C, not c++? The answer is an underpinning of the value and longevity of the languages.

I asked Stroustrup. He clearly thinks c++ can do the job. As to why C still dominates, he cited a historical reason. Kernels were written long before C++ was invented.

Aha — I think there are conventions and standard interfaces (POSIX is one of them)… always in C.

I said “The common denominator among various languages is always a C API”. He said that’s also part of what he meant.

benchmark c++^newer languages

C++ can be 5x faster than java if both programs are well-tuned — A ball-park estimate given by Stroustrup.

When a benchmark shows a smaller gap, the c++ code is often written like java code: lots of pointers and virtual functions, no inlining, and perhaps too many heap allocations (STL containers) rather than strictly-stack variables.

Many other benchmarks are similarly questionable. These new languages out there are usually OO and rely on GC + pointer indirection. If you translate their code to C++, the resulting c++ code would be horribly inefficient, not taking advantage of the c++ compiler’s powers. An expert c++ developer would rewrite everything to avoid virtual functions and favor local variables and inlining, and possibly use compile-time programming. The binary would usually become comparable in the benchmark. The c++ compiler is more sophisticated and has more optimization opportunities, so it usually produces faster code.

local_var=c++strength over GC languages

Stroustrup told me c++ code can use lots of local variables, whereas garbage collected languages put most objects on heap.

I hypothesized that whether I have 200 local variables in a function, or no local variable, the runtime cost of stack allocation is the same. He said it’s nanosec scale, basically free. In contrast, with heap objects, biggest cost is allocation. The deallocation is also costly.

Aha — At compile-time, compiler already knows how many bytes are needed for a given stack frame

insight: I think local variables don’t need pointers. GC languages rely heavily on “indirect” pointers. Since GC often relocates objects, the pointer content needs to be translated to the current address of the target object. I believe this translation has to be done at run time. This is what I mean by “indirect” pointer.

insight — STL containers almost always use heap, so they are not strictly “local variables” in the memory sense

language war: 4 criteria

  1. criteria-E: efficiency, performance including platform-specific optimization.
    • Ling of MS felt threading support is crucial
    • many interactive or batch applications don’t care much about latency
  2. criteria-P1: proven, widely used in “my” community (me = tech lead), with mature ecosystem.
  3. criteria-F: killer features, including OO
  4. criteria-RAD: rapid application development, simplicity, ease-of-use

It is instructive to study the “dethrone” stories.

  • Case: On server-side, java dethroned c++ due to P1, RAD, E
    • In contrast, windows GUI seldom use java. They use 1)c++ 2)c# due to P1, F and E
  • Case: Python dethroned perl due to RAD
  • Case: shell scripting is very old but survived, due to F
  • Case: php survived due to F (designed for web server side), RAD

20Y long term trend: demand for high-level language skillset is rising (unsteadily) relative to java/c++. The reasons are numerous and include RAD, F, P1.

Q: which factor is the biggest threat to java?
A: F, RAD, not E! I guess the RAD productivity gap between java and python isn’t so big. For large modularized projects, java probably has a productivity advantage.

Q: will c++ (or even java) be relegated to assembly’s status? No but I wonder why.

heap usage: C ilt C++/Java #Part 2

I now recall that when I programmed in C, my code never used malloc() directly.

The library functions probably used malloc to some extent, but malloc was an advanced feature. Alexandrescu confirmed my experience and said that C programmers usually make rather few malloc() calls, each time requesting a large chunk. Instead of malloc, I used mostly local variables and static variables. In contrast, C++ uses heap much more:

  • STL containers are 99% heap-based
  • virtual functions require pointer, and the target objects are usually on heap, as Alexandrescu said on P78
  • pimpl idiom i.e. private implementation requires heap object, as Alexandrescu said on P78
  • the c++ reference is used mostly for pass-by-reference. Pass-by-reference usually works with heap objects.

Whereas C code requests a few large chunks, C++ code tends to allocate many small chunks of heap memory.

Across languages, heap usage is slow because

  • In general OO programming uses more pointers, more indirection and more heap objects
  • heap allocation is much slower than stack allocation, as Stroustrup explained to me
  • using a heap object always requires a runtime indirection. The heap object has no name, only an address !
  • In Garbage-Collected languages, there’s one more indirection.

java primitives have no address #unlike C

In C, any variable, including those on stack, can have its address printed.

In java, the primitive variables have no address. Every reference type object has an address, by definition (“reference” means address)

C# is somewhat mixed and I’m not going into it.

Python rule is extreme, simple and consistent. Every object has an address. You can print id(x) or getrefcount(x)

>>> from sys import getrefcount as rc
>>> i=40490
>>> rc(i)
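
The REPL lines above can be expanded into a small script. This is only a sketch: in CPython id() happens to return the memory address, and the exact getrefcount() numbers vary by interpreter version, so the code only looks at how the count moves. The names alias, box and extra are made up for illustration.

```python
import sys

i = 40490
alias = i                      # a second name bound to the same int object
same_object = id(i) == id(alias)

box = [i]                      # hypothetical container, used only to watch the count move
before = sys.getrefcount(box)
extra = box                    # one more reference to the same list
after = sys.getrefcount(box)

# id() is an integer; in CPython it is the object's address
print(hex(id(i)), same_object, after - before)
```

Even a plain integer has an identity you can print, which is the contrast with java primitives.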

[19] assignment^rebind in python^c++j

For a non-primitive, java assignment is always rebinding. Java behavior is well-understood and simple, compared to python.

Compared to python, c++ assignment is actually well-documented .. comparable to a mutator method.

Afaik, python assignment is always rebinding, even for an integer. Integer objects are immutable and reference counted.
In python, if you want two functions to share a single mutable integer variable, you can declare a global myInt.
It would be in the global idic/namespace. q[=] on myInt has special meaning similar to

idic[‘myInt’] =..

Alternatively, you can wrap the int in a singular list and call list mutator methods, without q[=].
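
A minimal sketch of the two options just described, assuming a module-level name my_int and a one-element list called shared (both names are illustrative, not from my github experiment). bump_via_namespace is an extra helper added to show the namespace-dict view of the same rebinding.

```python
my_int = 10  # module-level name: an entry in the module's namespace dict

def bump_global():
    global my_int            # without this line, the assignment would create a local
    my_int = my_int + 1      # rebinds the module-level name to a new int object

def bump_via_namespace():
    ns = globals()           # the namespace dict itself
    ns["my_int"] = ns["my_int"] + 1   # similar effect to the global assignment

def read_global():
    return my_int

def bump_boxed(box):
    # box is a one-element list; only mutator methods are used, no '=' on the int
    box.append(box.pop() + 1)

shared = [10]
bump_global()
bump_via_namespace()
bump_boxed(shared)
print(read_global(), shared)
```

Both functions that touch my_int rebind a name. Only bump_boxed mutates an object in place, and the int inside the list is still replaced rather than modified.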

See my experiment in github py/88lang and my blogpost on immutable arg-passing

python nested function2reseat var] enclos`scope

My maxPalindromeSubstr code in demos the general technique, based on

Note — inside your nested function you can’t simply assign to such a variable. This is like assigning to a local reference variable in java. explains the fundamental property of java reference parameter/argument-passing. Basically same as the python situation.

In c# you probably (99% sure) need to use ref-parameters. In c++, you need to pass in a double-pointer. Equivalently, you can pass in a reference to a pre-existing 64-bit ptr object.
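
For the python side, here is a hedged sketch of two common ways to reseat a variable in the enclosing scope: the Python 3 nonlocal keyword, and a one-element list used as a mutable cell. This is not necessarily the technique in my maxPalindromeSubstr code, and the function names are invented for illustration. The third function shows the failure you get from a plain assignment.

```python
def make_counter_nonlocal():
    count = 0
    def increment():
        nonlocal count       # Python 3: rebind the variable in the enclosing scope
        count += 1
        return count
    return increment

def make_counter_boxed():
    count = [0]              # one-element list as a mutable cell
    def increment():
        count[0] += 1        # mutates the list; the name 'count' is never rebound
        return count[0]
    return increment

def make_counter_broken():
    count = 0
    def increment():
        count += 1           # plain assignment makes 'count' local to increment()
        return count
    return increment
```

Calling make_counter_broken()() raises UnboundLocalError, because the assignment turns count into a local of the nested function before it has a value.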

closestMatch in sorted-collection: j^c++^python

Both of the above operate on sorted arrays. C++ also has

— java has the cleanest yet rich interface. P236 (P183 for Set) [[java generics]] lists four methods belonging to the NavigableMap interface

  • ceilingEntry(key) — closest entry higher or equal
  • higherEntry(key) — closest entry strictly higher than key
  • lowerEntry
  • floorEntry
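
Python has no NavigableMap in the standard library, but the four lookups can be approximated over a sorted list of keys with the bisect module. A rough sketch, assuming the keys are already sorted and that None is an acceptable "no such key" result. The function names mirror the java methods but are my own.

```python
from bisect import bisect_left, bisect_right

def ceiling(keys, k):
    """Smallest key >= k, or None."""
    i = bisect_left(keys, k)
    return keys[i] if i < len(keys) else None

def higher(keys, k):
    """Smallest key strictly > k, or None."""
    i = bisect_right(keys, k)
    return keys[i] if i < len(keys) else None

def floor(keys, k):
    """Largest key <= k, or None."""
    i = bisect_right(keys, k)
    return keys[i - 1] if i > 0 else None

def lower(keys, k):
    """Largest key strictly < k, or None."""
    i = bisect_left(keys, k)
    return keys[i - 1] if i > 0 else None
```

With keys = [10, 20, 30], ceiling(keys, 20) and floor(keys, 20) both give 20, while higher gives 30 and lower gives 10. To emulate the Entry variants you would keep a parallel list of values or a dict keyed by the same keys.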

c++complexity≅30% above java #c#=in_between

Numbers are just gut feelings, not based on any measurement. I often feel “300% more complexity” but it’s nicer to say 30% 🙂

  • in terms of interview questions, I have already addressed this in numerous blog posts.
  • see also mkt value@deep-insight: java imt c++
  • tool chain complexity in compiler+optimizer+linker… The c++ compiler is 200% to 400% (not merely 30%) more complex than java… see my blogpost on buildQiurks. Here are some examples:
    • undefined behaviors … see my blogposts on iterator invalidation
    • RVO: top example of the optimizer frustrating anyone hoping to verify basic move-semantics.
    • See my blogpost on gdb stepping through optimized code
    • See my blogpost on implicit
  • syntax: c++ >> c# > java
    • java is very very clean yet powerful 😦
    • C++ has too many variations, about 100% more than c# and 300% more than java
  • core language details required for GTD:
    • my personal experience shows me c++ errors are more low-level.
    • Java runtime problems tend to be related to the (complex) packages you adopt from the ecosystem. They often use reflection.
    • JVM offers many runtime instrumentation tools, because JVM is an abstract, simplified machine.
  • opacity: c++ > c# > java
    • dotnet IL bytecode is very readable. Many authors reference it.
    • java is even cleaner than c#. Very few surprises.
  • more low-level: c++ > c# > java.
    • JVM is an excellent abstraction, probably the best in the world. C# CLR is not as good as JVM; it is a thin layer above the windows OS.

pick java if you aspire 2be arch #py,c#

If you want to be architect, you need to pick some domains.

Compared to python.. c#.. cpp, Java appears to be the #1 best language overall for most enterprise applications.

  • Python performance limitations seem to require proprietary extensions. I rarely see pure python server that’s heavy-duty.
  • c# is less proven and less mature. More importantly it doesn’t work well with the #1 platform, linux.
  • cpp is my 2nd pick. Some concerns:
    • much harder to find talents
    • Fewer open-source packages
    • java is one of the cleanest languages. cpp is a blue-collar language, rough around the edges and far more complex.

[18] Integer(like String)objects always immutable: java+python #XR

Integer (like String) objects are always immutable in java. My google search confirmed that.

Beside serialization, there is no practical reason to deep-copy them.

Python integer objects are also immutable. Every time we modify an int, the new value has a different id(). See also my blog post on python immutable types.
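
The id() observation is easy to reproduce. A tiny sketch, keeping a second name (old) on the original object so that both ints are alive at the same time and their ids can be compared safely:

```python
x = 1000
old = x            # keep a second name on the original object
x += 1             # builds a new int object and rebinds x
changed_id = id(x) != id(old)
print(old, x, changed_id)
```

The original object still holds 1000. Only the name x moved to a different object.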


growth factor ] string/vector/hashtable #xLang

  1. std::string
  2. vector
  3. python list
  4. ArrayList
  5. hashtables

… all have algorithms to decide how much extra capacity to acquire during re-allocation. The growth factor is usually at most 2.0.
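
For the python list, the over-allocation can be observed indirectly. A rough sketch that relies on sys.getsizeof() reflecting the allocated capacity, which is a CPython behaviour and not a language guarantee. The helper name resize_points is made up.

```python
import sys

def resize_points(n):
    """Lengths at which the list's reported size changes while appending n items."""
    items = []
    last = sys.getsizeof(items)
    points = []
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last:       # the underlying array was re-allocated
            points.append(len(items))
            last = size
    return points

print(resize_points(1000))
```

The list re-allocates far fewer times than it appends, which is the whole point of growing by a factor instead of by one slot.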

ptr = inevitable when using c-str

It is impossible to use any string without using pointers in C, according to

That’s one reason to call C a lowLevel language.

In most c++ string classes, there’s still a c-string inside every “string object”. I am 99% sure the char-array now lives on heap.

In java and c#, not only the char-array, but the entire string object (including the house-keeping data) lives on heap.

[19] j^python arg passing #(im)mutable

Python is ALWAYS pbref.

— list/arrayList, dict/HM etc: same behavior between java and py —
pbref. Edits by the function hit the original object.

— string: probably identical behavior between java and py —
immutable in both java and py.

Passed by reference in both.

If function “modifies” the string content, it gets a new string object (like copy-on-write), unrelated to the original string object in the “upstairs” stackframe.

— ints, floats etc: functionally similar between java and py —
If function modifies the integer content, the effect is “similar”. In python a new int object is created (like string). In java this int is a mutable clone of the original, created on the stackframe.

Java int is primitive, without usable address and pbclone, never pbref.

Python numbers are like python strings – immutable objects passed by reference. Probably copy-on-write.
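
The python half of the comparison can be checked with three small functions. A sketch only, with illustrative names: the list edit is visible to the caller, while the string and int "modifications" merely rebind the function's own local name.

```python
def append_item(items):
    items.append(99)         # mutates the caller's list object

def rebind_text(text):
    text = text + "!"        # builds a new str; the caller's object is untouched
    return text

def rebind_number(n):
    n += 1                   # builds a new int and rebinds the local name only
    return n

nums = [1, 2]
greeting = "hi"
count = 5

append_item(nums)
new_text = rebind_text(greeting)
new_count = rebind_number(count)
print(nums, greeting, new_text, count, new_count)
```

After the calls nums has grown, but greeting and count in the caller are unchanged.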

python^perl #my take

(Collected from various forums + my own, but ranking is based on my experience and judgment)

  1. OO
    1. – Not sure about perl6, but not many app developers create perl classes. Many CPAN modules are OO though. Python users don’t create many classes either but more than in perl. I guess procedural  is simple and good enough.
    2. – not sure about perl6, I feel perl OO is an afterthought. Python was created with OO in mind.
    3. – even if you don’t create any class, out-of-the-box python relies (more deeply) on more OO features than CPAN modules. Object-orientation is more ingrained in Python’s ethos.
    4. OO is important for large-scale, multi-modular development projects. Python is more battle-tested in such a context, partly due to its OO design, but i think a more important reason is the python import machinery, which is probably more powerful and more widely used than perl’s
  2. – Python can cooperate better with other popular technologies like DotNet (IronPython) and Java (Jython). See also python project in boost.
  3. – perl’s text processing (and to a lesser extent unix integration) features are richer and more expressive. Using every key on the keyboard is like using all 20 fingers and toes. I immediately felt the difference when switching to java. In this aspect, python is somewhere between those 2 extremes.
  4. – perl can be abused to create unreadable line-noise; python has a rather clean syntax
  5. – As part of unix integration, perl offers many practical one-liners competing effectively with grep/awk/sed. Perl one-liners integrate well with other unix commands
  6. – Perl was initially designed for unix, text and as an alternative for shell script programmers. It’s still very good for that purpose. For that purpose, I feel OO offers limited value-add.
  7. – for unix scripting, perl is more similar to unix shell scripting at least on the surface, but python isn’t hard to learn.
  8. – I feel perl was more entrenched and more established in certain domains such as unix system automation, production support, bio-informatics
  9. big data, data science and data analytics domains have already picked python as the default scripting language. Perl may be usable but not popular
  10. – some say Perl the base language is much larger(??) than Python’s base language so probably it takes longer(???) time to learn Perl than Python but in the end you have a more expressive language???
  11. – CPAN was once larger than python’s module collection. If CPAN’s collection is A and python’s module collection is B, then I personally feel (A-B) are mostly non-essential modules for specialized purposes.
  12. – Python has better support for Windows GUI apps than Perl.
  13. I feel the open source contributions are growing faster in python
  14. There are more books on python

is Set based on Map@@ #c++,python

–java: I believe set is often based on map…

–std::set is based on RBTree, same as std::map. Matt Austern said

“The Associative Container map is very similar, except that in a set the elements are their own keys while in a map the elements are pairs.”


CPython source code for set (according to Achim Domma) is mostly a cut-and-paste from the dict implementation.

c# static class emulated in java/c++ #all-static


use a (possibly nested) namespace to group related free functions. See google style guide.

c# has static classes. C++ offers something similar — P120 effC++. It’s a struct containing static fields. You are free to create multiple instances of this struct, but there’s just one copy for each field object. Kind of alternative design for a singleton.

This simulates a namespace.


In [[DougLea]] P86, this foremost OO expert briefly noted that it can be best practice to replace a java singleton with an all-static class

–c# is the most avant-garde on this front

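  • C# static classes can be stateful but rarely are
  • a C# static class cannot declare an instance ctor (only a static ctor); the older emulation is a sealed class with a private ctor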

big guns: template4c++^reflection4(java+python)

Most complex libraries (or systems) in java require reflection to meet the inherent complexity;

Most complex libraries in c++ require template meta-programming.

But these are for different reasons… which I’m not confident to point out.

Most complex python systems require … reflection + import hacks? I feel python’s reflection (as with other scripting languages) is more powerful, less restricted. I feel reflection is at the core of some (most?) of the power features in python – import, polymorphism

technical advantages of c# over java #XR

Hi XR,

Based on whatever little I know, here are some technical advantages of c# over java.

(Master these c# features and mention them in your next java interview 🙂)

  • C# has many more advantages on desktop GUI, but today let’s focus on server side.
  • [L] generics —- c# generics were designed with full knowledge of java/c++ shortcomings. Simpler than c++ (but less powerful), but more complete than java (no type erasure). For example see type constraints.
  • [L] delegates —- Rather useful. Some (but not all) of its functionalities can be emulated in java8.
  • [L] c# can access low-level windows concurrency constructs such as event wait handles. Windows JVM offers a standardized, “reduced-fat” facade. If you want optimal concurrency on windows, use VC++, or c#.
  • [L] reflection —- is more complete than java. Over the years java reflection proved to be extremely powerful. Not sure if c# has the same power, but c# surely added a few features such as Reflection.Emit.
  • concurrency —- dotnet offers many innovative concurrency features. All high level features, so probably achievable in java too.
  • tight integration with COM and MS Office. In fact, there are multiple official and unofficial frameworks to write Excel add-ins in c#
  • tight integration with high-level commercial products from Microsoft like MSSQL, sharepoint
  • tight integration with windows infrastructure like Windows Services (like network daemons), WCF, Windows networking, Windows web server, windows remoting, windows registry, PowerShell, windows software installation etc
  • c# gives programmers more access to low-level windows system API, via unmanaged code (I don’t have examples). In contrast, Java programmers typically use JNI, but I guess the java security policy restricts this access.
  • probably higher performance than JVM on windows
  • CLR offers scripting languages, F#, IronPython etc, whereas JVM supports scripting languages javascript, scala, groovy, jython etc.

[L = low-level feature]

If you want highest performance on Windows, low-level access to windows OS, but without the complexity of VC++ and MFC, then c# is the language of choice. It is high-level, convenient like java but flexible enough to let you go one level lower when you need to.

Another way to address your question: listen to the complaints against java. (Put aside the complaints of GUI programmers.)

Even if a (rational, objective) architect doesn’t recognize any of these as important advantages, she may still favor c# over java because she is familiar and competent ONLY in the Microsoft ecosystem. She could point out countless features in Visual Studio and numerous windows development tools that are rather different from the java tool set, so different that it would take months and years to learn.

Also, there are many design trade-offs and implementation techniques built on and for Dotnet. If she is reliant on and comfortable in this ecosystem, she would see the java ecosystem as alien, incomplete, inconvenient and unproductive. Remember when we first moved to the U.S.: everything was inconvenient.

On a more serious note, her design ideas may not be achievable using java. So java would appear to be missing important features and tools. In a nutshell, for her java is a capable and complete ecosystem theoretically, but in practice an incomplete solution.

[17]MS java threading IV#phone


These are in the QQ category i.e. skills required for QnA IV only.

Q1: 3 threads to print the numbers 1,2,3,4,5… in deterministic, serial order. Just like single-threaded.
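
One possible answer to Q1, sketched in python instead of java for brevity, using the same lock + condition idea. I am assuming the intended scheme is round-robin (thread 1 prints 1, thread 2 prints 2, thread 3 prints 3, thread 1 prints 4 ...). The function name and the printed list are my own inventions.

```python
import threading

def print_in_turn(n_threads=3, limit=10):
    cond = threading.Condition()
    state = {"next": 1}      # next number to print, guarded by cond
    printed = []             # (thread_index, number) pairs in print order

    def my_turn_or_done(tid):
        return state["next"] > limit or (state["next"] - 1) % n_threads == tid

    def worker(tid):
        with cond:
            while True:
                # sleep until it is this thread's turn, or all numbers are printed
                cond.wait_for(lambda: my_turn_or_done(tid))
                if state["next"] > limit:
                    return
                print(f"T{tid + 1}: {state['next']}")
                printed.append((tid, state["next"]))
                state["next"] += 1
                cond.notify_all()   # wake the others so the next owner can proceed

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return printed
```

The java version would look much the same with a ReentrantLock plus Condition, or synchronized plus wait/notifyAll.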

Q1b: what if JVM A has T1, T2, and JVM B has T3? How do they coordinate?
%%A: in C++ shared memory is the fastest IPC solution for large data volume. For signaling, perhaps a semaphore or named pipe
%%A: I feel the mutex is probably a kernel object, accessible by multiple processes.

On Windows, mutex, wait handle, … are all accessible cross-process, but java (on Windows or unix) is designed differently and doesn’t have these cross-process synchronization devices.

%%A: use a database table with one row one column. Database can notify a JVM.
AA: The java.nio.file package provides a file change notification API, called the Watch Service API. The registered JVM has a thread dedicated to watching.
AA: in java, the JDK semaphore is NOT a wrapper of the operating system semaphore so not usable for IPC
A: java Semaphore? Probably not an IPC construct in java.

Q2: have you used any optimized Map implementations outside the JDK?

Q3: to benchmark your various threading solutions how do you remove the random effects of GC and JIT compilation?
%%A: allocate enough memory to avoid GC. Either turn off JIT, or force it to compile every code path up front. Perhaps give the JVM some warm-up period to complete the JIT compilation before we start the benchmark.

[17] j^c++^c# churn/stability…

This comparison has a bias towards java. My observation is finance-centric.

Q: why do I feel c# as a t-investment is not as long-living as java and C (the longest-living)?
A: Java is not tied to any OS. Java is used on windows + many unix derivatives including linux and android, whereas c# and objective-C are tied to particular platforms.

However look at Perl. It is cross-platform but was displaced by vbscript on windows and python.

| notes | worst – best | factor |
| --- | --- | --- |
| Microsoft doesn’t care about this | c# jav c++ | protection of your investment |
| | c# jav c++ | churn |
| c++ has no single owner making those decisions #worst to best | c# jav c++ | stability of features |
| #younger to older | c# jav c++ | longevity (new tech tends to die young or change too much) |
| JVM is huge help | c++ c# jav | ease of troubleshooting |
| | c++ c# jav | maintainability of app |
| c# can be back-end but less proven; c++ can crash easily, with no exception to catch | c# c++ jav | underlying stability as long-running server |
| windows is murky. Java: type erasure | c++ c# jav | dark corners undocumented |
| #high to low | c++ c# jav | syntax complexity |
| #worst to cleanest | c++ c# jav | clean language |
| #low to high | jav c++ c# | expressiveness |
| c++ much higher #low to high | c++ | entry barrier at senior level |
| c# can be ez to learn, but java is even cleaner #easy to hard | java c# c++ | initial learning |
| only Microsoft provides #low to high | c# c++ jav | std+3rdParty libraries |
| c#: only on windows | c++ c# jav | popularity |
| c#: limited to Windows and GUI/web | c# c++ jav | wide appeal, general-purpose |
| each lang has x% high-end jobs but c++ percentage (30%) is highest | c# | salary |
| | c++ c# java | job market depth |

python class^module(singleton)

Both a module and a class are based on an idic. Both can contain “system-wide” i.e. global variables, effectively singleton objects.

A module is like a singleton class, without constructor or inheritance.

Global variables and singletons — I figured these out because 2nd time you import a module, the module-level objects aren’t created again! You can safely construct objects at the module level and all of them become global variables. Is this similar to c++ namespace variables?

Calling a method on a class goes through __getattr__ hook. Probably no such thing in a module?

Importing a regular module actually executes the *.py file – no such feature with a class.

Modules (not classes) are hooked into the (non-trivial) import mechanism.

Module functions vs Methods have differences. All module-level methods are like classmethods, so a module can be simpler to use if you want a simple singleton class.
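
The "imported only once" behaviour can be seen without creating files. A sketch under assumptions: it uses the stdlib json module as the guinea pig, and registers an in-memory module under the made-up name app_config to stand in for a real *.py file.

```python
import importlib
import sys
import types

# importing the same module twice yields the very same object
first = importlib.import_module("json")
second = importlib.import_module("json")
same_module = first is second is sys.modules["json"]

# hypothetical module built in memory; normally this would be app_config.py
app_config = types.ModuleType("app_config")
app_config.counter = 0                 # module-level "global", stored in the module's dict
sys.modules["app_config"] = app_config

def bump():
    import app_config as cfg           # served from sys.modules, nothing is re-executed
    cfg.counter += 1
    return cfg.counter
```

Every importer sees the same counter, which is why a module can play the role of a simple singleton.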


python dict bad lookup key: solutions

A common scenario. myDict[‘baaadKey’] throws exception. compared the solutions.

My use case — if any key is not found, return a const default value.

— 1) Solution: defaultdict class. To my surprise, my use case is not easily supported. Need to define a factory method.

— 2) Solution: mydict.setdefault(myKey, myDefault) of the regular dict class. Note this solution is similar to the get() solution below, and it does NOT set a single default for the entire dict. explains setdefault() with clear examples, but I can’t remember the multiple rules so I won’t use setdefault in coding tests.

— 3) simplest Solution: mydict.get(myKey, myDefault).
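
The three solutions side by side, as a small sketch with made-up data (prices, DEFAULT). The difference worth noticing is which of them insert the missing key as a side effect.

```python
from collections import defaultdict

DEFAULT = -1
prices = {"apple": 3}

# 1) defaultdict needs a zero-argument factory, not the default value itself
dd = defaultdict(lambda: DEFAULT, prices)
dd_value = dd["banana"]                       # returns -1 and ALSO inserts "banana" into dd

# 2) setdefault returns the existing value, or stores and returns the default
sd = dict(prices)
sd_value = sd.setdefault("banana", DEFAULT)   # inserts "banana": -1 into sd

# 3) get returns the default without touching the dict
plain = dict(prices)
get_value = plain.get("banana", DEFAULT)

print(dd_value, sd_value, get_value, "banana" in plain)
```

For a read-only lookup with a constant default, get() is the only one of the three that leaves the dict unchanged.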


## turn on asserts: 5 lang

===python: enabled by default

-O and -OO python parameters will strip away your assertions. Tested.
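
An in-process way to see the same effect, without restarting the interpreter: the built-in compile() takes an optimize argument, where 0 keeps asserts and 1 strips them, much like -O. A sketch only; the helper name assertion_fires is made up.

```python
SRC = "assert False, 'assertion fired'"

def assertion_fires(optimize):
    """Compile SRC at the given optimization level and report whether the assert triggers."""
    code = compile(SRC, "<demo>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return True
    return False

print(assertion_fires(0), assertion_fires(1))
```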

===C: enabled by default, but you need to #include <assert.h>

To disable, #define NDEBUG. Or you can pass -DNDEBUG to the GCC compiler

How to disable assert in GCC

===java: disabled by default

-ea enables assertions.

===Perl: disabled (unavailable) by default

use Carp::Assert; # like C include

===c#: disabled only in Release mode by default


c++ parametrized functor – more learning notes

Parametrized Functor (class template) is a standard, recommended construct in c++, with no counterpart in java. C# delegate is conceptually simpler but internally more complex IMO, and represents a big upgrade from c++ functor. Better get some clarity with functor before comparing with delegates.

The most common functor is the simple stateless functor (like a simple lambda). The 2nd common category is the (stateful but) immutable functor. In all cases, the functor is designed for pass-by-value (not by ref or by pointer), cheap to copy, cheap to construct. I see many textbook code samples creating throwaway functor instances.

Example of the 2nd category – P127[[essentialC++]].

A) One mental block is using a simple functor Class as a template type param. This is different from java/c#

B) One mental block is a simple parametrized functor i.e. class template. C++ parametrized functor can take various type params, more “promiscuous” than c#/java.

C) A bigger mental block, combining A+B, is a functor parametrized with another functor Class as a template param. See P186[[EssentialC++]].

This is presented as a simple construct, with about 20 lines of code, but the more I look at it, the stranger it feels. I think this is somewhat powerful but unpopular and unfamiliar to mainstream developers.

Functional programming in c++?

In java, we write a custom comparator class rather than a comparator class Template. We also have the simpler alternative of a Comparable interface, but that’s not very relevant in this discussion. Java 8 lambda …

tryGet, tryAdd … on maps – c#, STL etc

* in dotnet the plain lookup operation will throw exception if non-existent. I hit this frequently in my projects…
*** STL map silently adds an empty entry! For the justifications (crazy interviewers?), see
*** java returns null
* basic Dictionary offers TryGet…()

TryAdd? Concurrent dict only, not the basic Dictionary.
* in dotnet, the plain Add() will throw exception if clash
*** STL map will silently ignore the insert() attempt, but operator[] can be used to insert or overwrite ???

TryRemove? No such thing. Using the plain Remove to remove a non-existent is no-throw.

process-wide exception handling, across lang

label – not sure…

I blogged before that one of the “services” a managed environment offers is (uncaught) exception handling.

By design, try/catch exception handling is per-thread. However, an innovative feature is the app-domain-wide exception handler that handles any “unhandled” exception on any thread. JVM and CLR both support this feature. (Actually they aren’t bullet-proof. They can’t handle stack overflow for example. I think it all boils down to the design of the VM.)

Q: Why doesn’t c++ have this feature (except set_terminate())?  No VM, but exactly how?

Now I think any VM (CLR, JVM etc) offers “containers” to contain your applications. A CLR can contain 2 app domains. When some thread in an app domain throws an exception and you don’t catch it, the container can “notice” it and fire the event. At least it has a chance to print the stack trace, which is sometimes impossible in c++.

I think the CLR is the “container” or “host” to the app-domain. The OS doesn’t see the code running in the app-domain; OS sees the CLR only. If app-domain generates an uncaught exception then the host/container VM won’t let it bring down entire VM. VM will print the exception and continue to live and host the other app-domains.

In contrast, in a traditional compiled application like c++, the uncaught exception would bring down entire process, as if the CLR itself throws an exception.

In a crude analogy, suppose I write a container (like a CLR) in C that evaluates a byte array containing some byte code. If the byte code throws exception, my container can print the error and continue to live.

I feel an OS is also a container. Without the OS, a program (compiled into assembly) can run directly on a CPU. Any error would crash the entire CPU. The OS isolates one such program from other programs, so one crasher doesn’t bring down entire OS.
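
The "container notices the uncaught exception and keeps living" idea can be imitated in python, which is also a VM-hosted language. A crude sketch, assuming Python 3.8+ for threading.excepthook. The run_apps function and the app names are hypothetical, and each "app" is just a callable run on its own thread.

```python
import threading

def run_apps(apps):
    """Run each app on its own thread; record uncaught exceptions instead of losing them."""
    failures = []

    def on_uncaught(args):
        # args carries exc_type, exc_value, exc_traceback and thread
        failures.append((args.thread.name, args.exc_type.__name__, str(args.exc_value)))

    previous = threading.excepthook
    threading.excepthook = on_uncaught     # one handler for every thread in the process
    try:
        threads = [threading.Thread(target=app, name=name) for name, app in apps.items()]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        threading.excepthook = previous    # restore the default handler
    return failures
```

An app that raises is reported by the host, while the other apps and the host itself run to completion.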

convert sequence@obj→strings→concat #listCompr

P60 [[cookbook]] shows a neat trick

>>> data = ['aa', 50, 91.1]
>>> ', '.join(str(d) for d in data)
'aa, 50, 91.1'

Technique: generator expression,
Technique: str() conversion ctor. Without conversion, joining string with non-string leads to exceptions.
Technique: calling join() method on the delimiter string

The author points out that string concat can be very inefficient if you blindly use “+” operator. Similarly, java/dotnet offers stringBuilders, and c++ offers stringstream

top 2 threading constructs in java ^ win32

Update: Now I think condition is not a fundamental construct in c#. The wait handle is. WH are based on primitive kernel objects…. See other posts

(Soundbyte: I feel the know-how about low-level constructs is more valuable/versatile/powerful. Interviewers often recognize that. These include CAS, conditions.)

See other posts on the additional threading constructs added by dotnet …
See also my post on NSPR, a cross-platform library with heavy concurrency emphasis.

Most important, practical and popular [1] constructs —
* pthreads       — 1) locks, 2) conditions
* java              — 1) locks, 2) conditions. On a “higher” layer, timers; thread pools; queues and other thread-safe collections
* win32/dotnet — 1) locks, 2) event WaitHandle (not conditions)… Also timers, thread pools. Task is supposed to be popular too.
* the dbx debugger offers “mutex” and “condition” commands as the only 2 thread-related features (beside the “thread” command)

In Win32, there are 2 lock constructs
1a) mutex : usable cross-process , like other kernel objects
1b) CRITICAL_SECTION : single-process only, like other userland objects

In general, locks alone are sufficient for simple thread coordination. Sometimes you need fancier tools —
– In java, you can use wait/notify, which is another primitive.
– In C#, wait/notify seems less popular. WaitHandle seem to be popular. Wait handles are designed (by MS) to be a more complex but feature-rich notification /construct/ than conditions. See However, experts agree conditions are more powerful. The IAsyncResult uses wait handles for inter-thread signal.

Interlocked/atomicVar seems to be equally popular in java and c#.

I think, except for CAS, all other threading constructs rely on locks and conditions. As [[c# threading]] points out, you can simulate most features of dotnet Wait Handles by simple Condition techniques. However, wait handles (but not conditions) support IPC.

[1] some techniques are important and practical but poorly marketed and unpopular — wait/notify, immutable, interrupts,..
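To illustrate the claim that a wait handle can be simulated with a plain condition, here is a rough sketch in Python, whose threading.Condition is close in spirit to the java/pthreads condition. The class name ManualResetEvent and its method names are hypothetical, loosely modelled on the dotnet vocabulary. This is single-process only, so it has none of the IPC capability of a real wait handle.

```python
import threading


class ManualResetEvent:
    """Hypothetical manual-reset wait handle built on one lock + one condition."""

    def __init__(self):
        self._cond = threading.Condition()   # a condition bundled with its lock
        self._signaled = False

    def set(self):
        with self._cond:
            self._signaled = True
            self._cond.notify_all()          # wake every waiting thread

    def reset(self):
        with self._cond:
            self._signaled = False

    def wait_one(self, timeout=None):
        # Returns True if signaled, False if the timeout expired first.
        with self._cond:
            return self._cond.wait_for(lambda: self._signaled, timeout)
```

The flag stays set until reset() is called, so late-arriving waiters pass straight through. That is the main behavioural difference from a bare notify.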

struct in C is like c# value-type

Before C++, java or c#, C offered the struct. This is a true-blue value type. When you put a struct type variable on the LHS, the entire struct instance with all its fields is cloned bitwise.

If one of the fields happens to be a pointer like a c_str, then the address inside the pointer field is copied.

Besides pbclone, you can also work with a pointer to struct, which is a bit advanced.

In C++, the struct is backward compatible with C — pbclone by default.

C++ also added lots of features into the struct construct. It's essentially identical to the class except members are public by default.

Therefore, c++ class/struct instances follow value semantics (pbclone) by default

In java, there's only class, no struct. Any class instance is pbref — simple and clean. You never get bitwise copy with java class instances.

In c#, the class behaves just like java classes. The struct behaves like C struct.

Fwd: dotnet vs jvm performance on Windows

Hi Sunil,

I tend to get into long debates on controversial tech topics, so I hope I don’t do that again here.

You mentioned c# outperforms java on the server side. I find it intriguing.

First off, the truly performance-sensitive systems either use mainframe/supercomputers (large volume, parallel processing) or C++/assembly (for latency sensitive apps). I assume we have no disagreement on their performance advantages over virtual machines like CLR or JVM.

The Next level of high-performance data server is perhaps represented by (but not exclusively) database and messaging servers. Even the new web 2.0 shops focus (i believe) most of their tuning effort on these data-heavy engines. C/c++ dominate. I also worked with ETL products like Informatica and Ab Initio. These are heavy duty data engines for fairly large volumes. All C/C++. They actually tried java.

Many of the servers used in finance are smaller and somewhat less demanding, but still there’s non-trivial requirement on performance. C++, java and c# compete in this space. Traditionally c++ won. In recent years, I have heard claims that java could outperform c++, probably on unix benchmarks.

On windows, I still think c++ has an edge. Between dotnet and jvm, I won’t be surprised if dotnet IL outperforms java bytecode. However, in finance more application servers run unix/linux than windows.

I am no expert on database vendors, but I’d draw a parallel. Oracle is more popular on *nix. It runs on windows too but perhaps not as fast as MS-SQL. Microsoft would not release a database engine if it is beaten by a competitor on microsoft’s own OS. If we were to compare oracle vs mssql, it’s an unfair contest if done on windows — MSSql has home advantage.

A more interesting contest would be java/linux vs c#/windows on the same hardware.

c# enum is more like c++ or java@@

I told a c++ veteran that c# enum is more like c++ and less like java enum (singleton-based). He felt the other way. He points out that each c# enum has rich metadata, attributes, reflection, and even a few methods (from System.Object). Also, each enum type is a distinct type in the type system, which is important to reflection. You can use extension methods to add functionality specific to a particular enum type.

I now feel here’s the root of the issue — c# added lots of convenience features to simple value types like integer and float. A c# enum is fundamentally a special integer type. Passed by value, it behaves like a simple integer.

So my soundbyte is, if c++ enum is a plain integer, then c# enum is an integer enriched with convenience features.

Compared to java — java enums have real OO features, but c# enums get the features by other means.

SAM interface^lambda, across 3 languages

(In this discussion I suppose it’s probably OK to ignore the multicast feature of delegates.)

update: see also java lambda and Single-Abstract-Method interface

I used to feel a (unicast) delegate TYPE is quite similar to a SAM interface. Now I doubt it.

C++ has abstract classes. When all the methods are pure virtual, that’s an interface (as in java). In this tradition, an interface is typically implemented by a Stateful class. Most textbooks and most schools introduce interface of this kind. What if the implementation class is stateless?

(To keep things simple let’s suppose there’s just 1 method.)  Then the objects needed by the method must be passed in. This feels like a static utility method without the static keyword. Such an interface is quite a different animal from the traditional interface. A non-capturing Lambda is the best example. But also
– static nested classes in java
– anonymous delegates

Where c# API uses a lambda, java often uses a SAM interface, since Java didn’t support lambda until Java 8.

I feel in both c# and c++, lambda is often used as a function argument (not “parameter”). Imagine you have a method parameter whose type is an SAM interface,
– and this interface has just 1 (or few) implementation(s)
– and the instance of this implementation class is basically stateless,

then this SAM parameter is probably a lambda trapped in an SAM. C# 3.0 and c++11 would set it free.
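A small sketch of the “lambda trapped in a SAM” idea, written in Python purely for brevity. KeyExtractor, ByLength, sort_with_sam and sort_with_lambda are hypothetical names of my own. The first version forces the caller to wrap a stateless function in a one-method class; the second accepts the function directly.

```python
from abc import ABC, abstractmethod


class KeyExtractor(ABC):
    """Hypothetical single-abstract-method interface."""

    @abstractmethod
    def key(self, item):
        ...


class ByLength(KeyExtractor):
    """Stateless implementation: everything it needs is passed in."""

    def key(self, item):
        return len(item)


def sort_with_sam(items, extractor):
    # The parameter type is the SAM interface; only its one method is used.
    return sorted(items, key=extractor.key)


def sort_with_lambda(items, key_fn):
    # Same job once the lambda is "set free": pass the function itself.
    return sorted(items, key=key_fn)
```

Usage: sort_with_sam(words, ByLength()) and sort_with_lambda(words, lambda w: len(w)) give the same ordering. The class adds ceremony but no state.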

suggest(hint) a GC run – CLR ^ JVM

Java’s gc() method is a suggestion/plea to JVM at run-time. (There’s no way to force an immediate GC cycle. I often call gc() some 4000 times in a few loops to coax the GC to start.)

CLR Collect() method with Optimized mode is exactly the same.

CLR Collect() method with Forced mode would force an immediate collection. No such feature in JVM.

All of these techniques are discouraged. When are they justified?

atomic operations offered in c++11 ^ pthreads libraries

P341 [[c++ concurrency in action]] has a nice table showing about 7 to 10 most important concurrency features across pthreads vs c++11 [2]. It’s obvious to me the 2 fundamental[1] and universally important features are locks and condVars. These are thoroughly and adequately supported everywhere — pthreads, c++11, java, c#. Beyond these 2 features, other features aren’t universally supported across the board. Let’s look at some of them.

— Atomic operations —
pthreads? no support
c++11? yes atomic types
java? yes atomic variables
boost? no support at the time (Boost.Atomic was added later, in Boost 1.53)
c#? yes interlocked

Notice in the C#, c++11, java thread libraries there are specific constructs (classes and methods) for atomic INT and atomic BOOLEAN (but not atomic float), because in practice most atomic algorithms use primitive int and boolean types.

Atomic operations don’t always rely on specific CPU instructions. They are often “offered” on (no more than a few) specific atomic data types. Can we apply atomic operations on a random data type, like some method in a regular class? I doubt it. I feel in each thread library, there are no more than a few specific atomic Operations, tied to a handful of atomic data types.

— thread pool —
java? yes
c#? yes
c++11? no
pthreads? no
boost? To my surprise, No, according to the maintainer of boost::thread

I think it’s technically very feasible to implement thread pool using locks and condVars, so this feature is left out of the c/c++ base libraries.

[1] “Fundamental” is an imprecise term that I would not spend too much time debating. In c#, locks and condVars aren’t really rock-bottom fundamental. They are based on more fundamental constructs namely primitive kernel objects. In other languages, I’m not sure. Locks and condVars are often implemented in thread libraries not syscalls. It’s a bit obscure, arcane and even irrelevant to many developers.
[2] (and java and boost too, but this blog is about c++)
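To back up the remark that a thread pool is very feasible with just locks and condVars, here is a rough sketch in Python (threading.Condition bundles a lock with a condVar). TinyThreadPool and its method names are hypothetical. A production pool would also need error handling, since an exception in a task would kill its worker here.

```python
import threading
from collections import deque


class TinyThreadPool:
    """Hypothetical fixed-size pool built on one lock + one condVar."""

    def __init__(self, num_workers):
        self._cond = threading.Condition()
        self._tasks = deque()
        self._closed = False
        self._workers = [threading.Thread(target=self._run)
                         for _ in range(num_workers)]
        for worker in self._workers:
            worker.start()

    def submit(self, fn, *args):
        with self._cond:
            if self._closed:
                raise RuntimeError("pool is shut down")
            self._tasks.append((fn, args))
            self._cond.notify()              # wake one idle worker

    def shutdown(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()          # wake all workers so they can exit
        for worker in self._workers:
            worker.join()

    def _run(self):
        while True:
            with self._cond:
                # Loop guards against spurious wakeups.
                while not self._tasks and not self._closed:
                    self._cond.wait()
                if not self._tasks:
                    return                   # closed and queue drained
                fn, args = self._tasks.popleft()
            fn(*args)                        # run the task outside the lock
```

shutdown() lets the workers drain the queue before they exit, so every submitted task still runs.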

deep clone in c# _ c++ _ java — briefly

A theoretical IV question — how is deep clone (aka deep copy) done?  Remember Each non-trivial object is an object graph.

reflection is a universal big-hammer. Covers private fields and covers all “embedded” base objects. However, usually the author of a given class should decide how to deep-clone her class, rather than using reflection.

deep-copy and serialization — Remember serialization needs to recreate entire graph on a remote machine. Deep copy required.

deep-copy and c++ big3 — A c++ class having a pointer field must be careful about big3. Deep-copy often required.

deep-copy and object equality — both may need to traverse the object graph.

java clone() method is controversial and isn’t widely used in any of the projects I have seen. C# ICloneable is also controversial. Stackoverflow mentions that Microsoft recommends against implementing ICloneable because there’s no clear indication from the interface whether your “Clone” method performs a “deep” or “shallow” clone.

Incidentally, stack overflow exception can happen during recursive object graph traversal, even if we take care of cycles.
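A quick illustration of shallow vs deep clone on a cyclic object graph, using Python's standard copy module as a neutral example. The Node class is a hypothetical stand-in for any non-trivial object. copy.deepcopy keeps a memo of objects already cloned, which is how it copes with the cycle.

```python
import copy


class Node:
    """Hypothetical graph node; 'children' makes each object an object graph."""

    def __init__(self, name):
        self.name = name
        self.children = []


root = Node("root")
child = Node("child")
root.children.append(child)
child.children.append(root)        # deliberate cycle: root -> child -> root

shallow = copy.copy(root)          # new Node, but shares the same children list
deep = copy.deepcopy(root)         # clones the whole graph, cycle included
```

After this, shallow.children is the very same list object as root.children, while deep has its own child whose back-pointer leads to deep rather than root.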

finally{} should never throw

Principle EE — exception throwing during exceptional stack unwinding is often disastrous.

dtor should never throw, largely because of EE.

A finally{} block should not throw. This block often runs due to some exceptionA (i.e. exceptional stack unwinding as in EE). If the finally block itself throws exceptionB, then exceptionA is simply hidden. As bad as dtor throwing.

same as java finally{}

Important case: Dispose() should not throw. Dispose() often runs in finally{} or in a finalizer (on the finalizer thread). There are reportedly 10 other places not to throw!
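The hiding effect is easy to reproduce. Below is a sketch in Python rather than c# or java; the function names are made up for illustration. One difference worth noting: Python 3 keeps exceptionA reachable through the __context__ attribute of exceptionB, whereas the caller in the c#/java scenario described above sees only exceptionB.

```python
def cleanup_that_throws():
    try:
        raise ValueError("exceptionA")       # the original problem
    finally:
        raise RuntimeError("exceptionB")     # thrown during unwinding


def what_the_caller_sees():
    try:
        cleanup_that_throws()
    except Exception as exc:
        # Report the exception that escaped, plus the one kept as its context.
        return type(exc).__name__, type(exc.__context__).__name__
```

The caller catches RuntimeError, not the ValueError that started the unwinding.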

jar ^ dotnet-assembly

I feel the dotnet “assembly” concept borrows the java “package” and “jar” features. Here we compare assembly with jar

– physical files? Yes both are physical files
– executable? assembly can be EXE or DLL. Jar can come with a main class or without. Jar is more like DLL.
– contains IL bytecode? Yes. However, you could have a native assembly (see ngen.exe)
– unit of deployment? identical
– versioning? assembly versioning is mandatory and builtin. Jar versioning is adhoc, home-made and not always necessary.
– access modifier? No, a jar is never used in the access modifier rules, whereas an assembly is (c# “internal”)
– manifest? identical
– each dotnet class loaded in the VM has a fully qualified name containing the assembly name and namespace. Not in java. Each java class is identified by the package name and unqualified class name.

common nested types – dotnet^java^c++

Enum – is a common nested type in c++, java and dotnet. In java it’s typically a static nested “class” . In c++, non-nested enum is more common than nested enum

Delegate – is a common nested type in dotnet. Many textbook examples declare custom delegates nested in a class. However, in practice, it’s more common to declare it outside all classes – more reusable.

Local typedef – is a common nested “type” in c++

method hiding – c# ^ java

C# hiding rules, based on [[c# primer]] P148.

— virtual methods —
Rule – for virtual methods (overridden with “override”), there’s no hiding: the base version is completely “replaced”. If Account and CheckingAccount both define v1(), and we have a CheckingAccount instance, then we can’t invoke the base v1() from outside the class, even if we upcast.

Virtual method rule is fairly simple compared to non-virtual methods —

— non-static non-virtual —
Rule – If Account and CheckingAccount both define non-virtual, non-static method m1() with identical signature, then hiding occurs.
We are strongly advised (not required) to mark the hiding explicit using “new”.

Hiding affects only static, compile-time method binding. Suppose we have a CheckingAccount instance myAccount. Both m1() versions
are available at runtime, unlike the virtual method v1().

myAccount.m1();// is the derived version

( (Account)myAccount ).m1(); // is the base version. Static binding

Inside CheckingAccount source code, we can also use base.m1(). This is same as java.

— static methods —
Rule – java method hiding applies to static method only. Simple and clean. How about c#? I guess same.

I feel c++ hiding rules are less comparable. I feel c# borrowed more from Java than from c++.

jar ^ c# DLL, briefly

In java, namespace tree (not the inheritance “family” tree), physical directory tree and accessibility are all based on the same tree.

C# decouples them. The namespace tree has no physical manifestation.

The physical organization of files is based on assembly, which is unrelated to namespace.

For a third party library, java would use a jar. C# would use a DLL, which is an assembly. Inside the jar there’s a namespace tree known as a package. An assembly isn’t required to use a unique namespace.

fopen in various languages (file input/output)

–c++ uses stream classes
ofstream outfile("out.txt");
ifstream infile ("in.txt"); // ifstream comes from a class template

–C
FILE * pFile = fopen ("myfile.txt","w");

–php follows C
$handle = fopen("a.txt", "r");

–python
outfile = open("a.txt", "w") # semicolon is usually omitted

–perl
open (OUTFILE, ">>append.txt") or die …  ### No dollar sign. parentheses are optional but help readability

–c# offers many convenient solutions
TextReader rd = new StreamReader("in.txt");
TextWriter tw = new StreamWriter("out.txt");

Alternatively, File class offers variations of
static string ReadAllText(string path)
static void WriteAllText(string path, string contents) //creates or overwrites file

I have written so many of them but paradoxically can’t recall which class we need to instantiate

heap usage: C ilt C++/Java #Part 1

At the application level (as opposed to libraries), I personally feel C apps tend to do most of their work on the stack whereas c++ apps use more heap.

Q: what are the classic C usages for heap? I feel most requirements are met by “auto” and global variables, including large N-dimensional arrays of structures. A big structure can in turn hold arrays/pointers (well, array and pointers are almost indistinguishable.)

A: linked graph data structures.

C++ added a lot of support for heap — including all the new-expressions and various operator-new (not to mention deletes).

– C++ new-expression ties together heap allocation and class constructor.
– C++ delete-expression ties together heap de-allocation and class destructor.

In C++, class instances are commonly allocated either on stack OR on heap. Java/C# is even more heap-oriented. Why?

c# namespace vs java packages

All of these have a BASIC job duty — partition the name space hierarchically, just like internet domain names.

But java packages also have *additional* duties or values:

Java – access control by package
Java – Default access is package-access (C#/C++ default to private.)
Java – packages define what’s included in a jar
Java – packages define physical directory tree. I feel this is a simple and clean design.

In java, perl, python, c++, c# …, A namespace is imported by a using/import (not “include”) directive. C# inherits the syntax from c++.

nested class having ptr to outer-class Object]java,c#,c++

Usually java is cleaner than c++ and c#. However, in this case I believe java is the Least clean.

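Java “non-static nested class” feature is not “embraced” by Microsoft. All c# nested classes behave like java static nested classes: there’s no implicit reference to an enclosing instance, so they can’t access non-static fields of the enclosing class unless you hand them an explicit reference.

C++ doesn’t support java style inner class either.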

most fundamental data type in c#/c++/java

Notice the inversion —
* C# — Simple types like “int” are based on struct — by aliasing
* C — struct is based on primitives like “int” — by composition
* C++ — classes are based on the C struct

Ignoring enum types for a moment, all c# value types are struct.

Now we are ready to answer the big Question
Q: beside pointers, what’s the most fundamental data type in this language?
A: C# — most fundamental type is the struct. Simple ints, classes and all other types (except enums) are built on top of struct.
A: Java/C++ — most fundamental types are primitives. In C++ all other types are derived using pointer and composition.
A: Once again, java emerges as the cleanest

Now we know why c# doesn’t call int a PRIMITIVE type because it’s not monolithic nor a truly fundamental type.

OO scripting languageS

I see more php developers create classes than Perl developers. I guess python developers are even more likely to create classes. Every language claims to support OO, but the litmus test is how easily (and how frequently) developers create user-defined classes.

(In the same vein, C++ template is another thing regular developers have not fully embraced. Java generic class is yet another.)

Unlike Java/c#, these scripting languages won’t deliberately make your life difficult if you don’t “embrace” OO.

Compared to perl and php, I feel python OO is simple, clean and “full-service” (not in the call-girl sense:). Incidentally, built-in python “features” are more object-oriented than in perl and php.

However, I still feel many python applications tend to be more procedural than OO. When facing a choice, I guess many developers feel procedural is simpler even though python OO is not complicated. If a simple tool does the job, then there’s no motivation to change.

What about multi-module team development? Look at perl and C projects. Many teams still stick to procedural.

LINQ predicate delegate ^ boolean functors in STL

The excerpt below shows a nice usage of delegate:

One such (functional programing) technique is to use them for filtering sequences of data. In this instance you would use a predicate delegate which accepts one argument and returns true or false depending on the implementation of the delegate itself.

using System;
using System.Linq;
using System.Collections.Generic;
class Program{
  static void Main() {
        List<String> names = new List<String> {
                "Nicole Hare",
                "Michael Hare",
                "Joe Hare",
                "Sammy Hare",
                "George Washington",
        };
        // Here I am passing "inMyFamily" to the "Where" extension method
        // on my List<String>.  The C# compiler automatically creates
        // a delegate instance for me.
        IEnumerable<String> myFamily = names.Where(inMyFamily);
        foreach (String name in myFamily)
            Console.WriteLine(name);
  }
    static Boolean inMyFamily(String name){ // unary boolean functor
        return name.EndsWith("Hare");
    }
}
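For comparison, the same filtering idea can be sketched in Python, where a plain function plays the role of the predicate delegate (or the unary boolean functor in STL). The function name in_my_family simply mirrors the c# example; nothing here comes from LINQ itself.

```python
def in_my_family(name):
    """Unary predicate: takes one argument, returns True or False."""
    return name.endswith("Hare")


names = [
    "Nicole Hare",
    "Michael Hare",
    "Joe Hare",
    "Sammy Hare",
    "George Washington",
]

# The function object itself is passed in, much like a delegate instance.
my_family = list(filter(in_my_family, names))
```

An inline lambda such as filter(lambda n: n.endswith("Hare"), names) would work equally well.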

c# struct, java bean, propertySet

Data bundles are extremely common, perhaps inevitable — domain models; DB records; GUI data-transfer-objects; POJO…

Struct is the C solution. Every field is read-writable. One special feature is sizeof(myStruct) — all the fields live contiguously.

C# (more than java POJO) classes enhanced the struct idea to let you customize the getter and setter operations on each field…

In c++ quant library, propertySet/propertyPage is a major enhancement….

what things are #include’d – c++, java, python

Not just utility functions, but also

* global variables like cin.
** think of them as fully initialized OBJECTS (or services), usable out of the box

* classes – as cookie cutters, very different from the cookies, i.e. the OBJECTS above
** and Families of classes

Java (and c#?), in contrast, only put TYPES in included libraries. To include (and then use) utility _OBJECTS_, they have to be presented as public static fields like System.out. These static member objects get created when the TYPE is loaded by the class loader. In contrast, c++ library objects (like cout) are typically part of a  namespace.

Why the difference? I think reason is the so-called “free(standing)” variable outside any type. Java forbids it. Any language permitting it can package it along with types and free functions into a “unit” (package, module, namespace…) to be #included.

Python (and perl) is more like c++.

#1essential classificationS@variables]each language: java/c++/c#

In java, a variable is exactly one of 2 — reference var or primitive var. Therefore, a thingy/entity (more precisely an “allocation”) is either an OBJECT or a primitive. Clean and simple.

C++ doesn’t differentiate between primitives and composite objects. However, it does differentiate between allocation locations. In C++, an entity either lives on heap, on stack or global area, with important differences. The word “object” has no special meaning.

In C#, avoid the word “object” if possible. The real, deep and precise (less clean than java) dichotomy is value-type vs reference type.
* a reference type variable occupies 32 bits on a 32-bit machine. The pointee is always on heap, and could be 99 Bytes.
* a value type variable is like a nonref in c++ and occupies 99B. Unlike reference type variables, there’s no separate storage for the variable.

– assignment to a reference type var copies the RHS 32 bit address and reseats the LHS pointer. Same as java.
– assignment to a value type var bulldozes the LHS and clones the 99B into it.
**Java only clones primitives, up to 64 bits.
**C++ can clone any variable if you pass-by-value.

pbref^pbclone in c#, c++ and java, again

Let’s Assume a 32-bit machine.

– java primitive param in a method? Value (up to 64bit) is copied — to the call stack
– java reference param? the 32-bit virtual address of the heapy thingy is copied. Note the caller must provide a tuple consisting of
*** a heapy thingy
*** a 32-bit address, possibly anonymous (How? [1])

C++ by-reference param? the 32-bit address is copied into the function’s stack frame
C++ non-reference param? entire chunk of memory of the argument object is copied, from stack/heap[2] to stack
C++ pointer param? Declaration is f(someType *). Caller must plug in a 32-bit address of a someType instance. The 32-bit pointer is copied. (Pointer to reference is illegal; reference to pointer, f(someType *&), is legal.)
C++ double-pointer param? Declaration is f(someType **).
Note C++ has no rule differentiating a basic type (like char/float) vs a user-defined class.

C# is more complicated
c# by-value on simple/struct type? Entire chunk of memory is copied into the method stack frame
c# by-value on reference-type? 32-bit address copied, just like java reference param
c# ref-param on reference-type? See diagram on P71 [[C#Precisely]] (Value type easier)
** this is true call-by-reference, unsupported in Java
** like C++ double pointer param
** The caller must provide a 32-bit address of a reference variable [3] myvar. A literal like 123 is not a variable and is therefore unacceptable. Since myvar is a 32-bit pointer, it may point to a 32-bit address 0xAA112233 on heap OR stack
** Let’s say myvar itself has address 0xFFEEFF12. This address is copied into the method as parameter myParam
** myParam becomes a permanent __alias__ of myvar. Effectively 2 names at the same address. myParam may not get a 32-bit storage but it doesn’t matter.
** in the method, assignment to myParam _reseats_ myvar, to 0xAA110000 for example.
** after the method, myvar still points to 0xAA110000

Note when a 32-bit address is copied (pbref), then any change to the pointee is visible after the call; invisible for pbclone.

Note in general, pointee can be on stack or heap

[1] if you pass in new MyType()
[2] how would a nonref param passing involve copying from heap? Granted parameter is nonref, but the argument could be an unwrapped pointer. Another case — a heap object’s field is passed into the function.
[3] myvar can be a primitive variable. Look at P71 [[c#precisely]] to see the difference.
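The java-style “reference param” behaviour can be demonstrated in a few lines. I use Python here because it passes references by value the same way java does. The function names are hypothetical. Editing the pointee is visible to the caller; reseating the parameter is not.

```python
def edit_pointee(items):
    items.append(99)            # follows the copied address, edits the pointee


def reseat_param(items):
    items = ["replacement"]     # rebinds the local parameter only
    return items


caller_list = [1, 2]
edit_pointee(caller_list)       # caller_list is now [1, 2, 99]
returned = reseat_param(caller_list)   # caller_list is untouched by this call
```

To get the c# “ref” or C++ double-pointer effect, the callee would need the address of the caller's variable itself, which neither java nor Python exposes.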

when does implicit cloning occur – c#/c++/java

Background: When we use an existing “object” to write into a variable, we usually [1] get either by-reference-copy, or by-value-copy i.e. cloning, typically bitwise. (C++ allows us to make it non-bitwise via op= and copy ctor.) Here are the contexts that trigger such cloning:

Java is cleanest — only primitives and all primitives get cloned when passing in/out from methods

c# is messier — all Value types and only Value types are pass-by-clone (pbclone). Includes primitives and structs.
* passing in/out from methods
* initializing a Value type variable
* assigning to an existing Value type variable — bulldozes then clones

C++ is most customizable but there are rules too.
* nonref is always pass-by-clone, but you need to look at function prototype. The original argument object could be a reference variable but may still pass-by-clone
* nonref variable init or assignment clones
* unwrapped pointer assignment clones
* reference variable assignment (not init) clones, since reference vars refuse re-seating.

[1] let’s not discuss why “usually”

rooted vs re-bindable variables – c#, c++, java, python

Q: What kinds of variables can re-bind (i.e. be reseated) to a different object at run-time and what kinds can’t? This understanding is not academic but helps programmers remember ground rules.

—-Python moves further towards rebinding. Even a simple myInt variable can rebind. I feel the fundamental distinction in python world is between immutable vs mutable “Objects” (defined as storage-locations).
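* In CPython every Object is reference-counted. Immutables are never edited in place, so an “update” like myInt += 1 creates a new Object and rebinds the variable. In fact all python variables are reseat-able, whether bound to mutable or immutable Objects.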
* What python variables are rooted? Well I believe the first element (other elements too) in a tuple is, though the tuple variable itself can rebind.
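A tiny experiment shows both points. The function names are hypothetical, and id() is used only as a stand-in for “which Object is this variable bound to”.

```python
def rebind_int():
    my_int = 5
    before = id(my_int)
    my_int += 1                  # builds a new int Object and rebinds the name
    return before != id(my_int)


def tuple_slot_is_rooted():
    inner = [2]
    pair = (1, inner)
    try:
        pair[0] = 9              # a tuple slot cannot be re-bound
        rooted = False
    except TypeError:
        rooted = True
    pair[1].append(3)            # but a mutable pointee can still be edited
    pair = (7, 8)                # and the tuple variable itself can rebind
    return rooted, inner, pair
```

So the slot is rooted, while the variable is not.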

—-In java, all primitive variables are “rooted”. All reference variables are reseat-able.
+ Assigning to a primitive variable writes into “the ultimate” memory location;
+ Assigning to a reference variable reseats the pointer, without cloning any object.
– There’s a Separate 32-bit storage for every reference variable, distinct from pointee’s storage.[1]
– There’s no separate storage for a primitive variable. Variable name is a nickname of the storage address. Compiler translates variable name into storage address. Run-time access to variable is one-hit. In contrast, Reference variables’ access is 2-hit – following the pointer.

[1] Evidence? See memory layout of any MyClass having a non-primitive field. How much memory (like sizeof(MyClass)) is allocated by new MyClass()?
—-In c#, all Value variables (including structs) are rooted. Assignment clones, including pass-by-value into a method.
All reference variables can be rebound.
—-C is simple and clean
All pointer variables can be re-seated but non-pointer variables are rooted. When a variable is on the LHS, it either rebinds or the Object is “edited”. See post on “Immutable, initialize..” to see the difference between Object vs Variable.
—-C++ feels more complicated.
In C++, all nonref and reference variables are rooted. Assignment writes directly into the object’s “stomach”. Pointers are reseat-able (unless const …)

However, a C++ reference variable (like pointer variable) has a separate 32-bit storage (address hidden) distinct from pointee/referent storage. Some writers say “referent” but I find “pointee” more distinct and less ambiguous.

threading features – c# improving over java

C# “adapted” many java designs, but by different magnitudes.
+ Where c# creators found _robust_ designs in java, they made minimal adjustments — like in threading. Basically wholesale borrow.
+ Where they found problematic/controversial designs, they made bigger adaptations.

Java’s *language* support for threading is rather clean and comes in only 3 foundation building blocks – creation, locking and signal. So in these 3 areas what are the adjustments by c# creators?

– signaling (wait/notify) – identical
– locking – identical
** synchronized keyword replaced by lock keyword
** static synchronized similarly adapted

– thread creation – Fundamentally identical. Superficial changes
** delegate to replace Runnable interface

Beyond these 3, java language includes some important (but less central) features, largely borrowed wholesale by C# creators.
– join
– interrupt
– sleep

Other java threading support is largely “superstructure”.
+ read-write lock
+ thread pool
+ callable tasks and future results —

double ptr Usage #2 – special reseating #c#

Double pointer has many usages in c++. Usage #1 is array of pointers. Here’s Usage #2

Q: in a function, can you reseat a pointer object (32bit) created outside the function?
A: possible in C/C++. Pointer-to-pointer is a C-family construct that can reseat the “nested” pointer.

int var3 = 3, var9 = 9;
void reseat(int ** ptr){ *ptr =  &var9; }
int main(){
    int * intp = &var3;
    reseat(&intp); //intp is a pointer-to-int, now reseated to &var9
    return 0;
}

C# has a similar feature in “ref” params.

In java, a reference param to a method is always a local ptr Variable pointing to the argument Object. Once you reseat this local ptr Variable, you lose the argument object. You can’t reseat the original ptr Variable declared outside your method.

void method1(Object param){
  param=new Object(); // reseating the local ptr Variable only
}

RAII^ContextManager ^using^ java AutoCloseable

1) Stroustrup commented that c++ doesn’t support finally{} because it has RAII dtor.

Both deal with exceptional exits, unless noexcept specified.
Both are robust.
Both are best practices.

However, try{} etc has performance cost [1], so much so that some c++ compilers can be configured to disable it. C++ Memory management relies heavily on RAII. Using Try for that would be too costly.

[1] noexcept was presumably introduced partly to address this cost.

2) python ContextManager protocol defines __enter__() and __exit__() methods

Keyword “with” required …
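A minimal sketch of the protocol follows. ManagedResource and use_resource are hypothetical names, and the log list exists only to record the order of events. __exit__() runs on both normal and exceptional exit, much like a dtor during stack unwinding.

```python
class ManagedResource:
    """Hypothetical resource implementing the ContextManager protocol."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("acquire")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append("release")   # runs even when the body raised
        return False                 # False means: do not swallow the exception


def use_resource(log):
    with ManagedResource(log):
        log.append("work")
        raise ValueError("boom")     # exceptional exit from the with-block
```

The ValueError still propagates to the caller, but only after “release” has been logged.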

3) Java uses finally{}. Note finally{} becomes implicit in java7 try-with-resources

AutoCloseable interface is needed in try-with-resources.

4) c#: the Achilles’ heel of GC (java or CLR) is non-deterministic cleanup. C#’s answer is the USING statement. C# provides both USING and try/finally. Under the hood USING is compiled into try/finally.

I feel c# USING is evolution-wise a closer cousin to RAII (while try/finally is a distant cousin). Both use variable (not obj) scope to manage object (not var) lifetime.

USING uses Dispose() method, which is frequently compared to the class dtor/Finalize(). For the difference between c# Dispose() vs dtor vs Finalize, see other blog post(s).

As you can see, c# borrowed all the relevant techniques from c++ and java. So it’s better to first understand the c++/java constructs before studying c# constructs.

mkt-data subscription engines prefer c++ over C

For a busy feed, Java is usually slower. One of the many reasons is autoboxing. Market data engines prefer primitive integers (rather than floats) and char-arrays (rather than null-terminated or fancy strings).

I think another reason is the garbage collector, which is non-deterministic. I feel explicit free() is fast and efficient [1].

A market data engineer at 2-Sigma said C++ is the language of choice, rather than C or java. Some market data subscription engines use C to simulate basic C++ features.

[1] free(3) is a standard library function, not a syscall (manpage section 2). Usually no kernel involvement.

y java is dominant in enterprise app

What's so good about OO? Why do the 3 most “relevant” enterprise app dev languages all happen to be OO – java, c# and c++?

Why is google choosing java, c++ and python?

(Though this is not really a typical “enterprise app”) Why is apple choosing to promote a compiled OO language — objective C?

Why is microsoft choosing to promote a compiled OO language more vigorously than

But why is facebook (and yahoo?) choosing php?

Before c++ came along, most enterprise apps were developed in c, cobol, fortran…. Experience in the field shows that c++ and java require more learning but do offer real benefits. I guess it all boils down to the 3 base OO features of encapsulation, inheritance and polymorphism.

python: very few named free functions

In python, Most operations are instance/static methods, and the *busiest* operations are operators.

Free-standing functions aren’t allowed in java/c# but are the mainstay of C, Perl and PHP. Python has them but very few.

— perl-style free functions are a much smaller population in python, therefore important. See the 10-pager P135[[py ref]] —
map(), apply()
min() max()

— advanced free functions such as introspection
repr() str() — related to __repr__() and __str__()
type(), id(), dir()
isinstance() issubclass()
eval() execfile()
getattr() setattr() delattr() hasattr()
range(), xrange()
?yield? not a free-function, but a keyword!
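
A minimal sketch of the introspection free functions listed above, assuming Python 3 (where map() and range() are lazy and xrange()/apply()/execfile() no longer exist). The Dog class is hypothetical and exists only to give the functions something to inspect.

```python
class Dog(object):
    """Hypothetical class, used only to exercise the free functions."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "Dog(%r)" % self.name

    def __str__(self):
        return self.name


d = Dog("rex")

# repr() and str() are free functions that dispatch to __repr__ / __str__.
as_repr = repr(d)
as_str = str(d)

# Attribute access by name, without the dot operator.
had_age = hasattr(d, "age")
setattr(d, "age", 3)
age = getattr(d, "age")
delattr(d, "age")

# Type introspection.
is_dog = isinstance(d, Dog)
is_subclass = issubclass(Dog, object)
has_name_in_dir = "name" in dir(d)

# min()/max() accept any iterable, including a range object.
smallest, largest = min(range(5)), max(range(5))
```

Note that none of these calls uses a method on d directly; each free function looks up the matching special method or attribute behind the scenes.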

single-threaded UI update – wpf vs swing

In a WPF app every visual “element” object has an “owning” thread. It can be modified only by the owning thread. This sounds like “every child has a mother”. Indeed WPF lets you have 2 owning threads if you have 2 unrelated window instances in one Process[3]. In practice, most developers avoid that unnecessary flexibility and use a single owning thread, known as the dispatcher thread.

[3] each having a bunch of visuals.

In WPF, anytime you update a Visual “element”, the work is sent through the Dispatcher to the UI thread. The control itself can only be touched by its owning thread. If you try to do anything with a control from another thread, you’ll get a runtime exception.

myCheckBox.Dispatcher.Invoke(new Action(
    delegate() { myCheckBox.IsChecked = true; }));

Very similar to invokeAndWait(new Runnable()…. There’s also a counterpart for invokeLater(), namely Dispatcher.BeginInvoke().
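
The owning-thread rule can be sketched in a few lines of Python. This is a toy model, not WPF or Swing: the Dispatcher class, invoke_later(), invoke_and_wait() and the checkbox dict are all hypothetical names chosen to mirror the calls discussed above.

```python
import queue
import threading


class Dispatcher:
    """Toy dispatcher: one owning thread runs every posted callable."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            task = self._tasks.get()
            if task is None:  # shutdown marker
                break
            task()

    def invoke_later(self, fn):
        """Asynchronous: enqueue and return immediately."""
        self._tasks.put(fn)

    def invoke_and_wait(self, fn):
        """Synchronous: block the caller until the owning thread has run fn."""
        done = threading.Event()
        result = []

        def wrapper():
            try:
                result.append(fn())
            finally:
                done.set()

        self._tasks.put(wrapper)
        done.wait()
        return result[0] if result else None

    def shutdown(self):
        self._tasks.put(None)
        self._thread.join()


dispatcher = Dispatcher()
checkbox = {"is_checked": False}   # stand-in for a UI control
touched_by = []                    # names of threads that touched it


def check_it():
    checkbox["is_checked"] = True
    touched_by.append(threading.current_thread().name)


dispatcher.invoke_later(check_it)
owner = dispatcher.invoke_and_wait(lambda: threading.current_thread().name)
dispatcher.shutdown()
```

Because the queue is first-in-first-out with a single consumer, check_it has already run by the time invoke_and_wait() returns. Calling invoke_and_wait() from the owning thread itself would deadlock in this toy.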

fwd: heap memory allocation – java/c#/C

Thanks Nigel for the article. A quick glance suggests to me the allocation (not the de-allocation) procedure is no different from malloc, logically.

At run time, the memory management library (some functions pre-loaded into the code section of the address space) gives out small chunks of memory to the requesting application. When there aren’t enough “small chunks” to satisfy a request[1], the library synchronously grabs a large block from the OS and adds it to the free-store, which is private to the process and managed by the library. This is the malloc() behavior I know.

[1] perhaps due to fragmentation

This “library” is compiled binary code. For c/c++, I think this means platform-specific object code. For java, this means platform-independent java bytecode, or more likely platform-specific native code.
For dotnet, I guess it’s windows native binary code. Now, is this binary code produced by a C compiler/linker that used the standard malloc() function? I think it’s possible. If the dotnet runtime is itself written in C, then the runtime (including the memory library) is compiled/linked with a C compiler/linker. It’s therefore possible to use the malloc library.
Alternatively, I guess it’s also possible to create a custom memory module with a function equivalent to malloc(). I make this claim because Microsoft has had its own C compiler and C memory management library since the early days of C.
Subject: RE: question on heap memory allocation
Does this use malloc ?
Subject: question on heap memory allocation
Hi Nigel,
Someone speculated that “in any programming language, heap memory is allocated always using the C function malloc(). There’s no alternative.” I know JVM and dotnet runtime are both written in C/c++ so at run time, heap memory is grabbed from OS probably by malloc(). Not sure about other languages.
Is my understanding correct?

java^c# generic: struct, boxing, erasure

At the heart of all the jargon and debates lies one core issue — field layout (FL). Essentially [1], field layout is what truly defines a type, be it class or struct.

For example, the java LinkedList’s Node class has field layout to hold a payload and a “next-node” — all pointers. Since untyped Node came to the scene first, Java generics resorted to type erasure (TE) for backward compatibility. As a result, Node of any payload is the same Node type at runtime. Consequently, LinkedList of any payload boils down to the same type. one-size-fit-all.

C# needs no TE. However, in a twist of clever design, Node of any reference payload has a field layout just like java. So LinkedList of any reference type boils down to the same type. Reduces code bloat compared to c++ template.

vvv Java Node of int has to be boxed/unboxed because Node payload (like any reference type) must be a subtype of Object. Given a node, the compiler is /hardwired/ to treat the “payload” field therein as a pointer to the Heap (where the boxed Integer lives).

^^^ C# Node of int is a distinct type with a distinct field layout.[2] The int value is a struct Instance embedded in Node object’s real estate. No boxing/unboxing.
** Struct payloads are all treated this way.

This is yet another low-level technical issue affecting only high-performance, high volume market data where megabytes of packet copying is expensive. Non-trading developers tend to consider this a niche that’s “too” specialized.

See P156 [[c#precisely]]

[2] Incidentally, a Node of float has the same size as a Node of int, but compiler interprets the payload field as int vs float. Therefore different field layouts.
[1] This is practical thinking. I am no language lawyer. Specifically,
– A type can have only 1 field layout.
– If 2 types have identical field layout, then we can make one an alias of the other like a typedef. It’s wasteful to create too many duplicate types.
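
Python has no generics with distinct field layouts, so the following is only an analogy, assuming CPython: a list plays the role of the boxed java Node (slots hold pointers to heap objects), while array.array("i") plays the role of the c# Node of int (raw values embedded in the container's own storage).

```python
import array
import struct
import sys

values = [1000 + i for i in range(1000)]

# Boxed: the list's storage holds pointers; every int is a separate heap object.
boxed = list(values)

# Unboxed: array('i') embeds raw C ints in one contiguous block.
unboxed = array.array("i", values)

pointer_size = struct.calcsize("P")   # size of one reference slot
int_size = unboxed.itemsize           # size of one embedded C int

# Heap bytes taken by the payload objects alone (CPython-specific measure).
boxed_payload_bytes = sum(sys.getsizeof(v) for v in boxed)
unboxed_payload_bytes = int_size * len(unboxed)
```

On a typical 64-bit CPython the boxed payload is several times larger, before even counting the pointer slots, which is the same cost the boxed Integer imposes in java.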

Set (data structure) support across languages#+py

java collections? hash set + sorted set (tree set, skip list…)
STL? tree set + sorted multiset
STL extensions? hash set

—- Languages Below offer reduced support for Set. Emulate with dict —-
c# ? HashSet is not a first-class citizen like Dictionary and List
perl? no set
php? no set. Array and Associative array are the only builtin data structures
python? set is a less prominent builtin than dict and tuple, and it arrived later (a builtin type since python 2.4). However set() is a built-in free standing function, just like tuple() and dict()

The reason STL supports set as a distinct container from map? Short answer is Efficiency. C++ is a low level, high-efficiency, application-building tool. At that level, set offers real advantage over map.
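
A small sketch contrasting a real set with the dict emulation mentioned above. The ticker symbols are made-up sample data.

```python
# Built-in set: membership, union, intersection, difference.
traded = set(["IBM", "MSFT", "GOOG"])
watched = set(["MSFT", "AAPL"])

both = traded & watched
either = traded | watched
only_traded = traded - watched

# Emulating a set with a dict, as needed in perl or php:
# keys are the members, values are a throwaway marker.
emulated = {}
for symbol in ["IBM", "MSFT", "IBM"]:
    emulated[symbol] = True

is_member = "IBM" in emulated
distinct_count = len(emulated)
```

The emulation gives membership tests and de-duplication, but union and intersection have to be hand-written, which is one practical meaning of “reduced support”.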

pipe, stream, pseudo file, tcp-socket#xLang

* Java has a family of IO classes;
* c++ has a family of stream classes derived from class ios;
* unix has various file-like things.

“pipe” and “stream” are 2 general terminologies for an input/output destination with sequential[1] access. A file is one subtype of stream. Therefore java and c++ access files using their family of classes above.

[1] c++ gives each sequential stream a get-ptr and a put-ptr.
A named pipe is an entity in the file system….

TCP (not UDP) is a stream-oriented protocol. A socket, either unix domain or TCP (not udp), is also like a pipe/stream. In fact, Unix treats any socket as a file, so you can read from it and write to it like a file. Perl liberally “confuses” file handles and sockets, just as C uses the same type of file-descriptor-number for files and sockets.

TCP (not UDP) is ordered — if two messages are sent over a connection in sequence, the first message will reach the receiving application first. When data segments arrive in the wrong order, TCP buffers the out-of-order data until all data can be properly re-ordered and delivered to the application.

Python wraps the socket descriptor number in a socket object, but you can still get the underlying descriptor number via mySocket.fileno():

import socket
mySocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(mySocket.fileno())  # a small non-negative integer
mySocket.close()
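
To illustrate “a socket is a file” more directly, here is a sketch that assumes a Unix-like OS, where os.read() and os.write() accept a socket's descriptor number. The payloads are arbitrary.

```python
import os
import socket

# A connected pair of stream sockets (unix-domain by default on Unix).
left, right = socket.socketpair()

# Write through the raw descriptor with os.write(), exactly as for a file...
os.write(left.fileno(), b"tick")

# ...and read through the other descriptor with os.read().
received = os.read(right.fileno(), 4)

# The socket object's own methods reach the same byte stream.
right.sendall(b"ack")
reply = left.recv(3)

left.close()
right.close()
```

A stream read may in general return fewer bytes than requested; the tiny local payloads here make a partial read unlikely, but production code should loop.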

c#delegate:closest ancestors]other lang+py

Update: I now feel it’s more practical to separate the 2 (or more) “streams” of thought, lambda vs multicast events. The write-up below was written without that separation, so it is unnecessarily complicated.

a unicast delegate type =~= a java interface type with a single method, without method body. [1] I no longer subscribe to this.
a unicast delegate type =~= a typedef for a function pointer. This typedef can apply to many functions.

a “delegate” is an instance of a particular delegate type. If we have to build on the java interface idea, then ….

a delegate instance =~= an instance of a stateless concrete single-method class implementing that interface in [1]

a delegate instance =~= a particular stateless functor. Specifically an instance of a functor-wrapper. Note an instance of a functor type is sometimes stateful (see Scott Meyers), but an instance of a delegate type is always stateless.
Update — As explained in ….
a delegate instance =~= instance of a functor-wrapper having a permanent pointer to the target object myDog, so it could later invoke myDog.jump()

a delegate instance =~= python method object bound to a host object? But stateless??

a delegate type is a distinct type (with a name), a subtype of Class Delegate. Delegate type names typically start with D like “DSomething”

The clue to see the link between a delegate type and a delegate instance is the instantiation expression new DSomething(meth1)

Java’s interface is not really an extension of the c++ functor. C# delegates build on top of java interfaces AND c++ functors.
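
The python analogy above (a method object bound to a host object) can be sketched as follows. Dog, jump() and the invocation list are hypothetical, modeled on the myDog.jump() example.

```python
class Dog(object):
    def __init__(self, name):
        self.name = name
        self.jumps = 0

    def jump(self):
        self.jumps += 1
        return "%s jumps" % self.name


my_dog = Dog("rex")

# Taking the attribute without calling it yields a bound method object:
# a function plus a permanent reference to the target object.
jump_delegate = my_dog.jump

target = jump_delegate.__self__      # the host object
function = jump_delegate.__func__    # the plain function

# Invoke later, with no mention of my_dog at the call site.
message = jump_delegate()

# A list of such callables gives a crude multicast "invocation list".
invocation_list = [my_dog.jump, Dog("fido").jump]
messages = [d() for d in invocation_list]
```

The bound method itself carries no state beyond the target pointer; any state lives in the host object, which matches the functor-wrapper description.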

countingSemaphore and read/write lock

Q: Logically, reentrant read-write locks (available in java, boost,objectSpace …) can be achieved using counting semaphores — in theory. How about practically?
%%A: I feel you can implement it using condVar without countingSema.

Q2: I guess counting semaphore uses inter-thread signals. This is either the conditionVar OR the low-level signaling construct described in the post on [[mutex^condVar^countingSemaphore — big 3 synchronization devices]]

%%A2: I feel in some cases it could be the condVar. CountingSema are sometimes written by library-USERS, who must use library utilities. Those low-level signals by definition aren’t exposed to library-USERS.
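
As a sketch of the first answer (a read/write lock from a condition variable alone, no counting semaphore), assuming Python 3.2+ for Condition.wait_for(). The class is hypothetical and deliberately simple: it is not reentrant and gives writers no priority.

```python
import threading


class ReadWriteLock:
    """Many readers or one writer; not reentrant, no writer priority."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self, timeout=None):
        """Return True once exclusive access is granted, False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


rw = ReadWriteLock()
rw.acquire_read()
rw.acquire_read()                                      # readers share
writer_got_in_early = rw.acquire_write(timeout=0.05)   # blocked by readers
rw.release_read()
rw.release_read()
writer_got_in = rw.acquire_write(timeout=0.05)         # now free
rw.release_write()
```

The reader count here is just an integer guarded by the condition's mutex, which is roughly what a counting semaphore would otherwise provide.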

challenges when using JNI

Recently I was asked about the challenges when developing apps using JNI.

I mentioned the “native” keyword, mentioned GC, type safety(???) but missed the portability issue. The bond calculator is in C. Compiled C code is platform specific, so the java app can't run on windows unless the native library is rebuilt for windows.

Wikibook says:

To complete writing native method, you need to process your class with javah tool that will generate a header code in C. You then need to provide implementation of the header code, produce dynamically loadable library (.so under Linux, .dll under Windows) and load it with System.load(library_file_name) — as seen in reo.

jagged^matrix(2D-array) syntax: C/j/py/C#

See also

There exist only 2 types of 2D arrays across these 3 languages: C, java and c#. Across these languages the 2 types can be compared, but the syntax is better not compared.

C uses vastly different syntax between
– array of pointers, i.e. jagged.
– [a][b] matrix. Column (and row) size is fixed and permanent. Total a*b storage space is allocated, used or unused. Unused space is sometimes padded?
** arr[a,b] // this code strangely compiles but means arr[b], because the comma operator discards a. This confusing syntax is completely unrelated to 2D array.

A Java 2D array is jagged, _always_. See [[javaPrecisely]]. Implemented as an array of pointers. Syntax is…. [a][b], a departure from C. Java has no built-in support for matrix.

C# 2D arrays are really 2 unrelated constructs with similar syntax. The Jagged[a][b] vs the Matrix[a, b]. See

Python can support arr[3][2]. See

In terms of syntax evolution, java jagged took the c matrix syntax, and C# jagged inherited java syntax.

Therefore the c# designers faced a dilemma whether to follow c or java syntax. They chose java and not c.

In summary, the more useful jagged construct has this syntax
*arrOfPointer[3] // C
arr[][] //java
arr[][] //c#

A matrix in linear algebra is a rectangular 2D array, built-in with c and c#, not java.
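
The two constructs can be mimicked in Python. This is a sketch: the flat list and cell_index() are hypothetical stand-ins for the contiguous row-major block behind a C [a][b] matrix, and the nested lists stand in for the jagged array of pointers.

```python
# Jagged: an outer list of references to independently sized inner lists.
jagged = [[1, 2, 3], [4], [5, 6]]
row_lengths = [len(row) for row in jagged]

# Matrix: one contiguous block of ROWS*COLS cells, addressed row-major.
ROWS, COLS = 3, 2
flat = [0] * (ROWS * COLS)


def cell_index(r, c):
    """Offset of cell (r, c) inside the flat block."""
    return r * COLS + c


flat[cell_index(2, 1)] = 99

# arr[3][2]-style syntax on nested lists; note the comprehension, because
# [[0] * COLS] * ROWS would alias one inner list ROWS times.
nested = [[0] * COLS for _ in range(ROWS)]
nested[2][1] = 99
```

The jagged form costs one extra pointer hop per access but lets each row have its own length; the matrix form needs only arithmetic on a single base address.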

y invokeLater() important to swing (creditx

A Creditex Swing developer explained to me that UI redraw is an operation happening on the dispatcher thread. This thread should be the only thread that modifies UI objects, otherwise the screen may get messed up by uncoordinated concurrent updates.

Suppose you have an onMsg() listener triggered by market data.  This onMsg() obviously runs in its own thread, which blocks until woken up by an incoming message. If a naive developer decides to modify some JTable object in the onMsg() thread[2], and at the same time another timer thread is also updating the JTable, and the event-dispatcher thread is redrawing the screen … the display will change in unexpected ways.

In WPF, every UI control is owned by the UI thread ie the dispatcher thread, according to Longbo — Good simplification. No other thread can modify UI controls, though they can modify the data binding (models?) behind the UI controls. To effect a change on UI, other threads must go through dispatcher thread as the choke point of control. I was told this is done by some event publication model, probably with INotifyPropertyChanged.

[[Learning java]] shows swing has a single event queue — see my post on async buffer. If UI thread is in the middle of some long calculation when you publish, then synchronous call is not possible. In fact, synchronous call would mean the event sender thread is the only thread involved, which is the scenario [2] above.

heap allocation +! malloc()

Someone speculated that “in any programming language, heap memory is allocated always using the C function malloc(). There’s no alternative.” Nigel (Citi) disagrees. If a language is not based on C, then it can use its own heap-management library.

The heap-mgmt library is a wholesaler/retailer. For efficiency this library requests large blocks [1] of memory and gives out small chunks to the application. Probably many languages have a heap-mgmt library. C’s heap library (in glibc) is exposed through malloc(). Nigel felt C# has its own heap-mgmt library and may have a malloc-equivalent. JVM is written in C but it could re-implement the heap-mgmt library with its own malloc-equivalent.

Everyone must file tax returns with the same government, but through different tax consultants. Tax consultants are not part of the government. Similarly, the heap-mgmt library is one level above system calls. It makes system calls (perhaps brk()/sbrk() or mmap()) to request the large blocks from the OS. Every language must use system calls to request memory, but possibly through its own heap-mgmt library.
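
The wholesaler/retailer idea can be modeled with a toy bump allocator. Everything here is hypothetical: ToyHeap, the 4096-byte block size and the bytearray standing in for an OS-provided block. It never frees, so it only illustrates the allocation side.

```python
class ToyHeap:
    """Retailer: hands out small chunks carved from wholesale blocks."""

    BLOCK_SIZE = 4096          # hypothetical wholesale unit

    def __init__(self):
        self.blocks = []       # large blocks obtained from the "OS"
        self.offset = 0        # next free byte in the newest block
        self.os_requests = 0   # how many times we went to the "OS"

    def _grab_block_from_os(self):
        # Stand-in for a brk()/mmap()-style system call.
        self.blocks.append(bytearray(self.BLOCK_SIZE))
        self.offset = 0
        self.os_requests += 1

    def allocate(self, size):
        """Return (block_number, offset) of a chunk of `size` bytes."""
        if size > self.BLOCK_SIZE:
            raise ValueError("toy heap cannot serve oversized requests")
        if not self.blocks or self.offset + size > self.BLOCK_SIZE:
            self._grab_block_from_os()
        chunk = (len(self.blocks) - 1, self.offset)
        self.offset += size
        return chunk


heap = ToyHeap()
chunks = [heap.allocate(100) for _ in range(50)]   # 50 retail requests
```

Fifty retail requests cost only two wholesale trips in this model, which is the efficiency argument for keeping the library between the application and the system calls.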

c# enum is more like c++, less like java@@

c++ enum is usually the size of an int (typically 32-bit), but since c++11 the underlying type can be declared as 8-bit, as done in high-performance systems like market data feed.

java enum is very much like a class with static (singleton) fields. Nothing to do with integers. (perhaps implemented with integers.)

— c# enum is more like c++ IMO in terms of run time representation.
* No singleton. c# Enum objects are passed by value (pbclone), just like simple-type integers (which are physically struct objects)
* An enum Instance is 32-bit by default (int), unless another underlying integer type is declared.
* An enum Type is always based on some integer type, slightly more flexible than c++
* For now, safely disregard the (confusing) fact that enum types extend System.Enum.

— C# enum is more like java enum in terms of compiler support.
A friend pointed out some differences between c# and c++ enums
$ c# value TYPES (not instances) have rich meta data. C++ enum is a simple int. No methods
$ c# enum ToString() returns the symbolic name by default, and accepts a format string for the numeric form
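
Python happens to offer both models side by side, which makes the contrast easy to see. This is only an analogy, with hypothetical Side and SideCode types: enum.Enum behaves like the java model (singleton objects, not integers), while enum.IntEnum behaves like the c#/c++ model (integers with names attached).

```python
import enum


class Side(enum.Enum):         # java-like: members are singleton objects
    BUY = 1
    SELL = 2


class SideCode(enum.IntEnum):  # c#/c++-like: members are integers
    BUY = 1
    SELL = 2


java_like_is_int = isinstance(Side.BUY, int)
c_like_is_int = isinstance(SideCode.BUY, int)

same_singleton = Side(1) is Side.BUY   # lookup returns the one instance
arithmetic = SideCode.BUY + 1          # integer arithmetic just works
label = Side.BUY.name                  # symbolic name, like c# ToString()
```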


unlock@thread death

(See other posts about “consistency” issues due to the Release.)

Q: will the dead thread Hold or Release its locks?

* c++? — H. You need RAII. See posts on RAII
* For java sync-keyword — Release is guaranteed. See 2.2.1 [[Doug Lea]], P123 [[java threads]].
* For explicit Lock objects, by default dead threads Hold locks. See P123 [[java threads]]. A finally block is recommended.
* Thread.destroy()/suspend() + sync-keyword? — H, leading to deadlock
* Thread.stop() + sync-keyword? — R, leading to inconsistency
* What if I use an explicit Lock and Thread.stop()? My guess is H

The give-and-take, the cost/benefit — Hold protects consistency but risks liveness; Release provides liveness but risks consistency.
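
The Hold vs Release outcomes can be reproduced in Python as a sketch. The two worker functions are hypothetical; the with-block plays the role of the java sync-keyword (or c++ RAII), and the bare acquire() plays the role of an explicit lock without a finally block.

```python
import threading

explicit_lock = threading.Lock()
scoped_lock = threading.Lock()


def dies_holding_lock():
    explicit_lock.acquire()
    # Thread ends here without release(): nothing unlocks on its behalf.


def dies_inside_with_block():
    try:
        with scoped_lock:              # released on any exit from the block
            raise RuntimeError("simulated failure while holding the lock")
    except RuntimeError:
        pass                           # swallow so the thread ends quietly


for worker in (dies_holding_lock, dies_inside_with_block):
    t = threading.Thread(target=worker)
    t.start()
    t.join()

still_held_after_death = explicit_lock.locked()     # Hold
released_after_death = not scoped_lock.locked()     # Release

# A plain threading.Lock has no owner, so another thread may
# release it to restore liveness.
explicit_lock.release()
```

The scoped version restores liveness automatically, but whatever the failed thread left half-updated is now visible to others, which is the consistency risk described above.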

c++function entities — 2nd class citizens@@

Compared to java, I feel c++ is more flexible, “creative” and “loose” with functions; c++ weaves functions more into its rich syntax; and c++ apps rely more heavily on passable function entities. Any algorithm (any trivial body of code) can be passed around like a variable or object.

Inside the computer, a function (like an object) occupies a chunk of memory, in the “text” section, so it’s reasonable to pass its address to where it’s needed. C++ lets you *package* algorithms into passable function entities. These function entities are often stateless[2]. Consider functional programming. I call these 2nd-class citizen objects because they normally do not have state, variables, virtuals etc.

(Java Threads also should not but often have state.)

Java has interfaces and Method objects. C++ has func pointers and functor objects. Both designs are powerful, widespread and proven.

[2] functors are almost always pbclone in STL. Some functors are stateful: (beside generators) Dinkumware’s hash tables have hashing traits/policies as state in the hashing functors. See separate blog post. Some func ptr syntax examples:

someObject.someMethodReturningAFunctor()(arg1, arg2); // note the number of () pairs
(instance1.*pt2Member)(12, 'a', 'b');                 // via an object or reference

(*this.*pt2Member)(12, 'a', 'b');                     // via this, inside a member function
(instance2->*pt2Member)(12, 'a', 'b');                // via a pointer to an object

(*(someComplexExpressionReturningFuncPtr))(arg1, arg2);
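
The same idea of passable function entities, sketched in Python rather than c++. Scaler, negate() and apply_all() are hypothetical names; Scaler is a functor in the c++ sense (an object with a call operator, here carrying state).

```python
class Scaler(object):
    """Functor: a callable object that carries state (the factor)."""

    def __init__(self, factor):
        self.factor = factor
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return x * self.factor


def negate(x):
    """Plain stateless function."""
    return -x


def apply_all(fn, items):
    """Hypothetical helper: accepts any callable, function or functor."""
    return [fn(item) for item in items]


double = Scaler(2)
doubled = apply_all(double, [1, 2, 3])
negated = apply_all(negate, [1, 2, 3])
by_magnitude = sorted([3, -1, -2], key=abs)   # built-in function as argument
```

The caller of apply_all() neither knows nor cares whether it received a function or a functor, which is the same flexibility STL algorithms get from templates.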

objects confined to call stack

Note on c++ — unlike java, class objects can live on stack, and are automatically confined

Doug Lea’s chapter on Confinement (within the Exclusion section) briefly explains the important type of object confined to a call stack. These are as thread-safe as an object instantiated as a local variable and never leaked outside the method.

Note (static or instance) fields constitute the other variables in java — not local, so never confined to any call stack

Q: Take any snapshot of the object reference graph (a directed graph) as a stop-the-world garbage collector uses, will this object be reachable from outside this /method/?

If you *analyze* the source code and reach an answer of “No”, then it’s thread-safe. Now substitute /method/ with “call stack”. The analysis becomes less obvious. It’s possible for an object to be created in one method, passed around on the call stack, and stay confined to the call stack. See Doug Lea’s 2.3.1.

stack to heap trend in languages

  • I guess you seldom see malloc() and free() in business applications. I have seen a number of c applications but didn’t notice them. However, how do c pointers deal with automatic destruction of the objects they reference? I guess all such objects are declared and used in the same scope, never declared in an inner func and used in an outer func?
  • C++ biz apps use lots of new(), but by default, if you don’t call new() or malloc(), C++ uses the stack and global area. In terms of language support and encouragement, stack is still easier to use than heap in C++. I consider C++ as a transition between C and newer languages.
  • STL seems to be mostly heap, to support expansion
  • Java uses the heap more than the stack. Every object is on the heap. Memory leak becomes too likely so java had to provide garbage collector.
Conclusion: c -> c++ -> STL -> java, I see less stack and more heap usage. However, Stroustrup pointed out the performance advantage of stack over heap.

primitive^class type, pbclone, pbref…

java has primitive vs reference types; c++ has builtin/primitive vs class data types. For a beginner, it’s best NOT to think of these c++ features in java terms. To a beginner sometimes I’d say C++ builtin/primitive and class types are very very different from java primitive and reference types.

Like C language, both builtin and class types are, by default, pass-by-clone (pbclone) — See post on “function border”. Both builtin and class types can become pbref..

Note this blog is mostly about memory. Pretend to be a snooper on the memory space at runtime. What differences do we notice when builtin or class type variables are created, destroyed, passed, cloned…

obviously a class type can contain pointers. If a struct contains nonref (non-reference types) fields only, then it’s quite similar to a double or char primitive.

primitive^object dichotomy : !! in C++

As a java developer learning C++, I’d say the #1 syntax/semantic difference is this: the primitive vs reference type dichotomy is misleading and counterproductive. In fact, in C++ I would not mention primitive vs object, as every variable is (a name for) an object.

In java, every variable is either a primitive or a reference type, i.e. a reference to an object on heap. Primitives are always pass-by-clone (pbclone). Garbage collection only covers objects. Primitives don’t need it.

Learning C++, you need to unlearn all that, and perhaps start from the C tradition. The useful dichotomy is stack vs heap. All data types are “objects”, either built in or user defined. All heap variables need new/delete… As mentioned elsewhere on this blog, all heap objects are nameless — multiple named pointers can point to the same heap object.

recursive lock; test a thread already holds this lock

(Importance and value: I feel most non-library developers don’t need this knowledge.)

java lets you test whether the current thread already holds a given monitor lock, via the static method Thread.holdsLock(lockObject). It cannot test an arbitrary thread. Let’s explore the same question for c++.

In both java and boost thread, there’s an object (a “poor handle”) linked to the real world thread. There’s a one-to-one mapping between the handle and the real thread. I believe the thread library (not “the OS”) gives you a pointer to the real thread. Java or pthread builds an object around that pointer. This pointer is the key to determining if a thread already holds a given lock. P177[[pthreads]] shows the pthreads api does provide a TID data structure, typically an integer. Win32 offers 2 data structures:
– a kernel-level global thread id
– thread handle local to an OS process, so this handle is meaningless to another process. See post on user-kernel thread mapping.

Boost offers thread::native_handle() and thread::get_id()
shows simple java code of a re-entrant lock and non-re-entrant lock.
Our implementation of recursive locks is built on top of a pthread mutex lock (which is –not–recursive–). It makes use of a pthread condition variable to have unsuccessful threads wait on the mutex. Waiting threads are awakened by a signal from the final unlock.
mutexes are recursive by default in Windows.
Internally a lock count[3] is maintained and the owning thread must unlock the mutex model the same number of times that it’s locked it before the mutex object’s state returns to unlocked. Since mutex objects in Boost.Threads expose locking functionality only through lock concepts, a thread will always unlock a mutex object the same number of times that it locked it.

[3]Conceptually, I feel a counting semaphore can be used.
is a more detailed discussion of recursive mutex
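
The quoted design (a recursive lock built on a non-recursive mutex plus a condition variable, with an owner id and a lock count) can be sketched in Python. RecursiveLock and holds_lock() are hypothetical names; Python's own threading.RLock already provides this, so the sketch is purely illustrative. It assumes Python 3.3+ for threading.get_ident().

```python
import threading


class RecursiveLock:
    """Reentrant lock built from a non-recursive mutex plus a condition variable."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())  # plain mutex inside
        self._owner = None      # thread id of the holder, or None
        self._count = 0         # how many times the owner has locked

    def holds_lock(self):
        """True if the calling thread currently owns this lock."""
        with self._cond:
            return self._owner == threading.get_ident()

    def lock(self):
        me = threading.get_ident()
        with self._cond:
            if self._owner == me:
                self._count += 1          # re-entry by the owner: just count
                return
            while self._owner is not None:
                self._cond.wait()         # unsuccessful threads wait here
            self._owner, self._count = me, 1

    def unlock(self):
        with self._cond:
            if self._owner != threading.get_ident():
                raise RuntimeError("unlock() by a thread that is not the owner")
            self._count -= 1
            if self._count == 0:
                self._owner = None
                self._cond.notify()       # the final unlock signals a waiter


rlock = RecursiveLock()
rlock.lock()
rlock.lock()                              # same thread: no self-deadlock
held_by_main = rlock.holds_lock()

seen_by_other = []
t = threading.Thread(target=lambda: seen_by_other.append(rlock.holds_lock()))
t.start()
t.join()

rlock.unlock()
still_held_after_one_unlock = rlock.holds_lock()
rlock.unlock()
held_after_final_unlock = rlock.holds_lock()
```

The thread id stored in _owner is the “key” discussed above: comparing it with the caller's id answers whether this thread already holds the lock.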

hash_set insert success/failure: xLang

SGI hash_set (not in STL) insert also returns true/false, as the 2nd field of the returned pair object.

Same for VC++ hash_set.

java HashSet add() returns true/false to indicate successful/failed insertion.

Same for c#
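
For comparison, Python's built-in set.add() returns None, so the success flag has to be derived. A minimal sketch, with add_if_absent() as a hypothetical helper:

```python
def add_if_absent(members, item):
    """Hypothetical helper: insert and report success, like java HashSet.add()."""
    if item in members:
        return False
    members.add(item)
    return True


symbols = set()
first = add_if_absent(symbols, "IBM")    # new element
second = add_if_absent(symbols, "IBM")   # duplicate is rejected
builtin_result = symbols.add("MSFT")     # built-in add() reports nothing
```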

java String[] is O(1) for lookup-by-index

int[] has constant by-index lookup time, because system can compute address of 55th element from base address and offset ie base address + 54 * 4 since an int takes 4 bytes.

For a String[], I believe it’s the same formula. 55th string could be very long, but it’s not stored in the array’s memory block. I believe it’s the address of 1st char that’s stored in the array’s memory block, assuming the c-str implementation. Each address probably takes 4 bytes in a 32-bit machine.

More generally, any array of objects is probably implemented the same way. Note an array is also an object. If the elements in the outer array are array objects, then we get a 2D array. Outer array’s memory block holds 55 pointers to 55 inner arrays.
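
The address arithmetic can be checked with ctypes, which exposes C-style arrays to Python. This is a sketch under the assumption of a CPython build with ctypes available; the array contents are arbitrary sample values.

```python
import ctypes

N = 100
IntArray = ctypes.c_int32 * N
arr = IntArray(*range(1000, 1000 + N))

base = ctypes.addressof(arr)
elem_size = ctypes.sizeof(ctypes.c_int32)         # 4 bytes

# 55th element = index 54 -> base + 54 * 4, no scanning needed.
address_of_55th = base + 54 * elem_size
value_at_address = ctypes.c_int32.from_address(address_of_55th).value

# An array of strings stores fixed-size pointers, not the characters.
StrArray = ctypes.c_char_p * 3
names = StrArray(b"a", b"a much longer string", b"c")
slot_size = ctypes.sizeof(names) // 3
```

Each slot in the string array is exactly one pointer wide regardless of how long the pointee is, so the same base-plus-offset formula applies.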

JNI memory leak, briefly

Java has a problem with accessing resources outside the JVM, such as directly accessing hardware. Java solves this with native methods (JNI) that allows calls to functions written in another language (currently only C and C++ are supported). …

There is performance overhead in JNI, especially for large messages, due to copying of the data from the JVM’s heap onto the system buffer. JNI also may lead to memory leaks because in C the programmer is responsible for allocating and freeing the memory. GC can’t go beyond jvm heap to visit the malloc free store. See post on wholesale/retail.

Even regular java objects on java heap may leak when you add JNI. Using an int array as an example, ReleaseIntArrayElements will “unpin” the Java array if it has been pinned in memory. I believe anything pinned by JNI will not be garbage collected. If the JNI programmer forgets to release, it’s similar to a java programmer forgetting to clear a static hashmap.

double-colon q[ :: ] in c++

Recently I hit this confusion multiple times. The compiler is not confused because the usage scenarios are distinct —

  1. usage — namespace1::free_func2(). Example – you can specify std::swap() instead of your local swap()
    • somewhat similar to java package separator the dot
  2. usage — classA::staticMethod1()
  3. usage (very useful) — superclassA::instanceMethod3(). This is equivalent to this->superclassA::instanceMethod3() //tested
  4. usage — classB::localType

Actually, the first 2 scenarios have similar meanings. Java simply merged both into the dot.

boost::this_thread::get_id() shows two usages.

I believe a field name can replace the function name.

MyNamespace1::MyNamespace2::j = 10; // nested namespace
std::terminate() and ::operator new() are similar to and methods.

In a class template, the #2 and #4 usages can confuse the compiler. See P670 [[c++primer]].

import+package+namespace ]py/c++/java/perl

Namespace is a common challenge, and a common feature. Physically, module files invariably live in a hierarchical file system.

Perl’s solution revolves around the concept of package and symbol table….

–java’s solution is rooted in fully qualified type names. Every named type has a FQTN. They naturally form a hierarchical namespace tree.

Q: Can we import a global object like System.out i.e. the equivalent of the c++ cout handle object?
A: java global objects are always implemented as static fields (never namespace-level or package-level variables). Therefore we use static import. See

Python instantiates a namespace object “os” (or sys, re) when you say “import os, sys, re”, so you can use the dot notation like os.path. Thanks to this “object”, Python’s introspection, instrumentation and meta programming capabilities shine through.

Python’s “from os import environ” imports the environ variable into the current namespace, just like c++ “using std::cout”
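
A short sketch of the “module is an object” point, using only the standard library. The path components "a" and "b" are arbitrary.

```python
import os
import sys
import types
from os import environ

# "import os" binds the name os to a module object in this namespace.
os_is_module_object = isinstance(os, types.ModuleType)

# The dot notation is plain attribute lookup on that object...
join_via_dot = os.path.join("a", "b")
join_via_getattr = getattr(os.path, "join")("a", "b")

# ...so introspection works on it like on any other object.
exported_names = dir(os)

# "from os import environ" binds the same object under a local name.
same_object = environ is os.environ

# Every imported module is registered once in sys.modules.
registered = sys.modules["os"] is os
```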

–C++ namespaces are well-covered in [[absolute c++]], and also concisely covered in effC++
using namespace std; // imports all the vars and functions into “here” so we don’t need to say std::cout
using std::cout; // imports just one var

However, in both cases your own “cout” variable will clash with the cout imported.

There’s a 3rd usage of “using myBaseClass::method2” on P413 of ARM

Note, “using” can be nested in  a class.

c++ forward class declaration vs "implementation"

a FCD is the minimum declaration of a class before its use. Here are some FCD in the <iosfwd> header:

template<class charT> class char_traits;
template<> class char_traits<char>;

In C++, i see 3 levels of class declarations
1) FCD
2) class definition using method prototypes, and field compositions ie a full listing of fields. Full listing required for memory allocation.
3) class fully defined with method bodies
4?) see another post for an alternative – pure abstract classes

Usually we put #2 in *.h; client programs “#include” our class definitions, which the preprocessor pastes in textually. We seldom need to put #3 in header files, though most boost header files are #3, with important consequences for linking and compiling.

effC++ item 34 has a detailed treatment of #1 vs #2. FCD, being the minimum declaration, is also known as the “interface”, whereas #2 is known as an “implementation” and “class definition”.

Puzzled by the word “implementation”? Think of a Car as an abstract concept. Different car makers “implement” it by using concrete components. A specific implementation of car is essentially a listing of non-static fields.

Put another way, implementation means composition.

The compiler needs the size of each field (possibly of user-defined types) in order to size up your Car instance. A new expression passes sizeof(Car) to operator new. In java, primitive fields have known sizes; all reference fields occupy 4 bytes (32-bit machine). The c++ compiler determines the object size and the offset of each field at compile time, but the address of each new’ed object is chosen by the allocator at runtime.

As you write a client program, you could sometimes choose to include the API classes by FCD rather than #2. I feel if you don’t open up an API object to access its members, and you only mention the class name in method signatures, then FCD suffices.

The motivation behind Item #3/4 is compile-time dependency and coupling. I feel it’s c++ specific. By the way, decoupling is one of my favorites, and is a practical priority compared to a lot of other design principles.

Google c++ coding guide gave a simple illustration of bad FCD messing up compilation, where #include is safe. In a nutshell, #include would inform compiler the inheritance relationship between B and D, which is important to compiler static binding.

variable-scope vs object-lifetime

(See also blog post on [[a heap-stack dual variable]].)

[[OO MultiThreading using c++]] P 355 lists 4 scopes.

Sound bite: any object’s address is always immutable. It’s a simple, fundamental concept but not always obvious to me:)

Sound bite: a pointer (and a java reference) variable is a 32bit memory location[3] with a _permanent_name_. This 32bit object is the pointer itself. When you reseat the pointer, you change the state of this 32bit pointer object, but the pointer object itself stays at its old address.

[3] often on stack, even if the pointee is on heap

Q: A fully qualified var name is unique by definition. Mem addresses are unique by definition. How are they related?
* A stackVar has a name attached to the address. Is the name->address mapping permanent? I think so.
** what if the stackvar is a ptr to a heap object? 4-byte ptr is on stack? yeah
* A c++ ref is a permanent name permanently attached to an address, even after the address is reclaimed (ref becomes invalid) Think Unix symlink.

* a heap object is always nameless. When created, new() returns the address only. Can you ever attach a name to that address? i doubt it. You can attach multiple pointers and name the pointers, which have 4-byte addresses. Heap holds unnamed addresses. Think Unix Hardlink.

* java objects are always nameless. An object (address immutable) has no names attached to address. Name is for the variable, which can be reseated to another address. Unix hardlink. Just like pointers.

[[c++ primer]] says variables have scope and objects have lifetime.
* Simple stackVar has matching scope/lifetime. When a regular stackVar goes out of scope, the object goes.
* static locals have small scope and long lifetime
* fields have class scope, and same lifetime as host object??
* heap objects are complicated — long life and no scope. A heap object is associated with zero[2] or more pointer variables. Usually a ptr variable is a stackVar or a field. The 4-byte pointer object has its own lifetime and a variable scope. If heap obj links to just one ptr variable, and it goes out of scope and 4-byte reclaimed, we get leaks.

[2] zero means unreachable object and a leak
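
Python variables behave like the java references described above: names are reseatable, objects are nameless. A minimal sketch, assuming CPython (where id() happens to be the object's memory address); the variable names are arbitrary.

```python
original = ["heap", "object"]     # object created; name 'original' refers to it
alias = original                  # second name, same object (think hardlink)

address_before = id(original)     # identity of the object

original.append("mutated")        # changes the object's state, not its identity
address_after = id(original)

original = ["another", "object"]  # reseat the name; the first object is untouched


def make_list():
    local = [1, 2, 3]             # 'local' goes out of scope at return...
    return local                  # ...but the object lives on if still referenced


survivor = make_list()
```

The name local has function scope, yet the list it named outlives the call: scope belongs to the variable, lifetime belongs to the object.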

alternatives to java, in the enterprise

I hope to get into the minds of the CTO’s and compile the list of pros and cons…

Tech gurus routinely predict the decline of java [1], but my question is “Who will /displace/ java in the enterprise?”. Let’s restrict ourselves to app programming languages. I think the gorillas/challengers of the past have been

– Java
– C# (dotnet is not a language)
– python
– javascript

Pattern: OO is the only option. For Perl or PHP to enter the enterprise, OO needs strengthening.

pattern: scripting languages are never serious contenders. Now,

Q: What are the fundamental technical reasons beside (beneath) vendor support? Even the most conservative big business was and is open to scripting or open-source — Perl/Shell, PHP, Javascript, Tomcat, Linux … “Open” but short of adopting in a bigger way.
A: Performance, data volume. Perl falls short.
A: large scale, multi-module development

Answer to the opening question: From the list above, the challenger list boils down to 1 — C#

[1] as well as other technologies. They do that to every player.