POSIX semaphores do grow beyond initial value #CSY

My friend Shanyou pointed out a key difference between a C++ binary semaphore and a simple mutex — namely ownership. A POSIX semaphore (binary or otherwise) has no ownership: non-owner threads can create permits from thin air, then use them to enter the protected critical section. Analogy — NFL fans printing free tickets to enter the stadium.

https://stackoverflow.com/questions/7478684/how-to-initialise-a-binary-semaphore-in-c is one of the better online discussions.

  • ! to my surprise, if you initialize a posix semaphore to 1 permit, it can be incremented to 2. So it is not a strict binary semaphore.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/thr/binarySem.cpp is my experiment using linux POSIX semaphore. Not sure about system-V semaphore. I now think a posix semaphore is always a multi-value counting semaphore with the current value indicating current permits available.

  • ! Initial value is NOT “max permits” but rather “initial permits”

I feel the restriction imposed by a semaphore is just advisory, like advisory file locking. If a program is written to subvert the restriction, it’s easy. Therefore, a “binary semaphore” is binary IFF all threads follow protocol.

https://stackoverflow.com/questions/62814/difference-between-binary-semaphore-and-mutex claims “Mutex can be released only by thread that had acquired it, while you can signal semaphore from any non-owner thread (or process).” This does NOT mean a non-owner thread can release a toilet key owned by another thread/process. It means the non-owner thread can mistakenly create a 2nd key, in breach of exclusive, serialized access to the toilet. https://blog.feabhas.com/2009/09/mutex-vs-semaphores-–-part-1-semaphores/ explicitly says that a thread can release the single toilet key without ever acquiring it.

–java

[[DougLea]] P220 confirms that in some thread libraries such as pthreads, release() can increment a 0/1 binary semaphore value to beyond 1, destroying the mutual exclusion control.

However, java.util.concurrent.Semaphore is no stricter: its javadoc states there is no requirement that a releasing thread ever acquired a permit, so release() before acquire() quietly adds a “spare key” (no error thrown) and a Semaphore(1) can hold 2 permits. For true ownership semantics in java, use a mutex such as ReentrantLock, whose unlock() by a non-owner throws IllegalMonitorStateException.


how could jvm surpass c++ latency

A Shanghai Morgan Stanley interviewer asked in a 2017 java interview — “How could jvm surpass c++ latency?”

— One reason — JIT compiler could aggressively compile bytecode into machine code with speedy shortcuts for the “normal” code path + special code path to handle the abnormal conditions.

P76 [[javaPerf]] describes a nifty JIT technique to avoid the runtime cost of the dynamic binding of the virtual function equals(). Suppose in some class we call obj1.equals(obj2).

After a priming (i.e. warm-up) period, JIT collects enough statistics to see that every dynamic dispatch at this site is calling String.equals(), so JIT decides to turn it into faster “static binding” so the String.equals() function address is hardwired into the assembly code (not JVM bytecode). JIT also needs to handle the possibility of Character.equals(). I guess the assembly code can detect that obj1/obj2 is not a String.java instance and retry the virtual function lookup. JIT can generate assembly code to
1. call String.equals() and go ahead to compare some field of obj1 and obj2.
2. if no such field, then obj1 is not String, then backtrack and use obj1 vtable to look up the virtual function obj1.equals()

It may turn out that 99.9% of the time we can skip the time-consuming Step 2 🙂

Priming is tricky — https://www.theserverside.com/tip/Avoid-JVM-de-optimization-Get-your-Java-apps-runnings-fast-right-fromt-the-start highlights pitfalls of priming in the trading context. Some take-aways:
1. Optimizing two paths rather than just one path
2. Reusing successful optimization patterns from one day to the next, using historical data

— One hypothesis — no free() or delete() in java, so the memory manager doesn’t need to handle reclaiming and reusing the memory. [[optimizedC++]] P333 confirmed the c++ mem mgr does that. See [1]

The top answer at https://stackoverflow.com/questions/1984856/java-runtime-performance-vs-native-c-c-code is not from a published expert, but he says —
On average, a garbage collector is far faster than manual memory management, for many reasons:
• on a managed heap, dynamic allocations can be done much faster than the classic heap
• shared ownership can be handled with negligible amortized cost, where in a native language you’d have to use reference counting which is awfully expensive
• in some (possibly rare and contrived) cases, object destruction is vastly simplified as well (Most Java objects can be reclaimed just by GC’ing the memory block. In C++ destructors must always be executed)

— One hypothesis — new() is faster in jvm than c++. See [1]

Someone said “Object instantiation is indeed extremely fast. Because of the way that new objects are allocated sequentially in memory, it often requires little more than one pointer addition, which is certainly faster than typical C++ heap allocation algorithms.”

[1] my blogpost java allocation Can beat c++


Julia, Go and LuaJIT sometimes match or even beat C in benchmark tests .. https://julialang.org/benchmarks/

http://www.javaworld.com/article/2076593/performance-tests-show-java-as-fast-as-c–.html is a 1998 study.

In my GS-PWM days, a colleague circulated a publication claiming java could match C in performance, but didn’t say “surpass”.

initialize const field in ctor body #cast

Background: I have a const field (an int, for example) “rating” to be initialized based on some computation in the ctor body, but the c++ compiler requires any const field (like rating) to be initialized in the initializer list only.

solution: cast away the constness.. See https://stackoverflow.com/questions/3465302/initializing-c-const-fields-after-the-constructor

I also used it in my github code https://raw.githubusercontent.com/tiger40490/repo1/cppProj/cppProj/concretizeSheet/concretize.cpp

*const_cast<bool*>(&hasUpstream) = tmp_hasUpstream; // strictly speaking UB: modifying an object declared const

std::string is COW-reference-counted now but should change]c++11

http://www.drdobbs.com/cpp/c-string-performance/184405453 (circa 2003) explains that COW speeds up string copy and string destruction.

https://stackoverflow.com/questions/12520192/is-stdstring-refcounted-in-gcc-4-x-c11

This proved a small halo.

However, gcc 5.1 introduced a breaking change. https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html says “These changes were necessary to conform to the 2011 C++ standard which forbids Copy-On-Write strings.”

Q: why forbidden?
A: thread-safety … See [[std c++lib]] P692

struct packing^memory alignment %%experiment

Sound bite — packing is for structs, but alignment and padding are Not only for structs

Longer version — alignment and padding apply to Any object placement, whether that object is a field of a struct or an independent object on stack. In contrast, the packing technique is specific to a struct having multiple fields.

https://github.com/tiger40490/repo1/blob/cpp1/cpp1/memAlign.cpp has my test code to reveal the rules:

  1. For a (8-byte) long long object on stack (or in a struct), the address is 8-byte aligned. So padding is added by compiler, unless you say “packed” on a struct.
  2. For a (4-byte) int object on stack (or in a struct), the address is 4-byte aligned.
  3. For a (2-byte) short object on stack (or in a struct), the address is 2-byte aligned.
  4. For a char object on stack (or in a struct), the address is 1-byte aligned. All memory address are 1-byte aligned, so compiler adds no padding.

http://stackoverflow.com/questions/11108328/double-alignment Wug’s answer echoes Ashish’s comment that tiny fields like char should be grouped together, due to Rule #4. This applies to stack layout as well [1]. However, compiler optimization can be subversive:

Not only can the compiler reorder the stack layout of the local variables, it can assign them to registers, assign them to live sometimes in registers and sometimes on the stack, it can assign two locals to the same slot in memory (if their live ranges do not overlap) and it can even completely eliminate variables.

[1] See https://stackoverflow.com/questions/1061818/stack-allocation-padding-and-alignment

field^param^local-variable — C++ allocation

A field is always allocated memory. A non-static field is part of an object; a static field is part of the class and lives in static storage — either way it gets memory.

A function parameter is conceptually allocated on the stack, though the compiler may pass it in a register.

Local variables are supposed to be allocated on stack, but may not be allocated at all. Compiler can often optimize them away.

Sometimes a variable is just a token/alias in source code’s symbol table. A constant variable can be replaced by the constant value at compile time.

How about initialization?
Rule 1: class instances are never UNinitialized
Rule 2: static variables are never UNinitialized
Rule 3: local vars and class fields are uninitialized except as part of Rule 1.

const – common subversions

Whenever I see “const”, I always remind myself at least 2 “backdoor” ways to break it

– mutable keyword

– const_cast

There are more techniques of subversion. In C we often see pointer-to-const, but in C++ ref-to-const is more common, often known simply as a const reference. Suppose we hold a const ref to an object myOldCar. myOldCar may have another, non-const reference to it. That non-const reference can modify myOldCar’s internal content behind our back!

Offsite — Even if myOldCar is a const object, it may have a pointer field transistorRadio. Since the actual radio object is “offsite” i.e. outside the real estate of myOldCar, that object can change its content. myOldCar is effectively non-immutable.

Incidentally, other software constructs have backdoors and subversions too

* Whenever I see singleton, I always remind myself of those subversive techniques to make multiple instances of the class.

* Whenever I see a private ctor, I always remind myself of those subversive techniques to instantiate this class without this ctor.

subverting – multiple inheritance

!! With MI, the “this” pointer value is not always identical between the Base subobject and the Derived object.
** Remember a primitive technique of a home-made “class” is a C struct + a self-pointer field [2]. Similarly, In Python methods, “self” must be the first argument…

[2] We don’t actually add a 32-bit field to each instance — the compiler passes “this” as a hidden function argument instead.

!! In a Derived instance, there’s not always a single Base1 instance.
** Basic example — D extends C1 and C2, which both extend B.
** Even if C1 and C2 both virtually extend B, if D also extends a third class C3 that extends B without “virtual”, the D instance still embeds 2 B subobjects.

!! Casting within an inheritance hierarchy doesn’t always preserve the address held inside a Derived pointer or Base pointer (with SI it always does). ARM P221 says “with MI, casting changes the value of a pointer”. See my experiment https://github.com/tiger40490/repo1/blob/cpp1/cpp/88miscLang/ptrArithmeticMI.cpp

!! Each Derived instance uses more than one vtbl. If D inherits from B1 and B2, then that instance uses two vtbl’s. See ARM P230.

!! AOB (see other posts) assumes a 0 “delta”, but MI introduces a non-zero delta! ARM P222

— The rule below still holds, in both SI and MI —
dtor sequence is the exact reverse of ctor sequence which is BCDC (see other posts)

java/c++ overriding: 8 requirements #CRT

Here’s Another angle to look at runtime binding i.e. dynamic dispatch i.e. virtual function. Note [[effModernC++]] P80 has a list of requirements, but I wrote mine before reading it.

For runtime binding to work its magic (via vptr/vtbl), you must, must, must meet all of these conditions.

  • method must be –> non-static.
  • member must be –> non-field. vtbl offers no runtime resolution of FIELD access. See [[java precisely]]. A frequent IV topic.
  • host object must be accessed via a –> ref/ptr, not a non-ref variable. P163 [[essential c++]]. P209 [[ARM]] explains that with a nonref, the runtime object type is known in advance at compile time, so runtime dispatch is not required and would be inefficient.
  • method’s parameter types must be —> an exact copy-paste from parent to subclass. No subsumption allowed in Java [2]. C++ ARM P210 briefly explains why.
  • method is invoked not during ctor/dtor (c++). In contrast, Java/c# ctor can safely call virtual methods, while the base object is under construction and incomplete, and subclass object is uninitialized!
  • method must be –> virtual, so as to hit the vtbl. In Java, all non-static non-final methods are virtual.
  • in c++ the call or the function must NOT be scope-qualified like ptr2->B::virtF() — subversion. See P210 ARM
  • the 2 methods (to choose from) must be defined INSIDE 2 classes in a hierarchy. In contrast, a call to 2 overload methods accepting a B param vs a D param respectively will never be resolved at runtime — no such thing as “argument-based runtime binding”. Even if the argument is a D instance, its declared type (B) is always used to statically resolve the method call. This is the **least-understood** restriction among the restrictions. See http://bigblog.tanbin.com/2010/08/superclass-param-subclass-argument.html

If you miss any one condition, then without run/compile-time warnings the compiler will __silently__ forgo runtime binding and assume you want compile-time binding. The c++11 “override” keyword and java @Override help break the silence by generating compiler errors.

However, return type of the 2 functions can be slightly different (see post on covariant return type). P208 ARM says as of 1990 it was an error for the two to differ in return type only, but [[c++common knowledge]] P100 gives a clear illustration of clone() method i.e. virtual ctor. See also [[more eff C++]] P126. CRT was added in 1998.

[2] equals(SubclassOfObject) is overloading, not overriding. @Override disallowed — remember Kevin Hein’s quiz.

Here’s a somewhat unrelated subtle point. Suppose you have a B extended by C, and a B pointer/ref variable “o” seated at a C object, you won’t get runtime binding in these cases:

– if you have a non-static field f defined in both B/C, then o.f is compile-time binding, based on declared type. P40 [[java precisely]]
– if you have a static method m() defined in both B/C, then o.m() is compile-time binding, based on declared type. [1]
– if you have a nonref B variable receiving a C object, then slicing occurs — you can’t access any C part.

[1] That’s well-known in java. In C++, You can also “call a static member function using the this pointer of a non-static member function.”

Quiz: who can "intercept" returns from a java method

Jolt: When you put in a “return”, you think “method would exit right here and bypass everything below”, but think again! If you use a return in a try{} or catch{}, it is at the /mercy/ of finally{}.


P477 [[thinking in java]] — only one construct can stop a return statement from exiting a method. That construct is finally, also mentioned in one of my posts on try-catch-finally execution order.

Same deal for break, continue.

python tuples aren’t waterproof immutable

–Based on http://www.velocityreviews.com/forums/t339699-are-tuple-really-immutable.html
>>> t = ([1], [2])
>>> map(id, t)    # apply the id() function to each item in t (Python 2 session; in Python 3, wrap in list())
[47259925375816, 47259925376392]
>>> t[0].append(0)
>>> t
([1, 0], [2])
>>> map(id, t)    # ids unchanged
[47259925375816, 47259925376392]

So a tuple deviates from deep, java-String-style immutability, which would mandate t[0] returning a clone — essentially copy-on-write.

A tuple is like an ordered club roster written with indelible ink. The members of the club may change jobs, age, salary etc but the roster remains the same: same members, same SSN like python id(), same ranking.

In C++ lingo, the “t” tuple has 2 pointers on its real estate. It qualifies as immutable since the two pointer fields remain _bit_wise_constant_. The pointees live outside the tuple’s real estate and are Editable.

subvert: field initializer + constructors #java

I had an apparently watertight base bean class with a field initializer

public final Map properties = Collections.EMPTY_MAP;

Apparently every instance should have an immutable empty map? You will see how this “watertight guarantee” is compromised.

Subclass adds no field. However, at runtime, I found an object of the subclass with this.properties == a Hashtable, even populated with data. How could JVM allow it?

More (ineffective) chokepoint — There is only one line of code that “puts” into this Map. It’s in a private method, and I added a throw new RuntimeException() there to ensure it never adds any data.

More (ineffective) chokepoint — There’s only one constructor for the base class. I put some println() which, surprisingly, didn’t run.

Short Answer – casting an object received by de-serialization.

Long answer — These bean classes are DTO classes for RMI. Server side is still using the old version, so it conjures up objects with this.properties == a populated Hashtable and serializes it to client JVM. Client de-serializes this.properties only with respect to the declared type, bypassing field initializer or constructors. So long as the incoming stream can convert to a Map, it’s successfully de-serialized.

q[private] access modifier ] java # surprises

Given 2 instances of C — c1 and c2. c1 can access private members in c2. See [[hardcore java]].

What about an instance b1 of a base class B? I believe b1 can’t access them. Compiler will tell you b1 knows nothing about subclasses’ fields. Similar to the c++ slicing problem.

A sound bite — “private” is checked at compile time.

Such an understanding of “private” is critical to nested classes. See posts on nested classes.

Moving on to “protected”. B has protected m1(), so a C method can call super.m1() or equivalently m1(), but an arbitrary method of an arbitrary class can’t call m1(). Again compile-time check. This is the story of the protected clone() and finalize() in Object.java.

I feel finalize() is designed to be called as super.finalize() in an overridden finalize().

subverting java private constructor #some c++

Private constructor is not watertight. Among the consequences, singletons must be carefully examined.

  • reflection
  • ObjectInputStream.readObject() — see other posts
  • de-serialization can instantiate the class without constructors (or field initializers).
    • RMI, EJB, JMS, Web service all use serialization
    • any time we copy an object across 2 jvm processes
  • if you see a private constructor, don’t feel safe — the class may be a nested class, in which case the enclosing class can call the private constructor. Worse, another nested class (a sister class) can also call it, and so can enclosed classes. In summary, all of these classes can call my private ctor —
    • ** my enclosing class
    • ** my “sister” classes ie other classes enclosed by my enclosing class
    • ** my “children” ie enclosed class including anonymous classes
    • ** my “grand-children” classes

— in c++

  • Friend function and friend class can call my private ctor.
    • Factory as friend
  • static method is frequently used by singletons

writeObject() invoked although never declared in any supertype

Usually, a common behavior must be “declared” in a supertype. If base type B.java declares method m1(), then anyone having a B pointer can invoke m1() on our object, which could be a B subtype.

However, writeObject(ObjectOutputStream) [and readObject] is different. You can create a direct subtype of Object.java and put a private writeObject() in it. Say you have an object myOb and you serialize it. In the classic Hollywood tradition, Hollywood calls myOb.writeObject(), even though this method is private and never declared in any supertype. Trick is reflection — Hollywood looks up the method named writeObject —

writeObjectMethod = getPrivateMethod(cl, “writeObject”, …

gemfire GET subverted by (cache miss -> read-through)

get() usually is a “const” operation (c++ jargon), but gemfire CacheLoader.java can intercept the call and write into the cache. Such a “trigger” sits between the client and the DB. Upon a cache miss, it loads the missing entry from DB.

When Region.get(Object) is called for a region entry that has a null value, the load method of the region’s cache loader is invoked. The load method *creates* the value for the desired key by performing an operation such as a database query.

A region’s cache loader == a kind of DAO to handle cache misses.

In this set-up, gemfire functions like memcached, i.e. as a DB cache. Just like the PWM JMS queue browser story, this is a simple point but not everyone understands it.

http://community.gemstone.com/display/gemfire60/Database+write-behind+and+read-through

When an application requests for an entry (for example entry key1) which is not already present in the cache, if read-through is enabled Gemfire will load the required entry (key1) from DB. The read-through functionality is enabled by defining a data loader for a region. The loader is called on cache misses during the get operation, and it populates the cache with the new entry value in addition to returning the value to the calling thread.

g++ removes a method if never called@@

I used to suspect that the gcc syntax checker effectively comments out a (potentially illegal) method if it’s never called. The real mechanism: J below is a class template, and member functions of a class template are only instantiated — and only then fully type-checked — when actually called. In the example below,

1) modification of this->intVar is a blatant violation of the const “this”, but gcc never instantiates (and therefore never checks) the method body unless there’s a call to the method.

2) More obviously, bad2() calls a non-existent method, but this is condoned unless someone calls bad2().

using namespace std;
template<typename T> struct J{
    J(T & rhs){}
    void violateConstThis(T a) const{
       this->intVar = a; // illegal in a const method — but never type-checked unless instantiated
    }
    void bad2() const{
        this->nonExistentMethod(); // non-existent method — same lazy-instantiation pardon
    }
    T intVar;
};
int main()
{
    int a=22;
    const J<int> j1(a);
    //j1.violateConstThis(89); // uncomment either line to trigger the compile error
    //j1.bad2();
    return 0;
}
}