share variables between methods within a class

* a special map instance field to hold all such data: a map of Objects, so casting is needed everywhere (see the sketch after this list).

* (laziest) instance field? reserved for instance attributes. Don’t abuse.
* (laziest) static field? reserved for class attributes. Don’t contaminate.
* return a collection? A bit clumsy but recommended by many.
* best option: avoid sharing. Stick to local vars if possible.
* 2nd best: sometimes the shared variable really belongs to another class. That class/object perhaps is already shared between your methods. Essentially use that object as a courier. More work than the laziest.
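
To illustrate the map idea above, a minimal C++17 sketch (ReportBuilder and the key names are made up); every read needs a cast, here via std::any_cast:

#include <any>
#include <iostream>
#include <map>
#include <string>

// Hypothetical class: intermediate results shared between methods live in one map,
// and every reader must cast back to the expected type.
class ReportBuilder {
    std::map<std::string, std::any> scratch; // the "special map instance field"
public:
    void computeTotal() {
        scratch["total"] = 42.5;             // stored type-erased
    }
    void printTotal() {
        // casting needed everywhere; throws std::bad_any_cast on a mismatch
        std::cout << std::any_cast<double>(scratch.at("total")) << '\n';
    }
};

int main() {
    ReportBuilder rb;
    rb.computeTotal();
    rb.printTotal();
}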


hibernate — essential techniques to SELECT query

* Technique: standard associations (m:1,m:n etc) — You need not write the query to SELECT the “associated” objects. If you load Students and want the associated Course loaded, Hibernate automatically constructs the Course query based on the Course.hbm.xml.

* Technique: HQL — You may need to write HQL to select students.

* Technique: views — Completely outside and unknown to hibernate, you can implement complex SELECT in a view and mention its name in a hbm.xml file, as if it’s a table.

* Technique: native SQL —

* Technique: stored proc — existing logic in a proc? This would be the ultimate: most powerful and customized.

2 FX interbank broker ^ 2 interdealer treasury broker

Treasury inter-dealer brokers are the backbone of the treasury market. BrokerTec and eSpeed…

FX interbank brokers are the backbone of the currency market. In Spot FX, EBS and Reuters (see separate blog) are the only 2 big brokers. See http://en.wikipedia.org/wiki/Interbank_market and the Investopedia article. Big banks handle very large transactions, often in billions of dollars. These transactions cause the primary movement of currency prices (in the short term?). In the long term, fx is influenced not by the big banks' actions but by economies.

Reuters’ system for Spot is the electronic version of traditional voice execution: a screen-based “conversational” system, so both sides know each other. Trades execute within the conversation, much like voice execution.

In contrast, EBS is anonymous. Trades execute when market-takers hit a button on screen.

For FX Forward, Reuters (see the other post) is dominant, but Tullet Prebon is popular too.

top3 key nlg pearls on thread methods

A: wait(), lock()

#1. Each of the Thread object’s static/non-static methods must execute in a call stack (ie a real thread), and this real thread can be unrelated to that Thread object. You should always ask “Is this Thread method affecting the call stack (ie the real thread)?”

#2 Q: Many other methods affect (or are designed to affect) their call stack, yet are not defined in Thread.java. Examples?

vtable and vptr in pseudo code

http://www.parashift.com/c++-faq-lite/virtual-functions.html#faq-20.4 is a one-pager with pseudo code. Here are some of my comments
tanbin – one v-table per class in the hierarchy, shared by all instances. Parent’s v-table, child’s v-table…
tanbin – one v-ptr per instance. Since a child instance wraps a parent, the entire “onion” has a single v-ptr
tanbin – the v-ptr is reseated once as each “onion” layer is added during construction. Each child constructor in the hierarchy reseats the v-ptr to point to that child’s own v-table

————
Let’s work an example. Suppose class Base has 5 virtual functions: virt0() through virt4().

// Your original C++ source code

class Base {
public:
  virtual arbitrary_return_type virt0(…arbitrary params…);
  virtual arbitrary_return_type virt1(…arbitrary params…);
  virtual arbitrary_return_type virt2(…arbitrary params…);
  virtual arbitrary_return_type virt3(…arbitrary params…);
  virtual arbitrary_return_type virt4(…arbitrary params…);
};

Step #1: the compiler builds a static table containing 5 function-pointers, burying that table into static memory somewhere. Many (not all) compilers define this table while compiling the .cpp that defines Base‘s first non-inline virtual function. We call that table the v-table; let’s pretend its technical name is Base::__vtable. If a function pointer fits into one machine word on the target hardware platform, Base::__vtable will end up consuming 5 hidden words of memory. Not 5 per instance, not 5 per function; just 5 for the class. It might look something like the following pseudo-code:

// Pseudo-code (not C++, not C) for a static table defined within file Base.cpp

// Pretend FunctionPtr is a generic pointer to a generic member function
// (Remember: this is pseudo-code, not C++ code)
FunctionPtr Base::__vtable[5] = {
&Base::virt0, &Base::virt1, &Base::virt2, &Base::virt3, &Base::virt4
};

Step #2: the compiler adds a hidden pointer (typically also a machine-word) to each object of class Base. This is called the v-pointer. Think of this hidden pointer as a hidden data member, as if the compiler rewrites your class to something like this:

// Your original C++ source code

class Base {
public:

FunctionPtr* __vptr;  // supplied by the compiler, hidden from the programmer

};

Step #3: the compiler initializes this->__vptr within each constructor. The idea is to cause each object’s v-pointer to point at its class’s static v-table, as if it adds the following instruction in each constructor’s init-list:

Base::Base(…arbitrary params…)
: __vptr(&Base::__vtable[0])  // supplied by the compiler, hidden from the programmer

{

}

Now let’s work out a derived class. Suppose your C++ code defines class Der that inherits from class Base. The compiler repeats steps #1 and #3 (but not #2). In step #1, the compiler creates a new hidden v-table for class Der, keeping the same function-pointers as in Base::__vtable but replacing those slots that correspond to overrides. For instance, if Der overrides virt0() through virt2() and inherits the others as-is, Der‘s v-table might look something like this (pretend Der doesn’t add any new virtuals):

// Pseudo-code (not C++, not C) for a static table defined within file Der.cpp

// Pretend FunctionPtr is a generic pointer to a generic member function
// (Remember: this is pseudo-code, not C++ code)
FunctionPtr Der::__vtable[5] = {
&Der::virt0, &Der::virt1, &Der::virt2, &Base::virt3, &Base::virt4
};                            // the last two slots (virt3, virt4) are inherited as-is from Base

In step #3, the compiler adds a similar pointer-assignment at the beginning of each of Der‘s constructors. The idea is to reseat each Der object’s v-pointer so it points at Der class’s v-table. (This is not a second v-pointer; it’s the same v-pointer that was defined in the base class, Base; remember, the compiler does not repeat step #2 in class Der.)
Finally, let’s see how the compiler implements a call to a virtual function. Your code might look like this:

// Your original C++ code

void mycode(Base* p)
{
p->virt3();
}

The compiler has no idea whether this is going to call Base::virt3() or Der::virt3() or perhaps the virt3() method of another derived class that doesn’t even exist yet. It only knows for sure that you are calling virt3() which happens to be the function in slot #3 of the v-table. It rewrites that call into something like this:

// Pseudo-code that the compiler generates from your C++

void mycode(Base* p)
{
p->__vptr[3](p);
}

On typical hardware, the machine-code is two ‘load’s plus a call:

  1. The first load gets the v-pointer, storing it into a register, say r1.
  2. The second load gets the word at r1 + 3*4 (pretend function-pointers are 4-bytes long, so r1+12 is the pointer to the right class’s virt3() function). Pretend it puts that word into register r2 (or r1 for that matter).
  3. The third instruction calls the code at location r2.

Conclusions:

  • Objects of classes with virtual functions have only a small space-overhead compared to those that don’t have virtual functions.
  • Calling a virtual function is fast — almost as fast as calling a non-virtual function.
  • You don’t get any additional per-call overhead no matter how deep the inheritance gets. You could have 10 levels of inheritance, but there is no “chaining” — it’s always the same — fetch, fetch, call.

local variable declaration IS allocation@@

– local nonref (i.e. stackVar) declaration always allocates and creates the object — the C tradition. (Java made a bold departure.) This is the big deal in this post.

MyClass c; // allocates AND initializes; the no-arg ctor does run for a class type
MyClass c= …. // ditto; here the copy/conversion ctor runs

– local ptr variable declaration — always allocates the 32 bits for the pointer. [1]
– local ref variable declaration — always allocates and initializes the reference. Compare to ptr.
– function param declaration (nonref) always allocates and creates the object, on the stack.
– A field declaration doesn’t allocate-create the field object — not until host class construction-time. Same as in java.
^ obviously, if you see new … then you know a constructor is called. A new-expression is fundamentally different from a nonref variable declaration because
^^ it allocates on the heap
^^ it returns a pointer
^^ the object created is nameless. In contrast, a nonref variable is a name-plate carved on a memory location.

[1] The pointer itself may be left uninitialized, initialized to 0, or seated to a real pointee object.
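
A tiny sketch of the above rules (MyClass is a made-up name; assumes a typical compiler with no optimization):

#include <iostream>

struct MyClass {
    MyClass()  { std::cout << "no-arg ctor runs\n"; }
    ~MyClass() { std::cout << "dtor runs\n"; }
};

int main() {
    MyClass  c;          // nonref local: stack storage allocated AND the no-arg ctor runs
    MyClass* p;          // only the pointer itself (one machine word) is allocated
    p = new MyClass;     // new-expression: nameless heap object, returns a pointer
    MyClass& r = c;      // reference: allocated and bound (initialized) in one step
    (void)r;
    delete p;            // the heap object must be deleted explicitly
}                        // c is bulldozed automatically here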

std::string field and other non-ptr fields in a c++ class

Background — I feel this is a fundamental but overlooked design consideration.

In java, any non-primitive field is a ptr. In c++, std::string is a common field type, no ptr needed. std::vector is similar… But what other non-ptr fields are common? Let’s exclude primitive types like double or “bool” (shorter spelling than java “boolean”).

I feel any class like Address can be used as the nonref type of a field myAddr in a new class Customer. Customer ctor needs to allocate this field, but is it allocated on stack or heap? It depends on the context.

– If you allocate the entire Customer object on heap, then myAddr field is also on heap
– ditto for stack. You can trace the steps of myAddr’s allocation and there’s no malloc().

Either way, upon Customer de-allocation, myAddr is automatically /bulldozed/ since myAddr is on the real estate of the Customer object. sizeof(Customer) includes sizeof(Address), which is not the case for pimpl.

Now we know nonref field like qq{ Address myAddr; } is the c++ default whereas qq{ Address * myAddrPtr; } is the pimpl/java/c# version.
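
A minimal sketch of the two flavours (Customer/Address/myAddr follow the post; the printed sizes are implementation-dependent):

#include <iostream>
#include <string>

struct Address {
    std::string street;
    std::string city;
};

struct Customer {        // c++ default: the Address lives on the Customer's own real estate
    std::string name;
    Address     myAddr;  // nonref field, no pointer
};

struct CustomerPimpl {   // pimpl/java/c# flavour: only a pointer is embedded
    std::string name;
    Address*    myAddrPtr;
};

int main() {
    std::cout << sizeof(Customer)      << '\n';  // includes sizeof(Address)
    std::cout << sizeof(CustomerPimpl) << '\n';  // includes only a pointer's worth

    Customer c{"Alice", {"1 Main St", "Springfield"}}; // myAddr allocated with c, on the stack
}   // c destroyed here; myAddr is bulldozed automatically, no delete needed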

c++ casts usually work with pointers and references

dynamic_cast — target type must be ref or ptr
dynamic_cast — source must be ref or ptr
———
const_cast can operate on nonref, but usually operates on ptr and references. Specifically, const-ref is a common func param, and const_cast abounds here.
Q: const_cast — can the target/source type be nonref?
A: No. const_cast only accepts pointer, reference and pointer-to-member types, so a nonref target won’t compile. To strip constness from a plain object you would copy it, which is like a return-by-clone function. See post on casting-creates-new-pointer.

See also post on const nonref
——–
static_cast is _less_ common, but …
Q: static_cast — can the target/source type (i.e. the LHS) be nonref?
A: yes. [[effC++]] says the copy ctor may be invoked, i.e. the LHS is a distinct new object.
A: yes, per http://www.cplusplus.com/doc/tutorial/typecasting/
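
A small compilable sketch of these points (Base, Der and f() are made-up names): dynamic_cast on a ptr, const_cast on a const-ref param, static_cast between nonref values:

#include <iostream>
#include <string>

struct Base { virtual ~Base() = default; };
struct Der : Base { void hello() const { std::cout << "Der\n"; } };

void f(const std::string& s) {
    // const_cast on a reference: the classic const-ref func param situation
    std::string& writable = const_cast<std::string&>(s);
    writable += "!";
}

int main() {
    Der d;
    Base* bp = &d;

    // dynamic_cast: both source and target must be ptr or ref
    if (Der* dp = dynamic_cast<Der*>(bp))
        dp->hello();

    std::string s = "hi";
    f(s);                            // ok because the underlying object is non-const
    std::cout << s << '\n';          // prints "hi!"

    // static_cast with nonref source and target: a distinct new object on the LHS
    double x = 3.9;
    int    i = static_cast<int>(x);  // value conversion; i is a separate object
    std::cout << i << '\n';          // prints 3
}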

allocate ptr/ref on heap@@

I don’t think finance app developers need this level of understanding, but for my curiosity…

Q1: do we sometimes create a ptr or reference on heap?
A: I think so, if your object has a 32-bit pointer field, or (less commonly) a reference field, and you create the object on heap. This is quite common in C++. Extremely widespread in java, since most non-trivial objects need pointer fields to be useful. The pimpl pattern does the same in c++.

Q1b: Apart from that, is there any way to create a ptr on heap?
A: I don’t think so. Here are a few cases —

— stackVar as a ptr? —
As a so-called auto variable, the 32-bit storage is AUTOmatically deallocated. Note the pointee could very well be on the heap, and may leak memory when the pointer disappears.

— static_casting a heap ptr of Q1? —

   Type1* var1 = static_cast<Type1*>(obj3.field2);

See post on casting-creates-new-pointer for more details. In the above context, a new 32-bit pointer is allocated on the stack. var1 and obj3.field2 point to the same pointee, but they each occupy 32 bits.
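
A minimal sketch of Q1 (Holder is just an illustrative name): a pointer field lives wherever its host object lives, here on the heap:

#include <string>

struct Holder {                     // hypothetical name
    std::string* impl;              // a pointer field, pimpl-style
    Holder() : impl(new std::string("payload")) {}
    ~Holder() { delete impl; }
};

int main() {
    Holder* h = new Holder;         // *h is on the heap, so h->impl (the pointer itself) is on the heap
    std::string* onStack = h->impl; // this copy of the same address is an auto (stack) variable
    (void)onStack;
    delete h;                       // frees the embedded pointer along with the rest of the object
}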

pointer-casting creates new pointer@@

With some exceptions[1], c/c++ casts operate primarily on pointers (and references). That begs the question —
Q1: does pointer casting allocate 32 bits for a new pointer (assuming 32-bit addressing)?
A: depends on context

I feel the more basic question is

Q0: does pointer initialization allocate 32 bits for a new pointer?
A: i think so:

SomeType* p2 = p1; // allocates 32 bits for a new pointer p2, unless optimized away

Now back to our original Q1, i feel casting is like a function that returns an address in a specific “context” — address returned must be used as a Type3 pointer:

(Type3*) p1;

In this case, if you don’t use this address to initialize another pointer, then the system doesn’t allocate another pointer. But usually you put the cast on the RHS, so the LHS determines whether allocation is required.

Type3* p3 = (Type3*) p1; // 32 bits allocated
////////////////
Type3* p3=0; // this pointer could be on stack or heap
p3 = (Type3*)p1; // no new allocation since p3 is already allocated.

As an extension of the LHS/RHS syntax, programmers also use function call syntax —

myFunction(123, (Type3*)p1); // the cast returns an address, to be saved in a temporary 32-bit pointer on the stack. This is more like pointer initialization.

[1] static_cast often operates on nonref variables, by implicitly invoking cvctor or OOC

pair of iterator, array of iterator…

Could be a common pattern. Think of these iterators as pointers.

The anagram blog-post mentions an array of iterators.

[[stl tutorial]] mentions a pair of iterators, where we use iterator to specialize the std::pair template.
– The concrete iterator type is at the specialization-level.
– The iterator object is at the instance level.

I feel this conceptual complexity is unwanted, so it’s best to just copy this implementation and not analyze it too much.
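
A minimal sketch: std::equal_range happens to return exactly such a pair of iterators, and an array of iterators is just as easy:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 2, 2, 3, 5};

    // a pair of iterators: std::pair specialized with the container's iterator type
    std::pair<std::vector<int>::iterator, std::vector<int>::iterator>
        range = std::equal_range(v.begin(), v.end(), 2);
    std::cout << "count of 2s: " << (range.second - range.first) << '\n';  // 3

    // an array of iterators: each element is a pointer-like bookmark into v
    std::vector<int>::iterator marks[2] = { v.begin(), v.end() - 1 };
    std::cout << *marks[0] << ' ' << *marks[1] << '\n';                    // 1 5
}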

pair^unary functor^Property pattern..all use2dummy-types

STL’s pair template has 2 “dummy types” — K/V. Pair template is useful by itself, without the map.

By comparison, the STL unary (no comments on “binary”) func template also has 2 dummy types A/R, standing for Argument/Result.

The popular Property pattern (D Duffy) also features 2 dummy types K and V. Many features are added — very adaptable.

The name/value pairs are very natural on a spreadsheet.

Warning — some programmers give longer names to these dummy types, e.g. “Key/Value”. They want to add clarity but actually add confusion: such dummy type names look like real class names or typedefs, especially in a long source listing, since c++ class names and typedefs don’t reliably start with capital letters.

Boost Tuple generalizes the pair concept.

This “pair” concept is very simple, practical and adaptable.
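
A bare-bones sketch of the shared shape (this Property is my own toy, not D Duffy’s actual class):

#include <iostream>
#include <string>
#include <utility>

// toy Property-like template with 2 dummy types, N (name) and V (value)
template <typename N, typename V>
class Property {
    N name_;
    V value_;
public:
    Property(const N& n, const V& v) : name_(n), value_(v) {}
    const N& name()  const { return name_; }
    const V& value() const { return value_; }
};

int main() {
    // the same "2 dummy types" shape as std::pair<K,V>
    std::pair<std::string, double> cell("spotPrice", 1.2345);
    Property<std::string, double>  prop("spotPrice", 1.2345);

    std::cout << cell.first  << '=' << cell.second  << '\n';
    std::cout << prop.name() << '=' << prop.value() << '\n';
}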

pthreads according to S. Wang

Boost thread is a wrapper around pthread? But boost creator disagrees — see other post.

Think of pthread as a c library on solaris. There’s a similar pthread library on linux, and another pthread library on aix… All of them have the same application programmer interface. If I have a c program using the solaris pthread library and port it to linux, it should be able to link to the linux pthread library.

Let’s zoom into the solaris (or linux or aix …) pthread library. It is probably a bunch of c functions. Each probably makes system calls to the OS to create, schedule threads, yield, join, acquire lock … I feel a thread uses system resources. All system resources are “guarded” by the OS software layer, so I feel thread operations (join, signal, unlock …) must go through OS.

How is a native kernel thread created in the OS? I feel all native thread creation must go through the OS, since the OS will soon schedule and control the newborn thread. In linux, a thread is created by a clone()-style call, much like forking a process, with one notable exception — the newborn thread doesn’t get an independent heap; its heap (address space) is shared with the parent process. In contrast, a regular fork-exec child process and its parent can’t share data on their 2 heaps. See P68 [[OO multithreading using c++]]
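
A minimal pthread sketch (compile with -lpthread; the same source should build against the solaris/linux/aix libraries):

#include <pthread.h>
#include <cstdio>

// the thread routine: pthreads only knows void* in, void* out
void* worker(void* arg) {
    std::printf("child thread running, arg=%d\n", *static_cast<int*>(arg));
    return nullptr;
}

int main() {
    pthread_t tid;
    int arg = 42;

    // same API on solaris/linux/aix; the library underneath makes the system calls
    if (pthread_create(&tid, nullptr, worker, &arg) != 0)
        return 1;

    pthread_join(tid, nullptr);   // wait for the child before main() exits
    std::printf("main thread done\n");
    return 0;
}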

iterator’s true type can be anything

[[EffectiveSTL]] P120 points out that in the vector [1] class template, iterator and const_iterator are typedefs (I call them aliases) of raw pointers, whereas in other containers these are 2 unrelated classes.

For easy discussion, let’s focus on const_iterator.

It seems to me the compiler sees nothing in common among the const_iterator of different containers. In that case, the const_iterator idea/category is probably another fake type??

Q: Is it possible to write a single function accepting a “const_iterator” param, when “const_iterator” can be any type?
A: No for a non-template function.
A: for a function template, maybe the template param (a fake type) helps? See my thoughts later.
A: for a method like erase() or insert(), the const_iterator often has a more predictable type, defined in the host class.

* When we call advance() on an “int*”, the template is instantiated with int* as the type-param.
* When we call advance() on, say, a list<int>::iterator, which is a true class, the template is instantiated with that iterator class as the type-param.

[1] and all array-based containers. A regular array is also iterable, and its iterator is a plain pointer.
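
A small sketch of the function-template answer: the “fake type” It accepts a plain pointer or a true iterator class equally well:

#include <iostream>
#include <list>
#include <vector>

// the "fake type" It can be any iterator type: a plain pointer for array/vector,
// or a true class for list
template <typename It>
void printRange(It first, It last) {
    for (; first != last; ++first)
        std::cout << *first << ' ';
    std::cout << '\n';
}

int main() {
    std::vector<int> v{1, 2, 3};
    std::list<int>   l{4, 5, 6};
    int a[] = {7, 8, 9};

    printRange(v.cbegin(), v.cend());  // It may be a typedef'd pointer
    printRange(l.cbegin(), l.cend());  // It is a true iterator class
    printRange(a, a + 3);              // It is literally int*
}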

STL iterators — a simple idea pushed to the max

Unix simple ideas pushed to the max —
* everything is a file
* pipes
* sockets
* fork and exec
* signals sent to a process — trap and kill
* stdin, stdout, stderr streams
* background vs foreground

But this post is not about unix but about STL simple ideas —

Idea: iterators are a bold extension of pointers into an array; they can move and jump, read and write the elements.

Idea: Iterators are adopted in many data structures including io streams, strings, arrays, regex

Idea: for_each(), find(), count_if()… these work with any data structure supporting the iterator pattern (see the sketch after this list)

Idea: Parametrized containers are adopted in many packages including io streams, strings, regex

Idea: Allocators —
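
To illustrate the algorithms idea above, a minimal sketch: the same for_each()/count_if()/find() calls run against a vector and a list:

#include <algorithm>
#include <iostream>
#include <list>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3, 4};
    std::list<int>   l{5, 6, 7, 8};

    // the same algorithms work with any data structure supporting the iterator pattern
    std::for_each(v.begin(), v.end(), [](int x) { std::cout << x << ' '; });
    std::cout << '\n';

    std::cout << std::count_if(l.begin(), l.end(),
                               [](int x) { return x % 2 == 0; }) << '\n';   // 2 evens

    std::cout << (std::find(v.begin(), v.end(), 3) != v.end()) << '\n';     // 1 (found)
}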

memory leak detection ideas#malloc etc

http://www.flipcode.com/archives/How_To_Find_Memory_Leaks.shtml is a dated (2000) but readable and detailed treatment. It builds a home-made new/delete overload using malloc and free, rather than the qq[[ ::operator new ]] that Scott Meyers advises.
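
A stripped-down sketch of that idea (my own toy, not the article’s code): a global new/delete overload built on malloc/free that merely counts outstanding blocks. A real tracker would also record file/line per allocation, and the counter may pick up allocations made by the runtime itself:

#include <cstdio>
#include <cstdlib>
#include <new>

static std::size_t g_outstanding = 0;   // blocks allocated but not yet freed

void* operator new(std::size_t sz) {
    void* p = std::malloc(sz);          // malloc, not ::operator new
    if (!p) throw std::bad_alloc();
    ++g_outstanding;
    return p;
}

void operator delete(void* p) noexcept {
    if (p) { --g_outstanding; std::free(p); }
}

int main() {
    int* leak = new int(7);             // deliberately never deleted
    int* ok   = new int(8);
    delete ok;
    std::printf("outstanding allocations: %zu\n", g_outstanding);  // expect 1
    (void)leak;
}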

  • Valgrind – no need to link anything… malloc/free are “replaced” automatically.
  • electric fence — link it into your code to get seg fault upon DAM errors. Won’t catch memory leaks. (My GDB book covers similar tools)
  • cmemleak traces malloc() and free() — the choke points.
  • [[c++nutshell]] says allocators can implement debugging or validity checks to detect programmer errors, such as memory leaks or double frees.
  • GlowCode — three easy ways to use GlowCode (a windows-only profiler): (1) use it to launch your application, (2) attach GlowCode to a running program, or (3) link GlowCode directly into your application. No source code change, build change or post-build step required. Similar to Valgrind.
  • IBM Purify — When a program is *linked* with Purify, verification code is automatically inserted into the executable by parsing and adding to the object code, including libraries. Similar to Java bytecode instrumentation. That way, if a memory error occurs, the program prints out the exact location of the error, the memory address involved, and other relevant information. Purify also detects memory leaks (but I guess as a secondary feature); a leak report can be generated by calling the Purify leak-detection API from within an instrumented application. This Object Code Insertion (OCI) can be performed either during the link phase or after the link. Rational Purify reads object files generated by existing compilers and linkers, and adds error-checking instructions without disturbing the ability to use conventional debuggers on the executable.
  • MDB is a commercial dynamic/shared library that provides replacements for malloc/free etc. You must link your software to these *.SO.
    • 🙂 Much lower overhead than Purify and Valgrind

main thread early exit

In java, the main thread can “feel free” to exit as long as another non-daemon thread keeps the JVM alive. Not so in c++. [[c++cookbook]] says

when the operating system destroys a process, all of its child threads go with it, whether they’re done or not. Without the call to join(), main() doesn’t wait for its child thread: it exits, and the operating system thread is destroyed.

This assumes a 1:1 native thread model, so each thread is actually a kernel thread. When the main thread ends and the OS destroys the process, every child kernel thread ends with it.
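
A minimal sketch of the same point using std::thread rather than the cookbook’s boost::thread (compile with -pthread):

#include <chrono>
#include <iostream>
#include <thread>

void worker() {
    std::this_thread::sleep_for(std::chrono::seconds(2));
    std::cout << "child finished\n";   // never printed if the process dies first
}

int main() {
    std::thread t(worker);

    // Without this join, main() returns immediately and the process is torn down,
    // taking the child kernel thread with it (std::thread would in fact call
    // std::terminate on a still-joinable thread, which only makes the point louder).
    t.join();

    std::cout << "main exits after the child\n";
}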