q[Node*] param to a function can be..

A Node address can represent an entire array of Node objects

If the Node class has a nextNode field, then this param can represent an entire linked list, or head (or any node) of a linked list

If Node class has left and right child nodes, then this param can represent a tree, or a subtree

Q: Can this param represent an array of linked lists? A hashtable is such an array.
A: No. If each element in the array is a Node*, then the array would look like a Node**
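
To make the three readings concrete, here is a minimal sketch. The Node layout and the function names are my own assumptions, not from any particular codebase. The same Node* type serves the array and the list, but the array of lists needs Node**.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type: one payload plus the nextNode field mentioned above.
struct Node {
    int val;
    Node* nextNode;
};

// Node* read as "first element of a contiguous array of n Nodes".
int sumArray(const Node* arr, std::size_t n) {
    assert(arr != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += arr[i].val;
    return total;
}

// Node* read as "head (or any node) of a linked list".
int sumList(const Node* head) {
    int total = 0;
    for (const Node* p = head; p != nullptr; p = p->nextNode) total += p->val;
    return total;
}

// An array of linked lists (hashtable buckets) needs one more level: Node**.
int sumBuckets(Node** buckets, std::size_t n) {
    assert(buckets != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += sumList(buckets[i]);
    return total;
}
```

Nothing in the signature tells sumArray from sumList. The caller and callee must agree on which reading applies.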

down-cast a reference #idiom

ARM P69 says down-cast a reference is fairly common. I have never seen it.

Q: Why not use ptr?
%%A: I guess pointer can be null so the receiver function must face the risk of a null ptr.
%%A: 99% of references I have seen in my projects are function parameters, so references are extremely popular and proven in this use case. If you receive a ref-to-base, you can down cast it.

See post on new-and-dynamic_cast-exceptions
see also boost polymorphic_cast
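
A small sketch of the idiom, with a made-up Base/Derived pair. Since there is no null reference, a failed dynamic_cast on a reference throws std::bad_cast, whereas the pointer form returns null. The -1 sentinel is just my choice for the demo.

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

// Hypothetical hierarchy; Base must be polymorphic for dynamic_cast to work.
struct Base {
    virtual ~Base() = default;
};
struct Derived : Base {
    int extra = 42;
};

// Receives a ref-to-base and down-casts it. A failed reference cast
// reports the error by throwing std::bad_cast.
int readExtra(Base& b) {
    try {
        Derived& d = dynamic_cast<Derived&>(b);
        return d.extra;
    } catch (const std::bad_cast&) {
        return -1;  // arbitrary sentinel chosen for this sketch
    }
}

// Pointer version for contrast: failure yields a null pointer instead.
int readExtraViaPtr(Base* b) {
    assert(b != nullptr);
    Derived* d = dynamic_cast<Derived*>(b);
    return d ? d->extra : -1;
}
```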

returning const std::string #test program#DeepakCM

1) returning a const std::string is meaningful.. it disallows f().clear(), f().append, f().assign() etc. Deepak’s 2019 MS interview.

2) returning a const int is useless IMO. [1] agrees.

3) I agree with an online post. Returning a const ptr is same as returning a non-const ptr. Caller would clone that const ptr just as it clones a mutable ptr.

I would say what’s returned is an address, just like returning an int value of 315.

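#include <iostream>
using namespace std;

int i=444;
int * const pi = &i; // const ptr: pi itself can't be reseated
int * p2 = pi;       // the clone p2 is a plain, reseatable ptr
int main(){
  int i2=222;
  p2 = &i2;          // legal: the constness of pi didn't carry over to the clone
  cout << *p2 <<endl;
}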

It does make a difference if you return a ptr-to-const. The caller would make a clone of the ptr-to-const and can’t subsequently write to the pointee.

4) How about returning a const Trade object? [1] gave some justifications:

[1] https://www.linuxtopia.org/online_books/programming_books/thinking_in_c++/Chapter08_014.html
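
Here is a rough test sketch for points 1) and 3). All function names are invented for the demo. The lines that should fail to compile are left as comments, and the static_asserts check that the const really is part of the returned temporary's type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

std::string       plainName() { return "abc"; }
const std::string constName() { return "abc"; }

// The const on the return type is part of the call expression's type.
static_assert(std::is_const<decltype(constName())>::value, "const temporary");
static_assert(!std::is_const<decltype(plainName())>::value, "mutable temporary");

int cell = 315;
const int* ptrToConst() { return &cell; }  // pointee is read-only through the clone
int*       plainPtr()   { return &cell; }  // "int* const" here would change nothing

std::size_t demoReturnTypes() {
    std::size_t n = plainName().append("d").size();  // OK on a mutable temporary
    // constName().append("d");   // would not compile
    // constName().clear();       // would not compile
    assert(constName().size() == 3);                 // const methods still work

    int* p = plainPtr();          // caller's clone of the address
    *p = 316;                     // writing via a ptr to non-const is fine
    const int* q = ptrToConst();
    // *q = 317;                  // would not compile: pointee is const via q
    assert(*q == 316);
    return n;
}
```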

pointer/itr as field: a G3 implementation pattern

common interview question.

This mega-pattern is present in 90% of java and c# classes, and also very common in c++. Important data structure classes  relying on pointer fields include vectors, strings, hashtables and most STL or boost containers.

Three common choices for a pointer member variable:

  • Raw pointer — default choice
  • shared_ptr — often a superior choice
  • char pointer, usually a c-str

In each case, we worry about

  • construction of this field
  • freeing heap memory
  • RAII
  • what if the pointee is not on heap?
  • copy control of this field
  • ownership of the pointer
  • when to return raw ptr and when to return a smart ptr

shared_ptr field offers exception safety, automatic delete, default synthesized dtor, copier, op=
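
A minimal sketch of the shared_ptr choice, using a hypothetical Label class. No dtor, copier or op= is written by hand, yet copies share the pointee and the last owner frees it.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical class whose pointer field is a shared_ptr. The compiler-
// synthesized dtor, copy ctor and op= all do the right thing.
class Label {
public:
    explicit Label(const std::string& text)
        : text_(std::make_shared<std::string>(text)) {}

    const std::string& text() const { return *text_; }
    long owners() const { return text_.use_count(); }

private:
    std::shared_ptr<std::string> text_;  // pointee is always on heap here
};

// Copies share the pointee; the last owner to die frees it automatically.
long ownersAfterCopy() {
    Label a("hello");
    Label b = a;          // synthesized copy ctor bumps the ref count
    assert(b.text() == "hello");
    return a.owners();    // 2 while both a and b are alive
}
```

With a raw pointer field, the same class would need a hand-written dtor, copy ctor and op=, plus a decision on who owns the pointee.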

 

c++delete() using addr@reference variable

Conclusion — you can’t declare a pointer to a reference variable myRef, but you can take the address of myRef!

Q1: if you have nothing but a reference to a heap object, how can you delete it?
AA: yes. q(delete & theReference) works

Q1b: how about valgrind q(g++ -g d.cpp; valgrind ./a.out)?
AA: no leak.

See https://stackoverflow.com/questions/3233987/deleting-a-reference. q(new vector<int>) works but is code smell. Code smells tend to be useful tricks.

https://bintanvictor.wordpress.com/?p=10514&preview=true has details on vector memory allocation.

 // --- d.cpp ---
#include <vector>
#include <iostream>
using namespace std;

typedef vector<int> intArray;

intArray & createArray() {
    intArray *arr = new intArray(3, 0);
    return(*arr);
}

int main(int argc, char *argv[]) {
    intArray& array = createArray();
    for (size_t i=0; i< array.size(); ++i){
      cout<<array[i]<<' ';
    }
    intArray * ptr2ref = & array;
    delete ptr2ref;
}

c++return by ref #briefly

Background: This is an old topic I studied multiple times, but still unfamiliar. I think most people don’t work at such a low level?

Q1: Can return a malloc object by reference? I tested. You can still get the address by taking the address of the reference.

Q1b: if you only have a reference to a heap object, how can you delete it? See https://bintanvictor.wordpress.com/?p=10505&preview=true

Q: Can return a static local by reference? I think so but not popular

Q: Can return a by-ref argument by reference? No need (but legal) since the caller already has that reference. You could return it in an if/else

Q: can return another object in scope by reference? Yes

Return-by-non-const-ref is often used in operator overloading to support counter[3].increment()

The assignment operator conventionally returns a non-const ref to *this, so that chained assignment like a = b = c works.
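
A sketch of the counter[3].increment() usage, with hypothetical Counter and CounterArray classes. Because operator[] returns by non-const ref, increment() operates on the stored element rather than on a copy.

```cpp
#include <cassert>
#include <cstddef>

struct Counter {
    int count = 0;
    Counter& increment() { ++count; return *this; }  // by ref, so calls can chain
};

// Hypothetical fixed-size collection of counters.
class CounterArray {
public:
    // Return by non-const ref: the caller operates on the stored element itself.
    Counter& operator[](std::size_t i) {
        assert(i < kSize);
        return items_[i];
    }
    // Read-only access for const collections.
    const Counter& operator[](std::size_t i) const {
        assert(i < kSize);
        return items_[i];
    }
private:
    static const std::size_t kSize = 5;
    Counter items_[kSize];
};
```

If operator[] returned a Counter by value, counter[3].increment() would still compile but would bump a temporary and leave the stored element unchanged.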

 

illegal to assign rvalues into non-const Lval-ref

This is a tough one for the average cpp guy

int getInt(){static int last=3; return last+1;}

int& myint = getInt(); // illegal, since getInt() returns by value, i.e. an rvalue

(What if getInt returns a reference, i.e. int&? Then the call expression is an lvalue and the binding is legal.)

However, adding a const makes it legal:

int const & var1 = 33;

In fact, this is the same thing as the (extremely) common idiom of using a const ref as a function param. With that, you can pass in either an lvalue (like a named variable) or an rvalue (like a literal).
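
A quick sketch of these binding rules. getIntRef() and length() are names I made up. The illegal line is left commented out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

int getInt() { static int last = 3; return last + 1; }  // returns by value
int& getIntRef() { static int last = 3; return last; }  // returns an lvalue

// The everyday idiom: a const-ref param accepts lvalues and rvalues alike.
std::size_t length(const std::string& s) { return s.size(); }

int demoBinding() {
    // int& bad = getInt();        // would not compile: rvalue to non-const ref
    const int& ok = getInt();      // legal; the temporary lives as long as ok
    int& alias = getIntRef();      // legal; the call expression is an lvalue
    alias = 10;                    // writes through to the function-static
    assert(getIntRef() == 10);

    std::string named = "abc";
    std::size_t total = length(named)                 // lvalue argument
                      + length(std::string("de"))     // rvalue argument
                      + length("f");                  // temporary built from a literal
    return ok + static_cast<int>(total);
}
```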

I found the above explanation on
http://eli.thegreenplace.net/2011/12/15/understanding-lvalues-and-rvalues-in-c-and-c . The author (Eli) points out:

  • subtle difference between const lval-ref vs non-const lval-ref (above)
  • lvalue means “locator value”
  • a concise example of mv-semantics
  • An everyday example of a function returning an lvalue is operator[] in

myMap[10] = 5.6;

pointer equality – counter-intuitive in c++

label – ptr/ref, book

Background: Pointer (identity) comparison is widely used in java, c# and python. 
[[c++common knowledge]] points out that if class DD subclasses CC and BB (classic MI), then pointers d, c, b behave as follows:
assert(d == c); // since the pointer to the DD object can be converted to a pointer to CC
assert(d == b); // since the pointer to the DD object can be converted to a pointer to BB
However, c and b hold two different addresses, because the 2 sub-objects can't possibly have the same starting point in the address space.
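
A minimal sketch to check this. I gave both base classes a data member (my assumption) so the two sub-objects can't share a starting address.

```cpp
#include <cassert>

// Hypothetical classes named after the ones in the text. Both bases carry a
// data member, so their sub-objects cannot overlap inside a DD.
struct BB { int b = 1; };
struct CC { int c = 2; };
struct DD : BB, CC { int d = 3; };

// True when d compares equal to both of its base-class pointers.
bool equalsBothBases(DD* d) {
    assert(d != nullptr);
    BB* b = d;  // implicit conversion, may adjust the address
    CC* c = d;  // implicit conversion, may adjust the address
    // In each comparison d is first converted to the base pointer type.
    return d == b && d == c;
}

// True when the two base sub-objects start at different raw addresses.
bool basesAtDifferentAddresses(DD* d) {
    assert(d != nullptr);
    BB* b = d;
    CC* c = d;
    return static_cast<void*>(b) != static_cast<void*>(c);
}
```

Comparing b and c directly would not even compile, since BB* and CC* are unrelated types. The cast to void* exposes the raw addresses.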

pointer in C +other lang-(early) learning notes

Every variable in any lang is a 3-some — name + addr + content (but see the post on mutability, initialize etc). http://aelinik.free.fr/c/ch11.htm has a good explanation:

int x;

x = 7;

the variable x now has two values:

Left value: 1000

Right value: 7

Here the left value, 1000, is the address of the memory location reserved for x. The right value, 7, is the content stored in the memory location.

—-

Now a few simple rules on & and *

rule — &var1 returns address of var1.

rule — I don't think &var1 can be on the left side of =

rule — var1 must be a L-value expression. See http://bigblog.tanbin.com/2011/11/c-func-call-as-l-value-key-points.html

Now suppose we assign ptr1 = &var1

rule — *ptr1 returns the content of var1. If you print *ptr1, and print var1, you get same.

rule — *ptr1 can be on the left side of =

I believe *ptr1 == *(&var1) == var1

null^stray^uninitialized^void pointer

$ dangling/wild/stray ptr and *ref* — holds an address whose memory is already bulldozed, possibly *reused* => u can’t delete, write or read the memory location. Merely reading (not touching) someone else’s memory location can have unknown effects — like incrementing a read counter.

P132 effC++ confirms a dangling reference can point to a memory cell already reused by someone else.

$ uninitialized/unseated ptr — address[2] undefined and could be a completely wrong address mapped to some program’s hot button => u can’t delete, write or read it. Always anchor your pointer before dereferencing.
[2] However, ptr object has an address and occupies 4 bytes

Objects can be uninitialized  too. See post on default-initialize.

$ uninitialized ref? won’t compile

$ null ptr — address 0, interpreted as a well-defined fake address. Safe deletion
$ null ref? won’t compile

$ void ptr — java. qq[ Object o ]
$ void ref? won’t compile

For each of these special scenarios, delete, deref/unwrap, detection (of null…), compare, reseat, assign … have special semantics.

beware of creating reference to dereferenced pointer

In one sample code (p190 [[c++in24hour]]), we see that once you delete a pointer, the reference to the object becomes a dangling reference. More common when multi-threaded.

I guess sometimes we do create a reference to a dereferenced pointer, but we have standard guards to avoid dangling references. What are these guards? Not sure.

Reference param is best practice in c++. What if caller passes an unwrapped pointer into a function’s ref param? How do you ensure the underlying heap/stack thingy doesn’t get deallocated?
– Single-threaded — if the pointee is accessible only on this thread, it can’t possibly “vaporize” beneath my feet. The caller won’t (technically impossible) delete the ptr before the func returns.
– multi-threaded — all hell breaks loose. We need guards.

There are many similar multi-threading “delete” issues to be solved with ownership control. Only one function on one thread should delete.

double-ptr usage #5 – pointer allocated on heap

Whenever a pointer Object (32-bit object[1]) is allocated on heap, there’s usually (always?) a double pointer somewhere.

new int*; //returns an address, i.e. the address of our pointer Object. If you save this address in a var3, then var3 must be a double-ptr.

int ** var3 = new int*; //compiles fine, but you should not access **var3

However, I feel we seldom allocate a pointer on heap directly. More common patterns of allocating pointer on heap are
– if an Account class has a pointer field, then an Account on heap would have this pointer field allocated on heap.
– a pointer array allocated on heap

[1] assuming a 32-bit machine.

func ptr as template non-type param

This is the #1 eye-opener in [[essential c++]]

http://bigblog.tanbin.com/2012/03/non-dummy-type-template-parameters.html describes the background of Non-Dummy-Type parameters in a class template. Things like the maxRows in template class matrix_double. [[Essential c++]] succinctly describes that usage but also illustrates on P185 how you can put in a func ptr Type (not a dummy type) in place of the “int maxRows”. I find it a very powerful technique. (I guess java and c# take other routes since they don’t support NDT template parameters.) This is how I guess it works. 

First understand a real usage scenario and focus on a concrete class instantiated from such a template. At runtime such a class can have 888 class-instances, but they all share a single special “static field”, which is the function address[1] supplied to concretize the template.
If you later concretize the template with a 2nd function address, you get a 2nd concrete class. You can create 877 instances of this 2nd class. 
For the simple NDT, you supply an integer constant like 55 when you concretize the template matrix_double. Similarly, you supply a function address as a constant when concretizing our numeric_sequence template. More generally, any constant expression can be used for a NDT.
How useful is this technique? It allows you to concretize a template using a specific function address — a specific “behavior”. It could beat a static field in some cases. For example, you can concretize a given template
* once with type Account + AccountBehavior
* once with type Student + StudentBehavior
* once with type int + intBehavior
[1] the address of a real function, not a func ptr Type.
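
Here is how I imagine it in code. This is my own sketch, not the book's listing: NumericSequence, square and triple are invented names. The function address sits in the same slot where an int like maxRows would go.

```cpp
#include <cassert>

int square(int i) { return i * i; }
int triple(int i) { return 3 * i; }

// Hypothetical template: Fn is a function address fixed at compile time,
// in the same slot where an integer constant like maxRows could go.
template <int (*Fn)(int)>
class NumericSequence {
public:
    // Every instance of one concrete class shares the same Fn.
    int elem(int pos) const { return Fn(pos); }

    int sumFirst(int n) const {
        assert(n >= 0);
        int total = 0;
        for (int pos = 1; pos <= n; ++pos) total += elem(pos);
        return total;
    }
};

// Two function addresses give two distinct concrete classes.
typedef NumericSequence<square> Squares;
typedef NumericSequence<triple> Triples;
```

Squares and Triples are unrelated types, and neither stores the function pointer per instance.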

"uninitialized" is usually either a pointer or primitive type

See also c++ uninitialized “static” objects ^ stackVar

1) uninitialized variable of primitive types — contains rubbish
2) uninitialized pointer — very dangerous.

We are treating rubbish as an address! This address may happen to be Inside or Outside this process’s address space.

Read/write on this de-referenced pointer can lead to crashes. See P161 [[understanding and using C pointers]].

There are third-party tools to help identify uninitialized pointers. I think it’s by source code analysis. If function3 receives an uninitialized pointer it would look completely normal to the compiler or runtime.

3) uninitialized class instance? Possible. Every class instance in c++ will have its memory layout well defined, though a field therein may fall into category 1) or 2) above.

Ashish confirmed that a pointer field of a class will be uninitialized by default.

4) uninitialized array of pointers could hold wild pointers

5) I think POD class instances can also show up uninitialized. See https://stackoverflow.com/questions/4674332/declaring-a-const-instance-of-a-class

pointer arg – 2 patterns

When we see a function take in a pointer argument, we should realize there are only 2 correct patterns. If neither pattern applies, then it’s likely a misuse.

 

I think this is very simple knowledge, easy to apply, easy to remember, but not everyone knows it.

 

* readonly mode – pointer to const. Function receives the object in readonly mode.

 

* update mode – pointer to non-const. Function to modify the object
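
The two modes in a tiny sketch. The Account type and the function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical type for illustration.
struct Account {
    int balance;
};

// Readonly mode: pointer to const. The function can look but not touch.
bool isOverdrawn(const Account* acct) {
    assert(acct != nullptr);
    // acct->balance = 0;   // would not compile
    return acct->balance < 0;
}

// Update mode: pointer to non-const. The function is expected to modify.
void deposit(Account* acct, int amount) {
    assert(acct != nullptr);
    acct->balance += amount;
}
```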

where does the q[&] belong: ref declare^address-of

1) & appears in a declaration: reference variable.
2) & appears outside a declaration: address-of.

When dealing with references, the “&” should go with the data type, not the variable name. I’d suggest moving the space to the right side of &, as in

T& var

You can see this clearly in typedef statements.

You see the & only in the declaration. Afterwards, you never see it.

Read it backwards …
—–
When used as address-of, obviously the & goes with the variable or function. I’d suggest moving the white space to the left side of &.

Note the address-of is an operator. You can even overload it

No comments about other uses of &

%%jargon – readonly ^ permanent handles

If we refer to pointer variables and reference variables collectively as “handles” on objects (on heap/stack), then know the difference between a permanent handle and a readonly handle. A permanent handle can’t (compiler enforcement) rebind/reseat to another object. A RO handle can’t (compiler enforcement) be used to edit the target object.

Tip: a stack object can have 0, 1 or multiple handles attached. However, a heap object must have some handle, otherwise it’s unreachable and a memory leak.
Tip: there are detached handles — null pointer and uninitialized pointer
Tip: the pointee/referent of a handle can be another handle

Now Let’s focus on permanent vs readonly handles. There are 4 basic forms — perm, RO, permRO or a plain old handle. Now since there are pointer and reference, we have up to 8 flavors, some illegal.

RO ptr? ptr to const
RO ref? ref to const, widely known as “const-ref”
perm ptr? const ptr
perm ref? the plain vanilla reference
permRO ptr? const ptr to const
permRO ref? same thing as RO ref
plain old ptr? non-const ptr to non-const
plain old ref? rebindable ref is illegal

That’s c++. How about java? pointer variables are illegal. All handles are references.
RO ref? no language support
perm ref? final variable
permRO ref? no language support
plain old ref? yes by default.
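
The legal c++ flavors side by side, as a sketch. The variable names are mine, and each line the compiler would reject is left as a comment.

```cpp
#include <cassert>

int demoHandles() {
    int a = 1, b = 2;

    int*             plainPtr  = &a;  // plain old ptr
    const int*       roPtr     = &a;  // RO ptr: ptr to const
    int* const       permPtr   = &a;  // perm ptr: const ptr
    const int* const permRoPtr = &a;  // permRO ptr: const ptr to const
    int&             permRef   = a;   // perm ref: plain vanilla reference
    const int&       roRef     = a;   // RO ref, which is also permRO

    plainPtr = &b;  *plainPtr = 20;   // reseat and edit: both allowed
    roPtr = &b;                       // reseat allowed
    // *roPtr = 3;                    // edit would not compile
    *permPtr = 10;                    // edit allowed
    // permPtr = &b;                  // reseat would not compile
    // permRoPtr = &b; *permRoPtr = 3;   // neither would compile
    permRef = b;                      // NOT a reseat: copies b's value into a
    // roRef = 5;                     // edit would not compile

    assert(&permRef == &a);           // still bound to a
    return *roPtr + *permRoPtr + roRef;
}
```

At the end a and b both hold 20, so the three reads add up to 60.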

double-ptr usage #4 – array of pointer@@

Whenever you have an array of pointer, you probably have a double-ptr. See http://stackoverflow.com/questions/5558382/double-pointers-and-arrays

In some contexts, you actually new up an array of (uninitialized) pointers, then initialize each ptr. This example is from [[Programming with Visual C++: Concepts and Projects]]

int* arr;
int** ptr;
arr = new int[8];   // 8 ints on heap
ptr = new int*[8];  // 8 uninitialized pointers on heap
for (int i=0; i<8; ++i) ptr[i] = &arr[i]; // seat each pointer

named vs nameless objects in C++, again

This is an obscure point only relevant to compilers….

In C, a local variable on stack refers to a named object. That memory location holds an int (or a char or a struct or whatever) and has a unique name like “dog1”. The variable and the object (with its address) are 2 sides of the same coin. As stated elsewhere on this blog, the variable has a limited scope and the object has a limited lifespan. Well defined rules. In C++, you can create a reference to dog1. You get an alias to the object, whose real name is still dog1. Pretty simple so far. Things get complicated with heap (aka free store or DynamicallyAllocatedMemory DAM).

If an object is a field named “age” in another object “person2” (a class/struct instance), and person2 lives on heap, then it’s fair to say this nested heap object has a name “age” [1].

If an object lives on heap (I call it a “heapy thingy”) and is Not a field, then it can’t have a name. Such an object is created by new() and new() only returns an address. Multiple pointer Variables can hold the same address, but have different variable names. Pointee object remains forever nameless.

Revisiting [1], person2 is actually a heap object identified by a pointer Variable, so another pointer Variable (say ptr2) can also refer to the same object. Therefore “person2” is just one of the pointer variable names, not the pointee object’s name since the pointee is on heap and therefore nameless.

Taking another look at the “age” field, person2->age and ptr2->age both refer to the same age object so it gets 2 names.

The nameless heap object is dominant in java (reference types) and c# (reference types). P36 [[c# primer]] puts it nicely — “Reference-type object consists of 2 parts: 1) named handle that we manipulate in our program, 2) unnamed object allocated on the managed heap….. A value type object is not represented as a handle/object pair.”

The upshot — a stack object has one controlling variable (hence a real name), whereas a heapy thingy is a puppet pulled by many strings.

3 meanings of POINTER + tip on q[delete this]

(“Headfirst” means the post on the book [[headfirst C]])

When people say “receive a pointer”, “wild pointer”, “delete a pointer”, “compare two pointers”, “include two pointer members“… a pointer can mean several things. A lot of fundamental issues lie in those subtle differences. (To keep things simple, let’s assume a 32 bit machine.)

1) 32bit Object — a pointer can occupy (heap, stack, global) memory and therefore (by ARM) qualify as a 32bit Object. Note it’s not always obvious whether a given ptr variable has its own 32-bit allocation.

  • For example, if a pointer is a field in a C struct or C# struct/class, then it pretty much has to be allocated 32 bits when the struct is allocated.
  • For example, if a double pointer p5 points to a pointer p8. Since p8 has an address it must occupy memory so p8 is by definition an object. But p5 may not be an object if without an address.

2) address — in some contexts, a “pointer” can mean an address, and we don’t care about the pointer object (if any). An address is pure Information without physical form, perhaps with zero /footprint/. When you pass myfloatPtr + 2 to function(), you are passing an address into function(). This address may not be saved in any 32-bit object. I suspect compiler often uses registers to hold such addresses. Note it’s not always obvious whether a given ptr variable has its own 32-bit allocation.

  • For example, in C an array name behaves like a const-pointer (not ptr-to-const) to an already-allocated array in memory, since in most expressions it decays to the address of the first element. For an array of 10 doubles, 640 bits are already allocated. However, compiler may not need to allocate 32 bits to hold the array name. The array name like “myArray” is like an alias of an address-of-a-house i.e. pure address not an object.
  • For example, in C if I save the address-of-something in a transient, temp variable, compiler may optimize away the stack-allocation of that variable.
  • see also headfirst.

Fundamentally, if a symbolic name is permanently attached to an Address-of-a-house (permanent alias?), then compiler need not allocate 32bit of heap/stack/global area for the symbolic name. Compiler can simply translate the name into the address. If the symbolic name rebind to a 2nd address, compiler can still avoid allocating 32 bit for it.


Whether it’s an object or a pure address, a pointer variable is a nickname. Most commonly, a pointer means a pointer variable as in (a) below. Remember a nickname exists in source code as a mnemonic for something but binding is impermanent. When we look at code snippets, we may not know whether a variable is a

  • a) nick name for a 32-bit pointer object — most common
  • b) or nick name for a pure address. An array-name like myArray in source code is such a nickname — rather confusing. Note there’s no 32-bit pointer object in myArray, beside the actual allocation for the array.
  • See also headfirst

A 32-bit pointer Object often has multiple nick names. In general, any object can have multiple nick names. If a nickname is transient or never refers to a 2nd Object, then compiler can optimize it into (b).

—- some implications —-
A resource (like a DB) — usually requires some allocation on heap, and we access the resource via a pointer. This pointer could be a pure address,  but more commonly we pass it around in a 32-bit object.

“delete this” —– When you delete a pointer, you invoke delete() on an Address, including the controversial “delete this” — The entire Object is bulldozed but “delete this” is fine because “this” is treated as an Address. However, the method that performs “delete this” must ensure the object is not allocated by malloc() and not a nonref field (resident object) “embedded in the real estate” of another heapy thingy object. In the case of multiple inheritance, it must not be embedded in the real estate of a derived class instance. See http://www.parashift.com/c++-faq-lite/freestore-mgmt.html#faq-16.15.

reference counting —- In reference counting for, say, a shared data structure, we count “handles” on our pointee. Do we count pointer Variables or pointer Objects? I’d say neither — we count Address usage. If a thread has a function1() in its top “frame” and function1() is using our target data structure we must count it and not de-allocate it with a bulldozer. This function1() does not necessarily use any 32-bit pointer object to hold this Address. It might make do without a pointer Variable if it gets this address returned from another function.

vptr and “this” pointers —- The vptr is always a pointer Object taking up 32 bit real estate in each instance (java/c# classes and Polymorphic C++ classes). How about the “this” pointer? Probably a pointer Variable (nickname) — See http://bigblog.tanbin.com/2011/12/methods-are-fields-function-pointer.html

pointer comparison —- When you compare 2 pointers, you compare the addresses represented by 2 pointer variables. Apart from the 2 pointee objects A and B, there may not be two 32bit pointer Objects in memory whose 32 bits hold A’s or B’s addresses.

smart pointer —- When you pass a raw pointer to a smart pointer, you pass an Address, not necessarily a pointer Object or pointer Variable. A smart pointer is always an Object, never a pure Address.

wild pointer —- is a pointer Address, but a dangerous address. Object at the address is now bulldozed/reclaimed. SomeLibrary.getResource() may return an Address which becomes wild before you use it. If you don’t give a nick name to that returned value, it remains a wild pointer Address.
** now I feel even a stack object can be bulldozed, creating a wild pointer. getResource() may mistakenly return a pointer to an auto variable i.e. stack object

pointer argument or returning a pointer —– I think what’s passed is an address. I call it pbclone — the address is cloned just like an int value of 9713.

Most of the time we aren’t sure if a nickname refers to a 32-bit pointer object or pure information. The prime example of a 32-bit pointer object is a pointer field in a struct.

4-layers of pointer wrapping

I was looking for a realistic scenario of multiple layers of pointer wrapping. Here’s a vector of iterators (Not a vector of vectors). Each iterator therein comes from a nested-vector.

If we get an iterator from the outer vector, we get a ptr to ptr to ptr to ptr to double.

vector<vector<smartPtr<double> >::iterator>::iterator my_itr;

1) inner-most pointer is a 32-bit raw pointer to a double (stack/heap/global) object.
2) The smart pointer is bigger, say 55-bit object holding a pointer to the 32-bit raw pointer.
3) The inner-iterator is a 32-bit pointer to the smart pointer object, since vector iterator is typically implemented as raw pointers.
4) The outer iterator my_itr is another 32-bit raw pointer to the elements of the outer vector, where each element is an inner-iterator. (Note each element is not a vector.)

How about adding an asterisk — smartPtr… ? As explained elsewhere on this blog (http://bigblog.tanbin.com/2012/04/smartptr.html) it is not my favorite construct.

double-ptr usage #2b — swap 2 pointers

(Note this is a special case of “special reseating” — http://bigblog.tanbin.com/2011/04/double-pointer-usage-2-special.html)

Q1: write a utility function to swap 2 pointer’s content.
%%A: swap(int** a, int** b) {….}
Obviously, the 2 pointer Variables must be declared to be compatible — to be elaborated [1], such as
int *a1, *b1; …. swap(&a1, &b1);

To really understand why we need double pointers, consider

Q2: function to swap 2 nonref variables. In other words, after i call swap(..), x should have the value of y. If y used to be -17.2, x now has that value. Note this number isn’t an address.

In this case, you need to pass the address of x like &x….

To understand this Q2, it’s actually important to be thoroughly familiar with

Q3: what does swap(char i, char j) do?
%%A: something other than swapping. I doubt it does anything meaningful at all. It receives two 8-bit chars by value. It doesn’t know where the original variables are physically, so it can’t put new values into those original variables, so it can’t swap them.

Any swap must use some form of pass-by-reference.

Q1b: function to swap 2 arrays. After the swap, array1 is still a pointer but it points to the physical location of array2
%%A: same signature as Q1.

[1] void ** is probably a feasible idea but may require casting. Usually the 2 arguments a1 and b1 should have the same declaration. If a1 is Animal* and b1 is Dog* (Note the single asterisks), then such a polymorphic swap is no longer straight-forward and fool-proof.
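
Q1, Q2 and Q3 in one sketch. The function names are mine, and I renamed swap to avoid clashing with std::swap.

```cpp
#include <cassert>

// Swaps the addresses held by two int pointers. The double pointers let the
// function reach the callers' pointer variables, not copies of them.
void swapPtr(int** a, int** b) {
    assert(a != nullptr && b != nullptr);
    int* tmp = *a;
    *a = *b;
    *b = tmp;
}

// Same idea one level down (Q2): swap two plain values via their addresses.
void swapVal(double* x, double* y) {
    assert(x != nullptr && y != nullptr);
    double tmp = *x;
    *x = *y;
    *y = tmp;
}

// Q3: receives clones, so the callers' variables are untouched.
void swapByValue(char i, char j) {
    char tmp = i;
    i = j;
    j = tmp;
    (void)i; (void)j;  // silence unused-value warnings; nothing escapes
}
```

After swapPtr the pointees stay where they were. Only the two pointer variables trade addresses.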

convert a reference variable into a pointer variable

You can’t declare a variable as a pointer to a reference, but we often take the address of a reference variable. I think it’s same as address of the referent.

Q: If you need to pass a pointer to a 3rd party library, but you only received a reference variable — perhaps as an function input argument, how?
A: Well, you can treat the variable as a non-ref and simply pass its address to the library.
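
A small sketch. libraryReadId() is a stand-in I invented for the 3rd party function, not a real API.

```cpp
#include <cassert>

struct Widget { int id; };

// Stand-in for a 3rd-party function that insists on a pointer (hypothetical).
int libraryReadId(const Widget* w) {
    assert(w != nullptr);
    return w->id;
}

// We only received a reference, so we pass the address of the referent.
int forwardToLibrary(const Widget& w) {
    return libraryReadId(&w);
}

// Taking the address of a reference yields the referent's address.
bool sameAddress(Widget& ref, Widget* original) {
    return &ref == original;
}
```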

 

HASA ^ ISA ^ vs PointTo in c++

In java, hasa always means point2 (or pointTo), assuming the Small object is never a primitive…. In c++, hasa means the Host object has a piece of its real estate carved out for the Small object “embedded” therein [1]. This is a profound difference from java. Here are some consequences —

* Even if you put a custom op-new into class Small to prevent heap-instantiation of Small, class Host can still instantiate a Small on heap during its own construction. See [[More effC++]]

* const (on the host object or a non-static method) means real-estate const. So the non-ref field should not change. This governs hasa, and has no effect on points-to.

* If a c++ Host object points-to a Small object, then memory leak is quite possible, so Host dtor had better delete that Small pointer. Therefore c++ points-to suffers from memory leak but no such issue with c++ HASA. Host dtor automatically calls Small dtor, in the DCB sequence (see blog post on DCB)

ISA also involves embedding. Therefore, the custom op-new in Small won’t prevent a Small instantiation on heap if Small subclass has its own op-new.

double-ptr usage #3 – vector of pointers

We know the iterator in a vector is physically a raw ptr. If the elements in a vector are pointers, then the iterator is a double ptr. P238 [[STL tutorial]] has a simple yet complete example showing how to dereference the double-ptr twice.

By the way Same example also shows how to use a vector of Shape pointers to hold sub-class instances (of base class shape), like circles, squares…., then call the virtual methods defined in the class hierarchy.
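
I don't have the book's listing at hand, so here is my own sketch of the same idea with made-up Shape subclasses. It relies on the iterator acting like a pointer to Shape*, so two dereferences are needed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical hierarchy in the spirit of the Shape example.
struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};
struct Triangle : Shape { int sides() const { return 3; } };
struct Square   : Shape { int sides() const { return 4; } };

// The iterator refers to a Shape*, so reaching the object takes two
// dereferences: *it yields the Shape*, and -> dereferences that pointer.
int totalSides(const std::vector<Shape*>& shapes) {
    int total = 0;
    for (std::vector<Shape*>::const_iterator it = shapes.begin();
         it != shapes.end(); ++it) {
        assert(*it != nullptr);
        total += (*it)->sides();   // same as (**it).sides(), virtual dispatch
    }
    return total;
}
```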

c++ reference variable is like …. in java@@

Q: a c++ reference is like a ….. in java?
A: depends (but i'd say it's not like anything in java.)

A1: For a monolithic type like int or char, a c++ reference variable is like an Integer.java variable.  Assignment to the reference is like calling a setValue(), though Integer.java doesn't have setValue().

A2: For a class type like Trade, a c++ reference is like nothing in java. When you do refVar2.member3, the reference variable is just like a java variable, but what if you do

Trade & refVar2 = someTrade; //initialize
refVar2 = …//?

The java programmer falls off her chair — this would call the implicit op=

refVar2.operator=(….)
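
To convince myself, a sketch with a hypothetical Trade whose op= counts its own invocations. Assigning through the reference overwrites the referent and never rebinds the reference.

```cpp
#include <cassert>

// Hypothetical Trade with a user-defined op= that counts its invocations.
struct Trade {
    int qty;
    int assignCount;
    explicit Trade(int q) : qty(q), assignCount(0) {}
    Trade& operator=(const Trade& rhs) {
        qty = rhs.qty;
        ++assignCount;
        return *this;
    }
};

// Returns true if assigning through the reference overwrote the referent
// (via op=) while the reference stayed bound to the same object.
bool assignThroughRef(Trade& someTrade, const Trade& other) {
    assert(&someTrade != &other);  // the demo expects two distinct objects
    Trade& refVar2 = someTrade;   // initialize: binds, no op= call
    int before = someTrade.assignCount;
    refVar2 = other;              // calls someTrade.operator=(other)
    return &refVar2 == &someTrade
        && someTrade.assignCount == before + 1
        && someTrade.qty == other.qty;
}
```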

null address ^ null-pointer-variable

@ A null address is the fake address of 0. It doesn’t exist physically. Compiler treats it differently (Don’t ask me how…)
@ A null-pointer-variable is a pointer variable holding a null address.

I think this is a source of confusion to newbies. A so-called “Null pointer” means one of these 2.

There’s just one null address, /to the extent that/ there’s just one Address 0xFF8AE0 or whatever. But there can be 5 (or 5555) null pointer variables. Note each pointer variable doesn’t [1] always occupy 32 bits (assuming a 32-bit bus), but usually does. If it does, then the pointer variable’s own address is never 0. (Anything that’s /allocated/ an address is never at Address 0 since Address 0 doesn’t exist.)

[1] I guess if a pointer variable has a brief lifespan it may live in the cache or thread register??

double ptr Usage #2 – special reseating #c#

Double pointer has many usages in c++. Usage #1 is array of pointers. Here’s Usage #2

Q: in a function, can you reseat a pointer object (32bit) created outside the function?
A: possible in C. pointer-to-pointer is a C-specific construct that can reseat the “nested” pointer.

int var3 = 3, var9 = 9;
void reseat(int ** ptr){ *ptr =  &var9; }
int main(){
    int * intp = &var3;
    reseat(&intp); //intp is a pointer-to-int, now reseated to &var9
}

C# has a similar feature in “ref” params.

In java, a reference param to a method is always a local ptr Variable pointing to the argument Object. Once you reseat this local ptr Variable, you lose the argument object. You can’t reseat the original ptr Variable declared outside your method.

void metho1(Object param){
  param=new Object(); // reseating the local copy only
}

#1pitfall in return-by-ref: pointee lifetime

See the post on using out parameter to return by ref

Q: Return by reference is very common in C++, but there’s a common pitfall, that everyone must always remember?
A: pointee object’s lifetime (not “scope”).

Q2: common, practical safeguards?
A: return a field by reference, where the host object has a longer lifetime
A: return a static object by reference, either class-static, global variables, or function-static. P150 [[NittyGrittyC++]] has a useful example.
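
Both safeguards in a sketch, with made-up names. The pitfall itself is left commented out since it would return a dangling reference.

```cpp
#include <cassert>
#include <string>

// Safeguard 1: return a field by reference. Safe while the host is alive.
class Person {
public:
    explicit Person(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Safeguard 2: return a function-static by reference. It outlives every call.
int& hitCounter() {
    static int hits = 0;
    return hits;
}

// The pitfall, left commented out: the pointee dies when the call returns.
// const std::string& broken() { std::string local = "gone"; return local; }

int demoLifetimes() {
    Person host("Ann");
    const std::string& n = host.name();  // host outlives n within this scope
    assert(n == "Ann");
    ++hitCounter();
    ++hitCounter();
    return hitCounter();
}
```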

handling pointer field during object assignment

class inner {int innerVal;};
class outer{
  private:
    int val;
    inner * inptr;
….
};
How do you overload the assignment op?

outer.val is simply bitwise copy. My solution (Sol1) for inptr is

  *inptr = *(rhs.inptr);

Q: what if the *inptr memory cells are already returned to freelist? i.e. inptr pointee is on heap and deallocated.

The suggested solution (Sol2) is

delete inptr;
inptr = new inner();
*inptr = *(rhs.inptr);

Let’s compare Sol1 and Sol2
– If we have control in place to prevent accidental deletion of the inptr pointee, then Sol1 is faster.
– If our control is weaker and only guarantees that after deletion, the 32-bit object inptr always gets ==reseated== to NULL, then Sol1 is not as safe as Sol2
– Without control, then when we want to access or delete inptr, we will worry it could be pointing to freelist, or a valid object. We dare not delete or read/write; and we don’t want it to leak either. No good solution. I would risk double-delete because memory leak is worse. Leak is harder to detect.
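
Here is one way Sol2 could look inside a full assignment operator. It is a sketch under assumptions: inptr always owns a live heap pointee, and I added a self-assignment guard and a clone-before-delete ordering that the discussion above doesn't mention. The ctor, copier, dtor and helper names are hypothetical.

```cpp
#include <cassert>

class inner {
public:
    int innerVal = 0;
};

class outer {
    int val;
    inner* inptr;   // owned heap pointee, never NULL in this sketch
public:
    explicit outer(int v) : val(v), inptr(new inner()) { inptr->innerVal = v; }
    outer(const outer& rhs) : val(rhs.val), inptr(new inner(*rhs.inptr)) {}
    ~outer() { delete inptr; }

    outer& operator=(const outer& rhs) {
        if (this == &rhs) return *this;       // guard: Sol2 would otherwise delete rhs's pointee
        inner* fresh = new inner(*rhs.inptr); // clone first, so a throwing new leaves *this intact
        delete inptr;                         // Sol2: release the old pointee
        inptr = fresh;                        // reseat to the clone
        val = rhs.val;                        // plain copy of the int field
        return *this;
    }
    int innerValue() const { return inptr->innerVal; }
    void setInner(int v) { inptr->innerVal = v; }
};

// Hypothetical demo: deep copy, then self-assignment through an alias.
int demoAssign() {
    outer a(1), b(2);
    a = b;               // a gets its own clone of b's pointee
    b.setInner(99);      // must not affect a
    outer& alias = a;
    a = alias;           // self-assignment is harmless thanks to the guard
    return a.innerValue();
}
```

demoAssign() should return 2. Without the guard, the "delete inptr" line of Sol2 would destroy the very pointee that the next line reads from whenever an object is assigned to itself.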

pointer≠address #with exceptions

See also https://bintanvictor.wordpress.com/2012/07/02/3-meanings-of-pointer-tip-on-delete-this/

Some people say “A pointer is an address”. Misleading! I’d say “pointers can sometimes be used as addresses.”

When people talk about a pointer, they mean either an address or a pointer-variable . A variable is a source code entity. After compilation it can be either an object or optimized away.

An address of the white-house is a stable, immutable text you can write on paper. But a pointer-var can be reseated, so it’s more like a “variable holding an address”.

(A non-reseatable pointer is pretty much an address. Notable examples include the “this” pointer and the vptr, both hidden fields. Update — now i think both can be reseated — in ctor and in multiple-inheritance.)

int myInt = 3;

Look at the above illustration. The variable myInt is NOT the same thing as the value “3”. Unless optimized away, myInt is both a variable and an object (see post on Mutable,initialize etc), has an address and occupies 32 bits and can change its content. Likewise, a pointer var occupies 32 bits of memory and can change Content.

Note a nonref variable like myInt is permanently seated at an address. It gets allocated 32 bits once and for all, and never re-binds to another object (same in java/c/c#). Reference variables are similar. (In contrast, a simple python variable like myInt behaves differently. See other posts.)

— I don’t think there’s another c++ construct that is as monolithic as address-of-a-house —

* qq( &myObject ) is the closest thing. It’s an address + type information. Whoever uses qq( &myObject ) needs the address and needs to know its type.
** qq( &myObject ) is an address + type and is subject to variable scope control. You can get this address only if you are in its scope. Compiler controls scope(?).
* A const pointer is also like an address-of-a-house.
* A reference doesn’t feel like an address. Granted, a reference is permanently seated to an address, but when you read the reference you don’t see that target address — You must take the address of the reference.

const ptr field ^ reference field

Nonstatic const field vs reference field — what’s different what’s similar? Many people ask this question, so here’s my take.

Rule: for both, ctor has no choice — must use initializer list.
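
The rule can be sketched like this. A minimal illustration only; Engine, Car and demoConstFields() are hypothetical names.

```cpp
#include <cassert>

struct Engine { int rpm; };

class Car {
    Engine* const cptr;  // const pointer field: cannot be reseated after construction
    Engine&       ref;   // reference field: likewise seated once
public:
    // Both fields must be set in the initializer list; assigning them in the ctor body won't compile.
    explicit Car(Engine& e) : cptr(&e), ref(e) {}
    void rev() { cptr->rpm += 1000; ref.rpm += 1000; } // the pointee itself is still mutable
    bool sameTarget() const { return cptr == &ref; }
};

// Hypothetical demo: both handles are seated on the same Engine.
int demoConstFields() {
    Engine e{0};
    Car c(e);
    c.rev();
    return c.sameTarget() ? e.rpm : -1;
}
```

demoConstFields() should return 2000, since both fields refer to the same Engine and each adds 1000.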

The 2 constructs are similar and should be studied together.

I feel reference field is less popular than a const ptr field. Reference fields are probably too rigid. Const ptr is a “bendable rule” — mutable, const_cast…

Also, as the source code author, you can decide to remove (or re-add) the const modifier, without affecting client code.

Perhaps a ref field is more useful in class templates as a local typedef?

(unwrapped) pointer assignment — double-pointer scenario 2 again

Before looking at double-pointers, let’s clarify the basic rules. For regular pointers p and q2, (assuming 32-bit),
Rule 1) Foo *p = 0; p = … // pointer p bitwise state change i.e. reseating
Rule 2) *q2 = …// pointer q2 still points to the same pointee, but pointee’s state changed via assignment operator (possibly overloaded)

Now, http://www.parashift.com/c++-faq-lite/const-correctness.html#faq-18.17 shows a double pointer
Foo **q = &p; // q holds the address of pointer p, i.e. q points to p

*q = … // q still points to the same 32-bit object, which is pointer p, but p’s content is changed, i.e. p reseated. Means the same as “p = …”
Now we are ready for

Rule 3) SomeType * ptr = … // ptr seating at initialization. This looks like Rule 2 but is really more like Rule 1.
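
The three rules and the double-pointer case can be put side by side in a small sketch. Foo here is a hypothetical one-field struct and demoRules() is a made-up name.

```cpp
#include <cassert>

struct Foo { int state; };

int demoRules() {
    Foo a{1}, b{2};
    Foo* p = 0;        // Rule 3: seating at initialization
    p = &a;            // Rule 1: reseating, p's own bits change
    Foo* q2 = &b;
    *q2 = a;           // Rule 2: q2 still points to b, but b's state is overwritten
    Foo** q = &p;      // q holds the address of pointer p
    *q = &b;           // same effect as "p = &b": p is reseated via q
    return (p == &b && q2 == &b) ? b.state : -1;
}
```

demoRules() should return 1: b received a's state under Rule 2, and p ends up reseated to b through the double pointer.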

simple example@ptr to method[[Essential c++]]

I tried repeatedly but unsuccessfully to wrap my mind around ptr to methods, until I hit P130 [[essential c++]].

Your class should have a bunch of sister methods with identical params and identical returns. Remember every ptr VARIABLE[1] exists as (compile time) symbol table entry with
– a name,
– a value (address of the pointee) and
– a declared type

This pointer var can point to any of those sister methods, but not any look-alike methods in other classes.

[1] there’s no such thing as a “pointer” — only pointer _variables_ and pointee addresses.
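
A minimal sketch of the idea, not the example from the book. Calc, its two sister methods, the CalcMethod typedef and apply() are all hypothetical names.

```cpp
#include <cassert>

class Calc {
public:
    // "Sister methods": identical params and identical return type.
    int twice(int x) const { return 2 * x; }
    int square(int x) const { return x * x; }
};

// The declared type names the class, the params and the return type.
typedef int (Calc::*CalcMethod)(int) const;

int apply(const Calc& c, CalcMethod m, int arg) {
    return (c.*m)(arg);   // a ptr to method needs an object to call through
}

int demoPtrToMethod() {
    Calc c;
    CalcMethod m = &Calc::twice;  // seat the pointer variable at one sister
    int a = apply(c, m, 5);       // 10
    m = &Calc::square;            // reseat it to another sister
    int b = apply(c, m, 5);       // 25
    return a + b;
}
```

demoPtrToMethod() should return 35. A look-alike method in another class would not be accepted, because the class name is part of the pointer's declared type.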

void ptr && TYPE of ptr

P197 [[nitty gritty]] points out that at compile time and/or runtime, “system” knows the type of each ptr object. One exception is the void ptr, whose type is undefined.

“Pointer object” means the 4-byte ptr object. This object could be a field of a class object; or a stackVar, or a global var.

Obviously as a pointer this variable holds an address. This address should[1] be that of another object. The type of the pointee object is the type of the pointer. System remembers each pointer’s type. That’s why pointer cast is a controlled operation. A compiler needs this type information to allocate memory.

What if the pointee is a derived object? See the post on AOB (ie address of basement).

In conclusion, a ptr object
* has an address of its own
* holds an address unless it’s a null or uninitialized pointer
* has a type unless it’s a void ptr
* has a name ie the variable name
* has an optional pointee-name — qq(Cat * cptr = &myCat; ). Pointee could be on heap or stack. If it’s on heap, the pointee object is always nameless but myCat could be a name attached to it subsequently.

[1] if dangling, then the pointee could be another stackframe, or any used/unused heap address.

##ptr and ref as lval/rval

(pbclone means pass-by-creating-a-clone, usually on stack)

Ptr as lval — reseating
Ptr unwrapped as lval — pointee obj state change, possibly assignment overloading

Ref as lval during initialization — seating. ie permanently saving an address into the 4-byte ref
Ref as lval but not initialization — obj state change, possibly assignment overloading

Nonref primitive as lval — obj state change, just like java primitive
Nonref class type as lval — obj state change, possibly assignment overloading

Ptr (including “new struct123”) as rval — returns address

Ptr unwrapped as rval, initializing a ref — pbref by returning address
Ptr unwrapped as rval, not initializing a ref — pbclone
.
Ref as rval, initializing another ref — pbref by returning address
Ref as rval, not initializing another ref — pbclone
.
Nonref class type var as rval, initializing a ref — pbref by returning address
Nonref class type var as rval, not initializing a ref — pbclone
.
Nonref primitive var as rval, initializing a ref — pbref by returning address
Nonref primitive var as rval, not initializing a ref — pbclone
.
nonref primitive literal like “23.5” as rval, not initializing a ref — pbclone
nonref primitive literal like “23.5” as rval, initializing a ref: illegal for a plain ref, but legal for a ref-to-const, which binds to a temporary
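
A few of the cases above, as a small sketch (demoLvalRval() is a hypothetical name). It also shows that a literal can initialize a ref-to-const even though it can't initialize a plain ref.

```cpp
#include <cassert>

int demoLvalRval() {
    int i = 5;
    int* p = &i;
    int& r1 = *p;            // ptr unwrapped as rval, initializing a ref: no clone, r1 aliases i
    int  c1 = *p;            // ptr unwrapped as rval, not initializing a ref: pbclone
    r1 = 7;                  // ref as lval after initialization: changes i's state
    // double& bad = 23.5;   // literal initializing a plain ref: won't compile
    const double& ok = 23.5; // a ref-to-const may bind to a temporary made from the literal
    return (i == 7 && c1 == 5 && ok == 23.5) ? 1 : 0;
}
```

demoLvalRval() should return 1: the clone c1 kept the old value 5 while i changed through the alias.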

ref-counting usage in STL

a1) A container of pointers usually holds ref-counting smart pointers

b1) strings are usually ref-counted in older (pre-C++11, copy-on-write) library implementations

[ Above 2 points are unrelated but they lead to the next point ]

1) I believe a vector of POINTERS uses ref-counting smart pointers, but a vector of CHAR is never ref-counted.

stackVar^heap object

In a C++ program, there is exactly
* one heap
* one stack: locals (ie auto variables). Static locals are declared inside functions but are not stored on the stack.
* one global space: globalVar is defined outside all functions. Static locals are stored here too.

An object (float, int, user-defined…) lives in one of the 3

In terms of cleanup, heap objects need delete (never automatically deallocated); global var lives forever; auto var has automatic cleanup. static locals live forever. [[ absolute c++]]

You can reasonably seat your ptr at
* globals
* static locals
* autos in main()
* heap objects

assignment in 5 kinds of c++ variables

— ptr to primitive or ptr to class type
float* newPtr = oldPtr; // copies address of the wrapped object. This is what happens in a default copier and assignment

— primitive ref assignment
overwrite object State, just like java primitive assignment.

Note if a class has a ref field and a ptr field, they behave differently during the field-by-field class assignment!

— class type ref assignment
MyClass& newRef = someRef;
newRef = oldRef; // overwrites object state, field by field

— primitive nonref assignment
obvious

— class type nonref variable
MyClass v; // ctor
v = v2; // field by field state overwrite by assignment operator
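
The pointer, ref and nonref cases can be compared in one sketch. MyClass here is a hypothetical two-field struct and demoAssignmentKinds() is a made-up name.

```cpp
#include <cassert>

struct MyClass { int a; int b; };

int demoAssignmentKinds() {
    MyClass v1{1, 2}, v2{3, 4};

    MyClass* oldPtr = &v1;
    MyClass* newPtr = oldPtr;   // pointer: copies the address only, no new MyClass created
    newPtr = &v2;               // reseats newPtr; v1 and v2 untouched

    MyClass& newRef = v1;       // seating at initialization
    newRef = v2;                // ref: overwrites v1's state field by field

    MyClass v{0, 0};
    v = v2;                     // nonref: field-by-field state overwrite

    bool ok = (oldPtr == &v1) && (newPtr == &v2) && (&newRef == &v1);
    return ok ? v1.a + v1.b + v.a + v.b : -1;   // 3 + 4 + 3 + 4
}
```

demoAssignmentKinds() should return 14. Pointer assignment only moved addresses around, whereas the ref assignment changed v1 itself.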

——various types of variables to look at
) Primitives ^ class types
) Pointer(and ref) ^ nonref types

A very detailed stroke-by-stroke on assignment overload — http://icu-project.org/docs/papers/cpp_report/the_anatomy_of_the_assignment_operator.html

(incomplete) backgrounder to C++ pointers, references and scalars

(C++ centric explanation, also applicable to other languages.)

1) An object [2] is simply a chunk of memory having
* A) address — immutable in any language
* Val) value — mutable. [1]
* name? Never! Names exist in source code not in RAM.
2a) A nonref variable is a name attached to an object. The var has no visible address of its own.
2b) A ref-variable is a pointer to an object and always, always has a
* N) name — immutable
* PA) pointee-address — mutable except C++ references
* A) address (immutable) — a ref-variable is an object and has its own address. Therefore double pointers.

A ref-variable can change its object, via re-assignment. Java and C++ differ here. Java reference variables easily change PA ie reseat, same as C++ pointers, but C++ reference can’t change PA.

[1] Note on the word CHANGE — We say “an object can change value” as “a leaf can change color”.

[2] “Object” could be an int (or float…) in general. Java talks about “Object” as non-primitive objects.

Q90: what variables can change object?

Q66a: what if we assign scalar to scalar?

Q66b: what if we assign pointer to pointer?
Q66c: what if we assign scalar to ref?
Q66d: what if we assign ref to scalar?

Q32: what if we take the address-of a variable?

Q95: what if we dereference a variable?
A: reference variables rVar1 never need it
A: a pointer p1 is a wrapper of another variable v1. Dereferencing p1
means “unwrapping” p1 or “exposing v1”. Since, *p1 === v1, I feel any
place [1] you write *p1, compiler translates it into v1.
[1] declarations use star with a completely different meaning

Q20: what if i pass address-of this variable into a function?
A: receiving parameter must be a pointer

Q21: what if i pass this variable into a function? It depends on the
receiver variable.
A: scalar receiver — pass by clone
A: reference receiver — receiver becomes alias

Q29: what if i pass a pointer variable dereferenced into a function?
A: i think it’s equivalent to passing v1 the scalar ie pass by clone

Q82: what exactly happens to the A/Val/N during assignment like “var =
some_value”
A: for a pointer — Val becomes the addr of …
A: for a scalar — Val becomes a clone of ..? See P216 of [[24]]
A: for a reference — Addr becomes the addr of …

Q66: can the address-of be an lvalue like “&var=…”? What happens to
the A/Val/N?
A: never

Q52: can the dereference be an lvalue like ” *var = …”? What happens
to the A/Val/N?
A: yes for a pointer.

in a param declaration, const and & are both "decorators"

Sound byte — in a param declaration, const and & are 2 standard “decorators”

Q: Func1(const vector<some_complex_expression> & v) { // It’s easy to get distracted by the some_complex_expression, but put it aside. What kind of param is v?
A: v is a reference to some vector object, and you can’t modify object state via this handle. It’s known as a “const reference” but
* This doesn’t mean the object is immutable.
* This only means the object is unmodifiable using this particular handle.

Note — when you pass an arg to this param, you don’t specify const and you don’t specify a reference —
vector<some_complex_expression> myVector; Func1(myVector); // the compiler will create a 32bit const ref based on the nonref variable myVector

Jargon warning — A “handle” is a loose concept, can be a pointer, a ref, or a nonref variable.
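
A small sketch of the two decorators at work, using vector<int> as a stand-in for the complex template argument. func1() and demoConstRef() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// v is a reference (no copy is made) and read-only through this handle.
std::size_t func1(const std::vector<int>& v) {
    // v.push_back(1);   // won't compile: v is a ref-to-const
    return v.size();
}

std::size_t demoConstRef() {
    std::vector<int> myVector(3, 0);       // the object itself is not const
    std::size_t before = func1(myVector);  // caller writes neither "const" nor "&"
    myVector.push_back(42);                // still modifiable via the original handle
    return before * 10 + func1(myVector);
}
```

demoConstRef() should return 34 (size 3 before, size 4 after), showing the object is only unmodifiable via the v handle.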

pimpl, phrasebook

pointer – the private implementation instance is held by (smart) pointer, not by reference or by value
** P76 [[c++codingStandards]] suggests boost shared_ptr

big3 – you often need to avoid the synthesized dtor, copier and op=. See P…. [[c++codingStandard]]

FCD – see other posts

encapsulate – the “private” class is free to evolve
** wrapper class vs a private class. The public methods in the wrapper class simply delegate to the private class. [[c++succinctly]] has a complete Pimpl example showing such “delegations”
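
The phrases above fit together roughly as in this single-file sketch. It is not the example from either book: Widget and Impl are hypothetical names, and I use C++11 std::unique_ptr as the smart pointer instead of boost shared_ptr.

```cpp
#include <cassert>
#include <memory>

// ---- would live in widget.h ----
class Widget {
public:
    Widget();
    ~Widget();                                  // declared here, defined where Impl is complete
    Widget(const Widget&) = delete;             // big3: avoid the synthesized copier
    Widget& operator=(const Widget&) = delete;  // big3: avoid the synthesized op=
    int value() const;                          // public methods simply delegate
    void bump();
private:
    class Impl;                                 // FCD only, no definition in the header
    std::unique_ptr<Impl> pimpl;                // held by (smart) pointer
};

// ---- would live in widget.cpp ----
class Widget::Impl {
public:
    int counter = 0;                            // free to evolve without touching the header
};

Widget::Widget() : pimpl(new Impl()) {}
Widget::~Widget() {}
int Widget::value() const { return pimpl->counter; }
void Widget::bump() { ++pimpl->counter; }

int demoPimpl() {
    Widget w;
    w.bump();
    w.bump();
    return w.value();
}
```

demoPimpl() should return 2. Client code sees only the wrapper, so adding fields to Impl changes nothing in the header part.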

C++ references — strict and clean rules

int i2; int & ri = i2;
Cat c; Cat & rc=c;

#1 keyword — a reference is an *ALIAS* or symlink [1] to another variable

* you can take address-of ri. looks the same as i2 i.e.&ri === &i2
* ri can be a lvalue. looks the same as i2, i.e. “ri=222” === “i2=222”. In fact, this also shows you can never reassign a reference to a second target object (different memory location).
* you never need to dereference a reference. ri looks the same as i2, without the deref operator “*”.
* when the target of the ref is a user-defined object, then the ref looks the same as the original handle on the object. See P168 [[24]].
* you can put ri on the RHS when initialize a new ref. ri works the same as i2 the original nonref variable.

[1] unix hardlinks have equal status to each other, unlike reference vs referent

A reference has no “separate identity” because it lacks an observable address. Compiler completely hides whatever address a reference occupies.

A reference must be seated at birth and will never be null or reseated. A reference is like your baby boy (Dabao). He has a mother at birth and can’t change his mother. However, a reference can become “invalid” when referent memory is reclaimed.

/*
* The following two lines won’t work because reference has to
* be initialized at the time of declaration
*/
int &debug_level_ref;
debug_level_ref = level;
– so always initialize references
– because references always need to be initialized there could be no NULL references
– references cannot be re-initialized, i.e. throughout its lifetime it will serve as
an alias to the object it was first initialized with
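
A working counterpart to the broken snippet above, as a small sketch. The variable other and the function demoReference() are hypothetical.

```cpp
#include <cassert>

int demoReference() {
    int level = 3, other = 9;
    int& debug_level_ref = level;                     // seated at birth
    bool sameAddr = (&debug_level_ref == &level);     // address-of the ref is the referent's address
    debug_level_ref = other;                          // NOT a reseat: copies 9 into level
    bool stillSeated = (&debug_level_ref == &level);  // still an alias of level
    other = 100;                                      // does not affect level or the ref
    return (sameAddr && stillSeated) ? level : -1;
}
```

demoReference() should return 9. Assigning to the reference changed level's content; it never made the reference an alias of other.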

FCD vs regular *.h files

A footnote on the post about FCD. Here’s another aha moment: the component’s *.h file is #include’d into the host class (I call it the “umbrella class”).

If the component class changes any part of its field listing including field types, umbrella “detects” it through the #include, tantamount to a text change in the umbrella file. Umbrella is marked dirty and needs a recompile.

Q: If you don’t #include the *.h but use FCD instead, can we avoid the recompile?
A: not sure. How about sizeof(Umbrella)? can compiler calc it if you use FCD? See Item 34 eff c++. I believe the trick borrowed from java is a pointer field, so sizeof(Umbrella) is independent of sizeof(Component)
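
The sizeof question can be checked with a single-file sketch, assuming Umbrella holds the component only through a pointer field. The field names and demoFcd() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

class Component;                 // FCD: no #include of the component header needed

class Umbrella {
public:
    explicit Umbrella(Component* c) : comp(c) {}
    Component* comp;             // pointer field: its size is known without Component's definition
};

// Captured while Component is still an incomplete type.
const std::size_t umbrellaSizeBefore = sizeof(Umbrella);

// Later (or in another file) the full definition appears. Changing its field
// listing does not change sizeof(Umbrella).
class Component {
public:
    double a, b, c;
    int id;
};

bool demoFcd() {
    return umbrellaSizeBefore == sizeof(Umbrella)
        && sizeof(Umbrella) == sizeof(Component*);
}
```

demoFcd() should return true on mainstream compilers. So with a pointer field the compiler can compute sizeof(Umbrella) from the FCD alone, which suggests the umbrella header needn't change when the component's fields change.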

pbref/pbclone: indistinguishable by func call (restriction on overloading)

At the end of a function, you often see “return someNonRefVar” or “return *ptr”, or “return someRef”. Such a return can either return by ref or return by clone, but which one exactly? Answer is in the func prototype. Return by ref is always indicated there, not in return statements.

Quiz: by looking at the return statement itself, can you tell if it’s return by clone or return by ref?
A: if it returns a non-var like “return amount + 3.22” ==} can’t be pbref.
A: if it returns a local object ==} can’t be pbref as the local object would self-destroy right away. But see post on RVO

It’s instructive to compare with argument passing syntax. Calling code doesn’t need to know to call by clone or call by ref. The func call looks identical [1] — just put vars in the parentheses. To tell a call-by-ref func from a call-by-clone func, Key is the func prototype.

Q: Can 2 otherwise identical functions overload each other simply based on the “&” in a param?
A: no, not usefully. Declaring both compiles, but a call passing a variable is ambiguous, so the compiler can’t “bind” the function call. I tested in gcc. Also covered in IKM test.

[1] except you can’t pass non-vars in pbref.
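
A sketch of two functions whose return statements are identical and whose prototypes alone decide clone vs ref. g_amount, byClone(), byRef() and demoReturnStyle() are hypothetical names.

```cpp
#include <cassert>

int g_amount = 10;

int  byClone() { return g_amount; }  // same return statement...
int& byRef()   { return g_amount; }  // ...only the prototype differs

int demoReturnStyle() {
    g_amount = 10;       // reset so the demo is repeatable
    int c = byClone();
    c += 1;              // changes only the clone
    byRef() += 5;        // changes g_amount itself
    // byClone() += 5;   // won't compile: the returned clone of an int is not an lvalue
    return g_amount * 100 + c;
}
```

demoReturnStyle() should return 1511: g_amount became 15 through the returned reference, while the clone went from 10 to 11 on its own.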

a C/C++ pointer variable has a target-data-type

(see also post on void ptr)
Consider a simple declaration

  int *intp;

Q1: If a pointer variable holds an address, why is it necessary (to compilers) to attach a data type to a pointer variable? From then on the intp variable will be treated always, always as a pointer to a __int__.
A1: when an address is assigned to this pointer, the address is treated as the starting address of a block of memory. An essential info is the size of the block. A wild compiler can treat 8 bytes starting at that address as an object, or 888888 bytes!
A1: every operation on “intp” must be valid for integers. Compilers must check that.

To further answer this question, consider a more basic question —

  int intVar;

Q2: why is it necessary to attach a data type to a nonref variable? From then on the intVar variable will be treated always, always as an int.
A2: I think compiler (yes compiler) must allocate the right amount of memory for the object
A2: compiler must access-check every operation on it. Compiler won’t allow concatenation for an int, right?

Q3: how large is the intp variable itself, if it has to hold both the address and the type?
A3: 32 bits. I believe the data type is not “carried” by the intp variable so doesn’t increase its size. I think it may be a compile-time information rather than runtime information. At runtime, the data type of the pointer is lost and not checked by anyone
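
A1 and A3 can be illustrated with a small sketch (function names are hypothetical). The pointee type decides how far “intp + 1” moves, yet the pointer object itself is no bigger for having a type.

```cpp
#include <cassert>
#include <cstddef>

std::ptrdiff_t strideOfInt() {
    int arr[2] = {1, 2};
    int* intp = arr;
    // intp + 1 advances by sizeof(int) bytes: the compiler uses the pointee type here.
    return reinterpret_cast<char*>(intp + 1) - reinterpret_cast<char*>(intp);
}

bool pointerCarriesNoTypeTag() {
    // On mainstream platforms object pointers have the same size whatever the pointee type
    // (an observation about common compilers, not a guarantee of the standard).
    return sizeof(int*) == sizeof(double*) && sizeof(int*) == sizeof(void*);
}
```

strideOfInt() should equal sizeof(int), and pointerCarriesNoTypeTag() should be true on common platforms, consistent with the type being compile-time information.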

allocate ptr/ref on heap@@

I don’t think finance app developers need this level of understanding, but for my curiosity…

Q1: do we sometimes create a ptr or reference on heap?
A: I think so, if your object has a 32-bit pointer field, or (less commonly) a reference field, and you create the object on heap. This is quite common in C++. Extremely widespread in java, since most non-trivial objects need pointer fields to be useful. The pimpl pattern does the same in c++.

Q1b: Apart from that, is there any way to create a ptr on heap?
A: I don’t think so. Here are a few cases —

— stackVar as a ptr? —
As a so-called auto variable, the 32-bit storage is AUTOmatically deallocated. Note the pointee could very well be on the heap, and possibly leaks memory when the pointer disappears.

— static_casting a heap ptr of Q1? —

   Type1* var1 = static_cast<Type1*>(obj3.field2);

See post on casting-creates-new-pointer for more details. In the above context, a new 32-bit pointer is allocated on the stack. var1 and obj3.field2 point to the same pointee, but they each occupy 32 bits.

pointer-casting creates new pointer@@

With some exceptions[1], c/c++ casts operate primarily on pointers (and references). That raises the question:
Q1: does pointer casting allocate 32 bits for a new pointer (assuming 32-bit addressing)?
A: depends on context

I feel the more basic question is

Q0: does pointer initialization allocate 32 bits for a new pointer?
A: i think so:

SomeType* p2 = p1; // allocates 32 bits for a new pointer p2, unless optimized away

Now back to our original Q1, i feel casting is like a function that returns an address in a specific “context” — address returned must be used as a Type3 pointer:

(Type3*) p1;

In this case, if you don’t use this address to initialize another pointer, then system doesn’t allocate another pointer. But usually you put the cast on the RHS. The LHS therefore determines whether allocation is required

Type3* p3 = (Type3*) p1; // 32-bit allocated
////////////////
Type3* p3=0; // this pointer could be on stack or heap
p3 = (Type3*)p1; // no allocation since p3 already allocated.

As an extension of the LHS/RHS syntax, programmers also use function call syntax:

myFunction(123, (Type3*)p1); // cast returns an address, to be saved in a temporary 32-bit pointer on the stack. This is more like pointer initialization.

[1] static_cast often operates on nonref variables, by implicitly invoking cvctor or OOC

ref2const object — everywhere

C++ uses the r2c idiom in many strategic areas — note-worthy

* typeid operator returns a ref to a constant type_info object
* catch(……) often uses a r2c exception object, _BUT_ http://www.parashift.com/c++-faq-lite/exceptions.html#faq-17.15 shows a non-const usage
* op<<
* cvctor
* operator-overload-converters
* op=
* copy ctor: if u drop the “&”, passing the param by value would itself need the copy ctor (infinite recursion), so compilers reject the declaration
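
Three of the r2c spots listed above (copy ctor, op=, op<<) in one sketch. Trade is a hypothetical one-field class and demoR2c() is a made-up name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

class Trade {
    int qty;
public:
    explicit Trade(int q) : qty(q) {}
    Trade(const Trade& rhs) : qty(rhs.qty) {}                           // copy ctor takes a r2c
    Trade& operator=(const Trade& rhs) { qty = rhs.qty; return *this; } // op= takes a r2c
    int quantity() const { return qty; }
};

// op<< takes the object by r2c, so printing can't modify it and makes no copy.
std::ostream& operator<<(std::ostream& os, const Trade& t) {
    return os << "Trade(" << t.quantity() << ")";
}

std::string demoR2c() {
    Trade a(5), b(7);
    a = b;          // op=
    Trade c(a);     // copy ctor
    std::ostringstream oss;
    oss << c;       // op<<
    return oss.str();
}
```

demoR2c() should return the string Trade(7).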

to delete: heap-stack dual variable #ptr/ref

Common scenario — you declare a local ptr variable and assign it q(new int), you get a heap-stack var. The object lives on the heap but the ptr var lives on the stack (unless the variable itself is optimized away).

Unfortunately, when the ptr var goes out of scope, you lose the address forever and can’t free the memory.

One best practice is to always new() and delete() an object in the same func, or use smart ptr
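
Both practices in a minimal sketch. The function names are hypothetical, and std::unique_ptr (C++11) is just one possible smart ptr.

```cpp
#include <cassert>
#include <memory>

int sameFuncNewDelete() {
    int* p = new int(41);   // pointee on heap, p itself on the stack
    int result = *p + 1;
    delete p;               // freed in the same function that allocated it
    return result;
}

int smartPtrVersion() {
    std::unique_ptr<int> p(new int(41)); // the smart ptr owns the heap int
    return *p + 1;                       // the heap int is freed when p goes out of scope
}
```

Both functions should return 42. In the second one there is no delete to forget, even if an early return or an exception is added later.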

pointer is usually to the heap@@ !! in CSM project

I feel in general a pointer to a stack object is 2nd best as it may become a stray pointer. However, in my CSM data validation, statistical, visualization system, I used many arrays and strings without malloc().

a pointer to a global object is possible, but I didn’t use a whole lot of globals.

I guess my CSM team constructed many local arrays in outer functions (main()), and pass them into nested functions, so they stay in scope (throughout). Pointer to local is fine.

Also, static local variables were probably the 2nd popular choice we took.

In conclusion, we probably did use pointer-based collections but without heap/malloc(). Pointers to static locals, globals and main() method locals.

Also note function pointers point to neither heap nor stack.

##varieties of c++ variables — 4 dimensions

To understand the wide variety of C++ variables, consider…
— about the object —
1) Heap (=>pointer) ^ stack
2) Primitives ^ class types
— about the variable —
3) Pointer(and ref) ^ nonref types
4) locals ^ fields — globals are rare
First 3 dimensions make 8 types
* Pointer/ref to Heap primitive
* pointer/ref to Heap class type
* pointer/ref to anything on stack is questionable[1]
* class type on heap but not accessed by pointer? impossible
* class type on stack and not accessed by pointer
* primitive on heap but not accessed by pointer? impossible
* primitive on stack and not accessed by pointer

[1] but how about simple c-strings. I used many of them without malloc(). Some arrays do live on the stack!

Nonref types can be pbclone or pbref. Pointers should be passed as …pointers.

Vast majority of variables are local variables or fields. By default locals live on the stack, but a local ptr can point to a heap object. After out-of-scope, the heap leaks….;)

c++ nested class has NO ptr to enclosing-class object

See also the post in the recrec blog that’s dedicated to the same topic.

Q: Does a non-static nested class have an implicit pointer to the enclosing-class object [1]?

Specifically, say Class O defines a non-static nested class N, and has a non-static field n1 of type N. Given an object o2 of type O, in java, n1 has an implicit pointer to the o2 object. Not in C++. See P790 [[c++ primer]] and P187 ARM.

[1] and can access the non-static fields and methods of the enclosing object?
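
Since C++ gives the nested object no implicit pointer, you have to pass one in yourself. A sketch using the O, N, n1 and o2 names from above; the owner field, ownerId(), Bare and demoNested() are hypothetical.

```cpp
#include <cassert>

class O {
public:
    class N {
    public:
        O* owner;                        // explicit back-pointer; nothing implicit exists
        explicit N(O* o) : owner(o) {}
        int ownerId() const;             // reaches O's fields only through owner
    };
    int id;
    N n1;
    explicit O(int i) : id(i), n1(this) {}
};

int O::N::ownerId() const { return owner->id; }

// A nested class with a single char field: no room for a hidden pointer.
class Bare {
public:
    class Inner { public: char c; };
};

int demoNested() {
    O o2(7);
    return o2.n1.ownerId();
}
```

demoNested() should return 7, and sizeof(Bare::Inner) is 1, so the nested object carries nothing besides its declared fields.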

Q: what’s the difference between static vs non-static nested class in c++?

ptr-ref layering #reference to pointer

Update http://markgodwin.blogspot.sg/2009/08/c-reference-to-pointer.html is a detailed blog post

Background — I doubt you ever need to know these arcane data types, but these are at work whenever you specialize a class template using a pointer as the “T”.

int * ptr2int;
int * & ref2ptr = ptr2int; // ok

int **** ptr;
int ****& ref = ptr; // ok

Real Case in point from EffC++ : The default allocator of vector (and many containers) has a nested typedef for T* (called “pointer”) and a nested typedef for T& (called “reference”). Now if you instantiate a vector using a reference type as the “T”, you die, because

The pointer typedef won’t compile, because you can’t declare a pointer to a reference; [1]
The reference typedef won’t compile, because you can’t declare a reference to a reference.

However, You can have a ref to a pointer to a pointer to a pointer to a pointer….

[1] However, we often take address of a reference variable. See https://bintanvictor.wordpress.com/2012/04/04/convert-a-reference-variable-to-a-pointer-variable/

Now the rule on reference-pointer layering –

  • If you start with a nonref variable, you can add “pointer” layers over and over, but any “reference” layer must be the last layer.
  • In other words, once you “reference” anything, you can’t layer on top of it.
    • You can declare a variable “pointing to” anything, so long as there’s not already a reference layer
    • You can declare a variable “referencing” anything, so long as there’s not already a reference layer
    • Pointer-to-reference is illegal as a variable type, but you can take the address of a reference!
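
The layering rule in a small sketch. reseat() here takes a reference-to-pointer instead of a pointer-to-pointer; demoLayering() is a hypothetical name.

```cpp
#include <cassert>

int var3 = 3, var9 = 9;

// Reference-to-pointer param: the reference layer is the last layer.
void reseat(int*& ptr) { ptr = &var9; }

int demoLayering() {
    int* intp = &var3;
    reseat(intp);              // caller's pointer is reseated, no "&intp" needed
    int**  pp  = &intp;        // pointer layers can stack up
    int**& rpp = pp;           // a reference may sit on top of them
    int&   ri  = var3;
    int*   addr = &ri;         // taking the address of a reference yields &var3
    // int&* bad;              // pointer-to-reference as a type: won't compile
    return (addr == &var3 && *rpp == &var9) ? **rpp : -1;
}
```

demoLayering() should return 9. Note that addr is an ordinary pointer-to-int, not a pointer-to-reference.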