STL iterator invalidation rules, succinctly

http://www.martinbroadhurst.com/iterator-invalidation-rules-for-c-containers.html is concise with explanations. Specifically,

  • list insertions don’t invalidate any iterator. I think of a list iterator as a pointer to a node, and insertion never moves existing nodes.
  • tree insertions (set, map and their multi- variants) don’t invalidate any iterator. Same reason as list.
  • for erasure from a list or a tree, only iterators to the erased node are invalidated.

Now for vector:

  • vector insertion invalidates every iterator at or after the insertion point. If the insertion exceeds vector.capacity() and triggers a reallocation, then all iterators are invalidated.
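The rules above can be sketched in a few small functions. This is only an illustration; the function names are made up for this post. It shows a list iterator surviving insertions, a vector position being kept as an index and re-fetched after growth, and reserve() being used so that push_back does not reallocate.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// A list iterator keeps referring to the same node after insertions.
bool list_iterator_survives_insert() {
    std::list<int> l{10, 20, 30};
    auto it = std::next(l.begin());   // refers to 20
    l.insert(it, 15);                 // insert before 20
    l.push_front(5);
    l.push_back(40);
    return *it == 20;                 // existing nodes were not moved
}

// For a vector, keep an index across insertions and build a fresh
// iterator afterwards, because a reallocation invalidates all iterators.
int vector_value_after_growth() {
    std::vector<int> v{1, 2, 3};
    std::size_t idx = 1;              // position of the value 2
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);               // almost certainly reallocates along the way
    }
    auto it = v.begin() + idx;        // obtained after the growth
    return *it;
}

// With enough capacity reserved, push_back does not reallocate, so an
// iterator before the insertion point stays valid.
int vector_value_with_reserve() {
    std::vector<int> v;
    v.reserve(8);
    v.push_back(1);
    v.push_back(2);
    auto it = v.begin();              // refers to 1
    v.push_back(3);                   // size 3 <= capacity 8: no reallocation
    return *it;
}
```

Holding an index instead of an iterator is the simplest defence when you cannot control whether the vector grows.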

declare iterator]function template #gotcha

(Needed in some coding interviews and also in GTD!)

Update: With c++11, you can use the “auto” keyword and avoid the complexity.

If you drop the “typename” from the for-loop header, then the compiler is confused:

error: dependent-name ‘std::multiset<T>::iterator’ is parsed as a non-type, but instantiation yields a type
note: say ‘typename std::multiset<T>::iterator’ if a type is meant

Basically, we need to be extra explicit to the confused compiler.

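#include <iostream>
#include <set>
using namespace std;

// "typename" tells the compiler that the dependent name is a type.
// const_iterator is used because the multiset is passed as const.
template<typename T> ostream & operator<<(ostream & os, multiset<T> const & l){
  for(typename multiset<T>::const_iterator it = l.begin();
      it != l.end(); ++it){
        os<<*it<<" ";
  }
  os<<endl;
  return os; // return the stream so that calls can be chained
}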
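As the update says, C++11 "auto" sidesteps the problem. Here is a rough sketch of the same loop written two ways; print_all and print_all_range_for are names I made up for illustration.

```cpp
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// C++11: auto deduces the iterator type, so no typename is needed.
template <typename T>
void print_all(std::ostream& os, std::multiset<T> const& s) {
    for (auto it = s.begin(); it != s.end(); ++it) {
        os << *it << " ";
    }
    os << '\n';
}

// The same loop as a range-based for; the iterator is hidden entirely.
template <typename T>
void print_all_range_for(std::ostream& os, std::multiset<T> const& s) {
    for (auto const& item : s) {
        os << item << " ";
    }
    os << '\n';
}
```

Both versions print the elements in sorted order, each followed by a space, then a newline.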

iterators: always pass by value (pbclone), not by reference (pbref) or by pointer

Iterators, as generalizations of pointers, are expected to be cheap to copy. The copy cost is never raised as an issue.

Iterators are usually passed by value. Sutter and Alexandrescu recommended (P154 in their book) putting iterators into containers rather than putting pointers into containers. Containers would copy the iterators by value.

someContainer.end() returns a temporary object, so taking its address is an error. The iterator returned from end() should be passed by value.
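A small sketch of both habits, with hypothetical function names: count_steps takes its iterators by value and advances its own copy, and positions_of_even stores iterators by value in a container.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Iterators are taken by value: the function advances its own copy,
// so the caller's iterator is left untouched.
template <typename Iterator>
std::size_t count_steps(Iterator first, Iterator last) {
    std::size_t n = 0;
    while (first != last) {
        ++first;
        ++n;
    }
    return n;
}

// Iterators stored by value in a container (here: positions of even values).
std::vector<std::list<int>::iterator> positions_of_even(std::list<int>& l) {
    std::vector<std::list<int>::iterator> out;
    for (auto it = l.begin(); it != l.end(); ++it) {
        if (*it % 2 == 0) {
            out.push_back(it);   // push_back copies the iterator
        }
    }
    return out;
}
```

Note that l.end() is passed straight into count_steps as a temporary, which works only because the parameter is by value (or by const reference).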

Someone online said that an argument taken by value is usually easier for the compiler to optimize. Compare the advantage of passing function objects by value over passing function pointers: a call through a function object can typically be inlined. The same reasoning supports by-value iterator parameters.

Note that Java/C# object arguments are predominantly passed as references.

calling same method on unrelated objects: c++template outshines java

Suppose pure abstract class Animal has a die() method, and so does Project, Product, Plant and Planet, but they don’t share a base class. How would you write a reusable function that invokes this method on a generic input object, whose type could be any of them?

Java can’t do this. In C++ you create

template<typename T> void f1(T input){ input.die(); }

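If you pass an int into f1(), you get a compile-time error when the template is instantiated, because int has no die() member. It is not a linker error. It is not SFINAE either: the failure is inside the function body rather than in the substitution of the signature, so it is a hard error.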
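Here is a minimal sketch of the idea. Project and Planet are stand-ins for the unrelated classes above, and I take the argument by reference (unlike the one-liner) only so that the effect of die() is visible to the caller.

```cpp
#include <string>

// Two unrelated types: no common base class, only a same-named method.
struct Project {
    bool dead = false;
    void die() { dead = true; }
};

struct Planet {
    std::string state = "alive";
    void die() { state = "dead"; }
};

// Works for any type that has a die() member callable with no arguments.
template <typename T>
void f1(T& input) {
    input.die();
}

// int n = 0;
// f1(n);   // would not compile: int is not a class type and has no die()
```

The check happens per instantiation, so each call site is verified at compile time against the actual argument type.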

STL algorithms routinely take an iterator argument and then call an operator such as operator>() on it. Now, this operator is undefined for a lot of iterator types. Only the RandomAccessIterator category supports the relational operators <, >, <= and >=.

Q: So how does STL make sure you don’t pass an object of the ForwardIterator category into such a function?
A: It uses the template parameter name (a dummy type name) as a hint. Instead of the customary “T”, the authors put a suggestive dummy type name like “BidirectionalIterator” or “InputIterator”. If you ignore the hint, you get a compile-time error as soon as the algorithm uses an operation your iterator lacks.

Now we understand that STL iterators have no inheritance hierarchy, but “stratification” among them.
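To see the hint in action, here is a tiny function template of my own (comes_after is not an STL name). The parameter name promises nothing to the compiler; the body's use of operator> is what decides which iterators are accepted.

```cpp
#include <vector>

// The name RandomAccessIterator is only a hint to the human reader.
// What really restricts the argument is the use of operator> below.
template <typename RandomAccessIterator>
bool comes_after(RandomAccessIterator a, RandomAccessIterator b) {
    return a > b;
}

// With a std::list<int> l, the call
//     comes_after(l.end(), l.begin());
// would fail to compile, because list iterators do not define operator>.
```

Vector iterators and raw pointers both pass, since both support the relational operators.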

y create custom STL iterators, briefly

Q: when would you write your own STL iterator?

I think if you create your own customized container, you probably need custom iterator Objects. You probably need to return such an Object from rbegin() etc.
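As a rough sketch of what that involves, here is a toy container of my own invention, IntRange, which represents the half-open range [lo, hi) without storing the numbers. Its nested iterator is a minimal input iterator, just enough for range-based for and for algorithms like std::accumulate.

```cpp
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical toy container: the integers lo, lo+1, ..., hi-1.
class IntRange {
public:
    class iterator {
    public:
        // Member types that std::iterator_traits looks for.
        using iterator_category = std::input_iterator_tag;
        using value_type = int;
        using difference_type = std::ptrdiff_t;
        using pointer = const int*;
        using reference = int;          // values are produced on the fly

        explicit iterator(int v) : value_(v) {}

        reference operator*() const { return value_; }
        iterator& operator++() { ++value_; return *this; }
        iterator operator++(int) { iterator old = *this; ++value_; return old; }
        bool operator==(const iterator& o) const { return value_ == o.value_; }
        bool operator!=(const iterator& o) const { return value_ != o.value_; }

    private:
        int value_;
    };

    IntRange(int lo, int hi) : lo_(lo), hi_(hi) {}
    iterator begin() const { return iterator(lo_); }
    iterator end() const { return iterator(hi_); }

private:
    int lo_;
    int hi_;
};
```

With begin() and end() in place, "for (int x : IntRange(1, 5))" visits 1, 2, 3, 4, and std::accumulate(r.begin(), r.end(), 0) sums them.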

In python, c# and (less commonly) in java, many data structures can be made iterable. There’s an implicit iterator.

[[STL tutorial]] has a 3-page chapter showing a debuggable-iterator that reveals interesting details of the inner workings of STL containers and STL algorithms.

inputIterator category — unneeded?

Q: why is there a category “Input Iterator” at all, and where is it used? (Output iterator is a similar story, so this post will omit it.)

I think the only major use is input stream.

There are some algorithms like std::merge(…) that require a tiny *subset* of a C pointer's full capabilities.

To make merge() useful on an input stream, the STL authors put the dummy type name “InputIterator” into the merge() template declaration as a *hint*: the implementation uses only that *subset* of pointer capabilities. This is a hint to containers:

“Hey Containers, if you have an iterator capable of dereference-then-read, then you can use me.”

It turns out that the iterators of all standard containers have that capability. A pure output iterator, such as an output stream iterator, does not.
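To make this concrete, here is a sketch that feeds two input streams directly into std::merge through std::istream_iterator. The helper name merge_streams is mine; it assumes both streams hold whitespace-separated integers that are already sorted.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Merge two sorted streams of ints in a single pass over each stream.
// std::merge only needs dereference-then-read, ++ and != on its inputs,
// which is exactly what an input iterator offers.
std::vector<int> merge_streams(std::istream& a, std::istream& b) {
    std::vector<int> out;
    std::merge(std::istream_iterator<int>(a), std::istream_iterator<int>(),
               std::istream_iterator<int>(b), std::istream_iterator<int>(),
               std::back_inserter(out));
    return out;
}
```

A default-constructed std::istream_iterator<int> acts as the end-of-stream marker, and std::back_inserter plays the output iterator role.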

iterator = simple smart ptr

– Iterators are *simple* extensions of raw pointers, whereas
– smart pointers are *grand* extensions of pointers.

They serve different purposes.

If an iterator is implemented as a class (template) then it usually defines
– operator=
– operator==, >=
– operator* i.e. dereference
– operator++, --
– operator+, += i.e. long jumps
– ...

However, an iterator class (template) typically won't define any operation unsupported by raw pointers, therefore no named member functions! Smart pointers do define public constructors, get() etc.
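A quick sketch of the contrast, using names I made up. The sum template touches its iterators only through operators that raw pointers also have, so it accepts raw pointers and vector iterators alike. The smart pointer, by contrast, is used through a named member function.

```cpp
#include <memory>
#include <vector>

// Uses only !=, ++ and * on the iterators: all legal on raw pointers too.
template <typename Iterator>
int sum(Iterator first, Iterator last) {
    int total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

int sum_array() {
    int a[] = {1, 2, 3};
    return sum(a, a + 3);               // raw pointers as iterators
}

int sum_vector() {
    std::vector<int> v{1, 2, 3};
    return sum(v.begin(), v.end());     // class-type iterators, same syntax
}

int smart_pointer_demo() {
    std::unique_ptr<int> p(new int(7)); // public constructor
    int* raw = p.get();                 // named member function
    return *raw;
}
```

The same sum template would also work on list or set iterators, since it never asks for more than an input iterator provides.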

iterator categories, stratification, hint..

Iterator categories are a /stratification/ of the 5 to 10 standard Capabilities of raw pointers in the C language. Stratification — like deposits in a lake.

The dereference-then-Read operation is part of Input-Iterator category, which lacks the dereference-then-Write capability.
The dereference-then-Write operation is part of output Iterator category, which lacks the dereference-then-Read.
The ++ operator is available in every category; the -- operator is added by the bidirectional-iterator category
The += and -= (jumps) and greater-than etc operations belong to the topmost stratum, the random access iterator category

There are a total of 5 categories, arranged in layers. (Some say “hierarchy” — a bit misleading). [[STL tutorial and reference]] has detailed coverage of this technicality, which is too technical for other books.

Iterator categories are not some kind of inheritance hierarchy.

Iterator categories are not some kind of typedef.
 
Q: So how do you write code to enforce what category of iterator you require?
Answer: no way, at least not through the name. The compiler has absolutely no idea what ForwardIterator means as a parameter name. A category name is no different from any other dummy type name in a template declaration. To the compiler, template<typename Bidirectional_Iterator,…> is no different from template<typename T,…>.
A: you can only “hint“, not enforce, the “kind” of iterator you require.
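One caveat worth sketching: the parameter name enforces nothing, but the standard library also attaches a category tag type to every iterator, reachable through std::iterator_traits. An algorithm author who wants a clearer error message can test that tag. The helper names is_random_access and middle below are my own, and this is only one possible approach.

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// True if the iterator's category tag is (or derives from)
// std::random_access_iterator_tag.
template <typename It>
constexpr bool is_random_access() {
    return std::is_base_of<
        std::random_access_iterator_tag,
        typename std::iterator_traits<It>::iterator_category>::value;
}

// Returns an iterator to the middle element; rejects weaker iterators
// with a readable message instead of an error deep inside the body.
template <typename It>
It middle(It first, It last) {
    static_assert(is_random_access<It>(),
                  "middle() requires a random access iterator");
    return first + (last - first) / 2;
}
```

Calling middle() with list iterators then fails at the static_assert, while vector iterators and raw pointers are accepted.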

Q: how does an algorithm's author indicate what type of iterator he needs?
A: (This is what STL authors did) Be creative — use special names to name the dummy types in template. Remember all STL algorithms are function templates. If you look at the specification (not “declaration”) of STL algorithms, the dummy types all come with special names. Don't just rename them with the customary T or S.

You appreciate the categories after you start writing STL-style function templates.

Note that constness of an iterator is completely unrelated to the 5 categories. No STL algorithm specification mentions const in the dummy type names.

y list iterator can’t jump (+=, -=)

Q: Given the STL list iterator supports increment (++), why not make it support jumps? This way, list iterators can be used with algorithms like sort(), right?

P64 of [[STL tutorial and reference]], written by one of the 3 STL inventors, made it clear —

– It’s not about feasibility — adding jump is feasible but not a great idea.
– It’s all about efficiency. Implementing jumps using increment is “faking it”. The sort() algorithm would become extremely inefficient when given a fake random-access iterator.
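In practice this means two things, sketched below with helper names of my own. A "jump" on a list is still possible through std::next, but it walks node by node, so it costs linear time. And sorting goes through the list's own sort() member, because std::sort(l.begin(), l.end()) does not compile for list iterators.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Reaching the n-th element of a list: std::next performs n increments
// for a bidirectional iterator, so this is O(n), not O(1).
// Assumes n < l.size().
int nth(const std::list<int>& l, std::size_t n) {
    return *std::next(l.begin(), static_cast<std::ptrdiff_t>(n));
}

// Sorting a list uses the member function, which relinks nodes
// instead of relying on random access.
std::list<int> sorted_copy(std::list<int> l) {
    l.sort();
    return l;
}
```

If an algorithm called nth() repeatedly inside its loops, as a random-access sort would, the hidden linear cost would multiply, which is the inefficiency the book warns about.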