next_perm@N color socks #complex #100%

A common IDE coding challenge — given x pairs socks of various colors, generate all the permutations, in ascending order. Each color has a value.

–Solution 1: std::next_permutation() and prev_permutation()

–solution 2: I can probably write my own next_perm(). Using This function we can generate an ascending sequence of permutations starting from the current content of a vector.

https://github.com/tiger40490/repo1/blob/cpp1/cpp/combo_permu/nextPerm.cpp is my iterative solution, but should use lower_bound()

https://github.com/tiger40490/repo1/blob/py1/py/combo_perm/nextPermOfNSocks.py is my python solution, overcoming the lower_bound problem.

asymmetry lower_bound^upper_bound #IFF lookup miss

In both set::lower_bound() and std::lower_bound() —

For a “perfect” hit , the return value is equivalent to the target; whereas upper_bound is strictly higher than target. See

To achieve symmetry, we need to decrement (if legal) the iterator returned from upper_bound.
———-
If no perfect hit, then lower_bound() and upper_bound() both give the next higher node, i.e. where you would insert the target value.

#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;

vector<int> v{1,3,5};
int main(){
  vector<int>::iterator it;
  it = lower_bound(v.begin(), v.end(), 2); cout<<*it<<endl;
  it = upper_bound(v.begin(), v.end(), 2); cout<<*it<<endl;
}

multiple hits: lower_bound gives Earliest

Looking for lower_bound (2) in {0,1,2,2,3,4}, you get the earliest perfect hit among many, i.e. the left-most “2”.

No such complexity in upper_bound since upper_bound never returns the perfect hit.

No such complexity in set.lower_bound since it won’t hold duplicates.

int main(){
  vector<int> s{0,1,2,2,3,4};
  vector<int>::iterator it = lower_bound(s.begin(), s.end(), 2);
  cout<<"to my left: "<<*(it-1)<<endl;
  cout<<"to my right: "<<*(it+1)<<endl;
  cout<<"to my right's right: "<<*(it+2)<<endl;
}

lower_bound() may return end() #gotcha

lower_bound() may return end().

If your target value is too high and nothing qualifies, all 6 functions below return the right end of the range. If you look at the (key value in) return value i.e. end-of-range,

  • This end-of-range node is a dummy. Never read its key value.
  • After calling lower_bound or upper_bound, always validate before reading the return value

I spent hours puzzled by the wrong data returned after lower_bound()->first. Specifically, if the map/set is keyed by integer, then the end()->first can be normal int value even when it fails and returns map.end()!

Consistent across 6 functions:

  • std::lower_bound
  • std::upper_bound
  • set/map methods

What if the target value is too low? Easier — upper bound should return left boundary iterator, and lower_bound returns the same iterator! See https://github.com/tiger40490/repo1/blob/cpp1/cpp1/miscIVQ/curveInterpolation_CVA.cpp

 

q[std::less] functor ^ operator<() # map/sort/lower_bound

  • In coding tests, you can use any solution below, so I will use a global operator<(Trade, Trade)
  • In QQ, we need some basic know-how to discuss the alternatives but …. seldom quizzed

Let’s summarize the rules for QQ — I wanted to say “simple rules” but… non-trivial.

1) multiset/multimap/set/map use functor “less” .
** That’s by default. You can (but probably no practical reason to) specify any functor when instantiating the multiset class template. See post on [[allocator, vptr…]]
** Note a functor is always a class template.

2) each instantiation of functor template “LESS” typically calls operator<()
** note this “operator<()” can be a non-static method, or a global/namespace thing
** If you have an entity class Trade, you can customize by overloading this operator as a global function accepting 2 Trade arguments.
** However, Warren of CS (market data system) said it should often be a (non-static?) method of the Trade entity class. I feel this is best practice.

The dogmatic [[effective stl]] P179 doesn’t mention overloading operator< but advocates subclassing binary_function and giving it a unique name…. But this is significantly more (and unfamiliar) code to write, so for simplicity, simply overload operator<() and don’t touch q(less).

ptr-to-Trade as a key — see [[effSTL]] P88. Basically, you need a custom functor class deriving from std::binary_function. Beware the syntax pitfall highlighted in my blog post. Note shared_ptr is an option, not a must.

If you don’t need a red-black tree container, but need sorting, binary_search, lower_bound etc — then you have flexibility. Simplest is a pointer to a global bool function. See https://bintanvictor.wordpress.com/2017/10/01/binary-search-in-sorted-vector-of-tick-pointer/


How about std::lower_bound()? Same defaults — less and operator<()

How about std::sort()? Same defaults — less and operator<()

 

q[std::less] functor ^ operator<() again, briefly

[[effSTL]] P177 has more details than I need. Here are a few key points:

std::map and std::set — by default uses less<Trade>, which often uses a “method” operator<() in the Trade class

  • If you omit this operator, you get verbose STL build error messages about missing operator<()
  • this operator<() must be a const method, otherwise you get lengthy STL build error.
  • See https://stackoverflow.com/questions/1102392/stdmaps-with-user-defined-types-as-key
  • The other two (friendly) alternatives are
  • function pointer — easiest choice for quick coding test
  • binary functor class with an operator()(Trade, Trade) — too complex but most efficient, best practice.

 

binary search in sorted vector of Tick pointer

Note the mismatched args to the comparitor functions.

(I was unable to use a functor class.)

std::vector<Tick const*> vec;
int target;
bool mylessFunc(Tick const * tick, unsigned int target) {
     //cout<<tick->ts<<" against "<<target<<endl; 
     return tick-ts < target;
}
lower_bound(vec.begin(),vec.end(),target, mylessFunc);

bool mygreaterFunc(unsigned int target, Tick const * tick){
     //cout<<a->ts<<" against "<<target<<endl; 
     return tick->ts > target;
}
upper_bound(vec.begin(),vec.end(),target, mygreaterFunc)

STL algos and their _if() and _copy() derivatives

Note all the _if() accept only Unary predicates i.e. filters. Often, you start with a binary predicate like less
… then you use bind2nd() to convert binary to unary predicate

—-These algos have both _if and _copy derivatives–
remove
remove_if
remove_copy
remove_copy_if
replace
replace_if
replace_copy
replace_copy_if
—-These algos have an _if derivative–
count
count_if
find
find_if
—-These algos have an _copy derivative–
partial_sort
partial_sort_copy
rotate
rotate_copy
unique
unique_copy

tailor-made beats one-size-fits-all: STL algo

Effective STL Item 44 says “prefer member functions to algorithms with the same name”. There's a simple reason — taylor-made beats one-size-fits-all.

set.find() is taylor-made for class set (actually a class-template), but the free function (actually a function-template) find() is one-siize-fit-all — container-agnostic. All STL algos are container-agnostic because the creators made them work exclusively with iterators.

Bottom line — Such container-agnostic solutions will be less efficient than the tailor-made solutions.

## many ways to classify STL algorithms

I feel there are too many STL algorithms for my modest brain capacity. If we can fully digest a single interesting *comparison*, then we conquer a small piece of that universe. I like binary, black and white schemes.

^^ RW algorithms can operate on each element OR on the container.

^^ _copy version ^ in-place edit — many algorithms come with these 2 versions. replace_copy, remove_copy, sort_copy(???)

^^ public, generic sorter ^ private sorter. Powerful random access iterators are needed by generic sort algorithms. But many containers aren’t random-access so they provide their own private sorters in the form of methods

A few algorithms expect the range to be sorted. [[eff STL]] has a dedicated chapter.

I think only a few major algorithms need random access iterators — sort, shuffle?

transform() to emulate copy()

I feel transform() can emulate many other STL algorithms ..

string const & passthru(string const & s){  return s;  }

void dumpList(const list & list) {
    copy(list.begin(), list.end(), ostream_iterator (cout, “\n”));
    cout<<"————-\n";
    transform(list.rbegin(), list.rend(), ostream_iterator (cout, “\n”), passthru); // i used rbegin/rend just to show the symmetry
}

demo transform,for_each,operator()() with arg,pair,hash_set #10gen

Other features
– transform() with a home-made functor
– many for_each calls
– erase-remove idiom

——————
#include
#include
#include
#include
#include
#include
using namespace std;
using namespace __gnu_cxx;

hash_set uniqueChars(256); // only 256 characters exist in ascii
list abbreviations;

class suffixAdder {
    const char suffix;
public:
    suffixAdder(const char c) :
        suffix(c) {
    }
    // This operator() actually requires an argument!
    const string & operator()(const string & original) {
        return *(new string(original + this->suffix));
    }
};

void dumpList(const list & list) {
    copy(list.begin(), list.end(), ostream_iterator (cout, “n”));
}

void insertNewNodes(char c) {
    cout
            << "entering makeNewNodes() with a character in the original input — "
            << c << "n";
    transform(abbreviations.begin(), abbreviations.end(), front_inserter(
            abbreviations), suffixAdder(c));
    //dumpList(abbreviations);
}
void checkDupe(char c) {
    if (!uniqueChars.insert(c).second) {
        // insert returns a pair, whose second field indicates whether insert happened
        cout <<"Error: duplicate character : "<<c; // insert skipped due to dupe
        cin.get();
        exit(0);
    }
}
/* first pass scans in O(n) input array for duplicate characters.

* 2nd pass re-scans the input array and generates new abbreviations based on the existing population of abbreviations.

Suppose our 2nd scan is on the 4th character “x”
** Pre-condition: all existing abbrevations don't contain “x”
** Post-condition: all new abbreviations generated end in “x”

Number of abbreviations is 2 to the power of n (actually 1 fewer), therefore O(2 to the power of n)

This algorithm is iterative and more efficient than recursion. For a 200-character string, recursion would build a 200-frame stack.

*/
int main(int argc, char *argv[]) {
    //string input = “gfedba21”; // testing
    string input = string(argv[1]);
    for_each(input.begin(), input.end(), checkDupe);
    cout << "—no dupe found. Now generating abbreviations—n";
    abbreviations.push_back(“”);
    for_each(input.begin(), input.end(), insertNewNodes);
    abbreviations.erase(remove(abbreviations.begin(), abbreviations.end(), “”),
            abbreviations.end());
    dumpList(abbreviations);
    cout << "Total # of abbreviations = " << abbreviations.size();
    cin.get();
}

partition – multiple meanings

in GS, PWM desktop has a partition selector. Authorized users (mostly IT) can select one of about a dozen regions (like NorthEast), so system will only query and display that subset of the total data.

DB — Oracle, sybase, mssql have partitioned table

STL algorithm — has a partition() template function

large distributed java/c++ system — P105 [[enterprise java perf]] has a 3-page example on “partitioning”. I feel the idea is to encourage local calls in favor of cross-process data transfer. Serialization is required even between 2 local JVM’s. I believe you can avoid serialization only between threads.

transform() to emulate for_each() in STL

I feel in coding tests, you can do a for() loop

struct null_output_iterator: std::iterator<std::output_iterator_tag,
        null_output_iterator> { // all 3 members needed
    template void operator=(T const&) {} // NOT a regular operator=
    null_output_iterator & operator++() {return *this;}
    null_output_iterator & operator*() {return *this;}
} noi;
void aaaa(){
    transform(input.begin(), input.end(), noi, checkDupe);

    //same as

    for_each(input.begin(), input.end(), checkDupe);

usage@custom subclass of STL binary_function

customized subclass of binary_function can be used in several *distinct* use cases —

1) as 3rd argument to STL sort(). See P205 [[stl tutorial]] for complete example.
2) as optional template argument when *specializing* a sorted map class TEMPLATE in STL. Note your custom subclass feeds into the template, NOT to the constructor —

– A constructor usually takes an object as argument, not a class.
– A template specialization usually takes a class or a derived type[1] as argument, not an object.

[1] like a ptr, ref or typedef based on the class

STL algorithms always use iterators

My ObjectSpace manual says STL algos only access data via iterators. Therefore they are versatile enough to (indirectly) access data elements in arrays, strings, io streams and containers. Typically an algo’s input includes a range with 2 demarcation iterators.
Q: any STL algo using just 1 iterator?
A: fill_n() and generate_n()

Q: any STL algo using 0 iterator?
A: swap() is the only one. This manual has a cheatsheet or quick reference chart.

Q: any STL algo using 3 iterators?
A: many

non-method remove() can’t shrink a container

Suppose 5 items are to be removed and 10 items to remain. Scott Meyers said something to the effect that “remove() will put the 10 items in the front-segment, and return an iterator to indicate end of the front-segment, but the container size will remain 15.

Now, to shrink the container, you need to get a pointer to the container not its elements. With the pointer, either a method or a function accepting the ptr can shrink the container.

Function remove() gets no such ptr. It only gets iterators. Iterators provide access to the elements, not the container.

An STL container has pointers to the data structure (almost always on heap), but the data structure doesn’t have pointer to the host container, which can be on stack or global memory.

filter(unary)^comparitor(binary): 2 "genders" among STL predicates

Differentiate a “comparitor” vs a “filter”. Both constructs qualify as “predicates”, but if you think in terms of “predicates” you can’t observe the difference between horse and donkey.

Wrong — STL predicates are(?) all binary functions including std::less
Right — STL filters are Unary; STL comparitors are Binary

P41 [[objectSpace]] lists the stl algorithms accepting Unary predicates. EffSTL P168 has example of custom functor class — unary

Canonical example 1 — Most stl algorithms need a …. filter — Unary predicate operating on ONE element at a time.
* You are given a Binary function or functor
** either a std::less
** or a customized function or functor
* so you use bind2nd() to convert binary to unary

Canonical example 2 — your sort() or associative container needs a …. comparitor — binary predicate
* If you stop and think, a comparitor must take 2 inputs, so must be a binary predicate
* a comparitor can be a function or functor
* you don’t need bind2nd()

Std::less or a custom binary functor can be used in filters and comparitors.
$ filter usage needs bind2nd
$ comparitor usage doesn’t

Q: What if a STL algo needs a filter when you only have a binary predicate (including a comparitor)?
A: use bind1st and bind2nd. I guess the binary predicate needs to extend binary_function. See effSTL.

Note filter and comparitor are logical constructs or patterns. Physically they are implemented as function or functor returning bool (therefore known as predicate). Each STL algorithm or container requires either a real filter or a real comparitor, but physical implementation is not focus of this blog.