python to dump binary data in hex digits

Note hex() is a built-in, but I find it inconvenient. I need to print in two-digits with leading 0.

def Hex(data): # a generator function
  i=0
  for code in map(ord,data):
    yield "%02x " % code
    i += 1
    if i%8==0: yield ' '

print ''.join(Hex("\x0a\x00")); exit(0)
Advertisements

edit1file]big python^c++ prod system #XR

Q1: suppose you work in a big, complex system with 1000 source files, all in python, and you know a change to a single file will only affect one module, not a core module. You have tested it + ran a 60-minute automated unit test suit. You didn’t run a prolonged integration test that’s part of the department-level full release. Would you and approving managers have the confidence to release this single python file?
A: yes

Q2: change “python” to c++ (or java or c#). You already followed the routine to build your change into a dynamic library, tested it thoroughly and ran unit test suite but not full integration test. Do you feel safe to release this library?
A: no.

Assumption: the automated tests were reasonably well written. I never worked in a team with a measured test coverage. I would guess 50% is too high and often impractical. Even with high measured test coverage, the risk of bug is roughly the same. I never believe higher unit test coverage is a vaccination. Diminishing return. Low marginal benefit.

Why the difference between Q1 and Q2?

One reason — the source file is compiled into a library (or a jar), along with many other source files. This library is now a big component of the system, rather than one of 1000 python files. The managers will see a library change in c++ (or java) vs a single-file change in python.

Q3: what if the change is to a single shell script, used for start/stop the system?
A: yes. Manager can see the impact is small and isolated. The unit of release is clearly a single file, not a library.

Q4: what if the change is to a stored proc? You have tested it and run full unit test suit but not a full integration test. Will you release this single stored proc?
A: yes. One reason is transparency of the change. Managers can understand this is an isolated change, rather than a library change as in the c++ case.

How do managers (and anyone except yourself) actually visualize the amount of code change?

  • With python, it’s a single file so they can use “diff”.
  • With stored proc, it’s a single proc. In the source control, they can diff this single proc
  • with c++ or java, the unit of release is a library. What if in this new build, beside your change there’s some other change , included by accident? You can’t diff a binary 😦

So I feel transparency is the first reason. Transparency of the change gives everyone (not just yourself) confidence about the size/scope of this change.

Second reason is isolation. I feel a compiled language (esp. c++) is more “fragile” and the binary modules more “coupled” and inter-dependent. When you change one source file and release it in a new library build, it could lead to subtle, intermittent concurrency issues or memory leaks in another module, outside your library. Even if you as the author sees evidence that this won’t happen, other people have seen innocent one-line changes giving rise to bugs, so they have reason to worry.

  • All 1000 files (in compiled form) runs in one process for a c++ or java system.
  • A stored proc change could affect DB performance, but it’s easy to verify. A stored proc won’t introduce subtle problems in an unrelated module.
  • A top-level python script runs in its own process. A python module runs in the host process of the top-level script, but a typical top-level script will include just a few custom modules, not 1000 modules. Much better isolation at run time.

There might be python systems where the main script actually runs in a process with hundreds of custom modules (not counting the standard library modules). I have not seen it.

big guns: template4c++^reflection4(java+py)

Most complex libraries (or systems) in java require reflection to meet the inherent complexity;

Most complex libraries in c++ require template meta-programming.

But these are for different reasons… which I’m not confident to point out.

Most complex python systems require … reflection + import hacks? I feel python’s reflection (as with other scripting languages) is more powerful, less restricted. I feel reflection is at the core of some (most?) of the power features in python – import, polymorphism

##minimum py know-how for (compiler)coding test #XR

Hi XR,

My friend Ashish Singh (in cc) said “For any coding tests with a free choice of language, I would always choose python”. I agree that perl and python are equally convenient, but python is the easiest languages to learn, perhaps even easier than javascript and php in my opinion. If you don’t already have a general-purpose scripting language as a handy tool, then consider python as a duct tape or Swiss army knife

(Actually linux automation still requires shell scripting. Perl and python are both useful additions.)

You can easily install py on windows. Linux has it pre-installed. You can then write a script in any text editor and test-run, without compilation. On windows the bundled IDLE tool is optional but even easier – see https://stackoverflow.com/questions/6513967/running-python-script-from-idle-on-windows-7-64-bit

(Ashish please feel free to add to this list) For coding tests, a beginner would need to learn

  • · String common operations. Regex not needed
  • · list and dict data structures and common operations. A “Set” may be useful occasionally. Tuple not needed.
  • · “in” operator on string, list, dict
  • · if/elif/else; while loop with beak and next
  • · for-each loop is more useful in coding test, esp. iterating list, dict, string, range()
  • · range() function – frequently needed in coding test
  • · Define simple functions. Recursion is frequently coding-tested.
  • · No need to handle exceptions
  • · No need to create classes
    • I think “struct-type” classes with nothing but data fields are useful in coding tests, but not yet needed in my experience.
  • · No need to learn OO features
  • · No need to use list comprehension and generator expressions, though very useful features of python
  • · No need to use lambda, map()/reduce()/filter()/zip(), though essential for functional programming
  • · No need to use import os and sys modules or open files, which are essential for practical automation scripts

3groupsOf3digits #YiHai

Q: Find three 3-digit numbers with a ratio of 1:2:3;
These numbers pick their 3 digits from a range of 1 to 9;
All digits that form these numbers must be completely unique, i.e. you must use up all distinct 9 digits. For example, a set of 123:246:369 doesn’t qualify.

def failed_to_add(myset, digit): # return False when failed
    return digit == 0 or len(myset) == ( myset.add(digit) or len(myset))

def other_number_failed(myset, num): # returns False by default if everything good
    for digit in(num/100, num/10%10, num%10):
        if failed_to_add(myset, digit):     return True

def check(number1, myset):
    number2 = 2*number1
    if other_number_failed(myset, number2): return 
    number3 = 3*number1
    if other_number_failed(myset, number3): return 
    print number1, number2, number3, myset
    winners.append([number1, number2, number3])

winners=list()
nine_digits = tuple(range(1,10))
for digit1 in (1,2,3):
    last8 = list(nine_digits)
    last8.remove(digit1)
    for digit2 in last8:
        last7 = list(last8)
        last7.remove(digit2)
        for digit3 in last7:
            number1 = digit1*100 + digit2*10 + digit3
            if number1 > 329: break
            check(number1, {digit1, digit2, digit3})
print "-- lucky winners --", winners

my quicksort in python/c++

# partition the given section by fixing the anchor
def partition(A, le, ri):
    pivot = A[ri] # immutable
    pivotPos = i = ri
    while i > le:
        i -= 1
        if A[i] <= pivot: continue swap(A, pivotPos-1, i) swap(A, pivotPos-1, pivotPos) #print 'after swap ...', A[le:ri+1] pivotPos -= 1 return pivotPos def recurse(A, le, ri): if le>=ri: return
    print 'entering partition()...',
    print(A[le:ri+1]), ' pivotVal =', A[ri]
    anchor = partition(A, le, ri)
    print '...after partition()   ',
    print(A[le:ri+1])
    recurse(A, le, anchor-1)
    recurse(A, anchor+1, ri)
def swap(A, x,y):
    tmp = A[x]
    A[x] = A[y]
    A[y] = tmp
def qsort(A):
    recurse(A, 0, len(A)-1)
    print(A)

qsort([222,55,33,11,66,8,144,99,7,66])

Above is py, below is c++

</pre>

#include <iostream>
#include <vector>

std::vector<int> A{77, 11, 66,22,33,99,44,88, 77, 55, 0};
int const size = A.size();

void dump(int l=0, int r=size-1) {
	for (int i = l; i <= r; ++i)
		std::cout << A[i] << ' ';
	std::cout <<std::endl;
}

template <typename T>
void swap(int pos1, int pos2) {
	if (A[pos1] == A[pos2]) return;
	T tmp = A[pos1];
	A[pos1] = A[pos2];
	A[pos2] = tmp;
}

/*partition the region [l,r] such that all elements smaller than
pivotValue are on the left of pivotPosition
*/
template <typename T>
int partitionUsing1stInefficient(int l, int r) {
	T const pivotVal = A[l];
	int pivotPos = l;
	for (int i = l+ 1; i <= r; ++i) { if (A[i] >= pivotVal) continue;
		swap<int>(pivotPos + 1, i);
		swap<int>(pivotPos + 1, pivotPos);
		++pivotPos;
	}
	return pivotPos;
}
template <typename T>
int partitionUsingLastInefficient(int l, int r) {
	T const pivotVal = A[r];
	int pivotPos = r;
	for (int i = r - 1; i >= l; --i) {
		if (A[i] <= pivotVal) continue;
		swap<int>(pivotPos - 1, i);
		swap<int>(pivotPos - 1, pivotPos);
		--pivotPos;
	}
	return pivotPos;
}
/*based on [[Algorithms unlocked]]
Lime zone -- items smaller than pivot value
Red zone -- items bigger than pivot value
Ugly zone -- items yet to be checked
*/
template <typename T>
int partitionMinimalSwap(int le, int ri) {
	T const pivotVal = A[ri];
	// two boundaries exist between zones
	int redStart = le;
	// we start with redStart == uglyStart == l, which means item at le is Unknown
	for (int uglyStart = le; uglyStart < ri; ++uglyStart) {
		if (A[uglyStart] < pivotVal) {
			swap<int>(uglyStart, redStart);
			redStart++;
		}
	}
	swap<int>(ri, redStart);
	return redStart;
}

template <typename T>
void recurse(int l, int r) {
	if (l >= r) return; // recursion exit condition
	int const anchor = partitionMinimalSwap<T>(l, r);
	recurse<T>(l, anchor-1);
	recurse<T>(anchor+1, r);
}

int main() {
	recurse<int>(0, size-1);
	dump();
	return 0;
}

python list append -3ways

Note there could be some useful python tips in the pearl blog. Perhaps it’s better to move the z_py posts there.

li + [1,2,3] # can be used directly in a for loop

### all solutions below return None!

li.append(1) # one argument exactly. To append more than one, use extend:

li.extend( [1,2] )

li.extend(list2)

 

python – ‘global myVar’ declaration needed where? simple rules

Suppose you have a global variable var1 and you need to “use” it in a function f1()

Note Global basically means module-level. There’s nothing more global than that in python.

Rule 1a: to read any global variable in any function you don’t need “global”. I think the LGB rule applies.

Rule 1b: to call a mutator method on a global object, you don’t need “global”. Such an object can be a dict or a list or your custom object.

Rule 2: to rebind the variable to another object in memory (i.e. pointer reseat), you need “global” declaration to avoid compilation error. This situation is rare in all my projects.

%%logging decorator with optional args

I hope the wordpress code formatting renders the source code correctly:

def log3(funcOrMsg=None, named_arg=None):
    '''arg is Optional. You can use any of:
    @log3(named_arg='specific msg') # some prefer named argument for clarity
    @log3('msg2') or
    @log3
    '''
    arg1_isCallable = callable(funcOrMsg)
    arg1_isStr      = isinstance(funcOrMsg, basestring)
    arg1_isNone     = funcOrMsg is None

    def decorated(func):

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if named_arg: print 'named_arg = ' + str(named_arg)
            tmp = funcOrMsg if arg1_isStr else ''
            logger.info(tmp + ' pym ver = ' + str(logger.pymodelsVer),
              extra={'name_override' : func.__name__})
              ### set name_override to func.__name__ in a kwarg to info()
            return func(*args, **kwargs)
        return wrapper
        ## end wrapper

    if arg1_isCallable:
        return decorated(funcOrMsg) # decorator received no-arg
    else:
        # decorator had kwargs   or   positional arg
        assert     arg1_isNone   or   arg1_isStr
        return decorated