more python mock IV questions #DeepakCM

Q (real IV): how do you verify a given string represents an integer? For example, "3e5" would be invalid.

Q (real IV): compare multi-threading vs multi-processing in python. Which one have you used in your projects?

Q: Have you used a FIFO data structure (like a queue) in python? It’s OK if you have not, but how would you create such a data structure in python?

Q: Compare and contrast deep copy vs shallow copy for a python list
Q: Compare and contrast deep copy vs shallow copy for a python dictionary

Q (advanced question from me): When is a deep copy required? Give some scenarios

Q : Have you used context managers? (I think they are useful.)

Q (advanced question from me): compare and contrast list comprehension vs generator expression

Q: write a Student class with lastName, firstName, birthDate fields. Also include a class attribute "instCnt" showing the number of student objects created. As you create a new Student instance, this instCnt should increment.

Q: have you heard of the LGB rule in python?

Q (basic question from me, never asked in interviews): what data types can’t be used a dictionary keys

Q (basic question from me, never asked in interviews): what’s the usage of "global" keyword? Give some usage scenarios
Q (basic question from me, never asked in interviews): what builtin data types, apart from tuples, are immutable? Give some examples.

numpy,scipy,pandas #cheatsheet

  • numpy provides efficient and large matrices
    • also linear algebra
  • scipy extends numpy
  • scipy compete against matlab and provides
    • mathematical optimizations including optimize.curve_fit()
    • stats
    • numerical integration
    • spline smoothing
  • pandas extends numpy
  • pandas provides key data structures like
    • time series
    • dataFrame i.e. spreadsheet

python usage in FI quant lib #Pimco

In one of world’s biggest fixed income buy-side firms’ quant library, the codebase is 3/4 c++ and ¼ python including pandas, numpy, machine learning, grid computing modules. I think this is similar to Macquarie FICC quant lib.

C++ is much faster, but data structures are very limited including STL containers.

I think the funds hold mostly bonds and mortgages. How about futures, IRS? Perhaps for hedging?

python IV: xp-verification^QQ questions

Many CVs include claims like — "used py for sys automation". If interviewer wants to verify such claims, There are too many realistic IV questions they can use.

Too many, so I want to be selective. Many verification questions requires extensive learning (and periodic re-learning to compensate for low usage). See [14] mock/realistic python IV

In fact, I do come with more wide-ranging python project experience than other candidates. If out of 30 "verification questions" I struggle with 50% then other candidates probably /fare worse/.

If you look at java/c++ interviewers, they don’t ask so many "experience verification" questions. On the contrary, QQ questions by definition means "not required in projects"

[14] mock/realistic python IV

There are many xp-verification questions listed here. I have a blogpost about their relevance.

ess=[[essentialRef]]; cook=cookbook 1st edition

Q: optimize – how did you optimize perf?
A: I didn’t need to for my scripts.
A: https://bintanvictor.wordpress.com/2012/03/11/python-some-performance-tips-andy/

Q: try/catch usage? A: only in file IO
Q: immutable – what data types are mutable/immutable? QQ topic?
Q: threading – global interpreter lock? A: never really needed threading
Q: debugger? A: I didn’t need it
Q: command line arguments stored where? A: sys.argv?

Q: xml – what xml parser did you use? ess477
Q: read a config file?
Q: logging?
Q: DB access?
Q78: exit normally? A: sys.exit()
Q78b: normally, “raise Exception” would print to stderr stream, but how do you direct that to a log file instead?
A: set sys.stderr stream. dive205 i.e. [[dive into python]]

Q: how do you handle newline diff between OS? ess114
Q: truth table? e68 (i.e. [[essentialRef]])
Q: how do you edit on windows and deply to linux? A: samba, ftp.

— sys admin
Q: how do you pass input files to your py script? cook
A: fileinput module is convenient.

Q: how is PYTHONPATH used with sys.path?
Q: another py — how do you execute another py script? Ess89
Q: what command line options do you use often?

Q5: how do you launch an external program?
Q5b: how do you capture its output? [[cookbook]] has a slightly advanced solution
Q5c: how do you call a unix shell command? A: shutil module meets some of your needs

Q: exit with an error msg? cook540
A: raise SystemExit (‘msg’)

— data structures
Q: diff between listcomp and genexp? How would you choose?
Q: split a string on multiple delimiters? cook37 explains re.split()
Q: iterating in reverse? cook119
==coding IV
Q: how do you print to a file rather than stdout?
A: e115 — print >> anOpenFile, ‘whatever’ # using the print KEYWORD
A: c144 — print (‘whatever’, file=anOpenFile) # using the print() function
Q: concat 2 lists?
Q: initialize a dict
Q: initialize a 2D array? m52 (i.e. [[perl to python migration]])
Q: walk a directory
Q: use a dict as a “seen”
Q: iterate a dict? mig87
Q: iterate a file
Q: interpolate into a string? ess115. c61 i.e. [[cookbook]]
Q: date/time usage (datetime module is OO;  time module is procedural?)
Q: trim both ends of a string? strip()

python usage + risk system ] GIC #ChaoJen

Government Investment Corporation is a large buy-side. IT heads are keen about python as a first-class language, used for data analysis, in post-trade risk systems etc. Pandas is the most common module.

Python salary is not always lower than java/c++. My friend Chao Jen said python salary could be high if the team is demanding and deep-pocketed.

Is the GIC risk system mostly written in python? Chao Jen doesn’t know for sure, but he feels c++ may not be widely used in the GIC risk system. Might be a vendor product.

Hackerrank python tests can be hard to get 100% right, but Chao Jen said many get close to 100%.

python^perl #my take

(Collected from various forums + my own, but ranking is based on my experience and judgment)

  1. OO
    1. – Not sure about perl6, but not many app developers create perl classes. Many CPAN modules are OO though. Python users don’t create many classes either but more than in perl. I guess procedural  is simple and good enough.
    2. – not sure about perl6, I feel perl OO is an afterthought. Python was created with OO in mind.
    3. – even if you don’t create any class, out-of-the-box python relies (more deeply) on more OO features than CPAN modules. Object-orientation is more ingrained in Python’s ethos.
    4. OO is important for large-scale, multi-modular development project. Python is more battle tested in such a context, partly due to it’s OO design, but i think a more important reason is the python import machinery, which is probably more powerful and more widely used than perl’s
  2. – Python can cooperate better with other popular technologies like DotNet (IronPython) and Java (Jython). See also python project in boost.
  3. – perl’s text processing (and to a lesser extent unix integration) features are richer and more expressive. Using every key on the keyboard is like using all 20 fingers and toes. I immediately felt the difference when switching to java. In this aspect, pytyon is somewhere between those 2 extremes.
  4. – perl can be abused to create unreadable line-noise; python has a rather clean syntax
  5. – As part of unix integration, perl offers many practical one-liners competing effectively with grep/awk/sed. Perl one-liners integrate well with other unix commands
  6. – Perl was initially designed for unix, text and as an alternative for shell script programmers. It’s still very good for that purpose. For that purpose, I feel OO offers limited value-add.
  7. – for unix scripting, perl is more similar to unix shell scripting at least on the surface, but python isn’t hard to learn.
  8. – I feel perl was more entrenched and more established in certain domains such as unix system automation, production support, bio-informatics
  9. big data, data science and data analytics domains have already picked python as the default scripting language. Perl may be usable but not popular
  10. – some say Perl the base language is much larger(??) than Python’s base language so probably it takes longer(???) time to learn Perl than Python but in the end you have a more expressive language???
  11. – CPAN was once larger than python’s module collection. If CPAN’s collection is A and python’s module colleciton is B, then I personally feel (A-B) are mostly non-essential modules for specialized purposes.
  12. – Python has better support for Windows GUI apps than Perl.
  13. I feel the open source contributions are growing faster in python
  14. There are more books on python

## any python ecosystem question@@

This topic is relevant to how I prepare for high-end python job interviews.

  1. core python QQ i.e. hard quiz questions not needed in projects? I don’t know any
    • decorators?
    • protocols like ContextManager?
  2. python ecosystem QQ? I assume there are some topics
  • pandas? never quizzed
  • python web app? never asked
  • python to launch new processes in system automation? never asked
  • python unit test tools? never asked
  • python integration with RDBMS? never asked
  • python integration with Excel, java ..? never quizzed
  • networking scripts in py? never asked. https://github.com/tiger40490/repo1/blob/py1/py/sock/tcpEchoServer.py is my useful utility

choose python^c++for cod`IV

1) Some hiring teams have an official policy — focus on coding skills and let candidate pick any language they like. I can see some interviewers are actually language-agnostic.

2) 99% of the hiring teams also have a timing expectation. Many candidates are deemed too slow. This is obvious in online coding tests. We all have failed those due to timing. (However, I guess on white-board python is not so much faster to code.)

If these two factors are important to a particular position, then python is likely better than c++ or java or c#.

  • Your code is shorter and easier to edit. No headers to include.
  • Compiler errors are shorter
  • c++ pointer or array indexing errors can crash without error message. Dynamic languages like python always give an error message.
  • STL iterator can become end() or invalidated. Result would often look normal, so we can spend a long time struggling a hidden bug.
  • Edit-Compile-Test cycle is reduced to Edit-Test
  • no uninitialized variables
  • Python offers some shortcuts to tricky c++ tasks, such as
    1. string: split,search, and many other convenient features. About 1/3 of the coding questions require non-trivial string manipulation.
    2. vector (a.k.a List): slicing, conversion(to string etc), insertion, deletion…, are much faster to write. Shorter code means lower chance of mistakes
    3. For debugging: Easy printing of vector, map and nested containers. No iteration required.
    4. easy search in containers
    5. iterating over any container or iterating over the characters of a string — very easy. Even easier than c++11 for-loop
    6. Dictionary lookup failure can return a default value
    7. Nested containers. Basically, the more complex the data structure , the more time-saving python is.
    8. multiple simultaneous return values — very very easy in python functions.
    9. a python function can return a bool half the times and a list otherwise!

If the real challenge lies in the algorithm, then solving it in any language is equally hard, but I feel timed coding tests are never that hard. A Facebook seminar presenter emphasized that across tech companies, every single coding problem is always, always solvable within the time limit.

##minimum python know-how for cod`IV

Hi XR,

My friend Ashish Singh (in cc) said “For any coding tests with a free choice of language, I would always choose python”. I agree that perl and python are equally convenient, but python is the easiest languages to learn, perhaps even easier than javascript and php in my opinion. If you don’t already have a general-purpose scripting language as a handy tool, then consider python as a duct tape and Swiss army knife

(Actually linux automation still requires shell scripting. Perl and python are both useful additions.)

You can easily install py on windows. Linux has it pre-installed. You can then write a script in any text editor and test-run, without compilation. On windows the bundled IDLE tool is optional but even easier. For the ECT cycle – see https://stackoverflow.com/questions/6513967/running-python-script-from-idle-on-windows-7-64-bit

I actually find some inconveniences — IDLE uses Alt-P to get previous command. Also copy-paste doesn’t work at all. On Windows The basic python command-line shell is better than IDLE!

For coding tests, a beginner would need to learn

  • String common operations — learn 30 and master 20 of them
    • “in” operator on string, list, dict — is one of the operations to master
    • slicing on string and list — is one of the operations to master
    • converting between string, list, dict etc — is one of the operations to master
    • Regex not needed since many developers aren’t familiar with it
  • list and dict data structures and common operations — learn 30 and master 20 of them
    • A “Set” rarely needed. I never create tuple but some built-in functions return tuples.
  • Define simple functions
    • Recursion is frequently coding-tested.
    • multiple return values are possible but not required
  • “global” keyword used inside functions
  • if/elif/else; while loop with beak and next. Switch statement doesn’t exit.
  • for-each loop is useful in coding test, esp. iterating list, dict, string, range(), file content
  • range() and xrange() function – frequently needed in coding test
  • check 2 object have same address
  • null pointer i.e. None
  • what counts as true / false
  • · No need to handle exceptions
  • · No need to create classes
    • I think “struct-type” classes with nothing but data fields are useful in coding tests, but not yet needed in my experience.
  • · No need to learn OO features
  • · No need to use list comprehension and generator expressions, though very useful features of python
  • · No need to use lambda, map()/reduce()/filter()/zip(), though essential for functional programming
  • · No need to use import os and sys modules or open files, which are essential for everyday automation scripts

my quicksort in python/c++

import random
# partition the given section by fixing the anchor
def partitionUsingFarRight(A, le, ri):
    pivotValue = A[ri] # immutable value
    pivotPos = i = ri
    while True:
        if (i <= le): return pivotPos
        i -= 1
        if A[i] > pivotValue:  
          swap(A, pivotPos-1, i)
          swap(A, pivotPos-1, pivotPos) 
          pivotPos -= 1
          
def partitionUsingFarLeft(A, le, ri): 
    # optional: swap a random object to the far left
    swap(A, random.randint(le, ri), le)
    benchmark=A[le]
    ret = i = le
    while True:
        if i == ri: return ret
        i +=1 #move pointer
        if A[i] < benchmark: # 3-step adjustment
            swap(A, ret+1, i)
            swap(A, ret+1, ret)
            ret +=1
    
def partition(A, le, ri):
    return partitionUsingFarLeft(A, le,ri)
def recurse(A, le, ri): 
    if le>=ri: return
    print 'entering partition()...',
    print(A[le:ri+1]), ' pivotVal =', A[ri]
    anchor = partition(A, le, ri)
    print '...after partition()   ',
    print(A[le:ri+1])
    recurse(A, le, anchor-1)
    recurse(A, anchor+1, ri)
def swap(A, x,y):
    tmp = A[x]
    A[x] = A[y]
    A[y] = tmp
def qsort(A):
    recurse(A, 0, len(A)-1)
    print(A)
 
qsort([222,77,6,55,3,11,5,2,88,9,66,22,8,44,1,33,99,7,66])

Above is py, below is c++


#include <iostream>
#include <vector>

std::vector<int> A{77, 11, 66,22,33,99,44,88, 77, 55, 0};
int const size = A.size();

void dump(int l=0, int r=size-1) {
	for (int i = l; i <= r; ++i)
		std::cout << A[i] << ' ';
	std::cout <<std::endl;
}

template <typename T>
void swap(int pos1, int pos2) {
	if (A[pos1] == A[pos2]) return;
	T tmp = A[pos1];
	A[pos1] = A[pos2];
	A[pos2] = tmp;
}

/*partition the region [l,r] such that all elements smaller than
pivotValue are on the left of pivotPosition
*/
template <typename T>
int partitionUsing1st(int l, int r) {
	T const pivotVal = A[l];
	int pivotPos = l;
	for (int i = l+ 1; i <= r; ++i) { 
		if (A[i] >= pivotVal) continue;
		swap<int>(pivotPos + 1, i);
		swap<int>(pivotPos + 1, pivotPos);
		++pivotPos;
	}
	return pivotPos;
}
template <typename T>
int partitionUsingLast(int l, int r) {
	T const pivotVal = A[r];
	int pivotPos = r;
	for (int i = r - 1; i >= l; --i) {
		if (A[i] <= pivotVal) continue;
		swap<int>(pivotPos - 1, i);
		swap<int>(pivotPos - 1, pivotPos);
		--pivotPos;
	}
	return pivotPos;
}
/*based on [[Algorithms unlocked]], should but doesn't minimize swaps!
Lime zone -- items smaller than pivot value
Red zone -- items bigger than pivot value
Ugly zone -- items yet to be checked
*/
template <typename T>
int partitionMinimalSwap(int le, int ri) {
	T const pivotVal = A[ri];
	// two boundaries exist between zones
	int redStart = le;
	// we start with redStart == uglyStart == l, which means item at le is Unknown
	for (int uglyStart = le; uglyStart < ri; ++uglyStart) {
		if (A[uglyStart] < pivotVal) {
			swap<int>(uglyStart, redStart);
			redStart++;
		}
	}
	swap<int>(ri, redStart);
	return redStart;
}

template <typename T>
void recurse(int l, int r) {
	if (l >= r) return; // recursion exit condition
	int const anchor = partitionMinimalSwap<T>(l, r);
	recurse<T>(l, anchor-1);
	recurse<T>(anchor+1, r);
}

int main() {
	recurse<int>(0, size-1);
	dump();
	return 0;
}

[16]python: routine^complex tasks #XR

XR,

Further to our discussion, I used perl for many years. 95% of my perl tasks are routine tasks. With py, I would say “majority” of my tasks are routine tasks i.e. solutions are easy to find on-line.

  • routine tasks include automated testing, shell-script replacement, text file processing, query XML, query various data stores, query via http post/get, small-scale code generation, simple tcp client/server.
  • For “Complex tasks” , at least some part of it is tricky and not easily solved by Googling. Routine reflection / concurrency / c++Integration / importation … are documented widely, with sample code, but these techniques can be pushed to the limit.
    •  Even if we just use these techniques as documented, but we combine them in unusual ways, then Google search will not be enough.
    • Beware — decorators , meta-programming, polymorphism, on-the-fly code-generation, serialization, remote procedure call … all rely on reflection.

When you say py is not as easy as xxx and takes several years to learn, I think you referred to complex tasks.

It’s quite impressive to see that some powerful and useful functionalities can be easily implemented by following online tutorials. By definition these are routine tasks. One example that jump out at me is reflection in any language. Python reflection can be powerful like a power drill to break a stone wall. Without such a power drill the technical challenge can look daunting. I guess metaclass is one such power drill. Decorator is a power drill I used in %%logging decorator with optional args

I can see a few reasons why managers choose py over java for certain tasks. I heard there are a few jvm-based scripting languages (scala, groovy, clojure, jython …) but I guess python beats them on several fronts including more packages (i.e. wheels) and more mature, more complete and proven solutions, familiarity, reliability + wider user base.

One common argument to prefer any scripting language over any compiled language is faster development. True for routine tasks. For complex tasks, “your mileage may vary”. As I said, if the software system requirement is inherently complex, then implementation in any language will be complex. When the task is complex, I actually prefer more verbose source code — possibly more transparent and less “opaque”.

Quartz is one example of a big opaque system for a complex task. If you want, I can describe some of the complex tasks (in py) I have come across though I don’t have the level of insight that some colleagues have.

When you said the python debugger was less useful to you than java debugger, it’s a sign of java’s transparency. My “favorite” opaque parts of py are module import and reflection.

If any language has excellent performance/stability + good on-line resources [1] + reasonable library of components comparable to the mature languages like Java/c++, then I feel sooner or later it will catch on. I feel python doesn’t have exactly the performance. In contrast, I think php and javascript can achieve very high performance in their respective usage domains.

[1] documentation is nice-to-have but not sufficient. Many programmers don’t have time to read documentation in-depth.

[[automate the boring stuff with python]]

This book teaches just enough python features for the purpose. All non-essentials are left out.

–sub chapter on selenium
the easiest way to use selenium and drive a browser, IMO.

–Chapter on Excel
text file spreadsheet converter
merge/unmerge
setting font in a cell

–Chapter on PDF:
combine select pages from many files

—-Chapter on CSV + json

—-Chapter on task scheduler

—-Chapter on keyboard/mouse control — powerful

## relatively innovative features of python

I’m relatively familiar with perl, java, c++, c# and php, though some of them I didn’t use for a long time.

IMO, these python features are kind of unique, though other unknown languages may offer them.

* decorators
* list comp
* more hooks to hook into object creation. More flexible and richer controls
* methods and fields are both class attributes. I think fundamentally they are treated similarly.

python features (broad) actually used inside 2 Wall St ibanks

If a python project takes 1 man-year, then the number of python features used would take 2 man-weeks of learning, because python is an easy-learning language. In such a project you probably won’t need fancy features like threading, decorators… Most of the work would be mundane.

After the initial weeks of exciting learning curve, you may feel bored. (c# took me months/years). You may feel your peers are deepening their expertise in c++ or wpf …

The tech quiz would be mostly on syntax.

========= the features =========
? lambda, closure, apply()
Everyday operations on list/dict/tuple, str, file object
** for-loop, conversion, filter(), map(), aggregation, argument passing,
Filesystem, input/output
Basics of module import (no advanced features needed)
command line processing + env vars
create functions –composite argument passing
config files and environment config like PYTHON_PATH

———a bit advanced and less used———
creating classes? fewer than functions, and usually simple classes only
list comprehension, generator expressions
basic platform-specific issues. Mostly unix
inheritance, overriding
threading
spawn new processes
decorators
* Python calling C, not the other way round. In financial analytics, heavy lifting is probably done in C/C++ [1], so what’s doable in python? I guess the domain models (market models, instrument models, contract models …) are probably expressed in python as graph-aware python objects.

[1] I have a book on option pricing using things like Finite Element Method. It seems to use many c++ language features (template etc) but I’m not sure if java or python can meet the requirements.

OO scripting languageS

I see more php developers create classes than Perl developers. I guess python developers are even more likely to create classes. Every language claim to support OO, but the litmus test is how easily (and how frequently) developers create user-defined classes.

(In the same vein, C++ template is a another thing regular developers have not fully embraced. Java generic class is yet another.)

Unlikely Java/c#, these scripting languages won’t deliberately make your life difficult if you don’t “embrace” OO.

Compared to perl and php, I feel python OO is simple, clean and “full-service” (not in the call-girl sense:). Incidentally, built-in python “features” are more object-oriented than in perl and php.

However, I still feel many python applications tend to be more procedural than OO. When facing a choice, I guess many developers feel procedural is simpler even thought python OO is not complicate. If a simple tool does the job, then no motivation to change.

What about multi-module team development? Look at perl and C projects. Many team still stick to procedural.

## python features used in a web app #django

(Plone is too heavy-weight and over-kill, just like zend. Django is light-weight.)

  • MVC – model/view are classes. Controllers may/may not be classes.
  • user-defined class hierarchy
  • per-instance method-redefine
  • per-instance attribute-add/delete
  • lambda, apply — functional programming
  • try/catch – heavily used
  • decorators – Django-specific
  • introspection (basic meta-programming)