big O, in my own language

O, o, Omega and Theta all try to answer the question “for this algorithm, how many more iterations are needed when N triples?”, where N measures the size of the input. (I say “triples” rather than “doubles” to avoid potential ambiguity around the number 2.)

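Example: For the quicksort algorithm, the answer is O(N * log N) on average, meaning “iterations increase by a factor of slightly more than 3 when the input triples”. The exact factor is 3 * log(3N) / log N, which approaches 3 as N grows.
Example: For the binary search algorithm, the answer is O(log N), meaning “about log 3 (roughly 1.6 with base-2 logs) additional iterations when the input triples”. The increase is additive, not a multiplying factor, because log(3N) = log N + log 3.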

O(f(N)) means “no more than a constant multiple of f(N) iterations, once N is large enough”.
O is an upper bound, not necessarily a tight bound.
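To make the “what happens when N triples” question concrete, here is a minimal sketch that counts loop iterations of a binary search. The function names are my own for illustration, and the search for a value larger than every element is assumed to be the worst case for this particular loop.

```python
import math

def binary_search_steps(sorted_items, target):
    """Return the number of loop iterations a binary search performs."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

def worst_case_steps(n):
    # Searching for a value larger than every element keeps the larger half each time.
    return binary_search_steps(list(range(n)), n)

# Each line prints N, the worst-case iteration count, and log2(N) for comparison.
for n in (1_000, 3_000, 9_000, 27_000):
    print(n, worst_case_steps(n), round(math.log2(n), 1))
```

Each tripling of N adds only one or two iterations, which is the additive log 3 behaviour expected of an O(log N) algorithm.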


enforcing Perl coding standards

Large Perl teams often wish to enforce Perl “coding standards” to control the bewildering style variations that Perl permits.

Perl::Critic applies 256 coding style “policies” and outputs warnings. Management could adopt various levels of enforcement:
* automate the check in an automated build/test/release/deploy process
* track violation statistics for each developer and each team
* periodic scan of codebase
* require every developer to check, and send output to a coworker for peer review

These practices are similar to taint checking, “use strict” and -w.

Teams often wish to customize or disable some “policies”. P122 [[ Mastering Perl ]]

The #1 question is “How precisely do Perl::Critic && PPI detect violations, without false positives and without false negatives?” How intelligent and reliable are they?

competition from the young and foreigners

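You may feel your career is more turbulent than that of your peers, colleagues and classmates of the same age, but one day when they face a challenge they may not perform as well as you, because they have never weathered the storms.

Of course, some people may handle it with ease even without that experience, and their judgment may be even more accurate than yours.

It is also possible that they are lucky and never get laid off. Personally I don't want to count on luck. I cannot allow myself to rely on a company to provide an iron rice bowl.

Taking the high-level view, overall, when facing the challenge from outsiders, “meeting it” has some advantages over “avoiding it”.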

my value-add as a 5-year batch veteran

Why do employers ask for 5 years of experience in batch development? Here are the most important value-adds of a real veteran, based on my first-hand observation.

(See also my posts on batch wishlist.)

1) robust and resilient. My experience shows that serious batch jobs can fail for a large number of reasons, such as unexpected input or network delays.

2) Flexibility for change. I think batch apps are seen as quick-and-dirty, and flexible. People ask for more changes because they assume the *cost* of change is lower for batch apps than for non-batch apps. Such expectations call for deep experience in batch design.

2A) extensibility, which is slightly different from “flexibility”. Example: adding parallelism or retry. If the app is not well designed, you often need to throw out the old, tested codebase and restart from scratch.

3) modularization for a development team. Minimize stepping on each other’s toes.

4) readability, ease of learning. Batch jobs are often seen as temporary, so documentation and design are lower priorities in batch than in non-batch. Many batch applications actually need hand-over and maintenance by a new guy. I think a good system design can ease documentation, learning and knowledge transfer.

* fine-grained control. Consider the monitoring features of JMX and Weblogic
* testability
* performance optimization experience
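As an illustration of the “retry” kind of resilience mentioned above, here is a minimal sketch of a retry wrapper with exponential backoff. The names run_with_retry and flaky_download, the choice of retriable exception types and the delay values are all my assumptions, not a description of any particular batch framework.

```python
import time

def run_with_retry(step, attempts=3, base_delay=0.01,
                   retriable=(ConnectionError, TimeoutError)):
    """Run step(); on a retriable error wait and try again, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except retriable:
            if attempt == attempts:
                raise  # out of attempts: let the job scheduler see the failure
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff

# Hypothetical flaky step: fails twice with a network-style error, then succeeds.
calls = {"count": 0}

def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network delay")
    return "feed loaded"

result = run_with_retry(flaky_download)
print(result, "after", calls["count"], "attempts")
```

Errors outside the retriable tuple, such as a ValueError raised on unexpected input, propagate immediately, since retrying bad data would rarely help.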


Let’s take the base SQL vocabulary as a starting point:
* without joins
* without subqueries
* without grouping
* without agg, i.e. aggregates
* without union

Q: which addition is “troublesome” for users?

$ Join is natural to sql. Even outer join is natural.
$ Union is not as natural but simple to understand
$ Sub query is an unnatural addition to sql. ugly.
$ correlated sub query is complex.
$ Group-by imposes restrictions on other parts of a select-statement, such as “select expressions must be …”
$ Agg imposes restrictions, such as “other select expressions must be …”

app design in a fast-paced financial firm#few tips

#1 design goal? flexibility (for change). Decouple. Minimize colleagues’ source code change.

characteristic: small number of elite developers in-house (on wall street)
-> learn to defend your design
-> -> learn design patterns
-> automate, since there isn’t enough manpower

characteristic: too many projects to finish but too few developers and too little time
-> fast turnaround

characteristic: reputation is more important here than other firms
-> unit testing
-> automated testing

characteristic: perhaps quite a large data volume, quite data-intensive
-> perhaps “seed” your design around data and data models

characteristic: wide-spread use of stored procs, but many Java designs aren’t designed to work well with stored procs. Consider Hibernate.
-> learn coping strategies

characteristic: “approved technologies”
characteristic: developers move around
-> maintenance left to other guys
-> documentation is ideally “less necessary” if your design is easy to understand
-> learn documentation tools like javadoc

forkey ^ join ^ cartesian-product

referential-integrity ^ forkey ^ any_type_of_join ^ cartesian-product — at the heart of the relational paradigm.

* most if not all joins (including self-join, outer join) are Cartesian in nature, and produce an intermediate Cartesian table (icart) initially. (No need to explain “initially”)
* forkeys exist primarily (if not always) as join-columns
* relational model relies on forkeys at its heart
* normalization usually (if not always) creates forkeys
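The relationship between forkey, join and cartesian product can be sketched with Python's built-in sqlite3 module. The dept/emp tables and their rows are made up for illustration, and the “icart” here is conceptual: a real engine may never materialize it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table dept (id integer primary key, name text);
    create table emp  (id integer primary key, name text,
                       dept_id integer references dept(id));  -- the forkey
    insert into dept values (1, 'rates'), (2, 'equities');
    insert into emp  values (10, 'ann', 1), (11, 'bob', 2), (12, 'cat', 2);
""")

# Cartesian product: every emp row paired with every dept row (3 x 2 rows).
icart = conn.execute("select * from emp cross join dept").fetchall()

# Filtering the product on the forkey ...
filtered = conn.execute(
    "select e.name, d.name from emp e, dept d where e.dept_id = d.id order by e.id"
).fetchall()

# ... gives the same rows as an explicit inner join on the forkey.
joined = conn.execute(
    "select e.name, d.name from emp e join dept d on e.dept_id = d.id order by e.id"
).fetchall()

print(len(icart), filtered == joined, joined)
```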

common 1-table query: list buses passing both K && T

question 2a:
The table: route(num,company,pos, stop) ie (service_num, bus_company, position, bus_stop)

select a.num from route a, route b
where a.stop=53
and b.stop=149
and a.num=b.num;

The self-join creates an 8-column icart (i.e. intermediate cartesian). The rows form a cartesian product. The where-clause filters the 8-field rows based solely on the 8 fields.

I feel some of the conditions should be join conditions.
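One way to test the feeling that some conditions belong in the join is to run both styles side by side. This is a sketch using sqlite3. The route rows and company names are invented, and only stops 53 and 149 come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table route (num text, company text, pos integer, stop integer)")
# Hypothetical sample data: only service 4 passes both stop 53 and stop 149.
conn.executemany("insert into route values (?, ?, ?, ?)", [
    ("4",  "LRT", 1, 53),  ("4",  "LRT", 2, 149),
    ("45", "LRT", 1, 53),  ("45", "LRT", 2, 230),
    ("65", "SMT", 1, 149), ("65", "SMT", 2, 115),
])

where_style = """
    select a.num from route a, route b
    where a.stop = 53 and b.stop = 149 and a.num = b.num
"""
join_style = """
    select a.num from route a
    join route b on a.num = b.num       -- join condition
    where a.stop = 53 and b.stop = 149  -- filter conditions
"""

where_rows = conn.execute(where_style).fetchall()
join_rows = conn.execute(join_style).fetchall()
print(where_rows, join_rows)
```

Both forms return the same rows. The second only makes it visible that a.num = b.num pairs up the two copies of the table, while the stop conditions filter each copy.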

common 1-table query: list biggest countries in each region

1 table only: bbc(name, region, area, population, gdp)

Surprisingly, such a simple thing needs a complicated solution in SQL!

–You can use a correlated sub query with ALL

select name, region, area from bbc a where area >=
all (select area from bbc b where a.region=b.region) -- equal sign needed

–You can use a correlated sub query with max()

select name, region, area from bbc a where area =
(select max(area) from bbc b where a.region=b.region)

I guess you may also use a self-join? Maybe the “biggest” type of query is unnatural for a self-join?
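A self-join does seem possible. The sketch below compares the max() version with one candidate self-join, written as a left join that keeps only rows with no bigger country in the same region. The country names and figures are made up. The ALL version is not run here because SQLite, the engine assumed for this sketch, does not support the ALL operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table bbc (name text, region text, area integer, population integer, gdp integer)"
)
# Made-up rows purely for illustration.
conn.executemany("insert into bbc values (?, ?, ?, ?, ?)", [
    ("Aland", "North", 500, 10, 1), ("Bland", "North", 300, 20, 2),
    ("Cland", "South", 200, 30, 3), ("Dland", "South", 900, 40, 4),
])

with_max = conn.execute("""
    select name, region, area from bbc a
    where area = (select max(area) from bbc b where a.region = b.region)
    order by region
""").fetchall()

# Self-join: pair each country with any bigger country in its region,
# then keep the countries for which no such partner exists.
with_self_join = conn.execute("""
    select a.name, a.region, a.area from bbc a
    left join bbc b on a.region = b.region and b.area > a.area
    where b.name is null
    order by a.region
""").fetchall()

print(with_max, with_self_join)
```

The self-join works, but the “no bigger partner exists” trick is arguably less natural to read than the max() subquery.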

official^programmer-friendly thread states

java.lang.Thread.State has the official thread states.

[[ java precisely ]] and a few webpages each define a set of “thread states”, but these authors don’t agree 100%. Thread.State is the official 1.5 list. However, I think this list is perhaps too academic or too low-level. There must be a programmer-friendly version of “thread states”.

If you decide to maintain your own list of easy-to-understand states, the state transition diagrams/tables supplied by those authors will need some adjustments. Below are a few ineligible states i.e. ineligible to receive CPU. See post on “Running” state.

* Blocked on I/O: a Java thread may enter this state while waiting for data from an I/O device. The thread will move to Ready-to-Run (i.e. eligible) after the I/O condition changes (such as a byte of data being read).
* Blocked on Synchronization: a Java thread may enter this state while waiting for an object lock. The thread will move to Ready-to-Run when the lock is acquired.
* waiting in wait() without argument
* waiting in join() without argument: P 16 [[ concurrent programming in java ]] says “invoking t.join() for Thread t suspends the caller until t completes.”
* timed_waiting: when a thread is waiting in a timed wait() or a timed join()

correlated^uncorrelated subquery

#1 difference

Q: “how many times does the subquery run during a complete run of the /enclosing/ query?”
A: exactly once for uncorrelated, whereas a correlated subquery conceptually runs once for each row of the enclosing query. You can see uncorrelated is much simpler: the sub-select is complete and ready to run on its own just like a regular select, and doesn’t mention anything from the enclosing query.

Look at the example in
* uncorrelated often has the form “… IN (select …”.
* correlated often has the form “… EXISTS (select … “
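The “ready to run on its own” test can be tried directly. In this sketch (sqlite3, with invented customer/orders tables) the uncorrelated sub-select runs standalone, while the correlated one fails standalone because it refers to c.id from the enclosing query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (id integer primary key, name text);
    create table orders   (id integer primary key, cust_id integer);
    insert into customer values (1, 'ann'), (2, 'bob'), (3, 'cat');
    insert into orders   values (100, 1), (101, 3), (102, 3);
""")

uncorrelated_sub = "select cust_id from orders"
correlated_sub = "select 1 from orders o where o.cust_id = c.id"

in_rows = conn.execute(
    f"select name from customer c where c.id in ({uncorrelated_sub}) order by name"
).fetchall()
exists_rows = conn.execute(
    f"select name from customer c where exists ({correlated_sub}) order by name"
).fetchall()

# The uncorrelated sub-select is a complete query in its own right ...
standalone = conn.execute(uncorrelated_sub).fetchall()

# ... while the correlated one is not: c.id only exists in the enclosing query.
try:
    conn.execute(correlated_sub)
    correlated_runs_alone = True
except sqlite3.OperationalError:
    correlated_runs_alone = False

print(in_rows, exists_rows, len(standalone), correlated_runs_alone)
```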

my innovation track record


– [[ multiple inheritance]FTTP module ]]
– snoop oracle client-server conversations to see the actual query and raw data returned
– separate biggies and text files in my personal folder, to minimize its size and ease backup and portability
– always type my passwords on websites, to help me memorize a new zipcode or an often-misspelt word
– dozens of transparent bags hung on pegs, to help find various stuff in my home right after move-in
– since mail2blogger has 5% downtime, I created a googlegroup to publish to multiple blog hosts to compare their uptime
– dd: go vertical, freeing floor space, more stable than tall bookshelf
– typing trainers in append.bashrc.txt
– paste on mirrors torn-out pages of dictionary to help my wife’s english
– daily check on library loans (web scraping)
– put yellow stickers on light-weight luggage for mom
– bookshelf in bathroom
– automated remote login, delete acc.

You can’t be a developer forever?

my answers ranked

* Each person has her roles in the economy. Perhaps the best role for me is …
* I still believe my experience dealing with enterprise application design issues (tx, cluster, perf, …) will still have value in 20 years
* compared to a team lead, architects have a longer shelf life. Architects need some domain knowledge and deep hands-on experience.
* I still believe deep technical experience is more solid than Project Management experience, at least for me. I was able to do a “good-enough” PM job even without years of experience.
* I have been a good enough (small-scale) Presales consultant.
* some older techies in my circle are NOT earning more even as presales or PM
* Xian Hua of JDA pointed out that for a hands-on developer, job search is easier. I’d add mobility. Can enter US.
* I feel a developer has an advantage when a commander is needed for a project, and is in a better position than a PM, BA etc. However, there are exceptionally capable/effective BA/PM individuals.