create functionality without jvm restart]strategy pattern

The Strategy pattern lets you define a family of interchangeable algorithms, one of which is selected at runtime. In extreme circumstances, a new algorithm has to be created and added immediately, without a JVM restart. That calls for a higher level of flexibility.

Perhaps the highest level of flexibility is offered by a DB containing classnames in the “family”. After you create a new algorithm, you insert its classname into the DB.

Class<?> c = Class.forName("com.myPackage.Myclass");
Thing t = (Thing) c.getDeclaredConstructor().newInstance();
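A minimal sketch of the idea, under stated assumptions: the interface PricingStrategy, its two implementations and the CLASS_NAME_TABLE map are all hypothetical names invented for illustration, and the map merely stands in for the DB table of class names.

```java
import java.util.Map;

// Hypothetical strategy interface; the "Thing" in the snippet above plays this role.
interface PricingStrategy {
    double price(double base);
}

// Two hypothetical members of the algorithm "family".
class FullPricing implements PricingStrategy {
    public double price(double base) { return base; }
}

class HalfPricing implements PricingStrategy {
    public double price(double base) { return base * 0.5; }
}

class StrategyLoaderDemo {
    // Stand-in for the DB table of class names; a real system would query the DB on each lookup.
    static final Map<String, String> CLASS_NAME_TABLE = Map.of(
            "full", FullPricing.class.getName(),
            "half", HalfPricing.class.getName());

    // Look up the class name stored under the key, then load and instantiate it reflectively.
    static PricingStrategy load(String key) {
        String className = CLASS_NAME_TABLE.get(key);
        try {
            Class<?> c = Class.forName(className);
            return (PricingStrategy) c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load strategy " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load("half").price(100.0));
    }
}
```

Note that Class.forName() can only find a class the running JVM's class loader can see, so a brand-new algorithm class still has to be placed somewhere on the classpath (or loaded through a custom class loader) before its name goes into the DB.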

Also see detailed sample code in

"PREDICATE" = boolean-expression OR other meanings

In STL, the 2 types of predicates (the filter vs the comparator) can cause major confusion for beginners learning the STL syntax details.

predicate = any statement that evaluates an expression and returns a boolean.
predicate (xsl) = a filter
predicate = a proposition
predicate = an assertion

— contexts —
sql where-clause? filter on list
matlab? filter on list
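The filter-vs-comparator distinction can be sketched in Java too (my choice of language here, the STL version being C++). The names IS_EVEN and GOES_BEFORE are made up for illustration. The filter is a unary predicate, while the STL-style comparator is a binary predicate that also returns a boolean.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PredicateKindsDemo {
    // Unary predicate used as a filter: keep or drop one element.
    static final Predicate<Integer> IS_EVEN = n -> n % 2 == 0;

    // Binary predicate used as a comparator, STL style: "should a go before b?"
    static final BiPredicate<Integer, Integer> GOES_BEFORE = (a, b) -> a > b;

    static List<Integer> evens(List<Integer> in) {
        return in.stream().filter(IS_EVEN).collect(Collectors.toList());
    }

    static List<Integer> sortedDescending(List<Integer> in) {
        List<Integer> out = new ArrayList<>(in);
        // Adapt the boolean "goes before" predicate to Java's three-way Comparator.
        out.sort((a, b) -> GOES_BEFORE.test(a, b) ? -1 : (GOES_BEFORE.test(b, a) ? 1 : 0));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(evens(List.of(3, 4, 1, 2)));
        System.out.println(sortedDescending(List.of(3, 4, 1, 2)));
    }
}
```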

one-word intro 2 outer join

Outer join (amidst other joins) can be confusing for beginners. Different authors use their favorite anchor keywords. Best one-word intro: subset

Sometimes, one join-column is a subset of the other join-column, where outer join and inner join show their difference. Focus on subset scenario for now. Once you get your mind around this scenario, you are ready to look at the generic scenario where one join column is not a subset of another join column

–2nd word after “subset”? union

In some SQL implementations, you need a union query to do an outer join.

I believe unions solve more problems than outer joins.

–simplest example: bonus
show employees with and without bonus
“bonus” table {employee_id (foreign key), bonus}
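To make the subset scenario concrete without a database, here is a rough in-memory sketch. The sample rows (ann, bob, 500) and all method names are invented. The employee_id values in the bonus table are a subset of the employee ids, which is exactly where the two joins differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OuterJoinDemo {
    // Hypothetical employee table: id -> name.
    static Map<Integer, String> employees() {
        Map<Integer, String> m = new LinkedHashMap<>();
        m.put(1, "ann");
        m.put(2, "bob");
        return m;
    }

    // Hypothetical bonus table: employee_id -> bonus. Only ann has a row.
    static Map<Integer, Integer> bonus() {
        Map<Integer, Integer> m = new LinkedHashMap<>();
        m.put(1, 500);
        return m;
    }

    // Left outer join: every employee shows up, with null where no bonus row matches.
    static List<String> leftOuterJoin(Map<Integer, String> emp, Map<Integer, Integer> bon) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Integer, String> e : emp.entrySet()) {
            rows.add(e.getValue() + "," + bon.get(e.getKey()));
        }
        return rows;
    }

    // Inner join: employees without a matching bonus row are dropped.
    static List<String> innerJoin(Map<Integer, String> emp, Map<Integer, Integer> bon) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Integer, String> e : emp.entrySet()) {
            if (bon.containsKey(e.getKey())) {
                rows.add(e.getValue() + "," + bon.get(e.getKey()));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(leftOuterJoin(employees(), bonus())); // [ann,500, bob,null]
        System.out.println(innerJoin(employees(), bonus()));     // [ann,500]
    }
}
```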

wait() means, to a beginner

To avoid overwhelming a beginner, the simpler meaning is “current thread please wait”.

* wait() means “current thread please wait at this statement, inside this waiting area”
* wait() means “Hey the thread executing this statement, please wait at this statement, in this object”

Note: Remember the method enclosing the wait() statement belongs to the “this” object? The “waiting area” is the “this” object.

Note: The “thread” is the real-world thread of execution, not to be confused with the Thread object returned by Thread.currentThread().
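A small sketch of “current thread please wait in this object”. The class names WaitingArea and WaitDemo are my own. The while loop around wait() is the usual guard against spurious wakeups, and it also covers the case where open() runs before the waiter arrives.

```java
class WaitingArea {
    private boolean open = false;

    // The calling thread waits here, inside this object, until open becomes true.
    synchronized void awaitOpen() throws InterruptedException {
        while (!open) {
            wait(); // same as this.wait(): the "waiting area" is the "this" object
        }
    }

    // Another thread flips the flag and wakes up everyone waiting in this object.
    synchronized void open() {
        open = true;
        notifyAll();
    }
}

class WaitDemo {
    // Returns true if the waiting thread got past awaitOpen() after open() was called.
    static boolean run() {
        WaitingArea area = new WaitingArea();
        boolean[] resumed = {false};
        Thread waiter = new Thread(() -> {
            try {
                area.awaitOpen();
                resumed[0] = true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        area.open();
        try {
            waiter.join(); // join() makes the waiter's write to resumed[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return resumed[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```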

runtime change to object behaviour

[[ head first design patterns ]] repeatedly favors *runtime* change to program functionality, rather than compile-time ie source code change. I assume they have a *practical* reason instead of a doctrine.

Related concepts: Strategy pattern, Decorator pattern

When we need to change from an old functionality to a new functionality, a good approach is
* we try to create a new functionality class, if at all possible,
* at runtime, use existing setters to assign the new functionality, replacing the old, when needed.
* minimize edits to existing, tested classes

See also post on [[ create functionality without jvm restart]strategy ]]

I think this probably incurs the least impact on existing, tested functionality.
=> regression test@@ much less of it, since existing classes are untouched
=> Low stress for fellow developers, managers, clients, internal users and any non-techies.
=> no need to worry “Did we miss any other existing classes that need edit?”
( documentation on interdependencies is crucial but often neglected by developers. )
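The three bullet points above can be sketched as follows. Report, ReportFormatter and the two formatter classes are hypothetical names, not from the book. The point is that Report itself is never edited: the new behaviour arrives as a new class and is plugged in through the existing setter.

```java
// Hypothetical "functionality" interface.
interface ReportFormatter {
    String format(String body);
}

// Old functionality: existing, tested, left untouched.
class PlainFormatter implements ReportFormatter {
    public String format(String body) { return body; }
}

// New functionality: added as a new class, no edits to existing classes.
class UpperCaseFormatter implements ReportFormatter {
    public String format(String body) { return body.toUpperCase(); }
}

// Existing, tested class with a setter for its behaviour.
class Report {
    private ReportFormatter formatter = new PlainFormatter();

    void setFormatter(ReportFormatter formatter) { this.formatter = formatter; }

    String render(String body) { return formatter.format(body); }
}

class RuntimeSwapDemo {
    public static void main(String[] args) {
        Report r = new Report();
        System.out.println(r.render("eod summary"));
        r.setFormatter(new UpperCaseFormatter()); // runtime change, no source change to Report
        System.out.println(r.render("eod summary"));
    }
}
```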

be prepared for entry-level java position interviews

Hi XR,
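In these three months in the US, I have had close to ten in-person and phone interviews for Java positions.
I have a feeling I can't quite confirm. A candidate like me (a new immigrant?) is seen in the Java field mainly as a low-level, hands-on coder. The expectation is fast hands, a good memory and high coding "output", with no obvious weakness in quality (Performance, Security….) either. In a few words: fast and accurate.


Some of my other, modest strengths are welcomed by interviewing companies but seldom asked about. For example High Volume Java App server maintenance, Basic Weblogic Configuration/tuning, entry-level Cluster knowledge, entry-level Database design and tuning knowledge, Integration of Java with External Systems … Perhaps other staff are responsible for these areas. But a small company may not have a full-time DBA, a full-time Weblogic Admin and the like. If there really are no specialists in these areas, then whoever is responsible may not have much experience and, like me, may be only half-competent in them. (I once single-handedly carried the heavy loads of Oracle, Mysql, Weblogic…)

I want to be an Architect at a small company, but no one considers me. Fair enough, since I admit my Java / SQL is not yet deep. I hope your luck is better than mine, and I hope my luck slowly improves. I also hope that through more real interviews I can get a deeper and more accurate grasp of the Java job market, rather than groping forward in the dark by a faint glimmer of light as I do now.

You asked how tiring the work is. Most colleagues work 9 – 5, Mon-Fri. I work 50 hours a week, 9:30 – 6:30. For 5 consecutive Saturdays I voluntarily went back to the office for a few extra hours. On the road or at home, I study Java/SQL/PHP for at least an hour every day, and it doesn't feel like a hardship.
Project work in New York is more tiring: 10 hours a day on average, with a longer commute of more than an hour each way.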

partitioned table: sybase^oracle

— based on;hf=0

An unpartitioned table with no clustered-index (%% irrelevant %%) … each insertion into the table uses the last (newest) page of the chain. Adaptive Server holds an exclusive lock on the last page while it inserts the rows, blocking other concurrent transactions from inserting data into the table.

Partitioning a table with the partition clause of the alter table command creates additional page chains. Each chain has its own last page, which can be used for concurrent insert operations. This improves insert performance by reducing page contention.

If the table is spread over multiple physical devices, partitioning also improves insert performance by reducing I/O contention (lower than page-contention) while the server flushes data from cache to disk.

avoid clustered index on identity column

If you do a large number of inserts and you have built your clustered index on an Identity column, you will have major contention and deadlocking problems. This will instantly create a hot spot in your database at the point of the last inserted row, and it will cause bad contention if multiple insert requests are received at once. See post on [[ batch feature wishlist — integrate ]] There are 2 solutions, based on

Solution #1: create your clustered index on a field that will somewhat randomize the inserts across the physical disk (such as last name, account number, social security number, etc) and then create a non-clustered index based on the identity field that will “cover” any eligible queries.

The drawback here, as pointed out in the Identity Optimization section in more detail, is that clustering on another field doesn’t truly resolve the concurrency issues. The hot spot simply moves from the last data page to the last index page of the Identity column index (no longer clustered)

%%Every insert requires a write to the non-clustered index on IC. Since the IC increases for every insert, these index writes occur at the highest IC values. Visualize a B-tree index tree and often these index writes happen on the right-most leaf node.%%

— based on

Recall that a clustered index physically sorts the data pages in a table. If you put a clustered index on an IDENTITY column, then all of your inserts will happen on the last (newest) page of the table, and that page is locked for the duration of each insert.

Solution #2: turn on insert row-level locking in SQL Server 6.5, which locks only the row being inserted, thus reducing lock contention. You can also move your clustered index to a different column, thereby scattering the inserts around the table.

sybperl ^ DBD::Sybase

features^ease: my short comparison of the 2, based on hearsay. In other words, sybperl exposes more Sybase features but DBD::Sybase is easier to learn since it’s standard DBI/DBD. How easy? Little more than a change from “dbi:Oracle:...” to “dbi:Sybase:...”.

Other aspects?

* ease of use@@ Sybase::Simple is part of sybperl, but can’t reach the ease of DBD::Sybase
* Bugs and stability@@ I think by now DBD::Sybase is mature and stable, but presumably less battle-tested than sybperl for certain rarely used features.
* performance@@ no idea

statement re-order — within-thread^between-thread

— Based on

JVM can re-order statements. “Ordering rules fall under two cases, within-thread and between-thread: “

)) “From the point of view of the thread performing the actions in a method, instructions proceed in the normal as-if-serial manner that applies in sequential programming languages. “

%% As an observer inside this thread, you can ignore the re-ordering and assume that “a statement appearing earlier never runs later”. The behind-the-scenes shuffle will never break a single-thread spectator’s expectation.%%

)) “From the point of view of other threads that might be “spying” on this thread by concurrently running unsynchronized methods, almost anything can happen. The only useful constraint is that the relative orderings of synchronized code, as well as operations on volatile fields, are always preserved. “
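A sketch of the between-thread rule for volatile fields. The class and field names are my own. The ordinary write to data comes before the volatile write to ready in program order, so a reader that sees ready == true is guaranteed to see data == 42 as well. Drop the volatile keyword and that guarantee is gone.

```java
class VolatilePublishDemo {
    static int data = 0;                   // ordinary field
    static volatile boolean ready = false; // volatile flag

    // Returns the value of data observed by the reader thread once it has seen ready == true.
    static int run() {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {      // volatile read, repeated until the writer publishes
                Thread.yield();
            }
            seen[0] = data;       // ordered after the volatile read that saw true
        });
        reader.start();
        data = 42;                // ordinary write
        ready = true;             // volatile write: the write above cannot be moved after it
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```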

struts: swallow an elephant

iview q: “desc struts(&&spring) web flow of your project”. There are more than 10 key aspects, so u need to digest and memorize something like an elephant. You need a system to organize these topics.

* Benefit: u will sound thorough and confident
* Benefit: as a bonus, fewer unprepared questions. Most iviewers ask about common, standard aspects like login, dao, formbeans, forward/redirect, session

[A=authentication]: after successful/unsuccessful login, redirect or forward?
A: login form bean?
A: login form send to ….?
A: which component sends the cookie? any session support offered by struts?
A: time out an idle session?
A: what objects do u put in the session obj, when and by which component

[I=form input interaction] which component does input validation?
I: how is the validation errors sent back? forward or redirect?
I: if successful, forward or redirect?

where else do u use forward? redirect?

M: biz logic in … biz-logic beans? servlets?
M: DAO questions
M: desc the usage of a formbean
v: how many forms? name a few. where do they submit to? controller? which one?
v: error pages?
v: validation error page?
c: routing map?
c: how many controllers? name a few

2ways to sell a security@ECN #RFQ

BW (bid wanted) is an application connecting sellers-without-offer-price (swop)[1] with bidders of our desk (CB)[2]. BW allows swop to sell their asset quickly. Most invitations (ie IFB) result in trades.

Typically, a swop puts up a Request for quotation (RFQ) aka IFB (Invitation For Bid) “1m of bond123. Please bid by 11am (bid time). Will finalize by 1pm (firm time)”. Before bid time, multiple[3] bidders from a single desk can use BW to submit bids and cancel bids. At bid time, the desk elects and sends out a single top bid to the conduit. No cancel-bid is allowed afterward. By firm time, the seller must accept or, by default, reject all bids. Acceptance usually means the trade is finalized, but the winning trader could get a chance for a last look.

Now a comparison of the IFB model and the neoreo model of *selling* bonds. I think these are the 2 primary modes on wall street.
* IFB is advertising without an offer price. Once our replies (bids) go out, no back-out as swop can execute the trade right away.
* neoreo is advertising with an offer price. Once our offer is published, no back-out, as trade can execute right away. Neoreo maintains the sell position.

Q: how do buyer/seller negotiate?
A: i feel BW and neoreo won’t support it. It’s over phone.

Some conduits (like bloomberg and TMC) are transparent, so bidders can see top x bids for each request.

Retail conduit charges higher commission than other conduits. Commission is part (say, 10%) of the bid price. swop receives the net price.

[1] Swops are usually broker-dealers, buy-side firms like blackrock but can also be individual investors.
[2] desk bidders are typically traders but can also be an autobidder.
[3] The same cusip can be bought by different traders

DAO study notes

Data Access Objects can be used in Java to insulate an application from the underlying Java persistence technology, which could be JDBC, JDO, EJB CMP, TopLink, Hibernate, iBATIS. Using Data Access Objects means the underlying technology can be upgraded or swapped without changing other parts of the application.
j4: The DAO completely hides the data source implementation details from its clients.
j4: Because the interface exposed by the DAO to clients does not change when the underlying data source implementation changes, this pattern allows the DAO to adapt to different storage schemes without affecting its clients.
if (dao == null) dao = CatalogDAOFactory.getDAO();
return dao.getProducts(categoryId,…)
comprehensive but simple sample code
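A stripped-down sketch of the pattern. All names here (CatalogDao, InMemoryCatalogDao, CatalogDaoFactory, CatalogService, the sample products) are illustrative stand-ins loosely modelled on the fragment above, not the original sample code. The client only ever sees the interface, so the in-memory implementation could be swapped for a JDBC or Hibernate one inside the factory without touching the client.

```java
import java.util.List;
import java.util.Map;

// The only thing clients depend on.
interface CatalogDao {
    List<String> getProducts(String categoryId);
}

// One possible storage scheme; a JDBC-backed class could replace it.
class InMemoryCatalogDao implements CatalogDao {
    private final Map<String, List<String>> data =
            Map.of("books", List.of("Java Basics", "SQL Primer"));

    public List<String> getProducts(String categoryId) {
        return data.getOrDefault(categoryId, List.of());
    }
}

// The single place that knows which implementation is in use.
class CatalogDaoFactory {
    static CatalogDao getDao() {
        return new InMemoryCatalogDao();
    }
}

// Client code: obtains the DAO lazily and never names a concrete class.
class CatalogService {
    private CatalogDao dao;

    List<String> productsIn(String categoryId) {
        if (dao == null) dao = CatalogDaoFactory.getDao();
        return dao.getProducts(categoryId);
    }
}

class DaoDemo {
    public static void main(String[] args) {
        System.out.println(new CatalogService().productsIn("books"));
    }
}
```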

struts methods called by hollywood

* [P740] actionform bean’s instance method validate() is called by ActionServlet ie the framework — hollywood principle in action

* [P735] actionform bean’s setters are called by ActionServlet ie the framework — hollywood principle in action. Afbean auto-populated.

* [P741] execute() is called by ActionServlet ie the framework — hollywood principle in action

Page numbers -> [[ head first servlet ]]
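A toy illustration of “don't call us, we'll call you”. This is not the Struts API: SimpleForm, SimpleAction and MiniFramework are invented, much simpler stand-ins for ActionForm, Action and ActionServlet. The application only supplies the form and the action, while the framework decides when the setters, validate() and execute() get called.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback contracts implemented by application code.
interface SimpleForm {
    void setField(String name, String value);
    boolean validate();
}

interface SimpleAction {
    String execute(SimpleForm form);
}

class LoginForm implements SimpleForm {
    String user;

    public void setField(String name, String value) {
        if ("user".equals(name)) user = value;
    }

    public boolean validate() {
        return user != null && !user.isEmpty();
    }
}

// Stand-in for the framework: it owns the control flow and calls into application code.
class MiniFramework {
    final List<String> trace = new ArrayList<>();

    String handle(SimpleForm form, SimpleAction action, String userParam) {
        trace.add("setters");
        form.setField("user", userParam);   // form bean auto-populated by the framework
        trace.add("validate");
        if (!form.validate()) return "input"; // back to the input page on validation failure
        trace.add("execute");
        return action.execute(form);
    }
}

class HollywoodDemo {
    public static void main(String[] args) {
        MiniFramework fw = new MiniFramework();
        System.out.println(fw.handle(new LoginForm(), f -> "success", "ann"));
        System.out.println(fw.trace);
    }
}
```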

spring mvc reload,starring wac4ds (AppCx)

wac4ds is the #1 thing to internalize, && the #1 source of initial confusion.

Q: What’s the relationship between a DispatcherServlet and a WebApplicationContext (WAC)?
A: “each DispatcherServlet has its own WebApplicationContext” — according to, which even shows a diagram

Q: To drive home the one-to-one mapping, give me the names of a { DS, WAC } pair?
A: These pairs are by no means behind-the-scenes unsung heroes. golfing-servlet.xml is the WAC if you have a DispatcherServlet registered under the servlet name “golfing” in web.xml.


Struts ActionServlet is similar:


In both spring and struts web.xml, this “golfing” name would subsequently be used for url mapping.

Q: is DispatcherServlet ready to use or must be subclassed?
A: No subclass needed. You give a customized name to the standard DispatcherServlet in web.xml and use it.

Q: where to put the web application context config file?
A: By default, WEB-INF/your_servlet_name-servlet.xml must be present. It is known as the WebApplicationContext (wac4ds) for the DispatcherServlet.

j regex to remove trailing alphabets (from portNum)

        Pattern p = Pattern.compile("[a-zA-Z]*$");
        Matcher m = p.matcher(this.portNum);
                "au-" +
                opp.cardSlot + "-" +
                opp.portNum + "-" +
                this.ontSlot + "-" +
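Since the fragment above is incomplete, here is a self-contained sketch of just the regex part. The class and method names are made up, and the sample port numbers are invented. It uses the same pattern and removes whatever letters sit at the very end of the string.

```java
import java.util.regex.Pattern;

class PortNumCleaner {
    // Zero or more ASCII letters anchored at the end of the input.
    private static final Pattern TRAILING_LETTERS = Pattern.compile("[a-zA-Z]*$");

    // Returns portNum with any trailing letters removed; leading or inner letters stay.
    static String stripTrailingLetters(String portNum) {
        return TRAILING_LETTERS.matcher(portNum).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingLetters("12ab"));   // 12
        System.out.println(stripTrailingLetters("ab12cd")); // ab12
        System.out.println(stripTrailingLetters("7"));      // 7
    }
}
```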

PHP imitating java@@

hard to remember a long list. i think it’s good to “study” or focus on 2 items each week

–promotion items
garbage collector

–the rest
serialize and unserialize(!)
output buffer
copying a var by ref ^ by value (clone)
return by ref
pass by ref
new (without parenthesis)? yes
finalize()? yes. known as __destruct()
constructor? yes but no automatic chain
static (i.e. class-level) attributes and method? yes
final method? yes
private/public/protected attributes AND methods? yes
getter/setter? yes.
extends? yes
interface? yes
abstract method? yes
abstract class? yes
method/attribute overriding? yes
implements iface1, iface 2? yes
object serialization? yes

Junit to test an object’s exact type

Q: My factory returns an object from one of many classes in an inheritance tree. I know which particular class should be used to instantiate the object. How can Junit test that?

A: Here’s my code for FTTP, where ONTDataPortFactory should return the right type of object:
result = ONTDataPortFactory.scan4ontDataPort(xe);
assertEquals(TellabsDataPort.class, result.getClass());
explains instanceof, getClass() and other things.
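Why getClass() rather than instanceof? A plain-Java sketch without the JUnit dependency. The three classes are empty stand-ins, and TellabsGigPort is a hypothetical subclass invented to show the difference. instanceof is satisfied by any subclass, whereas comparing getClass() pins down the exact runtime type.

```java
class DataPort { }

class TellabsDataPort extends DataPort { }

// Hypothetical subclass, only here to show where the two checks disagree.
class TellabsGigPort extends TellabsDataPort { }

class ExactTypeDemo {
    // True only when the runtime class of o is exactly the expected class.
    static boolean isExactly(Object o, Class<?> expected) {
        return o != null && o.getClass() == expected;
    }

    public static void main(String[] args) {
        DataPort p = new TellabsGigPort();
        System.out.println(p instanceof TellabsDataPort);        // true: subclass passes
        System.out.println(isExactly(p, TellabsDataPort.class)); // false: not the exact type
    }
}
```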

poorly modularized

An interviewer described how 2 C++ developers from Wealth Management were asked to help a quantitative project for 2 weeks. I immediately commented that the design must be highly modularized. I felt the #1 success factor in that /context/ was modularization.

In my experience, poorly modularized code steepens the initial learning curve and impedes unit testing and debugging.

The new developers would have to tread very carefully, in fear of breaking client code. Poorly modularized code in my projects had less clean interfaces between a client (a dependent) and a provider module. High cohesion leads to cleaner and simpler interfaces.

Perhaps the #1 characteristic of modularization is cohesion. Highly modularized designs almost always show high cohesion. The module given to the 2 “outsiders[1]” would be more self-contained and concerned with just 1 or very few responsibilities. Look at the old FTTP loader class. The old loader has to take care of diverse responsibilities including instantiation logic for uplink ports and gwr ports. I feel a separate uplink factory is high-cohesion. Separation of concerns.

Object-oriented design is a specialized version of Module-oriented design. Most of the OO design principles apply to modularization. My comment basically /reflected/ the importance of clean and /disciplined/ OO design.

[1] zero domain knowledge.

justification for using factories ] FTTP parser

* Justification for moving scan4Uplink() from FTTPCktLoader into a separate uplink factory
* Indeed it was proven functionally acceptable to leave scan4Uplink() in FTTPCktLoader. However, this new design offers several potential benefits and follows a few design principles.
* – encapsulating changes. Lots of potential changes are now isolated to this class. Look at the “business logic” in this factory, and you may agree it can change.
* – cohesion. It’s best to make one class responsible for one thing only. This factory is relatively high-cohesion: responsible for instantiating UplinkPort and nothing else. FTTPCktLoader is low-cohesion: already responsible for many things, it does not need to take care of UplinkPort instantiation as well. If a class is responsible for 2 relatively unrelated things, it has 2 reasons to change and attracts too many changes.
* – readability. Thanks to cohesion, this class is clearly self-contained, not entangled. Left in, this code would appear to be interdependent with other methods.
* – testability. Left in, you can test this code too, but not as a unit-test.
* – extensibility. It’s easy to add functionalities into this factory.

batch feature wishlist — integrate

Integrate with DB and other external systems

* sensible default DB commit-frequency. Batch jobs need a different commit frequency than interactive applications. P505 [[ mysql stored procedures ]] Without this feature, a novice batch user would commit on every update. Commit frequency can be configurable.
* [x] Expect capabilities — integrate with interactive systems
* [x] ability to restart a remote server
* integration with stored procedures and triggers
* integration with ftp, auto-reconnect
* [x] integration with servlets etc, with support for cookies[x]. batch^interactive? A batch can flood and overwhelm a URL.
* avoid db contention due to concurrent updates to the same “page” ie data block. Spread updates over separate data blocks for parallel updates.
* resilient to network outage. Batch vs interactive?
* integration with MQ, usually as a sender

— infrastructure support
* adjustable load level on external partner systems. Batch jobs can generate higher load than normal on unprepared “trading-partners”. Persistent XML configuration. Ideally, load level is adjustable in real time.
* detect peak hours for a database/URL and back off. For a mission-critical DB, our friend’s website, a shared scarce resource, or a resource-tight website, our batch may need extra intelligence to detect changes to peak hours, instead of a hard-coded schedule. It takes just a single outbreak of flooding (due to a change in peak-hour schedule) to bring down a site, damage trust, spoil our image, or even generate bad publicity.

batch feature wishlist (WallSt) — Undo

Let’s focus on undo. Undo is esp. important for Wall St EOD batch applications because a batch owner is often unable to anticipate the full range of possible input combinations and other environmental conditions that could lead to a large number of mistakes going unnoticed. In fact, when a batch owner puts in a code fix, it’s usually after a disaster, after the batch has unknowingly processed (many) unexpected records.

* undo a specific step for all records
* undo a specific record
* undo a subset of records, based on criteria derived from log analysis
* undo an entire batch

Undo implementation? Perhaps infrastructure support is needed. Look at InnoDB in MySQL: low-level support is needed.

batch feature wishlist

[x = lesser-known but fairly regular requirement in my experience]
A “record” means one of a (potentially large) number of input data to be processed

* [x] step-by-step manual confirmation, each with a single keystroke. Just like rm -i
* skip certain steps
* reshuffle some steps — arguably tapping on one of the strengths of interpreted languages.
* [x] re-run a certain step only
* share codebase with other on-going projects, to avoid forking and ease maintenance
* persistent xml config + command-line config
* be nice (Unix terminology) to other processes. Batch jobs can quickly eat up shared resources.

— infrastructure support needed, because standard batch languages can’t
* self-profiling and benchmarking on the batch application, to record time/mem/DB/bandwidth… usage for performance analysis
* scheduled retry or manual retry
* “easy” multi-threading (with data sharing) to exploit multi-threaded processors like our T2000’s 32 kernel threads. Multi-threading is non-trivial, esp. with data sharing. Many batch developers won’t have the time/expertise to create it or test it. Infrastructure support could lower the barrier and bring multi-threading to the “masses”

OO features in FTTP Parser

* depend on interfaces, not concrete classes
* heavy use of static methods
* validity interface
* replicate LISP multiple inheritance
* break long methods into multiple screenful methods
* maximize object reuse and minimize duplicate objects
* singleton in multi-threaded environment

— OO-related features
* custom exceptions
* JUnit
* thread-safety

1 singleton 2 instances

Hi guys,

I think we have a few methods like this, which could produce 2 instances of the singleton.

public static FTTPCktLoader getInstance() { 
1) if (_instance != null) { 
2)   return _instance; 
3) } 
4) synchronized (FTTPCktLoader.class) { 
5)   _instance = new FTTPCktLoader(); 
6) } 
7) return _instance;
}

Thread A goes through 123 and gives way to other threads
Thread B goes through 123, since _instance is still null
B gets the lock and goes through 4567 and returns a new object to its caller method
B gives way to other threads
A gets the lock and goes through 4567 and returns a second object to its caller method

Now, the caller method (probably in LoaderFactory) may be synchronized, but that would not be a safe design. Another programmer may write another caller method that calls getInstance() without acquiring any lock.

I’ll find a new version later, hopefully more thread-safe.
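One candidate, offered as a sketch rather than the version we end up using: the initialization-on-demand holder idiom. This is a different technique from the synchronized block above. CktLoader is a cut-down stand-in for FTTPCktLoader, and SingletonDemo exists only to exercise it. The JVM guarantees a class is initialized at most once, so no explicit lock is needed in getInstance().

```java
class CktLoader { // hypothetical stand-in for FTTPCktLoader
    private CktLoader() { }

    // Holder is initialized by the JVM on first use, at most once, even under contention.
    private static class Holder {
        static final CktLoader INSTANCE = new CktLoader();
    }

    static CktLoader getInstance() {
        return Holder.INSTANCE;
    }
}

class SingletonDemo {
    // Calls getInstance() from two threads and reports whether both got the same object.
    static boolean sameInstanceFromTwoThreads() {
        CktLoader[] seen = new CktLoader[2];
        Thread a = new Thread(() -> seen[0] = CktLoader.getInstance());
        Thread b = new Thread(() -> seen[1] = CktLoader.getInstance());
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0] != null && seen[0] == seen[1];
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceFromTwoThreads());
    }
}
```

A smaller change to the existing method would be to repeat the null check inside the synchronized block, with _instance declared volatile. The holder idiom just avoids having to get that detail right.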