git | reword historical commit msg

Warning — may not work if there’s a recent merge-in on your branch

Find the target commit and its immediate parent commit.

git rebase -i the.parent.commit

The first commit in the list will be your target commit. Change ‘pick’ to ‘r’ (short for reword) on that line and leave the other commits alone. Git then opens your editor (vim in my case) on the original bad commit msg. Once you save and quit, the rebase completes, usually without error.

Now you can reword subsequent commit messages.

opaque challenge:(intermediate)data browser #push the limit

Opacity is a top-3 (possibly the biggest) fear and burden in terms of figure-things-out-relative-to-coworkers on a localSys. The most effective and direct solution is some form of instrumentation tool for intermediate data. If you develop or master an effective browsing tool, however primitive, it would likely become a competitive advantage in terms of figure-out speed, and consequently earn you some respect.

LocalSys — I feel most of the effective data browser knowledge is localSys knowledge.

If you are serious about your figure-out speed weakness, if you are seriously affected by the opaque issues, then consider investing more time in these browsing tools.

Hard work, but worthwhile.

  • eg: Piroz built a Gemfire data browser and it became crucial in the Neoreo project
  • #1 eg: in my GS projects, the intermediate data was often written into RDBMS. Also important — input and output data are also written into RDBMS tables. Crucial in everyday trouble-shooting. I rank this as #1 in terms of complexity and value. Also this is my personal first-hand experience
  • #2 eg: RTS rebus — during development, I captured lots of output CTF messages as intermediate data… extremely valuable
    • Once deployed, QA team relied on some web tools. I didn’t need to debug production issues.
    • I remember the static data team saved their static data to RDBMS, so they relied on the RDBMS query tool on a daily basis.
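
As a rough illustration of the RDBMS examples above, here is a minimal sketch of capturing intermediate data in a table and browsing it with plain SQL. It assumes Python's built-in sqlite3 as a stand-in for the real database. The table name `intermediate`, the stage names and the helpers `open_store`, `record` and `browse` are all hypothetical, not taken from any of the projects mentioned.

```python
import json
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the store; sqlite3 stands in for the real RDBMS."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS intermediate ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " stage TEXT NOT NULL,"       # hypothetical pipeline stage name
        " item_key TEXT NOT NULL,"    # e.g. an order id
        " payload TEXT NOT NULL)"     # JSON snapshot of the data at that stage
    )
    return conn

def record(conn, stage, item_key, data):
    """Save a snapshot of one item as it leaves a processing stage."""
    conn.execute(
        "INSERT INTO intermediate (stage, item_key, payload) VALUES (?, ?, ?)",
        (stage, item_key, json.dumps(data, sort_keys=True)),
    )
    conn.commit()

def browse(conn, item_key):
    """Return every snapshot of one item, in the order it was recorded."""
    rows = conn.execute(
        "SELECT stage, payload FROM intermediate WHERE item_key = ? ORDER BY id",
        (item_key,),
    )
    return [(stage, json.loads(payload)) for stage, payload in rows]

conn = open_store()
record(conn, "input", "order-1", {"qty": 100, "px": 9.5})
record(conn, "enriched", "order-1", {"qty": 100, "px": 9.5, "notional": 950.0})
record(conn, "input", "order-2", {"qty": 5, "px": 2.0})
for stage, data in browse(conn, "order-1"):
    print(stage, data)
```

Even a primitive store like this lets you answer "what did this item look like after stage X" with one query, which is the kind of browsing the examples above relied on.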

Now some negative experiences

  • eg: I’m not too sure, but during the Stirt QZ dev projects I didn’t develop enough investigation skill esp. in terms of checking intermediate data.
  • eg: in Mvea, we rely on net-admin commands to query order state, flow-element state and specific fills… Not as convenient as a data store. I never became proficient.
    • I would say the FIX messages are logged consistently and serve as input and output data browser.

Many projects in my recent past have no such data store. I don’t know if there’s an effective solution to the opacity, but everyone else faces the same issue.

## time-honored textual data file formats

[[art of Unix programming]] has insights on it… Note Windows tends to use binary file formats, whereas the Unix tradition favors textual ones.

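The standard csv format is not well designed. Its double-quote convention is criticized: a field containing the separator must be wrapped in double quotes, and a double quote inside such a field must itself be doubled.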

The /etc/passwd format supports metachar embedding better than standard csv.
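
To make the comparison concrete, here is a small sketch that encodes the same row both ways. The csv side uses Python's standard csv module with its default settings. The colon-delimited side assumes the backslash-escape convention usually associated with passwd-style files; that is my assumption, and real /etc/passwd readers may not honor escapes at all. `csv_line`, `dsv_join` and `dsv_split` are hypothetical helper names.

```python
import csv
import io

def csv_line(fields):
    """Encode one row the way the csv module does by default."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    return buf.getvalue().rstrip("\r\n")

def dsv_join(fields, sep=":"):
    """Encode one row passwd-style, backslash-escaping the metacharacters."""
    def esc(field):
        return field.replace("\\", "\\\\").replace(sep, "\\" + sep)
    return sep.join(esc(f) for f in fields)

def dsv_split(line, sep=":"):
    """Decode a row written by dsv_join."""
    fields, cur, chars = [], [], iter(line)
    for ch in chars:
        if ch == "\\":
            cur.append(next(chars, ""))   # take the escaped character literally
        elif ch == sep:
            fields.append("".join(cur))
            cur = []
        else:
            cur.append(ch)
    fields.append("".join(cur))
    return fields

row = ['say "hi"', "a,b", "c:d"]
print(csv_line(row))   # embedded quotes are doubled and the field is wrapped in quotes
print(dsv_join(row))   # only the separator (and backslash) needs an escape
```

With a single escape character the decoder is a simple left-to-right scan and needs no quoted-field state, which may be part of why the colon-delimited style is considered friendlier to embedded metacharacters.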

The “record-jar” format combines cookie-jar with RFC-822 format and is more human-friendly than xml, ini, and csv. It has field names. Its values can be unstructured text. It also supports “stanza” records.
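
Here is a minimal sketch of a record-jar reader, assuming the commonly described layout: records separated by a line starting with %%, each record made of "Name: value" lines, with indented lines continuing the previous value. The function name `parse_record_jar` and the sample records are made up for illustration, and the sketch ignores edge cases such as duplicate field names.

```python
def parse_record_jar(text):
    """Parse record-jar text into a list of dicts (one dict per record)."""
    records, current, last_name = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue                          # blank lines are ignored in this sketch
        if line.startswith("%%"):             # record separator
            if current:
                records.append(current)
            current, last_name = {}, None
        elif line[0] in (" ", "\t") and last_name is not None:
            # continuation line: fold into the previous field's value
            current[last_name] += " " + line.strip()
        elif ":" in line:
            name, _, value = line.partition(":")
            last_name = name.strip()
            current[last_name] = value.strip()
    if current:
        records.append(current)
    return records

# Made-up sample data
sample = """\
Service: pricing
Owner: team-a
%%
Service: booking
Note: values can be free text,
  and may continue on indented lines
%%
"""
for rec in parse_record_jar(sample):
    print(rec)
```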

xml can be checked for well-formedness even without a DTD or schema.
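
A quick sketch of such a check, using the parser in Python's standard library. No DTD or schema is involved: the parser either accepts the text or raises ParseError. The helper name `is_well_formed` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, None

print(is_well_formed("<trade id='1'><qty>100</qty></trade>"))
print(is_well_formed("<trade id='1'><qty>100</trade>"))   # mismatched tag
```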

git | backup b4 history rewrite

Personal best practice. A git history rewrite is not always a risk-free no-brainer. Once in 100 times it can become nasty.

It should never affect file content, so at the end of the rewrite we need to diff against a before-image to confirm no change.

The dumbest (and most foolproof) before-image is a zip of entire directory but here’s a lighter alternative:

  1. git branch b4rewrite/foo
  2. git reset or rebase or cherry-pick
  3. git diff b4rewrite/foo

Note the branch name can be long, but keep it explicit so I can delete it later without doubt.

contents to keep in .C rather than .H file

1) Opening example — Suppose a constant SSN=123456789 is used in a1.cpp only. It is therefore a “local constant” and should be kept in a1.cpp not some .H file.  Reason?

The .H file may get included in some new .cpp file in the future. So we end up with multiple .cpp files dependent (at compile-time) on this .H file. Any change to the value or name of this SSN constant would then force recompilation not only of a1.cpp but, unnecessarily, of the other .cpp files too 😦

2) #define and #include directives: keep them in a1.cpp as much as possible, not in .H files. This way, any change to the directives only requires recompiling a1.cpp.

The pimpl idiom and forward-declaration use similar techniques to speed up recompile.

3) documentation comments: some of these are subject to frequent change. If put in the .H file, any comment change would trigger recompilation of multiple .cpp files.

edit 1 file in big python^c++ production system #XR

Q1: suppose you work in a big, complex system with 1000 source files, all in python, and you know a change to a single file will only affect one module, not a core module. You have tested it and run a 60-minute automated unit test suite. You didn’t run the prolonged integration test that’s part of the department-level full release. Would you and the approving managers have the confidence to release this single python file?
A: yes

Q2: change “python” to c++ (or java or c#). You already followed the routine to build your change into a dynamic library, tested it thoroughly and ran unit test suite but not full integration test. Do you feel safe to release this library?
A: no.

Assumption: the automated tests were reasonably well written. I never worked in a team with a measured test coverage. I would guess 50% is too high and often impractical. Even with high measured test coverage, the risk of bug is roughly the same. I never believe higher unit test coverage is a vaccination. Diminishing return. Low marginal benefit.

Why the difference between Q1 and Q2?

One reason — the source file is compiled into a library (or a jar), along with many other source files. This library is now a big component of the system, rather than one of 1000 python files. The managers will see a library change in c++ (or java) vs a single-file change in python.

Q3: what if the change is to a single shell script, used for start/stop the system?
A: yes. Manager can see the impact is small and isolated. The unit of release is clearly a single file, not a library.

Q4: what if the change is to a stored proc? You have tested it and run the full unit test suite but not a full integration test. Will you release this single stored proc?
A: yes. One reason is transparency of the change. Managers can understand this is an isolated change, rather than a library change as in the c++ case.

How do managers (and anyone except yourself) actually visualize the amount of code change?

  • With python, it’s a single file so they can use “diff”.
  • With stored proc, it’s a single proc. In the source control, they can diff this single proc. Unit of release is traditionally a single proc.
  • With c++ or java, the unit of release is a library. What if this new build, besides your change, includes some other change by accident? You can’t diff a binary 😦

So I feel transparency is the first reason. Transparency of the change gives everyone (not just yourself) confidence about the size/scope of this change.

Second reason is isolation. I feel a compiled language (esp. c++) is more “fragile” and the binary modules more “coupled” and inter-dependent. When you change one source file and release it in a new library build, it could lead to subtle, intermittent concurrency issues or memory leaks in another module, outside your library. Even if you as the author see evidence that this won’t happen, other people have seen innocent one-line changes giving rise to bugs, so they have reason to worry.

  • All 1000 files (in compiled form) run in one process for a c++ or java system.
  • A stored proc change could affect DB performance, but it’s easy to verify. A stored proc won’t introduce subtle problems in an unrelated module.
  • A top-level python script runs in its own process. A python module runs in the host process of the top-level script, but a typical top-level script will include just a few custom modules, not 1000 modules. Much better isolation at run time.

There might be python systems where the main script actually runs in a process with hundreds of custom modules (not counting the standard library modules). I have not seen it.
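
One way to check this on a real system is to count the non-standard-library modules loaded in a running process. The sketch below is only a rough approximation: it relies on sys.stdlib_module_names (available from Python 3.10), it counts third-party packages as "custom" too, and the helper name `loaded_custom_modules` is hypothetical.

```python
import sys

def loaded_custom_modules():
    """Top-level names of loaded modules that are not in the standard library."""
    # On Python older than 3.10 this falls back to an empty set,
    # so standard library modules would be counted as well.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    names = set()
    for full_name in list(sys.modules):
        top = full_name.partition(".")[0]
        if top in stdlib or top in sys.builtin_module_names:
            continue
        if top.startswith("_"):      # private helper modules, including __main__
            continue
        names.add(top)
    return sorted(names)

mods = loaded_custom_modules()
print(len(mods), "non-stdlib top-level modules loaded:", mods[:10])
```

Calling this from inside a long-running top-level script would show how many custom modules really share its process.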

what to test #some brief observations

For a whitebox coder self-test, it’s tough to come up with realistic corner cases. It takes insight into the many systems that make up the real world. I would argue this type is the #1 most valuable kind of regression test.

If you see a big list of short tests, then some of them are trivial and simply take up space and waste developer time. So I’m a big fan of the elaborate testing frameworks.

Some blackbox tester once challenged a friend of mine to design test cases for an elevator. Presumably the inputs are the buttons on each level plus the in-lift buttons.

%%jargon — describing c++ templates

The official terminology describing class templates is clumsy and inconsistent with java/c#. Here’s my own jargon. Most of the words (specialize, instantiate) already have a technical meaning, so I have to pick new words like “concretize” or “generic”

Rule #1) The last Word is key. Based on [[C++FAQ]]
– a class-template — is a template not a class.
– a concretized template-class — is a class not a template
————————–

“concretizing” a template… Official terminology is “instantiating”

A “concretized class” = a “template class”. Don’t confuse it with a “concrete” class, which means non-abstract.

An “unconcretized / generic class template” = a “class template”.
(Note I won’t say “unconcretized class” as it’s self-contradictory.)

A “non-template” class is a regular class without any template.

Note concretizing = instantiating, completely different from specializing!

“dummy type name” is the T in the generic vector<T>

“type-argument”, NOT “parameter”, refers to an actual, concrete type like the int in vector<int>

renaming folder in svn

C) Conservative approach – clone the folder, verify it, check in, observe [3], then schedule to remove obsolete folder.

** Caveat – Before removal, everyone needs to know the old folder is obsolete and not to be used.
** Caveat – history lost

[3] it can take months to uncover a problem induced by the renaming.

D) The Tortoise “rename” is convenient but potentially dangerous. Usually works. Acceptable on unimportant svn folders. Use your judgement.

%%jargon – Consumer coder, Consumer class

When we write a utility, an API, or a data class to be used by other programmers or as components (or “services” or “dependencies”) in other modules, we often strain to find an unambiguous and distinct term that refers to “the other side” whom we are working to serve. The common choices of words are all ambiguous due to overload —
“Client” can mean client-server.
“User” can mean business user.
“App developer”? Both I and “the other side” are app developers.

My Suggestions —

How about “downstream coder”, or “downstream classes”, or “downstream app” ?

How about “upper-layer coder”, “upper-layer classes”, “upper-layer app”, “upper-layer modules”
How about “upper-level coder”, “upper-level classes”, “upper-level app”, “upper-level modules”
How about “Consumer coder”, “Consumer class”, or “Consumer app”?

quiet confidence on go-live day

I used to feel “Let’s pray no bug is found in my code on go-live day. I didn’t check all the null pointers…”

I feel it’s all about …blame, even if the manager makes it a point to avoid blame.

Case: I once had a timebomb bug in my code. All tests passed but production system failed on the “scheduled” date. UAT guys are not to blame.

Case: Suppose you used hardcoding to pass UAT. If things break on go-live, you bear the brunt of the blame.

Case: if a legitimate use case is mishandled on go-live day, then
* UAT guys are at fault, including the business users who signed off. Often the business comes up with the test cases. The blame question is “why wasn’t this use case specified?”
* Perhaps a more robust exception framework would have caught such a failure gracefully, but often the developer doesn’t bear the brunt of the blame.
**** I now feel business reality discounts code quality in terms of airtight error-proofing
**** I now feel business reality discounts automated testing for Implementation Imperfections (II). See http://bigblog.tanbin.com/2011/02/financial-app-testing-biz.html

Now I feel if you did a thorough and realistic UAT, then you should have quiet confidence on go-live day. Live data should be no “surprise” to you.

tuning/optimization needs "targeted" input data

My personal focus is performance profilers for latency engineering, but this principle also applies to other tuning — tuning needs “targeted” even “biased” input data, biased towards the latency-critical scenarios.

See also the post on 80/20.

The “80/20” item in [[More Effective C++]] points out that we must feed a valid input data set to the performance profiler. I think it’s easy to get carried away by a good quality profiler’s findings, without realizing that the really important users (who?) or the critical operations may not be represented by the test case.

This sounds like more than an engineer’s job duty — find the relevant input data sets that deserve performance-profiling most. Perhaps the business analyst or project manager knows who and what functionality is performance-critical.
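
As a sketch of what “targeted” input could look like in practice, the snippet below filters a recorded workload down to the latency-critical items before handing it to Python's cProfile. Everything about the workload is assumed for illustration: the `critical` flag, the `price` function standing in for the code being tuned, and the helper name `profile_targeted`. In reality the flag would come from whoever knows which users and operations matter.

```python
import cProfile
import io
import pstats

def price(order):
    """Hypothetical unit of work standing in for the code being tuned."""
    return sum(i * order["qty"] for i in range(order["legs"] * 1000))

def profile_targeted(workload, is_critical, top=5):
    """Profile only the inputs that the business considers latency-critical."""
    targeted = [item for item in workload if is_critical(item)]
    profiler = cProfile.Profile()
    profiler.enable()
    for item in targeted:
        price(item)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(top)
    return len(targeted), out.getvalue()

workload = [
    {"id": 1, "qty": 10, "legs": 1, "critical": True},
    {"id": 2, "qty": 500, "legs": 40, "critical": False},  # big but not urgent
    {"id": 3, "qty": 20, "legs": 2, "critical": True},
]
count, report = profile_targeted(workload, lambda o: o["critical"])
print(count, "critical inputs profiled")
print(report)
```

Without the filter, the large non-critical item would dominate the profile and could steer the tuning effort toward code paths that the important users never wait on.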

80/20 rule, diminishing return…

In latency (or any other) tuning, I’m always suspicious and critical about common assumptions, best-practices…. I don’t completely subscribe to the “80/20” item in [[more eff c++]] but it raised valid points. One of them was the need for “targeted” input data. See http://bigblog.tanbin.com/2011/01/performance-profilers-needs-targetted.html

Bottleneck is a common example of 80/20….

Diminishing return is another issue. A discussion on DR assumes we have a feedback loop.

feedback loop — It’s so important to have a typical (set of) test setup to measure latency after tuning. Without it we risk shooting in the dark. In many situations that test-setup isn’t easy…. Good feedback loops focus on the most important users. In some trading apps, it’s the most latency-sensitive [1] trades.

Illustration of reliable feedback loop — to test a SQL query you just tuned, you must measure the 2nd time you run the query, because you want all the data to be in cache when you measure.
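
The measure-the-second-run idea can be sketched as a tiny timing harness: discard one or more warm-up runs, then time the warm runs. The helper name `time_warm`, the `fills` table and the query are hypothetical, and sqlite3 in memory is only a stand-in, so it does not show the dramatic cold-versus-warm gap a real database server would.

```python
import sqlite3
import time

def time_warm(run, warmups=1, repeats=5):
    """Discard the first (cold) runs, then return the best warm timing in seconds."""
    for _ in range(warmups):
        run()                       # cold run: fills caches, timing ignored
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return min(timings)

# Hypothetical table and query, using sqlite3 as a stand-in database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fills (order_id INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO fills VALUES (?, ?)",
                 [(i % 100, i) for i in range(10_000)])

def query():
    return conn.execute(
        "SELECT order_id, SUM(qty) FROM fills GROUP BY order_id").fetchall()

print(f"warm run: {time_warm(query) * 1000:.3f} ms")
```

Running the same harness before and after a tuning change gives two numbers taken under the same cache state, which is what makes the comparison meaningful.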

cost/benefit — sometimes there’s too much unknown or complexity in a module, so it would take too much time to get clarity — high cost in terms of man-days or delivery time frame. After the research it may turn out to be a hopeless direction — low benefit

[1] largest? no. Largest trades are often not latency sensitive.

practice blind-walk on a tight rope, and grow as true geek

A fellow developer told me if a c++ class is deployed to production, then it’s better not to rename a method in it.

In another case — an initial system release to production — I was the designer of the spring config loading mechanism. I was so confident about how my code works that I simply dropped an override config file into a special folder and I knew my code would take care of it. A colleague suggested we test it in QA environment first. I said we need to develop some level of self-confidence by deliberately taking small leaps of faith in real production environment.

I developed similar self-confidence in GS when I directly ran update/delete in production database (when no choice). My release was approved on the last day of the monthly cycle. I had just 2 hours to finish the release. So I put the entire month-end autosys batch (tens of millions of dollars of payout processing) on-hold to complete my release. However, the first run produced bad data in the production DB, so I had to clean it up, tweak the stored proc and table constraints a bit before rerun. I stayed till 11pm. If I were to test everything in QA first, it would take 3 times longer and I would lose concentration, and would not be able to react to the inevitable surprises during such a complex release.

As a developer, if you always play safe and test, test, test, and dare not trust your own memory and deductive thinking, then you will not improve your memory and deduction. You will not quickly master all the one-liners, all the IDE refactors, …

Top Lasik surgeons can perform the operation swiftly (which is good for the patient). A young tight-rope walker will one day need to practice blind-walk. Master wood-carving artists develop enough self-confidence to not need to pencil down the cuts before cutting…

make your app self-sustainable after you are gone

I always hope that my app can run for years after I'm gone, not get killed soon after. Here are some realities and some solutions.

If your app does get killed soon after, then it's a management strategic error. Waste of resources.

For a 6-month contractor on wall street, it's quite common for the code to be maintained by an inexperienced guy. Knowledge transfer is usually rather brief and ineffective. (“Effective” means new maintainer runs the app for a few cycles and trace log/source code to resolve issues independently, with author on stand-by.)

All your hacks put into the code to “get it to work” are lost with you when you are gone.

Tip: really understand the users' priorities and likely changes. Try to identify the “invariant” business logic. The implementation of this core logic is more likely to survive. Probably more than 50% (possibly 90%) of your business logic isn't invariant. If really 90% of the logic isn't invariant, then the requirements are unstable and any code written by anyone is likely a throw-away.

Tip: thoroughly test all the realistic use cases. Don't waste time on the imaginary ones. See http://bigblog.tanbin.com/2011/02/financial-app-testing-biz.html
Tip: don't waste time writing 99 nice-looking automated tests. Focus on the realistic use cases. Automated tests take too much effort to understand. Maintainer won't bother with it since she already has too much to digest.

Tip: use comment/wiki/jira to document your hacks. I don't think high-level documentation is useful, since developers can figure it out. Since time is very limited, be selective about what constitutes a “hack”. If they can figure it out then don't waste time documenting in wiki. Source code comment is usually the best.
Tip: get new maintainer involved early.
Tip: pray that requirements don't change too much too fast after you are gone.

code change while switching JDBC library@@

When a new vendor implements the JDBC api, they create classes implementing the java.sql.PreparedStatement interface and ship the class files of their new PreparedStatement classes. They don’t need to ship PreparedStatement.class itself, which is standard and unchanged. Suppose I want to change my java app to use this vendor. My code is written against the standard interface and never names the vendor’s class, so the SC (source code) change is minimal: typically just the driver class name or connection URL, and that often lives in configuration rather than code. That’s the “state of the art” in java.

The C solution to “API standardization” is different but time-honored. For example, if a vendor implements pthreads, they implement pthread_create() function (and structs like pthread_mutex_t) in a new .C file. They ship the binary. Now suppose I want to change my working C application and switch to this vendor, from an older implementation…

Q: what SC changes?
%%A: possibly none. (None means no recompile needed.) When I compile and link, I probably need to pull in the new binary.

Q: Do they have to ship the header file?
%%A: I don’t think so. I feel the header file is part of the pthreads standard.

Now the interesting bit — suppose I want to use a vendor with small extensions. SC change is different — include a slightly different header file, and change my thread function calls slightly. If the SC changes are small enough, it’s practically painless.

Now we see a real-world definition of “API” — If a vendor makes changes that require application SC change, then that impacts the API.

For java, the API includes all members public or protected. For C, it’s more about header files.

## practical app documentation in front office

+ Email? default “documentation” system, grossly inefficient but #1 documentation tool in practice.
+ Source code comments? default documentation system

— above are “organic” products of regular work; below are special documentation effort — tend to be neglected —-

+ Jira? rarely but can be used as organic documentation. easy to link
+ [jsw] Cheatsheet, run book and quick guide? same thing, very popular.
+ [jsw] Troubleshooting guide on common production issues
+ Design document? standard practice but often gets out of date. A lot of effort. Often too high-level and not detailed enough for developers/implementation guys, because it is written for another audience. Some people tell me a requirement spec should be detailed enough so developers could write code against it, but in reality, I have seen extremely detailed specs and developers still had questions needing real-time answers.

+ Unit test as documentation? Not sure. In theory only.
+ Wiki? widespread. Easy to link
+ sharepoint? less widely used than wiki as documentation tool. organized into a file tree.

The more “organic”, the easier, and the better its take-up rate; The more artificial, the more resistance from developers.

[w = wiki]
[j = jira]
[s = sharepoint]

code change right before sys integration test

Reality – each developer has many major projects and countless “noise” tasks simultaneously. Even those “successfully” completed projects have long tails — Never “completely completed” and out of your mind.

Reality – context switch between projects has real cost both on the brain and on your computer screen — If you keep too many files, emails, browsers open, you lose focus and productivity.

Q: for something like SIG, should you wrap up all code changes 3 days before SIT (sys int test) or the night before SIT?
A: night before. During the SIT your mind will be fresh and fully alert. Even if you finish 3 days before, on the night before you are likely to find more changes needed, anyway.

software engineering where@@ wall street@@

When I look at a durable product to be used for a long time, I judge its quality by its details. Precision, finishing, raw material, build for wear and tear…. Such products include wood furniture, musical instruments, leather jackets, watches, … I often feel Japanese and German (among others) manufacturers create quality.

Quick and dirty wall street applications are low quality by this standard, esp. for code maintainers.

Now software maintainability requires a slightly different kind of quality. I judge that quality, first and foremost, by correctness/bugs, then coverage for edge/corner cases such as null/error handling, automated tests, and code smell. Architecture underpins but is not part of code quality. Neither is performance, assuming our performance is acceptable.

There's a single word that sums up what's common between manufacturing quality and software quality — engineering. Yes software *is* engineering but not on wall street. Whenever I see a piece of quality code, it's never in a fast-paced environment.

code-cloning && other code smells in fast-paced GTD Wall St

Update — embrace these code smells to survive the coworker-benchmarking

In fast-paced Wall St, you can copy-paste, as long as you remember how many places to keep in-sync. In stark contrast, my ex-colleague Chad pointed out that even a one-liner sybase API call should be made a shared routine and a choke point among several execution paths. If everyone uses this same routine to access the DB, then it’s easier to change code. This is extreme DRYness. The opposite extreme is copy-paste or “code cloning” as some GS veteran described. Other wall st developers basically said Don’t bother with refactoring. I’m extremely uncomfortable with such quick and dirty code smells, but under extreme delivery pressure, this is often the only way to hit deadlines. Similarly,

*Use global variables.
** -Use global static collections. Remember my collection of locks in my ECN OMS kernel?
** -Instead of passing local variables as arguments, make them static or instance fields.
** -Use database or env var as global variables.
** -Use spring context as global variables.

* Tolerate duplicate variables (and functions) serving the same purpose. Don’t bother to refactor, simplify or clean up. Who cares about readability? Not your boss! Maybe the next maintainer but don’t assume that.
** A subclass field and a parent field are required to point to the same object, but due to bugs, sometimes they don’t. Best solution is to clean up the subclass, but don’t bother.
** 2 fields (out of 100) should always point to the same object, but due to bugs, sometimes they don’t. Best solution is to remove one, but don’t bother.
** a field and a field of a method param should always point to the same object, so the field in the param’s class is completely unnecessary distraction. Should clean up the param’s class, but don’t bother.
** a field and method parameter actually serve the same purpose.
*** maybe they refer to the same object, so the method param is nothing but noise. Tolerate the noise.
** tolerate a large number (100’s) of similar methods in the same class. Best to consolidate them to one, but don’t bother.
** tolerate many similar methods across classes. Don’t bother to consolidate them.
** tolerate tons of unused variables. Sometimes IDE can’t even detect those.

– use macro to substitute a checked version of vector/string. It’s cleaner but more work to use non-macro solutions.
– don’t bother with any validation until someone complains. Now, validation is often the main job of an entire application, but if your boss doesn’t require you to validate, then don’t bother. If things break, blame upstream or downstream for not validating. Use GarbageInGarbageOut as your defence.
– VeryLongVeryUgly methods, with multi-level nested if/while
– VLVU signatures, making it hard to understand (the 20 arguments in) a CALL to such a method.
– many methods could be made static to simplify usage — no need to construct these objects 9,999 times. It’s quite safe to refactor, but don’t bother to refactor.
– Don’t bother with null checks. You often don’t know how to handle nulls. You need to ask around why the nulls — extra work, thankless. If you can pass UAT, then NPE is (supposedly) unlikely. If it happens in production, you aren’t the only guy to blame. But if you finish your projects too slow, you can really suffer.
– Don’t bother with return value checks. Again, use UAT as your shield.
– use goto or exceptions to break out of loops and call stacks. Use exception for random jump.
– Ignore composed method pattern. Tolerate super long methods.
-Use if/else rather than virtual functions
-Use map of (String,Object) as cheap value objects
-Forget about DRY (Don’t Repeat Yourself)
-Tolerate super long signatures for constructors, rather than builder pattern ( [[EffectiveJava]] )
– don’t bother to close jdbc connections in the finally block
– open and close jdbc connections whenever you like. Don’t bother to reuse connections. Hurts performance but who cares:)
-Don’t pass jdbc connections or jdbcTemplates. In any context use a public static method to get hold of a connection.
– create new objects whenever you feel like it, instead of efficiently reusing them. Of course stateless objects can be safely reused, but creating new ones is cheap. It hurts performance but who cares about performance?
-Use spring queryForMap() whenever convenient
– Tolerate tons of similar anon inner classes that should be consolidated
– Tolerate 4 levels of inner classes (due to copy-paste).
– Don’t bother with thread pools. Create any number of threads any time you need them.
– tolerate clusters of objects. No need to make a glue class for them
-Use public mutable fields; Don’t bother with getters/setters
-JMX to open a backdoor into production JVM. Don’t insist on password control.
-print line# and method name in log4j. Don’t worry about performance hit
-Don’t bother with factory methods; call ctor whenever quick and dirty
-Don’t bother with immutables.
– tolerate outdated, misleading source documentation. Don’t bother to update. Who cares:)
– some documentations across classes contradict each other. Don’t bother to straighten.

avoid convenience wrappers;xx low-level APIs #RV,socket

See also high^low-level expertise

Many examples of the same observation — convenient high-level wrappers insulate/shield a developer from the (all-important) underlying API. Consequently, in a hardcore interview he can’t demonstrate his experience and mastery in that particular “real” technology — he only knows how to use the convenience wrappers.

  • example — servlet — in one of my GS trading desks, all web apps were written atop an in-house framework “Panels”, which was invented after servlet but before JSP, so it offers many features similar or in addition to JSP. After years of programming in Panels, you would remember nothing at all about the underlying servlet API, because that is successfully encapsulated. Only the Panels API authors work against the underlying servlet API. Excellent example of encapsulation, standardization, layering and division of labor.

Notice the paradox — guys who wrote those wrappers are considered (by colleagues and job interviewers) as stronger knowledge experts than the users of the wrappers. If you think about it, you (as everyone else) would agree that knowledge of the underlying API is more valuable than GTD skill using the wrappers.

  • example (Gemfire): one of my trading engines was heavy on Gemfire. The first thing we created was a bunch of wrapper classes to standardize how to access gemfire. Everyone must use the wrappers.

If you are NOT the developer of those wrappers, then you won’t know much about gemfire API even after 12 months of development. Interviewers will see that you have close to zero Gemfire experience — unfair!

  • example — thread-pool — One of my ML colleagues told me (proudly) that he once wrote his own thread pool. I asked why not use the JDK and he said his home-made solution is simple and flexible — infinitely customizable. That experience definitely gave him confidence to talk about thread pools in job interviews.

If you ask me when a custom thread pool is needed, I can only hypothesize that the JDK pools are general-purpose, not optimized for the fastest systems. Fastest systems are often FPGA or hardware based but too costly. As a middle ground, I’d say you can create your customized threading tools including a thread pool.

  • example (Boost::thread): in C++ threading interviews, I often say I used a popular threading toolkit, Boost::thread. However, interviewers repeatedly asked how the boost::thread authors implemented certain features. If you use boost in your project, you won’t bother to find out how those tools are implemented internally. That’s the whole point of using boost. Boost is a cross-platform toolkit (more than a wrapper) and relies on things like win32 threads or pthreads. Many interviewers asked about pthreads. If you use a high-level threading toolkit, you avoid the low-level pthreads details.
  • example — Hibernate — If you rely on hibernate and HQL to generate native SQL queries and manage transaction and identities, then you lose touch with the intricate multitable join, index-selection, transaction and identity column issues. Won’t do you good for interviews.
  • G5 example — RV — one of my tibco RV projects used a simple and nice wrapper package, so all our java classes don’t need to deal with tibrv objects (like transport, subject, event, rvd….). However, interviewers will definitely ask about those tibrv objects and how they interact — non-trivial.
  • example — JMS — in GS, the JMS infrastructure (Tibco EMS) was wrapped under an industrial-strength GS firmwide “service”. No applications are allowed to access EMS directly. As a result, I feel many (experienced) developers don’t know JMS api details. For example, they don’t use onMessage() much and never browse a queue without removing messages from it. We were so busy just to get things done, put out the fires and go home before midnight, so who has time to read what he never needs to know (unless he plans to quit this company)?
  • example — spring jms — I struggled against this tool in a project. I feel it is a thin wrapper over JMS but still obscures a lot of important JMS features. I remember stepping through the code to track down the call that started the underlying JMS server and client sessions, and i found it hidden deeply and not well-documented. If you use spring jms you would still need some JMS api but not in its entirety.
  • example — xml parsing — After many projects, I still prefer SAX and DOM parsers rather than the high-level wrappers. Interviewers love DOM/SAX.
  • G3 example — wait/notify — i used wait/notify many times, but very few developers do the same because most would choose a high-level concurrency toolkit. (Not every threading toolkit calls wait/notify internally; java.util.concurrent, for one, is built on LockSupport.park/unpark.) As a result, they don’t know wait/notify very well. Strangely, wait/notify are not as complicated as the other APIs mentioned earlier but are tricky to 90% of developers.
  • G7 example — FIX — one of my FIX projects uses a rather elaborate wrapper package to hide the complexity of FIX protocol. Interviewers will ask about FIX, but I don’t know any from that project because all FIX details were hidden from me.
  • example — raw arrays — many c++ authors say you should use vector and string (powerful, standard wrappers) to replace the low-level raw arrays. However, in the real world, many legacy apps still use raw arrays. If you avoid raw arrays, you have no skill dealing with them in interviews or projects. Raw arrays integrate tightly with pointers and can be tricky. Half of C’s complexities stem from pointers.
  • G5 example — DB tuning — when Oracle 10 came out, my Singapore (Strategem) DBA friend told me the toughest DBA job is now easier because oracle 10 features excellent self-tuning. I still doubt that because query tuning is always a developer’s job and the server software has limited insight. Even instance tuning can be very complicated so DBA hands-on knowledge is probably irreplaceable. If you naively believe in Oracle marketing and rely on self-tuning, you will soon find yourself falling behind other DBAs who look hard into tuning issues.
  • G3 example — sockets — i feel almost every place where sockets are used, they are used via wrappers. Boost, ACE, java, c#, python, perl all offer wrappers over sockets. RTS is the exception. However, socket API is critical and a favorite interview topic. We had better go beyond those wrappers and experiment with underlying sockets.

I recently had to describe my network programming experience. I talked about using perl telnet module, which is a wrapper over wrapper over sockets. I feel interviewer was looking for tcp/udp programming experience. But for that system requirement, perl telnet module was the right choice. That right choice is poor choice for knowledge-gain.

The Quoine interviewer Mark was the only hiring manager to appreciate wrappers more than underlying socket knowledge. He is the exception that proves the rule.
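
To make the wait/notify example above concrete, here is a minimal Java sketch of a bounded buffer built directly on wait/notifyAll, with no java.util.concurrent classes. The names BoundedBuffer and BufferDemo are hypothetical and not taken from any project mentioned here. It shows the two details that usually trip people up: waiting inside a while loop, and calling wait/notifyAll only while holding the lock.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical bounded buffer using only intrinsic locks and wait/notifyAll. */
final class BoundedBuffer<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Blocks while the buffer is full. */
    synchronized void put(T item) throws InterruptedException {
        // while, not if: a woken thread must re-check the condition,
        // because another thread may have filled the buffer first.
        while (items.size() == capacity) {
            wait(); // releases the lock on this buffer while waiting
        }
        items.addLast(item);
        notifyAll(); // wake any consumer blocked in take()
    }

    /** Blocks while the buffer is empty. */
    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        T item = items.removeFirst();
        notifyAll(); // wake any producer blocked in put()
        return item;
    }
}

/** Illustrative driver: one producer thread, the caller acts as consumer. */
final class BufferDemo {
    static int sumThroughBuffer(int count, int capacity) {
        BoundedBuffer<Integer> buffer = new BoundedBuffer<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    buffer.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            int sum = 0;
            for (int i = 0; i < count; i++) {
                sum += buffer.take();
            }
            producer.join();
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
    }
}
```

Calling BufferDemo.sumThroughBuffer(100, 2) pushes the integers 1 to 100 through a 2-slot buffer and returns their sum. Replacing either while with an if is the classic mistake an interviewer tends to probe.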

financial app testing – biz scenario^implementation error

Within the domain of _business_logic_ testing, I feel there are really 2 very different targets/focuses – Serious Scenario (SS) vs Implementation Imperfections (II). This dichotomy cuts through every discussion in financial application testing.

* II is the focus of junit, mock objects, fitnesse, test coverage, Test-driven-development, Assert and most of the techie discussions
* SS is the real meaning of quality assurance (QA) and user acceptance (UA) test on Wall St. In contrast II doesn’t provide assurance — QA or UA.

SS is about real, serious scenario, not the meaningless fake scenarios of II.

When we find out bugs have been released to production, in reality we invariably trace root cause to incomplete SS, and seldom to II. Managers, users, BA, … are concerned with SS only, never II. SS requires business knowledge. I discussed with a developer (Mithun) in a large Deutsche bank application. He pointed out their technique of verifying intermediate data in a workflow SS test. He said II is simply too much effort and little value.

NPE (null pointer exception) is a good example of II tester’s mindset. Good scenario testing can provide good assurance that NPE doesn’t happen in any acceptable scenarios. If a scenario is not within scope and not in the requirement, then in that scenario system behavior is “undefined”. Test coverage is important in II, but if some execution path (NPE-ridden) is never tested in our SS, then that path isn’t important and can be safely left untested in many financial apps. I’m speaking from practical experience.

Regression testing should be done in II testing and (more importantly) SS testing.

SS is almost always integrated testing, so mock objects won’t help.

XR’s memory efficiency in large in-memory search (DB diff via xml)

A real world Problem: 2 DB tables have very similar content. The owner decided to periodically export each table into a 200MB xml file. Each row becomes an element in xml. Each field an attribute. Business Users want to compare the rows on-demand via a servlet. Need to show per-field differences.

Performance? A few seconds to compare the xml files and show result.

Due to various overhead while reading and writing files, -Xmx1000m is required.

Insight — duplicate objects are the chief performance issue. The key technique is an object registry (like the SSN in the US), where each recurring object (strings, Floats too!) is assigned a 32-bit Oid. Date strings are parsed into dates and then stored as long integers (memory efficient).  To compare 2 corresponding fields, just compare their Oids. String comparison is much slower than int comparison.

Use an AtomicInteger as Oid auto-increment generator — thread-safe.

Solution:
– load each XML into memory, but never instantiate 2 identical objects. Use the Oid.
** construct an {Object,Oid} hashmap?
– now build object graphs, using the Oid’s. Each row is represented by an int array
– 1m rows become 1m int arrays, where 0th array element is always Column 0, 1st element always Column 1 …
– Since Column 3 is primary key, we build a hashmap (primary key Oid, int array) using 1st xml file
– now we go through the 1m rows from 2nd xml file. For each row, we construct its int array, use the primary key’s Oid to locate the corresponding int array from the hashmap, and compare the 2 int arrays.

Note xml contains many partial rows — some of the 50 fields might be null. However, every field has a name. Therefore our int array always has length 50, with nulls represented by -1.

Table has 50 columns, 400,000 rows. Each field is char(20) to char(50).
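
The steps above can be sketched in Java roughly as follows. This is a simplified illustration and not XR’s actual code: the class and method names (OidRegistry, RowDiff, encode, indexByKey, changedColumns) are hypothetical, the XML parsing is left out, and I assume a ConcurrentHashMap for the {Object,Oid} map so that the AtomicInteger generator and the lookup are both thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical registry: every distinct value gets exactly one int Oid. */
final class OidRegistry {
    static final int NULL_OID = -1; // marks a missing field

    private final Map<Object, Integer> oids = new ConcurrentHashMap<>();
    private final AtomicInteger nextOid = new AtomicInteger(0);

    int oidOf(Object value) {
        if (value == null) {
            return NULL_OID;
        }
        // computeIfAbsent is atomic per key, so equal values never get two Oids.
        return oids.computeIfAbsent(value, v -> nextOid.getAndIncrement());
    }
}

final class RowDiff {
    /** Encodes one row as an int array; index i always holds column i. */
    static int[] encode(OidRegistry registry, Object... fields) {
        int[] row = new int[fields.length];
        for (int i = 0; i < fields.length; i++) {
            row[i] = registry.oidOf(fields[i]);
        }
        return row;
    }

    /** Indexes the rows of the first file by the Oid of their primary key. */
    static Map<Integer, int[]> indexByKey(List<int[]> rows, int keyColumn) {
        Map<Integer, int[]> index = new HashMap<>();
        for (int[] row : rows) {
            index.put(row[keyColumn], row);
        }
        return index;
    }

    /** Returns the column indexes where two rows differ, using int compares only. */
    static List<Integer> changedColumns(int[] left, int[] right) {
        List<Integer> changed = new ArrayList<>();
        for (int col = 0; col < left.length; col++) {
            if (left[col] != right[col]) {
                changed.add(col);
            }
        }
        return changed;
    }
}
```

For each row of the 2nd file, the caller would encode it, look up index.get(row[keyColumn]) and pass both arrays to changedColumns. Both files must share one OidRegistry, otherwise equal values would get different Oids and every field would look changed.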

%%jargon — describing c# enums

Consider public enum Weekday { Mon = 1, Tue, Wed, Thu, Fri, Sat, Sun }. This declares an enum TYPE.

Since these aren’t singletons, we would get many INSTANCES of Weekday.Wed (enum being a value type), but only 7 MEMBERs in this Type. At runtime, we generally work with (pass and receive) INSTANCES, not MEMBERS. Members are a more theoretical concept.

Each MEMBER has a name and an Integer value. Only 7 names and 7 integer values in this Type. (Warning – Names are unique but not integer values.)

We can get name or integer value from an INSTANCE.

automated test in non-infrastructure financial app

I worked in a few (non-infrastructure) financial apps under different senior managers. Absolutely zero management buy-in for automated testing (atest).

“automated testing is nice to have” — is the unspoken policy.
“regression test is compulsory” — but almost always done without automation.
“Must verify all known scenarios” — is the slogan but if there are too many (5+) known scenarios and not all of them critical then usually no special budget or test plan.

I feel atest is a cost/risk analysis for the manager. Just like market risk system. Cost of maintaining a large atest and QA system is real. It is justified on
* Risk, which ultimately must translate to costs.
* speed up future changes
* build confidence

Reality is very different on wall street. [1]
– Blackbox Confidence[2] is not provided by test coverage but by “battle-tested”. Many minor bugs (catch-able by atest but were not) will not show up in a fully “battle tested” system; but ironically a system with 90% atest coverage may show many types of problems once released. Which one enjoys confidence?

– future changes are only marginally enhanced by atest. Many atests become irrelevant. Even if a test scenario remains relevant, test method may need a complete rewrite.

– in reality, system changes are budgeted in a formal corporate process. Most large banks deal with man-months so won’t appreciate a few days of effort saving (actually not easy) due to automated tests.

– Risk is a vague term. For whitebox, automated tests provide visible and verifiable evidence and therefore provide a level of assurance, but i know as a test writer that a successful test can, paradoxically, cover up bugs. I never look at someone’s automated tests and feel that’s “enough” test coverage. Only the author himself knows how much test coverage there really is. Therefore Risk reduction is questionable even at whitebox level. Blackbox is more important from Risk perspective. For a manager, Risk is real, and automated tests offer partial protection.

– If your module has high concentration of if/else and computation, then it’s a different animal. Automated tests are worthwhile.

[1] Presumably, IT product vendors (and infrastructure teams) are a different animal, with large install bases and stringent defect tolerance level.
[2] users, downstream/upstream teams, and managers always treat your code as blackbox, even if they analyze your code. People maintaining your codebase see it as a whitebox. Blackbox confidence is more important than Whitebox confidence.

release test failures and code quality

(blogging again) XR, You mentioned some colleagues often release (to QA) code that can’t complete an end-to-end test. (Did I get you wrong?)

That brings to mind the common observation “95% of your code and your time is handling the unexpected”. That’s nice advertisement and not reality — a lot of my GS colleagues aggressively cut corners on that effort (95% -> 50%), because they “know” what unexpected cases won’t happen. They frequently assume (correctly) a lot of things about the input. When some unexpected things happen, our manager can always blame the upstream or the user so we aren’t at fault. Such a culture rewards cutting-corners.

I used to handle every weird input combination, including nulls in all the weird places. I took the inspiration from library writers (spring, or tomcat…) since they can’t assume anything about their input.

I used to pride myself on my code “quality” before the rude awakening – such quality is unwanted luxury in my GS team. I admit that we are paid to serve the business. Business doesn’t want to pay for extra quality. Extra quality is like the loads of expensive features sold to a customer who just needs the basic product.

As we discussed, re-factoring is quality improvement at a price and at a risk. Business (and manager) may not want to pay that price, and don’t want to take unnecessary risk. I often do it anyway, in my own time.

design patterns often increase code size

Using nested if and instanceof.., you can implement complex logic in much fewer lines, and the logic is centralized and clearly visible.

If you refactor using OO design techniques, you often create many Types and scatter the logic.

Case in point — Error Memos workflow kernel. We could use factory, polymorphism, interfaces…

Case in point — chain of responsibility
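
A small Java sketch of the trade-off, under made-up assumptions: TradeMemo, SettlementMemo and the routing rules are hypothetical and are not the real Error Memos kernel. The centralized version keeps every rule in one method; the polymorphic version needs an interface plus one method per type, so the same rules end up spread over three types.

```java
/** Hypothetical memo types, used by both versions below. */
interface Memo {
    String route(); // used only by the polymorphic version
}

final class TradeMemo implements Memo {
    final long amount;

    TradeMemo(long amount) {
        this.amount = amount;
    }

    @Override
    public String route() {
        return amount > 1_000_000 ? "escalate" : "desk";
    }
}

final class SettlementMemo implements Memo {
    @Override
    public String route() {
        return "ops";
    }
}

/** Centralized version: all routing rules visible in one method. */
final class CentralRouter {
    static String route(Object memo) {
        if (memo instanceof TradeMemo) {
            if (((TradeMemo) memo).amount > 1_000_000) {
                return "escalate";
            }
            return "desk";
        } else if (memo instanceof SettlementMemo) {
            return "ops";
        }
        return "ignore";
    }
}
```

Both versions give the same answers for the two memo types. The difference is where a reader has to look: CentralRouter.route shows the whole decision tree at once, while the polymorphic version requires opening every Memo implementation.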

how I estimate workload #XR

Hi XR,

There’s a lot of theory and many book chapters and courses on app development estimation. I don’t know how managers in my department do it. I’m sure these factors below affect their estimations.

* the right way of solving a problem can shorten a 2-day analysis/discussion to 2 hours, provided someone points out the correct way in the beginning. In our bigger department (100+ developers), many managers make estimates assuming each developer knows the standard (not always best) solutions from the outset. If I am new i could spend hours and hours trying something wrong.

* as in your struts form-validation example, a good framework/design can reduce a 1-month job to 1-week.

* tool — the right software tool can reduce a 3-hour job to 30-minutes. Example — I once mentioned to you testing tools. Also consider automated regression testing.

* A common complaint in my department (100+ developers) — a poorly documented system can turn a 2-day learning curve into a week.

* I suspect my colleagues do fewer test cases than i do, given the same system requirement. This is partly because they are confident and familiar with the system. This could mean 30% less effort over the entire project.

understand DB tables in a relatively large system

What do we mean “relatively large”?
– team size 5 – 15 including system support and developers
– tables 50 – 300

tbudget for a basic understanding of all the autosys jobs?
tbudget for a basic understanding of all the tables? 20h

I see 3 unambiguous groups of tables
– R: reference tables, mostly read-only, populated from upstream
– P: product-specific, “local” tables, since we have several independent streams of products
– C: core tables shared among many products

wiki for enterprise app documentation

Justification: jargon. In non-trivial enterprise app documentation, there are usually too many (hundreds of) jargon terms for an uninitiated reader. A wiki can help. Let’s look at an example. Under generous assumptions, every time a document mentions “Price Spread”, the phrase becomes a hyperlink to the wiki page for that term.

Justification: linking. linking up hundreds of related topics.

Justification: Easy update. Everyone can update every wiki page, without approval.

documentation#le2raymond

Hi Raymond,

You mentioned documentation. Documentation is not only important for users, but also important for you to get a complete understanding of the entire project. After writing the documentation, hopefully you will be able to better describe the system to a new person such as your next client or a new interviewer.

I think in US and Singapore, many interviewers respect job candidates who understand past projects in-depth.

How stable is your job? Are you learning some new technologies such as java, dotnet, SQL? I recently improved my understanding of SQL joins, correlated queries, java OO, java exceptions, java threads, jsp/javabeans, servlet sessions… Now I feel like a stronger survivor.

poorly modularized

An interviewer described that 2 C++ developers from Wealth Management were asked to help a quantitative project for 2 weeks. I immediately commented that the design must be highly modularized. I felt the #1 important success factor in that /context/ was modularization.

In my experience, Poorly modularized code would impede initial learning curve, unit-test and debugging.

The new developers would have to tread very carefully, in fear of breaking client code. Poorly modularized code in my projects had less clean interfaces between a client (a dependent) and a provider module. High cohesion leads to cleaner and simpler interfaces.

Perhaps the #1 characteristic of modularization is cohesion. Highly modularized designs almost always show high-cohesion. The module given to the 2 “outsiders[1]” would be more self-contained and concerned with just 1 or very few responsibilities. Look at the old FTTP loader class. The old loader has to take care of diverse responsibilities including instantiation logic for uplink ports and gwr ports. I feel a separate uplink factory is high-cohesion. Separation of concern.

Object-oriented design is a specialized version of Module-oriented design. Most of the OO design principles apply to modularization. My comment basically /reflected/ the importance of clean and /disciplined/ OO design.

[1] zero domain knowledge.

agile methodology %%xp

Challenge: too many agile principles
Sugg: focus on a few, relate to your personal xp, develop stories —
understandable, memorable, convincing

–xp: frequent releases
Qz, RTS

–xp: devops automation
RTS, Mac

–xp: frequent and direct feedback from users

–xp: frequent changes to requirements

–xp: “Working software is the principal measure of progress”
pushmail: repeated code blocks; many functions sharing 99% identical code;

–xp: “Working software is the principal measure of progress”
nbc: top3.txt (expert brackets) not following mvc

–xp: “Working software is delivered frequently (weeks rather than months)”
nbc: half-finished product with hundreds of bugs are released to user to evaluate

–Agile home ground:

Low criticality
Senior developers
Requirements change very often
Small number of developers
Culture that thrives on chaos

–(traditional) Plan-driven home ground:

High criticality
Junior developers
Low requirements change
Large number of developers
Culture that demands order

app design in a fast-paced financial firm#few tips

#1 design goal? flexibility (for change). Decouple. Minimize colleagues’ source code change.

characteristic: small number of elite developers in-house (on wall street)
-> learn to defend your design
-> -> learn design patterns
-> automate, since there isn’t enough manpower

characteristic: too many projects to finish but too few developers and too little time
-> fast turnaround

characteristic: reputation is more important here than other firms
-> unit testing
-> automated testing

characteristic: perhaps quite a large data volume, quite data-intensive
-> perhaps “seed” your design around data and data models

characteristic: wide-spread use of stored proc, but Many java designs aren’t designed to work well with stored proc. Consider hibernate.
-> learn coping strategies

characteristic: “approved technologies”
characteristic: developers move around
-> maintenance left to other guys
-> documentation is ideally “less necessary” if your design is easy to understand
-> learn documentation tools like javadoc

safety in a dynamic financial environment

* to promote testing, try to write tests first, at least in 1% of the cases. Over time u may find shortcuts and simplified solutions
* Many systems aren’t easy to unit-test or regression-test. To increase the applicability of testing, allocate time/budget to study best practices
* allocate resources to learn automation tools like Perl, Shell, IDE, diff tools, Windows tools
* automated test
* regression test
* automated doc-generation
* keep track of volumes of written messages
* pair-programming
* involve the users early and at every step

1st steps in java design: file naming

Consider choosing package names for your objects that reflect how your application is layered. For example, the domain objects in the sample application can be located in the com.meagle.bo package. More specialized domain objects would be located in subpackages under the com.meagle.bo package. The business logic begins in the com.meagle.service package and DAO objects are located in the com.meagle.service.dao.hibernate package. The presentation classes for forms and actions reside in com.meagle.action and com.meagle.forms, respectively. Accurate package naming provides a clear separation for the functionality that your classes provide, allows for easier maintenance when troubleshooting, and provides consistency when adding new classes or packages to the application.