intuitive – with PUT option, 1st look

When I read financial articles, I find PUT options harder to understand than most derivatives. Here’s my summary

–> You use a put as insurance, when you worry that underlying might fall.
–> A put, as insurance, lets you unload your worthless asset and cash in at a reasonably high strike price

Here’s a longer version

–> You use a put as insurance (on an underlying) at strike $100 when you think that underlying might fall below $100. This insurance lets you “unload” your asset and cash in $100. Note most puts or calls traded are OTM (out of the money).

A good thing about this simplified intuitive definition is, current underlying price doesn’t matter. Specifically, it doesn’t matter whether current underlying price is below or above strike (ie in the money or out).
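
To make the insurance picture concrete, here is a minimal sketch of what the put holder collects at expiry. It assumes a plain European-style put and ignores the premium paid; the class and method names are mine, purely for illustration.

```java
public class PutPayoff {
    /** Payoff to the put holder at expiry: max(strike - spot, 0). Premium is ignored. */
    public static double payoffAtExpiry(double strike, double spotAtExpiry) {
        return Math.max(strike - spotAtExpiry, 0.0);
    }

    public static void main(String[] args) {
        double strike = 100.0;
        // The further the underlying falls below the strike, the more the "insurance" pays
        for (double spot : new double[] {60.0, 100.0, 130.0}) {
            System.out.println("spot=" + spot + " payoff=" + payoffAtExpiry(strike, spot));
        }
    }
}
```

Note the payoff depends only on the strike and the price at expiry, not on where the underlying trades today.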

Q: both a short position (in the underlying) and a put holder benefit from the fall. Any difference?
A: I feel if listed puts are available, then they are preferable to holding a short position. They are probably cheaper and don’t tie up lots of cash.

Q: how about the put writer?
A: (perhaps not part of the “intuitive” first lesson on puts)
A: sound bite: an “insurer”.
A: therefore they don’t want volatility.  They want underlying price to stay high or at least be stable

Simplest underlying is a stock.


exceed copy-paste between win32 and X-windows

Using exceed, it can be a challenge to set up copy-paste between win32
and X windows. I know 2 options.

Note I always enable auto-copy-x-selection and

— option: X-selection-page -> X Selection Associated With Edit
Operations set to Primary —
Lesson? “Primary” is the default. In this mode, don't use the xwin

* Simple-Select (without middle-button or context menu) to copy from
unix, paste to win32? Yes
* Simple-Select (without middle-button or context menu) to copy from
unix, middle-button to paste to unix? yes
* Select from win32, middle-button to paste in unix? Yes
* Select from win32, context-menu->edit->paste in unix? no

— option: X-selection-page -> X Selection Associated With Edit
Operations set to Clipboard —
This is suggested on some webpage. It also enables copy-paste between
unix and windows.

java generic subtyping: arrays^collections

based on [[java generics and collections]]

List<Integer> ints = Arrays.asList(1,2,3);
List<Number> nums = ints; // uncompilable
nums.add(0.001); // the mistake the compiler prevents by rejecting the line above
// list of integer is NOT a subtype of list of numbers, but below, array-of-integer is indeed a subtype of array-of-numbers!
Integer[] intArray = {1,2,3};
Number[] numArray = intArray; // compilable
numArray[0] = 0.001; // throws ArrayStoreException at run time; the compiler lets it through
ArrayList is indeed a subtype of List
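
The array half of the snippet above can be run as is. Here is a small self-contained sketch (class and method names are mine) that catches the run-time failure. It also shows a read-only wildcard parameter as one way to pass a List<Integer> where numbers are expected; that part is my addition, not taken from the snippet.

```java
import java.util.Arrays;
import java.util.List;

public class ArrayVsListSubtyping {
    /** True if storing a Double through a Number[] view of an Integer[] fails at run time. */
    public static boolean arrayStoreFails() {
        Integer[] intArray = {1, 2, 3};
        Number[] numArray = intArray;   // compiles: arrays are covariant
        try {
            numArray[0] = 0.001;        // the JVM checks the real element type here
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    /** Read-only use: the wildcard accepts List<Integer>, List<Double>, etc. */
    public static double sum(List<? extends Number> nums) {
        double total = 0;
        for (Number n : nums) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("array store fails: " + arrayStoreFails());
        System.out.println("sum: " + sum(Arrays.asList(1, 2, 3)));
    }
}
```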

which banks ask differentiating questions

(another blog post)

If an interview question is too tough, then strong or fake candidates both fail. If question is too common, then both pass. Good differentiating questions let the strong candidates shine through. Differentiating candidates depends on

* non-standard (but not obscure) questions, or
* drill-in on standard questions, or
* an interviewer capable of seeing the subtle differences among answers to standard questions.

If you have “none of the above”, then you can't differentiate candidates.

Most interview questions are open-ended. I (not yet a strong candidate) can give super long answers to sleep-vs-wait, interface-vs-abstract-class, arrayList-vs-linkedList, “how to prevent deadlock”…. However, many interview questions are so common that candidates can find perfect standard answers online. A strong interviewer must drill in on such a stock question to test the depth of understanding.

GS, google, lab49, MS, barc cap have good interviewers.

#1 database scale-out technique (financial+

Databases are easy to scale-up but hard to scale-out.

Most databases get the biggest server in the department as it’s memory intensive, I/O intensive and cpu-intensive, taking up all the kernel threads available. When load grows, it’s easy to buy a bigger server — so-called scale-up.

Scale-out is harder, whereby you increase throughput linearly by adding nodes. Application scale-out is much easier, so scalable architects should take early precautions.

If DB is read-mostly, then you are lucky. Just add slave nodes. If unlucky, then you need to scale-out the master node.

Most common DB scale-out technique is sharding, or HORIZONTAL partitioning (cut horizontally, left to right, so each partition holds a subset of the rows). Simplest example: a federated table, one partition per year.

I worked in a private wealth system. biggest DB is the client sub-ledger, partitioned by geographical region into 12 partitions.
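
A rough sketch of how application code might route to a region-based shard. The region codes and shard names below are hypothetical (the real system had 12 partitions and I'm not reproducing its layout); the point is only that the partition key decides which shard a row lives in.

```java
import java.util.HashMap;
import java.util.Map;

public class RegionShardRouter {
    private final Map<String, String> shardByRegion = new HashMap<>();

    public RegionShardRouter() {
        // Hypothetical region codes and shard names, for illustration only
        shardByRegion.put("APAC", "subledger_apac");
        shardByRegion.put("EMEA", "subledger_emea");
        shardByRegion.put("AMER", "subledger_amer");
    }

    /** Picks the shard from the partition key (the client's region). */
    public String shardFor(String region) {
        String shard = shardByRegion.get(region);
        if (shard == null) {
            throw new IllegalArgumentException("no shard for region " + region);
        }
        return shard;
    }

    public static void main(String[] args) {
        System.out.println(new RegionShardRouter().shardFor("EMEA"));
    }
}
```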

We also used Vertical partitioning — multiple narrow tables.

Both vertical and horizontal partitions can enhance performance.

equals method of a class


You are spot on about linked list — If a class N has-a field of type N, then N is almost always, by definition, a node in a graph. That N field is probably a parent node. So allow me to put in some meaningful names — Each GraphNode has-a field named this.parent. Now the question becomes “how to override equals() in GraphNode and deal with the unbounded recursion”.

It’s an unusual technical requirement to make equals() compare all ancestor nodes. However, it’s a reasonable business requirement to compare 2 GraphNodes by comparing all ancestors. Such a business requirement calls for a (static) utility method, NOT an instance method in GraphNode. A static utility method like compareAllAncestor(GraphNode, GraphNode) can be iterative and avoid recursion and stack overflow. Once this static method is in place, I might (grudgingly) create an instance method compare(GraphNode other) which simply returns compareAllAncestor(this, other), without unbounded recursion or stack overflow.
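
Here is one way the iterative compareAllAncestor() could look. It is a sketch under my own assumptions: each node carries a name field that we compare, and the parent chain is acyclic (a tree), so the loop always terminates.

```java
public class AncestorCompare {
    public static final class GraphNode {
        final String name;      // assumed payload to compare at each level
        final GraphNode parent; // null for a root

        public GraphNode(String name, GraphNode parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** Walks both parent chains in lock step with a loop, so stack depth stays constant. */
    public static boolean compareAllAncestor(GraphNode a, GraphNode b) {
        while (a != null && b != null) {
            if (a == b) {
                return true; // same node object, so the remaining ancestors are identical
            }
            if (!a.name.equals(b.name)) {
                return false;
            }
            a = a.parent;
            b = b.parent;
        }
        return a == null && b == null; // both chains must end at the same depth
    }
}
```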

If 2 threads both perform this comparison, then I feel the method may need to lock the entire graph — expensive.

Even in a single-threaded environment, this comparison is expensive. (The recursive version would add an additional memory cost.) Potentially a performance issue. For most graph data structures in business applications, GraphNode should be Serializable and collections-friendly. Therefore hashCode() and equals() should be cheap.

For most graph data structures in business applications, each graph node usually represents a real world entity like a member in a MLM network. Now, if a graph node represents a real world entity, then it’s always, without exception, identifiable by an immutable and unique ID. Usually this ID is saved in database (could also be generated in application). Therefore, in most cases, equals() should compare ID only.
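
A minimal sketch of the ID-only approach, assuming a long ID assigned once (say, from the database). The class name MemberNode is hypothetical. equals() and hashCode() never touch the parent field, so they stay cheap and can't recurse.

```java
import java.util.HashSet;
import java.util.Set;

public class MemberNode {
    private final long id;           // immutable, unique identity
    private final MemberNode parent; // deliberately ignored by equals()/hashCode()

    public MemberNode(long id, MemberNode parent) {
        this.id = id;
        this.parent = parent;
    }

    public MemberNode getParent() {
        return parent;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MemberNode)) return false;
        return id == ((MemberNode) o).id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }

    public static void main(String[] args) {
        Set<MemberNode> seen = new HashSet<>();
        seen.add(new MemberNode(7, null));
        // Same ID, different parent: still treated as the same member
        System.out.println(seen.contains(new MemberNode(7, new MemberNode(1, null))));
    }
}
```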

3 major risk-calculators in an investment bank

Background — Imagine a typical investment bank. The risk engines below are owned by distinct branches of the IT organization. Not integrated (A major shortcoming in risk systems today is such data silos.)  For the bank CRO (Chief Risk Officer), how are these systems related? How do we interpret their risk numbers in a consolidated big picture?

– c-risk (credit risk) systems calculate bank’s potential loss due to defaults OR counter party credit rating drops
– Sophisticated m-risk (market risk) engines calculate expected Market Value drops due to price swings
– L-risk (liquidity risk)? Among other things, it covers capital reserve (Basel). L-risk is Less computerized. Perhaps no daily valuation of assets/liabilities, long and short positions.

— some comparisons among the domains —
There is significant overlap between credit-risk and market-risk processes. In the bigger picture, unrealized loss due to counter party credit is covered by both c-risk and m-risk. Real cash loss (i.e. realized) is the subject of both L-risk and c-risk. Credit risk engine is more about calculating unrealized loss (i.e. MV drop) due to credit quality change. In contrast, realized loss due to default is the subject of liquidity risk.
Unrealized MV loss due to credit quality hurts valuation of loan portfolios and incoming collateral, and hurts our consolidated assets and our own credit rating. Therefore it is a liquidity risk.
At the heart of credit risk analysis (unlike market risk or liquidity risk) is the credit review on individual borrowers/issuers including countries.
M-risk is more quantitative than c-risk and L-risk. Therefore most IT jobs are in m-risk. VaR is the most quantitative domain in finance. The star player in the “team”. Useful for short term m-risk.

For long-term m-risk, Stress-test (aka scenario-test) is the primary risk engine. Stress test is also one of the engines for c-risk estimation.

I feel liquidity risk is more critical to a bank than credit risk or market risk, as liquidity means solvency.
How about collateral valuation engines? I think this straddles c-risk and L-risk systems. Outgoing collateral reduces a bank’s liquidity. Collateral we hold in the form of bonds is valued daily in our c-risk calculator.

How about margin risk calculator (for prime brokerage or listed derivatives)? I assume these margin accounts only hold liquid assets, credit-risk free. In such a case, it’s basically a stress test m-risk engine. Not so much VaR. Not much c-risk. It does hit bank’s capital reserve since collateral adds or reduces a bank’s liquidity.

Now, if a margin risk calculator must support risky bonds in the margin account, then this system might affect m-risk, c-risk and L-risk.

[11] BAU vs green field – Wall St

My Citi mgr divided my work into ProdSupport + Greenfield + BAU i.e. (strategic or quick) enhancements on current production codebase.

In ML, effort is divided into BAU 66% + 33% Greenfield

GS is 75-25

One of the hardest parts in Greenfield is replicating BAU behavior. The closer we get to prod release, the more we rely on prod system knowledge. Those nitty-gritty BAU behaviors become necessary features and show stoppers.

I think recruiters generally see Greenfield as more attractive. More design involved. More experienced candidates required.

A friend said green field becomes stressful as release date approaches.

non-const-non-primitive field in a const method

As stated on P215 [[effSTL]], inside const methods, all fields act like const fields. This is simpler for primitive fields like an int field, but trickier for non-primitive fields … like Animal —

class C{
Animal f;
Animal const & dump() const;
};

Inside dump(),
– we can’t overwrite f like “f = anotherAnimal”
– we can’t call mutator methods on f like “f.grow()” — uncompilable
– we can’t remove the “const” from dump() return type — uncompilable in GCC. See also A common IV quiz.
– if Animal class overloads method age() on const, then compiler will statically bind the “f.age()” to the const version.

All these rules are compiler-enforced. No such check at run-time.

print any container, using copy() and ostream_iterator ctor #noIV

Better commit to memory. See P53 of [[ stl tut ]] and P96 [[essential c++]]

  // vector<int> vector1(10);
  // generate(vector1.begin(), vector1.end(), calc_square()); // generates a series of numbers

  copy (vector1.begin(), vector1.end(), ostream_iterator<int>(cout, " "));

This is an idiom. back_inserter is another idiom. Such idioms aren’t tested in interviews.

avoid calling wait() while holding multiple locks

Avoid calling wait() while holding 2 locks. I always try very hard to call lock1.wait() while holding nothing but lock1. Any lock2 held would NOT be released by lock1.wait().
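
To see this for myself I'd use a small probe like the one below. It is a sketch that swaps in ReentrantLock and Condition.await() for synchronized and wait(), only because tryLock() lets a second thread check which locks are free; the release behaviour is analogous. All names are mine.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class WaitHoldingTwoLocks {
    /** Returns {gotLock1, gotLock2} as seen by the caller while the waiter sits in await(). */
    public static boolean[] probe() {
        ReentrantLock lock1 = new ReentrantLock();
        ReentrantLock lock2 = new ReentrantLock();
        Condition cond = lock1.newCondition();
        CountDownLatch bothHeld = new CountDownLatch(1);

        Thread waiter = new Thread(() -> {
            lock2.lock();
            lock1.lock();
            try {
                bothHeld.countDown();
                cond.await(5, TimeUnit.SECONDS); // releases lock1 only; lock2 stays held
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock1.unlock();
                lock2.unlock();
            }
        });
        waiter.start();
        try {
            bothHeld.await();
            boolean got2 = lock2.tryLock();                    // waiter still owns lock2
            boolean got1 = lock1.tryLock(2, TimeUnit.SECONDS); // free once waiter is in await()
            if (got1) {
                cond.signal(); // wake the waiter so it can finish
                lock1.unlock();
            }
            if (got2) {
                lock2.unlock();
            }
            waiter.join();
            return new boolean[] {got1, got2};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = probe();
        System.out.println("got lock1=" + r[0] + ", got lock2=" + r[1]);
    }
}
```

Any thread that needs lock2 is stuck for as long as the waiter waits, which is exactly the hazard.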

Similarly, avoid grabbing a lock while holding other locks.

Beware of all blocking operations like lock-grabbing, wait(), while loop, join() and sleep(longPeriod). If the blocked thread is the only thread capable of “releasing” some shared resource, then that resource gets locked up along with the thread, starving other threads that need it.

In contrast, tryLock(), wait(shortPeriod), join(shortPeriod) are safer.

call run() +! calling thisThread.start() first

2. What will happen when you call thisThread.run() without calling thisThread.start() first?
answer: doesn’t create a thread
i agree. start() is a hook into the JVM.

It sets up the call stack in the JVM stack segment and those internal memory structures (like instruction pointers) needed by a real thread. Then it points the instruction pointer to the run() method’s address. Then it joins the other Eligible threads to be scheduled by the jvm scheduler. I believe this start() runs on the thread-creator thread, but run() would execute on the new thread. If I’m right, then it’s misleading to say start() calls run(). start() returns after setting up the jvm thread. When the new thread is scheduled to execute, first method executed is run().
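
A quick way to check which thread actually executes run(): record the current thread's name inside the Runnable. This is a small sketch with made-up thread names. Calling run() directly is just an ordinary method call on the caller's thread, whereas start() gets the body executed on the new thread.

```java
public class RunVsStart {
    /** Returns {thread name seen by direct run(), thread name seen after start()}. */
    public static String[] whoRanIt() {
        String[] ranOn = new String[2];

        Thread direct = new Thread(() -> ranOn[0] = Thread.currentThread().getName(), "worker-direct");
        direct.run(); // plain method call: no new thread is created

        Thread started = new Thread(() -> ranOn[1] = Thread.currentThread().getName(), "worker-started");
        started.start(); // returns quickly; run() executes later on the new thread
        try {
            started.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ranOn;
    }

    public static void main(String[] args) {
        String[] r = whoRanIt();
        System.out.println("run() executed on: " + r[0]);
        System.out.println("start() executed on: " + r[1]);
    }
}
```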

This jvm thread exists independent of the thisThread object. Jvm thread can run (till shutdown) even after thisThread object is garbage collected. Conversely, thisThread object can live in the heap without the call stack — this is the situation before we call start(), or after the jvm thread exits.

Physically, the jvm thread occupies a bunch of memory cells (its home, in the stack) in order to do its job. Our thisThread object also occupies a bunch of memory cells (in the heap) to hold indirect pointers to the jvm thread’s home.

STL multiset = sorted Bag

(Q: Have you ever wondered why set/multiset are grouped with map/multimap as Associative containers? Answer revealed at the end.) points out

It is possible for distinct objects to be considered “equivalent” under some equivalence relation but still distinct under another relation. Some types of multiset implementations will store distinct equivalent objects as separate items in the data structure; while others will collapse it down to one version (the first one encountered) and keep a positive integer count of the multiplicity of the element.

STL multiset is the first type. See also P299 [[STL Tutorial and Reference Guide]] written by one of the STL inventors.

STL multiset is a reliable store keeper — never “loses” items that look alike. In contrast, the counting version discards items as _unwanted_duplicate_. STL multiset associates objects to invisible keys — associative containers.

int main() {
    float floats[3] = {1.1,1.2,2};
    multiset<float, less<float> > cont;
    cont.insert(floats, floats+3);
    copy(cont.begin(), cont.end(), ostream_iterator<float>(cout, "   "));
}

foreign key on Wall St

An experienced consultant told me he only saw FK used once in the many
banks he knows. He basically said if you don't know how to use FK then
don't use it. His team uses FK during paper design only.

I too worked in several banks. Only relatively inexperienced developers
use FK but I find them inconvenient.

One of the problems is update/delete. FK often blocks you. We all know
some data is wrong but we can't fix them, in real time trading!

Insert? I don't remember but I think you can't insert into a dependent table
before inserting all primary tables. Imagine your dependent table
depends on 2 tables, which in turn depend on 3 other tables.

Remember Wall St is all about quick changes and quick fixes (2 different
things). That's why jmx is so popular – no restart required.

port sharing – rvd

In Rendezvous release 8.1.x and earlier, a service port was available only to the first daemon (on a particular host computer) that bound it. That is, client transports would fail when requesting that same service from another daemon on the same host computer.

buffer and queue within rvd process memory

rvd command accepts -max-consumer-buffer size

When present, the daemon enforces this upper bound (in bytes) on each consumer buffer (the queue of messages for a client transport). When data arrives faster than the client consumes it, the buffer overflows this size limit, and the daemon discards the oldest messages to make space for new
messages. The client transport receives a CLIENT.SLOWCONSUMER advisory.

When absent or zero, the daemon does not enforce a size limit. (However, a 60-second time limit on messages still limits buffer growth, independently of this parameter.)

dump()for Any STL container #2 template func enough

#include <iostream>
#include <map>
#include <vector>
using namespace std;

template<typename M> void dumpMap(M const & cont) {
    for(auto const & i: cont)        cout<<i.first<<" -> "<<i.second<<endl;
}
template<typename CT> void dump(CT const & cont) {
    for(auto const & i: cont)        cout<<i<<endl;
}
/////////// above uses c++11 features
/////////// below are the pre-c++11 alternatives; keep only one pair in a given file, else the definitions clash
template<typename CT> void dump(const CT& cont) {
    typedef typename CT::const_iterator iterator; //typedef -- no choice
    iterator i;
    //non-compile --
    //CT::const_iterator i;
    for(i = cont.begin(); i!= cont.end(); ++i){
        cout<<*i <<" "<<endl;
    }
}
//! Better use iterator arguments
template<typename M>
void dumpMap(const M& cont) {
    typedef typename M::const_iterator iterator; //no choice
    iterator i;
    for(i = cont.begin(); i!= cont.end(); ++i){
        cout<<i->first<<" "<<i->second<<endl; // ->first may be unsupported. Checked during template expansion
    }
}
int main() {
    vector<int> v;
    map<char const *, int> map; // c-string i.e. char-pointer in a container
    map["1)"] = 11;
    map["3)"] = 33;
    dump(v);
    dumpMap(map);
}

eclipse mark-occurrence on vertical side ruler

One of the top 10 productivity features in Eclipse.

Preferences -> search for “occur” -> MarkOccurrences -> Annotations ->
Occurrences and WriteOccurrences -> Choose 2 distinct colors that are
easy to differentiate on the vertical ruler and also as text highlight.

Q: How about Visual Studio?
A: I found that if I select a variable, move mouse away, then all occurrences are highlighted within the editor (not on vertical ruler). It’s known as “highlight reference” in MSVS. To change the highlight color, see

transform() to emulate copy()

I feel transform() can emulate many other STL algorithms ..

string const & passthru(string const & s){  return s;  }

void dumpList(const list<string> & list) {
    copy(list.begin(), list.end(), ostream_iterator<string>(cout, "\n"));
    transform(list.rbegin(), list.rend(), ostream_iterator<string>(cout, "\n"), passthru); // i used rbegin/rend just to show the symmetry
}

meta programming — personal reflections

I’m no expert on meta programming. I feel MP is an over-popularized buzzword, with increasingly vague and expanding scope. I’m less interested in what buzzwords are part of MP, and more interested in the very few “pillars” beneath this sprawling complex —

– reflection/introspection
** first-class functions — all functions as first-class objects

– run-time byte-code engineering
– dynamic creation and manipulation of program modules
– nested (often anonymous) code modules[1] such as nested functions/classes — nested in Functions. I feel this is basically “dynamic creation”

– More generally, any compile-time task performed at run time (by your clever hack) is subversive, powerful, dangerous and usually part of meta programming.
** I feel one of the earliest use cases is C++ template meta programming. What used to be a compile-time task — creating similar vector classes for int, float, Dog — is now performed at run time.
** another Simple example in Java reflection — removing “private” access modifier at run time.
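
The Java reflection example can be sketched in a few lines. Vault and its secret field are hypothetical names. For a private field of a class in another file, the get() call would throw IllegalAccessException without setAccessible(true); strong module encapsulation in newer JDKs can still block this for JDK-internal classes.

```java
import java.lang.reflect.Field;

public class PrivateAccessDemo {
    /** Hypothetical class with a private field and no getter. */
    static class Vault {
        private String secret = "s3cret";
    }

    public static String readSecret() {
        try {
            Vault v = new Vault();
            Field f = Vault.class.getDeclaredField("secret");
            f.setAccessible(true); // switch off the access check for this Field object
            return (String) f.get(v);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readSecret());
    }
}
```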

Reflection is richer in Python than compiled languages. All functions, methods, types … are first-class objects you can look into at run-time.

Perl offers “subref” and closures, too.

[1] beyond data structure

fxoption vs eq-option popularity among non-retail traders

I spoke to a market data vendor’s presales. Let’s just say it’s a lady named AA.

Without referring to the Singapore market, she feels FXO is clearly more popular as a hedging tool than EqOptions. I feel that’s true among her clients (all institutional, no retail). She explains that only large equity funds would use eqo while virtually all importer/exporters would buy fxo (usually from banks). I asked “In that case how do the equity traders hedge their risk if Not with options?” She didn’t give a complete answer but cited eq index and futures.

I too feel import/export corporates outnumber equity trading houses (perhaps by a large margin), but I feel eqo is more liquid (thanks to exchanges) and more widespread than fxo. Also, eqo has retail demand.

Our conversation about fxo vs eqo was exclusively focused on the hedging usage. Eqo has other users including traders. I was told some hedge funds also trade fxo, but I feel it’s less popular due to exchanges and bid/ask spread.

She felt FXO must be on the books of every corporate (treasury). I asked why. In terms of FX risk hedging, she feels option is the true hedge, whereas fwd is a view on the market. I guess there’s some deeper meaning in her remark. Perhaps she means option is an insurance.

lex^pack-var ] perl – another phrasebook

This is yet another of my attempts to extract the gist of …. a tricky area. (Page numbers refer to the camel book)

Keyword: symbol table — package vars exist on a symbol table, while lexicals don’t. As a result, package vars are accessible from anywhere by the fully qualified name, just like a file’s full path; whereas a lexical is inaccessible from outside its “home” ie lexical scope.

Keyword: fullpath

Keyword: global var — are always package vars. Contrast — A file-level lex (ie declared outside of subs) is accessible in the subs of the file, but isn’t as “global” as a package var.

Keyword: stack vars — Function local lexicals are similar to java’s stack vars. Usually [1]they lose their value when the sub returns.

[1] P223 — persistence across calls

Keyword: auto-var — I feel lexical is similar to C auto variables.

Keyword: nested sub — if a small sub is (rare!) defined completely within a bigger sub, then lexicals in the outer sub are visible throughout. If the called sub is defined outside the caller, they are invisible. Exactly like c++/java. P743.

P56 details Perl’s variable name lookup “algorithm” stroke by stroke.

When learning the (non-trivial) essentials of lex^pack-var, ignore use-strict, local or “our” for the time being.

not-null when creating table

Null values mess up numeric aggregates. If you aggregate a column often, then consider a meaningful default value, though it distorts average().

Nulls mess up composite boolean exp. Worse still, nulls can even mess up basic where-clause.

My first choice is not-null.
– For string fields, default to empty string.
– For integer date fields, wall street systems often default to 99991231.

In terms of null handling, SQL beats java. As java developers going into SQL, we don't need to be too paranoid about nulls.

RMI skeleton/stub instantiation

In an RMI scenario, there are 2 + 1 processes. 2 on server, 1 on client
Process ps1) the application jvm
Process ps2) registry process. On unix, you often start it by hand. Note PS1 and PS2 must be on the same localhost.
Process pc) client JVM

There are at least 3 java objects involved. Both skeleton and stub implement the same business interface as OB.
Object OB) the real business object
Object SK) the skeleton object
Object ST) the stub object

Let's see how these are created and linked.
1) skeleton is probably instantiated by exportObject() based on the OB object, inside PS1 JVM. This is a static method.
2) After export, skeleton's address is then registered with the registry, using rebind() or bind(), both static methods.
3) UnicastRemoteObject probably has a static collection to hold OB and SK, to fend off garbage collection
4) Stub is created on demand in PC, by deserializing the skeleton object

simplest spring-jmx setup to configure a trading server

Without any annotation or source code change, any public (static or non-static) methods will be accessible on jconsole.


decorator quiz – subclassing by reflection

(See background below)
I agree with you that reflection, using dynamic proxy, can automatically redirect or wrap all 200 public methods, so the wrapper won’t look super long and needs no code change when Account adds methods. This does require that those 200 (public) methods are specified in an interface. This is usually not hard – just ask the author to create the interface and make Account implement it. Not possible if Account comes from a 3rd-party library.

More importantly, reflective subclass falls short on several fronts.
* Biggest shortfall — If base class has a public method printMe() that calls this.toString(), then a regular subclass can override toString() and it will be invoked by printMe(). Doesn’t work in reflective subclass. “this” is a kind of address in JVM, shared by both acct1 and checking1 in a regular subclass scenario. In the reflective case, checking1 has a different JVM address than acct1, so this.toString() can’t be intercepted by our proxy but goes through regular dynamic binding via virtual table.
* If Account has protected methods and fields, then a regular subclass can easily use them. Not so easy in reflective subclass.
* If Account has a final public field pf1, then in main(), we can say System.out.print(checking1.pf1) only if CheckingAccount is a regular subclass.
* We can cast checking1 from Account to CheckingAccount only for regular subclass
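
The biggest shortfall can be reproduced with java.lang.reflect.Proxy. In this sketch Account is an interface with a single printMe() method (standing in for the 200), BasicAccount is a hypothetical implementation, and the proxy only overrides toString(). Calls from outside go through the handler, but this.toString() inside BasicAccount never does.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyVsSubclass {
    public interface Account {
        String printMe();
    }

    public static class BasicAccount implements Account {
        public String printMe() {
            return "printMe -> " + this.toString(); // "this" is the real object, never the proxy
        }

        @Override
        public String toString() {
            return "BasicAccount";
        }
    }

    /** Regular subclass: the override is picked up by printMe() through dynamic binding. */
    public static class CheckingAccount extends BasicAccount {
        @Override
        public String toString() {
            return "CheckingAccount";
        }
    }

    /** "Reflective subclass": a dynamic proxy that answers toString() itself and forwards the rest. */
    public static Account reflectiveChecking(Account target) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("toString") && method.getParameterCount() == 0) {
                return "CheckingAccount";
            }
            return method.invoke(target, args);
        };
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(), new Class<?>[] {Account.class}, handler);
    }

    public static void main(String[] args) {
        System.out.println(new CheckingAccount().printMe());
        Account proxied = reflectiveChecking(new BasicAccount());
        System.out.println(proxied.toString());
        System.out.println(proxied.printMe());
    }
}
```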

From: Tan, Bi
See if you have any good solution —

public static void main(...){
Account acct1 = getAccountFromDB(123);
CheckingAccount checking1 = new CheckingAccount (....); // Somehow create a CheckingAccount instance whose parent object is acct1.
... }

Account class already has 200 public methods. CheckingAccount should inherit all and override toString(). How?

Formatter – replacing StringBuilder

In most cases, I can replace StringBuilder with a Formatter, but of course Formatter offers more!

new Formatter(new StringBuilder()) painlessly appends!
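
A small sketch of the "painlessly appends" point: a Formatter wrapped around an existing StringBuilder writes each format() result onto the end of it. The field names and values are made up. I pass Locale.ROOT so the decimal separator doesn't depend on the machine's locale.

```java
import java.util.Formatter;
import java.util.Locale;

public class FormatterAppend {
    public static String build() {
        StringBuilder sb = new StringBuilder("px=");
        try (Formatter f = new Formatter(sb, Locale.ROOT)) {
            f.format("%.2f", 101.5);  // appended to sb
            f.format(" qty=%d", 300); // appended after the previous output
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // px=101.50 qty=300
    }
}
```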

import java.util.Formatter;
import static org.junit.Assert.*;
import org.junit.Before;
import org.junit.Test;

public class FormatterTest {
    private Formatter formatter;
    private StringBuilder sb;
    @Before
    public void setUp() {
        this.formatter = new Formatter(new StringBuilder());
        this.sb = new StringBuilder();
    }
    @Test
    public void testArray() {
        char[] var = new char[] {'a', 'b', 'c'};
        sb.append(var + ""); // assumed step: the StringBuilder side is missing from the original listing
        formatter.format(var + "", "");
        assertEquals(sb.toString(), formatter.toString());
    }
    @Test
    public void testInt() {
        int var = 902;
        sb.append(var + ""); // assumed step, as above
        formatter.format(var + "", "");
        assertEquals(sb.toString(), formatter.toString());
    }
    @Test
    public void testFloat() {
        float var = 902;
        sb.append(var + ""); // assumed step, as above
        formatter.format(var + "", "");
        assertEquals(sb.toString(), formatter.toString());
    }
}

insist on better (less painful) design@@ Don’t !

(another blog. You need not read…)

It’s possible to be less passionate about technical wisdom, and wise decisions.

As you pointed out, software dev is much more flexible than most industries like /putting a nut on a screw/. There are few large industries where every day presents challenging choice to each worker. (Counseling, radio hosts come to mind.)

With the multitude of choices come the criteria of wisdom. Clever choices, regrettable choices, elegant choices, wise choices. After working for 10+ years in software, we all have truckloads of bitter experiences, so to avoid the pains, many of us slowly become purist, meticulous, opinionated, even emotional against lousy choices imposed on us.

Such choices cover not only implementation, coding style/convention, tooling, build/deploy, file layout, but also project teamwork. I’d say Wall St dev teams are commanding not democratic.

Our industry is littered with jargon related to this large number of small choices and wisdom: “DRY – Don’t Repeat Yourself”, code-smells, TDD, Agile, Extensible, Reusable, Readability, Maintenance cost/TCO, defensive (and non-defensive) design, Future-proof design, code cloning…

For example, a TDD/BDD purist would see every developer colleague of mine as amateurish, non-professional, tactical… I have seen some architects insisting on some design principles but it made the code base harder to understand, bloated and … inevitably class proliferation. EJB was a great best-practice rejected by the community/market and replaced by simpler design principles. I was told all 23 GoF design patterns require additional classes (and likely additional code complexity, though the advocates may argue complexity actually reduces at some mysterious higher level.) Now I feel most best-practice principles have non-trivial drawbacks or costs i.e. they can make someone’s life harder.

Some developer colleagues (like my Taiwanese colleague) are less stubborn, less lazy, less /purist/, more accommodating. Perhaps they see the code as nothing but a way to get a pay check. Perhaps they don’t feel threatened by the risk of losing job. They don’t struggle. They do understand the possibility of losing job, so they do things quick and dirty, try to please the mgr and please others who they depend on, and avoid confrontation. Way to go..

code-cloning && other code smells in fast-paced GTD Wall St

Update — embrace these code smells to survive the coworker-benchmarking

In fast-paced Wall St, you can copy-paste, as long as you remember how many places to keep in-sync. In stark contrast, my ex-colleague Chad pointed out that even a one-liner sybase API call should be made a shared routine and a choke point among several execution paths. If everyone uses this same routine to access the DB, then it’s easier to change code. This is extreme DRYness. The opposite extreme is copy-paste or “code cloning” as some GS veteran described. Other wall st developers basically said Don’t bother with refactoring. I’m extremely uncomfortable with such quick and dirty code smells, but under extreme delivery pressure, this is often the only way to hit deadlines. Similarly,

*Use global variables.
** -Use global static collections. Remember my collection of locks in my ECN OMS kernel?
** -Instead of passing local variables as arguments, make them static or instance fields.
** -Use database or env var as global variables.
** -Use spring context as global variables.

* Tolerate duplicate variables (and functions) serving the same purpose. Don’t bother to refactor, simplify or clean up. Who cares about readability? Not your boss! Maybe the next maintainer but don’t assume that.
** Given the requirement that a subclass field and a parent field pointing to the same object, due to bugs, sometimes they don’t. Best solution is to clean up the subclass, but don’t bother.
** 2 fields (out of 100) should always point to the same object, but due to bugs, sometimes they don’t. Best solution is to remove one, but don’t bother.
** a field and a field of a method param should always point to the same object, so the field in the param’s class is completely unnecessary distraction. Should clean up the param’s class, but don’t bother.
** a field and method parameter actually serve the same purpose.
*** maybe they refer to the same object, so the method param is nothing but noise. Tolerate the noise.
** tolerate a large number (100’s) of similar methods in the same class. Best to consolidate them to one, but don’t bother.
** tolerate many similar methods across classes. Don’t bother to consolidate them.
** tolerate tons of unused variables. Sometimes IDE can’t even detect those.

– use macro to substitute a checked version of vector/string. It’s cleaner but more work to use non-macro solutions.
– don’t bother with any validation until someone complains. Now, validation is often the main job of an entire application, but if your boss doesn’t require you to validate, then don’t bother. If things break, blame upstream or downstream for not validating. Use GarbageInGarbageOut as your defence.
– VeryLongVeryUgly methods, with multi-level nested if/while
– VLVU signatures, making it hard to understand (the 20 arguments in) a CALL to such a method.
– many methods could be made static to simplify usage — no need to construct these objects 9,999 times. It’s quite safe to refactor, but don’t bother to refactor.
– Don’t bother with null checks. You often don’t know how to handle nulls. You need to ask around why the nulls — extra work, thankless. If you can pass UAT, then NPE is (supposedly) unlikely. If it happens in production, you aren’t the only guy to blame. But if you finish your projects too slow, you can really suffer.
– Don’t bother with return value checks. Again, use UAT as your shield.
– use goto or exceptions to break out of loops and call stacks. Use exception for random jump.
– Ignore composed method pattern. Tolerate super long methods.
-Use if/else rather than virtual functions
-Use map of (String,Object) as cheap value objects
-Forget about DRY (Don’t Repeat Yourself)
-Tolerate super long signatures for constructors, rather than builder pattern ( [[EffectiveJava]] )
– don’t bother to close jdbc connections in the finally block
– open and close jdbc connections whenever you like. Don’t bother to reuse connections. Hurts performance but who cares:)
-Don’t pass jdbc connections or jdbcTemplates. In any context use a public static method to get hold of a connection.
– create new objects whenever you feel like it, instead of reusing them efficiently. Of course stateless objects can be safely reused, but creating new ones is less effort. It hurts performance but who cares about performance?
-Use spring queryForMap() whenever convenient
– Tolerate tons of similar anon inner classes that should be consolidated
– Tolerate 4 levels of inner classes (due to copy-paste).
– Don’t bother with thread pools. Create any number of threads any time you need them.
– tolerate clusters of objects. No need to make a glue class for them
-Use public mutable fields; Don’t bother with getters/setters
-JMX to open a backdoor into production JVM. Don’t insist on password control.
-print line# and method name in log4j. Don’t worry about performance hit
-Don’t bother with factory methods; call ctor whenever quick and dirty
-Don’t bother with immutables.
– tolerate outdated, misleading source documentation. Don’t bother to update. Who cares:)
– some documentations across classes contradict each other. Don’t bother to straighten.
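
To make one item in the list concrete, here is a minimal sketch of the "map of (String,Object) as cheap value object" shortcut next to a small immutable class. The Trade class, its fields and the method names are hypothetical, made up for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public class CheapValueObject {
    // Hypothetical immutable value class: the "clean" alternative.
    static final class Trade {
        final String symbol;
        final int quantity;

        Trade(String symbol, int quantity) {
            this.symbol = symbol;
            this.quantity = quantity;
        }
    }

    // Quick-and-dirty version: no class, just string keys and Object values.
    static Map<String, Object> cheapTrade(String symbol, int quantity) {
        Map<String, Object> m = new HashMap<>();
        m.put("symbol", symbol);
        m.put("quantity", quantity);
        return m;
    }

    // The caller must know each key and cast each value. A misspelled key
    // yields null and a wrong value type fails only at run time.
    static int quantityOf(Map<String, Object> trade) {
        return (Integer) trade.get("quantity");
    }

    public static void main(String[] args) {
        Map<String, Object> cheap = cheapTrade("IBM", 100);
        System.out.println(quantityOf(cheap));

        Trade clean = new Trade("IBM", 100);
        System.out.println(clean.quantity);
    }
}
```

The map version saves writing a class, which is the whole appeal, but the compiler can no longer check field names or types.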

verify SingleConnectionDataSource – sybase

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;

import javax.sql.DataSource;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.beans.BeansException;
import org.springframework.context.support.AbstractApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;
import org.springframework.jdbc.datasource.SingleConnectionDataSource;

// TradeEngineMain and GenericProcedureCaller are in-house classes not shown in this post.

/**
 * @author btan
 */
public class SybaseUtil {
      private static final Log log = LogFactory.getLog(SybaseUtil.class);

      static private JdbcTemplate singleConnectionHandle;
      static private JdbcTemplate handle;

      /**
       * @param args
       */
      public static void main(String[] args) {
            log("abc 'd' ");
      }

      static public JdbcTemplate getHandle() {
            if (singleConnectionHandle != null && handle != null)
                  return handle;
            AbstractApplicationContext context = TradeEngineMain.getContext();
            if (context == null) {
                  try {
                        context = new ClassPathXmlApplicationContext(
                                    /* config location(s) lost from the original post */);
                  } catch (BeansException e) {
                        context = new ClassPathXmlApplicationContext("dataSources.xml");
                  }
            }
            DataSource dataSource = (DataSource) context.getBean("mtsDataSource");
            handle = new GenericProcedureCaller(dataSource);

            // wrap one physical connection so every call on this handle shares it
            try {
                  dataSource = new SingleConnectionDataSource(
                              dataSource.getConnection(), true);
            } catch (SQLException e) {
                  throw new RuntimeException(e);
            }
            JdbcTemplate jt = new JdbcTemplate(dataSource);
            jt = new GenericProcedureCaller(jt);
            singleConnectionHandle = jt;
            return handle;
      }

      static public JdbcTemplate getSingleHandle() {
            return singleConnectionHandle;
      }

      static public String showServer() {
            JdbcTemplate handle = getHandle();
            StringBuilder ret = new StringBuilder("\n\n\t == SybaseSybase Details ==");
            ret.append("\n\t " + handle.queryForObject("select @@servername + ' <-- ' + @@version", String.class));
            ret.append("\n\t " + getConnectionDetails());
            ret.append("\n\t " + getSpidDetails());
            return ret.toString();
      }

      static private String getSpidDetails() {
            int spid = getSingleHandle().queryForInt("sel" +
                        "ect @@spid");
            String sql = String.format("sp_who '%s'", spid);
            String ret = "spid=" + spid + " ";
            ret += getSingleHandle().queryForObject(sql, new RowMapper() {
                  public String mapRow(ResultSet rs, int rowNum) throws SQLException {
                        return rs.getString("loginame") + " " + rs.getString("hostname");
                  }
            });
            return ret;
      }

      static private String getConnectionDetails() {
            Connection conn3 = null;
            try {
                  conn3 = getSingleHandle().getDataSource().getConnection();
            } catch (SQLException e) {
                  throw new RuntimeException(e);
            }
            // same identity hash on every call means the same Connection object is reused
            String ret = System.identityHashCode(conn3) + "=identityHash " + conn3;
            return ret;
      }

      static public void log(String s) {
            if (s == null || s.trim().isEmpty())
                  return;
            s = s.replaceAll("'", "");
            String insert = String.format("insert snoop (charp1) select '%s'", s);
            // (the statement that executed this insert is missing from the original post)
      }
}


perf ^ functional-query — 2 sought-after skills on DB

Q: As a Wall st IT hiring mgr, what single DB skill is the most valuable?

Depends on who you ask. Many laymen and most experts would say tuning skills. This has become the default answer, so I would guess the vast majority of hiring managers believe tuning is the #1 DB skill.

However, I have worked with many developers. When push comes to shove, I know the badly needed skill is “implementing complex logic”. Wall St is all about time to market, quick and dirty. Most Wall St systems keep all of their financial data in databases and huge amounts of non-trivial logic in SQL. SQL know-how (mostly SELECT) can blast through brick walls (see post on blast). Therefore if I were hiring, my #1 focus would be the know-how to implement complex query logic, using 8-way outer joins, CASE, group-by, correlated sub-select …

So here we are — perf vs SQL tricks. These are the 2 most valuable DB skills on Wall Street.

Incidentally, both depend on detailed knowledge of a tiny language known as SQL. It's remarkable that SQL ranks along with java, Unix, OO, threading etc as a top-3 technical skill, but the SQL language itself is much, much smaller than the other languages.

A tiny subset of the tuning interview topics to study —
* show plan, show statistics io
* sproc
* hints
* index selection, design
* index stats
* join types, join order
* temp tables

perf ^ functional — techniques, know-how

See also — post on perf^complex query on databases
Background —
* when reading code or publications, we often see 2 broad categories of _techniques_, design skills, know-how…
* When evaluating a developer, we often focus on these 2 types of skills

A) techniques to achieve AAA+ sys performance
B) techniques to achieve basic functionality

I would argue knowledge about B outweighs A in most wall street development jobs. “First get it to _show_ all the essential functionality, then optimize.” is the common strategy — pragmatic. Uncertainty and project risk are the key drivers. Once we show stake-holders we have a base deliverable, everyone feels reassured.

Q: In pursuing quick and dirty, would we end up with a non-scalable design?
A: seldom.

Q: how about low-latency, massive market data feeder, risk overnight batch?
A: I don't have experience there. I would speculate that in these specialized fields, base design is always a battle-tested design[1]. On top of a sound design, quick-and-dirty will not lead to unsolvable issues.

[1] Futuristic designs belong to the lab. Most Wall St systems have serious funding and aren't experimental.

memory^thread creation – kernel always involved@@

memory – see post on new-and-syscall-wholesale^retail
A) wholesale –  malloc grabbing from kernel [1]
B) retail – malloc to divvy up to host application. Note malloc is an /embedded-library/ and not a separate process. It is in the code section of one application only.

thread –
A) if you create a kernel thread or create a Solaris LWP, then you probably call into kernel[1]
A2) when your kernel thread is scheduled, it’s scheduled by kernel. Your application can’t interfere with it
B) if you create a userland thread, then the thread library does it without kernel
B2) when your userland thread is scheduled, it’s scheduled by the thread library. Kernel is unaware of userland threads and can’t schedule them.

Note the thread library could be bundled with the OS, just like malloc, but it doesn’t have to be. Half the time, these libraries can serve user applications without calling into the kernel.

[1] since memory is a hardware resource protected by OS

win32 batch – nested script choices

Know the differences among these, if you have time

— directly calling another script

— CALL nestedScript.bat

— START nestedScript.bat
give birth to another Command Prompt window, then die. A bit like unix exec.

— cmd /k nestedScript.bat

— cmd /c nestedScript.bat

— cmd nestedScript.bat
instantiate command interpreter. This seems to be most versatile and popular

Just a small story – I invoked mvn.bat directly and also tried cmd. In the end, I had to use CALL

see – complete class


import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Arrays;
import java.util.Formatter;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// ----------------------------------------------------------------------------
/**
 * @author Vladimir Roubtsov, modified by
 *         TAN,Bin
 */
public class Sizeof {

    // this is our way of requesting garbage collection to be run:
    // [how aggressive it is depends on the JVM to a large degree, but
    // it is almost always better than a single Runtime.gc() call]
    static public void runGC() {
        // for whatever reason it helps to call Runtime.gc()
        // using several method calls:
        for (int r = 0; r < 4; ++r)
            _runGC();
    }

    static public StringBuilder printHeapSegments4JDK6() {
        final StringBuilder ret = new StringBuilder();
        final Formatter formatter = new Formatter(ret);
        final List<GarbageCollectorMXBean> gcMBeans = ManagementFactory
                .getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean collector : gcMBeans) {
            final long justRan = justRan(collector);
            if (justRan > 0) {
                formatter.format(
                        "    ^ ^ %16s collector ran %d/%d time(s) and covers "
                                + Arrays.toString(collector
                                        .getMemoryPoolNames()) + "\n",
                        collector.getName(), justRan, collector
                                .getCollectionCount());
            }
        }

        final MemoryMXBean memMXBean = ManagementFactory.getMemoryMXBean();
        memMXBean.setVerbose(true); // not sure how useful
        MemoryUsage heap = memMXBean.getHeapMemoryUsage();
        ret.append(heap.toString() + "\n");

        MemoryUsage nonHeap = memMXBean.getNonHeapMemoryUsage();
        ret.append(nonHeap.toString() + "\n");

        List<MemoryPoolMXBean> pool = ManagementFactory.getMemoryPoolMXBeans();
        for (int i = 0; i < pool.size(); i++) {
            MemoryPoolMXBean bean = pool.get(i);
            ret.append(bean.getName() + "\t");
            ret.append(bean.getType() + "\t");
            ret.append(bean.getUsage() + "\n");
        }
        return ret;
    }

    // number of collections this collector has run since we last asked
    private static long justRan(GarbageCollectorMXBean collector) {
        final Long newCount = collector.getCollectionCount();
        if (newCount <= 0)
            return 0;
        Long oldCount = collectionCounts.put(collector, newCount);
        if (oldCount == null)
            return newCount;
        long ret = newCount - oldCount;
        return ret > 0 ? ret : 0;
    }

    private static ConcurrentHashMap<GarbageCollectorMXBean, Long> collectionCounts = new ConcurrentHashMap<GarbageCollectorMXBean, Long>();

    static private void _runGC() {
        long usedMem1 = used(), usedMem2 = Long.MAX_VALUE;
        // keep requesting collections until used memory stops shrinking
        for (int i = 0; (usedMem1 < usedMem2) && (i < 1000); ++i) {
            // these two calls were lost in extraction; restored on the
            // assumption that they follow Roubtsov's published version
            s_runtime.gc();
            Thread.yield();
            usedMem2 = usedMem1;
            usedMem1 = used();
        }
    }

    static public long used() {
        return s_runtime.totalMemory() - s_runtime.freeMemory();
    }

    static public long total_ie_Committed() {
        return s_runtime.totalMemory();
    }

    static private final Runtime s_runtime = Runtime.getRuntime();
} // end of class
// ----------------------------------------------------------------------------

## y CTO choosing java over dotnet/python, again

  • proven track record in “my” industry. If python offers a track record then more CTOs will choose it.
    • php was chosen by wikipedia, Yahoo and facebook (Hack) but remains unproven to many enterprise CTOs
  • better integration with enterprise-grade infrastructure. Large systems usually need the power of Oracle/DB2/Sybase, Solaris/AIX, Weblogic/Websphere, perhaps beyond microsoft platform
  • better support offered by “my” approved vendors — ibm, oracle, tibco, weblogic…
  • competent developers in java^.NET? I think at least 50% more, given the differing maturity. C++ is even worse.
  • java offers more capabilities and products including open-source. Even if a single needed capability is available in java but not .NET, this one factor alone could be a big factor.
  • cost of programmer, software tools, hardware… Java is not cheaper but Linux is probably cheaper than windows.
  • large scale deployment. c# and python are less proven IMO.
  • avoid lock-in by microsoft. JVM is now offered by IBM and BEA.
  • dotnet is proven only on windows, but linux is the overwhelming favorite operating system

## [11] y no dotnet on sell-side server side@@

(A fairly sketchy, limited, amateurish write-up.)
I was recently asked “dotnet has formidable performance and other strengths compared to java, but in trading engines space, why is dotnet making inroads only on the user-interface, never on the server side?”

Reason — as a casual observer, I feel Windows was designed as a GUI operating system with a single user at any given time. Later WinNT tried to extend the kernel to support multiple concurrent users. In contrast, Unix/Linux was designed from Day 1 to be multi-user, with the command line as the primary UI. (Personally I used to feel GUI is a distraction to high volume data processing OS designers.) A trading server needs no GUI.

Reason — Java and c/c++ were created on Unix; dotnet runs only on a windowing operating system. I feel web server is a light weight application, so both java and dotnet (and a lot of scripting languages) are up to the job[1], but truly demanding server-side apps need Unix AND java/c++. I guess Windows is catching up. In terms of efficiency, I guess java and c# are comparable and below C++.

Reason — Sell-side trading system is an arms race. (Perhaps same among hedge funds.) Banks typically buy expensive servers and hire expensive system engineers, and then try to push the servers to the max. C/C++ makes the most efficient use of system resources, but Java offers many advantages over C++. Since the late 90’s, trading servers have progressively migrated from C++ to Java. Java and C++ are proven on the high-performance server side. Not dotnet.

Reason — I still feel *nix is more stable than Windows under high load. However, I think you can create big clusters of windows servers

Reason — (from a friend) — *nix is considered more secure than windows. A GUI desktop can affect one trader if compromised, but a sell-side trading server affects all the traders from all the institutional and retail clients if compromised. So security risk is more serious on server side than GUI side.

The reasons below are arguments for java over dotnet in general, but don’t really explain why java is NOT on the GUI side and dotnet is still chosen on the GUI side.

Reason — big banks need stronger support than a single vendor company. What if Microsoft makes a mistake and dotnet loses technical direction and momentum? Java and *nix enjoy broader industry support.

[1] unless you are google or facebook, who needed c++ for their demanding requirements.

swing (AWT@@) write once run anywhere

Swing is indeed write-once-run-anywhere — anywhere JVM is available — like linux, Mac, and Solaris [1]. However, one online discussion raised some interesting points —

Swing generalizes your underlying architecture to provide you with a platform-neutral user experience. About the only heavyweight component (provided by the OS) is the JFrame container and the rest is pretty much handled by the Swing toolkit. (I think JDialog is another heavyweight.)

In contrast, AWT asks the OS to draw all of its UI components which means it’s faster in a lot of ways as you are using the native UI components specific to the OS. SWT tries to achieve a middle ground.

The Swing look-and-feel isn’t particularly attractive as it looks alien to most OS platforms.

[1] I think some JVM implementations may not support windowing at all. Look at some early Java-powered phones.


seeing no nulls from self outer join@@

--drop table students
--create table students(
--    student varchar(44),
--    score decimal(4,1),
--    ident int identity
--)
--insert students select 'Andrew', 15.6
--insert students select 'Becky', 13
--insert students select 'Chuck', 12.2
--insert students select 'Dan', 25.6
--insert students select 'Eric', 15.6
--insert students select 'Fred', 5.6
--select * from students
select * from students h left join students l
  on h.score >= l.score
order by h.score, l.score
-- The query above won’t show the nulls expected of an outer join, because with “>=” every row matches at least itself. So I remove the “=”. Now Fred (and Greg) the poor student doesn’t score higher than anyone and therefore shows nulls.
Another way to get the nulls is adding the not-equal join condition
select * from students h left join students l
  on h.score >= l.score
and h.student != l.student -- this condition is essential to self outer joins
Now, this topic is relevant to me because whenever I construct an outer join, I expect to see nulls. An outer join without nulls is an emergency exit seat without legroom.
Q: if an outer join returns no nulls, is it equivalent to an inner join?
A: Usually yes. I have yet to find an exception.
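
The effect can be reproduced without a database. Below is a minimal Java sketch that emulates the join condition in two nested loops over the same six students and reports which left-side rows find no partner, i.e. the rows a left outer join would pad with nulls. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SelfLeftJoinDemo {
    static final String[] NAMES = {"Andrew", "Becky", "Chuck", "Dan", "Eric", "Fred"};
    static final double[] SCORES = {15.6, 13, 12.2, 25.6, 15.6, 5.6};

    /**
     * Emulates "students h left join students l on h.score >= l.score"
     * (inclusive = true) or "... on h.score > l.score" (inclusive = false).
     * Returns the names of the h rows with no matching l row.
     */
    static List<String> unmatched(boolean inclusive) {
        List<String> result = new ArrayList<>();
        for (int h = 0; h < NAMES.length; h++) {
            boolean matched = false;
            for (int l = 0; l < NAMES.length; l++) {
                boolean on = inclusive ? SCORES[h] >= SCORES[l]
                                       : SCORES[h] > SCORES[l];
                if (on) {
                    matched = true;
                    break;
                }
            }
            if (!matched)
                result.add(NAMES[h]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(">= : " + unmatched(true));   // prints ">= : []"
        System.out.println(">  : " + unmatched(false));  // prints ">  : [Fred]"
    }
}
```

With ">=" each row matches itself, so no row is left unmatched and the outer join looks exactly like an inner join.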

address of a java object (and virtual/physical memory)


(another blog post) We once discussed how to find the address of a java object. The address has to be hidden from application programs since the garbage collector often needs to relocate the object through the generational heap. Therefore any reference variable we use in java will let us read/write the “pointee” object but won't reveal its address.

However, the address is visible to the garbage collector and some of the C code integrating with java via JNI or other means. It has to be visible because C uses pointers. A pointer holds a memory address. If a C function uses a pointer, then the C function can print out the address.

By the way, all along we are talking about virtual memory addresses, which could be anything from 0 to 0xFFFFFFFF ie 32-bit integer, even on a 128MB RAM laptop.

The virtual memory module in the kernel translates between virtual memory address and physical RAM address.

Q: Is it ever possible for a C program to see the physical RAM address of an object? Here are my tentative answers so please correct me —

A: yes for the C program implementing the virtual memory module itself. This module runs in probably the lowest layer in the kernel. Virtual memory module probably gets loaded first so that a 32MB RAM laptop can load a 50MB operating system. Virtual memory continues to be extremely relevant since no machine has enough RAM to fill up a 64 bit address space.

A: no for any other C program running on top of virtual memory module.
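
On the java side, the closest thing to an object "id" that application code can see is the identity hash, and it is not an address. A minimal sketch follows (class and method names are made up for illustration); it relies only on the documented rule that an object's hash code stays the same for the life of the object.

```java
public class NoAddressDemo {
    /** Returns true if the identity hash is unchanged after a GC request. */
    static boolean identityIsStable() {
        Object pointee = new Object();
        int before = System.identityHashCode(pointee);
        System.gc(); // only a request; the collector may or may not move the object
        int after = System.identityHashCode(pointee);
        return before == after;
    }

    public static void main(String[] args) {
        // The identity hash cannot be tracking a location that the collector
        // is free to change, because it must stay constant.
        System.out.println(identityIsStable());
        System.out.println(Integer.toHexString(System.identityHashCode(new Object())));
    }
}
```

The reference variable lets us use the pointee, and the identity hash lets us tell two objects apart, but neither reveals where the object currently lives.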

Re: what makes c++ templates difficult to master

Here are some of the reasons I now see c++ templates as monstrous.

1) you can put arbitrarily complex types into the angle bracket when you instantiate a template. Example – You can put an array of multimap of pointer to member functions in there

2) within a class template, you can create typedef involving the “T”. Common in STL. I feel typedef is supposed to simplify declarations but often hides hideous complexities beneath the carpet. The complexity doesn't disappear so when things go wrong, you need to lift the carpet and analyze the dirt beneath. Things get even worse when typedef is defined locally in a class template, when the T is an unknown type.

3) when templates meet inheritance. The syntax is too complex for me to grasp. I don't know if developers enjoy deriving from a class template while adding additional type params  (like adding T2 to an existing T).

What's your view on the complexity of templates?

Sooner or later if we were to investigate these “corners” of the C++ template system, we must ask “how is template instantiation implemented”. Is it code generation? If it is, then I actually feel macro expansion is much, much simpler and better understood than templates, though less powerful.

I feel without templates, c++ is at least 50% more complex than java, thanks to things like pointer to pointer to pointer (and reference to pointer to pointer), function pointers, pointer to member function, assignment/copy-ctor overloads, memory management, casts, pass-by-value vs reference and when each applies.

On top of those, templates add so much complexity that I feel most Wall Street developers aren't competent to write templates. Wall Street developers can throw together large java/c# systems because those languages are simpler and more strict. Those languages check a lot of things for us at compile time, and disallow many things and take care of many things at run time, so it's safer to be careless.

On Fri, Apr 1, 2011 at 1:52 AM, raiserchu  wrote:

Hmmm, I don't quite remember. that was 10+ years ago. Since you have also realized that, then that must be the reason (that you realized). It's easier to use than write. I guess just a lot of rules to remember and follow and a lot of cases. I like simple rules and simple cases/syntax, treat humans like human.

2011 Wells Fargo IV

Q: from vi, how do you import output from a command?
A: cmd | vi -
A: from inside vi, :r !cmd

Q: what thread libraries did you use?
%A: the java thread uses different thread libs on Win32, Linux or Solaris; boost threads; perl thread; pthreads

Q: how do you create a socket in java or c?

%A: I said the socket() syscall returns a pointer to the new socket object. Wrong! It returns a socket file _descriptor_.
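
For the java half of the question, a minimal sketch is below. In java the descriptor stays hidden inside the Socket and ServerSocket objects. The class name, method name and message are made up for illustration, and the sketch assumes the loopback interface is usable.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketDemo {
    /** Opens a loopback listener on an OS-chosen port, connects to it and sends one line. */
    static String roundTrip(String message) throws IOException {
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             // client side: constructing the Socket performs the connect
             Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort());
             // server side: accept() hands back a second Socket for this connection
             Socket accepted = listener.accept()) {
            PrintWriter out = new PrintWriter(client.getOutputStream(), true);
            out.println(message);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(accepted.getInputStream()));
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello"));
    }
}
```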

Q: const ptr vs a reference?
A: you can cast away the constness
A: dynamic_cast the reference either succeeds or triggers exception

Q: 100,000,000-row table. efficiently select first 5 rows?
%A: cursor or resultSet processing minimizes disk I/O
%A: order-by is often bad as it must read all rows from disk before finding the first 5. Exception: if you order by (“front portion” of) an index then you only read a small part of the index tree and only the data pages holding those 5 rows
%A: if table is a heap, then no order-by can help.
A: set rowcount 5

Q: how do you find out number of processors in Unix?
%A: top, psrinfo, startup messages. Perhaps uptime

Q: if an off-the-run bond is lightly traded, how do you know if you have missed some update?
%%A: I just hope the exchange has a refresh/snapshot channel to send periodic snapshots
%%A: some exchanges may have a query service to request per-symbol snapshots.
%%A: we monitor all sequence number gaps

collateral risk, repo, margin call

Repo are sometimes open-ended, but overnight repo is most common. Overnight repo requires automatic (not semi-manual like in BofA) processing.

Margin call is usually daily, but can be intra-day. I don’t think there’s a monthly margin call.

A typical Collateral IT system supports mostly 3 main collateral assets – futures, options and repo

–Funding efficiency?
Q: If I have a combined 50b portfolio and I want to repo it with various lenders to borrow cash, then how much cash can I get?
A: depends on my asset “distribution”. Diversification — good; A few highly concentrated positions — bad. Why? If we pledge a gigantic 10% of IBM Corp’s entire outstanding shares as a single collateral, then lender worries about worst case i.e. borrower default. Lender must liquidate, but selling so many IBM shares means huge market impact.

–repo — Both cash-lender and cash-borrower need to worry about counter-party credit .
If borrower’s (pledged) collateral appreciates while on loan, then borrower is at risk of loss due to lender bankruptcy while holding the security. The asset we have lost is worth more than the cash we borrowed — we pledged too much

If collateral depreciates, obviously lender is worried.

getline(): how many kinds]C++std library


(Note All the getline() functions below are designed to return text from a File or cin — actually a one-way stream. Inapplicable to sockets.)

1) Most documentation mentions the older c-str-based istream::getline(). However,

2) there is also std::getline(). Unlike the c-str-based istream::getline() method, this std::string version is implemented as a free function instead of a member function of the istream class.

This version is more modern. C++11 added new features to this free function, but didn’t bother with (1)

I feel cin can emulate it with for(; !cin.eof(); cin>>str3){…}, though cin>>str3 reads one whitespace-delimited token at a time rather than a whole line. Of course, cin can also parse numbers!

Differences between these 2 getline() functions:
– One uses c-str; the other uses std::string class.
– One is a member function; the other a free function

Similarities —
– both require streams, therefore unusable in C.

There’s an IKM question on the (older) member function (1).

3) there is also a prehistoric C library function getline() (POSIX rather than ANSI C), which uses neither the std::string class nor the std::istream class.

——- philosophically ——-
We can’t assume the most “everyday” programming tasks in a major language are always well-covered online, i.e. completely covered, with the confusing details pointed out. Reading a text file is basic but supported by a confusing bunch of alternatives, all named getline().

java getResourceAsStream() – easiest(@@) way to read any file

You can easily read a file in the same dir (esp. in a jar) as your class, or anywhere(?) else that the class loader can reach on the classpath. The methods below are usually called on the Class object, like

1) simplest usage:
InputStream is = this.getClass().getResourceAsStream("any file such as a.txt");
InputStream is = AnyClass.class.getResourceAsStream("any file such as a.txt");

InputStream is = GetResourceTest.class.getResourceAsStream("../../test.txt");
if (is==null) System.err.println("file not found");

Behind the scenes, these methods delegate to the class loader’s getResourceAsStream().
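
A minimal runnable sketch of the lookup rules follows. To avoid depending on any file of my own, it reads the class file of java.lang.Object as the resource. The class name ResourceDemo and the helper magicOf() are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;

public class ResourceDemo {
    /** Returns the first four bytes of the named resource as hex, or null if not found. */
    static String magicOf(Class<?> anchor, String name) throws IOException {
        try (InputStream is = anchor.getResourceAsStream(name)) {
            if (is == null)
                return null; // resource not found
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                int b = is.read();
                if (b < 0)
                    break;
                hex.append(String.format("%02X", b));
            }
            return hex.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // A relative name is resolved against the package of the anchor class.
        System.out.println(magicOf(Object.class, "Object.class"));
        // A leading slash makes the name absolute from the root of the classpath.
        System.out.println(magicOf(Object.class, "/java/lang/Object.class"));
        // A missing resource gives a null stream rather than an exception.
        System.out.println(magicOf(Object.class, "no-such-file.txt"));
    }
}
```

Note the null check: a missing resource is reported by a null return, so forgetting the check shows up later as a NullPointerException.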