Runnable object ^ func ptr in a C thread creation API

(C# Delegates???)
Q: how is a Runnable object different from a function pointer in a C thread creation API?

Let's start from the simple case of a fieldless Command object. This command object could modify a global object, but the command object itself has no fields. Such a Runnable object is a replicable wrapper over a func ptr. In C, you pass the same func ptr into one (or more) thread ctor calls; in Java you pass the same Runnable object (or a clone of it) to each new Thread. In both cases, the actual address of the run() method is a single fixed address.

Now let's add a bit of complexity. A Runnable can access objects in its "creator scope" — since a Runnable object is usually created at run time inside some method, it can capture the (small number of) objects in that creator method's scope.

Assuming a 32bit platform,

Q: do these variables (more precisely, the objects AND their 32-bit references) live on the stack of the run() method, or as hidden fields of the Runnable object?
%%A: no practical difference. Since the java compiler guarantees that these local variables are "stable" (final), it can copy them into implicit fields of the Runnable object (see the sketch below).
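
A small sketch of that capture, with a made-up variable acctId — the local is final, so the compiler can copy it into a hidden field of the anonymous Runnable:

    public class CaptureDemo {
        public Thread createWorker() {
            final String acctId = "ACCT-001";   // creator-scope variable (hypothetical name)
            Runnable task = new Runnable() {    // anonymous Runnable captures acctId
                public void run() {
                    // reads the compiler-generated hidden field holding a copy of acctId
                    System.out.println("processing " + acctId);
                }
            };
            return new Thread(task);
        }
    }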

Does a func ptr in C also have access to objects in creator-scope? I think so. Here’s an example taken from my blog on void ptr —

  thread( void (*FVVP)(void *),   void* vptr ) // creates and starts a new thread. FVVP is a function pointer returning Void, taking a Void Pointer

Above is a thread API with 2 arguments — a function pointer and a void ptr. The function pointer points to a function accepting a void ptr, so the 2nd argument (an address from creator scope) is fed into that function.


correlated sub-SELECT returning max()

Yang taught me this simple yet powerful SQL trick.

select * from EMRepChargeback o where LastUpd = 
(select max(i.LastUpd) from EMRepChargeback i where o.TicketNum = i.TicketNum)

With the same trick you can show the "other attributes" of the most populous country on each continent, all without a self-join. Tip: correlate the sub-query on continent (or group by continent inside it) and compare against max(population).

gemfire GET subverted by (cache miss -> read through)

get() is usually a "const" operation (c++ jargon), but gemfire's CacheLoader.java can intercept the call and write into the cache. Such a "trigger" sits between the client and the DB. Upon a cache miss, it loads the missing entry from the DB.

When Region.get(Object) is called for a region entry that has a null value, the load method of the region’s cache loader is invoked. The load method *creates* the value for the desired key by performing an operation such as a database query.

A region’s cache loader == a kind of DAO to handle cache misses.

In this set-up, gemfire functions like memcached, i.e. as a DB cache. Just like the PWM JMS queue browser story, this is a simple point but not everyone understands it.

http://community.gemstone.com/display/gemfire60/Database+write-behind+and+read-through

When an application requests an entry (for example key1) which is not already present in the cache, and read-through is enabled, Gemfire will load the required entry (key1) from the DB. The read-through functionality is enabled by defining a data loader for a region. The loader is called on cache misses during the get operation, and it populates the cache with the new entry value in addition to returning the value to the calling thread.
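
Below is a minimal sketch of such a read-through loader, assuming the GemFire 6 CacheLoader/LoaderHelper API; OrderDao and findByKey are made-up stand-ins for a real DAO:

    import com.gemstone.gemfire.cache.CacheLoader;
    import com.gemstone.gemfire.cache.CacheLoaderException;
    import com.gemstone.gemfire.cache.LoaderHelper;

    // read-through loader registered on a region; called only on cache misses
    public class DbCacheLoader implements CacheLoader {
        private final OrderDao dao = new OrderDao();   // hypothetical DAO

        public Object load(LoaderHelper helper) throws CacheLoaderException {
            // query the DB for the missing key; gemfire puts the returned value
            // into the region AND hands it back to the calling thread
            return dao.findByKey(helper.getKey());
        }

        public void close() { /* release DB resources if needed */ }
    }

    // hypothetical stand-in for a real database query
    class OrderDao {
        Object findByKey(Object key) { return "row-for-" + key; }
    }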

##20 specific thread constructs developers need

Hey XR,

We both asked ourselves a few times the same question —
Q: why do multi-threading java interview questions touch on almost nothing besides synchronized, wait/notify + some Thread creation?

Well, these are about the only concurrency features at the language level [1]. For an analogy, look at integrated circuit designers — they have nothing but transistors, diodes and resistors to play with, yet they build the most complex microchips with nothing else.

I feel the 2nd level of *basic*, practical concurrency constructs would include

* basic techniques to stop a thread
* event-driven design patterns, listeners, callbacks, event queues
* nested class techniques — rather useful in concurrent systems
* join()
* deadlock prevention
* timers and scheduled thread pools
* thread pools and task queues
* concurrent collections
* consumer/producer task queues (see the sketch at the end of this post)

More advanced but still comprehensible constructs
* interrupt handling
* reader writer lock
* exclusion techniques by using local objects
* exclusion techniques by MOM
* immutables
* additional features in Lock.java
* Condition.java
* futures and callable
* latch, exchanger, barrier etc
* AtomicReference etc, volatile
* lock free
* counting semaphore

[1] 1.5 added some foundation classes such as Lock, Condition, atomic variable …
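
Here's the consumer/producer sketch promised above — a minimal, made-up example of a bounded task queue feeding a small thread pool (JDK 1.5 style):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // minimal consumer/producer sketch: main thread produces tasks, pool threads consume them
    public class ProducerConsumerSketch {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<Runnable> taskQueue = new ArrayBlockingQueue<Runnable>(100);
            ThreadPoolExecutor pool = new ThreadPoolExecutor(2, 2, 0L, TimeUnit.MILLISECONDS, taskQueue);

            for (int i = 0; i < 10; i++) {            // producer side -- submits tasks
                final int taskId = i;
                pool.execute(new Runnable() {         // pool threads are the consumers
                    public void run() {
                        System.out.println(Thread.currentThread().getName() + " ran task " + taskId);
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }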

multi-threading with onMessage()

Update — http://outatime.wordpress.com/2007/12/06/jms-patterns-with-activemq/ is another tutorial.

http://download.oracle.com/javaee/1.4/api/javax/jms/Session.html says -- A JMS Session is a single threaded context for producing and consuming messages. Once a connection has been started, any session with a registered message listener(s) is *dedicated* to the thread [1] of control that delivers messages to it. It is erroneous for client code to use this session or any of its constituent objects from another thread of control. The only exception to this is the use of the session or connection close method.

In other words, other threads should not touch this session or any of its constituent objects.

[1] I think such a session is used by one thread only. But who creates and starts that thread? If I write the jms listener app, then it has to be started by my app. [[java enterprise in a nutshell]] P329 has sample code but it looks buggy. Now I think javax.jms.Connection.start() is the answer. The API says it "starts delivery of *incoming* messages only", not outgoing! This method is part of a process that
* creates a lasting connection
* creates the session(s) in memory (multiple sessions per connection)
* perhaps starts the thread that will call onMessage()

——– http://onjava.com/lpt/a/951 says:
Sessions and Threading

The Chat application uses a separate session for the publisher and subscriber, pubSession and subSession, respectively. This is due to a threading restriction imposed by JMS. According to the JMS specification, a session may not be operated on by more than one thread at a time. In our example, two threads of control are active: the default main thread of the Chat application and the thread that invokes the onMessage( ) handler. The thread that invokes the onMessage( ) handler is owned by the JMS provider(??). Since the invocation of the onMessage( ) handler is asynchronous, it could be called while the main thread is publishing a message in the writeMessage( ) method. If both the publisher and subscriber had been created by the same session, the two threads could operate on these methods at the same time; in effect, they could operate on the same TopicSession concurrently — a condition that is prohibited.
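
A minimal sketch of that two-session arrangement (connection factory and topic are assumed to be looked up elsewhere):

    import javax.jms.*;

    // sketch: one session dedicated to the provider's onMessage() thread,
    // a separate session for the main thread's publishing
    public class TwoSessionChat {
        public static void start(ConnectionFactory factory, Topic topic) throws JMSException {
            Connection conn = factory.createConnection();

            Session subSession = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer = subSession.createConsumer(topic);
            consumer.setMessageListener(new MessageListener() {
                public void onMessage(Message m) {    // runs on a provider-owned thread
                    // touch only subSession's objects here
                }
            });

            Session pubSession = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = pubSession.createProducer(topic);

            conn.start();                             // starts delivery of *incoming* messages
            producer.send(pubSession.createTextMessage("hello"));  // main thread publishes via pubSession
        }
    }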

currenex usage in a FX dealer bank

The bank utilizes currenex as a generic infrastructure to build a private ECN, so they can stream their customized live quotes to their private clients. No data is public.

I know this private ECN supports quote dissemination (“advertising”). Not sure about other functionalities.

##where implementation complexities lie: DB, java …

You could be familiar with all the generic tech below, but when you take over a production codebase as a new “owner”, sooner or later you have to wade through megabytes of code and learn to trace the flow. To me, that’s the essence of complexity.

* For each DB column, how does the data flow from end to end? It takes an unpredictable amount of effort to document just one column in one table. In a DB-centric world (like GS), this is often the first step in understanding (or architecting) a system, whether a tiny sub-system or a constellation of systems
* DB join logic — often directly reflects real-world relationships
* DB constraints — they affect system behavior
* DB view definitions
* stored proc, triggers
* html — can implement categories, multiple-choices, page flow, data clumps…
* config files that control java/script/proc
* everything in JIL
* autosys dependencies
* javascript
* scripts
* java/c++/c# — the "main" implementation languages in typical financial apps, where most of the code logic lives

Tracing code flow means we must trace through all of the above. Ironically, interviews can’t easily test how fast someone can “figure out” how data flows through a system. I feel this could be the hardest thing to find out during an interview.

This is yet another reason to stay in java (or c++/c#) and avoid spending too much time on scripting. Over time you might become better at tracing code flow through java.

I used to think I could figure out the flow by myself. Now I know that in a typical financial system this is impractical. You must ask the experienced developers.

scope-exit destructor call

When you exit a scope, the object(s) created on the current stack frame are destroyed (except RVO — see http://bigblog.tanbin.com/2010/09/return-value-optimization.html).

* stackVars have matching var scope and object life time so they are destroyed.
* pbclone params — are just like stackVars
* pbref params — variable has scope but the object is not on the current stack frame and has a longer lifetime. Not destroyed.
* address passed in — the param is like a local pointer var. This var is destroyed when it goes out of scope, but the pointee is not, and should not be.
* heapy-thingy instantiated locally — the pointer var is an auto/local/stackVar, so it is destroyed when it goes out of scope, but the pointee is not! This is a classic memory leak. In this case, whoever called new should call delete.

[09] towards an app (!!SYS) arch across domains

Hi friends,

Self-appraisal on the numerous areas for improvement towards an application architect. You don’t have to reply, but your comments are definitely welcome.

First off, a shortlist of key metrics (among many metrics) for an enterprise application architecture. In fact, an architecture is evaluated on these criteria
* flexible + extensible + adaptable + resilient, for changes in volatile environments where requirements change frequently and need quick implementation. Not rigid or fragile. Will we be forced to scrap our current architecture and rebuild from scratch, or can the current architecture adapt to changes?
* performance — and throughput, capacity, and cluster support. Frequently, these are absolutely essential requirements an architecture must meet, or else the architect is disqualified.
* testability — quality assurance
* Cost — hardware/software cost often pale in comparison to labor cost. Cost improves when an architecture is more resource-efficient, easier to Learn, more Adaptable to changes, faster to Test, or offers faster Speed-to-market
* speed to market — Speed improves when an architecture becomes more Flexible/Adaptable, more Testable, or easier to Learn. Any experience/knowledge can help an architect make faster and better decisions.
* ease of learning (and maintenance) — simplicity helps learning.

Now, my own room4improvement:

– UML — helps Learning, and perhaps Speed, since it facilitates communication.

– unit testing — for both OO and batch/scripting. I think there are many tools, real-world challenges/solutions.

– OO design patterns

Helps Flexibility, Learning,

– IDE — Not sure about small companies, but many large company (my focus for now) developers use IDE.
These tools affect Learning and Speed-to-market, but a non-IDE programmer may understand troubleshooting a bit better.

– API design — A poorly-defined skill. I think an experienced architect can somehow design better API interfaces between API-team and “client developers” who use the API. “Better” in terms of Flexibility, Testability, Learning, Speed-to-market, and simplified communications.

– DB schema design — The territory of data architects. Large companies (my focus for now) may or may not separate data architect and software architect roles.

– DB system design — stored programs, constraints, views, indices, locking, tuning … the DBA's duty. Software architects depend on and work closely with DBAs, so deep knowledge is not absolutely required for a good software architect — basic knowledge suffices. In reality, "basic" knowledge in this area is a rather high expectation.

– reporting — enterprise reporting, management reporting … There are many tools and common solutions. Reporting is a common functionality, found in perhaps 30% of current enterprise projects. An Architect may not need BI (Business Intelligence) but need a decent knowledge of reporting. Otherwise his credentials are somewhat in doubt.
This skill helps Speed to market, since it probably results in a more proven design — less trial-and-error.

– mem leak — tools, nlg, experience. I think an architect may be called to solve such problems.

Helps Performance.

– prototyping — A poorly-defined composite skill. Rapid prototyping and proof-of-concept. Personally I tend to favor these more than UML and product brochures.
This skill helps Speed to market, and can improve Cost by reducing unnecessary purchases.

– capacity planning — for hardware and software. I have a long-time friend who is a SAP capacity planner and performance tuner. CP seems to be such a niche that few architects have expertise in it. I think in large companies (my focus for now), there are designated experts/approaches to decide how much cpu/bandwidth/DB-license… to buy. An architect is often involved in cost-estimation and needs CP knowledge.
This skill helps Performance, Cost and Speed.

– code generation — I think some java architects are not knowledgeable about code generators like xdoclet. Code generators can be very effective. However, lack of such knowledge may not affect an architect’s performance appraisal, even though it can dramatically affect the team and the system built.
This skill helps Speed, Flexibility

– profiling/benchmarking — for performance-sensitive systems. There are many tools

– app-servers — knowledge of their relative strengths and limitations, and when to avoid each. Also how to work around their weaknesses, when you have to deal with them.

– OS features — that affect applications, such as OS tuning, threading, CPU/RAM allocation. One of my strengths.

if you have a technical mind

If you have a technical mind, financial IT (and Silicon Valley) rewards you better than other industries I know.

In Singapore, I worked (or know people working) in telecom, manufacturing, university R&D, online gaming, logistics, search engine .. These companies all “value” your technical mind too, but they don’t generate enough profit to reward you as trading shops do.

Raymond Teo pointed out the project budget for a big government project would be a few million, whereas the trading profit supported by a mid-sized trading platform would be tens of millions (possibly hundreds of millions)  a year.

Culture is another factor… what type of contribution is considered important and highly valued…

credit check by forex ECNs

Each hedge fund using an FX ECN has a credit limit imposed by its backing prime broker, who underwrites all its trades. Let's not go into the business details, but once a HF has a 100m credit limit set up, every one of its trades eats into this limit until too little remains.

Q: when HF sells a foreign currency position for cash (in “home currency”), will credit limit recover?
A: I think so.

From a system perspective, the credit limit check is on the pre-trade critical path. If credit is insufficient, the ECN won't execute the trade. The check could take a few ms, but the entire trade takes only a few milliseconds through the ECN, so this latency is significant. The check is serialized because every trade against this same HF must queue up for the credit limit check.

Why is execution latency so critical to ECNs and exchanges (including option exchanges)? Well, bid/ask price is the first competition among ECN’s and execution latency might well be the 2nd. Some algo shops pay a real price in terms of execution latency — They send an order, wait for the execution report then adjust their algo. In other cases, a trading house monitors execution latency across competing ECN’s.

EBS, Hotspot, Currenex, FXAll … all (Yes!) have this same credit check capability, probably because participants need it.

In exchange trading, all trades are booked under the "member" name. So the prime broker (a member) actually pays (or sells) from his own pocket on behalf of the HF — http://bigblog.tanbin.com/2009/09/adp-broker-means.html. That's a huge risk to the PB. Therefore credit limits are needed.

Prime brokers (PB) can’t perform any pre-trade credit check on the trades executed by their clients (i.e. hedge funds). They aren’t in the pre-trade flow. The best they can do is to receive the post-trade messages in real time and analyze market risk (which they assume due to HF trades) in real time. Hedge funds themselves “don’t care” about credit limit, since it’s OtherPeople’sMoney (PB’s) lent to them. Therefore, I believe no one but ECN has the job to enforce credit checks. PB has the option to instruct an ECN to adjust the credit limit for a client intra-day, but rarely. Therefore the credit limit is *almost* static.

Something taken from an ad — Prime brokers want real-time visibility of consolidated client activity. In real-time, they can manage the credit /extended/ to clients, trading with both executing banks and ECNs. The increase in high frequency and algorithmic FX trading has made the provision of adequate controls critically important to prime brokers managing risk across clients trading on ECNs. Prime brokers now have the ability to monitor their clients’ credit risk across multiple ECNs on a real-time basis, change or close credit lines to manage risk while maximizing clients’ trading ability. With real-time integration to ECN credit and post-trade APIs, Harmony (a post-trade service) will proactively notify the Prime Broker of limit breaches and allow the prime broker to modify credit lines or terminate trading activity. By providing centralized, automated and secure ECN limit management, credit risk for all counter-parties will be significantly reduced.

risk engine – eg of valuable finance IT domain knowledge

Some friends said that financial domain knowledge gained in financial IT projects over many years can often be learned in a few months by a new entrant.

However, Miao gave an interesting example to the contrary — risk management. RM is knowledge-based and model-based, like an expert system with Artificial Intelligence. An experienced risk-management "analyst" builds and improves a mathematical model that produces a risk score/assessment from a large number of inputs. Continuous improvement of the model takes years. The model must be implemented by developers. I think a developer can, if he so chooses, learn to understand the model and its rationale.

For a developer, this domain knowledge takes years. In my "dnlg framework", this dnlg requires 1) jargon 2) math.

However, I think 90%-99% of the developers don’t have this exposure.

More importantly, this dnlg is not portable.

 

visitor pattern — my code ^ literature

My GS xml serializer is very close —
* my formatter objects {the visitors?} have overLOADED format(TradeGC) or format(SITradeGC).
* TradeGC and SITradeGC {the visitables?} both have a this.myFormatter.format(this) method. This IS the   entry point.

* double dispatch
* passing “this” around
* formatting logic belongs in the visitor classes.
* overLOADed methods resolved at compile time, based on arg declared type 

—- given a Family of visitable data classes + a family of visitor classes, you want to implement a full matrix. In every combination of visitor/visitable class, there’s special logic. Will the logic live in visit() or accept()? Will see.

Only one of the two dispatches is dynamic (not sure why); the other is static dispatch — a key feature of the visitor pattern. By convention, visit() is the statically-resolved (overloaded) call and accept() is the dynamically-dispatched one, i.e. a virtual function.

We always have a visitor instance and a visitable instance waiting in the wings before the show starts. The typical show starts with theVisitable.accept(theVisitor). Entire show runs on a thread, with one method calling another in a call stack. Remember accept is resolved by dynamic dispatch. Within accept(), we see theVisitor.visit(this). This is a funny way of saying theVisitor.visit(theVisitable). This call is resolved by static dispatch.

The specific logic is implemented in a specific visit() method. If there are 3 x 2 combinations, then that many visit() methods.

Note the family of visitable data classes must each implement a public virtual method Accept(IVisitor).
Note the family of visitor classes should share a common base like IVisitor. IVisitor should declare abstract overloaded methods like visit(VisitableA), visit(VisitableB), etc.
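
A minimal Java sketch of that matrix — all class names here (IVisitor, VisitableA/B, XmlFormatter) are made up for illustration:

    interface IVisitor {
        void visit(VisitableA a);   // overloads -- resolved at compile time on the arg's declared type
        void visit(VisitableB b);
    }
    interface IVisitable {
        void accept(IVisitor v);    // virtual -- resolved at run time on the visitable's actual type
    }
    class VisitableA implements IVisitable {
        public void accept(IVisitor v) { v.visit(this); }   // "this" has declared type VisitableA here
    }
    class VisitableB implements IVisitable {
        public void accept(IVisitor v) { v.visit(this); }
    }
    class XmlFormatter implements IVisitor {
        public void visit(VisitableA a) { System.out.println("<a/>"); }
        public void visit(VisitableB b) { System.out.println("<b/>"); }
    }
    // typical "show": theVisitable.accept(theVisitor)
    //   IVisitable x = new VisitableB();
    //   x.accept(new XmlFormatter());   // prints <b/> -- accept() is the dynamic dispatch,
    //                                   // v.visit(this) is the overload picked at compile time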

—-http://www.objectmentor.com/resources/articles/visitor.pdf is a simple example using modems. —
* visit(visitable) — defined in the visitor object
** visit(visitable) method is overLOADED — visit(Type1_inFamily), visit(Type2_InFamily), visit(Type3_InFamily). resolved at compile time based on arg declared type. Static dispatch
** myVisitor.visit(this) called by a method in visitable — theVisitableObject.accept(theVisitor)

( This is the confusing part of visitor pattern)
* visitable.accept(visitor) — Ignore this until you are clear about the above. Defined inside each visitable object, but usually very thin — it adds little beyond confusion.
** I believe accept(visitor) is usually a virtual method resolved at run time by dynamic dispatch

* double-dispatch — passing “this” around
** “double” refers to accept(..) and visit(..). One static one dynamic dispatch
* family — of visitable data-objects
* family — of visitors
* ==> matrix of 2 type hierarchies (see article)

I feel there’s tight coupling between visitable CLASSES and the visitor CLASSES

(except options?) risk is nice-to-have to traders (icing on the cake)

Credit risk, credit rating and credit analysis is essential in calculating interest rate, loan amount etc. I think this is a very old trade.

Here I limit myself to market risk or "trading risk", not credit risk or operational risk. Trading firms largely pay lip service to computerized risk data. In fact, the "risk management" concept is increasingly vague and all-encompassing.

Risk systems (mark to market, PnL, VaR…) are not on the "critical path" the way pricing, market data, execution, trade booking and position management are. People can often wait until after market close to read risk numbers. I was told GS and some sophisticated trading houses have more real-time risk data. However, GS could be lucky. Or GS could be profitable for other reasons but nevertheless wants to publicize its risk system for political reasons.

In the extreme case, a trader can execute trades over the phone with a counter party, bypassing all computer systems.

Compared to traders, higher management cares more about risk. Many traders don't care much about risk, even when real-time risk numbers are available. If that is the case, then risk numbers are less relevant to traders than other data such as position data or market data. However, in eq options, I was told the real-time "sensitivities" of open positions are truly important to traders. Greeks and vol are the focus.

VaR is the single most important risk output to higher management.

c++ big 3 when subclassing

Update — new best practice using smart pointers would have mostly trivial or empty implementations of the big3.

See post on [[virtual — only 1 of the big 3 please]]

When you create a subclass

class C : public B

copier — Please call B copier in C copier’s initializer
assignment — Please call B assignment among first lines in C assignment overload
dtor — no need to call B's dtor explicitly; it runs automatically after C's dtor. When reclaiming a C, the compiler guarantees to reclaim the B part of the C object's real estate, but imagine a ptr field in B points to a 22-byte object. The 4-byte ptr is in the C object's real estate, but those 22 bytes are not. The compiler guarantees to reclaim the 4-byte ptr, but you (typically in B's dtor) must make sure the 22 bytes are reclaimed.

Note most of B's public methods are inherited, but not the big 3. These 3 methods wouldn't do a good job if inherited, as they need access to subclass fields.

Just as in java, constructors aren’t inherited. You can inherit everything from your mother, but not her date of birth.

My own interview questions on credit (or market) risk system

Q: how is your system related to Credit Valuation Adjustment?

Q: what trading desks do you support?
A: many desks. Could be a firm-wide system.

Q: who are the most important users of the reports generated? Traders? Management of the trading desk? Sales teams? Product control?
R: I know traders and desk managers are using it.

Q: what asset classes are rated by the system — corporate bonds? other bonds? swaptions? CDS? Mortgage-backed-securities?
R: Bonds, Swaps, Special purpose vehicles.

Q: how many positions in each product?

Q: What kind of traders hold those positions? Holding for how many days?

Q: what’s a typical position size?

Q: What are the major risks to those positions?

Q: What sensitivities are monitored?

Q: I was told CDS is the main credit product on the market. Many banks use CDS to cover and hedge their credit risk. Is CDS covered by your system?

Q: Since you said it's a reporting system, give me a good idea of the most important pieces of information in your report. To be specific, name at least 3 pieces. Exclude product attributes such as bond name, issuer name, coupon rate, last coupon payment date etc.

Q: give me a good idea of the most important input data to the credit rating system, besides issuer, coupon rate, call dates.

Q: how does the credit risk output from your system affect the trading operation or other businesses? What businesses? Security lending business?

Q: Specifically, what business decisions do these users make after reading your data?
R: based on the report, users will ask borrowers to increase collateral.

Q: who are the downstream systems?

Q: does your output data enter the trade “flow” before ORDERS are sent out (ie pre-trade) or after trades are confirmed and become a POSITION (ie post-trade)?
R: Has nothing to do with orders.

Q: if it only affects collateral, then is it part of the buy/sell life cycle? Is it part of security lending life cycle? or some other business life cycle? What businesses? Loan business? Repo business? OTC derivative business that needs collateral?

Q: what are the underlying credit risk methodologies used? Is there a name for that methodology?

Q: how many issuers are covered?
R: Fewer than 700.

Q: do you cover corporate issuers only? If not then who else? Governments? Agencies? Municipalities?

Q: what are the database table sizes in terms of rows and GB? How do you cope with the size?

Q: How large is the largest table you have to query? What kind of data does it hold?
R: my tables are small — 80 columns, 100k records. We use ISIN instead of cusip.

Q: is this the largest? What kind of data does it hold? Product data keyed by ISIN?

Q: how long is the batch job? What are the techniques to shorten it?
R: from minutes to several hours. Techniques? none

Q: do you know if the system use a rule engine for the credit risk calculations?

tibrv is decentralized, peer-to-peer, !! hub/spoke

JMS unicast topic has a centralized [1] broker. RV is decentralized. All the rvd processes are equal in status. This is a little bit similar to
– email servers among Ivy League universities.
– file-sharing peer networks

RV is peer-to-peer; JMS is hub/spoke. Note in JMS request/response, “server” has ambiguous meanings. Here we mean the broker.

[1] You can load-balance the broker but the “outside world” views them as a single broker. It’s the Hub of hub/spoke. Scalability bottleneck.

objects confined to call stack

Note on c++ — unlike java, class objects can live on the stack, and those are automatically confined.

Doug Lea’s chapter on Confinement (within the Exclusion section) briefly explains the important type of object confined to a call stack. These are as thread-safe as an object instantiated as a local variable and never leaked outside the method.

Note (static or instance) fields constitute the other kind of variables in java — not local, so never confined to any call stack.

Q: Take any snapshot of the object reference graph (a directed graph), the way a stop-the-world garbage collector does: will this object be reachable from outside this /method/?

If you *analyze* the source code and reach an answer of "No", then it's thread-safe. Now substitute /…./ with "call stack". The analysis becomes less obvious. It's possible for an object to be created in one method, passed around on the call stack, and stay confined to the call stack. See Doug Lea's 2.3.1.
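
A small sketch of stack confinement — the HashSet is created locally, passed down the call stack, and never published, so no locking is needed:

    public class StackConfinement {
        public int countDistinct(int[] data) {
            java.util.Set<Integer> seen = new java.util.HashSet<Integer>(); // created locally
            for (int x : data) {
                record(seen, x);      // passed down the call stack, but never published
            }
            return seen.size();       // only a primitive escapes -- thread-safe without locking
        }

        private void record(java.util.Set<Integer> seen, int x) {
            seen.add(x);              // the callee uses the reference but does not store or leak it
        }
    }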

JVM memory usage stats — reconciliation

Suppose I start a jvm with initial heap of 1G and max 4G. Halfway through, 2G is in use.

Q: What would unix ps SIZE column show?
A: committed. Note RSS column is less useful. It’s the physical portion of the virtual memory allocated.

Q: What would windows task manager -> PageFile Usage graph show?
A: committed.

Q: What would Runtime.totalMemory() show?
A: committed

Q: What would jconsole show?
A: committed, used and max — these meanings are explained in the java.lang.Runtime API. JDK 5 onwards offers more instrumentation under MemoryMXBean.

I believe “committed” is the actual amount of virtual memory dedicated to the JVM and not available to competing processes. Max is higher, but is partly available to other processes.

Difference between committed and used? See the post on wholesale/retail. The JVM has grabbed the committed amount from the OS via syscalls, but has only allocated the used amount to objects.
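
A quick way to print the three numbers from inside the JVM (names per java.lang.Runtime):

    public class MemStats {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            long committed = rt.totalMemory();             // "committed" -- grabbed from the OS
            long max       = rt.maxMemory();               // ceiling, e.g. set by -Xmx4g
            long used      = committed - rt.freeMemory();  // portion actually allocated to objects
            System.out.println("used=" + used + " committed=" + committed + " max=" + max);
        }
    }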

pbclone always uses pbref (possible slicing)

I always thought a function argument is passed either by value (pbclone) or by reference (pbref), never both. Now I know a pbclone always[1] involves a pbref somewhere.

Special case — if a copy ctor's parameter is missing the "&", then the pbclone would invoke the same copy ctor recursively — infinite recursion and stack overflow.

Note standard c++ copier has signature (MyClass const & rhs) . So the original arg object is first passed by reference into the copier. Occasionally this is non-trivial.

What if the arg object is a child of MyClass? Slicing! The rhs parameter can't access the fields added by the child class.

[1] except the trivial case where the argument is not an object but a literal

value-dates of T/N + common FX fwd deals

(Suppose today is Monday.)
T/N (tom/next): near (value) date is tomorrow, far (value) date is the day after, meaning Wed. Wed is actually today + 2 biz days = T+2. Wed is actually the standard "spot-value-date" of Monday. All spot trades (except CAD) done on Mon use Wed as the spot-value-date.
 
S/N (spot/next): near (value) date is spot value date (T + 2!!), far (value) date is the day after, meaning Thu

O/N (overnight): near (value) date is today, far (value) date is tomorrow ! Unbelievable? See below

Note the standard value-date of a standard Monday spot trade is Wed but T/N and O/N Forward outright deals are deals “for value before spot” with value-dates before Wed. The quote convention is special. Here’s the reasoning.

) Suppose T/N points are quoted lower/higher like “EUR/USD T/N outright 0.56/0.58“, then we know for sure base currency (EUR) is rising.
) We also know that the trend is 100% sure to continue for the next 2 biz days thanks to interest rate differential.
) We know that base currency will appreciate from value tomorrow to value spot. Note Monday’s spot-value-date is Wed.
) Spot quotes (value T+2 i.e. value Wed) are available everywhere. T/N forward outright quote is always computed based on spot rates i.e. value T+2 rates. Therefore in our example EUR “value tomorrow” is cheaper than EUR “value spot” i.e. value Wed. We need to __subtract__ the forward points from spot quotes
) If value spot (actually value Wed) E/U is 1.2345/1.2347, then E/U value tomorrow is computed as 1.2345-0.000058/1.2347-0.000056. When subtracting, always subtract more from the bid.

+ If you deal a T/N forward outright on a Thu, then the transaction would be "value Friday", i.e. the value date of the transaction is Friday. Is there a far-date in a forward outright deal? I don't think so.
+ If you deal T/N FX swap on Thu, then the near value-date would be tomorrow (Friday), and the far value-date Monday

c++ inheriting private fields^methods

* Private fields – onions
* Private methods – utility methods on a single onion-layer

As [[Absolute C++]] P598 puts it, private methods are completely unavailable to subclasses. Private methods are useful as utility methods on-that-layer of the onion, not outside, not inside.

A base class (B) object is wrapped in a subclass (D) object like onions. The physical molecules of each layer are the instance fields. When you construct a new D, every single B instance field must be allocated memory (Java == C++). However, private B fields are inherited but inaccessible in D’s methods [java == c++]. Often D can access B’s private field privfield1 using accessors [java == c++].

Q: can D methods access qq(B::privfield1) directly?
A: No — it's private to B, so D's methods must go through B's accessors.

3 notifyAll limitations — solved with conditionVar??

Classic Object.notifyAll()
* can’t control which subset of waiting threads NOT to wake up.
* causes every waiting thread to wake up and contend for the lock, even if only some of them should
* can’t select a single thread to wake up

Can we solve these problems by having each waiting thread use its own condition variable? Save these condition instances in a global array, so the notifier can choose one condition instance? Any scheme involving a shared collection (like this global array) of condition instances requires care, as access to the shared collection needs synchronization.

I think condition variable is designed to be 1 : m : mm i.e.

   “1-lock/m-conditions and 1-condition/m-threads” — not 1-condition/1-thread

The Condition.java API gives an example of 2 conditions on one lock. There's more sample code on P168 of [[java 1.5 Tiger, A developer's notebook]].
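
A minimal sketch in the spirit of that API example — one lock, two conditions, so we wake up only the threads waiting for the relevant condition:

    import java.util.LinkedList;
    import java.util.Queue;
    import java.util.concurrent.locks.Condition;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    // 1 lock / 2 conditions
    public class BoundedBuffer<T> {
        private final Lock lock = new ReentrantLock();
        private final Condition notFull  = lock.newCondition();
        private final Condition notEmpty = lock.newCondition();
        private final Queue<T> items = new LinkedList<T>();
        private final int capacity = 10;

        public void put(T x) throws InterruptedException {
            lock.lock();
            try {
                while (items.size() == capacity) notFull.await();
                items.add(x);
                notEmpty.signal();      // wake a consumer only, not the other producers
            } finally { lock.unlock(); }
        }

        public T take() throws InterruptedException {
            lock.lock();
            try {
                while (items.isEmpty()) notEmpty.await();
                T x = items.remove();
                notFull.signal();       // wake a producer only
                return x;
            } finally { lock.unlock(); }
        }
    }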

a function as a friend to 2 classes

"I name my friends". I'm a class with private fields/methods. I can name another class as a "friend class", so its *instances* can access private members of my *instances*.

Now the friend-function — I can also name a function as my friend. The function is identified by its full prototype (including const, return type …). Therefore, when I name the friend function, the syntax looks …..

The access controller treats each function [1] as a "thingy", a special thingy in a running process, with its own identity (and address).

Now you can understand that a function can be friend to 2 classes.

[1] non-member, non-friend functions included, and also overloaded operators

See also http://newdata.box.sk/bx/c/htm/ch15.htm for transfer and inheritance restrictions.

[09]implementation skill ] a leadDeveloper/softwareArch

Update — compared to 2009, I am now much less confident because I see many mismatches in my profile vs arch job requirement, such as

  • presentation and persuasion – eg Mac
  • code reading in a large codebase
  • debugging opaque systems

Hi XR,
(another blog)

I might do well in interviews, but I still need know-how to be a senior architect. (Senior developer — fine. Junior architect — fine. Team lead — perhaps ok.) As I confessed to you in our recent call, there are many practical problem-solving situations [2] — dozens a week — that I will encounter, research, try various wrong solutions on, overcome, and then internalize.

I feel this vast amount of know-how is a must-have for a software architect on a large project, but probably not needed by an EnterpriseSystemArchitect — see other posts.

[2] See post on [[## result-oriented lead developer skills]] for some of the problem-solving situations.

I guess your spring/hibernate performance problem was one of those encounters. We need to encounter problems to learn. Whenever I go through a project without implementation[1] difficulties, I feel the time was wasted. Therefore, it's not how many years that count. For example, I improved my SQL skill only in GS, not in the 2-3 years of SQL experience before that.

[1] I don't want to say "design" difficulties. If I face a difficult design issue, I don't always learn. Most of my tech learning takes place after overcoming implementation challenges.

All the architects I know keep their skills up to date by doing hands-on coding.

* my GS team lead
* my GS mentor, a 40-something senior manager in charge of a 10-people team
* Lab49 team lead and the java developer
* Guo Qiao
* Xiao An

These individuals do hands-on work on a weekly basis — sometimes daily, but seldom go without coding for 3 months. That means

– their code is in CVS and they can’t afford to produce poor code
– they set coding standard for others, by example
– they force themselves to stay productive and churn out code fast. A hands-off architect tends to be a slow coder, I would imagine
– As Guo Qiao said, if you don't code, gradually you lose your voice in tech discussions. I feel that's true in high-level discussions just as in implementation discussions.
– I feel a hands-on architect can estimate man-day effort better than a PM.

TibRV usage]%%department

* account cache invalidation -> re-fetch from DB for that account.

However, after (not before) the cache server starts up and rebuilds itself, all such messages are handled. When it's down, such notifications are simply ignored — no guaranteed delivery needed.

* management console to put an app server into and out of debug mode without restarts. J4? Cheap, fast, light. EMS is heavier.

* centrally log all emails sent out from many app servers

* market data feed intra-day

##some java skills no longer]demand

(another blog post)

Any powerful technology requires learning. Whatever learning effort I put into EJB, Corba, struts, JSP, JNI, J2ME, Swing, AWT, rule engines … is becoming less and less valuable to employers. The march of java is merciless. How merciless? Even the creator of java — Sun Microsystems — had to go open source on both java and Solaris just to survive, and sell itself to Oracle.

I am now very careful not to invest myself too heavily into anything including popular must-know stuff like Spring, hibernate, distributed cache, EJB3, AOP, CEP, JSF, GWT, protobuf, web services, design patterns … Instead, I seek stability in older core java technologies like threading, collections, serialization, messaging, reflection, RMI …

(C++ is even older and more stable, but this post is all about java.)

[09] insightful article: managing tight deadlines

http://fh.rolia.net/f0/c1050/hit/post/6237273.html (removed?) + my comments

“A job usually involves many phases: investigation, brainstorming, design, implementation, testing and documentation. A quality job requires effort and time in every phase. However, when push comes to shove, many of the phases can be skipped. The problems will show up much later. By that time, nobody would care who’s to blame. And companies are more than willing to budget for these problems, in the form of increased support, more bug fixes, or even a brand-new system. You just have to be WILLING and ABLE to produce imperfect software.”

“The second important thing to managing work load is that you have to be the master of your domain, not your boss. This means you don’t tell your boss everything. And you make a lot of decisions yourself. Otherwise, you lose control.” — My GS team lead knows too much about my project. I tell him everything about my project.

“It starts from estimates. You know better than anyone else how long each piece will take. A hard piece to others might be easy for you. But a simple task might end up taking a lot of your time. Don’t tell your boss that you’ve worked on something before and can borrow a lot of code from previous projects.”

“The same applies in the middle of your project. A seemingly complicated piece could turn out to be smooth sailing. Yet a small issue could bog you down for many hours. Again, don’t tell your boss you finished something in an hour which was budgeted for half a day. But do tell him that a bug from another team cost you many hours unexpectedly.”

"What do you do when you see something wrong in the requirement? Or something wrong with other people's work which you depend on? If you're pressed for time, act as if you didn't see them. Act like a fool. You may be punished for missing your own deadline, but you're unlikely to be punished for not spotting other people's mistakes." — By not reporting the issues early, the project will suffer but you will not. Still, avoid making your boss look bad — try to give him a scapegoat. The project will suffer — it will need more time, and the reason is other people's mistakes. You expand the impact of their mistakes to get more time for yourself. Conversely, when your mistake affects them, they might do the same.

“What do you do when deadline approaches and you discovered a big hole in your work? Again, act like a fool. Act as if you didn’t see them. You hand in your work. And a week later, people would find issues. But that’s normal. Nobody’s perfect. You and your boss get punished for missing deadline; but neither(???) of you would be held responsible for (non-critical?) bugs. Rather you will be given new budget to fix things, probably after a relaxing break in the sun.”

“Every decision you make affects your schedule. Be flexible. Be creative. Be able to accept imperfections. Be a liar if need be. The important thing is to look good, not to be good. Image is everything. And you can cut a lot of corners without affecting your image.”

— GS Slogan says “Tell your manager the bad news early”. If it’s your mistake, then decide if he will find out you were hiding it. Some managers periodically ask “Any problem?” Often he can’t be sure you were knowingly hiding problems — you could act like a fool.

— There are different bugs and different levels of impact. Manager may say some functionality is important or some time line is important, but question them in your head.  It takes a lot of observations to figure out which ones are really important to your manager’s bonus. Many delays and missing features are manageable or expected.

— My GS team peers know what bugs are tolerable to the manager. If manager must explain it to users and other teams, then you know it’s a visible bug. This knowledge comes from experience. However, initially you need to do some quality work to win trust.

— In fact, the best GS (and other) team members often contribute non-trivial bugs, partly because they are more productive and change lots and lots of code quickly.

— My ICE manager said “Don’t rely on QA team to test your code. Do your thorough testing before passing to QA.” Impractical given the thousands of detailed items in a 50-page spec. Thorough developer testing would take more time and I didn’t do it. A colleague was able to give QA team a well-tested rpm and receive up to 30 bug reports (mine received 60 to 100), but did it win him any award? No.

  • I could explain my requirement is more complex. Manager may or may not accept.
  • Sometimes we could explain one of us is new
  • Which system has more crashes in QA/production? Each crash can take a long time to figure out. Thorough developer testing doesn’t help much here.
  • In the end, which system goes live faster? That’s the visible yardstick. 2nd visible yardstick is how quickly we start QA, so by cutting corners on developer testing, I was able to start QA earlier. A more thorough developer test won’t help with any of these visibles.

is hashcode used as array index inside the hash table?

We know that 2 objects that both hash to 112233 will fall into the same bucket in an array of buckets, but is the hashcode 112233 the real array index? To create an array of 112233 elements, the system must reserve that much memory. That's the C/C++ array underneath.

(A2: Since the internal array size is always a power of 2, one economical algorithm is to take the lowest n bits of the 32-bit hashCode(). This throws away lots of information, but I guess the result can still be fairly random.)

Alternatively, could each hash table in memory have an internal lookup table that translates hashcodes to array indices? How would you implement such a look-up? Remember, internally most systems only have arrays and linked lists; all other data structures must be built on these.

The answer is in [[generics and collections]]. One algorithm is arrayIndex = hashcode % somePrimeNumber.

Q2: A barc cap low-latency interview actually asked how to economically derive an array index value from a 32-bit hashCode() for a growing hash table. Note the hash table’s internal array starts out fairly small, like 16-buckets. It usually doubles in size each time. Answer appears elsewhere in this post.
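
A sketch of the power-of-2 answer to Q2 (the bit-mixing step is optional and just one possibility):

    // deriving a bucket index from a 32-bit hashCode() when the internal
    // array length is always a power of 2
    public class BucketIndex {
        static int indexFor(Object key, int tableLength) {
            int h = key.hashCode();
            h ^= (h >>> 16);                // optional: fold high bits into the low bits
            return h & (tableLength - 1);   // equals h % tableLength when length is a power of 2
        }
        // when the table doubles (16 -> 32 buckets), recompute with the new length
    }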

JMS msg selector

For both p2p and unicast topic, only the consumer can use a msg selector.

syntax is like the WHERE-clause [1] of SQL-92, including

* LIKE
* BETWEEN
* IN
* IS NULL
* AND / OR

Filtering is done on the broker, not the consumer, to save bandwidth.

[1] but without the WHERE.
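
A minimal sketch of registering a selector (the property names symbol/price/note are made up; they would be message properties set by the producer):

    import javax.jms.JMSException;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;

    // the selector string is essentially a WHERE-clause (minus the word WHERE) over message properties
    public class SelectorDemo {
        static MessageConsumer filteredConsumer(Session session, Queue queue) throws JMSException {
            String selector = "symbol IN ('IBM', 'MSFT') AND price BETWEEN 10 AND 20 AND note IS NOT NULL";
            return session.createConsumer(queue, selector);  // broker filters before delivery
        }
    }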

MOM guaranteed delivery

Compare RV CM and JMS guaranteed delivery.

Requirement: once and only once delivery

Requirement: large[1] persistent store

Requirement: ack upon consumption. I feel auto_ack and explicit "client_ack" are fine.

[1] in practice, all msg stores have limited capacity. Alerts are needed.

y bond prices in 100 but face value in $1000

Exec summary — the correct way to compute the transaction/clearance amount of 5 bonds sold at $99.02 is 99.02% * $1000/bond * 5 bonds = $4,951.

Convention — Bond prices are quoted as a percentage of face value (aka principal amount, aka par amount), which is typically $1000 per bond.

People often imprecisely talk about a bond price as “2 cents up” at $99.02. Truth is, that price means 99.02% of face value. Somehow, the implicit $100 face-value seems to live on.

In major sell-side dealing rooms for corporate/muni bonds, the minimum round lot is 5 BONDS with a total face value of $5000. Investopedia says — in bond trades, a round lot is usually $100,000 worth of bonds. That's not the reality as of 2011.

tree model ^ table model

Tree model and table model are the 2 complex jcomponent models. Each adds a meaningful *structure* to an otherwise unorganized constellation of objects, a structure naturally suited to the target jcomponent.

1) If you stop and think of the typical kind of data organized into a jtable, it's really a collection of ROWS. Each row contains a strictly ordered list of objects. The order is fixed, and so is each data type — the first object must be age, the 2nd object must be gender … Therefore it's logical and natural to back the model with a collection of DOMAIN objects (see the sketch below).

2) A jtree maps naturally to any hierarchical data. Interestingly, the TreeModel interface accepts any kind of object as a tree node and does not require that nodes be represented by DefaultMutableTreeNode, or even that nodes implement the TreeNode interface. If you have a pre-existing hierarchical data structure, you do not need to duplicate it or force it into the TreeNode mold. You just need to implement your tree model so that it uses the information in the existing data structure.

3) JTextComponent has a Document as its model
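
A minimal sketch of point 1) — a table model backed by a collection of domain objects (the Person class is made up):

    import java.util.List;
    import javax.swing.table.AbstractTableModel;

    // each ROW is one domain object; column order and types are fixed
    class PersonTableModel extends AbstractTableModel {
        private final List<Person> rows;
        PersonTableModel(List<Person> rows) { this.rows = rows; }

        public int getRowCount()    { return rows.size(); }
        public int getColumnCount() { return 2; }
        public String getColumnName(int col) { return col == 0 ? "Age" : "Gender"; }

        public Object getValueAt(int rowIndex, int columnIndex) {
            Person p = rows.get(rowIndex);
            if (columnIndex == 0) return Integer.valueOf(p.getAge());
            return p.getGender();
        }
    }

    // hypothetical domain object
    class Person {
        private final int age; private final String gender;
        Person(int age, String gender) { this.age = age; this.gender = gender; }
        int getAge() { return age; }
        String getGender() { return gender; }
    }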

lockfree atomic variables — JDK 1.5

(see article at http://www.ibm.com/developerworks/java/library/j-jtp11234/)

Until JDK 5.0, it was not possible to write wait-free, lock-free algorithms in the Java language without using native code. With the addition of the atomic variables classes in the java.util.concurrent.atomic package, that has changed. The atomic variable classes all expose a compare-and-set primitive (similar to compare-and-swap), which is implemented using the fastest native construct available on the platform.

The atomic variable classes can be thought of as a generalization of volatile variables, extending the concept of volatile variables to support atomic conditional compare-and-set update

Nearly all the classes in the java.util.concurrent package use atomic variables instead of synchronization, either directly or indirectly.

—-
http://java.sun.com/j2se/1.5.0/docs/guide/concurrency/overview.html has unusual comments
Atomic Variables – Classes for atomically manipulating single[1] variables (primitive types or references), providing high-performance atomic arithmetic and compare-and-set[2] methods. The atomic variable implementations offer higher[3] performance than would be available by using synchronization (on most platforms), making them useful for implementing high-performance concurrent algorithms as well as conveniently implementing counters and sequence number generators[4].
[1] atomic variables are quite low-level and only cover monolithic variables. No Object here. Object is different from reference (monolithic).
[2] CAS is the biggest feature
[3] synch performance hit is actually a real concern for java designers. I think synch severely limits multi-processor parallelism
[4] these simple integers are a key use case for atomic? You can see Atomic variables aren’t for everyday use.
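
A minimal sketch of the sequence-generator use case in [4], spelling out the compare-and-set retry loop:

    import java.util.concurrent.atomic.AtomicLong;

    // lock-free sequence generator built on compare-and-set
    public class SequenceGenerator {
        private final AtomicLong seq = new AtomicLong(0);

        public long next() {
            while (true) {
                long current = seq.get();
                long updated = current + 1;
                if (seq.compareAndSet(current, updated)) {  // CAS: retry if another thread won the race
                    return updated;
                }
            }
        }
        // equivalent one-liner: return seq.incrementAndGet();
    }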

assignment in 5 kinds of c++ variables

— ptr to primitive or ptr to class type
float* newPtr = oldPtr; // copies the address of the pointee. This is what happens in the default copier and assignment

— primitive ref assignment
overwrites the referenced object's state, just like java primitive assignment.

Note if a class has a ref field and a ptr field, they behave differently during the field-by-field class assignment!

— class type ref assignment
MyClass& newRef = someRef;
newRef = oldRef; // overwrites object state, field by field

— primitive nonref assignment
obvious

— class type nonref variable
MyClass v; // ctor
v = v2; // field by field state overwrite by assignment operator

——various types of variables to look at
) Primitives ^ class types
) Pointer(and ref) ^ nonref types

A very detailed stroke-by-stroke on assignment overload — http://icu-project.org/docs/papers/cpp_report/the_anatomy_of_the_assignment_operator.html