clearing-member of the exchange = sponsor of a trade

Only the clearing members (big broker/dealers) of NYSE can trade on it. When you or any investor sells IBM short [1] using GS as a broker, NYSE always sees the trade under GS’s name. If the investor can’t fulfill the obligation, GS is responsible.

Technically, GS “sponsors” your trade. I think most trades are sponsored this way (except trades GS does using its own money). A sponsor member like GS has to protect itself.

A pass-through broker who simply passes the order to the exchange without any check is running the greatest risk.

Same thing on other exchanges. I traded commodities in 1997. My broker was responsible for answering the clearinghouse’s margin calls if I didn’t. Integrity requires that all margin calls be satisfied (or the clearinghouse has the right to liquidate your position). I had an account with the broker, not the clearinghouse, so all the positions were held under the first-tier broker’s name, i.e. a member of the exchange.

[1] A regular sell or buy also involves responsibilities and obvious financial risks for the broker.

top 3 subsystems ] spot (and forward) FX dealer bank

There are 3 types of FX dealers – spot dealer, forwards dealer and (corporate)client-facing sales dealers covering spot and forward. They use the systems below differently.

Note ONLY the sales dealers would interface with clients (no retail client). Spot/Forwards dealers deal with sales dealers and counterparty dealers.

#1))) Heart of entire FX desk is the “position keeping system” or position-master. All “deals” are stored there, so as to “calculate current positions” and PnL. Same idea as MTS inventory-mgr and Reo main-grid.

Note sales dealers never keep positions. I believe they are internal brokers on bank’s payroll.

FX position systems usually show real time unrealized P/L.

#2))) AFTER (never before) a dealer executes trades via a voice broker, he needs to record trade details on a “deal-ticket” (see other post) and enter it in the “deal entry system”, just like MTS. Electronically executed trades don’t need this “deal entry system”.

Spot trades are too numerous to be voice-based very often. Forwards dealers use voice brokers more often. Sales dealers do even more voice EXECUTION (often with a customer). These 2 groups definitely need the deal-entry system.

As of 2010, the most common execution model between clients and banks is not ECN (growing) or Reuters conversational dealing, but voice execution.

#3))) (Pending) order management — If a spot dealer wants to hold a position “open” overnight, she must leave a stop loss order. While she’s out, the order is looked after by the take-over time zone (NY -> SG/TK/HK -> London) which MIGHT execute the stop loss order.

Forwards dealers don’t leave overnight orders. Sales dealers often create customer orders for the spot/forwards dealers to execute.

Sales dealers can also create overnight limit orders on behalf of clients.

hide a source folder from eclipse

Background: I want to “hide” an arbitrary directory (such as a source folder) from eclipse.

Conclusion: No easy way to mark a directory “please ignore me completely”, but a few tips:

* searcher — try “derived”. Also you can remove read permission from a directory.
* package explorer — try the filters, such as the “.*”

http://www.plaintivemewling.com/articles/ignoring-darcs-directory is good

java String[] is O(1) for lookup-by-index

int[] has constant by-index lookup time, because the system can compute the address of the 55th element from the base address and an offset, i.e. base address + 54 * 4, since an int takes 4 bytes.

For a String[], I believe it’s the same formula. The 55th string could be very long, but it’s not stored in the array’s memory block. What’s stored in the array’s memory block is a reference (an address) to the String object; in a c-str implementation it would be the address of the 1st char. Each address probably takes 4 bytes on a 32-bit machine.

More generally, any array of objects is probably implemented the same way. Note an array is also an object. If the elements in the outer array are array objects, then we get a 2D array. Outer array’s memory block holds 55 pointers to 55 inner arrays.
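A small C++ sketch of the same arithmetic, since C++ lets us look at addresses directly. The array of pointers to std::string stands in for a Java String[]; the function names are made up for illustration, and a real JVM's object layout (array header, compressed references) is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Byte offset of element `index` (0-based) from the start of the array:
// the "index * elementSize" part of "base address + index * elementSize".
std::size_t byteOffset(std::size_t index, std::size_t elementSize) {
    return index * elementSize;
}

// Checks the formula against what the compiler really does for an array
// of pointers. The strings themselves would live elsewhere; only the
// fixed-size pointers sit in the array's memory block.
bool formulaHoldsForPointerArray() {
    const std::string* slots[55] = {};
    const char* base = reinterpret_cast<const char*>(&slots[0]);
    const char* elem = reinterpret_cast<const char*>(&slots[54]);  // the 55th slot
    return static_cast<std::size_t>(elem - base) ==
           byteOffset(54, sizeof(slots[0]));
}
```

The lookup cost does not depend on how long any individual string is, because every slot has the same size.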

bond duration – learning notes

I like the official definition in http://en.wikipedia.org/wiki/Bond_duration#Definition. Each payout has a payout-date and therefore a ‘distance’ from valuation-date. (Example: 47 months from now). Weighted average of all these “distances” is the Mac Duration.

* eg: zeros aka STRIPS — one payout-date only. If distance is 4.5 years then Mac Duration is the same. Zeros have the longest duration[1]
* eg: low coupon bond maturing in 4.5 years — Weighted average means Mac duration is dominated by the principal repayment’s distance of 4.5 years. Duration is slightly shorter than that last distance.
* eg: high coupon bond with that same maturity of 4.5. Duration is much shorter than the last distance.

[1] among bonds of the same maturity date.

“distance” is a more visual word than the “time-to-maturity” or “term-to-maturity” technical jargon. I also like the TTL or time-to-live phrase.

Now, if we receive $50 coupons five times and then $1000, we get total $1250 [2]. Q: what’s a reasonable “average-payout-date” of this $1250? Answer is the Duration.

[2] actual formula uses present value of each payout.
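Here is a rough sketch of that weighted average, using the present value of each payout as the weight. It assumes semi-annual payouts and a semi-annually compounded yield; the function name and parameters are my own, not from any library.

```cpp
#include <cassert>
#include <cmath>

// Macaulay duration in years for a bond that pays `couponPerPeriod` every
// half year and `face` at maturity, with `periods` half-year periods left.
// `annualYield` is semi-annually compounded (e.g. 0.07 for 7%).
double macaulayDuration(double face, double couponPerPeriod,
                        int periods, double annualYield) {
    double weighted = 0.0;  // sum of distance * PV(payout)
    double total = 0.0;     // sum of PV(payout), i.e. the bond price
    for (int k = 1; k <= periods; ++k) {
        double cash = couponPerPeriod + (k == periods ? face : 0.0);
        double df = 1.0 / std::pow(1.0 + annualYield / 2.0, k);  // discount factor
        double distanceYears = k / 2.0;
        weighted += distanceYears * cash * df;
        total += cash * df;
    }
    return weighted / total;
}
```

With 9 periods (4.5 years), a zero (couponPerPeriod = 0) comes out at exactly the last distance, and raising the coupon pulls the duration down, matching the three examples above.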

Now let’s see why the zero is most volatile, i.e. “this bond’s current price swings wildly with interest rate”

Key assumption: yield is roughly correlated to benchmark interest rates (such as the overnight FedFund rate), an indication of market sentiment.

For a high-yielder, larger portions (half?) of the total PresentValue come early and suffer minimal discount (discount factor close to 100%). Remember DF and yield are subject to semi-annual compounding. STRIPS have no early payouts, so the final payout must be discounted deeply due to compounding. Impact of yield change is deep.

Remember yield of 7% is always semi-annually compounded to derive DiscountFactor. See posts on DF and yield.

container of (polymorphic) pointers (double-pointer)

If you have a vector of pointers (fairly common in practice), the iterator is nothing but a double pointer.

– If you have a deque of pointers …?
– If you have a multiset of pointers, the iterator is a pseudo-pointer of pointers.

Now shift gear to smart pointers. Container of smart pointer is extremely common in quant and other apps, perhaps more common than container of values.
– If you have a vector of smart pointers, the iterator is a pointer-to-smart-pointer.
– If you have a priority queue (binary heap) of smart pointers, note that std::priority_queue exposes no iterator; you only get top(), which is a reference to a smart pointer.

P186 [[c++coding standards]] says to store polymorphic objects, best solution is
– avoid array
– avoid storing the objects directly
+ Use container of smart pointer to base class

Here are the 2nd-choice but acceptable alternatives
* array of raw   pointer to base class
* array of smart pointer to base class
* container of raw pointer to base class
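A minimal sketch of the recommended choice, a container of smart pointers to base class. I use std::shared_ptr (C++11) here as an assumption; the book predates it and the class names are invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Instrument {                      // hypothetical base class
    virtual ~Instrument() = default;
    virtual std::string name() const = 0;
};
struct Bond : Instrument {
    std::string name() const override { return "bond"; }
};
struct Stock : Instrument {
    std::string name() const override { return "stock"; }
};

std::vector<std::shared_ptr<Instrument>> makePortfolio() {
    std::vector<std::shared_ptr<Instrument>> v;
    v.push_back(std::make_shared<Bond>());
    v.push_back(std::make_shared<Stock>());
    return v;
}

// The iterator points at a smart pointer, which points at the object, so
// reaching the object takes two hops: (*it)->name() or (**it).name().
std::string joinNames(const std::vector<std::shared_ptr<Instrument>>& v) {
    std::string out;
    for (auto it = v.begin(); it != v.end(); ++it) {
        if (!out.empty()) out += ",";
        out += (*it)->name();
    }
    return out;
}
```

Storing the objects by value in a vector of Instrument would not even compile here (abstract class), and with a concrete base it would slice the derived part off.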

assembly language programming – a few tips

C compiler compiles into binary machine code. I think pascal too, but not java, dotnet, python.

Assembly-language-source-code vs machine-code — 1-1 mapping. Two representations of the exact same program.

Assembly-language-source-code is by definition platform-specific, not portable.

A simple addition in C compiles to about 3 “Instructions” in machine code, but since machine code consists of hex numbers and is unreadable, people represent those instructions by readable assembly source code.

Compared to a non-virtual call, a virtual function call translates to X additional instructions. X is probably below 10.

There are many “languages” out there.
* C/c++ remain the most essential language. Each line of source code converts to a few machine instructions. Source code is portable without modification whereas the compiled machine code isn’t.
* Assembly is often called a “language” because the source code is human readable. However a unique feature is, each line of Assembly-language-source-code maps to exactly one line of machine code.
* newer languages (c# java etc) produce bytecode, not machine code.

balance-of-trade and related jargons

* balance of payment INCLUDES current-account (ca). [1] has a formula on that.

* When China sells $1B worth of goods/services to US, China receives $1B USD and somehow leaves that in the national forex reserve. Is that $1B a foreign asset? Is the $1B like $1B worth of gold in US owned by China?

— “The Economist” story 2009/01/24 —

* US current account (savings BY US – investment BY US)[2] has been in deficit since 1992.

** Q9: what does “invest” mean?
** A9: see [3]. In a different context, when economists talk about foreigners investing in US, i think it means foreigners spending USD in US, but not for consumption.

** Experts say US had to borrow[A] from abroad or sell[B] assets to pay for the annual deficits. (Why?)
** B would decrease America’s Net Foreign Asset position.
** Some articles suggest A is the main source, while other articles suggest B. I think the same transaction can be seen as A and B, as 2 sides of a coin. [3] gave some examples of such transactions.
** experts say this deficit was used to “finance” some “investment”, though [3] also says US sells assets (B) to “finance” the CA deficit. I guess that “investment” includes investment in biz, R&D, real estate…
** experts widely believe Chinese consumers save hugely, and their savings somehow compensate for the low savings of American consumers.

** Q7: how does “foreign SAVERS” and their money play a role?
** A7: I guess they buy US properties, gov bonds, and lend to US businesses. Does exporting to US count as “investment” BY foreign savers?
** Q8: in the economists eyes, Chinese savers save in USD or CNY?
** A8: Neither. I think by “saver” the economists mean “China does not import so much American goods, so they earn more USD than they spend”. The surplus USD must go somewhere. I guess it mostly goes to buying/investing in US assets, which economists call “borrowing-by-US-from-China”

** Q: what does “borrow” mean? who? in what currency? I think economists mean something else

— digestibles —

* A higher savings rate generally corresponds with a trade surplus. China vs US.

* current account includes balance-of-trade (bot) as the biggest component.
* Because exports generate positive net sales, and because the trade balance is typically the largest component of the current account, a current account surplus is usually associated with positive net exports.

* The net foreign asset (NFA) position of a country is the value of the assets that country owns abroad, minus the value of the domestic assets owned by foreigners.

* if a country runs a $700 billion current account deficit, it has to borrow exactly $700 billion from abroad to finance the deficit and therefore, the country’s net foreign asset position falls by $700 billion. Why? Here’s my theory. Suppose US buys $700B worth of Chinese goods/services. US must pay $700B, but US doesn’t earn that much from export to China. Mostly US sells its own assets to China to get that $700B.

* If an annual current account is a surplus of $2T, the country’s net foreign asset (NFA) position increases by that amount $2T.

* US trade deficit — Warren Buffett said “the rest of the world owns $3 trillion more of us than we own of them”, echoing [3]. I think this means over the years, the cumulative US trade deficit adds up to $3T and that’s the US asset under foreign ownership.

— References —

[1] http://en.wikipedia.org/wiki/Balance_of_payments
[2] explained in 2009/01/24 The Economist, but why is this definition so different from balance-of-trade?
[3] http://www.dollarsandsense.org/archives/2004/0304dollar.html

error accounts

• Just like a client account, an EA (error account) can hold a security, like 100 shares of IBM.
• Just like a client account, it can hold cash.
• Just like a client account, it can hold cash in a short position like negative $1M.
• Just like a client account, it can hold a short position in a security, like -100 shares of IBM.

An error account holds any of these positions briefly. We try to reduce these positions to 0 ASAP, usually within the same day. We flatten the positions and flush the error account.

• If the EA holds cash, we *journal* it out to a firm account
• If the EA is short in cash, we journal cash in, perhaps from a firm account
• If the EA holds a security, we sell it (not sure if we journal that out)
• If the EA has a short position in IBM, we buy it to cover the short position.

realtime inter-VM communication in front desk trading sys

Inter-VM is our focus.

* [s] MOM — async
** FIX over RV in Lehman Eq
* [s] distributed cache — async?

Above mechanisms notify listeners. Note Listeners are usually async and multi-threaded.

* DB writes by one app, and periodic DB polling by receiving app
* [s] RMI
* [s] EJB? infrequent. I think this is less efficient than MOM
* [s] web service? not sure
* FTP? not real-time but at SOD (startOfDay) and EOD
* email? none

MOM is the clear favorite. Most efficient. Guaranteed

Within my front office app, RMI, MOM and cache are dominant. Within a related ticketing system (iticket), MOM and RMI are dominant.

DB is an extreme form of synchronous pub/sub.

[s=needs object serialization. cross-VM often requires serializable]

boost thread described by its creator

Excerpts from http://www.drdobbs.com/cpp/184401518. There's also a very short program featuring boost mutex, and a program featuring boost bind, …

Many C++ experts provided input to the design of Boost.Threads. The interface is not just a simple wrapper around any C threading API. Many features of C++ (such as the existence of constructors/destructors, function objects, and templates) were fully utilized to make the interface more flexible. The current implementation works for POSIX, Win32

Currently (2002), not a lot can be done with a thread object created in Boost.Threads. In fact only two operations can be performed. Thread objects can easily be compared for equality or inequality using the == and != operators to verify if they refer to the same thread of execution, and you can wait for a thread to complete by calling boost::thread::join. Other threading libraries allow you to perform other operations with a thread (for example, set its priority or even cancel it). However, because these operations don’t easily map into portable interfaces….

Boost.Threads is designed to make deadlock very difficult. No direct access to operations for locking and unlocking any of the mutex types is available. Instead, mutex classes define nested typedefs for types that implement the RAII (Resource Acquisition Is Initialization) idiom for locking and unlocking a mutex. This is known as the Scoped Lock pattern. To construct one of these types, pass in a reference to a mutex. The constructor locks the mutex and the destructor unlocks it. C++ language rules ensure the destructor will always be called, so even when an exception is thrown, the mutex will always be unlocked properly.
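To make the Scoped Lock idea concrete, here is a small sketch. Note the swap: it uses std::mutex and std::lock_guard from the C++11 standard library rather than the 2002 Boost.Threads types described in the article. The Counter class is hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

class Counter {                 // hypothetical shared resource
public:
    void increment() {
        std::lock_guard<std::mutex> guard(mutex_);  // constructor locks
        ++value_;
    }                                               // destructor unlocks

    // Throws while holding the lock; the guard's destructor still runs
    // during stack unwinding and releases the mutex.
    void incrementThenThrow() {
        std::lock_guard<std::mutex> guard(mutex_);
        ++value_;
        throw std::runtime_error("simulated failure");
    }

    int value() {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }

private:
    std::mutex mutex_;
    int value_ = 0;
};

// Returns true if the mutex is usable again after an exception.
bool lockReleasedAfterException() {
    Counter c;
    try {
        c.incrementThenThrow();
    } catch (const std::runtime_error&) {
        // swallowed on purpose; we only care that the lock was released
    }
    c.increment();  // would hang if the mutex were still held
    return c.value() == 2;
}
```

There is no unlock call anywhere in the client code, which is the point of the pattern: you cannot forget it.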

income vs cost levels U.S.^sg

Hi XR,

You mentioned between US and SG income levels there’s no clear winner. i guess you meant purchasing power ie income measured by cost level. USD income divided by USD prices vs SGD income divided by SGD prices. I have to ignore bonus since it has always been unstable in my jobs, though 4 month for your UBS S’pore job.

Q: Is purchasing power higher in US than SG? my answer is “no obvious winner”. i agree with your observations on property tax, medical benefits… However, we missed a few points:
* SG economy feels less stable than US, as US economy is big and “strong” in a sense. Economic longevity and fragility affect purchasing power. i witnessed and was a direct victim of IT salary drops in S’pore first-hand.
* there are various ways to reduce income tax. there are books and consultants on that topic i heard.
* it seems only the chinese like to rent out their homes. other people tend to buy single-family homes.

* housing cost is by far the biggest cost. However, it’s hard to compare between US and S’pore.
** which part of US? Chinese IT professionals live in many different states, not only North East or West coast.
** what size? In S’pore, people in my circle bought 3-room, 4-room, 5-room to private condo.

* car is perhaps 2nd biggest cost. I assume both of us do not need a car in S’pore. In US, most places require 1 or 2 cars for a family with kids. Instead of calculating and comparing purchasing powers, i notice that most Chinese IT professionals who can stay in US would not choose to settle in S’pore. If i were 28 and single, i would stay here.

short and sharp back_inserter tutorial #noIV

P182 [[STL tutorial and ref]] offers a tutorial on back_inserter, used in copy().

back_inserter(myVector) is a kind of factory method[1] that manufactures a back_insert_iterator object that’s a wrapper on myVector.

Whether in java or c++, this wrapper holds a pointer to the container object myVector. In the copy loop, every assignment through the wrapper (the back_insert_iterator) triggers a call to myVector.push_back(), so myVector grows by one element each time.

[1] back_inserter() is a free function. It’s hard to tell from the tutorial.
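A short sketch of both views: the usual copy() call, and a hand-written loop that does roughly what copy() does with the wrapper. The function names appendAll and appendAllByHand are mine.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Copies `src` onto the end of `dest`. dest needs no pre-sizing because
// each assignment through the back_insert_iterator calls dest.push_back().
void appendAll(const std::vector<int>& src, std::vector<int>& dest) {
    std::copy(src.begin(), src.end(), std::back_inserter(dest));
}

// Roughly what copy() does with the wrapper, spelled out by hand.
void appendAllByHand(const std::vector<int>& src, std::vector<int>& dest) {
    std::back_insert_iterator<std::vector<int>> out(dest);  // what the factory returns
    for (int x : src) {
        *out = x;  // operator= on the iterator calls dest.push_back(x)
        ++out;     // does nothing for this iterator type; kept for the iterator protocol
    }
}
```

Without back_inserter, copy() into an empty vector's begin() would write past the end, since plain iterators assign to existing elements and never grow the container.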

##[09]low latency techniques

roughly ranked in terms of interviewer’s emphasis

  1. OOM and GC. memory conservation.
    1. Avoid dupe objects.
    2. avoid keeping large order state object
  2. parallelism
  3. avoid recursion
  4. JVM tuning beyond GC tuning
  5. network
    1. avoid network like distributed cache. Favor single-JVM designs
    2. NIO
    3. FIX payload size reduction, such as encoding and compression
  6. MOM
    1. multicast. RV is more stable than the newly invented multicast-JMS
    2. MOM message size control
    3. !! Peer-to-peer messaging eliminates message brokers and daemon processes

http://download.oracle.com/docs/cd/E13150_01/jrockit_jvm/jrockit/geninfo/diagnos/tune_fast_xaction.html
http://www.sun.com/solutions/documents/pdf/fn_lowlatency.pdf — diagrams, brevity.
http://www.quantnet.com/forum/showthread.php?t=5736
http://en.wikipedia.org/wiki/Low_latency_(capital_markets)#Reducing_Latency_in_the_Order_Chain

JRockit real time — http://www.oracle.com/appserver/docs/low-latency-capital-markets-whitepaper.pdf

JNI memory leak, briefly

Java has a problem with accessing resources outside the JVM, such as directly accessing hardware. Java solves this with native methods (JNI) that allow calls to functions written in another language (currently only C and C++ are supported). …

There is performance overhead in JNI, especially for large messages, due to copying of the data from the JVM’s heap onto the system buffer. JNI also may lead to memory leaks because in C the programmer is responsible for allocating and freeing the memory. GC can’t go beyond the jvm heap to visit the malloc free store. See post on wholesale/retail.

Even regular java objects on java heap may become memory leak when you add JNI. http://www.iam.ubc.ca/guides/javatut99/native1.1/implementing/array.html says (using int array for example) that ReleaseIntArrayElements will “unpin” the Java array if it has been pinned in memory. I believe anything pinned by JNI will not be garbage collected. If JNI programmer forgets to release, it’s similar to a java programmer forgetting to clear a static hashmap

double-colon q[ :: ] in c++

Recently I hit this confusion multiple times. The compiler is not confused because the usage scenarios are distinct —

  1. usage — namespace1::free_func2(). Example – you can specify std::swap() instead of your local swap()
    • somewhat similar to java package separator the dot
  2. usage — classA::staticMethod1()
  3. usage (very useful) — superclassA::instanceMethod3(). This is equivalent to this->superclassA::instanceMethod3() //tested
  4. usage — classB::localType

Actually, the first 2 scenarios have similar meanings. Java simply merged both into the dot.

boost::this_thread::get_id() shows two usages.

I believe a field name can replace the function name.

MyNamespace1::MyNamespace2::j = 10; // nested namespace
std::terminate() and ::operator new() are similar to System.java and Runtime.java methods.

In a class template, the #2 and #4 usages can confuse the compiler. See P670 [[c++primer]].
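A compact sketch covering the four usages in one place. All names (ns1, Base, Derived and so on) are invented for illustration.

```cpp
#include <cassert>
#include <string>

namespace ns1 {
    int j = 0;                                            // a "field name" after ::
    std::string freeFunc() { return "ns1::freeFunc"; }    // usage 1
}

struct Base {
    virtual ~Base() = default;
    virtual std::string greet() const { return "Base"; }
};

struct Derived : Base {
    typedef int LocalType;                                   // usage 4: nested type
    static std::string staticMethod() { return "static"; }   // usage 2
    std::string greet() const override {
        // usage 3: name the superclass version explicitly,
        // which bypasses virtual dispatch
        return "Derived+" + Base::greet();
    }
};
```

From the outside, d.Base::greet() on a Derived object d also selects the base version, which is the same qualified-call mechanism as usage 3.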

JMS acknowledgment – cheatsheet

If you can only remember one thing about JMS Ack, it’s

#1) 2 legs – there’s ACK on both legs of producer => broker => consumer. The 2 ACK can be present or absent independent of each other. See diagrams in [[JMS]]

#2) reliability – one of the most important features of JMS is reliability. Ack is a cornerstone.

tibrv is slightly less reliable. but CM-transport also uses confirmation on each individual message… see other blogpost.

financial jargon: short

shorting is essential to hedging, and essential to risk analytics

“shorts” means short positions or short trades, and can also mean the holders of short positions

“the short” = the seller, and “the long” = the buyer in a trade.

You COVER your SHORT POSITION by buying.

You “go short” at a price and cover at another (hopefully lower) price.

In other words, you have a “short sale” at $x, covered at $y.

stock lending

There’s probably no “electronic market” to publish offerings. I’m a GS trader with good friends in MS and UBS, I will get weekly(!) inventory feeds from them, perhaps an ftp, a spreadsheet etc. Then we agree on the terms [1] over phone. No automatic agreement! After agreement, borrower can request to borrow over web/ftp.. and delivery can be automated.

At its core, SL system keeps inventory, stocks lent out, stocks borrowed.

[1] Unlike repo, market price isn’t relevant. If I borrow 100 IBM, i agree to return it x days later. No buying! Repo involves buying ie (temporary) change of ownership.

Above is for stocks. For futures and FI instruments, there are different systems.

JGC pauses + stop-the-world JGC

These are some random notes. What we need later is to get to the bottom of the issue and summarize the key points.

I guess GC can cause pauses even outside those brief stop-the-world moments. Concurrent collection is not immune to pauses. “Incremental CMS works by doing very small stop_the_world phases to accomplish the concurrent phases, instead of using concurrent threads. Sun recommends this mode for one or two processors.”

Concurrent garbage collectors seldom stop program execution, except perhaps briefly when the program’s execution stack is scanned. “when app needs to allocate a new object, the runtime system may need to suspend it until the collection cycle is complete, or …”

Concurrent GC runs concurrently_with_the_application.
http://www.softwareengineeringsolutions.com/blogs/2010/04/30/garbage-collection-in-java-part-2/ is a short blog article. It says a constructor may block when GC is called upon (asynchronously?) to free up more memory.

when does java heap expand, briefly

As advised in [[solaris 10 performance]] and numerous experts, better set -Xms and -Xmx identical, to avoid heap expansion, but what if we can’t?

Expansion occurs if GC is unable to free enough heap storage for an allocation request, or if the JVM determines that expanding the heap is required for better performance.

I believe expansion can only happen after a full (but ineffective) GC.

##triggers for JGC: various comments

Some big trading engine (UBS?) runs just 1 GC cycle each day. I wonder … Q: how often does GC run normally?

P155 [[SanFrancisco]] nicely outlines that GC can start
* upon alloc-failure
* when jvm explicitly waits (IO, sleep …)
* every 10 seconds, but at a low priority

– In addition, application can request GC.

Synchronous GC is predominant, but google “asynchronous garbage collection idle periods of low activity” and you see GC can start when system is idle. Creditex java guys also hinted at that.

In my experiment, I do see GC spring into action when app is … paused.

http://java.sun.com/docs/hotspot/gc1.4.2/faq.html says about JDK1.4 —
* In the default garbage collector, a generation is collected when it is full (i.e., when no further allocations can be done from that generation).
* The concurrent low pause collector starts a collection when the occupancy of the tenured generation reaches a specified threshold (by default 68%).
* The incremental low pause collector collects a portion of the tenured generation during each young generation collection.
* A collection can also be started explicitly by the application.

http://www.softwareengineeringsolutions.com/blogs/2010/04/30/garbage-collection-in-java-part-2/ drills into alloc-failure —

The JVM is unable to fulfill the request (alloc-failure), and so it awakens the garbage collector to come to its aid. This results in a Young Generation collection cycle. This is the simplest and first answer to the opening question. Therefore, the novice thinks this is the only answer.

A Young Generation collection is called for because there isn’t enough space available in Eden to satisfy this allocation request (alloc-failure). However, unlike the earlier scenario, the promotion of the middle-aged object from the active Survivor space to being an elderly object in the old generation cannot proceed because there is no free space available in the Old Generation. In this case, the Young Generation collection is put on hold, while an Old Generation collection cycle is spun up to free up space in the Old Generation.

shared_ptr QnA

I think we should learn shared_ptr (sptr) before weak_ptr, scoped_ptr or intrusive_ptr.

A shared_ptr object is an object — NOT a variable. It can either be a stackVar or a field (or a global). As a stackVar, the var will get out of scope, and the shared_ptr’s destructor will run and do its job. As a field, will the sptr dtor do its job? yes see P22.

A shared_ptr (and its sisters) keeps the shared counter somewhere on heap. Allocation of this counter is considered expensive, since you may allocate millions of counters each second. See http://stackoverflow.com/questions/14482830/stdshared-ptr-thread-safety first answer

Some people may say a shared_ptr can be used much like a raw ptr, but i don’t agree.  I feel you need to see enough (simple and non-trivial) examples. For example

  • raw ptr can become null
  • raw ptr size is well-known
  • raw ptr thread safety is simple

Note some sptr ctor creates the “1st ref” in the ref-counting pointer group; other ctors create “2nd ref”. Make sure you know which ctor is which type.

Q: Are both the same — mySp.get() vs *mySp
A: No. operator*() is equivalent to *mySp.get()

Q8a: is it normal to replace a ptr field with a sptr field?
A8a: yes. See eg on P21
Q8b: in the destructor, do you need to handle the sptr field as you would the raw ptr field?
A8b: i think u can relax. Dtor of the sptr field will be invoked in the DCB order. If there’s a raw ptr field, nothing happens to its pointee automatically; the pointee is not reclaimed unless you call delete yourself.

Q: do we ever call operator new and assign it to a sptr?
A: yes see p21

Q: is a sptr always cleaned up upon exception?
A: i think so. stack unwinding calls sptr object’s dtor

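Q: if a method expects a particular ptr type, can you pass in a sptr?
A: i don’t think so. Compiler error, since there’s no implicit conversion from a sptr to a raw ptr. You can pass mySp.get() instead.

Q: if a method expects a particular type of sptr, can you pass in a raw ptr?
A: i don’t think so. The sptr ctor taking a raw ptr is explicit, so you have to construct the sptr yourself.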

Q: if a sptr stackVar goes out of scope, will the destructor get called?
A: yes P21

Q: deleting a sptr vs destructing a sptr?
A: a sptr is not a 4-byte thingy created with NEW, so u can’t call delete on it.

Q: do we ever destruct a shared_ptr explicitly?
A: i think so
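A few of the answers above can be checked with a small sketch. It assumes std::shared_ptr (C++11), which may differ in small ways from the Boost version these notes were written against; Trade and the function names are made up.

```cpp
#include <cassert>
#include <memory>

struct Trade {                       // hypothetical pointee type
    explicit Trade(int q) : qty(q) {}
    int qty;
};

// Expects a raw ptr. Passing a sptr directly does not compile; pass sp.get().
int qtyViaRawPtr(const Trade* t) { return t->qty; }

// Expects a sptr by value, so the call itself adds one owner.
// Passing a raw ptr directly does not compile (explicit ctor).
long ownersSeenBy(std::shared_ptr<Trade> sp) { return sp.use_count(); }

// use_count observed at three moments: after creating the "1st ref",
// after copying to a "2nd ref", and after the copy goes out of scope.
struct Counts { long first, second, afterScope; };

Counts countOwners() {
    std::shared_ptr<Trade> a(new Trade(100));  // ctor from raw ptr: starts a new group
    long first = a.use_count();
    long second = 0;
    {
        std::shared_ptr<Trade> b(a);           // copy ctor: joins the existing group
        second = a.use_count();
    }                                          // b's dtor decrements the count
    return Counts{first, second, a.use_count()};
}
```

On get() versus operator*: get() returns the raw address, while *mySp returns a reference to the pointee, so &*mySp == mySp.get() for a non-empty sptr.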

a C/C++ pointer variable has a target-data-type

(see also post on void ptr)
Consider a simple declaration

  int *intp;

Q1: If a pointer variable holds an address, why is it necessary (to compilers) to attach a data type to a pointer variable? From then on the intp variable will be treated always, always as a pointer to a __int__.
A1: when an address is assigned to this pointer, the address is treated as the starting address of a block of memory. An essential info is the size of the block. A wild compiler can treat 8 bytes starting at that address as an object, or 888888 bytes!
A1: every operation on “intp” must be valid for integers. Compilers must check that.

To further answer this question, consider a more basic question —

  int intVar;

Q2: why is it necessary to attach a data type to a nonref variable? From then on the intVar variable will be treated always, always as an int.
A2: I think compiler (yes compiler) must allocate the right amount of memory for the object
A2: compiler must access-check every operation on it. Compiler won’t allow concatenation for an int, right?

Q3: how large is the intp variable itself, if it has to hold both the address and the type?
A3: 32 bits on a 32-bit machine (64 bits on a 64-bit machine). I believe the data type is not “carried” by the intp variable so doesn’t increase its size. I think it may be a compile-time information rather than runtime information. At runtime, the data type of the pointer is lost and not checked by anyone
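One place where the target type visibly matters is pointer arithmetic. The sketch below measures how many bytes "+1" moves a pointer of a given type; the helper name stride is my own.

```cpp
#include <cassert>
#include <cstddef>

// How many bytes does "+1" move a pointer of type T*?
// The compiler scales the step by sizeof(T), using only compile-time
// knowledge of the target type.
template <typename T>
std::size_t stride() {
    T arr[2] = {};
    const char* first = reinterpret_cast<const char*>(&arr[0]);
    const char* second = reinterpret_cast<const char*>(&arr[0] + 1);
    return static_cast<std::size_t>(second - first);
}
```

On mainstream platforms sizeof(int*) equals sizeof(double*) even though the strides differ, which fits A3: the type lives in the compiler's bookkeeping, not inside the pointer variable.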

digestible insights on servlet threading

Background: we learnt servlet threading many times but still didn’t internalize it. Let’s distill into single-word keywords.

* objects in application-context are obviously thread unsafe. Before you consider session scope or servlet instance vars, consider application-context
* session scope objects can be modified from 2 browsers. This is more digestible than servlet instance vars.

Now let’s distill the digestible insights on servlet instance var.

* one servlet instance 1:m multiple threads. This is the final knowledge pearl.
* this 1:m is the prevalent set-up. 1:1 is rare

slist ^ array list

* linked list (LL) = doubly linked list
* array list (AL) = re-sizable array

LL advantages:
* operation at both ends. stack, queue, deque. No resizing required.

AL advantages:
* random access by index lookup. get or set.

AL add() and remove() require shifting or copying.
LL add() and remove() need a traversal from either end to locate the slot.

callable statement notes

* For procs that write to DB without returning any OUT param, I found PreparedStatement (PS) and CallableStatement (CS) interchangeable.
* For procs that only read from DB via SELECT, I found PS and CS interchangeable.

The only CS i can’t convert to a PS are those returning something via params “{? = call proc1 @param1=?, @param2=? }”

Now If you look at the get*() methods and set*() methods, you notice an interesting discrepancy.

* getFloat() can take a paramIndex or a paramName arg, and both versions are declared in CallableStatement, but
* for setFloat(), CallableStatement itself declares only the paramName version; the paramIndex version is inherited. Here’s why:

callableStatement3.setFloat(1, 0.95f); // this method is inherited from parent interface PreparedStatement. Sets first “?” to 0.95.

callableStatement3.setFloat("id", 0.95f); // this method is not declared in PS, since PS doesn’t support param name.

callableStatement3.getFloat(1); // this method is declared in CallableStatement interface. Reads 1st OUT param. PS doesn’t let you read anything except via a ResultSet.

callableStatement3.getFloat("id"); // ditto

Lastly, do not confuse the CS get*() methods with the ResultSet.get* methods. CS get*() reads params, not rows selected.

select() syscall multiplex vs 1 thread/socket ]mkt-data gateway

Low-volume market data gateways could multiplex using select() syscall — Warren of CS. A single thread can service thousands of low-volume clients. (See my brief write-up on epoll) Blocking socket means each read() and write() could block an entire thread. If 90% of 1000 sockets have full buffers, then 900 threads would block in write(). Too many threads slow down entire system.

A standard blocking socket server’s main thread blocks in accept(). Upon return, it gets a file handle. It could save the file handle somewhere then go back to accept(). Over time it will collect a bunch of file handles, each being a socket for a particular network client. Another server thread can then use select() to talk to multiple clients, while the main accept() thread continues to wait for new connections.
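A minimal POSIX sketch of the select() part: one thread asks which of many descriptors have data waiting. The helper readableFds is my own name, and a real gateway would loop over this and read from each ready socket; accept() and error handling are left out.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <vector>

// Returns the descriptors in `fds` that have data waiting, blocking for at
// most `timeoutMs` milliseconds. One thread can watch many descriptors.
std::vector<int> readableFds(const std::vector<int>& fds, int timeoutMs) {
    fd_set readSet;
    FD_ZERO(&readSet);
    int maxFd = -1;
    for (int fd : fds) {
        FD_SET(fd, &readSet);
        if (fd > maxFd) maxFd = fd;
    }

    timeval tv;
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;

    std::vector<int> ready;
    // select() overwrites readSet, leaving only the ready descriptors set.
    if (select(maxFd + 1, &readSet, nullptr, nullptr, &tv) > 0) {
        for (int fd : fds)
            if (FD_ISSET(fd, &readSet)) ready.push_back(fd);
    }
    return ready;
}
```

The function works on any file descriptor, so it can be tried out with pipes standing in for client sockets. Note select() is limited to descriptors below FD_SETSIZE, one reason people move to epoll for very large client counts.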

However, in high volume mkt data gateways, you might prefer one dedicated thread per socket. This supposedly reduces context switching. I believe in this case there’s a small number of sockets preconfigured, perhaps one socket per exchange. In such a case there’s no benefit in multiplexing. Very different from a Google web server.

This dedicated thread may experience short periods of silence on the socket – I guess market data could come in bursts. I was told the “correct” design is spin-wait, with a short sleep between iterations. I was told there’s no onMsg() in this case. I guess onMsg() requires another thread to wake up the blocking thread. Instead, the spin thread simply sleeps briefly, then reads the socket until there’s no data to read.

If this single thread and this socket are dedicated to each other like husband and wife, then there’s not much difference between blocking vs non-blocking read/write. The reader probably runs in an endless loop, and reads as fast as possible. If non-blocking, then perhaps the thread can do something else when socket buffer is found empty. For blocking socket, the thread is unable to do any useful work while blocked.
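Here is a rough sketch of the "read until there's no data" step on a non-blocking descriptor, assuming POSIX. The names make_nonblocking() and drain() are made up for illustration. With O_NONBLOCK set, read() returns -1 with errno EAGAIN (or EWOULDBLOCK) instead of blocking when the buffer is empty, which is what lets the thread sleep briefly or do other work.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: switch a descriptor to non-blocking mode.
bool make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical helper: read everything currently buffered, then return.
// Never blocks, provided fd is in non-blocking mode.
std::string drain(int fd) {
    std::string out;
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            out.append(buf, static_cast<std::size_t>(n));
            continue;
        }
        if (n == -1 && errno == EINTR) continue;  // interrupted, retry
        // n == 0: peer closed. n == -1 with EAGAIN/EWOULDBLOCK: buffer empty.
        // Any other error also ends this pass in this sketch.
        break;
    }
    return out;
}
```

The dedicated thread would alternate between drain() and a short sleep. A blocking socket would make the sleep unnecessary, at the cost of the thread being stuck inside read().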

I was told UDP asynchronous read will NOT block.

sybase if-else

Background: after a few years of usage, I still feel the need for a few simple rules, like the simple rules in “perlreftut”. Here they are, each in 5 words or less.

* select => parentheses. Any select in an if-condition had better wear a pair of parentheses. I guess they

* select => single value. Any select in an if-condition must produce a single value, including count()

* begin. Begin-end is harmless. May increase code size a bit but actually often helps readability.

* no slip-through. In an if-else, either the if block or the else block must execute. Nothing (including nulls) will slip through between the 2.

## low-level c++(vs java) sys design considerations #1st take

Compared to java architect, a c++ architect needs to worry about more things —

* memory management — avoid double delete, dangling ptr access, leak…
* ptr ownership (who will delete) scheme
* smart ptr — when, where
* copy ctor, op=
* what objects to put on heap ^ stack ^ global
* choose methods ^ static methods ^ free functions
* when to use templates, with the benefit and complexity

STL container to hold Child objects +! pointers – slicing

http://ootips.org/yonat/4dev/smart-pointers.html — but slicing problem!

#include <iostream>
#include <vector>
using std::cout;
using std::endl;
struct Base {
    virtual void print() const{cout<<1<<"\n";}
    virtual ~Base(){}
};
struct Derived : public Base {
    virtual void print() const{cout<<2<<"\n";}
};
int main(){
    Base b;
    Derived d;
    std::vector<Base> v; // holds Base objects by value

    v.push_back(b); // OK
    v.push_back(d); // no compile error, but only the Base part of d is copied in
    cout<<  v.size() <<endl; // prints 2
    std::vector<Base>::const_iterator pos;
    for(pos = v.begin(); pos != v.end(); ++pos){
       pos->print(); // prints 1 both times. Derived object sliced!
    }
}
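For contrast, a small sketch of the usual workaround: hold smart pointers in the container rather than objects. The class and function names here are mine and the print() calls are replaced by name() so the two cases are easy to compare.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Base {
    virtual std::string name() const { return "Base"; }
    virtual ~Base() = default;
};
struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Container of objects: the Derived temporary is sliced down to a Base.
std::vector<std::string> names_by_value() {
    std::vector<Base> v;
    v.push_back(Base());
    v.push_back(Derived());
    std::vector<std::string> out;
    for (const Base& b : v) out.push_back(b.name());
    return out;
}

// Container of owning pointers: the dynamic type survives.
std::vector<std::string> names_by_pointer() {
    std::vector<std::unique_ptr<Base>> v;
    v.push_back(std::unique_ptr<Base>(new Base()));
    v.push_back(std::unique_ptr<Base>(new Derived()));
    std::vector<std::string> out;
    for (const auto& p : v) out.push_back(p->name());
    return out;
}
```

names_by_value() yields "Base" twice, while names_by_pointer() yields "Base" then "Derived". The unique_ptr elements also delete their objects when the vector goes away, which answers the ownership question.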

template objects in spring framework

*JMS — a template instance is a jms session
** anon inner class

*JDBC — a template instance is a database connection
** callback — an anon inner class

*hibernate — a template instance is a hibernate Session

*transaction — a template instance is a … transaction session?
** callback — an anon inner class

Spring Templates follow the template-method design pattern. The callback objects provide the missing steps in the template method.
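To show the shape of the pattern without any Spring code, here is a language-neutral sketch in C++. Resource and ResourceTemplate are invented names standing in for a session or connection and its template. This is not Spring's API. The template owns the fixed steps (open, close) and the callback supplies the one step that varies.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a JMS session, DB connection or Hibernate Session.
struct Resource {
    std::vector<std::string>& trace;
    void run(const std::string& command) { trace.push_back("run:" + command); }
};

class ResourceTemplate {
public:
    explicit ResourceTemplate(std::vector<std::string>& trace) : trace_(trace) {}

    // The "template method": fixed skeleton, with the middle step supplied
    // by the caller as a callback (an anon inner class in Java).
    int execute(const std::function<int(Resource&)>& callback) {
        trace_.push_back("open");
        Resource resource{trace_};
        try {
            int result = callback(resource);
            trace_.push_back("close");
            return result;
        } catch (...) {
            trace_.push_back("close");  // cleanup happens even on failure
            throw;
        }
    }

private:
    std::vector<std::string>& trace_;
};
```

A caller writes only the interesting part, for example tmpl.execute([](Resource& r) { r.run("select 1"); return 1; }), and never touches open or close.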

asynchronous: meaning…@@

When people talk about async, they could mean a few things.

Meaning — non-blocking thread. HTTP interaction is not async because the browser thread blocks.
* eg: email vs browser
* eg: http://search.cpan.org/~mewp/sybperl-2.18/pod/sybperl.pod async query

Meaning — initiating thread *returns* immediately, before the task is completed by another thread.

Meaning — no immediate response, less stringent demand on response time. An HTTP server needs to service the client right away since the client is blocking. As a web developer I constantly worry about round-trip duration.

Meaning — delayed response. In any design where a response is provided by a different thread, the requester generally doesn’t need an immediate response. Requester thread goes on with its business.
* eg: email, MQ, producer/consumer
* eg: jms onMessage() vs poll, but RV is purely async
* eg: acknowledgement messages in zed, all-http

You can (but i won’t) think of these “meanings” as “features” of async.

technologies of citi FX trading

The FX department you applied for gave a presentation. They said the dominant technologies are

· OLAP
· Swing still works but being replaced by C# (WPF, Silverlight)
· Messaging – RV, EMS, multicast (I think he means RV or JMS multicasting)
· Socket programming
· Grid computing
· Sharepoint
· FIX protocol

Compared to muni, I feel FX has more front end UI systems, because FX has a more diverse client base. Muni has no or limited client-facing UI, because muni investors call their financial analysts to trade. In contrast, an FX system has a large number of retail customers (using web) and a smaller number (700?) of institutional clients including exporters, governments, airlines, manufacturers and commercial banks, who often receive and pay out in multiple currencies. These institutional clients get to use a “professional” front end system, swing/WPF, not web-based, even though the connection is over the internet. Both muni and FX systems have a lot of (20 – 30) intranet UI systems in http or swing.

Risk system takes in current prices (fast changing) and estimates unrealized PnL on the dealer’s book i.e. trading account.

Even if we include FX options (not futures), most FX trading is OTC. CME is listing fx option contracts and some big dealers (like MS) are following. Some instruments (futures..) are guaranteed by exchanges such as CME. In comparison, most muni trading is OTC, since muni is a dealer market, with a few small brokers in between.

FIX head^tail

Every single msg (session or app) has exactly 1 head and 1 tail.

Up to FIX.4.4, the header starts with three mandatory fields, always in this order (besides other required and optional ones): 8 (BeginString), 9 (BodyLength), and 35 (MsgType).

Every single msg (session or app) has a tail, containing a single tag (10) i.e. the checksum.

==> Given a long string, how do you carve out a full message (session or app)? Look for 8= and 10=

I feel head/tail =~= tcp envelope, and Body =~= payload
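A minimal sketch of the "look for 8= and 10=" idea, assuming fields are delimited by SOH (0x01) as on the wire. The function names field(), fix_checksum() and carve_messages() are my own. The checksum is the sum of every byte before the 10= field, modulo 256, written as three digits. A production parser would more likely trust BodyLength (9) than scan for the tail, since this scan assumes no field value contains SOH followed by "10=".

```cpp
#include <cstdio>
#include <string>
#include <vector>

const char SOH = '\x01';  // FIX field delimiter

// Hypothetical helper: renders one tag=value field with its delimiter.
std::string field(int tag, const std::string& value) {
    return std::to_string(tag) + "=" + value + SOH;
}

// Sum of all bytes (head + body, up to and including the SOH before "10="),
// modulo 256, as a 3-digit string.
std::string fix_checksum(const std::string& bytes) {
    unsigned sum = 0;
    for (unsigned char c : bytes) sum += c;
    char buf[4];
    std::snprintf(buf, sizeof buf, "%03u", sum % 256);
    return buf;
}

// Splits a buffer into complete messages. A trailing partial message is left out.
std::vector<std::string> carve_messages(const std::string& buffer) {
    const std::string tail_marker = std::string(1, SOH) + "10=";
    std::vector<std::string> messages;
    std::size_t pos = 0;
    while (pos < buffer.size()) {
        std::size_t head = buffer.find("8=", pos);
        if (head == std::string::npos) break;
        // "8=" only counts at a field boundary (not inside e.g. "108=").
        if (head != 0 && buffer[head - 1] != SOH) { pos = head + 1; continue; }
        std::size_t tail = buffer.find(tail_marker, head);
        if (tail == std::string::npos) break;          // tail not received yet
        std::size_t end = buffer.find(SOH, tail + 1);  // SOH closing the 10= field
        if (end == std::string::npos) break;
        messages.push_back(buffer.substr(head, end - head + 1));
        pos = end + 1;
    }
    return messages;
}
```

Feeding it two heartbeats followed by half a message gives back the two complete messages and leaves the fragment for the next read.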

[09] sequence num ] FIX

Sequence num can be considered part of the header for many app messages. It’s a control device. I consider it as a session-related tag on a non-session message.

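Officially, sequence num is NOT one of the three leading header fields. Up to FIX.4.4, the header started with three fields: 8 (BeginString), 9 (BodyLength), and 35 (MsgType). That said, the FIX spec's standard header does list MsgSeqNum (34) as a required field, so every message, session or app, carries one.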

There are several tags about sequence num:
(34) MsgSeqNum Integer message sequence number.
(36) NewSeqNo New sequence number
(43) PossDupFlag Indicates possible retransmission of message with this sequence number
(45) RefSeqNum Reference message sequence number

https://drivewealth-fix-api.readme.io/docs/resend-request-message describes a resend request message i.e. a 35=2 message, containing —

(7) BeginSeqNo and (16) EndSeqNo specify the range of requested messages
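A rough sketch of how a receiver might use MsgSeqNum (34) to spot a gap and work out the BeginSeqNo (7) and EndSeqNo (16) for a resend request. InboundSequence and ResendRange are invented names. The handling of a lower-than-expected number is deliberately left out, since real engines treat that case specially (PossDupFlag and so on).

```cpp
// Hypothetical pair of values for tags 7 and 16 of a 35=2 message.
struct ResendRange {
    int begin_seq_no;  // tag 7
    int end_seq_no;    // tag 16
};

class InboundSequence {
public:
    // Feed the MsgSeqNum (34) of each inbound message.
    // Returns true and fills 'gap' when messages were skipped.
    bool on_message(int msg_seq_num, ResendRange& gap) {
        if (msg_seq_num > expected_) {
            gap.begin_seq_no = expected_;
            gap.end_seq_no = msg_seq_num - 1;
            return true;  // expected_ stays put until the gap is filled
        }
        if (msg_seq_num == expected_) ++expected_;
        // msg_seq_num < expected_ is ignored in this sketch.
        return false;
    }

    int expected() const { return expected_; }

private:
    int expected_ = 1;  // sequence numbers start at 1 in a new session
};
```

If messages 1 and then 4 arrive, the gap reported is 2 to 3, which is what would go into the resend request.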

FIX session^app messages

http://btobits.com/fixopaedia/fixdict11/index.html

I feel even an app message has session-related tags.

Top-down learning of the FIX tags:

* learn basic fields of head and tail — see another post
* learn basic message types (msgType or Tag 35) within session messages aka admin messages
** remember — Each session starts with a _session_ message LOGON and ends with session message LOGOUT
*** LOGON is a full message with a head containing a field msgType=A meaning LOGON

* learn basic message types (msgType or Tag 35) within _app_ messages
** learn basic Execution Report (35=8), the most common app message
** learn basic trading scenarios consisting of FIX dialogs

FIX common app message types

This post is about “Business / Application messages”, not “session/admin messages”

http://en.wikipedia.org/wiki/Order_(exchange) — different order types

common tag=value pairs

35=D — msgType = NewOrderSingle
35=8 — msgType = ExecutionReport
35=G — msgType = OrderCancelReplaceRequest
12=0.05 — Commission (Tag = 12, Type: Amt) Note if CommType (13) is percentage, Commission of 5% should be represented as .05.

Used in:

* Quote Status Report (AI)
* Quote Response (AJ)
* Quote (S)

FIX session is like TCP + sqlplus

Each session starts with a session-message LOGON and ends with a session-message LOGOUT. Session-messages are also known as “admin messages”. Their main purpose is session management and flow control, as in TCP.

To maintain the session, I think FIX uses not only the session messages — each app message also has session-related tags like (10)checksum, (9)bodyLength and (34)sequence numbers — used to maintain a session between client and server.

retrans-request — is a message to ask for a resend of a missing message

heartbeat messages — between 2 parties

FIX keywords and intro

tag=value, where tag is always a number, and value is often char(1) — Both need the dictionary for interpretation.

Whenever there’s an enum of possible values, FIX would encode them into very short codes (often char(1) ). As a result, encoding/decoding using dictionary is prevalent.
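A small sketch of the dictionary idea, with made-up function names parse_fields() and msg_type_name(). It splits a message into tag=value pairs and decodes tag 35 through a tiny hand-written dictionary. The dictionary holds only a handful of well-known MsgType codes, and the std::map drops repeated tags, so repeating groups are out of scope here.

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Splits "tag=value<delim>tag=value<delim>..." into a tag -> value map.
// The wire delimiter is SOH (0x01); logs often show '|' instead.
std::map<int, std::string> parse_fields(const std::string& msg, char delim = '\x01') {
    std::map<int, std::string> fields;
    std::size_t pos = 0;
    while (pos < msg.size()) {
        std::size_t end = msg.find(delim, pos);
        if (end == std::string::npos) end = msg.size();
        std::size_t eq = msg.find('=', pos);
        if (eq != std::string::npos && eq < end) {
            int tag = std::atoi(msg.substr(pos, eq - pos).c_str());
            fields[tag] = msg.substr(eq + 1, end - eq - 1);
        }
        pos = end + 1;
    }
    return fields;
}

// Decodes a MsgType (35) code using a deliberately tiny dictionary.
std::string msg_type_name(const std::string& code) {
    static const std::map<std::string, std::string> dictionary = {
        {"0", "Heartbeat"},      {"2", "ResendRequest"},
        {"5", "Logout"},         {"8", "ExecutionReport"},
        {"A", "Logon"},          {"D", "NewOrderSingle"},
        {"G", "OrderCancelReplaceRequest"},
    };
    std::map<std::string, std::string>::const_iterator it = dictionary.find(code);
    if (it == dictionary.end()) return "Unknown(" + code + ")";
    return it->second;
}
```

For a log line such as 8=FIX.4.2|35=D|55=IBM|54=1| parsed with '|' as the delimiter, tag 35 comes back as "D", which the dictionary turns into NewOrderSingle.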

http://www.ksvali.com/fix-protocol-faqs/#2.1

http://www.ksvali.com/2009/02/fix-protocol-videos-on-youtube-finally/

STL allocators – good book

Never quizzed….

See the ebook [[c++ in a nutshell]], sections on header files and on Allocator. Many simple Allocator implementations and usage scenarios. Looking at the code, I realize the allocator idea started as a way to “abstract out” the allocation/deallocation job duty traditionally performed by new/delete.

So there are 2 standard, approved techniques to “mess with” new/delete —
1) customize them for your class and, by the way, have them inherited by subclasses
2) allocators
(In some cases, like some items of the original [[EffC++]], these techniques are kind of combined. But it’s good to have a mental divide between the 2 techniques.)

STL containers do NOT directly call new/delete. They all use allocators, which often call new/delete internally.

The ebook also says allocators can implement debugging or validity checks to detect programmer errors such as memory leaks or double frees. That’s because allocators, like customized op-new, intercept every memory allocation request from /host-application/.
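As an illustration of the interception point, here is a minimal C++11-style allocator that only counts calls. CountingAllocator, AllocStats and stats() are names I made up. It is not taken from the book. A leak check then amounts to comparing the two counters once the container is gone.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical global counters shared by all CountingAllocator instances.
struct AllocStats {
    std::size_t allocations = 0;
    std::size_t deallocations = 0;
};
inline AllocStats& stats() {
    static AllocStats s;
    return s;
}

template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    // Every container allocation request passes through here...
    T* allocate(std::size_t n) {
        ++stats().allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    // ...and every release through here.
    void deallocate(T* p, std::size_t) noexcept {
        ++stats().deallocations;
        ::operator delete(p);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}
```

Declaring a container as std::vector<int, CountingAllocator<int>> is enough to route its memory traffic through the counters. The allocator still calls operator new and operator delete internally, matching the point made above.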