wpf triggers, first lesson

I believe
– property triggers — live inside a style, including an embedded control template
– data triggers — live inside 1) data templates, 2) styles, or 3) a control template embedded in a style
( Note a data template only accepts data triggers, not property triggers. See http://stackoverflow.com/questions/17598200/datatrigger-binding-in-wpf-style )

Triggers are predominantly created in xaml. It’s possible to do it in c#, but rare.

Triggers are all about visual effects. All these triggers modify the properties of visuals. PTriggers watch properties; DTriggers watch data; ETriggers watch events.

The most common trigger is the property trigger. The other two common types are the data trigger and the event trigger.

It’s best to internalize one simple property-trigger usage before looking at advanced triggers. In the example from [[wpf succinctly]], you can have 2 triggers on a visual. The 2 triggers can hit the same visual property, like the background color, but only one trigger will fire.

Different aspects (properties) of the visual can be managed through different setters within a single trigger. Each setter sets one (dependency) property to control one aspect.

Fwd: design a finite state machine for a FIX engine

We need a cache (OMS/EMS?) of all the pending orders we created. All from-exchange FIX messages must match one of the pending orders. Once matched, we examine its current state.

If the message is belated, then we must reject it (informing all parties involved).

If the message is delivered out of sequence and too early, then we must keep it on hand. (a 2nd cache)

If the message is for immediate consumption, then consume.

———-

…the FSM (finite state machine) in FIX is about state transition and should be pertinent to all state machines. Basically, an order goes from pending-new to new, then to filled or canceled. From canceled, it cannot go back to new or filled.

Anthony
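
A minimal Java sketch of these transitions, combining the pending-order cache idea above with Anthony's state diagram. All names (OrdState, Order) are illustrative, not from any real FIX engine:

enum OrdState { PENDING_NEW, NEW, PARTIALLY_FILLED, FILLED, CANCELED }

class Order {
    private OrdState state = OrdState.PENDING_NEW;

    // returns false if the transition is illegal, e.g. a belated message we must reject
    synchronized boolean transition(OrdState target) {
        switch (state) {
            case PENDING_NEW:
                if (target == OrdState.NEW || target == OrdState.CANCELED) {
                    state = target; return true;
                }
                return false;
            case NEW:
            case PARTIALLY_FILLED:
                if (target == OrdState.PARTIALLY_FILLED || target == OrdState.FILLED
                        || target == OrdState.CANCELED) {
                    state = target; return true;
                }
                return false;
            default: // FILLED and CANCELED are terminal: no way back to new or fill
                return false;
        }
    }
}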

 

tibRV java app receiving msg – the flow

The flow:
* my app creates “my private queue”, calling the no-arg constructor TibrvQueue()
* my app (or the system) creates a callback-method object, which must implement onMsg(TibrvListener,TibrvMsg). This is not a bean and definitely not a domain entity.
* my app creates myListener, calling its constructor with
** a callback-method object
** myPrivateQueue
** a subject
** other arguments

* my app calls queue.dispatch() or poll(). Unlike JMS, NO message is returned, not on this thread!
* Messages arrive and onMsg(..) runs in the dispatch thread, asynchronously, with
** first arg = myListener
** 2nd arg = the actual message
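
Putting the flow into code. This is a hedged sketch from memory of the TIBCO Rendezvous Java API; treat the exact signatures as approximate and check the vendor docs:

import com.tibco.tibrv.*;

public class RvReceiver implements TibrvMsgCallback {
    public void onMsg(TibrvListener listener, TibrvMsg msg) {
        // runs asynchronously on whatever thread dispatches myPrivateQueue
        System.out.println("received: " + msg);
    }

    public static void main(String[] args) throws TibrvException {
        Tibrv.open(Tibrv.IMPL_NATIVE);                // starts the event driver
        TibrvQueue myPrivateQueue = new TibrvQueue(); // the no-arg constructor
        TibrvTransport transport = new TibrvRvdTransport(null, null, null); // default rvd
        new TibrvListener(myPrivateQueue, new RvReceiver(), transport,
                "MY.SUBJECT", null);                  // last arg is the closure object
        new TibrvDispatcher(myPrivateQueue);          // dedicated dispatcher thread
        // alternative: loop calling myPrivateQueue.dispatch() on the current thread
    }
}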

————————
Component – Description

Event Object – Represents program interest in a set of events, and the occurrence of a matching event. See Events on page 81.

Event Driver – The event driver recognizes the occurrence of events, and places them in the appropriate event queues for dispatch. Rendezvous software starts the event driver as part of its process initialization sequence (the open call). See Event Driver on page 83. No Java API for it, I believe.

Event Queue – A program (“my app”) creates (private) event queues to hold event (message) objects in order until the program (“my app”) can process them.

Event Dispatch Call – A Rendezvous function call that removes an event from an event queue or queue group, and runs the appropriate callback function to process the event. I think my app calls this in my app’s thread.

Callback Function – A program (“my app”) defines callback functions to process events asynchronously. See Callback Functions on page 89.

Dispatcher Thread – Programs (“my app”) usually dedicate one or more threads to the task of dispatching events. Callback functions run in these threads. class com.tibco.tibrv.TibrvDispatcher extends java.lang.Thread. The event-dispatcher thread is a common design pattern in java, c#…

treat IRS floating cashflow "stream" as a commodity

In FX, it’s extremely useful to think of the first currency as some kind of commodity like silver… Similarly, in IRS, it’s useful to treat the floating income stream as a commodity (like silver).

You as the IR “swap buyer” agree to pay a fixed price in fixed installments. In exchange you receive something of changing value — whose market value changes daily. Suppose we enter a vanilla IRS to receive floating. When we put down the phone with the counterparty, we have agreed to pay a fixed (e.g.) 3.4% price for a “silver” in the form of a stream of floating coupons. This is a bit similar to buying an annuity.

 
The 3.4% fixed rate is like an execution price on the exchange. Next hour (or day), the same silver could fetch more, so another buyer would execute the trade at a higher price, like 3.42%, so my existing position would have a positive PnL. This simplified view assumes a discount rate of 0 (or equivalently, a discount factor of 1.0).
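
A minimal worked example of that PnL view, assuming zero discounting and a made-up trade (USD 100M notional, 5 years remaining):

    \text{PnL} \approx (r_{\text{new}} - r_{\text{old}}) \times N \times T = (3.42\% - 3.40\%) \times \$100\text{M} \times 5 = \$100{,}000

This is only a first-order approximation; a real mark would discount each remaining cashflow.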

The changing market value of the “silver” we bought is tied to Libor. We “long” for Libor to rise — we are long Libor, as we are long silver.

On the other hand, if you are a dealer selling IRS, you are short Libor as you are short silver.

By comparison, if you write put/call contracts, you are short volatility. Intuitively, consider an OTM option — lower vol reduces the chance of the option finishing ITM, so you are more likely to pocket the premium. As a fire insurer, lower vol means a lower chance of fire disaster — good for you, the insurer.

I hope this helps beginners get a “feel” of the terminology in IRS.

tibrv supports no rollback – non-transactional transport

Non-transactional “transports” such as TIBCO Rendezvous Certified and socket do not allow for message rollback so delivery is not guaranteed.  Non-transactional transports can be problematic because the operation is committed to the transport immediately after a get or put occurs, rather than after you finish further processing and issue a Commit command.

JMS does support transactional “transport”, so you can “peek” at a message before issuing a Commit command to physically remove it from the queue.
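
A sketch of that JMS idiom using the standard javax.jms API (error handling trimmed; process() is a hypothetical business method):

import javax.jms.*;

void consumeTransactionally(ConnectionFactory factory, String queueName) throws JMSException {
    Connection conn = factory.createConnection();
    Session session = conn.createSession(true, Session.SESSION_TRANSACTED); // transacted
    MessageConsumer consumer = session.createConsumer(session.createQueue(queueName));
    conn.start();
    Message msg = consumer.receive(1000); // the "peek": message is not yet gone
    try {
        process(msg);
        session.commit();   // only now is the message physically removed
    } catch (Exception e) {
        session.rollback(); // the message goes back for redelivery
    }
}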

http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.ebd.eai.help.src/configuring_a_rendevous_reliable_session_and_transport.htm

Fwd: jms connection sharing

Reading [[java message service]] I realized that if a topic has 1000 concurrent client connections (a few thousand were achievable with year-2000 technology), we could have several thousand concurrent applications receiving a new message. JMS connection sharing dramatically increases throughput and bandwidth.

Many middleware products support jms connection sharing.
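
The sharing pattern itself is simple: one heavyweight, thread-safe Connection, and one Session per thread (Sessions are single-threaded by the JMS spec). The factory below is assumed to come from a JNDI lookup:

import javax.jms.*;

Connection shared = factory.createConnection(); // thread-safe, shareable, expensive
Session s1 = shared.createSession(false, Session.AUTO_ACKNOWLEDGE); // thread 1
Session s2 = shared.createSession(false, Session.AUTO_ACKNOWLEDGE); // thread 2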

vega roll-up makes no sense #my take

We know dv01, duration, delta (and probably gamma) … can roll up across positions as weighted average. I think theta too, but how about vega?

Specifically, suppose you have option positions on SPX at different strikes and maturities. Can we compute weighted average of vega? If we simulate a 100bps change in sigma_i (implied vol), from 20% pa to 21% pa, can we estimate net change to portfolio MV?

I doubt it. I feel a 100 bps change in the 1-month ATM implied vol will not happen in tandem with a 100 bps change across the entire vol surface.

– Along the time dimension, the long-tenor options will have much __lower__ vol changes.
– Along the strikes, the snapshot vol smile curve already exhibits a significant skew. It’s unrealistic to imagine a uniform 100 bps shift of the entire smile (though many computer systems still simulate such a parallel shift).

Therefore, we can’t simulate a 100 bps bump to sigma_i across a portfolio of options and compute a portfolio MV change. Therefore vega roll-up can’t be computed this way.

What CAN we do then? I guess we might bucket our positions by tenor and aggregate vega within each bucket, as sketched below. Imperfect, but slightly better.
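
A sketch of that bucketing; the record and bucket labels are illustrative:

import java.util.*;
import java.util.stream.*;

public class VegaRollup {
    record OptionPosition(String tenorBucket, double vega) {}

    // sum vega within each tenor bucket; never across buckets
    static Map<String, Double> vegaByBucket(List<OptionPosition> positions) {
        return positions.stream().collect(
                Collectors.groupingBy(OptionPosition::tenorBucket,
                        Collectors.summingDouble(OptionPosition::vega)));
    }
}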

Nomura FX option + java IV

Q: Does your system cover tenor based risk?

Q: Between Equity vol and FX vol systems, do you monitor the same greeks?
%%A: in eq, it’s about delta gamma vega and theta. In FX, I believe these are all important. But I guess rho is probably more important than in eq, since FX is very sensitive to interest rate and there are 2 interest rates in each currency pair.
A: delta/gamma/vega/theta are the big 4. rho is insignificant in both eq and FX. See http://bigblog.tanbin.com/2011/06/rho-of-vanilla-fx-option.html

Q: You mentioned that you build thousands of Libor yield curves each day? How do you do that?
%%A: using deposit rates, futures rates, swap rates, and year-end turns

Q: You mentioned EOD marking that feeds into downstream PnL attribution. How do you do that?

What Unit test tools did you use?

What’s a good unit test?

In java how do you compare 2 strings?

Why is java string designed to be immutable ?
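
Quick answers to the two string questions:

String a = "hello";
String b = new String("hello");
System.out.println(a == b);         // false: == compares references
System.out.println(a.equals(b));    // true: equals() compares contents
System.out.println(a.compareTo(b)); // 0: lexicographic order, for sorting

On immutability: an immutable String can cache its hashCode, can be shared freely across threads, can be interned in the string pool, and is safe as a HashMap key or in security-sensitive APIs; none of that would hold if strings were mutable.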

oracle tablespace striping

If an audit table or a trade capture table gets a lot of concurrent writes, you may want to stripe it across disks to increase parallelism and write performance. One way is to configure the table with multiple tablespaces, according to a friend. Each tablespace maps to a physical disk.

The same striping technique also speeds up reads.

See http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:7116328455352

convertible bond – more like a call than cash stock or corp bond

Now I know that a vanilla convertible bond as an asset is more like a call [1] than a stock or a corporate bond. Some people say a CB is more like a cash stock than a bond, but I don’t think so. CBs are traded by the vol desk, not the cash equity desk or the fixed income desk.

[1] a call option (on the stock). I don’t think there is a put-option-like convertible bond

A CB has vega, a maturity, theta, delta.

In a typical contract, the indenture specifies a strike in the form of a conversion price or conversion ratio. For example, a conversion ratio of 45:1 means one bond (with a $1,000 par value) can be exchanged for 45 shares of stock — a conversion price of $1,000/45 ≈ $22.22 per share.

show a swing GUI at unblocked location

By default, your swing app pops up at the top-left corner, right behind your IDE – annoying. To make it show up somewhere unblocked, I often do

final Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();
// 1.1 * width lands just beyond the right edge of the primary monitor
frame.setLocation((int) (screenSize.width * 1.1), screenSize.height / 2);

This assumes I have another monitor on my right, and puts the Swing window an inch off my main monitor.

SCB threading design IV: caching remote DB

This could be a good story to tell during interviews.

The MOM design is still unfinished but I think a practitioner could see we are on the right track.

Q: A remote position data source (a DB or some opaque data source) is super slow. It takes up to 500ms to return a position object when queried. The read(int key) method is provided by the DB’s client-side API and is thread safe, so multiple threads can call read() concurrently. Now, we need some kind of caching to minimize the 500ms latency, but we want a lazy cache, partly to avoid overloading this busy DB — We don’t want to load all 9 million positions when we are interested in only a few positions. Our cache shall provide an API like a Cache class with a method

“public Position lookup(int key) const”

If a local lookup thread is blocked due to a cache miss (resulting in a DB read), it should not block other unrelated lookup threads. Also, if 2 threads happen to call lookup(123) in quick succession and it’s a cache miss, the 2nd thread should block and get the result as soon as the first thread receives the position and populates the cache. Perhaps some kind of notification.
———– Analysis ———–
Similar to my B2bTradingEngine. I feel we can address this either with low-level tools or with high-level tools. Perhaps the interviewer is interested in the low-level know-how, or perhaps in the higher-level design techniques and principles. I feel it’s more challenging to assume high volume and slow connections, which is more relevant to the growing big-data economy. Perhaps a distributed infrastructure.

It’s generally easier to scale down an over-engineered design than to scale-up a simplistic design.
———– MOM based idea ———-
(not yet a “design”). If the request volume is too high, then we run the risk of creating thousands of blocked threads. MOM to the rescue. I will refer to the MOM-based server process as the “engine”.

The client thread would have to be “decoupled” from the engine thread, otherwise 5000 concurrent requests would map to 5000 threads in the MOM server. To decouple, the client thread (one of 5000 or 55000) could block as a synchronous MOM consumer, or the client thread could register an onMsg() callback.

On the MOM server, we check the request against the cache and return the Position if hit — synchronously. Position data goes out on a response “topic”, so dozens of clients could filter the responses to look for their desired position. What if we have a cache miss? The request goes into another queue (cacheMissQueue). Add the key to a task queue (producer-consumer). The consumer thread pool could have 1 or more worker threads. Assuming the read() method is blocking, the worker thread must block. Upon failure, we update the cache with a special record for key 123. Upon success, we update the global cache. Either way we publish to the response topic.

Note: if a request for key 123 is already being processed or already in the task queue, then we won’t add it again. The task queue should “see” each key only once.
——–Low level design ——-
Now let’s assume this is a low-level single-process threading challenge, without MOM. The lookup() method could be called on 50 threads simultaneously. If 2 threads both request key 123 (cache miss), they must both block. Each reader thread should create a lock tagged with the key. During the read(), the lock should not be held. Upon a successful read, the reader thread should lock, update the cache, notify, unlock, then return the Position object. If in the interim a 2nd thread (a want2read thread) also has a cache miss on key 123, it would look for the lock tagged with 123, acquire it, then wait on it. Upon wake-up, it checks the cache. If the key is found, it exits the loop and returns the Position object.

If read() times out or fails, the reader thread would fake a special Position object with an error msg, and do the same as above.

%%A: We need a thread pool to consume the queue. The pool threads will execute processOneReadTask(), “complete” the Position object in cache and notify on the position id

#include <map>
#include <set>
#include <deque>
#include <memory>
#include <mutex>
#include <condition_variable>

struct Position { /* fields omitted */ };
struct DB { std::shared_ptr<Position> read(int key); }; // slow (up to 500ms), thread safe

class Cache{
private:
   std::map<int, std::shared_ptr<Position> > map;
   std::set<int> everEnqueued; // so each key enters the task queue only once
   std::deque<int> taskQueue;  // pending cache-miss keys (producer-consumer)
   std::mutex mutex;
   std::condition_variable cond;
   DB * db;

   // caller must hold mutex
   void submitReadTask(int key){
         if (everEnqueued.insert(key).second) taskQueue.push_back(key); // enqueue only if new
   }
public:
   explicit Cache(DB * remoteDb) : db(remoteDb) {}

   //callable by clients
   std::shared_ptr<Position> lookup(int key){
       std::unique_lock<std::mutex> lock(mutex); //may block, briefly
       auto found = map.find(key);
       if (found != map.end()) return found->second;
       submitReadTask(key);
       // the predicate guards against spurious wake-ups
       cond.wait(lock, [&]{ return map.count(key) > 0; });
       return map[key];
   }

   // executed by pool threads, in a loop
   void processOneReadTask(){
       int key;
       {
           std::lock_guard<std::mutex> lock(mutex);
           if (taskQueue.empty()) return; // empty queue
           key = taskQueue.front();
           taskQueue.pop_front();
       } // mutex NOT held during the slow read
       std::shared_ptr<Position> readOut = db->read(key); //slow
       // upon failure/timeout, readOut would be a special error Position instead
       {
           std::lock_guard<std::mutex> lock(mutex);
           map[key] = readOut;
       }
       cond.notify_all();
    }
}; //end of class

How about a listener solution instead of wait/notify? If we have 500 threads requesting position 123, they all block in wait() — memory-intensive. The listener solution means each requester packages its local state into a state object and saves it into a global collection; a few threads would then be enough. For example, a single thread could wake up upon notification and sequentially complete the tasks of all 500 requests.
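
If I were to sketch that listener idea with modern java.util.concurrent (an assumption; the original design predates these classes), a per-key CompletableFuture gives the once-only read and the callback registration for free:

import java.util.concurrent.*;

class Position { /* fields omitted */ }

interface SlowDB { Position read(int key); } // the blocking 500ms call, thread safe

class ListenerCache {
    private final ConcurrentHashMap<Integer, CompletableFuture<Position>> cache =
            new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final SlowDB db;

    ListenerCache(SlowDB db) { this.db = db; }

    CompletableFuture<Position> lookup(int key) {
        // computeIfAbsent guarantees the slow read is submitted exactly once per key
        return cache.computeIfAbsent(key,
                k -> CompletableFuture.supplyAsync(() -> db.read(k), pool));
    }
}
// a caller registers a callback instead of blocking:
// cache.lookup(123).thenAccept(pos -> { /* use pos */ });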

coin denomination-design problem

Q: You are given an integer N and an integer M. You are supposed to write a method void findBestCoinsThatMinimizeAverage(int N, int M) that prints the best design of N coin denominations that minimize the AVERAGE number of coins needed to represent values from 1 to M. So, if M = 100, and N = 4, then if we use the set {1, 5, 10, 25} to generate each value from 1 to 100, so that for each value the number of coins are minimized, i.e. 1 = 1 (1 coin), 2 = 1 + 1 (2 coins),…, 6 = 1 + 5 (2 coins), …, 24 = 10 + 10 + 1 + 1 + 1 + 1 (6 coins), and we take the average of these coins, we would see that the average comes out to ~5.7. But if we instead use {1, 5, 18, 25}, the average would come out to be 3.7. We are to find that set of N coins, and print them, that produce the minimum average.

====analysis

I feel this is more of a math (dynamic programming) puzzle than an algorithm puzzle. I feel if we can figure out how to optimize for N=4, M=100, then we get a clue. In most currencies, there’s a 50c coin, a 10c and a 5c. Now, 1c is absolutely necessary (to make the value 1), so it’s not in question. Let’s start with 50/10/5/1 and see how to improve on it.

First, we need a simple function

    Map<Integer, Integer> findCombo(int target, Set<Integer> coinSet);

For example, findCombo(24, {1,5,10,25}) == {10→2, 1→4}. Actually findCombo is itself a tough comp-science problem (the coin-change problem), but here we assume there’s a simple solution.

Now, I will keep the same coinSet and call findCombo 100 times with target = 1,…100. We will then blindly aggregate all the maps. We need to minimize the total coins. (That total/100 would be the average we want to minimize.)

Now, the Impossibly Bad coinset would be just a single denomination of {1c}, violating the rule of N distinct denominations. Nevertheless, this coinset would give a total count of 1+2+3+…+100 = 5050. Let’s assume the cost of manufacturing each big or small coin is identical, so that 5050 translates to the Impossibly Bad (IB) cost of $5050. Each legal coinset would give a saving off the IB cost level of $5050. We want to maximize that saving.

If we get to use a 25c coin once among the 100 trials, we save $24; if we get to use a 10c once, we save $9. If we use a poor coin set like {1c, 2c, 3c, 4c}, then the saving can only be $1, $2 or $3 each time.
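
Here’s a sketch of the evaluation function: for one candidate coin set, the classic min-coin DP computes the average the problem asks us to minimize (this plays the role of findCombo’s coin count):

static double averageCoins(int[] coins, int M) {
    int INF = Integer.MAX_VALUE / 2;
    int[] dp = new int[M + 1]; // dp[v] = min #coins to make value v
    java.util.Arrays.fill(dp, INF);
    dp[0] = 0;
    for (int v = 1; v <= M; v++)
        for (int c : coins)
            if (c <= v) dp[v] = Math.min(dp[v], dp[v - c] + 1);
    long total = 0;
    for (int v = 1; v <= M; v++) total += dp[v];
    return (double) total / M; // the post quotes ~5.7 for {1,5,10,25}, M=100
}
// the outer search then enumerates candidate sets (all containing 1c)
// and keeps the set with the smallest average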

priorityQ^RBtree, part 1

No expert here… Just a few pointers.

I feel the binary heap is designed for a subset of the “always-sorted” requirement on sorted trees. The subset is “yield the largest item on demand”.

Therefore, a binary heap is simpler (and faster) than a red-black tree.

As solutions, sorted data structures (like the RB tree) are more in demand than priority queues (like the heap). For example, the classic exchange order book is more likely a sorted structure, though it can be a priority queue.

A binary heap hides all but the maximum node, so we don’t know how those nodes are physically stored.

A binary heap is a binary tree but not sorted, not a BST.
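
A quick Java illustration of the contrast:

import java.util.*;

public class HeapVsTree {
    public static void main(String[] args) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());
        heap.addAll(List.of(3, 1, 4, 1, 5));
        System.out.println(heap.peek());       // 5: the max on demand; the rest stay hidden

        TreeMap<Integer, String> book = new TreeMap<>(); // red-black tree, always sorted
        book.put(99, "bid"); book.put(101, "bid");
        System.out.println(book.lastKey());    // 101: best price level
        System.out.println(book.headMap(101)); // {99=bid}: a range query a heap can't answer cheaply
    }
}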

Re: essential tools of finding antiderivative

Substitution of variables during integration is related to the differentiation chain rule.

In general, integrating an unfamiliar expression of x requires that we tinker with it until it looks like one we recognize as an antiderivative. This skill comes with experience. Therefore we need to recognize the “output” function form of the chain rule, and also of the product rule.

Undoing differentiation by “guess and check” methods
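
A standard example: substitution undoes the chain rule once we recognize g'(x) sitting next to f'(g(x)).

    \int 2x \cos(x^2)\,dx \;=\; \int \cos u \,du \;=\; \sin u + C \;=\; \sin(x^2) + C, \qquad u = x^2,\ du = 2x\,dx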

basic oracle instance tuning tips #Mithun

You can turn on tracing for a given session. All queries will be traced. You can also trace across all sessions.

Oracle provides dynamic performance stats in the form of so-called V$ views, but I think the trace files are more detailed.

Another common technique is to record timestamps

– client log – issuing query
– server log – receiving query
– server log – returning data
– client log – receiving data
– client log – sending out data to requester, which could be in the same or another process.

Latency is additive. Find the largest component.

coherence 3 essential config files

Most of the effort of deploying/using coherence is configuration. There are only 3 important xml config files, according to a colleague.

* client config
* cache config — server config
* proxy config

proxy = a jvm sitting between client and server. A load balancer: it knows the relative load of each node. Without a proxy, a heavy-hitting client can overload one of the two cache nodes.

stop-the-world inevitable: either minor or major JGC

Across all of Sun’s GC engines so far (2012), the young generation (eden + survivors) algorithm has _always_ been STW. The algo changed from single-threaded to parallel, but remains STW. Therefore, during a minor GC ALL application threads are suspended. Usually a short pause.

Across all of Sun’s GC engines so far (2012), oldgen is at best low-pause but _never_ no-pause. For the oldgen, there is _always_ some pause, due to an inevitable STW phase.

Note one of the oldgen algorithms (CMS) is mostly-concurrent, meaning the (non-concurrent) STW phase is brief — low-pause. However, the young gen algorithm has always been STW throughout, without any concurrent phase.

2 notations of differentiation

Given a function f(x), the derivative function can be written as either f'(x) or df/dx.

The prime notation (Lagrange style, though often loosely attributed to Newton) signifies that

1) f'(x) is a function of x
2) f' is derived from f

The quotient notation (Leibniz style) highlights

a) its origin as a quotient
b) the variable x with respect to which the derivative is taken.

I like to treat the derivative as another variable, written as f' or as j or any letter you pick. At each point on the x-axis, there's a unique value of f (being a function of x), and there's also a value of j. Therefore j depends on x and is by definition a function of x.

This value of j isn't really a quotient, but rather the quotient Δf/Δx pushed to the limit. (a) is therefore misleading.

(b) is extremely useful when dealing with multiple variables, the chain rule, the inverse function or substitution of variables
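
The two notations side by side; the Leibniz form makes the chain rule transparent:

    f'(x) \;\equiv\; \frac{df}{dx} \;=\; \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}, \qquad \text{chain rule: } \frac{df}{dx} = \frac{df}{du} \cdot \frac{du}{dx}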

[12] y implement state transition by static methods#OMS

(This discussion is relevant to any high-concurrency finite state machine or FSM, but I will use an order management system as example.)

Suppose my FSM implements state transitions in the form of instance methods defined on the Order object, myOrder->A_to_B(Event). Inside such a method, we verify this->currentState == A, validate other Conditions, and take the appropriate Action to process the Event, before updating this->currentState = B.

(The Transition/Event/Condition/Action/Validation concepts are fundamental to FSM in general. See http://www.thetibcoblog.com/2007/06/26/differences-between-a-bre-and-a-rule-driven-cep-engine-part-1)

I once proposed that A_to_B could convert to a static (java/c#/c++) or (c++) global, free-standing function A_to_B(Event, Order). I once said that this “might” be a bit easier to multi-thread. Now I feel in most cases such a simple change (from non-static to static) won’t improve concurrency. Such a change may have bigger benefits (to be elaborated) but concurrency is rarely one of them. However, here’s one possible “rare” context.

Following Functional Programming principle (as seen in FMD), each and every object is immutable, so is Order. Therefore the global function will return a new Order instance but representing the same real world order. Immutability removes the need to synchronize, completely.

Q: But what if 2 threads need to access a given Order?
A: If one of them is a reader, then immutability will probably help.
A: If both are writers, then my counter question is why this scenario is allowed to exist —

I feel in low latency OMS each order should ideally be updated by the same physical thread throughout. Multi-queue — each security/cusip belongs exclusively to a single thread, never hopping from thread X to thread Y — Rule 1.

I feel Rule 1 is an extremely practical rule but in some contexts it is over-restrictive and needs to be relaxed, but still at any moment an order is only updated on a single thread, never concurrently on 2 threads — Rule 1b.

However, I feel all good software rules have exceptions (as explained elsewhere in my blog) so sometimes it’s legitimate to break Rule 1b, though I won’t condone such reckless practice. Well I’m not boss…:)

In such a system, I take the liberty to assume every writer thread must put the updated Order into a shared cache. 2 writers using the instance method A_to_B() … must synchronize on the cache (usually a hash map). The FP design could use CAS instead. Since the chance of a concurrent write is low, a CAS retry is unlikely, so CAS will be faster. Uncontended synchronization is always slower than uncontended CAS — see Brian Goetz’s article at http://www.ibm.com/developerworks/java/library/j-jtp04186/index.html.

However, immutability entails more object creation instead of updating mutable fields. Trade-off to be measured.
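
To make the “rare context” concrete, here’s a hedged Java sketch: an immutable Order, a free-standing transition function, and a CAS-style publish into the shared cache. All names are illustrative:

import java.util.concurrent.ConcurrentHashMap;

record Order(String id, String state) {} // immutable

final class Transitions {
    static Order aToB(Order o /*, Event e */) {
        if (!"A".equals(o.state())) throw new IllegalStateException(o.state());
        return new Order(o.id(), "B"); // new instance; the old Order is never mutated
    }
}

final class OrderCache {
    private final ConcurrentHashMap<String, Order> cache = new ConcurrentHashMap<>();

    boolean tryTransition(String id) {
        Order before = cache.get(id);
        Order after = Transitions.aToB(before);
        // CAS: succeeds only if no other writer replaced 'before' in the interim
        return cache.replace(id, before, after); // caller retries on false
    }
}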

story-telling for behavior interviews

As stated in https://bintanvictor.wordpress.com/2010/08/13/vague-vs-normal-vs-specific-answers-in-nontech-interviews/, during non-tech interviews, stories are a good way to organize your thoughts, and make yourself memorable.

Gayle McDowell pointed out that your story needs to

  • explain why it was done that way
  • reflect on you, not your team
  • be understandable and substantial
  • also, be prepared to say how you would do it differently

–To prove a _team_player_ —

  • These aren’t stories but … voted most helpful colleague in Zed;
  • knowledge sharing;
  • hands-on guidance for freshers. In the ICE team, all four new hires come to me for help.
  • one of the most gregarious guys on the floor, making friends across department boundaries.

–To prove “help those in need” —

rated substantially-exceed by all the freshers I helped, mostly Indian freshers; never turned away a help seeker

–To prove constructiveness in conflict —

presenting alternative designs to senior managers

–To prove knowledge sharing —

–To prove “can work with difficult colleagues” —

Chih Hao who likes to criticize my code ..

–To prove under-pressure — biggest release of 2009

–To prove personal sacrifice —

ln(S/K) should be compared to what yardstick@@

(update — I feel depth of OTM/ITM is defined in terms of ln(S/K) “battling” σ√t )

Q: if you see a spot/strike ratio of 10, how deep OTM is this put? What yardstick should I use to benchmark this ratio? Yes there is indeed a yardstick.

In bond valuation, the yardstick is yield, which takes into account coupon rate, PV-discounting of each coupon, credit quality and even OAS. In volatility trading, the yardstick has to take into account sigma and time-to-maturity. In my simplified BS (http://bigblog.tanbin.com/2011/06/my-simplified-form-of-bs.html), there’s constant battle between 2 entities (more obvious if you assume risk-free rate r=0)

     ln(S/K) relative to σ√t         …………….. (1)

Fundamentally, in the BS model, ln(S/K) at any time into the diffusion has a normal distribution whose
stdev = σ√t, i.e. the quoted annualized vol scaled up for t (2.5 years in our example)

Note the diffusion starts at the last realized stock price.
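
In symbols, with r = 0 (and ignoring the small −σ²t/2 drift for this purpose):

    \ln(S_t/S_0) \sim \mathcal{N}\!\left(-\tfrac{1}{2}\sigma^2 t,\; \sigma^2 t\right), \qquad z = \frac{\ln(S/K)}{\sigma\sqrt{t}}

so the yardstick z measures moneyness in standard deviations of the terminal log price relative.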

Q: Why is σ a variable while t and r are not?
σ is the implied vol.
σ is the anticipated vol over the remaining life of the option. If I anticipate 20%, I can put it in and get a valuation. Tomorrow, if I change my opinion and anticipate a much larger vol over the remaining life (say, 2 years) of the option, I can change this input and get a much larger valuation.

The risk free rate r has a small effect on popular, liquid options, and doesn’t fluctuate much

As to the t, it is actually ingrained in my 2 entities in (1), since my sigma is scaled up for t.

how important is automated pricing engine@@ ask veterans

I proposed/hypothesized with a few pricing engine veteran friends that the big make-or-break trades and successful traders don’t rely so much on pricing engines, because they have an idea of the price they want to sell or buy. However, my friends pointed out a few cases when a pricing engine can be important.

– Risk management on a single structured/exotic position can involve a lot of simulations.
– If you have too many live positions, pricing them for pre-trade or risk purposes can be too time-consuming and error-prone.
– If you have many small RFQs, trader may want to focus elsewhere and let the system answer them.

How about those make-or-break big trades? The volume of such trades is by definition small. Decision makers would take the pricing engine output as reference only. They would never let a robot make such big decisions.

var-swap daily mark-to-market

Imagine we buy a week-long var swap with variance notional $400K. Since there are 4 price relatives in a week, and hence 4 daily realized variance (DRVar) numbers, we divide the notional into 4 equal slices. Each day we record the DRVar and compute our “spread” over the strike level (say 30% annualized vol, i.e. K = 0.09), before computing our gain or loss realized on that day.

If today’s DRVol annualized is -64.8029%, our spread would be 0.648 * 0.648 – 0.09 = 0.33. We actually earn 0.33 * $100k on this day. It’s a realized gain. By the way, the minus sign is ignored!

If today’s DRVol annualized is 12.7507%, our spread would be 0.1275 * 0.1275 – 0.09 = -0.0737. We actually lose 0.0737 * $100k. This is a realized loss.
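
In general, each daily slice contributes (a restatement of the formula in my other var-swap post):

    \text{daily PnL}_i \;=\; \frac{N}{\#\text{slices}} \left( 252\,\ln^2\!\frac{S_i}{S_{i-1}} \;-\; K \right)

where 252 ln²(S_i/S_{i−1}) is the annualized DRVar and K is the variance strike (0.09 here).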

This is like a cabby earning a daily wage. Each day’s earning is realized by end of the day. Next day starts on a clean slate. He could also make a daily loss due to rent, fuel, parking, or servicing.

Now imagine 400 slices. It’s possible to have a lot of small losses in the early phase, then some big profit, perhaps due to a dividend announcement. A shrewd market observer may have some insight into these volatility swings — the volatility of volatility.

It’s wrong to project those early losses (or profits) onto the remaining “slices”. This important principle deserves a special illustration. Suppose I strike a deal to sell you all the eggs my hen produces — one per day for 100 days, at a total price of $100. If you initially get a lot of small eggs (losses), you may complain and want to cancel at a negotiated price. However, the initial price is actually a reasonable price, because market participants expect bigger eggs later. After 20 days, the 80 upcoming eggs now have a fair market price of $88. The first 20 eggs were perhaps priced from the onset at around $11.50–$12.70, though people couldn’t have predicted their exact value. Now if you do terminate, you get $88 + 20 eggs. Fair. Your 20 eggs’ value may exceed $12.

( Now imagine a long var swap position with 100 “slices”, 20 of them realized….. )   The eggs are an analogy for var swap daily mark-to-market. It’s similar to (slightly messier than) futures mark-to-market. Here “market” refers to the market expectation of the remaining 80 daily price relatives (all remaining eggs). Remember each of the 80 slices would result in a daily realized PnL contribution. The market participants have some idea about the SUM [2] of these 80 forthcoming DRVariances — 80 eggs.

[2] variance is additive; volatility isn’t.

##what bad things could happen to a thread in a pool

Do you ever wonder why threads in a thread pool can go bad and become unusable? Weblogic has long recognized this problem. It is hard to diagnose.

However, I don’t believe “voodoo” exists in a threading system. A stuck thread must be stuck in a useless loop or waiting for CPU, IO or memory. It won’t be stuck for “no reason”. Pretend to be a spy looking at the program counter inside any thread — why is it not moving? There are only a limited (not UNlimited) number of reasons. Here are various bad things that can happen to an innocent thread.

– in java, any uncaught exception can fall on a thread (like a rotten apple:). These are extremely common, so a thread in a pool should handle them /gracefully/ (see the wrapper sketch after this list).

– a thread may do useful work in normal conditions, but in an edge case fall into an endless loop that does no useful work — just wasting cpu
** Note endless loop is fine if the thread is doing useful work.

– divide by zero (perhaps terminating the thread)
– access wrong memory location and trigger a segfault. In C++ there are many pointer/memory-related errors.
– java thread can run out of memory too
– in java, a thread can be stopped due to the deprecated but supported stop() and suspend() methods. Such a thread might emerge in bad shape.
– starvation due to low thread priority.
– stack overflow, due to deep recursion (like a recursive equals() or hashCode()).

– thread could get stuck in a normal blocking operation like accept(), read(), write(), lock()/synchronized keyword, wait(), join(), long sleep. Many of these aren’t even interruptible.
** IO (disk or network) wait is notorious culprit. In this state, everything including those “kill -9” signals are ignored (see [[Unix Power Tools]]). Entire process may appear stuck in mud.
** my java thread book also says a thread blocked on a socket may not respond to interrupt.
– deadlock
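
Here’s the wrapper sketch promised above, so an uncaught exception can’t silently kill a pool thread:

static Runnable guarded(Runnable task) {
    return () -> {
        try {
            task.run();
        } catch (Throwable t) { // the "rotten apples" from the first bullet
            System.err.println("task died on " + Thread.currentThread().getName());
            t.printStackTrace(); // log it; the worker survives for the next task
        }
    };
}
// usage: pool.execute(guarded(myTask));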

##what bad things can crash JVM

(Why bother? These are arcane details seldom discussed under the spotlight, but practically important in most java/c++ integrations.)

Most JVM exits happen with some uncaught exception or explicit System.exit(). These are soft-landings — you always know what actually killed it.

In contrast, the hard-landing exits result in a hs_err_pid.log file, which gives cryptic clues to the cause of death. For example, this message in the hs_err file is a null pointer in JNI —

siginfo: ExceptionCode=0xc0000005, reading address 0x00000000

Note this hs_err file is produced by a fatal error handler (FEH). However, if you pull the power plug, the FEH may not have a chance to run, and you get what I call an “unmanaged exit”. Unmanaged exits are rare. I have yet to see one.

People often ask what bad things could cause a hard landing. P79 [[javaPerformance]] mentions that the FEH can fire due to

* fault in application JNI code
* fault in OS native code
* fault in JRE native code
* fault in the VM itself
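
For completeness, a deliberate hard landing is easy to reproduce when testing the FEH. This assumes a HotSpot JVM where sun.misc.Unsafe is reachable via reflection; for testing only, never ship this:

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class CrashMe {
    public static void main(String[] args) throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);
        unsafe.getByte(0); // read address 0 -> SIGSEGV -> hs_err_pid<pid>.log
    }
}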

caching thousands of java string literals(hardcoded)

Q: Suppose you have lots of java strings (typically up to 100 characters) in your JVM. Some are string literals, some are dynamic inputs from web, database/file or by messaging. You know many of the strings are recurring, such as column headers or individual English words from a file. You could use constant variables to represent column header names, but now we have too many (thousands of) such constant variables — impractical.
A: My basic solution is a cache in the form of a hashset (internally a hashtable):

    static String lookup(String input);

If the input is found in the hashtable then we reuse it and avoid creating duplicate objects. This method is best with string-literal inputs. Java automatically interns these literals, so there are no redundant copies of the literal string object even if lookup(“Column1”) appears in 200 classes.
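
A minimal sketch of that lookup() on ConcurrentHashMap (my choice of collection; the post doesn’t name one):

import java.util.concurrent.ConcurrentHashMap;

public final class StringCache {
    private static final ConcurrentHashMap<String, String> POOL = new ConcurrentHashMap<>();

    static String lookup(String input) {
        String prior = POOL.putIfAbsent(input, input); // atomic: first caller wins
        return prior != null ? prior : input;          // always return the canonical copy
    }
}

String.intern() achieves something similar using the JVM’s own pool, which partly answers the parenthetical question below.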

Issue: indiscriminate usage — a colleague pointed out that if lookup() is public, then other developers can abuse it and pass in strings that never re-occur. Those strings just take up permanent memory for no benefit. One simple measure is another argument to remind developers —

    lookup(String input, boolean isRecurring);

Issue: large string — If we get an 800MB string we need to make a decision. If it’s reused often, then we should cache it somewhere. If it’s used only twice, then maybe recreate it each time. A simplistic solution is to add a length check in lookup(), and rename it to lookup1KB(). In the places where we know we may get 800MB strings, we use an alternative lookupSpecial() method.

Issue: large memory footprint — even if we check string lengths in lookup1KB(), we can still get 9,000,000 entries. Most of these are due to the above-mentioned indiscriminate usage. We could add a hashtable size control, but I feel this tends to add latency, so it’s not ideal for real time. My colleague pointed out that LinkedHashMap.java supports LRU eviction.

(How does the jvm string pool help???)

Q: why not use a bunch of string constants?
A: Even if we only have 200 of these literals, using that many constants can be inconvenient.
* lookup() shows you the exact spelling with spaces and cases. To convert these many literals to constants, you need to hand-craft a lot of variable names.
* what if the literals change? You would need to rename those variables.
* you may want to decouple the constant’s name vs the content. That can hurt readability, assuming I prefer to see the literals in source code.
* If in Class1 I already defined a constant SOME_LONG_STRING, and in Class2 I see “some long string”, I would need to check whether it’s already a constant.

2nd differential is the highest differential we usually need

When I first encountered the concept of the 2nd derivative, I thought maybe people would be equally interested in the 3rd or 4th derivative. Now I feel outside physics (and math itself), folks mostly use the first and 2nd derivatives. In classical physics, the 2nd derivative is useful — acceleration. Higher derivatives are less used.

Note on notation. f” is (inherently) a function in the black-box sense that for each input value, there’s an output value. This function derives from the original function f. We write f”(x) in that context. However, f” can be usefully treated as an independent variable just like x, y and t, so we write f” without the (x). In this context, we aren’t concerned about how f” depends on x. That dependency might be instrumental in our domain, but at least for the time being we endeavor to ignore it and concentrate on how f” as an independent variable affects things “downstream”.

Graphically, 2nd derivative describes concavity or convexity —
^ When f” is positive, the curve is concave upwards, regardless of whether f(x) is positive or negative, whether f(x) is rising or falling, and whether f'(x) is positive or negative. However, f' is rising for sure.
^ When f” is Negative, the curve is concave Downwards, regardless.
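
A concrete pair of examples:

    f(x) = x^2 \;\Rightarrow\; f'' = 2 > 0 \;\text{(concave up, } f' \text{ rising)}, \qquad f(x) = -x^2 \;\Rightarrow\; f'' = -2 < 0 \;\text{(concave down)}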

This observation is relevant to portfolio gamma. Take a short put for example. This position’s delta is always Positive but Falling with S, towards 0. The PnL graph is concave downward, so this gamma is always Negative. (See https://www.thinkorswim.com/tos/displayPage.tos?webpage=lessonGreeks) It’s important to clarify a few points and assumptions implicit in the above context —

* This PnL graph is purely theoretical. Underlier (S) has just one price of $88 right now, and it won’t become $1 even though the graph includes that price on the S axis.
* PnL graph is about the Current valuation (and PnL) but with Imaginary S prices. It shows “what-if” underlier price S becomes $1 in the next moment — that’s the meaning of the $1 on the x-axis.
** However, the most useful part of the PnL curve is the region around the current S — $88. This region reveals our sensitivity to underlier moves. It shows how much our short put valuation (and PnL) would gain or suffer When (not “if”) underlier moves a tiny bit in the next moment.

* The delta curve is purely theoretical. At the current S = $88, our delta is, say 0.51 or 0.47 or whatever. It won’t suddenly become 0.01 even though you may see this delta value at a high S. That 0.01 delta means “if S becomes so high tomorrow, our delta would be 0.01”

* there’s no “evolution over time” depicted in any of these graphs. Time is not an axis. These curves are pure mathematical functions describing “what if S is at this level”. In this sense the delta curve is very similar to the price/yield curve. Even the standard yield curve and forward curve are similarly Unrelated to so-called time-series graphs.

If you are confused about “on the far right put is OTM, but on a smile curve OTM puts are on the far Left”, read my other blog posts about OTM put.

var-swap PnL: %%worked example

A variance swap lets you bet on “realized” variance. The exchange automatically calculates realized variance for each day, so if you bet that realized variance over the next 3 days will average above 0.64 [1], then you can buy this contract. If it turns out to be 0.7812, you earn the difference of 0.1412 times the notional, which would mean $141,200 on a million-dollar notional.

[1] which means 80% vol (annualized), or roughly 5% daily realized vol (un-annualized)

Standard var swap PnL is defined as

    PnL = (σ_r² − K) × N ………… (1)

where
  N denotes the notional amount, like $1,000,000
  K denotes the strike, which is always in terms of annualized variance
  σ_r is the annualized realized vol over the n days, actually over the n−1 price relatives

σ_r² is the annualized realized variance, calculated as

    σ_r² = 252/(n−1) × [ ln²(S2/S1) + ln²(S3/S2) + … + ln²(Sn/Sn−1) ]

where
  S2 denotes the Day 2 closing price
  ln²(S2/S1) is known as the daily realized variance (DRVar), un-annualized
  ln(S2/S1) is known as the daily realized vol, or DRVol, un-annualized

In other words, take the n−1 values of ln(PriceRelative) and find the stdev assuming 0 mean, then annualize.

A more intuitive interpretation — take the average of the n-1 daily realized variances, then multiply by 252.

Now, traders often work with DRVol rather than the S2 stuff above, so there’s an equivalent PnL formula that reveals the contribution of “today’s” DRVol to a given var swap position, and also tracks the cumulative contribution of each day’s DRVol. Formula (1) becomes PnL ==

    252N/(n−1) × [ (ln²(S2/S1) − K/252) + (ln²(S3/S2) − K/252) + … + (ln²(Sn/Sn−1) − K/252) ], or
    N/(n−1) × [ (252 ln²(S2/S1) − K) + (252 ln²(S3/S2) − K) + … + (252 ln²(Sn/Sn−1) − K) ]

where
  N/(n−1) represents the notional amount allocated to each day
  252 ln²(S2/S1) represents the annualized daily realized variance on Day 2
  √252 × ln(S2/S1) represents the annualized DRVol (kept out of the formula to avoid clutter)

In other words, for each day take the “spread” of (annualized) DRVar over the strike K, multiply it by the daily notional, and you get a daily PnL “contribution”. Add up the dailies to get the total PnL. Here’s an example with daily notional = $4,166,666 and K = 0.09, i.e. 30% vol:

closing | PR       | ln PR    | √252 × ln PR | spread over K | daily PnL contribution
$1,200  |          |          |              |               |
$1,250  | 1.041667 | 0.040822 |  64.8029%    |  0.32994168   |  $1,374,757
$1,240  | 0.992    | -0.00803 | -12.7507%    | -0.073742023  | -$307,258
$1,275  | 1.028226 | 0.027835 |  44.1864%    |  0.105243561  |  $438,515
$1,200  | 0.941176 | -0.06062 | -96.2386%    |  0.836186882  |  $3,484,111

You can then add up the daily contributions, which would give the same total PnL as formula (1).
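
For anyone who wants to verify a row, here’s a hedged Java fragment reproducing the $1,250 day (the variable names are mine):

double K = 0.09, dailyNotional = 4_166_666;
double pr = 1250.0 / 1200.0;          // price relative
double lnPr = Math.log(pr);           // 0.040822
double drVol = Math.sqrt(252) * lnPr; // 0.648029, i.e. 64.8029% annualized
double spread = drVol * drVol - K;    // 0.32994...
double pnl = spread * dailyNotional;  // about $1,374,757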