tibRV daemon — briefly

Rendezvous daemon (rvd) processes service all network communications for Rendezvous programs (“my apps”). Every Rendezvous program (in any supported language) uses a Rendezvous daemon, usually on local host, but possibly in another machine.

* When a program sends a message, rvd transmits it across the network.
* When a program listens to a subject, rvd receives messages from the network, and presents messages with matching subject names to the listening program.

(I think “program” means my app program that I write using RV api.)

Within Rendezvous apps, each network transport object represents a connection (“session??”) to an rvd process. Because rvd is an essential link in Rendezvous communications, if the transport cannot connect to an rvd process that is already running, it automatically starts an rvd daemon.

RVD is like a gateway. RVD sits between the network and “my app”.

http://www.cs.cmu.edu/~priya/WFoMT2002/Pang-Maheshwari.pdf says —
An RV sender app (Java or C++) passes the message and destination topic (subject) to RVD. RVD then broadcasts this message using User Datagram Protocol (UDP) to the entire network. All subscribing computers with RVDs on the network will receive this message. Each RVD filters the messages, so non-subscribers are not notified.


tibrv daemon ^ agent

Rendezvous Java apps can connect to the network in either of two ways:
* An rvd transport (session?) connects to a Rendezvous daemon process (rvd).
* An rva transport (session?) connects to a Rendezvous agent process (rva), which in turn connects to a Rendezvous daemon process.

For connections to the local network, a direct connection to rvd is more efficient than an indirect connection through rva to rvd. In all situations we recommend rvd transports (read “connections”) in preference to rva transports, except for applets connecting to a remote home network, which must use rva transports.

tibRV event object@@

C# “event” also has a dual-meaning…

In RV, an Event object can represent
1) a listener (or program’s interest in events) and also
2) an “event occurrence” (but NOT a TibrvMsg instance). I think we need to get comfortable with this unusual design.

As described in the “flow” blogpost, probably the main event object we use is the listener object. In fact, Chapter 5 of the java manual seems to suggest that
* TibrvListener is (probably the most important) subclass of TibrvEvent
* TibrvTimer is (probably the 2nd most important) subclass of TibrvEvent
* For beginners, you can think of TibrvEvent.java as an abstract base class of the 2

return mem cells to freelist, then fill your pointer’s 32 bits with 0

Suppose a 32-bit pointer p2 contains address 0x1234, and 22==sizeof(Derived)

******* q[Base *p2 = new Derived()] implements —
– grab 22 bytes from freelist
– initialize by ctor
– return 0x1234 i.e. starting address of the 22 bytes
–> from now on, “system” won’t give those 22 bytes to other malloc requests, since those are ALLOCATED to us.

****** q(delete p2) performs
– call dtor, possibly virtual
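– determine size of the (polymorphic) pointee object, possibly via vtbl. Suppose it’s 22.
(The compile-time declared type of p2 doesn’t matter, provided Base has a virtual destructor. If we did a Base *p2 = new Derived(), then deletion will reclaim all the mem cells of Derived. Without a virtual destructor, deleting a Derived through a Base pointer is undefined behavior.)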
– mark that many bytes starting from Address 0x1234
– return these bytes to freelist
–> Any time from now on, those 22 bytes can be allocated to anyone. Note p2 continues to be seated at 0x1234. The deletion performs no state change in the 32-bit object p2. If you read/write/delete on p2, you die.

Rule — after you call q(delete p2), always fill the 32-bit object p2 with zeros, by calling q(p2 = 0)

constructors called from strategic places (super, subclasses etc)

* You can call a constructor C() from an instance method in C.java. This means an existing C instance can create a “friend” C instance. This is often seen with parent-child nodes.

* Call C() from its base class B?

* Call C() from its derived class D? Actually every D constructor has to call a C constructor.

* If a C object has a pointer to a J object, does it make sense for J.java to call the C() constructor? Yes, but we can easily add into the J object a pointer back to the original C object, so the link becomes bi-directional and tightly coupled.

These unusual ways to call constructor are the basis of many OO designs.
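A minimal sketch of two of the cases above, assuming hypothetical classes TreeNode and LabelledNode (not from any library): an instance method calling its own class's constructor to create a child node, and a derived-class constructor calling the base-class constructor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration only.
class TreeNode {
    final String name;
    final TreeNode parent;                       // null for the root
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name, TreeNode parent) {
        this.name = name;
        this.parent = parent;
    }

    // Instance method calling the constructor of its own class:
    // an existing node creates a "friend" (child) node.
    TreeNode addChild(String childName) {
        TreeNode child = new TreeNode(childName, this);
        children.add(child);
        return child;
    }
}

// Every derived-class constructor calls a base-class constructor,
// either explicitly via super(...) or implicitly via the no-arg one.
class LabelledNode extends TreeNode {
    final String label;

    LabelledNode(String name, TreeNode parent, String label) {
        super(name, parent);
        this.label = label;
    }
}
```

Note the parent-child link here is already bi-directional (parent field plus children list), which is the tight coupling mentioned above.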

tibrv-UDP-multicast vs tcp-hub/spoke-JMS


Looks like the consensus is
* tibco rv — fastest, high volume, high throughput, less-than-perfect reliability even with certified delivery
* JMS — slower but perfect reliability.

Before adding bells and whistles, core RV was designed for
* efficient delivery — no bandwidth waste
* high volume, high throughput
* multicast, not hub/spoke, not p2p
* imperfect reliability. CM is on top of core RV
* no distributed transaction, whereas JMS standard requires XA. I struggled with a Weblogic jms xa issue in 2006

— simple scenario —

If a new IBM quote has to reach 100 subscribers, JMS (hub/spoke unicast topic) uses tcp/ip to send it 100 times, sequentially[3]. Each subscriber has a unique IP address and accepts this 1 message, and ignores the other 99. In contrast, Multicast sends [4] it once, and only creates copies when the links to the destinations split. RV uses a proprietary protocol over UDP.

[3] http://www.cs.cmu.edu/~priya/WFoMT2002/Pang-Maheshwari.pdf benchmark tests reveal “The SonicMQ broker requires a separate TCP connection to each of the subscriber. Hence the more subscribers there are, the longer it takes for SonicMQ to deliver the same amount of messages.”
[4] Same benchmark shows the sender is the RVD daemon. It simply broadcasts to the entire network. Whoever is interested in the subject will get it.

P233 [[weblogic definitive guide]] suggests
* multicast jms broker has a constant workload regardless of how many subscribers there are. This is because the broker sends it once rather than 100 times as in our example.
* multicast saves network bandwidth. Unicast topic requires 100 envelopes processed by the switches and routers.

copier & assignment — part of the fabric

When you overload the assignment operator carelessly, you may trigger 1 or 2 implicit calls to the copy constructor. Compared to assignment, copy constructor is even more implicit, more frequently called, more part-of-the-fabric.

The pattern starts with C pbclone (pass-by-clone aka pass-by-value) convention. All java primitives (and c# value types?) are pbclone. Every object passing in and out triggers the copy constructor.

Similarly, C++ class types are pbclone by default, unless you specify references in the func prototype[1]. In an overloaded assignment operator, there is exactly one pass-in and one pass-out, so a careless overload triggers exactly 2 copy constructor calls.

Actually, even more part-of-the-fabric than copier is pbref, because the default copier has signature (MyClass const& rhs).

[1] either the return type or the param types, or both

stack to heap trend in languages

  •  C is mostly stack. I guess you seldom see malloc() and free() in business applications. I have seen a number of C applications but didn’t notice them. However, how do C pointers deal with automatic destruction of the objects they reference? I guess all such objects are declared and used in the same scope, never declared in an inner func and used in an outer func?
  • C++ biz apps use lots of new(), but by default, if you don’t call new() or malloc(), C++ uses the stack and global area. In terms of language support and encouragement, stack is still easier to use than heap in C++. I consider C++ as a transition between C and newer languages.
  • STL seems to be mostly heap, to support expansion
  • Java uses the heap more than the stack. Every object is on the heap. Memory leak becomes too likely so java had to provide garbage collector.
Conclusion: c -> c++ -> STL -> java, I see less stack and more heap usage.

##java magic across domains

“Any sufficiently advanced technology is indistinguishable from magic.” If you don’t know these technologies, you can’t imagine how some things are achievable.

* runtime code generation — cglib and proxies (since jdk 1.3)
* bytecode engineering
* reflection

Also powerful:
* threads in private inner classes

swing features that exceed browser

Q: in the web2.0 era, why use swing or any thick/fat client?

spreadsheet — (dozens of) Spreadsheet features. Just compare Excel against google docs.
fast, automatic screen update based on incoming messages
state — sophisticated state maintained inside client JVM.
threads — create multiple threads in client JVM, to do arbitrarily complex things.
DB — access database
Caching — large amount of database records locally.
access ejb, rmi, webservices
disk — access local hard disk
call OS commands or run any executable
MOM — multiple MOM listeners
precise control of real estate — squeezing in the maximum amount of info in a window.

jtable sort, edit, delete, column-move using client (save the server) CPU
swing timer
Better utilization of client machine processing power.
visual rendering using client machine’s CPU
drag and drop
context menu
fully customizable keyboard shortcuts
movable split pane

swing client/server communication – MOM or RMI

See also http://bigblog.tanbin.com/2011/02/swing-server-communication-for-wall-st.html

1) MOM – server ==> swing? Best example is market data and trade blotter
2) MOM – swing ==> server? Good for some non-urgent updates originated on UI. How about the server response?
2a) perhaps UI doesn’t care about the response
2b) In Neo, the WPF client sends a message to server and returns. A separate onMsg() thread waits for the server response.

Some user operations must use a synchronous call (like EJB or web service) rather than MOM —
3) separate swing thread to call server and then update UI (by publishing to event queue, or invokeLater)
4) action listener running on EDT makes a synchronous blocking call to server, making the UI temporarily non-responsive.

What if server pushes data to swing but needs some client response? I feel this is less common. Swing could run an RMI server and MOM listener. I feel MOM is better.

a fairly generalized request/reply model – MOM

background — req/rep is indispensable in trading as it replaces synchronous call.

I feel the scheme below is overly complex.

Note “server” is a vague term in req/rep. I use “broker” and “service” instead.

* Initiator sends request to message broker.
* broker delivers it to the service provider. I feel async allows an overloaded service to control message rate. Sync could overwhelm the service.
* depending on the acknowledgment mode, service might send an REQ-ACK right away[1], since the reply might take a long time. REQ-ACK can give initiator some useful assurance so it doesn’t need to guess “why no reply?”.

* x seconds/hours later, service sends the “value” to a private queue (initiator’s mailbox).
* broker delivers it to the initiator, async or sync. See post on initiator’s mailbox.
* depending on the acknowledgment mode, initiator might send a VALUE-ACK.

Generalized, request vs value can be considered 2 independent JMS flows. Perhaps 2 independent brokers, 2 independent queues. Zed used this model as the ringtone/joke/horoscope/…. might get generated after a long delay. If the request is a write operation, then more control is needed.

[1] Further decoupled, an explicit, custom ACK message can go into yet another queue. So the request/reply would use 4 queues. However, i think this is rare. ACK could be a builtin in the broker infrastructure, but perhaps the actual receivers (initiator and server) can issue the ACK.
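The flow above can be sketched without a real broker. This is only an in-memory illustration under stated assumptions: two BlockingQueues stand in for the request queue and the initiator's private mailbox, a background thread plays the service, and the class name, message kinds and correlation id are all hypothetical. It is not JMS code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// In-memory stand-in for the broker; all names are hypothetical.
class RequestReplySketch {
    static final class Message {
        final String correlationId;
        final String kind;       // "REQUEST", "REQ-ACK" or "VALUE"
        final String payload;
        Message(String correlationId, String kind, String payload) {
            this.correlationId = correlationId;
            this.kind = kind;
            this.payload = payload;
        }
    }

    // Sends one request and waits for the REQ-ACK and then the VALUE.
    static String roundTrip(String request) {
        BlockingQueue<Message> requestQueue = new LinkedBlockingQueue<>();
        BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>(); // initiator's private queue

        Thread service = new Thread(() -> {
            try {
                Message req = requestQueue.take();
                // Acknowledge right away, since the value may take a long time.
                mailbox.put(new Message(req.correlationId, "REQ-ACK", ""));
                // ... a possibly long-running computation would go here ...
                mailbox.put(new Message(req.correlationId, "VALUE", req.payload.toUpperCase()));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "service");
        service.start();

        try {
            requestQueue.put(new Message("id-1", "REQUEST", request));
            Message ack = mailbox.poll(5, TimeUnit.SECONDS);
            Message value = mailbox.poll(5, TimeUnit.SECONDS);
            if (ack == null || value == null) {
                throw new IllegalStateException("no reply within timeout");
            }
            return ack.kind + "," + value.kind + "," + value.payload;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The request and the value travel on two independent queues, so the two legs could in principle be served by two separate brokers.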

c# yield statement — declarative !! imperative

In a c# method, unlike all other statements the yield statements are not a sequence of instructions. Instead, they form a declarative specification of an iterator. In a sense they have something in common with the annotations/attributes in source code?

In fact, there is no method per se. The entire yield-block is a declarative “iterator block”.

(How about the “using” statement (not the directive)? I feel it still maps to a sequence of instructions.)

java serialization – a few tips

See link — covers serialization in RMI.

* serialization = writing a *graph* of objects into a stream of data

* How to Make a Class Serializable
1. Implement the Serializable interface.
2. Make sure that instance-level, locally defined state is serialized properly.
3. Make sure that superclass state is serialized properly.
4. Override equals( ) and hashCode( ).

Making a class serializable rarely involves significant changes to its functionality and shouldn’t result in any changes to method implementations. This means that it’s fairly easy to retrofit serialization onto an existing object hierarchy. The hardest part is usually implementing equals( ) and hashCode( ).

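on (2) above, I personally try to compose my objects with primitives, strings and simple arrays. What if there’s a map field? Standard collections such as HashMap already implement Serializable, so I believe default serialization works as long as the keys and values are serializable too. Customization is needed only for special cases.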
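A small sketch of the four steps, assuming a hypothetical Position class: it implements Serializable, holds a HashMap field (which is itself Serializable), marks one field transient, and overrides equals() and hashCode() so a deserialized copy compares equal to the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical class for illustration only.
class Position implements Serializable {
    private static final long serialVersionUID = 1L;

    final String symbol;
    final int quantity;
    final Map<String, Double> marks = new HashMap<>(); // serialized as part of the graph
    transient String note;                              // skipped by default serialization

    Position(String symbol, int quantity) {
        this.symbol = symbol;
        this.quantity = quantity;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Position)) return false;
        Position other = (Position) o;
        return quantity == other.quantity
                && symbol.equals(other.symbol)
                && marks.equals(other.marks);
    }

    @Override public int hashCode() {
        return Objects.hash(symbol, quantity, marks);
    }

    // Writes the object graph to a byte array and reads it back as a new instance.
    static Position roundTrip(Position p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Position) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without the equals() override, the copy would never compare equal to the original, because it is a distinct instance.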

tibrv-CM example usage scenarios #nice

The tibrv documentation [[RV concepts]] (https://docs.tibco.com/pub/rendezvous/8.3.1_january_2011/pdf/tib_rv_concepts.pdf) has good examples of when to use Certified Messaging. (minor edits by me)

Certified delivery is appropriate when a sending program requires individual confirmation of delivery for each message it sends. For example, a traveling sales representative enters sales orders on a laptop computer, and sends them to a central office. The representative must know for certain that the order processing system has received the order he sent.

Certified delivery is also appropriate when a receiving program cannot afford to miss any messages. For example, in an application that processes orders to buy and sell inventory items, each order is important. If any orders are omitted, then inventory records are incorrect. It’s like missing an incremental backup.

Certified delivery is appropriate when each message on a subject builds upon the previous message (with that subject) — in sequence. For example, a sending program updates a receiving database, contributing part of the data fields in a record, but leaving other fields of the record unchanged. The database is correct only if all updates arrive in the order they are sent.

Certified delivery is appropriate in situations of intermittent physical connectivity—such as discontinuous network connections. For example, consider an application in which several mobile laptop computers must communicate with one another. Connectivity between mobile units is sporadic, requiring persistent storage of messages until the appropriate connections are re-established.

buy-write vs short put #my take

Put-call parity equates buy-write to selling puts (naked puts), but there are many differences glossed over.

Background — Buy-write means buying underlier and selling calls. Put-call parity shows the resulting expiration PnL graph resembles a short put position. All the trades reference the same amount of underlier asset, and identical strikes. Note the hockey stick PnL graph is an Expiration snapshot of a RangeOfPossibilities — see other blog posts.
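A worked check of the parity claim, as a sketch only. It assumes European options, zero interest rates and no dividends, so that put-call parity reduces to C - P = S0 - K. The class and method names are hypothetical, and the numbers are made up for illustration.

```java
// Hypothetical helper; per-share expiration PnL under zero rates and no dividends.
class BuyWriteVsShortPut {
    // Buy-write: long stock bought at s0, short one call struck at k, premium received.
    static double buyWritePnl(double sT, double s0, double k, double callPremium) {
        return (sT - s0) + callPremium - Math.max(sT - k, 0.0);
    }

    // Short put struck at k, premium received.
    static double shortPutPnl(double sT, double k, double putPremium) {
        return putPremium - Math.max(k - sT, 0.0);
    }

    // Put premium implied by put-call parity with zero rates: P = C - S0 + K.
    static double parityPutPremium(double s0, double k, double callPremium) {
        return callPremium - s0 + k;
    }
}
```

With s0 = 100, k = 105 and a call premium of 3, the parity put premium is 8. At an expiration price of 90 both positions lose 7, and at 120 both make 8, so the two hockey sticks coincide under these assumptions.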

Diff — many (at least for retail investors) BW happens on option exchanges, which are usually American style so no PCP. Notable exception — a few listed index options are European style.

Diff — cash outlay — usually 10 times or higher for BW, because you must pay (at least half) the full cash price for underlier. To short sell a put you probably can get away with lower margin. See p89 [[the math of option trading]] for real examples.

Diff — moneyness — typical BW uses OTM calls (though ITM is common too). The corresponding (same strike) short puts are ITM.

Diff — recurring income — to the BW trader, but only if she’s consistently lucky that the calls she sold always expire OTM.

Diff — BW can make you lose the best stocks in your portfolio if you do that habitually. No such side effect with short puts, because once you get burned you stop playing with fire.

Diff — after expiry — you end up with no position and no risk if using naked puts. With BW, you may still hold large risk, large profit potential and large exposure.

Diff — To execute a buy-write, you would need to overcome 2 bid-ask spreads.
Diff — To execute a buy-write, you would pay 2 commissions.

1 queue, 1 EDT thread, 2 full time jobs

The event-dispatch-thread has 2 distinct responsibilities and does nothing else.
1) EDT updates the state of UI components, which propagate to the screen
2) EDT runs all the event callbacks — EDT’s event queue receives events from user input which trigger other events. Those secondary events go into the EDT queue and are handled on the EDT itself.

EDT queue receives events both from the underlying peer classes and from trusted application classes.
EDT queue receives events from timers
EDT queue receives events from your custom fireTableDataChanged()
EDT queue receives events from your custom invokeLater() when you finish reading/writing to DB, file system or network
EDT queue receives events from your custom invokeLater() when you return from wait(), sleep() or join()
Any potentially blocking operation should be on a non-EDT thread. At the end of the operation, you should call invokeLater to fire an event to the EDT.

When EDT blocks, traffic stops both ways
– down) user inputs can’t go down the system (from peer object?)
– up) updates can’t show up on screen (to peer objects?).

I feel this is an event producer-consumer pattern with the queue as the buffer. Note the FIFO queue enforces strict order among the events — http://docs.oracle.com/javase/1.4.2/docs/api/java/awt/EventQueue.html
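The hand-off described above (blocking work off the EDT, then invokeLater to post the UI update onto the EDT queue) might look like the sketch below. The class name and the latch-based waiting are my own assumptions for demonstration. A real application would not block the caller waiting for the EDT.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.swing.SwingUtilities;

// Hypothetical helper for illustration only.
class EdtHandoffSketch {
    // Runs slowWork on a worker thread, then posts uiUpdate to the EDT queue.
    // Returns true if uiUpdate was executed on the EDT.
    // Do not call this from the EDT itself: the wait below would stall the queue.
    static boolean runOffEdtThenUpdate(Runnable slowWork, Runnable uiUpdate) {
        AtomicBoolean ranOnEdt = new AtomicBoolean(false);
        CountDownLatch done = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            slowWork.run();                          // DB, file, network, sleep...
            SwingUtilities.invokeLater(() -> {       // enqueue an event for the EDT
                ranOnEdt.set(SwingUtilities.isEventDispatchThread());
                uiUpdate.run();
                done.countDown();
            });
        }, "worker");
        worker.start();

        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("EDT did not run the update in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ranOnEdt.get();
    }
}
```

The worker is the event producer and the EDT is the single consumer, with the event queue as the buffer between them.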

practical, everyday sybase tuning]GS #Nikhil

My GS system was database centric. Most important queries hit large 10GB tables. We dealt with slow queries (mostly select) on a daily basis. If a query is slower than usual, we check:

* query plan — is the join correct?
* query plan — is the correct index used?
* query plan — any table scan?
* update stat right away if necessary. Note we already run that automatically on a weekly basis.

Most of the time that’s enough. Occasionally, we also use
* partition (horizontal) into history tables
* partition (vertical) into 2 narrower tables (Main comm and trd tables + Trade Staging table)

baml hedging in muni desk

* only tr and tr futures are used. These are liquid instruments and have good correlation to muni
* VaR is the standard risk measurement.
* valuation/marking is based on ….take your breath… last exec price for tr, but less standardized and less formalized for muni

## c++ syntax monsters

* constructor calls without new — see other posts.
* initializers in constructors
* array of pointers (P234 [[24]])
* function parameters — reference types
* copy constructor
* constant pointer to constant object
* assignment operator overloading
* function pointers, and (worse still) function pointers to member functions
* generic functions and (worse still) generic classes

A crude progress bar of a learner is these “chapters”….

when would table model data change show up (TableModel/EDT fusing)

Q: If on a non-EDT thread, such as a market data thread, I modify the table model beneath a “showing” (“realized”) jtable, say change text from “A” to “B”, but don’t repaint or fire table change event, when will my new text show up on screen?

A: never. I tested with a custom table model. The new text shows up only after I repaint or fire table event.

A: you must send an invitation to the EDT to re-query the table model and then repaint the visual, otherwise the visual will not update. Look at INotifyPropertyChanged in wpf.

A (from a veteran): For custom table model, we need to fire events, either cell change event or table data change event.
A (from a veteran, echoed on http://stackoverflow.com/questions/3179136/jtable-how-to-refresh-table-model-after-insert-delete-or-update-the-data): in some cases like default table model setValueAt()/addRow(), the event is fired automatically. I’d say the more common and standard procedure is to always fire events manually
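A minimal sketch of the point above, assuming a hypothetical custom model QuoteTableModel: changing the backing data silently notifies nobody, while setValueAt() fires a cell-update event that a listener (such as a JTable) would react to.

```java
import javax.swing.table.AbstractTableModel;

// Hypothetical custom table model for illustration only.
class QuoteTableModel extends AbstractTableModel {
    private final String[][] cells = { {"IBM", "A"}, {"MSFT", "A"} };

    @Override public int getRowCount() { return cells.length; }
    @Override public int getColumnCount() { return 2; }
    @Override public Object getValueAt(int row, int col) { return cells[row][col]; }

    // Silent change: the data is updated but no listener is told,
    // so a JTable attached to this model has no reason to re-query it.
    void setSilently(int row, int col, String text) {
        cells[row][col] = text;
    }

    // Change plus notification: listeners receive a TableModelEvent.
    @Override public void setValueAt(Object value, int row, int col) {
        cells[row][col] = String.valueOf(value);
        fireTableCellUpdated(row, col);
    }
}
```

In a real Swing application both calls should be made on the EDT. The sketch only shows which call produces an event.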

Q: Will firing event involve repaint()?
A: not according to my debug test

Q: will firing the event trigger paint()?
A: yes according to my debug test. EDT runs paint().

Q2: is it safe for non-EDT thread to make state change on the table model object?
A: not sure. The only way to propagate the state change to the screen is via native screen object.
There are perhaps 4 (or more) objects involved:
– JTable instance (view-controller?)
– Model instance
– ?UI delegate object
– native screen object

We are modifying the model object, not the “real” object (whatever it is), so screen won’t update. UI delegate or native screen object are probably not directly accessible.

http://docs.oracle.com/javase/6/docs/api/javax/swing/package-summary.html#threading says “This restriction also applies to models attached to Swing components. For example, if a TableModel is attached to a JTable, the TableModel should only be modified on the event dispatching thread. If you modify the model on a separate thread you run the risk of exceptions and possible display corruption.” This seems to suggest that TableModel objects are modified by event handlers due to user actions or due to timers …

Q: when user edits a table cell, does the table model object change via an event callback or automatically without any event callback?
A: via Event callback, on EDT. My debug test shows stopCellEditing(..) is invoked by Hollywood. This indirectly calls editingStopped(…), which calls setValueAt(..)

what trading systems use quantitative pricing

In options, the pricing system is one of the most important components of a trading system. A tiny advantage in your bid/offer price will beat the market. I guess GS has higher trading volume than other firms because they trade at more competitive prices. Pricing and risk are the 2 sides of the same coin. GS has secDB and stronger risk system and better hedging so they can trade aggressively.

Pricing is less quantitative in illiquid markets such as Muni. Not many banks offer standardized interchangeable products.

Pricing is more quantitative in prop trading i.e. trading using bank’s own money.

Pricing is more quantitative in algo trading.

Pricing is less important in broker systems i.e. agency trading. Profit is the commission and the spread on the prices clients specify.

Prime brokerage is similarly reliant on the valuation system: margin calculation determines how much money to lend out. The more the house lends, the more fees it can collect, but this must be calculated against the risk exposure.

state of data model vs state of a visual component

Say myTable has a table model encapsulating a “vector3” holding 3 beans for 3 data rows. The actual visual image on screen is a native/physical/screen object outside JVM. Since objects all occupy memory and have individual states, the visual object and vector3 both hold data for the 3 beans.

Q: are there 2 copies of each bean?
%%A: I think so. The visual object is tightly guarded by kernel[3] since it’s a shared hardware resource. I guess no application is allowed to directly modify visual objects. Java Application can freely modify a java vector, but vector3 object has to be distinct from the visual object. Therefore, for each attribute[1] of the beans in vector3, there’s a copy of it in the visual object.

Q: Similarly, when a user types a letter “K” into a cell in myTable, it’s probably saved in the visual object’s memory (otherwise it will be lost in a clock cycle), but is there another distinct java object inside vector3, to be updated to “K”? The keyboard input generates an event encapsulating the text. Probably a copy of the text. But how about the bean in vector3? Is it directly modified by the keyboard input?

%%A: I would say no. It’s extremely rare to have a variable directly updatable by hardware. In C it’s known as a volatile variable. I think it’s more common to have a java thread [4] updating vector3. Code running in that thread is regular java code. In contrast, the visual object is tightly guarded and owned by the OS[3], not directly exposed to applications.

In a jtable, I confirmed that a special thread[4] updates vector3 (real data structure of the table model), then fires an event to the listeners.

Q: But what if a MOM thread also updates vector3?
%%A: veterans told me we need locks as “guard” over vector3. When users are editing a jtable, MOM thread should not touch it.

[1] Each non-primitive attribute has to be represented or serialized as primitive int/float/char, for it to be usable in the visual.
[3] Remember OS/kernel was created as a chokepoint access layer over shared hardware
[4] my test shows it’s EDT

pattern – start thread in constructor#briefly

This is an important element of many concurrency designs.

Q: is it good to instantiate threads in a constructor?
%%A: better be careful. I try to shorten my constructor *call* including any method calls made from my constructor.

Q: is it good practice to start a thread in a constructor?
%%A: i avoid it. On a multi-core machine, the new thread could do any amount of work before the constructor returns. Until then, the half-constructed object is invalid, hazardous and should be untouchable.
%%A: I notice this is fairly common practice.
A: Actually, std::thread ctor “starts” the kernel thread i.e. makes it Eligible to receive time slices.

Q: best practices for writing constructors for a Runnable? should a thread object maintain state — must be initialized
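One common way to avoid starting a thread inside a constructor is a private constructor plus a static factory that starts the thread only after construction has finished. A sketch, assuming a hypothetical PriceFeed class:

```java
// Hypothetical class for illustration only.
class PriceFeed implements Runnable {
    private final Thread thread;
    private volatile boolean running = true;
    private volatile int ticks;          // written only by the feed thread

    // The constructor creates the Thread object but does NOT start it,
    // so no other thread can observe a half-constructed PriceFeed.
    private PriceFeed(String subject) {
        this.thread = new Thread(this, "feed-" + subject);
    }

    // Static factory: construct fully, then start.
    static PriceFeed createAndStart(String subject) {
        PriceFeed feed = new PriceFeed(subject);
        feed.thread.start();
        return feed;
    }

    @Override public void run() {
        do {
            ticks++;                     // stand-in for real work
            Thread.yield();
        } while (running);
    }

    // Asks the thread to stop, waits for it, and returns the tick count.
    int stopAndCountTicks() {
        running = false;
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ticks;
    }

    boolean isAlive() {
        return thread.isAlive();
    }
}
```

All state the thread needs is initialized before start() is called, which also answers part of the last question: the Runnable may hold state, but it must be ready before the thread runs.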

entry point in a java reusable component

What does it mean when a java developer says

“download xyz jar”
“Give me your java api”
“You own that api”
“use this java toolkit”

They are referring to some java classes (or interfaces), usually organized into a package. This toolkit is maintained by one team (owner) to be used by other teams.

Now, good OO design requires encapsulation, so there are usually a limited number of entry points to the toolkit. I mean the first thing you write to reference the toolkit. I think there are only 2 entry points — static methods and constructors.

In some toolkits, api methods are instance methods. Question is how do you get the first instance from this toolkit? The entry points are all about instantiation.

Factories are often static methods.
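A sketch of a toolkit whose only entry point is a static factory, assuming hypothetical names Pricer and PricerToolkit. The api methods are instance methods, but the first instance comes from the static method.

```java
// Hypothetical toolkit for illustration only.
interface Pricer {
    double fee(double notional);         // api method is an instance method
}

final class PricerToolkit {
    private PricerToolkit() {}           // the toolkit class itself is never instantiated

    // Entry point: a static factory hands out the first instance.
    static Pricer flatFee(double feeRate) {
        return new FlatFeePricer(feeRate);
    }

    // Implementation class stays hidden behind the interface (encapsulation).
    private static final class FlatFeePricer implements Pricer {
        private final double feeRate;

        FlatFeePricer(double feeRate) {
            this.feeRate = feeRate;
        }

        @Override public double fee(double notional) {
            return notional * feeRate;
        }
    }
}
```

Client code writes `Pricer p = PricerToolkit.flatFee(0.01);` as its first reference to the toolkit, and never sees FlatFeePricer.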

brief notes from the Test-driven-development course

Refactor a 20-line nested if/else chunk of procedural code into design patterns? You will create new classes, but that's not bad complexity. The OO trainer actually refactored a somewhat complex 20-line if/else into a multi-class State pattern!

Test coverage measures how many execution paths there are through your method, and how many are tested. Apparently any if/else creates new execution paths. I didn't believe developers could realistically write tests for every IF in a method (a method worth testing). The TDD trainer showed us through 8 hours of hands-on demo that it's actually best practice to write that many tests. It's not unrealistic to maintain all these tests when my application changes.

His own test coverage is up to 80%. I saw during the demo that 80% means practically every if/else and exception situation, among other things.

(incomplete) backgrounder to C++ pointers, references and scalars

(C++ centric explanation, also applicable to other languages.)

1) An object [2] is simply a chunk of memory having
* A) address — immutable in any language
* Val) value — mutable. [1]
* name? Never! Names exist in source code not in RAM.
2a) A nonref variable is a name attached to an object. The var has no visible address of its own.
2b) A ref-variable is a pointer to an object and always, always has a
* N) name — immutable
* PA) pointee-address — mutable except C++ references
* A) address (immutable) — a ref-variable is an object and has its own address. Therefore double pointers.

A ref-variable can change its object, via re-assignment. Java and C++ differ here. Java reference variables easily change PA ie reseat, same as C++ pointers, but C++ reference can’t change PA.

[1] Note on the word CHANGE — We say “an object can change value” as “a leaf can change color”.

[2] “Object” could be an int (or float…) in general. Java talks about “Object” as non-primitive objects.

Q90: what variables can change object?

Q66a: what if we assign scalar to scalar?

Q66b: what if we assign pointer to pointer?
Q66c: what if we assign scalar to ref?
Q66d: what if we assign ref to scalar?

Q32: what if we take the address-of a variable?

Q95: what if we dereference a variable?
A: reference variables rVar1 never need it
A: a pointer p1 is a wrapper of another variable v1. Dereferencing p1 means “unwrapping” p1 or “exposing v1”. Since *p1 === v1, I feel any place [1] you write *p1, compiler translates it into v1.
[1] declarations use star with a completely different meaning

Q20: what if i pass address-of this variable into a function?
A: receiving parameter must be a pointer

Q21: what if i pass this variable into a function? It depends on the receiver variable.
A: scalar receiver — pass by clone
A: reference receiver — receiver becomes alias

Q29: what if i pass a pointer variable dereferenced into a function?
A: i think it’s equivalent to passing v1 the scalar ie pass by clone

Q82: what exactly happens to the A/Val/N during assignment like “var =
A: for a pointer — Val becomes the addr of …
A: for a scalar — Val becomes a clone of ..? See P216 of [[24]]
A: for a reference — Addr becomes the addr of …

Q66: can the address-of be an lvalue like “&var=…”? What happens to the A/Val/N?
A: never

Q52: can the dereference be an lvalue like ” *var = …”? What happens to the A/Val/N?
A: yes for a pointer.

JTable vs database table

Logically and Implementation-wise, a Jtable consists of a small number of columns. In contrast, the jtable can contain unlimited rows. As an analogy, a couple can have a small number of kids, and unlimited number of books. Adding a child is a big deal, just like adding a jtable column, just like adding a database table column. At the core and from the onset, jtable’s basic design and implementation resemble database tables.

In a typical business application, a database RECORD maps to an entity bean. A jtable displays an unlimited number of entity beans, each with a known number of attributes. Each attribute occupies a jtable/database column, therefore each jtable/database column has a data type, and a distinct meaning with a title. What if all the fields of the bean have the same meaning ….[1]?

In any GUI framework, if we create a UI component capable of manipulating a single column of data, resizing, editing, firing events … then we are 90% close to creating a table. Therefore, most of the capabilities of Jtable probably “come from” the column thingy, even though I’m not familiar with the TableColumn.java and TableColumnModel.java classes.

Each column (not “each row”) has its data type, width, color, renderer(!), editor(!), position among other columns… It’s like a small kingdom within a large federation.

[1]A: Then the jtable is a grid.

eden + survivors must be small (!! too small)

I used to wonder why young generation (eden+survivors) should not be larger than ¼ (or 1/8….can’t remember the recommendation). The reason turns out to be fundamental and it’s good to remember it.

Reason is, minor GC is very frequent and is expected to be efficient and “cheap”. Long pause is associated with major GC, whereas minor GC STW pause is usually much shorter.

Note even though minor GC is brief, it is Stop-The-World throughout. There are no phases. There’s just one phase and it’s STW.

Many financial trading engines incur just one major GC a day, but no such thing about minor GC.

General advice/observations —

* young gen too large -> bad pause during minor GC
* young gen too small -> too frequent minor GC

financial jargon: over the counter markets

(see link.)

In many contexts, OTC contracts (swaps, options, forwards…) mean bilateral contracts not backed by an exchange. Not transferable. Usually large, custom contracts not available on any exchange. I feel these OTC securities could trade on an ECN.

Key points:

#1 aka off-exchange. these “things” are not on nasdaq or nyse or other things.
* an instrument can be traded in an exchange, or on an otc market, or both
* market makers — usually required.
* ECN are probably a necessary link between the counterparties.
* swaps are usually on OTC

un-listed — The market is for securities not listed on a stock or bond exchange.

un-listed — OTC is Antithesis of listed.

non-exchange — The market is for securities not listed on a stock or bond exchange.

NASDAQ — The NASDAQ market is an OTC market for US stocks, but sometimes people say otc stocks aren’t on nasdaq.

small — OTC stocks are very small and do not trade on an exchange because they do not meet market capitalization requirements. OTC securities may theoretically be traded informally (one may stand on a street corner and sell his/her stocks), but the term usually refers to securities traded through a dealer network.

3 types of swing "classes" in containment hierarchy

Every (yes Every) swing application can be /usefully/ described as a hierarchy of containment, even if 2-level. I only encountered 3-levels or more. A hierarchy always has exactly one INSTANCE of a top-level container.

Simplified classification — each everyday swing class is exactly one of these 3 —
1) top-level container class, usually JFrame.java
2) branch container class, like JPanel.java, JScrollPane.java, JTextPane.java, JSplitPane.java
3) leaf class, like jtable

As you can see, swing provides exactly 2 types of container classes — top-level vs branch. Both types are few in number and so widely used they are unavoidable. Better know them well.

Layout managers mostly target a branch container. I doubt they target top-levels since the top-level usually contains just a single content pane, so no layout required, presumably. I feel WPF is similar — The WPF top-level window holds exactly one container, which is a layout-control-panel.

Q: can we make do without branch containers?
A: I don’t think so, since a jframe has a content pane which is a jpanel.

A top-level swing class is always heavyweight; a branch or leaf class is lightweight. See P 362 [[java cookbook]]

Q: can a branch class be used as top-level or leaf?
A: I don’t think so. A branch container class can’t be top-level (not heavyweight) and should not be a leaf because it doesn’t display useful content by itself.

transaction isolation level (%%language)

Basic context behind the jargon:
* shared data — in memory or disk
* at least 2 threads. Isolation between threads.

0) concurrent, interleaved writes — no transaction. no isolation. 2 threads can each update half of a row’s data, corrupting it. I don’t think any DB allows this.

1) dirty read or “read uncommitted”(RU) — Easy. when a reader thread/session gets to see another thread’s uncommitted data change, that’s one count of read-uncommitted offense. We say this system[4] “allows read-uncommitted“. This is the lowest isolation level, but higher than (0) above.

2) read committed (RC) [3] — If system doesn’t expose uncommitted data change, then read-uncommitted will never occur. If system is configured this way, then /each thread only reads committed changes/. However, each thread doesn’t get repeatable-read-by-id guarantee.

3) repeatable read (RR) by id — Your thread reads an item by id and writes something /based/ on the read. If you can reliably read, read, read for a long time … to get the *same* data before your write, then you are lucky [2] — no other session is allowed to touch that object. The system gives you a Repeatable Read guarantee on that object[1]. This system is operating in the RR mode.

Basically, system locks all rows of your first read and reserves them for you — pessimistic locking. [5] Once you read a row, it’s reserved for you. You are an emperor — once you cast your eyes on a girl, you have the right on her.

4) phantom read — However, if you read with a range-select like “where price > 0”, then your first read can reserve 500 rows and a repeated read can turn up 501 rows. This is a count of Phantom Read. I feel this query is not repeatable. It’s not a read-by-id, so RR is not relevant.

Solution for phantom read is — a range-lock rather than a row or page lock. Some say “serialize all threads that access a given range”.

Say Transaction1 has just finished a read with “from table1 where age > 22”, uncommitted. Can system allow Transaction2 to start, one that doesn’t mention this table? If system lets Transaction2 start, Transaction2 may lock up another table. Since Transaction1 locks up table1, there’s risk of deadlock. It’s safest to serialize all transactions.

[1] but not necessarily on that query
[2] luckier than the threads in a RC system
[3] RC is the default in most databases.
[4] Now the “system” could be a table or a database or a multi-threading app. There’s absolutely(?) no control, coordination, discipline or “isolation” in this RU mode. Isolation is the I in ACID.
[5] Now we know every SELECT is implicitly in a new transaction. In RR mode, it locks up all the rows involved until commit. This is more strict than the default RC mode.

c/c++ headers, func prototypes

In C [2], before you use a function[1], you must declare (like abstract) or define (i.e. implement) it. Alex (lab49) explained that c compiler is single-pass. Java and C# are multiple-pass.

Say we don’t like to create func declarations. Then we must define a function before calling it from another func, and we must be careful to arrange func definitions. As a developer, I don’t want to worry about the ordering of func definitions. One worry-free solution is to declare all functions upfront. One step further is a header file, which has all the func declarations.

[2] C++ is like C. Objective-C is the same.
[1] variables too.

STL algorithms always use iterators

My ObjectSpace manual says STL algos only access data via iterators. Therefore they are versatile enough to (indirectly) access data elements in arrays, strings, io streams and containers. Typically an algo’s input includes a range with 2 demarcation iterators.
Q: any STL algo using just 1 iterator?
A: fill_n() and generate_n()

Q: any STL algo using 0 iterator?
A: swap() is the only one. This manual has a cheatsheet or quick reference chart.

Q: any STL algo using 3 iterators?
A: many

bunch of identical biased coins – central limit theorem

What’s the histogram shape of sum of 2 dice? 3 dice? 4 dice? (plural form!)

2 -> pyramid but in steps:)
3, 4, 5… -> bell

This trend (towards the bell shape) is central limit theorem in action.

—- That’s dice … Coin is more extreme as an illustration of CLT —

Q: What is the histogram distribution of the head count from a biased coin, one toss? Say P(head) = 0.9 denoted as p
A: step function

Q: sum of 2 coins? possible values are {0,1,2}
A: step function again

Q: sum of 3 coins? Same upward shape

So where’s central limit theorem? Wait till np > 5 and n(1-p) > 5. With p = 0.9 the second condition is the binding one: n × 0.1 > 5, which means more than 50 coins. There you will see the bell shape.
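To make the coin illustration concrete, here is a minimal Python sketch that computes the exact head-count distribution for n biased coins and prints a crude text histogram. The function names `head_count_pmf` and `mode` are mine, and p = 0.9 is the value used in the questions above. For n = 1, 2, 3 the bars only climb (the tallest bar is at "all heads"), while for n = 50 the tallest bar sits in the interior with bars falling away on both sides.

```python
from math import comb

def head_count_pmf(n, p):
    """Exact binomial pmf: probability of k heads in n tosses, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mode(pmf):
    """Index (head count) of the tallest bar."""
    return max(range(len(pmf)), key=pmf.__getitem__)

P_HEAD = 0.9  # the biased coin from the text

for n in (1, 2, 3, 50):
    pmf = head_count_pmf(n, P_HEAD)
    print(f"n = {n}, tallest bar at k = {mode(pmf)}")
    for k, prob in enumerate(pmf):
        if prob >= 0.005:  # skip bars too small to draw
            print(f"  {k:3d} {'#' * round(prob * 100)}")
```

The bell for n = 50 is still visibly lopsided (longer tail on the left), which fits the rule of thumb only just being met at that size.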

Runnable vs subclass of Thread

Fairly trivial implementation question

Thread derivatives [1] permit instance fields, i.e. state. Semantics of these Thread fields are confusing and should be avoided. If you don’t want to permit thread instance fields, then use Runnable.

[1] including anon thread class

create void methods

Many developers told me they seldom write void methods. Instead, their methods by default return an int/char status or something similar.

There are justifications for void methods. I think void method applies widely, not just when you have no choice.

* if you use void now, you can more easily add a return type later. More flexible.
* if you return something but don’t read it, then other programmers may get confused or misled. Inevitably you end up adding comments to explain that.
* if you read the return value even though you don’t need it, then that’s extra work and other programmers will feel they need to understand what’s going on.
* for some simple methods, void makes the method body simpler. Non-void requires 1 or sometimes several return statements. What a waste if you don’t need it at all!
* The need to indicate success/failure can be met with judicious use of runtime exceptions. Checked exception usage requires
* IDE refactoring frequently creates void methods.

[09] trading domain knowledge – math^jargon

Q: how does a trader decide at what price to buy and sell a security, and at what quantity? How does she hedge?

Note in margin trading, sometimes you have to close out a position at a loss, in order to limit further losses. Also note an algorithmic trading system executes hundreds and hundreds of trades a day.

This is the type of trading domain knowledge a java developer can’t acquire in 6 months. Remember we both felt that most domain knowledge required on a java job is trivial and can be acquired in a few months?
Notice my example is a pricing system. Generally, the more math there is, the deeper the domain knowledge. In general, I feel there are 2 types of financial domain knowledge
* Math
* jargon

Jargon terms are numerous but not too bad. Perhaps up to 200 terms. Nothing compared to what a medical student has to learn in 6 months.

Quantitative usually means math. Financial bookstores always have shelves for the math and the jargon.

swing callback objects^methods, briefly

Note most if not all swing event handlers are objects, not mere methods. (It’s common and harmless to use nested classes for the objects.) These objects are often stateless, but I feel you can carefully add state to them.

Since the EDT is the only thread running callback methods, a slow callback freezes the whole GUI. These methods can therefore spawn new (worker) threads to take on heavy jobs.

How many times will these callbacks run in an hour? How many threads do you want to spawn?

BS-Model – my first take, in simple language

— Here are some pointers to myself —

Price relative
log(PR) (known as r) is normally distributed.
random walk — Brownian motion
BS started with a (diffusion) differential equation describing how instantaneous stock movement depends on exponential drift + geometric Brownian motion
** parametrized with sigma and r, assumed constant.

assumption : constant vol assumption — biggest shortcoming
** no skew no term structure
**** seriously underestimates vol at low strikes – tail risk
** due to this shortcoming, BS is good for price quoting/inversion only, not valuations
assumption : zero-dividend assumption — later addressed by Merton

Applies to European style only
Applies to stocks only, not FX with 2 interest rates

Requires integration of the normal pdf, so the valuation formula is based on the N() function, i.e. the cumulative distribution function
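As a pointer to myself on how N() enters the formula, here is a minimal Python sketch of the textbook Black-Scholes price for a European call on a non-dividend stock, under the constant sigma and constant r assumptions listed above. The names `norm_cdf` and `bs_call` and the sample inputs are my own choices for illustration, not from any particular library.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """N(x): cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """European call on a zero-dividend stock.
    S spot, K strike, T years to expiry, r risk-free rate, sigma volatility."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Hypothetical inputs: at-the-money, 1 year, 5% rate, 20% vol
print(round(bs_call(100.0, 100.0, 1.0, 0.05, 0.20), 4))
```

The two N() terms are the only place where the integration of the normal pdf shows up; everything else is plain arithmetic.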

keywords of TibRV

* addressing — rv protocol is different from tcp/ip socket. No IP addresses. Rather, rv uses
** Subject-based addressing — For programs to communicate, they must agree upon a subject name at which to rendezvous (hence the name of this product). Subject-based addressing technology enables anonymous rendezvous, an important breakthrough

* rvd — is the name of the daemon. in most environments, one daemon process runs on each host computer.

* transport — one of the most important java classes to developers like us. There’s no jms counterpart, but the best you can get is the Session object + physical delivery options.

* unicast vs multicast. No other choice.
** multicast is prevalent
** unicast — sends a msg to exactly one Unix process id. point to point. inbox name

copy-constructor^destructors @ function border (pbclone)

in a simple[2] C function, when you pass (in or out) an integer, you create a copy. Local variables go out of scope at end of blocks {defined by braces}. Out of scope means memory reclamation. Pass-by-clone means memory allocation.

in C++, all user defined objects are treated LIKE the integer above. When passing in (or returning), the copy-constructor creates a copy (P180 [[24]]). When local variables go out of scope, their destructors run.

[2] no pointers please.

primitive^class type, pbclone, pbref…

java has primitive vs reference types; c++ has builtin/primitive vs class data types. For a beginner, it’s best NOT to think of these c++ features in java terms. To a beginner sometimes I’d say C++ builtin/primitive and class types are very very different from java primitive and reference types.

Like C language, both builtin and class types are, by default, pass-by-clone (pbclone) — See post on “function border”. Both builtin and class types can become pbref.

Note this blog is mostly about memory. Pretend to be a snooper on the memory space in the runtime. What differences do we notice when builtin or class type variables are created, destroyed, passed, cloned…

obviously a class type can contain pointers. If a struct contains nonref (non-reference types) fields only, then it’s quite similar to a double or char primitive.

primitive^object dichotomy : !! in C++

As a java developer learning C++, I’d say the #1 syntax/semantic difference is this — the primitive vs reference type dichotomy is misleading and counterproductive. In fact, in C++ I would not mention primitive vs object, as every variable is (a name for) an object.

In java, every variable is either a primitive or a reference, i.e. a handle on an object on the heap. Primitives are always pass-by-clone (pbclone). Garbage collection only covers objects. Primitives don’t need it.

Learning C++, you need to unlearn all that, and perhaps start from the C tradition. The useful dichotomy is stack vs heap. All data types are “objects”, either built in or user defined. All heap variables need new/delete… As mentioned elsewhere on this blog, all heap objects are nameless — multiple named pointers can point to the same heap object.

local variables unavailable in eclipse debugger

If you can’t inspect a method argument (or local variable) object using Expressions/Watch/Inspect … Be creative and try Eclipse debug view -> Variables. It’s possible that local vars are not visible, but class fields are. Many fields of “this” are objects, so their fields are visible. Using a combination of fields you can often deduce the values of those local vars. Simple example — if a local var is set to someStaticMap.get( this.age%10 ), you can deduce that value.

Here’s the original context of this blog — During remote debugging one day, it turned out all these variables are non-viewable because we had built the project without including “debug information”. If your build-master allows, then the best solution is —

  javac -g:vars # Local variable debugging information added to byte code in generated *.class files

Other eclipse users had the same issue — http://dev.eclipse.org/newslists/news.eclipse.tools.jdt/msg08356.html

a view on a particular option instrument

When we see 2 quotes for the same IBM call option by 2 sellers (say UBS vs Citi), these are essentially 2 “views” on IBM volatility. The UBS view may assume a low volatility from now till expiration, but …

Q: does the UBS trader express a view on where underlying price will move? (Note “no-move” is also a view.)
My A: I don’t think so. I feel the entire view is expressed (and encapsulated) in the implied volatility.

Q: If the Citi quote is very, very high, then can we conclude either Citi trader feels IBM will rise, or Citi assumes a high volatility???

Q: From a slightly different angle, what if Citi trader feels IBM will rise over the next month? How does that affect his pricing?
%%A: I feel he can simply use a higher volatility estimate. or he can delta hedge.

Whenever discussing a VIEW like “what if traders feel …”, we can’t avoid comparing multiple strategies. Each strategy has a “risk/reward profile” (what I call RRP). In eq option world, a strategy is basically one or more positions you enter, whose combined portfolio MV is hopefully weather-proof. The layman’s obvious and simple strategy can be smart or naive.  (ignoring spread strategies,) Basic, common strategies include

– buy underlier
– short underlier – risk? unlimited
– buy call, ITM or OTM – risk? low
– buy put, ITM or OTM – risk? low
– [7] (naked call) write a call, OTM – risk? unlimited
– [7] (naked call) write a call, ITM – risk? unlimited and bigger
– (covered call) buy underlier + write call of identical qty
– [a] write an ITM put – risk unlimited (loosely)
– [8] write an OTM put – risk unlimited (loosely)
– [1] buy underlier using a limit order “buy when it drops to a value price of $0.01”
– [3] (protective put) – buy underlier + buy an OTM put of the same qty

[3] the long underlier position has big downside risk. The put protects it, like a low-cost insurance.
[7] naked call is related to a short in underlier. When I write a naked call, I agree to go short in underlier at, say $44.
[8] when I write a put, I agree to buy into a long position
[8] When I write a put at $999, someone may unload his rotten corn (something worthless) on me and I end up buying the corn at $999. I end up with a massive losing long position.
[8] is comparable to [1], more comparable to limit order BUY, but more sophisticated.
[a] I don’t know exactly why anyone would take this risk, but premium would be intrinsicVal + timeVal and a pretty high insurance premium
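To compare the risk/reward profiles of a few of the strategies listed above, here is a rough Python sketch of profit-and-loss at expiry, per share. All strikes, premiums and purchase costs are made-up numbers, the function names are mine, and financing and dividends are ignored. It shows why I wrote "unlimited (loosely)" for the written put: the loss is large but capped at strike minus premium, because the underlier can't fall below zero, whereas the naked call loss keeps growing as the underlier rises.

```python
def long_put(s, k, premium):
    """P&L at expiry of buying a put struck at k; s is the final underlier price."""
    return max(k - s, 0.0) - premium

def short_call(s, k, premium):
    """P&L of writing a naked call struck at k."""
    return premium - max(s - k, 0.0)

def short_put(s, k, premium):
    """P&L of writing a put struck at k."""
    return premium - max(k - s, 0.0)

def covered_call(s, cost, k, premium):
    """Long underlier bought at cost, plus a written call of identical qty."""
    return (s - cost) + short_call(s, k, premium)

def protective_put(s, cost, k, premium):
    """Long underlier bought at cost, plus a bought put of identical qty."""
    return (s - cost) + long_put(s, k, premium)

# Hypothetical numbers: call/put struck at 44 with premium 2; underlier cost 40 or 50
for s in (0.0, 40.0, 44.0, 60.0, 1000.0):
    print(s,
          short_call(s, 44.0, 2.0),
          short_put(s, 44.0, 2.0),
          covered_call(s, 40.0, 44.0, 2.0),
          protective_put(s, 50.0, 45.0, 1.0))
```

In this toy setup the covered call gain is capped on the upside, and the protective put loss is floored on the downside, which is the insurance effect in [3].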

2 questions on tibRV to a Tibco friend #LS

Q1: Why is certified messaging so widely used in trading systems when CM is not as reliable as JMS? I guess if a system already uses Rendezvous, then there’s motivation to avoid mixing it with JMS. If the given system needs higher reliability than the standard non-guaranteed RV delivery, then the easiest solution is CM. Is that how your users think?

A: I don’t think CM is not as reliable as JMS. However you can say CM’s acknowledgment mechanisms are not as flexible as JMS/EMS. The latter supports several acknowledgment modes. An important advantage of RV is it supports multi-cast which makes broadcasting quotes more efficient. JMS 5 supports something similar for topic subscriber but not as natural as RV.

Q2: in JMS, a receiver can 1) poll the queue or 2) wait passively for the queue to call back, using onMessage() — known as a JMS listener. These are the synchronous receiver and asynchronous receiver, respectively. I feel Rendezvous supports asynchronous only, in the form of onMsg(). The poll() or dispatch() methods of the TibrvQueue object don’t return the message, unlike the JMS polling operation.

option price sensitivities, another first look

For a given option,

– people use Black-Scholes to graph theoretical option price against __underlying_spot_price__. Then they get the gradient – delta
– people use Black-Scholes to graph spot option price against _time_ to expiration. Then they get the gradient – theta.
– people use Black-Scholes to plot spot option price against implied _volatility_ for the period to expiry. Then they get the gradient – vega
– people don’t normally plot spot option prices against exercise price (K), although strike price is an important factor.

For short term liquid options, the effect of tweaking the interest rate (like 200bps to 201) is relatively minor. (That gradient is rho.) IR is built into a fwd price. It becomes important for longer-term options.
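The "plot the price, then read off the gradient" idea can be sketched numerically. This is a minimal Python example, assuming the textbook Black-Scholes call formula on a zero-dividend stock and made-up inputs; `bs_call` and `gradient` are my own helper names. Each sensitivity is the slope of the price when one input is bumped and the others are held fixed.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Textbook Black-Scholes European call, zero dividend."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def gradient(f, x, h):
    """Central finite difference: slope of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Hypothetical option: at-the-money, 1 year to expiry, 5% rate, 20% vol
S, K, T, r, sigma = 100.0, 100.0, 1.0, 0.05, 0.20

delta = gradient(lambda s: bs_call(s, K, T, r, sigma), S, 0.01)
vega = gradient(lambda v: bs_call(S, K, T, r, v), sigma, 1e-4)
# Slope against time remaining; theta is usually quoted as the negative of this
slope_vs_time_left = gradient(lambda t: bs_call(S, K, t, r, sigma), T, 1e-4)
rho = gradient(lambda x: bs_call(S, K, T, x, sigma), r, 1e-6)
# Effect of moving the rate by a single basis point
one_bp_effect = bs_call(S, K, T, r + 0.0001, sigma) - bs_call(S, K, T, r, sigma)

print(delta, vega, slope_vs_time_left, rho, one_bp_effect)
```

With these inputs a 1bp rate move shifts the price by well under a cent, in line with the remark that rho matters little unless the option is long-dated.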

monty hall cond-prob-tree, simplified

http://bigblog.tanbin.com/2009/06/monty-hall-paradox.html has a tabular form of probability “tree” which I’m trying to simplify. After you pick a door and host reveals a door, let’s paint the 3 doors

Amber on your pick
Black on the open
Cyan on the other

Now suppose immediately after (or before) your pick, a fair computer system (noisegen) randomly decides where to put the car. Therefore
p(A) := p(computer put car behind Amber) = 1/3
p(C) := p(computer put car behind Cyan ) = 1/3
p(B) := p(computer put car behind Black) = 1/3

Only after (not before) seeing the computer’s decision, host chooses a door to open. My simplified probability Tree below shows switch means 66% win.

Note on the top branch — if computer has put car behind the Black door then host has no choice but open the Cyan door. However, in this context we know B was opened, so this point is tricky to get right.

It’s worthwhile to replace the host with another fair computer. This way, at each branching point we have a fair computer.

Amber door picked
  -33%–> Computer put car behind B -100%–> C opens (A/B 0%) => s/d switch
  -33%–> Computer put car behind A  -50%–> C opens (A 0%)   => don’t switch
                                    -50%–> B opens (A 0%)   => don’t switch
  -33%–> Computer put car behind C -100%–> B opens (A/C 0%) => s/d switch

Now, what’s prob(A has the car | B opens) ie p(chance of winning if we don’t switch). Let’s define
p(b) := p(B gets opened)
p(A|b) = p(b|A)p(A) / { p(b|A)p(A) + p(b|C)p(C) } by Bayes’ theorem.
p(A|b) = 50% × 33% / { 50% × 33% + 100% × 33% } = 33%
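As a cross-check of the arithmetic, here is a small Python sketch of the same Bayes calculation using exact fractions instead of rounded percentages. The dictionary names are mine. The likelihoods are the chances that Black gets opened given where the car is, assuming you picked Amber and the host (or the second fair computer) never opens your door or the car door.

```python
from fractions import Fraction

# Prior: the fair computer puts the car behind each door with equal chance
prior = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# P(Black gets opened | car location), given the contestant picked Amber
likelihood = {
    "A": Fraction(1, 2),  # car behind your pick: host picks B or C at random
    "B": Fraction(0),     # host never opens the car door
    "C": Fraction(1),     # host has no choice but to open B
}

# Total probability that Black gets opened
evidence = sum(likelihood[d] * prior[d] for d in prior)

# Posterior probability of the car location after seeing Black opened
posterior = {d: likelihood[d] * prior[d] / evidence for d in prior}

for door, prob in posterior.items():
    print(door, prob)
```

The posterior on Amber stays at 1/3, so the Cyan door carries the remaining 2/3, which is the "switch means 66% win" result.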

convert between i-volatility & option price (no real eg)

In bond markets, bids/offers are often quoted in bps of yield. Example – Seller and buyer each convert 200bps yield into price and always get the same dollar price. It’s like converting between meter and inch, or between Celsius and Fahrenheit.

The conversion between hazard rate and market quotes is less clear.

Similarly, in many option markets, prices are quoted in implied volatility. Seller and buyer each convert 20%/year[1] to price, using Black-Scholes and always get the same dollar price.
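Here is a minimal Python sketch of that two-way conversion, assuming the textbook Black-Scholes call formula on a zero-dividend stock. The strike, expiry and interest rate are made-up inputs, and `implied_vol` is my own helper using plain bisection, which works because the call price rises steadily with vol. It is one simple way to invert the formula, not how any particular trading desk does it.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Vol -> dollar price (textbook Black-Scholes European call)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0):
    """Dollar price -> vol, by bisection between lo and hi."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical option: spot 388, strike 400, one month to expiry, 2% rate
S, K, T, r = 388.0, 400.0, 1.0 / 12.0, 0.02

price_at_20 = bs_call(S, K, T, r, 0.20)          # quote of 20%/year -> dollars
vol_back = implied_vol(price_at_20, S, K, T, r)  # dollars -> 20%/year again
print(price_at_20, vol_back)
```

Since everything other than vol is held fixed in both directions, seller and buyer land on the same dollar price from the same vol quote, much like the meter/inch conversion.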

Price/vol conversion is similar to price/yield conversion because …… in each case price is a poor parameter for relative value analysis, and investors found a superior apparatus to compare 2 securities — yield for bonds and i-volatility for options. Wikipedia says “Often, the implied volatility of an option is a more useful measure of the option’s relative value than its price.”

Price/vol conversion is similar to price/yield conversion because ….. from (mid-point of quoted) bid/ask prices, you get an implied-vol or implied-yield[3], over the remaining lifetime of the instrument. The vol and yield are both forecasts, verifiable at end of that “lifetime”. I-vol vs Historical-Vol. Implied Yield vs real inflation rate.

Price/vol conversion is similar to price/yield conversion because ….. everything else is held constant in the equation. Specifically,

** underlying price is held constant — Tricky. Even though underlying price changes every minute, investors each try to estimate the volatility of the underlying price over the _next_ month. If UBS assumes stability, then UBS may use a 19%/year vol. As underlying price moves, UBS option pricing will move accordingly, but always lower than a hypothetical Citi trader (who assumes instability).

When a buyer compares 2 competing quotes for the same option, she sees 2 i-vol estimates[2]. 19% by UBS, and 22% by Citi. As underlying price wiggles by the minute, both quotes move, but the Citi quote is always higher.

Next day, UBS uses a higher vol of 21.9%. As before, both quotes move along with underlying price, but now the spread between UBS and Citi quotes narrows.

At any underlier spot price, say underlying S = $388, one can convert the UBS quote to a lower implied vol and convert the Citi quote to a higher implied vol. During this conversion, underlying price is held constant.

** time to expiration is held constant — Tricky. This situation exists in price/yield conversion too.

[1] same dimension as yield? P276 [[complete guide]]
[3] Implied-Yield is not a common term but quite accurate.

monty hall paradox

Q: Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1 [but the door is not opened], and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

— table below (somewhere online) lists all the possible scenarios on the probability “tree” — the only effective tree I know. There are many wrong ways to construct the tree. You can see switching = 66% win.

initial pick       A   A   A   B   B   B   C   C   C
prize location     A   B   C   A   B   C   A   B   C
host can open      BC  C   B   C   AC  A   B   A   AB
outcome | switch’  W   L   L   L   W   L   L   L   W
outcome | switch   L   W   W   W   L   W   W   W   L

—–Let me use extremes to illustrate why switching is Always better.

Say we start with 1,000,000,000 doors. Knowing only one of them has the car, our first pick is almost hopeless so we shrug and pick the red wooden oval door. Now the host opens 999,999,998 doors. You are left with your red door and a blue door.

Now we know deep down the red is almost certain to be worthless. Host has helped eliminate 999,999,998 wrong choices, so he is giving us obvious clues. Therefore the correct Strategy SS is “always switch”. If we play this game 100 times adopting Strategy SS, we win practically every time (we lose only if the first pick was right, a one-in-a-billion chance) — to be verified by simulation.

What about a strategy F — “flip a coin to decide whether to switch”? I  feel this is less wise. The Red has very low potential, whereas the blue is an obvious suspect beyond reasonable doubt.

—–http://www.shodor.org/interactivate/activities/AdvancedMontyHall/ shows a parametrized version, with 10 doors. After you pick a door, 8 worthless doors open. My question is, if I follow “always-switch” i.e. Strategy SS, what’s my chance of winning?

Answer: if my initial pick was right (unlikely — 10% chance), then Strategy SS loses; if initial pick was wrong (likely — 90%) then Strategy SS wins. Answer to the question is 0.90.
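The 0.90 answer, and the 66% for the classic 3-door game, can be checked by simulation. This is a minimal Python sketch; `play` and `win_rate` are my own names, and the host is modelled as opening every goat door except one, so that exactly two doors stay closed: your pick and one other.

```python
import random

def play(n_doors, switch, rng):
    """One game. Returns True if the contestant ends up with the car."""
    car = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)
    if pick == car:
        # Host leaves one random goat door closed besides the pick
        other = rng.choice([d for d in range(n_doors) if d != pick])
    else:
        # Host must leave the car door closed
        other = car
    final = other if switch else pick
    return final == car

def win_rate(n_doors, switch, trials=100_000, seed=1):
    """Fraction of games won over many trials (seeded for repeatability)."""
    rng = random.Random(seed)
    wins = sum(play(n_doors, switch, rng) for _ in range(trials))
    return wins / trials

print("3 doors, always switch:", win_rate(3, True))
print("3 doors, never switch: ", win_rate(3, False))
print("10 doors, always switch:", win_rate(10, True))
```

Raising `n_doors` pushes the always-switch win rate towards 1, which is the point of the billion-door thought experiment.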

How about Bayes’ theorem?