c#delegate instance HAS-A inv list@@

Everything important to a delegate instance is in its inv list, but technically, is there something else in a delegate instance?

The compiler provides no way to modify an inv list, so immutability is guaranteed at the language level. I feel immutability simplifies the physical layout: a simple implementation would be a thin wrapper over the inv list, and the simplest could be nothing but the inv list. That said, every delegate also exposes the Target and Method properties, so there is at least a little more to an instance than the list. A delegate instance is an object on the heap. The inv list is also an object on the heap.

Note Delegate.Remove() is a static method that returns null when it would otherwise produce an empty inv list. Nature abhors a vacuum; c# abhors an empty inv list.
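Here is a minimal sketch of the two points above, assuming nothing beyond the standard System.Delegate API. The class and method names (InvListDemo, CombineLengths, RemovingLastYieldsNull) are made up for illustration.

```csharp
using System;

public static class InvListDemo
{
    // Combining never touches the operands: '+' builds a NEW delegate instance.
    public static (int combined, int original) CombineLengths()
    {
        Action a = () => Console.WriteLine("a");
        Action b = () => Console.WriteLine("b");
        Action both = a + b;
        return (both.GetInvocationList().Length, a.GetInvocationList().Length);
    }

    // Removing the last remaining method yields null, never an empty inv list.
    public static bool RemovingLastYieldsNull()
    {
        Action a = () => Console.WriteLine("a");
        Action b = () => Console.WriteLine("b");
        var both = Delegate.Combine(a, b);
        var onlyA = Delegate.Remove(both, b);
        var nothing = Delegate.Remove(onlyA, a);
        return onlyA != null
            && onlyA.GetInvocationList().Length == 1
            && nothing is null;
    }

    public static void Main()
    {
        Console.WriteLine(CombineLengths());
        Console.WriteLine(RemovingLastYieldsNull());
    }
}
```

The same null shows up with the -= operator, which is why event-raising code usually null-checks the delegate before invoking it.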

More about the immutability – see other posts.

Is a pure-play investment bank really a bank?

Many people wonder why the word “bank” appears in “investment bank”. I mean, do those Goldmans and Morgans have anything in common with the main street banks, such as savings banks or commercial banks? Well, I see some non-trivial similarities.

Similarity – service provider and facilitator. Like a commercial bank, an investment bank is supposed to facilitate clients’ financial strategies. This is obvious in the IB business model (below). It’s less clear in the SS model (below). In an ideal world, an investment bank in its SS role should not do the buy-side type of business (prop-trading) and compete with clients. This ideal world doesn’t exist, but most investment banks do look like sell-side service providers.

Similarity – lending. Like all banks, an investment bank lends money all the time, and it also borrows money from investors (“depositors”) and central banks.

The IB business model described below has so much in common with commercial banking that most of the major investment banks today are part of commercial banks. This model is known as universal banking, adopted by Citi, Barclays, JPMC, UBS/CS, HSBC/SCB/RBS, DB, BNP/SG, RBC etc. If we focus on the investment banking business on its own, there are basically 2 main business models:

IB — The traditional meaning of IB is related to “funding” and “financing” for a big client’s big project, such as a merger or privatization but more commonly bond/equity issue, including public issues a.k.a. IPOs. These are often dressed up, packaged as “advisory business”, but what clients need most is financing, as illustrated in the Napoleonic wars. In such a funding project, the IB does something similar to a regular bank – collect funds from a large number of investors and lend to that particular client. However, the risks, expertise, techniques, operations, competitive strategies … tend to be different from regular commercial banking.

SS — The other major IB business model is playing the sell-side on security markets. This is not just passive order-taking. Many players are also in the business to create structured products. They have an advisory team to actively engage prospective clients and customize their products for each client. See other posts in this blog.

All other IB business models are lesser known but could sometimes generate more profit than the 2 main models:
– asset management — buy-side business model
– prime brokerage
– security lending
– clearance
– investment research

managed ^ unmanaged (dotnet)- what to Manage?

A popular IV question — exactly what runtime Services are provided by the dotnet Hosting runtime/VM to the Hosted/managed code? In other words, how is the “managed code” managed?

Re http://bigblog.tanbin.com/2011/09/what-is-kernel-space-vs-userland.html, I believe an executing thread often has its lowest level stack frames in kernel mode, middle frames in the VM, and top frames running end-user application code. The managed code is like a lawyer working out of a hotel room. The hotel provides her many business services. Now, host environments have always provided essential runtime services to hosted applications, since the very early days of computers. The ubiquitous runtime host environment is the OS. In fact, the standard name of an OS instance is a “hostname”. If you have 3 operating systems running on the same box sharing the CPU/RAM/disk/network then there are 3 hosts, i.e. 3 distinct Hosting-Environments. In the same tradition, the dotnet/java VM also provides some runtime services to hosted applications. The hotel needs “metadata” about the data types, so a dotnet assembly always includes type metadata in addition to IL code.
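To make the “metadata” point concrete, here is a small sketch that asks the runtime to describe a type using only the standard System.Reflection API. The Trade class and the MetadataDemo helper are hypothetical names invented for this illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical data class; any type would do.
public sealed class Trade
{
    public string Symbol { get; set; } = "";
    public int Quantity { get; set; }
}

public static class MetadataDemo
{
    // The runtime answers this from the type metadata stored in the assembly,
    // without any help from the Trade class itself.
    public static string[] PropertyNames(Type t) =>
        t.GetProperties(BindingFlags.Public | BindingFlags.Instance)
         .Select(p => p.Name)
         .OrderBy(n => n, StringComparer.Ordinal)
         .ToArray();

    public static void Main()
    {
        foreach (string name in PropertyNames(typeof(Trade)))
            Console.WriteLine(name);
    }
}
```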

Below is a dotnet-centric answer to the IV question. (JVM? probably similar.) For each, we could drill down if needed (usually unneeded in enterprise apps).

– (uncaught) exception handling. See [[Illustrated c#]] (but how different from unmanaged c++?)
– class loading. See [[Illustrated c#]]
– security
– thread management? But I believe unmanaged code can also get userland threads manufactured by the (unmanaged) thread library.
– reflection. See [[Illustrated c#]]
– instrumentation? Remember jconsole
– easier debugging – no PhD required. Unmanaged code offers “limited” debugging??
– cross-language integration?
– memory management?
** garbage collection. This service is so prominent/important it’s often listed on its own, but I consider it part of mem mgmt.
** memory request/allocation? An ANSI-C app uses a memory management library to grab memory wholesale from the kernel (see other posts) and a VM probably takes over that task.
** translate a C# heapy object reference to a “real virtual” heap address. Both JVM and .Net collectors have to move (non-pinned[1]) objects from one real-virtual address to another real-virtual address. Note the OS (the so-called paging supervisor) translates this real-virtual address into a physical RAM address.
** appDomain. VM isolates each appdomain from other appdomains within a single Process and prevents memory cross-access. See http://msdn.microsoft.com/en-us/library/ms173138(v=vs.80).aspx

[1] pinned objects are not relocatable.
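A hedged sketch of footnote [1], using the standard GCHandle API from System.Runtime.InteropServices. PinDemo and AddressSurvivesGc are illustrative names only. The idea is that a pinned array keeps its address across a collection, whereas a non-pinned object is allowed to move.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinDemo
{
    // Pin a buffer, force a collection, and check the address did not change.
    public static bool AddressSurvivesGc()
    {
        byte[] buffer = new byte[64];
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr before = handle.AddrOfPinnedObject();
            GC.Collect();                       // collector may compact the heap here
            IntPtr after = handle.AddrOfPinnedObject();
            return before == after;             // pinned, so not relocated
        }
        finally
        {
            handle.Free();                      // unpin, otherwise the heap stays fragmented
        }
    }

    public static void Main()
    {
        Console.WriteLine(AddressSurvivesGc());
    }
}
```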

jar ^ c# DLL, briefly

In java, namespace tree (not the inheritance “family” tree), physical directory tree and accessibility are all based on the same tree.

C# decouples them. The namespace tree has no physical manifestation.

The physical organization of files is based on assembly, which is unrelated to namespace.

For a third party library, java would use a jar. C# would use a DLL, which is an assembly. Inside the jar there’s a namespace tree known as a package. An assembly isn’t required to use a unique namespace.
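A tiny sketch of the decoupling on the C# side. The namespaces and class names (Pricing.Models, Risk.Reports, Bond, VarReport) are invented for illustration. Two unrelated namespaces live in one source file and end up in one assembly, with no directory tree involved.

```csharp
using System;

namespace Pricing.Models
{
    public class Bond { }
}

namespace Risk.Reports
{
    public class VarReport { }
}

public static class NamespaceDemo
{
    // Both types are compiled into the same assembly despite different namespaces.
    public static bool SameAssembly() =>
        typeof(Pricing.Models.Bond).Assembly == typeof(Risk.Reports.VarReport).Assembly;

    public static void Main()
    {
        Console.WriteLine(typeof(Pricing.Models.Bond).Namespace);
        Console.WriteLine(typeof(Risk.Reports.VarReport).Namespace);
        Console.WriteLine(SameAssembly());
    }
}
```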

c# regex-match backslashes in strings

My suggestion: first find a “safe” character that’s guaranteed not to show up in the original string, like “_”. Replace all backslashes with it. Then proceed.

The problem with backslashes is the unnecessary complication. Here I want to match “one-or-more backslashes”. In the end I need to put 4 backslashes in the pattern to represent that “one”.

var ret = Regex.Replace(@"any number of\\\backslashes", "(.+\\\\+)?(.+)", "$1 - $2");

Alternatively, I could use @ to reduce the complexity: @"(.+\\+)?(.+)"

Disappointingly the @ does a partial job. We still need 2 strokes — Confusing! I’d rather just remember one simple rule and avoid the @ altogether
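To check the stroke counting, here is a small sketch using only the standard Regex.Replace. BackslashDemo and its member names are made up. The regular literal and the verbatim literal turn out to be the very same pattern string, which is why the @ form still needs 2 strokes: the regex engine itself wants an escaped backslash.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackslashDemo
{
    // Regular literal: 4 strokes -> 2 characters -> regex reads one literal backslash.
    public const string RegularPattern = "(.+\\\\+)?(.+)";

    // Verbatim literal: 2 strokes -> 2 characters -> same regex.
    public const string VerbatimPattern = @"(.+\\+)?(.+)";

    public static string Split(string input, string pattern) =>
        Regex.Replace(input, pattern, "$1 - $2");

    public static void Main()
    {
        string input = @"any number of\\\backslashes";   // 3 real backslashes
        Console.WriteLine(RegularPattern == VerbatimPattern);
        Console.WriteLine(Split(input, RegularPattern));
        Console.WriteLine(Split(input, VerbatimPattern));
    }
}
```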

%%jargon – Consumer coder, Consumer class

When we write a utility, an API, or a data class to be used by other programmers or as components (or “services” or “dependencies”) in other modules, we often strain to find an unambiguous and distinct term that refers to “the other side” whom we are working to serve. The common choices of words are all ambiguous due to overload —
“Client” can mean client-server.
“User” can mean business user.
“App developer”? Both I and “the other side” are app developers.

My Suggestions —

How about “downstream coder”, or “downstream classes”, or “downstream app” ?

How about “upper-layer coder”, “upper-layer classes”, “upper-layer app”, “upper-layer modules”
How about “upper-level coder”, “upper-level classes”, “upper-level app”, “upper-level modules”
How about “Consumer coder”, “Consumer class”, or “Consumer app”?

##coding guru tricks (tools) learnt across Wall St teams

(Blogging. No need to reply.)

Each time I join a dev team, I tend to meet some “gurus” who show me a trick. If I am in a team for 6 months without learning something cool, that would be a low-calibre team. After Goldman Sachs, I don’t remember a Sybase developer who showed me a cool Sybase SQL trick (or any generic SQL trick). That’s because my GS colleagues were too strong in SQL.

After I learn something important about an IDE, in the next team again I become a newbie to the IDE since this team uses other (supposedly “common”) features.

eg: remote debugging
eg: hot swap
eg: generate proxy from a web service
eg: attach debugger to running process
eg: visual studio property sheets
eg: MSBuild

I feel this happens to a lesser extent with a programming language. My last team uses some c++ features and next c++ team uses a new set of features? Yes but not so bad.

Confucius said “Among any 3 people walking by, one of them could be a teacher for me”. That’s what I mean by guru.

Eg: a Barcap colleague showed me how to make a simple fixed-size cache with FIFO eviction-policy, based on a java LinkedHashMap.
Eg: a guy showed me a basic C# closure in action. Very cool.
Eg: a Taiwanese colleague showed me how to make a simple home-grown thread pool.
Eg: in Citi, I was lucky enough to have a lot of spring veterans in my project. They seem to know 5 times more spring than I do.
Eg: a sister team in GS had a big, powerful and feature-rich OO design. I didn’t know the details but one thing I learnt was that the entire OO thing has a single base class.
Eg: GS guys taught me remote debugging and hot replacement of a single class
Eg: a guy showed me how to configure windows debugger to kick-in whenever any OS process dies an abnormal death.
Eg: GS/Citi guys showed me how to use spring to connect jconsole to the JVM management interface and change object state through this backdoor.
Eg: a lot of tricks to investigate something that’s supposed to work
Eg: a c# guy showed me how to consolidate a service host project and a console host project into a single project.
Eg: a c# guy showed me new() in generic type parameter constraints
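For the two c# items in the list above, here is roughly what those tricks look like. This is my own minimal reconstruction, not the code my colleagues showed me; TricksDemo, MakeCounter and Create are illustrative names.

```csharp
using System;
using System.Text;

public static class TricksDemo
{
    // Closure: the returned lambda captures the local variable 'count',
    // which therefore outlives this method call.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }

    // new() constraint: T must have a public parameterless constructor,
    // so generic code is allowed to write 'new T()'.
    public static T Create<T>() where T : new() => new T();

    public static void Main()
    {
        Func<int> next = MakeCounter();
        Console.WriteLine(next());
        Console.WriteLine(next());

        StringBuilder sb = Create<StringBuilder>();
        Console.WriteLine(sb.Length);
    }
}
```

Each call to MakeCounter gets its own captured variable, so two counters never interfere with each other.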

These tricks can sometimes fundamentally change a design (of a class, a module or sub-module)

Length of experience doesn’t always bring a bag of tricks. It’s ironic that some team could be using, say, java for 10 years without knowing hot code replacement, so these guys had to restart a java daemon after a tiny code change.

Q: do you know anyone who knows how to safely use Thread.java stop(), resume(), suspend()?
Q: do you know anyone who knows how to read a query plan and predict the physical/logical I/O statistics spit out by a database optimizer?

So how do people discover these tricks? Either learn from another guru or by reading. Then try it out, iron out everything and make the damn code work.

back testing a VaR process, a few points

–Based on http://www.jpmorgan.com/tss/General/Back_Testing_Value-at-Risk/1159398587967

Let me first define my terminology. If your VaR “window” is 1 week, that means you run it on Day 1 to forecast the potential loss from Day 1 to Day 7. You can run such a test once a day, or once every 2 days etc. It’s up to you.

The VaR as a big, complicated process is supposed to be a watchdog over the traders and their portfolios, but how reliable is this watchdog? VaR is a big system and big Process involving multiple departments, hundreds of software modules, virtually the entire universe of derivatives and other securities + pricing models for each asset class. Most of these have inherent inaccuracies and unreliability. The most visible inaccuracy is in the models (including realized volatilities).

VaR is a “policeman”, but who will police the policeman? Regular back testing is needed to keep the policeman honest, i.e. keep VaR realistic and consistent with market data. Otherwise VaR can become a white elephant and an emperor’s new clothes.
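One common way to police the policeman is to count “breaches”, i.e. the windows where the realized loss exceeded the VaR forecast, and compare the count with what the confidence level implies. The sketch below is my own assumption of such a check, not something taken from the JPMorgan page. VarBackTest, CountBreaches and ExpectedBreaches are hypothetical names, and losses and forecasts are both expressed as positive numbers.

```csharp
using System;

public static class VarBackTest
{
    // A breach is a window whose realized loss exceeded the forecast VaR.
    public static int CountBreaches(double[] varForecasts, double[] realizedLosses)
    {
        if (varForecasts.Length != realizedLosses.Length)
            throw new ArgumentException("Need one realized loss per forecast.");

        int breaches = 0;
        for (int i = 0; i < varForecasts.Length; i++)
        {
            if (realizedLosses[i] > varForecasts[i]) breaches++;
        }
        return breaches;
    }

    // At 99% confidence we expect roughly 1% of the windows to breach.
    public static double ExpectedBreaches(int observations, double confidence) =>
        observations * (1.0 - confidence);

    public static void Main()
    {
        double[] forecasts = { 10, 10, 10, 10 };   // made-up numbers
        double[] losses    = { 5, 12, 9, 11 };
        Console.WriteLine(CountBreaches(forecasts, losses));
        Console.WriteLine(ExpectedBreaches(250, 0.99));
    }
}
```

Far more breaches than expected suggests the VaR is too optimistic; far fewer suggests it is too conservative to be useful.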

some modules of an algo trading platform – chat with YH

A few modules I know

* MM (mkt datafeed reader)
* OMS (or order book, or EMS, including exchange gateways and smart order routers)
* PP (pretrade pricer)
* BB (trade booking engine, when an execution comes back successful. BB is the main module affecting the position master DB)
* PNL (real time pnl engine)
* RR (real time risk engine)

I guess BB, PP or PNL might be absorbed into the OMS module. In some places, OMS might be the name for any real time component of the entire system, so MM, PP, BB, PNL, RR (all real time) can all be considered part of OMS in that sense. There’s a narrower definition of OMS though, so I will just refer to that definition as the EMS (execution management system).

For a micro hedge fund, PP EMS RR may not exist, and the rest (MM BB PNL) can be manual. In contrast algo shops must automate all steps.

Where does MOM fit in? It’s an enabling technology, usable in many of the functional modules above.

How does dynamic data fabric fit in? Also known as data grid or in-memory DB. There are 2 types
* generic technology, often integrating SQL, MOM technologies
* functional module with complex business logic, such as CEP engines, often tightly integrated with MM OMS PP RR

These two descriptions are not mutually exclusive. Many data grid systems include features of both. However, for simplicity, in this write-up I treat data grid as the generic technology just like a DB or MOM.

How does reference data fit in? I guess it’s just another separate service that interfaces with the database and in turn provides reference data to other modules when needed?

Ref data tends to be fairly static; update frequency would be once every few hours at most, right? It is essentially a readonly component to the algo engine modules. Readonly means those modules don’t update ref data. It’s a one-way dependency. MM is also readonly, but far more dynamic. MM is the driver in an event-driven algo engine.

I feel ref data read frequency can be high, but update frequency is low. Actually, I feel the occasional update can be a performance issue. Those dependent modules can’t simply cache ref data if ref data can change mid-day. There are techniques to address this issue. A fast engine must minimize DB and network access, so if ref data is provided on the network, then every read would be costly.
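One such technique, sketched under my own assumptions: cache locally, and let the ref data service push an “updated” notification so the dependent module evicts just that key. RefDataCache, OnRefDataUpdated and the loader delegate are hypothetical names; the real notification would probably arrive over MOM.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public sealed class RefDataCache
{
    private readonly Func<string, string> _loadFromSource;   // stands in for the costly network/DB read
    private readonly ConcurrentDictionary<string, string> _cache =
        new ConcurrentDictionary<string, string>();
    private int _loadCount;

    public RefDataCache(Func<string, string> loadFromSource)
    {
        _loadFromSource = loadFromSource;
    }

    // Number of costly reads so far (the factory may run more than once under contention).
    public int LoadCount => _loadCount;

    // Hot path: served from memory after the first read.
    public string Get(string key) =>
        _cache.GetOrAdd(key, k =>
        {
            Interlocked.Increment(ref _loadCount);
            return _loadFromSource(k);
        });

    // Called when the ref data service announces a mid-day change.
    public void OnRefDataUpdated(string key) => _cache.TryRemove(key, out _);
}

public static class RefDataDemo
{
    public static void Main()
    {
        var source = new Dictionary<string, string> { ["IBM.lotSize"] = "100" };
        var cache = new RefDataCache(k => source[k]);

        Console.WriteLine(cache.Get("IBM.lotSize"));
        source["IBM.lotSize"] = "50";              // mid-day update at the source
        cache.OnRefDataUpdated("IBM.lotSize");     // notification evicts the stale entry
        Console.WriteLine(cache.Get("IBM.lotSize"));
        Console.WriteLine(cache.LoadCount);
    }
}
```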

Some modules are simpler, like PNL and BB, so no big deal. The pricing lib is used for PP and perhaps RR, which are quantitatively complex. MM and EMS are technically challenging.

The “algo” tends to be in the MM PP EMS and RR modules.

A high frequency shop also needs to assess market impact. Not sure where it fits in, perhaps the EMS.

Where does STP fit in? I feel STP is largely non-realtime.