social class]U.S. n%%chosen tech domain

Posted on September 27, 2015 (updated May 8, 2020) by BinTAN

I used to feel the US is a less class-conscious society than China or Singapore. Anyone can make it in this “free”, meritocratic country. Then “insiders” told me about the old boys’ circle, and the alumni circles on Wall St.

I feel in any unequal, hierarchical society, there are invisible walls between social strata. I was lucky to be an immigrant in technology. If I step out of tech into management, I am likely to face class, racial bias/affinity and … I would ~~no longer be “in-demand”~~ as in tech. Look at the number of Chinese managers in GS. Many make VP but few rise further.

Therefore the tech role is a sweet spot for an immigrant techie like me. Besides tech, a few professions are perhaps less hierarchical – trading, medical, academic, research(?), teaching …

quant developer requirements

Posted on September 27, 2015 (updated August 12, 2019) by BinTAN

Many quant developers (in our department) program in c# (for excel plugin) or build infrastructure code modules around quant lib, but they don't touch c++ quant business logic classes. C++ quant lib (model) programming is reserved for the mathematicians, typically PhDs.

Many of these non-C++ quant developers have good product knowledge and can sometimes move into business side of trading.

I was told these quant developers don't need advanced math knowledge.

—-quant interviews

Mostly C++ questions. Most candidates are filtered out here.

2nd group – probability, (different from statistics)

Some finance intuitions (eg — each item in the BS formula)

Some brain teasers

— some typical C++ questions (everything can be found in the Scott Meyers books)

exceptions during ctor/dtor

virtual ctor

Given a codebase, how do you detect memory leak

multiple inheritance (fairly common in practice)

threading

—–

[[heard on the street]] and [[A Practical Guide To Quantitative Finance Interviews]]

Another book by Shreve.

##some benefits@learning c++, even if no salary increase

Posted on September 27, 2015 (updated August 12, 2019) by BinTAN

After learning c++, I am fairly ~~confident~~ I could, if I must, pick up c# in a few (4?) months and start passing interviews. C++ is inherently tougher than java and C#. Java and C# both have large libraries, but the core languages are significantly simpler/cleaner than c++.
- After learning C++, I have found python and perl easier to understand and master, since the main implementations of both are written in C. I now believe the people who claim they could pick up a new language in a few months. Those languages have their roots in C/C++.
- The basic challenges of scope+namespace, object lifetime, heap/stack, pointers, memory allocation, object construction, pass-by-ref/value, arrays, function pointer, exceptions, nested struct+array+pointer… are faced by every language designer. Many of these challenges depend on the basic library, which is invariably C.
- The common OO challenges of inheritance, virtual, static/non-static, HAS-A/IS-A, constructor, downcast, … are faced by every OO language designer. Many of them borrow from java, which borrows from C++ and smalltalk.
- threading – java remains the gold standard, but c++ concurrency support is more complex, harder to understand and offers some low-level insight
- memory management – c++ offers insight into JVM and CLR
- c++ gave me other insight into java, esp. GC, JVM, overriding, references, heap/stack, sizeof, …

y front office

Posted on September 27, 2015 (updated August 12, 2019) by BinTAN

* exposure to pricing decisions — the most important decisions

* closer to traders and their decision support

* closer to profit center

[[c++recipes]] mv-semantic etc

Posted on September 26, 2015 (updated March 4, 2018) by BinTAN

I find this book rather practical. Many small programs are fully tested and demonstrated.

This 2015 Book covers cpp14.

–#1) RVR(rval ref) and move semantic:
This book offers just enough detail (over 5-10 pages) to show how move ctor reduces waste. Example class has a large non-ref field.

P49 shows move(), but P48 shows even without a move() call the compiler is able to *select* the move ctor not copy-ctor when passing an instance into a non-ref parameter. The copy ctor is present but skipped!

P49 shows an effective mv-ctor can be “=default; “
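The selection between copy ctor and move ctor can be sketched roughly as below. This is not the book's code; `Big`, `BigDefault`, `consume` and `demo` are hypothetical names, and the counters exist only to make the chosen constructor visible.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class with a large non-reference field, plus counters
// that record which constructor the compiler selected.
struct Big {
    std::vector<double> payload;   // the large non-ref field
    static int copies;
    static int moves;

    Big() : payload(100000, 1.0) {}
    Big(const Big& rhs) : payload(rhs.payload) { ++copies; }
    Big(Big&& rhs) noexcept : payload(std::move(rhs.payload)) { ++moves; }
};
int Big::copies = 0;
int Big::moves = 0;

// Same idea with a defaulted move ctor: memberwise move of the vector.
struct BigDefault {
    std::vector<double> payload = std::vector<double>(1000, 1.0);
    BigDefault() = default;
    BigDefault(const BigDefault&) = default;
    BigDefault(BigDefault&&) = default;
};

// Non-ref (by-value) parameter: the argument is copy- or move-constructed.
std::size_t consume(Big b) { return b.payload.size(); }

// Returns {copies, moves} seen for one lvalue call and one move() call.
std::pair<int, int> demo() {
    Big::copies = Big::moves = 0;
    Big x;
    consume(x);              // x is an lvalue: copy ctor is selected
    consume(std::move(x));   // cast to rvalue: move ctor is selected
    return {Big::copies, Big::moves};
}
```

When the argument is a genuine temporary, e.g. `consume(Big())`, the compiler is allowed (and since C++17 required) to construct it directly in the parameter, so neither counter may change in that case.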

–custom new/delete to trace memory operations
Sample code showing how the delete() can show where in source code the new() happened. This shows a common technique — allocating an additional custom memory header when allocating memory.

This is more practical than the [[effC++]] recipe.

There’s also a version for array-new. The class-specific-new doesn’t need the memory header.
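The memory-header technique can be sketched as follows. This is an assumption-laden illustration, not the book's listing: `AllocHeader`, `traced_alloc`, `header_of` and `traced_free` are hypothetical names, and the sketch uses plain functions instead of replacing the global operator new/delete.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical header stored immediately before each user block.
// alignas keeps the user block aligned for any ordinary type.
struct alignas(std::max_align_t) AllocHeader {
    const char* file;   // source file that requested the memory
    int line;           // source line that requested the memory
    std::size_t size;   // bytes requested by the caller
};

// Allocate the header plus size bytes; return the address just past the header.
void* traced_alloc(std::size_t size, const char* file, int line) {
    void* raw = std::malloc(sizeof(AllocHeader) + size);
    if (!raw) throw std::bad_alloc();
    AllocHeader* h = new (raw) AllocHeader{file, line, size};
    return h + 1;
}

// Step back from the user pointer to find where the allocation came from.
const AllocHeader* header_of(const void* user) {
    return static_cast<const AllocHeader*>(user) - 1;
}

// Release the whole block, header included.
void traced_free(void* user) {
    if (!user) return;
    std::free(static_cast<AllocHeader*>(user) - 1);
}

// Convenience macro that records the call site automatically.
#define TRACED_ALLOC(n) traced_alloc((n), __FILE__, __LINE__)
```

A custom `operator delete` built this way can print `header_of(p)->file` and `header_of(p)->line` before freeing, which is how the deallocation site can report where the allocation happened.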

–other
A simple example code of weak_ptr.

a custom small block allocator to reduce memory fragmentation

Using promise/future to transfer data between a worker thread and a boss thread

mv-semantic: keywords

Posted on September 25, 2015 (updated December 7, 2018) by BinTAN

I feel all the tutorials seem to miss some important details and are selling propaganda. Maybe [[c++ recipes]] is better?

[s = I believe std::string is a good illustration of this keyword]

[s] allocation – mv-semantic efficiently avoids memory allocation on heap or on stack
[s] resource — is usually allocated on heap and accessed via a pointer field
[s] pointer field – every tutorial shows a class with a pointer field. Note a reference field is much less common.
[s] deep-copy – is traditional. Mv-semantics uses some special form of shallow-copy. Has to be carefully managed.
[s] temp – the RHS of mv-semantic is normally a temp object (an rvalue). I believe by using the move() function and the r-val reference (RVR) on a named object, we promise the compiler not to rely on that object's value afterwards. For standard library types the moved-from object is left in a valid but unspecified state, so touching it is not UndefBehv in itself, but its value is meaningless until we reassign it. See [[c++standard library]]
promise – see above
containers – All standard STL container classes (including std::string) provide mv-semantics. Here, the entire container instance is the payload! Inserting a float into a container won’t need mv-semantics.
[s] expensive — allocation and copying assumed expensive. If not expensive, then the move is not worthwhile.
[s] robbed — the source object of the move is crippled, robbed, abandoned and should not be used afterwards. Its “resource” is already stolen, so the pointer field to that resource should be set to NULL.
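Several of these keywords (resource, pointer field, deep-copy, robbed) can be seen together in one small class. The sketch below is only an illustration; `Buffer` is a hypothetical string-like class, not how std::string is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical string-like class: the "resource" is a heap buffer
// reached through a pointer field.
class Buffer {
    char* data_;
    std::size_t len_;
public:
    explicit Buffer(const char* s)
        : data_(new char[std::strlen(s) + 1]), len_(std::strlen(s)) {
        std::memcpy(data_, s, len_ + 1);
    }
    // Traditional deep copy: a fresh allocation plus a byte-by-byte copy.
    Buffer(const Buffer& rhs) : data_(new char[rhs.len_ + 1]), len_(rhs.len_) {
        std::memcpy(data_, rhs.data_, len_ + 1);
    }
    // Move: shallow-copy the pointer, then null out the robbed source
    // so its destructor does not free the stolen buffer.
    Buffer(Buffer&& rhs) noexcept : data_(rhs.data_), len_(rhs.len_) {
        rhs.data_ = nullptr;
        rhs.len_ = 0;
    }
    // By-value parameter: covers both copy-assignment and move-assignment.
    Buffer& operator=(Buffer rhs) {
        std::swap(data_, rhs.data_);
        std::swap(len_, rhs.len_);
        return *this;
    }
    ~Buffer() { delete[] data_; }   // delete[] on a null pointer is a no-op

    const char* data() const { return data_; }
    std::size_t size() const { return len_; }
};
```

After `Buffer b(std::move(a));` the new object holds the very same heap address that `a` held, with no allocation and no copying, while `a` is left holding a null pointer.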

——–
http://www.boost.org/doc/libs/1_59_0/doc/html/move/implementing_movable_classes.html says “Many aspects of move semantics can be emulated for compilers not supporting rvalue references and Boost.Move offers tools for that purpose.” I think this sheds light…

mv-semantic : use cases rather few

Posted on September 25, 2015 (updated August 30, 2020) by BinTAN

I think the use case for mv-constructs is tricky. In many simple contexts mv-constructs actually don’t work.

Justification for introducing mv-semantic is clearest in one scenario — a short-lived but complex stack object is passed by value into a function. The argument object is a temp copy — unnecessary.

Note the data type should be a complex type like containers (including string), not an int. In fact, as explained in the post on “keywords”, there’s usually a pointer field and allocation.

Other use cases are slightly more complex, and the justification is weaker.

Q: [[c++standard library]] P21 says ANY nontrivial class should provide a mv ctor AND a mv-assignment. Why? (We assume there’s pointer field and allocation involved if no mv-semantics.)
%%A: To avoid making temp copies when inserting into container. I think vector relocation also benefits from mv-ctor

[[c++forTheImpatient]] P640 shows that sorting a vector of strings can benefit from mv-semantic. Swapping 2 elements in the vector requires a pointer swap rather than copying the strings.
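The container argument can be checked with a small counting class. This is a sketch under my own assumptions; `Item` and `insert_and_grow` are hypothetical names. It inserts a temporary into a vector and then forces a relocation.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts copy and move constructions.
struct Item {
    std::string name;
    static int copies;
    static int moves;

    explicit Item(std::string n) : name(std::move(n)) {}
    Item(const Item& rhs) : name(rhs.name) { ++copies; }
    // noexcept matters: vector only moves elements during relocation
    // when the move ctor cannot throw (or no copy ctor is available).
    Item(Item&& rhs) noexcept : name(std::move(rhs.name)) { ++moves; }
};
int Item::copies = 0;
int Item::moves = 0;

// Returns {copies, moves} after one insertion and one forced relocation.
std::pair<int, int> insert_and_grow() {
    Item::copies = Item::moves = 0;
    std::vector<Item> v;
    v.reserve(1);
    v.push_back(Item("a"));        // temporary binds to push_back(T&&): moved in
    v.reserve(v.capacity() + 1);   // relocation: existing element is moved
    return {Item::copies, Item::moves};
}
```

Removing `noexcept` from the move ctor is a useful experiment: the relocation step then falls back to the copy ctor, which is one reason a nontrivial class wants a proper move ctor.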

returning RVR #Josuttis

Posted on September 25, 2015 (updated August 23, 2019) by BinTAN

My rule of thumb is to avoid a RVR return type, even though Josuttis did NOT forbid it by saying anything like return type should never be rvr.

Instead of an rvr return type, I feel in most practical cases we can achieve the same result using a nonref return type. I think such a function call usually evaluates to a nameless temp object, i.e. a naturally-occurring rvalue object.

[[Josuttis]] (i.e. the c++Standard library) P22 explains the rules about returning rval ref.

In particular, it’s a bad idea to return a newly-created stack object by rvr. This object is a nonstatic local and will be wiped out after the function returns.

(It’s equally bad to return this object by l-value reference.)
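A minimal sketch of the rule of thumb, with hypothetical function names (`make_dangling`, `make_greeting`, `take_rvalue`). The dangerous version is left commented out on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Bad idea (kept as a comment): the local is destroyed when the function
// returns, so the caller would receive a dangling reference.
// std::string&& make_dangling() {
//     std::string local = "temp";
//     return std::move(local);
// }

// Nonref return type: the call expression itself is a nameless temporary,
// i.e. a naturally-occurring rvalue.
std::string make_greeting() {
    std::string local = "hello";
    local += ", world";
    return local;   // moved or elided, never dangling
}

// The returned temporary binds directly to an rvalue-reference parameter.
std::size_t take_rvalue(std::string&& s) {
    std::string stolen = std::move(s);   // steal the temporary's buffer
    return stolen.size();
}
```

So `take_rvalue(make_greeting())` compiles and moves without any function ever declaring an rvr return type.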

lambda meets template

Posted on September 20, 2015 (updated December 7, 2017) by BinTAN

In cpp, java and c#, the worst part of lambda is the integration with (parametrized) templates.

In each case, we need to understand the base technology and how it integrates with templates, otherwise we will be lost. The base technologies are (see post on “lambda – replicable”)
– delegate (c#)
– anon nested class (java)
– functor (c++)

Syntax is bad but not the worst. Don’t get bogged down there.
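For the c++ case, the relation between functor, lambda and template can be sketched like this. The names `GreaterThan`, `count_matching`, `count_with_functor` and `count_with_lambda` are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// The base technology: a hand-written functor, i.e. a class with operator().
struct GreaterThan {
    int limit;
    bool operator()(int x) const { return x > limit; }
};

// Function template: Pred is deduced either as the functor type or as the
// compiler-generated closure type of a lambda.
template <typename Pred>
int count_matching(const std::vector<int>& v, Pred pred) {
    int n = 0;
    for (int x : v) {
        if (pred(x)) ++n;
    }
    return n;
}

int count_with_functor(const std::vector<int>& v, int limit) {
    return count_matching(v, GreaterThan{limit});
}

int count_with_lambda(const std::vector<int>& v, int limit) {
    // The capture [limit] plays the role of the functor's data member.
    return count_matching(v, [limit](int x) { return x > limit; });
}
```

Once the lambda is read as shorthand for a functor like `GreaterThan`, the template side is ordinary type deduction on an unnamed class type.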

lambda is more industry-standard than delegate

Posted on September 20, 2015 by BinTAN

Before java and c++ introduced lambda, I thought delegate was the foundation of lambdas.

Now I think lambda is an industry standard, implemented differently in c++ and java. See post on “lambda – replicable”. For python…

Bear in mind
A) the most fundamental, and pure definition of lambda — a function as rvalue, to be passed in as argument to other functions.

B1) the most common usage is sequence processing in c#, java and c++
* c# introduced lambda along with linq
* java introduced lambda along with streams

B2) 2nd common usage is event handler including GUI.

See post on “2 fundamental categories”
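Both usages can be sketched in c++ as below. `sum_even_squares` and `Button` are hypothetical names; the point is only that each lambda is a function value created on the spot and handed to other code.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <numeric>
#include <utility>
#include <vector>

// B1) sequence processing: each lambda is passed as an argument
// to a standard algorithm.
int sum_even_squares(const std::vector<int>& v) {
    std::vector<int> evens;
    std::copy_if(v.begin(), v.end(), std::back_inserter(evens),
                 [](int x) { return x % 2 == 0; });
    return std::accumulate(evens.begin(), evens.end(), 0,
                           [](int acc, int x) { return acc + x * x; });
}

// B2) event handler: a hypothetical GUI-like button stores callbacks
// and invokes them when clicked.
struct Button {
    std::vector<std::function<void()>> handlers;

    void on_click(std::function<void()> h) { handlers.push_back(std::move(h)); }
    void click() const {
        for (const auto& h : handlers) h();
    }
};
```

A caller would register a handler with something like `button.on_click([&clicks] { ++clicks; });`, where `clicks` is a local counter captured by reference.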

noSQL top 2 categories: HM^json doc store

Posted on September 20, 2015 (updated June 5, 2017) by BinTAN

Xml and json both support hierarchical data, but they are basically one data type. Each document is the payload. This is the 2nd category of noSQL system. The #1 category is the key-value store, i.e. hashmap (HM), the most common category. The other categories (columnar, or graph) aren’t popular in the finance projects I know.

coherence/gemfire/gigaspace – HM
terracotta – HM
memcached – HM
oracle NoSQL – HM
Redis – HM
Table service (name?) in Windows Azure – HM
mongo – document store (json)
CouchDB – document store (json)
Google BigTable – columnar
HBase – columnar

big data feature: variability-in-biz-Value

Posted on September 20, 2015 (updated August 9, 2020) by BinTAN

RDBMS – every row is considered “high value”. In contrast, a lot of data items in a big data store are considered low-value.

The oracle nosql book refers to it as “variability of value”. The authors clearly think this is a major feature, a 4th “V” beside Volume, Velocity and Variety-of-data-format.

As a result, data loss is often tolerable in big data systems (but never acceptable in RDBMS). Exceptions, IMHO:
* columnar database such as kdb
* Quartz, SecDB

big data tech feature: inexpensive hardware

Posted on September 19, 2015 (updated November 12, 2017) by BinTAN

See post on variability

Economics — data volume often necessitates inexpensive storage. Commodity hardware is a key feature of big data.

“Inexpensive” helps scale-out (aka horizontal scaling). Just add more nodes. In contrast, RDBMS requires scale-up to bigger machines. See other posts on scale-out.

big data tech feature: scale out

Posted on September 19, 2015 (updated September 14, 2019) by BinTAN

Scalability is driven by one of the 4 V’s — Velocity, aka throughput.

Disambiguation: having many machines to store the data as readonly isn’t “scalability”. Any non-scalable solution could achieve that without effort.

Big data often requires higher throughput than RDBMS could support. The solution is horizontal rather than vertical scalability.

I guess gmail is one example. It requires massive horizontal scalability. I believe RDBMS also has similar features such as partitioning, but I'm not sure if it is economical. See posts on “inexpensive hardware”.

The Oracle nosql book suggests that noSQL, compared to RDBMS, is more scalable, by a factor of 10 or more.

~~RDBMS can also scale out — PWM used partitions.~~

noSQL and ACID

Posted on September 19, 2015 (updated August 9, 2020) by BinTAN

See big data feature: variability-in-biz-Value, the 4th V of big data.

A noSQL software could support transactions as RDBMS does, but the feature support is minimal in noSQL, according to the Oracle noSQL book.

Transactions slow down throughput, esp. write-throughput like create/update/delete. Read throughput is also affected because of locking, among other things.

In a big data site, not all data items are high value, so ACID transaction properties may be overkill and not worthwhile.

— The A/C/I/D

Atomicity – the most visible, best-known feature, but it often overshadows the other three
Consistency – mostly about invariants. If a transaction passes all the validations and constraints (and commits), and those rules are comprehensively defined, then the operation is very likely to be correct. However, if the invariant rules are simplistic and superficial, then consistency doesn’t mean much. The data may still be written incorrectly.
Isolation – mostly about concurrent operations, which should not affect the final state of everything after the dust settles. Concurrent or serialized execution should leave the data store in the same state.
Durability – is about back-up and redo log. After a power failure in the middle of a transaction, the restored data store should have that transaction rolled back and all earlier committed operations reflected.

noSQL feature #1 – unstructured

Posted on September 19, 2015 (updated December 31, 2015) by BinTAN

I feel this is the #1 feature. RDBMS data is very structured. Some call it rigid.
– Column types
– unique constraints
– non-null constraints
– foreign keys…
– …

In theory a noSQL data store could have the same structure, but usually it doesn’t. I believe noSQL software doesn’t have such a rich and complete feature set as an RDBMS.

I believe real noSQL sites usually deal with unstructured data. “Free form” is my word.

Rigidity means harder to change the “structure”. Longer time to market. Less nimble.

What about BLOB/CLOB? Supported in RDBMS but more like an afterthought. There are specialized data stores for them. Some noSQL software may qualify.

Personally, I feel RDBMS (like unix, http, TCP/IP…) has proven to be flexible, adaptable and resilient over the years. So I would often choose RDBMS when others prefer a noSQL solution.

WallSt friends’ comment@slow coder,deadlines

Posted on September 15, 2015 (updated November 3, 2020) by BinTAN

This is life-n-death: if you are not adding enough value you are out…

With important exceptions (Stirt, Lab49..) Wall Street systems are stringent about timelines, less so about system failures, and even less about maintainability, total cost of ownership or testing. I feel very few (like 5%) Wall St systems are high precision, and I include the pricing, risk and trade execution systems. Numerical accuracy is important to the users though, because those numbers are about the only thing they can understand. Their job is about control over those numbers.

In Citi muni, Boris’s code was thrown out because it didn’t make it to production. Any production code is controlled not by the dev team but by many, many layers of control measures. So my production code in Citi will live.

If you are slow, Anthony Lin feels they may remove you and get a replacement to hit the deadline. If they feel it’s hard to find a replacement and train him up, then they keep you – it's all about timelines.

Hou Li felt your effort does protect you – 8.30 to 6pm every day. If you are still slow, then the manager may agree the estimate is wrong. She felt deadlines and effort estimates are arbitrary. However, if you are obviously slower than your peers, then the boss knows it.

equivalent FX(+option) trades, succinctly

Posted on September 11, 2015 (updated September 18, 2018) by BinTAN

The equivalence among FX trades can be confusing to some. I feel there are only 2 common scenarios:

1) Buying usdjpy is equivalent to selling jpyusd.
2) Buying usdjpy call is equivalent to Buying jpyusd put.

However, buying an fx option is never equivalent to selling an fx option. The seller wants (implied) vol to drop, whereas the buyer wants it to increase.
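Scenario 2 can be checked with a short payoff calculation. The notation is my own assumption, not from any textbook quoted here: $S_T$ is the usdjpy rate at expiry (JPY per USD), $K$ is the strike of the usdjpy call on 1 USD, and the matching jpyusd put has strike $1/K$ (USD per JPY) on a notional of $K$ JPY.

```latex
\begin{align*}
\text{usdjpy call on 1 USD, payoff in JPY:}\quad
  & \max(S_T - K,\ 0) \\[4pt]
\text{jpyusd put on } K \text{ JPY, payoff in USD:}\quad
  & K \cdot \max\!\left(\frac{1}{K} - \frac{1}{S_T},\ 0\right)
    = \frac{\max(S_T - K,\ 0)}{S_T} \\[4pt]
\text{same put payoff converted to JPY at } S_T:\quad
  & S_T \cdot \frac{\max(S_T - K,\ 0)}{S_T} = \max(S_T - K,\ 0)
\end{align*}
```

The two payoffs agree in every state of the world, so under these assumptions the two positions are the same contract quoted from opposite sides of the currency pair.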

cvs diff against a given DATE or tag

Posted on September 5, 2015 (updated October 1, 2017) by BinTAN

cvs diff -D "2017-09-01"

cvs diff -r your-tag

left skew~left side outliers~mean PULLED left

Posted on September 5, 2015 (updated June 30, 2016) by BinTAN

Label – math intuitive

[[FRM]] book has the most intuitive explanation for me – negative (or left) skew means outliers in the left region.

Now, intuitively, moving outliers further out won’t affect median at all, but pulls mean (i.e. the balance point) to the left. Therefore, compared to a symmetrical distribution, mean is now on the LEFT of median. With bad outliers, mean is pulled far to the left.

Intuitively, remember mean point is the point to balance the probability “mass”.

In finance, if we look at the signed returns we tend to find many negative outliers (far more than positive outliers). Therefore the distribution of returns shows a left skew.
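The intuition can be tested numerically with a tiny sample. This is just a sketch; the helper names `mean`, `median` and `skewness` are mine, and the skewness used is the plain population version (third central moment over sigma cubed).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Arithmetic mean: the balance point of the sample.
double mean(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

// Median: middle value of the sorted sample (parameter taken by value
// so that sorting does not disturb the caller's data).
double median(std::vector<double> v) {
    std::sort(v.begin(), v.end());
    const std::size_t n = v.size();
    return n % 2 ? v[n / 2] : (v[n / 2 - 1] + v[n / 2]) / 2.0;
}

// Population skewness: third central moment divided by sigma cubed.
double skewness(const std::vector<double>& v) {
    const double m = mean(v);
    double m2 = 0.0, m3 = 0.0;
    for (double x : v) {
        const double d = x - m;
        m2 += d * d;
        m3 += d * d * d;
    }
    m2 /= v.size();
    m3 /= v.size();
    return m3 / std::pow(m2, 1.5);
}
```

Start from the symmetric sample {-1, 0, 1, 2, 3}, where mean and median are both 1. Pushing the left-most point out to -20 leaves the median at 1, pulls the mean down to -2.8 and makes the skewness negative.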


Monthly archive: September 2015
