opaque c++trouble-shooting: bustFE streamIn

This is a good illustration of fairly common opaque c++ problems, the most dreadful/terrifying species of developer nightmares.

The error seems to be somewhat consistent but not quite.

Reproducing it in the dev environment was the first milestone. Adding debug prints proved helpful in this case, but sometimes it would take too long.

In the end, I needed a good hypothesis, before we could set out to verify it.

     81     bool SwapBustAction::streamInImpl(ETSFlowArchive& ar)
     82     { // non-virtual
     83       if (exchSliceId.empty())
     84       {
     85         ar >> exchSliceId;
     86       }
    104     }
    105     void SwapBustAction::streamOutImpl(ETSFlowArchive& ar) const
    106     { // non-virtual
    107       if (exchSliceId.size())
    108       {
    109         ar << exchSliceId;
    110       }

When we save the flow element to file, we write out the exchSliceId field conditionally as on Line 107, but when we restore the same flow element from file, the function looks for this exchSliceId field unconditionally as on Line 85 (the guard on Line 83 only tests the in-memory field, which is presumably always empty on a fresh restore). When the function can’t find this field in the file, it hits BufferUnderflow and aborts the restore of the entire flow chain.

The serialization file uses field delimiters between the exchSliceId field and the next field which could be a map. When the exchSliceId field is missing, and the map is present, the runtime would notice an unusable data item. It throws a runtime exception in the form of assertion errors.

The “unconditional” restore of exchSliceId is the bug. We need to check that the exchSliceId field is present in the file before reading it.
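A minimal sketch of one way to make the write and the read symmetric, assuming a presence flag is written ahead of the optional field. Archive, BustRecord, streamOut and streamIn below are hypothetical stand-ins; the real ETSFlowArchive API is not shown in this post and the actual fix may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the real archive class: a flat list of string fields.
class Archive {
public:
  void put(const std::string& field) { fields_.push_back(field); }
  std::string get() {
    if (pos_ >= fields_.size()) throw std::underflow_error("buffer underflow");
    return fields_[pos_++];
  }
private:
  std::vector<std::string> fields_;
  std::size_t pos_ = 0;
};

struct BustRecord {
  std::string exchSliceId;  // optional field, may be empty
  std::string nextField;    // whatever follows it in the stream

  // Writer: always emit a presence flag, then the field only when it is set.
  void streamOut(Archive& ar) const {
    ar.put(exchSliceId.empty() ? "0" : "1");
    if (!exchSliceId.empty()) ar.put(exchSliceId);
    ar.put(nextField);
  }

  // Reader: consult the flag, so the read is exactly as conditional as the write.
  void streamIn(Archive& ar) {
    const bool present = (ar.get() == "1");
    exchSliceId = present ? ar.get() : std::string();
    nextField = ar.get();
  }
};

// Round-trip helper that exercises both the "present" and the "absent" case.
inline BustRecord roundTrip(const BustRecord& in) {
  Archive ar;
  in.streamOut(ar);
  BustRecord out;
  out.streamIn(ar);
  return out;
}
```

A test for the absent case, roundTrip(BustRecord{"", "map"}), is exactly the case my original testing missed.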

In my testing, I only had a test case where exchSliceId was present. Insufficient testing.

story: %%investigation skill #etscfg

Hi Paul, In my etscfg I used this simple ifelse:

1        %IFELSE{defined(SOME_CHECK)}%
2        
3        %ELSE%
4        
5        %IFELSE%

To my surprise, Line 4 comes out preceded by a new-line when the if-condition fails. This impacts a lot of config artifacts that shouldn’t be impacted.

On the other hand, when the if-condition evaluates to true, I get exactly Line 2 and I don’t see any new-line added anywhere. This is correct output.

— Investigation —
On my Line 3, the implicit new-line character right after %ELSE% is incorrectly included in $else_block and printed out before Line 4, causing the surprise result. Attached are two test programs showing

my ($if_block, $else_block) = split(/(?:[^\w\n]*)\%ELSE\%/, $block); ### should be
my ($if_block, $else_block) = split(/(?:[^\w\n]*)\%ELSE\%\n?/, $block); ### to include trailing newline in delimiter of split()

In my fix, I also added a “guard” to deal with any trailing spaces sandwiched between %ELSE% and new-line character. I know sometimes I can leave a trailing space sandwiched there.

Altogether, my fix consists of two changes

• 1 new line of perl
• 1 modified line of perl
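For comparison, the same delimiter idea can be sketched in C++ with std::regex. This is not the Perl fix itself: splitIfElse is a hypothetical helper, and the [ \t]* part is only my guess at what the trailing-space guard could look like.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>

// Split a block at %ELSE%. The delimiter swallows optional trailing blanks
// and one newline, so the else-part does not start with a stray newline.
// The [ \t]* guard is an assumption, not the author's actual Perl line.
inline std::pair<std::string, std::string> splitIfElse(const std::string& block) {
  static const std::regex delim("[^\\w\n]*%ELSE%[ \t]*\n?");
  std::smatch m;
  if (!std::regex_search(block, m, delim)) return {block, std::string()};
  return {m.prefix().str(), m.suffix().str()};
}
```

With the input "yes\n%ELSE% \nno\n", the else-part comes back as "no\n" with no leading new-line.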

 

pre-allocate DTOs@SOD #HFT #ets

example — etsflow framework pre-allocates object pool (presumably the flow elements) for the day, to avoid runtime call to malloc. Are these objects ever released to the pool? I doubt it since all of these objects are subject to query or bust.

example — RTS pre-allocates outgoing message objects from a ring buffer’s head, and “returns” to the ring buffer at the tail… See How RTS achieved 400-700 KMPS #epoll

example — Sell-side HFT OMS also uses pre-allocation. Suppose for every new order there are 3 new DataTransferObjects A/B/C to be instantiated on heap. Traditional approach would make 3 allocation requests in real time. I think the free-list manager becomes a hotspot, even if there’s a per-thread free list.

Basically HFT avoids new/malloc after market opens. RTS uses mostly arrays, and very few (rather small) STL containers. Those STL containers tend to be populated before market opens and remain static.

Pre-allocation is a popular technique. The sizes of A/B/C are known at compile time from their class declarations. For DTO class A, sizeof(A) is roughly the sum of the non-static data member sizes, plus any alignment padding (and a vptr if the class has virtual functions). Then we estimate how many orders we will get a day (say 7 million). Then we pre-allocate 7 million A objects in an array. The allocation happens at start-up, though the sizes are compile-time constants.

When an order comes in, the A/B/C DTO objects are already allocated but empty.
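A minimal sketch of the idea, assuming a simple array-backed pool that hands out slots in order and never takes them back. OrderDto and PreallocatedPool are hypothetical names; the real etsflow and RTS pools are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical DTO; the real A/B/C classes are not shown in this post.
struct OrderDto {
  long orderId = 0;
  double price = 0.0;
  int qty = 0;
};

// Fixed-capacity pool: one big allocation at start-up, none afterwards.
template <typename T>
class PreallocatedPool {
public:
  explicit PreallocatedPool(std::size_t capacity) : slots_(capacity), next_(0) {}

  // Hand out the next empty slot, or nullptr once the daily estimate is exhausted.
  T* acquire() {
    if (next_ == slots_.size()) return nullptr;
    return &slots_[next_++];
  }

  std::size_t used() const { return next_; }
  std::size_t capacity() const { return slots_.size(); }

private:
  std::vector<T> slots_;  // sized once in the constructor, never resized
  std::size_t next_;
};
```

A production pool would be sized from the daily estimate, e.g. PreallocatedPool<OrderDto> pool(7000000), before market open. Nothing is released back in this sketch, which matches my doubt above about objects that must stay available for query or bust.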

Byte-array is an alternative, but this smells like the raw free list to me…

q[static thread_local ] in %%production code

static thread_local std::vector<std::vector<std::string>> allFills; // I have this in my CRAB codebase, running in production.

Justification — In this scenario, data is populated on every SOAP request, so keeping them as non-static data members is doable but considered pollutive.

How about static field? I used to think it’s thread-safe …

When thread_local is applied to a variable of block scope, the storage-class-specifier static is implied if it does not appear explicitly. In my code I make it explicit.
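A small sketch of the block-scope case, assuming one buffer per request-handling thread. requestBuffer, handleRequest and sizeSeenByOtherThread are hypothetical names, not the CRAB code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Each thread gets its own copy of this buffer; 'static' is spelled out
// even though thread_local at block scope already implies it.
inline std::vector<std::string>& requestBuffer() {
  static thread_local std::vector<std::string> buf;
  return buf;
}

// Hypothetical request handler: clears and refills the per-thread buffer.
inline std::size_t handleRequest(int fillCount) {
  auto& buf = requestBuffer();
  buf.clear();
  for (int i = 0; i < fillCount; ++i) buf.push_back("fill" + std::to_string(i));
  return buf.size();
}

// Run a request on a second thread and report the size seen by that thread.
inline std::size_t sizeSeenByOtherThread(int fillCount) {
  std::size_t seen = 0;
  std::thread t([&] { seen = handleRequest(fillCount); });
  t.join();
  return seen;
}
```

The calling thread's buffer is untouched by the request handled on the other thread, which is the property a plain static variable would not give.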

mktData direct^Reuter feeds: realistic ibank set-up

NYSE – an ibank would use direct feed to minimize latency.

For some newer exchanges – an ibank would get Reuters feed first, then over a few years replace it with direct feed.

A company often gets duplicate data feed for the most important feeds like NYSE and Nasdaq. The rationale is discussed in [[all about hft]]

For some illiquid FX pairs, they rely on calculated rates from Reuters. BBG also calculates a numerically very similar rate, but BBG is more expensive. BBG also prohibits real-time data redistribution within the ibank.

consolidate into single-process: low-latency OMS

Example #2– In a traditional sell-side OMS, a client FIX order propagates through at least 3 machines in a chain:

  1. client order gateway
  2. main OMS engine
  3. exchange gateway such as smart order router or benchmark execution engine supporting VWAP etc

The faster version consolidates all of the above into a single process, cutting latency from 2 ms to 150 micros. See latency in eq OMS

Example #1– In 2009 I read about or heard from interviewers about single-JVM designs to replace multi-stage architecture.

Q: why is this technique not used on west coast or main street ?
%%A: I feel on west coast throughput outweighs latency. So scale-out is the hot favorite. Single-JVM is scale-up.

ibank: trade book`against exch and against client

In an ibank equity trading system, every (partial or full) fill seems to be booked twice

  1. as a trade against a client
  2. as a trade against an exchange

I think this is because the ibank, not the client, is a member of the exchange, even though the client is typically a big buy-side like a hedge fund or asset manager.

The booking system must be reconciled with the exchange. The exchange’s booking only shows the ibank as the counterparty (the opposite counterparty is the exchange itself.) Therefore the ibank must record one trade as “ibank vs exchange”

That means the “ibank vs client” trade has to be booked separately.

Q: how about bonds?
A: I believe the ibank is a dealer rather than a broker. Using internal inventory, the ibank can execute a trade against client, without a corresponding trade on ECN.

Q: how about forex?
A: I think there’s less standardization here. No forex ECN “takes the opposite side of every single trade” as exchanges do. Forex is typically a dealer’s market, similar to bonds. However for spot FX, dealers usually maintain low inventory and execute an ECN trade for every client trade. The biggest FX dealers are banks with huge inventory, but still relatively small compared to the daily trade volume.

low-latency jobs; learning python#CSY

I believe my current trading system is latency-sensitive (as a large-scale [1] order management engine), but on this job I don’t do anything about latency — this is just another equity trading system job. I feel you don’t need to aim for low latency. Just aim for any c++ job that’s not off-putting to you.

[1] I consider it large scale because it has probably 10-20 developers working on it, for at least 5 years.

Low-latency jobs are harder to get, and they usually prefer developers in their 30’s or younger.

I don’t exactly avoid such jobs, but I don’t hold up any serious expectation of an offer by those teams. The main barrier of entry is technical capabilities, either coding test or obscure c++/linux questions. Even if I’m lucky to hit familiar, well-prepared tech questions, they may still find reasons to take a pass, as I experienced in my technical wins at SIG, Bloomberg etc.

Java and python are never show-stoppers for these jobs, so
* if you only aim for low-latency jobs, then don’t worry about python or java
* if you widen your search scope, then I suggest you pick up python, not java

Taking a step back to get a perspective, I think job seekers are like smartphone makers — we need to adjust our “product” for the changing customer taste. The “product” in this case is our testable tech skills. Domain knowledge is losing importance; Python is now in-demand; coding tests are growing harder; Linux/compiler system knowledge has always been important to low-latency interviews .. so we need to decide how to adjust our “product” to attract our customers.

—- Earlier email —-

How is your python practice?

This is a personal journey. Therefore some people don’t like to discuss in details how they are learning something new. So feel free to disclose any amount of information you feel comfortable with.

I advocated solving coding problems in python. I now realize both you and I don’t want to solve every problem. Out of 10 problems, we might solve 1 or 2 in real code. So in the past 4 weeks, perhaps you didn’t look at 10 problems, so the number of problems you could solve in python might be very low. Therefore, my suggestion may not work for you.

In that case, I wonder what python coding experiments you prefer.

I once said python is easier to learn, but like learning any new language, it still demands a huge commitment, sustained focus, and personal sacrifice. Therefore, it helps greatly if there’s a python project on a day job. Without it, we need self-discipline, determination, and clear targets to sustain the focus and energy. None of these is really “easy”.

Talking about clear targets, one example is “solve one coding problem of medium complexity each week, in python”.

I created a shared_ptr with a local object address..

In my trade busting project, I once created a local object, and used its address to construct a shared_ptr (under an alias like TradePtr).

Luckily, I hit consistent crashes. I think the reason is that a shared_ptr constructed from a raw pointer assumes it owns a heap object created by new. When my function returned, the shared_ptr tried to call delete on the raw ptr, which pointed into the local stack frame, leading to the crash.

The proven solution — make_shared()
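A minimal sketch of the mistake and the fix. Trade, TradePtr and makeTrade are hypothetical stand-ins for the real types in the trade busting project.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical trade type and alias, standing in for the real ones.
struct Trade {
  std::string id;
  int qty;
};
using TradePtr = std::shared_ptr<Trade>;

// Wrong: the shared_ptr would eventually call delete on a stack address.
//   Trade local{"T1", 100};
//   TradePtr bad(&local);   // undefined behavior when the last owner goes away

// Right: make_shared allocates the Trade on the heap along with its control block.
inline TradePtr makeTrade(const std::string& id, int qty) {
  return std::make_shared<Trade>(Trade{id, qty});
}
```

The object returned by makeTrade stays alive for as long as any TradePtr copy refers to it, independent of the stack frame that created it.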

both base classes”export”conflicting typedef #MI

Consider class Der: public A, public B{};

If both A and B expose a public member typedef for Ptr, then Der::Ptr will be ambiguous. The compiler error message will explicitly highlight A::Ptr and B::Ptr as “candidates”!

Solution — inside Der, declare

typedef B::Ptr Ptr; //to exclude the A::Ptr typedef
// This solution works even if B is a CRTP base class like

class Der: public A, public B{
  typedef B::Ptr Ptr;
};
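A compilable sketch of the CRTP variant, assuming both bases expose a shared_ptr typedef named Ptr. The class bodies are illustrative; only the disambiguating typedef inside Der is the technique described above.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct A { typedef std::shared_ptr<A> Ptr; };

// CRTP base: exposes a Ptr typedef for whatever derives from it.
template <typename T>
struct B { typedef std::shared_ptr<T> Ptr; };

struct Der : public A, public B<Der> {
  // Without this line, Der::Ptr is ambiguous between A::Ptr and B<Der>::Ptr.
  typedef B<Der>::Ptr Ptr;
};

static_assert(std::is_same<Der::Ptr, std::shared_ptr<Der>>::value,
              "Der::Ptr resolves to the B<Der> typedef");
```

Der is declared as a struct here so the typedef is public; in a class, it would need a public: label to be usable from outside.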

order slice^fill: jargon

An execution is also known as a fill, often a partial fill.

  • A slice is part of a request; An execution is part of a response

A slice can have many fills, but a fill is always for a single request.

  • An execution always comes from some exchange, back to buy-side clients, whereas
  • A request (including a slice) always comes from upstream (like clients) to downstream (like exchanges)
  • Slicing is controlled by OMS systems like HFT; Executions are controlled by exchanges.
  • Clients can send a cancel for a slice before it’s filled; Executions can be busted only by exchanges.

##observations@high-volume,latency sensitive eq OMS #CSY

This is probably the biggest sell-side equity order-management-system (OMS) on Wall St, written in c++11. Daily order volume is probably highest among all investment banks, presumably 7 figures based on my speculation, though a lot of them get canceled, rejected or unfilled. I am not allowed to reveal too many internal details due to compliance.

In contrast, GS used to get about a million individual trades (perhaps the partial fills of an order?) a day, probably not counting the high-frequency small trades.

  • synchronization — I haven’t noticed any locking or condition variable so far. I think single-threaded mode is faster than synchronized multi-threading. Multiple instances of the same software run in parallel across machines. I think this is in many ways better than one big monolithic process hosting many threads. We have 4 threads per instance in some cases.
  • ticking market data — is available, though I don’t know if my OM system needs them beside the restriction indicators
  • For crash recovery, every order and every fill is persisted in non-volatile memory, and often swapped out to disk to free up memory. These records are never cleared until EOD. Consequently, any fill can be busted any time before EOD. A recovery would reinstate them. So which “order objects” are persisted?
    • All pending orders and (for busting support) closed orders. Basically all orders.
    • Each logical order requires a chain of stateful FlowElement objects created on the fly to support the order.
  • data persistence — the OMS enriches every order and also generates new orders. These orders are persisted automatically in case of a server crash and restart. The persistence files are binary and cleared at EOD
  • RDBMS — is loaded into cache at Start-of-Day and seldom accessed intra-day. I confirmed it with an ex-DBA colleague.
    • However, some product DB system sends intra-day real time updates via messaging (not FIX)
  • MOM — I have not seen a message queue so far but they could be hidden somewhere. Earlier I heard other ibanks’ employees telling me Tibco (and similar messaging middlewares) were popular in fixed income but now I doubt it. Queues add latency.
    • We do use some pub-sub MOM (CPS) but not for order messages therefore not part of order flow.
  • socket — is not needed in any module. I believe the applications communicate via FIX, SOAP etc, on top of well-encapsulated TCP library modules.
  • garbage collection — no GC like in java and dotnet
  • CRTP — heavy use of CRTP. I don’t remember seeing many virtual functions.
  • The most important message is the order object, represented by a FIX message. The order object gets enriched and modified by multiple functions in a chain. Then it is sent out via FIX session to the next machine. As in any OMS, the order object is stateful. All order objects are persisted somewhere so a crash won’t wipe out pending orders.
    • (Elsewhere, I have seen very lean and mean buy-side OMS systems that don’t persist any order! After crash, it would query the exchange for order states.)
  • The 2nd most important message is probably the response object, represented by a FIX msg. If there are 100,000 order objects then there are roughly 300,000 response objects. Each order generates multiple responses such as Rejection, PendingNew, New, PartialFill, PendingCancel, Cancelled… Response objects probably don’t need to be persisted in my view.
  • The 3rd most common message is the report message object, again in FIX format. Each order object probably generates at least one report, even if rejected. Report objects sound simple but they carry essential responsibilities: not only regulatory reporting and client confirmations, but also trade booking, trade capture… If we miss an execution report, the essential books and records (inventory, positions..) would be messed up. However, these reports are not so latency sensitive.
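The CRTP bullet above can be sketched as follows: a base class template calls into the derived class at compile time, so the Impl functions stay non-virtual. FlowElementBase, BustElement, CancelElement and restore are hypothetical names chosen for illustration; the real framework classes are not shown in this post.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// CRTP base: streamIn() dispatches to the derived class at compile time,
// so streamInImpl() needs no virtual keyword.
template <typename Derived>
class FlowElementBase {
public:
  std::string streamIn() {
    return static_cast<Derived*>(this)->streamInImpl();
  }
};

// Hypothetical concrete elements; the names are illustrative only.
class BustElement : public FlowElementBase<BustElement> {
public:
  std::string streamInImpl() { return "bust"; }  // non-virtual
};

class CancelElement : public FlowElementBase<CancelElement> {
public:
  std::string streamInImpl() { return "cancel"; }  // non-virtual
};

// Generic code works on any element without a common virtual interface.
template <typename T>
std::string restore(FlowElementBase<T>& element) {
  return element.streamIn();
}
```

Neither element class carries a vtable, which is consistent with not seeing many virtual functions in the codebase.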

overcoming exchange-FIX-session throughput limit

Some exchanges (CME?) limit each client to 30 orders per second. If we have a burst of orders to send, I can see two common solutions A) upstream queuing B) multiple sessions

  1. upstream queuing is a must in many contexts. I think this is similar to TCP flow control.
    • queuing in MOM? Possible but not the only choice
  2. an exchange can allow 100+ FIX sessions for one big client like a big ibank.
    • Note a big exchange operator like nsdq can have dozens of individual exchanges.

Q: is there any (sender self-discipline) flow control in intranet FIX?
A: not needed.
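Solution A (upstream queuing) can be sketched as a sliding-window throttle in front of one session. ThrottledSession is a hypothetical class, and the caller supplies the clock in milliseconds to keep the sketch deterministic; a real gateway would likely use a monotonic clock and a timer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Upstream queue in front of one exchange session. Releases at most
// maxPerSecond orders in any sliding one-second window.
class ThrottledSession {
public:
  explicit ThrottledSession(std::size_t maxPerSecond) : limit_(maxPerSecond) {}

  void enqueue(const std::string& order) { pending_.push_back(order); }

  // Called with the current time in milliseconds; returns the orders that
  // may go out now. The rest stay queued for a later call.
  std::vector<std::string> drain(std::int64_t nowMs) {
    // Forget send times that have fallen out of the one-second window.
    while (!sentAt_.empty() && nowMs - sentAt_.front() >= 1000) sentAt_.pop_front();
    std::vector<std::string> out;
    while (!pending_.empty() && sentAt_.size() < limit_) {
      out.push_back(pending_.front());
      pending_.pop_front();
      sentAt_.push_back(nowMs);
    }
    return out;
  }

  std::size_t backlog() const { return pending_.size(); }

private:
  std::size_t limit_;
  std::deque<std::string> pending_;  // orders waiting for capacity
  std::deque<std::int64_t> sentAt_;  // send times within the last second
};
```

Solution B would amount to running several such objects, one per FIX session, and spreading the burst across them.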

Are equities simpler than FICC@@

I agree that FICC products are more complex, even if we exclude derivatives

  • FI product valuations are sensitive to multiple factors such as yield curve, credit spread
  • FI products all have an expiry date
  • We often calculate a theoretical price since market price is often unavailable or illiquid.
  • I will omit other reasons, because I want to talk more (but not too much) about …

I see some complexities (mostly) specific to equities. Disclaimer — I have only a short few years of experience in this space. Some of the complexities here may not be complex in many systems but may be artificially, unnecessarily complex in one specific system. Your mileage may vary.

  • Many regulatory requirements, not all straightforward
  • Restrictions – Bloomberg publishes many types of restrictions for each stock
  • Short sale — Many rules and processes around short sale
  • Benchmarks, Execution algorithms and alphas. HFT is mostly on equities (+ some FX pairs)
  • Market impact – is a non-trivial topic for quants
  • Closing auctions and opening auctions
  • Market microstructure
  • Order books – are valuable, not easy to replicate, and change by the second
  • Many orders in a published order book get cancelled quickly. I think some highly liquid government bonds may have similar features
  • Many small rules about commission and exchange fees
  • Aggregate exposure — to a single stock… aggregation across accounts is a challenge mostly in equities since there are so many trades. You often lose track of your aggregate exposure.
  • Exchange connectivity
  • Order routing
  • Order management

custom-basket ^ portfolio trading

A client can ask a broker to buy “two IBM, one MSFT” either as AA) a custom basket or BB) a portfolio. The broker handles the two differently.

Only the Basket (not the portfolio) is “listed” on Bloomberg (but not on any exchanges). Client can see the pricing details in Bloomberg terminal, with a unique basket identifier.

Booking — the basket trade is recorded as a single indivisible position; whereas the portfolio trade gets booked as individual positions. Client can only sell the entire basket; whereas the portfolio client can sell individual component stocks.

Fees — There is only one brokerage fee for the basket, but 5 for a portfolio of 5 stocks.

The broker or investment advisor often has a “view” and advice on a given basket.

Corporate actions should be handled in the basket automatically.

I feel portfolio is more flexible, more informal than custom basket which is less formalized, less regulated than an index-tracking ETF.

swap on eq futures/options: client motive

Q1: why would anyone want to enter a swap contract on an option/futures (such a complex structure) rather than trading the option/futures directly?

Q2: why would anyone want to use swap on an offshore stock rather than trading it directly?

More fundamentally,

Q3: why would anyone want to use swap on domestic stock?

A1: I believe one important motivation is restrictions/regulation. A trading shop needs a lot of approvals, licenses, capital, disclosures … to trade on a given futures/options exchange. I guess there might be disclosure and statutory reporting requirements. If the shop can’t or doesn’t want to bother with the regulations, they can achieve the same exposure via a swap contract.

This is esp. relevant in cross-border trading. Many regulators restrict access by offshore traders, as a way to protect the local market and local investors.

A3: One possible reason is transparency, disclosure and reporting. I guess many shops don’t want to disclose their positions in, say, AAPL. The swap contract can help them conceal their position.

limit-IOC ^ market-IOC

Limit IOC (Immediate-or-Cancel): Can be used for FX Spot and CFD.

An instruction to fill as much of an order as possible within pre-defined tolerances of a limit price, immediately (5 second Time-to-Live).

Unlike Market IOC orders, Limit IOC orders allow a Client to control the maximum slippage that they are willing to accept.

Under normal market conditions a Market IOC order will be filled in full immediately. In the event that it isn’t, any residual amount will be cancelled. Price Tolerance cannot be added on a Market IOC order, meaning that a client cannot control slippage.
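The difference can be sketched as a toy matching function for a buy order. Offer, IocResult and executeBuyIoc are hypothetical; the sketch ignores the 5-second time-to-live and treats the tolerance simply as a worst acceptable price.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Offer { double price; int qty; };  // one level of resting sell interest

struct IocResult { int filled; int cancelled; };

// Buy-side IOC against offers sorted by ascending price.
// Limit IOC: pass hasLimit = true and the worst acceptable price as maxPrice.
// Market IOC: pass hasLimit = false, so slippage is uncontrolled.
inline IocResult executeBuyIoc(const std::vector<Offer>& offers, int orderQty,
                               bool hasLimit, double maxPrice) {
  int remaining = orderQty;
  for (const Offer& o : offers) {
    if (remaining == 0) break;
    if (hasLimit && o.price > maxPrice) break;  // beyond tolerance: stop filling
    remaining -= std::min(remaining, o.qty);
  }
  // Whatever could not be filled immediately is cancelled, never rested.
  return IocResult{orderQty - remaining, remaining};
}
```

Against the same book, the limit version stops at the price cap and cancels the residual, while the market version keeps filling at worse prices until the quantity or the book runs out.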

gdb: dump STL container %%experience

First let’s talk about custom containers. GDB would show the field names of an object, but frequently not the values. I guess integer values might show up, but more than half the fields are pointers (actually a char-array field would be easy to print.)

If I call a function on the object, I have to be very lucky and very careful. q(->) has never worked for me so far, so I need to use q(*) to de-reference every pointer before calling a method on the pointee, and pray it works.

http://www.yolinux.com/TUTORIALS/src/dbinit_stl_views-1.03.txt works on std::map …

A simple experiment using https://github.com/tiger40490/repo1/blob/cpp1/cpp/88miscLang/containerDumpOperator.cpp

  • g++ -g theFile.cpp && gdb -iex 'add-auto-load-safe-path .' ./a.out
  • (gdb) print *(li._M_impl._M_start+1) # can print 2nd element if it’s std::string or double
    • Note before vector initialization, gdb already shows the addresses inside the vector, but some addresses are not populated. Just retry after the initialization.
  • std::unordered_map is doable:
    • (gdb) print **(tm._M_buckets) # prints first pair in a hash table bucket
    • (gdb) print *((**(tm._M_buckets))._M_next) # next pair in the same bucket
  • std::map content is harder
    • (gdb) print *(int*)(tm._M_t._M_impl._M_header._M_left+1) # prints one key
    • (gdb) print *(int*)(tm._M_t._M_impl._M_header._M_right+1) # prints another key in the pair
    • (gdb) print *(int*)((void*)(tm._M_t._M_impl._M_header._M_right+1)+sizeof(int)) #prints the value in the pair.
      • the (void*) cast is needed before we add sizeof(int), the size of the key. Without the cast, the pointer arithmetic would be scaled by the size of the node type instead of counting bytes.
      • from the key field to the value field, we move by 4 bytes (i.e. sizeof(int), the key type) from 0x6050e0 to 0x6050e4. It’s actually easy to manually type .. print *0x6050e4
      • I suspect the _M_right pointer is seated at the “color” field. Increment to the key field?

swap^cash equity trade: key differences

I now feel an equity swap is an OTC contract; whereas an IBM cash buy/sell is executed on the exchange.

  • When a swap trade settles, the client has established a contract with a Dealer. It’s a binding bilateral contract having an expiry, and possibly collateral. You can’t easily transfer the contract.
  • When a cash trade settles, the client has ownership of 500 IBM shares. No contract. No counterparty. No expiry. No dealer.

I think a cash trade is like buying a house. Your ownership is registered with the government. You can transfer the ownership easily.

In contrast, if you own a share in coop or a REIT or a real-estate private equity, you have a contract with a company as the counterparty.

Before a dealer accepts you as a swap trading partner, you must be a major company to qualify to be counterparty of a binding contract. A retail investor won’t qualify.

PendingNew^New: OrdStatus[39]

“8” has special meaning

  • in tag 35, it means execution report
  • in tag 39 and tag 150, it means status=rejected.

PendingNew and New are two possible statuses for a given order.

PendingNew (39=A, 150=A) is relatively simple. The downstream system sends a PendingNew to upstream as soon as it receives the “envelope”, before opening, validating or saving it. I would say even a bad order can go into PendingNew.

New (39=0, 150=0) is significant. It’s an official acknowledgement (or confirmation) of acceptance. It’s synonymous with “Accept” and “Ack”. I think it means fully validated and saved for execution. For an intermediate system, usually it waits for an Ack i.e. 39=0 from exchange before sending an Ack to the upstream. Tag 39 is usually not modified.

I see both A and 0 frequently in my systems, in execution report messages.

For a market Buy order, I think it will be followed by (partial) fills, but not guaranteed, because there may be no offers, or execution could fail for any reason. For a dealer system, execution can fail due to inventory shortage. I implemented such an execution engine in 95G.

I’m no expert on order statuses.
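A small sketch of how these tag values could be read off an execution report. The enum values are standard FIX OrdStatus codes; tagValue and isAck are hypothetical helpers, and the sketch assumes '|' as the field separator for readability instead of the real SOH (0x01) byte.

```cpp
#include <cassert>
#include <string>

// A few OrdStatus (tag 39) values from the FIX specification.
enum class OrdStatus : char {
  New = '0',
  PartiallyFilled = '1',
  Filled = '2',
  Canceled = '4',
  PendingCancel = '6',
  Rejected = '8',
  PendingNew = 'A'
};

// Extract the value of one tag from a FIX message. For readability this
// sketch assumes '|' as the field separator instead of the real SOH (0x01).
inline std::string tagValue(const std::string& msg, const std::string& tag) {
  const std::string key = "|" + tag + "=";
  const std::string padded = "|" + msg;  // so the first field also has a leading '|'
  const std::string::size_type start = padded.find(key);
  if (start == std::string::npos) return std::string();
  const std::string::size_type from = start + key.size();
  const std::string::size_type end = padded.find('|', from);
  return padded.substr(from, end == std::string::npos ? std::string::npos : end - from);
}

// True when the message is an execution report (35=8) acknowledging the order (39=0).
inline bool isAck(const std::string& msg) {
  return tagValue(msg, "35") == "8" &&
         tagValue(msg, "39") == std::string(1, static_cast<char>(OrdStatus::New));
}
```

Note how "8" is read in context: tagValue(msg, "35") == "8" identifies an execution report, while tagValue(msg, "39") == "8" means the order was rejected.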

pink sheets #learning notes

The pink sheets are a stock quotation service on unlisted stocks.

  • Many are penny stocks, trading for extremely low prices,
  • some are legitimate foreign companies that don’t wish to file reports with the SEC.
  • … There’s less regulation, less transparency, more risk of fraud in these stocks.

OTC Markets Group offers this service.

PinkSheet stocks are considered non-hedgeable in some swap dealer systems. I guess liquidity is too low.

https://www.fool.com/knowledge-center/what-are-the-pink-sheets.aspx is good intro.

gdb symbol-loading too time-consuming

After I attach gdb, it immediately starts a prolonged symbol loading process. It’s better to skip the loading, and selectively load some symbols.

https://ascending.wordpress.com/2007/09/02/a-couple-of-gdb-tricks/ describes how to use ~/.gdbinit, but I had no permission to write to the ~/

gdb -iex 'set auto-solib-add off' …. # worked

–loading a particular *.so file

I got “No loaded shared libraries match the pattern” and fixed it by

shar file.name.so #instead of /full/path/to/file.name.so.some.version

## sell-side eq e-trading arch features #MS,Baml..

Mostly inspired by the MS equity order-management “frameworks”

  • message-based, not necessarily MOM.
    • FIX messages are the most common
    • SOAP messages are also possible.
    • BAML system is based on MOM (tibrv)
  • message routing based on rules? Seems to be central to some sell-side /bloated/ “platforms” consisting of a constellation of processes.
  • event-driven
    • client newOrder, cancel requests
    • trading venue (partial) fills
    • Citi muni reoffer is driven by market data events, but here I focus on equity systems
    • Stirt realtime risk is driven by market data events + new trade booking events
    • buy-side would have order-origination events, but here I focus on sell-side systems
  • market data subscription? Actually not so important to some eq trading engines. Buy-side would make trading decisions based on market data, but a sell-side won’t.

hunt down CORRECT include file+directory

When I get something like an unrecognized-symbol compile error, a header file is probably missing.

This is a relatively easy challenge since it involves ascii source files, not binary. Faster to search.

  1. I start with some known include directories. I run find-grep looking for a declaration of the symbol. Hopefully I find only one declaration and it’s the correct header file to include.
  2. then I need to guess the correct form of #include
  3. Then I need to add the directory as an -I command-line option

 

q[cannot open shared object file] abc.so

strace -e trace=open myprogram can be used on a working instance to see where all the SO files are successfully located.

— Aug 2018 case: in QA host, I hit “error while loading shared libraries: libEazyToFind.so: … No such file or directory”

I can see this .so file so I used LD_LIBRARY_PATH to resolve it.

Then I get “error while loading shared libraries: libXXXXX.so: … No such file or directory”. I can’t locate this .so, but the same executable is runnable in a separate HostB. (All machines can access the same physical file using the same path.)

I zoomed into the HostB and used “ldd /path/to/executable”. Lo and behold, I can see why HostB is lucky. The .so files are located in places local in HostB … for reasons to be understood.

— May 2018 case:

The wording should be “cannot locate ….”

I fixed this error using $LD_LIBRARY_PATH

The *.so  file is actually specified as a -lthr_gcc34_64 option on the g++ command line, but the file libthr_gcc34_64.so was not found at startup.

I managed to manually locate this file in /a/b/c and added it :

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/a/b/c/

10 μs Additional latency: collocated eq OMS

  • Many organizations “are using the words ultra low latency to describe latencies of under 1 millsec” [1]
  • 13 microsec in collocated eq OMS
  • 150 microsec “single-trip” latency in similar software outside collocation site, measured by Corvil, from A to B
    • Time A: FIX msg coming into our engine
    • Time B: FIX msg going out of our engine
    • 150 μs is median, not average
    • Corvil is (most likely) a TCP network sniffer with FIX parser so it can track a single order flow
  • 2 millis in a “regular” build
  • A major hedge fund had a very limited flow featuring 10 microsec tick-to-trade. Monolithic process. User sends in a symbol + a template id. Engine “instantiates” the template and sends out the FIX

[1] https://en.wikipedia.org/wiki/Low_latency_(capital_markets)

Treasury trading doesn’t need such low latency.

boost::optional #NaN #a G9 boost construct

https://www.boost.org/doc/libs/1_65_1/libs/optional/doc/html/index.html has illustrations using optional<int>.

— #1 Usage: possibly-empty argument-holder
I encountered this in MS library:

void myFunc(optional<ReportType &> reportData_){
  if(reportData_) cout<<"non-empty";
}

In addition, declaration of this function features a default-arg:

void myFunc(optional<ReportType &> reportData_ = optional<ReportType &>());

Q: for an int param, how does this compare with a simple default-arg value of -1?
A: boost::optional wins if all integer values are valid values, so you can’t pick one of them to indicate “missing”
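The comparison in this Q&A can be sketched with std::optional from C++17, which I am swapping in for boost::optional so the example needs no external library. Unlike boost::optional, std::optional does not accept reference types such as optional<ReportType &>. describeSentinel and describeOptional are hypothetical functions.

```cpp
#include <cassert>
#include <optional>
#include <string>

// With a sentinel default, -1 can never be passed as a real value.
inline std::string describeSentinel(int qty = -1) {
  return qty == -1 ? "missing" : "qty=" + std::to_string(qty);
}

// With optional, every int (including -1) stays available as a real value.
inline std::string describeOptional(std::optional<int> qty = std::nullopt) {
  return qty ? "qty=" + std::to_string(*qty) : "missing";
}
```

Calling describeSentinel(-1) cannot be told apart from describeSentinel(), whereas describeOptional(-1) and describeOptional() give different answers.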

microservices “MSA” #phrasebook

I feel MSA is more of an architect interview topic, not a developer interview topic. Dev complexity is low by design.

eg: error acct lookup, receiving productId + possibly a clientId, returning an error acct

Now the phrasebook:

  • jxee — As of 2019, I guess jxee has the best support for MSA
  • enterprise — enterprise-bias. Most of the practices used in SOA/MSA come from developers who have created software applications for large enterprise organizations.
  • SOA — is the ancestor and now out of fashion. I think MSA will also fall out of fashion.
  • stateless — stateless microservice is best. Can be highly concurrent and scaled out
  • scalability — hopefully better
  • decentralized — rather than monolithic
  • modularity
  • communication protocol — supposedly lightweight, but more costly than in-process communication
    • http — is commonly used for communication. Presumably not asynchronous
    • messaging — metaphor is often used for communication. I doubt there’s any MOM or message queue.
  • cloud-friendly — cheaper
  • flexible — in the face of changing requirements, though I’m not sure time-to-market will improve
  • simple-facade — (of a big monolithic service) is now replaced by more complex interface, so I suspect this is not always popular.
  • complexity — (various forms) is the public enemy but I don’t know which weapon (REST,SOA,ESB,MOM,Spring) actually works
  • in-process — services can be hosted in a single process, but less common
  • devops — is a driver
    • testability — each service is easy to test, but not integration test
    • loosely coupled — decentralized, autonomous dev teams
    • deployment — is ideally independent for each service, and continuous, but overall system deployment is complicated

cross-currency equity swap: %%intuition

Trade 1: At Time 1, CK (a hedge fund based in Japan) buys one share of GE priced at USD 10, paying JPY 1000. Eleven months later, GE is still at USD 10 which is now JPY 990. CK faces a paper loss due to FX. I will treat USD as asset currency. CK bought 10 greenbacks at 100 yen each and now each greenback is worth 99 yen only.

Trade 2: a comparable single-currency eq-swap trade

Trade 3: a comparable x-ccy swap. At Time 1, the dealer (say GS) buys and holds GE on client’s behalf.

(It is instructive to compare this to Trade 2. The only difference is the FX.)

In Trade 3, how did GS pay to acquire the share? GS received JPY 1000 from CK and used it to get [1] 10 greenbacks to pay for the stock.

Q: What (standard) solutions does GS have to eliminate its own FX risk and remain transparent to the client? I think GS must pass on the FX risk to the client.

I think in any x-ccy deal with a dealer bank, this FX risk is unavoidable for CK. The bank always avoids the FX risk and transfers it to the client.

[1] GS probably bought USDJPY on the street. Who GS bought from doesn’t matter, even if that’s another GS trader. For an illiquid currency, GS may not have sufficient inventory internally. Even if GS has inventory under trader Tom, Tom may not want to Sell the inventory at the market rate at this time. Client ought to get the market rate always.

After the FX trade, GS house account is long USDJPY at price 100 and GS wants USD to strengthen. If GS effectively passes on the FX risk, then CK would be long USDJPY.

I believe GS needs to Sell USDJPY to CK at price 100, to effectively and completely transfer the FX risk to the client. In a nutshell, GS sells 10 greenbacks to CK and CK uses the 10 greenbacks to enter an eq-swap deal with GS.

GS trade system probably executes two FX trades

  1. buy USDJPY on street
  2. sell USDJPY to CK

After that,

  • GS is square USDJPY.
  • CK is Long USDJPY at price 100. In other words, CK wants USD to strengthen.

I believe the FX rate used in this trade must be communicated to CK.

Eleven months later, GS hedge account has $0 PnL since GE hasn’t moved. GS FX account is square. In contrast, CK suffers a paper loss due to FX, since USD has weakened.

As a validation (as I instructed my son), notice that this outcome is identical to the traditional trade, where CK buys USDJPY at 100 to pay for the stock. Therefore, this deal is a fair deal.

Q: Does GS make any money on the FX?
A: I don’t think so. If they do, it’s Not by design. By design, GS ought to Sell USDJPY to client at fair market price. “Fair” implies that GS could only earn bid/ask spread.

## stateful OMS class design: observations

More details are in email…

Here’s a well-established and large-scale order manager class design. It handles millions of orders a day.

  • The entire process is restarted on every trading day. Before the restart, all pending orders are cancelled! The OM is probably a per-thread singleton in the process.
  • The OM stores all the orders for the day, including each closed order in case it needs cancellation.
  • The OM keeps all the partial executions (aka partial fills) for a given order, because each execution could be busted.
  • Each action on an order (such as validation, partial execution ..) is performed by a dedicated object. For 1000 orders, if there are 5 actions, then there would be 5000 distinct “action objects”. The OM has pointers to all of these action objects.
  • Most action objects are stateful. ALL action objects are persisted somewhere so as to support busting/cancellation.

 

LD_LIBRARY_PATH ^ objdump RUNPATH

This is related to q[cannot open shared object file] abc.so

See https://amir.rachum.com/blog/2016/09/17/shared-libraries/#rpath-and-runpath for the RUNPATH

q(objdump) can inspect the binary file better than q(ldd) does.

q(ldd) shows the final, resolved path of each .so file, but (AFAIK) doesn’t show how it’s resolved. The full steps of resolution are described in http://man7.org/linux/man-pages/man8/ld.so.8.html

q(objdump -p) can shed some light … it prints the RUNPATH (DT_RUNPATH) entry in the dynamic section of the binary file.

c++timestamp string includ`microsec

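        #include <sys/time.h> // gettimeofday (POSIX)
        #include <ctime>      // gmtime
        #include <iomanip>    // setfill, setw
        #include <sstream>    // ostringstream
        #include <string>
        using namespace std;

        string utcTimestampMicros() { // format: 20190131-23:59:59.000123
                ostringstream timestampNow;
                struct timeval timeValNow;
                gettimeofday (&timeValNow,NULL); //populates the struct
                struct tm * ptm = gmtime( &timeValNow.tv_sec ); // UTC; gmtime is not thread-safe
                timestampNow<<ptm->tm_year +1900;
                timestampNow<<setfill('0')<<setw(2)<<ptm->tm_mon+1; // setfill is sticky, setw is not
                timestampNow<<setw(2)<<ptm->tm_mday<<'-';
                timestampNow<<setw(2)<<ptm->tm_hour<<':';
                timestampNow<<setw(2)<<ptm->tm_min<<':';
                timestampNow<<setw(2)<<ptm->tm_sec<<'.';
                timestampNow<<setw(6)<<timeValNow.tv_usec; // zero-padded microseconds
                return timestampNow.str();
        }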

in-line field initializer ] c++11

I believe the concise form of java-style field initializer is mostly legal in c++11 (except static fields — See P115 [[essential c++]]). In c++ lingo, “initializer” usually refers to one special part of a ctor, but here I focus on in-line initializers like

float myField = 0.11073;

Q: can you inline initialize the following entities?

  • case: static field of a class? No, unless it is a const integral type (or constexpr). Otherwise it must be defined and initialized outside the class (One-Definition-Rule)
  • case: instance field of a class? inline field initializer allowed since c++11. See https://stackoverflow.com/questions/13662441/c11-allows-in-class-initialization-of-non-static-and-non-const-members-what-c
  • case: instance field of type std::string or STL container? Allowed but no need to specify any initializer. These component-objects are automatically initialized to “empty”. I tested in my CRAB project in MVEA.
  • case: local variable? Yes … Best practice. Otherwise the variable silently holds rubbish (an indeterminate value)!
  • case: local static variable? Yes, but not strictly needed … because static-storage variables are zero-initialized by default!
    • Note the initialization happens only once, ignored on subsequent encounters
  • case: global variable? Allowed
  • case: file-scope static variable? Allowed
  • .. These rules are messier than java
#include <iostream>
#include <string> // without it, "string" is different type in MSVS!
using namespace std;

float global = 0.1;
static float file_scope_static = 0.1314;

struct Test {
	float instance_field = 0.3; // since c++11
	string instance_field_str = "instance_field_str"; // no-initializer also safe.
	static float static_field;
};
float Test::static_field = 0.4;

int main()
{
	float local = 0.2;
	static float local_static = 0.793;
	cout << Test::static_field << endl;
	Test t;
	cout << t.instance_field_str << endl;
	cout << t.instance_field << endl;
	cout << file_scope_static << endl;
	cout << local_static << endl;
	return 0;
}