UDP: send/receive buffers are configurable #CSY

It’s wrong to say that UDP uses a small receive buffer and no send buffer.

Receive: https://access.redhat.com/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html shows how to increase the UDP receive buffer to 25MB

Send: https://stackoverflow.com/questions/2031109/understanding-set-getsockopt-so-sndbuf shows you CAN configure the send buffer on a UDP socket.

Send: https://www.ibm.com/support/knowledgecenter/en/SSB23S_1.1.0.15/gtpc2/cpp_sendto.html also confirms the SO_SNDBUF send-buffer size applies to UDP sockets

In addition, the application is free to use its own custom buffer, for example in the form of a vector.
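As a minimal sketch of the point above, the snippet below requests both buffer sizes on a UDP socket and reads back what the OS granted. The helper name `configure_udp_buffers` and the 256KB figure are illustrative assumptions. The granted size may differ from the request, since the OS can clamp or round it (Linux, for instance, doubles the value and caps it at a sysctl limit).

```python
import socket

def configure_udp_buffers(rcv_bytes, snd_bytes):
    """Request buffer sizes on a fresh UDP socket; return what the OS granted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Both options are accepted on a datagram socket, not just on TCP.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcv_bytes)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, snd_bytes)
        granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        return granted_rcv, granted_snd
    finally:
        sock.close()

if __name__ == "__main__":
    print(configure_udp_buffers(256 * 1024, 256 * 1024))
```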

handle java OOM @runtime

https://stackoverflow.com/questions/2679330/catching-java-lang-outofmemoryerror says "only very infrequently is OutOfMemoryError the death-knell to a JVM. There is only one good reason to catch an OutOfMemoryError and that is to close down gracefully, cleanly releasing resources and logging the reason for the failure best you can (if it is still possible to do so)."

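Note that in a Docker container, running past the container’s memory limit (as opposed to the JVM heap limit) usually gets the JVM killed outright by the kernel, taking down not only the JVM but the entire container. In that case there is no OutOfMemoryError to catch.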

low realized roti..cos I moved on quickly#MOM,c#

See the meaning of move-on

Analogy — (Trevor Tang) picking a major in medicine is a wrong bet if you are into technology, though medicine is a great major.

  • MOM roti? low but could go higher if I pursue an architect career in non-eq trading. The underlying tech is very flexible and mature and a winning bet
  • SQL tuning + complex query — roti can go higher if I go into big data or data analysis domains. The underlying tech is powerful, mature, adaptable and not dead. I think it will see a renaissance.
  • swing roti? very low but could be higher if I pursue a GUI career in dotnet. There is a lot of highly inheritable, portable skill even if swing goes out
  • drv pricing roti? can go higher if I stick to quant-dev career and move only within similar roles. However, if you move from volatility to yield curve to exotics, the accu becomes very thin

lower workload ⇏ quality free time #family

Lower workload CAN mean

… more time for kids + workout, but can also mean

… more time wasted .. burn/rot

The free time saved due to lower workload is often spent on reflective blogging… controversial

A great example of quality free time is the Bayonne -> MS commute in 2018/2019. For a few weeks I was able to do coding drill on commute, despite the segmented commute. For a few months I was doing /productive/ git-blogging

With queues,say pop/append !!enqueue/dequeue

The standard phrases “enqueued/enqueuing, dequeued/dequeuing” are sometimes too long and less readable. Some texts actually use the shorter words —

  1. pop, popped, popping .. for dequeue
    • This word is used in C++ std::list::pop_front and pop_back
    • This word is used in python list.pop() and python collections.deque.popleft()
  2. append, appending, appended .. for enqueue.
    • This word is more direct, more visual.
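A small illustration of the shorter vocabulary, using python collections.deque. The function name `serve_in_order` is made up for this sketch.

```python
from collections import deque

def serve_in_order(jobs):
    """Push jobs through a FIFO queue and return them in service order."""
    queue = deque()
    for job in jobs:
        queue.append(job)                  # "append" = enqueue at the tail
    served = []
    while queue:
        served.append(queue.popleft())     # "pop" = dequeue from the head
    return served

if __name__ == "__main__":
    print(serve_in_order(["a", "b", "c"]))
```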

prod write access to DB^app server@@

Q: Is production write access more dangerous in DB or app server?
A: I would say app server, since a bad software update can wipe out production data in unnoticeable ways. It could be a small subset of the data and unnoticeable for a few days.

It’s usually impractical to log every database write. Such logging would slow down the live system and take up too much disk space, so it’s generally seen as unnecessary.

However, tape backup is “protected” from unauthorized writes. It is usually not writable by the app server. There’s a separate process and separate permission to create/delete backup tapes.

eg@ JGC-JNI interaction: String

First, let’s remember Java string objects are collected by GC. In fact, String.java instances are often the biggest group of “unreachable” objects in a GC iteration.

I guess interned string objects are treated differently … need more research just for QQ interviews.

GC thread has builtin safety checks to cooperate with application threads to avoid collecting a string.

However, JNI code is not 100% compatible with GC. Example from [[Core java]]:

Suppose a JNI C function instantiates a jstring object in the jvm heap (not in native memory), to be deallocated by GC.

Note the GC doesn’t know when the native code is done with this object. To inform the JVM, the JNI function needs to call ReleaseStringUTFChars(), which pairs with an earlier GetStringUTFChars() call on the same jstring.

%%geek profile cf 200x era, thanks2tsn

Until my early 30’s I was determined to stick to perl, php, javascript, mysql, http [2] … the lighter, more modern technologies and avoided [1] the traditional enterprise technologies like java/c++/c#/SQL/MOM/Corba . As a result, my rating in the “body-building contest” was rather low.

Like assembly programming, I thought the “hard” (hardware-friendly) languages were giving way to easier, “productivity” languages in the Internet era. Who would care about a few microsec? Wrong…. The harder languages still dominate high-end jobs.


* An electronics engineering graduate stuck in a small, unsuccessful wafer fab
* An uneducated pretty girl unable to speak well, dress well.

Today (2017) my resume features java/c++/py + algo trading, quant, latency … and I have some accumulated insight on core c++/c#, SQL, sockets, connectivity, ..

[1] See also fear@large codebase
[2] To my surprise, some of these lighter technologies became enterprise —

  1. linux
  2. python
  3. javascript GUI
  4. http intranet apps

identityHashCode,minimum object size,relocation by JGC

https://srvaroa.github.io/jvm/java/openjdk/biased-locking/2017/01/30/hashCode.html offers a few “halo” knowledge pearls

  • every single java Object must always give an idHashcode on-demand, even if its host class has hashCode() method overridden to return a hard-coded 55.
    • hashcode() doesn’t overshadow idHashcode
  • The “contract” says an object’s idHashcode number must never [2] change, in the face of object relocations. So it’s not really computed based on address. Once someone requests the idHashCode number (like 4049040490), this number must be retained somewhere in object, as per the “contract”. It is retained in the 12-byte object header. (8-byte for a 32-bit JVM)
    • Therefore, the idHashcode contributes to the minimum size of java objects.
  • contrary to common belief, the idHashcode can clash between two objects, so idHashcode is a misnomer, more “hashcode” and not “identity”. https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6321873 explains there are insufficient integer values given the maximum object count.
  • Note anyone can call the hashcode() method on this same object and it could be overridden to bypass the idHashcode.
  • [2] in contrast a custom hashcode() can change its value when object state changes.

DoneForDay report message

https://www.onixs.biz/fix-dictionary/4.2/app_d.html shows many comparable scenarios. D2 is a typical usage of DFD

I think a DFD report is sent (from venue to trader) for each partially filled or unfilled but closed order. The exchange sends an exec report with 39=3 (OrdStatus = Done for day) to indicate DFD, i.e. the order is not completed but will get no more fills.

Someone said online — DFD (done for day) is a concept that applies to an order, not a stock.
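To make the DFD check concrete, here is a rough sketch that parses a tag=value FIX fragment and tests for OrdStatus (tag 39) = 3, which is the Done-for-day value in FIX 4.2. The sample message is invented for illustration (ORD123, IBM and the quantities are not from any real session), and it omits the body length and checksum that a real message carries.

```python
SOH = "\x01"          # standard FIX field delimiter
DONE_FOR_DAY = "3"    # FIX 4.2 OrdStatus (tag 39) value for Done for day

def parse_fix(raw):
    """Split a tag=value FIX string into a {tag: value} dict."""
    fields = {}
    for pair in raw.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields

def is_done_for_day(fields):
    """True for an execution report (35=8) whose OrdStatus is Done for day."""
    return fields.get(35) == "8" and fields.get(39) == DONE_FOR_DAY

# Hypothetical fragment: a 1000-share order with 400 filled, closed for the day.
SAMPLE = SOH.join(["8=FIX.4.2", "35=8", "11=ORD123", "55=IBM",
                   "38=1000", "14=400", "39=3", "150=3"]) + SOH

if __name__ == "__main__":
    print(is_done_for_day(parse_fix(SAMPLE)))
```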


c++dtor^python finalizer^java finalizer

This blog has enough posts on c# finalizers. See also AutoCloseable^Closeable #java

The python finalizer is the special instance method object.__del__(self), invoked when the object is about to be destroyed, which in CPython usually means when its reference count drops to zero. The timing is not guaranteed, so it’s not useful for resource management, which is better done with a context manager, a popular python idiom and the best-known “protocol” in python.
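A minimal sketch of the context manager protocol, to show why it suits resource management better than a finalizer: the cleanup in `__exit__` runs at a known point, even when the body raises. The `Resource` class and the "feed" name are hypothetical.

```python
class Resource:
    """Toy resource that records open/close events in a shared log."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append(f"open {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs deterministically on leaving the with-block, error or not.
        self.log.append(f"close {self.name}")
        return False   # do not swallow the exception

def use_resource(log):
    try:
        with Resource("feed", log) as res:
            log.append(f"use {res.name}")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        log.append("handled")
    return log

if __name__ == "__main__":
    print(use_resource([]))
```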

— Java finalizer is an important QQ topic

https://stackoverflow.com/questions/2506488/when-is-the-finalize-method-called-in-java has some highly voted summaries.

The finalize() method can be called at any time after the object has become eligible for garbage collection, possibly never.

The finalize() method should only be written for cleanup of (usually non-Java) resources like closing files. [[effJava]] says avoid it.

https://www.infoq.com/articles/Finalize-Exiting-Java compares

  1. c++RAII
  2. java finalize()
  3. java 7 try-with-resources

how could jvm surpass c++latency

A Shanghai Morgan Stanley interviewer asked in a 2017 java interview — “How could jvm surpass c++ latency?”

— One reason — JIT compiler could aggressively compile bytecode into machine code with speedy shortcuts for the “normal” code path + special code path to handle the abnormal conditions.

JIT to avoid vtable latency #Martin is a prime example.

Priming is tricky in practice — https://www.theserverside.com/tip/Avoid-JVM-de-optimization-Get-your-Java-apps-runnings-fast-right-fromt-the-start highlights pitfalls of priming in the trading context. Some take-aways:
1. Optimizing two paths rather than just one path
2. Reusing successful optimization patterns from one day to the next, using historical data

— One hypothesis — no free() or delete() in java, so the memory manager doesn’t need to handle reclaiming and reusing the memory. [[optimizedC++]] P333 confirmed the c++ mem mgr does that. See [1]

One answer at https://stackoverflow.com/questions/1984856/java-runtime-performance-vs-native-c-c-code (the author is not a published expert) says:
On average, a garbage collector is far faster than manual memory management, for many reasons:
• on a managed heap, dynamic allocations can be done much faster than the classic heap
• shared ownership can be handled with negligible amortized cost, where in a native language you’d have to use reference counting which is awfully expensive
• in some (possibly rare and contrived) cases, object destruction is vastly simplified as well (Most Java objects can be reclaimed just by GC’ing the memory block. In C++ destructors must always be executed)

— One hypothesis — new() is faster in jvm than c++. See [1]

Someone said “Object instantiation is indeed extremely fast. Because of the way that new objects are allocated sequentially in memory, it often requires little more than one pointer addition, which is certainly faster than typical C++ heap allocation algorithms.”
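The “one pointer addition” idea in the quote can be pictured with a toy bump allocator. This is only a model of sequential allocation, not how any particular JVM is implemented, and the class name `BumpAllocator` is made up.

```python
class BumpAllocator:
    """Toy model: hand out consecutive offsets from one contiguous region."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.top = 0          # next free offset

    def allocate(self, size):
        if self.top + size > self.capacity:
            # A real managed heap would trigger a collection at this point.
            raise MemoryError("region full")
        offset = self.top
        self.top += size      # the single "pointer addition"
        return offset

if __name__ == "__main__":
    arena = BumpAllocator(64)
    print(arena.allocate(16), arena.allocate(24), arena.top)
```

Nothing is ever freed individually in this model, which is the other half of the hypothesis: without free() or delete there is no free-list to search or maintain.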

[1] my blogpost java allocation Can beat c++

Julia, Go and Lua come close to C, and occasionally beat it, in some micro-benchmarks .. https://julialang.org/benchmarks/

http://www.javaworld.com/article/2076593/performance-tests-show-java-as-fast-as-c–.html is a 1998 study.

In my GS-PWM days, a colleague circulated a publication claiming java could match C in performance, but didn’t say “surpass”.

Alien dictionary


Suppose Total C characters, and N words


Mostly implementation challenge.

insight: the published solution has mediocre performance as it scans each word exactly TWICE, but luckily “twice” doesn’t affect the big-O, which is O(total char count across all words)

— idea 1: maintain a linked list of “clusters”. Each cluster is {pos, startWordID, optional lastWordID} Each cluster has words with the same prefix up to pos.

Copy the first letter of the N words into an N-array. Verify this array is sorted. Now separate the words into up to 26 clusters. Suppose we have a cluster of 55 words. This cluster is the payload of a link node. When we look at the 2nd char within this cluster, we see up to 26 sub-clusters, so we replace the big cluster with these sub-clusters.

Invariant — the original order among the N words is never changed.

Even if this idea is overkill, it can be useful in other problems.

the verify(arrayOf1stChar) is a util function.

— Idea 4: convert each word to an English word, in O(C).

Then sort the collection. What’s the O()? O(N logN C/N) = O(C logN)

— idea 5: compute a score for each word and check the words are sorted in O(N)
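A rough sketch of Idea 4, assuming the problem is the “verify” variant where the alien alphabet order is given as a 26-letter string (the post does not spell out the problem statement, so this is an assumption). Since we only need to verify the order, the sketch checks adjacent pairs after translation instead of sorting. Function names are hypothetical.

```python
def to_english(word, rank):
    """Translate an alien word into the English word with the same ranks."""
    return "".join(rank[ch] for ch in word)

def is_alien_sorted(words, order):
    """True if words are in non-decreasing order under the alien alphabet."""
    # Map the i-th alien letter to the i-th English letter.
    rank = {ch: chr(ord("a") + i) for i, ch in enumerate(order)}
    translated = [to_english(w, rank) for w in words]      # O(C) in total
    # Plain string comparison now matches the alien ordering.
    return all(a <= b for a, b in zip(translated, translated[1:]))

if __name__ == "__main__":
    print(is_alien_sorted(["hello", "leetcode"], "hlabcdefgijkmnopqrstuvwxyz"))
```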

O(1)getRandom+add+del on Bag #Rahul

Q: create an unordered multiset (a Bag) with O(1) add(Item), del(Item) and a getRandom() that returns each item with a probability based on the PMF, i.e. proportional to how many copies of it are in the Bag.

Rahul posed this question to our Princeton PhD candidate, who needed some help on the Bag version.

====my solution:
On the spot, I designed a vector<Item> + hashmap<Item, hashset<Pos>>. The hashset records the positions within the vector.

Aha — Invariant — My vector will be designed to have no empty slots even after many del(). Therefore vec[ random() * vec.size() ] will satisfy getRandom() PMF.

add() would simply (in O(1)) append to the vector, and to the hashset.

— del() algo is tricky, as Rahul and I agreed. Here’s an illustration: Let’s say Item ‘A’ appears at positions 3,7,8,11,16 and B appears at positions 2,5,31 (the last in the vector). del(A) needs to remove one of the A’s and move the B@31 into that A’s position.

  1. Suppose the PMF engine picks vec[11], which is an A.
  2. Unconditionally O(1) find the item at the last position in the vector. We find a B, which is different from our ‘A’.
  3. Here’s how to physically remove the A from position 11:
    • O(1) replace ‘A’ with ‘B’ at position 11 in the vector
    • O(1) remove 11 from A’s hashset and add 11 into B’s hashset, so A’s hashset size decrements.
    • O(1) remove 31 from B’s hashset, so B’s hashset size remains unchanged
    • O(1) pop the now-redundant last slot (position 31) off the vector, so the vector stays free of empty slots
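The vector-plus-hashmap design above can be sketched as follows. This is my own reading of the design rather than a reference solution. The class name `RandomBag` is hypothetical, a python list stands in for the vector, and the O(1) figures are average-case as usual for hash containers. The sketch also covers the corner case where the last item in the vector is another copy of the item being deleted.

```python
import random
from collections import defaultdict

class RandomBag:
    """Multiset with average O(1) add, delete and frequency-weighted get_random."""

    def __init__(self):
        self._items = []                      # dense vector, no empty slots
        self._positions = defaultdict(set)    # item -> positions in the vector

    def __len__(self):
        return len(self._items)

    def add(self, item):
        self._positions[item].add(len(self._items))
        self._items.append(item)

    def delete(self, item):
        positions = self._positions.get(item)
        if not positions:
            return False
        hole = positions.pop()                # any one copy of the item
        last_pos = len(self._items) - 1
        last_item = self._items[last_pos]
        if hole != last_pos:
            # Move the last item into the hole, then fix its position set.
            self._items[hole] = last_item
            self._positions[last_item].add(hole)
            self._positions[last_item].discard(last_pos)
        self._items.pop()                     # drop the last slot
        if not self._positions[item]:
            del self._positions[item]
        return True

    def get_random(self):
        # Uniform over slots, hence proportional to each item's count.
        return random.choice(self._items)
```

Because the vector never has holes, picking a uniformly random slot is enough to satisfy the PMF requirement.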

##3 java topics I can build zbs #concurrency

C++ has many zbs topics with accu + job market relevance, but I won’t list them here. In contrast, java has very few qualifying topics —

  • [b] concurrency engineering — low-level techniques + high-level designs.
  • latency improvement to c++level — Accumulated experience can become outdated but relatively long shelf-life
  • [b] GC and JIT — Accumulated experience soon becomes outdated
  • –non-ideal choices
  • large java app tuning? can’t read. High churn. Accumulated experience can become outdated
  • [b] reflection? Powerful but seldom quizzed. Low mkt value
  • [b] java5 generics topics as in my book? low industry adoption. Seldom quizzed.
  • jvm source code? never required in IV
  • [b=luckily there are books. I’m good at reading those books.]

xtap check`if UDP channel=healthy #CSY

xtap library needs reliable ways to check if a “connectivity” is down

UDP (including multicast) is a connectionless protocol; TCP is connection-oriented. Therefore the xtap “connection” class cannot be used for multicast channels. Multicast Channel is like a TV-channel (NYSE terminology).

–UDP is connectionless, has no session, no virtual circuit, so no “established” state. So how do we know the exchange server is dead or just quiet?

After discussing with CSY, I feel UDP unicast or UDP multicast can’t tell me.

heartbeat — I think we must rely on the heartbeat. I actually helped create an inactivity timeout alert precisely because in a multicast channel, we don’t know if exchange is down or just quiet.
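An inactivity timeout of this kind can be sketched in a few lines. The class below is a hypothetical illustration, not the xtap code: the receive loop calls `on_message()` for every packet (heartbeat or data), and a timer thread polls `is_silent_too_long()` to raise the alert.

```python
import time

class InactivityMonitor:
    """Flags a channel that has been silent longer than timeout_sec."""

    def __init__(self, timeout_sec, clock=time.monotonic):
        self.timeout_sec = timeout_sec
        self.clock = clock                # injectable for testing
        self.last_seen = clock()

    def on_message(self):
        """Call for every packet received, including heartbeats."""
        self.last_seen = self.clock()

    def is_silent_too_long(self):
        return self.clock() - self.last_seen > self.timeout_sec
```

The alert cannot distinguish a dead exchange from a broken network path, but with heartbeats on the channel it can at least distinguish “down” from “just quiet”.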

–TCP should be easier.

Many online resources cover this, such as https://stackoverflow.com/questions/4142012/how-to-find-the-socket-connection-state-in-c.

According to CSY, as a receiver, we don’t need to send a probe message. If the exchange has closed the TCP socket, the four-way handshake would have informed my socket that the connection is closed. So my select() would say my TCP socket is ready for reading. When I read it I would get 0 bytes.
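The behaviour CSY describes can be reproduced on loopback. In this sketch a local listener plays the exchange and closes its end. The function name and the 2-second timeout are arbitrary choices for the demo.

```python
import select
import socket

def observe_peer_close(timeout_sec=2.0):
    """Return (readable, data) as seen by a client after its peer closes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))           # any free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.close()                        # the "exchange" closes its end
    # select() reports the client socket as readable once the close arrives.
    readable, _, _ = select.select([client], [], [], timeout_sec)
    data = client.recv(1024) if readable else None
    client.close()
    listener.close()
    return bool(readable), data

if __name__ == "__main__":
    print(observe_peer_close())
```

An empty read on a socket that select() reported as readable is the signal that the peer has closed the connection.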

I believe there’s a one-to-one mapping between a socket and a “connection” for TCP only


strong coreJava candd are most mobile

Jxee guys face real barriers when breaking into core java. Same /scene/ when a c++ interviewer /grills/ a java candidate. These interviewers view their technical context as a level deeper and (for the skill level) a level higher. I have seen and noticed this perception many times.

Things are different in reverse: core java guys can move into jxee with relatively small effort. When jxee positions grow fast, the hiring teams often lower their requirements and take in core java guys. Fundamentally (as some would agree with me) the jxee add-on packages are not hard; otherwise they wouldn’t be popular.

Luckily for java guys, Javaland now has the most jobs and often well-paying jobs but ..

  1. c# guys can’t move into the huge java market. Ellen has seen many such cases.
  2. c++ is a harder language than java, but a disproportionate percentage (80%?) of c++ guys face real entry barrier when breaking into javaland.
  3. I wonder why..
  • I have heard of many c++ and c# veterans complain about the huge ecosystem in javaland.
  • I have spoken to several c++/c# friends. I think they don’t sink their teeth in java. Also a typical guy doesn’t want to abandon his traditional stronghold, even though in reality the stronghold is shrinking.
  • Age is a key factor. After you have gone through not only many years but also a tough accumulation phase on a steep learning curve, you are possibly afraid of going through the same again.

[20] java≠a natural choice 4 latency #DQH

I think java could deliver similar latency numbers to c/c++, but the essential techniques are probably unnatural to java:

  • STM — Really low latency systems should use single-threaded mode. STM is widely used and well proven. Concurrency is the biggest advantage of java but unfortunately not effective in serious latency engineering.
  • DAM — (dynamically allocated memory) needs strict control, but DAM usage permeates mainstream java.
  • arrays: Latency engineering favors contiguous data structures i.e. arrays, rather than object graphs including hash tables, lists, trees, or arrays of heap pointers. C pointers were designed based on tight integration with arrays, and subsequent languages have all moved away from arrays. Programming with raw arrays in java is unnatural.
    • struct: Data structures in C have a second dimension besides arrays, namely structs. Like arrays, structs are very compact, wasting no memory, and can live on heap or non-heap. In java, this would translate to a class with only primitive fields. Such a class is unnatural in java.
  • GC — Low latency doesn’t like a garbage collector thread that can relocate objects. I don’t feel confident discussing this topic, but I feel GC is a handicap in the latency race. Suppressing GC is unnatural for a GC language like java.

My friend Qihao commented —

There are more management barriers than technical barriers towards low latency java. One common example is with “suppressing gc is unnatural”.

WallSt contract as fallback career plan: 2019summary

After talking to you and a few friends, now I feel Wall St contract job market is a comfortable, lucrative fallback career plan for me. (See [19] y WallSt_contract=my best Arena #Grandpa) The alternatives are:

  • ibank VP jobs — I know many people in these roles including our friend Youwei. Looks too stressful.
  • startups? No real experience but probably more stressful
  • web2.0 shops — (Goog, FB etc) I assume the expectation will be too high esp. for older guys like me
  • Singapore jobs — mostly as stressful as ibank VP jobs.

tech firms value motivation in coding drill

I agree with you that to pass FB/Goog/indeed/twitter interviews, we all need to practice for months.

We also need to study the problem patterns during our practice. “Study” is a euphemism for a hard, long struggle. If we don’t put in a hell of a lot of focused energy, we can’t absorb the amount of variations and *recognize* the patterns, which is what the Chinese saying means by studying a thick book until it becomes thin. Without learning the patterns, we will be overwhelmed by the sheer amount of variations in the problems, and forget a lot, like the bear in the Chinese fable who drops each corn cob as he picks the next.

Q: Why the hell do these firms like to hire coders who practice so hard?
A: My hypothesis:

  • such a coder is typically young — with plenty of spare energy and spare time. An older techie with kids is unlikely to endure such heavy coding drill. We lack spare energy and spare time.
  • such a coder has perseverance and dedication — hard-driving, hard-working, focused, achievement-oriented and determined
  • such a coder is able to tolerate repetitive, boring, lonely tasks, and even enjoy the boredom. I call it absorbency capacity, or the ability to endure solitude. This is not only an attitude but an aptitude. My son currently lacks this capacity, but my dad, my daughter and I have this capacity, to varying degrees.
  • such a coder is able to pay attention to details — Getting a program to work on a range of corner cases requires attention to details.
  • such a coder has a quick analytical mind for abstract logical problem solving

In contrast, financial firms target slightly different skills. Their coding questions test language-specific efficiency, language features, or pseudo-code algorithm. Tech firms don’t test these.