UDP: send/receive buffers are configurable #CSY

It’s wrong to say UDP uses a small receive buffer but no send buffer.

Receive — https://access.redhat.com/documentation/en-US/JBoss_Enterprise_Web_Platform/5/html/Administration_And_Configuration_Guide/jgroups-perf-udpbuffer.html shows how to increase the UDP receive buffer to 25MB.

Send — https://stackoverflow.com/questions/2031109/understanding-set-getsockopt-so-sndbuf shows you CAN configure the send buffer on a UDP socket.

Send — https://www.ibm.com/support/knowledgecenter/en/SSB23S_1.1.0.15/gtpc2/cpp_sendto.html also confirms the SO_SNDBUF send-buffer size applies to UDP sockets.

In addition, the application is free to use its own custom buffer, in the form of a vector for example.
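For illustration, a minimal Java sketch .. the links above use setsockopt / JGroups config, but the same SO_RCVBUF/SO_SNDBUF options surface as DatagramSocket setters (the kernel may clamp the requested sizes):

```java
import java.net.DatagramSocket;
import java.net.SocketException;

public class UdpBufferDemo {
    public static void main(String[] args) throws SocketException {
        DatagramSocket sock = new DatagramSocket();
        // These are requests, not guarantees -- the kernel may clamp them
        // (e.g. per net.core.rmem_max / net.core.wmem_max on Linux).
        sock.setReceiveBufferSize(25 * 1024 * 1024); // ~25MB receive buffer, as in the JBoss guide
        sock.setSendBufferSize(1024 * 1024);         // 1MB send buffer .. yes, UDP has one too
        System.out.println("SO_RCVBUF = " + sock.getReceiveBufferSize());
        System.out.println("SO_SNDBUF = " + sock.getSendBufferSize());
        sock.close();
    }
}
```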


handle java OOM @runtime

https://stackoverflow.com/questions/2679330/catching-java-lang-outofmemoryerror says "only very infrequently is OutOfMemoryError the death-knell to a JVM. There is only one good reason to catch an OutOfMemoryError and that is to close down gracefully, cleanly releasing resources and logging the reason for the failure best you can (if it is still possible to do so)."

Note that in a Docker container, an OOM brings down not only the JVM but the entire container.
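A minimal sketch of the “close down gracefully” pattern (run with a small heap such as -Xmx64m to see it trigger):

```java
import java.util.ArrayList;
import java.util.List;

public class OomHandlerDemo {
    public static void main(String[] args) {
        try {
            List<long[]> hog = new ArrayList<>();
            while (true) {
                hog.add(new long[1_000_000]); // deliberately exhaust the heap
            }
        } catch (OutOfMemoryError oom) {
            // The only good reason to catch OOM: log and shut down cleanly.
            // 'hog' is scoped to the try block, so it is unreachable here and
            // the GC can usually free enough memory for this last gasp.
            System.err.println("OutOfMemoryError caught, shutting down: " + oom);
            System.exit(1);
        }
    }
}
```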

low realized roti..cos I moved on quickly #MOM,c#

See the meaning of move-on

Analogy — (Trevor Tang) picking a major in medicine is a wrong bet if you are into technology, though medicine is a great major.

  • MOM roti? low but could go higher if I pursue an architect career in non-eq trading. The underlying tech is very flexible and mature and a winning bet
  • SQL tuning + complex query — roti can go higher if I go into big data or data analysis domains. The underlying tech is powerful, mature, adaptable and not dead. I think it will see a renaissance.
  • swing roti? very low but could be higher if I pursue a GUI career in dotnet. There is a lot of highly inheritable, portable skill even if swing goes out
  • drv pricing roti? can go higher if I stick to a quant-dev career and move only within similar roles. However, if you move from volatility to yield curve to exotics, the accu becomes very thin

lower workload ⇏ quality free time #family

Lower workload CAN mean

… more time for kids + workout, but can also mean

… more time wasted .. burn/rot

The free time saved due to lower workload is often spent on reflective blogging… controversial

A great example of quality free time is the Bayonne -> MS commute in 2018/2019. For a few weeks I was able to do coding drills on the commute, despite it being segmented. For a few months I was doing /productive/ git-blogging.

With queues, say pop/append !! enqueue/dequeue

The standard phrases “enqueued/enqueuing, dequeued/dequeuing” are sometimes too long and less readable. Some texts actually use the shorter words —

  1. pop, popped, popping .. for dequeue
    • This word is used in C++ std::list::pop_front and pop_back
    • This word is used in python list.pop() and python collections.deque.popleft()
  2. append, appending, appended .. for enqueue.
    • This word is more direct, more visual.
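Java’s Deque is in the same spirit .. the method names are addLast/pollFirst (or offer/poll), not enqueue/dequeue. A tiny sketch:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class QueueNamingDemo {
    public static void main(String[] args) {
        Deque<String> q = new ArrayDeque<>();
        q.addLast("a");              // "append" .. i.e. enqueue at the tail
        q.addLast("b");
        String head = q.pollFirst(); // "pop" .. i.e. dequeue from the head
        System.out.println(head + " dequeued, remaining: " + q); // a dequeued, remaining: [b]
    }
}
```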

eg@ JGC-JNI interaction: String

First, let’s remember Java string objects are collected by GC. In fact, String.java instances are often the biggest group of “unreachable” objects in a GC iteration.

I guess interned string objects are treated differently … need more research just for QQ interviews.

The GC thread has built-in safety checks to cooperate with application threads, to avoid collecting a string that is still in use.

However, JNI code is not 100% compatible with GC. Example from [[Core java]]:

Suppose a JNI C function instantiates a jstring object in the JVM heap (not in native memory), to be deallocated eventually by the GC.

Note the GC doesn’t know when this object becomes unreachable. To inform the GC, the JNI function needs to call ReleaseStringUTFChars().

%%geek profile cf 200x era, thanks2tsn

Until my early 30’s I was determined to stick to perl, php, javascript, mysql, http [2] … the newer, lighter technologies, and avoided [1] mainstream enterprise technologies like java/c++/c#/SQL/MOM/Corba, so my rating in the “body-building contest” was rather low.

I thought the “hard” (closer-to-hardware) languages would go the way of assembly programming, giving way to easier “productivity” languages in the internet era. Who would care about a few microseconds? Wrong…. The harder languages still dominate high-end jobs.

Analogy?

* An electronics engineering graduate stuck in a small, unsuccessful wafer fab
* An uneducated pretty girl unable to speak well, dress well.

Now my resume features java/c++/py + algo trading, quant, latency … and I have some accumulated insight on core c++/c#, SQL, sockets, swing, ..

[1] See also fear@large codebase
[2] To my surprise, some of these lighter technologies became enterprise —

  1. linux
  2. http intranet apps

identityHashCode, minimum object size, relocation by JGC

https://srvaroa.github.io/jvm/java/openjdk/biased-locking/2017/01/30/hashCode.html offers a few “halo” knowledge pearls

  • every single java Object must always give an idHashcode on-demand, even if its host class has its hashCode() method overridden to return a hard-coded 55 (see the sketch after this list).
    • hashCode() doesn’t overshadow idHashcode
  • The “contract” says an object’s idHashcode number must never [2] change, even in the face of object relocations. So it’s not really computed from the address. Once someone requests the idHashCode number (like 4049040490), this number must be retained somewhere in the object, as per the “contract”. It is retained in the 12-byte object header (8 bytes on a 32-bit JVM).
    • Therefore, the idHashcode contributes to the minimum size of java objects.
  • contrary to common belief, the idHashcode can clash between two objects, so “idHashcode” is a bit of a misnomer: more “hashcode” than “identity”. https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6321873 explains there are insufficient integer values given the maximum object count.
  • Note anyone can call the hashCode() method on this same object, and it could be overridden to bypass the idHashcode.
  • [2] in contrast a custom hashcode() can change its value when object state changes.
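A small sketch of the first two points above (a hard-coded hashCode() override does not overshadow the identity hashcode):

```java
public class IdHashDemo {
    static class FiftyFive {
        @Override
        public int hashCode() { return 55; } // hard-coded override
    }

    public static void main(String[] args) {
        FiftyFive obj = new FiftyFive();
        // The override wins for hashCode(), but the idHashcode is still available
        // on demand, bypasses the override, and stays constant for the object's
        // lifetime even if the GC relocates it.
        System.out.println("obj.hashCode()               = " + obj.hashCode());               // 55
        System.out.println("System.identityHashCode(obj) = " + System.identityHashCode(obj)); // JVM-assigned
    }
}
```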

c++dtor^python finalizer^java finalizer

This blog has enough posts on c# finalizers. See also AutoCloseable^Closeable #java

The python finalizer is the special method object.__del__(self), invoked when the object is about to be destroyed (its reference count drops to zero, or the cyclic garbage collector reclaims it). As such, it’s not reliable for resource management, which is better done with a context manager, a popular python idiom and the best-known “protocol” in python.

— Java finalizer is an important QQ topic

https://stackoverflow.com/questions/2506488/when-is-the-finalize-method-called-in-java has some highly voted summaries.

The finalize() method can be called at any time after the object has become eligible for garbage collection, possibly never.

The finalize() method should only be written for cleanup of (usually non-Java) resources, like closing files. [[effJava]] says avoid it.

https://www.infoq.com/articles/Finalize-Exiting-Java compares

  1. c++RAII
  2. java finalize()
  3. java 7 try-with-resources
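For contrast with finalize(), a minimal try-with-resources sketch (the TempResource class is hypothetical, standing in for any resource holder):

```java
public class TryWithResourcesDemo {
    // Hypothetical resource holder, for illustration only
    static class TempResource implements AutoCloseable {
        TempResource() { System.out.println("opened"); }
        @Override
        public void close() { System.out.println("closed deterministically"); } // runs promptly, unlike finalize()
    }

    public static void main(String[] args) {
        try (TempResource r = new TempResource()) {
            System.out.println("using the resource");
        } // close() is guaranteed to run here, even if an exception was thrown
    }
}
```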

how could jvm surpass c++]latency #MS

A Shanghai Morgan Stanley interviewer asked in a 2017 java interview — “How could jvm surpass c++ latency?”

  • One reason — the JIT compiler can aggressively compile bytecode into machine code, with speedy shortcuts for the “normal” code path plus a special code path to handle the abnormal conditions.

P76 [[javaPerf]] described a nifty JIT technique to avoid the runtime cost of dynamic binding of the virtual function equals(). Suppose in some class, we call obj1.equals(obj2).

After a priming (i.e. warm-up) period, the JIT collects enough statistics to see that every dynamic dispatch at this site is calling String.equals(), so the JIT decides to turn it into faster “static binding”: the String.equals() function address is hardwired into the assembly code (not JVM bytecode). The JIT also needs to handle the possibility of, say, Character.equals(). I guess the assembly code can detect that obj1/obj2 is not a String.java instance and retry the virtual function lookup. The JIT can generate assembly code to
1. call String.equals() and go ahead to compare some field of obj1 and obj2.
2. if no such field, then obj1 is not String, then backtrack and use obj1 vtable to look up the virtual function obj1.equals()

It may turn out that 99.9% of the time we can skip the time-consuming Step 2 :)
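Here is that guess written out as Java-level pseudocode, using an explicit class check as the guard. It is a sketch only; the real JIT emits the equivalent directly in machine code after profiling the call site:

```java
public class DevirtSketch {
    // Conceptual sketch, not real HotSpot output. Profiling says obj1 is almost
    // always a String at this call site, so emit a cheap type guard plus a
    // static-bound (and possibly inlined) call.
    static boolean guardedEquals(Object obj1, Object obj2) {
        if (obj1.getClass() == String.class) {
            return ((String) obj1).equals(obj2); // Step 1: static binding, no vtable lookup
        }
        return obj1.equals(obj2);                // Step 2: rare fallback to virtual dispatch
    }

    public static void main(String[] args) {
        System.out.println(guardedEquals("abc", "abc")); // fast path
        System.out.println(guardedEquals('a', 'b'));     // fallback path (Character)
    }
}
```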

Priming note — priming is tricky — https://www.theserverside.com/tip/Avoid-JVM-de-optimization-Get-your-Java-apps-runnings-fast-right-fromt-the-start highlights pitfalls of priming in the trading context. Some take-aways:
1. Optimizing two paths rather than just one path
2. Reusing successful optimization patterns from one day to the next, using historical data

  • One hypothesis — there is no free() or delete() in java, so the memory manager doesn’t need to handle reclaiming and reusing individual blocks of memory. [[optimizedC++]] P333 confirmed the c++ memory manager does exactly that. Instead, the GC uses a very different algorithm.

One answer at https://stackoverflow.com/questions/1984856/java-runtime-performance-vs-native-c-c-code is not from a published expert, but it says —
On average, a garbage collector is far faster than manual memory management, for many reasons:
• on a managed heap, dynamic allocations can be done much faster than the classic heap
• shared ownership can be handled with negligible amortized cost, where in a native language you’d have to use reference counting which is awfully expensive
• in some (possibly rare and contrived) cases, object destruction is vastly simplified as well (Most Java objects can be reclaimed just by GC’ing the memory block. In C++ destructors must always be executed)

  • One hypothesis — new() is faster in jvm than c++.

Someone said “Object instantiation is indeed extremely fast. Because of the way that new objects are allocated sequentially in memory, it often requires little more than one pointer addition, which is certainly faster than typical C++ heap allocation algorithms.”
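A toy sketch of that “one pointer addition” idea, loosely modeled on a thread-local allocation buffer (TLAB). Real JVM allocation happens in native code, so everything here (names, sizes) is illustrative only:

```java
public class BumpAllocatorSketch {
    private long top;       // next free offset inside the buffer
    private final long end; // end of the buffer

    BumpAllocatorSketch(long capacity) { this.top = 0; this.end = capacity; }

    /** Returns the offset of the new "object", or -1 if the buffer is full
     *  (a real JVM would grab a new TLAB or trigger GC at that point). */
    long allocate(long size) {
        if (top + size > end) return -1;
        long obj = top;
        top += size;        // the fast path really is just a pointer bump
        return obj;
    }

    public static void main(String[] args) {
        BumpAllocatorSketch tlab = new BumpAllocatorSketch(1024);
        System.out.println(tlab.allocate(16)); // 0
        System.out.println(tlab.allocate(24)); // 16
        System.out.println(tlab.allocate(32)); // 40
    }
}
```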


Julia, Go and Lua often beat C in benchmark tests .. https://julialang.org/benchmarks/

http://www.javaworld.com/article/2076593/performance-tests-show-java-as-fast-as-c–.html is a 1998 study.

In my GS-PWM days, a colleague circulated a publication claiming java could match C in performance, but didn’t say “surpass”.

##3 java topics I can build zbs #concurrency

C++ has many zbs topics with accu + job market relevance, but I won’t list them here. In contrast, java has very few qualifying topics —

  • [b] concurrency engineering — low-level techniques + high-level designs.
  • latency improvement to c++level — Accumulated experience can become outdated but relatively long shelf-life
  • [b] GC and JIT — Accumulated experience soon becomes outdated
  • –non-ideal choices
  • large java app tuning? can’t read. High churn. Accumulated experience can become outdated
  • [b] reflection? Powerful but seldom quizzed. Low mkt value
  • [b] java5 generics topics as in my book? low industry adoption. Seldom quizzed.
  • jvm source code? never required in IV
  • [b=luckily there are books. I’m good at reading those books.]

xtap check`if UDP channel=healthy #CSY

The xtap library needs reliable ways to check whether connectivity is down.

UDP (including multicast) is a connectionless protocol; TCP is connection-oriented. Therefore the xtap “connection” class cannot be used for multicast channels. A multicast channel is like a TV channel (NYSE terminology).

–UDP is connectionless, with no session and no virtual circuit, so there is no “established” state. So how do we know whether the exchange server is dead or just quiet?

After discussing with CSY, I feel neither UDP unicast nor UDP multicast can tell me.

heartbeat — I think we must rely on heartbeats. I actually helped create an inactivity-timeout alert precisely because on a multicast channel, we don’t know whether the exchange is down or just quiet.
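A sketch of that inactivity-timeout idea (all the names and the 5s threshold are my own, not xtap’s):

```java
public class InactivityMonitorSketch {
    private static final long THRESHOLD_NANOS = 5_000_000_000L; // e.g. alert after 5s of silence
    private volatile long lastPacketNanos = System.nanoTime();

    /** Call from the receive loop whenever a packet (or heartbeat) arrives on the channel. */
    void onPacket() { lastPacketNanos = System.nanoTime(); }

    /** Poll from a timer thread; true means the channel has been quiet "too long",
     *  which could mean the exchange is down .. or just quiet. */
    boolean quietTooLong() {
        return System.nanoTime() - lastPacketNanos > THRESHOLD_NANOS;
    }
}
```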

–TCP should be easier.

Many online resources discuss this, such as https://stackoverflow.com/questions/4142012/how-to-find-the-socket-connection-state-in-c.

According to CSY, as a receiver we don’t need to send a probe message. If the exchange has closed the TCP socket, the four-way handshake would have informed my socket that the connection is closed, so my select() would report my TCP socket as ready for reading, and when I read it I would get 0 bytes.
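In Java the same end-of-stream signal shows up as read() returning -1 rather than 0 bytes. A minimal sketch (not the xtap code) of checking a selected channel:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class TcpCloseCheckSketch {
    /** Returns false if the peer has closed the connection. Call after select()
     *  reports the channel readable. (In C the equivalent signal is recv() returning 0.) */
    static boolean stillOpen(SocketChannel channel, ByteBuffer buf) throws IOException {
        int n = channel.read(buf);
        if (n == -1) {          // orderly close (FIN) from the peer
            channel.close();
            return false;
        }
        return true;            // n >= 0: real data, or nothing yet on a non-blocking channel
    }
}
```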

I believe there’s a one-to-one mapping between a socket and a “connection” for TCP only


strong coreJava candd are most mobile

Jxee guys face real barriers when breaking into core java. Same /scene/ when a c++ interviewer /grills/ a java candidate. These interviewers view their technical context as a level deeper and (for the skill level) a level higher. I have noticed this perception many times.

Things are different in reverse — core java guys can move into jxee with relatively small effort. When jxee positions grow fast, the hiring teams often lower their requirements and take in core java guys. Fundamentally (as some would agree with me) the jxee add-on packages are not hard, otherwise they wouldn’t be popular.

Luckily for java guys, Javaland now has the most jobs and often well-paying jobs but ..

  1. c# guys can’t move into the huge java market. Ellen has seen many.
  2. c++ is a harder language than java, but a disproportionate percentage (80%?) of c++ guys face real entry barrier when breaking into javaland.
  3. I wonder why..
  • I have heard many c++ and c# veterans complain about the huge ecosystem in javaland.
  • I have spoken to several c++/c# friends. I think they don’t sink their teeth into java. Also, a typical guy doesn’t want to abandon his traditional stronghold, even though in reality the stronghold is shrinking.
  • Age is a key factor. After you have gone through not only many years but also a tough accumulation phase on a steep learning curve, you are possibly afraid of going through the same again.

tech firms value motivation in coding drill

I agree with you that to pass FB/Goog/indeed/twitter interviews, we all need to practice for months.

We also need to study the problem patterns during our practice. “Study” is a euphemism for a long, hard struggle. If we don’t put in a hell of a lot of focused energy, we can’t absorb the amount of variations and *recognize* the patterns — 从厚学到薄. Without learning the patterns, we will be overwhelmed by the sheer number of variations in the problems, and forget a lot, as in 狗熊掰棒子.

Q: Why the hell do these firms like to hire coders who practice so hard?
A: My hypothesis:

  • such a coder is typically young — with plenty of spare energy and spare time. An older techie with kids is unlikely to endure such heavy coding drill. We lack spare energy and spare time.
  • such a coder has perseverance and dedication — hard-driving, hard-working, focused, achievement-oriented and determined
  • such a coder is able to tolerate repetitive, boring, lonely tasks — even enjoy the boredom. I call it absorbency capacity or 耐得住寂寞. This is not only an attitude but also an aptitude. My son currently lacks this capacity, but my dad, my daughter and I have it, to varying degrees.
  • such a coder is able to pay attention to details — Getting a program to work on a range of corner cases requires attention to details.
  • such a coder has a quick analytical mind for abstract logical problem solving

In contrast, financial firms target slightly different skills. Their coding questions test language-specific efficiency, language features, or pseudo-code algorithms. Tech firms don’t test these.