break down heap/non-heap footprint@c++app #massif

After reading, I was able to get massif to capture non-heap memory:

valgrind --tool=massif  --pages-as-heap=yes --massif-out-file=$massifOut .../xtap -c ....
ms_print $massifOut

Heap allocation functions such as malloc are built on top of system calls  such as mmapmremap, and brk. For example, when needed, an allocator will typically call mmap to allocate a large chunk of memory, and then hand over pieces of that memory chunk to the client program in response to calls to malloc et al. Massif directly measures only these higher-level malloc et al calls, not the lower-level system calls.

Furthermore, a client program may use these lower-level system calls directly to allocate memory. By default, Massif does not measure these. Nor does it measure the size of code, data and BSS segments. Therefore, the numbers reported by Massif may be significantly smaller than those reported by tools such as top that measure a program’s total size in memory.


python to dump binary data in hex digits

Note hex() is a built-in, but I find it inconvenient. I need to print in two-digits with leading 0.

Full source is hosted in

def Hex(data): # a generator function
  for code in map(ord,data):
    yield "%02x " % code
    i += 1
    if i%8==0: yield ' '

print ''.join(Hex("\x0a\x00")); exit(0)

As CTO, I’d favor transparent langs, wary of outside libraries

If I were to propose a system rewrite, or start a new system from scratch without constraints like legacy code, then Transparency (+instrumentation) is my #1 priority.

  • c++ is the most opaque. Just look at complex declarations, or linker rules, the ODR…
  • I feel more confident debugging java. The JVM is remarkably well-behaving (better than CLR), consistent, well discussed on-line
  • key example — the SOAP stub/skeleton hurts transparency, so does the AOP proxies. These semi-transparent proxies are not like regular code you can edit and play with in a sandbox
  • windows is more murky than linux
  • There are many open-source libraries for java, c++, py etc but many of them affects transparency. I think someone like Piroz may say Spring is a transparent library
  • SQL is transparent except performance tuning
  • Based on my 1990’s experience, I feel javascript is transparent but I could be wrong.
  • I feel py, perl are still more transparent than most compiled languages. They too can become less transparent, when the codebase grows. (In contrast, even small c++ systems can be opaque.)

This letter is about which language, but allow me to digress briefly. For data store and messaging format (both require serialization), I prefer the slightly verbose but hugely transparent solutions, like FIX, CTF, json (xml is too verbose) rather than protobuf. Remember 99% of the sites use only strings, numbers, datetimes, and very rarely involve audio/visual data.

q(nm) instrumentation #learning notes

When you want to reduce the opacity of the c++ compiled artifacts, q(nm) is instrumental. It is related to other instrumentation tools like

q(strings -a)

Subset of noteworthy features:
–print-armap? Tested with my *.a file. The filename printed is different from the above
–line-numbers? Tested
–demangle? Similar to c++filt
–dynamic? for “certain” types of shared libraries

My default command line is

nm –print-armap –print-file-name –line-numbers –demangle

q(g++ -g -O) together has a section specifically on debugging. It says

GCC allows you to use -g with -O

I think -g adds additional debug info into the binary to help debuggers; -O turns on complier optimization.

By default, our binaries are compiled with “-g3 -O2”. When I debug these binaries, I can see variables but lines are rearranged in source code, causing minor problems. See my blog posts on gdb.

gdb skill level@Wall St

I notice that, absolutely None of my c++  veteran colleagues (I asked only 3) [2] is a gdb expert as there are concurrency experts, algo experts [1], …

Most of my c++ colleagues don’t prefer (reluctance?) console debugger. Many are more familiar with GUI debuggers such as eclipse and MSVS. All agree that prints are often a sufficient debugging tool.

[1] Actually, these other domains are more theoretical and produces “experts”.

[2] maybe I didn’t meet enough true c++ alpha geeks. I bet many of them may have very good gdb skills.

I would /go out on a limb/ to say that gdb is a powerful tool and can save lots of time. It’s similar to adding a meaningful toString() or operator<< to your custom class.

Crucially, it could help you figure things out faster than your team peers. I first saw this potential when learning remote JVM debugging in GS.

— My view on prints —
In perl and python, I use prints exclusively and never needed interactive debuggers. However, in java/c++/c# I heavily relied on debuggers. Why the stark contrast? No good answer.

Q: when are prints not effective?
A: when the edit-compile-test cycle is too long, not automated but too frequent (like 40 times in 2 hours) and when there is real delivery pressure. Note the test part could involve many steps and many files and other systems.
A: when you can’t edit the file at all. I have not seen it.

A less discussed fact — prints are simple and reliable. GUI or console debuggers are often poorly understood. Look at step-through. Optimization, threads, and exceptions often have unexpected impacts. Or look at program state inspection. Many variables are hard to “open up” in console debuggers. You can print var.func1()


Locate msg]binary feed #multiple issues solved

Hi guys, thanks to all your help, I managed to locate the very first trading session message in the raw data file.

We hit and overcame multiple obstacles in this long “needle search in a haystack”.

  • · Big Obstacle 1: endian-ness. It turned out the raw data is little-endian. For my “needle”, the symbol integer id 15852(in decimal) or 3dec(in hex) is printed swapped as “ec3d” when I finally found it.

Solution: read the exchange spec. It should be mentioned.

  • · Big Obstacle 2: my hex viewers (like “xxd”) adds line breaks to the output, so my needle can be missed during my search. (Thanks to Vishal for pointing this out.)

Solution 1: xxd -c 999999 raw/feed/file > tmp.txt; grep $needle tmp.txt

The default xxd column size is 16 so every 16 bytes output will get a line break — unwanted! So I set a very large column size of 999999.

Solution 2: in vi editor after “%!xxd -p” if you see line breaks, then you can still search for “ec\_s*3d”. Basically you need to insert “\_s*” between adjacent bytes.

Here’s a 4-byte string I was able to find. It span across lines: 15\_s*00\_s*21\_s*00

  • · Obstacle 3: identify the data file among 20 files. Thanks to this one obstacle, I spent most of my time searching in the wrong files 😉

Solution: remove each file successively, starting from the later hours, and retest, until the needle stops showing. The last removed file must contain our needle. That file is a much smaller haystack.

o one misleading info is the “9.30 am” mentioned in the spec. Actually the message came much earlier.

o Another misleading info is the timestamp passed to my parser function. Not sure where it comes from, but it says 08:00:00.1 am, so I thought the needle must be in the 8am file, but actually, it is in the 4am file. In this feed, the only reliable timestamp I have found is the one in packet header, one level above the messages.

  • · Obstacle 4: my “needle” was too short so there are too many useless matches.

Solution: find a longer and more unique needle, such as the SourceTime field, which is a 32-bit integer. When I convert it to hex digits I get 8 hex digits. Then I flip it due to endian-ness. Then I get a more unique needle “008e0959”. I was then able to search across all 14 data files:

for f in arca*0; do

xxd -c999999 -p $f > $f.hex

grep -ioH 008e0959 $f.hex && echo found in $f


  • · Obstacle 5: I have to find and print the needle using my c++ parser. It’s easy to print out wrong hex representation using C/C++, so for most of this exercise I wasn’t sure if I was looking at correct hex dump in my c++ log.

o If you convert a long byte array to hex and print without whitespace, you could see 15002100ffffe87600,but when I added a space after each byte, it looks like 15 00 21 00 ffffe876 00, so the 3rd byte was overflowing without warning!

o If you forget padding, then you can see a lot of single “0” when you should get “00”. Again, if you don’t include white space you won’t notice.

Solution: I have worked out some simplified code that works. I have a c++ solution and c solution. You can ask me if you need it.

  • · Obstacle 6: In some cases, sequence number is not in the raw feed. In this case the sequence number is in the feed, so Nick’s suggestion was valid, but I was blocked by other obstacles.

Tip: If sequence number is in the feed, you would probably spot a pattern of incrementing hex numbers periodically in the hex viewer.

char-array dump in hex digits: printf/cout

C++ code is convoluted. Must cast twice!

// same output from c and c++: 57 02 ff 80
void dumpBufferPrintf(){
  static const char tag[] = {'W', 2, 0xFF, 0x80};
  cout << hex << setfill('0') ;
  for(int i = 0; i< sizeof(tag)/sizeof(char); ++i)
    printf("%02hhx ", tag[i]);
  printf ("\n");
#include <iostream>
#include <sstream> //stringstream
#include <iomanip> //setfill

//This function was also used to dump a class instance. See below
void dumpBufferCout(const char * buf, size_t const len){
                std::stringstream ss;
                ss << std::hex << std::setfill('0');
                for(size_t i=0; i<len; ++i){
                          if (i%8 == 0) ss<< "  ";
                          ss<<std::setw(2)<< (int)(unsigned char) buf[i]<<" ";
dumpBufferCout((const char*)&myStruct, sizeof(myStruct));