This is a typical “C++ ecosystem question”. It’s not about C++ or C; it’s about Linux instrumentation tools.
Q1: Given a multi-threaded server, you see some telltale signs that the process is stuck, and you suspect that only one of the threads is stuck while the other threads are fine. How do you verify?
Q2: What if it’s a production environment?
A: I guess all my solutions should be usable in production, since the entire machine is non-functioning and we can’t make it any worse. If the machine is still doing useful work, then we should probably wait till end of day to investigate.
–Method: thread dump? Not popular for C++ processes. I have reason to believe it’s a JVM feature, since Java threads are always JVM constructs, usually based on operating system threads [1]. The JVM has full visibility into all of its threads and provides a comprehensive instrumentation interface.
https://www.thoughtspot.com/codex/threadstacks-library-inspect-stacktraces-live-c-processes shows a custom C++ thread dumper, but you need custom hooks in your C++ source code.
[1] Note “kernel-thread” has an unrelated meaning in the Linux context.
–Method: gdb
thread apply all bt
– prints a stack trace of every thread, allowing you to somewhat easily find the stuck one
I think gdb’s non-stop mode (set non-stop on) lets you keep only the suspect thread suspended and resume the other threads one by one, allowing the good threads to continue.
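The same thread dump can be taken without an interactive session, which keeps the pause short on a production box. Below is a minimal sketch, assuming gdb is installed and you have permission to attach to the target process. The helper names gdb_backtrace_command and dump_all_threads are made up for illustration; -p, -batch and -ex are standard gdb command-line options.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that dumps every thread's stack and exits."""
    return [
        "gdb",
        "-p", str(pid),                  # attach to the running process
        "-batch",                        # run the -ex commands, then exit
        "-ex", "thread apply all bt",    # backtrace of every thread
    ]


def dump_all_threads(pid, timeout=30):
    """Run gdb against pid and return its text output (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling dump_all_threads(12345) two or three times a few seconds apart and comparing the output is one way to spot the suspect: a thread whose stack never changes between dumps is a candidate. Note that the whole process is paused while gdb is attached, so on a box that is still serving traffic each dump causes a brief stall.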
–Method: /proc, the dynamic pseudo file system
For each process, a lot of information is available in /proc/12345. Information on each thread is available in /proc/12345/task/67890, where 67890 is the kernel thread ID. This is where ps, top and other tools get thread information.
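The per-thread files can also be read directly, without attaching a debugger and without pausing the process. Here is a rough sketch, assuming the stat file layout described in the proc(5) man page (state is field 3, utime and stime are fields 14 and 15). The function names and the proc_root parameter are my own inventions, the latter only there so the code can be pointed at a test directory.

```python
import os


def parse_stat_line(line):
    """Return (comm, state, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # The thread name sits in parentheses and may itself contain spaces
    # or ')', so split around the LAST closing parenthesis.
    rparen = line.rfind(")")
    comm = line[line.find("(") + 1:rparen]
    rest = line[rparen + 1:].split()
    state = rest[0]                             # field 3: R, S, D, ...
    cpu_ticks = int(rest[11]) + int(rest[12])   # fields 14 + 15: utime + stime
    return comm, state, cpu_ticks


def snapshot_threads(pid, proc_root="/proc"):
    """Map each kernel thread ID of pid to (comm, state, cpu_ticks)."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    threads = {}
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                threads[int(tid)] = parse_stat_line(f.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were scanning
    return threads


def idle_threads(before, after):
    """Thread IDs present in both snapshots whose CPU time did not advance."""
    return sorted(
        tid for tid in after
        if tid in before and after[tid][2] == before[tid][2]
    )
```

Take one snapshot, sleep a few seconds, take another, and pass both to idle_threads. A thread that used no CPU between snapshots is only a candidate, since a healthy worker waiting for input looks the same; conversely, a thread stuck in a busy loop shows state R with steadily growing ticks. Either way the list tells you which thread's stack to look at first.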