What code is machine/architecture specific? syscalls, std lib …

A: (obviously) binary object code — including the unix/windows kernels. It runs on the bare hardware — must be machine specific.

A: Assembly language source code: it must conform to the hardware instruction set. You can't run i386 assembly code on a PowerPC. Note that each assembly instruction generally translates to exactly one machine instruction.

A: C compiler: it implements the ABI [1]. The object code produced by the compiler is binary machine code, so the compiler's code generator must be architecture-specific.

A: Syscalls: they are specific to the architecture. The Linux syscalls for the i386 architecture include about 300 functions. About 90% of them are available on all architectures, but the rest are architecture-specific.

A: C standard library (like glibc): it provides wrappers over syscalls. Since a small number of syscalls are architecture-specific, the std lib is necessarily architecture-specific. However, the “contamination” stops here: I believe anything that relies only on the std lib is portable at source level. In that sense the std lib provides a “standard” API. Java portability goes further: the same bytecode compiled on one architecture is usable on any other, assuming 100% pure Java without native code.
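To make the source-level portability point concrete, here is a minimal C sketch. The same source compiles on any target, yet every answer it produces is decided by the architecture. The function names (arch_name, write_syscall_number, format_platform) are made up for illustration, and the predefined macros are the GCC/Clang spellings, which is an assumption about your compiler. On Linux, for example, the write syscall is number 1 on x86-64 but 4 on i386.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#ifdef __linux__
#include <sys/syscall.h>   /* SYS_write: the number differs per architecture */
#endif

/* Architecture name from predefined macros (GCC/Clang spellings, an assumption). */
const char *arch_name(void)
{
#if defined(__x86_64__)
    return "x86-64";
#elif defined(__i386__)
    return "i386";
#elif defined(__aarch64__)
    return "aarch64";
#elif defined(__powerpc__)
    return "powerpc";
#else
    return "unknown";
#endif
}

/* Number of the write syscall on this build target, or -1 if it is not known here. */
long write_syscall_number(void)
{
#if defined(__linux__) && defined(SYS_write)
    return SYS_write;
#else
    return -1;
#endif
}

/* Same source everywhere; the text it produces depends on the target. */
int format_platform(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "arch=%s pointer=%zu bytes SYS_write=%ld",
                    arch_name(), sizeof(void *), write_syscall_number());
}
```

Calling format_platform(buf, sizeof buf) and printing buf on two different machines should give two different lines from identical source, which is the whole point: the std lib call (snprintf) is portable, the facts underneath are not.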

[1] API vs ABI – explained in [[linux system programming]]

your compiler ^ the c++ standard

C++ has many compilers offering extensions.

Many developers write code based on the C++ standard. You might decide, “if this operation fails, I would get this particular error according to the standard.”

Well, unless you are aiming for portability, you had better respect your local compiler more than the language standard. Your local compiler could add a “feature” and avoid raising that error. Side effect – you don't get “reminded” and you assume everything worked as expected – dangerous. There could be a compile-time warning but then it's often drowned in the noise. Poor signal/noise ratio.

My stance — know your compiler. Whenever the standard says some unusual condition should end in a certain way (or one of several ways), check your compiler's conformance.

It's also possible that the standard says this condition should return a null pointer or silently return, but a compiler adds a much-needed diagnostic feature to catch it. Sounds too good, but then the problem is, if you tweak some debug-build settings (or when you port your code to another compiler) you could lose the “reminders” that you depend on. Your code may appear to work in UAT but fail in production.
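One way to “know your compiler” is to probe the behaviors the standard leaves open and check them at startup. Below is a minimal sketch in C (the same idea carries over to C++). The three probes cover cases the C standard calls implementation-defined; the function names and the choice of which assumption to enforce are mine, purely for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* 1 if plain char is signed on this compiler/target (implementation-defined). */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* 1 if >> on a negative int keeps the sign bit (implementation-defined). */
int right_shift_is_arithmetic(void)
{
    return (-1 >> 1) == -1;
}

/* 1 if malloc(0) returns NULL; the standard also allows a unique non-null pointer. */
int malloc_zero_returns_null(void)
{
    void *p = malloc(0);
    int is_null = (p == NULL);
    free(p);               /* free(NULL) is a no-op, so this is safe either way */
    return is_null;
}

/* Hypothetical startup check: fail loudly if an assumption the code relies on is false.
   Note that assert() compiles to nothing when NDEBUG is defined. */
void check_compiler_assumptions(void)
{
    assert(right_shift_is_arithmetic());
}
```

The assert() here is itself an example of a “reminder” that depends on build settings: define NDEBUG for a release build and the check silently disappears.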

every async operation involves a sync call

I now feel just about every asynchronous interaction involves a pair of (often remote) threads. (Let’s give them simple names — The requester RR vs the provider PP). An async interaction goes through 2 phases —

Phase 1 (registration): RR registers “interest” with PP. When RR reaches out to PP, the call must be synchronous, i.e. blocking. In other words, during registration the RR thread blocks until registration completes. The RR thread won’t return immediately if the registration takes a while.

If PP is remote, then I was told there’s usually a local proxy object living inside the RR process. Registration against the proxy is faster, implying the proxy schedules the actual, remote registration. Without the scheduling capability, the proxy must complete the (potentially slow) remote registration on the RR thread before the local registration call returns. How slow? If remote registration goes over a network or involves a busy database, it could take many milliseconds. Even though the details are my speculation, the conclusion is fairly clear: the registration call must be synchronous, at least partially.

Even in Fire-and-forget mode, the registration can’t completely “forget”. What if the fire throws an exception at the last phase after the “forget” i.e. after the local call has returned?

Phase 2 (data delivery): PP delivers the data to an RR2 thread. The RR2 thread must be at an “interruption point”, to borrow boost::thread terminology. I was told RR2 could be the same RR thread in WCF.
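The two phases can be sketched with POSIX threads. This is a minimal illustration under my own assumptions, not how any particular framework does it: the provider and inbox structs, the function names and the delivered value 42 are all hypothetical. Registration is a plain blocking call on the RR thread; delivery happens later on PP's thread, and RR picks the value up while waiting on a condition variable (so here RR2 is the same thread as RR).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*data_callback)(int value, void *ctx);

/* Hypothetical provider (PP): remembers one registered callback. */
struct provider {
    pthread_mutex_t lock;
    data_callback cb;
    void *ctx;
};

/* Where the requester waits for the delivery. */
struct inbox {
    pthread_mutex_t lock;
    pthread_cond_t ready;
    int has_value;
    int value;
};

/* Phase 1: runs on the RR thread and returns only after the interest is recorded. */
static void provider_register(struct provider *pp, data_callback cb, void *ctx)
{
    assert(pp != NULL && cb != NULL);
    pthread_mutex_lock(&pp->lock);
    pp->cb = cb;
    pp->ctx = ctx;
    pthread_mutex_unlock(&pp->lock);
}

/* Phase 2: runs on PP's own thread and pushes one value to the callback. */
static void *provider_thread(void *arg)
{
    struct provider *pp = arg;
    pthread_mutex_lock(&pp->lock);
    data_callback cb = pp->cb;
    void *ctx = pp->ctx;
    pthread_mutex_unlock(&pp->lock);
    if (cb != NULL)
        cb(42, ctx);
    return NULL;
}

/* Callback: hands the value to whichever thread is waiting on the inbox. */
static void on_data(int value, void *ctx)
{
    struct inbox *in = ctx;
    pthread_mutex_lock(&in->lock);
    in->value = value;
    in->has_value = 1;
    pthread_cond_signal(&in->ready);
    pthread_mutex_unlock(&in->lock);
}

/* Registers, starts the provider, waits for the delivery; returns the value or -1. */
int run_async_demo(void)
{
    struct provider pp = { .cb = NULL, .ctx = NULL };
    struct inbox in = { .has_value = 0, .value = 0 };
    pthread_t tid;
    int value = -1;

    pthread_mutex_init(&pp.lock, NULL);
    pthread_mutex_init(&in.lock, NULL);
    pthread_cond_init(&in.ready, NULL);

    provider_register(&pp, on_data, &in);          /* blocks until registered */
    if (pthread_create(&tid, NULL, provider_thread, &pp) == 0) {
        pthread_mutex_lock(&in.lock);
        while (!in.has_value)                      /* RR waits here for Phase 2 */
            pthread_cond_wait(&in.ready, &in.lock);
        value = in.value;
        pthread_mutex_unlock(&in.lock);
        pthread_join(tid, NULL);
    }

    pthread_cond_destroy(&in.ready);
    pthread_mutex_destroy(&in.lock);
    pthread_mutex_destroy(&pp.lock);
    return value;
}
```

Some toolchains need -pthread on the compile line. Note that even in this tiny version, provider_register() cannot return before it has taken and released PP's lock, which is the synchronous part of the interaction.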

stack frame has a pointer to the caller stack frame

Q: what data occupy the real estate of a single stack frame? Remember a stack frame is a chunk of memory of x bytes and each byte has a purpose.

A: (obviously) any “auto” variable declared (therefore allocated) in the function

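A: (obviously) any formal parameter. If a 44-byte Account parameter is passed by value, then all 44 bytes are allocated in the stack frame. If it is passed by reference, then only a pointer is allocated (4 bytes on a 32-bit platform). Some calling conventions pass small arguments in registers instead.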

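A: 4-byte pointer to the caller's stack frame (again assuming a 32-bit platform). Note that the caller's frame in turn contains a pointer to its own caller's frame. Therefore, the stack frames form a linked list. This pointer is known as a “previous stack top”; on x86 it is the saved frame pointer (EBP).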

A: 4-byte ReturnAddress. When a function f4() returns, control passes back to the caller function f3(). Now at assembly level the caller function may be a stream of 20 instructions. Our f4() may be invoked on Instruction #8 or whatever. This information is saved in f4() stack frame under “ReturnAddress”. Upon return, this information is put into the “instruction pointer” register inside the CPU.
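You can peek at two of these fields from C, with caveats. The sketch below relies on the GCC/Clang builtins __builtin_frame_address and __builtin_return_address, so it is compiler-specific; the struct and function names are hypothetical. It records the callee's frame address, the caller's frame address, the first word stored at the callee's frame address, and the return address. On x86 and x86-64 I would expect that first word to be the saved pointer to the caller's frame (the linked-list link), but other ABIs lay frames out differently, so treat that field as an observation rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct frame_info {
    uintptr_t own_frame;       /* address of the callee's frame */
    uintptr_t caller_frame;    /* address of the caller's frame, passed in */
    uintptr_t saved_link;      /* first word stored at the callee's frame address */
    uintptr_t return_address;  /* where execution resumes in the caller */
};

/* noinline keeps a real call, hence a real frame, even when optimizing. */
__attribute__((noinline))
static void inspect_frame(struct frame_info *out, uintptr_t caller_frame)
{
    assert(out != NULL);
    void *fp = __builtin_frame_address(0);
    out->own_frame = (uintptr_t)fp;
    out->caller_frame = caller_frame;
    out->saved_link = *(uintptr_t *)fp;
    out->return_address = (uintptr_t)__builtin_return_address(0);
}

/* Fills *out and returns 1 if caller and callee really have distinct frames. */
__attribute__((noinline))
int capture_frames(struct frame_info *out)
{
    inspect_frame(out, (uintptr_t)__builtin_frame_address(0));
    /* Work after the call keeps it from becoming a tail call (frame reuse). */
    return out->own_frame != out->caller_frame;
}
```

After capture_frames(&info), comparing info.saved_link with info.caller_frame shows whether your platform stores the link the way this post describes.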

operands to assembly instructions

I feel most operands are registers. See first operand in example below. That means we must load our 32 bits into the EAX register before the operation.

However, an operand can also refer directly to a memory location.

SUB EAX, [0x10050D49]

A third type of operand is a constant, also called an immediate operand. You pass that constant from source code to the compiler and it is embedded in the instruction itself, inside the “object file”.

glibc, an archetypical standard library (vs system calls)

Background — I was looking for a concrete example of a standard library.

The best example of a standard library is glibc, produced by the GNU project. If you strip the G, it’s “libc”, THE C standard library.

“Standard” means this library is a “carpet” hiding all the platform-specific differences and presents a uniform interface to high-level application programmers — so-called App-Programmer-Interface or “API”.

There are many industry standards for this same purpose, such as POSIX, ANSI-C (which standardizes the C programming language + standard lib). Glibc supports all of these standards.

To clarify a common confusion, it’s worthwhile to understand this simple example: glibc functions (like printf) are implemented on top of platform-specific syscalls.

What exactly are the platform differences? Short answer: system calls. Remember that system calls call into the “hotel service desk”. These syscalls are tied to the processor and the operating system, so they are by definition platform-specific. See other posts on syscall vs standard library.
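Here is a small Linux-only sketch of the two routes to the same kernel service. getpid() is the glibc wrapper; syscall(SYS_getpid) asks the kernel directly by number. The helper names (pid_via_wrapper, pid_via_raw_syscall, raw_write) are hypothetical, and I am assuming a Linux system with glibc, where syscall() and the SYS_* constants are available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Library route: glibc's portable-looking wrapper. */
long pid_via_wrapper(void)
{
    return (long)getpid();
}

/* Direct route: invoke the kernel by syscall number (platform-specific). */
long pid_via_raw_syscall(void)
{
    return syscall(SYS_getpid);
}

/* Direct write(2): returns the number of bytes written, or -1 on error. */
long raw_write(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, len);
}
```

Both pid functions should return the same number. Application code normally sticks to the wrapper, because the wrapper's name and signature stay the same across platforms while the syscall number underneath does not.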

what is kernel space (vs userland)

(sound bite: system calls are kernel space; standard library functions are userland, often wrappers over syscalls)

Executive summary — kernel is special source code written by kernel developers, to run in special kernel mode.

Q: But what distinguishes kernel source code from application source code?
A: Kernel functions (like syscall functions) are written with special access to hardware devices. Kernel functions are the gatekeepers to hardware, just like app developers write DAO classes as gatekeepers to a DB.

Q: Real examples of syscall source code?
A: As far as I can tell, glibc source code includes only the userland wrappers for syscalls (they set up the arguments and trap into the kernel), not the kernel-side implementations, which live in the kernel source tree. See P364[[GCC]]
A: kernel32.dll ?
A: I feel device drivers are just like kernel source code, though RAM/CPU tend to be considered the kernel of kernel.

My 2-liner definition of kernel: a kernel can be thought of as a bunch of (perhaps hundreds of) API functions known as “syscalls”. They internally call additional (10,000 to 100,000) internal functions. Together these 2 bodies of source code constitute a kernel. On an Intel platform, kernel and userland source code both compile to Intel instructions. At the individual instruction level, they are indistinguishable, but looking at the source code, you can tell which is kernel code.

There are really 2 distinct views (2 blind men describing an elephant) of a kernel. Let’s focus on run-time actions —
X) a kernel is seen as special runtime services in the form of syscalls, similar to guest calls to a hotel service desk. I think this is the view of a C developer.
Y) a behind-the-scenes stream of CPU instructions executed on the CPU, but not invoked by any userland app. Example: the scheduler [4]

I don’t think a kernel is “a kind of daemon”. Such a description is misleading. Various “regular” daemons provide services. They call kernel functions to access hardware. If a daemon never interacts with user processes, then maybe it would live in “kernel space”. I guess kernel thread scheduler might be among them.

I feel it’s unwise (but not wrong) to think of kernel as a process. Kernel services are used by processes. I guess it’s possible for a process to live exclusively in “kernel space” and never interact with user processes. http://www.thehackademy.net/madchat/sysadm/kern/kern.bsd/the_freebsd_process_scheduler.pdf describes some kernel processes.

P241 [[Pro .net performance]] describes how something like func3 in kernel32.dll is loaded into a C# application’s code area. This dll and this func3 are treated similarly to regular non-kernel libraries. In a unix C++ application, glibc is linked in just like any regular library. See also http://www.win.tue.nl/~aeb/linux/lk/lk-3.html

[4] Scheduler is one example of (Y) that’s so extremely prominent that everyone feels kernel is like a daemon.

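The term “kernel space” can be misleading: it is not a separate piece of hardware memory, but the part of the address space that is accessible only while the CPU runs in privileged (kernel) mode. Things in kspace don’t run under a privileged user; the privilege belongs to the CPU mode, not to a user account.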

— call stack view —
Consider a C# P/Invoke function calling into kernel32.dll (some kernel func3). If you were to take a snapshot of an average thread stack, top of the stack would be functions written by app developers; middle of the stack are (standard) library functions; bottom of the stack are (if hardware is busy) unfinished kernel syscalls. Our func3 would be in the last 2 layers.

All stack frames below a kernel API are “kernel space”. These stack frames are internal functions within the kernel_code_base. Beneath all the stack frames is possibly hardware. Hardware is the ultimate low-level.

Look at the bottom-most frame: it might be a syscall. It might be called from Java, Python, or some code written in assembly. At runtime, we don’t care about the flavor of the source code. The object code loaded into the “text” section of the process is always a stream of machine instructions, perhaps in the Intel or SPARC instruction set.

ANY process under any user can call kernel API to access hardware. When people say kernel has special privileges, it means kernel codebase is written like your DAO.
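A quick way to see kernel code running on behalf of an ordinary process is to ask the OS how the process's CPU time splits between user mode and kernel mode. The sketch below uses the POSIX calls getrusage(), getppid() and geteuid(); the struct and function names and the loop count of 100000 are my own illustrative choices. It works under any user account, privileged or not.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

struct cpu_split {
    double user_seconds;     /* CPU time spent running our own (userland) code */
    double kernel_seconds;   /* CPU time the kernel spent working on our behalf */
    unsigned long uid;       /* effective user id of the calling process */
};

/* Makes many cheap syscalls, then reports the user/kernel CPU split. 0 on success. */
int measure_cpu_split(struct cpu_split *out)
{
    struct rusage ru;

    assert(out != NULL);
    for (int i = 0; i < 100000; i++)
        (void)getppid();     /* on Linux, each call traps into the kernel */

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    out->user_seconds = (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
    out->kernel_seconds = (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    out->uid = (unsigned long)geteuid();
    return 0;
}
```

Whatever uid comes back, the syscalls succeed, and kernel_seconds accounts for the time spent inside the kernel's code on this process's behalf.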