Kernel (operating system)

From Wikipedia, the free encyclopedia

A kernel connects the software and hardware of a computer.

In computing, the kernel is the central part of most computer operating systems; it manages the system's resources and the communication between hardware and software components. As a basic component of an operating system, a kernel provides abstraction layers for hardware, especially for memory, processors and I/O, that allow hardware and software to communicate. It also makes these facilities available to userland applications through inter-process communication mechanisms and system calls.

These tasks are handled differently by different kernels, depending on their design and implementation. While monolithic kernels try to achieve these goals by executing all the code in the same address space to increase the performance of the system, microkernels run most of their services in user space, aiming to improve the maintainability and modularity of the codebase.[1]

Overview

Most, but not all, operating systems rely on the kernel concept. The existence of a kernel as a single piece of software responsible for the communication between the hardware and the software results from complex compromises relating to questions of performance, memory efficiency, security and processor architectures.

In most cases, the boot loader starts the kernel as a process in supervisor mode,[2] but after initialization, the kernel does not remain a process in the usual sense; instead, it becomes a whole set of functions that can be invoked by userland applications to perform operations that require a higher privilege level, such as disk access. Kernel execution streams are continuations of the execution streams of userland processes, which pause when performing system calls and resume when those calls return. The initial main kernel stream remains as the "idle process" and collects unused processor time.

Kernel development is considered one of the most complex and difficult tasks in programming.[3] Its central position in an operating system implies the necessity for good performance, which makes the kernel a critical piece of software whose correct design and implementation are difficult. A kernel might not even be able to use the abstraction mechanisms it provides to other software. Many factors, such as interrupt management, memory management and lack of reentrancy, prevent a kernel from using its own facilities, making its development even more difficult for software engineers.

Tasks of a kernel

The kernel's task is to manage the computer's resources and allow other programs to run and use these resources. The most central resource is the CPU or processor, which runs programs in the manner, and for the amount of time, dictated by the kernel. Another crucial resource is the computer's memory, where programs are loaded for execution and where they store their data for fast access.[3][4] Furthermore, the kernel must manage the computer's input/output (I/O), thus allowing the user to access peripheral devices. Finally, a kernel must provide userland programs a way to access these services.

Process management

The main task of an operating system kernel is to allow the execution of applications and support them with features such as hardware abstractions. To run an application, a kernel must load the file containing the application's code into memory (and possibly set up its own address space), set up a stack for the program, and branch to a given location inside the program, thus starting its execution.[citation needed]
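
The sequence can be sketched in C. Every type and helper below (process_image, load_segments, alloc_stack, switch_to_user) is hypothetical, standing in for machinery a real kernel would supply:

```c
/* Hypothetical sketch of the steps a kernel takes to start a program.
 * All types and helper functions here are illustrative, not taken from
 * any real kernel. */
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t entry_point; /* address to branch to once loading is done */
    uintptr_t stack_top;   /* top of the freshly allocated stack */
} process_image;

extern int   load_segments(const char *path, process_image *img);
extern void *alloc_stack(size_t size);
extern void  switch_to_user(uintptr_t entry, uintptr_t stack); /* no return */

int exec_program(const char *path)
{
    process_image img;

    if (load_segments(path, &img) != 0)    /* 1. load code and data */
        return -1;

    void *stack = alloc_stack(64 * 1024);  /* 2. set up a stack */
    if (stack == NULL)
        return -1;
    img.stack_top = (uintptr_t)stack + 64 * 1024;

    /* 3. branch to the program's entry point, dropping to user mode */
    switch_to_user(img.entry_point, img.stack_top);
    return 0; /* not reached */
}
```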

Multi-tasking kernels are able to give the user the illusion that the number of processes being run simultaneously on the computer is higher than the maximum number of processes the computer is physically able to run simultaneously. Typically, the number of processes a system may run simultaneously is equal to the number of CPUs installed (however this may not be the case if the processors support simultaneous multithreading).[citation needed]

In a pre-emptive multitasking system, the kernel gives every program a slice of time and switches from process to process so quickly that it appears to the user as if these processes were being executed simultaneously. The kernel uses scheduling algorithms to determine which process runs next and how much time it is given, as there might be processes with a higher priority than others. The kernel must also provide these processes a way to communicate; this is known as inter-process communication (IPC) and is realized through message passing, pipes, synchronization, shared memory, remote procedure calls (RPC) and/or software interrupts.[citation needed]
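
A minimal sketch of the time-slice mechanism, assuming a fixed process table and a timer interrupt that fires when the current slice expires; the names and the plain round-robin policy are simplifications (real schedulers also weigh priorities, sleep states and per-CPU queues):

```c
/* Illustrative round-robin scheduler core; all names are assumptions. */
#define MAX_PROCS 64

typedef enum { UNUSED, READY, RUNNING } proc_state;

typedef struct {
    proc_state state;
    int priority; /* unused here; a real scheduler would consult it */
} process;

static process proc_table[MAX_PROCS];

/* Called from the timer interrupt when the current time slice expires:
 * scan round-robin from the current slot for the next READY process. */
int pick_next(int current)
{
    for (int i = 1; i <= MAX_PROCS; i++) {
        int candidate = (current + i) % MAX_PROCS;
        if (proc_table[candidate].state == READY)
            return candidate;
    }
    return current; /* nothing else runnable: stay on the idle process */
}
```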

The operating system might also support multiprocessing (SMP or Non-Uniform Memory Access); in that case, different programs and threads may run on different processors. To allow a kernel to run on such a system, it has to be extensively modified to make it "re-entrant" or "interruptible", meaning that it can be called in the midst of doing something else. Once this conversion is complete, programs running at the same time on different processors can safely call the kernel. The kernel must also provide a way to synchronize memory access on different processors, which makes memory management and process management two highly inter-related topics.[citation needed]
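
The basic primitive behind such synchronization can be sketched as a spinlock, shown here with standard C11 atomics; real kernels layer interrupt masking, backoff and fairness on top of something like this:

```c
#include <stdatomic.h>

typedef struct {
    atomic_flag locked;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

void spin_lock(spinlock *l)
{
    /* Atomically set the flag; loop ("spin") while another CPU holds it. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;
}

void spin_unlock(spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```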

Memory management

The kernel has full access to the system's memory and must allow userland programs to access this memory safely as they require it. Often the first step in doing this is virtual addressing, usually achieved by paging and/or segmentation. Virtual addressing allows the kernel to make a given physical address appear to be another address, the virtual address. This allows every program to believe that it is the only one (apart from the kernel) running and thus prevents applications from crashing each other.[citation needed]
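
As a concrete illustration, the sketch below walks a two-level page table, loosely modelled on 32-bit x86 paging with 4 KiB pages; the constants, and the assumption that the kernel can address the tables directly, are simplifications for the example:

```c
#include <stdint.h>

#define PAGE_PRESENT 0x1u

/* Translate a 32-bit virtual address: 10 bits of directory index,
 * 10 bits of table index, 12 bits of offset within the page. */
uint32_t virt_to_phys(const uint32_t *page_dir, uint32_t vaddr)
{
    uint32_t dir_idx = vaddr >> 22;
    uint32_t tbl_idx = (vaddr >> 12) & 0x3FFu;
    uint32_t offset  = vaddr & 0xFFFu;

    uint32_t dir_entry = page_dir[dir_idx];
    if (!(dir_entry & PAGE_PRESENT))
        return 0; /* a real MMU would raise a page fault here */

    const uint32_t *page_tbl =
        (const uint32_t *)(uintptr_t)(dir_entry & ~0xFFFu);
    uint32_t tbl_entry = page_tbl[tbl_idx];
    if (!(tbl_entry & PAGE_PRESENT))
        return 0; /* page fault: the page may be swapped out to disk */

    return (tbl_entry & ~0xFFFu) | offset; /* physical frame + offset */
}
```

The page-fault path is where the swapping described below takes place: the kernel fetches the missing page from disk, updates the table entry, and restarts the faulting instruction.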

In fact, a program's virtual address may even refer to data which is not currently in memory. The layer of indirection provided by virtual addressing allows the operating system to use other data stores, like a hard drive, to store what would otherwise have to remain in main memory (RAM). As a result, operating systems can allow programs to use more memory than the system has physically available. When a program needs data which is not currently in RAM, the OS writes the contents of a currently unused memory block to disk and replaces it with the data requested by the program.[citation needed]

Virtual addressing also allows creation of virtual partitions of memory in two disjoint areas, one reserved for the kernel (kernel space) and the other for the applications (user space). This fundamental partition of memory space has contributed much to the current designs of general-purpose kernels.[citation needed]

Device management

To perform its functions, an operating system (OS) needs access to the peripherals connected to the computer. These are controlled through device drivers, which must be written by developers or provided by the manufacturers of the hardware. For example, to show the user something on the screen, the kernel relies on its video driver (such as VGA or VESA), which is then responsible for actually plotting the character/pixel.[citation needed]

A device manager first performs a scan on different hardware buses, such as Peripheral Component Interconnect (PCI) or Universal Serial Bus (USB), to detect installed devices, then searches for the appropriate drivers. As device management is a very OS-specific topic, these drivers are handled differently by each kind of kernel design, but in every case, the kernel has to provide the I/O to allow drivers to physically access their devices through some port or memory location. Very important decisions have to be made when designing the device management system, as every access involves context switches, making the operation very CPU-intensive and easily causing a significant performance overhead.[citation needed]
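
One common way a kernel decouples itself from individual drivers is a table of function pointers that each driver fills in; the layout and names below are purely illustrative, not taken from any particular kernel:

```c
#include <stddef.h>

typedef struct device device;

/* Each driver supplies its operations; the kernel only calls through them. */
typedef struct {
    const char *name;
    int  (*probe)(device *dev);                      /* detect and set up */
    long (*read)(device *dev, void *buf, size_t len);
    long (*write)(device *dev, const void *buf, size_t len);
} device_driver;

struct device {
    const device_driver *drv;
    void *io_base; /* port or memory-mapped I/O region for this device */
};

/* The kernel routes an I/O request to whichever driver claimed the device. */
long device_read(device *dev, void *buf, size_t len)
{
    if (dev->drv == NULL || dev->drv->read == NULL)
        return -1;
    return dev->drv->read(dev, buf, len);
}
```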

System calls

To actually perform useful work, a userland program must be able to access the services provided by the kernel. This is implemented differently by each kernel, but most provide a C library or an API, which in turn invokes the related kernel functions through the inter-process communication system, software interrupts or shared memory.[citation needed]
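
On the kernel side, such an interface often reduces to a dispatch table indexed by system-call number; the numbering and handler names below are assumptions for the sketch:

```c
#include <stddef.h>

typedef long (*syscall_fn)(long a0, long a1, long a2);

/* Hypothetical handlers implemented elsewhere in the kernel. */
extern long sys_read(long fd, long buf, long len);
extern long sys_write(long fd, long buf, long len);
extern long sys_exit(long code, long unused1, long unused2);

static const syscall_fn syscall_table[] = {
    [0] = sys_read,
    [1] = sys_write,
    [2] = sys_exit,
};

/* Reached from the trap/software-interrupt handler after saving the
 * user registers; `num` and the arguments come from those registers. */
long syscall_dispatch(long num, long a0, long a1, long a2)
{
    size_t count = sizeof(syscall_table) / sizeof(syscall_table[0]);
    if (num < 0 || (size_t)num >= count || syscall_table[num] == NULL)
        return -1; /* unknown system call */
    return syscall_table[num](a0, a1, a2);
}
```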

Different kernel design approaches

Naturally, the tasks and features listed above can be provided in many ways that differ from each other in design and implementation. While monolithic kernels execute all of their code in the same address space (kernel space) to increase the performance of the system, microkernels try to run most of their services in user space, aiming to improve maintainability and modularity of the codebase.[1] Most kernels do not fit exactly into one of these categories, but fall between these two designs; these are called hybrid kernels. More exotic designs, such as nanokernels and exokernels, are mostly investigated by researchers and are not in widespread use.

Monolithic kernels

In a monolithic kernel, all OS services run along with the main kernel thread, thus also residing in the same memory area. This approach provides rich and powerful hardware access. Monolithic systems are easier to design and implement than other solutions, and are extremely efficient if well-written. The main disadvantages of monolithic kernels are the dependencies between system components - a bug in a device driver might crash the entire system - and the fact that large kernels become very difficult to maintain.

Microkernels

In the microkernel approach, the kernel itself only provides basic functionality that allows the execution of servers, separate programs that assume former kernel functions, such as device drivers, GUI servers, etc.

The microkernel approach consists of defining a simple abstraction over the hardware, with a set of primitives or system calls to implement minimal OS services such as memory management, multitasking, and inter-process communication. Other services, including those normally provided by the kernel such as networking, are implemented in user-space programs, referred to as servers. Microkernels are easier to maintain than monolithic kernels, but the large number of system calls and context switches might slow down the system because they typically generate more overhead than plain function calls.
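
The essential primitive is a send/receive pair; the sketch below shows how a user-space file server might loop around such calls. Both the message layout and the ipc_send/ipc_receive primitives are hypothetical:

```c
#include <stdint.h>

typedef struct {
    uint32_t sender;      /* task id of the sending process */
    uint32_t operation;   /* e.g. "open file", "read block", ... */
    uint8_t  payload[56]; /* small inline payload */
} ipc_message;

/* Assumed kernel primitives: block until the message is sent/received. */
extern int ipc_send(uint32_t dest_task, const ipc_message *msg);
extern int ipc_receive(uint32_t from_task, ipc_message *msg);

/* Main loop of a user-space file server: the work a monolithic kernel
 * performs in kernel space happens here, in an ordinary process. */
void file_server_loop(void)
{
    ipc_message msg;
    for (;;) {
        if (ipc_receive(0 /* 0 = any sender, by assumption */, &msg) != 0)
            continue;
        /* ... handle the request, then reply to the sender ... */
        ipc_send(msg.sender, &msg);
    }
}
```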

Microkernels generally underperform traditional designs, sometimes dramatically. This is due in large part to the overhead of moving in and out of the kernel, a context switch, to move data between the various applications and servers. It was originally believed that careful tuning could reduce this overhead dramatically, but by the mid-1990s most researchers had abandoned this approach. Recently, newer microkernels, optimized for performance, have addressed these problems.[5]

Monolithic kernels vs microkernels

As the computer kernel grows, a number of problems become evident. One of the most obvious is that the memory footprint increases. This is mitigated to some degree by refining the virtual memory system, but not all computer architectures have virtual memory support.[6] To reduce the kernel's footprint, extensive editing has to be performed to carefully remove unneeded code, which can be very difficult given the non-obvious interdependencies between parts of a kernel with millions of lines of code.

Due to the problems that monolithic kernels pose, they were considered obsolete by the early 1990s. As a result, the design of Linux using a monolithic kernel rather than a microkernel was the topic of a famous flame war between Linus Torvalds and Andrew Tanenbaum.[7] There is merit on both sides of the argument presented in the Tanenbaum/Torvalds debate.

Monolithic kernels tend to be easier to design correctly, and therefore may grow more quickly than a microkernel-based system. However, a bug in a monolithic system usually crashes the entire system, while this doesn't happen in a microkernel with servers running apart from the main thread. Monolithic kernel proponents reason that incorrect code doesn't belong in a kernel, and that microkernels offer little advantage over correct code. There are success stories in both camps. Microkernels are often used in embedded robotic or medical computers where crash tolerance is important and most of the OS components reside in their own private, protected memory space. This is impossible with monolithic kernels, even with modern module-loading ones. However, the monolithic model tends to be more efficient through the use of shared kernel memory, rather than the slower message passing system of microkernel designs.

Hybrid kernels

The hybrid kernel approach tries to combine the speed and simpler design of a monolithic kernel with the modularity and execution safety of a microkernel.

Hybrid kernels are essentially a compromise between the monolithic kernel approach and the microkernel system. This implies running some services (such as the network stack or the filesystem) in kernel space to reduce the performance overhead of a traditional microkernel, but still running kernel code (such as device drivers) as servers in user space.

Nanokernels

A nanokernel delegates virtually all services - including even the most basic ones like interrupt controllers or the timer - to device drivers to make the kernel memory requirement even smaller than a traditional microkernel.[8]

Exokernels

An exokernel is a type of kernel that does not abstract hardware into theoretical models. Instead it allocates physical hardware resources, such as processor time, memory pages, and disk blocks, to different programs. A program running on an exokernel can link to a library operating system that uses the exokernel to simulate the abstractions of a well-known OS, or it can develop application-specific abstractions for better performance.[9]

Other designs

There are also alternative ways to design (and implement) a kernel which do not fit into the above named categories. Examples of this are "exec.library", the kernel of AmigaOS, which is considered to be "microkernel-like"[citation needed], and Unununium, which entirely lacks a kernel.[10]

History of kernel development

Early operating system kernels

Strictly speaking, an operating system (and thus, a kernel) is not required to run a computer. Programs can be directly loaded and executed on the "bare metal" machine, provided that the authors of those programs are willing to work without any hardware abstraction or operating system support. Most early computers operated this way during the 1950s and early 1960s; they were reset and reloaded between the execution of different programs. Eventually, small ancillary programs such as program loaders and debuggers were left in memory between runs, or loaded from ROM. As these were developed, they formed the basis of what became early operating system kernels. The "bare metal" approach is still used today on some video game consoles and embedded systems, but in general, newer computers use modern operating systems and kernels.

Time-sharing operating systems

In the decade preceding Unix, computers had grown enormously in power - to the point where computer operators were looking for new ways to get people to use the spare time on their machines. One of the major developments during this era was time-sharing, whereby a number of users would get small slices of computer time, at a rate at which it appeared they were each connected to their own, slower, machine.[11]

The development of time-sharing systems led to a number of problems. One was that users, particularly at universities where the systems were being developed, seemed to want to hack the system to get more CPU time. For this reason, security and access control became a major focus of the Multics project in 1965.[12] Another ongoing issue was properly handling computing resources: users spent most of their time staring at the screen and thinking instead of actually using the resources of the computer, and a time-sharing system should give the CPU time to an active user during these idle periods. Finally, the systems typically offered a memory hierarchy several layers deep, and partitioning this expensive resource led to major developments in virtual memory systems.

Unix

A diagram tree showing some Unix-like systems.

Unix represented the culmination of decades of development towards a modern operating system. During the design phase, programmers decided to model every high-level device as a file, because they believed the purpose of computation was data transformation.[13] For instance, printers were represented as a "file" at a known location: when data was copied to the file, it printed out. Other systems, to provide similar functionality, tended to virtualize devices at a lower level, so that both devices and files would be instances of some lower-level concept. Virtualizing the system at the file level allowed users to manipulate the entire system using their existing file management utilities and concepts, dramatically simplifying operation. As an extension of the same paradigm, Unix allows programmers to manipulate files using a series of small programs, through the concept of pipes, which let users complete operations in stages, feeding a file through a chain of single-purpose tools. Although the end result was the same, using smaller programs in this way dramatically increased flexibility as well as ease of development and use, allowing the user to modify their workflow by adding or removing a program from the chain.
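
The pipe concept survives unchanged in the POSIX API. The program below builds the equivalent of the shell pipeline ls | wc -l from the standard pipe(), fork(), dup2() and exec calls:

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) == -1) { perror("pipe"); return 1; }

    if (fork() == 0) {               /* first child: runs `ls` */
        dup2(fds[1], STDOUT_FILENO); /* stdout -> write end of the pipe */
        close(fds[0]); close(fds[1]);
        execlp("ls", "ls", (char *)NULL);
        perror("execlp"); _exit(1);
    }
    if (fork() == 0) {               /* second child: runs `wc -l` */
        dup2(fds[0], STDIN_FILENO);  /* stdin <- read end of the pipe */
        close(fds[0]); close(fds[1]);
        execlp("wc", "wc", "-l", (char *)NULL);
        perror("execlp"); _exit(1);
    }
    close(fds[0]); close(fds[1]);    /* parent: close both ends and wait */
    wait(NULL); wait(NULL);
    return 0;
}
```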

In the Unix model, the operating system consists of two parts: the huge collection of utility programs that drive most operations, and the kernel that runs the programs.[13] Under Unix, from a programming standpoint the distinction between the two is fairly thin; the kernel is a program running in supervisor mode[2] that acts as a program loader and supervisor for the small utility programs making up the rest of the system, and provides locking and I/O services for these programs; beyond that, the kernel does not intervene in user space at all.

Over the years the computing model changed, and Unix's treatment of everything as a file no longer seemed to be as universally applicable as before. Although a terminal could be treated as a file or a stream, which is printed to or read from, the same did not seem to be true for a graphical user interface. Networking posed another problem: even if network communication can be compared to file access, the low-level packet-oriented architecture dealt with discrete chunks of data and not with whole files. As the capability of computers grew, Unix became increasingly cluttered with code. While kernels might have had 100,000 lines of code in the 1970s and 1980s, kernels of modern Unix successors like Linux have more than 4.5 million lines.[14] Thus, the biggest problem with monolithic kernels, or monokernels, was sheer size: the code was so extensive that working on such a large codebase was extremely tedious and time-consuming.

Modern Unix derivatives are generally based on module-loading monolithic kernels. Examples of this are Linux distributions such as Debian GNU/Linux, Red Hat Linux and Ubuntu Linux, as well as Berkeley Software Distribution derivatives such as FreeBSD and NetBSD. Apart from these, amateur developers maintain an active operating system development community, populated by self-written hobby kernels which mostly end up sharing many features with Linux and/or being compatible with it.[15]

Mac OS

Apple Computer first launched Mac OS in 1984, bundled with its Apple Macintosh personal computer. For the first few releases, Mac OS (or System Software, as it was then called) lacked many essential features, such as multitasking and a hierarchical filesystem. Over time, the OS evolved into Mac OS 9, gaining many new features, but the kernel basically stayed the same. In contrast, Mac OS X is based on Darwin, which uses a hybrid kernel called XNU, created by combining the 4.3BSD kernel and the Mach kernel.[16]

Windows

Microsoft Windows was first released in 1985 as an add-on to DOS. Like Mac OS, it also lacked important features at first, but eventually acquired them in later releases. This product line continued through the Windows 9x series and ended with Windows Me. Meanwhile, Microsoft had also been developing Windows NT, an operating system intended for high-end and business users. This line started with the release of Windows NT 3.1 in 1993 and replaced the main product line with the release of the NT-based Windows 2000.

The highly successful Windows XP brought these two product lines together, combining the stability of the NT line and the visual appeal of the 9x series.[17] It uses the NT kernel, which is generally considered a hybrid kernel because the kernel itself contains tasks such as the Window Manager and the IPC Manager, but several subsystems run in user mode.[18]

Development of microkernels

Although Mach, developed at Carnegie Mellon University from 1985 to 1994, is the best-known general-purpose microkernel, other microkernels have been developed with more specific aims. The L4 microkernel family (mainly the L3 and the L4 kernel) was created to demonstrate that microkernels are not necessarily slow.[5] Newer implementations such as Fiasco and Pistachio are able to run Linux next to other L4 processes in separate address spaces.[19][20]

QNX is a real-time operating system with a minimalistic microkernel design that has been developed since 1982, having been far more successful than Mach in achieving the goals of the microkernel paradigm.[21] It is principally used in embedded systems and in situations where software is not allowed to fail, such as the robotic arms on the space shuttle and machines that control grinding of glass to extremely fine tolerances, where a tiny mistake may cost hundreds of thousands of dollars, as in the case of the mirror of the Hubble Space Telescope.[22]

Footnotes and references

  1. ^ a b An overview of Monolithic and Micro Kernels, by K.J.
  2. ^ a b The highest privilege level has various names throughout different architectures, such as supervisor mode, kernel mode, CPL0, DPL0, Ring 0, etc.
  3. ^ a b Bona Fide OS Development - Bran's Kernel Development Tutorial, by Brandon Friesen
  4. ^ It has to be considered that whilst CPU time is an unlimited resource, neither the memory's capacity nor its access speed are unlimited.
  5. ^ a b The L4 microkernel family - Overview
  6. ^ Virtual addressing is most commonly achieved through a built-in memory management unit.
  7. ^ Recordings of the debate can be found at dina.dk, groups.google.com, oreilly.com and Andrew Tanenbaum's website
  8. ^ KeyKOS Nanokernel Architecture
  9. ^ MIT Exokernel Operating System
  10. ^ Unununium OS :: Introduction
  11. ^ BSTJ version of C.ACM Unix paper
  12. ^ Introduction and Overview of the Multics System, by F. J. Corbató and V. A. Vissotsky.
  13. ^ a b The UNIX System - The Single Unix Specification
  14. ^ Linux Kernel 2.6: It's Worth More!, by David A. Wheeler, October 12, 2004
  15. ^ This community mostly gathers at Bona Fide OS Development and The Mega-Tokyo Message Board.
  16. ^ XNU: The Kernel
  17. ^ LinuxWorld | IDC: Consolidation of Windows won't happen
  18. ^ Windows History: Windows Desktop Products History
  19. ^ The Fiasco microkernel - Overview
  20. ^ L4Ka - The L4 microkernel family and friends
  21. ^ QNX Realtime Operating System Overview
  22. ^ Hubble Facts, by NASA, January 1997
