Click here to close now.




















Welcome!

Linux Containers Authors: Elizabeth White, VictorOps Blog, Liz McMillan, Mike Kavis, Carmen Gonzalez

Related Topics: Linux Containers

Linux Containers: Article

Understanding the Linux Kernel (Part 1 of 3)

Part 1 of 3

Like any time-sharing system, Linux achieves the magical effect of an apparent simultaneous execution of multiple processes by switching from one process to another in a very short time frame. This article deals with scheduling, which is concerned with when to switch and which process to choose.

The article consists of three parts. Part One introduces the choices made by Linux to schedule processes in the abstract. Part Two discusses the data structures used to implement scheduling and the corresponding algorithm. Finally, Part Three describes the system calls that affect process scheduling.

Scheduling Policy
The scheduling algorithm of traditional Unix operating systems must fulfill several conflicting objectives: fast process response time, good throughput for background jobs, avoidance of process starvation, reconciliation of the needs of low- and high-priority processes, and so on. The set of rules used to determine when and how selecting a new process to run is called scheduling policy.

Linux scheduling is based on timing measurements: several processes are allowed to run "concurrently," which means that the CPU time is roughly divided into "slices," one for each runnable process. Of course, a single processor can run only one process at any given instant. If a currently running process is not terminated when its time slice or quantum expires, a process switch may take place. Time-sharing relies on timer interrupts and is thus transparent to processes. No additional code needs to be inserted in the programs in order to ensure CPU time-sharing.

The scheduling policy is also based on ranking processes according to their priority. Complicated algorithms are sometimes used to derive the current priority of a process, but the end result is the same: each process is associated with a value that denotes how appropriate it is to be assigned to the CPU.

In Linux, process priority is dynamic. The scheduler keeps track of what processes are doing and adjusts their priorities periodically; in this way, processes that have been denied the use of the CPU for a long time interval are boosted by dynamically increasing their priority. Correspondingly, processes running for a long time are penalized by decreasing their priority.

When speaking about scheduling, processes are traditionally classified as "I/O-bound" or "CPU-bound." The former make heavy use of I/O devices and spend much time waiting for I/O operations to complete; the latter are number-crunching applications that require a lot of CPU time. An alternative classification distinguishes three classes of processes:

Interactive processes
These interact constantly with their users, and therefore spend a lot of time waiting for keypresses and mouse operations. When input is received, the process must be woken up quickly, or the user will find the system to be unresponsive. Typically, the average delay must fall between 50 and 150 ms. The variance of such delay must also be bounded, or the user will find the system to be erratic. Typical interactive programs are command shells, text editors, and graphical applications.

Batch processes
These do not need user interaction, and hence they often run in the background. Since such processes do not need to be very responsive, they are often penalized by the scheduler. Typical batch programs are programming language compilers, database search engines, and scientific computations.

Real-time processes
These have very strong scheduling requirements. Such processes should never be blocked by lower-priority processes, they should have a short response time and, most important, such response time should have a minimum variance. Typical real-time programs are video and sound applications, robot controllers, and programs that collect data from physical sensors.

The two classifications we just offered are somewhat independent. For instance, a batch process can be either I/O-bound (e.g., a database server) or CPU-bound (e.g., an image-rendering program). While in Linux real-time programs are explicitly recognized as such by the scheduling algorithm, there is no way to distinguish between interactive and batch programs. In order to offer a good response time to interactive applications, Linux (like all Unix kernels) implicitly favors I/O-bound processes over CPU-bound ones.

Programmers may change the scheduling parameters by means of the system calls illustrated in Table 1. More details will be given in Part Three.

Table 1: System Calls Related to Scheduling
System Call Description
nice( ) Change the priority of a conventional process.
getpriority( ) Get the maximum priority of a group of conventional processes.
setpriority( ) Set the priority of a group of conventional processes.
sched_getscheduler( ) Get the scheduling policy of a process.
sched_setscheduler( ) Set the scheduling policy and priority of a process.
sched_getparam( ) Get the scheduling priority of a process.
sched_setparam( ) Set the priority of a process.
sched_yield( ) Relinquish the processor voluntarily without blocking.
sched_get_ priority_min( ) Get the minimum priority value for a policy.
sched_get_ priority_max( ) Get the maximum priority value for a policy.
sched_rr_get_interval( ) Get the time quantum value for the Round Robin policy.

Most system calls shown in the table apply to real-time processes, thus allowing users to develop real-time applications. However, Linux does not support the most demanding real-time applications because its kernel is nonpreemptive.

Process Preemption
Linux processes are preemptive. If a process enters the TASK_RUNNING state, the kernel checks whether its dynamic priority is greater than the priority of the currently running process. If it is, the execution of current is interrupted and the scheduler is invoked to select another process to run (usually the process that just became runnable). Of course, a process may also be preempted when its time quantum expires. When this occurs, the need_resched field of the current process is set, so the scheduler is invoked when the timer interrupt handler terminates.

For instance, let us consider a scenario in which only two programs--a text editor and a compiler--are being executed. The text editor is an interactive program, therefore it has a higher dynamic priority than the compiler. Nevertheless, it is often suspended, since the user alternates between pauses for think time and data entry; moreover, the average delay between two keypresses is relatively long. However, as soon as the user presses a key, an interrupt is raised, and the kernel wakes up the text editor process. The kernel also determines that the dynamic priority of the editor is higher than the priority of current, the currently running process (that is, the compiler), and hence it sets the need_resched field of this process, thus forcing the scheduler to be activated when the kernel finishes handling the interrupt. The scheduler selects the editor and performs a task switch; as a result, the execution of the editor is resumed very quickly and the character typed by the user is echoed to the screen. When the character has been processed, the text editor process suspends itself waiting for another keypress, and the compiler process can resume its execution. Be aware that a preempted process is not suspended, since it remains in the TASK_RUNNING state; it simply no longer uses the CPU.

Some real-time operating systems feature preemptive kernels, which means that a process running in Kernel Mode can be interrupted after any instruction, just as it can in User Mode. The Linux kernel is not preemptive, which means that a process can be preempted only while running in User Mode; nonpreemptive kernel design is much simpler, since most synchronization problems involving the kernel data structures are easily avoided.

How Long Must a Quantum Last?
The quantum duration is critical for system performances: it should be neither too long nor too short.

If the quantum duration is too short, the system overhead caused by task switches becomes excessively high. For instance, suppose that a task switch requires 10 milliseconds; if the quantum is also set to 10 milliseconds, then at least 50% of the CPU cycles will be dedicated to task switch.

If the quantum duration is too long, processes no longer appear to be executed concurrently. For instance, let's suppose that the quantum is set to five seconds; each runnable process makes progress for about five seconds, but then it stops for a very long time (typically, five seconds times the number of runnable processes).

It is often believed that a long quantum duration degrades the response time of interactive applications. This is usually false. Interactive processes have a relatively high priority, therefore they quickly preempt the batch processes, no matter how long the quantum duration is.

In some cases, a quantum duration that is too long degrades the responsiveness of the system. For instance, suppose that two users concurrently enter two commands at the respective shell prompts; one command is CPU-bound, while the other is an interactive application. Both shells fork a new process and delegate the execution of the user's command to it; moreover, suppose that such new processes have the same priority initially (Linux does not know in advance if an executed program is batch or interactive). Now, if the scheduler selects the CPU-bound process to run, the other process could wait for a whole time quantum before starting its execution. Therefore, if such duration is long, the system could appear to be unresponsive to the user that launched it.

The choice of quantum duration is always a compromise. The rule of thumb adopted by Linux is: choose a duration as long as possible, while keeping good system response time.

More Stories By Daniel Bovet

Daniel P. Bovet got a Ph.D. in computer science at UCLA in 1968 and is now full Professor at the University of Rome, "Tor Vergata," Italy.

More Stories By Marco Cesati

Marco Cesati got a degree in mathematics in 1992 and a Ph.D. in computer science (University of Rome, "La Sapienza") in 1995. He is now a research assistant in the computer science department of the School of Engineering (University of Rome, "Tor Vergata").

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
With the proliferation of connected devices underpinning new Internet of Things systems, Brandon Schulz, Director of Luxoft IoT – Retail, will be looking at the transformation of the retail customer experience in brick and mortar stores in his session at @ThingsExpo. Questions he will address include: Will beacons drop to the wayside like QR codes, or be a proximity-based profit driver? How will the customer experience change in stores of all types when everything can be instrumented and analyzed? As an area of investment, how might a retail company move towards an innovation methodolo...
Contrary to mainstream media attention, the multiple possibilities of how consumer IoT will transform our everyday lives aren’t the only angle of this headline-gaining trend. There’s a huge opportunity for “industrial IoT” and “Smart Cities” to impact the world in the same capacity – especially during critical situations. For example, a community water dam that needs to release water can leverage embedded critical communications logic to alert the appropriate individuals, on the right device, as soon as they are needed to take action.
Manufacturing connected IoT versions of traditional products requires more than multiple deep technology skills. It also requires a shift in mindset, to realize that connected, sensor-enabled “things” act more like services than what we usually think of as products. In his session at @ThingsExpo, David Friedman, CEO and co-founder of Ayla Networks, will discuss how when sensors start generating detailed real-world data about products and how they’re being used, smart manufacturers can use the data to create additional revenue streams, such as improved warranties or premium features. Or slash...
WebRTC services have already permeated corporate communications in the form of videoconferencing solutions. However, WebRTC has the potential of going beyond and catalyzing a new class of services providing more than calls with capabilities such as mass-scale real-time media broadcasting, enriched and augmented video, person-to-machine and machine-to-machine communications. In his session at @ThingsExpo, Luis Lopez, CEO of Kurento, will introduce the technologies required for implementing these ideas and some early experiments performed in the Kurento open source software community in areas ...
SYS-CON Events announced today that HPM Networks will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. For 20 years, HPM Networks has been integrating technology solutions that solve complex business challenges. HPM Networks has designed solutions for both SMB and enterprise customers throughout the San Francisco Bay Area.
While many app developers are comfortable building apps for the smartphone, there is a whole new world out there. In his session at @ThingsExpo, Narayan Sainaney, Co-founder and CTO of Mojio, will discuss how the business case for connected car apps is growing and, with open platform companies having already done the heavy lifting, there really is no barrier to entry.
As more intelligent IoT applications shift into gear, they’re merging into the ever-increasing traffic flow of the Internet. It won’t be long before we experience bottlenecks, as IoT traffic peaks during rush hours. Organizations that are unprepared will find themselves by the side of the road unable to cross back into the fast lane. As billions of new devices begin to communicate and exchange data – will your infrastructure be scalable enough to handle this new interconnected world?
All major researchers estimate there will be tens of billions devices - computers, smartphones, tablets, and sensors - connected to the Internet by 2020. This number will continue to grow at a rapid pace for the next several decades. With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo, November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be.
The Internet of Things is in the early stages of mainstream deployment but it promises to unlock value and rapidly transform how organizations manage, operationalize, and monetize their assets. IoT is a complex structure of hardware, sensors, applications, analytics and devices that need to be able to communicate geographically and across all functions. Once the data is collected from numerous endpoints, the challenge then becomes converting it into actionable insight.
SYS-CON Events announced today that Micron Technology, Inc., a global leader in advanced semiconductor systems, will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Micron’s broad portfolio of high-performance memory technologies – including DRAM, NAND and NOR Flash – is the basis for solid state drives, modules, multichip packages and other system solutions. Backed by more than 35 years of technology leadership, Micron's memory solutions enable the world's most innovative computing, consumer,...
With the Apple Watch making its way onto wrists all over the world, it’s only a matter of time before it becomes a staple in the workplace. In fact, Forrester reported that 68 percent of technology and business decision-makers characterize wearables as a top priority for 2015. Recognizing their business value early on, FinancialForce.com was the first to bring ERP to wearables, helping streamline communication across front and back office functions. In his session at @ThingsExpo, Kevin Roberts, GM of Platform at FinancialForce.com, will discuss the value of business applications on wearable ...
SYS-CON Events announced today that IceWarp will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. IceWarp, the leader of cloud and on-premise messaging, delivers secured email, chat, documents, conferencing and collaboration to today's mobile workforce, all in one unified interface
As more and more data is generated from a variety of connected devices, the need to get insights from this data and predict future behavior and trends is increasingly essential for businesses. Real-time stream processing is needed in a variety of different industries such as Manufacturing, Oil and Gas, Automobile, Finance, Online Retail, Smart Grids, and Healthcare. Azure Stream Analytics is a fully managed distributed stream computation service that provides low latency, scalable processing of streaming data in the cloud with an enterprise grade SLA. It features built-in integration with Azur...
17th Cloud Expo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises are using some form of XaaS – software, platform, and infrastructure as a service.
SYS-CON Events announced today the Containers & Microservices Bootcamp, being held November 3-4, 2015, in conjunction with 17th Cloud Expo, @ThingsExpo, and @DevOpsSummit at the Santa Clara Convention Center in Santa Clara, CA. This is your chance to get started with the latest technology in the industry. Combined with real-world scenarios and use cases, the Containers and Microservices Bootcamp, led by Janakiram MSV, a Microsoft Regional Director, will include presentations as well as hands-on demos and comprehensive walkthroughs.
SYS-CON Events announced today that the "Second Containers & Microservices Expo" will take place November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Containers and microservices have become topics of intense interest throughout the cloud developer and enterprise IT communities.
Akana has announced the availability of the new Akana Healthcare Solution. The API-driven solution helps healthcare organizations accelerate their transition to being secure, digitally interoperable businesses. It leverages the Health Level Seven International Fast Healthcare Interoperability Resources (HL7 FHIR) standard to enable broader business use of medical data. Akana developed the Healthcare Solution in response to healthcare businesses that want to increase electronic, multi-device access to health records while reducing operating costs and complying with government regulations.
Containers are not new, but renewed commitments to performance, flexibility, and agility have propelled them to the top of the agenda today. By working without the need for virtualization and its overhead, containers are seen as the perfect way to deploy apps and services across multiple clouds. Containers can handle anything from file types to operating systems and services, including microservices. What are microservices? Unlike what the name implies, microservices are not necessarily small, but are focused on specific tasks. The ability for developers to deploy multiple containers – thous...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo in Silicon Valley. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal an...
Consumer IoT applications provide data about the user that just doesn’t exist in traditional PC or mobile web applications. This rich data, or “context,” enables the highly personalized consumer experiences that characterize many consumer IoT apps. This same data is also providing brands with unprecedented insight into how their connected products are being used, while, at the same time, powering highly targeted engagement and marketing opportunities. In his session at @ThingsExpo, Nathan Treloar, President and COO of Bebaio, will explore examples of brands transforming their businesses by t...