

The kernel of pain

Let's call a spade a spade: For large servers, the 2.4 kernel has been a disaster.

(LinuxWorld) -- Let's start from the beginning. In July 2001, I was responsible for upgrading a customer's server from Red Hat 6.2 to Mandrake 8.0. The machine was built from scratch, and Mandrake was installed onto a freshly formatted RAID 5 array. We then migrated the Red Hat 6.2 applications to the new machine.

After a little configuration, the machine seemed to run fine. We successfully migrated the entire system in less than five hours. Considering this was a large-scale server, that was quite a feat and was certainly welcomed by our paying customer.

However, about a month into deployment I started noticing strange problems with the machine. Intermittent lockups were the most common. The lockups appeared physical, and the machine was unrecoverable without a reboot.

While researching the problem, I learned there was a serious sync() bug in the 2.4 kernel. This bug existed in all 2.4 kernel versions until 2.4.6. The solution seemed simple: upgrade the kernel.

About a week later, the machine locked up cold -- again. We considered it a fluke and rebooted. The very next day the machine locked up -- again. We did further research and found that the original 2.4 VM (virtual memory) implementation was causing problems. In my frustration and embarrassment, I would be inclined to call it bad design, but I don't know enough about the intricacies of the Linux kernel to say whether it was.

The VM problem was so bad that the kernel team decided to rip out the older implementation and implement a completely new design. The problems continued as the kernel versions worked their way up through 2.4.11, which had a serious symlink bug that could lead to corrupted inodes. As of 2.4.13, things finally seemed to be cleaned up a bit, and the kernel seemed to show more stability. Then we hit kernel 2.4.15.

Linux version 2.4.15 contained a bug that was arguably worse than the VM bug. Essentially, if you unmounted a file system via reboot -- or any other common method -- you would get file system corruption. A fix, called kernel 2.4.16, was released 24 hours later.

Kernel 2.4.16 now appeared to be the kernel of choice. It seemed possible that, after almost a year of "stable" status, the 2.4 kernel would finally be usable in a production environment.

We still aren't there yet

Alas, the mire of trouble within the 2.4 series kernels continues. As of kernel 2.4.16, there is a serious bug in the OOM (out-of-memory) handling that can cause system lockups. The lockup bug in 2.4.16 has supposedly been fixed in 2.4.17pre4aa1.

The current kernel release is 2.4.17, and one would hope that it is stable. But a brief review of the changelog shows that the kernel team is still fine-tuning the new VM design, and the vast number of changes already made leaves me wary of it.

As I reviewed the archives of late December, I found that the per-user limit support in the 2.4 series kernels is broken. With the limit support broken, any user -- privileged or not -- has the potential to suck up all of the machine's resources, effectively causing an intramural DoS (denial of service) attack. They could even do this accidentally, and it would cause a great deal of grief for any system administrator.

So, what does all of this mean for me? It means that after five months of battling the new, better-than-fresh-butter, enterprise-ready 2.4 kernel, I am moving my customer back to the stodgy, conservative, more-enterprise-ready-than-2.4-has-been-since-its-release-almost-a-year-ago, 2.2 kernel-based Red Hat 6.2.

The 2.2 kernels may not handle large SMP machines as well, they may not handle large amounts of memory well (only 2 gigabytes), and they may have a practical limit of 2 gigabytes on a single file, but the 2.2 kernels don't crash or cause phone calls at 5:00 AM. Moreover, the 2.2 kernels don't make customers unhappy that they chose Linux as their server solution.

What does this mean for you?

What does all of this mean for you? That is your decision. You just read mine.

I hope Red Hat, SuSE, and Mandrake are taking a long hard look at the 2.4 process and formulating long-term plans to circumvent problems like this. I know, for example, that Red Hat has its own stress testing for the kernel, and that the Red Hat-shipped kernel is a fork of the standard Linux kernel. This fork is a good thing, because it means that Red Hat is able to apply patches that, in theory, make its kernel more stable.

On the desktop where I am writing this article, I am running Red Hat 7.2 with the 2.4.9-enterprise kernel. (It's a long story that involves this machine's AMD Duron processor.) I have yet to have any lockups on the Red Hat kernel since I upgraded to 2.4.9. I can say that Red Hat 7.2 seems reasonable and usable (at least as a desktop machine), but I am unsure whether any 2.4 kernel-based system would be considered acceptable in a production server environment today.

More Stories By Joshua Drake

Joshua Drake is the co-founder of Command Prompt, Inc., a PostgreSQL and Linux custom development company. He is also the current author of the Linux Networking HOWTO, Linux PPP HOWTO, and Linux Consultants HOWTO. His most demanding project at this time is a new PostgreSQL book for O'Reilly, 'Practical PostgreSQL'.
