Linux.SYS-CON.com Feature: Live Patching on Linux
A key technology for highly available Linux systems


Some computing systems require high availability. Telecommunication systems are a good example: they must provide service 24 hours a day, 365 days a year, and their downtime should not exceed about five minutes per year, including hardware and software upgrades. These systems require carrier-grade reliability that guarantees high service availability of 99.999% uptime or higher.
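The five-minute figure follows directly from the 99.999% target. As a quick check, the short Python sketch below converts an availability fraction into a yearly downtime budget; the function name is illustrative only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return MINUTES_PER_YEAR * (1.0 - availability)


if __name__ == "__main__":
    for label, availability in [("three nines", 0.999), ("five nines", 0.99999)]:
        minutes = allowed_downtime_minutes(availability)
        print(f"{label}: {minutes:.2f} minutes per year")
```

At 99.999% the budget comes to a little over five minutes per year, which is why even a restart that takes a few minutes can consume the whole allowance.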

To satisfy high-availability requirements, telecom systems have traditionally used special-purpose operating systems, sometimes proprietary or self-developed. As the telecommunication world moves towards using the Linux operating system on mission-critical systems, demanding new requirements are being imposed on the operating system. However, Linux is designed to work best on desktop and enterprise systems, and it doesn't have the mechanisms and capabilities needed for mission-critical systems with an intense and complex workload that must also handle very confidential information. The OSDL Carrier-Grade Linux (CGL) working group is working to fill these gaps by creating the CGL requirement definition documents and supporting the creation of Open Source projects that implement them.

Software developers usually provide patches to fix bugs, enhance existing capabilities, or add new capabilities. The intervals between software updates are getting shorter as software grows in size and complexity, and the number of patches is on the rise. Normally, it's necessary to restart the process (or the service) to apply these patches, and sometimes the operating system has to be rebooted. The software program itself can't be modified without being stopped because it's loaded in the process memory space, which only the process can access. Restarting a process or service takes anywhere from a few seconds to a few minutes, and the services it offers aren't available during the restart. There are special software programs that can modify themselves and their functions via a defined interface. However, most software can't.

What Is Live Patching?
Live patching is one of the capabilities in version 3.1 of the CGL requirement definition document released in June 2005. This feature enables a process to modify its functions without restarting, a much-needed capability for telecommunication systems that are expected to be continuously in service.

One approach to achieving live patching is to overwrite the entry point of a function with a "jmp" instruction that redirects execution to a replacement function, which is the method applied by the PANNUS project. PANNUS enables the replacement of a function without restarting a process. This approach is very practical because most software programs are structured as collections of functions.
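To make the idea concrete, here is a minimal Python sketch of how the bytes of such a jump could be computed on i386. It assumes the standard x86 "jmp rel32" encoding (opcode 0xE9 followed by a 32-bit little-endian displacement); the function name and the addresses are hypothetical, and the sketch only builds the bytes, it does not touch any process.

```python
import struct

JMP_REL32_OPCODE = 0xE9  # x86 "jmp rel32": one opcode byte plus a 4-byte displacement
JMP_REL32_LENGTH = 5


def encode_jmp_rel32(entry_addr: int, target_addr: int) -> bytes:
    """Build the 5 bytes that redirect execution from entry_addr to target_addr."""
    for addr in (entry_addr, target_addr):
        if not 0 <= addr < 2**32:
            raise ValueError("addresses must fit in 32 bits on i386")
    # The displacement is relative to the address of the *next* instruction.
    # In 32-bit mode the addition wraps modulo 2**32, so masking is sufficient.
    displacement = (target_addr - (entry_addr + JMP_REL32_LENGTH)) & 0xFFFFFFFF
    return struct.pack("<BI", JMP_REL32_OPCODE, displacement)


if __name__ == "__main__":
    old_function = 0x08048000   # hypothetical entry point of the buggy function
    new_function = 0x08049000   # hypothetical address of its replacement
    print(encode_jmp_rel32(old_function, new_function).hex(" "))
```

Once these five bytes sit at the old entry point, every call to the old function lands immediately in the replacement, while callers and their call sites remain untouched.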

Live Patching Requirements
This section describes the requirements of live patching from the viewpoints of carrier service, software structure, and operating environment.

Real-Time Capability
Live patching has been used in the telecommunication industry for a long time. Customers expect that their voice and data services will always be available. To ensure service 24 hours a day, 365 days a year, it must be possible to maintain and expand service on running telecom systems without disruption. Typical telecommunication systems are constructed as redundant systems following, for example, the 1+1 redundancy model, where one server is active and handles service requests while the second is a hot standby for the first. Each server in such a configuration knows the status of the other through a "heartbeat" mechanism that sends signals between the two servers as keep-alive messages. If both redundant servers fail, services are stopped and many customers are affected. Furthermore, once the service is no longer available, it takes a long time to recover and resume service because telecommunication switches are complex systems that consist of multiple components. So patches have to be applied without disrupting the service to end users who subscribe to telephony services.
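The heartbeat idea in the 1+1 model can be sketched in a few lines of Python. This is an illustration only: the class, its field names, and the timeout value are assumptions, and a real carrier-grade system would use a dedicated cluster framework rather than code like this.

```python
from dataclasses import dataclass


@dataclass
class StandbyMonitor:
    """Hot-standby side of a 1+1 pair (all names here are illustrative)."""

    timeout_s: float               # silence longer than this means the peer is down
    last_heartbeat_s: float = 0.0  # time the last keep-alive message arrived
    active: bool = False           # the standby starts out passive

    def on_heartbeat(self, now_s: float) -> None:
        """Record a keep-alive message received from the active server."""
        self.last_heartbeat_s = now_s

    def poll(self, now_s: float) -> bool:
        """Promote the standby to active once the peer has been silent too long."""
        if not self.active and now_s - self.last_heartbeat_s > self.timeout_s:
            self.active = True
        return self.active


if __name__ == "__main__":
    standby = StandbyMonitor(timeout_s=3.0)
    standby.on_heartbeat(now_s=1.0)
    print("active after 1.0 s of silence:", standby.poll(now_s=2.0))
    print("active after 4.5 s of silence:", standby.poll(now_s=5.5))
```

Restarting the active server to apply a patch forces exactly this kind of failover, leaving the pair without redundancy until the restarted server is back.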

The Limitations of Target Software
Developers can release several hundred software patches for each piece of software, including patches that are significant to the system's base software, e.g., the fundamental system software and libraries. If these patches are applied as source patches rather than through a live-patching mechanism, the processes require frequent restarting to enable the new patches and bug fixes. If these patches aren't applied quickly, the servers will encounter fatal errors or delays in adding features necessary to the service. So live patching should be applicable both to a customer's original software and to generic fundamental system software programs that are widely used. If the approach requires the target software to have a specific feature or to link to a particular library, achieving live patching is expensive, especially on large complex systems.

Easy Operation
After a patch module is applied by live patching, the modified system must be monitored for a certain period of time to confirm that the patch is behaving properly. If a fatal problem occurs during that "trial" period, the activated patch must be deactivated immediately, re-checked, fixed, and re-applied. So to make operations easier, patch modules should have explicitly defined state transitions that can be cancelled.
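One way to picture such cancellable state transitions is as a small table-driven state machine. The Python sketch below is an assumption made for illustration: the state and command names are invented here and are not taken from PANNUS.

```python
from enum import Enum


class PatchState(Enum):
    UNLOADED = "unloaded"
    LOADED = "loaded"
    ACTIVE = "active"        # trial period: patch in effect and under observation
    COMMITTED = "committed"  # trial passed; patch accepted


# Allowed transitions. Every forward step before COMMITTED has a cancelling step.
TRANSITIONS = {
    (PatchState.UNLOADED, "load"): PatchState.LOADED,
    (PatchState.LOADED, "activate"): PatchState.ACTIVE,
    (PatchState.ACTIVE, "commit"): PatchState.COMMITTED,
    (PatchState.ACTIVE, "deactivate"): PatchState.LOADED,   # cancel the trial
    (PatchState.LOADED, "unload"): PatchState.UNLOADED,     # cancel the load
}


def step(state: PatchState, command: str) -> PatchState:
    """Apply one operator command, rejecting anything the table does not allow."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"cannot {command!r} a patch in state {state.value!r}") from None


if __name__ == "__main__":
    state = PatchState.UNLOADED
    # A trial that goes wrong: the operator backs the patch out again.
    for command in ["load", "activate", "deactivate", "unload"]:
        state = step(state, command)
        print(f"{command:>10} -> {state.value}")
```

Rejecting every command that is not in the table keeps an operator from, for example, committing a patch that was never activated.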

In a typical operating environment, the person who applies the patch modules is a maintenance engineer, not the original developer of the patches. Maintenance engineers don't know much about the bug fixes themselves. If applying a live patch is too complicated, mistakes are unavoidable. So the act of live patching should be easy to do.

The PANNUS Approach
PANNUS is a live-patching implementation that enables live patching for processes by overwriting the "jmp" assembly code at the entry point of a function (see Figure 1).

Outline of Processing

  • Stage 1: Loading Patch - The first stage is loading the patches into the target process's memory space. For this stage, PANNUS has a new system call, "mmap3," an expansion of the "mmap2" system call that enables the mapping of a given memory area into a target process. The "mmap3" system call enables a target process to load patch modules without disturbing the process execution. Before loading, PANNUS checks for duplicate global symbols in the target binary and the patch module binary, and omits duplicate symbols to improve usability for programmers. Next, PANNUS seeks a vacant space in which to map the patch module and loads it as a shared library. So patch modules must be built with the "-shared -fPIC" options.

  • Stage 2: Linking Patch - The second stage is linking patch modules to the target process. After reading the relocation table of the patch modules, PANNUS calculates the relocated addresses and writes them to the process memory by using a new system call, "accesspvm." The "accesspvm" call doesn't disturb the target process execution because the memory areas it writes to are the relocated locations of the patch module, which aren't used by the target process yet. PANNUS also saves the dependency information of applied patch modules in the target process memory to maintain relations between the target process and loaded patch modules. If users try to load a patch module that depends on another patch module that isn't loaded, PANNUS detects the missing dependency and stops execution to prevent a software error.

  • Stage 3: Initialize Patch - The third stage is making the target process execute the initialization of the patch module. Binary files, such as an executable file or shared object file, have an "_init" section, which is executed right after they're loaded. This section initializes some global variables and objects. PANNUS makes the target process execute this section by using a capability similar to that of a signal handler.

  • Stage 4: Writing Assembly Code - The final stage is writing the "jmp" assembly code to the entry point of a target function. This stage is the most critical for the target process. PANNUS attaches to the target process by using ptrace and checks whether it can overwrite the entry point with the "jmp" code safely. If the "jmp" code overwrites an instruction that a CPU is currently executing, the CPU executes an unexpected instruction after PANNUS detaches, and the process malfunctions. So PANNUS checks which instruction each CPU is about to execute, and if a CPU is executing an instruction that will be overwritten by the "jmp" code, PANNUS detaches from the process and tries again later to ensure atomic access. In addition, if an address in the "jmp" region is included in the stack of the target process, the CPU will execute an unexpected instruction after returning from the "callee" function or interrupt handler. Hence, the "jmp" region must not include instructions such as "call" or "int." The length of the "jmp" code that PANNUS writes to the entry point of a target function is 5 bytes on the i386 architecture or 14 bytes on the x86-64 architecture, so it's safe to embed 5 (or 14) "nop" (no operation) instruction codes at the entry of each function in the target process in advance. After checking, PANNUS overwrites the entry point with the "jmp" code and activates the patch module.

    PANNUS uses a slightly different method for a process that handles C++ exceptions. The binary of such a process has sections such as ".eh_frame" or ".gcc_except_table" that hold information used for exception handling. When the patch is applied only by overwriting "jmp" assembly code, this information isn't processed correctly for the patch module. If an exception occurs, the process aborts because it can't find the information it needs to handle the exception. In this case, PANNUS makes the target process execute "dlopen," which is usually used for loading additional shared libraries, so that the patch modules are initialized as they are loaded. This approach is much more costly than normal "mmap3" loading because the process itself has to load the patch modules through "dlopen" before executing initialization.
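The safety check described in Stage 4 can be sketched as follows. This Python sketch makes several assumptions: the thread program counters and stack return addresses are passed in as plain lists (a real tool would read them from the stopped process, for example via ptrace), the function names are hypothetical, and the check is deliberately conservative in treating any address inside the region as unsafe.

```python
JMP_LENGTH = {"i386": 5, "x86-64": 14}  # byte counts quoted in the article


def overwrite_is_safe(entry_addr, arch, instruction_pointers, return_addresses):
    """Return True if the jmp region at entry_addr can be overwritten right now.

    instruction_pointers: current program counter of every thread.
    return_addresses: return addresses found on the threads' stacks.
    """
    start = entry_addr
    end = entry_addr + JMP_LENGTH[arch]  # exclusive upper bound of the region

    def inside(addr):
        return start <= addr < end

    # Unsafe if any thread is executing in the region or will return into it.
    return not any(inside(a) for a in list(instruction_pointers) + list(return_addresses))


def patch_with_retry(sample_threads, entry_addr, arch, max_attempts=3):
    """Re-sample the process until the region is safe; return the attempt number.

    sample_threads is a hypothetical callable standing in for "attach, inspect,
    detach"; it returns (instruction_pointers, return_addresses).
    """
    for attempt in range(1, max_attempts + 1):
        pointers, returns = sample_threads()
        if overwrite_is_safe(entry_addr, arch, pointers, returns):
            return attempt  # a real tool would write the jmp bytes here
    return None  # gave up: the region stayed busy


if __name__ == "__main__":
    entry = 0x400000
    samples = iter([([0x400003], []), ([0x401000], [0x402000])])
    print("patched on attempt:", patch_with_retry(lambda: next(samples), entry, "x86-64"))
```

In the usage example the first sample finds a thread inside the 14-byte region, so the sketch backs off and succeeds on the second sample, mirroring the detach-and-retry behavior described above.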



About Takashi Ikebe
Takashi Ikebe is a senior open source development engineer with NTT Network Service Systems Laboratories. Within CGL, he participates in the Specifications Group.

About Masahiko Uchiyama
Masahiko Uchiyama is a software developer on the PANNUS project. He is a chief system engineer as well as an assistant manager for NTT COMWARE. He has worked with IP telephony switches and related systems for six years. He currently lives in Chiba, Japan.
