Toward Carrier Grade Linux Platforms

OSDL Carrier Grade Linux and Ericsson's contributions

In this article, Ibrahim Haddad presents the Open Source Development Labs (OSDL) Carrier Grade Linux requirements and Ericsson's contributions in this area.

On May 20, 2002, Ericsson joined the Open Source Development Labs (OSDL), working with other OSDL members to develop feature roadmaps and to enable Linux for the telecommunications market. Since then, OSDL has released the 1.0, 1.1, and 2.0 versions of the Carrier Grade Linux Requirement Definitions and work is ongoing for version 3.0. In parallel to the OSDL activities of defining requirements for Carrier Grade Linux and identifying open source projects meeting the requirements, Linux distributors such as MontaVista and SUSE have made available to the market Linux distributions that meet the Carrier Grade Linux requirements as defined by OSDL.

This article examines the motivations for Linux and introduces the Open Source Development Labs and its working groups. It also covers the activities of the Carrier Grade Linux working group, the Carrier Grade Linux requirements, and Ericsson's three major source code contributions in this area.

Motivations for Linux
Historically, most telecommunications and data service networks were built on proprietary platforms that meet very specific availability and service response requirements.

Communications service providers are now challenged to cost-effectively meet their needs for new architectures and increased bandwidth while maintaining highly available, scalable, and secure systems that are easily maintained and have predictable performance.

An open software environment with the characteristics demanded by Carrier Grade applications, combined with commercial off-the-shelf software and hardware components, is a necessary part of these new architectures. At the base of such an environment is the operating system. Linux is the fastest growing general-purpose server operating system available today. Several reasons contribute to Linux being a candidate operating system for Carrier Grade systems:

  • Cost: Linux is available free in the form of a downloadable package from the Internet.
  • Availability of source code: You gain full access to the source code, allowing you to tailor the kernel to your needs.
  • Open development process: The development process of the kernel is open to anyone to participate and contribute. The process is based on the concept of "release early, release often."
  • Vendor independence: With Linux, you no longer have to be locked in to a specific vendor. Linux is supported on multiple platforms.
  • Peer review and testing resources: With access to the source code, people using a wide variety of platforms, operating systems, and compiler combinations can compile, link, and run the code on their systems to test for portability, compatibility, and bugs.
  • High innovation rate: New features are usually implemented on Linux before they are available on commercial or proprietary systems.
  • Availability of commercial support: Many companies provide support for Linux deployments.
Other factors are the availability of standard APIs and other interfaces, its support for a broad range of processors and peripherals, its high-performance networking, and its proven record as a stable, reliable, robust, and performant server platform.

The Open Source Development Labs (OSDL)
OSDL is a nonprofit organization that was founded in 2000 to accelerate the growth and adoption of Linux in the enterprise. It is now home to Linus Torvalds, creator of Linux and the first OSDL fellow, and Andrew Morton, kernel 2.6 maintainer. OSDL is sponsored and supported by a global consortium of IT and telecom industry leaders and provides state-of-the-art computing and test facilities in the U.S. and Japan that are available to developers around the world.

OSDL has two working groups:

  1. Data Center working group: Develops the roadmap for Linux platform software that supports commercial software products and corporate IT requirements, enabling developers to create Linux-based solutions for the data center market segment.
  2. Carrier Grade Linux (CGL) working group: Chartered by the OSDL member companies to enhance the Linux OS to achieve an open source platform that is highly available, secure, scalable, easily maintained, and suitable for Carrier Grade systems.
Carrier Grade Linux Working Group
Carrier Grade is a term for public network telecommunications products that require up to five or six nines of availability (i.e., 99.999% or 99.9999% uptime). Carrier Grade Linux is an enhancement of the vanilla Linux kernel aimed at the communications network industry.
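
To make the availability figures concrete, the following minimal sketch converts an availability percentage into the downtime it permits per year. The function name and the use of a 365-day year are assumptions made for illustration; they are not part of the CGL requirements.

```python
# Allowed downtime per year for a given availability target.
# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in minutes, for an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


five_nines = downtime_minutes_per_year(99.999)   # roughly 5.26 minutes per year
six_nines = downtime_minutes_per_year(99.9999)   # roughly 0.53 minutes per year

print(f"99.999%  -> {five_nines:.2f} minutes/year")
print(f"99.9999% -> {six_nines * 60:.1f} seconds/year")
```

In other words, five nines leaves a little over five minutes of downtime per year, and six nines leaves about half a minute.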

The CGL Working Group has identified three main categories of application areas into which they expect the majority of applications implemented on CGL platforms to fall. These application areas are gateways, signaling servers, and management servers. The CGL Working Group will focus initially on Linux platform requirements to support applications in these areas.

  • Gateways are bridges between two different technologies or administration domains. Typically, a gateway processes large numbers of small messages received and transmitted over a large number of physical interfaces. Gateways must respond in a timely manner, very close to hard real time. They are implemented on dedicated platforms with replicated systems used for redundancy.
  • Signaling servers handle call control, session control, and radio resource control. A signaling server handles the routing and maintains the status of calls over the network. It takes the requests of user agents who want to connect to other user agents and routes these requests to the appropriate signaling server. Signaling servers require soft real time response capabilities and may manage tens of thousands of simultaneous connections. Due to requirements for quick switching and a capacity to manage large numbers of connections, a signaling server application is context switch and memory intensive.
  • Management servers handle traditional network management operations, as well as service management and customer management. These servers provide services such as a Home Location Register and Visitor Location Register (for wireless networks) or customer information (such as personal preferences, including features the customer is authorized to use). Typically, management applications are data and communication intensive.
Carrier Grade Linux Architecture
Figure 1 shows the high level architecture for a Carrier Grade Linux platform. The scope of the CGL Working Group will encompass two areas:
  1. Linux OS (kernel) with Carrier Grade enhancements: CGL enhancements to the operating system are related to various requirements such as availability and scalability. Enhancements may also be made to interfaces to hardware, interfaces to the user level or application code, and interfaces to development and debugging tools. In some cases, user-level library changes will be needed to access the kernel services.
  2. Software development tools: These tools will include debuggers and analysis tools.
Following the CGL architecture, middleware vendors are encouraged to provide user mode software and services to the application developer. As such, applications give equipment manufacturers and distributors a means to differentiate their offerings from those of their competitors. The Linux kernel (Linux OS, as shown in Figure 1) becomes the commodity component of the solution stack.

 

Carrier Grade Enhancements
The Carrier Grade enhancements to the Linux kernel fall into the following categories: high availability, security, serviceability, performance and scalability enhancements, reliability, standards, and clustering (see Figure 2).

 

The implementations providing these enhancements are developed as open source projects and are planned for integration with the Linux kernel once they are mature and ready for merging. In some cases, bringing a project to maturity takes considerable time before its integration into the Linux kernel can be requested. Nevertheless, some of the enhancements are targeted for inclusion in kernel version 2.7. Other enhancements will follow in later kernel releases. Meanwhile, all enhancements will be available from SourceForge or the specific projects' Web sites.

CGL Specifications
On October 9, 2003, OSDL released the CGL Requirement Definition 2.0, which is organized into three broad classifications: general systems, clustering, and security. The scope of these requirements is limited to Linux as an operating environment, and it includes the Linux kernel and libraries like glibc and libpthreads, which are key to the operation of a Linux-based system.

Standards Requirements
The standards requirements reference specifications that are controlled outside of the OSDL CGL working group and are important to Carrier Grade systems, such as Linux Standard Base compliance, POSIX interfaces, Service Availability Forum compliance, and many IETF RFCs.

Platform Requirements
The platform requirements support interactions with the hardware platforms making up Carrier Grade server systems. Platform capabilities are vital building blocks, innately closer to the hardware than the availability and serviceability categories. OSDL CGL specifies platform capabilities that are not tied to a particular vendor's implementation. The specification may suggest model implementations that are tied to platforms but does not require them. Examples of such requirements include hot swap, hot insert, hot remove, hot device identity, remote boot, boot cycle detection, and so on.

Availability Requirements
These requirements support high availability of Carrier Grade server systems. Examples of such requirements include improving the robustness of software components, supporting recovery from failure of hardware or software, support for watchdog timers, application heartbeat, RAID, resilient file system support, disk and volume management, multiple Ethernet NIC bonding and failover, hardened driver support, etc.
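
As an illustration of the application heartbeat idea mentioned above, here is a minimal user-space sketch. It is not taken from any CGL implementation: the class name, the application names, and the two-second timeout are all hypothetical, and a fake clock is injected so the example runs deterministically.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Records the last heartbeat of each application and reports silent ones."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen: Dict[str, float] = {}

    def beat(self, app: str) -> None:
        """Called by an application to signal that it is still alive."""
        self.last_seen[app] = self.clock()

    def expired(self) -> List[str]:
        """Return the applications whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(app for app, seen in self.last_seen.items()
                      if now - seen > self.timeout_s)


# Demonstration with a fake clock (hypothetical application names).
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_s=2.0, clock=lambda: fake_now[0])
monitor.beat("call-control")
monitor.beat("billing")

fake_now[0] = 1.5
monitor.beat("call-control")      # only call-control keeps reporting

fake_now[0] = 3.0
failed = monitor.expired()        # billing has been silent for 3.0 s
print(failed)
```

A real platform would pair such a monitor with a recovery action, for example restarting the silent application or failing over to a standby node.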

Serviceability Requirements
The serviceability requirements support servicing and managing hardware and software on Carrier Grade server systems. Examples of such requirements include resource monitoring, kernel dumps, kernel message structuring, platform signal handler, remote access to event log, dynamic debug/probe insertion, etc.

Tools Requirements
These requirements support auxiliary capabilities that are used to develop modules, drivers, or applications but are not directly involved in the normal execution of carrier server systems. Examples of such requirements include providing capabilities to facilitate diagnosis, user-level (gdb) debugging support for threads, kernel debugging, etc.

Performance Requirements
These requirements support performance levels necessary for Carrier Grade server systems environments. Examples of such requirements include soft real-time support, preemptible kernel, application (pre) loading, etc.

Scalability Requirements
The scalability requirements support vertical and horizontal scaling of Carrier Grade server systems such that addition of hardware resources results in acceptable increases in capacity.

Clustering Requirements
The clustering requirements are aimed at supporting clustered applications in a Carrier Grade environment as an effective way to achieve highly available services inside a network element. The clustering requirements support the use of multiple carrier server systems. This is to support higher levels of service availability through redundant resources and recovery capabilities, and to provide a horizontally scaled environment supporting increased throughput.

Security Requirements
The security requirements are aimed at maintaining a certain level of security while not endangering the goals of high availability, performance, and scalability. The requirements support the use of additional security mechanisms to protect the system against attacks from both the Internet and intranets, and provide special mechanisms at the kernel level to be used by telecom applications.

Ericsson's Contributions to CGL
The Open Systems Lab located in Montreal, Canada, is the group from Ericsson Research representing Ericsson in the OSDL CGL working group and working with other OSDL members for the advancement of Linux in the telecom space. Ericsson has contributed three major projects to open source, filling the gap in missing implementations for the CGL. These projects are the Telecom IPC, the Distributed Security Infrastructure, and the Asynchronous Event Mechanism.

Telecom IPC (TIPC)
TIPC is a protocol specially designed for intracluster communication that has been used as a part of Ericsson products for years. It has been ported to Linux and it is provided as a portable source code package implemented as a loadable kernel module.

As far as we know, there is no other protocol available providing the combination of versatility and performance of TIPC. The combination of the functional addressing scheme with the subscription services and its alert connection concept is unique. The signaling link implementation, providing full load sharing and safe failover over any type of bearer, is also a big asset. The core features of TIPC are the following:

  • Full addressing transparency: TIPC provides a functional addressing scheme, hiding all aspects of the cluster's physical topology from the application programs. Mapping between functional and physical addresses is performed transparently and on the fly using a distributed, internal translation table. There is no user-level distinction between inter- and intraprocessor communication, user-to-user space, or user-to-kernel space communication. Changes in software or hardware configuration will result in automatic, swift, and nondisturbing updates of this table.
  • Lightweight, alert connections: By avoiding any hidden protocol messages, the message exchange within a transaction, i.e., connection setup, short data transfer, and shutdown, can be tailor-made by the user, and hence be made more efficient. An established connection will react to and report a problem to the application immediately upon any kind of service failure, such as process or processor crash.
  • Generic, adaptive, signaling link protocol: Tasks that are typically implemented in the transport layer, such as retransmission, segmentation, bundling, and continuity check are pushed down to the signaling link layer. This makes the link layer more complex but gives a better resource utilization and results in a more efficient stack. Signaling links are tightly supervised by a continuity check of configurable frequency, and are able to detect and report link failures within a fraction of a second. Failover to redundant links in such cases is handled transparently and disturbance-free. Signaling links are self-configuring, using a broadcast/multicast neighbor detection protocol when possible.
  • Performance: TIPC transfers short (< 1KB) single messages between processors 25–35% faster than TCP/IP, and with comparable speed for larger messages. For intraprocessor messages, delivery time is 75% shorter. Furthermore, by using the lightweight connection mechanism, a transaction can be performed by exchanging as few as two messages, compared with a minimum of nine in TCP/IP. Hence, short transactions, typical in telecom applications, may be performed in a fraction of the time of corresponding TCP transactions.
  • Quality of Service (QoS): In-sequence, loss-free message delivery can be guaranteed in both connection-oriented and connectionless mode. In case of destination unavailability, undelivered messages are returned to the sender along with an error code indicating the cause of the problem.
  • Topology subscription services: It is possible for application programs to subscribe for the availability/nonavailability of functional and physical addresses. This means that it is easy to keep track of both functional and topological changes in the cluster, as well as synchronize the startup of distributed applications.
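
The functional addressing and topology subscription features described above can be pictured with a small in-memory model. This is only a toy simulation, not the TIPC API: the class, the method names, the address tuples, and the "pick the first instance" selection rule are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

PhysicalAddress = Tuple[str, int]  # hypothetical (node name, port) pair
Listener = Callable[[str, str, PhysicalAddress], None]


class NameTable:
    """Toy translation table from functional names to physical addresses."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Set[PhysicalAddress]] = defaultdict(set)
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, name: str, listener: Listener) -> None:
        """Ask to be told when instances of `name` appear or disappear."""
        self._listeners[name].append(listener)

    def publish(self, name: str, address: PhysicalAddress) -> None:
        self._bindings[name].add(address)
        self._notify(name, "available", address)

    def withdraw(self, name: str, address: PhysicalAddress) -> None:
        self._bindings[name].discard(address)
        self._notify(name, "unavailable", address)

    def resolve(self, name: str) -> PhysicalAddress:
        """Map a functional name to one physical address (simplest possible rule)."""
        candidates = sorted(self._bindings.get(name, ()))
        if not candidates:
            raise LookupError(f"no instance published for {name!r}")
        return candidates[0]

    def _notify(self, name: str, event: str, address: PhysicalAddress) -> None:
        for listener in self._listeners.get(name, []):
            listener(name, event, address)


events = []
table = NameTable()
table.subscribe("call-router", lambda name, event, addr: events.append((event, addr)))

table.publish("call-router", ("node-a", 1001))
table.publish("call-router", ("node-b", 1002))
first = table.resolve("call-router")             # sender never names a node

table.withdraw("call-router", ("node-a", 1001))  # e.g. node-a goes down
second = table.resolve("call-router")            # same name, new destination
print(first, second, events)
```

The sender addresses "call-router" throughout; only the table changes when the topology does, and the subscriber is told about each change.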
TIPC Availability and Status
TIPC is deployed widely in Ericsson products at hundreds of sites around the globe. TIPC was released to open source in January 2003 under the GPL license as a source code contribution from Ericsson to OSDL Carrier Grade Linux. TIPC licensing was later changed to the BSD license to allow more freedom for users and adopters.

TIPC should be regarded as a useful toolbox for anyone wanting to develop or use Carrier Grade Linux clusters. It will provide the necessary infrastructure for cluster, network, and software management functionality, as well as good support for designing site-independent, scalable, distributed, highly available, and high-performance applications.

The Distributed Security Infrastructure (DSI)
The Open Systems Lab at Ericsson Research started the DSI project, as an Open Source project released under the GPL license, to provide a security solution tailored for Carrier Grade Linux servers.

The most commonly used security approach for clustered systems is to package several existing solutions. This yields a complex integration and management process of all the different packages and often results in the absence of interoperability between different security mechanisms, not to mention the complex process of maintaining and upgrading these packages. In addition, Carrier Grade clusters have tight restrictions on performance and response time, making the design of security solutions difficult and excluding many security solutions from deployment due to their high resource consumption. Furthermore, the need for security has increased as telecom servers move from closed systems to IP-based systems connected to the Internet, thereby increasing the risk of security problems due to the more open environment of the Internet.

The project was initiated to fill the gap and provide an open source solution for Carrier Grade Linux clusters. For more on the technical motives for DSI, please refer to the DSI Web site, which offers additional information and white papers.

DSI Characteristics
As part of a Carrier Grade Linux cluster, DSI must comply with Carrier Grade requirements of reliability, scalability, and high availability. DSI supports the following requirements:

  • Coherent framework: Security must be coherent across different layers of heterogeneous hardware, applications, middleware, operating systems, and networking technologies. All mechanisms must fit together to prevent any exploitable security gap in the system.
  • Process-level approach: DSI is based on a fine-grained basic entity, the process.
  • Minimal impact on performance: The introduction of security features must not impose high performance penalties. Performance can be expected to degrade slightly during the initial establishment of a security context; however, the impact on subsequent accesses must be negligible.
  • Preemptive security: Changes in the security context will be reflected immediately on the running security services. Whenever the security context of a subject changes, the system will re-evaluate its current use of resources against this new security context.
  • Dynamic security policy: It must be possible to support runtime changes in the distributed security policy. Carrier-class server nodes must provide continuous and long-term availability; thus, it is impossible to interrupt the service to enforce a new security policy.
To examine the DSI requirements in more detail, please refer to the DSI publications available from the DSI Web site.
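
The preemptive security and dynamic security policy requirements listed above can be sketched as follows. This is a user-space illustration under stated assumptions, not DSI code: the class names, the security class labels, and the resource names are hypothetical, and the policy is reduced to a set of allowed (security class, resource) pairs.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set, Tuple

Rule = Tuple[str, str]  # (security class, resource), hypothetical policy format


@dataclass
class Process:
    pid: int
    security_class: str
    open_resources: Set[str] = field(default_factory=set)


class PolicyEnforcer:
    """Re-checks resources already in use whenever the policy or a context changes."""

    def __init__(self, allowed: Iterable[Rule]) -> None:
        self.allowed: Set[Rule] = set(allowed)
        self.processes: Dict[int, Process] = {}

    def register(self, process: Process) -> None:
        self.processes[process.pid] = process

    def open(self, pid: int, resource: str) -> bool:
        proc = self.processes[pid]
        if (proc.security_class, resource) not in self.allowed:
            return False
        proc.open_resources.add(resource)
        return True

    def change_context(self, pid: int, new_class: str) -> List[Tuple[int, str]]:
        """Preemptive security: a context change triggers immediate re-evaluation."""
        proc = self.processes[pid]
        proc.security_class = new_class
        return self._reevaluate([proc])

    def update_policy(self, allowed: Iterable[Rule]) -> List[Tuple[int, str]]:
        """Dynamic policy: swap the rules at run time without stopping anything."""
        self.allowed = set(allowed)
        return self._reevaluate(self.processes.values())

    def _reevaluate(self, procs: Iterable[Process]) -> List[Tuple[int, str]]:
        revoked = []
        for proc in procs:
            for resource in sorted(proc.open_resources):
                if (proc.security_class, resource) not in self.allowed:
                    proc.open_resources.discard(resource)
                    revoked.append((proc.pid, resource))
        return revoked


enforcer = PolicyEnforcer({("operator", "subscriber-db"),
                           ("operator", "event-log"),
                           ("guest", "event-log")})
enforcer.register(Process(pid=42, security_class="operator"))
opened_db = enforcer.open(42, "subscriber-db")
opened_log = enforcer.open(42, "event-log")

revoked_on_context_change = enforcer.change_context(42, "guest")
revoked_on_policy_change = enforcer.update_policy(set())
print(revoked_on_context_change, revoked_on_policy_change)
```

The point of the sketch is the ordering: access already granted is re-examined at the moment the context or policy changes, rather than at the next access attempt.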

DSI Architecture
Figure 3 presents the DSI architecture. DSI has two types of components: management and service. DSI management components define a thin layer that includes a security server, security managers, and a security communication channel. The service components define a flexible layer that you can modify or update by adding, replacing, or removing services according to your needs.

 

The core components of DSI illustrated in Figure 3 are:

  • The security server (duplicated for redundancy) is the central point of management in DSI. It is the entry point for secure operation and management and intrusion-detection systems. It also defines the dynamic security environment of the whole cluster by broadcasting changes in the distributed policy to all security managers.
  • Security managers enforce security at each node of the cluster. They are responsible for locally enforcing changes in the security environment. Security managers exchange security information only with the security server.
  • The secure communication channel provides encrypted and authenticated communications between security agents. All communications between the security server and outside world take place through the secure communication channel.
  • The distributed security module (DSM) provides the implementation of mandatory access control within a cluster. DSM is responsible for enforcing access control, and providing labeling for the IP messages with the security attributes of the sending process and node across the nodes of the cluster.
The security mechanisms are based on widely known, proven, and tested algorithms. As users must not be able to bypass these mechanisms, the best place to enforce security is at the kernel level. All security decisions, when necessary, are implemented at the kernel level, as is the main security manager component, which has stubs in the kernel. These stubs are implemented as loadable kernel modules.
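
The interplay between the security server, the security managers, and message labeling can be pictured with the following simplified model. It runs entirely in user space and omits the secure communication channel; all class names, labels, and the policy format are assumptions made for illustration and do not reflect the actual DSI interfaces.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

Rule = Tuple[str, str]  # (sender class, receiver class), hypothetical format


@dataclass(frozen=True)
class LabeledMessage:
    payload: bytes
    sender_node: str
    sender_class: str  # security attribute of the sending process


class SecurityManager:
    """Per-node enforcer that holds a local copy of the cluster policy."""

    def __init__(self, node: str) -> None:
        self.node = node
        self.policy_version = 0
        self.allowed: Set[Rule] = set()

    def apply_policy(self, version: int, allowed: Iterable[Rule]) -> None:
        self.policy_version = version
        self.allowed = set(allowed)

    def label(self, payload: bytes, sender_class: str) -> LabeledMessage:
        """Attach the sender's node and security class to an outgoing message."""
        return LabeledMessage(payload, self.node, sender_class)

    def accept(self, message: LabeledMessage, receiver_class: str) -> bool:
        """Decide locally whether the receiving process may get the message."""
        return (message.sender_class, receiver_class) in self.allowed


class SecurityServer:
    """Central point that pushes policy changes to every security manager."""

    def __init__(self) -> None:
        self.version = 0
        self.managers: List[SecurityManager] = []

    def register(self, manager: SecurityManager) -> None:
        self.managers.append(manager)

    def broadcast_policy(self, allowed: Iterable[Rule]) -> None:
        rules = set(allowed)
        self.version += 1
        for manager in self.managers:
            manager.apply_policy(self.version, rules)


server = SecurityServer()
node1, node2 = SecurityManager("node-1"), SecurityManager("node-2")
server.register(node1)
server.register(node2)

server.broadcast_policy({("signaling", "signaling")})
message = node1.label(b"setup", sender_class="signaling")
same_class_ok = node2.accept(message, receiver_class="signaling")
billing_before = node2.accept(message, receiver_class="billing")

server.broadcast_policy({("signaling", "signaling"), ("signaling", "billing")})
billing_after = node2.accept(message, receiver_class="billing")
print(same_class_ok, billing_before, billing_after)
```

Each node decides locally using the label carried by the message, while the server only distributes policy, which matches the division of roles described above.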

Currently, DSI offers three security services:

  1. Distributed access control service, which handles homogeneously all access control requests,
  2. Distributed communications' confidentiality and integrity service, responsible for securing communications between various distributed applications, and
  3. Distributed digital signature service, which dynamically verifies the signature of binaries used on the cluster.
Other security services, such as the secure distributed logging service, are being investigated but have not been implemented yet.
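
The idea behind verifying binaries before they run can be sketched as below. The article does not describe the signature scheme DSI uses, so this example substitutes an HMAC-SHA256 tag from the Python standard library as a stand-in for a real digital signature; the key, the function names, and the sample bytes are all hypothetical.

```python
import hashlib
import hmac


def sign_binary(key: bytes, image: bytes) -> str:
    """Compute an integrity tag for a binary image (stand-in for a signature)."""
    return hmac.new(key, image, hashlib.sha256).hexdigest()


def may_execute(key: bytes, image: bytes, signature: str) -> bool:
    """Allow execution only if the image still matches its recorded tag."""
    expected = sign_binary(key, image)
    return hmac.compare_digest(expected, signature)


cluster_key = b"example-cluster-key"          # hypothetical shared secret
original_image = b"\x7fELF...original code"   # placeholder bytes, not a real binary
signature = sign_binary(cluster_key, original_image)

untouched_ok = may_execute(cluster_key, original_image, signature)
tampered_ok = may_execute(cluster_key, original_image + b"\x90", signature)
print(untouched_ok, tampered_ok)
```

A public-key signature would let nodes verify binaries without holding the signing secret; the keyed hash is used here only to keep the sketch self-contained.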

DSI Availability and Current Status
DSI is an open source project released under the GPL license. The core DSI components, which include the secure communication channel, security server, security manager, access control service, security policy generation, security session manager, and distributed tracing of events, have been implemented and the source code packages are available for download from the DSI Web site (see references).

Currently the efforts are toward implementing the distributed access control service, distributed security policy component, and the real-time integrity verifications for binaries at the kernel level.

Asynchronous Event Mechanism (AEM)
To help advance Linux in telecom and fill the gap in the missing telecom kernel characteristics, the Open Systems Lab initiated the AEM project to design and prototype an event-driven mechanism that focuses on increasing application reliability, performance, and portability. AEM provides these enhancements at the kernel level, allowing applications with asynchronous execution of processes to have better performance through a reduced response time and better scalability.

In addition, AEM provides an event-driven programming methodology by defining specific user interfaces in which event handlers receive all of the data necessary for their execution as parameters. This approach allows easier development and porting of applications based on multithreaded architectures, which is otherwise a cumbersome process.
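
The programming style described here, where a handler receives everything it needs as parameters, can be illustrated with a small user-space analogy. AEM itself works at the kernel level and its interfaces are not shown in this article, so the event loop, the event names, and the handler signatures below are purely hypothetical.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Tuple


class EventLoop:
    """Dispatches queued events to handlers, passing event data as parameters."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[..., None]] = {}
        self._pending: Deque[Tuple[str, Dict[str, Any]]] = deque()

    def register(self, event_type: str, handler: Callable[..., None]) -> None:
        self._handlers[event_type] = handler

    def post(self, event_type: str, **data: Any) -> None:
        """Queue an event together with all the data its handler will need."""
        self._pending.append((event_type, data))

    def run(self) -> int:
        """Dispatch every pending event and return how many were handled."""
        dispatched = 0
        while self._pending:
            event_type, data = self._pending.popleft()
            handler = self._handlers.get(event_type)
            if handler is not None:
                handler(**data)
                dispatched += 1
        return dispatched


log: List[str] = []


def on_packet(source: str, length: int) -> None:
    # Everything the handler needs arrives as parameters; no shared lookup.
    log.append(f"packet from {source}, {length} bytes")


def on_timer(timer_id: int) -> None:
    log.append(f"timer {timer_id} expired")


loop = EventLoop()
loop.register("packet", on_packet)
loop.register("timer", on_timer)
loop.post("packet", source="10.0.0.7", length=128)
loop.post("timer", timer_id=3)
handled = loop.run()
print(handled, log)
```

Because each handler is self-contained, it can be written and tested without reference to thread state, which is the portability benefit the paragraph above points to.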

Currently, the AEM implementation fulfills the requirement for efficient low-level asynchronous events in the OSDL CGL Requirements Definition version 2.0. AEM is released to open source under the GPL license; it is provided as a Linux kernel patch and a set of loadable modules available for the latest stable (2.4) and experimental (2.6) kernel releases. You can read more on AEM from its Web site, listed in the references.

Conclusion
Carrier Grade Linux is a serious effort from the industry to advance Linux in the telecom space and allow careful migration from proprietary closed solutions to open platforms and standards. Ericsson is contributing to the CGL working group, working with other members defining requirements, and providing open source implementations for Linux to meet the stringent telecom requirements. Clearly, Ericsson is a major contributor to the Carrier Grade Linux activities and OSDL.

References

  • Asynchronous Event Mechanism (AEM): http://aem.sf.net
  • BSD License: www.opensource.org/licenses/bsd-license.php
  • Distributed Security Infrastructure (DSI): http://disec.sf.net
  • Ericsson: www.ericsson.com
  • GNU General Public License (GPL): www.gnu.org/copyleft/gpl.html
  • Internet Engineering Task Force (IETF): www.ietf.org
  • Linux Standard Base (LSB): www.linuxbase.org
  • MontaVista Carrier Grade Edition (MV CGE): www.mvista.com/cge
  • Ericsson Research Open System Lab – Linux: www.linux.ericsson.ca
  • Open Source Development Labs (OSDL): www.osdl.org
  • OSDL Carrier Grade Linux Working Group (CGL WG): www.osdl.org/lab_activities/carrier_grade_linux
  • SUSE Linux: www.suse.com
  • Telecom IPC (TIPC): http://tipc.sf.net

    For more information contact Frederic.Rossi@Ericsson.com (asynchronous event mechanism), Ibrahim.Haddad@Ericsson.com (Ericsson & OSDL, open source activities), Makan.Pourzandi@Ericsson.com (distributed security infrastructure), and Jon.Maloy@Ericsson.com (telecom IPC).

    Acknowledgments
    The author would like to acknowledge and thank the following people for their contributions and reviews: Axelle Apvrille, Andre Beliveau, Jon Maloy, and Makan Pourzandi.

  • More Stories By Ibrahim Haddad

    Dr. Ibrahim Haddad is a seasoned telecommunications expert with over a decade of multinational experience in infrastructure, carrier grade, Linux mobile platforms, software development, standards, industry global initiatives, Open Source software and legal compliance. Dr. Ibrahim Haddad is currently Director of Open Source at Palm. His previous professional experiences include Ericsson, the Open Source Development Labs and Motorola. Haddad is the author of “Practical Guide to Open Source Compliance” to be published early 2010 and co-author of two books on Red Hat Linux and Fedora. Dr. Haddad is a Contributing Editor of the Linux Journal and served on numerous conference and review committees. Haddad received a B.Sc. and M.Sc. in Computer Science from the Lebanese American University (Byblos, Lebanon) and a Ph.D. in Computer Science from Concordia University (Montreal, Canada).
