Easing Data Migration

The trick: don't move the data

Linux is emerging as the platform of choice for a growing number of enterprises across the globe. The cost, choice, and control advantages of using Open Source software for mission-critical applications have already enabled hundreds of organizations to control IT costs while expanding IT capabilities and productivity. Customers in telecommunications, financial services, and government have already aggressively deployed Linux for production workloads such as databases, SAP, messaging services, and custom applications.

While moving to a new operating system is not trivial, its complexity pales in comparison to the struggles of migrating actual data from one platform to another in production environments. The ability to migrate data between different operating systems can reduce IT costs, either as part of platform migrations or multi-platform workflows.

Many companies undertake elaborate migration projects that require the manual migration of data. However, manually migrating data between typically disparate and incompatible systems requires a substantial investment of time and labor. In fact, the complexity of a manual migration often outweighs the benefits the migration promises. Moreover, such migrations carry a range of risks, including data loss, data corruption, policy non-compliance, and - worst of all - production downtime.

As a result, a growing number of organizations are turning to automated data migration tools to minimize such costs and risks in migrating production data workloads.

The Need to Migrate Data
Server and storage equipment replacements, relocation, consolidation, lease renewals, and workload balancing all drive the need to migrate data on a regular basis. With larger disk sizes readily available, many organizations are looking to control costs by replacing a number of smaller drives with fewer but larger drives. Of course, fewer drives also mean fewer spindles, which can negatively impact overall system performance.
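The spindle trade-off can be made concrete with a little arithmetic. The sketch below is illustrative only: the drive counts, the capacities, and the figure of 150 random I/O operations per second per drive are assumptions chosen for the example, not measurements.

```python
# Illustrative assumption: each spinning drive sustains roughly the same
# number of random I/O operations per second (IOPS), whatever its capacity.
ASSUMED_IOPS_PER_DRIVE = 150


def aggregate(drive_count, capacity_gb):
    """Return (total capacity in GB, total random IOPS) for a set of drives."""
    return drive_count * capacity_gb, drive_count * ASSUMED_IOPS_PER_DRIVE


# Hypothetical consolidation: eight small drives replaced by two large ones.
old_capacity, old_iops = aggregate(drive_count=8, capacity_gb=250)
new_capacity, new_iops = aggregate(drive_count=2, capacity_gb=1000)

print(f"8 x 250 GB:  {old_capacity} GB, about {old_iops} IOPS")
print(f"2 x 1000 GB: {new_capacity} GB, about {new_iops} IOPS")
```

Under these assumptions the capacity is unchanged, but the consolidated set can service only a quarter of the random I/O, which is the performance cost the paragraph above refers to.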

Others simply have so much storage spread out among their worldwide data centers that storage migrations become a frequent process of removing old storage and adding new storage devices.

Other organizations discover they've outgrown their storage capabilities faster than anticipated and planned for, making their existing infrastructure unable to accommodate current and future data storage needs.

The Challenges
Migrating data to a Linux platform is easier said than done. According to a recent survey by Symantec, more than 72% of respondents take more than two weeks to plan an implementation, and more than 40% of migrations require more than five people to complete. What's more, 61% exceed their planned downtime, 54% exceed their budget, and 83% exceed their staffing plan.

First, there are the operational issues to consider. Downtime must be scheduled, particularly in cases where the organization is making an application's data set accessible from another access point in the data center. And, with today's virtual environments, organizations have to be able to migrate from a physical to a virtual environment, and vice versa. Inadequate manual and semi-automated approaches make this even more difficult.

In all cases, coordination is key to a successful data migration. All administration groups involved in the process must be aware of the organization's data migration schedule and process, and of their role in it. And access to storage must be re-established with minimal disruption - which is very difficult when upgrading or adding another switch to a storage area network (SAN).

Beyond the operational challenges, organizations have to contend with storage-centric issues, the most daunting being file system issues. When moving data from a Unix to a Linux environment, for example, or simply adding new storage to a server and moving off an old storage device, it's necessary to resize the file system to use the new storage. A number of technologies facilitate this, enabling the virtualization of storage in such a way that the file system can interoperate better with the storage infrastructure.

Organizations must also deal with storage volumes that have incompatible formats, preserve LUN and disk mappings across the migration, handle reclamation, and ensure capacity at the destination. And as with any conversion and migration, the integrity of the data is at risk.

Application-level issues have to be considered when migrating data from one platform to another, such as from Unix to Linux. Application data formats may not be portable across platforms, so some sort of conversion of the data file format has to occur before the same data can be read on a Linux box.

Finally, organizations must contend with TCP/IP network-centric issues such as ensuring sufficient bandwidth and addressing interoperability concerns. Physical connectivity issues, such as re-cabling and the performance implications of topology changes, must also be addressed.

Easing Cross-Platform Data Migration
By some analyst estimates, half or more of enterprises' structured data is stored in databases, and this data is very likely to be migrated between unlike platforms at some point in its lifetime. But manual methods make the process unwieldy, time-consuming, and resource-intensive.

For example, moving a database from an Oracle instance running on a Sun Solaris server to another Oracle instance on a Linux server introduces a number of challenges. The storage volumes mounted on the existing system can't simply be unplugged and attached to the new server because the new Linux-based server can't interpret the information being sent.

There are a number of platform-specific factors that limit the ability to share volumes across servers. Among these are disk drive sector size and block size. As a result, new volumes have to be created on the Linux system, and these volumes have to be configured to receive data from the existing Solaris server. All application processing has to be halted as the data moves from one platform to the next, and the data on the volumes has to be moved physically to the new Linux server. This can be done across the network or manually using tape backup and restore procedures. The volumes will also probably have to be converted before they are mounted or restored on the server. This is typically the case when data is moved between platforms with different endianness (byte order).
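To see why byte order matters, consider how a single integer written by one host looks to another. The sketch below is a minimal illustration using Python's standard `struct` module; the value 500 and the pairing of a big-endian writer (for example, Solaris on SPARC) with a little-endian reader (for example, Linux on x86) are assumptions made for the example.

```python
import struct

# A 32-bit block count as a big-endian host would write it to disk.
block_count = 500
on_disk = struct.pack(">I", block_count)

# A little-endian host reading the same four bytes in its native order
# arrives at a completely different number.
misread = struct.unpack("<I", on_disk)[0]

# Reading with the writer's byte order recovers the original value.
correct = struct.unpack(">I", on_disk)[0]

print(on_disk.hex())  # 000001f4
print(hex(misread))   # 0xf4010000
print(correct)        # 500
```

Every multi-byte field in a volume's on-disk structures is subject to the same problem, which is why the data cannot simply be reattached to a host with a different byte order.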

To overcome these challenges, a growing number of organizations are turning to new technologies that don't move the data but simply let it be accessed from a host running another operating system. The key to this technology is a new default disk format, the basis of platform-independent virtual volume building blocks, often called portable data containers. Volumes formatted with the new parameters of this disk format can be used with volume manager solutions regardless of the operating environment that initialized the disk (including issues like endianness). The resulting volume format removes platform-specific dependencies, including sector and block size, from the data movement equation. In short, why convert and migrate the data when you can just convert the metadata and remount the storage device?
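The idea of converting only the metadata can be sketched with a toy example. Everything in it is hypothetical: the two-field header, the function `convert_header`, and the assumption that the payload is byte-order independent are invented for illustration and do not describe any real volume manager's on-disk format.

```python
import struct

# Hypothetical header layout: block size and block count, two 32-bit integers.
HEADER = "II"


def convert_header(volume, src_order, dst_order):
    """Rewrite only the header in the target byte order; payload is untouched."""
    size = struct.calcsize(src_order + HEADER)
    fields = struct.unpack(src_order + HEADER, volume[:size])
    return struct.pack(dst_order + HEADER, *fields) + volume[size:]


# Toy "volume": a big-endian header followed by opaque application data.
payload = b"application data, stored byte for byte"
big_endian_volume = struct.pack(">II", 4096, 128) + payload

converted = convert_header(big_endian_volume, ">", "<")

print(struct.unpack("<II", converted[:8]))  # (4096, 128)
print(converted[8:] == payload)             # True
```

In this toy model the work done is proportional to the size of the header, not the size of the payload, which is the appeal of converting metadata in place rather than copying the data.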

With this new technology, migrating data from Unix to Linux is a simple process that takes minutes, not days. Administrators unmount the file system on Unix, run a conversion utility, deport the disks on Unix, import the disks on Linux, start the volumes, and mount the file system. According to laboratory tests, this process can be completed in a few minutes for a 500GB tablespace, whereas data conversion from tape backup would take five hours and the same process over NFS would need four hours. In fact, the time such a migration takes depends not on the total size (or capacity) of the data but on the number of files in the file system.
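The laboratory figures are easier to compare when expressed as average throughput. The calculation below is a back-of-the-envelope sketch that assumes decimal units (1 GB = 1000 MB) and that the full 500GB is copied in each case; the function name is made up for the example.

```python
def implied_throughput_mb_s(size_gb, hours):
    """Average throughput in MB/s needed to move size_gb in the given time."""
    megabytes = size_gb * 1000  # decimal units: 1 GB = 1000 MB
    seconds = hours * 3600
    return megabytes / seconds


tape = implied_throughput_mb_s(500, 5)  # tape backup and restore: five hours
nfs = implied_throughput_mb_s(500, 4)   # copy over NFS: four hours

print(f"tape: {tape:.1f} MB/s")  # tape: 27.8 MB/s
print(f"NFS:  {nfs:.1f} MB/s")   # NFS:  34.7 MB/s
```

Both traditional methods are bounded by how fast the data itself can be copied, so their duration grows with capacity. A process that rewrites only metadata is not bound by that copy rate.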

The portable data-container building blocks simplify data migrations between heterogeneous server platforms. Application data storage can be used by any processing platform, which offers IT organizations greater leverage over existing heterogeneous computing resources in their environment.

Enhancing Business Performance
Moving data from one platform to another will never be trivial. In fact, it has historically been so hard that many organizations run their applications on sub-optimal and expensive legacy platforms just to avoid the complexities and downtime associated with data migration.

However, by leveraging new technologies that reduce the time and resources required to move data between unlike platforms - obviating the need for, and the risk of, traditional data migrations - organizations can easily transport volumes between unlike platforms. Physical disks can be grouped into logical volumes to improve disk utilization and eliminate storage-related downtime. Moreover, administrators have the flexibility to move data between storage arrays as needed, migrate data to new operating systems, and move files to the most appropriate storage device based on importance.

With these tools, organizations can reduce cost, risk, and downtime, while enhancing performance and maximizing the productivity of their heterogeneous IT environments.


More Stories By Andy Fenselau

Andy Fenselau has led product management across various parts of the Linux technology stack since 1998. He is currently the Linux Product Line Manager for Symantec's enterprise storage and server management solutions, spending most of his time with customers and partners to ensure Symantec's Linux solutions are meeting their needs. As a Linux evangelist, Andy has authored many articles and spoken at many events about the technical and business advantages of the evolving Linux solutions. He holds a BA from Harvard University and an MBA from Stanford University.
