Easing Data Migration

The trick is don't move the data

Linux is emerging as the platform of choice for a growing number of enterprises across the globe. The cost, choice, and control advantages of using open source software for mission-critical applications have already enabled hundreds of organizations to control IT costs while expanding IT capabilities and productivity. Customers in telecommunications, financial services, and government have already deployed Linux aggressively for production workloads such as databases, SAP, messaging services, and custom applications.

While moving to a new operating system is not trivial, its complexity pales in comparison to the struggles of migrating actual data from one platform to another in production environments. The ability to migrate data between different operating systems can reduce IT costs, either as part of platform migrations or multi-platform workflows.

Many companies undertake elaborate migration projects that require the manual migration of data. However, manually migrating data between disparate and typically incompatible systems requires a substantial investment of time and labor. In fact, this complexity often overwhelms the benefits the migration promises. Moreover, such migrations carry a range of risks, including data loss, data corruption, policy compliance violations, and, worst of all, production downtime.

As a result, a growing number of organizations are turning to automated data migration tools to minimize such costs and risks in migrating production data workloads.

The Need to Migrate Data
Server and storage equipment replacements, relocation, consolidation, lease renewals, and workload balancing all drive the need to migrate data on a regular basis. With larger disk sizes readily available, many organizations are looking to control costs by replacing a number of smaller drives with fewer but larger drives. Of course, fewer drives also mean fewer spindles, which can negatively impact overall system performance.

Others simply have so much storage spread out among their worldwide data centers that storage migrations become a frequent process of removing old storage and adding new storage devices.

Other organizations discover they've outgrown their storage capabilities faster than anticipated and planned for, making their existing infrastructure unable to accommodate current and future data storage needs.

The Challenges
Migrating data to a Linux platform is easier said than done. According to a recent survey by Symantec, over 72% of respondents take more than two weeks to plan an implementation, and over 40% of migrations require more than five people to complete. What's more, 61% exceed their planned downtime, 54% exceed their budget, and 83% exceed their staffing plan.

First, there are the operational issues to consider. Downtime must be scheduled, particularly in cases where the organization is making an application's data set accessible from another access point in the data center. And, with today's virtual environments, organizations have to be able to migrate from a physical to a virtual environment, and vice versa. Relying on inadequate manual and semi-automated approaches makes this even more difficult.

In all cases, coordination is key to a successful data migration. All administration groups involved must be aware of the organization's data migration schedule and process, and of their role in it. Access to storage must also be re-established with minimal disruption, which is very difficult when upgrading or adding a switch in a storage area network (SAN).

Beyond the operational challenges, organizations have to contend with storage-centric issues, the most daunting being file system issues. When moving data from a Unix to a Linux environment, for example, or simply adding new storage to a server and moving off an old storage device, it's necessary to resize the file system to use the new storage. A number of technologies facilitate this, enabling the virtualization of storage in such a way that the file system can interoperate better with the storage infrastructure.

Organizations must also deal with storage volumes that have incompatible formats, the challenge of preserving LUN and disk mappings across the migration, storage reclamation, and the need to ensure capacity at the destination. And as with any conversion and migration, the integrity of the data is at risk.

Application-level issues also have to be considered when migrating data from one platform to another, such as from Unix to Linux. Application data formats may not be portable across platforms, so some sort of conversion of the data file format has to occur before the same data can be read on a Linux box.

Finally, organizations must contend with TCP/IP network-centric issues such as ensuring sufficient bandwidth and addressing interoperability concerns. Physical connectivity issues, such as re-cabling and the performance implications of topology changes, must also be addressed.

Easing Cross-Platform Data Migration
With half or more of enterprises' structured data stored in databases by some analyst estimates, this data is very likely to be migrated between unlike platforms at some point in its lifetime. But manual methods make the process unwieldy, time-consuming, and resource-intensive.

For example, moving a database from an Oracle instance running on a Sun Solaris server to another Oracle instance on a Linux server introduces a number of challenges. The storage volumes mounted on the existing system can't simply be unplugged and attached to the new server because the new Linux-based server can't interpret the information being sent.

There are a number of platform-specific factors that limit the ability to share volumes across servers. Among these are disk drive sector size and block size. As a result, new volumes have to be created on the Linux system, and these volumes have to be configured to get data from the existing Solaris server. All processing of applications has to be halted as the data moves from one platform to the next, and the data on the volumes has to be moved physically to the new Linux server. This can be done across the network or manually using tape backup and restore procedures. And the volumes will probably have to be converted before they are mounted or restored on the server. This is typically the case when data is moved between platforms with different endianness (byte order).
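To make the byte-order problem concrete, here is a minimal Python sketch. It is an illustration only and not part of any migration tool. It assumes the Solaris server is a big-endian SPARC machine and the Linux server is a little-endian x86 machine; the value 500 is arbitrary.

```python
import struct

value = 500  # arbitrary example value, e.g. a block count

# Big-endian layout, as an (assumed) SPARC-based Solaris host would write it.
raw = struct.pack(">I", value)

# The same four bytes interpreted with each byte order.
as_big = struct.unpack(">I", raw)[0]
as_little = struct.unpack("<I", raw)[0]

print(raw.hex())   # 000001f4
print(as_big)      # 500
print(as_little)   # 4093706240
```

The bytes on disk are identical in both reads. Only the interpretation differs, which is why raw volumes cannot simply be re-attached to a host with the opposite byte order.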

To overcome these challenges, a growing number of organizations are turning to new technologies that don't move the data but simply let it be accessed from another operating system host. The key to this technology is a new default disk format, the basis of platform-independent virtual volume building blocks, often called portable data containers. Volumes formatted with the new parameters of this disk format can be used with volume manager solutions regardless of the operating environment that initialized the disk, and regardless of differences such as endianness. The resulting volume format enables platform-specific dependencies, including sector and block size, to be removed from the data movement equation. In short, why convert and migrate the data when you can just convert the metadata and remount the storage device?
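The idea of converting only the metadata can be sketched with a toy example. The layout below (a 12-byte header of three 32-bit fields followed by raw data) is purely hypothetical and is not the real on-disk format of any volume manager or file system. The helper name `convert_header` is likewise invented for illustration.

```python
import struct

# Hypothetical toy layout: a header of three unsigned 32-bit fields
# (magic, block_size, block_count) followed by raw data blocks.
HEADER_FIELDS = "III"
HEADER_SIZE = struct.calcsize(">" + HEADER_FIELDS)  # 12 bytes


def convert_header(volume: bytearray, src: str = ">", dst: str = "<") -> None:
    """Rewrite only the header in the destination byte order, in place."""
    fields = struct.unpack_from(src + HEADER_FIELDS, volume, 0)
    struct.pack_into(dst + HEADER_FIELDS, volume, 0, *fields)


# Build a toy "volume" as a big-endian host would have written it.
payload = b"application data stays where it is"
volume = bytearray(struct.pack(">III", 0xC0FFEE, 512, 1) + payload)

convert_header(volume)

# A little-endian host can now read the header; the payload was never touched.
magic, block_size, block_count = struct.unpack_from("<III", volume, 0)
print(hex(magic), block_size, block_count)
print(bytes(volume[HEADER_SIZE:]) == payload)
```

In this sketch the cost of conversion is proportional to the size of the header, not the payload, which is the intuition behind remounting storage instead of copying it.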

With this new technology, migrating data from Unix to Linux is a simple process, taking minutes, not days. Administrators unmount the file system on Unix, run a conversion utility, deport disks on Unix and import disks on Linux, start volumes, and mount the file system. According to laboratory tests, this process can be completed within a few minutes for a 500GB tablespace, whereas data conversion from tape backup would take five hours and the same process over NFS would need four hours. In fact, the time such a migration takes depends not on the total size (or capacity) of the data but on the number of files in the file system.
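The sequence of steps above can be written down as an ordered checklist. The following Python sketch is a dry run only: it prints each step and executes nothing. The names `Step`, `run_plan`, and `dry_run` are hypothetical, and the assumption that the conversion utility runs on the Unix side and that volumes are started on the Linux side is a reading of the description, not a documented procedure.

```python
from typing import Callable, List, NamedTuple


class Step(NamedTuple):
    host: str         # which server performs the step (assumed)
    description: str  # what the administrator does


# The sequence described above, in order.
MIGRATION_STEPS: List[Step] = [
    Step("unix", "unmount the file system"),
    Step("unix", "run the conversion utility"),
    Step("unix", "deport the disks"),
    Step("linux", "import the disks"),
    Step("linux", "start the volumes"),
    Step("linux", "mount the file system"),
]


def run_plan(steps: List[Step], execute: Callable[[Step], bool]) -> List[Step]:
    """Run steps in order, stop at the first failure, return completed steps."""
    done: List[Step] = []
    for step in steps:
        if not execute(step):
            break
        done.append(step)
    return done


def dry_run(step: Step) -> bool:
    """Hypothetical executor that only prints the step."""
    print(f"[{step.host}] {step.description}")
    return True


completed = run_plan(MIGRATION_STEPS, dry_run)
```

Stopping at the first failed step matters here because each step depends on the previous one: disks cannot be imported on Linux until they have been deported on Unix.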

The portable data-container building blocks simplify data migrations between heterogeneous server platforms. Application data storage can be used by any processing platform, which offers IT organizations greater leverage over existing heterogeneous computing resources in their environment.

Enhancing Business Performance
Moving data from one platform to another will never be trivial. In fact, it has historically been so hard that many organizations run their applications on sub-optimal and expensive legacy platforms just to avoid the complexities and downtime associated with data migration.

However, new technologies reduce the time and resources required to move data between unlike platforms, obviating the need for, and the risk of, traditional data migrations. With them, volumes can easily be transported between unlike platforms. Physical disks can be grouped into logical volumes to improve disk utilization and eliminate storage-related downtime. Moreover, administrators have the flexibility to move data between storage arrays as needed, migrate data to new operating systems, and move files to the most appropriate storage device based on importance.

With these tools, organizations can reduce cost, risk, and downtime, while enhancing performance and maximizing the productivity of their heterogeneous IT environments.

More Stories By Andy Fenselau

Andy Fenselau has led product management across various parts of the Linux technology stack since 1998. He is currently the Linux Product Line Manager for Symantec's enterprise storage and server management solutions, spending most of his time with customers and partners to ensure Symantec's Linux solutions are meeting their needs. As a Linux evangelist, Andy has authored many articles and spoken at many events about the technical and business advantages of the evolving Linux solutions. He holds a BA from Harvard University and an MBA from Stanford University.
