Proactively Preventing Data Corruption

Linux gains end-to-end data integrity protection

Data Corruption
Corruption can occur as a result of bugs in both software and hardware. A common failure scenario involves incorrect buffers being written to disk, often clobbering good data.

This latent type of corruption can go undetected for a long period of time. It may take months before the application attempts to reread the data from disk, at which point the good data may have been lost forever. Short backup cycles may even have caused all intact copies of the data to be overwritten.

A crucial weapon in preventing this type of error is proactive data integrity protection, a method that prevents corrupted I/O requests from being written to disk.

For several years Oracle has offered a technology called HARD (Hardware Assisted Resilient Data), which allows storage systems to verify the integrity of an Oracle database logical block before it is committed to stable storage. Though the level of protection offered by HARD is mandatory in numerous enterprise and government deployments, adoption outside the mission-critical business segment has been slow. The disk array vendors that license and implement the HARD technology only offer it in their very high-end products. As a result, Oracle has been looking to provide a comparable level of resiliency using an open and standards-based approach.

A recent extension to the SCSI family of protocols allows extra protective measures, including a checksum, to be included in an I/O request. This appended data is referred to as integrity metadata or protection information.

Unfortunately, the SCSI protection envelope only covers the path between the I/O controller and the storage device. To remedy this, Oracle and a few select industry partners have collaborated to design a method of exposing the data integrity features to the operating system. This technology, known as the Data Integrity Extensions, allows the operating system – and even applications such as the Oracle Database – to generate protection data that will be verified as the request goes through the entire I/O stack. Figure 1 illustrates the integrity coverage provided by the technologies described earlier.

T10 Data Integrity Field
T10 is the INCITS standards body responsible for the SCSI family of protocols. Data corruption has been a known problem in the storage industry for years and T10 has provided the means to prevent it by extending the SCSI protocol to allow integrity metadata to be included in an I/O request. The extension to the SCSI block device protocol is called the Data Integrity Field (DIF).

  • Allows I/O controller and storage device to exchange protection information
  • Each data sector is protected by an 8-byte integrity tuple
  • The contents of this tuple include a checksum and an incrementing counter that ensures the I/O is intact
  • Both I/O controller and storage device can detect and reject corrupted requests

Normal SCSI disks use a hardware sector size of 512 bytes. (The term SCSI disk is used to refer to any enterprise-class storage device using the SCSI protocol, i.e., parallel SCSI, Fibre Channel and SAS.) However, when used inside disk arrays, the drives are often reformatted to a bigger sector size of 520 or 528 bytes. The operating system is only exposed to the usual 512 bytes of data. The extra 8 or 16 bytes in each sector are used internally by the array firmware for integrity checks.

DIF is similar in that the storage device must be reformatted to 520-byte sectors. The main difference between DIF and proprietary array firmware is that the format of the extra 8 bytes of information per sector is well defined and follows an open standard. This means that every node in the I/O path can participate in generating and verifying the integrity metadata.

Each DIF tuple is split up into three sections called tags as shown in Figure 2. There is a 16-bit guard tag, a 16-bit application tag, and a 32-bit reference tag.
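As a rough illustration of the tuple layout just described, the sketch below packs and unpacks the three tags into 8 bytes. The helper names are hypothetical, and the big-endian byte order is an assumption based on how SCSI generally lays out multi-byte fields; the text above specifies only the field widths.

```python
import struct

# Assumed layout: 16-bit guard tag, 16-bit application tag, 32-bit reference
# tag, stored big-endian. The names below are illustrative, not from a spec.
DIF_TUPLE = struct.Struct(">HHI")


def pack_tuple(guard: int, app: int, ref: int) -> bytes:
    """Serialize the three tags into one 8-byte integrity tuple."""
    return DIF_TUPLE.pack(guard, app, ref)


def unpack_tuple(raw: bytes) -> tuple:
    """Split an 8-byte integrity tuple back into (guard, app, ref)."""
    return DIF_TUPLE.unpack(raw)


if __name__ == "__main__":
    raw = pack_tuple(0xD0DB, 0x0000, 0x12345678)
    print(len(raw), raw.hex())
    print(unpack_tuple(raw))
```

The tuple is exactly 8 bytes, which is why a 512-byte sector grows to 520 bytes once protection information is appended.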

The DIF specification lists several types of protection. Each of these protection types defines the contents of the three tag fields in the DIF tuple. The guard tag contains a 16-bit CRC of the 512 bytes of data in the sector. The application tag is for use by the application or operating system. Finally, the reference tag is used to ensure the ordering of the individual portions of the I/O request. The contents of the reference tag vary depending on protection type. The most common of these is Type 1, in which the reference tag must match the lower 32 bits of the target sector's logical block address. This helps prevent misdirected writes, a common corruption error where data is written to the wrong place on disk.
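The guard and reference tags described above can be sketched as follows. This is a minimal bitwise illustration, not production code: the generator polynomial 0x8BB7 with a zero initial value is the one commonly cited for the T10 DIF guard CRC, but it is not stated in this article, so treat it as an assumption. The function names are hypothetical.

```python
# Assumption: the guard CRC uses generator polynomial 0x8BB7, initial value 0,
# processed most significant bit first, with no final XOR.
T10_DIF_POLY = 0x8BB7


def crc16_t10dif(data: bytes) -> int:
    """Bitwise 16-bit CRC over the sector data (slow, but easy to follow)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def type1_ref_tag(lba: int) -> int:
    """Type 1 reference tag: the lower 32 bits of the logical block address."""
    return lba & 0xFFFFFFFF


def ref_tag_matches(ref_tag: int, target_lba: int) -> bool:
    """True when a sector's reference tag agrees with where it is being written."""
    return ref_tag == type1_ref_tag(target_lba)


if __name__ == "__main__":
    sector = bytes(512)
    print(hex(crc16_t10dif(sector)))
    # A sector tagged for LBA 100 that arrives at LBA 101 is a misdirected write.
    print(ref_tag_matches(type1_ref_tag(100), 101))
```

Because the reference tag is derived from the destination address, a request that lands on the wrong block fails the comparison even when its data and CRC are otherwise intact.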

If the storage device detects a mismatch between the data and the integrity metadata, the I/O will be rejected before it’s written to disk. Also, since each node in the I/O path is free to inspect and verify the integrity metadata, it is possible to isolate points of error. For instance, it is conceivable that in the future advanced fabric switches will be able to verify the integrity as data flows through the Storage Area Network.

The fact that a storage device is formatted using the DIF protection scheme is transparent to the operating system. In the case of a write request, the I/O controller will receive a number of 512-byte buffers from the operating system and proceed to generate and append the appropriate 8 bytes of protection information to each sector. Upon receiving the request, the SCSI disk will verify that the data matches the included integrity metadata. In the case of a mismatch, the I/O will be rejected and an error returned to the operating system.

Similarly, in the case of a read request, the storage device will include the protection information and send 520-byte sectors to the I/O controller. The controller will verify the integrity of the I/O, strip off the protection data, and return 512-byte data buffers to the operating system.
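The write and read paths described above can be sketched in a few lines. This is a simplified model under stated assumptions: it uses Type 1 protection, a big-endian tuple layout, and the 0x8BB7 guard polynomial, none of which are spelled out in this article, and the function names `protect` and `verify_and_strip` are hypothetical. Real controllers do this work in hardware.

```python
import struct

SECTOR = 512
TUPLE = struct.Struct(">HHI")  # assumed layout: guard, application, reference


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16, assuming polynomial 0x8BB7 and a zero initial value."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc


def protect(data: bytes, start_lba: int, app_tag: int = 0) -> bytes:
    """Write path: append an 8-byte tuple to each 512-byte sector."""
    if len(data) % SECTOR:
        raise ValueError("data must be a whole number of 512-byte sectors")
    out = bytearray()
    for offset in range(0, len(data), SECTOR):
        sector = data[offset:offset + SECTOR]
        lba = start_lba + offset // SECTOR
        out += sector + TUPLE.pack(crc16_t10dif(sector), app_tag, lba & 0xFFFFFFFF)
    return bytes(out)


def verify_and_strip(protected: bytes, start_lba: int) -> bytes:
    """Read path (or device-side check): verify each tuple, return plain data."""
    step = SECTOR + TUPLE.size
    if len(protected) % step:
        raise ValueError("input must be a whole number of 520-byte sectors")
    out = bytearray()
    for n, offset in enumerate(range(0, len(protected), step)):
        sector = protected[offset:offset + SECTOR]
        guard, _app, ref = TUPLE.unpack(protected[offset + SECTOR:offset + step])
        if guard != crc16_t10dif(sector):
            raise IOError(f"guard tag mismatch in sector {n}")
        if ref != (start_lba + n) & 0xFFFFFFFF:
            raise IOError(f"reference tag mismatch in sector {n}")
        out += sector
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(256)) * 4            # two 512-byte sectors
    on_wire = protect(payload, start_lba=7)    # 2 x 520 bytes
    print(len(on_wire), verify_and_strip(on_wire, start_lba=7) == payload)
```

The same check serves both directions: the storage device can run it before committing a write, and the controller can run it before handing read data back to the operating system.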

In other words, the added level of protection between controller and storage device is completely transparent to the operating system. Unfortunately, this also means the operating system is unable to participate in the integrity verification process. This is where the Data Integrity Extensions come in.

More Stories By Martin Petersen

Martin K. Petersen has been involved in Linux development since the early nineties. He has worked on PA-RISC and IA-64 Linux ports for HP as well as the XFS filesystem and the Altix kernel for SGI. Martin works in Oracle's Linux Engineering group where he focuses on enterprise storage technologies.
