Welcome!

Linux Containers Authors: Elizabeth White, Liz McMillan, Zakia Bouachraoui, Yeshim Deniz, Pat Romanski

Related Topics: Linux Containers, Open Source Cloud, Recurring Revenue

Linux Containers: Article

Proactively Preventing Data Corruption

Linux gains end-to-end data integrity protection

Data Corruption
Corruption can occur as a result of bugs in both software and hardware. A common failure scenario involves incorrect buffers being written to disk, often clobbering good data.

This latent type of corruption can go undetected for a long period of time. It may take months before the application attempts to reread the data from disk, at which point the good data may have been lost forever. Short backup cycles may even have caused all intact copies of the data to be overwritten.

A crucial weapon in preventing this type of error is proactive data integrity protection, a method that prevents corrupted I/O requests from being written to disk.

For several years Oracle has offered a technology called HARD (Hardware Assisted Resilient Data), which allows storage systems to verify the integrity of an Oracle database logical block before it is committed to stable storage. Though the level of protection offered by HARD is mandatory in numerous enterprise and government deployments, adoption outside the mission-critical business segment has been slow. The disk array vendors that license and implement the HARD technology only offer it in their very high-end products. As a result, Oracle has been looking to provide a comparable level of resiliency using an open and standards-based approach.

A recent extension to the SCSI family of protocols allows extra protective measures, including a checksum, to be included in an I/O request. This appended data is referred to as integrity metadata or protection information.

Unfortunately, the SCSI protection envelope only covers the path between the I/O controller and the storage device. To remedy this, Oracle and a few select industry partners have collaborated to design a method of exposing the data integrity features to the operating system. This technology, known as the Data Integrity Extensions, allows the operating system – and even applications such as the Oracle Database – to generate protection data that will be verified as the request goes through the entire I/O stack. Figure 1 illustrates the integrity coverage provided by the technologies described earlier.

T10 Data Integrity Field
T10 is the INCITS standards body responsible for the SCSI family of protocols. Data corruption has been a known problem in the storage industry for years and T10 has provided the means to prevent it by extending the SCSI protocol to allow integrity metadata to be included in an I/O request. The extension to the SCSI block device protocol is called the Data Integrity Field (DIF).

  • Allows I/O controller and storage device to exchange protection information
  • Each data sector is protected by an 8-byte integrity tuple
  • The contents of this tuple include a checksum and an incrementing counter that ensures the I/O is intact
  • Both I/O controller and storage device can detect and reject corrupted requests

Normal SCSI disks use a hardware sector size of 512 bytes. (The term SCSI disk is used to refer to any enterprise-class storage device using the SCSI protocol, i.e., parallel SCSI, Fibre Channel and SAS.) However, when used inside disk arrays, the drives are often reformatted to a bigger sector size of 520 or 528 bytes. The operating system is only exposed to the usual 512 bytes of data. The extra 8 or 16 bytes in each sector are used internally by the array firmware for integrity checks.

DIF is similar in the sense that the storage device must be reformatted to 520 byte sectors. The main difference between DIF and proprietary array firmware is that the format of the extra 8 bytes of information per sector is well defined as well as being an open standard. This means that every node in the I/O path can participate in generating and verifying the integrity metadata.

Each DIF tuple is split up into three sections called tags as shown in Figure 2. There is a 16-bit guard tag, a 16-bit application tag, and a 32-bit reference tag.

The DIF specification lists several types of protection. Each of these protection types defines the contents of the three tag fields in the DIF tuple. The guard tag contains a 16-bit CRC of the 512 bytes of data in the sector. The application tag is for use by the application or operating system, and finally the reference tag is used to ensure the ordering of the individual portions of the I/O request. The reference tag varies depending on protection type. The most common of these is Type 1 in which the reference tag needs to match the 32 lower bits of the target sector logical block address. This helps prevent misdirected writes, a common corruption error where data is written to the wrong place on disk.

If the storage device detects a mismatch between the data and the integrity metadata, the I/O will be rejected before it’s written to disk. Also, since each node in the I/O path is free to inspect and verify the integrity metadata, it is possible to isolate points of error. For instance, it is conceivable that in the future advanced fabric switches will be able to verify the integrity as data flows through the Storage Area Network.

The fact that a storage device is formatted using the DIF protection scheme is transparent to the operating system. In the case of a write request, the I/O controller will receive a number of 512-byte buffers from the operating system and proceed to generate and append the appropriate 8 bytes of protection information to each sector. Upon receiving the request, the SCSI disk will verify that the data matches the included integrity metadata. In the case of a mismatch, the I/O will be rejected and an error returned to the operating system.

Similarly, in the case of a read request, the storage device will include the protection information and send 520 byte sectors to the I/O controller. The controller will verify the integrity of the I/O, strip off the protection data, and return 512 byte data buffers to the operating system.

In other words, the added level of protection between controller and storage device is completely transparent to the operating system. Unfortunately, this also means the operating system is unable to participate in the integrity verification process. This is where the Data Integrity Extensions come in.

More Stories By Martin Petersen

Martin K. Petersen has been involved in Linux development since the early nineties. He has worked on PA-RISC and IA-64 Linux ports for HP as well as the XFS filesystem and the Altix kernel for SGI. Martin works in Oracle's Linux Engineering group where he focuses on enterprise storage technologies.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


IoT & Smart Cities Stories
Bill Schmarzo, Tech Chair of "Big Data | Analytics" of upcoming CloudEXPO | DXWorldEXPO New York (November 12-13, 2018, New York City) today announced the outline and schedule of the track. "The track has been designed in experience/degree order," said Schmarzo. "So, that folks who attend the entire track can leave the conference with some of the skills necessary to get their work done when they get back to their offices. It actually ties back to some work that I'm doing at the University of San...
In his general session at 19th Cloud Expo, Manish Dixit, VP of Product and Engineering at Dice, discussed how Dice leverages data insights and tools to help both tech professionals and recruiters better understand how skills relate to each other and which skills are in high demand using interactive visualizations and salary indicator tools to maximize earning potential. Manish Dixit is VP of Product and Engineering at Dice. As the leader of the Product, Engineering and Data Sciences team at D...
When talking IoT we often focus on the devices, the sensors, the hardware itself. The new smart appliances, the new smart or self-driving cars (which are amalgamations of many ‘things'). When we are looking at the world of IoT, we should take a step back, look at the big picture. What value are these devices providing. IoT is not about the devices, its about the data consumed and generated. The devices are tools, mechanisms, conduits. This paper discusses the considerations when dealing with the...
Bill Schmarzo, author of "Big Data: Understanding How Data Powers Big Business" and "Big Data MBA: Driving Business Strategies with Data Science," is responsible for setting the strategy and defining the Big Data service offerings and capabilities for EMC Global Services Big Data Practice. As the CTO for the Big Data Practice, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He's written several white papers, is an avid blogge...
Dynatrace is an application performance management software company with products for the information technology departments and digital business owners of medium and large businesses. Building the Future of Monitoring with Artificial Intelligence. Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more busine...
If a machine can invent, does this mean the end of the patent system as we know it? The patent system, both in the US and Europe, allows companies to protect their inventions and helps foster innovation. However, Artificial Intelligence (AI) could be set to disrupt the patent system as we know it. This talk will examine how AI may change the patent landscape in the years to come. Furthermore, ways in which companies can best protect their AI related inventions will be examined from both a US and...
Enterprises have taken advantage of IoT to achieve important revenue and cost advantages. What is less apparent is how incumbent enterprises operating at scale have, following success with IoT, built analytic, operations management and software development capabilities - ranging from autonomous vehicles to manageable robotics installations. They have embraced these capabilities as if they were Silicon Valley startups.
Chris Matthieu is the President & CEO of Computes, inc. He brings 30 years of experience in development and launches of disruptive technologies to create new market opportunities as well as enhance enterprise product portfolios with emerging technologies. His most recent venture was Octoblu, a cross-protocol Internet of Things (IoT) mesh network platform, acquired by Citrix. Prior to co-founding Octoblu, Chris was founder of Nodester, an open-source Node.JS PaaS which was acquired by AppFog and ...
The deluge of IoT sensor data collected from connected devices and the powerful AI required to make that data actionable are giving rise to a hybrid ecosystem in which cloud, on-prem and edge processes become interweaved. Attendees will learn how emerging composable infrastructure solutions deliver the adaptive architecture needed to manage this new data reality. Machine learning algorithms can better anticipate data storms and automate resources to support surges, including fully scalable GPU-c...
Cloud-enabled transformation has evolved from cost saving measure to business innovation strategy -- one that combines the cloud with cognitive capabilities to drive market disruption. Learn how you can achieve the insight and agility you need to gain a competitive advantage. Industry-acclaimed CTO and cloud expert, Shankar Kalyana presents. Only the most exceptional IBMers are appointed with the rare distinction of IBM Fellow, the highest technical honor in the company. Shankar has also receive...