|By Brian Carrier||
|August 12, 2005 03:00 PM EDT||
We have all done it before. You accidentally type in the wrong argument to rm or select the wrong file for deletion. As you hit enter, you notice your mistake and your stomach drops. You reach for the backup of the system and realize that there isn't one.
There are many undelete tools for FAT and NTFS file systems, but there are few for Ext3, which is currently the default file system for most Linux distributions. This is because of the way that Ext3 files are deleted. Crucial information that stores where the file content is located is cleared during the deletion process.
In this article, we take a low-level look at why recovery is difficult and look at some approaches that are sometimes effective. We will use some open source tools for the recovery, but the techniques are not completely automated.
What Is a File?
Before we can see how to recover files, we need to look at how files are stored. Typically, file systems are located inside of a disk partition. The partition is usually organized into 512-byte sectors. When the partition is formatted as Ext3, consecutive sectors will be grouped into blocks, whose size can range from 1,024 to 4,096 bytes. The blocks are grouped together into block groups, whose size will be tens of thousands of blocks. Each file has data stored in three major locations: blocks, inodes, and directory entries. The file content is stored in blocks, which are allocated for the exclusive use of the file. A file is allocated as many blocks as it needs. Ideally, the file will be allocated consecutive blocks, but this is not always possible.
The metadata for the file is stored in an inode structure, which is located in an inode table at the beginning of a block group. There are a finite number of inodes and each is assigned to a block group. File metadata includes the temporal data such as the last modified, last accessed, last changed, and deleted times. Metadata also includes the file size, user ID, group ID, permissions, and block addresses where the file content is stored.
The addresses of the first 12 blocks are saved in the inode and additional addresses are stored externally in blocks, called indirect blocks. If the file requires many blocks and not all of the addresses can fit into one indirect block, a double indirect block is used whose address is given in the inode. The double indirect block contains addresses of single indirect blocks, which contain addresses of blocks with file content. There is also a triple indirect address in the inode that adds one more layer of pointers.
Last, the file's name is stored in a directory entry structure, which is located in a block allocated to the file's parent directory. An Ext3 directory is similar to a file and its blocks contain a list of directory entry structures, each containing the name of a file and the inode address where the file metadata is stored. When you use the ls -i command, you can see the inode address that corresponds to each file name. We can see the relationship between the directory entry, the inode, and the blocks in Figure 1.
When a new file is created, the operating system (OS) gets to choose which blocks and inode it will allocate for the file. Linux will try to allocate the blocks and inode in the same block group as its parent directory. This causes files in the same directory to be close together. Later we'll use this fact to restrict where we search for deleted data.
The Ext3 file system has a journal that records updates to the file system metadata before the update occurs. In case of a system crash, the OS reads the journal and will either reprocess or roll back the transactions in the journal so that recovery will be faster then examining each metadata structure, which is the old and slow way. Example metadata structures include the directory entries that store file names and inodes that store file metadata. The journal contains the full block that is being updated, not just the value being changed. When a new file is created, the journal should contain the updated version of the blocks containing the directory entry and the inode.
Several things occur when an Ext3 file is deleted from Linux. Keep in mind that the OS gets to choose exactly what occurs when a file is deleted and this article assumes a general Linux system.
At a minimum, the OS must mark each of the blocks, the inode, and the directory entry as unallocated so that later files can use them. This minimal approach is what occurred several years ago with the Ext2 file system. In this case, the recovery process was relatively simple because the inode still contained the block addresses for the file content and tools such as debugfs and e2undel could easily re-create the file. This worked as long as the blocks had not been allocated to a new file and the original content was not overwritten.
With Ext3, there is an additional step that makes recovery much more difficult. When the blocks are unallocated, the file size and block addresses in the inode are cleared; therefore we can no longer determine where the file content was located. We can see the relationship between the directory entry, the inode, and the blocks of an unallocated file in Figure 2.
Now that we know the components involved with files and which ones are cleared during deletion, we can examine two approaches to file recovery (besides using a backup). The first approach uses the application type of the deleted file and the second approach uses data in the journal. Regardless of the approach, you should stop using the file system because you could create a file that overwrites the data you are trying to recover. You can power the system off and put the drive in another Linux computer as a slave drive or boot from a Linux CD.
The first step for both techniques is to determine the deleted file's inode address. This can be determined from debugfs or The Sleuth Kit (TSK). I'll give the debugfs method here. debugfs comes with most Linux distributions and is a file system debugger. To start debugfs, you'll need to know the device name for the partition that contains the deleted file. In my example, I have booted from a CD and the file is located on /dev/hda5:
# debugfs /dev/hda5
debugfs 1.37 (21-Mar-2005)
We can then use the cd command to change to the directory of the deleted file:
debugfs: cd /home/carrier/
The ls -d command will list the allocated and deleted files in the directory. Remember that the directory entry structure stores the name and the inode of the file and this listing will give us both values because neither is cleared during the deletion process. The deleted files have their inode address surrounded by "<" and ">":
debugfs: ls -d
415848 (12) . 376097 (12) .. 415864 (16) .bashrc
<415926> (28) oops.dat
|theusr 07/09/09 09:29:00 AM EDT|
The figure 2 maybe misleading: the links between the address blocks and the file content are still there (though the address blocks are unallocated), that what's make the recovery possible.
|Mike Kay 01/15/08 03:57:07 PM EST|
Excellent article. Followed it step by step and successfully recovered a .XLS spreadsheet that had been deleted from the /tmp folder on Ubuntu Gutsy. It also found an associated .jpg that I wasn't looking for!
Saved me hours of retyping. Thanks a lot.
|Jahangir 10/22/07 05:26:36 PM EDT|
This was really the best article i could find inspite of 3 hrs of googling.
But what if you are trying to recover a 6GB VM.
|ruintower 04/23/06 09:07:29 PM EDT|
Trackback Added: ext3 undelete; I “mis-deleted” a big file several days ago. So I umount the the partition immediately and searched the recovery method because I knew (but forgot) some methods to recovery file in Linux. However, the result is disappointed. Alt...
|marco 03/13/06 08:04:20 AM EST|
U have saved my life.
U are a GURU,
|marco 03/13/06 08:04:04 AM EST|
U have saved my life.
U are a GURU,
Too often with compelling new technologies market participants become overly enamored with that attractiveness of the technology and neglect underlying business drivers. This tendency, what some call the “newest shiny object syndrome,” is understandable given that virtually all of us are heavily engaged in technology. But it is also mistaken. Without concrete business cases driving its deployment, IoT, like many other technologies before it, will fade into obscurity.
Aug. 31, 2015 09:00 PM EDT Reads: 363
The Internet of Things is in the early stages of mainstream deployment but it promises to unlock value and rapidly transform how organizations manage, operationalize, and monetize their assets. IoT is a complex structure of hardware, sensors, applications, analytics and devices that need to be able to communicate geographically and across all functions. Once the data is collected from numerous endpoints, the challenge then becomes converting it into actionable insight.
Aug. 31, 2015 07:00 PM EDT
Consumer IoT applications provide data about the user that just doesn’t exist in traditional PC or mobile web applications. This rich data, or “context,” enables the highly personalized consumer experiences that characterize many consumer IoT apps. This same data is also providing brands with unprecedented insight into how their connected products are being used, while, at the same time, powering highly targeted engagement and marketing opportunities. In his session at @ThingsExpo, Nathan Treloar, President and COO of Bebaio, will explore examples of brands transforming their businesses by t...
Aug. 31, 2015 06:00 PM EDT Reads: 243
With the Apple Watch making its way onto wrists all over the world, it’s only a matter of time before it becomes a staple in the workplace. In fact, Forrester reported that 68 percent of technology and business decision-makers characterize wearables as a top priority for 2015. Recognizing their business value early on, FinancialForce.com was the first to bring ERP to wearables, helping streamline communication across front and back office functions. In his session at @ThingsExpo, Kevin Roberts, GM of Platform at FinancialForce.com, will discuss the value of business applications on wearable ...
Aug. 31, 2015 03:15 PM EDT
While many app developers are comfortable building apps for the smartphone, there is a whole new world out there. In his session at @ThingsExpo, Narayan Sainaney, Co-founder and CTO of Mojio, will discuss how the business case for connected car apps is growing and, with open platform companies having already done the heavy lifting, there really is no barrier to entry.
Aug. 31, 2015 02:30 PM EDT Reads: 141
With the proliferation of connected devices underpinning new Internet of Things systems, Brandon Schulz, Director of Luxoft IoT – Retail, will be looking at the transformation of the retail customer experience in brick and mortar stores in his session at @ThingsExpo. Questions he will address include: Will beacons drop to the wayside like QR codes, or be a proximity-based profit driver? How will the customer experience change in stores of all types when everything can be instrumented and analyzed? As an area of investment, how might a retail company move towards an innovation methodolo...
Aug. 31, 2015 02:30 PM EDT Reads: 456
The Internet of Things (IoT) is about the digitization of physical assets including sensors, devices, machines, gateways, and the network. It creates possibilities for significant value creation and new revenue generating business models via data democratization and ubiquitous analytics across IoT networks. The explosion of data in all forms in IoT requires a more robust and broader lens in order to enable smarter timely actions and better outcomes. Business operations become the key driver of IoT applications and projects. Business operations, IT, and data scientists need advanced analytics t...
Aug. 31, 2015 02:30 PM EDT Reads: 416
Contrary to mainstream media attention, the multiple possibilities of how consumer IoT will transform our everyday lives aren’t the only angle of this headline-gaining trend. There’s a huge opportunity for “industrial IoT” and “Smart Cities” to impact the world in the same capacity – especially during critical situations. For example, a community water dam that needs to release water can leverage embedded critical communications logic to alert the appropriate individuals, on the right device, as soon as they are needed to take action.
Aug. 31, 2015 12:00 PM EDT
SYS-CON Events announced today that HPM Networks will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. For 20 years, HPM Networks has been integrating technology solutions that solve complex business challenges. HPM Networks has designed solutions for both SMB and enterprise customers throughout the San Francisco Bay Area.
Aug. 31, 2015 11:30 AM EDT Reads: 895
SYS-CON Events announced today that Micron Technology, Inc., a global leader in advanced semiconductor systems, will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Micron’s broad portfolio of high-performance memory technologies – including DRAM, NAND and NOR Flash – is the basis for solid state drives, modules, multichip packages and other system solutions. Backed by more than 35 years of technology leadership, Micron's memory solutions enable the world's most innovative computing, consumer,...
Aug. 31, 2015 11:15 AM EDT Reads: 228
As more intelligent IoT applications shift into gear, they’re merging into the ever-increasing traffic flow of the Internet. It won’t be long before we experience bottlenecks, as IoT traffic peaks during rush hours. Organizations that are unprepared will find themselves by the side of the road unable to cross back into the fast lane. As billions of new devices begin to communicate and exchange data – will your infrastructure be scalable enough to handle this new interconnected world?
Aug. 31, 2015 11:00 AM EDT Reads: 161
Through WebRTC, audio and video communications are being embedded more easily than ever into applications, helping carriers, enterprises and independent software vendors deliver greater functionality to their end users. With today’s business world increasingly focused on outcomes, users’ growing calls for ease of use, and businesses craving smarter, tighter integration, what’s the next step in delivering a richer, more immersive experience? That richer, more fully integrated experience comes about through a Communications Platform as a Service which allows for messaging, screen sharing, video...
Aug. 31, 2015 10:30 AM EDT Reads: 655
SYS-CON Events announced today that Pythian, a global IT services company specializing in helping companies leverage disruptive technologies to optimize revenue-generating systems, has been named “Bronze Sponsor” of SYS-CON's 17th Cloud Expo, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Founded in 1997, Pythian is a global IT services company that helps companies compete by adopting disruptive technologies such as cloud, Big Data, advanced analytics, and DevOps to advance innovation and increase agility. Specializing in designing, imple...
Aug. 31, 2015 10:15 AM EDT Reads: 304
In his session at @ThingsExpo, Lee Williams, a producer of the first smartphones and tablets, will talk about how he is now applying his experience in mobile technology to the design and development of the next generation of Environmental and Sustainability Services at ETwater. He will explain how M2M controllers work through wirelessly connected remote controls; and specifically delve into a retrofit option that reverse-engineers control codes of existing conventional controller systems so they don't have to be replaced and are instantly converted to become smart, connected devices.
Aug. 31, 2015 09:30 AM EDT Reads: 124
SYS-CON Events announced today that IceWarp will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. IceWarp, the leader of cloud and on-premise messaging, delivers secured email, chat, documents, conferencing and collaboration to today's mobile workforce, all in one unified interface
Aug. 31, 2015 04:00 AM EDT Reads: 422
WebRTC has had a real tough three or four years, and so have those working with it. Only a few short years ago, the development world were excited about WebRTC and proclaiming how awesome it was. You might have played with the technology a couple of years ago, only to find the extra infrastructure requirements were painful to implement and poorly documented. This probably left a bitter taste in your mouth, especially when things went wrong.
Aug. 31, 2015 02:00 AM EDT Reads: 453
As more and more data is generated from a variety of connected devices, the need to get insights from this data and predict future behavior and trends is increasingly essential for businesses. Real-time stream processing is needed in a variety of different industries such as Manufacturing, Oil and Gas, Automobile, Finance, Online Retail, Smart Grids, and Healthcare. Azure Stream Analytics is a fully managed distributed stream computation service that provides low latency, scalable processing of streaming data in the cloud with an enterprise grade SLA. It features built-in integration with Azur...
Aug. 28, 2015 07:45 PM EDT Reads: 219
Akana has announced the availability of the new Akana Healthcare Solution. The API-driven solution helps healthcare organizations accelerate their transition to being secure, digitally interoperable businesses. It leverages the Health Level Seven International Fast Healthcare Interoperability Resources (HL7 FHIR) standard to enable broader business use of medical data. Akana developed the Healthcare Solution in response to healthcare businesses that want to increase electronic, multi-device access to health records while reducing operating costs and complying with government regulations.
Aug. 26, 2015 07:00 AM EDT Reads: 192
For IoT to grow as quickly as analyst firms’ project, a lot is going to fall on developers to quickly bring applications to market. But the lack of a standard development platform threatens to slow growth and make application development more time consuming and costly, much like we’ve seen in the mobile space. In his session at @ThingsExpo, Mike Weiner, Product Manager of the Omega DevCloud with KORE Telematics Inc., discussed the evolving requirements for developers as IoT matures and conducted a live demonstration of how quickly application development can happen when the need to comply wit...
Aug. 2, 2015 11:15 AM EDT Reads: 556
The Internet of Everything (IoE) brings together people, process, data and things to make networked connections more relevant and valuable than ever before – transforming information into knowledge and knowledge into wisdom. IoE creates new capabilities, richer experiences, and unprecedented opportunities to improve business and government operations, decision making and mission support capabilities.
Aug. 1, 2015 10:00 AM EDT Reads: 486