My Experience With Data Loss
A summary of my 30-day recovery process, Proverbs 3:5
Introduction
If you are looking for a reason to back up all your data, or, like me, keep putting it off until the next day, do it now. In June my old iMac experienced a kernel panic which, when I rebooted, left one of the external HDDs I had connected unreadable.
This external HDD held my entire collection of family photos up to that point, as well as code for several side projects I was working on. It also held the Time Machine backups for my iMac, but recovering those was less important. I used it as extra storage because my collection of family photos and movies had grown too large for my iMac’s internal storage.
What was wrong with the Disk?
The first thing I did was try to determine what was wrong with the disk. I am not experienced in data recovery, especially on a Mac, so I learned a lot of this through on-the-spot googling.
Background on the Disk
The disk was an external 12TB WD HDD formatted as APFS. It had two volumes inside the container: backup512 and Storage2. backup512 was an unencrypted volume which stored the Time Machine backups, and the second volume, Storage2, was an encrypted volume holding the data I wanted back.
The first thing I did was run Disk Utility’s First Aid tool…
Disk Utility was unable to repair the container, because it said it was “locked” and that I would need to unlock it first, but I couldn’t unlock it because it couldn’t mount…
I tried the First Aid tool on the individual volumes with the same results.
The next tool I tried was fsck_apfs, a tool built into macOS for verifying and repairing APFS containers and volumes.
Running that resulted in my first real error message:
At the time I didn’t know what the space manager was, but now I know that it keeps track of which parts of the container are not currently being used to hold data, and allocates space to a volume when needed.
This was not a lot of information, but it told me that something was clearly wrong with the filesystem, and I potentially had some corrupted data in the checkpoint which was preventing the mounting process from happening.
Always Step one of Data recovery
The first step to any good data recovery is creating a backup of the disk. At this point in the recovery process I was still unsure if it was a hardware failure, or something corrupted in the filesystem, so to be safe I got a second 12TB WD HDD and began the cloning process.
I used dd and began copying the whole 12TB disk using the command:
sudo dd if=/dev/disk2 of=/dev/disk10 bs=1M status=progress
This process took sixteen days to complete, and during that time I researched more about filesystems and which recovery software I would want to use.
Note that this might not be the best option if the drive is physically failing, because running for so long on a failing drive could induce problems. I was very fortunate that in my case it was a corrupted filesystem and not a failing hard drive.
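As a rough sanity check on that sixteen-day figure, the average copy speed can be worked out with a few lines of Python. This is only back-of-the-envelope arithmetic, and it assumes that "12TB" means 12 × 10^12 bytes (the way drive vendors count) and that the copy ran continuously for the full sixteen days.

```python
# Rough average throughput of the clone described above.
# Assumption: "12TB" is 12 * 10**12 bytes, and dd ran without pauses.
DISK_BYTES = 12 * 10**12
ELAPSED_SECONDS = 16 * 24 * 60 * 60  # sixteen days in seconds

bytes_per_second = DISK_BYTES / ELAPSED_SECONDS
mb_per_second = bytes_per_second / 10**6  # decimal megabytes

print(f"average throughput: {mb_per_second:.2f} MB/s")  # prints 8.68 MB/s
```

That is far below what a healthy HDD usually sustains on sequential reads. One commonly cited factor on macOS is copying through the buffered /dev/diskN nodes instead of the raw /dev/rdiskN nodes, though there is no way to tell from the numbers alone whether that was the cause here.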
Recovery software
I went to work trying to understand filesystems better and researching the best recovery tools for APFS. This led me, like many people, to Reddit.
After reading through these threads I felt the best data recovery tools I could use were R-Studio and UFS Explorer.
The first software I tried was UFS Explorer, as that seemed to be one of the most highly recommended tools on the r/datarecovery subreddit.
After installing UFS and firing it up I was greeted with the drive list
I found it weird that UFS Explorer showed the two volumes as separate disks, but I didn’t think much of it.
I haven’t mentioned it, but up to this point I was unsure of the password, and UFS Explorer gave me a way to check it. When I clicked the unlock disk button and entered my password, it reported that the password was accepted. I tried a few incorrect passwords to make sure the software was truly checking against something, and those came back as incorrect. So at this point I knew the password.
I ran UFS Explorer with just APFS selected and with its intellisense deep scan feature disabled for the first run.
The first run of the software took about a day to complete. The software fully scanned the disk and reconstructed a virtual filesystem.
The only issue was that UFS Explorer didn’t detect any data from the Storage2 volume; it only reconstructed my Time Machine backups from backup512. I tried running UFS Explorer on both of the disks it listed, with the same result. I also saved the scan output, and it didn’t matter which disk I applied it to: it gave the same result. This suggested to me that UFS Explorer was possibly not following the two volumes in the container properly and was treating it all as a single APFS volume.
I then tried running UFS Explorer with its deep scan intellisense enabled, but after waiting a day for the scan to complete it would always crash at the end.
I emailed details of my data loss situation and the crash logs of UFS explorer to the company that made the software, and I hope that they fix whatever issue causes this.
The second tool I tried was R-Studio, as that was also highly recommended by the subreddit. After taking a day to run its scan feature, all it was able to recover was raw files without a directory tree, so a lot of files were unrecoverable with it.
Asking for help
At this point in my journey it had been over twenty days of trying, and I felt that I had barely anything to show for it. I had only managed to get back the Time Machine backups, which I didn’t need, and some photos.
I began asking people I knew for advice, but a surprising number of them told me that since it was an encrypted APFS container, it would be a miracle if I got any data back.
At this time I revisited an old GitHub project I had seen during my initial googling, DRAT.
I noticed that the creator of the project, Jivan Pal, listed his contact information on his profile, so I decided to reach out and see if he had any ideas.
He was very busy, but he took time out of his day to help me use his tool and understand the output it was giving.
I downloaded his tool and ran it. The … indicates that I omitted some of the output, as it was very long for this article.
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
sudo ./drat-0-1-3-darwin-amd64 inspect /dev/rdisk2s2
Password:
Opening file at `/dev/rdisk2s2` in read-only mode … OK.
Simulating a mount of the APFS container.
Reading container superblock at address 0x0, assuming default block size of 4096 bytes … validating checksum … OK.
Details of block 0x0:
…
The container superblock states that the container object map has Physical OID 0x9670d5.
Loading the container object map … OK.
Validating the container object map … OK.
Details of the container object map:
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
Stored checksum: 0x7382ff320b1436d2
OID: 0xff7a
XID: 0x271044
Storage type: Virtual
Type flags: (no flags)
Type: B-tree (root node)
Subtype: File-system records tree
Flags:
- No snapshot support
Object mappings tree:
- Storage type: Virtual
- Type flags: (no flags)
- Type: (invalid type / no subtype)
- Object ID: 0xffff00380020
Snapshots tree:
- Storage type: Virtual
- Type flags: Non-persistent (should never appear on disk — — if it does, file a bug against the APFS implementation that created this object)
- Type: Unknown value (0x58)
- Object ID: 0x8000800180000
- Number of snapshots: 2 snapshots
- Latest snapshot XID: 0x8001000080018
In-progress revert:
- Minimum XID: 0x0
- Maximum XID: 0x0
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
END: The container object map B-tree is not of the Physical storage type, and therefore it cannot be located.
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
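The first step in that output, reading the container superblock at block 0, can be sketched in Python. This is a minimal illustration and not DRAT's code: it decodes only the first 48 bytes (the common object header followed by the magic, block size, and block count, as laid out in Apple's published APFS reference) and it does not verify the checksum. The sample values packed into the fake block below are made up for the demonstration.

```python
import struct

NX_MAGIC = b"NXSB"  # magic bytes of an APFS container superblock

# Little-endian layout of the first 48 bytes of block 0:
# checksum (8), OID (8), XID (8), type (4), subtype (4),
# magic (4), block size (4), block count (8).
HEADER = struct.Struct("<8sQQII4sIQ")


def parse_container_superblock(block: bytes) -> dict:
    """Decode the leading fields of a container superblock (no checksum check)."""
    if len(block) < HEADER.size:
        raise ValueError("block is too short to hold a superblock header")
    cksum, oid, xid, otype, subtype, magic, block_size, block_count = (
        HEADER.unpack_from(block)
    )
    if magic != NX_MAGIC:
        raise ValueError(f"unexpected magic {magic!r}, not a container superblock")
    return {
        "checksum": cksum.hex(),
        "oid": oid,
        "xid": xid,
        "block_size": block_size,
        "block_count": block_count,
    }


# Hypothetical sample block; every value here is invented for illustration.
sample_block = HEADER.pack(b"\x00" * 8, 1, 42, 1, 0, NX_MAGIC, 4096, 1000)
sample_block += b"\x00" * (4096 - HEADER.size)
info = parse_container_superblock(sample_block)
print(info["block_size"], info["block_count"])
```

To try it on real data, one could pass in the first 4096 bytes of the container partition, opened read-only with `open(path, "rb")`, and preferably taken from the clone and not from the original disk.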
After some back and forth with Jivan Pal, I learned that in APFS a Physical OID is simply the block address itself, while a Virtual OID is an identifier used in a B-tree to refer to a particular object/file.
It makes sense that this would be what is missing, because when I tried to read the drive on a Linux machine using APFS-FUSE I got:
So it can’t load the object map.
It is possible that this happened because the APFS container was not unmounted properly when the iMac crashed.
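The difference between the two kinds of OID can be sketched in Python. This is a toy model under stated assumptions: `ToyObjectMap` is a hypothetical stand-in for the on-disk object map B-tree, and the virtual OID and addresses in the usage lines are invented. The rule it follows, taken from Apple's APFS reference, is that a virtual object is found by matching its OID and taking the newest entry whose XID does not exceed the transaction being read.

```python
from bisect import bisect_right

BLOCK_SIZE = 4096  # default block size reported in the DRAT output


def physical_offset(physical_oid: int, block_size: int = BLOCK_SIZE) -> int:
    """A Physical OID is a block address, so the byte offset is a multiplication."""
    return physical_oid * block_size


class ToyObjectMap:
    """Hypothetical stand-in for the object map: (virtual OID, XID) -> block address."""

    def __init__(self) -> None:
        self._versions = {}  # virtual OID -> sorted list of (xid, block address)

    def insert(self, oid: int, xid: int, paddr: int) -> None:
        versions = self._versions.setdefault(oid, [])
        versions.append((xid, paddr))
        versions.sort()

    def lookup(self, oid: int, xid: int) -> int:
        """Return the block address of the newest version with XID <= xid."""
        versions = self._versions.get(oid, [])
        idx = bisect_right(versions, (xid, float("inf")))
        if idx == 0:
            raise KeyError(f"no mapping for OID {oid:#x} at or before XID {xid:#x}")
        return versions[idx - 1][1]


# The object map's Physical OID from the output above needs no lookup at all.
print(hex(physical_offset(0x9670D5)))

# Invented example: one virtual object written in two different transactions.
omap = ToyObjectMap()
omap.insert(0x402, xid=10, paddr=5000)
omap.insert(0x402, xid=20, paddr=6000)
print(omap.lookup(0x402, xid=15))  # the older copy is still reachable
```

Without a readable object map there is nothing to translate a Virtual OID into a block address, which is consistent with both tools stopping at that point.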
Solution
Jivan Pal suggested I use Disk Drill, because there was likely a checkpoint which had the correct information. I hadn’t used Disk Drill up to this point because people on the subreddit talked about it so poorly, but I decided to give it a shot since I had exhausted a lot of options.
This is proof that the last thing you try always works: when I installed Disk Drill, configured the settings, and ran a quick scan of Storage2, within a few hours it had found all my files with the directory structure intact.
I then purchased a license for Disk Drill and copied all my files from the corrupted APFS storage to a new drive. Copying the data off took around a day to complete.
Funnily enough, Disk Drill’s quick scan was only able to find the directory structure of Storage2; when I tried the same thing on backup512, it found no files. This is the opposite of UFS Explorer’s result, but it is the result I needed.
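For readers curious why an older checkpoint can rescue a container, here is a toy Python sketch of the idea. It is not a description of what Disk Drill does internally, which I don't know. The `Checkpoint` class and its fields are hypothetical, and the only assumption borrowed from the APFS reference is that a reader should prefer the well-formed superblock with the highest XID and fall back to earlier ones when the newest is damaged.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Checkpoint:
    """Hypothetical summary of one checkpoint found while scanning a container."""
    xid: int           # transaction ID the checkpoint was written in
    checksum_ok: bool  # did the superblock pass its checksum?
    omap_loads: bool   # could the object map it points to be read?


def pick_usable_checkpoint(checkpoints: Iterable[Checkpoint]) -> Optional[Checkpoint]:
    """Return the newest checkpoint that is intact, or None if none is usable."""
    usable = [c for c in checkpoints if c.checksum_ok and c.omap_loads]
    return max(usable, key=lambda c: c.xid, default=None)


# Invented scan result: the newest checkpoint is damaged, an older one is fine.
found = [
    Checkpoint(xid=100, checksum_ok=True, omap_loads=True),
    Checkpoint(xid=101, checksum_ok=True, omap_loads=True),
    Checkpoint(xid=102, checksum_ok=True, omap_loads=False),
]
best = pick_usable_checkpoint(found)
print(best.xid if best else "no usable checkpoint")
```

Falling back this way loses whatever was written after the chosen checkpoint, but it leaves a consistent view of everything before it.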
Closing thoughts
I knew a lot more about APFS and macOS by the end of this adventure than I did going in. Data recovery is not a one-size-fits-all problem; countless different factors contribute to data loss. I was very blessed that my data loss situation was fixable, and the multiple checkpoint feature in APFS was very helpful in restoring my files, provided you have software that is capable of traversing the checkpoints properly.
If you experience data loss, don’t panic. The data may still be recoverable, and panicking and trying things quickly could just make things worse. I also learned about the TRIM feature in SSDs through this, and I am now glad that my situation occurred on an HDD: while it made the traversal much slower, I did not have to worry about the issues TRIM can cause for data recovery.
I prayed a lot to God to help me through this process, and I personally did very little.
I give Jesus full credit for helping me meet the people I did. I was able to get my data back thanks to the help of other people, and I couldn’t rely on my own ability for this. I am very thankful to the people along the way who gave me their time to help. Even if not all the solutions worked, we are all in this to help preserve each other's data.
Before you leave
Resources on learning APFS
http://docs.macsysadmin.se/2018/video/Day4Session2.mp4
Thanks for reading!
Aaron