Sunday, October 12, 2008

Rescue Your JFS Partition With jfsrec

So, my 500GB Deskstar (aka DeathStar) in my file server developed bad sectors. The original rescue plan was simple - use dd_rescue to clone the disk to an image. Normally, that gives a pretty good result for a drive that has only recently started going bad.

The problem was, I forgot that my IDE2USB adapter does not support hard drives that big, and the adapter just bailed out when it hit an unrecoverable sector. So, I did it the other way - mounted the hard drive read-only, and rsynced it directory by directory.

To speed up the file transfer and skip files that are publicly available, I remounted the drive read-write, without any backup copy (that, I tell you, is a big mistake that I would probably not make in the workplace - but I get a bit lax with my private stuff). After an rm some_distro.iso, bang... the kernel panicked!

I rebooted the machine and remounted the hard drive. Now it was even worse - the mounted JFS volume was empty - zero files!

Since all my personal photos, music, documents, source code, whatever-you-can-think-of, were stored on that drive, losing all the data would be such a pain. I realized I had to proceed with caution from that point forward. So, here's what I did:

Step 1: Buy another drive (1TB baby!), attach both drives to the machine (don't let the RAID card get too clever and initialize both drives!), boot the box with Knoppix, and clone the drive (be careful with the drive order!)

Step 2: Make another copy of the cloned image (remember that the original drive had already developed bad sectors, so it is possible that I could never make a disk image as good as this one again).
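Here is a minimal sketch of what Step 2 amounts to, assuming the clone is an ordinary image file. The paths and the tiny scratch file standing in for the real image are made up for illustration. The point is simply to keep one untouched master, work only on the copy, and check that the copy really is identical before relying on it.

```shell
#!/bin/sh
set -eu

# Stand-in for the real clone: a 1 MiB scratch file (hypothetical paths).
workdir=$(mktemp -d)
master="$workdir/disk.image"
working="$workdir/disk.image.work"
dd if=/dev/urandom of="$master" bs=1024 count=1024 2>/dev/null

# Second copy: all recovery experiments run against this one.
cp "$master" "$working"

# Confirm the copy is byte-for-byte identical before trusting it.
if cmp -s "$master" "$working"; then
    copy_status="identical"
else
    copy_status="different"
fi

# Drop write permission on the master to guard against accidental writes.
chmod a-w "$master"

echo "copy is $copy_status"
```

On a real 500GB image the cmp pass takes a while, but it is much cheaper than finding out later that the only good image is gone.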

Step 3: Run fsck.jfs on the drive, which gave the following error message:
Duplicate block references have been detected in
Metadata. CANNOT CONTINUE.
It means fsck.jfs is not going to do it this time, and the corruption goes all the way down to the metadata... Oh snap!

Step 4: Google for the JFS structure or any other recovery tool (besides TCK/Sleuthkit... it is too much hassle), and find a great tool called jfsrec.

Step 5: Follow jfsrec's documentation, and run something like this:
./jfsrec --device /path/to/my/disk.image --output /path/to/recovery/directory --logdir /path/to/log.dir
According to the doc, the process could take days, but mine went pretty well - it took just 24 hours.

Step 6: Now I have most files recovered in /path/to/recovery/directory. Even though some files are corrupted, and some filenames are lost (those files are named with their inode number), it is already much better than having it all gone. Still, here's another problem - jfsrec does not handle UTF-8 filenames properly yet (Announcing jfsrec - A JFS recovery tool: msg#00008), so I ended up with a number of files with garbled names.

Step 7: Fortunately, there is another useful tool to fix this issue. That is... "drum roll please"... convmv. Since jfsrec does not handle UTF-8 filenames properly, the filenames are actually UTF-8 interpreted as ISO8859-1. The idea is to reverse the process, so I did a dry run first (convmv only prints what it would do unless told otherwise):
cd path/to/garbled/filenames
convmv -f utf-8 -t iso8859-1 *
And the result looked pretty good, so I issued another command to do the actual work:
convmv -f utf-8 -t iso8859-1 * --notest
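To see why converting "from UTF-8 to ISO8859-1" un-garbles the names, here is a small sketch using iconv on a single string instead of convmv on real files. The sample name is made up, and the "damage" step only simulates what I assume happened to my filenames (UTF-8 bytes read as ISO8859-1 and written out again as UTF-8). It is not taken from jfsrec itself.

```shell
#!/bin/sh
set -eu

# The correct name, spelled out as explicit UTF-8 bytes ("café.txt").
original=$(printf 'caf\303\251.txt')

# Simulate the damage: treat the UTF-8 bytes as ISO-8859-1 characters
# and re-encode each one as UTF-8, which yields "cafÃ©.txt".
garbled=$(printf '%s' "$original" | iconv -f ISO-8859-1 -t UTF-8)

# Reverse it: the same direction as convmv -f utf-8 -t iso8859-1.
repaired=$(printf '%s' "$garbled" | iconv -f UTF-8 -t ISO-8859-1)

printf 'garbled:  %s\nrepaired: %s\n' "$garbled" "$repaired"
```

The repaired bytes are "ISO8859-1" only in name. They are exactly the original UTF-8 bytes, which is why the reversed conversion gives back readable names.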
Followup...
  1. Check the file integrity, and see which files are corrupted.
  2. Build a RAID-1 system, even if it is a soft-RAID (yes, mine is a software RAID controller card, due to a tight budget).
  3. Backup... frequently; preferably with an offsite backup too (an easy way is to store the backup in my office. Make sure it is encrypted, because it is not nice to give others the chance to see my bank statements, etc).
  4. And a few other steps...
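For the first followup item, a very rough first pass could look like the sketch below. The directory and file names are stand-ins I made up, and it only catches the most obvious casualties (files that came back with zero length). A file that is not empty can still be corrupted, so this narrows the list rather than proving anything.

```shell
#!/bin/sh
set -eu

# Hypothetical recovery directory with two stand-in files.
recovered=$(mktemp -d)
printf 'some data\n' > "$recovered/notes.txt"
: > "$recovered/photo.jpg"    # zero-length: nothing was recovered

# List regular files that are completely empty.
empty_files=$(find "$recovered" -type f -size 0)
echo "$empty_files"
```

Against the real recovery directory, only the find line is needed, pointed at /path/to/recovery/directory.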
Improvements & Things Learned
  1. Did I say "backup often"?
  2. Invest in a redundancy solution (weighing the cost of data loss against the price of the redundant configuration).
  3. Follow steps carefully, even if it is my home machine.
  4. Avoid making important decisions when I haven't had enough sleep (I had a heartbreak, and had been sleepless for days, when I managed to corrupt the JFS volume).
  5. Do not put too much trust in these cheap IDE2USB cables. Also, document and remember their limitations (e.g. maximum size supported).
  6. Offer improvements to the jfsrec project on UTF-8 support (when my mood gets better, and I make up my mind).
