Rescue data from an unmountable BTRFS filesystem

Today just a short post here:
What you can do and what you can not do on a damaged BTRFS filesystem.

BTRFS is a very sophisticated filesystem; it offers a lot of features that all other native Linux filesystems lack.
Everybody who has already worked with advanced filesystems like WAFL or ZFS knows what I am talking about. I don’t want to miss these features any more! I have been using BTRFS for years, also to store my data.

But as you know, in the IT business things tend to go south, especially when you hope they will not.
And there will be filesystem crashes with BTRFS, just like with every other filesystem.
You have to be prepared for that.

In our example here /dev/sdb1 will be the damaged filesystem.

The first thing admins tend to do is a check on the filesystem.

btrfs check [options] <device>

If your OS is still booting and only a BTRFS data filesystem is not mounting, this is OK.
But if you need to boot from a rescue disk, make absolutely sure that your kernel and the btrfs-progs are at least the same version as, or newer than, those on the damaged system! This is a must!
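
The version rule above can be checked with a few lines of shell. This is only a minimal sketch: the helper name version_ge and the two "damaged" version numbers are made up for illustration, and it assumes a sort that supports -V (GNU coreutils). You have to fill in the versions you noted from the damaged system yourself.

```shell
#!/bin/sh
# version_ge A B: succeeds if version A is greater than or equal to version B.
# Assumes "sort -V" (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Versions noted from the damaged system (hypothetical example values).
damaged_kernel="5.10.0"
damaged_progs="5.10.1"

# Versions of the running rescue environment.
# "uname -r" prints e.g. 5.15.0-91-generic; keep only the part before the first dash.
rescue_kernel="$(uname -r | cut -d- -f1)"
# "btrfs --version" prints e.g. "btrfs-progs v5.10.1" on its first line.
rescue_progs="$(btrfs --version 2>/dev/null | head -n 1 | sed 's/^btrfs-progs v//')"

if version_ge "$rescue_kernel" "$damaged_kernel" && version_ge "$rescue_progs" "$damaged_progs"; then
    echo "Rescue environment is new enough."
else
    echo "WARNING: rescue kernel or btrfs-progs is older than on the damaged system (or btrfs-progs is missing)."
fi
```

If the warning shows up, get a newer rescue image before you touch the filesystem.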

The btrfs check command without any options does nothing invasive; it just lists the problems.

If you are sure that your kernel and the btrfs-progs are recent versions, you can also try to mount the damaged filesystem with the option -o recovery or -o recovery,ro (newer kernels know this option as usebackuproot):

mount -o recovery,ro /dev/sdb1 /mnt

This can give you access to the damaged filesystem’s data so you can make a backup.

Another highly recommended option is to dump your damaged filesystem before you try invasive repair options. You need some kind of storage where the data can be written to, like an attached USB disk or a mounted NFS export, that is bigger than your damaged filesystem.
The btrfs restore command will try to recover every readable file on the damaged filesystem to the restore mount point. You may be able to rescue nearly 100% of the data with that measure!

btrfs restore /dev/sdb1 /mnt/restore
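
Before starting a long restore it can be worth checking that the target really is big enough. The following is a rough sketch, not a finished tool: the function names has_room and check_target are made up here, and it assumes blockdev (util-linux) and a df that supports --output (GNU coreutils).

```shell
#!/bin/sh
# has_room NEEDED AVAILABLE: succeeds if AVAILABLE bytes cover NEEDED bytes.
has_room() {
    [ "$2" -ge "$1" ]
}

# check_target DEVICE DIR: compare the size of DEVICE with the free space in DIR.
# Needs root for blockdev; returns 2 if the device size cannot be read.
check_target() {
    dev_bytes="$(blockdev --getsize64 "$1")" || return 2
    free_bytes="$(df --output=avail -B1 "$2" | tail -n 1 | tr -d ' ')"
    if has_room "$dev_bytes" "$free_bytes"; then
        echo "OK: $2 has room for a full restore of $1"
    else
        echo "WARNING: $2 has less free space than the size of $1" >&2
        return 1
    fi
}

# Usage (as root, not executed by this sketch):
#   check_target /dev/sdb1 /mnt/restore && btrfs restore -D -v /dev/sdb1 /mnt/restore
# -D is a dry run that only lists what would be restored; drop it for the real run.
```

The check is pessimistic on purpose: it compares against the full device size, although the data actually stored on the filesystem may be much smaller.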

You may be unsuccessful with the default root tree set of the filesystem. To search for a better root tree set you can use

btrfs-find-root /dev/sdb1

Choose the most recent root tree set (the one with the highest generation number) with all objects below 256 in it, and use the block number mentioned in the line

Well block XXXXXXXXXXX seems great

and then use this block number in the restore command:

btrfs restore -t XXXXXXXXXXXXXX /dev/sdb1 /mnt/restore
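
Picking the block with the highest generation by hand gets tedious when btrfs-find-root prints many candidates. Here is a small sketch that does the sorting. Be careful: the exact line format differs between btrfs-progs versions. The sketch assumes lines of the form "Well block N seems great, ... have=G, ..." and the sample lines in it are invented for illustration, so adapt the pattern to what your version really prints. The function name best_block is made up as well.

```shell
#!/bin/sh
# best_block: reads btrfs-find-root output on stdin and prints the block number
# of the "Well block" line with the highest generation (the have= value).
# ASSUMPTION: line format "Well block N seems great, ... have=G, ...".
best_block() {
    sed -n 's/^Well block \([0-9][0-9]*\) seems great.*have=\([0-9][0-9]*\).*/\2 \1/p' |
        sort -n -r | head -n 1 | cut -d' ' -f2
}

# Hypothetical sample output, only to show how the function is used.
best_block <<'EOF'
Well block 4194304 seems great, but generation doesn't match, have=2, want=5 level 0
Well block 8388608 seems great, but generation doesn't match, have=4, want=5 level 0
EOF
# prints 8388608

# Real usage (as root, not executed by this sketch):
#   block="$(btrfs-find-root /dev/sdb1 2>&1 | best_block)"
#   btrfs restore -t "$block" -D /dev/sdb1 /mnt/restore
```

The sketch only looks at the generation number. Whether the chosen root really contains all the objects you need still has to be checked by you, for example with a dry run (-D) first.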

It is strongly recommended that you do this before you use invasive measures on the filesystem!

Invasive Repair Methods

Remember, before you use the following options, it’s a must to have restored the data to a different device!
All invasive methods can damage your filesystem even more!

If a mount attempt gives you output in dmesg similar to this

? replay_one_dir_item+0xb5/0xb5 [btrfs]
? walk_log_tree+0x9c/0x19d [btrfs]
? btrfs_read_fs_root_no_radix+0x169/0x1a1 [btrfs]
? btrfs_recover_log_trees+0x195/0x29c [btrfs]
? replay_one_dir_item+0xb5/0xb5 [btrfs]

Then you can zero the log. But really only if you see such output. Otherwise you make things worse!
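
In current btrfs-progs the log is zeroed with btrfs rescue zero-log (older versions shipped a separate btrfs-zero-log tool). The sketch below only looks for the log replay trace in dmesg and prints the command instead of running it, so nothing is changed on disk. The function name log_replay_failed is made up for this example.

```shell
#!/bin/sh
# log_replay_failed: reads kernel messages on stdin and succeeds if they
# contain a trace through btrfs_recover_log_trees, as in the output above.
log_replay_failed() {
    grep -q 'btrfs_recover_log_trees'
}

DEVICE="/dev/sdb1"

# dmesg may need root; if it cannot be read, the check simply finds nothing.
if dmesg 2>/dev/null | log_replay_failed; then
    echo "Log replay trace found. To clear the log, run as root:"
    echo "  btrfs rescue zero-log $DEVICE"
else
    echo "No log replay trace in dmesg; do not zero the log."
fi
```

Zeroing the log throws away the last not yet committed changes, so run the printed command only after your data has been saved with btrfs restore.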

The invasive btrfs check --repair command

Warning: this command should be your last resort! Chances that things might get super messed up are high!

Most admins know filesystem repair commands from other filesystem types as a good way to repair damage.
This is not the case here! Before you try the repair command, you need to have created a dump of your filesystem with btrfs restore!
I made this mistake myself some time ago: I had a 5TB BTRFS filesystem with data on it and no device at hand to restore the data to. I really lost all 5TB with a failed btrfs check --repair!

btrfs check --repair /dev/sdb1

If you have no other choice any more and you have already saved the data with btrfs restore, you can give it a try; you have nothing to lose any more.

Conclusion

In case of a BTRFS filesystem failure, the command

btrfs restore

is your best friend. It’s not the command

btrfs check --repair

CU soon here again on my website!