Notifying the ReiserFS of the bad area


	Note
	It appears that reiserfstune cannot be run on a mounted file system, so we must unmount the file system after all. The only advantage we have gained is that this can (hopefully) be a quick operation now.

Note

It also appears that reiserfstune cannot accept the output of badblocks directly. Their units of disk space differ (see Table 1), so we have to convert:

for n in `cat bad_blocks.dev.sda9` ; do echo -e "${n}\n8\n/\np"|dc ; done > converted-badblocks-file

6512898 6512898 6512898 6512898
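The conversion above can also be sketched with plain shell arithmetic instead of dc. In this sketch the function name convert_badblocks and the choice to drop duplicate results are assumptions, not part of the original procedure. The divisor is passed in explicitly because it depends on the block size badblocks was run with.

```shell
# Hypothetical helper: read block numbers on stdin, one per line,
# divide each by the given factor (integer division) and print
# the resulting block numbers once each, in numeric order.
convert_badblocks() {
    factor="$1"
    while read -r n; do
        # Skip empty lines so the arithmetic never sees an empty operand.
        [ -n "$n" ] || continue
        echo $(( n / factor ))
    done | sort -n -u
}
```

For example, convert_badblocks 8 < bad_blocks.dev.sda9 > converted-badblocks-file would mirror the dc loop above, except that a block number that occurs several times is written only once.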

When we try to notify the filesystem of its bad blocks ( reiserfsck --add-badblocks <converted-badblocks-file> --fix-fixable /dev/sda9 ), the command fails with a message saying that the block in question is already in use and that reiserfsck should be used to repair the filesystem.

	Warning
	That `reiserfsck` run then fails with a segmentation fault, and we are glad to escape with our filesystem intact. There is no way that I will use reiserfsck --rebuild-tree on an already populated filesystem.

	Note
	I think we'd better stay away from `ReiserFS` from now on.

We take another route and run find /var/ -type f -exec cat {} \;>/dev/null on the affected filesystem, which reads every file on it. This fails with the message

find /var/ -type f -exec cat {} \;>/dev/null
cat: /var/lib/postgresql/8.1/main/base/16629/16667: Input/output error

and since we know that only a single sector is affected, this must be the file that causes the messages in our logs. So we'll do the following:

	Warning
	In the following procedure, a long list of SCSI driver errors (as in Example 2) is often still in the kernel ring buffer. During start/stop/reload of `syslog-ng` they will scroll across the console. This may look disturbing, but it does not indicate that the bad part of the disk is still being accessed.

Stop all services that use the filesystem by switching to single-user mode:

init 1
Stop syslog-ng too, as it uses /var and is still active in runlevel 1:

/etc/init.d/syslog-ng stop
Unmount the filesystem ^[18]:

umount /var
Mount it somewhere else:

mkdir /mnt/sda9 && mount /dev/sda9 /mnt/sda9
Move the file that lies on the bad sector to another filesystem ^[19]:

dd if=/mnt/sda9/lib/postgresql/8.1/main/base/16629/16667 of=/home/16667 conv=noerror
Unmount the partition:

umount /dev/sda9
1. Notify the filesystem of its bad blocks using reiserfstune:
  
  reiserfstune --add-badblocks ~/bad_blocks.dev.sda9.base4096 /dev/sda9
2. When this fails with the already-in-use message, we try
  
  /sbin/reiserfsck --badblocks ~/bad_blocks.dev.sda9.base4096 /dev/sda9
  
  Note
  
  This failed during an earlier try, but it succeeded this time. YMMV.
Remount the partition at the alternative mount point:

mount /dev/sda9 /mnt/sda9
Move the file back in place:

mv /home/16667 /mnt/sda9/lib/postgresql/8.1/main/base/16629/
Unmount the partition from its alternative mount point:

umount /dev/sda9
Mount it in its usual place:

mount /dev/sda9
Start syslog-ng again:

/etc/init.d/syslog-ng start
Return to multi-user mode:

init 5

After all this is done, we see no more SCSI errors in the logs, and debugreiserfs -B /tmp/bad /dev/sda9 && cat /tmp/bad confirms that block 6512898 is bad.
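As an additional check, the read test that located the damaged file could be repeated after the repair. The following is a minimal sketch that prints only the paths that cannot be read in full. The function name list_unreadable is hypothetical, and the loop assumes that no file name contains a newline.

```shell
# Hypothetical helper: read every regular file below the given directory
# and print the path of each file that cannot be read completely.
list_unreadable() {
    find "$1" -type f -print | while IFS= read -r f; do
        # cat exits non-zero on a read error such as "Input/output error".
        cat "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

Running list_unreadable /var should print nothing once no file lies on an unreadable sector any more.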

Example 2. SCSI errors in the log

Raw sense data:0xf0 0x00 0x03 0x0a 0xb5 0xf0 0xfd 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0xe4 0x80 0x00 0x86
I/O error: dev 08:09, sector 104206368
scsi1: ERROR on channel 0, id 0, lun 0, CDB: 0x28 00 0a b5 f0 fa 00 00 08 00
Info fld=0xab5f0fd, Current sd08:09: sns = f0 3
ASC=11 ASCQ= 0
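As a reading aid for the log above: the CDB starts with opcode 0x28, which is the SCSI READ(10) command. In that command, bytes 2 to 5 hold the starting logical block address (big-endian) and bytes 7 to 8 hold the number of blocks to transfer. A minimal sketch that decodes these two fields follows; decode_read10 is a hypothetical helper name, and it expects the ten CDB bytes as one space-separated string.

```shell
# Hypothetical helper: decode the LBA and transfer length of a READ(10) CDB.
decode_read10() {
    # Split the CDB string into its ten bytes ($1 .. ${10}).
    set -- $1
    # Bytes 2-5 (positions 3-6 here) form the starting LBA, big-endian.
    lba=$(( 0x$3$4$5$6 ))
    # Bytes 7-8 (positions 8-9 here) form the transfer length in blocks.
    count=$(( 0x$8$9 ))
    echo "lba=$lba count=$count"
}

decode_read10 "28 00 0a b5 f0 fa 00 00 08 00"
```

The Info fld value 0xab5f0fd is 179695869 in decimal, which lies at offset 3 inside this 8-block read. That is consistent with a single unreadable sector.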

Table 1. Units of disk space used by programs involved in a ReiserFS badblocks detection

program	unit	size
kernel sata driver	sectors	512 bytes
badblocks	block	1024 bytes
ReiserFS	block	4096 bytes

^[18] Of course, we first have to unmount any file systems (possibly remote ones) mounted on subdirectories of /var, such as /var/mail.

^[19] We cannot use cp, as it will stop when it encounters the bad sector, and copy only part of the file.
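One point to be aware of with dd: with conv=noerror alone, a block that cannot be read is simply left out of the output, so everything after it moves to a lower offset. Adding sync makes dd pad each failed or short block with zero bytes, which keeps the remaining data at its original offsets. The sketch below is a variation on the command used above and is not what was run in this procedure; rescue_copy is a hypothetical helper name.

```shell
# Hypothetical helper: copy a file while tolerating read errors.
# Unreadable blocks are replaced by zero bytes (conv=sync), so the data
# that follows them stays at its original offset in the copy.
rescue_copy() {
    src="$1"
    dst="$2"
    # Block size in bytes; 512 is dd's default and the smallest unit lost per error.
    bs="${3:-512}"
    dd if="$src" of="$dst" bs="$bs" conv=noerror,sync
}
```

Because sync also pads the final block, the copy is only byte-identical to the source when the file size is a multiple of the block size.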


