ReiserFS
of the bad area
Note | |
---|---|
It appears that reiserfstune may not be run on a mounted file system, so we must unmount the file system after all. The only advantage gained is that this can (hopefully) be a quick operation now. |
When we try to notify the filesystem of its bad blocks (reiserfsck --add-badblocks <converted-badblocks-file> --fix-fixable /dev/sda9), the command returns an error, reporting that the block under consideration is already in use and that reiserfsck should be used to repair it.
We take another route and read every file on the affected filesystem, which fails with an I/O error on one file:
find /var/ -type f -exec cat {} \;>/dev/null
cat: /var/lib/postgresql/8.1/main/base/16629/16667: Input/output error
Since we know that only a single sector is affected, this must be the file that causes the messages in our logs. So we'll do the following:
Stop all services that use the filesystem by switching to single-user mode:
init 1
Stop syslog-ng too, as it uses /var and is still active in runlevel 1:
/etc/init.d/syslog-ng stop
Unmount the filesystem [18]:
umount /var
Mount it somewhere else:
mkdir /mnt/sda9 && mount /dev/sda9 /mnt/sda9
Copy the file that lies on the bad sector to another filesystem:
dd if=/mnt/sda9/lib/postgresql/8.1/main/base/16629/16667 of=/home/16667 conv=noerror
Unmount the partition:
umount /dev/sda9
Notify the filesystem of its bad blocks using reiserfstune:
reiserfstune --add-badblocks ~/bad_blocks.dev.sda9.base4096 /dev/sda9
When this fails with the already-in-use message, we try:
/sbin/reiserfsck --badblocks ~/bad_blocks.dev.sda9.base4096 /dev/sda9
Note | |
---|---|
This failed during an earlier try, but it succeeded this time. YMMV. |
Remount the partition at the alternative mount point:
mount /dev/sda9 /mnt/sda9
Move the file back into place:
mv /home/16667 /mnt/sda9/lib/postgresql/8.1/main/base/16629/
Unmount the partition from its alternative mount point:
umount /dev/sda9
Mount it in its usual place:
mount /dev/sda9
Start syslog-ng again:
/etc/init.d/syslog-ng start
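In the copy step above, dd runs with conv=noerror, which keeps going after a read error but, on its own, may leave the copy shorter than the original because nothing is written for the unreadable block. A minimal sketch of a variant, assuming GNU coreutils (stat -c, truncate) and a 4096-byte block size: adding sync pads failed or short reads with zeros so that file offsets are preserved, and the copy is then trimmed back to the original length. The helper name copy_padded is hypothetical and not part of the original procedure.

```shell
#!/usr/bin/env bash
# copy_padded SRC DST [BLOCKSIZE]
# Hypothetical helper (an assumption, not from the original text).
copy_padded() {
    local src=$1 dst=$2 bs=${3:-4096}
    local size
    # Ask the filesystem for the file size; this does not read the file data.
    size=$(stat -c %s "$src") || return 1
    # noerror: continue after a read error.
    # sync:    pad every short or failed read with zeros up to the block size.
    dd if="$src" of="$dst" bs="$bs" conv=noerror,sync 2>/dev/null
    # sync also pads the final partial block, so cut the copy back to size.
    truncate -s "$size" "$dst"
}
```

It would be called as copy_padded /mnt/sda9/lib/postgresql/8.1/main/base/16629/16667 /home/16667. The zero-filled block still represents lost data; the sketch only keeps the rest of the file at its original offsets.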
After all this is done, we see no more SCSI errors in the logs, and debugreiserfs -B /tmp/bad /dev/sda9 && cat /tmp/bad confirms that block 6512898 is bad.
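The read scan used earlier to locate the damaged file prints one cat error in the middle of whatever else appears on the terminal. A small sketch of a variant that prints only the names of files that fail to read is shown here; it could also be rerun after the repair to check that every file reads cleanly. The function name list_unreadable and its optional second argument (the command used to read each file, cat by default) are assumptions made for this illustration.

```shell
#!/usr/bin/env bash
# list_unreadable DIR [READER]
# Hypothetical helper: print every regular file under DIR that READER
# (default: cat) cannot read completely.
list_unreadable() {
    local dir=$1 reader=${2:-cat}
    # -xdev keeps find on the filesystem that DIR lives on.
    find "$dir" -xdev -type f -exec sh -c '
        reader=$1; shift
        for f do
            # Discard the data and the error text; only the exit status matters.
            "$reader" "$f" >/dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' _ "$reader" {} +
}
```

For the case described here, list_unreadable /var would be expected to print the single PostgreSQL file before the repair and nothing afterwards.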
Raw sense data:0xf0 0x00 0x03 0x0a 0xb5 0xf0 0xfd 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0xe4 0x80 0x00 0x86
I/O error: dev 08:09, sector 104206368
scsi1: ERROR on channel 0, id 0, lun 0, CDB: 0x28 00 0a b5 f0 fa 00 00 08 00
Info fld=0xab5f0fd, Current sd08:09: sns = f0 3
ASC=11 ASCQ= 0
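The hexadecimal fields in these messages can be decoded with shell arithmetic. The sketch below assumes the usual SCSI layout: opcode 0x28 is READ(10), bytes 2 to 5 of the CDB hold the starting LBA, and bytes 7 to 8 hold the transfer length. The Info field (also visible as bytes 3 to 6 of the raw sense data) names the failing LBA. The variable names are made up for this example.

```shell
#!/usr/bin/env bash
# Values copied from the log lines above.
info_fld=0xab5f0fd     # "Info fld=" value: LBA that could not be read
cdb_lba=0x0ab5f0fa     # CDB bytes 2-5 (0a b5 f0 fa): first LBA of the request
cdb_len=0x0008         # CDB bytes 7-8 (00 08): number of sectors requested

# Shell arithmetic accepts 0x-prefixed constants and yields decimal values.
failing_lba=$((info_fld))
first_lba=$((cdb_lba))
count=$((cdb_len))
offset=$((failing_lba - first_lba))

printf 'request: %d sectors starting at LBA %d\n' "$count" "$first_lba"
printf 'failing LBA: %d (offset %d within the request)\n' "$failing_lba" "$offset"
```

The request covers 8 sectors of 512 bytes, which is one 4096-byte filesystem block, and the failing sector lies at offset 3 within it. Sense key 3 with ASC 0x11 denotes a medium error with an unrecovered read error, which fits a single bad sector.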
program | unit | size |
---|---|---|
kernel sata driver | sectors | 512 bytes |
badblocks | block | 1024 bytes |
ReiserFS | block | 4096 bytes |
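Because each program counts in a different unit, block numbers have to be converted before they are passed from one tool to another. The following is a minimal sketch of that arithmetic using the sizes from the table. The function names are hypothetical, and the input 26051592 is an illustrative badblocks-style number chosen so that the result matches the ReiserFS block 6512898 mentioned in the text; it is not taken from the original logs.

```shell
#!/usr/bin/env bash
# Unit sizes in bytes, as listed in the table above.
SECTOR_SIZE=512
BADBLOCKS_SIZE=1024
REISERFS_SIZE=4096

# convert_unit NUMBER FROM_SIZE TO_SIZE
# Scale a block number from one unit size to another (rounds down).
convert_unit() {
    echo $(( $1 * $2 / $3 ))
}

# to_reiserfs_blocks: read 1024-byte block numbers on stdin, one per line,
# and print the distinct 4096-byte blocks that contain them.
to_reiserfs_blocks() {
    local n
    while read -r n; do
        convert_unit "$n" "$BADBLOCKS_SIZE" "$REISERFS_SIZE"
    done | sort -nu
}

convert_unit 26051592 "$BADBLOCKS_SIZE" "$REISERFS_SIZE"   # badblocks -> ReiserFS
convert_unit 6512898 "$REISERFS_SIZE" "$SECTOR_SIZE"       # ReiserFS -> first sector
```

Four consecutive 1024-byte blocks fall into the same 4096-byte block, which is why to_reiserfs_blocks removes duplicates. Alternatively, badblocks accepts a -b option to set its block size, which would make the conversion unnecessary.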