On using Badblocks with ReiserFS

	Part IV. Trouble Shooting

Jurjen Bokma

July 2007

Table of Contents

Finding the bad area using `badblocks`

We have error messages on server iwi202 that look like those shown in Example 1. The problem repeats twice in quick succession (8 seconds between occurrences) about every twelve minutes, but it doesn't stick to fixed times past the hour, so we don't believe a cron job causes it. The machine does react more slowly than usual. I will move important processes off the machine, but some minor items may stay on it, and I want to see if I can get rid of the problem by making ReiserFS stop using the single block that is causing errors.

Example 1. Log of bad sectors on iwi202

Jul 17 09:11:48 src@iwinnn kernel: scsi1: ERROR on channel 0, id 0, lun 0, CDB: 0x28 00 0a b5 f0 fa 00 00 10 00
Jul 17 09:11:48 src@iwinnn kernel: Info fld=0xab5f0fd, Current sd08:09: sns = f0  3
Jul 17 09:11:48 src@iwinnn kernel: ASC=11 ASCQ= 0
Jul 17 09:11:48 src@iwinnn kernel: Raw sense data:0xf0 0x00 0x03 0x0a 0xb5 0xf0 0xfd 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0xe4 0x80 0x00 0x86
Jul 17 09:11:48 src@iwinnn kernel: I/O error: dev 08:09, sector 104206368
Jul 17 09:11:56 src@iwinnn kernel: scsi1: ERROR on channel 0, id 0, lun 0, CDB: 0x28 00 0a b5 f0 fa 00 00 08 00
Jul 17 09:11:56 src@iwinnn kernel: Info fld=0xab5f0fd, Current sd08:09: sns = f0  3
Jul 17 09:11:56 src@iwinnn kernel: ASC=11 ASCQ= 0
Jul 17 09:11:56 src@iwinnn kernel: Raw sense data:0xf0 0x00 0x03 0x0a 0xb5 0xf0 0xfd 0x0a 0x00 0x00 0x00 0x00 0x11 0x00 0xe4 0x80 0x00 0x86
Jul 17 09:11:56 src@iwinnn kernel: I/O error: dev 08:09, sector 104206368
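To confirm that every occurrence points at the same sector, the "I/O error" lines can be tallied per device and sector. The following is only a minimal sketch: the function name failing_sectors is made up for illustration, and it assumes the kernel messages keep the exact "I/O error: dev MM:mm, sector N" wording seen in Example 1.

```shell
# Hypothetical helper (not part of any package): tally failing sectors.
# Reads kernel log text on stdin and keeps only lines of the form
#   "... I/O error: dev 08:09, sector 104206368"
# Output: one line per distinct device/sector pair, prefixed by its count.
failing_sectors() {
    sed -n 's/.*I\/O error: dev \([0-9a-f:]*\), sector \([0-9]*\).*/\1 \2/p' |
        sort |
        uniq -c
}
```

It would be fed the system log on stdin, for instance failing_sectors < /var/log/messages, where the log path is an assumption that depends on the syslog configuration.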

The sector that causes the errors, 104206368, is located in /dev/sda9, which is mounted as /var. I could run badblocks on the entire disk if I put the machine in single-user mode and unmounted /var, but I'd rather be as unobtrusive as possible, as I'll see notifications of bad sectors turning up in the logs anyway. According to the badblocks manual, I can say:

badblocks -c<blocks-at-a-time> <device> <end-block> <start-block> -i <former-badblocks-report>

badblocks counts in blocks of 1024 bytes by default, whereas we know the location of the bad sector in 512-byte sectors. So we compute the location of the sector in blocks:

echo -e "104206368\n2\n/\np"|dc

which yields

52103184

Then we issue the command to check only the area around that block on the partition:

badblocks -c64 /dev/sda9 52103222 52103152 |tee ~/bad_blocks.dev.sda9

A few blocks after our culprit appear to be bad as well:

52103184
52103185
52103186
52103187
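The arithmetic that leads from the logged sector to the scanned range can also be sketched in plain shell. This is an illustration only: the function names sector_to_block and scan_command are hypothetical, the margin of 32 blocks on either side is an arbitrary assumption (the scan above went a little further past the culprit), and the sketch prints the badblocks command instead of running it.

```shell
# Convert a 512-byte sector number to a badblocks block number.
# $1 = sector, $2 = block size in bytes (defaults to 1024, the badblocks
# default). Assumes the block size is a multiple of 512.
sector_to_block() {
    sector=$1
    block_size=${2:-1024}
    echo $(( sector / (block_size / 512) ))
}

# Print (do not execute) a badblocks command that scans `margin` blocks
# on either side of the suspect block. badblocks expects the last block
# before the first block on its command line.
# $1 = device, $2 = suspect block, $3 = margin (assumed default: 32)
scan_command() {
    device=$1
    block=$2
    margin=${3:-32}
    first=$(( block - margin ))
    last=$(( block + margin ))
    echo "badblocks -c64 $device $last $first"
}

block=$(sector_to_block 104206368)
echo "$block"                      # prints 52103184
scan_command /dev/sda9 "$block"    # prints badblocks -c64 /dev/sda9 52103216 52103152
```

Printing the command first makes it easy to check the block range by eye before anything touches the disk.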