Our little do-everything server is a Dell rack machine with two 750 GB SATA disks set up as RAID-1 (mirror).
sdb started acting up and got kicked out of the RAID.
I rebooted (!), re-fdisked/reformatted/re-added it to the RAID, and off went the looOOoong crunch. It never made it to the end.
So I swapped the drive.
Then another.
Then another.
4x.
Already.
(ok fine, the first two weren't new, but in 2011 a disk can survive 2000 flight hours, can't it?)
Anyway, I'm starting to stop doubting the drive... and to doubt the sdb interface instead (and that really hurts, I'm an electronics guy)
[796082.196337] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[796082.196340] sd 1:0:0:0: [sdb] Sense Key : Aborted Command [current] [descriptor]
[796082.196343] Descriptor sense data with sense descriptors (in hex):
[796082.196345] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[796082.196353] 00 00 00 00
[796082.196356] sd 1:0:0:0: [sdb] Add. Sense: No additional sense information
[796082.196360] sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[796082.196367] end_request: I/O error, dev sdb, sector 0
[796082.196394] Buffer I/O error on device sdb, logical block 0
[796082.196426] ata2: EH complete
[796082.196580] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.196614] ata2.00: BMDMA stat 0x25
[796082.196637] ata2.00: failed command: READ DMA
[796082.196664] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.196665] res 61/04:08:00:00:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.202924] ata2.00: status: { DRDY DF ERR }
[796082.202953] ata2.00: error: { ABRT }
[796082.232333] ata2.00: configured for UDMA/133
[796082.232345] ata2: EH complete
[796082.232489] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.232525] ata2.00: BMDMA stat 0x25
[796082.232549] ata2.00: failed command: READ DMA
[796082.232577] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.232578] res 61/04:08:00:00:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.232669] ata2.00: status: { DRDY DF ERR }
[796082.232693] ata2.00: error: { ABRT }
[796082.252337] ata2.00: configured for UDMA/133
[796082.252343] ata2: EH complete
[796082.252474] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.252508] ata2.00: BMDMA stat 0x25
[796082.252533] ata2.00: failed command: READ DMA
[796082.252561] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.252562] res 61/04:08:00:00:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.252649] ata2.00: status: { DRDY DF ERR }
[796082.252673] ata2.00: error: { ABRT }
[796082.269835] ata2.00: configured for UDMA/133
[796082.269843] ata2: EH complete
[796082.269983] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.270016] ata2.00: BMDMA stat 0x25
[796082.270041] ata2.00: failed command: READ DMA
[796082.270070] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.270071] res 61/04:08:00:00:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.270164] ata2.00: status: { DRDY DF ERR }
[796082.270188] ata2.00: error: { ABRT }
[796082.284330] ata2.00: configured for UDMA/133
[796082.284342] ata2: EH complete
[796082.284475] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.284504] ata2.00: BMDMA stat 0x25
[796082.284528] ata2.00: failed command: READ DMA
[796082.284556] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.284558] res 61/04:08:00:00:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.284659] ata2.00: status: { DRDY DF ERR }
[796082.284683] ata2.00: error: { ABRT }
[796082.301338] ata2.00: configured for UDMA/133
[796082.301353] ata2: EH complete
[796082.301486] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.301515] ata2.00: BMDMA stat 0x25
[796082.301539] ata2.00: failed command: READ DMA
[796082.301567] ata2.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.301568] res 61/04:08:00:00:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.301654] ata2.00: status: { DRDY DF ERR }
[796082.301678] ata2.00: error: { ABRT }
[796082.324328] ata2.00: configured for UDMA/133
[796082.324338] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[796082.324341] sd 1:0:0:0: [sdb] Sense Key : Aborted Command [current] [descriptor]
[796082.324344] Descriptor sense data with sense descriptors (in hex):
[796082.324345] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[796082.324351] 00 00 00 00
[796082.324354] sd 1:0:0:0: [sdb] Add. Sense: No additional sense information
[796082.324357] sd 1:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
[796082.324363] end_request: I/O error, dev sdb, sector 0
[796082.324390] Buffer I/O error on device sdb, logical block 0
[796082.324427] ata2: EH complete
[796082.324600] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.324631] ata2.00: BMDMA stat 0x25
[796082.324655] ata2.00: failed command: READ DMA
[796082.324682] ata2.00: cmd c8/00:08:00:10:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.324683] res 61/04:08:00:10:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.324773] ata2.00: status: { DRDY DF ERR }
[796082.324798] ata2.00: error: { ABRT }
[796082.340336] ata2.00: configured for UDMA/133
[796082.340342] ata2: EH complete
[796082.340473] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.340506] ata2.00: BMDMA stat 0x25
[796082.340530] ata2.00: failed command: READ DMA
[796082.340558] ata2.00: cmd c8/00:08:00:10:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.340559] res 61/04:08:00:10:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.340645] ata2.00: status: { DRDY DF ERR }
[796082.340669] ata2.00: error: { ABRT }
[796082.357830] ata2.00: configured for UDMA/133
[796082.357835] ata2: EH complete
[796082.357971] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[796082.357999] ata2.00: BMDMA stat 0x25
[796082.358022] ata2.00: failed command: READ DMA
[796082.358050] ata2.00: cmd c8/00:08:00:10:00/00:00:00:00:00/e0 tag 0 dma 4096 in
[796082.358051] res 61/04:08:00:10:00/04:00:57:00:00/e0 Emask 0x1 (device error)
[796082.358137] ata2.00: status: { DRDY DF ERR }
[796082.358161] ata2.00: error: { ABRT }
[796082.380337] ata2.00: configured for UDMA/133
[796082.380353] ata2: EH complete
[796082.380489] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
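For anyone reading the trace above: the first two hex bytes of each `res` line are the ATA Status and Error registers, and they can be decoded by hand. Below is a minimal Python sketch of that decoding. The function name `decode_res` and the bit tables are my own illustration (bit values taken from the ATA register definitions, only the commonly printed flags are listed), not something from the kernel sources.

```python
import re

# Bit masks for the ATA Status and Error registers (subset, for illustration).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def flags(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table.items() if value & mask]


def decode_res(line):
    """Decode the status/error bytes of a libata 'res SS/EE:...' log line.

    Hypothetical helper: returns (status_flags, error_flags),
    or None if the line has no 'res' field.
    """
    match = re.search(r"res ([0-9a-f]{2})/([0-9a-f]{2}):", line)
    if match is None:
        return None
    status = int(match.group(1), 16)
    error = int(match.group(2), 16)
    return flags(status, STATUS_BITS), flags(error, ERROR_BITS)


if __name__ == "__main__":
    sample = ("[796082.196665] res 61/04:08:00:00:00/04:00:57:00:00/e0 "
              "Emask 0x1 (device error)")
    print(decode_res(sample))
```

On the `res 61/04:...` lines this gives DRDY, DF, ERR for the status byte and ABRT for the error byte, which matches the `status: { DRDY DF ERR }` and `error: { ABRT }` lines the kernel prints right after. DF is "device fault" and ABRT is "command aborted"; the sketch only decodes the bytes and says nothing about whether the drive, the cable or the port is to blame.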
I'm starting to run out of tests/ideas/new disks....
The data isn't a big worry, there's an hourly rsnapshot.
Downtime would be more annoying, and above all I get the feeling that re-crunching the RAID-1 is long and hard on sda, which might also tell me to get lost one day...