S.M.A.R.T. errors on my serverbeach debian box.


Hero Zzyzzx
2004-11-05, 21:44 PM
I'm monitoring HDD S.M.A.R.T. information on my debian box via the excellent "smartsuite".

I'm getting attribute 201 errors, where 201 is an "unknown" or "reserved" attribute. These started in the last couple of months and are intermittent: sometimes more than once a day, sometimes several days pass without an error.

I haven't been able to find out what S.M.A.R.T. attribute 201 means on the Maxtor drive in my serverbeach debian box. Everything appears to be sailing along fine, but I want to catch any errors ahead of time. That's what hardware monitoring is for, no?

Is this a problem? The smartctl output is attached below.


Device: Maxtor 6Y060L0 Supports ATA Version 7
Drive supports S.M.A.R.T. and is enabled
Check S.M.A.R.T. Passed.

General Smart Values:
Off-line data collection status: (0x80) Offline data collection activity was
never started

Self-test execution status: ( 241) Self-test routine in progess
10% of test remaining

Total time to complete off-line
data collection: ( 181) Seconds

Offline data collection
Capabilities: (0x5b)SMART EXECUTE OFF-LINE IMMEDIATE
Automatic timer ON/OFF support
Suspend Offline Collection upon new
command
Offline surface scan supported
Self-test supported

Smart Capablilities: (0x0003) Saves SMART data before entering
power-saving mode
Supports SMART auto save timer

Error logging capability: (0x01) Error logging supported

Short self-test routine
recommended polling time: ( 2) Minutes

Extended self-test routine
recommended polling time: ( 31) Minutes

Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                     Flag    Value  Worst  Threshold  Raw Value
(  3)Spin Up Time             0x0027  224    224    063        7097
(  4)Start Stop Count         0x0032  253    253    000        59
(  5)Reallocated Sector Ct    0x0033  253    253    063        0
(  6)Read Channel Margin      0x0001  253    253    100        0
(  7)Seek Error Rate          0x000a  253    252    000        0
(  8)Seek Time Preformance    0x0027  250    242    187        52864
(  9)Power On Hours           0x0032  227    227    000        18474
( 10)Spin Retry Count         0x002b  253    252    157        0
( 11)Calibration Retry Count  0x002b  253    252    223        0
( 12)Power Cycle Count        0x0032  253    253    000        67
(192)Power-Off Retract Count  0x0032  253    253    000        0
(193)Load Cycle Count         0x0032  253    253    000        0
(194)Temperature              0x0032  253    253    000        38
(195)Hardware ECC Recovered   0x000a  253    252    000        364
(196)Reallocated Event Count  0x0008  253    253    000        0
(197)Current Pending Sector   0x0008  253    253    000        0
(198)Offline Uncorrectable    0x0008  253    253    000        0
(199)UDMA CRC Error Count     0x0008  199    198    000        2
(200)Unknown Attribute        0x000a  253    252    000        0
(201)Unknown Attribute        0x000a  253    241    000        0
(202)Unknown Attribute        0x000a  253    252    000        0
(203)Unknown Attribute        0x000b  253    252    180        0
(204)Unknown Attribute        0x000a  253    252    000        0
(205)Unknown Attribute        0x000a  253    252    000        0
(207)Unknown Attribute        0x002a  253    252    000        0
(208)Unknown Attribute        0x002a  253    252    000        0
(209)Unknown Attribute        0x0024  196    196    000        0
( 99)Unknown Attribute        0x0004  253    253    000        0
(100)Unknown Attribute        0x0004  253    253    000        0
(101)Unknown Attribute        0x0004  253    253    000        0
SMART Error Log:
SMART Error Logging Version: 1
Error Log Data Structure Pointer: 02
ATA Error Count: 2
Non-Fatal Count: 0

Error Log Structure 1:
DCR FR SC SN CL SH D/H CR Timestamp
08 00 80 c0 8d 23 e0 ca 36046
08 00 80 40 8e 23 e0 ca 36046
08 00 80 c0 8e 23 e0 ca 36046
08 00 80 40 8f 23 e0 ca 36046
08 00 80 c0 8f 23 e0 ca 36046
00 84 40 00 90 23 e0 51 3824485

Error Log Structure 2:
DCR FR SC SN CL SH D/H CR Timestamp
08 00 80 28 de 23 e0 ca 36046
08 00 80 a8 de 23 e0 ca 36046
08 00 80 28 df 23 e0 ca 36046
08 00 80 a8 df 23 e0 ca 36046
08 00 80 28 e0 23 e0 ca 36046
00 84 3d 6b e0 23 e0 51 3824486
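
For anyone reading a dump like this, a rough way to check the attribute table is to compare each attribute's normalized Value and Worst against its Threshold. By the usual S.M.A.R.T. convention an attribute counts as failed once its normalized value drops to the threshold or below. The sketch below is only an illustration: the row layout is assumed from the output in this post, and the names `Attr`, `parse_row` and `status` are made up for the example, not part of smartsuite.

```python
import re
from typing import NamedTuple, Optional

# Row layout assumed from the dump in this post:
# (ID)Name  Flag  Value  Worst  Threshold  Raw
ROW = re.compile(
    r"^\(\s*(\d+)\)\s*(.+?)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


class Attr(NamedTuple):
    ident: int
    name: str
    flag: int
    value: int      # current normalized value
    worst: int      # lowest normalized value recorded so far
    threshold: int  # vendor-set failure threshold
    raw: int


def parse_row(line: str) -> Optional[Attr]:
    """Return an Attr for one table row, or None if the line is not a row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    ident, name, flag, value, worst, threshold, raw = m.groups()
    return Attr(int(ident), name, int(flag, 16), int(value),
                int(worst), int(threshold), int(raw))


def status(attr: Attr) -> str:
    """Classify an attribute by comparing normalized values to the threshold."""
    if attr.value <= attr.threshold:
        return "failing now"
    if attr.worst <= attr.threshold:
        return "failed in the past"
    return "ok"


if __name__ == "__main__":
    # Rows copied from the dump in this post.
    rows = [
        "(199)UDMA CRC Error Count 0x0008 199 198 000 2",
        "(201)Unknown Attribute 0x000a 253 241 000 0",
        "(203)Unknown Attribute 0x000b 253 252 180 0",
    ]
    for row in rows:
        attr = parse_row(row)
        if attr is not None:
            print(attr.ident, status(attr), "margin:", attr.worst - attr.threshold)
```

For the three rows used here, every attribute comes out as "ok": attribute 201 has a Worst of 241 against a Threshold of 000, so by this rule it has never crossed its threshold.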

QT
2004-11-05, 22:48 PM
It looks like the only two errors showing are under "(199)UDMA CRC Error Count" (raw value 2, matching the ATA Error Count of 2 in the error log). CRC errors point at the link between the drive and the controller, typically a flaky drive cable or a bus that can't keep up. They normally do not indicate imminent failure, just less-than-ideal performance. When you start seeing "Uncorrectable errors" in your dmesg along with S.M.A.R.T. errors, you should consider backing up your data and asking for a new drive. :)
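
The dmesg check suggested above could be scripted roughly like this. It is a sketch under assumptions, not a tested monitoring tool: it assumes a `dmesg` command is installed and readable by the current user, the helper names `find_uncorrectable` and `read_dmesg` are invented for the example, and the match is only a case-insensitive search for "uncorrectable", since the exact kernel message wording varies by driver and kernel version.

```python
import subprocess
from typing import List


def find_uncorrectable(log_text: str) -> List[str]:
    """Return the log lines that mention 'uncorrectable' (case-insensitive)."""
    return [line for line in log_text.splitlines()
            if "uncorrectable" in line.lower()]


def read_dmesg() -> str:
    """Return the kernel ring buffer as text, or '' if dmesg cannot be run."""
    try:
        result = subprocess.run(["dmesg"], capture_output=True,
                                text=True, check=False)
    except OSError:
        # dmesg is missing or not executable on this system.
        return ""
    return result.stdout


if __name__ == "__main__":
    hits = find_uncorrectable(read_dmesg())
    if hits:
        print("Possible uncorrectable errors in the kernel log:")
        for line in hits:
            print("  " + line)
    else:
        print("No 'uncorrectable' lines found in dmesg output.")
```

Run from cron, something like this would only supplement the S.M.A.R.T. monitoring already in place, since an empty result says nothing about errors that have already scrolled out of the kernel ring buffer.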