Discussion:
who's the intended consumer for fmdump(1M) error reports ?
Frank Batschulat
2008-11-29 17:54:02 UTC
Friends, I wonder who the intended consumer is for the FMA error reports that
I see with fmdump -eV on my system.
I have 2 error reports that have no corresponding /var/adm/messages output, so I wonder what the mystery behind them is. From a system administrator's
point of view, seeing them with fmdump -eV is just pointless:

fmdump - nothing:
opteron.root./export/home/batschul.=> fmdump
TIME UUID SUNW-MSG-ID
fmdump: /var/fm/fmd/fltlog is empty

fmadm faulty -av - nothing:

opteron.root./export/home/batschul.=> fmadm faulty -av
opteron.root./export/home/batschul.=>

fmdump -eV - looks like there's something:

Nov 23 2008 17:33:47.504699983 ereport.io.scsi.cmd.disk.dev.rqs.derr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.rqs.derr
ena = 0xa2710035ffd00001
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /***@0,0/pci-***@7,1/***@1/***@0,0
(end detector)

driver-assessment = fail
op-code = 0x28
cdb = 0x28 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x1 0x0
pkt-reason = 0x0
pkt-state = 0x3f
pkt-stats = 0x0
stat-code = 0x2
key = 0x5
asc = 0x21
ascq = 0x0
sense-data = 0x70 0x0 0x5 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 0x21 0x0 0x0 0x0 0x0 0x0 0x0 0x0
__ttl = 0x1
__tod = 0x492985eb 0x1e151c4f

Nov 23 2008 17:35:09.588866572 ereport.io.scsi.cmd.disk.dev.rqs.derr
nvlist version: 0
class = ereport.io.scsi.cmd.disk.dev.rqs.derr
ena = 0xa3a2c9c5bcf00001
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = dev
device-path = /***@0,0/pci-***@7,1/***@1/***@0,0
(end detector)

driver-assessment = fail
op-code = 0x28
cdb = 0x28 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x1 0x0
pkt-reason = 0x0
pkt-state = 0x3f
pkt-stats = 0x0
stat-code = 0x2
key = 0x5
asc = 0x21
ascq = 0x0
sense-data = 0x70 0x0 0x5 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 0x21 0x0 0x0 0x0 0x0 0x0 0x0 0x0
__ttl = 0x1
__tod = 0x4929863d 0x2319640c
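
As a side note on the `__tod` member: the two hex words look like a seconds and nanoseconds pair counted from the Unix epoch. That reading is an assumption, but it is consistent with the timestamps printed in the report headers. A minimal Python sketch that decodes the value copied by hand from the first report (the helper name `decode_tod` is made up for this example):

```python
from datetime import datetime, timezone

def decode_tod(text):
    """Split a '__tod' value into (seconds, nanoseconds).

    Assumes the first word is seconds since the Unix epoch and the
    second word is the nanosecond remainder.
    """
    sec_hex, nsec_hex = text.split()
    return int(sec_hex, 16), int(nsec_hex, 16)

# Value copied by hand from the first ereport above.
sec, nsec = decode_tod("0x492985eb 0x1e151c4f")
stamp = datetime.fromtimestamp(sec, tz=timezone.utc)
print(stamp.isoformat(), nsec)
```

This gives 2008-11-23 16:33:47 UTC with 504699983 nanoseconds. The nanoseconds match the header exactly, and the header time of 17:33:47 would fit a local time zone one hour ahead of UTC.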

So what kind of information is this supposed to be? The hex values certainly look
cool and important, yet they do not tell the outside observer anything.

Presumably this matches the CD-ROM drive, maybe unit not ready?
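
For what it's worth, the status and sense fields in the two ereports can be decoded by hand with the standard SCSI tables. The sketch below is only an illustration: the byte values are copied from the report above, the helper names (`parse_hex_list`, `decode_read10`, `decode_fixed_sense`) are made up for this example, and the lookup tables hold just a few codes. It does not read the error log itself, which the man page marks as a Private interface.

```python
# Hand-copied from the first ereport above (both reports carry the same values).
CDB = "0x28 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x1 0x0"
SENSE = ("0x70 0x0 0x5 0x0 0x0 0x0 0x0 0xa 0x0 0x0 "
         "0x0 0x0 0x21 0x0 0x0 0x0 0x0 0x0 0x0 0x0")
STAT_CODE = 0x2

# Partial lookup tables: only a handful of entries from the SCSI standards.
OPCODES = {0x00: "TEST UNIT READY", 0x28: "READ(10)"}
STATUS = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY"}
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}
ASC_ASCQ = {
    (0x21, 0x00): "LOGICAL BLOCK ADDRESS OUT OF RANGE",
    (0x3A, 0x00): "MEDIUM NOT PRESENT",
}

def parse_hex_list(text):
    """Turn '0x28 0x0 ...' into a list of integers."""
    return [int(token, 16) for token in text.split()]

def decode_read10(cdb):
    """Return (lba, block_count) from a 10-byte READ(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(bytes(cdb[2:6]), "big")      # bytes 2..5
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")   # bytes 7..8
    return lba, blocks

def decode_fixed_sense(sense):
    """Return (sense_key, asc, ascq) from fixed-format sense data."""
    if len(sense) < 14 or (sense[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return sense[2] & 0x0F, sense[12], sense[13]

cdb = parse_hex_list(CDB)
lba, blocks = decode_read10(cdb)
key, asc, ascq = decode_fixed_sense(parse_hex_list(SENSE))

print("command:", OPCODES.get(cdb[0], hex(cdb[0])), "lba", lba, "blocks", blocks)
print("status: ", STATUS.get(STAT_CODE, hex(STAT_CODE)))
print("sense:  ", SENSE_KEYS.get(key, hex(key)), "/",
      ASC_ASCQ.get((asc, ascq), "asc/ascq not in this partial table"))
```

Read this way, each report describes a READ(10) of one block at LBA 0 that ended in CHECK CONDITION with sense key ILLEGAL REQUEST and ASC/ASCQ 0x21/0x00 (logical block address out of range). A "not ready" condition would carry sense key 0x2 instead, so the sense data itself does not say "unit not ready". Which device sent the report cannot be told from these fields alone.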

opteron.batschul./export/home/batschul.=> iostat -E
cmdk0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST3200822A Revision: Serial No: 3LJ Size: 200.05GB <200047067136 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
cmdk1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST380011A Revision: Serial No: 5JVTDTE4 Size: 80.03GB <80025845760 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
sd0 Soft Errors: 0 Hard Errors: 13 Transport Errors: 0
Vendor: TEAC Product: DW-552G Revision: N4K1 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 13 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

If this information is something someone should know about, why isn't a message corresponding to the error report sent to a place where a system administrator would notice it?

If this is not considered information at all, why is it there?

Questions over questions...

thanks
frankB
--
This message posted from opensolaris.org
cwb
2008-11-30 08:32:32 UTC
fmdump -e is not really intended for end users; it shows the contents of
the error log rather than the fault log. In other words, it shows the
ereports that have been sent to FMA for analysis, rather than the
faults that FMA has diagnosed.

fmadm faulty is probably more what you should be looking at.

Chris
Frank Batschulat (Home)
2008-11-30 12:20:20 UTC
Yep, sounds reasonable. So these are presumably error reports that did not
subsequently cause a fault event to be generated and bubbled up the stack.

And the fmdump man page does tell me to stay away from error reports,
very likely for reasons like this example:

System Administration Commands fmdump(1M)

-e

Display events from the fault management error log
instead of the fault log. This option is shorthand for
specifying the pathname of the error log file.

The error log file contains Private telemetry information
used by Sun's automated diagnosis software. This
information is recorded to facilitate post-mortem
analysis of problems and event replay, and should not be
parsed or relied upon for the development of scripts and
other tools. See attributes(5) for information about
Sun's rules for Private interfaces.
--
frankB

It is always possible to agglutinate multiple separate problems
into a single complex interdependent solution.
In most cases this is a bad idea.