subgroup populations, since any failures that are not associated with the duty will have more apparent
significance in the smaller population.
I would not propose identification of firmware revision as a basis for sub-dividing populations for
monitoring of ongoing reliability. I would only expect to consider firmware revision as part of change
management, or where upon analysis a failure was found to be due to vulnerability of a particular
firmware revision.
Detecting increased failure rate
If the actual number of failures in a population differs markedly from that expected, this may be an
indication of wear out or an error in the original estimate of failure rate. If the number of failures
differs by more than a nominated margin, this may prompt revision of the PFD/PFH calculations or
consideration of the implications for equipment selection/replacement. I suggest that the actual number of
failures should be compared with the anticipated number of failures on an annual basis. If there is
an apparently significant increase in failure rate from that anticipated, a representative equipment
item should be inspected to see whether wear out mechanisms have begun to influence
performance; if so, all similar equipment should be overhauled or replaced. If the equipment is
relatively new or if there is no evidence of wear out, it may be that the original estimate of failure
rate was wrong. (The anticipated number of failures in a year is simply the population multiplied by
the failure rate expressed per year.)
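
As a minimal sketch of that arithmetic (the population size and failure rate below are hypothetical figures chosen only for illustration, and the function names are my own), a failure rate quoted per hour can be converted to a per-year rate and multiplied by the population:

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours; leap years ignored


def per_hour_to_per_year(failure_rate_per_hour):
    """Convert a failure rate quoted per hour into a rate per year."""
    return failure_rate_per_hour * HOURS_PER_YEAR


def anticipated_annual_failures(population, failure_rate_per_year):
    """Anticipated failures in one year = population x failure rate per year."""
    return population * failure_rate_per_year


# Hypothetical example: 50 devices, each with a failure rate of 2.0e-6 per hour.
rate_per_year = per_hour_to_per_year(2.0e-6)
expected = anticipated_annual_failures(50, rate_per_year)
print(f"Anticipated failures per year: {expected:.3f}")
```

With these illustrative figures the anticipated number is a little under one failure per year across the population.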
In order to provide suitable discrimination, the following threshold criteria are suggested for
possible end-of-useful-life alerts based on comparing actual number of failures with the anticipated
number:
AMBER alert: actual failures >= anticipated failures x 2, rounded, minimum 1.
RED alert: actual failures >= anticipated failures x 3, rounded UP, minimum 2.
Exception: if the number of anticipated failures in a year is very low (say, below 0.02), raise a RED
alert on a single failure. (If the probability of a failure is that low and one does happen, we should
consider why.)
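
The criteria above could be sketched as follows. This is an illustration only: the function name and the returned labels are hypothetical, and I have assumed that "rounded" for the AMBER threshold means rounding to the nearest whole number with halves rounded up, which the text does not specify.

```python
import math


def alert_level(actual, anticipated, very_low=0.02):
    """Classify one year's failure count as 'NONE', 'AMBER' or 'RED'.

    actual      -- failures actually observed in the year
    anticipated -- anticipated failures in the year (population x rate per year)
    very_low    -- below this anticipated count, a single failure raises RED
    """
    # Special case: failures were so improbable that even one warrants a RED alert.
    if anticipated < very_low:
        return "RED" if actual >= 1 else "NONE"

    # AMBER threshold: 2 x anticipated, rounded to nearest (assumed half-up), minimum 1.
    amber_threshold = max(1, math.floor(2 * anticipated + 0.5))
    # RED threshold: 3 x anticipated, rounded up, minimum 2.
    red_threshold = max(2, math.ceil(3 * anticipated))

    if actual >= red_threshold:
        return "RED"
    if actual >= amber_threshold:
        return "AMBER"
    return "NONE"


# With 1.5 anticipated failures the thresholds are 3 (AMBER) and 5 (RED).
for observed in (2, 3, 5):
    print(observed, alert_level(observed, 1.5))
```

Note that with a small anticipated count (0.1, say) the minimum values dominate: one failure gives an AMBER alert and two give a RED alert.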
If two amber alerts occur in consecutive years, or if a red alert occurs, the equipment should be
inspected at the next opportunity. The point of the amber alert is that there is always the potential
for a ‘blip’ in performance with subsequent regression to the mean.
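
The inspection rule can be expressed as a short check over a history of annual alert levels. This is a sketch under my own assumptions: the function name and the 'NONE'/'AMBER'/'RED' labels are hypothetical, and the history is taken to hold one entry per year, oldest first.

```python
def inspection_due(yearly_alerts):
    """Return True if the alert history calls for an inspection.

    An inspection is called for by any RED alert, or by AMBER alerts
    in two consecutive years. A single AMBER followed by a quiet year
    is treated as a 'blip' and does not trigger an inspection.
    """
    previous = None
    for level in yearly_alerts:
        if level == "RED":
            return True
        if level == "AMBER" and previous == "AMBER":
            return True
        previous = level
    return False


print(inspection_due(["AMBER", "NONE", "AMBER"]))  # isolated ambers only
print(inspection_due(["NONE", "AMBER", "AMBER"]))  # two consecutive ambers
```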
Note that if the population is small and the reliability is high, failure count monitoring will not be
useful as a means of detecting potential end of useful life; the anticipated number of failures will be
so low that the failure rate (and associated PFD) could rise significantly without failures being
observed.
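
To illustrate the point, the sketch below estimates the chance of seeing no failures at all in a year. It assumes that failures arrive as a Poisson process (a constant failure rate with independent failures), which is my assumption rather than something stated above, and the population and rates are hypothetical.

```python
import math


def prob_no_failures(population, failure_rate_per_year, years=1.0):
    """Probability of observing zero failures, assuming a Poisson process."""
    expected = population * failure_rate_per_year * years
    return math.exp(-expected)


# Hypothetical small, reliable population: 10 devices at 0.005 failures per year each.
p_nominal = prob_no_failures(10, 0.005)
# The same population after the failure rate has tripled.
p_tripled = prob_no_failures(10, 0.015)

print(f"P(no failures), nominal rate: {p_nominal:.3f}")
print(f"P(no failures), tripled rate: {p_tripled:.3f}")
```

Under these assumptions the probability of a failure-free year falls only from about 95% to about 86% when the failure rate triples, so a year without failures says very little about whether the rate has risen.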
Wear Out
BS EN 61508 (Part 2, 7.4.9.5 Note 3) says that equipment will typically have a useful life of between
8 and 12 years, but the provenance of this note is unclear; it might be based on the typical useful life of
aluminium electrolytic capacitors, which are sometimes identified as the limiting component in
electronic equipment. Certainly there are many items that remain in service well beyond this period
without exhibiting wear out. Vendors may provide an indication of useful life, but understandably
will qualify this with the need to consider the specific duty the equipment is deployed upon