Functional Safety 2016

subgroup populations, since any failures that are not associated with the duty will have more apparent

significance in the smaller population.

I would not propose identification of firmware revision as a basis for sub-dividing populations for

monitoring of ongoing reliability. I would only expect to consider firmware revision as part of change

management, or where upon analysis a failure was found to be due to vulnerability of a particular

firmware revision.

Detecting increased failure rate

If the actual number of failures in a population differs markedly from that expected, this may be an

indication of wear out or an error in the original estimate of failure rate. If the number of failures

differs by more than a nominated margin, we may revise the PFD/PFH calculations or

consider the implications for equipment selection/replacement. I suggest that the actual number of

failures should be compared with the anticipated number of failures on an annual basis. If there is

an apparently significant increase in failure rate from that anticipated, a representative equipment

item should be inspected to see whether wear out mechanisms have begun to influence

performance; if so, all similar equipment should be overhauled or replaced. If the equipment is

relatively new or if there is no evidence of wear out, it may be that the original estimate of failure

rate was wrong. (The anticipated number of annual failures is simply the population multiplied by

the failure rate per year.)
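
As a minimal sketch of that calculation, assuming the failure rate is quoted per hour (as is common for PFD/PFH work) and so needs converting to a yearly rate; the function name and the example figures are hypothetical:

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours, ignoring leap years


def anticipated_annual_failures(population, failure_rate_per_hour):
    """Expected failures per year = population x failure rate per year."""
    failure_rate_per_year = failure_rate_per_hour * HOURS_PER_YEAR
    return population * failure_rate_per_year


# Hypothetical example: 200 devices, each with a failure rate of 2e-6 per hour.
expected = anticipated_annual_failures(200, 2e-6)
print(f"Anticipated failures per year: {expected:.2f}")
```

If the failure rate is already expressed per year, the conversion step is simply omitted.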

In order to provide suitable discrimination, the following threshold criteria are suggested for

possible end-of-useful-life alerts based on comparing actual number of failures with the anticipated

number:

AMBER alert: actual failures >= number of anticipated failures x 2, rounded, minimum 1.

RED alert: actual failures >= number of anticipated failures x 3, rounded UP, minimum 2.

Exception: if the number of anticipated failures in a year is very low (say, below 0.02), raise a RED

alert on a single failure. (If the probability of a failure is so low, then if one does happen we should

consider why.)
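
The threshold criteria above could be sketched as follows. This is an illustration only: it assumes that "rounded" means rounding to the nearest whole number with halves rounded up, and it takes 0.02 as the "very low" cut-off, which is offered above only as a tentative figure. The function and parameter names are hypothetical.

```python
import math


def classify_alert(actual_failures, anticipated_failures, very_low=0.02):
    """Return "RED", "AMBER" or "NONE" for one year's failure count."""
    if actual_failures <= 0:
        return "NONE"

    # Exception: with a very low anticipated count, a single failure is RED.
    if anticipated_failures < very_low:
        return "RED"

    # AMBER: 2 x anticipated, rounded to nearest (assumed half-up), minimum 1.
    amber_threshold = max(1, math.floor(2 * anticipated_failures + 0.5))
    # RED: 3 x anticipated, rounded up, minimum 2.
    red_threshold = max(2, math.ceil(3 * anticipated_failures))

    if actual_failures >= red_threshold:
        return "RED"
    if actual_failures >= amber_threshold:
        return "AMBER"
    return "NONE"


# With 1.5 anticipated failures the thresholds are 3 (AMBER) and 5 (RED).
for count in (2, 3, 5):
    print(count, classify_alert(count, 1.5))
```

A different rounding convention would shift the AMBER threshold by one failure for some anticipated counts, so the convention should be fixed and recorded before the scheme is used.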

If two amber alerts occur in consecutive years, or if a red alert occurs, the equipment should be

inspected at the next opportunity. The point of the amber alert is that there is always the potential

for a ‘blip’ in performance with subsequent regression to the mean.
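
The inspection rule can be expressed as a short check over a sequence of annual alert levels. This is a sketch under the assumption that each year's result is recorded as one of the strings "RED", "AMBER" or "NONE"; that encoding and the function name are hypothetical.

```python
def inspection_due(yearly_alerts):
    """True if any year is RED, or two consecutive years are AMBER."""
    previous = None
    for alert in yearly_alerts:
        if alert == "RED":
            return True
        if alert == "AMBER" and previous == "AMBER":
            return True
        previous = alert
    return False


# A single amber year followed by a quiet year is treated as a blip.
print(inspection_due(["NONE", "AMBER", "NONE", "AMBER"]))
# Two amber years in a row trigger an inspection.
print(inspection_due(["NONE", "AMBER", "AMBER"]))
```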

Note that if the population is small and the reliability is high, failure count monitoring will not be

useful as a means of detecting potential end of useful life; the anticipated number of failures will be

so low that the failure rate (and associated PFD) could rise significantly without failures being

observed.
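
To illustrate the point numerically, the sketch below assumes that failures arrive as a Poisson process with a constant rate, which is a modelling assumption introduced here for illustration. Under that assumption the probability of seeing no failures in a year is exp(-expected failures). The figures are hypothetical.

```python
import math


def prob_no_failures(expected_failures):
    """Poisson probability of observing zero failures in the period."""
    return math.exp(-expected_failures)


# Hypothetical small, reliable population: 0.05 failures anticipated per year.
baseline = 0.05
for factor in (1, 3):
    p_zero = prob_no_failures(baseline * factor)
    print(f"rate x{factor}: P(no failures in a year) = {p_zero:.2f}")
```

In this example, even with the failure rate tripled, a year with no failures at all remains by far the most likely outcome, so the failure count gives little warning.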

Wear Out

BS EN 61508 (Part 2, 7.4.9.5, Note 3) says that equipment will typically have a useful life of between

8 and 12 years, but the provenance of this note is unclear; it might be based on the typical useful life of

aluminium electrolytic capacitors which are sometimes identified as the limiting component in

electronic equipment. Certainly there are many items that remain in service well beyond this period

without exhibiting wear out. Vendors may provide an indication of useful life, but understandably

will qualify this with the need to consider the specific duty the equipment is deployed upon