Validating ‘SIL’ Equipment Failure Rates
By Harvey T. Dearden BSc CEng FIET MIMechE FInstMC
There is a requirement in IEC 61508 for ‘…assessing whether the demand rates and failure rates
during operation and maintenance are in accordance with assumptions made during the design of
the system.’ (Part 1, Clause 6.2.12 c, see also Figure 8).
Easy to say, but how may we do this on a
practicable basis in a real process plant operation? In principle it is a straightforward matter; you
identify the relevant equipment populations and monitor the failure rates. In practice, the
identification of the populations and the analysis of failures may be less than straightforward.
The point of the monitoring is to identify where a failure rate is higher than anticipated, perhaps
because the assumed intrinsic device failure rate was optimistic or because the specifics of the
equipment deployment increase the failure rate, or the equipment is entering the wear-out phase
and approaching its end of life.
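One way to judge whether an observed failure count is 'higher than anticipated' is to compare it with the count expected from the design assumption. The following is a minimal sketch only, assuming failures arrive at a constant rate so that the count over the review period is Poisson distributed. The function names, the 5% threshold and the example figures are illustrative assumptions, not taken from IEC 61508.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Probability of seeing `observed` or more failures when the mean is `expected`."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for i in range(1, observed):
        term *= expected / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def rate_looks_high(observed: int, assumed_rate_per_year: float,
                    population: int, years: float, alpha: float = 0.05) -> bool:
    """Flag a population whose failure count is improbably high for the assumed rate.

    `alpha` is an assumed significance threshold chosen for illustration.
    """
    expected = assumed_rate_per_year * population * years
    return poisson_upper_tail(observed, expected) < alpha


# Hypothetical figures: 50 devices, assumed 0.02 failures per device-year, 2-year review.
print(rate_looks_high(observed=8, assumed_rate_per_year=0.02, population=50, years=2))
print(rate_looks_high(observed=3, assumed_rate_per_year=0.02, population=50, years=2))
```

With an expected count of 2 failures, eight observed failures would be flagged while three would not. A flag is a prompt for investigation rather than proof that the assumed rate was wrong.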
Any failure of equipment within a SIL rated function should be thoroughly analysed to identify the
cause of the failure and the possible implications for other equipment items deployed on SIL rated
duties. It would be unrealistic, however, to expect the same degree of analysis to extend to
equipment not deployed on SIL rated duties, particularly the typically much larger set of equipment
deployed on control and monitoring (rather than protection) duties. Much of this equipment will be
‘repaired-by-replacement’, and forensic analysis of any failure is likely to be an unrealistic ambition.
But much of the equipment deployed on control and monitoring may well be the same as that
deployed on SIL rated duties. This wider population set, in providing a broader sample, will be
potentially useful in identifying failure rates. Since the split of safe/dangerous failures in the non-SIL
population is unlikely to be available, we may estimate the number of dangerous failures from the
total number of failures and the estimated Safe Failure Fraction. This estimate may be
combined with any explicitly identified dangerous failures in the SIL population to give an
estimate of the total number of dangerous failures within the wider population.
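The arithmetic of this estimate can be sketched as follows. This is an illustration under a stated assumption: that the dangerous share of the non-SIL failures may be taken as (1 − SFF). Strictly, the Safe Failure Fraction as defined in IEC 61508 counts dangerous detected failures alongside safe failures, so (1 − SFF) corresponds to the dangerous undetected share; whether that distinction matters for a given population is a matter of judgement. The function name and the example figures are hypothetical.

```python
def estimate_dangerous_failures(non_sil_total_failures: int,
                                safe_failure_fraction: float,
                                sil_dangerous_failures: int = 0) -> float:
    """Estimate dangerous failures across the wider (SIL plus non-SIL) population.

    Assumption: the dangerous share of non-SIL failures is (1 - SFF).
    """
    if not 0.0 <= safe_failure_fraction <= 1.0:
        raise ValueError("safe_failure_fraction must lie between 0 and 1")
    # Non-SIL failures are not analysed individually, so apply the assumed split.
    estimated_non_sil_dangerous = non_sil_total_failures * (1.0 - safe_failure_fraction)
    # Dangerous failures in the SIL population are explicitly identified, so add them as counted.
    return estimated_non_sil_dangerous + sil_dangerous_failures


# Hypothetical figures: 20 non-SIL failures, SFF of 0.6, 1 identified dangerous SIL failure.
print(estimate_dangerous_failures(20, 0.6, sil_dangerous_failures=1))
```

The resulting figure, divided by the device-years in service for the wider population, gives an observed dangerous failure rate to set against the design assumption.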
These non-SIL failures may be monitored from maintenance work order records and spares
consumption as part of an annual review by the site maintenance authority. This analysis would be
for all equipment types that are also deployed on SIL duties and so one of the first requirements is
the compilation of a register of such equipment types.
With this register there is the difficulty of knowing how far to go in identifying the actual build and
deployment of equipment items. Consider an ESD valve; do we distinguish actuator from valve?
Different sizes of a given type? Different material combinations? Different duties for a given build?
Different environments for a given build and duty? The greater our resolution in categorising build and
deployment, the smaller the populations available on which to base our analysis.
I propose that initially we should identify populations as far as manufacturer and series type,
together with vulnerability due to the specific nature of the deployment. Typically we would identify
manufacturer and series of a flow meter, but not what size or material combination. This would
typically mirror the assessments undertaken by manufacturers which are generic to a series design
type. Only if a failure is subsequently found to be relevant solely to a particular subset would I
propose a greater resolution in categorisation of populations.
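A register at this level of resolution might be compiled along the following lines. This is a sketch only: the record fields, the vulnerability labels and the example tags are hypothetical, and a real site would draw the data from its asset or maintenance management system. Populations are keyed by manufacturer, series and a deployment vulnerability label, and only equipment types with at least one SIL deployment are retained.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    tag: str
    manufacturer: str
    series: str
    vulnerability: str  # hypothetical label for the nature of the deployment
    sil_duty: bool


def build_register(items):
    """Group items into populations, keeping only types also deployed on SIL duties."""
    # An equipment type is identified by manufacturer and series only.
    sil_types = {(i.manufacturer, i.series) for i in items if i.sil_duty}
    populations = defaultdict(list)
    for item in items:
        if (item.manufacturer, item.series) in sil_types:
            key = (item.manufacturer, item.series, item.vulnerability)
            populations[key].append(item.tag)
    return dict(populations)


# Hypothetical plant items.
items = [
    Item("FT-101", "Acme", "F200", "clean service", sil_duty=True),
    Item("FT-102", "Acme", "F200", "clean service", sil_duty=False),
    Item("FT-103", "Acme", "F200", "abrasive slurry", sil_duty=False),
    Item("PT-201", "Other", "P10", "clean service", sil_duty=False),
]
for key, tags in build_register(items).items():
    print(key, tags)
```

Size and material are deliberately left out of the key. If a failure is later traced to a particular subset, a further field can be added to the key for that equipment type alone.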