SIS Practices for Late Life-cycle Phases

Validating ‘SIL’ Equipment Failure Rates

By Harvey T. Dearden BSc CEng FIET MIMechE FInstMC

There is a requirement in IEC 61508 for

‘…assessing whether the demand rates and failure rates

during operation and maintenance are in accordance with assumptions made during the design of

the system.’

(Part 1, Clause 6.2.12 c; see also Figure 8). Easy to say, but how may we do this on a

practicable basis in a real process plant operation? In principle it is a straightforward matter; you

identify the relevant equipment populations and monitor the failure rates. In practice, the

identification of the populations and the analysis of failures may be less than straightforward.

The point of the monitoring is to identify where a failure rate is higher than anticipated, perhaps

because the assumed intrinsic device failure rate was optimistic or because the specifics of the

equipment deployment increase the failure rate, or the equipment is entering the wear-out phase

and approaching its end of life.
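
As a minimal sketch of what such a comparison might look like numerically, the Python below sets the number of failures observed in a population against the number expected from the design assumption. It treats failures as a Poisson process with a constant rate, which is an assumption of this sketch (it would not hold for equipment in the wear-out phase); the function names and the example figures are hypothetical and are not taken from IEC 61508.

```python
import math

def expected_failures(assumed_rate_per_hour, device_count, hours_in_service):
    """Failures expected if the design-assumed rate holds (constant-rate assumption)."""
    return assumed_rate_per_hour * device_count * hours_in_service

def prob_at_least(observed, expected):
    """Poisson probability of seeing `observed` or more failures when `expected` are anticipated."""
    if observed <= 0:
        return 1.0
    # P(X >= k) = 1 - sum over i < k of exp(-m) * m**i / i!
    term = math.exp(-expected)
    cumulative = term
    for i in range(1, observed):
        term *= expected / i
        cumulative += term
    return max(0.0, 1.0 - cumulative)

# Hypothetical figures: 50 devices in service for one year (8760 h),
# with an assumed failure rate of 2e-6 per hour, and 4 failures recorded.
expected = expected_failures(2e-6, 50, 8760)
p = prob_at_least(4, expected)
print(f"expected failures: {expected:.3f}, P(4 or more | assumed rate): {p:.4f}")
```

A small probability suggests that the observed count is hard to reconcile with the assumed rate and that the population deserves closer attention; where to set the threshold remains a matter of judgement.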

Any failure of equipment within a SIL rated function should be thoroughly analysed to identify the

cause of the failure and the possible implications for other equipment items deployed on SIL rated

duties. It would be unrealistic, however, to expect the same degree of analysis to extend to

equipment not deployed on SIL rated duties, particularly the typically much larger set of equipment

deployed on control and monitoring (rather than protection) duties. Much of this equipment will be

‘repaired by replacement’, and forensic analysis of any failure is likely to be an unrealistic ambition.

But much of the equipment deployed on control and monitoring may well be the same as that

deployed on SIL rated duties. This wider population set, in providing a broader sample, will be

potentially useful in identifying failure rates. Since the split of safe/dangerous failures in the non-SIL

population is unlikely to be available, we may estimate the number of dangerous failures on the basis

of the total number of failures and the estimated Safe Failure Fraction. This estimate may be

combined with any explicitly identified dangerous failures in the SIL population to give an

estimate of the total number of dangerous failures within the wider population.
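
The arithmetic may be sketched as follows. The sketch assumes that the dangerous share of the non-SIL failures is taken as (1 - SFF); strictly, the IEC 61508 definition of Safe Failure Fraction counts dangerous detected failures alongside safe failures, so (1 - SFF) is the dangerous undetected fraction, and whether that is the quantity wanted is a judgement for the analyst. The function name and the example figures are hypothetical.

```python
def estimate_dangerous_failures(non_sil_total_failures, safe_failure_fraction,
                                sil_dangerous_failures):
    """Estimate dangerous failures across the combined SIL and non-SIL population.

    Assumption of this sketch: the dangerous share of non-SIL failures
    is approximated by (1 - SFF).
    """
    if not 0.0 <= safe_failure_fraction <= 1.0:
        raise ValueError("Safe Failure Fraction must lie between 0 and 1")
    estimated_non_sil_dangerous = non_sil_total_failures * (1.0 - safe_failure_fraction)
    # Explicitly identified dangerous failures from the SIL population are added as counted.
    return estimated_non_sil_dangerous + sil_dangerous_failures

# Hypothetical figures: 20 failures recorded on non-SIL duties, an estimated
# SFF of 0.75, and 1 dangerous failure identified on SIL duties.
total_dangerous = estimate_dangerous_failures(20, 0.75, 1)
print(total_dangerous)  # 6.0
```

The result can then be divided by the accumulated device-hours of the combined population to give an observed dangerous failure rate for comparison with the design assumption.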

These non-SIL failures may be monitored from maintenance work order records and spares

consumption as part of an annual review by the site maintenance authority. This analysis would be

for all equipment types that are also deployed on SIL duties and so one of the first requirements is

the compilation of a register of such equipment types.

With this register there is the difficulty of knowing how far to go in identifying the actual build and

deployment of equipment items. Consider an ESD valve; do we distinguish actuator from valve?

Different sizes of a given type? Different material combinations? Different duties for a given build?

Different environments for a given build and duty? The greater our resolution in categorising build and

deployment, the smaller the populations available on which to base our analysis.

I propose that initially we should identify populations as far as manufacturer and series type,

together with vulnerability due to the specific nature of the deployment. Typically we would identify

manufacturer and series of a flow meter, but not what size or material combination. This would

typically mirror the assessments undertaken by manufacturers which are generic to a series design

type. Only if a failure is subsequently found to be relevant only to a particular subset would I

propose a greater resolution in the categorisation of populations.
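
By way of illustration, the sketch below tallies failures from maintenance work order records against a register keyed on manufacturer and series only, deliberately ignoring size and material. The record layout, the field names and the manufacturer and series names are all hypothetical; a real maintenance system would need its own mapping onto such a key.

```python
from collections import Counter

# Hypothetical register of equipment types that are also deployed on SIL duties,
# identified only as far as (manufacturer, series).
sil_register = {("AcmeFlow", "F100"), ("ValveCo", "V20")}

def count_failures_by_population(work_orders, register):
    """Count recorded failures per (manufacturer, series) for registered types only."""
    counts = Counter()
    for order in work_orders:
        key = (order["manufacturer"], order["series"])
        if key in register:
            counts[key] += 1  # size and material are deliberately not part of the key
    return counts

# Hypothetical work order extracts from an annual review.
work_orders = [
    {"manufacturer": "AcmeFlow", "series": "F100", "size": "DN50", "material": "316SS"},
    {"manufacturer": "AcmeFlow", "series": "F100", "size": "DN100", "material": "Hastelloy"},
    {"manufacturer": "ValveCo", "series": "V20", "size": "DN80", "material": "WCB"},
    {"manufacturer": "OtherCo", "series": "X1", "size": "DN25", "material": "316SS"},
]

for population, failures in sorted(count_failures_by_population(work_orders, sil_register).items()):
    print(population, failures)
```

Should a failure later prove specific to, say, one material combination, the key could be extended with that attribute for the affected series alone, and a deployment attribute could be added in the same way.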