
LABORATORY MANAGEMENT

Alternative Approaches to the Traditional Collaborative Study

A collaborative study:

1. determines the interlaboratory reproducibility of a method as measured by the relative standard deviation for reproducibility [RSD(R)];

2. provides or confirms the accuracy (trueness, when a certified reference material is used) and repeatability (precision) characteristics of a method;

3. determines if the instructions for a method are clear and can be followed by analysts who are not affiliated with the method developer; and

4. determines that the method has been designed so that the operating parameters that might affect the performance of the method are truly known and under control (robustness).

Most of a method evaluation can be completed in a single laboratory. For example, accuracy, repeatability, and ruggedness can be determined in just one laboratory; AOAC has a well-described procedure, the Youden ruggedness procedure (6), to determine the ruggedness of a candidate method. (Ruggedness can be determined in a single laboratory; robustness is demonstrated in a collaborative study.) Method instruction clarity could be determined using an established review procedure. Interlaboratory reproducibility is the only parameter that requires collaborators.
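The Youden design is compact enough to sketch. Below is a minimal illustration of the classical seven-factor, eight-run ruggedness layout; the run results and factor labels are hypothetical, not data from any real study.

```python
import numpy as np

# Classical Youden-Steiner ruggedness design: seven factors (A-G), each
# at two levels (+1/-1), examined in only eight determinations.
# Rows = factors, columns = the eight runs.
DESIGN = np.array([
    [+1, +1, +1, +1, -1, -1, -1, -1],  # A
    [+1, +1, -1, -1, +1, +1, -1, -1],  # B
    [+1, -1, +1, -1, +1, -1, +1, -1],  # C
    [+1, +1, -1, -1, -1, -1, +1, +1],  # D
    [+1, -1, +1, -1, -1, +1, -1, +1],  # E
    [+1, -1, -1, +1, +1, -1, -1, +1],  # F
    [+1, -1, -1, +1, -1, +1, +1, -1],  # G
])

def factor_effects(results: np.ndarray) -> np.ndarray:
    """Effect of each factor: mean of its four high-level runs minus the
    mean of its four low-level runs. Effects that stand out against the
    method's repeatability flag parameters that need tight control."""
    return DESIGN @ results / 4.0

# Results of the eight determinations (hypothetical values)
runs = np.array([99.8, 100.1, 99.6, 100.3, 100.0, 99.7, 100.2, 99.9])
print(dict(zip("ABCDEFG", np.round(factor_effects(runs), 3))))
```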

The obvious question to ask when assessing the traditional collaborative study design is: Are 8 valid data sets really required? Clearly, 10 valid data sets are better than 8, and 12 better than 10, but how many valid data sets are really needed to satisfy the purposes of a collaborative study to quantify “reproducibility”? It is mainly a question of the confidence associated with the calculated RSD(R). It may not be immediately obvious, but organizations such as AOAC indirectly establish a confidence interval around the calculated RSD(R) by the simple act of requiring a minimum number of data sets. This has been the paradigm of method validation for more than 50 years. (AOAC has been operating for over 125 years, but for much of its history there was no agreed-upon minimum number of valid data sets; that did not happen until the 1980s.)
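To make the implied confidence concrete, here is a rough sketch of the interval around a calculated RSD(R), treating it as a standard deviation carrying p - 1 degrees of freedom for p laboratories. That is a deliberate simplification (the effective degrees of freedom in a real nested collaborative-study design are smaller), and the 6% RSD(R) is an assumed value.

```python
from scipy.stats import chi2

def rsd_r_interval(rsd_r: float, n_labs: int, conf: float = 0.95) -> tuple:
    """Chi-squared confidence interval for a reproducibility RSD,
    under the simplification that RSD(R) behaves like a standard
    deviation with n_labs - 1 degrees of freedom."""
    df = n_labs - 1
    alpha = 1.0 - conf
    lower = rsd_r * (df / chi2.ppf(1 - alpha / 2, df)) ** 0.5
    upper = rsd_r * (df / chi2.ppf(alpha / 2, df)) ** 0.5
    return lower, upper

for p in (8, 10, 12):
    lo, hi = rsd_r_interval(rsd_r=6.0, n_labs=p)
    print(f"{p} labs: 95% CI for a 6% RSD(R) = ({lo:.1f}%, {hi:.1f}%)")
```

With 8 laboratories, the upper confidence limit is roughly twice the calculated RSD(R); adding laboratories narrows the interval only gradually.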

There is another paradigm that is generally called “fitness-for-purpose.” Instead of forcing method developers and users to accept a confidence level derived as a consequence of the minimum number of collaborators, it is also possible to allow method developers to determine the appropriate confidence level and then find the necessary number of collaborators. The key to a fitness-for-purpose validation model is that a method developer would be required to report the target confidence interval. A target interval is not normally calculated or reported because there is an implied target interval with the current eight-laboratory minimum collaborative study model.
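Under the same chi-squared simplification as above, the fitness-for-purpose logic can be inverted: fix the target confidence interval first, then solve for the number of collaborators. The 1.5x target below is an arbitrary illustration, not a recommended criterion.

```python
from scipy.stats import chi2

def labs_needed(target_upper_factor: float, conf: float = 0.95,
                max_labs: int = 100) -> int:
    """Smallest number of laboratories for which the upper confidence
    limit on RSD(R) stays below target_upper_factor times the observed
    RSD(R). Same simplification as before: df = labs - 1."""
    alpha = 1.0 - conf
    for p in range(3, max_labs + 1):
        df = p - 1
        upper_factor = (df / chi2.ppf(alpha / 2, df)) ** 0.5
        if upper_factor <= target_upper_factor:
            return p
    raise ValueError("target not reachable within max_labs")

# e.g., require the upper 95% confidence limit to be no more than
# 1.5x the calculated RSD(R)
print(labs_needed(1.5))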

A fitness-for-purpose model has two advantages: 1. potential method users can decide if the reported reproducibility and confidence level are good enough for their purposes, much as a potential user can now assess the recovery, accuracy, LOQ, and range of applicability; and 2. in some cases, notably government-sponsored validation projects, the number of data sets far exceeds the eight-laboratory minimum. In these admittedly rare cases, the estimate of the reproducibility is known with much greater confidence, and this could be reported to potential users.

There is a new benefit to the fitness-for-purpose model in that the acceptance criteria for the method validation can be clearly and quantitatively stated using target measurement uncertainty. A paper by Weitzel and Johnson (7) describes a process using decision rules and probability to determine a target measurement uncertainty that is then used to set the acceptance criteria for a method validation. Target measurement uncertainty is defined as “measurement uncertainty specified as an upper limit and decided on the basis of the intended use of measurement results” (8). The target measurement uncertainty can be used to decide appropriate values for validation criteria, such as bias, precision, LOD, and LOQ, thus directly linking the SMPR to fitness-for-purpose.
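As a loose illustration only, and not the specific procedure of Weitzel and Johnson (7), a decision rule can be turned into a target measurement uncertainty and then into a precision criterion along these lines; the tolerance, probability, and budget split are all assumed values.

```python
from scipy.stats import norm

# Illustrative sketch: derive a target measurement uncertainty from a
# simple two-sided decision rule, then cap a validation criterion with it.
tolerance = 2.0    # assumed +/- allowance around the target value
p_conform = 0.95   # assumed required probability of conformance

k = norm.ppf(0.5 + p_conform / 2)  # coverage factor, ~1.96 for 95%
u_target = tolerance / k           # target standard measurement uncertainty

# Assumed budget split: allow reproducibility to consume half the target
# uncertainty, leaving the rest for bias and other contributions.
max_s_r = u_target / 2
print(f"u(target) = {u_target:.2f}; maximum allowed s(R) = {max_s_r:.2f}")
```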

Proficiency Testing

Proficiency testing (PT) is a widely recognized practice for monitoring analytical performance, and in some ways the PT process is very similar to the process of a collaborative study. Test materials are prepared and distributed by a program/project coordinator. Each participating laboratory analyzes a common set of blind test samples and reports its results back to the coordinator. The coordinator then analyzes the data. Of course, there are several differences between PT programs and collaborative studies: 1. the aim of PT is to assess the performance of the laboratory, not the method; 2. laboratories may use any appropriate method they choose for PT; and 3. the data are analyzed to determine how the individual laboratory performs in relation to the whole group of laboratories.
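The scoring step of a PT round is typically a z-score against an assigned value and a standard deviation for proficiency assessment. A minimal sketch with made-up results follows; the scoring bands noted in the comment are the common ISO 13528 convention.

```python
import numpy as np

def z_scores(results: np.ndarray, assigned: float,
             sigma_pt: float) -> np.ndarray:
    """Classic PT scoring: z = (x - assigned) / sigma_pt.
    Convention: |z| <= 2 satisfactory, 2 < |z| < 3 questionable,
    |z| >= 3 unsatisfactory."""
    return (results - assigned) / sigma_pt

labs = np.array([10.2, 9.8, 10.9, 8.7, 10.1, 11.6])  # hypothetical results
print(z_scores(labs, assigned=10.0, sigma_pt=0.5))
```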

For many years, it has been strictly forbidden to even suggest that PT data might be used for the purposes of evaluating a method. However, in 2010, Ellison et al. published a paper proposing that there might be a role for proficiency testing data in method validation under certain conditions. They concluded that a properly implemented PT program provides very similar information to a traditional collaborative study, and should be given equal weight in appraising methods for suitability (9).
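One minimal way to picture the idea, assuming every participant in a PT round ran the same candidate method, is to take a robust between-laboratory scale estimate from the round as a reproducibility estimate. The data and the choice of estimator here are illustrative, not taken from Ellison et al. (9); a robust estimator (scaled MAD) is used because PT data commonly contain outliers.

```python
import numpy as np

def robust_rsd_r(results: np.ndarray) -> float:
    """Robust reproducibility RSD (%) from one PT round: median as the
    consensus value, MAD scaled by 1.4826 to be consistent with a
    normal standard deviation."""
    center = np.median(results)
    mad = np.median(np.abs(results - center))
    s_r = 1.4826 * mad
    return 100.0 * s_r / center

# Hypothetical PT round, including one outlying laboratory
pt_round = np.array([10.2, 9.8, 10.9, 8.7, 10.1, 11.6, 10.3, 9.9, 10.0, 14.2])
print(f"robust RSD(R) estimate: {robust_rsd_r(pt_round):.1f}%")
```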
