Kaplan & Sadock's Synopsis of Psychiatry, 11e


5.3 Psychiatric Rating Scales

interchangeably in everyday speech, they are distinct in the context of evaluating rating scales. To be useful, scales should be reliable, or consistent and repeatable even if performed by different raters at different times or under different conditions, and they should be valid, or accurate in representing the true state of nature.

Reliability. Reliability refers to the consistency or repeatability of ratings and is largely empirical. An instrument is more likely to be reliable if the instructions and questions are clearly and simply worded and the format is easy to understand and score. There are three standard ways to assess reliability: internal consistency, interrater, and test–retest.

Internal Consistency. Internal consistency assesses agreement among the individual items in a measure. This provides information about reliability, because each item is viewed as a single measurement of the underlying construct. Thus, the coherence of the items suggests that each is measuring the same thing.

Interrater and Test–Retest Reliability. Interrater (also called interjudge or joint) reliability is a measure of agreement between two or more observers evaluating the same subjects using the same information. Estimates may vary with assessment conditions; for instance, estimates of interrater reliability based on videotaped interviews tend to be higher than those based on interviews conducted by one of the raters. Test–retest evaluations measure reliability only to the extent that the subject’s true condition remains stable in the time interval.

Issues in Interpreting Reliability Data. When interpreting reliability data, it is important to bear in mind that reliability estimates published in the literature may not generalize to other settings. Factors to consider are the nature of the sample, the training and experience of the raters, and the test conditions. Issues regarding the sample are especially critical. In particular, reliability tends to be higher in samples with high variability, in which it is easier to discriminate among individuals.

Validity. Validity refers to conformity with truth, or a gold standard that can stand for truth. In the categorical context, it refers to whether an instrument can make correct classifications. In the continuous context, it refers to accuracy, or whether the score assigned can be said to represent the true state of nature. Although reliability is an empirical question, validity is partly theoretical: for many constructs measured in psychiatry, there is no underlying absolute truth. Even so, some measures yield more useful and meaningful data than others do. Validity assessment is generally divided into face and content validity, criterion validity, and construct validity.

Face and Content Validity. Face validity refers to whether the items appear to assess the construct in question. Although a rating scale may purport to measure a construct of interest, a review of the items may reveal that it embodies a very different conceptualization of the construct. For instance, an insight scale may define insight in either psychoanalytic or neurological terms. However, items with a transparent relationship to the construct may be a disadvantage when measuring socially undesirable traits, such as substance abuse or malingering. Content validity is similar to face validity but describes whether the measure provides good balanced coverage of the construct and is less focused on whether the items give the appearance of validity. Content validity is often assessed with formal procedures such as expert consensus or factor analysis.

Criterion Validity. Criterion validity (sometimes called predictive or concurrent validity) refers to whether or not the measure agrees with a gold standard or criterion of accuracy. Suitable gold standards include the long form of an established instrument for a new, shorter version, a clinician-rated measure for a self-report form, and blood or urine tests for measures of drug use. For diagnostic interviews, the generally accepted gold standard is the Longitudinal, Expert, All Data (LEAD) standard, which incorporates expert clinical evaluation, longitudinal data, medical records, family history, and any other sources of information.

Construct Validity. When an adequate gold standard is not available (a frequent state of affairs in psychiatry) or when additional validity data are desired, construct validity must be assessed. To accomplish this, one can compare the measure to external validators, attributes that bear a well-characterized relationship to the construct under study but are not measured directly by the instrument. External validators used to validate psychiatric diagnostic criteria and the diagnostic instruments that aim to operationalize them include course of illness, family history, and treatment response. For example, when compared with schizophrenia measures, mania measures are expected to identify more individuals with a remitting course, a family history of major mood disorders, and a good response to lithium.

Selection of Psychiatric Rating Scales

The scales discussed below cover various areas such as diagnosis, functioning, and symptom severity, among others. Selections were made based on coverage of major areas and common use in clinical research or current (or potential) use in clinical practice. Only a few of the many scales available in each category are discussed here.

Disability Assessment

One of the most widely used scales to measure disability, the WHO Disability Assessment Schedule, was developed by the World Health Organization (WHO) and is now in its second iteration (WHODAS 2.0). It is self-administered and measures disability along a number of parameters such as cognition, interpersonal relations, and work and social impairment, among many others. It can be taken at intervals along the course of a person’s illness and is reliable in tracking changes that indicate a positive or negative response to therapeutic interventions or course of illness (Table 5.3-1).

A number of assessment scales were developed for inclusion in the 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) of the American Psychiatric Association; however, they were developed by and intended for use by research psychiatrists and are not as well tested as the WHO scales. It is expected that, in time, they will be better adapted for clinical use. Some clinicians may wish to use the scales known as Cross-Cutting Symptom Measure Scales, but at this time the WHO scale is recommended for general use.

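As a rough illustration of the reliability concepts described in this section, the sketch below computes two statistics that are commonly used for this purpose but are not named in the text: Cronbach's alpha for internal consistency and Cohen's kappa for interrater agreement on categorical ratings. The function names and the sample ratings are hypothetical and serve only to show the arithmetic; they do not come from any real instrument or study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Internal consistency for a list of subjects, each a list of k item scores.

    Assumes at least two items and at least two subjects whose total
    scores are not all identical.
    """
    k = len(item_scores[0])
    # Variance of each individual item across subjects.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the total (summed) score across subjects.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same subjects."""
    rater_a, rater_b = list(rater_a), list(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1:
        # Both raters used a single identical category; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: five subjects answering a four-item scale (0-3 per item).
scale_items = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 2, 3, 3],
    [1, 0, 1, 1],
]
# Hypothetical diagnoses assigned by two raters to the same six subjects.
rater_1 = ["mania", "mania", "none", "none", "mania", "none"]
rater_2 = ["mania", "none", "none", "none", "mania", "none"]

print("Cronbach's alpha:", round(cronbach_alpha(scale_items), 3))
print("Cohen's kappa:", round(cohen_kappa(rater_1, rater_2), 3))
```

Both statistics depend on the sample, which is consistent with the caution above: a sample with little variability among subjects shrinks the total-score variance and raises chance agreement, and either effect tends to lower the estimate.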
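Criterion validity in the categorical case, as described in this section, comes down to how often an instrument's classifications match a gold standard. One conventional way to summarize that agreement is with sensitivity and specificity; the text does not prescribe these statistics, so the following is only a minimal sketch under that assumption. The function name and the example data (a hypothetical self-report screen compared with a hypothetical clinician-rated gold standard) are invented for illustration.

```python
def sensitivity_specificity(test_positive, gold_positive):
    """Compare an instrument's yes/no results with a gold standard.

    Both arguments are equal-length sequences of booleans, one entry per
    subject. Returns (sensitivity, specificity).
    """
    pairs = list(zip(test_positive, gold_positive))
    true_pos = sum(1 for t, g in pairs if t and g)
    false_neg = sum(1 for t, g in pairs if not t and g)
    true_neg = sum(1 for t, g in pairs if not t and not g)
    false_pos = sum(1 for t, g in pairs if t and not g)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        # Without both cases and non-cases in the gold standard,
        # one of the two ratios cannot be computed.
        raise ValueError("gold standard must include both cases and non-cases")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical results for six subjects.
screen_result = [True, True, False, False, True, False]   # new self-report form
gold_standard = [True, False, False, False, True, True]   # clinician-rated measure

sens, spec = sensitivity_specificity(screen_result, gold_standard)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

When no adequate gold standard exists, a comparison of this kind is not possible, which is the situation in which the text turns to construct validity and external validators.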