Contents
WELCOME TO THE WHAT WORKS CLEARINGHOUSE PROCEDURES AND STANDARDS HANDBOOK, VERSION 5.0 10
Major technical and procedural changes between versions 5.0 and 4.1 of WWC procedures and standards 10
Content and organization of the WWC Procedures and Standards Handbook, Version 5.0 13
CHAPTER I. OVERVIEW OF THE WHAT WORKS CLEARINGHOUSE AND ITS PROCEDURES AND STANDARDS 14
What the WWC is, what it does, and why 14
WWC history 14
WWC roles and responsibilities 15
WWC products 15
WWC training and reviewer certification 16
How the WWC conducts study reviews 18
How research can meet WWC standards 21
WWC PROCEDURES AND STANDARDS FOR STUDY REVIEWS 27
How to use the Handbook and review protocols 27
How the WWC defines a study 28
How to determine the unit of assignment 29
CHAPTER II. SCREENING STUDIES FOR ELIGIBILITY 30
Study eligibility requirements 30
After the study is found eligible 32
CHAPTER III. REVIEWING FINDINGS FROM RANDOMIZED CONTROLLED TRIALS AND QUASI-EXPERIMENTAL DESIGNS 34
Step 1. Reviewing outcome measures and checking for confounding factors 35
Step 2. Assignment to conditions 41
Step 3. Compositional change 46
Step 4. Baseline equivalence 60
CHAPTER IV. REVIEWING FINDINGS FROM REGRESSION DISCONTINUITY DESIGNS 71
Screening RDD studies for eligibility 71
Reviewing findings from RDD studies according to WWC standards 73
Reviewing findings from cluster-assignment RDDs according to WWC standards 89
CHAPTER V. REVIEWING COMPLIER AVERAGE CAUSAL EFFECT ESTIMATES AND FINDINGS USING OTHER ADVANCED ANALYTICAL APPROACHES 91
Procedures and standards for CACEs 91
Procedures and standards for repeated-measures analyses 98
Procedures and standards for analyses with endogenous covariates 102
Procedures and standards for analyses with imputations for missing data 103
CHAPTER VI. REVIEWING FINDINGS FROM SINGLE-CASE DESIGN STUDIES 112
Additional eligibility requirements for SCDs 112
Reviewing findings from SCDs according to WWC standards 113
CHAPTER VII. SYNTHESIS AND REPORTING OF RESULTS 135
Criteria for designating findings as main or supplemental 135
Determining the study's research rating based on the research ratings of findings 137
Determining an effectiveness rating based on the evidence of the intervention's effects 138
APPENDIX A. PRINCIPLES FOR PRIORITIZING AND SEARCHING FOR STUDIES TO REVIEW 150
APPENDIX B. PROCEDURES FOR SENDING AUTHOR QUERIES 155
APPENDIX C. BOUNDARIES FOR DEFINING HIGH VERSUS LOW ATTRITION 157
APPENDIX D. GLOSSARY OF SYMBOLS FOR STATISTICAL FORMULAS 159
APPENDIX E. STATISTICAL FORMULAS FOR EACH FINDING IN A STUDY 168
APPENDIX F. STATISTICAL FORMULAS FOR AGGREGATING STUDY FINDINGS 195
APPENDIX G. ADDITIONAL DETAIL FOR ANALYSES OF COMPLIER AVERAGE CAUSAL EFFECTS 202
APPENDIX H. ADDITIONAL DETAIL FOR ANALYSES WITH MISSING DATA 211
APPENDIX I. STATISTICAL FORMULAS FOR THE NONOVERLAP OF ALL PAIRS IN SINGLE-CASE DESIGNS 223
REFERENCES 228
Tables
Table 1. What Works Clearinghouse products 16
Table 2. What Works Clearinghouse certification training content and requirements 17
Table 3. What Works Clearinghouse research ratings 20
Table 4. Research designs reviewed by the What Works Clearinghouse and the highest research rating they are eligible to receive 24
Table 5. Examples of an N = 1 confounding factor 39
Table 6. Examples of characteristics or time as a confounding factor 40
Table 7. Examples of compromised individual-level and cluster-level randomized controlled trials 45
Table 8. People influencing students' placement into clusters by level of assignment 53
Table 9. Attrition boundaries and allowable reference samples for measuring individual-level attrition in cluster randomized controlled trials (step 3c) 57
Table 10. Absolute effect sizes at baseline 60
Table 11. Baseline adjustment measures for teacher and school leader outcomes 63
Table 12. Example baseline samples for an outcome sample of grade 12 students in 2014/15 (step 4b) 68
Table 13. Criteria for the forcing variable in a regression discontinuity design study 72
Table 14. Ratings for findings from regression discontinuity design studies 73
Table 15. Regression discontinuity design criteria for Standard 1: Integrity of the forcing variable 75
Table 16. Regression discontinuity design criteria for Standard 2: Sample attrition and baseline equivalence 77
Table 17. Regression discontinuity design criteria for Standard 3: Continuity of the relationship between the outcome and the forcing variable 79
Table 18. Regression discontinuity design criteria for Standard 4: Functional form and bandwidth 81
Table 19. Regression discontinuity design criteria for evaluating fuzzy regression discontinuity designs 85
Table 20. First-stage F statistic thresholds for satisfying the criterion of sufficient instrument strength 97
Table 21. Acceptable approaches for addressing missing baseline or outcome data 106
Table 22. Criteria for the study-level research rating based on research design and execution 138
Table 23. What Works Clearinghouse effectiveness ratings in individual studies and intervention reports by outcome domain 139
Table 24. Effectiveness ratings for recommendations in practice guides 142
Figures
Figure 1. Timeline of the What Works Clearinghouse Procedures and Standards Handbook 15
Figure 2. Three phases in the What Works Clearinghouse study review process 18
Figure 3. Study eligibility requirements for a What Works Clearinghouse review 30
Figure 4. Ratings flowchart for individual and cluster assignment randomized controlled trials and quasi-experimental designs 34
Figure 5. Example of outcome domain, outcomes, and measurements 35
Figure 6. Example of differential attrition rates resulting in dissimilar groups 48
Figure 7. Assessing risk of bias due to compositional change in cluster randomized controlled trials (step 3 in reviewing research design) 50
Figure 8. Assessing risk of bias due to joiners in cluster randomized controlled trials (step 3b) 52
Figure 9. Assessing risk of bias due to leavers in cluster randomized controlled trials (step 3c) 55
Figure 10. Acceptable methods for baseline adjustment 65
Figure 11. Two pathways for satisfying the baseline equivalence standard in cluster-level assignment studies (step 4) 66
Figure 12. Review process for studies that report findings from complier average causal effect analyses 95
Figure 13. Eligible and ineligible samples for repeated-measures analyses 101
Figure 14. Research ratings for randomized controlled trials and quasi-experimental designs with missing outcome or baseline data 104
Figure 15. Basic single-case design 113
Figure 16. Single-case design review process for eligible study findings 115
Figure 17. Zero- and low-variability baseline examples 121
Figure 18. Reversal/withdrawal design example 122
Figure 19. Multiple baseline design example 124
Figure 20. Example violations of first and second concurrence requirements 124
Figure 21. Example violation of the third concurrence requirement 125
Figure 22. Example of empty training phases 125
Figure 23. Multiple probe design, example 1 127
Figure 24. Multiple probe design, example 2 127
Figure 25. Alternating treatment design example 128
Figure 26. Treatment reversal design with extra phases that is rated Meets WWC Standards Without Reservations 132
Figure 27. Treatment reversal design with extra phases that is rated Does Not Meet WWC Standards 132
Figure 28. Multiple baseline design with four cases rated Meets WWC Standards Without Reservations 133
Figure 29. Combination multiple baseline design with reversals that is rated Meets WWC Standards Without Reservations 134
Boxes
Box 1. Outcome measures WWC considers valid and reliable without reliability statistics 36
Table A.1. Search term examples from the Adolescent Literacy Review protocol 153
Table C.1. Highest differential attrition rate for a sample to maintain low attrition, by overall attrition rate, under cautious and optimistic assumptions 157
Table D.1. Glossary of statistical formula symbols 159
Table E.1. Descriptive statistics for a low-attrition randomized controlled trial 171
Table E.2. Use cases for covariate-adjusted standard error formulas 175
Table E.3. Descriptive statistics for a cluster-level analysis 184
Table F.1. Domain-level computations for an example study with two main findings 197
Table F.2. Example of fixed-effects meta-analysis 199
Table H.1. Acceptable approaches for addressing missing baseline or outcome data 212
Table I.1. Example of pairwise comparisons for baseline trend 225
Table I.2. Example of pairwise comparisons for reversibility 226
Figure E.1. An improvement of 0.4 standard deviation 169