Contents
WELCOME TO THE WHAT WORKS CLEARINGHOUSE PROCEDURES AND STANDARDS HANDBOOK, VERSION 5.0 10
Major technical and procedural changes between versions 5.0 and 4.1 of WWC procedures and standards 10
Content and organization of the WWC Procedures and Standards Handbook, Version 5.0 13
CHAPTER I. OVERVIEW OF THE WHAT WORKS CLEARINGHOUSE AND ITS PROCEDURES AND STANDARDS 14
What the WWC is, what it does, and why 14
WWC history 14
WWC roles and responsibilities 15
WWC products 15
WWC training and reviewer certification 16
How the WWC conducts study reviews 18
How research can meet WWC standards 21
WWC PROCEDURES AND STANDARDS FOR STUDY REVIEWS 27
How to use the Handbook and review protocols 27
How the WWC defines a study 28
How to determine the unit of assignment 29
CHAPTER II. SCREENING STUDIES FOR ELIGIBILITY 30
Study eligibility requirements 30
After the study is found eligible 32
CHAPTER III. REVIEWING FINDINGS FROM RANDOMIZED CONTROLLED TRIALS AND QUASI-EXPERIMENTAL DESIGNS 34
Step 1. Reviewing outcome measures and checking for confounding factors 35
Step 2. Assignment to conditions 41
Step 3. Compositional change 46
Step 4. Baseline equivalence 60
CHAPTER IV. REVIEWING FINDINGS FROM REGRESSION DISCONTINUITY DESIGNS 71
Screening RDD studies for eligibility 71
Reviewing findings from RDD studies according to WWC standards 73
Reviewing findings from cluster-assignment RDDs according to WWC standards 89
CHAPTER V. REVIEWING COMPLIER AVERAGE CAUSAL EFFECT ESTIMATES AND FINDINGS USING OTHER ADVANCED ANALYTICAL APPROACHES 91
Procedures and standards for CACEs 91
Procedures and standards for repeated-measures analyses 98
Procedures and standards for analyses with endogenous covariates 102
Procedures and standards for analyses with imputations for missing data 103
CHAPTER VI. REVIEWING FINDINGS FROM SINGLE-CASE DESIGN STUDIES 112
Additional eligibility requirements for SCDs 112
Reviewing findings from SCDs according to WWC standards 113
CHAPTER VII. SYNTHESIS AND REPORTING OF RESULTS 135
Criteria for designating findings as main or supplemental 135
Determining the study's research rating based on the research ratings of findings 137
Determining an effectiveness rating based on the evidence of the intervention's effects 138
APPENDIX A. PRINCIPLES FOR PRIORITIZING AND SEARCHING FOR STUDIES TO REVIEW 150
APPENDIX B. PROCEDURES FOR SENDING AUTHOR QUERIES 155
APPENDIX C. BOUNDARIES FOR DEFINING HIGH VERSUS LOW ATTRITION 157
APPENDIX D. GLOSSARY OF SYMBOLS FOR STATISTICAL FORMULAS 159
APPENDIX E. STATISTICAL FORMULAS FOR EACH FINDING IN A STUDY 168
APPENDIX F. STATISTICAL FORMULAS FOR AGGREGATING STUDY FINDINGS 195
APPENDIX G. ADDITIONAL DETAIL FOR ANALYSES OF COMPLIER AVERAGE CAUSAL EFFECTS 202
APPENDIX H. ADDITIONAL DETAIL FOR ANALYSES WITH MISSING DATA 211
APPENDIX I. STATISTICAL FORMULAS FOR THE NONOVERLAP OF ALL PAIRS IN SINGLE-CASE DESIGNS 223
REFERENCES 228
Tables
Table 1. What Works Clearinghouse products 16
Table 2. What Works Clearinghouse certification training content and requirements 17
Table 3. What Works Clearinghouse research ratings 20
Table 4. Research designs reviewed by the What Works Clearinghouse and the highest research rating they are eligible to receive 24
Table 5. Examples of an N = 1 confounding factor 39
Table 6. Examples of characteristics or time as a confounding factor 40
Table 7. Examples of compromised individual-level and cluster-level randomized controlled trials 45
Table 8. People influencing students' placement into clusters by level of assignment 53
Table 9. Attrition boundaries and allowable reference samples for measuring individual-level attrition in cluster randomized controlled trials (step 3c) 57
Table 10. Absolute effect sizes at baseline 60
Table 11. Baseline adjustment measures for teacher and school leader outcomes 63
Table 12. Example baseline samples for an outcome sample of grade 12 students in 2014/15 (step 4b) 68
Table 13. Criteria for the forcing variable in a regression discontinuity design study 72
Table 14. Ratings for findings from regression discontinuity design studies 73
Table 15. Regression discontinuity design criteria for Standard 1: Integrity of the forcing variable 75
Table 16. Regression discontinuity design criteria for Standard 2: Sample attrition and baseline equivalence 77
Table 17. Regression discontinuity design criteria for Standard 3: Continuity of the relationship between the outcome and the forcing variable 79
Table 18. Regression discontinuity design criteria for Standard 4: Functional form and bandwidth 81
Table 19. Regression discontinuity design criteria for evaluating fuzzy regression discontinuity designs 85
Table 20. First-stage F statistic thresholds for satisfying the criterion of sufficient instrument strength 97
Table 21. Acceptable approaches for addressing missing baseline or outcome data 106
Table 22. Criteria for the study-level research rating based on research design and execution 138
Table 23. What Works Clearinghouse effectiveness ratings in individual studies and intervention reports by outcome domain 139
Table 24. Effectiveness ratings for recommendations in practice guides 142
Figures
Figure 1. Timeline of the What Works Clearinghouse Procedures and Standards Handbook 15
Figure 2. Three phases in the What Works Clearinghouse study review process 18
Figure 3. Study eligibility requirements for a What Works Clearinghouse review 30
Figure 4. Ratings flowchart for individual and cluster assignment randomized controlled trials and quasi-experimental designs 34
Figure 5. Example of outcome domain, outcomes, and measurements 35
Figure 6. Example of differential attrition rates resulting in dissimilar groups 48
Figure 7. Assessing risk of bias due to compositional change in cluster randomized controlled trials (step 3 in reviewing research design) 50
Figure 8. Assessing risk of bias due to joiners in cluster randomized controlled trials (step 3b) 52
Figure 9. Assessing risk of bias due to leavers in cluster randomized controlled trials (step 3c) 55
Figure 10. Acceptable methods for baseline adjustment 65
Figure 11. Two pathways for satisfying the baseline equivalence standard in cluster-level assignment studies (step 4) 66
Figure 12. Review process for studies that report findings from complier average causal effect analyses 95
Figure 13. Eligible and ineligible samples for repeated-measures analyses 101
Figure 14. Research ratings for randomized controlled trials and quasi-experimental designs with missing outcome or baseline data 104
Figure 15. Basic single-case design 113
Figure 16. Single-case design review process for eligible study findings 115
Figure 17. Zero- and low-variability baseline examples 121
Figure 18. Reversal/withdrawal design example 122
Figure 19. Multiple baseline design example 124
Figure 20. Example violations of first and second concurrence requirements 124
Figure 21. Example violation of the third concurrence requirement 125
Figure 22. Example of empty training phases 125
Figure 23. Multiple probe design, example 1 127
Figure 24. Multiple probe design, example 2 127
Figure 25. Alternating treatment design example 128
Figure 26. Treatment reversal design with extra phases that is rated Meets WWC Standards Without Reservations 132
Figure 27. Treatment reversal design with extra phases that is rated Does Not Meet WWC Standards 132
Figure 28. Multiple baseline design with four cases rated Meets WWC Standards Without Reservations 133
Figure 29. Combination multiple baseline design with reversals that is rated Meets WWC Standards Without Reservations 134
Boxes
Box 1. Outcome measures WWC considers valid and reliable without reliability statistics 36
Table A.1. Search term examples from the Adolescent Literacy Review protocol 153
Table C.1. Highest differential attrition rate for a sample to maintain low attrition, by overall attrition rate, under cautious and optimistic assumptions 157
Table D.1. Glossary of statistical formula symbols 159
Table E.1. Descriptive statistics for a low-attrition randomized controlled trial 171
Table E.2. Use cases for covariate-adjusted standard error formulas 175
Table E.3. Descriptive statistics for a cluster-level analysis 184
Table F.1. Domain-level computations for an example study with two main findings 197
Table F.2. Example of fixed-effects meta-analysis 199
Table H.1. Acceptable approaches for addressing missing baseline or outcome data 212
Table I.1. Example of pairwise comparisons for baseline trend 225
Table I.2. Example of pairwise comparisons for reversibility 226
Figure E.1. An improvement of 0.4 standard deviation 169