Title Page
Contents
ABSTRACT 13
Chapter 1. Introduction 16
Chapter 2. Overview of Accounting Financed Emissions 23
2.1. Partnership for Carbon Accounting Financials (PCAF) 23
2.2. PCAF's Standardized GHG Estimation Accounting 24
2.3. Limitations of Current Methodology 28
Chapter 3. Literature Review 35
3.1. Analyzing the Accuracy of Emissions Data Across Different Data Quality Levels 35
3.2. Corporate Emissions Estimation with Externally Available Data 36
3.2.1. Linear Regression Analysis 37
3.2.2. Machine Learning Analysis 38
Chapter 4. Data 41
4.1. Target Feature 41
4.2. Predictor Features 42
4.2.1. Facility Number 42
4.2.2. Company Region 43
4.2.3. Industrial Classification Code 43
4.2.4. Number of Operating Industries 44
4.2.5. Parent Firm 45
4.2.6. Parent Business Classification 46
4.2.7. Financial Data 46
4.2.8. Currency Standard 49
4.2.9. Report Year & Month 50
4.3. Final Dataset 51
Chapter 5. Methodology 52
5.1. Current Option 3 Methodology 52
5.2. Machine-learning Algorithms 53
5.2.1. LightGBM 54
5.2.2. XGBoost 55
5.2.3. CatBoost 56
5.2.4. Cross-Validation & Test Set Selection 57
5.3. Feature Importance 58
Chapter 6. Results 59
6.1. Selected Evaluation Metrics 59
6.2. Performance 61
6.3. Performance Breakdown by Sectors 63
6.3.1. LightGBM Model's Performance Breakdown by Sectors 64
6.3.2. XGBoost Model's Performance Breakdown by Sectors 65
6.3.3. CatBoost Model's Performance Breakdown by Sectors 67
6.3.4. Performance Breakdown Analysis 69
6.4. Model Interpretability 70
6.4.1. SHAP Value Analysis of LightGBM Model 71
6.4.2. SHAP Value Analysis of XGBoost Model 75
6.4.3. SHAP Value Analysis of CatBoost Model 78
6.4.4. Overall Feature Evaluation 81
6.4.5. Financial Feature Analysis 82
Chapter 7. Discussion and Conclusion 92
References 95
Appendices 99
Appendix 1. Final Data Feature Description 99
Appendix 2. Data Feature Description 100
ABSTRACT IN KOREAN 101
Table 1. List of asset classes of PCAF's Financed Emissions 25
Table 2. General description of the data quality score table 29
Table 3. Data Quality Scores by Asset Class in Sustainable Finance Reports of U.S. Financial Institutions 32
Table 4. Parent Firm Value Standard 45
Table 5. Parent Business Classification Value Standard 46
Table 6. Modeling Technique Performance by NA Thresholds 61
Table 7. Comparative RMSE and R² Values for Option 3 and Machine Learning Models across Different NA Percentage Thresholds 62
Table 8. Number of Companies per Sector in the Test Set 69
Table 9. LightGBM Top Ten Important Features 71
Table 10. XGBoost Top Ten Important Features 75
Table 11. CatBoost Top Ten Important Features 78
Table 12. Important Financial Feature Description 84
Figure 1. Distribution of Missing Data in Financial Features 48
Figure 2. Performances Boxplot by Sectors of LightGBM Model 65
Figure 3. Performances Boxplot by Sectors of XGBoost Model 66
Figure 4. Performances Boxplot by Sectors of CatBoost Model 68
Figure 5. LightGBM Model Top 10 Important Features' SHAP Values 74
Figure 6. LightGBM Model Feature Importance Excluding 'fac02' 74
Figure 7. XGBoost Model Top 10 Important Features' SHAP Values 77
Figure 8. XGBoost Model Feature Importance Excluding 'fac02' 77
Figure 9. CatBoost Model Top 10 Important Features' SHAP Values 80
Figure 10. CatBoost Model Feature Importance Excluding 'fac02' 80