Title Page
ABSTRACT
Contents
List of Abbreviations 19
Chapter 1. Background and Literature Review of Study 21
1.1. Incidence and Aetiology of CRC 22
1.1.1. Epidemiological Trends in CRC 22
1.1.2. Modifiable Risk Factors 25
1.1.3. Genetic Factors 28
1.2. Carcinogenic Pathway and Molecular Subtype of CRC 30
1.2.1. Colorectal Carcinogenesis Phase and Pathway 30
1.2.2. Classification of Molecular Subtype 33
1.3. Methyl Donor Nutrients (One-carbon Metabolism) and CRC 38
1.4. DNA Mismatch Repair Machinery and Microsatellite Instability (MSI) on CRC 60
1.5. A Machine Learning Approach and CRC 70
1.6. Purpose of Study 80
Chapter 2. Dietary Methyl Donor Nutrients and Modifiable Lifestyle Factors on CRC Risk 83
2.1. Introduction 84
2.2. Subjects and Methods 86
2.2.1. Study Population 86
2.2.2. Clinicopathological Data Collection 87
2.2.3. Microsatellite Instability and DNA Mismatch Repair Proteins (MSI-MMR) Status 89
2.2.4. Assessment of Dietary Methyl Donor Nutrients 89
2.2.5. Assessment of Modifiable Lifestyle Factors 90
2.2.6. Statistical Analysis 91
2.3. Results 92
2.3.1. General Characteristics of Study Subjects 92
2.3.2. Clinicopathological Feature of CRC Patients 94
2.3.3. Comparison of Dietary Methyl Donor Nutrient Intake 96
2.3.4. Association between Dietary Methyl Donor Nutrients and CRC Risk by Anatomic Subsite and MSI-MMR Status 98
2.3.5. Comparison of Contributing Food in Methyl Donor Nutrients 107
2.3.6. Association between Modifiable Lifestyle Factors and CRC Risk by Anatomic Subsite and MSI-MMR Status 115
2.3.7. Spearman's Correlation Coefficient of Methyl Donor Nutrients and Modifiable Lifestyle Factors 118
2.3.8. Association between Dietary Methyl Donor Nutrients and CRC Risk by Modifiable Lifestyle Factors 121
2.4. Discussion 131
2.5. Conclusions 136
Chapter 3. Interaction among Dietary Methyl Donor Nutrients, Modifiable Lifestyle Factors, and Common Polymorphisms of DNA Mismatch Repair Genes on CRC Risk 137
3.1. Introduction 138
3.2. Materials and Methods 141
3.2.1. SNP Genotyping 141
3.2.2. Selection of Candidate Polymorphisms on DNA MMR Genes 143
3.2.3. Statistical Analysis 145
3.3. Results 147
3.3.1. Haplotype Distribution and Tagging SNPs in DNA MMR Genes for Association with CRC Risk 147
3.3.2. Association between Candidate Polymorphisms of hMSH3 Gene and CRC Risk 153
3.3.3. Association between Candidate Polymorphisms of hMSH3 Gene and CRC Risk by Anatomic Subsite and MSI-MMR Status 155
3.3.4. Association between Candidate Polymorphisms of hMSH3 Gene and CRC Risk according to Family History of CRC 157
3.3.5. Comparison of Functional Annotation in Candidate Polymorphisms of hMSH3 Gene 159
3.3.6. Interaction between Dietary Methyl Donor Nutrients and Candidate Polymorphisms of hMSH3 Gene on CRC Risk 161
3.3.7. Interaction between Dietary Niacin Intake and hMSH3 rs41097 Polymorphism on CRC Risk by Anatomic Subsite and MSI-MMR Status 166
3.3.8. Interaction between Dietary Niacin Intake and hMSH3 rs41097 Polymorphism on CRC Risk by Modifiable Lifestyle Factors 168
3.4. Discussion 171
3.5. Conclusions 178
Chapter 4. Impact of Dietary Methyl Donor Nutrients, Modifiable Lifestyle Factors, and DNA Mismatch Repair Genes for Prediction of CRC Using a Machine Learning Approach 179
4.1. Introduction 180
4.2. Materials and Methods 183
4.2.1. Dataset and Pre-processing 183
4.2.2. Construction of Machine Learning Models 186
4.2.3. Model Assessment 188
4.3. Results 190
4.3.1. Comparison of Confusion Matrix for Machine Learning Models 190
4.3.2. Identification of Important Features from Machine Learning Models 194
4.4. Discussion 197
4.5. Conclusions 202
Chapter 5. Concluding Remarks 203
5.1. Summary of Study 204
5.2. Limitation of Study 206
5.3. Significance and Future Study 207
BIBLIOGRAPHY 210
Appendices 241
Appendix-1. Flow chart of study subjects for interaction analysis 241
Appendix-2. Description of all polymorphisms on DNA MMR genes 242
Appendix-3. Interaction between modifiable lifestyle factors and candidate polymorphisms of hMSH3 gene on CRC risk 250
Appendix-4. Forms of the written informed consent 254
Appendix-5. The approval by the institutional review board of the National Cancer Center 258
Appendix-6. A structured general questionnaire provided to the case group 261
Appendix-7. A structured general questionnaire provided to the control group 270
Appendix-8. A semi-quantitative food frequency questionnaire (SQFFQ) provided to the case and control groups 281
List of Tables
Table 1-1. Hereditary CRC syndrome and genes 29
Table 1-2. Literature review of the association between dietary methyl donor nutrients and CRC risk 42
Table 1-3. Description of DNA MMR genes 61
Table 1-4. Literature review of the association among MMR genes, MSI status, and CRC risk 65
Table 1-5. Literature review of the association among environmental factors, MMR genes, and MSI status on CRC risk 67
Table 1-6. Literature review of machine learning techniques applied to detection and classification of CRC 74
Table 2-1. General characteristics of study subjects 93
Table 2-2. Clinicopathological feature of CRC patients 95
Table 2-3. Comparison of dietary methyl donor nutrient intake 97
Table 2-4. Association between dietary methyl donor nutrient intake and CRC risk by anatomic subsite and MSI-MMR status 100
Table 2-5. Comparison of contributing food in methyl donor nutrients 108
Table 2-6. Association between modifiable lifestyle factors and CRC risk by anatomic subsite and MSI-MMR status 117
Table 2-7. Spearman's correlation coefficient of methyl donor nutrients and modifiable lifestyle factors 119
Table 2-8. Association between dietary methyl donor nutrient intake and CRC risk by modifiable lifestyle factors 124
Table 3-1. Description of Tag SNPs in DNA MMR genes and FDR analysis for... 152
Table 3-2. Association between candidate polymorphisms of hMSH3 gene and CRC risk 154
Table 3-3. Association between candidate polymorphisms of hMSH3 gene and CRC risk by anatomic subsite and MSI-MMR status 156
Table 3-4. Association between candidate polymorphisms of hMSH3 gene and CRC risk according to family history of CRC 158
Table 3-5. Comparison of functional annotation in candidate polymorphisms of hMSH3 gene 160
Table 3-6. Interaction between dietary methyl donor nutrient intake and candidate polymorphisms of hMSH3 gene on CRC risk 162
Table 3-7. Interaction between dietary niacin intake and hMSH3 rs41097 polymorphism on CRC risk by anatomic subsite and MSI-MMR status 167
Table 3-8. Interaction between niacin intake and hMSH3 rs41097 polymorphism on CRC risk by modifiable lifestyle factors 169
Table 4-1. Description of feature variables in the study for a machine learning approach 185
Table 4-2. Comparison of a confusion matrix for each model 191
List of Figures
Figure 1-1. Age-standardized incidence (A: men, B: women)... 23
Figure 1-2. The proportion of CRC cases 24
Figure 1-3. Colorectal carcinogenesis phase and pathway 32
Figure 1-4. Comparison between traditional and currently proposed classification of... 36
Figure 1-5. Molecular pathway in colorectal tumorigenesis 37
Figure 1-6. Overview of one-carbon metabolism 41
Figure 1-7. Schematic diagram of DNA MMR and MSI pathway 64
Figure 1-8. Artificial intelligence, machine learning, and deep learning 71
Figure 1-9. The workflow of a machine learning approach 72
Figure 1-10. Construction of study 82
Figure 2-1. Flow chart of study subjects 88
Figure 3-1. Flow chart of the selection procedure for candidate polymorphisms... 142
Figure 3-2. LocusZoom plot of all polymorphisms on DNA MMR genes 144
Figure 3-3. A haplotype of DNA MMR polymorphisms and Tag SNPs 149
Figure 4-1. Schematic workflow of a machine learning approach... 184
Figure 4-2. Receiver operating characteristic (ROC) area under the curve... 192
Figure 4-3. Feature importance of each model 195