Table of Contents

Title Page

ABSTRACT

Contents

List of Abbreviations

Chapter 1. Background and Literature Review of Study

1.1. Incidence and Aetiology of CRC

1.1.1. Epidemiological Trends in CRC

1.1.2. Modifiable Risk Factors

1.1.3. Genetic Factors

1.2. Carcinogenic Pathway and Molecular Subtype of CRC

1.2.1. Colorectal Carcinogenesis Phase and Pathway

1.2.2. Classification of Molecular Subtype

1.3. Methyl Donor Nutrients (One-carbon Metabolism) and CRC

1.4. DNA Mismatch Repair Machinery and Microsatellite Instability (MSI) on CRC

1.5. A Machine Learning Approach and CRC

1.6. Purpose of Study

Chapter 2. Dietary Methyl Donor Nutrients and Modifiable Lifestyle Factors on CRC Risk

2.1. Introduction

2.2. Subjects and Methods

2.2.1. Study Population

2.2.2. Clinicopathological Data Collection

2.2.3. Microsatellite Instability and DNA Mismatch Repair Proteins (MSI-MMR) Status

2.2.4. Assessment of Dietary Methyl Donor Nutrients

2.2.5. Assessment of Modifiable Lifestyle Factors

2.2.6. Statistical Analysis

2.3. Results

2.3.1. General Characteristics of Study Subjects

2.3.2. Clinicopathological Feature of CRC Patients

2.3.3. Comparison of Dietary Methyl Donor Nutrient Intake

2.3.4. Association between Dietary Methyl Donor Nutrients and CRC Risk by Anatomic Subsite and MSI-MMR Status

2.3.5. Comparison of Contributing Food in Methyl Donor Nutrients

2.3.6. Association between Modifiable Lifestyle Factors and CRC Risk by Anatomic Subsite and MSI-MMR Status

2.3.7. Spearman's Correlation Coefficient of Methyl Donor Nutrients and Modifiable Lifestyle Factors

2.3.8. Association between Dietary Methyl Donor Nutrients and CRC Risk by Modifiable Lifestyle Factors

2.4. Discussion

2.5. Conclusions

Chapter 3. Interaction among Dietary Methyl Donor Nutrients, Modifiable Lifestyle Factors, and Common Polymorphisms of DNA Mismatch Repair Genes on CRC Risk

3.1. Introduction

3.2. Materials and Methods

3.2.1. SNP Genotyping

3.2.2. Selection of Candidate Polymorphisms on DNA MMR Genes

3.2.3. Statistical Analysis

3.3. Result

3.3.1. Haplotype Distribution and Tagging SNPs in DNA MMR Genes for Association with CRC Risk

3.3.2. Association between Candidate Polymorphisms of hMSH3 Gene and CRC Risk

3.3.3. Association between Candidate Polymorphisms of hMSH3 Gene and CRC Risk by Anatomic Subsite and MSI-MMR Status

3.3.4. Association between Candidate Polymorphisms of hMSH3 Gene and CRC Risk according to Family History of CRC

3.3.5. Comparison of Functional Annotation in Candidate Polymorphisms of hMSH3 Gene

3.3.6. Interaction between Dietary Methyl Donor Nutrients and Candidate Polymorphisms of hMSH3 Gene on CRC Risk

3.3.7. Interaction between Dietary Niacin Intake and hMSH3 rs41097 Polymorphism on CRC Risk by Anatomic Subsite and MSI-MMR Status

3.3.8. Interaction between Dietary Niacin Intake and hMSH3 rs41097 Polymorphism on CRC Risk by Modifiable Lifestyle Factors

3.4. Discussion

3.5. Conclusions

Chapter 4. Impact of Dietary Methyl Donor Nutrients, Modifiable Lifestyle Factors, and DNA Mismatch Repair Genes for Prediction of CRC Using a Machine Learning Approach

4.1. Introduction

4.2. Materials and Methods

4.2.1. Dataset and Pre-processing

4.2.2. Construction of Machine Learning Models

4.2.3. Model Assessment

4.3. Results

4.3.1. Comparison of Confusion Matrix for Machine Learning Models

4.3.2. Identification of the Importance Feature from Machine learning Models

4.4. Discussion

4.5. Conclusions

Chapter 5. Concluding Remarks

5.1. Summary of Study

5.2. Limitation of Study

5.3. Significance and Future Study

BIBLIOGRAPHY

Appendices

Appendix-1. Flow chart of study subjects for interaction analysis

Appendix-2. Description of all polymorphisms on DNA MMR genes

Appendix-3. Interaction between modifiable lifestyle factors and candidate polymorphisms of hMSH3 gene on CRC risk

Appendix-4. Forms of the written informed consent

Appendix-5. The approval by the institutional review board of the National Cancer Center

Appendix-6. A structured general questionnaire provided to the case group

Appendix-7. A structured general questionnaire provided to the control group

Appendix-8. A semi-quantitative food frequency questionnaire (SQFFQ) provided to the case and control groups

List of Tables

Table 1-1. Hereditary CRC syndrome and genes

Table 1-2. Literature review of the association between dietary methyl donor nutrients and CRC risk

Table 1-3. Description of DNA MMR genes

Table 1-4. Literature review of the association among MMR genes, MSI status, and CRC risk

Table 1-5. Literature review of the association among environmental factors, MMR genes, and MSI status on CRC risk

Table 1-6. Literature review of machine learning techniques applied to detection and classification of CRC

Table 2-1. General characteristics of study subjects

Table 2-2. Clinicopathological feature of CRC patients

Table 2-3. Comparison of dietary methyl donor nutrient intake

Table 2-4. Association between dietary methyl donor nutrient intake and CRC risk by anatomic subsite and MSI-MMR status

Table 2-5. Comparison of contributing food in methyl donor nutrients

Table 2-6. Association between modifiable lifestyle factors and CRC risk by anatomic subsite and MSI-MMR status

Table 2-7. Spearman's correlation coefficient of methyl donor nutrients and modifiable lifestyle factors

Table 2-8. Association between dietary methyl donor nutrient intake and CRC risk by modifiable lifestyle factors

Table 3-1. Description of Tag SNPs in DNA MMR genes and FDR analysis for...

Table 3-2. Association between candidate polymorphisms of hMSH3 gene and CRC risk

Table 3-3. Association between candidate polymorphisms of hMSH3 gene and CRC risk by anatomic subsite and MSI-MMR status

Table 3-4. Association between candidate polymorphisms of hMSH3 gene and CRC risk according to family history of CRC

Table 3-5. Comparison of functional annotation in candidate polymorphisms of hMSH3 gene

Table 3-6. Interaction between dietary methyl donor nutrient intake and candidate polymorphisms of hMSH3 gene on CRC risk

Table 3-7. Interaction between dietary niacin intake and hMSH3 rs41097 polymorphism on CRC risk by anatomic subsite and MSI-MMR status

Table 3-8. Interaction between niacin intake and hMSH3 rs41097 polymorphism on CRC risk by modifiable lifestyle factors

Table 4-1. Description of feature variables in the study for a machine learning approach

Table 4-2. Comparison of a confusion matrix for each model

List of Figures

Figure 1-1. Age-standardized incidence (A: men, B: women)...

Figure 1-2. The proportion of CRC case

Figure 1-3. Colorectal carcinogenesis phase and pathway

Figure 1-4. Comparison between traditional and currently proposed classification of...

Figure 1-5. Molecular pathway in colorectal tumorigenesis

Figure 1-6. Overview of one-carbon metabolism

Figure 1-7. Schematic diagram of DNA MMR and MSI pathway

Figure 1-8. Artificial intelligence, machine learning, and deep learning

Figure 1-9. The workflow of a machine learning approach

Figure 1-10. Construction of study

Figure 2-1. Flow chart of study subjects

Figure 3-1. Flow chart of the selection procedure for candidate polymorphisms...

Figure 3-2. LocusZoom plot of all polymorphisms on DNA MMR genes

Figure 3-3. A haplotype of DNA MMR polymorphisms and Tag SNPs.

Figure 4-1. Schematic workflow of a machine learning approach...

Figure 4-2. Receiver operating characteristic (ROC) area under the curve...

Figure 4-3. Feature importance of each model

Abstract

Colorectal cancer (CRC) is a heterogeneous disease caused by a complex interplay among diet, environmental factors, and genetics, involving multiple molecular and pathological pathways. In one-carbon metabolism, dietary methyl donor nutrients (vitamin B2, niacin, vitamin B6, folate, vitamin B12, methionine, and choline) play a role in nucleic acid synthesis and repair, epigenetic alterations, and amino acid homeostasis. Methyl groups may affect CRC risk depending on genetic variants of the DNA mismatch repair (MMR) genes (hMLH1, hMLH3, hMSH2, hMSH3, hMSH6, hPMS1, and hPMS2), which are related to colorectal carcinogenesis. This study aimed to determine whether dietary methyl donor nutrients are associated with CRC risk, to identify how that association is altered by genetic variants of DNA MMR genes, and to examine the interactive effects between them together with modifiable lifestyle factors. In addition, multiple machine learning methods were compared to identify the best predictive model for CRC.

This study included 626 cases and 838 controls matched for age and sex. For the 626 cases, clinicopathological information, including the anatomical location of the lesion and the DNA MMR and microsatellite instability (MSI) status, was obtained from medical records. A semiquantitative food frequency questionnaire was used to collect dietary information on methyl donor nutrients. Common polymorphisms of DNA MMR genes were selected by comparison with the Chinese and Japanese populations in the 1000 Genomes data, excluding variants with a minor allele frequency < 0.1. Odds ratios (ORs) and their corresponding 95% confidence intervals (CIs) were estimated using logistic regression. For the machine learning analysis, seven algorithms (logistic regression, Ridge regression, Lasso regression, Naive Bayes, Random Forest, Support Vector Machine, and Kernel Support Vector Machine) were used to construct models for the prediction of CRC. Performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC).
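As a rough illustration of how an odds ratio and its 95% confidence interval are obtained, the sketch below computes an unadjusted OR with a Wald interval from a 2x2 table. It is not the study's analysis: the abstract states only that logistic regression was used, so the function name `odds_ratio_ci` and the counts in the usage line are hypothetical. A logistic regression with a single binary exposure and no covariates yields the same OR and the same Wald interval.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 for 95%).

    Hypothetical helper for illustration; all four counts must be > 0.
    """
    # Cross-product ratio of the 2x2 table.
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts (not from the study): 20 of 100 cases and 40 of 100 controls
# fall in the high-intake group.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 40, 60)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An interval that lies entirely below 1, as in this made-up example, corresponds to an inverse association; an adjusted analysis would add covariates to the regression model instead of using raw counts.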

Among the seven methyl donor nutrients, high intake of vitamin B2, niacin, vitamin B6, folate, and methionine was inversely associated with CRC risk (OR for Q4 vs. Q1, 95% CI, P for trend: vitamin B2, 0.20, 0.13-0.32, P < 0.001; niacin, 0.41, 0.27-0.63, P < 0.001; vitamin B6, 0.54, 0.36-0.80, P = 0.001; folate, 0.38, 0.25-0.57, P < 0.001; methionine, 0.24, 0.15-0.37, P < 0.001), whereas high intake of choline was associated with increased CRC risk (OR for Q4 vs. Q1, 95% CI: 5.97, 3.78-9.43; P for trend < 0.001). In the subgroup analysis by MSI-MMR status, the associations of vitamin B2, niacin, methionine, and choline with CRC risk were observed regardless of MSI-MMR status. However, the inverse associations of vitamin B6 and folate with CRC risk were observed in the proficient MMR and microsatellite stable (MSS) status. With respect to modifiable lifestyle factors, participants with high intake of methyl donor nutrients who also engaged in regular physical activity were more likely to have a reduced risk of CRC. Regarding the genetic variants of the DNA MMR machinery, three variants of the hMSH3 gene (rs32952 A>C, rs41097 A>G, and rs245404 C>G) were each independently associated with reduced CRC risk (OR, 95% CI: rs32952, 0.80, 0.65-0.98, P = 0.031, AA vs. AC vs. CC in the additive model; rs32952, 0.72, 0.56-0.94, P = 0.014, AC+CC vs. AA in the dominant model; rs41097, 0.74, 0.57-0.97, P = 0.027, AG+GG vs. AA in the dominant model; rs245404, 0.76, 0.58-1.00, P = 0.047, CG+GG vs. CC in the dominant model), particularly in the proficient MMR or MSS status. For the three hMSH3 variants, among participants with no first-degree family history of CRC, the heterozygous genotype was associated with reduced CRC risk compared with the homozygous major allele genotype. Furthermore, G allele carriers of hMSH3 rs41097 with high niacin intake showed a strong interactive effect relative to AA carriers with low niacin intake (OR, 95% CI: 0.49, 0.33-0.72, AG+GG carriers with high niacin intake vs. AA carriers with low niacin intake; P for interaction = 0.020), particularly for colon cancer and for the proficient MMR or MSS status. In the analysis stratified by modifiable lifestyle factors, the interaction between niacin and the rs41097 variant on CRC risk was observed in the strata of low BMI or prior BMI, non-smoking, alcohol consumption, regular physical activity, and low total energy intake. When multiple machine learning approaches were applied to the potential risk factors, Random Forest (accuracy 80.89%, AUC 0.801) and Lasso regression (accuracy 76.75%, AUC 0.860) performed slightly better in predicting CRC than the other models.
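The two reported metrics can rank models differently: accuracy depends on a single decision threshold, whereas AUC measures how well scores order cases above controls across all thresholds. The sketch below is a minimal illustration with invented toy scores, not the study's data or code; the helper names `accuracy` and `roc_auc` and the 0.5 threshold are assumptions.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of correct labels after thresholding the scores."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, y_true)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise definition)."""
    positives = [s for s, y in zip(scores, y_true) if y == 1]
    negatives = [s for s, y in zip(scores, y_true) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = case, 0 = control) and invented scores from two
# hypothetical models.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
model_b = [0.9, 0.6, 0.51, 0.95, 0.1, 0.2]

for name, scores in (("A", model_a), ("B", model_b)):
    print(name, round(accuracy(labels, scores), 3),
          round(roc_auc(labels, scores), 3))
```

In this toy case model B has the higher accuracy while model A has the higher AUC, the same kind of split seen between the Random Forest and Lasso regression results above.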

Our findings indicate an association between dietary methyl donor nutrients and genetic variants of the DNA MMR system with respect to CRC risk, suggesting that gene-diet interactions confer differing benefits in CRC etiology. Furthermore, an improved understanding of the associations among methyl donor nutrients, modifiable lifestyle factors, DNA MMR genes, and MSI-MMR status in CRC, together with machine learning approaches, may provide valuable avenues and insight for the qualitative diagnosis of CRC.