Colorectal cancer (CRC) is a heterogeneous disease caused by a complex interplay among diet, environmental factors, and genetics, involving many different molecules and pathological pathways. In one-carbon metabolism, dietary methyl donor nutrients (vitamin B2, niacin, vitamin B6, folate, vitamin B12, methionine, and choline) play a role in nucleic acid synthesis and repair, epigenetic alterations, and amino acid homeostasis. The effect of these methyl donors on CRC risk may depend on genetic variants of the DNA mismatch repair (MMR) genes (hMLH1, hMLH3, hMSH2, hMSH3, hMSH6, hPMS1, and hPMS2), which are related to colorectal carcinogenesis. This study aimed to determine whether dietary methyl donor nutrients were associated with CRC risk, to identify how that association was modified by genetic variants of DNA MMR genes, and to examine the interactive effects between them together with modifiable lifestyle factors. In addition, multiple machine learning methods were compared to identify the best predictive model for CRC.
This study included 626 cases and 838 controls matched for age and sex. For the 626 cases, clinicopathological information, including the anatomical location of the lesion and DNA MMR and microsatellite instability (MSI) status, was obtained from medical records. A semiquantitative food frequency questionnaire was used to collect dietary information on methyl donor nutrients. Common polymorphisms of DNA MMR genes were selected by comparison with the Chinese and Japanese populations in the 1000 Genomes data, after excluding variants with a minor allele frequency < 0.1. Odds ratios (ORs) and their corresponding 95% confidence intervals (CIs) were estimated using logistic regression. Seven machine learning algorithms (logistic regression, Ridge regression, Lasso regression, Naive Bayes, Random Forest, Support Vector Machine, and Kernel Support Vector Machine) were used to construct models for the prediction of CRC. Performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC).
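As a rough illustration of how an OR and its 95% CI are obtained, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2x2 table. It is a simplification: the study itself used logistic regression, and the function name `odds_ratio_ci` and the counts in the example are hypothetical, not data from this study.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Assumption: a plain 2x2 table (exposure by case status) with no
    covariate adjustment and no zero cells. z=1.96 gives a 95% interval.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")

    # Odds of exposure among cases divided by odds of exposure among controls.
    odds_ratio = (a * d) / (b * c)

    # Standard error of log(OR), then back-transform the interval limits.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only: highest vs. lowest intake quartile.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 40, 60)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An interval that lies entirely below 1 corresponds to an inverse association, and one entirely above 1 to an increased risk.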
Among the seven methyl donor nutrients, high intake of vitamin B2, niacin, vitamin B6, folate, and methionine was inversely associated with CRC risk (OR for Q4 vs. Q1, 95% CI, P for trend: vitamin B2, 0.20, 0.13-0.32, P < 0.001; niacin, 0.41, 0.27-0.63, P < 0.001; vitamin B6, 0.54, 0.36-0.80, P = 0.001; folate, 0.38, 0.25-0.57, P < 0.001; methionine, 0.24, 0.15-0.37, P < 0.001), whereas high intake of choline was associated with increased CRC risk (OR for Q4 vs. Q1 = 5.97, 95% CI = 3.78-9.43, P for trend < 0.001). In the subgroup analysis by MSI-MMR status, the associations of vitamin B2, niacin, methionine, and choline with CRC risk were observed regardless of MSI-MMR status. However, the inverse associations of vitamin B6 and folate with CRC risk were observed in the proficient MMR and microsatellite stable (MSS) status. Among the modifiable lifestyle factors, individuals with high intake of methyl donor nutrients and regular physical activity were more likely to have a reduced risk of CRC. Regarding the DNA MMR machinery, three genetic variants of the hMSH3 gene (rs32952 A>C, rs41097 A>G, and rs245404 C>G) were independently associated with reduced CRC risk (OR, 95% CI: rs32952, 0.80, 0.65-0.98, P = 0.031, AA vs. AC vs. CC in the additive model; rs32952, 0.72, 0.56-0.94, P = 0.014, AC+CC vs. AA in the dominant model; rs41097, 0.74, 0.57-0.97, P = 0.027, AG+GG vs. AA in the dominant model; rs245404, 0.76, 0.58-1.00, P = 0.047, CG+GG vs. CC in the dominant model), particularly in the proficient MMR or MSS status. For these three hMSH3 variants, the heterozygous genotype combined with no first-degree family history of CRC was associated with reduced CRC risk compared with the homozygous major allele genotype. Furthermore, high niacin intake in G allele carriers of hMSH3 rs41097 showed a strong interactive effect relative to AA carriers with low niacin intake (OR, 95% CI = 0.49, 0.33-0.72, AG+GG carriers with high niacin intake vs. AA carriers with low niacin intake, P for interaction = 0.020), particularly for colon cancer and the proficient MMR or MSS status. In the stratified analysis of modifiable lifestyle factors, the interactive effect between niacin and the rs41097 variant on CRC risk was observed in the strata of low BMI or prior BMI, non-smoking, alcohol consumption, regular physical activity, and low total energy intake. When multiple machine learning approaches were applied using the potential risk factors, Random Forest (accuracy 80.89%, AUC 0.801) and Lasso regression (accuracy 76.75%, AUC 0.860) showed slightly better performance in predicting CRC than the other methods.
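Accuracy and AUC measure different things, which is why Random Forest can have the higher accuracy while Lasso regression has the higher AUC. The following is a minimal sketch of both metrics in plain Python, assuming binary labels (1 = case, 0 = control) and a predicted score per subject; the function names, the 0.5 threshold, and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Uses the pairwise (Mann-Whitney) formulation; ties count as half.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


def accuracy(labels, scores, threshold=0.5):
    """Share of subjects classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Toy data, for illustration only.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
print("AUC:", roc_auc(toy_labels, toy_scores))
print("Accuracy:", accuracy(toy_labels, toy_scores))
```

AUC depends only on how the scores rank cases against controls, whereas accuracy depends on the chosen threshold, so the two metrics need not favor the same model.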
Our findings indicate an association of dietary methyl donor nutrients and genetic variants of the DNA MMR system with CRC risk, suggesting that gene-diet interactions contribute differently to CRC etiology. Furthermore, an improved understanding of the associations among methyl donor nutrients, modifiable lifestyle factors, DNA MMR genes, and MSI-MMR status in CRC, together with machine learning approaches, may provide valuable avenues and insights for the qualitative diagnosis of CRC.