Title Page
Contents
Abstract 9
1. Introduction 10
1.1. Data Augmentation Techniques 10
1.2. The Deep Learning-based Bioinformatics Analysis 12
1.3. Attention Mechanism 15
1.3.1. Recurrent Models of Visual Attention 15
1.3.2. Channel Attention 17
1.3.3. Spatial-channel Hybrid Attention 18
1.4. Generative Adversarial Networks 20
2. Related Works 22
3. The Proposed Method 24
3.1. Data Dimensionality Reduction 24
3.2. Augmentation Model 26
3.3. Attention Mechanism Module 29
3.4. Generated Data Distribution 34
3.5. Classifier Model 35
4. Experimental Results 36
4.1. The Datasets 36
4.2. Data Augmentation Model 39
4.3. UMAP Distribution Map 41
4.4. Classification Using the Real Data 45
4.5. Classification Using the Augmented and Real Data 46
5. Conclusions 48
6. Discussion 49
References 50
Korean Abstract 54
Table.1. The gene expression data. The table consists of four parts: gene ID, gene expression data, patient ID, and project name (cancer type). 36
Fig.1. Recurrent models of visual attention. 15
Fig.2. Squeeze-and-Excitation networks. 17
Fig.3. The overview of CBAM. 18
Fig.4. The AC-GAN model. G represents the generator network and D represents the discriminator network. Z denotes the input data, and C... 26
Fig.5. The generator network. The filters in Convolution Transpose layer are 256, 256, 256, 512, 512, 512, 512, 256, 256, 256 and the kernel size... 31
Fig.6. The discriminator network. The filters in the Convolution layer are 256, 128, 64, 32, 16 and the kernel size is 2. The ratio of the first fully... 33
Fig.7. The classifier model. 35
Fig.8. The training loss curve of the data augmentation model. After 50,000 training iterations, the loss functions of both the generator and... 39
Fig.9. The UMAP plot of the training data. 41
Fig.10. 200 augmented data samples of Class 0. 42
Fig.11. 200 augmented data samples of Class 1. 43
Fig.12. 200 augmented data samples of Class 2. 43
Fig.13. 200 augmented data samples of Class 3. 44
Fig.14. Confusion matrix of the classifier's predictions versus the true labels when using the raw data. 45
Fig.15. Confusion matrix of the classifier's predictions versus the true labels when using the augmented data and raw data. 46