Title Page
Contents
ABSTRACT 9
Ⅰ. Introduction 11
Ⅱ. Background 14
1. Unsupervised domain adaptation 14
2. Semi-supervised learning 16
Ⅲ. Methods 18
1. Overall framework 18
2. Data collection 20
3. Preprocessing 22
4. Adversarial training and semi-supervised learning for unsupervised domain adaptation 23
1) Pre-training 23
2) Adversarial training 23
3) Semi-supervised learning 25
5. Implementation details 26
Ⅳ. Results 27
1. Performance comparison with batch correction methods 27
2. Visualization of representation space 29
3. Generalization to the DNA methylome cancer datasets 34
4. Ablation study 36
Ⅴ. Conclusion 38
REFERENCES 39
ABSTRACT IN KOREAN 47
〈Table 1〉 Datasets used for training scUDAS 20
〈Table 2〉 Cell type and BRCA subtype information 21
〈Table 3〉 Performance comparison of cell type prediction accuracy with batch correction methods 28
〈Table 4〉 Performance comparison of cell type prediction weighted F1 score with batch correction methods 28
〈Table 5〉 Performance comparison of BRCA subtype between scUDAS and other ML-based methods 35
〈Table 6〉 Results of ablation study on main components of scUDAS 37
[Figure 1] Unsupervised domain adaptation (UDA) process. The shape represents the cell type and the color represents the domain. Dashed... 15
[Figure 2] Workflow of semi-supervised learning. It consists of three main phases: (1) Pre-training the model using labeled data. (2)... 17
[Figure 3] Workflow of the proposed scUDAS based on adversarial training and semi-supervised learning using the scRNA-seq dataset.... 19
[Figure 4] UMAP visualization of input 2000 genes (a) Smart-seq2 (b) CEL-Seq2 31
[Figure 5] UMAP visualization of an uncorrected batch effect in the representation space (a) Smart-seq2 (b) CEL-Seq2 32
[Figure 6] UMAP visualization of scUDAS in the representation space (a) Smart-seq2 (b) CEL-Seq2 33