Title Page
Contents
ABSTRACT 11
Ⅰ. Introduction 13
Ⅱ. Background Knowledge 18
2.1. Graph Neural Networks (GNNs) 18
2.1.1. Graph Convolutional Network (GCN) 20
2.1.2. Graph Attention Network (GAT) 21
2.2. Node Embeddings 23
Ⅲ. Related Work 27
3.1. Deep Clustering Research 28
3.1.1. Autoencoder-based Deep Clustering Research 28
3.1.2. GNN-based Deep Clustering Research 30
3.1.3. GNN and Autoencoder-based Deep Clustering Research 31
3.2. Contributions of Our Proposed Algorithm 33
Ⅳ. Deep k-Means Clustering of Nodes in a Graph 34
4.1. Initial GNN Training and Initial Clustering Stage 35
4.2. Iterative Redefinition of the Loss Function and GNN Retraining 37
4.3. Final Clustering Stage 41
Ⅴ. Experiments 43
5.1. Experiment Setup 44
5.1.1. Real-world Datasets 44
5.1.2. Baselines and Parameter Settings 46
5.1.3. Evaluation Metrics 50
5.2. Results 51
5.2.1. Clustering Performance Evaluation 51
5.2.2. Influence of k-means Clustering Loss Components 56
5.2.3. Influence of the Hyperparameter 59
5.2.4. Clustering Visualization 61
5.2.5. Comparative Analysis of Execution Times 66
Ⅵ. Conclusion 69
REFERENCES 71
ABSTRACT IN KOREAN 75
Table 1. Real datasets used in the experiments 44
Table 2. Summary of Comparative Algorithms for Loss Functions in Embedding Models 48
Table 3. Clustering performance for citation network datasets 52
Table 4. Clustering performance for Amazon datasets 53
Table 5. Clustering performance for Flickr dataset 54
Table 6. Clustering performance for BlogCatalog dataset 54
Table 7. Clustering performance for Wiki dataset 55
Figure 1. Difference between classical clustering and deep clustering 14
Figure 2. Visualizing the (a) Cohesion and (b) Separation of Clusters 16
Figure 3. Visualization of Node Embedding Obtained Using GNN 25
Figure 4. Overall flow of the proposed method 34
Figure 5. Influence of k-means clustering loss: ACC 57
Figure 6. Influence of k-means clustering loss: NMI 57
Figure 7. Influence of k-means clustering loss: ARI 58
Figure 8. Influence of k-means clustering loss: F1 58
Figure 9. Influence of k-means clustering loss hyperparameter 60
Figure 10. Clustering results visualization on BlogCatalog dataset 62
Figure 11. Clustering results visualization on CiteSeer dataset 63
Figure 12. Clustering results visualization on Computers dataset 64
Figure 13. Clustering results visualization on Cora dataset 65
Figure 14. Execution Time Comparison Between Models for Each Dataset 67