Title Page
Abstract
Contents
Chapter 1. Introduction & Background
1.1. Background
1.2. Difference between Supervised and Unsupervised Learning
1.3. Objective
1.4. Parameters of Analysis
1.5. Thesis Focus
1.6. Thesis Outline
Chapter 2. Literature Review
2.1. Overview
2.2. Machine Learning
2.3. Types of Machine Learning
2.4. Types of Machine Learning Algorithms
2.5. Supervised Machine Learning
2.6. Classification
2.7. Classification Process
2.8. Classification Algorithms
2.8.1. Logistic Regression
2.8.2. Naïve Bayes
2.8.3. K-Nearest Neighbors
2.8.4. Decision Tree
2.8.5. Random Forest
2.8.6. Support Vector Machine
2.9. Unsupervised Machine Learning
2.10. Clustering
2.10.1. Clustering Workflow
2.11. Clustering Algorithms
2.11.1. K-Means Clustering Algorithm
2.11.2. Expectation-Maximization (EM) Algorithm
2.12. Summary
Chapter 3. Experimental Study
3.1. Overview
3.2. Datasets
3.3. Methodology
3.4. What is Weka?
3.4.1. Features of Weka
3.5. Experimental Study of Classification Algorithms
3.6. Data Mining Process
3.7. Experimental Analysis of Classification (Supervised Learning) Algorithms
3.7.1. Performance of KNN Algorithm
3.7.2. Graphical Representation of Testing Accuracy (KNN)
3.7.3. Graphical Representation of Training Accuracy (KNN)
3.7.4. Performance of Backpropagation Algorithm
3.7.5. Graphical Representation of Testing Accuracy (BP)
3.7.6. Graphical Representation of Training Accuracy (BP)
3.7.7. Performance of Naïve Bayes Algorithm
3.7.8. Graphical Representation of Naïve Bayes Algorithm
3.7.9. Highest Testing and Training Accuracy of Classification Algorithms
3.7.10. Graphical Representation of Highest Testing & Training Accuracy
3.8. Experimental Analysis of Clustering (Unsupervised Learning) Algorithms
3.8.1. K-Means
3.8.2. Expectation-Maximization (EM) Algorithm
3.9. Summary
Chapter 4. Results and Discussion
4.1. Overview
4.2. Supervised Learning Algorithms
4.2.1. KNN Results Analysis
4.2.2. Backpropagation Results Analysis
4.2.3. Naïve Bayes Results Analysis
4.3. Unsupervised Learning Algorithms
4.3.1. Result Analysis of Clustering Algorithms
4.3.2. K-Means Analysis
4.3.3. Expectation-Maximization (EM) Analysis
Chapter 5. Conclusion
5.1. Conclusion
References
List of Tables
Table 3.1. Datasets
Table 3.2. KNN Algorithm Results
Table 3.3. BP Algorithm Results
Table 3.4. Naïve Bayes Algorithm Results
Table 3.5. Highest Testing Accuracy of Classification Algorithms
Table 3.6. Highest Training Accuracy
Table 3.7. Clustering Algorithm Results
List of Figures
Fig 2.1. Types of Machine Learning
Fig 2.2. Types of Machine Learning Algorithms
Fig 2.3. Classification Components
Fig 3.1. Data Mining Process
Fig 4.1. K-Means Result - Number of Clusters: 2
Fig 4.2. K-Means Result - Number of Clusters: 3
Fig 4.3. K-Means Result - Number of Clusters: 4
Fig 4.4. EM Result - Number of Clusters: 2
Fig 4.5. EM Result - Number of Clusters: 3
Fig 4.6. EM Result - Number of Clusters: 4
List of Graphs
Graph 3.1. KNN Testing Accuracy of Breast Cancer
Graph 3.2. KNN Testing Accuracy of Ionosphere
Graph 3.3. KNN Testing Accuracy of Glass
Graph 3.4. KNN Testing Accuracy of Unbalanced
Graph 3.5. KNN Testing Accuracy of Soybean
Graph 3.6. KNN Testing Accuracy of Labor
Graph 3.7. KNN Testing Accuracy of Credit
Graph 3.8. KNN Training Accuracy of Breast Cancer
Graph 3.9. KNN Training Accuracy of Ionosphere
Graph 3.10. KNN Training Accuracy of Glass
Graph 3.11. KNN Training Accuracy of Unbalanced
Graph 3.12. KNN Training Accuracy of Soybean
Graph 3.13. KNN Training Accuracy of Labor
Graph 3.14. KNN Training Accuracy of Credit
Graph 3.15. BP Testing Accuracy of Breast Cancer
Graph 3.16. BP Testing Accuracy of Ionosphere
Graph 3.17. BP Testing Accuracy of Glass
Graph 3.18. BP Testing Accuracy of Unbalanced
Graph 3.19. BP Testing Accuracy of Soybean
Graph 3.20. BP Testing Accuracy of Labor
Graph 3.21. BP Testing Accuracy of Credit
Graph 3.22. BP Training Accuracy of Breast Cancer
Graph 3.23. BP Training Accuracy of Ionosphere
Graph 3.24. BP Training Accuracy of Glass
Graph 3.25. BP Training Accuracy of Unbalanced
Graph 3.26. BP Training Accuracy of Soybean
Graph 3.27. BP Training Accuracy of Labor
Graph 3.28. BP Training Accuracy of Credit
Graph 3.29. Naïve Bayes Testing Accuracy
Graph 3.30. Naïve Bayes Training Accuracy
Graph 3.31. Highest Testing Accuracy - KNN
Graph 3.32. Highest Training Accuracy - KNN
Graph 3.33. Highest Testing Accuracy - BP
Graph 3.34. Highest Training Accuracy - BP
Graph 3.35. Highest Testing Accuracy - Naïve Bayes
Graph 3.36. Highest Training Accuracy - Naïve Bayes
Graph 4.1. Graphical Representation of K-Means Algorithm Results
Graph 4.2. Graphical Representation of EM Algorithm Results