Panel data combines cross-sectional and time-series information, yielding a rich but large and complex dataset.
As a result, collecting, managing, and analyzing panel data is difficult, and the analysis methodology should be chosen according to the size of the panel. This study compared the performance of prediction models on panel data by predicting whether outstanding customers would churn while varying the number of customers and the length of the time period. The algorithms included mixed models that incorporate random effects, which are primarily used for panel data, as well as general statistical models and machine learning models that do not incorporate random effects. Panel, non-panel, linear-based, tree-based, and neural network-based models were compared across panel sizes, and predictive performance rankings were derived for each model type. The analysis revealed that as the time period increased, the performance of the panel models improved relative to that of the non-panel models, whereas the number of customers did not have a significant impact. The performance of the base models was also compared across panel sizes. These results are expected to make it easier to select an appropriate model for a given panel size. This study is significant because it establishes the effectiveness of predictive models that can be applied to panel data, which is typically challenging to analyze because of its size.