This study reviewed deep learning-based models to identify the optimal algorithm for developing an automatic scoring program for Korean essay-type tests.
To this end, automatic essay scoring models were built using three deep learning algorithms: the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). Because the essay scoring data comprise multiple score categories, each model was designed to perform multi-class classification and predict a score. The algorithms were compared on the time required for model training and on classification accuracy, precision, recall, and F1-score.
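The comparison metrics named above can be illustrated with a minimal sketch in plain Python. The abstract does not say how per-class scores were averaged, so macro averaging (an unweighted mean over score classes) is an assumption here, and the function name `macro_metrics` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Assumption: macro averaging; the source does not state the scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class

    precisions, recalls, f1s = [], [], []
    for c in labels:
        # A class with no predictions (or no true samples) scores 0.0.
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical essay scores on a three-level scale (0, 1, 2).
report = macro_metrics([0, 1, 2, 2], [0, 2, 2, 2])
```

With unevenly distributed score classes, macro-averaged scores can fall well below accuracy, because a class that is never predicted correctly pulls the mean down regardless of how few samples it has.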
This study addressed four research questions. First, what are the advantages and disadvantages of the RNN-based automatic scoring program for Korean essay-type evaluation? Second, what are the advantages and disadvantages of the LSTM-based program?
Third, what are the advantages and disadvantages of the GRU-based program?
Fourth, which of RNN, LSTM, and GRU is the optimal algorithm for an automatic scoring program for Korean essay-type evaluation?
The results show that the performance of the RNN-, LSTM-, and GRU-based automatic scoring programs built in this study was not high. However, the results are acceptable given the quality, amount, and distribution of the data, which are major determinants of a learning model's performance.
Therefore, the performance of the automatic scoring program is expected to improve further when models are built and trained on data that are of high quality, sufficient in quantity, and evenly distributed.
Among the three algorithms, the GRU-based automatic scoring program performed similarly to the LSTM-based program but required less training time. The GRU-based program was therefore the best of the three models for developing a Korean essay-type automatic scoring program.
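One plausible contributor to the shorter GRU training time is the smaller number of weights per layer. The sketch below counts trainable parameters of a single recurrent layer under the textbook formulations, with one bias vector per gate. The input and hidden sizes are hypothetical, since the abstract does not report the model dimensions, and parameter count is only a rough proxy for training time.

```python
def recurrent_layer_params(input_dim, hidden_dim, n_blocks):
    """Parameters of one recurrent layer, textbook formulation.

    Each block (gate or candidate state) has an input weight matrix,
    a recurrent weight matrix, and a single bias vector.
    """
    per_block = hidden_dim * (input_dim + hidden_dim + 1)
    return n_blocks * per_block


# Number of weight blocks per cell type:
# simple RNN has 1 (the state update),
# GRU has 3 (update gate, reset gate, candidate state),
# LSTM has 4 (input, forget, and output gates, plus candidate cell state).
BLOCKS = {"RNN": 1, "GRU": 3, "LSTM": 4}

# Hypothetical sizes: 128-dimensional embeddings, 64 hidden units.
counts = {
    name: recurrent_layer_params(128, 64, blocks)
    for name, blocks in BLOCKS.items()
}
for name, count in counts.items():
    print(f"{name}: {count} parameters")
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size. Some library implementations add a second bias vector to the GRU, so exact counts can differ slightly from this sketch.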
As follow-up work, comparative studies of models such as bidirectional LSTM and attention-based deep learning models are expected to advance research on an optimal deep learning-based automatic scoring program for Korean essay-type evaluation.