In the field of machine learning, the MNIST dataset is a well-known benchmark that researchers routinely use for training and evaluation. Ioffe and Szegedy (2015) demonstrated the effectiveness of batch normalization on MNIST. The purpose of this research is to show that, with batch normalization, the training error of a typical multilayer neural network on MNIST approaches 0%. We also improved the test error without applying a convolutional neural network (CNN) to MNIST. To this end, we trained a multilayer neural network equipped with batch normalization using several optimization algorithms: SGD, Momentum, AdaGrad, and Adam.
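
To make the experimental setup concrete, the following PyTorch sketch shows a multilayer fully connected network with batch normalization after each hidden layer, trained on MNIST under one of the compared optimizers. The layer sizes, learning rates, and batch size are illustrative assumptions for exposition, not the exact values used in the experiments.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms

# A typical multilayer (fully connected) network with batch normalization
# inserted after each hidden affine layer; layer widths are assumptions.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 100),
    nn.BatchNorm1d(100),
    nn.ReLU(),
    nn.Linear(100, 100),
    nn.BatchNorm1d(100),
    nn.ReLU(),
    nn.Linear(100, 10),
)

# The four optimizers compared in this work; hyperparameters here are
# placeholder assumptions, not the values from the experiments.
optimizers = {
    "SGD": torch.optim.SGD(model.parameters(), lr=0.1),
    "Momentum": torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9),
    "AdaGrad": torch.optim.Adagrad(model.parameters(), lr=0.01),
    "Adam": torch.optim.Adam(model.parameters(), lr=0.001),
}

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST("data", train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=128, shuffle=True,
)

loss_fn = nn.CrossEntropyLoss()
optimizer = optimizers["Adam"]  # one optimizer is selected per training run

# One epoch of training; in practice each optimizer would be run
# separately (with a freshly initialized model) for many epochs.
for images, labels in train_loader:
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```

In a sketch like this, batch normalization standardizes each hidden layer's pre-activations over the mini-batch, which is what allows the training error to be driven toward 0% across the different optimizers.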