Abstract

Since e-mail came into commercial use, classifying spam correctly and accurately has been a long-standing problem for academia and the security industry. Many methods have been proposed, and active applied research on filter-based morphological analysis and semantic analysis has advanced spam filtering considerably. Even so, these methods have not yet reached human-level classification. In this paper, we propose a text-based spam filtering method that combines an existing rule-based spam filter with an additional BiLSTM (Bidirectional Long Short-Term Memory) filter. The rule-based filter retains the safety of the existing approach, while the BiLSTM filter is applied to spam with new patterns that the rule-based filter fails to block, improving the overall accuracy of spam filtering.