Abstract

Tabular data are fundamental to critical decision-making in healthcare, finance, and manufacturing. However, generating high-quality synthetic tabular data remains challenging because of heterogeneous feature types, severe class imbalance, and stringent privacy requirements. To address these challenges, we propose TabCL, a generative framework that combines a variational autoencoder (VAE) with a denoising diffusion model, reinforced by contrastive learning and consistency regularization. The contrastive learning component sharpens class boundaries in the VAE's latent space, improving class separation and representation quality. Consistency regularization, in turn, encourages latent codes perturbed with different noise levels to reconstruct to the same output, which enhances both sample diversity and model robustness without increasing computational complexity. Extensive experiments on six public benchmarks (four classification and two regression datasets) demonstrate that TabCL outperforms existing methods, including SMOTE, CTGAN, and TVAE, across all standard quantitative metrics. Distributional analyses further reveal that TabCL more accurately reproduces rare categorical levels, heavy-tailed numerical outliers, and complex cross-feature correlations, yielding synthetic data whose statistical properties closely align with those of the real data even under severe class imbalance and limited sample sizes. By simultaneously improving latent-space structure and diffusion-based generation, TabCL produces high-fidelity, privacy-respecting synthetic tabular data suitable for downstream modelling and data sharing. Future extensions will target longitudinal datasets and incorporate formal differential privacy guarantees to enable broader deployment in privacy-sensitive industrial environments.
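The consistency idea in the abstract can be illustrated with a toy example. The abstract does not give the form of the loss, so everything below is an assumption made for illustration: the linear decoder, the Gaussian perturbation, the squared-error penalty, and all function names are hypothetical and are not taken from TabCL. The sketch perturbs one latent code at two noise levels, decodes both copies, and penalises the difference between the two reconstructions.

```python
import random


def decode(z, weights):
    """Toy linear decoder (assumed): maps a latent vector to an output vector."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]


def perturb(z, sigma, rng):
    """Add Gaussian noise with standard deviation sigma to each latent coordinate."""
    return [zi + rng.gauss(0.0, sigma) for zi in z]


def consistency_loss(z, weights, sigma_a, sigma_b, rng):
    """Mean squared difference between decodings of two noisy copies of z.

    This squared-error form is an illustrative assumption, not the paper's loss.
    """
    out_a = decode(perturb(z, sigma_a, rng), weights)
    out_b = decode(perturb(z, sigma_b, rng), weights)
    return sum((a - b) ** 2 for a, b in zip(out_a, out_b)) / len(out_a)


rng = random.Random(0)                         # fixed seed for repeatability
z = [0.5, -1.0, 2.0]                           # hypothetical latent code
weights = [[1.0, 0.0, 0.5], [0.2, -0.3, 0.1]]  # hypothetical decoder weights

# Two different noise levels give different reconstructions, so the penalty is positive.
loss_noisy = consistency_loss(z, weights, 0.1, 0.5, rng)
# With no noise both copies are identical, so the penalty vanishes.
loss_clean = consistency_loss(z, weights, 0.0, 0.0, rng)
```

In a trainable model a term of this kind would be added to the main objective, so that minimising it pushes the decoder toward outputs that are stable under latent noise. How TabCL weights or schedules such a term is not stated in the abstract.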

Articles in this issue

Title / Authors / Pages
Consistency-driven diffusion for robust tabular data synthesis / 윤지현, 김성범 / p. 439-453
Grover algorithm-based quantum scheduling framework for constrained graph coloring / Lutfiana Sausan, Hyunsoo Lee / p. 454-464
Quality improvement method using data farming on quality big data / 주혜진, 송유진, 변재현 / p. 465-474
Diffusion-based PPG-to-ECG translation with dynamic ROI selection / 추창욱, 김성범 / p. 475-488
Text mining-based diagnosis and improvement of UX issues in mobile financial applications / 박가율, 정명철, 모승민 / p. 489-501
A quantitative approach to plant disease severity using explainable AI: a case study on powdery mildew / 이종욱, 이상연, 장원준, 정준각 / p. 502-514
Welding defect classification based on threshold optimization: a solution to class imbalance challenges / 한민기, 김도희, 배혜림 / p. 515-526