National Assembly Library of Korea

Domestic article: A comparison of synthetic data approaches using utility and disclosure risk measures (유용성과 노출 위험성 지표를 이용한 재현자료 기법 비교 연구)

Authors
안성빈, 트랑 도안, 이주희, 김지우, 김용재, 김윤지, 윤창원, 정성규, 김동하, 권성훈, 김항준, 안정연, 박철우
Publication
Seoul: The Korean Statistical Society, April 30, 2023
Source
The Korean Journal of Applied Statistics (응용통계연구), Vol. 36, No. 2 (April 2023), pp. 141-166
Location
Seoul Library, electronic resources; Periodicals Room (Room 524)
Control number
KINX2023135362
Notes
KCI-listed (candidate) journal; record provided by the National Research Foundation of Korea
Text in Korean; abstracts in English and Korean

Table of contents

Abstract

1. Introduction

2. Synthetic data generation methods

2.1. Description of the SURVEY EST dataset

2.2. Synthetic data generation using sequential regression models

2.3. Synthetic data generation using nonparametric Bayesian models

2.4. Synthetic data generation using artificial neural networks

2.5. Characteristics and differences of the synthetic data generation methods

3. Evaluation measures for synthetic data

3.1. Utility measures

3.2. Disclosure risk measures for synthetic data

3.3. α-precision, β-recall, and authenticity score

3.4. Characteristics and differences of the evaluation measures

4. Comparative analysis of the synthetic data methods

5. Conclusion

Appendix

References

Summary (in Korean)

Articles in this issue

Title	Authors	Pages
Comparative study of data augmentation methods for fake audio detection	박관열, 곽일엽	pp. 101-114
Variational Bayesian multinomial probit model with Gaussian process classification on mice protein expression level data	손동현, 황범석	pp. 115-127
Forecasting volatility index by temporal convolutional neural network	신지원, 신동완	pp. 129-139
A comparison of synthetic data approaches using utility and disclosure risk measures	안성빈, 트랑 도안, 이주희, 김지우, 김용재, 김윤지, 윤창원, 정성규, 김동하, 권성훈, 김항준, 안정연, 박철우	pp. 141-166
A review on the t-distributed stochastic neighbors embedding	김기풍, 김충락	pp. 167-173
