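Title/Author
Data-driven non-native pronunciation variation modeling for automatic speech recognition = 타 언어권 화자 연속음성인식을 위한 데이터 기반 발음변이 모델링 / Kim Min-a (김민아)
Publication
Gwangju : Gwangju Institute of Science and Technology, 2008.2
Call number
TM 621.382 -8-303
Physical description
xiii, 63 p. ; 30 cm
Location
Electronic resources
Control number
KDMT1200800957
Notes
Thesis (Master's) -- Gwangju Institute of Science and Technology, Information and Mechatronics, 2008.2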

Table of contents

title page

Abstract

Abstract (in Korean)

Acknowledgements

Contents

Chapter 1. Introduction

1.1. Speech Recognition

1.2. Non-native Speech Recognition

1.3. Related works

1.4. Thesis Organization

Chapter 2. ASR System: An Overview

2.1. Feature Extraction

2.2. Stochastic Modeling of Speech

2.2.1. Hidden Markov Model

2.2.2. Decoding algorithm - Viterbi

2.3. Acoustic Model

2.4. Pronunciation Model

2.5. Language Model

2.6. Experiments and Results

2.6.1. Speech database

2.6.2. Baseline ASR system

2.6.3. Performance evaluation of the baseline ASR system

Chapter 3. Pronunciation Model Adaptation

3.1. The state-of-the-art in pronunciation adaptation method

3.2. Pronunciation adaptation for non-native speech

3.2.1. Phoneme recognition and alignment sequence

3.2.2. Deriving rules using a decision tree and adapting a dictionary

3.3. Example of pronunciation modeling and optimization of a dictionary

3.3.1. Phoneme recognition and alignment sequence for native and non-native speech

3.3.2. Deriving rules using a decision tree and adapting a dictionary

3.4. Experiments and Results

3.5. Discussion

Chapter 4. Confusability Reduction of Multiple Pronunciation Dictionary

4.1. Confusability measure

4.1.1. Levenshtein distance

4.1.2. Modified Levenshtein distance

4.2. Example of confusability reduction

4.3. Experiments and Results

4.4. Discussion

Chapter 5. Combined Method

5.1. Decomposition of pronunciation variability for non-native speech

5.1.1. Data-driven pronunciation variability analysis

5.1.2. Context-independent and context-dependent pronunciation variability

5.2. Combination of acoustic and pronunciation model adaptation for non-native speech

5.2.1. Acoustic model adaptation

5.2.2. Combined method

5.3. Experiments and Results

Chapter 6. Conclusion and Future Work

6.1. Conclusion

6.2. Future work

References

Table 2.1: List of Korean phonemes for native and non-native ASR.

Table 2.2: Comparison of the average word error rates (%) of the baseline ASR system using the dictionaries obtained by canonical (CC_Dict), knowledge-based (KB_Dict), and hand-labeled (HL_Dict) transcriptions.

Table 3.1: Example of three reference sequences obtained by canonical, knowledge-based, and hand-labeled transcriptions, and an alternative phonetic sequence after recognizing a Korean utterance: "그래서 여러 가지로 의미가 깊은 달이기 때문입니다," which in English means "This is because this month has several deep meanings."

Table 3.2: The rule pattern is obtained using Eq. (3.1) for the sentence in Table 3.1.

Table 3.3: Comparison of the average word error rate (%) of the non-native and native ASR systems employing the dictionaries adapted by either non-native rules or native rules.

Table 3.4: Comparison of the average word error rate (%) of the non-native and native ASR systems employing the dictionaries adapted by the combination of non-native rules and native rules.

Table 4.1: Example of confusability measure (CM) scores for all the pronunciation variations obtained by the indirect data-driven method, where a Korean word "멍해" meaning "stupid" is transcribed as /m v N h E/.

Table 4.2: Performance evaluation of an ASR system with a) the baseline dictionary, b) a multiple dictionary prior to reduction, and c) optimized multiple pronunciation dictionary by the proposed confusability reduction method.

Table 5.1: Comparison of the WERs (%) of the baseline ASR system, an ASR system with adapted acoustic models (adapted-AM), an ASR system with adapted pronunciation model with the baseline acoustic models (adapted-PM), and an ASR system...

Figure 1.1: The speech chain

Figure 1.2: The motivation of handling non-native speech recognition.

Figure 1.3: Three major approaches of handling non-native speech for ASR.

Figure 2.1: The overall structure of the construction for the continuous speech recognition system.

Figure 2.2: An example of left-to-right HMM model

Figure 2.3: The Viterbi algorithm

Figure 2.4: An example of pronunciation models for the word "학교": a) single pronunciation model and b) multiple pronunciation model.

Figure 3.1: Procedure for the proposed pronunciation variation modeling method based on an indirect data-driven approach applied to native and non-native speech.

Figure 3.2: Example of decision tree building to derive pronunciation variation rules for a phone 'k.'

Figure 4.1: Comparison of the average WER (%) of the non-native ASR systems using the multiple pronunciation dictionary optimized (a) by the Levenshtein distance and (b) by the modified Levenshtein distance according to different CM thresholds.

Figure 5.1: The procedure of the proposed combination adaptation method.

Figure 5.2: An example of a) a decision tree for the phoneme /p/ and b) a decision tree for the phoneme /o/ and /v/ for acoustic model adaptation.

Figure 6.1: The summary of evaluations for proposed methods.
