Speech enhancement : theory and practice / Philipos C. Loizou.

Van Pelt Library TK7882.S65 L65 2007 1 v. + CD-ROM
Format:
Book
Author/Creator:
Loizou, Philipos C.
Contributor:
Louis A. Duhring Fund.
Series:
Signal processing and communications ; 30.
Language:
English
Subjects (All):
Speech processing systems.
Signal processing--Digital techniques.
Signal processing.
Image processing--Digital techniques.
Image processing.
Physical Description:
608 pages : illustrations ; 24 cm + 1 CD-ROM (4 3/4 in.)
Place of Publication:
Boca Raton : CRC Press, [2007]
Summary:
The first book to provide comprehensive and up-to-date coverage of all major speech enhancement algorithms proposed in the last two decades, Speech Enhancement: Theory and Practice is a valuable resource for experts and newcomers in the field. The book covers traditional speech enhancement algorithms, such as spectral subtraction and Wiener filtering, as well as state-of-the-art algorithms, including minimum mean-squared error algorithms that incorporate signal-presence uncertainty and subspace algorithms that incorporate psychoacoustic models. The coverage includes objective and subjective measures used to evaluate speech quality and intelligibility.
Providing clear and concise coverage of the subject, the author brings together a large body of knowledge about how human listeners compensate for acoustic noise in noisy environments. This book is a valuable resource not only for engineers who want to implement the latest speech enhancement algorithms but also for speech practitioners who want to incorporate some of these algorithms into hearing aid applications to improve speech intelligibility and quality.
Features:
Supplies up-to-date coverage of all major noise suppression algorithms.
Provides an understanding of the limitations and potential of existing enhancement algorithms.
Covers the fundamentals needed to understand speech enhancement algorithms.
Discusses all major enhancement algorithms as well as noise estimation algorithms.
Presents a description of the evaluation measures used to assess the performance of enhancement algorithms.
Elucidates the evaluation results obtained from a comparison between several algorithms in terms of speech quality and intelligibility.
Includes MATLAB® code for the implementation of major speech enhancement algorithms.
Contents:
1.1 Understanding the Enemy: Noise
1.1.1 Noise Sources
1.1.2 Noise and Speech Levels in Various Environments
1.2 Classes of Speech Enhancement Algorithms
1.3 Book Organization
Chapter 2 Discrete-Time Signal Processing and Short-Time Fourier Analysis
2.1 Discrete-Time Signals
2.2 Linear Time-Invariant Discrete-Time Systems
2.2.1 Difference Equations
2.2.2 Linear Convolution
2.3 The z-Transform
2.3.1 Properties
2.3.2 The z-Domain Transfer Function
2.4 Discrete-Time Fourier Transform
2.4.1 DTFT Properties
2.4.2 Discrete Fourier Transform
2.4.3 Windowing
2.5 Short-Time Fourier Transform
2.5.2 Interpretations of the STFT
2.5.3 Sampling the STFT in Time and Frequency
2.5.4 Short-Time Synthesis of Speech
2.5.4.1 Filterbank Summation for Short-Time Synthesis of Speech
2.5.4.2 Overlap-and-Add Method for Short-Time Synthesis
2.6 Spectrographic Analysis of Speech Signals
Chapter 3 Speech Production and Perception
3.1 The Speech Signal
3.2 The Speech Production Process
3.2.1 Lungs
3.2.2 Larynx and Vocal Folds
3.2.3 Vocal Tract
3.3 Engineering Model of Speech Production
3.4 Classes of Speech Sounds
3.5 Acoustic Cues in Speech Perception
3.5.1 Vowels and Diphthongs
3.5.2 Semivowels
3.5.3 Nasals
3.5.4 Stops
3.5.5 Fricatives
Chapter 4 Noise Compensation by Human Listeners
4.1 Intelligibility of Speech in Multiple-Talker Conditions
4.1.1 Effect of Masker's Spectral/Temporal Characteristics and Number of Talkers: Monaural Hearing
4.1.2 Effect of Source Spatial Location: Binaural Hearing
4.2 Acoustic Properties of Speech Contributing to Robustness
4.2.1 Shape of the Speech Spectrum
4.2.2 Spectral Peaks
4.2.3 Periodicity
4.2.4 Rapid Spectral Changes Signaling Consonants
4.3 Perceptual Strategies for Listening in Noise
4.3.1 Auditory Streaming
4.3.2 Listening in the Gaps and Glimpsing
4.3.3 Use of F0 Differences
4.3.4 Use of Linguistic Knowledge
4.3.5 Use of Spatial and Visual Cues
Part 2 Algorithms
Chapter 5 Spectral-Subtractive Algorithms
5.1 Basic Principles of Spectral Subtraction
5.2 A Geometric View of Spectral Subtraction
5.2.1 Upper Bounds on the Difference Between the Noisy and Clean Signals' Phases
5.2.2 Alternate Spectral-Subtractive Rules and Theoretical Limits
5.3 Shortcomings of the Spectral Subtraction Method
5.4 Spectral Subtraction Using Oversubtraction
5.5 Nonlinear Spectral Subtraction
5.6 Multiband Spectral Subtraction
5.7 MMSE Spectral Subtraction Algorithm
5.8 Extended Spectral Subtraction
5.9 Spectral Subtraction Using Adaptive Gain Averaging
5.10 Selective Spectral Subtraction
5.11 Spectral Subtraction Based on Perceptual Properties
5.12 Performance of Spectral Subtraction Algorithms
Chapter 6 Wiener Filtering
6.2 Wiener Filters in the Time Domain
6.3 Wiener Filters in the Frequency Domain
6.4 Wiener Filters and Linear Prediction
6.5 Wiener Filters for Noise Reduction
6.5.1 Square-Root Wiener Filter
6.5.2 Parametric Wiener Filters
6.6 Iterative Wiener Filtering
6.6.1 Mathematical Speech Production Model
6.6.2 Statistical Parameter Estimation of the All-Pole Model in Noise
6.7 Imposing Constraints on Iterative Wiener Filtering
6.7.1 Across-Time Spectral Constraints
6.7.2 Across-Iterations Constraints
6.8 Constrained Iterative Wiener Filtering
6.9 Constrained Wiener Filtering
6.9.1 Mathematical Definitions of Speech and Noise Distortions
6.9.2 Limiting the Noise Distortion Level
6.10 Estimating the Wiener Gain Function
6.11 Incorporating Psychoacoustic Constraints in Wiener Filtering
6.11.1 Shaping the Noise Distortion in the Frequency Domain
6.11.2 Using Masking Thresholds as Constraints
6.12 Codebook-Driven Wiener Filtering
6.13 Audible Noise Suppression Algorithm
Chapter 7 Statistical-Model-Based Methods
7.1 Maximum-Likelihood Estimators
7.2 Bayesian Estimators
7.3 MMSE Estimator
7.3.1 MMSE Magnitude Estimator
7.3.2 MMSE Complex Exponential Estimator
7.3.3 Estimating the A Priori SNR
7.3.3.1 Maximum-Likelihood Method
7.3.3.2 Decision-Directed Approach
7.4 Improvements to the Decision-Directed Approach
7.4.1 Reducing the Bias
7.4.2 Improving the Adaptation Speed
7.5 Implementation and Evaluation of the MMSE Estimator
7.6 Elimination of Musical Noise
7.7 Log-MMSE Estimator
7.8 MMSE Estimation of the pth-Power Spectrum
7.9 MMSE Estimators Based on Non-Gaussian Distributions
7.10 Maximum A Posteriori (MAP) Estimators
7.11 General Bayesian Estimators
7.12 Perceptually Motivated Bayesian Estimators
7.12.1 Psychoacoustically Motivated Distortion Measure
7.12.2 Weighted Euclidean Distortion Measure
7.12.3 Itakura-Saito Measure
7.12.4 Cosh Measure
7.12.5 Weighted Likelihood Ratio
7.12.6 Modified IS Distortion Measure
7.13 Incorporating Speech Absence Probability in Speech Enhancement
7.13.1 Incorporating Speech-Presence Uncertainty in Maximum-Likelihood Estimators
7.13.2 Incorporating Speech-Presence Uncertainty in MMSE Estimators
7.13.3 Incorporating Speech-Presence Uncertainty in Log-MMSE Estimators
7.13.4 Implementation Issues Regarding A Priori SNR Estimation
7.14 Methods for Estimating the A Priori Probability of Speech Absence
Chapter 8 Subspace Algorithms
8.1.2 Projections
8.1.3 Low-Rank Modeling
8.2 Using SVD for Noise Reduction: Theory
8.2.1 SVD Analysis of "Noisy" Matrices
8.2.2 Least-Squares and Minimum-Variance Estimates of the Signal Matrix
8.3 SVD-Based Algorithms: White Noise
8.3.1 SVD Synthesis of Speech
8.3.2 Determining the Effective Rank
8.3.4 Noise Reduction Algorithm
8.4 SVD-Based Algorithms: Colored Noise
8.5 SVD-Based Methods: A Unified View
8.6 EVD-Based Methods: White Noise
8.6.1 Eigenvalue Analysis of "Noisy" Matrices
8.6.2 Subspace Methods Based on Linear Estimators
8.6.2.1 Linear Minimum Mean-Square Estimator (LMMSE)
8.6.2.2 Time-Domain-Constrained Estimator
8.6.2.3 Spectral-Domain-Constrained Estimator
8.6.3 Implementation
8.6.3.1 Covariance Estimation
8.6.3.2 Estimating the Lagrange Multiplier
8.6.3.3 Estimating the Signal Subspace Dimension
8.7 EVD-Based Methods: Colored Noise
8.7.1 Prewhitening Approach
8.7.2 Signal/Noise KLT-Based Method
8.7.3 Adaptive KLT Approach
8.7.4 Subspace Approach with Embedded Prewhitening
8.7.4.1 Time-Domain-Constrained Estimator
8.7.4.2 Spectrum-Domain-Constrained Estimator
8.7.4.3 Implementation
8.7.4.4 Relationship Between Subspace Estimators and Prewhitening
8.8 EVD-Based Methods: A Unified View
8.9 Perceptually Motivated Subspace Algorithms
8.9.1 Fourier to Eigen-Domain Relationship
8.9.2 Incorporating Psychoacoustic Model Constraints
8.9.3 Incorporating Auditory Masking Constraints
8.10 Subspace-Tracking Algorithms
8.10.1 Block Algorithms
8.10.2 Recursive Algorithms
8.10.2.1 Modified Eigenvalue Problem Algorithms
8.10.2.2 Adaptive Algorithms
8.10.3 Using Subspace-Tracking Algorithms in Speech Enhancement
Chapter 9 Noise Estimation Algorithms
9.1 Voice Activity Detection vs. Noise Estimation
9.3 Minimal-Tracking Algorithms
9.3.1 Minimum Statistics (MS) Noise Estimation
9.3.1.2 Derivation of the Bias Factor
9.3.1.3 Derivation of Optimal Time- and Frequency-Dependent Smoothing Factor
9.3.1.4 Searching for the Minimum
9.3.1.5 Minimum Statistics Algorithm
9.3.2 Continuous Spectral Minimum Tracking
9.4 Time-Recursive Averaging Algorithms for Noise Estimation
9.4.1 SNR-Dependent Recursive Averaging
9.4.2 Weighted Spectral Averaging
9.4.3 Recursive Averaging Algorithms Based on Signal-Presence Uncertainty
9.4.3.1 Likelihood Ratio Approach
9.4.3.2 Minima-Controlled Recursive Averaging (MCRA) Algorithms
9.5 Histogram-Based Techniques
9.6 Other Noise Estimation Algorithms
9.7 Objective Comparison of Noise Estimation Algorithms
Part 3 Evaluation
Chapter 10 Evaluating Performance of Speech Enhancement Algorithms
10.1 Quality vs. Intelligibility
10.2 Evaluating Intelligibility of Processed Speech
10.2.1 Nonsense Syllable Tests
10.2.2 Word Tests
10.2.2.1 Phonetically Balanced Word Tests
10.2.2.2 Rhyming Word Tests
10.2.3 Sentence Tests
10.2.4 Measuring Speech Intelligibility
10.2.4.1 Speech Reception Threshold
10.2.4.2 Using Statistical Tests to Assess Significant Differences: Recommended Practice
10.3 Evaluating Quality of Processed Speech
10.3.1 Relative Preference Methods
10.3.2 Absolute Category Rating Methods
10.3.2.1 Mean Opinion Scores
10.3.2.2 Diagnostic Acceptability Measure
10.3.2.3 The ITU-T P.835 Standard
10.4 Evaluating Reliability of Quality Judgments: Recommended Practice
10.4.1 Intrarater Reliability Measures
10.4.2 Interrater Reliability Measures
10.5 Objective Quality Measures
10.5.1 Segmental SNR Measures: Time and Frequency
10.5.2 Spectral Distance Measures Based on LPC
10.5.3 Perceptually Motivated Measures
10.5.3.1 Weighted Spectral Slope (WSS) Distance Measure
10.5.3.2 Bark Distortion Measures
10.5.3.3 Perceptual Evaluation of Speech Quality (PESQ) Measure
10.5.4 Composite Measures
10.6 Nonintrusive Objective Quality Measures
10.7 Figures of Merit of Objective Quality Measures
10.8 Challenges and Future Directions in Objective Quality Evaluation
Chapter 11 Comparison of Speech Enhancement Algorithms
11.1 NOIZEUS: A Noisy Speech Corpus for Quality Evaluation of Speech Enhancement Algorithms
11.2 Comparison of Speech Enhancement Algorithms: Quality
11.2.1 Quality Evaluation: Procedure
11.2.2 Subjective Quality Evaluation: Results
11.2.3 Within-Class Algorithm Comparisons
11.2.4 Across-Class Algorithm Comparisons
11.2.5 Comparisons in Reference to Noisy Speech
11.2.6 Contribution of Speech and Noise Distortion to Judgment of Overall Quality
11.3 Comparison of Speech Enhancement Algorithms: Intelligibility
11.3.1 Listening Tests: Procedure
11.3.2 Intelligibility Evaluation: Results
11.3.3 Intelligibility Comparison Among Algorithms
11.3.4 Intelligibility Comparison Against Noisy Speech
11.4 Comparison of Objective Measures for Quality Evaluation
11.4.1 Objective Measures
11.4.2 Correlations of Objective Measures with Quality
Appendix A Special Functions and Integrals
Appendix B Derivation of the MMSE Estimator
Appendix C Speech Databases and MATLAB Code
Notes:
Includes bibliographical references and index.
Local Notes:
Acquired for the Penn Libraries with assistance from the Louis A. Duhring Fund.
ISBN:
9780849350320
0849350328
OCLC:
76898042
