SCOPUS Information Retrieval Platform

Digital Signal Processing: A Review Journal

Volume 17, Issue 3, 2007, Pages 578-616

Short-time phase spectrum in speech processing: A review and some experimental results

(2) Alsteris, Leigh D.ᵃ; Paliwal, Kuldip K.ᵃ

Author keywords

Automatic speech recognition; Magnitude spectrum; Overlap add procedure; Phase spectrum; Short time Fourier transform; Speech perception

Indexed keywords

FOURIER TRANSFORMS; INTELLIGENT CONTROL; SPECTRUM ANALYSIS; SPEECH RECOGNITION; SPEECH SYNTHESIS;

AUTOMATIC SPEECH RECOGNITION; MAGNITUDE SPECTRUM; OVERLAP ADD PROCEDURE; PHASE SPECTRUM; SHORT TIME FOURIER TRANSFORM; SPEECH PERCEPTION;

SPEECH PROCESSING;

EID: 34047258851 PISSN: 10512004 EISSN: None Source Type: Journal
DOI: 10.1016/j.dsp.2006.06.007 Document Type: Article

Times cited: 112

References (71)

1
- 0018642851
- Enhancement and bandwidth compression of noisy speech
- Lim J.S., and Oppenheim A.V. Enhancement and bandwidth compression of noisy speech. Proc. IEEE 67 12 (1979) 1586-1604
- (1979) Proc. IEEE , vol.67 , Issue.12 , pp. 1586-1604
- Lim, J.S.¹ Oppenheim, A.V.²

2
- 0020167383
- The unimportance of phase in speech enhancement
- Wang D.L., and Lim J.S. The unimportance of phase in speech enhancement. IEEE Trans. Acoust. Speech Signal Process. ASSP-30 4 (1982) 679-681
- (1982) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-30 , Issue.4 , pp. 679-681
- Wang, D.L.¹ Lim, J.S.²

3
- 0027659197
- Signal modeling techniques in speech recognition
- Picone J.W. Signal modeling techniques in speech recognition. Proc. IEEE 81 9 (1993) 1215-1247
- (1993) Proc. IEEE , vol.81 , Issue.9 , pp. 1215-1247
- Picone, J.W.¹

4
- 0016555855
- Models of hearing
- Schroeder M.R. Models of hearing. Proc. IEEE 63 (1975) 1332-1350
- (1975) Proc. IEEE , vol.63 , pp. 1332-1350
- Schroeder, M.R.¹

5
- 0019569248
- The importance of phase in signals
- Oppenheim A.V., and Lim J.S. The importance of phase in signals. Proc. IEEE 69 (1981) 529-541
- (1981) Proc. IEEE , vol.69 , pp. 529-541
- Oppenheim, A.V.¹ Lim, J.S.²

6
- 0031220487
- Effects of phase on the perception of intervocalic stop consonants
- Liu L., He J., and Palm G. Effects of phase on the perception of intervocalic stop consonants. Speech Commun. 22 4 (1997) 403-417
- (1997) Speech Commun. , vol.22 , Issue.4 , pp. 403-417
- Liu, L.¹ He, J.² Palm, G.³

7
- 84944812390
- Phase vocoder
- Flanagan J.L., and Golden R.M. Phase vocoder. Bell Syst. Tech. J. 45 (1966) 1493-1509
- (1966) Bell Syst. Tech. J. , vol.45 , pp. 1493-1509
- Flanagan, J.L.¹ Golden, R.M.²

8
- 0016962212
- Implementation of the digital phase vocoder using the fast Fourier transform
- Portnoff M.R. Implementation of the digital phase vocoder using the fast Fourier transform. IEEE Trans. Acoust. Speech Signal Process. ASSP-24 3 (1976) 243-248
- (1976) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-24 , Issue.3 , pp. 243-248
- Portnoff, M.R.¹

9
- 0032638660
- H. Pobloth, W.B. Kleijn, On phase perception in speech, in: Proc. International Conf. Acoustics, Speech, Signal Processing, 1999, pp. 29-32

10
- 0033676787
- D.S. Kim, Perceptual phase redundancy in speech, in: Proc. International Conf. Acoustics, Speech, Signal Processing, 2000, pp. 1383-1386

11
- 0036286580
- The effect of group delay spectrum on timbre
- Banno H., Takeda K., and Itakura F. The effect of group delay spectrum on timbre. Acoust. Sci. Tech. 23 (2002) 1-9
- (2002) Acoust. Sci. Tech. , vol.23 , pp. 1-9
- Banno, H.¹ Takeda, K.² Itakura, F.³

12
- 0032673049
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous frequency based F0 extraction: Possible role of a repetitive structure in sounds
- Kawahara H., Masuda-Katsuse I., and de Cheveigné A. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous frequency based F0 extraction: Possible role of a repetitive structure in sounds. Speech Commun. 27 (1999) 187-207
- (1999) Speech Commun. , vol.27 , pp. 187-207
- Kawahara, H.¹ Masuda-Katsuse, I.² de Cheveigné, A.³

13
- 0024965810
- Formant extraction from phase using weighted group delay function
- Murthy H.A., Murthy K.V.M., and Yegnanarayana B. Formant extraction from phase using weighted group delay function. Electron. Lett. 25 23 (1989) 1609-1611
- (1989) Electron. Lett. , vol.25 , Issue.23 , pp. 1609-1611
- Murthy, H.A.¹ Murthy, K.V.M.² Yegnanarayana, B.³

14
- 0026923568
- Significance of group delay functions in spectrum estimation
- Yegnanarayana B., and Murthy H.A. Significance of group delay functions in spectrum estimation. IEEE Trans. Signal Process. 40 9 (1992) 2281-2289
- (1992) IEEE Trans. Signal Process. , vol.40 , Issue.9 , pp. 2281-2289
- Yegnanarayana, B.¹ Murthy, H.A.²

15
- 84980066893
- Über die Definition des Tones, nebst daran geknüpfter Theorie der Sirene und ähnlicher tonbildender Vorrichtungen
- Ohm G.S. Über die Definition des Tones, nebst daran geknüpfter Theorie der Sirene und ähnlicher tonbildender Vorrichtungen. Ann. Phys. Chem. 59 (1843) 513-565
- (1843) Ann. Phys. Chem. , vol.59 , pp. 513-565
- Ohm, G.S.¹

16
- 0004220068
- (English translation by A.J. Ellis), Longmans, Green and Co., London (original work published 1875)
- von Helmholtz H.L.F. On the Sensations of Tone. (English translation by A.J. Ellis) (1912), Longmans, Green and Co., London (original work published 1875)
- (1912) On the Sensations of Tone
- von Helmholtz, H.L.F.¹

17
- 0006927594
- A note on phase distortion in hearing
- de Boer E. A note on phase distortion in hearing. Acustica 11 (1961) 182
- (1961) Acustica , vol.11 , pp. 182
- de Boer, E.¹

18
- 13544259544
- On the usefulness of STFT phase spectrum in human listening tests
- Paliwal K.K., and Alsteris L.D. On the usefulness of STFT phase spectrum in human listening tests. Speech Commun. 45 2 (2005) 153-170
- (2005) Speech Commun. , vol.45 , Issue.2 , pp. 153-170
- Paliwal, K.K.¹ Alsteris, L.D.²

19
- 0141480080
- H.A. Murthy, V. Gadde, The modified group delay function and its application to phoneme recognition, in: Proc. International Conf. Acoustics, Speech, Signal Processing, April 2003, pp. 68-71

20
- 0017552006
- A unified approach to short-time Fourier analysis and synthesis
- Allen J.B., and Rabiner L.R. A unified approach to short-time Fourier analysis and synthesis. Proc. IEEE 65 11 (1977) 1558-1564
- (1977) Proc. IEEE , vol.65 , Issue.11 , pp. 1558-1564
- Allen, J.B.¹ Rabiner, L.R.²

21
- 0017626192
- Short-term spectral analysis, synthesis, and modification by discrete Fourier transform
- Allen J.B. Short-term spectral analysis, synthesis, and modification by discrete Fourier transform. IEEE Trans. Acoust. Speech Signal Process. ASSP-25 3 (1977) 235-238
- (1977) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-25 , Issue.3 , pp. 235-238
- Allen, J.B.¹

22
- 0018983179
- A weighted overlap-add method of short-time Fourier analysis/synthesis
- Crochiere R.E. A weighted overlap-add method of short-time Fourier analysis/synthesis. IEEE Trans. Acoust. Speech Signal Process. ASSP-28 1 (1980) 99-102
- (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-28 , Issue.1 , pp. 99-102
- Crochiere, R.E.¹

23
- 0021407831
- Signal estimation from modified short-time Fourier transform
- Griffin D.W., and Lim J.S. Signal estimation from modified short-time Fourier transform. IEEE Trans. Acoust. Speech Signal Process. ASSP-32 2 (1984) 236-243
- (1984) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-32 , Issue.2 , pp. 236-243
- Griffin, D.W.¹ Lim, J.S.²

24
- 0003440227
- Prentice Hall, Englewood Cliffs, NJ
- Lim J.S., and Oppenheim A.V. Advanced Topics in Signal Processing (1988), Prentice Hall, Englewood Cliffs, NJ
- (1988) Advanced Topics in Signal Processing
- Lim, J.S.¹ Oppenheim, A.V.²

25
- 0018298311
- M.R. Portnoff, Magnitude-phase relationships for short-time Fourier transforms based on Gaussian analysis windows, in: Proc. International Conf. Acoustics, Speech, Signal Processing, April 1979, pp. 186-189

26
- 0018982701
- Time-frequency representation of digital signals and systems based on short-time Fourier analysis
- Portnoff M.R. Time-frequency representation of digital signals and systems based on short-time Fourier analysis. IEEE Trans. Acoust. Speech Signal Process. ASSP-28 1 (1980) 55-69
- (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-28 , Issue.1 , pp. 55-69
- Portnoff, M.R.¹

27
- 0019582149
- Short-time Fourier analysis of sampled speech
- Portnoff M.R. Short-time Fourier analysis of sampled speech. IEEE Trans. Acoust. Speech Signal Process. ASSP-29 3 (1981) 364-373
- (1981) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-29 , Issue.3 , pp. 364-373
- Portnoff, M.R.¹

28
- 0019582545
- Time-scale modification of speech based on short-time Fourier analysis
- Portnoff M.R. Time-scale modification of speech based on short-time Fourier analysis. IEEE Trans. Acoust. Speech Signal Process. ASSP-29 3 (1981) 374-390
- (1981) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-29 , Issue.3 , pp. 374-390
- Portnoff, M.R.¹

29
- 0003927842
- Prentice Hall, Upper Saddle River, NJ
- Quatieri T.F. Discrete-Time Speech Signal Processing (2002), Prentice Hall, Upper Saddle River, NJ
- (2002) Discrete-Time Speech Signal Processing
- Quatieri, T.F.¹

30
- 4544282258
- Prentice Hall, Englewood Cliffs, NJ
- Rabiner L.R., and Schafer R.W. Digital Processing of Speech Signals (1978), Prentice Hall, Englewood Cliffs, NJ
- (1978) Digital Processing of Speech Signals
- Rabiner, L.R.¹ Schafer, R.W.²

31
- 0015699024
- Design and simulation of a speech analysis-synthesis system based on short-time Fourier analysis
- Schafer R.W., and Rabiner L.R. Design and simulation of a speech analysis-synthesis system based on short-time Fourier analysis. IEEE Trans. Audio Electroacoust. AU-21 (1973) 165-174
- (1973) IEEE Trans. Audio Electroacoust. , vol.AU-21 , pp. 165-174
- Schafer, R.W.¹ Rabiner, L.R.²

32
- 0003513556
- Prentice Hall, Upper Saddle River, NJ
- Oppenheim A.V., and Schafer R.W. Discrete-Time Signal Processing. second ed. (1999), Prentice Hall, Upper Saddle River, NJ
- (1999) Discrete-Time Signal Processing. second ed.
- Oppenheim, A.V.¹ Schafer, R.W.²

33
- 0003793552
- Prentice Hall, Englewood Cliffs, NJ
- Oppenheim A.V., and Schafer R.W. Digital Signal Processing (1975), Prentice Hall, Englewood Cliffs, NJ
- (1975) Digital Signal Processing
- Oppenheim, A.V.¹ Schafer, R.W.²

34
- 0017480265
- A new phase unwrapping algorithm
- Tribolet J.M. A new phase unwrapping algorithm. IEEE Trans. Acoust. Speech Signal Process. ASSP-25 2 (1977) 170-177
- (1977) IEEE Trans. Acoust. Speech Signal Process. , vol.ASSP-25 , Issue.2 , pp. 170-177
- Tribolet, J.M.¹

35
- 0003718128
- Wiley, New York
- Ghiglia D.C., and Pritt M.D. Two-Dimensional Phase Unwrapping. Theory, Algorithms and Software (1998), Wiley, New York
- (1998) Two-Dimensional Phase Unwrapping. Theory, Algorithms and Software
- Ghiglia, D.C.¹ Pritt, M.D.²

36
- 0037333359
- Using the matrix pencil method to solve phase unwrapping
- Nico G., and Fortuny J. Using the matrix pencil method to solve phase unwrapping. IEEE Trans. Signal Process. 51 3 (2003)
- (2003) IEEE Trans. Signal Process. , vol.51 , Issue.3
- Nico, G.¹ Fortuny, J.²

37
- 84866792444
- L.D. Alsteris, K.K. Paliwal, ASR on speech reconstructed from short-time Fourier phase spectra, in: Proc. International Conf. Spoken Language Processing, October 2004

38
- 0035278964
- Time-frequency distributions for automatic speech recognition
- Potamianos A., and Maragos P. Time-frequency distributions for automatic speech recognition. IEEE Trans. Speech Audio Process. 9 (2001) 196-200
- (2001) IEEE Trans. Speech Audio Process. , vol.9 , pp. 196-200
- Potamianos, A.¹ Maragos, P.²

39
- 85009224932
- D. Dimitriadis, P. Maragos, Robust energy demodulation based on continuous models with application to speech recognition, in: Proc. European Conf. Speech Communication and Technology, September 2003, pp. 2853-2856

40
- 85009192384
- K.K. Paliwal, B.S. Atal, Frequency-related representation of speech, in: Proc. European Conf. Speech Communication and Technology, September 2003, pp. 65-68

41
- 85009230349
- Y. Wang, J. Hansen, G.K. Allu, R. Kumaresan, Average instantaneous frequency and average log envelopes for ASR with the aurora 2 database, in: Proc. European Conf. Speech Communication and Technology, September 2003, pp. 25-28

42
- 0028997045
- T. Abe, T. Kobayashi, S. Imai, Harmonics tracking and pitch extraction based on instantaneous frequency, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 1995, pp. 756-759

43
- 0022907822
- F.J. Charpentier, Pitch detection using the short-term phase spectrum, in: Proc. International Conf. Acoustics, Speech, Signal Processing, April 1986, pp. 113-116

44
- 85009168546
- 0 estimation, in: Proc. European Conf. Speech Communication and Technology, September 2003, pp. 2313-2316

45
- 0029375490
- Determination of instants of significant excitation in speech using group delay function
- Smits R., and Yegnanarayana B. Determination of instants of significant excitation in speech using group delay function. IEEE Trans. Speech Audio Process. 3 5 (1995) 325-333
- (1995) IEEE Trans. Speech Audio Process. , vol.3 , Issue.5 , pp. 325-333
- Smits, R.¹ Yegnanarayana, B.²

46
- 0000668614
- Robustness of group-delay based method for extraction of significant instants of excitation from speech signals
- Murthy P.S., and Yegnanarayana B. Robustness of group-delay based method for extraction of significant instants of excitation from speech signals. IEEE Trans. Speech Audio Process. 7 6 (1999) 609-619
- (1999) IEEE Trans. Speech Audio Process. , vol.7 , Issue.6 , pp. 609-619
- Murthy, P.S.¹ Yegnanarayana, B.²

47
- 0024879901
- H.A. Murthy, K.V.M. Murthy, B. Yegnanarayana, Formant extraction from Fourier transform phase, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 1989, pp. 484-487

48
- 0024900419
- G. Duncan, B. Yegnanarayana, H.A. Murthy, A nonparametric method of formant estimation using group delay spectra, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 1989, pp. 572-575

49
- 0030008906
- Speech formant frequency and bandwidth tracking using multiband energy demodulation
- Potamianos A., and Maragos P. Speech formant frequency and bandwidth tracking using multiband energy demodulation. J. Acoust. Soc. Amer. 99 6 (1996) 3795-3806
- (1996) J. Acoust. Soc. Amer. , vol.99 , Issue.6 , pp. 3795-3806
- Potamianos, A.¹ Maragos, P.²

50
- 0022236938
- D.H. Friedman, Instantaneous-frequency distribution vs time: An interpretation of the phase structure of speech, in: Proc. International Conf. Acoustics, Speech, Signal Processing, March 1985, pp. 1121-1124

51
- 0017097478
- A comparative performance study of several pitch detection algorithms
- Rabiner L.R., Cheng M.J., Rosenberg A.E., and McGonegal C.A. A comparative performance study of several pitch detection algorithms. IEEE Trans. Acoust. Speech Signal Process. 24 (1976) 399-418
- (1976) IEEE Trans. Acoust. Speech Signal Process. , vol.24 , pp. 399-418
- Rabiner, L.R.¹ Cheng, M.J.² Rosenberg, A.E.³ McGonegal, C.A.⁴

52
- 0003391579
- Springer-Verlag, Berlin
- Hess W. Pitch Determination of Speech Signals (1983), Springer-Verlag, Berlin
- (1983) Pitch Determination of Speech Signals
- Hess, W.¹

53
- 0035165763
- Cross-spectral methods for processing speech
- Nelson D.J. Cross-spectral methods for processing speech. J. Acoust. Soc. Amer. 110 5 (2001) 2575-2592
- (2001) J. Acoust. Soc. Amer. , vol.110 , Issue.5 , pp. 2575-2592
- Nelson, D.J.¹

54
- 4644266117
- D.J. Nelson, Cross-spectral based formant estimation and alignment, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 2004, pp. 621-624

55
- 0028997020
- B. Yegnanarayana, R. Smits, A robust method for determining instants of major excitations in voiced speech, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 1995, pp. 776-779

56
- 34047271998
- L.D. Alsteris, K.K. Paliwal, Intelligibility of speech from phase spectrum, in: Proc. Microelectronic Engineering Research Conf., November 2003

57
- 4544293686
- L.D. Alsteris, K.K. Paliwal, Importance of window shape for phase-only reconstruction of speech, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 2004, pp. 573-576

58
- 34047273057
- L.D. Alsteris, K.K. Paliwal, Further intelligibility results from human listening tests using the short-time phase spectrum, Speech Commun., in press

59
- 0022079951
- Derivative of phase spectrum of truncated autoregressive signals
- Reddy N.S., and Swamy M.N.S. Derivative of phase spectrum of truncated autoregressive signals. IEEE Trans. Circuits Syst. CAS-32 6 (1985) 2575-2592
- (1985) IEEE Trans. Circuits Syst. , vol.CAS-32 , Issue.6 , pp. 2575-2592
- Reddy, N.S.¹ Swamy, M.N.S.²

60
- 0019053271
- Comparison of parametric representations for mono-syllabic word recognition in continuously spoken utterances
- Davis S.B., and Mermelstein P. Comparison of parametric representations for mono-syllabic word recognition in continuously spoken utterances. IEEE Trans. Acoust. Speech Signal Process. 28 4 (1980) 357-366
- (1980) IEEE Trans. Acoust. Speech Signal Process. , vol.28 , Issue.4 , pp. 357-366
- Davis, S.B.¹ Mermelstein, P.²

61
- 0003822743
- Cambridge University Engineering Department, Cambridge, UK
- Young S. The HTK Book (2001), Cambridge University Engineering Department, Cambridge, UK
- (2001) The HTK Book
- Young, S.¹

62
- 33847165718
- L.D. Alsteris, K.K. Paliwal, Evaluation of the modified group delay feature for isolated word recognition, in: Int. Symposium on Signal Processing and Its Applications, August 2005, pp. 715-718

63
- 85009083403
- B. Bozkurt, B. Doval, C. D'Alessandro, T. Dutoit, Zeros of z-transform (ZZT) decomposition of speech for source-tract separation, in: Proc. International Conf. Spoken Language Processing, October 2004

64
- 34047244425
- B. Bozkurt, B. Doval, C. D'Alessandro, T. Dutoit, Appropriate windowing for group delay analysis and roots of z-transform of speech signals, in: European Signal Processing Conf., 2004

65
- 34047266694
- K.K. Paliwal, Decorrelated and liftered filter-bank energies for robust speech recognition, in: Proc. European Conf. Speech Communication and Technology, September 1999, pp. 85-88

66
- 0004056285
- Prentice Hall, NJ
- Huang X., Acero A., and Hon H. Spoken Language Processing (2001), Prentice Hall, NJ
- (2001) Spoken Language Processing
- Huang, X.¹ Acero, A.² Hon, H.³

67
- 0032097263
- Academic Press, San Diego
- Fukunaga K. Introduction to Statistical Pattern Recognition (1990), Academic Press, San Diego
- (1990) Introduction to Statistical Pattern Recognition
- Fukunaga, K.¹

68
- 0031630638
- G. Rigoll, D. Willett, A NN/HMM hybrid for continuous speech recognition with a discriminant nonlinear feature extraction, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 1998, pp. 9-12

69
- 0033709098
- H. Hermansky, D.P.W. Ellis, S. Sharma, Tandem connectionist feature extraction for conventional HMM systems, in: Proc. International Conf. Acoustics, Speech, Signal Processing, June 2000, pp. 1635-1638

70
- 0034848926
- D.P.W. Ellis, R. Singh, S. Sivadas, Tandem acoustic modeling in large-vocabulary recognition, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 2001, pp. 517-520

71
- 0034856452
- B. Yegnanarayana, K.S. Reddy, S.P. Kishore, Source and system features for speaker recognition using AANN model, in: Proc. International Conf. Acoustics, Speech, Signal Processing, May 2001, pp. 409-412

* This information was analyzed and extracted by KISTI from Elsevier's SCOPUS database.