Ph.D. Thesis View

Student:	Ergün YÜCESOY
Supervisor:	Prof. Dr. Vasif V. NABİYEV
Department:	Computer Engineering
Institution:	Graduate School of Natural and Applied Sciences
University:	Karadeniz Technical University, Turkey

Title of the Thesis:	CLASSIFICATION OF SPEAKERS BASED ON ACOUSTIC AND PROSODIC FEATURES ACCORDING TO AGE AND GENDER GROUPS
Level:	Ph.D.
Acceptance Date:	30/6/2017
Number of Pages:	165
Registration Number:	Di1189

Summary:

This study investigates the determination of a speaker's age and gender. Automatic age and gender recognition systems, which have applications mainly in trade, medicine, and forensics, can be used directly to select a service or as a preliminary step for other recognition systems. However, the speech signal is highly variable, so all factors affecting speech must be considered to realize a successful system. In this study, the feature extraction and classification methods used in speech processing are examined, the performance of age and gender classification systems built with these methods is evaluated, the pros and cons of each system are presented, and the most suitable parameters for these systems, such as model size, speech duration, and feature size, are determined. In addition, commonly used acoustic and prosodic features and parameters obtained from the voice source are examined. Dynamic time warping, vector quantization, the Gaussian mixture model (GMM), the support vector machine, and GMM supervectors are used as classification methods. Moreover, a new system based on score-level fusion of 7 subsystems is proposed, achieving a 5% increase in success rate. The effect on the success rate of channel compensation, implemented with the nuisance attribute projection method, was 1.5%.

Key Words: Age and gender recognition, Acoustic and prosodic features, Gaussian mixture model, Voice source, Score-level fusion, GMM supervector, Support vector machine