Prosodic structure and sentence types by using large speech databases supported by deep learning techniques

Funding institution: Nemzeti Kutatási, Fejlesztési és Innovációs Hivatal (National Research, Development and Innovation Office)

ID: K-135038

Domestic tender, institutional tender

Principal investigator: Katalin Mády

One prerequisite for studying the structure of Hungarian is the availability of a large amount of spontaneous speech data. Manual data processing is time-consuming and expensive. For this reason, we develop models based on deep neural networks to facilitate the automatic processing of speech data now and in the future. Data processing includes automatic speech recognition and the time alignment of annotations with the acoustic signal. In parallel, a prosodic annotation system is being developed to reveal the relevant units of Hungarian prosody and their structure. Based on the corpora, the project has two main research aims: identifying the communicative functions of complex sentence structures and describing Hungarian intonation. The databases and language models developed during the project are freely available for research purposes, thus improving the usability of Hungarian language resources. The work is supported by two computer engineers (Gergely Dobsinszki, Máté Kádár) and two research assistants (Péter Csényi, Flóra Hegyi).

Duration: 2020-2024