Towards Solving Cocktail-Party: The First Method to Build a Realistic Dataset with Ground Truths for Speech Separation
Abstract
Recent years have seen significant progress towards solving the cocktail-party problem. Much attention has been drawn to supervised learning methods trained on synthetic mixture datasets, even though such mixtures are not representative of real-world mixtures. The difficulty of building a realistic dataset has led researchers to use unsupervised learning methods, which can handle realistic mixtures directly. However, the results of unsupervised methods are still unconvincing.
In this paper, a method is introduced to create a realistic dataset with ground truth sources for speech separation. The main problem in designing a realistic dataset is the unavailability of ground truths for the speakers' signals, so a method is suggested to record two speakers simultaneously and obtain the ground truth for each speaker. Our method utilizes a MATLAB function that exploits a full-duplex sound card to record and play back audio files at the same time. We used the TIMIT (Texas Instruments/Massachusetts Institute of Technology) corpus to implement our method and to build the Realistic_TIMIT_2mix dataset. Evaluation is carried out on three datasets, and experiments show that our proposed dataset improved SI-SDR (Scale-Invariant Signal-to-Distortion Ratio) by more than 1.5 dB and PESQ (Perceptual Evaluation of Speech Quality) by approximately 0.5. We also measured performance at different distances between the microphone and the speakers, and found that our method made the learned model more stable when the distance changes.
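For readers unfamiliar with the SI-SDR metric reported above, the following is a minimal sketch of its commonly used definition, written in Python with NumPy. It is not the authors' evaluation code, which the abstract does not describe; the function name `si_sdr`, its parameters, and the stabilizing constant `eps` are illustrative assumptions.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (illustrative sketch).

    `estimate` is a separated signal, `reference` is the ground-truth source.
    Both are 1-D arrays of equal length. `eps` is an assumed small constant
    that avoids division by zero and log of zero.
    """
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)

    # Remove the mean so a constant offset does not affect the score.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()

    # Project the estimate onto the reference to get the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference

    # Everything in the estimate that is not the scaled target is distortion.
    distortion = estimate - target

    ratio = (np.dot(target, target) + eps) / (np.dot(distortion, distortion) + eps)
    return 10.0 * np.log10(ratio)
```

Because the reference is rescaled by the projection coefficient before the ratio is taken, multiplying the estimate by any nonzero gain leaves the score essentially unchanged, which is what makes the metric suitable for comparing recordings made at different levels or distances.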
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Journal author rights
In order for Romanian Journal of Acoustics and Vibration to publish and disseminate research articles, we need publishing rights. This is determined by a publishing agreement between the author and Romanian Journal of Acoustics and Vibration. This agreement deals with the transfer or license of the copyright to Romanian Journal of Acoustics and Vibration, and authors retain significant rights to use and share their own published articles. Romanian Journal of Acoustics and Vibration supports the need for authors to share, disseminate and maximize the impact of their research, and these rights, for Romanian Journal of Acoustics and Vibration proprietary titles, are defined below:
For subscription articles:
Authors transfer copyright to the publisher as part of a journal publishing agreement, but have the right to: share their article for personal use (manuscript version); retain patent, trademark and other intellectual property rights (including research data); receive proper attribution and credit for the published work.
For open access articles:
Authors sign an exclusive license agreement, under which authors hold copyright but license exclusive rights in their article to the publisher. In this case authors have the right to: share their article in the same ways permitted to third parties under the relevant user license; retain patent, trademark and other intellectual property rights (including research data); receive proper attribution and credit for the published work.
Rights granted to Romanian Journal of Acoustics and Vibration
For both subscription and open access articles, published in proprietary titles, Romanian Journal of Acoustics and Vibration is granted the following rights:
- The exclusive right to publish and distribute an article, and to grant rights to others, including for commercial purposes;
- For open access articles, Romanian Journal of Acoustics and Vibration will apply the relevant third party user licence (open access licences) where Romanian Journal of Acoustics and Vibration publishes the article on its online platforms;
- The right to provide the article in all forms and media so the article can be used on the latest technology even after publication;
- The authority to enforce the rights in the article, on behalf of an author, against third parties, for example in the case of plagiarism or copyright infringement.