Sunday, May 10, 2015

Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network

April 29, 2015

Deep Learning Machine Solves the Cocktail Party Problem

Separating a singer’s voice from background music has always been a uniquely human ability.

The cocktail party effect is the ability to focus on a specific human voice while filtering out other voices or background noise. The ease with which humans perform this trick belies the challenge that scientists and engineers have faced in reproducing it synthetically. By and large, humans easily outperform the best automated methods for singling out voices.
A particularly challenging cocktail party problem is in the field of music, where humans can easily concentrate on a singing voice superimposed on a musical background that includes a wide range of instruments. By comparison, machines are poor at this task.
Andrew Simpson at the University of Surrey in the U.K. has used some of the most recent advances associated with deep neural networks to separate human voices from the background in a wide range of songs.

The task of picking out a voice from this mixture is essentially the task of separating the voice’s unique spectrogram from the other spectrograms each time the network scans through the database.
In other words, having learned what a voice sounds like, a deep neural network can use this information to pick out other voices from a mix.
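One common way to frame spectrogram separation is a time-frequency mask that keeps the cells where the voice dominates. The sketch below is only an illustration under that assumption; the text above does not say which formulation the network uses. It computes the mask from known sources, which a trained network would instead have to predict from the mixture alone. The function names, the frame length, and the two test tones standing in for "voice" and "backing" are all hypothetical.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len):
    """Magnitude spectrogram built from non-overlapping frames."""
    return [
        dft_magnitudes(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def ideal_binary_mask(voice_spec, backing_spec):
    """1 where the voice is louder than the backing track, else 0."""
    return [
        [1 if v > b else 0 for v, b in zip(v_row, b_row)]
        for v_row, b_row in zip(voice_spec, backing_spec)
    ]


def apply_mask(mix_spec, mask):
    """Keep only the mixture cells that the mask marks as voice."""
    return [
        [m * keep for m, keep in zip(m_row, k_row)]
        for m_row, k_row in zip(mix_spec, mask)
    ]


# Toy stand-ins (assumptions): a "voice" tone in bin 3, a "backing" tone in bin 10.
FRAME_LEN = 32
N = 4 * FRAME_LEN
voice = [math.sin(2 * math.pi * 3 * t / FRAME_LEN) for t in range(N)]
backing = [0.5 * math.sin(2 * math.pi * 10 * t / FRAME_LEN) for t in range(N)]
mixture = [v + b for v, b in zip(voice, backing)]

mask = ideal_binary_mask(spectrogram(voice, FRAME_LEN), spectrogram(backing, FRAME_LEN))
voice_estimate = apply_mask(spectrogram(mixture, FRAME_LEN), mask)
```

In this toy case the masked spectrogram keeps the energy in the voice bin and zeroes the backing bin. Real songs overlap heavily in time and frequency, which is why the mask has to be learned rather than computed directly.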

Spectral subtraction denoising preprocessing block to improve P300-based brain-computer interfacing

Meena M. BioMedical Engineering; 2014, Vol. 13 Issue 1, p1
The signals acquired in brain-computer interface (BCI) experiments usually involve several complicated sampling, artifact and noise conditions. This mandates the use of several preprocessing strategies to allow the meaningful components of the measured signals to be extracted and passed along to further processing steps. Although present preprocessing methods have succeeded in improving the reliability of BCI, there is still room to boost performance further.
Method: A new preprocessing method for denoising P300-based brain-computer interface data is presented that allows better performance with a lower number of channels and blocks. The new denoising technique is based on a modified version of spectral subtraction denoising and works on each temporal signal channel independently, thus offering seamless integration with existing preprocessing and allowing low channel counts to be used.
Results: The new method is verified using experimental data and compared to the classification results of the same data without denoising and with denoising using the present wavelet-shrinkage-based technique. Enhanced performance in different experiments, as quantitatively assessed using classification block accuracy as well as bit rate estimates, was confirmed.
Conclusion: The new preprocessing method based on spectral subtraction denoising offers superior performance to existing methods and has potential for practical utility as a new standard preprocessing block in BCI signal processing.
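
For readers unfamiliar with the technique, here is a minimal sketch of classic spectral subtraction applied to each channel independently. It is the textbook form, not the modified version the abstract describes, whose details are not given here. The function names and the test signal are assumptions, and a stationary narrowband interference stands in for noise so that the result is easy to check.

```python
import cmath
import math


def dft(x):
    """Plain discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def spectral_subtract(noisy_frame, noise_magnitude):
    """Subtract a noise magnitude estimate per bin, keeping the noisy phase."""
    cleaned = []
    for coeff, noise_mag in zip(dft(noisy_frame), noise_magnitude):
        magnitude = max(abs(coeff) - noise_mag, 0.0)  # floor at zero
        cleaned.append(cmath.rect(magnitude, cmath.phase(coeff)))
    return idft(cleaned)


def denoise_channels(channels, noise_magnitudes):
    """Process every channel on its own, with its own noise estimate."""
    return [spectral_subtract(ch, nm) for ch, nm in zip(channels, noise_magnitudes)]


# Toy data (assumptions): a slow component plus a stationary interference tone.
FRAME = 32
clean = [math.sin(2 * math.pi * 2 * t / FRAME) for t in range(FRAME)]
interference = [0.3 * math.sin(2 * math.pi * 12 * t / FRAME) for t in range(FRAME)]
noisy = [c + i for c, i in zip(clean, interference)]

# Noise estimate taken from a signal-free stretch of the same channel.
noise_magnitude = [abs(c) for c in dft(interference)]

denoised = spectral_subtract(noisy, noise_magnitude)
error_before = max(abs(n - c) for n, c in zip(noisy, clean))
error_after = max(abs(d - c) for d, c in zip(denoised, clean))
both_channels = denoise_channels([noisy, noisy], [noise_magnitude, noise_magnitude])
```

Because each channel is handled separately, a step like this can sit in front of an existing pipeline without depending on how many channels are recorded. With random broadband noise the subtraction is only approximate, since the estimate then matches the noise on average rather than exactly.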
