Posts by Collection

portfolio

Hyperbolic embedding using AudioSet taxonomy

Embedding the AudioSet taxonomy in the Poincaré disk.

Multichannel blind source separation demo

Blind source separation demo using IVA, ILRMA, MNMF, and other methods.

publications

Investigation of Network Architecture for Single-Channel End-to-End Denoising

Published in EUSIPCO, 2021

Empirical Bayesian Independent Deeply Learned Matrix Analysis For Multichannel Audio Source Separation

Published in ICASSP, 2021

This paper proposes an IDLMA extension, empirical Bayesian IDLMA (EB-IDLMA), which implicitly accounts for the reliability of the estimated source power spectrograms when estimating the demixing filters, through the hyperparameters of the prior distribution estimated by the DNN.

PoP-IDLMA: Product-of-Prior Independent Deeply Learned Matrix Analysis for Multichannel Music Source Separation

Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 2680-2694, 2023

This paper proposes PoP-IDLMA, an extension of independent deeply learned matrix analysis (IDLMA).

A comparative study of text-speech alignment methods for emotional speech synthesis.

Published in ASJ Autumn Meeting, 2024

In this paper, we investigate the performance of various text-speech alignment methods to build a high-quality emotional parallel text-to-speech system.

Music Tagging with Classifier Group Chains

Published in International Conference on Acoustics, Speech and Signal Processing, 2025

This paper proposes music tagging with classifier group chains, which considers conditional dependence among tag categories.

Takuya Hasumi