wav2wav: Wave-to-Wave Voice Conversion


Paper


wav2wav


Conversion samples

Recommended browsers: Safari, Chrome, Firefox, and Opera.

Experimental conditions

  • We evaluated our method on the Spoke (i.e., non-parallel VC) task of the Voice Conversion Challenge 2018 (VCC 2018) [1].
  • For each speaker, 81 sentences (approximately 5 minutes of speech, which is relatively little for VC) were used for training and 35 sentences for testing.
  • The training sets of the source and target speakers contain no overlapping utterances; the converter must therefore be learned in a fully non-parallel setting.
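The conditions above can be written down as a small sanity check on a data split. The following is a minimal sketch, not the authors' data pipeline: the `SpeakerSplit` class, both helper functions, and the integer utterance IDs are hypothetical and exist only for illustration. Only the counts (81 training and 35 test sentences per speaker) and the speaker names come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerSplit:
    """Utterance IDs for one speaker (hypothetical container, not from the paper)."""
    speaker: str
    train_ids: frozenset
    test_ids: frozenset


def has_expected_sizes(split: SpeakerSplit, n_train: int = 81, n_test: int = 35) -> bool:
    """Return True if the split has 81 training and 35 test sentences."""
    return len(split.train_ids) == n_train and len(split.test_ids) == n_test


def is_non_parallel(source: SpeakerSplit, target: SpeakerSplit) -> bool:
    """Return True if the two speakers share no training utterance."""
    return source.train_ids.isdisjoint(target.train_ids)


# Toy integer IDs (an assumption): chosen only so that the two training
# sets do not overlap. They are not the real VCC 2018 file names.
source = SpeakerSplit("VCC2SF3", frozenset(range(0, 81)), frozenset(range(1000, 1035)))
target = SpeakerSplit("VCC2TM1", frozenset(range(81, 162)), frozenset(range(1000, 1035)))

sizes_ok = has_expected_sizes(source) and has_expected_sizes(target)
non_parallel = is_non_parallel(source, target)
```

With real data, the ID sets would be built from the sentence identities in each speaker's training directory. If `is_non_parallel` returned False, the pair would contain parallel utterances and the setting would no longer be fully non-parallel.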

Compared models

  • Baseline: MaskCycleGAN-VC [2].
  • Proposed: wav2wav.

Results

Female (VCC2SF3) → Male (VCC2TM1)

Source | Target | Baseline (MaskCycleGAN) | Proposed
Sample 1
Sample 2
Sample 3

Male (VCC2SM3) → Female (VCC2TF1)

Source | Target | Baseline (MaskCycleGAN) | Proposed
Sample 1
Sample 2
Sample 3

Female (VCC2SF3) → Female (VCC2TF1)

Source | Target | Baseline (MaskCycleGAN) | Proposed
Sample 1
Sample 2
Sample 3

Male (VCC2SM3) → Male (VCC2TM1)

Source | Target | Baseline (MaskCycleGAN) | Proposed
Sample 1
Sample 2
Sample 3

References

[1] J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, Z. Ling. The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods. Odyssey, 2018.
[2] T. Kaneko, H. Kameoka, K. Tanaka, N. Hojo. MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames. ICASSP, 2021.