An Attribute Interpolation Method in Speech Synthesis by Model Merging

Audio Samples

4.2. Speaker Generation


Female-Female


speaker combination (A-B) | base speaker A | interpolation A-B | base speaker B
p225-p229
p225-p231
p228-p231

Male-Male


speaker combination (A-B) | base speaker A | interpolation A-B | base speaker B
p226-p237
p226-p241
p232-p237

Male-Female


speaker combination (A-B) | base speaker A | interpolation A-B | base speaker B
p227-p228
p237-p225
p226-p228 (Failed)

4.3. Emotion Intensity Control


emotion style | alpha=0 (Neutral) | alpha=0.25 | alpha=0.5 | alpha=0.75 | alpha=1.0 (Emotional)
Angry
Happy
Sad
Surprise
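
The alpha columns above run from a neutral model (alpha=0) to an emotional model (alpha=1.0). One common way to realize such a sweep by model merging is element-wise linear interpolation of the two models' parameters. The sketch below illustrates that idea only; it is an assumption, not necessarily the merging rule used in the paper. The function name `merge_weights`, the parameter names, and the toy weight values are all hypothetical.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This is a toy stand-in for a real checkpoint.
Weights = Dict[str, List[float]]


def merge_weights(base: Weights, target: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * base + alpha * target for every parameter.

    Assumption: plain linear interpolation; the paper may use another rule.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if base.keys() != target.keys():
        raise ValueError("both models must share the same parameter names")

    merged: Weights = {}
    for name in base:
        if len(base[name]) != len(target[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base[name], target[name])
        ]
    return merged


# Hypothetical toy checkpoints standing in for the neutral and emotional models.
neutral = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
emotional = {"layer.weight": [4.0, 6.0], "layer.bias": [3.0]}

# One merged model per alpha value listed in the table header.
sweep = {
    alpha: merge_weights(neutral, emotional, alpha)
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0)
}
```

With this rule, alpha=0 reproduces the neutral parameters exactly and alpha=1.0 reproduces the emotional ones, so the intermediate columns correspond to models that lie on the straight line between the two in parameter space.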