DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion Probabilistic Model


Instruction:

On this demo page, we present audio samples generated by DuTa-VC on the UASpeech and LibriSpeech datasets.

Because UASpeech is not an open-source dataset, we can only show the synthesized voices, without the source and target voices we used. We provide the file names of those recordings so that you can listen to them once you have access to UASpeech.

We also report the perceptual evaluation results, calculated as mean absolute error (MAE), and the detailed results of the speaker similarity test.


UASpeech Audio Samples

F02 (female, dysarthria-low):

No.  Source Voice           Target Voice        Generated Voice  Text
1    CM09_B2_UW27_M5.wav    F02_B1_D2_M5.wav    (audio)          anxieties
2    CM13_B2_UW60_M5.wav    F02_B1_CW23_M5.wav  (audio)          brotherhood
3    CM13_B2_CW58_M5.wav    F02_B3_CW72_M5.wav  (audio)          these
4    CF02_B2_CW78_M5.wav    F02_B3_UW68_M5.wav  (audio)          people
5    CM09_B2_UW22_M5.wav    F02_B3_UW64_M5.wav  (audio)          agricultural
6    CM12_B2_C6_M5.wav      F02_B1_CW69_M5.wav  (audio)          Escape
7    CF04_B2_UW46_M5.wav    F02_B3_UW56_M5.wav  (audio)          Bengal
8    CM10_B2_CW44_M5.wav    F02_B3_CW98_M5.wav  (audio)          which

F05 (female, dysarthria-high):

No.  Source Voice           Target Voice        Generated Voice  Text
1    CM12_B2_C6_M5.wav      F05_B3_CW13_M5.wav  (audio)          Escape
2    CF05_B2_UW33_M5.wav    F05_B1_CW16_M5.wav  (audio)          approach
3    CM04_B2_CW41_M5.wav    F05_B3_CW57_M5.wav  (audio)          use
4    CM08_B2_UW82_M5.wav    F05_B3_UW41_M5.wav  (audio)          watch
5    CF05_B2_D1_M5.wav      F05_B3_LJ_M5.wav    (audio)          One
6    CM06_B2_CW4_M5.wav     F05_B1_UW46_M5.wav  (audio)          a
7    CF04_B2_D7_M5.wav      F05_B3_C13_M5.wav   (audio)          Seven
8    CF05_B2_UW13_M5.wav    F05_B3_CW84_M5.wav  (audio)          unusual

M04 (male, dysarthria-very low):

No.  Source Voice           Target Voice        Generated Voice  Text
1    CF04_B2_UW38_M5.wav    M04_B3_CW28_M5.wav  (audio)          bachelor
2    CM09_B2_LT_M5.wav      M04_B1_UW65_M5.wav  (audio)          Tango
3    CM05_B2_UW31_M5.wav    M04_B3_UW17_M5.wav  (audio)          appreciable
4    CM13_B2_UW70_M5.wav    M04_B3_UW99_M5.wav  (audio)          coherent
5    CF04_B2_UW3_M5.wav     M04_B1_CW48_M5.wav  (audio)          Nuremberg
6    CF05_B2_UW6_M5.wav     M04_B1_CW31_M5.wav  (audio)          roof
7    CF03_B2_UW62_M5.wav    M04_B1_UW98_M5.wav  (audio)          bulge
8    CM06_B2_CW93_M5.wav    M04_B1_CW75_M5.wav  (audio)          did

M05 (male, dysarthria-mid):

No.  Source Voice           Target Voice        Generated Voice  Text
1    CF04_B2_UW91_M5.wav    M05_B3_CW6_M5.wav   (audio)          orange
2    CM13_B2_UW47_M5.wav    M05_B1_CW35_M5.wav  (audio)          bequeath
3    CM09_B2_LC_M5.wav      M05_B1_UW53_M5.wav  (audio)          Charlie
4    CF02_B2_UW97_M5.wav    M05_B3_CW67_M5.wav  (audio)          bath
5    CM09_B2_CW33_M5.wav    M05_B1_LN_M5.wav    (audio)          all
6    CF02_B2_UW38_M5.wav    M05_B3_UW15_M5.wav  (audio)          bachelor
7    CM04_B2_D4_M5.wav      M05_B3_LS_M5.wav    (audio)          Four
8    CF04_B2_UW81_M5.wav    M05_B3_LF_M5.wav    (audio)          endowments

M09 (male, dysarthria-high):

No.  Source Voice           Target Voice        Generated Voice  Text
1    CF03_B2_CW21_M5.wav    M09_B3_LV_M5.wav    (audio)          at
2    CF02_B2_UW49_M5.wav    M09_B3_UW66_M5.wav  (audio)          betroth
3    CF03_B2_LZ_M5.wav      M09_B3_LW_M5.wav    (audio)          Zulu
4    CM09_B2_CW76_M5.wav    M09_B1_CW42_M5.wav  (audio)          way
5    CF02_B2_CW96_M5.wav    M09_B3_LQ_M5.wav    (audio)          made
6    CM13_B2_CW46_M5.wav    M09_B1_UW86_M5.wav  (audio)          do
7    CM05_B2_UW87_M5.wav    M09_B1_CW71_M5.wav  (audio)          car
8    CF02_B2_UW21_M5.wav    M09_B1_CW48_M5.wav  (audio)          advantageous

M16 (male, dysarthria-low):

No.  Source Voice           Target Voice        Generated Voice  Text
1    CM01_B2_UW73_M5.wav    M16_B1_LM_M5.wav    (audio)          cowhide
2    CM10_B2_LN_M5.wav      M16_B3_UW29_M5.wav  (audio)          November
3    CM06_B2_UW39_M5.wav    M16_B3_D7_M5.wav    (audio)          bathe
4    CF03_B2_CW68_M5.wav    M16_B3_UW98_M5.wav  (audio)          has
5    CF05_B2_CW89_M5.wav    M16_B1_D6_M5.wav    (audio)          find
6    CM12_B2_UW42_M5-1.wav  M16_B3_UW49_M5.wav  (audio)          battlements
7    CF02_B2_C16_M5.wav     M16_B3_CW34_M5.wav  (audio)          Upward
8    CM05_B2_UW5_M5.wav     M16_B1_LU_M5.wav    (audio)          re-united
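
The file names in the UASpeech tables above appear to follow the pattern speaker_block_word_microphone, for example CM09_B2_UW27_M5.wav. The following is a minimal sketch, assuming that pattern, of how a name could be split into its fields to help locate a recording in a local copy of the dataset. The names `UASpeechName` and `parse_uaspeech_name` are hypothetical, and treating a leading "C" as the marker of a control (typical) speaker is an assumption rather than something stated on this page.

```python
from typing import NamedTuple


class UASpeechName(NamedTuple):
    speaker: str      # e.g. "CM09" or "F02"
    block: str        # e.g. "B2"
    word_id: str      # e.g. "UW27", "CW58", "C6", "D2", "LT"
    microphone: str   # e.g. "M5"
    is_control: bool  # assumption: a leading "C" marks a control speaker


def parse_uaspeech_name(name: str) -> UASpeechName:
    """Split a name such as 'CM09_B2_UW27_M5.wav' into its four fields."""
    # Drop the ".wav" extension if it is present.
    stem = name[:-4] if name.lower().endswith(".wav") else name
    parts = stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"unexpected file name layout: {name!r}")
    speaker, block, word_id, microphone = parts
    return UASpeechName(speaker, block, word_id, microphone,
                        speaker.startswith("C"))


source = parse_uaspeech_name("CM09_B2_UW27_M5.wav")
target = parse_uaspeech_name("F02_B1_D2_M5.wav")
print(source)
print(target)
```

Splitting on underscores rather than using a strict pattern keeps unusual suffixes intact, so a name such as CM12_B2_UW42_M5-1.wav yields the microphone field "M5-1" instead of raising an error.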

LibriSpeech Audio Samples

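No.  Source Voice  F02      F05      M04      M05      M09      M16      Text
1    (audio)       (audio)  (audio)  (audio)  (audio)  (audio)  (audio)  AND SAID TO HER OH MY LADY
2    (audio)       (audio)  (audio)  (audio)  (audio)  (audio)  (audio)  WITHIN HERE SHALT THOU FIND DEATH THERE WAS A KEY OF BRASS IN THE DOOR
3    (audio)       (audio)  (audio)  (audio)  (audio)  (audio)  (audio)  WHAT BUSINESS AN YE FOR GO BLABBING THY AFFAIRS ALL OVER BOSLEY I SAY IT ISNA THY PLACE FOR BUY LINEN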

Perceptual Evaluation Results
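
The scores in this section are mean absolute errors. As a minimal sketch of that computation, assuming each score compares paired ratings (for example, a rating of a real recording and a rating of the corresponding synthesized recording; this page does not specify the pairing or the rating scale), MAE is the average absolute difference between the paired values. The function name and the example ratings below are hypothetical and are not taken from the evaluation.

```python
def mean_absolute_error(reference, predicted):
    """Average absolute difference between two equally long rating lists."""
    if len(reference) != len(predicted):
        raise ValueError("both rating lists must have the same length")
    if not reference:
        raise ValueError("at least one rating pair is required")
    total = sum(abs(r - p) for r, p in zip(reference, predicted))
    return total / len(reference)


# Hypothetical ratings for six utterance pairs (illustration only).
real_ratings = [3, 2, 4, 3, 1, 2]
synthesized_ratings = [2, 2, 4, 4, 1, 3]

mae = mean_absolute_error(real_ratings, synthesized_ratings)
print(mae)  # 0.5
```

A lower MAE means the two sets of ratings agree more closely; a value of 0 means they are identical.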

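The VL, L, M, and H columns give averages over the very low, low, mid, and high speaker groups, and Avg is the average over all 15 speakers.

No. Metrics F02 F03 F04 F05 M01 M04 M05 M07 M08 M09 M10 M11 M12 M14 M16 VL L M H Avg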
1 overall dysarthria severity 0.83 0.50 0.00 1.17 0.67 0.67 0.67 0.33 0.67 1.17 1.50 0.67 1.00 1.33 0.17 0.71 0.44 0.44 1.17 0.76
2 overall artic severity 0.67 0.50 0.00 1.00 0.83 0.67 0.50 0.33 0.67 0.83 1.50 0.17 1.00 1.17 0.00 0.75 0.33 0.22 1.03 0.66
3 artic: imprecise consonants 0.33 0.33 0.17 1.17 0.67 0.67 0.50 0.17 0.67 0.83 1.50 0.50 1.17 1.17 0.17 0.71 0.22 0.39 1.07 0.67
4 artic: prolonged phonemes 1.00 2.17 0.67 0.83 0.67 0.00 0.00 0.67 0.67 1.33 1.00 0.50 1.50 0.83 0.33 1.08 0.67 0.39 0.93 0.81
5 artic: repeated phonemes 0.17 0.00 0.17 0.00 1.00 0.00 1.00 1.17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.25 0.44 0.39 0.00 0.23
6 artic: irregular articulatory breakdowns 0.33 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.50 0.00 0.28 0.00 0.00 0.06
7 artic: distorted vowels 0.83 0.17 0.17 0.50 1.33 1.50 0.00 0.33 0.67 0.33 1.33 0.17 1.67 0.67 0.00 1.17 0.39 0.11 0.70 0.64
8 overall voice quality 0.50 0.00 0.33 0.00 0.50 1.00 0.00 0.50 0.33 0.33 0.50 0.83 0.00 0.50 0.50 0.38 0.50 0.39 0.33 0.39
9 voice: harsh 0.83 0.00 0.00 0.00 0.00 0.50 0.00 0.00 0.00 0.00 0.50 0.00 0.50 0.17 0.17 0.25 0.33 0.00 0.13 0.18
10 voice: hoarse/wet 0.33 0.17 0.00 0.00 0.00 0.33 0.00 0.00 0.00 0.17 0.00 0.83 0.00 0.33 0.00 0.12 0.11 0.28 0.10 0.14
11 voice: breathy 0.17 1.00 0.33 0.00 0.00 0.83 0.33 0.67 0.00 0.50 0.00 0.00 1.33 0.00 0.17 0.79 0.33 0.22 0.10 0.36
12 voice: strained/strangled 0.83 0.33 0.17 0.17 0.50 0.83 0.33 0.33 0.50 0.50 0.17 0.83 0.17 0.50 0.67 0.46 0.61 0.44 0.37 0.46
13 voice: stoppages 0.17 0.00 0.00 0.00 0.17 0.00 0.00 0.17 0.00 0.00 0.00 0.00 0.00 0.00 0.17 0.04 0.17 0.00 0.00 0.04
14 voice: flutter 0.00 0.00 0.17 0.17 0.17 0.00 0.00 0.17 0.17 0.33 0.00 0.17 0.00 0.00 0.00 0.04 0.06 0.11 0.13 0.09
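
The VL, L, M, H, and Avg columns can be approximately reproduced from the per-speaker columns. The following sketch does this for the first row (overall dysarthria severity). The group membership is partly an assumption: this page labels only F02, F05, M04, M05, M09, and M16, and the remaining speakers are assigned here according to the commonly cited UASpeech grouping. Because the per-speaker values are already rounded to two decimals, the recomputed means can differ from the table in the last digit.

```python
from statistics import mean

# Row 1 of the table: overall dysarthria severity, one MAE per speaker.
overall_severity = {
    "F02": 0.83, "F03": 0.50, "F04": 0.00, "F05": 1.17, "M01": 0.67,
    "M04": 0.67, "M05": 0.67, "M07": 0.33, "M08": 0.67, "M09": 1.17,
    "M10": 1.50, "M11": 0.67, "M12": 1.00, "M14": 1.33, "M16": 0.17,
}

# Assumed grouping (only six of these speakers are labeled on this page).
groups = {
    "VL": ["F03", "M01", "M04", "M12"],
    "L": ["F02", "M07", "M16"],
    "M": ["F04", "M05", "M11"],
    "H": ["F05", "M08", "M09", "M10", "M14"],
}


def group_means(row, grouping):
    """Return the mean score of each group plus the mean over all speakers."""
    result = {name: mean(row[s] for s in speakers)
              for name, speakers in grouping.items()}
    result["Avg"] = mean(row.values())
    return result


summary = group_means(overall_severity, groups)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For comparison, the table reports 0.71, 0.44, 0.44, 1.17, and 0.76 for VL, L, M, H, and Avg in this row.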

Speaker Similarity Results

No.  Source Speaker  Target Speaker  Source with Target  Source with Generated  Target with Generated
0 CF02 F02 0.806 0.807 0.916
1 CF02 F03 0.772 0.751 0.947
2 CF02 F04 0.909 0.906 0.971
3 CF02 F05 0.84 0.779 0.967
4 CF02 M01 0.756 0.774 0.917
5 CF02 M04 0.806 0.743 0.934
6 CF02 M05 0.771 0.772 0.957
7 CF02 M07 0.774 0.755 0.972
8 CF02 M08 0.742 0.71 0.966
9 CF02 M09 0.75 0.747 0.947
10 CF02 M10 0.726 0.697 0.975
11 CF02 M11 0.769 0.741 0.962
12 CF02 M12 0.804 0.805 0.921
13 CF02 M14 0.788 0.741 0.955
14 CF02 M16 0.804 0.782 0.938
15 CF03 F02 0.822 0.762 0.919
16 CF03 F03 0.764 0.722 0.946
17 CF03 F04 0.833 0.846 0.969
18 CF03 F05 0.761 0.713 0.957
19 CF03 M01 0.809 0.796 0.915
20 CF03 M04 0.808 0.744 0.923
21 CF03 M05 0.833 0.802 0.95
22 CF03 M07 0.755 0.736 0.967
23 CF03 M08 0.783 0.737 0.955
24 CF03 M09 0.775 0.778 0.951
25 CF03 M10 0.715 0.71 0.959
26 CF03 M11 0.792 0.753 0.968
27 CF03 M12 0.837 0.787 0.906
28 CF03 M14 0.789 0.752 0.951
29 CF03 M16 0.826 0.774 0.956
30 CF04 F02 0.818 0.839 0.916
31 CF04 F03 0.771 0.727 0.932
32 CF04 F04 0.864 0.853 0.977
33 CF04 F05 0.832 0.796 0.965
34 CF04 M01 0.779 0.814 0.911
35 CF04 M04 0.837 0.785 0.938
36 CF04 M05 0.761 0.749 0.944
37 CF04 M07 0.791 0.778 0.968
38 CF04 M08 0.824 0.789 0.957
39 CF04 M09 0.768 0.742 0.944
40 CF04 M10 0.794 0.787 0.976
41 CF04 M11 0.828 0.797 0.963
42 CF04 M12 0.776 0.769 0.896
43 CF04 M14 0.845 0.807 0.968
44 CF04 M16 0.804 0.781 0.95
45 CF05 F02 0.82 0.863 0.899
46 CF05 F03 0.767 0.761 0.923
47 CF05 F04 0.86 0.843 0.976
48 CF05 F05 0.94 0.906 0.952
49 CF05 M01 0.756 0.789 0.917
50 CF05 M04 0.791 0.758 0.922
51 CF05 M05 0.681 0.706 0.954
52 CF05 M07 0.756 0.768 0.964
53 CF05 M08 0.783 0.76 0.963
54 CF05 M09 0.725 0.739 0.95
55 CF05 M10 0.757 0.768 0.969
56 CF05 M11 0.753 0.742 0.967
57 CF05 M12 0.761 0.804 0.884
58 CF05 M14 0.817 0.766 0.922
59 CF05 M16 0.766 0.75 0.954
60 CM01 F02 0.667 0.67 0.895
61 CM01 F03 0.687 0.687 0.938
62 CM01 F04 0.723 0.714 0.962
63 CM01 F05 0.698 0.673 0.967
64 CM01 M01 0.746 0.789 0.859
65 CM01 M04 0.831 0.771 0.915
66 CM01 M05 0.83 0.843 0.955
67 CM01 M07 0.792 0.805 0.957
68 CM01 M08 0.909 0.892 0.959
69 CM01 M09 0.827 0.802 0.972
70 CM01 M10 0.908 0.896 0.985
71 CM01 M11 0.863 0.895 0.948
72 CM01 M12 0.714 0.714 0.921
73 CM01 M14 0.913 0.874 0.957
74 CM01 M16 0.814 0.79 0.951
75 CM04 F02 0.737 0.739 0.892
76 CM04 F03 0.721 0.704 0.943
77 CM04 F04 0.756 0.741 0.974
78 CM04 F05 0.728 0.696 0.964
79 CM04 M01 0.763 0.766 0.892
80 CM04 M04 0.831 0.784 0.934
81 CM04 M05 0.829 0.858 0.95
82 CM04 M07 0.807 0.803 0.968
83 CM04 M08 0.853 0.835 0.966
84 CM04 M09 0.771 0.771 0.975
85 CM04 M10 0.897 0.892 0.979
86 CM04 M11 0.838 0.826 0.956
87 CM04 M12 0.719 0.718 0.934
88 CM04 M14 0.923 0.907 0.974
89 CM04 M16 0.84 0.818 0.934
90 CM05 F02 0.724 0.807 0.898
91 CM05 F03 0.719 0.75 0.927
92 CM05 F04 0.8 0.79 0.974
93 CM05 F05 0.787 0.737 0.97
94 CM05 M01 0.732 0.859 0.854
95 CM05 M04 0.779 0.775 0.914
96 CM05 M05 0.739 0.781 0.948
97 CM05 M07 0.766 0.771 0.957
98 CM05 M08 0.806 0.783 0.962
99 CM05 M09 0.789 0.767 0.962
100 CM05 M10 0.848 0.808 0.975
101 CM05 M11 0.773 0.798 0.957
102 CM05 M12 0.723 0.778 0.925
103 CM05 M14 0.881 0.849 0.97
104 CM05 M16 0.819 0.869 0.94
105 CM06 F02 0.655 0.716 0.903
106 CM06 F03 0.685 0.683 0.927
107 CM06 F04 0.69 0.641 0.968
108 CM06 F05 0.706 0.665 0.952
109 CM06 M01 0.673 0.739 0.896
110 CM06 M04 0.786 0.745 0.933
111 CM06 M05 0.732 0.803 0.928
112 CM06 M07 0.796 0.788 0.965
113 CM06 M08 0.825 0.775 0.96
114 CM06 M09 0.774 0.763 0.973
115 CM06 M10 0.903 0.857 0.976
116 CM06 M11 0.816 0.825 0.959
117 CM06 M12 0.658 0.686 0.927
118 CM06 M14 0.879 0.819 0.96
119 CM06 M16 0.768 0.764 0.959
120 CM08 F02 0.65 0.735 0.904
121 CM08 F03 0.653 0.66 0.944
122 CM08 F04 0.687 0.663 0.974
123 CM08 F05 0.74 0.711 0.961
124 CM08 M01 0.657 0.722 0.888
125 CM08 M04 0.765 0.739 0.931
126 CM08 M05 0.686 0.727 0.953
127 CM08 M07 0.757 0.771 0.97
128 CM08 M08 0.825 0.81 0.963
129 CM08 M09 0.72 0.702 0.959
130 CM08 M10 0.92 0.902 0.976
131 CM08 M11 0.793 0.801 0.956
132 CM08 M12 0.63 0.687 0.904
133 CM08 M14 0.889 0.839 0.954
134 CM08 M16 0.747 0.771 0.957
135 CM09 F02 0.643 0.727 0.897
136 CM09 F03 0.666 0.694 0.939
137 CM09 F04 0.699 0.684 0.98
138 CM09 F05 0.745 0.719 0.95
139 CM09 M01 0.68 0.77 0.894
140 CM09 M04 0.786 0.751 0.929
141 CM09 M05 0.727 0.795 0.932
142 CM09 M07 0.783 0.787 0.967
143 CM09 M08 0.856 0.819 0.966
144 CM09 M09 0.761 0.746 0.959
145 CM09 M10 0.934 0.922 0.981
146 CM09 M11 0.801 0.835 0.967
147 CM09 M12 0.644 0.723 0.864
148 CM09 M14 0.902 0.897 0.97
149 CM09 M16 0.773 0.781 0.953
150 CM10 F02 0.733 0.764 0.883
151 CM10 F03 0.744 0.735 0.942
152 CM10 F04 0.75 0.729 0.983
153 CM10 F05 0.757 0.714 0.969
154 CM10 M01 0.79 0.845 0.899
155 CM10 M04 0.82 0.741 0.938
156 CM10 M05 0.799 0.823 0.936
157 CM10 M07 0.839 0.82 0.964
158 CM10 M08 0.892 0.845 0.972
159 CM10 M09 0.875 0.871 0.965
160 CM10 M10 0.89 0.864 0.985
161 CM10 M11 0.832 0.84 0.966
162 CM10 M12 0.743 0.779 0.929
163 CM10 M14 0.911 0.856 0.964
164 CM10 M16 0.863 0.852 0.956
165 CM12 F02 0.771 0.741 0.906
166 CM12 F03 0.689 0.68 0.956
167 CM12 F04 0.734 0.709 0.979
168 CM12 F05 0.762 0.725 0.965
169 CM12 M01 0.774 0.785 0.915
170 CM12 M04 0.784 0.706 0.926
171 CM12 M05 0.79 0.788 0.953
172 CM12 M07 0.785 0.762 0.971
173 CM12 M08 0.849 0.822 0.959
174 CM12 M09 0.796 0.778 0.945
175 CM12 M10 0.81 0.816 0.972
176 CM12 M11 0.8 0.772 0.964
177 CM12 M12 0.708 0.68 0.926
178 CM12 M14 0.865 0.812 0.967
179 CM12 M16 0.843 0.811 0.966
180 CM13 F02 0.709 0.75 0.912
181 CM13 F03 0.728 0.753 0.946
182 CM13 F04 0.768 0.751 0.965
183 CM13 F05 0.774 0.733 0.968
184 CM13 M01 0.745 0.821 0.893
185 CM13 M04 0.828 0.781 0.929
186 CM13 M05 0.811 0.866 0.943
187 CM13 M07 0.82 0.84 0.972
188 CM13 M08 0.873 0.825 0.962
189 CM13 M09 0.835 0.835 0.962
190 CM13 M10 0.916 0.891 0.974
191 CM13 M11 0.844 0.855 0.964
192 CM13 M12 0.723 0.737 0.932
193 CM13 M14 0.938 0.89 0.965
194 CM13 M16 0.857 0.876 0.958
195 All All 0.786 0.779 0.946
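
The final row (All, All) summarizes the three similarity columns over the 195 source-target pairs. As a minimal sketch, assuming that row is a plain mean of each column (this page does not state how it was computed), the rows can be parsed and averaged as follows. The names `parse_rows` and `column_means` are hypothetical, and only the first three rows of the table are used to keep the example short.

```python
from statistics import mean

# First three rows of the table above, copied verbatim.
ROWS = """\
0 CF02 F02 0.806 0.807 0.916
1 CF02 F03 0.772 0.751 0.947
2 CF02 F04 0.909 0.906 0.971
"""

COLUMNS = ("source_with_target", "source_with_generated",
           "target_with_generated")


def parse_rows(text):
    """Parse 'index source target s1 s2 s3' lines into tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        _, source, target, *scores = fields
        if source == "All":
            continue  # skip the summary row so it is not averaged in
        rows.append((source, target, tuple(float(s) for s in scores)))
    return rows


def column_means(rows):
    """Average each of the three similarity columns over the given rows."""
    return {name: mean(row[2][i] for row in rows)
            for i, name in enumerate(COLUMNS)}


means = column_means(parse_rows(ROWS))
for name, value in means.items():
    print(f"{name}: {value:.3f}")
```

In these three rows the generated voice scores higher against the target speaker than against the source speaker, which is the comparison the last two columns are meant to expose.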