Improving fairness for spoken language understanding in atypical speech with Text-to-Speech


Introduction:

On this demo page, we present audio samples generated by Aty-TTS, DuTa-VC, and Grad-TTS on the AtySpeech test set.

We also report perceptual evaluation results from speech pathologists.

AtySpeech Test Results

Speaker 0005:

| No. | Audio (Real, DuTa-VC [1], Grad-TTS [2], Aty-TTS) | Transcription |
| --- | --- | --- |
| 1 | Published soon | His warning came too late. |
| 2 | Published soon | The first night in camp. |
| 3 | Published soon | Does it make any difference? |

Speaker 0006:

| No. | Audio (Real, DuTa-VC [1], Grad-TTS [2], Aty-TTS) | Transcription |
| --- | --- | --- |
| 1 | Published soon | He seemed to be thinking of something else. |
| 2 | Published soon | "It's a big place," he said. |
| 3 | Published soon | At about twenty minutes of five. |

Speaker 0007:

| No. | Audio (Real, DuTa-VC [1], Grad-TTS [2], Aty-TTS) | Transcription |
| --- | --- | --- |
| 1 | Published soon | To meet was to find each other. |
| 2 | Published soon | He protected her, and she strengthened him. |
| 3 | Published soon | The wondering singer. |

Speaker 0009:

| No. | Audio (Real, DuTa-VC [1], Grad-TTS [2], Aty-TTS) | Transcription |
| --- | --- | --- |
| 1 | Published soon | The driver's voice came from the background. |
| 2 | Published soon | Why do you ask? |
| 3 | Published soon | "Better ring again," suggested the driver. |

Speaker 0015:

| No. | Audio (Real, DuTa-VC [1], Grad-TTS [2], Aty-TTS) | Transcription |
| --- | --- | --- |
| 1 | Published soon | For my Mother and Father. |
| 2 | Published soon | If anything, he was pressing the attack. |
| 3 | Published soon | Walking on his knees. |

Speaker 0018:

| No. | Audio (Real, DuTa-VC [1], Grad-TTS [2], Aty-TTS) | Transcription |
| --- | --- | --- |
| 1 | Published soon | And multiplying, I don't doubt. |
| 2 | Published soon | What am I to do? |
| 3 | Published soon | Thank the gods, there she moves away! |

Speaker 0019:

| No. | Audio (Real, DuTa-VC [1], Grad-TTS [2], Aty-TTS) | Transcription |
| --- | --- | --- |
| 1 | Published soon | How quickly he disappeared! |
| 2 | Published soon | "There is my news," he said. |
| 3 | Published soon | "Did Malcolm give you this?" Randal asked. |

Perceptual Evaluation Results

1. Dysarthria rating scale (overall dysarthria, articulation (artic), voice):

| Rating | Definition |
| --- | --- |
| 0 | within normal limits (WNL) or not present |
| 1 | mild |
| 2 | moderate |
| 3 | marked or moderate-severe |
| 4 | severe |

2. Naturalness rating scale:

| Rating | Definition |
| --- | --- |
| 0 | very natural |
| 1 | mostly natural |
| 2 | somewhat natural/unnatural |
| 3 | mostly unnatural |
| 4 | very unnatural |
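
The two scales above only define whole-number ratings, while the results contain fractional scores such as 1.75. As a minimal sketch for reading such scores, the Python helper below looks up whole-number ratings directly and reports fractional ones as lying between the two neighbouring definitions. The names `DYSARTHRIA_SCALE`, `NATURALNESS_SCALE`, and `describe` are hypothetical, and describing a fractional score by its neighbours is an assumption of this sketch, not something the page specifies.

```python
import math

# Labels copied from the two rating scales; "mod" is written out as "moderate".
DYSARTHRIA_SCALE = {
    0: "WNL or not present",
    1: "mild",
    2: "moderate",
    3: "marked or moderate-severe",
    4: "severe",
}
NATURALNESS_SCALE = {
    0: "very natural",
    1: "mostly natural",
    2: "somewhat natural/unnatural",
    3: "mostly unnatural",
    4: "very unnatural",
}


def describe(score, scale):
    """Return the scale label for a whole-number score, or name the two
    neighbouring labels for a fractional score (assumption of this sketch)."""
    if not 0 <= score <= 4:
        raise ValueError(f"score {score} is outside the 0-4 scale")
    lower, upper = math.floor(score), math.ceil(score)
    if lower == upper:
        return scale[lower]
    return f"between '{scale[lower]}' and '{scale[upper]}'"


print(describe(2, DYSARTHRIA_SCALE))
print(describe(1.75, NATURALNESS_SCALE))
```

In both scales, lower scores are better: 0 means no perceptible dysarthria, or fully natural speech.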

3. Results:

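| No. | speaker | overall naturalness | overall dysarthria severity | overall artic severity | artic: imprecise consonants | artic: prolonged phonemes | artic: repeated phonemes | artic: irregular articulatory breakdowns | artic: distorted vowels | overall voice quality | voice: harsh | voice: hoarse/wet | voice: breathy | voice: strained/strangled | voice: stoppages | voice: flutter |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0005 | 1.75 | 2 | 2.5 | 1.5 | 1.75 | 0 | 0.5 | 1 | 2 | 0.75 | 0.25 | 0.25 | 1.75 | 0 | 0 |
| 2 | 0006 | 0.5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0007 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0.75 | 0.75 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0009 | 1.25 | 1.5 | 1.25 | 1.25 | 0.5 | 0.75 | 1.25 | 0.75 | 1 | 0.5 | 0.25 | 0 | 0.5 | 0 | 0 |
| 5 | 0010 | 2.25 | 1.75 | 1.75 | 1.75 | 1 | 0 | 0.75 | 0.75 | 1.75 | 0 | 0 | 1.75 | 0 | 0 | 0 |
| 6 | 0011 | 0.5 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 1.75 | 1.5 | 1 | 0 | 0.5 | 0 | 0 |
| 7 | 0012 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.5 | 0.5 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0013 | 1 | 2.75 | 2.75 | 2.75 | 1.25 | 0 | 2.25 | 1.25 | 3 | 0 | 2.75 | 2.75 | 0 | 1.5 | 0 |
| 9 | 0014 | 0.25 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0.5 | 0.25 | 0.25 | 0 | 0 | 0 | 0 |
| 10 | 0015 | 2 | 1.75 | 1.5 | 1.5 | 2.25 | 0 | 0 | 0.5 | 2 | 1.25 | 0 | 0 | 1.25 | 0 | 0 |
| 11 | 0017 | 2.25 | 1.75 | 1.25 | 1.25 | 0 | 0 | 0.75 | 0.75 | 2.5 | 1.25 | 1.75 | 0 | 0.5 | 0 | 0 |
| 12 | 0018 | 1.75 | 1.75 | 1.75 | 1.75 | 1.75 | 0 | 1.25 | 0.75 | 2.75 | 0.5 | 1.5 | 0 | 2.75 | 0 | 0.25 |
| 13 | 0019 | 0.75 | 2.5 | 2 | 2 | 0.5 | 1.5 | 1 | 1.5 | 1.75 | 0.25 | 0.5 | 1.25 | 0.75 | 0 | 0 |
| 14 | 0020 | 0.75 | 1 | 1 | 1 | 0.25 | 0 | 0 | 0.25 | 1 | 0.75 | 0 | 0 | 0.25 | 0 | 0 |
| 15 | 0021 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.75 | 0.25 | 0 | 0 | 0 | 0 |
| 16 | 0022 | 0.5 | 1 | 1 | 1 | 1 | 0 | 0.75 | 0.5 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 17 | 0023 | 0.5 | 0.5 | 0.5 | 0.5 | 0 | 0 | 0.25 | 0.25 | 0.75 | 0.25 | 0.5 | 0.25 | 0.5 | 0 | 0 |
| 18 | 0024 | 1 | 1.25 | 1 | 1 | 0.25 | 0 | 0.5 | 0.5 | 1 | 0.25 | 0.5 | 0.5 | 0.5 | 0 | 0 |
| 19 | 0025 | 1.5 | 1.25 | 0.75 | 0.75 | 0 | 0 | 0.25 | 0 | 1.25 | 0 | 0.5 | 0.75 | 0 | 0 | 0 |
| 20 | 0026 | 1 | 2.5 | 2 | 1.75 | 1.5 | 0 | 0.25 | 0 | 2.5 | 1 | 0.75 | 0 | 2 | 0 | 0 |
| 21 | 0027 | 0.75 | 1.75 | 1.5 | 1.25 | 1 | 0 | 0 | 0.75 | 1.75 | 1.5 | 0 | 0 | 1 | 0 | 0 |
| 22 | 0028 | 1 | 2.5 | 1.75 | 1.75 | 0.75 | 0 | 1 | 0.5 | 1.5 | 0.5 | 0 | 0.5 | 1 | 0 | 0 |
| 23 | 0029 | 1 | 2 | 2 | 2 | 1 | 0 | 1 | 0.5 | 1.25 | 0.25 | 1 | 0.25 | 1.25 | 0 | 0 |
| 24 | 0030 | 0.5 | 1 | 1 | 1 | 0 | 0 | 0.5 | 0.25 | 1.25 | 1 | 0 | 0 | 1 | 0 | 0 |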
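
Each row of the results table has the same layout: a row number, a speaker ID, and fifteen ratings in the order of the header. As a rough sketch of how the table could be loaded for further analysis, the Python snippet below parses three rows copied from the table and averages one column. The short column keys and the helpers `parse_row` and `column_mean` are hypothetical names chosen for illustration; they are not part of any released tooling.

```python
from statistics import mean

# Short keys for the fifteen rating columns, in header order (names are illustrative).
COLUMNS = [
    "naturalness", "dysarthria", "artic",
    "artic_imprecise_consonants", "artic_prolonged_phonemes",
    "artic_repeated_phonemes", "artic_irregular_breakdowns",
    "artic_distorted_vowels",
    "voice", "voice_harsh", "voice_hoarse_wet", "voice_breathy",
    "voice_strained_strangled", "voice_stoppages", "voice_flutter",
]


def parse_row(line):
    """Parse one table row into (speaker_id, {column: rating}).

    Accepts rows separated by whitespace or by pipes."""
    fields = line.replace("|", " ").split()
    expected = 2 + len(COLUMNS)  # row number + speaker ID + ratings
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    speaker = fields[1]  # kept as a string to preserve leading zeros
    ratings = dict(zip(COLUMNS, map(float, fields[2:])))
    return speaker, ratings


def column_mean(table, column):
    """Average one rating column over all speakers in the table."""
    return mean(ratings[column] for ratings in table.values())


# Three rows copied verbatim from the results table.
SAMPLE = """\
1 0005 1.75 2 2.5 1.5 1.75 0 0.5 1 2 0.75 0.25 0.25 1.75 0 0
2 0006 0.5 0 0 0 0 0 0 0 0.25 0.25 0 0 0 0 0
8 0013 1 2.75 2.75 2.75 1.25 0 2.25 1.25 3 0 2.75 2.75 0 1.5 0
"""

table = dict(parse_row(line) for line in SAMPLE.splitlines())
print(table["0013"]["voice"])
print(column_mean(table, "naturalness"))
```

Speaker IDs are stored as strings so that identifiers such as 0005 keep their leading zeros.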