This page provides supplementary information for the paper:

Ian S Howard & Mark A Huckvale, "Training a Vocal Tract Synthesizer to Imitate Speech using Distal Supervised Learning", Submitted to Specom 2005.

A presentation based on the work in this paper is available in PowerPoint format here.

Speech data produced by the babble generator

The babble generator was run with different phonetically motivated samplings of the vocal tract space.
The durations between states are drawn from a Gaussian distribution with the following parameters:
fast: mean = 150 ms, sd = 50 ms
slow: mean = 300 ms, sd = 100 ms
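
As a rough illustration of the timing scheme described above, the following Python sketch draws inter-state durations from a Gaussian with the fast and slow parameters quoted on this page. It is not the authors' implementation: the names BABBLE_RATES, sample_durations and state_onsets are hypothetical, and the handling of non-positive draws (reject and redraw) is an assumption, since the page does not say how the original generator treats them.

```python
import random

# Gaussian parameters quoted on this page, in milliseconds.
BABBLE_RATES = {
    "fast": {"mean_ms": 150.0, "sd_ms": 50.0},
    "slow": {"mean_ms": 300.0, "sd_ms": 100.0},
}


def sample_durations(rate, n_transitions, rng=None, min_ms=1.0):
    """Draw n_transitions inter-state durations (ms) for the given rate.

    Assumption: draws shorter than min_ms are rejected and redrawn; the
    page does not describe how the original generator handles them.
    """
    rng = rng or random.Random()
    params = BABBLE_RATES[rate]
    durations = []
    while len(durations) < n_transitions:
        d = rng.gauss(params["mean_ms"], params["sd_ms"])
        if d >= min_ms:
            durations.append(d)
    return durations


def state_onsets(durations):
    """Return the cumulative onset time (ms) of each state, starting at 0."""
    onsets = [0.0]
    for d in durations:
        onsets.append(onsets[-1] + d)
    return onsets
```

For example, state_onsets(sample_durations("slow", 10)) would give the times at which eleven successive vocal tract states are reached in a slow babble sequence under these assumptions.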

1a. Fast babble vowel space    2a. Fast /baba/ babble    3a. Fast babble vowel and consonant space
vowelSpaceFast    AbASpaceFast    AiubAiuSpaceFast
1b. Slow babble vowel space    2b. Slow /baba/ babble    3b. Slow babble vowel and consonant space
vowelSpaceSlow    AbASpaceSlow    AiubAiuSpaceSlow
 

Re-synthesized speech spoken by a male subject

Utterance: 'Boogie boogie ba ba ba ba, boogie boogie ba ba ba', spoken by a male subject

Real Speech Input    Via Direct Inverse Model    Via Distal retrained Inverse Model
First Attempt: realSpeechIAH3    First Attempt: resynthSpeechIAH3
Improved: boogierealSpeech    Improved: boogiedirInv    Improved: boogiedistInvBab


Re-synthesized speech sung by a male subject

Utterance: The song Daisy sung by a male subject:
Daisy Daisy,
give me your answer do.
I'm half crazy,
all for the love of you.
We won't have a stylish marriage,
I can't afford a carriage,
but you'll look sweet,
upon the seat,
of a bicycle made for two.

Real Speech Input    Via Direct Inverse Model    Via Distal retrained Inverse Model
daisySung2realSpeech    daisySung2dirInv    daisySung2distInvBab
 

Please send your comments about this web page to: drianhoward@gmail.com 
Copyright © 2005-2012 Ian Howard. All Rights Reserved
Last Changed: 26 May 2012