SpeechLinks - Speech Synthesis Speech Technology Hyperlinks Page

Domain-specific synthesis concatenates prerecorded words and phrases to create complete utterances. It is used in applications where the variety of texts the system will output is limited to a particular domain, like transit schedule announcements or weather reports. The technology is very simple to implement, and has been in commercial use for a long time, in devices like talking clocks and calculators. The level of naturalness of these systems can be very high because the variety of sentence types is limited, and they closely match the prosody and intonation of the original recordings.
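As a rough illustration of the approach, a talking-clock-style synthesizer can be sketched as a lookup-and-concatenate loop over prerecorded units. The unit inventory, file names, and load_wav helper below are invented for the example, not taken from any real product; a real system would also smooth the joins between units.

    import wave
    import numpy as np

    # Toy unit inventory for a talking clock; the .wav files are hypothetical.
    NUMBER_WORDS = {3: "three", 15: "fifteen"}

    def load_wav(path):
        # Read a mono 16-bit PCM file into a numpy array of samples.
        with wave.open(path, "rb") as f:
            return np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)

    def say_time(hour, minute, unit_dir="units"):
        # Map the fixed sentence frame onto prerecorded units:
        # "it is" + <hour> + <minute>.
        names = ["it_is", NUMBER_WORDS[hour], NUMBER_WORDS[minute]]
        pieces = [load_wav(f"{unit_dir}/{name}.wav") for name in names]
        # Naive butt-splicing; real systems crossfade and match prosody at the joins.
        return np.concatenate(pieces)

    audio = say_time(3, 15)  # samples for "it is three fifteen"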

Speech Synthesis | Speech Technologies | InTechOpen

Speech Synthesis Technology - The National Academies Press

Lucent Technologies Bell Labs Text-to-Speech Synthesis demo

In 1979, Allen, Hunnicutt, and Klatt demonstrated the MITalk laboratory text-to-speech system developed at M.I.T. The system, with some modifications, was later used in the Telesensory Systems Inc. (TSI) commercial TTS system (Klatt 1987, Allen et al. 1987). Two years later, Dennis Klatt introduced his famous Klattalk system, which used a new, sophisticated voicing source described in more detail in Klatt (1987). The technology used in the MITalk and Klattalk systems forms the basis for many synthesis systems today, such as DECtalk and Prose-2000. For more detailed information on the MITalk and Klattalk systems, see for example Allen et al. (1987), Klatt (1982), or Bernstein et al. (1980).

Why Synthesized Speech Sounds So Awful - MIT Technology Review

The first articulatory synthesizer was introduced in 1958 by George Rosen at the Massachusetts Institute of Technology, M.I.T. (Klatt 1987). The DAVO (Dynamic Analog of the VOcal tract) was controlled by a tape recording of control signals created by hand. In the mid-1960s, the first experiments with Linear Predictive Coding (LPC) were made (Schroeder 1993). Linear prediction was first used in low-cost systems, such as the TI Speak & Spell in 1978, and its quality was quite poor compared to present systems. However, with some modifications to the basic model, described later, the method has proved very useful, and it is used in many present systems.
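To make the linear prediction idea concrete, here is a minimal sketch that fits an all-pole filter to one frame of a signal using the autocorrelation method (Levinson-Durbin recursion) and then resynthesizes sound by driving the filter with an impulse-train excitation. The frame, model order, sample rate, and pitch values are illustrative assumptions, not the parameters of any of the historical systems mentioned above.

    import numpy as np
    from scipy.signal import lfilter

    def lpc_coefficients(x, order):
        # Autocorrelation method: solve for the all-pole predictor A(z)
        # with the Levinson-Durbin recursion.
        n = len(x)
        r = np.correlate(x, x, mode="full")[n - 1 : n + order]
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0]
        for i in range(1, order + 1):
            k = -(r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])) / err
            a[1 : i + 1] = a[1 : i + 1] + k * a[i - 1 :: -1]
            err *= 1.0 - k * k
        return a, err

    # A stand-in "speech frame"; a real vocoder would use windowed speech.
    frame = np.random.randn(400)
    a, gain = lpc_coefficients(frame, order=10)

    # Resynthesize: drive the all-pole filter 1/A(z) with a pulse train
    # (about 100 Hz voicing, if we assume an 8 kHz sample rate).
    excitation = np.zeros(400)
    excitation[::80] = 1.0
    speech = lfilter([np.sqrt(gain)], a, excitation)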

Computer Synthesized Speech Technologies: Tools for …

Languages with a phonemic orthography have a very regular writing system, and the prediction of the pronunciation of words based on their spellings is quite successful. Speech synthesis systems for such languages often use the rule-based method extensively, resorting to dictionaries only for those few words, like foreign names and borrowings, whose pronunciations are not obvious from their spellings. On the other hand, speech synthesis systems for languages like English, which have extremely irregular spelling systems, are more likely to rely on dictionaries, and to use rule-based methods only for unusual words or words that are not in their dictionaries.
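The English-style, dictionary-first strategy can be sketched as a lookup with a rule-based fallback. The exception dictionary and letter-to-sound rules below are toy examples invented for illustration; real systems use large pronunciation lexica and far richer, context-sensitive rules.

    # Hybrid grapheme-to-phoneme lookup: exception dictionary first,
    # greedy longest-match letter-to-sound rules as the fallback.
    EXCEPTIONS = {
        "colonel": "K ER N AH L",
        "yacht": "Y AA T",
    }

    RULES = {
        "ch": "CH", "sh": "SH", "th": "TH",
        "a": "AE", "e": "EH", "i": "IH", "o": "AA", "u": "AH",
        "b": "B", "c": "K", "d": "D", "f": "F", "g": "G", "h": "HH",
        "k": "K", "l": "L", "m": "M", "n": "N", "p": "P", "r": "R",
        "s": "S", "t": "T", "v": "V", "w": "W", "y": "Y", "z": "Z",
    }

    def to_phonemes(word):
        word = word.lower()
        if word in EXCEPTIONS:            # the dictionary wins when it has an entry
            return EXCEPTIONS[word]
        phones, i = [], 0
        while i < len(word):
            for length in (2, 1):         # prefer two-letter rules like "ch"
                chunk = word[i : i + length]
                if chunk in RULES:
                    phones.append(RULES[chunk])
                    i += length
                    break
            else:
                i += 1                    # skip letters no rule covers
        return " ".join(phones)

    print(to_phonemes("chat"))    # rules:      CH AE T
    print(to_phonemes("yacht"))   # dictionary: Y AA T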

Speech synthesis is the artificial production of human speech

Until recently, articulatory synthesis models have not been incorporated into commercial speech synthesis systems. A notable exception is the NeXT-based system originally developed and marketed by Trillium Sound Research, a spin-off company of the University of Calgary, where much of the original research was conducted. Following the demise of the various incarnations of NeXT (started by Steve Jobs in the late 1980s and merged with Apple Computer in 1997), the Trillium software was published under the GNU General Public License, with work continuing as gnuspeech. The system, first marketed in 1994, provides full articulatory-based text-to-speech conversion using a waveguide or transmission-line analog of the human oral and nasal tracts controlled by Carré's "distinctive region model".
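For a feel of the waveguide/transmission-line idea, the sketch below implements a simple Kelly-Lochbaum-style scattering model: the tract is approximated as concatenated tube sections, and each change in cross-sectional area produces a partial reflection at the junction between sections. The area values and boundary reflection coefficients are made up for illustration and are not gnuspeech's actual model or parameters.

    import numpy as np

    def kelly_lochbaum(excitation, areas, r_glottis=0.99, r_lips=-0.9):
        # One sample of delay per tube section, pressure waves travelling in
        # both directions, scattering wherever the cross-sectional area changes.
        n = len(areas)
        k = [(areas[i] - areas[i + 1]) / (areas[i] + areas[i + 1])
             for i in range(n - 1)]
        fwd = np.zeros(n)   # right-going waves (towards the lips)
        bwd = np.zeros(n)   # left-going waves (towards the glottis)
        out = np.zeros(len(excitation))
        for t, e in enumerate(excitation):
            fwd_new, bwd_new = np.empty(n), np.empty(n)
            # Glottis end: inject excitation plus a partial reflection.
            fwd_new[0] = e + r_glottis * bwd[0]
            # Scattering at each junction between adjacent sections.
            for i in range(n - 1):
                fwd_new[i + 1] = (1 + k[i]) * fwd[i] - k[i] * bwd[i + 1]
                bwd_new[i] = k[i] * fwd[i] + (1 - k[i]) * bwd[i + 1]
            # Lip end: partly reflected, the rest radiates as output.
            bwd_new[n - 1] = r_lips * fwd[n - 1]
            out[t] = (1 + r_lips) * fwd[n - 1]
            fwd, bwd = fwd_new, bwd_new
        return out

    areas = [1.0, 1.2, 1.8, 2.6, 2.0, 1.2, 0.8, 1.4]  # made-up areas, glottis to lips
    pulses = np.zeros(4000)
    pulses[::80] = 1.0                                 # rough 100 Hz voicing at 8 kHz
    audio = kelly_lochbaum(pulses, areas)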

It includes many improvements to the SR and TTS engines made in the past year

The consistent evaluation of speech synthesis systems may be difficult because of a lack of universally agreed objective evaluation criteria. Different organizations often use different speech data. The quality of speech synthesis systems also depends to a large degree on the quality of the production technique (which may involve analogue or digital recording) and on the facilities used to replay the speech. Evaluating speech synthesis systems has therefore often been compromised by differences between production techniques and replay facilities.