Loquendo - AbsoluteAstronomy.com

Loquendo is a multinational computer software technology corporation, headquartered in Torino, Italy, that provides speech recognition, speech synthesis, speaker verification and identification applications. Loquendo, which was founded in 2001 under the Telecom Italia Lab, also has offices in the United Kingdom, Spain, Germany, France, and the United States.

Its current products can be found in portable and in-car navigation devices, assistive devices for people with disabilities, smartphones, ebook readers, talking ATMs, computer games, voice-controlled domestic appliances and others. The voice synthesis and speech recognition systems are also used in a new e-health application, the virtual assistant of the health service of Spain's Junta de Andalucía.

Loquendo's products have received several awards, including being named a Speech Technologies Speech Engine Leader in 2007, 2008, and 2009. The company was rated 'Market Leader' by Speech Technologies in 2009 and 2010.

On September 30, 2011, Nuance (one of Loquendo's main competitors) announced that it had acquired Loquendo.

History

The group that went on to work on speech synthesis, recognition and coding was created in the mid-1970s, thanks to the intuition of some within IRI-STET, in the CSELT laboratories in Turin, which already enjoyed international prestige.

Speech synthesis

On the advice of the University of Padua, and by applying the so-called diphone technique (a diphone being the union of a consonant and a vowel, 150 in total in the case of Italian), the first highly intelligible speech synthesizer was created in 1975. This was MUSA (MUltichannel Speaking Automaton), which showed everybody what computers could achieve with the technology of those days. The results achieved in those years were condensed in a 45 rpm record that was distributed in thousands of copies through the mass media. Its main content was the Italian version of the song Frère Jacques, performed in polyphony by several singing voices (MUSA could manage up to 8 synthesis channels in parallel) to arouse the maximum astonishment.

The evolution from that prototype, with an increased number of diphones (about 1000), sharper linguistic analysis tools (which can, for example, automatically distinguish the Italian "àncora" from "ancòra", homographs that differ only in stress and mean "anchor" and "again" respectively) and better waveform management, improved the synthetic voice over the following years. This wave produced the "sintetizzatore voce" chip, developed internally at CSELT and added to the SGS (later SGS-Thomson and now STMicroelectronics) catalog as a peripheral for Zilog's Z80 microprocessor (with the code M8950).
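As a rough illustration of the diphone technique mentioned above, the following Python sketch splits a phone sequence into adjacent pairs and concatenates one stored waveform per pair. It reads "diphone" in the general phonetic sense of an adjacent pair of phones; the function names, the "_" silence marker and the toy inventory are assumptions made for this example and do not describe how MUSA was actually implemented.

```python
def to_diphones(phones):
    """Return the adjacent phone pairs (diphones) that cover a phone sequence."""
    # "_" marks the silence before and after the utterance (an assumption of this sketch).
    padded = ["_"] + list(phones) + ["_"]
    return [a + "-" + b for a, b in zip(padded, padded[1:])]


def synthesize(phones, inventory):
    """Concatenate the stored samples of each diphone, in order."""
    samples = []
    for diphone in to_diphones(phones):
        # A real system would also smooth the joins; here the units are simply chained.
        samples.extend(inventory[diphone])
    return samples


print(to_diphones(["k", "a", "z", "a"]))  # ['_-k', 'k-a', 'a-z', 'z-a', 'a-_']
```

The size of the inventory is what the text refers to when it counts diphones: every pair that can occur in the language needs one recorded unit.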

In the 1990s ELOQUENS ® was born, a multi-platform speech synthesizer for various operating systems (DOS, Windows, System 7, Unix, OS/2) or for telephone boards with a very large number of channels.

Such boards were used by the Italian telephone operator to build the reverse subscriber information service (which returns a subscriber's identity and address from the telephone number).

Towards the end of the second millennium the synthesis technology changed completely, moving from the diphone approach to "variable-length unit selection and concatenation". This approach was made possible by increased computing power and, above all, by the increased capacity of mass storage systems. Thus ACTOR ® ("The human sounding voice") was born, and it began to reach a large audience thanks to the number of telephone services and applications for disabled people created by companies connected to Loquendo.

In the 2000s, with the spin-off of the research group into the new company Loquendo, the synthesizer changed its name, taking that of the company, and was enriched over the following years with an impressive number of languages and voices (after ten years, more than 30 languages and 70 voices, male and female).

The synthesizer, having left the research labs to become a commercial product, was equipped with a number of editing tools for producing synthetic audio enriched with emotions (a typical case is audio books for blind people made with DAISY technology). It was also offered as a software library for building a wide variety of products, from small portable devices (mobile phones, navigators and palmtop computers) to multichannel, multilingual telephone servers for (semi)automatic call centers.

Speech recognition

Shortly after research on speech synthesis began, research on speech recognition was started as well and, as early as the beginning of the 1980s, a first prototype was produced that could understand the ten digits and a few simple commands.

By applying hidden Markov models, a recognizer of connected words and sentences was developed in 1984 in collaboration with ELSAG, another company in the IRI-STET group.

The need for a speaker-independent speech recognizer for telephone applications led to the creation of speech databases containing the recorded voices of hundreds of different people. In 1987 the first large database was created by recording over the phone, with an automatic procedure, the voices of more than 1000 people calling, from all over Italy, a telephone server set up in the CSELT labs.

The material recorded in this way made it possible to train hidden Markov models and, using sophisticated algorithms, to develop AURIS ®, the first speech recognizer that could run on a wide variety of devices equipped with a DSP (Digital Signal Processor).

In the 1990s the large European collaborations began (within the framework of European-funded projects) and, together with about twenty other companies and universities, very large speech databases were collected throughout Europe (more than 65000 people were contacted in total).

This huge body of material, combined with a new hybrid approach mixing hidden Markov models and neural networks, led to FLEXUS ®, the first flexible-vocabulary speech recognizer, which allowed a wide range of telephone services to use automatic speech recognition in their human interfaces.

Bringing FLEXUS and ACTOR together in a single dialog system, DIALOGOS ®, made it possible to build telephone services that were extremely avant-garde for those years: the Telephone Subscribers Information Service (servizio 12) and the Railway Information Services (servizio FS Informa).

For the speech recognizer too, the 2000s brought a change of name, to that of the newly born company Loquendo, along with the development of a large number of languages and the release of the engine as a software library for building a wide variety of telephone applications.

Various systems were introduced for writing finite-state grammars and natural language models.

The speech database recording campaigns continued, moving beyond Europe to the Mediterranean countries, to South, Central and North America and, finally, to the countries of the Far East. In total, a very large number of hours of speech were recorded by contacting hundreds of thousands of people in these regions. The recordings were made over fixed telephone networks, in moving vehicles for mobile phones, and in domestic environments with high-quality microphones for consumer applications (video games, appliances and home automation in general).

Speaker recognition

Research on speaker recognition began quite recently, in the middle of the 2000s, when speech databases tailored for this task became available. Experiments were started in collaboration with the Politecnico of Turin on two different topics: speaker "identification" and "verification".

The success of these experiments pushed the company to develop products for this task, marketed both as software libraries and through the enabling platform described below.

Speech coding

Research on speech coding started even before that on speech recognition and synthesis, with the aim of building equipment (codecs) and echo cancelers able to maximize the number of telephone conversations that can flow through a single cable (or satellite connection) without losing voice intelligibility.

At the end of the 1970s, studies and experiments led to an algorithm for encoding the telephone speech signal and to the establishment of the European CCITT standard known as A-law encoding (8-bit logarithmic encoding, law "A", for audio signals band-limited to 8 kHz). The standard was then used in the codecs for 64 kbit/s ISDN telephone lines.
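To give an idea of what logarithmic "law A" encoding means, here is a minimal Python sketch of the continuous A-law companding curve with the usual parameter A = 87.6. It is only an approximation for illustration: the actual G.711 codec uses a segmented, bit-exact version of this curve, and the function names below are assumptions of this example. The bit-rate line assumes the usual G.711 figures of 8000 samples per second at 8 bits each.

```python
import math

A = 87.6  # compression parameter of the A-law curve


def alaw_compress(x: float) -> float:
    """Map a sample x in [-1, 1] onto the A-law curve (also in [-1, 1])."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / (1.0 + math.log(A))  # linear segment for very quiet samples
    else:
        y = (1.0 + math.log(A * ax)) / (1.0 + math.log(A))  # logarithmic segment
    return math.copysign(y, x)


def alaw_expand(y: float) -> float:
    """Inverse of alaw_compress."""
    ay = abs(y)
    if ay < 1.0 / (1.0 + math.log(A)):
        x = ay * (1.0 + math.log(A)) / A
    else:
        x = math.exp(ay * (1.0 + math.log(A)) - 1.0) / A
    return math.copysign(x, y)


# Assumed G.711 figures: 8000 samples per second, 8 bits per sample.
BIT_RATE = 8000 * 8  # 64000 bit/s, the 64 kbit/s of an ISDN channel
```

Because quiet samples are spread over more of the output range than loud ones, 8 bits per sample are enough to keep telephone speech intelligible.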

In the following years more powerful codecs were built (used in telephone exchanges) and, within the pan-European GSM consortium, the codec used in second-generation mobile phones.

At the same time, codecs were built to transmit a high-quality signal in spite of the 8 kHz band limit of telephone cables, useful for audio and video conference applications.

When Loquendo was born, all studies on coding were left to other CSELT groups (in the meantime renamed TILab, Telecom Italia Lab), whose work would lead, among other things, to the MP3 standard for encoding high-quality audio in MPEG video streams (used in the DVD videos that have totally replaced VHS video tapes). MP3 then became synonymous with digital music and file sharing with the arrival of Napster at the end of the 1990s.

Enabling platforms

Towards the end of the 1990s, with the development of the internet in its currently known form (browsable hypertexts residing on different servers that span the planet in a single great network), the need arose to make these texts available by voice over the telephone as well.

At the same time IVR (Interactive Voice Response) systems became ever more widespread, and hardware and software tools for the rapid development of new telephone applications and services became essential. It was evident to everybody that the previous development models, which had produced complex systems such as the automation of the telephone directory or the Railway Information Service, were too rigid and did not allow new applications to be developed easily.

The need was therefore felt for enabling platforms for automatic voice telephone systems that were both scalable and easily programmable. A dedicated working group was created which, by combining the efforts of all the other groups, developed a voice browser prototype that was shown to the public at SMAU 2000 under the name VoxNauta ®. The success was so great that Telecom Italia decided to spin the development group and the platform off from the original research labs, and it therefore created Loquendo on 1 February 2001.

Over the years, VoxNauta ® was further developed in various scalable forms, from small servers to huge enterprise systems with thousands of lines, and it was installed in hundreds of companies all around the world (as far as the languages and voices available at the time allowed).

The birth of standards for writing telephone services (VoiceXML) and of protocols (MRCP) for connecting the servers hosting the speech technologies to the servers hosting the telephone boards pushed the development of a software-only Speech Server, hosting the text-to-speech and speech recognition engines from Loquendo.
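For readers unfamiliar with VoiceXML, the following Python sketch builds about the smallest possible VoiceXML 2.0 page: a single form whose block speaks one prompt. The helper name `build_greeting` and the greeting text are assumptions of this example; it only shows the general shape of such a page and says nothing about how VoxNauta or the Loquendo Speech Server processed it.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # namespace of the W3C VoiceXML 2.0 standard


def build_greeting(text: str) -> str:
    """Return a minimal VoiceXML document that speaks `text` to the caller."""
    ET.register_namespace("", VXML_NS)  # serialize without a namespace prefix
    root = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    form = ET.SubElement(root, f"{{{VXML_NS}}}form")
    block = ET.SubElement(form, f"{{{VXML_NS}}}block")
    prompt = ET.SubElement(block, f"{{{VXML_NS}}}prompt")
    prompt.text = text  # rendered by the text-to-speech engine of the voice browser
    return ET.tostring(root, encoding="unicode")


print(build_greeting("Welcome to the information service."))
```

A voice browser fetches such a page over HTTP much as a visual browser fetches HTML, which is why the speech engines could be hosted on separate servers.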

This constant research and development activity has made Loquendo one of the best-known players in speech technology.

The brand

The name Loquendo ® comes from Latin, while the logo was created by the Telecom Italia graphic department. In the animated GIF version of the logo, the three small waves over the letter "o" light up in sequence, suggesting emitted sound.

The name was certainly a brilliant idea in terms of originality and memorability; in fact, when it was registered as a company brand, it did not appear in any search engine except in rare Latin texts. Its uniqueness meant that over the years it became synonymous with Italian speech technologies (although often erroneously identified with text-to-speech alone). The marketing decision of the early 2000s to drop the historical brand names Actor ® and Flexus ® and bet everything on the company name also contributed to this result: thus Loquendo TTS and Loquendo ASR were born.

The company has not protected the brand very forcefully, and this is one of the causes of its enormous spread, also to the detriment of competitors' brands. It is enough to search for it on YouTube to find it associated with hundreds of amusing and ironic videos (although sometimes of doubtful taste) whose vocal track is made with one of the voices of the Turin synthesizer; the video authors have in fact chosen to leave the word Loquendo in the titles of their works to help identify them as videos with an artificial voice.

The same is true of Facebook, where hundreds of profiles all around the world use the Loquendo brand to identify profiles of non-real persons.

In conclusion, ten years after its creation, and paraphrasing the slogan of another well-known Italian company, one could certainly say these days: where there is a voice, there is Loquendo.

Sale of the company

Over the years there were various rumors about the sale of Loquendo to other companies.

The latest were those of summer 2011, when two US-based multinationals were reported to be interested in the Turin company: Nuance and Avaya.

As the first was a direct competitor of the Italian company, its interest was viewed with some concern by Loquendo's workers, who feared the possible dissolution of the research and development group and the definitive loss of the knowledge acquired in forty years of activity.

The second company appeared more interesting because its business was complementary to Loquendo's: Avaya in fact did not own any speech technology engines and could therefore have been very interested in developing these technologies in house rather than continuing to buy them from outside (the classic "make or buy" dilemma).

This news was followed closely by the workers, the Turin and Piedmont governments and the entire international scientific community.

In the end, however, on 13 August 2011, Telecom Italia publicly announced the sale of its entire stake to the American company Nuance for 53 million euros.

Products

speech synthesis
speech recognition
speaker verification
voice browser

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
