Gleason et al., 2001 - Google Patents

Composite background models and score standardization for language identification systems

Document ID: 12547677552124761505
Author: Gleason T; Zissman M
Publication year: 2001
Publication venue: 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221)

Snippet

Describes two enhancements to our language identification system. Composite background (CBG) modeling allows us to identify target language speech in an environment where labeled background training data is unavailable or limited. Instead of separate models for …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
- G06F17/279—Discourse representation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals

Similar Documents

Publication	Publication Date	Title
CN101548313B (en)	2011-07-13	Voice activity detection system and method
Young	1994	Detecting misrecognitions and out-of-vocabulary words
Zissman et al.	1994	Automatic language identification of telephone speech messages using phoneme recognition and n-gram modeling
EP0708960B1 (en)	2001-11-21	Topic discriminator
US6738745B1 (en)	2004-05-18	Methods and apparatus for identifying a non-target language in a speech recognition system
US8024188B2 (en)	2011-09-20	Method and system of optimal selection strategy for statistical classifications
US8050929B2 (en)	2011-11-01	Method and system of optimal selection strategy for statistical classifications in dialog systems
McDonough et al.	1994	Approaches to topic identification on the switchboard corpus
Szöke et al.	2005	Phoneme based acoustics keyword spotting in informal continuous speech
US20060025995A1 (en)	2006-02-02	Method and apparatus for natural language call routing using confidence scores
Simonnet et al.	2017	ASR error management for improving spoken language understanding
Lane et al.	2006	Out-of-domain utterance detection using classification confidences of multiple topics
Kawahara et al.	1996	Key-phrase detection and verification for flexible speech understanding
Raghuvanshi et al.	2019	Entity resolution for noisy ASR transcripts
Zissman	1997	Predicting, diagnosing and improving automatic language identification performance
Kawahara et al.	1997	Combining key-phrase detection and subword-based verification for flexible speech understanding
US6178398B1 (en)	2001-01-23	Method, device and system for noise-tolerant language understanding
Lincoln et al.	1998	A comparison of two unsupervised approaches to accent identification
Gleason et al.	2001	Composite background models and score standardization for language identification systems
Duchateau et al.	2002	Confidence scoring based on backward language models
Ramesh et al.	1998	Context dependent anti subword modeling for utterance verification
Bouwman et al.	2000	Weighting phone confidence measures for automatic speech recognition
Hazen et al.	2008	Discriminative feature weighting using MCE training for topic identification of spoken audio recordings
Yin et al.	2008	Improvements on hierarchical language identification based on automatic language clustering
Schmitt et al.	2010	Facing reality: Simulating deployment of anger recognition in IVR systems