Composite background models and score standardization for language identification systems
Gleason et al., 2001
- Document ID
- 12547677552124761505
- Author
- Gleason T
- Zissman M
- Publication year
- 2001
- Publication venue
- 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221)
Snippet
Describes two enhancements to our language identification system. Composite background (CBG) modeling allows us to identify target language speech in an environment where labeled background training data is unavailable or limited. Instead of separate models for …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2785—Semantic analysis
- G06F17/279—Discourse representation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
Similar Documents
Publication | Title |
---|---|
CN101548313B (en) | Voice activity detection system and method |
Young | Detecting misrecognitions and out-of-vocabulary words |
Zissman et al. | Automatic language identification of telephone speech messages using phoneme recognition and n-gram modeling |
EP0708960B1 (en) | Topic discriminator |
US6738745B1 (en) | Methods and apparatus for identifying a non-target language in a speech recognition system |
US8024188B2 (en) | Method and system of optimal selection strategy for statistical classifications |
US8050929B2 (en) | Method and system of optimal selection strategy for statistical classifications in dialog systems |
McDonough et al. | Approaches to topic identification on the switchboard corpus |
Szöke et al. | Phoneme based acoustics keyword spotting in informal continuous speech |
US20060025995A1 (en) | Method and apparatus for natural language call routing using confidence scores |
Simonnet et al. | ASR error management for improving spoken language understanding |
Lane et al. | Out-of-domain utterance detection using classification confidences of multiple topics |
Kawahara et al. | Key-phrase detection and verification for flexible speech understanding |
Raghuvanshi et al. | Entity resolution for noisy ASR transcripts |
Zissman | Predicting, diagnosing and improving automatic language identification performance. |
Kawahara et al. | Combining key-phrase detection and subword-based verification for flexible speech understanding |
US6178398B1 (en) | Method, device and system for noise-tolerant language understanding |
Lincoln et al. | A comparison of two unsupervised approaches to accent identification |
Gleason et al. | Composite background models and score standardization for language identification systems |
Duchateau et al. | Confidence scoring based on backward language models |
Ramesh et al. | Context dependent anti subword modeling for utterance verification. |
Bouwman et al. | Weighting phone confidence measures for automatic speech recognition |
Hazen et al. | Discriminative feature weighting using MCE training for topic identification of spoken audio recordings |
Yin et al. | Improvements on hierarchical language identification based on automatic language clustering |
Schmitt et al. | Facing reality: Simulating deployment of anger recognition in ivr systems |