Gleason et al., 2001 - Google Patents

Composite background models and score standardization for language identification systems

Document ID
12547677552124761505
Author
Gleason T
Zissman M
Publication year
2001
Publication venue
2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 01CH37221)

Snippet

Describes two enhancements to our language identification system. Composite background (CBG) modeling allows us to identify target language speech in an environment where labeled background training data is unavailable or limited. Instead of separate models for …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/197 Probabilistic grammars, e.g. word n-grams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. hidden Markov models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2785 Semantic analysis
    • G06F17/279 Discourse representation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/68 Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
    • G06K9/6807 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
    • G06K9/6842 Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Similar Documents

Publication, Title
CN101548313B (en) Voice activity detection system and method
Young Detecting misrecognitions and out-of-vocabulary words
Zissman et al. Automatic language identification of telephone speech messages using phoneme recognition and n-gram modeling
EP0708960B1 (en) Topic discriminator
US6738745B1 (en) Methods and apparatus for identifying a non-target language in a speech recognition system
US8024188B2 (en) Method and system of optimal selection strategy for statistical classifications
US8050929B2 (en) Method and system of optimal selection strategy for statistical classifications in dialog systems
McDonough et al. Approaches to topic identification on the switchboard corpus
Szöke et al. Phoneme based acoustics keyword spotting in informal continuous speech
US20060025995A1 (en) Method and apparatus for natural language call routing using confidence scores
Simonnet et al. ASR error management for improving spoken language understanding
Lane et al. Out-of-domain utterance detection using classification confidences of multiple topics
Kawahara et al. Key-phrase detection and verification for flexible speech understanding
Raghuvanshi et al. Entity resolution for noisy ASR transcripts
Zissman Predicting, diagnosing and improving automatic language identification performance.
Kawahara et al. Combining key-phrase detection and subword-based verification for flexible speech understanding
US6178398B1 (en) Method, device and system for noise-tolerant language understanding
Lincoln et al. A comparison of two unsupervised approaches to accent identification
Gleason et al. Composite background models and score standardization for language identification systems
Duchateau et al. Confidence scoring based on backward language models
Ramesh et al. Context dependent anti subword modeling for utterance verification.
Bouwman et al. Weighting phone confidence measures for automatic speech recognition
Hazen et al. Discriminative feature weighting using MCE training for topic identification of spoken audio recordings
Yin et al. Improvements on hierarchical language identification based on automatic language clustering
Schmitt et al. Facing reality: Simulating deployment of anger recognition in ivr systems