JP2001268669A

JP2001268669A - Device and method for equipment control using mobile telephone terminal and recording medium

Info

Publication number: JP2001268669A
Application number: JP2000077566A
Authority: JP
Inventors: Tetsuya Sakayori; 哲也酒寄
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2000-03-21
Filing date: 2000-03-21
Publication date: 2001-09-28

Abstract

PROBLEM TO BE SOLVED: To provide an equipment control system that is general-purpose with respect to equipment and specialized for each user. SOLUTION: When a user speaks, a voice recognition part 5 compares the voice features extracted by a voice feature extraction part 2 with a voice feature dictionary held in a user information storage part 4, and a conversation control part 8 recognizes the keyword spoken by the user from among recognition candidates set according to status information, etc., of the equipment. The conversation control part 8 sends a command for controlling the equipment to an equipment control part 9 according to the recognized keyword, thereby controlling the equipment 10. In addition to converting the voice recognition result into a command and limiting the voice recognition candidates, the conversation control part 8 generates speech sentences, limits user speech, and explains and changes keywords.

Description

DETAILED DESCRIPTION OF THE INVENTION

【０００１】[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a device control apparatus, method, and recording medium using a mobile telephone terminal. More specifically, it relates to a device control apparatus, method, and recording medium in which the mobile telephone terminal performs voice recognition, voice synthesis, and dialogue processing, so that the terminal serves as a personalized user interface for controlling devices that require complicated operations.

【０００２】[0002]

2. Description of the Related Art Japanese Patent Laid-Open No. 5-130181 describes an invention relating to a cordless telephone device having an infrared remote control function for AV equipment, air conditioners, and the like. Japanese Patent Laid-Open No. 10-93665 describes an invention relating to an indoor device control apparatus that controls a plurality of devices using a mobile telephone terminal. In that apparatus, a list of functions is transmitted from the device side, and the user can select from the list on the mobile telephone to instruct an operation.

【０００３】[0003]

[Problems to Be Solved by the Invention] Today, people live surrounded by a wide variety of devices and must memorize user interfaces that were designed separately for each device, which is a heavy burden. It would therefore be ideal to carry a single user interface unit that is general-purpose with respect to devices and specialized for the user, and to control a plurality of devices with it. A mobile telephone terminal, which has a communication function and excellent portability, offers great advantages as such a general-purpose user interface.

[0004] In the "cordless telephone device" described in Japanese Patent Laid-Open No. 5-130181, infrared communication is used between the cordless telephone device and the devices because of its high affinity with home appliances. However, since a plurality of devices are operated with the limited set of buttons on the telephone, the burden on the user remains large; for example, the user must memorize many correspondences between button operations and device operations.

[0005] In the "indoor device control apparatus using a mobile telephone terminal" described in Japanese Patent Laid-Open No. 10-93665, the device side transmits a list of controllable functions to the mobile telephone, and the user can view the list and operate the device selectively. This reduces the memory load of operation and allows the number of supported devices to be expanded. However, when a device has many functions, selecting the desired one is not easy with only the display and buttons of an ordinary mobile telephone. Furthermore, when many parameters must be entered, for example in a setting operation, a general-purpose interface tends to require repeated cumbersome operations. When control of diverse devices is considered, it is also difficult to convey the functions unique to each device to the user in a concise display, and difficult for the user to memorize all of the operations. Even if a plurality of devices can be controlled through a general-purpose user interface, an operation scheme specialized for the individual cannot be built, so the burden on the user is large.

[0006] Voice recognition, meanwhile, has the following problems. It is difficult to obtain a high recognition rate for anyone's speech without specializing in the pronunciation of a specific user, yet a device shared by an unspecified number of people must recognize everyone's voice accurately. In addition, the words used in an utterance vary with the individual, the context, and the situation, so conversion into device operation commands always involves ambiguity. When the speaker and the speaker's context cannot be identified, the number of recognition candidates grows, which lowers the recognition rate further.

[0007] The present invention has been made in view of the above circumstances and has the following objects. The inventions of claims 1, 16, 31, and 32 make the various functions of a device usable without increasing the user's memory load, by sending device control commands through a dialogue conducted via the mobile telephone terminal.

[0008] The inventions of claims 2, 17, 31, and 32 make the various functions of a device usable without increasing the user's memory load, by sending device control commands through a voice dialogue conducted via the mobile telephone terminal, and aim to improve the voice recognition rate by performing voice recognition specialized for the individual even when the users are an unspecified large number of people.

[0009] The inventions of claims 3, 18, 31, and 32 recognize a plurality of keywords at the same time, making it easy to give operation instructions that include parameter settings, which tend to be cumbersome.

[0010] The inventions of claims 4, 19, 31, and 32 use spoken explanations to help the user understand even complicated functions that are hard to grasp from text alone on a limited display area.

[0011] The inventions of claims 5, 20, 31, and 32 allow the user to change the correspondence between commands and keywords, thereby building for each user an operation scheme that can be memorized without strain and reducing the memory load.

[0012] The inventions of claims 6, 21, 31, and 32 hold each user's utterance history and command history in the mobile telephone terminal, thereby estimating the user's context, resolving the ambiguity of user utterances, and converting the utterances into appropriate device operation commands.

[0013] The inventions of claims 7, 8, 22, 23, 31, and 32 make knowledge acquired from operating similar devices easy to reuse, thereby realizing a low-stress user interface suited to each individual user.

[0014] The inventions of claims 9, 24, 31, and 32 generate messages that the user can easily understand on the basis of previously acquired knowledge.

[0015] The inventions of claims 10, 25, 31, and 32 aim to improve the voice recognition rate by narrowing the vocabulary down to words that the user finds easy to say.

[0016] The inventions of claims 11, 26, 31, and 32 guide the user through operations that feel natural by following a workflow the user is accustomed to.

[0017] The inventions of claims 12, 27, 31, and 32 aim to improve the voice recognition rate by predicting the order of utterances and thereby narrowing down the recognition targets.

[0018] The inventions of claims 13, 28, 31, and 32 prevent erroneous operation by emphasizing, in the guidance, the functions that are specific to the device being operated.

[0019] The inventions of claims 14, 29, 31, and 32 improve operating efficiency by setting appropriate default values.

[0020] The inventions of claims 15, 30, 31, and 32 enable operation focused on the setting items that the user considers important.

【００２１】[0021]

[Means for Solving the Problems] The invention of claim 1 is a device control apparatus using a mobile telephone terminal, comprising a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, the apparatus storing commands representing operation instructions to a device, parameters necessary for setting the operation of the device, and the relationship between the commands and the parameters, notifying the device control unit of a command selected by the user on the mobile telephone terminal, and controlling the device through the device control unit. The apparatus is characterized by comprising: input recognition means for recognizing an input including a keyword; output synthesis means for sending an output message to the user; dialogue control means for limiting the recognition candidates of the input recognition means based on status information of the device, generating a device control command from the recognition result of the input recognition means, and generating an output message in response to an information request from the device; and device control means for controlling the device in accordance with the device control command generated by the dialogue control means and sending status information of the device to the dialogue control means.
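The role of the dialogue control means in claim 1 (limiting recognition candidates from device status, then turning a recognized keyword into a command) can be illustrated with a minimal sketch. The keyword table, the status names, and the function names below are all hypothetical; the patent does not specify concrete keywords, states, or data structures.

```python
from typing import Optional, Set

# Hypothetical table: keyword -> (command, device states in which it is valid).
KEYWORDS = {
    "copy": ("START_COPY", {"idle"}),
    "stop": ("STOP", {"copying"}),
    "status": ("REPORT_STATUS", {"idle", "copying"}),
}

def recognition_candidates(status: str) -> Set[str]:
    """Limit the recognition candidates to keywords valid in the current status."""
    return {kw for kw, (_cmd, states) in KEYWORDS.items() if status in states}

def to_command(recognized: str, status: str) -> Optional[str]:
    """Convert a recognized keyword into a device control command.

    Returns None when the keyword is not a candidate in the current status.
    """
    if recognized not in recognition_candidates(status):
        return None
    return KEYWORDS[recognized][0]
```

In this sketch, a device reporting the status "idle" would never have "stop" matched against the user's input, which is one way the candidate set could shrink as the device state changes.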

[0022] The invention of claim 2 is a device control apparatus using a mobile telephone terminal, comprising a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, the apparatus storing commands representing operation instructions to a device, parameters necessary for setting the operation of the device, and the relationship between the commands and the parameters, notifying the device control unit of a command selected by the user on the mobile telephone terminal, and controlling the device through the device control unit. The apparatus is characterized by comprising: voice recognition means for recognizing a user utterance including a keyword; voice synthesis means for sending a voice output message to the user; dialogue control means for limiting the recognition candidates of the voice recognition means based on status information of the device, generating a device control command from the recognition result of the voice recognition means, and generating an output message in response to an information request from the device; and device control means for controlling the device in accordance with the device control command generated by the dialogue control means and sending status information of the device to the dialogue control means.

[0023] The invention of claim 3 is characterized in that, in the invention of claim 2, the voice recognition means can recognize a plurality of keywords at the same time, and by recognizing a plurality of keywords, an operation instruction combining a plurality of the commands can be given, or a plurality of the parameters can be set simultaneously.
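As an illustration of recognizing several keywords in one utterance, the following sketch collects every command keyword and parameter keyword it finds in a list of recognized words. The vocabularies and the function name are assumptions made for the example, not taken from the patent.

```python
# Hypothetical vocabularies; the patent does not list concrete keywords.
COMMANDS = {"copy": "START_COPY", "staple": "STAPLE"}
PARAMETERS = {
    "double-sided": ("duplex", True),
    "a4": ("paper_size", "A4"),
}

def parse_utterance(words):
    """Collect every command and parameter keyword found in one utterance."""
    commands, params = [], {}
    for word in words:
        key = word.lower()
        if key in COMMANDS:
            commands.append(COMMANDS[key])
        elif key in PARAMETERS:
            name, value = PARAMETERS[key]
            params[name] = value
    return commands, params
```

With this shape, a single utterance such as "A4 double-sided copy" would yield one command and two parameter settings, rather than requiring three separate dialogue turns.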

[0024] The invention of claim 4 is characterized in that, in the invention of claim 2, the mobile telephone terminal holds, attached to each keyword, explanatory information on the device operation corresponding to that keyword, and presents the held explanatory information by voice or visual display at the user's request.

[0025] The invention of claim 5 is characterized in that, in the invention of claim 2, the correspondence between the keywords and the commands can be changed by operating the mobile telephone terminal.
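A user-editable correspondence between keywords and commands could be held as a small per-user table on the terminal. The class below is a hypothetical sketch of that idea; the class name, method names, and the rule that a keyword may be bound to only one command are assumptions.

```python
class KeywordMap:
    """Per-user keyword -> command table held on the phone (illustrative only)."""

    def __init__(self, defaults):
        self._map = dict(defaults)

    def rebind(self, old_keyword, new_keyword):
        """Make new_keyword trigger the command currently bound to old_keyword."""
        if new_keyword in self._map:
            raise ValueError(f"{new_keyword!r} is already bound")
        # pop() raises KeyError if old_keyword is unknown, leaving the map unchanged.
        self._map[new_keyword] = self._map.pop(old_keyword)

    def command_for(self, keyword):
        """Return the command bound to keyword, or None if there is none."""
        return self._map.get(keyword)
```

Because the table belongs to one user, two users could bind different words to the same command without affecting each other.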

[0026] The invention of claim 6 is characterized in that, in the invention of claim 2, the dialogue control means holds the history of user utterances, the history of apparatus utterances, and the command history up to the present, and limits the recognition candidates of the voice recognition means by using the held histories in addition to the status information of the device.
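The patent states that the held histories are used to limit the recognition candidates but does not say how. One possible reading, shown here purely as an assumption, is to keep only the status-valid keywords whose commands the user has issued most often, and to fall back to the full candidate set when the history offers no evidence.

```python
from collections import Counter

def restrict_by_history(candidates, command_history, keyword_to_command, top_n=3):
    """Keep the candidates whose commands appear most often in the history.

    Falls back to all candidates (sorted) when none of them appears in the history.
    The frequency ranking and top_n cutoff are assumptions, not from the patent.
    """
    counts = Counter(command_history)
    seen = [kw for kw in candidates if counts[keyword_to_command[kw]] > 0]
    if not seen:
        return sorted(candidates)
    seen.sort(key=lambda kw: (-counts[keyword_to_command[kw]], kw))
    return seen[:top_n]
```

A real system would presumably also weigh the utterance histories and recency; this sketch only uses command counts to keep the idea visible.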

[0027] The invention of claim 7 is a device control apparatus using a mobile telephone terminal, comprising a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, and controlling the device through dialogue. The apparatus is characterized by comprising: history storage means for storing operation histories and dialogue histories; user model analysis means for analyzing the data stored in the history storage means and outputting at least one of a vocabulary list and a workflow; device classification means for classifying each device used, according to its purpose, the work performed, and the like, into a device class having a typical function group; user model storage means for storing the analysis result of the user model analysis means in association with the classification result of the device classification means; and dialogue processing means for performing dialogue processing using the data stored in the user model storage means.

[0028] The invention of claim 8 is a device control apparatus using a mobile telephone terminal, comprising a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, and controlling the device through voice dialogue. The apparatus is characterized by comprising: history storage means for storing operation histories and dialogue histories; user model analysis means for analyzing the data stored in the history storage means and outputting at least one of a vocabulary list and a workflow; device classification means for classifying each device used, according to its purpose, the work performed, and the like, into a device class having a typical function group; user model storage means for storing the analysis result of the user model analysis means in association with the classification result of the device classification means; and dialogue processing means for performing dialogue processing using the data stored in the user model storage means.
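The combination of history storage, user model analysis, and storage keyed by device class in claims 7 and 8 can be pictured with a small sketch. Here the "analysis" is reduced to counting the words a user has spoken while operating devices of a given class; the class name, method names, and the word-frequency criterion are assumptions for illustration only.

```python
from collections import Counter, defaultdict

class UserModelStore:
    """Vocabulary lists keyed by device class (illustrative only)."""

    def __init__(self):
        self._words = defaultdict(Counter)

    def record(self, device_class, utterance_words):
        """Add the words of one user utterance to the history for a device class."""
        self._words[device_class].update(w.lower() for w in utterance_words)

    def vocabulary(self, device_class, top_n=5):
        """Return the user's most frequent words for this device class."""
        ranked = sorted(self._words[device_class].items(),
                        key=lambda kv: (-kv[1], kv[0]))
        return [word for word, _count in ranked[:top_n]]
```

Because the list is keyed by device class rather than by individual device, vocabulary learned on one copier would be available the first time the user operates a different copier.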

[0029] The invention of claim 9 is characterized in that, in the invention of claim 8, a voice output message is generated using the vocabulary list associated with the device class to which the device being operated belongs.

[0030] The invention of claim 10 is characterized in that, in the invention of claim 8, voice recognition candidates are generated using the vocabulary list associated with the device class to which the device being operated belongs.

[0031] The invention of claim 11 is characterized in that, in the invention of claim 8, a voice output message is created using the workflow associated with the device class to which the device being operated belongs.

[0032] The invention of claim 12 is characterized in that, in the invention of claim 8, voice recognition candidates are generated using the workflow associated with the device class to which the device being operated belongs.

[0033] The invention of claim 13 is characterized in that, in the invention of claim 8, for a function that is not included in the function group associated with the device class to which the device being operated belongs, a detailed voice output message is generated and emphasized synthesized speech is produced.
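The selection logic of claim 13 amounts to a set-membership test: functions inside the typical function group of the device class get a brief prompt, and functions outside it get a detailed, emphasized one. The sketch below models emphasis as a boolean flag that a speech synthesizer could act on; the function name, the message format, and the flag are hypothetical.

```python
def build_guidance(device_functions, typical_functions, descriptions):
    """Return (text, emphasized) pairs for each function of the target device.

    Functions in the typical group get a brief, unemphasized prompt; the others
    get a detailed message flagged for emphasized synthesis.
    """
    messages = []
    for name in device_functions:
        if name in typical_functions:
            messages.append((name, False))
        else:
            detail = descriptions.get(name, "no description available")
            messages.append((f"{name}: {detail}", True))
    return messages
```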

[0034] The invention of claim 14 is characterized in that, in the invention of claim 8, the workflow is created such that, for a setting item whose value the user must set when using a device, if the values set by the target user vary little and remain stable even across different devices, that value is used as the default value of the setting item.
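One way to decide that a user's setting "varies little and remains stable" is to require that a single value account for most of the recorded uses. The thresholds `min_uses` and `min_share` below are assumptions chosen for the example; the patent gives no numerical criterion.

```python
from collections import Counter

def derive_defaults(history, min_uses=3, min_share=0.8):
    """Derive default values from a user's past settings.

    history: list of dicts {setting_item: value}, one per device use.
    A value becomes the default when the item was set at least min_uses times
    and that value accounts for at least min_share of those uses.
    """
    per_item = {}
    for use in history:
        for item, value in use.items():
            per_item.setdefault(item, []).append(value)
    defaults = {}
    for item, values in per_item.items():
        if len(values) < min_uses:
            continue
        value, count = Counter(values).most_common(1)[0]
        if count / len(values) >= min_share:
            defaults[item] = value
    return defaults
```

In this sketch a user who always chooses A4 but varies the number of copies would get a default for paper size only, so the dialogue would still ask for the copy count.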

[0035] The invention of claim 15 is characterized in that, in the invention of claim 8, the workflow is created with a lowered setting-process priority for setting items that are rarely changed from their default values.
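Lowering the priority of rarely changed items can be sketched as sorting the setting items by how often the user has departed from the default. The change-rate measure and the function name are assumptions; the patent only states that such items are given lower priority.

```python
def order_setting_items(history, defaults):
    """Sort setting items so that those most often changed from default come first.

    history: list of dicts {setting_item: value}, one per device use.
    defaults: dict {setting_item: default_value}.
    """
    def change_rate(item):
        uses = [use[item] for use in history if item in use]
        if not uses:
            return 0.0
        return sum(v != defaults[item] for v in uses) / len(uses)

    # Ties are broken alphabetically so the order is deterministic.
    return sorted(defaults, key=lambda item: (-change_rate(item), item))
```

A dialogue built on this order would ask first about the items the user tends to adjust and leave the stable ones for last, or skip them unless the user asks.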

[0036] The invention of claim 16 is a device control method using a mobile telephone terminal, employing a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, the method storing commands representing operation instructions to a device, parameters necessary for setting the operation of the device, and the relationship between the commands and the parameters, notifying the device control unit of a command selected by the user on the mobile telephone terminal, and controlling the device through the device control unit. The method is characterized by comprising: an input recognition step of recognizing an input including a keyword; an output synthesis step of sending an output message to the user; a dialogue control step of limiting the recognition candidates of the input recognition step based on status information of the device, generating a device control command from the recognition result of the input recognition step, and generating an output message in response to an information request from the device; and a device control step of controlling the device in accordance with the device control command generated in the dialogue control step and sending status information of the device to the dialogue control step.

[0037] The invention of claim 17 is a device control method using a mobile telephone terminal, employing a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, the method storing commands representing operation instructions to a device, parameters necessary for setting the operation of the device, and the relationship between the commands and the parameters, notifying the device control unit of a command selected by the user on the mobile telephone terminal, and controlling the device through the device control unit. The method is characterized by comprising: a voice recognition step of recognizing a user utterance including a keyword; a voice synthesis step of sending a voice output message to the user; a dialogue control step of limiting the recognition candidates of the voice recognition step based on status information of the device, generating a device control command from the recognition result of the voice recognition step, and generating an output message in response to an information request from the device; and a device control step of controlling the device in accordance with the device control command generated in the dialogue control step and sending status information of the device to the dialogue control step.

[0038] The invention of claim 18 is characterized in that, in the invention of claim 17, a plurality of keywords can be recognized at the same time in the voice recognition step, and by recognizing a plurality of keywords, an operation instruction combining a plurality of the commands can be given, or a plurality of the parameters can be set simultaneously.

[0039] The invention of claim 19 is characterized in that, in the invention of claim 17, the mobile telephone terminal holds, attached to each keyword, explanatory information on the device operation corresponding to that keyword, and presents the held explanatory information by voice or visual display at the user's request.

[0040] The invention of claim 20 is characterized in that, in the invention of claim 17, the correspondence between the keywords and the commands can be changed by operating the mobile telephone terminal.

[0041] The invention of claim 21 is characterized in that, in the invention of claim 17, the dialogue control step holds the history of user utterances, the history of apparatus utterances, and the command history up to the present, and limits the recognition candidates of the voice recognition step by using the held histories in addition to the status information of the device.

[0042] The invention of claim 22 is a device control method using a mobile telephone terminal, employing a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, and controlling the device through dialogue. The method is characterized by comprising: a history storage step of storing operation histories and dialogue histories; a user model analysis step of analyzing the data stored in the history storage step and outputting at least one of a vocabulary list and a workflow; a device classification step of classifying each device used, according to its purpose, the work performed, and the like, into a device class having a typical function group; a user model storage step of storing the analysis result of the user model analysis step in association with the classification result of the device classification step; and a dialogue processing step of performing dialogue processing using the data stored in the user model storage step.

[0043] The invention of claim 23 is a device control method using a mobile telephone terminal, employing a device control unit that controls one or more devices and a mobile telephone terminal that serves as the interface for each user, and controlling the device through voice dialogue. The method is characterized by comprising: a history storage step of storing operation histories and dialogue histories; a user model analysis step of analyzing the data stored in the history storage step and outputting at least one of a vocabulary list and a workflow; a device classification step of classifying each device used, according to its purpose, the work performed, and the like, into a device class having a typical function group; a user model storage step of storing the analysis result of the user model analysis step in association with the classification result of the device classification step; and a dialogue processing step of performing dialogue processing using the data stored in the user model storage step.

[0044] The invention of claim 24 is characterized in that, in the invention of claim 23, a voice output message is generated using the vocabulary list associated with the device class to which the device being operated belongs.

[0045] The invention of claim 25 is characterized in that, in the invention of claim 23, voice recognition candidates are generated using the vocabulary list associated with the device class to which the device being operated belongs.

[0046] The invention of claim 26 is characterized in that, in the invention of claim 23, a voice output message is created using the workflow associated with the device class to which the device being operated belongs.

[0047] The invention of claim 27 is characterized in that, in the invention of claim 23, voice recognition candidates are generated using the workflow associated with the device class to which the device being operated belongs.

[0048] The invention of claim 28 is characterized in that, in the invention of claim 23, for a function that is not included in the function group associated with the device class to which the device being operated belongs, a detailed voice output message is generated and emphasized synthesized speech is produced.

【００４９】請求項２９の発明は、請求項２３の発明に
おいて、前記ワークフローは、機器使用時にユーザが値
を設定すべき設定項目に関して、対象ユーザが設定する
設定値が、異なる機器使用時においても変化が小さく安
定している場合には、前記設定項目の前記設定値を既定
値として作成することを特徴としたものである。According to a twenty-ninth aspect, in the twenty-third aspect, when, for a setting item whose value the user must set when using a device, the value set by the target user changes little and remains stable even across different devices, the workflow is created with that value as the default for the setting item.

【００５０】請求項３０の発明は、請求項２３の発明に
おいて、前記ワークフローは、既定値からの変更が少な
い設定項目に対しては、設定処理の優先順位を下げて作
成することを特徴としたものである。According to a thirtieth aspect, in the twenty-third aspect, the workflow is created with a lowered setting-process priority for setting items that are rarely changed from their default values.

【００５１】請求項３１の発明は、請求項１乃至１５の
いずれか１に記載の移動電話端末を利用した機器制御装
置として機能させるためのプログラムを記録したことを
特徴としたものである。According to a thirty-first aspect of the present invention, there is provided a recording medium on which is recorded a program for functioning as the device control apparatus using a mobile telephone terminal according to any one of the first to fifteenth aspects.

【００５２】請求項３２の発明は、請求項１６乃至３０
のいずれか１に記載の移動電話端末を利用した機器制御
方法を実現させるためのプログラムを記録したことを特
徴としたものである。According to a thirty-second aspect of the present invention, there is provided a recording medium on which is recorded a program for realizing the device control method using a mobile telephone terminal according to any one of the sixteenth to thirtieth aspects.

【００５３】[0053]

【発明の実施の形態】図１は、本発明の一実施形態にお
ける機器制御装置の構成を示すブロック図である。本実
施形態における機器制御装置の構成要素である移動電話
端末と機器制御ユニットは、電波や赤外線など、なんら
かの無線通信によって通信可能であり、この通信は直接
でも、なんらかの基地局を経由したものでも利用可能で
ある。また、機器制御ユニットは、機器１０に内蔵した
ものでも、なんらかのインターフェースによって接続す
る別筐体のものでも良い。また、１つの機器に対して１
つの機器制御ユニットで制御するように構成すること
も、複数の機器を１つの機器制御ユニットで制御するよ
うに構成することも可能である。FIG. 1 is a block diagram showing the configuration of a device control apparatus according to one embodiment of the present invention. The mobile telephone terminal and the device control unit, which are the components of the device control apparatus in this embodiment, can communicate by some form of wireless communication such as radio waves or infrared rays, and this communication may be either direct or relayed through a base station. The device control unit may be built into the device 10 or housed separately and connected through some interface. One device control unit may control a single device, or a single device control unit may control a plurality of devices.

【００５４】本実施形態では、操作部１，音声特徴抽出
部２，音声合成部３，ユーザ情報記憶部４を移動電話端
末に備え、音声認識部５，音声特徴生成部６，機器情報
記憶部７，対話制御部８，機器制御部９を機器制御ユニ
ットに備える。この構成は移動電話端末での処理をＤＳ
Ｐ（ＤｉｇｉｔａｌＳｕｒｒｏｕｎｄＰｒｏｃｅｓ
ｓｏｒ）の使用に向いたものにまとめることによって、
処理効率の向上を狙ったものである。他方、操作部１，
音声特徴抽出部２，音声合成部３，ユーザ情報記憶部
４，音声認識部５，音声特徴生成部６，機器情報記憶部
７，対話制御部８までを移動電話端末に含めることによ
って、機器制御ユニットの処理量を軽減するとともに、
移動電話端末・機器制御ユニット間の通信データ量を減
らすことで、多くの機器への機器制御ユニット接続を容
易にして、本装置の普及促進を図るような構成も可能で
ある。このように各部の実装は移動電話端末、機器制御
ユニットのどちらかに固定されるものではない。In this embodiment, the mobile telephone terminal includes the operation unit 1, the voice feature extraction unit 2, the voice synthesis unit 3, and the user information storage unit 4, while the device control unit includes the speech recognition unit 5, the voice feature generation unit 6, the device information storage unit 7, the dialogue control unit 8, and the device control unit 9. This arrangement aims to improve processing efficiency by gathering in the mobile telephone terminal the processing that is well suited to a DSP (spelled out in the original as "Digital Surround Processor"). Alternatively, the operation unit 1, the voice feature extraction unit 2, the voice synthesis unit 3, the user information storage unit 4, the speech recognition unit 5, the voice feature generation unit 6, the device information storage unit 7, and the dialogue control unit 8 may all be placed in the mobile telephone terminal. That reduces the processing load on the device control unit and the amount of data exchanged between the mobile telephone terminal and the device control unit, which makes it easier to attach device control units to many devices and so promotes adoption of the apparatus. The placement of each unit is thus not fixed to either the mobile telephone terminal or the device control unit.

【００５５】本発明の実装としては、移動電話端末であ
る携帯電話からのコマンドを機器と接続された機器制御
ユニットへ送り、機器を制御するようにしても良い。ま
た、携帯電話にて、自宅の電話と対話して、その電話に
接続された（或いは電話を内包する）機器制御ユニット
から各機器を制御するようにしても良い。本発明は、不
特定多数のユーザがある程度複雑な操作を行う機器で、
特にその効果を発揮するが、その例としてコンビニエン
ス・ストアにおける複写機や、カラオケ・ボックスにお
けるカラオケ機の操作が挙げられる。以下では複写機を
制御対象機器１０とした場合について、本実施形態にお
ける各部の動作を示す。In one implementation of the present invention, commands from a cellular phone serving as the mobile telephone terminal may be sent to a device control unit connected to a device, thereby controlling the device. Alternatively, the user may use the cellular phone to interact with a home telephone, and each device may be controlled from a device control unit connected to that telephone (or containing it). The present invention is particularly effective for devices on which an unspecified large number of users perform moderately complex operations; examples include operating a copying machine in a convenience store or a karaoke machine in a karaoke box. The following describes the operation of each unit in this embodiment for the case where the device 10 to be controlled is a copying machine.

【００５６】ユーザ情報記憶部４にはユーザ毎の固有情
報を記憶する。具体的には音声特徴辞書，発話履歴，コ
マンド履歴，ユーザキーワードなどである。操作部１で
は、機器１０の情報を機器情報記憶部７に入力、或いは
登録の補助をする。The user information storage unit 4 stores information unique to each user, specifically a voice feature dictionary, an utterance history, a command history, user keywords, and the like. The operation unit 1 is used to enter information about the device 10 into the device information storage unit 7, or to assist with its registration.

【００５７】機器情報記憶部７では機器制御ユニットが
制御する機器固有の情報を記憶する。具体的には機器Ｉ
Ｄ，キーワード・コマンド・パラメータ・説明の対応表
などである。図２は、本実施形態における機器固有の情
報の一例を示す図である。本例においては、上記の対応
表において、キーワードとして、キーワードリスト，キ
ーワード変更，余白・マージン・綴じ代，両面，拡大・
ズーム，ｗｏｒｄ，ｎセンチ・ｎミリ，偶数・奇数，ｎ
パーセントを挙げている。例えば、キーワード「余白・
マージン・綴じ代」に対するコマンドは＃ＭＡＲＧＩＮ
であり、そのキーワードに対しては「幅」が必要となっ
てくるので、コマンド＠ＷＩＤＴＨをパラメータとして
対応させている。このキーワードの説明としては、「縮
小して綴じ代を調整します。」と表示或いは発声するよ
うに対応づけられている。なお、表中、ｎは任意の数字
表現を、ｗｏｒｄは任意の言語表現を表す。The device information storage unit 7 stores information unique to the devices controlled by the device control unit, specifically a device ID and a correspondence table of keywords, commands, parameters, and descriptions. FIG. 2 shows an example of such device-specific information in this embodiment. In this example, the table lists the following keywords: keyword list; keyword change; blank space / margin / binding margin; duplex; enlarge / zoom; word; n centimeters / n millimeters; even / odd; and n percent. For example, the command for the keyword group "blank space / margin / binding margin" is #MARGIN, and because this keyword requires a width, @WIDTH is associated with it as the parameter. The description associated with this keyword, to be displayed or spoken, is "Reduces the image to adjust the binding margin." In the table, n stands for any numeric expression and word for any verbal expression.
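
As a minimal illustrative sketch (not part of the original disclosure), the correspondence table of FIG. 2 could be held as a list of rows, each tying a set of synonymous keywords to one command, an optional parameter, and a description. Only the #MARGIN / @WIDTH row follows the text; the class name `KeywordEntry`, the function `lookup`, the romanized synonyms, and the #DUPLEX row are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class KeywordEntry:
    keywords: Tuple[str, ...]   # synonyms that all map to the same command
    command: str                # command identifier, e.g. "#MARGIN"
    parameter: Optional[str]    # required parameter, e.g. "@WIDTH", or None
    description: str            # text displayed or spoken to explain the keyword


# The #MARGIN row follows the text; the #DUPLEX row is a hypothetical filler.
DEVICE_TABLE = (
    KeywordEntry(("yohaku", "margin", "tojishiro"), "#MARGIN", "@WIDTH",
                 "Reduces the image to adjust the binding margin."),
    KeywordEntry(("duplex",), "#DUPLEX", None, "Copies on both sides."),
)


def lookup(keyword: str) -> Optional[KeywordEntry]:
    """Return the row whose synonym set contains `keyword`, or None."""
    for entry in DEVICE_TABLE:
        if keyword in entry.keywords:
            return entry
    return None
```

Keeping the synonyms in one row means that any of them resolves to the same command and parameter requirement.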

【００５８】音声特徴抽出部２は、移動電話端末のマイ
クから入力されデジタル変換された音声波形から、音声
認識に必要な音声特徴を抽出する。音声特徴及びその抽
出方法に関しては既存の音声認識技術で用いるものと何
ら変わるものではないのでここでは詳しい説明は省略す
る。The voice feature extraction unit 2 extracts voice features required for voice recognition from a voice waveform input from a microphone of the mobile telephone terminal and converted into a digital signal. The speech features and the method of extracting them are not different from those used in the existing speech recognition technology, so that detailed description is omitted here.

【００５９】音声認識部５は、音声特徴抽出部２で抽出
された音声特徴を、ユーザ情報記憶部４で保持する音声
特徴辞書と比較し、対話制御部８で設定された認識候補
の中からユーザが発声したキーワードを認識する。この
際、複数のキーワードを同時に認識可能とする。例え
ば、「Ａ４に縮小して両面で１０部コピーしなさい」と
いうユーザ発話から、「Ａ４」，「縮小」，「両面」，
「１０部」などのキーワードが得られる。認識過程に関
しては既存の音声認識技術で用いるものと何ら変わるも
のではないのでここでは詳しい説明は省略する。The speech recognition unit 5 compares the voice features extracted by the voice feature extraction unit 2 with the voice feature dictionary held in the user information storage unit 4, and recognizes the keywords spoken by the user from among the recognition candidates set by the dialogue control unit 8. Multiple keywords can be recognized at once. For example, from the user utterance "Reduce to A4 and make 10 double-sided copies," keywords such as "A4," "reduce," "double-sided," and "10 copies" are obtained. The recognition process is no different from that of existing speech recognition technology, so a detailed description is omitted here.

【００６０】対話制御部８の機能は大きく分けて３つあ
る。音声認識結果からコマンドへの変換と音声認識候補
の限定、装置の発話文の生成とユーザ発話の限定、キー
ワードの説明と変更である。以下で順に説明する。The dialogue control unit 8 has three main functions: (1) converting speech recognition results into commands and limiting the speech recognition candidates; (2) generating the device's utterances and constraining the user's utterances; and (3) explaining and changing keywords. These are described in order below.

【００６１】多機能な機器ではコマンドの数も多くな
り、さらにひとつのコマンドに対応するキーワードは１
つとは限らないので、すべてを音声認識部５の対象認識
候補とすると、それは膨大なものになり、処理時間の増
大と認識率の低下を招く。A multifunctional device has many commands, and a single command does not necessarily correspond to just one keyword. If all of these keywords were made recognition candidates for the speech recognition unit 5, the candidate set would become enormous, increasing processing time and lowering the recognition rate.

【００６２】そこで、対話制御部８では、機器制御部９
から得られる機器１０のステータス情報、ユーザ情報記
憶部４から得られる発話履歴，コマンド履歴などから、
認識候補を限定して音声認識部５へ送る。ステータス情
報から認識候補を限定する例としては、現在複写動作中
であれば各種設定用キーワードは認識候補とせず、「ス
トップ」，「割り込み」，「キャンセル」などを候補と
するなどがある。発話履歴から候補を限定する例として
は、以前の使用において、当該移動電話端末のユーザが
「余白」というキーワードを使用した場合、認識候補と
して「マージン」ではなく「余白」を優先的に設定する
などがある。コマンド履歴から候補を限定する例として
は、部数設定コマンドが未出でソータ使用コマンドが既
出の場合、部数設定コマンドに対応するキーワードを優
先的に設定するなどがある。The dialogue control unit 8 therefore limits the recognition candidates, based on the status information of the device 10 obtained from the device control unit 9 and on the utterance history, command history, and the like obtained from the user information storage unit 4, and sends them to the speech recognition unit 5. As an example of limiting candidates by status information, while a copy operation is in progress the various setting keywords are excluded and only words such as "stop," "interrupt," and "cancel" are made candidates. As an example of limiting candidates by utterance history, if the user of the mobile telephone terminal used the keyword yohaku (余白, "blank space") in a previous session, yohaku rather than the loanword mājin (マージン, "margin") is set preferentially as a recognition candidate. As an example of limiting candidates by command history, if the sorter command has already been issued but the number-of-copies command has not, the keywords corresponding to the number-of-copies command are set preferentially.
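
The three limiting rules above can be sketched roughly as follows. This is an assumption-laden illustration, not the disclosed implementation: the function name `select_candidates`, the status string "copying", and the command names #SORT and #COPIES are hypothetical, and the two synonyms 余白 and マージン are written as "yohaku" and "margin". The sketch returns an ordered list, most preferred first; a real recognizer might additionally truncate it.

```python
CONTROL_KEYWORDS = ["stop", "interrupt", "cancel"]


def select_candidates(status, keywords_by_command, utterance_history, command_history):
    """Return recognition candidates, most preferred first."""
    # Status rule: while copying, setting keywords are not candidates at all.
    if status == "copying":
        return list(CONTROL_KEYWORDS)

    groups = []
    for command, keywords in keywords_by_command.items():
        # Utterance-history rule: synonyms the user has already spoken come first.
        used = [k for k in keywords if k in utterance_history]
        unused = [k for k in keywords if k not in utterance_history]
        groups.append((command, used + unused))

    # Command-history rule: sorter requested but number of copies still missing,
    # so the copies keywords move to the front (list.sort is stable).
    if "#SORT" in command_history and "#COPIES" not in command_history:
        groups.sort(key=lambda group: group[0] != "#COPIES")

    return [keyword for _, keywords in groups for keyword in keywords]
```

Ordering rather than discarding the less likely synonyms is one reading of "set preferentially"; dropping them outright would shrink the search space further at the cost of missing a user who switches words.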

【００６３】上述のごとく選ばれた認識候補の中から音
声認識部５で得られた音声認識結果は、機器情報記憶部
７の対応表（図２参照）及びユーザ情報記憶部４のユー
ザ・キーワードを用いて、コマンドに変換され機器制御
部９に送出される。ただし、パラメータの必要なコマン
ドで、その発声が認識できていなければ、パラメータを
要求する装置発話を合成する。図３は、図１の音声認識
部５で得られた音声認識結果からコマンドを実行する処
理を説明するためのフロー図である。まず、音声認識部
５においてキーワードを認識し（ステップＳ１）、機器
情報記憶部７に記憶されている対応表からパラメータが
必要かどうかを判断する（ステップＳ２）。パラメータ
が必要でなければ、そのままコマンドを実行する（ステ
ップＳ６）。パラメータが必要であれば、ステップＳ３
においてパラメータの音声があるかを判断し、無ければ
コマンドを実行し（ステップＳ６）、パラメータの音声
があれば、パラメータの説明を出力してから（ステップ
Ｓ５）コマンドを実行する（ステップＳ６）。The speech recognition result that the speech recognition unit 5 obtains from the candidates selected as described above is converted into a command, using the correspondence table in the device information storage unit 7 (see FIG. 2) and the user keywords in the user information storage unit 4, and the command is sent to the device control unit 9. However, if a command requires a parameter and no utterance of that parameter has been recognized, a device utterance requesting the parameter is synthesized. FIG. 3 is a flowchart illustrating the process of executing a command from the speech recognition result obtained by the speech recognition unit 5 of FIG. 1. First, the speech recognition unit 5 recognizes a keyword (step S1), and the correspondence table stored in the device information storage unit 7 is consulted to determine whether a parameter is required (step S2). If no parameter is required, the command is executed as is (step S6). If a parameter is required, it is determined in step S3 whether speech for the parameter is present; if not, the command is executed (step S6), and if so, a description of the parameter is output (step S5) and then the command is executed (step S6).
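
A minimal sketch of this keyword-to-command step is given below. It follows the rule stated at the start of the paragraph, namely that a command needing a parameter that was not recognized leads to a prompt rather than to execution; this is an interpretive assumption about how the branches of FIG. 3 are meant to work. The function name `handle_keyword`, the table layout, the #DUPLEX entry, and the prompt wording are all hypothetical.

```python
def handle_keyword(keyword, recognized_params, table):
    """Decide what to do with one recognized keyword.

    `table` maps keyword -> (command, parameter name or None).
    Returns ("execute", command, value) or ("ask", prompt, None).
    """
    command, param_name = table[keyword]

    # No parameter required: the command can be executed as is.
    if param_name is None:
        return ("execute", command, None)

    # Parameter required: check whether it was spoken in the same utterance.
    value = recognized_params.get(param_name)
    if value is None:
        # Assumed behavior: ask for the missing parameter instead of executing.
        return ("ask", f"Please say a value for {param_name}.", None)

    return ("execute", command, value)
```

For example, with a table in which "margin" maps to ("#MARGIN", "@WIDTH"), the call `handle_keyword("margin", {}, table)` yields a prompt, while passing `{"@WIDTH": "2cm"}` yields an executable command with its value.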

【００６４】パラメータ要求から装置発話文を生成し、
音声特徴生成部６へ送り、ユーザの発話を促すととも
に、ユーザの発話を限定して次発話の認識候補を絞り込
む。この際、ユーザ情報記憶部４の発話履歴やコマンド
履歴を必要に応じて参照し、ユーザの文脈に沿った簡潔
でわかりやすい表現を選択する。例えば、ユーザが既に
「余白」というキーワードを使っていれば、装置発話文
でも「マージン」ではなく「余白」というキーワードを
使う。また、「何部ですか？」という装置発話文を出力
することで、「〜枚」というユーザ発話を抑制し、次発
話認識候補を絞り込む。From the parameter request, a device utterance is generated and sent to the voice feature generation unit 6. This prompts the user to speak and at the same time constrains what the user says, narrowing the recognition candidates for the next utterance. In doing so, the utterance history and command history in the user information storage unit 4 are consulted as needed, and a concise, easily understood expression that fits the user's context is chosen. For example, if the user has already used the keyword yohaku (余白, "blank space"), the device utterance also uses yohaku rather than mājin (マージン, "margin"). Likewise, by asking "How many copies?" with the counter bu (部), the device discourages user replies phrased with the counter mai (枚, "sheets") and narrows the candidates for the next utterance.

【００６５】ユーザは最初どのようなキーワードでどの
ようなことができるのかを知らない。そこで本実施形態
ではキーワードリストの説明を行うメタコマンド％ＫＷ
ＬＩＳＴが用意されている（図２参照）。％ＫＷＬＩＳ
Ｔが実行されると対話制御部８は全てのキーワードとそ
の説明を装置発話文として生成する。また、キーワード
の表現がユーザにとって違和感のある場合に、それを変
更するメタコマンド％ＫＷＣＨＡＮＧＥが用意されてい
る。％ＫＷＣＨＡＮＧＥが実行されると、新しいキーワ
ードであるパラメータ＠ＮＥＷＫＷが機器ＩＤ，コマン
ドとともにユーザキーワードとしてユーザ情報記憶部４
に記憶される。なお、これらのキーワードの説明や変更
は、ここに示した音声による方法の他に、移動電話端末
に備わる液晶画面とボタンによる視覚的操作によって実
現することも可能である。また、それらを併用した形態
としても実現可能である。At first, the user does not know which keywords are available or what they do. This embodiment therefore provides a meta-command, %KWLIST, that explains the keyword list (see FIG. 2). When %KWLIST is executed, the dialogue control unit 8 generates device utterances containing all the keywords and their descriptions. A second meta-command, %KWCHANGE, changes a keyword whose wording feels unnatural to the user. When %KWCHANGE is executed, the new keyword, passed as the parameter @NEWKW, is stored in the user information storage unit 4 as a user keyword together with the device ID and the command. Besides the voice-based method described here, explaining and changing keywords can also be done visually, using the liquid crystal display and buttons of the mobile telephone terminal, or by a combination of the two.
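
The two meta-commands can be pictured with the following sketch, offered under stated assumptions rather than as the disclosed design. The class `UserKeywordStore`, the function `keyword_list_utterance`, the device IDs, and the replacement word "gutter" are hypothetical; only the idea of storing the new keyword together with the device ID and command comes from the text.

```python
class UserKeywordStore:
    """Per-user keyword overrides, keyed by (device ID, command)."""

    def __init__(self):
        self._overrides = {}

    def change(self, device_id, command, new_keyword):
        # %KWCHANGE: remember the user's own word (@NEWKW) for this command.
        self._overrides[(device_id, command)] = new_keyword

    def keyword_for(self, device_id, command, default_keyword):
        # Fall back to the device's own keyword when the user has set none.
        return self._overrides.get((device_id, command), default_keyword)


def keyword_list_utterance(rows):
    # %KWLIST: one "keyword: description" phrase per table row.
    return " ".join(f"{keyword}: {description}" for keyword, description in rows)
```

Because the override is keyed by device ID as well as command, a word chosen for one device does not silently change the vocabulary of another.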

【００６６】音声特徴生成部６では対話制御部８で生成
された装置発話文から、音韻・韻律情報を規定する音声
特徴を生成する。これは発音記号のようなシンボルレベ
ルのものでも、音素番号，継続時間長，ピッチ周波数な
どの列によるパラメータレベルのものでも利用可能であ
る。なお、この過程に関しては既存のテキスト音声合成
技術で用いるものと何ら変わるものではないのでここで
は詳しい説明は省略する。The voice feature generation unit 6 generates a voice feature that defines phonemic / prosodic information from the device utterance sentence generated by the dialogue control unit 8. This can be used at a symbol level such as a phonetic symbol, or at a parameter level based on a column such as a phoneme number, a duration time, and a pitch frequency. Since this process is not different from that used in the existing text-to-speech synthesis technology, a detailed description is omitted here.

【００６７】音声合成部３では音声特徴から音声波形を
生成する。この過程に関しては既存の規則音声合成技術
で用いるものと何ら変わるものではないのでここでは詳
しい説明は省略する。The voice synthesis unit 3 generates a speech waveform from the voice features. This process is no different from that of existing rule-based speech synthesis technology, so a detailed description is omitted here.

【００６８】機器制御部９では対話制御部８で選択され
た機器制御コマンド，パラメータをもとに、機器１０へ
制御信号を送出する。また、機器１０からのステータス
信号を受信して対話制御部８へ送る。The device control unit 9 sends a control signal to the device 10 based on the device control command and the parameter selected by the dialog control unit 8. Further, it receives a status signal from the device 10 and sends it to the dialogue control unit 8.

【００６９】なお、本実施形態においては、キーワード
の説明や変更のみでなく、全ての処理を移動電話端末に
備わるディスプレイで表示してボタンで操作するような
構成としても、或いは、上記実施形態の音声による処理
とディスプレイ及びボタンによる処理を併用するような
構成としても良い。In this embodiment, not only the explanation and changing of keywords but all processing may be presented on the display of the mobile telephone terminal and operated with its buttons, or the voice-based processing of the above embodiment may be combined with display-and-button processing.

【００７０】図４は、本発明の他の実施形態における機
器制御装置の構成を示すブロック図である。本実施形態
における機器制御装置の構成要素である移動電話端末と
機器制御ユニットは、電波や赤外線など、なんらかの無
線通信によって通信可能であり、この通信は直接あるい
は基地局を経由したものでも利用可能である。また、機
器制御ユニットは、機器２０に内蔵することも考えられ
るが、なんらかのインターフェースによって接続する別
筐体としても実現できる。また、１つの機器に対して１
つの機器制御ユニットで制御するように構成すること
も、複数の機器を１つの機器制御ユニットで制御するよ
うに構成することも可能である。FIG. 4 is a block diagram showing the configuration of a device control apparatus according to another embodiment of the present invention. The mobile telephone terminal and the device control unit, which are the components of the device control apparatus in this embodiment, can communicate by some form of wireless communication such as radio waves or infrared rays, and this communication may be either direct or relayed through a base station. The device control unit may be built into the device 20, or it may be realized as a separate housing connected through some interface. One device control unit may control a single device, or a single device control unit may control a plurality of devices.

【００７１】本実施形態では、操作部１１，音声特徴抽
出部１２，音声合成部１３，ユーザモデル記憶部１４を
移動電話端末に備え、音声認識部１５，音声特徴生成部
１６，機器情報記憶部１７，対話制御部１８，機器制御
部１９，ユーザモデル分析部２１，履歴記憶部２２を機
器制御ユニットに備える。この構成は移動電話端末での
処理をＤＳＰ（ＤｉｇｉｔａｌＳｕｒｒｏｕｎｄＰ
ｒｏｃｅｓｓｅｒ）の使用に向いたものにまとめること
によって、処理効率の向上を狙ったものである。他方、
操作部１１，音声特徴抽出部１２，音声合成部１３，ユ
ーザモデル記憶部１４，音声認識部１５，音声特徴生成
部１６，機器情報記憶部１７，ユーザモデル分析部２
１，履歴記憶部２２，対話制御部１８までを移動電話端
末に含めることによって、機器制御ユニットの処理量を
軽減するとともに、移動電話端末・機器制御ユニット間
の通信データ量を減らすことで、多くの機器への機器制
御ユニット接続を容易にして、本装置の普及促進を図る
ような構成も可能である。このように各部の実装は移動
電話端末、機器制御ユニットのどちらかに固定されるも
のではない。In this embodiment, the mobile telephone terminal includes the operation unit 11, the voice feature extraction unit 12, the voice synthesis unit 13, and the user model storage unit 14, while the device control unit includes the speech recognition unit 15, the voice feature generation unit 16, the device information storage unit 17, the dialogue control unit 18, the device control unit 19, the user model analysis unit 21, and the history storage unit 22. This arrangement aims to improve processing efficiency by gathering in the mobile telephone terminal the processing that is well suited to a DSP (spelled out in the original as "Digital Surround Processer"). Alternatively, the operation unit 11, the voice feature extraction unit 12, the voice synthesis unit 13, the user model storage unit 14, the speech recognition unit 15, the voice feature generation unit 16, the device information storage unit 17, the user model analysis unit 21, the history storage unit 22, and the dialogue control unit 18 may all be placed in the mobile telephone terminal. That reduces the processing load on the device control unit and the amount of data exchanged between the mobile telephone terminal and the device control unit, which makes it easier to attach device control units to many devices and so promotes adoption of the apparatus. The placement of each unit is thus not fixed to either the mobile telephone terminal or the device control unit.

【００７２】本発明は、不特定多数のユーザがある程度
複雑な操作を行う機器で、特にその効果を発揮するが、
その例としてコンビニエンス・ストアにおける複写機
や、カラオケ・ボックスにおけるカラオケ機の操作が挙
げられる。以下では複写機を制御対象機器２０とした場
合について、本実施形態における各部の動作を示す。The present invention is particularly effective for devices on which an unspecified large number of users perform moderately complex operations; examples include operating a copying machine in a convenience store or a karaoke machine in a karaoke box. The following describes the operation of each unit in this embodiment for the case where the device 20 to be controlled is a copying machine.

【００７３】操作部１１では、機器の情報を機器情報記
憶部１７に入力、或いは登録の補助をする。機器情報記
憶部１７では機器制御ユニットが制御する機器２０固有
の情報を記憶する。具体的には機器ＩＤ，キーワード・
コマンド・パラメータ・説明の対応表などであるが、こ
の他にその機器２０が属する機器分類も記憶する。上記
の対応表の例を図２に示す。図２においては、キーワー
ドとして、キーワードリスト，キーワード変更，余白・
マージン・綴じ代，両面，拡大・ズーム，ｗｏｒｄ，ｎ
センチ・ｎミリ，偶数・奇数，ｎパーセントを挙げてい
る。例えば、キーワード「余白・マージン・綴じ代」に
対するコマンドは＃ＭＡＲＧＩＮであり、そのキーワー
ドに対しては「幅」が必要となってくるので、コマンド
＠ＷＩＤＴＨをパラメータとして対応させている。この
キーワードの説明としては、「縮小して綴じ代を調整し
ます。」と表示或いは発声するように対応付けられてい
る。機器分類は、複写機とファクシミリ、テープレコー
ダとＣＤプレーヤ、ジュークボックスとカラオケ機器な
ど、目的，機能，操作性などの類似性の高いものを予め
グルーピングしたものである。The operation unit 11 is used to enter device information into the device information storage unit 17, or to assist with its registration. The device information storage unit 17 stores information unique to the device 20 controlled by the device control unit: specifically a device ID and a correspondence table of keywords, commands, parameters, and descriptions, and in addition the device class to which the device 20 belongs. FIG. 2 shows an example of the correspondence table. In FIG. 2, the keywords listed are: keyword list; keyword change; blank space / margin / binding margin; duplex; enlarge / zoom; word; n centimeters / n millimeters; even / odd; and n percent. For example, the command for the keyword group "blank space / margin / binding margin" is #MARGIN, and because this keyword requires a width, @WIDTH is associated with it as the parameter. The description associated with this keyword, to be displayed or spoken, is "Reduces the image to adjust the binding margin." Device classes are groups formed in advance from devices that are highly similar in purpose, function, operability, and the like, such as copying machines and facsimile machines, tape recorders and CD players, or jukeboxes and karaoke machines.

【００７４】音声特徴抽出部１２は、移動電話端末のマ
イクから入力されデジタル変換された音声波形から、音
声認識に必要な音声特徴を抽出する。音声特徴及びその
抽出方法に関しては既存の音声認識技術で用いるものと
何ら変わるものではないのでここでは詳しい説明は省略
する。The voice feature extraction unit 12 extracts voice features necessary for voice recognition from a voice waveform input from a microphone of the mobile telephone terminal and converted into a digital signal. The speech features and the method of extracting them are not different from those used in the existing speech recognition technology, so that detailed description is omitted here.

【００７５】音声認識部１５は、音声特徴抽出部１２で
抽出された音声特徴を、ユーザ情報記憶部（ユーザモデ
ル記憶部）１４で保持する音声特徴辞書と比較し、対話
制御部１８で設定された認識候補の中からユーザが発声
したキーワードを認識する。この際、複数のキーワード
を同時に認識可能とする。例えば、「Ａ４に縮小して両
面で１０部コピーしなさい」というユーザ発話から、
「Ａ４」，「縮小」，「両面」，「１０部」などのキー
ワードが得られる。認識過程に関しては既存の音声認識
技術で用いるものと何ら変わるものではないのでここで
は詳しい説明は省略する。The speech recognition unit 15 compares the voice features extracted by the voice feature extraction unit 12 with the voice feature dictionary held in the user information storage unit (user model storage unit) 14, and recognizes the keywords spoken by the user from among the recognition candidates set by the dialogue control unit 18. Multiple keywords can be recognized at once. For example, from the user utterance "Reduce to A4 and make 10 double-sided copies," keywords such as "A4," "reduce," "double-sided," and "10 copies" are obtained. The recognition process is no different from that of existing speech recognition technology, so a detailed description is omitted here.

【００７６】音声特徴生成部１６では対話制御部１８で
生成された装置発話文から、音韻・韻律情報を規定する
音声特徴を生成する。これは発音記号のようなシンボル
レベルのものでも、音素番号，継続時間長，ピッチ周波
数などの列によるパラメータレベルのものでも利用可能
である。なお、この過程に関しては既存のテキスト音声
合成技術で用いるものと何ら変わるものではないのでこ
こでは詳しい説明は省略する。The voice feature generation unit 16 generates voice features that define phoneme / prosodic information from the device utterance sentence generated by the dialogue control unit 18. This can be used at a symbol level such as a phonetic symbol, or at a parameter level based on a column such as a phoneme number, a duration time, and a pitch frequency. Since this process is not different from that used in the existing text-to-speech synthesis technology, a detailed description is omitted here.

【００７７】音声合成部１３では音声特徴から音声波形
を生成する。この過程に関しては既存の規則音声合成技
術で用いるものと何ら変わるものではないのでここでは
詳しい説明は省略する。The voice synthesis unit 13 generates a speech waveform from the voice features. This process is no different from that of existing rule-based speech synthesis technology, so a detailed description is omitted here.

【００７８】機器制御部１９では対話制御部１８で選択
された機器制御コマンド，パラメータをもとに、機器２
０へ制御信号を送出する。また、機器２０からのステー
タス信号を受信して対話制御部１８へ送る。The device control unit 19 sends control signals to the device 20 based on the device control commands and parameters selected by the dialogue control unit 18. It also receives status signals from the device 20 and passes them to the dialogue control unit 18.

【００７９】ユーザモデル記憶部１４にはユーザ毎の固
有情報を記憶する。本実施形態では音声特徴辞書の他
に、語彙リストとワークフローなどを記憶する。目的や
機能の似ている機器を操作する場合、ユーザはその操作
法を同一のメタファによって理解していると考えられ
る。この同一のメタファによって理解されている機器群
に対するユーザの振る舞いをまとめたデータがここでい
うユーザモデルであり、本実施形態では語彙リストとワ
ークフローからなる。The user model storage unit 14 stores unique information for each user. In the present embodiment, a vocabulary list and a workflow are stored in addition to the voice feature dictionary. When operating a device having a similar purpose or function, it is considered that the user understands the operation method using the same metaphor. The data summarizing the behavior of the user with respect to the device group understood by the same metaphor is the user model referred to here, and in the present embodiment, includes a vocabulary list and a workflow.

【００８０】図５は、図４の対話制御部１８においてユ
ーザの操作と発話を誘導・予測する処理を説明するため
のフロー図である。対話制御部１８はワークフローと語
彙リストに従ってユーザの操作と発話を誘導・予測す
る。誘導は、好ましくは音韻・韻律を通常と変化させ強
調したような（例えば高い声，大きな声）、ガイダンス
メッセージによって行い、予測結果は音声認識候補の設
定となる。対話制御部１８はワークフロー中の操作場面
を順番に処理する。まず、機器分類によるユーザモデル
が選択される（ステップＳ１）。対象機器２０がユーザ
にとって馴染みのないものであれば（ステップＳ１
４）、当該操作場面に対応するガイダンスメッセージを
生成する（ステップＳ１５）。その際、当該操作場面に
おいて使用される語を語彙リストより選択し（ステップ
Ｓ１２）、さらに、これらの語を優先的に音声認識候補
として設定する（ステップＳ１３）。ステップＳ１４に
おいて、対象機器２０が、ユーザにとってなれた機器で
あれば、ガイダンスメッセージを生成しない。続いて、
ステップＳ１６において、最後の操作場面かどうかを判
断し、最後の操作場面でなければ、次の場面（ステップ
Ｓ１７）の処理を行う（ステップＳ１２へ戻る）。ステ
ップＳ１６において、最後の操作場面であれば、処理を
終了する。FIG. 5 is a flowchart illustrating how the dialogue control unit 18 of FIG. 4 guides and predicts the user's operations and utterances. The dialogue control unit 18 guides and predicts the user's operations and utterances according to the workflow and the vocabulary list. Guidance is given through guidance messages, preferably with phonemes and prosody altered from normal so that they sound emphasized (for example, a higher or louder voice), and the result of prediction is reflected in the setting of speech recognition candidates. The dialogue control unit 18 processes the operation scenes in the workflow in order. First, the user model for the device class is selected (step S1). If the target device 20 is unfamiliar to the user (step S14), a guidance message corresponding to the current operation scene is generated (step S15). For this, the words used in that scene are selected from the vocabulary list (step S12), and these words are preferentially set as speech recognition candidates (step S13). If the target device 20 is one the user is accustomed to (step S14), no guidance message is generated. Next, step S16 determines whether this is the last operation scene; if not, the next scene (step S17) is processed (returning to step S12). If it is the last operation scene, the process ends.
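
The scene loop of FIG. 5 can be sketched as below. This is a simplified illustration under assumptions: the function name `guide_workflow`, the boolean `familiar` flag standing in for the familiarity check, and the wording of the guidance message are all hypothetical, and selection of the user model by device class is taken to have happened before the call.

```python
def guide_workflow(scenes, vocabulary, familiar):
    """Walk the operation scenes of a workflow in order.

    `scenes` is the ordered list of scene names and `vocabulary` maps a scene
    to the words the user tends to use in it. Returns one tuple per scene:
    (scene, recognition candidates, guidance message or None).
    """
    steps = []
    for scene in scenes:                      # loop until the last scene (S16, S17)
        words = vocabulary.get(scene, [])     # pick this scene's words (S12)
        candidates = list(words)              # set them as candidates (S13)
        message = None
        if not familiar:                      # unfamiliar device? (S14)
            # Hypothetical wording for the guidance message (S15).
            message = f"{scene}: you can say " + ", ".join(words)
        steps.append((scene, candidates, message))
    return steps
```

The recognition candidates are set for every scene regardless of familiarity; only the spoken guidance is skipped for a device the user already knows.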

【００８１】図６は、図４の機器制御装置の使用例を説
明するための図で、複写機操作に慣れたユーザが初めて
ファクシミリの操作を行う例を説明するための図であ
る。対象機器２０の機器分類はハードコピー系であるこ
とは機器情報より分かる。そこでユーザモデル記憶部１
４を検索してハードコピー系のワークフローを得る。な
お、これは同じハードコピー系の複写機の操作によっ
て、過去に分析・記録されたものである。このワークフ
ローの操作場面系列は、条件設定，原稿セット，テス
ト，実行である。それぞれの場面で使用するコマンド，
語，優先順位の対応を語彙リストから得て、ガイダンス
メッセージの生成と音声認識候補の設定を行う。優先順
位はパラメータやコマンドの過去に指定された度合いで
あるので、これを考慮して設定を行う。また、ここで、
機器情報に特記事項として原稿セットの向きが下向きで
ある旨が記されている。ユーザモデルの情報とは異なる
のでガイダンスメッセージでそのことを強調する。FIG. 6 illustrates an example of using the device control apparatus of FIG. 4, in which a user accustomed to operating a copying machine operates a facsimile machine for the first time. The device information shows that the target device 20 belongs to the hard-copy device class. The user model storage unit 14 is therefore searched to obtain the workflow for hard-copy devices. This workflow was analyzed and recorded in the past from the user's operation of a copying machine, which belongs to the same hard-copy class. The sequence of operation scenes in this workflow is: condition setting, original placement, test, and execution. For each scene, the correspondence between commands, words, and priorities is obtained from the vocabulary list, and guidance messages are generated and speech recognition candidates are set. The priority reflects how often a parameter or command has been specified in the past, so it is taken into account when setting the candidates. Here, the device information also carries a special note that the original must be placed face down. Because this differs from the information in the user model, the guidance message emphasizes it.

【００８２】このようにして複写機と同じメタファによ
る操作で、初めて操作するファクシミリを操作すること
ができる。場面終了時に指定されていないパラメータに
関しては既定値を算出して用いる。既定値は過去に指定
された値の履歴から計算する。In this way, a facsimile machine being used for the first time can be operated through the same metaphor as the copying machine. For parameters that have not been specified by the end of a scene, default values are calculated and used. Default values are calculated from the history of values specified in the past.
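
The text does not say how a default value is computed from the history. One plausible sketch, assuming that "stable" means one value dominates the user's past choices, is to take the most frequent value and accept it only if its share reaches a threshold. The function name `default_from_history` and the 0.8 threshold are assumptions, not values from the source.

```python
from collections import Counter


def default_from_history(values, min_share=0.8):
    """Return the dominant past value, or None if the history is not stable.

    `values` is the list of values the user specified in earlier sessions.
    `min_share` is an assumed stability threshold.
    """
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count / len(values) >= min_share else None
```

With a history of four "A4" choices and one "B5", the sketch returns "A4"; with an even split it returns None, leaving the item to be asked explicitly.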

【００８３】履歴記憶部２２では一連の操作及び対話の
履歴を一時的にすべて保存する。ユーザモデル分析部２
１では履歴記憶部２２のデータをもとにユーザモデルの
更新を行う。本実施形態ではユーザモデルはワークフロ
ーと語彙リストから構成されるが、本発明はこの構成に
限定されるものではない。これら２つは独立に使用可能
であり、どちらか一方のみを含む構成も可能である。ま
ず、ワークフローについて説明する。ユーザモデル分析
部２１では履歴記憶部２２に記憶された詳細な操作履歴
シーケンスを、操作場面毎に分割することでワークフロ
ーとして出力する。例えば連続したキー入力を条件設定
場面としてまとめるなどである。ここで複写機での典型
的ワークフローを考える。原稿をセットして、濃度，倍
率，枚数などの条件を設定して、スタートボタンを押す
というシーケンスが考えられる。しかし一方、条件設定
をしてから原稿をセットして、スタートボタンを押すと
いうシーケンスも考えられる。また、注意深いユーザで
あれば、まず１枚テストコピーを取ってから、枚数をセ
ットして本コピーを取るというシーケンスも考えられ
る。このように一般的には複数存在するワークフローの
中から、ユーザが選択したワークフローが特定されるこ
とになる。なお、既にワークフローデータが存在する機
器２０で、それと異なるフローで操作が行われた場合、
対話制御部１８での予測が外れる結果となるが、ここで
は実際に取られた操作履歴によってワークフローが更新
される。The history storage unit 22 temporarily stores the history of a series of operations and dialogues. The user model analysis unit 21 updates the user model based on the data in the history storage unit 22. In the present embodiment, the user model consists of a workflow and a vocabulary list, but the present invention is not limited to this configuration. The two can be used independently, and a configuration including only one of them is also possible. First, the workflow will be described. The user model analysis unit 21 divides the detailed operation history sequence stored in the history storage unit 22 into operation scenes and outputs the result as a workflow. For example, consecutive key inputs are grouped together as a condition-setting scene. Consider a typical workflow on a copying machine. One conceivable sequence is to set the document, set conditions such as density, magnification, and number of copies, and then press the start button. Another is to set the conditions first, then set the document and press the start button. A careful user might first make a single test copy, then set the number of copies and make the actual copies. In this way, the workflow chosen by the user is identified from among what are generally several possible workflows. If an operation is performed in a different flow on a device 20 for which workflow data already exists, the prediction by the dialogue control unit 18 will miss; in that case, the workflow is updated based on the operation history actually performed.
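
As an illustrative aid only (the specification gives no code), the division of an operation history into operation scenes can be sketched as follows. The sketch is in Python; the event names, the SCENE_OF mapping, and the functions segment_into_workflow and update_workflow are hypothetical and are not taken from the embodiment. It merges consecutive key inputs into one condition-setting scene and replaces a stored workflow with the one actually performed.

```python
from itertools import groupby

# Hypothetical mapping from raw operation kinds to operation scenes.
# The specification does not define a data format; this is an assumption.
SCENE_OF = {
    "set_document": "set_document",
    "key_input": "condition_setting",
    "start_button": "start",
}


def segment_into_workflow(history):
    """Collapse a detailed operation history into a list of operation scenes.

    `history` is a list of (kind, detail) pairs. Consecutive events that map
    to the same scene, such as a run of key inputs, become a single scene.
    """
    scenes = (SCENE_OF.get(kind, kind) for kind, _detail in history)
    return [scene for scene, _run in groupby(scenes)]


def update_workflow(stored, history):
    """Return the workflow actually performed and whether the prediction missed.

    `stored` is the previously saved workflow, or None if there is none yet.
    """
    observed = segment_into_workflow(history)
    prediction_missed = stored is not None and observed != stored
    return observed, prediction_missed
```

With this sketch, a history of setting the document, three key inputs, and the start button yields the three-scene workflow set_document, condition_setting, start. A later session that sets the conditions before the document is reported as a missed prediction and becomes the stored workflow.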

【００８４】次に語彙リストについて以下に説明する。
ひとつのコマンドやパラメータを表現するために使われ
る語は一般に複数あるが、ユーザによって使用される語
はほぼ限定される。そこで履歴記憶部２２の対話履歴デ
ータによってユーザが使用した語をチェックし、語彙リ
ストをユーザにあわせて更新する。なお、パラメータな
どについては全てが指定されるわけではない。当該操作
において指定された項目については優先順位を上げ、指
定されなかった項目については優先順位を下げる。ま
た、指定されたパラメータの値は既定値算出のためのデ
ータとして記録する。Next, the vocabulary list will be described. Although several words can generally be used to express a single command or parameter, the words any one user actually uses are largely fixed. Therefore, the words used by the user are checked against the dialogue history data in the history storage unit 22, and the vocabulary list is updated to suit the user. Note that not all parameters are specified in every operation. Items specified in the operation have their priority raised, and items not specified have their priority lowered. The values of the specified parameters are also recorded as data for calculating default values.
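
The vocabulary list update can likewise be sketched in Python, purely as an illustration. The class VocabularyList, its method names, the priority step of one, and the choice of the most frequently recorded value as the default are all assumptions; the specification states only that priorities are raised or lowered and that values are recorded for default calculation.

```python
from collections import Counter, defaultdict


class VocabularyList:
    """Hypothetical per-user vocabulary list with item priorities."""

    def __init__(self, items):
        # concept (command or parameter) -> count of each word the user said
        self.word_counts = defaultdict(Counter)
        # setting item -> priority (assumed to start at 0)
        self.priority = {item: 0 for item in items}
        # setting item -> values the user specified, kept for default calculation
        self.values = defaultdict(list)

    def record_words(self, dialogue_history):
        """Count the words used, given (concept, word) pairs from the dialogue history."""
        for concept, word in dialogue_history:
            self.word_counts[concept][word] += 1

    def record_operation(self, specified):
        """Raise priority of items specified in this operation, lower the rest."""
        for item in self.priority:
            if item in specified:
                self.priority[item] += 1
                self.values[item].append(specified[item])
            else:
                self.priority[item] -= 1

    def preferred_word(self, concept):
        """Word this user says most often for a concept, or None if unseen."""
        counts = self.word_counts.get(concept)
        return counts.most_common(1)[0][0] if counts else None

    def default_value(self, item):
        """Assumed default: the most frequently recorded value, or None."""
        recorded = self.values.get(item)
        return Counter(recorded).most_common(1)[0][0] if recorded else None
```

In this sketch, a user who always specifies the number of copies but never the magnification ends up with a high priority for the former and a low priority for the latter, which a dialogue could use to decide which items to ask about first.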

【００８５】なお、本実施形態においては、キーワード
の説明や変更のみでなく、全ての処理を移動電話端末に
備わるディスプレイで表示してボタンで操作するような
構成としても、或いは、上記実施形態の音声による処理
とディスプレイ及びボタンによる処理を併用するような
構成としても良い。In the present embodiment, not only the explanation and changing of keywords but all processing may be shown on the display of the mobile telephone terminal and operated with its buttons. Alternatively, the voice-based processing of the above embodiments may be combined with processing by the display and buttons.

【００８６】本発明に関して幾つかの実施形態を機器制
御装置として説明してきたが、本発明はこれに限られる
訳ではなく、これら機器制御装置として機能させるよう
なプログラムを移動電話端末に組み込んだ形態でも実装
可能である。Although several embodiments of the present invention have been described as device control devices, the present invention is not limited to these and can also be implemented in a form in which a program that causes a mobile telephone terminal to function as such a device control device is installed in the terminal.

【００８７】[0087]

【発明の効果】請求項１，１６，３１，３２の発明によ
れば、移動電話端末を通した対話によって機器制御コマ
ンドを送出することによって、機器の多様な機能をユー
ザの負荷を増やすことなく利用可能となる。According to the inventions of claims 1, 16, 31, and 32, device control commands are sent through dialogue via the mobile telephone terminal, so the various functions of the device become available without increasing the load on the user.

【００８８】請求項２，１７，３１，３２の発明によれ
ば、移動電話端末を通した音声対話によって機器制御コ
マンドを送出することによって、機器の多様な機能をユ
ーザの負荷を増やすことなく利用可能となるとともに、
不特定多数のユーザに対しても個人に特化した音声認識
を行うことによって音声認識率が向上する。According to the inventions of claims 2, 17, 31, and 32, device control commands are sent through voice dialogue via the mobile telephone terminal, so the various functions of the device become available without increasing the load on the user, and the voice recognition rate improves because voice recognition specialized for each individual is performed even for an unspecified large number of users.

【００８９】請求項３，１８，３１，３２の発明によれ
ば、複数のキーワードを同時に認識することによって、
煩雑になりがちがなパラメータ設定を含む動作指示を容
易に利用可能となる。According to the inventions of claims 3, 18, 31, and 32, recognizing a plurality of keywords simultaneously makes operation instructions that include parameter settings, which tend to be cumbersome, easy to use.

【００９０】請求項４，１９，３１，３２の発明によれ
ば、限られた表示面積による文字情報だけでは理解しが
たい複雑な機能も、音声による説明によってユーザの理
解が容易となる。According to the inventions of claims 4, 19, 31, and 32, even a complicated function that cannot be easily understood only by character information with a limited display area can be easily understood by the user by explanation by voice.

【００９１】請求項５，２０，３１，３２の発明によれ
ば、コマンドとキーワードの対応をユーザが変更するこ
とによって、ユーザ毎に無理なく記憶できる操作系が構
築され、記憶負荷が軽減される。According to the inventions of claims 5, 20, 31, and 32, the user can change the correspondence between commands and keywords, so an operation scheme that each user can memorize without difficulty is built and the memorization burden is reduced.

【００９２】請求項６，２１，３１，３２の発明によれ
ば、ユーザ毎の発話履歴やコマンド履歴を移動電話端末
に保持することによって、ユーザの持つ文脈を推定して
ユーザ発話の多義性を解消し、適切な機器動作コマンド
へ変換できる。According to the inventions of claims 6, 21, 31, and 32, the utterance history and command history of each user are held in the mobile telephone terminal, so the user's context can be estimated, the ambiguity of user utterances resolved, and the utterances converted into appropriate device operation commands.

【００９３】請求項７，８，２２，２３，３１，３２の
発明によれば、操作対象機器に類似した、使用実績のあ
る機器のユーザモデルに基づいて対話処理を行うことに
よって、ユーザにとってわかりやすい操作を実現でき
る。According to the inventions of claims 7, 8, 22, 23, 31, and 32, dialogue processing is performed based on the user model of a previously used device similar to the device to be operated, so operation that is easy for the user to understand can be realized.

【００９４】請求項９，２４，３１，３２の発明によれ
ば、操作対象機器に類似した機器について過去に使われ
た語彙を使用することで、既知の言語知識のみで理解可
能なメッセージを生成できる。According to the inventions of claims 9, 24, 31, and 32, vocabulary used in the past for a device similar to the device to be operated is used, so messages that can be understood with already known linguistic knowledge alone can be generated.

【００９５】請求項１０，２５，３１，３２の発明によ
れば、操作対象機器に類似した機器について過去に使わ
れた語彙を使用することで、ユーザの発話しやすい語彙
に認識対象を絞り込むことが可能になり、音声認識率の
向上につながる。According to the inventions of claims 10, 25, 31, and 32, vocabulary used in the past for a device similar to the device to be operated is used, so the recognition targets can be narrowed to vocabulary the user is likely to speak, which leads to an improvement in the voice recognition rate.

【００９６】請求項１１，２６，３１，３２の発明によ
れば、操作対象機器に類似した機器について組み立てら
れたワークフローに沿ってガイドメッセージを出力する
ことでスムーズな操作が期待出来る。According to the invention of claims 11, 26, 31, and 32, a smooth operation can be expected by outputting a guide message according to a workflow assembled for devices similar to the device to be operated.

【００９７】請求項１２，２７，３１，３２の発明によ
れば、操作対象機器に類似した機器について組み立てら
れたワークフローに沿って発話の順序を予測すること
で、認識対象を絞り込むことが可能になり、音声認識率
の向上につながる。According to the invention of claims 12, 27, 31, and 32, it is possible to narrow down recognition targets by predicting the order of speech in accordance with a workflow assembled for devices similar to the operation target device. This leads to an improvement in the speech recognition rate.

【００９８】請求項１３，２８，３１，３２の発明によ
れば、標準的ではなく、操作対象機器に特有の機能に対
して詳細で強調されたメッセージを出力することで間違
いの少ない操作が期待出来る。According to the inventions of claims 13, 28, 31, and 32, a detailed, emphasized message is output for functions that are not standard but specific to the device to be operated, so operation with fewer mistakes can be expected.

【００９９】請求項１４，２９，３１，３２の発明によ
れば、設定値がいつも安定している設定項目について
は、その設定値を既定値としてワークフローを作成する
ことで、無駄の少ない効率的な操作が期待出来る。According to the inventions of claims 14, 29, 31, and 32, for a setting item whose setting value is always stable, the workflow is created with that setting value as the default value, so efficient operation with little waste can be expected.

【０１００】請求項１５，３０，３１，３２の発明によ
れば、既定値からの変更が少ない設定項目に対しては、
設定処理の優先順位を下げてワークフローを作成するこ
とで、ユーザの重要視している設定項目に対してより時
間をかけて操作が行える。According to the inventions of claims 15, 30, 31, and 32, the workflow is created with a lowered setting-process priority for setting items that are rarely changed from their default values, so the user can spend more time on the setting items the user considers important.

[Brief description of the drawings]

【図１】本発明の一実施形態における機器制御装置の
構成を示すブロック図である。FIG. 1 is a block diagram illustrating a configuration of a device control device according to an embodiment of the present invention.

【図２】本発明の一実施形態における機器固有の情報
の一例を示す図である。FIG. 2 is a diagram illustrating an example of device-specific information according to an embodiment of the present invention.

【図３】図１の音声認識部で得られた音声認識結果か
らコマンドを実行する処理を説明するためのフロー図で
ある。FIG. 3 is a flowchart illustrating a process of executing a command from a speech recognition result obtained by a speech recognition unit in FIG. 1;

【図４】本発明の他の実施形態における機器制御装置
の構成を示すブロック図である。FIG. 4 is a block diagram illustrating a configuration of a device control device according to another embodiment of the present invention.

【図５】図４の対話制御部においてユーザの操作と発
話を誘導・予測する処理を説明するためのフロー図であ
る。FIG. 5 is a flowchart illustrating a process of guiding / predicting a user operation and an utterance in the dialogue control unit of FIG. 4;

【図６】図４の機器制御装置の使用例を説明するため
の図で、複写機操作に慣れたユーザが初めてファクシミ
リの操作を行う例を説明するための図である。FIG. 6 is a diagram for explaining an example of use of the device control device of FIG. 4, in which a user accustomed to operating a copying machine operates a facsimile machine for the first time.

[Explanation of symbols]

１，１１…操作部、２，１２…音声特徴抽出部、３，１
３…音声合成部、４…ユーザ情報記憶部、５，１５…音
声認識部、６，１６…音声特徴生成部、７，１７…機器
情報記憶部、８，１８…対話制御部、９，１９…機器制
御部、１０，２０…機器、１４…ユーザモデル記憶部、
２１…ユーザモデル分析部、２２…履歴記憶部。1, 11: operation unit; 2, 12: voice feature extraction unit; 3, 13: voice synthesis unit; 4: user information storage unit; 5, 15: voice recognition unit; 6, 16: voice feature generation unit; 7, 17: device information storage unit; 8, 18: dialogue control unit; 9, 19: device control unit; 10, 20: device; 14: user model storage unit; 21: user model analysis unit; 22: history storage unit.

Continued from the front page. (51) Int.Cl.⁷ / identification symbol / FI / theme code (reference): G10L 3/00 571V. F-terms (reference): 5D015 AA04 BB01 KK01 KK04 LL02 LL06 LL10; 5K027 AA11 BB02 FF22 FF28 HH26 MM16; 5K048 AA04 AA14 BA01 DB04 EB02 FB10 FB12 FB15; 9A001 DD11 HH16 HH17 HH18 HH34 JJ12

Claims

[Claims]

1. A device control device using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that serves as the user's interface, in which the mobile telephone terminal stores the relationship between commands indicating operation instructions to the device and parameters required for the operation settings of the device, the command selected by the user is notified from the mobile telephone terminal to the device control unit, and the device control unit controls the device, the device control device comprising: input recognition means for recognizing an input including a keyword; output synthesis means for sending an output message to the user; dialogue control means for limiting the recognition candidates of the input recognition means based on status information of the device, generating a device control command from the recognition result of the input recognition means, and generating an output message in response to an information request from the device; and device control means for controlling the device with the device control command generated by the dialogue control means and sending the status information of the device to the dialogue control means.

2. A device control device using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that serves as the user's interface, in which the mobile telephone terminal stores the relationship between commands indicating operation instructions to the device and parameters required for the operation settings of the device, the command selected by the user is notified from the mobile telephone terminal to the device control unit, and the device control unit controls the device, the device control device comprising: voice recognition means for recognizing a user utterance including a keyword; voice synthesis means for sending a voice output message to the user; dialogue control means for limiting the recognition candidates of the voice recognition means based on status information of the device, generating a device control command from the recognition result of the voice recognition means, and generating an output message in response to an information request from the device; and device control means for controlling the device with the device control command generated by the dialogue control means and sending the status information of the device to the dialogue control means.

3. The device control device using a mobile telephone terminal according to claim 2, wherein the voice recognition means recognizes a plurality of keywords simultaneously, and recognizing the plurality of keywords enables an operation instruction combining a plurality of the commands and simultaneous setting of a plurality of the parameters.

4. The device control device using a mobile telephone terminal according to claim 2, wherein the mobile telephone terminal holds, in addition to the keyword, explanation information on the operation of the device corresponding to the keyword, and provides the held explanation information by voice or visual display at the user's request.

5. The device control device using a mobile telephone terminal according to claim 2, wherein the correspondence between the keyword and the command can be changed by operating the mobile telephone terminal.

6. The device control device using a mobile telephone terminal according to claim 2, wherein the dialogue control means holds a history of user utterances, a history of device utterances, and a command history up to the present time, and limits the recognition candidates of the voice recognition means by using the held histories in addition to the status information of the device.

7. A device control device using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that handles the interface with the user, the device control device comprising: history storage means for storing an operation history and a dialogue history; user model analysis means for analyzing the data stored in the history storage means and outputting at least one of a vocabulary list and a workflow; device classification means for classifying a used device into a device classification having a typical function group according to purpose, work content, and the like; user model storage means for storing the analysis result of the user model analysis means in association with the classification result of the device classification means; and dialogue processing means for performing dialogue processing using the data stored in the user model storage means.

8. A device control device using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that handles the interface with the user through voice dialogue, the device control device comprising: history storage means for storing an operation history and a dialogue history; user model analysis means for analyzing the data stored in the history storage means and outputting at least one of a vocabulary list and a workflow; device classification means for classifying a used device into a device classification having a typical function group according to purpose, work content, and the like; user model storage means for storing the analysis result of the user model analysis means in association with the classification result of the device classification means; and dialogue processing means for performing dialogue processing using the data stored in the user model storage means.

9. The device control device using a mobile telephone terminal according to claim 8, wherein a voice output message is generated using a vocabulary list associated with the device classification to which the device to be operated belongs.

10. The device control apparatus using a mobile telephone terminal according to claim 8, wherein a speech recognition candidate is generated using a vocabulary list associated with the device classification to which the device to be operated belongs.

11. The device control device using a mobile telephone terminal according to claim 8, wherein a voice output message is created using a workflow associated with the device classification to which the device to be operated belongs.

12. The device control apparatus using a mobile telephone terminal according to claim 8, wherein a speech recognition candidate is generated using a workflow associated with the device classification to which the device to be operated belongs.

13. The device control device using a mobile telephone terminal according to claim 8, wherein, for a function not included in the function group associated with the device classification to which the device to be operated belongs, a detailed voice output message is generated and emphasized synthesized voice is generated.

14. The device control device using a mobile telephone terminal according to claim 8, wherein, for a setting item that the user sets when using a device, when the setting values set by the target user vary little and are stable even across different devices, the workflow is created with that setting value as the default value of the setting item.

15. The device control device using a mobile telephone terminal according to claim 8, wherein the workflow is created with a lowered setting-process priority for setting items that are rarely changed from their default values.

16. A device control method using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that serves as the user's interface, in which the mobile telephone terminal stores the relationship between commands indicating operation instructions to the device and parameters required for the operation settings of the device, the command selected by the user is notified from the mobile telephone terminal to the device control unit, and the device control unit controls the device, the device control method comprising: an input recognition step of recognizing an input including a keyword; an output synthesis step of sending an output message to the user; a dialogue control step of limiting the recognition candidates of the input recognition step based on status information of the device, generating a device control command from the recognition result of the input recognition step, and generating an output message in response to an information request from the device; and a device control step of controlling the device with the device control command generated in the dialogue control step and sending the status information of the device to the dialogue control step.

17. A device control method using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that serves as the user's interface, in which the mobile telephone terminal stores the relationship between commands indicating operation instructions to the device and parameters required for the operation settings of the device, the command selected by the user is notified from the mobile telephone terminal to the device control unit, and the device control unit controls the device, the device control method comprising: a voice recognition step of recognizing a user utterance including a keyword; a voice synthesis step of sending a voice output message to the user; a dialogue control step of limiting the recognition candidates of the voice recognition step based on status information of the device, generating a device control command from the recognition result of the voice recognition step, and generating an output message in response to an information request from the device; and a device control step of controlling the device with the device control command generated in the dialogue control step and sending the status information of the device to the dialogue control step.

18. The device control method using a mobile telephone terminal according to claim 17, wherein a plurality of keywords are recognized simultaneously in the voice recognition step, and recognizing the plurality of keywords enables an operation instruction combining a plurality of the commands and simultaneous setting of a plurality of the parameters.

19. The device control method using a mobile telephone terminal according to claim 17, wherein the mobile telephone terminal holds, in addition to the keyword, explanation information on the operation of the device corresponding to the keyword, and provides the held explanation information by voice or visual display at the user's request.

20. The apparatus control method using a mobile telephone terminal according to claim 17, wherein the correspondence between the keyword and the command can be changed by operating the mobile telephone terminal.

21. The device control method using a mobile telephone terminal according to claim 17, wherein the dialogue control step holds a history of user utterances, a history of device utterances, and a command history up to the present time, and limits the recognition candidates in the voice recognition step by using the held histories in addition to the status information of the device.

22. A device control method using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that handles the interface with the user, the device control method comprising: a history storage step of storing an operation history and a dialogue history; a user model analysis step of analyzing the data stored in the history storage step and outputting at least one of a vocabulary list and a workflow; a device classification step of classifying a used device into a device classification having a typical function group according to purpose, work content, and the like; a user model storage step of storing the analysis result of the user model analysis step in association with the classification result of the device classification step; and a dialogue processing step of performing dialogue processing using the data stored in the user model storage step.

23. A device control method using a mobile telephone terminal, having a device control unit that controls one or a plurality of devices and a mobile telephone terminal for each user that handles the interface with the user through voice dialogue, the device control method comprising: a history storage step of storing an operation history and a dialogue history; a user model analysis step of analyzing the data stored in the history storage step and outputting at least one of a vocabulary list and a workflow; a device classification step of classifying a used device into a device classification having a typical function group according to purpose, work content, and the like; a user model storage step of storing the analysis result of the user model analysis step in association with the classification result of the device classification step; and a dialogue processing step of performing dialogue processing using the data stored in the user model storage step.

24. The device control method using a mobile telephone terminal according to claim 23, wherein a voice output message is generated using a vocabulary list associated with the device classification to which the device to be operated belongs.

25. The device control method using a mobile telephone terminal according to claim 23, wherein a speech recognition candidate is generated using a vocabulary list associated with the device classification to which the device to be operated belongs.

26. The device control method using a mobile telephone terminal according to claim 23, wherein a voice output message is created using a workflow associated with the device classification to which the operation target device belongs.

27. The device control method using a mobile telephone terminal according to claim 23, wherein a voice recognition candidate is generated using a workflow associated with the device classification to which the device to be operated belongs.

28. The device control method using a mobile telephone terminal according to claim 23, wherein, for a function not included in the function group associated with the device classification to which the device to be operated belongs, a detailed voice output message is generated and emphasized synthesized voice is generated.

29. The device control method using a mobile telephone terminal according to claim 23, wherein, for a setting item that the user sets when using a device, when the setting values set by the target user vary little and are stable even across different devices, the workflow is created with that setting value as the default value of the setting item.

30. The device control method using a mobile telephone terminal according to claim 23, wherein the workflow is created with a lowered setting-process priority for setting items that are rarely changed from their default values.

31. A computer-readable recording medium in which a program for causing the mobile telephone terminal according to claim 1 to function as a device control device is recorded.

32. A computer-readable recording medium on which a program for implementing the device control method using the mobile telephone terminal according to claim 16 is recorded.