
JP2005284723A - Natural language processing system, natural language processing method, and computer program

Info

Publication number: JP2005284723A
Application number: JP2004097830A
Authority: JP
Inventors: Tomoko Okuma; 智子大熊; Hiroshi Masuichi; 博増市; Hiroki Yoshimura; 宏樹吉村; Daigo Sugihara; 大悟杉原
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2004-03-30
Filing date: 2004-03-30
Publication date: 2005-10-13

Abstract

PROBLEM TO BE SOLVED: To obtain an accurate syntactic-semantic analysis result for an entire original sentence by accurately connecting the analysis results of the individual elements after a long sentence has been divided for syntactic-semantic analysis.

SOLUTION: Division positions in an input sentence are determined from the morpheme information of one morpheme, or of two or more consecutive morphemes, in the input sentence. Syntactic-semantic analysis is performed on each element of the input sentence divided at those positions, yielding the dependency relations between words and the type of each relation for every element. Based on the morpheme information at the division positions, the connection position and dependency relation in the analysis result of each element are determined, and the analysis results of the elements are connected to one another.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a natural language processing system, a natural language processing method, and a computer program for mathematically handling the natural language that people use in everyday communication. In particular, it relates to a natural language processing system, method, and computer program that first apply morphological analysis to a natural language sentence (segmenting it into words and affixes and assigning parts of speech) and then perform syntactic-semantic analysis to obtain the dependency relations between words and the types of those relations.

More specifically, the present invention relates to a natural language processing system, method, and computer program that divide a long sentence to reduce the computation required for syntactic-semantic analysis. In particular, it relates to a system, method, and program that analyze each element of the divided sentence and then accurately connect the per-element analysis results to obtain the syntactic-semantic analysis result for the original sentence.

The languages people use in everyday communication, such as Japanese and English, are called "natural languages". Most natural languages arose spontaneously and have evolved along with the history of humankind, peoples, and societies. People can of course also communicate through gestures, but natural language enables the most natural and sophisticated communication.

Meanwhile, with the development of information technology, computers have taken root in human society and permeated industry and daily life. Today not only computer data but almost all information content, including images and audio, is handled on computers, which makes possible advanced processing such as editing, storage, management, transmission, and sharing of information.

For example, natural language as written in Japanese, English, and other languages is inherently abstract and highly ambiguous, but it can be processed by computer when sentences are handled mathematically. As a result, automated processing enables a variety of natural language applications and services, such as machine translation, dialogue systems, search systems, and question answering systems.

Natural language processing is generally divided into four phases: morphological analysis, syntactic analysis, semantic analysis, and context analysis. Morphological analysis segments a sentence into morphemes, the smallest meaningful units, and assigns parts of speech. Syntactic analysis analyzes the structure of a sentence, such as its phrase structure, based on grammar rules and the like. Because grammar rules are tree-structured, the result of syntactic analysis is generally a tree in which the individual morphemes are joined according to dependency relations and the like. Semantic analysis derives and composes a semantic structure that expresses the meaning conveyed by the sentence, based on the senses (concepts) of the words in the sentence and the semantic relations between words. Context analysis treats a text (discourse), that is, a sequence of sentences, as the basic unit of analysis and builds a discourse structure by finding the semantic groupings among sentences.

In linguistics, a morpheme is the smallest grammatical unit, such as a word or an affix. Morphological analysis therefore segments text into words and assigns parts of speech in order to identify the grammatical attributes of each morpheme (part of speech, conjugation, and so on). Because it identifies the words in an input sentence and analyzes their inflection, morphological analysis is a fundamental process in kana-kanji conversion, information retrieval, machine translation, and similar applications (see, for example, Non-Patent Document 1).

Syntactic-semantic analysis is the process of receiving a natural language sentence and identifying, based on grammar rules, the dependency relations between words and the type of each relation (case-frame roles such as subject and object, and types of modification).

Syntactic analysis that extracts only the former, the dependency relations, can produce results in real time using statistical processing and similar techniques. Systems for such syntactic analysis, such as Cabocha (see, for example, Non-Patent Document 2) and KNP (see, for example, Non-Patent Document 3), are already well known in the art.

For practical applications, however, part-of-speech information for each morpheme and the dependency relations between morphemes are not sufficient as an analysis result. It is considered important to perform semantic analysis, the stage that follows syntactic analysis, and obtain information such as case frames.

Semantic analysis, however, is computationally very expensive, so obtaining results in real time is much harder than for syntactic analysis. Because the amount of computation grows exponentially with the number of morphemes in a sentence, semantic analysis of long sentences is a particularly serious problem.

One conceivable solution is a preprocessing step that divides a long sentence before it is input to the syntactic-semantic analysis system, reducing the computation required for analysis.

For example, a long Japanese sentence can be searched for members of a set of division candidate character strings prepared in advance. If a candidate string is found, the sentence is divided there, morphological analysis is performed on the divided sentences, and the sentence endings are adjusted (see, for example, Patent Document 1).

It is also generally understood in the art that, when dividing a sentence, positions where conjunctive particles occur are preferable division positions. For example, there is a technique that obtains the linear joining order of conjunctive particles and function words from a corpus and uses it to prioritize parsing (see, for example, Non-Patent Document 4).

Dividing sentences to reduce the cost or computation time of language processing has been adopted not only in syntactic-semantic analysis but also in machine translation and morphological analysis.

For example, when the length of the source sentence exceeds a limit, the source text data can be divided into word strings of analyzable length, each word string translated separately, and the translations combined into one sentence that is output as the translation result (see, for example, Patent Document 2).

In translation of long sentences, a natural translation can also be obtained by translating each element of the source sentence divided during preprocessing and then inserting words such as conjunctions and relative pronouns between the elements (see, for example, Patent Document 3).

[Patent Document 1] Japanese Examined Patent Publication No. H7-21803
[Patent Document 2] Japanese Unexamined Patent Publication No. H10-21239
[Patent Document 3] Japanese Unexamined Patent Publication No. H7-182346
[Non-Patent Document 1] Kiyotaka Uchimoto and Qing Ma, "人文学と情報処理" (Humanities and Information Processing), Bensei Shuppan, pp. 13-15, 1999
[Non-Patent Document 2] Taku Kudo and Yuji Matsumoto, "チャンキングの段階適用による日本語係り受け解析" (Japanese dependency analysis by cascaded application of chunking), IPSJ Journal, 43(6), 1834-1842
[Non-Patent Document 3] Sadao Kurohashi, "結構やるな，KNP" (KNP does rather well), Joho Shori (Information Processing), 41(11), 1215-1220
[Non-Patent Document 4] Natsuki Ichimaru and Hiroyuki Tobimatsu, "接続助詞の結合順位に基づく複文の構文解析" (Parsing of complex sentences based on the joining order of conjunctive particles), 2003-NL-158

The amount of computation for language processing, including syntactic-semantic analysis, grows exponentially with the number of morphemes in a sentence. For this reason, long sentences are divided to reduce the computation required for syntactic-semantic analysis. This leaves a problem, however: after each element of the divided sentence has been analyzed, the per-element analysis results must be accurately connected to obtain an analysis result for the sentence as a whole.

All of the techniques described above simply concatenate the individual processing results when recombining the divided elements, and so cannot be applied to the syntactic-semantic analysis system at issue here. In a syntactic-semantic analysis system it is not enough to relate the individual results to one another; the type of each relation must also be identified. Patent Document 3, for example, inserts connecting words between elements. However, the way the analysis results should be connected can differ depending on where the sentence was divided and on the syntactic-semantic analysis result of each element, so merely inserting words does not yield a correct result.

The present invention has been made in view of the technical problems described above. Its main object is to provide an excellent natural language processing system, natural language processing method, and computer program that perform syntactic-semantic analysis on an input sentence and obtain accurate analysis results for the dependency relations between words and the types of those relations.

A further object of the present invention is to provide an excellent natural language processing system, natural language processing method, and computer program that can reduce the computation required for syntactic-semantic analysis by dividing long sentences.

A further object of the present invention is to provide an excellent natural language processing system, natural language processing method, and computer program that analyze each element of a divided sentence and then accurately connect the per-element analysis results to obtain an accurate syntactic-semantic analysis result for the entire original sentence.

The present invention has been made in consideration of the above problems. Its first aspect is a natural language processing system that analyzes an input natural language sentence, comprising: sentence division processing means that determines a division position in the input sentence based on the morpheme information of one morpheme, or of two or more consecutive morphemes, in the input sentence, and divides the input sentence into a front element and a rear element at that position; syntactic-semantic analysis means that performs syntactic-semantic analysis on each element of the divided input sentence and obtains, for each element, the dependency relations between words and the types of those relations; and sentence element connection processing means that, based on the dependency relations and relation types in each divided element and on the morpheme information at the division position, determines the connection position and dependency relation in the syntactic-semantic analysis results of the divided elements and connects those results to one another.

The amount of computation for language processing, including syntactic-semantic analysis, grows exponentially with the number of morphemes in a sentence. For this reason, long sentences are divided to reduce the computation required for syntactic-semantic analysis. In this approach a syntactic-semantic analysis result is obtained for each element produced by the division. Each result is represented, for example, as a tree whose nodes are joined according to the dependency relations between words, with the type of the dependency relation attached to each branch.
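As a minimal sketch of such a per-element result, the tree below stores the relation type on each branch. The class name `Node`, the relation labels, and the example sentence are illustrative assumptions and do not come from the specification.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word in the analysis tree (hypothetical representation)."""
    word: str
    # Each branch carries the type of the dependency relation to the child.
    children: list[tuple[str, Node]] = field(default_factory=list)

    def attach(self, relation: str, child: Node) -> Node:
        self.children.append((relation, child))
        return self


def edges(node: Node):
    """Yield (head, relation, dependent) triples in depth-first order."""
    for relation, child in node.children:
        yield (node.word, relation, child.word)
        yield from edges(child)


# Assumed analysis of the element "太郎が本を読む" (Taro reads a book).
tree = Node("読む").attach("subject", Node("太郎")).attach("object", Node("本"))
```

Listing `edges(tree)` gives one labelled branch per dependency, which is the information the connection step later has to extend across elements.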

In this case, however, after each element of the divided sentence has been analyzed, the per-element results must be accurately connected to obtain an analysis result for the whole sentence. That is, it is unclear which words in the syntactic-semantic analysis trees obtained for the elements should be joined, and with what type of dependency relation, so it is difficult to connect the elements of a divided sentence accurately after analysis.

According to the present invention, by contrast, the connection position and dependency relation in the syntactic-semantic analysis results of the divided elements are determined from the morpheme information at the division position of the input sentence, and the elements are connected accordingly. Each element of the divided sentence can therefore be analyzed and the per-element results accurately connected, yielding an accurate syntactic-semantic analysis result for the entire original sentence.

Here, there is a fixed correspondence between the division position, which is determined from the morpheme information of one morpheme or of two or more consecutive morphemes in the sentence, and the connection position and dependency relation in the per-element analysis results, which are determined from the morpheme information at that division position.

Accordingly, the morpheme information of the one morpheme, or two or more consecutive morphemes, at a division position may be described as a division processing rule, and the position and dependency relation for connecting the analysis results of the divided elements may be described as the corresponding connection processing rule. A plurality of division processing rules and their corresponding connection processing rules may be held in advance. In that case, the sentence element connection processing means can connect the analysis results for the elements obtained from the syntactic-semantic analysis means using the connection processing rule that corresponds to the division processing rule used to divide the input sentence.

The present invention therefore makes it possible to analyze long sentences for which syntactic-semantic analysis could not previously be carried out. Because each division processing rule is held together with its corresponding connection processing rule, the partial analysis results can be connected in a way appropriate to how the sentence was divided. Even though the sentence is divided into several elements before analysis, the semantic analysis result for the whole sentence can ultimately be output correctly.

When a plurality of division processing rules for determining the division position are prepared, a priority may be assigned to each rule. Applying the rules selectively in priority order and dividing the input sentence at the resulting position determines a more appropriate division position and efficiently yields a more accurate syntactic-semantic analysis result.
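The pairing of division rules with connection rules, tried in priority order, might be sketched as follows. The part-of-speech tag names, the numeric priorities, the choice of the first matching position, and the names `Rule` and `find_split` are all assumptions made for illustration; the specification does not fix them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int    # assumed convention: lower value is tried first
    pattern: tuple   # POS tags of the consecutive morphemes at the split
    link: str        # name of the paired connection rule


# Hypothetical rule table; tags and ordering are illustrative only.
RULES = [
    Rule(1, ("conj_particle_A", "comma"), "adverbial_modifier"),
    Rule(2, ("conj_particle_B", "comma"), "coordinate_via_substitute"),
    Rule(3, ("comma",), "adverbial_modifier"),
]


def find_split(morphemes, rules=RULES):
    """Return (split_index, rule) for the highest-priority matching rule, or None.

    `morphemes` is a list of (surface, pos_tag) pairs.
    """
    tags = [pos for _, pos in morphemes]
    for rule in sorted(rules, key=lambda r: r.priority):
        n = len(rule.pattern)
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == rule.pattern:
                return i + n, rule   # split right after the matched morphemes
    return None


sentence = [("w1", "noun"), ("w2", "verb"), ("p", "conj_particle_A"),
            ("、", "comma"), ("w3", "noun"), ("、", "comma"), ("w4", "verb")]
index, rule = find_split(sentence)
```

Because the returned rule carries the name of its connection rule, the later connection step needs no second lookup to know how the two elements should be joined.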

FIG. 1 schematically shows an example embodiment of the natural language processing system according to the present invention. The illustrated natural language processing apparatus 10 comprises a sentence input unit 11, a division/connection rule holding unit 12, a sentence division processing unit 13, a syntactic-semantic analysis unit 14, a sentence element connection unit 15, and an analysis result output unit 16.

The sentence input unit 11 receives a natural language sentence written in Japanese or another language and applies morphological analysis, which divides the sentence into words (morphemes), assigns parts of speech, and identifies conjugated forms. This yields the word segmentation and part-of-speech tags of the input sentence.

The division/connection rule holding unit 12 holds division rules, each consisting of a condition on the division position and a priority for application, together with connection rules, each associated with a division rule and consisting of a connection position and a relation name.

The sentence division processing unit 13 queries the division/connection rule holding unit 12, applies the division rules to the input sentence in priority order, determines the division position using the matching rule, and divides the input sentence into the elements before and after that position.

The syntactic-semantic analysis unit 14 extracts dependency relations and identifies relation types, such as case-frame roles, for each element of the divided sentence. This produces, for each element, an analysis result that describes the dependency relations between words and the types of those relations.

The sentence element connection unit 15 queries the division/connection rule holding unit 12 and combines the analysis results of the sentence elements according to the connection rule that corresponds to the division rule applied by the sentence division processing unit 13. The analysis result output unit 16 then outputs the combined result as the analysis result for the entire original sentence.
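The flow through units 13, 14, and 15 can be sketched as a single function that takes the three steps as callables. This is an illustration under assumed interfaces, not the apparatus itself: `process` and the toy stand-ins below are hypothetical, and a real analyzer would return a dependency tree rather than a word list.

```python
def process(morphemes, find_split, analyze, connect):
    """Split once, analyze both elements, and join the results.

    find_split(morphemes)      -> (index, rule) or None
    analyze(element)           -> analysis result for one element
    connect(front, rear, rule) -> combined analysis result
    """
    hit = find_split(morphemes)
    if hit is None:
        return analyze(morphemes)        # no rule matched: analyze the sentence whole
    index, rule = hit
    front = analyze(morphemes[:index])   # role of syntactic-semantic analysis unit 14
    rear = analyze(morphemes[index:])
    return connect(front, rear, rule)    # role of sentence element connection unit 15


# Toy stand-ins so the sketch runs; none of these reflect a real analyzer.
def toy_split(ms):
    return (ms.index("、") + 1, "adverbial_modifier") if "、" in ms else None


def toy_analyze(ms):
    return {"words": [m for m in ms if m != "、"]}


def toy_connect(front, rear, rule):
    return {"front": front, "rear": rear, "relation": rule}


result = process(["A", "B", "、", "C"], toy_split, toy_analyze, toy_connect)
```

Passing the rule returned by the split step straight to the connection step mirrors the way unit 15 uses the connection rule paired with the division rule that unit 13 applied.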

The natural language processing system according to the present invention may further comprise analysis result evaluation means that evaluates the result of syntactic-semantic analysis for each element obtained by dividing the input sentence.

When the syntactic-semantic analysis means fails to complete the analysis of an element of the input sentence within a predetermined time, the analysis result evaluation means can decide whether that element should be divided further. In that case, the sentence division processing means divides the element again according to the decision of the analysis result evaluation means, and the syntactic-semantic analysis means analyzes each of the elements so obtained. This makes more accurate syntactic-semantic analysis possible even when the input sentence is long.
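One way to picture this retry behavior is the recursive sketch below. It assumes the analyzer signals an exceeded time budget by raising Python's built-in `TimeoutError`; the toy analyzer simulates that with a length limit instead of measuring time. The function names and the `max_depth` guard are hypothetical.

```python
def analyze_with_resplit(element, analyze, split, connect, depth=0, max_depth=5):
    """Analyze `element`; on TimeoutError, divide it and retry on each part."""
    try:
        return analyze(element)
    except TimeoutError:
        hit = split(element) if depth < max_depth else None
        if hit is None:
            raise   # cannot be divided further: propagate the failure
        index, rule = hit
        front = analyze_with_resplit(element[:index], analyze, split, connect,
                                     depth + 1, max_depth)
        rear = analyze_with_resplit(element[index:], analyze, split, connect,
                                    depth + 1, max_depth)
        return connect(front, rear, rule)


def toy_analyze(element, limit=3):
    # Stand-in for a time limit: long elements "time out".
    if len(element) > limit:
        raise TimeoutError("analysis exceeded the time budget")
    return [m for m in element if m != "、"]


def toy_split(element):
    # Never split after the final morpheme, so both parts are non-empty.
    for i, m in enumerate(element[:-1]):
        if m == "、":
            return i + 1, "adverbial_modifier"
    return None


def toy_connect(front, rear, rule):
    return (front, rule, rear)


result = analyze_with_resplit(["a", "b", "、", "c", "d", "、", "e"],
                              toy_analyze, toy_split, toy_connect)
```

The nested result shows that the rear element was divided a second time after its own analysis failed, while the front element was accepted as is.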

The analysis result evaluation means can also evaluate the syntactic-semantic analysis results for the elements obtained by dividing the input sentence under a given division processing rule, and decide whether the input sentence should be divided afresh. In that case, the sentence division processing means divides the input sentence again by applying a different division processing rule according to the decision of the analysis result evaluation means, and the syntactic-semantic analysis means analyzes each of the elements obtained from the new division. This makes more accurate syntactic-semantic analysis possible even when the input sentence is long.

The sentence division processing means may also select a division processing rule by a machine learning method and divide the input sentence accordingly.

For example, when the sentence division processing means has divided a sentence at a position where a class-A conjunctive particle is followed by a comma, the connection processing means connects the syntactic-semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb in the element after the division position. The syntactic-semantic analysis result for the entire original sentence is thereby obtained.
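A minimal sketch of this connection rule follows. It assumes each per-element result is a dictionary whose `head` is the element's main verb and whose `deps` lists (relation, subtree) pairs; that representation, the relation label, and the function name are illustrative assumptions.

```python
def connect_as_adverbial(front, rear):
    """Attach the front element's tree under the rear element's verb.

    Returns a new tree; the inputs are left unmodified.
    """
    return {"head": rear["head"],
            "deps": rear["deps"] + [("adverbial_modifier", front)]}


# Placeholder words; which particles belong to class A is not assumed here.
front = {"head": "V1", "deps": []}
rear = {"head": "V2", "deps": [("subject", {"head": "N1", "deps": []})]}
whole = connect_as_adverbial(front, rear)
```

The root of the combined tree stays the verb of the rear element, so the front clause ends up as one more labelled branch beneath it.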

When the sentence division processing means has divided a sentence at a position where a class-B conjunctive particle is followed by a comma, the connection processing means inserts a substitute word that bundles the elements and connects the syntactic-semantic analysis results of the elements before and after the division position in a coordinate (juxtaposition) relation through that substitute word. The syntactic-semantic analysis result for the entire original sentence is thereby obtained.
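The coordinate connection can be sketched in the same assumed dictionary representation (`head` plus a list of (relation, subtree) pairs). The placeholder token `<COORD>` for the substitute word and the relation label `coordinate` are hypothetical.

```python
def connect_by_coordination(front, rear, substitute="<COORD>"):
    """Bundle two element trees under an inserted substitute word."""
    return {"head": substitute,
            "deps": [("coordinate", front), ("coordinate", rear)]}


front = {"head": "V1", "deps": []}
rear = {"head": "V2", "deps": []}
whole = connect_by_coordination(front, rear)
```

Unlike the adverbial case, neither verb becomes the root: the inserted word is the new root and both elements hang from it with the same relation.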

When the sentence division processing means has divided a sentence at a position where the continuative form of a conjugated word is followed by a comma, the connection processing means inserts a substitute word that bundles the elements and connects the syntactic-semantic analysis results of the elements before and after the division position in a coordinate relation through that substitute word. The syntactic-semantic analysis result for the entire original sentence is thereby obtained.

When the sentence division processing means has divided a sentence at a position where a case particle is followed by a comma, the connection processing means proceeds as follows. If the verb of the element after the division position is a quotative verb, the element before the division position is connected as the quotative case of that verb; otherwise, the element before the division position is connected as an adverbial modifier. The syntactic-semantic analysis result for the entire original sentence is thereby obtained.
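The branch on the rear verb might look like the sketch below, again using the assumed dictionary representation. The set of quotative verbs is a made-up example (言う "say", 思う "think"); the specification does not enumerate them here.

```python
# Hypothetical lexicon of quotative verbs, for illustration only.
QUOTATIVE_VERBS = {"言う", "思う"}


def connect_after_case_particle(front, rear, quotative_verbs=QUOTATIVE_VERBS):
    """Attach `front` as quotative case if the rear verb is quotative, else as adverbial."""
    relation = "quotative" if rear["head"] in quotative_verbs else "adverbial_modifier"
    return {"head": rear["head"], "deps": rear["deps"] + [(relation, front)]}


front = {"head": "V1", "deps": []}
said = connect_after_case_particle(front, {"head": "言う", "deps": []})
other = connect_after_case_particle(front, {"head": "走る", "deps": []})
```

The same division rule therefore leads to different relation labels, which is why the connection step needs the rear element's analysis result and not just the division position.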

When the sentence division processing means has divided a sentence at a position where a noun, a binding particle, and a comma occur in succession, the connection processing means proceeds as follows. If the verb of the element after the division position has an omitted case element, the element before the division position is connected as filling that case; otherwise, the element before the division position is connected as an adverbial modifier. The syntactic-semantic analysis result for the entire original sentence is thereby obtained.
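Detecting an omitted case element requires some notion of which cases a verb expects. The sketch below assumes a small case-frame dictionary and fills the first missing case; both the dictionary contents and the "first missing" policy are illustrative assumptions, as is the dictionary tree representation.

```python
# Hypothetical case frames: the cases each verb is assumed to require.
CASE_FRAMES = {"読む": ["subject", "object"]}


def connect_topic(front, rear, case_frames=CASE_FRAMES):
    """Fill an omitted case of the rear verb with `front`, else attach it as adverbial."""
    filled = {relation for relation, _ in rear["deps"]}
    missing = [c for c in case_frames.get(rear["head"], []) if c not in filled]
    relation = missing[0] if missing else "adverbial_modifier"
    return {"head": rear["head"], "deps": rear["deps"] + [(relation, front)]}


# Assumed split of "太郎は、本を読む": the rear element lacks a subject.
taro = {"head": "太郎", "deps": []}
book = {"head": "本", "deps": []}
whole = connect_topic(taro, {"head": "読む", "deps": [("object", book)]})
```

Here the topic-marked noun is recovered as the subject of the rear verb, because "subject" is the only case in the assumed frame that the rear element left unfilled.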

When the sentence division processing means has divided a sentence at a position where an adverbial noun (a noun that can function as an adverb) is followed by a comma, the connection processing means connects the syntactic-semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb in the element after the division position. The syntactic-semantic analysis result for the entire original sentence is thereby obtained.

また、前記文分割処理手段が読点を分割位置として文を分割したときに、前記連結処理手段は、該分割位置より前の要素を該分割位置より後の要素の動詞に対する連用修飾成分として各要素の構文意味解析結果同士を連結することで、元の文全体について構文意味解析結果を得ることができる。 In addition, when the sentence division processing unit divides a sentence with a reading point as a division position, the connection processing unit uses each element as a continuous modification component for the verb of the element after the division position. By connecting the syntactic and semantic analysis results of, the syntactic and semantic analysis results can be obtained for the entire original sentence.

また、前記文分割処理手段が接続助詞Ａ類を分割位置として文を分割したときに、前記連結処理手段は、該分割位置より前の要素を該分割位置より後の要素の動詞に対する連用修飾成分として各要素の構文意味解析結果同士を連結することで、元の文全体について構文意味解析結果を得ることができる。 In addition, when the sentence division processing unit has divided the sentence using a class-A conjunctive particle as the division position, the connection processing unit connects the syntactic and semantic analysis results of the elements by attaching the element before the division position as an adverbial modifier of the verb of the element after the division position, so that a syntactic and semantic analysis result can be obtained for the entire original sentence.

また、前記文分割処理手段が接続助詞Ｂ類を分割位置として文を分割したときに、前記連結処理手段は、各要素を束ねるための代用語を挿入し、該分割位置より前の要素及び後の要素の各構文意味解析結果を代用語を介して並置関係により連結することで、元の文全体について構文意味解析結果を得ることができる。 Further, when the sentence division processing unit has divided the sentence using a class-B conjunctive particle as the division position, the connection processing unit inserts a dummy word for bundling the elements and connects the syntactic and semantic analysis results of the elements before and after the division position in a coordinate (juxtaposed) relationship through the dummy word, so that a syntactic and semantic analysis result can be obtained for the entire original sentence.

また、前記文分割処理手段が活用後の連用形を分割位置として文を分割したときに、前記連結処理手段は、各要素を束ねるための代用語を挿入し、該分割位置より前の要素及び後の要素の各構文意味解析結果を代用語を介して並置関係により連結することで、元の文全体について構文意味解析結果を得ることができる。 Further, when the sentence division processing unit has divided the sentence using a conjugated continuative form as the division position, the connection processing unit inserts a dummy word for bundling the elements and connects the syntactic and semantic analysis results of the elements before and after the division position in a coordinate (juxtaposed) relationship through the dummy word, so that a syntactic and semantic analysis result can be obtained for the entire original sentence.

また、前記文分割処理手段が格助詞を分割位置として文を分割したときに、前記連結処理手段は、該分割位置より後の要素の動詞が引用動詞であれば該分割位置より前の要素を該動詞の引用格として各要素の構文意味解析結果同士を連結し、そうでなければ該分割位置より前の要素を連用修飾成分として各要素の構文意味解析結果同士を連結することで、元の文全体について構文意味解析結果を得ることができる。 In addition, when the sentence division processing unit divides the sentence using the case particle as a division position, the connection processing unit selects an element before the division position if the verb of the element after the division position is a citation verb. By linking the syntactic analysis results of each element as the citation of the verb, or by linking the syntactic analysis results of each element with the element before the split position as the continuous modification component, Syntactic and semantic analysis results can be obtained for the entire sentence.

また、前記文分割処理手段が係助詞を分割位置として文を分割したときに、前記連結処理手段は、該分割位置より後の要素の動詞に省略されている格要素があれば該分割位置より前の要素を該当する格として各要素の構文意味解析結果同士を連結し、そうでなければ該分割位置より前の要素を連用修飾成分として各要素の構文意味解析結果同士を連結することで、元の文全体について構文意味解析結果を得ることができる。 In addition, when the sentence division processing unit divides the sentence with the auxiliary particle as a division position, the connection processing unit determines that if there is an abbreviated case element in the verb of the element after the division position, By concatenating the syntactic and semantic analysis results of each element with the previous element as the corresponding case, or by concatenating the syntactic and semantic analysis results of each element with the element preceding the split position as the continuous modifier component, Syntactic and semantic analysis results can be obtained for the entire original sentence.

また、前記文分割処理手段が副詞可能名詞を分割位置として文を分割したときに、前記連結処理手段は、該分割位置より前の要素を該分割位置より後の要素の動詞に対する連用修飾成分として各要素の構文意味解析結果同士を連結することで、元の文全体について構文意味解析結果を得ることができる。 Further, when the sentence division processing unit divides the sentence using the adverb noun as a division position, the connection processing unit uses the element before the division position as a continuous modification component for the verb of the element after the division position. By connecting the syntax and semantic analysis results of each element, it is possible to obtain the syntax and semantic analysis results for the entire original sentence.
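The connection behaviour described in the preceding paragraphs can be summarized as a small decision function. The following is only a minimal sketch under stated assumptions: the split-kind keys, the relation labels, and the `RearVerb` record are hypothetical names introduced for illustration and do not appear in the specification. It assumes that the presence or absence of a comma at the division position does not change the relation that is chosen, which is how the paragraphs above read.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical relation labels (not taken from the specification).
ADVERBIAL = "adverbial-modifier"   # 連用修飾成分
COORDINATION = "coordination"      # 並置関係 through an inserted dummy word
QUOTATIVE = "quotative-case"       # 引用格


@dataclass
class RearVerb:
    """Assumed properties of the verb of the element after the division position."""
    is_quotative: bool = False          # True for a quotative verb (引用動詞)
    missing_case: Optional[str] = None  # name of an omitted case slot, if any


def choose_relation(split_kind: str, verb: RearVerb) -> str:
    """Return the relation used to attach the front element to the rear element."""
    if split_kind in ("conjunctive_b", "continuative_form"):
        return COORDINATION
    if split_kind == "case_particle":
        return QUOTATIVE if verb.is_quotative else ADVERBIAL
    if split_kind == "binding_particle":
        return verb.missing_case if verb.missing_case else ADVERBIAL
    if split_kind in ("conjunctive_a", "adverbial_noun", "comma"):
        return ADVERBIAL
    raise ValueError(f"unknown split kind: {split_kind}")
```

For instance, a split at a binding particle attaches the front element as the subject when the rear verb lacks a subject, and as an adverbial modifier otherwise.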

また、本発明の第２の側面は、入力された自然言語文を解析するための処理をコンピュータ・システム上で実行するようにコンピュータ可読形式で記述されたコンピュータ・プログラムであって、入力文が持つ形態素情報に基づいて、入力文における分割位置を決定し、入力文を該分割位置の前要素と後要素に分割する文分割処理ステップと、前記分割位置により分割された入力文の各要素に対してそれぞれ構文意味解析を施し、語と語の係り受け関係とその関係の種類を要素毎に取得する構文意味解析ステップと、入力文の分割位置における形態素情報に基づいて、分割された各要素の構文意味解析結果における連結する位置と係り受け関係を決定して要素の構文意味解析結果同士を連結する文要素連結処理ステップとを具備することを特徴とするコンピュータ・プログラムである。 According to a second aspect of the present invention, there is provided a computer program written in a computer-readable format so as to execute, on a computer system, processing for analyzing an input natural language sentence, the computer program comprising: a sentence division processing step of determining a division position in the input sentence based on the morpheme information of the input sentence and dividing the input sentence into a front element and a rear element at the division position; a syntactic and semantic analysis step of performing syntactic and semantic analysis on each element of the input sentence divided at the division position and obtaining, for each element, the word-to-word dependency relationships and the types of those relationships; and a sentence element connection processing step of determining, based on the morpheme information at the division position of the input sentence, the connection position and the dependency relationship in the syntactic and semantic analysis results of the divided elements, and connecting the syntactic and semantic analysis results of the elements to one another.

本発明の第２の側面に係るコンピュータ・プログラムは、コンピュータ・システム上で所定の処理を実現するようにコンピュータ可読形式で記述されたコンピュータ・プログラムを定義したものである。換言すれば、本発明の第２の側面に係るコンピュータ・プログラムをコンピュータ・システムにインストールすることによって、コンピュータ・システム上では協働的作用が発揮され、本発明の第１の側面に係る自然言語処理システムと同様の作用効果を得ることができる。 The computer program according to the second aspect of the present invention defines a computer program described in a computer-readable format so as to realize predetermined processing on a computer system. In other words, by installing the computer program according to the second aspect of the present invention in the computer system, a cooperative action is exhibited on the computer system, and the natural language according to the first aspect of the present invention. The same effects as the processing system can be obtained.

本発明によれば、入力文に対して構文意味解析処理を行ない、語と語の間の係り受け関係やその係り受け関係の種類に関する正確な解析結果を得ることができる、優れた自然言語処理システム及び自然言語処理方法、並びにコンピュータ・プログラムを提供することができる。 According to the present invention, it is possible to perform syntactic and semantic analysis processing on an input sentence, and to obtain an accurate analysis result regarding a dependency relationship between words and a type of the dependency relationship. A system, a natural language processing method, and a computer program can be provided.

また、本発明によれば、長文を分割して構文意味解析に要する計算量を軽減することができる、優れた自然言語処理システム及び自然言語処理方法、並びにコンピュータ・プログラムを提供することができる。 Furthermore, according to the present invention, it is possible to provide an excellent natural language processing system, natural language processing method, and computer program that can reduce the amount of calculation required for syntactic and semantic analysis by dividing a long sentence.

また、本発明によれば、分割した文の各要素を構文意味解析した後に各要素の構文解析結果を正確に連結して元の文についての正確な構文意味解析結果を得ることができる、優れた自然言語処理システム及び自然言語処理方法、並びにコンピュータ・プログラムを提供することができる。 In addition, according to the present invention, after syntactic and semantic analysis of each element of the divided sentence, it is possible to obtain an accurate syntactic and semantic analysis result of the original sentence by accurately connecting the syntax analysis results of each element. A natural language processing system, a natural language processing method, and a computer program can be provided.

本発明によれば、互いに対応付けられた文分割ルールと連結ルールと文を分割して得た要素毎の構文意味解析結果を併せて用いることによって、解析困難な長文を分割した後、各要素についての構文意味解析処理を行ない、その解決結果を連結することにより、元の文全体についてのより正確な解析結果を得ることができる。 According to the present invention, by using together the mutually associated sentence division rules and connection rules and the syntactic and semantic analysis result of each element obtained by dividing the sentence, a long sentence that is difficult to analyze is divided, syntactic and semantic analysis is performed on each element, and the resulting analysis results are connected, so that a more accurate analysis result can be obtained for the entire original sentence.

すなわち、本発明によれば、これまで構文意味解析を実行することが不可能であった長文を解析することが可能となる。文の分割ルールとそれに対応する連結ルールを対にして持つことによって、分割の仕方に応じた適切な部分解析結果の連結を行なうことができるため、分割を行なっているにも拘らず文全体の解析結果を正しく出力することが可能である。 That is, according to the present invention, it is possible to analyze a long sentence that has not been possible to perform syntactic and semantic analysis. By having a sentence splitting rule and a corresponding connection rule in pairs, it is possible to link the appropriate partial analysis results according to the way of splitting. It is possible to output the analysis result correctly.

本発明のさらに他の目的、特徴や利点は、後述する本発明の実施形態や添付する図面に基づくより詳細な説明によって明らかになるであろう。 Other objects, features, and advantages of the present invention will become apparent from more detailed description based on embodiments of the present invention described later and the accompanying drawings.

以下、図面を参照しながら本発明の実施形態について詳解する。 Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

図２には、本発明の一実施形態に係る自然言語処理装置１００の機能構成を模式的に示している。図示の自然言語処理装置１００は、文入力部１０１と、分割・連結ルール保持部１０２と、文分割処理部１０３と、構文意味解析部１０４と、解析結果評価部１０７と、文要素連結部１０５と、解析結果出力部１０６で構成される。 FIG. 2 schematically shows a functional configuration of the natural language processing apparatus 100 according to an embodiment of the present invention. The illustrated natural language processing apparatus 100 includes a sentence input unit 101, a division / connection rule holding unit 102, a sentence division processing unit 103, a syntax and semantic analysis unit 104, an analysis result evaluation unit 107, and a sentence element connection unit 105. And an analysis result output unit 106.

文入力部１０１は、日本語やその他の言語で記述された自然言語文を受け取り、文を単語すなわち形態素毎に分割して品詞付けや活用形の同定を行なう形態素解析処理を施すことによって単語区切りや品詞付けを行なう。 The sentence input unit 101 receives a natural language sentence written in Japanese or another language and applies morphological analysis, which divides the sentence into words (morphemes) and identifies parts of speech and conjugated forms, thereby segmenting the sentence into words and assigning a part of speech to each.

分割・連結ルール保持部１０２は、分割する位置の条件と適用する優先順位からなる分割ルール、並びに各分割ルールに対応付けられている連結位置と関係名からなる連結ルールを保持している。分割ルールは、入力文における分割位置を決定するための１つの形態素又は２以上の連続する形態素が持つ形態素情報として記述される。また、連結ルールは、分割された各要素の構文意味解析結果同士を連結するための位置と係り受け関係を記述したものである。但し、分割ルール並びに連結ルールの詳細については後述に譲る。 The division / connection rule holding unit 102 holds a division rule composed of a condition of a division position and a priority order to be applied, and a connection rule composed of a connection position and a relation name associated with each division rule. The division rule is described as morpheme information of one morpheme or two or more continuous morphemes for determining a division position in the input sentence. The connection rule describes the position and dependency relationship for connecting the syntax and semantic analysis results of each divided element. However, details of the division rule and the connection rule will be described later.

文分割処理部１０３は、分割・連結ルール保持部１０２に問い合わせ、分割ルールを優先度順に順次適用していき、合致する分割ルールを用いて分割位置を決定し、入力文を分割位置の前後の要素に分割する。 The sentence division processing unit 103 inquires the division / concatenation rule holding unit 102, sequentially applies the division rules in order of priority, determines the division position using the matching division rules, and determines the input sentence before and after the division position. Split into elements.

構文意味解析部１０４は、分割された文の各要素に対して、係り受け関係の抽出と格フレームなどの関係の種類を特定する処理を施す。この構文意味解析処理により、語と語の係り受け関係とその関係を記述した解析結果が文の要素毎に出力される。 The syntactic and semantic analysis unit 104 performs processing for extracting dependency relationships and specifying types of relationships such as case frames for each element of the divided sentence. By this syntactic and semantic analysis processing, the dependency relationship between words and the analysis result describing the relationship are output for each element of the sentence.

文要素連結部１０５は、分割・連結ルール保持部１０２に問い合わせ、文分割処理部１０３において適用した分割ルールに対応する連結ルールに基づいて、文の要素の解析結果を結合する。そして、解析結果出力部１０６は、結合された解析結果を、元の文全体の解析結果として出力する。 The sentence element linking unit 105 makes an inquiry to the division / connection rule holding unit 102 and combines the analysis results of the sentence elements based on the linking rule corresponding to the division rule applied by the sentence division processing unit 103. Then, the analysis result output unit 106 outputs the combined analysis result as the analysis result of the entire original sentence.

解析結果評価部１０７は、構文意味解析部１０４による処理結果を入力とし、入力文を分割して得られた各要素についての構文意味解析の処理結果を評価する。そして、構文意味解析部１０４から正常な解析結果を得られなかったと判断した場合には、文分割処理部１０３に指示を出力し、入力文に対する分割処理を再帰的に行ない、より適切な分割位置により入力文の分割と構文意味解析処理の精度向上を図る。 The analysis result evaluation unit 107 receives the processing result of the syntactic and semantic analysis unit 104 and evaluates the syntactic and semantic analysis result for each element obtained by dividing the input sentence. If it determines that a normal analysis result could not be obtained from the syntactic and semantic analysis unit 104, it issues an instruction to the sentence division processing unit 103 so that the division of the input sentence is performed recursively, improving the accuracy of the division and of the syntactic and semantic analysis by using more appropriate division positions.

例えば、解析結果評価部１０７は、構文意味解析部１０４が入力文の各要素に対する解析を所定時間以内に実行できなかったこと、すなわちタイムアウト・エラーの発生に応答して、分割された各要素をさらに分割すべきかどうかを決定する。そして、さらに分割すべきと決定した場合には、文分割処理部１０３は要素に対する分割処理を再帰的に行ない、構文意味解析部１０４は要素を再度分割して得られる各要素について構文意味解析処理を施す。 For example, when the syntactic and semantic analysis unit 104 fails to complete the analysis of an element of the input sentence within a predetermined time, that is, when a timeout error occurs, the analysis result evaluation unit 107 decides whether that element should be divided further. If it decides that further division is needed, the sentence division processing unit 103 recursively divides the element, and the syntactic and semantic analysis unit 104 performs syntactic and semantic analysis on each of the elements obtained by the further division.

また、解析結果評価部１０７は、入力文をある分割処理規則により分割して得られた各要素についての構文意味解析結果を評価し、入力文を分割し直すべきかどうかを決定する。そして、入力文を分割し直すべきと決定した場合には、文分割処理部１０３は、入力文に対して別の分割処理規則を適用して再度分割し、構文意味解析部１０４は入力文を再度分割して得られる各要素について改めて構文意味解析処理を施す。 The analysis result evaluation unit 107 also evaluates the syntactic and semantic analysis results for the elements obtained by dividing the input sentence according to a given division rule and decides whether the input sentence should be divided again. If it decides that the input sentence should be divided again, the sentence division processing unit 103 applies a different division rule to the input sentence and divides it again, and the syntactic and semantic analysis unit 104 performs syntactic and semantic analysis anew on each of the elements obtained by the new division.

なお、解析結果の評価による入力文の再帰的分割処理の詳細については、後述に譲る。 The details of the recursive division processing of the input sentence by evaluating the analysis result will be described later.
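The timeout-driven recursion described above can be sketched as follows. This is an illustrative sketch only, not the embodiment's actual procedure: `ParseTimeout`, `analyze_with_resplit`, the `max_depth` limit, and the toy `toy_parse` / `toy_split` helpers are all hypothetical, and the sketch covers only the timeout case, not re-division with a different rule.

```python
from typing import Callable, List, Optional, Tuple

Tokens = List[str]


class ParseTimeout(Exception):
    """Raised by the (assumed) parser when analysis exceeds its time budget."""


def analyze_with_resplit(
    element: Tokens,
    parse: Callable[[Tokens], str],
    split: Callable[[Tokens], Optional[Tuple[Tokens, Tokens]]],
    max_depth: int = 3,
) -> List[str]:
    """Parse an element; on timeout, split it and analyze the parts recursively."""
    try:
        return [parse(element)]
    except ParseTimeout:
        parts = split(element) if max_depth > 0 else None
        if parts is None:
            raise  # no further division is possible, so report the failure
        front, rear = parts
        return (analyze_with_resplit(front, parse, split, max_depth - 1)
                + analyze_with_resplit(rear, parse, split, max_depth - 1))


def toy_parse(tokens: Tokens) -> str:
    """Stand-in parser: pretends that more than four tokens cause a timeout."""
    if len(tokens) > 4:
        raise ParseTimeout
    return "parsed(" + " ".join(tokens) + ")"


def toy_split(tokens: Tokens) -> Optional[Tuple[Tokens, Tokens]]:
    """Stand-in splitter: divides after the first comma that is not sentence-final."""
    if "," in tokens[:-1]:
        i = tokens.index(",")
        return tokens[: i + 1], tokens[i + 1:]
    return None


results = analyze_with_resplit(
    ["a", "b", ",", "c", "d", "e", ",", "f", "g"], toy_parse, toy_split
)
```

Here the nine-token input times out, is divided at the first comma, and the rear part is divided once more before every piece can be parsed.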

本発明に係る自然言語処理は、構文意味解析処理システムにおける構文意味解析部に組み込んで実装することができる。例えば、構文・意味解析を行なうための文法理論の代表的な例として、ＬｅｘｉｃａｌＦｕｎｃｔｉｏｎａｌＧｒａｍｍａｒ（ＬＦＧ）を挙げることができる。ＬＦＧでは、ネイティブ・スピーカの言語知識すなわち文法を、コンピュータ処理や、コンピュータの処理動作に影響を及ぼすその他の非文法的な処理パラメータとは切り離したコンポーネントとして構成している。 The natural language processing according to the present invention can be implemented by being incorporated into a syntax and semantic analysis unit in the syntax and semantic analysis processing system. For example, Lexical Functional Grammar (LFG) can be cited as a typical example of grammar theory for performing syntactic / semantic analysis. In LFG, linguistic knowledge, that is, grammar of native speakers is configured as a component separated from computer processing and other non-grammatical processing parameters that affect the processing operation of the computer.

図３には、ＬＦＧ文法理論に基づく構文意味解析処理システム２００の機能的構成を模式的に示している。 FIG. 3 schematically shows a functional configuration of the syntactic and semantic analysis processing system 200 based on the LFG grammar theory.

形態素解析部２０２は、図２に示した自然言語処理装置１００における文入力部１０１に相当する処理を行なう。図示の例では、形態素解析部２０２は、日本語など特定の言語に関する形態素ルール２０２Ａと形態素辞書２０２Ｂを持ち、入力文を意味的最小単位である形態素に分節して品詞の認定処理を行なう。形態素解析システムとして、例えば「茶筌（Ｃｈａｓｅｎ）」など日本語形態素解析システムを適用することができるが、本発明の要旨はこれに限定されるものではない。茶筌による形態素解析システムについては、例えば、松本裕治、北内啓、山下達雄、平野善隆、松田寛、高岡一馬、浅原正幸共著「日本語形態素解析システム茶筌ｖｅｒｓｉｏｎ２．２．１使用説明書」（奈良先端科学技術大学院大学，２０００）を参照されたい。 The morphological analysis unit 202 performs processing corresponding to that of the sentence input unit 101 in the natural language processing apparatus 100 shown in FIG. 2. In the illustrated example, the morphological analysis unit 202 has morpheme rules 202A and a morpheme dictionary 202B for a specific language such as Japanese, segments the input sentence into morphemes, which are the smallest meaningful units, and identifies their parts of speech. A Japanese morphological analysis system such as ChaSen (茶筌) can be used as the morphological analysis system, for example, but the gist of the present invention is not limited to this. For the ChaSen morphological analysis system, see, for example, Yuji Matsumoto, Kei Kitauchi, Tatsuo Yamashita, Yoshitaka Hirano, Hiroshi Matsuda, Kazuma Takaoka, and Masayuki Asahara, "Japanese Morphological Analysis System ChaSen version 2.2.1 Manual" (Nara Institute of Science and Technology, 2000).

形態素解析処理により、例えば、「私の犬は林檎を食べる。」という文が入力された場合、形態素解析結果として、「私｛名詞｝の｛連体化助詞｝犬｛名詞｝は｛係助詞｝林檎｛名詞｝を｛格助詞｝食べる｛動詞｝。｛句点｝」が出力される。 For example, when the sentence 「私の犬は林檎を食べる。」 ("My dog eats an apple.") is input, the morphological analysis outputs 「私｛名詞｝の｛連体化助詞｝犬｛名詞｝は｛係助詞｝林檎｛名詞｝を｛格助詞｝食べる｛動詞｝。｛句点｝」, that is, 私 {noun} の {adnominal particle} 犬 {noun} は {binding particle} 林檎 {noun} を {case particle} 食べる {verb} 。 {period}.
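As an illustration of the form of this result, a morpheme sequence can be held as a list of (surface form, part of speech) pairs. The sketch below is a hypothetical representation chosen for this example; the `Morpheme` type and the `render` function are assumptions and do not reflect the data format of ChaSen or of the embodiment.

```python
from typing import List, NamedTuple


class Morpheme(NamedTuple):
    surface: str  # the text of the morpheme as it appears in the sentence
    pos: str      # the part-of-speech label assigned by morphological analysis


def render(morphemes: List[Morpheme]) -> str:
    """Render the sequence in the surface｛part of speech｝ notation used in the text."""
    return "".join(f"{m.surface}｛{m.pos}｝" for m in morphemes)


sentence = [
    Morpheme("私", "名詞"),
    Morpheme("の", "連体化助詞"),
    Morpheme("犬", "名詞"),
    Morpheme("は", "係助詞"),
    Morpheme("林檎", "名詞"),
    Morpheme("を", "格助詞"),
    Morpheme("食べる", "動詞"),
    Morpheme("。", "句点"),
]
rendered = render(sentence)
```

Keeping the part of speech next to each surface form is what later allows a division rule to be matched against one morpheme or a run of consecutive morphemes.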

このような形態素解析結果は、次いで、構文意味解析部２０３に入力される。図２に示した自然言語処理装置１００における文分割処理部１０３、構文意味解析部１０４、解析結果評価部１０７、文要素連結部１０５は、構文意味解析部２０３に組み込んで実装することができる。 Such a morphological analysis result is then input to the syntactic and semantic analysis unit 203. The sentence division processing unit 103, the syntactic and semantic analysis unit 104, the analysis result evaluation unit 107, and the sentence element connection unit 105 of the natural language processing apparatus 100 shown in FIG. 2 can be implemented by being incorporated into the syntactic and semantic analysis unit 203.

図３に示す例では、構文意味解析部２０３は、文法ルール２０３Ａや格フレーム辞書２０３Ｂなどの辞書を持ち、文法ルールなどに基づく句構造の解析や、文中の語の語義や語と語の間の意味関係などに基づいて文が伝える意味を表現する意味構造の解析を行なう（格フレーム辞書は動詞と主語などの文中の他の構成要素との関係を記述したものであり、述部とそれに係る語の意味関係を抽出することができる）。そして、構文解析した結果として、単語や形態素などからなる文章の句構造を木構造として表した“ｃ−ｓｔｒｕｃｔｕｒｅ（ｃｏｎｓｔｉｔｕｅｎｔｓｔｒｕｃｔｕｒｅ）”と、主語、目的語などの格構造に基づいて入力文を疑問文、過去形、丁寧文など意味的・機能的に解析した結果として“ｆ−ｓｔｒｕｃｔｕｒｅ（ｆｕｎｃｔｉｏｎａｌｓｔｒｕｃｔｕｒｅ）”を出力する。 In the example shown in FIG. 3, the syntax / semantic analysis unit 203 has dictionaries such as a grammar rule 203A and a case frame dictionary 203B, analyzes a phrase structure based on a grammar rule, and the meaning of words in a sentence and between words. Analyzes the semantic structure that expresses the meaning that the sentence conveys based on the semantic relations of words (the case frame dictionary describes the relationship between verbs and other components in the sentence, such as the subject, predicates and The semantic relationship of such words can be extracted). As a result of parsing, “c-structure (constituent structure)” representing a phrase structure of a sentence including words and morphemes as a tree structure, and an input sentence based on a case structure such as a subject and an object are questioned. “F-structure (functional structure)” is output as a result of semantic and functional analysis such as sentences, past tense, and polite sentences.

図４及び図５には、入力文「私の犬は林檎を食べる。」を構文意味解析部２０３により処理した結果として得られるｃ−ｓｔｒｕｃｔｕｒｅ及びｆ−ｓｔｒｕｃｔｕｒｅをそれぞれ示している。 FIG. 4 and FIG. 5 respectively show c-structure and f-structure obtained as a result of processing the input sentence “My dog eats apple” by the syntax and semantic analysis unit 203.

ｃ−ｓｔｒｕｃｔｕｒｅは、文中の単語や句の構造を木構造形式で表したものであり、構文カテゴリによって定義される。例えば音素列を生成するための音韻学的な解釈を、ｃ−ｓｔｒｕｃｔｕｒｅを基に行なうことができる。一方、ｆ−ｓｔｒｕｃｔｕｒｅは、文法的な機能を明確に表現したものであり、文法的な機能名、意味的形式、並びに特徴シンボルにより構成される。このようなｆ−ｓｔｒｕｃｔｕｒｅを参照することにより、主語（ｓｕｂｊｅｃｔ）、目的語（ｏｂｊｅｃｔ）、補語（ｃｏｍｐｌｅｍｅｎｔ）、修飾語（ａｄｊｕｎｃｔ）といった意味理解を得ることができる。ｆ−ｓｔｒｕｃｔｕｒｅは、ｃ−ｓｔｒｕｃｔｕｒｅの各節点に付随する素性の集合であり、例えば図５に示すように属性−属性値のマトリックスの形で表現される。すなわち、図５に示す例では、［］で囲まれた中の左側は素性（属性）の名前であり、右側は素性の値（属性値）である。 The c-structure represents the structure of the words and phrases in a sentence as a tree and is defined in terms of syntactic categories. For example, the phonological interpretation used to generate a phoneme string can be based on the c-structure. The f-structure, on the other hand, expresses grammatical functions explicitly and consists of grammatical function names, semantic forms, and feature symbols. By referring to the f-structure, a semantic understanding in terms of subject, object, complement, and adjunct can be obtained. The f-structure is a set of features attached to the nodes of the c-structure and is expressed, for example, as an attribute-value matrix as shown in FIG. 5. That is, in the example of FIG. 5, within each pair of brackets [ ], the left side is the name of a feature (attribute) and the right side is the value of that feature (attribute value).
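To make the attribute-value form concrete, an f-structure can be modelled as a nested mapping from feature names to values. The sketch below is hypothetical: FIG. 5 is not reproduced here, so the feature names (`PRED`, `SUBJ`, `OBJ`, `TENSE`) and their values are assumptions chosen for illustration rather than the content of that figure.

```python
from typing import Dict, Iterator, Tuple

# Assumed attribute-value matrix for 「私の犬は林檎を食べる。」 (illustrative only).
f_structure: Dict[str, object] = {
    "PRED": "食べる",
    "SUBJ": {"PRED": "犬"},
    "OBJ": {"PRED": "林檎"},
    "TENSE": "present",
}


def feature_paths(
    fs: Dict[str, object], prefix: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], object]]:
    """Yield (attribute path, atomic value) pairs from a nested attribute-value matrix."""
    for attr, value in fs.items():
        path = prefix + (attr,)
        if isinstance(value, dict):
            yield from feature_paths(value, path)  # descend into an embedded matrix
        else:
            yield path, value


pairs = list(feature_paths(f_structure))
```

Each pair corresponds to one row of the matrix: the path plays the role of the attribute name on the left, and the atomic value is the attribute value on the right.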

また、ｆ−ｓｔｒｕｃｔｕｒｅを、語と語の係り受け関係を基にして接合された木構造すなわち「依存木」として表現することもでき、この場合には語と語を接合する枝には係り受け関係の種類が付加される。図６には、図５に示したｆ−ｓｔｒｕｃｔｕｒｅを依存木の形態で表現している。図示のように、依存木は、単語を示すノードと係り受けを示すリンクと関係名を示すリンクラベルからなる。 The f-structure can also be expressed as a tree in which words are joined according to their dependency relationships, that is, a "dependency tree"; in this case, the type of the dependency relationship is attached to each branch joining two words. FIG. 6 shows the f-structure of FIG. 5 in the form of a dependency tree. As illustrated, a dependency tree consists of nodes representing words, links representing dependencies, and link labels representing relation names.
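One way to picture this correspondence is to walk a nested attribute-value structure and emit one labelled link per embedded structure. The following is a sketch under assumptions: FIG. 5 and FIG. 6 are not reproduced here, so the feature names, the `ADJUNCT` treatment of 「私の」, and the `to_edges` function are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (head word, relation label, dependent word)

# Assumed f-structure for 「私の犬は林檎を食べる。」 (illustrative only).
f_structure: Dict[str, object] = {
    "PRED": "食べる",
    "SUBJ": {"PRED": "犬", "ADJUNCT": [{"PRED": "私"}]},
    "OBJ": {"PRED": "林檎"},
    "TENSE": "present",
}


def to_edges(fs: Dict[str, object]) -> List[Edge]:
    """Convert a nested f-structure into dependency links labelled by feature name."""
    edges: List[Edge] = []
    head = str(fs["PRED"])
    for attr, value in fs.items():
        children = value if isinstance(value, list) else [value]
        for child in children:
            # Only embedded structures with their own PRED become tree nodes;
            # atomic features such as TENSE are skipped.
            if isinstance(child, dict) and "PRED" in child:
                edges.append((head, attr, str(child["PRED"])))
                edges.extend(to_edges(child))
    return edges


edges = to_edges(f_structure)
```

Each resulting triple supplies the three ingredients named above: two word nodes, the link between them, and the link label.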

なお、ＬＦＧの詳細に関しては、例えばＲ．Ｍ．Ｋａｐｌａｎ及びＪ．Ｂｒｅｓｎａｎ共著の論文“Ｌｅｘｉｃａｌ−ＦｕｎｃｔｉｏｎａｌＧｒａｍｍａｒ：ＡＦｏｒｍａｌＳｙｓｔｅｍｆｏｒＧｒａｍｍａｔｉｃａｌＲｅｐｒｅｓｅｎｔａｔｉｏｎ”（ＴｈｅＭＩＴＰｒｅｓｓ，Ｃａｍｂｒｉｄｇｅ（１９８２）．ＲｅｐｒｉｎｔｅｄｉｎＦｏｒｍａｌＩｓｓｕｅｓｉｎＬｅｘｉｃａｌ−ＦｕｎｃｔｉｏｎａｌＧｒａｍｍａｒ，ｐｐ．２９−１３０．ＣＳＬＩｐｕｂｌｉｃａｔｉｏｎｓ，ＳｔａｎｆｏｒｄＵｎｉｖｅｒｓｉｔｙ（１９９５）．）などに記述されている。 Details of LFG are described, for example, in R. M. Kaplan and J. Bresnan, "Lexical-Functional Grammar: A Formal System for Grammatical Representation" (The MIT Press, Cambridge (1982). Reprinted in Formal Issues in Lexical-Functional Grammar, pp. 29-130. CSLI Publications, Stanford University (1995)).

続いて、本実施形態に係る自然言語処理装置１００における自然言語処理の動作例について具体的に説明する。なお、以下では、日本語を対象として説明を行なうが、構文意味解析処理が適用可能な言語であればいかなる言語であっても同様の効果を得ることができるということを十分理解されたい。 Subsequently, an operation example of natural language processing in the natural language processing apparatus 100 according to the present embodiment will be specifically described. In the following description, Japanese will be described, but it should be understood that the same effect can be obtained in any language that can be applied to the syntax and semantic analysis processing.

まず、文入力部１０１に対して自然言語文「円覚寺は鎌倉時代の建造物としてとても有名なので、広く一般的に知られている。」が入力されたとする。この文は形態素解析処理によって、図７に示すような品詞情報付きの形態素列に変換される。 First, assume that the natural language sentence 「円覚寺は鎌倉時代の建造物としてとても有名なので、広く一般的に知られている。」 ("Because Enkakuji is very famous as a building of the Kamakura period, it is widely and generally known.") is input to the sentence input unit 101. Morphological analysis converts this sentence into a morpheme string with part-of-speech information, as shown in FIG. 7.

次に、文分割処理部１０３において、分割ルールを優先度順に順次適用していき、合致する分割ルールを用いて分割位置を決定し、入力文を分割位置の前後の要素に分割する。分割ルールは、入力文における分割位置を決定するための１つの形態素又は２以上の連続する形態素が持つ形態素情報として記述される。図８には、分割ルールの例を示している。 Next, the sentence division processing unit 103 applies the division rules one by one in order of priority, determines the division position using the first rule that matches, and divides the input sentence into the elements before and after the division position. A division rule is described as the morpheme information of one morpheme, or of two or more consecutive morphemes, that determines a division position in the input sentence. FIG. 8 shows examples of division rules.

図８に示す例では、優先度１の分割ルールＡとして、接続助詞Ａ類に読点が連結する箇所を分割位置として取り決めている。また、優先度２の分割ルールＢとして、接続助詞Ｂ類に読点が連結する箇所を分割位置として取り決めている。また、優先度３の分割ルールＣとして、活用後の連用形に読点が連結する箇所を分割位置として取り決めている。また、優先度４の分割ルールＤとして、格助詞に読点が連結する箇所を分割位置として取り決めている。また、優先度５の分割ルールＥとして、名詞と係助詞と読点が連続する箇所を分割位置として取り決めている。また、優先度６の分割ルールＦとして、副詞可能名詞に読点が連結する箇所を分割位置として取り決めている。また、優先度７の分割ルールＧとして、読点が出現する箇所を分割位置として取り決めている。また、優先度８の分割ルールＡ’として、接続助詞Ａ類が出現する箇所を分割位置として取り決めている。また、優先度９の分割ルールＢ’として、接続助詞Ｂ類が出現する箇所を分割位置として取り決めている。また、優先度１０の分割ルールＣ’として、活用後の連用形が出現する箇所を分割位置として取り決めている。また、優先度１１の分割ルールＤ’として、格助詞が出現する箇所を分割位置として取り決めている。また、優先度１２の分割ルールＥ’として、係助詞が出現する箇所を分割位置として取り決めている。また、優先度１３の分割ルールＦ’として、副詞可能名詞が出現する箇所を分割位置として取り決めている。 In the example shown in FIG. 8, each division rule defines the division position as follows, in order of priority. Division rule A (priority 1): a position where a class-A conjunctive particle is followed by a comma. Division rule B (priority 2): a position where a class-B conjunctive particle is followed by a comma. Division rule C (priority 3): a position where a conjugated continuative form is followed by a comma. Division rule D (priority 4): a position where a case particle is followed by a comma. Division rule E (priority 5): a position where a noun, a binding particle, and a comma occur in succession. Division rule F (priority 6): a position where an adverbial noun is followed by a comma. Division rule G (priority 7): a position where a comma appears. Division rule A' (priority 8): a position where a class-A conjunctive particle appears. Division rule B' (priority 9): a position where a class-B conjunctive particle appears. Division rule C' (priority 10): a position where a conjugated continuative form appears. Division rule D' (priority 11): a position where a case particle appears. Division rule E' (priority 12): a position where a binding particle appears. Division rule F' (priority 13): a position where an adverbial noun appears.

文分割処理部１０３は、分割・連結ルール保持部１０２に形態素列に変換された入力文を問い合わせると、入力文の中には接続助詞Ａ類＋読点である「ので、」が存在するので、図８中において優先度１が与えられているルールＡが適応され、分割位置が決定される。この結果、上記の入力文は、２つの文要素、すなわち前要素「円覚寺は鎌倉時代の建造物としてとても有名なので、」と、後要素「広く一般的に知られている。」に分割される。 When the sentence division processing unit 103 queries the division / connection rule holding unit 102 with the input sentence converted into a morpheme string, rule A, which has priority 1 in FIG. 8, is applied and the division position is determined, because the input sentence contains 「ので、」, a class-A conjunctive particle followed by a comma. As a result, the input sentence is divided into two sentence elements: the front element 「円覚寺は鎌倉時代の建造物としてとても有名なので、」 ("Because Enkakuji is very famous as a building of the Kamakura period,") and the rear element 「広く一般的に知られている。」 ("it is widely and generally known.").
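The priority-ordered matching and the split of the example sentence can be sketched as follows. This is a minimal sketch under assumptions: the tag names, the morpheme segmentation of the example sentence (FIG. 7 is not reproduced here), and the choice of the leftmost match within a rule are all hypothetical.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (surface form, coarse morpheme tag)

# The thirteen division rules as (priority, name, tag pattern); tag names are assumed.
RULES: List[Tuple[int, str, Tuple[str, ...]]] = [
    (1, "A", ("conj_particle_A", "comma")),
    (2, "B", ("conj_particle_B", "comma")),
    (3, "C", ("continuative", "comma")),
    (4, "D", ("case_particle", "comma")),
    (5, "E", ("noun", "binding_particle", "comma")),
    (6, "F", ("adverbial_noun", "comma")),
    (7, "G", ("comma",)),
    (8, "A'", ("conj_particle_A",)),
    (9, "B'", ("conj_particle_B",)),
    (10, "C'", ("continuative",)),
    (11, "D'", ("case_particle",)),
    (12, "E'", ("binding_particle",)),
    (13, "F'", ("adverbial_noun",)),
]


def split_sentence(tokens: List[Token]) -> Optional[Tuple[str, List[Token], List[Token]]]:
    """Try the rules in priority order and split after the first pattern that matches."""
    tags = tuple(tag for _, tag in tokens)
    for _, name, pattern in sorted(RULES):
        n = len(pattern)
        for i in range(len(tags) - n):  # upper bound keeps the rear element non-empty
            if tags[i:i + n] == pattern:
                return name, tokens[:i + n], tokens[i + n:]
    return None


example: List[Token] = [
    ("円覚寺", "noun"), ("は", "binding_particle"), ("鎌倉", "noun"),
    ("時代", "noun"), ("の", "adnominal_particle"), ("建造", "noun"),
    ("物", "noun"), ("として", "case_particle"), ("とても", "adverb"),
    ("有名", "noun"), ("な", "auxiliary"), ("ので", "conj_particle_A"),
    ("、", "comma"), ("広く", "adjective"), ("一般", "noun"),
    ("的", "suffix"), ("に", "case_particle"), ("知ら", "verb"),
    ("れ", "verb"), ("て", "conj_particle_other"), ("いる", "verb"),
    ("。", "period"),
]
rule_name, front, rear = split_sentence(example)
front_text = "".join(surface for surface, _ in front)
rear_text = "".join(surface for surface, _ in rear)
```

Because rule A is tried first, the lower-priority patterns that also occur in the sentence, such as the binding particle 「は」 or the case particle 「として」, never get a chance to fix the division position.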

次に、構文意味解析部１０４において、入力文を分割して得られた前要素及び後要素のそれぞれについて構文意味解析を行なう。図９には、分割位置前後の各要素についての構文意味解析結果を依存木の形式で示している。図示のように、依存木は、単語を示すノードと係り受けを示すリンクと関係名を示すリンクラベルからなる。 Next, the syntax and semantic analysis unit 104 performs syntax and semantic analysis on each of the previous element and the subsequent element obtained by dividing the input sentence. FIG. 9 shows the syntax and semantic analysis results for each element before and after the division position in the form of a dependency tree. As illustrated, the dependency tree includes a node indicating a word, a link indicating dependency, and a link label indicating a relationship name.

次に、文要素連結部１０５において、図９に示した前要素及び後要素それぞれの構文意味解析結果を連結する。本実施形態では、文要素連結部１０５は、分割された各要素における語と語の係り受け関係とその関係の種類、及び入力文の分割位置における形態素情報に基づいて、分割された各要素の構文意味解析結果における連結する位置と係り受け関係を決定して要素の構文意味解析結果同士を連結する。 Next, the sentence element connection unit 105 connects the syntactic and semantic analysis results of the front element and the rear element shown in FIG. 9. In the present embodiment, the sentence element connection unit 105 determines the connection position and the dependency relationship in the syntactic and semantic analysis results of the divided elements, based on the word-to-word dependency relationships and their types within each divided element and on the morpheme information at the division position of the input sentence, and then connects the analysis results of the elements to one another.
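The mechanics of joining two per-element results can be sketched on dependency trees held as edge lists. The sketch is illustrative only: FIG. 9 is not reproduced here, so the two small trees, the relation labels, the `Analysis` record, and the assumption that the root of the rear element is its verb are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (head word, relation label, dependent word)


@dataclass
class Analysis:
    """Per-element analysis result: a root word plus labelled dependency links."""
    root: str
    edges: List[Edge] = field(default_factory=list)


def connect_as_adverbial(front: Analysis, rear: Analysis) -> Analysis:
    """Attach the front element to the rear root (assumed to be its verb)."""
    link = (rear.root, "adverbial-modifier", front.root)
    return Analysis(rear.root, rear.edges + [link] + front.edges)


def connect_by_coordination(front: Analysis, rear: Analysis, dummy: str = "DUMMY") -> Analysis:
    """Insert a dummy node and hang both elements under it as coordinated parts."""
    links = [(dummy, "coordinated", front.root), (dummy, "coordinated", rear.root)]
    return Analysis(dummy, links + front.edges + rear.edges)


# Hypothetical per-element results for the two halves of the example sentence.
front = Analysis("有名", [("有名", "subject", "円覚寺")])
rear = Analysis("知られている", [("知られている", "modifier", "広く")])
whole = connect_as_adverbial(front, rear)
```

The first function corresponds to the case where the front element becomes an adverbial modifier of the rear verb, and the second to the case where a dummy word bundles the two elements in a coordinate relationship.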

連結ルールは、分割された各要素の構文意味解析結果同士を連結するための位置と係り受け関係を記述したものである。既に述べたように、入力文の分割位置における形態素情報に基づいて連結ルールが定まるが、これは分割ルールと連結ルールの間に対応関係があることを意味している。そこで、本実施形態では、分割・連結ルール保持部１０２は、分割ルールに対応付けて連結ルールを保持している。図１０には連結ルールの例を示している。 A connection rule describes the position and the dependency relationship used to connect the syntactic and semantic analysis results of the divided elements. As already described, the connection rule is determined by the morpheme information at the division position of the input sentence, which means that there is a correspondence between division rules and connection rules. In this embodiment, therefore, the division / connection rule holding unit 102 holds each connection rule in association with its division rule. FIG. 10 shows examples of connection rules.

In the example shown in FIG. 10, connection rule A, which corresponds to division rule A, specifies that the analysis results are joined by attaching the element before the division position as an adverbial modifier of the verb in the element after the division position. Connection rule B, which corresponds to division rule B, specifies that a dummy node (a placeholder term that bundles the elements) is inserted and that the analysis results of the elements before and after the division position are joined through this placeholder in a coordinate relation. Connection rule C, which corresponds to division rule C, likewise inserts a placeholder term and joins the two analysis results through it in a coordinate relation. Connection rule D, which corresponds to division rule D, specifies that if the verb of the element after the division position is a quotative verb, the element before the division position is attached as the quotative case of that verb, and otherwise the element before the division position is attached as an adverbial modifier. Connection rule E, which corresponds to division rule E, specifies that if the verb of the element after the division position has an omitted case element, the element before the division position is attached as that case, and otherwise it is attached as an adverbial modifier. Connection rules F and G, which correspond to division rules F and G, each specify that the element before the division position is attached as an adverbial modifier of the verb in the element after the division position.

The primed rules follow the same pattern. Connection rule A', which corresponds to division rule A', attaches the element before the division position as an adverbial modifier of the verb in the element after it. Connection rules B' and C', which correspond to division rules B' and C', insert a placeholder term and join the two analysis results through it in a coordinate relation. Connection rule D', which corresponds to division rule D', attaches the element before the division position as the quotative case if the verb of the element after the division position is a quotative verb, and otherwise as an adverbial modifier. Connection rule E', which corresponds to division rule E', attaches the element before the division position as the omitted case of the verb in the element after it if such a case exists, and otherwise as an adverbial modifier. Connection rule F', which corresponds to division rule F', attaches the element before the division position as an adverbial modifier of the verb in the element after it.
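The pairing of division rules and connection rules described for FIG. 10 can be sketched as a lookup table. This is only an illustrative sketch: the names `Join`, `CONNECTION_RULES`, and `connection_rule_for` are hypothetical, and the four joining behaviours are a summary of the text rather than the actual contents of the holding unit 103.

```python
from enum import Enum


class Join(Enum):
    """Joining behaviours named in the description of FIG. 10 (labels are hypothetical)."""
    ADJUNCT = "front element becomes an adverbial modifier of the rear verb"
    DUMMY_NODE = "both elements are coordinated under an inserted dummy node"
    QUOTE_ELSE_ADJUNCT = "quotative case if the rear verb is quotative, else adverbial modifier"
    CASE_ELSE_ADJUNCT = "omitted case of the rear verb if one exists, else adverbial modifier"


# Keys are division rule names; each value is the behaviour of the
# connection rule that carries the same name.
CONNECTION_RULES = {
    "A": Join.ADJUNCT,
    "B": Join.DUMMY_NODE,
    "C": Join.DUMMY_NODE,
    "D": Join.QUOTE_ELSE_ADJUNCT,
    "E": Join.CASE_ELSE_ADJUNCT,
    "F": Join.ADJUNCT,
    "G": Join.ADJUNCT,
    "A'": Join.ADJUNCT,
    "B'": Join.DUMMY_NODE,
    "C'": Join.DUMMY_NODE,
    "D'": Join.QUOTE_ELSE_ADJUNCT,
    "E'": Join.CASE_ELSE_ADJUNCT,
    "F'": Join.ADJUNCT,
}


def connection_rule_for(division_rule: str) -> Join:
    """Return the connection rule paired with the division rule that was applied."""
    return CONNECTION_RULES[division_rule]
```

Keeping the pairing in one table means the connection step only needs to remember which division rule produced a given pair of elements.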

To join the analysis results of the front and rear elements, the sentence element connection unit 105 queries the division/connection rule holding unit 103 for a connection rule. The input sentence in this example was divided into the above elements by rule A, so the corresponding connection rule A is applied. A link is therefore set from the front element to the rear element and given the link label ADJUNCT, which marks an adverbial modifier, and the analysis results of the two elements are joined. FIG. 11 shows the resulting syntactic and semantic analysis result for the original sentence as a whole.

Finally, the analysis result output unit 106 outputs the result shown in FIG. 11 as the analysis result of the original input sentence 「円覚寺は鎌倉時代の建造物としてとても有名なので、広く一般的に知られている。」 ("Engakuji is very famous as a structure of the Kamakura period, so it is widely and generally known.").

Next, an operation example of the syntactic and semantic analysis in which division and connection rules other than those above are applied is described, using the natural language sentence 「クリントン大統領の来日の際に決めた日米投資促進機構の設立は、衆議院が反対した。」 ("The House of Representatives opposed the establishment of the Japan-US Investment Promotion Organization, which was decided on during President Clinton's visit to Japan.") as the input sentence.

First, the sentence is input to the sentence input unit 101 and converted by morphological analysis into a morpheme sequence with part-of-speech information, in the same way as above.

Next, the sentence division processing unit 102 divides the input sentence. When the unit queries the division/connection rule holding unit 103 with the input sentence converted into a morpheme sequence, the sentence contains 「設立は、」, which matches the pattern noun + binding particle + comma, so rule E, which has priority 5 in FIG. 8, is applied. As a result, the input sentence is divided into two sentence elements: the front element 「クリントン大統領の来日の際に決めた日米投資促進機構の設立は、」 ("As for the establishment of the Japan-US Investment Promotion Organization decided on during President Clinton's visit to Japan,") and the rear element 「衆議院が反対した。」 ("the House of Representatives opposed.").
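The rule lookup in this step can be sketched as a priority-ordered pattern match over part-of-speech tags. The following is a minimal sketch under stated assumptions: the tag names, the `Morpheme` and `SplitRule` types, the function `find_split`, and the simplified morpheme sequence are all hypothetical, and taking the first match of a rule is an assumption. Only the patterns and priorities of rules C, E, and G are taken from the text.

```python
from typing import NamedTuple, Optional


class Morpheme(NamedTuple):
    surface: str
    pos: str  # simplified part-of-speech tag (hypothetical tag set)


class SplitRule(NamedTuple):
    name: str
    priority: int   # smaller number = tried earlier
    pattern: tuple  # consecutive part-of-speech tags that mark a division position


RULES = [
    SplitRule("C", 3, ("aux_verb_continuative", "comma")),
    SplitRule("E", 5, ("noun", "binding_particle", "comma")),
    SplitRule("G", 7, ("comma",)),
]


def find_split(morphemes, rules) -> Optional[tuple]:
    """Return (rule, cut index) for the highest-priority rule that matches, or None.

    The cut index is the position just after the matched pattern, so the front
    element is morphemes[:cut] and the rear element is morphemes[cut:].
    """
    for rule in sorted(rules, key=lambda r: r.priority):
        width = len(rule.pattern)
        for start in range(len(morphemes) - width + 1):
            window = tuple(m.pos for m in morphemes[start:start + width])
            cut = start + width
            # Require a non-empty rear element after the division position.
            if window == rule.pattern and cut < len(morphemes):
                return rule, cut
    return None


# Shortened, hand-tagged version of the example sentence (illustrative only).
sentence = [
    Morpheme("設立", "noun"),
    Morpheme("は", "binding_particle"),
    Morpheme("、", "comma"),
    Morpheme("衆議院", "noun"),
    Morpheme("が", "case_particle"),
    Morpheme("反対した", "verb"),
    Morpheme("。", "period"),
]
rule, cut = find_split(sentence, RULES)
front = "".join(m.surface for m in sentence[:cut])
rear = "".join(m.surface for m in sentence[cut:])
```

Rule G would also match the comma here, but rule E is tried first because it has the smaller priority number.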

Next, the syntactic and semantic analysis unit 104 analyzes each divided element. FIG. 12 shows the analysis results for the elements before and after the division position in the form of dependency trees. As illustrated, a dependency tree consists of nodes representing words, links representing dependencies, and link labels giving the relation names.
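A dependency tree of this kind (word nodes, dependency links, and link labels) can be represented with a very small data structure. The sketch below is illustrative only: the `Node` class, the `triples` helper, and the single-link tree for the rear element are assumptions, and the label SUBJ is borrowed from common LFG usage rather than from FIG. 12.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A word node; each link is a (label, dependent node) pair."""
    word: str
    links: list = field(default_factory=list)

    def add(self, label: str, child: "Node") -> "Node":
        self.links.append((label, child))
        return child


def triples(root: Node) -> list:
    """Flatten a tree into (head word, link label, dependent word) triples."""
    result = []
    for label, child in root.links:
        result.append((root.word, label, child.word))
        result.extend(triples(child))
    return result


# Hypothetical tree for the rear element 「衆議院が反対した。」
rear_tree = Node("反対する")
rear_tree.add("SUBJ", Node("衆議院"))
```

Storing the links on the head node keeps each element's result self-contained, which is convenient when two results are later joined by adding one more link.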

Next, the sentence element connection unit 105 joins the analysis results of the front and rear elements shown in FIG. 12. The unit queries the division/connection rule holding unit 103 for a connection rule. Because these elements were obtained by dividing the input sentence with division rule E, the corresponding connection rule E is applied. The rear element has an omitted case element OBJ, so the front element is joined to the rear element as OBJ. FIG. 13 shows the syntactic and semantic analysis result for the entire original sentence obtained by joining the two analysis results in this way.
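The behaviour of connection rule E can be sketched as follows. This is a minimal sketch, not the implementation of the unit 105: the `Node` class, its `omitted_cases` field, and the function `join_rule_e` are hypothetical, as are the SUBJ link and the way the omitted case is recorded on the verb.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    word: str
    links: list = field(default_factory=list)          # (label, dependent node) pairs
    omitted_cases: list = field(default_factory=list)  # case slots the verb left unfilled


def join_rule_e(front_root: Node, rear_verb: Node) -> Node:
    """Attach the front element to the rear verb as connection rule E describes.

    If the verb has an omitted case element, the front element fills it;
    otherwise the front element is attached as an adverbial modifier (ADJUNCT).
    """
    if rear_verb.omitted_cases:
        label = rear_verb.omitted_cases.pop(0)
    else:
        label = "ADJUNCT"
    rear_verb.links.append((label, front_root))
    return rear_verb


# Hypothetical analysis results for the two elements of the example sentence.
front = Node("設立")
rear = Node("反対する", links=[("SUBJ", Node("衆議院"))], omitted_cases=["OBJ"])
whole = join_rule_e(front, rear)
labels = [label for label, _ in whole.links]
```

With no omitted case recorded on the verb, the same call falls back to the ADJUNCT label, which is the second branch of the rule.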

Finally, the analysis result output unit 106 outputs the result shown in FIG. 13 as the analysis result of the original input sentence 「クリントン大統領の来日の際に決めた日米投資促進機構の設立は、衆議院が反対した。」.

The analysis result evaluation unit 107 evaluates the result of the syntactic and semantic analysis and, depending on the evaluation, has the input sentence divided recursively, so that more suitable division positions improve the accuracy of both the division and the analysis, as described earlier. The two examples above did not require recursive division of the input sentence. The following describes the analysis procedure when recursive division is involved.

First, the sentence input unit 101 accepts the natural language sentence 「この問題は中国側の捜査の進展状況とも密接に絡んでおり、日本捜査当局は１７日、合同会議で話し合うだろうと言われている。」 ("This issue is also closely tied to the progress of the investigation on the Chinese side, and it is said that the Japanese investigative authorities will discuss it at a joint meeting on the 17th.") as the input sentence and converts it by morphological analysis into a morpheme sequence with part-of-speech information, in the same way as above.

Next, the sentence division processing unit 102 divides the input sentence. When the input sentence converted into a morpheme sequence is submitted to the division/connection rule holding unit 103, the sentence contains 「おり、」, which matches the pattern continuative form of an auxiliary verb + comma, so division rule C with priority 3 in FIG. 8 is applied. As a result, the input sentence is divided into two sentence elements: the front element 「この問題は、中国側の捜査の進展状況とも密接に絡んでおり、」 ("This issue is also closely tied to the progress of the investigation on the Chinese side, and") and the rear element 「日本捜査当局は１７日、合同会議で話し合うだろうと言われている。」 ("it is said that the Japanese investigative authorities will discuss it at a joint meeting on the 17th.").

Next, the syntactic and semantic analysis unit 104 analyzes each element obtained by the division. Here it is assumed that the analysis of both the front element and the rear element fails because of a timeout error.

Next, the analysis result evaluation unit 107 evaluates the analysis result for each element and decides what processing to perform next. FIG. 14 shows the processing procedure of the analysis result evaluation unit 107 as a flowchart. The flow of processing based on the evaluation of each element's analysis result is described below with reference to this flowchart.

First, it is determined whether the analysis succeeded (step S1). The analysis has failed here, so it is next determined whether the cause of the failure is a timeout (step S2). Because the analysis failed on a timeout error, the next process is set to "division of the sentence element", and in response the front element and the rear element are each divided again (step S4).

In response to the request for further division, the sentence division processing unit 102 recursively divides the element. When the morpheme sequence of the front element 「この問題は、中国側の捜査の進展状況とも密接に絡んでおり、」 is submitted to the division/connection rule holding unit 103, the element contains 「問題は、」, which matches the pattern noun + binding particle + comma, so division rule E with priority 5 in FIG. 8 is applied. As a result, the element is divided into two further elements: the front element 「この問題は、」 ("As for this issue,") and the rear element 「中国側の捜査の進展状況とも密接に絡んでおり、」 ("it is also closely tied to the progress of the investigation on the Chinese side, and").

Next, the syntactic and semantic analysis unit 104 analyzes these elements. Here it is assumed that both elements are analyzed successfully. FIG. 15 shows the analysis results. Because the analysis of each element succeeded, the analysis result evaluation unit 107 performs no further processing on them.

Next, the sentence division processing unit 102 recursively divides the rear element of the input sentence, 「日本捜査当局は１７日、合同会議で話し合うだろうと言われている。」. When the element converted into a morpheme sequence is submitted to the division/connection rule holding unit 103, the element contains a comma, so rule G with priority 7 in FIG. 8 is applied. As a result, the element is divided into two sentence elements: the front element 「日本捜査当局は１７日、」 ("The Japanese investigative authorities, on the 17th,") and the rear element 「合同会議で話し合うだろうと言われている。」 ("are said to be going to discuss it at a joint meeting.").

Next, the syntactic and semantic analysis unit 104 analyzes each recursively divided element. Here it is assumed that the analysis of the front element fails because of a grammar error.

Next, the analysis result evaluation unit 107 evaluates the analysis result for each element and decides the processing for the next step. The flow of processing when a grammar error occurs in the syntactic and semantic analysis is described with reference to the flowchart in FIG. 14.

First, it is determined whether the analysis of each element by the syntactic and semantic analysis unit 104 succeeded (step S1). The analysis has failed here, so it is next determined whether the cause of the failure is a timeout (step S2). Because the failure is due to a grammar error rather than a timeout error, the next process is set to "re-divide the original sentence (element) by applying the rules from priority n+1, where n is the current priority" (step S3). In other words, the division is redone with a division rule different from the one just used, and the syntactic and semantic analysis is attempted on the elements obtained at the new division position.
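The decision made in steps S1 to S4 of FIG. 14 can be sketched as a small function. This is an illustrative sketch only: the names `Outcome`, `NextStep`, and `decide_next_step` are hypothetical, and the flowchart is reduced to the three cases the text walks through.

```python
from enum import Enum, auto


class Outcome(Enum):
    SUCCESS = auto()
    TIMEOUT = auto()
    GRAMMAR_ERROR = auto()


class NextStep(Enum):
    DONE = auto()                    # analysis succeeded, nothing more to do
    DIVIDE_ELEMENT_FURTHER = auto()  # step S4: divide the failed element again
    REDIVIDE_WITH_NEXT_RULE = auto() # step S3: redo the division with rules of priority n+1 onward


def decide_next_step(outcome: Outcome) -> NextStep:
    """Mirror the branches of the evaluation flowchart."""
    if outcome is Outcome.SUCCESS:   # step S1: did the analysis succeed?
        return NextStep.DONE
    if outcome is Outcome.TIMEOUT:   # step S2: was the failure a timeout?
        return NextStep.DIVIDE_ELEMENT_FURTHER
    return NextStep.REDIVIDE_WITH_NEXT_RULE
```

A timeout suggests the element is still too long, so it is divided further; any other failure suggests the division position itself was unsuitable, so the division is retried with a different rule.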

In this example the current division rule has priority 7, so the rules from priority 8 onward are applied to the original sentence element 「日本捜査当局は１７日、合同会議で話し合うだろうと言われている。」 and the division is performed again.

The sentence division processing unit 102 re-divides the sentence element 「日本捜査当局は１７日、合同会議で話し合うだろうと言われている。」. The element, converted into a morpheme sequence, is checked against the division rules of priority 8 onward held in the division/connection rule holding unit 103. The element contains the case particle 「と」, so rule D' with priority 11 in FIG. 8 is applied. As a result, the element is divided into two sentence elements: the front element 「日本捜査当局は１７日、合同会議で話し合うだろうと」 ("that the Japanese investigative authorities will discuss it at a joint meeting on the 17th") and the rear element 「言われている。」 ("it is said.").
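The retry with rules of lower priority can be sketched as below. This is a rough sketch under simplifying assumptions: `split_after` and `retry_split` are hypothetical names, and matching on the raw surface string is a stand-in for the morpheme-based matching the text describes. Only the priorities of rules G and D' are taken from the text.

```python
def split_after(token):
    """Build a finder that returns the cut position just after `token`, or None."""
    def find_position(text):
        index = text.find(token)
        if index == -1:
            return None
        cut = index + len(token)
        # A cut at the very end would leave an empty rear element.
        return cut if cut < len(text) else None
    return find_position


def retry_split(element, rules, failed_priority):
    """Re-divide `element` using only rules whose priority number exceeds `failed_priority`."""
    for priority, name, find_position in sorted(rules, key=lambda r: r[0]):
        if priority <= failed_priority:
            continue
        cut = find_position(element)
        if cut is not None:
            return name, element[:cut], element[cut:]
    return None


# (priority, rule name, finder); surface-string finders are an assumption.
RULES = [
    (7, "G", split_after("、")),
    (11, "D'", split_after("と")),
]

element = "日本捜査当局は１７日、合同会議で話し合うだろうと言われている。"
result = retry_split(element, RULES, failed_priority=7)
```

Rule G is skipped because its priority is not greater than 7, so the first applicable rule is D', which reproduces the division given in the text.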

Next, the syntactic and semantic analysis unit 104 analyzes each of these recursively divided elements. Here it is assumed that both elements are analyzed successfully. FIG. 16 shows the analysis results. Because the analysis of each element succeeded, the analysis result evaluation unit 107 performs no further processing on them.

The processing described so far completes the division of the sentence 「この問題は中国側の捜査の進展状況とも密接に絡んでおり、日本捜査当局は１７日、合同会議で話し合うだろうと言われている。」. FIG. 17 shows the division rules that were applied when the input sentence was divided and when the resulting elements were recursively divided further.
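The record of applied rules can be held as a binary tree, from which the order of the later joining steps follows by a post-order walk. The sketch below is an assumption about representation, not the content of FIG. 17: the `Split` class and `join_order` function are hypothetical, while the rule names and element strings are those given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Split:
    """A leaf holds element text; an inner node records the rule that divided it."""
    text: Optional[str] = None
    rule: Optional[str] = None
    front: Optional["Split"] = None
    rear: Optional["Split"] = None

    def surface(self) -> str:
        if self.rule is None:
            return self.text
        return self.front.surface() + self.rear.surface()


def join_order(node: Split) -> list:
    """List the rules in the order their elements are joined (children before parent)."""
    if node.rule is None:
        return []
    return join_order(node.front) + join_order(node.rear) + [node.rule]


tree = Split(
    rule="C",
    front=Split(
        rule="E",
        front=Split(text="この問題は、"),
        rear=Split(text="中国側の捜査の進展状況とも密接に絡んでおり、"),
    ),
    rear=Split(
        rule="D'",
        front=Split(text="日本捜査当局は１７日、合同会議で話し合うだろうと"),
        rear=Split(text="言われている。"),
    ),
)
order = join_order(tree)
```

The resulting order (E, then D', then C) matches the sequence in which the analysis results are joined in the following steps.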

Next, the sentence element connection unit 105 joins the analysis results of the elements. First, the two analysis results shown in FIG. 15 are joined. Because these elements were divided by division rule E, the corresponding connection rule E in FIG. 10 is applied. The rear element has an omitted case element OBJ, so the front element is joined to the rear element as OBJ. FIG. 18 shows the result of joining the two analysis results in this way.

Next, the two analysis results shown in FIG. 16 are joined. Because these elements were divided by division rule D', the corresponding connection rule D' in FIG. 10 is applied. In this case the rear element has an omitted quotative case element OBL_to, so the front element is joined to the rear element as OBL_to. FIG. 19 shows the result of joining the two analysis results in this way.

The joined results shown in FIG. 18 and FIG. 19 are then joined to each other to obtain the syntactic and semantic analysis result for the entire original input sentence. Because these elements were obtained by applying division rule C to the original input sentence, the corresponding connection rule C is applied. A dummy node (a placeholder term that bundles the elements) is therefore inserted, links are set to the front element and the rear element, each link is given the label SET, which marks a coordinated element, and the analysis results of the two elements are joined. FIG. 20 shows the resulting syntactic and semantic analysis result for the entire original input sentence.
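Joining through a dummy node, as connection rule C describes, can be sketched with plain dictionaries. This is only a sketch: the dictionary layout, the `"<dummy>"` marker, and the function `join_with_dummy_node` are hypothetical, and the two one-word trees stand in for the much larger results of FIG. 18 and FIG. 19.

```python
def join_with_dummy_node(front_tree: dict, rear_tree: dict) -> dict:
    """Bundle two analysis results under a new dummy node with SET-labelled links."""
    return {
        "word": "<dummy>",  # placeholder node that has no word of its own
        "links": [("SET", front_tree), ("SET", rear_tree)],
    }


# Stand-ins for the joined results of the front and rear halves.
front_result = {"word": "絡む", "links": []}
rear_result = {"word": "言う", "links": []}
whole = join_with_dummy_node(front_result, rear_result)
labels = [label for label, _ in whole["links"]]
```

Neither element is made a dependent of the other here, which is what distinguishes this coordinate join from the adverbial and case-filling joins of the other rules.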

Finally, the analysis result output unit 106 outputs the result shown in FIG. 20 as the analysis result of the original input sentence 「この問題は中国側の捜査の進展状況とも密接に絡んでおり、日本捜査当局は１７日、合同会議で話し合うだろうと言われている。」.

The present invention has been described in detail above with reference to specific embodiments. It is self-evident, however, that those skilled in the art can modify or substitute the embodiments without departing from the gist of the present invention.

This specification has described embodiments of the present invention and their effects mainly for the case where the natural language processing is applied to input sentences in Japanese, but the gist of the present invention is not limited to this. Needless to say, the invention can be applied in the same way, with the same effects, when the input sentence is written in another language.

In short, the present invention has been disclosed by way of example, and the contents of this specification should not be interpreted restrictively. To determine the gist of the present invention, the claims set out at the beginning should be taken into consideration.

FIG. 1 is a diagram schematically showing an example of an implementation of the natural language processing system according to the present invention.
FIG. 2 is a block diagram schematically showing the functional configuration of the natural language processing apparatus 100 according to an embodiment of the present invention.
FIG. 3 is a diagram schematically showing the functional configuration of the syntactic and semantic analysis processing system 200 based on LFG grammar theory.
FIG. 4 is a diagram showing the c-structure obtained by processing the input sentence 「私の犬は林檎を食べる。」 ("My dog eats an apple.") with the syntactic and semantic analysis unit 203.
FIG. 5 is a diagram showing the f-structure obtained by processing the input sentence 「私の犬は林檎を食べる。」 with the syntactic and semantic analysis unit 203.
FIG. 6 is a diagram expressing the f-structure shown in FIG. 5 in the form of a dependency tree.
FIG. 7 is a diagram showing the morpheme sequence with part-of-speech information obtained when the sentence input unit 101 performs morphological analysis on the natural language sentence 「円覚寺は鎌倉時代の建造物としてとても有名なので、広く一般的に知られている。」.
FIG. 8 is a diagram showing examples of division rules.
FIG. 9 is a diagram showing, in the form of dependency trees, the syntactic and semantic analysis results for the elements obtained by dividing the input sentence 「円覚寺は鎌倉時代の建造物としてとても有名なので、広く一般的に知られている。」.
FIG. 10 is a diagram showing examples of connection rules.
FIG. 11 is a diagram showing the syntactic and semantic analysis result for the original sentence as a whole, obtained by joining the analysis results of the elements shown in FIG. 9.
FIG. 12 is a diagram showing, in the form of dependency trees, the syntactic and semantic analysis results for the elements obtained by dividing the input sentence 「クリントン大統領の来日の際に決めた日米投資促進機構の設立は、衆議院が反対した。」.
FIG. 13 is a diagram showing the syntactic and semantic analysis result for the original sentence as a whole, obtained by joining the analysis results of the elements shown in FIG. 12.
FIG. 14 is a flowchart showing the processing procedure of the analysis result evaluation unit 107.
FIG. 15 is a diagram showing an example of analysis results for sentence elements produced by the syntactic and semantic analysis unit 104.
FIG. 16 is a diagram showing an example of analysis results for sentence elements produced by the syntactic and semantic analysis unit 104.
FIG. 17 is a diagram showing the division rules applied when the input sentence was divided and when the resulting elements were recursively divided further.
FIG. 18 is a diagram showing the result of joining the syntactic and semantic analysis results of the two elements shown in FIG. 15.
FIG. 19 is a diagram showing the result of joining the syntactic and semantic analysis results of the two elements shown in FIG. 16.
FIG. 20 is a diagram showing the syntactic and semantic analysis result for the entire original input sentence, obtained by joining the two elements.

Explanation of symbols

10 … Natural language processing apparatus
11 … Sentence input unit
12 … Division/connection rule holding unit
13 … Sentence division processing unit
14 … Syntactic and semantic analysis unit
15 … Sentence element connection unit
16 … Analysis result output unit
100 … Natural language processing apparatus
101 … Sentence input unit
102 … Division/connection rule holding unit
103 … Sentence division processing unit
104 … Syntactic and semantic analysis unit
105 … Sentence element connection unit
106 … Analysis result output unit
107 … Analysis result evaluation unit

Claims

A natural language processing system for analyzing an input natural language sentence, comprising:
sentence division processing means for determining a division position in the input sentence based on morpheme information of the input sentence, and dividing the input sentence into a front element and a rear element at the division position;
syntactic and semantic analysis means for performing syntactic and semantic analysis on each element of the input sentence divided at the division position, and acquiring, for each element, the dependency relations between words and the types of those relations; and
sentence element connection processing means for determining, based on morpheme information at the division position of the input sentence, a connection position and a dependency relation in the syntactic and semantic analysis results of the divided elements, and connecting the syntactic and semantic analysis results of the elements.

The natural language processing system according to claim 1, further comprising:
one or more division processing rules, each describing morpheme information of one morpheme or of two or more consecutive morphemes related to a sentence division position; and
one or more connection processing rules, each corresponding to a division processing rule and describing the position and dependency relation for connecting the syntactic and semantic analysis results of the divided elements,
wherein the sentence element connection processing means connects the syntactic and semantic analysis results of the divided elements by using the connection processing rule corresponding to the division processing rule that the sentence division processing means applied when dividing the input sentence.

A natural language processing system for analyzing an input natural language sentence,
A division / concatenation processing rule holding means for associating and holding a division processing rule for determining a position to divide a sentence and a concatenation processing rule for concatenating a plurality of syntax semantic analysis results into one;
A sentence division processing means for dividing the input sentence into elements by referring to the division processing rules held in the division / concatenation processing rule holding means;
Syntax semantic analysis means for performing syntax semantic analysis on each element of the input sentence divided by the sentence division processing means;
Sentence element concatenation processing means for concatenating the semantic analysis results for each element obtained from the syntactic and semantic analysis means using a concatenation processing rule corresponding to the division processing rule used for dividing the input sentence by the sentence division processing means. When,
A natural language processing system comprising:
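
The flow of this claim (divide, analyze each element, then concatenate with the rule paired to the division rule) can be sketched as follows. This is an illustrative sketch under stated assumptions: the rule table `RULES`, the tag names, and the toy `analyze` function (which simply attaches every noun to the last verb of an element) are hypothetical stand-ins, not the analyzer or rules of the invention.

```python
from typing import Dict, List, Optional, Tuple

Morpheme = Tuple[str, str]   # (surface, tag); tag names are assumptions
Dep = Tuple[str, str, str]   # (dependent, head, relation)

# Held together: division pattern -> relation used when concatenating (hypothetical rule).
RULES: Dict[str, Tuple[Tuple[str, ...], str]] = {
    "conj_particle_A+comma": (("conj_particle_A", "comma"), "adverbial_modifier"),
}


def split(morphemes: List[Morpheme]) -> Tuple[List[Morpheme], List[Morpheme], Optional[str]]:
    """Divide at the first rule match that leaves a non-empty rear element."""
    for name, (pattern, _) in RULES.items():
        n = len(pattern)
        for i in range(len(morphemes) - n):
            if tuple(tag for _, tag in morphemes[i:i + n]) == pattern:
                return morphemes[:i + n], morphemes[i + n:], name
    return morphemes, [], None


def analyze(element: List[Morpheme]) -> Tuple[str, List[Dep]]:
    """Toy stand-in for the analyzer: attach every noun to the element's last verb."""
    verbs = [s for s, tag in element if tag == "verb"]
    head = verbs[-1] if verbs else element[-1][0]
    return head, [(s, head, "arg") for s, tag in element if tag == "noun"]


def process(morphemes: List[Morpheme]) -> List[Dep]:
    front, rear, rule = split(morphemes)
    if rule is None:
        return analyze(front)[1]
    front_head, front_deps = analyze(front)
    rear_head, rear_deps = analyze(rear)
    relation = RULES[rule][1]   # concatenation rule paired with the division rule used
    return front_deps + rear_deps + [(front_head, rear_head, relation)]


sentence = [("n1", "noun"), ("v1", "verb"), ("p", "conj_particle_A"),
            (",", "comma"), ("n2", "noun"), ("v2", "verb")]
result = process(sentence)
```

The last tuple in `result` is the link that joins the two per-element analyses into one result for the whole sentence.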

A plurality of division processing rules, each assigned a priority, are prepared,
The sentence division processing means applies the division processing rules in order of priority and divides the input sentence at the division position obtained.
The natural language processing system according to claim 2.
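
Applying division rules in priority order can be sketched as below. The rule names, tag names, and the convention that a smaller number means a higher priority are assumptions of this sketch.

```python
from typing import List, Optional, Sequence, Tuple

Rule = Tuple[int, str, Tuple[str, ...]]   # (priority, name, tag pattern)


def find_pattern(tags: Sequence[str], pattern: Sequence[str]) -> Optional[int]:
    """Return the index just past the first occurrence of pattern in tags, or None."""
    n = len(pattern)
    for i in range(len(tags) - n + 1):
        if tuple(tags[i:i + n]) == tuple(pattern):
            return i + n
    return None


def choose_division(tags: Sequence[str], rules: List[Rule]) -> Optional[Tuple[str, int]]:
    """Try rules from highest priority (smallest number) down; return the first hit."""
    for _, name, pattern in sorted(rules, key=lambda r: r[0]):
        position = find_pattern(tags, pattern)
        if position is not None:
            return name, position
    return None


tags = ["noun", "case_particle", "comma", "verb",
        "conj_particle_A", "comma", "noun", "verb"]
rules = [
    (2, "case_particle+comma", ("case_particle", "comma")),
    (1, "conj_particle_A+comma", ("conj_particle_A", "comma")),
]
chosen = choose_division(tags, rules)
```

Here the higher-priority rule wins even though the lower-priority pattern occurs earlier in the sentence.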

Further comprising analysis result evaluation means for evaluating the result of the syntactic and semantic analysis of each element obtained by dividing the input sentence;
The natural language processing system according to claim 2.

The analysis result evaluation means determines that an element should be divided further when the syntactic and semantic analysis means cannot complete the analysis of that element of the input sentence within a predetermined time,
The sentence division processing means divides the element again in accordance with the determination by the analysis result evaluation means,
The syntactic and semantic analysis means performs syntactic and semantic analysis on each element obtained by dividing the element again.
The natural language processing system according to claim 5.
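
The re-division loop of this claim can be sketched recursively. In the sketch below the time limit is simulated: the toy analyzer raises `TimeoutError` for elements longer than four tokens, which stands in for a real wall-clock limit. The helper names and the comma-based divider are assumptions for illustration only.

```python
from typing import Callable, List, Tuple

Tokens = List[str]


def toy_analyze(element: Tokens) -> Tuple[str, ...]:
    """Stand-in analyzer: pretends that long elements exceed the time limit."""
    if len(element) > 4:
        raise TimeoutError("analysis did not finish within the predetermined time")
    return tuple(element)


def divide_at_first_comma(tokens: Tokens) -> List[Tokens]:
    """Divide after the first comma that is not the final token."""
    for i, token in enumerate(tokens[:-1]):
        if token == ",":
            return [tokens[:i + 1], tokens[i + 1:]]
    return [tokens]


def analyze_with_redivision(element: Tokens,
                            analyze: Callable[[Tokens], Tuple[str, ...]],
                            divide: Callable[[Tokens], List[Tokens]]) -> List[Tuple[str, ...]]:
    """Analyze an element; on timeout, divide it again and analyze the parts."""
    try:
        return [analyze(element)]
    except TimeoutError:
        parts = divide(element)
        if len(parts) < 2:
            raise   # nothing left to divide
        results: List[Tuple[str, ...]] = []
        for part in parts:
            results.extend(analyze_with_redivision(part, analyze, divide))
        return results


tokens = ["a", "b", ",", "c", "d", "e", ",", "f"]
results = analyze_with_redivision(tokens, toy_analyze, divide_at_first_comma)
```

The sketch stops at the per-element results; connecting them again would follow the connection rule paired with each division.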

The analysis result evaluation means evaluates the syntactic and semantic analysis results for each element obtained by dividing the input sentence according to a certain division processing rule, and determines whether or not the input sentence should be divided again,
The sentence division processing means applies another division processing rule to the input sentence in accordance with the determination by the analysis result evaluation means, and divides the input sentence again,
The syntactic and semantic analysis means performs syntactic and semantic analysis on each element obtained by dividing the input sentence again.
The natural language processing system according to claim 5.
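
Retrying with a different division rule when the evaluation rejects the first attempt can be sketched as follows. The markers `PA` and `CP`, the toy analyzer, and the acceptance test (an element counts as acceptable if it contains a verb-like token) are assumptions of this sketch, not criteria stated in the claims.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tokens = List[str]
Result = Dict[str, object]


def toy_analyze(element: Tokens) -> Result:
    """Stand-in analyzer: an analysis is 'complete' if the element contains a verb token."""
    return {"tokens": tuple(element), "complete": any(t.startswith("v") for t in element)}


def acceptable(result: Result) -> bool:
    """Stand-in evaluation of one element's analysis result."""
    return bool(result["complete"])


def analyze_with_retry(tokens: Tokens, markers: List[str],
                       analyze: Callable[[Tokens], Result],
                       ok: Callable[[Result], bool]) -> Tuple[Optional[str], List[Result]]:
    """Try each division marker in turn; keep the first division whose results all pass."""
    for marker in markers:
        if marker not in tokens[:-1]:
            continue
        i = tokens.index(marker)
        elements = [tokens[:i + 1], tokens[i + 1:]]
        results = [analyze(e) for e in elements]
        if all(ok(r) for r in results):
            return marker, results
    return None, [analyze(tokens)]   # fall back to analyzing the undivided sentence


tokens = ["n1", "PA", "n2", "v1", "CP", "n3", "v2"]
used, results = analyze_with_retry(tokens, ["PA", "CP"], toy_analyze, acceptable)
```

Dividing at `PA` leaves a front element without a verb, so the evaluation rejects it and the second rule (`CP`) is applied to the original input sentence instead.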

The sentence division processing means selects a division processing rule by a machine learning method and divides the input sentence.
The natural language processing system according to claim 2.

When the sentence division processing means divides the input sentence at a division position where a class-A conjunctive particle is followed by a comma, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence at a division position where a class-B conjunctive particle is followed by a comma, the sentence element connection processing means inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the element before the division position and the element after it in a juxtaposition relation via the pronoun,
The natural language processing system according to claim 1.
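
One way to read "juxtaposition via an inserted pronoun" is that a new pronoun node is created and the heads of both elements are attached to it as parallel members. The sketch below follows that reading; the node name `PRO`, the relation label `juxtaposed`, and the tuple representation of dependencies are assumptions, and the claims do not fix any of them.

```python
from typing import List, Tuple

Dep = Tuple[str, str, str]   # (dependent, head, relation)


def connect_by_juxtaposition(front_head: str, front_deps: List[Dep],
                             rear_head: str, rear_deps: List[Dep],
                             pronoun: str = "PRO") -> List[Dep]:
    """Bundle two analyzed elements under an inserted pronoun node as parallel members."""
    links = [(front_head, pronoun, "juxtaposed"), (rear_head, pronoun, "juxtaposed")]
    return list(front_deps) + list(rear_deps) + links


joined = connect_by_juxtaposition(
    "v1", [("n1", "v1", "arg")],
    "v2", [("n2", "v2", "arg")],
)
```

Neither element is subordinated to the other in this representation; both hang from the inserted node.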

When the sentence division processing means divides the input sentence at a division position where the continuative form of a conjugating word is followed by a comma, the sentence element connection processing means inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the element before the division position and the element after it in a juxtaposition relation via the pronoun,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence at a division position where a case particle is followed by a comma, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the quotation of the verb of the element after the division position if that verb is a quotative verb, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing system according to claim 1.
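
The conditional connection for a case-particle division can be sketched as a simple lookup. The verb list below is illustrative only (言う "say" and 思う "think" are common Japanese quotative verbs); the claims do not enumerate quotative verbs, and the relation labels are assumptions.

```python
from typing import Tuple

# Illustrative set of quotative verbs; not taken from the claims.
QUOTATIVE_VERBS = {"言う", "思う"}


def connect_case_particle(front_head: str, rear_verb: str) -> Tuple[str, str, str]:
    """Attach the front element as a quotation if the rear verb is quotative,
    otherwise as an adverbial modifier."""
    relation = "quotation" if rear_verb in QUOTATIVE_VERBS else "adverbial_modifier"
    return (front_head, rear_verb, relation)


as_quote = connect_case_particle("front", "言う")
as_modifier = connect_case_particle("front", "走る")
```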

When the sentence division processing means divides the input sentence at a division position where a noun, a binding particle, and a comma appear consecutively, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the corresponding case if the verb of the element after the division position has an omitted case element, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing system according to claim 1.
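
Filling an omitted case element of the rear verb with the front element can be sketched as below. The case names, the idea of a per-verb list of required cases, and the choice of the first missing case when several are omitted are all assumptions of this sketch; the claims only state that the front element is connected as the corresponding case.

```python
from typing import Iterable, Sequence, Tuple


def connect_to_omitted_case(front_head: str, rear_verb: str,
                            required_cases: Sequence[str],
                            filled_cases: Iterable[str]) -> Tuple[str, str, str]:
    """Attach the front element as the first omitted case of the rear verb,
    or as an adverbial modifier if no case is omitted."""
    filled = set(filled_cases)
    missing = [case for case in required_cases if case not in filled]
    relation = missing[0] if missing else "adverbial_modifier"
    return (front_head, rear_verb, relation)


fills_gap = connect_to_omitted_case("n1", "v2", ["subject", "object"], {"object"})
no_gap = connect_to_omitted_case("n1", "v2", ["subject", "object"], {"subject", "object"})
```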

When the sentence division processing means divides the input sentence at a division position where an adverbial noun is followed by a comma, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence using a comma as the division position, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence using a class-A conjunctive particle as the division position, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence using a class-B conjunctive particle as the division position, the sentence element connection processing means inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the elements before and after the division position in a juxtaposition relation via the pronoun,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence using the continuative form of a conjugating word as the division position, the sentence element connection processing means inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the elements before and after the division position in a juxtaposition relation via the pronoun,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence using a case particle as the division position, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the quotation of the verb of the element after the division position if that verb is a quotative verb, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence using the particle as the division position, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the corresponding case if the verb of the element after the division position has an omitted case element, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing system according to claim 1.

When the sentence division processing means divides the input sentence using an adverbial noun as the division position, the sentence element connection processing means connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing system according to claim 1.

A natural language processing method for analyzing an input natural language sentence,
A sentence division processing step for determining a division position in the input sentence based on morpheme information included in the input sentence, and dividing the input sentence into a front element and a rear element of the division position;
A syntactic and semantic analysis step of performing syntactic and semantic analysis on each element of the input sentence divided at the division position, and obtaining, for each element, the dependency relationships between words and the types of those relationships;
A sentence element connection processing step of determining the connection position and the dependency relationship in the syntactic and semantic analysis result of each divided element based on morpheme information at the division position of the input sentence, and connecting the syntactic and semantic analysis results of the elements;
A natural language processing method comprising:

One or more division processing rules, each describing morpheme information of one morpheme or two or more consecutive morphemes related to a sentence division position, are associated with one or more connection processing rules, each describing the position and dependency relationship for connecting the syntactic and semantic analysis results of the divided elements,
In the sentence element connection processing step, the syntactic and semantic analysis results of the divided elements are connected by using the connection processing rule corresponding to the division processing rule applied to divide the input sentence in the sentence division processing step.
The natural language processing method according to claim 22.

A natural language processing method for analyzing an input natural language sentence,
A division processing rule for determining a position to divide a sentence is associated with a concatenation processing rule for concatenating a plurality of syntactic and semantic analysis results into one,
A sentence division processing step for dividing an input sentence into elements by referring to any of the division processing rules;
A syntax and semantic analysis step for performing syntax and semantic analysis on each element of the input sentence divided in the sentence division processing step;
A sentence element concatenation processing step of concatenating the syntactic and semantic analysis results obtained for each element in the syntactic and semantic analysis step, using the concatenation processing rule corresponding to the division processing rule used to divide the input sentence in the sentence division processing step;
A natural language processing method comprising:

A plurality of division processing rules, each assigned a priority, are prepared,
In the sentence division processing step, the division processing rules are applied in order of priority, and the input sentence is divided at the division position obtained.
The natural language processing method according to claim 23 or 24.

Further comprising an analysis result evaluation step of evaluating the result of the syntactic and semantic analysis of each element obtained by dividing the input sentence;
The natural language processing method according to claim 23 or 24.

In the analysis result evaluation step, in response to failure to execute analysis for each element of the input sentence in the syntax-semantic analysis step within a predetermined time, it is determined that the element should be further divided,
In the sentence division processing step, the element is divided again according to the determination in the analysis result evaluation step,
In the syntactic and semantic analysis step, a syntactic and semantic analysis process is performed on each element obtained by dividing the element again.
The natural language processing method according to claim 26.

In the analysis result evaluation step, the syntax-semantic analysis result for each element obtained by dividing the input sentence according to a certain division processing rule is evaluated, and it is determined whether or not the input sentence should be divided again.
In the sentence division processing step, another division processing rule is applied to the input sentence according to the determination in the analysis result evaluation step, and the division is performed again.
In the syntactic and semantic analysis step, a syntactic and semantic analysis process is performed on each element obtained by dividing the input sentence again.
The natural language processing method according to claim 26.

In the sentence division processing step, a division processing rule is selected by a machine learning method, and the input sentence is divided.
The natural language processing method according to claim 23 or 24.

When the input sentence is divided in the sentence division processing step at a division position where a class-A conjunctive particle is followed by a comma, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step at a division position where a class-B conjunctive particle is followed by a comma, the connection processing step inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the element before the division position and the element after it in a juxtaposition relation via the pronoun,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step at a division position where the continuative form of a conjugating word is followed by a comma, the connection processing step inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the element before the division position and the element after it in a juxtaposition relation via the pronoun,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step at a division position where a case particle is followed by a comma, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the quotation of the verb of the element after the division position if that verb is a quotative verb, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step at a division position where a noun, a binding particle, and a comma appear consecutively, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the corresponding case if the verb of the element after the division position has an omitted case element, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step at a division position where an adverbial noun is followed by a comma, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step using a comma as the division position, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step using a class-A conjunctive particle as the division position, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step using a class-B conjunctive particle as the division position, the connection processing step inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the elements before and after the division position in a juxtaposition relation via the pronoun,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step using the continuative form of a conjugating word as the division position, the connection processing step inserts a pronoun that bundles the elements, and connects the syntactic and semantic analysis results of the elements before and after the division position in a juxtaposition relation via the pronoun,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step using a case particle as the division position, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the quotation of the verb of the element after the division position if that verb is a quotative verb, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step using the auxiliary particle as the division position, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as the corresponding case if the verb of the element after the division position has an omitted case element, and otherwise by treating the element before the division position as an adverbial modifier,
The natural language processing method according to claim 22 or 24.

When the input sentence is divided in the sentence division processing step using an adverbial noun as the division position, the connection processing step connects the syntactic and semantic analysis results of the elements by treating the element before the division position as an adverbial modifier of the verb of the element after the division position,
The natural language processing method according to claim 22 or 24.

A computer program written in a computer-readable format so as to execute processing for analyzing an input natural language sentence on a computer system,
A sentence division processing step for determining a division position in the input sentence based on morpheme information included in the input sentence, and dividing the input sentence into a front element and a rear element of the division position;
A syntactic and semantic analysis step of performing syntactic and semantic analysis on each element of the input sentence divided at the division position, and obtaining, for each element, the dependency relationships between words and the types of those relationships;
A sentence element connection processing step of determining the connection position and the dependency relationship in the syntactic and semantic analysis result of each divided element based on morpheme information at the division position of the input sentence, and connecting the syntactic and semantic analysis results of the elements;
A computer program comprising: