CN111353306B - Entity relationship and dependency Tree-LSTM-based combined event extraction method
Entity relationship and dependency Tree-LSTM-based combined event extraction method
- Publication number: CN111353306B (application CN202010109601.1A; also published as CN111353306A)
- Authority: CN (China)
- Prior art keywords: vector, sentence, entity, event, word
- Prior art date: 2020-02-22
- Legal status: Active
Classifications
- G06F16/951—Indexing; Web crawling techniques
- G06F18/24—Classification techniques
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Abstract
The invention discloses a combined event extraction method based on entity relationships and a dependency Tree-LSTM. The method comprises the following steps: 1. Encode the original text and the text annotation information. 2. Input the result of step 1 into a bidirectional LSTM to obtain forward and backward hidden state vectors with timing information. 3. Parse the input sentence into a dependency tree structure, input the result of step 1 into the dependency Tree-LSTM constructed from that tree, and obtain the root-node hidden state vector and the hidden state vector at each time step. 4. Obtain and save the entity-relation sentence feature vector, and concatenate the forward and backward hidden state vectors of the bidirectional LSTM at time t with the hidden state vector of the dependency Tree-LSTM at time t. 5. Recognize and classify trigger words. 6. Recognize and classify event arguments.
Description
Technical Field
The invention relates to an event extraction method, in particular to a combined event extraction method based on entity relationships and a dependency Tree-LSTM, and belongs to the field of natural language processing.
Background
Event Extraction (EE) is an important component of the Information Extraction (IE) task. Event extraction mainly comprises two subtasks, trigger word recognition and classification (Event Detection, ED) and event argument recognition and classification (AI). The ED task finds the trigger words that trigger events in the text and correctly judges the event type of each trigger word. The latter task first determines whether a sentence is an event sentence (i.e. contains a trigger word), then determines whether each entity mention appearing in the sentence is an event argument, and assigns each entity mention the correct event argument role. With the emergence of massive text information and the deepening development of deep learning technology, event extraction has become a hot research problem. The event extraction technology has also been applied to news message classification, social public opinion management, and other areas.
Disclosure of Invention
The invention provides a combined event extraction method based on entity relationships and a dependency Tree-LSTM, aimed mainly at two problems: the dependency path between an event trigger word and an event argument is too long, and the output features of the model lack entity relation information.
The method for extracting combined events based on entity relationships and a dependency Tree-LSTM comprises the following steps:
Step 1: encode the original text and the text annotation information;
Step 2: input the result of step 1 into a bidirectional LSTM, obtaining the forward hidden state vector h_t^fw and the backward hidden state vector h_t^bw with timing information;
Step 3: first parse the input sentence into a dependency tree structure with the Stanford CoreNLP tool, then input the encoding result of step 1 into the dependency Tree-LSTM constructed from the dependency tree, obtaining the root-node hidden state vector h_root and the hidden state vector h_t^tree at each time t;
Step 4: concatenate the entity relation vector R_k with the root-node hidden state vector h_root, obtaining and saving the entity-relation sentence vector F; at the same time, concatenate the forward hidden state vector h_t^fw and the backward hidden state vector h_t^bw of the bidirectional LSTM at time t with the hidden state vector h_t^tree of the dependency Tree-LSTM at time t to derive a new hidden state vector H_t, which both preserves the child-node information and captures local context information with timing;
Step 5: concatenate the hidden state vector H_t at time t from step 4 with the sentence vector F, and perform trigger word recognition and classification;
Step 6: concatenate the hidden state vector H_t of the t-th word recognized as a trigger word in step 5, the hidden state vector H_i of the i-th event argument candidate (i.e. the i-th entity mention), the sentence vector F, and the entity relation argument role vector A_ij^k of the i-th candidate in the entity relation R_k, and perform event argument recognition and classification.
Further, step 1 is specifically implemented as follows:
1-1. Acquire the unprocessed original text and text annotation information from the source file. The annotation information comprises entity mentions, entity types, event trigger words, event arguments, event argument roles, entity relations, and entity relation argument roles; there are 7 entity types, 39 event trigger word types, 20 entity relation types, and 16 entity relation argument roles. Then perform sentence splitting and word segmentation on the original text with Stanford CoreNLP, and obtain the part of speech of each word and the dependency tree structure of each sentence, where each word serves as a node of the tree. Respectively create a part-of-speech vector table, an entity type vector table, an entity relation argument role vector table, a trigger word type vector table, and an event argument role vector table; each vector table also holds an initialization vector for the "other" type. An entity mention may consist of multiple words; for convenience of representation, each entity mention is denoted by its head word (usually the last word of the mention), and the subscript at which the head word appears in the sentence serves as the subscript of the mention. The subscripts of the entity mentions are thus denoted head_1, head_2, head_3, ..., head_{k-1}, head_k, where k is the number of entity mentions and may be zero, and head_i is used to denote the i-th entity mention appearing in the sentence. Every vector in all vector tables is randomly initialized and updated during training;
1-2. Query the pre-trained GloVe word vector matrix to obtain the word vector w_i of each word in the sentence, then query the part-of-speech vector table to obtain the part-of-speech vector w_pos, and query the entity type vector table to obtain the entity type vector w_e;
Each word is thus represented as x_i = {w_i, w_pos, w_e}, and the sentence vector matrix is denoted W = {x_1, x_2, ..., x_{n-1}, x_n}, where n is the length of the sentence;
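A minimal PyTorch sketch of this encoding step follows; all dimensions, table sizes, and the class name TokenEncoder are illustrative assumptions, and the randomly initialized word table stands in for the pre-trained GloVe matrix.

```python
import torch
import torch.nn as nn

# Sketch of the step-1 token encoder; dimensions and vocabulary sizes are
# assumptions for the example, not values fixed by the patent.
class TokenEncoder(nn.Module):
    def __init__(self, vocab=20000, d_word=100, n_pos=50, d_pos=25,
                 n_ent=8, d_ent=25):
        super().__init__()
        self.word = nn.Embedding(vocab, d_word)   # w_i (GloVe in the method)
        self.pos = nn.Embedding(n_pos, d_pos)     # part-of-speech table
        self.ent = nn.Embedding(n_ent, d_ent)     # 7 entity types + "other"

    def forward(self, word_ids, pos_ids, ent_ids):
        # x_i = {w_i, w_pos, w_e}: concatenate the three lookups per token.
        return torch.cat([self.word(word_ids),
                          self.pos(pos_ids),
                          self.ent(ent_ids)], dim=-1)

enc = TokenEncoder()
n = 6                                             # sentence length
W = enc(torch.randint(0, 20000, (n,)),            # word indices
        torch.randint(0, 50, (n,)),               # POS indices
        torch.randint(0, 8, (n,)))                # entity-type indices
print(W.shape)                                    # W = {x_1, ..., x_n}: (6, 150)
```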
Further, step 2 is specifically implemented as follows:
Input the sentence vector matrix W = {x_1, x_2, ..., x_{n-1}, x_n} into the bidirectional LSTM, obtaining the forward hidden state matrix [h_1^fw, h_2^fw, ..., h_n^fw] and the backward hidden state matrix [h_1^bw, h_2^bw, ..., h_n^bw] of the sentence, where h_t^fw and h_t^bw denote the forward and backward hidden state vectors at time t, t ∈ [1, n]. The bidirectional LSTM is a time-series-sensitive model; therefore h_t^fw and h_t^bw respectively preserve the preceding and following context with timing information;
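A sketch of this step in the same PyTorch setting, assuming the (n, 150) representation matrix W from the step-1 sketch; the hidden size of 128 is an arbitrary choice.

```python
import torch
import torch.nn as nn

# Bidirectional LSTM over the encoded sentence (step 2).
bilstm = nn.LSTM(input_size=150, hidden_size=128,
                 bidirectional=True, batch_first=True)

W = torch.randn(1, 6, 150)          # one sentence of length n = 6
out, _ = bilstm(W)                  # out: (1, n, 2 * 128)
h_fw = out[..., :128]               # forward hidden states h_t^fw
h_bw = out[..., 128:]               # backward hidden states h_t^bw
```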
Further, step 3 is specifically implemented as follows:
Parse each sentence into a tree structure with the Stanford CoreNLP tool, where each word in the sentence forms a node of the tree and a parent-child edge appears between two nodes when a dependency relation exists between the corresponding words. Input W = {x_1, x_2, ..., x_{n-1}, x_n} into the dependency Tree-LSTM constructed on this tree structure, obtaining the hidden state vector h_t^tree of each node in the parsed tree and the hidden state vector h_root of the root node. The dependency Tree-LSTM of a sentence thus outputs the hidden state matrix [h_1^tree, ..., h_n^tree, h_root], where t, root ∈ [1, n] and n is the length of the sentence;
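The patent does not print the Tree-LSTM cell equations, so the sketch below assumes the standard Child-Sum Tree-LSTM of Tai et al. (2015), which matches the child-summing behavior described in step 4-2.

```python
import torch
import torch.nn as nn

# Child-Sum Tree-LSTM cell (assumed form); d_in/d_h are illustrative.
class ChildSumTreeLSTMCell(nn.Module):
    def __init__(self, d_in, d_h):
        super().__init__()
        self.W_iou = nn.Linear(d_in, 3 * d_h)            # input/output/update gates
        self.U_iou = nn.Linear(d_h, 3 * d_h, bias=False)
        self.W_f = nn.Linear(d_in, d_h)                  # one forget gate per child
        self.U_f = nn.Linear(d_h, d_h, bias=False)
        self.d_h = d_h

    def forward(self, x, child_h, child_c):
        # child_h, child_c: (num_children, d_h); leaves pass empty tensors.
        h_sum = child_h.sum(0) if child_h.numel() else x.new_zeros(self.d_h)
        i, o, u = (self.W_iou(x) + self.U_iou(h_sum)).chunk(3, dim=-1)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        c = i * u
        if child_h.numel():
            f = torch.sigmoid(self.W_f(x) + self.U_f(child_h))
            c = c + (f * child_c).sum(0)
        return torch.tanh(c) * o, c                      # h_t^tree, c_t

cell = ChildSumTreeLSTMCell(d_in=150, d_h=128)
h_leaf, c_leaf = cell(torch.randn(150), torch.empty(0, 128), torch.empty(0, 128))
h_par, c_par = cell(torch.randn(150), h_leaf.unsqueeze(0), c_leaf.unsqueeze(0))
# A bottom-up traversal of the parsed dependency tree calls the cell once per
# node, yielding h_t^tree for every word and h_root at the tree root.
```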
Further, step 4 is specifically implemented as follows:
4-1. Obtain the entity relation vector R_k in the sentence, representing the k-th entity relation, by querying the randomly initialized entity relation table from step 1; if no entity relation exists, R_k points to the "other" entity relation vector, and the vectors are adjusted during training;
4-2. The memory cell vector c and the hidden state vector h of each node in the dependency Tree-LSTM are obtained by summing the hidden state vectors of the node's children; the root node of the dependency tree structure therefore contains the information of the whole sentence. To make the sentence-level vector contain entity relationship information, the root hidden vector h_root generated in step 3 is concatenated with the entity relation vector R_k, obtaining the sentence vector F = [h_root, R_k] containing entity relation information;
4-3. Combine the hidden vectors at each time from step 2 and step 3; to reduce the dimensionality of the hidden vectors, the hidden state vector at time t is obtained by averaging: H_t = (h_t^fw + h_t^bw + h_t^tree) / 3, and the hidden state matrix of the whole sentence is H = {H_1, H_2, ..., H_{n-1}, H_n}, where t ∈ [1, n] and n is the length of the sentence;
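A sketch of this combination step, assuming all three encoders share hidden size d_h = 128 and that "averaging" is the element-wise mean of the three per-word states; the relation vector dimension of 50 is an assumption.

```python
import torch

# Step 4: build the relation-aware sentence vector F and the averaged states H_t.
d_h, n = 128, 6
h_fw   = torch.randn(n, d_h)    # BiLSTM forward states
h_bw   = torch.randn(n, d_h)    # BiLSTM backward states
h_tree = torch.randn(n, d_h)    # dependency Tree-LSTM states per word
h_root = torch.randn(d_h)       # Tree-LSTM root state
R_k    = torch.randn(50)        # entity relation vector (dimension assumed)

F = torch.cat([h_root, R_k])    # sentence vector with entity relation info
H = (h_fw + h_bw + h_tree) / 3  # H = {H_1, ..., H_n}, shape (n, d_h)
```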
Further, step 5 is specifically implemented as follows:
5-1. Specify that only verbs and nouns serve as trigger word candidates; there are 39 types in total, including the "other" type. Judge the part of speech of each word in the sentence; if it is a verb or a noun, concatenate the hidden state vector H_t at the current time t with the sentence vector F and input the result into the trigger word multi-classification formula:
P_t^tri = softmax_tri(W_T[H_t, F] + b_T)
where W_T and b_T are respectively the weight matrix and the bias term of the trigger word multi-classification; P_t^tri denotes the probability that the trigger word candidate at the t-th word (each word corresponding to one time step) triggers each event type, and y_t^tri denotes the event type triggered at the t-th time step;
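Under the same assumptions as the previous sketches, a minimal version of this trigger classifier:

```python
import torch
import torch.nn as nn

# Trigger multi-classifier P_t^tri = softmax(W_T[H_t, F] + b_T); 39 classes
# include "other". d_F = 128 + 50 matches the step-4 sketch.
d_h, d_F, n_types = 128, 178, 39
trigger_clf = nn.Linear(d_h + d_F, n_types)      # realizes W_T and b_T

H_t = torch.randn(d_h)                           # state of a verb/noun candidate
F = torch.randn(d_F)                             # relation-aware sentence vector
p_tri = torch.softmax(trigger_clf(torch.cat([H_t, F])), dim=-1)
y_tri = p_tri.argmax()                           # predicted event type at time t
```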
Further, step 6 is specifically implemented as follows:
6-1. There are 20 entity relation argument roles. Establish a randomly initialized entity relation argument role vector table; the table is queried by entity relation argument role, and the vectors are adjusted during training. A_ij^k denotes that the i-th entity mention plays the j-th entity relation argument role in the entity relation R_k;
6-2. The entity mentions in the sentence serve as event argument candidates. Sequentially concatenate the hidden state vector H_i of the i-th event argument candidate (the i-th entity mention), the hidden state vector H_t of the t-th word recognized as a trigger word in step 5-1, the sentence vector F containing entity relation information, and the entity relation argument role vector A_ij^k of the i-th event argument candidate in the entity relation R_k; input the concatenated vector into the event argument recognition multi-classification formula:
P_i^arg = softmax_arg(W_A[H_i, H_t, F, A_ij^k] + b_A)
where W_A and b_A are respectively the weight matrix and the bias term of the event argument classification; P_i^arg denotes the probability values of the event argument roles played by the i-th event argument candidate in the event type y_t^tri, and y_i^arg denotes the event argument role played by the i-th event argument candidate in the event type y_t^tri.
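A matching sketch of the argument classifier; the role vector dimension and the number of output role classes are assumptions, not values stated by the patent.

```python
import torch
import torch.nn as nn

# Argument multi-classifier: the candidate mention's state H_i, the trigger's
# state H_t, the sentence vector F, and the mention's entity relation argument
# role vector A_ij^k are concatenated and scored.
d_h, d_F, d_role, n_arg_roles = 128, 178, 50, 36
arg_clf = nn.Linear(2 * d_h + d_F + d_role, n_arg_roles)   # W_A and b_A

H_i   = torch.randn(d_h)      # hidden state of the i-th entity mention
H_t   = torch.randn(d_h)      # hidden state of the recognized trigger word
F     = torch.randn(d_F)      # relation-aware sentence vector
A_ijk = torch.randn(d_role)   # role of mention i in entity relation R_k
p_arg = torch.softmax(arg_clf(torch.cat([H_i, H_t, F, A_ijk])), dim=-1)
y_arg = p_arg.argmax()        # predicted event argument role
```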
the invention has the following beneficial effects:
aiming at the defects of the prior art, a method for extracting the combined event based on the entity relationship and the dependency Tree-LSTM is provided. And obtaining the hidden state vector of each moment by using a dependency Tree-LSTM and a bidirectional LSTM, and respectively combining the entity relationship vector and the entity relationship argument role vector with the hidden state vectors to perform multi-classification on the trigger word candidate words and the argument candidate words. The model not only can reduce the influence of wrong trigger word types on argument identification, but also can fully utilize entity relationship and entity relationship argument role information, thereby improving the accuracy of the event extraction model.
Drawings
FIG. 1 is a flow chart of the overall implementation of the present invention.
FIG. 2 is a detailed flow diagram of trigger word recognition and classification and event argument recognition and classification in the present invention.
FIG. 3 is a network architecture diagram of the model of the invention.
Detailed Description
The attached drawings disclose a flow chart of a preferred embodiment of the invention in a non-limiting way; the technical solution of the present invention will be described in detail below with reference to the accompanying drawings.
Event extraction is an important component of information extraction research and a common technical basis for news hotspot extraction and social public opinion analysis. Event extraction finds event mentions, which are composed of event trigger words and event arguments, in large amounts of text. Therefore, event extraction mainly comprises two tasks: trigger word recognition and event argument role classification. Some research divides the task into two stages: the first stage obtains the event type of the trigger word, and the second judges the role of each event argument candidate in the sentence according to the category of the trigger word. The drawback of this pipeline is that misclassification of trigger words in the first stage degrades event argument role classification, which has motivated joint learning models of trigger word recognition and event argument classification. However, such models do not take full advantage of entity relations and of the entity relation argument roles played by entity mentions. Therefore, a combined event extraction method based on entity relationships and a dependency Tree-LSTM is proposed.
As shown in FIGS. 1-3, the method for extracting combined events based on entity relationships and a dependency Tree-LSTM comprises the following steps:
step 1, encoding the original text and the text label information.
Step 2 inputs the result of step 1 into the bi-directional LSTM. Obtaining forward implicit state vectors with timingAnd backward implicit state vector
Step 3, firstly, the Stanford CoreNLP tool is utilized to analyze the input sentence into a dependency Tree structure, then the result of the step 1 is input into a dependency Tree-LSTM constructed by the dependency Tree structure, and the hidden state vector of the Tree root node is obtainedAnd an implicit state vector for each time instant
Step 4, entity relation RkCoded connectionObtaining and storing entity relation sentence information characteristic vectorAt the same time, the forward implicit state vector of the concatenated bi-directional LSTM tAnd backward implicit state vectorAnd an implicit State vector that depends on the Tree-LSTM t timeMake itThe information of the sub-nodes can be saved, and the local context information with a certain time sequence can be obtained.
Step 5, connecting the hidden state vector H at the time t in the step 4tCarrying out trigger word recognition and classification with the sentence vector F;
Further, step 1 is specifically implemented as follows:
the method comprises the steps of obtaining unprocessed original texts and labeling information from a source file, wherein the labeling information comprises entity words, entity types, event trigger words, event arguments, event argument roles, entity relationships and entity relationship argument roles, and the labeling information comprises 7 entity types, 39 event trigger word types, 20 entity relationship types and 16 entity relationship argument roles. And then, sentence and word segmentation are carried out on the original text by using the Stanford CoreNLP. And acquiring a dependency tree structure of parts of speech and sentences, wherein each word is used as a node of the tree structure. And respectively creating a part-of-speech vector table, an entity type vector table, an entity relation argument role vector table, a trigger part-of-speech type vector table and an event argument role vector table, wherein each vector table has other corresponding initialization vectors. These vectors are initialized randomly and updated at the time of training.
Query the pre-trained GloVe word vector matrix to obtain the word vector w_i of each word in the sentence, then query the part-of-speech vector table to obtain w_pos and the entity type vector table to obtain w_e.
Each word is thus represented as x_i = {w_i, w_pos, w_e}, and the sentence vector matrix is denoted W = {x_1, x_2, ..., x_{n-1}, x_n}, where n is the length of the sentence.
Input the sentence vector matrix W = {x_1, x_2, ..., x_{n-1}, x_n} into the bidirectional LSTM, obtaining the forward hidden state matrix [h_1^fw, ..., h_n^fw] and the backward hidden state matrix [h_1^bw, ..., h_n^bw] of the sentence, where h_t^fw and h_t^bw denote the forward and backward hidden state vectors at time t, t ∈ [1, n]. The bidirectional LSTM is a time-series-sensitive model; therefore h_t^fw and h_t^bw respectively preserve the preceding and following context with timing information.
The Stanford CoreNLP tool parses each sentence into a tree structure, with each word in the sentence constituting a node of the tree; a parent-child edge appears between two nodes when a dependency relation exists between the corresponding words. Input W = {x_1, x_2, ..., x_{n-1}, x_n} into the dependency Tree-LSTM constructed on this tree structure, obtaining the hidden state vector h_t^tree of each node in the parsed tree and the hidden state vector h_root of the root node. The dependency Tree-LSTM of a sentence thus outputs the hidden state matrix [h_1^tree, ..., h_n^tree, h_root], where t, root ∈ [1, n] and n is the length of the sentence.
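A sketch of this parsing step: the patent names Stanford CoreNLP; stanza, the Stanford NLP group's Python library, is used here as an assumed stand-in that produces an equivalent dependency parse.

```python
# Requires running stanza.download("en") once beforehand to fetch the models.
import stanza

nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,lemma,depparse")
doc = nlp("Elop plans to leave Nokia.")
for word in doc.sentences[0].words:
    # word.head is the 1-based index of the parent word (0 marks the root);
    # these edges define the tree the Tree-LSTM is evaluated over, bottom-up.
    print(word.id, word.text, word.upos, word.head, word.deprel)
```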
In event extraction, some trigger words are ambiguous, for example: "Elop plans to leave Nokia." Most event extraction (EE) models more easily identify "leave" as an event of type Transport, but if the Membership relation between the entity Elop and the entity Nokia in the sentence is used, the model more easily identifies the End-Position event triggered by "leave". Therefore, by querying the randomly initialized entity relation table from step 1, the entity relation vector R_k (denoting the k-th entity relation) in the sentence is obtained; if no entity relation exists, R_k points to the "other" entity relation vector, and the vector is adjusted during training.
The memory cell vector c and the hidden state vector h of each node in the dependency Tree-LSTM are obtained by summing the hidden state vectors of the node's children; the root node in the semantic dependency tree structure therefore contains the information of the whole sentence. To make the sentence-level vector contain entity relationship information, the root hidden vector h_root generated in step 3 is concatenated with the entity relation vector R_k, obtaining the sentence vector F = [h_root, R_k] containing entity relation information.
The dependency Tree-LSTM is not a time-series-sensitive model, and the hidden state vector it outputs at each time step therefore lacks timing information, so the hidden vectors at each time step from steps 2 and 3 are combined. To reduce the dimensionality of the hidden vectors, the hidden state vector at time t is obtained by averaging: H_t = (h_t^fw + h_t^bw + h_t^tree) / 3, and the hidden state matrix of the whole sentence is H = {H_1, H_2, ..., H_{n-1}, H_n}, where t ∈ [1, n] and n is the length of the sentence.
Specify that only verbs and nouns serve as trigger word candidates; there are 39 types in total, including the "other" type. First judge the part of speech of each word in the sentence; if it is a verb or a noun, concatenate the hidden state vector H_t at the current time t with the sentence vector F and input the result into the trigger word multi-classification formula:
P_t^tri = softmax_tri(W_T[H_t, F] + b_T)
where P_t^tri is the probability that the trigger word candidate at the t-th word triggers each event type, and y_t^tri denotes the event type triggered by the t-th word.
To judge the event argument role that an event argument candidate (entity mention) plays in the event type, the entity relation argument role that the entity mention plays in the entity relation is exploited. In the example sentence mentioned in 4-1, if the model learns that the two entity mentions Elop and Nokia act as employee-member and org, respectively, in the entity relation Membership, the model will more easily assign the event argument roles Person and Entity to the two event arguments Elop and Nokia in the End-Position event. There are 20 entity relation argument roles in total; a randomly initialized entity relation argument role vector table is established, the table is queried by entity relation argument role, and the vectors are adjusted during training. A_ij^k denotes that the i-th entity mention plays the j-th entity relation argument role in the entity relation R_k.
The entity mentions in the sentence serve as event argument candidates. Sequentially concatenate the hidden state vector H_i of the i-th event argument candidate, the hidden state vector H_t of the t-th word recognized as a trigger word in 5-1, the sentence vector F containing entity relation information, and the entity relation argument role vector A_ij^k of the i-th candidate in the relation R_k. Input the concatenated vector into the event argument recognition multi-classification formula:
P_i^arg = softmax_arg(W_A[H_i, H_t, F, A_ij^k] + b_A)
Claims (7)
1. The method for extracting the combined event based on the entity relationship and the dependency Tree-LSTM is characterized by comprising the following steps of:
step 1, encoding the original text and the text annotation information;
step 2, inputting the result of step 1 into a bidirectional LSTM, and obtaining the forward hidden state vector h_t^fw and the backward hidden state vector h_t^bw with timing information;
step 3, firstly parsing an input sentence into a dependency tree structure by using the Stanford CoreNLP tool, then inputting the encoding result of step 1 into the dependency Tree-LSTM constructed from the dependency tree structure, and obtaining the root-node hidden state vector h_root and the hidden state vector h_t^tree at each time t;
step 4, concatenating the entity relation vector R_k with the root-node hidden state vector h_root, obtaining and saving the entity-relation sentence vector F; simultaneously concatenating the forward hidden state vector h_t^fw and the backward hidden state vector h_t^bw of the bidirectional LSTM at time t with the hidden state vector h_t^tree of the dependency Tree-LSTM at time t, and solving a new hidden state vector H_t, thereby both preserving the child-node information and acquiring local context information with timing;
step 5, concatenating the hidden state vector H_t at time t in step 4 with the sentence vector F, and carrying out trigger word recognition and classification;
step 6, sequentially concatenating the hidden state vector H_t of the t-th word recognized as a trigger word in step 5, the hidden state vector H_i of the i-th event argument candidate, namely the i-th entity mention, the sentence vector F containing entity relation information, and the entity relation argument role vector A_ij^k of the i-th event argument candidate in the entity relation R_k, and carrying out event argument recognition and classification.
2. The method for extracting combined event based on entity relationship and dependency Tree-LSTM as claimed in claim 1, wherein step 1 is implemented as follows:
1-1, acquiring the unprocessed original text and text annotation information from a source file, wherein the annotation information comprises entity mentions, entity types, event trigger words, event arguments, event argument roles, entity relations and entity relation argument roles, and wherein there are 7 entity types, 39 event trigger word types, 20 entity relation types and 16 entity relation argument roles; then carrying out sentence splitting and word segmentation on the original text by using Stanford CoreNLP; acquiring the part of speech of each word and the dependency tree structure of each sentence, wherein each word serves as a node of the tree structure; respectively creating a part-of-speech vector table, an entity type vector table, an entity relation argument role vector table, a trigger word type vector table and an event argument role vector table, wherein each vector table also holds an initialization vector for the "other" type;
1-2, querying the pre-trained GloVe word vector matrix to obtain the word vector w_i of each word in the sentence, then querying the part-of-speech vector table to obtain the part-of-speech vector w_pos, and querying the entity type vector table to obtain the entity type vector w_e;
each word is thus represented as x_i = {w_i, w_pos, w_e}, and the sentence vector matrix is denoted W = {x_1, x_2, ..., x_{n-1}, x_n}, where n is the length of the sentence.
3. The method for extracting combined event based on entity relationship and dependency Tree-LSTM as claimed in claim 1 or 2, wherein step 2 is implemented as follows:
inputting the sentence vector matrix W = {x_1, x_2, ..., x_{n-1}, x_n} into the bidirectional LSTM, and respectively acquiring the forward hidden state matrix [h_1^fw, ..., h_n^fw] and the backward hidden state matrix [h_1^bw, ..., h_n^bw] of the sentence, wherein h_t^fw and h_t^bw represent the forward and backward hidden state vectors at time t, respectively, t ∈ [1, n]; the bidirectional LSTM is a time-series-sensitive model, and therefore h_t^fw and h_t^bw respectively preserve the preceding and following context with timing information.
4. The method for extracting combined events based on entity relationship and dependency Tree-LSTM as claimed in claim 3, wherein step 3 is implemented as follows:
parsing each sentence into a tree structure through the Stanford CoreNLP tool, wherein each word in the sentence forms a node of the tree structure, and a parent-child edge appears between two nodes when a dependency relation exists between the corresponding words; inputting W = {x_1, x_2, ..., x_{n-1}, x_n} into the dependency Tree-LSTM constructed based on the tree structure, and obtaining the hidden state vector h_t^tree of each node in the tree structure parsed from the sentence and the hidden state vector h_root of the root node; thus, the dependency Tree-LSTM of a sentence outputs the hidden state matrix [h_1^tree, ..., h_n^tree, h_root] of the sentence, wherein t, root ∈ [1, n] and n is the length of the sentence.
5. The method for extracting combined events based on entity relationship and dependency Tree-LSTM as claimed in claim 4, wherein step 4 is implemented as follows:
4-1, obtaining the entity relation vector R_k in the sentence, representing the k-th entity relation, by querying the randomly initialized entity relation table from step 1; if no entity relation exists, R_k points to the "other" entity relation vector, and the vectors are adjusted during the training process;
4-2, the memory cell vector c and the hidden state vector h of each node in the dependency Tree-LSTM are obtained by summing the hidden state vectors of the node's children; the root node in the semantic dependency tree structure therefore contains the information of the whole sentence, and the root hidden vector h_root generated in step 3 is concatenated with the entity relation vector R_k, obtaining the sentence vector F = [h_root, R_k] containing entity relation information;
4-3, combining the hidden vectors at each time in step 2 and step 3, and at the same time obtaining the hidden state vector at time t by averaging in order to reduce the dimensionality of the hidden vectors: H_t = (h_t^fw + h_t^bw + h_t^tree) / 3, the hidden state matrix of the whole sentence being H = {H_1, H_2, ..., H_{n-1}, H_n}, wherein t ∈ [1, n] and n is the length of the sentence.
6. The method for extracting combined events based on entity relationship and dependency Tree-LSTM as claimed in claim 5, wherein step 5 is implemented as follows:
5-1, specifying that only verbs and nouns serve as trigger word candidates, with 39 types in total, including the "other" type; judging the part of speech of each word in the sentence, and if the part of speech is a verb or a noun, concatenating the hidden state vector H_t at the current time t with the sentence vector F and inputting the result into the trigger word multi-classification formula:
P_t^tri = softmax_tri(W_T[H_t, F] + b_T)
7. The method for extracting combined event based on entity relationship and dependency Tree-LSTM as claimed in claim 6, wherein step 6 is implemented as follows:
6-1, there are 20 entity relation argument roles; establishing a randomly initialized entity relation argument role vector table, wherein the table is queried by entity relation argument role and the vectors are adjusted during the training process; A_ij^k represents that the i-th entity mention plays the j-th entity relation argument role in the entity relation vector R_k;
6-2, the entity mentions in the sentence serve as event argument candidates; sequentially concatenating the hidden state vector H_i of the i-th event argument candidate, the hidden state vector H_t of the t-th word recognized as a trigger word in step 5-1, the sentence vector F containing entity relation information, and the entity relation argument role vector A_ij^k of the i-th event argument candidate in the entity relation R_k; inputting the concatenated vector into the event argument recognition multi-classification formula:
P_i^arg = softmax_arg(W_A[H_i, H_t, F, A_ij^k] + b_A)
wherein W_A and b_A are respectively the weight matrix and the bias term of the event argument classification; P_i^arg represents the probability values of the event argument roles played by the i-th event argument candidate in the event type y_t^tri; y_i^arg represents the event argument role played by the i-th event argument candidate in the event type y_t^tri.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010109601.1A (CN111353306B) | 2020-02-22 | 2020-02-22 | Entity relationship and dependency Tree-LSTM-based combined event extraction method |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN111353306A | 2020-06-30 |
| CN111353306B | 2020-10-16 |
Family
- Family ID: 71195780
- Family application: CN202010109601.1A, filed 2020-02-22 with CN, granted as CN111353306B (Active)
Legal Events

| Code | Title | Details |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| EE01 | Entry into force of recordation of patent licensing contract | Assignee: Hangzhou Yuanchuan New Technology Co.,Ltd.; Assignor: HANGZHOU DIANZI University; Contract record no.: X2021330000781; License type: Common License; Record date: 2021-12-06; Granted publication date: 2020-10-16; Application publication date: 2020-06-30 |