Parsing with Compositional Vector Grammars (paper notes)

Date: 2023-01-01 22:42:52

Differences between this paper and the 2012 one:

1) Max-Margin Training Objective

In the objective J, the RNN scorer is replaced by the CVG.

2012: two word vectors are merged and scored. This paper: two (word vector, POS) pairs are merged and scored.
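As a toy illustration of the structured-margin idea behind J (all numbers below are assumed, not from the paper): the gold tree's score must beat every candidate tree's score by at least its margin penalty Δ, and the loss is the hinge on the worst violation.

```python
# Structured hinge for one sentence (sketch; toy numbers are assumed).
# The scorer s(.) would be the RNN in 2012 and the CVG in this paper.
gold_score = 3.2                           # s(x_i, y_i), assumed
candidates = [(2.9, 1.0), (3.5, 2.0)]      # (s(x_i, y), Δ(y_i, y)) pairs, assumed

# loss_i = max(0, max_y [s(x_i, y) + Δ(y_i, y)] - s(x_i, y_i))
loss = max(0.0, max(s + d for s, d in candidates) - gold_score)
```

The margin term Δ penalizes candidate trees more the further they deviate from the gold tree, so near-misses need a smaller score gap than badly wrong parses.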

The 2012 paper:

Socher et al. (2012) proposed to give every single word a matrix and a vector. The matrix is then applied to the sibling node’s vector during the composition. While this results in a powerful composition function that essentially depends on the words being combined, the number of model parameters explodes and the composition functions do not capture the syntactic commonalities between similar POS tags or syntactic categories
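A minimal numpy sketch of that 2012 matrix-vector composition (the dimension, random initialization, and tanh nonlinearity are assumptions): each word's matrix transforms its sibling's vector before the shared composition, which is why the parameter count grows with the vocabulary.

```python
import numpy as np

n = 4  # embedding dimension (assumed toy size)
rng = np.random.default_rng(0)

# Each word carries both a vector and a matrix (Socher et al., 2012).
a, A = rng.standard_normal(n), rng.standard_normal((n, n))
b, B = rng.standard_normal(n), rng.standard_normal((n, n))

W  = rng.standard_normal((n, 2 * n))   # shared composition for the parent vector
WM = rng.standard_normal((n, 2 * n))   # shared composition for the parent matrix

# Each child's matrix modifies the sibling's vector, then both are composed.
p = np.tanh(W @ np.concatenate([B @ a, A @ b]))   # parent vector
P = WM @ np.vstack([A, B])                        # parent matrix, reused higher up
```

With |V| words, the per-word matrices alone cost |V| * n * n parameters, which is the explosion the quote above refers to.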

This paper:

The idea is that the syntactic categories of the children determine what composition function to use for computing the vector of their parents.

2) Composition Weights

2012:

The original RNN is parameterized by a single weight matrix W.

This paper:

In contrast, the CVG uses a syntactically untied RNN (SU-RNN) which has a set of such weights. The size of this set depends on the number of sibling category combinations in the PCFG.
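A sketch of that syntactic untying, with a toy category set (the category names, dimension, and tanh are assumptions): the composition matrix is looked up by the children's category pair rather than shared globally.

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
categories = ["NP", "VP", "DT", "NN"]  # assumed toy subset of PCFG categories

# SU-RNN: one composition matrix per ordered pair of child categories.
W = {(b, c): rng.standard_normal((n, 2 * n))
     for b in categories for c in categories}

def compose(b_cat, b_vec, c_cat, c_vec):
    """Select the weight matrix by the children's syntactic categories."""
    return np.tanh(W[(b_cat, c_cat)] @ np.concatenate([b_vec, c_vec]))

p = compose("DT", rng.standard_normal(n), "NN", rng.standard_normal(n))
```

Compared to the 2012 per-word matrices, parameters now scale with the number of category pairs observed in the PCFG, not with vocabulary size.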

3) Scoring Trees

2012: In order to compute a score of how plausible of a syntactic constituent a parent is, the RNN uses a single-unit linear layer for all i: s(p^{(i)}) = v^T p^{(i)}

这篇:

First, a single linear unit that scores the parent vector and second, the log probability of the PCFG for the rule that combines these two children: s(p^{(1)}) = (v^{(B,C)})^T p^{(1)} + log P(P_1 → B C)
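The combined score can be sketched as follows (the rule probability, dimension, and category names are assumed toy values): a category-pair-specific linear unit on the parent vector plus the PCFG's log rule probability.

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)

# One scoring vector per child-category pair, as in (v^{(B,C)})^T p.
v = {("DT", "NN"): rng.standard_normal(n)}
# log P(P_1 -> B C) from the PCFG; 0.7 is an assumed toy probability.
log_pcfg = {("NP", "DT", "NN"): np.log(0.7)}

def cvg_score(parent_vec, parent_cat, b_cat, c_cat):
    # linear unit on the parent vector + PCFG log rule probability
    return float(v[(b_cat, c_cat)] @ parent_vec
                 + log_pcfg[(parent_cat, b_cat, c_cat)])

s = cvg_score(rng.standard_normal(n), "NP", "DT", "NN")
```

Adding the PCFG term lets the discrete grammar and the continuous vectors jointly rank candidate constituents during search.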