Topological Dependency Trees: A Constraint Based Account of Linear Precedence

icon

8

pages

icon

English

icon

Documents

Écrit par

Publié par

Lire un extrait
Lire un extrait

Obtenez un accès à la bibliothèque pour le consulter en ligne En savoir plus

Découvre YouScribe et accède à tout notre catalogue !

Je m'inscris

Découvre YouScribe et accède à tout notre catalogue !

Je m'inscris
icon

8

pages

icon

English

icon

Documents

Lire un extrait
Lire un extrait

Obtenez un accès à la bibliothèque pour le consulter en ligne En savoir plus

Topological Dependency Trees: A Constraint-Based Account of Linear Precedence Denys Duchier Programming Systems Lab Universitat des Saarlandes, Geb. 45 Postfach 15 11 50 66041 Saarbrucken, Germany Ralph Debusmann Computational Linguistics Universitat des Saarlandes, Geb. 17 Postfach 15 11 50 66041 Saarbrucken, Germany Abstract We describe a new framework for de- pendency grammar, with a modular de- composition of immediate dependency and linear precedence. Our approach distinguishes two orthogonal yet mutu- ally constraining structures: a syntactic dependency tree and a topological de- pendency tree. The syntax tree is non- projective and even non-ordered, while the topological tree is projective and partially ordered. 1 Introduction Linear precedence in so-called free word order languages remains challenging for modern gram- mar formalisms. To address this issue, we pro- pose a new framework for dependency gram- mar which supports the modular decomposition of immediate dependency and linear precedence. Duchier (1999) formulated a constraint-based ax- iomatization of dependency parsing which char- acterized well-formed syntax trees but ignored is- sues of word order. In this article, we develop a complementary approach dedicated to the treat- ment of linear precedence. Our framework distinguishes two orthogonal, yet mutually constraining structures: a syntactic dependency tree (ID tree) and a topological de- pendency tree (LP tree).

  • option- ally extraposed

  • mann zu

  • infinitive zu

  • edge constraints

  • fieldext

  • versucht

  • must satisfy

  • fieldint

  • verb


Voir icon arrow

Publié par

Langue

English

Topological Dependency Trees:
A Constraint-Based Account of Linear Precedence
Denys Duchier Ralph Debusmann
Programming Systems Lab Computational Linguistics
Universita¨t des Saarlandes, Geb. 45 Universita¨t des Saarlandes, Geb. 17
Postfach 15 11 50 Postfach 15 11 50
¨ ¨66041 Saarbrucken, Germany 66041 Saarbrucken, Germany
duchier@ps.uni-sb.de rade@coli.uni-sb.de
Abstract trees is formulated in terms of (a) lexicalized con-
straints and (b) principles governing e.g. climbing
We describe a new framework for de- conditions.
pendency grammar, with a modular de- In Section 2 we discuss the difficulties pre-
composition of immediate dependency sented by discontinuous constructions in free
and linear precedence. Our approach word order languages, and briefly touch on the
distinguishes two orthogonal yet mutu- limitations of Reape’s (1994) popular theory of
ally constraining structures: a syntactic ‘word order domains’. In Section 3 we introduce
dependency tree and a topological de- the concept of topological dependency tree. In
pendency tree. The syntax tree is non- Section 4 we outline the formal framework for
projective and even non-ordered, while our theory of ID/LP trees. Finally, in Section 5
the topological tree is projective and we illustrate our approach with an account of the
partially ordered. word-order phenomena in the verbal complex of
German verb final sentences.
1 Introduction 2 Discontinuous Constructions
Linear precedence in so-called free word order In free word order languages, discontinuous con-
languages remains challenging for modern gram- structions occur frequently. German, for example,
mar formalisms. To address this issue, we pro- is subject to scrambling and partial extraposition.
pose a new framework for dependency gram- In typical phrase structure based analyses, such
mar which supports the modular decomposition phenomena lead to e.g. discontinuous VPs:
of immediate dependency and linear precedence.
(1) (dass) einen Mann Maria zu lieben versucht
Duchier (1999) formulated a constraint-based ax- (that) a man Maria to love triesacc nom
iomatization of dependency parsing which char-
whose natural syntax tree exhibits crossing edges:
acterized well-formed syntax trees but ignored is-
Ssues of word order. In this article, we develop a
NP V
complementary approach dedicated to the treat-
VP
ment of linear precedence. NP V
DET NOur framework distinguishes two orthogonal,
(dass) einen Mann Maria zu lieben versucht
yet mutually constraining structures: a syntactic
Since this is classically disallowed, discontinu-dependency tree (ID tree) and a topological de-
ous constituents must often be handled indirectlypendency tree (LP tree). While edges of the ID
through grammar extensions such as traces.tree are labeled by syntactic roles, those of the
Reape (1994) proposed the theory of word or-LP tree are labeled by topological fields (Bech,
der domains which became quite popular in the1955). The shape of the LP tree is a flattening of
HPSG community and inspired others such asthe ID tree’s obtained by allowing nodes to ‘climb
Mu¨ller (1999) and Kathol (2000). Reape distin-up’ to land in an appropriate field at a host node
guished two orthogonal tree structures: (a) the un-where that field is available. Our theory of ID/LP
ordered syntax tree, (b) the totally ordered tree ofword order domains. The latter is obtained from
the syntax tree by flattening using the operation
of domain union to produce arbitrary interleav-
ings. The boolean feature [∪±] of each node con-
trols whether it must be flattened out or not. In-
finitives in canonical position are assigned [∪+]:
(dass) Maria einen Mann zu lieben versucht
S
The topological tree (LP tree) is partially ordered
NP VP[∪+] V
and projective:
NP[∪−] V
DET N
(dass) Maria einen Mann zu lieben versucht
v
n n vThus, the above licenses the following tree of
d
word order domains:
(dass) Maria einen Mann zu lieben versucht
S
Its edge labels are called (external) fields and are
NP NP V V
totally ordered: df ≺ mf ≺ vc. This induces a
DET N linear precedence among the daughters of a node
(dass) einen Mann Maria zu lieben versucht in the LP tree. This precedence is partial because
daughters with the same label may be freely per-Extraposed infinitives are assigned [∪−]:
muted.
S
In order to obtain a linearization of a LP tree,
NP V VP[∪−] it is also necessary to position each node with
NP V respect to its daughters. For this reason, each
node is also assigned an internal field (d, n, or v)
DET N
shown above on the vertical pseudo-edges. The
(dass) Maria versucht einen Mann zu lieben
set of internal and external fields is totally or-
As a consequence, Reape’s theory correctly pre- dered: d≺ df≺ n≺ mf≺ vc≺ v
dicts scrambling (2,3) and full extraposition (4), Like Reape, our LP tree is a flattened version of
but cannot handle the partial extraposition in (5): the ID tree (Reape, 1994; Uszkoreit, 1987), but
the flattening doesn’t happen by ‘unioning up’;(2) (dass) Maria einen Mann zu lieben versucht
rather, we allow each individual daughter to climb
(3) (dass) einen Mann Maria zu lieben versucht
up to find an appropriate landing place. This idea
(4) (dass) Maria versucht, einen Mann zu lieben is reminiscent of GB, but, as we shall see, pro-
ceeds rather differently.
(5) (dass) Maria einen Mann versucht, zu lieben
4 Formal Framework3 Topological Dependency Trees
The framework underlying both ID and LP trees
Our approach is based on dependency grammar.
is the configuration of labeled trees under valency
We also propose to distinguish two structures: (a)
(and other) constraints. Consider a finite set L
a tree of syntactic dependencies, (b) a tree of topo-
of edge labels, a finite set V of nodes, and E ⊆
logical dependencies. The syntax tree (ID tree) is
V ×V ×L a finite set of directed labeled edges,
unordered and non-projective (i.e. it admits cross-
′such that (V,E) forms a tree. We write w−−ℓ→w
ing edges). For display purposes, we pick an ar-
′for an edge labeled ℓ from w to w . We define the
bitrary linear arrangement:
ℓ-daughtersℓ(w) of w∈ V as follows:
′ ′ℓ(w) ={w ∈ V | w−−ℓ→w ∈ E}
subject
mf
mf
zuvinf
vc
object
det
dfb bWe writeL for the set of valency specifications ℓ the verbal complement field, xf the extraposition
defined by the following abstract syntax: field. Features of lexical entries relevant to LP
trees are grouped under table heading “Topology”
bℓ ::= ℓ | ℓ? | ℓ∗ (ℓ∈L) din Figure 1. valency assigns a F valencyextLP
to each node and is subject to the lexicalizedbA valency is a subset ofL. The tree (V,E) satis-
constraint:bLfies the valency assignment valency : V → 2 if
for all w∈ V and all ℓ∈L: valency (w) = lex(w).valencyLP LP
ℓ∈ valency(w) ⇒ |ℓ(w)| = 1 (V,E ) must satisfy the valency assignmentLP LP
ℓ?∈ valency(w) ⇒ |ℓ(w)|≤ 1 as described earlier. For example, the lexical en-
ℓ∗∈ valency(w) ⇒ |ℓ(w)|≥ 0 try for zu lieben specifies:2
otherwise ⇒ |ℓ(w)| = 0
valency (zu lieben ) ={mf∗, xf?}2LP
4.1 ID Trees
which permits 0 or more mf edges and at most
An ID tree (V,E , lex, cat, valency ) consistsID ID one xf edge; we say that it offers fields mf and xf.
of a tree (V,E ) with E ⊆ V ×V ×R, whereID ID
Unlike the ID tree, the LP tree must be projective.
the setR of edge labels (Figure 1) represents syn-
The grammar stipulates a total order on F ,ext
tactic roles such as subject or vinf (bare infinitive
thus inducing a partial linear precedence on each
argument). lex : V → Lexicon assigns a lexi-
node’s daughters. This order is partial because
cal entry to each node. An illustrative Lexicon is
all daughters in the same field may be freely per-
displayed in Figure 1 where the 2 features cats
muted: our account of scrambling rests on free
and valency of concern to ID trees are grouped
ID permutations within the mf field. In order to ob-
under table heading “Syntax”. Finally, cat and
tain a linearization of the LP tree, it is necessarybvalency assign a category and anR valency to
ID to specify the position of a node with respect to its
each node w∈ V and must satisfy:
daughters. For this reason each node is assigned
an internal field inF . The setF ∪F is to-int ext intcat(w)∈ lex(w).cats
tally ordered:valency (w) = lex(w).valency
ID ID
d≺ df≺ n≺ mf≺ vc≺ v≺ xf
(V,E ) must satisfy the valency assignment asID ID
described earlier. For example the lexical entry In what (external) field a node may land

Voir icon more
Alternate Text