Universal Dependencies
<span property="owl:versionInfo"> These pages draw from Section 2 of *[Stanford typed dependencies manual](http://nlp.stanford.edu/software/dependencies_manual.pdf)* (de Marneffe and Manning 2008), but have been updated for UD.
</span> </span>
Note: nmod, neg and punct appear in two places.
acl
: clausal modifier of noun
acl is used for finite and non-finite clauses that modify a noun. Note that in English, relative clauses are assigned a specific relation, acl:relcl, a subtype of acl.
Finite clausal complements of nouns that are not relative clauses are limited to complement clauses with a subset of nouns such as fact or report. We analyze them as acl (parallel to the analysis of this class as “content clauses” in Huddleston and Pullum 2002). Such clausal complements are usually finite (though there are occasional remnant English subjunctives).
acl:relcl
: relative clause modifier
A relative clause modifier of a noun is a relative clause modifying the noun. The relation points from the noun that is modified to the head of the relative clause. Relative clauses are finite.
advcl
: adverbial clause modifier
An adverbial clause modifier is a clause which modifies a verb or other predicate (adjective, etc.), as a modifier not as a core complement. This includes things such as a temporal clause, consequence, conditional clause, purpose clause, etc. The dependent must be clausal (or else it is an advmod) and the dependent is the main predicate of the clause.
advmod
: adverbial modifier
An adverbial modifier of a word is a (non-clausal) adverb or adverbial phrase (ADVP) that serves to modify the meaning of the word.
amod
: adjectival modifier
An adjectival modifier of a nominal is any adjective or adjectival phrase that serves to modify the meaning of the nominal. This includes always or sometimes postposed modifiers, such as else and nice in the examples below.
appos
: appositional modifier
An appositional modifier of an NP is an NP immediately to the right of the first NP that serves to define or modify that NP. It includes parenthesized examples, as well as defining abbreviations in one of these structures.
aux
: auxiliary
An auxiliary of a clause is a non-main verb of the clause, e.g., a modal auxiliary, or a form of be, do or have in a periphrastic tense.
(Contrary to the older SD and arguments of Pullum (1982) and following, infinitive to is not analyzed as an auxiliary. Instead, it is analyzed as a mark.)
auxpass
: passive auxiliary
A passive auxiliary of a clause is a non-main verb of the clause which contains the passive information.
case
: case marking
The case relation is used for any preposition in English. Prepositions are treated as dependents of the noun they attach to or introduce in an “extended nominal projection”. Thus, contrary to SD, UD abandons treating a preposition as a mediator between a modified word and its object. The case relation aims at providing a uniform analysis of prepositions and case in morphologically rich languages. In English, subordinating conjunctions introducing clauses are often in the form of prepositions. However, they are given a different dependency: the relation mark is used for markers in an “extended clausal projection”.
The case relation is also used for the possessive clitic ’s in English, which we separate from what it modifies, because it acts as a phrasal clitic, as shown in the last example.
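The contrast with SD described above can be sketched in a few lines of Python. This is only an illustration, not the Stanford converter: the triple layout (relation, governor, dependent), the function name sd_prep_to_ud, and the phrase "a book on the table" are assumptions made for this sketch; prep and pobj are the relation names the older SD scheme used for the preposition-as-mediator analysis.

```python
def sd_prep_to_ud(triples):
    """Rewrite SD-style prep/pobj pairs as UD-style nmod/case arcs.

    Each triple is (relation, governor, dependent). Words are plain
    strings here, so the sketch assumes prepositions are not repeated.
    """
    # Map each preposition to the noun it introduces (its SD pobj).
    pobj_of = {gov: dep for rel, gov, dep in triples if rel == "pobj"}
    converted = []
    for rel, gov, dep in triples:
        if rel == "prep" and dep in pobj_of:
            noun = pobj_of[dep]
            converted.append(("nmod", gov, noun))  # modified word -> noun
            converted.append(("case", noun, dep))  # noun -> preposition
        elif rel == "pobj":
            continue  # already absorbed into the nmod arc
        else:
            converted.append((rel, gov, dep))
    return converted


# Hypothetical phrase: "a book on the table"
sd = [("det", "book", "a"), ("prep", "book", "on"),
      ("pobj", "on", "table"), ("det", "table", "the")]
ud = sd_prep_to_ud(sd)
for rel, gov, dep in ud:
    print(f"{rel}({gov}, {dep})")
```

In the UD-style result the preposition no longer sits between book and table: table attaches directly to book, and on hangs off table as its case marker.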
cc
: coordination
cc:preconj
: preconjunct
A preconjunct is the relation between the head of an NP and a word that appears at the beginning bracketing a conjunction (and puts emphasis on it), such as either, both, or neither.
ccomp
: clausal complement
compound
: compound
compound is used for:
- noun compounds. (These should show the correct modification structure of noun compounds, and do - or should - in the English UD treebank. Note, however, that the current automatic Stanford UD converter still makes all nouns modify the rightmost noun of the noun phrase when run on corpora like the 1999 Penn Treebank 3 which do not show noun compound structure - there is no intelligent noun compound analysis. The correct results are achieved when run on corpora like OntoNotes which do represent the branching structure of noun phrases.)
This includes proper names that use regular syntactic relations—contrast with name:
- numbers
- adjectival compounds
- imitative reduplication
- idiomatic phrasal verbs are analyzed as a language-specific subrelation of compound
compound:prt
: phrasal verb particle
The phrasal verb particle relation identifies an idiomatic phrasal verb, and holds between the verb and its particle (tagged as ADP). It is a subtype of the compound relation.
This relation excludes literal/directional uses of prepositions/particles, such as up, down, in, out, etc. These would typically become an ADV with the relation advmod:
conj
: conjunct
cop
: copula
A copula is the relation between the complement of a copular verb and the copular verb. Copular heads are avoided when possible.
Prepositional phrases are annotated similarly, the only difference being that the nominal predicate has an additional case marker.
When an adjective or adverb is being predicated of a nominal phrase, the adjective/adverb is the root, the nominal phrase is the nsubj, and the copula is the cop.
Prepositions may also project a cop dependent.
In predicative wh-constructions, the fronted wh-word is the head, and the copula is another cop.
However, whenever the copula has a clausal argument/adjunct, the copula becomes the root, so the cop relation is not used.
Predicative “be” is the only verb recognized as a copula; other copula-like verbs, such as “become”, “get”, and “seem”, are treated as regular raising verbs, and thus take xcomp arguments. Non-predicative uses of “be” (e.g., “be” in periphrastic verbal constructions, presentationals, or existentials) are annotated as aux instead of cop.
csubj
: clausal subject
A clausal subject is a clausal syntactic subject of a clause, i.e., the subject is itself a clause. The governor of this relation might not always be a verb: when the verb is a copular verb, the root of the clause is the complement of the copular verb. In the two following examples, what she said is the subject.
csubjpass
: clausal passive subject
A clausal passive subject is a clausal syntactic subject of a passive clause. In the example below, that she lied is the subject.
dep
: dependent
A dependency is labeled as dep when a system is unable to determine a more precise dependency relation between two words. This may be because of a weird grammatical construction, a limitation in the Stanford Dependency conversion software, a parser error, or because of an unresolved long distance dependency.
det:predet
: predeterminer
A predeterminer is the relation between the head of an NP and a word that precedes and modifies the meaning of the NP determiner.
discourse
: discourse element
This is used for interjections and other discourse particles and elements (which are not clearly linked to the structure of the sentence, except in an expressive way). We generally follow the guidelines of what the Penn Treebanks count as an INTJ. They define this to include: interjections (oh, uh-huh, Welcome), fillers (um, ah), and discourse markers (well, like, actually, but not you know).
dislocated
: dislocated elements
The dislocated relation is used for fronted or postposed elements that do not fulfill the usual core grammatical relations of a sentence. Dislocated elements are attached to the same governor as the dependent that they double for.
dobj
: direct object
The direct object of a VP is the noun phrase which is the (accusative) object of the verb.
expl
: expletive
This relation captures an existential there or it in extraposition constructions. There is further discussion and examples on the universal dependency page (expl).
foreign
: foreign words
We use foreign to label sequences of foreign words. These are given a linear analysis: the head is the first token in the foreign phrase.
goeswith
: goes with
This relation links two parts of a word that are separated in text that is not well edited. We follow the treebank: The GW part is the dependent and the head is in some sense the main part, often the second part.
iobj
: indirect object
The indirect object of a (verbal) predicate is the nominal which is the dative object of the verb. The relation iobj is used for objects that are not direct objects. It occurs only when there is a dobj or ccomp in the clause.
Note that prepositional phrases are not considered core arguments in English, hence in she gave it to me, the to me part is attached as nmod although semantically it corresponds to the dative.
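The co-occurrence constraint on iobj lends itself to a simple consistency check. The following is a minimal sketch under stated assumptions: arcs are (relation, governor index, dependent index) triples, the function name iobj_violations is hypothetical, and "She gave me a raise" is an illustrative sentence that does not come from the treebank.

```python
def iobj_violations(arcs):
    """Return iobj arcs whose governor has neither a dobj nor a ccomp.

    arcs is a list of (relation, governor_index, dependent_index).
    """
    has_object = {gov for rel, gov, _ in arcs if rel in ("dobj", "ccomp")}
    return [arc for arc in arcs
            if arc[0] == "iobj" and arc[1] not in has_object]


# "She gave me a raise": gave=2, me=3, raise=5
ok = [("nsubj", 2, 1), ("iobj", 2, 3), ("det", 5, 4), ("dobj", 2, 5)]
# "She gave it to me": the PP is nmod plus case, so no iobj appears
pp = [("nsubj", 2, 1), ("dobj", 2, 3), ("case", 5, 4), ("nmod", 2, 5)]
# Hypothetical mislabeled analysis: iobj without any dobj or ccomp
bad = [("nsubj", 2, 1), ("iobj", 2, 3)]

for name, arcs in (("ok", ok), ("pp", pp), ("bad", bad)):
    print(name, iobj_violations(arcs))
```

Only the third analysis is flagged, since its iobj has no dobj or ccomp sibling under the same governor.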
list
: list
The list relation is used for chains of comparable items. Web text often contains passages which are meant to be interpreted as lists but are parsed as single sentences. Email signatures in particular contain these structures, in the form of contact information: the different contact information items are labeled as list; the key-value pair relations are labeled as appos.
In lists with more than two items, all items of the list should modify the first one.
In an itemized or numbered list, we have been taking the item marker as a dependent of the head of the contentful list item. This appears to be better than the alternative.
mark
: marker
A marker is the word introducing a clause subordinate to another clause. For a complement clause, this will typically be that or whether. For an adverbial clause, the marker is typically a preposition like before or a subordinating conjunction fulfilling a similar role like while or although. The mark is a dependent of the subordinate clause head.
The infinitive marker to is analyzed as a mark.
When a noun or a verb takes a prepositionally marked non-core argument (modifier) and that modifier is a clause, then we also label that preposition as mark (as it would not seem reasonable to call it case when it is marking a clause). The result will commonly be a doubly marked clause.
mwe
: multi-word expression
The multi-word expression (modifier) relation is used for certain fixed grammaticized expressions with function words that behave like a single function word. Multiword expressions are annotated in a flat, head-initial structure, in which all words in the expression modify the first one using the mwe label.
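The flat, head-initial structure can be sketched as follows. This is an illustration only: the function name flat_mwe_arcs and the (head, dependent) pair layout with 1-based token positions are assumptions of the sketch, not part of any UD tool.

```python
def flat_mwe_arcs(token_ids):
    """Return mwe arcs (head, dependent) for one fixed expression.

    token_ids are the 1-based positions of the expression's words, in
    sentence order; every later word attaches to the first one.
    """
    if len(token_ids) < 2:
        return []
    head = token_ids[0]
    return [(head, dep) for dep in token_ids[1:]]


# "as well as" occupying positions 4, 5 and 6 of some sentence:
arcs = flat_mwe_arcs([4, 5, 6])
for head, dep in arcs:
    print(f"mwe({head}, {dep})")
```

Both later words attach to position 4, so the expression has no internal hierarchy beyond its first word.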
At present, this relation is used inside the following expressions:
as well
as well as
such as
due to (and other forms, such as d t and d/t)
because of (and other forms, such as b c of and b/c of)
instead of
in case
in case of
of course
so that
more than (when used synonymously with “over” in a quantity)
less than (when used synonymously with “under” in a quantity)
up to (when used in quantities)
according to
in order
rather than
at least (when not used for quantities)
as if
prior to
as to
kind of
whether or not
not to mention
as opposed to
let alone
so as to
in between
all but
that is
how come
had better (and ‘d better)
Not mwes
The following are not annotated as mwes, but are instead labeled according to their apparent internal structure.
out of, off of (All double prepositions denoting spatial relations are annotated with two cases on the nominal)
by far
what about
at all
at most, at least (when used for quantities. To determine whether at least should be an mwe or not in borderline cases, substitute it with at most; if the sentence remains grammatical, it should receive its surface analysis)
at best, at worst
what if
so long
name
: name
name is one of the three relations for compounding in UD (together with compound and mwe).
It is used for proper nouns constituted of multiple nominal elements. For example, name would be used between the words of Hillary Rodham Clinton, New York, or Carl XVI Gustaf but not to replace the usual relations in a phrasal or clausal name like The king of Sweden or the novels The Lord of the Rings and Captured By Aliens.
Words joined by name should all be part of a minimal noun phrase; otherwise regular syntactic relations should be used. This is basically similar to the treatment of noun compounds with compound, except that in many cases parts of the name may be another nominal element such as an adjective (United Airlines).
In general, names are annotated in a flat, head-initial structure, in which all words in the name modify the first one using the name label.
For organization names with clear syntactic modification structure, the dependencies should reflect that structure using regular syntactic relations, as in:
In addition, regular syntactic relations are used: (i) for a modifying English determiner or (ii) to connect together the words of a description or name which involve English embedded prepositional phrases, sentences, etc.
If a name contains a function word in a language other than English, we also use the name relation.
neg
: negation modifier
The negation modifier is the relation between a negation word and the word it modifies. It is used both for predicate negation (canonically, not) and nominal negation (canonically no). Dependents labeled neg in the current treebank are the following (in various lowercase/uppercase forms): n, n’t, neither, never, no, non, not, nt, t.
nmod
: nominal modifier
The nmod relation is used for nominal modifiers of nouns or clausal predicates. nmod is a noun functioning as a non-core (oblique) argument or adjunct. In English, nmod is used
- for prepositional complements (including datives and partitives):
The nmod relation holds between the noun/predicate modified by the prepositional complement and the noun introduced by the preposition.
- for ’s genitives:
Nominal modifiers not marked by a preposition or ’s genitive are tagged nmod:npmod, a subtype of nmod. Temporal nominal modifiers are also marked with a separate relation nmod:tmod. See the definitions of these relations.
nmod:npmod
: noun phrase as adverbial modifier
This relation is a subtype of the nmod relation, which captures the following cases where something syntactically a noun phrase is used as an adverbial modifier in a sentence:
(i) a measure phrase, which is the relation between the head of an adjectival/adverbial or prepositional phrase and the head of a measure phrase modifying it:
(ii) noun phrases giving an extent to a verb, which are not objects:
(iii) financial constructions involving an adverbial, notably the following construction $5 a share, where the second nominal means “per share”:
(iv) floating reflexives
and (v) certain other absolutive nominal constructions.
A temporal modifier nmod:tmod is a subclass of npmod which is distinguished as a separate relation.
nmod:poss
: possessive nominal modifier
nmod:poss is used for a nominal modifier which occurs before its head in the specifier position used for ’s possessives. It is marked with the case ’s or one of its variant forms. This relation isn’t used for other pre-head modifiers such as noun compounds or quotative phrases.
nmod:tmod
: temporal modifier
A temporal modifier is a subtype of the nmod relation: if the modifier is specifying a time, it is labeled as nmod:tmod.
nsubj
: nominal subject
A nominal subject (nsubj) is a nominal which is the syntactic subject and the proto-agent of a clause. That is, it is in the position that passes typical grammatical tests for subjecthood, and this argument is the more agentive, the do-er, or the proto-agent of the clause. (See csubj for when the subject is clausal. See nsubjpass and csubjpass for when the subject is not the proto-agent argument due to valence changing operations.) This nominal may be headed by a noun, or it may be a pronoun or relative pronoun, or in ellipsis contexts, other things such as an adjective.
The nsubj role is only applied to semantic arguments of a predicate. When there is an empty argument in a grammatical subject position (sometimes called a pleonastic or expletive), it is labeled as expl. If there is then a displaced subject in the clause, as in the English existential there construction, it will be labeled as nsubj.
The governor of the nsubj relation might not always be a verb: when the verb is a copular verb, the root of the clause is the complement of the copular verb, which can be an adjective or noun, including a noun marked by a preposition, as in the examples below.
In English, the nsubj normally precedes the predicate that it depends on, but this need not be the case, both for the displaced subjects of expletive constructions and in other cases of stylistic inversion, such as the example headed by the predicate come below.
nsubjpass
: passive nominal subject
A passive nominal subject is a noun phrase which is the syntactic subject of a passive clause.
nummod
: numeric modifier
A numeric modifier of a noun is any number phrase that serves to modify the meaning of the noun with a quantity.
parataxis
: parataxis
punct
: punctuation
This is used for any piece of punctuation in a clause, if punctuation is being retained in the typed dependencies. By default, punctuation is not retained in the output.
remnant
: remnant in ellipsis
The remnant relation is used to provide a satisfactory treatment of ellipsis (in the case of gapping and stripping, where a predicational or verbal head gets elided) without having to postulate empty nodes in the basic representation. This is something that was lacking in earlier versions of SD and provides a basis for being able to reconstruct dependencies in the enhanced representation of SD.
USD adopts an analysis that notes that in ellipsis a remnant corresponds to a correlate in a preceding clause. The remnant relation connects each remnant to its correlate in the basic dependency representation. This is then a sufficient representation to reconstruct the predicate-argument structure in the enhanced representation.
Even in the more complex example below, the remnant relations enable us to correctly retrieve the subjects and objects in the clauses with an elided verb.
Note in particular that (unlike for conj), remnant uses a chaining analysis where each subsequent remnant depends on the immediately preceding remnant/correlate. The reason for this is that otherwise in a sentence with 2 or more chained ellipses the dependency structure would no longer track which remnants go together. It would become impossible to determine whether Mary won silver and Sandy gold, or Mary won gold and Sandy silver.
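The argument for chaining can be made concrete with a small sketch. It assumes a sentence along the lines of "John won bronze, Mary silver, and Sandy gold"; the first clause with John and bronze is an assumption, since only the Mary and Sandy clauses are named in the text. Arcs are (head, dependent) pairs over word strings, the chains are assumed to be acyclic, and the function name group_remnants is hypothetical.

```python
def group_remnants(remnant_arcs):
    """Group remnants by their distance from the original correlate.

    remnant_arcs is a list of (head, dependent) pairs in which each
    remnant depends on the immediately preceding remnant or correlate.
    Returns {depth: [words]}; depth 0 holds the correlates themselves.
    """
    parent = {dep: head for head, dep in remnant_arcs}

    def depth(word):
        steps = 0
        while word in parent:  # walk the chain back to the correlate
            word = parent[word]
            steps += 1
        return steps

    words = []
    for head, dep in remnant_arcs:  # first-seen order, no duplicates
        for w in (head, dep):
            if w not in words:
                words.append(w)

    groups = {}
    for w in words:
        groups.setdefault(depth(w), []).append(w)
    return groups


# Chained analysis: each remnant attaches to the preceding one.
chained = [("John", "Mary"), ("Mary", "Sandy"),
           ("bronze", "silver"), ("silver", "gold")]
# Flat alternative: every remnant attaches to the first correlate.
flat = [("John", "Mary"), ("John", "Sandy"),
        ("bronze", "silver"), ("bronze", "gold")]

print(group_remnants(chained))
print(group_remnants(flat))
```

With chaining, each depth holds exactly one subject and one object, so Mary pairs with silver and Sandy with gold. Under the flat alternative all four remnants land at depth 1, and the pairing can no longer be read off the structure.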
Instances of stripping typically occur when there is only one argument in the second clause, but with an accompanying adverbial modifier such as not or only. We model these sentences with the remnant relation as well.
Sometimes in these constructions adverbials will be “sprouted”, and have no correlate in the preceding clause. In such a situation, the adverbial should attach to one of the remnants; in principle it shouldn’t matter which remnant it attaches to, since all remnants at a particular depth of embedding point back to the same semantic event (which the adverbial is a part of). However, to enforce a regular system, the adverbial should depend on the nearest leftmost dependent.
The remnant relation is used when no predicational material is present. In contrast, in right-node-raising (RNR) and VP-ellipsis constructions in which some kind of predicational or verbal material is still present, the remnant relation is not used. In RNR, the verbs are coordinated and the object is a dobj of the first verb:
In VP-ellipsis, we keep the auxiliary as the head, as shown below:
reparandum
: overridden disfluency
We use reparandum to indicate disfluencies overridden in a speech repair. The disfluency is the dependent of the repair.
root
: root
The root grammatical relation points to the root of the sentence. A fake node “ROOT” is used as the governor. The ROOT node is indexed with “0”, since the indexation of real words in the sentence starts at 1.
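The indexing convention can be illustrated with a list of governor indices. This is a minimal sketch: the list layout, the function name find_root, and the requirement of exactly one root per sentence are assumptions of the example, and "She is nice" is an illustrative sentence.

```python
def find_root(heads):
    """Return the 1-based index of the word attached to ROOT.

    heads[i] is the governor index of word i + 1; 0 stands for the
    artificial ROOT node. The sketch assumes exactly one root.
    """
    roots = [i + 1 for i, head in enumerate(heads) if head == 0]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


# "She is nice": nice (word 3) is the root; She and is depend on it.
heads = [3, 3, 0]
root = find_root(heads)
print(f"root(ROOT-0, word-{root})")
```

Because real words start at 1, a governor value of 0 can never be confused with an actual token.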
vocative
: vocative
The vocative relation is used to mark a dialogue participant addressed in text (common in emails and newsgroup postings). The relation links the addressee’s name to its host sentence.
xcomp
: open clausal complement
An open clausal complement (xcomp) of a verb or an adjective is a predicative or clausal complement without its own subject. The reference of the subject is necessarily determined by an argument external to the xcomp (normally by the object of the next higher clause, if there is one, or else by the subject of the next higher clause). These complements are always non-finite, and they are complements (arguments of the higher verb or adjective) rather than adjuncts/modifiers, such as a purpose clause. The name xcomp is borrowed from Lexical-Functional Grammar.