POS tags
| Open class words | Closed class words | Other |
|---|---|---|
| ADJ | ADP | PUNCT |
| ADV | AUX | SYM |
| INTJ | CONJ | X |
| NOUN | DET | |
| PROPN | NUM | |
| VERB | PART | |
| | PRON | |
| | SCONJ | |
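As a rough illustration, the grouping in the table can be mirrored as a small lookup structure. This is only a sketch; the names `UPOS_CLASSES` and `word_class` are hypothetical and do not come from any UD tool.

```python
# Grouping of the tags from the table above.
# UPOS_CLASSES and word_class are hypothetical names used for illustration only.
UPOS_CLASSES = {
    "open": ["ADJ", "ADV", "INTJ", "NOUN", "PROPN", "VERB"],
    "closed": ["ADP", "AUX", "CONJ", "DET", "NUM", "PART", "PRON", "SCONJ"],
    "other": ["PUNCT", "SYM", "X"],
}


def word_class(tag):
    """Return 'open', 'closed' or 'other' for a tag listed in the table."""
    for class_name, tags in UPOS_CLASSES.items():
        if tag in tags:
            return class_name
    raise ValueError(f"unknown tag: {tag}")


print(word_class("NOUN"))  # open
print(word_class("PRON"))  # closed
```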
ADJ
: adjective
Definition
Adjectives are words that typically modify nouns and specify their properties or attributes. They may also function as predicates, as in
Машина зеленая “The car is green.”
The ADJ tag is intended for ordinary adjectives only. See DET for determiners and NUM for cardinal numerals.
In accord with the UD approach, adjectival ordinal numerals (первый, седьмой, стопятидесятый) are tagged as adjectives, although traditional grammar classifies them as numerals. They behave like adjectives both morphologically and syntactically, except that they cannot be compared or negated.
Most Russian adjectives inflect for ru-feat/Gender (большой – большое – большая) “big”, ru-feat/Number (большой – большие), ru-feat/Case (большой – большого – большому – большим – большом), ru-feat/Degree (большой – больше – наибольший) and Negation (большой – небольшой).
Examples
- большой “big”
- старый “old”
- зеленый “green”
- студенческий, учительский “student’s, teacher’s” (possessive adjectives)
- первый, второй, третий “first, second, third”
- сделанный “done” (passive participial adjective)
- делающий “doing” (present participial adjective, derived from present transgressive)
- сделавший “having done” (past participial adjective, derived from past transgressive)
Border cases
Passive participles lie on the border between verbs and adjectives.
Core participial forms (ending in a consonant or short vowel) are tagged VERB.
Long forms are participial adjectives and they are tagged ADJ.
For example:
- Verb: писан, писано, писана, писаны “written”
- Adjective: писанный, писанное, писанная, писанные “written”
Only true participles (verbs) can be used to form the passive voice (but it may be sometimes difficult to distinguish from copula constructions, see AUX). On the other hand, the participial adjectives inflect for case and thus can modify nouns.
There is an analogy with some adjectives that have preserved so-called nominal (short) forms, although these adjectives are not derived from verbs. Example:
- Short (nominal) forms: стар, старо, стара “old”
- Normal (pronominal) forms: старый, старое, старая “old”
Here both groups are ADJ. The nominal forms are used in predication, the standard forms both in predication and to modify nouns.
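The border cases above can be summarised as a lookup over the example forms. The sketch below covers only the forms listed in this section; `EXAMPLE_TAGS` and `tag_example` are hypothetical names, and a lookup table is not a general method for telling participles from adjectives.

```python
# Example forms taken from the border cases above.
SHORT_PARTICIPLES = ["писан", "писано", "писана", "писаны"]            # true passive participles
LONG_PARTICIPLES = ["писанный", "писанное", "писанная", "писанные"]    # participial adjectives
SHORT_ADJECTIVES = ["стар", "старо", "стара"]                          # nominal (short) forms
LONG_ADJECTIVES = ["старый", "старое", "старая"]                       # pronominal (long) forms

# Hypothetical lookup table: only short passive participles are VERB.
EXAMPLE_TAGS = {}
for form in SHORT_PARTICIPLES:
    EXAMPLE_TAGS[form] = "VERB"
for form in LONG_PARTICIPLES + SHORT_ADJECTIVES + LONG_ADJECTIVES:
    EXAMPLE_TAGS[form] = "ADJ"


def tag_example(form):
    """Return the tag of one of the listed example forms, or None if unlisted."""
    return EXAMPLE_TAGS.get(form.lower())


print(tag_example("писан"))     # VERB
print(tag_example("писанный"))  # ADJ
print(tag_example("стар"))      # ADJ
```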
ADP
: adposition
Definition
Russian has only prepositions; it has no postpositions or circumpositions. Prepositions occur before a complement noun phrase (noun, pronoun) and form a single structure with the complement to express its grammatical and semantic relation to another unit within a clause.
Some prepositions take the form of fixed multiword expressions, e.g. по сравнению с “in comparison with”, в связи с “in connection with”. The component words are then still tagged according to their basic use (по is ADP, сравнению is NOUN, etc.) and their status as multiword expressions is accounted for in the syntactic annotation.
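A minimal sketch of what word-by-word tagging of such an expression looks like. The tags for по and сравнению are given in the text; tagging с as ADP is an assumption based on its basic use as a preposition, and the helper name `show` is hypothetical.

```python
# The fixed expression "по сравнению с" ("in comparison with"), tagged word by word.
# ADP for "с" is an assumption (its basic use is as a preposition).
tokens = [("по", "ADP"), ("сравнению", "NOUN"), ("с", "ADP")]


def show(tagged):
    """Render (form, tag) pairs in a compact form/TAG notation."""
    return " ".join(f"{form}/{tag}" for form, tag in tagged)


print(show(tokens))  # по/ADP сравнению/NOUN с/ADP
```

The multiword status itself is not visible at this level; as noted above, it is recorded in the syntactic annotation.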
Examples
- в “in, at”
- к “to”
- на “on, at”
ADV
: adverb
Definition
Adverbs are words that typically modify verbs for such categories as time, place, direction or manner. They may also modify adjectives and other adverbs, as in очень важно “very important” or вероятно неправильно “probably wrong”.
There is a closed subclass of pronominal adverbs that refer to circumstances in context, rather than naming them directly; similarly to pronouns, these can be categorized as interrogative, relative, demonstrative etc. Pronominal adverbs also get the ADV part-of-speech tag but they are differentiated by additional features.
Note that Russian transgressives (also called adverbial participles) are tagged VERB, not ADV.
Examples
- очень “very”
- хорошо “well”
- точно “exactly”
- завтра “tomorrow”
- вниз, наверх “down, up”
- ordinal numeral adverbs: впервые “for the first time”
- multiplicative numeral adverbs: однажды, дважды, трижды “once, twice, three times”
- interrogative adverbs: где, куда, когда, как, почему “where, where to, when, how, why”
- demonstrative adverbs: здесь, там, сейчас, потом, так “here, there, now, then, so”
- indefinite adverbs: где-то, куда-то, когда-то, как-то “somewhere, to somewhere, sometime, somehow”
- total adverbs: везде, всегда “everywhere, always”
- negative adverbs: нигде, никогда “nowhere, never”
AUX
: auxiliary verb
Definition
The only truly auxiliary verb in Russian is быть “to be”. It accompanies the lexical verb of a verb phrase and expresses grammatical distinctions not carried by the lexical verb.
Examples
- Future tense. A finite future form of быть is combined with the infinitive of the lexical verb. The auxiliary expresses person, number and tense: буду делать “I will do”, будешь делать “you will do”, будут делать “they will do”. Note that a limited set of verbs can form future morphologically, without the auxiliary.
- Conditional mood. The conditional form of быть (historically an aorist), бы, is combined with the past participle of the lexical verb. The form бы is invariable, so person is recovered from the subject; the participle expresses gender and number: сделал бы “I would do.Masc”, сделала бы “I would do.Fem”, сделали бы “we would do.Masc”.
- Passive voice. A form of быть (in various tenses and moods or in the infinitive) is combined with the passive participle of the lexical verb. The auxiliary expresses person, number, tense (past and future) and mood, the participle expresses gender, number and voice: будет сделан “he will be done”, был сделан “he was done”, был бы сделан “he would be done”.
Modal verbs are not auxiliaries
Russian modal verbs are not considered auxiliary and they are tagged VERB.
Their behavior is only slightly different from other content verbs.
CONJ
: coordinating conjunction
Definition
A coordinating conjunction is a word that links words or larger constituents without syntactically subordinating one to the other and expresses a semantic relationship between them.
For subordinating conjunctions, see SCONJ.
Examples
- и “and”
- или “or”
- но “but”
DET
: determiner
Definition
Determiners are words that modify nouns or noun phrases and express the reference of the noun phrase in context. That is, a determiner may indicate whether the noun is referring to a definite or indefinite element of a class, to a closer or more distant element, to an element belonging to a specified person or thing, to a particular number or quantity, etc.
An important point to note is that the traditional grammar of Russian does not define determiners as a separate word class. Russian does not have articles. Most determiners are traditionally called pronouns; that is, a UD-conformant annotation of Russian must distinguish between substantive pronouns (UD tag PRON) and attributive pronouns (UD tag DET).
Examples
- possessive determiners: мой, твой, его, её, наш, ваш, их “my, your, his, her, our, your, their”
- reflexive possessive determiner: свой “one’s own”
- demonstrative determiners: этот as in Я видела эту машину вчера. “I saw this car yesterday.”
- interrogative determiners: какой as in Какая машина тебе нравится? “Which car do you like?”
- relative determiners: который as in Мне интересно, которая машина тебе нравится. “I wonder which car you like.”
- relative possessive determiner: чей “whose”
- indefinite determiners: некоторый
- total determiners: каждый
- negative determiners: никакой as in У нас не осталось никаких машин. “We have no cars available.”
INTJ
: interjection
Definition
An interjection is a word that is used most often as an exclamation or part of an exclamation. It typically expresses an emotional reaction, is not syntactically related to other accompanying expressions, and may include a combination of sounds not otherwise found in the language.
Examples
(Note that no direct translation of interjections is possible. The approximate translations below are for orientation purposes and they cannot serve to judge the part of speech from the English perspective.)
- ах “oh”
- ого “wow”
- ну “well”
- ради бога “for God’s sake”
NOUN
: noun
Definition
Nouns are a part of speech typically denoting a person, place, thing, animal or idea.
The NOUN tag is intended for common nouns only. See PROPN for proper nouns and PRON for pronouns.
Russian nouns have the lexical feature ru-feat/Gender. Furthermore, the nouns inflect for ru-feat/Number and ru-feat/Case.
A verbal noun can be derived productively from almost every verb (e.g. есть “to eat” → поедание “eating”). While in other languages a corresponding form may be called a gerund and tagged VERB, in Russian it is tagged NOUN. It always has neuter gender and the full number-case inflectional paradigm.
Examples
- девочка “girl”
- кошка “cat”
- дерево “tree”
- воздух “air”
- красота “beauty”
- плавание “swimming”
NUM
: numeral
Definition
A numeral is a word, functioning most typically as a determiner, adjective or pronoun, that expresses a number and a relation to the number, such as quantity, sequence, frequency or fraction.
Note that cardinal numerals are covered by NUM whether they are used as determiners or not (as in Windows 7) and whether they are expressed as words (четыре), digits (4) or Roman numerals (IV).
Russian grammar distinguishes several subclasses of pronominal numerals (quantifiers):
interrogative and relative (сколько “how many”);
demonstrative (столько “this many”);
indefinite (несколько “several”).
These words behave similarly to (most) cardinal numbers,
e.g. they require that the counted noun phrase be in genitive.
They are not similar to adjectives (unlike their English counterparts).
However, in accord with the UD standard, they should be tagged DET, not NUM.
In addition, several types of (non-pronominal) numerals, such as ordinal numerals and multiplicative numerals, are tagged ADJ or ADV, based on their syntactic and morphological behavior.
Examples
- 0, 1, 2, 3, 4, 5, 2014, 1000000, 3.14159265359
- I, II, III, IV, V, MMXIV
- один, два, три, четыре, пять, семьдесят “one, two, three, four, five, seventy”
- половина, треть, четверть “half, third, quarter”: denominators of fractions constitute a separate class of cardinal numerals.
- четверо, пятеро “four, five” (These are special forms, so-called generic numerals. They are used rarely, in literary or archaic style.)
Counterexamples
- первый, второй, третий “first, second, third”: adjectival ordinal numerals. They are tagged ADJ, and the ru-feat/NumType feature reveals their semantic relation to numbers.
- впервые “for the first time”: adverbial ordinal numerals. They are tagged ADV, and the ru-feat/NumType feature reveals their semantic relation to numbers.
- однажды, дважды, трижды “once, twice, three times”: multiplicative numerals. They are tagged ADV, and the ru-feat/NumType feature reveals their semantic relation to numbers.
- двое, трое, четверо, пятеро “twofold, three kinds of, four kinds of, five kinds of”: generic numerals. They are tagged ADJ.
- пара, тройка, четверка “pair, triplet, foursome”: n-tuples are not considered numerals in Russian grammar. They are tagged NOUN.
- единица, двойка, тройка, четверка, пятерка “number one, number two, number three, number four, number five”: names of numbers, or of objects identified by the number (e.g. of a bus route). They are not considered numerals and they are tagged NOUN.
- тысяча, миллион, биллион, триллион “thousand, million, billion, trillion”: words for large quantities are ambiguous between cardinal numerals (tagged NUM) and nouns. If they inflect as nouns, they are tagged NOUN; but the borderline is fuzzy. For instance, in phrases like тысячи людей вышли на улицы (“thousands of people took to the streets”), тысячи is a noun. In numeric expressions, e.g. 110 тысяч долларов (“110 thousand dollars”), it is a cardinal numeral.
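The examples and counterexamples above can be sketched as a small tagging helper. This is an illustration under stated assumptions: `numeral_tag`, `LISTED`, `DIGITS` and `ROMAN` are hypothetical names, the word list contains only unambiguous items from this section, and the Roman-numeral pattern is deliberately loose (it does not check that the numeral is well formed).

```python
import re

# Unambiguous words from the examples and counterexamples above.
LISTED = {
    "один": "NUM", "два": "NUM", "три": "NUM",
    "четыре": "NUM", "пять": "NUM", "семьдесят": "NUM",
    "первый": "ADJ", "второй": "ADJ", "третий": "ADJ",    # adjectival ordinals
    "впервые": "ADV",                                      # adverbial ordinal
    "однажды": "ADV", "дважды": "ADV", "трижды": "ADV",    # multiplicatives
    "пара": "NOUN", "единица": "NOUN", "двойка": "NOUN",   # n-tuples, names of numbers
}

DIGITS = re.compile(r"\d+(\.\d+)?")   # 4, 2014, 3.14159265359
ROMAN = re.compile(r"[IVXLCDM]+")     # loose: any run of Roman-numeral letters


def numeral_tag(token):
    """Return the tag for a number-like token, or None if it is not covered."""
    if DIGITS.fullmatch(token) or ROMAN.fullmatch(token):
        return "NUM"
    return LISTED.get(token.lower())


print(numeral_tag("2014"))    # NUM
print(numeral_tag("IV"))      # NUM
print(numeral_tag("первый"))  # ADJ
print(numeral_tag("дважды"))  # ADV
```

Words such as тысяча are left out on purpose, because their tag depends on context.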
PART
: particle
Definition
Particles are function words that must be associated with another word or phrase to impart meaning and that do not satisfy definitions of other universal parts of speech (e.g. adpositions, coordinating conjunctions, subordinating conjunctions or auxiliary verbs). Particles may encode grammatical categories such as negation, mood, tense etc. Russian particles are not inflected.
Note that response words such as да “yes”, нет “no”, etc. are considered particles in the PDT tagset but they should be retagged as interjections under the UD standard.
Examples
- Sentence modality: пусть (“May you have an enjoyable stay!”)
- же “just, only”
- аж “as late as, even, up to”. Use case: Мне сегодня аж пять писем пришло. “Today I received as many as five letters.”
PRON
: pronoun
Definition
Pronouns are words that substitute for nouns or noun phrases, whose meaning is recoverable from the linguistic or extralinguistic context.
Pronouns under this definition function like nouns. Note that Russian grammar traditionally extends the term pronoun to words that substitute for adjectives. Such words are not tagged PRON under our universal scheme. They are tagged as determiners in order to annotate the same thing the same way across languages.
For instance, это “this” is traditionally called a pronoun in Russian grammar, regardless of context (the notion of determiners does not exist in Russian grammar). To make the annotation parallel across languages, it should now be tagged PRON in Я видел это вчера. “I saw this yesterday.” and DET in Я видел эту машину вчера. “I saw this car yesterday.”
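The contrast between the two sentences can be written out as tagged token lists. This is a sketch only: the tags of это and эту follow the text, the tags of the remaining words are assumptions based on the tag definitions on this page, and the names `PRONOUN_USE`, `DETERMINER_USE` and `tags_for` are hypothetical.

```python
# "Я видел это вчера." with substantive это.
PRONOUN_USE = [
    ("Я", "PRON"), ("видел", "VERB"), ("это", "PRON"),
    ("вчера", "ADV"), (".", "PUNCT"),
]

# "Я видел эту машину вчера." with attributive эту modifying машину.
DETERMINER_USE = [
    ("Я", "PRON"), ("видел", "VERB"), ("эту", "DET"), ("машину", "NOUN"),
    ("вчера", "ADV"), (".", "PUNCT"),
]


def tags_for(sentence, forms):
    """Collect the tags assigned to the given word forms in a tagged sentence."""
    return [tag for form, tag in sentence if form in forms]


print(tags_for(PRONOUN_USE, {"это"}))     # ['PRON']
print(tags_for(DETERMINER_USE, {"эту"}))  # ['DET']
```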
Examples
- personal pronouns: я, ты, он, она, оно, мы, вы, они “I, you, he, she, it, we, you, they”
- reflexive pronouns: себе, сам “oneself”
- demonstrative pronouns: это as in Я видел это вчера. “I saw this yesterday.”
- interrogative pronouns: кто, что “who, what” as in Что ты думаешь? “What do you think?”
- relative pronouns: кто, что “who, what” as in Мне интересно, что ты думаешь. “I wonder what you think.”
- indefinite pronouns: кто-то, что-то “somebody, something”
- total pronouns: каждый, все “everybody, all”
- negative pronouns: никто, ничто “nobody, nothing”
PROPN
: proper noun
Definition
A proper noun is a noun that is the name of a specific individual, place, or object. Russian proper nouns are always written starting with an uppercase letter.
Single-word named entities should be tagged PROPN even though they originate from a common noun (Грязь, a village) or an adjective (Белая, a river). Even if they were originally adjectives and inflect according to adjectival paradigms, they behave syntactically as nouns. For instance, Белая (a river in Bashkortostan) is originally the feminine form of the adjective белый “white” but as a geographical name, it is a noun. It denotes a concrete location (rather than a property of somebody/something) and its feminine gender is fixed (while adjectives have forms in all three genders).
Note that names of languages (русский, английский) and adjectives derived from geographical names (русский, английский “Russian, English”) are written in lowercase and are not tagged PROPN.
Personal names are typically treated as a sequence of proper nouns (one or more given names and one or more surnames). If the name contains prepositions, conjunctions or articles (foreign names), these are tagged as ADP, CONJ and DET, respectively.
Russian (and other Slavic) multi-word named entities have internal syntactic structure, which is preserved in the annotation. The headword is always a noun and there may be other nouns involved. They will be tagged either PROPN or NOUN and possible ambiguities must be resolved individually.
Modifying adjectives are never tagged PROPN. Even if an adjective is the first word of a multi-word name, and thus it starts with an uppercase letter, it is still tagged ADJ.
Similarly, function words in named entities retain their normal tags.
These rules are less strict for foreign named entities where the original part of speech is not transparent to a Russian speaker.
Examples
- Белая.ADJ река.NOUN “White River”. Even though the two words together are the name of a particular river, река is a common noun and is tagged as such.
- Организация.NOUN Объединенных.ADJ Наций.NOUN “United Nations Organization” consists of three words, none of which is a proper noun. However, the acronym ООН “UNO” is a single-token name and is tagged PROPN.
PUNCT
: punctuation
Definition
Punctuation marks are non-alphabetical characters and character groups used to delimit linguistic units in printed text.
Punctuation is not taken to include logograms such as $, %, and §, which are instead tagged as SYM.
Examples
- Period: .
- Comma: ,
- Parentheses: ()
SCONJ
: subordinating conjunction
Definition
A subordinating conjunction is a conjunction that links constructions by making one of them a constituent of the other. The subordinating conjunction typically marks the incorporated constituent which has the status of a (subordinate) clause.
For coordinating conjunctions, see CONJ.
Examples
- что “that”
- если “if”
- как “as”
- чем “than”
SYM
: symbol
Definition
A symbol is a word-like entity that differs from ordinary words by form, function, or both.
Many symbols are or contain special non-alphanumeric characters, similarly to punctuation. What makes them different from punctuation is that they can be substituted by normal words. This involves all currency symbols, e.g. $ 75 is identical to seventy-five dollars.
Mathematical operators form another group of symbols.
Another group of symbols is emoticons and emoji.
Strings that consist entirely of alphanumeric characters are not symbols but they may be proper nouns: 130XE, DC10; others may be tagged PROPN (rather than SYM) even if they contain special characters: DC-10.
Similarly, abbreviations for single words are not symbols but are assigned the part of speech of the full form. For example, Mr. (mister), kg (kilogram), km (kilometer), dr (doctor) should be tagged as nouns.
Acronyms for proper names such as ООН and NATO should be tagged as proper nouns.
Characters used as bullets in itemized lists (•, ‣) are not symbols; they are punctuation.
Examples
- $, %, §, ©
- +, −, ×, ÷, =, <, >
- :), ♥‿♥, 😝
- john.doe@universal.org, http://universaldependencies.org/, 1-800-COMPANY
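The split between SYM and PUNCT for single characters can be sketched as a set lookup. The sets below contain only characters named on this page; `PUNCT_MARKS`, `SYM_MARKS` and `mark_tag` are hypothetical names, and the lookup makes no attempt to handle emoticons, e-mail addresses or URLs.

```python
# Characters listed on this page as punctuation (including list bullets).
PUNCT_MARKS = {".", ",", "(", ")", "•", "‣"}

# Logograms and mathematical operators listed on this page as symbols.
SYM_MARKS = {"$", "%", "§", "©", "+", "−", "×", "÷", "=", "<", ">"}


def mark_tag(token):
    """Return 'PUNCT' or 'SYM' for a listed character, or None otherwise."""
    if token in PUNCT_MARKS:
        return "PUNCT"
    if token in SYM_MARKS:
        return "SYM"
    return None


print(mark_tag("$"))  # SYM
print(mark_tag(","))  # PUNCT
print(mark_tag("•"))  # PUNCT
```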
VERB
: verb
Definition
A verb is a member of the syntactic class of words that typically signal events and actions, can constitute a minimal predicate in a clause, and govern the number and types of other constituents which may occur in the clause.
Note that the VERB tag covers main verbs (content verbs), modal verbs and copulas but it does not cover auxiliary verbs, for which there is the AUX tag. (Russian modal verbs are not considered auxiliary.) See the description of AUX for more information on the borderline between VERB and AUX.
Russian verbs can take the following morphological forms:
- Infinitive (this is the citation form)
- Finite verb (indicative and imperative forms; conditional is constructed periphrastically)
- Past participle (used to construct past and conditional)
- Passive participle (used to construct passive voice; also used separately as an adjective)
- Transgressive (also called adverbial participle)
There are participial forms that are tagged as adjectives (ADJ) rather than verbs. See below for examples.
A verbal noun can be derived productively from almost every verb (e.g. есть “to eat” → поедание “eating”). While in other languages a corresponding form may be called a gerund and tagged VERB, in Russian it is tagged NOUN. It always has the neuter ru-feat/Gender and it inflects for ru-feat/Number and ru-feat/Case.
Examples
- рисовать “to draw”
- рисую, рисуешь, рисует, рисуем, рисуете, рисуют “I draw, you draw, he/she/it draws, we draw, you draw, they draw”
- рисуй, рисуйте “draw” (imperative in different persons and numbers)
- рисовал, рисовала, рисовало, рисовали “drew” (past participle in different genders and numbers)
- рисован, рисована, рисовано, рисованы “drawn” (passive participle in different genders and numbers)
- рисуя “drawing” (present transgressive)
Border cases
There are passive participles as verb forms (VERB) and participial adjectives (ADJ). For example:
- Verb: рисован, рисована, рисовано, рисованы “drawn”
- Adjective: рисованный, рисованная, рисованное, рисованные “drawn”
Their meaning is almost identical but the usage slightly varies. Both groups can be used in nominal predication with copula. Only true participles (verbs) can be used to form the passive voice (but it may be sometimes difficult to distinguish from copula constructions, see AUX). On the other hand, the participial adjectives inflect for case and thus can modify nouns.
There is an analogy with some adjectives that have preserved so-called nominal (short) forms, although these adjectives are not derived from verbs. Example:
- Short (nominal) forms: стар, стара, старо “old”
- Normal (pronominal) forms: старый, старая, старое “old”
Here both groups are ADJ. The nominal forms are used in predication, the standard forms both in predication and to modify nouns.
X
: other
Definition
The tag X is used for words that for some reason cannot be assigned a real part-of-speech category.
A special usage of X is for cases of code-switching where it is not possible (or meaningful) to analyze the intervening language grammatically (and where the dependency relation foreign is typically used in the syntactic analysis). This rarely applies to the PDT data where many foreign words are tagged with their original part of speech.
Even if foreign words are tagged X, this usage does not extend to ordinary loan words which should be assigned a normal part-of-speech. For example, in Он надел килт “He put on a kilt”, килт is an ordinary NOUN.
Examples
- И потом он просто xfgh pdl jklw “And then he just xfgh pdl jklw”