
Features

Features cs

Lexical features: PronType, NumType, +NumForm, +NumValue, Poss, Reflex, +NameType, +AdpType, +ConjType, +Abbr, +Hyph, +Foreign, +Style
Inflectional features (nominal): Gender, Animacy, Number, Case, +PrepCase, +Variant, Definite, Degree, +Gender[psor], +Number[psor]
Inflectional features (verbal): VerbForm, Mood, Tense, Aspect, Voice, Person, Negative

Abbr: abbreviation

Boolean feature. Is this an abbreviation? Note that the abbreviated word typically belongs to a part of speech other than cs-pos/X.

Yes: it is an abbreviation

Examples

  • Acronyms: ČR (Česká republika)  “Czech Republic”, LN (Lidové noviny)  (a newspaper), ODS (Občanská demokratická strana)  “Civic Democratic Party”, OSN (Organizace spojených národů)  “United Nations Organization”, ODA (Občanská demokratická aliance)  “Civic Democratic Alliance”
  • Initials: J, M, V, A, C
  • Abbreviations: r. (rok)  “year”, např. (například)  “for example”, tzv. (takzvaný)  “so-called”, a. s. (akciová společnost)  “joint-stock company”, tel. (telefon)  “phone”

AdpType: adposition type

Czech has neither postpositions nor circumpositions but there are several forms of prepositions that this feature distinguishes.

Prep: (normal) preposition

Examples

  • v “in”, na “on”, o “about”, z “of”, s “with”, do “into”, k “to”, pro “for”, za “behind”, po “after”

Voc: vocalized preposition

Some Czech prepositions are non-syllabic and their form has to be changed in some contexts to facilitate pronunciation. Moreover, some syllabic prepositions are altered too, if the following word starts with certain consonant clusters.

Examples

The first line shows examples of vocalized preposition forms, the second line shows corresponding base forms.

  • ve, se, ze, ke, ode, beze, ku, skrze, přede, nade
  • v, s, z, k, od, bez, k, skrz, před, nad
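The pairing of the two example lines can be written down as a lookup table. The following is only a sketch: the helper name `base_form` is hypothetical, the table contains just the ten forms listed above, and it is not a complete inventory of Czech vocalized prepositions.

```python
# Forms copied from the two example lines above; the pairing is positional.
VOCALIZED = ["ve", "se", "ze", "ke", "ode", "beze", "ku", "skrze", "přede", "nade"]
BASE = ["v", "s", "z", "k", "od", "bez", "k", "skrz", "před", "nad"]

# Map each vocalized form to its base form.
VOCALIZED_TO_BASE = dict(zip(VOCALIZED, BASE))


def base_form(preposition: str) -> str:
    """Return the base form of a vocalized preposition (hypothetical helper).

    Forms that are not in the table are returned unchanged.
    """
    return VOCALIZED_TO_BASE.get(preposition.lower(), preposition)
```

Note that both ke and ku map to the same base form k, so the mapping cannot be inverted without extra context.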

Comprep: dependent part of compound preposition

This value marks the dependent first part of a compound preposition. This word cannot occur alone. Not all compound prepositions contain words marked Comprep. Many compound prepositions consist of two normal prepositions and a noun (an example is na rozdíl od “in contrast to”). Sometimes there are just two words: the second is a normal preposition and the first is a secondary preposition (etymologically some other part of speech that has been frozen as a preposition).

Examples

  • vzhledem k(e) “due to”, nehledě na “regardless of”, narozdíl od “in contrast to”

Animacy: animacy

Similarly to Gender, animacy is a lexical feature of nouns and an inflectional feature of other parts of speech that mark agreement with nouns. It is independent of gender, therefore it is encoded separately in some tagsets (e.g. all the Multext-East tagsets). On the other hand, in Czech (almost) the only grammatical implications occur within the masculine gender, which is why the PDT tagset does not have animacy as a separate feature and instead defines four genders: masculine animate, masculine inanimate, feminine and neuter.

Anim: animate

Human beings, animals, fictional characters, names of professions etc. are all animate. Even nouns that are normally inanimate can be inflected as animate if they are personified. For instance, consider a children’s story about cars where cars live and talk as people; then the cars may become and be inflected as animates.

PDT examples of masculine animate nouns:

  • člověk  “man”, ministr  “minister”, prezident  “president”, předseda  “chairman”, ředitel  “director”

Inan: inanimate

Nouns that are not animate are inanimate.

PDT examples of masculine inanimate nouns:

  • rok  “year”, zákon  “law”, stát  “state”, případ  “case”, milión  “million”

Aspect: aspect

Aspect is a feature that specifies duration of the action in time, whether the action has been completed etc.

In Czech, aspect is considered a lexical feature of verbs. While many imperfective verbs have morphologically related perfective counterparts, it is not a regular system and the two verbs are represented by different lemmas.

Imp: imperfect aspect

The action took / takes / will take some time span and there is no information whether and when it was / will be completed.

Examples

  • péci  “to bake” (Imp); pekl chleba  “he baked / was baking bread”

Perf: perfect aspect

The action has been / will have been completed. Since there is emphasis on one point on the time scale (the point of completion), this aspect does not work well with the present tense. Czech morphology can create present forms of perfective verbs but these actually have a future meaning.

Examples

  • upéci  “to bake” (Perf); upekl chleba  “he baked / has baked bread”

Diffs

Prague Dependency Treebank

The PDT tagset does not encode aspect. However, verb lemmas in PDT contain their own features that encode the aspect: _:T = Imp and _:W = Perf. These lemma features were removed during conversion, and the Aspect feature was introduced instead.

Unfortunately the morphological lexicon underlying the PDT annotation is incomplete and numerous verbs lack the aspect information. Without this imperfection there would be only a tiny group of verbs that work with both aspects.
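The conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the actual conversion script: the function name `split_aspect` is hypothetical, and the example lemma strings assume that the marker is simply appended to the lemma. Only the two markers named in the text, _:T and _:W, are handled.

```python
import re

# PDT lemma markers named in the text above and the Aspect values they map to.
ASPECT_MARKERS = {"T": "Imp", "W": "Perf"}
ASPECT_MARKER_RE = re.compile(r"_:([TW])")


def split_aspect(pdt_lemma: str):
    """Strip aspect markers from a PDT-style lemma (hypothetical helper).

    Returns the cleaned lemma and the list of Aspect values found. The list
    is empty when the lexicon carries no aspect information for the verb.
    """
    aspects = [ASPECT_MARKERS[m] for m in ASPECT_MARKER_RE.findall(pdt_lemma)]
    return ASPECT_MARKER_RE.sub("", pdt_lemma), aspects
```

Returning a list rather than a single value leaves room for both the verbs that lack aspect information and the small group that works with both aspects.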


Case: case

Case is an inflectional feature of nouns and other parts of speech (adjectives, numerals) that mark agreement with nouns. It is also a valency feature of prepositions (saying that the preposition requires its argument to be in that case).

Case helps specify the role of the noun phrase in the sentence. For example, the nominative and accusative cases often distinguish subject and object of the verb, while in fixed-word-order languages these functions would be distinguished merely by the positions of the nouns in the sentence.

Czech morphology distinguishes seven cases: Nom, Gen, Dat, Acc, Voc, Loc and Ins (this ordering is fixed in the grammar and the cases are also referred to by numbers 1–7).

Examples

  • singular: nominative matka “mother”, genitive matky, dative matce, accusative matku, vocative matko, locative matce, instrumental matkou
  • plural: nominative matky, genitive matek, dative matkám, accusative matky, vocative matky, locative matkách, instrumental matkami
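The fixed case ordering and the example paradigm can be captured in a small table. This is a minimal sketch: the helper names are hypothetical, and the only data used are the matka forms listed above.

```python
# The seven cases in their fixed order; position + 1 is the traditional case number.
CASES = ["Nom", "Gen", "Dat", "Acc", "Voc", "Loc", "Ins"]

# Paradigm of matka "mother", copied from the examples above.
MATKA = {
    "Sing": ["matka", "matky", "matce", "matku", "matko", "matce", "matkou"],
    "Plur": ["matky", "matek", "matkám", "matky", "matky", "matkách", "matkami"],
}


def case_number(case: str) -> int:
    """Return the traditional number (1 to 7) of a case."""
    return CASES.index(case) + 1


def form(number: str, case: str) -> str:
    """Look up the word form for a given Number and Case."""
    return MATKA[number][CASES.index(case)]


def analyses(word_form: str):
    """List every (Number, Case) pair that the given word form can express."""
    return [
        (number, case)
        for number, forms in MATKA.items()
        for case, candidate in zip(CASES, forms)
        if candidate == word_form
    ]
```

Looking up a form such as matky or matce with `analyses` returns several pairs, which illustrates that a single word form is often ambiguous between cases.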

The descriptions of the individual case values below include semantic hints about the prototypical meaning of the case. Bear in mind that quite often a case will be used for a meaning that is totally unrelated to the meaning mentioned here. Valency of verbs, adpositions and other words will determine that the noun phrase must be in a particular grammatical case to fill a particular valency slot (semantic role).

Nom: nominative

The base form of the noun, also used as citation form (lemma). This is the word form used for subjects of clauses.

Gen: genitive

The prototypical meaning of the genitive is that the noun phrase somehow belongs to its governor; it would often be translated by the English preposition of.

Note that despite considerable semantic overlap, the genitive case is not the same as the feature of possessivity (Poss). Possessivity is a lexical feature, i.e. it applies to the lemma and its whole paradigm. Genitive is a feature of just a subset of the word forms of the lemma. The semantics of possessivity is much more clearly defined, while the genitive (like many other cases) may be required in situations that have nothing to do with possession. For example, bez prezidentovy dcery  “without the president’s daughter” is a prepositional phrase containing the preposition bez  “without”, the possessive adjective prezidentovy  “president’s” and the noun dcery  “daughter”. The possessive adjective is derived from the noun prezident  but it is really an adjective (with a separate lemma and paradigm), not just a form of the noun. In addition, both the adjective and the noun are in their genitive forms (the nominative would be prezidentova dcera). There is nothing possessive about this particular occurrence of the genitive. It is there because the preposition bez  always requires its argument to be in the genitive.

Examples

  • Praha je hlavní město České republiky “Prague is the capital of the Czech Republic.”

Dat: dative

This is the word form often used for indirect objects of verbs.

Examples

  • Dal jsem dárek svému bratrovi “I gave my brother a present.” (svému bratrovi  “my brother” is dative and dárek  “present” is accusative.)

Acc: accusative

Perhaps the second most widely spread morphological case. This is the word form most frequently used for direct objects of verbs.

Voc: vocative

The vocative case is a special form of noun used to address someone. Thus it predominantly appears with animate nouns (see the feature of Animacy). Nevertheless this is not a grammatical restriction and inanimate things can be addressed as well.

Examples

  • Co myslíš, Filipe “What do you think, Filip?”

Loc: locative

The locative case often expresses location in space or time, which gave it its name. As elsewhere, non-locational meanings also exist and they are not rare. On the other hand, some location roles may be expressed using other cases (e.g. because those cases are required by a preposition).

This is the only Czech case that is used exclusively in combination with prepositions.

Examples

  • V červenci jsem byl ve Švédsku “In July I was in Sweden.”
  • Mluvili jsme tam o morfologii “We talked there about morphology.” (Non-locational non-temporal example)

Ins: instrumental

The role from which the name of the instrumental case is derived is that the noun is used as an instrument to do something (as in psát perem  “to write using a pen”). Many other meanings are possible; for example, the instrumental is required by the preposition “with” and thus it includes the meaning expressed in other languages by the comitative case.

In Czech the instrumental is also used for the agent-object in passive constructions (cf. the English preposition by).

Examples

  • Tento zákon byl schválen vládou “This bill has been approved by the government.” (Passive example)

ConjType: conjunction type

This feature further subclassifies the parts of speech cs-pos/CONJ and cs-pos/SCONJ; in Czech, it is used only with CONJ. The main distinction between coordinating and subordinating conjunctions is done already at the part-of-speech level.

Oper: mathematical operator

Note that operators can be expressed either using symbols or using words. The words are considered a special kind of coordinating conjunction and they are marked using ConjType=Oper.

Examples

  • x “×”, krát “times”, plus “plus”, minus “minus”, kráte “times”

Degree: degree of comparison

Degree of comparison is an inflectional feature of some adjectives and adverbs.

Pos: positive, first degree

This is the base form that merely states a quality of something, without comparing it to qualities of others. Note that although this degree is traditionally called “positive”, negative properties can be compared, too.

Examples

  • mladý muž  “young man”

Cmp: comparative, second degree

The quality of one object is compared to the same quality of another object.

Examples

  • ten muž je mladší než já  “the man is younger than me”

Sup: superlative, third degree

The quality of one object is compared to the same quality of all other objects within a set.

Examples

  • toto je nejmladší muž v našem týmu  “this is the youngest man in our team”

Foreign: is this a foreign word?

Boolean feature. Is this a foreign word? Not a loan word but a genuinely foreign word appearing inside native text, e.g. inside direct speech, titles of books etc.

Note that Czech data (especially those from the PDT) often indicate the original part of speech of foreign words. Thus this feature may occur with any POS tag. If the original part of speech is not known, the feature will accompany the cs-pos/X tag.

Foreign: it is foreign

Examples

  • … nese jméno VLIW (Very Long Instruction Word – velmi dlouhé instrukční slovo)

Fscript: it is foreign and written in a foreign script

Examples

  • V nepálštině se hora jmenuje सगरमाथा “In Nepali, the mountain is called सगरमाथा.”

Tscript: it is foreign and transcribed from a foreign script

Examples

  • Výše uvedené nepálské slovo lze přepsat jako Sagaramāthā “The above Nepali word can be transcribed Sagaramāthā.”
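The difference between a word in a foreign script and a transcribed one can be illustrated with a rough script check. This is only a sketch built on Python's standard `unicodedata` module: the function name is hypothetical, and treating every non-Latin letter as a foreign script is an assumption made for Czech text. It cannot tell Foreign from Tscript, nor decide whether a Latin-script word is foreign at all.

```python
import unicodedata


def uses_non_latin_script(word: str) -> bool:
    """Return True if any letter in the word is not a Latin-script letter.

    The test relies on Unicode character names, which begin with LATIN for
    Latin-script letters, including letters with diacritics.
    """
    for ch in word:
        if ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN"):
            return True
    return False
```

With this check, सगरमाथा is flagged as written in a non-Latin script, while Sagaramāthā, VLIW and ordinary Czech words with diacritics are not.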

Diffs

Prague Dependency Treebank

PDT does not contain words in foreign scripts (what it does contain are foreign letters based on the Latin script), and transcriptions from foreign scripts are not explicitly marked, hence the values Fscript and Tscript do not appear in the converted PDT data.

For proper nouns the borderline between foreign words and loan words is somewhat fuzzy, so e.g. the English personal name George  is marked as foreign even though it would not normally be translated (except for names of rulers and saints, which would become Jiří).

Articles in foreign names (the, die, le)  are tagged cs-pos/ADJ, not cs-pos/DET.


Gender: gender

Gender is a lexical feature of nouns and inflectional feature of other parts of speech (adjectives, verbs) that mark agreement with nouns. There are three values of gender: masculine, feminine, and neuter.

See also the related feature of Animacy.

Masc: masculine gender

Nouns denoting male persons are masculine. Other nouns may be also grammatically masculine, without any relation to sex.

Examples

  • pán  “gentleman”
  • hrad  “castle”
  • muž  “man”
  • stroj  “machine”
  • předseda  “chairman”
  • soudce  “judge”

Fem: feminine gender

Nouns denoting female persons are feminine. Other nouns may be also grammatically feminine, without any relation to sex.

Examples

  • žena  “woman”
  • růže  “rose”
  • píseň  “song”
  • kost  “bone”

Neut: neuter gender

This third gender is for nouns that are neither masculine nor feminine (grammatically). Nouns whose nominative suffix is -o  or -í  (including a large group of deverbative nouns denoting actions) are usually neuter.

Examples

  • město  “city”
  • moře  “sea”
  • kuře  “chicken”
  • stavení  “building”

Gender[psor]: possessor’s gender

Possessive adjectives and pronouns may have two different genders: that of the possessed object (gender agreement with modified noun) and that of the possessor (lexical feature, inherent gender). The Gender[psor] feature captures the possessor’s gender.

In the Czech examples below, the masculine gender implies using one of the suffixes -ův, -ova, -ovo, and the feminine gender implies using one of -in, -ina, -ino.

Masc: masculine possessor

Examples

  • otcův syn “father’s son” Gender[psor]=Masc|Gender=Masc
  • otcova dcera “father’s daughter” Gender[psor]=Masc|Gender=Fem
  • otcovo dítě “father’s child” Gender[psor]=Masc|Gender=Neut

Fem: feminine possessor

Examples

  • matčin syn “mother’s son” Gender[psor]=Fem|Gender=Masc
  • matčina dcera “mother’s daughter” Gender[psor]=Fem|Gender=Fem
  • matčino dítě “mother’s child” Gender[psor]=Fem|Gender=Neut
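The feature strings in these examples use the Name=Value|Name=Value notation. A minimal parser might look like the sketch below; the function name `parse_feats` is hypothetical, and treating an underscore as "no features" is an assumption based on the usual CoNLL-U convention rather than on anything stated here.

```python
def parse_feats(feats: str) -> dict:
    """Parse a Name=Value|Name=Value string into a dictionary.

    Layered feature names such as Gender[psor] are kept as plain keys, so the
    possessor's gender and the agreement gender stay separate entries.
    """
    if feats in ("", "_"):
        return {}
    # Split on the first "=" only, so the value is kept intact.
    return dict(pair.split("=", 1) for pair in feats.split("|"))
```

For matčino dítě the parsed result keeps Gender[psor]=Fem (the mother) apart from Gender=Neut (agreement with dítě).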

Hyph: hyphenated compound or part of it

Boolean feature. Is this the first part of a hyphenated compound?

Compound adjectives with hyphens, such as česko-slovenský  “Czech-Slovak” get split during tokenization. The last part, slovenský,  is an independent adjective with full inflection paradigm. However, the first part, česko,  is a form that does not occur elsewhere than in compounds (the independent form would be český).

Yes: it is part of hyphenated compound

Examples

  • česko-slovenský  “Czecho-Slovak”

Mood: mood

Mood is a feature that expresses modality and subclassifies finite verb forms.

Ind: indicative

The indicative can be considered the default mood. A verb in indicative merely states that something happens, has happened or will happen, without adding any attitude of the speaker.

Examples

  • Studuješ na univerzitě.  “You study at the university.”

Imp: imperative

The speaker uses imperative to order or ask the addressee to do the action of the verb.

Czech verbs (except for modal verbs) have imperative forms of the second person singular, first person plural and second person plural.

Examples

  • Studuj na univerzitě!  “Study at the university!”

Cnd: conditional

The conditional mood is used to express actions that would have taken place under some circumstances but they actually did not / do not happen.

Czech has present conditional and past conditional, both formed periphrastically using the past participle of the content verb, and a special form of the auxiliary verb být. The special form is historically aorist tense, but the tense does not exist in modern Czech, so the auxiliary form is better described by Mood=Cnd.

The past participle of the content verb is not marked as conditional because it can also be used in past indicative.

Examples

  • Kdybych byl chytrý, studoval bych na univerzitě.  “If I were smart I would study at the university.”

NameType: type of named entity

Classification of named entities (token-based, no nesting of entities etc.). The feature applies mainly to the cs-pos/PROPN tag; in multi-word foreign names, adjectives may also have this feature (they preserve the ADJ tag but at the same time they would not exist in Czech otherwise than in the named entity).

Conversion from the Prague Dependency Treebank

Lemmas in PDT contain features that also encode types of named entities. When converting the PDT annotation to UD, these lemma features are removed and the feature NameType is added to the universal features to preserve the type.

The following table lists the name types together with the most frequent examples. See http://ufal.mff.cuni.cz/techrep/tr27.pdf, page 8, section 2.1 (Lemma structure) for more details.

PDT lemma feature | Name type | Most frequent examples | English
_;Y | given name | Jan, Jiří, Václav, Petr, Josef | “Jan, Jiří, Václav, Petr, Josef”
_;S | surname | Klaus, Havel, Němec, Jelcin, Svoboda | “Klaus, Havel, Němec, Yeltsin, Svoboda”
_;E | member of a particular nation, inhabitant of a particular territory | Němec, Čech, Srb, Američan, Slovák | “German, Czech, Serbian, American, Slovak”
_;G | geographical name | Praha, ČR, Evropa, Německo, Brno | “Prague, CR, Europe, Germany, Brno”
_;K | company, organization, institution | ODS, OSN, Sparta, ODA, Slavia | “ODS, UN, Sparta, ODA, Slavia”
_;R | product | LN, Mercedes, Tatra, PC, MF | “LN, Mercedes, Tatra, PC, MF”
_;m | other proper name: names of mines, stadiums, guerilla bases etc. | US, PVP, Prix, Rapaport, Tour | “US, PVP, Prix, Rapaport, Tour”

Geo: geographical name

Names of cities, countries, rivers, mountains etc.

Examples

  • Praha  “Prague”, Kostelec nad Černými lesy, Německo  “Germany”

Prs: name of person

This value is used if it is not known whether it is a given or a family name, but it is known that it is a personal name.

Giv: given name of person

Given name (not family name). This is usually the first name in European and American names. In Chinese names, the last two syllables (of three) are usually the given name.

Examples

  • Jan, Jiří, Václav

Sur: surname / family name of person

Family name (surname). This is usually the last name in European and American names. In Chinese names, the first syllable (of three) is usually the surname.

Examples

  • Klaus, Havel, Němec

Nat: nationality

Name denoting a member of a particular nation, or inhabitant of a particular territory. This does not include derived adjectives, nor nouns denoting languages (both groups are written in lowercase). Thus Čech  “Czech [man]” belongs here but český  “Czech” and čeština  “Czech [language]” do not.

Examples

  • Čech  “Czech”, Němec  “German”, Pražan  “Praguer”

Com: company, organization

Pro: product

Oth: other

Names of stadiums, guerilla bases, events etc.
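The PDT lemma features and the NameType values in this section can be paired up by their descriptions. The sketch below does that. The pairing is inferred from the matching descriptions rather than quoted from a conversion script, the function name `split_name_type` is hypothetical, and the example lemma strings assume that the marker is simply appended to the lemma. The value Prs has no counterpart among the listed lemma features and is therefore not produced.

```python
import re

# Lemma feature letter -> NameType value, paired by matching descriptions.
NAME_TYPES = {
    "Y": "Giv",  # given name
    "S": "Sur",  # surname
    "E": "Nat",  # member of a nation, inhabitant of a territory
    "G": "Geo",  # geographical name
    "K": "Com",  # company, organization, institution
    "R": "Pro",  # product
    "m": "Oth",  # other proper name
}
NAME_MARKER_RE = re.compile(r"_;([A-Za-z])")


def split_name_type(pdt_lemma: str):
    """Strip a name-type marker from a PDT-style lemma (hypothetical helper).

    Returns the cleaned lemma and the NameType value, or the unchanged lemma
    and None when no known marker is present.
    """
    match = NAME_MARKER_RE.search(pdt_lemma)
    if match is None or match.group(1) not in NAME_TYPES:
        return pdt_lemma, None
    cleaned = pdt_lemma[:match.start()] + pdt_lemma[match.end():]
    return cleaned, NAME_TYPES[match.group(1)]
```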


Negative: whether the word can be or is negated

In Czech, negation is mostly done using the bound morpheme ne-, and an independent negating particle (equivalent to English “not”) is rarely seen. Words that can take the morpheme of negation have the feature of negativeness.

It applies to verbs, adjectives, sometimes also adverbs and even nouns. (Most nouns have just Negative=Pos; deverbative nouns can have also Negative=Neg.)

Note that Negative=Neg is not the same thing as PronType=Neg. For pronouns and other pronominal parts of speech there is no such binary opposition as for verbs and adjectives. (There is no such thing as “affirmative pronoun”.)

Pos: positive, affirmative

Examples

  • přišel  “he came”
  • rozumný  “wise”
  • pěkně  “nicely”
  • přijetí  “acceptance”

Neg: negative

Examples

  • nepřišel  “he did not come”
  • nerozumný  “unwise”
  • nepěkně  “nastily”
  • nepřijetí  “non-acceptance, rejection”

NumForm: numeral form

Feature of cardinal and ordinal numbers. Is the number expressed by digits or as a word?

Word: number expressed as word

Examples

  • jeden “one”, dva “two”, tři “three”

Digit: number expressed using digits

Examples

  • 1, 2, 3

Roman: roman numeral

Examples

  • I, II, III
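The three values can be told apart by the surface shape of the token. The following is a sketch only: the function name `num_form` is hypothetical, and it assumes that the token is already known to be a numeral, since a form such as I could equally be an initial.

```python
import re

# Uppercase Roman numerals in standard subtractive notation.
ROMAN_RE = re.compile(r"M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")


def num_form(token: str) -> str:
    """Guess the NumForm value of a token that is known to be a numeral."""
    if token.isdigit():
        return "Digit"
    # The pattern also matches the empty string, hence the explicit check.
    if token and ROMAN_RE.fullmatch(token):
        return "Roman"
    return "Word"
```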

NumType: numeral type

Czech has a complex system of numerals. For example, in the school grammar of Czech, the main part of speech is “numeral”, it includes almost everything where counting is involved and there are various subtypes. It also includes interrogative, relative, indefinite and demonstrative quantifiers (words like kolik  “how many”, tolik  “so many”, několik  “several”), so at the same time we may have a non-empty value of PronType.

From the syntactic point of view, some numeral types behave like adjectives and some behave like adverbs. We tag them cs-pos/ADJ and cs-pos/ADV respectively. Thus the NumType feature applies to several different parts of speech.

Card: cardinal number or corresponding interrogative / relative / indefinite / demonstrative word

Examples

  • jeden, dva, tři  “one, two, three”
  • kolik  “how many”
  • několik  “several”, mnoho  “many”, málo  “few”
  • tolik  “so many”

Ord: ordinal number or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adjective or adverb.

Adjectival examples

  • první  “first”; druhý  “second”, třetí  “third”
  • kolikátý  lit. how manieth  “which rank”
  • několikátý  “some rank”
  • tolikátý  “this/that rank”

Adverbial examples

  • poprvé  “for the first time”; podruhé  “for the second time”; potřetí  “for the third time”
  • pokolikáté  “for which time”
  • poněkolikáté  “for x-th time”
  • potolikáté  “it has been so many times”

Mult: multiplicative numeral or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adverb.

Examples

  • jednou  “once”; dvakrát  “twice”; třikrát  “three times”
  • kolikrát  “how many times”
  • několikrát  “several times”
  • tolikrát  “so many times”

Frac: fraction

This is a subtype of cardinal numbers. It may denote a fraction or just the denominator of the fraction.

Examples

  • půl / polovina  “half”; třetina  “one third”; čtvrt / čtvrtina  “quarter”

Sets: number of sets of things

Morphologically distinct class of numerals used to count sets of things, or nouns that are pluralia tantum.

Examples

  • dvoje / troje boty  “two / three [pairs of] shoes”; as opposed to normal cardinal numbers: dvě / tři boty  “two / three shoes”

Gen: generic numeral, i.e. a numeral that is none of the above

Czech school grammar distinguishes this subclass, which is why it appears in Czech tagsets. (Note that “generic numerals” in Czech grammar also include the Sets subclass mentioned above.)

Examples

  • čtvero, patero, desatero  (specific forms of four, five, ten; they are morphologically, syntactically and stylistically distinct from the default forms čtyři, pět, deset)
  • dvojí, trojí, čtverý  (twofold, threefold, fourfold; these are morphologically and syntactically adjectives)

NumValue: numeric value

In Czech, number “one” agrees with the counted noun in Gender, Number and Case. Number “two” agrees in gender and case and numbers “three” and “four” agree in case. These numerals behave similarly to adjectives. Numbers “five”, “six” etc. behave differently. If the case of the counted phrase is genitive, dative, locative or instrumental, the numeral agrees in case with the noun. However, if the case of the whole phrase is nominative, accusative or vocative, then the numeral dictates that the noun is in genitive. This behavior is similar to nouns modified by other nouns in genitive. (Note that this is why in the Czech PDT some numeral nodes are annotated as governing nouns instead of modifying them.) In addition, the whole phrase (number + counted noun) together behaves as neuter singular (this is important for subject-verb agreement).

Specific behavior of low-value numerals is the reason why there is a separate feature to mark these numerals.
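The case rule described above can be written as a small function. This is a sketch under simplifying assumptions: it covers only the behavior stated in the text for simple numerals, it ignores gender and number agreement as well as compound numerals, and the function names are hypothetical.

```python
# Cases of the whole phrase in which numerals from five up put the noun in genitive.
GENITIVE_TRIGGER_CASES = {"Nom", "Acc", "Voc"}


def num_value(n: int):
    """Return the NumValue feature for a numeric value, or None if unmarked."""
    if n == 1:
        return "1"
    if n == 2:
        return "2"
    if n in (3, 4):
        return "3"  # the value 3 covers both three and four
    return None


def counted_noun_case(n: int, phrase_case: str) -> str:
    """Return the case of the counted noun, given the case of the whole phrase."""
    if n >= 5 and phrase_case in GENITIVE_TRIGGER_CASES:
        # The numeral dictates genitive on the noun.
        return "Gen"
    # Otherwise the numeral and the noun agree in case.
    return phrase_case
```

The function makes the split visible: for values 1 to 4 the noun always carries the case of the phrase, while from 5 up it does so only in the genitive, dative, locative and instrumental.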

1: numeric value 1

  • jeden, jedna, jedno “one”

2: numeric value 2

  • dva, dvě “two”

3: numeric value 3 or 4

  • tři “three”, čtyři “four”

Number: number

Number is an inflectional feature of nouns and other parts of speech (adjectives, verbs) that mark agreement with nouns.

Sing: singular number

A singular noun denotes one person, animal or thing.

Examples

  • starý muž přišel  “an old man came”
  • mladá žena přišla  “a young woman came”
  • malé kuře přišlo  “a small chicken came”

Plur: plural number

A plural noun denotes several persons, animals or things.

Examples

  • staří muži přišli  “old men came”
  • mladé ženy přišly  “young women came”
  • malá kuřata přišla  “small chickens came”

Dual: dual number

A dual noun denotes two objects. The dual number has almost vanished from Czech with the exception of special instrumental case suffixes for body parts that occur in pairs, and any adjectives that modify them.

Examples

The noun noha  means either “leg” of a human, or of a table. Dual is used for the former and plural for the latter:

  • holka s dlouhýma nohama  “a girl with long legs”
  • stůl s dlouhými nohami  “a table with long legs”

The numeral sto  “hundred” has also a special form of plural that is actually the dual:

  • dvě stě  “two hundred”
  • tři sta  “three hundred”

Ptan: plurale tantum

Some nouns appear only in the plural form even though they denote one thing (semantic singular); some tagsets mark this distinction. Grammatically they behave like plurals, so Plur is obviously the back-off value here; however, the non-existence of a singular form sometimes means that the gender is unknown. In Czech, a special type of numeral is used when counting nouns that are pluralia tantum (NumType=Sets).

Examples

  • nůžky, kalhoty  “scissors, pants”

Coll: collective / mass / singulare tantum

Collective or mass or singulare tantum is a special case of singular. It applies to words that use grammatical singular to describe sets of objects, i.e. semantic plural. Although in theory they might be able to form plural, in practice it would be rarely semantically plausible. Sometimes, the plural form exists and means “several sorts of” or “several packages of”.

Examples

  • lidstvo  “mankind”

Diffs

Prague Dependency Treebank

The PDT tagset does not distinguish Ptan from Plur and Coll from Sing, therefore this distinction is not being made in the converted data.


Number[psor]: possessor’s number

Possessives may have two different numbers: that of the possessed object (number agreement with modified noun) and that of the possessor. The Number[psor] feature captures the possessor’s number.

Sing: singular possessor

Examples

  • můj pes “my dog” Number[psor]=Sing|Number=Sing
  • moji psi “my dogs” Number[psor]=Sing|Number=Plur

Plur: plural possessor

Examples

  • náš pes “our dog” Number[psor]=Plur|Number=Sing
  • naši psi “our dogs” Number[psor]=Plur|Number=Plur

Person: person

Person is a feature of personal and possessive pronouns, and of verbs. On verbs it is in fact an agreement feature that marks the person of the verb’s subject. Person marked on verbs makes it unnecessary to always add a personal pronoun as subject and thus subjects are sometimes dropped (Czech is a pro-drop language).

1: first person

In singular, the first person refers just to the speaker / author. In plural, it must include the speaker and one or more additional persons.

Examples

  • dělám  “I do”
  • děláme  “we do”

2: second person

In singular, the second person refers to the addressee of the utterance / text. In plural, it may mean several addressees and optionally some third persons too.

Examples

  • děláš  “you.Sing do”
  • děláte  “you.Plur do”

3: third person

The third person refers to one or more persons that are neither speakers nor addressees.

Examples

  • dělá  “he/she/it does”
  • dělají  “they do”

Poss: possessive

Boolean feature of pronouns, determiners or adjectives. It tells whether the word is possessive.

While many tagsets would have “possessive” as one of the various pronoun types, this feature is intentionally separate from PronType, as it is orthogonal to pronominal types. Several of the pronominal types can be optionally possessive, and adjectives can too.

Yes: it is possessive

Note that there is no No value. If the word is not possessive, the Poss feature will just not be mentioned in the FEAT column. (Which means that empty value has the No meaning.)

Examples

  • possessive personal pronouns/determiners: můj, tvůj, jeho, její, náš, váš, jejich  “my, your, his, her, our, your, their”
  • possessive reflexive pronoun/determiner: svůj  “one’s own”
  • possessive relative pronoun/determiner: jehož  “whose”
  • possessive adjectives: otcův  “father’s”, matčin  “mother’s”

PrepCase: case form sensitive to prepositions

Some personal pronouns have different forms depending on whether they are objects of prepositions or not.

Default empty value means that the word form is neutral w.r.t. prepositions.

Npr: non-prepositional case

This word form must not be used after a preposition.

Examples

  • jeho, jemu, jím “him” (Gen/Acc, Dat, Ins)

Pre: prepositional case

This word form must be used after a preposition.

Examples

  • něho, němu, něm, ním “him” (Gen/Acc, Dat, Loc, Ins)
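The choice between the two sets of forms can be sketched as a lookup keyed by PrepCase and Case. The table contains only the forms of “him” listed in the examples, it reads the case labels positionally (the first form serving both genitive and accusative), and the function name `pronoun_form` is hypothetical.

```python
# Forms of "him" from the examples above, keyed by PrepCase value and Case.
HIM_FORMS = {
    "Npr": {"Gen": "jeho", "Acc": "jeho", "Dat": "jemu", "Ins": "jím"},
    "Pre": {"Gen": "něho", "Acc": "něho", "Dat": "němu", "Loc": "něm", "Ins": "ním"},
}


def pronoun_form(case: str, after_preposition: bool):
    """Pick the listed form of "him" for a case and a syntactic position.

    Returns None when the examples list no form for that combination.
    """
    prep_case = "Pre" if after_preposition else "Npr"
    return HIM_FORMS[prep_case].get(case)
```

The locative has an entry only under Pre, which fits the fact that the Czech locative is used exclusively with prepositions.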

PronType: pronominal type

This feature typically applies to pronouns, determiners, pronominal numerals (quantifiers) and pronominal adverbs.

Prs: personal or possessive personal pronoun or determiner

See also the Poss feature that distinguishes normal personal pronouns from possessives. Note that Prs also includes reflexive personal/possessive pronouns (e.g. se / svůj; see the Reflex feature).

Examples

  • já, ty, on, ona, ono, my, vy, oni, ony, ona, se  “I, you, he, she, it, we, you, they, they, they, oneself”
  • můj, tvůj, jeho, její, náš, váš, jejich, svůj  “my, your, his/its, her, our, your, their, one’s own”

Int: interrogative pronoun, determiner, numeral or adverb

Note that possessive interrogative determiners (whose) can be distinguished by the Poss feature.

Examples:

  • kdo  “who”
  • co  “what”
  • jaký  “what kind of”
  • který  “which”
  • čí  “whose”
  • kolik  “how many”
  • kolikátý  “how-maniest” (ordinal number)
  • kolikrát  “how many times”
  • kde  “where”
  • kam  “where to”
  • kdy  “when”
  • jak  “how”
  • proč  “why”

Rel: relative pronoun, determiner, numeral or adverb

Note that this class heavily overlaps with interrogatives, yet there are pronouns that are only relative.

Examples:

  • jenž, což  “which, that” (relative but not interrogative pronouns)
  • jehož  “whose” (possessive relative pronoun)

Dem: demonstrative pronoun, determiner, numeral or adverb

These are to some extent parallel to interrogatives.

Examples

  • tento  “this”
  • tamten  “that”
  • takový  “such”
  • týž  “same”
  • tolik  “so many”
  • tolikátý  “so-maniest” (ordinal number)
  • tolikrát  “so many times”
  • tady  “here”
  • tam  “there”
  • teď  “now”
  • tehdy  “then”
  • tak  “so”

Tot: total (collective) pronoun, determiner or adverb

Examples

  • každý  “every, everybody, everyone, each”
  • všechno  “everything, all”
  • všude  “everywhere”
  • vždy  “always”

Neg: negative pronoun, determiner or adverb

Examples

  • nikdo  “nobody”
  • nic  “nothing”
  • nijaký  “no (kind)”
  • ničí  “no one’s”
  • žádný  “no, none”
  • nikde  “nowhere”
  • nikam  “(to) nowhere”
  • nikdy  “never”
  • nijak  “no way” (lit. “no-how”)

Ind: indefinite pronoun, determiner, numeral or adverb

Examples

  • někdo  “somebody”; kdokoli  “anybody”; málokdo  “few people”; leckdo  “quite a few people”; kdosi  “somebody”
  • něco  “something”; cokoli  “anything”; máloco  “few things”; lecco  “quite a few things”; cosi  “something”
  • nějaký  “some kind of”; jakýkoli  “any kind of”; lecjaký  “just any”; jakýsi  “some, certain”
  • některý  “some”; kterýkoli  “any”; málokterý  “few”; leckterý  “quite a few”; kterýsi  “some”
  • něčí  “someone’s”; číkoli  “anyone’s”; lecčí  “of quite a few people”; čísi  “someone’s”
  • několik  “several”; málo  “few”; mnoho  “many”
  • několikátý  “severalth” (indefinite ordinal numeral)
  • několikrát  “several times”
  • někde  “somewhere”; kdekoli  “anywhere”; málokde  “few places”; leckde  “quite a few places”; kdesi  “somewhere”
  • někam  “(to) somewhere”; kamkoli  “(to) anywhere”; kamsi  “(to) somewhere”
  • někdy  “sometimes”; kdykoli  “anytime”; málokdy  “few times”; leckdy  “quite a few times”; kdysi  “once (long ago)”
  • nějak  “somehow”; jakkoli  “anyhow”; lecjak  “quite a few ways”; jaksi  “somehow”

Reflex: reflexive [ ]

Boolean feature of pronouns or determiners. It tells whether the word is reflexive, i.e. refers to the subject of its clause.

In Czech, reflexive pronouns have various functions:

  • Reflexive object of a verb means that the object is the same entity as the subject: Jan si koupil auto  = “Jan bought himself a car” vs. Jan mu koupil auto  = “Jan bought him [someone else] a car”
  • Reflexive object of a verb in plural may also indicate a reciprocal action. This usage of the reflexive pronoun is translated to English as “each other”. Unlike e.g. German, Czech does not have a special reciprocal pronoun and the reflexive pronoun is used instead: Jan a Marie se milují  = “Jan and Mary love each other”
  • Reflexive pronoun in a subjectless clause constitutes so-called reflexive passive: To se napíše zítra  (reflexive passive, the verb is morphologically in active form) vs. To bude napsáno zítra  (normal passive, with auxiliary finite verb and a passive participle) “That will be written tomorrow”
  • Some verbs are mandatorily reflexive, i.e. they never occur without the reflexive pronoun. The pronoun does not alter the meaning in any way, but without it the sentence would not be grammatical: Jan se směje  “Jan laughs”

Reflexive possessives indicate that the subject of the clause is the possessor:

  • Jan prodal své auto.  “Jan sold his [own] car.”
  • Jan prodal jeho auto.  “Jan sold his [someone else’s] car.”

Yes: it is reflexive

Note that there is no No value. If the word is not reflexive, the Reflex feature is simply omitted from the FEATS column, so an absent value means No.

Examples

  • reflexive personal pronouns: se, si, sebe, sobě, sebou (occurs in various cases but not in nominative and vocative; does not distinguish Number)
  • reflexive possessive pronoun: svůj
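Because there is no No value, code that reads the annotation has to treat a missing Reflex feature as “not reflexive”. The sketch below assumes the features arrive as a `Name=Value|Name=Value` string with `_` standing for an empty set, as in CoNLL-U. The helper names `parse_feats` and `is_reflexive` and the sample feature strings are illustrative only.

```python
def parse_feats(feats: str) -> dict:
    """Parse a 'Name=Value|Name=Value' feature string into a dict.

    An empty string or '_' means that no features are set.
    """
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def is_reflexive(feats: str) -> bool:
    """True only if Reflex=Yes is present; absence is read as 'No'."""
    return parse_feats(feats).get("Reflex") == "Yes"


# Illustrative feature strings (not copied from any treebank).
print(is_reflexive("Case=Acc|PronType=Prs|Reflex=Yes"))           # True  (se)
print(is_reflexive("Case=Dat|Number=Sing|Person=3|PronType=Prs"))  # False (mu)
print(is_reflexive("_"))                                           # False
```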

Style: style or sublanguage to which this word form belongs [ ]

This may be a lexical feature (some words-lemmas are archaic, some are colloquial) or a morphological feature (inflectional patterns may systematically change between dialects or styles).

Arch: archaic, obsolete

Examples

  • biblí, bukův, dubův, činějí

Rare: rare

Form: formal, literary

Poet: poetic

Norm: normal, neutral

Coll: colloquial

Examples

  • normal paradigm of hard adjectives: mladý, mladého, mladému, mladém, mladým, mladí, mladých, mladým, mladé, mladými “young”
  • colloquial paradigm of hard adjectives: mladej, mladýho, mladýmu, mladým, mladým, mladý, mladejch, mladejm, mladý, mladejma “young”

Vrnc: vernacular

Slng: slang

Expr: expressive, emotional

Derg: derogative

Vulg: vulgar

Diffs

Prague Dependency Treebank

PDT does not classify lemmas according to style. It marks non-standard inflections. In general, the style feature is used infrequently and only two values are to be expected: Arch and Coll.


Tense: tense [ ]

Tense is a feature that specifies the time when the action took / takes / will take place, in relation to the current moment or to another action in the utterance.

Past: past tense

The past tense denotes actions that happened before the current moment. Past tense in Czech consists of the past participle (also called active participle or l-participle), which is accompanied by a present auxiliary verb in the first and second persons, and stands alone in the third person.

The auxiliary (if any) is in its present form, so it will have Tense=Pres. The participle has Tense=Past, even though it can also be used to form the present conditional.

Examples

  • Šel jsem domů.  “I have gone home.”
  • Šel jsi domů.  “You have gone home.”
  • Šel domů.  “He has gone home.”

Pres: present tense

The present tense denotes actions that are happening right now or that usually happen.

Note that morphologically present forms of perfective verbs actually have a future meaning, but they are still marked Tense=Pres.

Examples

  • Přicházím domů.  “I come / am coming home.” (Přicházet  is an imperfective verb.)
  • Přijdu domů.  “I will come home.” (Přijít  is a perfective verb.)
  • Jdu domů.  “I go / am going home.” (Jít  is an imperfective verb.)

Fut: future tense

The future tense denotes actions that will happen after the current moment. Future tense in Czech is formed in one of three ways, depending on the verb:

  • Present forms of perfective verbs have future meaning. These forms are tagged Tense=Pres, not Tense=Fut (see above).
  • The verb být  “to be” has a set of distinct future forms. They combine a future stem bud  with present suffixes. A small set of verbs (mostly motion verbs) also have future forms. These are formed as the present form (present stem and suffix) with the prefix po-. Although these forms are morphologically very close to the present forms, they are tagged Tense=Fut because the same lemma also has present forms and the feature must distinguish the two.
  • The remaining imperfective verbs have periphrastic future forms, consisting of the future form of the auxiliary být  and the infinitive of the content verb. Only the auxiliary will have Tense=Fut, while there will be no tense information on the infinitive.

Examples

  • Půjdu domů.  “I will go home.” (Jít  is an imperfective verb, phonological rule transformed the prefix po- to pů-.)
  • Budu přicházet domů.  “I will be coming home.” (Přicházet  is an imperfective verb and it forms future periphrastically.)
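The relation between the Tense tag and the actual time reference of finite forms can be summarized in a small function. This is a sketch under assumptions: it covers only present and future forms, it assumes the aspect is available as `Imp` or `Perf` (the values of the Aspect feature), and the function name `time_reference` is made up for illustration.

```python
def time_reference(tense: str, aspect: str) -> str:
    """Map the Tense tag of a finite verb form to its time reference.

    Only Tense=Pres and Tense=Fut are handled, as described above.
    """
    if tense == "Fut":
        # Distinct future forms such as budu or půjdu.
        return "future"
    if tense == "Pres":
        # Present forms of perfective verbs have future meaning.
        return "future" if aspect == "Perf" else "present"
    raise ValueError(f"unsupported Tense value: {tense!r}")


examples = [
    ("přicházím", "Pres", "Imp"),
    ("přijdu", "Pres", "Perf"),
    ("půjdu", "Fut", "Imp"),
    ("budu", "Fut", "Imp"),
]
for form, tense, aspect in examples:
    print(form, f"Tense={tense}", "->", time_reference(tense, aspect))
```

The point of the sketch is that the tag follows morphology, not meaning: přijdu and půjdu both refer to the future, but only půjdu carries Tense=Fut.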

Variant: alternative form of word [ ]

Sometimes there are multiple word forms for the same lemma and set of features. The Variant feature helps distinguish alternate forms.

In Czech there are two groups of words where double forms are regular and worth capturing: short forms of adjectives and short (clitic) forms of personal pronouns. This feature only marks the non-standard short forms, hence there is only one value, Short. For the long standard forms the Variant feature remains unspecified.

Short: short form of adjectives

The short form is called nominal form of adjective (jmenný tvar přídavného jména), as opposed to the long form, which is pronominal because it originated as a combination of a nominal form and a personal pronoun. But this is ancient history of the language. In modern Czech, only a subset of the nominal forms survive, and using them sometimes sounds slightly archaic. They are used as nominal predicates with copula, but they do not appear as premodifiers of nouns. The pronominal forms are considered standard, except for two frequent adjectives that do not have them: třeba, rád.

Examples

  • možno “possible”, schopen “able”, nutno “necessary”, znám “known”, spokojen “satisfied”, povinen “supposed to”, ochoten “willing”, jist “sure”, vědom “knowing”, přítomen “present”, roven “equal”, patrno “apparent”, hotov “finished”, spjat “connected”, vinen “guilty”
  • Long equivalents: možné, schopný, nutné, známý, spokojený, povinný, ochotný, jistý, vědomý, přítomný, rovný, patrné, hotový, spjatý, vinný
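Since Short is the only value, an annotator sets Variant on the short form and leaves it out on the long one. The following is a minimal sketch of that convention, using a few pairs from the examples above; the table `SHORT_TO_LONG` and the function `variant_feature` are hypothetical names, and a real system would decide from morphological analysis and not from a word list.

```python
# Short (nominal) forms paired with their long equivalents, from the examples above.
SHORT_TO_LONG = {
    "možno": "možné",
    "schopen": "schopný",
    "spokojen": "spokojený",
    "ochoten": "ochotný",
    "hotov": "hotový",
    "vinen": "vinný",
}


def variant_feature(form: str) -> dict:
    """Return {'Variant': 'Short'} for a listed short form, {} otherwise.

    Long forms get no Variant feature at all.
    """
    return {"Variant": "Short"} if form in SHORT_TO_LONG else {}


print(variant_feature("schopen"))  # {'Variant': 'Short'}
print(variant_feature("schopný"))  # {}
```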

Short: short (clitic) form of personal pronouns

Some personal pronouns in dative and accusative Case have double forms. The normal (long) form is more independent in terms of positions it can take in word order. The short forms are clitics (http://cs.wikipedia.org/wiki/P%C5%99%C3%ADklonka). They are separate words (unlike in some other languages) but in the word order they usually stick to the second position.

  • mi, mě, ti, tě, mu, ho, si, se
  • mně, mne, tobě, tebe, jemu, jeho, sobě, sebe
  • “me, me, you, you, him, him, oneself, oneself”

VerbForm: form of verb or deverbative [ ]

Even though the name of the feature seems to suggest that it is used exclusively with verbs, this is not the case. The Part value can also be used with adjectives. It distinguishes participles from other verb forms, and participial adjectives from other adjectives.

Fin: finite verb

Rule of thumb: if it has non-empty Mood, it is finite. In Czech this applies to indicative and imperative forms, and to the special conditional forms of the auxiliary verb být.

Examples

  • nesu, neseš, nese, neseme, nesete, nesou  “I carry, you carry, he/she/it carries, we carry, you carry, they carry”
  • nes, nesme, neste  “carry” (imperative in different persons and numbers)
  • jsem, jsi, je, jsme, jste, jsou  “I am, you are, he/she/it is, we are, you are, they are”
  • budu, budeš, bude, budeme, budete, budou  “I will be, you will be, he/she/it will be, we will be, you will be, they will be”
  • bych, bys, by, bychom, byste, by  “I would, you would, he/she/it would, we would, you would, they would”
  • buď, buďme, buďte  “be” (imperative in different persons and numbers)
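The rule of thumb translates directly into a check on the feature set. The sketch below assumes features are held in a plain dict and uses the Mood values `Ind`, `Imp` and `Cnd` for indicative, imperative and conditional. The function name `is_finite` and the sample feature sets are illustrative, not taken from a treebank.

```python
def is_finite(feats: dict) -> bool:
    """Rule of thumb: a verb form with a non-empty Mood is finite."""
    return bool(feats.get("Mood"))


# Illustrative feature sets; only Mood matters for the check.
forms = {
    "nesu": {"Mood": "Ind", "Tense": "Pres"},
    "nes": {"Mood": "Imp"},
    "bych": {"Mood": "Cnd"},
    "nést": {"VerbForm": "Inf"},
    "nesl": {"VerbForm": "Part", "Tense": "Past"},
}
finite_forms = [form for form, feats in forms.items() if is_finite(feats)]
print(finite_forms)  # ['nesu', 'nes', 'bych']
```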

Inf: infinitive

Infinitive is the citation form of verbs. It is also used with the auxiliary být  to form periphrastic future tense, and it appears as the argument of modal and other verbs.

Examples

  • nést  “to carry”
  • být  “to be”

Part: participle

Participle is a non-finite verb form that shares properties of verbs and adjectives. Czech has two types of participles:

  • The past participle (also called active participle or l-participle) is used to form the past tense, and the conditional mood in present or past tense.
  • The passive participle is used to form the passive voice (in any tense or mood).

Participles inflect for Gender and Number but not for Person.

Examples

  • nesl, nesla, neslo, nesli, nesly  “carried” (past participle in different genders and numbers)
  • nesen, nesena, neseno, neseni, neseny  “carried” (passive participle in different genders and numbers)
  • byl, byla, bylo, byli, byly  “was/been” (past participle in different genders and numbers)

Trans: transgressive

The transgressive, also called adverbial participle, is a non-finite verb form that shares properties of verbs and adverbs.

Imperfective verbs form present transgressive, meaning “while doing”.

Perfective verbs form past transgressive, meaning “having done”.

Examples

  • nesa, nesouc, nesouce  “carrying” (present transgressive in different genders and numbers)
  • přines, přinesši, přinesše  “having brought” (past transgressive in different genders and numbers)
  • jsa, jsouc, jsouce  “being” (present transgressive in different genders and numbers)
  • byv, byvši, byvše  “having been” (past transgressive in different genders and numbers)
  • zírali na mne, pevně svírajíce své zbraně  “they stared at me while gripping their guns firmly”
  • udělavši večeři, zavolala rodinu ke stolu  “having prepared the dinner, she called her family to the table”

Voice: voice [ ]

Voice is a feature of verbs that helps map the traditional syntactic functions, such as subject and object, to semantic roles, such as agent and patient.

Act: active voice

The subject of the verb is the doer of the action (agent), the object is affected by the action (patient).

All finite verb forms and the active/past participles are tagged Voice=Act.

Examples

  • Napadli jsme nepřítele.  “We attacked the enemy” (the active participle napadli  can be used to form either past tense or conditional mood; here it forms the past tense.)

Pass: passive voice

The subject of the verb is affected by the action (patient). The doer (agent) is either unexpressed or it appears as an object of the verb.

Only the passive participle is tagged Voice=Pass.

Examples

  • Jsme napadeni nepřítelem.  “We are attacked by the enemy” (the passive participle napadeni  is used to form passive in all tenses; here it forms the present passive.)
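The two tagging statements (finite forms carry Voice=Act; Voice=Pass appears only on the passive participle) can be turned into a simple consistency check. This is a sketch that encodes just those two statements and nothing else; the function name `voice_problems` is hypothetical, and infinitives and transgressives are deliberately left unchecked because the text above does not say how they are tagged.

```python
def voice_problems(feats: dict) -> list:
    """Check a feature set against the two Voice tagging statements above."""
    problems = []
    verb_form = feats.get("VerbForm")
    voice = feats.get("Voice")
    if verb_form == "Fin" and voice != "Act":
        problems.append("finite forms are expected to have Voice=Act")
    if voice == "Pass" and verb_form != "Part":
        problems.append("Voice=Pass is expected only on participles")
    return problems


# napadeni: passive participle, consistent with the description.
print(voice_problems({"VerbForm": "Part", "Voice": "Pass"}))  # []
# A finite form tagged as passive violates both statements.
print(len(voice_problems({"VerbForm": "Fin", "Voice": "Pass"})))  # 2
```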