Features

Features bg

Lexical features: PronType, NumType, Poss, Reflex
Inflectional features (nominal): Gender, Animacy, Number, Case, Definite, Degree
Inflectional features (verbal): VerbForm, Mood, Tense, Aspect, Voice, Person, Negative

Animacy: animacy

Similarly to Gender (and to the African noun classes), animacy is usually a lexical feature of nouns and an inflectional feature of other parts of speech that mark agreement with nouns. It is independent of gender and is therefore encoded separately in some tagsets (e.g. all the Multext-East tagsets).

In the BulTreeBank tagset Animacy is not encoded as a special feature. The dichotomy that plays a role here is rather: Human - Non-human. With very few exceptions, these features are not encoded grammatically.

Anim: animate

The following pronouns can be considered explicitly animate:

  • the masculine accusative forms of some pronouns: Pre-as-m (relative - когото / kogoto “whom”), Pce-as-m (collective - всекиго / vsekigo “everybody”), Pie-as-m (interrogative - кого / kogo “whom”), Pfe-as-m (indefinite - някого / nyakogo “somebody”), Pne-as-m (negative - никого / nikogo “nobody”)
  • some pronouns for quantity of humans: Piy (interrogative - колцина / koltsina “how many”); Pfy# (indefinite - неколцина / nekoltsina “few, some”)
  • the 1st and 2nd personal and possessive pronouns: Ppe#1 (аз, ние / az, nie “I, we”), Ppe#2 (ти, вие / ti, vie “you, you”), Pph#2 (Вие / Vie “you-honorific”); Ps#1# (мой / moy “my”), Ps#2# (твой / tvoy “your”)

Nhum: animate but non-human

Unlike nouns denoting humans, animate non-human nouns have the so-called count form, but only in the masculine. The count form is a kind of plural that is used after numerals.

  • два лъва / dva lava “two lions”

Inan: inanimate

Unlike nouns denoting humans, inanimate nouns also have the so-called count form, but again only in the masculine. The count form is a kind of plural that is used after numerals.

  • три стола / tri stola “three chairs”
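The count-form condition described for Nhum and Inan can be summarised in a few lines of Python. This is only a sketch of the rule as stated here (masculine, non-human, after a numeral); the function name and its arguments are hypothetical and are not part of any BulTreeBank tool.

```python
def plural_form_after_numeral(gender: str, human: bool) -> str:
    """Return the Number value a noun is expected to take after a numeral.

    gender: "Masc", "Fem" or "Neut" (the Gender values used in this document).
    human:  True if the noun denotes a person.
    """
    # Only masculine nouns that do not denote humans have the count form.
    if gender == "Masc" and not human:
        return "Count"
    # All other nouns use the ordinary plural after numerals.
    return "Plur"


# лъв "lion" (masculine, non-human) and стол "chair" (masculine, inanimate)
print(plural_form_after_numeral("Masc", human=False))
```

Under this sketch both лъв and стол come out as Count, which matches два лъва and три стола above.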

Note that the symbol ‘#’, used in the Universal POS section, is a placeholder for an arbitrary number of features that are suppressed in the respective tag because they are irrelevant when the BulTreeBank tagset is mapped to the Universal one.
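One way to read tag patterns such as Ppe#1 or Ps#1# is as simple wildcards. The sketch below assumes that ‘#’ may stand for zero or more characters; that reading, the helper names and the sample tags in the usage lines are illustrative assumptions, not a description of the actual conversion software.

```python
import re


def tag_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Compile a tag pattern in which '#' stands for any run of suppressed features."""
    # Escape the literal pieces and join them with a "match anything" wildcard.
    literal_pieces = [re.escape(piece) for piece in pattern.split("#")]
    return re.compile(".*".join(literal_pieces))


def tag_matches(pattern: str, tag: str) -> bool:
    """True if the whole tag fits the pattern."""
    return tag_pattern_to_regex(pattern).fullmatch(tag) is not None


# Illustrative tags only; they are not taken from the tagset documentation.
print(tag_matches("Ppe#1", "Ppe-os1"))
print(tag_matches("Ppe#1", "Ppe-os2"))
```

A pattern without ‘#’, such as Pre-as-m, then matches only the identical tag.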

Aspect: aspect

Aspect is a feature that specifies duration of the action in time, whether the action has been completed etc.

In Bulgarian aspect is a lexical feature, as in other Slavic languages. It comprises two grammemes: perfective and imperfective.

Imp: imperfect aspect

The action took / takes / will take some time span and there is no information whether and when it was / will be completed.

Examples

  • казвам / kazvam “say”
  • намирам / namiram “find”
  • разбирам / razbiram “understand”

Perf: perfect aspect

The action has been / will have been completed. Since there is emphasis on one point on the time scale (the point of completion), this aspect does not work well with the present tense for actual activities.

Examples

  • кажа / kazha “say”
  • намеря / namerya “find”
  • разбера / razbera “understand”
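Because aspect is lexical in Bulgarian, it has to be looked up per verb rather than derived from an ending. A minimal sketch of such a lookup, using only the six verbs listed above (the dictionary and function names are hypothetical):

```python
# Aspect values for the example verbs in this section.
ASPECT_LEXICON = {
    "казвам": "Imp",
    "кажа": "Perf",
    "намирам": "Imp",
    "намеря": "Perf",
    "разбирам": "Imp",
    "разбера": "Perf",
}


def aspect_of(lemma: str):
    """Return "Imp" or "Perf" for a known lemma, or None if it is not in the lexicon."""
    return ASPECT_LEXICON.get(lemma)


print(aspect_of("кажа"))
```

A real lexicon would of course be far larger; the point is only that the value comes from the lemma.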

Case: case

In Bulgarian only some nouns have special vocative forms (v):

Examples

  • Иване, приятелю, Родино, Стефке / Ivane, priyatelyu, Rodino, Stefke (Ivan, friend, homeland, Stefka)

Case distinctions survive in personal pronouns: nominative (n), accusative (a) and dative (d).

Examples

  • нея, тя, му, го / neya, tya, mu, go (her.ACC.LONG, she.NOM, him.DAT.SHORT, him.ACC.SHORT).

Accusative and dative cases are still present in the masculine singular forms of some other pronouns: interrogative, indefinite, collective, relative and negative. Note that the dative forms are analytic (the preposition ‘на’ followed by the accusative form), so only the accusative is marked on the pronoun itself.

Examples

  • кого, някого, никого / kogo, nyakogo, nikogo (whom, someone.ACC, nobody.ACC)
  • на кого, на някого, на никого / na kogo, na nyakogo, na nikogo (to whom, to someone.ACC, to nobody.ACC)

In our tagset another, idiosyncratic case is marked: the so-called ‘dative possessive case’ (s). It covers situations where the short possessive pronoun precedes the noun it modifies and thus stands next to the verb.

Examples

  • Той ми взе шапката / Toy mi vze shapkata ‘He my.POSS took hat.DEF’ (He took my hat.)

The canonical sentence would be: Той взе шапката ми / Toy vze shapkata mi ‘He took hat.DEF my.POSS’ (He took my hat).

Definite: definiteness or state

Definiteness is typically a feature of nouns, adjectives and articles. Its value distinguishes whether we are talking about something known and concrete, or something general or unknown. It can be marked on definite and indefinite articles, or directly on nouns, adjectives etc.

In Bulgarian there are definite and indefinite articles. The definite article is part of the word, in postposition: жената / zhenata ‘woman-the’ (the woman). The indefinite article is either the form един / edin (one) or the zero marker.

However, when added to a nominal phrase, the articles become phrasal affixes, i.e. Bulgarian does not have agreement in definiteness. For example, хубавата висока руса жена / hubavata visoka rusa zhena ‘pretty-the tall blond woman’ (the pretty tall blond woman).

Ind: indefinite

Examples

  • Видях една жена да минава по улицата / Vidyah edna zhena da minava po ulitsata “I saw a woman walking on the street”
  • Видях жена да минава по улицата / Vidyah zhena da minava po ulitsata “I saw a woman walking on the street”

Def: definite

Examples

  • Видях жената да минава по улицата / Vidyah zhenata da minava po ulitsata “I saw the woman walking on the street”

Degree: degree of comparison

Degree of comparison is typically an inflectional feature of some adjectives and adverbs.

In Bulgarian the comparative and superlative forms are created with the particles по / po “more” and най / nay “most”, which precede the word and are joined to it by a hyphen.

Pos: positive, first degree

This is the base form that merely states a quality of something, without comparing it to qualities of others. Note that although this degree is traditionally called “positive”, negative properties can be compared, too.

Examples

  • удобен стол / udoben stol “a comfortable chair”
  • млад човек / mlad chovek “a young man”

Cmp: comparative, second degree

The quality of one object is compared to the same quality of another object.

Examples

  • Моят стол е по-удобен от твоя / Moyat stol e po-udoben ot tvoya “My chair is more comfortable than yours”
  • Брат ми е по-млад от мен / Brat mi e po-mlad ot men “My brother is younger than me”

Sup: superlative, third degree

The quality of one object is compared to the same quality of all other objects within a set.

Examples

  • Този стол е най-удобният от всички / Tozi stol e nay-udobniyat ot vsichki “This chair is the most comfortable of all”
  • Той е най-младият учител в училището / Toy e nay-mladiyat uchitel v uchilishteto “He is the youngest teacher in the school”
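Since the comparative and superlative particles are attached with a hyphen, the Degree value of an adjective or adverb can be read off its prefix. The following is a minimal sketch, assuming the input is already known to be an adjective or adverb; the function name is hypothetical.

```python
def degree_of(word: str) -> str:
    """Guess the Degree value of an adjective or adverb from its hyphenated prefix."""
    lowered = word.lower()
    if lowered.startswith("най-"):
        return "Sup"  # superlative particle най
    if lowered.startswith("по-"):
        return "Cmp"  # comparative particle по
    return "Pos"      # no particle: base form


for form in ["удобен", "по-удобен", "най-удобният"]:
    print(form, degree_of(form))
```

The check is purely orthographic, so it should not be applied to other parts of speech without further conditions.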

Gender: gender

Gender is usually a lexical feature of nouns and an inflectional feature of other parts of speech (adjectives, verbs) that mark agreement with nouns. In Bulgarian gender is grammatical.

There are three genders: masculine (m), feminine (f) and neuter (n).

Masc: masculine gender

Nouns denoting male persons are masculine. Other nouns may be also grammatically masculine, without any relation to sex.

Example: [bg] замък / zamak “castle”

Fem: feminine gender

Nouns denoting female persons are feminine. Other nouns may be also grammatically feminine, without any relation to sex.

Example: [bg] маса / masa “table”

Neut: neuter gender

Neither masculine nor feminine (grammatically).

Example: [bg] дете / dete “child”

Mood: mood

Mood is a feature that expresses modality and subclassifies finite verb forms. In Bulgarian there are three moods: Indicative, Imperative and Conditional.

Ind: indicative

The indicative can be considered the default mood. A verb in the indicative merely states that something happens, has happened or will happen, without adding any attitude of the speaker. In Bulgarian the indicative covers all 9 tenses and their passive forms. It also covers the evidential forms.

Examples

  • Следвам право в университета. / Sledvam pravo v universiteta “I study law at the University”.
  • Той беше ходил в САЩ много пъти. / Toy beshe hodil v SASHT mnogo pati “He had been to the USA many times”.

Imp: imperative

The speaker uses imperative to order or ask the addressee to do the action of the verb. The forms in Bulgarian are synthetic.

Examples

  • Купете хляб и сирене! / Kupete hlyab i sirene “Buy some bread and cheese!”
  • Подай ми солта, моля! / Poday mi solta, molya “Pass me the salt, please!”

Cnd: conditional

The conditional mood is used to express actions that might happen under certain circumstances, or that would have taken place but did not / do not actually happen. It usually presupposes volition. The forms in Bulgarian are analytic.

Examples

  • Бих дошъл, ако ме поканиш. / Bih doshal, ako me pokanish “I would come if you invite me.”
  • Бих дошъл, ако имах възможност. / Bih doshal, ako imah vazmozhnost “I would come if I could.”
  • Би трябвало добре да се подготвим за срещата. / Bi tryabvalo dobre da se podgotvim za sreshtata “We should prepare very well for the meeting.”

Negative: whether the word can be or is negated

Negativeness is typically a feature of verbs, adjectives, sometimes also adverbs and nouns in languages that negate using bound morphemes.

In Bulgarian, nouns, adjectives and attributive participles use the bound morpheme не (with the exception of clearly contrastive contexts). Verbs and transgressives, however, use the clitic не for negation.

The negativeness feature is used to distinguish response interjections yes and no.

Pos: positive, affirmative

Examples

  • човек / chovek “man”
  • добър / dobar “good”
  • разбралата жена / razbralata zhena “the woman that understood”
  • вървя / varvya “I am walking”
  • вървейки / varveyki “walking”

Neg: negative

Examples

  • нечовек / nechovek “not a man”
  • недобър / nedobar “not good”
  • неразбралата жена / nerazbralata zhena “the woman that did not understand”
  • не вървя / ne varvya “I am not walking”
  • не вървейки / ne varveyki “not walking”

NumType: numeral type

Some languages (especially Slavic) have a complex system of numerals. For example, in the school grammar of Czech, the main part of speech is “numeral”, it includes almost everything where counting is involved and there are various subtypes. It also includes interrogative, relative, indefinite and demonstrative words referring to numbers (words like kolik / how many, tolik / so many, několik / some, a few), so at the same time we may have a non-empty value of PronType. (In English, these words are called quantifiers and they are considered a subgroup of determiners.)

In this respect Bulgarian behaves like Czech.

From the syntactic point of view, some numtypes behave like adjectives and some behave like adverbs. We tag them u-pos/ADJ and u-pos/ADV respectively. Thus the NumType feature applies to several different parts of speech:

  • u-pos/NUM: cardinal numerals
  • u-pos/DET: quantifiers
  • u-pos/ADJ: definite adjectival, e.g. ordinal numerals
  • u-pos/ADV: adverbial (e.g. ordinal and multiplicative) numerals, both definite and pronominal

Card: cardinal number or corresponding interrogative / relative / indefinite / demonstrative word

Note that in some Indo-European languages there is a fuzzy borderline between numerals and nouns for thousand, million and billion.

Examples

  • [bg] едно, две, три / edno, dve, tri “one, two, three”; колко / kolko “how many”; няколко / nyakolko “some”; толкова / tolkova “so many”; много / mnogo “many”; малко / malko “few”

Ord: ordinal number or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adjective.

Examples

  • [bg] adjectival: първи / parvi “first”; втори / vtori “second”, трети / treti “third”, etc.

Mult: multiplicative numeral or corresponding interrogative / relative / indefinite / demonstrative word

This is a subtype of adverb.

Examples

  • [bg] веднъж / vednazh “once”; дваж / dvazh “twice”

Frac: fraction

This is a subtype of cardinal numbers, occasionally distinguished in corpora. It may denote a fraction or just the denominator of the fraction. In Bulgarian the numerator is a cardinal numeral and the denominator is an ordinal numeral.

Examples

  • [bg] две трети / dve treti “two thirds”

Number: number

Number is an inflectional feature of nouns, adjectives and verbs. In the tagset it is encoded as: singular (s), plural (p), count (c), pluralia tantum (l). Singularia tantum is not encoded.

Sing: singular number

A singular noun denotes one person, animal or thing.

Examples: [bg] молив / moliv (pencil)

Plur: plural number

A plural noun denotes several persons, animals or things.

Examples: [bg] моливи / molivi (pencils)

Count: count plural form

A form that is used as plural for masculine non-person nouns after numerals. This is a remnant of the dual form.

Examples: [bg] 2 молива / (2) moliva (2 pencils-count)

Ptan: plurale tantum

Some nouns appear only in the plural form even though they denote one thing (semantic singular); some tagsets mark this distinction.

Examples: [bg] финанси, дънки / finansi, danki (finances, jeans)

Coll: collective / mass / singulare tantum

Collective or mass or singulare tantum is a special case of singular. It applies to words that use grammatical singular to describe sets of objects, i.e. semantic plural.

Examples: [bg] човечество / chovechestvo (mankind)
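The correspondence between the tagset's number codes and the values listed above can be written as a small lookup table. This is a sketch based only on the codes named in this section; the names BTB_NUMBER_TO_UD and ud_number are hypothetical, and the actual conversion may apply further conditions.

```python
# BulTreeBank number code -> Number value used in this document.
BTB_NUMBER_TO_UD = {
    "s": "Sing",   # singular
    "p": "Plur",   # plural
    "c": "Count",  # count form
    "l": "Ptan",   # pluralia tantum
}


def ud_number(code: str):
    """Map a number code to its Number value, or None for an unknown code."""
    return BTB_NUMBER_TO_UD.get(code)


print(ud_number("c"))
```

There is no entry yielding Coll, because singularia tantum is not encoded in the tagset.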

Person: person

Person is typically a feature of personal and possessive pronouns, and of verbs. On verbs it is in fact an agreement feature that marks the person of the verb’s subject. Person marked on verbs makes it unnecessary to always add a personal pronoun as subject and thus subjects are sometimes dropped (pro-drop languages).

Bulgarian is a pro-drop language, like other Slavic languages.

1: first person

In singular, the first person refers just to the speaker / author. In plural, it must include the speaker and one or more additional persons.

Examples

  • аз / az “I”
  • идвам / idvam “I am coming”

2: second person

In singular, the second person refers to the addressee of the utterance / text. In plural, it may mean several addressees and optionally some third persons too.

Examples

  • ти / ti “you”
  • идваш / idvash “You are coming”

3: third person

The third person refers to one or more persons that are neither speakers nor addressees.

Examples

  • той, тя, то / toy, tya, to “he, she, it”
  • идва / idva “He/she/it is coming”

Poss: possessive

Boolean feature of pronouns, determiners or adjectives. It tells whether the word is possessive.

While many tagsets would have “possessive” as one of the various pronoun types, this feature is intentionally separate from PronType, as it is orthogonal to pronominal types. Several of the pronominal types can be optionally possessive, and adjectives can too.

In the BulTreeBank tagset “possessive” is one of the various pronoun types.

Yes: it is possessive

Note that there is no No value. If the word is not possessive, the Poss feature will just not be mentioned in the FEAT column. (Which means that empty value has the No meaning.)

Examples

  • [bg] possessive adjectives: майчина любов / maychina lyubov “mother’s love”
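The "no No value" convention means that a consumer of the data tests for the presence of Poss=Yes rather than for an explicit negative. A minimal sketch, assuming the features are stored in the CoNLL-U style (Name=Value pairs separated by ‘|’, with ‘_’ for an empty column); the helper names are hypothetical.

```python
def parse_feats(feats: str) -> dict:
    """Split a feature string such as "Poss=Yes|PronType=Prs" into a dictionary."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def is_possessive(feats: str) -> bool:
    """A word is possessive only if Poss=Yes is present; absence means "no"."""
    return parse_feats(feats).get("Poss") == "Yes"


print(is_possessive("Poss=Yes|PronType=Prs"))
print(is_possessive("PronType=Prs"))
```

The same pattern applies to any other boolean feature that has only a Yes value.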

PronType: pronominal type

This feature typically applies to pronouns, determiners, pronominal numerals (quantifiers) and pronominal adverbs.

Prs: personal or possessive personal pronoun or determiner

See also the Poss feature that distinguishes normal personal pronouns from possessives. Note that Prs also includes reflexive personal/possessive pronouns (e.g. [cs] se / svůj; see the Reflex feature).

Examples

  • аз, ти, той, тя, то, ние, вие, те, себе си, мой, твой, негов, неин, негов, наш, ваш, техен, свой / az, ti, toy, tya, to, nie, vie, te, sebe si, moy, tvoy, negov, nein, negov, nash, vash, tehen, svoy “I, you, he, she, it, we, you (plural), they, oneself, my, your, his, her, its, our, your (plural), their, one’s own”

Rcp: reciprocal pronoun

Examples

  • един друг / edin drug “one another”
  • един на друг / edin na drug “each other”

Int: interrogative pronoun, determiner, numeral or adverb

Note that possessive interrogative determiners (whose) can be distinguished by the Poss feature.

Examples:

  • [bg/en] кой / koy “who”, какво / kakvo “what”, кой / koy “which”, чий / chiy “whose”, колко / kolko “how many, how much”, къде / kade “where”, кога / koga “when”, как / kak “how”, защо / zashto “why”

Rel: relative pronoun, determiner, numeral or adverb

In Bulgarian this class is distinct from the class of interrogatives.

Examples:

  • [bg] който / koyto “which”, “that” (relative but not interrogative pronouns); чийто / chiyto “whose” (possessive relative pronoun)

Dem: demonstrative pronoun, determiner, numeral or adverb

The BulTreeBank tagset does not differentiate between demonstratives of nearness and distance, although Bulgarian does make this distinction.

Examples

  • [bg/en] този / tozi “this”, онзи / onzi “that”, такъв / takav “such”, тук / tuk “here”, там / tam “there”, etc.

Tot: total (collective) pronoun, determiner or adverb

Examples

  • [bg/en] всеки / vseki “every, everybody, everyone, each”, всичко / vsichko “everything” “all”, etc.

Neg: negative pronoun, determiner or adverb

Examples:

  • [bg/en] никой / nikoy “nobody”, нищо / nishto “nothing”, никакъв / nikakav “no”, ничий / nichiy “no one’s” (possessive negative pronoun), etc.

Ind: indefinite pronoun, determiner, numeral or adverb

Examples

  • [bg/en] някой / nyakoy “somebody”, нещо / neshto “something”, някакъв / nyakakav “some”, нечий / nechiy “someone’s” (possessive indefinite pronoun), etc.
  • [bg/en] който и да е / koyto i da e “whoever, anybody”, каквото и да е / kakvoto i da e “whatever, anything”, etc.
  • [bg/en] еди-кой си / edi-koy si “somebody specific for the speaker, but not for the hearer”

Reflex: reflexive

Boolean feature, typically of pronouns or determiners. It tells whether the word is reflexive, i.e. refers to the subject of its clause.

In Bulgarian the reflexive feature is not encoded as one of the pronoun types, but as a reference type (similarly to entity, attribute, possession, etc.)

Bulgarian has reflexive verbs, which are reflexive both in form and in meaning. The reflexive element is written as a separate word: събуждам се / sabuzhdam se “to wake up”.

Yes: it is reflexive

Note that there is no No value. If the word is not reflexive, the Reflex feature will just not be mentioned in the FEAT column. (Which means that empty value has the No meaning.)

Examples

  • [bg] reflexive personal pronouns: се, си, себе си / se, si, sebe si “oneself”; reflexive possessive pronoun: свой / svoy “oneself’s”.

Tense: tense

Tense is a feature that specifies the time when the action took / takes / will take place, in relation to the current moment or to another action in the utterance. In Bulgarian aspect and tense are separate, although not completely independent of each other.

In Bulgarian there are 9 tenses: 3 synthetic and 6 analytic.

Since the feature Tense is assigned to a single word, i.e. it relates to synthetic forms, in Bulgarian it is applicable to only 3 tenses: Present, Aorist and Imperfect.

Past: past tense / preterite / aorist

The past tense denotes actions that happened before the current moment. In Bulgarian, this is aorist. It can be used with both imperfective and perfective verbs.

Examples

  • Те дойдоха навреме. / Te doydoha navreme “They came on time”.
  • Взе ли си изпита? / Vze li si izpita? “Did you take the exam?”

Pres: present tense

The present tense denotes actions that are happening right now, that are crossing the moment of speaking or that usually happen. In Bulgarian present tense has a lot of usages: for actual activities (where the perfective verbs are blocked); for historical events, for habitual activities, etc.

Examples

  • В момента чета. / V momenta cheta “I am reading now”.
  • Всеки ден чета. / Vseki den cheta “I read every day”.

Imp: imperfect

Imperfect is a special case of the past tense. It denotes actions that were in progress at some past moment. These actions might continue after the moment of speaking, but also might not, i.e. the evidence is not in the form itself but in the context. Both perfective and imperfective verbs are used in the imperfect tense.

  • Когато се прибрах вкъщи, децата вече спяха. / Kogato se pribrah vkashti, detsata veche spyaha “When I came home, the children were already asleep.”
  • Щом дойдеше, веднага запалваше цигара. / Shtom doydeshe, vednaga zapalvashe tsigara “Every time he came, he always lit a cigarette”.
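The restriction of Tense to the three synthetic tenses can be expressed as a lookup that simply has no entry for the analytic ones. This is a sketch under the assumptions stated in this section; the English tense names used as keys and the function name are illustrative choices.

```python
# Only the three synthetic tenses receive a Tense value on a single word.
SYNTHETIC_TENSE_TO_UD = {
    "present": "Pres",
    "aorist": "Past",
    "imperfect": "Imp",
}


def ud_tense(tense_name: str):
    """Return the Tense value for a synthetic tense, or None for an analytic one."""
    return SYNTHETIC_TENSE_TO_UD.get(tense_name.lower())


print(ud_tense("Aorist"))
```

For any of the six analytic tenses the function returns None, reflecting that their tense is not carried by one word form.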

VerbForm: form of verb or deverbative

Even though the name of the feature seems to suggest that it is used exclusively with verbs, it is not the case. Some verb forms in some languages actually form a gray zone between verbs and other parts of speech (nouns, adjectives and adverbs). For instance, participles may be either classified as verbs or as adjectives, depending on language and context. In both cases VerbForm=Part may be used to separate them from other verb forms or other types of adjectives.

Bulgarian does not have an infinitive. It distinguishes: finite verbs and non-finite verbs (participles and transgressives).

Fin: finite verb

Rule of thumb: if it has non-empty Mood, it is finite. This feature is encoded by the following values in the second position of verbal tags: Vp# (personal verb); Vn# (impersonal verb); Vx#, Vy# and Vi# (auxiliary verbs).

Examples

  • Аз съм, ти си / Az sam, ti si “I am, you are”
  • Трябва да дойдеш / Tryabva da doydesh “You must come”
  • Прочетох книгата / Prochetoh knigata “I read the book”
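The rule of thumb for finiteness is easy to check mechanically. The sketch below assumes a CoNLL-U style feature string (Name=Value pairs separated by ‘|’, ‘_’ when empty); the function name and the sample feature strings are illustrative, not taken from the treebank.

```python
def is_finite(feats: str) -> bool:
    """Apply the rule of thumb: a verb form with a non-empty Mood is finite."""
    if feats in ("", "_"):
        return False
    features = dict(pair.split("=", 1) for pair in feats.split("|"))
    # A missing or empty Mood value means the form is not finite.
    return bool(features.get("Mood"))


print(is_finite("Mood=Ind|Number=Sing|Person=1|Tense=Past"))
print(is_finite("VerbForm=Part"))
```

This is only a heuristic over annotated features; it does not inspect the word form itself.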

Part: participle

A participle is a non-finite verb form that shares properties of verbs and adjectives. The participle in Bulgarian is encoded as c in the fifth position of the tag: V#c#.

In Bulgarian there are four types of participles: present active, past perfective active, past imperfective active and past passive. The present active one can be used only adjectivally; the past imperfective one can be used only in evidential verb forms; the others have both uses. The present active can be derived only from imperfective verbs.

Examples

  • виждащ / vizhdasht “seeing” (present active). BulTreeBank tag: V#car#
  • видял / vidyal “seen” (past perfective active). BulTreeBank tag: V#cao#
  • видел / videl “seen” (past imperfective active). BulTreeBank tag: V#cam#
  • видян / vidyan “seen” (past passive). BulTreeBank tag: V#cv#

Trans: transgressive

The transgressive, also called adverbial participle, is a non-finite verb form that shares properties of verbs and adverbs. It appears e.g. in Slavic and Indo-Aryan languages.

In Bulgarian it can be derived only from imperfective verbs.

Examples

  • Виждайки това, той се разстрои / Vizhdayki tova, toy se razstroi “Having seen this, he became upset”. BulTreeBank tag: V#g

Note that the symbol ‘#’, used in the Universal POS section, is a placeholder for an arbitrary number of features that are suppressed in the respective tag because they are irrelevant when the BulTreeBank tagset is mapped to the Universal one.

Voice: voice

For Indo-European speakers, voice means mainly the active-passive distinction. In other languages, other shades of verb meaning are categorized as voice.

In Bulgarian linguistics there are various theories of voice distinctions: a 2-voice theory (active vs. passive), a 3-voice theory (active vs. passive vs. reflexive) and a 4-voice theory (active vs. passive vs. reflexive vs. impersonal).

Here the 2-voice theory is adopted.

Act: active voice

The subject of the verb is the doer of the action (agent), the object is affected by the action (patient).

Examples

  • Нападнахме врага. / Napadnahme vraga “We attacked the enemy”.
  • Децата се засмяха. / Detsata se zasmyaha “The children laughed”.
  • Децата се измиха. / Detsata se izmiha “The children washed themselves”.

Pass: passive voice

The subject of the verb is affected by the action (patient). The doer (agent) is either unexpressed or it appears as an object of the verb. In Bulgarian there are two ways of forming passive:

  • tenses plus the reflexive particle se
  • special participial conjugation

Examples

  • Тази книга се чете лесно. / Tazi kniga se chete lesno “This book reads easily”.
  • Тази книга беше прочетена по-бързо от другите. / Tazi kniga beshe prochetena po-barzo ot drugite “This book was read faster than the others”.