PROPN: proper noun

Definition

A proper noun is a noun (or nominal content word) that is the name, or part of the name, of a specific individual, place, or object. Names for the inhabitants of a place (such as Les Américains “The Americans”) should be tagged NOUN, although this is not yet done consistently in the French data.

Examples


Treebank Statistics (UD_French)

There are 16580 PROPN lemmas (46%), 16580 PROPN types (35%) and 31320 PROPN tokens (8%). Out of 17 observed tags, the rank of PROPN is: 1 in number of lemmas, 1 in number of types and 6 in number of tokens.
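As a rough illustration of how counts like these can be derived, the sketch below tallies distinct lemmas, distinct word forms (types), and tokens for one tag in a CoNLL-U file. The sample sentences and the helper names `token_rows` and `tag_counts` are hypothetical and not taken from UD_French; the script that produced the published figures may normalize or count differently.

```python
# Hypothetical CoNLL-U fragment; columns are ID, FORM, LEMMA, UPOS, XPOS,
# FEATS, HEAD, DEPREL, DEPS, MISC. These sentences are invented for the example.
SAMPLE = "\n".join([
    "# text = Jean visite Paris et la France.",
    "1\tJean\tJean\tPROPN\t_\t_\t2\tnsubj\t_\t_",
    "2\tvisite\tvisiter\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tParis\tParis\tPROPN\t_\t_\t2\tdobj\t_\t_",
    "4\tet\tet\tCONJ\t_\t_\t3\tcc\t_\t_",
    "5\tla\tle\tDET\t_\t_\t6\tdet\t_\t_",
    "6\tFrance\tFrance\tPROPN\t_\t_\t3\tconj\t_\t_",
    "7\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
    "",
    "# text = Paris dort.",
    "1\tParis\tParis\tPROPN\t_\t_\t2\tnsubj\t_\t_",
    "2\tdort\tdormir\tVERB\t_\t_\t0\troot\t_\t_",
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_",
    "",
])


def token_rows(conllu_text):
    """Yield the 10 columns of every ordinary token line."""
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if len(cols) == 10 and cols[0].isdigit():
            yield cols


def tag_counts(conllu_text, upos):
    """Return (distinct lemmas, distinct forms, tokens) for one UPOS tag."""
    lemmas, types, tokens = set(), set(), 0
    for cols in token_rows(conllu_text):
        if cols[3] == upos:
            types.add(cols[1])
            lemmas.add(cols[2])
            tokens += 1
    return len(lemmas), len(types), tokens


n_lemmas, n_types, n_tokens = tag_counts(SAMPLE, "PROPN")
total_tokens = sum(1 for _ in token_rows(SAMPLE))
token_share = round(100 * n_tokens / total_tokens)
print(f"{n_lemmas} lemmas, {n_types} types, {n_tokens} tokens ({token_share}%)")
```

Only the token percentage is computed here, because the page does not state what the lemma and type percentages are measured against.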

The 10 most frequent PROPN lemmas: France, Paris, Europe, États-Unis, Jean, Maroc, État, la, Espagne, New

The 10 most frequent PROPN types: France, Paris, Europe, États-Unis, Jean, Maroc, État, la, Espagne, New

The 10 most frequent ambiguous lemmas: État (PROPN 80, NOUN 48), la (PROPN 3, ADV 1, NOUN 1), Espagne (PROPN 71, NOUN 1), New (PROPN 65, X 1), York (PROPN 64, X 1), guerre (NOUN 136, PROPN 2), le (DET 43234, PRON 876, PROPN 3), The (PROPN 42, X 1), de (ADP 31514, PROPN 33, X 1), saint (NOUN 30, PROPN 10, ADJ 8)

The 10 most frequent ambiguous types: État (PROPN 80, NOUN 45), la (DET 9728, PRON 108, PROPN 3, ADV 1, NOUN 1), Espagne (PROPN 71, NOUN 1), New (PROPN 65, ADJ 2, X 1), York (PROPN 64, X 1), guerre (NOUN 121, PROPN 2), Nord (PROPN 53, NOUN 11), Conseil (PROPN 51, NOUN 12), le (DET 13833, PRON 287, PROPN 3), Coupe (PROPN 46, NOUN 21)

Morphology

The form / lemma ratio of PROPN is 1.000000 (the average of all parts of speech is 1.308112).
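A minimal sketch of one way to compute such a ratio, assuming it is the number of distinct word forms divided by the number of distinct lemmas for the tag (an assumption that matches 16580 types over 16580 lemmas, but is not confirmed by the page). The token triples and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (form, lemma, upos) triples, invented for the example.
TOKENS = [
    ("J.-C.", "Jésus-Christ", "PROPN"),
    ("Jésus-Christ", "Jésus-Christ", "PROPN"),
    ("Paris", "Paris", "PROPN"),
    ("visite", "visiter", "VERB"),
    ("visitent", "visiter", "VERB"),
    ("visita", "visiter", "VERB"),
]


def forms_per_lemma(tokens, upos):
    """Map each lemma of the given tag to the set of forms it occurs with."""
    forms = defaultdict(set)
    for form, lemma, tag in tokens:
        if tag == upos:
            forms[lemma].add(form)
    return forms


def form_lemma_ratio(tokens, upos):
    """Distinct forms (summed per lemma) divided by distinct lemmas."""
    forms = forms_per_lemma(tokens, upos)
    if not forms:
        return 0.0  # the tag does not occur at all
    return sum(len(s) for s in forms.values()) / len(forms)


def richest_lemmas(tokens, upos, n=3):
    """Lemmas with the most distinct forms, ties broken alphabetically."""
    forms = forms_per_lemma(tokens, upos)
    ranked = sorted(forms.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(lemma, sorted(fs)) for lemma, fs in ranked[:n]]


print(f"{form_lemma_ratio(TOKENS, 'PROPN'):.6f}")
print(richest_lemmas(TOKENS, "PROPN", n=1))
```

Summing forms per lemma counts a form twice if it belongs to two lemmas; dividing the plain type count by the lemma count would be an equally plausible reading.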

The highest number of forms (2) was observed with the lemma “Jésus-Christ”: J.-C., Jésus-Christ.

The 2nd highest number of forms (1) was observed with the lemma “%”: %.

The 3rd highest number of forms (1) was observed with the lemma “‘upa’upa”: ‘upa’upa.

PROPN occurs with 2 features: fr-feat/Gender (1; 0% instances), fr-feat/Number (1; 0% instances)

PROPN occurs with 2 feature-value pairs: Gender=Masc, Number=Sing

PROPN occurs with 2 feature combinations. The most frequent feature combination is _ (31319 tokens). Examples: France, Paris, Europe, États-Unis, Jean, Maroc, État, la, Espagne, New

Relations

PROPN nodes are attached to their parents using 25 different relations: fr-dep/nmod (13024; 42% instances), fr-dep/name (6590; 21% instances), fr-dep/appos (3488; 11% instances), fr-dep/nsubj (3448; 11% instances), fr-dep/conj (2715; 9% instances), fr-dep/dobj (733; 2% instances), fr-dep/amod (314; 1% instances), fr-dep/nsubjpass (243; 1% instances), fr-dep/root (222; 1% instances), fr-dep/det (167; 1% instances), fr-dep/acl (108; 0% instances), fr-dep/compound (62; 0% instances), fr-dep/case (60; 0% instances), fr-dep/nummod (47; 0% instances), fr-dep/dep (22; 0% instances), fr-dep/xcomp (21; 0% instances), fr-dep/acl:relcl (17; 0% instances), fr-dep/nmod:poss (9; 0% instances), fr-dep/parataxis (9; 0% instances), fr-dep/ccomp (8; 0% instances), fr-dep/advmod (5; 0% instances), fr-dep/advcl (4; 0% instances), fr-dep/vocative (2; 0% instances), fr-dep/cc (1; 0% instances), fr-dep/mwe (1; 0% instances)

Parents of PROPN nodes belong to 16 different parts of speech: PROPN (11871; 38% instances), NOUN (11678; 37% instances), VERB (6917; 22% instances), ADJ (310; 1% instances), ROOT (222; 1% instances), PRON (180; 1% instances), NUM (63; 0% instances), X (25; 0% instances), SYM (15; 0% instances), ADV (12; 0% instances), ADP (9; 0% instances), INTJ (5; 0% instances), PUNCT (5; 0% instances), AUX (4; 0% instances), DET (3; 0% instances), CONJ (1; 0% instances)
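The two distributions above can be approximated with a single pass over the dependency trees. The sketch below is illustrative only: the sentences, the tuple layout, and the names `attachment_stats` and `format_counts` are hypothetical. It assumes that a node with head 0 is counted under a pseudo-tag `ROOT`, which is consistent with the matching counts for fr-dep/root and ROOT (222 each).

```python
from collections import Counter

# Hypothetical sentences; each token is (id, form, upos, head, deprel).
SENTENCES = [
    [
        (1, "Jean", "PROPN", 2, "nsubj"),
        (2, "visite", "VERB", 0, "root"),
        (3, "New", "PROPN", 2, "dobj"),
        (4, "York", "PROPN", 3, "name"),
    ],
    [
        (1, "Paris", "PROPN", 0, "root"),
    ],
]


def attachment_stats(sentences, upos):
    """Count incoming relations and parent tags for nodes with the given tag."""
    relations = Counter()
    parent_tags = Counter()
    for sentence in sentences:
        tag_of = {tok_id: tag for tok_id, _form, tag, _head, _rel in sentence}
        for _tok_id, _form, tag, head, deprel in sentence:
            if tag != upos:
                continue
            relations[deprel] += 1
            # Head 0 means the node hangs directly off the artificial root.
            parent_tags["ROOT" if head == 0 else tag_of[head]] += 1
    return relations, parent_tags


def format_counts(counter):
    """Render a Counter in the 'label (count; N% instances)' style used above."""
    total = sum(counter.values())
    return ", ".join(
        f"{label} ({count}; {round(100 * count / total)}% instances)"
        for label, count in counter.most_common()
    )


relations, parent_tags = attachment_stats(SENTENCES, "PROPN")
print(format_counts(relations))
print(format_counts(parent_tags))
```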

10552 (34%) PROPN nodes are leaves.

9101 (29%) PROPN nodes have one child.

6205 (20%) PROPN nodes have two children.

5462 (17%) PROPN nodes have three or more children.

The highest child degree of a PROPN node is 68.
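The leaf, one-child, two-children, and three-or-more figures amount to a histogram of child degree, that is, the number of tokens whose head is a given node. A small sketch under that reading follows; the sample tree and the names `child_degrees` and `degree_buckets` are hypothetical.

```python
from collections import Counter

# Hypothetical sentence; each token is (id, upos, head).
SENTENCES = [
    [
        (1, "DET", 2),
        (2, "PROPN", 3),
        (3, "VERB", 0),
        (4, "ADP", 5),
        (5, "PROPN", 3),
        (6, "PROPN", 5),
        (7, "PUNCT", 3),
    ],
]


def child_degrees(sentences, upos):
    """Return the number of children of every node carrying the given tag."""
    degrees = []
    for sentence in sentences:
        # How many tokens name each id as their head.
        n_children = Counter(head for _tok_id, _tag, head in sentence)
        degrees.extend(
            n_children[tok_id] for tok_id, tag, _head in sentence if tag == upos
        )
    return degrees


def degree_buckets(degrees):
    """Group degrees into leaves, one, two, and three-or-more children."""
    buckets = Counter(min(d, 3) for d in degrees)
    return {
        "leaves": buckets[0],
        "one": buckets[1],
        "two": buckets[2],
        "three_or_more": buckets[3],
    }


degrees = child_degrees(SENTENCES, "PROPN")
print(degree_buckets(degrees))
print("highest child degree:", max(degrees))
```

The four buckets partition the nodes, so their counts sum to the total number of tokens for the tag, as they do in the figures above (10552 + 9101 + 6205 + 5462 = 31320).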

Children of PROPN nodes are attached using 32 different relations: fr-dep/case (13468; 30% instances), fr-dep/det (6761; 15% instances), fr-dep/name (6594; 15% instances), fr-dep/punct (5022; 11% instances), fr-dep/nmod (2884; 6% instances), fr-dep/conj (2837; 6% instances), fr-dep/cc (1644; 4% instances), fr-dep/appos (1454; 3% instances), fr-dep/amod (1109; 2% instances), fr-dep/acl (546; 1% instances), fr-dep/acl:relcl (457; 1% instances), fr-dep/nummod (415; 1% instances), fr-dep/cop (320; 1% instances), fr-dep/compound (237; 1% instances), fr-dep/advmod (205; 0% instances), fr-dep/nsubj (199; 0% instances), fr-dep/nmod:poss (48; 0% instances), fr-dep/expl (26; 0% instances), fr-dep/dobj (24; 0% instances), fr-dep/dep (22; 0% instances), fr-dep/nsubjpass (22; 0% instances), fr-dep/auxpass (20; 0% instances), fr-dep/mark (18; 0% instances), fr-dep/neg (16; 0% instances), fr-dep/advcl (14; 0% instances), fr-dep/aux (14; 0% instances), fr-dep/parataxis (9; 0% instances), fr-dep/ccomp (7; 0% instances), fr-dep/mwe (3; 0% instances), fr-dep/xcomp (2; 0% instances), fr-dep/discourse (1; 0% instances), fr-dep/iobj (1; 0% instances)

Children of PROPN nodes belong to 17 different parts of speech: ADP (13365; 30% instances), PROPN (11871; 27% instances), DET (6662; 15% instances), PUNCT (5022; 11% instances), NOUN (2182; 5% instances), CONJ (1567; 4% instances), VERB (1280; 3% instances), ADJ (889; 2% instances), NUM (741; 2% instances), ADV (314; 1% instances), PRON (203; 0% instances), X (165; 0% instances), SYM (43; 0% instances), AUX (34; 0% instances), PART (31; 0% instances), SCONJ (28; 0% instances), INTJ (2; 0% instances)


PROPN in other languages: [bg] [cs] [de] [el] [en] [es] [eu] [fa] [fi] [fr] [ga] [he] [hu] [it] [ja] [ko] [sv] [u]