
Introduction

The Basque UD treebank is based on a conversion of part of the Basque Dependency Treebank (BDT), created at the University of the Basque Country by the IXA NLP research group. The treebank consists of 8,993 sentences (121,443 tokens) and covers mainly literary and journalistic texts.

The morphological and syntactic annotation of the Basque UD treebank was created through an automatic conversion of the BDT data (Aduriz et al. 2003, Aranzabe et al. 2015).

References

Aranzabe M.J., Atutxa A., Bengoetxea K., Díaz de Ilarraza A., Goenaga I., Gojenola K., Uria L. 2015. Automatic Conversion of the Basque Dependency Treebank to Universal Dependencies. In Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, Adam Przepiórkowski (eds.), Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT14), pp. 233-241. Institute of Computer Science of the Polish Academy of Sciences, Warszawa, Poland. ISBN: 978-83-63159-18-4.

Aduriz I., Aranzabe M., Arriola J., Atutxa A., Díaz de Ilarraza A., Garmendia A., Oronoz M. 2003. Construction of a Basque Dependency Treebank. In Joakim Nivre and Erhard Hinrichs (eds.), Proceedings of the Second Workshop on Treebanks and Linguistic Theories (TLT 2003), pp. 201-204. Växjö, Sweden, November 14-15. ISSN: 1651-0267, ISBN: 91-7636-394-5.