CDT - The Copenhagen Danish-English Dependency Treebank

The Copenhagen Dependency Treebanks are a set of treebanks for Danish, English, Spanish and Italian. The purpose of the Copenhagen Dependency Treebank project is to create linguistically annotated text collections (treebanks) on the basis of the dependency-based grammar formalism Discontinuous Grammar (Buch-Kromann 2009). The treebanks created in the project can be used to train natural language parsers, syntax-based machine translation systems, and other statistically based natural language applications. The treebanks are based on a unified dependency annotation, where texts are analyzed as a single dependency structure that spans all levels of analysis, from morphology to discourse.

Reference: Buch-Kromann, M. (2009). Discontinuous Grammar: A Dependency-Based Model of Human Parsing and Language Learning. Saarbrücken: VDM Verlag Dr. Müller. (https://research.cbs.dk/en/publications/discontinuous-grammar-a-dependency-based-model-of-human-parsing-a)

Data and resources

Keywords

Additional info

URI https://data.gov.dk/dataset/lang/1d755e32-2686-43ee-9a38-eef87bb63749
Landing page https://github.com/mbkromann/copenhagen-dependency-treebank/wiki/CDT
Harvested by Datavejviser
Publication date 20-11-2012
Last modified date
Update frequency
Coverage period  / 
Topic(s)
  • 16.05.07 Language and orthography
  • Education, culture and sport
Access rights public
Conforms to
Provenance statement
Documentation