Skip to main content

6 datasæt fundet

Filtrér resultater
  • Danish Gigaword

    A billion-word corpus of Danish text. Split into many sections, and covering many dimensions of variation (spoken/written, formal/informal, modern/old, rigsdansk/dialect, and so...
  • FT-Speech

    FT Speech er et dansk korpus med folketingets taler i lydformat og manuelt transskriberet tekst. Datasættet er blevet kureret af Andreas Kirkedal, Marija Stepanović og Barbara...
  • Bidirectional Long-Short Term Memory tagger

    A toolkit for Part-of-Speech tagging and NER in DyNet. It has been tested on Danish, amongst other languages (for the UD POS tags in the UD_Danish-DDT version 1.1 and 2.3)...
  • Bornholmsk (NLP tools / data for Bornholmsk)

    Language processing resources and tools for Bornholmsk, a language spoken on the island of Bornholm, with roots in Danish and closely related to Scanian. Includes corpora, word...
  • Danish Universal Dependencies DDT (UD_Danish-DDT)

    The Danish Universal Dependencies treebank (Johannsen et al., 2015, UD-DDT) is a conversion of the Danish Dependency Treebank (Buch-Kromann et al. 2003) based on texts from...
  • Danish Named Entity Recognition data on top of the Danish Universal...

    This resource is an annotation of four NER types (PER, ORG, LOC, MISC) on top of the UD_Danish-DDT data. Status: published and freely available since summer 2019 Reference:...