Skip to main content

8 datasæt fundet

Udgivere: Nationalbiblioteket i Norge

Filtrér resultater
  • NB-BERT

    "NB-BERT-base is a general BERT-base model built on the large digital collection at the National Library of Norway. This model is based on the same structure as BERT Cased...
  • The Norwegian Colossal Corpus

    "The Norwegian Colossal Corpus (NCC) is a collection of multiple smaller Norwegian corpuses suitable for training large language models. We have done extensive cleaning on the...
  • NST dansk ATG-database (16 kHz) – reorganisert

    his database was created by Nordic Language Technology for the development of automatic speech recognition and dictation in Danish. In this updated version, the organization of...
  • NST Danish Dictation (22 kHz)

    Samling af lydoptagelser i 22 kHz 1 kanal (mono). Stammer fra NST (Nordisk Språkteknologi) som gik konkurs i 2003. Er holdt ajour i den norske sprogbank i Nationalbiblioteket....
  • NST Danish ATG Database (16 kHz)

    This database was originally developed by Nordic Language Technology in the 1990ies in order to facilitate automatic speech recognition in Danish . A reorganized and more user...
  • NST udtaleleksikon for dansk

    This pronunciation lexicon for Danish was originally produced by Nordic Language Technology (NST), and contains approximately 238,000 entries. The word list consists of a...
  • NST N-gram – dansk nyhendetekst

    Dette korpus indeholder n-grammer på dansk afledt af et korpus på 290 millioner ord med danske nyhedsarktikler fra aviserne Berlingske Tidende, Ekstrabladet og Politiken....
  • Stortinget Speech Corpus version 1.0

    The Stortinget Speech Corpus (SSC) is a 5000+ hours speech dataset for weak supervision ASR created from audio and aligned proceedings text from Stortinget, the Norwegian...