-
NB-BERT
"NB-BERT-base is a general BERT-base model built on the large digital collection at the National Library of Norway. This model is based on the same structure as BERT Cased...
-
The Norwegian Colossal Corpus
"The Norwegian Colossal Corpus (NCC) is a collection of multiple smaller Norwegian corpuses suitable for training large language models. We have done extensive cleaning on the...
-
NST dansk ATG-database (16 kHz) – reorganisert
This database was created by Nordic Language Technology for the development of automatic speech recognition and dictation in Danish. In this updated version, the organization of...
-
NST Danish Dictation (22 kHz)
A collection of audio recordings at 22 kHz, 1 channel (mono). Originates from NST (Nordisk Språkteknologi), which went bankrupt in 2003. Maintained in the Norwegian Language Bank at the National Library....
-
NST Danish ATG Database (16 kHz)
This database was originally developed by Nordic Language Technology in the 1990s to facilitate automatic speech recognition in Danish. A reorganized and more user...
-
NST udtaleleksikon for dansk
This pronunciation lexicon for Danish was originally produced by Nordic Language Technology (NST), and contains approximately 238,000 entries. The word list consists of a...
-
NST N-gram – dansk nyhendetekst
This corpus contains Danish n-grams derived from a corpus of 290 million words of Danish news articles from the newspapers Berlingske Tidende, Ekstrabladet and Politiken....
-
Stortinget Speech Corpus version 1.0
The Stortinget Speech Corpus (SSC) is a 5000+ hours speech dataset for weak supervision ASR created from audio and aligned proceedings text from Stortinget, the Norwegian...
You can also access this registry via the API (see the API documentation).