Røst-315M

Publisher

Alexandra Instituttet

The Alexandra Institute takes societal challenges as its starting point, primarily the need of companies and organisations to translate the latest research results into...

Contact point

Dan Saattrup Nielsen

dan.nielsen@alexandra.dk

Organisation responsible for the dataset

Alexandra Instituttet

Company

URI: https://data.gov.dk/id/organization/7d67b754-51a8-46d0-bfd1-a96464f9093a

Creator

Alexandra Instituttet

Company

URI: https://data.gov.dk/id/organization/7e596929-7e46-4bfa-aa72-b0c657bd9a24

Qualified attribution

Digitaliseringsstyrelsen
Actor role: Collaborator (an actor that assists in generating the resource but is not the principal investigator)
Actor type: National authority

Københavns Universitet
Actor role: Collaborator (an actor that assists in generating the resource but is not the principal investigator)
Actor type: Academic/scientific organisation

Alvenir
Actor role: Collaborator (an actor that assists in generating the resource but is not the principal investigator)
Actor type: Company

Corti
Actor role: Collaborator (an actor that assists in generating the resource but is not the principal investigator)
Actor type: Company

Licenses

OpenRAIL-M

Dataset

Røst-315M

RØST-315M is a speech recognition model based on the CoRal dataset and is a product of the CoRal project. The project aims to produce comprehensive automatic speech recognition (ASR) datasets that capture the diversity of the Danish language across dialects, accents, genders, and age groups. The primary goal of the CoRal dataset is to provide a robust resource for training and evaluating ASR models that can understand and transcribe spoken Danish in all its variations.

This model is intended to be used for Danish automatic speech recognition.
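As a rough illustration of that intended use, the sketch below wraps the model in a small transcription helper. It assumes the model can be loaded through the Hugging Face `transformers` ASR pipeline under the ID `alexandrainst/roest-315m` (taken from the landing-page URL of this record); the helper names `load_transcriber` and `transcribe` are hypothetical and not part of any published API.

```python
from typing import Any, Callable, Dict, Optional

# Model ID taken from the landing-page URL of this record.
MODEL_ID = "alexandrainst/roest-315m"


def load_transcriber() -> Callable[[Any], Dict[str, Any]]:
    """Build an ASR pipeline for the model (downloads the weights on first use)."""
    # Imported lazily so that `transcribe` can be used with a custom
    # transcriber even when transformers is not installed.
    from transformers import pipeline

    # Assumption: the model is compatible with the generic ASR pipeline.
    return pipeline("automatic-speech-recognition", model=MODEL_ID)


def transcribe(
    audio: Any,
    transcriber: Optional[Callable[[Any], Dict[str, Any]]] = None,
) -> str:
    """Return the transcription of `audio` (a file path or a raw waveform)."""
    if transcriber is None:
        transcriber = load_transcriber()
    # The ASR pipeline returns a dict whose "text" key holds the transcription.
    result = transcriber(audio)
    return result["text"].strip()
```

Calling `transcribe("recording.wav")`, where `recording.wav` is a placeholder for a Danish audio file, would download the model and return the transcribed text. Passing a file path relies on ffmpeg being available for decoding. The optional `transcriber` argument lets the pipeline be built once and reused across many files.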

Note that biometric identification using the CoRal dataset or models derived from it is not allowed.

The dataset is licensed under a custom license adapted from OpenRAIL-M, which allows commercial use with a few restrictions (on speech synthesis and biometric identification). See the license for details.

A research paper will be submitted soon, but until then, if you use the CoRal dataset in your research or development, please cite it as follows:

@dataset{coral2024,
  author = {Dan Saattrup Nielsen and Sif Bernstorff Lehmann and Simon Leminen Madsen and Anders Jess Pedersen and Anna Katrine van Zee and Torben Blach},
  title = {CoRal: A Diverse Danish ASR Dataset Covering Dialects, Accents, Genders, and Age Groups},
  year = {2024},
  url = {https://hf.co/datasets/alexandrainst/coral},
}

Data and resources

RØST-315M, format: http://publications.europa.eu/resource/authority/file-type/HTML

Keywords

Additional information

URI	https://data.gov.dk/dataset/lang/c3cd6a9c-eba5-4c9b-9078-f8583c34b9a9
Landing page	https://huggingface.co/alexandrainst/roest-315m
Harvested by Datavejviser	No
Publication date	14-10-2024
Last modified date	15-10-2024
Update frequency	updated continuously
Coverage period	/
Subject(s)	16.05.07 Language and orthography; Education, culture and sport
Access rights	public
Conforms to
Provenance statement
Documentation	https://huggingface.co/alexandrainst/roest-315m/blob/main/README.md