Search results

Finnish Folk Poetry

The corpus is available in Kielipankki - the Language Bank of Finland (korp.csc.fi), http://urn.fi/urn:nbn:fi:lb-2014052711. A 34-volume collection of Finnic oral poetry, lyric, short rhymes, incantations etc., collected and recorded from the 16th century to the 1930s and published mostly between 1908 and 1948, with a…

Finnish Karelian Livvi Ludian Votic … (+3)

Christmas Gospel text-to-speech in four Uralic languages, source

This resource is available for download in Kielipankki – the Language Bank of Finland. This resource consists of .txt and .wav files in four languages pertaining to the Finnish Christmas Gospel verses Luke 2:1–20. The four languages include Komi-Zyrian (kpv), Erzya (myv), Karelian (krl) and Olonets-Karelian (olo, aka …

Komi-Zyrian Erzya Karelian Livvi Kompane

Christmas Gospel text-to-speech in four Uralic languages, Korp

This resource is available via Korp in Kielipankki – the Language Bank of Finland. This resource consists of .txt and .wav files in four languages pertaining to the Finnish Christmas Gospel verses Luke 2:1–20. The four languages include Komi-Zyrian (kpv), Erzya (myv), Karelian (krl) and Olonets-Karelian (olo, aka Livv…

Komi-Zyrian Erzya Karelian Livvi Kompane

Uralic UD v2.10, Kielipankki Korp version

The resource is available via Korp in Kielipankki – the Language Bank of Finland. The corpus content has been annotated according to the Universal Dependencies version 2.10 (http://hdl.handle.net/11234/1-4758) for the following Uralic languages: Erzya, Estonian, Finnish, Hungarian, Karelian, Komi-Permyak, Komi-Zyrian,…

Erzya Estonian Finnish Hungarian Karelian … (+6)

Wanca 2016, Korp Version

The Korp version of Wanca 2016 is a collection of web corpora in small Uralic languages. The collection is composed of 29 sentence corpora in different languages. The corpora have been collected from the Internet using the automated system developed in the Finno-Ugric Languages and the Internet project (SUKI) supported…

Skolt Sami Inari Sami Fiu Karelian Moksha … (+25)

HeLI-OTS 1.1

HeLI off-the-shelf language identifier with language models for 200 languages. The program reads the <infile>, classifies the language of each line as one of the 200 languages it knows, and writes the results, one ISO 639-3 code per line, into the file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

HeLI-OTS 1.0

HeLI off-the-shelf language identifier with language models for 200 languages. The program reads the <infile>, classifies the language of each line as one of the 200 languages it knows, and writes the results, one ISO 639-3 code per line, into the file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

Wanca 2016, source

Wanca 2016 is a collection of web corpora in small Uralic languages. The collection is composed of 29 sentence corpora in different languages. The corpora have been collected from the Internet using the automated system developed in the Finno-Ugric Languages and the Internet project (SUKI) supported by the Kone foundat…

Skolt Sami Inari Sami Fiu Karelian Moksha … (+25)

HeLI-OTS 2.0

HeLI off-the-shelf language identifier with language models for 220 languages. Performance: it can identify c. 600-1700 sentences (averaging c. 150 characters) per second from a file using one core and around 4.3 gigabytes of memory on a modern laptop. Requirements: Java. The software has been created and tested o…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

HeLI-OTS 1.5

HeLI off-the-shelf language identifier with language models for 200 languages. The program reads the <infile>, classifies the language of each line as one of the 200 languages it knows, and writes the results, one ISO 639-3 code per line, into the file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

CLARIN Virtual Language Observatory

