Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows. eSpeak uses a "formant synthesis" method. This allows many languages to be provided in a small size. The speech is clear, and can be used at high speeds, but is not as natural or smooth as larger synt…
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia). Changes in version 1.1: 1. Universal Dependencies tagset instead of the older and sma…
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Pretrained model weights for the UDify model, and extracted BERT weights in pytorch-transformers format. Note that these weights slightly differ from those used in the paper.
Recording of the words of the Swadesh list in Kurdish
Recording of basic interactions in Kurdish
Hafret Yi Kēye? (‘Who deserves the woman?’, lit. ‘Whose is the woman?’): A story with religious content. A carpenter, a tailor and a priest jointly turn a tree trunk into a beautiful woman. Whose should the woman be? This question is addressed to the audience at the end of the story. It is told in third person’s poi…
Recording of the story of Nasreddin Hodja's donkey in Kurdish
This package contains the system outputs from the CoNLL 2017 Shared Task in Multilingual Parsing from Raw Text to Universal Dependencies.
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal…