Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
The resource is available via Korp in Kielipankki – the Language Bank of Finland. The corpus content has been annotated according to the Universal Dependencies version 2.10 (http://hdl.handle.net/11234/1-4758) for the following Uralic languages: Erzya, Estonian, Finnish, Hungarian, Karelian, Komi-Permyak, Komi-Zyrian,…
The Acquis Communautaire (AC) is the total body of European Union (EU) law applicable in the EU Member States. This collection of legislative text changes continuously and currently comprises selected texts written between the 1950s and now. As of the beginning of the year 2007, the EU had 27 Member States and 23 o…
*Introduction* CSLU: 22 Languages v 1.2 was developed by the Center for Spoken Language Understanding (CSLU) and contains approximately 84 hours of fixed vocabulary and fluent continuous telephone speech in 21 languages and orthographic transcriptions for a subset of the utterances. The corpus is distributed by the L…
HeLI off-the-shelf language identifier with language models for 200 languages. The program will read the <infile> and classify the language of each line as one of the 200 languages it knows and writes the results, one ISO 639-3 code per line, into file <outfile>. It can identify c. 3000 sentences per second using one c…
HeLI off-the-shelf language identifier with language models for 220 languages. # Performance It can identify c. 600-1700 sentences (averaging c. 150 characters) per second from a file using one core and around 4.3 gigabytes of memory on a modern laptop. # Requirements Java The software has been created and tested o…
The Europarl parallel corpus is extracted from the proceedings of the European Parliament. It includes versions in 21 European languages: Romanic (French, Italian, Spanish, Portuguese, Romanian), Germanic (English, Dutch, German, Danish, Swedish), Slavic (Bulgarian, Czech, Polish, Slovak, Slovene), Finno-Ugric (Finnish…
The Helsinki Korp version of the Opus open parallel corpus (http://opus.lingfil.uu.se/), containing scrambled sentences, has been published in Korp, http://urn.fi/urn:nbn:fi:lb-2016012101 The subcorpora of Opus, Helsinki Korp Version are: OPUS Finnish–Czech OPUS Finnish–Danish OPUS Finnish–Dutch OPUS Finnish–Engli…
Hunspell is the spell checker of LibreOffice, OpenOffice.org, Mozilla Firefox 3 & Thunderbird, Google Chrome, and it is also used by proprietary software packages, like Mac OS X, InDesign, memoQ, Opera and SDL Trados. Main features: Extended support for language peculiarities; Unicode character encoding, compound…