Welcome to the VLO!
Use the search bar below to start searching through hundreds of thousands of language resources, or continue to browse everything and use facets to narrow down to your area of interest or discover new resources.
Use the categories below to limit the search results to those matching the selected value(s).
These levels provide an indication of the degree to which resources and tools are publicly accessible. Please check the specific conditions on any resource or tool that you end up using.
Large set of 2274 recordings (approx. 360h) of spoken dialectal German (Saxon) recorded in Transylvania (Romania) in approx. 250 different locations. This previously unpublished material was collected on analog tape in the 1960s and 70s by different linguists based at the universities of Bucharest, Hermannstadt a…
HeLI off-the-shelf language identifier with language models for 200 languages. The program reads the <infile>, classifies the language of each line as one of the 200 languages it knows, and writes the results, one ISO 639-3 code per line, to the file <outfile>. It can identify c. 3000 sentences per second using one c…
HeLI off-the-shelf language identifier with language models for 220 languages. Performance: it can identify c. 600-1700 sentences (averaging c. 150 characters) per second from a file using one core and around 4.3 gigabytes of memory on a modern laptop. Requirements: Java. The software has been created and tested o…
The AThEME Verona-Trento Corpus is a spoken corpus composed of data collected during the AThEME project in Work Package 2 ‘Regional Languages’ by the units of Verona and Trento for minority languages and dialects spoken in the area between Innsbruck and the Po Valley (Tyrolean, Trentino, Fodom Ladin, Fassan Ladin, Mòc…
The database contains about 5 million items of dialectal linguistic evidence collected in different projects within the Free State of Bavaria on the Bavarian, Franconian, and Swabian dialects. In 1984, linguists at the University of Augsburg began to collect dialect data for the research and documentation project "Linguistic …
The database offers access to over 6 million items of dialectal linguistic evidence from the project "Dictionary of Bavarian Dialects" (German: Das Bayerische Wörterbuch) as image snippets, partly lemmatized, with lemmatization ongoing. The area covered by the Dictionary of Bavarian Dialects (Bayerisches Wörterbuch) comprises Upper Bavari…