Search results

HeLI-OTS 1.1

HeLI off-the-shelf language identifier with language models for 200 languages. The program will read the <infile> and classify the language of each line as one of the 200 languages it knows and writes the results, one ISO 639-3 code per line, into file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

HeLI-OTS 1.0

HeLI off-the-shelf language identifier with language models for 200 languages. The program will read the <infile> and classify the language of each line as one of the 200 languages it knows and writes the results, one ISO 639-3 code per line, into file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

HeLI-OTS 2.0

HeLI off-the-shelf language identifier with language models for 220 languages. Performance: It can identify c. 600-1700 sentences (averaging c. 150 characters) per second from a file using one core and around 4.3 gigabytes of memory on a modern laptop. Requirements: Java. The software has been created and tested o…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

HeLI-OTS 1.5

HeLI off-the-shelf language identifier with language models for 200 languages. The program will read the <infile> and classify the language of each line as one of the 200 languages it knows and writes the results, one ISO 639-3 code per line, into file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

HeLI-OTS 1.4

HeLI off-the-shelf language identifier with language models for 200 languages. The program will read the <infile> and classify the language of each line as one of the 200 languages it knows and writes the results, one ISO 639-3 code per line, into file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

HeLI-OTS 1.3

HeLI off-the-shelf language identifier with language models for 200 languages. The program will read the <infile> and classify the language of each line as one of the 200 languages it knows and writes the results, one ISO 639-3 code per line, into file <outfile>. It can identify c. 3000 sentences per second using one c…

Zulu Nenets Yiddish Mingrelian Walloon … (+195)

NCHLT Optical Character Recognition for South African Languages

(Part of SADiLaR Resource Catalogue)

An OCR system is an application that converts scanned paper documents into editable and searchable text. The engine analyses the structure of the document image and divides the page into elements such as blocks of text, tables and images. These blocks are used to identify character image patterns which are …

Afrikaans English South Ndebel.. Xhosa Zulu … (+6)

Translate.org.za isiZulu - isiXhosa Corpus 2012

(Part of SADiLaR Resource Catalogue)

isiZulu-isiXhosa translation memory.

Xhosa Zulu

African Wordnet: isiZulu 1.0

(Part of SADiLaR Resource Catalogue)

Developed using the expand model with Princeton WordNet 2.0 as basis. Each wordnet contains synsets with at least the following fields: word form (lemma; synonym); ID (linking to the Princeton WordNet 2.0); part of speech; domain; SUMO/MILO classification. Additional data may include the following fields: Usage exa…

Zulu

NCHLT isiZulu Named Entity Annotated Corpus

(Part of SADiLaR Resource Catalogue)

Named entity annotated data from the NCHLT Text Resource Development: Phase II Project, annotated with PERSON, LOCATION, ORGANISATION and MISCELLANEOUS tags.

Zulu

CLARIN Virtual Language Observatory
