CLARIN Virtual Language Observatory: Help

The Virtual Language Observatory (VLO) faceted browser was developed within CLARIN as a means to explore linguistic resources, services and tools available within CLARIN and related communities. It aims to provide an easy-to-use interface that allows a uniform search and discovery process across a large number of resources from a wide variety of domains and providers.

More information can be found on the VLO page on the CLARIN website. For answers to common questions about the VLO, please consult the FAQ. More documentation and other references are listed on the "About" page.

Faceted browsing

The VLO search interface presents a number of facets, for each of which a value can be selected in order to narrow down the selection of displayed records. For example, to only include records that relate to Spain as a country, open the facet Country and select the value Spain. Notice that next to each available value, a number is displayed that indicates the number of records within the current selection that contain that value, in other words the number of remaining records should that value be selected.
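The relation between a selection and the counts shown next to facet values can be sketched in a few lines of Python. This is only an illustration of the idea, not the VLO's implementation: the records, the function names `select` and `facet_counts`, and the assumption that each record has a single value per facet are all invented for the example.

```python
from collections import Counter

# Hypothetical records; field names and values are invented for illustration.
RECORDS = [
    {"country": "Spain", "language": "Spanish"},
    {"country": "Spain", "language": "Catalan"},
    {"country": "Germany", "language": "German"},
    {"country": "Spain", "language": "Spanish"},
]


def select(records, selection):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(facet) == value for facet, value in selection.items())
    ]


def facet_counts(records, facet):
    """Count how many records in the given selection carry each value of a facet."""
    return Counter(record[facet] for record in records if facet in record)


# With nothing selected, the count next to "Spain" is 3.
country_counts = facet_counts(select(RECORDS, {}), "country")

# Selecting Country = Spain leaves exactly those 3 records,
# and the counts of the other facets are recomputed for them.
narrowed = select(RECORDS, {"country": "Spain"})
language_counts = facet_counts(narrowed, "language")
```

In this sketch, the number displayed next to a value always equals the size of the selection that would result from choosing it.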

Only the values that occur in the current selection are shown, that is, in the records that match the already selected values and the optional textual query (see below). The VLO shows up to ten of the most frequently occurring values for each facet when you click on the facet name. If there are more than ten available values, a link labeled more... leads to a pop-up showing all available values (given the current selections), which can be filtered textually and sorted either alphabetically or by number of matching records. It is also possible to search for facet values by typing (part of) a value in the filter box ('Type to search for more') located below the facet name and above the facet values, in the panel next to the search results.
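The behaviour described above (a short list of the most frequent values, a "more" link when the list is truncated, and textual filtering with two sort orders) might be modelled roughly as follows. The function names, the case-insensitive substring matching and the sample counts are assumptions made for this sketch; the VLO's actual matching rules may differ.

```python
from collections import Counter


def top_values(counts, limit=10):
    """Return up to `limit` values, most frequent first."""
    return counts.most_common(limit)


def has_more(counts, limit=10):
    """True if a 'more...' link would be needed to reach the remaining values."""
    return len(counts) > limit


def filter_values(counts, text, sort_by="count"):
    """Return (value, count) pairs whose value contains `text`, ignoring case.

    sort_by="name" sorts alphabetically; anything else sorts by
    descending count, with the name as a tie-breaker.
    """
    needle = text.lower()
    matches = [(value, n) for value, n in counts.items() if needle in value.lower()]
    if sort_by == "name":
        return sorted(matches, key=lambda item: item[0].lower())
    return sorted(matches, key=lambda item: (-item[1], item[0].lower()))


# Hypothetical facet with twelve values whose counts run from 1 to 12.
counts = Counter({f"Language {i:02d}": i for i in range(1, 13)})

shortlist = top_values(counts)                       # ten most frequent values
by_name = filter_values(counts, "language 1", "name")
by_count = filter_values(counts, "language 1")
```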

Facets that do not have any matching records given the current selection will not be displayed in the facets panel in the VLO search interface.

Search syntax

In addition to navigating the resources by means of the selection of facet values, the VLO faceted browser also allows for searching by means of textual queries.

Enter such queries in the large text box at the top of the main page or the faceted browsing page, next to the button labeled 'Search'.

In its simplest form, a search query consists of one or more terms, separated by a space character. Such queries result in the retrieval of all documents that have one or more occurrences of all of the included search terms. In other words, an AND operator is implied by default.
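The implied AND can be illustrated with a deliberately naive matcher. The documents and the function names `matches_all` and `search` are hypothetical, and the whitespace-based, case-insensitive matching is a simplification; the real search engine analyses text in more sophisticated ways.

```python
# Hypothetical documents, invented for illustration.
DOCS = {
    "doc1": "Dutch spoken corpus",
    "doc2": "Dutch lexicon",
    "doc3": "Spanish spoken corpus",
}


def matches_all(text, query):
    """True if every space-separated query term occurs as a word in the text."""
    words = set(text.lower().split())
    return all(term in words for term in query.lower().split())


def search(query):
    """Return the ids of all documents that contain all of the query terms."""
    return [doc_id for doc_id, text in DOCS.items() if matches_all(text, query)]


both_terms = search("dutch corpus")   # only doc1 contains both terms
one_term = search("corpus")           # doc1 and doc3
```

Adding a term to a query of this kind can only keep the result set the same or make it smaller, which is the practical consequence of the implied AND.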

Advanced querying

It is possible to construct a more specific query by means of a set of syntax features that can be processed by the VLO. The supported syntax is that of the Lucene Query Parser (see footnote 1).

The Lucene Query Parser syntax allows for the following boolean operators: 'AND', 'OR', 'NOT', '+' and '-'. It also allows for grouping of terms by means of parentheses. Terms can be combined into phrases by means of double quote characters.
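As a sketch of how these operators look in practice, the Python snippet below lists one query per feature and turns each into a search link. The search terms are invented, and the base URL and the `q` parameter name are assumptions about how the VLO accepts queries; check them against the live site before relying on them.

```python
from urllib.parse import urlencode

# Assumed search endpoint and parameter name; not confirmed by this help page.
BASE_URL = "https://vlo.clarin.eu/search"

# One illustrative query per syntax feature (the terms themselves are made up).
EXAMPLE_QUERIES = [
    "corpus AND dutch",               # both terms required
    "corpus OR lexicon",              # either term suffices
    "corpus NOT dutch",               # first term without the second
    "+corpus -dutch",                 # '+' requires a term, '-' excludes one
    "(corpus OR lexicon) AND dutch",  # parentheses group sub-queries
    '"spoken corpus"',                # double quotes form a phrase
]


def search_url(query):
    """Build a search link, percent-encoding the operators where needed."""
    return f"{BASE_URL}?{urlencode({'q': query})}"


urls = [search_url(query) for query in EXAMPLE_QUERIES]
```

Encoding matters here because '+' has a special meaning in URLs: passed unencoded, it would be read as a space rather than as the "required" operator.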


Targeting fields

In addition to the logical operators, the syntax also allows searching for occurrences of a term within a specific field, such as language or modality. To do so, enter the name of the field and the term to search for, separated by a colon. The asterisk ('*') can be used to achieve partial matches. Quotes are required to match a term that contains spaces.

The following field names are available: language, country, continent, modality, genre, subject, format, organisation, resourcetype, keyword, resources.
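A small helper can assemble field-targeted queries from the field names listed above. The function name `field_query` and the sample terms are hypothetical; the sketch only applies the rules stated in this section (field and term joined by a colon, quotes around terms containing spaces, '*' for partial matches).

```python
# Field names as listed in this help page.
FIELDS = {
    "language", "country", "continent", "modality", "genre", "subject",
    "format", "organisation", "resourcetype", "keyword", "resources",
}


def field_query(field, term):
    """Return a 'field:term' query, quoting the term if it contains whitespace."""
    if field not in FIELDS:
        raise ValueError(f"unknown field: {field!r}")
    if any(ch.isspace() for ch in term):
        # Quotes are required for terms with spaces; escape embedded quotes.
        term = '"' + term.replace('"', '\\"') + '"'
    return f"{field}:{term}"


simple = field_query("language", "Dutch")           # language:Dutch
quoted = field_query("country", "United Kingdom")   # country:"United Kingdom"
partial = field_query("subject", "lingu*")          # subject:lingu*
```

Field queries built this way can be combined with the boolean operators, for example by joining `simple` and `partial` with ' AND '.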


A full overview of syntax features, including options for fuzzy search, ranges, and term boosting, can be found at the Lucene syntax description page.


1 Support for the Lucene syntax was implemented by using the Solr Extended DisMax Query Parser.