Developing an agenda and a roadmap for achieving full digital language equality in Europe by 2030

How does the European Language Grid strengthen linguistic diversity?


Europe consists of more than 40 different countries and even more cultures. Everyone brings something unique to the table, languages being one of the more obvious aspects. Although it is possible to encounter five different languages within a fifteen-minute train ride, this diversity is less well represented in the digital world, and especially in language technology. As the META-NET White Paper Series showed in 2012, tools like machine translation, text-to-speech applications and text summarisation work predominantly in English, with languages like German, French and Spanish following closely behind. Languages with weaker support include Icelandic, Latvian, Welsh and Irish.

In order to preserve and strengthen Europe’s unique linguistic diversity, languages that are less widespread need to be equally supported and represented. Welsh serves as a fitting example here: although the overall use of the language was declining, the last few decades have been marked by revitalisation efforts (governmental, scientific and social) that aim to make bilingualism more common in Wales. One of the key aspects of this is strengthening bilingual communication and representation online.

For many, English is the go-to language of the internet. Not only is it used in communication; many websites also default to English even though versions in other languages are available. Looking at the big picture, this risks smaller languages falling by the wayside. On an individual level, there is another reason why this is an issue: not everyone speaks English, and for some of those who do, it can be a chore to get through a paragraph they would much more comfortably read in their own language. Returning to Welsh, there is a tool that makes a start on overcoming this issue: the Welshify Widget. The plugin lets users know when a Welsh version of a website is available and guides them through the process of changing their browser settings to set Welsh as their preferred language.
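To make the idea more tangible, here is a minimal Python sketch of the kind of check such a tool could perform. It is not the Welshify Widget's actual code, and we do not know how the widget is implemented. The sketch assumes that a website announces its Welsh version with a `<link rel="alternate" hreflang="cy">` tag and that the user's language preferences are available as an `Accept-Language` string. The names `AlternateLinkFinder`, `preferred_languages` and `welsh_hint` are invented for this illustration.

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collects hreflang -> href pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        hreflang = attributes.get("hreflang")
        href = attributes.get("href")
        if "alternate" in rel_values and hreflang and href:
            self.alternates[hreflang.lower()] = href


def preferred_languages(accept_language):
    """Return the language tags of an Accept-Language string, best first."""
    weighted = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((quality, tag.strip().lower()))
    # sorted() is stable, so tags with equal weight keep their original order
    return [tag for _, tag in sorted(weighted, key=lambda item: -item[0])]


def is_welsh(tag):
    """True for 'cy' and regional variants such as 'cy-gb'."""
    return tag == "cy" or tag.startswith("cy-")


def welsh_hint(html, accept_language):
    """Return the URL of the Welsh version if the user should be told about it.

    Returns None when the page announces no Welsh version, or when Welsh is
    already the user's first-choice language.
    """
    finder = AlternateLinkFinder()
    finder.feed(html)
    welsh_url = next(
        (href for lang, href in finder.alternates.items() if is_welsh(lang)),
        None,
    )
    if welsh_url is None:
        return None
    languages = preferred_languages(accept_language)
    if languages and is_welsh(languages[0]):
        return None
    return welsh_url
```

A real browser plugin would run inside the browser and read the language settings directly. The sketch only shows the decision itself: suggest Welsh when a Welsh version exists and Welsh is not yet the preferred language.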

By highlighting Welsh versions of websites, the widget fosters an online environment that is more inclusive towards Welsh native speakers. There are a variety of digital language tools that have similar effects for a wide range of European languages, by making smaller languages available in the digital world and supporting their usage. Each one of them contributes towards strengthening linguistic diversity and equality among European languages.

In an effort to reach those goals, it is necessary to know where each European language has gaps in digital support. The European Language Equality (ELE) project examines 70+ European languages individually, analysing where sufficient support exists and where more is needed. The results of this research will be presented in a strategic agenda and roadmap, detailing what needs to be done to reach digital language equality by 2030.

In order to make that equality a reality, language resources need to reach their intended user base. Potential consumers need to know what is available. The European Language Grid (ELG) aims to facilitate this, among other things. The ELG is a platform that hosts European language technologies with the goal of becoming their primary hub. Companies and research facilities can upload and link their projects on ELG. Having one centralised hub like the ELG will enable developers to get the word out about their products, while users have an easier time finding and downloading the type of tool they want.

ELG also allows developers to test their tools or services, which in turn makes them easier and faster to finalise. This is also aided by the communication that is made possible through the ELG. Language technology developers are able to learn from and collaborate with each other, which, among other things, opens the door to adapting existing tools to other European languages. Faster development of tools and closer communication within the language technology community will quickly create more available technologies and resources. The greater number and visibility of these resources will not only boost individual languages but also strengthen the linguistic diversity that already exists in Europe.

Tools like the Welshify Widget make the online experience more inclusive for non-English speakers and help revitalise the language of a European culture. The ELG, as the main hub for European language technology, aims to provide the platform for projects like these to reach their full potential and work towards digital language equality.


How do ELE and ELG work together towards Digital Language Equality?



Europe’s diversity in terms of culture and communication sets it apart from other major players in the global field of Language Technology (LT), which usually concentrate on single languages. The number of European languages provides a unique opportunity to work together and to learn from each other in the process of developing digital language tools. In order to realise this potential, it is crucial that every official, unofficial and minority language is equally represented in the digital world and LT landscape. This is one of the reasons why Digital Language Equality (DLE) is an important goal that needs to be actively worked towards. One of the main aspects of this work is handling the fragmentation of the European LT landscape that is still prevalent. The ELG project addresses this issue by building a platform, the European Language Grid (ELG), that aims to host all European LT resources. Having one unified hub will support the LT community greatly. Developers will have an easier time getting the word out about a product, while consumers are more likely to hear of it and be able to use it. Furthermore, a centralised platform will give LT creators a broader reach and encourage collaboration, communication and learning from one another.

While ELG is bringing the LT landscape together, it is also necessary to combat the existing language inequality actively and directly. This is where ELE comes in: during the runtime of the project, 70+ European languages are being researched and analysed to find out where exactly the inequalities lie. This effort spans Europe’s official languages as well as many unofficial and minority languages. By the end of the project, a strategic agenda and roadmap detailing the best approach to the existing discrepancies will be presented. This research will lay important groundwork for a long-term funding program that will be able to provide support based on the discovered disparities.

ELE and ELG are working in tandem to combat the digital inequality found among European languages. ELG is establishing a platform and marketplace to bring the LT community together. Meanwhile, ELE is dedicated to understanding which aspects need to be focused on to reach DLE by 2030 as well as establishing a funding program as a tool to ease the way.

The goal of DLE is only possible with the ELE research as a blueprint and the ELG platform as the facilitator of that vision. These combined efforts aim to create an environment where the barriers that currently fragment the European LT landscape fall and languages are able to flourish alongside and in interaction with each other.


What is Digital Language Equality?

Europe shines in its diversity, which is expressed in part in its multilingualism. According to the European Constitution, all 24 official European languages are equal. Unfortunately, in the digital age, that is not entirely the case, as there are notable discrepancies in the field of Language Technology (LT). Back in 2012, the META-NET White Paper series “Europe’s Languages in the Digital Age” showed that languages with more speakers had better support through Language Technology. For example, Spanish had fairly strong LT support, though not quite on the same level as English. Among the lesser-spoken languages, Estonian was slightly better equipped, though Machine Translation in particular showed some gaps.

Differences like these pose a challenge to preserving and nurturing Europe’s multilingualism. Considering the current LT landscape, every language has its own gaps and its own needs for the future. Therefore, it is necessary to address and support each language individually.

To that end, the EU-funded project European Language Equality (ELE) is re-examining the LT support of the 31 languages covered in the META-NET White Papers ten years ago, alongside previously unevaluated ones. In total, ELE’s efforts span the 24 official and 32 additional EU languages as well as 33 endangered minority languages. Over the course of the project, the results of this research will form the basis of a strategic agenda and roadmap towards ELE’s main goal: Digital Language Equality (DLE) by 2030.

DLE can come across as a vague term, so a specific definition is imperative in order to know what we are working towards. Our preliminary definition describes DLE as all relevant languages having the necessary support to “continue to exist and prosper as living languages in the digital age”.

This necessary support involves two categories of factors, though they are not without overlap. First, there are technological factors. Some examples are tools and services (e.g., grammar checkers), corpora (e.g., audio transcripts) and projects or organisations active in the LT community. The second category involves contextual factors, which are essentially the political, social and economic situation in the region where a language is spoken.

In order for this definition to be useful when examining the current LT support of a language, these factors need to be accurately quantifiable. So far, no such score exists, which is why ELE is creating the “DLE metric”. As of now, the metric consists of a comprehensive list of the aforementioned factors that make up a language’s LT support. Details such as how to score and weight the individual factors, including the introduction of potential penalties, will be worked out over the course of the project.

Once complete, the metric will enable the direct comparison of the technology support of our languages. Because it is based on empirical data, it will allow us to identify current problem areas as well as future priorities. Additionally, the metric will enable us to track the development of the LT landscape for each individual language over time, creating a long-term overview.
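To illustrate what scoring, weighting and penalties could look like in practice, here is a minimal Python sketch. It is not the DLE metric, which is still being worked out. It only assumes that each factor receives a score between 0 and 1, a weight and an optional penalty, and that the result is a weighted average. All factor names, numbers and the language labels below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Factor:
    name: str
    category: str         # "technological" or "contextual"
    score: float          # hypothetical normalised score between 0 and 1
    weight: float         # hypothetical relative importance
    penalty: float = 0.0  # hypothetical deduction from the score


def dle_score(factors):
    """Weighted average of the penalised factor scores (illustrative only)."""
    total_weight = sum(f.weight for f in factors)
    if total_weight == 0:
        raise ValueError("at least one factor needs a non-zero weight")
    weighted_sum = sum(
        f.weight * max(f.score - f.penalty, 0.0) for f in factors
    )
    return weighted_sum / total_weight


def rank_languages(profiles):
    """Return language names ordered from best to weakest support."""
    return sorted(profiles, key=lambda name: dle_score(profiles[name]), reverse=True)


# Invented example data for two placeholder languages.
example_profiles = {
    "Language A": [
        Factor("grammar checkers", "technological", 0.8, 2.0),
        Factor("audio transcripts", "technological", 0.6, 1.0),
        Factor("economic situation", "contextual", 0.7, 1.0),
    ],
    "Language B": [
        Factor("grammar checkers", "technological", 0.3, 2.0),
        Factor("audio transcripts", "technological", 0.2, 1.0, penalty=0.1),
        Factor("economic situation", "contextual", 0.5, 1.0),
    ],
}

for language in rank_languages(example_profiles):
    print(language, round(dle_score(example_profiles[language]), 3))
```

Storing such scores with a date attached would be one simple way to follow a language's support over time, again assuming that the final metric yields a single comparable number per language.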

The ability to measure the level of LT support in a way that is precise and consistent across languages will form an important step towards our primary goal – establishing Digital Language Equality by 2030.