Developing an agenda and a roadmap
for achieving full digital language
equality in Europe by 2030

Call for Papers: Towards Digital Language Equality workshop at LREC 2022

As one of the most important applications of Artificial Intelligence with a fast-growing economic impact, Language Technology (LT) is revolutionizing many language-related tasks. Although cross-language communication forms an important part of this development, LT resources are not equally available to all languages and domains. To make use of the full potential of Language Technology, progress towards multilingual, efficient, accurate, explainable, ethical, fair and unbiased language understanding is necessary – in short: Digital Language Equality (DLE).

For the past year, the European Language Equality (ELE) project has been working on this subject and is now hosting the workshop Towards Digital Language Equality (TDLE 2022) in parallel to the 13th Language Resources and Evaluation Conference (LREC) from 20 to 25 June 2022 in Marseille, France. The workshop focusses on policies, initiatives, projects, studies and research on Digital Language Equality at all levels – regional, national and European – and explores recent advances in Natural Language Understanding.

As part of the TDLE 2022 workshop, a Call for Papers invites researchers to submit their work. The topics of interest include a range of DLE-related areas such as use cases for LT-deployment, monitoring tools, policy analysis and the societal impact of Digital Language Equality. Papers can be submitted until 11 April 2022. The full description of the workshop, a list of topics of interest, important dates, the committee members and contact details can be found on our TDLE 2022 event page. We are looking forward to many helpful submissions on the path towards Digital Language Equality!

Spanish government invests 1.1 billion Euros in “New Language Economy”, including co-official languages

On 1 March 2022, the Spanish government announced that the Council of Ministers, the main decision-making body of the country, had approved a new “Strategic Project for Economic Recovery and Transformation” (PERTE). PERTEs are an instrument for public collaboration with the private sector that was put into action in December 2020 to support the economy during and after the Covid pandemic. The project, titled “New Language Economy” (“Nueva Economía de la Lengua”), aims to mobilise public and private investment in order to maximise the value of Spanish and the country’s co-official languages on a global level during the process of digital transformation.

A total of 1.1 billion Euros of investment is meant to give an impulse to the entire value chain of the language, AI and knowledge economy in Spain, all of which are central aspects of digitalisation. This strategic initiative is intended to combine a push for key sectors and emerging markets with the transition towards a digital economy. According to the notice of the Council of Ministers, the primary aim of the project is to guarantee that AI “thinks” in Spanish and that Spanish-speaking businesses and citizens play a leading role in the emergence and creation of quality employment.

At least 30 million Euros of the investment are meant to be directed at co-official languages, which are spoken by around 30% of the Spanish population. A total of 14 projects, coordinated by public administration, universities, research centres, businesses and industry, will receive funding, following five topical “axes”: the creation of a knowledge base in Spanish and in the co-official languages, Artificial Intelligence, science, the teaching and learning of Spanish as well as Spanish in the world and in cultural industries.

One central aspect of the “New Language Economy” programme will be the creation of a corpus of Spanish and the co-official languages, for which 97 million Euros are earmarked. 330 million Euros will support AI projects, while 70 million Euros are directed towards the cultural and creative sector.

For the European Language Technology community and the goal of Digital Language Equality, this is exciting news. German Rigau, Professor at the Basque Center for Language Technology of the University of the Basque Country, one of the Spanish partners in the European Language Equality project, was involved in the preparation and planning of the “Impulse of Language Technology Plan” (Plan TL), which is continued and extended by the new PERTE project. Regarding the recent approval, German Rigau states:

“The approval of the investments into a new language economy represents great news for our work at the University of the Basque Country, for Spain and all its languages and for Europe in general. We have helped develop the preceding ‘Plan TL’ since 2014 and are happy that the recent approval recognizes the importance of language in the digital world, in AI, in culture, but also in terms of creating equality. Hopefully, other countries will follow in this recognition.”

Redressing the balance: Jill Evans on promoting Welsh in the EU and laying the groundwork for the ELE project

Jill Evans is a Welsh politician of the Plaid Cymru party who has worked tirelessly to further the goal of language equality in Europe. During her time as a member of the European Parliament from 1999 to 2020, she was the first person to use the Welsh language in parliamentary debate and wrote the report “Language Equality in the Digital Age”, which laid the groundwork for the ELE project as a whole.

Can you please tell us a bit about yourself: What work have you done and are you currently doing with regard to the Welsh language?

I was born in the Rhondda Valley in 1959. I attended my local primary school which was an English language school. Given the more recent expansion of Welsh medium education, I am delighted that the school is now a Welsh language school. I went to Tonypandy Grammar School which was also English language, but I had inspirational Welsh teachers to whom I owe my fluency in Welsh today. I gained a degree in Welsh at Aberystwyth University and was later awarded a Master of Philosophy degree for my work in developing an ‘O’ level examination for adults learning Welsh as a second language. I have always been active in politics and in the peace movement. In 1999 I was elected to the European Parliament as a Plaid Cymru MEP representing Wales. I was subsequently re-elected for four terms until 2020, when the UK left the European Union, much against my will. I have retired from elected politics, but I am the Plaid Cymru Director of International Affairs, Chair of the Wales Campaign for Nuclear Disarmament and a board member of the recently established Academi Heddwch, Wales’s peace institute.

What is your involvement with the European Language Equality project?

I was very grateful to the experts and now partners of the European Language Equality project for the invaluable assistance on my report on language equality, as well as for all that they have done in highlighting the huge potential for the development of language technologies in Europe.

From your perspective, how has the situation of the Welsh language developed over the last years and how do you view these developments?

Sixty years ago this month, Saunders Lewis gave a famous lecture entitled “The Fate of the Language” that called for urgent action and led to the establishment of the Welsh Language Society, Cymdeithas yr Iaith Gymraeg, to campaign for the future of the language. The current Welsh Government has adopted an ambitious target of one million Welsh speakers by 2050, which is widely supported but also dependent on action in many other areas of policy, such as housing, employment and sustainable communities. It has also announced free Welsh lessons for all 16- to 25-year-olds from September, as well as for teachers, head teachers and teaching assistants, and an e-learning resource pilot project for 16- to 18-year-olds in school, college or on an apprenticeship scheme. It is “…a small but crucial contribution in our efforts to expand Welsh language citizenship to everyone”. Wales is unique in having a Wellbeing of Future Generations Act that requires all public bodies to consider the social, economic, environmental and cultural well-being of the nation. Its digital service standards include meeting the needs of people who use the Welsh language in their everyday lives and providing services that promote and facilitate the Welsh language.

You were the first person to use Welsh in a debate in the European Parliament. What was this experience like?

I campaigned consistently for the Welsh language to become an official EU language, as it is an official language in Wales. When I was first elected, only the EU official languages were allowed to be spoken in the parliament chamber. We succeeded in changing that rule to allow other languages to be spoken but without interpretation. We won co-official status for Welsh following similar recognition of the Spanish state languages, but it was still far from the equality we sought. When Tony Blair came to Brussels to address the European Parliament during the UK presidency, it was too good an opportunity to miss, so I made part of my speech in Welsh and my microphone was not turned off! It attracted a lot of press coverage. A nice footnote to an otherwise very sad end to my time as an MEP is that I spoke Welsh again in my final speech and although there was no interpretation, the translation unit contacted me the next morning for a transcript and translation so they could include my whole speech on record.

Do you consider Welsh to be well-represented in the digital world?

The development of new technologies gives us an unprecedented opportunity to ensure true equality between languages. The Welsh Government has adopted a Welsh Language Technology Action Plan with the key aim of developing artificial intelligence to enable machines to understand spoken Welsh. Improving computer-assisted translation is also an important element. I supported the government’s initiative, which has meant that the excellent work done by Canolfan Bedwyr in Bangor University, for example, can be supported and promoted. In the online world, English continues to dominate, but given adequate resources and policies and working with ELE we can redress the balance.

What does Digital Language Equality mean to you and what could it mean for the Welsh language?

Digital language equality means that speakers of all languages, including minority and endangered languages, have the same access to digital resources as speakers of dominant languages. For Welsh, this means achieving true equality with English, in particular in the online world. This will have a particular impact on children who tend to spend more time in the digital world and could help accelerate a shift in the “real world”.

What do you think needs to be done for a language to be considered equal to others in a digitized society?

People of all communities should have equal access to digital tools. The first step in achieving that is the ELE project looking at the language technology support of different languages and on this basis developing a strategic agenda and a roadmap towards equality. It is this research and targeting that will enable all languages to flourish in the digital age, given the right support.

What do you consider the main differences between supporting a language offline or online?

I believe that providing support for languages online has enormous potential in terms of accessibility to tools and resources but also in achieving and confirming equal status for all languages, which is a very important principle. In an increasingly globalised world, that means that all languages, whatever the number of speakers and whether official or unofficial, can take their place side by side.

In 2018 you wrote a report for the European Parliament titled “Language Equality in the Digital Age”, which is the basis for the ELE project. What motivated you the most about this work and what changes have you observed since then?

When I first proposed a report on language equality in the digital age to the European Parliament’s Culture and Education Committee it provoked huge interest, both there and in the Industry Committee. That was, I believe, for two reasons: the first was that there is digital language inequality in Europe, even for some of the EU official languages; secondly, that this was important for all those interested in EU competitiveness in the digital industries. The idea for the report had come from the STOA study (the EU research body) which showed the social and economic consequences of language barriers and the widening of the technological gap. As someone who had long campaigned for the Welsh language, I could identify with the problems but also recognise the potential of a major EU project in this field. During the course of the drafting of the report I learned a great deal from the many organisations campaigning for language rights and from those engaged in the ground-breaking work in developing the technologies. I am proud that the overwhelming vote in favour of my report in the European Parliament helped the Commission in establishing the language equality project and supporting the exciting work that is taking place now.

Is there a digital Welsh language tool or application that you enjoy using?

As someone who does a lot of writing in Welsh, I really appreciate free software like Cysgliad with a spell checker and dictionary. I am always delighted to see new made-in-Wales developments which strengthen linguistic diversity.

“A matter of life and death”: Inaki Irazabalbeitia on language equality and assuring a future for the Basque Language

Inaki Irazabalbeitia is an important figure in the socio-political world of the Basque language. This includes the field of language technology, political parties and foundations as well as the Basque Language Academy. Mr. Irazabalbeitia was a member of the European Parliament from 2013 to 2014 and continues to advocate for equality of the Basque language, for instance as mayor of the Basque village of Alkiza.

Can you please tell us a bit about yourself: What projects are you currently working on with regards to the Basque language?

I was born in Donostia-San Sebastian in 1957 into a bilingual family. I studied chemistry at the University of the Basque Country and got a Ph.D. in 1986. I spent the majority of my professional career working for the normalization of the Basque language at the Elhuyar Foundation, where I was CEO from 1995 to 2003 and general manager of Eleka Ingeniaritza Linguistikoa, the language engineering branch of Elhuyar (2006-2011).

In 2012 I entered politics full time. I was a Member of the European Parliament (2013-2014). Currently I’m retired, but I’m still in politics as mayor of my adoptive village, Alkiza. Furthermore, I continue to be attached to the Ezkerraberri foundation, a socio-political Basque organization. I am still actively involved in LT-related initiatives. I helped out in the drafting of MEP Jill Evans’s report on language equality in the digital age. For the last four years, I’ve been advising the department of Language Policy of the Basque Government on IT and LT policies.

What is your involvement with the European Language Equality project?

I helped Georg Rehm and Olga Perez (advisor for the Greens/EFA parliamentary group on topics related to education, culture and media) in promoting the idea of the need for an agenda and roadmap for achieving full digital language equality by means of a pilot project of the EP and in the preliminary steps of the definition of the project.

From your perspective, how has the situation of the Basque language developed over the last years and how do you view these developments?

I would highlight different aspects. First, the Basque LT community has made a great R&D investment in order to improve and develop tools and resources. New machine translation tools, based on AI and neural nets, have dramatically improved the quality of the output of commercial MT systems. This is probably the most remarkable achievement from the point of view of language professionals and the general public. Second, the creation in 2010 of Langune, the Basque Association of Language Industries, represented a step forward in the cohesion, visibility and impact of the sector.

Third, the position of the Basque Government towards language technologies has changed positively in the last 5 years. Although LTs were one of the pillars of the Science, Technology and Innovation Plan in 2001-2004, they lost relevance in the following plans. For instance, in the 2010 plan, all references to LT disappeared and the language industry was mentioned only a couple of times. The government fell for the charm of big foreign LT actors and considered the local LT community subsidiary. The persons currently in charge of Language Policy in the Basque Government positively believe that a strong local LT community is one of the keys to assuring a future for the Basque Language in the digital world.

In your opinion, how well is the Basque language represented in the digital world?

I would say it is similar to that of the non-hegemonic languages. In STOA’s report Language equality in the digital age – Towards a Human Language Project (2017), Basque appeared in the group of languages with fragmentary support together with many of the official EU languages such as Danish, Greek or Polish. Clearly, English and other hegemonic languages such as French, German or Spanish are better represented. In the case of Basque, that is simultaneously an opportunity and a threat. A threat because it is enclosed by two of the biggest languages of the world, French and Spanish, and almost all Basque speakers are bilingual – be it Basque-Spanish or Basque-French. But, at the same time, that situation is an opportunity, because the Basque LT community has the linguistic knowledge to work in French and Spanish as well. In other words, it opens the door to a bigger market.

Some areas such as MT have developed enormously, but there is still a long way to go to reach a fair representation of Basque in the digital world. Political support and investments in R&T and education as well as the participation of our LT community in European networks and projects like the European Language Equality project are crucial.

What does Digital Language Equality mean to you and what could it mean for the Basque language?

I think that in the European context we can say that there is digital language equality once all languages, regardless of the number of speakers, can offer similar levels of tools and resources to their speakers. In the case of Basque, French and Spanish are our mirrors.

What do you consider the most important requirement for language to become equal in a digitized society?

Back to my previous answer, all languages should have the possibility of offering resources and tools similar to those offered by hegemonic languages to their communities.

In 2020, the Ezkerraberri Foundation published a book about the lack of representation of non-hegemonic languages in audiovisual media. If you had to sum up the key message of the book, what would it be?

In the case of regional languages like Basque or Catalan, the access to media platforms is a matter of life and death for the survival of the language, since the majority of speakers are bilingual. In the case of Basque in Spain, if Basque speakers do not have access to or cannot enjoy audiovisual content in Basque but they can do so in Spanish, the smaller language, Basque, suffers tremendously due to the diglossic situation it falls into. We know where that leads the weakest language … That’s why states should set up legislation to secure the presence of non-hegemonic languages in the audiovisual offer. Unfortunately, you can’t take it for granted even in those so-called plurilingual states.

Currently the Spanish government is preparing a new audiovisual law to transpose EU Directive 2018/1808, amending the Audiovisual Media Services Directive. The document that the government is to send to Parliament for approval strongly defends the presence of Spanish in those services, but it doesn’t do the same for Basque, Catalan or Galician.

Do you think language technology could contribute to the support of the Basque language and others like it in the field of audiovisual media?

For sure! Let me give you an example. The production of subtitles is cheaper and faster if you can use voice-to-text technologies, followed by MT. I have no doubt that non-hegemonic languages need LT to ensure a sustainable audiovisual media offer and production.

Do you have a favourite digital Basque language tool or application?

I’m in love with the ELIA machine translation tool. It makes translation easier!

Is there one you would like to see for the Basque language that doesn’t exist yet?

Yes. A voice recognition tool that is able to identify and properly transcribe the different accents, dialects and speech varieties of Basque.

How does the European Language Grid strengthen linguistic diversity?

Europe consists of more than 40 different countries and even more cultures. Everyone brings something unique to the table, languages being one of the more obvious aspects. Although it is possible to encounter five different languages within a fifteen-minute train ride, this diversity is less well represented in the digital world and especially in language technology. As was shown in the META-NET White Paper Series in 2012, tools like machine translation, text-to-speech applications and text summarisation work predominantly in English, with languages like German, French and Spanish following closely behind. Languages with weaker support include Icelandic, Latvian, Welsh and Irish.

In order to preserve and strengthen Europe’s unique linguistic diversity, languages that are less widespread need to be equally supported and represented. Welsh serves as a fitting example here: although the overall use of the language had been declining, the last few decades have been marked by revitalisation efforts – governmental, scientific and social – that work towards making bilingualism more common in Wales. One of the key aspects of this is strengthening bilingual communication and representation online.

For many, English is the go-to language of the internet. Not only is it used in communication; a lot of websites also default to English even though versions in other languages are available. Looking at the big picture, this risks smaller languages falling by the wayside. On an individual level, there is another reason for this to be an issue: not everyone speaks English, and for some of those that do, it can be a chore to get through a paragraph they would much more comfortably read in their own language. Once again regarding Welsh, there is a tool that provides a start in overcoming this issue: The Welshify Widget. The plugin lets users know when a Welsh version of a website is available and guides them through the process of changing their browser settings to set Welsh as their preferred language.

By highlighting Welsh versions of websites, the widget fosters an online environment that is more inclusive towards Welsh native speakers. There are a variety of digital language tools that have similar effects for a wide range of European languages, by making smaller languages available in the digital world and supporting their usage. Each one of them contributes towards strengthening linguistic diversity and equality among European languages.

In an effort to reach those goals, it is necessary to know where each European language has gaps in digital support. The European Language Equality (ELE) project examines 70+ European languages individually, analysing where sufficient support exists and where more is needed. The results of this research will be presented in a strategic agenda and roadmap, detailing what needs to be done to reach digital language equality by 2030.

In order to make that equality a reality, language resources need to reach their intended user base. Potential consumers need to know what is available. The European Language Grid (ELG) aims to facilitate this, among other things. The ELG is a platform that hosts European language technologies with the goal of becoming their primary hub. Companies and research facilities can upload and link their projects on ELG. Having one centralised hub like the ELG will enable developers to get the word out about their products, while users have an easier time finding and downloading the type of tool they want.

ELG also allows developers to test their tools or services, which in turn makes them easier and faster to finalize. This is also aided by the communication that is made possible through the ELG. Language technology developers are able to learn from and collaborate with each other, which, among other things, opens the door to potential translations of existing tools into other European languages. Faster development of tools and communication within the language technology community will quickly create more available technologies and resources. The increased number and visibility of these resources will not only boost individual languages; in doing so, it will also strengthen the linguistic diversity that already exists in Europe.

Tools like the Welshify Widget make the online experience more inclusive for non-English speakers and help revitalize the language of a European culture. The ELG as the main hub for European language technology aims to provide the platform for projects like these to reach their full potential and work towards digital language equality.

How do ELE and ELG work together towards Digital Language Equality?

Europe’s diversity in terms of culture and communication sets it apart from other major players in the global field of Language Technology (LT) that usually concentrate on single languages. The number of European languages provides a unique opportunity to work together and to learn from each other in the process of developing digital language tools. In order to access this potential, it is crucial that every official, unofficial and minority language is equally represented in the digital world and LT landscape. This is one of the reasons why Digital Language Equality (DLE) is an important goal that needs to be actively worked towards. One of the main aspects of this work is handling the fragmentation of the European LT landscape that is still prevalent. The ELG project addresses this issue by building a platform that aims to host all European LT resources – the European Language Grid (ELG). Having one unified hub will support the LT community greatly. Developers will have an easier time getting the word out about a product, while consumers are more likely to hear of and be able to use it. Furthermore, a centralised platform will give LT creators a broader reach and encourage collaboration, communication and learning from one another.

While ELG is bringing the LT landscape together, it is also necessary to combat the existing language inequality actively and directly. This is where ELE comes in: over the course of the project, 70+ European languages are being researched and analysed to find out where exactly the inequalities lie. This effort spans Europe’s official languages as well as many unofficial and minority languages. By the end of the project, a strategic agenda and roadmap detailing the best approach to the existing discrepancies will be presented. This research will lay important groundwork for a long-term funding program that will be able to provide support based on the discovered disparities.

ELE and ELG are working in tandem to combat the digital inequality found among European languages. ELG is establishing a platform and marketplace to bring the LT community together. Meanwhile, ELE is dedicated to understanding which aspects need to be focused on to reach DLE by 2030 as well as establishing a funding program as a tool to ease the way.

The goal of DLE is only possible with the ELE research as a blueprint and the ELG platform as the facilitator of that vision. These combined efforts aim to create an environment where the barriers that currently fragment the European LT landscape fall and languages are able to flourish alongside and in interaction with each other.

Survey for LT developers and users: Shape the future of European Language Technology

Despite the recognizable advantages and historical and cultural worth of multilingualism, the many European languages face a striking imbalance in terms of their preservation in the digital world and their support by language technology. The European Language Equality project (ELE) addresses this risk to European identity in the digital age by preparing a Strategic Research and Innovation Agenda and Roadmap working towards digital language equality by 2030. The European Language Grid (ELG) is closely related to this project, offering LT developers, researchers and providers an inclusive platform to present, share and market their language technologies and connect within the European LT community.

As part of the two projects that are funded by the European Commission and address an appeal by the European Parliament resolution titled “Language equality in the digital age”, we are reaching out both to LT developers and LT users to participate in a large-scale, EU-wide consultation that will impact and shape the future of language technologies in the multilingual continent. The two surveys are aimed on the one hand at academic and commercial developers in the field of Language Technology (LT), Natural Language Processing (NLP) and Language-centric Artificial Intelligence (AI) and on the other at all Language Technology users and consumers.

The questionnaire takes approximately 20 minutes to fill in; your answers will help evaluate the level of LT support for European languages, indicating the challenges and highlighting the needs and expectations of professionals and users in the future. Your contributions will be carefully taken into account when preparing the ELE strategic agenda and roadmap.

The European Language Equality project is a pan-European effort that will significantly impact the field and funding situation of LT in Europe for the next 10 to 15 years. Help us shape the future of multilingualism in the digital age – join in!

  • Survey for Language Technology developers

  • Survey for Language Technology users and consumers

  • What is Digital Language Equality?

    Europe shines in its diversity, which is expressed in part in its multilingualism. Under EU law, all 24 official EU languages are equal. Unfortunately, in the digital age, that is not entirely the case, as there are notable discrepancies in the field of Language Technology (LT). Back in 2012, the META-NET White Paper series Europe’s Languages in the Digital Age showed that languages with more speakers had better support through Language Technology. For example, Spanish had fairly strong LT support, though not quite on the same level as English. Among the less widely spoken languages, Estonian was slightly better equipped, though Machine Translation in particular showed some gaps.

    Differences like these pose a challenge to preserving and nurturing Europe’s multilingualism. Considering the current LT landscape, every language has its own gaps and its own needs for the future. Therefore, it is necessary to address and support each language individually.

    To that end, the EU-funded project European Language Equality (ELE) is re-examining the LT support of the 31 languages covered in the META-NET White Papers ten years ago, alongside previously unevaluated ones. In total, ELE’s efforts span the 24 official and 32 additional EU languages as well as 33 endangered minority languages. Over the course of the project, the results of this research will form the basis of a strategic agenda and roadmap towards ELE’s main goal: Digital Language Equality (DLE) by 2030.

    DLE can come across as a vague term, so a specific definition is imperative in order to know what we are working towards. Our preliminary definition describes DLE as all relevant languages having the necessary support to “continue to exist and prosper as living languages in the digital age”.

    This necessary support involves two categories of factors, though they are not without overlap. First, there are technological factors. Some examples are tools and services (e.g., grammar checkers), corpora (e.g., audio transcripts) and projects or organizations active in the LT community. The second category involves contextual factors, which are essentially the political and social but also economic situation in the region where a language is spoken.

    In order for this definition to be useful when examining the current LT support of a language, these factors need to be accurately quantifiable. So far, no such score exists, which is why ELE is creating the “DLE metric”. As of now, the metric consists of a comprehensive list of the aforementioned factors that make up a language’s LT support. Aspects such as the scoring and weighting of the individual factors (including the introduction of potential penalties) will be worked out over the course of the project.

    Once complete, the metric will enable the direct comparison of the technology support of our languages. Because it is based on empirical data, it will allow current problem areas as well as future priorities to be identified. Additionally, the metric will enable us to track the development of the LT landscape for each individual language over time, creating a long-term overview.

    The ability to measure the level of LT support in a way that is precise and consistent across languages will form an important step towards our primary goal – establishing Digital Language Equality by 2030.
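
    To make the idea of scoring and weighting factors more concrete, here is a minimal sketch in Python of how such a metric could combine factor scores into one comparable number. It is an illustration only, not the actual DLE metric, whose scoring and weighting are still being worked out. The 0 to 1 scale, the weights, the penalty handling, the factor values and all names (Factor, dle_score, language_a, language_b) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    """One hypothetical factor contributing to a language's LT support."""
    name: str
    category: str        # "technological" or "contextual"
    score: float         # assumed to be normalised to the range 0.0-1.0
    weight: float = 1.0  # assumed relative importance of the factor


def dle_score(factors, penalty=0.0):
    """Weighted mean of the factor scores minus a penalty, clamped to [0, 1]."""
    total_weight = sum(f.weight for f in factors)
    if total_weight <= 0:
        raise ValueError("at least one factor with a positive weight is required")
    weighted_mean = sum(f.score * f.weight for f in factors) / total_weight
    return max(0.0, min(1.0, weighted_mean - penalty))


# Entirely made-up values for two unnamed languages.
language_a = [
    Factor("tools and services", "technological", 0.8, weight=2.0),
    Factor("corpora", "technological", 0.6, weight=2.0),
    Factor("political and social context", "contextual", 0.5),
]
language_b = [
    Factor("tools and services", "technological", 0.3, weight=2.0),
    Factor("corpora", "technological", 0.2, weight=2.0),
    Factor("political and social context", "contextual", 0.5),
]

for label, factors in (("A", language_a), ("B", language_b)):
    print(f"Language {label}: {dle_score(factors):.2f}")
```

    Because every language is scored with the same factors and weights, the resulting numbers can be compared directly, and recomputing them at regular intervals would give the kind of long-term overview described above.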