Newsletter #22 – December 2022

Dear reader,

We’re closing out 2022 with plenty of big news!

First, we are happy to announce the publication of the Strategic Research and Implementation Agenda for achieving full digital language equality in Europe by 2030 (SRIA), developed by the ELE project. From this month on, we will present recommendations from the SRIA in our section “From the SRIA”. This edition starts with an introduction to the scientific goal of reaching Deep Natural Language Understanding by 2030.

On 8 November, the STOA workshop “Towards full digital language equality in a multilingual European Union” took place, organised by ELE together with the European Parliament. Did you miss the event? Have a look at the interview with Prof. Andy Way or watch a recording of the livestream of the workshop! All links and media to recap the workshop are also listed separately in the ELE section. 

CORDIS, the European Commission’s primary source of information from EU research projects, published an article about ELG that’s available in six different languages. 

Finally, there is also very good news regarding the European Language Grid. In the ELG section, you can read all about the new ELG book, which was published in early November. In addition to the usual hard copy, the book is also available as a free digital download (Open Access).

With best regards

Georg Rehm

Language Technology and NLP in the news
Social media highlights
General news

The EU project European Language Grid ran for three and a half years and was completed successfully in June 2022. In November 2022, the book European Language Grid – A Language Technology Platform for Multilingual Europe was published by Springer in the series “Cognitive Technologies”. The volume was edited by project coordinator Georg Rehm (German Research Center for Artificial Intelligence). The book documents the evolution and the results of the ELG project, and it describes the architecture and implementation of the ELG LT and NLP cloud platform. Moreover, the reader learns more about the constantly growing ELG community, consisting of hundreds of industrial and academic stakeholders all over Europe as well as about the planned future of the ELG platform after the end of the project runtime. The book is Open Access, i.e., it can be downloaded free of charge. Hard copies are available for purchase. Read the full press release here.

Selected new tools and resources on the European Language Grid
Bangor University's Experimental Bilingual Part-of-Speech Tagger for English and Welsh is an experimental tagger that can tag parts of speech in Welsh and English texts within the Python library spaCy.

The impetus for this experiment was that bilingual models could be very useful in the Welsh context, as the two languages often appear in the same documents in Wales (especially as names, titles and quotations).

This model was trained by combining English Universal Dependencies data from the English Web Treebank (EWT) with Welsh data from [Corpws Cystrawenol y Gymraeg](https://universaldependencies.org/treebanks/cy_ccg/index.html) (CCG).

953 Welsh training sentences from the CCG and 614 Welsh testing sentences were used as development sentences. For the English element, 12,543 training sentences and 2,007 test sentences from the EWT corpus were used as development sentences.

The tagging accuracy reported following the training process was 92.7% on the test set.
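For readers unfamiliar with the metric, a figure like this is usually a token-level accuracy: the share of tokens whose predicted tag matches the gold-standard tag. The following is a minimal sketch under that assumption, not the evaluation code used for this model. The function name `tagging_accuracy` and the example tags are hypothetical.

```python
def tagging_accuracy(gold_tags, predicted_tags):
    """Return the share of tokens whose predicted tag matches the gold tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy on an empty sequence")
    # Count positions where the predicted tag equals the gold tag.
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)


# Hypothetical tags for a five-token sentence (illustrative, not model output).
gold = ["NOUN", "VERB", "DET", "NOUN", "PROPN"]
predicted = ["NOUN", "VERB", "DET", "ADJ", "PROPN"]

accuracy = tagging_accuracy(gold, predicted)
print(f"{accuracy:.1%}")  # prints "80.0%" (4 of 5 tags match)
```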

General news

On 8 November, the STOA workshop “Towards full digital language equality in a multilingual European Union” took place in Brussels. It was the third STOA workshop to feature the topic of Language Technology, after events in 2013 and 2017. Chaired by Jordi Solé (MEP), it brought together over ten speakers from different parts of Europe for keynotes and a final panel discussion. The central topics were the goal of digital language equality, avoiding the digital extinction that at least 21 European languages currently face, and the importance of protecting multilingualism. In that context, the ELE project and, more specifically, the SRIA were also discussed. “If the EU does not address the matter properly, the worst-case scenario is that some or maybe even most of these languages will eventually suffer from digital […] extinction,” concluded Prof. Andy Way. In his presentation, given remotely from Berlin, Prof. Georg Rehm summarised the recommendations specified in the SRIA and gave a short overview of the roadmap developed by the ELE project.

In other news, our Open Call for SRIA Contribution Projects has ended! A total of 37 proposals were submitted. We want to thank everyone for their contributions and will follow up with more news soon.

STOA News and Media
From the SRIA
Deep Natural Language Understanding by 2030 in Europe is the scientific goal of our Strategic Research and Innovation Agenda. We envision systems that can accurately and seamlessly integrate modalities, situational and linguistic context, general knowledge, reasoning, emotion, irony, sarcasm, humour, metaphors, and culture. They should be able to explain themselves on request and do everything reliably as required, on the fly, and at scale across domains for the many languages of Europe and beyond.

Even for a language as universally represented as English, much research and implementation work is still needed before machines achieve a genuine understanding of content or documents. For the many less well represented and less well supported languages, even more work lies ahead, given that they have to keep up with English. We envision systems that work in partnership with humans and become "everlearning" systems through a cycle of training data collection, active learning and interactive feedback. These systems would make it possible to hold a meaningful conversation with a computer, independent of the language spoken by the user.
 
Upcoming Events

If you have an event that you think the European language technology community should know about, get in touch with us to have it featured in this newsletter.

Next edition

The next ELT newsletter will be sent out on 3 January 2023. Until then, follow our ELT social media accounts (as linked below) for the latest news! Happy holidays, and here's to more exciting news in 2023!


Want to learn more? Visit https://european-language-technology.eu 
or contact us directly.
Website
YouTube
Twitter
LinkedIn
Copyright © 2022 ELE and ELG Consortium, All rights reserved.
The European Language Grid is an initiative funded by the European Union’s Horizon 2020 programme under grant agreement № 825627 (ELG).
The European Language Equality Project has received funding from the European Union under the grant agreement № LC-01641480 – 101018166 (ELE).