Newsletter #16 – June 2022

Dear reader,

Almost two weeks have passed since META-FORUM 2022, and we still look back fondly at our two-day conference in Brussels: Overall, nearly 400 participants attended on-site as well as online, and we are especially happy that the hybrid format with several live streams and discussions between guests on stage and on the screen worked seamlessly.

As you can expect, this newsletter is – for the most part – dedicated to a recap of the META-FORUM 2022 highlights, with details on each conference day in the respective project section. These highlights include the ELG Platform Release 3, which has been live for the past two weeks. You can browse it yourself or have a look at the introductory session held at META-FORUM 2022 (video link below).

In the ELE section, we present a language report that reached us from the Nordic countries, which we are especially happy about – much like the profile of ELE partner Språkrådet, the Language Council of Norway. We hope you enjoy our newsletter and wish you a good week.

With best regards

Georg Rehm

Language Technology and NLP in the news
Social media highlights
  • Apart from watching the entire META-FORUM 2022 conference on YouTube (all 14.5 hours of it…), the most thorough summary of the conference can certainly be found in the Twitter threads by @translation_eu.

  • Test the capabilities of LinkedIn’s automatic translation tool by reading this informative post by the University of Vigo, written in Galician and translated into your language.

  • A Tweet with a very simple question that has been discussed by linguists since the dawn of the computer – how is this word pronounced?

  • Google’s LaMDA has been in the news a lot lately. In its Science Weekly podcast, The Guardian dives into the myths surrounding it and what other areas of NLP are important – summed up by Maria Aretoulaki in this LinkedIn post.
General news

“We’re seeing a great model for what I would like to see in 2030 in the language community. If you ask me, it’s probably the best integrated, the best collaborating AI community in Europe in terms of infrastructure, in terms of tools that are being created together, that are put together, that are working together – we haven’t really seen this in any other community to the degree that I’m seeing it here.” 

Summarising the input from the first day at META-FORUM 2022, Philipp Slusallek, co-initiator of the CLAIRE network, praises the achievements of the European Language Grid project, as presented at the hybrid conference on 8 June in Brussels. After an opening keynote by Jonas Andrulis, founder and CEO of German AI company Aleph Alpha, the ELG project team took over to present the current state of the platform and its services and give an overview, demonstration and tutorial of its third and final release, which has been live for the past three weeks.

In addition to the ELG results, day 1 of META-FORUM 2022 also offered many insights from the European LT industry, with a dedicated session on the ELG pilot projects, which can be experienced in our virtual expo, as well as a panel discussion with representatives from six different LT and AI companies, hosted by Gerhard Backfried from HENSOLDT Analytics. The first conference day was closed by Philippe Gelin, presenting the EU Commission’s plan for the Language Data Space, and Georg Rehm (ELG Coordinator), who introduced the future of the ELG platform as a legal entity.

If you missed META-FORUM 2022, we have good news: While the recordings of all conference sessions can now be revisited on our ELG YouTube channel, the slides and scripts for each keynote, presentation and contribution are linked in the conference programme on our website. We would like to thank everyone who participated in META-FORUM 2022 and made the return to Brussels such a pleasure and success!
META-FORUM 2022 - Day 1, Session 1: Opening - European Language Grid
Selected new tools and resources on the European Language Grid
CorpoGrabber – The Toolchain to Automatic Acquiring and Extraction of the Website Content. CorpoGrabber is a pipeline of tools that retrieves the most relevant content of a website, including all subsites (up to a user-defined depth). The toolchain can be used to build large Web corpora of text documents. It requires only a list of root websites as input. The tools composing CorpoGrabber are adapted to Polish, but most subtasks are language-independent. The whole process can be run in parallel on a single machine. The result is a corpus consisting of a set of tagged documents for each website. The tool was produced by the Wroclaw University of Technology and harvested from CLARIN-PL on 12 June 2022.
General news

“I don’t think we ever had as big, as wide a project, with over 50 participants – the breadth and the depth of all the many deliverables is really fantastic and we are very grateful for this plan to build towards Digital Language Equality.” Encompassed in the opening words by June Lowery-Kingston (Head of Unit G3, DG CONNECT, EU Commission) on day 2 of META-FORUM 2022 is the fact that one conference day alone could never have been enough to share all the results from the European Language Equality project.

Still, after a second keynote by François Alfonsi, Member of the European Parliament, the consortium went on to present its key insights from the last year – namely the representation of language in national AI strategies in Europe, the definition of Digital Language Equality and the dashboard visualising the data of the DLE metric. The following session focussed on the findings from the 33 reports on the state of technological support for European languages, the cross-language comparison that these findings allowed as well as the presentation of five selected reports – each introduced with a greeting in their respective language by the speakers.

After the lunch break, session 9 discussed the demands, needs and gaps of the European LT community, both in terms of LT developers, represented through various interviews and by the CLARIN and CLAIRE networks, and LT users and European citizens, who participated in the large-scale survey on LT support for EU languages conducted in the project. The session closed with the presentation of several technological deep dives into subjects such as Machine Translation, Speech Technologies, Text Analytics and Data and Knowledge. As with day 1, all sessions on European Language Equality held on day 2 can be found in our META-FORUM 2022 YouTube playlist.

While the start of ELE 2 is only a week and a half away, we still have great results from the first project to share: Our colleagues in Denmark, Sweden, Norway and Greenland took it upon themselves to write a language report on the Nordic minority languages. This report, which is not even a contractual deliverable – much like the one on Frisian, which will be published soon – gives many insights into the state of these European languages and also contributes to the emerging plan for Digital Language Equality. How this plan continues in the follow-up project ELE 2 will be presented in our next newsletter.
Upcoming events
The ELE consortium: Partner presentation
The Language Council of Norway

The Language Council of Norway works to ensure that Norwegian remains a good and well-functioning language at all levels of society. The council collaborates with public and private players on language policy measures, including the use of plain language, terminology, the use of the language variants Bokmål and Nynorsk, as well as Norwegian language technology. In the digital age, this entails that language technology must work for both varieties of Norwegian, and speech technology must be able to handle dialectal variation. Therefore, the Language Council of Norway cooperates with the National Library to collect and create relevant language resources for Norwegian language technology. The resources are freely available and can be downloaded from the Norwegian Language Bank at the National Library.

Kristine Eide, senior adviser for ICT, states: “The ELE report on Norwegian language technology, which we have written together with the National Library, has been an opportunity to connect with the rest of Europe. We see that many of the recommendations that are included in the other ELE reports are similar to ours, and we think common European funding, policies and practices can help smaller and medium-sized languages to better language technology.”

As the national consultative body on language issues in Norway, the Language Council also promotes Norwegian sign language and the national minority languages Kven, Romani and Romanes, in cooperation with the language user groups. While the ELE reports have primarily focused on official EU languages, we were also given the opportunity to write an additional report on the status of smaller languages, minority languages and indigenous languages in the Nordic region. We have cooperated with the Language Councils of Greenland, Sweden and Denmark as well as Giellatekno, which specialises in language technology for Saami and other morphologically rich languages. The report identifies the problems that smaller languages face, and not only because of their general lack of resources for language technology. In particular, the lack of access to international digital platforms is a problem that cannot be solved locally but must be raised to a European level.
Next edition

The next ELT newsletter will be sent out on 5 July 2022. Until then, follow our ELT social media accounts (as linked below) for the latest news! 


Want to learn more? Visit https://european-language-technology.eu 
or contact us directly.
Website
YouTube
Twitter
LinkedIn
Copyright © 2022 ELE and ELG Consortium, All rights reserved.
The European Language Grid is an initiative funded by the European Union’s Horizon 2020 programme under grant agreement № 825627 (ELG).
The European Language Equality Project has received funding from the European Union under the grant agreement № LC-01641480 – 101018166 (ELE).