Newsletter #5 – January 2022
Dear reader,

We wish you a happy 2022! Let us start off this new year with something everyone is currently looking for: good news. We can happily report on several achievements in our Language Technology projects, such as a new ELG tutorial video, an article on a successful pilot project and a number of ELE deliverables that we will soon be able to share. We also have an update regarding the schedule of our newsletter, which is currently received by almost 4,000 readers.

Starting with our first newsletter in 2022, we will send you a shorter but more regular update from the European Language Technology landscape, popping up in your mailbox every first and third Tuesday of the month. This new bi-weekly schedule of the newsletter allows for crisper and more up-to-date insights into everything happening in regard to Digital Language Equality and European LT developments.

We are also happy about the French Presidency of the EU Council in the first semester of 2022, promising some important work on topics relevant to our initiative. Scroll down to our ELE section for more on this!

We hope you have entered the new year well and look forward to more good news in the near future.

With best regards

Georg Rehm
 
Language Technology and NLP in the news
Social media highlights
  • The initiative for “Better Images of AI” has published its first gallery in order to provide alternatives to “a glowing brain and shining robot” – to much acclaim.
  • Look who made it into the highlights of the European AI Confederation “CLAIRE” and received a decorative card.

  • AI4Diversity presents an interesting Augmented Reality tool that can prevent both fun and frustration – a Rubik’s Cube solver.

General news

This news is from last year, but not as old as that may sound: In the middle of December, a new ELG tutorial video was released. The 14-minute introduction explains the basic functionalities for browsing, using and providing resources to the European Language Grid. Lean back, relax and learn how ELG works as an easy-to-access cloud platform for Language Technology tools and corpora for all European languages. 

All of Europe is also what Coreon has in mind: The company develops multilingual language technology and uses ELG to provide easy access to its tools. In our latest blog article, we present the collaboration and Coreon’s ELG pilot project – read on down below for more.
How to use the ELG – European Language Grid Tutorial
New ELG blog articles

The European Language Grid aims to overcome the language barriers existing in Europe. One of the ELG pilot projects providing access to multilingual language resources was created by the Berlin-based company Coreon. We spoke with Michael Wetzel, Managing Director at Coreon, about how their project works and what impact it could have. Find out how both Coreon and ELG can help overcome the fragmentation of the European LT community – in our latest blog post.

A screenshot of Coreon's word tree with the example of the word "fish".
Selected new tools and resources on the European Language Grid
TaPaCo: A Corpus of Sentential Paraphrases for 73 Languages – TaPaCo is a freely available paraphrase corpus for 73 languages extracted from the Tatoeba database. Tatoeba is a crowdsourcing project mainly geared towards language learners. Its aim is to provide example sentences and translations for particular linguistic constructions and words. The paraphrase corpus is created by populating a graph with Tatoeba sentences and equivalence links between sentences "meaning the same thing". This graph is then traversed to extract sets of paraphrases. Several language-independent filters and pruning steps are applied to remove uninteresting sentences. A manual evaluation performed on three languages shows that between half and three quarters of inferred paraphrases are correct and that most remaining ones are either correct but trivial, or near-paraphrases that neutralize a morphological distinction. The corpus contains a total of 1.9 million sentences, with between 200 and 250,000 sentences per language. The resources were automatically harvested from Zenodo.
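
For readers curious about the graph step described above, here is a minimal sketch of the general idea: treat sentences as nodes, treat each "means the same thing" link as an edge, and read off the connected groups as candidate paraphrase sets. This is an illustration only and not TaPaCo's actual code; the function name, the numeric sentence IDs and the example links are made up, and the filtering and pruning steps are left out.

```python
from collections import defaultdict


def paraphrase_sets(links):
    """Group sentence IDs into candidate paraphrase sets (connected components)."""
    # Build an undirected graph: each link says two sentences "mean the same thing".
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    result = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collects every sentence reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        # A group with a single sentence offers no paraphrase, so skip it.
        if len(component) > 1:
            result.append(sorted(component))
    return result


# Hypothetical sentence IDs and equivalence links, for illustration only.
example_links = [(1, 2), (2, 3), (10, 11)]
print(paraphrase_sets(example_links))  # [[1, 2, 3], [10, 11]]
```

Sentences 1 and 3 end up in the same set even though no link connects them directly, which is the point of traversing the graph rather than reading the links pairwise.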

Selected new ELG members

phase-6 GmbH – phase6 is an Edutech company focused on vocabulary training, a vital aspect of successful language learning. Its main audience is pupils within the German school system. Students can study vocabulary independently and in line with the content of their school lessons. The value proposition of the phase6 vocabulary trainer is the cooperation with schoolbook publishers, which allows users to easily import the relevant vocabulary from their school books without additional copying or typing.

General news

As of Saturday and with the beginning of the new year, France holds the Presidency of the EU Council. A challenging time overall, but some of the focus topics selected for the half-year presidency point in promising directions. Namely, the promotion of digitalisation, the combating of inequalities and the renewal of the EU’s “humanist vocation” inspire the hope that Digital Language Equality by 2030 remains a realistic vision for Europe.

The ELE project supports this hope with the full specification of the DLE concept, which will be published soon and made available on our website. As the last third of the project runtime has kicked off, you can expect a number of other deliverables within the next months – so keep your eyes open for language reports, strategic agendas and deep dives into technologies and data!
New ELE blog articles

It has been mentioned many times, but maybe you are unsure about what Digital Language Equality is supposed to mean exactly. While waiting for the full specification, revisit our definition of DLE based on our project research so far – in this timeless blog article from June 2021.

The ELE consortium: Partner presentation

Charles University

The Institute of Formal and Applied Linguistics (ÚFAL) is a research and teaching department in the School of Computer Science at the Faculty of Mathematics and Physics, Charles University (Univerzita Karlova, CUNI) in Prague. It was founded in 1990, continuing work formerly conducted by the Laboratory of Algebraic Linguistics since the early 1960s, first at the Faculty of Philosophy and later at the Faculty of Mathematics and Physics, which houses the institute to this day.

Its research encompasses many topics in the areas of Computational Linguistics, Language Technologies and Natural Language Processing, with a multitude of national and international projects, combining knowledge from linguistics, mathematics, general computer science, statistical modelling, machine and deep learning and software engineering.

As a teaching department, it offers Czech and English programs for both a doctorate (Ph.D.) and a Master's degree (Mgr. or MSc.) in Computational Linguistics and Machine Learning. It participates in the double-degree Erasmus Mundus European Masters Program in Language and Communication Technologies (LCT) and also offers Bachelor-level courses for the Artificial Intelligence specialization of the Informatics (Computer Science) program.

From 2000 to 2011, the Institute served as the coordinator of the Center for Computational Linguistics. Since 2010, the Institute has been the main host of the LINDAT/CLARIAH-CZ Research Infrastructure as part of the CLARIN and DARIAH EU networks.

Prof. Dr. Jan Hajič, Project Leader of European Language Equality at CUNI: “Our involvement in the ELE project is to coordinate the collection of evidence from various stakeholders, from research to industry, to support the Strategic Research and Innovation Agenda in Language Technology for the next decade. We believe that it will enable the continuation of excellent research in the area of Language Technology in Europe, and thanks to that, in turn, the ubiquitous use of Language Technology in everyday life for everyone.”

Next edition

The next ELT newsletter will be sent out on 18 January 2022. Until then, follow our ELT social media accounts (as linked below) for the latest news! 


Want to learn more? Visit https://european-language-technology.eu 
or contact us directly.
Website
YouTube
Twitter
LinkedIn
Copyright © 2021 ELE and ELG Consortium, All rights reserved.
The European Language Grid is an initiative funded by the European Union’s Horizon 2020 programme under grant agreement № 825627 (ELG).
The European Language Equality Project has received funding from the European Union under the grant agreement № LC-01641480 – 101018166 (ELE)