Newsletter #28 – June 2023
Dear reader,

With META-FORUM 2023 just around the corner, we’re excited to meet you in person on 27 June in Brussels to conclude our European Language Equality (ELE) project and to publicly unveil the ELE book. If all goes according to plan, every conference participant will receive a copy of the book.

In this month’s newsletter, we’re providing insights into some of the conference topics. If you haven’t already done so, you can check out the full programme on the ELE website. If you’d like to participate, you can register free of charge.  

As our monthly ELG resource, we’re featuring the NGT-Dutch Hotel Review Corpus that provides a parallel corpus of Dutch and Sign Language of the Netherlands. 

In the section “From the SRIA”, we’re taking a look at the vision and recommendations for Data and Knowledge.

If you want to stay up to date with the latest developments around generative AI technology in Europe and beyond, also take a look at our curated press review section.

With best regards


Georg Rehm
 
Subscribe to the Common European Language Data Space (LDS) Newsletter

The European Language Data Space initiative, which started in January 2023, recently launched its monthly newsletter, providing information on the latest developments in secure, privacy-preserving language data sharing and use across Europe.

We’d like to invite you to subscribe to the newsletter for updates on LDS implementation, success stories, events, and more!


 
Language Technology and NLP in the news
Social media highlights
Selected new tools and resources on the European Language Grid
NGT-Dutch Hotel Review Corpus – This month’s selected resource is making its second appearance in our newsletter. A few months back, we introduced it as one of the selected FSTP projects. Now that the project is finished, the results are available on the ELG website as a parallel corpus of hotel reviews in written English, written Dutch, and Sign Language of the Netherlands (NGT) videos.
 
META-FORUM 2023

META-FORUM 2023 will take place on 27 June in Brussels, Belgium. We will present the final results of the European Language Equality (ELE) project and discuss all kinds of topics touching upon language technologies, language resources, language-centric AI and especially digital language equality. We will talk about the future of the sector and also present the new ELE Book. You can register for free here.

The final reports of the finished FSTP projects will be presented in Session 3 of META-FORUM 2023. After an initial overview of the pilot projects, each representative will present their project’s results individually.

Session 4 will focus on European Large Language Models and will feature several speakers.
Jussi Karlgren from Silo.AI (Sweden) will share insights on industrial language models for a multilingual Europe. Pedro Ortiz (DFKI, Germany) will talk about the development of multilingual large language models. The presentation by Barry Haddow (University of Edinburgh, UK) will focus on the EU project High Performance Language Technologies (HPLT). Michael Granitzer (University of Passau, Germany) will discuss European web crawls and large language models in the context of the OpenWebSearch.EU project, which aims to promote Europe's independence in web search and create an open and human-centred search engine market. The project seeks to develop a European Open Web Index (OWI) and an open Web Search and Analysis Infrastructure (OWSAI) based on European values, principles, legislation, ethics, and standards. The session will conclude with a question and answer segment, giving participants the opportunity to discuss the topics presented.

To have a look at the full programme, featured topics, and speakers for all the sessions, check out the META-FORUM 2023 section on the ELE website.
 

From the SRIA
Research Topic: Data and Knowledge

The availability of suitable language data is crucial for training and evaluating advanced Language Technology tools, especially in deep-learning paradigms where the size of the training dataset directly affects tool quality. However, the current lack of parity in language resources contributes to digital language inequalities, which vary across the EU due to factors like the number of speakers, commercial interest, and data accessibility restrictions. Untapped potential exists in quality language data held by the EU public sector, particularly in domains such as medicine, health, pharmaceuticals, law, finance, insurance, science, manufacturing, and publishing. The scarcity of data, along with the need for annotated and labelled data, poses challenges and costs for both the research and industry communities. Research is needed to develop faster, cheaper, and more reliable methods for generating multilingual datasets. Furthermore, efforts like the FAIR data principles (Findability, Accessibility, Interoperability, and Reuse of digital assets) and the EU's Data Spaces initiative aim to address data availability issues by making digital assets easier to find, access, combine, and reuse.

You can read more about all SRIA recommendations here or take a look at the full document.
 
If you would like to voice your support for the ELE Programme and its goal and vision of achieving digital language equality in Europe by 2030, please consider filling out the endorsement form by clicking the button below and becoming a listed supporter on the ELE website:
Click here to endorse the ELE SRIA
Upcoming Events

If you have an event that you think the European language technology community should know about, get in touch with us to have it featured in this newsletter.
 

Next edition

The next ELT newsletter will be sent out on 4 July 2023. Until then, follow our ELT social media accounts (as linked below) for the latest news!


Want to learn more? Visit https://european-language-technology.eu 
or contact us directly.
Website
YouTube
Twitter
LinkedIn
Copyright © 2022 ELE and ELG Consortium, All rights reserved.
The European Language Grid is an initiative funded by the European Union’s Horizon 2020 programme under grant agreement № 825627 (ELG).
The European Language Equality Project has received funding from the European Union under grant agreement № LC-01641480 – 101018166 (ELE).