Newsletter #9 – March 2022
Dear reader,
Yesterday, the ELE consortium submitted the better part of its project deliverables to the European Commission: With February ending, more than 40 project reports were handed over and will soon be published on our website. For details on the titles of the deliverables and their authors, head down to the ELE section of our newsletter – which is celebrating its six-month anniversary today!
After the major update of the ELG database in January, the platform continues to grow: The latest addition comes from Estonia, a country with a unique language whose support through LT services and corpora makes us especially happy. In the ELG section below, you can find more information on this update as well as a blog article on the January import, which was made possible thanks to the research efforts of the ELE project.
Finally, this newsletter brings you our interview with former MEP Jill Evans, whose European Parliament report on language equality in the digital age laid the groundwork for the ELE project and who believes that “the development of new technologies gives us an unprecedented opportunity to ensure true equality between languages”, as well as a profile of our ELE consortium partner CLARIN.
With best regards
Georg Rehm
|
|
Language Technology and NLP in the news
- “Symbolic AI: The key to the thinking machine” – VentureBeat, 11 February 2022
- “Why Eye-Tracking Matters in Machine Translation Post-Editing” – Slator, 11 February 2022
- “Report: 29% of execs have observed AI bias in voice technologies” – VentureBeat, 14 February 2022
- “LIST Represents Luxembourgish in European Language Equality Project” – Chronicle.lu, 14 February 2022
- “A.I. startups are reeling in hundreds of millions to help computers better understand humans” – VentureBeat, 16 February 2022
- “The EU’s AI rules will likely take over a year to be agreed” – AI News, 17 February 2022
- “Microsoft adds Upper Sorbian to Bing Translator” (German) – Microsoft, 21 February 2022
- “H2O.ai brings AI grandmaster-powered NLP to the enterprise” – ZDNet, 21 February 2022
- “Meta's Zuckerberg unveils AI projects aimed at building metaverse future” – Reuters, 23 February 2022
- “AI chatbot provides CBT in £1m NHS mental health trial” – Med-Tech Innovation, 23 February 2022
- “Machine learning's carbon footprint will shrink, Google claims” – Data Centre Dynamics, 24 February 2022
|
|
- One week ago, on 22 February, the Global Coalition for Language Rights hosted the Global Language Advocacy Day, prompting many language enthusiasts to share contributions, messages and opinions under the hashtag #GLAD22 on LinkedIn.
- Fittingly, International Mother Language Day was celebrated the day before. Are programming languages considered mother languages? If so, then Europe counts some important natives in the digital world.
|
|
In the third month of the year, we have three national ELG workshops lined up: This March, our partner institutions in Serbia, Norway and Romania will host dissemination events for their respective national language technology communities. The first event, titled “European Language Grid and Serbian Language Technologies”, is hosted by the Faculty of Philology of the University of Belgrade and will take place on 11 March at 10am CET.
As usual, the ELG workshop in Serbia opens with an introduction to ELG and its basic functionalities, held in English, followed by a use case presentation about unsupervised sentiment analysis of customer complaints by Serbian company Telenor. Registration for the online event is free via this Google form; you can also follow the workshop via a YouTube live stream linked on the event website. For information about the upcoming workshops in Norway (16 March) and Romania (24 March), feel free to follow our social media channels on Twitter and LinkedIn and stay tuned for the next newsletter!
In January, we were able to report a large jump in the number of available resources on the European Language Grid: The research efforts by the 52 partners of the European Language Equality project more than doubled the database – a success that we described in more detail in our latest blog article (see below). However, the expansion of ELG is a continuous process with new Language Technology services, corpora and tools being added each week.
The latest addition arrived in the form of several resources for Estonian, such as a morphological analyzer. The resources are provided by the University of Tartu and the company Filosoft and are especially appreciated both as an expansion of the ELG platform and as a contribution to Digital Language Equality, considering the relatively small number of speakers of Estonian and the uniqueness of the language.
|
|
Over the course of a weekend in the middle of January 2022, the European Language Grid (ELG) doubled in size. More than 6,000 new data resources, tools and services for 87 different languages were added to the ELG platform, pushing the ELG closer to one of its central objectives: developing into a joint European language technology platform in which ideally all relevant language resources and technologies are registered. How did that happen? Learn all about the research efforts of the European Language Equality project and how the results contributed to the ELG database in our latest blog article.
|
|
Selected new tools and resources on the European Language Grid
Estonian TTS Preprocessor – A Docker container based on a preprocessing script for Estonian text-to-speech applications, with an interface compliant with ELG requirements. It is therefore embedded in the ELG catalogue and can be tried out on the web page or downloaded directly. This tool was added by the University of Tartu on 21 February 2022.
|
|
Yesterday, before the end of the shortest month of the year, the European Language Equality project consortium submitted more than 40 deliverables to the European Commission: With over 30 language reports by the national partner institutions, 5 reports by European LT developer associations and 3 deep dives by our industry partners, the bulk of the ELE deliverables has been finalised and handed over. The reports will make up an important part of the strategic research agenda and the roadmap towards Digital Language Equality, whose preliminary draft is expected by the end of March. All deliverables will be available on our website soon.
“I believe that providing support for languages online has enormous potential in terms of accessibility to tools and resources, but also in achieving and confirming equal status for all languages, which is a very important principle. In an increasingly globalised world, that means that all languages, whatever the number of speakers and whether official or unofficial, can take their place side by side.” Our interview with former Member of the European Parliament Jill Evans is online – have a look below to learn more about her views on digital language equality and the specific case of Welsh.
|
|
Jill Evans has done crucial work to further the goal of European language equality, most notably writing a report that paved the way for the creation of the ELE project. She advocated for equal representation of the Welsh language throughout her 21 years as a member of the European Parliament, successfully raising it to co-official status and using it in a parliamentary debate for the first time. We interviewed Ms Evans about developments surrounding the Welsh language and her experience campaigning for more balanced representation of European languages, especially in the digital realm. Find out how she believes new technologies can create “an unprecedented opportunity to ensure true equality between languages” – in our latest interview on the ELE blog!
|
|
The ELE consortium – Partner presentation
CLARIN
CLARIN stands for 'Common Language Resources and Technology Infrastructure'. It is the European research infrastructure for language as social and cultural data. In 2012, CLARIN was established as an ERIC and took up the mission to make digital language resources available to scholars, researchers, students, and citizen-scientists from all disciplines, especially in the social sciences and humanities (SSH), through single sign-on access.
CLARIN offers long-term solutions and technology services for deploying, connecting, analysing and sustaining digital language data and tools, and supports scholars who want to engage in cutting-edge, data-driven research. The access to language data and tools provided by the CLARIN infrastructure is organised through the model of service federation based on a distributed network of centres.
At the level of national consortia and centres, there are strong connections and alignment with DARIAH-EU and ELG initiatives, while participation in Horizon2020 and Horizon Europe projects supports the link to broader cluster structures, such as The Social Science and Humanities Open Cloud (SSHOC). In the wider landscape of disciplinary (or thematic) infrastructures, the CLARIN service offer is aligned with the ambition of the European Open Science Cloud (EOSC) to widen the accessibility and reusability of research data, as well as the wider Open Science agenda through a strong focus on making data FAIR.
In the ELE project, CLARIN is in charge of the D2.3 report, which summarises the needs and visions envisaged for achieving digital language equality (DLE) by 2030. The report is based on answers given by survey respondents and interviewees from the CLARIN community, which identified a number of challenges that need to be dealt with in order to reach DLE by 2030:
- The scientific development of language technologies for languages of different sizes should be aligned with standardisation initiatives. To achieve this goal, research infrastructures such as CLARIN are already working on building the EOSC in order to ensure and support smooth sharing of data, tools and services.
- More legal and administrative support for the field is a prerequisite for DLE. On the one hand, researchers need to have clear guidelines as to the application of the GDPR in their domain, not only on the national level, but also in the context of international collaborations within Europe and beyond. On the other hand, the investment in language technology (LT) should include appreciation and reward for publications in local languages, for seeking solutions to the problems of smaller linguistic communities, and for ensuring the reproducibility and trustworthiness of research workflows and outcomes.
- Human resources and attention to human experts are extremely important. The high educational standards currently achieved should be further strengthened and early-career researchers should be encouraged to get involved with use cases and the plethora of available tools.
In the context of CLARIN's contribution to the work towards DLE, it can be stated that CLARIN's strategy is already aligned in various ways with the identified challenges. However, it is vital that the ongoing activities within the CLARIN consortium are supported by large-scale funding for LT development at the European level, as outlined in the ELE programme.
|
|
The next ELT newsletter will be sent out on 15 March 2022. Until then, follow our ELT social media accounts (as linked below) for the latest news!