Newsletter #10 – March 2022
Dear reader,
In these dark and difficult times, we all think of ways to help. Internally, the ELG initiative has been discussing for the past two weeks how Language Technology can make a difference, especially with regard to supporting cross-lingual communication, which seems more important than ever. As a result, several new resources for Ukrainian, specifically Machine Translation models and one ASR model, have been added to the ELG platform, with more to follow. We are grateful to all colleagues involved for their work and contributions. The details can be found in the ELG subsection of the newsletter, followed by information on our two upcoming workshops, the first of which takes place on 16 March.
The 48 deliverables that the European Language Equality project consortium submitted two weeks ago are now available online, offering extensive insight into the state of Language Technology for Europe’s languages. This major effort by the ELE partners is the foundation of the strategic agenda that we will compile and prepare in the next couple of weeks. The deliverables are described in more detail in the ELE section below.
Important news also reached us from Spain: 1.1 billion Euros will be invested in a new Spanish language economy, including the co-official languages of the country. Who benefits from this investment project and what developments are foreseen are explained in our blog article linked further down. Right below, you can find a new profile of one of our ELE project partners, the Wikimedia chapter in Germany, and their important work towards free knowledge accessibility and language preservation.
With best regards,
Georg Rehm
Language Technology and NLP in the news
- “AI Weekly: Meta’s flashy, AI-powered vision of the metaverse brushes over concerns” – VentureBeat, 25 February 2022
- “Mental Health Vocal Biomarker Startup Kintsugi Raises $20M” – Voicebot.ai, 28 February 2022
- “Meta AI Introduces ‘No Language Left Behind’ Project: An AI Model To Support Machine Translation For Low-Resource Languages” – Marktechpost, 1 March 2022
- “Enterpret launches with $4.3M, NLP technology to decipher customer feedback” – TechCrunch, 2 March 2022
- “Europe Is in Danger of Using the Wrong Definition of AI” – Wired, 2 March 2022
- “Sanctuary Cognitive Systems Raises $58.5M to Embed ‘Human-Like Intelligence’ in Robot Workers” – Voicebot.ai, 3 March 2022
- “Startups competing with OpenAI's GPT-3 all need to solve the same problems” – The Register, 3 March 2022
- “What the Wordle trend can teach us about language and technology” – CU Boulder Today, 3 March 2022
- “Preparing IMDB Movie Review Data for NLP Experiments” – Visual Studio Magazine, 3 March 2022
- “Researchers from Tel Aviv Propose Long-Text NLP Benchmark Called SCROLLS” – Marktechpost, 3 March 2022
- “ELSA, the English language app that speaks for non-native users” – The Indian Express, 6 March 2022
- “DeepMind Trains AI Agents Capable of Robust Real-time Cultural Transmission Without Human Data” – Synced, 7 March 2022
- “Stemming, Lemmatization— Which One is Worth Going For?” – Towards Data Science, 7 March 2022
- “Natural language processing startup NeuralSpace receives £1.2 million investment” – Imperial College London, 7 March 2022
- “IWD: Gender diversity in broadcast technology: More needs to be done” – IBC, 8 March 2022
- “Deep Haiku: Teaching GPT-J to Compose with Syllable Patterns” – Towards Data Science, 8 March 2022
- “How to Build an AI Infrastructure to Support NLP” – CIO, 9 March 2022
- “DeepMind claims its AI can decipher ancient Greek texts from damaged artifacts” – VentureBeat, 9 March 2022
- “A smartphone app that helps deaf people communicate is getting upgrades” – CBC News, 10 March 2022
- “EU funded Language Technology Platform Adds Tools, Expands Language Coverage” – Slator, 14 March 2022
- When communication is more important than ever, we are especially happy about our large project network and their willingness to share resources, which is also appreciated by our ELT followers.
- The Ethnologue has updated its map of living languages, of which 23 account for half the world’s population. The other 7,128 dots on the map are also great sources of knowledge and inspiration.
- One week ago, on 8 March, International Women’s Day was celebrated globally and online. The European Commission used the opportunity to present seven extraordinary women from Ukraine.
In the last two weeks, the wider ELG initiative has been discussing how and where language technology can help at a time when communication, especially across language barriers, is more important than ever. As a first step, the University of Helsinki shared additional Machine Translation (MT) models for Ukrainian through the European Language Grid. The models cover translation to and from several European languages, including Polish, German, Dutch and Hungarian.
Last week, ELG project partner HENSOLDT Analytics contributed an Automatic Speech Recognition (ASR) service for Ukrainian, based on a model developed by the University of Edinburgh. In the meantime, colleagues have been working on MT models between Ukrainian and Czech (Charles University), Ukrainian and Latvian (Tilde) and Ukrainian and Greek (ILSP) – if you know of any other current developments regarding the Ukrainian language, please let us know.
Tomorrow, the second of three national ELG workshops in March takes place: The Language Council of Norway (Språkrådet) invites the Norwegian and European LT community to the event “Lovlig teknisk” on 16 March at 10am CET, taking place at the National Library in Oslo as well as online. The free seminar kicks off with an update by the Language Bank of Norway on the latest developments in Norwegian Language Technology, especially in the context of the Norwegian Language Act, which came into effect on 1 January 2022. The ELG workshop itself starts at 1pm with an introduction to the European Language Grid, followed by the results of the Norwegian Language Report that was prepared for the European Language Equality project.
The third event is lined up for next Thursday: On 24 March, the Research Institute for Artificial Intelligence “Mihai Drăgănescu” (ICIA) will host the first Romanian workshop in the context of the European Language Grid (ELG). The program and registration form can be found on the event page; participation is free of charge, as usual.
Finally, the month of March has brought several updates to the European Language Grid. Have a look at our GitLab page to see what’s new – and feel free to join ELG (if you haven’t yet) by clicking the button below.
Selected new tools and resources on the European Language Grid
OPUS-MT: West Slavic languages-Ukrainian machine translation – Multilingual machine translation using neural networks. This machine translation tool translates from Czech and Polish into Ukrainian. Packaged as a Docker container, the tool can be downloaded directly or tried out through an embedded service on the ELG website. It was added by the University of Helsinki on 7 March 2022.
Reykjavik University (RU) is an academic institution responsible for advanced education, research and scientific projects. The role of RU is to create and disseminate knowledge to improve competitiveness and quality of life for individuals and society, with morality, sustainability and responsibility as guiding principles. The university has contributed tools and services for Icelandic and Faroese to the ELG catalogue.
The 48 deliverables that the European Language Equality consortium submitted to the European Commission two weeks ago are now available online, including 32 reports that describe and analyse the technology support for 32 different European languages, five reports from LT developer initiatives, six reports from LT initiatives and four technical deep dives by European Language Technology companies. Each report presents findings, insights and conclusions on the state of language technologies in Europe and is well worth diving into. Combined, all these insights will form the groundwork for the strategic research, innovation and implementation agenda, which is the next milestone of the ELE project. For now, major thanks are due to all partners and colleagues involved for their excellent work!
The start of March brought good news from Spain: The Spanish government has approved a new “PERTE” project for a “New Language Economy”. A total of 1.1 billion Euros will be invested in language projects, including a new Spanish corpus, the development of Artificial Intelligence that “thinks” in Spanish and the teaching and learning of Spanish worldwide. Read more and find our blog article on the new investment project in the section below.
The Polish Association for Translators and Interpreters has set up a website and an Excel sheet for volunteer translators and interpreters who are willing to help wherever cross-language communication is needed. If you can provide translation services yourself or are in need of translation support, have a look at the Translators for Ukraine website.
On 1 March 2022, the Spanish Council of Ministers approved a new Strategic Project for Economic Recovery and Transformation (PERTE) titled “New Language Economy”: a 1.1 billion Euro investment plan to maximise the value of Spanish and the co-official languages of the country in the process of digital transformation. “We have helped develop the plan since 2014 and are happy that the approval recognizes the importance of language in the digital world, in AI, in culture, but also in terms of creating equality”, explains German Rigau of the Basque Center for Language Technology. Read more on the new, major investment plan in our latest blog article.
The ELE consortium – Partner presentation
Wikimedia Deutschland
Wikimedia is a global movement to promote free knowledge. Like Wikipedia, this movement grew through volunteer efforts to make the sum of all knowledge freely accessible. Every day, tens of thousands of volunteers around the world are working to improve Wikimedia projects. All of these projects are operated by the non-profit Wikimedia Foundation in San Francisco. Worldwide, 40 independent chapters support Wikimedia at the national level.
Wikimedia Deutschland – Gesellschaft zur Förderung Freien Wissens e. V. is a non-profit organisation based in Berlin, Germany, and, with more than 150 employees and about 90,000 members, represents the oldest and largest of those independent chapters.
Wikimedia Deutschland furthers the ideals of free knowledge even beyond the free encyclopaedia: It is our aim to establish and promote the creation, collection and distribution of free knowledge in all parts of society. The main focus of our activities is the support of volunteers, the development of technology and software and the cooperation with cultural and scientific institutions. Furthermore, we advocate, at both the national and the EU level, for a legal framework that allows free knowledge to become part of our everyday life.
The Wikimedia projects – and probably first and foremost the different language versions of Wikipedia, but also its 9 sister projects like Wikidata and Wikisource – are known as important tools and resources for language revitalization and preservation projects. Wikimedia communities around the world are dedicated to making content available in every language – especially indigenous, small and under-resourced languages – free and open for everyone.
However, creating and maintaining a Wikipedia can be hard. For communities with only a small number of speakers, maintaining their own language version of Wikipedia can be especially challenging and time-consuming. Nonetheless, the Wikimedia movement wants to make a difference for small, minority, regional, lesser used and under-resourced languages and provide opportunities for those language communities to contribute to and work with their languages in an online environment.
This is why we are very glad to be part of the ELE partner consortium: through it, we are able to make the voices of Wikimedia communities heard at a European level and bring their expertise in working with under-resourced languages in a digital environment to this project. To this end, we analysed the responses to a survey and consulted with members of the Wikimedia communities in Europe on several occasions. We were able to summarise the challenges they face when using language technologies for their languages and to document their visions for Digital Language Equality in our report for Task 2.2 “The Future Situation in 2030 users’ and consumers’ view”.
The next ELT newsletter will be sent out on 29 March 2022. Until then, follow our ELT social media accounts (as linked below) for the latest news!