Appen Leads Industry in Creating AI That Works for Everyone
29 April 2021 - 10:00PM
Business Wire
Appen’s range of AI projects and diverse global
contractor network ensure unbiased AI data for fair and equitable
AI projects
Appen Limited (ASX:APX), the leading provider of high-quality
training data for organizations that build effective AI systems at
scale, is enabling organizations to launch, update and operate
unbiased AI models through a range of projects and partnerships.
With support from the company’s global crowd of data annotation
specialists that’s more than a million strong, Appen has developed
diverse training datasets for AI models, particularly for natural
language processing (NLP) initiatives, to ensure end users receive
the same experience, no matter their language variety, dialect,
ethnolect, accent, race or gender.
AI projects based on biased or incomplete data don’t work for
everyone. According to a report published in March 2020 in PNAS
(Proceedings of the National Academy of Sciences), popular
automated speech recognition (ASR) systems, which are used for
virtual assistants, closed captioning, hands-free computing and
much more, exhibit significant racial disparities in performance.
The report concludes that more diverse training datasets are needed
to reduce these performance differences and ensure speech
recognition technology is inclusive. Language interpretation and
NLP systems suffer from the same challenge and require the same
solution.
“The quality and diversity of training data directly impacts the
performance and bias present in AI models,” said Appen CEO Mark
Brayan. “As a data partner, we can supply complete training data
for many use cases to ensure AI models work for everyone. It’s
critical that we engage a diverse group of individuals to produce,
label, and validate the data to ensure the model being trained is
not only equitable, but also built responsibly.”
Range of Appen Language Projects
Appen demonstrates its commitment to creating AI for everyone
through a variety of projects and partnerships focused on the
diversity of languages and dialects.
- Translators without Borders (TWB) partnership –
Appen, in partnership with TWB, Amazon, Carnegie Mellon University,
Facebook, Google, Johns Hopkins University, Microsoft, and
Translated, joined the Translation Initiative for COVID-19
(TICO-19), which supported the development of language technology
to make COVID-19 information available in as many languages as
possible, including languages spoken in developing countries, such
as Congolese Swahili, Tigrinya, and Nigerian Fulfulde.
- The Inuktitut translation project – In collaboration
with the Government of Nunavut, Microsoft added Inuktitut, an
Indigenous language in North America spoken in the Canadian Arctic,
to Microsoft Translator, using Appen services.
- The Canadian French translation project – Appen
coordinated with native language consultants to help Microsoft add
“Canadian French” as a language option in Microsoft
Translator.
- African American Vernacular English (AAVE) off-the-shelf
datasets – Most existing training datasets used in ASR, search
engines, voice assistants and sentiment analysis are not
representative of AAVE. To make high-quality AAVE data available,
Appen is working with AAVE speakers among its crowd of annotators
to collect data for an off-the-shelf (OTS) dataset based on
conversations about a broad range of topics.
“Biased AI data leads to projects that can fail to deliver the
expected business results and harm individuals they are supposed to
benefit,” said Dr. Judith Bishop, Senior Director of AI Specialists
at Appen. “The scale and complexity of AI projects makes it
impossible for most companies to acquire sufficient unbiased
high-quality data without partnering with an AI data expert.
Appen’s commitment to developing the most diverse and expert crowd
of data annotators provides the industry with a clearly
differentiated resource for building fair and ethical AI
projects.”
Appen’s Leading Approach to Diversity
Appen relies on training data annotators from over 170
countries. Language representation includes 235 unique languages
and 395 dialects. Over the years, the Appen crowd of annotators has
included over 30,000 fluent trilingual speakers – a true testament
to diversity and expertise.
Appen also offers off-the-shelf (OTS) datasets designed to make
it easier and faster for businesses to acquire the high-quality
training data they need to accelerate their AI and machine learning
projects. OTS datasets are available for 80 languages and multiple
dialects, including hard-to-acquire languages such as several
varieties of Arabic, Croatian, Greek, Hungarian, Thai and more.
According to the United Nations Department of Economic and
Social Affairs, “about 97 percent of the world's population speaks
just 4 percent of its [7000] languages”. That 4 percent is only 280
languages, yet the number of languages well served by core AI
technologies is a fraction of that number. Appen aims to help
increase that number through these and future projects.
About Appen Limited
Appen collects and labels images, text, speech, audio, and video
used to build and continuously improve the world’s most innovative
artificial intelligence systems. With expertise in more than 235
languages, a global crowd of over 1 million skilled contractors,
and the industry’s most advanced AI-assisted data annotation
platform, Appen solutions provide the quality, security, and speed
required by leaders in technology, automotive, financial services,
retail, manufacturing, and governments worldwide. Founded in 1996,
Appen has customers and offices around the world.
View source version on businesswire.com: https://www.businesswire.com/news/home/20210429005112/en/
Titus Capilnean, Director, Corporate Marketing
tcapilnean@appen.com