Categories
Artificial Intelligence Deep Learning Machine Learning Neural Networks NLP

AI Analysis of Bird Songs Helping Scientists Study Bird Populations and Movements 

By AI Trends Staff  

A study of bird songs conducted in the Sierra Nevada mountain range in California generated a million hours of audio, which AI researchers are working to decode to gain insights into how birds responded to wildfires in the region, and to learn which measures helped the birds to rebound more quickly. 

Scientists can also use the soundscape to help track shifts in migration timing and population ranges, according to a recent account in Scientific American. More audio data is coming in from other research as well, with sound-based projects to count insects and study the effects of light and noise pollution on bird communities underway.  

Connor Wood, postdoctoral researcher, Cornell University

“Audio data is a real treasure trove because it contains vast amounts of information,” stated ecologist Connor Wood, a Cornell University postdoctoral researcher, who is leading the Sierra Nevada project. “We just need to think creatively about how to share and access that information.” AI is helping: the latest generation of machine-learning systems can identify animal species from their calls and process thousands of hours of data in less than a day.   

Laurel Symes, assistant director of the Cornell Lab of Ornithology’s Center for Conservation Bioacoustics, is studying acoustic communication in animals, including crickets, frogs, bats, and birds. She has compiled many months of recordings of katydids (famously vocal long-horned grasshoppers that are an essential part of the food web) in the rain forests of central Panama. Patterns of breeding activity and seasonal population variation are hidden in this audio, but analyzing it is enormously time-consuming.  

Laurel Symes, assistant director of the Cornell Lab of Ornithology’s Center for Conservation Bioacoustics

“Machine learning has been the big game changer for us,” Symes stated to Scientific American.  

It took Symes and three of her colleagues 600 hours of work to classify various katydid species from just 10 recorded hours of sound. But a machine-learning algorithm her team is developing, called KatydID, performed the same task while its human creators “went out for a beer,” Symes stated.  

BirdNET, a popular avian-sound-recognition system available today, will be used by Wood’s team to analyze the Sierra Nevada recordings. BirdNET was built by Stefan Kahl, a machine learning scientist at Cornell’s Center for Conservation Bioacoustics and Chemnitz University of Technology in Germany. Other researchers are using BirdNET to document the effects of light and noise pollution on bird songs at dawn in France’s Brière Regional Natural Park.  
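To give a sense of how such systems chew through long soundscape recordings, the sketch below slices audio into short windows, converts each window to a mel-spectrogram, and scores it with a small classifier. This is only an illustration of the general pipeline, not BirdNET’s code: the network is untrained, and the class list and window length are hypothetical.

```python
# Minimal sketch of spectrogram-based call classification, the general approach
# behind tools like BirdNET. Illustrative only: untrained model, made-up classes.
import librosa
import torch
import torch.nn as nn

CLASSES = ["white-crowned sparrow", "spotted owl", "background"]  # hypothetical labels

class CallClassifier(nn.Module):
    def __init__(self, n_classes: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):
        return self.head(self.conv(x).flatten(1))

def classify_soundscape(path: str, window_s: float = 3.0, sr: int = 22050):
    """Slide a fixed window over a long recording and score each chunk."""
    audio, _ = librosa.load(path, sr=sr)
    model = CallClassifier(len(CLASSES)).eval()
    hop = int(window_s * sr)
    detections = []
    for start in range(0, max(len(audio) - hop, 1), hop):
        chunk = audio[start:start + hop]
        mel = librosa.feature.melspectrogram(y=chunk, sr=sr, n_mels=64)
        spec = torch.tensor(librosa.power_to_db(mel), dtype=torch.float32)[None, None]
        with torch.no_grad():
            probs = torch.softmax(model(spec), dim=1)[0]
        # Record (time in seconds, best-guess species, confidence) for each window.
        detections.append((start / sr, CLASSES[int(probs.argmax())], float(probs.max())))
    return detections
```

In practice a trained model and a confidence threshold replace the placeholders here, but the window-by-window structure is what lets a single machine work through months of audio in a day.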

Bird calls are complex and varied. “You need much more than just signatures to identify the species,” Kahl stated. Many birds have more than one song, and often have regional “dialects”: a white-crowned sparrow from Washington State can sound very different from its Californian cousin, but machine-learning systems can pick out the differences. “Let’s say there’s an as yet unreleased Beatles song that is put out today. You’ve never heard the melody or the lyrics before, but you know it’s a Beatles song because that’s what they sound like,” Kahl stated. “That’s what these programs learn to do, too.”  

BirdVox Combines Study of Bird Songs and Music  

Music recognition research is now crossing over into bird song research, with BirdVox, a collaboration between the Cornell Lab of Ornithology and NYU’s Music and Audio Research Laboratory. BirdVox aims to investigate machine listening techniques for the automatic detection and classification of free-flying bird species from their vocalizations, according to a blog post at NYU.  

The researchers behind BirdVox hope to deploy a network of acoustic sensing devices for real-time monitoring of seasonal bird migratory patterns, in particular, the determination of the precise timing of passage for each species.  

Current bird migration monitoring tools rely on information from weather surveillance radar, which provides insight into the density, direction, and speed of bird movements, but not into the species migrating. Crowdsourced human observations are made almost exclusively during daytime hours; they are of limited use for studying nocturnal migratory flights, the researchers indicated.   

Automatic bioacoustic analysis is seen as a scalable complement to these methods that can produce species-specific information. Such techniques have wide-ranging implications in the field of ecology for understanding biodiversity and monitoring migrating species in areas with buildings, planes, communication towers, and wind turbines, the researchers observed.  

Duke University Researchers Using Drones to Monitor Seabird Colonies  

Elsewhere in bird research, a team from Duke University and the Wildlife Conservation Society (WCS) is using drones and a deep learning algorithm to monitor large colonies of seabirds. The team is analyzing more than 10,000 drone images of mixed colonies of seabirds in the Falkland Islands off Argentina’s coast, according to a press release from Duke University.  

The Falklands, also known as the Malvinas, are home to the world’s largest colonies of black-browed albatrosses (Thalassarche melanophris) and second-largest colonies of southern rockhopper penguins (Eudyptes c. chrysocome). Hundreds of thousands of birds breed on the islands in densely interspersed groups. 

The deep-learning algorithm correctly identified and counted the albatrosses with 97% accuracy and the penguins with 87% accuracy, the team reported. Overall, the automated counts were within five percent of human counts about 90% of the time. 

“Using drone surveys and deep learning gives us an alternative that is remarkably accurate, less disruptive, and significantly easier. One person, or a small team, can do it, and the equipment you need to do it isn’t all that costly or complicated,” stated Madeline C. Hayes, a remote sensing analyst at the Duke University Marine Lab, who led the study. 

Before this new method was available, monitoring the colonies, located on two rocky, uninhabited outer islands, meant teams of scientists counting the number of each species they could observe on a portion of the island and extrapolating those numbers to get a population estimate for the whole colony. Counts often needed to be repeated for better accuracy, a laborious process, and the presence of scientists could disrupt the breeding and parenting behavior of the birds.   

WCS scientists used an off-the-shelf consumer drone to collect more than 10,000 individual photos, which Hayes converted into a large-scale composite image using image-processing software. She then analyzed the image using a convolutional neural network (CNN), a type of AI that employs a deep-learning algorithm to analyze an image and differentiate and count the objects it “sees”: in this case, two different species of birds, penguins and albatrosses. The data was used to create comprehensive estimates of the total number of birds found in the colonies.
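As a rough illustration of the tile-and-count idea described above, the sketch below splits a large composite image into fixed-size tiles, classifies each tile, and tallies per-species totals. It is not the Duke/WCS pipeline: the label set, tile size, file name, and model are placeholders, and a production workflow would use an object detector and calibration against ground counts.

```python
# Illustrative tile-and-count sketch; the species list, tile size, and file
# paths are hypothetical. Real orthomosaics are huge and are usually read in
# windows with geospatial tooling rather than loaded whole.
from collections import Counter

import torch
from PIL import Image
from torchvision import transforms

SPECIES = ["albatross", "penguin", "background"]  # hypothetical label set
TILE = 128  # pixels per tile side (assumed)

to_tensor = transforms.ToTensor()

def count_birds(composite_path: str, model: torch.nn.Module) -> Counter:
    image = Image.open(composite_path).convert("RGB")
    width, height = image.size
    counts = Counter()
    model.eval()
    with torch.no_grad():
        for top in range(0, height - TILE + 1, TILE):
            for left in range(0, width - TILE + 1, TILE):
                tile = image.crop((left, top, left + TILE, top + TILE))
                logits = model(to_tensor(tile).unsqueeze(0))
                label = SPECIES[int(logits.argmax(dim=1))]
                if label != "background":
                    counts[label] += 1
    return counts

# Any classifier mapping a 3 x TILE x TILE tile to len(SPECIES) logits works here.
# Untrained placeholder for demonstration only:
# model = torch.nn.Sequential(torch.nn.Flatten(),
#                             torch.nn.Linear(3 * TILE * TILE, len(SPECIES)))
# print(count_birds("falklands_composite.png", model))
```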

 

“A CNN is loosely modeled on the human neural network, in that it learns from experience,” stated David W. Johnston, director of the Duke Marine Robotics and Remote Sensing Lab. “You train the computer to pick up on different visual patterns, like those made by black-browed albatrosses or southern rockhopper penguins in sample images, and over time it learns how to identify the objects forming those patterns in other images such as our composite photo.” 

Johnston, who is also associate professor of the practice of marine conservation ecology at Duke’s Nicholas School of the Environment, said the emerging drone- and CNN-enabled approach is widely applicable “and greatly increases our ability to monitor the size and health of seabird colonies worldwide, and the health of the marine ecosystems they inhabit.” 

Read the source articles and information in Scientific American, on a blog post at NYU and in a press release from Duke University. 

Categories
Artificial Intelligence Ethical AI NLP Sentiment Analysis

AI Could Solve Partisan Gerrymandering, if Humans Can Agree on What’s Fair 

By John P. Desmond, AI Trends Editor 

With the 2020 US Census results having been delivered to the states, now the process begins for using the population results to draw new Congressional districts. Gerrymandering, a practice intended to establish a political advantage by manipulating the boundaries of electoral districts, is expected to be practiced on a wide scale with Democrats having a slight margin of seats in the House of Representatives and Republicans seeking to close the gap in states where they hold a majority in the legislature.    

Today, more powerful redistricting software incorporating AI and machine learning is available, and it represents a double-edged sword.  

David Thornburgh, president, Committee of Seventy

The pessimistic view is that the gerrymandering software will enable legislators to gerrymander with more precision than ever before, to ensure maximum advantages. This was called “political laser surgery” by David Thornburgh, president of the Committee of Seventy, an anti-corruption organization that considers the 2010 redistricting as one of the worst in the country’s history, according to an account in the Columbia Political Review. 

Supreme Court Justice Elena Kagan issued a warning in her dissent in the Rucho v. Common Cause case, in which the court majority ruled that gerrymandering claims lie outside the jurisdiction of federal courts.  

“Gerrymanders will only get worse (or depending on your perspective, better) as time goes on — as data becomes ever more fine-grained and data analysis techniques continue to improve,” Kagan wrote. “What was possible with paper and pen — or even with Windows 95 — doesn’t hold a candle to what will become possible with developments like machine learning. And someplace along this road, ‘we the people’ become sovereign no longer.”  

The optimistic view is that the tough work can be handed over to machines, with humans further removed from the equation. A state simply needs to establish objective criteria in a bipartisan manner, then turn the map-drawing over to computers. But it turns out to be difficult to arrive at criteria for what constitutes a “fair” district.  

Brian Olson of Carnegie Mellon University is working on it, with a proposal to have computers prioritize districts that are compact and equally populated, using a tool called “Bdistricting.” However, the author of the Columbia Political Review account reported this has not been successful in creating districts that would have competitive elections.  

One reason is the political geography of the country includes dense, urban Democratic centers surrounded by sparsely-populated rural Republican areas. Attempts to take these geographic considerations into account have added so many variables and complexities that the solution becomes impractical.  
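For readers curious what “objective criteria” look like in code, here is a minimal sketch of the kind of score a compactness-and-equal-population approach might optimize. It is an illustration only, not Bdistricting’s actual method; Polsby-Popper is just one standard compactness measure, and the weights are arbitrary.

```python
# Sketch of a districting objective: reward compact shapes and near-equal
# population. Polsby-Popper = 4*pi*area / perimeter^2 (1.0 for a circle).
import math
from dataclasses import dataclass

@dataclass
class District:
    area: float       # e.g. square kilometers
    perimeter: float  # same length units as the area
    population: int

def polsby_popper(d: District) -> float:
    """Compactness: 1.0 for a perfect circle, smaller for sprawling shapes."""
    return 4 * math.pi * d.area / (d.perimeter ** 2)

def plan_score(districts: list[District],
               compactness_weight: float = 1.0,
               equality_weight: float = 1.0) -> float:
    ideal = sum(d.population for d in districts) / len(districts)
    compactness = sum(polsby_popper(d) for d in districts) / len(districts)
    # Mean absolute deviation from the ideal district population, as a fraction.
    deviation = sum(abs(d.population - ideal) for d in districts) / (ideal * len(districts))
    return compactness_weight * compactness - equality_weight * deviation
```

A search procedure would then propose maps and keep the ones with the highest score; the article’s point is that scores like this say nothing about whether the resulting elections are competitive.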

Shruti Verma, student at Columbia’s School of Engineering and Applied Sciences, studying computer science and political science

“Technology cannot, then, be trusted to handle the process of redistricting alone. But it can play an important role in its reform,” stated the author, Shruti Verma, a student at Columbia’s School of Engineering and Applied Sciences, studying computer science and political science.   

However, more tools are becoming available to provide transparency into the redistricting process to a degree not possible in the past. “This software weakens the ability of our state lawmakers to obfuscate,” she stated. “In this way, the very developments in technology that empowered gerrymandering can now serve to hobble it.”  

Tools are available from the Princeton Gerrymandering Project and the Committee of Seventy.  

University of Illinois Researcher Urges Transparency in Redistricting 

Transparency in the process of redistricting is also emphasized by researchers Wendy Tam Cho and Bruce Cain in the September 2020 issue of Science, who suggest that AI can help. Cho, who teaches at the University of Illinois at Urbana-Champaign, has worked on computational redistricting for many years. Last year, she was an expert witness in a lawsuit brought by the ACLU that ended with a finding that gerrymandered districts in Ohio were unconstitutional, according to a report in TechCrunch. Bruce Cain is a professor of political science at Stanford University with expertise in democratic representation and state politics.   

In an essay explaining their work, the two stated, “The way forward is for people to work collaboratively with machines to produce results not otherwise possible. To do this, we must capitalize on the strengths and minimize the weaknesses of both artificial intelligence (AI) and human intelligence.”  

And, “Machines enhance and inform intelligent decision-making by helping us navigate the unfathomably large and complex informational landscape. Left to their own devices, humans have shown themselves to be unable to resist the temptation to chart biased paths through that terrain.”  

In an interview with TechCrunch, Cho stated that while automation has potential benefits for states in redistricting, “transparency within that process is essential for developing and maintaining public trust and minimizing the possibilities and perceptions of bias.” 

Also, while the AI models for redistricting may be complex, the public is interested mostly in the results. “The details of these models are intricate and require a fair amount of knowledge in statistics, mathematics, and computer science but also an equally deep understanding of how our political institutions and the law work,” Cho stated. “At the same time, while understanding all the details is daunting, I am not sure this level of understanding by the general public or politicians is necessary.”

Harvard, BU Researchers Recommend a Game Approach 

Researchers at Harvard University and Boston University have proposed a software tool to help with redistricting using a game metaphor. Called Define-Combine, the tool enables each party to take a turn in shaping the districts, using sophisticated mapping algorithms to ensure the approach is fair, according to an account in Fast Company.  

Early experience shows the Define-Combine procedure resulted in the majority party having a much smaller advantage, so in the end, the process produced more moderate maps.  

Whether this is the desired outcome of the party with the advantage today remains to be seen. Gerrymandering factors heavily in politics, according to a recent account in Data Science Central. After a redistricting in 2011, Wisconsin’s district maps produced an outcome in which the Republican party could receive 48% of the vote in the state yet end up with 62% of the legislative seats.  
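A mismatch between vote share and seat share like Wisconsin’s is often summarized with the efficiency gap, which compares the two parties’ “wasted” votes. The sketch below computes it for a hypothetical five-district map (the tallies are invented, not Wisconsin data) in which a party wins 48% of the votes but 60% of the seats.

```python
# Efficiency gap: (party A's wasted votes - party B's wasted votes) / total votes.
# Wasted votes are all votes for a losing candidate plus a winner's surplus
# beyond the bare majority needed to carry the district.
def efficiency_gap(districts: list[tuple[int, int]]) -> float:
    """districts: (party_a_votes, party_b_votes) per district.
    Positive: the map wastes more of A's votes; negative: more of B's."""
    wasted_a = wasted_b = total = 0
    for a, b in districts:
        votes = a + b
        threshold = votes // 2 + 1  # votes needed to win the district
        if a > b:
            wasted_a += a - threshold   # winner's surplus
            wasted_b += b               # all of the loser's votes
        else:
            wasted_b += b - threshold
            wasted_a += a
        total += votes
    return (wasted_a - wasted_b) / total

# Hypothetical map: party A takes 48% of the statewide vote but wins 3 of 5 seats
# by winning its districts narrowly while losing the others by wide margins.
print(efficiency_gap([(53, 47), (53, 47), (53, 47), (40, 60), (41, 59)]))
# Negative result here: the map wastes more of party B's votes, favoring party A.
```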

Read the source articles and information in Columbia Political Review, in Science, in TechCrunch, in Fast Company and in Data Science Central. 

Categories
Artificial Intelligence Digital Transformation

Building architectures that can handle the world’s data

Perceiver IO, a more general version of the Perceiver architecture, can produce a wide variety of outputs from many different inputs.

Categories
Artificial Intelligence

We tested AI interview tools. Here’s what we found.

After more than a year of the covid-19 pandemic, millions of people are searching for employment in the United States. AI-powered interview software claims to help employers sift through applications to find the best people for the job. Companies specializing in this technology reported a surge in business during the pandemic.

But as the demand for these technologies increases, so do questions about their accuracy and reliability. In the latest episode of MIT Technology Review’s podcast “In Machines We Trust,” we tested software from two firms specializing in AI job interviews, MyInterview and Curious Thing. And we found variations in the predictions and job-matching scores that raise concerns about what exactly these algorithms are evaluating.

Getting to know you

MyInterview measures traits considered in the Big Five Personality Test, a psychometric evaluation often used in the hiring process. These traits include openness, conscientiousness, extroversion, agreeableness, and emotional stability. Curious Thing also measures personality-related traits, but instead of the Big Five, candidates are evaluated on other metrics, like humility and resilience.

This screenshot shows our candidate’s match score and personality analysis on MyInterview after answering all interview questions in German instead of English.

HILKE SCHELLMANN

The algorithms analyze candidates’ responses to determine personality traits. MyInterview also compiles scores indicating how closely a candidate matches the characteristics identified by hiring managers as ideal for the position.

To complete our tests, we first set up the software. We uploaded a fake job posting for an office administrator/researcher on both MyInterview and Curious Thing. Then we constructed our ideal candidate by choosing personality-related traits when prompted by the system.

On MyInterview, we selected characteristics like attention to detail and ranked them by level of importance. We also selected interview questions, which are displayed on the screen while the candidate records video responses. On Curious Thing, we selected characteristics like humility, adaptability, and resilience.

One of us, Hilke, then applied for the position and completed interviews for the role on both MyInterview and Curious Thing.

Our candidate completed a phone interview with Curious Thing. She first did a regular job interview and received an 8.5 out of 9 for English competency. In a second try, the automated interviewer asked the same questions, and she responded to each by reading the Wikipedia entry for psychometrics in German.

Yet Curious Thing awarded her a 6 out of 9 for English competency. She completed the interview again and received the same score.

A screenshot shows our candidate’s English competency score in Curious Thing’s software after she answered all questions in German.

HILKE SCHELLMANN

Our candidate turned to MyInterview and repeated the experiment. She read the same Wikipedia entry aloud in German. The algorithm not only returned a personality assessment, but it also predicted our candidate to be a 73% match for the fake job, putting her in the top half of all the applicants we had asked to apply.

MyInterview provides hiring managers with a transcript of their interviews. When we inspected our candidate’s transcript, we found that the system interpreted her German words as English words. But the transcript didn’t make any sense. The first few lines, which correspond to the answer provided above, read:

So humidity is desk a beat-up. Sociology, does it iron? Mined material nematode adapt. Secure location, mesons the first half gamma their Fortunes in for IMD and fact long on for pass along to Eurasia and Z this particular location mesons.

Mismatched

Instead of scoring our candidate on the content of her answers, the algorithm pulled personality traits from her voice, says Clayton Donnelly, an industrial and organizational psychologist working with MyInterview.

But intonation isn’t a reliable indicator of personality traits, says Fred Oswald, a professor of industrial organizational psychology at Rice University. “We really can’t use intonation as data for hiring,” he says. “That just doesn’t seem fair or reliable or valid.”

Using open-ended questions to determine personality traits also poses significant challenges, even when—or perhaps especially when—that process is automated. That’s why many personality tests, such as the Big Five, give people options from which to choose.

“The bottom-line point is that personality is hard to ferret out in this open-ended sense,” Oswald says. “There are opportunities for AI or algorithms and the way the questions are asked to be more structured and standardized. But I don’t think we’re necessarily there in terms of the data, in terms of the designs that give us the data.”

The cofounder and chief technology officer of Curious Thing, Han Xu, responded to our findings in an email, saying: “This is the very first time that our system is being tested in German, therefore an extremely valuable data point for us to research into and see if it unveils anything in our system.”

The bias paradox

Performance on AI-powered interviews is often not the only metric prospective employers use to evaluate a candidate. And these systems may actually reduce bias and find better candidates than human interviewers do. But many of these tools aren’t independently tested, and the companies that built them are reluctant to share details of how they work, making it difficult for either candidates or employers to know whether the algorithms are accurate or what influence they should have on hiring decisions.

Mark Gray, who works at a Danish property management platform called Proper, started using AI video interviews during his previous human resources role at the electronics company Airtame. He says he originally incorporated the software, produced by a German company called Retorio, into interviews to help reduce the human bias that often develops as hiring managers make small talk with candidates.

While Gray doesn’t base hiring decisions solely on Retorio’s evaluation, which also draws on the Big Five traits, he does take it into account as one of many data points when choosing candidates. “I don’t think it’s a silver bullet for figuring out how to hire the right person,” he says.

Gray’s usual hiring process includes a screening call and a Retorio interview, which he invites most candidates to participate in regardless of the impression they made in the screening. Successful candidates will then advance to a job skills test, followed by a live interview with other members of the team.

“In time, products like Retorio, and Retorio itself—every company should be using it because it just gives you so much insight,” Gray says. “While there are some question marks and controversies in the AI sphere in general, I think the bigger question is, are we a better or worse judge of character?”

Gray acknowledges the criticism surrounding AI interviewing tools. An investigation published in February by Bavarian Public Broadcasting found that Retorio’s algorithm assessed candidates differently when they used different video backgrounds and accessories, like glasses, during the interview.

Retorio’s co-founder and managing director, Christoph Hohenberger, says that while he’s not aware of the specifics behind the journalists’ testing methods, the company doesn’t intend for its software to be the deciding factor when hiring candidates. “We are an assisting tool, and it’s being used in practice also together with human people on the other side. It’s not an automatic filter,” he says.

Still, the stakes are so high for job-seekers attempting to navigate these tools that surely more caution is warranted. For most, after all, securing employment isn’t just about a new challenge or environment—finding a job is crucial to their economic survival.

Categories
Artificial Intelligence NLP

AI voice actors sound more human than ever—and they’re ready to hire

The company blog post drips with the enthusiasm of a ’90s US infomercial. WellSaid Labs describes what clients can expect from its “eight new digital voice actors!” Tobin is “energetic and insightful.” Paige is “poised and expressive.” Ava is “polished, self-assured, and professional.”

Each one is based on a real voice actor, whose likeness (with consent) has been preserved using AI. Companies can now license these voices to say whatever they need. They simply feed some text into the voice engine, and out will spool a crisp audio clip of a natural-sounding performance.

WellSaid Labs, a Seattle-based startup that spun out of the research nonprofit Allen Institute for Artificial Intelligence, is the latest firm offering AI voices to clients. For now, it specializes in voices for corporate e-learning videos. Other startups make voices for digital assistants, call center operators, and even video-game characters.

KH · A WellSaid AI voice actor in a promotional style

Not too long ago, such deepfake voices had something of a lousy reputation for their use in scam calls and internet trickery. But their improving quality has since piqued the interest of a growing number of companies. Recent breakthroughs in deep learning have made it possible to replicate many of the subtleties of human speech. These voices pause and breathe in all the right places. They can change their style or emotion. You can spot the trick if they speak for too long, but in short audio clips, some have become indistinguishable from humans.

AI voices are also cheap, scalable, and easy to work with. Unlike a recording of a human voice actor, synthetic voices can also update their script in real time, opening up new opportunities to personalize advertising.

But the rise of hyperrealistic fake voices isn’t consequence-free. Human voice actors, in particular, have been left to wonder what this means for their livelihoods.

How to fake a voice

Synthetic voices have been around for a while. But the old ones, including the voices of the original Siri and Alexa, simply glued together words and sounds to achieve a clunky, robotic effect. Getting them to sound any more natural was a laborious manual task.

Deep learning changed that. Voice developers no longer needed to dictate the exact pacing, pronunciation, or intonation of the generated speech. Instead, they could feed a few hours of audio into an algorithm and have the algorithm learn those patterns on its own.

“If I’m Pizza Hut, I certainly can’t sound like Domino’s, and I certainly can’t sound like Papa John’s.”

Rupal Patel, founder and CEO of VocaliD

Over the years, researchers have used this basic idea to build voice engines that are more and more sophisticated. The one WellSaid Labs constructed, for example, uses two primary deep-learning models. The first predicts, from a passage of text, the broad strokes of what a speaker will sound like—including accent, pitch, and timbre. The second fills in the details, including breaths and the way the voice resonates in its environment.
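The split into two models can be pictured with a skeletal sketch: one network maps text to coarse acoustic frames (standing in for accent, pitch, and timbre), and a second turns those frames into a waveform. This is a generic text-to-speech outline for illustration, not WellSaid Labs’ architecture; the layer sizes are arbitrary and both networks are untrained.

```python
# Skeletal two-stage text-to-speech outline, for illustration only.
import torch
import torch.nn as nn

class AcousticModel(nn.Module):
    """Stage 1: character IDs -> mel-spectrogram-like frames (prosody, timbre)."""
    def __init__(self, vocab_size=256, n_mels=80, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.to_mel = nn.Linear(hidden, n_mels)

    def forward(self, char_ids):                  # (batch, characters)
        states, _ = self.rnn(self.embed(char_ids))
        return self.to_mel(states)                # (batch, frames, n_mels)

class Vocoder(nn.Module):
    """Stage 2: acoustic frames -> audio samples (here, a crude fixed upsampling)."""
    def __init__(self, n_mels=80, samples_per_frame=256):
        super().__init__()
        self.upsample = nn.Linear(n_mels, samples_per_frame)

    def forward(self, mel):                        # (batch, frames, n_mels)
        return self.upsample(mel).flatten(1)       # (batch, frames * samples_per_frame)

text = torch.tensor([[ord(c) for c in "hello from a synthetic voice"]])
waveform = Vocoder()(AcousticModel()(text))
print(waveform.shape)  # one untrained "utterance"; purely illustrative
```

The point of the split is that the first model can be fine-tuned on a particular actor’s delivery while the second concentrates on making the audio itself sound natural.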

Making a convincing synthetic voice takes more than just pressing a button, however. Part of what makes a human voice so human is its inconsistency, expressiveness, and ability to deliver the same lines in completely different styles, depending on the context.

Capturing these nuances involves finding the right voice actors to supply the appropriate training data and fine-tune the deep-learning models. WellSaid says the process requires at least an hour or two of audio and a few weeks of labor to develop a realistic-sounding synthetic replica.

KH · A Resemble.ai customer service agent
KH · A Resemble.ai voice actor in conversational style

AI voices have grown particularly popular among brands looking to maintain a consistent sound in millions of interactions with customers. With the ubiquity of smart speakers today, and the rise of automated customer service agents as well as digital assistants embedded in cars and smart devices, brands may need to produce upwards of a hundred hours of audio a month. But they also no longer want to use the generic voices offered by traditional text-to-speech technology—a trend that accelerated during the pandemic as more and more customers skipped in-store interactions to engage with companies virtually.

“If I’m Pizza Hut, I certainly can’t sound like Domino’s, and I certainly can’t sound like Papa John’s,” says Rupal Patel, a professor at Northeastern University and the founder and CEO of VocaliD, which promises to build custom voices that match a company’s brand identity. “These brands have thought about their colors. They’ve thought about their fonts. Now they’ve got to start thinking about the way their voice sounds as well.”

Karen Hao, MIT Tech Review · A VocaliD ad sample with a male voice
Karen Hao, MIT Tech Review · A VocaliD ad sample with a female voice

Whereas companies used to have to hire different voice actors for different markets—the Northeast versus Southern US, or France versus Mexico—some voice AI firms can manipulate the accent or switch the language of a single voice in different ways. This opens up the possibility of adapting ads on streaming platforms depending on who is listening, changing not just the characteristics of the voice but also the words being spoken. A beer ad could tell a listener to stop by a different pub depending on whether it’s playing in New York or Toronto, for example. Resemble.ai, which designs voices for ads and smart assistants, says it’s already working with clients to launch such personalized audio ads on Spotify and Pandora.

The gaming and entertainment industries are also seeing the benefits. Sonantic, a firm that specializes in emotive voices that can laugh and cry or whisper and shout, works with video-game makers and animation studios to supply the voice-overs for their characters. Many of its clients use the synthesized voices only in pre-production and switch to real voice actors for the final production. But Sonantic says a few have started using them throughout the process, perhaps for characters with fewer lines. Resemble.ai and others have also worked with film and TV shows to patch up actors’ performances when words get garbled or mispronounced.

But there are limitations to how far AI can go. It’s still difficult to maintain the realism of a voice over the long stretches of time that might be required for an audiobook or podcast. And there’s little ability to control an AI voice’s performance in the same way a director can guide a human performer. “We’re still in the early days of synthetic speech,” says Zohaib Ahmed, the founder and CEO of Resemble.ai, comparing it to the days when CGI technology was used primarily for touch-ups rather than to create entirely new worlds from green screens.

A human touch

In other words, human voice actors aren’t going away just yet. Expressive, creative, and long-form projects are still best done by humans. And for every synthetic voice made by these companies, a voice actor also needs to supply the original training data.

But some actors have grown increasingly worried about their livelihoods, says a spokesperson at SAG-AFTRA, the union representing voice actors in the US. If they’re not afraid of being automated away by AI, they’re worried about being compensated unfairly or losing control over their voices, which constitute their brand and reputation.

This is now the subject of a lawsuit against TikTok brought by the Canadian voice actor Bev Standing, who alleges that the app’s built-in voice-over feature uses a synthetic copy of her voice without her permission. Standing’s experience also echoes that of Susan Bennett, the original voice of American Siri, who was paid for her initial recordings but not for the continued use of her vocal likeness on millions of Apple devices.

Some companies are looking to be more accountable in how they engage with the voice-acting industry. The best ones, says SAG-AFTRA’s rep, have approached the union to figure out the best way to compensate and respect voice actors for their work.

Several now use a profit-sharing model to pay actors every time a client licenses their specific synthetic voice, which has opened up a new stream of passive income. Others involve the actors in the process of designing their AI likeness and give them veto power over the projects it will be used in. SAG-AFTRA is also pushing for legislation to protect actors from illegitimate replicas of their voice.

But for VocaliD’s Patel, the point of AI voices is ultimately not to replicate human performance or to automate away existing voice-over work. Instead, the promise is that they could open up entirely new possibilities. What if in the future, she says, synthetic voices could be used to rapidly adapt online educational materials to different audiences? “If you’re trying to reach, let’s say, an inner-city group of kids, wouldn’t it be great if that voice actually sounded like it was from their community?”

Categories
Artificial Intelligence Ethical AI

Disability rights advocates are worried about discrimination in AI hiring tools

Your ability to land your next job could depend on how well you play one of the AI-powered games that companies like AstraZeneca and Postmates are increasingly using in the hiring process.

Some companies that create these games, like Pymetrics and Arctic Shores, claim that they limit bias in hiring. But AI hiring games can be especially difficult to navigate for job seekers with disabilities.

In the latest episode of MIT Technology Review’s podcast “In Machines We Trust,” we explore how AI-powered hiring games and other tools may exclude people with disabilities. And while many people in the US are looking to the federal commission responsible for employment discrimination to regulate these technologies, the agency has yet to act.

To get a closer look, we asked Henry Claypool, a disability policy analyst, to play one of Pymetrics’s games. Pymetrics measures nine skills, including attention, generosity, and risk tolerance, that CEO and cofounder Frida Polli says relate to job success.

When it works with a company looking to hire new people, Pymetrics first asks the company to identify people who are already succeeding at the job it’s trying to fill and has them play its games. Then, to identify the skills most specific to the successful employees, it compares their game data with data from a random sample of players.
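In outline, that comparison can be pictured as fitting a simple classifier that separates current high performers from a baseline sample of players and inspecting which game-derived traits carry the most weight. The sketch below is not Pymetrics’ algorithm; the trait names and data are invented, and a real system would also need validation and adverse-impact auditing.

```python
# Toy illustration of "compare successful employees to a baseline sample".
# Hypothetical traits and synthetic data; not Pymetrics' model.
import numpy as np
from sklearn.linear_model import LogisticRegression

TRAITS = ["attention", "generosity", "risk_tolerance"]  # hypothetical game metrics

rng = np.random.default_rng(0)
employees = rng.normal(loc=[0.7, 0.5, 0.4], scale=0.1, size=(50, 3))   # current high performers
baseline = rng.normal(loc=[0.5, 0.5, 0.5], scale=0.1, size=(500, 3))   # random sample of players

X = np.vstack([employees, baseline])
y = np.array([1] * len(employees) + [0] * len(baseline))

model = LogisticRegression().fit(X, y)
for trait, weight in zip(TRAITS, model.coef_[0]):
    # Larger magnitude = more distinctive of the successful group in this toy data.
    print(f"{trait}: {weight:+.2f}")
```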

When he signed on, the game prompted Claypool to choose between a modified version—designed for those with color blindness, ADHD, or dyslexia—and an unmodified version. This question poses a dilemma for applicants with disabilities, he says.

“The fear is that if I click one of these, I’ll disclose something that will disqualify me for the job, and if I don’t click on—say—dyslexia or whatever it is that makes it difficult for me to read letters and process that information quickly, then I’ll be at a disadvantage,” Claypool says. “I’m going to fail either way.”

Polli says Pymetrics does not tell employers which applicants requested in-game accommodations during the hiring process, which should help prevent employers from discriminating against people with certain disabilities. She added that in response to our reporting, the company will make this information clearer so applicants know that their need for an in-game accommodation is private and confidential.

The Americans with Disabilities Act requires employers to provide reasonable accommodations to people with disabilities. And if a company’s hiring assessments exclude people with disabilities, then it must prove that those assessments are necessary to the job.

For employers, using games such as those produced by Arctic Shores may seem more objective. Unlike traditional psychometric testing, Arctic Shores’s algorithm evaluates candidates on the basis of their choices throughout the game. However, candidates often don’t know what the game is measuring or what to expect as they play. For applicants with disabilities, this makes it hard to know whether they should ask for an accommodation.

Safe Hammad, CTO and cofounder of Arctic Shores, says his team is focused on making its assessments accessible to as many people as possible. People with color blindness and hearing disabilities can use the company’s software without special accommodations, he says, but employers should not use such requests to screen out candidates.

The use of these tools can sometimes exclude people in ways that may not be obvious to a potential employer, though. Patti Sanchez is an employment specialist at the MacDonald Training Center in Florida who works with job seekers who are deaf or hard of hearing. About two years ago, one of her clients applied for a job at Amazon that required a video interview through HireVue.

Sanchez, who is also deaf, attempted to call and request assistance from the company, but couldn’t get through. Instead, she brought her client and a sign language interpreter to the hiring site and persuaded representatives there to interview him in person. Amazon hired her client, but Sanchez says issues like these are common when navigating automated systems. (Amazon did not respond to a request for comment.)

Making hiring technology accessible means ensuring both that a candidate can use the technology and that the skills it measures don’t unfairly exclude candidates with disabilities, says Alexandra Givens, the CEO of the Center for Democracy and Technology, an organization focused on civil rights in the digital age.

AI-powered hiring tools often fail to include people with disabilities when generating their training data, she says. Such people have long been excluded from the workforce, so algorithms modeled after a company’s previous hires won’t reflect their potential.

Even if the models could account for outliers, the way a disability presents itself varies widely from person to person. Two people with autism, for example, could have very different strengths and challenges.

“As we automate these systems, and employers push to what’s fastest and most efficient, they’re losing the chance for people to actually show their qualifications and their ability to do the job,” Givens says. “And that is a huge loss.”

A hands-off approach

Government regulators are finding it difficult to monitor AI hiring tools. In December 2020, 11 senators wrote a letter to the US Equal Employment Opportunity Commission expressing concerns about the use of hiring technologies after the covid-19 pandemic. The letter inquired about the agency’s authority to investigate whether these tools discriminate, particularly against those with disabilities.

The EEOC responded with a letter in January that was leaked to MIT Technology Review. In the letter, the commission indicated that it cannot investigate AI hiring tools without a specific claim of discrimination. The letter also outlined concerns about the industry’s hesitance to share data and said that variation between different companies’ software would prevent the EEOC from instituting any broad policies.

“I was surprised and disappointed when I saw the response,” says Roland Behm, a lawyer and advocate for people with behavioral health issues. “The whole tenor of that letter seemed to make the EEOC seem like more of a passive bystander rather than an enforcement agency.”

The agency typically starts an investigation once an individual files a claim of discrimination. With AI hiring technology, though, most candidates don’t know why they were rejected for the job. “I believe a reason that we haven’t seen more enforcement action or private litigation in this area is due to the fact that candidates don’t know that they’re being graded or assessed by a computer,” says Keith Sonderling, an EEOC commissioner.

Sonderling says he believes that artificial intelligence will improve the hiring process, and he hopes the agency will issue guidance for employers on how best to implement it. He says he welcomes oversight from Congress.

However, Aaron Rieke, managing director of Upturn, a nonprofit dedicated to civil rights and technology, expressed disappointment in the EEOC’s response: “I actually would hope that in the years ahead, the EEOC could be a little bit more aggressive and creative in thinking about how to use that authority.”

Pauline Kim, a law professor at Washington University in St. Louis, whose research focuses on algorithmic hiring tools, says the EEOC could be more proactive in gathering research and updating guidelines to help employers and AI companies comply with the law.

Behm adds that the EEOC could pursue other avenues of enforcement, including a commissioner’s charge, which allows commissioners to initiate an investigation into suspected discrimination instead of requiring an individual claim (Sonderling says he is considering making such a charge). He also suggests that the EEOC consult with advocacy groups to develop guidelines for AI companies hoping to better represent people with disabilities in their algorithmic models.

It’s unlikely that AI companies and employers are screening out people with disabilities on purpose, Behm says. But they “haven’t spent the time and effort necessary to understand the systems that are making what for many people are life-changing decisions: Am I going to be hired or not? Can I support my family or not?”

Categories
Artificial Intelligence Ethical AI

DeepMind says it will release the structure of every protein known to science

Back in December 2020, DeepMind took the world of biology by surprise when it solved a 50-year grand challenge with AlphaFold, an AI tool that predicts the structure of proteins. Last week the London-based company published full details of that tool and released its source code.

Now the firm has announced that it has used its AI to predict the shapes of nearly every protein in the human body, as well as the shapes of hundreds of thousands of other proteins found in 20 of the most widely studied organisms, including yeast, fruit flies, and mice. The breakthrough could allow biologists from around the world to understand diseases better and develop new drugs. 

So far the trove consists of 350,000 newly predicted protein structures. DeepMind says it will predict and release the structures for more than 100 million more in the next few months—more or less all proteins known to science. 

“Protein folding is a problem I’ve had my eye on for more than 20 years,” says DeepMind cofounder and CEO Demis Hassabis. “It’s been a huge project for us. I would say this is the biggest thing we’ve done so far. And it’s the most exciting in a way, because it should have the biggest impact in the world outside of AI.”

Proteins are made of long ribbons of amino acids, which twist themselves up into complicated knots. Knowing the shape of a protein’s knot can reveal what that protein does, which is crucial for understanding how diseases work and developing new drugs—or identifying organisms that can help tackle pollution and climate change. Figuring out a protein’s shape takes weeks or months in the lab. AlphaFold can predict shapes to the nearest atom in a day or two.

The new database should make life even easier for biologists. AlphaFold might be available for researchers to use, but not everyone will want to run the software themselves. “It’s much easier to go and grab a structure from the database than it is running it on your own computer,” says David Baker of the Institute for Protein Design at the University of Washington, whose lab has built its own tool for predicting protein structure, called RoseTTAFold, based on AlphaFold’s approach.

In the last few months Baker’s team has been working with biologists who were previously stuck trying to figure out the shape of proteins they were studying. “There’s a lot of pretty cool biological research that’s been really sped up,” he says. A public database containing hundreds of thousands of ready-made protein shapes should be an even bigger accelerator.  

“It looks astonishingly impressive,” says Tom Ellis, a synthetic biologist at Imperial College London studying the yeast genome, who is excited to try the database. But he cautions that most of the predicted shapes have not yet been verified in the lab.  

Atomic precision

In the new version of AlphaFold, predictions come with a confidence score that the tool uses to flag how close it thinks each predicted shape is to the real thing. Using this measure, DeepMind found that AlphaFold predicted shapes for 36% of human proteins with an accuracy that is correct down to the level of individual atoms. This is good enough for drug development, says Hassabis.   
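For researchers who want to inspect that confidence score themselves, the per-residue values (pLDDT) are distributed in the B-factor column of the predicted PDB files. The sketch below pulls one structure from the public database and averages them; the URL pattern and the “v4” version suffix are assumptions that should be checked against the current AlphaFold DB documentation.

```python
# Fetch one AlphaFold-predicted structure and average its per-residue pLDDT
# confidence scores, which are stored in the PDB B-factor field.
# The download URL pattern and version suffix are assumptions; verify them
# against the AlphaFold DB before relying on this.
import urllib.request

def mean_plddt(uniprot_id: str) -> float:
    url = f"https://alphafold.ebi.ac.uk/files/AF-{uniprot_id}-F1-model_v4.pdb"
    with urllib.request.urlopen(url) as response:
        pdb_text = response.read().decode()
    scores = [
        float(line[60:66])                # PDB B-factor columns 61-66
        for line in pdb_text.splitlines()
        if line.startswith("ATOM")
    ]
    return sum(scores) / len(scores)

# Example: human hemoglobin subunit alpha (UniProt P69905).
print(mean_plddt("P69905"))
```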

Previously, after decades of work, the structures of only 17% of the proteins in the human body had been identified in the lab. If AlphaFold’s predictions are as accurate as DeepMind says, the tool has more than doubled this number in just a few weeks.

Even predictions that are not fully accurate at the atomic level are still useful. For more than half of the proteins in the human body, AlphaFold has predicted a shape that should be good enough for researchers to figure out the protein’s function. The rest of AlphaFold’s current predictions are either incorrect, or are for the third of proteins in the human body that don’t have a structure at all until they bind with others. “They’re floppy,” says Hassabis.

“The fact that it can be applied at this level of quality is an impressive thing,” says Mohammed AlQuraishi, a systems biologist at Columbia University who has developed his own software for predicting protein structure. He also points out that having structures for most of the proteins in an organism will make it possible to study how these proteins work as a system, not just in isolation. “That’s what I think is most exciting,” he says.

DeepMind is releasing its tools and predictions for free and will not say if it has plans for making money from them in future. It is not ruling out the possibility, however. To set up and run the database, DeepMind is partnering with the European Molecular Biology Laboratory, an international research institution that already hosts a large database of protein information. 

For now, AlQuraishi can’t wait to see what researchers do with the new data. “It’s pretty spectacular,” he says. “I don’t think any of us thought we would be here this quickly. It’s mind-boggling.”

Categories
Artificial Intelligence Ethical AI

Hundreds of AI tools have been built to catch covid. None of them helped.

When covid-19 struck Europe in March 2020, hospitals were plunged into a health crisis that was still badly understood. “Doctors really didn’t have a clue how to manage these patients,” says Laure Wynants, an epidemiologist at Maastricht University in the Netherlands, who studies predictive tools.

But there was data coming out of China, which had a four-month head start in the race to beat the pandemic. If machine-learning algorithms could be trained on that data to help doctors understand what they were seeing and make decisions, it just might save lives. “I thought, ‘If there’s any time that AI could prove its usefulness, it’s now,’” says Wynants. “I had my hopes up.”

It never happened—but not for lack of effort. Research teams around the world stepped up to help. The AI community, in particular, rushed to develop software that many believed would allow hospitals to diagnose or triage patients faster, bringing much-needed support to the front lines—in theory.

In the end, many hundreds of predictive tools were developed. None of them made a real difference, and some were potentially harmful.

That’s the damning conclusion of multiple studies published in the last few months. In June, the Turing Institute, the UK’s national center for data science and AI, put out a report summing up discussions at a series of workshops it held in late 2020. The clear consensus was that AI tools had made little, if any, impact in the fight against covid.

Not fit for clinical use

This echoes the results of two major studies that assessed hundreds of predictive tools developed last year. Wynants is lead author of one of them, a review in the British Medical Journal that is still being updated as new tools are released and existing ones tested. She and her colleagues have looked at 232 algorithms for diagnosing patients or predicting how sick those with the disease might get. They found that none of them were fit for clinical use. Just two have been singled out as being promising enough for future testing.

“It’s shocking,” says Wynants. “I went into it with some worries, but this exceeded my fears.”

Wynants’s study is backed up by another large review carried out by Derek Driggs, a machine-learning researcher at the University of Cambridge, and his colleagues, and published in Nature Machine Intelligence. This team zoomed in on deep-learning models for diagnosing covid and predicting patient risk from medical images, such as chest x-rays and chest computer tomography (CT) scans. They looked at 415 published tools and, like Wynants and her colleagues, concluded that none were fit for clinical use.

“This pandemic was a big test for AI and medicine,” says Driggs, who is himself working on a machine-learning tool to help doctors during the pandemic. “It would have gone a long way to getting the public on our side,” he says. “But I don’t think we passed that test.”

Both teams found that researchers repeated the same basic errors in the way they trained or tested their tools. Incorrect assumptions about the data often meant that the trained models did not work as claimed.

Wynants and Driggs still believe AI has the potential to help. But they are concerned that tools built in the wrong way could do harm, because they could miss diagnoses or underestimate risk for vulnerable patients. “There is a lot of hype about machine-learning models and what they can do today,” says Driggs.

Unrealistic expectations encourage the use of these tools before they are ready. Wynants and Driggs both say that a few of the algorithms they looked at have already been used in hospitals, and some are being marketed by private developers. “I fear that they may have harmed patients,” says Wynants.

So what went wrong? And how do we bridge that gap? If there’s an upside, it is that the pandemic has made it clear to many researchers that the way AI tools are built needs to change. “The pandemic has put problems in the spotlight that we’ve been dragging along for some time,” says Wynants.

What went wrong

Many of the problems that were uncovered are linked to the poor quality of the data that researchers used to develop their tools. Information about covid patients, including medical scans, was collected and shared in the middle of a global pandemic, often by the doctors struggling to treat those patients. Researchers wanted to help quickly, and these were the only public data sets available. But this meant that many tools were built using mislabeled data or data from unknown sources.

Driggs highlights the problem of what he calls Frankenstein data sets, which are spliced together from multiple sources and can contain duplicates. This means that some tools end up being tested on the same data they were trained on, making them appear more accurate than they are.
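A simple guard against that kind of leakage is to hash every file and check that nothing in the test split also appears in the training split, as in the sketch below. The directory layout is hypothetical, and near-duplicates (resized or re-exported copies) would need perceptual hashing rather than exact hashes.

```python
# Check for exact duplicates shared between a training and a test split.
# Hypothetical directory layout; near-duplicates need perceptual hashing instead.
import hashlib
from pathlib import Path

def file_hashes(folder: str) -> dict[str, str]:
    return {
        hashlib.sha256(path.read_bytes()).hexdigest(): path.name
        for path in Path(folder).glob("*.png")
    }

train = file_hashes("data/train")
test = file_hashes("data/test")
leaked = set(train) & set(test)

print(f"{len(leaked)} test images also appear in the training set")
for digest in leaked:
    print(train[digest], "==", test[digest])
```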

Splicing data sets together also muddies their origin, which can mean that researchers miss important features that skew the training of their models. Many unwittingly used a data set that contained chest scans of children who did not have covid as their examples of what non-covid cases looked like. As a result, the AIs learned to identify kids, not covid.

Driggs’s group trained its own model using a data set that contained a mix of scans taken when patients were lying down and standing up. Because patients scanned while lying down were more likely to be seriously ill, the AI learned wrongly to predict serious covid risk from a person’s position.

In yet other cases, some AIs were found to be picking up on the text font that certain hospitals used to label the scans. As a result, fonts from hospitals with more serious caseloads became predictors of covid risk.

Errors like these seem obvious in hindsight. They can also be fixed by adjusting the models, if researchers are aware of them. It is possible to acknowledge the shortcomings and release a less accurate, but less misleading model. But many tools were developed either by AI researchers who lacked the medical expertise to spot flaws in the data or by medical researchers who lacked the mathematical skills to compensate for those flaws.

A more subtle problem Driggs highlights is incorporation bias, or bias introduced at the point a data set is labeled. For example, many medical scans were labeled according to whether the radiologists who created them said they showed covid. But that embeds, or incorporates, any biases of that particular doctor into the ground truth of a data set. It would be much better to label a medical scan with the result of a PCR test rather than one doctor’s opinion, says Driggs. But there isn’t always time for statistical niceties in busy hospitals.

That hasn’t stopped some of these tools from being rushed into clinical practice. Wynants says it isn’t clear which ones are being used or how. Hospitals will sometimes say that they are using a tool only for research purposes, which makes it hard to assess how much doctors are relying on them. “There’s a lot of secrecy,” she says.

Wynants asked one company that was marketing deep-learning algorithms to share information about its approach but did not hear back. She later found several published models from researchers tied to this company, all of them with a high risk of bias. “We don’t actually know what the company implemented,” she says.

According to Wynants, some hospitals are even signing nondisclosure agreements with medical AI vendors. When she asked doctors what algorithms or software they were using, they sometimes told her they weren’t allowed to say.

How to fix it

What’s the fix? Better data would help, but in times of crisis that’s a big ask. It’s more important to make the most of the data sets we have. The simplest move would be for AI teams to collaborate more with clinicians, says Driggs. Researchers also need to share their models and disclose how they were trained so that others can test them and build on them. “Those are two things we could do today,” he says. “And they would solve maybe 50% of the issues that we identified.”

Getting hold of data would also be easier if formats were standardized, says Bilal Mateen, a doctor who leads the clinical technology team at the Wellcome Trust, a global health research charity based in London. 

Another problem Wynants, Driggs, and Mateen all identify is that most researchers rushed to develop their own models, rather than working together or improving existing ones. The result was that the collective effort of researchers around the world produced hundreds of mediocre tools, rather than a handful of properly trained and tested ones.

“The models are so similar—they almost all use the same techniques with minor tweaks, the same inputs—and they all make the same mistakes,” says Wynants. “If all these people making new models instead tested models that were already available, maybe we’d have something that could really help in the clinic by now.”

In a sense, this is an old problem with research. Academic researchers have few career incentives to share work or validate existing results. There’s no reward for pushing through the last mile that takes tech from “lab bench to bedside,” says Mateen. 

To address this issue, the World Health Organization is considering an emergency data-sharing contract that would kick in during international health crises. It would let researchers move data across borders more easily, says Mateen. Before the G7 summit in the UK in June, leading scientific groups from participating nations also called for “data readiness” in preparation for future health emergencies.

Such initiatives sound a little vague, and calls for change always have a whiff of wishful thinking about them. But Mateen has what he calls a “naïvely optimistic” view. Before the pandemic, momentum for such initiatives had stalled. “It felt like it was too high of a mountain to hike and the view wasn’t worth it,” he says. “Covid has put a lot of this back on the agenda.”

“Until we buy into the idea that we need to sort out the unsexy problems before the sexy ones, we’re doomed to repeat the same mistakes,” says Mateen. “It’s unacceptable if it doesn’t happen. To forget the lessons of this pandemic is disrespectful to those who passed away.”

Categories
Artificial Intelligence Ethical AI

An endlessly changing playground teaches AIs how to multitask

DeepMind has developed a vast candy-colored virtual playground that teaches AIs general skills by endlessly changing the tasks it sets them. Instead of developing just the skills needed to solve a particular task, the AIs learn to experiment and explore, picking up skills they then use to succeed in tasks they’ve never seen before. It is a small step toward general intelligence.

What is it? XLand is a video-game-like 3D world that the AI players sense in color. The playground is managed by a central AI that sets the players billions of different tasks by changing the environment, the game rules, and the number of players. Both the players and the playground manager use reinforcement learning to improve by trial and error.

During training, the players first face simple one-player games, such as finding a purple cube or placing a yellow ball on a red floor. They advance to more complex multiplayer games like hide and seek or capture the flag, where teams compete to be the first to find and grab their opponent’s flag. The playground manager has no specific goal but aims to improve the general capability of the players over time.
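To make that setup concrete, here is a minimal, hypothetical sketch in Python of the manager/player loop the article describes. The class names, task fields, and skill model are illustrative assumptions made for this example, not DeepMind's actual XLand code or API.

```python
import random

# Hypothetical sketch of the playground-manager / player loop described above.
# Class names, task fields, and the skill model are illustrative assumptions,
# not DeepMind's actual XLand code or API.

class PlaygroundManager:
    """Proposes tasks by varying the world, the game rules, and the player count."""

    def propose_task(self, player_skill):
        # Start players on simple single-player games; mix in multiplayer
        # games like hide and seek once they are more capable.
        rules = ["find_purple_cube", "ball_on_red_floor"]
        if player_skill > 0.5:
            rules += ["hide_and_seek", "capture_the_flag"]
        return {
            "world_seed": random.randrange(10**6),
            "rules": random.choice(rules),
            "num_players": 1 if player_skill <= 0.5 else random.choice([2, 4]),
        }

    def update(self, task, player_won):
        # In the real system the manager is itself trained by trial and error
        # to keep generating tasks that stretch the players; omitted here.
        pass


class Player:
    """Stand-in for a reinforcement-learning agent."""

    def __init__(self):
        self.skill = 0.0

    def play(self, task):
        # Placeholder for running a reinforcement-learning episode
        # in the generated world; returns whether the task was solved.
        return random.random() < self.skill + 0.05

    def learn(self, task, won):
        self.skill = min(1.0, self.skill + (0.01 if won else 0.001))


manager, player = PlaygroundManager(), Player()
for _ in range(10_000):
    task = manager.propose_task(player.skill)
    won = player.play(task)
    player.learn(task, won)
    manager.update(task, won)
```

In the system the article describes, the `play` step would be a full reinforcement-learning episode in a procedurally generated 3D world, and the manager would also be trained by trial and error to keep tasks at the edge of the players' abilities.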

Why is this cool? AIs like DeepMind’s AlphaZero have beaten the world’s best human players at chess and Go. But they can only learn one game at a time. As DeepMind cofounder Shane Legg put it when I spoke to him last year, it’s like having to swap out your chess brain for your Go brain each time you want to switch games.

Researchers are now trying to build AIs that can learn multiple tasks at once, which means teaching them general skills that make it easier to adapt.

Having learned to experiment, these bots improvised a ramp. (Image: DeepMind)

One exciting trend in this direction is open-ended learning, where AIs are trained on many different tasks without a specific goal. In many ways, this is how humans and other animals seem to learn, via aimless play. But this requires a vast amount of data. XLand generates that data automatically, in the form of an endless stream of challenges. It is similar to POET, an AI training dojo where two-legged bots learn to navigate obstacles in a 2D landscape. XLand’s world is much more complex and detailed, however. 

XLand is also an example of AI learning to make itself, or what Jeff Clune, who helped develop POET and leads a team working on this topic at OpenAI, calls AI-generating algorithms (AI-GAs). “This work pushes the frontiers of AI-GAs,” says Clune. “It is very exciting to see.”

What did they learn? Some of DeepMind’s XLand AIs played 700,000 different games in 4,000 different worlds, encountering 3.4 million unique tasks in total. Instead of learning the best thing to do in each situation, which is what most existing reinforcement-learning AIs do, the players learned to experiment—moving objects around to see what happened, or using one object as a tool to reach another object or hide behind—until they beat the particular task.

In the videos you can see the AIs chucking objects around until they stumble on something useful: a large tile, for example, becomes a ramp up to a platform. It is hard to know for sure if all such outcomes are intentional or happy accidents, say the researchers. But they happen consistently.

AIs that learned to experiment had an advantage in most tasks, even ones they had not seen before. The researchers found that the XLand AIs adapted to a complex new task after just 30 minutes of additional training, while AIs that had not spent time in XLand could not learn these tasks at all.

Categories
Artificial Intelligence Robotics & RPA

A new generation of AI-powered robots is taking over warehouses

In the months before the first reports of covid-19 emerged, a new kind of robot headed to work. Built on years of breakthroughs in deep learning, it could pick up all kinds of objects with remarkable accuracy, making it a shoo-in for jobs like sorting products into packages at warehouses.

Previous commercial robots had been limited to performing tasks with little variation: they could move pallets along set paths and perhaps deviate slightly to avoid obstacles along the way. The new robots, with their ability to manipulate objects of variable shapes and sizes in unpredictable orientations, could open up a whole different set of tasks for automation.

At the time, the technology was still proving itself. But then the pandemic hit. As e-commerce demand skyrocketed and labor shortages intensified, AI-powered robots went from a nice-to-have to a necessity.

Covariant, one of the many startups working on developing the software to control these robots, says it’s now seeing rapidly rising demand in industries like fashion, beauty, pharmaceuticals, and groceries, as is its closest competitor, Osaro. Customers once engaged in pilot programs are moving to integrate AI-powered robots permanently into their production lines.

Knapp, a warehouse logistics technology company and one of Covariant’s first customers, began piloting the technology in late 2019. It says it now has “a full pipeline of projects” globally, including retrofitting old warehouses and designing entirely new ones optimized to help Covariant’s robot pickers work alongside humans.

For now, somewhere around 2,000 AI-powered robots have been deployed, with a typical warehouse housing one or two, estimates Rian Whitton, who analyzes the industrial robotics market at ABI Research. But the industry has reached a new inflection point, and he predicts that each warehouse will soon house upwards of 10 robots, growing the total to tens of thousands within the next few years. “It’s being scaled up pretty quickly,” he says. “In part, it’s been accelerated by the pandemic.”

A new wave of automation

Over the last decade, the online retailing and shipping industries have steadily automated more and more of their warehouses, with the big players leading the way. In 2012, Amazon acquired Kiva Systems, a Massachusetts-based robotics company that produces autonomous mobile robots, known in the industry as AMRs, to move shelves of goods around. In 2018, FedEx began deploying its own AMRs, designed by a different Massachusetts-based startup called Vecna Robotics. The same year, the British online supermarket Ocado made headlines with its highly automated fulfillment center in Andover, England, featuring a giant grid of robots whizzing along metallic scaffolding.

But there’s a reason these early waves of automation came primarily in the form of AMRs. From a technical perspective, moving objects from point A to B is one of the easiest robotic challenges to solve. The much harder challenge is manipulating objects to take them off shelves and out of bins, or box them and bag them, the way human workers do so nimbly with their hands.

This is what the latest generation of robotics companies like Covariant and Osaro specialize in, a technology that didn’t become commercially viable until late 2019. Right now such robots are most skilled at simple manipulation tasks, like picking up objects and placing them in boxes, but both startups are already working with customers on more complicated sequences of motions, including auto-bagging, which requires robots to work with crinkly, flimsy, or translucent materials. Within a few years, any task that previously required human hands could be partially or fully automated.

Some companies have already begun redesigning their warehouses to better capitalize on these new capabilities. Knapp, for example, is changing its floor layout and the way it routes goods to factor in which type of worker—robot or human—is better at handling different products. For objects that still stump robots, like a net bag of marbles or delicate pottery, a central routing algorithm would send them to a station with human pickers. More common items, like household goods and school supplies, would go to a station with robots.
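As a rough illustration of that routing idea, here is a hypothetical sketch in Python. The item categories, station names, and default rule are assumptions made up for this example; they are not Knapp's or Covariant's actual system.

```python
# Hypothetical routing rule for mixed robot/human picking stations.
# The categories, station names, and default behavior are assumptions
# for illustration, not Knapp's or Covariant's actual system.

ROBOT_FRIENDLY = {"household_goods", "school_supplies", "boxed_item"}
HARD_FOR_ROBOTS = {"net_bag_of_marbles", "delicate_pottery"}

def route_item(item_category: str) -> str:
    """Return the picking station an item should be sent to."""
    if item_category in HARD_FOR_ROBOTS:
        return "human_station"
    if item_category in ROBOT_FRIENDLY:
        return "robot_station"
    # Unfamiliar items default to human pickers until robots prove reliable on them.
    return "human_station"

print(route_item("school_supplies"))    # robot_station
print(route_item("delicate_pottery"))   # human_station
```

A production router would presumably rely on measured pick-success rates rather than a hard-coded list, but the split between robot-friendly items and items that still stump robots is the idea the article describes.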

Derik Pridmore, cofounder and CEO at Osaro, predicts that in industries like fashion, fully automated warehouses could come online within two years, since clothing is relatively easy for robots to handle.

That doesn’t mean all warehouses will soon be automated. There are millions of them around the world, says Michael Chui, a partner at the McKinsey Global Institute who studies the impact of information technologies on the economy. “Retrofitting all of those facilities can’t happen overnight,” he says.

One of the first Covariant-enabled robotic arms that Knapp piloted in a warehouse in Berlin, Germany.

Nonetheless, the latest automation push raises questions about the impact on jobs and workers.

Previous waves of automation have given researchers more data about what to expect. A recent study, the first to analyze the impact of automation at the firm level, found that companies that adopted robots ahead of others in their industry became more competitive and grew more, which led them to hire more workers. “Any job loss comes from companies who did not adopt robots,” says Lynn Wu, a professor at Wharton who coauthored the paper. “They lose their competitiveness and then lay off workers.”

But as workers at Amazon and FedEx have already seen, jobs for humans will be different. Roles like packing boxes and bags will be displaced, while new ones will appear—some directly related to maintaining and supervising the robots, others from the second-order effects of fulfilling more orders, which would require expanded logistics and delivery operations. In other words, middle-skilled labor will disappear in favor of low- and high-skilled work, says Wu: “We’re breaking the career ladder, and hollowing out the middle.”

But rather than attempt to stop the trend of automation, experts say, it’s better to focus on easing the transition by helping workers reskill and creating new opportunities for career growth. “Because of aging, there are a number of countries in the world where the size of the workforce is decreasing already,” says Chui. “Half of our economic growth has come from more people working over the past 50 years, and that’s going to go away. So there’s a real imperative to increase productivity, and these technologies can help.

“We also just need to make sure that the workers can share the benefits.”