‘Slap in the face’: Images of Canadian child abuse victims training AI generators

By Kelly Geraldine Malone

Pictures of Canadian victims are among the thousands of images depicting child sexual abuse that an internet watchdog group found in databases used to train popular artificial intelligence image generators.

“That’s another slap in the face to victims,” said Lloyd Richardson, director of technology at the Canadian Centre for Child Protection.

Richardson said it shows artificial intelligence must be considered as the federal government develops its long-awaited online harms legislation. 

A recent report from the Stanford Internet Observatory found more than 3,200 images of suspected child sexual abuse in the database LAION (the publicly available, non-profit Large-scale Artificial Intelligence Open Network), which was used to train well-known AI image-makers.

The observatory, based at Stanford University, worked with the Canadian Centre for Child Protection to verify the findings through the centre’s Project Arachnid tool, which has a log of known images of child sexual abuse.

Richardson did not say how many of the thousands of images depicted Canadian victims. 

AI image generators exploded in popularity after they were launched, but they came with significant concern and controversy, including from artists whose work was used without consent. 

The generators are trained on billions of images from the internet, using the text descriptions attached. Most popular generators use LAION.

David Thiel, chief technologist at the Stanford Internet Observatory, said LAION doesn’t just contain images of art, architecture and corgis to teach the image generators.

“They were trained on a ton of explicit materials, they were also trained on a ton of pictures of children and people of other ages as well, but they were also trained on (child sexual abuse material) itself,” he said. 

Thiel, who authored the new report, said the child sexual abuse material is only a small part of the 5.8 billion images in LAION’s network, but it is still affecting what AI image generators create. A generator can sexualize an image of a child or even make some images somewhat resemble a known victim, he said.

Thiel said the negative impacts are immeasurable and are already playing out around the world. He’s been contacted by people in multiple countries concerned about AI-generated nude photos of teenagers and children. 

A Winnipeg school notified parents earlier this month that AI-generated photos of underage female students had been shared. At least 17 photos taken from students’ social media were explicitly altered using artificial intelligence. School officials said they had contacted police.

“It is a significantly related issue,” Thiel said.

LAION said in a statement to The Associated Press this week that it had temporarily taken down the data sets to ensure they are safe before republishing. 

Thiel said the damage is done and it’s too late to turn back. People have already downloaded the data sets containing child sexual abuse material and will continue to use them.

“You have to actually get it right the first time. There’s not just like, ‘Get it to market and fix it later,’” he said. 

“We are going to be dealing with this for years because people rushed it to market without taking the safety measures.”

Data sets should have been screened for child sexual abuse material before they were made publicly available, Thiel said, adding AI-generated images have also been found to be racist and misogynist. 

Richardson said it is not only industry’s responsibility to deal with the issues emerging from AI. 

Prime Minister Justin Trudeau promised during the 2019 federal election campaign to introduce online harms legislation. It is intended to deal with hate speech, terrorist content and sexual abuse material.

But last year the government sent its initial plans back to the drawing board after facing criticism, including about the bill’s impacts on free speech.

The group of experts reworking the bill recently published an open letter saying Canadian children are less protected than kids in countries where similar laws are already in effect.

Richardson said there are certainly steps Ottawa could take.

“There’s this notion that the internet is this removed thing from the reality of sovereign law, which is just complete nonsense,” he said. 

“I think there are absolutely things we could be doing in this space.”

Banner image: David Thiel, chief technologist at the Stanford Internet Observatory and author of its report that discovered images of child sexual abuse in the data used to train artificial intelligence image generators, poses for a photo on Wednesday, Dec. 20, 2023, in Obidos, Portugal. THE CANADIAN PRESS/Camilla Mendes dos Santos via AP

This report by The Canadian Press was first published Dec. 22, 2023.