Child sex abuse images found in dataset training image generators, report says

Jigsy · December 26, 2023, 11:40pm

Oh dear…

More than 1,000 known instances of child sexual abuse material (CSAM) were found in a large open dataset, known as LAION-5B, that was used to train popular text-to-image generators such as Stable Diffusion, Stanford Internet Observatory (SIO) researcher David Thiel revealed on Wednesday.

SIO’s report seems to confirm rumors swirling on the Internet since 2022 that LAION-5B included illegal images, Bloomberg reported. In an email to Ars, Thiel warned that “the inclusion of child abuse material in AI model training data teaches tools to associate children in illicit sexual activity and uses known child abuse images to generate new, potentially realistic child abuse content.”

Thiel began his research in September after discovering in June that AI image generators were being used to create thousands of fake but realistic AI child sex images rapidly spreading on the dark web. His goal was to find out what role CSAM may play in the training process of AI models powering the image generators spouting this illicit content.

“Our new investigation reveals that these models are trained directly on CSAM present in a public dataset of billions of images, known as LAION-5B,” Thiel’s report said. “The dataset included known CSAM scraped from a wide array of sources, including mainstream social media websites”—like Reddit, X, WordPress, and Blogspot—as well as “popular adult video sites”—like XHamster and XVideos.

Shortly after Thiel’s report was published, a spokesperson for LAION, the Germany-based nonprofit that produced the dataset, told Bloomberg that LAION “was temporarily removing LAION datasets from the Internet” due to LAION’s “zero tolerance policy” for illegal content. The datasets will be republished once LAION ensures “they are safe,” the spokesperson said. A spokesperson for Hugging Face, which hosts a link to a LAION dataset that’s currently unavailable, confirmed to Ars that the dataset is now unavailable to the public after being switched to private by the uploader.

Removing the datasets now doesn’t fix any lingering issues with previously downloaded datasets or previously trained models, though, like Stable Diffusion 1.5. Thiel’s report said that Stability AI’s subsequent versions of Stable Diffusion—2.0 and 2.1—filtered out some or most of the content deemed “unsafe,” “making it difficult to generate explicit content.” But because users were dissatisfied by these later, more filtered versions, Stable Diffusion 1.5 remains “the most popular model for generating explicit imagery,” Thiel’s report said.

A spokesperson for Stability AI told Ars that Stability AI is “committed to preventing the misuse of AI and prohibit the use of our image models and services for unlawful activity, including attempts to edit or create CSAM.” The spokesperson pointed out that SIO’s report “focuses on the LAION-5B dataset as a whole,” whereas “Stability AI models were trained on a filtered subset of that dataset” and were “subsequently fine-tuned” to “mitigate residual behaviors.” The implication seems to be that Stability AI’s filtered dataset is not as problematic as the larger dataset.

Stability AI’s spokesperson also noted that Stable Diffusion 1.5 “was released by Runway ML, not Stability AI.” There seems to be some confusion on that point, though, as a Runway ML spokesperson told Ars that Stable Diffusion “was released in collaboration with Stability AI.”

A demo of Stable Diffusion 1.5 noted that the model was “supported by Stability AI” but released by CompVis and Runway. While a YCombinator thread linking to a blog—titled “Why we chose not to release Stable Diffusion 1.5 as quickly”—from Stability AI’s former chief information officer, Daniel Jeffries, may have provided some clarity on this, it has since been deleted.

Runway ML’s spokesperson declined to comment on any updates being considered for Stable Diffusion 1.5 but linked Ars to a Stability AI blog from August 2022 that said, “Stability AI co-released Stable Diffusion alongside talented researchers from” Runway ML.

Stability AI’s spokesperson said that Stability AI does not host Stable Diffusion 1.5 but has taken other steps to reduce harmful outputs. Those include only hosting “versions of Stable Diffusion that include filters” that “remove unsafe content” and “prevent the model from generating unsafe content.”

“Additionally, we have implemented filters to intercept unsafe prompts or unsafe outputs when users interact with models on our platform,” Stability AI’s spokesperson said. “We have also invested in content labelling features to help identify images generated on our platform. These layers of mitigation make it harder for bad actors to misuse AI.”

Beyond verifying 1,008 instances of CSAM in the LAION-5B dataset, SIO found 3,226 instances of suspected CSAM in the LAION dataset. Thiel’s report warned that both figures are “inherently a significant undercount” due to researchers’ limited ability to detect and flag all the CSAM in the datasets. His report also predicted that “the repercussions of Stable Diffusion 1.5’s training process will be with us for some time to come.”

“The most obvious solution is for the bulk of those in possession of LAION‐5B‐derived training sets to delete them or work with intermediaries to clean the material,” SIO’s report said. “Models based on Stable Diffusion 1.5 that have not had safety measures applied to them should be deprecated and distribution ceased where feasible.”

At first, Thiel suspected that the image generators were combining two concepts, such as “explicit act” and “child,” to create the inappropriate images. However, because he knew that the dataset was “being fed by essentially unguided crawling” of the web, including “a significant amount of explicit material,” he didn’t rule out the possibility that image generators could also be directly referencing CSAM included in the LAION-5B dataset.

To verify instances of CSAM in the dataset, Thiel relied on Microsoft’s CSAM-hashing database, PhotoDNA, as well as databases managed by child safety organizations Thorn, the National Center for Missing and Exploited Children (NCMEC), and the Canadian Centre for Child Protection (C3P).

These groups confirmed that not only was CSAM present but some illegal images were duplicated in the dataset, increasing the chances that image generator outputs may depict and further traumatize a known victim of child sexual abuse.

Figuring out how much CSAM may have influenced AI training models was a challenge, because rather than storing images, the dataset references image data stored at URLs. Keyword-based analysis was limited because images sometimes use generic labels to avoid detection. There is also no comprehensive list of search terms used to find CSAM, and even if there were, known search terms may be omitted by poor language translation. For those reasons, Thiel concluded that “text descriptions are of limited utility for identifying CSAM.”

Ditching the keyword-based analysis, Thiel relied on a “larger‐scale analysis” that had other limitations, like dead links that had been used in training but potentially no longer hosted CSAM. Another major limitation was the comprehensiveness of the PhotoDNA database, which did not provide matches for “significant amounts of illustrated cartoons depicting CSAM” that “appear to be present in the dataset.”

Although the LAION datasets have since been removed online, there’s no telling how many researchers have already downloaded them—or when those downloads occurred. According to SIO’s report, any model trained on “a LAION‐5B dataset populated even in late 2023 implies the possession of thousands of illegal images.”

However, just because the images are present in the model, that “does not necessarily indicate that the presence of CSAM drastically influences the output of the model above and beyond the model’s ability to combine the concepts of sexual activity and children,” SIO said. That means that the presence of CSAM in training data may not be as significant as training models on images of children and on explicit content that can be combined through text prompts to generate illegal outputs.

Either way, though, the presence of CSAM “likely does still exert influence,” SIO’s report concluded, especially if there are “repeated identical instances of CSAM” that may increase the odds that an image generator’s output will reference and resemble “specific victims.”

SIO’s report recommended several steps that makers of image generators could take to “mitigate the problems posed by the distribution of the content and its inclusion in model-training data, as well as ways to prevent such incidents in the future.” Among solutions, known CSAM could be flagged and removed from hosting URLs. The metadata and any actual images from LAION reference datasets could also be removed from the LAION-5B dataset.

It’s currently not known how many researchers have downloaded the LAION datasets, but some problematic source material has already been removed online after being reported to NCMEC and C3P, SIO confirmed.

SIO pointed out that researchers could have prevented abuses of image generators by consulting with child safety experts before releasing models and checking images referenced in LAION datasets “against known lists of CSAM.”

“Age estimation models could also potentially assist with the detection of potentially illegal materials,” SIO wrote.

These steps could help reduce the influence of CSAM on image generators’ output, but “the most difficult task” would be removing CSAM from the models themselves, SIO’s report said.

“For images that match known CSAM, the image and text embeddings could be removed from the model, but it is unknown whether this would meaningfully affect the ability of the model to produce CSAM or to replicate the appearance of specific victims,” SIO’s report said.

Makers of AI models could attempt to erase concepts from the model, such as removing “children” or “nudity” entirely, SIO said, but would struggle to remove “CSAM” as a concept without “access to illegal material.”

A more extreme solution, SIO’s report proposed, would be to simply stop feeding material depicting children into models trained on erotic content, which would limit “the ability of models to conflate the two types of material.” Perhaps the most effective step would be to exclude images of children from all generalized training sets, SIO recommended.

The urgency of resolving the problem may require extreme solutions, though.

“We are now faced with the task of trying to mitigate the material present in extant training sets and tainted models; a problem gaining urgency with the surge in the use of these models to produce not just generated CSAM, but also CSAM… of real children, often for commercial purposes,” SIO’s report said.

I think I might just stop using the terms CSEM/CSAM as the article (or quoted text) pushed drawings under that umbrella…

Chie · December 26, 2023, 11:45pm

I looked at the report and confirmed that drawings/cartoons were not actually being detected in relation to image embeddings derived from confirmed CSAM content. Text embeddings seemed to be picked up, but only through keyword associations.

elliot · December 27, 2023, 6:06am

The fact that some of the people they interviewed don’t see a problem with it other than “it might generate taboo content” is disturbing.

Chie · December 27, 2023, 7:03am

Clean rooming was already being done by various groups looking to use AI to create porn, whereby all images of real children were completely scrubbed and associations were replaced with 3DCGI images and other types of virtual pornography (including petite JAV content).

island · December 28, 2023, 1:48pm

Funny how the interest of adult males in young girls has never stopped after thousands of years. It will never stop. It’s the nature of the human animal.