Hacker News

They apparently tried to combat NSFW generation by filtering NSFW images out of the training dataset.


They know they are going to be the next target in the war on general purpose computing. They're trying to stave it off for as long as possible by signalling to the authorities that they are the good guys.

A confrontation is inevitable, though. Right now it costs moderate sums of money to do this level of training. Not always will this be so. If I were an AI-centric organization, I would be racing to position myself as a trustworthy actor in my particular corner of the AI space so that when legislators start asking questions about the explosion of bad actors, I can engage in a little bit of regulatory capture, and have the legislators legislate whatever regulations I've already implemented, to the disadvantage of my competitors.

For people who say "people can make whatever images they like in photoshop," I will remind you of this: https://i.imgur.com/5DJrd.jpg


Appeasement never works. Those who wage that war should be directly confronted.

And they will lose it, just like they've lost the war on encryption.


This is business, not ethics, though. They just don't want the negative attention, that's it. And because this is a business, time matters, same as with DRM. Almost all DRM gets defeated, yet it works, because it hinders the crackers, even if only for a while. Same here. While Stability is not under attack yet, they can establish themselves as the household name for AI in a safer context.


I doubt they care if people make porn with diffusion models. They just don’t want to be the ones providing the model to do it.


Banknote printing is primarily protected against on the hardware level of printers, no? With the nigh-invisible unique watermark left by every printer, there’s virtually no way you’d get away with it. My guess is that the Photoshop filter exists mostly as a barrier against the crime of convenience.


My point is that there is precedent for governments requiring companies to implement restrictions on what images can be handled by their software.

As I explained: This kind of mandated restriction is looming over AI. Companies are trying to get out in front of these restrictions so they can implement them on their own terms.


>My point is that there is precedent for governments requiring companies to implement restrictions on what images can be handled by their software

But images of boobs are still legal, so this NSFW filter goes well beyond what the law asks. Is the issue that even if you don't train on CP, the model might output something that some random person gets offended by and labels as CP? I assume other companies can focus on NSFW and have their lawyers figure this out. IMO it would be cool if someone sued the governments and made them reveal the facts behind their concern that CP of fake or cartoon people is dangerous; I think they could focus on saving real children rather than cartoon ones.


It’s possible that the end game is hardware in GPUs, to detect whatever they want to prevent, before it’s displayed.


You kill way too many birds with such a stone. Of course you could never do any kind of photorealistic game in real time if you had to pre-screen everything with an actually effective censor.

Indeed, what they're doing is already hobbling the models.

Emad is right that we learn new things from the creativity unleashed by accessible models that can be run (and even fine-tuned) on consumer hardware.

But judging from what people post, one thing we learn is that models fine-tuned on porn (such as the notorious f222 and its derivative Hassan's blend) can be quite a bit better at non-porn generation of diverse, photorealistic faces and hands too.


> Of course you could never do any kind of photorealistic game in real time if you had to pre-screen everything with an actually effective censor.

I'm not sure I understand this. A possible implementation could be a neural net that blanked the screen with a frown face any time it detected something it thought was "bad". What purpose/need would pre-screening serve?


What you describe IS pre-screening. And it's not workable, because it would take a ton of dedicated resources to make it work in real time, and even then it would be disastrous for latency: unworkable for most games and even most desktop applications.


> What you describe IS pre-screening.

I think this is making the assumption that all frames are blocked.

> then it would be disastrous for latency

We're talking about the future here. I'm not sure it makes sense to use current tech to say it's not going to happen, or come up with latency numbers. But, "real time" inference is definitely a possibility, and is in active use for video moderation (Youtube, etc) and object detection (Tesla, etc). Nobody will notice a system running at 2000fps.
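For a rough sense of scale, an inline screening model adds its per-frame inference time to every frame's latency. A back-of-the-envelope sketch (the numbers are illustrative, not measurements):

```python
# Back-of-the-envelope frame-time budget for an inline screening model.
# Numbers are illustrative, not measurements.

def frame_budget_ms(fps):
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps

game_fps = 60      # target refresh rate of the game itself
censor_fps = 2000  # hypothetical throughput of the screening model

censor_cost_ms = frame_budget_ms(censor_fps)  # 0.5 ms per frame
budget_ms = frame_budget_ms(game_fps)         # ~16.7 ms per frame

# Fraction of each frame consumed by screening:
print(f"{censor_cost_ms:.2f} ms of a {budget_ms:.2f} ms budget "
      f"({100 * censor_cost_ms / budget_ms:.0f}%)")  # → 0.50 ms of a 16.67 ms budget (3%)
```

At a hypothetical 2000 fps of screening throughput, the cost is a few percent of a 60 fps frame budget; whether that is "disastrous" or "unnoticed" depends entirely on how fast such a model can actually run.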


You can typically work around that by modding the printer firmware, if needed. It's not baked into the hardware.


Some AI startups backed by the biggest players are already hitting legal/regulatory issues.


A very niche example, and one that can be easily circumvented.

Also seems problematic to approach this from a purely capitalistic and consumerist angle. There is a lot of opportunity here besides just launching the next AI unicorn.


I am not clicking that link because no one should take the risk of you proving your point of what horrors could pop out of one of these models.

I will say that while the government backlash is inevitable just like it was with encryption, these image generation models are so easy to train on consumer hardware that the cat is hopelessly out of the bag. It might as well be thoughtcrime.


Link doesn't show any model output - it's a screenshot of Photoshop refusing to edit a banknote.


Or it's an output of "blank adobe photoshop with dialog refusing to edit bank note, full screen, windows vista, 4k, artstation, greg rutkowski, dramatic lighting".


I agree, that was the riskiest click I've made in a while.


In practice, it's unclear how well avoiding training on NSFW images will work: the original LAION-400M dataset used for both SD versions did filter out some of the NSFW stuff, and it appears SD 2.0 filters out a bit more. The use of OpenCLIP in SD 2.0 may also prevent some leakage of NSFW textual concepts compared to OpenAI's CLIP.

It will, however, definitely not affect the more-common use case of anime women with very large breasts. And people will be able to finetune SD 2.0 on NSFW images anyways.
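For concreteness, this kind of dataset-level filtering is usually done with a predicted NSFW probability stored in the dataset metadata (LAION publishes a score along these lines). A minimal sketch, with illustrative field names and threshold:

```python
# Sketch of dataset-level NSFW filtering, as described above.
# Assumes each record carries a predicted NSFW probability
# (LAION-style "punsafe" score); field names and threshold are illustrative.

def filter_dataset(records, punsafe_threshold=0.1):
    """Keep only records whose predicted NSFW probability is below the threshold."""
    return [r for r in records if r.get("punsafe", 1.0) < punsafe_threshold]

records = [
    {"url": "a.jpg", "punsafe": 0.02},
    {"url": "b.jpg", "punsafe": 0.97},
    {"url": "c.jpg"},  # no score: treated as unsafe and dropped
]

kept = filter_dataset(records)
print([r["url"] for r in kept])  # → ['a.jpg']
```

The weakness is obvious from the sketch: everything depends on how accurate the upstream NSFW classifier is and where the threshold is set, which is why concepts still leak through.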


The main reason why Stability is worried about NSFW is that people will use it to generate disgusting amounts of CSAM. If LAION-5B or OpenAI's CLIP have ever seen CSAM - and given how these datasets are literally just scraped off the Internet, they have - then they're technically distributing it. Imagine the "AI is just copying bits of other people's art" argument, except instead of statutory damages of up to $150,000 per infringement, we're talking about time in pound-me-in-the-ass prison.

At least if people have to finetune the model on that shit, then you can argue that it's not your fault because someone had to do extra steps to put stuff in there.


> If LAION-5B or OpenAI's CLIP have ever seen CSAM

Diffusion models don't need any CSAM in the training dataset to generate CSAM. All they need is random NSFW content alongside safe content that includes children.


So I definitely see an issue with Stable Diffusion synthesizing CP in response to innocuous queries (in terms of optics; the actual harm this would cause is unclear).

That said, part of the problem with the general ignorance about machine learning and how it works is that there will be totally unreasonable demands for technical solutions to social problems. “Just make it impossible to generate CP” I’m sure will succeed just as effectively as “just make it impossible to Google for CP.”


It sometimes generates such content accidentally, yes. Seems to happen more often whenever beaches are involved in the prompt. I just delete them along with thousands of other images that aren't what I wanted. Does that cause anyone harm? I don't think so...


> I’m sure will succeed just as effectively as “just make it impossible to Google for CP.”

So... very, very well? I obviously don't have numbers, but I imagine CSAM would be a lot more popular if Google did nothing to try to hide it in search results.


Is artificially generated CSAM that doesn't actually involve children in its production not an improvement over the status quo?


I remember Louis CK made a joke about this, regarding pedophiles (who are also rapists): what are we doing to prevent this? Is anyone making very realistic sex dolls that look like children? "Ew, no, that's creepy." Well, I guess you'd rather they fuck your children instead. It's one of those issues you have to be careful not to get too close to, because you get accused by proximity; if you suggest something like what I said before, people might think you're a pedophile. So, in that way, nobody wants to do anything about it.


No, it's not.

The underlying idea you have is that the artificial CSAM is a viable substitute good - i.e. that pedophiles will use that instead of actually offending and hurting children. This isn't borne out by the scientific evidence; instead of dissuading pedophiles from offending it just trains them to offend more.

This is the opposite of what we thought we learned from the debate about violent video games, where we said stuff like "video games don't turn people violent because people can tell fiction from reality". This was the wrong lesson. People confuse the two all the time; it's actually a huge problem in criminal justice. CSI taught juries to expect infallible forensic sci-fi tech, Perry Mason taught juries to expect dramatic confessions, etc. In fact, they literally call it the Perry Mason effect.

The reason why video games don't turn people violent is because video game violence maps poorly onto the real thing. When I break someone's spine in Mortal Kombat, I input a button combination and get a dramatic, slow-motion X-ray view of every god damned bone in my opponent's back breaking. When I shoot someone in Call of Duty, I pull my controller's trigger and get a satisfyingly bassy gun sound and a well-choreographed death animation out of my opponent. In real life, you can't do any of that by just pressing a few buttons, and violence isn't nearly that sexy.

You know what is that sexy in real life? Sex. Specifically, the whole point of porn is to, well, simulate sex. You absolutely do feel the same feelings consuming porn as you do actually engaging in sex. This is why therapists who work with actual pedophiles tell them to avoid fantasizing about offending, rather than to find CSAM as a substitute.


>The reason why video games don't turn people violent is because video game violence maps poorly onto the real thing

I don't believe this is the reason. Practicing martial arts, which maps well to real-life violence, doesn't produce an increase in violent behaviour in me. Similarly, playing FPS games in VR, which maps much more closely than flat-screen games, doesn't make me want to go shoot people in real life. I don't think people playing paintball or airsoft will turn violent from partaking in those activities. The majority of people are just normal people, not bad people who would ever shoot or rape someone.

>You know what is that sexy in real life? Sex.

Why is any porn legal then? If porn turned everyone into a sexual abuser I would believe your argument, but that just isn't true. And if it were true that a small percentage of people who see porn turn into sexual abusers, I don't think that makes it worth banning porn altogether. I feel there should be a better way, one that doesn't restrict people's freedom of speech.


> You absolutely do feel the same feelings consuming porn as you do actually engaging in sex

I can't believe someone says this. It's so not true in my experience. These feelings have a lot in common, but they are definitely not the same.


"Artificially-generated CSAM" is a misnomer, since it involves no actual sexual abuse. It's "simulated child pornography", a category that would include for example paintings.


Very much this. If someone goes out and trains a model on actual photographs of abuse, then holy shit, call in the cops.

If someone is generating sketchy cartoons from a training set of sketchy cartoons... well, gross, but there's no victims there.


Not exactly, since the abuse needed to actually happen for the derivative images to be possible to generate.


Is Stable Diffusion only able to generate images of things that have actually happened?


Hmm, that’s a good point. It seems to be able to “transfer knowledge” for lack of a better term, so maybe it wouldn’t need to be in the dataset at all…


I have no answer to this but I have seen people mention that artificial CSAM is illegal in the USA, so the question of whether it is better or not is somewhat overshadowed by the very large market where it is illegal.


Reminds me of flooding a market with fake rhino horn. Idk whether it worked though.


I believe the status quo is non-realistic drawings (think Lisa Simpson) can be illegal.

I don't think the fact that it's artificially generated has any bearing for some important purposes.


lol there's a piping hot take


>then they're technically distributing it.

The model does not contain the images themselves though. I think it would not be classified as that.


They reportedly did so to stop people from generating CSAM [0].

[0] https://old.reddit.com/r/StableDiffusion/comments/y9ga5s/sta...


They’ve ensured the only way to create CSAM is through old-fashioned child exploitation, meanwhile all perfectly humane art and photography is at risk of AI replacement.

This is a huge missed opportunity to actually help society.


I don't think, and Stability's CEO also doesn't seem to think, that society would receive it as a benefit. Therefore, it's undesirable, right now.


CSAM is a canary for general AI safety. If we can’t prevent SD from creating CP, will we be able to stop robots from killing people?


LMFAO

What do you propose? The FBI releases a CSAM data set for devs to use for “training”?

Would you be the one to create the model? Would you run a business that sells synthetic CSAM?


Stable diffusion is able to draw images of bears wearing spacesuits and penguins playing golf. I don't think it actually needs that kind of input to generate it. It's clearly able to generalize outside of the training set. So... Seems it should be possible to generate that kind of data without people being harmed.

That being said, this is a question for sociologists/psychologists IMO. Would giving people with these kinds of tendencies that kind of material make them more or less likely to cause harm? Is there a way to answer that question without harming anybody?

In the mean time, stay away from 4chan.


Without the changes they made to Stable Diffusion, it was already able to generate CP. That's why they restricted it from doing so. It did not have child pornography in the training set, but it did have plenty of normal adult nudity, adult pornography, and plenty of fully clothed children, and was able to extrapolate.

Anyway, one obvious application: FBI could run a darknet honeypot site selling AI-generated child porn. Eliminate the actual problem without endangering children.


> FBI could run a darknet honeypot site selling AI-generated child porn. Eliminate the actual problem without endangering children.

It's very unlikely AI-generated child porn would even be illegal. Drawn or photoshopped photos aren't, so I don't think AI-generated ones would be.


This isn't the case in law in many countries. Whether an image is illegal or not does not solely depend on the means of production; if the images are realistic, then they are often illegal.

https://en.m.wikipedia.org/wiki/Legal_status_of_fictional_po...

Don't forget that pornographic images and videos featuring children may be used for grooming purposes, socializing children into the idea of sexual abuse. There's a legitimate social purpose in limiting their production.


Well, Microsoft and others have this model for recognizing CSAM, trained on those CSAM images.


Apple and Meta have as well.

Apparently Facebook has a huge problem with distribution through messenger.


Once I read an article about a guy who got arrested because he’d put child porn on his Dropbox. I had assumed he’d been caught by some more sophisticated means and that was just the public story. I’m amazed that anyone would be stupid enough to distribute CSAM through an account linked to their own name.


I imagine the problem with messenger is teenagers sexting each other.


You will find very few teenagers in messenger, most use snapchat instead.


Yes to the first and no to the second seem the obvious answers here.


[flagged]


So your hypothesis is that if the FBI gives the database to a company it will inevitably leak to the pedophile underworld?

I can't judge how likely that is.

I guess I also don't care much, as I only really care about stopping production using real children; simulated CSAM gets a shrug, and even use of old CSAM only gets a frown.


What company? How is it that people are advocating for the release of this database yet nobody says to whom?

My (lol now flagged) opinion is that it’s kind of weird to advocate for the CSAM archive to move into [literally any private company?] to turn it into some sort of public good based on… frowns?


I regularly skimmed 4Chan’s /b/ to get a frame of reference for fringe internet culture. But I’ve had to stop because the CSAM they generate by the hundreds per hour is just freakishly and horrifyingly high fidelity.

There’s a lot of important social questions to ask about the future of pornography, but I’m sure not going to be the one to touch that with a thousand foot pole.


I've spent too many hours there myself, but I haven't seen any AI CSAM, and it's been many years since I witnessed trolls posting the real thing. Moderation (or maybe automated systems) got a lot better at catching that.

Now, if you meant gross cartoons, yes, those get posted daily. But there are no children being abused by the creation or sharing of those images, and conflating the two types of image is dishonest.


This comment is so far off it might as well be an outright lie. There hasn't been CSAM on /b/ for years. The 4chan you speak of hasn't existed in a decade.


There's more to 4chan than /b/. /diy/, /o/, /k/, etc.


What is the point of making it "as hard as possible" for people?

This is not a game release. It doesn't matter if it's cracked tomorrow or in a year. With open source, no less, it's going to happen sooner rather than later.

As disgusting as it is, somebody is going to feed CP to an AI model, and that's just the reality of it. It's going to happen one way or another, and it's not any of these AI companies' fault.


Plausible deniability for governments. It's like DRM for Netflix-style streaming platforms. If they didn't add DRM and their content owners' content got pirated, it could be argued in court that Netflix didn't do everything in its power to stop the piracy. So too here for Stability AI; they've said this is their reasoning before.


Do pixels have human rights now?


They don't. The training dataset, though, may have been obtained through human rights violations. The problem is when the novelty starts to wear out. Then they will start to look for fresh training data, which may again incur more human rights violations. If you can ensure that no new training data is obtained that way, then I guess it's okay? (Personally, I don't condone it.)


> The problem is when the novelty starts to wear out.

Isn't the main feature of Stable Diffusion that it doesn't?


Once again, this does pose an interesting problem. The AI people claim there are no copyright issues with the generated images because AI is different and the training data is not simply recreated. This would also imply that a model a paedophile trained on illegal material would itself not be illegal, as the illegal data is not represented within the model.

I very much doubt the police will look at AI this way when such models do eventually hit the web (assuming they haven't already) but at some point someone will get caught through this stuff and the arrest itself may have damning consequences throughout the AI space.


No, but people and enterprises have reputation.


Now that's a can of worms I don't think anyone wants to open.


Some do, that's the problem.


Artist have been drawing people of all ages having sex for literally thousands of years. Why should I care about that?


That's the excuse they all use.


Nixon: (muttering) Jesus Christ

I swear every time I find myself thinking “Hey, stop being so cynical and jaded all the time”, I stumble across something like this.


Bummer. AI porn is fun.


The future is probably models trained almost exclusively on porn.


They’re already out there, although they’re hard to find via Google - people are doing wild things like “merging” hentai models with models trained on real life porn to get realistic poses and lighting with impossible anatomy.

The scary thing is that you can then train it further with things like DreamBooth to start producing porn of celebrities… or, even more worrying, people you know.

Seriously folks, we are within a year or less of this being trivial. It’s already possible with a lot of work today.


I have no idea how it works but I have seen people talking about models trained to draw furry art. And I assume no one spent the millions on AWS to train a full model from scratch.


I believe what they do is take the released version of Stable Diffusion and then continue training from there with their own image sets. I came across their attempts when looking into how to train the model on some images of my own; their data set so far reaches between tens of thousands and hundreds of thousands of images.

All the difficult parts (poses, backgrounds, art styles) have already been done by the SD researchers; the porn network only needs reference material for the NSFW descriptions/tags/details. This is significantly cheaper.

A similar project, training SD to output images in the style of Arcane, is incredibly successful in replicating the animation style with what seems to be very little actual training data.

I don't think you need to start from scratch at all if you use the SD model as a base, all you need to do is to train it on specific concepts, styles and key words that the original doesn't have.
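The mechanics of "continue training from a released checkpoint" can be shown in miniature. The sketch below uses a toy one-parameter model in place of SD, purely to illustrate why starting fine-tuning from pretrained weights is so much cheaper than training from scratch:

```python
# Toy illustration of fine-tuning: start from "pretrained" weights and
# continue gradient descent on a small new dataset. A stand-in for the real
# process, which would start from the released SD checkpoint and new images.

def train(w, data, lr=0.05, steps=200):
    """Fit y = w*x by gradient descent on mean squared error."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# "Pretraining": the base model learns y = 2x from a larger dataset.
base_w = train(0.0, [(x, 2 * x) for x in range(1, 6)])

# "Fine-tuning": continue from base_w on a small new dataset where y = 3x.
# Far fewer examples and steps are needed than training from zero.
tuned_w = train(base_w, [(1, 3), (2, 6)], steps=50)

print(round(base_w, 2), round(tuned_w, 2))  # → 2.0 3.0
```

The real thing adds enormous scale and machinery (U-Net, text encoder, noise schedules), but the economics are the same: the expensive general capabilities come with the checkpoint, and only the new concept needs data.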


Porn has driven many tech advances. I predict that models trained on specific porn genres will appear as soon as training a good model is doable for under $5000. They’ll get here much quicker if we get video to that mark first.


You could probably already get people to pay for a subscription to generate images. Wouldn't be surprised if someone is already working on it.


The entire NovelAI drama had already demonstrated this.


What tech advances would those be?


print magazine, cinema, VHS, Internet


How much did porn actually contribute to innovation in any of those though?


Let me ask you this in reply:

Have you ever seen a non-porn DVD that had multiple camera angles (a feature defined in the standard DVD spec)?


And all it took was the ritual degradation and abuse of women.


> Porn has driven many tech advances.

This is an urban myth.


Yes and no. The VHS vs. Beta situation was exaggerated, but you'd be surprised how many Netflix and YouTube UI tricks were borrowed from innovations made on adult sites.

I'd even say the consumer push for high bandwidth was strongly related to that. Even with HTML5 video players, adult websites were faster to implement them than the big streaming sites, which were still using Flash or similar tech.


Which happens to be true.


Even if porn is what you want, it's not clear a porn-only model is what you want. It can probably generate better porn if it has a little context about what, say, a bedroom is.

What's more interesting, is that there's evidence (from public posts, I haven't tried these models myself) that models trained on some porn get better at non-porn images too.


No. The whole point of these models is that they combine information across domains to be able to create new images. If you trained something just on, say baseball, you could only generate the normal things that happen in baseball. If you wanted to generate a picture of a bear surfing around the bases after hitting a home run, you'd need a model that also had bears and surfing in the training data, and enough other stuff to understand the relationships involved in positioning everything and changing poses.


Did they exclude celebrities, politicians, and religious and political symbols?

Deceitful extremists and vengeful criminals fabricating lies seem to be a far more serious problem than fantasy porno.


That's a really interesting point, and it makes me realize that the Nancy Reagan 'what constitutes porn' question is obviously super old and problematic.

Also lexica.art is swarming with celebrity fantasy porn that just has a thin stylistic filter of paintings from the 19th century. And a plethora of furry daddies that you can't not love.

I get why these models should be curated but I also like that the sketchy porn possibilities keep them feeling un-padded / interesting / dangerous.

Then again this all is probably really dangerous so maybe that's silly.


> Nancy Reagan 'what constitutes porn' question

I thought that was Justice Stewart? And then he answered it "I know it when I see it."


(Edit: it may have removed that wording now: https://github.com/Stability-AI/stablediffusion/commit/ca86d... )

They can force model upgrades too:

> The New AI Model Licenses Have a Legal Loophole (OpenRAIL-M of Stable Diffusion)

https://www.youtube.com/watch?v=W5M-dvzpzSQ


Someone who seems to be Emad says below that the license was changed (the post got flagged for some reason):

https://news.ycombinator.com/item?id=33727177

https://github.com/Stability-AI/stablediffusion/commit/ca86d...


I don't understand why so many people call Stable Diffusion open source.


Why do you think it is not open source? The model weights, model architecture, and dataset are all available.


Read the license of the model: https://github.com/Stability-AI/stablediffusion/blob/main/LI...

Sections 5 and 7 make it not open source.


I don't see how it contradicts the open source definition at https://opensource.org/osd, could you point it out for me?


For the most blatant violation, look at point 6 of the OSD and attachment A of the license.


Seems like it is open source, just not free software.


It's not free software or open source. Check the Open Source Definition: https://opensource.org/osd


Open source is more than just everything being available. It also depends on the license, and the one Stable Diffusion uses doesn't qualify, for multiple reasons, including the one mentioned upthread.


You can download the model weights and run them offline. At least, you could in v1.4. I assume this is still possible on v2.0?


Right, but the model weights are arguably not the "source code", and the license gives the users fewer rights than open source licenses do.

https://en.wikipedia.org/wiki/The_Open_Source_Definition


I think the controls in this space are such a shit show right now that being "open model" is practically equivalent to a WTFPL.

If you're trying to build an app based on SD, then not being open source matters. But seems like the majority of use cases are just "I want to run the model locally". And at that point HF can't stop me from just ripping the Wi-Fi card out of my computer.


Actually, the license is not the same as the prior one; it was changed to remove this.


The easiest way to combat this is to put your model behind an API and filter queries (midjourney, OpenAI) or just not make it available (Google). The tradeoff is that you're paying for everyone's compute.

I guess SD is betting on saving $ on compute being more important in this space than the ability to gatekeep certain queries. And the tradeoff is that you need to do nsfw filtering in your released model.

It will be interesting to see who's right in 2 years.


Making a bulletproof filter is incredibly difficult, even more so in a domain where image descriptions are written in a culture that routinely circumvents text filters. Both Midjourney's and OpenAI's filters work mostly because of the threat of bans if you try to circumvent them. I'm not sure I would describe that as "the easy solution".
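To illustrate why: a naive blocklist in front of the prompt (all terms here are placeholders) catches only exact matches, while trivial variant spellings that the image model may still understand slip straight through:

```python
# Naive prompt blocklist, as an API provider might deploy in front of a model.
# Terms and prompts are illustrative; the point is that exact-match filtering
# is trivially circumvented by variant spellings and synonyms.

import re

BLOCKLIST = {"nude", "naked"}

def is_blocked(prompt):
    """Block the prompt if any word matches the blocklist exactly."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return any(w in BLOCKLIST for w in words)

print(is_blocked("a nude figure study"))      # True: exact match caught
print(is_blocked("a nud3 figure study"))      # False: leetspeak slips through
print(is_blocked("figure study, unclothed"))  # False: synonym not in the list
```

Real filters use embedding-based classifiers rather than word lists, but the arms race is the same: users search for phrasings the filter misses while the model still understands, which is why the ban threat does most of the work.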


That would suck immensely, for various reasons.


You can generate all the bloody violent gore you like, but god forbid anybody see a human body in its natural state



There is worry about generating illegal content. If the model understands multiple concepts, it can combine them.



