They know they are going to be the next target in the war on general purpose computing. They're trying to stave it off for as long as possible by signalling to the authorities that they are the good guys.
A confrontation is inevitable, though. Right now it costs moderate sums of money to do this level of training; it won't always be so. If I were an AI-centric organization, I would be racing to position myself as a trustworthy actor in my particular corner of the AI space, so that when legislators start asking questions about the explosion of bad actors, I could engage in a little bit of regulatory capture and have them legislate whatever regulations I've already implemented, to the disadvantage of my competitors.
For people who say "people can make whatever images they like in photoshop," I will remind you of this:
https://i.imgur.com/5DJrd.jpg
This is business, not ethics, though. They just don't want the negative attention, that's it. And because this is a business, time matters, same as with DRM. Almost all DRM gets defeated, yet it works, because it hinders the crackers, even if only for a while. Same here. While Stability is not under attack yet, they can establish themselves as the household name for AI in a safer context.
Banknote printing is primarily protected against on the hardware level of printers, no? With the nigh-invisible unique watermark left by every printer, there’s virtually no way you’d get away with it. My guess is that the Photoshop filter exists mostly as a barrier against the crime of convenience.
My point is that there is precedent for governments requiring companies to implement restrictions on what images can be handled by their software.
As I explained: This kind of mandated restriction is looming over AI. Companies are trying to get out in front of these restrictions so they can implement them on their own terms.
>My point is that there is precedent for governments requiring companies to implement restrictions on what images can be handled by their software
But images of boobs are still legal, so this NSFW filter seems to go well beyond what the law asks. Is the issue that even if you don't train on CP, the model might output something that some random person gets offended by and labels as CP? I assume other companies can focus on NSFW and have their lawyers figure this out. IMO it would be cool if someone sued the governments and made them reveal the facts behind their concern that CP of fake or cartoon people is dangerous; I think they could focus on saving real children rather than cartoon ones.
You kill way too many birds with such a stone. Of course you could never do any kind of photorealistic game in real time if you had to pre-screen everything with an actually effective censor.
Indeed, what they're already doing is already hobbling the models.
Emad is right that we learn new things from the creativity unleashed by accessible models that can be run (and even fine-tuned) on consumer hardware.
But judging from what people post, one thing we learn is that it seems models fine tuned on porn (such as the notorious f222 and its derivative Hassan's blend) can be quite a bit better at non-porn generation of diverse, photorealistic faces and hands too.
> Of course you could never do any kind of photorealistic game in real time if you had to pre-screen everything with an actually effective censor.
I'm not sure I understand this. A possible implementation could be a neural net that blanked the screen with a frown face any time it detected something it thinks was "bad". What purpose/need would pre-screening serve?
What you describe IS pre-screening. And it's not workable, because it would take a ton of dedicated resources to make it work in real time, and even then it would be disastrous for latency: unworkable for most games and even desktop applications.
I think this is making the assumption that all frames are blocked.
> then it would be disastrous for latency
We're talking about the future here. I'm not sure it makes sense to use current tech to say it's not going to happen, or come up with latency numbers. But, "real time" inference is definitely a possibility, and is in active use for video moderation (Youtube, etc) and object detection (Tesla, etc). Nobody will notice a system running at 2000fps.
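For what it's worth, the kind of per-frame screening being debated here would look roughly like the sketch below; the classifier, the threshold, and the frame source are all hypothetical stand-ins, and the open question is whether a check like this can run fast enough.

```python
# Hypothetical sketch of per-frame output screening: every rendered frame
# passes through an NSFW classifier before it is presented, and flagged
# frames are replaced with a blank image. The classifier callable and the
# threshold are placeholders for illustration only.
import numpy as np

def screen_frame(frame: np.ndarray, classifier, threshold: float = 0.9) -> np.ndarray:
    """Return the frame unchanged, or a blanked frame if the classifier flags it."""
    score = classifier(frame)  # hypothetical callable returning P(frame is "bad")
    if score > threshold:
        return np.zeros_like(frame)  # blank the frame (frown face left as an exercise)
    return frame

# In a 60 fps game loop this check has to finish in well under 16 ms per frame,
# which is the latency/resource concern raised above.
```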
Very niche example here, and one that can be easily circumvented.
Also seems problematic to approach this from a purely capitalistic and consumerist angle. There is a lot of opportunity here besides just launching the next AI unicorn.
I am not clicking that link because no one should take the risk of you proving your point of what horrors could pop out of one of these models.
I will say that while the government backlash is inevitable just like it was with encryption, these image generation models are so easy to train on consumer hardware that the cat is hopelessly out of the bag. It might as well be thoughtcrime.
Or it's an output of "blank adobe photoshop with dialog refusing to edit bank note, full screen, windows vista, 4k, artstation, greg rutkowski, dramatic lighting".
In practice, it's unclear how well avoiding training on NSFW images will work: the original LAION-400M dataset used for both SD versions did filter out some of the NSFW stuff, and it appears SD 2.0 filters out a bit more. The use of OpenCLIP in SD 2.0 may also prevent some leakage of NSFW textual concepts compared to OpenAI's CLIP.
It will, however, definitely not affect the more-common use case of anime women with very large breasts. And people will be able to finetune SD 2.0 on NSFW images anyways.
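For a concrete sense of what that dataset-level filtering looks like, here's a minimal sketch, assuming LAION-style metadata parquet shards with a `punsafe` column (the 0.1 threshold is the figure from the SD 2.0 announcement; I haven't checked it against the actual training pipeline):

```python
# Minimal sketch of dataset-level NSFW filtering in the spirit of what
# Stability describe for SD 2.0: drop any image whose predicted "unsafe"
# probability exceeds a threshold before training ever sees it.
# Assumes LAION-style metadata with a `punsafe` column; the threshold is
# the publicly quoted value, not verified against the real training code.
import pandas as pd

PUNSAFE_THRESHOLD = 0.1  # keep only images the safety model thinks are safe

def filter_shard(parquet_path: str, out_path: str) -> None:
    df = pd.read_parquet(parquet_path)
    kept = df[df["punsafe"] < PUNSAFE_THRESHOLD]
    kept.to_parquet(out_path)
    print(f"{parquet_path}: kept {len(kept)} / {len(df)} rows")

# filter_shard("laion-shard-0000.parquet", "laion-shard-0000-sfw.parquet")
```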
The main reason why Stable Diffusion is worried about NSFW is that people will use it to generate disgusting amounts of CSAM. If LAION-5B or OpenAI's CLIP have ever seen CSAM - and given how these datasets are literally just scraped off the Internet, they have - then they're technically distributing it. Imagine the "AI is just copying bits of other people's art" argument, except instead of statutory damages of up to $150,000 per infringement, we're talking about time in pound-me-in-the-ass prison.
At least if people have to finetune the model on that shit, then you can argue that it's not your fault because someone had to do extra steps to put stuff in there.
> If LAION-5B or OpenAI's CLIP have ever seen CSAM
Diffusion models don't need any CSAM in the training dataset to generate CSAM. All they need is random NSFW content alongside safe content that includes children.
So I definitely see an issue with Stable Diffusion synthesizing CP in response to innocuous queries (in terms of optics; the actual harm this would cause is unclear).
That said, part of the problem with the general ignorance about machine learning and how it works is that there will be totally unreasonable demands for technical solutions to social problems. “Just make it impossible to generate CP” I’m sure will succeed just as effectively as “just make it impossible to Google for CP.”
It sometimes generates such content accidentally, yes. Seems to happen more often whenever beaches are involved in the prompt. I just delete them along with thousands of other images that aren't what I wanted. Does that cause anyone harm? I don't think so...
> I’m sure will succeed just as effectively as “just make it impossible to Google for CP.”
So... very, very well? I obviously don't have numbers, but I imagine CSAM would be a lot more popular if Google did nothing to try to hide it in search results.
I remember Louis CK made a joke about this regarding pedophiles (who are also rapists): what are we doing to prevent this? Is anyone making very realistic sex dolls that look like children? "Ew, no, that's creepy." Well, I guess you would rather they fuck your children instead.
It's one of those issues that you have to be careful not to get too close to, because you get accused by proximity: if you suggest something like what I said above, people might think you're a pedophile. So, in that way, nobody wants to do anything about it.
The underlying idea you have is that the artificial CSAM is a viable substitute good - i.e. that pedophiles will use that instead of actually offending and hurting children. This isn't borne out by the scientific evidence; instead of dissuading pedophiles from offending it just trains them to offend more.
This is the opposite of what we thought we learned from the debate about violent video games, where we said things like "video games don't turn people violent because people can tell fiction from reality." That was the wrong lesson. People confuse the two all the time; it's actually a huge problem in criminal justice. CSI taught juries to expect infallible forensic sci-fi tech, Perry Mason taught juries to expect dramatic confessions, etc. In fact, they literally call it the Perry Mason effect.
The reason why video games don't turn people violent is because video game violence maps poorly onto the real thing. When I break someone's spine in Mortal Kombat, I input a button combination and get a dramatic, slow-motion X-ray view of every god damned bone in my opponent's back breaking. When I shoot someone in Call of Duty, I pull my controller's trigger and get a satisfyingly bassy gun sound and a well-choreographed death animation out of my opponent. In real life, you can't do any of that by just pressing a few buttons, and violence isn't nearly that sexy.
You know what is that sexy in real life? Sex. Specifically, the whole point of porn is to, well, simulate sex. You absolutely do feel the same feelings consuming porn as you do actually engaging in sex. This is why therapists who work with actual pedophiles tell them to avoid fantasizing about offending, rather than to find CSAM as a substitute.
>The reason why video games don't turn people violent is because video game violence maps poorly onto the real thing
I don't believe this is the reason. Practicing martial arts, which maps well onto real-life violence, has not increased my violent behaviour. Similarly, playing FPS games in VR, which map much closer to reality than flat-screen games, does not make me want to go shoot people in real life. I don't think people playing paintball or airsoft will turn violent from partaking in those activities. The majority of people are just normal people, not bad people who would ever shoot or rape someone.
>You know what is that sexy in real life? Sex.
Why is any porn legal then? If porn turned everyone into sexual abusers I would believe your argument, but that just isn't true. Even if it were true that a small percentage of people who see porn turn into sexual abusers, I don't think that would make it worth banning porn altogether. I feel there should be a better way, one that doesn't restrict people's freedom of speech.
"Artificially-generated CSAM" is a misnomer, since it involves no actual sexual abuse. It's "simulated child pornography", a category that would include for example paintings.
Hmm, that’s a good point. It seems to be able to “transfer knowledge” for lack of a better term, so maybe it wouldn’t need to be in the dataset at all…
I have no answer to this but I have seen people mention that artificial CSAM is illegal in the USA, so the question of whether it is better or not is somewhat overshadowed by the very large market where it is illegal.
They’ve ensured the only way to create CSAM is through old-fashioned child exploitation, meanwhile all perfectly humane art and photography is at risk of AI replacement.
This is a huge missed opportunity to actually help society.
Stable diffusion is able to draw images of bears wearing spacesuits and penguins playing golf. I don't think it actually needs that kind of input to generate it. It's clearly able to generalize outside of the training set. So... Seems it should be possible to generate that kind of data without people being harmed.
That being said, this is a question for sociologists/psychologists IMO. Would giving people with these kinds of tendencies that kind of material make them more or less likely to cause harm? Is there a way to answer that question without harming anybody?
Without the changes they made to Stable Diffusion, it was already able to generate CP. That's why they restricted it from doing so. It did not have child pornography in the training set, but it did have plenty of normal adult nudity, adult pornography, and plenty of fully clothed children, and was able to extrapolate.
Anyway, one obvious application: FBI could run a darknet honeypot site selling AI-generated child porn. Eliminate the actual problem without endangering children.
This isn't the case in law in many countries. Whether an image is illegal or not does not solely depend on the means of production; if the images are realistic, then they are often illegal.
Don't forget that pornographic images and videos featuring children may be used for grooming purposes, socializing children into the idea of sexual abuse. There's a legitimate social purpose in limiting their production.
Once I read an article about a guy who got arrested because he’d put child porn on his Dropbox. I had assumed he’d been caught by some more sophisticated means and that was just the public story. I’m amazed that anyone would be stupid enough to distribute CSAM through an account linked to their own name.
So your hypothesis is that if the FBI gives the database to a company it will inevitably leak to the pedophile underworld?
I can't judge how likely that is.
I guess I also don't care much, as I only really care about stopping production using real children; simulated CSAM gets a shrug, and even use of old CSAM only gets a frown.
What company? How is it that people are advocating for the release of this database yet nobody says to whom?
My (lol now flagged) opinion is that it’s kind of weird to advocate for the CSAM archive to move into [literally any private company?] to turn it into some sort of public good based on… frowns?
I regularly skimmed 4Chan’s /b/ to get a frame of reference for fringe internet culture. But I’ve had to stop because the CSAM they generate by the hundreds per hour is just freakishly and horrifyingly high fidelity.
There’s a lot of important social questions to ask about the future of pornography, but I’m sure not going to be the one to touch that with a thousand foot pole.
I've spent too many hours there myself, but I haven't seen any AI CSAM, and it's been many years since I witnessed trolls posting the real thing. Moderation (or maybe automated systems) got a lot better at catching that.
Now, if you meant gross cartoons, yes, those get posted daily. But there are no children being abused by the creation or sharing of those images, and conflating the two types of image is dishonest.
This comment is so far off it might as well be an outright lie. There hasn't been CSAM on /b/ for years. The 4chan you speak of hasn't existed in a decade.
What is the point of making it "as hard as possible" for people?
This is not a game release. It doesn't matter if it's cracked tomorrow or in a year. It's open source, no less, so it's going to happen sooner rather than later.
As disgusting as it is, somebody is going to feed CP to an AI model, and that's just the reality of it. It's going to happen one way or another, and it's not any of these AI companies' fault.
Plausible deniability for governments. It's like DRM for Netflix-like streaming platforms: if they don't add DRM and their content owners' content gets pirated, the content owners could argue in court that Netflix didn't do everything in its power to stop the piracy. The same goes for Stability AI; they've said this is their reasoning before.
They don't. The training dataset, though, may have been obtained through human rights violations. The problem is when the novelty starts to wear off: then they will start to look for fresh training data, which may again incur more human rights violations. If you can ensure that no new training data is obtained that way, then I guess it's okay? (Personally, I don't condone it.)
Once again, this does pose an interesting problem, though. The AI people claim no copyright issues with the generated images because AI is different and the training data is not simply recreated. This would also imply that a model released by a paedophile, trained on illegal material, would itself not be illegal, as the illegal data is not represented within the model.
I very much doubt the police will look at AI this way when such models do eventually hit the web (assuming they haven't already) but at some point someone will get caught through this stuff and the arrest itself may have damning consequences throughout the AI space.
They’re already out there, although they’re hard to find via Google - people are doing wild things like “merging” hentai models with models trained on real life porn to get realistic poses and lighting with impossible anatomy.
The scary thing is that you can then train it further with things like DreamBooth to start producing porn of celebrities… or, even more worrying, people you know.
Seriously folks, we are within a year or less of this being trivial. It’s already possible with a lot of work today.
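For reference, the "merging" people are doing is usually just a weighted average of two checkpoints' weights, along the lines of this sketch (the file names, the assumed `state_dict` layout of original-format .ckpt files, and the blend ratio are placeholders, not anything from a specific tool):

```python
# Rough sketch of community-style checkpoint "merging": a simple weighted
# average of two Stable Diffusion checkpoints' weights. Assumes the
# original .ckpt layout with a top-level "state_dict" key.
import torch

def merge_checkpoints(path_a: str, path_b: str, out_path: str, alpha: float = 0.5) -> None:
    sd_a = torch.load(path_a, map_location="cpu")["state_dict"]
    sd_b = torch.load(path_b, map_location="cpu")["state_dict"]
    merged = {}
    for key, tensor_a in sd_a.items():
        if key in sd_b and sd_b[key].shape == tensor_a.shape:
            merged[key] = (1.0 - alpha) * tensor_a + alpha * sd_b[key]
        else:
            merged[key] = tensor_a  # fall back to model A for non-matching keys
    torch.save({"state_dict": merged}, out_path)

# merge_checkpoints("model_a.ckpt", "model_b.ckpt", "merged.ckpt", alpha=0.5)
```

The fine-tuning side (DreamBooth and friends) is heavier, but the merging part really is this mechanical.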
I have no idea how it works but I have seen people talking about models trained to draw furry art. And I assume no one spent the millions on AWS to train a full model from scratch.
I believe what they do is take the released version of Stable Diffusion and then continue training from there with their own image sets. I came across their attempts when looking into how to train the model on some images of my own; their data set so far is somewhere between tens of thousands and hundreds of thousands of images.
All the difficult parts (poses, backgrounds, art styles) have already been done by the SD researchers; the porn network only needs reference material for the NSFW descriptions/tags/details. This is significantly cheaper.
A similar project, training SD to output images in the style of Arcane, is incredibly successful in replicating the animation style with what seems to be very little actual training data.
I don't think you need to start from scratch at all if you use the SD model as a base, all you need to do is to train it on specific concepts, styles and key words that the original doesn't have.
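For a rough sense of what that continued training looks like in practice, here's a bare-bones sketch along the lines of the Hugging Face diffusers example scripts; the dataloader is a placeholder and all the usual training niceties (mixed precision, EMA, LR schedules, checkpointing) are left out, so treat it as an illustration rather than a tested recipe.

```python
# Bare-bones sketch of continuing training from a released SD checkpoint on
# your own (image, caption) pairs. `my_dataloader` is a placeholder you'd
# supply; everything here is illustrative rather than a recipe.
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline, DDPMScheduler

model_id = "stabilityai/stable-diffusion-2-base"  # or whichever base checkpoint you start from
pipe = StableDiffusionPipeline.from_pretrained(model_id)
tokenizer, text_encoder = pipe.tokenizer, pipe.text_encoder
vae, unet = pipe.vae, pipe.unet
noise_scheduler = DDPMScheduler.from_config(pipe.scheduler.config)

# Only the UNet is trained; the VAE and text encoder stay frozen.
vae.requires_grad_(False)
text_encoder.requires_grad_(False)
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

for images, captions in my_dataloader:  # placeholder: batches of pixel tensors + caption strings
    with torch.no_grad():
        # Encode images into the latent space the diffusion model works in.
        latents = vae.encode(images).latent_dist.sample() * 0.18215
        tokens = tokenizer(captions, padding="max_length", truncation=True,
                           max_length=tokenizer.model_max_length, return_tensors="pt")
        text_embeds = text_encoder(tokens.input_ids)[0]

    # Standard denoising objective: add noise at a random timestep and train
    # the UNet to predict it, conditioned on the caption embeddings.
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                              (latents.shape[0],), device=latents.device).long()
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
    noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states=text_embeds).sample

    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```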
Porn has driven many tech advances. I predict that models trained on specific porn genres will appear as soon as training a good model is doable for under $5000. They’ll get here much quicker if we get video to that mark first.
Yes and no. Yeah, the VHS vs. Betamax situation was exaggerated, but you'd be surprised how many Netflix and YouTube UI tricks were taken from innovations made on adult sites.
I'll even say that the push for high consumer bandwidth was strongly related to that. Even with HTML5 video players, adult websites were faster to implement them than the big streaming websites, which were still using Flash or similar tech.
Even if porn is what you want, a porn-only training set isn't necessarily what you want. It can probably generate better porn if it has a little context about what, say, a bedroom is.
What's more interesting is that there's evidence (from public posts; I haven't tried these models myself) that models trained on some porn get better at non-porn images too.
No. The whole point of these models is that they combine information across domains to be able to create new images. If you trained something just on, say baseball, you could only generate the normal things that happen in baseball. If you wanted to generate a picture of a bear surfing around the bases after hitting a home run, you'd need a model that also had bears and surfing in the training data, and enough other stuff to understand the relationships involved in positioning everything and changing poses.
That's a really interesting point, and it makes me realize that the Nancy Reagan 'what constitutes porn' question is obviously super old and problematic.
Also lexica.art is swarming with celebrity fantasy porn that just has a thin stylistic filter of paintings from the 19th century. And a plethora of furry daddies that you can't not love.
I get why these models should be curated but I also like that the sketchy porn possibilities keep them feeling un-padded / interesting / dangerous.
Then again this all is probably really dangerous so maybe that's silly.
Open source is more than just everything being available. It also depends on the license, and the one Stable Diffusion uses doesn't qualify, for multiple reasons, including the one mentioned upthread.
I think the controls in this space are such a shit show right now that being "open model" is practically equivalent to a WTFPL.
If you're trying to build an app based on SD, then not being open source matters. But seems like the majority of use cases are just "I want to run the model locally". And at that point HF can't stop me from just ripping the Wi-Fi card out of my computer.
The easiest way to combat this is to put your model behind an API and filter queries (midjourney, OpenAI) or just not make it available (Google). The tradeoff is that you're paying for everyone's compute.
I guess SD is betting on saving $ on compute being more important in this space than the ability to gatekeep certain queries. And the tradeoff is that you need to do nsfw filtering in your released model.
It will be interesting to see who's right in 2 years.
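As a toy illustration of the query-filtering approach (the endpoint and blocklist below are made up; real services reportedly layer classifiers and account-level bans on top of anything this naive):

```python
# Toy sketch of the "model behind an API, filter the queries" approach.
# The blocklist and endpoint are invented for illustration only.
from fastapi import FastAPI, HTTPException

app = FastAPI()
BLOCKED_TERMS = {"example_banned_term", "another_banned_term"}  # placeholder list

def prompt_allowed(prompt: str) -> bool:
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

@app.post("/generate")
def generate(prompt: str):
    if not prompt_allowed(prompt):
        raise HTTPException(status_code=400, detail="Prompt rejected by content filter")
    # ... dispatch to the hosted diffusion model here ...
    return {"status": "accepted"}
```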
Making a bulletproof filter is incredibly difficult, even more so in a domain where image descriptions are written by a culture that's used to circumventing text filters. Both Midjourney's and OpenAI's filters work mostly because of the threat of bans if you try to circumvent them. I'm not sure I would describe that as "the easy solution".