Hacker News

They apparently tried to combat NSFW generation by filtering NSFW images out of the training dataset.


They know they are going to be the next target in the war on general purpose computing. They're trying to stave it off for as long as possible by signalling to the authorities that they are the good guys.

A confrontation is inevitable, though. Right now it costs moderate sums of money to do this level of training. Not always will this be so. If I were an AI-centric organization, I would be racing to position myself as a trustworthy actor in my particular corner of the AI space so that when legislators start asking questions about the explosion of bad actors, I can engage in a little bit of regulatory capture, and have the legislators legislate whatever regulations I've already implemented, to the disadvantage of my competitors.

For people who say "people can make whatever images they like in photoshop," I will remind you of this: https://i.imgur.com/5DJrd.jpg


Appeasement never works. Those who wage that war should be directly confronted.

And they will lose it, just like they've lost the war on encryption.


This is business, not ethics, though. They just don't want the negative attention, that's it. And because this is a business, time matters, same as with DRM. Almost all DRM gets defeated, yet it works, because it hinders the crackers, even if only for a while. Same here. While Stability is not under attack yet, they can establish themselves as the household name for AI in a safer context.


I doubt they care if people make porn with diffusion models. They just don’t want to be the ones providing the model to do it.


Banknote printing is primarily protected against on the hardware level of printers, no? With the nigh-invisible unique watermark left by every printer, there’s virtually no way you’d get away with it. My guess is that the Photoshop filter exists mostly as a barrier against the crime of convenience.


My point is that there is precedent for governments requiring companies to implement restrictions on what images can be handled by their software.

As I explained: This kind of mandated restriction is looming over AI. Companies are trying to get out in front of these restrictions so they can implement them on their own terms.


>My point is that there is precedent for governments requiring companies to implement restrictions on what images can be handled by their software

But images of boobs are still legal, so this NSFW filter goes well beyond what the law asks. Is the issue that even if you don't train on CP, the model might output something that some random person gets offended by and labels as CP? I assume other companies can focus on NSFW and have their lawyers figure this out. IMO it would be cool if someone sued the governments and made them reveal the facts behind their concern that CP of fake or cartoon people is dangerous; I think they could focus on saving real children rather than cartoon ones.


It’s possible that the end game is hardware in GPUs, to detect whatever they want to prevent, before it’s displayed.


You kill way too many birds with such a stone. Of course you could never do any kind of photorealistic game in real time if you had to pre-screen everything with an actually effective censor.

Indeed, what they're doing is already hobbling the models.

Emad is right that we learn new things from the creativity unleashed by accessible models that can be run (and even fine-tuned) on consumer hardware.

But judging from what people post, one thing we learn is that models fine-tuned on porn (such as the notorious f222 and its derivative Hassan's blend) can be quite a bit better at non-porn generation of diverse, photorealistic faces and hands too.


> Of course you could never do any kind of photorealistic game in real time if you had to pre-screen everything with an actually effective censor.

I'm not sure I understand this. A possible implementation could be a neural net that blanked the screen with a frown face any time it detected something it thought was "bad". What purpose/need would pre-screening serve?


What you describe IS pre-screening. And it's not workable, because it would take a ton of dedicated resources to make it work in real time, and even then it would be disastrous for latency: unworkable for most games and even most desktop applications.


> What you describe IS pre-screening.

I think this is making the assumption that all frames are blocked.

> then it would be disastrous for latency

We're talking about the future here. I'm not sure it makes sense to use current tech to say it's not going to happen, or come up with latency numbers. But, "real time" inference is definitely a possibility, and is in active use for video moderation (Youtube, etc) and object detection (Tesla, etc). Nobody will notice a system running at 2000fps.
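For a rough sense of scale, an inline screening model adds its per-frame inference time to every frame's latency. A back-of-the-envelope sketch (the numbers are illustrative, not measurements):

```python
# Back-of-the-envelope frame-time budget for an inline screening model.
# Numbers are illustrative, not measurements.

def frame_budget_ms(fps):
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps

game_fps = 60      # target refresh rate of the game itself
censor_fps = 2000  # hypothetical throughput of the screening model

censor_cost_ms = frame_budget_ms(censor_fps)  # 0.5 ms per frame
budget_ms = frame_budget_ms(game_fps)         # ~16.7 ms per frame

# Fraction of each frame consumed by screening:
print(f"{censor_cost_ms:.2f} ms of a {budget_ms:.2f} ms budget "
      f"({100 * censor_cost_ms / budget_ms:.0f}%)")  # → 0.50 ms of a 16.67 ms budget (3%)
```

At a hypothetical 2000 fps of screening throughput, the cost is a few percent of a 60 fps frame budget; whether that is "disastrous" or "unnoticed" depends entirely on how fast such a model can actually run.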


You can typically work around that by modding the printer firmware, if needed. It's not baked into the hardware.


Some AI startups backed by the biggest players are already hitting legal/regulatory issues.


A very niche example, and one that can be easily circumvented.

Also seems problematic to approach this from a purely capitalistic and consumerist angle. There is a lot of opportunity here besides just launching the next AI unicorn.


I am not clicking that link because no one should take the risk of you proving your point of what horrors could pop out of one of these models.

I will say that while the government backlash is inevitable just like it was with encryption, these image generation models are so easy to train on consumer hardware that the cat is hopelessly out of the bag. It might as well be thoughtcrime.


Link doesn't show any model output - it's a screenshot of Photoshop refusing to edit a banknote.


Or it's an output of "blank adobe photoshop with dialog refusing to edit bank note, full screen, windows vista, 4k, artstation, greg rutkowski, dramatic lighting".


I agree, that was the riskiest click I've made in a while.


In practice, it's unclear how well avoiding training on NSFW images will work: the original LAION-400M dataset used for both SD versions did filter out some of the NSFW stuff, and it appears SD 2.0 filters out a bit more. The use of OpenCLIP in SD 2.0 may also prevent some leakage of NSFW textual concepts compared to OpenAI's CLIP.

It will, however, definitely not affect the more-common use case of anime women with very large breasts. And people will be able to finetune SD 2.0 on NSFW images anyways.
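For concreteness, this kind of dataset-level filtering is usually done with a predicted NSFW probability stored in the dataset metadata (LAION publishes a score along these lines). A minimal sketch, with illustrative field names and threshold:

```python
# Sketch of dataset-level NSFW filtering, as described above.
# Assumes each record carries a predicted NSFW probability
# (LAION-style "punsafe" score); field names and threshold are illustrative.

def filter_dataset(records, punsafe_threshold=0.1):
    """Keep only records whose predicted NSFW probability is below the threshold."""
    return [r for r in records if r.get("punsafe", 1.0) < punsafe_threshold]

records = [
    {"url": "a.jpg", "punsafe": 0.02},
    {"url": "b.jpg", "punsafe": 0.97},
    {"url": "c.jpg"},  # no score: treated as unsafe and dropped
]

kept = filter_dataset(records)
print([r["url"] for r in kept])  # → ['a.jpg']
```

The weakness is obvious from the sketch: everything depends on how accurate the upstream NSFW classifier is and where the threshold is set, which is why concepts still leak through.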


The main reason why Stability is worried about NSFW is that people will use it to generate disgusting amounts of CSAM. If LAION-5B or OpenAI's CLIP have ever seen CSAM - and given how these datasets are literally just scraped off the Internet, they have - then they're technically distributing it. Imagine the "AI is just copying bits of other people's art" argument, except instead of statutory damages of up to $150,000 per infringement, we're talking about time in pound-me-in-the-ass prison.

At least if people have to finetune the model on that shit, then you can argue that it's not your fault because someone had to do extra steps to put stuff in there.


> If LAION-5B or OpenAI's CLIP have ever seen CSAM

Diffusion models don't need any CSAM in the training dataset to generate CSAM. All they need is random NSFW content alongside safe content that includes children.


So I definitely see an issue with Stable Diffusion synthesizing CP in response to innocuous queries (in terms of optics; the actual harm this would cause is unclear).

That said, part of the problem with the general ignorance about machine learning and how it works is that there will be totally unreasonable demands for technical solutions to social problems. “Just make it impossible to generate CP” I’m sure will succeed just as effectively as “just make it impossible to Google for CP.”


It sometimes generates such content accidentally, yes. Seems to happen more often whenever beaches are involved in the prompt. I just delete them along with thousands of other images that aren't what I wanted. Does that cause anyone harm? I don't think so...


> I’m sure will succeed just as effectively as “just make it impossible to Google for CP.”

So... very, very well? I obviously don't have numbers, but I imagine CSAM would be a lot more popular if Google did nothing to try to hide it in search results.


Is artificially generated CSAM that doesn't actually involve children in its production not an improvement over the status quo?


I remember Louis CK made a joke about this, regarding pedophiles (who are also rapists): what are we doing to prevent this? Is anyone making very realistic sex dolls that look like children? "Ew, no, that's creepy." Well, I guess you'd rather they fuck your children instead. It's one of those issues you have to be careful not to get too close to, because you get accused by proximity; if you suggest something like what I said before, people might think you're a pedophile. So, in that way, nobody wants to do anything about it.


No, it's not.

The underlying idea you have is that the artificial CSAM is a viable substitute good - i.e. that pedophiles will use that instead of actually offending and hurting children. This isn't borne out by the scientific evidence; instead of dissuading pedophiles from offending it just trains them to offend more.

This is the opposite of what we thought we learned from the debate about violent video games, where we said stuff like "video games don't turn people violent because people can tell fiction from reality". This was the wrong lesson. People confuse the two all the time; it's actually a huge problem in criminal justice. CSI taught juries to expect infallible forensic sci-fi tech, Perry Mason taught juries to expect dramatic confessions, etc. In fact, they literally call it the Perry Mason effect.

The reason why video games don't turn people violent is because video game violence maps poorly onto the real thing. When I break someone's spine in Mortal Kombat, I input a button combination and get a dramatic, slow-motion X-ray view of every god damned bone in my opponent's back breaking. When I shoot someone in Call of Duty, I pull my controller's trigger and get a satisfyingly bassy gun sound and a well-choreographed death animation out of my opponent. In real life, you can't do any of that by just pressing a few buttons, and violence isn't nearly that sexy.

You know what is that sexy in real life? Sex. Specifically, the whole point of porn is to, well, simulate sex. You absolutely do feel the same feelings consuming porn as you do actually engaging in sex. This is why therapists who work with actual pedophiles tell them to avoid fantasizing about offending, rather than to find CSAM as a substitute.


>The reason why video games don't turn people violent is because video game violence maps poorly onto the real thing

I don't believe this is the reason. Practicing martial arts, which maps well to real-life violence, doesn't produce an increase in violent behaviour in me. Similarly, playing FPS games in VR, which maps much more closely than flat-screen games, doesn't make me want to go shoot people in real life. I don't think people playing paintball or airsoft will turn violent from partaking in those activities. The majority of people are just normal people, not bad people who would ever shoot or rape someone.

>You know what is that sexy in real life? Sex.

Why is any porn legal then? If porn turned everyone into a sexual abuser I would believe your argument, but that just isn't true. And if it were true that a small percentage of people who see porn turn into sexual abusers, I don't think that makes it worth banning porn altogether. I feel there should be a better way, one that doesn't restrict people's freedom of speech.


> You absolutely do feel the same feelings consuming porn as you do actually engaging in sex

I can't believe someone says this. It's so not true in my experience. These feelings have a lot in common, but they are definitely not the same.


"Artificially-generated CSAM" is a misnomer, since it involves no actual sexual abuse. It's "simulated child pornography", a category that would include for example paintings.


Very much this. If someone goes out and trains a model on actual photographs of abuse, then holy shit, call in the cops.

If someone is generating sketchy cartoons from a training set of sketchy cartoons... well, gross, but there's no victims there.


Not exactly, since the abuse needed to actually happen for the derivative images to be possible to generate.


Is Stable Diffusion only able to generate images of things that have actually happened?


Hmm, that’s a good point. It seems to be able to “transfer knowledge” for lack of a better term, so maybe it wouldn’t need to be in the dataset at all…


I have no answer to this but I have seen people mention that artificial CSAM is illegal in the USA, so the question of whether it is better or not is somewhat overshadowed by the very large market where it is illegal.


Reminds me of flooding a market with fake rhino horn. Idk whether it worked though.


I believe the status quo is non-realistic drawings (think Lisa Simpson) can be illegal.

I don't think the fact that it's artificially generated has any bearing for some important purposes.


lol there's a piping hot take


>then they're technically distributing it.

The model does not contain the images themselves though. I think it would not be classified as that.


They reportedly did so to stop people from generating CSAM [0].

[0] https://old.reddit.com/r/StableDiffusion/comments/y9ga5s/sta...


They’ve ensured the only way to create CSAM is through old-fashioned child exploitation, meanwhile all perfectly humane art and photography is at risk of AI replacement.

This is a huge missed opportunity to actually help society.


I don't think, and Stability's CEO also doesn't seem to think, that society would receive it as a benefit. Therefore, it's undesirable, right now.


CSAM is a canary for general AI safety. If we can’t prevent SD from creating CP, will we be able to stop robots from killing people?


LMFAO

What do you propose? The FBI releases a CSAM data set for devs to use for “training”?

Would you be the one to create the model? Would you run a business that sells synthetic CSAM?


Stable diffusion is able to draw images of bears wearing spacesuits and penguins playing golf. I don't think it actually needs that kind of input to generate it. It's clearly able to generalize outside of the training set. So... Seems it should be possible to generate that kind of data without people being harmed.

That being said, this is a question for sociologists/psychologists IMO. Would giving people with these kinds of tendencies that kind of material make them more or less likely to cause harm? Is there a way to answer that question without harming anybody?

In the mean time, stay away from 4chan.


Without the changes they made to Stable Diffusion, it was already able to generate CP. That's why they restricted it from doing so. It did not have child pornography in the training set, but it did have plenty of normal adult nudity, adult pornography, and plenty of fully clothed children, and was able to extrapolate.

Anyway, one obvious application: FBI could run a darknet honeypot site selling AI-generated child porn. Eliminate the actual problem without endangering children.


> FBI could run a darknet honeypot site selling AI-generated child porn. Eliminate the actual problem without endangering children.

It's very unlikely AI-generated child porn would even be illegal. Drawn or photoshopped photos aren't, so I don't think AI-generated ones would be.


This isn't the case in law in many countries. Whether an image is illegal or not does not solely depend on the means of production; if the images are realistic, then they are often illegal.

https://en.m.wikipedia.org/wiki/Legal_status_of_fictional_po...

Don't forget that pornographic images and videos featuring children may be used for grooming purposes, socializing children into the idea of sexual abuse. There's a legitimate social purpose in limiting their production.


Well, Microsoft and others have this model for recognizing CSAM, trained on those CSAM images.


Apple and Meta have as well.

Apparently Facebook has a huge problem with distribution through messenger.


Once I read an article about a guy who got arrested because he’d put child porn on his Dropbox. I had assumed he’d been caught by some more sophisticated means and that was just the public story. I’m amazed that anyone would be stupid enough to distribute CSAM through an account linked to their own name.


I imagine the problem with messenger is teenagers sexting each other.


You will find very few teenagers in messenger, most use snapchat instead.


Yes to the first and no to the second seem the obvious answers here.


[flagged]


So your hypothesis is that if the FBI gives the database to a company it will inevitably leak to the pedophile underworld?

I can't judge how likely that is.

I guess I also don't care much, as I only really care about stopping production using real children; simulated CSAM gets a shrug, and even use of old CSAM only gets a frown.


What company? How is it that people are advocating for the release of this database yet nobody says to whom?

My (lol now flagged) opinion is that it’s kind of weird to advocate for the CSAM archive to move into [literally any private company?] to turn it into some sort of public good based on… frowns?


I regularly skimmed 4Chan’s /b/ to get a frame of reference for fringe internet culture. But I’ve had to stop because the CSAM they generate by the hundreds per hour is just freakishly and horrifyingly high fidelity.

There’s a lot of important social questions to ask about the future of pornography, but I’m sure not going to be the one to touch that with a thousand foot pole.


I've spent too many hours there myself, but I haven't seen any AI CSAM, and it's been many years since I witnessed trolls posting the real thing. Moderation (or maybe automated systems) got a lot better at catching that.

Now, if you meant gross cartoons, yes, those get posted daily. But there are no children being abused by the creation or sharing of those images, and conflating the two types of image is dishonest.


This comment is so far off it might as well be an outright lie. There hasn't been CSAM on /b/ for years. The 4chan you speak of hasn't existed in a decade.


There's more to 4chan than /b/. /diy/, /o/, /k/, etc.


What is the point of making it "as hard as possible" for people?

This is not a game release. It doesn't matter if it's cracked tomorrow or in a year. With open source, no less, it's going to happen sooner rather than later.

As disgusting as it is, somebody is going to feed CP to an AI model, and that's just the reality of it. It's going to happen one way or another, and it's not any of these AI companies' fault.


Plausible deniability for governments. It's like DRM for Netflix-style streaming platforms. If they didn't add DRM and their content owners' content got pirated, it could be argued in court that Netflix didn't do everything in its power to stop the piracy. So too here for Stability AI; they've said this is their reasoning before.


Do pixels have human rights now?


They don't. The training dataset, though, may have been obtained through human rights violations. The problem is when the novelty starts to wear out. Then they will start to look for fresh training data, which may again incur more human rights violations. If you can ensure that no new training data is obtained that way, then I guess it's okay? (Personally, I don't condone it.)


> The problem is when the novelty starts to wear out.

Isn't the main feature of Stable Diffusion that it doesn't?


Once again, this does pose an interesting problem. The AI people claim there are no copyright issues with the generated images because AI is different and the training data is not simply recreated. This would also imply that a model a paedophile trained on illegal material would itself not be illegal, as the illegal data is not represented within the model.

I very much doubt the police will look at AI this way when such models do eventually hit the web (assuming they haven't already) but at some point someone will get caught through this stuff and the arrest itself may have damning consequences throughout the AI space.


No, but people and enterprises have reputation.


Now that's a can of worms I don't think anyone wants to open.


Some do, that's the problem.


Artist have been drawing people of all ages having sex for literally thousands of years. Why should I care about that?


That's the excuse they all use.


Nixon: (muttering) Jesus Christ

I swear every time I find myself thinking “Hey, stop being so cynical and jaded all the time”, I stumble across something like this.


Bummer. AI porn is fun.


The future is probably models trained almost exclusively on porn.


They’re already out there, although they’re hard to find via Google - people are doing wild things like “merging” hentai models with models trained on real life porn to get realistic poses and lighting with impossible anatomy.

The scary thing is that you can then train it further with things like DreamBooth to start producing porn of celebrities… or, even more worrying, people you know.

Seriously folks, we are within a year or less of this being trivial. It’s already possible with a lot of work today.


I have no idea how it works but I have seen people talking about models trained to draw furry art. And I assume no one spent the millions on AWS to train a full model from scratch.


I believe what they do is take the released version of Stable Diffusion and then continue training from there with their own image sets. I came across their attempts when looking into how to train the model on some images of my own; their data set so far reaches between tens of thousands and hundreds of thousands of images.

All the difficult parts (poses, backgrounds, art styles) have already been done by the SD researchers; the porn network only needs reference material for the NSFW descriptions/tags/details. This is significantly cheaper.

A similar project, training SD to output images in the style of Arcane, is incredibly successful in replicating the animation style with what seems to be very little actual training data.

I don't think you need to start from scratch at all if you use the SD model as a base, all you need to do is to train it on specific concepts, styles and key words that the original doesn't have.
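The mechanics of "continue training from a released checkpoint" can be shown in miniature. The sketch below uses a toy one-parameter model in place of SD, purely to illustrate why starting fine-tuning from pretrained weights is so much cheaper than training from scratch:

```python
# Toy illustration of fine-tuning: start from "pretrained" weights and
# continue gradient descent on a small new dataset. A stand-in for the real
# process, which would start from the released SD checkpoint and new images.

def train(w, data, lr=0.05, steps=200):
    """Fit y = w*x by gradient descent on mean squared error."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# "Pretraining": the base model learns y = 2x from a larger dataset.
base_w = train(0.0, [(x, 2 * x) for x in range(1, 6)])

# "Fine-tuning": continue from base_w on a small new dataset where y = 3x.
# Far fewer examples and steps are needed than training from zero.
tuned_w = train(base_w, [(1, 3), (2, 6)], steps=50)

print(round(base_w, 2), round(tuned_w, 2))  # → 2.0 3.0
```

The real thing adds enormous scale and machinery (U-Net, text encoder, noise schedules), but the economics are the same: the expensive general capabilities come with the checkpoint, and only the new concept needs data.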


Porn has driven many tech advances. I predict that models trained on specific porn genres will appear as soon as training a good model is doable for under $5000. They’ll get here much quicker if we get video to that mark first.


You could probably already get people to pay for a subscription to generate images. Wouldn't be surprised if someone is already working on it.


The entire NovelAI drama had already demonstrated this.


What tech advances would those be?


print magazine, cinema, VHS, Internet


How much did porn actually contribute to innovation in any of those though?


Let me ask you this in reply:

Have you ever seen a non-porn DVD that had multiple camera angles (a feature defined in the standard DVD spec)?


And all it took was the ritual degradation and abuse of women.


> Porn has driven many tech advances.

This is an urban myth.


Yes and no. The VHS vs. Beta situation was exaggerated, but you'd be surprised how many Netflix and YouTube UI tricks were borrowed from innovations made on adult sites.

I'd even say the consumer push for high bandwidth was strongly related to that. Even with HTML5 video players, adult websites were faster to implement them than the big streaming sites, which were still using Flash or similar tech.


Which happens to be true.


Even if porn is what you want, it's not clear a porn-only model is what you want. It can probably generate better porn if it has a little context about what, say, a bedroom is.

What's more interesting, is that there's evidence (from public posts, I haven't tried these models myself) that models trained on some porn get better at non-porn images too.


No. The whole point of these models is that they combine information across domains to be able to create new images. If you trained something just on, say baseball, you could only generate the normal things that happen in baseball. If you wanted to generate a picture of a bear surfing around the bases after hitting a home run, you'd need a model that also had bears and surfing in the training data, and enough other stuff to understand the relationships involved in positioning everything and changing poses.


Did they exclude celebrities, politicians, and religious and political symbols?

Deceitful extremists and vengeful criminals fabricating lies seem to be a far more serious problem than fantasy porno.


That's a really interesting point, and it makes me realize that the Nancy Reagan 'what constitutes porn' question is obviously super old and problematic.

Also lexica.art is swarming with celebrity fantasy porn that just has a thin stylistic filter of paintings from the 19th century. And a plethora of furry daddies that you can't not love.

I get why these models should be curated but I also like that the sketchy porn possibilities keep them feeling un-padded / interesting / dangerous.

Then again this all is probably really dangerous so maybe that's silly.


> Nancy Reagan 'what constitutes porn' question

I thought that was Justice Stewart? And then he answered it "I know it when I see it."


(Edit: it may have removed that wording now: https://github.com/Stability-AI/stablediffusion/commit/ca86d... )

They can force model upgrades too:

> The New AI Model Licenses Have a Legal Loophole (OpenRAIL-M of Stable Diffusion)

https://www.youtube.com/watch?v=W5M-dvzpzSQ


Someone who seems to be Emad says below that the license was changed (the post got flagged for some reason):

https://news.ycombinator.com/item?id=33727177

https://github.com/Stability-AI/stablediffusion/commit/ca86d...


I don't understand why so many people call Stable Diffusion open source.


Why do you think it is not open source? The model weights, model architecture, and dataset are all available.


Read the license of the model: https://github.com/Stability-AI/stablediffusion/blob/main/LI...

Sections 5 and 7 make it not open source.


I don't see how it contradicts the open source definition at https://opensource.org/osd, could you point it out for me?


For the most blatant violation, look at point 6 of the OSD and attachment A of the license.


Seems like it is open source, just not free software.


It's not free software or open source. Check the Open Source Definition: https://opensource.org/osd


Open source is more than just everything being available. It also depends on the license, and the one Stable Diffusion uses doesn't qualify, for multiple reasons, including the one mentioned upthread.


You can download the model weights and run them offline. At least, you could in v1.4. I assume this is still possible on v2.0?


Right, but the model weights are arguably not the "source code", and the license gives the users fewer rights than open source licenses do.

https://en.wikipedia.org/wiki/The_Open_Source_Definition


I think the controls in this space are such a shit show right now that being "open model" is practically equivalent to a WTFPL.

If you're trying to build an app based on SD, then not being open source matters. But seems like the majority of use cases are just "I want to run the model locally". And at that point HF can't stop me from just ripping the Wi-Fi card out of my computer.


Actually, the license is not the same as the prior one; it was changed to remove this.


The easiest way to combat this is to put your model behind an API and filter queries (midjourney, OpenAI) or just not make it available (Google). The tradeoff is that you're paying for everyone's compute.

I guess SD is betting on saving $ on compute being more important in this space than the ability to gatekeep certain queries. And the tradeoff is that you need to do nsfw filtering in your released model.

It will be interesting to see who's right in 2 years.


Making a bulletproof filter is incredibly difficult, even more so in a domain where image descriptions are written in a culture that routinely circumvents text filters. Both Midjourney's and OpenAI's filters work mostly because of the threat of bans if you try to circumvent them. I'm not sure I would describe that as "the easy solution".
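To illustrate why: a naive blocklist in front of the prompt (all terms here are placeholders) catches only exact matches, while trivial variant spellings that the image model may still understand slip straight through:

```python
# Naive prompt blocklist, as an API provider might deploy in front of a model.
# Terms and prompts are illustrative; the point is that exact-match filtering
# is trivially circumvented by variant spellings and synonyms.

import re

BLOCKLIST = {"nude", "naked"}

def is_blocked(prompt):
    """Block the prompt if any word matches the blocklist exactly."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return any(w in BLOCKLIST for w in words)

print(is_blocked("a nude figure study"))      # True: exact match caught
print(is_blocked("a nud3 figure study"))      # False: leetspeak slips through
print(is_blocked("figure study, unclothed"))  # False: synonym not in the list
```

Real filters use embedding-based classifiers rather than word lists, but the arms race is the same: users search for phrasings the filter misses while the model still understands, which is why the ban threat does most of the work.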


That would suck immensely, for various reasons.


You can generate all the bloody violent gore you like, but god forbid anybody see a human body in its natural state



There is worry about generating illegal content. If the model understands multiple concepts, it can combine them.



