The porn / sex-chat one is really disappointing. It seems they've given up even pretending that they are trying to do something beneficial for society. This is just a pure society-be-damned money grab.
I'm pretty sure that if they didn't deliberately choose to train on sex chat, stories, etc., then the LLM wouldn't be any good at it. The model isn't getting this capability by training on Wikipedia or Reddit.
So, it's not that they couldn't do a good job of preventing the model from doing this, gave up, and decided to encourage it instead (which makes no sense anyway). Rather, they chose to train the model to do this. OpenAI is targeting porn as one of their profit centers.
>The model isn't getting this capability by training on Wikipedia or Reddit
I don't know about the former, but the latter absolutely has sexually explicit material that could make the model more likely to generate erotic stories, flirty chats, etc.
OK, maybe a bad example, but it would be easy to create a classifier to identify stuff like that and omit it from the training data if they wanted to. And now that they are going to be selling this, I'd assume they are explicitly seeking out and/or paying for the creation of training material of this type.