@ceo_of_monoeye_dating @Nerd02 @bmygsbvur @db0 The last time the topic came up, the only publicly available API for this was owned by the feds. I don't know if this tool downloads a model (I also don't know how such a model could be legal to possess) or if it consults an API (which would be a privacy concern). In either case, you'd have to be very careful about false positives.
@p @ceo_of_monoeye_dating @Nerd02 @bmygsbvur @db0 Yeah, it's using a local CLIP model, something I've suggested both to gr*f and the jakparty.soy admin. The problem is that it requires a lot of clock cycles, preferably on a GPU, so it isn't something people with $5 VPSes can afford. Not fully sure about effectiveness, either; malicious actors can keep scrambling the image so that it passes the filter yet is still recognizable to the human brain.
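For anyone curious, a minimal sketch of what such a local CLIP-based filter could look like (assuming the openai/clip-vit-large-patch14 checkpoint via Hugging Face transformers; the label prompts and the review threshold are placeholders, not whatever the actual tool ships with):

```python
# Zero-shot image scoring with a locally downloaded CLIP model.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

CHECKPOINT = "openai/clip-vit-large-patch14"  # a ViT-L/14 variant
device = "cuda" if torch.cuda.is_available() else "cpu"  # GPU strongly preferred
model = CLIPModel.from_pretrained(CHECKPOINT).to(device).eval()
processor = CLIPProcessor.from_pretrained(CHECKPOINT)

# Placeholder prompts; a real deployment would use a tuned prompt set.
LABELS = ["a benign photo", "a disallowed image"]

def score_image(path: str) -> dict[str, float]:
    image = Image.open(path).convert("RGB")
    inputs = processor(text=LABELS, images=image, return_tensors="pt", padding=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape: (1, len(LABELS))
    probs = logits.softmax(dim=-1).squeeze(0)
    return dict(zip(LABELS, probs.tolist()))

# e.g. queue an upload for human review when the "disallowed" probability
# crosses some threshold, rather than auto-deleting, because of false positives.
```

Given the scrambling problem, you'd probably treat it as a pre-filter that flags borderline uploads for human review rather than a hard block.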
@p @Nerd02 @bmygsbvur @db0 @mint The problem with the models is that training data can be reverse engineered from the model. If the model's not trained on any CP, there's not likely to be any problem.
@ceo_of_monoeye_dating @Nerd02 @bmygsbvur @db0 @mint Yeah, presumably it is better at detecting stuff that it produces itself, but my understanding is that this kind of model is legally questionable to possess because of that.
@ceo_of_monoeye_dating @Nerd02 @bmygsbvur @db0 @mint Yeah, but youtube-dl was on GitHub for years and was then suddenly declared an evil piracy tool, scrubbed, and banned. The odds that you get bonked are also higher than the odds that GitHub gets bonked; "I got it from GitHub" doesn't constitute much of a defense.
In either case, I don't have much investment in the legality of that model because I don't plan to acquire it. It was just my understanding that possessing a model that was trained on some source material, and that can be used to produce material resembling that source material, is considered legally the same as possessing the source material. I'm not an expert on that, and I don't think there have even been any cases yet.
To be specific, they use one of the ViT-L/14 models. This type of labeling model has been around for a long time; they used to be called text-from-image models or some similarly verbose description.
If the current generative models can produce porn, then they can also produce CSAM; there's no need to go through another layer. The issue with models trained on actual illegal material is that they could then be reverse engineered to output the very same material they were trained on, in addition to very realistic generated material. It's similar to how LLMs can be used to extract potentially private information they've been trained on.