nsfw-image-detection-384

Marqo image-classification timm

Marqo/nsfw-image-detection-384

Downloads: 2,198,756
Favorites: 52
Views: 38
License: apache-2.0

Model card for nsfw-image-detection-384

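NOTE: Like all models, this one can make mistakes. NSFW content can be subjective and contextual. This model is intended to help identify such content; use it at your own risk.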

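Marqo/nsfw-image-detection-384 is a lightweight image classification model designed to identify NSFW images. The model is approximately 18–20x smaller than other open-source models and achieves an accuracy of 98.56% on our dataset. It takes 384x384 pixel images as input and splits them into 16x16 pixel patches.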
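The input geometry above fixes how many patches the model processes per image. The arithmetic is sketched below; the helper name `patch_grid` is hypothetical, and any extra tokens the backbone may add (such as a class token) are not counted.

```python
def patch_grid(image_size, patch_size):
    """Return (patches per side, total patches) for a square input image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


# Values taken from the model description: 384x384 input, 16x16 patches.
per_side, total = patch_grid(384, 16)
print(per_side, total)  # 24 576
```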

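The model was trained on a proprietary dataset of 220,000 images. The training set includes 100,000 NSFW examples and 100,000 SFW examples, while the test set contains 10,000 NSFW examples and 10,000 SFW examples. The dataset covers a variety of content types, including real photos, drawings, Rule 34 material, memes, and AI-generated images. The definition of NSFW can vary by context. Our dataset was built to include challenging examples, but its definition may not align 100% with every use case, so we recommend experimenting with different thresholds to determine whether this model suits your needs.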

Model Usage

Image Classification with timm

pip install timm

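from urllib.request import urlopen

from PIL import Image
import timm
import torch

# Download a sample image and convert it to 3-channel RGB before preprocessing.
img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
)).convert("RGB")

# Load the fine-tuned classifier from the Hugging Face Hub and switch to inference mode.
model = timm.create_model("hf_hub:Marqo/nsfw-image-detection-384", pretrained=True)
model = model.eval()

# Build the preprocessing pipeline from the model's own data configuration.
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

# Run a single-image batch and turn the logits into class probabilities.
with torch.no_grad():
    output = model(transforms(img).unsqueeze(0)).softmax(dim=-1).cpu()

# Map the highest-probability index back to its label name.
class_names = model.pretrained_cfg["label_names"]
print("Probabilities:", output[0])
print("Class:", class_names[output[0].argmax()])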
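The snippet above reports the highest-probability class. Since this card recommends trying different thresholds, a custom cut-off on the NSFW probability may fit some applications better. The following is a minimal sketch only: the helper `flag_nsfw` is hypothetical, the label name `"NSFW"` and the label order are assumptions to check against `model.pretrained_cfg["label_names"]`, and the probabilities are made-up values standing in for `output[0]`.

```python
def flag_nsfw(probabilities, class_names, threshold=0.5, nsfw_label="NSFW"):
    """Return True when the probability of the NSFW class reaches the threshold."""
    nsfw_index = list(class_names).index(nsfw_label)
    return float(probabilities[nsfw_index]) >= threshold


class_names = ["NSFW", "SFW"]   # assumed label order, verify against the model config
probabilities = [0.83, 0.17]    # made-up softmax output for illustration

print(flag_nsfw(probabilities, class_names, threshold=0.5))  # True
print(flag_nsfw(probabilities, class_names, threshold=0.9))  # False
```

A higher threshold flags fewer images (fewer false positives, more misses); a lower one does the opposite.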

Evaluation

This model outperforms existing NSFW detectors on our dataset. Below is an evaluation against AdamCodd/vit-base-nsfw-detector and Falconsai/nsfw_image_detection:

Figure: Comparison with other models

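Thresholds and Precision vs Recall

Adjusting the threshold on the NSFW probability trades off precision, recall, and accuracy against one another. This can be useful in applications that require different levels of confidence.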
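To make this trade-off concrete, the sketch below computes precision, recall, and accuracy at several thresholds on a labeled validation set. The function name `metrics_at_threshold` is hypothetical, and the scores and labels are invented for illustration; they are not results from this model.

```python
def metrics_at_threshold(nsfw_scores, is_nsfw, threshold):
    """Return (precision, recall, accuracy) when flagging scores >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(nsfw_scores, is_nsfw):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Invented NSFW probabilities and ground-truth labels (True = NSFW).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.05]
labels = [True, True, False, True, False, False]

for threshold in (0.3, 0.5, 0.7):
    p, r, a = metrics_at_threshold(scores, labels, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

On this toy data, raising the threshold from 0.3 to 0.7 lifts precision from 0.60 to 1.00 while recall drops from 1.00 to 0.67.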

Figure: Threshold evaluation
Figure: Precision and recall curves

Training Details

This model is a fine-tuned version of the timm/vit_tiny_patch16_384.augreg_in21k_ft_in1k model.

Parameters

batch_size: 256
color_jitter: 0.2
color_jitter_prob: 0.05
cutmix: 0.1
drop: 0.1
drop_path: 0.05
epoch_repeats: 0.0
epochs: 20
gaussian_blur_prob: 0.005
hflip: 0.5
lr: 5.0e-05
mixup: 0.1
mixup_mode: batch
mixup_prob: 1.0
mixup_switch_prob: 0.5
momentum: 0.9
num_classes: 2
opt: adamw
remode: pixel
reprob: 0.5
sched: cosine
smoothing: 0.1
warmup_epochs: 2
warmup_lr: 1.0e-05
warmup_prefix: false
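As a rough illustration of how the schedule settings above interact (`sched: cosine`, `lr`, `warmup_lr`, `warmup_epochs`, `epochs`, `warmup_prefix: false`), the sketch below computes an approximate per-epoch learning rate. It follows the general linear-warmup-then-cosine shape and is not claimed to reproduce timm's scheduler exactly. The function `lr_at_epoch` is hypothetical, `min_lr=0.0` is an assumption because the listed parameters do not include it, and per-epoch stepping is assumed as well.

```python
import math


def lr_at_epoch(epoch, epochs=20, lr=5.0e-05, warmup_epochs=2,
                warmup_lr=1.0e-05, min_lr=0.0):
    """Approximate learning rate at the start of a given epoch."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr up to the base lr.
        return warmup_lr + (lr - warmup_lr) * epoch / warmup_epochs
    # With warmup_prefix false, the cosine position uses the raw epoch index,
    # so the curve is already slightly below the base lr when warmup ends.
    return min_lr + 0.5 * (lr - min_lr) * (1 + math.cos(math.pi * epoch / epochs))


for epoch in (0, 1, 2, 10, 19):
    print(epoch, f"{lr_at_epoch(epoch):.3e}")
```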

Citation

@article{dosovitskiy2020vit,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={ICLR},
  year={2021}
}

@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}