nsfw_model's Introduction

NSFW Detector logo

NSFW Detection Machine Learning Model

Trained on 60+ Gigs of data to identify:

  • drawings - safe for work drawings (including anime)
  • hentai - hentai and pornographic drawings
  • neutral - safe for work neutral images
  • porn - pornographic images, sexual acts
  • sexy - sexually explicit images, not pornography

This model powers NSFW JS - More Info

Current Status:

93% accuracy with the following confusion matrix, based on Inception V3.

nsfw confusion matrix

Requirements:

See requirements.txt.

Usage

For programmatic use of the library.

from nsfw_detector import predict
model = predict.load_model('./nsfw_mobilenet2.224x224.h5')

# Predict single image
predict.classify(model, '2.jpg')
# {'2.jpg': {'sexy': 4.3454722e-05, 'neutral': 0.00026579265, 'porn': 0.0007733492, 'hentai': 0.14751932, 'drawings': 0.85139805}}

# Predict multiple images at once
predict.classify(model, ['/Users/bedapudi/Desktop/2.jpg', '/Users/bedapudi/Desktop/6.jpg'])
# {'2.jpg': {'sexy': 4.3454795e-05, 'neutral': 0.00026579312, 'porn': 0.0007733498, 'hentai': 0.14751942, 'drawings': 0.8513979}, '6.jpg': {'drawings': 0.004214506, 'hentai': 0.013342537, 'neutral': 0.01834045, 'porn': 0.4431829, 'sexy': 0.5209196}}

# Predict for all images in a directory
predict.classify(model, '/Users/bedapudi/Desktop/')

If you've installed the package or use the command line, this should work, too:

# a single image
nsfw-predict --saved_model_path mobilenet_v2_140_224 --image_source test.jpg

# an image directory
nsfw-predict --saved_model_path mobilenet_v2_140_224 --image_source images

# a single image (from code/CLI)
python3 nsfw_detector/predict.py --saved_model_path mobilenet_v2_140_224 --image_source test.jpg

Download

Please feel free to use this model to help your products!

If you'd like to say thanks for creating this, I'll take a donation for hosting costs.

Latest Models Zip (v1.1.0)

https://github.com/GantMan/nsfw_model/releases/tag/1.1.0

Original Inception v3 Model (v1.0)

Original Mobilenet v2 Model (v1.0)

PyTorch Version

Kudos to the community for creating a PyTorch version with ResNet! https://github.com/yangbisheng2009/nsfw-resnet

TF1 Training Folder Contents

Simple description of the scripts used to create this model:

  • inceptionv3_transfer/ - Folder with all the code to train the Keras based Inception v3 transfer learning model. Includes constants.py for configuration, and two scripts for actual training/refinement.
  • mobilenetv2_transfer/ - Folder with all the code to train the Keras based Mobilenet v2 transfer learning model.
  • visuals.py - The code to create the confusion matrix graphic
  • self_clense.py - If the training data has significant inaccuracies, self_clense helps cross-validate errors in the training data in a reasonable amount of time. The better the model gets, the better you can use it to clean the training data manually.

e.g.

cd training
# Start with all locked transfer of Inception v3
python inceptionv3_transfer/train_initialization.py

# Continue training on model with fine-tuning
python inceptionv3_transfer/train_fine_tune.py

# Create a confusion matrix of the model
python visuals.py

Extra Info

There's no easy way to distribute the training data, but if you'd like to help with this model or train other models, get in touch with me and we can work together.

Advancements in this model power the quantized TFJS module on https://nsfwjs.com/

My Twitter is @GantLaborde - I'm a School of AI Wizard in New Orleans. I also run the Twitter account @FunMachineLearn.

Learn more about me and the company I work for.

Special thanks to the nsfw_data_scraper for the training data. If you're interested in a more detailed analysis of types of NSFW images, you could probably use this repo code with this data.

If you need React Native, Elixir, AI, or Machine Learning work, check in with us at Infinite Red, who make all these experiments possible. We're an amazing software consultancy worldwide!

Cite

@misc{man,
  title={Deep NN for NSFW Detection},
  url={https://github.com/GantMan/nsfw_model},
  journal={GitHub},
  author={Laborde, Gant}}

Contributors

Thanks goes to these wonderful people (emoji key):


Gant Laborde

💻 📖 🤔

Bedapudi Praneeth

💻 🤔

This project follows the all-contributors specification. Contributions of any kind welcome!

nsfw_model's People

Contributors

bedapudi6788, colindean, gantman, jessetrana, ottomanz, sickerin, technikempire, txyugood, vanphongle, xfalcox


nsfw_model's Issues

Nazi symbolism

Can you add examples of Nazi symbolism to the model? The model detects sexual content, but not the swastika.

Training Data

@GantMan I am trying to download the data with the torrent, and for the last 2 days it has shown 0 peers. If anyone is able to download the data, can you please share it? Thank you.

Better training on TF 2.1

Just FYI, if you preprocess the images and push them through TF Hub's make_image_classifier.py module, you can train much faster and reach much higher accuracy more quickly.

Process is:

  • Merge all images into the five folders. Do not separate into train, test and val. The script will do this for you later.
  • Resize all images down to 224x224 beforehand.
  • Set batch size to 1024-ish (depends on memory/video card memory)
  • Pass --do_fine_tuning to the script as well.
  • Before executing, edit the model-compilation code in the TF Hub module to separate the final softmax layer from the preceding dense layer. The code (as it is in Google's repo) combines both with layers.Dense(..., activation='softmax'). Delete the activation from that dense layer and append a separate softmax layer immediately after it as the final layer, explicitly setting its data type to float32. The reason for this is that you're also going to edit the script to configure TF to use mixed precision, which drastically boosts computation speed as long as you have a fairly recent NVIDIA card. Follow these instructions. The config must be changed before initializing any layers.
  • Lastly, edit the script's main() definition to call set_memory_growth(YOUR_FIRST_GPU, True), as demonstrated here. cuDNN will die if you don't.

To make those last edits, don't install TF Hub from pip; instead, git clone TF Hub's code. Once cloned, modify the imports of the make_image_classifier script to pull the local files in the same directory, then run the modified script.
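A minimal sketch of the memory-growth, mixed-precision, and float32-softmax edits described above, assuming TF 2.1's experimental mixed-precision API and a single NVIDIA GPU (the build_head name and layer wiring are illustrative, not the actual make_image_classifier code):

import tensorflow as tf

# Enable GPU memory growth before any layers are built; cuDNN can fail otherwise.
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# Switch to mixed precision before constructing the model.
policy = tf.keras.mixed_precision.experimental.Policy('mixed_float16')
tf.keras.mixed_precision.experimental.set_policy(policy)

# Replace Dense(num_classes, activation='softmax') with a plain Dense layer
# followed by a separate float32 softmax so the final outputs stay numerically stable.
def build_head(features, num_classes):
    logits = tf.keras.layers.Dense(num_classes)(features)
    return tf.keras.layers.Activation('softmax', dtype='float32')(logits)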

I'm only on epoch 3 of 10 (not 100 anymore) and I'm hitting 92.57% validation accuracy. I'm doing this on a pretty junky AMD machine that I popped an RTX 2060 into. Before doing the preprocessing, I was facing growing old waiting for this to get somewhere. I thought I'd share this because I've seen many other comments about people taking days to run training. It's only going to take ~2 hours to run through to epoch 10.

For greater clarity, what's happening here is that you're not just doing transfer learning; you're telling TF to specialize the lower layers of the model to the domain, which increases accuracy by "a few points" according to Google.

Hope this helps someone.

The same pic, but the result is different

I tested the nsfw model.

nsfw.299x299.h5
{'826ea252-8595-415f-a4f4-6e867b794b1e.png': {'hentai': 0.023223665, 'sexy': 0.049387105, 'drawings': 0.064280376, 'porn': 0.35258076, 'neutral': 0.5105281}}

nsfw_mobilenet224.h5
{'826ea252-8595-415f-a4f4-6e867b794b1e.png': {'drawings': 0.007675596, 'hentai': 0.01800691, 'sexy': 0.05218548, 'neutral': 0.32376507, 'porn': 0.598367}}

The same pic, but the results are different; my pic is neutral.

Potential issue with `self_clense.py`

I haven't looked through the self_clense.py script thoroughly, but it looks like you are using the model to change the distribution of your data by moving misclassified instances to their, supposedly, correct directory (class).
This process may (and most likely did) reduce one type of noise but amplify another.
So any potential gain in accuracy that you may have gotten from this process is only due to the feedback loop between the model and data distribution. This is not a rigorous/scientifically valid approach to "fix" noisy data.

Can not tune learning_rate

I am trying to retrain this model with train_initialization.py, and I'm trying to tune the learning rate like this: opt = SGD(lr=0.02, momentum=.9, decay=0.00000072, nesterov=True).
But when I check the lr graph, the lr line is always flat.
Any ideas? Thanks!
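One thing worth checking here (this is a note about Keras itself, not this repo's code): the standalone Keras SGD applies decay per batch inside its update step, but the lr variable that gets logged never changes, so a plotted lr line stays flat even when decay is working. A small sketch of a callback that prints the effective rate, assuming the Keras 2.x SGD used by the TF1 training scripts:

from keras import backend as K
from keras.callbacks import Callback

class EffectiveLR(Callback):
    def on_epoch_end(self, epoch, logs=None):
        opt = self.model.optimizer
        lr = K.get_value(opt.lr)
        decay = K.get_value(opt.decay)
        iterations = K.get_value(opt.iterations)
        # Same time-based decay formula Keras applies internally per batch.
        print('effective lr after epoch %d: %.8f' % (epoch, lr / (1. + decay * iterations)))

Passing EffectiveLR() in the callbacks list of fit/fit_generator shows whether the decay is actually being applied.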

New version is broken

Getting this error:
ImportError: cannot import name 'NSFWDetector' from 'nsfw_detector' (unknown location)

How many images as test data?

Thanks for sharing the data and code.
You say you get 93% accuracy, so I want to know how many images you used as test data.

Fine-tuning does not change model output

Hi, thank you very much for your pre-trained model!

I am gathering some new data and using your code to fine-tune. After 100 epochs, I get a new model (one weird thing is that the size of the new model is much smaller.....). However, when I test it, the new model behaves exactly the same as the pre-trained one (same accuracy, recall, and precision for each class).

Any idea why this happens? I am checking your code, and it seems that you only set one layer as trainable?

How to train the porn class only?

I have 10,000 pornographic pictures. How can I train the porn class only? When I trained on just one category, the model predicted that all pictures were pornographic.

Question

Are the usage instructions different for the new version?

Best Regards,
Marko

The names of the requested model outputs.

Quick question for any ML expert out there,

I'm writing a C# program and would love to use this model in it. In order to bind the model to C#, I need the names of the requested model outputs (in Netron, when you click on softmax there is usually an ID for the output data). Unfortunately, outputs is empty in Netron. Is there a default output ID I can use?

(see outputColumnName at https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transforms.tensorflowmodel.scoretensorflowmodel?view=ml-dotnet#Microsoft_ML_Transforms_TensorFlowModel_ScoreTensorFlowModel_System_String_System_String_System_Boolean_ )

Detect nsfw in video file

Hello.
Thank you very much for this excellent work.
I'm using it to detect nsfw in video frames. The problem I have is that I can only do it if I extract a frame, save it to disk and then open it and analyze it.
I use opencv.
Is there any way to directly analyze the frame that OpenCV gives?

Let me give you an example:

import cv2

cam = cv2.VideoCapture('video.avi')
while True:
    ret, frame = cam.read()
    if not ret:
        break
    result = detector.predict(frame)  # ??????? something like this?
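For reference, here is a rough sketch of one way this could work. It is an assumption, not the library's documented API: it relies on predict.load_model returning a plain Keras model that takes a batch of 224x224 RGB images scaled to [0, 1] (the same normalization the nsfwjs snippet further down this page uses), and the label ordering is assumed:

import cv2
import numpy as np
from nsfw_detector import predict

model = predict.load_model('./nsfw_mobilenet2.224x224.h5')
labels = ['drawings', 'hentai', 'neutral', 'porn', 'sexy']  # assumed ordering

cam = cv2.VideoCapture('video.avi')
while True:
    ret, frame = cam.read()
    if not ret:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)      # OpenCV frames are BGR
    batch = np.expand_dims(cv2.resize(rgb, (224, 224)) / 255.0, 0)
    scores = model.predict(batch)[0]
    print(dict(zip(labels, scores)))
cam.release()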

(question) tensorflow-lite model

My goal is to have NSFW detection on Android. For the time being, I'm okay with it not being super fast/efficient (though eventually I'd like to go in that direction).

I converted the keras model to a tensorflow lite model using the following:

tflite_convert --inference_input_type=FLOAT --inference_type=FLOAT --output_file=nsfw_mobilenet2.224x224.tflite --keras_model_file=nsfw_mobilenet2.224x224.h5

I then followed the tensorflow-lite example provided by Google. Here are references to what I believe are the two most important files:

I wired it up, and everything appears to run. However, I'm not getting results that are as accurate.

A few questions:

  1. Did I miss anything when converting the model?
  2. The addPixelValue method (see https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/android/app/src/main/java/org/tensorflow/lite/examples/classification/tflite/ClassifierFloatMobileNet.java) has the following:
@Override
protected void addPixelValue(int pixelValue) {
  imgData.putFloat((((pixelValue >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
  imgData.putFloat((((pixelValue >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
  imgData.putFloat(((pixelValue & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
}

Where IMAGE_MEAN and IMAGE_STD are 127.5. What values should they be for this model? I've tried 127.5 and 255; both give less accurate results.

Thanks for your time

Update: Looking at https://github.com/infinitered/nsfwjs/blob/master/src/index.ts

      // Normalize the image from [0, 255] to [0, 1].
      const normalized = img
        .toFloat()
        .div(this.normalizationOffset) as tf.Tensor3D

This makes me believe I'd need to do that instead. So I believe this would be done with:

    private static final float NORMALIZE_PIXELS = 255.f;

    private static void addPixelValue(ByteBuffer buffer, int pixelValue) {
        buffer.putFloat(((pixelValue >> 16) & 0xFF) / NORMALIZE_PIXELS);
        buffer.putFloat(((pixelValue >> 8) & 0xFF) / NORMALIZE_PIXELS);
        buffer.putFloat((pixelValue & 0xFF) / NORMALIZE_PIXELS);
    }

I'm still not getting amazing results though.

Unreproducible: accuracy for porn class in nsfw_data_scraper dataset

I tried to test the model against all the data from https://github.com/alexkimxyz/nsfw_data_scraper. It turns out that the model accuracy was pretty low (0.84) for the porn class. I checked several false negatives that should belong to porn. Any suggestions? Thanks.

Below are detailed results for porn images:

porn 0.8378566785677277
sexy 0.09817904345614349
neutral 0.028440081768767722
hentai 0.033282149350465834
drawings 0.0022420468568952363

Error when using Tensorflow Inception v3 Model with OpenCV

Hello! I'm having problems trying to load a TensorFlow Inception v3 model using readNetFromTensorflow in OpenCV 4.1.0 with Python 3.6:

import cv2
cv2.dnn.readNetFromTensorflow("nsfw.299x299.pb")

I get the following error:

cv2.error: OpenCV(4.1.0) /Users/travis/build/skvark/opencv-python/opencv/modules/dnn/src/tensorflow/tf_importer.cpp:535: error: (-2:Unspecified error) Input [batch_normalization_1/ones_like] for node [batch_normalization_1/FusedBatchNorm_1] not found in function 'getConstBlob'

Also, I have tried to generate a pbtxt from the pb file and load the model like this:

import cv2
cv2.dnn.readNetFromTensorflow("nsfw.299x299.pb", "nsfw.299x299.pbtxt")

But I get another error:

cv2.error: OpenCV(4.1.0) /Users/travis/build/skvark/opencv-python/opencv/modules/dnn/src/tensorflow/tf_importer.cpp:616: error: (-215:Assertion failed) const_layers.insert(std::make_pair(name, li)).second in function 'addConstNodes'

This issue could be related to opencv/opencv#14073, and it seems the model file could have some issues. Has anybody succeeded in using this model with OpenCV? Thanks for the help!

ValueError: Unknown layer: KerasLayer

I'm running this in Google Colab, and after loading the model I got an error.
Can you explain what I should do?
ValueError                                Traceback (most recent call last)
in ()
1 from nsfw_detector import predict
----> 2 model = predict.load_model('./saved_model.h5')
3
4 # Predict single image
5 predict.classify(model, 'nsfw.jpg')

9 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
319 cls = get_registered_object(class_name, custom_objects, module_objects)
320 if cls is None:
--> 321 raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
322
323 cls_config = config['config']

ValueError: Unknown layer: KerasLayer
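A common workaround for this specific error, offered as an assumption about the cause (it depends on how the .h5 was built): if the saved model wraps a TensorFlow Hub layer, hub.KerasLayer has to be registered as a custom object when loading.

import tensorflow as tf
import tensorflow_hub as hub

# Register the Hub layer class so Keras can deserialize it from the .h5 file.
model = tf.keras.models.load_model(
    './saved_model.h5',
    custom_objects={'KerasLayer': hub.KerasLayer})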

ModuleNotFoundError: No module named 'nsfw_detector'

I'm trying to run the example, and just for a test I created a file with only one row inside:
from nsfw_detector import NSFWDetector
but I get the error No module named 'nsfw_detector'.
I have tensorflow (1.13.1) and keras (2.2.4), and I also ran setup.py install for the nsfw_model repository.
What am I missing?

Thank you.

Error

Hi @GantMan @jessetrana @txyugood @TechnikEmpire @sickerin
I converted my hdf5 model to a tflite model and implemented it in my Flutter application, and I am getting an error like:
Caused by: java.lang.IllegalArgumentException: Cannot copy between a TensorFlowLite tensor with shape [1, 38] and a Java object with shape [1, 40]
Could you help me resolve this?

Thanks and Regards,
Manikantha Sekhar.

loss value

Hi, thanks for the repo. I want to know the loss value of the model on the train set and the val set.

Publishing the graph

Can you publish the graph as well for the .pb files?
Compiling from the TensorFlow sources can be hard depending on the environment.

How many images in the training set for each class?

Hi, thanks for sharing the code and model, it helps me a lot.
Can you tell me how many images are in the training set for the 5 classes? I'm not familiar with Keras; does the code below mean that only 500*batch_size images are trained every epoch, and that not every image in the nsfw_data_scraper training set is seen in an epoch?
(screenshot of the training code omitted)

By the way, I tested the model on the test set (2000 images for each class); the accuracy for the neutral class is 0.1 lower than the confusion matrix shows, while the other classes perform well.
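For context on the steps_per_epoch question above: Keras fit_generator processes steps_per_epoch batches per epoch, so an epoch covers steps_per_epoch * batch_size samples; if that product is smaller than the dataset, a single epoch does not see every image, and the generator simply keeps cycling across epochs. Quick illustrative arithmetic (500 comes from the question; the batch size is an assumption):

steps_per_epoch = 500
batch_size = 32          # assumed value
print(steps_per_epoch * batch_size)   # 16000 images seen per epoch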

Can I use this code to train on different kinds of images?

I'm quite new to machine learning, so please bear with me. I've got this idea to classify documents, e.g. ID cards, invoices, certificates. I was thinking of getting a head start by reusing this model but training it on a different data set. Is it that 'simple', or am I underestimating the task?

Problem in training phase

Hi, @GantMan @jessetrana @txyugood @TechnikEmpire @sickerin
I'm a college student and I'm studying your project. I want to train my own model with only 2 classes. I placed 2 folders of images (one per class) in the project's image directory, but the training scripts fail to split train/test/validation folders from my 2 image folders. How can I fix this?
Thank you.

Convert model for TensorFlow 1.1.0

I ran the nsfw.299x299.pb model on react-native-tensorflow and got this message:

Error: Running model failed: Invalid argument: NodeDef mentions attr 'dilations' not in
Op<name=Conv2D; signature=input:T, filter:T -> output:T; attr=T:type,allowed=[DT_HALF,
DT_FLOAT, DT_DOUBLE]; attr=strides:list(int);
attr=use_cudnn_on_gpu:bool,default=true; attr=padding:string,allowed=["SAME", "VALID"];
attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]>; NodeDef:
conv2d_1/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1],
padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true,
_device="/job:localhost/replica:0/task:0/cpu:0"](_recv_input_1_0, conv2d_1/kernel/read).
(Check whether your GraphDef-interpreting binary is up to date with 
your GraphDef-generating binary.).
	 [[Node: conv2d_1/convolution = Conv2D[T=DT_FLOAT, data_format="NHWC", 
dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true,
_device="/job:localhost/replica:0/task:0/cpu:0"](_recv_input_1_0, conv2d_1/kernel/read)]]
  1. A Google search suggested that the TensorFlow version and the model do not match. Is there a way to convert the model for version 1.1.0?
  2. I need to run this model on iOS and Android in JS with react-native. How can I do this?

Generate tagged saved_model.pb incl. variables for Go (Mobilenet v2 224)

We're working on a private cloud solution for personal photo management, see https://github.com/photoprism/photoprism

In order to enable uploading in our demo again, I would like to add an NSFW filter based on TensorFlow. Your model looks promising; however, I spent the whole day converting it to the right format and failed... either the output directory does not contain "variables.index", or the .pb cannot be created because of too many / not enough layers, input parameters, or output values. TF for Go strictly requires a tagged .pb file, so we cannot use the one offered for download.

This script worked for us to export NASNet:

import keras
from keras.applications.nasnet import NASNetMobile
from keras.preprocessing import image
from keras.applications.xception import preprocess_input, decode_predictions
import numpy as np
import tensorflow as tf
from keras import backend as K
import json
import os
import sys

if not os.path.isfile("NASNet-mobile.h5"):
    os.system("curl -L -o NASNet-mobile.h5 http://modeldepot.io/assets/uploads/models/models/09a9e3fd-ebf0-46d4-bd5d-8be69d80cf44_NASNet-mobile.h5")

if len(sys.argv[1:]) > 0:
    modelName = sys.argv[1]
else:
    modelName = "nasnet-mobile"

sess = tf.Session()
K.set_session(sess)

model = NASNetMobile(weights="NASNet-mobile.h5")
img = image.load_img('gorge.jpg', target_size=(224,224)) #note the input size
img_arr = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(img_arr)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])

print('input: ', model.input)
print('output: ', model.output)

# Use TF to save the graph model instead of Keras save model to load it in Golang
builder = tf.saved_model.builder.SavedModelBuilder(modelName)
# Tag the model, required for Go
builder.add_meta_graph_and_variables(sess, ["photoprism"])
builder.save()
sess.close()

Simply replacing NASNetMobile with MobileNetV2 + nsfw_mobilenet2.224x224.h5 didn't work (layer mismatch, I think this has ~107 and MobileNetV2 has about 105).

When using your code (here in this repository) to load the model, the error when saving is:

"FailedPreconditionError (see above for traceback): Attempting to use uninitialized value SGD_1/decay"

Docker is used as the runtime environment - maybe we need other versions of TF or Keras?

#!/usr/bin/env sh

docker run -ti \
       -v ${PWD}:/nsfw \
       -w /nsfw \
       -u $(id -u):$(id -g) \
       gildasch/tensorflow-keras \
       python save.py

Our code for running inference (once we have a working model):
https://github.com/photoprism/photoprism/blob/develop/internal/photoprism/tensorflow.go

It would be amazing if somebody could give us a hint! 👍
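One possible workaround, offered as an untested sketch rather than a verified fix (it assumes the uninitialized SGD_1/decay variable comes from the optimizer state that load_model recreates when it compiles the model):

import keras
import tensorflow as tf
from keras import backend as K

sess = tf.Session()
K.set_session(sess)

# compile=False skips creating optimizer variables (such as SGD_1/decay),
# which are not needed for inference and are what the
# FailedPreconditionError complains about.
model = keras.models.load_model('nsfw_mobilenet2.224x224.h5', compile=False)

# Same tagging approach as the NASNet script above; the output dir name is illustrative.
builder = tf.saved_model.builder.SavedModelBuilder('nsfw_saved_model')
builder.add_meta_graph_and_variables(sess, ["photoprism"])
builder.save()
sess.close()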

Why not resnet?

Since ResNet is very strong now, why didn't you try it?
Is there something wrong with it, or some reason it's not suitable for NSFW?

Looking forward to your guidance

License

I'd like to use this in my project but couldn't see anything about a license.
It seems like it's MIT, but just to be sure, what kind of license does this project have?

Autoconfigure image size based on model?

For the same image, I got a pretty different classification from the Inception v3 Keras 299x299 model downloaded here vs. the 93%-accurate NSFW JS on the website. Is the model here kept up to date? How often is it retrained and updated?
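On the title's question about auto-configuring the image size: a loaded Keras model exposes its expected input resolution, so callers could read it instead of hard-coding 224 vs. 299. A small sketch using standard Keras attributes (nothing specific to this repo):

from nsfw_detector import predict

model = predict.load_model('./nsfw.299x299.h5')
# input_shape is (batch, height, width, channels) for these image models.
_, height, width, channels = model.input_shape
print(height, width)  # e.g. 299 299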

New class training and data

Hello!

I'm an ML practitioner from Chile (so I'm sorry in advance for any English mistakes). I was thinking about improving this model with a new NSFW class: violence (gore, for example). I understand the basic concepts of transfer learning and how to do it, but since you are the expert, I would like to know if you can help me with:

  1. How much data from the original classes would you use for this new training? Should I use the 60 GB, or would that be overkill since the model can already classify the 5 original classes?
  2. How much data do you think I need for the new class?

Btw, I'm so grateful for your code, especially the mobilenet version! How long did it take to train that model?

Cya!

How do I use the model in TensorFlow JS?

Can anyone share code showing how I can use this model in tfjs?

const model = await tf.loadLayersModel('models/model.json')

I downloaded the model zip and loaded it as shown above. How can I then use this in JavaScript code to access the model and classify images?
