
Rotate-Captcha-Crack

中文 | English

Predicts the rotation angle of a given picture with a CNN. It can be used for cracking rotate-captchas.

Test result:

(Figure: test result)

Three kinds of models are implemented, as shown below.

| Name | Backbone | Cross-Domain Loss (less is better) | Params | FLOPs |
| - | - | - | - | - |
| RotNet | ResNet50 | 71.7920° | 24.246M | 4.132G |
| RotNetR | RegNetY 3.2GFLOPs | 19.1594° | 18.468M | 3.223G |
| RCCNet_v0_5 | RegNetY 3.2GFLOPs | 42.7774° | 17.923M | 3.223G |

RotNet is a PyTorch implementation of d4nst/RotNet. RotNetR is based on RotNet; I only replaced its backbone and reduced the number of classes to 180. Its average prediction error is 19.1594°, obtained after 64 epochs of training (2 hours) on the Google Street View dataset.
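With 180 classes, each class corresponds to a 2° bin of rotation angle. A minimal sketch of the angle/class mapping (helper names are mine for illustration, not the project's API):

```python
def angle_to_class(angle_deg: float, num_classes: int = 180) -> int:
    """Map a rotation angle in [0, 360) to one of num_classes bins (2 deg each for 180)."""
    bin_width = 360.0 / num_classes
    return int(angle_deg % 360.0 // bin_width)

def class_to_angle(cls: int, num_classes: int = 180) -> float:
    """Map a class index back to the center of its bin."""
    bin_width = 360.0 / num_classes
    return cls * bin_width + bin_width / 2.0

print(angle_to_class(90.0))  # 45
print(class_to_angle(45))    # 91.0 (bin center)
```

Coarser bins reduce the output dimension at the cost of angular resolution; 2° granularity is more than enough for captcha cracking.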

About the Cross-Domain Test: Google Street View and Landscape-Dataset are used for training, and captcha pictures from Baidu for testing (thanks to @xiangbei1997).

The captcha picture used in the demo above comes from RotateCaptchaBreak

Try it!

Prepare

  • A GPU supporting CUDA 10+ (>=4 GB of VRAM for training)

  • Python>=3.8 <3.12

  • PyTorch>=1.11

  • Clone the repository and install all required dependencies

git clone --depth=1 https://github.com/Starry-OvO/rotate-captcha-crack.git
cd ./rotate-captcha-crack
pip install .

Don't miss the `.` after `install`.

  • Or, if you prefer venv
git clone --depth=1 https://github.com/Starry-OvO/rotate-captcha-crack.git
python -m venv ./rotate-captcha-crack --system-site-packages
cd ./rotate-captcha-crack
# Choose the proper script to activate the venv according to your shell,
# e.g. ./Scripts/activate on Windows or ./bin/activate on POSIX shells
python -m pip install -U pip
pip install .

Download the Pretrained Models

Download the zip files in Release and unzip them to the ./models dir.

The directory structure should look like ./models/RCCNet_v0_5/230228_20_07_25_000/best.pth

The models' names change frequently while the project is in beta. If a FileNotFoundError occurs, try rolling back to the corresponding tag first.

Test the Rotation Effect on a Single Captcha Picture

If no GUI is available, change the debug behavior from showing images to saving them.

python test_captcha.py

Use HTTP Server

  • Install extra dependencies
pip install aiohttp httpx[cli]
  • Launch server
python server.py
  • From another shell, send an image
httpx -m POST http://127.0.0.1:4396 -f img ./test.jpg

Train Your Own Model

Prepare Datasets

  • For this project I use Google Street View and Landscape-Dataset for training. You can also collect some photos yourself and put them in one directory; there is no size or shape requirement.

  • Modify the dataset_root variable in train.py so that it points to the directory containing your images.

  • No manual labeling is required. All cropping, rotation, and resizing are done on the fly after the image is loaded.
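The self-supervised labeling idea can be sketched as follows: draw a random angle, rotate the image by it, and derive the label directly from that angle. Function and parameter names here are illustrative, not the project's API:

```python
import random
from typing import Optional, Tuple, Union

def make_label(num_classes: Optional[int] = None,
               rng: Optional[random.Random] = None) -> Tuple[float, Union[float, int]]:
    """Draw a random rotation angle and derive the training label from it.

    No manual annotation is needed: the angle the image is rotated by IS the
    label. Returns (angle_deg, label), where label is a normalized float in
    [0, 1) for regression models, or a class index when num_classes is given.
    """
    rng = rng or random.Random()
    angle = rng.random() * 360.0
    if num_classes is None:
        return angle, angle / 360.0                 # regression target
    return angle, int(angle / 360.0 * num_classes)  # classification target
```

The actual pipeline would then rotate the loaded image by `angle` (e.g. with torchvision transforms) before feeding it to the network.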

Train

python train_RotNetR.py

Validate the Model on the Test Set

python test_RotNetR.py

Details of Design

Most rotate-captcha cracking methods are based on d4nst/RotNet, with ResNet50 as the backbone. RotNet treats angle prediction as a classification task with 360 classes and uses CrossEntropy to compute the loss.

Yet CrossEntropy implies a metric distance of about $358°$ between $1°$ and $359°$, which clearly defies common sense: it should be a small value like $2°$. Meanwhile, the angle_error_regression given by d4nst/RotNet is less effective, because with outliers the gradient leads to non-convergence. You can see this in the subsequent comparison of the loss functions.
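The "common sense" metric appealed to here is the circular distance between two angles, which a few lines of Python make concrete:

```python
def angular_distance(a: float, b: float) -> float:
    """Smallest rotation, in degrees, between two angles on a circle."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

print(angular_distance(1.0, 359.0))  # 2.0, not the ~358 a naive linear distance suggests
```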

My regression loss function RotationLoss is based on MSELoss, with an extra cosine correction to decrease the metric distance between angles $\pm k \cdot 360°$ apart.

$$ \mathcal{L}(dist) = {dist}^{2} + \lambda_{cos} (1 - \cos(2\pi*{dist})) $$

Why MSELoss here? Because the labels generated by the self-supervised pipeline are guaranteed to contain no outliers, so the design does not need to be robust to them. MSELoss also keeps the loss function differentiable.

The loss function is differentiable and almost convex over the entire $\mathbb{R}$. Why almost? Because local minima appear at $predict = \pm 1$ when $\lambda_{cos} \gt 0.25$.
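This claim is easy to probe numerically: scan the sign of the gradient dL/d(dist) and look for a negative-to-positive flip away from the origin, which signals a spurious local minimum. This is my own check, not code from the project:

```python
import math

def loss_grad(dist: float, lam: float) -> float:
    """Derivative of dist^2 + lam * (1 - cos(2*pi*dist)) with respect to dist."""
    return 2.0 * dist + lam * 2.0 * math.pi * math.sin(2.0 * math.pi * dist)

def has_interior_local_min(lam: float) -> bool:
    """Scan dist in (0, 1.5): a gradient that dips negative and then turns
    positive again means the loss has a local minimum away from dist = 0."""
    went_negative = False
    for i in range(1, 1500):
        g = loss_grad(i / 1000.0, lam)
        if g < 0.0:
            went_negative = True
        elif went_negative:
            return True
    return False

print(has_interior_local_min(0.1))  # False: the loss stays convex on this range
print(has_interior_local_min(0.5))  # True: a local minimum appears near dist = 1
```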

Finally, let's take a look at the figure of two loss functions:

(Figure: comparison of the two loss functions)

Contributors

starry-ovo, controlnet
