picklescan's Introduction

Python Pickle Malware Scanner

Security scanner detecting Python Pickle files performing suspicious actions.

For more generic model scanning, Protect AI's modelscan is now available. It scans not only Pickle files but also PyTorch, TensorFlow, and Keras models.

Getting started

Scan a malicious model on Hugging Face:

pip install picklescan
picklescan --huggingface ykilcher/totally-harmless-model

The scanner reports that the Pickle is calling eval() to execute arbitrary code:

https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin:archive/data.pkl: global import '__builtin__ eval' FOUND
----------- SCAN SUMMARY -----------
Scanned files: 1
Infected files: 1
Dangerous globals: 1
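
To illustrate what a "global import" finding means, here is a minimal sketch that lists the globals a Pickle references without ever unpickling it. It relies only on the standard library's pickletools module and is not picklescan's actual implementation. The function name find_global_imports and the hand-written payload are hypothetical, and the sketch only handles the GLOBAL opcode, so Pickles written with protocol 4 or later (which use STACK_GLOBAL) would need extra handling.

```python
import pickletools


def find_global_imports(data: bytes) -> set[tuple[str, str]]:
    """Return the (module, name) pairs referenced by GLOBAL opcodes.

    The bytes are only disassembled, never unpickled, so nothing executes.
    """
    found = set()
    for opcode, arg, _position in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.add((module, name))
    return found


# Hand-written protocol 0 Pickle (hypothetical example) that would call
# eval("1+1") if it were ever loaded. It is only inspected here.
PAYLOAD = b"c__builtin__\neval\n(S'1+1'\ntR."

print(find_global_imports(PAYLOAD))
```

A scanner built this way would then compare each pair against a list of globals it considers dangerous, such as eval.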

The scanner can also load Pickles from local files, directories, URLs, and zip archives (à la PyTorch):

picklescan --path downloads/pytorch_model.bin
picklescan --path downloads
picklescan --url https://huggingface.co/sshleifer/tiny-distilbert-base-cased-distilled-squad/resolve/main/pytorch_model.bin
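
As a rough illustration of the zip archive case, the sketch below lists the members of an archive that look like Pickle files, using the standard zipfile module. The member name archive/data.pkl comes from the scan output shown earlier. The helper name pickle_members and the ".pkl" suffix filter are assumptions made for this example, not picklescan's actual selection rule.

```python
import io
import zipfile


def pickle_members(archive: bytes) -> list[str]:
    """Return the names of zip members whose names end in .pkl."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return [name for name in zf.namelist() if name.endswith(".pkl")]


# Build a tiny stand-in for a PyTorch archive in memory (placeholder contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("archive/data.pkl", b"placeholder")
    zf.writestr("archive/version", b"3\n")

print(pickle_members(buffer.getvalue()))
```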

To scan NumPy's .npy files, first install the numpy package with pip.

The scanner's exit status codes follow ClamAV's convention:

  • 0: scan did not find malware
  • 1: scan found malware
  • 2: scan failed
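
These exit codes make the scanner easy to call from a script. The following is a minimal sketch, assuming the picklescan command is installed and on the PATH. The helper names describe_exit_status and scan_path are hypothetical; only the --path option and the three status codes come from this README.

```python
import subprocess

# Meanings taken from the exit status list above.
EXIT_STATUS = {
    0: "scan did not find malware",
    1: "scan found malware",
    2: "scan failed",
}


def describe_exit_status(code: int) -> str:
    """Translate a picklescan exit status into a readable message."""
    return EXIT_STATUS.get(code, f"unexpected exit status {code}")


def scan_path(path: str) -> str:
    """Run picklescan on a local path and describe the result.

    Raises FileNotFoundError if the picklescan command is not installed.
    """
    result = subprocess.run(
        ["picklescan", "--path", path],
        capture_output=True,
        text=True,
    )
    return describe_exit_status(result.returncode)
```

For example, scan_path("downloads/pytorch_model.bin") would return "scan found malware" when the scanner exits with status 1.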

Develop

Create and activate the conda environment (miniconda is sufficient):

conda env create -f conda.yaml
conda activate picklescan

Install the package in editable mode to develop and test:

python3 -m pip install -e .

Edit with VS Code:

code .

Run unit tests:

pytest tests

Run manual tests:

  • Local PyTorch (zip) file
mkdir downloads
wget -O downloads/pytorch_model.bin https://huggingface.co/ykilcher/totally-harmless-model/resolve/main/pytorch_model.bin
picklescan -l DEBUG -p downloads/pytorch_model.bin
  • Remote PyTorch (zip) URL
picklescan -l DEBUG -u https://huggingface.co/prajjwal1/bert-tiny/resolve/main/pytorch_model.bin

Lint the code:

black src tests
flake8 src tests --count --show-source

Publish the package to PyPI: bump the package version in setup.cfg and create a GitHub release. This triggers the publish workflow.

Alternative manual steps to publish the package:

python3 -m pip install --upgrade pip
python3 -m pip install --upgrade build
python3 -m build
python3 -m twine upload dist/*

Test the package: bump the version of picklescan in conda.test.yaml and run:

conda env remove -n picklescan-test
conda env create -f conda.test.yaml
conda activate picklescan-test
picklescan --huggingface ykilcher/totally-harmless-model

Tested on Linux 5.10.102.1-microsoft-standard-WSL2 x86_64 (WSL2).
