Octoscat

Octoscat is both a small library for detecting high-entropy strings and a CLI tool that searches through Git commit history for them (things like RSA keys, tokens, secrets, etc.).

How does it work?

There is nothing fancy or magic about how octoscat works; it's rather simple. You feed it strings (Git commit diffs, emails, or whatever your heart desires), and it iterates through each line, word, and character to find strings that look like they could be secrets.

Potential secrets are found by checking whether every character of each word (a space-separated string) belongs to the Base64 or hex character set. If a string is longer than 20 characters and every character falls within those two character sets, it might be a secret!
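The candidate check described above can be sketched in a few lines of Python. This is a minimal illustration, not octoscat's actual code: the names `find_candidates`, `ALLOWED_CHARS`, and `MIN_LENGTH` are hypothetical, and the Base64 set is assumed to be the standard alphabet (letters, digits, `+`, `/`, and `=` padding).

```python
import string

# Assumption: the standard Base64 alphabet plus "=" padding.
BASE64_CHARS = set(string.ascii_letters + string.digits + "+/=")
HEX_CHARS = set(string.hexdigits)

# Hex is a subset of Base64 here, so the union equals BASE64_CHARS;
# both sets are kept to mirror the description in the text.
ALLOWED_CHARS = BASE64_CHARS | HEX_CHARS

# The text says ">20 characters", so a word must be strictly longer than this.
MIN_LENGTH = 20


def find_candidates(text):
    """Return whitespace-separated words that could be secrets."""
    candidates = []
    for line in text.splitlines():
        for word in line.split():
            if len(word) <= MIN_LENGTH:
                continue
            # Keep the word only if every character is in the allowed set.
            if set(word) <= ALLOWED_CHARS:
                candidates.append(word)
    return candidates
```

Calling `find_candidates` on a diff or any other text returns only the long words made entirely of allowed characters. Words containing characters such as `-` or `_` are skipped under this assumed alphabet.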

Octoscat then calculates the Shannon entropy of each potential secret. If the entropy exceeds a certain threshold, the string is deemed a secret and returned as a positive match.
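As a rough sketch of the entropy step, the Python below computes Shannon entropy in bits per character and compares it to a threshold. The function names and the threshold value of 4.5 are assumptions chosen for illustration; the threshold octoscat actually uses is not stated here.

```python
import math
from collections import Counter

# Hypothetical threshold in bits per character; not taken from octoscat.
ENTROPY_THRESHOLD = 4.5


def shannon_entropy(s):
    """Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    # H = sum over distinct characters of p * log2(1 / p), where p = count / n.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def is_secret(candidate, threshold=ENTROPY_THRESHOLD):
    """Treat a candidate as a secret if its entropy exceeds the threshold."""
    return shannon_entropy(candidate) > threshold
```

A string made of one repeated character has an entropy of 0, while a string whose characters are all different reaches the maximum for its length, so random-looking tokens score high and repetitive text scores low.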

This approach is not bulletproof. You will likely run into many false positives, but it should find most of the real secrets. If you can think of a way to improve the accuracy, please feel free to open a pull request or start a discussion in the GitHub issues.

Contributors

Thank you to all of the folks who have contributed to this project:

TODO

  • Add the ability to query all repos of a given GitHub organization.
  • Add the ability to query all organization members' public GitHub repositories.
  • Add the ability to store results in a way that can be rendered in a web interface.
  • Add the ability to run this tool continuously, and get notified when new positive matches occur.
