leak-detection

Overview

A service that finds leaks in Git repositories using regular expressions. Because the service only runs periodically, leaked credentials might already have been found by unsavoury characters by the time a leak is detected. Education is the best tool for preventing leaked secrets. Further reading: https://blog.acolyer.org/2019/04/08/how-bad-can-it-git-characterizing-secret-leakage-in-public-github-repositories/

Checks performed by Leak Detection Service

LDS currently performs two kinds of checks:

  • Check for occurrences of a regular expression from a predefined set of rules.
  • Check for the existence of a repository.yaml file that contains the unique fingerprint of a public or private MDTP repository.

Regular expression rules

LDS lets you define a collection of rules, each of which detects a certain type of secret that might be stored in a Git repository. Each rule definition consists of a set of properties, e.g.

    {
      id = "cert_1"
      scope = fileContent
      regex = """-----(BEGIN|END).*?PRIVATE.*?-----"""
      description = "certificates and private keys"
      ignoredFiles = ["^\\/.*phantomjs.*", "^\\/.*xxx.*", "^\\/.*Foo.*", """/\.bar\.yml"""]
      ignoredExtensions = ${allRules.knownBinaryFilesExtensions}
    },

In this case the rule applies to the content of all non-binary files (files whose extension is not listed in ignoredExtensions) and flags any content that matches the specified regex.
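The behaviour described above can be illustrated with a minimal Python sketch. This is not the service's own implementation: the Rule class, the is_ignored and find_matches helpers, the use of search-style matching for ignoredFiles, and the two sample extensions standing in for knownBinaryFilesExtensions are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Rule:
    """Hypothetical in-memory mirror of one rule definition."""
    id: str
    regex: str
    description: str
    ignored_files: list = field(default_factory=list)
    ignored_extensions: list = field(default_factory=list)


def is_ignored(rule, path):
    """Return True if the file is skipped by extension or by path pattern."""
    if PurePosixPath(path).suffix.lower() in rule.ignored_extensions:
        return True
    # Assumption: an ignoredFiles pattern may match anywhere in the path.
    return any(re.search(pattern, path) for pattern in rule.ignored_files)


def find_matches(rule, path, content):
    """Return (line number, line) pairs whose text matches the rule's regex."""
    if is_ignored(rule, path):
        return []
    pattern = re.compile(rule.regex)
    return [
        (number, line)
        for number, line in enumerate(content.splitlines(), start=1)
        if pattern.search(line)
    ]


CERT_RULE = Rule(
    id="cert_1",
    regex=r"-----(BEGIN|END).*?PRIVATE.*?-----",
    description="certificates and private keys",
    ignored_files=[r"^/.*phantomjs.*", r"/\.bar\.yml"],
    # Assumption: placeholder values, not the real knownBinaryFilesExtensions.
    ignored_extensions=[".png", ".jar"],
)
```

Calling find_matches(CERT_RULE, path, content) on a text file containing a PEM private key header returns the matching lines, while any path matching an ignored pattern or extension yields an empty list.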

Checking for the existence of a repository.yaml file

This check is performed only if the configuration parameter alerts.slack.enabledForRepoVisibility is set to true. When the check is enabled, an alert is sent whenever a commit is made to a repository that does not contain a repository.yaml file, or whose repository.yaml file does not contain a valid fingerprint.
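The decision logic of this check can be sketched as follows. The sketch is an illustration under stated assumptions, not the service's code: the key name repoVisibility, the helper names, and the idea of comparing against a known set of valid fingerprints are all hypothetical, and the fingerprint values used with it would be placeholders.

```python
def read_fingerprint(yaml_text, key="repoVisibility"):
    """Return the value of a top-level `key: value` line, or None if absent.

    Assumption: the fingerprint is stored under a hypothetical key name.
    """
    for line in yaml_text.splitlines():
        name, separator, value = line.partition(":")
        if separator and name.strip() == key:
            return value.strip()
    return None


def needs_visibility_alert(repo_files, valid_fingerprints, check_enabled=True):
    """Decide whether a commit should trigger the repository.yaml alert.

    repo_files maps file names to their text content.
    """
    if not check_enabled:
        # Mirrors alerts.slack.enabledForRepoVisibility being false.
        return False
    yaml_text = repo_files.get("repository.yaml")
    if yaml_text is None:
        return True  # the file is missing entirely
    # A missing or unknown fingerprint also triggers the alert.
    return read_fingerprint(yaml_text) not in valid_fingerprints
```

A repository with no repository.yaml, or with an unrecognised fingerprint, produces an alert; disabling the check suppresses it in every case.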

Testing in a local environment

Requirements

  • Ensure sbt is installed.
  • MongoDB running locally. No local authorisation is required.
    • On Ubuntu (and likely all Debian derivatives), sudo apt-get install mongodb-server && sudo systemctl start mongodb is sufficient.
  • You will also need a GitHub personal access token: https://github.com/settings/tokens
    • Export the GitHub token with export GITHUB_TOKEN=abc123abc123abc123abc123abc123abc123abc123abc123
  • Run sbt "run -DgithubSecrets.personalAccessToken=abc123abc123abc123abc123abc123abc123abc123abc123" in the repository.
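Before starting the service, the requirements above can be checked with a short script. This is a hedged sketch, not part of the repository: the function names are hypothetical, and it assumes MongoDB listens on its default port 27017 on localhost.

```python
import os
import shutil
import socket


def mongo_reachable(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to the MongoDB port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_prerequisites(env=None, which=shutil.which, port_open=mongo_reachable):
    """Return a list of human-readable problems; an empty list means ready.

    The env, which and port_open parameters exist so the checks can be
    replaced with stubs when testing.
    """
    env = os.environ if env is None else env
    problems = []
    if which("sbt") is None:
        problems.append("sbt is not on PATH")
    if not env.get("GITHUB_TOKEN"):
        problems.append("GITHUB_TOKEN is not set")
    if not port_open():
        problems.append("MongoDB is not reachable on localhost:27017")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Running the script prints one line per unmet requirement and prints nothing when everything is in place.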

Rules

  • In /conf/application.conf, modify the allRules section with whatever regular expressions you want.

Scan a single branch example

  • Ensure you are in the .scripts directory: cd .scripts
  • Run: ./rescan_repo.sh leak-detection scan-progress-file
    • Where leak-detection is the repository you wish to scan and scan-progress-file is the file that saves the progress of the scan.

Scanning all branches in all repositories

  • Ensure you are in the .scripts directory: cd .scripts
  • Create a plain text file listing all repositories to scan, one repository per line.
    • Example file leak_test_list:
        leak-detection
        cds-file-upload-frontend
  • Run: ./rescan_all.sh leak_test_list scan-progress-file
    • Where leak_test_list is the name of the file with the list of repositories to scan and scan-progress-file is the file that saves the progress of the scan.
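The list file can also be generated programmatically. The sketch below writes one repository per line and builds the matching rescan_all.sh command without running it. The helper names are hypothetical, and dropping blank and duplicate entries is an assumption about what makes a sensible list, not a documented requirement of the script.

```python
from pathlib import Path


def write_repo_list(list_path, repositories):
    """Write repository names to list_path, one per line.

    Blank entries and duplicates are dropped, preserving first-seen order.
    Returns the names actually written.
    """
    seen = []
    for name in repositories:
        name = name.strip()
        if name and name not in seen:
            seen.append(name)
    Path(list_path).write_text("".join(f"{name}\n" for name in seen))
    return seen


def rescan_all_command(list_name, progress_name):
    """Build the argument list for the rescan script (run from .scripts)."""
    return ["./rescan_all.sh", list_name, progress_name]
```

For example, write_repo_list("leak_test_list", ["leak-detection", "cds-file-upload-frontend"]) reproduces the example file above, and rescan_all_command("leak_test_list", "scan-progress-file") yields the arguments that could be passed to subprocess.run from inside the .scripts directory.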

License

This code is open source software licensed under the Apache 2.0 License.
