Welcome to open source secret scanner

This project measures how much a modified TruffleHog secret scanner improves on the open-source TruffleHog scanner and GitHub secret scanning.

How to run the entire application

Run the Jupyter notebook to compare the results of the three scanners.
- The datasets are included in the project directory, so the notebook can be run as-is to reproduce the comparison.

How to run the data collection infrastructure: use the commands below

This application consists of the following modules.

Frontend -
- A simple UI portal where scans are queued.
- Both private and public repositories can be configured in this UI.
- A public repository does not need a personal access token.
- For a private repository, the user must create a personal access token in the respective SCM to schedule the scan.
Scan migrate -
- A database migration Docker image that creates the necessary tables in the target database.
Scan service -
- A simple microservice written in Go.
- It exposes a REST endpoint, /api/v1/scan. When the frontend calls this endpoint, the scan is inserted into the scan_requests table.
- After inserting the record, scan-service pushes the data to a Kafka topic named scan-requested.
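
The insert-then-publish flow of scan-service can be sketched as below. This is a minimal illustration under stated assumptions, not the project's actual code: the request fields, the POST method, the 202 response, and the InsertFunc and PublishFunc types are all hypothetical stand-ins for the real database and Kafka clients. Only the endpoint path, the scan_requests table, and the scan-requested topic come from the description above.

```go
package main

import (
	"encoding/json"
	"errors"
	"io"
	"net/http"
)

// ScanRequest is a hypothetical request body; the real field names are not
// documented in this README.
type ScanRequest struct {
	RepoURL string `json:"repo_url"`
	Private bool   `json:"private"`
}

// scanMessage is the assumed payload published to the Kafka topic.
type scanMessage struct {
	ID      int64  `json:"id"`
	RepoURL string `json:"repo_url"`
	Private bool   `json:"private"`
}

// InsertFunc stands in for the INSERT into scan_requests; it returns the new row ID.
type InsertFunc func(req ScanRequest) (int64, error)

// PublishFunc stands in for a Kafka producer.
type PublishFunc func(topic string, payload []byte) error

const scanRequestedTopic = "scan-requested"

// ErrInvalidRequest marks a body that could not be parsed or lacks a repo URL.
var ErrInvalidRequest = errors.New("invalid scan request")

// EnqueueScan validates the body, stores the request, then publishes it.
// The publish happens only after the insert succeeded.
func EnqueueScan(body []byte, insert InsertFunc, publish PublishFunc) (int64, error) {
	var req ScanRequest
	if err := json.Unmarshal(body, &req); err != nil || req.RepoURL == "" {
		return 0, ErrInvalidRequest
	}
	id, err := insert(req)
	if err != nil {
		return 0, err
	}
	msg, err := json.Marshal(scanMessage{ID: id, RepoURL: req.RepoURL, Private: req.Private})
	if err != nil {
		return 0, err
	}
	if err := publish(scanRequestedTopic, msg); err != nil {
		return 0, err
	}
	return id, nil
}

// ScanHandler adapts EnqueueScan to HTTP (POST is an assumption).
func ScanHandler(insert InsertFunc, publish PublishFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		id, err := EnqueueScan(body, insert, publish)
		if errors.Is(err, ErrInvalidRequest) {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if err != nil {
			http.Error(w, "could not queue scan", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusAccepted)
		_ = json.NewEncoder(w).Encode(map[string]int64{"id": id})
	}
}
```

The handler could be registered with http.HandleFunc("/api/v1/scan", ScanHandler(insert, publish)). Keeping the logic in EnqueueScan, separate from the HTTP layer, makes it testable without a running database or broker.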
Scan queue processor -
- A simple microservice written in Go.
- It consumes data from the Kafka topic scan-requested.
- For each message, it calls the GitHub REST API to fetch metadata for the requested repo, such as size and branches.
- Based on the size, it creates a persistent volume, a persistent volume claim, and a pod in the given Kubernetes cluster.
- Once the resources are allocated, truffle-scanner scans the repo described in the pod's environment variables.
- It then changes the status in the database from queued to processed.
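
As an illustration of sizing the volume from the repo size, the sketch below turns a repository size into a Kubernetes storage quantity string. It assumes the size comes from the size field of GitHub's repository API, which is reported in kilobytes. The function name, the 3x headroom multiplier, and the 1Gi minimum are hypothetical choices for this example; the README does not state the actual sizing rule.

```go
package main

import "fmt"

// VolumeSizeFor returns a storage request such as "2Gi" for a repository of
// the given size in kilobytes. The headroom and minimum are assumed values.
func VolumeSizeFor(repoSizeKB int64) string {
	const (
		kbPerGi  = 1024 * 1024 // kilobytes in one gibibyte
		headroom = 3           // assumed room for the working tree and scan output
		minGi    = 1           // assumed smallest volume to request
	)
	neededKB := repoSizeKB * headroom
	// Round up to whole gibibytes so the volume is never smaller than needed.
	gi := (neededKB + kbPerGi - 1) / kbPerGi
	if gi < minGi {
		gi = minGi
	}
	return fmt.Sprintf("%dGi", gi)
}
```

The returned string would go into the storage request of the persistent volume claim that the processor creates before starting the scanner pod.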
Truffle scanner -
- A simple microservice written in Go.
- It reads the required details from environment variables.
- If the repo is private, it clones the repo into a directory with the git CLI and runs the trufflehog executable against the cloned repo.
- If the repo is public, it uses the git schema runner from trufflehog.
- Once the scan is complete, it parses the output and writes the results to the database in batches.
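
Writing results in batches can be sketched as follows. This is only an illustration: the Finding type and its fields are hypothetical, and the write callback stands in for whatever multi-row insert the scanner really performs.

```go
package main

import (
	"errors"
	"fmt"
)

// Finding is a hypothetical parsed result; the real schema is not given here.
type Finding struct {
	Detector string
	File     string
	Line     int
}

// WriteInBatches splits findings into slices of at most batchSize items and
// passes each slice to write, stopping at the first error.
func WriteInBatches(findings []Finding, batchSize int, write func([]Finding) error) error {
	if batchSize <= 0 {
		return errors.New("batchSize must be positive")
	}
	for start := 0; start < len(findings); start += batchSize {
		end := start + batchSize
		if end > len(findings) {
			end = len(findings)
		}
		if err := write(findings[start:end]); err != nil {
			return fmt.Errorf("writing batch starting at index %d: %w", start, err)
		}
	}
	return nil
}
```

Batching keeps the number of database round trips proportional to the number of batches rather than the number of findings, which matters for repositories that produce many results.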
github-ss-etl -
- A simple tool written in Go.
- It uses the GitHub Go SDK to pull secret scanning data from a given repository.

Running every service individually takes considerable effort, so we run them all under Docker instead.

Prerequisites:

Install Docker on your machine.
Create a PostgreSQL database in Azure and make it reachable over the public network.
Create an Azure Key Vault to securely store the secrets the services need.
- Create an Azure subscription.
- Create a key vault.
- Create an app registration with a client ID and client secret.
- Under Access Control in the Azure portal, grant the registered app the necessary roles, such as Owner, Contributor, Key Vault Contributor, Key Vault Crypto Officer, and Key Vault Secrets User.
Create the following secrets.
- DB-PASSWORD - the database password, fetched by all microservices.
- GITHUB-API-TOKEN - a GitHub API token used to check the metadata of public GitHub repos.
- SS-TOKEN - a token used to fetch secret scanning data for the repos.
Since the runner requires Kubernetes to run the scanner pod, use minikube to get a free local Kubernetes cluster.
- Update the values in global-secrets.yaml according to your configuration, then create the necessary resources in Kubernetes:
- kubectl create namespace truffle-scanner
- kubectl -n truffle-scanner apply -f global-secret.yaml
Copy the Kubernetes config into the config folder.
- The Kubernetes config is in your user home folder: C:/Users//.kube/config on Windows, ~/.kube/config on Linux.
Provide all necessary values in the .env file.
Run docker compose up -d zookeeper kafka scanner-db
Run docker compose up -d scan-migrate scan-service scan-queue-processor opss
Open http://localhost:3000 and schedule a scan.
Enter the values and click Submit.
Run the command below to see the pods running in Kubernetes; they automatically clone the repo, run the scan, and upload the results to the database.
- kubectl -n truffle-scanner get pods

Note: If any of the steps above look too complicated to run, ask the repo owner for their credentials and cloud specifications for demo purposes.
