A wrapper around muffet to automate broken link checking for HashiCorp properties.
Create a GitHub workflow file in your repository at /.github/workflows/broken-link-checker.yaml.
name: Broken Link Checker
on: [push]
jobs:
  check_links:
    runs-on: ubuntu-16.04
    steps:
      - uses: SFDigitalServices/wait-for-deployment-action@v2
        id: deployment
        with:
          timeout: 600
          github-token: ${{ github.token }}
          environment: Preview
      - name: Install Go
        uses: actions/setup-go@v2
        with:
          go-version: 1.15.x
      - name: Install Muffet
        run: GO111MODULE=on go get -u github.com/raviqqe/muffet/v2
      - name: Install Broken Link Checker
        run: GO111MODULE=on go get -u github.com/hashicorp/broken-link-checker
      - name: Run
        run: broken-link-checker ${{ steps.deployment.outputs.url }}
        env:
          VERBOSE: true
          MAX_CONNECTIONS: 5
          TIMEOUT_SECONDS: 10
          EXCLUSIONS: linkedin.com,facebook.com
The Run step at the bottom of the file above sets four environment variables that you can adjust: VERBOSE, MAX_CONNECTIONS, TIMEOUT_SECONDS, and EXCLUSIONS.
VERBOSE: When enabled, echoes every error from Muffet that the filterErrors function deems dubious.
That function filters out errors that would otherwise fail the GitHub workflow but are not indicative of an actual problem.
MAX_CONNECTIONS: Caps the number of connections. It is exposed as an environment variable so that you can limit the number of 429 (Too Many Requests) errors.
The higher the value, the faster the check runs, but the more likely servers are to respond with 429s.
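To illustrate the trade-off behind MAX_CONNECTIONS, here is a minimal Go sketch of a connection cap built on a buffered channel used as a semaphore. It is not taken from Muffet or from this wrapper. The names checkAll and check are hypothetical, and the stand-in check function never touches the network.

```go
package main

import (
	"fmt"
	"sync"
)

// checkAll is a hypothetical helper: it calls check for every URL while
// allowing at most maxConnections calls to run at the same time.
// Results are returned in input order.
func checkAll(urls []string, maxConnections int, check func(string) error) []error {
	if maxConnections < 1 {
		maxConnections = 1
	}
	results := make([]error, len(urls))
	sem := make(chan struct{}, maxConnections) // counting semaphore
	var wg sync.WaitGroup

	for i, u := range urls {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConnections checks are in flight
		go func(i int, u string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the check ends
			results[i] = check(u)
		}(i, u)
	}
	wg.Wait()
	return results
}

func main() {
	urls := []string{
		"https://example.com/a",
		"https://example.com/b",
		"https://example.com/c",
	}
	// Stand-in check that always succeeds; a real one would issue a request.
	errs := checkAll(urls, 2, func(u string) error { return nil })
	fmt.Println(len(errs), "links checked")
}
```

A larger cap lets more checks overlap, which shortens the run but sends more simultaneous requests to each server, and that is what tends to provoke 429 responses.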
TIMEOUT_SECONDS: The maximum time, in seconds, that a request can take before it is canceled.
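As a rough sketch of how a value like TIMEOUT_SECONDS could become a per-request timeout in Go, the example below parses the string and sets it on a standard http.Client. The helper parseTimeout and the 10-second fallback are assumptions made for illustration (the fallback simply mirrors the workflow example), not the wrapper's actual behavior.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"strconv"
	"time"
)

// parseTimeout is a hypothetical helper: it converts a value such as "10"
// into a duration in seconds and returns def when raw is empty, not a
// number, or not positive.
func parseTimeout(raw string, def time.Duration) time.Duration {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return def
	}
	return time.Duration(n) * time.Second
}

func main() {
	// Assumed fallback of 10 seconds, matching the workflow example.
	timeout := parseTimeout(os.Getenv("TIMEOUT_SECONDS"), 10*time.Second)

	// http.Client.Timeout covers the whole request, including reading the
	// response body; the request is canceled once it elapses.
	client := &http.Client{Timeout: timeout}
	fmt.Println("request timeout:", client.Timeout)
}
```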
EXCLUSIONS: A comma-separated list of domains whose links are not checked.
This is helpful for sites like LinkedIn, which detect that the request comes from a bot and return an error.
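The following Go sketch shows one way a comma-separated EXCLUSIONS value could be applied to a link. It assumes that an exclusion matches the exact domain and any of its subdomains; the wrapper and Muffet may match differently (for example by pattern), and the helpers parseExclusions and isExcluded are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseExclusions splits a comma-separated value into lowercase domains,
// trimming spaces and dropping empty entries.
func parseExclusions(raw string) []string {
	var domains []string
	for _, part := range strings.Split(raw, ",") {
		if d := strings.ToLower(strings.TrimSpace(part)); d != "" {
			domains = append(domains, d)
		}
	}
	return domains
}

// isExcluded reports whether the link's host equals an excluded domain or
// is a subdomain of one. Links that fail to parse are never excluded.
func isExcluded(link string, domains []string) bool {
	u, err := url.Parse(link)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	for _, d := range domains {
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}

func main() {
	domains := parseExclusions("linkedin.com,facebook.com")
	links := []string{
		"https://www.linkedin.com/company/hashicorp",
		"https://www.hashicorp.com/",
	}
	for _, link := range links {
		fmt.Println(link, "excluded:", isExcluded(link, domains))
	}
}
```

Matching on the parsed hostname rather than on the raw URL string avoids skipping an unrelated site that merely mentions an excluded domain in its path or query string.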