praetorian-inc / gato
GitHub Actions Pipeline Enumeration and Attack Tool
License: Apache License 2.0
Our integration tests surfaced this issue: the subsequent clone operation fails if the GitHub fork API call does not produce a forked repository within a few seconds.
To fix this, poll every 15 seconds until the forked repository is visible, and only then perform the clone.
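A minimal sketch of that polling fix. The `session` object is assumed to be anything `requests`-like with the token already set; the function name and timeout values are illustrative, not Gato's actual implementation:

```python
import time

def wait_for_fork(session, owner, repo, timeout=180, interval=15):
    """Poll the GitHub repos API until the fork is visible, then report success.

    `session` is any requests-style object; defaults are illustrative.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # The fork exists once GET /repos/{owner}/{repo} returns 200.
        resp = session.get(f"https://api.github.com/repos/{owner}/{repo}")
        if resp.status_code == 200:
            return True
        time.sleep(interval)
    return False
```

Only once this returns `True` should the clone be attempted; a `False` return means the fork never materialized within the timeout.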
Gato has a lot of CLI parameters. Currently, these are passed as arguments to each module object (e.g. enumeration, attack). It would be good to move them into a configuration object that is passed to the enumeration class, to prevent parameter bloat.
This will also make it easier to add more parameters, and perhaps enumeration profiles for public repositories. I'm envisioning something like nmap's profiles.
Each profile would wrap a commonly used set of parameters (for example, look for non-ephemeral self-hosted runners in public repositories and focus on run-log analysis, or skip runner enumeration and focus on secrets and repository permissions).
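One way this could look, as a hedged sketch: a frozen dataclass carrying the parameters, plus a dict of named profiles. All field and profile names below are hypothetical, not Gato's actual options:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EnumerationConfig:
    """Hypothetical parameter object replacing long argument lists."""
    skip_runner_enum: bool = False
    skip_log_analysis: bool = False
    check_secrets: bool = True
    output_json: Optional[str] = None

# nmap-style profiles wrapping commonly used parameter sets
PROFILES = {
    # hunt non-ephemeral self-hosted runners via run-log analysis
    "public-runner-hunt": EnumerationConfig(check_secrets=False),
    # skip runner enumeration, focus on secrets and repo permissions
    "secrets-and-perms": EnumerationConfig(skip_runner_enum=True,
                                           skip_log_analysis=True),
}
```

The enumeration class would then take a single `EnumerationConfig` instead of a growing list of keyword arguments, and `--profile` on the CLI could map straight into `PROFILES`.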
Currently, the attack features (fork PR, push exec, and push exfil) only add a new workflow. If other workflows also trigger on push, they will execute in addition to the malicious workflow. This can have unintended effects depending on what those workflows do.
This could be solved by removing all other files within the `.github/workflows` directory before adding the new one, but since Gato uses the GitHub contents API, it can only change one file at a time, and deleting files individually would also trigger events.
The solution here is to use the Git database APIs (https://docs.github.com/en/rest/git/blobs?apiVersion=2022-11-28, etc.) to create a single commit that removes every other file within the `.github/workflows` directory and adds the malicious workflow. This will significantly increase the stability and opsec profile of Gato's attack features.
The gato attack features currently push changes using git commands, so Gato will not work against organizations that require SSH certificate authentication for clone and push operations. It is possible to use the GitHub API to create a new branch, commit to that branch, and then delete it, which bypasses the SSH authentication requirement entirely.
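A rough sketch of that API-only commit flow (blob → tree → commit → ref update), with no `git` binary or SSH involved. The endpoint paths are the documented Git database REST API; the `session` object, function name, and single-file tree are illustrative assumptions:

```python
import base64

API = "https://api.github.com/repos/{owner}/{repo}"

def api_commit(session, owner, repo, branch, path, content, message="update"):
    """Create a commit on `branch` adding/updating `path` purely via the REST API."""
    base = API.format(owner=owner, repo=repo)
    # 1. Resolve the branch head and its tree.
    head = session.get(f"{base}/git/ref/heads/{branch}").json()["object"]["sha"]
    base_tree = session.get(f"{base}/git/commits/{head}").json()["tree"]["sha"]
    # 2. Upload the file content as a blob.
    blob = session.post(f"{base}/git/blobs", json={
        "content": base64.b64encode(content.encode()).decode(),
        "encoding": "base64",
    }).json()["sha"]
    # 3. Build a new tree on top of the old one containing the file.
    tree = session.post(f"{base}/git/trees", json={
        "base_tree": base_tree,
        "tree": [{"path": path, "mode": "100644", "type": "blob", "sha": blob}],
    }).json()["sha"]
    # 4. Create the commit and fast-forward the branch ref to it.
    commit = session.post(f"{base}/git/commits", json={
        "message": message, "tree": tree, "parents": [head],
    }).json()["sha"]
    session.patch(f"{base}/git/refs/heads/{branch}", json={"sha": commit})
    return commit
```

The same sequence against a freshly created ref (then deleting the ref afterwards) would cover the branch/commit/delete flow described above.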
In an organization with both organization and repository secrets, Gato will fail to decrypt the final secrets blob for certain secret values. The stack trace is shown below:
2023-10-06T19:48:48.7558218Z ...cut blob...
[!] Decrypted and Decoded Secrets:
Traceback (most recent call last):
File "/gato/venv/bin/gato", line 8, in <module>
sys.exit(entry())
File "/gato/venv/lib/python3.7/site-packages/gato/main.py", line 6, in entry
sys.exit(cli.cli(sys.argv[1:]))
File "/gato/venv/lib/python3.7/site-packages/gato/cli/cli.py", line 83, in cli
arguments.func(arguments, subparsers)
File "/gato/venv/lib/python3.7/site-packages/gato/cli/cli.py", line 202, in attack
args.file_name
File "/gato/venv/lib/python3.7/site-packages/gato/attack/attack.py", line 562, in secrets_dump
padding.PKCS1v15()).decode()
File "/gato/venv/lib/python3.7/site-packages/cryptography/hazmat/backends/openssl/rsa.py", line 446, in decrypt
return _enc_dec_rsa(self._backend, self, ciphertext, padding)
File "/gato/venv/lib/python3.7/site-packages/cryptography/hazmat/backends/openssl/rsa.py", line 97, in _enc_dec_rsa
return _enc_dec_rsa_pkey_ctx(backend, key, data, padding_enum, padding)
File "/gato/venv/lib/python3.7/site-packages/cryptography/hazmat/backends/openssl/rsa.py", line 163, in _enc_dec_rsa_pkey_ctx
raise ValueError("Encryption/decryption failed.")
ValueError: Encryption/decryption failed.
Modifying the code to explicitly exclude the single "bad" secret works, indicating that the particular value is the problem. I originally thought the issue was with the `-e` switch in `echo`, but that's not the case.
For anyone dealing with this, you can modify `/attack/cicd_attack.py` line 89 and manually set the `secrets` list to only the values you care about.
If GitHub releases a new GitHub-hosted runner tag, Gato will report workflows that use this tag as "might execute on self-hosted runners" without context.
We should surface the tag in question directly to the user and provide that context. This will also allow us to identify new tags more quickly.
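A small sketch of what surfacing the tag could look like. The known-label set is an assumption that will drift as GitHub ships new hosted images, which is exactly the situation this issue describes:

```python
# Assumption: a best-effort snapshot of GitHub-hosted runner labels.
KNOWN_HOSTED = {
    "ubuntu-latest", "ubuntu-22.04", "ubuntu-20.04",
    "windows-latest", "windows-2022",
    "macos-latest", "macos-13", "macos-14",
}

def classify_runs_on(label: str) -> str:
    """Classify a runs-on label, surfacing unknown tags to the user."""
    if label in KNOWN_HOSTED:
        return "github-hosted"
    if "self-hosted" in label:
        return "self-hosted"
    # Report the literal tag so the user has context and new
    # GitHub-hosted tags can be identified quickly.
    return f"unrecognized label '{label}': might execute on self-hosted runners"
```

Printing the literal label rather than only the generic warning is the point: the user can judge whether `macos-15` is a new hosted image or a genuinely custom label.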
Occasionally, the self-hosted runner fails to join the org before the tests are run (whether it joins late or not at all needs investigation), leading to failed tests.
The GET request for run logs appears to be returning the following response after some time:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>
GitHub likely has an undocumented rate limit on their run-log download API. I will keep an eye on this issue. The mitigation might be to run the run-log download feature only after exhausting YAML-based analysis, to ensure we are not downloading excessive duplicate logs. For example, we can check the YAML associated with each run and confirm that it is associated with a self-hosted runner before pulling the run logs.
EDIT: This behavior only seems to manifest when proxying through Burp.
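The YAML pre-check described above could be as simple as the deliberately crude, text-level filter sketched below (no YAML parsing, so multi-line `runs-on` lists are missed; a real implementation would parse the document properly):

```python
def might_use_self_hosted(workflow_yaml: str) -> bool:
    """Cheap text-side pre-check before spending a run-log download:
    does any inline runs-on line mention self-hosted?

    Deliberately crude: misses block-style `runs-on:` lists.
    """
    for line in workflow_yaml.splitlines():
        stripped = line.strip()
        if stripped.startswith("runs-on:") and "self-hosted" in stripped:
            return True
    return False
```

Gating log downloads on a check like this keeps the request volume (and therefore the chance of hitting the apparent rate limit) down for large organizations.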
Line 55 in b347c55
While we likely need some sleep in this code block to throttle the frequency of requests, I'm not sure we need a one-minute sleep to enumerate even the simplest of organizations.
Looks like macos-latest is now using ARM images, which do not have ARM builds available for the Python versions we target.
We'll need to upgrade the unit test targets to Python versions with available ARM deployments.
https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json
Hi,
It would be great if there were a way to remove unnecessary output (the ASCII art) via a `--silent, -s` flag, and to strip terminal color codes from the output via `--no-color, -nc`. Thank you for the tool!
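A minimal sketch of the two requested flags with `argparse`, assuming the short option names from the request don't collide with Gato's existing flags:

```python
import argparse

parser = argparse.ArgumentParser(prog="gato")
parser.add_argument("-s", "--silent", action="store_true",
                    help="suppress the ASCII-art banner and other noise")
parser.add_argument("-nc", "--no-color", action="store_true",
                    help="disable ANSI color codes in output")

def colorize(text, code, no_color=False):
    """Wrap text in an ANSI color code unless colors are disabled."""
    return text if no_color else f"\033[{code}m{text}\033[0m"
```

Printing the banner would then be gated on `not args.silent`, and every color helper would thread `args.no_color` through, as `colorize` does.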
Currently, Gato's search functionality only prints results to the console.
Adding a feature to write repository slugs to a text file would be helpful, so the results can quickly be passed to Gato's enumeration functionality.
Implementation Suggestions:
Add a flag `-oT` that takes a path to a file and saves results to that file in addition to printing them out.
Example:
org1/Repo1
org1/Repo2
org2/Repo1
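The write side of such a `-oT` flag is tiny; a sketch (the function name is illustrative):

```python
def save_results(slugs, path):
    """Write repository slugs one per line, ready to feed back into enumeration."""
    with open(path, "w") as f:
        for slug in slugs:
            f.write(slug + "\n")
```

The output file then matches the example format above, so it can be consumed directly by the enumeration command.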
This issue tracks a mid-term goal of improving Gato's scanning flow to speed up enumeration.
Current State:
Gato takes a long time to run for organizations with hundreds or more repositories.
Challenges:
Run-log analysis requires downloading and parsing logs per repository, which makes `gato` without the skip run-logs flag useless against large organizations.
Possible Solutions:
There is no point checking the `.github/workflows` directory for a repository that does not have one. If Gato starts off with GraphQL queries to retrieve repositories and all metadata relevant to further checks, Gato can disable checks that we know will just come back empty.
Definition of Done: Enumerating a large organization with thousands of repositories is not painful.
The GH_TOKEN check doesn't account for older tokens that don't start with `ghp_` but are otherwise still valid.
Line 118 in 1fcde32
If only the `ghp_` format is supported, it would be helpful to have the error message indicate that.
Alternatively, the token is already validated in the `__setup_user_info` function, so simply checking that `GH_TOKEN` is set to something may be an option too.
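If a format check is kept at all, it could accept both shapes; a sketch, assuming the documented modern prefixes and the older 40-hex-character classic format (the regex details are a best-effort assumption, and the API-side validation remains the real check):

```python
import re

# Assumption: modern tokens use documented prefixes (ghp_, gho_, ghu_,
# ghs_, ghr_, github_pat_); older classic tokens were 40 hex characters.
TOKEN_RE = re.compile(
    r"^(gh[pousr]_[A-Za-z0-9]{30,}|github_pat_[A-Za-z0-9_]{30,}|[0-9a-f]{40})$"
)

def plausible_token(token: str) -> bool:
    """Loose sanity check; the /user API call stays the authoritative test."""
    return bool(TOKEN_RE.fullmatch(token))
```

On failure, the error message can then say exactly which formats were expected, addressing the confusing-error part of this issue.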
Currently, the workflow attack supports monitoring job execution, downloading logs, and cleaning up. The PR attack method does not. We should generalize this post-attack flow and achieve parity between the two attack methods.
The workflow run logs that Gato already parses contain metadata such as the `GITHUB_TOKEN` permissions. Gato should capture this information, save it to the JSON output, and print the more important results.
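A sketch of pulling the token permissions out of a run log. Actions logs print a `GITHUB_TOKEN Permissions` group, but the exact surrounding format (timestamps, group markers) varies, so treat this as an assumption-laden illustration:

```python
import re

def extract_token_permissions(log_text: str) -> dict:
    """Best-effort parse of the GITHUB_TOKEN Permissions block in a run log.

    Assumes a `GITHUB_TOKEN Permissions ... ##[endgroup]` region with
    `Scope: level` lines inside; real logs may add timestamps per line.
    """
    perms = {}
    m = re.search(r"GITHUB_TOKEN Permissions(.*?)##\[endgroup\]", log_text, re.S)
    if m:
        for line in m.group(1).splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                perms[key.strip()] = value.strip()
    return perms
```

The resulting dict slots naturally into the JSON output proposed elsewhere in this tracker, and write-capable scopes (e.g. `Contents: write`) are the ones worth printing prominently.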
Hello, this looks like an awesome tool!
Is it possible to add a cli flag for github api url? That would enable support for GitHub Enterprise Server :)
Cheers,
emil
The enumeration module currently writes all output to the console.
It would be helpful if the tool provided the option to save the enumeration output to JSON. The JSON output should contain all of the output from the console. This feature should be implemented as an `--output-json` or `-oJ` flag.
From a design perspective, the JSON object should be created when the enumeration object is made and then updated as the enumeration process executes. The JSON's structure should be the same whether the self, org, or repo enumeration types are utilized.
The JSON should have the following high-level schema:
{
  "user": {
    ... USER INFO HERE ...
  },
  "organizations": [
    {
      "org_info": {
        "runners": [ ... org-level runners, if any ... ]
      },
      "repositories": [
        { "name": "repo1", "runners": [ ... repo runners, if any ... ] },
        { "name": "repo2", "runners": [] }
      ]
    }
  ]
}
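The create-then-update design described above could be sketched like this: the report object is built when enumeration starts, mutated as results arrive, and serialized once at the end. Class and method names are hypothetical:

```python
import json

class EnumerationReport:
    """Hypothetical incremental report matching the schema above."""

    def __init__(self, user_info):
        # Created when the enumeration object is made...
        self.data = {"user": user_info, "organizations": []}

    def add_org(self, name):
        # ...and updated as the enumeration process executes.
        org = {"org_info": {"name": name, "runners": []}, "repositories": []}
        self.data["organizations"].append(org)
        return org

    def save(self, path):
        with open(path, "w") as f:
            json.dump(self.data, f, indent=2)
```

Because self, org, and repo enumeration all write into the same structure, the JSON shape stays identical regardless of which entry point was used.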
Hi all, when running the search via the local Burp proxy on 8080 I am getting [Errno 111] 'Connection refused' - has anybody managed to run gato via proxy successfully?
$ gato -p localhost:8080 search --target sshayb
...
Traceback (most recent call last):
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
sock = connection.create_connection(
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
...
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/shay/git/gato/venv/bin/gato", line 8, in <module>
sys.exit(entry())
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/main.py", line 6, in entry
sys.exit(cli.cli(sys.argv[1:]))
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 83, in cli
arguments.func(arguments, subparsers)
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 242, in enumerate
orgs = [gh_enumeration_runner.enumerate_organization(
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/enumerate/enumerate.py", line 146, in enumerate_organization
if not self.__setup_user_info():
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/enumerate/enumerate.py", line 64, in __setup_user_info
self.user_perms = self.api.check_user()
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/github/api.py", line 620, in check_user
result = self.call_get('/user')
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/github/api.py", line 248, in call_get
api_response = requests.get(request_url, headers=get_header,
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/adapters.py", line 513, in send
raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='api.github.com', port=443): Max retries exceeded with url: /user (Caused by ProxyError('Unable to connect to proxy', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f23cd2e6170>: Failed to establish a new connection: [Errno 111] Connection refused')))
I ran gato against a target repo with a third-party PAT (i.e. a personal one) and it worked fine for enumeration. I then used a classic PAT for my work GitHub account, and when I tried to enumerate, it threw a NoneType error complaining about the object not being subscriptable.
Additional Info:
gato -s e -t XXXXX --output-yaml . --output-json XXXXX.json
[+] The authenticated user is: XXXX
[!] The token has no scopes!
[+] Enumerating the XXXXX organization!
[!] The user has only public access!
[+] About to enumerate 379 repos within the XXXXX organization!
[+] Querying and caching workflow YAML files!
Traceback (most recent call last):
File "/root/gato/venv/bin/gato", line 8, in <module>
sys.exit(entry())
File "/root/gato/venv/lib/python3.10/site-packages/gato/main.py", line 6, in entry
sys.exit(cli.cli(sys.argv[1:]))
File "/root/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 83, in cli
arguments.func(arguments, subparsers)
File "/root/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 242, in enumerate
orgs = [gh_enumeration_runner.enumerate_organization(
File "/root/gato/venv/lib/python3.10/site-packages/gato/enumerate/enumerate.py", line 184, in enumerate_organization
self.repo_e.construct_workflow_cache(result.json()['data']['nodes'])
File "/root/gato/venv/lib/python3.10/site-packages/gato/enumerate/repository.py", line 190, in construct_workflow_cache
owner = result['nameWithOwner']
TypeError: 'NoneType' object is not subscriptable
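The crash happens because the GraphQL `nodes` array can contain null entries (e.g. repositories the token cannot fully access), and the cache builder subscripts each entry unconditionally. A simplified sketch of the kind of guard that would fix it (not Gato's actual `construct_workflow_cache` body):

```python
def construct_workflow_cache(nodes):
    """Build a repo-keyed cache, skipping null GraphQL nodes.

    GraphQL node lists may contain None entries for inaccessible repos,
    so guard before subscripting.
    """
    cache = {}
    for result in nodes:
        if result is None:
            continue  # avoids TypeError: 'NoneType' object is not subscriptable
        cache[result["nameWithOwner"]] = result
    return cache
```

With the guard in place, enumeration with a low-privilege organization token degrades gracefully instead of crashing partway through.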
There are lots of checks that could be performed on a repository that involve additional API queries, and we probably only want to run them after we've identified a repository of interest, for example checking whether `pull_request` workflows run without requiring approval.
Essentially, learn as much as possible from what is publicly accessible. This could be a flag that only works with single-repository enumeration.