gato's People

Contributors

adnanekhan, ds-koolaid, jimmyscchang, jstawinski, mas0nd, rtrompier

gato's Issues

Issues with Fork PR attack when fork takes time

Our integration tests pointed out this issue. The subsequent clone operation will fail if the GitHub fork API call does not lead to a forked repo within a few seconds.

To fix this, check every 15 seconds if the forked repository is visible, and only perform the clone at that point.
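
A minimal sketch of that polling loop, assuming a hypothetical `fork_exists` callable that returns True once the fork is visible (for example, by wrapping a GET on the forked repository). The function name, the timeout value, and the injectable `sleep` parameter are illustrative and not part of Gato.

```python
import time
from typing import Callable


def wait_for_fork(fork_exists: Callable[[], bool],
                  interval: float = 15.0,
                  timeout: float = 120.0,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until fork_exists() returns True or the timeout elapses."""
    waited = 0.0
    while True:
        if fork_exists():
            return True          # safe to clone now
        if waited >= timeout:
            return False         # give up; the caller should skip the clone
        sleep(interval)
        waited += interval
```

The clone would then run only when `wait_for_fork(...)` returns True. Passing `sleep` as a parameter keeps the loop testable without real delays.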

Improve CLI Parameter Handling

Gato has a lot of CLI parameters. Currently, these are passed as arguments to each module object (e.g. enumeration, attack). It would be good to move these to a configuration object that is passed to the enumeration class in order to prevent parameter bloat.

This will help facilitate adding more parameters and perhaps enumeration profiles for public repositories. I'm envisioning something like nmap's profiles.

The profiles will wrap a commonly used set of parameters (for example, look for non-ephemeral self-hosted runners in public repositories and focus on run-log analysis, or skip runner enumeration and focus on secrets and repository permissions).
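
One way this could look, as a sketch only: a frozen dataclass holds the options, and a profile is a named preset that individual flags can still override. The class name, field names, and profile names below are hypothetical and do not reflect Gato's actual parameters.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EnumerationConfig:
    """Hypothetical container for options now passed as separate arguments."""
    skip_runners: bool = False
    skip_run_logs: bool = False
    output_json: Optional[str] = None


# Illustrative presets matching the two examples in the issue text.
PROFILES = {
    "runners": EnumerationConfig(),
    "secrets": EnumerationConfig(skip_runners=True, skip_run_logs=True),
}


def load_profile(name: str, **overrides) -> EnumerationConfig:
    """Return a preset, with any explicitly passed options taking precedence."""
    return replace(PROFILES[name], **overrides)
```

A module would then receive a single `EnumerationConfig` instead of a growing argument list.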

Remove other Workflow files when executing Attack module

Currently, the attack feature (fork PR, push exec and push exfil) only adds a new workflow. If other workflows also run on push they will execute in addition to the malicious workflow. This can have unintended effects depending on what the different workflows do.

This can be solved by removing all other files within the .github/workflows directory before adding the new one. However, Gato uses the GitHub contents API, which only allows changing one file at a time, and deleting the files individually would itself trigger events.

The solution here is to use the Git database APIs (https://docs.github.com/en/rest/git/blobs?apiVersion=2022-11-28, etc.) to create a new commit that:

  • WITHIN THE NEW BRANCH removes all files in the .github/workflows directory.
  • Adds the malicious file.
  • Creates and pushes a single commit to the branch.

This will significantly increase the stability and opsec profile of Gato's attack features.
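
As a sketch of the tree-building step: in GitHub's "Create a tree" endpoint, an entry whose `sha` is null deletes that path when a `base_tree` is supplied. The helper below only builds the entry list; its name and parameters are hypothetical, and the blob SHA is assumed to come from an earlier "Create a blob" call.

```python
from typing import Dict, List, Optional


def build_tree_entries(existing_paths: List[str],
                       new_path: str,
                       new_blob_sha: str) -> List[Dict[str, Optional[str]]]:
    """Build tree entries that drop other workflow files and add one new file."""
    entries: List[Dict[str, Optional[str]]] = []
    for path in existing_paths:
        if path.startswith(".github/workflows/") and path != new_path:
            # sha=None is serialized as JSON null, which marks the path for deletion.
            entries.append({"path": path, "mode": "100644",
                            "type": "blob", "sha": None})
    entries.append({"path": new_path, "mode": "100644",
                    "type": "blob", "sha": new_blob_sha})
    return entries
```

The list would be sent as the `tree` field, together with `base_tree`, to `POST /repos/{owner}/{repo}/git/trees`. Creating a commit from the resulting tree and updating the branch ref then produces a single commit on the branch.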

Verbose Error Logs

If the user does not have proper permissions to clone the GitHub repository, verbose error logs are generated when attempting to run the enumerate command.

Utilize GitHub API to perform commits

Gato's attack features currently push changes using git commands, so Gato will not work against organizations that require SSH certificate authentication for clone and push operations. It is possible to use the GitHub API to create a new branch, commit to that branch, and then delete it. This will bypass the SSH authentication requirement entirely.

Decryption Error with Certain Secret Values

In an organization with both organization and repository secrets, Gato will fail to decrypt the final secrets blob for certain secret values. The stack trace is shown below:

2023-10-06T19:48:48.7558218Z ...cut blob...

[!] Decrypted and Decoded Secrets:

Traceback (most recent call last):
  File "/gato/venv/bin/gato", line 8, in <module>
    sys.exit(entry())
  File "/gato/venv/lib/python3.7/site-packages/gato/main.py", line 6, in entry
    sys.exit(cli.cli(sys.argv[1:]))
  File "/gato/venv/lib/python3.7/site-packages/gato/cli/cli.py", line 83, in cli
    arguments.func(arguments, subparsers)
  File "/gato/venv/lib/python3.7/site-packages/gato/cli/cli.py", line 202, in attack
    args.file_name
  File "/gato/venv/lib/python3.7/site-packages/gato/attack/attack.py", line 562, in secrets_dump
    padding.PKCS1v15()).decode()
  File "/gato/venv/lib/python3.7/site-packages/cryptography/hazmat/backends/openssl/rsa.py", line 446, in decrypt
    return _enc_dec_rsa(self._backend, self, ciphertext, padding)
  File "/gato/venv/lib/python3.7/site-packages/cryptography/hazmat/backends/openssl/rsa.py", line 97, in _enc_dec_rsa
    return _enc_dec_rsa_pkey_ctx(backend, key, data, padding_enum, padding)
  File "/gato/venv/lib/python3.7/site-packages/cryptography/hazmat/backends/openssl/rsa.py", line 163, in _enc_dec_rsa_pkey_ctx
    raise ValueError("Encryption/decryption failed.")
ValueError: Encryption/decryption failed.

Modifying the code to explicitly exclude the single "bad" secret works, indicating that the particular value is the problem. I originally thought the issue was with the -e switch in echo, but that's not the case.

For anyone dealing with this, you can modify /attack/cicd_attack.py line 89 and manually set the secrets list to only the values you care about.

Potential False Positives due to Unknown Labels

If GitHub releases a new GitHub-hosted runner label, Gato will report workflows that use this label as "might execute on self-hosted runners" without context.

We should surface the label to the user directly and provide the context. This will also allow us to identify new labels more quickly.
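
A rough sketch of what surfacing the label could look like. The set of known labels is deliberately incomplete and purely illustrative, and the function name and message wording are assumptions rather than Gato's actual output.

```python
from typing import Iterable

# Illustrative subset only; GitHub publishes many more hosted-runner labels.
GITHUB_HOSTED_LABELS = {"ubuntu-latest", "windows-latest", "macos-latest"}


def describe_runs_on(labels: Iterable[str]) -> str:
    """Explain why a workflow was flagged, naming any unrecognized labels."""
    labels = list(labels)
    if "self-hosted" in labels:
        return "self-hosted runner requested explicitly"
    unknown = sorted(label for label in labels
                     if label not in GITHUB_HOSTED_LABELS)
    if unknown:
        return ("might execute on self-hosted runners "
                "(unrecognized labels: " + ", ".join(unknown) + ")")
    return "GitHub-hosted runner"
```

With the label in the message, a user can tell at a glance whether the finding is a new GitHub-hosted label or a genuine custom one.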

Nightly Integration Test Inconsistency

Occasionally, the self-hosted runner fails to join the org before the tests are run (it either joins late or not at all; investigation needed), leading to failed tests.

Runlog download error 400s

The GET request for run logs appears to be returning the following response after some time:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>

GitHub likely has an undocumented rate limit for its run-log download API. I will keep an eye on this issue. The mitigation might be to run the run-log download feature only after exhausting YAML-based analysis, to ensure we are not downloading excessive duplicate logs. For example, we can check the YAML associated with each run and confirm that it is associated with a self-hosted runner before pulling the run logs.

EDIT: This behavior only seems to manifest when proxying through Burp.

Unnecessary sleep

time.sleep(60)

While we likely need some sleep in this code block for the frequency of requests, I'm not sure we need a one minute sleep to enumerate even the simplest of organizations.
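
One possible alternative, assuming the sleep exists to stay under GitHub's REST API rate limit (the issue does not confirm this): wait only when the rate-limit response headers say the quota is exhausted. The function name is hypothetical; `X-RateLimit-Remaining` and `X-RateLimit-Reset` are the headers GitHub returns on REST responses.

```python
from typing import Mapping


def seconds_until_reset(headers: Mapping[str, str], now: float) -> float:
    """Return how long to sleep, based on GitHub's rate-limit headers.

    `headers` is expected to behave like requests' case-insensitive header
    mapping; a plain dict must use the exact key casing shown here.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    reset_epoch = int(headers.get("X-RateLimit-Reset", "0"))
    if remaining > 0:
        return 0.0                      # quota left, no need to wait
    return max(0.0, reset_epoch - now)  # wait until the window resets
```

A caller would pass `response.headers` and `time.time()` and sleep for the returned duration, which is zero for small organizations that never exhaust the quota.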

[enhancement] Silent and a No Color option

Hi,

It would be great if there was a way to suppress unnecessary output (such as the ASCII art) via a --silent, -s flag, and to disable terminal color codes in the output via a --no-color, -nc flag. Thank you for the tool!
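
A sketch of how the two requested flags might be wired up with argparse. The flag names come from the request above; the `colorize` helper and the parser layout are assumptions, not Gato's actual CLI code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser exposing only the two requested output flags."""
    parser = argparse.ArgumentParser(prog="gato")
    parser.add_argument("--silent", "-s", action="store_true",
                        help="suppress the banner and other non-essential output")
    parser.add_argument("--no-color", "-nc", action="store_true",
                        help="print without ANSI color codes")
    return parser


def colorize(text: str, code: str, enabled: bool) -> str:
    """Wrap text in an ANSI color sequence unless colors are disabled."""
    return f"\033[{code}m{text}\033[0m" if enabled else text
```

Output helpers would then call `colorize(msg, "32", enabled=not args.no_color)` and skip the banner when `args.silent` is set.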

Write Search Results to File

Currently, Gato's search functionality only prints results out.

Adding a feature to print repository slugs to a text file would be helpful so the results can quickly be passed to Gato's enumeration functionality.

Implementation Suggestions:

Add a flag -oT that takes a path to a file and saves results to the file in addition to printing them out.

Example:

org1/Repo1
org1/Repo2
org2/Repo1
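
A minimal sketch of the file-writing step, one slug per line as in the example above. The function name is hypothetical; the path would come from the proposed -oT flag.

```python
from pathlib import Path
from typing import Iterable


def write_slugs(slugs: Iterable[str], output_path: str) -> int:
    """Write one owner/repo slug per line and return how many were written."""
    lines = [f"{slug}\n" for slug in slugs]
    Path(output_path).write_text("".join(lines), encoding="utf-8")
    return len(lines)
```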

Speed Up Enumeration

This issue tracks a mid-term goal of improving Gato's scanning flow to speed up enumeration.

Current State:

Gato takes a long time to run for organizations with hundreds or more repositories.

Challenges:

  • Gato makes a lot of REST API calls. For large organizations, many of these calls do not return valuable information. This slows things down a lot.
  • The workflow run log download step is very slow. Gato short-circuits if it finds a self-hosted runner, but if there is none and the runs are very complex, Gato will download and extract zip files that are megabytes in size (currently 10 per repository). This makes running Gato without the skip run-logs flag useless against large organizations.

Possible Solutions:

  • Incremental output of results + save/resume functionality. Gato currently enumerates everything while storing results in memory. It then converts everything to JSON and writes it (if the JSON output flag is enabled). If we incrementally store progress, operators can pause/resume the enumeration of a large organization.
  • GraphQL initial filter pass: Many of Gato's API calls return 404s (such as a call to list repository contents for the .github/workflows directory of a repository that does not have one). If Gato starts off with GraphQL queries to retrieve repositories and all metadata relevant to further checks, Gato can disable checks that we know will just come back empty.
  • For organization enumeration, only download run logs if there is an affirmative self-hosted runner ID from workflow file analysis. We can increase the scanning depth for single repo mode and even disable short-circuiting to enumerate all accessible runners.

Definition of Done: Enumerating a large organization with thousands of repositories is not painful.
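
The save/resume idea could be sketched with an append-only JSON Lines file: each finished repository is written immediately, and a resumed run skips anything already recorded. The file layout and function names are assumptions for illustration only.

```python
import json
import os
from typing import Any, Dict, List, Set


def record_result(path: str, repo: str, result: Dict[str, Any]) -> None:
    """Append one repository's result as a single JSON line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps({"repo": repo, "result": result}) + "\n")


def load_completed(path: str) -> Set[str]:
    """Return the repositories already present in the progress file."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        return {json.loads(line)["repo"] for line in handle if line.strip()}


def pending_repos(all_repos: List[str], path: str) -> List[str]:
    """Filter out repositories that a previous run already enumerated."""
    done = load_completed(path)
    return [repo for repo in all_repos if repo not in done]
```

Because results leave memory as soon as they are written, this also avoids holding the whole organization's output until the end of the run.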

Incomplete GH_TOKEN check

The GH_TOKEN check doesn't account for older tokens that don't start with ghp_ but are otherwise still valid.

if "ghp_" not in gh_token:

If only the ghp_ format is supported, it would be helpful to have the error message indicate that.

Alternatively, since the token is already validated in the __setup_user_info function, simply checking that GH_TOKEN is set to something may be an option too.
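
If a format check is kept, a more permissive version might look like the sketch below. It accepts the prefixed formats GitHub issues (ghp_, gho_, ghu_, ghs_, ghr_, and github_pat_) as well as the older 40-character hexadecimal tokens. The function name is hypothetical, and the patterns are intentionally loose since the API call remains the real validation.

```python
import re

_TOKEN_PATTERNS = (
    re.compile(r"^gh[pousr]_[A-Za-z0-9]+$"),      # prefixed token formats
    re.compile(r"^github_pat_[A-Za-z0-9_]+$"),    # fine-grained PATs
    re.compile(r"^[0-9a-f]{40}$"),                # legacy tokens without a prefix
)


def looks_like_github_token(token: str) -> bool:
    """Loose plausibility check; the API call is still the real validation."""
    return any(pattern.match(token) for pattern in _TOKEN_PATTERNS)
```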

PR Attack Feature Parity

Currently, the workflow attack supports monitoring job execution, downloading logs, and cleaning up. This is not supported in the PR attack method. We should generalize this post-attack flow and achieve parity between the two attack methods.

Capture Additional Runner Metadata

The workflow run logs that Gato already parses contain metadata such as:

  • GITHUB_TOKEN permissions
  • Runner group name
  • Requested labels
  • Organization/Repository level runner

Gato should capture this information and save it to the JSON output / print more important results.

GHES support

Hello, this looks like an awesome tool!

Is it possible to add a cli flag for github api url? That would enable support for GitHub Enterprise Server :)

Cheers,

emil
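
A sketch of what such a flag could look like. The `--api-url` name and the helper are hypothetical. On GitHub Enterprise Server the REST API is served under `https://HOSTNAME/api/v3`, so that is the value a GHES user would pass.

```python
import argparse

DEFAULT_API_URL = "https://api.github.com"


def build_parser() -> argparse.ArgumentParser:
    """Parser with a hypothetical flag for overriding the API base URL."""
    parser = argparse.ArgumentParser(prog="gato")
    parser.add_argument("--api-url", default=DEFAULT_API_URL,
                        help="REST API base URL, e.g. https://HOSTNAME/api/v3")
    return parser


def endpoint(api_url: str, path: str) -> str:
    """Join the base URL and an API path without doubling slashes."""
    return api_url.rstrip("/") + "/" + path.lstrip("/")
```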

JSON Output For Enumeration Module

The enumeration module currently writes all output to the console.

It would be helpful if the tool provided the option to save the enumeration output to JSON. The JSON output should contain all of the output from the console. This feature should be implemented as an --output-json or -oJ flag.

From a design perspective, the JSON object should be created when the enumeration object is made and then updated as the enumeration process executes. The JSON's structure should be the same whether the self, org, or repo enumeration types are utilized.

The JSON should have the following high-level schema:

{
    "user": {
        .... USER INFO HERE
    },
    "organizations": [
        {
            "org_info": {
                "runners": [ ... If org level runners ]
            },
            "repositories": [
                {
                    "repo1": {
                        "runners": [ ...IF REPO RUNNERS ... ]
                    }
                },
                {
                    "repo2": {
                        "runners": []
                    }
                }
            ]
        }
    ]
}
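
A sketch of building that object incrementally, following the design note above that the structure is created up front and updated as enumeration proceeds. The helper names are hypothetical, and wrapping each organization and repository in its own object is an assumption about the intended layout.

```python
import json
from typing import Any, Dict, List


def new_report(user: Dict[str, Any]) -> Dict[str, Any]:
    """Create the top-level object once, when enumeration starts."""
    return {"user": user, "organizations": []}


def add_org(report: Dict[str, Any], runners: List[Any]) -> Dict[str, Any]:
    """Append an organization entry and return it for later updates."""
    org = {"org_info": {"runners": runners}, "repositories": []}
    report["organizations"].append(org)
    return org


def add_repo(org: Dict[str, Any], name: str, runners: List[Any]) -> None:
    """Record one repository's runners under its organization."""
    org["repositories"].append({name: {"runners": runners}})


def to_json(report: Dict[str, Any]) -> str:
    return json.dumps(report, indent=4)
```

For self or single-repository enumeration, the same shape would hold with one organization entry, which keeps the structure identical across modes.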

Connection refused via proxy

Hi all, when running the search via the local Burp proxy on 8080 I am getting [Errno 111] 'Connection refused' - has anybody managed to run gato via proxy successfully?

$ gato -p localhost:8080 search --target sshayb
...
Traceback (most recent call last):
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

...
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/shay/git/gato/venv/bin/gato", line 8, in <module>
    sys.exit(entry())
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/main.py", line 6, in entry
    sys.exit(cli.cli(sys.argv[1:]))
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 83, in cli
    arguments.func(arguments, subparsers)
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 242, in enumerate
    orgs = [gh_enumeration_runner.enumerate_organization(
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/enumerate/enumerate.py", line 146, in enumerate_organization
    if not self.__setup_user_info():
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/enumerate/enumerate.py", line 64, in __setup_user_info
    self.user_perms = self.api.check_user()
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/github/api.py", line 620, in check_user
    result = self.call_get('/user')
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/gato/github/api.py", line 248, in call_get
    api_response = requests.get(request_url, headers=get_header,
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/shay/git/gato/venv/lib/python3.10/site-packages/requests/adapters.py", line 513, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='api.github.com', port=443): Max retries exceeded with url: /user (Caused by ProxyError('Unable to connect to proxy', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f23cd2e6170>: Failed to establish a new connection: [Errno 111] Connection refused')))
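
Errno 111 means the TCP connection to the proxy address itself was refused, so the request never reached GitHub. A quick way to confirm whether anything is listening on the proxy port is a plain socket check like the sketch below (the function name is illustrative):

```python
import socket


def proxy_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the proxy address succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and name resolution failures.
        return False
```

If `proxy_reachable("localhost", 8080)` returns False from the same shell, it is worth checking that the Burp listener is running and bound to the interface that `localhost` resolves to.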

Crash w/ Stack Trace and None Type Error when Company PAT vs Personal PAT

I ran gato against a target repo with a 3rd party PAT (i.e. personal one) and it worked fine for enumeration. I then used a PAT for my work (classic PAT like before) GH account and when I tried to enumerate, it throws a NoneType error complaining about it not being subscriptable.

Additional Info:

gato -s e -t XXXXX --output-yaml . --output-json XXXXX.json
[+] The authenticated user is: XXXX
[!] The token has no scopes!
[+] Enumerating the XXXXX organization!
[!] The user has only public access!
[+] About to enumerate 379 repos within the XXXXX organization!
[+] Querying and caching workflow YAML files!
Traceback (most recent call last):
  File "/root/gato/venv/bin/gato", line 8, in <module>
    sys.exit(entry())
  File "/root/gato/venv/lib/python3.10/site-packages/gato/main.py", line 6, in entry
    sys.exit(cli.cli(sys.argv[1:]))
  File "/root/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 83, in cli
    arguments.func(arguments, subparsers)
  File "/root/gato/venv/lib/python3.10/site-packages/gato/cli/cli.py", line 242, in enumerate
    orgs = [gh_enumeration_runner.enumerate_organization(
  File "/root/gato/venv/lib/python3.10/site-packages/gato/enumerate/enumerate.py", line 184, in enumerate_organization
    self.repo_e.construct_workflow_cache(result.json()['data']['nodes'])
  File "/root/gato/venv/lib/python3.10/site-packages/gato/enumerate/repository.py", line 190, in construct_workflow_cache
    owner = result['nameWithOwner']
TypeError: 'NoneType' object is not subscriptable
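
The traceback shows that one of the entries in `['data']['nodes']` was None when `result['nameWithOwner']` was evaluated. A defensive sketch, assuming the null entries are repositories the token could not resolve (a plausible cause that this report does not confirm), is to skip them before indexing. The helper name is hypothetical.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_valid_nodes(nodes: Optional[List[Optional[Dict[str, Any]]]]
                     ) -> Iterator[Dict[str, Any]]:
    """Yield only the nodes the API actually returned, skipping null entries."""
    for node in nodes or []:
        if node is None:
            continue  # unresolved node; nothing to cache for it
        yield node
```

The cache-building loop would then iterate over `iter_valid_nodes(...)` instead of the raw list.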

Deep-Dive Enumeration

There are lots of checks that could be performed on a repository that involve additional API queries, and we probably only want to run them after we've identified a repository of interest:

  • Enumerating all accessible self-hosted runners by downloading more run logs.
  • Enumerating check runs triggered by PRs.
  • Testing if previous contributors can run pull_request workflows without approval.

Essentially, learn as much as possible via what is publicly accessible. This could be a flag that only works with single repository enumeration.
