
scary-strings's Introduction

😱 Scary Strings! 😱

Flag potentially dangerous API calls in source code, a.k.a. lines containing scary strings from a security perspective!

Overview

This repository contains a list of strings (usually function names) that are relevant to security auditing, usually because they perform a sensitive operation like changing the state of a database or accessing the filesystem.

In addition to technology-specific wordlists, the comments folder contains strings likely to be related to developer notes left in source code.

For Hackers

Search for these strings and generate ideas for hacking. Maybe you can spot where the database is being modified and work your way backward to finding a SQL injection. Maybe a 'TODO' message reveals a bug that the devs didn't fix. The possibilities are endless. Save yourself time and repetitive-stress injury by jumping to the dangerous parts of the app. This collection of wordlists will show you all the thermal exhaust ports on the Death Star so you don't have to explore the whole thing.

For Developers

Scanning for these strings is a good way to improve the security of your app. Typically there are good practices and patterns for doing things safely according to the language you're using. If you can verify that such function calls are handled safely, great! Your app is more secure than when you started.

Wordlists

wordlists
├── blockchain
│   └── all
├── comments
│   ├── all
│   ├── derogatory
│   ├── security
│   └── todo
├── cosmossdk
│   ├── abci
│   ├── module-auth
│   ├── module-authz
│   ├── module-bank
│   ├── module-group
│   └── module-staking
├── cryptography
│   └── all
├── go
│   ├── all
│   ├── cryptography
│   ├── db-access
│   ├── deprecated
│   ├── err
│   ├── randomness
│   └── unsafe
├── java
│   ├── db_access
│   ├── file_access
│   ├── file_inclusion
│   ├── os_command_execution
│   └── url_redirect
├── javascript
│   ├── all
│   ├── deprecated
│   ├── dom-xss
│   ├── generic
│   ├── randomness
│   ├── react
│   └── redos
├── linters
│   └── all
├── perl
│   └── all
├── php
│   ├── all
│   ├── db_access
│   ├── dynamic_code_execution
│   ├── file_access
│   ├── file_inclusion
│   ├── os_command_execution
│   ├── randomness
│   ├── redos
│   ├── serialization
│   ├── sockets
│   ├── superglobals
│   ├── url_redirection
│   └── xxe
├── python
│   ├── all
│   ├── bypass
│   ├── object_serialization
│   ├── os_command_execution
│   └── string_formatting
├── rust
│   ├── all
│   ├── clone
│   ├── panic-macros
│   ├── randomness
│   ├── resource-exhaustion
│   ├── slices
│   ├── unsafe
│   ├── unwrap
│   └── vectors
├── secrets
│   ├── all
│   ├── api-keys
│   └── public-keys
├── solana
│   └── all
└── solidity
    └── all

16 directories, 65 files

Sources

Most of the entries in the wordlists come from my work experience as a security engineer and penetration tester. References for some of these choices can be found in the git commit history as well as the project's GitHub Issues.

For many of the supported programming languages, the lists come from the well-known hacking books listed below. Note that these books were published in 2011, so some of the information may be dated.

Similar projects

scary-strings's People

Contributors

johnsaigle


scary-strings's Issues

FileNotFoundError when running script outside of its path

Traceback (most recent call last):
  File "/home/jsaigle/.local/bin/scary-strings", line 134, in <module>
    args.path, build_list_of_file_extensions(language), build_list_of_exclude_dirs(language))
  File "/home/jsaigle/.local/bin/scary-strings", line 13, in build_list_of_file_extensions
    with open(filepath, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jsaigle/.local/bin/extensions/php'

It's possible the script only works within its own directory right now. May need to use os.path.realpath() here to resolve the issue.
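
A minimal sketch of the os.path.realpath() idea follows. The helper name resolve_data_file is hypothetical and the script's real internals are not shown here. It assumes the installed script is a symlink back into the repository; if the script was copied instead, the extensions folder would have to be installed next to it.

```python
import os


def resolve_data_file(script_path, *parts):
    """Return the path of a data file stored next to the real script.

    realpath() follows symlinks, so a script symlinked into a directory
    such as ~/.local/bin still finds the files in its repository.
    """
    script_dir = os.path.dirname(os.path.realpath(script_path))
    return os.path.join(script_dir, *parts)


# Inside the script this might be called as (names are assumptions):
#   filepath = resolve_data_file(__file__, "extensions", language)
```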

Add rust slice panics

e.g. https://doc.rust-lang.org/std/primitive.slice.html#panics-31

Look for more on this page

  • copy_from_slice
  • clone_from_slice
  • flatten_mut
  • swap
  • windows
  • chunks
  • chunks_mut
  • chunks_exact
  • chunks_exact_mut
  • as_chunks
  • as_chunks_mut
  • as_rchunks
    • as_rchunks_mut
  • array_chunks
    • array_chunks_mut
  • array_windows
  • rchunks
  • rchunks_mut
  • rchunks_exact
  • rchunks_exact_mut
  • split_at
  • split_at_mut
  • split_array_ref
  • split_array_mut
  • rsplit_array_ref
  • rsplit_array_mut
  • select_nth_unstable
    • select_nth_unstable_by
  • rotate_left
  • rotate_right
  • copy_within
  • as_simd
  • repeat

Unicode issues

blorp

Traceback (most recent call last):
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 138, in <module>
    result = [scan_file(file_to_scan, function_names, language, args.scan_comments, comment_strings)
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 138, in <listcomp>
    result = [scan_file(file_to_scan, function_names, language, args.scan_comments, comment_strings)
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 51, in scan_file
    lines = list(map(str.strip, f.read().splitlines()))
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 1021: invalid start byte
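
One possible workaround, sketched under the assumption that a lossy read is acceptable for a string scanner: open files with errors="replace" so undecodable bytes become U+FFFD instead of raising. The function name read_lines is hypothetical.

```python
def read_lines(path, encoding="utf-8"):
    """Read a file as a list of stripped lines without raising on bad bytes."""
    # errors="replace" substitutes U+FFFD for every byte that cannot be decoded.
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return [line.strip() for line in f.read().splitlines()]
```

Function names in the wordlists are ASCII, so a replaced byte elsewhere on a line should not hide a match.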

Add PHP's randomness functions

e.g. mt_rand() is not cryptographically secure and shouldn't be used in security-sensitive code. This would be a good one to flag.

A new wordlist should probably be created called 'randomness' containing this, its aliases, and related functions.

'Malformed UTF-8' results in crash

Description

Occurred while scanning https://github.com/josecl/cool-php-captcha which contains an accented character here, possibly in other places also:

https://github.com/josecl/cool-php-captcha/blob/dd8ea199c3fb238ea73781479415646c2542470b/captcha.php#L3

Program output

===>>>>>   SCARY STRINGS   <<<<<===

Source code analysis tool, Copyright (C) 2017 by John Saigle
Analyse source code for potentially dangerous APIs, or 'scary strings'!
This is free software. <https://github.com/johnsaigle/scary-strings>
Scanning 5 files...
Malformed UTF-8
  in sub MAIN at scary-strings.p6 line 42
  in block <unit> at scary-strings.p6 line 121

Improve python wordlist

The WAHH book unfortunately does not include a wordlist for Python.

I've added some functions based on some blog posts I've read but the list is very small.

It would be great to increase support for Python here.

Add Go

tls.Config for cryptographic issues
Use of panic,
....

Comments-only mode

Refactor arguments so that a language does not need to be specified. This will allow the script to e.g. check only comments rather than function calls.
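
A rough sketch of what the refactored arguments could look like with argparse. The flag spellings --language and --scan-comments are assumptions for illustration, not the script's actual interface.

```python
import argparse


def parse_args(argv=None):
    """Parse arguments where the language is optional (hypothetical flags)."""
    parser = argparse.ArgumentParser(prog="scary-strings")
    parser.add_argument("path", help="file or directory to scan")
    parser.add_argument("--language", default=None,
                        help="wordlist language, e.g. php (optional)")
    parser.add_argument("--scan-comments", action="store_true",
                        help="also match strings from the comments wordlists")
    args = parser.parse_args(argv)
    # With neither option there is nothing to search for, so fail early.
    if args.language is None and not args.scan_comments:
        parser.error("pass --language, --scan-comments, or both")
    return args
```

With this shape, a comments-only run is simply a call that passes --scan-comments and no language.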

Add make targets

Create a Makefile to:

  • Concatenate the wordlists into the all lists for each folder
  • Update the tree output in the README (this could be a Git pre-commit hook too)
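
The concatenation step could be a small Python helper that a make target invokes. This is only a sketch: it assumes one entry per line and that the all list should be deduplicated and sorted, neither of which is stated in the issue. The name build_all_list is hypothetical.

```python
from pathlib import Path


def build_all_list(folder):
    """Merge every wordlist in a folder into that folder's 'all' file."""
    folder = Path(folder)
    entries = set()
    for wordlist in sorted(folder.iterdir()):
        # Skip the output file itself so stale entries are not carried over.
        if wordlist.is_file() and wordlist.name != "all":
            for line in wordlist.read_text(encoding="utf-8").splitlines():
                if line.strip():
                    entries.add(line.strip())
    merged = sorted(entries)
    (folder / "all").write_text("".join(e + "\n" for e in merged),
                                encoding="utf-8")
    return merged
```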

Add support for more programming languages

Currently the only existing wordlists are for PHP. It would be good to at least extend this to Perl and Perl6.

  • Add additional wordlists for dangerous APIs

  • Add exclude folders for common third-party libraries for these languages

Add Solidity

Ideas:

  • selfdestruct
  • anything that moves assets, e.g. ERC20 and NFT functions to approve/send/transfer
  • transfer vs call
  • delegatecall

It would also be a good idea to check slither and the code4rena auto-audit tool for other function calls

Add web application error messages

à la this file in SecLists: https://github.com/danielmiessler/SecLists/blob/master/Pattern-Matching/errors.txt

Most of this repo is intended to be used in white-box source code analysis but a list of error messages like this would be a useful thing to import into Burp, for example. You could load this up and have Burp flag HTTP responses containing these strings.

Note that the list above is several years old and most of the errors seem to be related to .NET and PHP. It could be good to extend this to other technologies like Express, Go, and weird Java tech like Tomcat.

Provide list of files not to scan

Would be nice to be able to pass a list of files to exclude. For example I scanned a project recently that had a giant wordlist that was in a .php file. This won't contain the dangerous API calls that this script wants to find, but will slow the execution significantly.
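
A sketch of how an exclude list could be applied, assuming shell-style patterns matched with fnmatch against either the full path or the file name. The function name is_excluded and the pattern format are assumptions.

```python
import fnmatch
import os


def is_excluded(path, patterns):
    """Return True if the path or its file name matches any exclude pattern."""
    name = os.path.basename(path)
    return any(
        fnmatch.fnmatch(path, pattern) or fnmatch.fnmatch(name, pattern)
        for pattern in patterns
    )
```

Files for which this returns True would be dropped before any line is read, so a giant wordlist stored in a .php file costs nothing.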

Examine comments as well

Comments are skipped in the current source code.

However, it could be useful to extend this tool to include scanning for TODO strings, e.g.:

  • TODO
  • FIXME
  • XXX
  • and other things that get highlighted in vim etc.

It would be good also to add a regex for strings inserted by editors. For example, VS Code has a JIRA integration that will prompt devs to attach their TODO message to a ticket. It auto-generates a URL to JIRA when it does this. It would be good to look for this and other similar things. Links to sites like GitHub, Jira, Redmine, etc. in general would be useful.

Some heuristic type words could be useful too. People sometimes write things like "This is broken and someone should fix it later" or "insecure, fix this after demo" or "check permissions on this later".
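
The three ideas above (TODO markers, tracker links, heuristic phrases) can be sketched as a set of regexes. The patterns are illustrative guesses, not the contents of the comments wordlists, and the host names in the link pattern are assumptions.

```python
import re

TODO_MARKERS = re.compile(r"\b(TODO|FIXME|XXX)\b")
# Links to issue trackers; the listed hosts are examples only.
TRACKER_LINK = re.compile(
    r"https?://\S*(github\.com|jira|atlassian\.net|redmine)\S*", re.IGNORECASE
)
HEURISTIC_PHRASES = re.compile(
    r"\b(insecure|broken|fix (this|it) (later|after)|check permissions)\b",
    re.IGNORECASE,
)


def comment_findings(line):
    """Return labels describing why a line looks like a developer note."""
    labels = []
    if TODO_MARKERS.search(line):
        labels.append("todo")
    if TRACKER_LINK.search(line):
        labels.append("tracker-link")
    if HEURISTIC_PHRASES.search(line):
        labels.append("heuristic")
    return labels
```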

CSV report gets messed up when "Line of Code" cell contains commas

This is happening because of naive splitting on commas on a line of code.

The simplest solution would probably be to change the output to a TSV file instead of a CSV file.

TSVs still might contain problems if a line of code contains tab characters, but everybody uses spaces instead of tabs anyway, right?
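
An alternative that keeps the CSV format is to let Python's csv module do the quoting, so commas inside a cell survive a round trip. A minimal sketch; apart from "Line of Code", the column names are made up for illustration.

```python
import csv
import io


def render_report(rows):
    """Render (file, line number, line of code) rows as quoted CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)  # quotes cells containing commas or quotes
    writer.writerow(["File", "Line Number", "Line of Code"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Reading the report back with csv.reader returns the original cell contents, including any commas.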

Add concurrency to speed up results

The script must check every line in a file against every line in a wordlist, and do this for all files. This takes a lot of time for longer files or against the default wordlist.

Adding concurrency could help speed up the results here.
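
One way this might look with concurrent.futures, scanning one file per task. find_hits is a simplified stand-in for the script's real scanning function, and plain substring matching is an assumption. Threads mainly help with file I/O in CPython; a process pool may suit the CPU-bound matching better.

```python
from concurrent.futures import ThreadPoolExecutor


def find_hits(path, needles):
    """Return (path, line number, needle) for every wordlist entry found."""
    hits = []
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for number, line in enumerate(f, start=1):
            for needle in needles:
                if needle in line:
                    hits.append((path, number, needle))
    return hits


def scan_files(paths, needles, workers=4):
    """Scan files concurrently; results keep the order of the input paths."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_file = pool.map(lambda p: find_hits(p, needles), paths)
        return [hit for hits in per_file for hit in hits]
```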

Add Cosmos

Ideas

  • BeginBlocker and EndBlocker -- should be reviewed for code that can panic
  • Any module function from the SDK that involves transferring funds
  • UNIX time functions
  • Grants and authorizations

It should be non-overlapping with the Go wordlists. A user can run both Go and Cosmos rules on a Cosmos project.

Scan for lines that contain ignore directives for linters

A lot of linting tools allow you to write a comment that tells the scanner not to raise a warning. These comments are usually a good place to look for bugs, as they mark places where the devs have silenced a warning.
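
A small sketch of such a scan. The directives below are real suppression markers from common tools, but the selection is illustrative and is not the contents of the linters wordlist.

```python
import re

IGNORE_DIRECTIVES = re.compile(
    "|".join([
        r"#\s*noqa",            # flake8, ruff
        r"#\s*type:\s*ignore",  # mypy
        r"#\s*nosec",           # bandit, gosec
        r"//\s*nolint",         # golangci-lint
        r"eslint-disable",      # ESLint
        r"@SuppressWarnings",   # Java
        r"#\[allow\(",          # Rust lint attributes
        r"phpcs:ignore",        # PHP_CodeSniffer
    ])
)


def has_ignore_directive(line):
    """Return True if the line silences a linter or static analysis warning."""
    return IGNORE_DIRECTIVES.search(line) is not None
```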

Use ASTs to represent scanned code instead of going line-by-line

Currently the project scans each file into memory and examines every single line using a regex. This is fine for now but probably not very efficient and is error-prone.

It would be interesting to try to use something like PHP-AST to parse the code in an intelligent way and scan only the function names (or alternatively just the comments along with #11). It might be more efficient and would make this tool much more robust.
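
To illustrate the idea without PHP-AST, here is a sketch that uses Python's built-in ast module on Python source. It is an analogy for the approach, not a proposal for how PHP files would be parsed. Only real call expressions are reported, so names inside comments and string literals are ignored.

```python
import ast


def find_calls(source, names):
    """Return sorted (line number, function name) pairs for matching calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):         # e.g. eval(...)
            called = func.id
        elif isinstance(func, ast.Attribute):  # e.g. os.system(...)
            called = func.attr
        else:
            continue
        if called in names:
            hits.append((node.lineno, called))
    return sorted(hits)
```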
