
scary-strings's Issues

Unicode issues

blorp

Traceback (most recent call last):
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 138, in <module>
    result = [scan_file(file_to_scan, function_names, language, args.scan_comments, comment_strings)
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 138, in <listcomp>
    result = [scan_file(file_to_scan, function_names, language, args.scan_comments, comment_strings)
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 51, in scan_file
    lines = list(map(str.strip, f.read().splitlines()))
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 1021: invalid start byte
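
One possible way to avoid the crash, as a minimal sketch: open the file with an explicit encoding and `errors="replace"` so undecodable bytes become U+FFFD instead of raising. The helper name `read_lines_lenient` is hypothetical and not part of the current script; it mirrors the `f.read().splitlines()` call shown in the traceback.

```python
def read_lines_lenient(path, encoding="utf-8"):
    """Return the stripped lines of `path`, tolerating invalid bytes.

    Hypothetical helper: errors="replace" substitutes U+FFFD for every
    byte that cannot be decoded, so the scan can continue.
    """
    with open(path, "r", encoding=encoding, errors="replace") as f:
        return [line.strip() for line in f.read().splitlines()]
```

Replacing bytes loses information for that file, but function names in a wordlist are normally plain ASCII, so matches should be unaffected.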

Improve python wordlist

The WAHH book unfortunately does not include a wordlist for Python.

I've added some functions based on some blog posts I've read but the list is very small.

It would be great to increase support for Python here.

FileNotFoundError when running script outside of its path

Traceback (most recent call last):
  File "/home/jsaigle/.local/bin/scary-strings", line 134, in <module>
    args.path, build_list_of_file_extensions(language), build_list_of_exclude_dirs(language))
  File "/home/jsaigle/.local/bin/scary-strings", line 13, in build_list_of_file_extensions
    with open(filepath, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jsaigle/.local/bin/extensions/php'

It's possible the script only works within its own directory right now. May need to use os.path.realpath() here to resolve the issue.
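
A minimal sketch of that idea, assuming the data directories (such as `extensions/`) live next to the real script file and the installed copy is a symlink to it. The helper name `data_path` is hypothetical.

```python
import os


def data_path(script_file, *parts):
    """Build a path relative to the directory that holds the real script.

    os.path.realpath() resolves symlinks, so a link placed in ~/.local/bin
    still points back at the checkout that contains the wordlists.
    """
    script_dir = os.path.dirname(os.path.realpath(script_file))
    return os.path.join(script_dir, *parts)


# Intended use inside the script (assumption about the layout):
#   filepath = data_path(__file__, "extensions", language)
```

If the script was copied rather than symlinked, this will not help, and the data files would need to be installed alongside it.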

Add make targets

Create a Makefile to:

  • Concatenate the wordlists into the all lists for each folder
  • Update the tree output in the README (this could be a Git pre-commit hook too)

Examine comments as well

Comments are skipped in the current source code.

However, it could be useful to extend this tool to include scanning for TODO strings, e.g.:

  • TODO
  • FIXME
  • XXX
  • and other things that get highlighted in vim etc.

It would be good also to add a regex for strings inserted by editors. For example, VS Code has a JIRA integration that will prompt devs to attach their TODO message to a ticket. It auto-generates a URL to JIRA when it does this. It would be good to look for this and other similar things. Links to sites like GitHub, Jira, Redmine, etc. in general would be useful.

Some heuristic type words could be useful too. People sometimes write things like "This is broken and someone should fix it later" or "insecure, fix this after demo" or "check permissions on this later".
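
A rough sketch of how these three kinds of checks could be expressed as regexes. The function name `flag_comment`, the tracker host patterns, and the heuristic word list are all assumptions chosen for illustration and would need tuning.

```python
import re

# Markers named in this issue.
MARKER_RE = re.compile(r"\b(TODO|FIXME|XXX)\b")

# Links to issue trackers; the host fragments are illustrative guesses.
TRACKER_URL_RE = re.compile(
    r"https?://\S*(?:github\.com|atlassian\.net|jira|redmine)\S*",
    re.IGNORECASE,
)

# Heuristic phrases drawn from the examples above.
HEURISTIC_RE = re.compile(
    r"\b(insecure|broken|fix (?:this|it)|later)\b", re.IGNORECASE
)


def flag_comment(text):
    """Return the list of reasons a comment looks interesting."""
    reasons = []
    if MARKER_RE.search(text):
        reasons.append("marker")
    if TRACKER_URL_RE.search(text):
        reasons.append("tracker-url")
    if HEURISTIC_RE.search(text):
        reasons.append("heuristic")
    return reasons
```

Words like "later" will produce false positives, so the heuristic list is probably best kept as a separate, optional wordlist.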

Add rust slice panics

e.g. https://doc.rust-lang.org/std/primitive.slice.html#panics-31

Look for more on this page

  • copy_from_slice
  • clone_from_slice
  • flatten_mut
  • swap
  • windows
  • chunks
  • chunks_mut
  • chunks_exact
  • chunks_exact_mut
  • as_chunks
  • as_chunks_mut
  • as_rchunks
    • as_rchunks_mut
  • array_chunks
    • array_chunks_mut
  • array_windows
  • rchunks
  • rchunks_mut
  • rchunks_exact
  • rchunks_exact_mut
  • split_at
  • split_at_mut
  • split_array_ref
  • split_array_mut
  • rsplit_array_ref
  • rsplit_array_mut
  • select_nth_unstable
    • select_nth_unstable_by
  • rotate_left
  • rotate_right
  • copy_within
  • as_simd
  • repeat

Add Go

tls.Config for cryptographic issues
Use of panic,
....

Scan for lines that contain ignore directives for linters

Many linting tools let you write a comment that tells the linter to suppress a warning. These lines are usually a good place to look for bugs, because each one marks a spot where the devs have deliberately silenced a warning.
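
A minimal sketch of what such a scan could look like. The directives listed are ones used by common tools (flake8, Bandit, mypy, Pylint, ESLint, golangci-lint, clang-tidy, Java, PHP_CodeSniffer, Rust); the list is not exhaustive, and the function name `find_suppressions` is hypothetical.

```python
import re

SUPPRESSION_PATTERNS = [
    r"#\s*noqa\b",            # flake8
    r"#\s*nosec\b",           # Bandit
    r"#\s*type:\s*ignore\b",  # mypy
    r"pylint:\s*disable",     # Pylint
    r"eslint-disable",        # ESLint
    r"//\s*nolint\b",         # golangci-lint
    r"\bNOLINT\b",            # clang-tidy
    r"@SuppressWarnings",     # Java
    r"phpcs:ignore",          # PHP_CodeSniffer
    r"#\[allow\(",            # Rust lint attributes
]
SUPPRESSION_RE = re.compile("|".join(SUPPRESSION_PATTERNS))


def find_suppressions(lines):
    """Return (line_number, line) pairs that contain an ignore directive."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if SUPPRESSION_RE.search(line)
    ]
```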

'Malformed UTF-8' results in crash

Description

Occurred while scanning https://github.com/josecl/cool-php-captcha which contains an accented character here, possibly in other places also:

https://github.com/josecl/cool-php-captcha/blob/dd8ea199c3fb238ea73781479415646c2542470b/captcha.php#L3

Program output

===>>>>>   SCARY STRINGS   <<<<<===

Source code analysis tool, Copyright (C) 2017 by John Saigle
Analyse source code for potentially dangerous APIs, or 'scary strings'!
This is free software. <https://github.com/johnsaigle/scary-strings>
Scanning 5 files...
Malformed UTF-8
  in sub MAIN at scary-strings.p6 line 42
  in block <unit> at scary-strings.p6 line 121

Add support for more programming languages

Currently the only existing wordlists are for PHP. It would be good to at least extend this to Perl and Perl6.

  • Add additional wordlists for dangerous APIs

  • Add exclude folders for common third-party libraries for these languages

Comments-only mode

Refactor arguments so that a language does not need to be specified. This will allow the script to e.g. check only comments rather than function calls.

Add PHP's randomness functions

e.g. mt_rand() is not cryptographically secure and shouldn't be used in security-sensitive code. This would be a good one to flag.

A new wordlist should probably be created called 'randomness' containing this, its aliases, and related functions.

Add Solidity

Ideas:

  • selfdestruct
  • anything that moves assets, e.g. ERC20 and NFT functions to approve/send/transfer
  • transfer vs call
  • delegatecall

It would also be a good idea to check Slither and the code4rena auto-audit tool for other function calls.

Add web application error messages

À la this file in SecLists: https://github.com/danielmiessler/SecLists/blob/master/Pattern-Matching/errors.txt

Most of this repo is intended to be used in white-box source code analysis but a list of error messages like this would be a useful thing to import into Burp, for example. You could load this up and have Burp flag HTTP responses containing these strings.

Note that the list above is several years old and most of the errors seem to be related to .NET and PHP. It could be good to extend this to other technologies like Express, Go, and Java servers like Tomcat.

Add Cosmos

Ideas

  • BeginBlocker and EndBlocker -- should be reviewed for code that can panic
  • Any module function from the SDK that involves transferring funds
  • UNIX time functions
  • Grants and authorizations

It should not overlap with the Go wordlists. A user can run both the Go and Cosmos rules on a Cosmos project.

Provide list of files not to scan

Would be nice to be able to pass a list of files to exclude. For example I scanned a project recently that had a giant wordlist that was in a .php file. This won't contain the dangerous API calls that this script wants to find, but will slow the execution significantly.
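
One way this could work, sketched with glob-style patterns from the standard `fnmatch` module. The function name `filter_files` and the pattern format are assumptions; no such option exists in the script yet.

```python
import fnmatch
import os


def filter_files(paths, patterns):
    """Drop every path that matches one of the exclude patterns.

    A pattern is tested against both the full path (with "/" separators)
    and the bare file name, so "data/*" and "wordlist.php" both work.
    """
    kept = []
    for path in paths:
        normalized = path.replace(os.sep, "/")
        name = os.path.basename(path)
        excluded = any(
            fnmatch.fnmatch(normalized, pattern) or fnmatch.fnmatch(name, pattern)
            for pattern in patterns
        )
        if not excluded:
            kept.append(path)
    return kept
```

The patterns could come from a command-line flag or from a file with one pattern per line.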

Add concurrency to speed up results

The script must check every line in a file against every entry in a wordlist, and do this for all files. This takes a lot of time for longer files or against the default wordlist.

Adding concurrency could help speed up the results here.
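
A minimal sketch of per-file parallelism using `concurrent.futures` from the standard library. The names `scan_one` and `scan_many` are hypothetical, and the plain substring test is a simplification of whatever matching the script really does.

```python
from concurrent.futures import ProcessPoolExecutor


def scan_one(path, words):
    """Scan a single file and return (path, line_number, word) hits."""
    hits = []
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for line_number, line in enumerate(f, start=1):
            for word in words:
                if word in line:
                    hits.append((path, line_number, word))
    return hits


def scan_many(paths, words, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Scan files in parallel, one task per file, keeping input order."""
    with executor_cls(max_workers=max_workers) as pool:
        futures = [pool.submit(scan_one, path, words) for path in paths]
        return [hit for future in futures for hit in future.result()]
```

Processes are the default here because the matching is CPU-bound and threads would contend for the interpreter lock. On platforms that spawn worker processes, the call to `scan_many` needs to sit under an `if __name__ == "__main__":` guard.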

CSV report gets messed up when "Line of Code" cell contains commas

This is happening because of naive splitting on commas on a line of code.

The simplest solution would probably be to change the output to a TSV file instead of a CSV file.

TSVs still might contain problems if a line of code contains tab characters, but everybody uses spaces instead of tabs anyway, right?
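
An alternative to switching formats, sketched below, is to let the standard `csv` module do the quoting, since it wraps any field that contains a comma in double quotes. Only the "Line of Code" column name comes from this issue; the other column names and the function name `write_report` are assumptions.

```python
import csv


def write_report(rows, out):
    """Write scan results to the open text file `out` as CSV.

    csv.writer quotes fields containing commas, quotes, or newlines,
    so a line of code such as `foo($a, $b)` stays in one cell.
    """
    writer = csv.writer(out)
    writer.writerow(["File", "Line", "Match", "Line of Code"])
    writer.writerows(rows)
```

When writing to a real file, the `csv` documentation recommends opening it with `newline=""`. Passing `delimiter="\t"` to `csv.writer` would give a TSV with the same quoting behaviour.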

Use ASTs to represent scanned code instead of going line-by-line

Currently the project scans each file into memory and examines every single line using a regex. This is fine for now but probably not very efficient and is error-prone.

It would be interesting to try to use something like PHP-AST to parse the code in an intelligent way and scan only the function names (or alternatively just the comments along with #11). It might be more efficient and would make this tool much more robust.
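
To illustrate the idea only, here is a sketch that uses Python's built-in `ast` module on Python source. It is an analogue, not a PHP-AST example, and the function name `find_calls` is hypothetical. Because it walks the syntax tree, mentions of a function inside comments or string literals are ignored.

```python
import ast


def find_calls(source, names):
    """Return sorted (line_number, function_name) pairs for calls in `names`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):         # e.g. eval(x)
            name = func.id
        elif isinstance(func, ast.Attribute):  # e.g. os.system(x)
            name = func.attr
        else:
            continue
        if name in names:
            hits.append((node.lineno, name))
    return sorted(hits)
```

A line-by-line regex would flag the comment and the string in a file like the one below, while the tree walk reports only the two real calls:

```python
source = "import os\n# eval() in a comment\nx = 'eval(1)'\nos.system('ls')\neval(y)\n"
print(find_calls(source, {"eval", "system"}))
```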
