johnsaigle / scary-strings
Collection of wordlists containing dangerous function calls in many languages
License: GNU General Public License v3.0
blorp
Traceback (most recent call last):
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 138, in <module>
    result = [scan_file(file_to_scan, function_names, language, args.scan_comments, comment_strings)
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 138, in <listcomp>
    result = [scan_file(file_to_scan, function_names, language, args.scan_comments, comment_strings)
  File "/mnt/c/Users/JohnSaigle/bin/scary-strings/scary_strings.py", line 51, in scan_file
    lines = list(map(str.strip, f.read().splitlines()))
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 1021: invalid start byte
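One possible way around this crash, sketched on the assumption that the scan only needs the text line by line: open the file with errors="replace" so that undecodable bytes become U+FFFD instead of raising. The helper name read_lines is hypothetical and is not part of the script.

```python
def read_lines(path):
    """Return stripped lines, substituting U+FFFD for bytes that are not valid UTF-8."""
    # errors="replace" keeps the scan going on Latin-1 or partly binary files
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return [line.strip() for line in f.read().splitlines()]
```

The trade-off is that a non-UTF-8 identifier would be mangled, but the ASCII function names in the wordlists would still match.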
The WAHH book (The Web Application Hacker's Handbook) unfortunately does not include a wordlist for Python.
I've added some functions based on some blog posts I've read but the list is very small.
It would be great to increase support for Python here.
aka steal this list: https://github.com/Puliczek/awesome-list-of-secrets-in-environment-variables/
For example, this seems... risky
https://www.php.net/manual/en/wrappers.ssh2.php
3.9 is fine. See if there's a way to specify e.g. 3.8 or higher
Traceback (most recent call last):
  File "/home/jsaigle/.local/bin/scary-strings", line 134, in <module>
    args.path, build_list_of_file_extensions(language), build_list_of_exclude_dirs(language))
  File "/home/jsaigle/.local/bin/scary-strings", line 13, in build_list_of_file_extensions
    with open(filepath, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/jsaigle/.local/bin/extensions/php'
It's possible the script only works within its own directory right now. May need to use os.path.realpath() here to resolve the issue.
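A minimal sketch of the os.path.realpath() idea, assuming the data folders (such as extensions/) sit next to the real script file and the installed copy is a symlink to it. The helper name data_path is hypothetical.

```python
import os


def data_path(script_file, *parts):
    """Build a path relative to the real location of script_file, following symlinks."""
    base = os.path.dirname(os.path.realpath(script_file))
    return os.path.join(base, *parts)


# Inside the script this would be called as, for example:
#   data_path(__file__, "extensions", "php")
```

If the script is copied rather than symlinked into ~/.local/bin, this alone would not help and the wordlists would need to be installed alongside it.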
Create a Makefile to:
all lists for each folder
tree output in the README (this could be a Git pre-commit hook too)
e.g.
strcpy
gets
See also the Microsoft Security Development Lifecycle (SDL) banned functions: https://www.forward.com.au/pfod/ArduinoProgramming/ArduinoStrings/Security%20Development%20Lifecycle%20(SDL)%20Banned%20Function%20Calls%20_%20Microsoft%20Docs.pdf
Comments are skipped in the current source code.
However, it could be useful to extend this tool to include scanning for TODO strings, e.g.:
It would be good also to add a regex for strings inserted by editors. For example, VS Code has a JIRA integration that will prompt devs to attach their TODO message to a ticket. It auto-generates a URL to JIRA when it does this. It would be good to look for this and other similar things. Links to sites like GitHub, Jira, Redmine, etc. in general would be useful.
Some heuristic-type phrases could be useful too. People sometimes write things like "This is broken and someone should fix it later" or "insecure, fix this after demo" or "check permissions on this later".
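A rough sketch of what such a comment scan could look like. The marker words, phrases, and tracker hostnames below are illustrative assumptions, not an existing wordlist, and find_todo_lines is a hypothetical helper.

```python
import re

# Marker words, heuristic phrases, and tracker links; all illustrative and meant to be extended.
TODO_PATTERN = re.compile(
    r"\b(TODO|FIXME|XXX|HACK)\b"
    r"|\b(insecure|broken|fix (this|it)( later)?|after (the )?demo|check permissions)\b"
    r"|https?://\S*(github\.com|atlassian\.net|jira|redmine)\S*",
    re.IGNORECASE,
)


def find_todo_lines(lines):
    """Return (line_number, text) pairs for lines matching any of the patterns."""
    return [(n, text) for n, text in enumerate(lines, start=1) if TODO_PATTERN.search(text)]
```

Phrases like these will produce false positives, so the output is probably best treated as a list of places to read rather than findings.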
e.g. https://doc.rust-lang.org/std/primitive.slice.html#panics-31
Look for more on this page
Add examples for taking the list and using them in a clever way with grep or rg.
Linking to the gf repo would be a good idea too because it has good examples of grep patterns.
tls.Config for cryptographic issues
Use of panic,
....
Like Rust, Go has unsafe, which allows devs to break memory safety.
Check also https://github.com/jlauinger/go-geiger
The reflect package and/or functions are also important
A lot of linting tools will allow you to write a comment that will tell the scanner to not throw a warning to the devs. This is usually a good place to look for bugs as it indicates a place where the devs have silenced a warning
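As a starting point, here is a sketch that flags a few well-known suppression markers. The selection is an assumption about which tools matter and is far from complete; find_suppressions is a hypothetical helper.

```python
import re

SUPPRESSION_PATTERN = re.compile(
    r"#\s*nosec\b"            # Bandit (Python), gosec (Go)
    r"|#\s*noqa\b"            # flake8 (Python)
    r"|#\s*type:\s*ignore\b"  # mypy (Python)
    r"|//\s*nolint\b"         # golangci-lint (Go)
    r"|eslint-disable"        # ESLint (JavaScript)
    r"|@SuppressWarnings"     # Java
    r"|#\[allow\("            # rustc / Clippy (Rust)
    r"|\bnosemgrep\b"         # Semgrep
)


def find_suppressions(lines):
    """Return (line_number, text) pairs for lines that silence a linter or scanner."""
    return [(n, text) for n, text in enumerate(lines, start=1) if SUPPRESSION_PATTERN.search(text)]
```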
Ideas:
invoke_signed (e.g. https://blog.neodyme.io/posts/solana_common_pitfalls/#arbitrary-signed-program-invocation)
anything to do with calling external programs
anything to do with transferring funds or assets
Occurred while scanning https://github.com/josecl/cool-php-captcha which contains an accented character here, possibly in other places also:
===>>>>> SCARY STRINGS <<<<<===
Source code analysis tool, Copyright (C) 2017 by John Saigle
Analyse source code for potentially dangerous APIs, or 'scary strings'!
This is free software. <https://github.com/johnsaigle/scary-strings>
Scanning 5 files...
Malformed UTF-8
in sub MAIN at scary-strings.p6 line 42
in block <unit> at scary-strings.p6 line 121
Currently the only existing wordlists are for PHP. It would be good to at least extend this to Perl and Perl6.
Add additional wordlists for dangerous APIs
Add exclude folders for common third-party libraries for these languages
unsafe
panic (and variants) -- https://halborn.com/dont-panic-how-improper-error-handling-can-lead-to-blockchain-hacks/
extern
::with_capacity -- https://doc.rust-lang.org/std/vec/struct.Vec.html#panics
vec!
Usage of .clone()
Extract SQL function calls from popular Go libraries, like https://github.com/stripe-archive/safesql#how-does-it-work but without the SAST component
Packages listed in the above link:
https://pkg.go.dev/database/sql#DB
https://github.com/jinzhu/gorm
https://github.com/jmoiron/sqlx
Any others? That repo has not been updated for years so maybe there are new popular packages that people are using.
dangerouslySetInnerHTML (React)
eval
...?
Refactor arguments so that a language does not need to be specified. This will allow the script to e.g. check only comments rather than function calls.
e.g. mt_rand() is not cryptographically secure and shouldn't be used for security purposes. This would be a good one to flag.
A new wordlist should probably be created called 'randomness' containing this, its aliases, and related functions.
Some scenarios where this is useful and perhaps some sample input and output.
Ideas:
It would also be a good idea to check slither and the code4rena auto-audit tool for other function calls
Here's an example of some stuff: https://github.com/tomnomnom/gf/blob/master/examples/sec.json
asymmetric key pairs would be a good example, e.g. RSA PRIVATE and equivalents for other algorithms
If there are common patterns for API keys for various services that would be great too. Check the cloud version of hacktricks
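A small sketch of what key patterns might look like. The PEM header regex covers RSA, EC, DSA, OPENSSH, and unlabelled private keys; the AWS access key ID shape (AKIA followed by 16 uppercase letters or digits) is the commonly cited one. The dictionary keys and the helper find_secrets are hypothetical names.

```python
import re

SECRET_PATTERNS = {
    # Matches "-----BEGIN RSA PRIVATE KEY-----", "-----BEGIN PRIVATE KEY-----", etc.
    "pem_private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
    # Commonly cited shape of an AWS access key ID
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secrets(text):
    """Return the sorted names of the patterns that match somewhere in text."""
    return sorted(name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text))
```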
e.g.
à la this file in SecLists: https://github.com/danielmiessler/SecLists/blob/master/Pattern-Matching/errors.txt
Most of this repo is intended to be used in white-box source code analysis but a list of error messages like this would be a useful thing to import into Burp, for example. You could load this up and have Burp flag HTTP responses containing these strings.
Note that the list above is several years old and most of the errors seem to be related to .NET and PHP. It could be good to extend this to other technologies like Express, Go, and weird Java tech like Tomcat
Ideas
It should be non-overlapping with the Go wordlists. A user can run both Go and Cosmos rules on a Cosmos project
Would be nice to be able to pass a list of files to exclude. For example I scanned a project recently that had a giant wordlist that was in a .php file. This won't contain the dangerous API calls that this script wants to find, but will slow the execution significantly.
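One way this could work, sketched with glob patterns from the standard fnmatch module. The function name filter_excluded and the idea of matching against both the full path and the basename are assumptions, not existing behaviour.

```python
import fnmatch
import os


def filter_excluded(paths, exclude_patterns):
    """Drop paths whose full path or basename matches any glob in exclude_patterns."""
    def excluded(path):
        name = os.path.basename(path)
        return any(
            fnmatch.fnmatch(path, pattern) or fnmatch.fnmatch(name, pattern)
            for pattern in exclude_patterns
        )
    return [p for p in paths if not excluded(p)]
```

The patterns could come from a repeated command-line flag or from a file with one glob per line.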
The script must check every line in a file against every line in a wordlist, and do this for all files. This takes a lot of time for longer files or against the default wordlist.
Adding concurrency could help speed up the results here.
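A sketch of two possible speed-ups, under the assumption that a match means "listed name followed by an opening parenthesis": compile the whole wordlist into one regex so each line is tested once, and scan files through an executor. All function names here are hypothetical. Threads are used for simplicity; since regex matching is CPU-bound, a ProcessPoolExecutor may give better results in practice.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def build_pattern(function_names):
    """Compile one alternation so each line is tested once, not once per word."""
    ordered = sorted(function_names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\s*\(")


def scan_text(text, pattern):
    """Return (line_number, line) pairs where a listed function appears to be called."""
    return [(n, line) for n, line in enumerate(text.splitlines(), start=1) if pattern.search(line)]


def scan_many(texts, function_names, max_workers=4):
    """Scan several file contents concurrently; results keep the input order."""
    pattern = build_pattern(function_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda text: scan_text(text, pattern), texts))
```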
Replace "Help message will go here" with real help message
This is happening because of naive splitting on commas on a line of code.
The simplest solution would probably be to change the output to a TSV file instead of a CSV file.
TSVs still might contain problems if a line of code contains tab characters, but everybody uses spaces instead of tabs anyway, right?
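An alternative to switching delimiters, sketched here as an assumption rather than the planned fix: let Python's csv module do the quoting, so commas (and tabs) inside a line of code survive a round trip. The column names are hypothetical.

```python
import csv


def write_results(rows, stream):
    """Write (file, line_number, code) rows as CSV, quoting fields that contain commas."""
    writer = csv.writer(stream)
    writer.writerow(["file", "line", "code"])
    writer.writerows(rows)
```

Reading the file back with csv.reader, rather than splitting on commas, returns the code field intact.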
I've been copying and pasting function names as I add them. This is error-prone and easily automated by concatenating all the other wordlists together.
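A sketch of how the concatenation could be automated, assuming each language folder holds plain-text wordlists with one name per line and the combined file is called all. Both the file name and the helper build_all_list are assumptions.

```python
from pathlib import Path


def build_all_list(folder, output_name="all"):
    """Merge every wordlist in folder into one sorted, de-duplicated file."""
    folder = Path(folder)
    words = set()
    for path in folder.iterdir():
        # Skip the combined file itself so stale entries are not carried over
        if path.is_file() and path.name != output_name:
            for line in path.read_text(encoding="utf-8").splitlines():
                if line.strip():
                    words.add(line.strip())
    merged = sorted(words)
    (folder / output_name).write_text("\n".join(merged) + "\n", encoding="utf-8")
    return merged
```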
https://boostsecurityio.github.io/lotp/
https://github.com/boostsecurityio/lotp
Basically these can cause RCE in specific contexts. Could be interesting and simple to add the Go/Python examples here
Currently the project scans each file into memory and examines every single line using a regex. This is fine for now but probably not very efficient and is error-prone.
It would be interesting to try to use something like PHP-AST to parse the code in an intelligent way and scan only the function names (or alternatively just the comments along with #11). It might be more efficient and would make this tool much more robust.
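To illustrate the idea in a language with a parser in its standard library, here is a sketch that uses Python's ast module on Python source. This is an analogy only; it is not PHP-AST, and find_calls is a hypothetical helper. Because it walks call nodes, names that appear only in comments or string literals are ignored.

```python
import ast


def find_calls(source, function_names):
    """Return sorted (line_number, name) pairs for calls to listed functions."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            name = func.id      # plain call such as eval(...)
        elif isinstance(func, ast.Attribute):
            name = func.attr    # attribute call such as os.system(...)
        else:
            continue
        if name in function_names:
            hits.append((node.lineno, name))
    return sorted(hits)
```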
https://snyk.io/blog/top-5-c-security-risks/
https://snyk.io/blog/unintimidating-intro-to-c-cpp-vulnerabilities/
https://snyk.io/blog/exploring-3-types-of-directory-traversal-vulnerabilities-in-c-c/
e.g.
printf
gets
...
and more classics
It might be good to structure this to be non-overlapping with the C wordlist. It would be possible to audit a C++ project by concatenating some combination of C and C++ lists that way