osxcollector's Introduction

osxcollector

OSXCollector Manual

OSXCollector is a forensic evidence collection & analysis toolkit for OSX.

Forensic Collection

The collection script runs on a potentially infected machine and outputs a JSON file that describes the target machine. OSXCollector gathers information from plists, SQLite databases and the local file system.

Forensic Analysis

Armed with the forensic collection, an analyst can answer questions like:

  • Is this machine infected?
  • How'd that malware get there?
  • How can I prevent and detect further infection?

Yelp automates the analysis of most OSXCollector runs, converting the output into an easily readable and actionable summary of just the suspicious stuff. Check out the OSXCollector Output Filters project to learn how to make the most of the automated OSXCollector output analysis.

Performing Collection

osxcollector.py is a single Python file that runs without any dependencies on a standard OSX machine. This makes it really easy to run collection on any machine - no fussing with brew, pip, config files, or environment variables. Just copy the single file onto the machine and run it:

sudo osxcollector.py is all it takes.

$ sudo osxcollector.py
Wrote 35394 lines.
Output in osxcollect-2014_12_21-08_49_39.tar.gz

If you have just cloned the GitHub repository, osxcollector.py is inside the osxcollector/ directory, so you need to run it as:

$ sudo osxcollector/osxcollector.py

IMPORTANT: please make sure that the python command on your Mac OS X machine uses the default Python interpreter shipped with the system and is not overridden, e.g. by a Python version installed through brew. OSXCollector relies on a couple of native Python bindings for OS X libraries, which might not be available in Python versions other than the one originally installed on your system. Alternatively, you can run osxcollector.py while explicitly specifying the Python version you would like to use:

$ sudo /usr/bin/python2.7 osxcollector/osxcollector.py

The JSON output of the collector, along with some helpful files like system logs, has been bundled into a .tar.gz for hand-off to an analyst.
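On the analyst's side, the records can be read straight out of the archive without unpacking it first. The following is only a sketch: it assumes the collection output is whichever archive member ends in .json and that the file holds one JSON object per line; iter_records is a hypothetical helper, not part of OSXCollector.

```python
import json
import tarfile


def iter_records(archive_path):
    """Yield one dict per JSON line found in a collector archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Assumption: the collection output is the member ending in .json;
            # everything else (e.g. bundled logs) is skipped.
            if not (member.isfile() and member.name.endswith(".json")):
                continue
            handle = archive.extractfile(member)
            for raw_line in handle:
                line = raw_line.decode("utf-8", errors="replace").strip()
                if line:
                    yield json.loads(line)
```

For example, list(iter_records('osxcollect-2014_12_21-08_49_39.tar.gz')) would load every record into memory; iterating lazily is kinder to large collections.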

osxcollector.py also has a lot of useful options to change how collection works:

  • -i INCIDENT_PREFIX/--id=INCIDENT_PREFIX: Sets an identifier which is used as the prefix of the output file. The default value is osxcollect.

    $ sudo osxcollector.py -i IncontinentSealord
    Wrote 35394 lines.
    Output in IncontinentSealord-2014_12_21-08_49_39.tar.gz

    Get creative with incident names; it makes it easier to laugh through the pain.

  • -p ROOTPATH/--path=ROOTPATH: Sets the path to the root of the filesystem to run collection on. The default value is /. This is great for running collection on the image of a disk.

    $ sudo osxcollector.py -p '/mnt/powned'

  • -s SECTION/--section=SECTION: Runs only a portion of the full collection. Can be specified more than once. The full list of sections and subsections is:

    • version
    • system_info
    • kext
    • startup
      • launch_agents
      • scripting_additions
      • startup_items
      • login_items
    • applications
      • applications
      • install_history
    • quarantines
    • downloads
      • downloads
      • email_downloads
      • old_email_downloads
    • chrome
      • history
      • archived_history
      • cookies
      • login_data
      • top_sites
      • web_data
      • databases
      • local_storage
      • preferences
    • firefox
      • cookies
      • downloads
      • formhistory
      • history
      • signons
      • permissions
      • addons
      • extension
      • content_prefs
      • health_report
      • webapps_store
      • json_files
    • safari
      • downloads
      • history
      • extensions
      • databases
      • localstorage
      • extension_files
    • accounts
      • system_admins
      • system_users
      • social_accounts
      • recent_items
    • mail
    • full_hash

    $ sudo osxcollector.py -s 'startup' -s 'downloads'

  • -c/--collect-cookies: Collects the values of cookies. By default OSXCollector does not dump the value of a cookie, as it may contain sensitive information (e.g. session id).

  • -l/--collect-local-storage: Collect the values stored in web browsers' local storage. By default OSXCollector does not dump the values as they may contain sensitive information.

  • -d/--debug: Enables verbose output and Python breakpoints. If something is wrong with OSXCollector, try this.

    $ sudo osxcollector.py -d

Details of Collection

The collector outputs a .tar.gz containing all the collected artifacts. The archive contains a JSON file with the majority of the information. Additionally, a set of useful logs from the target system is included.

Common Keys

Every Record

Each line of the JSON file records 1 piece of information. There are some common keys that appear in every JSON record:

  • osxcollector_incident_id: A unique ID shared by every record.
  • osxcollector_section: The section or type of data this record holds.
  • osxcollector_subsection: The subsection or more detailed descriptor of the type of data this record holds.

File Records

For records representing files there are a bunch of useful keys:

  • atime: The file accessed time.
  • ctime: The file creation time.
  • mtime: The file modified time.
  • file_path: The absolute path to the file.
  • md5: MD5 hash of the file contents.
  • sha1: SHA1 hash of the file contents.
  • sha2: SHA2 hash of the file contents.

For records representing downloaded files:

  • xattr-wherefrom: A list containing the source and referrer URLs for the downloaded file.
  • xattr-quarantines: A string describing which application downloaded the file.

SQLite Records

For records representing a row of a SQLite database:

  • osxcollector_table_name: The table name the row comes from.
  • osxcollector_db_path: The absolute path to the SQLite file.

For records that represent data associated with a specific user:

  • osxcollector_username: The name of the user
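Because every record carries the common keys, a quick inventory of a collection takes only a few lines. This is a sketch assuming one JSON object per line; count_by_section is a hypothetical helper name, not part of OSXCollector.

```python
import json
from collections import Counter


def count_by_section(lines):
    """Count records per (section, subsection) pair."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Records without a subsection are counted under None.
        key = (record.get("osxcollector_section"),
               record.get("osxcollector_subsection"))
        counts[key] += 1
    return counts
```

Passing an open file object works, since iterating a file yields its lines: count_by_section(open('INCIDENT32.json')).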

Timestamps

OSXCollector attempts to convert timestamps to human readable date/time strings in the format YYYY-mm-dd hh:MM:ss. It uses heuristics to automatically identify various timestamps:

  • seconds since epoch
  • milliseconds since epoch
  • seconds since 2001-01-01
  • seconds since 1601-01-01
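One way such a heuristic could work is to try each interpretation in turn and keep the first one that lands in a plausible date range. The sketch below is not OSXCollector's actual implementation; the plausibility window (2004 to 2035) and the function name normalize_timestamp are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumption: only dates inside this window are considered plausible.
MIN_DATE = datetime(2004, 1, 1)
MAX_DATE = datetime(2035, 1, 1)

# (description, base date, timedelta unit), tried in this order.
CANDIDATES = [
    ("seconds since epoch", datetime(1970, 1, 1), "seconds"),
    ("milliseconds since epoch", datetime(1970, 1, 1), "milliseconds"),
    ("seconds since 2001-01-01", datetime(2001, 1, 1), "seconds"),
    ("seconds since 1601-01-01", datetime(1601, 1, 1), "seconds"),
]


def normalize_timestamp(value):
    """Return 'YYYY-mm-dd hh:MM:ss' for a numeric timestamp, or None."""
    for _description, base, unit in CANDIDATES:
        try:
            candidate = base + timedelta(**{unit: value})
        except OverflowError:
            # The value is far too large for this interpretation.
            continue
        if MIN_DATE <= candidate <= MAX_DATE:
            return candidate.strftime("%Y-%m-%d %H:%M:%S")
    return None
```

With this window the four interpretations barely overlap, so the order of the candidates only matters at the edges of the range. A narrower or wider window changes which raw values are recognized.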

Sections

version section

The current version of OSXCollector.

system_info section

Collects basic information about the system:

  • system name
  • node name
  • release
  • version
  • machine

kext section

Collects the Kernel extensions from:

  • /System/Library/Extensions
  • /Library/Extensions

startup section

Collects information about the LaunchAgents, LaunchDaemons, ScriptingAdditions, StartupItems and other login items from:

  • /System/Library/LaunchAgents
  • /System/Library/LaunchDaemons
  • /Library/LaunchAgents
  • ~/Library/LaunchAgents
  • /Library/LaunchDaemons
  • /System/Library/ScriptingAdditions
  • /Library/ScriptingAdditions
  • /System/Library/StartupItems
  • /Library/StartupItems
  • ~/Library/Preferences/com.apple.loginitems.plist

More information about the Mac OS X startup can be found here: http://www.malicious-streams.com/article/Mac_OSX_Startup.pdf

applications section

Hashes installed applications and gathers install history from:

  • /Applications
  • ~/Applications
  • /Library/Receipts/InstallHistory.plist

quarantines section

Quarantines are basically the info necessary to show the 'Are you sure you wanna run this?' prompt when a user tries to open a file downloaded from the Internet. For more details, check out the Apple Support explanation of Quarantines: http://support.apple.com/kb/HT3662

This section also collects information from the XProtect hash-based malware check for quarantined files. The plist is at: /System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/XProtect.plist

XProtect also adds minimum versions for Internet Plugins. That plist is at: /System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/XProtect.meta.plist

downloads section

Hashes all users' downloaded files from:

  • ~/Downloads
  • ~/Library/Mail Downloads
  • ~/Library/Containers/com.apple.mail/Data/Library/Mail Downloads

chrome section

Collects the following information from the Google Chrome web browser:

  • History
  • Archived History
  • Cookies
  • Extensions
  • Login Data
  • Top Sites
  • Web Data

This data is extracted from ~/Library/Application Support/Google/Chrome/Default

firefox section

Collects information from the different SQLite databases in a Firefox profile:

  • Cookies
  • Downloads
  • Form History
  • History
  • Signons
  • Permissions
  • Addons
  • Extensions
  • Content Preferences
  • Health Report
  • Webapps Store

This information is extracted from ~/Library/Application Support/Firefox/Profiles

For more details about the Firefox profile folder, see http://kb.mozillazine.org/Profile_folder_-_Firefox

safari section

Collects information from the different plists and SQLite databases in a Safari profile:

  • Downloads
  • History
  • Extensions
  • Databases
  • Local Storage

accounts section

Collects information about users' accounts:

  • system admins: /private/var/db/dslocal/nodes/Default/groups/admin.plist
  • system users: /private/var/db/dslocal/nodes/Default/users
  • social accounts: ~/Library/Accounts/Accounts3.sqlite
  • users' recent items: ~/Library/Preferences/com.apple.recentitems.plist

mail section

Hashes files in the mail app directories:

  • ~/Library/Mail
  • ~/Library/Mail Downloads

full_hash section

Hashes all the files on disk. All of 'em. This does not run by default. It must be triggered with:

$ sudo osxcollector.py -s full_hash
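To give a feel for what hashing a whole tree involves, here is a rough sketch that walks a directory and emits the md5, sha1, sha2 and file_path keys described under File Records. It is not the collector's code: hash_file and hash_tree are hypothetical names, and treating sha2 as SHA-256 is an assumption.

```python
import hashlib
import os


def hash_file(path, chunk_size=1024 * 1024):
    """Return md5, sha1 and sha2 hex digests for one file, read in chunks."""
    digests = {
        "md5": hashlib.md5(),
        "sha1": hashlib.sha1(),
        "sha2": hashlib.sha256(),  # Assumption: sha2 means SHA-256.
    }
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


def hash_tree(root):
    """Yield one record per regular file under root, skipping symlinks."""
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(directory, filename)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                record = hash_file(path)
            except OSError:
                # Unreadable files are skipped rather than aborting the walk.
                continue
            record["file_path"] = path
            yield record
```

Reading in chunks keeps memory use flat even for very large files, which matters when the walk covers an entire disk.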

Basic Manual Analysis

Forensic analysis is a bit of art and a bit of science. Every analyst will see a bit of a different story when reading the output from OSXCollector. That's part of what makes analysis fun.

Generally, collection is performed on a target machine because something is hinky: anti-virus found a file it doesn't like, deep packet inspection observed a callout, endpoint monitoring noticed a new startup item. The details of this initial alert - a file path, a timestamp, a hash, a domain, an IP, etc. - are enough to get going.

Timestamps

Simply grepping a few minutes before and after a timestamp works great:

$ cat INCIDENT32.json | grep '2014-01-01 11:3[2-8]'

Browser History

It's in there. A tool like jq can be very helpful to do some fancy output:

$ cat INCIDENT32.json | grep '2014-01-01 11:3[2-8]' | jq 'select(has("url"))|.url'

A Single User

$ cat INCIDENT32.json | jq 'select(.osxcollector_username=="ivanlei")|.'
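When grep and jq are not at hand, the same selections can be approximated in Python. The sketch below mirrors the commands above under the assumption of one JSON object per line; urls_in_window and records_for_user are hypothetical helper names.

```python
import json
import re


def urls_in_window(lines, pattern):
    """Yield the url of every record whose raw line matches pattern."""
    regex = re.compile(pattern)
    for line in lines:
        # Like grep, match against the raw text before parsing it.
        if not regex.search(line):
            continue
        record = json.loads(line)
        if "url" in record:
            yield record["url"]


def records_for_user(lines, username):
    """Yield every record associated with the given user."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("osxcollector_username") == username:
            yield record
```

For instance, urls_in_window(open('INCIDENT32.json'), r'2014-01-01 11:3[2-8]') corresponds to the grep and jq pipeline shown under Browser History.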

Automated Analysis

The OSXCollector Output Filters project contains filters that process and transform the output of OSXCollector. The goal of filters is to make it easy to analyze OSXCollector output.

Development Tips

The functionality of OSXCollector is stored in a single file: osxcollector.py. The collector should run on a naked install of OS X without any additional packages or dependencies.

Ensure that all of the OSXCollector tests pass before editing the source code. You can run the tests using: make test

After making changes to the source code, run make test again to verify that your changes did not break any of the tests.

License

This work is licensed under the GNU General Public License and is a derivation of https://github.com/jipegit/OSXAuditor

Resources

Want to learn more about OS X forensics?

A couple of other interesting tools:

  • KnockKnock - KnockKnock is a command line python script that displays persistent OS X binaries that are set to execute automatically at each boot.
  • Grr - Google Rapid Response: remote live forensics for incident response
  • osquery - SQL powered operating system instrumentation, monitoring, and analytics

osxcollector's People

Contributors

0xdabbad00, armtash, bfrizb, futureimperfect, ipinak, jamesottaway, jipegit, jjsendor, piax93, secretsquirrel, sroberts, tomelm, waffle-iron, ytonui

osxcollector's Issues

Domains ending in '.loc' on blacklists break OSXCollector's Analyze Filter

For example, if the whitelist has "yelp.loc" on it and you run the analyze filter:
$ cat MALWARE-TEST-/MALWARE-TEST- | python -m osxcollector.output_filters.analyze

you'll get a traceback like this:

Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/analyst/Desktop/osxcollector-master/osxcollector/output_filters/analyze.py", line 453, in <module>
    main()
  File "/Users/analyst/Desktop/osxcollector-master/osxcollector/output_filters/analyze.py", line 449, in main
    run_filter_main(AnalyzeFilter)
  File "osxcollector/output_filters/base_filters/output_filter.py", line 128, in run_filter_main
    filter_arguments = output_filter_cls().get_argument_parser()
  File "/Users/analyst/Desktop/osxcollector-master/osxcollector/output_filters/analyze.py", line 81, in __init__
    filter_chain.append(OpenDnsRelatedDomainsFilter(related_when=AnalyzeFilter.find_related_when, **kwargs))
  File "osxcollector/output_filters/opendns/related_domains.py", line 64, in __init__
    self._whitelist = create_blacklist(config_get_deep('domain_whitelist'))
  File "osxcollector/output_filters/util/blacklist.py", line 37, in create_blacklist
    return Blacklist(blacklist_name, blacklist_keys, blacklist_file_path, blacklist_is_regex, blacklist_is_domains)
  File "osxcollector/output_filters/util/blacklist.py", line 66, in __init__
    self._blacklisted_values = [self._convert_to_matching_term(val) for val in self._blacklisted_values]
  File "osxcollector/output_filters/util/blacklist.py", line 87, in _convert_to_matching_term
    domain = clean_domain(blacklisted_value)
  File "osxcollector/output_filters/util/domains.py", line 54, in clean_domain
    raise BadDomainError(u'Can not clean {0} {1}'.format(unclean_domain, repr(extracted)))
osxcollector.output_filters.exceptions.BadDomainError: Can not clean yelp.loc ExtractResult(subdomain=u'yelp', domain=u'loc', suffix='')

"ValueError: Invalid IPv6 URL" when running the analyze filter

I got a "ValueError: Invalid IPv6 URL" when running the analyze filter for a certain file path:

$ cat ../INCIDENT-28-2015_07_27-10_56_02.json | python -m osxcollector.output_filters.analyze -f '/Users/jdoe/Downloads/some_malware_file'
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/jsendor/Documents/osxcollector/osxcollector/output_filters/analyze.py", line 453, in <module>
    main()
  File "/Users/jsendor/Documents/osxcollector/osxcollector/output_filters/analyze.py", line 449, in main
    run_filter_main(AnalyzeFilter)
  File "osxcollector/output_filters/base_filters/output_filter.py", line 142, in run_filter_main
    _run_filter(output_filter)
  File "osxcollector/output_filters/base_filters/output_filter.py", line 109, in _run_filter
    blob = output_filter.filter_line(blob)
  File "osxcollector/output_filters/base_filters/chain.py", line 48, in filter_line
    return self._on_filter_line(blob, self._head_of_chain)
  File "osxcollector/output_filters/base_filters/chain.py", line 61, in _on_filter_line
    return self._on_filter_line(link.filter_line(blob), link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 61, in _on_filter_line
    return self._on_filter_line(link.filter_line(blob), link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 61, in _on_filter_line
    return self._on_filter_line(link.filter_line(blob), link._next_link)
  File "osxcollector/output_filters/find_domains.py", line 33, in filter_line
    self._look_for_domains(blob)
  File "osxcollector/output_filters/find_domains.py", line 63, in _look_for_domains
    self._look_for_domains(elem, key)
  File "osxcollector/output_filters/find_domains.py", line 56, in _look_for_domains
    domain = self._url_to_domain(maybe_url)
  File "osxcollector/output_filters/find_domains.py", line 79, in _url_to_domain
    split_url = urlsplit(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urlparse.py", line 213, in urlsplit
    raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL

I am quite sure the problem is with some URL in the OSXCollector output file.
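One possible mitigation is to treat an unparseable URL as "no domain" instead of letting the exception end the whole run. The sketch below uses Python 3's urllib.parse rather than the Python 2 urlparse module in the traceback, and url_to_domain is a hypothetical stand-in for the filter's own method.

```python
from urllib.parse import urlsplit


def url_to_domain(url):
    """Return the host part of url, or None when the URL cannot be parsed."""
    try:
        return urlsplit(url).hostname
    except ValueError:
        # Raised for malformed input such as an unbalanced '[' in the host.
        return None
```

Whether silently skipping such values is acceptable depends on the investigation; logging the offending string would make the bad record easier to find.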

re.search throwing sre_constants.error in osxcollector/output_filters/related_files.py

Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/ivanlei/source/osxcollector/osxcollector/output_filters/analyze.py", line 390, in <module>
    help='[OPTIONAL] Output monochrome analysis')
  File "/Users/ivanlei/source/osxcollector/osxcollector/output_filters/analyze.py", line 385, in main
    parser.add_option('--related-domains-depth', dest='related_domains_depth', default=DEFAULT_RELATED_DOMAINS_DEPTH,
  File "osxcollector/output_filters/base_filters/output_filter.py", line 130, in run_filter
    final_blobs = output_filter.end_of_lines()
  File "osxcollector/output_filters/base_filters/chain.py", line 51, in end_of_lines
    return self._on_end_of_lines(self._head_of_chain)
  File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 65, in _on_end_of_lines
    for blob in link.end_of_lines():
  File "osxcollector/output_filters/related_files.py", line 58, in end_of_lines
    if re.search(term, line):
  File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/re.py", line 142, in search
    return _compile(pattern, flags).search(string)
  File "/Users/ivanlei/virtual_envs/osxcollector/lib/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
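The "nothing to repeat" error is what re raises when a search term begins with a quantifier such as * or +, for example a file name fragment like *.zip. If the terms are meant as literal strings (an assumption; they may be intended as regular expressions), escaping them first avoids the crash. line_mentions_term is a hypothetical helper.

```python
import re


def line_mentions_term(term, line):
    """Return True if the literal text of term occurs in line."""
    # re.escape neutralizes regex metacharacters such as '*', '+' and '('.
    return re.search(re.escape(term), line) is not None
```

For purely literal terms, the plain substring test term in line gives the same answer without involving re at all.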

Add whitelist support to virustotal.LookupUrlsFilter

Looking up all of the URLs in the osxcollector output might cause some sensitive data to be sent to VirusTotal.

Let's follow what we are doing with the domains in LookupDomainsFilter and not look up the reports for the whitelisted stuff.
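A minimal sketch of the idea, assuming the whitelist is a plain list of domain names and that subdomains of a whitelisted domain should be skipped too. urls_to_look_up is a hypothetical helper, not the filter's real interface.

```python
from urllib.parse import urlsplit


def urls_to_look_up(urls, whitelisted_domains):
    """Return the URLs whose host is not covered by the whitelist."""
    whitelist = {domain.lower() for domain in whitelisted_domains}
    selected = []
    for url in urls:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            host = None
        if host is None:
            # Unparseable URLs are not sent anywhere, to stay on the safe side.
            continue
        host = host.lower()
        covered = any(host == domain or host.endswith("." + domain)
                      for domain in whitelist)
        if not covered:
            selected.append(url)
    return selected
```

Matching on "." + domain keeps a look-alike host such as notyelp.com from being treated as part of yelp.com.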

ConnectionError: Request to https://www.virustotal.com/vtapi/v2/file/report had an empty response

TypeError: __init__() got an unexpected keyword argument 'server_hostname'
<Greenlet at 0x109d8d050: <bound method AsyncRequest.send of <grequests.AsyncRequest object at 0x109d8f290>>(stream=False)> failed with TypeError

Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Volumes/YelpFiles/Code_Yelp/osxcollector/osxcollector/output_filters/analyze.py", line 453, in <module>
main()
File "/Volumes/YelpFiles/Code_Yelp/osxcollector/osxcollector/output_filters/analyze.py", line 449, in main
run_filter_main(AnalyzeFilter)
File "osxcollector/output_filters/base_filters/output_filter.py", line 140, in run_filter_main
_run_filter(output_filter, input_stream=fp_in)
File "osxcollector/output_filters/base_filters/output_filter.py", line 114, in _run_filter
final_blobs = output_filter.end_of_lines()
File "osxcollector/output_filters/base_filters/chain.py", line 69, in end_of_lines
return self._on_end_of_lines(self._head_of_chain)
File "osxcollector/output_filters/base_filters/chain.py", line 88, in _on_end_of_lines
final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 88, in _on_end_of_lines
final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 88, in _on_end_of_lines
final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 88, in _on_end_of_lines
final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 83, in _on_end_of_lines
for blob in link.end_of_lines():
File "osxcollector/output_filters/base_filters/threat_feed.py", line 105, in end_of_lines
self._add_threat_info_to_blobs()
File "osxcollector/output_filters/base_filters/threat_feed.py", line 115, in _add_threat_info_to_blobs
all_threat_info = self._lookup_iocs(self.ioc_set)
File "osxcollector/output_filters/virustotal/lookup_hashes.py", line 34, in _lookup_iocs
reports = vt.get_file_reports(all_iocs)
File "/Library/Python/2.7/site-packages/threat_intel/util/http.py", line 353, in wrapper
result = fn(*args, **kwargs)
File "/Library/Python/2.7/site-packages/threat_intel/virustotal.py", line 41, in get_file_reports
response_chunks = self._request_reports("resource", resource_chunks, 'file/report')
File "/Library/Python/2.7/site-packages/threat_intel/virustotal.py", line 287, in _request_reports
return self._requests.multi_get(self.BASE_DOMAIN + endpoint_name, query_params=params)
File "/Library/Python/2.7/site-packages/threat_intel/util/http.py", line 144, in multi_get
return self._multi_request(MultiRequest._VERB_GET, urls, query_params, None, to_json=to_json)
File "/Library/Python/2.7/site-packages/threat_intel/util/http.py", line 340, in _multi_request
all_responses.extend(self._wait_for_response(requests, to_json))
File "/Library/Python/2.7/site-packages/threat_intel/util/http.py", line 296, in _wait_for_response
raise ConnectionError('Request to {0} had an empty response'.format(request.url))
requests.exceptions.ConnectionError: Request to https://www.virustotal.com/vtapi/v2/file/report had an empty response

Split osxcollector.py into several modules

osxcollector.py grew over time and right now is one big unmaintainable chunk of code.
The initial motivation for keeping it in one file, to make it easy to run, seems too restrictive given that changes to the core file are not made that often. It should be possible to release it as an executable or via tools like pip in order to make it easy to install and run.

Code separation into modules would make it easier to maintain the code base and avoid duplication with the other projects (e.g. for things like DictUtils that are also in the OSXCollector Output Filters repository).

[ERROR] failed section 'str' object has no attribute 'get'

[ERROR] failed section 'str' object has no attribute 'get' <type 'exceptions.AttributeError'> [('osxcollector.py', 574, 'collect', 'collection_method()'), ('osxcollector.py', 1113, '_collect_applications', "self._log_packages_in_dir(pathjoin(ROOT_PATH, 'Applications'))"), ('osxcollector.py', 711, '_log_packages_in_dir', "cfbundle_executable = plist.get('CFBundleExecutable')")] - {'osxcollector_incident_id': 'osxcollect-2015_01_12-16_05_41', 'osxcollector_section': 'applications'}
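The error suggests that plist was a string rather than a dictionary when plist.get('CFBundleExecutable') was called. A defensive check along these lines would let collection continue; this is only a sketch, and bundle_executable is a hypothetical helper rather than a function in osxcollector.py.

```python
def bundle_executable(plist):
    """Return CFBundleExecutable from a parsed plist, or None if unusable."""
    # A plist that failed to parse into a dictionary has no keys to read.
    if not isinstance(plist, dict):
        return None
    return plist.get("CFBundleExecutable")
```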

Normalize "updateDate" and "installDate" fields for Firefox extensions information

Firefox extensions information retrieved from extensions.json file in the user's profile contains fields with timestamp data, e.g. updateDate and installDate.
They should be normalized like the other timestamps retrieved by OSXCollector to a human-readable format.

It appears that the timestamp format used there does not fall into any of the four formats already recognized by the OSXCollector timestamp normalization function (seconds since 2001, seconds since epoch, microseconds since epoch and microseconds since 1601).

Can you share some Kibana dashboards?

Since you mentioned that you throw this into your ELK stack, it would be awesome if you could export a Kibana dashboard or two as GitHub gists to help showcase the capabilities and save us from re-creating them. Thanks!

Add timeline view

Having the analyze output filter is useful for summarizing the events from the triage collection; however, a timeline view would also be extremely beneficial.

There are plenty of timestamps being parsed ('creation_utc', 'ctime', 'last_access_utc', 'last_visit_time', 'mtime', 'scan_date', 'visit_time', 'ZDATE' ... etc.). For any blob that contains one of the predefined timestamps (they could be declared in the initial scripts or made datetime objects for dynamic recognition), place the timestamps and any related details determined to be of interest on a line to create a timeline of the events within the triage JSON file. The timeline view helps show the sequence of events as they unfold, and it could work in conjunction with, or replace, the output view of the analyze output filter as it currently stands.

I can provide example use cases, output renderings, and how one might go about doing this, if necessary.
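As a starting point, a timeline could be built by scanning each record for the timestamp keys named above and sorting the hits. This is a sketch, not a proposed filter: it assumes timestamps have already been normalized to YYYY-mm-dd hh:MM:ss strings, and build_timeline is a hypothetical name.

```python
from datetime import datetime

# Keys taken from the list above; the real set would be longer.
TIMESTAMP_KEYS = ("creation_utc", "ctime", "last_access_utc", "last_visit_time",
                  "mtime", "scan_date", "visit_time", "ZDATE")
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_timeline(records):
    """Return (when, key, record) tuples sorted by time."""
    events = []
    for record in records:
        for key in TIMESTAMP_KEYS:
            value = record.get(key)
            if not isinstance(value, str):
                continue
            try:
                when = datetime.strptime(value, TIMESTAMP_FORMAT)
            except ValueError:
                # Not a normalized timestamp; leave it out of the timeline.
                continue
            events.append((when, key, record))
    # Sort on time and key only, so records themselves are never compared.
    events.sort(key=lambda event: (event[0], event[1]))
    return events
```

A record with several timestamps appears once per timestamp, which is usually what a timeline reader wants: the same file shows up when it was created and again when it was modified.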

osxcollector.output_filters.chrome.find_extensions drops all chrome lines except extensions

Before running the filter, notice so many subsections

$ cat foo.json | jq -c 'select(.osxcollector_section=="chrome")' | jq -c '{"ss":.osxcollector_subsection}' | sort | uniq -c
   1 {"ss":"archived_history"}
3216 {"ss":"cookies"}
   4 {"ss":"databases"}
64929 {"ss":"history"}
 637 {"ss":"local_storage"}
  50 {"ss":"login_data"}
   1 {"ss":"preferences"}
  32 {"ss":"top_sites"}
3095 {"ss":"web_data"}
   2 {"ss":null}

Notice not so many subsections

$ cat foo.json | jq -c 'select(.osxcollector_section=="chrome")' | python -m osxcollector.output_filters.chrome.find_extensions | jq -c '{"ss":.osxcollector_subsection}' | sort | uniq -c
  27 {"ss":"extensions"}

AnalyzeFilter should assume a file is malware when related: ["files"]

Sample output. Note this is malware but the filter doesn't know this is malware. It just saw something funny. It should just assume this is malware.

This whole things started with just a few clues. Now look what I found.
- downloads downloads
  ctime: "2014-10-16 12:31:20"
  file_path: "/Users/foo/Downloads/Note_0075_copy.zip"
  mtime: "2014-10-16 12:31:20"
  related: ["files"]
- chrome history
  current_path: "/Users/foo/Downloads/Note_0075_copy.zip"
  end_time: "2014-10-16 12:31:20"
  id: 2
  mime_type: "application/x-zip-compressed"
  opened: 1
  original_mime_type: "application/x-zip-compressed"
  received_bytes: 102835
  referrer: "https://us-mg6.mail.yahoo.com/neo/launch?.rand=XXX&action=inbox"
  start_time: "2014-10-16 12:31:20"
  state: 1
  target_path: "/Users/foo/Downloads/Note_0075_copy.zip"
  total_bytes: 102835
  related: ["files"]
Nothing hides from Very Readable Output Bot

Collect all trusted root CAs

Something as good as:

$ security dump-trust-settings
$ security dump-trust-settings -s
$ security dump-trust-settings -d

would be real nice.

UnicodeEncodeError in osxcollector.output_filters.opendns.lookup_domains

  "__main__", fname, loader, pkg_name)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
  exec code in run_globals
File "/Users/ivanlei/source/osxcollector/osxcollector/output_filters/analyze.py", line 395, in <module>
  main()
File "/Users/ivanlei/source/osxcollector/osxcollector/output_filters/analyze.py", line 390, in main
  monochrome=options.monochrome))
File "osxcollector/output_filters/base_filters/output_filter.py", line 130, in run_filter
  final_blobs = output_filter.end_of_lines()
File "osxcollector/output_filters/base_filters/chain.py", line 51, in end_of_lines
  return self._on_end_of_lines(self._head_of_chain)
File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
  final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
  final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
  final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
  final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
  final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 70, in _on_end_of_lines
  final_lines = self._on_end_of_lines(link._next_link)
File "osxcollector/output_filters/base_filters/chain.py", line 65, in _on_end_of_lines
  for blob in link.end_of_lines():
File "osxcollector/output_filters/base_filters/threat_feed.py", line 108, in end_of_lines
  threat_info = self._lookup_iocs(self.ioc_set, self._suspicious_ioc_set)
File "osxcollector/output_filters/opendns/lookup_domains.py", line 44, in _lookup_iocs
  security_responses = investigate.security(categorized_responses.keys())
File "osxcollector/output_filters/util/http.py", line 289, in wrapper
  result = fn(*args, **kwargs)
File "osxcollector/output_filters/opendns/api.py", line 60, in security
  urls = self._to_urls(fmt_url_path, domains)
File "osxcollector/output_filters/opendns/api.py", line 26, in _to_urls
  url_paths = [fmt_url_path.format(path_arg) for path_arg in url_path_args]
UnicodeEncodeError: 'ascii' codec can't encode character u'\xad' in position 0: ordinal not in range(128)
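
One way this could be avoided, sketched here as an assumption rather than as the project's actual fix, is to separate out domains that cannot be represented in ASCII before the URL paths are built. The helper name `split_ascii_domains` is hypothetical and does not exist in osxcollector.

```python
def split_ascii_domains(domains):
    """Separate domains that encode cleanly as ASCII from those that do not.

    Hypothetical helper, not part of osxcollector. Expects text (unicode)
    strings, as in the traceback above.
    """
    safe, rejected = [], []
    for domain in domains:
        try:
            domain.encode('ascii')
        except UnicodeError:
            # e.g. a domain that starts with a soft hyphen (U+00AD)
            rejected.append(domain)
        else:
            safe.append(domain)
    return safe, rejected
```

The rejected domains could then be logged and skipped instead of aborting the whole filter chain.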

Missing cffi in requirements-dev.txt

I tried to set up a virtualenv following the instructions in the README file, but one of the commands failed.
When I ran:
$ pip install -r ./requirements-dev.txt
it complained about xattr.

At least on my Mac OS X 10.11.3 machine, cffi was missing. I added it before xattr in the requirements-dev.txt file and now it works.

OSXCollector Output Filters also import Foundation, making them impossible to run on non-Mac machines

I ran the analyze filter on a *nix machine and got the following:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/nail/home/jsendor/pg/osxcollector/osxcollector/output_filters/analyze.py", line 35, in <module>
    from osxcollector.output_filters.chrome.find_extensions import FindExtensionsFilter as ChromeExtensionsFilter
  File "osxcollector/output_filters/chrome/find_extensions.py", line 6, in <module>
    from osxcollector.osxcollector import DictUtils
  File "osxcollector/osxcollector.py", line 42, in <module>
    import Foundation
ImportError: No module named Foundation

Output Filters should not import anything from osxcollector/osxcollector.py as it makes them impossible to run on non-Mac machines that do not provide Foundation.
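
A minimal sketch of how the two could be decoupled, assuming the shared helpers are pure Python: guard the Foundation import and keep dictionary helpers free of any OS X dependency. `get_deep` and `require_foundation` are hypothetical names used for illustration only, and `get_deep` is a stand-in rather than the real DictUtils code.

```python
# Guarded import: Foundation ships with PyObjC and is only present on OS X.
try:
    import Foundation  # noqa: F401
except ImportError:
    Foundation = None


def get_deep(a_dict, dotted_path, default=None):
    """Hypothetical pure-Python stand-in for a shared dict helper.

    Walks nested dicts along a dot-separated path such as 'a.b.c'.
    """
    current = a_dict
    for key in dotted_path.split('.'):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def require_foundation():
    """Fail with a clear message only when collection really needs Foundation."""
    if Foundation is None:
        raise RuntimeError('Foundation is unavailable; collection needs OS X with PyObjC')
    return Foundation
```

With this split, the output filters would import only the pure-Python helper, and only the collection code would call `require_foundation()`.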

Grab file hashes from browser extensions

So far OSXCollector can list all extensions from Firefox and Chrome, but what it reports about a particular extension is limited to the metadata listed there (name, install date, update page, etc.).

Getting file hashes for all of the extensions installed on a machine would give better insight when investigating whether a malicious extension is present.

Firefox and Chrome install extensions into standard directories, so collecting the file hashes from there should not be a problem.
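
As a rough sketch of what this could look like, assuming SHA-256 is an acceptable hash and that the caller passes in the extension directory, the following walks a directory tree and hashes every file. `hash_files_under` is a hypothetical name, not an existing OSXCollector function.

```python
import hashlib
import os


def hash_files_under(root, chunk_size=65536):
    """Yield (path, sha256 hex digest) for every regular file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other non-regular entries
            digest = hashlib.sha256()
            with open(path, 'rb') as handle:
                # Read in chunks so large extension files do not fill memory.
                for chunk in iter(lambda: handle.read(chunk_size), b''):
                    digest.update(chunk)
            yield path, digest.hexdigest()
```

The resulting hashes could be emitted alongside the existing extension records so that the output filters can look them up like any other file hash.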

virustotal.LookupURLsFilter fails if URLs are longer than 2000 chars

The resource parameter of the url/report VirusTotal method is sent over HTTP GET, so it is encoded into the URL query string.

For some requests this fails because the resource parameter is too long to fit into the URL.

The debug output in osxcollector/output_filters/util/http.py prints the URL for each request.
For the failed requests the URL does not seem to contain any query parameters:

[ERROR] url[https://www.virustotal.com/vtapi/v2/url/report] status_code[<UNKNOWN>]
https://www.virustotal.com/vtapi/v2/url/report
[ERROR] url[https://www.virustotal.com/vtapi/v2/url/report] status_code[<UNKNOWN>]
https://www.virustotal.com/vtapi/v2/url/report
[ERROR] url[https://www.virustotal.com/vtapi/v2/url/report] status_code[<UNKNOWN>]
https://www.virustotal.com/vtapi/v2/url/report

So this looks more like a limitation in the Requests package than a shortcoming of the VirusTotal API.
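
Until the root cause is known, one possible workaround is to measure the encoded URL up front and set aside the resources that would exceed the limit. This is only a sketch: the 2000-character limit comes from the issue title, `split_by_url_length` is a hypothetical helper, and a real request would carry further query parameters that are ignored here.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

MAX_URL_LENGTH = 2000  # assumed limit, taken from the issue title


def split_by_url_length(base_url, resources, max_length=MAX_URL_LENGTH):
    """Split resources into those that fit in a GET URL and those that do not."""
    sendable, too_long = [], []
    for resource in resources:
        url = base_url + '?' + urlencode({'resource': resource})
        if len(url) <= max_length:
            sendable.append(resource)
        else:
            too_long.append(resource)
    return sendable, too_long
```

The oversized resources could then be logged explicitly rather than showing up as requests with an unknown status code.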

"plist is wrong type" error when the History.plist file is empty

OSXCollector produces an error when the History.plist (osxcollector-section: safari, osxcollector-subsection: history) file is empty:

[ERROR] plist is wrong type. plist_path[/Users/<user>/Library/Safari/History.plist] type[str] -
{'osxcollector_incident_id': 'reefer-madness', 'osxcollector_username': '<user>', 'osxcollector_subsection': 'history', 'osxcollector_section': 'safari'}

AttributeError: 'str' object has no attribute 'get' when passing the IP with -i argument

When passing an IP address with the -i argument:

$ cat some_malware_incidents.json | python -m osxcollector.output_filters.analyze -M -i 'SOME.IP.ADDRESS.HERE'

I get:

Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/Users/jsendor/Documents/osxcollector/osxcollector/output_filters/analyze.py", line 493, in <module>
    main()
  File "/Users/jsendor/Documents/osxcollector/osxcollector/output_filters/analyze.py", line 490, in main
    run_filter(output_filter)
  File "osxcollector/output_filters/base_filters/output_filter.py", line 93, in run_filter
    final_blobs = output_filter.end_of_lines()
  File "osxcollector/output_filters/base_filters/chain.py", line 67, in end_of_lines
    return self._on_end_of_lines(self._head_of_chain)
  File "osxcollector/output_filters/base_filters/chain.py", line 86, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 86, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 86, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 86, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 86, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 86, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 86, in _on_end_of_lines
    final_lines = self._on_end_of_lines(link._next_link)
  File "osxcollector/output_filters/base_filters/chain.py", line 81, in _on_end_of_lines
    for blob in link.end_of_lines():
  File "osxcollector/output_filters/opendns/related_domains.py", line 77, in end_of_lines
    domains = self._find_related_domains(domains, ips)
  File "osxcollector/output_filters/opendns/related_domains.py", line 104, in _find_related_domains
    related_domains.update(self._rr_history_to_domains(rr_history_info))
  File "osxcollector/output_filters/opendns/related_domains.py", line 122, in _rr_history_to_domains
    for rr_domain in rr_history.get('rrs', []):
AttributeError: 'str' object has no attribute 'get'
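
A defensive guard along these lines might prevent the crash. It is a sketch under the assumption that a non-dict response should simply be treated as "no records"; `rrs_from_history` is a hypothetical helper and the real fix may belong further up, where the response is fetched.

```python
def rrs_from_history(rr_history):
    """Return the 'rrs' list from an rr_history response, or [] if unusable.

    Hypothetical guard: the traceback shows rr_history arriving as a str,
    so anything that is not a dict is treated as having no records.
    """
    if not isinstance(rr_history, dict):
        return []
    return rr_history.get('rrs', [])
```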

Missing 3rd party kext info

If I run

kextstat

on a live machine I see some kernel extensions that are not in the osxcollector output. What's with that?

Logging issues

On master I get a ton of logging errors. These errors keep important data from making it into the .json output. Imagine this, but hundreds of times.

[ERROR]   <type 'exceptions.IOError'> [('./osxcollector.py', 1009, '_log_file_info_for_directory', 'file_info = _get_file_info(file_path, True)'), ('./osxcollector.py', 312, '_get_file_info', 'where_from = _get_where_froms(file_path)'), ('./osxcollector.py', 243, '_get_where_froms', 'return _get_extended_attr(file_path, ATTR_KMD_ITEM_WHERE_FROMS)'), ('./osxcollector.py', 262, '_get_extended_attr', 'xattr_val = getxattr(file_path, attr)'), ('/usr/local/lib/python2.7/site-packages/xattr/__init__.py', 178, 'getxattr', 'return xattr(f).get(attr, options=symlink and XATTR_NOFOLLOW or 0)'), ('/usr/local/lib/python2.7/site-packages/xattr/__init__.py', 68, 'get', 'return self._call(_getxattr, _fgetxattr, name, 0, 0, options | self.options)'), ('/usr/local/lib/python2.7/site-packages/xattr/__init__.py', 59, '_call', 'return name_func(self.value, *args)'), ('/usr/local/lib/python2.7/site-packages/xattr/lib.py', 642, '_getxattr', 'raise error(path)'), ('/usr/local/lib/python2.7/site-packages/xattr/lib.py', 628, 'error', 'raise IOError(errno, strerror, path)')] - {'osxcollector_incident_id': 'osxcollect-2015_12_19-15_57_57', 'osxcollector_username': 'yourmom', 'osxcollector_section': 'mail'}

Repro:

sudo ./osxcollector.py

OSXCollector raises "Unable to parse plist" error when the plist is empty

In some cases, e.g. when the Safari browser history is empty, the corresponding plist file is empty as well. This can happen if the user has never used the browser.

In that case the error:

[ERROR] Unable to parse plist: [The data couldn’t be read because it isn’t in the correct format.]. plist_path[/Users/jdoe/Library/Safari/History.plist] - {'osxcollector_incident_id': 'safari-2015_12_08-20_10_31', 'osxcollector_username': 'jdoe', 'osxcollector_subsection': 'history', 'osxcollector_section': 'safari'}

is raised.

Rather than that, OSXCollector could raise a warning and note that the plist is empty.
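
A small pre-check before parsing could tell these cases apart. This is a sketch only: `plist_state` is a hypothetical helper, and it assumes an "empty" plist is a zero-byte file.

```python
import os


def plist_state(plist_path):
    """Return 'missing', 'empty', or 'present' for a plist path.

    Hypothetical pre-check: 'empty' would be logged as a warning and the
    parser skipped, while 'present' goes on to normal parsing.
    """
    if not os.path.isfile(plist_path):
        return 'missing'
    if os.path.getsize(plist_path) == 0:
        return 'empty'
    return 'present'
```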

Add whitelist for file hashes to the Output Filters

Currently there is a blacklist for file hashes and domains and a whitelist for domains, but no whitelist for file hashes.
The file hash whitelist should ignore things like the legitimate OS X binaries and plists, e.g. those listed here.
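
Roughly, such a filter could look like the sketch below. The field names `md5`, `sha1` and `sha2` are assumed names for the hash fields in a blob, `without_whitelisted_hashes` is a hypothetical function, and how the whitelist itself is loaded is left open.

```python
HASH_KEYS = ('md5', 'sha1', 'sha2')  # assumed names of hash fields in a blob


def without_whitelisted_hashes(blobs, whitelisted_hashes, hash_keys=HASH_KEYS):
    """Yield only blobs whose hash values are not on the whitelist."""
    whitelist = set(h.lower() for h in whitelisted_hashes)
    for blob in blobs:
        hashes = [str(blob[key]).lower() for key in hash_keys if blob.get(key)]
        if any(h in whitelist for h in hashes):
            continue  # known-good file, nothing to analyze
        yield blob
```

Blobs without any hash field pass through untouched, so the filter only affects file records.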

Ignore adblock_custom field value when discovering domains in find_domains filter

Currently the find_domains filter tries to extract domain names from any value.

adblock_custom contains a lot of domains that are on AdBlock's blacklist, all stored in one big string. It does not make sense to extract domain names from this field, as it could contain a lot of malware websites that the user never actually visited.

This field is in osxcollector_section chrome and osxcollector_subsection local_storage. The osxcollector_table_name is ItemTable.

I tried to analyze specificMalware and grepped for installmac.
It was found in that item, and the number of domains extracted from this single value (which apparently is the whole local storage of the Chrome web browser) was large:

$ cat foo.json | grep installmac | jq '.osxcollector_domains' | wc -l
     990
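
A check like the following could be applied before domain extraction. It is a sketch with one loud assumption: that the local storage entry is identified by a `key` field holding the value `adblock_custom`. `should_extract_domains` is a hypothetical name.

```python
def should_extract_domains(blob):
    """Return False for the Chrome local storage entry holding adblock_custom."""
    is_chrome_local_storage = (
        blob.get('osxcollector_section') == 'chrome'
        and blob.get('osxcollector_subsection') == 'local_storage'
        and blob.get('osxcollector_table_name') == 'ItemTable'
    )
    # 'key' as the field naming the local storage entry is an assumption.
    return not (is_chrome_local_storage and blob.get('key') == 'adblock_custom')
```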

SSL errors when running analyze filter

Getting this when I run the analyze filter on my machine. I get the same errors on a fresh install of OS X (in a VM) with a fresh clone of osxcollector.

[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
[ERROR] url[https://www.virustotal.com/vtapi/v2/file/report] status_code[<UNKNOWN>]
Traceback (most recent call last):
  File "/Users/<username>/osxcollector/venv_osxcollector2/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/Users/<username>/osxcollector/venv_osxcollector2/lib/python2.7/site-packages/grequests.py", line 71, in send
    self.url, **merged_kwargs)
  File "/Users/<username>/osxcollector/venv_osxcollector2/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/<username>/osxcollector/venv_osxcollector2/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/Users/<username>/osxcollector/venv_osxcollector2/lib/python2.7/site-packages/requests/adapters.py", line 431, in send
    raise SSLError(e, request=request)
SSLError: EOF occurred in violation of protocol (_ssl.c:590)
<Greenlet at 0x11c74c730: <bound method AsyncRequest.send of <grequests.AsyncRequest object at 0x11c85bb90>>(stream=False)> failed with SSLError

pip freeze:
altgraph==0.12
aspy.yaml==0.2.1
backports.ssl-match-hostname==3.4.0.2
cached-property==1.2.0
certifi==2015.4.28
cffi==1.2.0
coverage==3.7.1
functools32==3.2.3.post2
gevent==1.0.2
greenlet==0.4.7
grequests==0.2.0
jsonschema==2.5.1
macholib==1.7
mock==1.0.1
modulegraph==0.12.1
nodeenv==0.13.3
ordereddict==1.1
pluggy==0.3.0
pre-commit==0.5.2
py==1.4.30
py2app==0.9
pycparser==2.14
pyflakes==0.9.1
pyobjc==3.0.4
pyobjc-core==3.0.4
pyobjc-framework-Accounts==3.0.4
pyobjc-framework-AddressBook==3.0.4
pyobjc-framework-AppleScriptKit==3.0.4
pyobjc-framework-AppleScriptObjC==3.0.4
pyobjc-framework-Automator==3.0.4
pyobjc-framework-CalendarStore==3.0.4
pyobjc-framework-CFNetwork==3.0.4
pyobjc-framework-Cocoa==3.0.4
pyobjc-framework-Collaboration==3.0.4
pyobjc-framework-CoreData==3.0.4
pyobjc-framework-CoreLocation==3.0.4
pyobjc-framework-CoreText==3.0.4
pyobjc-framework-CoreWLAN==3.0.4
pyobjc-framework-DictionaryServices==3.0.4
pyobjc-framework-DiskArbitration==3.0.4
pyobjc-framework-EventKit==3.0.4
pyobjc-framework-ExceptionHandling==3.0.4
pyobjc-framework-FSEvents==3.0.4
pyobjc-framework-InputMethodKit==3.0.4
pyobjc-framework-InstallerPlugins==3.0.4
pyobjc-framework-InstantMessage==3.0.4
pyobjc-framework-LatentSemanticMapping==3.0.4
pyobjc-framework-LaunchServices==3.0.4
pyobjc-framework-PreferencePanes==3.0.4
pyobjc-framework-PubSub==3.0.4
pyobjc-framework-QTKit==3.0.4
pyobjc-framework-Quartz==3.0.4
pyobjc-framework-ScreenSaver==3.0.4
pyobjc-framework-ScriptingBridge==3.0.4
pyobjc-framework-SearchKit==3.0.4
pyobjc-framework-ServiceManagement==3.0.4
pyobjc-framework-Social==3.0.4
pyobjc-framework-StoreKit==3.0.4
pyobjc-framework-SyncServices==3.0.4
pyobjc-framework-SystemConfiguration==3.0.4
pyobjc-framework-WebKit==3.0.4
PyYAML==3.11
requests==2.7.0
simplejson==3.7.3
six==1.9.0
SQLAlchemy==1.0.8
testify==0.7.2
threat-intel==0.1.3
tldextract==1.6
tornado==4.2.1
tox==2.0.2
virtualenv==13.1.0
wheel==0.24.0
xattr==0.7.8

$ python --version
Python 2.7.10
$ python -c 'import ssl; print ssl._OPENSSL_API_VERSION'
(0, 9, 8, 29, 15)

Missing Info on Dependencies

Working directly off the instructions on the wiki, starting up osxcollector on a new 10.10 machine led to a number of errors related to missing Python dependencies.

Specifically I needed to run: sudo pip install xattr pyobjc

This may be confusing to new users. Since most of the documentation is in the wiki would you want that information in there or as something added to the README? I'm happy to submit a PR if I can.

OpenDnsRelatedDomainsFilter should take a when function like RelatedFilesFilter

The OpenDnsRelatedDomainsFilter currently only finds domains related to the initial input.
It would be better to also automatically find domains related to the source of known malicious downloads.

Add a when function like RelatedFilesFilter that can be used to specify "when this is associated with known hinky stuff".
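
To make the idea concrete, here is a sketch of how a `when` predicate could gate which blobs contribute domains. `RelatedDomainsSketch` is a hypothetical stand-in, not the real OpenDnsRelatedDomainsFilter, and its constructor arguments are assumptions.

```python
class RelatedDomainsSketch(object):
    """Hypothetical stand-in showing how a `when` predicate could gate lookups."""

    def __init__(self, initial_domains=None, when=None):
        self._domains = set(initial_domains or [])
        # With no predicate, only the initial domains are used.
        self._when = when or (lambda blob: False)

    def filter_line(self, blob):
        """Collect domains from blobs the predicate flags, then pass the blob on."""
        if self._when(blob):
            self._domains.update(blob.get('osxcollector_domains', []))
        return blob

    def domains_to_look_up(self):
        """Return every domain that related-domain lookups should start from."""
        return sorted(self._domains)
```

A caller would pass something like `when=lambda blob: blob.get('flagged')`, where `flagged` stands for whatever marker an earlier filter uses for known malicious downloads.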
