Turbinia

Summary

Turbinia is an open-source framework for deploying, managing, and running distributed forensic workloads. It is intended to automate the running of common forensic processing tools (e.g. Plaso, TSK, strings) to help process evidence in the Cloud, scale the processing of large amounts of evidence, and decrease response time by parallelizing processing where possible.

How it works

Turbinia is composed of different components for the client, the server, and the workers. These components can run in the Cloud, on local machines, or as a hybrid of both. The Turbinia client sends evidence-processing requests to the Turbinia server. The server creates logical jobs from these incoming requests, and each job creates and schedules the forensic processing tasks to be run by the workers. Where possible, a job splits up the evidence to be processed, creating many tasks so the evidence can be processed in parallel. One or more workers run continuously to process tasks from the server. Any new evidence created or discovered by the tasks is fed back into Turbinia for further processing.
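The job-to-task fan-out described above can be sketched as follows. This is an illustrative sketch only; the function names here are hypothetical and do not match Turbinia's real Job and Task classes:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helper names for illustration; Turbinia's real classes differ.
def split_evidence(evidence, chunks):
    """A job splits evidence into independent units where possible."""
    return [f"{evidence}:part{i}" for i in range(chunks)]

def run_task(part):
    """A worker task processes one unit of evidence."""
    return f"processed({part})"

def run_job(evidence, chunks=4):
    """The server schedules one task per unit; workers run them in parallel."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_task, split_evidence(evidence, chunks)))
```

In the real system the tasks are distributed to separate worker machines rather than threads, and any new evidence a task produces is fed back in as a fresh request.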

Communication from the client to the server is currently done with either Google Cloud PubSub or Kombu messaging. The worker implementation can use either PSQ (a Google Cloud PubSub Task Queue) or Celery for task scheduling.

The main documentation for Turbinia can be found here. You can also find out more about the architecture and how it works here.

Status

Turbinia is currently in Alpha release.

Installation

There is an installation guide here.

Usage

The basic steps to get things running after the initial installation and configuration are:

  • Start Turbinia server component with turbiniactl server command
  • Start Turbinia API server component with turbiniactl api_server command if using Celery
  • Start one or more Turbinia workers with turbiniactl celeryworker if using Celery, or turbiniactl psqworker if using PSQ
  • Install turbinia-client via pip install turbinia-client
  • Send evidence to be processed from the turbinia client with turbinia-client submit ${evidencetype}
  • Check status of running tasks with turbinia-client status

The turbinia-client tool can be used to interact with Turbinia through the API server component. Basic usage:

$ turbinia-client -h
Usage: turbinia-client [OPTIONS] COMMAND [ARGS]...

  Turbinia API command-line tool (turbinia-client).

                          ***    ***
                           *          *
                      ***             ******
                     *                      *
                     **      *   *  **     ,*
                       *******  * ********
                              *  * *
                              *  * *
                              %%%%%%
                              %%%%%%
                     %%%%%%%%%%%%%%%       %%%%%%
               %%%%%%%%%%%%%%%%%%%%%      %%%%%%%
  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%  ** *******
  %%                                                   %%  ***************
  %%                                (%%%%%%%%%%%%%%%%%%%  *****  **
    %%%%%        %%%%%%%%%%%%%%%
    %%%%%%%%%%                     %%          **             ***
       %%%                         %%  %%             %%%           %%%%,
       %%%      %%%   %%%   %%%%%  %%%   %%%   %%  %%%   %%%  %%%       (%%
       %%%      %%%   %%%  %%%     %%     %%/  %%  %%%   %%%  %%%  %%%%%%%%
       %%%      %%%   %%%  %%%     %%%   %%%   %%  %%%   %%%  %%% %%%   %%%
       %%%        %%%%%    %%%       %%%%%     %%  %%%    %%  %%%   %%%%%

  This command-line tool interacts with Turbinia's API server.

  You can specify the API server location in ~/.turbinia_api_config.json

Options:
  -c, --config_instance TEXT  A Turbinia instance configuration name.
                              [default: (dynamic)]
  -p, --config_path TEXT      Path to the .turbinia_api_config.json file.
                              [default: (dynamic)]
  -h, --help                  Show this message and exit.

Commands:
  config    Get Turbinia configuration.
  evidence  Get or upload Turbinia evidence.
  jobs      Get a list of enabled Turbinia jobs.
  result    Get Turbinia request or task results.
  status    Get Turbinia request or task status.
  submit    Submit new requests to the Turbinia API server.

Check out the turbinia-client documentation page for a detailed user guide.
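The help text above references ~/.turbinia_api_config.json for specifying the API server location. A minimal sketch of what such a file might contain is shown below; the field names and values here are assumptions for illustration only, so consult the turbinia-client documentation for the authoritative schema:

```json
{
  "default": {
    "API_SERVER_ADDRESS": "http://localhost",
    "API_SERVER_PORT": 8000,
    "API_AUTHENTICATION_ENABLED": false
  }
}
```

The -c flag selects a named instance configuration (such as "default" above), which allows one config file to describe several Turbinia deployments.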

You can also interact with Turbinia directly from Python by using the API library. We provide some examples here.

Other documentation

Obligatory Fine Print

This is not an official Google product (experimental or otherwise), it is just code that happens to be owned by Google.

Contributors

aarontp, alimez, beamcodeup, berggren, c-h-fahy, coryaltheide, dependabot[bot], dfjxs, ericzinnikas, fpiedrah, fryyyyy, giovannt0, hacktobeer, holzmanolagrene, igor8mr, jaegeral, jleaniz, joachimmetz, jorlamd, mwatkins-fb, onager, ramo-j, rgayon, rjcolonna, roshanmaskey, sa3eed3ed, simon-berg, slaynot, tomchop, wajihyassine


turbinia's Issues

Add {pre,post}processor hooks

These can be used to process the evidence or the local node before and after a task runs. Examples are attaching a cloud disk, mounting a disk, decrypting a disk, etc.
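The requested hooks could look something like the sketch below. The names here are illustrative (Turbinia's actual Evidence implementation may differ); the point is that setup and teardown bracket the task so that teardown always runs:

```python
# Illustrative sketch of {pre,post}processor hooks; names are hypothetical.
class Evidence:
    def __init__(self, source):
        self.source = source
        self.mounted = False

    def preprocess(self):
        # e.g. attach a cloud disk, mount it, or decrypt it
        self.mounted = True

    def postprocess(self):
        # undo whatever preprocess() set up
        self.mounted = False

def run_task(evidence, task):
    """Run a task with the evidence prepared, cleaning up even on failure."""
    evidence.preprocess()
    try:
        return task(evidence)
    finally:
        evidence.postprocess()
```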

Easy handling of multiple Turbinia instances from the client

Right now the user has to manually manage separate configs to talk to different Turbinia instances. It would be nice if there was a way to have multiple servers registered under one config, or possibly have a way for servers to register in a cloud project, and have the client be able to easily select from any server instance in a given cloud project.

Autogenerate turbiniactl command args from evidence objects

We should autogenerate all of the commands and command args for turbiniactl based on the evidence object types and attributes so that all evidence types can be added through turbiniactl without needing to manually keep things in sync.
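A minimal sketch of the idea, using argparse subcommands generated from a table of evidence attributes. The evidence type names and attributes below are hypothetical placeholders; the real implementation would introspect the Evidence classes themselves:

```python
import argparse

# Hypothetical evidence definitions used only for illustration.
EVIDENCE_TYPES = {
    "rawdisk": ["source_path"],
    "googleclouddisk": ["disk_name", "project", "zone"],
}

def build_parser():
    """Generate one subcommand per evidence type from its attribute list."""
    parser = argparse.ArgumentParser(prog="turbiniactl")
    subparsers = parser.add_subparsers(dest="evidence_type")
    for name, attrs in EVIDENCE_TYPES.items():
        sub = subparsers.add_parser(name)
        for attr in attrs:
            sub.add_argument(f"--{attr}", required=True)
    return parser
```

With this pattern, adding a new evidence type (or attribute) automatically surfaces it on the command line with no parser changes.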

Create new GCS -> Persistent Disk copy task

This will create a new Persistent Disk with a filesystem slightly larger than the image file in GCS, and will copy the raw image from GCS directly as a file in the Persistent Disk filesystem. This will require a new Evidence type called something like PersistentDiskLocalImage.

turbiniactl status command

We need a way to get the status out of Turbinia remotely after things start. This may mean creating a new pubsub interface or other back-channel mechanism since right now the way to talk to Turbinia remotely is through pubsub (currently one-way).

Reconsider num of workers per host

Right now PSQ starts a worker per core, but some of the underlying tools (e.g. Plaso) do this as well, so it doesn't make sense to have that many worker tasks running. I'm not sure if a single worker per host is the right option either, though.

Add Turbinia instance as datastore namespace

Right now when we save data into Datastore we don't have any way to distinguish which Turbinia instance is saving the entities, which means that we can't have more than one Turbinia instance per cloud project without mixing the task status results together. We should add a new 'instance' (or something similar) as part of the entity key to help disambiguate. We could potentially use the config.PSQ_TOPIC variable here as this should already be unique.
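The disambiguation described above can be sketched with a plain dict standing in for the shared Datastore; in real code the instance name would become a component of the Datastore entity key. The class and method names below are hypothetical:

```python
# Sketch only: a dict stands in for Cloud Datastore, showing how adding an
# 'instance' component (e.g. config.PSQ_TOPIC) to the key lets multiple
# Turbinia instances share one cloud project without mixing task statuses.
class TaskStatusStore:
    def __init__(self, instance, backend):
        self.instance = instance  # unique per Turbinia instance
        self.backend = backend    # shared storage (real code: Datastore)

    def put(self, task_id, status):
        self.backend[(self.instance, task_id)] = status

    def get(self, task_id):
        return self.backend.get((self.instance, task_id))
```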

Fix/Update init scripts

Right now I have a PR for some super basic/hacky start-up scripts, but we should update those into something more legit.

Implement turbiniamgmt workercheck command

Right now we don't have a way to check how many workers are connected, and this can be non-trivial because workers can connect from anywhere. We should implement a 'turbiniactl workercheck' (or similar) command to run a quick check on all the workers. PSQ has a Broadcast worker mechanism that we can use for this.

Add ssdeep/hashdeep job

(There is another open ended issue to track adding multiple job types, but I'm going to start breaking them out into their own issues).

Add support for multi-pools for workers

We can add different worker pools that are created to match individual job types (e.g. so Plaso can have worker nodes with more CPU than workers for other job types). Right now it's one global worker pool.

Update setup.py

It's out of date, and has packages listed from the previous iteration.

Add privilege aware execution handler

Currently Turbinia assumes that it runs as a user with sudo privileges. We should add methods to the TurbiniaTask object to handle executing privileged commands rather than hard-coding sudo into commands being run.
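One way such a method might look (a sketch under the assumption that commands are passed as argument lists, not how Turbinia actually implements it):

```python
import subprocess

def execute(cmd, needs_root=False):
    """Run cmd (a list of args), prefixing sudo only when privileges are
    required, instead of hard-coding 'sudo' into every command string.

    Hypothetical sketch of a method that could live on TurbiniaTask.
    """
    if needs_root:
        # -n: fail rather than prompt for a password on a headless worker
        cmd = ["sudo", "-n"] + cmd
    return subprocess.run(cmd, capture_output=True, text=True)
```

Centralizing this also gives one place to later swap sudo for capabilities, a setuid helper, or a privileged sidecar process.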

Timesketch support

Timesketch has a fancy new Python API. We should upload output to Timesketch where relevant.

Adding errors to TurbiniaTaskResult causes it to be unpickleable

[INFO] Took 0.27 sec
Traceback (most recent call last):
  File "/home/turbinia/src/turbinia/turbiniactl", line 175, in <module>
    worker.listen()
  File "/home/turbinia/turbinia-env/local/lib/python2.7/site-packages/psq/worker.py", line 60, in listen
    self.run_task(task)
  File "/home/turbinia/turbinia-env/local/lib/python2.7/site-packages/psq/worker.py", line 69, in run_task
    task.execute(self.queue)
  File "/home/turbinia/turbinia-env/local/lib/python2.7/site-packages/psq/task.py", line 96, in execute
    queue.storage.put_task(self)
  File "/home/turbinia/turbinia-env/local/lib/python2.7/site-packages/psq/datastore_storage.py", line 61, in put_task
    entity['data'] = dumps(task)
  File "/home/turbinia/turbinia-env/local/lib/python2.7/site-packages/google/cloud/client.py", line 134, in __getstate__
    'Clients have non-trivial state that is local and unpickleable.',
pickle.PicklingError: Pickling client objects is explicitly not supported.
Clients have non-trivial state that is local and unpickleable.
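The error happens because the result object holds a reference to a Google Cloud client, and client objects refuse to pickle. One common workaround (a sketch of the general pattern, not necessarily how Turbinia resolved this issue) is to drop the unpicklable member from the pickled state:

```python
import pickle
import threading

class TaskResult:
    """Illustrative stand-in for TurbiniaTaskResult."""

    def __init__(self):
        self.errors = []
        # A lock is used here as a stand-in for an unpicklable client object.
        self.client = threading.Lock()

    def __getstate__(self):
        state = self.__dict__.copy()
        state["client"] = None  # drop the unpicklable member before pickling
        return state
```

After unpickling on the other side, the consumer re-creates the client lazily instead of relying on the serialized copy.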

Add cloud storage output option

Right now the output for jobs is put into a locally mounted filesystem, and we should allow for the option to put the output directly into cloud storage.

Add a README That Explains this Project

Hey @berggren (& other Googlers!),

I've heard this project referenced multiple times, and I'm super curious, but the README doesn't really give much of an idea what this project is for. Is it like a Salt/Puppet/Ansible thing? Is it an orchestration platform? I know I could read the code, but that's a big investment. A decent README that explains what this project is about would go a long way to helping folks evaluate if it's helpful.

Thanks!

WebUI

We want a simple web UI to start Turbinia jobs.
