manta-sharkspotter's Introduction

sharkspotter

Sharkspotter is a program that scans Postgres for Manta objects that should reside on a specific shark. It was made as a quick workaround for the absence of the online storage auditing that large Manta deployments require.

Installing

$ npm install

Running

Invoke sharkspotter from the CLI with a few required arguments.

Description of the flags:

  -b, --begin        manta relation _id to start search from
                     default is 0
  -e, --end          manta relation _id to end search at
                     default is the larger of max(_id) and max(_idx)
  -d, --domain       domain name of manta services
                     e.g. us-east.joyent.us
  -m, --moray        moray shard to search
                     e.g. 2.moray
  -s, --shark        shark to search objects in moray for
                     e.g. 3.stor
  -c, --chunk-size   number of objects to search in each PG query
                     default is 10000

Here's an example from a development setup:

$ node --abort-on-uncaught-exception sharkspotter.js -d walleye.kkantor.com -m 2.moray -s 3.stor -c 20000

The process outputs bunyan-style logs, so feel free to pipe the command into bunyan, redirect it to a file to look at later, or do both using tee(1).

This command starts the search from the beginning of the manta table on the 2.moray shard (the default, since -b is not given) and ends at the 80000th record (the default end in this setup, since -e is not given), querying in chunks of 20000. The program searches for objects that should exist on 3.stor.
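
To make the chunking concrete, here is a minimal sketch of how an _id range could be split into query-sized pieces. chunkRanges is a hypothetical helper, not a function from sharkspotter, and treating both bounds as inclusive is an assumption; the real query boundaries may differ.

```javascript
// Hypothetical helper: split the _id range [begin, end] into chunks.
// Assumption: both bounds are inclusive; sharkspotter itself may differ.
function chunkRanges(begin, end, chunkSize) {
    const ranges = [];
    for (let lo = begin; lo <= end; lo += chunkSize) {
        // Each chunk covers chunkSize ids, clamped to the end of the range.
        const hi = Math.min(lo + chunkSize - 1, end);
        ranges.push([lo, hi]);
    }
    return ranges;
}

// 80000 ids in chunks of 20000 gives four queries.
console.log(chunkRanges(0, 79999, 20000));
```

A larger chunk size means fewer queries against Postgres, but each query scans more rows.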

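Matching objects are written to a file in the current directory. The file name follows the pattern moray_shard.pid.out, where moray_shard is the shard's full name including the domain (for example, 2.moray.walleye.kkantor.com.14013.out).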

The output format follows the form:

owner_uuid object_uuid shark_1 shark_2 ... shark_N

One object is listed on each line.

Here's an example of the output using the above example invocation:

$ tail 2.moray.walleye.kkantor.com.14013.out
0864994a-6ef0-e5a5-a86c-e64790f5e90c 8052f0eb-16da-ca6e-a94c-e95da3419f3c 3.stor.walleye.kkantor.com 1.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c a9a675a1-6d00-e9c7-a585-8d87b318877e 1.stor.walleye.kkantor.com 3.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c 6b278fd0-c353-6c28-ba79-e365ceb5484c 3.stor.walleye.kkantor.com 1.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c 0d15642e-3081-e568-e59a-a42e8ee879e9 3.stor.walleye.kkantor.com 1.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c 7c586264-de59-4af5-fe45-d9ead9567e71 3.stor.walleye.kkantor.com 1.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c 236daedb-b0d8-6391-c970-a916087e07d8 3.stor.walleye.kkantor.com 1.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c cd6db21f-fd01-482b-f369-e00678f9cebb 3.stor.walleye.kkantor.com 2.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c 2cf1f699-9cb3-c05a-b3aa-8c43dd207cc7 3.stor.walleye.kkantor.com 2.stor.walleye.kkantor.com
0864994a-6ef0-e5a5-a86c-e64790f5e90c f3bc9886-bc53-680f-dfe6-dada6663c53a 3.stor.walleye.kkantor.com 1.stor.walleye.kkantor.com
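
Tools that consume this file need to split each line back into its fields. The following is a minimal sketch assuming whitespace-separated fields exactly as shown above; parseOutputLine is a hypothetical helper and not part of sharkspotter.

```javascript
// Parse one line of sharkspotter output:
//   owner_uuid object_uuid shark_1 shark_2 ... shark_N
// Hypothetical helper; assumes fields are separated by whitespace.
function parseOutputLine(line) {
    const fields = line.trim().split(/\s+/);
    if (fields.length < 3) {
        throw new Error('expected owner, object, and at least one shark: ' + line);
    }
    const [owner, object, ...sharks] = fields;
    return { owner, object, sharks };
}

// First line of the sample output above.
const sample = '0864994a-6ef0-e5a5-a86c-e64790f5e90c ' +
    '8052f0eb-16da-ca6e-a94c-e95da3419f3c ' +
    '3.stor.walleye.kkantor.com 1.stor.walleye.kkantor.com';
console.log(parseOutputLine(sample));
```

The sharks array lists every shark that should hold a copy, so a repair tool can find the other copies of an object by filtering out the shark that was scanned.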

Monitoring

The bundled DTrace script, progress.d, can be used to watch the progress of one sharkspotter process. You must pass in a quoted shard number as the sole argument:

$ ./progress.d '"2"'
waiting for records from moray...
2.moray: 0% [ 0 / 134222 ]
2.moray: 0% [ 5 / 134222 ]
2.moray: 3% [ 4375 / 134222 ]
2.moray: 6% [ 8402 / 134222 ]
2.moray: 9% [ 12374 / 134222 ]
2.moray: 12% [ 17000 / 134222 ]
2.moray: 16% [ 22000 / 134222 ]
2.moray: 19% [ 26001 / 134222 ]
2.moray: 22% [ 30662 / 134222 ]
2.moray: 25% [ 34378 / 134222 ]
2.moray: 28% [ 38717 / 134222 ]
2.moray: 31% [ 42263 / 134222 ]
2.moray: 34% [ 46641 / 134222 ]
2.moray: 37% [ 50719 / 134222 ]
2.moray: 40% [ 54265 / 134222 ]
2.moray: 42% [ 57222 / 134222 ]
2.moray: 45% [ 60745 / 134222 ]
2.moray: 48% [ 64442 / 134222 ]
2.moray: 51% [ 68598 / 134222 ]
2.moray: 54% [ 73000 / 134222 ]
2.moray: 57% [ 77359 / 134222 ]
2.moray: 61% [ 82000 / 134222 ]
2.moray: 63% [ 85475 / 134222 ]
2.moray: 67% [ 90002 / 134222 ]
2.moray: 69% [ 93538 / 134222 ]
2.moray: 72% [ 97365 / 134222 ]
2.moray: 75% [ 101257 / 134222 ]
2.moray: 78% [ 105000 / 134222 ]
2.moray: 81% [ 109000 / 134222 ]
2.moray: 83% [ 112288 / 134222 ]
2.moray: 86% [ 116053 / 134222 ]
2.moray: 89% [ 120000 / 134222 ]
2.moray: 91% [ 123428 / 134222 ]
2.moray: 94% [ 127451 / 134222 ]
2.moray: 97% [ 131337 / 134222 ]
sharkspotter completed
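
If you want to feed this output to another tool, each progress line can be matched with a small regular expression. The sketch below assumes the line format stays exactly as in the sample above; parseProgressLine is a hypothetical helper, not part of sharkspotter or progress.d.

```javascript
// Hypothetical helper: parse a progress.d line such as
//   2.moray: 3% [ 4375 / 134222 ]
// Returns null for lines that are not progress lines.
function parseProgressLine(line) {
    const match = /^(\S+): (\d+)% \[ (\d+) \/ (\d+) \]$/.exec(line.trim());
    if (match === null) {
        return null;
    }
    return {
        shard: match[1],
        percent: Number(match[2]),
        done: Number(match[3]),
        total: Number(match[4])
    };
}

console.log(parseProgressLine('2.moray: 3% [ 4375 / 134222 ]'));
console.log(parseProgressLine('waiting for records from moray...'));
```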

manta-sharkspotter's People

Contributors

kodykantor

manta-sharkspotter's Issues

ignore MPU part records

MPU parts are removed on storage nodes during the mako-finalize portion of an MPU after being concatenated into a single contiguous file. The metadata shards that hold information about the MPU parts aren't notified that the parts have disappeared from disk.

This complicates things for tools using sharkspotter data to restore object redundancy. There are a couple of things sharkspotter can do to work around the current MPU functionality:

  1. Make sure that the upload associated with the given MPU parts is committed.
  2. Ignore all MPU part records.

For option 1), if the MPU is committed then the objects found in the manta table can definitely be ignored. If the MPU is not committed then it could still be in flight, and things are less clear. Maybe the MPU will be finalized and committed between the time sharkspotter is run and the time the data is analyzed for corruption (meaning HTTP 404s for MPU parts), or maybe the MPU will still be in flight when the data is analyzed for corruption (meaning an HTTP 200 for MPU parts).

As we can see, option 1) doesn't simplify things much. At the end of the day we may have some files that are marked as missing, but were actually part of a committed MPU. So no matter what we'll have to do some post-processing to determine whether or not MPU parts are legitimately unaccounted for.

I suggest sharkspotter implement option 2). We have precedent for taking this approach (MANTA-3410). To do this I think we could use a regex that ignores all objects in each user's 'uploads' directory.
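
A minimal sketch of such a filter follows. It assumes object keys have the form /<owner_uuid>/<top-level directory>/... and that MPU parts live under the owner's top-level uploads directory; both the key layout and the isMpuPart name are assumptions for illustration, not taken from sharkspotter.

```javascript
// Assumption: keys look like /<owner_uuid>/<top-level dir>/...
// and MPU parts are stored under the top-level "uploads" directory.
const MPU_PART_KEY = /^\/[^\/]+\/uploads\//;

// Hypothetical helper: true if the key belongs to an MPU part.
function isMpuPart(key) {
    return MPU_PART_KEY.test(key);
}

// Example keys (made up for illustration).
const owner = '0864994a-6ef0-e5a5-a86c-e64790f5e90c';
console.log(isMpuPart('/' + owner + '/uploads/0/0abc/1'));      // true
console.log(isMpuPart('/' + owner + '/stor/uploads/file.txt')); // false
```

Anchoring the pattern to the first path component keeps ordinary objects that merely sit in a directory named uploads further down the tree.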

sharkspotter should be able to run against more than one shard

Currently sharkspotter can only be pointed at a single Moray shard. This was the quick and easy solution, and led to the addition of kick_off_sharkspotter.sh.

Sharkspotter data is limited in its usefulness without data from multiple (all) shards in a Manta deployment.

We'll also need to handle errors better in sharkspotter if we want to implement this change. Currently sharkspotter basically gives up if its Moray connections die. That would be bad behavior with 100+ shards: one of them flopping would cause all 100 sharkspotter searches to quit. #3 is filed for this.

Adding some more monitoring to sharkspotter would also be necessary. If one Moray backend out of 100 dies, it would be time-consuming to wade through the noise of log files to discover where in the scan that backend died. #4 is filed for this.
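
One way to keep a single bad shard from taking down the others is to run each shard's scan independently and collect failures at the end. The sketch below is not how sharkspotter works today; scanAllShards is a hypothetical wrapper and scanShard is a hypothetical async function that performs the whole search for one shard.

```javascript
// Sketch only: scanShard is a hypothetical function (shard) => Promise
// that performs the whole search for one Moray shard.
async function scanAllShards(shards, scanShard) {
    // allSettled waits for every scan, so one rejection cannot abort the rest.
    const results = await Promise.allSettled(
        shards.map(async (shard) => scanShard(shard)));
    const completed = [];
    const failed = [];
    results.forEach((result, i) => {
        if (result.status === 'fulfilled') {
            completed.push(shards[i]);
        } else {
            // Keep the error so the operator can see which shard died and why.
            failed.push({ shard: shards[i], error: result.reason });
        }
    });
    return { completed, failed };
}
```

The returned failed list gives the operator one place to look, instead of a log file per shard.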

handle errors better

Currently sharkspotter basically gives up if its Moray connections die. So in this case the operator would have to a) notice and b) manually restart sharkspotter from the last successful query chunk (identified by looking in the sharkspotter logs).

Error handling improvements would hopefully include resuming the search for a failed Moray backend, starting from the last known successful query chunk, once we're able to connect again.
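
As a rough illustration of resuming from the last successful chunk, the sketch below advances only after a chunk succeeds and retries the same chunk on failure. scanWithResume, queryChunk, and maxRetries are hypothetical names, and the inclusive chunk bounds are an assumption; none of this is taken from sharkspotter.

```javascript
// Sketch only: queryChunk is a hypothetical async function (lo, hi) => Promise
// standing in for whatever issues the per-chunk Moray query.
async function scanWithResume(begin, end, chunkSize, queryChunk, maxRetries) {
    let lo = begin;      // start of the next chunk that has not succeeded yet
    let failures = 0;    // consecutive failures for the current chunk
    while (lo <= end) {
        const hi = Math.min(lo + chunkSize - 1, end);
        try {
            await queryChunk(lo, hi);
            lo = hi + 1; // advance only after the chunk succeeds
            failures = 0;
        } catch (err) {
            failures += 1;
            if (failures > maxRetries) {
                // Record where to resume instead of silently giving up.
                err.resumeFrom = lo;
                throw err;
            }
            // A real implementation would wait or reconnect before retrying.
        }
    }
    return lo;
}
```

If the retries are exhausted, the thrown error carries resumeFrom, which could be passed back in as the begin value on the next run.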

There's probably a lot more low-hanging fruit to improve error handling too.

use proper packaging

Sharkspotter is deployed by creating a tarball from this repository, copying it somewhere, and extracting the tarball. A node binary, which is not checked in to this repo, needs to be shipped along with it.

We should:

  • add a Makefile to this repo
  • bundle a node binary
  • add a linter and style checker

and if we're serious about this tool we should consider bundling/moving it to the mola repository so it can be shipped in the Manta 'ops' zone.

add better monitoring

The easiest way to figure out the progress of a sharkspotter process is to run the bundled progress.d script to watch progress through a specified shard.

It would be nice if there were a CLI 'adm'-type command to interact with a running sharkspotter process to:

  • get info about the configuration (shard, storage_id, progress through table, number of records discovered, etc.)
  • get info about any errors that occurred and how far through the table sharkspotter was when they occurred
