Git Product home page Git Product logo

manta-mackerel's Introduction

manta-mackerel

This repository is part of the Joyent Manta project. For contribution guidelines, issues, and general documentation, visit the main Manta project page.

Mackerel generates Metering and usage reports for Manta.

Metering code is in two main categories - scripts that set up and kick off Marlin jobs (job creation), and code that is uploaded as assets and executed as part of a Marlin job (assets).

Job creation code is in /lib Programs that generate job input keys are in /lib/keygen Assets are in /assets Configuration files are in /etc

Storage data comes from Postgres dumps from each Moray shard. Request data comes from Muskie audit logs. Compute data comes from Marlin agent audit logs.

Running metering jobs

Run metering jobs using

bin/meter [-f configPath] [-c] [-r] -d date -j jobName

where date is in some format readable by Date.parse() (e.g. ISO-8601 or output from date(1)) and jobName is one of storage, compute, request, accessLogs, or summarizeDaily. Setting the -c flag will print the job manifest without creating a job. Setting the -r flag will automatically retry the job in the event of failures.

Examples:

bin/meter -j "storage" -d "$(date)"
bin/meter -j "accessLogs" -d "$(date -d '1 month ago')"
bin/meter -j "summarizeDaily" "$(date -d '2 days ago')" -f path/to/config

Configuration file

The configuration file config.json should be in a format compatible with require (i.e. JSON or a module that exports a JS object).

Configuration settings include:

{
    // local path to manta client config
    'mantaConfigFile': '/path/to/file',

    // mapping from manta object to local file path
    'assets': {
        '/manta/asset/path': '/local/file/path',
        ...
    }

    // where the uuid->login mapping is written to locally
    'mantaLookupPath': '/path/to/file'

    // retry configuration settings for result monitoring
    'monitorBackoff': {
        'initialDelay': 1000, // milliseconds
        'maxDelay': 120000, // milliseconds
        'failAfter': 20 // count
    }

    // mahi config
    c.mahi = {
        host: 'localhost',
        port: 6379
    }
}

Job details look like:

{
    'jobs': {
        jobName: { // e.g. storage, request, compute

            'keygen': 'name of key generator', // under /lib/keygen

            // any arguments to pass to the keygen
            'keygenArgs': {
                'source': '/manta/source/path'
            },

            // the job manifest passed to marlin
            'job': {
                'name': name,
                'phases': [ {
                        'type': type,
                        'assets': [...],
                        'init': initStr,
                        'exec': execStr
                }, {
                ...
                } ]
            },

            // where to create a link to the result when the job finishes
            'linkPath': '/path/to/latest'

            // any environment variables you need set at job runtime
            env: {
                'DEST': '/manta/path.json',
                ...
            }
        },
        ...
    },
    ...

}

Some job configuration is passed to a job via environment variables.

  • ACCESS_DEST: the Manta path to append to "/:user" for access logs
  • COUNT_UNAPPROVED_USERS: if true, users with approved_for_provisioning set to false will be included in reports DATE: the date the job will meter for
  • DELIVER_UNAPPROVED_REPORTS: if true, users with approved_for_provisioning set to false will receive usage reports in their /reports directory DEST: the Manta path where usage reports should be saved to
  • DROP_POSEIDON_REQUESTS: if true, drop requests where poseidon is the caller from usage reports or access logs HEADER_CONTENT_TYPE: content type for usage reports
  • MALFORMED_LIMIT: in request and compute, how many lines of malformed json to allow before throwing an error. scalar ("500") or percent ("1%") MIN_SIZE: minimum size of objects in storage metering
  • NAMESPACES: a list of all namespaces in Manta
  • USER_DEST: the Manta path to append to "/:user" for usage reports
  • USER_LINK: the path to create a link to the latest usage report generated for a user

Testing

Mackerel needs a Manta deployment. Point to a Manta deployment by filling in the fields in etc/test-config.js, which overrides config entries in the normal config file at etc/config.js. Entries to override include:

* manta url/user/sign: manta credentials
* mantaBaseDirectory: modify this to make the integration tests use your
    directory instead of the default /poseidon/stor/usage
* lookupFile: path to the test lookup file
* mahi.host: mahi host (only used for generating lookups)

Run tests using: make test

To run a single integration test: nodeunit -t storage test/integration.test.js

Resharding

See MANTA-1744 and MANTA-1665.

There are two things that need to happen for consumers of dumps when resharding happens:

  1. Shard discovery via Manta, and
  2. Dedupe rows

Ideally, these two things would happen automatically. However, the infrequeent nature of resharding makes automating these tasks low priority.

If a reshard needed to happen and the automatic systems were not in place yet, shard discovery would need to updated manually via configuration. Duplicate rows would be removed via marlin jobs that would run over the (very few) dumps that have duplicates.

A somewhat related but separate issue is the transformation of dumps from SQL to JSON. Currently that transformation is occurring on the postgres boxes themselves. However, as the datasets get larger, it would be preferable to use marlin to transform the dumps.

manta-mackerel's People

Contributors

fkuo avatar bcantrill avatar trentm avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.