
collectors' Introduction

BenchFlow

BenchFlow is an open-source expert system providing a complete platform for automating performance tests and performance analysis. We know that not all developers are performance experts, yet in today's agile environments they need to deal with performance testing and performance analysis every day. In BenchFlow, users define objective-driven performance tests using an expressive, SUT-aware DSL implemented in YAML. BenchFlow then automates the end-to-end process of executing the performance tests and providing performance insights: it deploys the system under test relying on Docker technologies, distributes the simulated user load across different servers, handles errors, collects performance data, and computes performance metrics and insights.

Quick links: BenchFlow Documentation | TODO - also link to the documentation

TODO (try BenchFlow)

Purpose

TODO (BenchFlow has a strong focus on developer happiness & ease of use, and a batteries-included philosophy.)

Current project focus

The BenchFlow expert system is currently focused mainly on enabling performance benchmarks of Workflow Management Systems supporting the BPMN 2.0 modeling and execution language. Despite this main focus, most of its components are reusable and already general enough to support performance benchmarks of generic Web Services. We strongly encourage extending BenchFlow by adding missing functionalities specific to your particular benchmarking needs. TODO ([point to setup and getting started]). Website related to the current focus: http://benchflow.inf.usi.ch.

We currently have a temporary logo; a proper logo will come at some point in the future.

Upcoming project focus

(TODO) automated objective-driven performance testing, and integration in continuous software improvement lifecycle.

Features (Why BenchFlow?)

TODO (also link to the documentation)

  • definition of a performance benchmark/test through a dedicated DSL;
  • automation of the deployment of the System Under Test on distributed infrastructures using Docker;
  • reliable execution of the performance benchmark using Faban;
  • data collection and cleaning;
  • data analysis in the form of computed metrics and KPIs.

Use Cases

TODO (to show the uses of the tool, linking to an actual article explaining how to do that; point also to contributing for extending)

Installation or Upgrade

TODO (explain current project state in dev, and link to docs and uses to state that it is usable, but not 100% battle tested)

TODO (getbenchflow in container for client and docs for the rest [also links to Docker Hub if needed, at least in the developer documentation], current release. Explain: Docker as prerequisite)

(TODO, maybe at the top) Project Status: The project is currently in active development and is tested on Mac OS X for the client-side command line tools, and Ubuntu 14.04.2 LTS for the server-side tools. The main project branch is devel [maybe for now say that there are no releases yet, but that we are in the process of preparing the first release].

Getting Started

Prerequisites

TODO (simplest example, then links to the docs for advanced stuff)

Installing

TODO (needs help or customisation, write contacts)

Built With

TODO

Contributing

TODO (also related to extending to custom software, and links to developer documentation and TODOs)

Versioning

TODO (SemVer + link to docs)

Authors

TODO

License

Copyright © 2014-2017, Vincenzo Ferme, for his own and contributors' committed code and artefacts.

The license for all non-third-party code in the BenchFlow repositories is RPL-1.5, unless otherwise noted.

collectors' People

Contributors

cerfoglg, simonedavico, vincenzoferme


Forkers

simonedavico

collectors' Issues

Improved REST API with correct methods

Currently we assume that all calls to the API are done with GET. To have a proper RESTful API we need to accept the correct methods, which should be:

  • PUT for /store
  • POST for /start
  • PUT for /stop

If the wrong method is used, the collector should return an HTTP 405 METHOD NOT ALLOWED.
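
A minimal sketch of how a collector could enforce these methods with Go's net/http (the handler bodies are placeholders):

package main

import "net/http"

// requireMethod rejects requests that do not use the expected HTTP method
// with a 405 METHOD NOT ALLOWED, as described above.
func requireMethod(method string, h http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != method {
			w.Header().Set("Allow", method)
			http.Error(w, "405 METHOD NOT ALLOWED", http.StatusMethodNotAllowed)
			return
		}
		h(w, r)
	}
}

func main() {
	// Placeholder handlers standing in for the real collector logic.
	ok := func(w http.ResponseWriter, r *http.Request) { w.Write([]byte("ok")) }
	http.HandleFunc("/store", requireMethod("PUT", ok))
	http.HandleFunc("/start", requireMethod("POST", ok))
	http.HandleFunc("/stop", requireMethod("PUT", ok))
	http.ListenAndServe(":8080", nil)
}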

Make the implemented collectors clean and remove hard coded stuff

Remove:

  • Hard coded environment variables
  • Hard coded strings
  • Not needed folders and files

Add:

  • Dependency on envconsul to retrieve the environment variables
  • Consul service discovery

Manage:

  • Dependencies by only using Godeps

Docker:

  • A Go container that builds the project from Git, removing the dependency on the golang image;
  • Also have a Dockerfile for local development, using a base container to avoid duplicating the code cloning among Dockerfiles, if possible

Comments:

  • @Cerfoglg add some explanatory comments to the code, after refactoring it.

Investigate and Fix the Following Bugs

Zip Collector

If TO_ZIP is /, the call fails with Data read ‘22789403’ is not equal to the size ‘22757376’ of the input Reader. and no data are saved on Minio.

Logs Collector

No data are on Minio after the call.

Add CSV saving to mysqldump

Add extra functionality to the mysqldump collector to:

  • Save a given set of tables as CSV files. Each table needs to be saved in its own CSV file.
  • Save the definition of the given set of tables, i.e., the types of the columns (int, varchar, ...). Each table should have its own CSV file with a single row defining the data type of each column.

In the case of MySQL, the dump should be achieved by querying the database for an entire table, returned in tab-separated format, and altering the output to turn it into comma-separated values. Column types can be obtained the same way.

By saving our databases in CSV format we streamline the transform phase of ETL (Extract, Transform, Load): there is a single format for all possible databases, and the different database-specific collectors take care of dumping the different databases into CSV files.
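
As a rough illustration, the dump step could look like the following sketch, which queries the mysql CLI in --batch mode (tab-separated output) and rewrites the rows with encoding/csv; the function name, flags and credential handling are assumptions:

package collector

import (
	"encoding/csv"
	"os"
	"os/exec"
	"strings"
)

// dumpTableAsCSV dumps one table as a CSV file by converting the
// tab-separated output of the mysql CLI.
func dumpTableAsCSV(host, port, user, password, db, table, outPath string) error {
	// --batch prints tab-separated rows, one per line, with a header row.
	// Column types could be obtained the same way, e.g. by querying
	// information_schema.columns instead of the table itself.
	cmd := exec.Command("mysql",
		"-h", host, "-P", port, "-u", user, "-p"+password,
		"--batch", "-e", "SELECT * FROM "+table, db)
	raw, err := cmd.Output()
	if err != nil {
		return err
	}
	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()
	w := csv.NewWriter(out)
	defer w.Flush()
	for _, line := range strings.Split(strings.TrimRight(string(raw), "\n"), "\n") {
		if err := w.Write(strings.Split(line, "\t")); err != nil {
			return err
		}
	}
	return nil
}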

Load the data on Minio in the correct Bucket

The data must be loaded on the following bucket:

  • /benchmarks/benchmark_id/runs/INCREMENTAL_NUMBER

Each collector must store a zip of the collected information by using the following filename:

  • NameOfTheCollectorContainer_CollectorName (e.g., wfms-dbms_mysqldump)

    NOTE: for now let's assume that NameOfTheCollectorContainer is provided through an ENV variable and CollectorName is known by the collector

Currently we only test by executing a single run of a driver, hence INCREMENTAL_NUMBER can be assumed to be 1. In the future, the number generation must be handled somewhere, though maybe not in the collectors; the collectors will have to learn the INCREMENTAL_NUMBER in some way.
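
For illustration, a collector could assemble the object key as in the sketch below; BENCHFLOW_BENCHMARK_ID is an assumed variable name, while BENCHFLOW_CONTAINER_NAME mirrors the deployment template discussed later on this page:

package main

import (
	"fmt"
	"os"
)

func minioObjectKey() string {
	// Assumed variable name for the benchmark_id.
	benchmarkID := os.Getenv("BENCHFLOW_BENCHMARK_ID")
	// NameOfTheCollectorContainer, provided through an ENV variable as noted above.
	containerName := os.Getenv("BENCHFLOW_CONTAINER_NAME")
	collectorName := "mysqldump" // CollectorName is known by the collector itself
	runNumber := 1               // single run for now, so INCREMENTAL_NUMBER is 1

	// e.g. benchmarks/<benchmark_id>/runs/1/wfms-dbms_mysqldump.zip
	return fmt.Sprintf("benchmarks/%s/runs/%d/%s_%s.zip",
		benchmarkID, runNumber, containerName, collectorName)
}

func main() {
	fmt.Println(minioObjectKey())
}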

Make all collectors use gzip

For all collectors, when zipping a file, use gzip for compression.

By using gzip we remove the need for external zipping tools, as gzip is present on all Linux installations. In addition, gzip-compressed files can be used directly by Spark when creating a context, and even without Spark, Python ships a gzip package in its standard installation (meaning we also don't have to import extra Python modules for handling compressed data).

Gzip also offers a good compression ratio at very good speed, which makes it a good choice for the database dumps we are storing on Minio (CSV format).
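
For illustration, compressing a collected file requires only Go's standard compress/gzip package:

package collector

import (
	"compress/gzip"
	"io"
	"os"
)

// gzipFile compresses src into dst, e.g. dump.csv -> dump.csv.gz.
func gzipFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()

	// The gzip writer is closed before the file, flushing the footer.
	zw := gzip.NewWriter(out)
	defer zw.Close()

	_, err = io.Copy(zw, in)
	return err
}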

Enable Execution Logs Collection

We need a way to collect the execution logs of the collectors (for which we need to define the critical sections to log), in case something goes wrong. One proposal is to collect these logs in a file (directly or using a logs collector) and store them on Minio. This is useful for debugging purposes.

Requirements:

  • We should take care of the ephemeral execution of the collectors

Possible solution:

  • We might need a Gateway acting as a coordinator for the BenchFlow services, handling centralised log collection from them.

Consider applying the same approach to the monitors, so that we can enable their logging again.

Properly Comment the Code

Comment all the choices in the code for which a comment might help to understand the why. Although it is tricky to decide what to comment and what not to, a good rule of thumb is to comment all the code that, a couple of days after you wrote it, takes you more than 5 seconds to grasp. Algorithms must certainly be commented, with their most important steps documented.

See also #13

Make log collector an offline microservice

Make it so that the log collector contacts the Docker API and obtains the logs with a single request, rather than attaching to the container and continuously collecting the entries.

This way we don't need to attach to a container and potentially impact its performance, and log collecting becomes a single "collect" call, rather than starting and stopping like the stats. The Docker API returns the stdout and stderr of the container, which can be read and written down into a file.

Also provide the option to query the microservice for the logs starting from a given time. This is easy to implement, given that the Docker API for returning logs already provides a "since" option when queried.
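
A hedged sketch of such a one-shot collection using go-dockerclient's Logs call (collectLogs and its parameters are illustrative):

package collector

import (
	"os"
	"time"

	docker "github.com/fsouza/go-dockerclient"
)

// collectLogs fetches a container's logs with a single blocking request:
// no attach, no streaming. Follow is left false, so the call returns once
// the current logs have been written to the file.
func collectLogs(client *docker.Client, containerID string, since time.Time, outPath string) error {
	out, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer out.Close()

	return client.Logs(docker.LogsOptions{
		Container:    containerID,
		OutputStream: out,
		ErrorStream:  out,
		Stdout:       true,
		Stderr:       true,
		Timestamps:   true,
		Since:        since.Unix(), // the "since" option mentioned above
	})
}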

Clean and define a service deployment template relying on Docker Compose

@simonedavico define the deployment descriptor we should use to add a BenchFlow service (e.g., collectors or monitors) to BenchFlow.

Here are some discussed examples (where the lines marked with # are generated):

#The service name should be "benchflowServiceName_BoundServiceName"
mysql:
  image: 'benchflow/collectors:mysql_dev'
  # container_name: mysql_db_TRIAL_ID
  environment:
    - KAFKA_HOST=${BENCHFLOW_ENV_KAFKA_IP}
    - MINIO_ALIAS=benchflow
    - MINIO_HOST=http://${BENCHFLOW_ENV_MINIO_IP}:${BENCHFLOW_ENV_MINIO_PORT}
    - MINIO_ACCESSKEYID=${BENCHFLOW_ENV_MINIO_ACCESSKEYID}
    - MINIO_SECRETACCESSKEY=${BENCHFLOW_ENV_MINIO_SECRETACCESSKEY}

    # - BENCHFLOW_EXPERIMENT_ID=camunda
    # - BENCHFLOW_TRIAL_ID=camunda_1O
    # - BENCHFLOW_TRIAL_TOTAL_NUM=1
    - MYSQL_DB_NAME=${BENCHFLOW_BENCHMARK_CONFIG_MYSQL_DB_NAME}
    - TABLE_NAMES=${BENCHFLOW_BENCHMARK_CONFIG_TABLE_NAMES}

    # the IP can be the local IP
    - MYSQL_HOST=${BENCHFLOW_BENCHMARK_BOUNDSERVICE_IP}
    - MYSQL_PORT=${BENCHFLOW_BENCHMARK_BOUNDSERVICE_PORT}
    - MYSQL_USER=${BENCHFLOW_BENCHMARK_CONFIG_MYSQL_USER}
    - MYSQL_USER_PASSWORD=${BENCHFLOW_BENCHMARK_CONFIG_MYSQL_USER_PASSWORD}

    # - BENCHFLOW_CONTAINER_NAME=mysql_db_TRIAL_ID
    - BENCHFLOW_COLLECTOR_NAME=mysql
    - BENCHFLOW_DATA_NAME=mysql

    # - "constraint:node==bull"
  expose:
    - 8080
  ports:
    - '8080' #192.168.41.128::8080
#The service name should be "benchflowServiceName_BoundServiceName"
stats:
  image: 'benchflow/collectors:stats_dev'
  # container_name: stats_camunda_TRIAL_ID
  environment:
    - KAFKA_HOST=${BENCHFLOW_ENV_KAFKA_IP}
    - MINIO_ALIAS=benchflow
    - MINIO_HOST=http://${BENCHFLOW_ENV_MINIO_IP}:${BENCHFLOW_ENV_MINIO_PORT}
    - MINIO_ACCESSKEYID=${BENCHFLOW_ENV_MINIO_ACCESSKEYID}
    - MINIO_SECRETACCESSKEY=${BENCHFLOW_ENV_MINIO_SECRETACCESSKEY}

    # - BENCHFLOW_EXPERIMENT_ID=camunda
    # - BENCHFLOW_TRIAL_ID=camunda_1O
    # - BENCHFLOW_TRIAL_TOTAL_NUM=1
    - CONTAINERS=${BENCHFLOW_BENCHMARK_BOUNDSERVICE_CONTAINER_NAME}

    # - BENCHFLOW_CONTAINER_NAME=stats_camunda_TRIAL_ID
    - BENCHFLOW_COLLECTOR_NAME=stats
    - BENCHFLOW_DATA_NAME=stats

    # - "constraint:node==lisa1"

  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
  expose:
    - 8080
  ports:
    - '8080' #192.168.41.105::8080

Define and Uniform the APIs of the Collectors

We need to define a common REST API to interact with collectors. This should be differentiated between offline and online collectors.

Current state:

offline:

  • zip defines a /data API to collect and store the data
  • logs defines a store API to collect and store the data
  • dump defines a data API to collect and store the data

online:

  • stats defines two APIs to start and stop the collection, where the stop API also stores the data

Solve stats "Done" channel error

In the stats collector, we are using this golang API for connecting to Docker and retrieving the stats: https://github.com/fsouza/go-dockerclient

In the code, we use this function to retrieve the stats: https://godoc.org/github.com/fsouza/go-dockerclient#Client.Stats

The function is blocking, meaning that once started, the goroutine can't be exited unless the function is stopped. It's possible to stop the function by signalling on the Done channel, which can be passed to the function inside the StatsOptions structure. Theoretically, sending a boolean to the channel should interrupt the function; however, when attempting to do so, an error is returned stating that the channel is closed. It's possible this is an issue with the API itself.
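
A minimal repro sketch of the behaviour described above ("some-container" is an illustrative container name):

package main

import (
	"fmt"
	"log"
	"time"

	docker "github.com/fsouza/go-dockerclient"
)

func main() {
	client, err := docker.NewClient("unix:///var/run/docker.sock")
	if err != nil {
		log.Fatal(err)
	}

	statsC := make(chan *docker.Stats)
	done := make(chan bool)

	go func() {
		// Stats blocks until the Done channel is signalled or an error occurs.
		err := client.Stats(docker.StatsOptions{
			ID:     "some-container",
			Stats:  statsC,
			Stream: true,
			Done:   done,
		})
		log.Println("Stats returned:", err)
	}()

	for i := 0; i < 5; i++ {
		s := <-statsC
		if s == nil {
			break // the library closes the channel when it is done
		}
		fmt.Println("total_usage:", s.CPUStats.CPUUsage.TotalUsage)
	}

	// This send is where the "channel is closed" error was observed;
	// closing the channel instead (close(done)) is the variant to test.
	done <- true
	time.Sleep(time.Second) // give the goroutine time to report the outcome
}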

What needs to be done:

  • Debug the program step by step, and try writing a smaller, straightforward piece of code to test the interruption of the function directly.
  • If the issue is in the API, open an issue on the GitHub of the API itself to report this problem.
  • Wait for the answer and fix the issue
  • Check whether the same approach used for the stopChannel also works for the doneChannel; it depends on how the library handles this channel. So: one channel for all the goroutines, which gets closed when done.

Related issues: #1, #4, benchflow/monitors#1

Add a collector that collects and stores the Container Stats from the Docker Stats API

The required functionalities are:

  1. it must work from inside a container;
  2. it must collect the Stats of a list of containers identified by name and provided through an Environment variable. The containers' names are separated by ":";
  3. it must store the stats on a tmp file local to the container;
  4. it must define APIs to decide when to start and stop the data collection;
  5. as for the other collectors: it must define APIs to zip the data and store them on a remote S3 compatible datastore;
  6. it must work with the least possible impact on the monitored containers' performance.

Notes about CPU usage:

  • Compute the percentage usage relative to the number of cores assigned to a container, not according to the host
  • cpushares: can be a relative weight to other containers
  • total_usage: a plain CPU percentage is not feasible because Docker enables many options to share the CPU with other containers. We use the total_usage instead (see the sketch after this list).
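
A hedged sketch of that computation from two consecutive samples; the assignedCores parameter is an assumption about how the container's CPU allocation would be obtained:

package collector

import docker "github.com/fsouza/go-dockerclient"

// cpuPercent derives a CPU percentage from two consecutive stats samples,
// normalised by the cores assigned to the container rather than the host's.
func cpuPercent(prev, curr *docker.Stats, assignedCores float64) float64 {
	cpuDelta := float64(curr.CPUStats.CPUUsage.TotalUsage) - float64(prev.CPUStats.CPUUsage.TotalUsage)
	systemDelta := float64(curr.CPUStats.SystemCPUUsage) - float64(prev.CPUStats.SystemCPUUsage)
	if systemDelta <= 0 || cpuDelta < 0 || assignedCores <= 0 {
		return 0
	}
	// The total_usage and system_cpu_usage deltas give the fraction of the
	// whole host; rescale it to the cores the container may actually use.
	hostCores := float64(len(curr.CPUStats.CPUUsage.PercpuUsage))
	return (cpuDelta / systemDelta) * hostCores * 100 / assignedCores
}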

Some useful references:

  1. Powerful go-dockerclient and Stats API: https://godoc.org/github.com/fsouza/go-dockerclient#Client.Stats
  2. A test case that shows how to use the API with the client at point 1: https://github.com/fsouza/go-dockerclient/blob/34eaaf52874d8ce5d57be011a4852eb83d950125/container_test.go#L1630
  3. Docker Stats APIs: https://docs.docker.com/reference/api/docker_remote_api_v1.20/#get-container-stats-based-on-resource-usage

Develop a solution for Minio key hashing

In the format we use for the Minio keys, we append a hash to the key to speed up lookups when accessing Minio. We need a way to generate the hash for a given key that can be accessed regardless of the implementation language of our components.

The convenient solution is to develop an additional golang microservice we can query to generate the hash of a given key. This way we have a single microservice handling it, meaning we won't need to implement hashing in other languages, and changing the hash function can be done once, in a single location.

In addition, we need to select a hash function to use. Some good ideas to consider here: http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html
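
A hedged sketch of such a microservice; the /hash route, the key query parameter, and the choice of an MD5-based 4-character prefix are all illustrative assumptions:

package main

import (
	"crypto/md5"
	"fmt"
	"net/http"
)

// hashHandler returns a short hash for the key passed as a query parameter.
func hashHandler(w http.ResponseWriter, r *http.Request) {
	key := r.URL.Query().Get("key")
	if key == "" {
		http.Error(w, "missing key parameter", http.StatusBadRequest)
		return
	}
	sum := md5.Sum([]byte(key))
	// A short prefix is enough to spread keys, in the spirit of the
	// S3 request-rate considerations linked above.
	fmt.Fprintf(w, "%x", sum[:2])
}

func main() {
	http.HandleFunc("/hash", hashHandler)
	http.ListenAndServe(":8080", nil)
}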

Add a collector that collects the log of a specified list of Docker containers

The collector is similar to the one that collects Docker Stats (#1), in the sense that it offers similar functionalities but on a different API.

The required functionalities are:

  1. it must work from inside a container;
  2. it must collect the Logs of a list of containers identified by name and provided through an Environment variable. The containers' names are separated by ":";
  3. it must store the logs on a tmp file local to the container;
  4. it must enable the timestamps option of the Docker Logs API;
  5. it must start the collection immediately after its start;
  6. as for the other collectors: it must define APIs to zip the data and store them on a remote S3 compatible datastore;
  7. it must work with the least possible impact on the monitored containers' performance.

Some useful references:

  1. Powerful go-dockerclient and Logs API: https://godoc.org/github.com/fsouza/go-dockerclient#Client.Logs
  2. A test case that shows how to use the API with the client at point 1: https://github.com/fsouza/go-dockerclient/blob/34eaaf52874d8ce5d57be011a4852eb83d950125/container_test.go#L1153
  3. Docker Logs APIs: https://docs.docker.com/reference/api/docker_remote_api_v1.20/#get-container-logs

Collect Network Usage for Containers using --net="host"

The Docker Stats API always returns zero for the network stats when you set --net="host". We should investigate a way to collect network statistics for containers using --net="host", probably by developing a dedicated container.

Some hints can be found on the following link: https://docs.docker.com/engine/articles/runmetrics/#network-metrics

The related Docker API issues are on the following page: https://github.com/docker/docker/labels/area%2Fapi

Collectors must Respond with JSON

Apply the same improvement made for the monitors, so that all the responses to clients are structured JSON objects, where the structure should be placed in the collectors' commons package. The structure can simply be:

{
  "status": "SUCCESS" or "FAILED",
  "message": "..."
}

Of course the same information should be printed on the standard output of the collectors. This applies to all the collectors, because everything must be logged and the client should be aware of what happens in the collectors.

Start from the changes made in #89 and improve the error handling, the structure of the returned message, the use of HTTP errors in case of internal errors, and so on.
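
A sketch of how this could look in the commons package (Response and Respond are illustrative names):

package commons

import (
	"encoding/json"
	"log"
	"net/http"
)

// Response is the JSON body returned by every collector endpoint.
type Response struct {
	Status  string `json:"status"` // "SUCCESS" or "FAILED"
	Message string `json:"message"`
}

// Respond writes the response as JSON, mirrors it on standard output,
// and sets an HTTP error code in case of internal errors.
func Respond(w http.ResponseWriter, code int, status, message string) {
	log.Printf("%s: %s", status, message) // everything must be logged
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	json.NewEncoder(w).Encode(Response{Status: status, Message: message})
}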

Improvements to the Stats Collector when using --net="host" to be Evaluated

Identify the case in which a container uses net="container" and the referenced container uses net="host", and use nethogs in this case too.

We now assume that the devices to monitor are all the ones available to the container, which should be all the interfaces available to the host, since we don't limit them. Example of command: nethogs -d 1 docker0 eth0 lo.

Investigate whether it is possible to monitor only the interfaces used by the monitored containers, to reduce the collected data and the load. For now we collect on all of them, so we get all the data.

Obtain container ID from stats collector and send it via Kafka

Our current implementation of stats collecting doesn't retrieve or make any use of the container's actual ID when collecting the stats, sending them to Minio, and signalling their presence to Kafka. This is an important piece of information, as it is required when the Spark Tasks Sender eventually executes the stats transformer and stores the collected data for the specific container ID.

Modify the current implementation so it:

  • Retrieves the linked containers' IDs
  • Uses the container ID to name files on Minio instead of the container name
  • Sends the container IDs over Kafka, altering the Minio key into a comma-separated list of keys, one key per container's collected stats

To identify the container for which we collected environment statistics, we currently rely on the file name of the statistics stored on Minio, which has the container_id as part of the name. We then "uniquely" identify a container's stats by grouping the statistics per experiment_id, trial_id and container_id. There is a (very) unlikely chance that we end up with the same container_id for two different containers that are part of a trial. In that case we must use our internally generated container_properties_id, once we also collect the container properties with (probably) a dedicated collector, which should store these data before any transformer needing them can start.

We should investigate how to guarantee the uniqueness of this association. A possibility is to store the container_id paired ("_" separated) with a hash obtained by combining: the container name (which we ensure is generated unique), the host's MAC address, the container_id itself, the experiment_id and the trial_id.
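
A hedged sketch of that pairing (SHA-1 is an illustrative choice of hash function):

package collector

import (
	"crypto/sha1"
	"fmt"
)

// uniqueContainerKey pairs the container_id with a hash of the combined
// identifiers proposed above, "_" separated.
func uniqueContainerKey(containerName, hostMAC, containerID, experimentID, trialID string) string {
	h := sha1.Sum([]byte(containerName + hostMAC + containerID + experimentID + trialID))
	return fmt.Sprintf("%s_%x", containerID, h)
}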

Impacted functionalities:

  • Stats Collector
  • Spark-Tasks-Sender
  • Stats Transformer
  • Cassandra schema
  • Stats Analysers
