Continuous Integration Build Optimization

This repository aims to optimize continuous integration (CI) builds. We concentrate on code coverage files that are generated during a CI build and then deleted without being saved anywhere. For the time being, we focus solely on open-source Java Maven projects and GitHub Actions.

How To Run

Running the scripts is simple: each script creates the input that the next one reads, so there is no need to reformat anything in between. The project contains three folders: Data Collector, Data and Job Analyzer.

Data Collector

Scripts are run in this order: repository_collector, file_collector, content_collector.

repository_collector:

Finds the ["Name", "Link", "Default Branch", "SHA", "Stargazers Count", "Forks Count", "Date"] of each repository and saves them to the repositories.csv file.
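As a rough illustration of the output format, the sketch below writes rows with the seven columns listed above to repositories.csv. The write_repositories helper and its dict-based input are hypothetical; how the actual script queries GitHub and fills each field is not shown here.

```python
import csv

# Column names taken from the description above.
COLUMNS = ["Name", "Link", "Default Branch", "SHA",
           "Stargazers Count", "Forks Count", "Date"]


def write_repositories(rows, path="repositories.csv"):
    """Write a header line followed by one CSV row per repository.

    `rows` is an iterable of dicts keyed by COLUMNS (a hypothetical input
    shape; the real collector fills these values from the GitHub API).
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
```

Keeping the header in the file lets the next script look columns up by name instead of by position.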

file_collector:

Reads the repositories.csv file and finds the files, along with their paths, that are related to ["Maven", "Gradle", "Travis CI", "Github Actions"]. It then saves this information to the file_paths.csv file.
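The classification step could look roughly like the sketch below, which maps a file path to one of the four categories above. The classify_path function and the exact file-name rules (for example, also accepting build.gradle.kts and .yaml) are assumptions made for illustration, not the project's confirmed logic.

```python
from pathlib import PurePosixPath


def classify_path(path):
    """Return the build/CI category for a repository file path, or None."""
    p = PurePosixPath(path)
    if p.name == "pom.xml":
        return "Maven"
    if p.name in ("build.gradle", "build.gradle.kts"):
        return "Gradle"
    # Travis CI reads its configuration from the repository root.
    if p.parts == (".travis.yml",):
        return "Travis CI"
    # GitHub Actions workflows live under .github/workflows/.
    if p.parts[:2] == (".github", "workflows") and p.suffix in (".yml", ".yaml"):
        return "Github Actions"
    return None
```

For example, classify_path("core/pom.xml") returns "Maven", while a path that matches none of the rules returns None and can be skipped.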

content_collector:

Reads the file_paths.csv file and searches the listed files for keywords related to Jacoco, Cobertura or Javadoc. If the pom.xml or build.gradle file declares a dependency on one of these plugins, the script saves the file path under the matching column, e.g. Maven Jacoco: pom.xml. If a yml file contains a keyword for one of these plugins, such as "jacoco", the script saves the file path under the column for the corresponding CI tool and plugin name, e.g. GA (GitHub Actions) Jacoco: .github/workflows/main.yml. It also records whether these yml files potentially use a platform for uploading code coverage results (since our first aim was to find unnecessary code coverage reports) by looking for keywords, e.g. GA Coveralls: .github/workflows/main.yml.
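A minimal sketch of this keyword search, assuming a plain case-insensitive substring match, is shown below. The function names and the way column names are built are hypothetical; the actual script may match dependencies more precisely.

```python
PLUGIN_KEYWORDS = ("jacoco", "cobertura", "javadoc")


def find_keywords(text, keywords=PLUGIN_KEYWORDS):
    """Return the keywords that occur in text, ignoring case."""
    lowered = text.lower()
    return [k for k in keywords if k in lowered]


def record_hits(tool, path, text, row=None, keywords=PLUGIN_KEYWORDS):
    """Store `path` under a '<tool> <Keyword>' column for every keyword found.

    `tool` is a column prefix such as "Maven" or "GA"; `row` is the dict
    that collects all columns for one repository.
    """
    row = {} if row is None else row
    for keyword in find_keywords(text, keywords):
        row[f"{tool} {keyword.capitalize()}"] = path
    return row
```

The same helper covers upload platforms by passing a different keyword tuple, e.g. record_hits("GA", ".github/workflows/main.yml", text, row, keywords=("coveralls",)) fills a GA Coveralls column.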

We split the collection into three scripts because errors sometimes occurred and aborted the collection. Whenever that happened we had to run the script again. However, GitHub limits the number of API requests, and restarting from the very beginning (from the collection of repositories) would repeat requests unnecessarily and waste that quota.

Data

This folder holds the files created by the data collector. The filtered_repositories.csv file contains the repositories that you want to pass to the job analyzer. To add a repository, copy its row from the file_contents.csv file and paste it into filtered_repositories.csv.
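The copy step is manual, but it could be scripted along the lines of the sketch below. It assumes that file_contents.csv has a header line with a "Name" column, which is not confirmed here; copy_repository_row is a hypothetical helper.

```python
import csv
import os


def copy_repository_row(name, source="file_contents.csv",
                        target="filtered_repositories.csv"):
    """Append the row whose "Name" equals `name` to the target CSV.

    Returns True if a matching row was found and copied, else False.
    Assumes the source file has a header line with a "Name" column.
    """
    with open(source, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        match = next((row for row in reader if row.get("Name") == name), None)
    if match is None:
        return False

    # Write the header only when the target file is new or empty.
    needs_header = not os.path.exists(target) or os.path.getsize(target) == 0
    with open(target, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if needs_header:
            writer.writeheader()
        writer.writerow(match)
    return True
```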

Job Analyzer

This script takes the repository information, fetches the contents of the yml files and configures them. After configuring, it pushes the changes to the forked repository, which automatically triggers GitHub Actions to start a build with the configured yml files. During the build, the generated files are monitored and analyzed, and the results are pushed to the optimizing-ci-builds/ci-analyzes repository.

The main.py script consists of four phases and is designed to automate the entire procedure.

Phases

Phase 1: Collection

We fork the repository and add the necessary GitHub Environment secrets to the fork. (This part is done once and is skipped unless there is a new repository or a change in the added secrets.) After that, we collect the contents of the yml files.

Phase 2: Configuring The Yaml Files

In the second phase, the script applies a hard-coded configuration to the files. It adds steps to the yaml files that set up inotifywait, run a Python script to analyze the inotifywait logs and, lastly, push the results to another repository.
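To make the idea concrete, here is a minimal sketch that injects monitoring steps into an already parsed workflow (a plain dict, as a YAML parser would produce). The step contents, the log path /tmp/inotify.log and the script name analyze_log.py are all hypothetical; the project's real hard-coded steps are not reproduced here, and the final step that pushes results is left out because it depends on the repository secrets.

```python
import copy

# Hypothetical steps, for illustration only.
SETUP_STEPS = [
    {"name": "Install inotify-tools",
     "run": "sudo apt-get update && sudo apt-get install -y inotify-tools"},
    # -d: run as a daemon, -r: watch recursively, -o: log file.
    {"name": "Start inotifywait",
     "run": "inotifywait -d -r -o /tmp/inotify.log --format '%w%f;%e' ."},
]
FINAL_STEPS = [
    {"name": "Analyze inotifywait log",
     "run": "python3 analyze_log.py /tmp/inotify.log"},  # hypothetical script
]


def _index_after_checkout(steps):
    """Return the position right after the first actions/checkout step, or 0."""
    for i, step in enumerate(steps):
        if str(step.get("uses", "")).startswith("actions/checkout"):
            return i + 1
    return 0


def add_monitoring_steps(workflow):
    """Return a copy of the workflow with monitoring steps added to each job."""
    configured = copy.deepcopy(workflow)
    for job in configured.get("jobs", {}).values():
        steps = job.get("steps")
        if steps is None:
            continue  # jobs without a steps list are left untouched
        i = _index_after_checkout(steps)
        job["steps"] = (steps[:i] + copy.deepcopy(SETUP_STEPS)
                        + steps[i:] + copy.deepcopy(FINAL_STEPS))
    return configured
```

Placing the setup steps after the checkout means the watcher sees the files produced by the build rather than the checkout itself; this placement is a design choice of the sketch.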

Phase 3: Pushing the Changes

After configuring the files, we push them to our forked version of the corresponding repositories.

Phase 4: Analysis

The analysis is done by the CI builds themselves, using the Python script we added to the yml files, and the results are pushed to the ci-analyzes repository.
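The kind of analysis involved can be sketched as follows: given inotifywait log lines, report files that were written during the build but never read afterwards. The log line format 'path;EVENT[,EVENT...]', the choice of events and the find_unused_files function are assumptions made for illustration; the project's actual analysis script may work differently.

```python
# inotify event names treated as writes and reads in this sketch.
WRITE_EVENTS = {"CREATE", "MODIFY", "CLOSE_WRITE"}
READ_EVENTS = {"ACCESS"}


def find_unused_files(log_lines):
    """Return sorted paths that were written but not read after the last write.

    Each line is assumed to look like 'path;EVENT[,EVENT...]'.
    """
    last_write = {}  # path -> index of the last write-like event
    last_read = {}   # path -> index of the last read event
    for index, line in enumerate(log_lines):
        line = line.strip()
        if ";" not in line:
            continue  # skip blank or malformed lines
        path, _, events = line.rpartition(";")
        names = set(events.split(","))
        if "ISDIR" in names:
            continue  # only files are of interest
        if names & WRITE_EVENTS:
            last_write[path] = index
        if names & READ_EVENTS:
            last_read[path] = index
    return sorted(p for p, w in last_write.items()
                  if last_read.get(p, -1) < w)
```

A coverage report that is created, modified and later deleted without any ACCESS event in between would show up in the returned list.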
