llvm-premerge-checks's Introduction

This repo holds the VM configurations for the machine cluster and the scripts to run pre-merge tests triggered by http://reviews.llvm.org.

As the LLVM project has moved to Pull Requests and Phabricator will no longer trigger builds, this repository will likely be retired.

Pull request migration schedule.

Overview

Presentation by Louis Dionne at the LLVM devmtg 2021: https://youtu.be/B7gB6van7Bw

The pre-merge checks for the LLVM project are a continuous integration (CI) workflow. The workflow checks the patches the developers upload to the LLVM Phabricator instance.

Phabricator (https://reviews.llvm.org) is the code review tool in the LLVM project.

The workflow checks the patches before a user merges them to the main branch, hence the term pre-merge testing. When a user uploads a patch to the LLVM Phabricator, Phabricator triggers the checks and then displays the results.

This way bugs in a patch are contained during the code review stage and do not pollute the main branch. The more bugs the CI system can catch during the code review phase, the more stable the main branch will become.

This repository contains the configurations and scripts to run pre-merge checks for the LLVM project.

Feedback

If you notice issues or have an idea on how to improve pre-merge checks, please create a new issue or give a ❤️ to an existing one.

Sign up for beta-test

To get the latest features and help us develop the project, sign up for the pre-merge beta testing by adding yourself to the "pre-merge beta testing" project on Phabricator.

Opt-out

If you want to opt out of pre-merge testing entirely, add yourself to the OPT OUT project.

If you decide to opt out, please let us know why, so that we can improve in the future.

Requirements

The builds are only triggered if the Revision in Phabricator is created/updated via arc diff. If you update a Revision via the Web UI it will not trigger a build.

To get a patch on Phabricator tested the build server must be able to apply the patch to the checked out git repository. If you want to get your patch tested, please make sure that either:

  • You set a git hash as sourceControlBaseRevision in Phabricator which is available on the Github repository,
  • or you define the dependencies of your patch in Phabricator,
  • or your patch can be applied to the main branch.

Only then can the build server apply the patch locally and run the builds and tests.
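
The last condition can be checked locally before uploading. The following is a minimal sketch, not the build server's actual logic: it relies on the real `git apply --check` command, while the function name `patch_applies` and the injectable `run` parameter are assumptions made for illustration.

```python
import subprocess


def patch_applies(patch_path, repo_dir=".", run=subprocess.run):
    """Return True if `git apply --check` accepts the patch in repo_dir.

    `run` is injectable so the check can be exercised without a real
    repository; by default it is subprocess.run.
    """
    result = run(
        ["git", "apply", "--check", patch_path],
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    # git exits with 0 when the patch would apply cleanly.
    return result.returncode == 0
```

Calling `patch_applies("my.patch", "/path/to/llvm-project")` on a checkout of the main branch gives a quick indication of whether the third condition holds.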

Accessing results on Phabricator

Phabricator will automatically trigger a build for every new patch you upload or modify. Phabricator shows the build results at the top of the entry.

The CI compiles the code, runs the tests, and runs clang-format and clang-tidy on the changed lines.

If a unit test failed, this is shown below the build status. You can also expand the unit test to see the details.

Contributing

We're happy to get help on improving the infrastructure and workflows!

Please check contributing first.

Development gives an overview of how the different parts interact.

Playbooks shows concrete examples of how to, for example, build and run agents locally.

If you have any questions, please contact us by mail or find user "goncharov" on LLVM Discord.

Additional Information

License

This project is licensed under the "Apache 2.0 with LLVM Exception" license. See LICENSE for details.

llvm-premerge-checks's People

Contributors

abrachet, aeubanks, alphahot, christiankuehnel, chsigg, danilaml, dependabot[bot], eccothedolphin, fooishbar, gchatelet, gmngeoffrey, jankratochvil, joker-eph, kcc, ldionne, metaflow, mforster, mizvekov, movsic, mstorsjo, mtrofin, nikic, rnk, tlively, wanders, yxsamliu

llvm-premerge-checks's Issues

Provide enough disk space to agents

Some data:

  • The ccache can take up to 20 GB and is shared between jobs on the same machine.
  • The job workspace for llvm is about 46 GB.
  • GCP provides 375GB of storage per SSD.
  • So we can only keep a few workspaces per pod before running out of disc space.
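
The numbers above give a rough upper bound. This is a back-of-the-envelope sketch that assumes a single SSD per machine, one shared ccache at its maximum size, and no other disk usage; the function name `max_workspaces` is illustrative only.

```python
SSD_GB = 375        # storage per SSD on GCP (from the data above)
CCACHE_GB = 20      # upper bound for the shared ccache
WORKSPACE_GB = 46   # approximate size of one llvm job workspace


def max_workspaces(ssd_gb=SSD_GB, ccache_gb=CCACHE_GB, workspace_gb=WORKSPACE_GB):
    """Number of whole workspaces that fit next to one shared ccache."""
    return (ssd_gb - ccache_gb) // workspace_gb


print(max_workspaces())  # 7
```

With these figures, 355 GB remain after the ccache, which is enough for 7 workspaces of 46 GB each.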

acceptance criteria:

  • The agents are configured to have enough storage to run the assigned number of jobs.
  • We can easily scale up the disc space, when adding more jobs to an agent.

Run tests on MacOS

At the moment all our resources are on Google Cloud and that does not offer MacOS machines. So we need someone else to host and pay for those.

Manage storage properly

acceptance criteria:

  • ccache is in a volume and configured per machine
  • agent build dir is on a volume with enough storage

Auto-create a bug for failing tests/builds

acceptance criteria

  • If someone pushed a commit to master that breaks a test, a bug on buganizer is created automatically.
  • The bug is assigned to the person pushing the change.

make Jenkins log accessible

acceptance criteria:

  • the full Jenkins build log is available on the result server

idea

  • each build triggers another build that gets the results from the server (file system or web UI) and copies them to the result storage
  • look into Pipelines for this

Gather user feedback

Send out questions to all known beta testers and ask for feedback and a recommendation on turning it on for all users.

make sure dependencies between patches on Phabricator work

When applying patches with parent diffs set, something seems to go wrong repeatedly. Somehow arc patch is not able to apply these. I'm not really sure what the problem is or how we can work around it.

Maybe this is related to arc applying patches that were already merged. So maybe we need to manually iterate over the parents and check in the git log, if they were applied already...
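
The idea of checking the git log for already-applied parents could look roughly like the sketch below. It assumes that landed commits carry the `Differential Revision: https://reviews.llvm.org/D<id>` trailer in their message and that the log text comes from something like `git log`; the function names are hypothetical and this is not what `arc patch` itself does.

```python
import re

# Trailer found in commit messages of changes landed from Phabricator, e.g.
# "Differential Revision: https://reviews.llvm.org/D12345"
_TRAILER = re.compile(r"Differential Revision:\s*https://reviews\.llvm\.org/(D\d+)")


def landed_revisions(git_log_text):
    """Return the set of revision IDs (e.g. 'D12345') mentioned in the log."""
    return set(_TRAILER.findall(git_log_text))


def parents_to_apply(parent_ids, git_log_text):
    """Keep only the parents not yet found in the log, preserving order."""
    landed = landed_revisions(git_log_text)
    return [rev for rev in parent_ids if rev not in landed]
```

Only the revisions returned by `parents_to_apply` would then need to be patched in before the revision under test.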

Mark unrelated problems

acceptance criteria:

If a test on the parent revision of the patch fails, do not complain in the pre-merge tests based on that revision.

Ideas

  • Build every revision on master and remember which tests failed.
  • When running the tests on a patch, check which tests already failed on parent revision.
  • In the user feedback differentiate between failures on the parent and the patch.
  • Do not fail the build if all failing tests are already broken on the parent.
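
The ideas above amount to a set difference between the failures on the patch and the failures on its parent. A minimal sketch, assuming both runs yield a list of failing test names (the function names are illustrative):

```python
def classify_failures(parent_failures, patch_failures):
    """Split the failures seen on a patch into new and pre-existing ones."""
    parent = set(parent_failures)
    patch = set(patch_failures)
    return {
        "new": sorted(patch - parent),          # introduced by the patch
        "preexisting": sorted(patch & parent),  # already broken on the parent
    }


def build_should_fail(parent_failures, patch_failures):
    """Fail only if the patch introduces a failure not seen on the parent."""
    return bool(classify_failures(parent_failures, patch_failures)["new"])
```

The two groups returned by `classify_failures` can be reported separately in the user feedback, while `build_should_fail` decides the overall status.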

Build and test on Windows

acceptance criteria

  • Jenkins can build and test LLVM on a Windows machine
  • The Windows machine is running on GCP.
  • get failing tests fixed: https://bugs.llvm.org/show_bug.cgi?id=44151
  • automatically trigger Windows builds in Phabricator diffs
  • also show build results for Windows builds to users

documentation

Investigate llvm/test/tools/llvm-ar/mri-utf8.test

The premerge check (always) claims this test is failing.
However it passes locally for me, and AFAICS on all the buildbots.
So there may be something odd about the configuration of the premerge check workers (or maybe this is a real test failure in a valid though untested configuration and the test/code needs to be fixed)

Set up monitoring and central logging

acceptance criteria

  • measure build times (from submitting a patch to phabricator until results are in)
  • measure CPU, RAM and disc usage on all machines
  • measure number of rollbacks/day

extend documentation

acceptance criteria

  • document the vision
  • document the current solution and UI
  • move installation instructions to different file
  • document the limitations

add checks for clang-tidy and clang-format

acceptance criteria:

  • all patches are checked with clang-tidy
  • all patches are checked with clang-format
  • the results are reported to the Phabricator page
  • the checks only consider the modified lines (using clang-format-diff.py / clang-tidy-diff.py)
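
Restricting the checks to modified lines starts with extracting line ranges from the hunk headers of a unified diff. The sketch below illustrates that step only; it is a simplified illustration and not the code of clang-format-diff.py or clang-tidy-diff.py. It assumes a diff produced with zero context lines (for example `git diff -U0`), so that each hunk covers only changed lines, and it does not handle added lines whose content itself starts with `++ `.

```python
import re

# Matches hunk headers such as "@@ -10,0 +11,2 @@" and captures the
# start line and the optional line count on the "+" (new file) side.
_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_line_ranges(diff_text):
    """Map each file in a unified diff to (first, last) line ranges in the new file."""
    ranges = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].split("\t")[0]
            if path == "/dev/null":  # the file was deleted
                current = None
            else:
                # Strip the "b/" prefix that git adds to the new path.
                current = path[2:] if path.startswith("b/") else path
            continue
        if current is None:
            continue
        match = _HUNK.match(line)
        if match is None:
            continue
        start = int(match.group(1))
        count = int(match.group(2)) if match.group(2) is not None else 1
        if count > 0:  # count == 0 means the hunk only deletes lines
            ranges.setdefault(current, []).append((start, start + count - 1))
    return ranges
```

Each `(first, last)` pair could then be passed to a formatter or linter that accepts line ranges.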

Understand performance issue with LIT

background

  • When running the LLVM test suite on a workstation, it takes ~ 70 sec.
  • When running it on the Kubernetes cluster on a 32 core machine...
    • ... via the Jenkins agent it takes 25 min.
    • ... via local login it takes 90 sec (which is what we expected).
  • The problem can be fixed by setting the open files limit with ulimit -n 1024 (instead of the current value of ~1,000,000) before running the test suite. Values between 512 and 8192 were also tested and resulted in the same execution time as 1024.
  • We have no clue why that solves the performance problem.
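
For reference, the workaround can also be applied from a Python wrapper instead of the shell. This is a sketch under assumptions: it uses the standard `resource` module (Unix only) to lower the soft open-files limit in the child process only, and the function name `run_with_open_files_limit` is hypothetical. It reproduces the workaround and does not explain the underlying cause.

```python
import resource
import subprocess


def run_with_open_files_limit(cmd, soft_limit=1024, **kwargs):
    """Run cmd with a lowered soft limit on open files (like `ulimit -n`)."""

    def lower_limit():
        # Runs in the child process just before exec; the parent is unaffected.
        _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        if hard != resource.RLIM_INFINITY:
            target = min(soft_limit, hard)  # the soft limit may not exceed the hard one
        else:
            target = soft_limit
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

    return subprocess.run(cmd, preexec_fn=lower_limit, **kwargs)
```

For example, `run_with_open_files_limit(["ninja", "check-all"])` would run the test suite with the limit set to 1024.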

acceptance criteria

  • We know why changing the ulimit impacts the LIT performance.
  • If it's a bug in LIT: either there is a bug report in LLVM or the bug in LIT is fixed.

setup staging for CI changes

acceptance criteria:

  • There is a way to test CI changes before rolling them out to users.
  • There is a workflow for pushing changes first to testing and then to production.

reproduce build locally

acceptance criteria

  • the containers are set up in a way so that users can run the tests locally
  • there is documentation on how to do this
  • We clarified if we can share the containers (e.g. Windows licensing)
  • If we can share the containers: users can access the containers from a public Docker repository

Jenkins agent failure should lead to restart

If a build or test fails due to, for example, a machine restart, Jenkins gives up and reports the build as failed.
It would be nice to restart the build automatically in such cases.

An initial search found the Naginator plugin, which should do the trick.

create test report

acceptance criteria:

  • results of the tests are written to a test report.
  • test report is available via the web interface.
  • the comment on Phabricator lists the failed tests
  • The test report is nicely readable by a human; maybe we need to post-process it to HTML somehow.
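
Listing the failed tests could be done by parsing a machine-readable report. The sketch below assumes the results are available as JUnit-style XML (lit can write such a file with `--xunit-xml-output`) and that failures appear as `<failure>` or `<error>` children of `<testcase>`; the function name `failed_tests` is illustrative.

```python
import xml.etree.ElementTree as ET


def failed_tests(junit_xml):
    """Return 'classname::name' for every testcase with a <failure> or <error>."""
    root = ET.fromstring(junit_xml)
    failed = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(f"{case.get('classname', '')}::{case.get('name', '')}")
    return failed
```

The resulting list could be used both for the comment on Phabricator and as input for a human-readable HTML report.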

This is also related to #14

announce beta test on mailing list

once we have a reasonable version:

  • announce a public beta test on the mailing list
  • add people to the herald config
  • point to documentation on github in email.

Authenticate with github account

acceptance criteria

  • all Jenkins users are authenticated with their github accounts
  • build results are still publicly accessible

rationale

  • LLVM contributors need a github account anyway.
  • keeping the credentials in files is annoying
  • we might want to give more people access eventually
  • 2-factor-authentication is cool

move build scripts to some repo, use pipelines

status quo:

  • the build steps are configured in a text box in Jenkins.
  • we do not have them in version control
  • you can't run them locally
  • the Jenkins jobs are configured in the UI

acceptance criteria:

  • build scripts are moved from the Jenkins text box to the LLVM repository
  • build scripts can be checked by build server before merging
  • build scripts can be code-reviewed before merging
  • Jenkins uses the "Pipeline" feature to configure jobs from a SCM
  • A build log is uploaded even when arc patch fails (solve #11)

documentation:

Improve feedback in Phabricator (even further)

acceptance criteria

  • the feedback in Phabricator looks something like this:
Ran `check-all`, 1 failures:
  LLVM.tools/llvm-ar::mr-utf8.test
Logs: [ninja log], [cmake log], [CMakeCache.txt]
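
A message in this shape could be produced with a few lines of code. This is only a sketch of the formatting step: the function name and parameters are hypothetical, and the log names are plain labels where a real implementation would presumably insert links.

```python
def format_feedback(target, failed, log_names):
    """Build the summary text: target, failure count, failed tests, log labels."""
    lines = [f"Ran `{target}`, {len(failed)} failures:"]
    lines += [f"  {name}" for name in failed]
    lines.append("Logs: " + ", ".join(f"[{name}]" for name in log_names))
    return "\n".join(lines)
```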

Sign up for public beta testing

If you are interested in participating in the beta tests for the pre-merge checks:
Leave a comment on this issue with your Phabricator user name. We will then add you.

For the Phabricator integration and bug reports, please see the user documentation.

Only build and test changed modules

Ideas

  • Right now we're building and testing several projects in LLVM for every change (see script for configured projects).
  • To speed things up it would be nice to only build and test the projects that are affected by a patch. This would also reduce the number of reported errors that are not related to the patch.
  • We could use the LLVM_ENABLE_PROJECTS option of CMake to select which projects to build and test.

Problems

  • We do not know the dependencies between files, projects and tests.
  • If we set up a dependency table, this needs to be maintained manually.

technical considerations

  • input options:
    • run git diff and git status --short in the script to get the new and changed files and folders
    • parse the patch from Phabricator for the changes. You can look at apply_patch.py for an example on how to get the diff from Phabricator.
  • output: print a string to stdout that can be used for LLVM_ENABLE_PROJECTS in CMake.
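
A first version of such a script could map changed paths to their top-level directory, as sketched below. This is an illustration under assumptions: `KNOWN_PROJECTS` is a hypothetical subset of project directories, the function names are made up, and dependencies between projects (the problem named above) are ignored, so a change under llvm/ selects nothing here and would need a dependency table. The output is a semicolon-separated list, the format CMake uses for list values such as LLVM's project-selection option `LLVM_ENABLE_PROJECTS`.

```python
# Hypothetical subset of top-level project directories in the LLVM monorepo.
KNOWN_PROJECTS = {"clang", "clang-tools-extra", "lld", "lldb", "mlir"}


def affected_projects(changed_paths, known=KNOWN_PROJECTS):
    """Return the sorted projects whose top-level directory contains a changed file."""
    projects = set()
    for path in changed_paths:
        top = path.strip().split("/", 1)[0]
        if top in known:
            projects.add(top)
    return sorted(projects)


def enable_projects_value(changed_paths):
    """Join the affected projects with ';' as CMake expects for list values."""
    return ";".join(affected_projects(changed_paths))
```

The list of changed paths could come from `git diff --name-only` or from the parsed Phabricator patch, and the script would print the result of `enable_projects_value` to stdout.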

Scale compute power

acceptance criteria

  • All new/changed patches are checked within 2h.
  • We benchmarked the tests on "C" type machines to see if that's faster.
  • We have benchmarks for 16, 32 and 64 cores for:
    • clean build
    • cached build
    • ninja check
    • ninja check-all

run sphinx

acceptance criteria

  • the documentation is checked by running sphinx

Print link to revision on Phabricator in build log

As a user I want to navigate easily from the build log to the revision in phabricator so that I can see what triggered the build.

acceptance criteria

  • a link to the revision in phabricator is shown in the build log
  • if possible: also add a link to the build job in jenkins to the change in phabricator

Auto-revert failing patches

acceptance criteria:

  • all patches that cause a failure (failed build or failed test) on master are reverted automatically
  • we have community buy-in for this feature.

Create separate GCP project

acceptance criteria

  • We have a new Cloud project with defined billing information.
  • The cluster is run from a new, separate "project" in GCP.
  • The create cluster script is cleaned up before setting up the cluster:
    • The "services" node pool is removed
    • The "default-pool" has only one n1-standard-4 machine.
    • The proxy, Jenkins, etc. use the "default-pool".
    • The "jenkins-agents" pool is unchanged (or scaled up if required).
