vaticle / dependencies
Bazel dependency declarations for build tools reused across @vaticle repositories (only for @vaticle)
License: Mozilla Public License 2.0
We have submitted a patch to gRPC (grpc/grpc-java#7771). We need to monitor when this PR is released and, after that, bump our dependencies to use the newest version of grpc-java.
We need to propagate the update of grpc to all our repositories.
Right now, it is possible for packages inside Grakn Core to depend on libraries declared in the artifacts.bzl of other repositories.
For example, take the following case, where Grakn Core also needs to load artifacts.bzl from Grabl Tracing:
load("//dependencies/maven:artifacts.bzl",
     graknlabs_grakn_core_artifacts = "artifacts")
load("@graknlabs_grabl_tracing//dependencies/maven:artifacts.bzl",
     graknlabs_grabl_tracing_artifacts = "artifacts")
load("@graknlabs_dependencies//library/maven:rules.bzl", "maven")
maven(
    graknlabs_grabl_tracing_artifacts +
    graknlabs_grakn_core_artifacts,
)
Nothing prevents Grakn Core from loading Maven packages declared in the artifacts.bzl coming from Grabl Tracing. This is not ideal, since Grakn Core should only be able to depend on Maven dependencies explicitly declared in its own artifacts.bzl.
Use unused_deps to scan (in the build CI job of every repo) whether any unused libraries are specified in deps.
We need to support remote profiling (JMX). The correct set of options for this to work is:
SERVER_JAVAOPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1089 -Dcom.sun.management.jmxremote.rmi.port=1089 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=<PUBLIC IP>"
The rule depends on certifi and will fail to execute if it's not installed. Therefore, we have to make sure to install it before calling the rule:
pip install certifi
export RELEASE_NOTES_TOKEN=$REPO_GITHUB_TOKEN
bazel run @graknlabs_build_tools//ci:release-notes -- client-java $(cat VERSION) ./RELEASE_TEMPLATE.md
The best solution is to bundle it so the user doesn't need to install it.
We are depending on an unuseddeps binary that is 1) old and 2) not coming from the mainline. As of 041308b, we host the binary ourselves in repo.vaticle.com: https://raw.githubusercontent.com/graknlabs/dependencies/041308b4f3a06958fb32fa58fb0bc7ed465d8f64/tool/unuseddeps/deps.bzl
This is not very maintainable, since we could easily lose the binary by accident without an easy way of rebuilding it: doing so would involve figuring out how to build or acquire the binary from some other source. This also makes upgrading hard.
Therefore, we should depend on a binary that is 1) recent and 2) from the mainline.
Rather than using curl, we should use Kotlin or Java's built-in HTTP library:
2cd7098#diff-c48ac7fec5f0945f1acf00b63746eed18fadf811a704574296862556b208e6eaR11-R28
artifacts.bzl declares which dependencies are allowed in a given repository, and has allowed us to manage dependencies and versioning effectively.
Some dependencies declared there are only used in building and testing (e.g., JUnit, Github library, Checkstyle, etc.) whereas others are used in production (i.e., actually included in the distribution).
We should split the declaration in artifacts.bzl accordingly:
- production dependencies (for dependencies used in the distribution)
- build dependencies (for dependencies used in build and test)
With this split, a library declared in production can be included in the distribution whereas a library declared in build cannot, and should only be used in assembly, deployment, or tests.
The workspace name for each dependency should also clearly indicate whether it is a production or a build dependency:
java_library(
    ...
    deps = [
        "@production//google_or_tools",
    ],
)
java_test(
    ...
    deps = [
        "@build//junit_junit",
    ],
)
We should also create a way to verify that no build dependency is used in production:
bazel run //:dependency-test
ERROR: target //server:server depends on a build dependency '@build//junit_junit'
Being able to separate production and build dependencies will help us manage the dependency better - we can be strict in allowing what goes in production, while being more lenient for dependencies in build.
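As a rough illustration of the check described above, here is a minimal Python sketch. It assumes the proposed `@build//` workspace name, and that the transitive dependencies of a target are obtained with `bazel query 'deps(...)'`. The function names (`query_deps`, `find_build_deps`, `check_target`) are hypothetical and not part of any existing tool.

```python
import subprocess

# Assumption: build-only dependencies live under the proposed "@build" workspace.
BUILD_REPO_PREFIX = "@build//"


def query_deps(target):
    """Return the transitive dependency labels of `target` via `bazel query`."""
    output = subprocess.check_output(
        ["bazel", "query", "deps({})".format(target)],
        universal_newlines=True,
    )
    return [line.strip() for line in output.splitlines() if line.strip()]


def find_build_deps(labels):
    """Return, sorted, the labels that belong to the build-only workspace."""
    return sorted(label for label in labels if label.startswith(BUILD_REPO_PREFIX))


def check_target(target, labels):
    """Return one error message per build dependency that `target` depends on."""
    return [
        "ERROR: target {} depends on a build dependency '{}'".format(target, dep)
        for dep in find_build_deps(labels)
    ]
```

A `//:dependency-test` target could call `check_target(target, query_deps(target))` for each production target and exit non-zero if any message is returned.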
This issue is comprised of two issues:
It's possible for one Bazel repository to be loaded multiple times, and it is therefore unclear which version gets loaded.
Ex: rules_pkg is loaded by both rules_docker and vaticle_bazel_distribution. rules_docker loads v1 and vaticle_bazel_distribution loads v2, and which version gets used is controlled by their ordering in the WORKSPACE file.
Goal:
WORKSPACE file
The desirable state to be in is that each releasable artifact (nominally, binaries and distributed packages) is built with a specific rule that generates a single, versioned, complete archive containing all files required to distribute it. This means that the artifact can be checked and tested, and this testing acts as a checkpoint to validate downstream dependents.
As hermetic as Bazel is, dependency versions are decided by the main WORKSPACE, which means that building anything from another repo is always non-hermetic. Artifact dependency as a pattern enforces the opposite: the hermetic artifact cannot be unintentionally different downstream.
In order to better support this pattern, we should bake it into our build rules, supporting a clearer notion of "the final target" (approximately assemble_linux_targz right now) and far less repetition of the config of the resulting artifact.
Where assemble_linux_targz acts as a "group of packages", we should instead combine other final artifacts, as this ensures that testing has been performed properly on the individual components.
We want //ci:sync-version to bump the version of the LOCAL repository in which it is executed. That is, if the current repository has a VERSION file containing X.Y.Z, calling //ci:sync-version changes the patch component from Z to Z+1.
E.g. the VERSION file content changes from 1.5.2 to 1.5.3.
Now, we have the option of:
a) providing /path/to/VERSION as an argument to the bazel run command (e.g. -- /path/to/VERSION)
b) providing /path/to/VERSION as an argument in the target declaration in the BUILD file
c) not providing /path/to/VERSION, as it should always be expected (and thus hard-coded) to be at the root of every repository.
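Whichever option is chosen, the bump itself is small. The sketch below assumes a plain X.Y.Z version string. `bump_patch` and `sync_version` are hypothetical names, and the default path reflects option (c) only as an illustration.

```python
def bump_patch(version):
    """Return `version` (X.Y.Z) with the patch component incremented by one."""
    # Unpacking raises ValueError if the string is not exactly three dot-separated parts.
    major, minor, patch = version.strip().split(".")
    return "{}.{}.{}".format(major, minor, int(patch) + 1)


def sync_version(path="VERSION"):
    """Rewrite the VERSION file at `path` in place and return the new version."""
    with open(path) as version_file:
        current = version_file.read()
    bumped = bump_patch(current)
    # Preserve a trailing newline if the original file had one.
    trailing = "\n" if current.endswith("\n") else ""
    with open(path, "w") as version_file:
        version_file.write(bumped + trailing)
    return bumped
```

With options (a) or (b), the same function would simply receive the path from the command line or from the rule instead of using the default.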
Release approval needs to print an intuitive error message when the credential isn't properly set. Right now it prints nothing:
Tests have been ran and everything is in a good, releasable state. It is possible to proceed with the release process. Waiting for approval.
.
An example can be found here: https://circleci.com/gh/graknlabs/graql/487
Right now, run-bazel-rbe.sh can't execute bazel run commands because they are expanded incorrectly: --config=rbe is appended at the very end:
bazel run //:deploy-brew -- snapshot --config=rbe
With the correct expansion, --config=rbe should be put right before --:
bazel run //:deploy-brew --config=rbe -- snapshot
Rewrite the script in Python so that:
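A minimal sketch of the argument expansion such a Python rewrite could perform is given below. It covers only the placement of --config=rbe described above. The function name `add_rbe_config` is hypothetical.

```python
def add_rbe_config(args, config="--config=rbe"):
    """Insert `config` right before the first `--`, or append it if there is none."""
    if "--" in args:
        index = args.index("--")
        return args[:index] + [config] + args[index:]
    return args + [config]


# Example: the failing command from above, given as an argument list.
# add_rbe_config(["bazel", "run", "//:deploy-brew", "--", "snapshot"])
```

Working on an argument list rather than a joined string avoids re-quoting problems when the result is handed to `subprocess`.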
Add <module name="UnusedImports"/> in https://github.com/graknlabs/build-tools/blob/master/checkstyle/templates/checkstyle.xml
This is to catch all the unused imports in our codebase.
If the CI workflow of an older commit in docs is being re-run, we don't want //ci:release-docs to override the submodule reference of docs in web-dev if that reference is more up to date than the older commit. Thus, we should only bump the commit of docs in web-dev if the new commit is ahead of, or has diverged from, the current reference.
You can see a reference implementation of this in: https://github.com/graknlabs/grabl/blob/5961de4a418b8bf4b2f1defe316b44ed5f04cfe4/git_repo.py#L372:L384
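Independently of the reference implementation linked above, the decision can be sketched with `git merge-base --is-ancestor`, which exits with 0 when the first commit is an ancestor of the second and 1 when it is not. The function names and the injected `ancestor_check` callback are hypothetical choices made to keep the decision logic separate from the git call.

```python
import subprocess


def is_ancestor(repo_dir, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` in `repo_dir`."""
    code = subprocess.call(
        ["git", "merge-base", "--is-ancestor", ancestor, descendant],
        cwd=repo_dir,
    )
    if code not in (0, 1):
        raise RuntimeError("git merge-base failed with exit code {}".format(code))
    return code == 0


def should_bump(current, candidate, ancestor_check):
    """Decide whether the submodule should move from `current` to `candidate`.

    `ancestor_check(a, b)` must return True when commit `a` is an ancestor of `b`.
    """
    if candidate == current:
        return False  # nothing to do
    if ancestor_check(candidate, current):
        return False  # candidate is older than what is already referenced
    return True  # candidate is ahead of, or has diverged from, the current reference
```

In a real job, `ancestor_check` would be something like `lambda a, b: is_ancestor(docs_dir, a, b)`, where `docs_dir` is the local checkout of docs.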
Repositories such as client-nodejs, which generate large amounts of (ignored) files in the source folder when compiled, are difficult for tool/checkstyle/test-coverage to analyse because they contain so many files.
We already work around this problem by excluding specific named folders and files from the checkstyle coverage test, as follows:
workspace_files, _ = tc.shell_execute([
    'find', '.',
    '(', '-name', '.git',
    '-o', '-name', '.idea',
    '-o', '-name', '.ijwb',
    '-o', '-name', '.github',
    '-o', '-name', '.bazelversion',
    '-o', '-name', '.gitkeep',
    '-o', '-name', 'VERSION',
    '-o', '-name', '*.md',
    '-o', '-name', 'node_modules',
    ')', '-prune', '-o', '-type', 'f', '-print'
], cwd=os.getenv("BUILD_WORKSPACE_DIRECTORY"))
However, this is woefully unmaintainable. It would be much nicer if the test parsed the .gitignore file instead.
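One way to honour .gitignore without writing a parser is to let git list the files. The sketch below is only an assumption about how this could look: it uses `git ls-files` with `--cached --others --exclude-standard`, which lists tracked files plus untracked files that are not ignored. The names `split_nul` and `list_workspace_files` are hypothetical.

```python
import subprocess

# -z separates paths with NUL bytes, so unusual file names are not quoted or escaped.
GIT_LS_FILES = ["git", "ls-files", "-z", "--cached", "--others", "--exclude-standard"]


def split_nul(output):
    """Split NUL-separated `git ls-files -z` output (bytes) into a list of paths."""
    return [path for path in output.decode("utf-8").split("\0") if path]


def list_workspace_files(workspace_dir):
    """Return the files in `workspace_dir` that .gitignore does not exclude."""
    return split_nul(subprocess.check_output(GIT_LS_FILES, cwd=workspace_dir))
```

Names such as VERSION or *.md are not normally listed in .gitignore, so a short explicit exclusion list would probably still be needed on top of this.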
Can we have a quick script to visualise all of our transitive dependencies in a diagram? We can take inspiration from this script in @graknlabs_hypergraph.
If we can have a way to differentiate a direct dependency from a transitive dependency in this diagram, that would be great. But if it's not doable with an immediately available API, don't worry about it. We can always tell the direct dependencies from the artifacts.bzl file.
We need to improve our Maven dependency snapshot by categorising the dependencies into three categories:
- artifacts-distribution.snapshot: the dependencies that get included in the distribution. For example, for Grakn Core it would be the zip and tar distribution and for Client Java the Maven JAR.
- artifacts-test.snapshot: dependencies that are used in tests (e.g., JUnit, GraknCoreRunner, Cucumber), minus the dependencies in artifacts-distribution.snapshot. (PS: could this be sufficiently implemented by querying for all dependencies of java_test targets?)
- artifacts-build.snapshot: the rest of the dependencies, which are not in artifacts-distribution.snapshot or artifacts-test.snapshot.
They should be generated in the same way as the old one, e.g., when the user invokes update.sh:
$ ./dependencies/maven/update.sh assemble-mac-zip, assemble-linux-targz, assemble-windows-zip
Regenerating new snapshot files from WORKSPACE...
DONE!
'artifacts-distribution.snapshot' refreshed: 3 dependencies added, 1 dependencies removed.
Added dependencies:
- @maven//a
- @maven//b
- @maven//c
Removed dependencies:
- @maven//d
'artifacts-test.snapshot' refreshed: 0 dependencies added, 0 dependencies removed.
'artifacts-build.snapshot' refreshed: 0 dependencies added, 0 dependencies removed.
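The categorisation and the added/removed report above boil down to set differences. The following sketch assumes the three input lists have already been obtained (for example from Bazel queries). `categorise` and `diff_snapshot` are hypothetical names.

```python
def categorise(all_deps, distribution_deps, test_deps):
    """Split dependencies into the three snapshot categories described above."""
    distribution = set(distribution_deps)
    test = set(test_deps) - distribution          # test deps minus distribution deps
    build = set(all_deps) - distribution - test   # everything else
    return {
        "artifacts-distribution.snapshot": sorted(distribution),
        "artifacts-test.snapshot": sorted(test),
        "artifacts-build.snapshot": sorted(build),
    }


def diff_snapshot(old, new):
    """Return (added, removed) dependencies between two snapshot lists."""
    return sorted(set(new) - set(old)), sorted(set(old) - set(new))
```

`diff_snapshot` yields exactly the two lists needed to print the "Added dependencies" and "Removed dependencies" sections of the report.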
[x] bin
[x] builder
[ ] antlr
[x] java
[x] grpc
[x] nodejs
[x] python
[ ] config
[x] distribution
[ ] apt
[ ] brew
[x] docker
[x] maven
[ ] npm
[ ] pypi
[x] rpm
[x] library
[x] maven
[ ] npm
[ ] pypi
[x] tools
[x] bazel-install
[x] bazel_run
[x] checkstyle
[x] misc
[x] release
[x] sonarcloud
[x] sync
[x] test_cache
[x] unused-deps
[x] images
This is much better than the current way where we have to add 'Checkout SSH keys' in CircleCI.
Right now the first 2 lines of our AGPL and Apache licenses are:
# GRAKN.AI - THE KNOWLEDGE GRAPH
# Copyright (C) 2019 Grakn Labs Ltd
Can we change both lines of both license headers to be:
# Copyright (C) 2020 Grakn Labs
unused-deps fails to detect unused dependencies in our native_java_libraries macro. This is true for dependencies declared in the deps field as well as in the native_libraries_deps field.
Add "@maven//:info_picocli_picocli" to //rocks:rocks: https://github.com/graknlabs/grakn/blob/7ef3ac2843774173c72d1e69d658c0c2f21006f0/rocks/BUILD#L29-L70
native_java_libraries(
    name = "rocks",
    srcs = glob(["*.java"]),
    deps = [
        # Internal dependencies
        "//common:common",
        "//concurrent:concurrent",
        "//graph:graph",
        "@maven//:info_picocli_picocli",
        ...
)
bazel run @graknlabs_dependencies//tool/unuseddeps:unused-deps -- remove
Expected: the "@maven//:info_picocli_picocli" dep should be removed.
Actual: the "@maven//:info_picocli_picocli" dep is not removed.
We have forked these dependencies: @rules_antlr, @rules_python, @io_bazel_skydoc, @com_github_grpc_grpc, @stackb_rules_proto. Maintaining a fork requires more effort. Let's get the changes submitted to the mainline.
Here's the description of the forks in more detail:
# defined in @graknlabs_bazel_distribution
# Load Docker
git_repository(
    name = "io_bazel_skydoc",
    remote = "https://github.com/graknlabs/skydoc.git",
    branch = "experimental-skydoc-allow-dep-on-bazel-tools",
)
# defined in @graknlabs_dependencies
def deps():
    git_repository(
        name = "rules_antlr",
        remote = "https://github.com/graknlabs/rules_antlr",
        commit = "8fd16b2900ebf6b893c2b7695850960dcc2d102c"
    )
# defined in @graknlabs_dependencies
def deps():
    git_repository(
        name = "com_github_grpc_grpc",
        remote = "https://github.com/graknlabs/grpc",
        commit = "4a1528f6f20a8aa68bdbdc9a66286ec2394fc170"
    )
    ...
    git_repository(
        name = "stackb_rules_proto",
        remote = "https://github.com/graknlabs/rules_proto",
        commit = "fd3aa227fdaa178c077ef9d72156b772d3b8c05d",
    )
# defined in @graknlabs_kglib
def rules_python():
    git_repository(
        name = "rules_python",
        # Grakn python rules
        remote = "https://github.com/graknlabs/rules_python.git",
        commit = "ee519e17ed5265bdd2431937bd271e3b76ad5b0a"
    )
NOTE: The issue also includes forked dependencies used by other repos (ie., bazel-distribution and kglib).
The number of environment variables has increased. We need to review them and see if we can reduce their number and make them more consistent.
We are making use of the TypeDBRunner and TypeDBClusterRunner classes for orchestrating tests in Java. Having the same classes in Python would be beneficial.
Encountered this issue on graknlabs/kglib, which uses pip3_import to install Python dependencies.
The issue is encountered for a CI job defined as follows:
release-approval:
  machine: true
  steps:
    - install-bazel-linux-rbe
    - checkout
    - run: pyenv install 3.6.3
    - run: pyenv global 3.6.3
    - run: |
        export RELEASE_APPROVAL_USERNAME=$REPO_GITHUB_USERNAME
        export RELEASE_APPROVAL_TOKEN=$REPO_GITHUB_TOKEN
        bazel run @graknlabs_build_tools//ci:release-approval
The error appears to stem from a difference between Python 2 and Python 3 in the argument types that hmac.new expects, indicating that release-approval.py isn't compatible with Python 3:
INFO: Build completed successfully, 4 total actions
Traceback (most recent call last):
File "/home/circleci/.cache/bazel/_bazel_circleci/f16e36219ef33c22efc2ad20f3e3775c/execroot/kglib/bazel-out/k8-fastbuild/bin/external/graknlabs_build_tools/ci/release-approval.runfiles/graknlabs_build_tools/ci/release-approval.py", line 41, in <module>
new_release_signature = hmac.new(git_token, json.dumps(grabl_data), hashlib.sha1).hexdigest()
File "/opt/circleci/.pyenv/versions/3.6.3/lib/python3.6/hmac.py", line 144, in new
return HMAC(key, msg, digestmod)
File "/opt/circleci/.pyenv/versions/3.6.3/lib/python3.6/hmac.py", line 42, in __init__
raise TypeError("key: expected bytes or bytearray, but got %r" % type(key).__name__)
TypeError: key: expected bytes or bytearray, but got 'str'
Exited with code 1
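In Python 3, `hmac.new` requires the key (and the message) to be bytes, so one likely fix is to encode both before signing. The sketch below reuses the names `git_token` and `grabl_data` from the traceback. The wrapper function `sign` and the choice of UTF-8 are assumptions.

```python
import hashlib
import hmac
import json


def sign(git_token, grabl_data):
    """Return the hex HMAC-SHA1 signature of `grabl_data` using `git_token` as key."""
    key = git_token.encode("utf-8")                      # str -> bytes for the key
    message = json.dumps(grabl_data).encode("utf-8")     # str -> bytes for the message
    return hmac.new(key, message, hashlib.sha1).hexdigest()
```

If the script must keep running under Python 2 as well, the same code works there for ASCII tokens, since `str.encode` returns a byte string in both versions.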
We need to add the ability to build a cross-platform rocksdbjni-dev (Windows, Linux, OS X).
Given that it can only be achieved by building the library on each platform (e.g., you can only get a Windows library by performing the build step on a Windows machine), we should approach the problem by automating the whole build process in the Grabl CI:
- //library/rocksdbjni-dev onto its own repository, rocksdbjni-dev-builder
- deploy-windows, deploy-mac, deploy-linux, and deploy-all.
- deploy-mac must run on a Mac machine. Given that it's not possible to do so in Azure, we'll have to spawn the job in CircleCI
- deploy-all should download the deployed library from the other three jobs and combine them into a single final JAR. This final JAR is what should be included in the Grakn Core distribution.
We found a Bazel rule, @rules_foreign_cc, which allows for building make projects from inside a Bazel workspace. We should replace our builder script RocksDbBuilder.kt with this rule if possible.
Create a dependency-analysis command that can determine whether the given symbols are up to date with the latest master.
bazel run @graknlabs_build_tools//grabl/analysis:dependency-analysis -- $CIRCLE_ORG/$CIRCLE_REPO@$CIRCLE_SHA1 graknlabs/graql@graknlabs_graql graknlabs/protocol@graknlabs_protocol graknlabs/client-java@graknlabs_client_java
The program should post a JSON request to the following URL: https://grabl.io/event/automation/analysis. The JSON request should look like this:
{
  "dependency-analysis": {
    "workflow": "graknlabs/grakn@edbea5c:build:1:quality:2",
    "commit-dependency": [
      {
        "repository": "graknlabs/common",
        "commit": "123ab46",
        "status": "up-to-date"
      },
      {
        "repository": "graknlabs/protocol",
        "commit": "ac563bd",
        "status": "up-to-date"
      },
      {
        "repository": "graknlabs/graql",
        "commit": "876dc53",
        "status": "out-of-date"
      }
    ]
  }
}
The script should fail if it receives responses other than 2xx.
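A minimal sketch of the request side, using only the Python standard library, might look as follows. The URL and JSON shape come from the description above. The function names and the `(repository, commit, status)` tuple input are hypothetical, and how the up-to-date status itself is computed is not covered here.

```python
import json
import urllib.error
import urllib.request

ANALYSIS_URL = "https://grabl.io/event/automation/analysis"


def build_payload(workflow, dependencies):
    """Build the request body from (repository, commit, status) tuples."""
    return {
        "dependency-analysis": {
            "workflow": workflow,
            "commit-dependency": [
                {"repository": repository, "commit": commit, "status": status}
                for repository, commit, status in dependencies
            ],
        }
    }


def is_success(status_code):
    """Return True only for 2xx responses."""
    return 200 <= status_code < 300


def post_analysis(payload, url=ANALYSIS_URL):
    """POST `payload` as JSON and raise if the response is not 2xx."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # urlopen raises for 4xx and 5xx responses
    if not is_success(status):
        raise RuntimeError("dependency-analysis request failed with HTTP {}".format(status))
    return status
```

Letting the `RuntimeError` propagate makes the script exit with a non-zero code, which is enough to fail the CI job.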
We are making use of the TypeDBRunner and TypeDBClusterRunner classes for orchestrating tests in Java. Having the same classes in NodeJS would be beneficial.
See the artifacts.snapshot file from @graknlabs_verification linked below.
We can see that each dependency artifact is written twice: once with the version, and once without. Is this expected? This feels like a bug. Can we fix it so that each artifact is output only once, with the version?
Are we expecting //tool/checkstyle:test-coverage to throw if you forget to add a .bzl file (or any file, for that matter) to checkstyle_test? That's what it's supposed to do, right?
If you look at this PR: vaticle/typedb-common#31 (before my changes that are about to come in), the following files were not covered by checkstyle and they actually have the wrong license:
//dependencies/maven/artifacts.bzl
//.grabl/automation.yml
//binary/grakn-bin.spec
//dependencies/maven/update.sh
in that repo is also missing license header.
Given that Grakn now has a multi-repo architecture, and especially after we migrate our full distribution/end-to-end tests to a separate repo (graknlabs/test) (issue vaticle/typedb#5270), we can only know if Grakn Core (or any repository) is releasable after all other repositories that depend on it have passed their CI workflow after sync-dependencies.
Introduce a @graknlabs_build_tools//release-validate CI tool to validate whether a commit on a given repository is releasable, by checking that the last stage of tests in @graknlabs_test triggered by the sync-dependencies from the source commit has passed.
We have reverted error outputs of process executions in two places:
We need to fix them again so that errors are returned properly.
This would require a custom license header, which is accepted as the license param in checkstyle_test.
Problem to Solve
Given that Grakn now has a multi-repo architecture, we need to make sure that a repository is released only if it depends on other repositories by tag, rather than by commit. If a dependency is referenced by commit, that dependency is still in "snapshot" mode, and only available in the *-snapshot distribution repos.
Proposed Solution
Introduce @graknlabs_build_tools//release-validate-deps CI tool to validate whether a given repository is releasable by checking that all of its releasable Grakn Labs dependencies are referenced by tag. An attempt to release a repository which depends on snapshot dependencies should produce a failure.
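The core of such a check could be sketched as below. The input format is purely an assumption: a dictionary mapping each dependency's workspace name to either a `tag` or a `commit` entry, mirroring the attributes of Bazel's `git_repository`. The function names are hypothetical too.

```python
def find_snapshot_deps(refs, tagged_deps):
    """Return the dependencies in `tagged_deps` that are not referenced by tag."""
    return sorted(
        name for name in tagged_deps
        if not refs.get(name, {}).get("tag")
    )


def validate_release(refs, tagged_deps):
    """Raise ValueError if any releasable dependency is still a snapshot."""
    snapshot_deps = find_snapshot_deps(refs, tagged_deps)
    if snapshot_deps:
        raise ValueError(
            "cannot release: dependencies referenced by commit instead of tag: "
            + ", ".join(snapshot_deps)
        )
```

A dependency that is missing from `refs` altogether is treated as a failure as well, on the assumption that an unknown reference is safer to reject than to accept.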
RocksDbBuilder depends on a parameter that is currently hardcoded: the JAVA_HOME path. Its desired value can be obtained trivially by running /usr/libexec/java_home; however, in the Bazel sandbox, /usr is not accessible. On my machine JAVA_HOME is /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home but that would not be true in the general case.
Currently, in order to alter this parameter, we have to change the source code, i.e. the Kotlin code.
This parameter should be a command-line parameter that we pass into bazel build, as follows:
bazel build //library/rocksdbjni:deploy-maven --define java_home=`/usr/libexec/java_home`
We can't currently do it because rocksdbjni-jar is a genrule. We would need to create a full-fledged Bazel rule to replace our genrule.
Then, the path would be available in the rule as ctx.var["java_home"], and we'd be able to pass it into the Kotlin script using arguments (provided we use ctx.actions.run).
bazel build //... should be runnable without RBE in order to support PRs from external forks. Right now it would fail: https://circleci.com/gh/graknlabs/grakn/21383
This is what every repo would have:
load("@graknlabs_build_tools//ci:rules.bzl", "release_validate_deps")
release_validate_deps(
    name = "release-validate-deps",
    refs = "@graknlabs_grakn_core_workspace_refs//:refs.json",
    tagged_deps = ["@graknlabs_console", …, "@graknlabs_protocol"],
    tags = ["manual"],
)
Sample invocation: bazel test //:release-validate-deps
There may be some unused settings in .bazelrc left over from the Google RBE days. Let's remove them.
NOTE: This issue applies to all repositories, but for conciseness, I decided to create only one issue here.
We need to use Grabl's API key credential when deleting the release branch in the release-cleanup CI job of every repository. Right now, the deletion is done using a separate SSH key. Given that we already have Grabl's API key credential, we should use that instead:
release-cleanup:
git push --delete https://[email protected]/$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME $CIRCLE_BRANCH
PS: This issue needs to be resolved in every repository in Graknlabs. In order to not have to create multiple identical issues, I've decided to create this issue in build-tools.