ccremer / greposync

Managed Git repositories in bulk

Home Page: https://ccremer.github.io/greposync

License: Apache License 2.0

Dockerfile 0.15% Makefile 1.26% Go 98.30% Smarty 0.16% Shell 0.13%

git modulesync

greposync's Issues

Randomize sync order

Git hosting services like GitHub have API rate limiting in place. Even when using a dedicated API token, GitHub may block requests if there's too much activity with the token (either by amount of requests or by bursting requests in short time). One can retry later since this rate limit gets reset after some time.

An idea to counter this is to randomize the order in which repositories get updated or synced.

This works around cases where the limit is always hit at the same point because the repositories are processed in the same order.
Consider a list of repositories from A to Z. Repos at the beginning of the list will almost always be synced, while the ones at the end never will be if the rate limit kicks in halfway through.

Randomizing the list at the start should distribute PR creation more evenly. However, it is only useful when the update runs on a schedule or without filters.
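A minimal sketch of the idea, assuming the repositories are available as a plain list of names. The function name shuffled and the seed parameter are made up for illustration and are not part of greposync.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// shuffled returns a copy of repos in random order and leaves the input
// untouched. The seed is a parameter so that a run can be reproduced.
func shuffled(repos []string, seed int64) []string {
	out := make([]string, len(repos))
	copy(out, repos)
	rng := rand.New(rand.NewSource(seed))
	rng.Shuffle(len(out), func(i, j int) {
		out[i], out[j] = out[j], out[i]
	})
	return out
}

func main() {
	repos := []string{"repo-a", "repo-b", "repo-c", "repo-d"}
	// A time-based seed gives a different order on every scheduled run.
	fmt.Println(shuffled(repos, time.Now().UnixNano()))
}
```

Shuffling a copy keeps the configured order available, e.g. for a stable summary at the end of the run.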

Suppress output when updating in parallel

Currently, logs and messages are written to the logging facility immediately.
If jobs run in parallel, the output can be confusing because it is not easy to tell which messages belong to which repo.

The idea is to reduce the logging output to a single "success" message per repo or a summarized error that lists each update error.

Add a flag to show the log output of the failed updates in an ordered fashion after the run (e.g. like Docker BuildKit).
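One way this could look, sketched under assumptions: every job logs into its own buffer, and the buffers are only printed after all jobs have finished. The names result, runAll, report and demoUpdate are hypothetical, and demoUpdate merely simulates a failing repository.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sync"
)

// result holds the buffered log and the outcome of one repository update.
type result struct {
	repo string
	log  bytes.Buffer
	err  error
}

// updateFunc performs one update and logs through logf instead of stdout.
type updateFunc func(repo string, logf func(format string, args ...any)) error

// runAll updates all repos in parallel. Each job writes to its own buffer,
// so messages of different repositories never interleave.
func runAll(repos []string, update updateFunc) []*result {
	results := make([]*result, len(repos))
	var wg sync.WaitGroup
	for i, repo := range repos {
		r := &result{repo: repo}
		results[i] = r
		wg.Add(1)
		go func() {
			defer wg.Done()
			r.err = update(r.repo, func(format string, args ...any) {
				fmt.Fprintf(&r.log, format, args...)
				r.log.WriteByte('\n')
			})
		}()
	}
	wg.Wait()
	return results
}

// report prints one line per repo in input order. If showLog is set, the
// buffered log of each failed update follows its error line.
func report(results []*result, showLog bool) string {
	var out bytes.Buffer
	for _, r := range results {
		if r.err == nil {
			fmt.Fprintf(&out, "SUCCESS %s\n", r.repo)
			continue
		}
		fmt.Fprintf(&out, "FAILED  %s: %v\n", r.repo, r.err)
		if showLog {
			out.Write(r.log.Bytes())
		}
	}
	return out.String()
}

// demoUpdate is a stand-in for the real update; it fails for "repo-b".
func demoUpdate(repo string, logf func(format string, args ...any)) error {
	logf("cloning %s", repo)
	if repo == "repo-b" {
		logf("push rejected")
		return errors.New("push failed")
	}
	return nil
}

func main() {
	results := runAll([]string{"repo-a", "repo-b"}, demoUpdate)
	fmt.Print(report(results, true))
}
```

Buffering costs memory per job, but it keeps the ordering decision in one place: the report can follow the configured repo order regardless of which job finished first.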

Allow to influence the output path

Sometimes a repository benefits from a template even though its structure differs slightly from the template's.
For example, the file template/module/.gitignore might actually end up in a subdir of the target repo, e.g. my-repo/.gitignore.

Maybe also apply this feature to whole directories, not just files?

Support init files

Sometimes a file needs to exist in every repository, but its content is individual and should stay unmanaged.
However, to improve the experience of onboarding a new repository, an initial version of this file should be created if it doesn't exist.

Basically: If file is missing -> create from template -> don't ever touch again.

We could do this with a new special flag: initOnly: true
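The "create once, never touch again" behaviour could be sketched as follows. This is only an illustration: writeInitOnly and demo are hypothetical names, and the file name CHANGELOG.md is an arbitrary example.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// writeInitOnly creates path with the given content only if it does not
// exist yet. It reports whether the file was created; an existing file is
// never modified.
func writeInitOnly(path string, content []byte) (bool, error) {
	// O_EXCL makes the create fail if the file is already there.
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
	if errors.Is(err, fs.ErrExist) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	if _, err := f.Write(content); err != nil {
		f.Close()
		return false, err
	}
	return true, f.Close()
}

// demo runs two sync rounds against a temporary directory and returns
// whether each round created the file, plus the final file content.
func demo() (first, second bool, content string, err error) {
	dir, err := os.MkdirTemp("", "initonly")
	if err != nil {
		return false, false, "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "CHANGELOG.md")

	if first, err = writeInitOnly(path, []byte("initial\n")); err != nil {
		return
	}
	if second, err = writeInitOnly(path, []byte("overwritten\n")); err != nil {
		return
	}
	data, err := os.ReadFile(path)
	return first, second, string(data), err
}

func main() {
	first, second, content, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("first=%v second=%v content=%q\n", first, second, content)
}
```

Opening with O_EXCL avoids a separate existence check, so there is no window between checking and writing.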

Copy file permissions from template

There may be cases where executable scripts are synced.
Ensure that rendered files keep the same file permissions as their templates.
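A small sketch of how the permission bits could be carried over, assuming the rendered content is already available as bytes. writeWithTemplateMode and demo are hypothetical names, and the script file names are examples only.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeWithTemplateMode writes rendered content to target and gives it the
// same permission bits as the template file.
func writeWithTemplateMode(templatePath, target string, rendered []byte) error {
	info, err := os.Stat(templatePath)
	if err != nil {
		return err
	}
	perm := info.Mode().Perm()
	if err := os.WriteFile(target, rendered, perm); err != nil {
		return err
	}
	// WriteFile is subject to the umask and keeps the mode of an existing
	// file, so the permissions are set explicitly afterwards.
	return os.Chmod(target, perm)
}

// demo renders an executable template into a temporary directory and
// returns the permission string of the rendered file.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "perm")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	tpl := filepath.Join(dir, "build.sh.tpl")
	if err := os.WriteFile(tpl, []byte("#!/bin/sh\n"), 0o600); err != nil {
		return "", err
	}
	if err := os.Chmod(tpl, 0o755); err != nil {
		return "", err
	}
	out := filepath.Join(dir, "build.sh")
	if err := writeWithTemplateMode(tpl, out, []byte("#!/bin/sh\necho hi\n")); err != nil {
		return "", err
	}
	info, err := os.Stat(out)
	if err != nil {
		return "", err
	}
	return info.Mode().Perm().String(), nil
}

func main() {
	mode, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(mode)
}
```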

Add possibility to define values for whole directories

It could be useful in templating or configuring repositories to apply common variables to a whole bunch of files within a subdirectory.

.github:
  defaultBranch: master

.github/workflows/release.yml:
  # here the `defaultBranch` property would be inherited from the `.github` key. Or overwritten.
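The inheritance could work roughly like this, as a sketch: walk from the outermost parent directory down to the file and let more specific keys overwrite inherited ones. The Values type and the resolve function are hypothetical, and the owner key in the usage is made up for the example.

```go
package main

import (
	"fmt"
	"path"
)

// Values maps a file or directory path to its template variables,
// mirroring the top-level keys of a sync config file.
type Values map[string]map[string]any

// resolve returns the variables for file. Values of parent directories are
// applied first (outermost first) and overridden by more specific keys.
func resolve(all Values, file string) map[string]any {
	// Collect the chain: file, parent dir, grandparent dir, ...
	var chain []string
	for p := path.Clean(file); p != "." && p != "/"; p = path.Dir(p) {
		chain = append(chain, p)
	}
	merged := map[string]any{}
	for i := len(chain) - 1; i >= 0; i-- {
		for k, v := range all[chain[i]] {
			merged[k] = v
		}
	}
	return merged
}

func main() {
	all := Values{
		".github":                       {"defaultBranch": "master", "owner": "ccremer"},
		".github/workflows/release.yml": {"defaultBranch": "main"},
	}
	fmt.Println(resolve(all, ".github/workflows/release.yml"))
	fmt.Println(resolve(all, ".github/workflows/ci.yml"))
}
```

With these inputs, release.yml sees its own defaultBranch plus the inherited owner, while ci.yml only sees the values of the .github key.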

Expose metadata of the control repository

Currently, .Metadata exposes .Repository of the Git repository that is being updated.

This idea is about adding the Repository details of the controlling/template repository itself.

Ideas:

.Metadata.Self
.Metadata.Control, .Metadata.ControlRepository
.Metadata.Greposync

These metadata could be used in templates, but also PR body and title.

Possible fields:

Git branch, tag, commit
Git remote URL
Date and time of the update (although Sprig's date functions can do this already)

Rethink CLI library

While https://github.com/urfave/cli works, there are some issues that I see:

It can distinguish between global and subcommand-local flags. However, it cannot define the same flag for 2 subcommands without making it global. To me, it doesn't make sense to make a flag global just because 2 subcommands use the same flag name.
Global flags have to be put before the subcommand. For example, if --verbose is a global flag, then it has to be <bin> --verbose subcommand, while <bin> subcommand --verbose doesn't work. This is a bad user experience, but one that can be accepted if the chosen lib cannot handle this.

However, the CLI flag library does stand out in the --help output. We see:

Description
Usage
Flag names, both short and long
For non-boolean flags, we see flags like --pr-bodyTemplate value, where value indicates that something has to be added. Not every CLI library does this.
Default values
No hidden flags except --help (some CLI libraries add additional ones completely unusable in some contexts)

Other libraries

https://github.com/spf13/cobra (I'd rather avoid this, too many dependencies and forced workflow)
https://github.com/integrii/flaggy (seems super interesting but doesn't seem to be very active)
https://github.com/alecthomas/kingpin (dead)
https://github.com/alecthomas/kong (has a weird interface to configure commands and flags)
https://github.com/mitchellh/cli (dead?)
https://github.com/gookit/gcli (does too much, opinionated with colors, progressbar, unwanted flags etc)
https://github.com/jawher/mow.cli (doesn't have any red flags yet, gotta try it out)

Notes:

CLI lib doesn't need to cover YAML or env vars, we can do this with Koanf already
CLI lib shouldn't need to define hierarchical flag names to make it compatible with Koanf. I'd rather have a flag like --include that gets mapped to the project.include struct. We can/have to somehow adapt them though since the YAML config files provide options that can be overridden by CLI flags. OTOH, there are settings available exclusively via CLI flags (e.g. --dry-run)

New Pull request didn't get labels applied

In ccremer/go-command-pipeline#16 the labels were not applied.

Logs

  DEBUG   Using config (config="{"project":{"rootDir":"repos","jobs":1},"log":{"level":"debug","showDiff":true,"showLog":true},"pr":{"create":true,"targetBranch":"","labels":["greposync"],"bodyTemplate":"This Pull request updates this repository with changes from a greposync template repository.","subject":"Update from greposync"},"template":{"rootDir":"template"},"git":{"skipReset":false,"skipCommit":false,"skipPush":false,"forcePush":false,"commitMessage":"Update from greposync","commitBranch":"greposync","defaultBranch":"","name":"","namespace":"ccremer"},"repositoryLabels":null}")
  DEBUG   Executing step (step="configure infrastructure")
  DEBUG   Executing step (step="fetch managed repos config")
  DEBUG   Loading config file (name="managed_repos.yml")
...
  DEBUG    (github.com/ccremer/go-command-pipeline) Executing step (step="push changes")
  INFO     (github.com/ccremer/go-command-pipeline) git push origin greposync
  DEBUG    (github.com/ccremer/go-command-pipeline) Executing step (step="find existing pull request")
  DEBUG    (github.com/ccremer/go-command-pipeline) No PR found
  DEBUG    (github.com/ccremer/go-command-pipeline) Executing step (step="ensure pull request")
  INFO     (github.com/ccremer/go-command-pipeline) PR created (url="https://github.com/ccremer/go-command-pipeline/pull/16")
 SUCCESS   (github.com/ccremer/go-command-pipeline) Update finished for repository

Expected:

New PR should have received the "greposync" label.
Note: At the point of running, the repository didn't have the "greposync" label. Maybe it needs to exist first?

Version

v0.2.0

Add GitLab provider

Currently, only GitHub is supported when it comes to managing PRs and labels.

Add support for GitLab API

Ignore errors of failed updates

When updating multiple repositories, some of them may fail for various reasons: wrong permissions, network error, etc.
Add a flag that allows skipping a failed repository update so that the others can continue.
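A possible shape for this, sketched with hypothetical names: updateAll, the skipBroken parameter (standing in for the proposed flag) and demoUpdate are all assumptions. Failures are collected with errors.Join, which needs Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
)

// updateAll updates every repository in order. With skipBroken set, a failed
// update is recorded and the loop continues; otherwise it stops at the first
// error. The returned error combines all recorded failures (nil if none).
func updateAll(repos []string, update func(string) error, skipBroken bool) (done []string, err error) {
	var errs []error
	for _, repo := range repos {
		if uerr := update(repo); uerr != nil {
			errs = append(errs, fmt.Errorf("%s: %w", repo, uerr))
			if !skipBroken {
				break
			}
			continue
		}
		done = append(done, repo)
	}
	return done, errors.Join(errs...)
}

// demoUpdate is a stand-in for the real update; it fails for "repo-b".
func demoUpdate(repo string) error {
	if repo == "repo-b" {
		return errors.New("permission denied")
	}
	return nil
}

func main() {
	repos := []string{"repo-a", "repo-b", "repo-c"}
	done, err := updateAll(repos, demoUpdate, true)
	fmt.Println("updated:", done)
	fmt.Println("errors:", err)
}
```

Returning the joined error even when skipping keeps the exit code meaningful: the run can continue and still report failure at the end.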

Dependency Dashboard

This issue provides visibility into Renovate updates and their statuses. Learn more

Awaiting Schedule

These updates are awaiting their schedule. Click on a checkbox to get an update now.

Update golang.org to 9780585 (golang.org/x/oauth2, golang.org/x/sys)

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Update module github.com/Masterminds/sprig/v3 to v3.2.2
Update codecov/codecov-action action to v3
Update mikepenz/release-changelog-builder-action action to v3
Click on this checkbox to rebase all open PRs at once

Check this box to trigger a request for Renovate to run again on this repository

Sync labels in GitHub

Add ability to keep a certain set of labels in sync across all managed repos.
Sync properties like color, description
Remove unwanted labels
Leave unmanaged labels alone

Suppress git diff

The git diff command can really bloat the console output.
Add a flag that disables the diff'ing.

Support for partial templating

There are use cases where a repository may need to customize a bigger part of a template, but keep everything managed by modulesync.
Those files would then be merged together.

Example 1: Add custom job to a GitHub workflow. One could provide such YAML via .sync.yml file, but maintaining GitHub actions in this file is not really the way to go.

Possible implementation:

With Go templating: {{ include "custom-file.txt" "fallback-text" }}
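How such an include function might be wired into Go's text/template is sketched below. It is an assumption-laden illustration: the files map stands in for the working tree of the target repository, and render is a hypothetical helper, not greposync's renderer.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render executes tmpl with an "include" function available. files stands in
// for the working tree of the target repository (path -> content).
func render(tmpl string, files map[string]string) (string, error) {
	funcs := template.FuncMap{
		// include returns the content of name, or fallback if it is missing.
		"include": func(name, fallback string) string {
			if content, ok := files[name]; ok {
				return content
			}
			return fallback
		},
	}
	t, err := template.New("file").Funcs(funcs).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, nil); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	tmpl := `jobs: {{ include "custom-file.txt" "fallback-text" }}`
	out, err := render(tmpl, map[string]string{"custom-file.txt": "lint"})
	fmt.Println(out, err)
	out, err = render(tmpl, nil)
	fmt.Println(out, err)
}
```

A real implementation would read from disk instead of a map and would have to decide whether the included file is itself treated as a template.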

Regression: Include and exclude flags are missing

v0.1.0 has the project-include and project-exclude flags to control which repositories actually get synced or ignored.
In the v0.2.0-rc release they are missing; they were lost during the DDD rewrite.

Add filtering CLI flags

Add CLI flags like --filter that take a regex to filter repositories. Only repos whose names match the filter are updated.
Consequently, add the negation like --negative-filter to skip repos matching the regex.

This is the same feature that ModuleSync has.
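The filtering itself is small. A sketch, assuming both flags arrive as plain strings and an empty string means "not set"; filterRepos is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterRepos keeps the names matching include and drops those matching
// exclude. An empty pattern disables the respective check.
func filterRepos(names []string, include, exclude string) ([]string, error) {
	var inc, exc *regexp.Regexp
	var err error
	if include != "" {
		if inc, err = regexp.Compile(include); err != nil {
			return nil, fmt.Errorf("invalid filter: %w", err)
		}
	}
	if exclude != "" {
		if exc, err = regexp.Compile(exclude); err != nil {
			return nil, fmt.Errorf("invalid negative filter: %w", err)
		}
	}
	var out []string
	for _, n := range names {
		if inc != nil && !inc.MatchString(n) {
			continue
		}
		if exc != nil && exc.MatchString(n) {
			continue
		}
		out = append(out, n)
	}
	return out, nil
}

func main() {
	names := []string{"greposync", "go-command-pipeline", "docs"}
	fmt.Println(filterRepos(names, "^go", ""))
	fmt.Println(filterRepos(names, "", "sync$"))
}
```

Compiling both patterns before touching any repository means a typo in a regex fails the run early instead of halfway through.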

Add testing mode

Writing templates can be tedious and involve lots of trial and error.
It should be possible to render templates in a local non-Git directory and compare them with a given set of expected file contents.

With that, it's possible to write kind of "unit tests" of a template.

Proposed file structure:

.tests
└── case-1
    └── README.adoc
tests
├── case-1
│   ├── README.adoc
│   └── .sync.yaml
└── tests.yaml

Test cases are put into arbitrary-named folders inside the tests directory in a template repository.
Each of these folders contains all the files that are expected to be rendered.
The .sync.yaml contains the template variables with the exact same syntax as .sync.yaml for a managed repository (in fact it's treated as a repository).
The .tests folder exists only after running gsync test and contains the rendered files based on the files in tests with the same directory structure.
The file tests/tests.yaml may contain any global variables (similar to config_defaults.yaml).
That way, the render process is almost exactly the same.
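The comparison step of such a test run could be sketched like this. The maps stand in for the contents of .tests/case-1 (rendered) and tests/case-1 (expected, without .sync.yaml); compare is a hypothetical name, and flagging unexpected extra files is a design assumption rather than something the proposal states.

```go
package main

import (
	"fmt"
	"sort"
)

// compare checks rendered files against the expected ones (path -> content)
// and returns a sorted list of mismatches. An empty list means the test
// case passes.
func compare(rendered, expected map[string]string) []string {
	var diffs []string
	for path, want := range expected {
		got, ok := rendered[path]
		switch {
		case !ok:
			diffs = append(diffs, path+": missing")
		case got != want:
			diffs = append(diffs, path+": content differs")
		}
	}
	for path := range rendered {
		if _, ok := expected[path]; !ok {
			diffs = append(diffs, path+": unexpected file")
		}
	}
	sort.Strings(diffs)
	return diffs
}

func main() {
	rendered := map[string]string{"README.adoc": "= Title\n"}
	expected := map[string]string{"README.adoc": "= Title\n"}
	fmt.Println(len(compare(rendered, expected)) == 0)
}
```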

Configurable Git base

Currently, [email protected] is hardcoded to be the Git base when fetching repositories.
Make this configurable.

Add Pull mode

Greposync acts like modulesync where it pushes the updated template files to the target repositories and optionally makes PRs.

This is called "push" mode.

This method doesn't scale well. GitHub in particular applies rate limiting, and there are only limited options to work around it.

Other projects (e.g. https://github.com/cruft/cruft) are pull-based, meaning the execution is inverted: the template repository gets cloned into a local directory, then the templates get rendered and finally a git diff is made. This allows including such a "check" as a kind of linting step in the CI/CD of the target repository itself.
This method scales better with large ecosystems.

Rework special values syntax

ModuleSync knows 2 kinds of special values:

delete
unmanaged

From a syntax perspective, it's easy to mistake them for variables that are used in templates.
greposync shall make them more explicit and also free up these names for use in templates.

Furthermore, greposync supports more special values:

targetPath
initOnly (possibly, see #100 )

Proposal:

Rename parameter targetPath to :special:targetPath
Rename parameter delete to :special:delete *
Rename parameter unmanaged to :special:unmanaged *

*This will cause a breaking change for all .sync.yml files in all managed repos. To make migration easier, the following shall be done:

Offer a migration CLI flag, e.g. --migrate-special-values that replaces delete and unmanaged with the new parameter names and includes the change in the commit. Note: We'd have to search-replace line-by-line and cannot read/write as YAML, as that likely removes any user comments.
If only the legacy parameters are present, use those
If new parameters are present, use the new ones (regardless whether legacy parameters are present or not)
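The precedence rules and the line-based migration could be sketched as below. Both functions are hypothetical; the regex assumes the legacy keys sit alone at the start of a line, and it cannot tell a special value from an identically named template variable.

```go
package main

import (
	"fmt"
	"regexp"
)

// specialFlag looks up a boolean special value. The new ":special:<name>"
// key wins whenever it is present; the legacy "<name>" key is only used as
// a fallback.
func specialFlag(values map[string]any, name string) bool {
	if v, ok := values[":special:"+name]; ok {
		b, _ := v.(bool)
		return b
	}
	b, _ := values[name].(bool)
	return b
}

// legacyKey matches a legacy key at the start of a line, keeping indentation.
var legacyKey = regexp.MustCompile(`(?m)^([ \t]*)(delete|unmanaged)([ \t]*:)`)

// migrate renames legacy keys line by line. Working on the raw text instead
// of parsed YAML leaves user comments and formatting untouched.
func migrate(content string) string {
	return legacyKey.ReplaceAllString(content, "${1}:special:${2}${3}")
}

func main() {
	fmt.Println(specialFlag(map[string]any{"delete": true}, "delete"))
	fmt.Print(migrate("README.md:\n  delete: true # obsolete\n"))
}
```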

Integration tests

Add integration tests with git repositories.
Use 1-2 repositories as a playground. Reset them after testing.

Allow to use different root dir

Currently, gsync resolves everything relative to the current workdir.
It should be possible to pass a path as a CLI flag so that everything is resolved relative to that path.
That way, users don't have to cd into each directory when managing multiple template repositories.

Provide a summary of the updates

At the end of a run, the output might be cluttered with Diffs and Logs.
The repo update should provide a summary table.

This table could look like this:

Repo	Status	Changed files	Pull request
repo-1	Success	README.adoc, .gitignore and 12 others	`https://github.com/.../12`
repo-2	Failure	unknown	not found
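Such a table could be produced with text/tabwriter from the standard library. A sketch, with summaryRow, changedFiles and renderSummary as hypothetical names; the cut-off of two listed files is an arbitrary choice.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/tabwriter"
)

// summaryRow is one line of the summary table.
type summaryRow struct {
	Repo, Status string
	Changed      []string
	PullRequest  string
}

// changedFiles lists at most max file names and summarizes the rest.
func changedFiles(files []string, max int) string {
	if len(files) == 0 {
		return "unknown"
	}
	if len(files) <= max {
		return strings.Join(files, ", ")
	}
	return fmt.Sprintf("%s and %d others", strings.Join(files[:max], ", "), len(files)-max)
}

// renderSummary formats the rows as an aligned plain-text table.
func renderSummary(rows []summaryRow) string {
	var buf bytes.Buffer
	w := tabwriter.NewWriter(&buf, 0, 8, 2, ' ', 0)
	fmt.Fprintln(w, "Repo\tStatus\tChanged files\tPull request")
	for _, r := range rows {
		fmt.Fprintf(w, "%s\t%s\t%s\t%s\n", r.Repo, r.Status, changedFiles(r.Changed, 2), r.PullRequest)
	}
	w.Flush()
	return buf.String()
}

func main() {
	rows := []summaryRow{
		{"repo-1", "Success", []string{"README.adoc", ".gitignore", "Makefile"}, "https://github.com/ccremer/go-command-pipeline/pull/16"},
		{"repo-2", "Failure", nil, "not found"},
	}
	fmt.Print(renderSummary(rows))
}
```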