github / gh-gei
Migration CLI for GitHub to GitHub migrations
License: MIT License
Customers have asked if it's possible to add an option to let them set a state other than private for a repo when it's done migrating. This capability is already implemented in Octoshift, so this issue tracks passing the appropriate data to Octoshift via the CLI to use this feature.
The customer can configure this in the CLI. When they run generate-script we will add a --target-repo-visibility option to the migrate-repo command that by default matches the visibility the source repo was in (i.e. private/public/internal). If the option isn't included, the default should be private. --target-repo-visibility should also be included for ado2gh customers; we can just default to private for that scenario.

My thoughts on the best way to do this:
Have a pre-existing ADO org + GH org
GH org should be using EMUs
Have the relevant AAD groups created and synced to GH already
Write some setup/teardown code that does the following:
Then have some sample data generation code:
Run the test(s):
Assert on results:
Add an extra option to generate-script that can generate a second rollback script that undoes what the migration script does.
This will make doing dry-runs easier, especially when a customer wants to do a dry-run that involves modifying things in ADO (e.g. pipelines)
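A minimal sketch of the idea, in Python for brevity (the undo command names here are hypothetical counterparts, not commands the CLI actually has today):

```python
# Hypothetical undo counterpart for each migration step; these names are
# illustrative, not actual octoshift commands.
UNDO = {
    "lock-ado-repo": "unlock-ado-repo",
    "disable-ado-repo": "enable-ado-repo",
    "migrate-repo": "delete-github-repo",
}

def rollback_script(migration_lines):
    """Produce rollback commands that undo the migration script's
    commands in reverse order, skipping steps with no known inverse."""
    undo = []
    for line in reversed(migration_lines):
        cmd, _, args = line.partition(" ")
        if cmd in UNDO:
            undo.append(f"{UNDO[cmd]} {args}".strip())
    return undo

print(rollback_script(["lock-ado-repo --repo foo", "migrate-repo --repo foo"]))
# ['delete-github-repo --repo foo', 'unlock-ado-repo --repo foo']
```

Generating both scripts from the same command list keeps them in sync, which matters for the dry-run scenario where ADO state (e.g. pipelines) gets modified.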
This makes it so disable-repo will fail; we should tweak the permissions it denies to not include the disable-repo permission.
I did this as env vars to start because it's usually a more convenient way to pass around secrets, especially if you may be screen sharing while using the CLI. And you definitely don't want secrets to be included in the script generated by the generate-script command.
For convenience though, an option to specify them directly as CLI args would be a nice addition.
NOTE: This FastTrack engagement ends tomorrow; they are friendly and may be willing to help troubleshoot beyond tomorrow, but right now we have a window.
Default behavior should be to generate the minimum possible scripts - probably just the migrate-repo commands.
Then have optional arguments that can be passed to generate-script to add more functions to the generated script. E.g.:
--create-teams
--link-idp-groups
--rewire-pipelines
--boards-integration
--lock-ado-repo
etc
This should involve the removal of the existing --repos-only flag
Possibly include an --all (or similarly named) flag for customers that want to do all the things.
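The flag-driven assembly described above could look roughly like this (a sketch; only a few of the listed flags are shown, and the step command names are illustrative):

```python
def build_steps(repos, create_teams=False, rewire_pipelines=False,
                lock_ado_repo=False, all_steps=False):
    """Assemble the generated script's commands per repo. The default is
    the minimum: just the migrate-repo commands. Optional flags (and an
    --all-style flag) layer on additional steps."""
    if all_steps:
        create_teams = rewire_pipelines = lock_ado_repo = True
    steps = []
    for repo in repos:
        if lock_ado_repo:
            steps.append(f"lock-ado-repo --repo {repo}")
        steps.append(f"migrate-repo --repo {repo}")
        if create_teams:
            steps.append(f"create-team --repo {repo}")
        if rewire_pipelines:
            steps.append(f"rewire-pipeline --repo {repo}")
    return steps

print(build_steps(["foo"]))  # minimum: ['migrate-repo --repo foo']
```

This structure also makes removing --repos-only natural, since "repos only" just becomes the default of no optional flags.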
Based on the conversation below, we need the following options:
When the following command runs:
octoshift generate-script --github-org <Your GitHub Organization> --ado-org <Your ADO organization> --output migrate.sh
This will generate a file called migrate.sh in which every command starts with ./octoshift. However, this should be just octoshift; otherwise the script will only look in the current directory for the octoshift CLI tool.
The current tooling only supports running one repo migration at a time and waiting for completion before moving on to the next. The backend APIs support queueing up many jobs at once, and the CLI tooling needs to be updated to support that.
This will likely be handled by changing the logic contained in the migration script generated from the generate-script command. We may want to add an option to generate-script giving the user control over whether the migrations run in parallel or not (the default should be parallel). We'll also need an option on migrate-repo indicating whether it should wait or not (the default is to queue and return, i.e. not wait).
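The queue-first flow could be sketched like this (queue and wait_for are stand-ins for the real migrate-repo and wait commands, not the actual API):

```python
def run_migrations(repos, queue, wait_for, parallel=True):
    """Run all repo migrations. queue(repo) submits a job and returns a
    job id (the proposed migrate-repo default: queue and return);
    wait_for(job_id) blocks until that job finishes. With parallel=True
    (the proposed default) every job is queued up front and completion
    is awaited afterwards; otherwise each migration is awaited before
    the next one is queued."""
    if parallel:
        job_ids = [queue(repo) for repo in repos]
        return [wait_for(job_id) for job_id in job_ids]
    return [wait_for(queue(repo)) for repo in repos]
```

The key point is that the sequential mode is just the parallel mode with a batch size of one, so both can share the same generated-script logic.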
Create a video and link it in the readme for how to use this tool
Without the all-orgs scope the API call to get the list of orgs won't work, and I believe the API call to get the Org ID will also fail. We can get around the first problem by making the user pass an org name as a CLI argument. Not sure how to get around the second issue; there must be some API endpoint we can use to get the Org ID for the org that the PAT is scoped to (I just don't know what it is)
If you have multiple builds running integration tests at the same time they appear to become flaky.
Octoshift has two externs this fall, and they are working on a CLI that is intended to be used as a planning tool for customers. That repo is located here. Eventually, we hope to release our tool into open source, hosted on npm.
There is some overlap in functionality between the two utilities, and I think it would be very helpful for our externs to see the structure of the OctoshiftCLI and how API interactions are being made.
The ask: I'd like to request read-only access to this repo for arypat and kevinmsmith131 until mid-December.
There are lots of places where we call GH APIs that may require paging through results, and we're not doing it yet.
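The shape each of those call sites needs is cursor-style paging; a sketch (fetch_page is a stand-in for the actual API call):

```python
def get_all(fetch_page):
    """Accumulate results across pages. fetch_page(cursor) must return
    (items, next_cursor), with next_cursor None on the last page --
    roughly the shape of a GraphQL pageInfo { hasNextPage, endCursor }
    response, or a REST Link header turned into a cursor."""
    items, cursor = [], None
    while True:
        page, cursor = fetch_page(cursor)
        items.extend(page)
        if cursor is None:
            return items
```

Wrapping the loop in one helper like this keeps the paging fix from being copy-pasted into every call site.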
Right now if a PAT doesn't have the right permissions, some of the API calls will just crash and spit out a stack trace.
Would probably be a nicer experience if the tooling started by explicitly validating the PAT(s) had the necessary permissions, and giving user-friendly error messages if they didn't.
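For classic GitHub PATs the granted scopes come back in the X-OAuth-Scopes response header, so an up-front check could look something like this (a sketch; the required-scope lists per command would still need to be worked out):

```python
def validate_scopes(granted_header, required):
    """granted_header: the X-OAuth-Scopes header value from any
    authenticated GitHub API response, e.g. 'repo, admin:org'.
    Returns a friendly error message, or None if all required scopes
    are present."""
    granted = {s.strip() for s in granted_header.split(",") if s.strip()}
    missing = [s for s in required if s not in granted]
    if missing:
        return f"Your GitHub PAT is missing required scopes: {', '.join(missing)}"
    return None

print(validate_scopes("repo", ["repo", "admin:org"]))
```

Running this once at startup turns the current stack traces into a single actionable message before any migration work begins.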
The approach is to create a Maintainers/Admins GH team for each team project, then map those to AAD groups. The way it works now, the customer has to create the AAD groups in advance, and then the CLI does the GH team linking. It would probably be better if the CLI could also create the AAD groups as part of the migration process.
This will require credentials to be able to write to AAD (i.e. service principal client id/secret).
Assuming the source ADO and target GHEC-EMU both use the same AAD tenant, this should be easily automatable.
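Creating the groups would presumably go through Microsoft Graph (POST /v1.0/groups with the service principal's credentials). A sketch of the request body; the group naming convention here is an assumption, not what Octoshift actually uses:

```python
def aad_group_payload(team_project, role):
    """Request body for creating a security group via Microsoft Graph
    (POST https://graph.microsoft.com/v1.0/groups), one group per team
    project/role pair. The '{project}-{role}' naming is illustrative."""
    name = f"{team_project}-{role}"
    return {
        "displayName": name,
        "mailNickname": name.replace(" ", ""),  # nickname can't contain spaces
        "mailEnabled": False,                   # plain security group, not mail-enabled
        "securityEnabled": True,
    }
```

Since both sides share the tenant, the same service principal credentials could drive group creation and the subsequent sync checks.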
Should put in place some poor man's DI (for the AdoApi/GitHubApi/AdoClient/GithubClient) so that the Command classes are easier to test and can mock out the Api layers.
A full-blown DI framework is probably overkill at this point.
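The "poor man's DI" idea, sketched in Python for brevity (the actual codebase is C#; class and method names here are illustrative): dependencies come in through the constructor, so a test can hand in a stub instead of a real client.

```python
class GithubApi:
    def __init__(self, client):
        self.client = client  # injected, not constructed internally

    def get_org_id(self, org):
        return self.client.get(f"/orgs/{org}")["id"]

class MigrateRepoCommand:
    # The command depends on the Api abstraction, never on the raw client.
    def __init__(self, github_api):
        self.github_api = github_api

    def invoke(self, org):
        return self.github_api.get_org_id(org)

# In a unit test, substitute a stub for the real HTTP client:
class StubClient:
    def get(self, url):
        return {"id": 42}

print(MigrateRepoCommand(GithubApi(StubClient())).invoke("my-org"))  # 42
```

Plain constructor injection like this gets the testability win without pulling in a container.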
When building the artifacts to publish a new release, we're not baking the version number (from the release/tag) into the binary, so when you do octoshift --version it will always show 1.0.0.
In old .NET this would be done by updating the AssemblyInfo.cs files; I can't remember how it's done in .NET Core.
Create some basic docs on how to use the CLI tooling. Especially what PAT scopes are needed in order to run the various commands.
I've been testing with All Orgs + Full Access for the ADO PAT, and similarly broad scopes for the GH PAT.
The scopes needed will be different for different commands (e.g. generate-script only does ADO reads, but obviously some other commands write stuff).
There's another issue #20 to code in some explicit tests for the right permissions; this issue is just about figuring out what's needed and documenting it.
I am getting an error when migrate.sh runs after executing the following command:
octoshift generate-script --github-org <Your GitHub Organization> --ado-org <Your ADO organization> --output migrate.sh
[hh:mm AM] [DEBUG] RESPONSE (OK): {"data":{"createMigrationSource":null},"errors":[{"type":"FORBIDDEN","path":["createMigrationSource"],"locations":[{"line":1,"column":132}],"message":"Owner xxx is not authorized to perform imports."}]}
I am guessing this is because I skipped setting up the Azure Pipelines app in my GitHub and connecting it with Azure DevOps.
[hh:mm PM] [INFO] Repo: xxx
[hh:mm PM] [WARNING] CANNOT FIND GITHUB APP SERVICE CONNECTION IN ADO ORGANIZATION: xxxx. You must install the Pipelines app in GitHub and connect it to any Team Project in this ADO Org first.
The current API endpoints are not the right ones for EMU. We must not have fully tested that capability with DFIN, because they didn't have the permissions to create AAD groups for us to test with.
We probably need to detect whether it's EMU or team-sync and call the right endpoints. If only one of those scenarios is going to work, it should be EMU first.
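The detection could be a simple dispatch once we know which kind of org we're targeting; a sketch (emu_link and team_sync_link are stand-ins for the two endpoint families, not real routes):

```python
def link_team_to_group(team, group, org_is_emu, emu_link, team_sync_link):
    """Dispatch team-group linking to the right API family. emu_link and
    team_sync_link stand in for the EMU external-groups endpoints and
    the team-sync group-mapping endpoints; the real calls differ, which
    is why a single code path breaks on EMU orgs."""
    if org_is_emu:
        return emu_link(team, group)
    return team_sync_link(team, group)
```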
If duplicate repo names are detected, instead of just spitting out a warning and putting the onus on the user, maybe do something intelligent to ensure unique repo names; the user can always change the migration script afterwards if desired.
E.g. prefix with the Team Project and/or Org name (if not already doing that), and/or toss a number on the end
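A sketch of that de-duplication strategy (prefix with team project on collision, numeric suffix as a last resort; the exact naming scheme is a suggestion, not what the CLI does today):

```python
from collections import Counter

def unique_names(repos):
    """repos: list of (team_project, repo_name) pairs. Keep the plain
    name when it's unique; otherwise prefix with the team project, and
    append a number if that still collides."""
    plain_counts = Counter(name for _, name in repos)
    taken, result = set(), []
    for project, name in repos:
        candidate = name if plain_counts[name] == 1 else f"{project}-{name}"
        n = 2
        while candidate in taken:
            candidate = f"{project}-{name}-{n}"
            n += 1
        taken.add(candidate)
        result.append(candidate)
    return result

print(unique_names([("A", "api"), ("B", "api"), ("A", "web")]))
# ['A-api', 'B-api', 'web']
```

Because the names land in a generated script rather than being applied immediately, the user can still hand-edit any choice they dislike before running it.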
The current disable-ado-repo command makes the repo unreadable.
We should have a lock-ado-repo command that makes it read-only pre-migration, then use disable-ado-repo post-migration.
We already implemented this for DFIN, need to port it to this codebase.
Right now it accepts a comma-separated list of GitHub repos; it would be better if you simply called the command once per repo.
This will require a change in implementation, because right now it creates the connection and adds the repos all in one call, but the APIs needed to add a repo to an existing connection are slightly different.
This is important because a customer may want to migrate only some repos in a team project and then later migrate the rest; the integrate-boards command as it stands today won't work right for that, but this change will fix it.
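The create-or-add split could be sketched like this (the connection store and return values are illustrative stand-ins for the two ADO API calls):

```python
def integrate_boards(connections, team_project, repo):
    """connections: dict mapping team project -> list of repos already on
    its boards connection. On first use for a team project, create the
    connection with the repo (the create-connection API); afterwards,
    add each repo to the existing connection (a different API)."""
    if team_project not in connections:
        connections[team_project] = [repo]      # create connection + first repo
        return "created"
    if repo not in connections[team_project]:
        connections[team_project].append(repo)  # add repo to existing connection
        return "added"
    return "already-linked"
```

Calling the command once per repo then works regardless of whether the team project was partially migrated earlier.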
Probably as a PR comment using one of the custom actions out there that automate this.
Include brief overview of how the code is structured.
e.g. Program.cs -> Commands -> Api -> Client
And testing + static analysis rules we follow.
etc etc
Bring the Octoshift CLI to a state that aligns with how customers want to use Octoshift, covers running migrations from ADO, GHES & GHEC, and can be picked up self-service and used without any hand-holding for the first few migrations. This will help reduce the time investment needed to get a customer started using Octoshift, while also bringing the CLI's benefits to GHES and GHEC migration customers.
DRI: @dylan-smith