aegershman / upgrade-tiles-proof-of-concept

Out of date. Proof-of-concept to demonstrate using selective-deploys on tiles/stemcells in PCF

License: MIT License
In order to keep operators clued into the inner workings of the upgrade-tile tasks, we should include more echo/output statements to display what's happening.
Making the output helpful and informative avoids the need for 'debug' mode (set -x), which would display everything, including credentials, in the output window.
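One way to keep task output informative without `set -x` is a small logging helper plus a redaction filter. A minimal sketch; `log` and `redact` are hypothetical helper names, and the command shown is illustrative:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helpers: `log` prints a progress marker; `redact` masks
# anything that looks like a secret flag value before it hits the logs.
log() { echo "==> $*"; }

redact() {
  sed -E 's/(--client-secret[= ])[^ ]+/\1***/g'
}

log "Staging product tile"
# Show the command being run without leaking the secret:
echo "om --target https://opsman.example.com --client-secret s3cr3t stage-product" | redact
```

Piping every echoed command through a filter like this gives operators the play-by-play without ever printing credentials.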
In order to ensure operational consistency and safety, the pipeline should fail when a product is uploaded with an incorrect stemcell version listed in its product.yml.
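A minimal sketch of that guard, assuming the expected version has already been read from the tile's metadata (e.g. its stemcell criteria) upstream; `check_stemcell` is a hypothetical helper:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical guard: fail the pipeline when the stemcell we are about to
# use does not match the version declared in the product's metadata.
check_stemcell() {
  local expected="$1" actual="$2"
  if [ "$expected" != "$actual" ]; then
    echo "ERROR: product.yml expects stemcell ${expected}, got ${actual}" >&2
    return 1
  fi
  echo "stemcell ${actual} matches product.yml"
}

check_stemcell "3445.11" "3445.11"
```

Because the helper returns non-zero on a mismatch, a `set -e` task script fails immediately, which is exactly the behavior the issue asks for.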
Once it runs "apply changes", the logs don't properly stream
https://opsman-dev-api-docs.cfapps.io/#getting-a-list-of-recent-install-events
If another tile is in the process of being installed, the pipeline needs to wait. But it shouldn't wait whenever there's a staged change; it should only wait while the Opsman is actively "installing".
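A sketch of that wait logic, keying only off a running install rather than any staged change. The JSON shape is assumed from the install-events docs above, and `installing` is a hypothetical helper; a real task would fetch the body with `om curl --path /api/v0/installations` inside a sleep loop:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical check: succeed if the installations listing shows any
# install still in progress. Crude grep; jq would be more robust.
installing() {
  echo "$1" | grep -q '"status": *"running"'
}

# Example responses (shapes assumed, not taken from a live Opsman):
busy='{"installations": [{"id": 1, "status": "running"}]}'
idle='{"installations": [{"id": 1, "status": "succeeded"}]}'

if installing "$busy"; then echo "waiting: an install is in progress"; fi
if ! installing "$idle"; then echo "Opsman is clear; safe to proceed"; fi
```

Staged-but-unapplied changes never show up as a "running" installation, so this check waits only in the case the issue cares about.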
The logic in stage-tile is weirdly complicated. It's especially frustrating because you cannot have multiple versions of a tile uploaded; you must delete unused products to allow stage-tile to work properly. Perhaps there's a better way to update a tile.
Perhaps #7 could be applied in this case?
How can the pipeline know which major stemcell version to go retrieve? Should each individual tile pipeline be in charge of downloading/uploading the latest stemcell? Should some other pipeline automatically listen for available stemcells & upload them automatically, then the tile-pipelines be responsible for applying them? Should I just load the opsman with a pivnet API token & poll the "check for available stemcells" API endpoint?
Inspiration: https://github.com/pivotal-cf/pcf-pipelines/blob/master/tasks/upload-product-and-stemcell/task.sh
https://opsman-dev-api-docs.cfapps.io/#upgrading-a-product
What does this API endpoint do? Does it do a selective deploy? Could it be used in substitution for downloading product/uploading/staging?
In order to demonstrate the other techniques of upgrading tiles (single-tile single-pipeline; multi-tile single-pipeline, etc.), we should include a little documentation on other ways we've done tile upgrades.
If I try to associate a stemcell to a product, but that stemcell version is already applied to the product, will this still register as a "staged change"? If I'm going to be making API calls to associate a stemcell to a product every time I call "apply-selective-deploy", I need to make sure this doesn't cause unnecessary problems.
I'm hoping the Opsman response is something like "stemcell version already applied".
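One way to sidestep the question entirely is to check the current assignment first and skip the association call when it already matches. A sketch; the response shape of a stemcell-assignments listing is assumed, `already_assigned` is a hypothetical helper, and the grep/tr approach assumes each assignment object carries its guid and version in the same `{`-delimited chunk (jq would be cleaner):

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical check: succeed when the product already has the desired
# stemcell version associated, so the assignment call can be skipped.
already_assigned() {
  local json="$1" guid="$2" version="$3"
  echo "$json" | tr '{' '\n' | grep "\"guid\": *\"${guid}\"" | grep -q "\"version\": *\"${version}\""
}

sample='[{"guid": "p-healthwatch-abc", "version": "3445.11"}]'
if already_assigned "$sample" "p-healthwatch-abc" "3445.11"; then
  echo "stemcell already associated; skipping the API call"
fi
```

With a guard like this, apply-selective-deploy never issues a redundant association, so whatever the Opsman's answer is, it stops mattering.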
In order to complete automation and finish experimenting with tiles within our sandbox foundation, we should implement the remaining product tiles in sandbox.
Unfortunately this current model only upgrades the tiles in one foundation. But I would like to have a tile version "flow" through to other foundations; e.g., tile_a-0.2 must successfully pass through 'sandbox' before it can be applied to 'dev'.
In order to control planned downtimes, we should regulate when apply-changes is automatically run on a tile.
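A sketch of such a gate as a task-level check; the 22:00-04:59 window is an assumed example, and `within_window` is a hypothetical helper:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical gate: only allow apply-changes inside a maintenance window.
# Hours are 00-23; the window below (22:00 through 04:59) is an example.
within_window() {
  local h=$((10#$1))   # 10# guards against "08"/"09" being parsed as octal
  [ "$h" -ge 22 ] || [ "$h" -lt 5 ]
}

if within_window "$(date +%H)"; then
  echo "inside maintenance window; apply-changes may run"
else
  echo "outside maintenance window; skipping apply-changes"
fi
```

In Concourse itself, a `time` resource with a `start`/`stop` window is the more idiomatic way to gate a job; the shell check above is the same idea expressed inside a task.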
The pcf-pipelines task upload-product-and-stemcell will upload the stemcell for a product... but I don't need the stemcell to be uploaded in this task anymore. I only want the product. Therefore, we should remove the "-and-stemcell" part from the "upload-product-and-stemcell" task.
In order to prevent multiple "apply changes" jobs from running at once, we should use some kind of mutex on the opsman. In the pcf-pipelines project, they use a task called "wait-opsman-clear". Perhaps we can use something else? Perhaps instead of doing a poll-wait on the opsman, we can use the "pool resource"? https://github.com/concourse/pool-resource
Would supersede #10
E.g., Slack notification, email notification, etc. Something to indicate that things have gone wrong.
I'm not sure if "3445..*" is the best pattern to use. We should investigate the most appropriate regex pattern.
In order to maintain documentation for tile upgrades, we should have the ability to create/close GitHub issues on the upgrade-tiles repo.
We should separate 'upload-product' and 'stage-product' in order to prevent unnecessary double-uploading. If I want to re-stage a product, why should I have to re-upload it? And if I want to upload a product, why should I have to stage it? Consider separating these.
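A sketch of the split, with a guard that checks the Opsman's available-products listing before re-uploading; `needs_upload` is a hypothetical helper, and the om invocations are shown as comments rather than executed:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical guard: skip the upload when the product/version is already
# present in the Opsman's available-products listing.
needs_upload() {
  local listing="$1" name="$2" version="$3"
  ! echo "$listing" | grep -q "${name}.*${version}"
}

listing="p-healthwatch   1.1.7-build.11"
if needs_upload "$listing" "p-healthwatch" "1.1.7-build.11"; then
  echo "uploading"   # om upload-product --product ./product.pivotal
else
  echo "already uploaded; staging only"
fi
# Staging is then its own step:
#   om stage-product --product-name p-healthwatch --product-version 1.1.7-build.11
```

With upload and stage as independent steps, re-staging never forces a re-upload, and uploading never forces a stage.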
There should be a separate pipeline for janitorial tasks in the Opsman, such as-- etc.
In order to keep timely backups of the Opsman, S3 backups should be performed after apply-changes is run.
https://opsman-dev-api-docs.cfapps.io/#triggering-an-install-process
What happens if I call the POST /api/v0/installations endpoint like this?

```shell
./om-darwin \
  --target "$TARGET" \
  --client-id "$USERNAME" \
  --client-secret "$SECRET" \
  --skip-ssl-validation \
  curl \
  --request POST \
  --path /api/v0/installations \
  --data '{
    "deploy_products": "all"
  }'
```
Here's the deal:
In the UI, if you hit "apply changes", it will run smoke tests / errands / etc. for all tiles. We don't want this. We want selective deploys.
When using the API endpoint, presumably, if you don't pass in any --data '...', it is the equivalent of hitting "apply changes" in the UI.
When using the API endpoint, you can pass in a list of product GUIDs to trigger selective deploys only on those products. That works just fine.
But what if we use this API endpoint and explicitly pass in --data '{"deploy_products": "all"}'? I know 'all' is the default value, but is the behavior different? Can we get it to perform selective deploys only on the staged changes? The benefit would be that we don't have to explicitly pass in an array of product GUIDs.
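However the "all" question shakes out, building the explicit-GUID payload for the selective-deploy case is mechanical. A sketch with a hypothetical `deploy_payload` helper (the GUIDs shown are made up):

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: build the --data body for POST /api/v0/installations
# from a list of product GUIDs (the selective-deploy variant).
deploy_payload() {
  local guids="" g
  for g in "$@"; do
    guids="${guids:+${guids}, }\"${g}\""
  done
  printf '{"deploy_products": [%s]}' "$guids"
}

deploy_payload "p-healthwatch-abc123" "cf-def456"
# Would then be passed along as, e.g.:
#   om curl --request POST --path /api/v0/installations --data "$(deploy_payload "$GUID")"
```

Generating the body from a list keeps the pipeline task free of hand-written JSON, whichever answer the endpoint gives for "all".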
Maybe my stage-deploy task simplification was naive, because it falls apart on products that use a *-build.11 semver scheme.
Healthwatch 1.1.7:

```
Archive:  ./pivnet-product/p-healthwatch-1.1.7-build.11.pivotal
   creating: metadata/
 extracting: metadata/p_healthwatch.yml
could not execute "stage-product": failed to stage product: cannot find product p-healthwatch 1.1.7
```

It wants to be configured with the "full" version, e.g., 1.1.7-build.11.
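One fix is to pull the full product_version, build suffix included, out of the extracted metadata and feed that to stage-product. A sketch with a hypothetical `full_version` helper (grep/sed; a YAML parser would be sturdier):

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: extract product_version from a tile's metadata yml,
# keeping any -build.N suffix so stage-product gets the exact version.
full_version() {
  echo "$1" | grep -E '^product_version:' | sed -E 's/^product_version: *"?([^"]+)"?.*/\1/'
}

metadata='name: p-healthwatch
product_version: "1.1.7-build.11"'

full_version "$metadata"
# Then, e.g.:
#   om stage-product --product-name p-healthwatch --product-version "$(full_version "$metadata")"
```

Staging with the exact metadata version avoids the "cannot find product p-healthwatch 1.1.7" failure above.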
For example, instead of downloading a tile and re-uploading it, would it be easier to issue API calls to the Opsman and tell it to go download a product? Would it be easier to outfit the Opsman with a pivnet API token and coordinate all changes as "commands" to the Opsman?
Is there a way to add a pivnet token to the Opsman via API call? Flow: get a pivnet token, put it in credhub, then have Concourse upload the pivnet token to the Opsman. Then Concourse handles upgrades by issuing commands via API, rather than having to supply the actual pivnet product/stemcell/etc.
Incoherent thought, trying to get this out: could the content of the lock (metadata from the lock) be passed to stage-and-apply?
In order to handle "off-release" stemcells (stemcells which come out in between tile version releases), this pipeline should include a resource to apply the most up-to-date stemcell version.
In order to extract the product name (and get the corresponding product GUID from the Opsman), we're downloading the entire product from pivnet and looking at the metadata. Do we have to do this, considering all we need is the product name?
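An alternative: keep the product name in pipeline config and resolve the GUID from the Opsman's deployed-products listing instead of unpacking the .pivotal. A sketch; the response shape of GET /api/v0/deployed/products is assumed, `guid_for` is a hypothetical helper, and the tr/grep/sed extraction is crude (jq would be cleaner):

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical lookup: map a product type/name to its GUID using the JSON
# from `om curl --path /api/v0/deployed/products` (response shape assumed).
guid_for() {
  local json="$1" name="$2"
  echo "$json" | tr '{' '\n' | grep "\"type\": *\"${name}\"" | \
    sed -E 's/.*"guid": *"([^"]+)".*/\1/'
}

sample='[{"type": "cf", "guid": "cf-123"}, {"type": "p-healthwatch", "guid": "p-healthwatch-456"}]'
guid_for "$sample" "p-healthwatch"
```

With this, the pipeline only needs the product name it already knows from its own config; the multi-gigabyte pivnet download can be skipped entirely when only the GUID is needed.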