infrastructure's Introduction

Infrastructure as code

Treating Haiku's infrastructure as cattle instead of kittens since 2017.

Directories

  • docs - Full documentation on Haiku's infrastructure
  • containers - Manifests to build and deploy containers
  • deployments - Kubernetes manifests for Haiku infrastructure
  • playground - Things that we're experimenting with. Not used in production.

Architecture

[Architecture diagram]

Quickstart

This is the path of least resistance for new admins to do "things".

💥 DANGER
Never run kubectl delete on persistent volume claims you need! Running kubectl delete / kubectl delete -f on anything describing volume claims will cause Kubernetes to drop (delete) the backing persistent volumes. (AKA massive + rapid data loss)
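A quick way to check which persistent volumes exist and, if in doubt, switch one to the Retain reclaim policy so the underlying volume is kept even if its claim is deleted (the volume name is a placeholder):

kubectl get pv
kubectl patch pv (NAME) -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'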

Running through the Kubernetes Basics class is recommended!

Pre-requirements

  • Install kubectl
  • Export the Kubernetes configuration from Digital Ocean, import locally
    • If this is your first kubernetes cluster, just put it at ~/.kube/config
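With the Digital Ocean CLI, one way to do this (assuming doctl is installed and authenticated; the cluster name is a placeholder) is:

doctl kubernetes cluster kubeconfig save (CLUSTER-NAME)

This fetches the cluster's credentials and merges them into ~/.kube/config.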

Quick Commands

aka, I'm a sysadmin and a dog in a lab coat

Check your configured cluster

List the known Kubernetes Clusters (contexts) of my client:

kubectl config get-contexts

Change the Kubernetes Cluster my local client focuses on:

kubectl config use-context (NAME)

List Deployments

Deployments define the desired number of replica pods to run within the cluster.

kubectl get deployments

Scaling Deployments

If you want something to "stop running for a while", this is the easiest and safest way. NEVER run kubectl delete if you don't know what you're doing.

kubectl scale --replicas=0 deployments/(NAME)

List Pods

Pods are one or more tightly related containers running in Kubernetes. Deleting a pod will result in the related deployment recreating it.
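To list the pods in the current namespace:

kubectl get pods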

Entering a container

aka, the equivalent of docker exec -it (NAME) /bin/bash -l...

If the pod has one container:

kubectl exec -it pod/(NAME) -- /bin/bash -l

If the pod has multiple containers:

kubectl exec -it pod/(NAME) -c containername -- /bin/bash -l

Examining Stuff

kubectl describe pod/(NAME)
kubectl describe deployment/(NAME)

Initial Installation

  • Deploy ingress controller via instructions in deployments/ingress-controller/traefik
  • Deploy manifests in deployments for various services
  • Scale each deployment to 0 replicas
    • kubectl scale --replicas=0 deployments/(BLAH)
  • Populate persistent volumes for each application
    • see tools/migration_tools for some scripts to do this en masse via rsync
  • Once ready for switchover, adjust DNS to new load balancer
  • Scale up applications
    • kubectl scale --replicas=1 deployments/(BLAH)

Rolling Restarts

To perform a rolling restart of each deployment replica:

kubectl rollout restart deployment/(NAME)

Example

-n kube-system is the namespace. We run Traefik in a separate namespace since it's important.

Rolling restart of Traefik:

kubectl -n kube-system rollout restart daemonset/traefik-ingress-controller

Rolling Upgrade

Here we upgrade a container image from the command line. You can also update the matching yml document and run kubectl apply -f (thing).yml

Example

-n kube-system is the namespace. We run Traefik in a separate namespace since it's important.

Rolling upgrade of Traefik:

kubectl -n kube-system set image daemonset/traefik-ingress-controller traefik-ingress-lb=docker.io/traefik:v2.6

Accessing Services / Pods

You can port-forward / tunnel from various points within the Kubernetes cluster to your local desktop. This is really useful for troubleshooting or understanding issues better.

Listen on localhost port 8888, forwarding to port 9999 within the pod:

kubectl port-forward pod/(NAME) 8888:9999

Listen on localhost port 8080, forwarding to the named port web of the service:

kubectl port-forward service/(NAME) 8080:web

Pressing Ctrl+C will terminate the port-forwarding proxy.

Importing data

Restoring volume / database backups: See deployments/other/restore.yml*

Manual database import: cat coolstuff.sql | kubectl exec -i deployment/postgres -- psql -U postgres

Forcing CronJobs

We leverage multiple jobs to perform various automatic activities within kubernetes. Some example jobs include postgresql backups to s3, persistent volume backups to s3, and syncing various git repositories.

Once in a while, you may want to force these jobs to run before performing maintenance, or for testing purposes.

  • pgbackup - PostgreSQL backup jobs
  • pvbackup - Persistent volume backup jobs

There are several example restore jobs in deployments/other. These can be manually edited and applied to restore data. It's highly recommended to review these CAREFULLY before use, as a mistake could result in unintended data loss.

These restore jobs should be used on empty databases / persistent volumes only!

  1. Listing CronJobs
    $ kubectl get cronjobs
    NAME                        SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
    discourse-pgbackup          0 0 * * 1,4   False     0        2d14h           6d13h
    discourse-pvbackup          0 3 * * 3     False     0        11h             6d17h
    gerrit-github-sync          0 * * * *     False     0        38m             13d
    gerrit-pvbackup             0 1 * * 1,4   False     0        2d13h           8d
    haikudepotserver-pgbackup   0 0 * * 1,4   False     0        2d14h           3d21h
    .
    
  2. Forcing a CronJob to run. This is a great thing to do before any maintenance :-)
    $ kubectl create job --from=cronjob/discourse-pgbackup discourse-pgbackup-manual-220316
    
  3. Monitoring manual CronJob
    $ kubectl get jobs
    NAME                                 COMPLETIONS   DURATION   AGE
    discourse-pgbackup-manual-220316     1/1           1m         1m
    
    $ kubectl logs jobs/discourse-pgbackup-manual-220316
    Backup discourse...
    Backup complete!
    Encryption complete!
    Added `s3remote` successfully.
    `/tmp/discourse_2022-03-14.sql.xz.gpg` -> `s3remote/haiku-backups/pg-discourse/discourse_2022-03-14.sql.xz.gpg`
    Total: 0 B, Transferred: 136.45 MiB, Speed: 77.32 MiB/s
    Snapshot of discourse completed successfully! (haiku-backups/pg-discourse/discourse_2022-03-14.sql.xz.gpg)
    

Secrets

For obvious reasons, 🔑 secrets are omitted from this repository.

infrastructure's People

Contributors

forza-tng, gatak, hrithikkumar49, jessicah, kallisti5, korli, mmlr, nielx, petr-akhlamov, pulkomandy, waddlesplash

infrastructure's Issues

Implement automatic backups of Docker volumes

A simplistic backup script was created and placed in root's home directory on maui which, when executed, grabs the contents of /var/lib/docker/volumes and compresses it into an archive in /var/backups with a standardized name. (A rough sketch of the approach is shown after the list below.)

[root@maui ~]# ls -la /var/backups
total 4479488
drwxr-xr-x.  2 root root      4096 Feb 22 00:35 .
drwxr-xr-x. 20 root root      4096 Sep 20 20:17 ..
-rw-r--r--.  1 root root 854837532 Sep 20 20:23 docker_volumes-1505931571.tar.xz
-rw-r--r--.  1 root root 930442172 Nov  4 21:09 docker_volumes-1509825854.tar.xz
-rw-r--r--.  1 root root 931729732 Nov 14 19:49 docker_volumes-1510685040.tar.xz
-rw-r--r--.  1 root root 930946048 Jan  4 16:33 docker_volumes-1515079678.tar.xz
-rw-r--r--.  1 root root 939003176 Feb 22 00:41 docker_volumes-1519256121.tar.xz

This needs to be better in a few ways:

  • automatic grooming of old backups.
  • automatic (weekly?) push to Hetzner secure backups space.
  • automatic runs; currently it requires a manual trigger (hey, it's beta)
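For reference, a minimal sketch of what such a script might look like (the actual script on maui may differ; the 30-day retention window is only an illustration):

#!/bin/bash
# Archive all docker volumes into /var/backups with an epoch-stamped name.
set -e
STAMP=$(date +%s)
tar -cJf "/var/backups/docker_volumes-${STAMP}.tar.xz" -C /var/lib/docker volumes
# Grooming: drop archives older than 30 days.
find /var/backups -name 'docker_volumes-*.tar.xz' -mtime +30 -delete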

Support for GitHub notifications is broken

For a long time now, Haiku's infrastructure has not supported GitHub notifications for developer branches. Since creating branches directly is no longer allowed due to Gerrit, this needs to be fixed ASAP.

Status Page for Infrastructure

With all the different bits of web infrastructure all needing to be maintained/managed, it would be good if we had a status page so both users and admins can check the status of web infrastructure. Plus, for scheduled system maintenance, it would be good to have a location for sysadmins to issue maintenance notices.

Now, I did some research and there are two types of status pages:

  1. "Static" status pages: generated using a static site generator like Hugo. These can be deployed using our existing Netlify plans and thus are relatively simple to set up. Mainly involves admins sending alerts for downtime, maintenance.
    Options: https://github.com/cstate/cstate, https://marquez.co/statusfy, https://github.com/statsig-io/statuspage (with auto-monitoring)
  2. "Dynamic" status pages: dynamic pages, some even have auto-monitoring functions. May be more complex to set up.
    Options: https://upptime.js.org/, https://github.com/Monitorr/Monitorr, https://github.com/valeriansaliou/vigil

We'll have to determine which type of status page is best for us and whether we need any features like notifications through email, Matrix etc.

Rollout of kubernetes

As documented in #66, we need to grow beyond Docker Swarm. We have everything in place to begin moving to a Digital Ocean managed Kubernetes cluster.

Old ingress:
ingress.haiku-os.org -> limerick -> Docker Swarm -> Traefik v1.7

New ingress:
ingress.ams3.haiku-os.org -> DO Load Balancer -> Traefik v2.6 -> (pods)

Preparation:

  • Complete Traefik Ingress Controller
  • Complete Traefik TCP LB config for git / rsync / etc
  • Create scheduled downtime handler
  • Test longhorn storage RWX
  • Create migration script for repeat rsync from limerick to k8s
  • Test longhorn storage RWX with 3-k8s node recycle ❌ Not suitable #75
  • Test longhorn backup restoration ❌ Not suitable #75
  • Create scheduled CronJob to backup volumes to secure S3 bucket.
  • Test image updates and pod upgrades with ReadWriteOnce storage (vs ReadWriteMany)

Tasks:

  • Migrate SMTP, Migrate SMTP pv - Migrated via duplication
  • Migrate Gerrit, cgit, pv data - Done. Running on k8s
  • Deploy Postgresql14 database + Migrate roles - Done
  • Migrate Trac postgresql database, migrate trac
  • Migrate Pootle, i18n (@nielx working on)
  • Migrate Concourse
  • Migrate Redis, Discourse (BIG)
  • Migrate Haikuports, haikuports buildmaster, hpkgbouncer (BIG because of complexity + data size)
  • Migrate misc little things and tune

Future:

  • Investigate some RWX storage solutions. RWO limits our rolling / zero downtime update ability
  • Investigate running Gerrit with a scale > 1. Has locking issues when more than one instance runs on a single node.

IPv6 Support

Digital Ocean load balancers don't support IPv6. (Confirmed via their support)

Please vote for this one if you're interested: https://ideas.digitalocean.com/network/p/ipv6-for-load-balancers

As an alternative, Vultr (a competitor we evaluated) actually supports IPv6 on their load balancers. We went with Digital Ocean because of wider recognition, however with DO dragging on features like IPv6, maybe it's time to re-evaluate them?

[Change Request] Deploy HaikuDepot version 1.0.151

Description

Update HaikuDepot to version 1.0.151

How has the change been tested

Steps to implement the change

Note: Please mark changes from the default steps below in bold

  1. Verify that the image is available in the package registry.
  2. Start a job to backup the database:
    $ kubectl create job --from=cronjob/haikudepotserver-pgbackup haikudepotserver-pgbackup-manual-1.0.151
    
  3. Monitor the job to make sure it finishes correctly:
    $ kubectl logs -f jobs/haikudepotserver-pgbackup-manual-1.0.151
    Backup haikudepotserver...
    gpg: directory '/root/.gnupg' created
    gpg: keybox '/root/.gnupg/pubring.kbx' created
    Added `s3remote` successfully.
    `/tmp/haikudepotserver_2023-08-06.sql.xz.gpg` -> `s3remote/haiku-backups/pg-haikudepotserver/haikudepotserver_2023-08-06.sql.xz.gpg`
    Total: 0 B, Transferred: 245.32 MiB, Speed: 86.05 MiB/s
    Snapshot of haikudepotserver completed successfully! (haiku-backups/pg-haikudepotserver/haikudepotserver_2023-08-06.sql.xz.gpg)
    
  4. Apply any pre-deployment configuration changes (see section Configuration Changes)
  5. Update the version in the infrastructure repository in deployments/haikudepotserver.yml.
  6. Apply the update to the server:
    $ kubectl apply -f deployments/haikudepotserver.yml
    
  7. Apply any post-deployment configuration changes (see section Configuration Changes)
  8. Post-deployment checks (is the web service responding, can you refresh the data using the HaikuDepot app)
  9. Commit and push the updated deployment configuration to GitHub.
  10. Announce the update on the haiku-sysadmin and haiku mailing list.

Configuration Changes

Please list any configuration changes, and note whether they need to be done pre-deploy or post-deploy

None

Rollback Plan

If the update is unsuccessful, try rolling back the image with the following commands:

$ git restore deployments/haikudepotserver.yml
$ kubectl apply -f deployments/haikudepotserver.yml

If the update applied database transformations, or the database got corrupted in any other way, please also restore the database from the backup created as part of these update steps.

Investigate and monitor outage on Friday, June 9th.

We suffered an outage that @nielx resolved on June 9th, 2023. I feel like the root cause was a resource shortage, since we deployed the new Keycloak server a few days prior for SSO.

For now we're monitoring, however if it happens again we will likely need to grow our number of k8s nodes from 3 to 4 (incurring a ~$48 / month cost)

Grow beyond Docker Swarm

When we rolled our containerized infrastructure out in 2017, we chose Docker Swarm + Docker Compose due to its simplicity.

However, since then Kubernetes has grown by leaps and bounds to become the "standard" for more than 50% of cloud organizations. We use Rexray in our Docker Swarm to get "Kubernetes-like" managed persistent volumes... however Rexray is essentially a dead project, which creates risk.

The BIG downside is that cloud costs have been steadily increasing everywhere. A reasonably sized 3-node Kubernetes cluster for our workloads at Digital Ocean is around $144-$240 / month, and DO managed load balancers are likely extra as well. (Today we spend ~$124.03 / month on one single big node and a small IPFS gateway node in Germany.)

Pros:

  • More standardized stack
  • Limitless scaling. We could fire up a few additional K8S nodes before big releases like R1
  • Worker nodes can be rebooted with limited impact to running services
  • We gain 100% uptime upgrades (badly needed for things like software repositories... though it may not matter as much with IPFS?)

Cons:

  • Cost (more smaller nodes over one larger node)

Other solutions like Hashicorp Nomad have been suggested... but I don't know these solutions well enough to make an informed decision. AWS + GCP + Digital Ocean + Vultr all support managed K8S, and none of them support managed Nomad.

The cheaper solution would be going back to dedicated server instances or colocation running Kubernetes... but these all require manpower to manage (which we're short on).

tldr; Does the increase in price justify the improved ease of management?

Investigate internal oauth server

review.haiku-os.org is dependent on Github.

Since git access is critical to our infrastructure, we may want to investigate deploying our own oauth server at some point. We could begin to tie various services back to this common authentication gateway.

Github was an easy solution for the short term, but it makes us dependent on an external 3rd party service... are we ok with this?

concourse: re-runs builds each day even when there are no new commits on the git repos.

Not sure if this is the correct place to bring this up. I might be reading things the wrong way... and this is not actually an issue. If that's the case, I apologize in advance.

While nosing around https://ci.haiku-os.org/ and https://cgit.haiku-os.org/haiku/log/, I've noticed that... despite the last commit on the beta3 branch being from 2022-07-12, concourse seems to be creating new builds daily for that branch (both for 64 bits and 32 bits).

I admit that I might be reading it wrong, but logs like those for build 478 and older ones like build 445, to use two examples... seem to be rebuilding things for the same git commit reference.

As this seems quite wasteful to my untrained eye... I thought it made sense to report it. Again, if this is just "working as expected"... please disregard my intrusion.

Buildbot is *unbelievably* slow after adding the 4 new workers

Running htop on Maui shows that it uses around 5-10% CPU for the main process, so it must be blocked on something else. Even the backend is slow: it was taking 30 seconds to start the next task after completing the prior one in some cases.

It looks like we still use SQLite for the database: https://github.com/haiku/infrastructure/blob/master/data/buildbot/master.cfg#L59

And per the documentation, it looks like quite a lot of stuff is stored in the database, and virtually all pages and build status data are stored in and accessed from it: http://docs.buildbot.net/latest/developer/database.html

Indeed, they have some tickets about it: buildbot/buildbot#3002

It looks like they support using a Postgres database instead. We should do that ASAP.
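A rough sketch of the first step, assuming we stand up a dedicated PostgreSQL database and role for Buildbot and then point master.cfg's db_url at it (the role name and password below are placeholders, not our actual configuration):

# Create a dedicated role and database for the buildbot master
psql -U postgres -c "CREATE ROLE buildbot WITH LOGIN PASSWORD 'changeme';"
psql -U postgres -c "CREATE DATABASE buildbot OWNER buildbot;"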

Install Matomo for web analytics

https://github.com/matomo-org/docker

We should probably install it as https://metrics.haiku-os.org/matomo/, if we put it as analytics.haiku-os.org then some adblockers will block it for being called "analytics" (lol.)

It needs MySQL and serves via FastCGI, which is a bit of a pain. Maybe we should add a "generic nginx" container instead of spinning up nginxes for every individual thing that needs fastcgi? Or go vote for traefik/traefik#753.

Setup + deploy maui server

A running list of what needs to be done to complete the baron -> maui transition
Setup

  • Purchase Maui from Hetzner
  • Configure sysadmins
  • Setup docker + tools + volumes
  • haiku/infrastructure git repo

Deploy git/cgit
aka http://git.haiku-os.org @ vmrepo
Converting from bare ssh git + cgit to Gerrit.

  • Develop and deploy gerrit + cgit containers
  • Persistent volume structure
  • Properly route to using nginx configurations
  • Github OAuth integration
  • Convert git hooks to something that Gerrit can execute
  • Test Gerrit workflow
  • Ensure all committers can access the new git service and have proper permissions
  • Advertise the new git services to committers

Deploy trac
aka https://dev.haiku-os.org @ vmdev

  • Deploy trac container
  • Properly route to using nginx configurations

Deploy ports-mirror server
aka https://ports-mirror.haiku-os.org @ baron

  • Develop + deploy ports-mirror container
  • Properly route to using nginx configurations

Deploy pootle/userguide server
aka https://i18n.haiku-os.org @ vmdev

  • Develop + deploy pootle + userguide containers
  • Properly route to using nginx configurations

Deploy buildbot
was aka https://buildbot.haiku-os.org @ baron
new aka https://build.haiku-os.org @ maui

  • deploy buildbot container
  • Properly route to using nginx configurations

Deploy haikudepot
aka https://depot.haiku-os.org @ vmrepo

  • Develop + deploy haikudepot container
  • Properly route to using nginx configurations

Deploy buildmaster
aka https://vmpkg.haiku-os.org @ vmpkg
Risk: Not sure how haikuporter buildmaster is going to work in a container

  • Develop + deploy buildmaster container
  • Properly route to using nginx configurations

Deploy discourse
aka https://discuss.haiku-os.org @ vmsite
Do we want to move to discourse hosted version?

Clean up baron

  • Decommission vmdev
  • Decommission vmrepo
  • Decommission vmsite
  • Decommission vmpkg
  • Decommission baron (yay!)

Planning: Container deployment and grooming

We need to better define how containers are deployed and groomed.

Today we use docker-compose to manage our containers. We might want to look into other solutions.

Requirements:

  • Ensuring the proper containers are running
  • Ensuring the proper number of containers are running
  • Ensuring containers are started with the correct volumes
  • Ensuring containers are started with the correct ports + ips
  • End users should be able to deploy to docker installed on their desktops
  • Ensuring all requirements above are documented for each container.

cgit: Update to v1.2.4 Once Released

The version of cgit that we are running is now about 2-3 years old, and is internally making use of a git version that is, correspondingly, 2-3 years old. We are currently running cgit v1.2.1 while the latest is v1.2.3.

While we could upgrade now, I see that a new release may be just around the corner. It may be worth waiting for this update, as it would bring the internal git version from 2.25.1 (released more than a year ago) to 2.31.0 (the latest release).

Update to Trac 1.4

Trac 1.4 is due to come out at the end of the month, so it is a good moment to see where we are.

Check support for the modules that we use:

  • link-haiku-cgit.py
  • TracAccountManager (version 0.5.dev0) (updated to 0.5.1.dev0, will require Genshi)
  • TracMasterTickets (4.0.0.dev0) (updated to 4.0.2)
  • TracPoll (0.4.0.dev0) (does not depend on Genshi)
  • TracRobotsTxt (2.1)
  • TracSpamFilter (1.2.1.dev0) (needs an update on 1.3/1.4, see https://trac.edgewall.org/browser/plugins/trunk/spam-filter)
  • TracSubcomponents (1.2.1) (updated to 1.3.0)
  • TracVote (0.7.0.dev0) (should work, does not depend on Genshi)

After verifying the support of the modules (and preparing the upgraded versions), the steps would be the following:

  1. Create a new image for Trac 1.4 (or 1.3.6 during testing)
  2. Set up a temporary subdomain (dev-next)
  3. Clone the docker volume (see the sketch after this list)
  4. Clone the database (see the sketch after this list)
  5. Set up the trac 1.4 image for dev-next
  6. Set up the database connection string on the dev-next image
  7. Update the plugins on dev-next
  8. Run trac-admin upgrade on dev-next
  9. Start the image
  10. Test!
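For steps 3 and 4, a minimal sketch of one way to do the cloning (volume and database names are placeholders, not our actual configuration):

# Clone the trac docker volume into a new volume for dev-next
docker volume create trac-data-next
docker run --rm -v trac-data:/from -v trac-data-next:/to alpine sh -c "cp -a /from/. /to/"

# Clone the database (assuming PostgreSQL and sufficient privileges)
createdb trac_next
pg_dump trac | psql trac_next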

Finish i18n migration

  • Pootle: Have the build fix for musl reviewed (nielx)
  • Pootle: Set up repository push access for updated translations
  • Pootle: Check the synchronization script
  • Userguide: Check the functionality of the userguide tool
  • Userguide: Check 404
  • Pootle: Set up a weekly cron job to synchronize everything
  • Pootle/userguide: use pg_dump to periodically dump the db

Test longhorn fault tolerance

Our storage should be durable enough to withstand a rolling k8s node outage. This means longhorn should replicate data to other nodes in the cluster when a k8s node is cordoned off.

For longhorn to pass testing for Haiku Infrastructure, it needs to prove it can withstand a rolling outage of each node in the cluster.

Test parameters:

  • Add 20 GiB of data to longhorn
  • Recycle each node pool node in DO (with a standard cordon process; see the commands after this list)
  • Monitor that longhorn replicates data to other nodes (while stalling the node shutdown)
  • Perform this on each node to prove full redundancy is maintained
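For reference, the standard cordon / drain sequence used during a node recycle (node names are placeholders; exact drain flags may vary with the kubectl version):

kubectl cordon (NODE-NAME)
kubectl drain (NODE-NAME) --ignore-daemonsets --delete-emptydir-data
# ...recycle the node, then once it rejoins the cluster:
kubectl uncordon (NODE-NAME)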

Pootle: import catkeys from buildbot

The current Pootle image is large (1.2 GB), because of all the build tools that are included in the image. In the previous setup the catkeys were generated by the build bots, and then imported into the system. I wish to return to this situation.

Solving this would mean defining a way to import the files.

ports-mirror restart

The ports-mirror python script has been exiting (and causing the container to restart over and over) due to a bug.

Fetching origin
Fetching origin
Already up to date.
updating haikuports.git/master
2018-05-23T13:58:50
	cdrtools-3.02~a09.recipe => }.tar.bz2 (http://downloads.sf.net/cdrtools/cdrtools-${portVersion/\~/}.tar.bz2)
* unable to download }.tar.bz2 - curl gave resultcode 22
2018-05-23T13:58:51
	dragonmemory-1.recipe => DragonMemory-1-source.tgz (http://cznic.dl.sourceforge.net/project/dragonmemory/DragonMemory-source.tgz)
* unable to download DragonMemory-1-source.tgz - curl gave resultcode 6
2018-05-23T13:58:51
	dragonmemory-1.recipe => DragonMemory-1-source.tgz (http://heanet.dl.sourceforge.net/project/dragonmemory/DragonMemory-source.tgz)
* unable to download DragonMemory-1-source.tgz - curl gave resultcode 22
2018-05-23T13:58:51
	abe-1.1.recipe => abe-1.1.tar.gz (http://superb-dca3.dl.sourceforge.net/project/abe/abe/abe-1.1/abe-1.1.tar.gz)
* unable to download abe-1.1.tar.gz - curl gave resultcode 6
2018-05-23T13:58:51
	cd-5.8.recipe => cd-5.8_Sources.zip (http://heanet.dl.sourceforge.net/project/canvasdraw/5.8/Docs%20and%20Sources/cd-5.8_Sources.zip)
* unable to download cd-5.8_Sources.zip - curl gave resultcode 22
2018-05-23T13:58:51
	agg-2.5.recipe => agg-2.5.tar.gz (http://gnashdev.org/tools/ltib/agg-2.5.tar.gz)
* unable to download agg-2.5.tar.gz - curl gave resultcode 22
2018-05-23T13:58:52
	aiksaurus-1.2.1.recipe => aiksaurus-1.2.1.tar.gz (http://switch.dl.sourceforge.net/project/aiksaurus/aiksaurus/1.2.1/aiksaurus-1.2.1.tar.gz)
* unable to download aiksaurus-1.2.1.tar.gz - curl gave resultcode 6
2018-05-23T13:58:52
	libpaper-1.1.24.recipe => libpaper_1.1.24.tar.gz (http://ftp.de.debian.org/debian/pool/main/libp/libpaper/libpaper_1.1.24.tar.gz)
* unable to download libpaper_1.1.24.tar.gz - curl gave resultcode 22
2018-05-23T13:58:52
	xpdf-4.00.recipe => xpdf-4.00.tar.gz (http://www.xpdfreader.com/dl/xpdf-4.00.tar.gz)
* unable to download xpdf-4.00.tar.gz - curl gave resultcode 22
2018-05-23T13:58:52
	xpdf-3.04.recipe => xpdf-3.04.tar.gz (http://mirror.ctan.org/support/xpdf/xpdf-3.04.tar.gz)
* unable to download xpdf-3.04.tar.gz - curl gave resultcode 22
Traceback (most recent call last):
  File "/usr/local/bin/update-ports-mirror", line 137, in <module>
    updateFromCheckout(gitRepoDir)
  File "/usr/local/bin/update-ports-mirror", line 87, in updateFromCheckout
    os.mkdir(targetdir)
OSError: [Errno 2] No such file or directory: '/ports-mirror/srv-www/aobook/aobook-haiku-${portVersion/_'
Fetching origin

[Change Request] Deploy Discourse version 3.0.6

Description

Update Discourse to version 3.0.6

How has the change been tested

Dev-tested by the container developer.

Steps to implement the change

Note: Please mark changes from the default steps below in bold

  1. Verify that the image is available in the package registry.
  2. Make the installation read-only using the Enable read-only button on the Admin/Backups page.
  3. Start a backup by using the Backup button on that page.
  4. Update the version in the infrastructure repository in deployments/discourse.yml.
  5. Apply the update to the server:
    $ kubectl apply -f deployments/discourse.yml
    
  6. Post-deployment checks (is the web service responding, is the site read-write again)
  7. Commit and push the updated deployment configuration to GitHub.
  8. Announce the update on the haiku-sysadmin and haiku mailing list.

Configuration Changes

Please list any configuration changes, and note whether they need to be done pre-deploy or post-deploy

None

Rollback Plan

If the update is unsuccessful, try rolling back the image with the following commands:

$ git restore deployments/discourse.yml
$ kubectl apply -f deployments/discourse.yml

If the update applied database transformations, or the database got corrupted in any other way, use Discourse's built-in database restore features to return the data to the previously saved version.

refactor ports-mirror to use remote object storage

Ports-mirror rummages around in Haikuports and makes backups of sources for our packages. It's handy, however it uses a considerable amount of locally bound resources and storage.

https://github.com/haiku/infrastructure/tree/master/containers/ports-mirror

ports-mirror really needs to be refactored to use generic s3 buckets (aka Wasabi) for storage. This would enable us to archive large amounts of data cheaply, while keeping the storage dependencies on our own infrastructure small and agile.

Ideas:

  • While archiving ports, store the date, time, source, sha256, and recipe as metadata in s3 (see the sketch after this list)
    • sha256 in metadata is important so we can compare what the recipe expects against what is stored in s3.
  • Ideally, we should be able to point haikuporter directly at the s3 storage provider to reduce bandwidth within our infrastructure.
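A rough sketch of the idea using the AWS CLI against a Wasabi endpoint (the bucket name, paths, and metadata keys are placeholders, not our actual configuration):

# Compute the checksum, then upload the source archive with metadata attached
sha256=$(sha256sum cdrtools-3.02a09.tar.bz2 | cut -d' ' -f1)
aws s3 cp cdrtools-3.02a09.tar.bz2 \
  s3://ports-mirror/sources/cdrtools-3.02a09.tar.bz2 \
  --endpoint-url https://s3.wasabisys.com \
  --metadata recipe=cdrtools-3.02~a09.recipe,sha256=$sha256,fetched=$(date -u +%FT%TZ)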

No longer store nightlies on walter

Previously:

  • buildbot workers were modified to upload nightlies directly over s3 to walter.
  • generate-download-pages was modified to parse s3 buckets directly for nightly images

Given the two facts above, nothing is technically stopping us from "removing the nightlies from walter" and only hosting them on Wasabi. (We could keep the sha256sums on walter, though.)

This would:

  • Reduce the space requirements of our core infrastructure from "1TiB or more" to "a few hundred GiB"
  • we could have a few core developers mirror the nightly repository as a long-term backup.

We are already mirroring the nightly images from walter to wasabi, and the cost is roughly $4.99 / month

buildbot.haiku-os.org certificate expired.

Buildbot's certificate appears to have expired:

NET::ERR_CERT_DATE_INVALID
Subject: buildbot.haiku-os.org
Issuer: Let's Encrypt Authority X3
Expires on: Apr 8, 2018
Current date: Apr 16, 2018

Move haiku repos over to object storage / cdn

The Wasabi object storage has been a pretty reliable solution for what we already host there. We should consider hosting our package repositories on object storage as well.

Limitations:

  • s3 doesn't support symlinks, "current" points to a versioned repo
  • We shouldn't release haiku with a repo configuration which 100% depends on an external vendor.

To solve this, it would be nice if we had a tool / application which could provide haiku users "HTTP 302" redirects to external object storage / s3.

Git hooks async due to gerrit bug

Due to https://bugs.chromium.org/p/gerrit/issues/detail?id=5514 , the git hooks are running as ref-updated vs ref-update.

  • ref-updated == fires properly on "push" and on "review accept"
  • ref-update == fires properly on "push", but doesn't seem to fire on "review accept"

Once the bug linked above is resolved, and our gerrit container is updated to the fixed version, we should investigate moving back to the more proper ref-update (it will require some minor modifications to ref-update since the flags differ)

[Change Request] Deploy HaikuDepot version 1.0.149

Description

Update HaikuDepot to version 1.0.149

How has the change been tested

  • Dev-tested by the HaikuDepot dev.
  • Automated tests run by the Github action

Steps to implement the change

Note: Please mark changes from the default steps below in bold

  1. Verify that the image is available in the package registry.
  2. Start a job to backup the database:
    $ kubectl create job --from=cronjob/haikudepotserver-pgbackup haikudepotserver-pgbackup-manual-1.0.149
    
  3. Monitor the job to make sure it finishes correctly:
    $ kubectl logs -f jobs/haikudepotserver-pgbackup-manual-1.0.149
    Backup haikudepotserver...
    gpg: directory '/root/.gnupg' created
    gpg: keybox '/root/.gnupg/pubring.kbx' created
    Added `s3remote` successfully.
    `/tmp/haikudepotserver_2023-08-06.sql.xz.gpg` -> `s3remote/haiku-backups/pg-haikudepotserver/haikudepotserver_2023-08-06.sql.xz.gpg`
    Total: 0 B, Transferred: 245.32 MiB, Speed: 86.05 MiB/s
    Snapshot of haikudepotserver completed successfully! (haiku-backups/pg-haikudepotserver/haikudepotserver_2023-08-06.sql.xz.gpg)
    
  4. Apply any pre-deployment configuration changes (see section Configuration Changes)
  5. Update the version in the infrastructure repository in deployments/haikudepotserver.yml.
  6. Apply the update to the server:
    $ kubectl apply -f deployments/haikudepotserver.yml
    
  7. Apply any post-deployment configuration changes (see section Configuration Changes)
  8. Post-deployment checks (is the web service responding, can you refresh the data using the HaikuDepot app)
  9. Commit and push the updated deployment configuration to GitHub.
  10. Announce the update on the haiku-sysadmin and haiku mailing list.

Configuration Changes

Please list any configuration changes, and note whether they need to be done pre-deploy or post-deploy

None

Rollback Plan

If the update is unsuccessful, try rolling back the image with the following commands:

$ git restore deployments/haikudepotserver.yml
$ kubectl apply -f deployments/haikudepotserver.yml

If the update applied database transformations, or the database got corrupted in any other way, please also restore the database from the backup created as part of these update steps.

Offer rsync of haikuports

Haikuports lives in its own world outside of Haiku's core infrastructure:

So, the rsync services used by everything else don't offer up haikuports (the thing we need mirrors of most).
We need to figure out a way to mount the haikuports repos into the repo-sync container so haikuports can be mirrored like everything else.

Gerrit 2.16 EOL

According to the Release Plan of Gerrit 3.2, the 2.16 release is EOL. This means we need to start looking into updating Gerrit at some point.

The possible targets could be:

  • Gerrit 3.0
  • Gerrit 3.1
  • Gerrit 3.2

Initial to do list:

  • Review whether the target version works in WebPositive
  • Create a list of plugins and review whether they work in the target version
  • Review all repository-customizations (i.e. hook scripts) for possible incompatibilities
  • Create an update plan, including backup
  • Update!
