mozilla-it / haul
Static site hosting for Mozilla properties
License: Mozilla Public License 2.0
To support the publicsuffix.org website, we want to add CloudFront to the Haul stack. Within this issue, I want to plan that work. At the moment, I need to figure out if this should be additional Terraform within this repository or a new nubis-terraform
module that we reference here.
To enable triggering ACME certificate re-verification
Expose traefik prometheus metric
Over here: https://github.com/mozilla-it/haul/blob/master/sites/start.groovy#L10
The Git url for the l10n repo should be : https://github.com/mozilla-l10n/fx36start-l10n.git
Today we got an alert in bug 1429069 that the SSL cert for static.mozilla.com, which resides on Haul, was expiring in 14 days. Traefik may hard-code a 30-day buffer for renewals before certs expire, so a cert with only 14 days left suggests that Traefik was not properly renewing certs.
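To make the reasoning concrete, here is a minimal sketch of the renewal-window check implied above. The 30-day buffer comes from the report; the function names and the sample dates are hypothetical and are not taken from Traefik's code.

```python
from datetime import datetime, timedelta, timezone


def days_until_expiry(not_after, now):
    """Whole days remaining before the certificate expires."""
    return (not_after - now).days


def needs_renewal(not_after, now, buffer_days=30):
    """True when the cert is inside the renewal buffer (assumed to be 30 days)."""
    return not_after - now <= timedelta(days=buffer_days)


# Hypothetical dates, chosen only to mirror the "14 days left" alert.
now = datetime(2018, 1, 1, tzinfo=timezone.utc)
not_after = now + timedelta(days=14)

print(days_until_expiry(not_after, now), needs_renewal(not_after, now))
```

A cert that reports `needs_renewal(...) == True` while still being served means a renewal should already have happened, which matches the situation in the alert.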
/var/log/upstart/traefik.log
2017/08/31 15:56:56 server.go:2317: http: TLS handshake error from X.X.X.X:DDDDD: tls: client offered an unsupported, maximum protocol version of 301
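The version number in this message is most likely the hexadecimal TLS wire version, so `301` would be 0x0301, i.e. TLS 1.0. That reading is an assumption about how the value is formatted. A small sketch for decoding such values (the helper name is hypothetical):

```python
# TLS/SSL protocol versions as they appear on the wire.
TLS_VERSIONS = {
    0x0300: "SSL 3.0",
    0x0301: "TLS 1.0",
    0x0302: "TLS 1.1",
    0x0303: "TLS 1.2",
    0x0304: "TLS 1.3",
}


def decode_tls_version(logged):
    """Interpret a logged version string such as '301' as a hex wire version."""
    code = int(logged, 16)
    return TLS_VERSIONS.get(code, f"unknown (0x{code:04x})")


print(decode_tls_version("301"))
```

Under that assumption, the handshake error means the client could speak at most TLS 1.0, which the server is configured to reject.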
Pipeline and Script Security plugin needs updating
The Planet build container is failing to generate content for the projects
planet. This issue was reported here:
https://bugzilla.mozilla.org/show_bug.cgi?id=1421778
I made a minor change to the container (changing relative paths to full paths) which seems to solve this issue. dhartnell/mozilla-planet-builder:4.5
was pushed to Docker Hub and it should solve the issue. I'll update planet-mozilla.groovy
to use the newer container and validate that the site still builds successfully for the stage environment. After that, I can update the other Planet Groovy files and plan a production deploy.
The site is hosted elsewhere, as noted here:
https://github.com/mozilla-itcloud/scl3-migration-project/issues/4
Traefik error logs as seen in Kibana have timestamps in the message field, which makes searching for substrings very hard in Kibana due to Lucene's limited wildcard capabilities (especially not being able to query a field with an initial wildcard, e.g. '.substring.')
Example message field as seen in Kibana:
�[37mDEBU�[0m[2017-12-05T21:17:01Z] Round trip: http://127.0.0.1:82, code: 200, duration: 18.739618ms tls:version: 303, tls:resume:false, tls:csuite:c02f, tls:server:planet.mozilla.org
We should:
Both of these will make searching for Traefik errors at least a bit easier.
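One possible way to make such messages searchable is to strip the color codes and split the leading level and timestamp out of the message before indexing. The sketch below only illustrates that idea. The function name and the returned field names are hypothetical, and the line layout is inferred from the single example above.

```python
import re

# ANSI color sequences such as ESC[37m and ESC[0m.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")
# LEVEL[timestamp] rest-of-message, as in the example line.
LINE_RE = re.compile(r"^(?P<level>[A-Z]+)\[(?P<timestamp>[^\]]+)\]\s*(?P<text>.*)$")


def split_traefik_message(message):
    """Return level, timestamp and text as separate fields."""
    clean = ANSI_RE.sub("", message)
    match = LINE_RE.match(clean)
    if match is None:
        # Unknown layout: keep the cleaned text and leave the other fields empty.
        return {"level": None, "timestamp": None, "text": clean}
    return match.groupdict()


sample = "\x1b[37mDEBU\x1b[0m[2017-12-05T21:17:01Z] Round trip: http://127.0.0.1:82, code: 200"
print(split_traefik_message(sample))
```

With the timestamp moved into its own field, the remaining text starts with the actual message, so a prefix query is enough and no leading wildcard is needed.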
Increase planet.mozilla.org build container from tag 4.6 to 4.7. This container image includes the following Planet build script which stores the build cache for each planet in the EFS mount:
https://github.com/danielhartnell/mozilla-planet-builder/blob/master/planet-build.sh#L76
It includes our Consul fix and is an official RC
This is what is currently breaking the builds and likely will keep on breaking them from time to time
It's a simple redirect to https://marketplace.firefox.com/, and that's
now gone.
Place this argument in worker module
scale_load_defaults = true
Log says
2018-05-24 17:20:53 +0000 [error]: config error file="/etc/td-agent/td-agent.conf" error_class=Fluent::ConfigError error="Other 'in_tail' plugin already use same pos_file path: plugin_id = object:3fa0711524ac, pos_file path = /var/log/traefik.pos"
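The error indicates that two `in_tail` sources share one `pos_file`. A minimal sketch for spotting such duplicates in a config file's text is shown below. The helper name and the sample config are hypothetical, and the scan ignores any files pulled in through includes.

```python
import re
from collections import Counter

# Matches lines of the form "pos_file /some/path".
POS_FILE_RE = re.compile(r"^\s*pos_file\s+(\S+)", re.MULTILINE)


def duplicate_pos_files(config_text):
    """Return pos_file paths that are declared more than once."""
    counts = Counter(POS_FILE_RE.findall(config_text))
    return sorted(path for path, n in counts.items() if n > 1)


# Hypothetical config with the same pos_file used by two sources.
sample_config = """
<source>
  @type tail
  path /var/log/upstart/traefik.log
  pos_file /var/log/traefik.pos
</source>
<source>
  @type tail
  path /var/log/traefik/access.log
  pos_file /var/log/traefik.pos
</source>
"""

print(duplicate_pos_files(sample_config))
```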
For instance:
Line 13 in 2a5bfe8
vs:
rsync -av /data/genericrhel6/src/planet.mozilla.de/ dst/
Will not only copy only what has changed, but will also show only what has changed.
To give us Let's Encrypt HTTP-01 Solver support
Otherwise, it's very noisy
Upstream has moved its package locations
There are some hardcoded values in here that we should remove to make it more generic
In troubleshooting Issue #54 we deployed a log-level change to Haul stage. (PR #53) Nothing was deployed to prod. During the deploy to stage though the prod Traefik appears to have been kicked into action and started renewing certs it was not renewing previously (but should have been). The timing is so coincidental it seems unlikely to have been an accident.
12:59:22 PST: Stage haul deploy initiated
13:00:03 PST: Prod Traefik leaps into action after not doing much for a long time: Logs in Kibana: https://sso.core.us-west-2.appsvcs-generic.nubis.allizom.org/kibana/app/kibana#/discover?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:'2018-01-09T17:00:00.000Z',mode:absolute,to:'2018-01-09T23:15:38.762Z'))&_a=(columns:!(message),index:'logstash-*',interval:auto,query:(query_string:(analyze_wildcard:!t,query:'tag:traefik.error%20AND%20stack:haul-prod')),sort:!('@timestamp',asc))
We verified that certs on prod and stage Haul were renewed today with Let's Encrypt, all appear to have the same date stamps.
Expected outcome of stage deploy:
No effects in prod or on prod certs
Observed outcome:
Stage deploy appears to have triggered prod Traefik into renewing prod certs that it was supposed to have renewed previously.
We had a score of 70/100 several months ago:
https://bugzilla.mozilla.org/show_bug.cgi?id=1375084
It looks like we lost that after the migration:
https://observatory.mozilla.org/analyze/static.mozilla.com
Let's improve that. The bug has a list of headers I added in the past. It should be fine to add them here:
https://github.com/mozilla-it/haul/blob/master/nubis/puppet/sites.pp#L1-L12
Header always append X-Frame-Options SAMEORIGIN
Header set X-Content-Type-Options "nosniff"
Header set X-XSS-Protection "1; mode=block"
Header set Strict-Transport-Security "max-age=31536000"
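After deploying the directives above, the response headers can be checked against the expected values. This is a rough sketch, assuming the headers have already been fetched into a dictionary. The function name is hypothetical and the check is not the Observatory's scoring logic.

```python
# Expected values, copied from the Header directives listed above.
EXPECTED_HEADERS = {
    "X-Frame-Options": "SAMEORIGIN",
    "X-Content-Type-Options": "nosniff",
    "X-XSS-Protection": "1; mode=block",
    "Strict-Transport-Security": "max-age=31536000",
}


def missing_security_headers(response_headers):
    """Return the expected headers that are absent or carry a different value."""
    present = {name.lower(): value for name, value in response_headers.items()}
    missing = []
    for name, expected in EXPECTED_HEADERS.items():
        actual = present.get(name.lower())
        if actual is None or expected.lower() not in actual.lower():
            missing.append(name)
    return missing


print(missing_security_headers({"x-frame-options": "SAMEORIGIN"}))
```

Header names are compared case-insensitively because HTTP header names are case-insensitive.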
<VirtualHost *:81>
ServerName https://apps.mozillalabs.com
Redirect permanent / https://marketplace.firefox.com/
</VirtualHost>
The Haul deployments are failing because the role assumed by CI is not authorized to perform: acm:ListCertificates.
https://sso.core.us-west-2.appsvcs-generic.nubis.allizom.org/jenkins/job/haul-deployment/51/console
For more yummy metrics
See https://github.com/jonnenauha/prometheus_varnish_exporter