systems's Issues

Rebuild AMI - ??/??/??

  • Upgrade WordPress to 4.7
  • Copy new object-cache.php file to the /wp-content directory for WP Redis 0.60.
  • Backup S3 media to disk

Rebuild AMI 02/08/2017

  • Uncomment brute force attack prevention for wp-login.php in /etc/nginx/common/wpcommon-php7-modified.conf
  • Set port 80 to be the default server in /etc/nginx/sites-available/spiritedmedia.com
  • Update WordPress to 4.7.2
  • Backup media from S3 to the AMI

Add google-service-account-credentials.json File

The Google Analytics API needs to load credentials from a JSON file. This contains a private key that the PHP library uses to sign each request sent to the API. For security purposes this file needs to live above the web server root so it is not accessible from the web. On the live server the location needs to be /var/www/spiritedmedia.com/credentials/google-service-account-credentials.json. The contents of the file are stored in 1Password.

Steps after SSHing into the server:

cd /var/www/spiritedmedia.com
sudo mkdir credentials
sudo vim credentials/google-service-account-credentials.json
# Paste the JSON credentials in and save the file
sudo chown -R www-data: credentials/

See https://github.com/spiritedmedia/spiritedmedia/pull/2339

Rebuild AMI 2017-06-08

  • Upgrade WordPress to 4.8 (#17)
  • Remove '--dry-run' argument from deploy-production.sh (#16)
  • Also run the following: sudo pip install --upgrade awscli so the AWS cli is available to root (#16)
  • Upgrade OS patches

Enable and Monitor `debug.log`

The nginx error log is being monitored, but WordPress's built-in debug logging would be another place to watch for errors. To do this we need to modify wp-config.php so WP_DEBUG is enabled by default:

define( 'WP_DEBUG', true );
define( 'WP_DEBUG_LOG', true );
if ( isset( $_GET['debug'] ) && $_GET['debug'] === '!wr8KCsv9V7%' ) {
    define( 'WP_DEBUG_DISPLAY', true );
    define( 'SAVEQUERIES', true );
} else {
    define( 'WP_DEBUG_DISPLAY', false );
}

We also need to tell the AWS CloudWatch log monitor about the new log we want to track by appending a new config section to /var/awslogs/etc/awslogs.conf:

[/var/www/spiritedmedia.com/htdocs/wp-content/debug.log]
datetime_format = %d-%b-%Y %H:%M:%S 
file = /var/www/spiritedmedia.com/htdocs/wp-content/debug.log
buffer_duration = 5000
log_stream_name = spiritedmedia.com
initial_position = start_of_file
log_group_name = /var/www/spiritedmedia.com/htdocs/wp-content/debug.log

Be sure to stop and start the logging daemon:

sudo service awslogs stop
sudo service awslogs start

Reduce CDN Bill

https://bunnycdn.com/ works the same as our current CDN provider, KeyCDN. BunnyCDN charges $0.01/GB vs. $0.04/GB for KeyCDN.

From March 18 to September 13 (180 days) we used 5.32 TB of bandwidth through KeyCDN, or about 29.5 GB per day. At that rate (roughly 10,768 GB per year) we would spend about $430.70 per year at KeyCDN's $0.04/GB versus about $107.68 per year at BunnyCDN's $0.01/GB.

BunnyCDN has fewer data centers (POPs), which isn't a big deal for our regional traffic. KeyCDN is planning a POP in Denver according to https://www.keycdn.com/network, which would be great for Denverite.

Add `TACHYON_URL` Constant to `wp-config.php`

The following needs to be added to the production server's wp-config.php file after the S3_UPLOADS_BUCKET_URL constant is defined:

define( 'TACHYON_URL', S3_UPLOADS_BUCKET_URL . '/wp-content/uploads' );

Without this constant Tachyon probably won't kick into action and our URLs won't get rewritten to their CDN versions. Things will break; chaos will ensue.

Rebuild AMI - 12/28/2016

  • Update sync.include to include woff2 files for syncing to S3
  • Do some $_SERVER juggling so WordPress can correctly identify if an SSL request is being performed even behind a load balancer
  • Security Updates

Make Two Scripts for Turning Basic Auth On/Off on Staging Server

There are times when we need external services to test our pages (like Twitter reading our social graph meta data in the HTML) and basic auth blocks that from happening. A simple way to temporarily turn off basic auth would be nice.

We can do this manually by editing /var/www/staging.spiritedmedia.com/conf/nginx/site.conf and editing the following lines:

# To turn off basic auth, uncomment the following line
set $auth_basic off;
auth_basic $auth_basic;
auth_basic_user_file /var/www/staging.spiritedmedia.com/conf/nginx/.htpasswd;

set $auth_basic off; (uncommented) means basic auth is off
# set $auth_basic off; (commented out) means basic auth is on

Then we need to restart nginx for the changes to take effect: sudo ee stack restart --nginx
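
A pair of scripts could wrap this up. A minimal sketch, assuming the toggle is done by commenting/uncommenting the set $auth_basic off; line in site.conf (the script names are illustrative):

#!/usr/bin/env bash
# basic-auth-off.sh -- disable basic auth on staging (illustrative sketch)
CONF=/var/www/staging.spiritedmedia.com/conf/nginx/site.conf
# Uncomment the "set $auth_basic off;" line
sudo sed -i 's|^\([[:space:]]*\)#[[:space:]]*set \$auth_basic off;|\1set $auth_basic off;|' "$CONF"
sudo ee stack restart --nginx

#!/usr/bin/env bash
# basic-auth-on.sh -- re-enable basic auth on staging (illustrative sketch)
CONF=/var/www/staging.spiritedmedia.com/conf/nginx/site.conf
# Comment out the "set $auth_basic off;" line
sudo sed -i 's|^\([[:space:]]*\)set \$auth_basic off;|\1# set $auth_basic off;|' "$CONF"
sudo ee stack restart --nginx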

Move to an External Cron

Currently each machine runs a WP CLI script every minute to manage cron events (See b279d5c)

The problem with doing this in production is that we have multiple servers running at any given time, which means the cron events are triggered multiple times per minute. The best way to resolve this would be to use an external cron service that pings one URL, so cron is triggered only once per minute regardless of how many machines are running in production.

Possible solutions:

Either solution would also require us to set up a PHP script for the external cron to hit (see the sketch below), so whenever we add a new site we don't need to manually update the external cron configuration. See https://tribulant.com/blog/wordpress/replace-wordpress-cron-with-real-cron-for-site-speed/
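
A minimal sketch of what that endpoint could look like, assuming a multisite install; the filename, shared secret, and file location are illustrative, not the actual implementation:

<?php
// cron-runner.php -- hit by the external cron service once per minute (illustrative sketch).
// Assumes this file sits in the WordPress root next to wp-load.php.
require dirname( __FILE__ ) . '/wp-load.php';

// Reject anyone who doesn't know the shared secret.
if ( ! isset( $_GET['secret'] ) || 'change-me' !== $_GET['secret'] ) {
    http_response_code( 403 );
    exit;
}

// Enumerate every site in the network so new sites are picked up automatically ('number' => 0 means no limit).
foreach ( get_sites( array( 'number' => 0 ) ) as $site ) {
    $url = 'https://' . $site->domain . $site->path . 'wp-cron.php?doing_wp_cron';
    // Fire and forget; non-blocking keeps the runner fast.
    wp_remote_get( $url, array( 'blocking' => false, 'timeout' => 2 ) );
}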

Upgrade Tachyon Dynamic Image Resizing Infrastructure

At the end of November 2017, we put infrastructure in place to run Tachyon for dynamic image resizing via query strings (https://github.com/spiritedmedia/spiritedmedia/issues/2016). Unfortunately this caused other issues, like animated GIFs not loading (https://github.com/spiritedmedia/spiritedmedia/issues/2422), unsupported file types not being returned (https://github.com/spiritedmedia/spiritedmedia/issues/2455), etc.

Looking at how Humanmade designed Tachyon: they use the API Gateway service to connect a Lambda function to the Internet. Our CDN points to a CloudFront distribution, which points to the API Gateway, which passes the request to a Lambda function, which queries S3 for the object and then returns the response along the reverse path. The problem is that the API Gateway is a bottleneck. It requires binary responses (images, PDFs, videos) to be base64-encoded strings which the API Gateway then decodes. The API Gateway service also limits the total size of the payload to somewhere around 4-5 MB. Payloads bigger than this result in a server error and are thus never served.

Since Tachyon was originally released, Amazon introduced Lambda@Edge, which lets you run a Lambda function at different points of a request to their CDN, CloudFront. With Lambda@Edge we can listen for an event and modify the location of the origin request that fetches the asset. Instead of trying to make Lambda handle the whole request, we can use a Lambda function to change the logic of the request and rely on CloudFront to do the rest of the work of actually returning the response.

How it works: we listen for when CloudFront needs to make a request to the origin server because it doesn't have a cached copy of the response for the URL being requested. If the request isn't for an image, we bail out of the Lambda function and let CloudFront continue to do its thing. If the request is for an image and has query strings that we recognize for manipulation, we check whether the image has already been processed and, if so, return that cached version. Otherwise we pull down the original asset from S3, modify it (resize, manipulate colors, flip or flop dimensions, etc.), save the result in a new directory in the S3 bucket, and tell CloudFront to use the new location to service the request. The result of this work is our own project, Tachyon@Edge: https://github.com/spiritedmedia/tachyon-edge/

And as of 2/23/2018 it is now live on our own infrastructure serving requests.

Update AMI 04/10/2018

  • Update OS patches
  • Sync media from S3 to uploads folder
  • Remove ActiveCampaign API Keys from wp-config.php (#41)
  • Move to an External Cron (#42)
  • Update WordPress from 4.9.1 to 4.9.5

Remove '--dry-run' argument from deploy-production.sh

Apparently sticking # in front of --dry-run in a command's argument list doesn't actually comment out the argument. When code changes were pushed, dry-run mode was enabled and compiled assets like CSS and JavaScript weren't actually synced to Amazon S3 and served via the CDN.

I've made changes to both servers for now, but this needs to be addressed the next time we bake a new AMI.

Modify Redis Page Caching Configs

Copy redis-php7.conf to redis-php7-modified.conf

Comment out skipping caching of requests with query strings:

#if ($query_string != "") {
#  set $skip_cache 1;
#}

Remove xmlrpc.php and /feed/ from the regex that decides whether to skip the cache. We want requests to feeds to be cached:

if ($request_uri ~* "(/wp-admin/|wp-.*.php|index.php|sitemap(_index)?.xml|[a-z0-9_-]+-sitemap([0-9]+)?.xml)") {
  set $skip_cache 1;
}

Make sure we're including our new modified conf file in `/etc/nginx/sites-available/spiritedmedia.com`

include common/redis-php7-modified.conf;

Install NTP

sudo apt-get install ntp

This should help keep the server times in sync. Also double-check the time zone configuration to make sure all servers are set to the same time zone, America/New_York.
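
One way to check and set the zone (assuming a systemd-based Ubuntu; older releases can use dpkg-reconfigure tzdata):

timedatectl
sudo timedatectl set-timezone America/New_York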

Purge Static Assets from CDN At The End of Deployment

Because we deploy to one production server at a time, it is possible for the CDN to cache an old version of a static asset under a new version number. This is a race condition: a request comes from a server whose codebase was recently updated but hits a CDN endpoint where the new static asset hasn't been synced yet.

In other words, /path/to/file.css?ver=1.2.3 is actually serving version 1.2.2 of the file. One way of fixing this is to make an API request to our CDN provider, BunnyCDN, to purge a list of assets.

A cURL request would look something like this:

curl --include \
     --request POST \
     --header "Content-Type: application/json" \
     --header "Accept: application/json" \
     --header "AccessKey: abc123" \
  'https://bunnycdn.com/api/purge?url=https%3A%2F%2Fexample.com'

This should happen near the end of deployment-production.sh after the static assets have been synced: https://github.com/spiritedmedia/systems/blob/3886746cc71dc953853ba8bd0df6e0a4461288b4/web-servers/spiritedmedia.com/deploy-production.sh

See https://bunnycdn.docs.apiary.io/#reference/0//api/statistics/post-purge-cache?console=1

The bunnyCDN API key can be found at https://bunnycdn.com/dashboard/account
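
A sketch of what that step could look like in deploy-production.sh; the CHANGED_ASSETS list and the key handling are illustrative assumptions:

# Purge freshly synced static assets from BunnyCDN (illustrative sketch)
ACCESS_KEY="abc123"  # the real key lives in the BunnyCDN dashboard, not in the script
for asset in "${CHANGED_ASSETS[@]}"; do
  # -G + --data-urlencode builds the ?url=... query string; --request POST keeps the method
  curl --silent \
       --request POST \
       -G --data-urlencode "url=${asset}" \
       --header "AccessKey: ${ACCESS_KEY}" \
       "https://bunnycdn.com/api/purge"
done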

Enable ?debug query string to enable WP_DEBUG constant on live site

There are times when we encounter a fatal error on the live site and debugging the cause can be tricky. It would save lots of time if we could append ?debug=<random-hard-to-guess-gibberish-string> and see the error message.

We can do this by setting up a conditional in the wp-config.php file on each of the production servers, like so:

if ( isset( $_GET['debug'] ) && $_GET['debug'] === '5cWf438WDFN3ZQyQsTtRfXeB' ) {
   define( 'WP_DEBUG', true );
} else {
   define( 'WP_DEBUG', false );
}

if ( WP_DEBUG ) {
   // Other constants that should be set if WP_DEBUG is true
}

This will require making a change and re-baking a server image before going live.

Update AMI 01/09/2018

  • Update OS patches
  • Sync media from S3 to uploads folder
  • Remove WP Offload S3 Lite Constants from wp-config.php (#34)
  • Update Database Endpoint (#36)

Rebuild AMI 03/02/2017

  • Update the OS
  • Backup media from S3
  • Add YouTube API key to wp-config.php

define( 'YOUTUBE_DATA_API_KEY', 'AIzaSyAiTWBODuombS_Xwax2ZzZbisskVnsw3ag' );

Update AMI 10/26/2017

  • Update OS patches
  • Sync media from S3 to uploads folder
  • Enable and Monitor debug.log (#23)
  • Install NTP (#24)
  • Update PGP repo Key for EasyEngine (#25)
  • Update WordPress to 4.8.2 (#28)
  • Add google-service-account-credentials.json File (#29)

Block XML-RPC Requests

We don't use it and it opens up a vector for resource draining:

Edit /var/www/spiritedmedia.com/conf/nginx/protect-system-files.conf:

# Block XML-RPC requests
location = /xmlrpc.php {
  deny all;
  access_log off;
  log_not_found off;
}
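
Once deployed, a quick sanity check from any machine (the deny all should produce a 403):

curl -sI https://spiritedmedia.com/xmlrpc.php | head -n 1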

Update AMI 6/23/2017

  • Update OS patches
  • Sync media from S3 to uploads folder
  • Block XML-RPC Requests (#21)
  • Modify Redis Page Caching Configs (#20)
  • Update WP CLI (#19)
  • Create central location for error log aggregation (#2)

Add `.wp-cli/config.yml` to `~`

With WP CLI we can specify a global configuration file pointing to the path of our WordPress install. This means that when SSHing into the server we don't need to navigate to the WordPress directory to issue wp commands.

Contents of ~/.wp-cli/config.yml should be the following:

path: /var/www/spiritedmedia.com/htdocs/
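
Once that's in place, wp commands can be run from any directory, for example:

cd ~
wp option get siteurl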

Add Support for Brotli Compression

Brotli is a compression algorithm from Google that is more efficient than gzip. Our CDN, KeyCDN, announced support for it today (https://www.keycdn.com/blog/keycdn-brotli-support/?utm_campaign=2017-02-16-brotli-support), provided the origin server serves Brotli-compressed files.

This would require recompiling Nginx with a new module enabled:
https://afasterweb.com/2016/03/15/serving-up-brotli-with-nginx-and-jekyll/

Or we can just wait for EasyEngine to support it one day: EasyEngine/easyengine#759
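
For reference, once nginx is rebuilt with the ngx_brotli module, enabling it is only a few of the module's standard directives (a sketch; not in our configs today):

brotli on;
brotli_comp_level 6;
brotli_types text/plain text/css application/javascript application/json image/svg+xml;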

Reduce ElastiCache Bill

We use AWS's ElastiCache service for Redis. It runs 24x7 in production and is a key component of our site infrastructure. Currently we are using a T2 Small cache node at $0.034 per hour.

We can step down to a smaller node, T2 Micro, at $0.017 per hour.

There are also Reserved ElastiCache Nodes:

1 Year
  • t2.micro = $0.006/hour + $51 upfront
  • t2.small = $0.011/hour + $102 upfront

3 Years
  • t2.micro = $0.004/hour + $109 upfront
  • t2.small = $0.008/hour + $218 upfront
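
Back of the envelope, assuming 8,760 hours per year at the rates above:

  • t2.small on-demand: $0.034 × 8,760 = $297.84/year
  • t2.micro on-demand: $0.017 × 8,760 = $148.92/year
  • t2.small 1-year reserved: $0.011 × 8,760 + $102 = $198.36/year
  • t2.micro 1-year reserved: $0.006 × 8,760 + $51 = $103.56/year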

Question: how heavily is our T2 Small cache node actually being utilized? T2 Small = 1.55 GB of memory, T2 Micro = 0.555 GB of memory.

Install and Configure HyperDB

We've been running into some database performance issues. The CPU credits on our database RDS instance are depleting at a rapid rate. To stabilize things I decided to launch a read replica to help take some of the strain off of the master database.

[Screenshot, 2018-10-01: RDS CPU credit usage]

To do this we'll need to install and configure HyperDB so we can tell WordPress where to direct READ queries and where to direct WRITE queries.

See https://wordpress.org/plugins/hyperdb/
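
A minimal db-config.php sketch of the read/write split using HyperDB's add_database() API; the replica hostname is a placeholder and the priorities shown are illustrative:

<?php
// db-config.php -- illustrative HyperDB sketch, not our actual config.
// Master: handles all writes, and reads only as a fallback
// (read values are priority groups; lower numbers are preferred).
$wpdb->add_database( array(
    'host'     => DB_HOST, // the writer endpoint
    'user'     => DB_USER,
    'password' => DB_PASSWORD,
    'name'     => DB_NAME,
    'write'    => 1,
    'read'     => 2,
) );

// Read replica: no writes, preferred for reads.
$wpdb->add_database( array(
    'host'     => 'prod-replica.example.us-east-1.rds.amazonaws.com', // placeholder
    'user'     => DB_USER,
    'password' => DB_PASSWORD,
    'name'     => DB_NAME,
    'write'    => 0,
    'read'     => 1,
) );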

Update Database Endpoint

Since switching to an Aurora database instead of MariaDB we now need to update the database endpoint in the AMI.

Old: prod-spiritedmedia-alt.citcswzewrv3.us-east-1.rds.amazonaws.com
New: prod-spiritedmedia-aurora.citcswzewrv3.us-east-1.rds.amazonaws.com

The rest of the database credentials remain the same.
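
Assuming the endpoint lives in the standard DB_HOST constant in wp-config.php, the change is one line:

define( 'DB_HOST', 'prod-spiritedmedia-aurora.citcswzewrv3.us-east-1.rds.amazonaws.com' );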

Support SSL when EC2 Instances are Accessed Directly

Part of the update process involves pointing the domain names at an instance via your local hosts file so the instance can be accessed directly. This stopped working when we switched to HTTPS: the load balancer handles decrypting the SSL connection, so bypassing the load balancer means nothing terminates SSL.

Some ideas:

  • Configure nginx to listen on port 443 with a self-signed certificate (see the sketch below). EasyEngine has a file called /etc/nginx/snippets/snakeoil.conf
  • Configure something in the load balancer to forward the request properly if a different port is used in the URL: https://billypenn.com:8080. This would involve some DNS juggling as well.
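
A minimal nginx sketch of the first idea, assuming the EasyEngine snakeoil snippet supplies the self-signed certificate/key pair (the server block shown is illustrative, not our full config):

# In /etc/nginx/sites-available/spiritedmedia.com (illustrative)
server {
    listen 443 ssl;
    server_name spiritedmedia.com *.spiritedmedia.com;
    # Self-signed cert/key shipped with EasyEngine
    include /etc/nginx/snippets/snakeoil.conf;
    # ... rest of the existing server block ...
}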

Update AWS CloudWatch Log Monitor Config

The debug.log stream wasn't updating. Turns out we need to specify that the log's timestamps are in UTC, not local time. We also want to set log_stream_name to {instance_id}. This lets us stream the logs to AWS CloudWatch where we can look at an individual stream or a combined stream in a log group. Nifty.

Update the AWS CloudWatch log monitor config at /var/awslogs/etc/awslogs.conf:

[/var/log/nginx/spiritedmedia.com.error.log]
datetime_format = %Y/%m/%d %H:%M:%S
file = /var/log/nginx/spiritedmedia.com.error.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /var/log/nginx/spiritedmedia.com.error.log

[/var/log/nginx/error.log]
datetime_format = %Y/%m/%d %H:%M:%S
file = /var/log/nginx/error.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /var/log/nginx/error.log

[/var/log/nginx/spiritedmedia.com.access.log]
datetime_format = %d/%b/%Y %H:%M:%S %z
file = /var/log/nginx/spiritedmedia.com.access.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /var/log/nginx/spiritedmedia.com.access.log

[/var/www/spiritedmedia.com/htdocs/wp-content/debug.log]
datetime_format = %d-%b-%Y %H:%M:%S
time_zone = UTC
file = /var/www/spiritedmedia.com/htdocs/wp-content/debug.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /var/www/spiritedmedia.com/htdocs/wp-content/debug.log

Be sure to restart the logging daemon:
sudo service awslogs restart

Probably a good idea to delete the debug.log file as well.

Reduce RDS Bill

Currently we run a db.t2.medium Multi-AZ instance at $0.136 per hour; this doesn't include storage. The RDS service automatically handles running two databases in different availability zones for a high-availability set-up, syncing changes, and backups.
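
Back of the envelope: $0.136/hour × 730 hours is about $99 per month, or roughly $1,191 per year, before storage.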

Amazon Aurora could cut our monthly bill in half, since Aurora automatically handles syncing the data to multiple availability zones without the increase in price. See https://aws.amazon.com/rds/aurora/pricing/

There are also reserved instances available.

Update AMI 05/25/2018

  • Update OS patches
  • Update WP CLI from version 1.2.1 to 1.5.1
  • Add .wp-cli/config.yml to ~ (#48)
  • Add TACHYON_URL Constant to wp-config.php (#45)
  • Update WordPress to 4.9.6 (#46)

Update WP CLI

Run sudo wp cli update --allow-root to update WP CLI from 0.24.1 to 1.2.1.
