WordPress plugin for handling legacy redirects in a scalable manner.
Please see our wiki for detailed documentation.
- PHP 7.4+
- WordPress 5.9+
See CHANGELOG.md for the full list of changes.
Licensed under GPL-2.0-or-later
.
WordPress plugin for handling large volumes of legacy redirects in a scalable manner.
WordPress plugin for handling legacy redirects in a scalable manner.
Please see our wiki for detailed documentation.
See CHANGELOG.md for the full list of changes.
Licensed under GPL-2.0-or-later
.
As the plugin uses wp_safe_redirect
, it would be helpful to report any non-whitelisted host which is being imported.
Example implementation:
$parsed_url = parse_url( $redirect_being_imported );
if ( true === isset( $parsed_url['host'] ) && false === empty( $parsed_url['host'] ) ) {
$wpp = parse_url(home_url());
$allowed_hosts = (array) apply_filters( 'allowed_redirect_hosts', array($wpp['host']), isset($parsed_url['host']) ? $parsed_url['host'] : '' );
if ( !in_array($parsed_url['host'], $allowed_hosts) && $parsed_url['host'] != strtolower($wpp['host'])) ) {
WP_CLI::line( sprintf( 'Non whitelisted host is being imported: %s', $parsed_url['host'] ) );
}
}
For ease of installation. There is currently a bootleg version (https://packagist.org/packages/aaronholbrook/wpcom-legacy-redirector) but would be good to get the proper version on packagist.
Some similar requests to automattic repos for reference
Currently to change a URL, the Redirect has to be deleted and then re-added.
The ability to edit a redirect would improve ease of use.
Add a class that holds the enum list (class constants) of supported redirect HTTP status codes - like 301 and 302. Note that everything other than 301 is not yet used out of the box, but there is a filter around a 301
; adding a constant for 302, for example, would allow someone to use this as well to set a 302 redirect.
Something like:
<?php
declare(strict_types=1);
namespace Automattic\LegacyRedirector\Domain\Redirect;
class Redirect_Code {
public const MULTIPLE_CHOICES = 300;
public const MOVED_PERMANENTLY = 301;
public const FOUND = 302;
public const SEE_OTHER = 303;
public const USE_PROXY = 305;
public const TEMPORARY_REDIRECT = 307;
public const PERMANENT_REDIRECT = 308;
public static function valid_redirection_codes() {
return [
self::MULTIPLE_CHOICES,
self::MOVED_PERMANENTLY,
self::FOUND,
self::SEE_OTHER,
self::USE_PROXY,
self::TEMPORARY_REDIRECT,
self::PERMANENT_REDIRECT,
];
}
}
We noticed a performance issue with this plugin on VIP Go. The function register_redirect_custom_capability is hooked in to run at init, which calls wpcom_vip_add_role_caps on the VIP environment. This should only be called when the capabilities change, per the VIP documentation at https://wpvip.com/documentation/vip-go/custom-user-roles/
When preparing to do a bulk import from a CSV, I tested locally and found none of the redirects were being added. I was able to track this down to the check on whether WP_CLI is defined.
There is already a pull request (#8) which includes a fix for it, but it also included an additional filter and is being held up from being merged.
Use Case:
It would be interesting to be able to run a cli command (or something similar) that verifies that the redirects were pulled in successfully and work as expected.
Fill in any gaps for undocumented functions that exist today. Make sure all doc blocks conform to standards. This will also clear up a big chunk of PHPCS warnings. ๐
The insert_legacy_redirect()
method will return "True if inserted; false if not permitted; otherwise error upon validation issue.".
It would be handy to make the newly inserted post ID to be available.
If we updated the true
return, then this may break existing ===
checks.
We can add a new param to the method and which can dictate whether true
or the post ID is returned on success.
For a 2.0 release, we can make the change so that only the inserted post ID is returned by default.
In develop
branch, I noticed when adding a relative path, the redirect_*_preview
above the input shows the current sites path and appends the inputs value to it. However once submitted, the subsites path is not included, which may be misleading.
Steps to reproduce:
develop
branchexample.com/subsite1/
)/testing
example.com/subsite1/testing/
example.com/testing/
instead of the previewed example.com/subsite1/testing/
Video: https://io.davidsword.ca/wpcomlegacyredirect.mov
The TO input has the same issue, of previewing the subsite path in the preview, but not including it in the actual stored redirect value. There's an additional oddity where if the subsite path isn't included (which will result in a broken link), the link validation does not catch it and it shows as a valid link.
301s are cached by browsers (at least Firefox and Chrome) with no expiry date.
This poses a problem when we need to update redirects in some cases.
I suggest adding a header like Cache-Control: max-age= where seconds may be a fairly large number, perhaps even one day.
We may think of setting up a default behaviour and some configurable params.
Would need something like:
add_filter( 'ep_indexable_post_types', 'legacy_redirector_do_not_index_in_elasticpress' );
function legacy_redirector_do_not_index_in_elasticpress( $types ) {
$deny_list = array( 'vip-legacy-redirect' );
// Unset the items with values matching the deny list.
return array_diff( $types, $deny_list );
}
While this originated from the VIP Search, the ep_indexable_post_types
filter applies to any instance of ElasticPress (not just on VIP), so we would not want to put this snippet into a vipgo-helper.php file.
From the Wiki...
There are three ways to create redirects:
The plugin documentation should live in this repo and be the canonical source.
The current documentation can be found here:
https://docs.wpvip.com/technical-references/redirects/wpcom-legacy-redirector-plugin/
README.md or Github's wiki feature could be used. The README.md also needs an update to remove a broken link to the old documentation source.
tl;dr Check line-endings in a CSV file are consistent with the host environment before import.
I tried to use this plugin with a .csv
file that produced errors on import. What was reported had overlapping URLs, and only showed 7 out of the 88 URLs attempting to be imported.
I was able to spot a pattern on which URLs these were in the CSV file, and found it was about every 2000 bytes. That lead me to consider that the fgetcsv()
file appeared to be grabbing 2000 bytes of data as a single row, which implied that multiple rows were being concatenated into one.
From this, I was able to make and prove the conclusion that the file was using Mac OS 9 line-endings (CR
). The server I was trying to run this on required Unix line-endings (LF
). Converting the file locally, pushing to the server and importing then worked fine.
The plugin should check for line-endings, or, as suggested here, enable auto-detection of line-endings before opening the file.
sonarcloud.io is an online service for Code Quality Review, it might be a good idea to set this up along with PHPCS and Wordpress coding standards and it will give way for better code practices and better collaboration.
A current scan revealed a list of potential items that should be fixed.
The main plugin class has a validate_urls()
method, and a separate validate()
method.
The UI class has a validate_redirects_notices
, the addition of a "Validate" link, the handling for checking that a link is valid, and a highly-coupled call to validate when adding a redirection via the UI.
When importing a CSV, there's a --skip-validation
flag to bypass all of this!
I want to see the validation of a From value, and any validation of a To value to be handled separately. These should be built out as individual validators, which are then registered to a UrlValidator class (Strategy Pattern). This UrlValidator (could be a different instance for From and To, depending on if they need different validations) can then be injected as a dependency when it is needed.
Should start with making a list of the different validations that happen for any data.
Is version 1.4.0-alpha
the latest recommended version for sites on VIP Go? If so, could we please tag a new release and merge the current work from develop
to master
?
Constant should be defined for the plugin's slug.
define( 'WPCOM_LEGACY_REDIRECTOR_PLUGIN_SLUG', 'wpcom-legacy-redirector' );
There are currently 47 references in the code base to the string 'wpcom-legacy-redirector'
i.e. 'not_found' => __( 'No redirects found.', 'wpcom-legacy-redirector' )
As part of #63905-z we had the interesting request that legacy redirects should ignore utm_* parameters so that they don't get stripped when redirects occur.
Per discussion in #14 it was determined that whitelisting of params would be an ideal solution so that users can allow campaign and other params to be passed through to the redirected page.
It'd be nice if we could list redirects via WP CLI for investigative purposes
Emojis and chars like the horizontal ellipsis โฆ
can be submitted successfully in the FROM path's, but they don't actually redirect.
This could be a problem if for example this plugin is used as a shortener by marketers, ie: /๐ฅ-deals
The plugin supports trashing the redirect posts, however the logic evaluating the redirect request doesn't account for the post status at all:
WPCOM-Legacy-Redirector/includes/class-lookup.php
Lines 42 to 54 in 2499ef0
If we do want to support the trashing, the logic needs to be extended to check for the post_status
of the resolved $redirect_post
. Alternatively, we could disable the trash support and allow hard-deletes.
Please let us know if this needs to be addressed and we can submit a pull request.
$ wp wpcom-legacy-redirector insert-redirect /foo 5
Error: Couldn't insert /foo -> 5 (Redirect is pointing to a Post ID that does not exist.)
Th insert-redirect
command fails when using a post ID as the To destination value. Removing a check for the $_POST superglobal, as per #115 fixes it, but then this breaks the UI view:
(The different redirect there is using a /bar
To value, rather than a post ID)
#115 was reverted in #116, but the problem still likely lies in WPCOM_Legacy_Redirector::vip_legacy_redirect_parent_id()
.
The check_if_404()
contains a wp_remote_get()
(and a vip_safe_wp_remote_get()
) call. When using a self-signed certificate, this stops the check from passing as a WP_Error is returned:
object(WP_Error)#1040 (2) {
["errors"]=> array(1) {
["http_request_failed"]=> array(1) {
[0]=> string(63) "cURL error 60: SSL certificate problem: self signed certificate"
}
}
["error_data"]=> array(0) { }
}
However, the UI only shows an error of Error: Redirects need to be from URLs that have a 404 status.
, which is not indicative of the actual error.
As per #125 (comment) when redirect posts are added, they end up with a draft
post status.
To enable a concept of active and inactive (and trashed #125) redirects, we need to:
publish
post status.publish
status (if not already trashed).publish
post status.A future separate issue can then look at the UI and implementation for toggling the active status of a redirect.
Currently, the add new redirect page is found at edit.php?post_type=vip-legacy-redirect&page=add-redirect
.
This is created by registering a new admin page, via add_submenu_page()
.
However, since the purpose of the screen is just to add a new instance of a custom post type (something that WP already supports), we can go about this in a different way.
What we need is to allow the creation of new posts (vip-legacy-redirect
), and then register a callback that amends the metaboxes on the page, to add a custom metabox that includes the fields we need.
Among other benefits, would be that the Publish metabox would be visible - this allows the later supporting of scheduled publishing of redirects to make them live, and different redirect statuses. Until this we could likely remove these from the Publish metabox.
'create_posts' => 'do_not_allow'
capability. This will turn on the Add New button at the top of the Redirects list page (see #39), and add a new "Add New" page at post-new.php?post_type=vip-legacy-redirect
.It would be useful to have an export CLI command that would grab the current redirects out as a CSV file.
If we delete the redirect with wp_delete_post( WPCOM_Legacy_Redirector::get_redirect_post_id( '/<redirect>' ) );
, it still redirects from the trash, unless you force delete via wp_delete_post( WPCOM_Legacy_Redirector::get_redirect_post_id( '/<redirect>' ), true );
.
When calling the import-from-meta
CLI command I get this error Redirect is pointing to a Post ID that does not exist.
After doing some research it appears that the vip_legacy_redirect_parent_id
function is expecting the $post
variable to be a WP_Post
object, when in fact it is sometimes a post ID.
The PR below ensures that the $post
variable is a WP_Post
object before trying to access its post_parent
property.
A common situation we've had to deal with is handling AMP redirects. For example, if you redirect:
https://domain.com/old-path to https://domain.com/new-path
If you're running AMP, you technically also need to redirect:
https://domain.com/old-path/amp to https://domain.com/new-path/amp
This may not be a use case for everyone so I don't think it needs to be part of the core plugin code. You can filter the request path for determining the URL match with wpcom_legacy_redirector_request_path
to drop /amp to find the match, but there's no way to currently filter $redirect_uri
after it's returned to add it back on.
What we've done is add a filter after it's returned in maybe_do_redirect()
like:
$redirect_uri = apply_filters( 'wpcom_legacy_redirector_redirect_uri', $redirect_uri, $request_path );
I'd suggest adding the same to the core plugin code. Happy to make a PR for this, but unclear if this should be done against version 1.3.0 or 1.4.0.
To be clear, I understand you could just add redirects for both, but this seems inefficient to essentially double the size of the database entries when this is easily handled programmatically.
After the validation has been decoupled (#58), we should add a validate-csv
command, which would replace the idea of a --dry-run
flag when inserting via CSV files.
Note that this would be checking the data is valid before adding, and is different to the idea at #13 of verifying that stored redirects work.
Currently, if I import a redirect for /test
and visit example.com/test/ in my browser, I won't get redirected. The workaround is to upload redirects for /test/
and /test
to the same destination URL.
Instead, does it make sense to add a check here that basically does this?
if ( $request_path ) {
$redirect_uri = self::get_redirect_uri( $request_path );
if ( ! $redirect_uri ) {
if ( substr( $request_path , -1 )=='/' ) {
$redirect_uri = self::get_redirect_uri( rtrim( $request_path, '/' );
} else {
$redirect_uri = self::get_redirect_uri( $request_path . '/' );
}
}
if ( $redirect_uri ) {
header( 'X-legacy-redirect: HIT' );
$redirect_status = apply_filters( 'wpcom_legacy_redirector_redirect_status', 301, $url );
wp_safe_redirect( $redirect_uri, $redirect_status );
exit;
}
}
Here: https://github.com/Automattic/WPCOM-Legacy-Redirector/blob/master/wpcom-legacy-redirector.php#L108
I'm sure there's a more eloquent way to do this, but it seems silly to have to worry about both variations of a URL.
WordPress core includes a X-Redirect-By
header when wp_redirect()
or wp_safe_redirect()
is used. By default, this has a value of WordPress
.
This plugin has a header of:
HIT is typically thought of in relation to caching, rather than redirecting, so the use of it here may cause confusion.
Better would be to change
to add a third arg with a reference to the plugin name.This would mean that WP 5.1 is the minimum required version.
While it would be good to have individual redirects be editable, I can't think of any case where one would want to bulk edit multiple redirects at the same time.
(Maybe if there was an activated (draft) / deactivated (publish) feature, it would be useful?)
The bulk edit feature currently makes no sense:
Let's disable it for now.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.