Generating Apache style .htaccess redirects from Drupal's Path Redirect module

Date: Fri Sep 30 2011
The "Path Redirect" module for Drupal 6 (http://drupal.org/project/path_redirect) is an excellent way to set up redirects from one URL to another. It can be used any time you want HTTP requests for URLs on a Drupal website to redirect automatically to another URL. For example:-
  • You might want a nice URL to promote for some excellent product (example.com/excellent-camera) that lands on a merchant website while hiding the complexity of the merchant URL (making sure to include the affiliate link).
  • Or you might have changed the URL alias of some postings (using pathauto?) and need to redirect visitors from the old URL to the new one.
  • Or there might be a mangled URL on a third-party website pointing at your website, and you want to make sure those visitors land on the correct page.
  • Or you might have deleted a posting.
  • Or you might be moving your content from one website to another website.

The related functionality in Apache is the Redirect and RedirectMatch configuration directives. The purpose is identical, except that in Apache the redirects are declared statically in a configuration file (such as a .htaccess file), whereas the Path Redirect module stores the data in a database table. Hence every redirect handled by Drupal costs a few database accesses.

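There are several reasons to generate Apache-style .htaccess Redirect directives from Path Redirect's table, such as avoiding those database accesses or keeping old URLs working after content moves to another site. What we have in this tutorial is a simple drush-based technique for doing so.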

Note in passing that Path Redirect is only for Drupal 6. Despite the maintainers' "#D7CX" pledge, there is no Drupal 7 version, and the apparent longer-term plan is to move this functionality into Drupal core. In the meantime there is a new module for Drupal 7, still in development, that serves as a base for the several modules looking to provide redirection capabilities (http://drupal.org/project/redirect).

I started to rathole on a rant about upgrade paths and delays in getting modules forward-ported to Drupal 7. But that would detract from my goal here, which is to present a simple drush-based technique for generating a .htaccess file from Path Redirect's table.

Here's the basic idea:-


drush sqlq 'select source,redirect from path_redirect;' | sed 1d | sed 's/.*/Redirect permanent \/&/' >htaccess
echo Redirect permanent / http://...domain.../ >>htaccess

Because that's a dense bit of shell scripting, let's take it apart and see how it ticks.

The first thing to go over is the schema of the path_redirect table:-

mysql> describe path_redirect;
+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| rid       | int(11)          | NO   | PRI | NULL    | auto_increment | 
| source    | varchar(255)     | NO   | MUL | NULL    |                | 
| redirect  | varchar(255)     | NO   |     | NULL    |                | 
| query     | varchar(255)     | YES  |     | NULL    |                | 
| fragment  | varchar(50)      | YES  |     | NULL    |                | 
| language  | varchar(12)      | NO   |     |         |                | 
| type      | smallint(6)      | NO   |     | NULL    |                | 
| last_used | int(10) unsigned | NO   |     | 0       |                | 
+-----------+------------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)

If you inspect the fields of this table you see "type" is the HTTP code for the redirect (e.g. 301 for permanent redirect). You see that "source" is the relative URL (no leading slash) within the Drupal site. And that "redirect" is the URL the browser will be redirected to, with "query" and "fragment" being additional pieces of the URL.

The first step of this script is the basic query to access the data we need to generate Redirect directives:-


drush sqlq 'select source,redirect from path_redirect;'

The "drush sqlq" command passes the given SQL command to the active database. In this case I'm requesting only the source and redirect columns, because I know all of my redirects are permanent ones (code=301). If you had a mix of temporary and permanent redirects this should read:


drush sqlq 'select type,source,redirect from path_redirect;'

This dumps out a simple text-oriented, tab-separated table from the database. However, the first row of the output holds the column names. While that might be useful for importing into a spreadsheet, it's not useful for our purpose. Hence we want to remove the first line:-


drush sqlq 'select source,redirect from path_redirect;' | sed 1d 

The first sed fragment deletes that first row of the output.
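To see what this stage does without a live Drupal site, here is a small sketch that feeds made-up rows through the same sed command. The sample paths and the function name strip_header are invented for illustration; only the "sed 1d" part comes from the pipeline above.

```shell
# Hypothetical helper: drop the column-name row that "drush sqlq" prints first.
strip_header() {
  sed 1d
}

# Made-up sample rows standing in for real "drush sqlq" output (tab-separated).
printf 'source\tredirect\nold-page\thttp://example.com/new-page\n' | strip_header
```

Only the data row should survive; the "source redirect" heading line is gone.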

The next step is to massage this into Apache directives. The relevant documentation is on the Apache website (http://httpd.apache.org/docs/2.0/mod/mod_alias.html). What we're looking to create is lines in this format:-


Redirect [status] URL-path URL

We can see the output from the "drush sqlq" is almost in this format except for:-

  • We need the word "Redirect" at the beginning of the line
  • The status is optional, and can be the words "permanent" or "temporary" or the HTTP code such as "301".
  • The URL-path cannot be a relative path, whereas the "source" column in the table is a relative path.

Adding a second sed snippet to the command takes care of this:-

drush sqlq 'select source,redirect from path_redirect;' | sed 1d | sed 's/.*/Redirect permanent \/&/'

What this does is match the whole line (the ".*") and replace it with "Redirect permanent /" followed by the original line, which is what the "&" stands for in sed.
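If the table mixes temporary and permanent redirects, the fixed word "permanent" no longer fits. One possible approach, sketched here with awk rather than sed, is to read the three-column output of the "select type,source,redirect" query and pass the numeric code straight through as the status, which the Redirect directive accepts. The function name rows_to_redirects is made up for this sketch, and it assumes no spaces or tabs appear inside the paths themselves.

```shell
# Hypothetical helper: turn tab-separated "type source redirect" rows
# (header row included) into Apache Redirect directives.
# Assumes the paths contain no spaces or tabs.
rows_to_redirects() {
  # NR > 1 skips the column-name row; $1 is the HTTP code, $2 the
  # relative source path (gets a leading slash), $3 the target URL.
  awk -F '\t' 'NR > 1 { printf "Redirect %s /%s %s\n", $1, $2, $3 }'
}

# Intended use (needs a working drush site):
#   drush sqlq 'select type,source,redirect from path_redirect;' | rows_to_redirects >htaccess
```

Because awk skips the header row itself, the separate "sed 1d" stage is not needed in this variant.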

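The final line is needed only if you're sending all traffic from an old Drupal site to a new site. Maybe you want specific URLs on the old site to redirect visitors to matching new URLs on the new site, while sending everything else to the new site as well. Note that Redirect appends whatever follows the matched prefix to the target, so a request for /foo on the old site lands on /foo on the new one, not strictly on its front page.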


echo Redirect permanent / http://...domain.../
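The order of the lines matters here: mod_alias applies the first Redirect that matches, and "Redirect permanent /" matches every path, so the catch-all belongs at the end of the file. As a rough sketch, the whole pipeline can be wrapped in one function that always emits it last. The name make_htaccess and the placeholder URL http://new.example.com/ are invented for illustration; substitute your own.

```shell
# Hypothetical wrapper: read "source redirect" rows (header included) on stdin,
# emit the specific redirects first and the site-wide catch-all last.
# $1 is the base URL of the new site (a placeholder in the usage below).
make_htaccess() {
  new_site=$1
  # Same two sed stages as the pipeline in this article.
  sed 1d | sed 's/.*/Redirect permanent \/&/'
  # Catch-all goes last so the specific rules above get first chance to match.
  echo "Redirect permanent / $new_site"
}

# Intended use (needs a working drush site):
#   drush sqlq 'select source,redirect from path_redirect;' | make_htaccess http://new.example.com/ >htaccess
```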

There you go. The script given above directs the output to a file named "htaccess". Do with it as you wish, and good luck.