PhpRiot

Creating Search Engine Friendly URLs In PHP

Using The Apache ForceType Directive

An alternative to mod_rewrite is Apache’s ForceType directive, which allows a file without a .php extension to be executed as a PHP script. Normally web servers are configured so that only files ending in .php are handled by PHP, which means other files (such as .html files) don’t have to be processed by PHP.

Going back to our example in the mod_rewrite section, instead of having a script called news.php in the root directory, our script would just be called news. So it would be accessed using http://www.example.com/news.

Using the following in our httpd.conf or a .htaccess file, this news file will be processed on the server as a PHP file.

Listing 5 .htaccess
<Files news>
    ForceType application/x-httpd-php
</Files>

Now, when we access our article using http://www.example.com/news/63.html, our news script is accessed directly, and we must parse out the /63.html part. This is stored in the server variable PATH_INFO.

Listing 6 news
<?php
    echo $_SERVER['PATH_INFO'];
    // outputs '/63.html'
?>

So now we can use regular expressions to extract the number 63 from this string. There are other techniques you will find useful for extracting data too, such as PHP’s explode() function. For example, if you trim the leading slash and then explode this string on /, each part of the path is stored in its own array element (there’s only one part in this example though, so it’s not worth doing).
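
Before moving on, here is a minimal sketch of the explode() approach mentioned above, for paths with more than one part. The helper name path_segments() and the sample path /2006/01/63.html are assumptions made for illustration; they are not part of the original example.

```php
<?php
// Hypothetical helper (illustration only): split a PATH_INFO-style
// string into its segments.
function path_segments($path)
{
    // trim() removes the leading and trailing slashes so that explode()
    // does not return empty elements at either end of the array.
    $trimmed = trim($path, '/');

    // explode() on an empty string would return array(''), so handle
    // the "no extra path" case explicitly.
    if ($trimmed === '') {
        return array();
    }

    return explode('/', $trimmed);
}

// Assumed sample path with three parts.
$segments = path_segments('/2006/01/63.html');
// $segments is array('2006', '01', '63.html')
```

Each segment still needs validating before use (for example, checking that the last one ends in .html), which is where the regular expression comes in.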

Here is a regular expression (compatible with preg_match()) that matches a string consisting of exactly a slash, then a number, followed by .html. The matches are stored in an array, from which we extract the article ID.

Listing 7 news
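<?php
    // PATH_INFO is not set when nothing follows the script name in the URL
    $path = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
    $news_id = 0;

    // $matches[0] will store the entire matched string, while $matches[1]
    // stores the string matched in the first set of brackets. We want it
    // to be an int, so we simply cast it. If the path does not match,
    // $matches[1] is never read and $news_id stays at 0.
    if (preg_match('!^/(\d+)\.html$!', $path, $matches)) {
        $news_id = (int) $matches[1];
    }
?>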

Normally we’d use / as the regex delimiter, but since we’re matching a slash, it’s tidier to use something different (in this case !). Additionally, we’re matching one or more (+) digits (\d), and we must escape the . since we’re matching a literal period (. normally means “any character”).
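
To see what each part of the pattern does, the following sketch runs the same expression against a few sample paths. The paths other than /63.html are made up for illustration.

```php
<?php
$pattern = '!^/(\d+)\.html$!';

// preg_match() returns 1 when the pattern matches and 0 when it does not.
$results = array(
    '/63.html'   => preg_match($pattern, '/63.html'),   // matches
    '/63xhtml'   => preg_match($pattern, '/63xhtml'),   // the escaped . only matches a literal period
    '/abc.html'  => preg_match($pattern, '/abc.html'),  // \d+ requires at least one digit
    '/63.html/x' => preg_match($pattern, '/63.html/x'), // $ anchors the match to the end
);

foreach ($results as $path => $matched) {
    echo $path, ' => ', $matched ? 'match' : 'no match', "\n";
}
```

Only the first path matches; the other three fail for the reasons given in the comments.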

So that’s all there is to it. Now you can use $news_id accordingly in that script. If the path is in an incorrect format, $news_id comes out as 0, so the value is always an integer and is safe to plug in to your database query, even if no article with that ID exists.
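
One way to act on that 0 value is to treat it as "article not found". The sketch below is only an illustration under that assumption: the helper name news_id_from_path() and the response text are hypothetical, and a real script would go on to look the ID up in its own database.

```php
<?php
// Hypothetical helper (illustration only): returns the article ID from a
// PATH_INFO-style string, or 0 if the path is not in the expected format.
function news_id_from_path($path)
{
    if (preg_match('!^/(\d+)\.html$!', $path, $matches)) {
        return (int) $matches[1];
    }

    return 0;
}

// PATH_INFO is not set when nothing follows the script name in the URL.
$path = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$news_id = news_id_from_path($path);

if ($news_id === 0) {
    // Malformed or missing path: answer with a 404 rather than querying
    // the database. Headers can only be changed before output starts.
    if (!headers_sent()) {
        header('HTTP/1.1 404 Not Found');
    }
    echo 'Article not found';
} else {
    // At this point the script would load the article for $news_id.
    echo 'Showing article ', $news_id;
}
```

A well-formed path can still point at an ID with no matching article, so the database lookup needs its own "not found" branch as well.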

Article History

Jan 10, 2006
Initial article version
Feb 28, 2008
Added the "Using mod_rewrite as a 404 Handler" page to the article