Creating Search Engine Friendly URLs In PHP
Using The Apache ForceType Directive
An alternative to mod_rewrite is the ForceType directive, which allows files without a .php extension to be executed as PHP scripts. Normally web servers are configured so that PHP scripts must end in .php, so that other files (such as .html files) don’t have to be processed by PHP.
Going back to our example in the mod_rewrite section, instead of having a script called news.php in the root directory, our script would just be called news, so it would be accessed without the .php extension in the URL. Using the following in our httpd.conf or a .htaccess file, the news file will be processed on the server as a PHP file.
<Files news>
    ForceType application/x-httpd-php
</Files>
Now, when we access our article using http://www.example.com/news/63.html, our news script is executed directly, and we must parse out the /63.html part. This is stored in the server variable $_SERVER['PATH_INFO']:
echo $_SERVER['PATH_INFO']; // outputs '/63.html'
So now we can use regular expressions to extract the number 63 from this string. Other techniques are also useful for extracting data, such as PHP’s explode() function. For example, if you explode this string on /, all parts of the path are stored in an array (there’s only one non-empty part in this example though, so it’s not worth doing). Anyway, back to regular expressions.
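As an aside, the explode() approach could be sketched roughly as follows. The split_path_info() name and the longer /2005/63.html path are made up for illustration and do not come from the example above. Because the string starts with a slash, a plain explode() would produce an empty first element, so this sketch trims the slashes first.

```php
<?php
// Hypothetical helper (the name is illustrative): split a PATH_INFO-style
// string into its segments.
function split_path_info(string $path): array
{
    // trim() removes leading/trailing slashes so explode() does not
    // produce empty elements at either end.
    $trimmed = trim($path, '/');

    // An empty path has no segments at all.
    if ($trimmed === '') {
        return [];
    }

    return explode('/', $trimmed);
}

print_r(split_path_info('/63.html'));       // one segment: 63.html
print_r(split_path_info('/2005/63.html'));  // two segments: 2005 and 63.html
```

This becomes more useful once a path carries several pieces of data; with a single segment the regular expression is simpler.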
Here is a regular expression (compatible with preg_match()) that matches a string consisting of exactly a slash, then a number, followed by .html. preg_match() stores the matches in an array, from which we extract the article ID.
$path = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
preg_match('!^/(\d+)\.html$!', $path, $matches);

// $matches[0] stores the entire matched string, while $matches[1]
// stores the string matched in the first set of brackets. We want it
// to be an int, so we simply cast it. If nothing matched, $matches[1]
// is not set, so we fall back to 0.
$news_id = isset($matches[1]) ? (int) $matches[1] : 0;
Normally we’d use / as the regex delimiter, but since we’re matching a slash, it’s tidier to use something different (in this case !). Additionally, we’re matching one or more (+) digits (\d), and we must escape the . since we’re matching a literal period (. normally means “any character”).
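To see how the pattern behaves on different inputs, here is a minimal sketch that wraps it in a function. The extract_news_id() name and the extra test paths are assumptions made for illustration; only /63.html comes from the example above.

```php
<?php
// Hypothetical helper (the name is illustrative): return the article ID from
// a path such as "/63.html", or 0 if the path is not in that format.
function extract_news_id(string $path): int
{
    // ! is the delimiter; ^ and $ anchor the match to the whole string;
    // (\d+) captures one or more digits; \. matches a literal period.
    if (preg_match('!^/(\d+)\.html$!', $path, $matches) === 1) {
        return (int) $matches[1];
    }

    return 0;
}

var_dump(extract_news_id('/63.html'));    // int(63)
var_dump(extract_news_id('/abc.html'));   // int(0)
var_dump(extract_news_id('/63.html/x'));  // int(0)
```

Checking the return value of preg_match() before reading $matches[1] avoids touching an array element that does not exist when the path fails to match.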
So that’s all there is to it. Now you can use $news_id accordingly in that script. Of course, if the path is in an incorrect format, $news_id comes out as 0 once it has been cast to an int. In other words, it’s still safe to plug in to your database query, even if the article doesn’t exist.
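To illustrate how the 0 fallback behaves further down the script, here is a minimal sketch that uses a plain array as a stand-in for the database. The $articles data and the find_article() name are made up for illustration. An ID of 0, or any ID with no matching entry, simply yields no article, which the script can then treat as "not found".

```php
<?php
// Stand-in for a database table, keyed by article ID (illustrative data).
$articles = [
    63 => 'Example article title',
];

// Hypothetical lookup: returns the article title, or null if none exists.
function find_article(array $articles, int $news_id): ?string
{
    return $articles[$news_id] ?? null;
}

foreach ([63, 0, 999] as $news_id) {
    $title = find_article($articles, $news_id);
    if ($title === null) {
        // In a real script this is where a 404 response would be sent.
        echo "Article $news_id: not found\n";
    } else {
        echo "Article $news_id: $title\n";
    }
}
```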