PhpRiot

Creating A Fulltext Search Engine In PHP 5 With The Zend Framework's Zend Search Lucene

Querying Our Index

On the previous page we looked at how to write queries to search the index. We learned how to include and exclude terms, and also how to search different fields in our indexed data.

Now we will look at actually pulling documents from our index using those queries.

There are essentially two ways to query the index: passing the raw query in and letting Zend_Search_Lucene parse the query (ideal when you’re writing a search engine where you’re not sure what the user will enter), or by manually building up the query with API function calls.

In either case, you use the find() method on the index. The find() method returns a list of matches from your index.

Firstly though, you must open your existing index. To do this we use the static open() method from the Zend_Search_Lucene class. Like the create() method, this takes the filesystem path of the index as the first argument.

Listing 13 listing-13.php
<?php
    require_once('Zend/Search/Lucene.php');
 
    $indexPath = '/var/www/phpriot.com/data/docindex';
 
    $index = Zend_Search_Lucene::open($indexPath);
 
    $hits = $index->find('php +author:Quentin');
?>
Caution: If the index doesn't exist when you try to open it (or if you don't have sufficient permissions to read it) an exception will be thrown. You should handle these exceptions in your code by wrapping the call to open() in a try / catch.

This sample code searches our index for articles containing ‘php’, written by me. Note that when we opened our index, we did not pass the second parameter as we did when we created the index. This is because we are not writing to the index; we are only querying it.

We could also manually build this same query with function calls like so:

Listing 14 listing-14.php
<?php
    require_once('Zend/Search/Lucene.php');
 
    $indexPath = '/var/www/phpriot.com/data/docindex';
 
    $index = Zend_Search_Lucene::open($indexPath);
 
    $query = new Zend_Search_Lucene_Search_Query_MultiTerm();
    $query->addTerm(new Zend_Search_Lucene_Index_Term('php'), null);
    $query->addTerm(new Zend_Search_Lucene_Index_Term('Quentin', 'author'), true);
 
    $hits = $index->find($query);
?>

The second parameter of addTerm() determines whether or not a term is required. true means it is required (like putting a plus sign before the term), false means it is prohibited (like putting a minus sign before the term), and null means it is neither required nor prohibited.

The second parameter of Zend_Search_Lucene_Index_Term specifies the field to search. By default this is contents.
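
To make the correspondence between the two styles concrete, here is a minimal sketch that turns the same arguments into the equivalent raw query string. The termToQueryString() function is a hypothetical helper written for this illustration; it is not part of Zend_Search_Lucene and does not touch the index.

```php
<?php
    // Hypothetical helper (not part of Zend_Search_Lucene): maps the arguments
    // used in Listing 14 onto the raw query syntax used in Listing 13.
    function termToQueryString($text, $field = null, $sign = null)
    {
        // true => required (+), false => prohibited (-), null => neither
        if ($sign === true) {
            $prefix = '+';
        }
        else if ($sign === false) {
            $prefix = '-';
        }
        else {
            $prefix = '';
        }

        // a null field means "use the default field", so no field prefix is written
        $fieldPrefix = ($field === null) ? '' : $field . ':';

        return $prefix . $fieldPrefix . $text;
    }

    $parts = array(
        termToQueryString('php'),
        termToQueryString('Quentin', 'author', true)
    );

    echo implode(' ', $parts); // php +author:Quentin
```

Passing false as the third argument would produce a leading minus sign instead, matching the prohibited case described above.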

On the whole, it is easier to simply allow Zend_Search_Lucene to parse the query.

Dealing with returned results

The results found from your query are returned in an array, meaning you can simply use count() on the array to determine the number of hits.

Each element of the array is a hit object, and each of the indexed fields is available as a property of that object.

So to loop over the results as we indexed them previously (with a title, author and teaser), we would do the following:

Listing 15 listing-15.php
<?php
    require_once('Zend/Search/Lucene.php');
 
    $query = 'php +author:Quentin';
 
    $indexPath = '/var/www/phpriot.com/data/docindex';
 
    $index = Zend_Search_Lucene::open($indexPath);
 
    $hits = $index->find($query);
    $numHits = count($hits);
?>
 
<p>
    Found <?= $numHits ?> result(s) for query <?= htmlspecialchars($query) ?>.
</p>
 
<?php foreach ($hits as $hit) { ?>
    <h3><?= $hit->title ?> (score: <?= $hit->score ?>)</h3>
    <p>
        By <?= $hit->author ?>
    </p>
    <p>
        <?= $hit->teaser ?><br />
        <a href="<?= $hit->url ?>">Read more...</a>
    </p>
<?php } ?>

Here we also used an extra field called score. As mentioned previously, this is used as an indicator as to how well a document matched the query. Results with the highest score are listed first.

Creating a simple search engine

Using our code above, we can easily transform this into a simple site search engine. All we need to do is add a form and plug in the submitted query. Let’s assume this script is called search.php:

Listing 16 search.php
<?php
    require_once('Zend/Search/Lucene.php');
 
    $query = isset($_GET['query']) ? $_GET['query'] : '';
    $query = trim($query);
 
    $indexPath = '/var/www/phpriot.com/data/docindex';
 
    $index = Zend_Search_Lucene::open($indexPath);
 
    if (strlen($query) > 0) {
        $hits = $index->find($query);
        $numHits = count($hits);
    }
?>
 
<form method="get" action="search.php">
    <input type="text" name="query" value="<?= htmlspecialchars($query) ?>" />
    <input type="submit" value="Search" />
</form>
 
<?php if (strlen($query) > 0) { ?>
    <p>
        Found <?= $numHits ?> result(s) for query <?= htmlspecialchars($query) ?>.
    </p>
 
    <?php foreach ($hits as $hit) { ?>
        <h3><?= $hit->title ?> (score: <?= $hit->score ?>)</h3>
        <p>
            By <?= $hit->author ?>
        </p>
        <p>
            <?= $hit->teaser ?><br />
            <a href="<?= $hit->url ?>">Read more...</a>
        </p>
    <?php } ?>
<?php } ?>
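
Because search.php echoes user-supplied text and indexed field values back into the page, it may be worth escaping them before output. The following is a minimal sketch, assuming the indexed title and author are plain text rather than HTML. The helper name h() and the sample values are made up for illustration, and the stdClass object only stands in for a real hit returned by find().

```php
<?php
    // Hypothetical helper: escape a value for safe output in HTML.
    function h($value)
    {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }

    // Stand-in for one search hit; a real hit would come from $index->find().
    $hit = new stdClass();
    $hit->title  = 'PHP & <Lucene>';
    $hit->author = 'Quentin';

    // Special characters in the field values are converted to HTML entities.
    $line = '<h3>' . h($hit->title) . '</h3><p>By ' . h($hit->author) . '</p>';
    echo $line;
```

If the teaser was indexed as HTML on purpose, escaping it this way would show the markup as text, so whether to escape that field depends on how the documents were indexed.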

Error handling

The one thing we haven’t dealt with yet is errors in the search. For instance, if we were to type in title: with no term after it, an error would occur. We handle this by catching the Zend_Search_Lucene_Exception exception.

Listing 17 listing-17.php
<?php
    require_once('Zend/Search/Lucene.php');
 
    $query = isset($_GET['query']) ? $_GET['query'] : '';
    $query = trim($query);
 
    $indexPath = '/var/www/phpriot.com/data/docindex';
 
    $index = Zend_Search_Lucene::open($indexPath);
 
    try {
        $hits = $index->find($query);
    }
    catch (Zend_Search_Lucene_Exception $ex) {
        $hits = array();
    }
    $numHits = count($hits);
?>

Now, if an error occurs in the search, we simply assume zero hits were returned, handling the error without indicating to the user that anything went wrong.

Of course, you could also choose to get the error message from the exception and output that instead ($ex->getMessage()).
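
One way to keep the message available for display is sketched below. The findWithError() function is a hypothetical helper, not part of the Zend Framework. It assumes $index is the object returned by Zend_Search_Lucene::open() and relies only on its find() method. It catches the base Exception class, which also covers Zend_Search_Lucene_Exception.

```php
<?php
    // Hypothetical helper: run a search and report any error message
    // instead of silently discarding it.
    function findWithError($index, $query)
    {
        try {
            return array('hits' => $index->find($query), 'error' => null);
        }
        catch (Exception $ex) {
            // Zend_Search_Lucene_Exception is caught here too, as it extends Exception
            return array('hits' => array(), 'error' => $ex->getMessage());
        }
    }

    // Possible usage in search.php (shown as comments, not executed here):
    //   $result = findWithError($index, $query);
    //   if ($result['error'] !== null) {
    //       echo htmlspecialchars($result['error']);
    //   }
```

Returning both values lets the page decide whether to show the message to the user or just log it and display zero results.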

Article History

Apr 27, 2006
Initial article version
Dec 17, 2007
Updated to use Zend Framework 1.0.3