Search Results Highlighting

Zend_Search_Lucene provides two options for search results highlighting.

The first is to use the Zend_Search_Lucene_Document_Html class (see the HTML documents section for details) with the following methods:

<?php
/**
 * Highlight text with specified color
 *
 * @param string|array $words
 * @param string $colour
 * @return string
 */
public function highlight($words, $colour = '#66ffff');

<?php
/**
 * Highlight text using specified View helper or callback function.
 *
 * @param string|array $words     Words to highlight, given as an array
 *                                or as a string.
 * @param callback     $callback  Callback used to transform (highlight)
 *                                the text.
 * @param array        $params    Array of additional callback parameters
 *                                passed through to it (the first,
 *                                non-optional parameter is the HTML
 *                                fragment to highlight).
 * @return string
 * @throws Zend_Search_Lucene_Exception
 */
public function highlightExtended($words, $callback, $params = array());

To customize highlighting behavior, use the highlightExtended() method with a callback that takes one or more parameters [12], or extend the Zend_Search_Lucene_Document_Html class and redefine the applyColour($stringToHighlight, $colour) method, which is used as the default highlighting callback. [13]

View helpers can also be used as callbacks in the context of a view script:

<?php
$doc->highlightExtended('word1 word2 word3...', array($this, 'myViewHelper'));

The result of the highlighting operation is retrieved with the Zend_Search_Lucene_Document_Html->getHTML() method.
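
The callback approach described above can be sketched roughly as follows. The function name wrapInSpan, the CSS class names, and the sample HTML are illustrative assumptions, not part of Zend Framework. The library calls are guarded, so the snippet also runs where Zend_Search_Lucene is not loaded.

```php
<?php
/**
 * Hypothetical highlighting callback (name and CSS class are assumptions).
 * The first parameter is the HTML fragment to highlight; any further
 * parameters come from the $params array given to highlightExtended().
 */
function wrapInSpan($stringToHighlight, $cssClass = 'hit')
{
    return '<span class="' . htmlspecialchars($cssClass) . '">'
         . $stringToHighlight . '</span>';
}

// The callback can be exercised on its own:
echo wrapInSpan('Lucene', 'match'), "\n";

// Used with the library, only if Zend_Search_Lucene is available:
if (class_exists('Zend_Search_Lucene_Document_Html')) {
    $doc = Zend_Search_Lucene_Document_Html::loadHTML(
        '<html><body><p>Lucene indexes text.</p></body></html>'
    );
    // 'match' is passed to the callback as its second argument.
    $doc->highlightExtended('lucene text', 'wrapInSpan', array('match'));
    echo $doc->getHTML();
}
```

Passing the CSS class through $params keeps the callback reusable. The same function could mark different word groups with different classes.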

Note

Highlighting is performed in terms of the current analyzer, so all forms of the word(s) recognized by the analyzer are highlighted.

E.g., if the current analyzer is case insensitive and we request highlighting of the word 'text', then 'text', 'Text', 'TEXT' and other case combinations will be highlighted.

In the same way, if the current analyzer supports stemming and we request highlighting of 'indexed', then 'index', 'indexing', 'indices' and other word forms will be highlighted.

On the other hand, if a word is skipped by the current analyzer (e.g. if a short words filter is applied to the analyzer), then nothing will be highlighted.
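
To make the note more concrete, here is a toy model of analyzer-dependent highlighting. It is not Zend_Search_Lucene code: the function toyHighlight, the <b> markup, and the minimum word length of 3 are all assumptions chosen for illustration. It mimics only a case-insensitive analyzer with a short words filter and does no stemming.

```php
<?php
/**
 * Toy model only (hypothetical, not the library's implementation).
 * Words are compared after lower-casing, and words shorter than
 * $minLength are treated as skipped by the analyzer.
 */
function toyHighlight($text, $word, $minLength = 3)
{
    $needle = strtolower($word);

    // A word dropped by the "short words" filter highlights nothing.
    if (strlen($needle) < $minLength) {
        return $text;
    }

    // Compare every ASCII word in the text with the normalized needle.
    return preg_replace_callback(
        '/[A-Za-z]+/',
        function ($match) use ($needle) {
            return strtolower($match[0]) === $needle
                ? '<b>' . $match[0] . '</b>'
                : $match[0];
        },
        $text
    );
}

echo toyHighlight('Text, text and TEXT', 'text'), "\n";
echo toyHighlight('to be or not to be', 'to'), "\n";
```

The first call marks all three case variants. The second returns its input unchanged because the requested word is shorter than the assumed minimum length.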

The second option is to use the Zend_Search_Lucene_Search_Query->highlightMatches(string $inputHTML[, $defaultEncoding = 'UTF-8'[, Zend_Search_Lucene_Search_Highlighter_Interface $highlighter]]) method:

<?php
$query = Zend_Search_Lucene_Search_QueryParser::parse($queryStr);
$highlightedHTML = $query->highlightMatches($sourceHTML);

The optional second parameter is the default HTML document encoding. It is used if the encoding is not specified using a Content-type HTTP-EQUIV meta tag.

The optional third parameter is a highlighter object, which has to implement the Zend_Search_Lucene_Search_Highlighter_Interface interface:

<?php
interface Zend_Search_Lucene_Search_Highlighter_Interface
{
    /**
     * Set document for highlighting.
     *
     * @param Zend_Search_Lucene_Document_Html $document
     */
    public function setDocument(Zend_Search_Lucene_Document_Html $document);

    /**
     * Get document for highlighting.
     *
     * @return Zend_Search_Lucene_Document_Html $document
     */
    public function getDocument();

    /**
     * Highlight specified words (method is invoked once per subquery)
     *
     * @param string|array $words  Words to highlight. They could be
     *                             organized using the array or string.
     */
    public function highlight($words);
}

Here, the Zend_Search_Lucene_Document_Html object is constructed from the source HTML provided to the Zend_Search_Lucene_Search_Query->highlightMatches() method.

If the $highlighter parameter is omitted, a Zend_Search_Lucene_Search_Highlighter_Default object is instantiated and used.

The highlighter's highlight() method is invoked once per subquery, so it can highlight each subquery differently.

In fact, the default highlighter does this by walking through a predefined color table. So you can implement your own highlighter, or just extend the default one and redefine the color table.
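
A custom highlighter along these lines might look like the sketch below. The class names ColourCycle and PaletteHighlighter and the two-color palette are hypothetical. The highlighter class is declared only when the library's interface is loaded, so the palette logic can be run on its own.

```php
<?php
/**
 * Hypothetical helper: hands out colours from a palette in round-robin
 * order, one colour per call (that is, one per subquery).
 */
class ColourCycle
{
    private $colours;
    private $index = 0;

    public function __construct(array $colours)
    {
        if (count($colours) === 0) {
            throw new InvalidArgumentException('Palette must not be empty.');
        }
        $this->colours = array_values($colours);
    }

    public function next()
    {
        $colour = $this->colours[$this->index];
        $this->index = ($this->index + 1) % count($this->colours);
        return $colour;
    }
}

// Declared only when Zend_Search_Lucene's interface is available.
if (interface_exists('Zend_Search_Lucene_Search_Highlighter_Interface')) {
    class PaletteHighlighter implements Zend_Search_Lucene_Search_Highlighter_Interface
    {
        private $doc;
        private $cycle;

        public function __construct(ColourCycle $cycle)
        {
            $this->cycle = $cycle;
        }

        public function setDocument(Zend_Search_Lucene_Document_Html $document)
        {
            $this->doc = $document;
        }

        public function getDocument()
        {
            return $this->doc;
        }

        public function highlight($words)
        {
            // Each subquery gets the next colour from the palette.
            $this->doc->highlight($words, $this->cycle->next());
        }
    }
}

$cycle = new ColourCycle(array('#ffff66', '#66ff66'));
echo $cycle->next(), ' ', $cycle->next(), ' ', $cycle->next(), "\n";
```

With the library loaded, an instance could be passed as the third argument, for example $query->highlightMatches($sourceHTML, 'UTF-8', new PaletteHighlighter($cycle)).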

Zend_Search_Lucene_Search_Query->htmlFragmentHighlightMatches() behaves similarly. The only difference is that it takes as input, and returns, an HTML fragment without the <HTML>, <HEAD> and <BODY> tags. Nevertheless, the fragment is automatically transformed to valid XHTML.



[12] The first is the HTML fragment to highlight; the others depend on the callback's behavior. The returned value is the highlighted HTML fragment.

[13] In both cases, the returned HTML is automatically transformed into valid XHTML.
