Zend_Search_Lucene provides two ways to highlight search
results.
The first is to use the Zend_Search_Lucene_Document_Html class
(see the HTML documents
section for details), which offers the following methods:
<?php
/**
 * Highlight text with the specified color
 *
 * @param string|array $words
 * @param string       $colour
 * @return string
 */
public function highlight($words, $colour = '#66ffff');
<?php
/**
 * Highlight text using the specified View helper or callback function.
 *
 * @param string|array $words    Words to highlight. They can be given
 *                               as an array or as a string.
 * @param callback     $callback Callback method used to transform
 *                               (highlight) the text.
 * @param array        $params   Array of additional callback parameters passed
 *                               through to it (the first, non-optional parameter
 *                               is the HTML fragment to highlight).
 * @return string
 * @throws Zend_Search_Lucene_Exception
 */
public function highlightExtended($words, $callback, $params = array());
To customize highlighting behavior, either call the highlightExtended()
method with a callback that takes one or more parameters [12],
or extend the Zend_Search_Lucene_Document_Html class and redefine the
applyColour($stringToHighlight, $colour) method, which is used as the
default highlighting callback. [13]
View helpers can also be used as callbacks in the context of a view script:
<?php
$doc->highlightExtended('word1 word2 word3...', array($this, 'myViewHelper'));
The result of the highlighting operation is retrieved with the
Zend_Search_Lucene_Document_Html->getHTML() method.
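The callback route described above can be illustrated with a minimal sketch. The function name wrapInSpan() and the CSS class search-hit are hypothetical, chosen only for this example; the library calls are left as comments because they assume Zend Framework 1 is on the include path and that $sourceHTML holds the page to highlight.

```php
<?php
/**
 * Hypothetical highlighting callback (not part of the library).
 *
 * The first argument is the HTML fragment to highlight; $cssClass is an
 * extra parameter that would be supplied through the $params array of
 * highlightExtended().
 */
function wrapInSpan($fragment, $cssClass = 'hit')
{
    // Escape the class name only; the fragment is already HTML.
    $class = htmlspecialchars($cssClass, ENT_QUOTES, 'UTF-8');

    return '<span class="' . $class . '">' . $fragment . '</span>';
}

// Standalone check of the callback itself:
echo wrapInSpan('lucene', 'search-hit'), "\n";
// prints: <span class="search-hit">lucene</span>

// With the library available, the callback would be wired in like this:
// $doc = Zend_Search_Lucene_Document_Html::loadHTML($sourceHTML);
// $doc->highlightExtended(array('lucene', 'search'), 'wrapInSpan', array('search-hit'));
// echo $doc->getHTML();
```

Marking matches with a CSS class rather than an inline color keeps the presentation in the stylesheet, which is one reason to prefer highlightExtended() over highlight().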
Note
Highlighting is performed in terms of the current analyzer, so all forms of the word(s) recognized by the analyzer are highlighted.
E.g. if the current analyzer is case insensitive and we request highlighting of the word 'text', then 'text', 'Text', 'TEXT' and other case combinations will be highlighted.
In the same way, if the current analyzer supports stemming and we request highlighting of 'indexed', then 'index', 'indexing', 'indices' and other word forms will be highlighted.
On the other hand, if a word is skipped by the current analyzer (e.g. if a short words filter is applied to the analyzer), then nothing will be highlighted.
The second option is to use
Zend_Search_Lucene_Search_Query->highlightMatches(string $inputHTML[,
$defaultEncoding = 'UTF-8'[,
Zend_Search_Lucene_Search_Highlighter_Interface $highlighter]]) method:
<?php
$query = Zend_Search_Lucene_Search_QueryParser::parse($queryStr);
$highlightedHTML = $query->highlightMatches($sourceHTML);
The optional second parameter is the default HTML document encoding. It is used if the encoding is not specified with a Content-type HTTP-EQUIV meta tag.
The optional third parameter is a highlighter object, which has to implement the
Zend_Search_Lucene_Search_Highlighter_Interface interface:
<?php
interface Zend_Search_Lucene_Search_Highlighter_Interface
{
    /**
     * Set document for highlighting.
     *
     * @param Zend_Search_Lucene_Document_Html $document
     */
    public function setDocument(Zend_Search_Lucene_Document_Html $document);

    /**
     * Get document for highlighting.
     *
     * @return Zend_Search_Lucene_Document_Html $document
     */
    public function getDocument();

    /**
     * Highlight specified words (method is invoked once per subquery)
     *
     * @param string|array $words Words to highlight. They can be given
     *                            as an array or as a string.
     */
    public function highlight($words);
}
Here the Zend_Search_Lucene_Document_Html object is the one
constructed from the source HTML passed to the
Zend_Search_Lucene_Search_Query->highlightMatches() method.
If the $highlighter parameter is omitted, a
Zend_Search_Lucene_Search_Highlighter_Default object is
instantiated and used.
The highlighter's highlight() method is invoked once per subquery, so
it can highlight each subquery differently.
The default highlighter does this by walking through a predefined color table. You can therefore implement your own highlighter, or simply extend the default one and redefine the color table.
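A custom highlighter along these lines might look like the following sketch. It assumes Zend Framework 1 is on the include path (an autoloader would replace the require_once lines). The class name My_ClassHighlighter, its wrap() helper and the hit-N class names are hypothetical; only the three interface methods, highlightExtended() and highlightMatches() come from the text above.

```php
<?php
// Assumption: Zend Framework 1 is on the include path.
require_once 'Zend/Search/Lucene.php';
require_once 'Zend/Search/Lucene/Search/QueryParser.php';
require_once 'Zend/Search/Lucene/Search/Highlighter/Interface.php';

/**
 * Hypothetical highlighter: marks the matches of each subquery with a
 * numbered CSS class (hit-0, hit-1, ...) instead of an inline color.
 */
class My_ClassHighlighter implements Zend_Search_Lucene_Search_Highlighter_Interface
{
    /** @var Zend_Search_Lucene_Document_Html */
    protected $_doc;

    /** @var int Number of highlight() calls so far, i.e. the subquery index */
    protected $_subquery = 0;

    public function setDocument(Zend_Search_Lucene_Document_Html $document)
    {
        $this->_doc = $document;
    }

    public function getDocument()
    {
        return $this->_doc;
    }

    public function highlight($words)
    {
        // Invoked once per subquery, so the counter identifies the subquery.
        $this->_doc->highlightExtended(
            $words,
            array(__CLASS__, 'wrap'),
            array('hit-' . $this->_subquery++)
        );
    }

    /** Callback: the first argument is the HTML fragment to highlight. */
    public static function wrap($fragment, $cssClass)
    {
        return '<span class="' . $cssClass . '">' . $fragment . '</span>';
    }
}

// Usage: pass the highlighter as the third argument of highlightMatches().
$sourceHTML = '<html><body><p>Lucene is a search library.</p></body></html>';
$query = Zend_Search_Lucene_Search_QueryParser::parse('lucene AND search');
echo $query->highlightMatches($sourceHTML, 'UTF-8', new My_ClassHighlighter());
```

Implementing the interface directly, rather than extending the default highlighter, avoids depending on how the default class stores its color table.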
Zend_Search_Lucene_Search_Query->htmlFragmentHighlightMatches() behaves
similarly. The only difference is that it takes as input, and returns, an
HTML fragment without the <HTML>, <HEAD> and <BODY> tags.
The fragment is nevertheless automatically transformed into valid XHTML.




