Phrase Query

Phrase Queries can be used to search for a phrase within documents.

Phrase Queries are very flexible and allow the user or developer to search for exact phrases as well as 'sloppy' phrases.

Phrases can also contain gaps, or several terms in the same position; the analyzer can generate either for different purposes. For example, a term can be duplicated to increase its weight, or several synonyms can be placed into a single position.

<?php
$query1 = new Zend_Search_Lucene_Search_Query_Phrase();

// Add 'word1' at 0 relative position.
$query1->addTerm(new Zend_Search_Lucene_Index_Term('word1'));

// Add 'word2' at 1 relative position.
$query1->addTerm(new Zend_Search_Lucene_Index_Term('word2'));

// Add 'word3' at 3 relative position.
$query1->addTerm(new Zend_Search_Lucene_Index_Term('word3'), 3);

...

// Same query as $query1, built in one step: terms plus their offsets.
$query2 = new Zend_Search_Lucene_Search_Query_Phrase(
    array('word1', 'word2', 'word3'), array(0, 1, 3)
);

...

// Query without a gap.
$query3 = new Zend_Search_Lucene_Search_Query_Phrase(
    array('word1', 'word2', 'word3')
);

...

// Query restricted to the 'annotation' field.
$query4 = new Zend_Search_Lucene_Search_Query_Phrase(
    array('word1', 'word2'), array(0, 1), 'annotation'
);

A phrase query can be constructed in one step with a class constructor or step by step with Zend_Search_Lucene_Search_Query_Phrase::addTerm() method calls.

The Zend_Search_Lucene_Search_Query_Phrase class constructor takes three optional arguments:

<?php
Zend_Search_Lucene_Search_Query_Phrase(
    [array $terms[, array $offsets[, string $field]]]
);

The $terms parameter is an array of strings that contains a set of phrase terms. If it's omitted or equal to NULL, then an empty query is constructed.

The $offsets parameter is an array of integers that contains offsets of terms in a phrase. If it's omitted or equal to NULL, then the terms' positions are assumed to be sequential with no gaps.

The $field parameter is a string that indicates the document field to search. If it's omitted or equal to NULL, then the default field is searched.

Thus:

<?php
$query = new Zend_Search_Lucene_Search_Query_Phrase(array('zend', 'framework'));

will search for the phrase 'zend framework' in all fields.

<?php
$query = new Zend_Search_Lucene_Search_Query_Phrase(
    array('zend', 'download'), array(0, 2)
);

will search for the phrase 'zend ????? download' and match 'zend platform download', 'zend studio download', 'zend core download', 'zend framework download', and so on.

<?php
$query = new Zend_Search_Lucene_Search_Query_Phrase(
    array('zend', 'framework'), null, 'title'
);

will search for the phrase 'zend framework' in the 'title' field.
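
To make the offset semantics above concrete, here is a minimal sketch in plain PHP (7.1 or later). It does not use Zend_Search_Lucene and is not how the library is implemented: phraseMatches() is a hypothetical helper that checks whether a list of tokens contains the given terms at the given relative offsets, and it treats omitted offsets as sequential, as the $offsets description states.

```php
<?php
// Hypothetical helper, not part of Zend_Search_Lucene: reports whether $terms
// occur in $tokens at the given relative offsets (sequential when omitted).
function phraseMatches(array $tokens, array $terms, ?array $offsets = null): bool
{
    $tokens = array_values($tokens);
    $terms  = array_values($terms);
    if ($terms === []) {
        return false; // in this sketch an empty phrase matches nothing
    }
    $offsets = $offsets ?? array_keys($terms); // defaults to 0, 1, 2, ...

    // Try every token position as the position of the first term.
    for ($start = 0; $start < count($tokens); $start++) {
        $found = true;
        foreach ($terms as $i => $term) {
            $position = $start + $offsets[$i] - $offsets[0];
            if (!isset($tokens[$position]) || $tokens[$position] !== $term) {
                $found = false;
                break;
            }
        }
        if ($found) {
            return true;
        }
    }
    return false;
}

$tokens = explode(' ', 'get the zend studio download here');
var_dump(phraseMatches($tokens, ['zend', 'download'], [0, 2])); // bool(true): one-word gap
var_dump(phraseMatches($tokens, ['zend', 'download']));         // bool(false): no gap allowed
```

The gap is positional only: any single token between 'zend' and 'download' satisfies offsets 0 and 2, which is why the query matches 'zend studio download' and 'zend core download' alike.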

Zend_Search_Lucene_Search_Query_Phrase::addTerm() takes two arguments, a required Zend_Search_Lucene_Index_Term object and an optional position:

<?php
Zend_Search_Lucene_Search_Query_Phrase::addTerm(
    Zend_Search_Lucene_Index_Term $term[, integer $position]
);

The $term parameter describes the next term in the phrase. It must indicate the same field as previous terms, or an exception will be thrown.

The $position parameter indicates the term's position in the phrase. If it's omitted, the term is placed immediately after the previously added term.

Thus:

<?php
$query = new Zend_Search_Lucene_Search_Query_Phrase();
$query->addTerm(new Zend_Search_Lucene_Index_Term('zend'));
$query->addTerm(new Zend_Search_Lucene_Index_Term('framework'));

will search for the phrase 'zend framework'.

<?php
$query = new Zend_Search_Lucene_Search_Query_Phrase();
$query->addTerm(new Zend_Search_Lucene_Index_Term('zend'), 0);
$query->addTerm(new Zend_Search_Lucene_Index_Term('download'), 2);

will search for the phrase 'zend ????? download' and match 'zend platform download', 'zend studio download', 'zend core download', 'zend framework download', and so on.

<?php
$query = new Zend_Search_Lucene_Search_Query_Phrase();
$query->addTerm(new Zend_Search_Lucene_Index_Term('zend', 'title'));
$query->addTerm(new Zend_Search_Lucene_Index_Term('framework', 'title'));

will search for the phrase 'zend framework' in the 'title' field.

The slop factor sets the number of other words permitted between specified words in the query phrase. If set to zero, then the corresponding query is an exact phrase search. For larger values this works like the WITHIN or NEAR operators.

The slop factor is in fact an edit distance, where the edits correspond to moving terms in the query phrase. For example, to switch the order of two words requires two moves (the first move places the words atop one another), so to permit re-orderings of phrases, the slop factor must be at least two.

More exact matches are scored higher than sloppier matches; thus, search results are sorted by exactness. The slop is zero by default, requiring exact matches.

The slop factor can be assigned after query creation:

<?php
// Query without a gap.
$query = new Zend_Search_Lucene_Search_Query_Phrase(array('word1', 'word2'));

// Search for 'word1 word2', 'word1 ... word2'
$query->setSlop(1);
$hits1 = $index->find($query);

// Search for 'word1 word2', 'word1 ... word2',
// 'word1 ... ... word2', 'word2 word1'
$query->setSlop(2);
$hits2 = $index->find($query);
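
The edit-distance reading of the slop factor can be checked with a small sketch in plain PHP (7.1 or later). It is a simplified model and not the library's actual matching code: minimalSlop() is a hypothetical helper that assumes the query terms sit at positions 0, 1, 2, ... and returns the smallest slop at which the phrase would match a list of tokens. For each term it computes a shift (document position minus query position); the slop needed is the spread between the largest and smallest shift.

```php
<?php
// Hypothetical helper, not part of Zend_Search_Lucene: smallest slop at which
// the phrase $terms (query positions 0, 1, 2, ...) matches $tokens under a
// simplified model. Returns null if the phrase is empty or a term is missing.
function minimalSlop(array $tokens, array $terms): ?int
{
    $tokens = array_values($tokens);
    $terms  = array_values($terms);
    if ($terms === []) {
        return null;
    }

    // Each candidate alignment is tracked as [smallest shift, largest shift].
    $alignments = [[null, null]];
    foreach ($terms as $queryPos => $term) {
        $docPositions = array_keys($tokens, $term, true);
        if ($docPositions === []) {
            return null; // the term does not occur at all
        }
        $next = [];
        foreach ($alignments as [$min, $max]) {
            foreach ($docPositions as $docPos) {
                $shift = $docPos - $queryPos;
                $next[] = [
                    $min === null ? $shift : min($min, $shift),
                    $max === null ? $shift : max($max, $shift),
                ];
            }
        }
        $alignments = $next;
    }

    // The slop an alignment needs is the spread of its shifts; keep the best.
    return min(array_map(function (array $a) {
        return $a[1] - $a[0];
    }, $alignments));
}

$texts = ['word1 word2', 'word1 x word2', 'word1 x y word2', 'word2 word1'];
foreach ($texts as $text) {
    echo $text, ' => ', minimalSlop(explode(' ', $text), ['word1', 'word2']), "\n";
}
// word1 word2 => 0
// word1 x word2 => 1
// word1 x y word2 => 2
// word2 word1 => 2
```

The results agree with the comments in the example above: a slop of 1 admits the first two texts, while a slop of 2 also admits the two-word gap and the swapped order. The sketch enumerates every combination of term positions, so it is only suitable for short inputs.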
