Memory Usage

Zend_Search_Lucene is a relatively memory-intensive module.

It uses memory to cache some information and optimize searching and indexing performance.

The amount of memory required depends on the mode of operation: searching, indexing, or optimization.

The terms dictionary index is loaded during a search. It contains every 128th [21] term of the full dictionary.

Thus memory usage increases with the number of unique terms. This can happen if you use untokenized phrases as field values or index a large volume of non-text information.
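As a rough illustration of the sampling described above, the following sketch counts how many terms end up in the in-memory dictionary index for a given number of unique terms. The function name and the sample figure are hypothetical, and the interval of 128 is the value stated above. The sketch counts terms only, because the bytes used per term depend on term length and on PHP internals.

```php
<?php
// Hypothetical helper, not part of Zend_Search_Lucene: approximates the number
// of terms held in memory when every $interval-th dictionary term is loaded.
function estimateLoadedIndexTerms($uniqueTerms, $interval = 128)
{
    // Round up so that a partially filled last interval still counts as one term.
    return (int) ceil($uniqueTerms / $interval);
}

// A dictionary with 1,280,000 unique terms keeps about 10,000 terms in memory.
echo estimateLoadedIndexTerms(1280000), "\n"; // prints 10000
```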

An unoptimized index consists of several segments, which also increases memory usage. Segments are independent, so each segment contains its own terms dictionary and terms dictionary index. If an index consists of N segments, memory usage may increase by up to N times in the worst case. Optimize the index to merge all segments into one and avoid this overhead.
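The merge described above can be triggered with the optimize() method of an opened index. The wrapper function below is a hypothetical helper written for this sketch, and the path in the usage comment is a placeholder.

```php
<?php
// Hypothetical helper: merges all segments of an opened index into one.
// $index is expected to be the object returned by Zend_Search_Lucene::open().
function mergeAllSegments($index)
{
    $index->optimize();
}

// Usage, assuming Zend Framework 1 is on the include path and
// '/path/to/index' stands for an existing index directory:
//
//   require_once 'Zend/Search/Lucene.php';
//   $index = Zend_Search_Lucene::open('/path/to/index');
//   mergeAllSegments($index);
```

After a full optimization a single terms dictionary index remains to be loaded during a search.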

Indexing uses the same memory as searching, plus memory for buffering documents. The amount of memory used for buffering can be controlled with the MaxBufferedDocs parameter.
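A minimal sketch of lowering the buffer size before a batch of additions is shown below. The helper function and the value 5 are assumptions chosen for illustration; a suitable value depends on document size and available memory.

```php
<?php
// Hypothetical helper: adds prepared documents using a smaller in-memory buffer.
// $index:     object returned by Zend_Search_Lucene::open() or ::create()
// $documents: array of Zend_Search_Lucene_Document instances
function addDocumentsWithSmallBuffer($index, array $documents, $maxBufferedDocs = 5)
{
    // Fewer buffered documents means less memory, but more frequent segment writes.
    $index->setMaxBufferedDocs($maxBufferedDocs);

    foreach ($documents as $document) {
        $index->addDocument($document);
    }

    // Write any documents that are still buffered to the index.
    $index->commit();
}
```

Smaller buffers produce more segments, so this setting trades indexing memory against the segment overhead that optimization later removes.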

Index optimization (full or partial) processes data as a stream and does not require much memory.

[21] The Lucene file format allows this number to be configured, but Zend_Search_Lucene doesn't expose it in its API. Nevertheless, you can still configure this value if the index is prepared with another Lucene implementation.

