Busvebacken

Using Xapian, you can dramatically improve search performance in moin and unlock additional features (see the search prefixes above) that are not possible with the legacy search engine.

Setting it up

Requirements

You must have Xapian itself and its Python bindings (xapian-core and xapian-bindings) from http://www.xapian.org/ installed, in version 1.0.0 or later.
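To verify the requirement above, a small standalone check along the following lines may help. It is only a sketch: the helper names (parse_version, installed_xapian_version, xapian_is_usable) are made up for this example and are not part of moin; xapian.version_string() is the version call provided by the Xapian Python bindings.

```python
import re

# Minimum version stated in the requirements above.
MINIMUM = (1, 0, 0)


def parse_version(text):
    """Turn a string such as '1.4.22' into a tuple of ints (1, 4, 22)."""
    parts = [int(part) for part in re.findall(r"\d+", text)[:3]]
    while len(parts) < 3:  # pad short versions such as '1.2' to (1, 2, 0)
        parts.append(0)
    return tuple(parts)


def installed_xapian_version():
    """Return the Xapian version string, or None if the bindings are missing."""
    try:
        import xapian
    except ImportError:
        return None
    return xapian.version_string()


def xapian_is_usable(version_text):
    """True if a version string is present and at least MINIMUM."""
    return version_text is not None and parse_version(version_text) >= MINIMUM


if __name__ == "__main__":
    version = installed_xapian_version()
    print("Xapian version:", version)
    print("usable for moin:", xapian_is_usable(version))
```

Run it with the same Python interpreter that runs your wiki, since the bindings are installed per interpreter.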

To process attachment files, moin uses filter plugins. The following filter plugins are included:

File type                  | Dependency | Notes
---------------------------+------------+------
Text files (.txt)          | -          | tries utf-8 and iso-8859-15 encodings (or forces to ASCII if those do not work)
JPEG images (.jpg)         | -          | EXIF data is extracted
Open Office files (.sx?)   | -          | e.g. from older OpenOffice.org/StarOffice versions
Open Document files (.od?) | -          | e.g. from recent OpenOffice.org/StarOffice versions
Binary files               | -          | moin uses a strings-like filter to process those, as well as a blacklist of content you don't want to search
MS Word files (.doc)       | antiword   | filter calls antiword
MS Excel files (.xls)      | catdoc     | filter calls xls2csv
PDF files (.pdf)           | xpdf-utils | filter calls pdftotext

After installing additional filters (or their dependencies) you should (re)build your index. Xapian will find the new filters / support packages automagically. After the rebuild, your search results may contain results linking directly to your attachments.
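Before rebuilding, it can be useful to check which of the external programs listed above are actually on your PATH. The following is a minimal sketch, not part of moin: the mapping and the function name find_filter_programs are assumptions made for this example, and the program names are taken from the list above (with pdftotext as the PDF converter).

```python
import shutil

# File type -> external program the filter calls (from the list above).
FILTER_PROGRAMS = {
    "MS Word (.doc)": "antiword",
    "MS Excel (.xls)": "xls2csv",
    "PDF (.pdf)": "pdftotext",
}


def find_filter_programs(programs=FILTER_PROGRAMS):
    """Map each file type to the program's full path, or None if not on PATH."""
    return {file_type: shutil.which(name) for file_type, name in programs.items()}


if __name__ == "__main__":
    for file_type, path in find_filter_programs().items():
        print("%-16s %s" % (file_type, path or "not found"))
```

A file type reported as "not found" will most likely not be indexed until the corresponding package is installed.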

Configuration

In your wikiconfig, the following options configure Xapian:

Option               | Default | Description
---------------------+---------+------------
xapian_search        | False   | if True, enables Xapian search
xapian_index_dir     | None    | if set, a separate index directory is used for every wiki, distinguished by wikiname; useful for wiki farms to keep indices separate (note: requires rebuilding the index)
xapian_index_history | True    | if True, instructs the indexer to index all revisions of a page so that users can search the page history (note: requires rebuilding the index)
xapian_stemming      | False   | if True, enables stemming of terms in Xapian (note: requires rebuilding the index)
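As an illustration, the options above might appear in a wikiconfig like this. Treat it as a sketch: the index path is a hypothetical placeholder, and the class is shown without a base class to keep the fragment self-contained, whereas a real wikiconfig derives its Config class from moin's default configuration class.

```python
class Config(object):  # in a real wikiconfig this subclasses moin's default config
    # Switch searching over to Xapian.
    xapian_search = True

    # Hypothetical location; each wiki gets its own index directory,
    # distinguished by wikiname.
    xapian_index_dir = "/path/to/xapian-indexes"

    # Index all revisions of a page so users can search the history.
    xapian_index_history = True

    # Stemming stays disabled in this example.
    xapian_stemming = False
```

Changing any of the last three options requires rebuilding the index.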

(Re-)Building an index

You can use the supplied command line tool moin to build an index for the first time, to rebuild it completely, or to update an existing index.

To build your index the first time, execute

moin --config-dir=/where/your/configdir/is --wiki-url=wiki-url/ index build --mode=add

in your command line. You can check the status of Xapian and its index on SystemInfo.
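If you want to trigger the build from a script (for example from a maintenance job), the command shown above can be assembled and run as follows. This is a sketch under assumptions: the function names are invented for this example, and only the options that appear in the command above are used.

```python
import subprocess


def index_build_command(config_dir, wiki_url, mode="add"):
    """Assemble the argument list for the index build command shown above."""
    return [
        "moin",
        "--config-dir=%s" % config_dir,
        "--wiki-url=%s" % wiki_url,
        "index",
        "build",
        "--mode=%s" % mode,
    ]


def build_index(config_dir, wiki_url, mode="add"):
    """Run the command and return its exit status."""
    return subprocess.call(index_build_command(config_dir, wiki_url, mode))


if __name__ == "__main__":
    # Only print the command here; call build_index() to actually run it.
    print(" ".join(index_build_command("/where/your/configdir/is", "wiki-url/")))
```

Passing the arguments as a list avoids shell quoting problems if the config directory contains spaces.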

Moreover, the following modes can be passed to the command above to control the building of the index:

Warning: you must rebuild your index if you change any of the xapian_index_history, xapian_index_dir or xapian_stemming configuration options!

Testing

You can test whether Xapian is enabled and whether an index is available by checking SystemInfo. To check whether searches are performed using Xapian, enable show_timings in your wikiconfig, perform a search and look for _xapianSearch at the bottom of the page.

Usage

Xapian is used in basically the same way as any other search engine. Xapian's advanced features made it possible to introduce some new search term prefixes that are not available in the legacy search engine (commonly referred to as moin search). See HelpOnSearching for more information, or use the new advanced search dialogue on FindPage to see what is available and possible.