FathomFive is a classification-aware, Lucene-powered spidering and indexing solution written in pure Java. It supports a variety of content types and provides an easy-to-use admin interface and a customisable search interface. It spiders content from HTTP and OAI sources.
There's a new release out, 20080428. This contains a number of new features, including:
Update to support Visio
Update to support OOXML (Office Open XML) files, such as .xlsx and .docx