Terrier - Terabyte Retriever
Terrier originated from a framework first developed in 2001, which has since been substantially extended and optimised. The system is based on parameter-free Divergence From Randomness probabilistic models for automatic document ranking and query expansion. Unlike other information retrieval systems, Terrier learns from empirical data and adapts to users’ information needs and queries.
The framework offers a modular API for querying, supporting uses as diverse as experiments on standard test collections and Web, intranet or desktop search engines.
Terrier performs very well compared with other publicly available technologies that aim to provide similar retrieval facilities.
Terrier has been used successfully in internationally acclaimed forums, and several public organisations in the Netherlands, Italy and the USA have expressed interest in using Terrier for their intranet search facilities.
The University of Glasgow is now seeking to develop this system further and would welcome interest from potential collaborators.
Key Benefits
- Written in cross-platform Java.
- Highly compressed disk data structures.
- Handling large-scale document collections.
- Indexing of tagged document collections, as well as documents in various formats such as HTML, PDF, or Microsoft Word, Excel and PowerPoint files.
- Direct file for efficient query expansion.
- Interactive querying application.
- Indexing of field information.
- Indexing of position information at the word or block level.
- Parameter-free document weighting and query expansion models.
- Support for classic retrieval models, such as tf-idf, BM25 and Rocchio's query expansion.
Applications
- Web searching
- Desktop searching
- Intranet searching
- Information retrieval testbed
- Full-text searching
- Multilingual information retrieval
- Query performance prediction
- Adaptive retrieval approaches
- XML searching
IP Status
A core version of Terrier is currently distributed as open source software under the Mozilla Public License (MPL). It can be downloaded from http://ir.dcs.gla.ac.uk/terrier
If you would like further information about this opportunity, please fill out the form below. Your enquiry will be passed on to the relevant University, which will respond to you directly.