User Performance versus Precision Measures for Simple Search Tasks
Andrew Turpin
Falk Scholer
School of Computer Science and Information Technology,
RMIT University,
Melbourne, Australia.
Status
Proc. 29th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR2006),
Seattle,
to appear July 2006.
Abstract
Several recent studies have demonstrated that the types of improvement in
information retrieval system effectiveness reported in forums such as SIGIR and
TREC do not translate into a benefit for users.
Two of the studies used an instance recall task, and a third used a question
answering task, so perhaps it is unsurprising that the precision-based measures
of IR system effectiveness on one-shot query evaluation do not correlate with
user performance on these tasks.
In this study, we evaluate two different information retrieval tasks on TREC
Web-track data: a precision-based user task, measured by the length of time
that users need to find a single document that is relevant to a TREC topic;
and a simple recall-based task, represented by the total number of relevant
documents that users can identify within five minutes.
Users employ search engines with controlled mean average precision (MAP) of
between 55% and 95%.
Our results show that there is no significant relationship between system
effectiveness measured by MAP and the precision-based task.
A significant but weak relationship is present for the
precision-at-one-document-returned metric.
A weak relationship is present between MAP and the simple recall-based task.
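
For readers unfamiliar with the two system-side measures named above, the following is a minimal sketch of how average precision, MAP, and precision at one document returned are conventionally computed from binary relevance judgements. It is not the evaluation code used in the study; the function names and the list-of-0/1 input format are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def average_precision(rels: Sequence[int],
                      num_relevant: Optional[int] = None) -> float:
    """Average precision for one ranked list.

    rels holds binary judgements in rank order (1 = relevant, 0 = not).
    num_relevant is the total number of relevant documents for the topic;
    if omitted, it is assumed that all of them appear in rels.
    """
    total = sum(rels) if num_relevant is None else num_relevant
    if total == 0:
        return 0.0
    hits = 0
    score = 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            score += hits / rank  # precision at this relevant document
    return score / total


def mean_average_precision(runs: Sequence[Sequence[int]]) -> float:
    """MAP: the mean of average precision over a set of topics."""
    return sum(average_precision(r) for r in runs) / len(runs)


def precision_at_k(rels: Sequence[int], k: int = 1) -> float:
    """Fraction of the top k returned documents that are relevant."""
    return sum(rels[:k]) / k
```

Under these assumptions, a ranked list judged as `[1, 0, 1, 0]` has an average precision of (1/1 + 2/3) / 2, and its precision at one document returned is 1.0 because the first result is relevant.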