Automated Microarray Classification Challenge
Overview of the Challenge
The diagnosis of cancer on the basis of gene expression profiles is well
established, so much so that microarray classification has become one of the
classic applications of machine learning in computational biology. The aim
of this challenge is to determine the best fully automated approach
to microarray classification. An unusual feature of the competition is that
instead of submitting predictions on test cases, the competitors submit a
MATLAB implementation of their algorithm (R and Java interfaces are also in
development), which is then tested off-line by the challenge organisers.
This tests the true operational value of the method, in the hands of an
operator who is not necessarily an expert in a given technique.
The first prize will include a free registration to ICMLA'08.
- 10 March 2008 - challenge opens
- 15 July 2008 - challenge closes
- ?? December 2008 - ICMLA 2008 special session - results announced
The submissions will be evaluated using a set of representative benchmark
datasets, the identity of which will only be disclosed after the challenge has
ended. A re-sampling scheme will be used to estimate the generalisation
performance of each method, using the Area Under the Receiver Operating
Characteristic (AUROC) statistic. The scoring function is designed to reward
only statistically significant differences in performance and sparsity, and
is computed as follows:
- The Wilcoxon signed rank test is used to determine the statistical
significance of all pairwise differences in the AUROC score between
algorithms. For every statistically significant difference, the
winner is awarded one point and the loser zero points.
- If the difference in AUROC is not significant, the Wilcoxon signed
rank test is used again, this time to determine whether there is a
statistically significant difference in sparsity (the proportion of
features discarded). For every statistically significant difference,
the winner is awarded one point and the loser zero points.
- If there are no statistically significant differences in AUROC
or sparsity, both classifiers are awarded half a point each.
- The score achieved by a classifier for a given benchmark dataset is
then the average number of points scored in all pairwise comparisons
against the competing algorithms.
- The overall score is the average score over all benchmark datasets.
In this way, the scoring function aims to provide a ranking of techniques,
primarily based on statistically significant differences in predictive
performance, but favouring sparser models in cases where there is no
significant difference in predictive performance.
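The scoring rules can be sketched in a few lines of Python. This is only an
illustration under stated assumptions, not the organisers' implementation:
the function names (signed_rank_test, dataset_scores, overall_scores) are
hypothetical, the significance level of 0.05 is assumed because the challenge
description does not state one, and the signed rank test uses a plain normal
approximation with zero differences discarded. The winner of a significant
comparison is taken to be the algorithm favoured by the sign of the test
statistic, which is also an assumption.

```python
import math
from itertools import combinations
from statistics import mean


def signed_rank_test(x, y):
    """Two-sided Wilcoxon signed rank test on paired samples.

    Returns (p_value, x_is_higher). Assumption: normal approximation with no
    continuity or tie correction; zero differences are discarded.
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    if n == 0:
        return 1.0, False  # identical samples: no evidence of a difference
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank for tied values
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2)), z > 0


def dataset_scores(auroc, sparsity, alpha=0.05):
    """Score every algorithm on one benchmark dataset.

    auroc and sparsity map an algorithm name to its per-fold values, with
    folds paired across algorithms. alpha=0.05 is an assumed threshold.
    """
    if len(auroc) < 2:
        raise ValueError("need at least two algorithms to compare")
    points = {name: 0.0 for name in auroc}
    for a, b in combinations(auroc, 2):
        p, a_higher = signed_rank_test(auroc[a], auroc[b])
        if p >= alpha:
            # AUROC difference not significant: fall back to sparsity
            p, a_higher = signed_rank_test(sparsity[a], sparsity[b])
        if p < alpha:
            points[a if a_higher else b] += 1.0  # winner 1 point, loser 0
        else:
            points[a] += 0.5  # no significant difference on either measure
            points[b] += 0.5
    opponents = len(auroc) - 1
    return {name: pts / opponents for name, pts in points.items()}


def overall_scores(per_dataset):
    """Average the per-dataset scores over all benchmark datasets."""
    return {name: mean(s[name] for s in per_dataset) for name in per_dataset[0]}
```

Calling dataset_scores once per benchmark dataset and passing the resulting
list to overall_scores gives the final ranking. Because sparsity is consulted
only when the AUROC comparison is not significant, a sparser model can never
overtake one that is significantly more accurate.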
Benefits of the Challenge Design
- The design is a test of the algorithm rather than the ability of the
operator to fine tune their classifier for a particular dataset. It
therefore gives a better indication of the operational value of an
algorithm in the field.
- Selection bias is eliminated as the algorithms must be fully
automated, and will be applied completely independently in each fold
of the resampling scheme used in performance estimation.
- At the end of the challenge, the datasets and submitted code will be
made publicly available. This will provide a valuable resource for
the development of new microarray classification techniques, by
facilitating fair and direct evaluation against state-of-the-art