Automated Microarray Classification Challenge

Overview of the Challenge

The diagnosis of cancer on the basis of gene expression profiles is well established, so much so that microarray classification has become one of the classic applications of machine learning in computational biology. The aim of this challenge is to determine the best fully automated approach to microarray classification. An unusual feature of the competition is that instead of submitting predictions on test cases, the competitors submit a MATLAB implementation of their algorithm (R and Java interfaces are also in development), which is then tested off-line by the challenge organisers. This test the true operational value of the method, in the hands of an operator who is not neccessarily an expert in a given technique.

The first prize will include a free registration to ICMLA'08.

Important dates

10 March 2008 - challenge opens
15 July 2008 - challenge closes
?? December 2008 - ICMLA 2008 special session - results announced

Evaluation

The submissions will be evaluated using a set of representative benchmark datasets, the identity of which will only be disclosed after the challenge has ended. A re-sampling scheme will be used to estimate the generalisation performance of each method, using the Area Under the Receiver Operating Characteristic (AUROC) statistic. The scoring function is designed to reward only statistically significant differences in performance and sparsity, and is computed as follows:

The Wilcoxon signed rank test is used to determine the statistical significance of all pairwise differences in the AUROC score between algorithms. For every statistically significant difference, the winner is awarded one point and the loser zero points.
If the difference in AUROC is not significant, the Wilcoxon signed rank test is used again, this time to determine whether there is a statistically significant difference in sparsity (the proportion of features discarded). For every statistically significant difference, the winner is awarded one point and the loser zero points.
If there are no statistically significant differences in AUROC or sparsity, both classifiers are awarded half a point each.
The score achieved by a classifier for a given benchmark dataset is then the average number of points scored in all pairwise comparisons against the competing algorithms.
The overall score is the average score over all benchmark datasets.

In this way, the scoring function aims to provide a ranking of techniques, primarily based on statistically significant differences in predictive performance, but favouring sparser models in cases where there is no significant difference in predictive performance.

Benefits of the Challenge Design

The design is a test of the algorithm rather than the ability of the operator to fine tune their classifier for a particular dataset. It therefore gives a better indication of the operational value of an algorithm in the field.
Selection bias is eliminated as the algorithms must be fully automated, and will be applied completely independently in each fold of the resampling scheme used in performance estimation.
At the end of the challenge, the datasets and submitted code will be made publically available. This will provide a valuable resource for the development of new microarray classification techniques, by facilitating fair and direct evaluation against state-of-the-art methods.

Organisers:

Advisors:

Prof. Geoff McLachlan
Mr Geoffrey Guile

Website implementation:

Dr Nicola Talbot

University of East Anglia