Automated Microarray Classification Challenge

Overview of the Challenge

The diagnosis of cancer on the basis of gene expression profiles is well established, so much so that microarray classification has become one of the classic applications of machine learning in computational biology. The aim of this challenge is to determine the best fully automated approach to microarray classification. An unusual feature of the competition is that instead of submitting predictions on test cases, the competitors submit a MATLAB implementation of their algorithm (R and Java interfaces are also in development), which is then tested off-line by the challenge organisers. This tests the true operational value of the method in the hands of an operator who is not necessarily an expert in a given technique.

The first prize will include a free registration to ICMLA'08.

Important dates


The submissions will be evaluated using a set of representative benchmark datasets, the identity of which will only be disclosed after the challenge has ended. A re-sampling scheme will be used to estimate the generalisation performance of each method, using the Area Under the Receiver Operating Characteristic (AUROC) statistic. The scoring function is designed to reward only statistically significant differences in performance and sparsity, and is computed as follows:
  1. The Wilcoxon signed rank test is used to determine the statistical significance of all pairwise differences in the AUROC score between algorithms. For every statistically significant difference, the winner is awarded one point and the loser zero points.
  2. If the difference in AUROC is not significant, the Wilcoxon signed rank test is used again, this time to determine whether there is a statistically significant difference in sparsity (the proportion of features discarded). For every statistically significant difference, the winner is awarded one point and the loser zero points.
  3. If there is no statistically significant difference in either AUROC or sparsity, each classifier is awarded half a point.
  4. The score achieved by a classifier for a given benchmark dataset is then the average number of points scored in all pairwise comparisons against the competing algorithms.
  5. The overall score is the average score over all benchmark datasets.
In this way, the scoring function aims to provide a ranking of techniques, primarily based on statistically significant differences in predictive performance, but favouring sparser models in cases where there is no significant difference in predictive performance.
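
The scoring steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the organisers' implementation: the significance level (alpha = 0.05), the layout of the results dictionary, and all function names are hypothetical, and the Wilcoxon signed-rank test is implemented here as an exact two-sided test (zero differences dropped, tied differences given mid-ranks) so that the example needs only the standard library.

```python
from itertools import combinations


def signed_rank_p(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (p_value, direction); direction is +1 if x tends to exceed y,
    -1 if y tends to exceed x, and 0 if neither. Zero differences are dropped.
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)
    if n == 0:
        return 1.0, 0
    # Mid-ranks of |d|, doubled so that they are always integers.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks2 = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # twice the average of ranks i+1..j+1
        i = j + 1
    total = sum(ranks2)
    w_plus = sum(r for r, di in zip(ranks2, d) if di > 0)
    # counts[s] = number of sign assignments whose positive (doubled) ranks sum to s
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in ranks2:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    w_min = min(w_plus, total - w_plus)
    p = min(1.0, 2.0 * sum(counts[: w_min + 1]) / 2 ** n)
    direction = (w_plus > total - w_plus) - (w_plus < total - w_plus)
    return p, direction


def pairwise_points(a, b, alpha=0.05):
    """Steps 1-3: points for (a, b) from one comparison on one dataset.

    a and b are dicts holding per-resample lists under 'auroc' and 'sparsity'
    (a hypothetical layout); higher is better for both quantities.
    """
    for key in ("auroc", "sparsity"):  # sparsity is consulted only as a tie-break
        p, direction = signed_rank_p(a[key], b[key])
        if p < alpha:
            return (1.0, 0.0) if direction > 0 else (0.0, 1.0)
    return 0.5, 0.5


def dataset_scores(results, alpha=0.05):
    """Step 4: average points per algorithm over all pairwise comparisons."""
    names = sorted(results)
    points = {name: 0.0 for name in names}
    for a, b in combinations(names, 2):
        pa, pb = pairwise_points(results[a], results[b], alpha)
        points[a] += pa
        points[b] += pb
    return {name: points[name] / (len(names) - 1) for name in names}


def overall_scores(all_results, alpha=0.05):
    """Step 5: average the per-dataset scores over all benchmark datasets."""
    per_dataset = [dataset_scores(r, alpha) for r in all_results]
    return {name: sum(s[name] for s in per_dataset) / len(per_dataset)
            for name in per_dataset[0]}
```

For example, with ten resamples, an algorithm whose AUROC matches a competitor's on every resample but which discards more features each time would win that comparison on sparsity alone. With very few resamples the exact test cannot reach significance at the assumed level, so every comparison would end in half a point each.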

Benefits of the Challenge Design


