This page provides a MATLAB re-implementation of the methods described in Cawley and Talbot (2007), which approximately reproduces the experimental results given in that paper. The software requires the fminunc function from the MATLAB Optimization Toolbox; if you do not have this toolbox, you will need to alter the @graddesc object to use a different gradient descent optimiser. Note that the software is of "research quality": it has no documentation and is provided essentially unsupported.
Improved Experimental Method
The software provided will evaluate the LS-SVM and LS-SVM-BR (@l2lssvm in the software) over the suite of thirteen benchmark datasets used in the original paper. The results are not quite the same, as a different gradient descent optimiser is used, but the overall pattern is similar. The results provided below were obtained using a slightly improved methodology. In order to make sure that any difference in performance between the standard spherical RBF kernel and the elliptical ARD kernel is due to over-fitting the model selection criterion, rather than to local minima in the cost function, the ARD kernel is optimised twice: once starting from the equivalent optimal RBF kernel (so the model selection criterion cannot be worse than for the RBF kernel) and once from the default value (0.125). The solution with the lowest value of PRESS is then used. These results should be regarded as more reliable than those given in Cawley and Talbot (2007), as a more thoroughly tested gradient descent function was used and the experimental method is better.
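The double-restart strategy above can be sketched in a few lines. The following Python sketch (an illustrative re-implementation, not part of the MATLAB package; all function names and the ridge parameter are my own assumptions) computes the closed-form leave-one-out PRESS statistic for an LS-SVM / kernel ridge model, then optimises the ARD kernel parameters from both starting points and keeps whichever run attains the lower PRESS:

```python
import numpy as np
from scipy.optimize import minimize

def ard_kernel(X1, X2, log_eta):
    # Elliptical ARD kernel with per-feature inverse length-scales eta = exp(log_eta);
    # optimising in log-space keeps the length-scales positive.
    eta = np.exp(log_eta)
    d = (X1[:, None, :] - X2[None, :, :]) ** 2
    return np.exp(-np.sum(eta * d, axis=2))

def press(log_eta, X, y, lam=1e-2):
    # Closed-form leave-one-out PRESS for LS-SVM / kernel ridge regression:
    # with H = K + lam*I and alpha = H^{-1} y, the i-th leave-one-out
    # residual is e_i = alpha_i / (H^{-1})_{ii}.
    K = ard_kernel(X, X, log_eta)
    H = K + lam * np.eye(len(y))
    Hinv = np.linalg.inv(H)
    alpha = Hinv @ y
    e = alpha / np.diag(Hinv)
    return np.sum(e ** 2)

def select_ard(X, y, log_eta_rbf):
    # Optimise the ARD kernel twice: once from the optimal spherical RBF
    # solution (all length-scales tied to the RBF value) and once from the
    # default value 0.125, keeping the run with the lowest PRESS.
    starts = [np.full(X.shape[1], log_eta_rbf),
              np.full(X.shape[1], np.log(0.125))]
    best = min((minimize(press, s, args=(X, y)) for s in starts),
               key=lambda r: r.fun)
    return best.x, best.fun
```

Starting one run from the tied RBF solution guarantees the selected model's criterion value is never worse than the RBF model's, which is what makes the comparison in the table below a fair test of over-fitting rather than of optimiser luck.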
The results are shown in the table below; for each dataset, the best mean error rate is shown in bold and the worst is shown underlined. Note that the results for the RBF kernel are generally better than those for the ARD kernel. As the RBF kernel is a special case of the ARD kernel, this illustrates that over-fitting the model selection criterion is a genuine problem when using kernel learning methods where there are many kernel parameters to be determined. The LS-SVM-BR generally out-performs the LS-SVM for the ARD kernel, in some cases by a very substantial margin (e.g. heart), which shows that Bayesian regularisation of the hyper-parameters is beneficial. An even better approach to this problem is given in Cawley and Talbot (2014), where the kernel parameters are treated as parameters, rather than hyper-parameters.
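The claim that the spherical RBF kernel is a special case of the ARD kernel is easy to verify directly: tying all the ARD length-scales to a single shared value recovers the RBF kernel exactly. The short Python sketch below (function names are my own, not taken from the accompanying software) checks this numerically:

```python
import numpy as np

def ard_kernel(X1, X2, eta):
    # Elliptical ARD kernel: one inverse squared length-scale per input feature.
    d = (X1[:, None, :] - X2[None, :, :]) ** 2
    return np.exp(-np.sum(eta * d, axis=2))

def rbf_kernel(X1, X2, eta):
    # Spherical RBF kernel: a single inverse squared length-scale shared by all features.
    d = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=2)
    return np.exp(-eta * d)

X = np.random.randn(5, 4)
eta = 0.25
# Tying all ARD length-scales to eta recovers the spherical RBF kernel exactly.
assert np.allclose(ard_kernel(X, X, np.full(4, eta)), rbf_kernel(X, X, eta))
```

Because the RBF model is always available to the ARD optimiser, any deterioration in test error for the ARD kernel must come from the richer parameterisation over-fitting the selection criterion, not from reduced expressive power.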
- MATLAB software (jmlr2007a.zip)
There are eight experiments in all. Run the MATLAB script run_experiment in each directory, and the results will be written to a file called summary.txt in the directory [benchmark]/results/, where [benchmark] is the name of the benchmark dataset. The key experiments are:
- experiment001 - RBF LS-SVM
- experiment002 - RBF LS-SVM-BR
- experiment007 - ARD LS-SVM
- experiment008 - ARD LS-SVM-BR
The reproduction of the results presented here was carried out on the High Performance Computing Cluster supported by the Research and Specialist Computing Support service at the University of East Anglia.
- G. C. Cawley and N. L. C. Talbot, "Preventing over-fitting in model selection via Bayesian regularisation of the hyper-parameters", Journal of Machine Learning Research, volume 8, pages 841-861, April 2007. (pdf)
- G. C. Cawley and N. L. C. Talbot, "Kernel Learning at the First Level of Inference", Neural Networks, volume 53, pages 69-80, May 2014. (preprint)