CoverType

The original ForestCover/Covertype dataset from the UCI Machine Learning Repository is a multiclass classification dataset. It is used to predict forest cover type from cartographic variables only (no remotely sensed data). The study area includes four wilderness areas located in the Roosevelt National Forest of northern Colorado. These areas represent forests with minimal human-caused disturbances, so existing forest cover types are more a result of ecological processes than of forest management practices. The dataset has 54 attributes (10 quantitative variables, 4 binary wilderness area variables and 40 binary soil type variables). Here, the outlier detection dataset is created using only the 10 quantitative attributes. Instances from class 2 are treated as normal points and instances from class 4 as anomalies; instances from the other classes are omitted. The anomaly ratio is 0.9%.
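The construction described above can be sketched as follows. This is a minimal illustration rather than the script used to build the benchmark: it assumes the 54 attributes arrive as a NumPy array in the UCI column order (the 10 quantitative variables first), and the function name build_outlier_dataset is hypothetical.

```python
import numpy as np

# Assumption: the 10 quantitative variables are the first 10 of the
# 54 columns, as in the UCI column order.
N_QUANTITATIVE = 10
NORMAL_CLASS = 2    # treated as normal points
ANOMALY_CLASS = 4   # treated as anomalies


def build_outlier_dataset(X, y):
    """Return (features, is_anomaly) for the two-class outlier task.

    X: array of shape (n_samples, 54) with all attributes.
    y: array of shape (n_samples,) with cover type labels 1..7.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    # Drop every instance that belongs to neither class 2 nor class 4.
    keep = (y == NORMAL_CLASS) | (y == ANOMALY_CLASS)
    # Keep only the quantitative attributes.
    features = X[keep, :N_QUANTITATIVE].astype(float)
    # 1 marks an anomaly (class 4), 0 marks a normal point (class 2).
    is_anomaly = (y[keep] == ANOMALY_CLASS).astype(int)
    return features, is_anomaly
```

The anomaly ratio of the resulting dataset is then is_anomaly.mean().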

ForestCover is available on Aftershock. Normal observations are provided in the included training dataset, which has 10 dimensions per observation. During evaluation, the main program of your submission is expected to read /ingress/covertype/testing.csv, which has the same form as the development dataset, and to write sequentially aligned anomaly confidence values (in [0, 1]), one per test observation and in the same order, to /egress/covertype/predictions.csv.
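A submission's main program might be organised along the following lines. This is only a sketch under stated assumptions: the text above does not specify the exact CSV layout, so the code assumes a headerless, comma-separated file of numeric values and writes one confidence value per line. The names score_file, to_confidence and distance_from_mean are hypothetical, and the distance-based scorer is a placeholder, not one of the listed methods.

```python
import csv
import os

import numpy as np

# Paths taken from the task description.
TEST_PATH = "/ingress/covertype/testing.csv"
PRED_PATH = "/egress/covertype/predictions.csv"


def to_confidence(raw_scores):
    """Min-max rescale raw anomaly scores into [0, 1]."""
    raw = np.asarray(raw_scores, dtype=float)
    span = raw.max() - raw.min()
    if span == 0:
        # All scores equal: no observation stands out.
        return np.zeros_like(raw)
    return (raw - raw.min()) / span


def distance_from_mean(X):
    """Placeholder scorer: Euclidean distance to the column-wise mean."""
    return np.linalg.norm(X - X.mean(axis=0), axis=1)


def score_file(test_path, pred_path, raw_score_fn):
    """Score each test row and write one confidence per line, in order."""
    # Assumption: no header row, comma-separated numeric columns.
    X = np.loadtxt(test_path, delimiter=",", ndmin=2)
    conf = to_confidence(raw_score_fn(X))
    os.makedirs(os.path.dirname(pred_path) or ".", exist_ok=True)
    with open(pred_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        for value in conf:
            writer.writerow([f"{value:.6f}"])
    return conf
```

In the evaluation environment this would be invoked as score_file(TEST_PATH, PRED_PATH, distance_from_mean), with the placeholder scorer replaced by a model fitted on the training dataset.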

Scores

Method                              Authors                               AUC     Runtime (ms)
Bionic (Pre-Release)                K. Demetriou, I. Becker, S. Hailes    0.994      4,577.113
Baselines - Isolation Forest        K. Demetriou, I. Becker, S. Hailes    0.856      6,938.642
Baselines - One Class SVM           K. Demetriou, I. Becker, S. Hailes    0.732    141,668.249
Baselines - Robust Covariance       K. Demetriou, I. Becker, S. Hailes    0.702         38.984
Baselines - Local Outlier Factor    K. Demetriou, I. Becker, S. Hailes    0.503      4,566.676
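
For readers who want to check a submission locally, the AUC column can be approximated with the rank-based (Mann-Whitney) formulation of the area under the ROC curve. How the evaluation computes AUC is not stated above, so this is an assumption, and the function name roc_auc is hypothetical. An AUC of 0.5 corresponds to random ranking and 1.0 to every anomaly scoring above every normal point.

```python
import numpy as np


def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for anomalies, 0 for normal points.
    scores: anomaly confidence values, higher means more anomalous.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one anomaly and one normal point")

    # Sort the scores, then give tied scores the average of their 1-based ranks
    # so that a tie between an anomaly and a normal point counts as half a win.
    order = np.argsort(scores, kind="mergesort")
    _, first, counts = np.unique(
        scores[order], return_index=True, return_counts=True
    )
    avg_rank = first + (counts + 1) / 2.0
    ranks = np.empty(scores.size, dtype=float)
    ranks[order] = np.repeat(avg_rank, counts)

    # U counts the (anomaly, normal) pairs in which the anomaly ranks higher.
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)
```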