  • Poster presentation
  • Open access

Estimating the applicability domain of kernel based QSPR models using classical descriptor vectors

The spread of machine-learning-based property prediction methods (e.g. QSAR, QSPR, …) has raised the question of how reliable an individual prediction is. This has led to the development of methods for estimating the reliability of a model-based prediction.

There are two principal approaches to meeting this demand: estimating the expected deviation from the prediction (e.g. with Gaussian processes), or classifying each compound according to whether the model is applicable to it or not. The latter approach has become known as estimating the applicability domain (AD) [1, 2] of a model. One drawback of most AD estimation methods is that they are based on the spatial embedding of the training dataset in the descriptor space. These algorithms are therefore not directly suited to modelling the applicability domain of kernel-based predictors, which work in an extremely high-dimensional implicit feature space.

In this study we examined to what extent a standard descriptor-based AD model can be used to describe the applicability domain of a predictor based on the optimal assignment kernel [3]. We split the popular Huuskonen [4] logS dataset 2:1 into a training set and a test set and compared several standard AD methods [1, 2] (range-based, convex hull, leverage, …) by how well the estimated AD correlates with the test error. The results indicate that it is possible to estimate the applicability domain of a kernel-based model using classical descriptor encodings of the molecules. Furthermore, the results show that there are significant differences between the methods. In our application the geometrical convex hull approach performed best.
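To make the comparison concrete, the following is a minimal sketch of two of the descriptor-based AD measures named above (range-based and leverage) and of how an AD score might be correlated with the test error. It is not the implementation used in the study: the function names, the synthetic descriptor matrices, the placeholder errors, and the leverage warning threshold 3(p+1)/n (a commonly used rule of thumb) are all assumptions made for illustration. The convex hull test is omitted for brevity; in low-dimensional descriptor spaces it could be built, for example, on `scipy.spatial.Delaunay`.

```python
import numpy as np

def range_violations(X_train, X_query):
    """Range-based AD: per query compound, count the descriptors that
    fall outside the [min, max] interval seen in the training set."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return ((X_query < lo) | (X_query > hi)).sum(axis=1)

def leverage(X_train, X_query):
    """Leverage-based AD: h = 1/n + (x - mu)^T (Xc^T Xc)^+ (x - mu),
    i.e. the hat-matrix diagonal of a linear model with intercept."""
    n = X_train.shape[0]
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    G_inv = np.linalg.pinv(Xc.T @ Xc)   # pseudo-inverse tolerates collinear descriptors
    Q = X_query - mu
    return 1.0 / n + np.einsum("ij,jk,ik->i", Q, G_inv, Q)

def ad_error_correlation(ad_score, abs_error):
    """Pearson correlation between an AD score and the absolute test error."""
    return float(np.corrcoef(ad_score, abs_error)[0, 1])

# Synthetic placeholders (assumption): real use would take molecular
# descriptor vectors and the absolute errors of the kernel-based model.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
X_test = rng.normal(scale=1.5, size=(100, 5))
abs_error = rng.random(100)

h = leverage(X_train, X_test)
n_out = range_violations(X_train, X_test)

# Rule-of-thumb warning threshold (assumption): h* = 3 (p + 1) / n
n, p = X_train.shape
h_star = 3 * (p + 1) / n
outside_ad = (h > h_star) | (n_out > 0)

print("compounds flagged as outside the AD:", int(outside_ad.sum()))
print("correlation of leverage with |error|:", ad_error_correlation(h, abs_error))
```

Because the errors here are random placeholders, the printed correlation carries no meaning; with real test errors, a higher correlation would indicate that the AD measure tracks the reliability of the kernel-based predictions more closely.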

References

  1. Jaworska J, Nikolova-Jeliazkova N, Aldenberg T: ATLA. 2005, 445-459.

  2. Netzeva TI, Worth AP, Aldenberg T: ATLA. 2005, 1-19.

  3. Fröhlich H, Wegner JK, Zell A: Proc Int Joint Conf Neur Net (IJCNN). 2005, 913-918.

  4. Huuskonen J: J Chem Inf Mod. 2000, 773-777.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Fechner, N., Hinselmann, G., Schmiedl, C. et al. Estimating the applicability domain of kernel based QSPR models using classical descriptor vectors. Chemistry Central Journal 2 (Suppl 1), P2 (2008). https://doi.org/10.1186/1752-153X-2-S1-P2

  • DOI: https://doi.org/10.1186/1752-153X-2-S1-P2