RECCR Rensselaer Exploratory Center for Cheminformatics Research
Targeted Task Models for Cheminformatics Process Development


Current QSPR models for ion-exchange chromatography predict protein retention times, but what matters for bioseparations is the relative order of displacement. The statistical learning theory underlying SVM suggests that we can obtain better results by directly modeling the displacement order of proteins as a ranking problem, rather than by trying to solve the harder problem of accurately modeling retention times (Vapnik, 1998). Highly nonlinear ranking methods have been developed by simply replacing the loss function used in SVM with a loss function appropriate for ranking (Joachims, 2002). In the past, PLS and K-PLS could not be readily adapted to other loss functions; as the name implies, PLS was created for least squares regression.

Recently we developed a novel dimensionality reduction method called Boosted Latent Factors (BLF) (Momma and Bennett, 2005). For any given loss function, BLF creates latent variables, or principal components, similar to those produced by PLS and PCA. We have extended BLF to ranking loss functions with great success. BLF can use the kernel approach of SVM and K-PLS to construct highly nonlinear ranking functions. For the least squares loss, BLF reduces to PLS, but we can now rapidly create learning methods for any convex loss function while maintaining the many benefits of PLS. For example, all of the feature selection and causal methodologies discussed in the Causal Chemometrics Modeling Module can be readily adapted to BLF. The 1-norm SVM feature selection and model interpretation methods developed for cheminformatics and chromatography can also be adapted to the BLF selection framework (Breneman et al., 2003).
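The idea of changing the loss function so that the model learns an ordering rather than exact retention times can be illustrated with a small sketch. The code below is not BLF and not the authors' implementation. It is a plain linear ranker trained with a pairwise hinge loss, in the spirit of the ranking SVM cited above. The function names, hyperparameters, and synthetic "descriptor" data are all assumptions made for illustration.

```python
import numpy as np

def pairwise_differences(X, y):
    """Return x_i - x_j for every pair (i, j) with y_i > y_j."""
    i, j = np.where(y[:, None] > y[None, :])
    return X[i] - X[j]

def fit_linear_ranker(X, y, C=1.0, lr=0.01, epochs=500):
    """Subgradient descent on 0.5*||w||^2 + C * mean(max(0, 1 - w.d)),
    where d runs over the pairwise difference vectors.
    Hypothetical hyperparameters; assumes at least one ordered pair exists."""
    D = pairwise_differences(X, y)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        active = (D @ w) < 1.0  # pairs that violate the unit margin
        grad = w - C * D[active].sum(axis=0) / len(D)
        w -= lr * grad
    return w

def pairwise_accuracy(scores, y):
    """Fraction of pairs with y_i > y_j that the scores order correctly."""
    i, j = np.where(y[:, None] > y[None, :])
    return float(np.mean(scores[i] > scores[j]))

# Synthetic stand-in data: 30 "proteins" with 5 made-up descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 0.0])
# The target is a nonlinear but monotone function of a linear score, so the
# ordering is linear even though the values themselves are not.
y = np.exp(X @ true_w)

w = fit_linear_ranker(X, y)
acc = pairwise_accuracy(X @ w, y)
print(f"fraction of correctly ordered pairs: {acc:.2f}")
```

The ranker never tries to reproduce the values in `y`; it only penalizes pairs placed in the wrong order or separated by too small a margin. That is why a linear model can still recover the ordering here even though the target values are an exponential function of the descriptors.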



Copyright ©2005 Rensselaer Polytechnic Institute
All rights reserved worldwide.