RECCR Rensselaer Exploratory Center for Cheminformatics Research
Alternative Model Fusion

Co-Investigator: Mark J. Embrechts

Associate Professor, Department of Decision Sciences & Engineering Systems

Data Fusion - Integration of data from multiple sources

Background

Data fusion was first introduced in the radar sensing community. It refers to the process of combining data from multiple sensors or sources so that the resulting information or model is in some sense "better" than what could be obtained from any single source used individually. We have extended the idea of data fusion to molecular property analysis and prediction: rather than using different sensor sources, we use different descriptor fields for a set of molecules and apply data fusion techniques to improve the predictive performance of QSAR models on unknown cases. In this situation we use the term "auto-fusion" rather than data fusion, because the same molecules, and in certain cases the same descriptors, are used, but different preprocessing techniques, such as principal component analysis (PCA) and independent component analysis (ICA), extract different features from the data. It has been shown that kernel partial least squares (K-PLS) models in auto-fusion mode show a significant boost in performance compared to traditional K-PLS models. Note that this approach is distinct from the more familiar methods of consensus or bagged modeling, and it gives better predictive performance than those methods.
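The auto-fusion idea described above can be illustrated with a minimal sketch. It is not the RECCR implementation, and it makes several assumptions that do not come from this page: fusion is done by concatenating standardized PCA and ICA feature blocks, kernel ridge regression with an RBF kernel stands in for K-PLS, the descriptor data are synthetic, and all function names and parameter values (`auto_fuse`, `fit_predict`, `k`, `sigma`, `lam`) are hypothetical.

```python
import numpy as np

def pca_features(X, k):
    """Scores of the centered descriptors on the top-k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def ica_features(X, k, n_iter=200, seed=0):
    """Minimal symmetric FastICA (tanh nonlinearity) on PCA-whitened data."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    n = X.shape[0]
    Z = U[:, :k] * np.sqrt(n)  # whitened scores: Z.T @ Z / n is the identity
    W = np.random.default_rng(seed).standard_normal((k, k))
    for _ in range(n_iter):
        # Symmetric decorrelation: replace W by its nearest orthogonal matrix.
        Uw, _, Vtw = np.linalg.svd(W)
        W = Uw @ Vtw
        G = np.tanh(Z @ W.T)
        # Fixed-point update: E[z g(w'z)] - E[g'(w'z)] w, for every row w of W.
        W = (G.T @ Z) / n - np.diag((1.0 - G**2).mean(axis=0)) @ W
    Uw, _, Vtw = np.linalg.svd(W)
    return Z @ (Uw @ Vtw).T

def standardize(F):
    return (F - F.mean(axis=0)) / F.std(axis=0)

def auto_fuse(X, k):
    """Assumed fusion rule: place the PCA and ICA feature blocks side by side."""
    return np.hstack([standardize(pca_features(X, k)),
                      standardize(ica_features(X, k))])

def rbf_kernel(A, B, sigma):
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def fit_predict(F_train, y_train, F_test, sigma=3.0, lam=1e-2):
    """Kernel ridge regression, used here only as a stand-in for K-PLS."""
    K = rbf_kernel(F_train, F_train, sigma)
    y_mean = y_train.mean()
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), y_train - y_mean)
    return rbf_kernel(F_test, F_train, sigma) @ alpha + y_mean

# Synthetic stand-in for a descriptor matrix: 60 "molecules", 10 descriptors.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 10))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# Feature extraction is unsupervised, so for brevity it is run on all molecules
# at once; only the first 45 responses are used to fit the model.
F = auto_fuse(X, k=4)
y_pred = fit_predict(F[:45], y[:45], F[45:])
```

Both feature blocks are computed from the same molecules and the same descriptors, which is what separates this setup from fusing independent sources. A consensus or bagged approach would instead train a separate model per block or per resample and then average their predictions.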

[Figure: Data Fusion]

Copyright ©2005 Rensselaer Polytechnic Institute
All rights reserved worldwide.