Using Soft Consensus Clustering for Combining Multiple Clusterings of Chemical Structures
Keywords: Consensus clustering, graph partitioning, molecular datasets, soft clustering
Consensus clustering has been shown to improve the robustness, novelty and stability of individual clusterings in many areas, including chemoinformatics. In this paper, a graph-based consensus method (the cluster-based similarity partitioning algorithm, CSPA) and a soft consensus clustering method were examined for combining multiple clusterings of chemical structures. The clusterings were evaluated on their ability to separate active from inactive molecules in each cluster. Experiments suggest that the soft consensus method can obtain better results than the hard consensus method (CSPA).
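The difference between hard and soft consensus inputs can be illustrated with a minimal sketch. It assumes that the hard method starts from a pairwise co-association matrix (the fraction of clusterings in which two molecules share a cluster), as CSPA does before its graph-partitioning step, and it uses an averaged dot product of membership vectors as one simple soft analogue. The function names, the toy data, and the soft similarity are illustrative assumptions, not the procedure used in the paper, and the graph-partitioning step is omitted.

```python
def hard_coassociation(labelings):
    """Fraction of clusterings in which each pair of objects shares a cluster."""
    r = len(labelings)
    n = len(labelings[0])
    return [[sum(lab[i] == lab[j] for lab in labelings) / r for j in range(n)]
            for i in range(n)]


def soft_coassociation(memberships):
    """Mean over clusterings of the dot product of two objects' membership vectors.

    Assumption: this is one simple soft similarity chosen for illustration only.
    """
    r = len(memberships)
    n = len(memberships[0])
    return [[sum(sum(a * b for a, b in zip(m[i], m[j])) for m in memberships) / r
             for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: 4 molecules, 2 base clusterings, 2 clusters each.
hard = [
    [0, 0, 1, 1],  # clustering 1: one cluster label per molecule
    [0, 1, 1, 1],  # clustering 2
]
soft = [
    [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.1, 0.9]],  # clustering 1: membership degrees
    [[0.8, 0.2], [0.4, 0.6], [0.3, 0.7], [0.1, 0.9]],  # clustering 2
]

S_hard = hard_coassociation(hard)
S_soft = soft_coassociation(soft)
```

In the toy data the second molecule switches clusters between the two hard clusterings, so its hard similarity to its neighbours is an all-or-nothing vote per clustering, while the soft matrix keeps the graded membership information.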
Copyright of articles that appear in Jurnal Teknologi belongs exclusively to Penerbit Universiti Teknologi Malaysia (Penerbit UTM Press). This copyright covers the rights to reproduce the article, including reprints, electronic reproductions, or any other reproductions of similar nature.