Schema Matching Quality: Thesaurus as the Matcher
DOI:
https://doi.org/10.11113/jt.v70.3514
Keywords:
Schema matching, thesaurus, information retrieval, performance
Abstract
Thesauri are used in many Information Retrieval (IR) applications, such as data integration, data warehousing, semantic query processing, and classifiers. They have also been used to solve the schema matching problem. Because many thesauri exist for any given area of knowledge, the quality of schema matching results obtained with different thesauri in the same field is not predictable. In this paper, we propose a methodology for studying the performance of a thesaurus in solving schema matching. The paper also presents the results of experiments using different thesauri. Precision, recall, F-measure, and similarity average were calculated to show that the quality of matching changes according to the thesaurus used.
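As an illustration of the four measures named in the abstract, the following minimal Python sketch scores one matcher run against a manually prepared reference alignment. The function name `match_quality`, the data layout, and the sample element names are hypothetical, and "similarity average" is assumed here to be the mean similarity score over all correspondences the matcher returned; the paper's exact definition may differ.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]  # (source element, target element)


def match_quality(found: Dict[Pair, float], reference: Set[Pair]) -> Dict[str, float]:
    """Score a matcher's output against a reference alignment.

    found     -- correspondences returned by the matcher, with similarity scores
    reference -- correspondences judged correct by a human expert
    """
    found_pairs = set(found)
    true_positives = len(found_pairs & reference)

    # Precision: share of returned correspondences that are correct.
    precision = true_positives / len(found_pairs) if found_pairs else 0.0
    # Recall: share of correct correspondences that were returned.
    recall = true_positives / len(reference) if reference else 0.0
    # F-measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    # Assumed definition: mean similarity over all returned correspondences.
    similarity_average = sum(found.values()) / len(found) if found else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "similarity_average": similarity_average,
    }


# Hypothetical output of one thesaurus-based matcher on two small schemas.
found = {
    ("author", "writer"): 0.9,
    ("title", "name"): 0.6,
    ("price", "cost"): 0.8,
    ("isbn", "year"): 0.5,
}
reference = {("author", "writer"), ("price", "cost"), ("title", "heading")}

scores = match_quality(found, reference)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Running the same function on the output produced with each thesaurus, against the same reference alignment, gives directly comparable scores for the thesauri being studied.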
License
Copyright of articles that appear in Jurnal Teknologi belongs exclusively to Penerbit Universiti Teknologi Malaysia (Penerbit UTM Press). This copyright covers the rights to reproduce the article, including reprints, electronic reproductions, or any other reproductions of similar nature.