[College of Software Engineering] Academic Talk: Normalization of Duplicate Records from Multiple Sources

  • Date: 2018-10-08        Source: College of Software Engineering, Sichuan University


Title: Normalization of Duplicate Records from Multiple Sources

Speaker: Professor Weiyi Meng, Department of Computer Science, State University of New York at Binghamton, U.S.A.

Time: October 8, 2018, 9:00

Venue: College Lecture Hall (Room B302, Basic Teaching Building, Wangjiang Campus)


Abstract:

Data consolidation is a challenging issue in data integration. The value of data explodes when it is linked and fused with other data from numerous (Web) sources. The promise of Big Data hinges upon addressing several big data integration challenges, such as record linkage at scale, real-time data fusion, and integrating the Deep Web. Although much work has been conducted on these problems, there is limited work on creating a uniform, standard record from a group of records corresponding to the same real-world entity. Such a record representation, referred to as a normalized record, is important for both front-end and back-end applications. We refer to this task as record normalization. In this talk, I will introduce our recent work in formalizing the record normalization problem and present an in-depth analysis of normalization granularity levels (e.g., record, field, and field-value-component) and of normalization forms (e.g., typical versus complete). I will also introduce a comprehensive framework for computing the final normalized record. The proposed framework includes a large number of record normalization strategies.


Speaker Biography:

Weiyi Meng is currently a professor and the chair of the Department of Computer Science of the State University of New York at Binghamton. He previously served as Associate Dean for Research and Graduate Studies of the Thomas J. Watson School of Engineering and Applied Science. He received his bachelor’s degree in mathematics from Sichuan University as a member of the Class of 77. He received his MS and Ph.D. in computer science from the University of Illinois at Chicago in 1988 and 1992, respectively. His research interests include metasearch engines, Web database integration systems, Internet-based information retrieval, information trustworthiness analysis, Web data quality, Web information extraction, sentiment analysis, and database management systems. He is the co-author of three books: “Deep Web Query Interface Understanding and Integration”, “Advanced Metasearch Engine Technology”, and “Principles of Database Query Processing for Advanced Applications”. He has over 150 research publications. He has served as general chair and PC chair of several international conferences and on the editorial boards of several journals.



All faculty and students are welcome to attend!





Foreign Affairs Office

        October 1, 2018

 Source link: http://sw.scu.edu.cn/info/1046/5153.htm