F033583 Introduction to Web Search and Mining

 

Course Name: Introduction to Web Search and Mining

Course Code: F033583

Credits/Credit Hours: 2 / 32

Course Term: Spring

School Providing the Course: SEIEE

Teacher: ZHU Qili Kenny

Course Discussion Hours: 0

Lab Hours: 0

Course Introduction:

The World Wide Web (WWW) is the largest source of open-domain information today. The popularization of the web has revolutionized the way people search for and retrieve information. This course presents the fundamental theory and practice behind web search engines and introduces basic techniques for extracting information and mining knowledge from the web, with an emphasis on text documents. After completing this course, you should understand the basic internals of a web search engine and perhaps be able to build a small search engine of your own. You should also gain enough hands-on experience to write a crawler that extracts data from the web and to run various analyses on the acquired data.

Course Teaching Outline:

1. Information retrieval models (Boolean, vector space, language models)

2. Indexing and index compression

3. Link analysis (PageRank, HITS)

4. Semantic search

5. Web crawling

6. Recommender systems

Course Schedule:

TBD

Course Assessment Requirements:

In-class quizzes: 30%

Assignments: 40%

Projects: 30%

Course References:

1. Introduction to Information Retrieval, by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, July 7, 2008

2. Mining the Web: Discovering Knowledge from Hypertext Data, by Soumen Chakrabarti, October 23, 2002

3. Web Information Retrieval (Data-Centric Systems and Applications), by Stefano Ceri and Alessandro Bozzon, August 30, 2013

4. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications), by Bing Liu, August 6, 2013

Prerequisite Courses:

Discrete Math, Probability and Statistics, Database Systems, Machine Learning or Data Mining

[ 2015-11-26 ]