CS6550

Introduction to Information Retrieval

Class Hours: Tuesday/Thursday 9:10-10:30am, WEB 1250

Instructor

Qingyao Ai

Office Hours: Thursday 10:40am-11:40am, MEB 2172

Prerequisites

Textbooks (optional):

Resources

Grading

The final grade combines the assessments in the following proportions:

Tentative Class Schedule

Week, Date Topic Reminder Slides Readings
Week 01, 01/06 - 01/10 ▪ Course Overview
Introduce information retrieval and the course organization.
▪ The life of a query
An overview of modern search engine architecture and how information is retrieved for a given search query.
Week 02, 01/13 - 01/17 ▪ Evaluation
Introduce the concept of ranking and how to evaluate IR systems with ranking metrics.
▪ Crawls and feeds
How to collect and create information retrieval collections.
    NDCG: Järvelin and Kekäläinen (TOIS 2002)
ERR Chapelle et al. (CIKM 2009)
BigTable Chang et al. (OSDI 2006)
Week 03, 01/20 - 01/24 ▪ Processing text
Basic text-processing techniques such as stemming and stop-word removal.
▪ Processing Web
Link analysis and spam detection.
    Zipf’s Law and Heaps’ Law Sano et al. (2012)
Topic-sensitive PageRank Haveliwala (WWW 2002)
Week 04, 01/27 - 01/31 ▪ Indexing
Introduce basic indexing techniques (e.g., inverted index) for efficient information retrieval.
▪ Compression
Data compression and index compression algorithms.
    BitFunnel Goodwin et al. (SIGIR 2017)
Week 05, 02/03 - 02/07 ▪ Queries and Interfaces
Interface design and query processing techniques, including query suggestion and reformulation.
▪ Retrieval Model (1)
Vector space models.
Assignment 1 Due (02/09)   Local vs. Global Analysis Xu and Croft (SIGIR 1996)
Week 06, 02/10 - 02/14 ▪ Retrieval Model (2)
Classic probabilistic models.
▪ Retrieval Model (3)
Language modeling approaches and smoothing.
    2-Poisson Model Robertson (SIGIR 1994)
Query Likelihood model Ponte and Croft (SIGIR 1998)
KL-divergence model Lafferty and Zhai (SIGIR 2001)
Language model smoothing Zhai and Lafferty (SIGIR 2001)
Week 07, 02/17 - 02/21 ▪ Retrieval Model (4)
Enhanced language modeling approaches.
▪ Relevance Feedback
The concept and basic techniques of relevance feedback and pseudo relevance feedback.
    Dependence models Metzler and Croft (SIGIR 2005)
Huston and Croft (CIKM 2014)
Relevance Model Lavrenko and Croft (SIGIR 2001)
Model-based feedback model Zhai and Lafferty (CIKM 2001)
Week 08, 02/24 - 02/28 ▪ Search Result Diversification
The motivation of result diversification and classic models.
▪ Learning to Rank (1)
The motivation and basic concepts of learning to rank, including query-document pairs, feature vectors, etc.
    Maximal Marginal Relevance Carbonell and Goldstein (SIGIR 1998)
Diversity evaluation Clarke et al. (SIGIR 2008)
LTR textbook Li.
Week 09, 03/02 - 03/06 ▪ Learning to Rank (2)
The optimization paradigms for learning-to-rank models, e.g., pointwise, pairwise, and listwise methods.
▪ Classification and Clustering
The concepts and algorithms for text classification and clustering.
Assignment 2 Due (03/08)   Cluster-based LM Liu and Croft (SIGIR 2004)
RankNet Burges et al. (ICML 2005)
ListMLE Xia et al. (ICML 2008)
LambdaMART Burges.
DLCM Ai et al. (SIGIR 2018)
Week 10, 03/09 - 03/13 Spring Break Project Proposal Due (03/15)    
Week 11, 03/16 - 03/20 ▪ Class Project Proposal Presentation
Proposal presentation and mutual evaluation
▪ Beyond bag-of-words (1)
Latent space models and distributed representations.
    LSI Deerwester et al. (JASIS 1990)
LDA-LM Wei and Croft (SIGIR 2006)
Week 12, 03/23 - 03/27 ▪ Beyond bag-of-words (2)
Neural Information Retrieval.
▪ Recommendation systems (1)
The basic concepts and naive algorithms for recommender systems.
    DSSM Huang et al. (CIKM 2013)
DRMM Guo et al. (CIKM 2016)
K-NRM Xiong et al. (SIGIR 2017)
Week 13, 03/30 - 04/03 ▪ Recommendation systems (2)
Collaborative filtering.
▪ Adv. User Study and Crowdsourcing
The study and applications of user modeling and crowdsourcing in information retrieval.
    NCF He et al. (WWW 2017)
JRL Zhang et al. (CIKM 2017)
Week 14, 04/06 - 04/10 ▪ Adv. Click Models and Unbiased Learning
The idea of biased user behavior and the algorithms for unbiased optimization.
▪ Adv. Personalization
How to personalize retrieval results according to individual information needs.
Assignment 3 Due (04/12)    
Week 15, 04/13 - 04/17 ▪ Adv. Question Answering
Introduction to classic and state-of-the-art methods for open-domain question answering.
▪ Class Project Final Presentation (1)
The presentation and evaluation of course projects.
Week 16, 04/20 - 04/24 ▪ Class Project Final Presentation (2)
The presentation and evaluation of course projects.
Week 17, 04/27 - 05/01 ▪ Class Project Report Due (04/29)      

Acknowledgements

Special thanks to Dr. Hamed Zamani from Microsoft Research, Prof. Jiepu Jiang from Virginia Tech, and Prof. James Allan from the University of Massachusetts Amherst. Some teaching materials are borrowed from their course sites for CS656: Information Retrieval.