Intro 2 - Text Classification (8/31/2023)
Content:
- Text classification definition and datasets
- Generative text classifiers (naive Bayes)
- Discriminative text classifiers (logistic regression)
- Statistical significance testing
- Dataset understanding and creation
Reading Material:
- Recommended Reading: Data Statements for NLP (Bender and Friedman 2018)
- Recommended Reading: The Hitchhiker’s Guide to Testing Statistical Significance in Natural Language Processing (Dror et al. 2018)
- Recommended Reading: Deep Learning Based Text Classification: A Comprehensive Review (Minaee et al. 2020)
- Reference: RACE: Large-scale ReAding Comprehension Dataset From Examinations (Lai et al. 2017)
- Reference: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (Socher et al. 2013)
- Reference: A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts (Pang and Lee 2004)
- Reference: Character-level Convolutional Networks for Text Classification (Zhang et al. 2015)
- Reference: Conditional Structure vs. Conditional Estimation in NLP Models (Klein and Manning 2002)
- Reference: Noisy Channel Language Model Prompting for Few-Shot Text Classification (Min et al. 2022)
Slides: Text Classification Slides
Code: Bag-of-Words Text Classifier Code