CS 421: Natural Language Processing

Fall 2019

Contact Information

Professor: Natalie Parde (parde@uic.edu)
Office Hours: SEO 1132, Tuesday 1:30 - 2:30 p.m. / Thursday 3:00 - 4:00 p.m.
 
Teaching Assistant: Usman Shahid (hshahi6@uic.edu)
Office Hours: SELW 4029, Wednesday 12:00 - 2:00 p.m.
 
Piazza: https://piazza.com/uic/fall2019/cs421

What is this class about?

Natural language processing (NLP) is the subfield of artificial intelligence that focuses on automatically understanding and generating natural language (e.g., Arabic, Navajo, Spanish, or English). It is crucial to many everyday applications: if you've searched for something online or engaged in dialogue with one of your devices today, you've made use of many different NLP technologies already. This class will provide an introduction to the foundations and most popular applications of natural language processing, through a combination of readings, short assignments, exams, and (for grad students and optionally undergrads) a semester-long project. Topics covered will include text preprocessing, part-of-speech tagging, syntactic and dependency parsing, language modeling, word embeddings, statistical and neural models, dialogue systems, question answering, and machine translation, among others.

Textbooks

Readings and (some) assignments for this class will be drawn from the following sources:
- Daniel Jurafsky and James H. Martin. Speech and Language Processing (2nd Edition). Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 2009.
- Daniel Jurafsky and James H. Martin. Speech and Language Processing (3rd Edition). Draft, 2019.

The third edition of the book is still being written; its current draft can be freely accessed at the link above. The second edition is available for purchase from the UIC bookstore and other retailers, and can generally be found at affordable prices since it's been out for a while. You'll ultimately be responsible for content from those official sources, so if you purchase a different version (e.g., an international edition of the second edition), make sure to check for any misaligned material.

Assignments and Exams

This is a 400-level course, designed for both graduate students and advanced undergraduates. Depending on your classification, you may have enrolled in either the four-hour version (grad students) or the three-hour version (undergrad students). There are slightly different requirements for the two versions of the course, with the biggest difference being that grad students will be required to complete a semester-long project. Undergrads may opt to complete a project as well if they would like, in which case their final course grade will be determined according to the same breakdown as that used for graduate students; however, doing this extra work is certainly not a requirement.

Some further details about the work you will be expected to complete for this course are provided below:

Grading rubrics for assignments, exams, and the project will be posted on Gradescope. Once grading is complete for a given assignment or exam, the solution will also be posted. You are encouraged to use these solutions to further your understanding of the course material and to prepare for future exams.

Final course grades will be determined according to the following breakdowns:

Schedule

The most recent version of the course schedule is available below. This schedule is subject to change, so check back regularly for updates! I'll post my own lecture slides in the "Downloads" column soon after they are presented in class. For the readings, (v2) corresponds to the second edition of the Jurafsky and Martin text, and (v3) corresponds to the third edition draft.

| Week | Topic | Readings | Deliverables | Downloads |
| --- | --- | --- | --- | --- |
| 8/26-8/30 | Introduction, Text Preprocessing, and Edit Distance | (v2) Chapter 1 (all), Chapter 2 (2.1), Chapter 3 (3.8-3.11) | | Introduction to CS 421; Text Preprocessing and Edit Distance |
| 9/2-9/6 | Automata, Transducers, and Hidden Markov Models | (v2) Chapter 2 (2.2), Chapter 3 (3.1-3.7), Chapter 6 (6.1-6.5) | Assignment 1: 9/6 by 12 p.m. (noon) | Automata, Transducers, and Hidden Markov Models |
| 9/9-9/13 | Part-of-Speech Tagging and Formal Grammars | (v2) Chapter 5 (all), Chapter 12 (all) | | Part-of-Speech Tagging and Formal Grammars |
| 9/16-9/20 | Syntactic and Dependency Parsing | (v2) Chapter 13 (all), (v3) Chapter 14 (all) | Assignment 2: 9/20 by 12 p.m. (noon) | Syntactic and Dependency Parsing |
| 9/23-9/27 | First-Order Logic and Review/Catch-Up | (v2) Chapter 17 (all) | | First-Order Logic; Midterm 1 Review |
| 9/30-10/4 | Exam 1 (10/1) and N-Gram Language Modeling | (v2) Chapter 4 (4.1-4.7) | | Language Models & N-grams |
| 10/7-10/11 | Word Embeddings | (v3) Chapter 6 (all) | Assignment 3: 10/11 by 12 p.m. (noon) | Word Embeddings |
| 10/14-10/18 | Naïve Bayes, Text Classification, and Evaluation Metrics | (v3) Chapter 4 (all) | | Naïve Bayes, Text Classification, and Evaluation Metrics |
| 10/21-10/25 | Neural Networks and Neural Language Models | (v3) Chapter 7 (all) | Assignment 4: 10/25 by 12 p.m. (noon) | Neural Networks and Neural Language Models |
| 10/28-11/1 | Sequence Processing with Recurrent Networks and Review/Catch-Up | (v3) Chapter 9 (all) | | Sequence Processing with Recurrent Networks; Midterm 2 Review |
| 11/4-11/8 | Exam 2 (11/5) and Information Extraction | (v2) Chapter 22 (all) | | Information Extraction |
| 11/11-11/15 | Dialogue Systems and Chatbots | (v3) Chapter 26 (all), (v2) Chapter 24 (24.2) | | Dialogue Systems and Chatbots |
| 11/18-11/22 | Question Answering and Summarization | (v2) Chapter 23 (23.3-23.7), (v3) Chapter 23 (all) | Assignment 5: 11/22 by 12 p.m. (noon) | |
| 11/25-11/29 | Machine Translation | (v2) Chapter 25 (25.1-25.9) | | |
| 12/2-12/6 | Project Presentations and Review | | Code/Paper: 12/2 by 12 p.m. (noon) | |
| 12/9-12/13 | Exam 3 (12/11, 10:30 a.m. - 12:30 p.m.) | | | |

Final Notes

This website is provided partially for student convenience, partially for my own record-keeping purposes, and partially for the benefit of others who are not able to enroll in the course but who may find the content interesting for one reason or another. It is not a substitute for the course pages on Blackboard and Gradescope, or the course discussion board on Piazza! Please refer to those sources for copies of the full syllabus, assignments, grading rubrics, submission links, and other useful information. If you are not enrolled in the course but would like to request access to those materials, please send me an email introducing yourself and explaining why you would like to have access to them.

Happy studying!