CS 521: Statistical Natural Language Processing
Spring 2025
Contact Information
Professor: Natalie Parde (parde@uic.edu)
Office Hours: Tuesday 3:00 - 5:00 p.m.
What is this class about?
Modern natural language processing (NLP) is a fast-moving field that harnesses data-driven methods to power many everyday applications, including intelligent virtual assistants (e.g., Siri or Alexa) and generative AI (e.g., ChatGPT). This class will introduce advanced topics in data-driven NLP and provide an overview of active research in relevant topic areas. It will do so through a combination of readings, paper presentations and critiques, and a semester-long project. Topics covered will include data collection, common deep learning architectures for NLP, advanced language modeling, and contemporary applications, among others.
Textbooks
Readings for the first part of the semester will be drawn from the following source:
- Daniel Jurafsky and James H. Martin. Speech and Language Processing (3rd Edition). Draft, 2024.
Note that this resource is a draft of the upcoming third edition of Speech and Language Processing. Chapter order and content are subject to change throughout the semester (I'll try to update the course website whenever I notice this occurring, but feel free to ping me if you think something seems out of date).
Readings for the second part of the semester will be drawn primarily from journals and conference proceedings. Some suggested papers for each topic in the second part of the semester are provided in the course syllabus. You are welcome to present a paper that is not in the list of suggestions, as long as I approve it.
Assignments
This course acts as a middle ground between UIC's introductory, lecture-based NLP course (CS 421) and the more specialized, seminar-style NLP courses (e.g., CS 532) offered by the department. The coursework will be closer to what you would expect to find in a seminar-style course, but the general format of the class will contain elements of both. For the first portion of the semester, lectures will be given about advanced topics in contemporary natural language processing. For the remainder of the semester, students will present and critique research from recent NLP papers. Some further details about the work you will be expected to complete for this course are provided below:
- Paper Presentation: You will present an overview of one paper, summarizing its motivations, methods, results, and any other analyses included by the authors. You can review paper presentation videos from major NLP conferences in the ACL Anthology for guidance on what your presentation should look like, but feel free to incorporate the required material as creatively as you prefer. Paper presentations should be 15 minutes long.
- Oral Critiques: For two paper presentations other than your own, you will orally provide your perspective on the paper, including what you feel were its strengths and weaknesses. Oral critiques should be a maximum of five minutes, and you do not need to create slides; you can think of these critiques as an avenue through which to lead the paper discussion.
- Written Critiques: During three "research" weeks (i.e., weeks with paper presentations), you will submit a short (approximately one page) written critique of one of the papers being presented and discussed. The critique should include a brief summary of the paper, highlights of aspects of the paper that are particularly good or should be improved, an analysis of the soundness of the methodology and evaluation, and an explanation of whether or not the conclusions drawn by the authors are justified.
- Paper Discussion: You will engage in paper discussions by attending your peers' research presentations and asking questions or providing comments when applicable. You will need to attend and discuss papers in 10 distinct class sessions (spread across the eight "research" weeks) to earn full paper discussion points.
- Project: A central component of this course is the semester-long project. You may complete your project individually or in pairs (if completing the project in a pair, the workload should scale accordingly). You'll be afforded considerable flexibility in selecting your project topics—ideally, if you're working on a thesis or dissertation, you'll be able to incorporate the work resulting from this course into your research. The project will comprise four different deliverables:
  - Proposal: You will create a short (5 minutes, or up to 10 minutes if working in a pair) presentation detailing your plans and defining your research objectives. Your research proposal should include at least one research question that you can empirically test, and your associated hypothesis about the experimental outcome.
  - Project Source: You will submit the source code (either directly or through a link to the repository) used for your project, along with well-documented instructions for replicating the work.
  - Project Presentation: You will create a moderate-length (7 minutes, or up to 14 minutes if working in a pair) presentation that walks the class through your methodology, evaluation, and key findings.
  - Project Report: You will write a conference/journal-style paper about your project, including a literature review, methodology, evaluation, and conclusions. Use the ACL proceedings template to format your paper. If you're working on a thesis or dissertation, you can view this as an easy way to have a paper ready to submit by the end of the semester, complete with feedback from a faculty member! Project reports should be between 2,500 and 4,500 words long, not including references.
Grading rubrics are posted on Blackboard. Final course grades will be determined according to the following breakdown:
- Research Discussion:
  - Paper Presentation: 15%
  - Oral Critique: 10% (5% per paper)
  - Written Critique: 24% (8% per paper)
  - Paper Discussion: 10% (1% per class session)
- Project:
  - Proposal Presentation: 9%
  - Project Source: 5%
  - Project Presentation: 12%
  - Project Report: 15%
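As a rough illustration of how the breakdown above combines into a final percentage, here is a minimal Python sketch. The weights are copied from the list above; the function name `weighted_grade`, the dictionary keys, and the example scores are hypothetical and used only for illustration. It also assumes each component is scored as a fraction between 0 and 1, which may not match how the actual rubrics on Blackboard are applied.

```python
# Weights (percent of the final grade), copied from the breakdown above.
WEIGHTS = {
    "paper_presentation": 15,
    "oral_critique": 10,         # 2 critiques x 5%
    "written_critique": 24,      # 3 critiques x 8%
    "paper_discussion": 10,      # 10 class sessions x 1%
    "proposal_presentation": 9,
    "project_source": 5,
    "project_presentation": 12,
    "project_report": 15,
}

# Sanity check: the components account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def weighted_grade(scores):
    """Combine per-component scores (assumed to lie in [0, 1]) into a percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: full marks everywhere except paper discussion,
# where they participated in 8 of the 10 required sessions.
example_scores = {name: 1.0 for name in WEIGHTS}
example_scores["paper_discussion"] = 0.8
print(weighted_grade(example_scores))
```

Under these assumptions, missing two discussion sessions costs two percentage points, since each session is worth 1% of the final grade.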
Schedule
The most recent version of the course schedule is available below. This schedule is subject to change, so check back regularly for updates!
All deadlines are at 12 p.m. (noon) unless otherwise stated.
| Week | Topic | Readings | Deliverables | Slides |
| --- | --- | --- | --- | --- |
| 1/13 - 1/17 | Introduction and Data Collection | — | — | Download |
| 1/20 - 1/24 | Machine Translation and Advanced Deep Learning Models for Sequence Processing | Chapters 9 and 13 | — | Download |
| 1/27 - 1/31 | Transfer Learning with Pretrained Language Models and Large Language Models (LLMs) | Chapters 10 and 11 | 1/31: Paper Selection Deadline | Download |
| 2/3 - 2/7 | Prompting, Retrieval-Augmented Generation, and Essential Research Skills | Chapter 12 | 2/7: Oral Critique Selection Deadline | Download |
| 2/10 - 2/14 | Project Proposals | — | Project Proposal on Scheduled Date | — |
| 2/17 - 2/21 | Project Proposals and Language Modeling | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 2/17: Written Critique | — |
| 2/24 - 2/28 | Prompting LLMs | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 2/24: Written Critique | — |
| 3/3 - 3/7 | Fine-Tuning LLMs | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 3/3: Written Critique | — |
| 3/10 - 3/14 | Interpreting LLMs | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 3/10: Written Critique | — |
| 3/17 - 3/21 | Efficient and Low-Resource NLP | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 3/17: Written Critique | — |
| 3/24 - 3/28 | Spring Break | — | — | — |
| 3/31 - 4/4 | Multimodal and Multilingual NLP | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 3/31: Written Critique | — |
| 4/7 - 4/11 | Ethics and NLP | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 4/7: Written Critique | — |
| 4/14 - 4/18 | NLP Applications | Research Papers | Project Proposals, Paper Presentations, and Oral Critiques on Scheduled Dates; 4/14: Written Critique | — |
| 4/21 - 4/25 | Project Presentations | — | Project Presentations on Scheduled Date; 4/21: Project Source | — |
| 4/28 - 5/2 | Project Presentations | — | Project Presentations on Scheduled Date; 5/2: Project Report | — |
| 5/5 - 5/9 | Finals Week (No Class) | — | — | — |
Final Notes
This website is provided partially for student convenience, partially for my own record-keeping purposes, and partially for the benefit of others who are not able to enroll in the course but who may find the content interesting for one reason or another. It is not a substitute for the course page on Blackboard or the course discussion board on Piazza! Please refer to those sources for copies of the full syllabus, assignments, grading rubrics, submission links, and other useful information. If you are not enrolled in the course but would like to request access to those materials, please send me an email introducing yourself and explaining why you would like to have access to them.
Happy studying!