CS 594: Language and Vision

Spring 2019

Contact Information

Professor: Natalie Parde
Email: parde@uic.edu
Blackboard: CS 594 Language and Vision (33648) 2019 Spring
Piazza: piazza.com/uic/spring2019/cs594section33648
Office: SEO 1132
Office Hours: Tuesday/Thursday 2:00-3:00 p.m.

What is this class about?

Researchers in artificial intelligence are increasingly applying multimodal solutions to traditional problems, especially within the realm of natural language processing. In particular, synthesizing NLP with computer vision allows intelligent systems to harness both visual and linguistic information to generate content and derive meaning. This seminar course will introduce you to current research in fundamental language + vision problems, and provide you with a scientific background in relevant application areas. By the end of the course, you will have gained exposure to core concepts through a combination of: lectures on fundamental principles of NLP, CV, and deep learning; paper discussions; and a semester-long project in a focus area that you select. Topics covered will include grounded language learning, physically situated dialogue, automated image captioning, automated video description, visual question answering, text-to-image generation, visual story entailment, and language disambiguation via images.

Textbooks and Readings

Recommended reading for the first three weeks (lectures on fundamental principles of NLP, CV, and deep learning) will be from the following sources:
- Dan Jurafsky and James H. Martin. Speech and language processing, volume 3. Pearson, London, 2014.
- Richard Szeliski. Computer vision: algorithms and applications. Springer Science & Business Media, 2010.
- Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436, 2015.

Free versions of these resources can be accessed at the links above. Reading materials for the remainder of the course will consist of conference and journal papers, all accessible online. A list of suggested discussion papers is provided on Blackboard. You're also welcome to suggest discussion papers that are not on this list (all suggested papers are subject to my approval). I'll post the list of papers to be discussed each week once the paper selections have been finalized.

Assignments

This is a seminar-style class. Instead of having traditional homework and exams, we'll have presentations and projects! Specifically, your course work will consist of the following:

I've posted my grading rubrics for each of these assignments on Blackboard, in the interest of transparency. Your final course grade will be determined according to the following breakdown:

Schedule

The most recent version of the course schedule is available below. This schedule is subject to change, so check back regularly for updates! I'll post my own lecture slides in the "Downloads" column soon after they are presented in class. If/when students give permission, I will also post their presentation slides for others to download. Note that whether or not you give permission for me to post your slides has zero bearing on your course grade: if you'd like them to be made available, that's great, and if not, that's perfectly fine as well.

Week: 1/14-1/18
Topic: Introduction and NLP Overview
Downloads: Introduction to CS 594

Week: 1/21-1/25
Topic: NLP and CV Overview
Deliverables: Paper Selection: 1/26 by 11:59 p.m.
Downloads: Introduction to NLP; Introduction to Computer Vision

Week: 1/28-2/1
Topic: Deep Learning Overview
Deliverables: Pros and Cons Selections: 2/2 by 11:59 p.m.
Downloads: Introduction to Deep Learning

Week: 2/4-2/8
Topic: Project Proposals
Deliverables: In-Class Presentations

Week: 2/11-2/15
Topic: Principles of Grounded Language Learning
- Multimodal Machine Learning: A Survey and Taxonomy
Deliverables: Paper Critique: 2/11 by 12:00 p.m.
Downloads: Principles of Grounded Language Learning

Week: 2/18-2/22
Topic: Game-based Grounded Language Learning
- Interactive Language Acquisition with One-Shot Visual Concept Learning through a Conversational Game
- Learning Multi-Modal Grounded Linguistic Semantics by Playing "I Spy"
- Grounding Language through Evolutionary Language Games
Deliverables: Paper Critique: 2/18 by 12:00 p.m.
Downloads: Game-based Grounded Language Learning

Week: 2/25-3/1
Topic: Physically Situated Dialogue
- Visual Dialog
- Learning to Recognize Novel Objects in One Shot through Human-Robot Interactions in Natural Language Dialogues
Deliverables: Paper Critique: 2/25 by 12:00 p.m.
Downloads: Physically Situated Dialogue

Week: 3/4-3/8
Topic: Visual Dependency Parsing and Visual Sentiment Analysis
- Image Description using Visual Dependency Representations
- Cross-Media Learning for Image Sentiment Analysis in the Wild
- A Review of Affective Computing: From Unimodal Analysis to Multimodal Fusion
Deliverables: Paper Critique: 3/4 by 12:00 p.m.
Downloads: Visual Dependency Parsing and Visual Sentiment Analysis

Week: 3/11-3/15
Topic: Automated Image Captioning and Image-Text Alignment
- Grounded Compositional Semantics for Finding and Describing Images with Sentences
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
- Deep Visual-Semantic Alignments for Generating Image Descriptions
Deliverables: Paper Critique: 3/11 by 12:00 p.m.
Downloads: Automated Image Captioning and Image-Text Alignment

Week: 3/18-3/22
Topic: Automated Video Description and Visual Story Entailment
- Sequence to Sequence - Video to Text
- The Amazing Mysteries of the Gutter: Drawing Inferences between Panels in Comic Book Narratives
Deliverables: Paper Critique: 3/18 by 12:00 p.m.
Downloads: Automated Video Description and Visual Story Entailment

Week: 3/25-3/29
Topic: Spring Break

Week: 4/1-4/5
Topic: Project Updates
Deliverables: In-Class Presentations

Week: 4/8-4/12
Topic: Text-to-Image Generation and Visual Question Answering
- Text to 3D Scene Generation with Rich Lexical Grounding
- Visual7w: Grounded Question Answering in Images
- From Recognition to Cognition: Visual Commonsense Reasoning
Deliverables: Paper Critique: 4/8 by 12:00 p.m.
Downloads: Text-to-Image Generation and Visual Question Answering

Week: 4/15-4/19
Topic: Language Disambiguation via Images
- Black Holes and White Rabbits: Metaphor Identification with Visual Features
- Visual Word2Vec (vis-w2v): Learning Visually Grounded Word Embeddings Using Abstract Scenes
- Illustrative Language Understanding: Large-Scale Visual Grounding with Image Search
Deliverables: Paper Critique: 4/15 by 12:00 p.m.
Downloads: Language Disambiguation via Images

Week: 4/22-4/26
Topic: Project Presentations
Deliverables: In-Class Presentations (Some Students); Final Project: 4/22 by 12:00 p.m.

Week: 4/29-5/3
Topic: Project Presentations
Deliverables: In-Class Presentations (Some Students); Final Paper: 5/3 by 12:00 p.m.

Week: 5/6-5/10
Topic: Finals Week (No Class)

Final Notes

This website is provided partially for student convenience, partially for my own record-keeping purposes, and partially for the benefit of others who are not able to enroll in the course but who may find the content interesting for one reason or another. It is not a substitute for the official course page on Blackboard, or the course discussion board on Piazza! Please refer to those sources for copies of the full syllabus, assignment descriptions, example assignments, grading rubrics, submission links, and other useful information. If you are not enrolled in the course but would like to request access to those materials, please send me an email introducing yourself and explaining why you would like to have access to them.

Happy studying!