15-387/86-375/675 Computational Perception

Carnegie Mellon University

Fall 2021

Course Description

The perceptual capabilities of even the simplest biological organisms are far beyond what we can achieve with machines. Whether you look at sensitivity, robustness, adaptability and generalizability, perception in biology just works, and works in complex, ever changing environments, and can make inference on the most subtle sensory patterns. Is it the neural hardware? Does the brain use a fundamentally different algorithm? What can we learn from biological systems and human perception?

In this course, we will study the biological and psychological data of biological perceptual systems, mostly the visual system, in depth, and then apply computational thinking to investigate the principles and mechanisms underlying natural perception. You will learn how to reason scientifically and computationally about problems and issues in perception, how to extract the essential computational properties of those abstract ideas, and finally how to convert these into explicit mathematical models and computational algorithms. The course is targeted to both neuroscience and psychology students who are interested in learning computational thinking, as well as computer science and engineering students who are interested in learning more about the neural and computational basis of perception. Prerequisites: First year college calculus, differential equations, linear algebra, basic probability theory and statistical inference, and programming experience (Matlab) are desirable.

Course Information

Instructors	Office or zoom (Office hours)	Email (Phone)
Tai Sing Lee (Professor)	Friday 9-10 a.m. Class zoom link	taislee@andrew.cmu.edu
Tianqin Li (TA)	Monday 5:30-6:30 pm and Tuesday 8:00-9:00 p.m. Class zoom link	tianqinl@cs.cmu.edu

Class location and time: WEH 5310 or Zoom (link announced on CANVAS), Monday/Wednesday 1:30 p.m - 2:50 p.m.
Class recitation and journal club: zoom only: 1:30 p.m - 2:50 p.m.
Website: http://www.cnbc.cmu.edu/~tai/cp21.html (course info and readings )
Canvas: Lecture materials and Information would be on Canvas.

Handouts on Canvas .
Frisby and Stone Seeing: The computational approach to biological vision . MIT Press, 2010 (recommended).
Supplementary Simon Prince Computer Vision: Models, Learning, and Inference . Cambridge University Press , 2012. Downloadable at: http://www.computervisionmodels.com. More relevant to graduate students.

Classroom Etiquette

If in person, refrain from using laptop and cell phones. If on zoom, please turn on your video.

Grading Scheme 15-387

Evaluation	Grade Points
Assignments	65
Midterm	10
Final Exam	15
Class Participation	10

Total points: 100
Grading scheme: A: > 88, B: > 75. C: > 65. D > 50.

Grading Scheme 86-375

Evaluation	Points
Assignments	39
Midterm	10
Final Exam	15
Flex Requirement	26
Class participation (10)	10

Total credit for 86-375: 100
Flex Requirement: problem set (13), a term project (13) OR term paper (13) OR Journal Club (13)
Grading scheme: A: > 88, B: > 75. C: > 65. D > 50

Grading Scheme 86-675

Evaluation	Points
Assignments	65
Midterm	10
Final Exam *	15
Journal Club *	3 Presentations
Term Project *	See below.
Class participation	10

Total credit for 86-675: 100
Journal Club at least 10 times attendance, 2-3 presentations.
Term project can be used to replace one or two problem sets, depending on the scale of the project.
Grading scheme: A+ top 2 students in the class. A > 88, B: > 75. C: > 65

Homework

There will be 5 homework assignments involving Matlab and/or Python and/or Pytorch. The focus is on performing experiments on existing codes rather than coding algorithms from scratch.
Each student will have 7 days grace period for late homework. This grace period can be used for one or multiple assignments. Use it wisely and you cannot ask for more. CANVAS or gradescope submission after the starting of class time of due day is considered late by one day.

Term Project

Term project option is an available for 86-375 and 675 students counts for 25 and 26 points. 15-387 students can do a term project to replace one of their assignment (max: 13 points). A term project must involve some computational experiments, using either downloadable softwares or programs you develop. All the codes should be documented and archived in github and submitted together with a term paper (6-8 pages) materials in a different pdf/doc file and/or matlab zip files. Term project should take about 30 hours to complete. Undergraduate student should work on his/her own project. Graduate students can work on team on a larger scale project but need to be approved in advance. Project proposal is due by midterm. Students are encouraged to discuss project ideas with professor from the very beginning of the semester.

Term Paper

Term paper option is an available for 86-375 students and counts for a max of 26 points. It should be an extensive in-depth review of a particular topic to be approved by the professor before midterm. The paper will be about 8 pages in NeuRIPS format. Paper proposal is due by midterm. The student is required to give a powerpoint presentation in person or on zoom at the end of the semester.

Journal Club

Journal club option is an available option for 86-675 students and count for 25/26 points toward the total grade. Each student is expected to present three to four times during the course of the semester and participate in 90% of the journal club discussion.

Examinations

There will be a midterm (10 points) and a final exam (15 points) to test materials covered in the lectures and homework assignments.
There will be 12 in-class exercises, each worths 1 point. Full credit: 10 points.

Syllabus

Date	Lecture Topic	Assignments
	SENSORY CODING
M 8/30	1. Introduction
W 9/1	2. Computational Approach
M 9/6	Label Day (no class)
W 9/8	3. Retina	Homework 1
M 9/13	4. Frequency Analysis
W 9/15	5. Neural Network
M 9/20	6. Optics, Lightness and Color
W 9/22	7. Retinex and Intrinsic Images	Homework 2
	PERCEPTUAL INFERENCE
M 9/27	8. Lightness perception
W 9/29	9. Dimensional reduction
M 10/4	10. Source Separation	Mid-Course Evaluation
W 10/6	11. Belief Net	Homework 3;
M 10/11	12. Inference: Depth
W 10/13	Midterm
Th 10/14	Mid-semester break
M 10/18	13. Inference: Motion	Mid-term Grade. Project Proposal due
W 10/20	14. Perceptual Organization
M 10/25	15. Texture Perception
W 10/27	16. Content and Style	Homework 4
M 11/1	17. Visual Hierarchy
W 11/3	18. Analysis by Synthesis
F 11/5	No Journal Club: Community Engagement
	OBJECT AND SCENES
M 11/8	19. Predictive coding
W 11/10	20. Self-supervised learning	Homework 5
M 11/15	21. Compositional theory
W 11/17	22. Object and Parts
M 11/22	23. Attention
W 11/24	Thanksgiving break
M 11/29	24. Awareness	HW 5 due
W 12/1	25. Review
F 12/3	Last day of Class	Paper Presentation
X 12/18	Final Exam and Presentations

Reading (draft)

Week 1 (Lectures 1 and 2) Computational Philosophy

What is the purpose of perception? What do we see what we see? What does seeing mean? How do we formulate theories and computational theories of perception? These are the questions we will explore and think about. We will read some classic papers by David Marr and Gibson.
Marr, D. (1982) Chapter 1: The Philosophy and the Approach Vision 8-38 .
Gibson's ch 8 of The Ecological Approach to Visual Perception Houghton Mifflin Press. 1979

Week 2,3 (Lectures 3, 4, 5) Retina and Neural Network

Visual perception starts with the eyes and the photoreceptors. However, there is already sophisticated complications in the retina. We will read the classic and the modern papers on retinal processing, cover some basic background on frequency analysis, as well as the current computaitonal approach (via deep learning) for understanding the retinal processing. We will do a problem set on retinal processing, and its relationship to some visual illusion and perception. For journal club, we can read some efficient coding papers by Simoncelli, Lewicki and Ganguli for understanding the representation in the retina.
Lettvin, Maturana, McCulloch and Pitt. (1959) What the frog's eye tells the frog's brain Proceedings of the IRE 1940-1959 .
Gollisch and Meister (2010) Eye Smarter than Scientists Believed: Neural Computations in Circuits of the Retina Neuron 65: 151-164.
Maheswaranathan, Kastner, Baccus and Ganguli (218) Inferring hidden struture in multilayered neural circuits. PLOS Computaitonal Biology 14(8):e1006291
Maheswaranthan, .... Ganguli and Baccus (2018) Deep learning models reveal internal structure and diverse computations in the retina under natural scenes bioRxiv, June 8, 2018.
McIntosh, Maheswaranathan, Nayebi, Ganguli and Baccus (2016) Deep Learning Models of the Retinal Response to Natural Scenes NIPS
Perdreau, F. & Cavanagh, P. (2011). Do artists see their retinas? Frontiers in Human Neuroscience, 5:171

Week 4 (Lecture 6,7) Lightness perception and Intrinsic Images

Our perception of brightness (or lightness) and color is not determined what are sened by the retina, but in fact an interpretation of the ligthness and color properties of the object surfaces in the world. We will explore the classic theory of retinex as well as modern computational theory of intrinsic images for understanding lightness and color perception, culminating in a problem set on these issues. Discussion: the dress.
Adelson, Ed, (2000) Lightness Perception and Lightness Illusion The New Cognitive Neuroscience, Gazzaniga ed. MIT Press. (2000).
Land, E, (1977) The retinex theory of color vision Scientific America 1977
Horn, B, (1974) Determininng lightness from an image Computer Graphics and Image Procwssing. 1974
Morel JM, Petro AB, Sbert C. (2010) IEEE Trans Image Process. 19(11) 2825-37.
Tappen, Freeman and Adelson (2005) Recovering intrinsic images from a single image IEEE PAMI. 27(9): 1459-1472.

Week 11 (Lecture 21, 22) Compositional Theory, Objects and Parts

Early computer vision approach emphasized on analysis by synthesis. This framework can be generalized to conceptualize the recurrent interaction in a compositional hierarchy of the visual system.
E. Bienenstock, S. Geman, and D. Potter. Compositionality, MDL Priors, and Object Recognition NIPS Advances in Neural Information Processing Systems 9. 1998.
S. Geman, Hierarchy in machine and natural vision. Proceedings of the 11th Scandinavian Conference on Image Analysis, 1999.
Yuille. Towards a Theory of Compositional Learning and Encoding of Objects 1st IEEE Workshop in Information Theory in Computer Vision and Pattern Recognition. ICCV 2011.
S.C. Zhu and D. Mumford (2006) A Stochastic Grammar of Images Foundations and Trends in Computer Graphics and Vision 2(4): 259-362.

Week 12 (Lecture 23, 24) Attention and Awareness

Perception is dynamic, and is known to involve routing and attention. We will consider both biological and computational aspects of attention.
Grace Lindsay (2020) Attention in psychology, neuroscience and machine learning Frotnier Computational Neuroscience April 2020.
Luo and Maunsell (2019) Attention can be subdivided into neurobiological components corresponding to distinct behavioral effects PNAS 116(52) 26187-26194.
Eric Knudsen (2018) Fundamental components of attention. Annual Review of Neuroscience
Olshausen, Anderson and Van Essen (1995) Mutliscale dynamic routing circuit for forming size- and position-invariant object recognition J. Neuroscience 2:45-62.
Sabour, S., Frosst, N. and G. Hinton (2017) Dynamic routing between capsules NIPS
Kovacs, I., Papathomas, T., Yang, M. and Feher, A. (1996). When the brain changes its mind: Interocular grouping during binocular rivalry. Proceedings of the National Academy of Sciences, 93(26), pp.15508-15511.

Questions or comments: contact Tai Sing Lee
Last modified: August 2021, Tai Sing Lee