15-387/86-375/675 Computational Perception
Carnegie Mellon University
Fall 2022
Course Description
The perceptual capabilities of even the simplest biological organisms are far beyond what we can achieve with machines. Whether you look at sensitivity, robustness, adaptability, or generalizability, perception in biology just works: it works in complex, ever-changing environments and can make inferences from the most subtle sensory patterns. Is it the neural hardware? Does the brain use a fundamentally different algorithm? What can we learn from biological systems and human perception?
In this course, we will study in depth the biological and psychological data on biological perceptual systems, mostly the visual system, and then apply computational thinking to investigate the principles and mechanisms underlying natural perception.
You will learn how to reason scientifically and computationally about problems and issues in perception, how to extract the essential computational properties of those abstract ideas, and finally how to convert these into explicit mathematical models and computational algorithms. The course is targeted at both neuroscience and psychology students who are interested in learning computational thinking, and computer science and engineering students who are interested in learning more about the neural and computational basis of perception. Prerequisites: First-year college calculus, differential equations, linear algebra, basic probability theory and statistical inference, and programming experience are desirable.
Course Information
Instructors | Office Hours | Email (Phone) |
Tai Sing Lee (Professor) | Friday 10:00 a.m. (Zoom office hour) | taislee@andrew.cmu.edu |
Shang Gao Li (TA) | Tuesday 8:00 p.m. and Friday 5:00 p.m. | shanggao@andrew.cmu.edu |
Violet Han (half-TA) | Monday 5:00 p.m. | yinuoh@andrew.cmu.edu |
All office hours will be held on Zoom using the course Zoom link, unless notified and arranged otherwise.
- Class location and time: WEH 2302, Monday/Wednesday 1:25 p.m. - 2:45 p.m.
- Class recitation and journal club: Class Zoom, Friday 1:25 p.m. - 2:45 p.m.
- Website: http://www.ni.cmu.edu/~tai/cp22.html (course info and readings)
- Canvas: Lecture materials and information will be posted on Canvas.
Recommended Textbook
- Handouts on Canvas.
- Frisby and Stone, Seeing: The Computational Approach to Biological Vision. MIT Press, 2010 (recommended).
Classroom Etiquette
- Refrain from using laptops and cell phones during class.
Grading Scheme 15-387
Evaluation | Grade Points |
Assignments | 65 |
Midterm | 10 |
Final Exam | 15 |
Class Participation | 10 |
- Total points: 100
- Grading scheme: A: > 88, B: > 75, C: > 65, D: > 50.
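As an illustration only, the cutoffs above can be read as strict thresholds checked from highest to lowest. The Python sketch below assumes that reading; the function name `letter_grade` and the "below D" label are hypothetical and are not part of the course policy.

```python
def letter_grade(points: float) -> str:
    """Map a 0-100 point total to a letter using strict '>' cutoffs.

    Assumption: each cutoff is exclusive, as written in the grading scheme
    (e.g. exactly 88 points falls in the B band, not the A band).
    """
    cutoffs = [("A", 88), ("B", 75), ("C", 65), ("D", 50)]
    for letter, cutoff in cutoffs:
        if points > cutoff:
            return letter
    # The syllabus does not name the band at or below 50 points;
    # this label is a placeholder, not an official grade.
    return "below D"


if __name__ == "__main__":
    for total in (92, 88, 70, 50):
        print(total, letter_grade(total))
```

Under this reading, a total of exactly 88 maps to B because the cutoff is written as "> 88" rather than ">= 88".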
Grading Scheme 86-375
Evaluation | Points |
Assignments | 39 |
Midterm | 10 |
Final Exam | 15 |
Flex Requirement | 26 |
Class participation | 10 |
- Total credit for 86-375: 100
- Assignments: 39 points from the top three of five problem sets.
- Flex Requirement: problem set (13), a term project (13) OR term paper (13) OR Journal Club (13)
- Grading scheme: A: > 88, B: > 75, C: > 65, D: > 50.
Grading Scheme 86-675
Evaluation | Points |
Assignments | 65 |
Midterm | 10 |
Final Exam * | 15 |
Journal Club * | 3 Presentations |
Term Project * | See below. |
Class participation | 10 |
- Total credit for 86-675: 100
- Journal Club: attend at least 10 sessions and give 2-3 presentations.
- A term project can be used to replace one or two problem sets, depending on the scale of the project.
- Grading scheme: A: > 88, B: > 75, C: > 65.
Homework
- There will be 5 homework assignments involving PyTorch (Matlab?). The focus is on performing experiments with existing code rather than coding algorithms from scratch.
- Each student has a 7-day grace period for late homework. This grace period can be used for one assignment or spread across several. Use it wisely; you cannot ask for more.
- A Canvas or Gradescope submission made after the start of class on the due date is considered one day late.
Term Project
- The term project option is available to 86-375 and 86-675 students and counts for 25 and 26 points. 15-387 students can do a term project to replace one of their assignments (max: 13 points).
A term project must involve some computational experiments, using either downloadable software or programs you develop.
All code should be documented and archived on GitHub and submitted together with a term paper (6-8 pages), with supporting materials in a separate PDF/doc file and/or Matlab zip files.
A term project should take about 30 hours to complete. Undergraduate students should each work on their own project.
Graduate students can work in a team on a larger-scale project, but this needs to be approved in advance.
The project proposal is due by midterm.
Students are encouraged to discuss project ideas with the professor from the very beginning of the semester.
Term Paper
- The term paper option is available to 86-375 students and counts for a max of 26 points. It should be an extensive, in-depth review of a particular topic, to be approved by the professor before midterm. The paper will be about 8 pages in NeurIPS format.
The paper proposal is due by midterm. The student is required to give a PowerPoint presentation, in person or on Zoom, at the end of the semester.
Journal Club
- The journal club option is available to 86-675 students and counts for 25/26 points toward the total grade.
Each student is expected to present three to four times during the course of the semester and to participate in 90% of the journal club discussions.
Examinations
- There will be a midterm (10 points) and a final exam (15 points) to test materials covered
in the lectures and homework assignments.
- There will be 12 in-class exercises given at random times, each worth 1 point. Full credit: 10 points.
Syllabus
Date | Lecture Topic | | Assignments |
| SENSORY CODING | | |
M 8/29 | 1. Introduction | | |
W 8/31 | 2. Perceptual Theories | | |
M 9/5 | Labor Day (no class) | | |
W 9/7 | 3. Sensors and Retina | | Homework 1 |
M 9/12 | 4. Frequency Analysis | | |
W 9/14 | 5. Pyramid | | |
M 9/19 | 6. Lightness and Color | | |
W 9/21 | 7. Retinex and Intrinsic Images | | Homework 2 |
| PERCEPTUAL INFERENCE | | |
M 9/26 | 8. Lightness perception | | |
W 9/28 | 9. Shape from Shading | | |
M 10/3 | 10. Visual Cortex | | Mid-Course Evaluation |
W 10/5 | 11. Neural Networks | | Homework 3 |
M 10/10 | 12. Contours | | |
W 10/12 | Midterm | | |
F 10/14 | Family Weekend | | |
M 10/17 | Fall break | | |
W 10/19 | Fall break | | |
F 10/21 | Fall break | | |
M 10/24 | 13. Junctions | | Mid-term Grade. Project Proposal due |
W 10/26 | 14. Efficient Code | | |
F 10/28 | Community Day - No Class | | |
M 10/31 | 14. Organization | | |
W 11/2 | 15. Texture | | Homework 4 (Nov 5) |
M 11/7 | 16. Metamers | | |
W 11/9 | 17. Surfaces | | |
M 11/14 | 18. Objects | | |
W 11/16 | 19. Inferences | | Proposal due. Homework 5 (out Nov 18) |
M 11/21 | 20. Composition and Objects | | |
W 11/23 | Thanksgiving break | | |
M 11/28 | 21. Attention | | |
W 11/30 | 22. Consciousness and Cognition | | |
M 12/5 | 23. Review | | HW 5 due |
W 12/7 | In-class Final Exam | | Take-home due |
S 12/18 | Final Exam Slot. 8:30-11:30 am | | |
Reading (relevant, but optional reading)
Week 1 (Lectures 1 and 2) Observations, Theories and Computational Philosophy
Week 2,3 (Lectures 3, 4, 5) Retina, Pyramid and Neural Network
- Visual perception starts with the eyes and the photoreceptors. However, there is already sophisticated computation in the retina. We will read some classic and modern papers on retinal processing and cover some basic background on frequency analysis and pyramid representation, as well as the current computational approach (via deep learning) to modeling retinal processing. We will do a problem set on retinal processing and explore its relationship to some visual illusions and perception.
This week, Jingkai Wen will present Gollisch and Meister (2010) and Joshua Kosnoff will present Maheswaranathan, ..., Ganguli and Baccus (2018).
- Lettvin, Maturana, McCulloch and Pitts (1959) What the frog's eye tells the frog's brain. Proceedings of the IRE, 1940-1951.
- Gollisch and Meister (2010) Eye Smarter than Scientists Believed: Neural Computations in Circuits of the Retina. Neuron 65: 151-164.
- Maheswaranathan, Kastner, Baccus and Ganguli (2018) Inferring hidden structure in multilayered neural circuits. PLOS Computational Biology 14(8):e1006291.
- Maheswaranathan, ..., Ganguli and Baccus (2018) Deep learning models reveal internal structure and diverse computations in the retina under natural scenes. bioRxiv, June 8, 2018.
- McIntosh, Maheswaranathan, Nayebi, Ganguli and Baccus (2016) Deep Learning Models of the Retinal Response to Natural Scenes. NIPS.
- Perdreau, F. & Cavanagh, P. (2011) Do artists see their retinas? Frontiers in Human Neuroscience, 5:171.
Week 4 (Lectures 6, 7, 8) Lightness perception and Intrinsic Images
- Our perception of brightness (or lightness) and color is not determined by what is sensed by the retina; it is in fact an interpretation of the lightness and color properties of object surfaces in the world. We will explore the classic theory of retinex as well as the modern computational theory of intrinsic images for understanding lightness and color perception, culminating in a problem set on these issues.
Shaurjya Madal will present the Janner-Tenenbaum (2017) NeurIPS paper, and Madeline Davis will present the Ma-Torralba (2018) ECCV paper.
- Adelson, Ed (2000) Lightness Perception and Lightness Illusion. The New Cognitive Neuroscience, Gazzaniga ed. MIT Press.
- Land, E. (1977) The retinex theory of color vision. Scientific American.
- Horn, B. (1974) Determining lightness from an image. Computer Graphics and Image Processing.
- Morel JM, Petro AB, Sbert C. (2010) IEEE Trans Image Process. 19(11): 2825-37.
- Tappen, Freeman and Adelson (2005) Recovering intrinsic images from a single image. IEEE PAMI 27(9): 1459-1472.
- Michael Janner, Jiajun Wu, Tejas D. Kulkarni, Ilker Yildirim, Joshua B. Tenenbaum (2017) Self-Supervised Intrinsic Image Decomposition. NeurIPS.
- Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun and Antonio Torralba (2018) Single Image Intrinsic Decomposition without a Single Intrinsic Image. ECCV.
Week 5 (Lectures 9, 10) Perception of 3D shapes
- Shading, as well as many other cues, allows us to infer 3D shapes. The intrinsic image of shading is a consequence of illumination acting on 3D shape, suggesting that illumination and 3D shape are further decompositions of the shading image. We ask how 3D shape, as an intrinsic image or as an object, is represented in the brain: as 2D images, 2.5D sketches, or 3D solids?
Emily Lopez explores whether the brain might encode some intrinsic images by reading Ramanujan Srinath et al.'s paper on solid shape coding. Rachel Hagani will read the classic paper by H H Bülthoff, S Y Edelman and M J Tarr, "How are three-dimensional objects represented in the brain?", while Yihan Zhang will read a deep learning paper by Jiajun Wu that might have implications for our understanding of the brain.
- Ramanujan Srinath, A. Emonds, O. Wang, A. Lempel, E. Dunn-Weiss, C.E. Connor, K. Nielsen. Early Emergence of Solid Shape Coding in Natural and Deep Network Vision. Current Biology, 31, 51-65. 2021.
- H H Bülthoff, S Y Edelman, M J Tarr. How are three-dimensional objects represented in the brain? Cerebral Cortex 1995 May-Jun;5(3):247-60.
- Zhang, X, Zhang, Z, Zhang C, Tenenbaum J, Freeman W, Wu, J. Learning to Reconstruct Shapes from Unseen Classes. NeurIPS 2018.
Week 5 (Lectures 9, 10) Source Separation and Representation Learning
Week 6 (Lectures 11, 12) Perceptual inference: contours, depth, surfaces
Weeks 7 and 8 (Lectures 13, 14, 15, 16) Perceptual Organization: Grouping, Gestalt, Content and Style
- In addition to inferring "visible" physical properties of the world, the brain also tries to organize sensory information into parsimonious descriptions in order to infer more abstract and global properties, or summary statistics, of the world, such as boundary, texture, style, and content. In this segment of the course, we will study the Gestalt school of thought and models for extracting these properties.
- Bela Julesz (1981) Textons, the elements of texture perception, and their interactions. Nature 290: 91-97.
- Heeger and Bergen (1995) Pyramid-based texture analysis/synthesis. SIGGRAPH 1995.
- Portilla and Simoncelli (2000) A parametric texture model based on joint statistics of complex wavelet coefficients. International Journal of Computer Vision 40(1), 49-71.
- L. A. Gatys, A. S. Ecker, and M. Bethge (2015) Texture Synthesis Using Convolutional Neural Networks. NIPS 28.
- L. A. Gatys, A. S. Ecker, and M. Bethge (2017) Texture and art with deep neural networks. Current Opinion in Neurobiology 46, 178-186.
- L. A. Gatys, A. S. Ecker, and M. Bethge (2016) Image Style Transfer Using Convolutional Neural Networks. CVPR 2016.
- Freeman J, Simoncelli EP. (2011) Metamers of the ventral stream. Nature Neuroscience.
- Kovacs, I., Papathomas, T., Yang, M. and Feher, A. (1996) When the brain changes its mind: Interocular grouping during binocular rivalry. Proceedings of the National Academy of Sciences, 93(26), pp. 15508-15511.
- Bilge Sayim and Patrick Cavanagh (2011) What line drawings reveal about the visual brain.
Weeks 9 and 10 (Lectures 17, 18, 19, 20) Analysis by Synthesis and Predictive Coding
- Early computer vision approaches emphasized analysis by synthesis. This framework can be generalized to conceptualize the recurrent interactions in the hierarchical visual system and perception as inverse graphics. We will study these theories and their neural foundations, and see how these principles can be extended to self-supervised learning based on prediction.
- Van Essen, Anderson and Felleman (1992) Information processing in primate visual systems: an integrated approach. Science 5043: 419-423.
- Felleman and Van Essen (1991) Distributed Hierarchical Processing in the Primate Cerebral Cortex. Cerebral Cortex 1-47.
- Mumford, D. (1992) On the computational architecture of the neocortex. Biological Cybern. 66: 241-251.
- Rao and Ballard (1999) Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive field effects. Nature Neuroscience 2(1), 79-87.
- Lee and Mumford (2003) Hierarchical Bayesian inference in the visual system. J. Optical Society of America 20(7), 1434-1448.
- Lee, T.S. (2015) The Visual System's Internal Models of the World. Proceedings of the IEEE, Vol. 103, issue 8, 1359-1378.
- Lotter, Kreiman and Cox (2020) A neural network trained for prediction mimics diverse features of biological neurons and perception. Nature Machine Intelligence, vol. 2, 210-219.
- Ilker Yildirim, Mario Belledonne, Winrich Freiwald, Josh Tenenbaum (2020) Efficient inverse graphics in biological face processing. Sci. Adv. 2020; 6: eaax5979, 4 March 2020.
- T. D. Kulkarni, W. F. Whitney, P. Kohli, J. Tenenbaum. Deep convolutional inverse graphics network. In Proceedings of the Advances in Neural Information Processing Systems (NIPS, 2015), pp. 2539-2547.
- Kar K, Kubilius J, Schmidt K, Issa EB, DiCarlo JJ. Evidence that recurrent circuits are critical to the ventral stream's execution of core object recognition behavior. Nature Neuroscience. 2019. doi: 10.1038/s41593-019-0392-5.
Week 11 (Lectures 21, 22) Compositional Theory, Scenes, Objects and Parts
- Compositional theory argues that the brain, or the visual system, is organized as a recursive, nearly decompositional system that best models the compositional nature of parts, objects and scenes in the natural world, along with their flexible combination and deformation. We will explore some classical and modern theories of composition, examining from this perspective the linkage between modern deep neural networks and symbolic AI, and the connection between language and vision.
- E. Bienenstock, S. Geman, and D. Potter. Compositionality, MDL Priors, and Object Recognition. NIPS Advances in Neural Information Processing Systems 9. 1998.
- S. Geman. Hierarchy in machine and natural vision. Proceedings of the 11th Scandinavian Conference on Image Analysis, 1999.
- Yuille. Towards a Theory of Compositional Learning and Encoding of Objects. 1st IEEE Workshop in Information Theory in Computer Vision and Pattern Recognition. ICCV 2011.
- S.C. Zhu and D. Mumford (2006) A Stochastic Grammar of Images. Foundations and Trends in Computer Graphics and Vision 2(4): 259-362.
Week 12 (Lectures 23, 24) Attention and Awareness
Week 13 (Lectures 25, 26) Perception, Art and Beauty and Review
What is beauty? Is it just something in the eye of the beholder, or is it something universal? In this final week of the class, we hope to explore the concepts and computational theories of beauty and art, and how they might relate to the computational principles underlying perception that we have studied in the course.
- Cavanagh, P. (2005) The artist as neuroscientist. Nature, 434, 301-307.
- Schmidhuber, Jurgen (1997) Low complexity art.
- Schmidhuber, Jurgen (2008) Driven by Compression Progress: A simple principle explains essential aspects of subjective beauty, novelty, surprise, interestingness, attention, curiosity, creativity, art, science, music, and jokes!
Questions or comments:
contact Tai Sing Lee
Last modified: August 2022, Tai Sing Lee