Computer Vision Notes Summary

Useful Websites

CVPR: IEEE Conference on Computer Vision and Pattern Recognition
ICCV: International Conference on Computer Vision
ECCV: European Conference on Computer Vision
NeurIPS (formerly NIPS): Conference on Neural Information Processing Systems
ICLR: International Conference on Learning Representations
ICML: International Conference on Machine Learning
Publications from the Stanford Vision Lab
Awesome Deep Vision
Past CS229 Projects: Example projects from Stanford's machine learning class
Kaggle challenges: An online machine learning competition website, for example a Yelp classification challenge.

Image classification: [Krizhevsky et al.], [Russakovsky et al.], [Szegedy et al.], [Simonyan et al.], [He et al.], [Huang et al.], [Hu et al.], [Zoph et al.]
Object detection: [Girshick et al.], [Ren et al.], [He et al.]
Image segmentation: [Long et al.], [Noh et al.], [Chen et al.]
Video classification: [Karpathy et al.], [Simonyan and Zisserman], [Tran et al.], [Carreira et al.], [Wang et al.]
Scene classification: [Zhou et al.]
Face recognition: [Taigman et al.], [Schroff et al.], [Parkhi et al.]
Depth estimation: [Eigen et al.]
Image-to-sentence generation: [Karpathy and Fei-Fei], [Donahue et al.], [Vinyals et al.], [Xu et al.], [Johnson et al.]
Visualization and optimization: [Szegedy et al.], [Nguyen et al.], [Zeiler and Fergus], [Goodfellow et al.], [Schaul et al.]

ImageNet: a large-scale image dataset for visual recognition organized by WordNet hierarchy
SUN Database: a benchmark for scene recognition and object detection with annotated scene categories and segmented objects
Places Database: a scene-centric database with 205 scene categories and 2.5 million labeled images
NYU Depth Dataset v2: an RGB-D dataset of segmented indoor scenes
Microsoft COCO: a new benchmark for image recognition, segmentation and captioning
Flickr100M: 100 million Creative Commons Flickr images
Labeled Faces in the Wild: a dataset of 13,000 labeled face photographs
Human Pose Dataset: a benchmark for articulated human pose estimation
YouTube Faces DB: a face video dataset for unconstrained face recognition in videos
UCF101: an action recognition dataset of realistic action videos with 101 action categories
HMDB-51: a large human motion dataset of 51 action classes
ActivityNet: a large-scale video dataset for human activity understanding
Moments in Time: a dataset of one million 3-second videos