Difference between revisions of "ReadingGroup"
Revision as of 07:10, 12 October 2017
Contents
- 1 Getting involved
- 2 Proposed Schedule
- 3 Details
- 3.1 Foley & Matlin Chapter 6 - Distance & Size Perception
- 3.2 Saxena, Min & Ng: Make3D
- 3.3 Michels, Saxena & Ng: High speed obstacle avoidance
- 3.4 Karsch, Liu & Kang: Depth Transfer
- 3.5 LeCun, Bengio & Hinton: Deep Learning Review
- 3.6 Rumelhart, Hinton & Williams: Backpropagation
- 3.7 LeCun, Bottou, Bengio & Haffner: CNNs
- 3.8 Krizhevsky, Sutskever & Hinton: ImageNet/AlexNet
- 3.9 Simonyan & Zisserman: VGG-16
- 3.10 Eigen, Puhrsch & Fergus: Depth map prediction
- 3.11 Luo et al.: Deep Learning + Stereo
- 3.12 Shelhamer, Long & Darrell: Fully Convolutional Segmentation
- 3.13 Ioffe & Szegedy: Batch Normalization
- 3.14 He, Zhang, Ren & Sun: ResNets
- 3.15 Giusti et al.: Forest trails CNN
- 3.16 Cao, Wu & Shen: Fully convolutional depth 1
- 3.17 Laina et al.: Fully convolutional depth 2
- 3.18 Li, Klein & Yao: Fully convolutional depth 3
- 3.19 Liao, Huang, Wang, Kodagoda, Yu & Liu: Fuse with laser
- 3.20 Girshick, Donahue, Darrell & Malik: R-CNN
- 3.21 Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images
- 3.22 Goodfellow et al.: Generative Adversarial Nets
- 3.23 Oord et al.: Pixel-RNN & Pixel-CNN
- 3.24 Isola et al.: Pix2Pix
- 4 List of interested people
- 5 Additional Resources
Getting involved
There is now a mailing list for this reading group. Send me (Damien) an email to get on it. No problem.
Note: this reading group is about deep learning as applied to depth estimation from a single image, currently a very hot topic. If your interest is deep learning in general, you may find some of the readings a little off-topic, so let me know if you want advice on which ones to read.
Proposed Schedule
The schedule below is only a proposal and is subject to change.
- 14 Sept 8.30am: Foley & Matlin Chapter 6: Distance & Size Perception
- Location: EEBF 4302
- 21 Sept 8.30am: Saxena, Min & Ng: Make3D
- Location: EEBF 4302
- 28 Sept 8.30am: Michels, Saxena & Ng: High speed obstacle avoidance
- Location: EEBF 4302
- 05 Octr 8.30am: Karsch, Liu & Kang: Depth Transfer
- Location: EEBF 4302
- 12 Octr 8.30am: LeCun, Bengio & Hinton: Deep Learning Review
- Location: EEBF 4302
- 19 Octr 8.30am: Rumelhart, Hinton & Williams: Backpropagation
- Location: EEBF 4302
- 26 Octr 8.30am: LeCun, Bottou, Bengio & Haffner: CNNs
- Location: EEBF 4302
- 02 Novr 8.30am: Simonyan & Zisserman: VGG-16
- Location: EEBF 4302
- 09 Novr 8.30am: Eigen, Puhrsch & Fergus: Depth map prediction
- Location: EEBF 4302
- 16 Novr 8.30am: Luo et al.: Deep Learning + Stereo
- Location: EEBF 4302
- 23 Novr 8.30am: Shelhamer, Long & Darrell: Fully Convolutional Segmentation
- Location: EEBF 4302
- 30 Novr 8.30am: Ioffe & Szegedy: Batch Normalization
- Location: EEBF 4302
- 07 Decr 8.30am: He, Zhang, Ren & Sun: ResNet
- Location: EEBF 4302
- 14 Decr 8.30am: Giusti et al.: Forest trails CNN
- Location: EEBF 4302
- 22 Decr 8.30am: Cao, Wu & Shen: Fully convolutional depth 1
- Location: EEBF 4302
- Week of 25 Decr: Laina et al.: Fully convolutional depth 2
- Week of 01 Janr: Break
- Week of 08 Janr: Break
- Week of 15 Janr: Li, Klein & Yao: Fully convolutional depth 3
- Week of 22 Janr: Liao, Huang, Wang, Kodagoda, Yu & Liu: Fuse with laser
- Week of 29 Janr: Girshick, Donahue, Darrell & Malik: R-CNN
- Week of 06 Febr: Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images
- Week of 13 Febr: Goodfellow et al.: Generative Adversarial Nets
- Week of 20 Febr: Oord et al.: Pixel-RNN and Pixel-CNN
- Week of 27 Febr: Isola et al.: Pix2Pix
Details
Foley & Matlin Chapter 6 - Distance & Size Perception
Because our project is about using machine learning to extract depth from a single image (with deep learning, then applying it to robot problems) it pays to learn a bit about how humans do it...
https://books.google.com.tr/books?id=jLBmCgAAQBAJ&printsec=frontcover
Go to Chapter 6.
If that doesn't work (some have reported finding it difficult to access Chapter 6), try the following link: http://tinyurl.com/yalnnwp9 - some have reported being able to access the chapter by doing a Google search for its content.
Another thing that has worked for some is to log out of any Google/Gmail account before trying to access it.
If nothing else works, email me.
Saxena, Min & Ng: Make3D
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4531745
This is the classic paper that brought machine learning to the problem of depth from a single image, quite successfully compared with previous attempts. It uses Markov Random Fields, which are a bit advanced and, importantly, quite slow.
Note: because our library has a subscription to IEEE Xplore, you can access the above link from on-campus or via off-campus library access or via VPN.
If you cannot get access that way, here is an alternative link: http://www.cs.cornell.edu/~asaxena/reconstruction3d/saxena_make3d_learning3dstructure.pdf
There are some videos and things available here: http://make3d.cs.cornell.edu/ -- there used to be a live online demo but they've closed that. There is also a list of results on the Make3D dataset up till about 2012: http://make3d.cs.cornell.edu/results_stateoftheart.html
After that, other datasets also came into use.
Superpixels are used in the study. Here is a quick intro to them: http://ttic.uchicago.edu/~xren/research/superpixel/
MRFs are more difficult; if anybody has seen a good tutorial on them, let me know so that I can link to it here. The best I could find is https://mitpress.mit.edu/sites/default/files/titles/content/9780262015776_sch_0001.pdf but it is still a bit difficult. We will probably end up spending a lot of Thursday discussing what MRFs are.
Michels, Saxena & Ng: High speed obstacle avoidance
http://dl.acm.org/citation.cfm?id=1102426
Here the same authors focus on a related problem, that of determining open spaces for guiding a vehicle, again using machine learning techniques.
This version of the paper might be of higher quality (thanks to Hossein for finding it):
http://ai.stanford.edu/~asaxena/rccar/ICML_ObstacleAvoidance.pdf
Karsch, Liu & Kang: Depth Transfer
Here is the target paper: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5551153
For those who are not on campus, a temporary link: http://web.itu.edu.tr/djduff/Share/KarschEtAl2014.pdf
This is a nonparametric approach to depth from a single image. It searches a database for images similar to the observed one, aligns the retrieved images with the observed one, and then warps the retrieved images to estimate the depth of the current image. It depends on an approach called SIFTFlow to do the alignment.
Here is a paper describing "SIFTFlow" (if you have the time to go deeper): http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6787109
Free version: http://people.csail.mit.edu/celiu/SIFTflow/
Or a shorter conference version available off-campus: http://people.csail.mit.edu/celiu/ECCV2008/
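The retrieve-then-fuse idea behind Depth Transfer can be sketched roughly as below. This is only a toy illustration, not the authors' pipeline. It assumes each database image has already been reduced to a small feature vector and is already aligned with the query, so the SIFTFlow warping step is skipped entirely, and it fuses the retrieved depth maps with a per-pixel median chosen here purely for simplicity. The function names and the toy database are hypothetical.

```python
from statistics import median


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve_neighbours(query_feature, database, k):
    """Return the k database entries whose features are closest to the query."""
    ranked = sorted(
        database,
        key=lambda entry: squared_distance(query_feature, entry["feature"]),
    )
    return ranked[:k]


def transfer_depth(query_feature, database, k=3):
    """Estimate depth as the per-pixel median of the k nearest depth maps.

    Assumption: all depth maps have the same size and are already aligned
    with the query image (the real method aligns candidates with SIFTFlow).
    """
    neighbours = retrieve_neighbours(query_feature, database, k)
    depths = [entry["depth"] for entry in neighbours]
    rows, cols = len(depths[0]), len(depths[0][0])
    return [
        [median(d[r][c] for d in depths) for c in range(cols)]
        for r in range(rows)
    ]


# Toy database: hypothetical 2-D image features and 2x2 depth maps (metres).
database = [
    {"feature": [0.0, 0.0], "depth": [[1.0, 1.0], [2.0, 2.0]]},
    {"feature": [0.1, 0.0], "depth": [[1.2, 1.0], [2.0, 2.4]]},
    {"feature": [0.0, 0.2], "depth": [[0.8, 1.0], [2.0, 2.2]]},
    {"feature": [5.0, 5.0], "depth": [[9.0, 9.0], [9.0, 9.0]]},
]

estimate = transfer_depth([0.05, 0.05], database, k=3)
```

The far-away fourth entry is never retrieved, so its depths do not influence the estimate. That is the sense in which the approach is nonparametric: the "model" is just the database plus a similarity measure.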
LeCun, Bengio & Hinton: Deep Learning Review
http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html?foxtrotcallback=true
Alternative links: http://pages.cs.wisc.edu/~dyer/cs540/handouts/deep-learning-nature2015.pdf https://www.researchgate.net/publication/277411157_Deep_Learning
A whirlwind compressed intro to deep learning and its parts.
For a more gentle introduction to deep learning: http://cs231n.stanford.edu/
Or you can find lots of gentle short intros: https://www.google.com.tr/search?q=intro+to+deep+learning
Rumelhart, Hinton & Williams: Backpropagation
An early paper introducing backpropagation, the main way we train neural networks nowadays: http://www.nature.com/articles/323533a0
Alternative link: http://www.cs.toronto.edu/~hinton/absps/naturebp.pdf
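For anyone who wants to see the idea in code before reading, here is a minimal sketch of backpropagation for a tiny network. It is not taken from the paper. It assumes one hidden layer of sigmoid units, a single sigmoid output, a squared-error loss and no bias terms, and all names and weight values are made up for illustration. The last few lines compare one backpropagated gradient with a finite-difference estimate as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """One hidden layer of sigmoid units followed by a single sigmoid output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def loss(x, target, w_hidden, w_out):
    """Squared error between the network output and the target."""
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * (output - target) ** 2


def backprop(x, target, w_hidden, w_out):
    """Gradients of the loss with respect to every weight, via the chain rule."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output unit: dL/dz_out.
    delta_out = (output - target) * output * (1.0 - output)
    grad_out = [delta_out * h for h in hidden]
    # Send the error signal back through the output weights: dL/dz_hidden.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_hidden, grad_out


# Hypothetical input, target and starting weights.
x, target = [0.5, -1.0], 1.0
w_hidden = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.7, -0.5]

grad_hidden, grad_out = backprop(x, target, w_hidden, w_out)

# Sanity check: central finite difference for one hidden-layer weight.
eps = 1e-6
w_plus = [row[:] for row in w_hidden]
w_minus = [row[:] for row in w_hidden]
w_plus[0][1] += eps
w_minus[0][1] -= eps
numeric = (loss(x, target, w_plus, w_out) - loss(x, target, w_minus, w_out)) / (2 * eps)
```

Training then consists of repeatedly moving each weight a small step against its gradient.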
LeCun, Bottou, Bengio & Haffner: CNNs
http://ieeexplore.ieee.org/abstract/document/726791/
Alternative link: http://www.dengfanxin.cn/wp-content/uploads/2016/03/1998Lecun.pdf
Here is the classic paper applying convolutional neural networks to image processing.
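As a warm-up, the core operation of a convolutional layer can be sketched in a few lines. This is a toy illustration rather than anything from the paper. It assumes a single-channel image, a single filter, no padding and a stride of one, and the filter values are hand-picked here, whereas a real network learns them.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding ("valid" mode).

    As in most deep learning libraries, the kernel is not flipped, so this
    is strictly speaking a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# A 4x4 image with a vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to left-to-right increases in brightness.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
```

The resulting 3x3 feature map is non-zero only in its middle column, where the filter straddles the edge. The same small set of weights is reused at every image position, which is what keeps the number of parameters low.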
Krizhevsky, Sutskever & Hinton: ImageNet/AlexNet
We will not discuss this in the reading group.
This is where convolutional neural networks and deep learning really showed what they could do on the problem of image recognition.
But we won't use this because a lot of the complexity it introduces turns out not to be necessary. Later methods are "cleaner". So we have taken it out of the reading list.
Simonyan & Zisserman: VGG-16
http://arxiv.org/abs/1409.1556
A relatively recent "deep" deep net with 16 layers for image recognition. Note: some successful recent networks have on the order of one thousand layers.
Feel free to take a look at AlexNet above to get an idea of the space of approaches.
Eigen, Puhrsch & Fergus: Depth map prediction
https://www.cs.nyu.edu/~deigen/depth/
Finally, we apply deep convolutional neural networks to the problem that we are interested in.
Luo et al.: Deep Learning + Stereo
Combining deep learning and stereo.
https://www.cs.toronto.edu/~urtasun/publications/luo_etal_cvpr16.pdf
Shelhamer, Long & Darrell: Fully Convolutional Segmentation
http://arxiv.org/abs/1605.06211
Here a related problem is solved, that of semantic segmentation, but the approach is also applicable to our problem.
Ioffe & Szegedy: Batch Normalization
https://arxiv.org/abs/1502.03167
A recent technique that has enabled powerful new methods and ultimately much deeper neural networks. Important stuff.
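For orientation, here is a rough sketch of the training-time forward pass of batch normalization: each feature is normalised using the mean and variance of the current mini-batch, then scaled and shifted. It is a simplified illustration under stated assumptions, not the paper's full method. It leaves out the backward pass and the averaged statistics used at inference time, and the function name and toy batch are hypothetical.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalise each feature over the mini-batch, then scale and shift.

    batch: list of examples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learned in a real network).
    eps: small constant that avoids division by zero.
    """
    n = len(batch)
    n_features = len(batch[0])
    means = [sum(ex[f] for ex in batch) / n for f in range(n_features)]
    variances = [
        sum((ex[f] - means[f]) ** 2 for ex in batch) / n
        for f in range(n_features)
    ]
    return [
        [
            gamma[f] * (ex[f] - means[f]) / math.sqrt(variances[f] + eps) + beta[f]
            for f in range(n_features)
        ]
        for ex in batch
    ]


# Two features on very different scales (hypothetical values).
batch = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
normalised = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

After normalisation both features have zero mean and roughly unit variance within the batch, regardless of their original scale.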
He, Zhang, Ren & Sun: ResNets
https://arxiv.org/abs/1512.03385
This work and variations on it have been the basis of recent neural networks with around 1000 layers. Important stuff.
Giusti et al.: Forest trails CNN
http://ieeexplore.ieee.org/document/7358076/
Alternative link: http://rpg.ifi.uzh.ch/docs/RAL16_Giusti.pdf
See also youtube: https://www.youtube.com/watch?v=umRdt3zGgpU
Here we have a CNN-based update to the learn-to-navigate-from-images problem addressed by Michels, Saxena & Ng above.
Cao, Wu & Shen: Fully convolutional depth 1
http://arxiv.org/abs/1605.02305
Here we start a series of recent papers that take different deep-net approaches to depth from a single image.
Laina et al.: Fully convolutional depth 2
http://arxiv.org/abs/1606.00373
Here we continue a series of recent papers that take different deep-net approaches to depth from a single image.
Li, Klein & Yao: Fully convolutional depth 3
http://arxiv.org/abs/1607.00730
Here we conclude a series of recent papers that take different deep-net approaches to depth from a single image.
Liao, Huang, Wang, Kodagoda, Yu & Liu: Fuse with laser
https://arxiv.org/abs/1611.02174
Here we see an interesting depth-from-single-image sensor fusion with robotics applications.
Girshick, Donahue, Darrell & Malik: R-CNN
https://arxiv.org/abs/1311.2524
We take a slight detour to check out how object detection has been done recently with neural networks. Note that Faster R-CNN and more recent alternatives use similar principles but run faster.
Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images
https://arxiv.org/abs/1411.5928
A non-adversarial approach to generating images.
Goodfellow et al.: Generative Adversarial Nets
https://papers.nips.cc/paper/5423-generative-adversarial-nets
Another important recent development that we may make use of.
Oord et al.: Pixel-RNN & Pixel-CNN
https://arxiv.org/abs/1601.06759
Producing distributions over images. We have always intended to do something like this for depth images.
http://arxiv.org/abs/1606.05328
Isola et al.: Pix2Pix
https://arxiv.org/abs/1611.07004
We can use this too. And it's cool.
List of interested people
(who I will contact with information about the schedule etc.)
- Abdulmajeed M. K.
- Alican M.
- Anas M.
- K. Bulut Ö.
- Imaduddin A. M.
- Tolga C.
- Bilge A.
- Hatice K.
- Hossein P.
- Torkan G.
- Buse Sibel K.
- Oğuzhan C.
- M. Alperen Ö.
- Hatice K.
- Özgür Ö.
- Doğay K.
- Onur A.
- Alper K.
- Furkan A.
- Elena B. S.
- Müjde A.
- Emeç E.
- Onur A.
- Ekrem Alper K.
- Utku Ö.
- Mert Ş.
- Jimmy A.
- Hasan K.
- B. Uğur T.
Additional Resources
EBook: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/DeepLearning-NowPublishing-Vol7-SIG-039.pdf Deep Learning: Methods and Applications by Li Deng and Dong Yu
Online course with slides, videos and assignments: http://cs231n.stanford.edu/ CS231n: Convolutional Neural Networks for Visual Recognition (Stanford)
My NN/Keras bootcamp slides: http://files.djduff.net/nn.zip