# Difference between revisions of "ReadingGroup"

m (→List of interested people) |
m |
||

Line 30: | Line 30: | ||

* Week of 22 Jany: Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images | * Week of 22 Jany: Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images | ||

* Week of 29 Jany: Break | * Week of 29 Jany: Break | ||

− | * Week of 06 Febr: Oord et al.: | + | * Week of 06 Febr: Oord et al.: Pixel-RNN and Pixel-CNN |

Line 40: | Line 40: | ||

https://books.google.com.tr/books?id=jLBmCgAAQBAJ&printsec=frontcover | https://books.google.com.tr/books?id=jLBmCgAAQBAJ&printsec=frontcover | ||

Go to Chapter 6. | Go to Chapter 6. | ||

+ | |||

+ | Because our project is about using machine learning to extract depth from a single image it pays to learn a bit about how humans do it. | ||

== Saxena, Min & Ng: Make3D == | == Saxena, Min & Ng: Make3D == | ||

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4531745 | http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4531745 | ||

+ | |||

+ | This is the classic paper that brought machine learning to the problem of depth from a single image, quite successfully, considering previous attempts. It uses Markov Random Fields, which are a bit advanced but, importantly, quite slow. | ||

== Michels, Saxena & Ng: High speed obstacle avoidance == | == Michels, Saxena & Ng: High speed obstacle avoidance == | ||

http://dl.acm.org/citation.cfm?id=1102426 | http://dl.acm.org/citation.cfm?id=1102426 | ||

+ | |||

+ | Here the same authors focus on a related problem, that of determining open spaces for guiding a vehicle, again using machine learning techniques. | ||

== Karsch, Liu & Kang: Depth Transfer == | == Karsch, Liu & Kang: Depth Transfer == | ||

Line 54: | Line 60: | ||

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6787109 | http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6787109 | ||

+ | |||

+ | This is a nonparametric approach to depth from a single image. They search a database of images similar to the observed one then warp the image retrieved from the database to estimate the depth of the current image. | ||

== LeCun, Bottou, Bengio & Haffner: CNNs == | == LeCun, Bottou, Bengio & Haffner: CNNs == | ||

http://ieeexplore.ieee.org/abstract/document/726791/ | http://ieeexplore.ieee.org/abstract/document/726791/ | ||

+ | |||

+ | Here is the classic paper applying convolutional neural networks to image processing. | ||

== Krizhevsky, Sutskever & Hinton: ImageNet/AlexNet == | == Krizhevsky, Sutskever & Hinton: ImageNet/AlexNet == | ||

https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf | https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf | ||

+ | |||

+ | Here is when convolutional neural networks and deep learning really showed what it could do - the problem of image recognition. | ||

== Simonyan & Zisserman: VGG-16 == | == Simonyan & Zisserman: VGG-16 == | ||

http://arxiv.org/abs/1409.1556 | http://arxiv.org/abs/1409.1556 | ||

+ | |||

+ | A relatively recent "deep" deep net with 16 layers for image recognition. Note: successful recent networks have one thousand layers. | ||

== Eigen, Puhrsch & Fergus: Depth map prediction == | == Eigen, Puhrsch & Fergus: Depth map prediction == | ||

https://www.cs.nyu.edu/~deigen/depth/ | https://www.cs.nyu.edu/~deigen/depth/ | ||

+ | |||

+ | Finally, we apply deep neural convolutional networks to the problem that we are interested in. | ||

== Shelhamer, Long & Darrell: Fully Convolutional Segmentation == | == Shelhamer, Long & Darrell: Fully Convolutional Segmentation == | ||

http://arxiv.org/abs/1605.06211 | http://arxiv.org/abs/1605.06211 | ||

+ | |||

+ | Here a related problem is solved, that of semantic segmentation, but this approach is applicable to our problem. | ||

+ | |||

+ | == Loffe & Szegedy: Batch Normalization == | ||

+ | |||

+ | https://arxiv.org/abs/1502.03167 | ||

+ | |||

+ | A recent technique that has enabled powerful new methods and ultimately much deeper neural networks. Important stuff. | ||

== He, Zhang, Ren & Sun: ResNets == | == He, Zhang, Ren & Sun: ResNets == | ||

https://arxiv.org/abs/1512.03385 | https://arxiv.org/abs/1512.03385 | ||

+ | |||

+ | This work and variations on it have been the basis of the 1000 layer recent neural networks. Important stuff. | ||

== Girshick, Donahue, Darrell & Malik: R-CNN == | == Girshick, Donahue, Darrell & Malik: R-CNN == | ||

https://arxiv.org/abs/1311.2524 | https://arxiv.org/abs/1311.2524 | ||

+ | |||

+ | We take a slight seque to check out how tracking has been done recently with neural networks. Note that Faster-RCNN and more recent alternatives use similar principles but do it faster. | ||

+ | |||

+ | == Liao, Huang, Wang, Kodagoda, Yu & Liu: Fuse with laser == | ||

+ | |||

+ | https://arxiv.org/abs/1611.02174 | ||

+ | |||

+ | Here we see an interesting depth-from-single-image sensor fusion with robotics applications. | ||

== Giuisti et al.: Forest trails CNN == | == Giuisti et al.: Forest trails CNN == | ||

Line 88: | Line 122: | ||

See also youtube. | See also youtube. | ||

+ | |||

+ | Here we have a CNN-based update to the learn-to-navigate-from-images problem addressed by Saxena et al. above. | ||

== Cao, Wu & Shen: Fully convolutional depth 1 == | == Cao, Wu & Shen: Fully convolutional depth 1 == | ||

http://arxiv.org/abs/1605.02305 | http://arxiv.org/abs/1605.02305 | ||

+ | |||

+ | Here we start a series of recent papers that take different approaches using deep nets to depth from a single image. | ||

== Laina et al.: Fully convolutional depth 2 == | == Laina et al.: Fully convolutional depth 2 == | ||

http://arxiv.org/abs/1606.00373 | http://arxiv.org/abs/1606.00373 | ||

+ | |||

+ | Here we start a series of recent papers that take different approaches using deep nets to depth from a single image. | ||

== Li, Klein & Yao: Fully convolutional depth 3 == | == Li, Klein & Yao: Fully convolutional depth 3 == | ||

http://arxiv.org/abs/1607.00730 | http://arxiv.org/abs/1607.00730 | ||

+ | |||

+ | Here we start a series of recent papers that take different approaches using deep nets to depth from a single image. | ||

== Goodfellow et al.: Generative Adversarial Nets == | == Goodfellow et al.: Generative Adversarial Nets == | ||

https://papers.nips.cc/paper/5423-generative-adversarial-nets | https://papers.nips.cc/paper/5423-generative-adversarial-nets | ||

+ | |||

+ | Another important recent development that we may make use of. | ||

== Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images == | == Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images == | ||

Line 109: | Line 153: | ||

https://arxiv.org/abs/1411.5928 | https://arxiv.org/abs/1411.5928 | ||

− | == Oord et al.: | + | A non-adversarial approach to the same problem. |

+ | |||

+ | == Oord et al.: Pixel-RNN & Pixel-CNN == | ||

https://arxiv.org/abs/1601.06759 | https://arxiv.org/abs/1601.06759 | ||

+ | |||

+ | Producing distributions over images. We have always intended to do something like this for depth images. | ||

http://arxiv.org/abs/1606.05328 | http://arxiv.org/abs/1606.05328 |

## Revision as of 08:11, 5 September 2017

## Contents

- 1 Getting involved
- 2 Proposed Schedule
- 3 Details
- 3.1 Foley & Matlin Chapter 6 - Distance & Size Perception
- 3.2 Saxena, Sun & Ng: Make3D
- 3.3 Michels, Saxena & Ng: High speed obstacle avoidance
- 3.4 Karsch, Liu & Kang: Depth Transfer
- 3.5 LeCun, Bottou, Bengio & Haffner: CNNs
- 3.6 Krizhevsky, Sutskever & Hinton: ImageNet/AlexNet
- 3.7 Simonyan & Zisserman: VGG-16
- 3.8 Eigen, Puhrsch & Fergus: Depth map prediction
- 3.9 Shelhamer, Long & Darrell: Fully Convolutional Segmentation
- 3.10 Ioffe & Szegedy: Batch Normalization
- 3.11 He, Zhang, Ren & Sun: ResNets
- 3.12 Girshick, Donahue, Darrell & Malik: R-CNN
- 3.13 Liao, Huang, Wang, Kodagoda, Yu & Liu: Fuse with laser
- 3.14 Giusti et al.: Forest trails CNN
- 3.15 Cao, Wu & Shen: Fully convolutional depth 1
- 3.16 Laina et al.: Fully convolutional depth 2
- 3.17 Li, Klein & Yao: Fully convolutional depth 3
- 3.18 Goodfellow et al.: Generative Adversarial Nets
- 3.19 Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images
- 3.20 Oord et al.: Pixel-RNN & Pixel-CNN

- 4 List of interested people

# Getting involved

**To register your interest and receive announcements about the reading group, send me an email at djduff@itu.edu.tr.**
A decision about the weekly time slot will be made just after the course registration period.

# Proposed Schedule

The schedule below is only **proposed** and is subject to change.

- Week of 11 Sept: Foley & Matlin Chapter 6: Distance & Size Perception
- Week of 18 Sept: Saxena, Sun & Ng: Make3D
- Week of 25 Sept: Michels, Saxena & Ng: High speed obstacle avoidance
- Week of 02 Octr: Karsch, Liu & Kang: Depth Transfer
- Week of 09 Octr: LeCun, Bottou, Bengio & Haffner: CNNs
- Week of 16 Octr: Krizhevsky, Sutskever & Hinton: ImageNet/AlexNet
- Week of 23 Octr: Simonyan & Zisserman: VGG-16
- Week of 30 Octr: Break
- Week of 06 Novr: Eigen, Puhrsch & Fergus: Depth map prediction
- Week of 13 Novr: Shelhamer, Long & Darrell: Fully Convolutional Segmentation
- Week of 20 Novr: He, Zhang, Ren & Sun: ResNets
- Week of 27 Novr: Girshick, Donahue, Darrell & Malik: R-CNN
- Week of 04 Decr: Liao, Huang, Wang, Kodagoda, Yu & Liu: Fuse with laser
- Week of 11 Decr: Giusti et al.: Forest trails CNN
- Week of 18 Decr: Break
- Week of 25 Decr: Cao, Wu & Shen: Fully convolutional depth 1
- Week of 01 Jany: Laina et al.: Fully convolutional depth 2
- Week of 08 Jany: Li, Klein & Yao: Fully convolutional depth 3
- Week of 15 Jany: Goodfellow et al.: Generative Adversarial Nets
- Week of 22 Jany: Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images
- Week of 29 Jany: Break
- Week of 06 Febr: Oord et al.: Pixel-RNN and Pixel-CNN

# Details

## Foley & Matlin Chapter 6 - Distance & Size Perception

https://books.google.com.tr/books?id=jLBmCgAAQBAJ&printsec=frontcover Go to Chapter 6.

Because our project is about using machine learning to extract depth from a single image, it pays to learn a bit about how humans do it.

## Saxena, Sun & Ng: Make3D

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4531745

This is the classic paper that brought machine learning to the problem of depth from a single image, quite successfully compared with previous attempts. It uses Markov Random Fields, which are a bit advanced and, importantly, quite slow.

## Michels, Saxena & Ng: High speed obstacle avoidance

http://dl.acm.org/citation.cfm?id=1102426

Here Saxena and Ng, together with Michels, focus on a related problem: determining open spaces for guiding a vehicle, again using machine learning techniques.

## Karsch, Liu & Kang: Depth Transfer

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5551153

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6787109

This is a nonparametric approach to depth from a single image. They search a database for images similar to the observed one, then warp the images retrieved from the database to estimate the depth of the current image.
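
The retrieve-then-combine idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each database entry is a hypothetical dictionary holding a small feature vector and a flattened depth map, it skips the warping step entirely, and it fuses the retrieved depth maps with a simple per-pixel median.

```python
from statistics import median


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve_candidates(query_feature, database, k=3):
    """Return the k database entries whose features are closest to the query."""
    ranked = sorted(
        database,
        key=lambda entry: squared_distance(query_feature, entry["feature"]),
    )
    return ranked[:k]


def fuse_depths(candidates):
    """Per-pixel median of the candidates' depth maps (flat lists, equal length)."""
    depth_maps = [c["depth"] for c in candidates]
    return [median(pixel_values) for pixel_values in zip(*depth_maps)]


# Toy database: the "feature" and "depth" fields are assumptions for illustration.
database = [
    {"feature": [0.0, 0.0], "depth": [1.0, 1.0, 2.0, 2.0]},
    {"feature": [0.1, 0.0], "depth": [1.0, 2.0, 2.0, 3.0]},
    {"feature": [0.0, 0.2], "depth": [2.0, 2.0, 2.0, 4.0]},
    {"feature": [5.0, 5.0], "depth": [9.0, 9.0, 9.0, 9.0]},  # dissimilar scene
]

estimate = fuse_depths(retrieve_candidates([0.0, 0.1], database, k=3))
print(estimate)
```

The dissimilar fourth entry is never retrieved, so it does not influence the estimate. No parameters are learned here; all the "knowledge" lives in the database, which is what makes the approach nonparametric.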

## LeCun, Bottou, Bengio & Haffner: CNNs

http://ieeexplore.ieee.org/abstract/document/726791/

Here is the classic paper applying convolutional neural networks to image processing.

## Krizhevsky, Sutskever & Hinton: ImageNet/AlexNet

https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

This is where convolutional neural networks and deep learning really showed what they could do, on the problem of image recognition.

## Simonyan & Zisserman: VGG-16

http://arxiv.org/abs/1409.1556

A relatively recent "deep" deep net with 16 layers for image recognition. Note: some successful recent networks have around one thousand layers.

## Eigen, Puhrsch & Fergus: Depth map prediction

https://www.cs.nyu.edu/~deigen/depth/

Finally, we apply deep convolutional neural networks to the problem that we are interested in.

## Shelhamer, Long & Darrell: Fully Convolutional Segmentation

http://arxiv.org/abs/1605.06211

Here a related problem, semantic segmentation, is solved, and the approach is also applicable to our problem.

## Ioffe & Szegedy: Batch Normalization

https://arxiv.org/abs/1502.03167

A recent technique that has enabled powerful new methods and ultimately much deeper neural networks. Important stuff.
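
As a rough illustration of the core operation, the sketch below normalizes a batch of scalar activations to zero mean and unit variance, then applies a scale and shift. It covers only the training-time forward pass for one feature; the names `gamma`, `beta` and `eps` are this sketch's assumptions for the learned scale, the learned shift and the small stabilizing constant, and the running statistics used at test time are left out.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    mean = sum(values) / len(values)
    # Biased (population) variance over the mini-batch.
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in values]


normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
print(normalized)
```

Whatever the scale of the incoming activations, the output has roughly zero mean and unit variance before `gamma` and `beta` are applied, which is the property that makes deeper stacks of layers easier to train.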

## He, Zhang, Ren & Sun: ResNets

https://arxiv.org/abs/1512.03385

This work and variations on it have been the basis of the recent 1000-layer neural networks. Important stuff.
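
The central idea is a skip connection: each block outputs its input plus a learned correction. The sketch below is a simplified illustration, assuming small fully connected layers written in plain Python rather than the convolutional layers with batch normalization used in the paper; the helper names are hypothetical.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


def linear(weights, vector):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(x + F(x)), where F is two linear layers with a ReLU between."""
    hidden = relu(linear(w1, x))
    correction = linear(w2, hidden)
    return relu([a + b for a, b in zip(x, correction)])


# With all-zero weights the correction F(x) vanishes, so the block passes
# a non-negative input through unchanged.
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
output = residual_block([1.0, 2.0], zero_weights, zero_weights)
print(output)
```

Because a block only has to learn a correction to the identity, stacking many of them does not make the mapping harder to represent, which is one intuition for why very deep variants remain trainable.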

## Girshick, Donahue, Darrell & Malik: R-CNN

https://arxiv.org/abs/1311.2524

We take a slight segue to check out how object detection has been done recently with neural networks. Note that Faster R-CNN and more recent alternatives use similar principles but do it faster.

## Liao, Huang, Wang, Kodagoda, Yu & Liu: Fuse with laser

https://arxiv.org/abs/1611.02174

Here we see an interesting depth-from-single-image sensor fusion with robotics applications.

## Giusti et al.: Forest trails CNN

http://ieeexplore.ieee.org/document/7358076/

See also YouTube.

Here we have a CNN-based update to the learn-to-navigate-from-images problem addressed by Michels, Saxena & Ng above.

## Cao, Wu & Shen: Fully convolutional depth 1

http://arxiv.org/abs/1605.02305

Here we start a series of recent papers that take different approaches to depth from a single image using deep nets.

## Laina et al.: Fully convolutional depth 2

http://arxiv.org/abs/1606.00373

## Li, Klein & Yao: Fully convolutional depth 3

http://arxiv.org/abs/1607.00730

## Goodfellow et al.: Generative Adversarial Nets

https://papers.nips.cc/paper/5423-generative-adversarial-nets

Another important recent development that we may make use of.

## Dosovitskiy, Springenberg, Tatarchenko & Brox: Generating images

https://arxiv.org/abs/1411.5928

A non-adversarial approach to the same problem.

## Oord et al.: Pixel-RNN & Pixel-CNN

https://arxiv.org/abs/1601.06759

Producing distributions over images. We have always intended to do something like this for depth images.

http://arxiv.org/abs/1606.05328

# List of interested people

(who I will contact with information about the schedule etc.)

- Abdulmajeed K.
- Alican M.
- Anas M.
- K. Bulut Ö.
- Imaduddin A. M.
- Tolga C.