Satellite imagery generation with Generative Adversarial Networks (GANs)


<h2><b>What are GANs?</b></h2>

Some time ago, I showed you how to create a simple <b>Convolutional Neural Network (ConvNet)</b> for satellite imagery classification using <b>Keras</b>. ConvNets are not the only cool thing you can do in Keras; they are just the tip of the iceberg. Now, I think it’s about time to show you something more! Before we start, I recommend reviewing my two previous posts (<a href="https://appsilon.com/ship-recognition-in-satellite-imagery-part-i/">Ship recognition in satellite imagery part I</a> and <a href="https://appsilon.com/ship-recognition-in-satellite-imagery-part-ii/">part II</a>) if you haven’t already.

Okay, so what are GANs? <b>Generative adversarial networks</b>, or GANs, were introduced in 2014 by Ian Goodfellow. They are <b>generative algorithms</b> composed of two deep neural networks “playing” against each other. To fully understand GANs, we first have to understand how generative methods work. Let’s go back to our ConvNet for satellite imagery classification. As you remember, our task was to predict the class of an image (ship or non-ship). To be more specific, we wanted to find the probability that an image belongs to a specific class, given the image. Each image was composed of a set of pixels that we used as features/inputs. Mathematically, we were using a set of features, X (pixels), to get the conditional probability of Y (class) given X (pixels):

<p style="text-align: center;"><em>p(y|x)</em></p>

This is an example of a <b>discriminative</b> algorithm. Generative algorithms do the complete opposite. Using our example: assuming that the class of an image is “ship,” what should the image look like? More precisely, what value should each pixel have? This time, we’re modeling the distribution of X (pixels) given Y (class):

<p style="text-align: center;"><em>p(x|y)</em></p>

Now that we know how generative algorithms work, we can dive deeper into GANs. As I said previously, GANs are composed of two deep neural networks. The first network is called the <b>generator</b>, and it is responsible for creating new instances of data from random noise. The second network is called the <b>discriminator</b>, and it “judges” whether the data created by the generator is real or fake by comparing it to real data.

<img class="aligncenter size-full wp-image-1623" src="https://wordpress.appsilon.com/assets/uploads/2019/01/pasted-image-0.png" alt="Deep Convolutional Generative Adversarial Networks (DCGAN)" width="1122" height="260" />

<i>Note that I’m not saying that these are ConvNets or Recurrent Neural Networks. There are many variations of GANs, and depending on the task, we will use different networks to build our GAN. For example, later on, we will use <b>Deep Convolutional Generative Adversarial Networks (DCGAN)</b> to generate new satellite imagery.</i>

<iframe style="border: 1px solid #CCC; border-width: 1px; margin-bottom: 5px; max-width: 100%;" src="//www.slideshare.net/slideshow/embed_code/key/JwKhsPbqfJ1oKn" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" allowfullscreen="allowfullscreen"> </iframe>

<h2><b>DCGAN in R</b></h2>

To build a GAN in R, we first have to build a generator and a discriminator. Then, we will join them together.
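Everything below assumes the Keras R package is loaded and that the ship images from the previous two posts are available as a 4D array. Here is a minimal setup sketch; the object name <code>ships</code> and the loading step are placeholders for however you store the data:

<figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r">library(keras)

# Hypothetical setup: `ships` should be an array of real 80x80 RGB images
# with shape (n, 80, 80, 3), e.g. the ship images from the previous posts.
# ships &lt;- ...load your images here...

# The generator we are about to build ends with a tanh activation, so the
# real images should be rescaled from [0, 255] to the matching [-1, 1] range:
# ships &lt;- ships / 255 * 2 - 1
</code></pre> </figure>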
We want to create a DCGAN for satellite imagery in which the generator network takes random noise as input and returns a new image as output.

<figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r">image_height &lt;- 80   # Image height in pixels
image_width &lt;- 80    # Image width in pixels
image_channels &lt;- 3  # Number of color channels - here Red, Green and Blue
noise_dim &lt;- 80      # Length of the Gaussian noise vector for the generator input

# Setting the generator input as a Gaussian noise vector
generator_input &lt;- layer_input(shape = c(noise_dim))

# Setting the generator output - the 1d vector is reshaped into an image array,
# then upsampled twice (20x20 -&gt; 40x40 -&gt; 80x80) by the transposed convolutions
generator_output &lt;- generator_input %&gt;%
  layer_dense(units = 64 * (image_height / 4) * (image_width / 4)) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_reshape(target_shape = c(image_height / 4, image_width / 4, 64)) %&gt;%
  layer_conv_2d(filters = 128, kernel_size = 5, padding = "same") %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d_transpose(filters = 128, kernel_size = 4, strides = 2, padding = "same") %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d_transpose(filters = 256, kernel_size = 4, strides = 2, padding = "same") %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 256, kernel_size = 5, padding = "same") %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 256, kernel_size = 5, padding = "same") %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = image_channels, kernel_size = 7, activation = "tanh", padding = "same")

# Setting up the model
generator &lt;- keras_model(generator_input, generator_output)
</code></pre> </figure>

The discriminator takes a real or generated image as input and returns the probability that the image is real.

<figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r"># Setting the discriminator input as an image array
discriminator_input &lt;- layer_input(shape = c(image_height, image_width, image_channels))

# Setting the discriminator output - the probability that the image is real
discriminator_output &lt;- discriminator_input %&gt;%
  layer_conv_2d(filters = 256, kernel_size = 4) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 256, kernel_size = 2, strides = 2) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 128, kernel_size = 2, strides = 2) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 128, kernel_size = 2, strides = 2) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 128, kernel_size = 2, strides = 2) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 128, kernel_size = 2, strides = 2) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_conv_2d(filters = 128, kernel_size = 2, strides = 2) %&gt;%
  layer_activation_leaky_relu() %&gt;%
  layer_flatten() %&gt;%
  layer_dropout(rate = 0.3) %&gt;%
  layer_dense(units = 1, activation = "sigmoid")

# Setting up the model
discriminator &lt;- keras_model(discriminator_input, discriminator_output)
</code></pre> </figure>

As previously stated, both networks are “playing” against each other. The discriminator’s task is to distinguish real images from fake ones, and the generator’s task is to create new images that are indistinguishable from real data.
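Before wiring the two networks together, it’s worth a quick sanity check that their shapes line up. A minimal sketch, assuming the <code>generator</code> and <code>discriminator</code> defined above:

<figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r"># Generate a single image from random noise...
noise &lt;- matrix(rnorm(noise_dim), nrow = 1)
fake_image &lt;- predict(generator, noise)
dim(fake_image)  # 1 80 80 3 - one 80x80 RGB image

# ...and pass it through the discriminator to get a single probability
predict(discriminator, fake_image)
</code></pre> </figure>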
Because the discriminator returns a single probability, we can use binary cross-entropy as the loss function.

<figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r">discriminator %&gt;% compile(
  optimizer = optimizer_rmsprop(
    lr = 0.0006,
    clipvalue = 1.0,
    decay = 1e-7
  ),
  loss = "binary_crossentropy"
)
</code></pre> </figure>

Before we merge the two networks into a GAN, we freeze the discriminator weights so that they won’t be updated while the GAN is trained. Otherwise, training the GAN would push the discriminator toward labeling every image it sees as real. Instead, we will train the two networks separately.

<figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r">freeze_weights(discriminator)

gan_input &lt;- layer_input(shape = c(noise_dim))
gan_output &lt;- discriminator(generator(gan_input))
gan &lt;- keras_model(gan_input, gan_output)

gan %&gt;% compile(
  optimizer = optimizer_rmsprop(
    lr = 0.0003,
    clipvalue = 1.0,
    decay = 1e-7
  ),
  loss = "binary_crossentropy"
)

# Training a GAN is not as straightforward as a plain ConvNet: we train
# both networks separately inside a loop. A sketch, assuming real images
# sit in the hypothetical `ships` array from the setup above:
batch_size &lt;- 20
for (i in 1:1000) {
  # TRAIN THE DISCRIMINATOR on a labeled mix of generated and real images
  noise &lt;- matrix(rnorm(batch_size * noise_dim), ncol = noise_dim)
  fake_images &lt;- predict(generator, noise)
  real_images &lt;- ships[sample(dim(ships)[1], batch_size), , , , drop = FALSE]
  images &lt;- array(0, dim = c(2 * batch_size, image_height, image_width, image_channels))
  images[1:batch_size, , , ] &lt;- fake_images
  images[(batch_size + 1):(2 * batch_size), , , ] &lt;- real_images
  labels &lt;- c(rep(0, batch_size), rep(1, batch_size))  # 0 = fake, 1 = real
  train_on_batch(discriminator, images, labels)

  # TRAIN THE GAN - teach the generator to make the frozen discriminator say "real"
  noise &lt;- matrix(rnorm(batch_size * noise_dim), ncol = noise_dim)
  train_on_batch(gan, noise, rep(1, batch_size))

  # You can find the full code of the training process for a similar
  # example in https://www.manning.com/books/deep-learning-with-r
}
</code></pre> </figure>

If you want to learn more about GANs and Keras, I encourage you to<a href="https://www.manning.com/books/deep-learning-with-r"> read Deep Learning with R</a>. It’s a great place to start your adventure with Keras and deep learning.

<h2><b>Results</b></h2>

I’ve tried a few architectures for my GAN, and below you will find some of the results. We can see that the generator is learning how to create some simple “ship-like” shapes; all of them share similar features such as ship orientation, water hue, and so on. We can also see what happens when a GAN is overtrained: we get some really abstract pictures. The results are limited for two reasons. First, we worked with a really small sample. Second, many more network architectures should be tried. In this example, I was working on my local machine, but using a cluster of machines over a longer period of time would likely give much better results.

<img class="aligncenter wp-image-1626 size-full" src="https://wordpress.appsilon.com/assets/uploads/2019/01/Zrzut-ekranu-z-2018-10-22-09-56-00.png" alt="Satellite imagery generation with r" width="1054" height="748" />
<img class="aligncenter wp-image-1627 size-full" src="https://wordpress.appsilon.com/assets/uploads/2019/01/Zrzut-ekranu-z-2018-10-22-09-56-55.png" alt="Satellite imagery generation with r" width="1053" height="599" />
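To inspect results like these yourself, you can render the generator’s output as a raster image. A minimal sketch, assuming the trained <code>generator</code> from above (rescaling from the tanh range back to [0, 1] is my assumption for plotting):

<figure class="highlight"> <pre class="language-r"><code class="language-r" data-lang="r"># Generate one image and drop the batch dimension
noise &lt;- matrix(rnorm(noise_dim), nrow = 1)
generated &lt;- predict(generator, noise)[1, , , ]

# Rescale from [-1, 1] to [0, 1] and plot as an RGB raster
plot(as.raster((generated + 1) / 2))
</code></pre> </figure>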
