AI_ML_DL’s diary


Chapter 17 Representation Learning and Generative Learning Using Autoencoders and GANs


Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, 2nd Edition, by A. Géron


Autoencoders are artificial neural networks capable of learning dense representations of the input data, called latent representations or codings, without any supervision (i.e., the training set is unlabeled).

These codings typically have a much lower dimensionality than the input data, making autoencoders useful for dimensionality reduction (see Chapter 8), especially for visualization purposes.
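As a concrete illustration of codings with lower dimensionality than the inputs, here is a minimal sketch of an undercomplete linear autoencoder that compresses 3D points to 2D codings, much like PCA. The toy dataset and the learning rate are illustrative assumptions, not taken from the book's text.

```python
import numpy as np
from tensorflow import keras

# Toy 3D dataset lying close to a 2D surface (hypothetical data for illustration)
np.random.seed(42)
angles = np.random.rand(100) * 3 * np.pi / 2 - 0.5
X = np.empty((100, 3))
X[:, 0] = np.cos(angles) + np.random.randn(100) * 0.1
X[:, 1] = np.sin(angles) * 0.7 + np.random.randn(100) * 0.1
X[:, 2] = X[:, 0] * 0.1 + X[:, 1] * 0.3 + np.random.randn(100) * 0.1

# Undercomplete linear autoencoder: 3D inputs -> 2D codings -> 3D reconstructions
encoder = keras.models.Sequential([keras.layers.Dense(2, input_shape=[3])])
decoder = keras.models.Sequential([keras.layers.Dense(3, input_shape=[2])])
autoencoder = keras.models.Sequential([encoder, decoder])

# The inputs serve as their own targets: the model learns to reconstruct them
autoencoder.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1.5))
autoencoder.fit(X, X, epochs=20, verbose=0)

codings = encoder.predict(X, verbose=0)  # the 2D latent representation
```

With no activation functions and an MSE loss, the network can only learn a linear projection, so the codings it finds span roughly the same subspace PCA would.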

Autoencoders also act as feature detectors, and they can be used for unsupervised pretraining of deep neural networks (as we discussed in Chapter 11).

Lastly, some autoencoders are generative models: they are capable of randomly generating new data that looks very similar to the training data.

For example, you could train an autoencoder on pictures of faces, and it would then be able to generate new faces.

However, the generated images are usually fuzzy and not entirely realistic.


In contrast, faces generated by generative adversarial networks (GANs) are now so convincing that it is hard to believe that the people they represent do not exist.

You can judge so for yourself by visiting a website that shows faces generated by a recent GAN architecture called StyleGAN (a similar site shows some generated Airbnb bedrooms).

GANs are now widely used for super resolution (increasing the resolution of an image), colorization (turning a black-and-white image into color), powerful image editing (e.g., replacing photo bombers with realistic background), turning a simple sketch into a photorealistic image, predicting the next frames in a video, augmenting a dataset (to train other models), generating other types of data (such as text, audio, and time series), identifying the weaknesses in other models and strengthening them, and more.


Autoencoders and GANs are both unsupervised: they both learn dense representations, they can both be used as generative models, and they have many similar applications.

However, they work very differently:


Efficient Data Representations


Performing PCA with an Undercomplete Linear Autoencoder


Stacked Autoencoders


Implementing a Stacked Autoencoder Using Keras
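A stacked (deep) autoencoder simply adds more hidden layers to the encoder and decoder, typically in a symmetrical arrangement around a small central codings layer. The sketch below follows that pattern in Keras; the random images standing in for Fashion MNIST, and the layer sizes, are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

# Stand-in for Fashion MNIST: random 28x28 grayscale images scaled to 0-1
# (real code would load keras.datasets.fashion_mnist instead)
X_train = np.random.rand(256, 28, 28).astype("float32")

# Symmetrical stacked autoencoder: two Dense layers down to a 30D coding, two back up
stacked_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.Dense(30, activation="selu"),   # the codings layer
])
stacked_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="selu", input_shape=[30]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28]),
])
stacked_ae = keras.models.Sequential([stacked_encoder, stacked_decoder])

# Reconstruction objective: the inputs are also the targets
stacked_ae.compile(loss="binary_crossentropy", optimizer="adam")
stacked_ae.fit(X_train, X_train, epochs=1, verbose=0)

reconstructions = stacked_ae.predict(X_train[:5], verbose=0)
```

Splitting the model into separate encoder and decoder submodels makes it easy to reuse the encoder on its own, for example to compute codings for visualization.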


Visualizing the Reconstructions


Visualizing the Fashion MNIST Dataset


Unsupervised Pretraining Using Stacked Autoencoders
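The idea behind unsupervised pretraining is to train an autoencoder on plentiful unlabeled data, then reuse its encoder layers as the lower layers of a classifier trained on the (scarcer) labeled data. Here is a hedged sketch of the reuse step; the random labeled set, layer sizes, and the choice to freeze the reused layers are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

# Hypothetical small labeled set (in practice: few labeled, many unlabeled examples)
X_labeled = np.random.rand(64, 28, 28).astype("float32")
y_labeled = np.random.randint(0, 10, size=64)

# Encoder assumed to have been pretrained as part of an autoencoder on unlabeled data
pretrained_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.Dense(30, activation="selu"),
])

# Reuse the encoder as the lower layers of a classifier; add a softmax output on top
classifier = keras.models.Sequential([
    pretrained_encoder,
    keras.layers.Dense(10, activation="softmax"),
])
pretrained_encoder.trainable = False  # optionally freeze the reused layers at first

classifier.compile(loss="sparse_categorical_crossentropy", optimizer="adam",
                   metrics=["accuracy"])
classifier.fit(X_labeled, y_labeled, epochs=1, verbose=0)
```

Freezing the reused layers for the first few epochs, then unfreezing them with a low learning rate, is a common way to avoid wrecking the pretrained weights early in training.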






[Generated images: style=138 at iterations 1, 20, and 500]