AI_ML_DL’s diary


Chapter 17 Representation Learning and Generative Learning Using Autoencoders and GANs


Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, 2nd Edition, by A. Géron


Autoencoders are artificial neural networks capable of learning dense representations of the input data, called latent representations or codings, without any supervision (i.e., the training set is unlabeled).

These codings typically have a much lower dimensionality than the input data, making autoencoders useful for dimensionality reduction (see Chapter 8), especially for visualization purposes.
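As a concrete illustration of codings with lower dimensionality than the inputs, here is a minimal sketch of an undercomplete linear autoencoder that compresses 3D points to 2D codings, much like PCA. The toy dataset and the learning rate are illustrative assumptions, not taken from the book's text.

```python
import numpy as np
from tensorflow import keras

# Toy 3D dataset lying close to a 2D surface (hypothetical data for illustration)
np.random.seed(42)
angles = np.random.rand(100) * 3 * np.pi / 2 - 0.5
X = np.empty((100, 3))
X[:, 0] = np.cos(angles) + np.random.randn(100) * 0.1
X[:, 1] = np.sin(angles) * 0.7 + np.random.randn(100) * 0.1
X[:, 2] = X[:, 0] * 0.1 + X[:, 1] * 0.3 + np.random.randn(100) * 0.1

# Undercomplete linear autoencoder: 3D inputs -> 2D codings -> 3D reconstructions
encoder = keras.models.Sequential([keras.layers.Dense(2, input_shape=[3])])
decoder = keras.models.Sequential([keras.layers.Dense(3, input_shape=[2])])
autoencoder = keras.models.Sequential([encoder, decoder])

# The inputs serve as their own targets: the model learns to reconstruct them
autoencoder.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1.5))
autoencoder.fit(X, X, epochs=20, verbose=0)

codings = encoder.predict(X, verbose=0)  # the 2D latent representation
```

With no activation functions and an MSE loss, the network can only learn a linear projection, so the codings it finds span roughly the same subspace PCA would.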

Autoencoders also act as feature detectors, and they can be used for unsupervised pretraining of deep neural networks (as we discussed in Chapter 11).

Lastly, some autoencoders are generative models: they are capable of randomly generating new data that looks very similar to the training data.

For example, you could train an autoencoder on pictures of faces, and it would then be able to generate new faces.

However, the generated images are usually fuzzy and not entirely realistic.


In contrast, faces generated by generative adversarial networks (GANs) are now so convincing that it is hard to believe that the people they represent do not exist.

You can judge so for yourself by visiting a website that shows faces generated by a recent GAN architecture called StyleGAN (a similar site shows some generated Airbnb bedrooms).

GANs are now widely used for super resolution (increasing the resolution of an image), colorization (turning a black-and-white image into color), powerful image editing (e.g., replacing photo bombers with realistic background), turning a simple sketch into a photorealistic image, predicting the next frames in a video, augmenting a dataset (to train other models), generating other types of data (such as text, audio, and time series), identifying the weaknesses in other models and strengthening them, and more.


Autoencoders and GANs are both unsupervised: they both learn dense representations, they can both be used as generative models, and they have many similar applications.

However, they work very differently:


Efficient Data Representations


Performing PCA with an Undercomplete Linear Autoencoder


Stacked Autoencoders


Implementing a Stacked Autoencoder Using Keras
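A stacked (deep) autoencoder simply adds more hidden layers to the encoder and decoder, typically in a symmetrical arrangement around a small central codings layer. The sketch below follows that pattern in Keras; the random images standing in for Fashion MNIST, and the layer sizes, are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

# Stand-in for Fashion MNIST: random 28x28 grayscale images scaled to 0-1
# (real code would load keras.datasets.fashion_mnist instead)
X_train = np.random.rand(256, 28, 28).astype("float32")

# Symmetrical stacked autoencoder: two Dense layers down to a 30D coding, two back up
stacked_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.Dense(30, activation="selu"),   # the codings layer
])
stacked_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="selu", input_shape=[30]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28]),
])
stacked_ae = keras.models.Sequential([stacked_encoder, stacked_decoder])

# Reconstruction objective: the inputs are also the targets
stacked_ae.compile(loss="binary_crossentropy", optimizer="adam")
stacked_ae.fit(X_train, X_train, epochs=1, verbose=0)

reconstructions = stacked_ae.predict(X_train[:5], verbose=0)
```

Splitting the model into separate encoder and decoder submodels makes it easy to reuse the encoder on its own, for example to compute codings for visualization.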


Visualizing the Reconstructions


Visualizing the Fashion MNIST Dataset


Unsupervised Pretraining Using Stacked Autoencoders
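The idea behind unsupervised pretraining is to train an autoencoder on plentiful unlabeled data, then reuse its encoder layers as the lower layers of a classifier trained on the (scarcer) labeled data. Here is a hedged sketch of the reuse step; the random labeled set, layer sizes, and the choice to freeze the reused layers are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

# Hypothetical small labeled set (in practice: few labeled, many unlabeled examples)
X_labeled = np.random.rand(64, 28, 28).astype("float32")
y_labeled = np.random.randint(0, 10, size=64)

# Encoder assumed to have been pretrained as part of an autoencoder on unlabeled data
pretrained_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.Dense(30, activation="selu"),
])

# Reuse the encoder as the lower layers of a classifier; add a softmax output on top
classifier = keras.models.Sequential([
    pretrained_encoder,
    keras.layers.Dense(10, activation="softmax"),
])
pretrained_encoder.trainable = False  # optionally freeze the reused layers at first

classifier.compile(loss="sparse_categorical_crossentropy", optimizer="adam",
                   metrics=["accuracy"])
classifier.fit(X_labeled, y_labeled, epochs=1, verbose=0)
```

Freezing the reused layers for the first few epochs, then unfreezing them with a low learning rate, is a common way to avoid wrecking the pretrained weights early in training.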






[Generated images: style=138 at iterations 1, 20, and 500]