Chapter 12 Custom Models and Training with TensorFlow
Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow, 2nd Edition, by A. Géron
Up until now, we've used only TensorFlow's high-level API, tf.keras, but it already got us pretty far: we built various neural network architectures, including regression and classification nets, Wide & Deep nets, and self-normalizing nets, using all sorts of techniques, such as Batch Normalization, dropout, and learning rate schedules.
In fact, 95% of the use cases you will encounter will not require anything other than tf.keras (and tf.data; see Chapter 13). But it's time to dive deeper into TensorFlow and take a look at its lower-level Python API. This will be useful when you need extra control to write custom loss functions, custom metrics, layers, models, initializers, regularizers, weight constraints, and more. You may even need to fully control the training loop itself, for example to apply special transformations or constraints to the gradients (beyond just clipping them) or to use multiple optimizers for different parts of the network.
We will cover all these cases in this chapter, and we will also look at how you can boost your custom models and training algorithms using TensorFlow's automatic graph generation feature.
But first, let's take a quick tour of TensorFlow.
A Quick Tour of TensorFlow
・Its core is very similar to NumPy, but with GPU support.
・It supports distributed computing (across multiple devices and servers).
・It includes a kind of just-in-time (JIT) compiler that allows it to optimize computations for speed and memory usage. It works by extracting the computation graph from a Python function, then optimizing it (e.g., by pruning unused nodes), and finally running it efficiently (e.g., by automatically running independent operations in parallel).
・Computation graphs can be exported to a portable format, so you can train a TensorFlow model in one environment (e.g., using Python on Linux) and run it in another (e.g., using Java on an Android device).
・It implements autodiff (see Chapter 10 and Appendix D) and provides some excellent optimizers, such as RMSProp and Nadam (see Chapter 11), so you can easily minimize all sorts of loss functions.
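To make the autodiff point concrete, here is a minimal sketch (covered in more detail later in this chapter, in "Computing Gradients Using Autodiff") showing how TensorFlow can compute the gradients of a simple function automatically using tf.GradientTape:

```python
import tensorflow as tf

# Compute the gradients of f(w1, w2) = 3*w1**2 + 2*w1*w2 at (w1, w2) = (5, 3)
w1 = tf.Variable(5.0)
w2 = tf.Variable(3.0)
with tf.GradientTape() as tape:
    z = 3 * w1 ** 2 + 2 * w1 * w2  # the tape records these operations
grads = tape.gradient(z, [w1, w2])
# Analytically: dz/dw1 = 6*w1 + 2*w2 = 36.0, dz/dw2 = 2*w1 = 10.0
```

An optimizer such as keras.optimizers.Nadam can then apply such gradients to the variables to minimize the loss.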
Using TensorFlow like NumPy
Tensors and Operations
You can create a tensor with tf.constant().
For example, here is a tensor representing a matrix with two rows and three columns of floats:
>>> tf.constant([[1., 2., 3.], [4., 5., 6.]])  # matrix
<tf.Tensor: id=0, shape=(2, 3), dtype=float32, numpy=
array([[1., 2., 3.],
       [4., 5., 6.]], dtype=float32)>
>>> tf.constant(42)  # scalar
<tf.Tensor: id=1, shape=(), dtype=int32, numpy=42>
Just like an ndarray, a tf.Tensor has a shape and a data type (dtype):
>>> t = tf.constant([[1., 2., 3.], [4., 5., 6.]])
>>> t.shape
TensorShape([2, 3])
>>> t.dtype
tf.float32
Indexing works much like in NumPy:
>>> t[:, 1:]
<tf.Tensor: id=5, shape=(2, 2), dtype=float32, numpy=
array([[2., 3.],
       [5., 6.]], dtype=float32)>
>>> t[..., 1, tf.newaxis]
<tf.Tensor: id=15, shape=(2, 1), dtype=float32, numpy=
array([[2.],
       [5.]], dtype=float32)>
>>> t + 10
<tf.Tensor: id=18, shape=(2, 3), dtype=float32, numpy=
array([[11., 12., 13.],
       [14., 15., 16.]], dtype=float32)>
>>> tf.square(t)
<tf.Tensor: id=20, shape=(2, 3), dtype=float32, numpy=
array([[ 1.,  4.,  9.],
       [16., 25., 36.]], dtype=float32)>
>>> t @ tf.transpose(t)
<tf.Tensor: id=24, shape=(2, 2), dtype=float32, numpy=
array([[14., 32.],
       [32., 77.]], dtype=float32)>
Note that writing t + 10 is equivalent to calling tf.add(t, 10) (indeed, Python calls the magic method t.__add__(10), which just calls tf.add(t, 10)). Other operators like - and * are also supported. The @ operator was added in Python 3.5, for matrix multiplication: it is equivalent to calling the tf.matmul() function.
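The operator equivalences described above can be checked directly; this short sketch verifies that the operators and the corresponding TensorFlow functions produce identical results:

```python
import tensorflow as tf

t = tf.constant([[1., 2., 3.], [4., 5., 6.]])

# The + operator dispatches to tf.add() via t.__add__()
assert bool(tf.reduce_all((t + 10) == tf.add(t, 10)))

# The @ operator (Python 3.5+) dispatches to tf.matmul()
assert bool(tf.reduce_all((t @ tf.transpose(t)) == tf.matmul(t, tf.transpose(t))))
```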
Keras' Low-Level API
The Keras API has its own low-level API, located in keras.backend. It includes functions like square(), exp(), and sqrt(). In tf.keras, these functions generally just call the corresponding TensorFlow operations. If you want to write code that will be portable to other Keras implementations, you should use these Keras functions. However, they only cover a subset of all functions available in TensorFlow, so in this book we will use the TensorFlow operations directly. Here is a simple example using keras.backend, which is commonly named K for short:
>>> from tensorflow import keras
>>> K = keras.backend
>>> K.square(K.transpose(t)) + 10
<tf.Tensor: id=39, shape=(3, 2), dtype=float32, numpy=
array([[11., 26.],
       [14., 35.],
       [19., 46.]], dtype=float32)>
Tensors and NumPy
Notice that NumPy uses 64-bit precision by default, while TensorFlow uses 32-bit.
This is because 32-bit precision is generally more than enough for neural networks, and it runs faster and uses less RAM.
So when you create a tensor from a NumPy array, make sure to set dtype=tf.float32.
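A minimal sketch of this interoperability: converting between NumPy arrays and tensors is straightforward, but the dtype is inherited unless you override it, so an explicit dtype=tf.float32 avoids accidentally mixing 64-bit and 32-bit tensors:

```python
import numpy as np
import tensorflow as tf

a = np.array([2., 4., 5.])               # NumPy defaults to float64
t64 = tf.constant(a)                     # dtype is inherited: tf.float64
t32 = tf.constant(a, dtype=tf.float32)   # explicitly request 32-bit floats

# Converting back to a NumPy array
arr = t32.numpy()                        # ndarray with dtype float32
```

Mixing a float64 tensor with a float32 tensor in one operation raises an error, which is another reason to convert explicitly up front.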
Other Data Structures
Customizing Models and Training Algorithms
Custom Loss Functions
Saving and Loading Models That Contain Custom Components
Custom Activation Functions, Initializers, Regularizers, and Constraints
Losses and Metrics Based on Model Internals
Computing Gradients Using Autodiff
Custom Training Loops
TensorFlow Functions and Graphs
AutoGraph and Tracing
TF Function Rules