%matplotlib inline
import importlib
import utils2; importlib.reload(utils2)
from utils2 import *
Using TensorFlow backend.
limit_mem()
from keras.datasets.cifar10 import load_batch
This notebook contains a Keras implementation of Huang et al.'s DenseNet.
Our motivation for studying DenseNet is how well it works with limited data.
DenseNet beats state-of-the-art results on CIFAR-10/CIFAR-100 w/ and w/o data augmentation, but the performance increase is most pronounced w/o data augmentation.
Compare this to FractalNet, the previous state of the art on both datasets.
That increase is motivation enough.
So what is a DenseNet?
Put simply, DenseNet is a Resnet where we replace addition with concatenation.
Recall that in broad terms, a Resnet is a Convnet that uses residual block structures.
These "blocks" work as follows:
As mentioned, the difference w/ DenseNet is instead of adding Lt to Lt+1, it is being concatenated.
As with Resnet, DenseNet consists of multiple blocks. Therefore, there is a recursive relationship across blocks:
Because the input to each layer keeps growing as feature maps are concatenated, the number of filters added at each layer needs to be kept in check.
Huang et al. call the number of filters added at each layer the growth rate, and denote it k.
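To make the add-versus-concatenate distinction concrete, here is a minimal sketch using the Keras 1.x functional API (the shapes and filter counts are illustrative, not taken from the paper):

from keras.layers import Input, Convolution2D, merge

inp = Input(shape=(32, 32, 16))                         # a feature map with 16 channels
f = Convolution2D(16, 3, 3, border_mode='same')(inp)    # f(x): some convolutional transformation

# Resnet-style block: element-wise addition, channel count stays at 16
res_out = merge([inp, f], mode='sum')

# DenseNet-style block: concatenation along the channel axis,
# channel count grows from 16 to 16 + k (here the growth rate k is 16)
dense_out = merge([inp, f], mode='concat', concat_axis=-1)

The same merge(..., mode='concat') call is what we use inside the dense block further down.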
Let's load data.
def load_data():
    path = 'data/cifar-10-batches-py'
    num_train_samples = 50000

    x_train = np.zeros((num_train_samples, 3, 32, 32), dtype='uint8')
    y_train = np.zeros((num_train_samples,), dtype='uint8')

    # the training set is stored in 5 batches of 10,000 images each
    for i in range(1, 6):
        data, labels = load_batch(os.path.join(path, 'data_batch_' + str(i)))
        x_train[(i - 1) * 10000: i * 10000, :, :, :] = data
        y_train[(i - 1) * 10000: i * 10000] = labels

    x_test, y_test = load_batch(os.path.join(path, 'test_batch'))

    y_train = np.reshape(y_train, (len(y_train), 1))
    y_test = np.reshape(y_test, (len(y_test), 1))

    # convert from channels-first (N, C, H, W) to channels-last (N, H, W, C)
    x_train = x_train.transpose(0, 2, 3, 1)
    x_test = x_test.transpose(0, 2, 3, 1)
    return (x_train, y_train), (x_test, y_test)
(x_train, y_train), (x_test, y_test) = load_data()
Here's an example image from CIFAR-10:
plt.imshow(x_train[1])
<matplotlib.image.AxesImage at 0x7f137c53d470>
We want to normalize pixel values (0-255) to the unit interval.
x_train = x_train/255.
x_test = x_test/255.
Let's make some helper functions for piecing together our network using Keras' Functional API.
These components should all be familiar to you:
def relu(x): return Activation('relu')(x)
def dropout(x, p): return Dropout(p)(x) if p else x
def bn(x): return BatchNormalization(mode=0, axis=-1)(x)
def relu_bn(x): return relu(bn(x))
Convolutional layer:
def conv(x, nf, sz, wd, p):
    x = Convolution2D(nf, sz, sz, init='he_uniform', border_mode='same',
                      W_regularizer=l2(wd))(x)
    return dropout(x, p)
Now we define the conv block as a sequence: BN -> ReLU -> Conv (with optional dropout).
The authors also use something called a bottleneck layer to reduce dimensionality of inputs.
Recall that the filter space dimensionality grows at each block. The input dimensionality will determine the dimensionality of your convolution weight matrices, i.e. # of parameters.
At size 3x3 or larger, convolutions can become extremely costly and # of parameters can increase quickly as a function of the input feature (filter) space. Therefore, a smart approach is to reduce dimensionality of filters by using a 1x1 convolution w/ smaller # of filters before the larger convolution.
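As a rough back-of-the-envelope illustration (the channel counts here are made up, not taken from the model below), compare the parameter count of a direct 3x3 convolution against a 1x1 bottleneck followed by the 3x3:

c_in, k = 256, 12                  # illustrative: 256 incoming channels, growth rate k = 12

# direct 3x3 conv producing k output filters (ignoring biases)
direct = 3 * 3 * c_in * k                                    # 27,648 parameters

# 1x1 bottleneck down to 4*k channels, then 3x3 conv producing k filters
bottlenecked = 1 * 1 * c_in * (4 * k) + 3 * 3 * (4 * k) * k  # 12,288 + 5,184 = 17,472

print(direct, bottlenecked)

The savings grow as the concatenated input gets wider, which is exactly what happens deep inside a dense block.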
The bottleneck consists of a 1x1 convolution with nf * 4 filters (preceded by BN -> ReLU) applied before the usual 3x3 convolution:

def conv_block(x, nf, bottleneck=False, p=None, wd=0):
    x = relu_bn(x)
    if bottleneck: x = relu_bn(conv(x, nf * 4, 1, wd, p))
    return conv(x, nf, 3, wd, p)
Now we can define the dense block:
xbx and conv block output bx for next blockdef dense_block(x, nb_layers, growth_rate, bottleneck=False, p=None, wd=0):
if bottleneck: nb_layers //= 2
for i in range(nb_layers):
b = conv_block(x, growth_rate, bottleneck=bottleneck, p=p, wd=wd)
x = merge([x,b], mode='concat', concat_axis=-1)
return x
As is typical for CV architectures, we'll do some pooling after each stage of computation.
We'll define this unit as the transition block, and we'll put one between each dense block.
Aside from BN -> ReLU and Average Pooling, there is also an option for filter compression in this block. This is simply feature reduction via 1x1 conv as discussed before, where the new # of filters is a percentage of the incoming # of filters.
Together with bottleneck layers, compression has been shown to improve both the performance and the computational efficiency of DenseNet architectures (the authors call this configuration DenseNet-BC).
def transition_block(x, compression=1.0, p=None, wd=0):
    nf = int(x.get_shape().as_list()[-1] * compression)
    x = relu_bn(x)
    x = conv(x, nf, 1, wd, p)
    return AveragePooling2D((2, 2), strides=(2, 2))(x)
We've now defined all the building blocks (literally) to put together a DenseNet.
From start to finish, create_dense_net generates:
- an initial 3x3 convolution with nb_filter filters,
- nb_block dense blocks of (depth - 4) / nb_block conv blocks each, with a transition block after every dense block except the last,
- BN -> ReLU, global average pooling, and a final dense classification layer.

It returns the Keras tensor for the network output.

def create_dense_net(nb_classes, img_input, depth=40, nb_block=3,
        growth_rate=12, nb_filter=16, bottleneck=False, compression=1.0,
        p=None, wd=0, activation='softmax'):
    assert activation == 'softmax' or activation == 'sigmoid'
    assert (depth - 4) % nb_block == 0
    nb_layers_per_block = int((depth - 4) / nb_block)
    nb_layers = [nb_layers_per_block] * nb_block

    x = conv(img_input, nb_filter, 3, wd, 0)
    for i, block in enumerate(nb_layers):
        x = dense_block(x, block, growth_rate, bottleneck=bottleneck, p=p, wd=wd)
        if i != len(nb_layers) - 1:
            x = transition_block(x, compression=compression, p=p, wd=wd)

    x = relu_bn(x)
    x = GlobalAveragePooling2D()(x)
    return Dense(nb_classes, activation=activation, W_regularizer=l2(wd))(x)
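To see how the growth rate and compression interact, here is a bit of channel bookkeeping: plain arithmetic that mirrors dense_block and transition_block for the depth-100, k=12, bottleneck, compression-0.5 configuration we'll build next.

ch, k = 16, 12                              # initial conv filters, growth rate
layers_per_block = (100 - 4) // 3 // 2      # // 2 because bottleneck halves nb_layers
for block in range(3):
    ch += layers_per_block * k              # each conv block concatenates k new channels
    print('after dense block %d: %d channels' % (block, ch))
    if block != 2:
        ch = int(ch * 0.5)                  # transition block compresses the channel count
        print('after transition %d: %d channels' % (block, ch))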
Now we can test it out on CIFAR-10.
input_shape = (32,32,3)
img_input = Input(shape=input_shape)
x = create_dense_net(10, img_input, depth=100, nb_filter=16, compression=0.5,
                     bottleneck=True, p=0.2, wd=1e-4)
model = Model(img_input, x)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=keras.optimizers.SGD(0.1, 0.9, nesterov=True), metrics=["accuracy"])
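As a quick sanity check before committing to a long training run, we can count parameters; the paper reports roughly 0.8M parameters for DenseNet-BC with depth 100 and k=12, so we should see a number in that ballpark.

model.count_params()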
parms = {'verbose': 2, 'callbacks': [TQDMNotebookCallback()]}
K.set_value(model.optimizer.lr, 0.1)
This will likely need to run overnight, with some learning rate annealing along the way...
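Below we anneal the learning rate by hand with K.set_value between fit calls. An equivalent approach (sketched here as an alternative, not what we actually ran) is to encode the schedule in a Keras LearningRateScheduler callback; the paper itself drops the learning rate by a factor of 10 at 50% and 75% of the way through training.

from keras.callbacks import LearningRateScheduler

# illustrative schedule for a single 100-epoch run: 0.1, then 0.01, then 0.001
def lr_schedule(epoch):
    if epoch < 50: return 0.1
    if epoch < 75: return 0.01
    return 0.001

lr_cb = LearningRateScheduler(lr_schedule)
# model.fit(x_train, y_train, 64, 100, validation_data=(x_test, y_test),
#           verbose=2, callbacks=[lr_cb, TQDMNotebookCallback()])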
model.fit(x_train, y_train, 64, 20, validation_data=(x_test, y_test), **parms)
Train on 50000 samples, validate on 10000 samples Epoch 1/20 561s - loss: 1.9801 - acc: 0.4810 - val_loss: 2.0473 - val_acc: 0.5045 Epoch 2/20 556s - loss: 1.4368 - acc: 0.6571 - val_loss: 1.8446 - val_acc: 0.5864 Epoch 3/20 547s - loss: 1.2204 - acc: 0.7122 - val_loss: 1.3181 - val_acc: 0.6696 Epoch 4/20 556s - loss: 1.0634 - acc: 0.7547 - val_loss: 1.3620 - val_acc: 0.6658 Epoch 5/20 560s - loss: 0.9536 - acc: 0.7829 - val_loss: 2.6235 - val_acc: 0.4702 Epoch 6/20 557s - loss: 0.8835 - acc: 0.8025 - val_loss: 2.4969 - val_acc: 0.4981 Epoch 7/20 551s - loss: 0.8293 - acc: 0.8155 - val_loss: 1.1944 - val_acc: 0.7281 Epoch 8/20 551s - loss: 0.7949 - acc: 0.8244 - val_loss: 1.1396 - val_acc: 0.7366 Epoch 9/20 551s - loss: 0.7620 - acc: 0.8340 - val_loss: 1.9196 - val_acc: 0.5916 Epoch 10/20 551s - loss: 0.7472 - acc: 0.8389 - val_loss: 2.6207 - val_acc: 0.4900 Epoch 11/20 550s - loss: 0.7251 - acc: 0.8449 - val_loss: 1.4957 - val_acc: 0.6859 Epoch 12/20 551s - loss: 0.7117 - acc: 0.8503 - val_loss: 1.0381 - val_acc: 0.7751 Epoch 13/20 552s - loss: 0.7006 - acc: 0.8547 - val_loss: 1.6471 - val_acc: 0.6685 Epoch 14/20 556s - loss: 0.6945 - acc: 0.8555 - val_loss: 0.9267 - val_acc: 0.8087 Epoch 15/20 551s - loss: 0.6859 - acc: 0.8592 - val_loss: 1.0987 - val_acc: 0.7642 Epoch 16/20 550s - loss: 0.6756 - acc: 0.8645 - val_loss: 0.9704 - val_acc: 0.7940 Epoch 17/20 551s - loss: 0.6730 - acc: 0.8642 - val_loss: 0.9401 - val_acc: 0.7800 Epoch 18/20 551s - loss: 0.6666 - acc: 0.8700 - val_loss: 0.9759 - val_acc: 0.7830 Epoch 19/20 550s - loss: 0.6654 - acc: 0.8709 - val_loss: 0.8896 - val_acc: 0.8044 Epoch 20/20 551s - loss: 0.6617 - acc: 0.8712 - val_loss: 1.1052 - val_acc: 0.7570
<keras.callbacks.History at 0x7f04f8b132b0>
K.set_value(model.optimizer.lr, 0.01)
model.fit(x_train, y_train, 64, 4, validation_data=(x_test, y_test), **parms)
Train on 50000 samples, validate on 10000 samples Epoch 1/4 550s - loss: 0.5463 - acc: 0.9128 - val_loss: 0.5737 - val_acc: 0.9033 Epoch 2/4 551s - loss: 0.4833 - acc: 0.9311 - val_loss: 0.5695 - val_acc: 0.9033 Epoch 3/4 551s - loss: 0.4575 - acc: 0.9366 - val_loss: 0.5590 - val_acc: 0.9051 Epoch 4/4 550s - loss: 0.4361 - acc: 0.9429 - val_loss: 0.5656 - val_acc: 0.9048
<keras.callbacks.History at 0x7f05ec7caf28>
K.set_value(model.optimizer.lr, 0.1)
model.fit(x_train, y_train, 64, 20, validation_data=(x_test, y_test), **parms)
Train on 50000 samples, validate on 10000 samples Epoch 1/20 551s - loss: 0.6589 - acc: 0.8728 - val_loss: 1.3259 - val_acc: 0.6935 Epoch 2/20 551s - loss: 0.6510 - acc: 0.8766 - val_loss: 0.9672 - val_acc: 0.7880 Epoch 3/20 551s - loss: 0.6508 - acc: 0.8784 - val_loss: 1.1104 - val_acc: 0.7581 Epoch 4/20 551s - loss: 0.6462 - acc: 0.8793 - val_loss: 1.0601 - val_acc: 0.7877 Epoch 5/20 550s - loss: 0.6456 - acc: 0.8816 - val_loss: 0.9799 - val_acc: 0.7876 Epoch 6/20 551s - loss: 0.6427 - acc: 0.8830 - val_loss: 0.9377 - val_acc: 0.8028 Epoch 7/20 551s - loss: 0.6409 - acc: 0.8837 - val_loss: 1.8484 - val_acc: 0.5932 Epoch 8/20 551s - loss: 0.6378 - acc: 0.8831 - val_loss: 1.1806 - val_acc: 0.7420 Epoch 9/20 550s - loss: 0.6381 - acc: 0.8843 - val_loss: 1.0799 - val_acc: 0.7774 Epoch 10/20 551s - loss: 0.6344 - acc: 0.8870 - val_loss: 0.9114 - val_acc: 0.8163 Epoch 11/20 561s - loss: 0.6394 - acc: 0.8858 - val_loss: 0.9710 - val_acc: 0.7982 Epoch 12/20 560s - loss: 0.6367 - acc: 0.8863 - val_loss: 0.8751 - val_acc: 0.8249 Epoch 13/20 561s - loss: 0.6230 - acc: 0.8899 - val_loss: 1.2588 - val_acc: 0.7254 Epoch 14/20 561s - loss: 0.6298 - acc: 0.8895 - val_loss: 0.9942 - val_acc: 0.7801 Epoch 15/20 560s - loss: 0.6321 - acc: 0.8888 - val_loss: 0.8516 - val_acc: 0.8378 Epoch 16/20 559s - loss: 0.6268 - acc: 0.8893 - val_loss: 0.8288 - val_acc: 0.8301 Epoch 17/20 561s - loss: 0.6279 - acc: 0.8904 - val_loss: 1.2768 - val_acc: 0.7219 Epoch 18/20 561s - loss: 0.6248 - acc: 0.8920 - val_loss: 0.9362 - val_acc: 0.8015 Epoch 19/20 561s - loss: 0.6184 - acc: 0.8941 - val_loss: 0.9204 - val_acc: 0.8181 Epoch 20/20 561s - loss: 0.6254 - acc: 0.8915 - val_loss: 1.0211 - val_acc: 0.7706
<keras.callbacks.History at 0x7f04f55fcb00>
K.set_value(model.optimizer.lr, 0.01)
model.fit(x_train, y_train, 64, 40, validation_data=(x_test, y_test), **parms)
Train on 50000 samples, validate on 10000 samples Epoch 1/40 556s - loss: 0.5141 - acc: 0.9320 - val_loss: 0.5652 - val_acc: 0.9165 Epoch 2/40 560s - loss: 0.4530 - acc: 0.9477 - val_loss: 0.5451 - val_acc: 0.9199 Epoch 3/40 560s - loss: 0.4290 - acc: 0.9546 - val_loss: 0.5409 - val_acc: 0.9188 Epoch 4/40 559s - loss: 0.4101 - acc: 0.9584 - val_loss: 0.5259 - val_acc: 0.9224 Epoch 5/40 549s - loss: 0.3934 - acc: 0.9620 - val_loss: 0.5365 - val_acc: 0.9198 Epoch 6/40 551s - loss: 0.3813 - acc: 0.9631 - val_loss: 0.5150 - val_acc: 0.9209 Epoch 7/40 556s - loss: 0.3685 - acc: 0.9644 - val_loss: 0.5238 - val_acc: 0.9197 Epoch 8/40 556s - loss: 0.3565 - acc: 0.9668 - val_loss: 0.5188 - val_acc: 0.9204 Epoch 9/40 555s - loss: 0.3430 - acc: 0.9693 - val_loss: 0.5078 - val_acc: 0.9206 Epoch 10/40 553s - loss: 0.3325 - acc: 0.9707 - val_loss: 0.5107 - val_acc: 0.9191 Epoch 11/40 556s - loss: 0.3220 - acc: 0.9721 - val_loss: 0.5091 - val_acc: 0.9191 Epoch 12/40 556s - loss: 0.3121 - acc: 0.9738 - val_loss: 0.5033 - val_acc: 0.9212 Epoch 13/40 556s - loss: 0.3082 - acc: 0.9723 - val_loss: 0.4970 - val_acc: 0.9226 Epoch 14/40 556s - loss: 0.2986 - acc: 0.9749 - val_loss: 0.5553 - val_acc: 0.9058 Epoch 15/40 555s - loss: 0.2913 - acc: 0.9746 - val_loss: 0.5065 - val_acc: 0.9203 Epoch 16/40 552s - loss: 0.2824 - acc: 0.9762 - val_loss: 0.4912 - val_acc: 0.9218 Epoch 17/40 554s - loss: 0.2774 - acc: 0.9764 - val_loss: 0.5191 - val_acc: 0.9125 Epoch 18/40 554s - loss: 0.2722 - acc: 0.9769 - val_loss: 0.5023 - val_acc: 0.9184 Epoch 19/40 550s - loss: 0.2654 - acc: 0.9771 - val_loss: 0.4965 - val_acc: 0.9183 Epoch 20/40 547s - loss: 0.2603 - acc: 0.9778 - val_loss: 0.5552 - val_acc: 0.9061 Epoch 21/40 547s - loss: 0.2549 - acc: 0.9779 - val_loss: 0.4868 - val_acc: 0.9168 Epoch 22/40 547s - loss: 0.2494 - acc: 0.9793 - val_loss: 0.4754 - val_acc: 0.9242 Epoch 23/40 547s - loss: 0.2462 - acc: 0.9785 - val_loss: 0.5014 - val_acc: 0.9136 Epoch 24/40 548s - loss: 0.2427 - acc: 0.9792 - val_loss: 0.5226 - val_acc: 0.9075 Epoch 25/40 547s - loss: 0.2376 - acc: 0.9794 - val_loss: 0.4829 - val_acc: 0.9159 Epoch 26/40 547s - loss: 0.2325 - acc: 0.9800 - val_loss: 0.5066 - val_acc: 0.9125 Epoch 27/40 548s - loss: 0.2312 - acc: 0.9790 - val_loss: 0.4887 - val_acc: 0.9155 Epoch 28/40 548s - loss: 0.2277 - acc: 0.9792 - val_loss: 0.4959 - val_acc: 0.9107 Epoch 29/40 547s - loss: 0.2255 - acc: 0.9788 - val_loss: 0.6025 - val_acc: 0.8956 Epoch 30/40 548s - loss: 0.2216 - acc: 0.9798 - val_loss: 0.4708 - val_acc: 0.9180 Epoch 31/40 548s - loss: 0.2238 - acc: 0.9772 - val_loss: 0.5193 - val_acc: 0.9084 Epoch 32/40 548s - loss: 0.2174 - acc: 0.9790 - val_loss: 0.5216 - val_acc: 0.9100 Epoch 33/40 547s - loss: 0.2176 - acc: 0.9782 - val_loss: 0.4960 - val_acc: 0.9153 Epoch 34/40 548s - loss: 0.2128 - acc: 0.9790 - val_loss: 0.4644 - val_acc: 0.9188 Epoch 35/40 548s - loss: 0.2113 - acc: 0.9795 - val_loss: 0.4759 - val_acc: 0.9196 Epoch 36/40 547s - loss: 0.2090 - acc: 0.9789 - val_loss: 0.5176 - val_acc: 0.9066 Epoch 37/40 548s - loss: 0.2078 - acc: 0.9802 - val_loss: 0.4602 - val_acc: 0.9208 Epoch 38/40 547s - loss: 0.2112 - acc: 0.9772 - val_loss: 0.4998 - val_acc: 0.9096 Epoch 39/40 548s - loss: 0.2051 - acc: 0.9794 - val_loss: 0.5156 - val_acc: 0.9066 Epoch 40/40 547s - loss: 0.2046 - acc: 0.9781 - val_loss: 0.4961 - val_acc: 0.9108
<keras.callbacks.History at 0x7f04f5497d30>
K.set_value(model.optimizer.lr, 0.001)
model.fit(x_train, y_train, 64, 20, validation_data=(x_test, y_test), **parms)
Train on 50000 samples, validate on 10000 samples Epoch 1/20 547s - loss: 0.1885 - acc: 0.9845 - val_loss: 0.4287 - val_acc: 0.9256 Epoch 2/20 548s - loss: 0.1772 - acc: 0.9886 - val_loss: 0.4198 - val_acc: 0.9279 Epoch 3/20 547s - loss: 0.1734 - acc: 0.9901 - val_loss: 0.4181 - val_acc: 0.9283 Epoch 4/20 547s - loss: 0.1706 - acc: 0.9910 - val_loss: 0.4188 - val_acc: 0.9280 Epoch 5/20 548s - loss: 0.1679 - acc: 0.9918 - val_loss: 0.4127 - val_acc: 0.9298 Epoch 6/20 548s - loss: 0.1670 - acc: 0.9921 - val_loss: 0.4159 - val_acc: 0.9301 Epoch 7/20 548s - loss: 0.1650 - acc: 0.9926 - val_loss: 0.4139 - val_acc: 0.9300 Epoch 8/20 547s - loss: 0.1631 - acc: 0.9933 - val_loss: 0.4087 - val_acc: 0.9304 Epoch 9/20 548s - loss: 0.1619 - acc: 0.9934 - val_loss: 0.4150 - val_acc: 0.9302 Epoch 10/20 547s - loss: 0.1609 - acc: 0.9939 - val_loss: 0.4154 - val_acc: 0.9294 Epoch 11/20 547s - loss: 0.1611 - acc: 0.9933 - val_loss: 0.4102 - val_acc: 0.9310 Epoch 12/20 547s - loss: 0.1584 - acc: 0.9943 - val_loss: 0.4105 - val_acc: 0.9306 Epoch 13/20 547s - loss: 0.1594 - acc: 0.9934 - val_loss: 0.4093 - val_acc: 0.9309 Epoch 14/20 547s - loss: 0.1582 - acc: 0.9940 - val_loss: 0.4110 - val_acc: 0.9298 Epoch 15/20 547s - loss: 0.1567 - acc: 0.9942 - val_loss: 0.4080 - val_acc: 0.9315 Epoch 16/20 547s - loss: 0.1565 - acc: 0.9940 - val_loss: 0.4113 - val_acc: 0.9304 Epoch 17/20 548s - loss: 0.1558 - acc: 0.9942 - val_loss: 0.4093 - val_acc: 0.9292 Epoch 18/20 548s - loss: 0.1561 - acc: 0.9939 - val_loss: 0.4079 - val_acc: 0.9310 Epoch 19/20 548s - loss: 0.1552 - acc: 0.9942 - val_loss: 0.4153 - val_acc: 0.9297 Epoch 20/20 547s - loss: 0.1535 - acc: 0.9951 - val_loss: 0.4069 - val_acc: 0.9313
<keras.callbacks.History at 0x7f05ec7ea6a0>
K.set_value(model.optimizer.lr, 0.01)
model.fit(x_train, y_train, 64, 10, validation_data=(x_test, y_test), **parms)
Train on 50000 samples, validate on 10000 samples Epoch 1/10 548s - loss: 0.1819 - acc: 0.9842 - val_loss: 0.4929 - val_acc: 0.9092 Epoch 2/10 547s - loss: 0.2018 - acc: 0.9751 - val_loss: 0.5761 - val_acc: 0.8880 Epoch 3/10 548s - loss: 0.2046 - acc: 0.9742 - val_loss: 0.5411 - val_acc: 0.8950 Epoch 4/10 548s - loss: 0.2008 - acc: 0.9765 - val_loss: 0.5607 - val_acc: 0.8957 Epoch 5/10 548s - loss: 0.1956 - acc: 0.9778 - val_loss: 0.4991 - val_acc: 0.9049 Epoch 6/10 548s - loss: 0.1996 - acc: 0.9760 - val_loss: 0.4714 - val_acc: 0.9112 Epoch 7/10 548s - loss: 0.1947 - acc: 0.9779 - val_loss: 0.5921 - val_acc: 0.8855 Epoch 8/10 547s - loss: 0.1958 - acc: 0.9770 - val_loss: 0.5096 - val_acc: 0.9058 Epoch 9/10 547s - loss: 0.1976 - acc: 0.9754 - val_loss: 0.5129 - val_acc: 0.9041 Epoch 10/10 548s - loss: 0.1940 - acc: 0.9767 - val_loss: 0.5693 - val_acc: 0.8869
<keras.callbacks.History at 0x7f04f52ac668>
K.set_value(model.optimizer.lr, 0.001)
model.fit(x_train, y_train, 64, 20, validation_data=(x_test, y_test), **parms)
Train on 50000 samples, validate on 10000 samples Epoch 1/20 548s - loss: 0.1879 - acc: 0.9801 - val_loss: 0.4073 - val_acc: 0.9270 Epoch 2/20 548s - loss: 0.1631 - acc: 0.9893 - val_loss: 0.4040 - val_acc: 0.9265 Epoch 3/20 547s - loss: 0.1601 - acc: 0.9905 - val_loss: 0.4007 - val_acc: 0.9295 Epoch 4/20 547s - loss: 0.1560 - acc: 0.9919 - val_loss: 0.4016 - val_acc: 0.9294 Epoch 5/20 548s - loss: 0.1540 - acc: 0.9921 - val_loss: 0.3988 - val_acc: 0.9293 Epoch 6/20 547s - loss: 0.1529 - acc: 0.9926 - val_loss: 0.4013 - val_acc: 0.9283 Epoch 7/20 548s - loss: 0.1497 - acc: 0.9937 - val_loss: 0.3984 - val_acc: 0.9312 Epoch 8/20 548s - loss: 0.1508 - acc: 0.9929 - val_loss: 0.3993 - val_acc: 0.9304 Epoch 9/20 547s - loss: 0.1486 - acc: 0.9937 - val_loss: 0.3988 - val_acc: 0.9303 Epoch 10/20 547s - loss: 0.1471 - acc: 0.9938 - val_loss: 0.3978 - val_acc: 0.9302 Epoch 11/20 547s - loss: 0.1460 - acc: 0.9942 - val_loss: 0.3945 - val_acc: 0.9306 Epoch 12/20 547s - loss: 0.1453 - acc: 0.9943 - val_loss: 0.3988 - val_acc: 0.9292 Epoch 13/20 547s - loss: 0.1456 - acc: 0.9939 - val_loss: 0.4004 - val_acc: 0.9298 Epoch 14/20 547s - loss: 0.1434 - acc: 0.9946 - val_loss: 0.3978 - val_acc: 0.9314 Epoch 15/20 547s - loss: 0.1427 - acc: 0.9946 - val_loss: 0.3974 - val_acc: 0.9311 Epoch 16/20 547s - loss: 0.1417 - acc: 0.9949 - val_loss: 0.3978 - val_acc: 0.9320 Epoch 17/20 548s - loss: 0.1403 - acc: 0.9954 - val_loss: 0.4010 - val_acc: 0.9317 Epoch 18/20 548s - loss: 0.1395 - acc: 0.9955 - val_loss: 0.3989 - val_acc: 0.9324 Epoch 19/20 547s - loss: 0.1409 - acc: 0.9951 - val_loss: 0.3997 - val_acc: 0.9312 Epoch 20/20 548s - loss: 0.1402 - acc: 0.9948 - val_loss: 0.3973 - val_acc: 0.9323
<keras.callbacks.History at 0x7f04f5264588>
And with roughly 93% validation accuracy and no data augmentation, we've essentially replicated their state-of-the-art results!
%time model.save_weights('models/93.h5')
CPU times: user 31.1 s, sys: 452 ms, total: 31.6 s Wall time: 31.1 s