Taking on Kaggle - 33

Peking University/Baidu - Autonomous Driving

Can you predict vehicle angle in different settings?

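＊With the January 21 deadline in mind, I am working to understand the reference code, get it running, and improve its performance.

＊To comply with Kaggle's terms of use, I have stopped showing specifics of the information I obtained on Kaggle.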

＊Efforts to improve the Kaggle score

・Restore the pixel count to its original value and see what happens to the score.

＊Programming

・Look into how data augmentation is being handled.

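＊Kaggle

・Restored the pixel count and started the run.

・However, only 6.5 hours of GPU time remain, so there is a good chance I will not be able to commit. If that happens, the rest has to wait until Saturday. Unless I keep a close eye on the GPU and stop it when it is not needed, it may keep running in the background.

・A warning appeared: "You have 6 hours remaining." At this rate the commit will probably be rejected.

・The run finished, so I tried committing. It was accepted with 5 hours 6 minutes remaining. I hope it goes all the way through.

・"Complete" appeared, and I submitted the predictions. However, the score did not reproduce the value I had obtained before. In any case, compared with the program I ran just before, where the only change was halving the pixel count, the score should have doubled but came out at only about 1.5 times. Even allowing for run-to-run variation, there is no doubt that the score went down when only the pixel count was increased.

・In other words, the experiment shows that increasing the pixel count alone does not help. This leaves 90 minutes of this week's GPU allowance, which is effectively zero.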

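＊To wrap up this competition, I would like to try data augmentation.

＊Programming

・In Lesson 3 of the fastai video lectures, in the part on segmentation, data augmentation (called transforms in fastai) is explained. My understanding is that deforming the original image makes it stop matching the mask, so in that example the width is kept fixed and only the height direction is deformed.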
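To make the alignment issue concrete for myself, here is a minimal NumPy sketch, not fastai's actual implementation. The idea is to draw the random parameters once and apply them to both the image and the mask. The function name `joint_random_crop` and the toy 8x8 data are my own assumptions for illustration.

```python
import numpy as np


def joint_random_crop(image, mask, crop_h, crop_w, rng):
    """Crop image and mask with one shared, randomly drawn window."""
    h, w = image.shape[:2]
    # Draw the window position once so both arrays see the same geometry.
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    window = (slice(top, top + crop_h), slice(left, left + crop_w))
    return image[window], mask[window]


# Toy data: a bright 3x3 square and the mask that marks it.
rng = np.random.default_rng(0)
image = np.zeros((8, 8), dtype=np.float32)
image[2:5, 3:6] = 1.0
mask = image > 0

crop_img, crop_mask = joint_random_crop(image, mask, 6, 6, rng)
# The mask still marks exactly the bright pixels of the cropped image.
print(np.array_equal(crop_img > 0, crop_mask))
```

If the crop window were drawn separately for the image and the mask, the two would generally stop lining up, which is the problem the lecture points out.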

・The Keras manual gives the following explanation.

Example of transforming images and masks together.

    # we create two instances with the same arguments
    data_gen_args = dict(featurewise_center=True,
                         featurewise_std_normalization=True,
                         rotation_range=90,
                         width_shift_range=0.1,
                         height_shift_range=0.1,
                         zoom_range=0.2)
    image_datagen = ImageDataGenerator(**data_gen_args)
    mask_datagen = ImageDataGenerator(**data_gen_args)

    # Provide the same seed and keyword arguments to the fit and flow methods
    seed = 1
    image_datagen.fit(images, augment=True, seed=seed)
    mask_datagen.fit(masks, augment=True, seed=seed)

    image_generator = image_datagen.flow_from_directory(
        'data/images',
        class_mode=None,
        seed=seed)

    mask_generator = mask_datagen.flow_from_directory(
        'data/masks',
        class_mode=None,
        seed=seed)

    # combine generators into one which yields image and masks
    train_generator = zip(image_generator, mask_generator)

    model.fit_generator(
        train_generator,
        steps_per_epoch=2000,
        epochs=50)
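The Keras example relies on giving two separate generators the same seed, so that both draw the same sequence of random transforms. The toy below only illustrates that principle in plain NumPy; it does not use or reproduce Keras internals. `ToyShiftGenerator` is a hypothetical class I made up, and the random shift stands in for whatever augmentation a real generator applies.

```python
import numpy as np


class ToyShiftGenerator:
    """Hypothetical stand-in for an augmenting generator with its own RNG."""

    def __init__(self, data, max_shift, seed):
        self.data = data            # array of shape (N, H, W)
        self.max_shift = max_shift
        self.rng = np.random.default_rng(seed)

    def __iter__(self):
        return self

    def __next__(self):
        # Draw one vertical and one horizontal shift per batch.
        dy, dx = self.rng.integers(-self.max_shift, self.max_shift + 1, size=2)
        return np.roll(self.data, shift=(int(dy), int(dx)), axis=(1, 2))


images = np.zeros((2, 8, 8), dtype=np.float32)
images[:, 2:5, 3:6] = 1.0
masks = images > 0

# Same seed and same arguments for both generators, as in the Keras example.
seed = 1
image_gen = ToyShiftGenerator(images, max_shift=2, seed=seed)
mask_gen = ToyShiftGenerator(masks, max_shift=2, seed=seed)

# Combine the two generators into one that yields (image, mask) pairs.
train_generator = zip(image_gen, mask_gen)
pairs = [next(train_generator) for _ in range(5)]
print(all(np.array_equal(img > 0, msk) for img, msk in pairs))
```

Because each generator owns its random state, the pairing only holds while both are advanced in lockstep, which is what `zip` does here.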

・For PyTorch, several approaches are shown in "Proposal for extending transforms #230" in the "pytorch/vision" repository on GitHub. I am copying one example from it here.

class Dataset(object):
    def __init__(self, transforms=None):
        self.transforms = transforms
    def __getitem__(self, idx):
        # get image1, image2, bounding_box
        # the transforms takes all inputs into account
        if self.transforms:
            image1, image2, bounding_box = self.transforms(image1, image2, bounding_box)
        return image1, image2, bounding_box

from torchvision.transforms import random_horizontal_flip

class MyJointRandomFlipTransform(object):
    def __call__(self, image1, image2, bounding_box):
        # provide a functional interface for the current transforms
        # so that they can be easily reused, and have the parameters
        # of the transformation if needed
        image1, params = random_horizontal_flip(image1, return_params=True)
        # reuses the same transformations, if wanted
        image2 = random_horizontal_flip(image2, params=params)
        # no transformation in torchvision for bounding_box, have to do it
        # ourselves
        if params.flip:
            bounding_box[:, 1] = image1.size(2) - bounding_box[:, 1]
            bounding_box[:, 3] = image1.size(2) - bounding_box[:, 3]
        return image1, image2, bounding_box
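The snippet above is proposal code, so as a way of checking my understanding I wrote a small runnable sketch of the same idea using NumPy arrays in the channel-first (C, H, W) layout. This is my own illustration, not the torchvision API. The function name `joint_random_hflip` is hypothetical, and I assume boxes are stored as [x_min, y_min, x_max, y_max] in pixel-edge coordinates, which may differ from the layout the proposal had in mind.

```python
import numpy as np


def joint_random_hflip(image1, image2, boxes, rng, p=0.5):
    """Flip both CHW images and the boxes together, or leave all untouched.

    boxes: array of shape (N, 4), assumed to be [x_min, y_min, x_max, y_max].
    """
    # One random decision shared by every input.
    if rng.random() >= p:
        return image1, image2, boxes
    width = image1.shape[2]
    image1 = image1[:, :, ::-1].copy()
    image2 = image2[:, :, ::-1].copy()
    flipped = boxes.copy()
    # Mirroring swaps the roles of the left and right edges.
    flipped[:, 0] = width - boxes[:, 2]
    flipped[:, 2] = width - boxes[:, 0]
    return image1, image2, flipped


rng = np.random.default_rng(0)
img1 = np.zeros((1, 4, 6), dtype=np.float32)
img1[0, 1:3, 0:2] = 1.0                      # object in the two left columns
img2 = img1 * 2
boxes = np.array([[0.0, 1.0, 2.0, 3.0]])     # box around that object

# p=1.0 forces the flip so the effect is visible.
out1, out2, out_boxes = joint_random_hflip(img1, img2, boxes, rng, p=1.0)
print(out_boxes)
```

With PyTorch tensors the image flip could presumably be written as `torch.flip(t, dims=[-1])`, while the box arithmetic would stay the same.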