Octopod Vision

The computer vision components of Octopod are housed here. This includes sample model architectures, dataset classes, and helper functions.

Model Architectures

class octopod.vision.models.multi_task_resnet.ResnetForMultiTaskClassification(pretrained_task_dict=None, new_task_dict=None, load_pretrained_resnet=False)

PyTorch image attribute model. This model allows you to load in some pretrained tasks in addition to creating new ones.

Examples

To instantiate a completely new instance of ResnetForMultiTaskClassification and load pretrained ResNet50 weights into the architecture, set load_pretrained_resnet to True:

model = ResnetForMultiTaskClassification(
    new_task_dict=new_task_dict,
    load_pretrained_resnet=True
)

# DO SOME TRAINING

model.save(SOME_FOLDER, SOME_MODEL_ID)

To instantiate an instance of ResnetForMultiTaskClassification that has layers for pretrained tasks and new tasks, you would do the following:

model = ResnetForMultiTaskClassification(
    pretrained_task_dict=pretrained_task_dict,
    new_task_dict=new_task_dict
)

model.load(SOME_FOLDER, SOME_MODEL_ID)

# DO SOME TRAINING
Parameters
  • pretrained_task_dict (dict) – dictionary mapping each pretrained task to the number of labels it has

  • new_task_dict (dict) – dictionary mapping each new task to the number of labels it has

  • load_pretrained_resnet (boolean) – flag for whether or not to load in pretrained weights for ResNet50. Useful for the first round of training, before there are fine-tuned weights

export(folder, model_id, model_name=None)

Exports the entire model state dict to a specific folder. Note: if the model has pretrained_classifiers and new_classifiers, they will be combined into the pretrained_classifiers attribute before being saved.

Parameters
  • folder (str or Path) – place to store state dictionaries

  • model_id (int) – unique id for this model

  • model_name (str (defaults to None)) – name to store the model under; if None, defaults to multi_task_bert_{model_id}.pth

Side Effects

saves one file:
  • folder / model_name
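For illustration only, the default-filename logic described above can be sketched with pathlib. This is a sketch, not the library's implementation; the folder and id values are hypothetical:

```python
from pathlib import Path


def export_path(folder, model_id, model_name=None):
    """Sketch of where export() writes its single file,
    following the default naming described above."""
    if model_name is None:
        model_name = f'multi_task_bert_{model_id}.pth'
    return Path(folder) / model_name


default_path = export_path('models', 7)          # models/multi_task_bert_7.pth
custom_path = export_path('models', 7, 'prod.pth')  # models/prod.pth
```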

forward(x)

Defines forward pass for image model

Parameters
  • x (dict) – dictionary of image tensors containing tensors for the full and cropped images. The full image tensor has the key 'full_img' and the cropped tensor has the key 'crop_img'

Returns

A dictionary mapping each task to its logits

Return type

dict
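As a rough sketch of this input/output contract, with plain Python lists standing in for image tensors and hypothetical task names ('color', 'pattern'):

```python
def toy_forward(x, task_dict):
    """Toy stand-in for forward(): consumes the 'full_img'/'crop_img' dict
    and returns a dict mapping each task to dummy zero 'logits', one per
    label. Illustrates the shape of the contract only, not the real model."""
    assert set(x) >= {'full_img', 'crop_img'}
    return {task: [0.0] * n_labels for task, n_labels in task_dict.items()}


x = {'full_img': [0.1, 0.2], 'crop_img': [0.3, 0.4]}  # tensors in practice
logits = toy_forward(x, {'color': 3, 'pattern': 5})
# logits maps each task name to a logits vector sized by its label count
```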

freeze_all_pretrained()

Freeze pretrained classifier layers and core model layers

freeze_core()

Freeze all core model layers

freeze_dense()

Freeze all dense layers

load(folder, model_id)

Loads the model state dicts from a specific folder.

Parameters
  • folder (str or Path) – place where state dictionaries are stored

  • model_id (int) – unique id for this model

Side Effects

loads from three files:
  • folder / f'resnet_dict_{model_id}.pth'

  • folder / f'dense_layers_dict_{model_id}.pth'

  • folder / f'pretrained_classifiers_dict_{model_id}.pth'

save(folder, model_id)

Saves the model state dicts to a specific folder. Each part of the model is saved separately to allow for new classifiers to be added later.

Note: if the model has pretrained_classifiers and new_classifiers, they will be combined into the pretrained_classifiers_dict.

Parameters
  • folder (str or Path) – place to store state dictionaries

  • model_id (int) – unique id for this model

Side Effects

saves three files:
  • folder / f'resnet_dict_{model_id}.pth'

  • folder / f'dense_layers_dict_{model_id}.pth'

  • folder / f'pretrained_classifiers_dict_{model_id}.pth'
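The three file paths above, which load() later expects to find, can be sketched with pathlib. Again a sketch only, not the library's code; the folder name and id are hypothetical:

```python
from pathlib import Path


def save_paths(folder, model_id):
    """Sketch of the three files save() writes (and load() reads),
    per the naming convention documented above."""
    folder = Path(folder)
    return [
        folder / f'resnet_dict_{model_id}.pth',
        folder / f'dense_layers_dict_{model_id}.pth',
        folder / f'pretrained_classifiers_dict_{model_id}.pth',
    ]


paths = save_paths('models', 3)
# e.g. paths[0].name == 'resnet_dict_3.pth'
```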

unfreeze_pretrained_classifiers()

Unfreeze pretrained classifier layers

unfreeze_pretrained_classifiers_and_core()

Unfreeze pretrained classifiers and core model layers

Dataset

class octopod.vision.dataset.OctopodImageDataset(x, y, transform='train', crop_transform='train')

Load data specifically for use with image models

Parameters
  • x (pandas Series) – file paths to stored images

  • y (list) – a list of dummy-encoded categories or strings. For instance, y might be [0, 1, 2, 0] for a 3-class problem with 4 samples, or a list of strings, which will be encoded using a sklearn label encoder

  • transform (str or list of PyTorch transforms) – specifies how to preprocess the full image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms

  • crop_transform (str or list of PyTorch transforms) – specifies how to preprocess the center-cropped image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms

class octopod.vision.dataset.OctopodImageDatasetMultiLabel(x, y, transform='train', crop_transform='train')

Subclass of OctopodImageDataset used for multi-label tasks

Parameters
  • x (pandas Series) – file paths to stored images

  • y (list) – a list of lists of binary-encoded categories or strings, with length equal to the number of classes in the multi-label task. For a 4-class multi-label task a sample list would be [1, 0, 0, 1]. A string example would be ['cat', 'dog'] (if the classes were ['cat', 'frog', 'rabbit', 'dog']), which will be encoded using a sklearn label encoder to [1, 0, 0, 1]

  • transform (str or list of PyTorch transforms) – specifies how to preprocess the full image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms

  • crop_transform (str or list of PyTorch transforms) – specifies how to preprocess the center-cropped image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms
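The string-to-binary encoding described for y above can be illustrated in plain Python. This is an illustration only; the real dataset uses a sklearn label encoder, and the class list here is hypothetical:

```python
def binarize(labels, classes):
    """Return a binary vector with a 1 wherever a class appears in labels,
    mirroring the multi-label encoding described above."""
    present = set(labels)
    return [1 if c in present else 0 for c in classes]


classes = ['cat', 'frog', 'rabbit', 'dog']
binarize(['cat', 'dog'], classes)  # → [1, 0, 0, 1]
```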

Helper Functions

octopod.vision.helpers.center_crop_pil_image(img)

Helper function to crop the center out of images.

Utilizes the centercrop function from wildebeest

Parameters

img (PIL.Image) – PIL image to crop

Returns

Slice of the input image corresponding to a cropped area around the center

Return type

PIL.Image