Octopod Vision

The computer vision components of Octopod are housed here. This includes sample model architectures, dataset classes, and helper functions.

Model Architectures

class octopod.vision.models.multi_task_resnet.ResnetForMultiTaskClassification(pretrained_task_dict=None, new_task_dict=None, load_pretrained_resnet=False)

PyTorch image attribute model. This model allows you to load in some pretrained tasks in addition to creating new ones.

Examples

To instantiate a completely new instance of ResnetForMultiTaskClassification and load pretrained ResNet50 weights into the architecture, set load_pretrained_resnet to True:

model = ResnetForMultiTaskClassification(
    new_task_dict=new_task_dict,
    load_pretrained_resnet=True
)

# DO SOME TRAINING

model.save(SOME_FOLDER, SOME_MODEL_ID)

To instantiate an instance of ResnetForMultiTaskClassification that has layers for pretrained tasks and new tasks, you would do the following:

model = ResnetForMultiTaskClassification(
    pretrained_task_dict=pretrained_task_dict,
    new_task_dict=new_task_dict
)

model.load(SOME_FOLDER, SOME_MODEL_ID)

# DO SOME TRAINING
Parameters
  • pretrained_task_dict (dict) – dictionary mapping each pretrained task to the number of labels it has

  • new_task_dict (dict) – dictionary mapping each new task to the number of labels it has

  • load_pretrained_resnet (boolean) – flag for whether or not to load in pretrained weights for ResNet50. Useful for the first round of training, before there are fine-tuned weights

export(folder, model_id, model_name=None)

Exports the entire model state dict to a specific folder. Note: if the model has pretrained_classifiers and new_classifiers, they will be combined into the pretrained_classifiers attribute before being saved.

Parameters
  • folder (str or Path) – place to store state dictionaries

  • model_id (int) – unique id for this model

  • model_name (str (defaults to None)) – name to store the model under; if None, defaults to multi_task_bert_{model_id}.pth

Side Effects

saves one file:
  • folder / model_name
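For illustration only, the default-filename logic described above can be sketched with pathlib. This is a sketch, not the library's implementation; the folder and id values are hypothetical:

```python
from pathlib import Path


def export_path(folder, model_id, model_name=None):
    """Sketch of where export() writes its single file,
    following the default naming described above."""
    if model_name is None:
        model_name = f'multi_task_bert_{model_id}.pth'
    return Path(folder) / model_name


default_path = export_path('models', 7)          # models/multi_task_bert_7.pth
custom_path = export_path('models', 7, 'prod.pth')  # models/prod.pth
```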

forward(x)

Defines forward pass for image model

Parameters
  • x (dict) – dictionary of image tensors containing tensors for the full and cropped images. The full image tensor has the key 'full_img' and the cropped tensor has the key 'crop_img'

Returns

A dictionary mapping each task to its logits

Return type

dict
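As a rough sketch of this input/output contract, with plain Python lists standing in for image tensors and hypothetical task names ('color', 'pattern'):

```python
def toy_forward(x, task_dict):
    """Toy stand-in for forward(): consumes the 'full_img'/'crop_img' dict
    and returns a dict mapping each task to dummy zero 'logits', one per
    label. Illustrates the shape of the contract only, not the real model."""
    assert set(x) >= {'full_img', 'crop_img'}
    return {task: [0.0] * n_labels for task, n_labels in task_dict.items()}


x = {'full_img': [0.1, 0.2], 'crop_img': [0.3, 0.4]}  # tensors in practice
logits = toy_forward(x, {'color': 3, 'pattern': 5})
# logits maps each task name to a logits vector sized by its label count
```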

freeze_all_pretrained()

Freeze pretrained classifier layers and core model layers

freeze_core()

Freeze all core model layers

freeze_dense()

Freeze all dense layers

load(folder, model_id)

Loads the model state dicts from a specific folder.

Parameters
  • folder (str or Path) – place where state dictionaries are stored

  • model_id (int) – unique id for this model

Side Effects

loads from three files:
  • folder / f'resnet_dict_{model_id}.pth'

  • folder / f'dense_layers_dict_{model_id}.pth'

  • folder / f'pretrained_classifiers_dict_{model_id}.pth'

save(folder, model_id)

Saves the model state dicts to a specific folder. Each part of the model is saved separately to allow for new classifiers to be added later.

Note: if the model has pretrained_classifiers and new_classifiers, they will be combined into the pretrained_classifiers_dict.

Parameters
  • folder (str or Path) – place to store state dictionaries

  • model_id (int) – unique id for this model

Side Effects

saves three files:
  • folder / f'resnet_dict_{model_id}.pth'

  • folder / f'dense_layers_dict_{model_id}.pth'

  • folder / f'pretrained_classifiers_dict_{model_id}.pth'
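The three file paths above, which load() later expects to find, can be sketched with pathlib. Again a sketch only, not the library's code; the folder name and id are hypothetical:

```python
from pathlib import Path


def save_paths(folder, model_id):
    """Sketch of the three files save() writes (and load() reads),
    per the naming convention documented above."""
    folder = Path(folder)
    return [
        folder / f'resnet_dict_{model_id}.pth',
        folder / f'dense_layers_dict_{model_id}.pth',
        folder / f'pretrained_classifiers_dict_{model_id}.pth',
    ]


paths = save_paths('models', 3)
# e.g. paths[0].name == 'resnet_dict_3.pth'
```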

unfreeze_pretrained_classifiers()

Unfreeze pretrained classifier layers

unfreeze_pretrained_classifiers_and_core()

Unfreeze pretrained classifiers and core model layers

Dataset

class octopod.vision.dataset.OctopodImageDataset(x, y, transform='train', crop_transform='train')

Load data specifically for use with image models

Parameters
  • x (pandas Series) – file paths to stored images

  • y (list) – a list of dummy-encoded categories or strings. For instance, y might be [0, 1, 2, 0] for a 3-class problem with 4 samples, or a list of strings, which will be encoded using a sklearn label encoder

  • transform (str or list of PyTorch transforms) – specifies how to preprocess the full image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms

  • crop_transform (str or list of PyTorch transforms) – specifies how to preprocess the center-cropped image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms

class octopod.vision.dataset.OctopodImageDatasetMultiLabel(x, y, transform='train', crop_transform='train')

Subclass of OctopodImageDataset used for multi-label tasks

Parameters
  • x (pandas Series) – file paths to stored images

  • y (list) – a list of lists of binary-encoded categories or strings, with length equal to the number of classes in the multi-label task. For a 4-class multi-label task a sample list would be [1, 0, 0, 1]. A string example would be ['cat', 'dog'] (if the classes were ['cat', 'frog', 'rabbit', 'dog']), which will be encoded using a sklearn label encoder to [1, 0, 0, 1]

  • transform (str or list of PyTorch transforms) – specifies how to preprocess the full image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms

  • crop_transform (str or list of PyTorch transforms) – specifies how to preprocess the center-cropped image for an Octopod image model. To use the built-in Octopod image transforms, use the strings 'train' or 'val'. To use custom transformations, supply a list of PyTorch transforms
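The string-to-binary encoding described for y above can be illustrated in plain Python. This is an illustration only; the real dataset uses a sklearn label encoder, and the class list here is hypothetical:

```python
def binarize(labels, classes):
    """Return a binary vector with a 1 wherever a class appears in labels,
    mirroring the multi-label encoding described above."""
    present = set(labels)
    return [1 if c in present else 0 for c in classes]


classes = ['cat', 'frog', 'rabbit', 'dog']
binarize(['cat', 'dog'], classes)  # → [1, 0, 0, 1]
```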

Helper Functions

octopod.vision.helpers.center_crop_pil_image(img)

Helper function to crop the center out of images.

Utilizes the centercrop function from wildebeest

Parameters

img (PIL.Image) – PIL image to crop

Returns

Slice of the input image corresponding to a cropped area around the center

Return type

PIL.Image