cyto_dl.datamodules.dataframe.utils module

class cyto_dl.datamodules.dataframe.utils.AlternatingBatchSampler(subset: torch.utils.data.dataset.Subset, target_columns: typing.Sequence[str] | None = None, grouping_column: typing.Sequence[str] | None = None, batch_size: int = 1, drop_last: bool = False, shuffle: bool = False, sampler: torch.utils.data.sampler.Sampler = torch.utils.data.sampler.SubsetRandomSampler)

Bases: BatchSampler

Subclass of PyTorch’s BatchSampler that alternates between sampling from mutually exclusive columns of a dataframe dataset.

Parameters:

subset (Subset) – Subset of a MONAI dataset wrapping a dataframe
target_columns (Sequence[str]) – Names of the columns in the subset dataframe that represent the types of ground truth images to alternate between
batch_size (int) – Size of batch
drop_last (bool=False) – Whether to drop last incomplete batch
shuffle (bool=False) – Whether to randomly select between columns in target_columns. If False, batches will follow the order of target_columns
sampler (Sampler=SubsetRandomSampler) – Sampler to sample from each column in target_columns
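
The alternation described above can be illustrated without PyTorch. The following is a minimal sketch, not the library's implementation: the helper name alternating_batches, the column names "seg" and "dist", and the choice to stop once the shortest column runs out of batches are all assumptions made for illustration, and the shuffle and sampler options are left out.

```python
import math


def is_missing(value):
    """Treat None and float NaN as 'no ground truth available'."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def alternating_batches(rows, target_columns, batch_size=1, drop_last=False):
    """Yield (column, row_indices) pairs, cycling through target_columns.

    Hypothetical helper: each batch only contains rows that have a value
    for the column the batch is associated with.
    """
    per_column = {}
    for col in target_columns:
        # Indices of rows where this target column is present.
        idx = [i for i, row in enumerate(rows) if not is_missing(row.get(col))]
        batches = [idx[s:s + batch_size] for s in range(0, len(idx), batch_size)]
        if drop_last and batches and len(batches[-1]) < batch_size:
            batches.pop()  # discard the incomplete final batch
        per_column[col] = batches

    # Assumption: stop when the column with the fewest batches is exhausted.
    n_rounds = min((len(b) for b in per_column.values()), default=0)
    for r in range(n_rounds):
        for col in target_columns:  # follow the order of target_columns
            yield col, per_column[col][r]


nan = float("nan")
rows = [
    {"raw": "a.tif", "seg": "a_seg.tif", "dist": nan},
    {"raw": "b.tif", "seg": nan, "dist": "b_dist.tif"},
    {"raw": "c.tif", "seg": "c_seg.tif", "dist": nan},
    {"raw": "d.tif", "seg": nan, "dist": "d_dist.tif"},
]

batches = list(alternating_batches(rows, ["seg", "dist"], batch_size=2))
print(batches)  # [('seg', [0, 2]), ('dist', [1, 3])]
```

Every batch draws from a single target column, so a multi-task model sees one kind of ground truth per training step.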

class cyto_dl.datamodules.dataframe.utils.RemoveNaNKeysd

Bases: Transform

Transform to remove ‘nan’ keys from data dictionary.

Combined with allow_missing_keys=True on the transforms and with the alternating batch sampler, this allows multi-task training when only one target is available at a time.
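
As a minimal sketch of the idea, assuming that ‘nan’ keys are keys whose value is a float NaN, the filtering step might look like the following. The function remove_nan_keys and the sample keys are hypothetical and stand in for the transform; they are not the library's code.

```python
import math


def remove_nan_keys(data):
    """Return a copy of `data` without entries whose value is a float NaN."""
    return {
        key: value
        for key, value in data.items()
        if not (isinstance(value, float) and math.isnan(value))
    }


# A row where only the "dist" target is available (hypothetical column names).
sample = {"raw": "b.tif", "seg": float("nan"), "dist": "b_dist.tif"}
cleaned = remove_nan_keys(sample)
print(cleaned)  # {'raw': 'b.tif', 'dist': 'b_dist.tif'}
```

Once the missing key is gone, later transforms configured with allow_missing_keys=True can skip it instead of failing on a NaN value.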

cyto_dl.datamodules.dataframe.utils.get_canonical_split_name(split)

cyto_dl.datamodules.dataframe.utils.get_dataset(dataframe, transform, split, cache_dir=None, smartcache_args=None)

cyto_dl.datamodules.dataframe.utils.make_multiple_dataframe_splits(split_path, transforms, columns=None, just_inference=False, cache_dir=None, smartcache_args=None)

cyto_dl.datamodules.dataframe.utils.make_single_dataframe_splits(dataframe_path, transforms, split_column, columns=None, just_inference=False, split_map=None, cache_dir=None, smartcache_args=None)

cyto_dl.datamodules.dataframe.utils.parse_transforms(transforms)