cyto_dl.datamodules.smartcache module

class cyto_dl.datamodules.smartcache.SmartcacheDatamodule(csv_path: Path | str | None = None, transforms: Compose | None = None, img_data: Path | str | None = None, n_val: int = 20, pct_val: float = 0.1, img_path_column: str = 'raw', channel_column: str = 'ch', spatial_dims: int = 3, num_neighbors: int = 0, num_workers: int = 4, cache_rate: float = 0.5, replace_rate: float = 0.1, **kwargs)

Bases: LightningDataModule

Datamodule for large CZI datasets that don’t fit in memory.
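
The idea of keeping only part of a dataset in memory and rotating items through it can be illustrated with a small simulation. This is a conceptual sketch only: it assumes a sliding-window replacement scheme similar to MONAI's SmartCacheDataset, which this page does not confirm is what the datamodule uses. The function name simulate_smart_cache and the rounding rules are hypothetical.

```python
import math
from collections import deque


def simulate_smart_cache(n_items, cache_rate=0.5, replace_rate=0.1, n_epochs=3):
    """Return the item indices held in the cache at the start of each epoch.

    Hypothetical illustration; not the cyto_dl implementation.
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    # Assumption: cache size is the truncated fraction of the dataset, at least 1.
    cache_num = min(n_items, max(1, int(n_items * cache_rate)))
    # Assumption: never replace more items than exist outside the cache.
    replace_num = min(math.ceil(cache_num * replace_rate), n_items - cache_num)

    cache = deque(range(cache_num))
    next_item = cache_num % n_items  # next dataset index to bring into the cache
    history = []
    for _ in range(n_epochs):
        history.append(list(cache))
        # Drop the oldest cached items and load the next ones in dataset order.
        for _ in range(replace_num):
            cache.popleft()
            cache.append(next_item)
            next_item = (next_item + 1) % n_items
    return history


if __name__ == "__main__":
    for epoch, cached in enumerate(simulate_smart_cache(10)):
        print(epoch, cached)
```

Under these assumptions, only cache_rate of the data is resident at any time, and every item is eventually visited as the window advances.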

Parameters:
  • csv_path (Union[Path, str]) – path to a CSV file with the image path in img_path_column and the channel in channel_column

  • transforms (Compose) – MONAI transforms to apply to each image. Should start with a transform that uses bioio for image reading

  • img_data (Union[Path, str]) – path to a CSV file generated by get_per_file_args that enumerates scenes and timepoints for each image in csv_path

  • n_val (int) – number of validation images to use. Minimum of pct_val * n_images and n_val is used.

  • pct_val (float) – fraction of images to use for validation (for example, 0.1 for 10%). Minimum of pct_val * n_images and n_val is used.

  • img_path_column (str) – column in csv_path that contains the path to the image

  • channel_column (str) – column in csv_path that contains the channel to use

  • spatial_dims (int) – number of spatial dimensions in the image

  • num_neighbors (int) – number of neighboring timepoints to use

  • num_workers (int) – number of workers to use for loading data. Must be specified here to schedule replacement workers for cache data

  • cache_rate (float) – fraction of the data to cache (for example, 0.5 for 50%)

  • replace_rate (float) – fraction of the data to replace (for example, 0.1 for 10%)

  • kwargs – additional arguments to pass to DataLoader
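
The interaction between n_val and pct_val described above can be sketched as follows. The helper name n_validation_images is hypothetical, and truncating the fractional count to an integer is an assumption; the page only states that the minimum of pct_val * n_images and n_val is used.

```python
def n_validation_images(n_images, n_val=20, pct_val=0.1):
    """Number of validation images under the documented min() rule.

    Hypothetical helper; the rounding behaviour is an assumption.
    """
    # Assumption: the fractional count is truncated to a whole number of images.
    from_fraction = int(pct_val * n_images)
    return min(from_fraction, n_val)


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, n_validation_images(n))
```

With the default arguments, small datasets are limited by pct_val and large datasets are capped at n_val.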

get_per_file_args(df)

Get the image loading arguments for each file in the dataframe in parallel, enumerating all timepoints, channels, and scenes.
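
As a rough illustration of this kind of per-file enumeration, the sketch below expands each file record into one argument dict per scene and timepoint, and processes files concurrently. It is not the library's implementation: the function names, the "scenes" and "n_timepoints" keys, and the output keys are all hypothetical, and in practice the scene and timepoint counts would be read from the image metadata rather than supplied by hand. The "raw" and "ch" keys mirror the default img_path_column and channel_column.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def enumerate_load_args(row):
    """Expand one file record into one dict per (scene, timepoint) pair."""
    return [
        {"path": row["raw"], "channel": row["ch"], "scene": scene, "T": t}
        for scene, t in product(row["scenes"], range(row["n_timepoints"]))
    ]


def get_per_file_args_sketch(rows, max_workers=4):
    """Enumerate loading arguments for every file, one task per file."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results stay grouped by file.
        per_file = list(pool.map(enumerate_load_args, rows))
    # Flatten the per-file lists into a single list of loading arguments.
    return [args for file_args in per_file for args in file_args]


if __name__ == "__main__":
    rows = [
        {"raw": "a.czi", "ch": 0, "scenes": ["S0", "S1"], "n_timepoints": 3},
        {"raw": "b.czi", "ch": 1, "scenes": ["S0"], "n_timepoints": 2},
    ]
    for args in get_per_file_args_sketch(rows):
        print(args)
```

A thread pool is used here because reading image metadata is typically I/O-bound; a process pool would be an equally plausible choice.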

make_dataloader(split)
predict_dataloader()
prepare_data()
setup(stage=None)
test_dataloader()
train_dataloader()
val_dataloader()