cyto_dl.datamodules.smartcache module
- class cyto_dl.datamodules.smartcache.SmartcacheDatamodule(csv_path: Path | str | None = None, transforms: Compose | None = None, img_data: Path | str | None = None, n_val: int = 20, pct_val: float = 0.1, img_path_column: str = 'raw', channel_column: str = 'ch', spatial_dims: int = 3, num_neighbors: int = 0, num_workers: int = 4, cache_rate: float = 0.5, replace_rate: float = 0.1, **kwargs)
Bases: LightningDataModule
Datamodule for large CZI datasets that don’t fit in memory.
- Parameters:
csv_path (Union[Path, str]) – path to csv with image in img_path_column and channel in channel_column
transforms (Compose) – Monai transforms to apply to each image. Should start with a transform that uses bioio for image reading
img_data (Union[Path, str]) – path to the csv generated by get_per_file_args, which enumerates the scenes and timepoints of each image in csv_path
n_val (int) – maximum number of validation images. The validation set size is the minimum of pct_val * n_images and n_val.
pct_val (float) – fraction of images to use for validation. The validation set size is the minimum of pct_val * n_images and n_val.
img_path_column (str) – column in csv_path that contains the path to the image
channel_column (str) – column in csv_path that contains the channel to use
spatial_dims (int) – number of spatial dimensions in the image
num_neighbors (int) – number of neighboring timepoints to use
num_workers (int) – number of workers to use for loading data. Must be specified here so that replacement workers can be scheduled for the cached data
cache_rate (float) – fraction of the data to cache
replace_rate (float) – fraction of the data to replace
kwargs – additional arguments to pass to DataLoader
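As a rough illustration of how n_val and pct_val interact, the sketch below computes the validation set size as the minimum of pct_val * n_images and n_val. The helper name n_validation_images is hypothetical and not part of cyto_dl, and rounding the product down to an integer is an assumption; the library may round differently.

```python
def n_validation_images(n_images: int, n_val: int = 20, pct_val: float = 0.1) -> int:
    """Hypothetical helper (not part of cyto_dl): validation set size.

    Returns the smaller of n_val and pct_val * n_images, mirroring the
    rule stated in the parameter descriptions.
    """
    # Assumption: the fractional count is rounded down to a whole image.
    return min(n_val, int(pct_val * n_images))


# Small dataset: the pct_val term is the smaller one.
small = n_validation_images(50)
# Large dataset: n_val caps the validation set.
large = n_validation_images(1000)
```

With the default values, pct_val limits the validation set for datasets of fewer than 200 images, and n_val caps it at 20 images beyond that.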