cyto_dl.datamodules.multidim_image module
- class cyto_dl.datamodules.multidim_image.MultiDimImageDataset(csv_path: Path | str, img_path_column: str, channel_column: str, out_key: str, spatial_dims: int = 3, scene_column: str = 'scene', time_start_column: str = 'start', time_stop_column: str = 'stop', time_step_column: str = 'step', dict_meta: Dict | None = None, transform: Callable | None = None, dask_load: bool = True)
Bases:
Dataset
Dataset that converts a .csv file listing multidimensional (timelapse or multi-scene) files and their metadata into batches of single-scene, single-timepoint, single-channel images.
- Parameters:
csv_path (Union[Path, str]) – Path to the CSV file
img_path_column (str) – Column in csv_path that contains the path to the multidimensional (timelapse or multi-scene) file
channel_column (str) – Column in csv_path that contains which channel to extract from multi dimensional (timelapse or multi-scene) file. Should be an integer.
out_key (str) – Key where single-scene/timepoint/channel is saved in output dictionary
spatial_dims (int=3) – Spatial dimension of output image. Must be 2 for YX or 3 for ZYX
scene_column (str="scene") – Column in csv_path that contains scenes to extract from multi-scene file. If not specified, all scenes will be extracted. If multiple scenes are specified, they should be separated by a comma (e.g. scene1,scene2)
time_start_column (str="start") – Column in csv_path specifying the timepoint in the timelapse image at which to start extracting. If any of time_start_column, time_stop_column, or time_step_column is not specified, all timepoints are extracted.
time_stop_column (str="stop") – Column in csv_path specifying the timepoint in the timelapse image at which to stop extracting. If any of time_start_column, time_stop_column, or time_step_column is not specified, all timepoints are extracted.
time_step_column (str="step") – Column in csv_path specifying the step between timepoints. For example, values in this column should be 2 if every other timepoint should be run. If any of time_start_column, time_stop_column, or time_step_column is not specified, all timepoints are extracted.
dict_meta (Optional[Dict]) – Dictionary version of CSV file. If not provided, CSV file is read from csv_path.
transform (Optional[Callable] = None) – Callable that accepts a numpy array. For example, image normalization functions could be passed here.
dask_load (bool = True) – Whether to use dask to load images. If False, full images are loaded into memory before extracting specified scenes/timepoints.
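The column layout described above can be sketched as a small manifest file. This is a minimal illustration, not the library's own code: the file paths and the helpers `write_manifest`, `read_manifest`, `split_scenes`, and `expand_timepoints` are hypothetical names introduced here. In particular, the documentation does not state whether the stop timepoint is inclusive, so `expand_timepoints` exposes that choice as an explicit flag rather than assuming one behavior. `build_dataset` only forwards the documented constructor arguments and requires cyto_dl to be installed.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical manifest rows; the image paths are placeholders, not real data.
# Column names match the documented defaults ('scene', 'start', 'stop', 'step').
rows = [
    {"path": "/data/movie_a.ome.tiff", "channel": 0,
     "scene": "scene1,scene2", "start": 0, "stop": 10, "step": 2},
    {"path": "/data/movie_b.ome.tiff", "channel": 1,
     "scene": "scene1", "start": 0, "stop": 4, "step": 1},
]


def write_manifest(rows, csv_path):
    """Write one CSV row per multidimensional file."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return Path(csv_path)


def read_manifest(csv_path):
    """Read the manifest back as a list of dictionaries (all values are strings)."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))


def split_scenes(value):
    """Split a comma-separated scene entry such as 'scene1,scene2'."""
    return [s.strip() for s in value.split(",") if s.strip()]


def expand_timepoints(start, stop, step, inclusive_stop=False):
    """Illustrative expansion of start/stop/step into timepoint indices.

    Whether the library treats `stop` as inclusive is an assumption left to the caller.
    """
    end = stop + 1 if inclusive_stop else stop
    return list(range(start, end, step))


def build_dataset(csv_path):
    """Construct the dataset using only the documented arguments (needs cyto_dl)."""
    from cyto_dl.datamodules.multidim_image import MultiDimImageDataset

    return MultiDimImageDataset(
        csv_path=csv_path,
        img_path_column="path",
        channel_column="channel",
        out_key="image",
        spatial_dims=3,
    )


manifest_path = write_manifest(rows, Path(tempfile.mkdtemp()) / "manifest.csv")
loaded = read_manifest(manifest_path)
```

The `csv` module quotes the comma-separated scene entry automatically, so `scene1,scene2` stays in a single cell.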
- cyto_dl.datamodules.multidim_image.make_multidim_image_dataloader(csv_path: Path | str | None = None, img_path_column: str = 'path', channel_column: str = 'channel', out_key: str = 'image', spatial_dims: int = 3, scene_column: str = 'scene', time_start_column: str = 'start', time_stop_column: str = 'stop', time_step_column: str = 'step', dict_meta: Dict | None = None, transforms: List[Callable] | Tuple[Callable] | ListConfig | None = None, **dataloader_kwargs) DataLoader
Creates a DataLoader for a MultiDimImageDataset. Currently, this dataset is only useful for prediction; it cannot be used for training or testing.
- Parameters:
csv_path (Optional[Union[Path, str]]) – Path to the CSV file
img_path_column (str) – Column in csv_path that contains the path to the multidimensional (timelapse or multi-scene) file
channel_column (str) – Column in csv_path that contains which channel to extract from multi dim image file. Should be an integer.
out_key (str) – Key where single-scene/timepoint/channel is saved in output dictionary
spatial_dims (int) – Spatial dimension of output image. Must be 2 for YX or 3 for ZYX
scene_column (str) – Column in csv_path that contains scenes to extract from multi-scene file. If not specified, all scenes will be extracted. If multiple scenes are specified, they should be separated by a comma (e.g. scene1,scene2)
time_start_column (str) – Column in csv_path specifying the timepoint in the timelapse image at which to start extracting. If any of time_start_column, time_stop_column, or time_step_column is not specified, all timepoints are extracted.
time_stop_column (str) – Column in csv_path specifying the timepoint in the timelapse image at which to stop extracting. If any of time_start_column, time_stop_column, or time_step_column is not specified, all timepoints are extracted.
time_step_column (str) – Column in csv_path specifying the step between timepoints. For example, values in this column should be 2 if every other timepoint should be run. If any of time_start_column, time_stop_column, or time_step_column is not specified, all timepoints are extracted.
dict_meta (Optional[Dict]) – Dictionary version of CSV file. If not provided, CSV file is read from csv_path.
transforms (Optional[Union[List[Callable], Tuple[Callable], ListConfig]]) – List, tuple, or ListConfig of callables that each accept a numpy array. For example, image normalization functions could be passed here. Image loading is already handled by the dataset, so no loading transform is needed.
- Returns:
The DataLoader object for the MultiDimImage dataset.
- Return type:
DataLoader
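A prediction-time call might look like the sketch below. It relies only on the signature shown above; `prediction_loader_kwargs`, `make_prediction_loader`, and `iterate_images` are hypothetical helper names, and `manifest.csv` is a placeholder path. Passing `batch_size` through `**dataloader_kwargs` assumes those keyword arguments are forwarded to the underlying PyTorch `DataLoader`, and reading `batch[out_key]` assumes each batch is a dictionary keyed by `out_key`; neither is confirmed by the text above.

```python
from pathlib import Path


def prediction_loader_kwargs(csv_path, batch_size=1):
    """Collect arguments for make_multidim_image_dataloader.

    The column names repeat the documented defaults so the mapping is explicit.
    `batch_size` is assumed to reach the DataLoader via **dataloader_kwargs.
    """
    return {
        "csv_path": Path(csv_path),
        "img_path_column": "path",
        "channel_column": "channel",
        "out_key": "image",
        "spatial_dims": 3,
        "batch_size": batch_size,
    }


def make_prediction_loader(csv_path, batch_size=1):
    """Build the DataLoader; requires cyto_dl to be installed."""
    from cyto_dl.datamodules.multidim_image import make_multidim_image_dataloader

    return make_multidim_image_dataloader(
        **prediction_loader_kwargs(csv_path, batch_size)
    )


def iterate_images(loader, out_key="image"):
    """Yield the entry stored under `out_key` for every batch.

    Assumes each batch is a dictionary keyed by `out_key`.
    """
    for batch in loader:
        yield batch[out_key]


# "manifest.csv" is a placeholder; nothing is loaded until make_prediction_loader runs.
kwargs = prediction_loader_kwargs("manifest.csv", batch_size=2)
```

Keeping the argument construction separate from the library call makes it easy to inspect or log the configuration before any image is read.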