cyto_dl.datamodules.multidim_image module

class cyto_dl.datamodules.multidim_image.MultiDimImageDataset(csv_path: Path | str, img_path_column: str, channel_column: str, out_key: str, spatial_dims: int = 3, scene_column: str = 'scene', time_start_column: str = 'start', time_stop_column: str = 'stop', time_step_column: str = 'step', dict_meta: Dict | None = None, transform: Callable | None = None, dask_load: bool = True)

Bases: Dataset

Dataset that converts a .csv file listing multi-dimensional (timelapse or multi-scene) files and some metadata into batches of single-scene, single-timepoint, single-channel images.

Parameters:
  • csv_path (Union[Path, str]) – Path to the CSV file.

  • img_path_column (str) – Column in csv_path that contains the path to the multi-dimensional (timelapse or multi-scene) file.

  • channel_column (str) – Column in csv_path that specifies which channel to extract from the multi-dimensional (timelapse or multi-scene) file. Values should be integers.

  • out_key (str) – Key where single-scene/timepoint/channel is saved in output dictionary

  • spatial_dims (int=3) – Number of spatial dimensions of the output image. Must be 2 (YX) or 3 (ZYX).

  • scene_column (str="scene") – Column in csv_path that contains the scenes to extract from a multi-scene file. If not specified, all scenes will be extracted. If multiple scenes are specified, they should be separated by a comma (e.g. scene1,scene2).

  • time_start_column (str="start") – Column in csv_path specifying the timepoint in the timelapse image at which to start extracting. If any of time_start_column, time_stop_column, or time_step_column are not specified, all timepoints are extracted.

  • time_stop_column (str="stop") – Column in csv_path specifying the timepoint in the timelapse image at which to stop extracting. If any of time_start_column, time_stop_column, or time_step_column are not specified, all timepoints are extracted.

  • time_step_column (str="step") – Column in csv_path specifying the step between timepoints. For example, values in this column should be 2 if every other timepoint should be run. If any of time_start_column, time_stop_column, or time_step_column are not specified, all timepoints are extracted.

  • dict_meta (Optional[Dict]) – Dictionary version of CSV file. If not provided, CSV file is read from csv_path.

  • transform (Optional[Callable] = None) – Callable that accepts a numpy array. For example, image normalization functions could be passed here.

  • dask_load (bool = True) – Whether to use dask to load images. If False, full images are loaded into memory before extracting specified scenes/timepoints.
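
The CSV layout described by the parameters above can be illustrated with a small standard-library sketch. It writes a one-row manifest using the default column names (path, channel, scene, start, stop, step) and then enumerates the single-scene, single-timepoint, single-channel items that such a row describes. The file name, the scene names, and the helper functions write_manifest and expand_row are hypothetical and are not part of cyto_dl. Whether the stop timepoint is inclusive is not stated above, so the sketch exposes it as an explicit assumption rather than mirroring the library's behavior.

```python
import csv
import tempfile
from pathlib import Path

# Column names chosen to match the documented defaults
# ('path', 'channel', 'scene', 'start', 'stop', 'step').
FIELDS = ["path", "channel", "scene", "start", "stop", "step"]


def write_manifest(csv_path, rows):
    """Write one CSV row per multi-dimensional image file."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def expand_row(row, inclusive_stop=True):
    """Enumerate (path, scene, timepoint, channel) items for one CSV row.

    ASSUMPTION: the documentation does not say whether `stop` is inclusive;
    this sketch treats it as inclusive unless inclusive_stop=False.
    """
    scenes = [s.strip() for s in row["scene"].split(",")]
    start, stop, step = int(row["start"]), int(row["stop"]), int(row["step"])
    end = stop + 1 if inclusive_stop else stop
    return [
        (row["path"], scene, t, int(row["channel"]))
        for scene in scenes
        for t in range(start, end, step)
    ]


# Hypothetical file and scene names, for illustration only.
rows = [
    {"path": "movie.tiff", "channel": 0, "scene": "scene1,scene2",
     "start": 0, "stop": 4, "step": 2},
]

with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "manifest.csv"
    write_manifest(manifest, rows)
    # Read the manifest back and expand every row into individual items.
    with open(manifest, newline="") as f:
        items = [item for row in csv.DictReader(f) for item in expand_row(row)]

print(len(items))  # 2 scenes x timepoints 0, 2, 4 = 6 items
```

The comma-separated scene value is quoted automatically by csv.DictWriter, so it survives the round trip as a single field.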

create_metatensor(img, meta)

get_per_file_args(df)

is_batch(x)

cyto_dl.datamodules.multidim_image.make_multidim_image_dataloader(csv_path: Path | str | None = None, img_path_column: str = 'path', channel_column: str = 'channel', out_key: str = 'image', spatial_dims: int = 3, scene_column: str = 'scene', time_start_column: str = 'start', time_stop_column: str = 'stop', time_step_column: str = 'step', dict_meta: Dict | None = None, transforms: List[Callable] | Tuple[Callable] | ListConfig | None = None, **dataloader_kwargs) → DataLoader

Function to create a MultiDimImage DataLoader. Currently, this dataset is only useful during prediction and cannot be used for training or testing.

Parameters:
  • csv_path (Optional[Union[Path, str]]) – Path to the CSV file.

  • img_path_column (str) – Column in csv_path that contains the path to the multi-dimensional (timelapse or multi-scene) file.

  • channel_column (str) – Column in csv_path that specifies which channel to extract from the multi-dimensional image file. Values should be integers.

  • out_key (str) – Key where single-scene/timepoint/channel is saved in output dictionary

  • spatial_dims (int) – Number of spatial dimensions of the output image. Must be 2 (YX) or 3 (ZYX).

  • scene_column (str) – Column in csv_path that contains the scenes to extract from a multi-scene file. If not specified, all scenes will be extracted. If multiple scenes are specified, they should be separated by a comma (e.g. scene1,scene2).

  • time_start_column (str) – Column in csv_path specifying the timepoint in the timelapse image at which to start extracting. If any of time_start_column, time_stop_column, or time_step_column are not specified, all timepoints are extracted.

  • time_stop_column (str) – Column in csv_path specifying the timepoint in the timelapse image at which to stop extracting. If any of time_start_column, time_stop_column, or time_step_column are not specified, all timepoints are extracted.

  • time_step_column (str) – Column in csv_path specifying the step between timepoints. For example, values in this column should be 2 if every other timepoint should be run. If any of time_start_column, time_stop_column, or time_step_column are not specified, all timepoints are extracted.

  • dict_meta (Optional[Dict]) – Dictionary version of CSV file. If not provided, CSV file is read from csv_path.

  • transforms (Optional[Union[List[Callable], Tuple[Callable], ListConfig]]) – Callable or list of callables that accept a numpy array. For example, image normalization functions could be passed here. Data loading is already handled by the dataset.

Returns:

The DataLoader object for the MultiDimImage dataset.

Return type:

DataLoader
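
A call to make_multidim_image_dataloader might look like the following sketch, which spells out the documented keyword arguments with their default values. The helper names loader_arguments and make_prediction_loader are hypothetical. The sketch assumes that cyto_dl is installed, that the CSV file and the image files it lists exist, and that any extra keyword arguments are forwarded to the underlying DataLoader, as the **dataloader_kwargs parameter suggests.

```python
def loader_arguments(csv_path):
    """Collect the documented keyword arguments, using their default values."""
    return {
        "csv_path": csv_path,
        "img_path_column": "path",
        "channel_column": "channel",
        "out_key": "image",
        "spatial_dims": 3,  # 3 for ZYX output, 2 for YX
        "scene_column": "scene",
        "time_start_column": "start",
        "time_stop_column": "stop",
        "time_step_column": "step",
        "transforms": None,  # optionally a list of callables taking a numpy array
    }


def make_prediction_loader(csv_path, **dataloader_kwargs):
    """Build a prediction DataLoader (hypothetical convenience wrapper)."""
    # Imported lazily so that loader_arguments works without cyto_dl installed.
    from cyto_dl.datamodules.multidim_image import make_multidim_image_dataloader

    # ASSUMPTION: dataloader_kwargs are passed through to the DataLoader.
    return make_multidim_image_dataloader(
        **loader_arguments(csv_path), **dataloader_kwargs
    )
```

Because the dataset is documented as useful only during prediction, a loader built this way would be passed to a prediction run rather than to training or testing.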