Distributed Cluster Deployments
These are all tested and working examples.
LocalCluster
To turn your local machine into a distributed.Cluster, simply run:
from distributed import Client, LocalCluster
cluster = LocalCluster()
client = Client(cluster)
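Once a client is connected, work can be submitted to the local workers. The following is a minimal sketch rather than one of the tested examples on this page; `square` and `run_squares` are hypothetical names introduced for illustration, and `distributed` is imported inside the function so that defining the helpers does not itself start a cluster.

```python
def square(x):
    """Toy task to run on a worker."""
    return x * x


def run_squares(values):
    """Start a LocalCluster, map square over values, and return the results."""
    # Imported here so that defining these functions has no side effects
    from distributed import Client, LocalCluster

    # The context managers close the client and shut the workers down on exit
    with LocalCluster() as cluster, Client(cluster) as client:
        futures = client.map(square, values)
        # gather blocks until all futures finish and returns results in input order
        return client.gather(futures)
```

Calling `run_squares(range(5))` returns the squares in input order. When running this from a script, it is safest to make the call under an `if __name__ == "__main__":` guard, since the workers may be started as separate processes.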
AICS SLURM Cluster
from datetime import datetime
from pathlib import Path
import dask.config
from dask_jobqueue import SLURMCluster
from distributed import Client
# Name the log directory after the current timestamp, dropping microseconds
log_dir_name = datetime.now().isoformat().split(".")[0]
log_dir = Path(f".dask_logs/{log_dir_name}").expanduser()
# Create the log directory if it does not already exist
log_dir.mkdir(parents=True, exist_ok=True)
# Disable work stealing in the scheduler
dask.config.set(
    {
        # distributed reads this setting under the "distributed" namespace
        "distributed.scheduler.work-stealing": False,
    }
)
# Create cluster
cluster = SLURMCluster(
    cores=12,
    memory="160GB",
    queue="aics_cpu_general",
    walltime="10:00:00",
    local_directory=str(log_dir),
    log_directory=str(log_dir),
)
# Scale cluster
cluster.scale(12)
# Create client connection
client = Client(cluster)
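The log-directory step above can be wrapped in a small helper. This is a sketch under stated assumptions, not necessarily how the AICS setup does it: `make_log_dir` is a hypothetical name, and it formats the timestamp with `strftime` instead of `isoformat()` so that the directory name contains no colons, which some filesystems reject.

```python
from datetime import datetime
from pathlib import Path


def make_log_dir(base=".dask_logs", now=None):
    """Create and return a timestamped log directory under base."""
    # Allow the timestamp to be injected, which makes the helper easy to test
    now = now or datetime.now()
    # Second resolution and no colons in the directory name
    log_dir = Path(base) / now.strftime("%Y-%m-%dT%H-%M-%S")
    # parents=True creates base if needed; exist_ok=True tolerates reruns
    log_dir.mkdir(parents=True, exist_ok=True)
    return log_dir
```

The returned path can then be passed as `str(log_dir)` to both `local_directory` and `log_directory` in the `SLURMCluster` call.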
AWS Fargate Cluster
All data that the computation needs must be accessible to the workers. In this case, you should probably put the data on S3 and access it with s3fs. You must also upload a Docker image for the workers to Docker Hub.
# Newer dask-cloudprovider releases expose this as dask_cloudprovider.aws.FargateCluster
from dask_cloudprovider import FargateCluster
from distributed import Client
# Create cluster
cluster = FargateCluster(image="username/dockerimage")
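# Adapt the number of workers to the workload
# (minimum_jobs/maximum_jobs are dask_jobqueue options; here the bounds are minimum/maximum)
cluster.adapt(minimum=1, maximum=100)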
# Create client connection
client = Client(cluster)
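Because Fargate workers do not share a filesystem with the machine running the client, a task usually receives an S3 path and opens the object itself. A minimal sketch, assuming s3fs is installed in the Docker image and the workers have AWS credentials available; `s3_path` and `count_bytes` are hypothetical helpers, and any bucket or key names passed to them are placeholders.

```python
def s3_path(bucket, key):
    """Build the 'bucket/key' path form that s3fs accepts."""
    return f"{bucket}/{key.lstrip('/')}"


def count_bytes(path):
    """Runs on a worker: open an S3 object with s3fs and return its size in bytes."""
    # Imported inside the task so it is resolved in the worker's environment
    import s3fs

    # With no arguments, credentials are looked up from the worker's environment
    fs = s3fs.S3FileSystem()
    with fs.open(path, "rb") as f:
        return len(f.read())
```

With the `client` from the example above, `client.submit(count_bytes, s3_path("my-bucket", "data/image.tiff"))` would perform the read on a worker instead of on the local machine.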
Example Dockerfile
FROM ubuntu:18.04
# Copy project
COPY . /project
# General upgrades and requirements
RUN apt-get update && apt-get upgrade -y
# Get software props
RUN apt-get install -y \
    software-properties-common
# Add additional apt repository
RUN add-apt-repository universe
# Get python3.7 and pip
RUN apt-get update && apt-get install -y \
    python3.7 \
    python3.7-dev \
    python3-pip \
    git
# Upgrade pip and force it to use python3.7
RUN python3.7 -m pip install --upgrade pip
# Set python3.7 to default python
RUN ln -sf /usr/bin/python3.7 /usr/bin/python
RUN ln -sf /usr/bin/python3.7 /usr/bin/python3
# Set workdir
WORKDIR /project
# Install package
RUN pip install .