The environment your code runs in is easy to configure on the MosaicML platform.
MosaicML Platform Environment Variables
We automatically set the following environment variables in your run container.
RUN_NAME: The name of your run as seen in the output of mcli get runs.
WORLD_SIZE: The total number of GPUs being used for the training run.
NODE_RANK: The rank of the node the container is running on. In a multi-node training job involving n nodes, nodes will have ranks 0 through n - 1.
MASTER_ADDR: The network address of the node with rank 0 in the training job.
MASTER_PORT: The network port of the node with rank 0 in the training job.
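As a minimal sketch of how training code might consume these variables, the snippet below reads them into a small dataclass. The names RunEnv and read_run_env are hypothetical helpers, and the fallback defaults are assumptions chosen only so the code also runs outside the platform. The tcp:// URL follows the form accepted by torch.distributed as an init_method.

```python
import os
from dataclasses import dataclass


@dataclass
class RunEnv:
    """Values of the platform-provided variables (hypothetical helper type)."""
    run_name: str
    world_size: int
    node_rank: int
    master_addr: str
    master_port: int

    @property
    def init_method(self) -> str:
        # TCP rendezvous URL pointing at the rank-0 node.
        return f"tcp://{self.master_addr}:{self.master_port}"


def read_run_env(environ=os.environ) -> RunEnv:
    """Read the variables; the defaults are assumptions for local runs."""
    return RunEnv(
        run_name=environ.get("RUN_NAME", "local-run"),
        world_size=int(environ.get("WORLD_SIZE", "1")),
        node_rank=int(environ.get("NODE_RANK", "0")),
        master_addr=environ.get("MASTER_ADDR", "127.0.0.1"),
        master_port=int(environ.get("MASTER_PORT", "29500")),
    )


if __name__ == "__main__":
    env = read_run_env()
    print(f"{env.run_name}: node {env.node_rank}, "
          f"{env.world_size} GPU(s), rendezvous at {env.init_method}")
```

Passing a plain dict instead of os.environ makes the helper easy to exercise in unit tests.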
Build a Docker image with all the system packages your code requires. Including dependencies in the image, especially large ones, speeds up run start time. For more information, see the Docker documentation.
We maintain a set of public Docker images for PyTorch, PyTorch Vision, and Composer on DockerHub.
To run with an existing Docker image, use the image field of the run configuration:

    from mcli.sdk import RunConfig

    config = RunConfig(
        name='example',
        image='mosaicml/composer:latest',
        command='echo "Hello World!" && sleep 60',
        gpu_type='none',
        cluster='my-cluster',
    )
Private images require setting up Docker Secrets with:
mcli create secrets docker
To add environment variables, use the env_variables field:

    env_variables:
      - name: <unique_name>
        key: KEY
        value: VALUE
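Inside the container, a variable configured this way can be read like any other environment variable. The sketch below assumes that the key field becomes the variable name and the value field its value; require_env is a hypothetical helper, not part of MCLI.

```python
import os


def require_env(key: str, environ=os.environ) -> str:
    """Return the value of an environment variable or raise a descriptive error."""
    value = environ.get(key)
    if value is None:
        # Fail early with a hint instead of a bare KeyError deep in the run.
        raise RuntimeError(
            f"Environment variable {key!r} is not set; "
            "check the env_variables section of the run configuration."
        )
    return value
```

For the configuration above, require_env("KEY") would be expected to return the configured VALUE under that assumption.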
Secrets are credentials or other sensitive information that only you can access. MCLI supports adding different secret types to your run environment as environment variables or mounted files. To see the available options, run:
mcli create secrets -h
All secrets are stored securely in a vault, maintained across your clusters, and added to every run. Your secrets are never shared with other users.
For more information, see the Secrets Page.
Integrations set up execution environments quickly by combining mounted files, environment variables, commands, secrets, and clusters in a single configuration entry.
For example, the Weights & Biases Integration sets up all the necessary environment variables for the W&B client:

    integrations:
      - integration_type: wandb
        project: my_project
        entity: my_entity

For all the supported integrations, see the Integrations Page.
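To check from inside a run that the W&B client will find its settings, a small sketch like the following can report which variables are present. WANDB_PROJECT, WANDB_ENTITY, and WANDB_API_KEY are standard variables read by the W&B client; that the integration sets exactly these names is an assumption, and both function names are hypothetical.

```python
import os

# Standard variables read by the W&B client. Whether the integration sets
# exactly these names is an assumption; adjust the tuple if it differs.
WANDB_VARS = ("WANDB_PROJECT", "WANDB_ENTITY", "WANDB_API_KEY")


def wandb_env_status(environ=os.environ) -> dict:
    """Map each expected W&B variable to whether it is set and non-empty."""
    return {name: bool(environ.get(name)) for name in WANDB_VARS}


def missing_wandb_vars(environ=os.environ) -> list:
    """List the expected W&B variables that are absent or empty."""
    return [name for name, present in wandb_env_status(environ).items()
            if not present]
```

The helpers report only whether each variable is present and never return its value, so the API key does not end up in logs.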
Integrations for Live Updates
Integrations are resolved at runtime, so they are ideal for adding environment configuration that changes often.
For example, a git repo can be added as an integration so that the code base is set up from its current state at runtime.