The Python API lets you submit, monitor, and delete jobs from Python instead of the MosaicML CLI. You can design complex sweeps and workflows programmatically, without resorting to shell scripting.

To get started, either follow the Quick Start or set the environment variable MOSAICML_API_KEY to automatically configure access to the MosaicML platform.
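When a script depends on the environment variable, it can help to fail early if the key is missing. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper and not part of `mcli`. It only checks that the variable is set, not that the key is valid.

```python
import os


def require_api_key(env=None):
    """Return MOSAICML_API_KEY from the environment or raise a clear error.

    Hypothetical helper for illustration; it is not part of the mcli package.
    """
    # Allow a dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = env.get("MOSAICML_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MOSAICML_API_KEY is not set; export it before using the Python API"
        )
    return key
```

Calling `require_api_key()` at the top of a script surfaces a missing key before any run is submitted.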

The Python API currently supports managing your runs, from creation to monitoring logs to deleting runs. As a quick reference, the following run-related methods are supported:


  • create_run(): Launch a run

  • get_runs(): Get a filtered list of runs

  • stop_runs(): Stop a list of runs

  • delete_runs(): Delete a list of runs

  • get_run_logs(): Get the current logs for an active or completed run

  • follow_run_logs(): Follow the logs for an active or completed run in the MosaicML platform

  • wait_for_run_status(): Wait for a launched run to reach a specific status

For more details on these, please see Working with runs.

Configuring more advanced runs

If you are submitting runs inside of other runs, we recommend using an environment secret to set MOSAICML_API_KEY.

“Hello World”

To submit a simple run and print its logs while it runs, first create a RunConfig object. It follows the same schema as the YAML files; see Run schema.

from mcli.sdk import RunConfig

cluster = "<your-cluster>"

config = RunConfig(name='hello-world',
                   image='bash',  # assumption: any image that provides a shell
                   command='echo "Hello World!" && sleep 60',
                   cluster=cluster)


If your config is already in a YAML file, you can create the RunConfig with config = RunConfig.from_file('your_yaml.yaml') instead.
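Because a config is built in ordinary Python, a sweep can be generated in a loop rather than by hand. The sketch below builds one set of config fields per hyperparameter combination. `sweep_configs`, `train.py`, and its flags are hypothetical placeholders; only the `name` and `command` fields come from the example above.

```python
from itertools import product


def sweep_configs(base_name, learning_rates, batch_sizes):
    """Build one dict of config fields per hyperparameter combination.

    Hypothetical helper: `train.py` and its flags are placeholders.
    """
    configs = []
    combos = product(learning_rates, batch_sizes)
    for index, (lr, batch_size) in enumerate(combos):
        configs.append({
            # Index-based names avoid characters such as "." in run names.
            "name": f"{base_name}-{index}",
            "command": f"python train.py --lr {lr} --batch-size {batch_size}",
        })
    return configs


for fields in sweep_configs("sweep", [0.1, 0.01], [32, 64]):
    print(fields["name"], "->", fields["command"])
```

Each dict could then be expanded into a config, for example with RunConfig(**fields) plus whatever other fields your cluster requires, and submitted with create_run(). That last step is an assumption about how you would wire it up, not something the sketch does.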

Now, let’s create a simple script that submits the run and, once the run starts, prints the first line of its logs. To clean up, we will stop the run.

from mcli.sdk import create_run, wait_for_run_status, follow_run_logs, stop_run

# Create the run from a config
run = create_run(config)
print(f'Launching run {run.name}')

# Wait for the run to start "running"
run = wait_for_run_status(run, status='running')
print(f'Run named {run.name} has status {run.status}')

# Print the first line of logs, then stop following
for line in follow_run_logs(run):
    print(f'First log line was: {line}')
    break

# Stop the run
run = stop_run(run)
print(f'Run named {run.name} has status {run.status.value}')

A few additional details about the above script:

  • wait_for_run_status() waits for the run to reach the given status (or a later one). status can be either a str or a RunStatus enum.

  • We use follow_run_logs() instead of get_run_logs() because it’s possible that the “Hello World!” line has not yet been printed by the time the call is made, so we want to wait to ensure it’s printed.

  • stop_run() stops a run but leaves its logs intact (in contrast to delete_run(), which also deletes the logs).
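The “(or later)” behavior of waiting on a status can be pictured as a comparison on an ordered lifecycle. The sketch below is an illustration only: `SketchStatus` is a hypothetical, simplified enum, and the real RunStatus has its own members and ordering. `has_reached` is likewise a made-up helper, not an mcli function.

```python
from enum import IntEnum


class SketchStatus(IntEnum):
    """Hypothetical, simplified lifecycle; not the real RunStatus."""
    PENDING = 1
    RUNNING = 2
    COMPLETED = 3


def has_reached(current, target):
    """Return True when `current` is at `target` or any later stage."""
    # Accept either a string or an enum member, mirroring the str-or-enum
    # convenience described in the first bullet above.
    if isinstance(target, str):
        target = SketchStatus[target.upper()]
    return current >= target
```

Under this model, a run that has already completed by the time it is checked still satisfies a wait for 'running', which is why a short-lived run does not leave the script waiting forever.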

To clean up, let’s delete the run:

from mcli.sdk import delete_run

delete_run(run)

Next steps