polaris.Step

class polaris.Step(component, name, subdir=None, indir=None, cpus_per_task=1, min_cpus_per_task=1, ntasks=1, min_tasks=1, openmp_threads=1, max_memory=None, cached=False, run_as_subprocess=False)

The base class for a step of a task, such as setting up a mesh, creating an initial condition, or running the component forward in time. The step is the smallest unit of work in polaris that can be run on its own by a user, though users will typically run full tasks or suites.

Below, the terms “input” and “output” refer to inputs and outputs to the step itself, not necessarily the MPAS model. In fact, the MPAS model itself is often an input to the step.

Variables:
  • name (str) – the name of the step

  • component (polaris.Component) – The component the step belongs to

  • subdir (str) – the subdirectory for the step

  • path (str) – the path within the base work directory of the step, made up of component, the task’s subdir and the step’s subdir

  • cpus_per_task (int, optional) – the number of cores per task the step would ideally use. If fewer cores per node are available on the system, the step will run on all available cores as long as this is not below min_cpus_per_task

  • min_cpus_per_task (int, optional) – the number of cores per task the step requires. If the system has fewer than this number of cores per node, the step will fail

  • ntasks (int, optional) – the number of tasks the step would ideally use. If too few cores are available on the system to accommodate the number of tasks and the number of cores per task, the step will run on fewer tasks as long as this is not below min_tasks

  • min_tasks (int, optional) – the number of tasks the step requires. If the system has too few cores to accommodate the number of tasks and cores per task, the step will fail

  • openmp_threads (int) – the number of OpenMP threads to use

  • max_memory (int) – the amount of memory that the step is allowed to use in MB. This is currently just a placeholder for later use with task parallelism

  • input_data (list of dict) – a list of dictionaries that define the input files, which are typically downloaded to a database and/or symlinked in the work directory

  • inputs (list of str) – a list of absolute paths of input files produced from input_data as part of setting up the step. These input files must all exist at run time or the step will raise an exception

  • outputs (list of str) – a list of absolute paths of output files produced by this step (or cached) and available as inputs to other tasks and steps. These files must exist after the step has run or an exception will be raised

  • dependencies (dict of polaris.Step) – A dictionary of steps that this step depends on (i.e. it can’t run until they have finished). Dependencies are used when the names of the files produced by the dependency aren’t known at setup (e.g. because they depend on config options or data read in from files). Under other circumstances, it is sufficient to indicate that an output file from another step is an input of this step to establish a dependency.

  • has_shared_config (bool) – Whether this step uses a shared config file.

  • is_dependency (bool) – Whether this step is the dependency of one or more other steps.

  • tasks (dict) – The tasks this step is used in

  • config (polaris.config.PolarisConfigParser) – Configuration options for this step, possibly shared with other tasks and steps

  • machine_info (mache.MachineInfo) – Information about E3SM supported machines

  • config_filename (str) – The filename or symlink within the step where config is written to during setup and read from during run

  • work_dir (str) – The step’s work directory, defined during setup as the combination of base_work_dir and path

  • base_work_dir (str) – The base work directory

  • baseline_dir (str) – Location of the same task within the baseline work directory, for use in comparing variables and timers

  • validate_vars (dict of list) – A list of variables for each output file for which a baseline comparison should be performed if a baseline run has been provided. The baseline validation is performed after the step has run.

  • logger (logging.Logger) – A logger for output from the step

  • log_filename (str) – At run time, the name of a log file where output/errors from the step are being logged, or None if output is to stdout/stderr

  • cached (bool) – Whether to get all of the outputs for the step from the database of cached outputs for this component

  • run_as_subprocess (bool) – Whether to run this step as a subprocess, rather than just running it directly from the task. It is useful to run a step as a subprocess if there is not a good way to redirect output to a log file (e.g. if the step calls external code that, in turn, calls additional subprocesses).

  • args ({list of str, None}) – A list of command-line arguments to call in parallel
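The fallback rules described for cpus_per_task, min_cpus_per_task, ntasks and min_tasks can be illustrated with a minimal sketch. This is not the polaris implementation: the function name, the available_cores and cores_per_node arguments, and the choice of ValueError are assumptions made for illustration only.

```python
def constrain_step_resources(cpus_per_task, min_cpus_per_task, ntasks,
                             min_tasks, available_cores, cores_per_node):
    """Hypothetical stand-in for the resource rules described above."""
    # Use the ideal number of cores per task, or all cores on a node if the
    # node has fewer than that.
    cpus = min(cpus_per_task, cores_per_node)
    if cpus < min_cpus_per_task:
        raise ValueError(
            f'{cores_per_node} cores per node is fewer than the '
            f'{min_cpus_per_task} cores per task the step requires')

    # Use the ideal number of tasks, or as many as fit in the available cores.
    tasks = min(ntasks, available_cores // cpus)
    if tasks < min_tasks:
        raise ValueError(
            f'{available_cores} cores can hold only {tasks} tasks, fewer '
            f'than the {min_tasks} tasks the step requires')

    return cpus, tasks


# A step that would like 128 tasks but can run on as few as 32,
# on a system with 64 cores available.
cpus, tasks = constrain_step_resources(
    cpus_per_task=1, min_cpus_per_task=1, ntasks=128, min_tasks=32,
    available_cores=64, cores_per_node=64)
print(cpus, tasks)
```

In this sketch the step is scaled down to 64 tasks rather than failing, because 64 is still at or above min_tasks.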

__init__(component, name, subdir=None, indir=None, cpus_per_task=1, min_cpus_per_task=1, ntasks=1, min_tasks=1, openmp_threads=1, max_memory=None, cached=False, run_as_subprocess=False)

Create a new step

Parameters:
  • component (polaris.Component) – The component the step belongs to

  • name (str) – the name of the step

  • subdir (str, optional) – the subdirectory for the step. If neither this nor indir is provided, the directory is the name

  • indir (str, optional) – the directory the step is in, to which name will be appended

  • cpus_per_task (int, optional) – the number of cores per task the step would ideally use. If fewer cores per node are available on the system, the step will run on all available cores as long as this is not below min_cpus_per_task

  • min_cpus_per_task (int, optional) – the number of cores per task the step requires. If the system has fewer than this number of cores per node, the step will fail

  • ntasks (int, optional) – the number of tasks the step would ideally use. If too few cores are available on the system to accommodate the number of tasks and the number of cores per task, the step will run on fewer tasks as long as this is not below min_tasks

  • min_tasks (int, optional) – the number of tasks the step requires. If the system has too few cores to accommodate the number of tasks and cores per task, the step will fail

  • openmp_threads (int) – the number of OpenMP threads to use

  • max_memory (int, optional) – the amount of memory that the step is allowed to use in MB. This is currently just a placeholder for later use with task parallelism

  • cached (bool, optional) – Whether to get all of the outputs for the step from the database of cached outputs for this component

  • run_as_subprocess (bool) – Whether to run this step as a subprocess, rather than just running it directly from the task. It is useful to run a step as a subprocess if there is not a good way to redirect output to a log file (e.g. if the step calls external code that, in turn, calls additional subprocesses).
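How name, subdir and indir combine into the step's directory can be sketched as follows. The helper names are hypothetical and this is not the polaris source. The documentation does not say what happens when both subdir and indir are given, so the sketch assumes subdir takes precedence. The last function mirrors the description of work_dir as the combination of base_work_dir and path.

```python
import os


def resolve_step_subdir(name, subdir=None, indir=None):
    """Hypothetical helper: pick the step's subdirectory."""
    if subdir is not None:
        # Assumption: an explicit subdir wins over indir.
        return subdir
    if indir is not None:
        # The step name is appended to the directory the step is in.
        return os.path.join(indir, name)
    # With neither subdir nor indir, the directory is just the name.
    return name


def resolve_work_dir(base_work_dir, path):
    """Hypothetical helper: combine the base work directory and the path."""
    return os.path.join(base_work_dir, path)


print(resolve_step_subdir('forward'))
print(resolve_step_subdir('forward', indir='my_task'))
print(resolve_step_subdir('forward', subdir='my_task/fwd'))
```

All directory and step names here are made up for the example.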

Methods

__init__(component, name[, subdir, indir, ...])

Create a new step

add_dependency(step[, name])

Add step as a dependency of this step (i.e. this step can't run until the dependency has finished).

add_input_file([filename, target, database, ...])

Add an input file to the step (but not necessarily to the MPAS model).

add_output_file(filename[, validate_vars])

Add an output file that must be produced by this step and may be made available as an input to other steps, perhaps in other tasks.

constrain_resources(available_resources)

Constrain cpus_per_task and ntasks based on the number of cores available to this step

process_inputs_and_outputs()

Process the inputs to and outputs from a step added with polaris.Step.add_input_file() and polaris.Step.add_output_file().

run()

Run the step.

runtime_setup()

Update attributes of the step at runtime before calling the run() method.

set_resources([cpus_per_task, ...])

Update the resources for the step.

set_shared_config(config[, link])

Replace the step's config parser with the shared config parser

setup()

Set up the step in the work directory, including downloading any dependencies.

validate_baselines()

Compare variables between output files in this step and in the same step from a baseline run if one was provided.
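The description of the dependencies attribute notes that listing another step's output file as an input is normally enough to establish a dependency. The sketch below illustrates that idea with plain dictionaries. It does not use or reproduce the polaris classes: the function name, the data layout, and all step names and paths are hypothetical.

```python
def infer_dependencies(steps):
    """Hypothetical sketch: map each step to the steps whose outputs it uses.

    ``steps`` maps a step name to a dict with 'inputs' and 'outputs',
    each a list of absolute file paths.
    """
    # Record which step produces each output file.
    producers = {}
    for step_name, files in steps.items():
        for path in files['outputs']:
            producers[path] = step_name

    # A step depends on every other step that produces one of its inputs.
    dependencies = {}
    for step_name, files in steps.items():
        needed = set()
        for path in files['inputs']:
            producer = producers.get(path)
            if producer is not None and producer != step_name:
                needed.add(producer)
        dependencies[step_name] = sorted(needed)
    return dependencies


steps = {
    'base_mesh': {'inputs': [],
                  'outputs': ['/work/base_mesh/mesh.nc']},
    'init': {'inputs': ['/work/base_mesh/mesh.nc'],
             'outputs': ['/work/init/init.nc']},
    'forward': {'inputs': ['/work/base_mesh/mesh.nc', '/work/init/init.nc'],
                'outputs': ['/work/forward/output.nc']},
}
print(infer_dependencies(steps))
```

Matching on file paths only works when the file names are known at setup. When they are not, the text above indicates that add_dependency() is the way to declare the relationship explicitly.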