Communicating between components
CanESM runs in a multiple-program, multiple-data (MPMD) paradigm in which each component (atmosphere, ocean, and
coupler) runs its own executable and communicates using MPI. All MPI tasks exist in the same MPI_COMM_WORLD. CanCPL
itself runs on a single MPI task (and has no further parallelization within it). This page details how fields are
sent between components and the transformations that occur. The overall MPI network topology is shown in the
following figure.
Demonstration of the MPI network topology used in CanESM. Each communicator contains the PEs for each task and the
MPI_COMM_WORLD encompasses all PEs within the simulation. Communication between communicators only occurs between
the lead task of the ocean/atmosphere and the coupler.
Coupler API
All three components compile com_cpl.F90, which contains routines that initialize MPI on every task, create the MPI
communicators, define the interfaces to send and receive data via MPI (see the API reference for a full
description), and set up the list of variables that will be passed through the coupler. The coupler API is accessed
in only a handful of routines in each component:
AGCM
gcm18.F
mpi_getcpl2.F
mpi_putggb2.F
Ocean
cpl_cancpl.F90
sbccpl.F90
Organization of MPI communications
CanESM is run in a multiple-program, multiple-data (MPMD) paradigm where each component of the Earth system (CanAM,
CanCPL, and NEMO) has its own executable. Every MPI task exists within the default MPI_COMM_WORLD communicator. The
parallelization strategies for CanAM and NEMO use MPI internally to exchange data between tasks. These calls are
necessarily blocking, so MPI_COMM_WORLD must be split properly to allow each component to run in parallel.
The communicators are defined in the subroutine define_group in com_cpl.F90. During initialization each component
calls this subroutine with a three-character identifier (cpl, atm, or ocn), depending on which component it is.
These character strings are then mapped onto predefined integer parameters in cpl_types.F90 to avoid string
comparisons. The integer parameters are then used to create the MPI communicator for each group. Additional
information about each group is stored in the group_info array, which contains the ‘leader’ of each group (the task
with the lowest MPI rank in the group) and the ranks associated with each communicator. Some of this information is
then broadcast from the leader of each group to ensure that every MPI task has the same information.
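The grouping logic described above can be illustrated with a minimal, single-process Python sketch. It does not call MPI and is not the code in com_cpl.F90: the integer values in GROUP_IDS, the dictionary layout of group_info, and the helper names are assumptions made for illustration only. In an MPI program this kind of split is commonly done by passing the integer as the color to MPI_Comm_split; whether define_group does so is not stated here.

```python
# Hypothetical stand-ins for the integer parameters defined in cpl_types.F90.
GROUP_IDS = {"cpl": 0, "atm": 1, "ocn": 2}


def define_groups(identifiers):
    """Emulate the grouping of world ranks by component.

    identifiers[world_rank] is the three-character name that task would pass.
    Returns {group_id: {"leader": world_rank, "ranks": [world_rank, ...]}}.
    """
    group_info = {}
    for world_rank, name in enumerate(identifiers):
        gid = GROUP_IDS[name]  # string -> integer, so later code compares integers
        info = group_info.setdefault(gid, {"leader": world_rank, "ranks": []})
        info["ranks"].append(world_rank)
        # The leader is the task with the lowest world rank in the group.
        info["leader"] = min(info["leader"], world_rank)
    return group_info


def local_rank(group_info, gid, world_rank):
    """Rank of a task inside its own group (position in the sorted rank list)."""
    return group_info[gid]["ranks"].index(world_rank)


# One coupler task, four atmosphere tasks, three ocean tasks (an arbitrary layout).
world = ["cpl"] + ["atm"] * 4 + ["ocn"] * 3
info = define_groups(world)
```

With this layout the atmosphere group holds world ranks 1 to 4 with leader 1, and the ocean group holds world ranks 5 to 7 with leader 5.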
Exchanging coupled fields
The transfer of fields between communicators is handled only by the lead tasks. Each transferred field is the entire global array, not just the subdomain (in the ocean) or latitude bands (in the atmosphere) associated with an individual task. After receiving a field, the lead task scatters the global array to all other tasks within its communicator.
The following describes the communication pathway using sea surface temperature as an example:
1. The lead ocn task constructs the global SST array from the subdomain of every other ocn task.
2. The lead ocn task sends SST to the main (and only) cpl task.
3. The cpl task remaps SST from the ocean grid to the atmospheric grid.
4. The cpl task sends the global SST array to the lead atm task.
5. The lead atm task scatters the global SST array to every other atm task.
6. Each atmospheric task copies only the part of the global array that it needs.
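The SST pathway above can be traced with a small single-process Python sketch. It uses plain lists in place of MPI messages and 1-D arrays in place of the real grids. All function names are hypothetical, and the block-average remap is only a placeholder for the coupler's actual remapping, which is not reproduced here.

```python
def gather_subdomains(subdomains):
    """Step 1: the lead ocn task assembles the global field from every subdomain."""
    global_field = []
    for part in subdomains:
        global_field.extend(part)
    return global_field


def remap(field, n_target):
    """Step 3 (placeholder): average equal-sized blocks of source cells."""
    block, remainder = divmod(len(field), n_target)
    if remainder != 0:
        raise ValueError("source size must be a multiple of target size")
    return [sum(field[i * block:(i + 1) * block]) / block for i in range(n_target)]


def copy_band(global_field, rank, n_tasks):
    """Step 6: an atm task copies only its contiguous band of the global field."""
    width = len(global_field) // n_tasks
    return global_field[rank * width:(rank + 1) * width]


# Four ocn tasks, each holding two cells of SST (arbitrary illustrative values).
ocn_subdomains = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0], [22.0, 24.0]]

sst_ocn = gather_subdomains(ocn_subdomains)  # step 1: gather on the lead ocn task
inbox_cpl = sst_ocn                          # step 2: lead ocn task -> cpl task
sst_atm = remap(inbox_cpl, 4)                # step 3: ocean grid -> atmospheric grid
inbox_atm_lead = sst_atm                     # step 4: cpl task -> lead atm task
# Steps 5 and 6: every atm task receives the global array and keeps its own band.
atm_bands = [copy_band(inbox_atm_lead, rank, 2) for rank in range(2)]
```

Note that the full global array crosses every hop; only the last step reduces it to the part each task needs.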
Example: Including a new component
The following example provides a qualitative description of how to connect a new component to the coupler. As a prerequisite, the following are assumed:
The component can be run in an MPMD-like mode
The following locations to inject code have been identified:
Initialize the MPI communications
Send fields to the coupler
Receive fields from the coupler
Rough procedure
1. In the portion of the code responsible for initializing communications, add the call to define_group found in com_cpl.F90. Note that this routine can be used to initialize MPI, but MPI can also have been initialized prior to that call.
2. Define new variables in subroutine define_cpl_var_list in com_cpl.F90.
3. Define new events that comprise the receive and send step between the coupler and the new component, e.g. create analogues of add_events_nemo_to_cpl and add_events_cpl_to_nemo.
4. Add the calls to add_events_COMPONENT_to_cpl and add_events_cpl_to_COMPONENT to subroutine add_events_part1.
5. Add a subroutine that reconstructs the global array from the subdomain arrays and calls send_data_rec.
6. Add a subroutine that calls rcv_data_rec and scatters the global array to each subdomain.
Potential future enhancements
Eliminate the need for steps (4) and (5) in “Exchanging coupled fields” above by creating an intercommunicator between cpl and atm/ocn
Refactor the coupler to support more tasks to enhance performance of the ESMF remapping
Avoid repeated code by refactoring the bcast routines
Generate mapping between processor domains to avoid full global gathers/scatters
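As a rough illustration of the last item, the following Python sketch computes which source tasks hold the cells that each destination task needs, so that only overlapping pieces would have to be exchanged. It is a sketch under simplifying assumptions, not a proposed implementation: both decompositions are treated as 1-D block splits of the same index space, the remapping between grids is ignored, and the function names are hypothetical.

```python
def block_bounds(n_cells, n_tasks):
    """Half-open [start, stop) range owned by each task in a 1-D block split."""
    base, extra = divmod(n_cells, n_tasks)
    bounds, start = [], 0
    for rank in range(n_tasks):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def overlap_map(src_bounds, dst_bounds):
    """For each destination rank, list the (src_rank, start, stop) pieces it needs."""
    mapping = {}
    for dst_rank, (d0, d1) in enumerate(dst_bounds):
        pieces = []
        for src_rank, (s0, s1) in enumerate(src_bounds):
            lo, hi = max(s0, d0), min(s1, d1)
            if lo < hi:  # non-empty intersection
                pieces.append((src_rank, lo, hi))
        mapping[dst_rank] = pieces
    return mapping


src = block_bounds(12, 3)  # three source tasks over 12 cells
dst = block_bounds(12, 2)  # two destination tasks over the same 12 cells
mapping = overlap_map(src, dst)
```

Here destination task 0 needs cells 0 to 3 from source task 0 and cells 4 to 5 from source task 1, so no task has to hold the full global array.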