MVAPICH

Grid Engine Parallel Support for MVAPICH

with Tight and Loose Integration






Content

  1. Content of mvapich-1.0.tar.gz
  2. mvapich.template
  3. mvapichl.template
  4. startmpi.sh
  5. stopmpi.sh
  6. 'mvapich.sh' job example
  7. 'mvapichl.sh' job example
  8. Copyright

1. Content of mvapich-1.0.tar.gz

The mvapichsge-1.0.tar.gz archive contains the following files and directories:

  README             this file
  startmpi.sh        startup script for MVAPICH
  stopmpi.sh         shutdown script for MVAPICH
  mvapich.template   an MVAPICH PE template configuration for Grid Engine (tight)
  mvapichl.template  an MVAPICH PE template configuration for Grid Engine (loose)
  mvapich.sh         a sample MVAPICH job (tight)
  mvapichl.sh        a sample MVAPICH job (loose)
  hostname           a wrapper for the hostname command

Please refer to the "Administration and User Guide" Chapter "Support of Parallel Environments" for a general introduction to the Parallel Environment Interface of Grid Engine. (Grid Engine 5.3, Grid Engine 6.0)


2. mvapich.template

Use this template as a starting point when establishing a parallel environment for MVAPICH with tight integration. You need to replace <a_list_of_parallel_queues>, <the_number_of_slots> and <your_sge_root> with the values for your installation.
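The replacement step can be sketched with sed. This is only an illustration: the helper name fill_pe_template, the output file name and the example values (all.q, 16, /opt/sge) are assumptions, not part of the archive. Only the three placeholder strings come from the template description above.

```shell
#!/bin/sh
# Hypothetical helper: substitute the three placeholders of a PE template.
# Usage: fill_pe_template TEMPLATE QUEUES SLOTS SGE_ROOT > OUTPUT
# Note: the values must not contain '|' or '&', which sed would interpret.
fill_pe_template() {
    sed -e "s|<a_list_of_parallel_queues>|$2|g" \
        -e "s|<the_number_of_slots>|$3|g" \
        -e "s|<your_sge_root>|$4|g" \
        "$1"
}

# Example (values are assumptions):
#   fill_pe_template mvapich.template all.q 16 /opt/sge > mvapich.pe
#   qconf -Ap mvapich.pe    # add the parallel environment from the file
```

Review the generated file before registering it, since the exact set of fields in a PE configuration differs between Grid Engine versions.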

Here is a list of problems for which tight integration provides solutions
Here is a list of problems which are not solved by the tight integration

3. mvapichl.template

Use this template as a starting point when establishing a parallel environment for MVAPICH with loose integration. Loose integration is not recommended, because MVAPICH does not clean up its jobs properly in this mode when, for example, qdel is used.

4. startmpi.sh

The starter script 'startmpi.sh' needs some command line arguments, which are configured with either qmon or qconf. The first argument is the path to the "$pe_hostfile", which startmpi.sh transforms into an MPI machine file. On successful completion, startmpi.sh has created the machine file $TMPDIR/machines, which is to be passed to "mpirun" at job start.

$TMPDIR is a temporary directory created and removed by the Grid Engine execution daemon.
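The transformation can be sketched as follows. This is not the startmpi.sh shipped in the archive, only a minimal illustration. It assumes the usual pe_hostfile layout of one line per host ("host slots queue processor_range") and a machine file that lists each host once per slot. The function name pe_hostfile_to_machines is hypothetical.

```shell
#!/bin/sh
# Sketch: expand a Grid Engine pe_hostfile into an MPI machine file.
# Each input line is assumed to be "host slots queue processor_range";
# the host name is written once for every slot granted on that host.
pe_hostfile_to_machines() {
    while read -r host nslots _rest; do
        [ -n "$host" ] || continue      # skip blank lines
        i=0
        while [ "$i" -lt "$nslots" ]; do
            echo "$host"
            i=$((i + 1))
        done
    done < "$1"
}

# Inside a start procedure this could be called as:
#   pe_hostfile_to_machines "$pe_hostfile" > "$TMPDIR/machines"
```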

5. stopmpi.sh

The stopper script 'stopmpi.sh' just removes $TMPDIR/machines.

6. 'mvapich.sh' job example

The job example 'mvapich.sh' starts the 'xhpl' program. Please note that an MPI job has to start 'mpirun_rsh' with the option "-np $NSLOTS" so that the job runs with the correct number of slots ($NSLOTS is set by Grid Engine).
To tell mpirun_rsh where to start the MPI tasks, pass "-hostfile $TMPDIR/machines" as the second argument.

Additionally, for tight integration remember to use "-rsh". Optionally, you can use "-nowd" to prevent MVAPICH from running 'cd $wd' on the remote hosts.
This leaves SGE in charge of the working directory.
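Put together, a tight-integration job script might look like the sketch below. It is not the mvapich.sh from the archive. The PE name "mvapich", the slot count, the path ./xhpl, the function name run_tight, the MPIRUN override variable and the placement of "-rsh" and "-nowd" before "-np" are all assumptions. Only the options themselves come from the description above.

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
# PE name and slot count are assumptions; use the name of your own PE.
#$ -pe mvapich 4

# MPIRUN can be overridden (for example with "echo") for a dry run.
MPIRUN=${MPIRUN:-mpirun_rsh}

# Start an MPI program on the slots granted by Grid Engine.
# NSLOTS and TMPDIR are set by Grid Engine; TMPDIR/machines is
# created by startmpi.sh before the job starts.
run_tight() {
    "$MPIRUN" -rsh -nowd -np "$NSLOTS" -hostfile "$TMPDIR/machines" "$@"
}

# In a real job script the last line would be, for example:
#   run_tight ./xhpl
```

Setting MPIRUN=echo prints the argument list instead of launching anything, which is a quick way to check the options before submitting with qsub.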

7. 'mvapichl.sh' job example

This is the same case as 'mvapich.sh', but with loose integration, where the option "-ssh" can be used instead of "-rsh".

8. Copyright

Marcelo Matus