How to install the Grid Engine software on hosts with the Solaris™ Operating Environment IP Multipathing (IPMP) technology
This document describes how to install Grid Engine on machines with
multiple network interfaces (multi-homed hosts). Particular
attention is given to the Solaris Operating Environment IP Multipathing
technology. The procedure presented here should also work for other environments.
What is IP Multipathing?
IP Multipathing is a technology that allows grouping of TCP/IP interfaces
for failover and load balancing purposes. If an interface within an IP
Multipathing group fails, the interface is disabled and its IP address
is relocated to another interface in the group. Outbound IP traffic is
distributed across the interfaces of a group.
For further details on IP Multipathing, including an overview of the
IPMP features, refer to the Solaris Operating Environment documentation.
Issues between IPMP and Grid Engine
The only major issue is the error messages that appear
while starting the Grid Engine daemons on a machine whose main interface
is part of an IPMP group. This occurs because IPMP load balancing distributes
the connections across the interfaces in the group; as a result, the IP packets
show up at the receiving end as coming from a different host than
the one associated with the main interface.
For example, suppose we have a machine
with three interfaces named qfe0, qfe1, and qfe2,
whose IP addresses are 10.1.1.1, 10.1.1.2, and
10.1.1.3 respectively. IPMP would need an extra test address for each
interface, but we will ignore those in this example. Each of these addresses
has a hostname associated with it in the hosts table: sge, sge-qfe1,
and sge-qfe2. The machine's hostname is sge. When a connection is established
to another machine, it might go out through sge, sge-qfe1, or sge-qfe2.
Upon installation, Grid Engine will only recognize sge. When it receives
a connection from sge-qfe2, it closes the connection because it does not
come from one of the authorized (or known) nodes.
To solve this issue we have to use the host_aliases file (see the
host_aliases man page for details). This file can be used to "tell" Grid
Engine that sge, sge-qfe1, and sge-qfe2 are
all names for the same machine. The host_aliases file for this case
would look like this:
sge sge-qfe1 sge-qfe2
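To make the mapping concrete, the following shell sketch looks up the primary name for a given hostname in a host_aliases-style file. It assumes the layout shown above (primary name first, aliases after it on the same line). The function name resolve_alias is hypothetical and not part of Grid Engine; it only illustrates how the file is read.

```shell
# Hypothetical helper (not part of Grid Engine): print the primary name
# for a hostname, i.e. the first field of the line that mentions it.
resolve_alias() {
    aliases_file=$1
    name=$2
    awk -v n="$name" '
        { for (i = 1; i <= NF; i++) if ($i == n) { print $1; found = 1; exit } }
        END { if (!found) print n }   # names not listed map to themselves
    ' "$aliases_file"
}

# Example with a throwaway copy of the file shown above.
tmp_aliases=$(mktemp)
echo "sge sge-qfe1 sge-qfe2" > "$tmp_aliases"
resolve_alias "$tmp_aliases" sge-qfe2   # prints: sge
```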
Note: If you make any changes to the $SGE_ROOT/$SGE_CELL/common/host_aliases
file, all running Grid Engine daemons (including sge_qmaster)
must be stopped and restarted. To do so, log in as root
on each of your Grid Engine hosts and stop and restart the daemons there.
How to install the Grid Engine master node with IPMP
There are at least two options:
A) Ignore the error messages during installation. The procedure is:
1. Run inst_sge -m, ignoring the error messages during the startup
of the daemons.
2. Shut down the daemons with /etc/init.d/rcsge stop. Due to the
networking errors, some daemons may fail to shut down
and must be killed with kill -9. To check which daemons
failed to shut down, use: ps -e | grep sge_.
3. Install the host_aliases file in the $SGE_ROOT/$SGE_CELL/common directory.
4. Restart the daemons with /etc/init.d/rcsge start.
Note: This procedure is operating-system independent.
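The steps of option A can be sketched as a small shell function. This is an illustration under stated assumptions, not an official installer: the run wrapper, the function name, and the ALIASES_SRC variable (a path to a host_aliases file prepared beforehand) are hypothetical, and the cell name falls back to "default" when SGE_CELL is unset. By default the sketch only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Dry-run wrapper: prints each command; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Option A as one sequence. ALIASES_SRC is a hypothetical path to a
# host_aliases file prepared beforehand; SGE_ROOT must already be set.
install_master_option_a() {
    aliases_src=${ALIASES_SRC:-./host_aliases}
    common_dir="$SGE_ROOT/${SGE_CELL:-default}/common"

    run inst_sge -m || true              # step 1: startup errors are expected
    run /etc/init.d/rcsge stop || true   # step 2: some daemons may not stop
    if [ "${DRY_RUN:-1}" != "1" ]; then
        # List daemons that survived; kill them by hand with kill -9 <pid>.
        ps -e | grep sge_ || echo "no Grid Engine daemons left running"
    fi
    run cp "$aliases_src" "$common_dir/host_aliases"   # step 3
    run /etc/init.d/rcsge start                        # step 4
}
```

Killing leftover daemons is deliberately left to the operator, since the sketch cannot know which processes are safe to terminate.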
B) Temporarily disable IPMP on the interface associated with the machine's
hostname. The procedure is:
1. Identify the interface associated with the machine's hostname.
2. Verify that the interface has IPMP enabled with:
ifconfig <<interface>> | grep groupname.
3. Take note of the group name.
4. Disable IPMP with: ifconfig <<interface>> group "" .
5. Install the Grid Engine master node.
6. Install the host_aliases file in the $SGE_ROOT/$SGE_CELL/common directory.
7. Restart all the Grid Engine daemons.
8. Re-enable IPMP: ifconfig <<interface>> group <<IPMP group name>>.
Note: This procedure is valid only for the Solaris™
8 Operating Environment or newer.
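Option B can be sketched in the same way. The helper below extracts the group name from ifconfig output, assuming the output contains a "groupname <name>" pair as the grep in step 2 suggests. The function names, the run wrapper, and the ALIASES_SRC variable are hypothetical. Restarting the daemons in step 7 is assumed to use the same /etc/init.d/rcsge script as option A.

```shell
# Print the IPMP group name found in ifconfig output read from stdin.
# Assumes the output contains a "groupname <name>" pair.
ipmp_group_from_ifconfig() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "groupname") { print $(i + 1); exit } }'
}

# Dry-run wrapper: prints each command; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 4 to 8 of option B; SGE_ROOT must already be set.
# $1 = interface (step 1), $2 = group name noted in step 3.
install_master_option_b() {
    iface=$1
    group=$2
    common_dir="$SGE_ROOT/${SGE_CELL:-default}/common"

    run ifconfig "$iface" group ""          # step 4: leave the IPMP group
    run inst_sge -m                         # step 5: install the master node
    run cp "${ALIASES_SRC:-./host_aliases}" "$common_dir/host_aliases"  # step 6
    run /etc/init.d/rcsge stop              # step 7: restart the daemons
    run /etc/init.d/rcsge start
    run ifconfig "$iface" group "$group"    # step 8: rejoin the IPMP group
}
```

On a real host, the group name could be captured with group=$(ifconfig qfe0 | ipmp_group_from_ifconfig) before calling install_master_option_b qfe0 "$group"; qfe0 here is just the example interface used earlier.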
How to install a Grid Engine execution host with IPMP
Once the host_aliases file is
installed and the Grid Engine daemons are restarted, you can simply start
the execution host installation without further problems.
How to enable administrative and submit hosts with IPMP
You can either follow the same procedure
used for the execution host (that is, update host_aliases before installation;
see the note on changes to the host_aliases file),
or add all the hostnames associated with the administrative or submit
host using:
qconf -ah <<hostname>> <<alias 1>> <<alias 2>> ...
(for an administrative host) or
qconf -as <<hostname>> <<alias 1>> <<alias 2>> ...
(for a submit host).
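As a sketch of the second approach, the function below registers a hostname and its aliases one name at a time, so it does not depend on how qconf separates multiple names on one command line. The function name add_host_names and the run wrapper are hypothetical; only the qconf -ah and qconf -as options come from the text above. By default the commands are printed rather than executed.

```shell
# Dry-run wrapper: prints each command; set DRY_RUN=0 to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Register a host and its aliases; kind is "admin" (-ah) or "submit" (-as).
add_host_names() {
    kind=$1
    shift
    case $kind in
        admin)  flag=-ah ;;
        submit) flag=-as ;;
        *) echo "usage: add_host_names admin|submit name..." >&2; return 1 ;;
    esac
    for name in "$@"; do
        run qconf "$flag" "$name"
    done
}

# Example with the hostnames used earlier in this document.
add_host_names admin sge sge-qfe1 sge-qfe2
```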
Sun and Solaris are trademarks or registered trademarks of Sun Microsystems,
Inc. in the United States and other countries.