NAME
sge_shadowd - Sun Grid Engine shadow master daemon
SYNOPSIS
sge_shadowd
DESCRIPTION
sge_shadowd is a lightweight process that can be run on
so-called shadow master hosts in a Sun Grid Engine cluster
to detect failure of the current Sun Grid Engine master
daemon, sge_qmaster(8), and to start up a new sge_qmaster(8)
on the host on which sge_shadowd runs. If multiple shadow
daemons are active in a cluster, they run a protocol which
ensures that only one of them will start up a new master
daemon.
The hosts suitable for being used as shadow master hosts
must have shared root read/write access to the directory
$SGE_ROOT/$SGE_CELL/common as well as to the master daemon
spool directory (by default
$SGE_ROOT/$SGE_CELL/spool/qmaster). The names of the shadow
master hosts need to be contained in the file
$SGE_ROOT/$SGE_CELL/common/shadow_masters.
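The shadow_masters requirement above can be illustrated with a small Python sketch that checks whether a host name appears in the contents of that file. The function name is_shadow_master is hypothetical, and the sketch assumes one host name per line with case-insensitive comparison; neither detail is stated in this page.

```python
def is_shadow_master(shadow_masters_text, hostname):
    """Return True if hostname is listed in the shadow_masters contents.

    Assumptions (not taken from the man page): one host name per line,
    blank lines are ignored, and names are compared case-insensitively.
    """
    wanted = hostname.strip().lower()
    listed = {
        line.strip().lower()
        for line in shadow_masters_text.splitlines()
        if line.strip()
    }
    return wanted in listed
```

A caller would read the file from the common directory and pass its text together with the local host name.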
RESTRICTIONS
sge_shadowd may only be started by root.
ENVIRONMENT VARIABLES
SGE_ROOT Specifies the location of the Sun Grid Engine
standard configuration files.
SGE_CELL If set, specifies the default Sun Grid Engine
cell. To address a Sun Grid Engine cell
sge_shadowd uses (in the order of
precedence):
The name of the cell specified in the
environment variable SGE_CELL, if it is
set.
The name of the default cell, i.e.
default.
SGE_DEBUG_LEVEL
If set, specifies that debug information
should be written to stderr. The value also
defines the level of detail of the debug
information that is generated.
SGE_QMASTER_PORT
If set, specifies the TCP port on which
sge_qmaster(8) is expected to listen for
communication requests. Most installations will
instead use a services map entry for the
service "sge_qmaster" to define that port.
SGE_DELAY_TIME This variable controls how long sge_shadowd
pauses if a takeover bid fails. This value is
used only when there are multiple sge_shadowd
instances and they are contending to be the
master. The default is 600 seconds.
SGE_CHECK_INTERVAL
This variable controls the interval at which
sge_shadowd checks the heartbeat file (60
seconds by default).
SGE_GET_ACTIVE_INTERVAL
This variable controls the interval after
which a sge_shadowd instance tries to take
over when the heartbeat file has not changed.
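The lookup rules described for the variables above can be sketched in Python as follows. This is an illustration only, not the daemon's actual code: the function name resolve_settings and the returned dictionary keys are hypothetical, interval values are assumed to be whole seconds, and SGE_GET_ACTIVE_INTERVAL is left as None when unset because this page states no default for it.

```python
import os
import socket


def resolve_settings(env=None):
    """Resolve cell, qmaster port and intervals from an environment mapping."""
    env = os.environ if env is None else env

    # Cell: SGE_CELL if it is set, otherwise the default cell "default".
    cell = env.get("SGE_CELL", "default")

    # Port: SGE_QMASTER_PORT if set, otherwise the "sge_qmaster" services entry.
    if "SGE_QMASTER_PORT" in env:
        port = int(env["SGE_QMASTER_PORT"])
    else:
        try:
            port = socket.getservbyname("sge_qmaster", "tcp")
        except OSError:
            port = None  # no services map entry on this host

    # Intervals, assumed to be given in whole seconds.
    check_interval = int(env.get("SGE_CHECK_INTERVAL", 60))   # documented default
    delay_time = int(env.get("SGE_DELAY_TIME", 600))          # documented default
    raw_active = env.get("SGE_GET_ACTIVE_INTERVAL")           # no default stated here
    get_active_interval = int(raw_active) if raw_active is not None else None

    return {
        "cell": cell,
        "port": port,
        "check_interval": check_interval,
        "delay_time": delay_time,
        "get_active_interval": get_active_interval,
    }
```

Passing an explicit mapping instead of relying on os.environ keeps the sketch easy to try out without changing the process environment.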
FILES
<sge_root>/<cell>/common
Default configuration directory
<sge_root>/<cell>/common/shadow_masters
Shadow master hostname file.
<sge_root>/<cell>/spool/qmaster
Default master daemon spool directory
<sge_root>/<cell>/spool/qmaster/heartbeat
The heartbeat file.
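For scripts that need these locations, the layout listed above can be assembled from the root directory and cell name. The Python sketch below is a convenience illustration; the function name shadowd_paths and the dictionary keys are hypothetical, and it covers only the default spool location, which an installation may have changed.

```python
import posixpath


def shadowd_paths(sge_root, cell="default"):
    """Build the default file locations used by sge_shadowd."""
    common = posixpath.join(sge_root, cell, "common")
    spool = posixpath.join(sge_root, cell, "spool", "qmaster")
    return {
        "common": common,                                       # configuration directory
        "shadow_masters": posixpath.join(common, "shadow_masters"),
        "qmaster_spool": spool,                                 # default spool directory
        "heartbeat": posixpath.join(spool, "heartbeat"),
    }
```

For example, with a hypothetical root of /opt/sge and the default cell, the heartbeat entry is /opt/sge/default/spool/qmaster/heartbeat.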
SEE ALSO
sge_intro(1), sge_conf(5), sge_qmaster(8), Sun Grid Engine
Installation and Administration
COPYRIGHT
See sge_intro(1) for a full statement of rights and
permissions.