Sun Grid Engine Tuning guide

Grid Engine is a full-function, general-purpose Distributed Resource Management (DRM) tool. Its scheduler component supports a wide range of compute farm scenarios. To get the maximum performance from your compute environment, it can be worthwhile to review which features are enabled and which are really needed to solve your load management problem. Enabling or disabling these features can affect the throughput of your cluster. Each feature heading gives, in parentheses, the version in which the feature was introduced. Unless otherwise stated, it is available in later versions as well.

overall cluster tuning (V5.3 + V6.0)

Experience has shown that using NFS or a similar shared file system to distribute the files required by Grid Engine can account for a critical share of both overall network load and file server load. Keeping such files local is therefore always beneficial for overall cluster throughput. The "Reducing and Eliminating NFS usage by Grid Engine" HOWTO shows common choices for accomplishing this.
scheduler monitoring (V5.3 + V6.0)

Scheduler monitoring can help to find out why certain jobs are not dispatched (displayed via qstat). However, providing this information for all jobs at all times consumes resources (memory and CPU time) and is usually not needed. To disable scheduler monitoring, set schedd_job_info to false in the scheduler configuration. See sched_conf(5).
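The change can be scripted. The following is a minimal sketch, assuming the V6.0 options "qconf -ssconf" and "qconf -Msconf" and a configuration dump with one "name value" pair per line. The helper names set_sched_param and disable_schedd_job_info are hypothetical and not part of Grid Engine.

```shell
#!/bin/sh
# set_sched_param FILE NAME VALUE
# Rewrite the line whose first field is NAME so that it carries VALUE.
# All other lines are copied unchanged. Hypothetical helper.
set_sched_param() {
    file=$1
    name=$2
    value=$3
    tmp="$file.tmp"
    awk -v n="$name" -v v="$value" '
        $1 == n { printf "%-32s %s\n", n, v; next }
        { print }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# disable_schedd_job_info
# Dump the scheduler configuration, switch schedd_job_info off, load it back.
# Needs qconf on the PATH and manager privileges; it is only defined here,
# not called.
disable_schedd_job_info() {
    conf=$(mktemp) || return 1
    qconf -ssconf > "$conf" &&
        set_sched_param "$conf" schedd_job_info false &&
        qconf -Msconf "$conf"
    status=$?
    rm -f "$conf"
    return $status
}
```

Calling disable_schedd_job_info on a host with manager rights applies the change. Try set_sched_param on a saved copy of the configuration first.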
finished jobs (V5.3 + V6.0)

With array jobs, the finished jobs list in qmaster can become quite big. Switching it off saves memory and speeds up qstat commands, because qstat also fetches the finished jobs list. To switch it off, set finished_jobs to 0 in the global configuration. See sge_conf(5).
job verification (V5.3 + V6.0)

Forcing validation at job submission time can be a valuable tool to prevent non-dispatchable jobs from remaining in the pending state forever. However, validating jobs can be time consuming, especially in heterogeneous environments with a variety of execution nodes and consumable resources, where every user has an individual job profile. In homogeneous environments with only a few different kinds of jobs, general job validation can usually be omitted. Job verification is disabled by default and should only be used (qsub(1): -w [v|e|w]) when needed. [It is enabled by default with DRMAA.]
load thresholds and suspend thresholds (V5.3 + V6.0)

Load thresholds are needed if you deliberately oversubscribe your machines and need a mechanism to prevent excessive system load. Suspend thresholds serve the same purpose. Load thresholds are also needed when an execution node is open to interactive load that is not under the control of Grid Engine and you want to prevent the node from being overloaded. If a compute farm is more single-purpose, e.g. each CPU of a compute node is represented by only one queue slot and no interactive load is expected on these nodes, then load_thresholds can be omitted. To disable both thresholds, set load_thresholds to none and suspend_thresholds to none. See queue_conf(5).
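When many queues are involved, the commands can be generated rather than typed. This is a sketch only: it assumes that "qconf -rattr queue <attribute> <value> <queue>" is available in your version and accepts the value none for both attributes, which should be verified on a test queue first. The function name threshold_commands is hypothetical. It only prints commands and changes nothing itself.

```shell
#!/bin/sh
# threshold_commands QUEUE...
# Print (do not run) the qconf calls that would set load_thresholds and
# suspend_thresholds to none for each named queue. Hypothetical helper.
threshold_commands() {
    for q in "$@"; do
        printf 'qconf -rattr queue load_thresholds none %s\n' "$q"
        printf 'qconf -rattr queue suspend_thresholds none %s\n' "$q"
    done
}
```

For example, "threshold_commands all.q" prints two commands for the queue all.q. After review the output can be piped to sh.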

Starting with V6.0, load_thresholds are applicable to consumable resources as well (see queue_conf(5)). Using this feature has a negative impact on scheduler performance.

load adjustments (V5.3 + V6.0)

Load adjustments are used to virtually increase the measured load after a job has been dispatched. This mechanism is helpful on oversubscribed machines in order to align with load thresholds. Load adjustments should be switched off if they are not needed, because they impose additional work on the scheduler when sorting hosts and verifying load thresholds. To disable load adjustments, set job_load_adjustments to none and load_adjustment_decay_time to 0 in the scheduler configuration. See sched_conf(5).
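To check whether load adjustments are still active, the scheduler configuration can be inspected with a small filter. This is a minimal sketch, assuming the "name value" line layout printed by "qconf -ssconf" in V6.0. The function name load_adjustments_active is hypothetical.

```shell
#!/bin/sh
# load_adjustments_active
# Read a scheduler configuration on stdin and print the load adjustment
# settings that are still enabled. Exit status 0 means at least one setting
# is active, 1 means both are switched off. Hypothetical helper.
load_adjustments_active() {
    awk '
        # Anything other than "none" means adjustments are applied.
        $1 == "job_load_adjustments" && tolower($2) != "none" {
            print; found = 1
        }
        # A decay time made up only of zeros and colons (0, 0:0:0) is off.
        $1 == "load_adjustment_decay_time" && $2 !~ /^[0:]+$/ {
            print; found = 1
        }
        END { exit found ? 0 : 1 }
    '
}
```

On a live system this could be used as: qconf -ssconf | load_adjustments_active && echo "load adjustments still enabled"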
scheduling-on-demand (V5.3 + V6.0)

By default Grid Engine starts scheduling runs at a fixed scheduling interval (see schedule_interval in sched_conf(5)). The advantage of fixed intervals is that they limit the CPU time consumed by the qmaster/scheduler. The disadvantage is that they throttle the scheduler artificially, which limits throughput. Many compute farms have machines dedicated to the qmaster/scheduler, and in such setups there is no reason to throttle the scheduler. Scheduling-on-demand instead triggers a scheduling run a configurable number of seconds (the flush time) after a job has been submitted or has finished. How many seconds to use for the flush times is difficult to say. It depends on the time the scheduler needs for a single run and on the number of jobs in the system. A few test runs with scheduler profiling enabled (add profile=1 to the params in sched_conf(5)) should provide enough data to select a good value.

In V5.3:

Scheduling-on-demand can be configured using the FLUSH_SUBMIT_SEC and FLUSH_FINISH_SEC settings in the schedd_params section of the global cluster configuration. See sge_conf(5).

In V6.0:

Scheduling-on-demand can be configured using the flush_submit_sec and flush_finish_sec settings in the scheduler configuration. See sched_conf(5).

scheduler priority information (V6.0)

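The scheduler can report the priority information that is displayed by qstat -ext, -urg and -pri. If this information is not needed, set report_pjob_tickets to false in the scheduler configuration. See sched_conf(5).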

policies (V6.0)

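The job priority described in sge_priority(5) is derived from the following policies:

ticket policy
urgency policy
POSIX priority policy
deadline policy
waiting time policy

A policy that is not needed can be switched off through its weighting factor. See sched_conf(5).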

resource reservation (V6.0)

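Resource reservation is limited by max_reservation in the scheduler configuration. If resource reservation is not needed, set max_reservation to 0. See sched_conf(5).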

optimization of qmaster memory consumption

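The environment variables submitted with a job add to the data held for that job. Prefer -v variable_list, which passes only the named variables, over -V, which passes the complete submission environment.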

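intentional use of "-b y" to unburden qmaster (V6.0)

With qsub -b y the command is treated as a binary, so only the path of the command is transferred rather than a job script.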

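EXPERIMENTAL: job filter based on job classes (V6.0u1)

The job filter is switched on with JC_FILTER=1 in the params of sched_conf(5). This setting is not documented and can lead to problems in the system, in particular with "qstat -ext" and "qstat -j ".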

A new feature in Grid Engine V6.0 is the ability to store scheduler profiles in files, e.g. with "qconf -ssconf >file", like the profiles used during Grid Engine installation. The profiles are not stored internally. By loading a different profile with "qconf -Msconf <file>" from a cron job, the scheduler configuration can be changed dynamically: one can switch to a leaner configuration overnight and return to a user-friendly configuration during the day.
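The cron-driven switch could look like the sketch below. Everything except "qconf -Msconf" is an assumption made for illustration: the directory in PROFILE_DIR, the file names day.conf and night.conf, the 07:00 to 19:00 day window, the --apply argument and the script path in the crontab lines. The profile files have to be created beforehand with "qconf -ssconf >file" and edited as needed.

```shell
#!/bin/sh
# switch_sched_profile.sh: load a day or a night scheduler profile.
# PROFILE_DIR and the profile file names are hypothetical.
PROFILE_DIR=${PROFILE_DIR:-/usr/local/sge/profiles}

# profile_for_hour HOUR
# Print the profile file for the given hour (0-23, a leading zero is allowed).
# Assumes the user-friendly profile applies from 07:00 up to 18:59.
profile_for_hour() {
    hour=${1#0}        # "08" -> "8", so the value is not taken as octal
    hour=${hour:-0}    # "0" was stripped to an empty string, restore it
    if [ "$hour" -ge 7 ] && [ "$hour" -lt 19 ]; then
        echo "$PROFILE_DIR/day.conf"
    else
        echo "$PROFILE_DIR/night.conf"
    fi
}

# load_profile: pick the profile for the current hour and load it.
load_profile() {
    qconf -Msconf "$(profile_for_hour "$(date +%H)")"
}

# Only touch the cluster when explicitly asked to.
if [ "${1:-}" = "--apply" ]; then
    load_profile
fi

# Example crontab entries (hypothetical script path):
#   0 7  * * * /usr/local/sge/bin/switch_sched_profile.sh --apply
#   0 19 * * * /usr/local/sge/bin/switch_sched_profile.sh --apply
```

Deriving the profile from the current hour, rather than passing "day" or "night" from cron, means a run at any time of day loads the profile that matches that time.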