Grid Engine Troubleshooting

Problem with pending jobs not being dispatched

Sometimes a pending job is obviously runnable, but does not get dispatched. Grid Engine can be asked for the reason:
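
As a minimal sketch of how such a query might look, the commands below ask for the scheduling information of one job and then run a verification pass over it. The job ID 4711 is a hypothetical placeholder. The sketch also assumes that the scheduler is configured to collect job scheduling information (the schedd_job_info setting); otherwise qstat -j may not list any reasons.

```shell
#!/bin/sh
# Hypothetical job ID; replace it with the ID of the pending job shown by qstat.
JOB_ID="${JOB_ID:-4711}"

if command -v qstat >/dev/null 2>&1; then
    # Print the job's details, including any scheduling information
    # the scheduler recorded about why the job was not dispatched.
    qstat -j "$JOB_ID"

    # Verify whether the job could be scheduled at all, assuming an
    # otherwise empty cluster.
    qalter -w v "$JOB_ID"
else
    echo "Grid Engine client commands not found in PATH" >&2
fi
```

If the verification reports that no suitable queue exists, the job's resource requests are the likely cause rather than the current cluster load.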

Job or queue goes into error state "E"

Job or queue errors are indicated by an uppercase "E" in the qstat output. A job enters the error state when Grid Engine tried to execute it in a queue, but the attempt failed for a reason specific to the job. A queue enters the error state when Grid Engine tried to execute a job in it, but the attempt failed for a reason specific to the queue.

Grid Engine offers several ways for users and administrators to obtain diagnostic information when job execution fails. Because both the queue and the job error states result from a failed job execution, the same diagnostic options apply to both types of error state:
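
The following sketch shows a few commands that can be used for this kind of diagnosis. It is not an exhaustive list. The job ID 4711 is a hypothetical placeholder, and the location of the qmaster messages file assumes the default spool directory layout under $SGE_ROOT, which may differ at your site.

```shell
#!/bin/sh
# Hypothetical job ID; replace it with the ID of the failed job.
JOB_ID="${JOB_ID:-4711}"

# Assumed default location of the qmaster messages file; adjust it if your
# installation uses a different spool directory.
MESSAGES="${SGE_ROOT:-}/${SGE_CELL:-default}/spool/qmaster/messages"

if command -v qstat >/dev/null 2>&1; then
    # List all queues and show the reason for any queue in error state.
    qstat -f -explain E

    # Show the accounting record of the finished job, including the
    # "failed" and "exit_status" fields.
    qacct -j "$JOB_ID"
else
    echo "Grid Engine client commands not found in PATH" >&2
fi

# Show the most recent qmaster log entries, if the file is readable.
if [ -r "$MESSAGES" ]; then
    tail -n 20 "$MESSAGES"
fi
```

The messages files of the execution daemons on the affected hosts are often worth checking in the same way, since a queue-specific failure is usually logged there.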

qmaster or other Grid Engine daemons keep crashing

Alternatively, you can control the level of verbosity with the following shell commands:
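
A minimal sketch of such commands in Bourne shell syntax follows (csh users would use setenv instead of export). It assumes the SGE_DEBUG_LEVEL and SGE_ND environment variables are honored by your Grid Engine version. The debug level string is an illustrative value, not a recommendation.

```shell
#!/bin/sh
# One number per debug layer; a higher first value means more trace output
# from the top layer. The value below is only an example.
export SGE_DEBUG_LEVEL="3 0 0 0 0 0 0 0"

# Keep the daemon in the foreground instead of letting it daemonize,
# so that the debug output appears on the terminal.
export SGE_ND=true

echo "SGE_DEBUG_LEVEL=$SGE_DEBUG_LEVEL SGE_ND=$SGE_ND"

# With these variables set, start the daemon by hand from this shell, e.g.:
#   "$SGE_ROOT/bin/$("$SGE_ROOT/util/arch")/sge_qmaster"
```

Unset both variables once you are done debugging, because daemons started from the same shell would otherwise keep producing trace output.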