The Grid Engine product includes a tool called the Accounting
and Reporting Console (ARCo), which stores
reporting data from the qmaster in a relational database.
The attached spreadsheet is intended to guide the administrator in
estimating the space requirements for such a database.
The following calculations are simple guidelines and will not fit
every installation. The reader is encouraged to consult other
database configuration literature to complement the information found
in this document.
The ARCo module of the Grid Engine product consists of the dbwriter and the
reporting module (a web application which runs inside the Java(TM) Web
Console). When the reporting functionality is enabled, the qmaster
writes reporting data into the reporting file, located at
<SGE_ROOT>/<cell>/common/reporting. This file contains raw
values in a format described in the reporting(5) man page. The
reporting_params parameter of the qmaster configuration controls which
information is written into the reporting file (see sge_conf(5)).
The dbwriter module performs the following tasks:
The dbwriter module periodically looks for the reporting file. If the file exists, it will be renamed to reporting.processing, and the contents of the file (the raw values) will be imported into the database. After the file reporting.processing is completely processed, it is deleted by the dbwriter.
Based on the raw values stored in the database, the dbwriter module calculates a set of derived values. The rules for the derived values are defined in the calculation file (by default at <SGE_ROOT>/<CELL>/dbwriter/<database type>/dbwriter.xml).
The following derived value rule calculates the derived value "h_load" (hourly load), which is the average of the raw value "np_load_avg". The calculation is performed every hour. The resulting derived values are stored in the same database table from which they were calculated, in this case sge_host_values.
<!-- average load per hour -->
<derive object="host" interval="hour" variable="h_load">
  <auto function="AVG" variable="np_load_avg" />
</derive>
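To make the effect of such a rule concrete, the following is a minimal Python sketch of an hourly average over raw samples. It is not the dbwriter implementation; the function name derive_hourly_average, the host name, and the sample values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime


def derive_hourly_average(samples):
    """Average raw samples per (host, hour) bucket.

    samples: iterable of (host, timestamp, value) tuples.
    Returns a dict mapping (host, start_of_hour) to the mean value.
    """
    buckets = defaultdict(list)
    for host, ts, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(host, hour_start)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Hypothetical raw "np_load_avg" samples for a single host.
raw_np_load_avg = [
    ("node01", datetime(2024, 1, 1, 10, 5), 0.50),
    ("node01", datetime(2024, 1, 1, 10, 35), 1.50),
    ("node01", datetime(2024, 1, 1, 11, 5), 2.00),
]

# One derived "h_load" value per host and hour.
h_load = derive_hourly_average(raw_np_load_avg)
for (host, hour_start), value in sorted(h_load.items()):
    print(host, hour_start.isoformat(), value)
```

Each hour yields one derived row per host regardless of how many raw samples fall into it, which is why derived values can be kept much longer than raw values at a modest storage cost.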
Deletion rules are also defined in the calculation file. A deletion rule defines how long a raw or derived value stays in the database. If correctly configured, the deletion rules keep the database at an approximately constant size.
The following example deletes all the records from the sge_host_values table where "hv_variable" equals "np_load_avg" and the values are older than 7 days.
<delete scope="host_values" time_range="day" time_amount="7">
  <sub_scope>np_load_avg</sub_scope>
</delete>
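The effect of this deletion rule can be illustrated with a small, self-contained Python sketch that uses an in-memory SQLite database. This is an assumption-laden illustration, not the statement the dbwriter actually issues: the table name sge_host_values and the column hv_variable come from the text above, while the columns hv_time and hv_value and all sample rows are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
# hv_time and hv_value are hypothetical column names used for illustration.
conn.execute(
    "CREATE TABLE sge_host_values ("
    " hv_variable TEXT, hv_time TEXT, hv_value REAL)"
)

now = datetime(2024, 1, 15, 12, 0, 0)
rows = [
    ("np_load_avg", now - timedelta(days=10), 0.8),  # older than 7 days: deleted
    ("np_load_avg", now - timedelta(days=2), 0.4),   # recent: kept
    ("h_load", now - timedelta(days=10), 0.6),       # other variable: kept
]
conn.executemany(
    "INSERT INTO sge_host_values VALUES (?, ?, ?)",
    [(var, ts.isoformat(), val) for var, ts, val in rows],
)

# Equivalent of time_range="day" time_amount="7" with sub_scope np_load_avg.
# ISO 8601 strings of the same format compare correctly as text.
cutoff = (now - timedelta(days=7)).isoformat()
deleted = conn.execute(
    "DELETE FROM sge_host_values WHERE hv_variable = ? AND hv_time < ?",
    ("np_load_avg", cutoff),
).rowcount

remaining = conn.execute(
    "SELECT hv_variable, hv_value FROM sge_host_values ORDER BY hv_value"
).fetchall()
print(deleted, remaining)
```

Only rows that match both the variable and the age condition are removed, so derived values such as "h_load" in the same table are unaffected by this rule.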
For detailed information, please refer to the section "Derived Values and Deletion Rules" of the Sun Grid Engine Administration Guide.
The following table shows all the raw and derived values with their associated lifetimes, assuming the default configuration of the dbwriter module is used.
Database Table | Interval | Variable | Lifetime |
department_values | * | * | 2 years |
group_values | * | * | 2 years |
host_values | * | * | 2 years |
host_values | day | d_jobs_finished | 2 years |
host_values | day | d_load | 2 years |
host_values | hour | h_cpu | 2 years |
host_values | hour | h_jobs | 2 years |
host_values | hour | h_load | 2 years |
host_values | raw values | cpu | 7 days |
host_values | raw values | mem_free | 7 days |
host_values | raw values | np_load_avg | 7 days |
host_values | raw values | virtual_free | 7 days |
job | * | * | 1 year |
job_log | * | * | 1 month |
project_values | * | * | 2 years |
project_values | hour | h_jobs_finished | 2 years |
queue_values | * | * | 2 years |
queue_values | hour | h_utilized | 2 years |
queue_values | raw values | slots | 1 month |
queue_values | raw values | state | 1 month |
share_log | * | * | 1 year |
share_log | raw values | user1 | 1 month |
user_values | * | * | 2 years |
user_values | day | d_jobs_finished | 2 years |
user_values | hour | h_jobs_finished | 2 years |
With the knowledge of how the dbwriter module works, it is possible to estimate the space requirements of an ARCo database. Attached to this article is a spreadsheet which contains all the formulas required for such a calculation. The administrator only needs to enter the specific parameters of the Grid Engine cluster. Please note the additional comments which have been added as notes to the affected cells, such as descriptions of configuration parameters. Cells with added notes have a small, red box in the upper right corner.
The calculation is based on the default configuration of the Grid Engine
product. The results may differ if the product configuration changes.
The average space usage of each row of a database table has been taken
from the data dictionary of an Oracle 9i database which has been filled
by the dbwriter module. Due to the complexities of database
configuration, this spreadsheet can give only a guideline for how much
space will be required for such a database system.
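As a rough illustration of the kind of arithmetic involved, the following Python sketch estimates the steady-state size contributed by one raw value and one derived value of the host values table. It is only a sketch under stated assumptions and does not reproduce the spreadsheet's formulas: the host count, the sampling rate, and the average row size are hypothetical placeholders, and only the lifetimes (7 days for raw values, 2 years for hourly derived values) are taken from the default configuration described above.

```python
def steady_state_rows(objects, rows_per_object_per_day, lifetime_days):
    """Rows held once deletion rules balance new inserts."""
    return objects * rows_per_object_per_day * lifetime_days


def estimate_bytes(rows, avg_row_bytes):
    """Table data size for a given row count and average row size."""
    return rows * avg_row_bytes


# Hypothetical cluster parameters (assumptions, not product defaults).
HOSTS = 100
RAW_SAMPLES_PER_DAY = 24 * 60   # assume one np_load_avg sample per minute
AVG_ROW_BYTES = 100             # placeholder; measure on the real database

# Lifetimes from the default deletion rules.
RAW_LIFETIME_DAYS = 7
DERIVED_LIFETIME_DAYS = 2 * 365

raw_rows = steady_state_rows(HOSTS, RAW_SAMPLES_PER_DAY, RAW_LIFETIME_DAYS)
hourly_rows = steady_state_rows(HOSTS, 24, DERIVED_LIFETIME_DAYS)
total_bytes = estimate_bytes(raw_rows + hourly_rows, AVG_ROW_BYTES)

print("raw np_load_avg rows:", raw_rows)
print("hourly h_load rows:  ", hourly_rows)
print("estimated size (MB): ", total_bytes / 1e6)
```

A full estimate would repeat this calculation for every variable and table and would add space for indexes and database overhead, which this sketch ignores.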