Grid Engine Multi-Core Processor Binding with hwloc

Overview

The job-to-core binding feature was added in Grid Engine 6.2u5 and is built on the PLPA (Portable Linux Processor Affinity) library. However, PLPA was deprecated in 2009 in favor of the hwloc project. All current versions of Grid Engine and its forks still use PLPA, including SGE 6.2u5p1 and SGE 6.2u5p2, which were recently released by the Open Grid Scheduler project.

Luckily, support for hwloc in Grid Scheduler is in the final stages of development.

Advantages of Using hwloc Over PLPA

Try it yourself - loadcheck

Updated: April 14, 2011: processor ID mapping bug fixed.
Updated: April 17, 2011: reworked CPU ID mapping, and built the binary on Oracle Linux 5.6 (compatible with Red Hat Enterprise Linux 5.x and CentOS).
Updated: April 18, 2011: minor bug fixed.

Download loadcheck: hwloc-enabled loadcheck and the 32-bit version.

Try our beta release - Drop-in Upgrade Package (includes: loadcheck, sge_execd, sge_shepherd)

Updated: July 11, 2011: Package created, compatible with SGE 6.2u5, SGE 6.2u5p1, and SGE 6.2u5p2. Other changes: hwloc upgraded to v1.2.

Download package: 64-bit package for AMD64 & Intel64

Drop-in Upgrade Package (includes: loadcheck, sge_execd, sge_shepherd), with CSP mode enabled

Updated: Sept 12, 2011: Package created, compatible with SGE 6.2u5, SGE 6.2u5p1, and SGE 6.2u5p2.

Download package: 64-bit package for AMD64 & Intel64

Install instructions:

Sample Output with hwloc

On a single socket, dual-core processor with hyper-threading enabled:
$ ./loadcheck 
arch            lx26-amd64
num_proc        4
m_socket        1
m_core          2
m_topology      SCTTCTT
load_short      0.02
load_medium     0.05
load_long       0.08
mem_free        2702.710938M
swap_free       5823.996094M
virtual_free    8526.707031M
mem_total       3760.742188M
swap_total      5823.996094M
virtual_total   9584.738281M
mem_used        1058.031250M
swap_used       0.000000M
virtual_used    1058.031250M
cpu             4.6%

$ ./loadcheck -cb
Your SGE Linux version has built-in core binding functionality!
Your Linux kernel version is: 2.6.34
Amount of sockets:		1
Amount of cores:		2
Topology:			SCTTCTT
Mapping of logical socket and core numbers to internal
Internal processor ids for socket     0 core     0:      0     1
Internal processor ids for socket     0 core     1:      2     3
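
The m_topology value in the output above reads as a flat encoding of the hardware tree. Judging from the samples on this page, each S starts a socket, each C a core, and each T a hardware thread. The following is a minimal Python sketch under that assumption; the function name summarize_topology is hypothetical and not part of Grid Engine or loadcheck.

```python
import re


def summarize_topology(topology):
    """Count sockets, cores, and hardware threads in an m_topology string.

    Assumption (inferred from the sample outputs): 'S' = socket,
    'C' = core, 'T' = hardware thread. A core that lists no threads
    is counted as a single processing unit.
    """
    if re.fullmatch(r"[SCT]*", topology) is None:
        raise ValueError("unexpected character in topology string")

    # Each core contributes its listed threads, or one unit if none are listed.
    processors = sum(max(len(t), 1) for t in re.findall(r"C(T*)", topology))

    return {
        "sockets": topology.count("S"),
        "cores": topology.count("C"),
        "threads": topology.count("T"),
        "processors": processors,
    }


if __name__ == "__main__":
    for sample in ("SCTTCTT", "SCC", "SC"):
        print(sample, summarize_topology(sample))
```

For SCTTCTT this gives 1 socket, 2 cores, and 4 processors, which matches the m_socket, m_core, and num_proc values reported by loadcheck.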


On the same configuration, running Oracle Linux with Oracle's own Unbreakable Enterprise Kernel:
$ ./loadcheck
arch            lx26-amd64
num_proc        4
m_socket        1
m_core          2
m_topology      SCTTCTT
load_short      0.65
load_medium     0.24
load_long       0.08
mem_free        3573.882812M
swap_free       5823.992188M
virtual_free    9397.875000M
mem_total       3753.984375M
swap_total      5823.992188M
virtual_total   9577.976562M
mem_used        180.101562M
swap_used       0.000000M
virtual_used    180.101562M
cpu             0.0%

$ ./loadcheck -cb
Your SGE Linux version has built-in core binding functionality!
Your kernel version is: 2.6.32-100.26.2.el5
Amount of sockets:              1
Amount of cores:                2
Topology:                       SCTTCTT
Mapping of logical socket and core numbers to internal
Internal processor ids for socket     0 core     0:      0     1
Internal processor ids for socket     0 core     1:      2     3
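
The "Internal processor ids" lines printed by loadcheck -cb map each logical (socket, core) pair to the processor IDs used by the operating system. A small Python sketch for turning that text into a lookup table follows; the helper name parse_core_mapping is hypothetical, and the regular expression assumes the line layout shown in the samples on this page.

```python
import re

# Matches lines such as:
#   Internal processor ids for socket     0 core     1:      2     3
MAPPING_LINE = re.compile(
    r"Internal processor ids for socket\s+(\d+)\s+core\s+(\d+):(.*)"
)


def parse_core_mapping(text):
    """Return {(socket, core): [processor ids]} from `loadcheck -cb` output."""
    mapping = {}
    for line in text.splitlines():
        match = MAPPING_LINE.match(line.strip())
        if match is None:
            continue  # skip the header lines and anything else
        socket_id = int(match.group(1))
        core_id = int(match.group(2))
        mapping[(socket_id, core_id)] = [int(p) for p in match.group(3).split()]
    return mapping


if __name__ == "__main__":
    sample = (
        "Internal processor ids for socket     0 core     0:      0     1\n"
        "Internal processor ids for socket     0 core     1:      2     3\n"
    )
    print(parse_core_mapping(sample))
```

With hyper-threading enabled, each core maps to two processor IDs, one per hardware thread.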


On a single socket, dual-core processor with hyper-threading disabled:
$ ./loadcheck
arch            lx26-amd64
num_proc        2
m_socket        1
m_core          2
m_topology      SCC
load_short      0.58
load_medium     0.54
load_long       0.42
mem_free        3197.449219M
swap_free       5823.996094M
virtual_free    9021.445312M
mem_total       3825.371094M
swap_total      5823.996094M
virtual_total   9649.367188M
mem_used        627.921875M
swap_used       0.000000M
virtual_used    627.921875M
cpu             99.2%

$ ./loadcheck -cb
Your SGE Linux version has built-in core binding functionality!
Your Linux kernel version is: 2.6.34
Amount of sockets:		1
Amount of cores:		2
Topology:			SCC
Mapping of logical socket and core numbers to internal
Internal processor ids for socket     0 core     0:      0
Internal processor ids for socket     0 core     1:      1
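
With hyper-threading disabled, each core maps to exactly one internal processor ID, so binding a job to a core amounts to restricting it to that single ID. The sketch below illustrates the idea with Python's os.sched_setaffinity, which is available on Linux only. It is an illustration of processor affinity in general, not the mechanism used by sge_execd or sge_shepherd, and the function name bind_to_processors is hypothetical.

```python
import os


def bind_to_processors(processor_ids):
    """Restrict the calling process to the given OS processor IDs.

    Returns the resulting affinity set, or None on platforms that do
    not provide os.sched_setaffinity (it is Linux-specific).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None

    # Only keep IDs the process is currently allowed to run on.
    allowed = os.sched_getaffinity(0)
    wanted = set(processor_ids) & allowed
    if not wanted:
        raise ValueError("none of the requested processors are available")

    os.sched_setaffinity(0, wanted)  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Bind to processor 0, which is socket 0 core 0 in the output above.
    print(bind_to_processors([0]))
```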


On a single socket, single core processor machine running Ubuntu:
> ./loadcheck 
arch            lx26-x86
num_proc        1
m_socket        1
m_core          1
m_topology      SC
load_short      1.90
load_medium     1.86
load_long       1.56
mem_free        117.066406M
swap_free       685.953125M
virtual_free    803.019531M
mem_total       247.605469M
swap_total      729.503906M
virtual_total   977.109375M
mem_used        130.539062M
swap_used       43.550781M
virtual_used    174.089844M
cpu             100.0%

> ./loadcheck -cb
Your SGE Linux version has built-in core binding functionality!
Your Linux kernel version is: 2.6.27-17-generic
Amount of sockets:		1
Amount of cores:		1
Topology:			SC
Mapping of logical socket and core numbers to internal
Internal processor ids for socket     0 core     0:      0

See also: Optimizing Grid Engine for AMD Bulldozer Systems