GPFS supports a "gpfs_fcntl" method for hinting various things,
including "I'm about to write this block of data". Let's see if, for
the cost of a few system calls, we can wrangle the GPFS locking system
into allowing concurrent access with less overhead. (new IOR parameter
gpfsHintAccess)
Also, drop all locks on a file immediately after open/creation in the
shared file case, since we know all processes will touch unique regions
of the file. Whether releasing all file locks right after opening is a
good idea remains to be seen: processes then have to re-acquire locks
they already held. (new IOR parameter gpfsReleaseToken)
Improve the scalability of CountTasksPerNode() by using
a Broadcast and AllReduce, rather than flooding task zero
with MPI_Send() messages.
Also change the hostname lookup function from MPI_Get_processor_name()
to gethostname(), which should work on most systems that I know of,
including BlueGene/Q.
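
A minimal sketch of a gethostname()-based lookup follows. It is not the
code from this commit; the helper name get_local_hostname and the buffer
size HOSTNAME_BUF_LEN are hypothetical and chosen only for illustration.

```c
/* Request POSIX declarations (gethostname) when built in strict ISO C mode. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <stdio.h>
#include <unistd.h>

#define HOSTNAME_BUF_LEN 256  /* hypothetical buffer size, not from IOR */

/* Copy the local hostname into buf. Returns 0 on success, -1 on failure. */
static int get_local_hostname(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (gethostname(buf, len) != 0) {
        perror("gethostname");
        return -1;
    }
    /* POSIX leaves NUL termination unspecified when the name is truncated,
     * so terminate the buffer explicitly. */
    buf[len - 1] = '\0';
    return 0;
}
```

A caller would declare char name[HOSTNAME_BUF_LEN], pass it with
sizeof(name), and then compare or hash the resulting strings to group
tasks by node.
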
Remove AC_FUNC_MALLOC from configure.ac to allow compilation
on BG/P systems. This check can fail in cross-compilation environments,
which unnecessarily forces autoconf to require an rpl_malloc()
replacement for malloc(). We could implement the conditional addition
of rpl_malloc(), but removing AC_FUNC_MALLOC is a quick workaround.
Fixes #4
Allows every task to allocate a specified amount of memory as
a rough simulation of a real application's memory usage.
Every page of the allocated memory is touched to defeat lazy
memory allocation.
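
The page-touching idea can be sketched as below. This is an illustration
under stated assumptions, not the patch itself: the function name
alloc_and_touch is hypothetical, and the page size is taken from
sysconf(_SC_PAGESIZE).

```c
/* Request POSIX declarations (sysconf) when built in strict ISO C mode. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate `bytes` of memory and write one byte into every page the
 * buffer spans, so the kernel has to back the whole allocation with
 * real memory. Returns NULL if the allocation fails; the caller frees
 * the buffer with free(). */
static char *alloc_and_touch(size_t bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);  /* assumption: the page size query succeeds */

    char *buf = malloc(bytes);
    if (buf == NULL)
        return NULL;

    /* One write per page-sized stride from the start of the buffer. */
    for (size_t off = 0; off < bytes; off += (size_t)page)
        buf[off] = 1;

    /* malloc() does not return page-aligned memory, so the final page
     * may be missed by the stride above; touch the last byte as well. */
    if (bytes > 0)
        buf[bytes - 1] = 1;

    return buf;
}
```

Because the buffer is returned to the caller, the compiler cannot discard
the writes. Holding on to the pointer for the duration of the run keeps
the memory resident alongside the I/O buffers.
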
Original patch by Michael Kluge <michael.kluge@tu-dresden.de>
Only print total summary after all tests run.
Put calculated results from each iteration of a test in a separate
IOR_results_t structure. Clean up the allocation and freeing code
for these calculated bits, which allows us to hang onto the results
until the end of all tests. That in turn allows us to perform one
big summary at the end of all of the tests.