On systems where numTasks is not evenly divisible by 'tasksPerNode' we were
seeing some nodes reading multiple files while others read none after
reordering.
Commonly all nodes have the same number of tasks but there is nothing
requiring that to be the case. Imagine having 64 tasks running against 4
nodes which can run 20 tasks each. Here you get three groups of 20 and one
group of 4. On this sytem nodes running in the group of 4 were previously
getting tasksPerNode of 4 which meant they reordered tasks differently than
the nodes which got tasksPerNode of 20.
The key to fixing this is ensuring that every node reorders tasks the same
way, which means ensuring they all use the same input values. Obviously on
systems where the number of tasks per node is inconsistent the reordering will
also be inconsistent (some tasks may end up on the same node, or not as far
separated as desired, etc.) but at least this way you'll always end up with a
1:1 reordering.
- Renamed nodes/nodeCount to numNodes
- Renamed tasksPerNode to numTasksOnNode0
- Ensured that numTasksOnNode0 will always have the same value regardless of
which node you're on
- Removed inconsistently used globals numTasksWorld and tasksPerNode and
replaced with per-test params equivalents
- Added utility functions for setting these values:
- numNodes -> GetNumNodes
- numTasks -> GetNumTasks
- numTasksOnNode0 -> GetNumNodesOnTask0
- Improved MPI_VERSION < 3 logic for GetNumNodes so it works when numTasks is
not evenly divisible by numTasksOnNode0
- Left 'nodes' and 'tasksPerNode' in output alone to not break compatibility
- Allowed command-line params to override numTasks, numNodes, and
numTasksOnNode0 but default to using the MPI-calculated values
The O_DIRECT option was not working as set_o_direct_flag() were moved to
utilities.c but there the #define _GNU_SOURCE where missing. This lead
to not the Waring "cannot use O_DIRECT".
If a hintfile contains e.g. cb_buffer_size = 1234, IOR will try to set
the hint "cb_buffer_size " (note trailing space), a hint that no MPI
implementation actually supports.
I saw run in which I caught an MPI task hanging in ctime() here. Swiching
to ctime_r() fixes that. This function is only called form rank==0, but it
hangs anyway.
This provides an HDFS back-end, allowing IOR to exercise a Hadoop
Distributed File-System, plus corresponding changes throughout, to
integrate the new module into the build. The commit compiles at LANL, but
hasn't been run yet. We're currently waiting for some configuration on
machines that will eventually provide HDFS. By default, configure ignores
the HDFS module. You have to explicitly add --with-hdfs.
Only print total summary after all tests run.
Put calculated results from each iteration of a test in a separate
IOR_results_t structure. Clean up the allocation and freeing code
for these caluclated bits, which allowing us to hang onto the results
until the end of all tests. That in turn allows us to perform one
big summary at the end of all of the tests.
Clean up the header files to only contain those things that
need to be shared between .c files.
Functions that are not shared are now declared static to
make their file scope explicit. Functions that ARE shared
are declared in appropriate headers.
I am not going to claim that I caugh everything, but at
least it is a good start.