On systems where numTasks is not evenly divisible by 'tasksPerNode' we were
seeing some nodes reading multiple files while others read none after
reordering.
Commonly all nodes have the same number of tasks but there is nothing
requiring that to be the case. Imagine having 64 tasks running against 4
nodes which can run 20 tasks each. Here you get three groups of 20 and one
group of 4. On this sytem nodes running in the group of 4 were previously
getting tasksPerNode of 4 which meant they reordered tasks differently than
the nodes which got tasksPerNode of 20.
The key to fixing this is ensuring that every node reorders tasks the same
way, which means ensuring they all use the same input values. Obviously on
systems where the number of tasks per node is inconsistent the reordering will
also be inconsistent (some tasks may end up on the same node, or not as far
separated as desired, etc.) but at least this way you'll always end up with a
1:1 reordering.
- Renamed nodes/nodeCount to numNodes
- Renamed tasksPerNode to numTasksOnNode0
- Ensured that numTasksOnNode0 will always have the same value regardless of
which node you're on
- Removed inconsistently used globals numTasksWorld and tasksPerNode and
replaced with per-test params equivalents
- Added utility functions for setting these values:
- numNodes -> GetNumNodes
- numTasks -> GetNumTasks
- numTasksOnNode0 -> GetNumNodesOnTask0
- Improved MPI_VERSION < 3 logic for GetNumNodes so it works when numTasks is
not evenly divisible by numTasksOnNode0
- Left 'nodes' and 'tasksPerNode' in output alone to not break compatibility
- Allowed command-line params to override numTasks, numNodes, and
numTasksOnNode0 but default to using the MPI-calculated values
Changed the parser to fix#98. Does _not_ yet have the HDF5 backend args updated. They were reverted to allow this to make the 3.2 release and should be re-applied as a separate commit.
Currently, calculation for barriers case looks a bit
suprising, for example, Min stats is from min of
Max value of number of mpi proc, this makes Min
and Max very similar in our testing thus we get
a small Std value too.
I am not a MPI expert, but question is why we
don't use more nature calcuation which calculate
all results from different iterations and MPI proc?
Signed-off-by: Wang Shilong <wshilong@ddn.com>
IOR uses the GetTimeStamp() wrapper to allow gettimeofday() to be used for timings instead of MPI_Wtime() to calculate throughput. This commits adds the same logic to mdtest timings.
The parser now supports concurrent parsing of all plugin options.
Moved HDF5 collective_md option into the backend as an example.
Example: ./src/ior -a dummy --dummy.delay-xfer=50000
This patch enables correct compilation on both MacOS and Linux using only
POSIX.1-2008-compliant C with XSI extensions.
Specifically, POSIX.1-2008 is the minimum version because we use strdup(3);
explicit XSI is required to expose putenv from glibc.
Context: Some backends may have used different names in
the past (like IME backend use to be IM). Legacy scripts
may break.
This patch adds a legacy name option in the aiori structure.
Both name and legacy name work to select the interface.
But the following warning is printed if the legacy name is used:
ior WARNING: [legacy name] backend is deprecated use [name] instead.
- update cmd line options to add DAOS Pool and Container uuid and SVCL
- Add init/finalize backend functions.
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
Therefore, it uses the -I argument (number of files per directory) and divides it by -n (items).
This will be a number of sub directories that are created on the top-level.
The file "src/aiori-NCMPI.c" uses numTasksWorld as the process which is declared in "src/ior.h". In "src/mdtest.c", NCMPI backend will be called if IOR configured with ncmpi enabled but numTasksWorld was not defined in "src/mdtest.c". So it will cause a compiler error like below:
mdtest-aiori-NCMPI.o: In function `NCMPI_Xfer':
/home/parallels/Documents/ior/src/aiori-NCMPI.c:272: undefined reference to `numTasksWorld'
This commit makes changes to the AIORI backends to add support for
abstacting statfs, mkdir, rmdir, stat, and access. These new
abstractions are used by a modified mdtest. Some changes:
- Require C99. Its 2017 and most compilers now support C11. The
benefits of using C99 include subobject naming (for aiori backend
structs), and fixed size integers (uint64_t). There is no reason to
use the non-standard long long type.
- Moved some of the aiori code into aiori.c so it can be used by both
mdtest and ior.
- Code cleanup of mdtest. This is mostly due to the usage of the IOR
backends rather than a mess of #if code.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>