Julian Kunkel
47695aea49
Merge pull request #180 from jschwartz-cray/fix-179
...
Fix #179 .
2019-08-31 10:32:41 +01:00
Julian Kunkel
11fa33fc9f
Merge pull request #178 from jschwartz-cray/more-debug
...
Added some extra debug.
2019-08-31 10:30:33 +01:00
Josh Schwartz
0e952f0f8c
Fix #181 .
...
On systems where numTasks is not evenly divisible by 'tasksPerNode' we were
seeing some nodes reading multiple files while others read none after
reordering.
Commonly all nodes have the same number of tasks but there is nothing
requiring that to be the case. Imagine having 64 tasks running against 4
nodes which can run 20 tasks each. Here you get three groups of 20 and one
group of 4. On this system nodes running in the group of 4 were previously
getting tasksPerNode of 4 which meant they reordered tasks differently than
the nodes which got tasksPerNode of 20.
The key to fixing this is ensuring that every node reorders tasks the same
way, which means ensuring they all use the same input values. Obviously on
systems where the number of tasks per node is inconsistent the reordering will
also be inconsistent (some tasks may end up on the same node, or not as far
separated as desired, etc.) but at least this way you'll always end up with a
1:1 reordering.
- Renamed nodes/nodeCount to numNodes
- Renamed tasksPerNode to numTasksOnNode0
- Ensured that numTasksOnNode0 will always have the same value regardless of
which node you're on
- Removed inconsistently used globals numTasksWorld and tasksPerNode and
replaced with per-test params equivalents
- Added utility functions for setting these values:
  - numNodes -> GetNumNodes
  - numTasks -> GetNumTasks
  - numTasksOnNode0 -> GetNumTasksOnNode0
- Improved MPI_VERSION < 3 logic for GetNumNodes so it works when numTasks is
not evenly divisible by numTasksOnNode0
- Left 'nodes' and 'tasksPerNode' in output alone to not break compatibility
- Allowed command-line params to override numTasks, numNodes, and
numTasksOnNode0 but default to using the MPI-calculated values
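The 64-task example above can be checked with a small sketch. It assumes a contiguous rank-to-node mapping and a plain modular shift; `shifted_rank`, `fill_shifts`, and `is_one_to_one` are hypothetical helpers written for illustration, not functions from IOR.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical: the rank whose file `rank` reads after reordering. */
static int shifted_rank(int rank, int shift, int num_tasks)
{
    return (rank + shift) % num_tasks;
}

/* Fill shifts[r] with the shift value rank r would use.
 * consistent == false: each node uses its own local task count (old behavior).
 * consistent == true:  every node uses the task count of node 0 (the fix).
 * Assumes ranks are placed contiguously, tasks_on_node0 per node. */
static void fill_shifts(int *shifts, int num_tasks, int tasks_on_node0,
                        bool consistent)
{
    for (int r = 0; r < num_tasks; r++) {
        int first_rank_on_node = (r / tasks_on_node0) * tasks_on_node0;
        int remaining = num_tasks - first_rank_on_node;
        int local = remaining < tasks_on_node0 ? remaining : tasks_on_node0;
        shifts[r] = consistent ? tasks_on_node0 : local;
    }
}

/* True if every target rank is selected by exactly one source rank. */
static bool is_one_to_one(const int *shifts, int num_tasks)
{
    bool ok = true;
    int *hits = calloc((size_t)num_tasks, sizeof *hits);
    if (hits == NULL)
        return false;
    for (int r = 0; r < num_tasks; r++)
        hits[shifted_rank(r, shifts[r], num_tasks)]++;
    for (int t = 0; t < num_tasks; t++)
        if (hits[t] != 1)
            ok = false;
    free(hits);
    return ok;
}
```

With 64 tasks and 20 tasks on node 0, the per-node shift sends both rank 44 (shift 20) and rank 60 (shift 4) to rank 0, so the mapping is not 1:1. Using 20 everywhere makes it a plain rotation.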
2019-08-30 16:45:03 -06:00
Josh Schwartz
4c3d96bfed
Fix #179 .
...
-u (uniqueDir) will once again use the full file path specified by the
client instead of truncating it. This was caused by a broken sprintf
which was trying to read/write overlapping buffers.
From the glibc sprintf() documentation:
"The behavior of this function is undefined if copying takes place
between objects that overlap"
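A minimal sketch of the overlap problem and one way around it. The `"<path>/<rank>"` layout and the helper name `append_rank_dir` are assumptions for illustration only; the commit does not show the actual format string.

```c
#include <stdio.h>
#include <string.h>

/* Undefined behavior: the destination buffer is also a source argument.
 *
 *     sprintf(path, "%s/%d", path, rank);
 */

/* Hypothetical helper: append "/<rank>" to path by formatting into a
 * separate buffer first. Returns 0 on success, or -1 (leaving path
 * untouched) if the result would not fit in path_size bytes. */
static int append_rank_dir(char *path, size_t path_size, int rank)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s/%d", path, rank);

    if (n < 0 || (size_t)n >= sizeof tmp || (size_t)n >= path_size)
        return -1;
    memcpy(path, tmp, (size_t)n + 1); /* include the terminating NUL */
    return 0;
}
```

Formatting into a scratch buffer and copying back keeps source and destination distinct, and the length check avoids silent truncation of the path.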
2019-08-30 15:31:23 -06:00
Josh Schwartz
0bd979637e
Added some extra debug including ERRF, WARNF, and MPI_CHECKF format string macros.
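As a rough illustration of what a format-string diagnostic macro can look like, the sketch below formats into a caller-supplied buffer. `format_diag`, `SKETCH_WARNF`, and `SKETCH_ERRF` are hypothetical names, and the message layout is an assumption; the real ERRF, WARNF, and MPI_CHECKF definitions may differ.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical: build "LEVEL: <formatted message> (file:line)" in out.
 * Returns the message length, or -1 if it does not fit in out_size. */
static int format_diag(char *out, size_t out_size, const char *level,
                       const char *file, int line, const char *fmt, ...)
{
    va_list ap;
    int used, n;

    used = snprintf(out, out_size, "%s: ", level);
    if (used < 0 || (size_t)used >= out_size)
        return -1;

    va_start(ap, fmt);
    n = vsnprintf(out + used, out_size - (size_t)used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= out_size - (size_t)used)
        return -1;
    used += n;

    n = snprintf(out + used, out_size - (size_t)used, " (%s:%d)", file, line);
    if (n < 0 || (size_t)n >= out_size - (size_t)used)
        return -1;
    return used + n;
}

/* The first variadic argument is the printf-style format string. */
#define SKETCH_WARNF(out, size, ...) \
    format_diag((out), (size), "WARNING", __FILE__, __LINE__, __VA_ARGS__)
#define SKETCH_ERRF(out, size, ...) \
    format_diag((out), (size), "ERROR", __FILE__, __LINE__, __VA_ARGS__)
```

A call such as `SKETCH_WARNF(buf, sizeof buf, "rank %d: %s failed", rank, "stat")` records the call site automatically, which is the main benefit over a fixed-string macro.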
2019-08-30 15:11:25 -06:00
Mohamad Chaarawi
0b809b36e2
fix README_DAOS for DFS plugin
...
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-08-30 20:35:15 +00:00
Mohamad Chaarawi
32db1cd902
add timing for container close.
...
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-08-29 03:24:48 +00:00
Mohamad Chaarawi
93730771fd
add some verbose messages on finalize routines for DAOS and DFS drivers.
...
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-08-28 17:01:51 +00:00
Julian M. Kunkel
4df051bf28
New option -Y to invoke the sync command.
2019-08-26 18:57:14 +01:00
Julian M. Kunkel
a4068be551
Improved help for fsync.
2019-08-26 15:57:13 +01:00
Julian M. Kunkel
0d9f46e980
MDTest re-added the -Z option for compatibility (for now) and switched back behavior.
2019-08-15 16:49:46 +01:00
Julian M. Kunkel
de3baf8861
MDTest: Document choice of 42.
2019-08-15 16:21:30 +01:00
Mohamad Chaarawi
b3663bd29a
add sleep depending on MPI rank to avoid all ranks calling daos_fini() at once (issue with PSM2).
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-08-14 13:32:51 +00:00
Mohamad Chaarawi
1320aa279c
add some barriers before cont close and destroy to make sure all ranks are done.
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-08-12 16:04:20 +00:00
Mohamad Chaarawi
8cb878507e
Add dfs chunk_size and oclass options.
...
update dfs_remove for API change.
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-08-12 14:25:05 +00:00
Julian Kunkel
9464fa79f2
Merge pull request #173 from ax3l/fix-earlyFree
...
Fix Last Free
2019-08-10 16:31:22 +01:00
Axel Huebl
5271198eb3
Travis CI: Fix HDF5 Build
...
Fix the CI.
2019-08-06 10:36:35 -07:00
Axel Huebl
70e8b13d1d
Fix Last Free
2019-08-04 18:27:20 -05:00
Julian Kunkel
da5f091afc
Merge pull request #169 from ax3l/fix-singleRankHeapBufferOverflow
...
Fix: Heap Buffer Overflow
2019-08-03 09:20:15 +01:00
Julian Kunkel
5e6a03442f
Merge pull request #170 from ax3l/fix-someMemleaks
...
Fix Some Memory Leaks; Thanks.
2019-08-03 09:19:08 +01:00
Julian Kunkel
b49b21a301
Merge pull request #167 from hpc/feature-verify-mdtest
...
Feature verify mdtest
2019-08-03 09:17:35 +01:00
Julian M. Kunkel
361a3261d1
Updated test patterns
2019-08-03 09:15:34 +01:00
Julian M. Kunkel
c4ff3d7c4e
Trivial fix for #168
2019-08-03 09:12:48 +01:00
Mohamad Chaarawi
f16ef9ace5
update dfs_lookup() call for extra parameter.
...
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-08-03 05:07:13 +00:00
Axel Huebl
b72b51be48
Fix: Heap Buffer Overflow
...
Fix a memory violation when run in serial.
2019-08-02 23:34:13 -05:00
Axel Huebl
bfff0df8fd
Fix Some Memory Leaks
...
Fixing some memory leaks :)
2019-08-02 23:33:01 -05:00
Glenn K. Lockwood
4b2cd69ef7
Merge pull request #171 from ax3l/fix-ciHDF5
...
Travis CI: Fix HDF5 Build
2019-08-02 22:53:23 -05:00
Axel Huebl
f89f338734
Travis CI: Fix HDF5 Build
...
Fix the CI.
2019-08-02 19:35:14 -05:00
Glenn K. Lockwood
a0c5dcec89
Merge pull request #166 from hpc/feature-stonewall-perf-report
...
Feature stonewall perf report
2019-08-02 08:38:37 -05:00
Osamu Tatebe
b1b66962ac
incorrect warning
2019-08-02 13:03:59 +09:00
Julian M. Kunkel
cf56715a5a
Make sure that each read buffer contains an invalid first byte.
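One plausible reading of this change, sketched under assumptions: before each read, the first byte of the buffer is set to a value that cannot match the expected pattern, so stale data from a previous read cannot pass verification if the read silently does nothing. `poison_first_byte` and `verify_buffer` are hypothetical helpers, not mdtest functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: make buf[0] differ from the expected first byte. */
static void poison_first_byte(unsigned char *buf, size_t len,
                              unsigned char expected_first)
{
    if (len > 0)
        buf[0] = (unsigned char)~expected_first; /* ~x never equals x */
}

/* Hypothetical: returns 1 if buf matches the expected pattern, else 0. */
static int verify_buffer(const unsigned char *buf,
                         const unsigned char *expected, size_t len)
{
    return memcmp(buf, expected, len) == 0;
}
```

If the read is skipped or fails without reporting an error, the poisoned byte remains and verification fails instead of passing on leftover contents.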
2019-08-01 18:33:44 +01:00
Julian M. Kunkel
ce1ae750f6
MDtest: Support to verify the read operation with a default pattern.
2019-08-01 18:29:32 +01:00
Julian M. Kunkel
df8355a9bc
Added output of mdtest stonewall timer.
2019-08-01 17:57:45 +01:00
Julian M. Kunkel
061b5a860f
Backmerged: New option: print rate AND time; improves debugging.
2019-08-01 17:54:11 +01:00
Julian M. Kunkel
6c0fadc2a9
Include performance when stonewall is hit to output.
2019-08-01 17:20:01 +01:00
Julian Kunkel
b686b6c26a
Merge pull request #162 from johnbent/master
...
Fixed shift. Also cleaned up output messages
2019-08-01 15:21:26 +01:00
John Bent
3890b71b54
Fixed issues and followed suggestions from Glenn's review of the PR
2019-08-01 09:42:03 +09:00
Mohamad Chaarawi
92939e4fbd
update for DAOS API changes
...
Signed-off-by: Mohamad Chaarawi <mohamad.chaarawi@intel.com>
2019-07-31 17:22:20 +00:00
Glenn K. Lockwood
b025d6bdb3
Merge pull request #153 from glennklockwood/docfixes
...
Updated documentation
2019-07-30 19:29:37 -05:00
John Bent
a3c37808da
Made FAIL take variable args so we can pass printf-like args to it
2019-07-28 11:17:11 -06:00
John Bent
0ffec67d2b
Following Julian's suggestion about better naming
2019-07-28 10:25:42 -06:00
John Bent
b2d486f749
Followed Andreas' suggestion to replace escaped double quotes within printfs with single quotes
2019-07-28 10:07:03 -06:00
John Bent
168a407793
Fixed inconsistent spacing that Andreas commented upon
2019-07-28 09:55:00 -06:00
John Bent
629ff810b7
Got IOR shifting to work regardless of whether node/task mapping is round-robin or contiguous
2019-07-27 15:27:20 -06:00
John Bent
d69957e55b
Final changes cleaning up the output messages
2019-07-27 14:31:49 -06:00
John Bent
e767ef3de9
Remove extraneous print_help function that was causing people to have to edit the same string in two different locations
2019-07-27 13:26:39 -06:00
John Bent
f6491fcd37
Cleaned up the verbose messages by creating a macro and a function
2019-07-27 13:22:15 -06:00
John Bent
9d6480d46e
Fixed bug in the nstride calculation where only rank 0 was computing it correctly. Had to bcast it out
2019-07-27 11:30:07 -06:00
John Bent
945487f743
Better debug message when stat fails
2019-07-27 09:20:20 -06:00
John Bent
524d053be1
Making shift work in mdtest as well as it works in IOR and on a per-node basis.
...
Also added printing the nodemap so we can check the allocation.
2019-07-26 08:55:24 -06:00