/* -*- mode: c; c-basic-offset: 8; indent-tabs-mode: nil; -*-
 * vim:expandtab:shiftwidth=8:tabstop=8:
 */
/******************************************************************************\
*                                                                              *
*       Copyright (c) 2003, The Regents of the University of California        *
*     See the file COPYRIGHT for a complete copyright notice and license.      *
*                                                                              *
********************************************************************************
*
* Definitions and prototypes of abstract I/O interface
*
\******************************************************************************/

#ifndef _AIORI_H
#define _AIORI_H

#include <mpi.h>

#ifndef MPI_FILE_NULL
#   include <mpio.h>
#endif /* not MPI_FILE_NULL */

#include "ior.h"
#include "iordef.h"                                  /* IOR Definitions */

/*************************** D E F I N I T I O N S ****************************/

/* -- file open flags -- */
#define IOR_RDONLY        0x01    /* read only */
#define IOR_WRONLY        0x02    /* write only */
#define IOR_RDWR          0x04    /* read/write */
#define IOR_APPEND        0x08    /* append */
#define IOR_CREAT         0x10    /* create */
#define IOR_TRUNC         0x20    /* truncate */
#define IOR_EXCL          0x40    /* exclusive */
#define IOR_DIRECT        0x80    /* bypass I/O buffers */

/* -- file mode flags -- */
#define IOR_IRWXU         0x0001  /* read, write, execute perm: owner */
#define IOR_IRUSR         0x0002  /* read permission: owner */
#define IOR_IWUSR         0x0004  /* write permission: owner */
#define IOR_IXUSR         0x0008  /* execute permission: owner */
#define IOR_IRWXG         0x0010  /* read, write, execute perm: group */
#define IOR_IRGRP         0x0020  /* read permission: group */
#define IOR_IWGRP         0x0040  /* write permission: group */
#define IOR_IXGRP         0x0080  /* execute permission: group */
#define IOR_IRWXO         0x0100  /* read, write, execute perm: other */
#define IOR_IROTH         0x0200  /* read permission: other */
#define IOR_IWOTH         0x0400  /* write permission: other */
#define IOR_IXOTH         0x0800  /* execute permission: other */

/* Table of entry points that each I/O backend provides. */
typedef struct ior_aiori {
        char *name;                             /* backend name */
        void *(*create)(char *, IOR_param_t *); /* returns a file handle */
        void *(*open)(char *, IOR_param_t *);   /* returns a file handle */
        IOR_offset_t (*xfer)(int, void *, IOR_size_t *,
                             IOR_offset_t, IOR_param_t *);
        void (*close)(void *, IOR_param_t *);
        void (*delete)(char *, IOR_param_t *);
        void (*set_version)(IOR_param_t *);
        void (*fsync)(void *, IOR_param_t *);
        IOR_offset_t (*get_file_size)(IOR_param_t *, MPI_Comm, char *);
} ior_aiori_t;

The algorithms 'S3', 'S3_plus', and 'S3_EMC' are all available.

These are variants of S3. S3 uses the "pure" S3 interface, e.g. using
Multi-Part-Upload. The "plus" variant enables the EMC extensions in the
aws4c library. This allows the N:N case to use "append" when
"transfer_size" != "block_size" for IOR. In pure S3, that N:N case will
fail, because the EMC extensions are not enabled, and appending (which
relies on the EMC byte-range tricks) will throw an error. In the S3_EMC
alg, N:1 uses EMC's other byte-range tricks to write different parts of an
N:1 file, and also uses append to write the parts of an N:N file.
Preliminary tests suggest these EMC extensions improve BW by ~20%.

I put all three algs in aiori-S3.c, because it seemed some code was getting
reused. Not sure if that's still going to make sense after the TBD, below.

TBD: Recently realized that the "pure" S3 shouldn't be trying to use
appends for anything. In the N:N case, it should just use MPU within each
file. Then there's no need for S3_plus. We just have S3, which does MPU
for all writes where transfer_size != block_size, and uses (standard)
byte-range reads for reading. Then S3_EMC uses appends for N:N writes,
and byte-range writes for N:1 writes. This separates the code for the two
algs a little more, but we might still want them in the same file.
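
The split proposed in the TBD can be sketched as a small decision function. This is only an illustration of the plan described above, not code from aiori-S3.c: the enum names and `choose_write_strategy` are hypothetical, and the single-PUT case for transfer_size == block_size in N:N is an assumption (the message only says what happens when the sizes differ).

```c
/* Hypothetical names for illustration; none of these exist in IOR. */
typedef enum { ALG_S3, ALG_S3_EMC } s3_alg_t;
typedef enum { LAYOUT_N_TO_1, LAYOUT_N_TO_N } layout_t;
typedef enum {
        WRITE_SINGLE_PUT,          /* assumption: one PUT per object */
        WRITE_MULTI_PART_UPLOAD,   /* standard S3 MPU */
        WRITE_EMC_APPEND,          /* EMC extension: append to object */
        WRITE_EMC_BYTE_RANGE       /* EMC extension: byte-range write */
} write_strategy_t;

/*
 * Pick a write strategy following the TBD plan:
 *   S3:     MPU for N:1, and for N:N when transfer_size != block_size.
 *   S3_EMC: byte-range writes for N:1, appends for N:N.
 * Assumption: an N:N write with transfer_size == block_size is a
 * single PUT under either algorithm.
 */
static write_strategy_t choose_write_strategy(s3_alg_t alg, layout_t layout,
                                              long long transfer_size,
                                              long long block_size)
{
        int one_write_per_file = (layout == LAYOUT_N_TO_N)
                                 && (transfer_size == block_size);

        if (one_write_per_file)
                return WRITE_SINGLE_PUT;

        if (alg == ALG_S3)
                return WRITE_MULTI_PART_UPLOAD;

        /* ALG_S3_EMC */
        return (layout == LAYOUT_N_TO_1) ? WRITE_EMC_BYTE_RANGE
                                         : WRITE_EMC_APPEND;
}
```

With only two algorithms left, a backend could call such a function once per test and branch on the result, which may make it easier to see how much code the two variants still share.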
2014-10-30 01:04:30 +03:00
extern ior_aiori_t hdf5_aiori;
extern ior_aiori_t hdfs_aiori;
extern ior_aiori_t mpiio_aiori;
extern ior_aiori_t ncmpi_aiori;
extern ior_aiori_t posix_aiori;
extern ior_aiori_t plfs_aiori;
extern ior_aiori_t s3_aiori;
extern ior_aiori_t s3_plus_aiori;
extern ior_aiori_t s3_emc_aiori;
S3 with Multi-Part Upload for N:1 is working.

We are testing on our EMC ViPR installation, so some EMC extensions are
also available. For example, EMC supports a special "byte-range"
header-option ("Range: bytes=-1-") which allows appending to an object.
This is not needed for N:1 (where every write creates an independent part),
but is vital for N:N (where every write is considered an append, unless
"transfer-size" is the same as "block-size").

We also use a LANL-extended implementation of aws4c 0.5, which provides
some special features and allows greater efficiency. It is included in
this commit as a tarball. Untar it somewhere else and build it to produce
a library, which is then linked with IOR (configure with --with-S3).

TBD: EMC also supports a simpler alternative to Multi-Part Upload, which
appears to have several advantages. We'll add that next, but wanted to
capture this as is, before I break it.
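
The append rule above can be sketched as a helper that emits the "Range: bytes=-1-" header only when a write counts as an append. This is a minimal sketch under stated assumptions: `emc_append_header` and its parameters are hypothetical, it is not part of aws4c or IOR, and how the header is actually attached to a request depends on the LANL-extended aws4c, which is not shown here.

```c
#include <stddef.h>
#include <stdio.h>

/* Header text quoted in the commit message above. */
#define EMC_APPEND_RANGE_HEADER "Range: bytes=-1-"

/*
 * Hypothetical helper: write the EMC append header into buf when the
 * write should be treated as an append, i.e. in the N:N
 * (file-per-process) case with transfer_size != block_size.
 *
 * Returns the header length, 0 if no append header is needed
 * (buf is set to ""), or -1 if buf is too small.
 */
static int emc_append_header(char *buf, size_t len, int file_per_process,
                             long long transfer_size, long long block_size)
{
        if (buf == NULL || len == 0)
                return -1;
        buf[0] = '\0';

        /* N:1 writes are independent parts; equal sizes mean one write. */
        if (!file_per_process || transfer_size == block_size)
                return 0;

        int n = snprintf(buf, len, "%s", EMC_APPEND_RANGE_HEADER);
        if (n < 0 || (size_t)n >= len)
                return -1;
        return n;
}
```

A caller would check for a positive return value before adding the header to the outgoing PUT request.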
2014-10-27 22:16:20 +03:00

IOR_offset_t MPIIO_GetFileSize(IOR_param_t * test, MPI_Comm testComm,
                               char *testFileName);

#endif /* not _AIORI_H */