Compare commits

...

2 Commits

Author SHA1 Message Date
LaureVergeron 9de151abeb ZNC-52: FT: Add Antora support 2018-04-13 13:50:39 +02:00
LaureVergeron 5e62766a9c ZNC-22: DOC: Add developer bootstrap guide 2018-04-13 13:50:24 +02:00
30 changed files with 3724 additions and 3304 deletions

View File

@ -1,989 +0,0 @@
.. role:: raw-latex(raw)
:format: latex
..
Architecture
++++++++++++
Versioning
==========
This document describes Zenko CloudServer's support for the AWS S3 Bucket
Versioning feature.
AWS S3 Bucket Versioning
------------------------
See AWS documentation for a description of the Bucket Versioning
feature:
- `Bucket
Versioning <http://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html>`__
- `Object
Versioning <http://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectVersioning.html>`__
This document assumes familiarity with the details of Bucket Versioning,
including null versions and delete markers, described in the above
links.
Implementation of Bucket Versioning in Zenko CloudServer
---------------------------------------------------------
Overview of Metadata and API Component Roles
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Each version of an object is stored as a separate key in metadata. The
S3 API interacts with the metadata backend to store, retrieve, and
delete version metadata.
The implementation of versioning within the metadata backend is naive.
The metadata backend does not evaluate any information about bucket or
version state (whether versioning is enabled or suspended, and whether a
version is a null version or delete marker). The S3 front-end API
manages the logic regarding versioning information, and sends
instructions to metadata to handle the basic CRUD operations for version
metadata.
The role of the S3 API can be broken down into the following:
- put and delete version data
- store extra information about a version, such as whether it is a
delete marker or null version, in the object's metadata
- send instructions to metadata backend to store, retrieve, update and
delete version metadata based on bucket versioning state and version
metadata
- encode version ID information to return in responses to requests, and
decode version IDs sent in requests
The implementation of Bucket Versioning in S3 is described in this
document in two main parts. The first section, `"Implementation of
Bucket Versioning in
Metadata" <#implementation-of-bucket-versioning-in-metadata>`__,
describes the way versions are stored in metadata, and the metadata
options for manipulating version metadata.
The second section, `"Implementation of Bucket Versioning in
API" <#implementation-of-bucket-versioning-in-api>`__, describes the way
the metadata options are used in the API within S3 actions to create new
versions, update their metadata, and delete them. The management of null
versions and creation of delete markers are also described in this
section.
Implementation of Bucket Versioning in Metadata
-----------------------------------------------
As mentioned above, each version of an object is stored as a separate
key in metadata. We use version identifiers as the suffix for the keys
of the object versions, and a special version (the `"Master
Version" <#master-version>`__) to represent the latest version.
An example of what the metadata keys might look like for an object
``foo/bar`` with three versions (with ``.`` representing a null character):
+------------------------------------------------------+
| key |
+======================================================+
| foo/bar |
+------------------------------------------------------+
| foo/bar.098506163554375999999PARIS 0.a430a1f85c6ec |
+------------------------------------------------------+
| foo/bar.098506163554373999999PARIS 0.41b510cd0fdf8 |
+------------------------------------------------------+
| foo/bar.098506163554373999998PARIS 0.f9b82c166f695 |
+------------------------------------------------------+
The most recent version created is represented above in the key
``foo/bar`` and is the master version. This special version is described
further in the section `"Master Version" <#master-version>`__.
Version ID and Metadata Key Format
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The version ID is generated by the metadata backend, and encoded in a
hexadecimal string format by S3 before sending a response to a request.
S3 also decodes the hexadecimal string received from a request before
sending to metadata to retrieve a particular version.
The format of a ``version_id`` is: ``ts`` ``rep_group_id`` ``seq_id``
where:
- ``ts``: is the combination of epoch and an increasing number
- ``rep_group_id``: is the name of deployment(s) considered one unit
used for replication
- ``seq_id``: is a unique value based on metadata information.
The format of a key in metadata for a version is:
``object_name separator version_id`` where:
- ``object_name``: is the key of the object in metadata
- ``separator``: we use the ``null`` character (``0x00`` or ``\0``) as
the separator between the ``object_name`` and the ``version_id`` of a
key
- ``version_id``: is the version identifier; this encodes the ordering
information in the format described above as metadata orders keys
alphabetically
An example of a key in metadata:
``foo\01234567890000777PARIS 1234.123456`` indicating that this specific
version of ``foo`` was the ``000777``\ th entry created during the epoch
``1234567890`` in the replication group ``PARIS`` with ``1234.123456``
as ``seq_id``.
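A minimal sketch of this key scheme and of the hexadecimal encoding, with
hypothetical helper names (the real implementation lives in the metadata
backend and its encoding may differ):

.. code:: python

    # Sketch only: illustrates the key layout and version ID encoding described above.
    VERSION_SEPARATOR = '\x00'  # null character between object_name and version_id

    def build_version_key(object_name, version_id):
        """Compose the metadata key for a specific version of an object."""
        return object_name + VERSION_SEPARATOR + version_id

    def split_version_key(key):
        """Split a metadata key into (object_name, version_id or None)."""
        object_name, sep, version_id = key.partition(VERSION_SEPARATOR)
        return object_name, (version_id if sep else None)

    def encode_version_id(version_id):
        """Hex-encode an internal version ID before returning it in a response."""
        return version_id.encode('utf-8').hex()

    def decode_version_id(encoded):
        """Decode a hex version ID received in a request."""
        return bytes.fromhex(encoded).decode('utf-8')

    key = build_version_key('foo', '1234567890000777PARIS 1234.123456')
    print(split_version_key(key))    # ('foo', '1234567890000777PARIS 1234.123456')
    print(split_version_key('foo'))  # ('foo', None) -- a master version key
    assert decode_version_id(encode_version_id('abc')) == 'abc'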
Master Version
~~~~~~~~~~~~~~
We store a copy of the latest version of an object's metadata using
``object_name`` as the key; this version is called the master version.
The master version of each object facilitates the standard GET
operation, which would otherwise need to scan among the list of versions
of an object for its latest version.
The following table shows the layout of all versions of ``foo`` in the
first example stored in the metadata (with dot ``.`` representing the
null separator):
+----------+---------+
| key | value |
+==========+=========+
| foo | B |
+----------+---------+
| foo.v2 | B |
+----------+---------+
| foo.v1 | A |
+----------+---------+
Metadata Versioning Options
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Zenko CloudServer sends instructions to the metadata engine about whether to
create a new version or overwrite, retrieve, or delete a specific
version by sending values for special options in PUT, GET, or DELETE
calls to metadata. The metadata engine can also list versions in the
database, which is used by Zenko CloudServer to list object versions.
These options cover only the basic CRUD operations that the metadata engine
can handle. How these options are used by the S3 API to generate and
update versions is described more comprehensively in `"Implementation of
Bucket Versioning in
API" <#implementation-of-bucket-versioning-in-api>`__.
Note: all operations (PUT and DELETE) that generate a new version of an
object will return the ``version_id`` of the new version to the API.
PUT
^^^
- no options: original PUT operation, will update the master version
- ``versioning: true`` create a new version of the object, then update
the master version with this version.
- ``versionId: <versionId>`` create or update a specific version (for updating
a version's ACL or tags, or for remote updates in geo-replication)
- if the version identified by ``versionId`` happens to be the latest
version, the master version will be updated as well
- if the master version is not as recent as the version identified by
``versionId``, as may happen with cross-region replication, the master
will be updated as well
- note that with ``versionId`` set to an empty string ``''``, it will
overwrite the master version only (same as no options, but the master
version will have a ``versionId`` property set in its metadata like
any other version). The ``versionId`` will never be exposed to an
external user, but setting this internal-only ``versionId`` enables
Zenko CloudServer to find this version later if it is no longer the master.
This option of ``versionId`` set to ``''`` is used for creating null
versions once versioning has been suspended, which is discussed in
`"Null Version Management" <#null-version-management>`__.
In general, only one option is used at a time. When ``versionId`` and
``versioning`` are both set, only the ``versionId`` option will have an effect.
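As an illustration, the precedence rule and the meaning of each option can be
summed up in a small sketch (the function and the returned strings are
illustrative, not the actual CloudServer code):

.. code:: python

    # Sketch only: resolves which single metadata PUT option takes effect,
    # following the precedence rule above (versionId wins over versioning).
    def effective_put_option(options):
        """Given the options set by the API, describe what metadata will do."""
        if 'versionId' in options:
            if options['versionId'] == '':
                return 'overwrite the master version only (null-version style put)'
            return ('create/update version %s (and the master version if needed)'
                    % options['versionId'])
        if options.get('versioning') is True:
            return 'create a new version, then update the master version'
        return 'plain put: update the master version'

    print(effective_put_option({'versioning': True}))
    print(effective_put_option({'versionId': ''}))
    print(effective_put_option({'versionId': '0987PARIS 0.abc', 'versioning': True}))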
DELETE
^^^^^^
- no options: original DELETE operation, will delete the master version
- ``versionId: <versionId>`` delete a specific version
A deletion targeting the latest version of an object has to:
- delete the specified version identified by ``versionId``
- replace the master version with a version that is a placeholder for
deletion
- this version contains a special keyword, 'isPHD', to indicate the
master version was deleted and needs to be updated
- initiate a repair operation to update the value of the master
version:
- this involves listing the versions of the object and getting the latest
version to replace the placeholder delete version
- if no more versions exist, metadata deletes the master version,
removing the key from metadata
Note: all of this happens in metadata before responding to the front-end API,
and only when the metadata engine is instructed by Zenko CloudServer to delete
a specific version or the master version.
See section `"Delete Markers" <#delete-markers>`__ for a description of what
happens when a Delete Object request is sent to the S3 API.
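A rough sketch of the repair flow described above, using an in-memory dict in
place of the metadata store (names and structure are illustrative only):

.. code:: python

    # Sketch only: master-version repair after deleting the latest version.
    SEP = '\x00'

    def delete_version(store, object_name, version_id):
        """Delete a specific version and repair the master key if needed."""
        store.pop(object_name + SEP + version_id, None)
        master = store.get(object_name)
        if master is not None and master.get('versionId') == version_id:
            # The deleted version was the latest: mark the master as a placeholder...
            store[object_name] = {'isPHD': True}
            # ...then repair it from the remaining versions, if any. Real version
            # IDs are generated so that the latest version sorts first.
            remaining = sorted(k for k in store if k.startswith(object_name + SEP))
            if remaining:
                store[object_name] = store[remaining[0]]
            else:
                del store[object_name]

    store = {
        'foo': {'versionId': 'v2', 'value': 'B'},
        'foo' + SEP + 'v2': {'versionId': 'v2', 'value': 'B'},
        'foo' + SEP + 'v1': {'versionId': 'v1', 'value': 'A'},
    }
    delete_version(store, 'foo', 'v2')
    print(store['foo'])  # repaired from the remaining version: {'versionId': 'v1', 'value': 'A'}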
GET
^^^
- no options: original GET operation, will get the master version
- ``versionId: <versionId>`` retrieve a specific version
The implementation of a GET operation does not change compared to the
standard version. A standard GET without versioning information would
get the master version of a key. A version-specific GET would retrieve
the specific version identified by the key for that version.
LIST
^^^^
For a standard LIST on a bucket, metadata iterates through the keys by
using the separator (``\0``, represented by ``.`` in examples) as an
extra delimiter. For a listing of all versions of a bucket, there is no
change compared to the original listing function. Instead, the API
component returns all the keys in a List Objects call and filters for
just the keys of the master versions in a List Object Versions call.
For example, a standard LIST operation against the keys in the table below
would return from metadata the list of
``[ foo/bar, bar, qux/quz, quz ]``.
+--------------+
| key |
+==============+
| foo/bar |
+--------------+
| foo/bar.v2 |
+--------------+
| foo/bar.v1 |
+--------------+
| bar |
+--------------+
| qux/quz |
+--------------+
| qux/quz.v2 |
+--------------+
| qux/quz.v1 |
+--------------+
| quz |
+--------------+
| quz.v2 |
+--------------+
| quz.v1 |
+--------------+
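A minimal sketch of the master-key filtering illustrated by the table above
(the production listing algorithms live in Arsenal; this only makes the role
of the separator concrete):

.. code:: python

    # Sketch only: a standard LIST keeps master keys by treating the null
    # separator as an extra delimiter that hides version keys.
    SEP = '\x00'

    def list_master_keys(keys):
        """Keep only master-version keys (those without a version suffix)."""
        return [k for k in keys if SEP not in k]

    keys = [
        'foo/bar', 'foo/bar' + SEP + 'v2', 'foo/bar' + SEP + 'v1',
        'bar',
        'qux/quz', 'qux/quz' + SEP + 'v2', 'qux/quz' + SEP + 'v1',
        'quz', 'quz' + SEP + 'v2', 'quz' + SEP + 'v1',
    ]
    print(list_master_keys(keys))  # ['foo/bar', 'bar', 'qux/quz', 'quz']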
Implementation of Bucket Versioning in API
------------------------------------------
Object Metadata Versioning Attributes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To access all the information needed to properly handle all cases that
may exist in versioned operations, the API stores certain
versioning-related information in the metadata attributes of each
version's object metadata.
These are the versioning-related metadata properties:
- ``isNull``: whether the version being stored is a null version.
- ``nullVersionId``: the unencoded version ID of the latest null
version that existed before storing a non-null version.
- ``isDeleteMarker``: whether the version being stored is a delete
marker.
The metadata engine also sets one additional metadata property when
creating the version.
- ``versionId``: the unencoded version ID of the version being stored.
Null versions and delete markers are described in further detail in
their own subsections.
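For illustration, the versioning-related part of a stored version's metadata
could be pictured as follows (the attribute names come from the list above;
the surrounding structure is hypothetical):

.. code:: python

    # Sketch only: versioning-related attributes a stored version might carry.
    version_metadata = {
        'versionId': '098506163554375999999PARIS 0.a430a1f85c6ec',  # set by metadata
        'isNull': False,          # True if this version is a null version
        'isDeleteMarker': False,  # True if this version is a delete marker
        'nullVersionId': None,    # unencoded ID of the latest null version that
                                  # existed before this non-null version was stored
        # ... plus the regular, non-versioning object metadata
    }
    print(sorted(version_metadata))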
Creation of New Versions
~~~~~~~~~~~~~~~~~~~~~~~~
When versioning is enabled in a bucket, APIs which normally result in
the creation of objects, such as Put Object, Complete Multipart Upload
and Copy Object, will generate new versions of objects.
Zenko CloudServer creates a new version and updates the master version using the
``versioning: true`` option in PUT calls to the metadata engine. As an
example, when two consecutive Put Object requests are sent to the Zenko
CloudServer for a versioning-enabled bucket with the same key names, there
are two corresponding metadata PUT calls with the ``versioning`` option
set to true.
The PUT calls to metadata and resulting keys are shown below:
(1) PUT foo (first put), versioning: ``true``
+----------+---------+
| key | value |
+==========+=========+
| foo | A |
+----------+---------+
| foo.v1 | A |
+----------+---------+
(2) PUT foo (second put), versioning: ``true``
+----------+---------+
| key | value |
+==========+=========+
| foo | B |
+----------+---------+
| foo.v2 | B |
+----------+---------+
| foo.v1 | A |
+----------+---------+
Null Version Management
^^^^^^^^^^^^^^^^^^^^^^^
In a bucket without versioning, or when versioning is suspended, putting
an object with the same name twice should result in the previous object
being overwritten. This is managed with null versions.
Only one null version should exist at any given time, and it is
identified in Zenko CloudServer requests and responses with the version
id "null".
Case 1: Putting Null Versions
'''''''''''''''''''''''''''''
With respect to metadata, since the null version is overwritten by
subsequent null versions, the null version is initially stored in the
master key alone, as opposed to being stored in the master key and a new
version. Zenko CloudServer checks if versioning is suspended or has never been
configured, and sets the ``versionId`` option to ``''`` in PUT calls to
the metadata engine when creating a new null version.
If the master version is a null version, Zenko CloudServer also sends a DELETE
call to metadata prior to the PUT, in order to clean up any pre-existing null
versions which may, in certain edge cases, have been stored as a separate
version. [1]_
The tables below summarize the calls to metadata and the resulting keys if
we put an object 'foo' twice, when versioning has not been enabled or is
suspended.
(1) PUT foo (first put), versionId: ``''``
+--------------+---------+
| key | value |
+==============+=========+
| foo (null) | A |
+--------------+---------+
(2A) DELETE foo (clean-up delete before second put),
versionId: ``<version id of master version>``
+--------------+---------+
| key | value |
+==============+=========+
| | |
+--------------+---------+
(2B) PUT foo (second put), versionId: ``''``
+--------------+---------+
| key | value |
+==============+=========+
| foo (null) | B |
+--------------+---------+
The S3 API also sets the ``isNull`` attribute to ``true`` in the version
metadata before storing the metadata for these null versions.
.. [1] Some examples of these cases are: (1) when there is a null version
that is the second-to-latest version, and the latest version has been
deleted, causing metadata to repair the master value with the value of
the null version and (2) when putting object tag or ACL on a null
version that is the master version, as explained in `"Behavior of
Object-Targeting APIs" <#behavior-of-object-targeting-apis>`__.
Case 2: Preserving Existing Null Versions in Versioning-Enabled Bucket
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Null versions are preserved when new non-null versions are created after
versioning has been enabled or re-enabled.
If the master version is the null version, the S3 API preserves the
current null version by storing it as a new key ``(3A)`` in a separate
PUT call to metadata, prior to overwriting the master version ``(3B)``.
This implies the null version may not necessarily be the latest or
master version.
To determine whether the master version is a null version, the S3 API
checks if the master version's ``isNull`` property is set to ``true``,
or if the ``versionId`` attribute of the master version is undefined
(indicating it is a null version that was put before bucket versioning
was configured).
Continuing the example from Case 1, if we enabled versioning and put
another object, the calls to metadata and resulting keys would resemble
the following:
(3A) PUT foo, versionId: ``<versionId of master version>`` if defined or
``<non-versioned object id>``
+-----------------+---------+
| key | value |
+=================+=========+
| foo | B |
+-----------------+---------+
| foo.v1 (null) | B |
+-----------------+---------+
(3B) PUT foo, versioning: ``true``
+-----------------+---------+
| key | value |
+=================+=========+
| foo | C |
+-----------------+---------+
| foo.v2 | C |
+-----------------+---------+
| foo.v1 (null) | B |
+-----------------+---------+
To prevent issues with concurrent requests, Zenko CloudServer ensures the null
version is stored with the same version ID by using the ``versionId`` option.
Zenko CloudServer sets the ``versionId`` option to the master version's
``versionId`` metadata attribute value during the PUT. This creates a new
version with the same version ID of the existing null master version.
The null version's ``versionId`` attribute may be undefined because it
was generated before the bucket versioning was configured. In that case,
a version ID is generated using the max epoch and sequence values
possible so that the null version will be properly ordered as the last
entry in a metadata listing. This value ("non-versioned object id") is
used in the PUT call with the ``versionId`` option.
Case 3: Overwriting a Null Version That is Not Latest Version
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Normally when versioning is suspended, Zenko CloudServer uses the
``versionId: ''`` option in a PUT to metadata to create a null version.
This also overwrites an existing null version if it is the master version.
However, if there is a null version that is not the latest version,
Zenko CloudServer cannot rely on the ``versionId: ''`` option, because it will not
overwrite an existing null version stored under its own key. Instead, before creating a new null
version, the Zenko CloudServer API must send a separate DELETE call to metadata
specifying the version ID of the current null version to delete.
To do this, when storing a null version (3A above) before storing a new
non-null version, Zenko CloudServer records the version's ID in the
``nullVersionId`` attribute of the non-null version. For steps 3A and 3B above,
these are the values stored in the ``nullVersionId`` of each version's metadata:
(3A) PUT foo, versionId: ``<versionId of master version>`` if defined or
``<non-versioned object id>``
+-----------------+---------+-----------------------+
| key | value | value.nullVersionId |
+=================+=========+=======================+
| foo | B | undefined |
+-----------------+---------+-----------------------+
| foo.v1 (null) | B | undefined |
+-----------------+---------+-----------------------+
(3B) PUT foo, versioning: ``true``
+-----------------+---------+-----------------------+
| key | value | value.nullVersionId |
+=================+=========+=======================+
| foo | C | v1 |
+-----------------+---------+-----------------------+
| foo.v2 | C | v1 |
+-----------------+---------+-----------------------+
| foo.v1 (null) | B | undefined |
+-----------------+---------+-----------------------+
If defined, the ``nullVersionId`` of the master version is used with the
``versionId`` option in a DELETE call to metadata if a Put Object
request is received when versioning is suspended in a bucket.
(4A) DELETE foo, versionId: ``<nullVersionId of master version>`` (v1)
+----------+---------+
| key | value |
+==========+=========+
| foo | C |
+----------+---------+
| foo.v2 | C |
+----------+---------+
Then the master version is overwritten with the new null version:
(4B) PUT foo, versionId: ``''``
+--------------+---------+
| key | value |
+==============+=========+
| foo (null) | D |
+--------------+---------+
| foo.v2 | C |
+--------------+---------+
The ``nullVersionId`` attribute is also used to retrieve the correct
version when the version ID "null" is specified in certain object-level
APIs, described further in the section `"Null Version
Mapping" <#null-version-mapping>`__.
Specifying Versions in APIs for Putting Versions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Since Zenko CloudServer does not allow an overwrite of existing version data,
Put Object, Complete Multipart Upload and Copy Object return
``400 InvalidArgument`` if a specific version ID is specified in the
request query, e.g. for a ``PUT /foo?versionId=v1`` request.
PUT Example
~~~~~~~~~~~
When Zenko CloudServer receives a request to PUT an object:
- It checks first if versioning has been configured
- If it has not been configured, Zenko CloudServer puts the new data, puts the
metadata by overwriting the master version, and then deletes any pre-existing
data
If versioning has been configured, Zenko CloudServer checks the following:
Versioning Enabled
^^^^^^^^^^^^^^^^^^
If versioning is enabled and there is existing object metadata:
- If the master version is a null version (``isNull: true``) or has no
version ID (put before versioning was configured):
- store the null version metadata as a new version
- create a new version and overwrite the master version
- set ``nullVersionId``: version ID of the null version that was
stored
If versioning is enabled and the master version is not null; or there is
no existing object metadata:
- create a new version and store it, and overwrite the master version
Versioning Suspended
^^^^^^^^^^^^^^^^^^^^
If versioning is suspended and there is existing object metadata:
- If the master version has no version ID:
- overwrite the master version with the new metadata (PUT ``versionId: ''``)
- delete previous object data
- If the master version is a null version:
- delete the null version using the ``versionId`` metadata attribute of the
master version (PUT ``versionId: <versionId of master object MD>``)
- put a new null version (PUT ``versionId: ''``)
- If the master is not a null version and ``nullVersionId`` is defined in
the object's metadata:
- delete the current null version metadata and data
- overwrite the master version with the new metadata
If there is no existing object metadata, create the new null version as
the master version.
In each of the above cases, set the ``isNull`` metadata attribute to true
when creating the new null version.
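The branches above can be condensed into a small sketch that lists, for each
case, the metadata calls the API would issue (an illustration of the logic,
not the actual implementation):

.. code:: python

    # Sketch only: the Put Object decision tree for never-configured,
    # versioning-enabled, and versioning-suspended buckets.
    def put_object_plan(versioning_state, master_md=None):
        """versioning_state: None (never configured), 'Enabled' or 'Suspended';
        master_md: metadata of the current master version, or None if absent."""
        if versioning_state is None:
            return ['PUT master (overwrite)', 'delete pre-existing data']
        if versioning_state == 'Enabled':
            if master_md and (master_md.get('isNull') or 'versionId' not in master_md):
                return ['PUT null version under its own key (versionId option)',
                        'PUT new version + master (versioning: true, nullVersionId set)']
            return ['PUT new version + master (versioning: true)']
        # versioning suspended: the new object becomes the null version (isNull: true)
        if master_md is None:
            return ["PUT master (versionId: '')"]
        if 'versionId' not in master_md:
            return ["PUT master (versionId: '')", 'delete previous object data']
        if master_md.get('isNull'):
            return ['DELETE version <versionId of master>', "PUT master (versionId: '')"]
        if master_md.get('nullVersionId'):
            return ['DELETE null version <nullVersionId of master> and its data',
                    "PUT master (versionId: '')"]
        return ["PUT master (versionId: '')"]

    print(put_object_plan('Enabled', {'isNull': True, 'versionId': 'v1'}))
    print(put_object_plan('Suspended', {'versionId': 'v2', 'nullVersionId': 'v1'}))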
Behavior of Object-Targeting APIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
API methods which can target existing objects or versions, such as Get
Object, Head Object, Get Object ACL, Put Object ACL, Copy Object and
Copy Part, will perform the action on the latest version of an object if
no version ID is specified in the request query or relevant request
header (``x-amz-copy-source-version-id`` for Copy Object and Copy Part
APIs).
Two exceptions are the Delete Object and Multi-Object Delete APIs, which
will instead attempt to create delete markers, described in the
following section, if no version ID is specified.
No versioning options are necessary to retrieve the latest version from
metadata, since the master version is stored in a key with the name of
the object. However, when updating the latest version, such as with the
Put Object ACL API, Zenko CloudServer sets the ``versionId`` option in the
PUT call to metadata to the value stored in the object metadata's ``versionId``
attribute. This is done in order to update the metadata both in the
master version and the version itself, if it is not a null version. [2]_
When a version id is specified in the request query for these APIs, e.g.
``GET /foo?versionId=v1``, Zenko CloudServer will attempt to decode the version
ID and perform the action on the appropriate version. To do so, the API sets
the value of the ``versionId`` option to the decoded version ID in the
metadata call.
Delete Markers
^^^^^^^^^^^^^^
If versioning has not been configured for a bucket, the Delete Object
and Multi-Object Delete APIs behave like their standard, non-versioned counterparts.
If versioning has been configured, Zenko CloudServer deletes object or version
data only if a specific version ID is provided in the request query, e.g.
``DELETE /foo?versionId=v1``.
If no version ID is provided, S3 creates a delete marker by creating a
0-byte version with the metadata attribute ``isDeleteMarker: true``. The
S3 API will return a ``404 NoSuchKey`` error in response to requests
getting or heading an object whose latest version is a delete marker.
To restore a previous version as the latest version of an object, the
delete marker must be deleted, by the same process as deleting any other
version.
The response varies when targeting an object whose latest version is a
delete marker for other object-level APIs that can target existing
objects and versions, without specifying the version ID.
- Get Object, Head Object, Get Object ACL, Object Copy and Copy Part
return ``404 NoSuchKey``.
- Put Object ACL and Put Object Tagging return
``405 MethodNotAllowed``.
These APIs respond to requests specifying the version ID of a delete
marker with the error ``405 MethodNotAllowed``, in general. Copy Part
and Copy Object respond with ``400 InvalidRequest``.
See section `"Delete Example" <#delete-example>`__ for a summary.
Null Version Mapping
^^^^^^^^^^^^^^^^^^^^
When the null version is specified in a request with the version ID
"null", the S3 API must use the ``nullVersionId`` stored in the latest
version to retrieve the current null version, if the null version is not
the latest version.
Thus, getting the null version is a two step process:
1. Get the latest version of the object from metadata. If the latest
version's ``isNull`` property is ``true``, then use the latest
version's metadata. Otherwise,
2. Get the null version of the object from metadata, using the internal
version ID of the current null version stored in the latest version's
``nullVersionId`` metadata attribute.
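A small sketch of this two-step lookup (``get_metadata`` stands in for the
metadata GET call and is purely illustrative):

.. code:: python

    # Sketch only: resolving "versionId=null" via the nullVersionId mapping.
    def get_null_version(get_metadata, object_name):
        latest = get_metadata(object_name)        # step 1: latest (master) version
        if latest is None:
            return None
        if latest.get('isNull'):
            return latest                         # the latest version is the null version
        null_id = latest.get('nullVersionId')     # step 2: follow the mapping
        if null_id is None:
            return None                           # no null version exists
        return get_metadata(object_name, version_id=null_id)

    store = {('foo', None): {'versionId': 'v2', 'nullVersionId': 'v1'},
             ('foo', 'v1'): {'versionId': 'v1', 'isNull': True}}
    print(get_null_version(lambda key, version_id=None: store.get((key, version_id)), 'foo'))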
DELETE Example
~~~~~~~~~~~~~~
The following steps are used in the delete logic for delete marker
creation:
- If versioning has not been configured: attempt to delete the object
- If request is version-specific delete request: attempt to delete the
version
- otherwise, if not a version-specific delete request and versioning
has been configured:
- create a new 0-byte content-length version
- in the version's metadata, set the ``isDeleteMarker`` property to true
- Return the version ID of any version deleted or any delete marker
created
- Set response header ``x-amz-delete-marker`` to true if a delete
marker was deleted or created
The Multi-Object Delete API follows the same logic for each of the
objects or versions listed in the XML request. Note that a delete request
can result in the creation of a delete marker even if the object
targeted for deletion does not exist in the first place.
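A compact sketch of this decision (the returned dictionaries are illustrative;
the real API issues the corresponding metadata call and response headers):

.. code:: python

    # Sketch only: when a Delete Object request deletes data versus creating
    # a delete marker, following the steps above.
    def delete_object_plan(versioning_configured, version_id=None):
        if not versioning_configured:
            return {'op': 'delete master version'}
        if version_id is not None:
            return {'op': 'delete version %s' % version_id,
                    'return_version_id': version_id}
        # no version ID given: create a 0-byte version flagged as a delete
        # marker, even if the named object does not exist in the first place
        return {'op': 'put 0-byte version with isDeleteMarker: true',
                'return_version_id': '<new delete marker id>',
                'x-amz-delete-marker': True}

    print(delete_object_plan(False))
    print(delete_object_plan(True))
    print(delete_object_plan(True, 'v1'))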
Object-level APIs which can target existing objects and versions perform
the following checks regarding delete markers:
- If not a version-specific request and versioning has been configured,
check the metadata of the latest version
- If the ``isDeleteMarker`` property is set to true, return
``404 NoSuchKey`` or ``405 MethodNotAllowed``
- If it is a version-specific request, check the object metadata of the
requested version
- If the ``isDeleteMarker`` property is set to true, return
``405 MethodNotAllowed`` or ``400 InvalidRequest``
.. [2] If it is a null version, this call will overwrite the null version
if it is stored in its own key (``foo\0<versionId>``). If the null
version is stored only in the master version, this call will both
overwrite the master version *and* create a new key
(``foo\0<versionId>``), resulting in the edge case referred to by the
previous footnote [1]_.
Data-metadata daemon Architecture and Operational guide
=======================================================
This document presents the architecture of the data-metadata daemon
(dmd) used for the community edition of Zenko CloudServer. It also provides a
guide on how to operate it.
The dmd is responsible for storing and retrieving Zenko CloudServer data and
metadata, and is accessed by Zenko CloudServer connectors through socket.io
(metadata) and REST (data) APIs.
It has been designed such that more than one Zenko CloudServer connector can
access the same buckets by communicating with the dmd. It also means that
the dmd can be hosted on a separate container or machine.
Operation
---------
Startup
~~~~~~~
The simplest deployment is still to launch with ``npm start``; this will
start one instance of the Zenko CloudServer connector and listen on the
locally bound dmd ports 9990 and 9991 (by default; see below).
The dmd can be started independently from the Zenko CloudServer by running this
command in the Zenko CloudServer directory:
::
npm run start_dmd
This will open two ports:
- one is based on socket.io and is used for metadata transfers (9990 by
default)
- the other is a REST interface used for data transfers (9991 by
default)
Then, one or more instances of Zenko CloudServer without the dmd can be started
elsewhere with:
::
npm run start_s3server
Configuration
~~~~~~~~~~~~~
Most configuration happens in ``config.json`` for Zenko CloudServer. Local
storage paths can be changed where the dmd is started by using the same
environment variables as before: ``S3DATAPATH`` and ``S3METADATAPATH``.
In ``config.json``, the following sections are used to configure access
to the dmd through separate configuration of the data and metadata
access:
::
"metadataClient": {
"host": "localhost",
"port": 9990
},
"dataClient": {
"host": "localhost",
"port": 9991
},
To run a remote dmd, you have to do the following:
- Change both ``"host"`` attributes to the IP or host name where the
dmd is run.
- Modify the ``"bindAddress"`` attributes in ``"metadataDaemon"`` and
``"dataDaemon"`` sections where the dmd is run to accept remote
connections (e.g. ``"::"``)
Architecture
------------
This section gives a bit more insight into how the dmd works internally.
.. figure:: ./images/data_metadata_daemon_arch.png
:alt: Architecture diagram
Metadata on socket.io
~~~~~~~~~~~~~~~~~~~~~
This communication is based on an RPC system built on socket.io events
sent by Zenko CloudServer connectors, received by the DMD, and acknowledged
back to the Zenko CloudServer connector.
The actual payload sent through socket.io is a JSON-serialized form of
the RPC call name and parameters, along with some additional information
like the request UIDs, and the sub-level information, sent as object
attributes in the JSON request.
With the introduction of versioning support, updates are now gathered in
the dmd for a bounded number of milliseconds before being batched as a
single write to the database. This is done server-side, so the API is
meant to send individual updates.
Four RPC commands are available to clients: ``put``, ``get``, ``del``
and ``createReadStream``. They more or less map to the parameters accepted
by the corresponding calls in the LevelUp implementation of LevelDB.
They differ in the following:
- The ``sync`` option is ignored (under the hood, puts are gathered
into batches which have their ``sync`` property enforced when they
are committed to the storage)
- Some additional versioning-specific options are supported
- ``createReadStream`` becomes asynchronous, takes an additional
callback argument and returns the stream in the second callback
parameter
Debugging the socket.io exchanges can be achieved by running the daemon
with ``DEBUG='socket.io*'`` environment variable set.
One parameter controls the timeout after which RPC commands
end with a timeout error. It can be changed either:
- via the ``DEFAULT_CALL_TIMEOUT_MS`` option in
``lib/network/rpc/rpc.js``
- or in the constructor call of the ``MetadataFileClient`` object (in
``lib/metadata/bucketfile/backend.js``) as ``callTimeoutMs``.
The default value is 30000.
A specific implementation deals with streams, currently used for listing
a bucket. Streams emit ``"stream-data"`` events that pack one or more
items in the listing, and a special ``"stream-end"`` event when done.
Flow control is achieved by allowing a certain number of "in flight"
packets that have not received an ack yet (5 by default). Two options
can tune the behavior (for better throughput or more robustness on
weak networks); they have to be set in the ``mdserver.js`` file directly,
as there is no support in ``config.json`` for those options yet:
- ``streamMaxPendingAck``: max number of pending ack events not yet
received (default is 5)
- ``streamAckTimeoutMs``: timeout for receiving an ack after an output
stream packet is sent to the client (default is 5000)
Data exchange through the REST data port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Data is read and written with REST semantics.
The web server recognizes a base path in the URL of ``/DataFile`` to be
a request to the data storage service.
PUT
^^^
A PUT on the ``/DataFile`` URL, with the object contents passed in the request
body, will write a new object to the storage.
On success, a ``201 Created`` response is returned and the new URL to
the object is returned via the ``Location`` header (e.g.
``Location: /DataFile/50165db76eecea293abfd31103746dadb73a2074``). The
raw key can then be extracted simply by removing the leading
``/DataFile`` service information from the returned URL.
GET
^^^
A GET is simply issued with REST semantics, e.g.:
::
GET /DataFile/50165db76eecea293abfd31103746dadb73a2074 HTTP/1.1
A GET request can ask for a specific range. Range support is complete
except for multiple byte ranges.
DELETE
^^^^^^
DELETE is similar to GET, except that a ``204 No Content`` response is
returned on success.
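For illustration, the three verbs can be exercised with any HTTP client; the
sketch below uses Python's ``requests`` library and assumes a dmd data port
reachable on ``localhost:9991``:

.. code:: python

    # Sketch only: talking to the dmd REST data port (assumed on localhost:9991).
    import requests

    BASE = 'http://localhost:9991'

    # PUT: the body is the object content; the new key comes back in Location
    resp = requests.put(BASE + '/DataFile', data=b'hello world')
    assert resp.status_code == 201
    key = resp.headers['Location'].replace('/DataFile/', '', 1)

    # GET: the whole object, or a single byte range
    print(requests.get(BASE + '/DataFile/' + key).content)
    print(requests.get(BASE + '/DataFile/' + key, headers={'Range': 'bytes=0-4'}).content)

    # DELETE: returns 204 No Content on success
    assert requests.delete(BASE + '/DataFile/' + key).status_code == 204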
Listing
=======
Listing Types
-------------
We use three different types of metadata listing for various operations.
Here are the scenarios we use each for:
- 'Delimiter' - when no versions are possible in the bucket, since it is
an internal-only bucket that is not exposed to users.
Namely,
1. to list objects in the "user's bucket" to respond to a GET SERVICE
request and
2. to do internal listings on an MPU shadow bucket to complete multipart
upload operations.
- 'DelimiterVersion' - to list all versions in a bucket
- 'DelimiterMaster' - to list just the master versions of objects in a
bucket
Algorithms
----------
The algorithms for each listing type can be found in the open-source
`scality/Arsenal <https://github.com/scality/Arsenal>`__ repository, in
`lib/algos/list <https://github.com/scality/Arsenal/tree/master/lib/algos/list>`__.
Encryption
===========
With CloudServer, there are two possible methods of at-rest encryption:
(1) bucket-level encryption, where Scality CloudServer itself handles at-rest
encryption for any object that is in an 'encrypted' bucket, regardless of what
the location-constraint for the data is; and
(2) AWS server-side encryption, which you can choose to use if the
location-constraint specified for the data is of type AWS.
Note: bucket level encryption is not available on the standard AWS
S3 protocol, so normal AWS S3 clients will not provide the option to send a
header when creating a bucket. We have created a simple tool to enable you
to easily create an encrypted bucket.
Example:
--------
Create an encrypted bucket using our encrypted bucket tool in the ``bin`` directory:
.. code:: shell
./create_encrypted_bucket.js -a accessKey1 -k verySecretKey1 -b bucketname -h localhost -p 8000
AWS backend
------------
With real AWS S3 as a location-constraint, you have to configure the
location-constraint as follows:
.. code:: json
"awsbackend": {
"type": "aws_s3",
"legacyAwsBehavior": true,
"details": {
"serverSideEncryption": true,
...
}
},
Then, every time an object is put to that data location, we pass the following
header to AWS: ``x-amz-server-side-encryption: AES256``
Note: due to these options, it is possible to configure encryption by both
CloudServer and AWS S3 (if you put an object to a CloudServer bucket which has
the encryption flag AND the location-constraint for the data is AWS S3 with
serverSideEncryption set to true).

View File

@ -1,295 +0,0 @@
Clients
=========
List of applications that have been tested with Zenko CloudServer.
GUI
~~~
`Cyberduck <https://cyberduck.io/?l=en>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- https://www.youtube.com/watch?v=-n2MCt4ukUg
- https://www.youtube.com/watch?v=IyXHcu4uqgU
`Cloud Explorer <https://www.linux-toys.com/?p=945>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- https://www.youtube.com/watch?v=2hhtBtmBSxE
`CloudBerry Lab <http://www.cloudberrylab.com>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- https://youtu.be/IjIx8g\_o0gY
Command Line Tools
~~~~~~~~~~~~~~~~~~
`s3curl <https://github.com/rtdp/s3curl>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
https://github.com/scality/S3/blob/master/tests/functional/s3curl/s3curl.pl
`aws-cli <http://docs.aws.amazon.com/cli/latest/reference/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``~/.aws/credentials`` on Linux, OS X, or Unix or
``C:\Users\USERNAME\.aws\credentials`` on Windows
.. code:: shell
[default]
aws_access_key_id = accessKey1
aws_secret_access_key = verySecretKey1
``~/.aws/config`` on Linux, OS X, or Unix or
``C:\Users\USERNAME\.aws\config`` on Windows
.. code:: shell
[default]
region = us-east-1
Note: ``us-east-1`` is the default region, but you can specify any
region.
See all buckets:
.. code:: shell
aws s3 ls --endpoint-url=http://localhost:8000
Create bucket:
.. code:: shell
aws --endpoint-url=http://localhost:8000 s3 mb s3://mybucket
`s3cmd <http://s3tools.org/s3cmd>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If using s3cmd as a client to S3, be aware that the v4 signature format is
buggy in s3cmd versions < 1.6.1.
``~/.s3cfg`` on Linux, OS X, or Unix or ``C:\Users\USERNAME\.s3cfg`` on
Windows
.. code:: shell
[default]
access_key = accessKey1
secret_key = verySecretKey1
host_base = localhost:8000
host_bucket = %(bucket).localhost:8000
signature_v2 = False
use_https = False
See all buckets:
.. code:: shell
s3cmd ls
`rclone <http://rclone.org/s3/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``~/.rclone.conf`` on Linux, OS X, or Unix or
``C:\Users\USERNAME\.rclone.conf`` on Windows
.. code:: shell
[remote]
type = s3
env_auth = false
access_key_id = accessKey1
secret_access_key = verySecretKey1
region = other-v2-signature
endpoint = http://localhost:8000
location_constraint =
acl = private
server_side_encryption =
storage_class =
See all buckets:
.. code:: shell
rclone lsd remote:
JavaScript
~~~~~~~~~~
`AWS JavaScript SDK <http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: javascript
const AWS = require('aws-sdk');
const s3 = new AWS.S3({
accessKeyId: 'accessKey1',
secretAccessKey: 'verySecretKey1',
endpoint: 'localhost:8000',
sslEnabled: false,
s3ForcePathStyle: true,
});
JAVA
~~~~
`AWS JAVA SDK <http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: java
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.S3ClientOptions;
import com.amazonaws.services.s3.model.Bucket;
public class S3 {
public static void main(String[] args) {
AWSCredentials credentials = new BasicAWSCredentials("accessKey1",
"verySecretKey1");
// Create a client connection based on credentials
AmazonS3 s3client = new AmazonS3Client(credentials);
s3client.setEndpoint("http://localhost:8000");
// Using path-style requests
// (deprecated) s3client.setS3ClientOptions(new S3ClientOptions().withPathStyleAccess(true));
s3client.setS3ClientOptions(S3ClientOptions.builder().setPathStyleAccess(true).build());
// Create bucket
String bucketName = "javabucket";
s3client.createBucket(bucketName);
// List all buckets
for (Bucket bucket : s3client.listBuckets()) {
System.out.println(" - " + bucket.getName());
}
}
}
Ruby
~~~~
`AWS SDK for Ruby - Version 2 <http://docs.aws.amazon.com/sdkforruby/api/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: ruby
require 'aws-sdk'
s3 = Aws::S3::Client.new(
:access_key_id => 'accessKey1',
:secret_access_key => 'verySecretKey1',
:endpoint => 'http://localhost:8000',
:force_path_style => true
)
resp = s3.list_buckets
`fog <http://fog.io/storage/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: ruby
require "fog"
connection = Fog::Storage.new(
{
:provider => "AWS",
:aws_access_key_id => 'accessKey1',
:aws_secret_access_key => 'verySecretKey1',
:endpoint => 'http://localhost:8000',
:path_style => true,
:scheme => 'http',
})
Python
~~~~~~
`boto2 <http://boto.cloudhackers.com/en/latest/ref/s3.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: python
import boto
from boto.s3.connection import S3Connection, OrdinaryCallingFormat
connection = S3Connection(
aws_access_key_id='accessKey1',
aws_secret_access_key='verySecretKey1',
is_secure=False,
port=8000,
calling_format=OrdinaryCallingFormat(),
host='localhost'
)
connection.create_bucket('mybucket')
`boto3 <http://boto3.readthedocs.io/en/latest/index.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Client integration
.. code:: python
import boto3
client = boto3.client(
's3',
aws_access_key_id='accessKey1',
aws_secret_access_key='verySecretKey1',
endpoint_url='http://localhost:8000'
)
lists = client.list_buckets()
Full integration (with object mapping)
.. code:: python
import os
from botocore.utils import fix_s3_host
import boto3
os.environ['AWS_ACCESS_KEY_ID'] = "accessKey1"
os.environ['AWS_SECRET_ACCESS_KEY'] = "verySecretKey1"
s3 = boto3.resource(service_name='s3', endpoint_url='http://localhost:8000')
s3.meta.client.meta.events.unregister('before-sign.s3', fix_s3_host)
for bucket in s3.buckets.all():
print(bucket.name)
PHP
~~~
You should force path-style requests, even though the v3 SDK advertises that it uses them by default.
`AWS PHP SDK v3 <https://docs.aws.amazon.com/aws-sdk-php/v3/guide>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: php
use Aws\S3\S3Client;
$client = S3Client::factory([
'region' => 'us-east-1',
'version' => 'latest',
'endpoint' => 'http://localhost:8000',
'use_path_style_endpoint' => true,
'credentials' => [
'key' => 'accessKey1',
'secret' => 'verySecretKey1'
]
]);
$client->createBucket(array(
'Bucket' => 'bucketphp',
));

View File

@ -1,24 +0,0 @@
Contributing
============
Need help?
----------
We're always glad to help out. Simply open a
`GitHub issue <https://github.com/scality/S3/issues>`__ and we'll give you
insight. If what you want is not available, and if you're willing to help us
out, we'll be happy to welcome you in the team, whether for a small fix or for
a larger feature development. Thanks for your interest!
Got an idea? Get started!
-------------------------
In order to contribute, please follow the `Contributing
Guidelines <https://github.com/scality/Guidelines/blob/master/CONTRIBUTING.md>`__.
If anything is unclear to you, reach out to us on
`slack <https://zenko-io.slack.com/>`__ or via a GitHub issue.
Don't write code? There are other ways to help!
-----------------------------------------------
We're always eager to learn about our users' stories. If you can't contribute
code, but would love to help us, please shoot us an email at zenko@scality.com,
and tell us what our software enables you to do! Thanks for your time!

View File

@ -1,357 +0,0 @@
Docker
======
- `Environment Variables <#environment-variables>`__
- `Tunables and setup tips <#tunables-and-setup-tips>`__
- `Examples for continuous integration with
Docker <#continuous-integration-with-docker-hosted-cloudserver>`__
- `Examples for going in production with Docker <#in-production-with-docker-hosted-cloudserver>`__
Environment Variables
---------------------
S3DATA
~~~~~~
S3DATA=multiple
^^^^^^^^^^^^^^^
Allows you to run Scality Zenko CloudServer with multiple data backends, defined
as regions.
When using multiple data backends, a custom ``locationConfig.json`` file is
mandatory. It will allow you to set custom regions. You will then need to
provide associated rest_endpoints for each custom region in your
``config.json`` file.
`Learn more about multiple backends configuration <../GETTING_STARTED/#location-configuration>`__
If you are using Scality RING endpoints, please refer to your customer
documentation.
Running it with an AWS S3 hosted backend
""""""""""""""""""""""""""""""""""""""""
To run CloudServer with an S3 AWS backend, you will have to add a new section
to your ``locationConfig.json`` file with the ``aws_s3`` location type:
.. code:: json
(...)
"awsbackend": {
"type": "aws_s3",
"details": {
"awsEndpoint": "s3.amazonaws.com",
"bucketName": "yourawss3bucket",
"bucketMatch": true,
"credentialsProfile": "aws_hosted_profile"
}
}
(...)
You will also have to edit your AWS credentials file to be able to use your
command line tool of choice. This file should mention credentials for all the
backends you're using. You can use several profiles when using multiple
backends.
.. code:: shell
[default]
aws_access_key_id=accessKey1
aws_secret_access_key=verySecretKey1
[aws_hosted_profile]
aws_access_key_id={{YOUR_ACCESS_KEY}}
aws_secret_access_key={{YOUR_SECRET_KEY}}
Just as you need to mount your locationConfig.json, you will need to mount your
AWS credentials file at run time:
``-v ~/.aws/credentials:/root/.aws/credentials`` on Linux, OS X, or Unix or
``-v C:\Users\USERNAME\.aws\credentials:/root/.aws/credentials`` on Windows
NOTE: One account can't copy to another account with a source and
destination on real AWS unless the account associated with the
access key/secret key pair used for the destination bucket has rights
to get in the source bucket. ACLs would have to be updated
on AWS directly to enable this.
S3BACKEND
~~~~~~~~~
S3BACKEND=file
^^^^^^^^^^^^^^
When storing file data, for it to be persistent you must mount docker volumes
for both data and metadata. See `this section <#using-docker-volumes-in-production>`__
S3BACKEND=mem
^^^^^^^^^^^^^
This is ideal for testing: no data will remain after the container is shut down.
ENDPOINT
~~~~~~~~
This variable specifies your endpoint. If you have a domain such as
new.host.com, by specifying that here, you and your users can direct S3
server requests to new.host.com.
.. code:: shell
docker run -d --name s3server -p 8000:8000 -e ENDPOINT=new.host.com scality/s3server
Note: In your ``/etc/hosts`` file on Linux, OS X, or Unix with root
permissions, make sure to associate 127.0.0.1 with ``new.host.com``
SCALITY\_ACCESS\_KEY\_ID and SCALITY\_SECRET\_ACCESS\_KEY
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These variables specify authentication credentials for an account named
"CustomAccount".
You can set credentials for many accounts by editing
``conf/authdata.json`` (see below for further info), but if you just
want to specify one set of your own, you can use these environment
variables.
.. code:: shell
docker run -d --name s3server -p 8000:8000 -e SCALITY_ACCESS_KEY_ID=newAccessKey
-e SCALITY_SECRET_ACCESS_KEY=newSecretKey scality/s3server
Note: Anything in the ``authdata.json`` file will be ignored. Note: The
old ``ACCESS_KEY`` and ``SECRET_KEY`` environment variables are now
deprecated.
LOG\_LEVEL
~~~~~~~~~~
This variable allows you to change the log level: info, debug or trace.
The default is info. Debug will give you more detailed logs and trace
will give you the most detailed.
.. code:: shell
docker run -d --name s3server -p 8000:8000 -e LOG_LEVEL=trace scality/s3server
SSL
~~~
Setting this variable to true allows you to run S3 with SSL:
**Note1**: You also need to specify the ENDPOINT environment variable.
**Note2**: In your ``/etc/hosts`` file on Linux, OS X, or Unix with root
permissions, make sure to associate 127.0.0.1 with ``<YOUR_ENDPOINT>``
**Warning**: These certs, being self-signed (and the CA being generated
inside the container), will be untrusted by any clients and could
disappear on a container upgrade. That is fine as long as it is for quick
testing. For anything beyond testing, best security practice is to use an
extra container for SSL/TLS termination, such as haproxy/nginx/stunnel,
to limit what an exploit on either component could expose, and to keep
certificates in a mounted volume.
.. code:: shell
docker run -d --name s3server -p 8000:8000 -e SSL=TRUE -e ENDPOINT=<YOUR_ENDPOINT>
scality/s3server
More information about how to use the S3 server with SSL is available
`here <https://s3.scality.com/v1.0/page/scality-with-ssl>`__.
LISTEN\_ADDR
~~~~~~~~~~~~
This variable instructs the Zenko CloudServer, and its data and metadata
components to listen on the specified address. This allows starting the data
or metadata servers as standalone services, for example.
.. code:: shell
docker run -d --name s3server-data -p 9991:9991 -e LISTEN_ADDR=0.0.0.0
scality/s3server npm run start_dataserver
DATA\_HOST and METADATA\_HOST
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These variables configure which data and metadata servers to use,
usually when they are running on another host and you are only starting the
stateless Zenko CloudServer.
.. code:: shell
docker run -d --name s3server -e DATA_HOST=s3server-data
-e METADATA_HOST=s3server-metadata scality/s3server npm run start_s3server
REDIS\_HOST
~~~~~~~~~~~
Use this variable to connect to the Redis cache server on a host other than
localhost.
.. code:: shell
docker run -d --name s3server -p 8000:8000
-e REDIS_HOST=my-redis-server.example.com scality/s3server
REDIS\_PORT
~~~~~~~~~~~
Use this variable to connect to the Redis cache server on a port other than
the default 6379.
.. code:: shell
docker run -d --name s3server -p 8000:8000
-e REDIS_PORT=6379 scality/s3server
Tunables and Setup Tips
-----------------------
Using Docker Volumes
~~~~~~~~~~~~~~~~~~~~
Zenko CloudServer runs with a file backend by default.
So, by default, the data is stored inside your Zenko CloudServer Docker
container.
However, if you want your data and metadata to persist, you **MUST** use
Docker volumes to host your data and metadata outside your Zenko CloudServer
Docker container. Otherwise, the data and metadata will be destroyed
when you erase the container.
.. code:: shell
docker run -v $(pwd)/data:/usr/src/app/localData -v $(pwd)/metadata:/usr/src/app/localMetadata
-p 8000:8000 -d scality/s3server
This command mounts the host directory, ``./data``, into the container
at ``/usr/src/app/localData`` and the host directory, ``./metadata``, into
the container at ``/usr/src/app/localMetadata``. It can also be any host
mount point, like ``/mnt/data`` and ``/mnt/metadata``.
Adding, modifying, or deleting accounts or users' credentials
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Create a customized ``authdata.json`` locally, based on our ``/conf/authdata.json``.
2. Use `Docker
Volume <https://docs.docker.com/engine/tutorials/dockervolumes/>`__
to override the default ``authdata.json`` through a docker file mapping.
For example:
.. code:: shell
docker run -v $(pwd)/authdata.json:/usr/src/app/conf/authdata.json -p 8000:8000 -d
scality/s3server
Specifying your own host name
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To specify a host name (e.g. s3.domain.name), you can provide your own
`config.json <https://github.com/scality/S3/blob/master/config.json>`__
using `Docker
Volume <https://docs.docker.com/engine/tutorials/dockervolumes/>`__.
First add a new key-value pair in the restEndpoints section of your
config.json. The key in the key-value pair should be the host name you
would like to add and the value is the default location\_constraint for
this endpoint.
For example, ``s3.example.com`` is mapped to ``us-east-1`` which is one
of the ``location_constraints`` listed in your locationConfig.json file
`here <https://github.com/scality/S3/blob/master/locationConfig.json>`__.
More information about location configuration
`here <https://github.com/scality/S3/blob/master/README.md#location-configuration>`__
.. code:: json
"restEndpoints": {
"localhost": "file",
"127.0.0.1": "file",
...
"s3.example.com": "us-east-1"
},
Then, run your Scality S3 Server using `Docker
Volume <https://docs.docker.com/engine/tutorials/dockervolumes/>`__:
.. code:: shell
docker run -v $(pwd)/config.json:/usr/src/app/config.json -p 8000:8000 -d scality/s3server
Your local ``config.json`` file will override the default one through a
docker file mapping.
Running as an unprivileged user
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Zenko CloudServer runs as root by default.
You can change that by modifying the Dockerfile and specifying a user
before the entrypoint.
The user needs to exist within the container, and own the folder
**/usr/src/app** for Scality Zenko CloudServer to run properly.
For instance, you can modify these lines in the Dockerfile:
.. code:: shell
...
&& groupadd -r -g 1001 scality \
&& useradd -u 1001 -g 1001 -d /usr/src/app -r scality \
&& chown -R scality:scality /usr/src/app
...
USER scality
ENTRYPOINT ["/usr/src/app/docker-entrypoint.sh"]
Continuous integration with Docker hosted CloudServer
-----------------------------------------------------
When you start the Docker Scality Zenko CloudServer image, you can adjust the
configuration of the Scality Zenko CloudServer instance by passing one or more
environment variables on the docker run command line.
Sample ways to run it for CI are:
- With custom locations (one in-memory, one hosted on AWS), and custom
credentials mounted:
.. code:: shell
docker run --name CloudServer -p 8000:8000
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json
-v $(pwd)/authdata.json:/usr/src/app/conf/authdata.json
-v ~/.aws/credentials:/root/.aws/credentials
-e S3DATA=multiple -e S3BACKEND=mem scality/s3server
- With custom locations (one in-memory, one hosted on AWS, one file),
and custom credentials set as environment variables
(see `this section <#scality-access-key-id-and-scality-secret-access-key>`__):
.. code:: shell
docker run --name CloudServer -p 8000:8000
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json
-v ~/.aws/credentials:/root/.aws/credentials
-v $(pwd)/data:/usr/src/app/localData -v $(pwd)/metadata:/usr/src/app/localMetadata
-e SCALITY_ACCESS_KEY_ID=accessKey1
-e SCALITY_SECRET_ACCESS_KEY=verySecretKey1
-e S3DATA=multiple -e S3BACKEND=mem scality/s3server
In production with Docker hosted CloudServer
--------------------------------------------
In production, we expect that data will be persistent, that you will use the
multiple backends capabilities of Zenko CloudServer, and that you will have a
custom endpoint for your local storage, and custom credentials for your local
storage:
.. code:: shell
docker run -d --name CloudServer
-v $(pwd)/data:/usr/src/app/localData -v $(pwd)/metadata:/usr/src/app/localMetadata
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json
-v $(pwd)/authdata.json:/usr/src/app/conf/authdata.json
-v ~/.aws/credentials:/root/.aws/credentials -e S3DATA=multiple
-e ENDPOINT=custom.endpoint.com
-p 8000:8000 scality/s3server

View File

@ -1,420 +0,0 @@
Getting Started
=================
.. figure:: ../res/scality-cloudserver-logo.png
:alt: Zenko CloudServer logo
|CircleCI| |Scality CI|
Installation
------------
Dependencies
~~~~~~~~~~~~
Building and running the Scality Zenko CloudServer requires node.js 6.9.5 and
npm v3. Up-to-date versions can be found at
`Nodesource <https://github.com/nodesource/distributions>`__.
Clone source code
~~~~~~~~~~~~~~~~~
.. code:: shell
git clone https://github.com/scality/S3.git
Install js dependencies
~~~~~~~~~~~~~~~~~~~~~~~
Go to the ./S3 folder, then run:
.. code:: shell
npm install
Run it with a file backend
--------------------------
.. code:: shell
npm start
This starts a Zenko CloudServer on port 8000. Two additional ports, 9990 and
9991, are also open locally for internal transfer of metadata and data,
respectively.
The default access key is accessKey1 with a secret key of
verySecretKey1.
By default the metadata files will be saved in the localMetadata
directory and the data files will be saved in the localData directory
within the ./S3 directory on your machine. These directories have been
pre-created within the repository. If you would like to save the data or
metadata in different locations of your choice, you must specify them
with absolute paths. So, when starting the server:
.. code:: shell
mkdir -m 700 $(pwd)/myFavoriteDataPath
mkdir -m 700 $(pwd)/myFavoriteMetadataPath
export S3DATAPATH="$(pwd)/myFavoriteDataPath"
export S3METADATAPATH="$(pwd)/myFavoriteMetadataPath"
npm start
Run it with multiple data backends
----------------------------------
.. code:: shell
export S3DATA='multiple'
npm start
This starts a Zenko CloudServer on port 8000. The default access key is
accessKey1 with a secret key of verySecretKey1.
With multiple backends, you have the ability to choose where each object
will be saved by setting the following header with a locationConstraint
on a PUT request:
.. code:: shell
'x-amz-meta-scal-location-constraint':'myLocationConstraint'
If no header is sent with a PUT object request, the location constraint
of the bucket will determine where the data is saved. If the bucket has
no location constraint, the endpoint of the PUT request will be used to
determine location.
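For example, assuming you have the AWS CLI configured with your CloudServer
credentials and that a bucket named ``mybucket`` already exists, a PUT
carrying this header could look like the following sketch (bucket, key, and
file names are placeholders):
.. code:: shell
aws s3api put-object --endpoint-url http://127.0.0.1:8000 \
--bucket mybucket --key myfile.txt --body ./myfile.txt \
--metadata scal-location-constraint=myLocationConstraint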
See the Configuration section below to learn how to set location
constraints.
Run it with an in-memory backend
--------------------------------
.. code:: shell
npm run mem_backend
This starts a Zenko CloudServer on port 8000. The default access key is
accessKey1 with a secret key of verySecretKey1.
Run it for continuous integration testing or in production with Docker
----------------------------------------------------------------------
`DOCKER <../DOCKER/>`__
Testing
-------
You can run the unit tests with the following command:
.. code:: shell
npm test
You can run the multiple backend unit tests with:
.. code:: shell
CI=true S3DATA=multiple npm start
npm run multiple_backend_test
You can run the linter with:
.. code:: shell
npm run lint
Running functional tests locally:
For the AWS backend and Azure backend tests to pass locally,
you must modify tests/locationConfigTests.json so that ``awsbackend``
specifies the name of a bucket you have access to with your
credentials profile, and modify ``azurebackend`` with the details
of your Azure account.
The test suite requires two additional tools, **s3cmd** and **Redis**,
to be installed in the environment the tests are running in.
- Install `s3cmd <http://s3tools.org/download>`__
- Install `redis <https://redis.io/download>`__ and start Redis.
- Add localCache section to your ``config.json``:
::
"localCache": {
"host": REDIS_HOST,
"port": REDIS_PORT
}
where ``REDIS_HOST`` is your Redis instance IP address (``"127.0.0.1"``
if your Redis is running locally) and ``REDIS_PORT`` is your Redis
instance port (``6379`` by default)
- Add the following to the ``/etc/hosts`` file on your machine:
.. code:: shell
127.0.0.1 bucketwebsitetester.s3-website-us-east-1.amazonaws.com
- Start the Zenko CloudServer in memory and run the functional tests:
.. code:: shell
CI=true npm run mem_backend
CI=true npm run ft_test
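If the functional tests cannot reach Redis, you can check that the instance
referenced in your ``localCache`` section is up (a minimal check, assuming
Redis runs locally on the default port):
.. code:: shell
redis-cli -h 127.0.0.1 -p 6379 ping
# expected answer: PONG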
Configuration
-------------
There are three configuration files for your Scality Zenko CloudServer:
1. ``conf/authdata.json``, described above for authentication
2. ``locationConfig.json``, to set up configuration options for
where data will be saved
3. ``config.json``, for general configuration options
Location Configuration
~~~~~~~~~~~~~~~~~~~~~~
You must specify at least one locationConstraint in your
locationConfig.json (or leave as pre-configured).
You must also specify 'us-east-1' as a locationConstraint, so if you only
define one locationConstraint, it must be 'us-east-1'. If you put a bucket to
an unknown endpoint and do not specify a locationConstraint in the put
bucket call, us-east-1 will be used.
For instance, the following locationConstraint will save data sent to
``myLocationConstraint`` to the file backend:
.. code:: json
"myLocationConstraint": {
"type": "file",
"legacyAwsBehavior": false,
"details": {}
},
Each locationConstraint must include the ``type``,
``legacyAwsBehavior``, and ``details`` keys. ``type`` indicates which
backend will be used for that region. Currently, mem, file, and scality
are the supported backends. ``legacyAwsBehavior`` indicates whether the
region will have the same behavior as the AWS S3 'us-east-1' region. If
the locationConstraint type is scality, ``details`` should contain
connector information for sproxyd. If the locationConstraint type is mem
or file, ``details`` should be empty.
Once you have your locationConstraints in your locationConfig.json, you
can specify a default locationConstraint for each of your endpoints.
For instance, the following sets the ``localhost`` endpoint to the
``myLocationConstraint`` data backend defined above:
.. code:: json
"restEndpoints": {
"localhost": "myLocationConstraint"
},
If you would like to use an endpoint other than localhost for your
Scality Zenko CloudServer, that endpoint MUST be listed in your
``restEndpoints``. Otherwise if your server is running with a:
- **file backend**: your default location constraint will be ``file``
- **memory backend**: your default location constraint will be ``mem``
Endpoints
~~~~~~~~~
Note that our Zenko CloudServer supports both:
- path-style: http://myhostname.com/mybucket
- hosted-style: http://mybucket.myhostname.com
However, hosted-style requests will not hit the server if you are using
an IP address for your host. So, make sure you are using path-style
requests in that case. For instance, if you are using the AWS SDK for
JavaScript, you would instantiate your client like this:
.. code:: js
const s3 = new aws.S3({
endpoint: 'http://127.0.0.1:8000',
s3ForcePathStyle: true,
});
Setting your own access key and secret key pairs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You can set credentials for many accounts by editing
``conf/authdata.json`` but if you want to specify one set of your own
credentials, you can use ``SCALITY_ACCESS_KEY_ID`` and
``SCALITY_SECRET_ACCESS_KEY`` environment variables.
SCALITY\_ACCESS\_KEY\_ID and SCALITY\_SECRET\_ACCESS\_KEY
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
These variables specify authentication credentials for an account named
"CustomAccount".
Note: If you use these environment variables, anything in the ``authdata.json`` file will be ignored.
.. code:: shell
SCALITY_ACCESS_KEY_ID=newAccessKey SCALITY_SECRET_ACCESS_KEY=newSecretKey npm start
Scality with SSL
~~~~~~~~~~~~~~~~~~~~~~
If you wish to use https with your local Zenko CloudServer, you need to set up
SSL certificates. Here is a simple guide of how to do it.
Deploying Zenko CloudServer
^^^^^^^^^^^^^^^^^^^^^^^^^^^
First, you need to deploy **Zenko CloudServer**. This can be done very easily
via `our **DockerHub**
page <https://hub.docker.com/r/scality/s3server/>`__ (you want to run it
with a file backend).
*Note:* If you don't have docker installed on your machine, here
are the `instructions to install it for your
distribution <https://docs.docker.com/engine/installation/>`__.
Updating your Zenko CloudServer container's config
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
You're going to add your certificates to your container. In order to do
so, you need to exec inside your Zenko CloudServer container. Run a
``$> docker ps`` and find your container's id (the corresponding image
name should be ``scality/s3server``. Copy the corresponding container id
(here we'll use ``894aee038c5e``, and run:
.. code:: sh
$> docker exec -it 894aee038c5e bash
You're now inside your container, using an interactive terminal :)
Generate SSL key and certificates
**********************************
There are 5 steps to this generation. The paths where the different
files are stored are defined after the ``-out`` option in each command.
.. code:: sh
# Generate a private key for your CSR
$> openssl genrsa -out ca.key 2048
# Generate a self signed certificate for your local Certificate Authority
$> openssl req -new -x509 -extensions v3_ca -key ca.key -out ca.crt -days 99999 -subj "/C=US/ST=Country/L=City/O=Organization/CN=scality.test"
# Generate a key for Zenko CloudServer
$> openssl genrsa -out test.key 2048
# Generate a Certificate Signing Request for S3 Server
$> openssl req -new -key test.key -out test.csr -subj "/C=US/ST=Country/L=City/O=Organization/CN=*.scality.test"
# Generate a local-CA-signed certificate for S3 Server
$> openssl x509 -req -in test.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out test.crt -days 99999 -sha256
Update Zenko CloudServer ``config.json``
****************************************
Add a ``certFilePaths`` section to ``./config.json`` with the
appropriate paths:
.. code:: json
"certFilePaths": {
"key": "./test.key",
"cert": "./test.crt",
"ca": "./ca.crt"
}
Run your container with the new config
****************************************
First, you need to exit your container. Simply run ``$> exit``. Then,
you need to restart your container. Normally, a simple
``$> docker restart s3server`` should do the trick.
Update your host config
^^^^^^^^^^^^^^^^^^^^^^^
Associate local IP addresses with the hostname
***********************************************
In your ``/etc/hosts`` file on Linux, OS X, or Unix (with root
permissions), edit the localhost line so it looks like this:
::
127.0.0.1 localhost s3.scality.test
Copy the local certificate authority from your container
*********************************************************
In the above commands, it's the file named ``ca.crt``. Choose the path
you want to save this file at (here we chose ``/root/ca.crt``), and run
something like:
.. code:: sh
$> docker cp 894aee038c5e:/usr/src/app/ca.crt /root/ca.crt
Test your config
^^^^^^^^^^^^^^^^^
If you do not have aws-sdk installed, run ``$> npm install aws-sdk``. In
a ``test.js`` file, paste the following script:
.. code:: js
const AWS = require('aws-sdk');
const fs = require('fs');
const https = require('https');
const httpOptions = {
agent: new https.Agent({
// path on your host of the self-signed certificate
ca: fs.readFileSync('./ca.crt', 'ascii'),
}),
};
const s3 = new AWS.S3({
httpOptions,
accessKeyId: 'accessKey1',
secretAccessKey: 'verySecretKey1',
// The endpoint must be s3.scality.test, else SSL will not work
endpoint: 'https://s3.scality.test:8000',
sslEnabled: true,
// With this setup, you must use path-style bucket access
s3ForcePathStyle: true,
});
const bucket = 'cocoriko';
s3.createBucket({ Bucket: bucket }, err => {
if (err) {
return console.log('err createBucket', err);
}
return s3.deleteBucket({ Bucket: bucket }, err => {
if (err) {
return console.log('err deleteBucket', err);
}
return console.log('SSL is cool!');
});
});
Now run that script with ``$> nodejs test.js``. If all goes well, it
should output ``SSL is cool!``. Enjoy that added security!
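If you'd rather check the TLS setup without node.js, a quick alternative,
assuming ``ca.crt`` was copied to ``/root/ca.crt`` as above, is:
.. code:: sh
$> openssl s_client -connect s3.scality.test:8000 -CAfile /root/ca.crt
# look for "Verify return code: 0 (ok)" in the output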
.. |CircleCI| image:: https://circleci.com/gh/scality/S3.svg?style=svg
:target: https://circleci.com/gh/scality/S3
.. |Scality CI| image:: http://ci.ironmann.io/gh/scality/S3.svg?style=svg&circle-token=1f105b7518b53853b5b7cf72302a3f75d8c598ae
:target: http://ci.ironmann.io/gh/scality/S3

View File

@ -1,642 +0,0 @@
Integrations
++++++++++++
High Availability
=================
`Docker swarm <https://docs.docker.com/engine/swarm/>`__ is a
clustering tool developed by Docker and ready to use with its
containers. It allows you to start a service, which we define and use as a
means to ensure Zenko CloudServer's continuous availability to the end user.
Indeed, a swarm defines a manager and n workers among n+1 servers. We
will do a basic setup in this tutorial, with just 3 servers, which
already provides a strong service resiliency, whilst remaining easy to
do as an individual. We will use NFS through docker to share data and
metadata between the different servers.
You will see that the steps of this tutorial are defined as **On
Server**, **On Clients**, **On All Machines**. This refers respectively
to NFS Server, NFS Clients, or NFS Server and Clients. In our example,
the IP of the Server will be **10.200.15.113**, while the IPs of the
Clients will be **10.200.15.96 and 10.200.15.97**
Installing docker
-----------------
Any version from docker 1.12.6 onwards should work; we used Docker
17.03.0-ce for this tutorial.
On All Machines
~~~~~~~~~~~~~~~
On Ubuntu 14.04
^^^^^^^^^^^^^^^
The docker website has `solid
documentation <https://docs.docker.com/engine/installation/linux/ubuntu/>`__.
We have chosen to install the aufs dependency, as recommended by Docker.
Here are the required commands:
.. code:: sh
$> sudo apt-get update
$> sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
$> sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
$> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
$> sudo apt-get update
$> sudo apt-get install docker-ce
On CentOS 7
^^^^^^^^^^^
The docker website has `solid
documentation <https://docs.docker.com/engine/installation/linux/centos/>`__.
Here are the required commands:
.. code:: sh
$> sudo yum install -y yum-utils
$> sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
$> sudo yum makecache fast
$> sudo yum install docker-ce
$> sudo systemctl start docker
Configure NFS
-------------
On Clients
~~~~~~~~~~
Your NFS Clients will mount Docker volumes over your NFS Server's shared
folders. Hence, you don't have to mount anything manually; you just have
to install the NFS commons:
On Ubuntu 14.04
^^^^^^^^^^^^^^^
Simply install the NFS commons:
.. code:: sh
$> sudo apt-get install nfs-common
On CentOS 7
^^^^^^^^^^^
Install the NFS utils, and then start the required services:
.. code:: sh
$> yum install nfs-utils
$> sudo systemctl enable rpcbind
$> sudo systemctl enable nfs-server
$> sudo systemctl enable nfs-lock
$> sudo systemctl enable nfs-idmap
$> sudo systemctl start rpcbind
$> sudo systemctl start nfs-server
$> sudo systemctl start nfs-lock
$> sudo systemctl start nfs-idmap
On Server
~~~~~~~~~
Your NFS Server will be the machine to physically host the data and
metadata. The packages we will install on it are slightly different
from the ones we installed on the clients.
On Ubuntu 14.04
^^^^^^^^^^^^^^^
Install the NFS server specific package and the NFS commons:
.. code:: sh
$> sudo apt-get install nfs-kernel-server nfs-common
On CentOS 7
^^^^^^^^^^^
Same steps as with the client: install the NFS utils and start the
required services:
.. code:: sh
$> yum install nfs-utils
$> sudo systemctl enable rpcbind
$> sudo systemctl enable nfs-server
$> sudo systemctl enable nfs-lock
$> sudo systemctl enable nfs-idmap
$> sudo systemctl start rpcbind
$> sudo systemctl start nfs-server
$> sudo systemctl start nfs-lock
$> sudo systemctl start nfs-idmap
On Ubuntu 14.04 and CentOS 7
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Choose where your shared data and metadata from your local `Zenko CloudServer
<http://www.zenko.io/cloudserver/>`__ will be stored.
We chose to go with /var/nfs/data and /var/nfs/metadata. You also need
to set proper sharing permissions for these folders as they'll be shared
over NFS:
.. code:: sh
$> mkdir -p /var/nfs/data /var/nfs/metadata
$> chmod -R 777 /var/nfs/
Now you need to update your **/etc/exports** file. This is the file that
configures network permissions and rwx permissions for NFS access. By
default, Ubuntu applies the no\_subtree\_check option, so we declared
both folders with the same permissions, even though they're in the same
tree:
.. code:: sh
$> sudo vim /etc/exports
In this file, add the following lines:
.. code:: sh
/var/nfs/data 10.200.15.96(rw,sync,no_root_squash) 10.200.15.97(rw,sync,no_root_squash)
/var/nfs/metadata 10.200.15.96(rw,sync,no_root_squash) 10.200.15.97(rw,sync,no_root_squash)
Export this new NFS table:
.. code:: sh
$> sudo exportfs -a
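To check the export from one of your Clients, you can query the Server
(a quick sanity check, assuming the Server IP from this example and that
``showmount`` is available from the NFS commons):
.. code:: sh
$> showmount -e 10.200.15.113
# /var/nfs/data and /var/nfs/metadata should both appear in the export list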
Finally, you need to allow NFS mounts from Docker volumes on other
machines. To do so, you need to change the Docker config in
**/lib/systemd/system/docker.service**:
.. code:: sh
$> sudo vim /lib/systemd/system/docker.service
In this file, change the **MountFlags** option:
.. code:: sh
MountFlags=shared
Now you just need to restart the NFS server and docker daemons so your
changes apply.
On Ubuntu 14.04
^^^^^^^^^^^^^^^
Restart your NFS Server and docker services:
.. code:: sh
$> sudo service nfs-kernel-server restart
$> sudo service docker restart
On CentOS 7
^^^^^^^^^^^
Restart your NFS Server and docker daemons:
.. code:: sh
$> sudo systemctl restart nfs-server
$> sudo systemctl daemon-reload
$> sudo systemctl restart docker
Set up your Docker Swarm service
--------------------------------
On All Machines
~~~~~~~~~~~~~~~
On Ubuntu 14.04 and CentOS 7
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We will now set up the Docker volumes that will be mounted to the NFS
Server and serve as data and metadata storage for Zenko CloudServer. These two
commands have to be replicated on all machines:
.. code:: sh
$> docker volume create --driver local --opt type=nfs --opt o=addr=10.200.15.113,rw --opt device=:/var/nfs/data --name data
$> docker volume create --driver local --opt type=nfs --opt o=addr=10.200.15.113,rw --opt device=:/var/nfs/metadata --name metadata
There is no need to ``docker exec`` into these volumes to mount them: the
Docker Swarm manager will do it when the Docker service is started.
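You can verify that both volumes were created with:
.. code:: sh
$> docker volume ls
# the 'data' and 'metadata' volumes should be listed on every machine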
On Server
^^^^^^^^^
To start a Docker service on a Docker Swarm cluster, you first have to
initialize that cluster (i.e.: define a manager), then have the
workers/nodes join in, and then start the service. Initialize the swarm
cluster, and look at the response:
.. code:: sh
$> docker swarm init --advertise-addr 10.200.15.113
Swarm initialized: current node (db2aqfu3bzfzzs9b1kfeaglmq) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-5yxxencrdoelr7mpltljn325uz4v6fe1gojl14lzceij3nujzu-2vfs9u6ipgcq35r90xws3stka \
10.200.15.113:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
On Clients
^^^^^^^^^^
Simply copy/paste the command provided by your docker swarm init. When
all goes well, you'll get something like this:
.. code:: sh
$> docker swarm join --token SWMTKN-1-5yxxencrdoelr7mpltljn325uz4v6fe1gojl14lzceij3nujzu-2vfs9u6ipgcq35r90xws3stka 10.200.15.113:2377
This node joined a swarm as a worker.
On Server
^^^^^^^^^
Start the service on your swarm cluster!
.. code:: sh
$> docker service create --name s3 --replicas 1 --mount type=volume,source=data,target=/usr/src/app/localData --mount type=volume,source=metadata,target=/usr/src/app/localMetadata -p 8000:8000 scality/s3server
If you run a docker service ls, you should have the following output:
.. code:: sh
$> docker service ls
ID NAME MODE REPLICAS IMAGE
ocmggza412ft s3 replicated 1/1 scality/s3server:latest
If your service won't start, consider disabling apparmor/SELinux.
Testing your High Availability S3Server
---------------------------------------
On All Machines
~~~~~~~~~~~~~~~
On Ubuntu 14.04 and CentOS 7
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Try to find out where your Scality Zenko CloudServer is actually running using
the **docker ps** command. It can be on any node of the swarm cluster,
manager or worker. When you find it, you can kill it with **docker stop
<container id>**, and you'll see it respawn on a different node of the
swarm cluster. This shows that if one of your servers fails, or if docker
stops unexpectedly, your end users will still be able to access your
local Zenko CloudServer.
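A minimal failover check, using a hypothetical container ID, could look like
this:
.. code:: sh
# On the node currently running the service
$> docker ps
$> docker stop <container id>
# After a few seconds, check where the task was rescheduled
$> docker service ps s3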
Troubleshooting
---------------
To troubleshoot the service you can run:
.. code:: sh
$> docker service ps s3
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
0ar81cw4lvv8chafm8pw48wbc s3.1 scality/s3server localhost.localdomain.localdomain Running Running 7 days ago
cvmf3j3bz8w6r4h0lf3pxo6eu \_ s3.1 scality/s3server localhost.localdomain.localdomain Shutdown Failed 7 days ago "task: non-zero exit (137)"
If the error is truncated, you can get a more detailed view of
the error by inspecting the Docker task ID:
.. code:: sh
$> docker inspect cvmf3j3bz8w6r4h0lf3pxo6eu
Off you go!
-----------
Let us know what you use this functionality for, and if you'd like any
specific developments around it. Or, even better: come and contribute to
our `Github repository <https://github.com/scality/s3/>`__! We look
forward to meeting you!
S3FS
====
Export your buckets as a filesystem with s3fs on top of Zenko CloudServer
`s3fs <https://github.com/s3fs-fuse/s3fs-fuse>`__ is an open source
tool that allows you to mount an S3 bucket on a filesystem-like backend.
It is available both on Debian and RedHat distributions. For this
tutorial, we used an Ubuntu 14.04 host to deploy and use s3fs over
Scality's Zenko CloudServer.
Deploying Zenko CloudServer with SSL
------------------------------------
First, you need to deploy **Zenko CloudServer**. This can be done very easily
via `our DockerHub
page <https://hub.docker.com/r/scality/s3server/>`__ (you want to run it
with a file backend).
*Note:* If you don't have docker installed on your machine, here
are the `instructions to install it for your
distribution <https://docs.docker.com/engine/installation/>`__.
You must also set up SSL with Zenko CloudServer to use s3fs. We
have a nice
`tutorial <https://s3.scality.com/v1.0/page/scality-with-ssl>`__ to help
you do it.
s3fs setup
----------
Installing s3fs
~~~~~~~~~~~~~~~
s3fs has quite a few dependencies. As explained in their
`README <https://github.com/s3fs-fuse/s3fs-fuse/blob/master/README.md#installation>`__,
the following commands should install everything for Ubuntu 14.04:
.. code:: sh
$> sudo apt-get install automake autotools-dev g++ git libcurl4-gnutls-dev
$> sudo apt-get install libfuse-dev libssl-dev libxml2-dev make pkg-config
Now you want to install s3fs per se:
.. code:: sh
$> git clone https://github.com/s3fs-fuse/s3fs-fuse.git
$> cd s3fs-fuse
$> ./autogen.sh
$> ./configure
$> make
$> sudo make install
Check that s3fs is properly installed by checking its version; it should
answer as below:
.. code:: sh
$> s3fs --version
Amazon Simple Storage Service File System V1.80(commit:d40da2c) with OpenSSL
Configuring s3fs
~~~~~~~~~~~~~~~~
s3fs expects you to provide it with a password file. Our file is
``/etc/passwd-s3fs``. The structure for this file is
``ACCESSKEYID:SECRETKEYID``, so, for S3Server, you can run:
.. code:: sh
$> echo 'accessKey1:verySecretKey1' > /etc/passwd-s3fs
$> chmod 600 /etc/passwd-s3fs
Using Zenko CloudServer with s3fs
---------------------------------
First, you're going to need a mountpoint; we chose ``/mnt/tests3fs``:
.. code:: sh
$> mkdir /mnt/tests3fs
Then, you want to create a bucket on your local Zenko CloudServer; we named it
``tests3fs``:
.. code:: sh
$> s3cmd mb s3://tests3fs
*Note:* If you've never used s3cmd with our Zenko CloudServer, our README
provides you with a `recommended
config <https://github.com/scality/S3/blob/master/README.md#s3cmd>`__.
Now you can mount your bucket to your mountpoint with s3fs:
.. code:: sh
$> s3fs tests3fs /mnt/tests3fs -o passwd_file=/etc/passwd-s3fs -o url="https://s3.scality.test:8000/" -o use_path_request_style
If you're curious, the structure of this command is
``s3fs BUCKET_NAME PATH/TO/MOUNTPOINT -o OPTIONS``, and the
options are mandatory and serve the following purposes:
- ``passwd_file``: specify the path to the password file;
- ``url``: specify the hostname used by your SSL provider;
- ``use_path_request_style``: force path style (by default, s3fs
uses subdomains (DNS style)).
From now on, you can either add files to your mountpoint, or add
objects to your bucket, and they'll show up in the other.
For example, let's create two files, and then a directory with a file
in our mountpoint:
.. code:: sh
$> touch /mnt/tests3fs/file1 /mnt/tests3fs/file2
$> mkdir /mnt/tests3fs/dir1
$> touch /mnt/tests3fs/dir1/file3
Now, I can use s3cmd to show me what is actually in S3Server:
.. code:: sh
$> s3cmd ls -r s3://tests3fs
2017-02-28 17:28 0 s3://tests3fs/dir1/
2017-02-28 17:29 0 s3://tests3fs/dir1/file3
2017-02-28 17:28 0 s3://tests3fs/file1
2017-02-28 17:28 0 s3://tests3fs/file2
Now you can enjoy a filesystem view on your local Zenko CloudServer!
Duplicity
=========
How to back up your files with Zenko CloudServer.
Installing
-----------
Installing Duplicity and its dependencies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To install
`Duplicity <http://duplicity.nongnu.org/index.html>`__, you have to
download `this
tarball <https://code.launchpad.net/duplicity/0.7-series/0.7.11/+download/duplicity-0.7.11.tar.gz>`__,
decompress it, and then check out the README inside, which will give you
a list of dependencies to install. If you're using Ubuntu 14.04, this is
your lucky day: here is a lazy step-by-step install of the dependencies.
.. code:: sh
$> apt-get install librsync-dev gnupg
$> apt-get install python-dev python-pip python-lockfile
$> pip install -U boto
Then you want to actually install Duplicity:
.. code:: sh
$> tar zxvf duplicity-0.7.11.tar.gz
$> cd duplicity-0.7.11
$> python setup.py install
Using
------
Testing your installation
~~~~~~~~~~~~~~~~~~~~~~~~~~~
First, we're just going to quickly check that Zenko CloudServer is actually
running. To do so, simply run ``$> docker ps``. You should see one
container whose image is ``scality/s3server``. If that is not the case, try
``$> docker start s3server``, and check again.
Secondly, as you probably know, Duplicity uses a module called **Boto**
to send requests to S3. Boto requires a configuration file located in
``/etc/boto.cfg`` to have your credentials and preferences. Here is
a minimalistic config `that you can fine-tune following these
instructions <http://boto.cloudhackers.com/en/latest/getting_started.html>`__.
::
[Credentials]
aws_access_key_id = accessKey1
aws_secret_access_key = verySecretKey1
[Boto]
# If using SSL, set to True
is_secure = False
# If using SSL, unmute and provide absolute path to local CA certificate
# ca_certificates_file = /absolute/path/to/ca.crt
*Note:* *If you want to set up SSL with Zenko CloudServer, check out our
`tutorial <http://link/to/SSL/tutorial>`__*
At this point, we've met all the requirements to start running Zenko CloudServer
as a backend to Duplicity. So we should be able to back up a local
folder/file to local S3. Let's try with the duplicity decompressed
folder:
.. code:: sh
$> duplicity duplicity-0.7.11 "s3://127.0.0.1:8000/testbucket/"
*Note:* *Duplicity will prompt you for a symmetric encryption
passphrase. Save it somewhere as you will need it to recover your
data. Alternatively, you can also add the ``--no-encryption`` flag
and the data will be stored plain.*
If this command is successful, you will get an output looking like this:
::
--------------[ Backup Statistics ]--------------
StartTime 1486486547.13 (Tue Feb 7 16:55:47 2017)
EndTime 1486486547.40 (Tue Feb 7 16:55:47 2017)
ElapsedTime 0.27 (0.27 seconds)
SourceFiles 388
SourceFileSize 6634529 (6.33 MB)
NewFiles 388
NewFileSize 6634529 (6.33 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 388
RawDeltaSize 6392865 (6.10 MB)
TotalDestinationSizeChange 2003677 (1.91 MB)
Errors 0
-------------------------------------------------
Congratulations! You can now backup to your local S3 through duplicity
:)
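To verify that your data can come back, you can restore that backup into a
separate folder (a minimal sketch, reusing the bucket and passphrase from the
example above; the target directory name is arbitrary):
.. code:: sh
$> duplicity restore "s3://127.0.0.1:8000/testbucket/" restored-duplicity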
Automating backups
~~~~~~~~~~~~~~~~~~~
Now you probably want to back up your files periodically. The easiest
way to do this is to write a bash script and add it to your crontab.
Here is my suggestion for such a file:
.. code:: sh
#!/bin/bash
# Export your passphrase so you don't have to type anything
export PASSPHRASE="mypassphrase"
# If you want to use a GPG Key, put it here and unmute the line below
#GPG_KEY=
# Define your backup bucket, with localhost specified
DEST="s3://127.0.0.1:8000/testbuckets3server/"
# Define the absolute path to the folder you want to backup
SOURCE=/root/testfolder
# Set to "full" for full backups, and "incremental" for incremental backups
# Warning: you have to perform one full backup before you can perform
# incremental ones on top of it
FULL=incremental
# How long to keep backups for; if you don't want to delete old
# backups, keep empty; otherwise, syntax is "1Y" for one year, "1M"
# for one month, "1D" for one day
OLDER_THAN="1Y"
# is_running checks whether duplicity is currently completing a task
is_running=$(ps -ef | grep duplicity | grep python | wc -l)
# If duplicity is already completing a task, this will simply not run
if [ $is_running -eq 0 ]; then
echo "Backup for ${SOURCE} started"
# If you want to delete backups older than a certain time, we do it here
if [ "$OLDER_THAN" != "" ]; then
echo "Removing backups older than ${OLDER_THAN}"
duplicity remove-older-than ${OLDER_THAN} ${DEST}
fi
# This is where the actual backup takes place
echo "Backing up ${SOURCE}..."
duplicity ${FULL} \
${SOURCE} ${DEST}
# If you're using GPG, paste this in the command above
# --encrypt-key=${GPG_KEY} --sign-key=${GPG_KEY} \
# If you want to exclude a subfolder/file, put it below and
# paste this
# in the command above
# --exclude=/${SOURCE}/path_to_exclude \
echo "Backup for ${SOURCE} complete"
echo "------------------------------------"
fi
# Forget the passphrase...
unset PASSPHRASE
So let's say you put this file in ``/usr/local/sbin/backup.sh.`` Next
you want to run ``crontab -e`` and paste your configuration in the file
that opens. If you're unfamiliar with Cron, here is a good `How
To <https://help.ubuntu.com/community/CronHowto>`__. The folder I'm
backing up is a folder I modify constantly during my workday, so I want
incremental backups every 5 minutes from 8AM to 9PM, Monday to Friday. Here is
the line I will paste in my crontab:
.. code:: cron
*/5 8-20 * * 1-5 /usr/local/sbin/backup.sh
Now I can try and add / remove files from the folder I'm backing up, and
I will see incremental backups in my bucket.
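To see the backup chains that accumulate in the bucket over time, you can also
ask duplicity for its collection status (assuming the same destination as in
the script above):
.. code:: sh
$> duplicity collection-status "s3://127.0.0.1:8000/testbuckets3server/"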

View File

@ -1,396 +0,0 @@
Using Public Clouds as data backends
====================================
Introduction
------------
As stated in our `GETTING STARTED guide <../GETTING_STARTED/#location-configuration>`__,
new data backends can be added by creating a region (also called location
constraint) with the right endpoint and credentials.
This section of the documentation shows you how to set up our currently
supported public cloud backends:
- `Amazon S3 <#aws-s3-as-a-data-backend>`__ ;
- `Microsoft Azure <#microsoft-azure-as-a-data-backend>`__ .
For each public cloud backend, you will have to edit your CloudServer
:code:`locationConfig.json` and do a few setup steps on the applicable public
cloud backend.
AWS S3 as a data backend
------------------------
From the AWS S3 Console (or any AWS S3 CLI tool)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a bucket where you will host your data for this new location constraint.
This bucket must have versioning enabled:
- This is an option you may choose to activate at step 2 of Bucket Creation in
the Console;
- With AWS CLI, use :code:`put-bucket-versioning` from the :code:`s3api`
commands on your bucket of choice;
- Using other tools, please refer to your tool's documentation.
In this example, our bucket will be named ``zenkobucket`` and has versioning
enabled.
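For instance, with the AWS CLI and credentials allowed to manage that bucket,
enabling versioning could look like this sketch (the bucket name matches this
example):
.. code:: shell
$> aws s3api put-bucket-versioning --bucket zenkobucket \
--versioning-configuration Status=Enabled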
From the CloudServer repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
locationConfig.json
^^^^^^^^^^^^^^^^^^^
Edit this file to add a new location constraint. This location constraint will
contain the information for the AWS S3 bucket to which you will be writing your
data whenever you create a CloudServer bucket in this location.
There are a few configurable options here:
- :code:`type` : set to :code:`aws_s3` to indicate this location constraint is
writing data to AWS S3;
- :code:`legacyAwsBehavior` : set to :code:`true` to indicate this region should
behave like AWS S3 :code:`us-east-1` region, set to :code:`false` to indicate
this region should behave like any other AWS S3 region;
- :code:`bucketName` : set to an *existing bucket* in your AWS S3 Account; this
is the bucket in which your data will be stored for this location constraint;
- :code:`awsEndpoint` : set to your bucket's endpoint, usually :code:`s3.amazonaws.com`;
- :code:`bucketMatch` : set to :code:`true` if you want your object name to be the
same in your local bucket and your AWS S3 bucket; set to :code:`false` if you
want your object name to be of the form :code:`{{localBucketName}}/{{objectname}}`
in your AWS S3 hosted bucket;
- :code:`credentialsProfile` and :code:`credentials` are two ways to provide
your AWS S3 credentials for that bucket, *use only one of them* :
- :code:`credentialsProfile` : set to the profile name allowing you to access
your AWS S3 bucket from your :code:`~/.aws/credentials` file;
- :code:`credentials` : set the two fields inside the object (:code:`accessKey`
and :code:`secretKey`) to their respective values from your AWS credentials.
.. code:: json
(...)
"aws-test": {
"type": "aws_s3",
"legacyAwsBehavior": true,
"details": {
"awsEndpoint": "s3.amazonaws.com",
"bucketName": "zenkobucket",
"bucketMatch": true,
"credentialsProfile": "zenko"
}
},
(...)
.. code:: json
(...)
"aws-test": {
"type": "aws_s3",
"legacyAwsBehavior": true,
"details": {
"awsEndpoint": "s3.amazonaws.com",
"bucketName": "zenkobucket",
"bucketMatch": true,
"credentials": {
"accessKey": "WHDBFKILOSDDVF78NPMQ",
"secretKey": "87hdfGCvDS+YYzefKLnjjZEYstOIuIjs/2X72eET"
}
}
},
(...)
.. WARNING::
If you set :code:`bucketMatch` to :code:`true`, we strongly advise that you
only have one local bucket per AWS S3 location.
With :code:`bucketMatch` set to :code:`true`, your object names in your
AWS S3 bucket will not be prefixed with your CloudServer bucket name. This
means that if you put an object :code:`foo` to your CloudServer bucket
:code:`zenko1` and you then put a different :code:`foo` to your CloudServer
bucket :code:`zenko2` and both :code:`zenko1` and :code:`zenko2` point to the
same AWS bucket, the second :code:`foo` will overwrite the first :code:`foo`.
~/.aws/credentials
^^^^^^^^^^^^^^^^^^
.. TIP::
If you explicitly set your :code:`accessKey` and :code:`secretKey` in the
:code:`credentials` object of your :code:`aws_s3` location in your
:code:`locationConfig.json` file, you may skip this section
Make sure your :code:`~/.aws/credentials` file has a profile matching the one
defined in your :code:`locationConfig.json`. Following our previous example, it
would look like:
.. code:: shell
[zenko]
aws_access_key_id=WHDBFKILOSDDVF78NPMQ
aws_secret_access_key=87hdfGCvDS+YYzefKLnjjZEYstOIuIjs/2X72eET
Start the server with the ability to write to AWS S3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Inside the repository, once all the files have been edited, you should be able
to start the server and start writing data to AWS S3 through CloudServer.
.. code:: shell
# Start the server locally
$> S3DATA=multiple npm start
Run the server as a docker container with the ability to write to AWS S3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. TIP::
If you set the :code:`credentials` object in your
:code:`locationConfig.json` file, you don't need to mount your
:code:`.aws/credentials` file
Mount all the files that have been edited to override defaults, and do a
standard Docker run; then you can start writing data to AWS S3 through
CloudServer.
.. code:: shell
# Start the server in a Docker container
$> sudo docker run -d --name CloudServer \
-v $(pwd)/data:/usr/src/app/localData \
-v $(pwd)/metadata:/usr/src/app/localMetadata \
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json \
-v $(pwd)/conf/authdata.json:/usr/src/app/conf/authdata.json \
-v ~/.aws/credentials:/root/.aws/credentials \
-e S3DATA=multiple -e ENDPOINT=http://localhost -p 8000:8000 \
scality/s3server
Testing: put an object to AWS S3 using CloudServer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to start testing pushing to AWS S3, you will need to create a local
bucket in the AWS S3 location constraint - this local bucket will only store the
metadata locally, while both the data and any user metadata (:code:`x-amz-meta`
headers sent with a PUT object, and tags) will be stored on AWS S3.
This example is based on all our previous steps.
.. code:: shell
# Create a local bucket storing data in AWS S3
$> s3cmd --host=127.0.0.1:8000 mb s3://zenkobucket --region=aws-test
# Put an object to AWS S3, and store the metadata locally
$> s3cmd --host=127.0.0.1:8000 put /etc/hosts s3://zenkobucket/testput
upload: '/etc/hosts' -> 's3://zenkobucket/testput' [1 of 1]
330 of 330 100% in 0s 380.87 B/s done
# List locally to check you have the metadata
$> s3cmd --host=127.0.0.1:8000 ls s3://zenkobucket
2017-10-23 10:26 330 s3://zenkobucket/testput
Then, from the AWS Console, if you go into your bucket, you should see your
newly uploaded object:
.. figure:: ../res/aws-console-successful-put.png
:alt: AWS S3 Console upload example
Troubleshooting
~~~~~~~~~~~~~~~
Make sure your :code:`~/.s3cfg` file has credentials matching your local
CloudServer credentials defined in :code:`conf/authdata.json`. By default, the
access key is :code:`accessKey1` and the secret key is :code:`verySecretKey1`.
For more information, refer to our template `~/.s3cfg <./CLIENTS/#s3cmd>`__.
Pre-existing objects in your AWS S3 hosted bucket can unfortunately not be
accessed by CloudServer at this time.
Make sure versioning is enabled in your remote AWS S3 hosted bucket. To check,
using the AWS Console, click on your bucket name, then on "Properties" at the
top, and then you should see something like this:
.. figure:: ../res/aws-console-versioning-enabled.png
:alt: AWS Console showing versioning enabled
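Alternatively, assuming the AWS CLI is configured with credentials for that
account, you can check versioning from the command line:
.. code:: shell
$> aws s3api get-bucket-versioning --bucket zenkobucket
# should return a Status of "Enabled"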
Microsoft Azure as a data backend
---------------------------------
From the MS Azure Console
~~~~~~~~~~~~~~~~~~~~~~~~~
From your Storage Account dashboard, create a container where you will host your
data for this new location constraint.
You will also need to get one of your Storage Account Access Keys, and to
provide it to CloudServer.
This can be found from your Storage Account dashboard, under "Settings", then
"Access keys".
In this example, our container will be named ``zenkontainer``, and will belong
to the ``zenkomeetups`` Storage Account.
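If you prefer a CLI to the MS Azure Console, the container can also be created
with the Azure CLI (a sketch, assuming the CLI is installed and using the names
from this example; replace the access key with one of yours):
.. code:: shell
$> az storage container create --name zenkontainer \
--account-name zenkomeetups \
--account-key {{yourStorageAccountAccessKey}}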
From the CloudServer repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
locationConfig.json
^^^^^^^^^^^^^^^^^^^
Edit this file to add a new location constraint. This location constraint will
contain the information for the MS Azure container to which you will be writing
your data whenever you create a CloudServer bucket in this location.
There are a few configurable options here:
- :code:`type` : set to :code:`azure` to indicate this location constraint is
writing data to MS Azure;
- :code:`legacyAwsBehavior` : set to :code:`true` to indicate this region should
behave like AWS S3 :code:`us-east-1` region, set to :code:`false` to indicate
this region should behave like any other AWS S3 region (in the case of MS Azure
hosted data, this is mostly relevant for the format of errors);
- :code:`azureStorageEndpoint` : set to your storage account's endpoint, usually
:code:`https://{{storageAccountName}}.blob.core.windows.net`;
- :code:`azureContainerName` : set to an *existing container* in your MS Azure
storage account; this is the container in which your data will be stored for
this location constraint;
- :code:`bucketMatch` : set to :code:`true` if you want your object name to be
the same in your local bucket and your MS Azure container; set to
:code:`false` if you want your object name to be of the form
:code:`{{localBucketName}}/{{objectname}}` in your MS Azure container ;
- :code:`azureStorageAccountName` : the MS Azure Storage Account to which your
container belongs;
- :code:`azureStorageAccessKey` : one of the Access Keys associated to the above
defined MS Azure Storage Account.
.. code:: json
(...)
"azure-test": {
"type": "azure",
"legacyAwsBehavior": false,
"details": {
"azureStorageEndpoint": "https://zenkomeetups.blob.core.windows.net/",
"bucketMatch": true,
"azureContainerName": "zenkontainer",
"azureStorageAccountName": "zenkomeetups",
"azureStorageAccessKey": "auhyDo8izbuU4aZGdhxnWh0ODKFP3IWjsN1UfFaoqFbnYzPj9bxeCVAzTIcgzdgqomDKx6QS+8ov8PYCON0Nxw=="
}
},
(...)
.. WARNING::
If you set :code:`bucketMatch` to :code:`true`, we strongly advise that you
only have one local bucket per MS Azure location.
With :code:`bucketMatch` set to :code:`true`, your object names in your
MS Azure container will not be prefixed with your CloudServer bucket name.
This means that if you put an object :code:`foo` to your CloudServer bucket
:code:`zenko1` and you then put a different :code:`foo` to your CloudServer
bucket :code:`zenko2` and both :code:`zenko1` and :code:`zenko2` point to the
same MS Azure container, the second :code:`foo` will overwrite the first
:code:`foo`.
.. TIP::
You may export environment variables to **override** some of your
:code:`locationConfig.json` variables; the syntax for them is
:code:`{{region-name}}_{{ENV_VAR_NAME}}`; currently, the available variables
are those shown below, with the values used in the current example:
.. code:: shell
$> export azure-test_AZURE_STORAGE_ACCOUNT_NAME="zenkomeetups"
$> export azure-test_AZURE_STORAGE_ACCESS_KEY="auhyDo8izbuU4aZGdhxnWh0ODKFP3IWjsN1UfFaoqFbnYzPj9bxeCVAzTIcgzdgqomDKx6QS+8ov8PYCON0Nxw=="
$> export azure-test_AZURE_STORAGE_ENDPOINT="https://zenkomeetups.blob.core.windows.net/"
Start the server with the ability to write to MS Azure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Inside the repository, once all the files have been edited, you should be able
to start the server and start writing data to MS Azure through CloudServer.
.. code:: shell
# Start the server locally
$> S3DATA=multiple npm start
Run the server as a docker container with the ability to write to MS Azure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mount all the files that have been edited to override defaults, and do a
standard Docker run; then you can start writing data to MS Azure through
CloudServer.
.. code:: shell
# Start the server in a Docker container
$> sudo docker run -d --name CloudServer \
-v $(pwd)/data:/usr/src/app/localData \
-v $(pwd)/metadata:/usr/src/app/localMetadata \
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json \
-v $(pwd)/conf/authdata.json:/usr/src/app/conf/authdata.json \
-e S3DATA=multiple -e ENDPOINT=http://localhost -p 8000:8000 \
scality/s3server
Testing: put an object to MS Azure using CloudServer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to start testing pushing to MS Azure, you will need to create a local
bucket in the MS Azure region - this local bucket will only store the metadata
locally, while both the data and any user metadata (:code:`x-amz-meta` headers
sent with a PUT object, and tags) will be stored on MS Azure.
This example is based on all our previous steps.
.. code:: shell
# Create a local bucket storing data in MS Azure
$> s3cmd --host=127.0.0.1:8000 mb s3://zenkontainer --region=azure-test
# Put an object to MS Azure, and store the metadata locally
$> s3cmd --host=127.0.0.1:8000 put /etc/hosts s3://zenkontainer/testput
upload: '/etc/hosts' -> 's3://zenkontainer/testput' [1 of 1]
330 of 330 100% in 0s 380.87 B/s done
# List locally to check you have the metadata
$> s3cmd --host=127.0.0.1:8000 ls s3://zenkontainer
2017-10-24 14:38 330 s3://zenkontainer/testput
Then, from the MS Azure Console, if you go into your container, you should see
your newly uploaded object:
.. figure:: ../res/azure-console-successful-put.png
:alt: MS Azure Console upload example
Troubleshooting
~~~~~~~~~~~~~~~
Make sure your :code:`~/.s3cfg` file has credentials matching your local
CloudServer credentials defined in :code:`conf/authdata.json`. By default, the
access key is :code:`accessKey1` and the secret key is :code:`verySecretKey1`.
For more information, refer to our template `~/.s3cfg <./CLIENTS/#s3cmd>`__.
Pre-existing objects in your MS Azure container can unfortunately not be
accessed by CloudServer at this time.
For any data backend
--------------------
From the CloudServer repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
config.json
^^^^^^^^^^^
.. IMPORTANT::
You only need to follow this section if you want to define a given location
as the default for a specific endpoint
Edit the :code:`restEndpoints` section of your :code:`config.json` file to add
an endpoint definition mapping that specific endpoint to the location you want
to use as its default.
In this example, we'll make :code:`custom-location` our default location for the
endpoint :code:`zenkotos3.com`:
.. code:: json
(...)
"restEndpoints": {
"localhost": "us-east-1",
"127.0.0.1": "us-east-1",
"cloudserver-front": "us-east-1",
"s3.docker.test": "us-east-1",
"127.0.0.2": "us-east-1",
"zenkotos3.com": "custom-location"
},
(...)

8
docs/antora.yml Normal file
View File

@ -0,0 +1,8 @@
name: cloudserver
title: Zenko CloudServer
version: '1.0'
start_page: ROOT:README.adoc
nav:
- modules/ROOT/nav.adoc
- modules/USERS/nav.adoc
- modules/DEVELOPERS/nav.adoc

View File

@ -1,161 +0,0 @@
# -*- coding: utf-8 -*-
#
# Zope docs documentation build configuration file, created by
# sphinx-quickstart on Fri Feb 20 16:22:03 2009.
#
# This file is execfile()d with the current directory set to its containing
# dir.
#
# The contents of this file are pickled, so don't put values in the namespace
# that aren't pickleable (module imports are okay, they're removed
# automatically).
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
# import sys
# import os
# If your extensions are in another directory, add it here. If the directory
# is relative to the documentation root, use os.path.abspath to make it
# absolute, like shown here.
# sys.path.append(os.path.abspath('.'))
# General configuration
# ---------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = []
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The encoding of source files.
# source_encoding = 'utf-8'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = u'scality-zenko-cloudserver'
copyright = u'Apache License Version 2.0, 2004 http://www.apache.org/licenses/'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = '7.0.0'
# The full version, including alpha/beta/rc tags.
release = '7.0.0'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
# language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
# today = ''
# Else, today_fmt is used as the format for a strftime call.
# today_fmt = '%B %d, %Y'
# List of documents that shouldn't be included in the build.
# unused_docs = []
# List of directories, relative to source directory, that shouldn't be searched
# for source files.
exclude_trees = ['_build']
# The reST default role (used for this markup: `text`) to use for
# all documents.
# default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
# add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
# add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
# show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# Options for HTML output
# -----------------------
# The style sheet to use for HTML and HTML Help pages. A file of that name
# must exist either in Sphinx' static/ path, or in one of the custom paths
# given in html_static_path.
html_style = 'css/default.css'
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
# html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
# html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
html_logo = '../res/scality-cloudserver-logo.png'
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
# html_favicon = None
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
# html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
# html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
# html_sidebars = {}
# Additional templates that should be rendered to pages, maps page names to
# template names.
# html_additional_pages = {}
# If false, no module index is generated.
# html_use_modindex = True
# If false, no index is generated.
# html_use_index = True
# If true, the index is split into individual pages for each letter.
# html_split_index = False
# If true, the reST sources are included in the HTML build as _sources/<name>.
# html_copy_source = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
# html_use_opensearch = ''
# If nonempty, this is the file name suffix for HTML files (e.g. ".xhtml").
# html_file_suffix = ''
# Output file base name for HTML help builder.
htmlhelp_basename = 'ZenkoCloudServerdoc'

View File

@ -1,16 +0,0 @@
Scality Zenko CloudServer
=========================
.. _user-docs:
.. toctree::
:maxdepth: 2
:caption: Documentation
CONTRIBUTING
GETTING_STARTED
USING_PUBLIC_CLOUDS
CLIENTS
DOCKER
INTEGRATIONS
ARCHITECTURE

View File

@ -1,4 +0,0 @@
# http://www.mkdocs.org/user-guide/configuration/
# https://github.com/mkdocs/mkdocs/wiki/MkDocs-Themes
site_name: Scality Zenko CloudServer documentation

View File

@ -0,0 +1,91 @@
Zenko Cloudserver for Developers
=================================
:Revision: v1.0
:Date: 2018-03-20
:Email: <zenko@scality.com>
[.lead]
This set of documents aims at bootstrapping developers with Zenko's Cloudserver
module, so they can then go on and contribute features.
In order to achieve this, we're going to cover a number of subjects:
- <<cloning-and-building,cloning, installing, and building your own image>>;
- <<support-new-public-cloud, adding support to a new public cloud backend>>;
- <<telling-story-usecase, telling the community about your story or use case>>.
[[cloning-and-building]]
== Cloning, installing, and building your own image
To clone Zenko's Cloudserver, simply run:
~# git clone https://github.com/scality/S3 cloudserver
~# cd cloudserver
To install all dependencies (necessary to run), do:
~/cloudserver# npm install
TIP: Some optional dependencies may fail, resulting in you seeing `NPM WARN`
messages; these can safely be ignored.
// Add link to user doc
To run the service locally, use:
~/cloudserver# npm start
TIP: Refer to the User documentation for all available options
// Add link to Docker doc
To build your own Docker image, run:
~/cloudserver# docker build . -t {{YOUR_DOCKERHUB_ACCOUNT}}/cloudserver:{{OPTIONAL_VERSION_TAG}}
To then push your Docker image to your own hub, run:
~/cloudserver# docker push {{YOUR_DOCKERHUB_ACCOUNT}}/cloudserver:{{OPTIONAL_VERSION_TAG}}
NOTE: To perform this last operation, you will need to be authenticated with
DockerHub
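If you are not yet authenticated, a simple login should do (you will be
prompted for your DockerHub credentials):
~/cloudserver# docker login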
[[support-new-public-cloud]]
== Add support for a new Public Cloud backend
.Backend Support
[align="center",halign="center",valign="center",options="header"]
|=======================================================================
|Backend type |Currently supported |Active WIP |Community suggestion
|Private disk/fs |x | |
|AWS S3 |x | |
|Microsoft Azure |x | |
|Backblaze B2 | |x |
|Google Cloud | |x |
|Openstack Swift | | |x
|=======================================================================
IMPORTANT: Should you want to request support for a new backend, please do so
by opening a Github issue, and filling out the "Feature Request" section
of our template. Thanks!
We always encourage our community to offer new extensions to Zenko, and new
backend support is paramount to meeting more community needs.
If that is something you want to contribute (or just do on your own version of
the cloudserver image), go read our link:NEW_BACKEND.adoc[step-by-step guide] on
where to start to add support for a new backend.
//TODO:add link to contributing guidelines
If you wish to make this a contribution, please make sure you follow our
Contributing Guidelines.
If you need help with anything, please search our https://forum.scality.com[Forum]
for more information. If you can't find what you need, open a thread, and our
community members and core team will be right with you!
[[telling-story-usecase]]
== Telling the community about your story or use case
The best part of being open source is learning from such a diverse crowd. At
Scality, we're always curious to learn about what you do with Zenko and Zenko
Cloudserver.
If you wish to tell us your story, if you want us to advertise your extension,
or if you want to publish a tutorial on how to replicate your setup, please
reach out either on https://forum.scality.com[the Zenko Forum], or send us an
mailto:zenko@scality.com[email].

View File

@ -0,0 +1,54 @@
= Adding a new backend
One of Zenko Cloudserver's commitments is to simplify multicloud storage by
giving one API (the S3 API) to access all clouds. With that in mind, supporting
more and more backends is one of the Zenko Community's priorities. And you, as
a developer, are welcome to join that trend!
If you're planning to add a new backend for your own usage, go ahead and read
the doc. If you have any questions during the development process, search our
https://forum.scality.com[forum] and, if there is no answer to your question
already there, open a new thread.
//TODO: Add link to contributing Guidelines
If you're planning to contribute your backend support to our official
repository, please follow these steps:
- familiarize yourself with our Contributing Guidelines;
- open a Github issue and fill out Feature Request form, and specify you would
like to contribute it yourself;
- wait for our core team to get back to you with an answer on whether we are
interested in taking that contribution in (and hence committing to maintaining
it over time);
- once approved, fork this https://www.github.com/scality/S3[repository], and
get started!
- reach out to us on the https://forum.scality.com[forum] with any question you
may have during the development process (after reading this document, of
course!);
- when you think it's ready, let us know so that we create a feature branch
against which we'll compare and review your code;
- open a pull request with your changes against that dedicated feature branch;
//TODO: Add Hall of Fame section in the community report
- once that pull request gets merged, you're done (and you'll join our Hall of
Fame ;) );
- finally, we'll let you know when we merge this into master.
TIP: While we do take care of the final rebase (when we merge your feature
branch on master), we do ask that you keep up to date with our master until
then; find out more https://help.github.com/articles/syncing-a-fork/[here].
IMPORTANT: If we do not approve your feature request, you may of course still
work on supporting a new backend: all our "no" means is that we do
not have the resources, as part of our core development team, to
maintain this feature for the moment.
_If your code is clean and your extension works nicely, we will be_
_glad to advertise it as part of the Zenko Galaxy_
//TODO: Get approval for Zenko Galaxy as the name of our hub - sound appropriate with Orbit ;)
There are two main types of backend you could want Zenko to support:
== link:S3_COMPATIBLE_BACKENDS.adoc[S3 compatible data backends]
== link:NON_S3_COMPATIBLE_BACKENDS.adoc[Data backends using another protocol
than the S3 protocol]

View File

@ -0,0 +1,469 @@
= Adding support for data backends not supporting the S3 API
These backends are what makes Zenko so valuable: abstracting the complexity of
multiple APIs to let users work on a single common namespace across multiple
clouds.
This document aims at introducing you to the right files in Cloudserver (the
Zenko stack's subcomponent in charge of API translation, among other things) to
add support for your own backend of choice.
As usual, should you have any question, please reach out on the
https://forum.zenko.io[Zenko forum].
== General configuration
There are a number of constants and environment variables to define to support
a new data backend; here is a list of them and where to find them:
=== `/constants.js`
- give your backend type a name, as part of the `externalBackends` object;
- specify whether versioning is implemented, as part of the
`versioningNotImplemented` object;
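For example, here is a minimal sketch of those two additions, assuming your
backend type is named `ztore` (a placeholder used throughout this guide) and
assuming both objects are simple lookup maps keyed by backend name; check the
existing entries in `/constants.js` before copying this:

[source,js]
----
// /constants.js (sketch): declare the new backend type and whether it
// implements versioning; 'ztore' is a placeholder backend name.
const constants = {
    // ...other constants...
    externalBackends: { /* ...existing backends... */ ztore: true },
    versioningNotImplemented: { /* ...existing backends... */ ztore: true },
};

module.exports = constants;
----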
=== `/lib/Config.js`
- this is where you should put common utility functions, like the ones to parse
the location object from `locationConfig.json`;
- make sure you define environment variables (like `GCP_SERVICE_EMAIL`), as
we'll use those internally for the CI to test against the real remote backend;
=== `/lib/data/external/{{backendName}}Client.js`
- this file is where you'll instantiate your backend client; this should be a
class with a constructor taking the config object built in `/lib/Config.js` as
parameter;
- over time, you may need some utility functions which we've defined in the
folder `/api/apiUtils`, and in the file `/lib/data/external/utils`;
=== `/lib/data/external/utils.js`
- make sure to add options for `sourceLocationConstraintType` to be equal to
the name you gave your backend in `/constants.js`;
=== `/lib/data/external/{{BackendName}}_lib/`
- this folder is where you'll put the functions needed for supporting your
backend; keep your files as atomic as possible;
[[location-config-test-json]]
=== `/tests/locationConfig/locationConfigTests.json`
- this file is where you'll create location profiles to be used by your
functional tests;
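As a sketch, a test location profile for a hypothetical `ztore` backend could
look like the following; every field value is a placeholder, and the structure
mirrors the `aws_s3` profile shown in the S3-compatible backends guide, so
adapt the `details` fields to whatever your backend actually needs:

[source,json]
----
{
    "ztore-test-location": {
        "type": "ztore",
        "legacyAwsBehavior": false,
        "details": {
            "ztoreEndpoint": "ztore.example.com",
            "ztoreBucketName": "zenko-ci-bucket",
            "bucketMatch": true,
            "credentials": {
                "accessKey": "placeholderAccessKey",
                "secretKey": "placeholderSecretKey"
            }
        }
    }
}
----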
=== `/lib/data/locationConstraintParser.js`
- this is where you'll instantiate your client if the operation the end user
sent effectively writes to your backend; everything happens inside the
function `parseLC()`; you should add a condition that executes if
`locationObj.type` is the name of your backend (that you defined in
`constants.js`), and instantiates a client of yours. See the pseudocode below,
assuming the location type name is `ztore`:
[source,js]
----
(...) //<1>
const ZtoreClient = require('./external/ZtoreClient');
const { config } = require('../Config'); //<1>
function parseLC() { //<1>
    (...) //<1>
    Object.keys(config.locationConstraints).forEach(location => { //<1>
        const locationObj = config.locationConstraints[location]; //<1>
        (...) //<1>
        if (locationObj.type === 'ztore') {
            const ztoreEndpoint = config.getZtoreEndpoint(location);
            const ztoreCredentials = config.getZtoreCredentials(location); //<2>
            clients[location] = new ZtoreClient({
                ztoreEndpoint,
                ztoreCredentials,
                ztoreBucketName: locationObj.details.ztoreBucketName,
                bucketMatch: locationObj.details.bucketMatch,
                dataStoreName: location,
            }); //<3>
            clients[location].clientType = 'ztore';
        }
        (...) //<1>
    });
}
----
<1> Code that is already there
<2> You may need more utility functions depending on your backend specs
<3> You may have more fields required in your constructor object depending on
your backend specs
== Operation of type PUT
PUT routes are usually where people get started, as they are the easiest to
check! Simply go to your remote backend's console and you'll be able to see
whether your object actually went up in the cloud...
These are the files you'll need to edit:
=== `/lib/data/external/{{BackendName}}Client.js`
- the function that is going to call your `put()` function is also called
`put()`, and it's defined in `/lib/data/multipleBackendGateway.js`;
- define a function with signature like
`put(stream, size, keyContext, reqUids, callback)`; this is worth exploring a
bit more as these parameters are the same for all backends:
//TODO: generate this from jsdoc
-- `stream`: the stream of data you want to put in the cloud; if you're
unfamiliar with node.js streams, we suggest you read up on them, as we use
them a lot!
-- `size`: the size of the object you're trying to put;
-- `keyContext`: an object with metadata about the operation; common entries are
`namespace`, `bucketName`, `owner`, `cipherBundle`, and `tagging`; if these
are not sufficient for your integration, contact us to get architecture
validation before adding new entries;
-- `reqUids`: the request unique ID used for logging;
-- `callback`: your function's callback (should handle errors);
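To make the shape of that function concrete, here is a minimal, hypothetical
sketch of a `put()` in a `ZtoreClient`; the class name, config fields, and key
naming scheme are all placeholders, and the in-memory `Map` stands in for
calls to your backend's real SDK so the example stays self-contained:

[source,js]
----
'use strict';
const crypto = require('crypto');

// Hypothetical sketch of /lib/data/external/ZtoreClient.js (PUT only).
class ZtoreClient {
    constructor(config) {
        this.endpoint = config.ztoreEndpoint;
        this.bucketName = config.ztoreBucketName;
        this.dataStoreName = config.dataStoreName;
        this._store = new Map(); // stand-in for the remote backend
    }

    put(stream, size, keyContext, reqUids, callback) {
        // The backend key naming scheme below is purely illustrative.
        const key = `${keyContext.bucketName}/` +
            crypto.randomBytes(16).toString('hex');
        const chunks = [];
        stream.on('data', chunk => chunks.push(chunk));
        stream.on('error', err => callback(err));
        stream.on('end', () => {
            // A real client would have streamed `size` bytes to the cloud here.
            this._store.set(key, Buffer.concat(chunks));
            // CloudServer keeps the returned key in the object's location info.
            return callback(null, key);
        });
    }
}

module.exports = ZtoreClient;
----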
=== `/lib/data/external/{{backendName}}_lib/`
- this is where you should put all utility functions for your PUT operation, and
then import them in `/lib/data/external/{{BackendName}}Client.js`, to keep
your code clean;
=== `tests/functional/aws-node-sdk/test/multipleBackend/put/put{{BackendName}}.js`
- every contribution should come with thorough functional tests, showing that
the nominal case gives the expected behaviour, and that error cases are handled
in a way that is standard for the backend (including error messages and codes);
- the ideal setup is to simulate your backend locally, so as not to be
subject to network flakiness in the CI; however, we know there might not be
mocks available for every client; if that is the case for your backend, you
may test against the "real" endpoint of your data backend;
=== `tests/functional/aws-node-sdk/test/multipleBackend/utils.js`
- where you'll define a constant for your backend location matching your
`/tests/locationConfig/locationConfigTests.json`
<<location-config-test-json,test location name>>;
- depending on your backend, the sample `keys[]` and associated made-up objects
may not work for you (if your backend's key format is different, for example);
if that is the case, you should add a custom `utils.get{{BackendName}}keys()`
function returning adjusted `keys[]` to your tests.
== Operation of type GET
GET routes are easy to test once PUT routes are implemented, which is why we're
covering them second.
These are the files you'll need to edit:
=== `/lib/data/external/{{BackendName}}Client.js`
- the function that is going to call your `get()` function is also called
`get()`, and it's defined in `/lib/data/multipleBackendGateway.js`;
- define a function with signature like
`get(objectGetInfo, range, reqUids, callback)`; this is worth exploring a
bit more as these parameters are the same for all backends:
//TODO: generate this from jsdoc
-- `objectGetInfo`: a dictionary with two entries: `key`, the object key in the
data store, and `client`, the data store name;
-- `range`: the range of bytes you will get, for "get-by-range" operations (we
recommend you do simple GETs first, and then look at this);
-- `reqUids`: the request unique ID used for logging;
-- `callback`: your function's callback (should handle errors);
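Continuing the hypothetical `ZtoreClient` sketch from the PUT section, a
`get()` could look like the following; the `NoSuchKey` error and the
assumption that `range` is an inclusive `[start, end]` pair are illustrative,
not guaranteed by CloudServer:

[source,js]
----
'use strict';
const { PassThrough } = require('stream');

// Hypothetical sketch of /lib/data/external/ZtoreClient.js (GET only).
class ZtoreClient {
    constructor() {
        this._store = new Map(); // stand-in for the remote backend
    }

    get(objectGetInfo, range, reqUids, callback) {
        const data = this._store.get(objectGetInfo.key);
        if (data === undefined) {
            return callback(new Error('NoSuchKey'));
        }
        // Serve either the requested byte range or the whole object.
        const body = range ? data.slice(range[0], range[1] + 1) : data;
        const readable = new PassThrough();
        readable.end(body);
        // CloudServer expects a readable stream of the object's bytes.
        return callback(null, readable);
    }
}

module.exports = ZtoreClient;
----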
=== `/lib/data/external/{{backendName}}_lib/`
- this is where you should put all utility functions for your GET operation, and
then import them in `/lib/data/external/{{BackendName}}Client.js`, to keep
your code clean;
=== `tests/functional/aws-node-sdk/test/multipleBackend/get/get{{BackendName}}.js`
- every contribution should come with thorough functional tests, showing that
the nominal case gives the expected behaviour, and that error cases are handled
in a way that is standard for the backend (including error messages and codes);
- the ideal setup is to simulate your backend locally, so as not to be
subject to network flakiness in the CI; however, we know there might not be
mocks available for every client; if that is the case for your backend, you
may test against the "real" endpoint of your data backend;
=== `tests/functional/aws-node-sdk/test/multipleBackend/utils.js`
NOTE: You should not need this section if you have followed the tutorial in
order (that is, if you have already covered the PUT operation).
- where you'll define a constant for your backend location matching your
`/tests/locationConfig/locationConfigTests.json`
<<location-config-test-json,test location name>>;
- depending on your backend, the sample `keys[]` and associated made-up objects
may not work for you (if your backend's key format is different, for example);
if that is the case, you should add a custom `utils.get{{BackendName}}keys()`
function returning adjusted `keys[]` to your tests.
== Operation of type DELETE
DELETE routes are easy to test once PUT routes are implemented, and they are
similar to GET routes in our implementation, which is why we're covering them
third.
These are the files you'll need to edit:
=== `/lib/data/external/{{BackendName}}Client.js`
- the function that is going to call your `delete()` function is also called
`delete()`, and it's defined in `/lib/data/multipleBackendGateway.js`;
- define a function with signature like
`delete(objectGetInfo, reqUids, callback)`; this is worth exploring a
bit more as these parameters are the same for all backends:
//TODO: generate this from jsdoc
-- `objectGetInfo`: a dictionary with two entries: `key`, the object key in the
data store, and `client`, the data store name;
-- `reqUids`: the request unique ID used for logging;
-- `callback`: your function's callback (should handle errors);
=== `/lib/data/external/{{backendName}}_lib/`
- this is where you should put all utility functions for your DELETE operation,
and then import them in `/lib/data/external/{{BackendName}}Client.js`, to keep
your code clean;
=== `tests/functional/aws-node-sdk/test/multipleBackend/delete/delete{{BackendName}}.js`
- every contribution should come with thorough functional tests, showing that
the nominal case gives the expected behaviour, and that error cases are handled
in a way that is standard for the backend (including error messages and codes);
- the ideal setup is to simulate your backend locally, so as not to be
subject to network flakiness in the CI; however, we know there might not be
mocks available for every client; if that is the case for your backend, you
may test against the "real" endpoint of your data backend;
=== `tests/functional/aws-node-sdk/test/multipleBackend/utils.js`
NOTE: You should not need this section if you have followed the tutorial in
order (that is, if you have already covered the PUT operation).
- where you'll define a constant for your backend location matching your
`/tests/locationConfig/locationConfigTests.json`
<<location-config-test-json,test location name>>;
- depending on your backend, the sample `keys[]` and associated made-up objects
may not work for you (if your backend's key format is different, for example);
if that is the case, you should add a custom `utils.get{{BackendName}}keys()`
function returning adjusted `keys[]` to your tests.
== Operation of type HEAD
HEAD routes are very similar to DELETE routes in our implementation, which is
why we're covering them fourth.
These are the files you'll need to edit:
=== `/lib/data/external/{{BackendName}}Client.js`
- the function that is going to call your `head()` function is also called
`head()`, and it's defined in `/lib/data/multipleBackendGateway.js`;
- define a function with signature like
`head(objectGetInfo, reqUids, callback)`; this is worth exploring a
bit more as these parameters are the same for all backends:
//TODO: generate this from jsdoc
-- `objectGetInfo`: a dictionary with two entries: `key`, the object key in the
data store, and `client`, the data store name;
-- `reqUids`: the request unique ID used for logging;
-- `callback`: your function's callback (should handle errors);
=== `/lib/data/external/{{backendName}}_lib/`
- this is where you should put all utility functions for your HEAD operation,
and then import them in `/lib/data/external/{{BackendName}}Client.js`, to keep
your code clean;
=== `tests/functional/aws-node-sdk/test/multipleBackend/get/get{{BackendName}}.js`
- every contribution should come with thorough functional tests, showing that
the nominal case gives the expected behaviour, and that error cases are handled
in a way that is standard for the backend (including error messages and codes);
- the ideal setup is to simulate your backend locally, so as not to be
subject to network flakiness in the CI; however, we know there might not be
mocks available for every client; if that is the case for your backend, you
may test against the "real" endpoint of your data backend;
=== `tests/functional/aws-node-sdk/test/multipleBackend/utils.js`
NOTE: You should not need this section if you have followed the tutorial in
order (that is, if you have already covered the PUT operation).
- where you'll define a constant for your backend location matching your
`/tests/locationConfig/locationConfigTests.json`
<<location-config-test-json,test location name>>;
- depending on your backend, the sample `keys[]` and associated made-up objects
may not work for you (if your backend's key format is different, for example);
if that is the case, you should add a custom `utils.get{{BackendName}}keys()`
function returning adjusted `keys[]` to your tests.
== Healthcheck
Healthchecks are used to make sure that a failure to write to a remote cloud is
due to a problem on that remote cloud, and not on Zenko's side.
This is usually done by trying to create a bucket that already exists, and
making sure you get the expected answer.
These are the files you'll need to edit:
=== `/lib/data/external/{{BackendName}}Client.js`
- the function that is going to call your `healthcheck()` function is called
`checkExternalBackend()` and it's defined in
`/lib/data/multipleBackendGateway.js`; you will need to add your own;
- your healthcheck function should get `location` as a parameter, which is an
object comprising:
-- `reqUids`: the request unique ID used for logging;
-- `callback`: your function's callback (should handle errors);
=== `/lib/data/external/{{backendName}}_lib/{{backendName}}_create_bucket.js`
- this is where you should write the function performing the actual bucket
creation;
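Below is a hypothetical sketch of such a function; `client.createBucket()` and
the `BucketAlreadyOwnedByYou` error code are assumptions about your backend's
SDK, and the `response`/`time` fields echo the `backendHealth` entries
described in the next subsection:

[source,js]
----
'use strict';

// Hypothetical sketch of ztore_create_bucket.js: try to create a bucket that
// already exists and treat the "already exists / already owned by you" style
// answer as healthy.
function ztoreHealthcheckCreateBucket(client, bucketName, callback) {
    const start = Date.now();
    client.createBucket({ bucket: bucketName }, err => {
        const time = Date.now() - start;
        if (!err || err.code === 'BucketAlreadyOwnedByYou') {
            // The remote cloud answered as expected: the location is healthy.
            return callback(null, { response: 'OK', time });
        }
        // Any other error points at a problem on the remote cloud's side.
        return callback(err, { response: err.code, time });
    });
}

module.exports = ztoreHealthcheckCreateBucket;
----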
=== `/lib/data/external/{{backendName}}_lib/utils.js`
- add an object named after your backend to the `backendHealth` dictionary,
with proper `response` and `time` entries;
=== `lib/data/multipleBackendGateway.js`
- edit the `healthcheck` function to add your location's array, and call your
healthcheck; see the pseudocode below for a sample implementation, provided
your backend name is `ztore`:
[source,js]
----
(...) //<1>
healthcheck: (flightCheckOnStartUp, log, callback) => { //<1>
    (...) //<1>
    const ztoreArray = []; //<2>
    async.each(Object.keys(clients), (location, cb) => { //<1>
        (...) //<1>
        } else if (client.clientType === 'ztore') {
            ztoreArray.push(location); //<3>
            return cb();
        }
        (...) //<1>
        multBackendResp[location] = { code: 200, message: 'OK' }; //<1>
        return cb();
    }, () => { //<1>
        async.parallel([
            (...) //<1>
            next => checkExternalBackend( //<4>
                clients, ztoreArray, 'ztore', flightCheckOnStartUp,
                externalBackendHealthCheckInterval, next),
        ] (...) //<1>
        });
        (...) //<1>
    });
}
----
<1> Code that is already there
<2> The array that will store all locations of type 'ztore'
<3> Where you add locations of type 'ztore' to the array
<4> Where you actually call the healthcheck function on all 'ztore' locations
== Multipart upload (MPU)
Congratulations! This is the final part of supporting a new backend! You're
nearly there!
Now, let's be honest: MPU is far from the easiest subject, but you've come so
far it shouldn't be a problem.
These are the files you'll need to edit:
=== `/lib/data/external/{{BackendName}}Client.js`
You'll be creating four functions with template signatures:
- `createMPU(Key, metaHeaders, bucketName, websiteRedirectHeader, contentType,
cacheControl, contentDisposition, contentEncoding, log, callback)` will
initiate the multipart upload process; here, all parameters are
metadata headers except for:
-- `Key`, the key id for the final object (collection of all parts);
-- `bucketName`, the name of the bucket to which we will do an MPU;
-- `log`, the logger;
- `uploadPart(request, streamingV4Params, stream, size, key, uploadId,
partNumber, bucketName, log, callback)` will be called for each part; the
parameters are as follows:
-- `request`, the request object for putting the part;
-- `streamingV4Params`, the parameters used for V4 streaming authentication
against S3;
-- `stream`, the node.js readable stream used to put the part;
-- `size`, the size of the part;
-- `key`, the key of the object;
-- `uploadId`, multipart upload id string;
-- `partNumber`, the number of the part in this MPU (ordered);
-- `bucketName`, the name of the bucket to which we will do an MPU;
-- `log`, the logger;
- `completeMPU(jsonList, mdInfo, key, uploadId, bucketName, log, callback)` will
end the MPU process once all parts are uploaded; the parameters are as follows:
-- `jsonList`, the user-sent list of parts to include in the final MPU object;
-- `mdInfo`, an object containing 3 keys: `storedParts`, `mpuOverviewKey`, and
`splitter`;
-- `key`, the key of the object;
-- `uploadId`, multipart upload id string;
-- `bucketName`, name of bucket;
-- `log`, logger instance;
- `abortMPU(key, uploadId, bucketName, log, callback)` will handle errors, and
make sure that all parts that may have been uploaded will be deleted if the
MPU ultimately fails; the parameters are:
-- `key`, the key of the object;
-- `uploadId`, multipart upload id string;
-- `bucketName`, name of bucket;
-- `log`, logger instance.
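As a hedged sketch of the first and last of those four functions (none of the
names below are real CloudServer or SDK APIs, the in-memory bookkeeping stands
in for your backend's SDK, and the exact callback payloads are assumptions):

[source,js]
----
'use strict';
const crypto = require('crypto');

// Hypothetical sketch of the MPU entry points in {{BackendName}}Client.js.
class ZtoreClient {
    constructor() {
        this._mpus = new Map(); // uploadId -> { key, bucketName, parts }
    }

    createMPU(key, metaHeaders, bucketName, websiteRedirectHeader, contentType,
        cacheControl, contentDisposition, contentEncoding, log, callback) {
        const uploadId = crypto.randomBytes(16).toString('hex');
        this._mpus.set(uploadId, { key, bucketName, parts: new Map() });
        // Hand the new upload ID back so later part uploads can reference it.
        return callback(null, { UploadId: uploadId });
    }

    abortMPU(key, uploadId, bucketName, log, callback) {
        // Drop any parts already uploaded so nothing is left behind on failure.
        this._mpus.delete(uploadId);
        return callback(null);
    }
}

module.exports = ZtoreClient;
----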
=== `/lib/api/objectPutPart.js`
- you'll need to add your backend type in appropriate sections (simply look for
other backends already implemented).
=== `/lib/data/external/{{backendName}}_lib/`
- this is where you should put all utility functions for your MPU operations,
and then import them in `/lib/data/external/{{BackendName}}Client.js`, to keep
your code clean;
=== `lib/data/multipleBackendGateway.js`
- edit the `createMPU` function to add your location type, and call your
`createMPU()`; see the pseudocode below for a sample implementation, provided
your backend name is `ztore`:
[source,js]
----
(...) //<1>
createMPU: (key, metaHeaders, bucketName, websiteRedirectHeader, //<1>
    location, contentType, cacheControl, contentDisposition,
    contentEncoding, log, cb) => {
    const client = clients[location]; //<1>
    if (client.clientType === 'aws_s3') { //<1>
        return client.createMPU(key, metaHeaders, bucketName,
            websiteRedirectHeader, contentType, cacheControl,
            contentDisposition, contentEncoding, log, cb);
    } else if (client.clientType === 'ztore') { //<2>
        return client.createMPU(key, metaHeaders, bucketName,
            websiteRedirectHeader, contentType, cacheControl,
            contentDisposition, contentEncoding, log, cb);
    }
    return cb();
};
(...) //<1>
----
<1> Code that is already there
<2> Where the `createMPU()` of your client is actually called
=== `tests/functional/aws-node-sdk/test/multipleBackend/initMPU/{{BackendName}}InitMPU.js`
=== `tests/functional/aws-node-sdk/test/multipleBackend/listParts/{{BackendName}}ListPart.js`
=== `tests/functional/aws-node-sdk/test/multipleBackend/mpuAbort/{{BackendName}}AbortMPU.js`
=== `tests/functional/aws-node-sdk/test/multipleBackend/mpuComplete/{{BackendName}}CompleteMPU.js`
=== `tests/functional/aws-node-sdk/test/multipleBackend/mpuParts/{{BackendName}}UploadPart.js`
- granted, that is a lot of functional tests... but it's the last series as
well! Hurray!
== Adding support in Orbit, Zenko's UI for simplified Multi Cloud Management
This can only be done by our core developers' team. Once your backend
integration is merged, you may open a feature request on the
https://www.github.com/scality/Zenko/issues/new[Zenko repository], and we will
get back to you after we evaluate feasibility and maintainability.

View File

@ -0,0 +1,43 @@
= S3 compatible backends
IMPORTANT: S3 compatibility claims are a bit like idols: many think they
qualify, but very few effectively meet all the criteria ;) If the
following steps don't work for you, it is likely because the S3
compatibility of your target backend is imperfect.
== Adding support in Zenko's Cloudserver
This is the easiest case for backend support integration: there is nothing to do
but configuration!
Follow the steps described in our link:../USING_PUBLIC_CLOUDS.rst[user guide for
using AWS S3 as a data backend], and make sure you:
- set `details.awsEndpoint` to your storage provider endpoint;
- use `details.credentials` and *not* `details.credentialsProfile` to set your
credentials for that S3-compatible backend.
For example, if you're using a Wasabi bucket as a backend, then your region
definition for that backend will look something like:
```json
"wasabi-bucket-zenkobucket": {
"type": "aws_s3",
"legacyAwsBehavior": true,
"details": {
"awsEndpoint": "s3.wasabisys.com",
"bucketName": "zenkobucket",
"bucketMatch": true,
"credentials": {
"accessKey": "\{YOUR_WASABI_ACCESS_KEY}",
"secretKey": "\{YOUR_WASABI_SECRET_KEY}"
}
}
},
```
== Adding support in Zenko Orbit
This can only be done by our core developers' team. If that's what you're
after, open a feature request on the
https://www.github.com/scality/Zenko/issues/new[Zenko repository], and we will
get back to you after we evaluate feasibility and maintainability.

View File

@ -0,0 +1,4 @@
:attachmentsdir: {moduledir}/assets/attachments
:examplesdir: {moduledir}/examples
:imagesdir: {moduledir}/assets/images
:partialsdir: {moduledir}/pages/_partials

View File

@ -0,0 +1,4 @@
* xref:GETTING_STARTED.adoc[Getting Started]
* xref:NEW_BACKEND.adoc[Adding a new backend]
** xref:S3_COMPATIBLE_BACKENDS.adoc[Backends supporting the S3 protocol]
** xref:NON_S3_COMPATIBLE_BACKENDS.adoc[Backends supporting other protocols]

View File

@ -0,0 +1,4 @@
:attachmentsdir: {moduledir}/assets/attachments
:examplesdir: {moduledir}/examples
:imagesdir: {moduledir}/assets/images
:partialsdir: {moduledir}/pages/_partials

View File

@ -0,0 +1,8 @@
name: cloudserver-root
title: Zenko CloudServer
version: '1.0'
start_page: ROOT:README.adoc
nav:
- modules/ROOT/nav.adoc
- modules/USERS/nav.adoc
- modules/DEVELOPERS/nav.adoc

View File

@ -0,0 +1,3 @@
.xref:README.adoc[README]
* xref:README.adoc[README TOO]
** xref:README.adoc#docker[README DOCKER SECTION direct link]

View File

@ -0,0 +1,172 @@
[[zenko-cloudserver]]
Zenko CloudServer
-----------------
image:res/scality-cloudserver-logo.png[Zenko CloudServer logo]
https://circleci.com/gh/scality/S3[image:https://circleci.com/gh/scality/S3.svg?style=svg[CircleCI]]
http://ci.ironmann.io/gh/scality/S3[image:http://ci.ironmann.io/gh/scality/S3.svg?style=svg&circle-token=1f105b7518b53853b5b7cf72302a3f75d8c598ae[Scality
CI]]
https://hub.docker.com/r/scality/s3server/[image:https://img.shields.io/docker/pulls/scality/s3server.svg[Docker
Pulls]]
https://twitter.com/zenko[image:https://img.shields.io/twitter/follow/zenko.svg?style=social&label=Follow[Twitter
Follow]]
[[overview]]
Overview
~~~~~~~~
CloudServer (formerly S3 Server) is an open-source Amazon S3-compatible
object storage server that is part of https://www.zenko.io[Zenko],
Scality's Open Source Multi-Cloud Data Controller.
CloudServer provides a single AWS S3 API interface to access multiple
backend data storage locations, both on-premise and in the public cloud.
CloudServer is useful for developers, either to run as part of a
continuous integration test environment to emulate the AWS S3 service
locally, or as an abstraction layer to develop object storage-enabled
applications on the go.
[[learn-more-at-www.zenko.iocloudserver]]
Learn more at
https://www.zenko.io/cloudserver/[www.zenko.io/cloudserver]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[may-i-offer-you-some-lovely-documentation]]
http://s3-server.readthedocs.io/en/latest/[May I offer you some lovely
documentation?]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[docker]]
Docker
~~~~~~
https://hub.docker.com/r/scality/s3server/[Run your Zenko CloudServer
with Docker]
[[contributing]]
Contributing
~~~~~~~~~~~~
In order to contribute, please follow the
https://github.com/scality/Guidelines/blob/master/CONTRIBUTING.md[Contributing
Guidelines].
[[installation]]
Installation
~~~~~~~~~~~~
[[dependencies]]
Dependencies
^^^^^^^^^^^^
Building and running the Zenko CloudServer requires node.js 6.9.5 and
npm v3. Up-to-date versions can be found at
https://github.com/nodesource/distributions[Nodesource].
[[clone-source-code]]
Clone source code
^^^^^^^^^^^^^^^^^
[source,shell]
----
git clone https://github.com/scality/S3.git
----
[[install-js-dependencies]]
Install js dependencies
^^^^^^^^^^^^^^^^^^^^^^^
Go to the ./S3 folder,
[source,shell]
----
npm install
----
If you get an error regarding installation of the diskUsage module,
please install g++.
If you get an error regarding level-down bindings, try clearing your npm
cache:
[source,shell]
----
npm cache clear
----
[[run-it-with-a-file-backend]]
Run it with a file backend
~~~~~~~~~~~~~~~~~~~~~~~~~~
[source,shell]
----
npm start
----
This starts a Zenko CloudServer on port 8000. Two additional ports 9990
and 9991 are also open locally for internal transfer of metadata and
data, respectively.
The default access key is accessKey1 with a secret key of
verySecretKey1.
By default the metadata files will be saved in the localMetadata
directory and the data files will be saved in the localData directory
within the ./S3 directory on your machine. These directories have been
pre-created within the repository. If you would like to save the data or
metadata in different locations of your choice, you must specify them
with absolute paths. So, when starting the server:
[source,shell]
----
mkdir -m 700 $(pwd)/myFavoriteDataPath
mkdir -m 700 $(pwd)/myFavoriteMetadataPath
export S3DATAPATH="$(pwd)/myFavoriteDataPath"
export S3METADATAPATH="$(pwd)/myFavoriteMetadataPath"
npm start
----
[[run-it-with-multiple-data-backends]]
Run it with multiple data backends
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[source,shell]
----
export S3DATA='multiple'
npm start
----
This starts a Zenko CloudServer on port 8000. The default access key is
accessKey1 with a secret key of verySecretKey1.
With multiple backends, you have the ability to choose where each object
will be saved by setting the following header with a locationConstraint
on a PUT request:
[source,shell]
----
'x-amz-meta-scal-location-constraint':'myLocationConstraint'
----
If no header is sent with a PUT object request, the location constraint
of the bucket will determine where the data is saved. If the bucket has
no location constraint, the endpoint of the PUT request will be used to
determine location.
See the Configuration section in our documentation
http://s3-server.readthedocs.io/en/latest/GETTING_STARTED/#configuration[here]
to learn how to set location constraints.
[[run-it-with-an-in-memory-backend]]
Run it with an in-memory backend
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[source,shell]
----
npm run mem_backend
----
This starts a Zenko CloudServer on port 8000. The default access key is
accessKey1 with a secret key of verySecretKey1.

View File

@ -0,0 +1,4 @@
:attachmentsdir: {moduledir}/assets/attachments
:examplesdir: {moduledir}/examples
:imagesdir: {moduledir}/assets/images
:partialsdir: {moduledir}/pages/_partials

Binary file not shown.


File diff suppressed because one or more lines are too long


View File

@ -0,0 +1,8 @@
.Users guide
* xref:CONTRIBUTING.adoc
* xref:GETTING_STARTED.adoc
* xref:USING_PUBLIC_CLOUDS.adoc
* xref:CLIENTS.adoc
* xref:DOCKER.adoc
* xref:INTEGRATIONS.adoc
* xref:ARCHITECTURE.adoc

View File

@ -0,0 +1,979 @@
[[architecture]]
Architecture
------------
[[versioning]]
Versioning
~~~~~~~~~~
This document describes Zenko CloudServer's support for the AWS S3
Bucket Versioning feature.
[[aws-s3-bucket-versioning]]
AWS S3 Bucket Versioning
^^^^^^^^^^^^^^^^^^^^^^^^
See AWS documentation for a description of the Bucket Versioning
feature:
* http://docs.aws.amazon.com/AmazonS3/latest/dev/Versioning.html[Bucket
Versioning]
* http://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectVersioning.html[Object
Versioning]
This document assumes familiarity with the details of Bucket Versioning,
including null versions and delete markers, described in the above
links.
[[implementation-of-bucket-versioning-in-zenko-cloudserver]]
Implementation of Bucket Versioning in Zenko CloudServer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[[overview-of-metadata-and-api-component-roles]]
Overview of Metadata and API Component Roles
++++++++++++++++++++++++++++++++++++++++++++
Each version of an object is stored as a separate key in metadata. The
S3 API interacts with the metadata backend to store, retrieve, and
delete version metadata.
The implementation of versioning within the metadata backend is naive.
The metadata backend does not evaluate any information about bucket or
version state (whether versioning is enabled or suspended, and whether a
version is a null version or delete marker). The S3 front-end API
manages the logic regarding versioning information, and sends
instructions to metadata to handle the basic CRUD operations for version
metadata.
The role of the S3 API can be broken down into the following:
* put and delete version data
* store extra information about a version, such as whether it is a
delete marker or null version, in the object's metadata
* send instructions to metadata backend to store, retrieve, update and
delete version metadata based on bucket versioning state and version
metadata
* encode version ID information to return in responses to requests, and
decode version IDs sent in requests
The implementation of Bucket Versioning in S3 is described in this
document in two main parts. The first section,
link:#implementation-of-bucket-versioning-in-metadata["Implementation of
Bucket Versioning in Metadata"], describes the way versions are stored
in metadata, and the metadata options for manipulating version metadata.
The second section,
link:#implementation-of-bucket-versioning-in-api["Implementation of
Bucket Versioning in API"], describes the way the metadata options are
used in the API within S3 actions to create new versions, update their
metadata, and delete them. The management of null versions and creation
of delete markers are also described in this section.
[[implementation-of-bucket-versioning-in-metadata]]
Implementation of Bucket Versioning in Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As mentioned above, each version of an object is stored as a separate
key in metadata. We use version identifiers as the suffix for the keys
of the object versions, and a special version (the
link:#master-version["Master Version"]) to represent the latest version.
An example of what the metadata keys might look like for an object
`foo/bar` with three versions (with . representing a null character):
[width="76%",cols="100%",options="header",]
|==================================================
|key
|foo/bar
|foo/bar.098506163554375999999PARIS 0.a430a1f85c6ec
|foo/bar.098506163554373999999PARIS 0.41b510cd0fdf8
|foo/bar.098506163554373999998PARIS 0.f9b82c166f695
|==================================================
The most recent version created is represented above in the key
`foo/bar` and is the master version. This special version is described
further in the section link:#master-version["Master Version"].
[[version-id-and-metadata-key-format]]
Version ID and Metadata Key Format
++++++++++++++++++++++++++++++++++
The version ID is generated by the metadata backend, and encoded in a
hexadecimal string format by S3 before sending a response to a request.
S3 also decodes the hexadecimal string received from a request before
sending to metadata to retrieve a particular version.
The format of a `version_id` is: `ts` `rep_group_id` `seq_id` where:
* `ts`: is the combination of epoch and an increasing number
* `rep_group_id`: is the name of deployment(s) considered one unit used
for replication
* `seq_id`: is a unique value based on metadata information.
The format of a key in metadata for a version is:
`object_name separator version_id` where:
* `object_name`: is the key of the object in metadata
* `separator`: we use the `null` character (`0x00` or `\0`) as the
separator between the `object_name` and the `version_id` of a key
* `version_id`: is the version identifier; this encodes the ordering
information in the format described above as metadata orders keys
alphabetically
An example of a key in metadata:
`foo\01234567890000777PARIS 1234.123456` indicating that this specific
version of `foo` was the `000777`th entry created during the epoch
`1234567890` in the replication group `PARIS` with `1234.123456` as
`seq_id`.
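The sketch below (illustrative only, not CloudServer's actual code) shows how
such a key could be assembled and split around the null separator described
above:

[source,js]
----
// Illustrative only: building and splitting a versioned metadata key.
const SEPARATOR = '\0';

function formatVersionKey(objectName, versionId) {
    return `${objectName}${SEPARATOR}${versionId}`;
}

function parseVersionKey(key) {
    const idx = key.indexOf(SEPARATOR);
    if (idx === -1) {
        return { objectName: key }; // a master key has no version suffix
    }
    return { objectName: key.slice(0, idx), versionId: key.slice(idx + 1) };
}
----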
[[master-version]]
Master Version
++++++++++++++
We store a copy of the latest version of an object's metadata using
`object_name` as the key; this version is called the master version. The
master version of each object facilitates the standard GET operation,
which would otherwise need to scan among the list of versions of an
object for its latest version.
The following table shows the layout of all versions of `foo` in the
first example stored in the metadata (with dot `.` representing the null
separator):
[width="30%",cols="50%,50%",options="header",]
|==========
|key |value
|foo |B
|foo.v2 |B
|foo.v1 |A
|==========
[[metadata-versioning-options]]
Metadata Versioning Options
+++++++++++++++++++++++++++
Zenko CloudServer sends instructions to the metadata engine about
whether to create a new version or overwrite, retrieve, or delete a
specific version by sending values for special options in PUT, GET, or
DELETE calls to metadata. The metadata engine can also list versions in
the database, which is used by Zenko CloudServer to list object
versions.
These only describe the basic CRUD operations that the metadata engine
can handle. How these options are used by the S3 API to generate and
update versions is described more comprehensively in
link:#implementation-of-bucket-versioning-in-api["Implementation of
Bucket Versioning in API"].
Note: all operations (PUT and DELETE) that generate a new version of an
object will return the `version_id` of the new version to the API.
[[put]]
PUT
* no options: original PUT operation, will update the master version
* `versioning: true` create a new version of the object, then update the
master version with this version.
* `versionId: <versionId>` create or update a specific version (for
updating version's ACL or tags, or remote updates in geo-replication)
** if the version identified by `versionId` happens to be the latest
version, the master version will be updated as well
** if the master version is not as recent as the version identified by
`versionId`, as may happen with cross-region replication, the master
will be updated as well
** note that with `versionId` set to an empty string `''`, it will
overwrite the master version only (same as no options, but the master
version will have a `versionId` property set in its metadata like any
other version). The `versionId` will never be exposed to an external
user, but setting this internal-only `versionID` enables Zenko
CloudServer to find this version later if it is no longer the master.
This option of `versionId` set to `''` is used for creating null
versions once versioning has been suspended, which is discussed in
link:#null-version-management["Null Version Management"].
In general, only one option is used at a time. When `versionId` and
`versioning` are both set, only the `versionId` option will have an
effect.
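As a rough sketch of those three PUT modes (the `metadata.putObjectMD` call
below is a hypothetical stand-in for the real metadata client, not its exact
signature):

[source,js]
----
// Illustrative only: the three ways the API layer instructs metadata on a PUT.
function putVersionExamples(metadata, bucketName, objectKey, objMD, log, cb) {
    // 1. Versioning enabled: create a new version, then update the master.
    metadata.putObjectMD(bucketName, objectKey, objMD,
        { versioning: true }, log, cb);

    // 2. Target a specific version (e.g. to update its ACL or tags):
    // metadata.putObjectMD(bucketName, objectKey, objMD,
    //     { versionId: decodedVersionId }, log, cb);

    // 3. Overwrite the master only, stamping it with an internal versionId
    //    (used to create null versions once versioning is suspended):
    // metadata.putObjectMD(bucketName, objectKey, objMD,
    //     { versionId: '' }, log, cb);
}
----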
[[delete]]
DELETE
* no options: original DELETE operation, will delete the master version
* `versionId: <versionId>` delete a specific version
A deletion targeting the latest version of an object has to:
* delete the specified version identified by `versionId`
* replace the master version with a version that is a placeholder for
deletion - this version contains a special keyword, 'isPHD', to indicate
the master version was deleted and needs to be updated
* initiate a repair operation to update the value of the master version:
this involves listing the versions of the object and getting the latest
version to replace the placeholder delete version; if no more versions
exist, metadata deletes the master version, removing the key from metadata
Note: all of this happens in metadata before responding to the front-end
api, and only when the metadata engine is instructed by Zenko
CloudServer to delete a specific version or the master version. See
section link:#delete-markers["Delete Markers"] for a description of what
happens when a Delete Object request is sent to the S3 API.
[[get]]
GET
* no options: original GET operation, will get the master version
* `versionId: <versionId>` retrieve a specific version
The implementation of a GET operation does not change compared to the
standard version. A standard GET without versioning information would
get the master version of a key. A version-specific GET would retrieve
the specific version identified by the key for that version.
[[list]]
LIST
For a standard LIST on a bucket, metadata iterates through the keys by
using the separator (`\0`, represented by `.` in examples) as an extra
delimiter. For a listing of all versions of a bucket, there is no change
compared to the original listing function. Instead, the API component
returns all the keys in a List Objects call and filters for just the
keys of the master versions in a List Object Versions call.
For example, a standard LIST operation against the keys in a table below
would return from metadata the list of `[ foo/bar, bar, qux/quz, quz ]`.
[width="20%",cols="100%",options="header",]
|==========
|key
|foo/bar
|foo/bar.v2
|foo/bar.v1
|bar
|qux/quz
|qux/quz.v2
|qux/quz.v1
|quz
|quz.v2
|quz.v1
|==========
[[implementation-of-bucket-versioning-in-api]]
Implementation of Bucket Versioning in API
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[[object-metadata-versioning-attributes]]
Object Metadata Versioning Attributes
+++++++++++++++++++++++++++++++++++++
To access all the information needed to properly handle all cases that
may exist in versioned operations, the API stores certain
versioning-related information in the metadata attributes of each
version's object metadata.
These are the versioning-related metadata properties:
* `isNull`: whether the version being stored is a null version.
* `nullVersionId`: the unencoded version ID of the latest null version
that existed before storing a non-null version.
* `isDeleteMarker`: whether the version being stored is a delete marker.
The metadata engine also sets one additional metadata property when
creating the version.
* `versionId`: the unencoded version ID of the version being stored.
Null versions and delete markers are described in further detail in
their own subsections.
[[creation-of-new-versions]]
Creation of New Versions
++++++++++++++++++++++++
When versioning is enabled in a bucket, APIs which normally result in
the creation of objects, such as Put Object, Complete Multipart Upload
and Copy Object, will generate new versions of objects.
Zenko CloudServer creates a new version and updates the master version
using the `versioning: true` option in PUT calls to the metadata engine.
As an example, when two consecutive Put Object requests are sent to the
Zenko CloudServer for a versioning-enabled bucket with the same key
names, there are two corresponding metadata PUT calls with the
`versioning` option set to true.
The PUT calls to metadata and resulting keys are shown below:
1. PUT foo (first put), versioning: `true`
[width="30%",cols="50%,50%",options="header",]
|==========
|key |value
|foo |A
|foo.v1 |A
|==========
1. PUT foo (second put), versioning: `true`
[width="30%",cols="50%,50%",options="header",]
|==========
|key |value
|foo |B
|foo.v2 |B
|foo.v1 |A
|==========
[[null-version-management]]
Null Version Management
+++++++++++++++++++++++
In a bucket without versioning, or when versioning is suspended, putting
an object with the same name twice should result in the previous object
being overwritten. This is managed with null versions.
Only one null version should exist at any given time, and it is
identified in Zenko CloudServer requests and responses with the version
id "null".
[[case-1-putting-null-versions]]
Case 1: Putting Null Versions
With respect to metadata, since the null version is overwritten by
subsequent null versions, the null version is initially stored in the
master key alone, as opposed to being stored in the master key and a new
version. Zenko CloudServer checks if versioning is suspended or has
never been configured, and sets the `versionId` option to `''` in PUT
calls to the metadata engine when creating a new null version.
If the master version is a null version, Zenko CloudServer also sends a
DELETE call to metadata prior to the PUT, in order to clean up any
pre-existing null versions which may, in certain edge cases, have been
stored as a separate version. footnote:[Some examples of these cases
are: (1) when there is a null version that is the second-to-latest
version, and the latest version has been deleted, causing metadata to
repair the master value with the value of the null version and (2) when
putting object tag or ACL on a null version that is the master version,
as explained in link:#behavior-of-object-targeting-apis["Behavior of
Object-Targeting APIs"].]
The tables below summarize the calls to metadata and the resulting keys
if we put an object 'foo' twice, when versioning has not been enabled or
is suspended.
1. PUT foo (first put), versionId: `''`
[width="34%",cols="60%,40%",options="header",]
|=============
|key |value
|foo (null) |A
|=============
(2A) DELETE foo (clean-up delete before second put), versionId:
`<version id of master version>`
[width="34%",cols="60%,40%",options="header",]
|==========
|key |value
| |
|==========
(2B) PUT foo (second put), versionId: `''`
[width="34%",cols="60%,40%",options="header",]
|=============
|key |value
|foo (null) |B
|=============
The S3 API also sets the `isNull` attribute to `true` in the version
metadata before storing the metadata for these null versions.
[[case-2-preserving-existing-null-versions-in-versioning-enabled-bucket]]
Case 2: Preserving Existing Null Versions in Versioning-Enabled Bucket
Null versions are preserved when new non-null versions are created after
versioning has been enabled or re-enabled.
If the master version is the null version, the S3 API preserves the
current null version by storing it as a new key `(3A)` in a separate PUT
call to metadata, prior to overwriting the master version `(3B)`. This
implies the null version may not necessarily be the latest or master
version.
To determine whether the master version is a null version, the S3 API
checks if the master version's `isNull` property is set to `true`, or if
the `versionId` attribute of the master version is undefined (indicating
it is a null version that was put before bucket versioning was
configured).
Continuing the example from Case 1, if we enabled versioning and put
another object, the calls to metadata and resulting keys would resemble
the following:
(3A) PUT foo, versionId: `<versionId of master version>` if defined or
`<non-versioned object id>`
[width="38%",cols="65%,35%",options="header",]
|================
|key |value
|foo |B
|foo.v1 (null) |B
|================
(3B) PUT foo, versioning: `true`
[width="38%",cols="65%,35%",options="header",]
|================
|key |value
|foo |C
|foo.v2 |C
|foo.v1 (null) |B
|================
To prevent issues with concurrent requests, Zenko CloudServer ensures
the null version is stored with the same version ID by using `versionId`
option. Zenko CloudServer sets the `versionId` option to the master
version's `versionId` metadata attribute value during the PUT. This
creates a new version with the same version ID of the existing null
master version.
The null version's `versionId` attribute may be undefined because it was
generated before the bucket versioning was configured. In that case, a
version ID is generated using the max epoch and sequence values possible
so that the null version will be properly ordered as the last entry in a
metadata listing. This value ("non-versioned object id") is used in the
PUT call with the `versionId` option.
[[case-3-overwriting-a-null-version-that-is-not-latest-version]]
Case 3: Overwriting a Null Version That is Not Latest Version
Normally when versioning is suspended, Zenko CloudServer uses the
`versionId: ''` option in a PUT to metadata to create a null version.
This also overwrites an existing null version if it is the master
version.
However, if there is a null version that is not the latest version,
Zenko CloudServer cannot rely on the `versionId: ''` option, since it will
not overwrite the existing null version. Instead, before creating a new null
version, the Zenko CloudServer API must send a separate DELETE call to
metadata specifying the version id of the current null version for
delete.
To do this, when storing a null version (3A above) before storing a new
non-null version, Zenko CloudServer records the version's ID in the
`nullVersionId` attribute of the non-null version. For steps 3A and 3B
above, these are the values stored in the `nullVersionId` of each
version's metadata:
(3A) PUT foo, versioning: `true`
[width="72%",cols="35%,19%,46%",options="header",]
|===============================
|key |value |value.nullVersionId
|foo |B |undefined
|foo.v1 (null) |B |undefined
|===============================
(3B) PUT foo, versioning: `true`
[width="72%",cols="35%,19%,46%",options="header",]
|===============================
|key |value |value.nullVersionId
|foo |C |v1
|foo.v2 |C |v1
|foo.v1 (null) |B |undefined
|===============================
If defined, the `nullVersionId` of the master version is used with the
`versionId` option in a DELETE call to metadata if a Put Object request
is received when versioning is suspended in a bucket.
(4A) DELETE foo, versionId: `<nullVersionId of master version>` (v1)
[width="30%",cols="50%,50%",options="header",]
|==========
|key |value
|foo |C
|foo.v2 |C
|==========
Then the master version is overwritten with the new null version:
(4B) PUT foo, versionId: `''`
[width="34%",cols="60%,40%",options="header",]
|=============
|key |value
|foo (null) |D
|foo.v2 |C
|=============
The `nullVersionId` attribute is also used to retrieve the correct
version when the version ID "null" is specified in certain object-level
APIs, described further in the section link:#null-version-mapping["Null
Version Mapping"].
[[specifying-versions-in-apis-for-putting-versions]]
Specifying Versions in APIs for Putting Versions
++++++++++++++++++++++++++++++++++++++++++++++++
Since Zenko CloudServer does not allow an overwrite of existing version
data, Put Object, Complete Multipart Upload and Copy Object return
`400 InvalidArgument` if a specific version ID is specified in the
request query, e.g. for a `PUT /foo?versionId=v1` request.
[[put-example]]
PUT Example
+++++++++++
When Zenko CloudServer receives a request to PUT an object:
* It checks first if versioning has been configured
* If it has not been configured, Zenko CloudServer proceeds to put the
new data, put the metadata by overwriting the master version, and
delete any pre-existing data
If versioning has been configured, Zenko CloudServer checks the
following:
[[versioning-enabled]]
Versioning Enabled
If versioning is enabled and there is existing object metadata:
* If the master version is a null version (`isNull: true`) or has no
version ID (put before versioning was configured):
** store the null version metadata as a new version
** create a new version and overwrite the master version
*** set `nullVersionId`: version ID of the null version that was stored
If versioning is enabled and the master version is not null; or there is
no existing object metadata:
* create a new version and store it, and overwrite the master version
[[versioning-suspended]]
Versioning Suspended
If versioning is suspended and there is existing object metadata:
* If the master version has no version ID:
** overwrite the master version with the new metadata (PUT
`versionId: ''`)
** delete previous object data
* If the master version is a null version:
** delete the null version using the versionId metadata attribute of the
master version (PUT `versionId: <versionId of master object MD>`)
** put a new null version (PUT `versionId: ''`)
* If the master is not a null version and `nullVersionId` is defined in the
object's metadata:
** delete the current null version metadata and data
** overwrite the master version with the new metadata
If there is no existing object metadata, create the new null version as
the master version.
In each of the above cases, set `isNull` metadata attribute to true when
creating the new null version.
[[behavior-of-object-targeting-apis]]
Behavior of Object-Targeting APIs
+++++++++++++++++++++++++++++++++
API methods which can target existing objects or versions, such as Get
Object, Head Object, Get Object ACL, Put Object ACL, Copy Object and
Copy Part, will perform the action on the latest version of an object if
no version ID is specified in the request query or relevant request
header (`x-amz-copy-source-version-id` for Copy Object and Copy Part
APIs).
Two exceptions are the Delete Object and Multi-Object Delete APIs, which
will instead attempt to create delete markers, described in the
following section, if no version ID is specified.
No versioning options are necessary to retrieve the latest version from
metadata, since the master version is stored in a key with the name of
the object. However, when updating the latest version, such as with the
Put Object ACL API, Zenko CloudServer sets the `versionId` option in the
PUT call to metadata to the value stored in the object metadata's
`versionId` attribute. This is done in order to update the metadata both
in the master version and the version itself, if it is not a null
version. footnote:[If it is a null version, this call will overwrite the
null version if it is stored in its own key (`foo\0<versionId>`). If the
null version is stored only in the master version, this call will both
overwrite the master version _and_ create a new key
(`foo\0<versionId>`), resulting in the edge case referred to by the
previous footnote [1]_.]
When a version id is specified in the request query for these APIs, e.g.
`GET /foo?versionId=v1`, Zenko CloudServer will attempt to decode the
version ID and perform the action on the appropriate version. To do so,
the API sets the value of the `versionId` option to the decoded version
ID in the metadata call.
[[delete-markers]]
Delete Markers
++++++++++++++
If versioning has not been configured for a bucket, the Delete Object
and Multi-Object Delete APIs behave as their standard APIs.
If versioning has been configured, Zenko CloudServer deletes object or
version data only if a specific version ID is provided in the request
query, e.g. `DELETE /foo?versionId=v1`.
If no version ID is provided, S3 creates a delete marker by creating a
0-byte version with the metadata attribute `isDeleteMarker: true`. The
S3 API will return a `404 NoSuchKey` error in response to requests
getting or heading an object whose latest version is a delete marker.
To restore a previous version as the latest version of an object, the
delete marker must be deleted, by the same process as deleting any other
version.
The response varies when targeting an object whose latest version is a
delete marker for other object-level APIs that can target existing
objects and versions, without specifying the version ID.
* Get Object, Head Object, Get Object ACL, Object Copy and Copy Part
return `404 NoSuchKey`.
* Put Object ACL and Put Object Tagging return `405 MethodNotAllowed`.
These APIs respond to requests specifying the version ID of a delete
marker with the error `405 MethodNotAllowed`, in general. Copy Part and
Copy Object respond with `400 Invalid Request`.
See section link:#delete-example["Delete Example"] for a summary.
[[null-version-mapping]]
Null Version Mapping
++++++++++++++++++++
When the null version is specified in a request with the version ID
"null", the S3 API must use the `nullVersionId` stored in the latest
version to retrieve the current null version, if the null version is not
the latest version.
Thus, getting the null version is a two step process:
1. Get the latest version of the object from metadata. If the latest
version's `isNull` property is `true`, then use the latest version's
metadata. Otherwise,
2. Get the null version of the object from metadata, using the internal
version ID of the current null version stored in the latest version's
`nullVersionId` metadata attribute.
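A sketch of that two-step lookup (the `getVersion` callback passed in stands
for the metadata GET call; it is not a real CloudServer function):

[source,js]
----
// Illustrative only: two-step lookup of the "null" version of an object.
function getNullVersion(getVersion, objectKey, callback) {
    // Step 1: fetch the master (latest) version of the object.
    getVersion(objectKey, undefined, (err, master) => {
        if (err) {
            return callback(err);
        }
        if (master.isNull) {
            // The latest version is itself the null version.
            return callback(null, master);
        }
        if (master.nullVersionId === undefined) {
            return callback(null, null); // no null version exists
        }
        // Step 2: fetch the null version by its internal version ID.
        return getVersion(objectKey, master.nullVersionId, callback);
    });
}
----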
[[delete-example]]
DELETE Example
++++++++++++++
The following steps are used in the delete logic for delete marker
creation:
* If versioning has not been configured: attempt to delete the object
* If request is version-specific delete request: attempt to delete the
version
* otherwise, if not a version-specific delete request and versioning has
been configured:
** create a new 0-byte content-length version
** in the version's metadata, set an 'isDeleteMarker' property to true
* Return the version ID of any version deleted or any delete marker
created
* Set response header `x-amz-delete-marker` to true if a delete marker
was deleted or created
The Multi-Object Delete API follows the same logic for each of the
objects or versions listed in an xml request. Note that a delete request
can result in the creation of a delete marker even if the object
requested for deletion does not exist in the first place.
Object-level APIs which can target existing objects and versions perform
the following checks regarding delete markers:
* If not a version-specific request and versioning has been configured,
check the metadata of the latest version
* If the 'isDeleteMarker' property is set to true, return
`404 NoSuchKey` or `405 MethodNotAllowed`
* If it is a version-specific request, check the object metadata of the
requested version
* If the `isDeleteMarker` property is set to true, return
`405 MethodNotAllowed` or `400 InvalidRequest`
[[data-metadata-daemon-architecture-and-operational-guide]]
Data-metadata daemon Architecture and Operational guide
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This document presents the architecture of the data-metadata daemon
(dmd) used for the community edition of Zenko CloudServer. It also
provides a guide on how to operate it.
The dmd is responsible for storing and retrieving Zenko CloudServer data
and metadata, and is accessed by Zenko CloudServer connectors through
socket.io (metadata) and REST (data) APIs.
It has been designed such that more than one Zenko CloudServer connector
can access the same buckets by communicating with the dmd. It also means
that the dmd can be hosted on a separate container or machine.
[[operation]]
Operation
^^^^^^^^^
[[startup]]
Startup
+++++++
The simplest deployment is still to launch with `npm start`; this will
start one instance of the Zenko CloudServer connector and will listen on
the locally bound dmd ports 9990 and 9991 (by default, see below).
The dmd can be started independently from the Zenko CloudServer by
running this command in the Zenko CloudServer directory:
....
npm run start_dmd
....
This will open two ports:
- one is based on socket.io and is used for metadata transfers (9990 by
default)
- the other is a REST interface used for data transfers (9991 by
default)
Then, one or more instances of Zenko CloudServer without the dmd can be
started elsewhere with:
....
npm run start_s3server
....
[[configuration]]
Configuration
+++++++++++++
Most configuration happens in `config.json` for Zenko CloudServer; local
storage paths can be changed where the dmd is started using environment
variables, like before: `S3DATAPATH` and `S3METADATAPATH`.
In `config.json`, the following sections are used to configure access to
the dmd through separate configuration of the data and metadata access:
....
"metadataClient": {
"host": "localhost",
"port": 9990
},
"dataClient": {
"host": "localhost",
"port": 9991
},
....
To run a remote dmd, you have to do the following:
- change both `"host"` attributes to the IP or host name where the
dmd is run;
- modify the `"bindAddress"` attributes in the `"metadataDaemon"` and
`"dataDaemon"` sections where the dmd is run to accept remote
connections (e.g. `"::"`).
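For example (the host below is a placeholder), the connector's `config.json`
would point at the remote dmd:

....
"metadataClient": {
    "host": "10.0.0.2",
    "port": 9990
},
"dataClient": {
    "host": "10.0.0.2",
    "port": 9991
},
....

while, in the `config.json` used where the dmd runs, the daemons are bound so
that they accept remote connections:

....
"metadataDaemon": {
    "bindAddress": "::"
},
"dataDaemon": {
    "bindAddress": "::"
},
....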
[[architecture-1]]
Architecture
^^^^^^^^^^^^
This section gives a bit more insight on how it works internally.
image::./images/data_metadata_daemon_arch.png[Architecture diagram]
[[metadata-on-socket.io]]
Metadata on socket.io
+++++++++++++++++++++
This communication is based on an RPC system built on socket.io events
sent by Zenko CloudServer connectors, received by the DMD and
acknowledged back to the Zenko CloudServer connector.
The actual payload sent through socket.io is a JSON-serialized form of
the RPC call name and parameters, along with some additional information
like the request UIDs, and the sub-level information, sent as object
attributes in the JSON request.
With the introduction of versioning support, updates are now gathered in
the dmd for a bounded number of milliseconds before being batched as a
single write to the database. This is done server-side, so the API is
still meant to send individual updates.
Four RPC commands are available to clients: `put`, `get`, `del` and
`createReadStream`. They more or less map the parameters accepted by the
corresponding calls in the LevelUp implementation of LevelDB. They
differ in the following ways:
- The `sync` option is ignored (under the hood, puts are gathered
  into batches which have their `sync` property enforced when they are
  committed to the storage)
- Some additional versioning-specific options are supported
- `createReadStream` becomes asynchronous, takes an additional
  callback argument and returns the stream in the second callback
  parameter
Debugging the socket.io exchanges can be achieved by running the daemon
with the `DEBUG='socket.io*'` environment variable set.
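For example, assuming the dmd is started from the Zenko CloudServer
directory as shown above:
....
DEBUG='socket.io*' npm run start_dmd
....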
One parameter controls the timeout after which a sent RPC command ends
with a timeout error. It can be changed either:
- via the `DEFAULT_CALL_TIMEOUT_MS` option in
  `lib/network/rpc/rpc.js`
- or in the constructor call of the `MetadataFileClient` object (in
  `lib/metadata/bucketfile/backend.js`), as `callTimeoutMs`.
The default value is 30000 (milliseconds).
A specific implementation deals with streams, currently used for listing
a bucket. Streams emit `"stream-data"` events that pack one or more
items in the listing, and a special `"stream-end"` event when done. Flow
control is achieved by allowing a certain number of "in flight" packets
that have not received an ack yet (5 by default). Two options can tune
this behavior (for better throughput, or to make it more robust on weak
networks); they have to be set directly in the `mdserver.js` file, as
there is no support for them in `config.json` for now:
- `streamMaxPendingAck`: max number of pending ack events not yet
  received (default is 5)
- `streamAckTimeoutMs`: timeout for receiving an ack after an output
  stream packet is sent to the client (default is 5000 ms)
[[data-exchange-through-the-rest-data-port]]
Data exchange through the REST data port
++++++++++++++++++++++++++++++++++++++++
Data is read and written with REST semantics.
The web server recognizes a base path in the URL of `/DataFile` to be a
request to the data storage service.
[[put-1]]
PUT
+++
A PUT on the `/DataFile` URL, with the contents passed in the request
body, writes a new object to the storage.
On success, a `201 Created` response is returned and the new URL to the
object is returned via the `Location` header (e.g.
`Location: /DataFile/50165db76eecea293abfd31103746dadb73a2074`). The raw
key can then be extracted simply by removing the leading `/DataFile`
service information from the returned URL.
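A minimal sketch of such an exchange (the key returned in the `Location`
header is generated by the service; the value shown here is only
illustrative):
....
PUT /DataFile HTTP/1.1
Content-Length: 11

hello world

HTTP/1.1 201 Created
Location: /DataFile/50165db76eecea293abfd31103746dadb73a2074
....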
[[get-1]]
GET
+++
A GET is simply issued with REST semantics, e.g.:
....
GET /DataFile/50165db76eecea293abfd31103746dadb73a2074 HTTP/1.1
....
A GET request can ask for a specific range. Range support is complete
except for multiple byte ranges.
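For example, to request only the first 100 bytes of the object:
....
GET /DataFile/50165db76eecea293abfd31103746dadb73a2074 HTTP/1.1
Range: bytes=0-99
....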
[[delete-1]]
DELETE
++++++
DELETE is similar to GET, except that a `204 No Content` response is
returned on success.
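For example (using the same illustrative key as above):
....
DELETE /DataFile/50165db76eecea293abfd31103746dadb73a2074 HTTP/1.1

HTTP/1.1 204 No Content
....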
[[listing]]
Listing
~~~~~~~
[[listing-types]]
Listing Types
^^^^^^^^^^^^^
We use three different types of metadata listing for various operations.
Here are the scenarios we use each for:
- 'Delimiter' - when no versions are possible in the bucket since it is
  an internally-used only bucket which is not exposed to a user. Namely,
  1. to list objects in the "user's bucket" to respond to a GET SERVICE
     request, and
  2. to do internal listings on an MPU shadow bucket to complete
     multipart upload operations.
- 'DelimiterVersion' - to list all versions in a bucket
- 'DelimiterMaster' - to list just the master versions of objects in a
  bucket
[[algorithms]]
Algorithms
^^^^^^^^^^
The algorithms for each listing type can be found in the open-source
https://github.com/scality/Arsenal[scality/Arsenal] repository, in
https://github.com/scality/Arsenal/tree/master/lib/algos/list[lib/algos/list].
[[encryption]]
Encryption
~~~~~~~~~~
With CloudServer, there are two possible methods of at-rest encryption:
1. bucket-level encryption, where Scality CloudServer itself handles
   at-rest encryption for any object that is in an 'encrypted' bucket,
   regardless of what the location-constraint for the data is;
2. if the location-constraint specified for the data is of type AWS,
   you can choose to use AWS server-side encryption.
Note: bucket-level encryption is not available on the standard AWS S3
protocol, so normal AWS S3 clients will not provide the option to send a
header when creating a bucket. We have created a simple tool to enable
you to easily create an encrypted bucket.
[[example]]
Example:
^^^^^^^^
Creating an encrypted bucket using our encrypted bucket tool in the
`bin` directory:
[source,sourceCode,shell]
----
./create_encrypted_bucket.js -a accessKey1 -k verySecretKey1 -b bucketname -h localhost -p 8000
----
[[aws-backend]]
AWS backend
^^^^^^^^^^^
With real AWS S3 as a location-constraint, you have to configure the
location-constraint as follows:
[source,sourceCode,json]
----
"awsbackend": {
"type": "aws_s3",
"legacyAwsBehavior": true,
"details": {
"serverSideEncryption": true,
...
}
},
----
Then, every time an object is put to that data location, we pass the
following header to AWS: `x-amz-server-side-encryption: AES256`.
Note: due to these options, it is possible to configure encryption by
both CloudServer and AWS S3 (if you put an object to a CloudServer
bucket which has the encryption flag AND the location-constraint for the
data is AWS S3 with serverSideEncryption set to true).

View File

@ -0,0 +1,316 @@
[[clients]]
Clients
-------
List of applications that have been tested with Zenko CloudServer.
GUI
~~~
`Cyberduck <https://cyberduck.io/?l=en>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* https://www.youtube.com/watch?v=-n2MCt4ukUg
* https://www.youtube.com/watch?v=IyXHcu4uqgU
`Cloud Explorer <https://www.linux-toys.com/?p=945>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* https://www.youtube.com/watch?v=2hhtBtmBSxE
`CloudBerry Lab <http://www.cloudberrylab.com>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* https://youtu.be/IjIx8g_o0gY
Command Line Tools
~~~~~~~~~~~~~~~~~~
`s3curl <https://github.com/rtdp/s3curl>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
https://github.com/scality/S3/blob/master/tests/functional/s3curl/s3curl.pl
`aws-cli <http://docs.aws.amazon.com/cli/latest/reference/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`~/.aws/credentials` on Linux, OS X, or Unix or
`C:\Users\USERNAME\.aws\credentials` on Windows
.. code:: shell
....
[default]
aws_access_key_id = accessKey1
aws_secret_access_key = verySecretKey1
....
`~/.aws/config` on Linux, OS X, or Unix or
`C:\Users\USERNAME\.aws\config` on Windows
.. code:: shell
....
[default]
region = us-east-1
....
Note: `us-east-1` is the default region, but you can specify any region.
See all buckets:
.. code:: shell
....
aws s3 ls --endpoint-url=http://localhost:8000
....
Create bucket:
.. code:: shell
....
aws --endpoint-url=http://localhost:8000 s3 mb s3://mybucket
....
`s3cmd <http://s3tools.org/s3cmd>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If using s3cmd as a client to S3, be aware that the v4 signature format
is buggy in s3cmd versions < 1.6.1.
`~/.s3cfg` on Linux, OS X, or Unix or `C:\Users\USERNAME\.s3cfg` on
Windows
.. code:: shell
....
[default]
access_key = accessKey1
secret_key = verySecretKey1
host_base = localhost:8000
host_bucket = %(bucket).localhost:8000
signature_v2 = False
use_https = False
....
See all buckets:
.. code:: shell
....
s3cmd ls
....
`rclone <http://rclone.org/s3/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
`~/.rclone.conf` on Linux, OS X, or Unix or
`C:\Users\USERNAME\.rclone.conf` on Windows
.. code:: shell
....
[remote]
type = s3
env_auth = false
access_key_id = accessKey1
secret_access_key = verySecretKey1
region = other-v2-signature
endpoint = http://localhost:8000
location_constraint =
acl = private
server_side_encryption =
storage_class =
....
See all buckets:
.. code:: shell
....
rclone lsd remote:
....
JavaScript
~~~~~~~~~~
`AWS JavaScript SDK <http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: javascript
....
const AWS = require('aws-sdk');
const s3 = new AWS.S3({
accessKeyId: 'accessKey1',
secretAccessKey: 'verySecretKey1',
endpoint: 'localhost:8000',
sslEnabled: false,
s3ForcePathStyle: true,
});
....
JAVA
~~~~
`AWS JAVA SDK <http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: java
....
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.S3ClientOptions;
import com.amazonaws.services.s3.model.Bucket;
public class S3 {
public static void main(String[] args) {
AWSCredentials credentials = new BasicAWSCredentials("accessKey1",
"verySecretKey1");
// Create a client connection based on credentials
AmazonS3 s3client = new AmazonS3Client(credentials);
s3client.setEndpoint("http://localhost:8000");
// Using path-style requests
// (deprecated) s3client.setS3ClientOptions(new S3ClientOptions().withPathStyleAccess(true));
s3client.setS3ClientOptions(S3ClientOptions.builder().setPathStyleAccess(true).build());
// Create bucket
String bucketName = "javabucket";
s3client.createBucket(bucketName);
// List off all buckets
for (Bucket bucket : s3client.listBuckets()) {
System.out.println(" - " + bucket.getName());
}
}
}
....
Ruby
~~~~
`AWS SDK for Ruby - Version 2 <http://docs.aws.amazon.com/sdkforruby/api/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: ruby
....
require 'aws-sdk'
s3 = Aws::S3::Client.new(
:access_key_id => 'accessKey1',
:secret_access_key => 'verySecretKey1',
:endpoint => 'http://localhost:8000',
:force_path_style => true
)
resp = s3.list_buckets
....
`fog <http://fog.io/storage/>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: ruby
....
require "fog"
connection = Fog::Storage.new(
{
:provider => "AWS",
:aws_access_key_id => 'accessKey1',
:aws_secret_access_key => 'verySecretKey1',
:endpoint => 'http://localhost:8000',
:path_style => true,
:scheme => 'http',
})
....
Python
~~~~~~
`boto2 <http://boto.cloudhackers.com/en/latest/ref/s3.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: python
....
import boto
from boto.s3.connection import S3Connection, OrdinaryCallingFormat
connection = S3Connection(
aws_access_key_id='accessKey1',
aws_secret_access_key='verySecretKey1',
is_secure=False,
port=8000,
calling_format=OrdinaryCallingFormat(),
host='localhost'
)
connection.create_bucket('mybucket')
....
`boto3 <http://boto3.readthedocs.io/en/latest/index.html>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Client integration
.. code:: python

....
import boto3

client = boto3.client(
's3',
aws_access_key_id='accessKey1',
aws_secret_access_key='verySecretKey1',
endpoint_url='http://localhost:8000'
)
lists = client.list_buckets()
....
Full integration (with object mapping)
.. code:: python

....
import os

from botocore.utils import fix_s3_host
import boto3
os.environ['AWS_ACCESS_KEY_ID'] = "accessKey1"
os.environ['AWS_SECRET_ACCESS_KEY'] = "verySecretKey1"
s3 = boto3.resource(service_name='s3', endpoint_url='http://localhost:8000')
s3.meta.client.meta.events.unregister('before-sign.s3', fix_s3_host)
for bucket in s3.buckets.all():
print(bucket.name)
....
PHP
~~~

You should force path-style requests even though the v3 SDK advertises
that it does so by default.
`AWS PHP SDK v3 <https://docs.aws.amazon.com/aws-sdk-php/v3/guide>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: php
....
use Aws\S3\S3Client;
$client = S3Client::factory([
'region' => 'us-east-1',
'version' => 'latest',
'endpoint' => 'http://localhost:8000',
'use_path_style_endpoint' => true,
'credentials' => [
'key' => 'accessKey1',
'secret' => 'verySecretKey1'
]
]);
$client->createBucket(array(
'Bucket' => 'bucketphp',
));
....

View File

@ -0,0 +1,395 @@
Docker
======
* link:#environment-variables[Environment Variables]
* link:#tunables-and-setup-tips[Tunables and setup tips]
* link:#continuous-integration-with-docker-hosted-cloudserver[Examples
for continuous integration with Docker]
* link:#in-production-with-docker-hosted-cloudserver[Examples for
going in production with Docker]
[[environment-variables]]
Environment Variables
---------------------
[[s3data]]
S3DATA
~~~~~~
[[s3datamultiple]]
S3DATA=multiple
^^^^^^^^^^^^^^^
Allows you to run Scality Zenko CloudServer with multiple data backends,
defined as regions. When using multiple data backends, a custom
`locationConfig.json` file is mandatory. It will allow you to set custom
regions. You will then need to provide associated rest_endpoints for
each custom region in your `config.json` file.
link:../GETTING_STARTED/#location-configuration[Learn more about
multiple backends configuration]
If you are using Scality RING endpoints, please refer to your customer
documentation.
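For example, a minimal sketch of running the container with multiple
backends, assuming a customized `locationConfig.json` sits in the
current directory:

[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000 -e S3DATA=multiple
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json scality/s3server
----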
[[running-it-with-an-aws-s3-hosted-backend]]
Running it with an AWS S3 hosted backend
++++++++++++++++++++++++++++++++++++++++
To run CloudServer with an S3 AWS backend, you will have to add a new
section to your `locationConfig.json` file with the `aws_s3` location
type:
[source,sourceCode,json]
----
(...)
"awsbackend": {
    "type": "aws_s3",
    "details": {
        "awsEndpoint": "s3.amazonaws.com",
        "bucketName": "yourawss3bucket",
        "bucketMatch": true,
        "credentialsProfile": "aws_hosted_profile"
    }
},
(...)
----
You will also have to edit your AWS credentials file to be able to use
your command line tool of choice. This file should mention credentials
for all the backends you're using. You can use several profiles if you
are using several backends.
[source,sourceCode,shell]
----
[default]
aws_access_key_id=accessKey1
aws_secret_access_key=verySecretKey1

[aws_hosted_profile]
aws_access_key_id={{YOUR_ACCESS_KEY}}
aws_secret_access_key={{YOUR_SECRET_KEY}}
----
Just as you need to mount your locationConfig.json, you will need to
mount your AWS credentials file at run time:
`-v ~/.aws/credentials:/root/.aws/credentials` on Linux, OS X, or Unix
or `-v C:\Users\USERNAME\.aws\credential:/root/.aws/credentials` on
Windows
NOTE: One account can't copy to another account with a source and
destination on real AWS unless the account associated with the access
Key/secret Key pairs used for the destination bucket has rights to get
in the source bucket. ACL's would have to be updated on AWS directly to
enable this.
S3BACKEND
~~~~~~~~~

S3BACKEND=file
^^^^^^^^^^^^^^

When storing file data, for it to be persistent you must mount Docker
volumes for both data and metadata. See
link:#using-docker-volumes-in-production[this section].

S3BACKEND=mem
^^^^^^^^^^^^^

This is ideal for testing - no data will remain after the container is
shut down.
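For example, a quick in-memory instance for testing could be started as
follows (a sketch following the run commands used elsewhere on this
page):

[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000 -e S3BACKEND=mem scality/s3server
----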
[[endpoint]]
ENDPOINT
~~~~~~~~
This variable specifies your endpoint. If you have a domain such as
new.host.com, by specifying that here, you and your users can direct s3
server requests to new.host.com.
[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000 -e ENDPOINT=new.host.com scality/s3server
----
Note: In your `/etc/hosts` file on Linux, OS X, or Unix (with root
permissions), make sure to associate 127.0.0.1 with `new.host.com`.
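For example, the following `/etc/hosts` entry would do it:

[source,sourceCode,shell]
----
127.0.0.1 new.host.com
----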
[[scality_access_key_id-and-scality_secret_access_key]]
SCALITY_ACCESS_KEY_ID and SCALITY_SECRET_ACCESS_KEY
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These variables specify authentication credentials for an account named
"CustomAccount".
You can set credentials for many accounts by editing
`conf/authdata.json` (see below for further info), but if you just want
to specify one set of your own, you can use these environment variables.
[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000 -e SCALITY_ACCESS_KEY_ID=newAccessKey
-e SCALITY_SECRET_ACCESS_KEY=newSecretKey scality/s3server
----
Note: Anything in the `authdata.json` file will be ignored.

Note: The old `ACCESS_KEY` and `SECRET_KEY` environment variables are
now deprecated.
[[log_level]]
LOG_LEVEL
~~~~~~~~~
This variable allows you to change the log level: info, debug or trace.
The default is info. Debug will give you more detailed logs and trace
will give you the most detailed.
[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000 -e LOG_LEVEL=trace scality/s3server
----
[[ssl]]
SSL
~~~
Setting this variable to true allows you to run S3 with SSL:
**Note1**: You also need to specify the ENDPOINT environment variable.
**Note2**: In your `/etc/hosts` file on Linux, OS X, or Unix with root
permissions, make sure to associate 127.0.0.1 with `<YOUR_ENDPOINT>`
**Warning**: These certs, being self-signed (and the CA being generated
inside the container), will be untrusted by any clients, and could
disappear on a container upgrade. That is ok as long as it is for quick
testing. Also, best security practice for non-testing use would be to
have an extra container do SSL/TLS termination (such as
haproxy/nginx/stunnel), to limit what an exploit on either component
could expose, and to keep the certificates in a mounted volume.
[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000 -e SSL=TRUE -e ENDPOINT=<YOUR_ENDPOINT>
scality/s3server
----
More information about how to use S3 server with SSL is available
https://s3.scality.com/v1.0/page/scality-with-ssl[here].
[[listen_addr]]
LISTEN_ADDR
~~~~~~~~~~~
This variable instructs the Zenko CloudServer, and its data and metadata
components to listen on the specified address. This allows starting the
data or metadata servers as standalone services, for example.
[source,sourceCode,shell]
----
docker run -d --name s3server-data -p 9991:9991 -e LISTEN_ADDR=0.0.0.0
scality/s3server npm run start_dataserver
----
[[data_host-and-metadata_host]]
DATA_HOST and METADATA_HOST
~~~~~~~~~~~~~~~~~~~~~~~~~~~
These variables configure the data and metadata servers to use, usually
when they are running on another host and only starting the stateless
Zenko CloudServer.
[source,sourceCode,shell]
----
docker run -d --name s3server -e DATA_HOST=s3server-data
-e METADATA_HOST=s3server-metadata scality/s3server npm run start_s3server
----
[[redis_host]]
REDIS_HOST
~~~~~~~~~~
Use this variable to connect to the redis cache server on another host
than localhost.
[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000
-e REDIS_HOST=my-redis-server.example.com scality/s3server
----
[[redis_port]]
REDIS_PORT
~~~~~~~~~~
Use this variable to connect to the redis cache server on another port
than the default 6379.
[source,sourceCode,shell]
----
docker run -d --name s3server -p 8000:8000
-e REDIS_PORT=6379 scality/s3server
----
[[tunables-and-setup-tips]]
Tunables and Setup Tips
-----------------------
[[using-docker-volumes]]
Using Docker Volumes
~~~~~~~~~~~~~~~~~~~~
Zenko CloudServer runs with a file backend by default.
So, by default, the data is stored inside your Zenko CloudServer Docker
container.
However, if you want your data and metadata to persist, you *MUST* use
Docker volumes to host your data and metadata outside your Zenko
CloudServer Docker container. Otherwise, the data and metadata will be
destroyed when you erase the container.
[source,sourceCode,shell]
----
docker run -v $(pwd)/data:/usr/src/app/localData -v $(pwd)/metadata:/usr/src/app/localMetadata
-p 8000:8000 -d scality/s3server
----
This command mounts the host directory, `./data`, into the container at
`/usr/src/app/localData` and the host directory, `./metadata`, into the
container at `/usr/src/app/localMetadata`. It can also be any host mount
point, like `/mnt/data` and `/mnt/metadata`.
[[adding-modifying-or-deleting-accounts-or-users-credentials]]
Adding, modifying or deleting accounts or users' credentials
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Create locally a customized `authdata.json` based on our
`/conf/authdata.json`.
2. Use https://docs.docker.com/engine/tutorials/dockervolumes/[Docker
Volume] to override the default `authdata.json` through a Docker file
mapping. For example:
[source,sourceCode,shell]
----
docker run -v $(pwd)/authdata.json:/usr/src/app/conf/authdata.json -p 8000:8000 -d
scality/s3server
----
[[specifying-your-own-host-name]]
Specifying your own host name
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To specify a host name (e.g. s3.domain.name), you can provide your own
https://github.com/scality/S3/blob/master/config.json[config.json] using
https://docs.docker.com/engine/tutorials/dockervolumes/[Docker Volume].
First add a new key-value pair in the restEndpoints section of your
config.json. The key in the key-value pair should be the host name you
would like to add and the value is the default location_constraint for
this endpoint.
For example, `s3.example.com` is mapped to `us-east-1` which is one of
the `location_constraints` listed in your locationConfig.json file
https://github.com/scality/S3/blob/master/locationConfig.json[here].
More information about location configuration is available
https://github.com/scality/S3/blob/master/README.md#location-configuration[here].
[source,sourceCode,json]
----
"restEndpoints": {
"localhost": "file",
"127.0.0.1": "file",
...
"s3.example.com": "us-east-1"
},
----
Then, run your Scality S3 Server using
https://docs.docker.com/engine/tutorials/dockervolumes/[Docker Volume]:
[source,sourceCode,shell]
----
docker run -v $(pwd)/config.json:/usr/src/app/config.json -p 8000:8000 -d scality/s3server
----
Your local `config.json` file will override the default one through a
docker file mapping.
[[running-as-an-unprivileged-user]]
Running as an unprivileged user
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Zenko CloudServer runs as root by default.
You can change that by modifying the Dockerfile and specifying a user
before the entrypoint.
The user needs to exist within the container, and own the folder
*/usr/src/app* for Scality Zenko CloudServer to run properly.
For instance, you can modify these lines in the Dockerfile:
[source,sourceCode,shell]
----
...
&& groupadd -r -g 1001 scality \
&& useradd -u 1001 -g 1001 -d /usr/src/app -r scality \
&& chown -R scality:scality /usr/src/app
...
USER scality
ENTRYPOINT ["/usr/src/app/docker-entrypoint.sh"]
----
[[continuous-integration-with-docker-hosted-cloudserver]]
Continuous integration with Docker hosted CloudServer
-----------------------------------------------------
When you start the Docker Scality Zenko CloudServer image, you can
adjust the configuration of the Scality Zenko CloudServer instance by
passing one or more environment variables on the docker run command
line.
Sample ways to run it for CI are:
* With custom locations (one in-memory, one hosted on AWS), and custom
credentials mounted:
[source,sourceCode,shell]
----
docker run --name CloudServer -p 8000:8000
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json
-v $(pwd)/authdata.json:/usr/src/app/conf/authdata.json
-v ~/.aws/credentials:/root/.aws/credentials
-e S3DATA=multiple -e S3BACKEND=mem scality/s3server
----
* With custom locations, (one in-memory, one hosted on AWS, one file),
and custom credentials set as environment variables (see
link:#scality-access-key-id-and-scality-secret-access-key[this
section]):
[source,sourceCode,shell]
----
docker run --name CloudServer -p 8000:8000
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json
-v ~/.aws/credentials:/root/.aws/credentials
-v $(pwd)/data:/usr/src/app/localData -v $(pwd)/metadata:/usr/src/app/localMetadata
-e SCALITY_ACCESS_KEY_ID=accessKey1
-e SCALITY_SECRET_ACCESS_KEY=verySecretKey1
-e S3DATA=multiple -e S3BACKEND=mem scality/s3server
----
[[in-production-with-docker-hosted-cloudserver]]
In production with Docker hosted CloudServer
--------------------------------------------
In production, we expect that data will be persistent, that you will use
the multiple backends capabilities of Zenko CloudServer, and that you
will have a custom endpoint for your local storage, and custom
credentials for your local storage:
[source,sourceCode,shell]
----
docker run -d --name CloudServer
-v $(pwd)/data:/usr/src/app/localData -v $(pwd)/metadata:/usr/src/app/localMetadata
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json
-v $(pwd)/authdata.json:/usr/src/app/conf/authdata.json
-v ~/.aws/credentials:/root/.aws/credentials -e S3DATA=multiple
-e ENDPOINT=custom.endpoint.com
-p 8000:8000 scality/s3server
----

View File

@ -0,0 +1,714 @@
Integrations
============
[[high-availability]]
High Availability
-----------------
https://docs.docker.com/engine/swarm/[Docker swarm] is a clustering tool
developed by Docker and ready to use with its containers. It allows you
to start a service, which we define and use as a means to ensure Zenko
CloudServer's continuous availability to the end user. Indeed, a swarm
defines a manager and n workers among n+1 servers. We will do a basic
setup in this tutorial, with just 3 servers, which already provides a
strong service resiliency, whilst remaining easy to do as an individual.
We will use NFS through docker to share data and metadata between the
different servers.
You will see that the steps of this tutorial are defined as **On
Server**, **On Clients**, **On All Machines**. This refers respectively
to NFS Server, NFS Clients, or NFS Server and Clients. In our example,
the IP of the Server will be **10.200.15.113**, while the IPs of the
Clients will be *10.200.15.96 and 10.200.15.97*.
[[installing-docker]]
Installing docker
~~~~~~~~~~~~~~~~~
Any version from docker 1.12.6 onwards should work; we used Docker
17.03.0-ce for this tutorial.
[[on-all-machines]]
On All Machines
^^^^^^^^^^^^^^^
[[on-ubuntu-14.04]]
On Ubuntu 14.04
+++++++++++++++
The docker website has
https://docs.docker.com/engine/installation/linux/ubuntu/[solid
documentation]. We have chosen to install the aufs dependency, as
recommended by Docker. Here are the required commands:
[source,sourceCode,sh]
----
$> sudo apt-get update
$> sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
$> sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
$> curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$> sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
$> sudo apt-get update
$> sudo apt-get install docker-ce
----
[[on-centos-7]]
On CentOS 7
+++++++++++
The docker website has
https://docs.docker.com/engine/installation/linux/centos/[solid
documentation]. Here are the required commands:
[source,sourceCode,sh]
----
$> sudo yum install -y yum-utils
$> sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
$> sudo yum makecache fast
$> sudo yum install docker-ce
$> sudo systemctl start docker
----
[[configure-nfs]]
Configure NFS
~~~~~~~~~~~~~
[[on-clients]]
On Clients
^^^^^^^^^^
Your NFS Clients will mount Docker volumes over your NFS Server's shared
folders. Hence, you don't have to mount anything manually; you just have
to install the NFS commons:
[[on-ubuntu-14.04-1]]
On Ubuntu 14.04
+++++++++++++++
Simply install the NFS commons:
[source,sourceCode,sh]
----
$> sudo apt-get install nfs-common
----
[[on-centos-7-1]]
On CentOS 7
+++++++++++
Install the NFS utils, and then start the required services:
[source,sourceCode,sh]
----
$> yum install nfs-utils
$> sudo systemctl enable rpcbind
$> sudo systemctl enable nfs-server
$> sudo systemctl enable nfs-lock
$> sudo systemctl enable nfs-idmap
$> sudo systemctl start rpcbind
$> sudo systemctl start nfs-server
$> sudo systemctl start nfs-lock
$> sudo systemctl start nfs-idmap
----
[[on-server]]
On Server
^^^^^^^^^
Your NFS Server will be the machine to physically host the data and
metadata. The packages we will install on it are slightly different from
the ones we installed on the clients.
[[on-ubuntu-14.04-2]]
On Ubuntu 14.04
+++++++++++++++
Install the NFS server specific package and the NFS commons:
[source,sourceCode,sh]
----
$> sudo apt-get install nfs-kernel-server nfs-common
----
[[on-centos-7-2]]
On CentOS 7
+++++++++++
Same steps as with the client: install the NFS utils and start the
required services:
[source,sourceCode,sh]
----
$> yum install nfs-utils
$> sudo systemctl enable rpcbind
$> sudo systemctl enable nfs-server
$> sudo systemctl enable nfs-lock
$> sudo systemctl enable nfs-idmap
$> sudo systemctl start rpcbind
$> sudo systemctl start nfs-server
$> sudo systemctl start nfs-lock
$> sudo systemctl start nfs-idmap
----
[[on-ubuntu-14.04-and-centos-7]]
On Ubuntu 14.04 and CentOS 7
++++++++++++++++++++++++++++
Choose where your shared data and metadata from your local
http://www.zenko.io/cloudserver/[Zenko CloudServer] will be stored. We
chose to go with /var/nfs/data and /var/nfs/metadata. You also need to
set proper sharing permissions for these folders as they'll be shared
over NFS:
[source,sourceCode,sh]
----
$> mkdir -p /var/nfs/data /var/nfs/metadata
$> chmod -R 777 /var/nfs/
----
Now you need to update your */etc/exports* file. This is the file that
configures network permissions and rwx permissions for NFS access. By
default, Ubuntu applies the no_subtree_check option, so we declared both
folders with the same permissions, even though they're in the same tree:
[source,sourceCode,sh]
----
$> sudo vim /etc/exports
----
In this file, add the following lines:
[source,sourceCode,sh]
----
/var/nfs/data 10.200.15.96(rw,sync,no_root_squash) 10.200.15.97(rw,sync,no_root_squash)
/var/nfs/metadata 10.200.15.96(rw,sync,no_root_squash) 10.200.15.97(rw,sync,no_root_squash)
----
Export this new NFS table:
[source,sourceCode,sh]
----
$> sudo exportfs -a
----
Finally, you need to allow NFS mounts from Docker volumes on other
machines. You need to change the Docker config in
**/lib/systemd/system/docker.service**:
[source,sourceCode,sh]
----
$> sudo vim /lib/systemd/system/docker.service
----
In this file, change the *MountFlags* option:
[source,sourceCode,sh]
----
MountFlags=shared
----
Now you just need to restart the NFS server and docker daemons so your
changes apply.
[[on-ubuntu-14.04-3]]
On Ubuntu 14.04
+++++++++++++++
Restart your NFS Server and docker services:
[source,sourceCode,sh]
----
$> sudo service nfs-kernel-server restart
$> sudo service docker restart
----
[[on-centos-7-3]]
On CentOS 7
+++++++++++
Restart your NFS Server and docker daemons:
[source,sourceCode,sh]
----
$> sudo systemctl restart nfs-server
$> sudo systemctl daemon-reload
$> sudo systemctl restart docker
----
[[set-up-your-docker-swarm-service]]
Set up your Docker Swarm service
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[on-all-machines-1]]
On All Machines
^^^^^^^^^^^^^^^
[[on-ubuntu-14.04-and-centos-7-1]]
On Ubuntu 14.04 and CentOS 7
++++++++++++++++++++++++++++
We will now set up the Docker volumes that will be mounted to the NFS
Server and serve as data and metadata storage for Zenko CloudServer.
These two commands have to be replicated on all machines:
[source,sourceCode,sh]
----
$> docker volume create --driver local --opt type=nfs --opt o=addr=10.200.15.113,rw --opt device=:/var/nfs/data --name data
$> docker volume create --driver local --opt type=nfs --opt o=addr=10.200.15.113,rw --opt device=:/var/nfs/metadata --name metadata
----
There is no need to "docker exec" these volumes to mount them: the
Docker Swarm manager will do it when the Docker service is started.
[[on-server-1]]
On Server
+++++++++
To start a Docker service on a Docker Swarm cluster, you first have to
initialize that cluster (i.e.: define a manager), then have the
workers/nodes join in, and then start the service. Initialize the swarm
cluster, and look at the response:
[source,sourceCode,sh]
----
$> docker swarm init --advertise-addr 10.200.15.113
Swarm initialized: current node (db2aqfu3bzfzzs9b1kfeaglmq) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-5yxxencrdoelr7mpltljn325uz4v6fe1gojl14lzceij3nujzu-2vfs9u6ipgcq35r90xws3stka \
10.200.15.113:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
----
[[on-clients-1]]
On Clients
++++++++++
Simply copy/paste the command provided by your docker swarm init. When
all goes well, you'll get something like this:
[source,sourceCode,sh]
----
$> docker swarm join --token SWMTKN-1-5yxxencrdoelr7mpltljn325uz4v6fe1gojl14lzceij3nujzu-2vfs9u6ipgcq35r90xws3stka 10.200.15.113:2377
This node joined a swarm as a worker.
----
[[on-server-2]]
On Server
+++++++++
Start the service on your swarm cluster!
[source,sourceCode,sh]
----
$> docker service create --name s3 --replicas 1 --mount type=volume,source=data,target=/usr/src/app/localData --mount type=volume,source=metadata,target=/usr/src/app/localMetadata -p 8000:8000 scality/s3server
----
If you run `docker service ls`, you should have the following output:
[source,sourceCode,sh]
----
$> docker service ls
ID NAME MODE REPLICAS IMAGE
ocmggza412ft s3 replicated 1/1 scality/s3server:latest
----
If your service won't start, consider disabling apparmor/SELinux.
[[testing-your-high-availability-s3server]]
Testing your High Availability S3Server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[on-all-machines-2]]
On All Machines
^^^^^^^^^^^^^^^
[[on-ubuntu-14.04-and-centos-7-2]]
On Ubuntu 14.04 and CentOS 7
++++++++++++++++++++++++++++
Try to find out where your Scality Zenko CloudServer is actually running
using the *docker ps* command. It can be on any node of the swarm
cluster, manager or worker. When you find it, you can kill it, with
*docker stop <container id>* and you'll see it respawn on a different
node of the swarm cluster. This way, if one of your servers fails, or if
Docker stops unexpectedly, your end user will still be able to access
your local Zenko CloudServer.
[[troubleshooting]]
Troubleshooting
~~~~~~~~~~~~~~~
To troubleshoot the service you can run:
[source,sourceCode,sh]
----
$> docker service ps s3
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
0ar81cw4lvv8chafm8pw48wbc s3.1 scality/s3server localhost.localdomain.localdomain Running Running 7 days ago
cvmf3j3bz8w6r4h0lf3pxo6eu \_ s3.1 scality/s3server localhost.localdomain.localdomain Shutdown Failed 7 days ago "task: non-zero exit (137)"
----
If the error is truncated, it is possible to get a more detailed view of
the error by inspecting the Docker task ID:
[source,sourceCode,sh]
----
$> docker inspect cvmf3j3bz8w6r4h0lf3pxo6eu
----
[[off-you-go]]
Off you go!
~~~~~~~~~~~
Let us know what you use this functionality for, and if you'd like any
specific developments around it. Or, even better: come and contribute to
our https://github.com/scality/s3/[Github repository]! We look forward
to meeting you!
[[s3fs]]
S3FS
----
Export your buckets as a filesystem with s3fs on top of Zenko
CloudServer
https://github.com/s3fs-fuse/s3fs-fuse[s3fs] is an open source tool that
allows you to mount an S3 bucket on a filesystem-like backend. It is
available both on Debian and RedHat distributions. For this tutorial, we
used an Ubuntu 14.04 host to deploy and use s3fs over Scality's Zenko
CloudServer.
Deploying Zenko CloudServer with SSL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
First, you need to deploy **Zenko CloudServer**. This can be done very
easily via https://hub.docker.com/r/scality/s3server/[our DockerHub
page] (you want to run it with a file backend).
___________________________________________________________________________________________________________________________________________________________________________
_Note:_ _- If you don't have docker installed on your machine, here are
the https://docs.docker.com/engine/installation/[instructions to install
it for your distribution]_
___________________________________________________________________________________________________________________________________________________________________________
To use s3fs, you also have to set up SSL with Zenko CloudServer. We have
a nice
https://s3.scality.com/v1.0/page/scality-with-ssl[tutorial] to help you
do it.
[[s3fs-setup]]
s3fs setup
~~~~~~~~~~
[[installing-s3fs]]
Installing s3fs
^^^^^^^^^^^^^^^
s3fs has quite a few dependencies. As explained in their
https://github.com/s3fs-fuse/s3fs-fuse/blob/master/README.md#installation[README],
the following commands should install everything for Ubuntu 14.04:
[source,sourceCode,sh]
----
$> sudo apt-get install automake autotools-dev g++ git libcurl4-gnutls-dev
$> sudo apt-get install libfuse-dev libssl-dev libxml2-dev make pkg-config
----
Now you want to install s3fs per se:
[source,sourceCode,sh]
----
$> git clone https://github.com/s3fs-fuse/s3fs-fuse.git
$> cd s3fs-fuse
$> ./autogen.sh
$> ./configure
$> make
$> sudo make install
----
Check that s3fs is properly installed by checking its version; it should
answer as below:
[source,sourceCode,sh]
----
$> s3fs --version
----
____________________________________________________________________________
Amazon Simple Storage Service File System V1.80(commit:d40da2c) with
OpenSSL
____________________________________________________________________________
[[configuring-s3fs]]
Configuring s3fs
^^^^^^^^^^^^^^^^
s3fs expects you to provide it with a password file. Our file is
`/etc/passwd-s3fs`. The structure for this file is
`ACCESSKEYID:SECRETKEYID`, so, for S3Server, you can run:
[source,sourceCode,sh]
----
$> echo 'accessKey1:verySecretKey1' > /etc/passwd-s3fs
$> chmod 600 /etc/passwd-s3fs
----
Using Zenko CloudServer with s3fs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
First, you're going to need a mountpoint; we chose `/mnt/tests3fs`:
[source,sourceCode,sh]
----
$> mkdir /mnt/tests3fs
----
Then, you want to create a bucket on your local Zenko CloudServer; we
named it `tests3fs`:
[source,sourceCode,sh]
----
$> s3cmd mb s3://tests3fs
----

*Note:* If you've never used s3cmd with our Zenko CloudServer, our
README provides you with a
https://github.com/scality/S3/blob/master/README.md#s3cmd[recommended config].
Now you can mount your bucket to your mountpoint with s3fs:
[source,sourceCode,sh]
----
$> s3fs tests3fs /mnt/tests3fs -o passwd_file=/etc/passwd-s3fs -o url="https://s3.scality.test:8000/" -o use_path_request_style
----

If you're curious, the structure of this command is
`s3fs BUCKET_NAME PATH/TO/MOUNTPOINT -o OPTIONS`, and the options are
mandatory and serve the following purposes:

* `passwd_file`: specify the path to the password file;
* `url`: specify the hostname used by your SSL provider;
* `use_path_request_style`: force path style (by default, s3fs uses
  subdomains (DNS style)).
From now on, you can either add files to your mountpoint, or add objects
to your bucket, and they'll show in the other. +
For example, let's create two files, and then a directory with a file
in our mountpoint:
[source,sourceCode,sh]
----
$> touch /mnt/tests3fs/file1 /mnt/tests3fs/file2
$> mkdir /mnt/tests3fs/dir1
$> touch /mnt/tests3fs/dir1/file3
----
Now, I can use s3cmd to show me what is actually in S3Server:
[source,sourceCode,sh]
----
$> s3cmd ls -r s3://tests3fs
2017-02-28 17:28 0 s3://tests3fs/dir1/
2017-02-28 17:29 0 s3://tests3fs/dir1/file3
2017-02-28 17:28 0 s3://tests3fs/file1
2017-02-28 17:28 0 s3://tests3fs/file2
----
Now you can enjoy a filesystem view on your local Zenko CloudServer!
[[duplicity]]
Duplicity
---------
How to backup your files with Zenko CloudServer.
[[installing]]
Installing
~~~~~~~~~~
[[installing-duplicity-and-its-dependencies]]
Installing Duplicity and its dependencies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To install http://duplicity.nongnu.org/index.html[Duplicity], you have
to download
https://code.launchpad.net/duplicity/0.7-series/0.7.11/+download/duplicity-0.7.11.tar.gz[this
tarball], decompress it, and then check out the README inside, which
will give you a list of dependencies to install. If you're using Ubuntu
14.04, this is your lucky day: here is a lazy step-by-step install:
[source,sourceCode,sh]
----
$> apt-get install librsync-dev gnupg
$> apt-get install python-dev python-pip python-lockfile
$> pip install -U boto
----
Then you want to actually install Duplicity:
[source,sourceCode,sh]
----
$> tar zxvf duplicity-0.7.11.tar.gz
$> cd duplicity-0.7.11
$> python setup.py install
----
[[using]]
Using
~~~~~
[[testing-your-installation]]
Testing your installation
^^^^^^^^^^^^^^^^^^^^^^^^^
First, we're just going to quickly check that Zenko CloudServer is
actually running. To do so, simply run `$> docker ps` . You should see
one container named `scality/s3server`. If that is not the case, try
`$> docker start s3server`, and check again.
Secondly, as you probably know, Duplicity uses a module called *Boto* to
send requests to S3. Boto requires a configuration file located in
*`/etc/boto.cfg`* to have your credentials and preferences. Here is a
minimalistic config
http://boto.cloudhackers.com/en/latest/getting_started.html[that you can
finetune following these instructions].
....
[Credentials]
aws_access_key_id = accessKey1
aws_secret_access_key = verySecretKey1
[Boto]
# If using SSL, set to True
is_secure = False
# If using SSL, unmute and provide absolute path to local CA certificate
# ca_certificates_file = /absolute/path/to/ca.crt
....

*Note:* If you want to set up SSL with Zenko CloudServer, check out our
`tutorial <http://link/to/SSL/tutorial>`__.
At this point, we've met all the requirements to start running Zenko
CloudServer as a backend to Duplicity. So we should be able to back up a
local folder/file to local S3. Let's try with the duplicity decompressed
folder:
[source,sourceCode,sh]
----
$> duplicity duplicity-0.7.11 "s3://127.0.0.1:8000/testbucket/"
----

*Note:* Duplicity will prompt you for a symmetric encryption passphrase.
Save it somewhere, as you will need it to recover your data.
Alternatively, you can add the `--no-encryption` flag and the data will
be stored plain.
If this command is successful, you will get an output looking like this:
....
--------------[ Backup Statistics ]--------------
StartTime 1486486547.13 (Tue Feb 7 16:55:47 2017)
EndTime 1486486547.40 (Tue Feb 7 16:55:47 2017)
ElapsedTime 0.27 (0.27 seconds)
SourceFiles 388
SourceFileSize 6634529 (6.33 MB)
NewFiles 388
NewFileSize 6634529 (6.33 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 388
RawDeltaSize 6392865 (6.10 MB)
TotalDestinationSizeChange 2003677 (1.91 MB)
Errors 0
-------------------------------------------------
....
Congratulations! You can now back up to your local S3 through Duplicity
:)
[[automating-backups]]
Automating backups
^^^^^^^^^^^^^^^^^^
Now you probably want to back up your files periodically. The easiest
way to do this is to write a bash script and add it to your crontab.
Here is my suggestion for such a file:
[source,sourceCode,sh]
----
#!/bin/bash
# Export your passphrase so you don't have to type anything
export PASSPHRASE="mypassphrase"
# If you want to use a GPG Key, put it here and unmute the line below
#GPG_KEY=
# Define your backup bucket, with localhost specified
DEST="s3://127.0.0.1:8000/testbuckets3server/"
# Define the absolute path to the folder you want to backup
SOURCE=/root/testfolder
# Set to "full" for full backups, and "incremental" for incremental backups
# Warning: you have to perform one full backup before you can perform
# incremental ones on top of it
FULL=incremental
# How long to keep backups for; if you don't want to delete old
# backups, keep empty; otherwise, syntax is "1Y" for one year, "1M"
# for one month, "1D" for one day
OLDER_THAN="1Y"
# is_running checks whether duplicity is currently completing a task
is_running=$(ps -ef | grep duplicity | grep python | wc -l)
# If duplicity is already completing a task, this will simply not run
if [ $is_running -eq 0 ]; then
echo "Backup for ${SOURCE} started"
# If you want to delete backups older than a certain time, we do it here
if [ "$OLDER_THAN" != "" ]; then
echo "Removing backups older than ${OLDER_THAN}"
duplicity remove-older-than ${OLDER_THAN} ${DEST}
fi
# This is where the actual backup takes place
echo "Backing up ${SOURCE}..."
duplicity ${FULL} \
${SOURCE} ${DEST}
# If you're using GPG, paste this in the command above
# --encrypt-key=${GPG_KEY} --sign-key=${GPG_KEY} \
# If you want to exclude a subfolder/file, put it below and
# paste this
# in the command above
# --exclude=/${SOURCE}/path_to_exclude \
echo "Backup for ${SOURCE} complete"
echo "------------------------------------"
fi
# Forget the passphrase...
unset PASSPHRASE
----
So let's say you put this file in `/usr/local/sbin/backup.sh`. Next you
want to run `crontab -e` and paste your configuration in the file that
opens. If you're unfamiliar with Cron, here is a good
https://help.ubuntu.com/community/CronHowto[How To]. The folder I'm
backing up is a folder I modify constantly during my workday, so I want
incremental backups every 5 minutes from 8AM to 9PM, Monday to Friday.
Here is the line I will paste in my crontab:
[source,sourceCode,cron]
----
*/5 8-20 * * 1-5 /usr/local/sbin/backup.sh
----
Now I can try and add / remove files from the folder I'm backing up, and
I will see incremental backups in my bucket.

View File

@ -0,0 +1,444 @@
Using Public Clouds as data backends
====================================
[[introduction]]
Introduction
------------
As stated in our link:../GETTING_STARTED/#location-configuration[GETTING
STARTED guide], new data backends can be added by creating a region
(also called location constraint) with the right endpoint and
credentials. This section of the documentation shows you how to set up
our currently supported public cloud backends:
* link:#aws-s3-as-a-data-backend[Amazon S3] ;
* link:#microsoft-azure-as-a-data-backend[Microsoft Azure] .
For each public cloud backend, you will have to edit your CloudServer
`locationConfig.json` and do a few setup steps on the applicable public
cloud backend.
[[aws-s3-as-a-data-backend]]
AWS S3 as a data backend
------------------------
[[from-the-aws-s3-console-or-any-aws-s3-cli-tool]]
From the AWS S3 Console (or any AWS S3 CLI tool)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Create a bucket where you will host your data for this new location
constraint. This bucket must have versioning enabled:
* This is an option you may choose to activate at step 2 of Bucket
Creation in the Console;
* With AWS CLI, use `put-bucket-versioning` from the `s3api` commands on
your bucket of choice;
* Using other tools, please refer to your tool's documentation.
In this example, our bucket will be named `zenkobucket` and has
versioning enabled.
[[from-the-cloudserver-repository]]
From the CloudServer repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[locationconfig.json]]
locationConfig.json
^^^^^^^^^^^^^^^^^^^
Edit this file to add a new location constraint. This location
constraint will contain the information for the AWS S3 bucket to which
you will be writing your data whenever you create a CloudServer bucket
in this location. There are a few configurable options here:
* `type` : set to `aws_s3` to indicate this location constraint is
writing data to AWS S3;
* `legacyAwsBehavior` : set to `true` to indicate this region should
behave like AWS S3 `us-east-1` region, set to `false` to indicate this
region should behave like any other AWS S3 region;
* `bucketName` : set to an _existing bucket_ in your AWS S3 Account;
this is the bucket in which your data will be stored for this location
constraint;
* `awsEndpoint` : set to your bucket's endpoint, usually
`s3.amazonaws.com`;
* `bucketMatch` : set to `true` if you want your object name to be the
same in your local bucket and your AWS S3 bucket; set to `false` if you
want your object name to be of the form
`{{localBucketName}}/{{objectname}}` in your AWS S3 hosted bucket;
* `credentialsProfile` and `credentials` are two ways to provide your
AWS S3 credentials for that bucket, _use only one of them_ :
** `credentialsProfile` : set to the profile name allowing you to access
your AWS S3 bucket from your `~/.aws/credentials` file;
** `credentials` : set the two fields inside the object (`accessKey` and
`secretKey`) to their respective values from your AWS credentials.
[source,sourceCode,json]
----
(...)
"aws-test": {
"type": "aws_s3",
"legacyAwsBehavior": true,
"details": {
"awsEndpoint": "s3.amazonaws.com",
"bucketName": "zenkobucket",
"bucketMatch": true,
"credentialsProfile": "zenko"
}
},
(...)
----
[source,sourceCode,json]
----
(...)
"aws-test": {
"type": "aws_s3",
"legacyAwsBehavior": true,
"details": {
"awsEndpoint": "s3.amazonaws.com",
"bucketName": "zenkobucket",
"bucketMatch": true,
"credentials": {
"accessKey": "WHDBFKILOSDDVF78NPMQ",
"secretKey": "87hdfGCvDS+YYzefKLnjjZEYstOIuIjs/2X72eET"
}
}
},
(...)
----
__________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
*warning*
If you set `bucketMatch` to `true`, we strongly advise that you only
have one local bucket per AWS S3 location. With `bucketMatch` set to
`true`, your object names in your AWS S3 bucket will not be prefixed
with your CloudServer bucket name. This means that if you put an object
`foo` to your CloudServer bucket `zenko1` and you then put a different
`foo` to your CloudServer bucket `zenko2` and both `zenko1` and `zenko2`
point to the same AWS bucket, the second `foo` will overwrite the first
`foo`.
__________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
[[awscredentials]]
~/.aws/credentials
^^^^^^^^^^^^^^^^^^
__________________________________________________________________________________________________________________________________________________________________________
*tip*
If you explicitly set your `accessKey` and `secretKey` in the
`credentials` object of your `aws_s3` location in your
`locationConfig.json` file, you may skip this section
__________________________________________________________________________________________________________________________________________________________________________
Make sure your `~/.aws/credentials` file has a profile matching the one
defined in your `locationConfig.json`. Following our previous example,
it would look like:
[source,sourceCode,shell]
----
[zenko]
aws_access_key_id=WHDBFKILOSDDVF78NPMQ
aws_secret_access_key=87hdfGCvDS+YYzefKLnjjZEYstOIuIjs/2X72eET
----
[[start-the-server-with-the-ability-to-write-to-aws-s3]]
Start the server with the ability to write to AWS S3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Inside the repository, once all the files have been edited, you should
be able to start the server and start writing data to AWS S3 through
CloudServer.
[source,sourceCode,shell]
----
# Start the server locally
$> S3DATA=multiple npm start
----
[[run-the-server-as-a-docker-container-with-the-ability-to-write-to-aws-s3]]
Run the server as a docker container with the ability to write to AWS S3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
____________________________________________________________________________________________________________________________
*tip*
If you set the `credentials` object in your `locationConfig.json` file,
you don't need to mount your `.aws/credentials` file
____________________________________________________________________________________________________________________________
Mount all the files that have been edited to override defaults, and do a
standard Docker run; then you can start writing data to AWS S3 through
CloudServer.
[source,sourceCode,shell]
----
# Start the server in a Docker container
$> sudo docker run -d --name CloudServer \
-v $(pwd)/data:/usr/src/app/localData \
-v $(pwd)/metadata:/usr/src/app/localMetadata \
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json \
-v $(pwd)/conf/authdata.json:/usr/src/app/conf/authdata.json \
-v ~/.aws/credentials:/root/.aws/credentials \
-e S3DATA=multiple -e ENDPOINT=http://localhost -p 8000:8000 \
scality/s3server
----
[[testing-put-an-object-to-aws-s3-using-cloudserver]]
Testing: put an object to AWS S3 using CloudServer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to start testing pushing to AWS S3, you will need to create a
local bucket in the AWS S3 location constraint - this local bucket will
only store the metadata locally, while both the data and any user
metadata (`x-amz-meta` headers sent with a PUT object, and tags) will be
stored on AWS S3. This example is based on all our previous steps.
[source,sourceCode,shell]
----
# Create a local bucket storing data in AWS S3
$> s3cmd --host=127.0.0.1:8000 mb s3://zenkobucket --region=aws-test
# Put an object to AWS S3, and store the metadata locally
$> s3cmd --host=127.0.0.1:8000 put /etc/hosts s3://zenkobucket/testput
upload: '/etc/hosts' -> 's3://zenkobucket/testput' [1 of 1]
330 of 330 100% in 0s 380.87 B/s done
# List locally to check you have the metadata
$> s3cmd --host=127.0.0.1:8000 ls s3://zenkobucket
2017-10-23 10:26 330 s3://zenkobucket/testput
----
Then, from the AWS Console, if you go into your bucket, you should see
your newly uploaded object:
image:../res/aws-console-successful-put.png[image]
[[troubleshooting]]
Troubleshooting
~~~~~~~~~~~~~~~
Make sure your `~/.s3cfg` file has credentials matching your local
CloudServer credentials defined in `conf/authdata.json`. By default, the
access key is `accessKey1` and the secret key is `verySecretKey1`. For
more information, refer to our template link:./CLIENTS/#s3cmd[~/.s3cfg].
Pre-existing objects in your AWS S3 hosted bucket can unfortunately not
be accessed by CloudServer at this time.
Make sure versioning is enabled in your remote AWS S3 hosted bucket. To
check, using the AWS Console, click on your bucket name, then on
"Properties" at the top, and then you should see something like this:
image:../res/aws-console-versioning-enabled.png[image]
[[microsoft-azure-as-a-data-backend]]
Microsoft Azure as a data backend
---------------------------------
[[from-the-ms-azure-console]]
From the MS Azure Console
~~~~~~~~~~~~~~~~~~~~~~~~~
From your Storage Account dashboard, create a container where you will
host your data for this new location constraint.
You will also need to get one of your Storage Account Access Keys, and
to provide it to CloudServer. This can be found from your Storage
Account dashboard, under "Settings", then "Access keys".
In this example, our container will be named `zenkontainer`, and will
belong to the `zenkomeetups` Storage Account.
[[from-the-cloudserver-repository-1]]
From the CloudServer repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[locationconfig.json-1]]
locationConfig.json
^^^^^^^^^^^^^^^^^^^
Edit this file to add a new location constraint. This location
constraint will contain the information for the MS Azure container to
which you will be writing your data whenever you create a CloudServer
bucket in this location. There are a few configurable options here:
* `type` : set to `azure` to indicate this location constraint is
writing data to MS Azure;
* `legacyAwsBehavior` : set to `true` to indicate this region should
behave like AWS S3 `us-east-1` region, set to `false` to indicate this
region should behave like any other AWS S3 region (in the case of MS
Azure hosted data, this is mostly relevant for the format of errors);
* `azureStorageEndpoint` : set to your storage account's endpoint,
usually `https://{{storageAccountName}}.blob.core.windows.net`;
* `azureContainerName` : set to an _existing container_ in your MS Azure
storage account; this is the container in which your data will be stored
for this location constraint;
* `bucketMatch` : set to `true` if you want your object name to be the
same in your local bucket and your MS Azure container; set to `false` if
you want your object name to be of the form
`{{localBucketName}}/{{objectname}}` in your MS Azure container ;
* `azureStorageAccountName` : the MS Azure Storage Account to which your
container belongs;
* `azureStorageAccessKey` : one of the Access Keys associated to the
above defined MS Azure Storage Account.
[source,sourceCode,json]
----
(...)
"azure-test": {
"type": "azure",
"legacyAwsBehavior": false,
"details": {
"azureStorageEndpoint": "https://zenkomeetups.blob.core.windows.net/",
"bucketMatch": true,
"azureContainerName": "zenkontainer",
"azureStorageAccountName": "zenkomeetups",
"azureStorageAccessKey": "auhyDo8izbuU4aZGdhxnWh0ODKFP3IWjsN1UfFaoqFbnYzPj9bxeCVAzTIcgzdgqomDKx6QS+8ov8PYCON0Nxw=="
}
},
(...)
----
____
*warning*
If you set `bucketMatch` to `true`, we strongly advise that you only
have one local bucket per MS Azure location. With `bucketMatch` set to
`true`, your object names in your MS Azure container will not be
prefixed with your CloudServer bucket name. This means that if you put
an object `foo` to your CloudServer bucket `zenko1` and you then put a
different `foo` to your CloudServer bucket `zenko2`, and both `zenko1`
and `zenko2` point to the same MS Azure container, the second `foo` will
overwrite the first `foo`.
____
____
*tip*
You may export environment variables to *override* some of your
`locationConfig.json` variables; the syntax for them is
`{{region-name}}_{{ENV_VAR_NAME}}`. Currently, the available variables
are those shown below, with the values used in the current example:
[source,sourceCode,shell]
----
$> export azure-test_AZURE_STORAGE_ACCOUNT_NAME="zenkomeetups"
$> export azure-test_AZURE_STORAGE_ACCESS_KEY="auhyDo8izbuU4aZGdhxnWh0ODKFP3IWjsN1UfFaoqFbnYzPj9bxeCVAzTIcgzdgqomDKx6QS+8ov8PYCON0Nxw=="
$> export azure-test_AZURE_STORAGE_ENDPOINT="https://zenkomeetups.blob.core.windows.net/"
----
____
[[start-the-server-with-the-ability-to-write-to-ms-azure]]
Start the server with the ability to write to MS Azure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Inside the repository, once all the files have been edited, you should
be able to start the server and start writing data to MS Azure through
CloudServer.
[source,sourceCode,shell]
----
# Start the server locally
$> S3DATA=multiple npm start
----
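Before moving on, you can check that the server is actually listening on
port 8000, for example with a plain curl request; an S3-style XML error
response to this unauthenticated call simply means the server is up:
[source,sourceCode,shell]
----
# An AccessDenied-style XML error is expected here and means the
# server is running and reachable
$> curl -i http://localhost:8000
----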
[[run-the-server-as-a-docker-container-with-the-ability-to-write-to-ms-azure]]
Run the server as a Docker container with the ability to write to MS Azure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mount all the files that have been edited to override defaults, and do a
standard Docker run; then you can start writing data to MS Azure through
CloudServer.
[source,sourceCode,shell]
----
# Start the server in a Docker container
$> sudo docker run -d --name CloudServer \
-v $(pwd)/data:/usr/src/app/localData \
-v $(pwd)/metadata:/usr/src/app/localMetadata \
-v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json \
-v $(pwd)/conf/authdata.json:/usr/src/app/conf/authdata.json \
-e S3DATA=multiple -e ENDPOINT=http://localhost -p 8000:8000 \
scality/s3server
----
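To make sure the container came up with the mounted configuration, you
can inspect its state and logs with standard Docker commands, for
instance:
[source,sourceCode,shell]
----
# Confirm the container is running
$> sudo docker ps --filter name=CloudServer
# Follow the CloudServer logs to catch configuration errors early
$> sudo docker logs -f CloudServer
----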
[[testing-put-an-object-to-ms-azure-using-cloudserver]]
Testing: put an object to MS Azure using CloudServer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to start testing pushing to MS Azure, you will need to create a
local bucket in the MS Azure region - this local bucket will only store
the metadata locally, while both the data and any user metadata
(`x-amz-meta` headers sent with a PUT object, and tags) will be stored
on MS Azure. This example is based on all our previous steps.
[source,sourceCode,shell]
----
# Create a local bucket storing data in MS Azure
$> s3cmd --host=127.0.0.1:8000 mb s3://zenkontainer --region=azure-test
# Put an object to MS Azure, and store the metadata locally
$> s3cmd --host=127.0.0.1:8000 put /etc/hosts s3://zenkontainer/testput
upload: '/etc/hosts' -> 's3://zenkontainer/testput' [1 of 1]
330 of 330 100% in 0s 380.87 B/s done
# List locally to check you have the metadata
$> s3cmd --host=127.0.0.1:8000 ls s3://zenkontainer
2017-10-24 14:38 330 s3://zenkontainer/testput
----
Then, from the MS Azure Console, if you go into your container, you
should see your newly uploaded object:
image:../res/azure-console-successful-put.png[image]
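Alternatively, you can verify the upload from the command line with the
Azure CLI; this is only a sketch, reusing the account name and Access
Key from the `locationConfig.json` example above:
[source,sourceCode,shell]
----
# List the blobs CloudServer wrote to the MS Azure container
$> az storage blob list --container-name zenkontainer \
   --account-name zenkomeetups --account-key {{yourAccessKey}} \
   --output table
----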
[[troubleshooting-1]]
Troubleshooting
~~~~~~~~~~~~~~~
Make sure your `~/.s3cfg` file has credentials matching your local
CloudServer credentials defined in `conf/authdata.json`. By default, the
access key is `accessKey1` and the secret key is `verySecretKey1`. For
more information, refer to our template link:./CLIENTS/#s3cmd[~/.s3cfg].
Pre-existing objects in your MS Azure container cannot currently be
accessed by CloudServer.
[[for-any-data-backend]]
For any data backend
--------------------
[[from-the-cloudserver-repository-2]]
From the CloudServer repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[[config.json]]
config.json
^^^^^^^^^^^
____
*important*
You only need to follow this section if you want to define a given
location as the default for a specific endpoint.
____
Edit the `restEndpoints` section of your `config.json` file to add an
endpoint definition mapping the hostname to the location you want to use
as its default. In this example, we'll make `custom-location` the default
location for the endpoint `zenkotos3.com`:
[source,sourceCode,json]
----
(...)
"restEndpoints": {
"localhost": "us-east-1",
"127.0.0.1": "us-east-1",
"cloudserver-front": "us-east-1",
"s3.docker.test": "us-east-1",
"127.0.0.2": "us-east-1",
"zenkotos3.com": "custom-location"
},
(...)
----
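Once this endpoint is defined, any bucket created through that hostname
will be placed in `custom-location` by default. A quick way to try it,
assuming `zenkotos3.com` resolves to the machine running CloudServer
(for example through an `/etc/hosts` entry) and using a hypothetical
bucket name, is:
[source,sourceCode,shell]
----
# No --region flag: the bucket is created in "custom-location"
$> s3cmd --host=zenkotos3.com:8000 mb s3://defaultbucket
----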