This section of the documentation shows you how to set up our currently
supported public cloud backends:

- `Amazon S3 <#aws-s3-as-a-data-backend>`__ ;
- `Microsoft Azure <#microsoft-azure-as-a-data-backend>`__ ;
- `Google Cloud Storage <#google-cloud-storage-as-a-data-backend>`__ .

For each public cloud backend, you will have to edit your CloudServer
:code:`locationConfig.json` and do a few setup steps on the applicable public
cloud account.
For more information, refer to our template `~/.s3cfg <./CLIENTS/#s3cmd>`__ .

Pre-existing objects in your MS Azure container can unfortunately not be
accessed by CloudServer at this time.

Google Cloud Storage as a data backend
--------------------------------------

From the Google Cloud Console
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From the Google Cloud Console, create two buckets for this new location
constraint: one bucket where you will host your data, and the other for
performing multipart uploads.

You will also need to get one of your Interoperability Credentials and provide
it to CloudServer. These can be found in the Google Cloud Storage "Settings"
tab, under "Interoperability".

In this example, our buckets will be ``zenkobucket`` and ``zenkompubucket``.

From the CloudServer repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

locationConfig.json
^^^^^^^^^^^^^^^^^^^

Edit this file to add a new location constraint. This location constraint will
contain the information for the Google Cloud Storage bucket to which you will
be writing your data whenever you create a CloudServer bucket in this location.
There are a few configurable options here:

- :code:`type` : set to :code:`gcp` to indicate this location constraint is
  writing data to Google Cloud Storage;
- :code:`legacyAwsBehavior` : set to :code:`true` to indicate this region
  should behave like the AWS S3 :code:`us-east-1` region, set to :code:`false`
  to indicate this region should behave like any other AWS S3 region;
- :code:`bucketName` : set to an *existing bucket* in your Google Cloud Storage
  account; this is the bucket in which your data will be stored for this
  location constraint;
- :code:`mpuBucketName` : set to an *existing bucket* in your Google Cloud
  Storage account; this is the bucket in which parts for multipart uploads
  will be stored for this location constraint;
- :code:`gcpEndpoint` : set to your bucket's endpoint, usually
  :code:`storage.googleapis.com`;
- :code:`bucketMatch` : set to :code:`true` if you want your object name to be
  the same in your local bucket and your Google Cloud Storage bucket; set to
  :code:`false` if you want your object name to be of the form
  :code:`{{localBucketName}}/{{objectname}}` in your Google Cloud Storage
  hosted bucket;
- :code:`credentialsProfile` and :code:`credentials` are two ways to provide
  your Google Cloud Storage Interoperability credentials for that bucket,
  *use only one of them* :

  - :code:`credentialsProfile` : set to the profile name allowing you to
    access your Google Cloud Storage bucket from your
    :code:`~/.aws/credentials` file;
  - :code:`credentials` : set the two fields inside the object
    (:code:`accessKey` and :code:`secretKey`) to their respective values from
    your Google Cloud Storage Interoperability credentials.

.. code:: json

    (...)
    "gcp-test": {
        "type": "gcp",
        "legacyAwsBehavior": true,
        "details": {
            "gcpEndpoint": "storage.googleapis.com",
            "bucketName": "zenkobucket",
            "mpuBucketName": "zenkompubucket",
            "bucketMatch": true,
            "credentialsProfile": "zenko"
        }
    },
    (...)

.. code:: json

    (...)
    "gcp-test": {
        "type": "gcp",
        "legacyAwsBehavior": true,
        "details": {
            "gcpEndpoint": "storage.googleapis.com",
            "bucketName": "zenkobucket",
            "bucketMatch": true,
            "mpuBucketName": "zenkompubucket",
            "credentials": {
                "accessKey": "WHDBFKILOSDDVF78NPMQ",
                "secretKey": "87hdfGCvDS+YYzefKLnjjZEYstOIuIjs/2X72eET"
            }
        }
    },
    (...)

.. WARNING::
    If you set :code:`bucketMatch` to :code:`true`, we strongly advise that
    you only have one local bucket per Google Cloud Storage location.
    With :code:`bucketMatch` set to :code:`true`, your object names in your
    Google Cloud Storage bucket will not be prefixed with your CloudServer
    bucket name. This means that if you put an object :code:`foo` to your
    CloudServer bucket :code:`zenko1` and you then put a different :code:`foo`
    to your CloudServer bucket :code:`zenko2`, and both :code:`zenko1` and
    :code:`zenko2` point to the same Google Cloud Storage bucket, the second
    :code:`foo` will overwrite the first :code:`foo`.
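The naming rule above can be sketched in Python; this is an illustration of
the behavior, not CloudServer's actual code:

.. code:: python

    def gcs_object_key(local_bucket, object_name, bucket_match):
        """Key used in the GCS hosted bucket for a CloudServer object."""
        if bucket_match:
            return object_name
        return "{}/{}".format(local_bucket, object_name)

    # With bucketMatch=true, zenko1/foo and zenko2/foo collide on "foo":
    assert gcs_object_key("zenko1", "foo", True) == \
        gcs_object_key("zenko2", "foo", True)
    # With bucketMatch=false, the keys stay distinct:
    assert gcs_object_key("zenko1", "foo", False) == "zenko1/foo"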
~/.aws/credentials
^^^^^^^^^^^^^^^^^^

.. TIP::
    If you explicitly set your :code:`accessKey` and :code:`secretKey` in the
    :code:`credentials` object of your :code:`gcp` location in your
    :code:`locationConfig.json` file, you may skip this section.

Make sure your :code:`~/.aws/credentials` file has a profile matching the one
defined in your :code:`locationConfig.json`. Following our previous example,
it would look like:

.. code:: shell

    [zenko]
    aws_access_key_id=WHDBFKILOSDDVF78NPMQ
    aws_secret_access_key=87hdfGCvDS+YYzefKLnjjZEYstOIuIjs/2X72eET

Start the server with the ability to write to Google Cloud Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Inside the repository, once all the files have been edited, you should be able
to start the server and start writing data to Google Cloud Storage through
CloudServer.

.. code:: shell

    # Start the server locally
    $> S3DATA=multiple npm start

Run the server as a docker container with the ability to write to Google Cloud Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. TIP::
    If you set the :code:`credentials` object in your
    :code:`locationConfig.json` file, you don't need to mount your
    :code:`.aws/credentials` file.

Mount all the files that have been edited to override defaults, and do a
standard Docker run; then you can start writing to Google Cloud Storage
through CloudServer.

.. code:: shell

    # Start the server in a Docker container
    $> sudo docker run -d --name CloudServer \
    -v $(pwd)/data:/usr/src/app/localData \
    -v $(pwd)/metadata:/usr/src/app/localMetadata \
    -v $(pwd)/locationConfig.json:/usr/src/app/locationConfig.json \
    -v $(pwd)/conf/authdata.json:/usr/src/app/conf/authdata.json \
    -v ~/.aws/credentials:/root/.aws/credentials \
    -e S3DATA=multiple -e ENDPOINT=http://localhost -p 8000:8000 \
    scality/s3server

Testing: put an object to Google Cloud Storage using CloudServer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to start testing pushing to Google Cloud Storage, you will need to
create a local bucket in the Google Cloud Storage location constraint - this
local bucket will only store the metadata locally, while both the data and any
user metadata (:code:`x-amz-meta` headers sent with a PUT object, and tags)
will be stored on Google Cloud Storage.
This example is based on all our previous steps.

.. code:: shell

    # Create a local bucket storing data in Google Cloud Storage
    $> s3cmd --host=127.0.0.1:8000 mb s3://zenkobucket --region=gcp-test
    # Put an object to Google Cloud Storage, and store the metadata locally
    $> s3cmd --host=127.0.0.1:8000 put /etc/hosts s3://zenkobucket/testput
    upload: '/etc/hosts' -> 's3://zenkobucket/testput' [1 of 1]
    330 of 330 100% in 0s 380.87 B/s done
    # List locally to check you have the metadata
    $> s3cmd --host=127.0.0.1:8000 ls s3://zenkobucket
    2017-10-23 10:26 330 s3://zenkobucket/testput

Then, from the Google Cloud Console, if you go into your bucket, you should
see your newly uploaded object:

.. figure:: ../res/gcp-console-successful-put.png
    :alt: Google Cloud Storage Console upload example

Troubleshooting
~~~~~~~~~~~~~~~

Make sure your :code:`~/.s3cfg` file has credentials matching your local
CloudServer credentials defined in :code:`conf/authdata.json`. By default, the
access key is :code:`accessKey1` and the secret key is :code:`verySecretKey1`.
For more information, refer to our template `~/.s3cfg <./CLIENTS/#s3cmd>`__ .

Pre-existing objects in your Google Cloud Storage hosted bucket can
unfortunately not be accessed by CloudServer at this time.

For any data backend
--------------------
## Google Cloud Storage Backend

### Overall Design

The Google Cloud Storage backend is implemented using the `aws-sdk` service
class for AWS-compatible methods. The structure of these methods is described
in the `gcp-2017-11-01.api.json` file: request inputs, response outputs, and
required parameters. For non-compatible methods, helper methods are
implemented to perform the requests; these can be found under the `GcpApis`
directory.

The implemented GCP service is designed to work as closely as possible to the
AWS service.

### Object Tagging

Google Cloud Storage does not have object-level tagging methods.

To be compatible with S3, object tags will be stored as metadata on
Google Cloud Storage.
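As a sketch, S3-style tags can be flattened into object metadata and recovered
again; the `aws-tag-` key prefix here is a hypothetical naming choice, not
necessarily the one CloudServer uses:

```python
TAG_PREFIX = "aws-tag-"  # hypothetical metadata-key prefix

def tags_to_metadata(tag_set):
    # Flatten S3-style tags into metadata key/value pairs.
    return {TAG_PREFIX + key: value for key, value in tag_set.items()}

def metadata_to_tags(metadata):
    # Recover the tag set from object metadata.
    return {key[len(TAG_PREFIX):]: value for key, value in metadata.items()
            if key.startswith(TAG_PREFIX)}
```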
### Multipart Upload

Google Cloud Storage does not have AWS S3 multipart upload methods, but it
does have methods for merging multiple objects into a single composite object.
Utilizing these available methods, GCP is able to perform parallel uploads for
large objects; however, due to limits set by Google Cloud Storage, the maximum
number of parts possible for a single upload is 1024 (the AWS limit is 10000).
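The 1024-part ceiling follows from the compose limit described below: one
`compose` call can merge at most 32 source objects, and two rounds of
composition give 32 × 32 = 1024. The arithmetic, as an illustrative sketch
(not CloudServer's implementation):

```python
import math

MAX_COMPOSE_SOURCES = 32  # GCS compose: at most 32 source objects per call

def compose_calls(num_parts):
    """Compose calls needed in a two-round scheme for num_parts parts."""
    assert 1 <= num_parts <= MAX_COMPOSE_SOURCES ** 2  # 1024-part limit
    # Round 1: merge parts into subparts of up to 32 objects each.
    round1 = math.ceil(num_parts / MAX_COMPOSE_SOURCES)
    # Round 2: one final compose merges the subparts, if there are several.
    round2 = 1 if round1 > 1 else 0
    return round1 + round2
```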
As Google Cloud Storage does not have methods for managing multipart uploads,
each part is uploaded as a single object in a Google Cloud bucket.
Because of this, a secondary bucket for handling MPU parts is required for
a GCP multipart upload. The MPU bucket serves to hide uploaded parts from
being listed as items of the main bucket, as well as to handle parts of
multiple in-progress multipart uploads.

<!--
<p style='font-size: 12'>
** The Google Cloud Storage method used for combining multipart objects into a
single object is the `compose` method.<br/>
** <a>https://cloud.google.com/storage/docs/xml-api/put-object-compose</a>
</p>
-->

#### Multipart Upload Methods Design

+ **initiateMultipartUpload**:
In `initiateMultipartUpload`, new multipart uploads will generate a prefix
with the scheme `${objectKeyName}-${uploadId}`, and each object related to an
MPU will be prefixed with it. This method will also create an `init` file that
will store the metadata related to an MPU for later assignment to the
completed object.
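The prefix scheme above can be sketched as follows; the per-part name format
is hypothetical, only the `${objectKeyName}-${uploadId}` prefix comes from the
design description:

```python
def mpu_prefix(object_key_name, upload_id):
    # Prefix shared by every object belonging to one MPU, per the
    # `${objectKeyName}-${uploadId}` scheme described above.
    return "{}-{}".format(object_key_name, upload_id)

def part_object_name(object_key_name, upload_id, part_number):
    # Hypothetical name for an individual part object in the MPU bucket.
    return "{}/part-{:05d}".format(
        mpu_prefix(object_key_name, upload_id), part_number)
```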
+ **uploadPart**:
`uploadPart` will prefix the upload with the MPU prefix, then perform a
`putObject` request to Google Cloud Storage.

+ **uploadPartCopy**:
`uploadPartCopy` will prefix the copy upload with the MPU prefix, then perform
a `copyObject` request to Google Cloud Storage.

+ **abortMultipartUpload**:
`abortMultipartUpload` will remove all objects related to a multipart upload
from the MPU bucket. It does this by first making a `listObjectVersions`
request to GCP to list all parts with the related MPU prefix, then performing
a `deleteObject` request on each of the objects received.
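The abort flow amounts to a list-then-delete loop; a minimal sketch, where
`list_keys` and `delete_object` are stand-ins for the GCP client requests
(`listObjectVersions` and `deleteObject`):

```python
def abort_multipart_upload(list_keys, delete_object,
                           object_key_name, upload_id):
    # List every object under the MPU prefix, then delete each one.
    prefix = "{}-{}".format(object_key_name, upload_id)
    deleted = 0
    for key in list_keys(prefix):
        delete_object(key)
        deleted += 1
    return deleted
```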
+ **completeMultipartUpload**:
`completeMultipartUpload` will combine the given parts to create the single
composite object. This method consists of multiple steps, due to the
limitations of the Google Cloud Storage `compose` method:
  + compose round 1: multiple compose calls to merge, at most, 32 objects into
    a single subpart.
  + compose round 2: multiple compose calls to merge the subparts generated
    in compose round 1 to create the final completed object.
  + generate MPU ETag: generate the multipart ETag that will be returned as
    part of the `completeMultipartUpload` response.
  + copy to main: retrieve the metadata stored in the `init` file created in
    `initiateMultipartUpload`, assign it to the completed object, and copy
    the completed object from the MPU bucket to the main bucket.
  + cleanUp: remove all objects related to an MPU.
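The "generate MPU ETag" step can be sketched assuming the AWS S3 multipart
ETag convention (MD5 of the concatenated binary part digests, suffixed with
the part count); that CloudServer follows this convention exactly is an
assumption here:

```python
import hashlib

def multipart_etag(part_bodies):
    # MD5 each part, concatenate the raw digests, MD5 the result, and
    # append the number of parts -- the AWS S3 multipart ETag convention.
    digests = [hashlib.md5(body).digest() for body in part_bodies]
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return '"{}-{}"'.format(combined, len(digests))
```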

### Limitations

+ GCP multipart uploads are limited to 1024 parts
+ Each `compose` call can merge up to 32 objects per request
+ As Google Cloud Storage doesn't have AWS-style MPU methods, GCP MPU requires
  a secondary bucket to perform multipart uploads
+ GCP doesn't have object-level tagging methods; AWS-style tags are stored
  as metadata on Google Cloud Storage

More information can be found at:

+ https://cloud.google.com/storage/docs/xml-api/overview
+ https://cloud.google.com/storage/quotas
+ https://cloud.google.com/storage/docs/xml-api/put-object-compose